issue_comments: 590153892


Comment 590153892 on simonw/datasette issue #682
https://github.com/simonw/datasette/issues/682#issuecomment-590153892
Author: user 9599 (OWNER) · created 2020-02-24T03:10:45Z · updated 2020-02-24T03:13:03Z

Some more detailed notes I made earlier:

Datasette would run a single write thread per database. That thread gets an exclusive connection, and a queue. Plugins can add functions to the queue which will be called and given access to that connection.
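A minimal sketch of that design, assuming nothing about Datasette's actual internals (the `WriteThread` class and `execute` method here are illustrative names, not Datasette's API): one daemon thread owns an exclusive `sqlite3` connection and drains a queue of plugin-supplied functions.

```python
import queue
import sqlite3
import threading

class WriteThread:
    """One write thread per database: an exclusive connection plus a queue."""

    def __init__(self, path):
        self._queue = queue.Queue()
        self._thread = threading.Thread(
            target=self._run, args=(path,), daemon=True
        )
        self._thread.start()

    def _run(self, path):
        # The connection is created inside the thread and never shared,
        # so only this thread ever touches it
        conn = sqlite3.connect(path)
        while True:
            fn = self._queue.get()
            try:
                fn(conn)  # plugin-supplied function gets the connection
            finally:
                self._queue.task_done()

    def execute(self, fn):
        # Plugins add functions to the queue; the write thread calls them
        self._queue.put(fn)
```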

The write thread for that database is created the first time a write is attempted.
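That lazy creation could look something like this sketch, with a lock so two simultaneous first writes don't both spin up a thread (`get_write_thread` and `factory` are hypothetical names):

```python
import threading

_write_threads = {}  # database name -> write thread handle
_creation_lock = threading.Lock()

def get_write_thread(database, factory):
    # Lazily create the write thread the first time a write is
    # attempted against this database; reuse it on every later write
    with _creation_lock:
        if database not in _write_threads:
            _write_threads[database] = factory(database)
    return _write_threads[database]
```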

Question: should that thread have its own asyncio loop so that async techniques like httpx can be used within the thread? I think not at first - only investigate this if it turns out to be necessary in the future.

This thread will run as part of the Datasette process. This means there is always a risk that the thread will die in the middle of something because the server got restarted - so use transactions to limit the risk of damage to the database should that happen.
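In Python's `sqlite3` module, using the connection as a context manager gives exactly that protection: commit on success, rollback if the queued function raises or the thread dies mid-write. A sketch (the `apply_write` helper is a made-up name):

```python
import sqlite3

def apply_write(conn, fn):
    # Wrap each queued write in a transaction so an interrupted
    # write thread leaves the database in a consistent state.
    # "with conn" commits on success and rolls back on exception.
    with conn:
        return fn(conn)
```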

I don’t want web responses blocking waiting for stuff to happen here - so every task put on that queue will have a task ID, and that ID will be returned such that client code can poll for its completion.

Could the request block for up to 0.5s just in case the write is really fast, then return a polling token if it isn't finished yet? Looks possible - Queue.get can block with a timeout.
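A sketch of that fast-path idea, assuming each task gets a reply queue (names here are illustrative): block on `Queue.get` with a short timeout, and fall back to returning the polling token on `queue.Empty`.

```python
import queue

def wait_or_token(reply_queue, task_id, timeout=0.5):
    # Block briefly in case the write finishes quickly; otherwise
    # hand back a polling token the client can check later
    try:
        result = reply_queue.get(timeout=timeout)
        return {"status": "done", "result": result}
    except queue.Empty:
        return {"status": "pending", "task_id": task_id}
```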

There will be a /-/writes page which shows currently queued writes - so each one needs a human-readable description of some sort. (You can access a deque called q.queue to see what’s in there)
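A sketch of how a `/-/writes` page could read those descriptions without consuming the queue (the descriptions and `pending_descriptions` helper are made up for illustration):

```python
import queue

writes = queue.Queue()
# Each queued write carries a human-readable description
writes.put(("task-1", "Import follower records"))
writes.put(("task-2", "Rebuild search index"))

def pending_descriptions(q):
    # q.queue is the underlying deque; list() takes a snapshot
    # without removing any items from the queue
    return [description for _task_id, description in list(q.queue)]
```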

Stretch goal: It would be cool if write operations could optionally handle their own progress reports. That way I can do some really nice UI around what’s going on with these things.

This mechanism has a ton of potential. It may even be how we handle things like Twitter imports and suchlike - queued writing tasks.

One catch with this approach: if a plugin is reading from APIs etc it shouldn't block writes to the database while it is doing so. So sticking a function in the queue that does additional time-consuming stuff is actually an anti-pattern. Instead, plugins should schedule their API access on the main event loop and push just the updates they need to make onto the write queue.

Implementation notes

Maybe each item in the queue is a (callable, uuid, reply_queue) triple. You can do a blocking .get() on the reply_queue if you want to wait for the answer. The execution framework could look for the return value from callable() and automatically send it to reply_queue.
