home / github / issue_comments

Menu
  • Search all tables
  • GraphQL API

issue_comments: 1111725638

This data as json

html_url issue_url id node_id user created_at updated_at author_association body reactions issue performed_via_github_app
https://github.com/simonw/datasette/issues/1727#issuecomment-1111725638 https://api.github.com/repos/simonw/datasette/issues/1727 1111725638 IC_kwDOBm6k_c5CQ5ZG 9599 2022-04-28T04:15:15Z 2022-04-28T04:15:15Z OWNER

Useful theory from Keith Medcalf https://sqlite.org/forum/forumpost/e363c69d3441172e

This is true, but the concurrency is limited to the execution which occurs with the GIL released (that is, in the native C sqlite3 library itself). Each row (for example) can be retrieved in parallel but "constructing the python return objects for each row" will be serialized (by the GIL).

That is to say that if your have two python threads each with their own connection, and each one is performing a select that returns 1,000,000 rows (lets say that is 25% of the candidates for each select) then the difference in execution time between executing two python threads in parallel vs a single serial thead will not be much different (if even detectable at all). In fact it is possible that the multiple-threaded version takes longer to run both queries to completion because of the increased contention over a shared resource (the GIL).

So maybe this is a GIL thing.

I should test with some expensive SQL queries (maybe big aggregations against large tables) and see if I can spot an improvement there.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
1217759117  
Powered by Datasette · Queries took 1.05ms · About: github-to-sqlite