issue_comments: 1112878955

html_url: https://github.com/simonw/datasette/issues/1727#issuecomment-1112878955
issue_url: https://api.github.com/repos/simonw/datasette/issues/1727
id: 1112878955
node_id: IC_kwDOBm6k_c5CVS9r
user: 9599
created_at: 2022-04-29T05:02:40Z
updated_at: 2022-04-29T05:02:40Z
author_association: OWNER

body:

Here's a very useful (recent) article about how the GIL works and how to think about it: https://pythonspeed.com/articles/python-gil/ - via https://lobste.rs/s/9hj80j/when_python_can_t_thread_deep_dive_into_gil

From that article:

For example, let's consider an extension module written in C or Rust that lets you talk to a PostgreSQL database server.

Conceptually, handling a SQL query with this library will go through three steps:

  1. Deserialize from Python to the internal library representation. Since this will be reading Python objects, it needs to hold the GIL.
  2. Send the query to the database server, and wait for a response. This doesn't need the GIL.
  3. Convert the response into Python objects. This needs the GIL again.

As you can see, how much parallelism you can get depends on how much time is spent in each step. If the bulk of time is spent in step 2, you'll get parallelism there. But if, for example, you run a SELECT and get a large number of rows back, the library will need to create many Python objects, and step 3 will have to hold the GIL for a while.
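To ground those three steps, here is a rough sketch of where the GIL is held and released when the same pattern runs through CPython's sqlite3 module (which releases the GIL around its blocking C calls); the database file and table name below are placeholders, not anything from this issue:

import sqlite3
import threading

def run_query(db_path, sql):
    # Step 1: creating the connection, cursor and statement is Python-level
    # work, so it runs with the GIL held.
    conn = sqlite3.connect(db_path)
    cursor = conn.execute(sql)
    # Step 2: the underlying sqlite3_step() calls are C code and the sqlite3
    # module releases the GIL around them, so other threads can run here.
    # Step 3: fetchall() builds a Python tuple and Python objects for every
    # row, which needs the GIL again.
    rows = cursor.fetchall()
    conn.close()
    return rows

# Running the same query from several threads only overlaps during step 2.
threads = [
    threading.Thread(
        target=run_query,
        args=("fixtures.db", "SELECT * FROM big_table"),  # placeholder names
    )
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()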

That explains what I'm seeing here. I'm pretty convinced now that the reason I'm not getting a performance boost from parallel queries is that there's more time spent in Python code assembling the results than in SQLite C code executing the query.
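A rough way to check that hypothesis is to time the same many-row query run sequentially and then across a thread pool: if most of the time goes into building Python row objects (step 3), the threaded run barely improves. This is only an illustrative sketch, with a placeholder database file and query:

import sqlite3
import time
from concurrent.futures import ThreadPoolExecutor

DB_PATH = "fixtures.db"          # placeholder database file
SQL = "SELECT * FROM big_table"  # placeholder query that returns many rows

def fetch_all_rows(_):
    # Each thread opens its own connection. SQLite's own work can run with
    # the GIL released, but converting every row into Python tuples happens
    # with the GIL held.
    conn = sqlite3.connect(DB_PATH)
    try:
        return len(conn.execute(SQL).fetchall())
    finally:
        conn.close()

def timed(label, fn):
    start = time.perf_counter()
    fn()
    print(f"{label}: {time.perf_counter() - start:.3f}s")

# Sequential: run the query four times in a row.
timed("sequential", lambda: [fetch_all_rows(i) for i in range(4)])

# Parallel: the same four queries on four threads. A small speedup here
# suggests the time is dominated by GIL-bound Python object assembly.
with ThreadPoolExecutor(max_workers=4) as pool:
    timed("4 threads", lambda: list(pool.map(fetch_all_rows, range(4))))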

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}

issue: 1217759117