{"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1728192688", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1728192688, "node_id": "IC_kwDOBm6k_c5nAiCw", "user": {"value": 173848, "label": "yozlet"}, "created_at": "2023-09-20T17:53:31Z", "updated_at": "2023-09-20T17:53:31Z", "author_association": "NONE", "body": "`/me munches popcorn at a furious rate, utterly enthralled`", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724045748", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724045748, "node_id": "IC_kwDOBm6k_c5mwtm0", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T17:24:07Z", "updated_at": "2023-09-18T17:24:07Z", "author_association": "OWNER", "body": "I need reliable steps to reproduce, then I can bisect and figure out which exact version of Datasette introduced the problem.\r\n\r\nI have a hunch that it relates to changes made to the `datasette/database.py` module, maybe one of these changes here: https://github.com/simonw/datasette/compare/0.61...0.63.1#diff-4e20309c969326a0008dc9237f6807f48d55783315fbfc1e7dfa480b550e16f9", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724048314", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724048314, "node_id": 
"IC_kwDOBm6k_c5mwuO6", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T17:25:55Z", "updated_at": "2023-09-18T17:25:55Z", "author_association": "OWNER", "body": "The good news is that this bug is currently unlikely to affect most users since named in-memory databases (created using `datasette.add_memory_database(\"airtable_refs\")` ([docs](https://docs.datasette.io/en/stable/internals.html#add-memory-database-name)) are a pretty obscure feature, only available to plugins.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724049538", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724049538, "node_id": "IC_kwDOBm6k_c5mwuiC", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T17:26:44Z", "updated_at": "2023-09-18T17:26:44Z", "author_association": "OWNER", "body": "Just managed to get this exception trace:\r\n```\r\n return await self.route_path(scope, receive, send, path)\r\n File \"/Users/simon/.local/share/virtualenvs/airtable-export-Ca4U-3qk/lib/python3.8/site-packages/datasette/app.py\", line 1354, in route_path\r\n response = await view(request, send)\r\n File \"/Users/simon/.local/share/virtualenvs/airtable-export-Ca4U-3qk/lib/python3.8/site-packages/datasette/views/base.py\", line 134, in view\r\n return await self.dispatch_request(request)\r\n File \"/Users/simon/.local/share/virtualenvs/airtable-export-Ca4U-3qk/lib/python3.8/site-packages/datasette/views/base.py\", line 91, in dispatch_request\r\n return await handler(request)\r\n File \"/Users/simon/.local/share/virtualenvs/airtable-export-Ca4U-3qk/lib/python3.8/site-packages/datasette/views/base.py\", line 361, in 
get\r\n response_or_template_contexts = await self.data(request, **data_kwargs)\r\n File \"/Users/simon/.local/share/virtualenvs/airtable-export-Ca4U-3qk/lib/python3.8/site-packages/datasette/views/table.py\", line 158, in data\r\n return await self._data_traced(request, default_labels, _next, _size)\r\n File \"/Users/simon/.local/share/virtualenvs/airtable-export-Ca4U-3qk/lib/python3.8/site-packages/datasette/views/table.py\", line 568, in _data_traced\r\n await gather(execute_facets(), execute_suggested_facets())\r\n File \"/Users/simon/.local/share/virtualenvs/airtable-export-Ca4U-3qk/lib/python3.8/site-packages/datasette/views/table.py\", line 177, in _gather_parallel\r\n return await asyncio.gather(*args)\r\nasyncio.exceptions.CancelledError\r\nINFO: 127.0.0.1:64109 - \"GET /airtable_refs/airtable_refs?_facet=table_name&table_name=Sessions HTTP/1.1\" 500 Internal Server Error\r\n^CError in atexit._run_exitfuncs:\r\nTraceback (most recent call last):\r\n File \"/Users/simon/.pyenv/versions/3.8.17/lib/python3.8/concurrent/futures/thread.py\", line 40, in _python_exit\r\n t.join()\r\n File \"/Users/simon/.pyenv/versions/3.8.17/lib/python3.8/threading.py\", line 1011, in join\r\n self._wait_for_tstate_lock()\r\n File \"/Users/simon/.pyenv/versions/3.8.17/lib/python3.8/threading.py\", line 1027, in _wait_for_tstate_lock\r\n elif lock.acquire(block, timeout):\r\nKeyboardInterrupt\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724051886", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724051886, "node_id": "IC_kwDOBm6k_c5mwvGu", "user": {"value": 9599, "label": "simonw"}, "created_at": 
"2023-09-18T17:28:20Z", "updated_at": "2023-09-18T17:30:30Z", "author_association": "OWNER", "body": "The bug exhibits when I try to add a facet. I think it's caused by the parallel query execution I added to facets at some point.\r\n\r\nhttp://127.0.0.1:8045/airtable_refs/airtable_refs - no error\r\nhttp://127.0.0.1:8045/airtable_refs/airtable_refs?_facet=table_name#facet-table_name - hangs the server\r\n\r\nCrucial line in the traceback:\r\n```\r\n await gather(execute_facets(), execute_suggested_facets())\r\n```\r\nFrom here: https://github.com/simonw/datasette/blob/917272c864ad7b8a00c48c77f5c2944093babb4e/datasette/views/table.py#L568", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724055823", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724055823, "node_id": "IC_kwDOBm6k_c5mwwEP", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T17:31:10Z", "updated_at": "2023-09-18T17:31:10Z", "author_association": "OWNER", "body": "That line was added in https://github.com/simonw/datasette/commit/942411ef946e9a34a2094944d3423cddad27efd3 which first shipped in 0.62a0.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724064440", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724064440, "node_id": "IC_kwDOBm6k_c5mwyK4", "user": {"value": 9599, 
"label": "simonw"}, "created_at": "2023-09-18T17:36:00Z", "updated_at": "2023-09-18T17:36:00Z", "author_association": "OWNER", "body": "I wrote this test, but it passes:\r\n```python\r\n@pytest.mark.asyncio\r\nasync def test_facet_against_in_memory_database():\r\n ds = Datasette()\r\n db = ds.add_memory_database(\"mem\")\r\n await db.execute_write(\"create table t (id integer primary key, name text)\")\r\n await db.execute_write_many(\r\n \"insert into t (name) values (?)\", [[\"one\"], [\"one\"], [\"two\"]]\r\n )\r\n response1 = await ds.client.get(\"/mem/t.json\")\r\n assert response1.status_code == 200\r\n response2 = await ds.client.get(\"/mem/t.json?_facet=name\")\r\n assert response2.status_code == 200\r\n assert response2.json() == {\r\n \"ok\": True,\r\n \"next\": None,\r\n \"facet_results\": {\r\n \"results\": {\r\n \"name\": {\r\n \"name\": \"name\",\r\n \"type\": \"column\",\r\n \"hideable\": True,\r\n \"toggle_url\": \"/mem/t.json\",\r\n \"results\": [\r\n {\r\n \"value\": \"one\",\r\n \"label\": \"one\",\r\n \"count\": 2,\r\n \"toggle_url\": \"http://localhost/mem/t.json?_facet=name&name=one\",\r\n \"selected\": False,\r\n },\r\n {\r\n \"value\": \"two\",\r\n \"label\": \"two\",\r\n \"count\": 1,\r\n \"toggle_url\": \"http://localhost/mem/t.json?_facet=name&name=two\",\r\n \"selected\": False,\r\n },\r\n ],\r\n \"truncated\": False,\r\n }\r\n },\r\n \"timed_out\": [],\r\n },\r\n \"rows\": [\r\n {\"id\": 1, \"name\": \"one\"},\r\n {\"id\": 2, \"name\": \"one\"},\r\n {\"id\": 3, \"name\": \"two\"},\r\n ],\r\n \"truncated\": False,\r\n }\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724072390", "issue_url": 
"https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724072390, "node_id": "IC_kwDOBm6k_c5mw0HG", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T17:39:06Z", "updated_at": "2023-09-18T17:39:06Z", "author_association": "OWNER", "body": "Landing a version of that test anyway.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724081909", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724081909, "node_id": "IC_kwDOBm6k_c5mw2b1", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T17:45:27Z", "updated_at": "2023-09-18T17:45:27Z", "author_association": "OWNER", "body": "Maybe it's not related to faceting - I just got it on a hit to `http://127.0.0.1:8045/airtable_refs/airtable_refs` instead.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724083324", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724083324, "node_id": "IC_kwDOBm6k_c5mw2x8", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T17:46:21Z", "updated_at": "2023-09-18T17:46:21Z", "author_association": "OWNER", "body": "Sometimes it takes a few clicks for the bug to occur, but it does seem to always be within the in-memory database.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, 
\"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724084199", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724084199, "node_id": "IC_kwDOBm6k_c5mw2_n", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T17:47:01Z", "updated_at": "2023-09-18T17:47:01Z", "author_association": "OWNER", "body": "I managed to trigger it by loading `http://127.0.0.1:8045/airtable_refs/airtable_refs` - which worked - and then hitting refresh on that page a bunch of times until it hung.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724089666", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724089666, "node_id": "IC_kwDOBm6k_c5mw4VC", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T17:49:24Z", "updated_at": "2023-09-18T17:49:24Z", "author_association": "OWNER", "body": "I switched that particular implementation to using an on-disk database instead of an in-memory database and could no longer recreate the bug.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724157182", "issue_url": 
"https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724157182, "node_id": "IC_kwDOBm6k_c5mxIz-", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T18:30:30Z", "updated_at": "2023-09-18T18:30:30Z", "author_association": "OWNER", "body": "OK, I can trigger the bug like this:\r\n\r\n```bash\r\ndatasette pottery2.db -p 8045 --get /airtable_refs/airtable_refs\r\n```\r\nCan I write a bash script that fails (and terminates the process) if it takes longer than X seconds?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724159882", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724159882, "node_id": "IC_kwDOBm6k_c5mxJeK", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T18:32:29Z", "updated_at": "2023-09-18T18:32:29Z", "author_association": "OWNER", "body": "This worked, including on macOS even though GPT-4 thought `timeout` would not work there: https://chat.openai.com/share/cc4628e9-5240-4f35-b640-16a9c178b315\r\n```bash\r\n#!/bin/bash\r\n\r\n# Run the command with a timeout of 5 seconds\r\ntimeout 5s datasette pottery2.db -p 8045 --get /airtable_refs/airtable_refs\r\n\r\n# Check the exit code from timeout\r\nif [ $? 
-eq 124 ]; then\r\n echo \"Error: Command timed out after 5 seconds.\"\r\n exit 1\r\nfi\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724257290", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724257290, "node_id": "IC_kwDOBm6k_c5mxhQK", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T19:39:27Z", "updated_at": "2023-09-18T19:44:26Z", "author_association": "OWNER", "body": "I'm now trying this test script:\r\n```bash\r\n#!/bin/bash\r\n\r\nport=8064\r\n# Start datasette server in the background and get its PID\r\ndatasette pottery2.db -p $port &\r\nserver_pid=$!\r\n\r\n# Wait for a moment to ensure the server has time to start up\r\nsleep 2\r\n\r\n# Initialize counters and parameters\r\nretry_count=0\r\nmax_retries=3\r\nsuccess_count=0\r\npath=\"/airtable_refs/airtable_refs\"\r\n\r\n# Function to run curl with a timeout\r\nfunction test_curl {\r\n # Run the curl command with a timeout of 3 seconds\r\n timeout 3s curl -s \"http://localhost:${port}${path}\" > /dev/null\r\n if [ $? 
-eq 0 ]; then\r\n # Curl was successful\r\n ((success_count++))\r\n fi\r\n}\r\n\r\n# Try three parallel curl requests\r\nwhile [[ $retry_count -lt $max_retries ]]; do\r\n # Reset the success counter\r\n success_count=0\r\n\r\n # Run the curls in parallel\r\n echo \" Running curls\"\r\n test_curl\r\n test_curl\r\n test_curl # & test_curl & test_curl &\r\n\r\n # Wait for all curls to finish\r\n #wait\r\n\r\n # Check the success count\r\n if [[ $success_count -eq 3 ]]; then\r\n # All curls succeeded, break out of the loop\r\n echo \" All curl succeeded\"\r\n break\r\n fi\r\n\r\n ((retry_count++))\r\ndone\r\n\r\n# Kill the datasette server\r\necho \"Killing datasette server with PID $server_pid\"\r\nkill -9 $server_pid\r\nsleep 2\r\n\r\n# Print result\r\nif [[ $success_count -eq 3 ]]; then\r\n echo \"All three curls succeeded.\"\r\n exit 0\r\nelse\r\n echo \"Error: Not all curls succeeded after $retry_count attempts.\"\r\n exit 1\r\nfi\r\n```\r\nI run it like this:\r\n```bash\r\ngit bisect reset\r\ngit bisect start\r\ngit bisect good 0.59.4\r\ngit bisect bad 1.0a6\r\ngit bisect run ../airtable-export/testit.sh\r\n```\r\nBut... 
it's not having the desired result, I think because the bug is intermittent so each time I run it the bisect spits out a different commit as the one that is to blame.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724258279", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724258279, "node_id": "IC_kwDOBm6k_c5mxhfn", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T19:40:13Z", "updated_at": "2023-09-18T19:40:13Z", "author_association": "OWNER", "body": "Output while it is running looks like this:\r\n```\r\nrunning '../airtable-export/testit.sh'\r\nINFO: Started server process [75649]\r\nINFO: Waiting for application startup.\r\nINFO: Application startup complete.\r\nINFO: Uvicorn running on http://127.0.0.1:8064 (Press CTRL+C to quit)\r\n Running curls\r\n Running curls\r\n Running curls\r\nKilling datasette server with PID 75649\r\n../airtable-export/testit.sh: line 54: 75649 Killed: 9 datasette pottery2.db -p $port\r\nError: Not all curls succeeded after 3 attempts.\r\nBisecting: 155 revisions left to test after this (roughly 7 steps)\r\n[247e460e08bf823142f7b84058fe44e43626787f] Update beautifulsoup4 requirement (#1703)\r\nrunning '../airtable-export/testit.sh'\r\nINFO: Started server process [75722]\r\nINFO: Waiting for application startup.\r\nINFO: Application startup complete.\r\nINFO: Uvicorn running on http://127.0.0.1:8064 (Press CTRL+C to quit)\r\n Running curls\r\n Running curls\r\n Running curls\r\nKilling datasette server with PID 75722\r\n../airtable-export/testit.sh: line 54: 75722 Killed: 9 datasette pottery2.db -p $port\r\nError: Not all curls succeeded after 3 
attempts.\r\nBisecting: 77 revisions left to test after this (roughly 6 steps)\r\n[3ef47a0896c7e63404a34e465b7160c80eaa571d] Link rel=alternate header for tables and rows\r\nrunning '../airtable-export/testit.sh'\r\nINFO: Started server process [75818]\r\nINFO: Waiting for application startup.\r\nINFO: Application startup complete.\r\nINFO: Uvicorn running on http://127.0.0.1:8064 (Press CTRL+C to quit)\r\n Running curls\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724259229", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724259229, "node_id": "IC_kwDOBm6k_c5mxhud", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T19:40:56Z", "updated_at": "2023-09-18T19:40:56Z", "author_association": "OWNER", "body": "I tried it with a path of `/` and everything passed - so it's definitely the path of `/airtable_refs/airtable_refs` (an in-memory database created by an experimental branch of https://github.com/simonw/airtable-export) that triggers the problem.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724263390", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724263390, "node_id": "IC_kwDOBm6k_c5mxive", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T19:44:03Z", "updated_at": "2023-09-18T19:44:03Z", 
"author_association": "OWNER", "body": "I knocked it down to 1 retry just to see what happened.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724276917", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724276917, "node_id": "IC_kwDOBm6k_c5mxmC1", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T19:54:23Z", "updated_at": "2023-09-18T19:54:23Z", "author_association": "OWNER", "body": "Turned out I wasn't running the `datasette` from the current directory, so it was not testing what I intended.\r\n\r\nFixed that with `pip install -e .` in the `datasette/` directory.\r\n\r\nNow I'm seeing some passes, which look like this:\r\n```\r\nrunning '../airtable-export/testit.sh'\r\nINFO: Started server process [77810]\r\nINFO: Waiting for application startup.\r\nINFO: Application startup complete.\r\nINFO: Uvicorn running on http://127.0.0.1:8064 (Press CTRL+C to quit)\r\n Running curls\r\nINFO: 127.0.0.1:59439 - \"GET /airtable_refs/airtable_refs HTTP/1.1\" 200 OK\r\nINFO: 127.0.0.1:59440 - \"GET /airtable_refs/airtable_refs HTTP/1.1\" 200 OK\r\nINFO: 127.0.0.1:59441 - \"GET /airtable_refs/airtable_refs HTTP/1.1\" 200 OK\r\n All curl succeeded\r\nKilling datasette server with PID 77810\r\n../airtable-export/testit.sh: line 54: 77810 Killed: 9 datasette pottery2.db -p $port\r\nAll three curls succeeded.\r\nBisecting: 4 revisions left to test after this (roughly 2 steps)\r\n[7463b051cf8d7f856df5eba9f7aa944183ebabe5] Cosmetic tweaks after blacken-docs, refs #1718\r\nrunning '../airtable-export/testit.sh'\r\nINFO: Started server process [77826]\r\nINFO: Waiting for application startup.\r\nINFO: 
Application startup complete.\r\nINFO: Uvicorn running on http://127.0.0.1:8064 (Press CTRL+C to quit)\r\n Running curls\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724278386", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724278386, "node_id": "IC_kwDOBm6k_c5mxmZy", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T19:55:32Z", "updated_at": "2023-09-18T19:55:32Z", "author_association": "OWNER", "body": "OK it looks like it found it!\r\n\r\n```\r\n942411ef946e9a34a2094944d3423cddad27efd3 is the first bad commit\r\ncommit \r\n\r\nAuthor: Simon Willison \r\nDate: Tue Apr 26 15:48:56 2022 -0700\r\n\r\n Execute some TableView queries in parallel\r\n \r\n Use ?_noparallel=1 to opt out (undocumented, useful for benchmark comparisons)\r\n \r\n Refs #1723, #1715\r\n\r\n datasette/views/table.py | 93 ++++++++++++++++++++++++++++++++++--------------\r\n 1 file changed, 67 insertions(+), 26 deletions(-)\r\nbisect found first bad commit\r\n```\r\nhttps://github.com/simonw/datasette/commit/942411ef946e9a34a2094944d3423cddad27efd3 does look like the cause of this problem.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724281824", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724281824, "node_id": "IC_kwDOBm6k_c5mxnPg", "user": 
{"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T19:58:06Z", "updated_at": "2023-09-18T19:58:06Z", "author_association": "OWNER", "body": "I also confirmed that `http://127.0.0.1:8064/airtable_refs/airtable_refs?_noparallel=1` does not trigger the bug but `http://127.0.0.1:8064/airtable_refs/airtable_refs` does.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724298817", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724298817, "node_id": "IC_kwDOBm6k_c5mxrZB", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T20:11:26Z", "updated_at": "2023-09-18T20:11:26Z", "author_association": "OWNER", "body": "Now that I've confirmed that parallel query execution of the kind introduced in https://github.com/simonw/datasette/commit/942411ef946e9a34a2094944d3423cddad27efd3 can cause hangs (presumably some kind of locking issue) against in-memory databases, some options:\r\n\r\n1. Disable parallel execution entirely and rip out related code.\r\n2. Disable parallel execution entirely by leaving that code but having it always behave as if `_noparallel=1`\r\n3. 
Continue digging and try and find some way to avoid this problem\r\n\r\nThe parallel execution work is something I was playing with last year in the hope of speeding up Datasette pages like the table page which need to execute a bunch of queries - one for each facet, plus one for each column to see if it should be suggested as a facet.\r\n\r\nI wrote about this at the time here: https://simonwillison.net/2022/May/6/weeknotes/\r\n\r\nMy hope was that despite Python's GIL this optimization would still help, because the SQLite C module releases the GIL once it gets to SQLite.\r\n\r\nBut... that didn't hold up. It looked like enough work was happening in Python land with the GIL that the optimization didn't improve things.\r\n\r\nRunning the `nogil` fork of Python DID improve things though! I left the code in partly on the hope that the `nogil` fork would be accepted into Python core.\r\n\r\n... which it now has! But it will still be a year or two before it fully lands: https://discuss.python.org/t/a-steering-council-notice-about-pep-703-making-the-global-interpreter-lock-optional-in-cpython/30474\r\n\r\nSo I'm not particularly concerned about dropping the parallel execution. 
If I do drop it though do I leave the potentially complex code in that relates to it?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724305169", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724305169, "node_id": "IC_kwDOBm6k_c5mxs8R", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T20:16:22Z", "updated_at": "2023-09-18T20:16:36Z", "author_association": "OWNER", "body": "Looking again at this code:\r\n\r\nhttps://github.com/simonw/datasette/blob/6ed7908580fa2ba9297c3225d85c56f8b08b9937/datasette/database.py#L87-L117\r\n\r\n`check_same_thread=False` really stands out here.\r\n\r\nPython docs at https://docs.python.org/3/library/sqlite3.html\r\n\r\n> **check_same_thread** ([*bool*](https://docs.python.org/3/library/functions.html#bool \"bool\")) -- If `True` (default), [`ProgrammingError`](https://docs.python.org/3/library/sqlite3.html#sqlite3.ProgrammingError \"sqlite3.ProgrammingError\") will be raised if the database connection is used by a thread other than the one that created it. If `False`, the connection may be accessed in multiple threads; write operations may need to be serialized by the user to avoid data corruption. 
See [`threadsafety`](https://docs.python.org/3/library/sqlite3.html#sqlite3.threadsafety \"sqlite3.threadsafety\") for more information.\r\n\r\nI think I'm playing with fire by allowing multiple threads to access the same connection without doing my own serialization of those requests.\r\n\r\nI _do_ do that using the write connection - and in this particular case the bug isn't coming from write queries, it's coming from read queries - but perhaps SQLite has issues with threading for reads, too.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724315591", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724315591, "node_id": "IC_kwDOBm6k_c5mxvfH", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T20:24:30Z", "updated_at": "2023-09-18T20:24:30Z", "author_association": "OWNER", "body": "[Using SQLite In Multi-Threaded Applications](https://www.sqlite.org/threadsafe.html)\r\n\r\nThat indicates that there's a SQLite option for \"Serialized\" mode where it's safe to access anything SQLite provides from multiple threads, but as far as I can tell Python doesn't give you an option to turn that mode on or off for a connection - you can read `sqlite3.threadsafety` to see if that mode was compiled in or not, but not actually change it.\r\n\r\nOn my Mac `sqlite3.threadsafety` returns 1 which means https://docs.python.org/3/library/sqlite3.html#sqlite3.threadsafety \"Multi-thread: In this mode, SQLite can be safely used by multiple threads provided that no single database connection is used simultaneously in two or more threads.\" - it would need to return 3 for that serialized mode.", "reactions": 
"{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724317367", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724317367, "node_id": "IC_kwDOBm6k_c5mxv63", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T20:25:44Z", "updated_at": "2023-09-18T20:25:44Z", "author_association": "OWNER", "body": "My current hunch is that SQLite gets unhappy if multiple threads access the same underlying C object - which sometimes happens with in-memory connections and Datasette presumably because they are faster than file-backed databases.\r\n\r\nI'm going to remove the `asyncio.gather()` code from the table view. I'll ship a 0.x release with that fix too.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1724325068", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1724325068, "node_id": "IC_kwDOBm6k_c5mxxzM", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-18T20:29:41Z", "updated_at": "2023-09-18T20:29:41Z", "author_association": "OWNER", "body": "The one other thing affected by this change is this documentation, which suggests a not-actually-safe pattern: https://github.com/simonw/datasette/blob/6ed7908580fa2ba9297c3225d85c56f8b08b9937/docs/internals.rst#L1292-L1321", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, 
\"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1726749355", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1726749355, "node_id": "IC_kwDOBm6k_c5m7Bqr", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-20T01:28:16Z", "updated_at": "2023-09-20T01:28:16Z", "author_association": "OWNER", "body": "Added a note to that example in the documentation: https://github.com/simonw/datasette/blob/4e6a34179eaedec44c1263275d7592fd83d7e2ac/docs/internals.rst?plain=1#L1320", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1730162283", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1730162283, "node_id": "IC_kwDOBm6k_c5nIC5r", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-21T19:19:47Z", "updated_at": "2023-09-21T19:19:47Z", "author_association": "OWNER", "body": "I'm going to release this in `1.0a7`, and I'll backport it to a `0.64.4` release too.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1730231404", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1730231404, 
"node_id": "IC_kwDOBm6k_c5nITxs", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-21T20:10:28Z", "updated_at": "2023-09-21T20:10:28Z", "author_association": "OWNER", "body": "Release 0.64.4: https://docs.datasette.io/en/stable/changelog.html#v0-64-4", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1730232308", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1730232308, "node_id": "IC_kwDOBm6k_c5nIT_0", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-21T20:11:16Z", "updated_at": "2023-09-21T20:11:16Z", "author_association": "OWNER", "body": "We're planning a breaking change in `1.0a7`:\r\n- #2191 \r\n\r\nSince that's a breaking change I'm going to ship 1.0a7 right now with this fix, then ship that breaking change as `1.0a8` instead.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/2189#issuecomment-1730388418", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2189", "id": 1730388418, "node_id": "IC_kwDOBm6k_c5nI6HC", "user": {"value": 9599, "label": "simonw"}, "created_at": "2023-09-21T22:26:19Z", "updated_at": "2023-09-21T22:26:19Z", "author_association": "OWNER", "body": "1.0a7 is out with this fix as well now: https://docs.datasette.io/en/1.0a7/changelog.html#a7-2023-09-21", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, 
\"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1901416155, "label": "Server hang on parallel execution of queries to named in-memory databases"}, "performed_via_github_app": null}