html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,issue,performed_via_github_app https://github.com/simonw/datasette/issues/1268#issuecomment-804261915,https://api.github.com/repos/simonw/datasette/issues/1268,804261915,MDEyOklzc3VlQ29tbWVudDgwNDI2MTkxNQ==,9599,2021-03-22T17:41:12Z,2021-03-22T17:41:12Z,OWNER,"Closing this because I've figured out the root of the problem now, and I have a potential solution.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",837308703, https://github.com/simonw/datasette/issues/1268#issuecomment-803802957,https://api.github.com/repos/simonw/datasette/issues/1268,803802957,MDEyOklzc3VlQ29tbWVudDgwMzgwMjk1Nw==,9599,2021-03-22T06:38:14Z,2021-03-22T06:38:14Z,OWNER,"Also worth trying is to change this code: ```python n = 1000 if ms < 50: n = 1 ``` What happens with `n = 10` instead?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",837308703, https://github.com/simonw/datasette/issues/1268#issuecomment-803784902,https://api.github.com/repos/simonw/datasette/issues/1268,803784902,MDEyOklzc3VlQ29tbWVudDgwMzc4NDkwMg==,9599,2021-03-22T05:59:06Z,2021-03-22T05:59:06Z,OWNER,"Even if I implement that workaround in #1269 I'm concerned that this could still allow users to deliberately crash Datasette (if it's running SpatiaLite 5.0) by executing `select count(*) from SpatialIndex`. 
That `interrupt` timeout mechanism is worth digging into further.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",837308703, https://github.com/simonw/datasette/issues/1268#issuecomment-803782705,https://api.github.com/repos/simonw/datasette/issues/1268,803782705,MDEyOklzc3VlQ29tbWVudDgwMzc4MjcwNQ==,9599,2021-03-22T05:54:19Z,2021-03-22T05:54:19Z,OWNER,"Got two new TILs out of this: * [Tracing every executed Python statement](https://til.simonwillison.net/python/tracing-every-statement) * [Running gdb against a Python process in a running Docker container](https://til.simonwillison.net/docker/gdb-python-docker)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",837308703, https://github.com/simonw/datasette/issues/1268#issuecomment-803777724,https://api.github.com/repos/simonw/datasette/issues/1268,803777724,MDEyOklzc3VlQ29tbWVudDgwMzc3NzcyNA==,9599,2021-03-22T05:42:50Z,2021-03-22T05:43:23Z,OWNER," If I want to avoid counting virtual tables, I need to detect which tables are virtual tables. The safest way to do this is probably to pull the `sql` for every table and then, in Python, check for values that start with `create virtual table` after converting to lower case, using any number of spaces. This would catch things like ` CREATE virtual TABLE` which might be missed by a SQL `like` query. 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",837308703, https://github.com/simonw/datasette/issues/1268#issuecomment-803775121,https://api.github.com/repos/simonw/datasette/issues/1268,803775121,MDEyOklzc3VlQ29tbWVudDgwMzc3NTEyMQ==,9599,2021-03-22T05:36:26Z,2021-03-22T05:36:26Z,OWNER,So one fix could be to avoid running counts for anything that turns out to be a virtual table.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",837308703, https://github.com/simonw/datasette/issues/1268#issuecomment-803774926,https://api.github.com/repos/simonw/datasette/issues/1268,803774926,MDEyOklzc3VlQ29tbWVudDgwMzc3NDkyNg==,9599,2021-03-22T05:35:56Z,2021-03-22T05:35:56Z,OWNER,That's in this code here: https://github.com/simonw/datasette/blob/c4f1ec7f33fd7d5b93f0f895dafb5351cc3bfc5b/datasette/database.py#L221-L241,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",837308703, https://github.com/simonw/datasette/issues/1268#issuecomment-803774518,https://api.github.com/repos/simonw/datasette/issues/1268,803774518,MDEyOklzc3VlQ29tbWVudDgwMzc3NDUxOA==,9599,2021-03-22T05:34:57Z,2021-03-22T05:34:57Z,OWNER,"... 
and sure enough, adding this code fixed the problem: ```diff diff --git a/datasette/database.py b/datasette/database.py index 3579cce..b466b12 100644 --- a/datasette/database.py +++ b/datasette/database.py @@ -224,6 +226,9 @@ class Database: # Try to get counts for each table, $limit timeout for each count counts = {} for table in await self.table_names(): + if table == ""SpatialIndex"": + counts[table] = 0 + continue try: table_count = ( await self.execute( ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",837308703, https://github.com/simonw/datasette/issues/1268#issuecomment-803773484,https://api.github.com/repos/simonw/datasette/issues/1268,803773484,MDEyOklzc3VlQ29tbWVudDgwMzc3MzQ4NA==,9599,2021-03-22T05:32:29Z,2021-03-22T05:32:29Z,OWNER,"To figure out which SQL query triggers the problem I added this code to write to a log file: ```python with sqlite_timelimit(conn, time_limit_ms): try: cursor = conn.cursor() with open(""/tmp/sql.log"", ""ab"", buffering=0) as fp: fp.write((""{}: {}\n"".format(sql, params)).encode(""utf-8"")) cursor.execute(sql, params if params is not None else {}) ``` I had to use `ab` binary mode because Python doesn't allow `buffering=0` for non-binary file operations. With the log enabled, I used `docker exec -it 589ae68de943 bash` to attach to the running container and `tail -f /tmp/sql.log` to see the logs. 
Here's where it broke: ``` select count(*) from [idx_civici_geom_parent]: None select count(*) from [sqlite_stat1]: None select count(*) from [sqlite_stat3]: None select count(*) from [SpatialIndex]: None ``` So attempting to run a `count(*)` against the `SpatialIndex` virtual table is the thing that triggers the bug.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",837308703, https://github.com/simonw/datasette/issues/1268#issuecomment-803764919,https://api.github.com/repos/simonw/datasette/issues/1268,803764919,MDEyOklzc3VlQ29tbWVudDgwMzc2NDkxOQ==,9599,2021-03-22T05:11:11Z,2021-03-22T05:11:11Z,OWNER,"Maybe I could implement SQLite query timeouts using the `interrupt()` method instead of the progress handler hack I'm currently using? https://stackoverflow.com/questions/43240496/python-sqlite3-how-to-quickly-and-cleanly-interrupt-long-running-query-with-e has some tips.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",837308703, https://github.com/simonw/datasette/issues/1268#issuecomment-803764200,https://api.github.com/repos/simonw/datasette/issues/1268,803764200,MDEyOklzc3VlQ29tbWVudDgwMzc2NDIwMA==,9599,2021-03-22T05:09:13Z,2021-03-22T05:09:13Z,OWNER,"I tried building a container where the `conn.set_progress_handler(handler, n)` line was commented out... 
and it fixed the bug.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",837308703, https://github.com/simonw/datasette/issues/1268#issuecomment-803762969,https://api.github.com/repos/simonw/datasette/issues/1268,803762969,MDEyOklzc3VlQ29tbWVudDgwMzc2Mjk2OQ==,9599,2021-03-22T05:05:51Z,2021-03-22T05:05:51Z,OWNER,I had to run `docker kill 16197781a7b5` to kill the broken container - Ctrl+C in the Datasette console window didn't do anything.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",837308703, https://github.com/simonw/datasette/issues/1268#issuecomment-803762609,https://api.github.com/repos/simonw/datasette/issues/1268,803762609,MDEyOklzc3VlQ29tbWVudDgwMzc2MjYwOQ==,9599,2021-03-22T05:05:00Z,2021-03-22T05:05:00Z,OWNER,"Using https://til.simonwillison.net/docker/attach-bash-to-running-container - I figured out how to run `gdb`. I had to use `--privileged` here because otherwise `gdb` showed a ""Could not attach to process"" error. ``` docker exec --privileged -it 16197781a7b5 bash # apt-get install gdb python3-dbg # gdb /usr/bin/python3 -p 20 ``` This paused the process. 
I tried running this: ``` (gdb) py-bt Traceback (most recent call first): File ""/usr/lib/python3.8/asyncio/base_events.py"", line 1845, in _run_once if handle._cancelled: File ""/usr/lib/python3.8/asyncio/base_events.py"", line 570, in run_forever self._run_once() File ""/usr/lib/python3.8/asyncio/base_events.py"", line 603, in run_until_complete self.run_forever() File ""/usr/local/lib/python3.8/dist-packages/uvicorn/server.py"", line 49, in run loop.run_until_complete(self.serve(sockets=sockets)) File ""/usr/local/lib/python3.8/dist-packages/uvicorn/main.py"", line 386, in run server.run() File ""/usr/local/lib/python3.8/dist-packages/datasette/cli.py"", line 575, in serve uvicorn.run(ds.app(), **uvicorn_kwargs) File ""/usr/local/lib/python3.8/dist-packages/click/core.py"", line 610, in invoke return callback(*args, **kwargs) File ""/usr/local/lib/python3.8/dist-packages/click/core.py"", line 1066, in invoke return ctx.invoke(self.callback, **ctx.params) File ""/usr/local/lib/python3.8/dist-packages/click/core.py"", line 1259, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File ""/usr/local/lib/python3.8/dist-packages/click/core.py"", line 782, in main rv = self.invoke(ctx) File ""/usr/local/lib/python3.8/dist-packages/click/core.py"", line 829, in __call__ return self.main(*args, **kwargs) File ""/usr/local/bin/datasette"", line 8, in <module> sys.exit(cli()) File ""/usr/lib/python3.8/trace.py"", line 450, in runctx exec(cmd, globals, locals) File ""/usr/lib/python3.8/trace.py"", line 6632, in main File ""/usr/lib/python3.8/trace.py"", line 756, in <module> main() File ""/usr/lib/python3.8/runpy.py"", line 343, in _run_code File ""/usr/lib/python3.8/runpy.py"", line 450, in _run_module_as_main ``` Not sure if that's useful or not.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",837308703,
https://github.com/simonw/datasette/issues/1268#issuecomment-803759051,https://api.github.com/repos/simonw/datasette/issues/1268,803759051,MDEyOklzc3VlQ29tbWVudDgwMzc1OTA1MQ==,9599,2021-03-22T04:55:22Z,2021-03-22T04:55:22Z,OWNER,So I think there's a bug in the way the `set_progress_handler()` mechanism works when used in conjunction with SpatiaLite 5.0 on Linux.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",837308703, https://github.com/simonw/datasette/issues/1268#issuecomment-803758793,https://api.github.com/repos/simonw/datasette/issues/1268,803758793,MDEyOklzc3VlQ29tbWVudDgwMzc1ODc5Mw==,9599,2021-03-22T04:54:32Z,2021-03-22T04:54:32Z,OWNER,"Hitting http://localhost:8001/tuscany_housenumbers triggers the bug. It gets stuck in a loop that looks like this: Which looks to me like this code: https://github.com/simonw/datasette/blob/8e18c7943181f228ce5ebcea48deb59ce50bee1f/datasette/utils/__init__.py#L139-L158","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",837308703, https://github.com/simonw/datasette/issues/1268#issuecomment-803758182,https://api.github.com/repos/simonw/datasette/issues/1268,803758182,MDEyOklzc3VlQ29tbWVudDgwMzc1ODE4Mg==,9599,2021-03-22T04:52:15Z,2021-03-22T04:52:15Z,OWNER,Hitting http://localhost:8001/ successfully shows the homepage (after a lot more scrolling).,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",837308703, https://github.com/simonw/datasette/issues/1268#issuecomment-803757746,https://api.github.com/repos/simonw/datasette/issues/1268,803757746,MDEyOklzc3VlQ29tbWVudDgwMzc1Nzc0Ng==,9599,2021-03-22T04:50:40Z,2021-03-22T04:51:52Z,OWNER,"Here's a fun debugging trick: docker run -it -p 8001:8001 -v `pwd`:/mnt datasette-spatialite:latest bash root@16197781a7b5:/# python3 -m trace --trace 
$(which datasette) \ -p 8001 -h 0.0.0.0 /mnt/tuscany_housenumbers.sqlite \ --load-extension=spatialite A huge amount of stuff scrolls past as Datasette starts up, since we are tracing every executed line of Python. After about a minute it's finished starting and gets to this point: ``` selectors.py(452): if timeout is None: selectors.py(454): elif timeout <= 0: selectors.py(459): timeout = math.ceil(timeout * 1e3) * 1e-3 selectors.py(464): max_ev = max(len(self._fd_to_key), 1) selectors.py(466): ready = [] selectors.py(467): try: selectors.py(468): fd_event_list = self._selector.poll(timeout, max_ev) ``` Now I can make some HTTP requests against it. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",837308703, https://github.com/simonw/datasette/issues/1268#issuecomment-803756495,https://api.github.com/repos/simonw/datasette/issues/1268,803756495,MDEyOklzc3VlQ29tbWVudDgwMzc1NjQ5NQ==,9599,2021-03-22T04:46:04Z,2021-03-22T04:46:04Z,OWNER,`gdb` may be able to help debug this: https://www.podoliaka.org/2016/04/10/debugging-cpython-gdb/,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",837308703,