html_url,issue_url,id,node_id,user,user_label,created_at,updated_at,author_association,body,reactions,issue,issue_label,performed_via_github_app https://github.com/simonw/datasette/issues/1555#issuecomment-997459958,https://api.github.com/repos/simonw/datasette/issues/1555,997459958,IC_kwDOBm6k_c47dAf2,9599,simonw,2021-12-19T20:55:59Z,2021-12-19T20:55:59Z,OWNER,"Closing this issue because I've optimized this a whole bunch, and it's definitely good enough for the moment.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1079149656,Optimize all those calls to index_list and foreign_key_list, https://github.com/simonw/datasette/issues/1555#issuecomment-997325189,https://api.github.com/repos/simonw/datasette/issues/1555,997325189,IC_kwDOBm6k_c47cfmF,9599,simonw,2021-12-19T03:55:01Z,2021-12-19T20:54:51Z,OWNER,"It's a bit annoying that the queries no longer show up in the trace at all now, thanks to running in `.execute_fn()`. I wonder if there's something smart I can do about that - maybe have `trace()` record that function with a traceback even though it doesn't have the executed SQL string? 5fac26aa221a111d7633f2dd92014641f7c0ade9 has the same problem.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1079149656,Optimize all those calls to index_list and foreign_key_list, https://github.com/simonw/datasette/issues/1555#issuecomment-997459637,https://api.github.com/repos/simonw/datasette/issues/1555,997459637,IC_kwDOBm6k_c47dAa1,9599,simonw,2021-12-19T20:53:46Z,2021-12-19T20:53:46Z,OWNER,Using #1571 showed me that the `DELETE FROM columns/foreign_keys/indexes WHERE database_name = ? and table_name = ?` queries were running way more times than I expected. I came up with a new optimization that just does `DELETE FROM columns/foreign_keys/indexes WHERE database_name = ?` instead.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1079149656,Optimize all those calls to index_list and foreign_key_list, https://github.com/simonw/datasette/issues/1555#issuecomment-997342494,https://api.github.com/repos/simonw/datasette/issues/1555,997342494,IC_kwDOBm6k_c47cj0e,9599,simonw,2021-12-19T07:22:04Z,2021-12-19T07:22:04Z,OWNER,"Another option would be to provide an abstraction that makes it easier to run a group of SQL queries in the same thread at the same time, and have them traced correctly.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1079149656,Optimize all those calls to index_list and foreign_key_list, https://github.com/simonw/datasette/issues/1555#issuecomment-997324666,https://api.github.com/repos/simonw/datasette/issues/1555,997324666,IC_kwDOBm6k_c47cfd6,9599,simonw,2021-12-19T03:47:51Z,2021-12-19T03:48:09Z,OWNER,"Here's a hacked together prototype of running all of that stuff inside a single function passed to `.execute_fn()`: ```diff diff --git a/datasette/utils/internal_db.py b/datasette/utils/internal_db.py index 95055d8..58f9982 100644 --- a/datasette/utils/internal_db.py +++ b/datasette/utils/internal_db.py @@ -1,4 +1,5 @@ import textwrap +from datasette.utils import table_column_details async def init_internal_db(db): @@ -70,49 +71,70 @@ async def populate_schema_tables(internal_db, db): ""DELETE FROM tables WHERE database_name = ?"", [database_name], block=True ) tables = (await db.execute(""select * from sqlite_master WHERE type = 'table'"")).rows - tables_to_insert = [] - columns_to_delete = [] - columns_to_insert = [] - foreign_keys_to_delete = [] - foreign_keys_to_insert = [] - indexes_to_delete = [] - indexes_to_insert = [] - for table in tables: - table_name = table[""name""] - tables_to_insert.append( - (database_name, table_name, table[""rootpage""], table[""sql""]) - ) - columns_to_delete.append((database_name, table_name)) - columns = await db.table_column_details(table_name) - columns_to_insert.extend( - { - **{""database_name"": database_name, ""table_name"": table_name}, - **column._asdict(), - } - for column in columns - ) - foreign_keys_to_delete.append((database_name, table_name)) - foreign_keys = ( - await db.execute(f""PRAGMA foreign_key_list([{table_name}])"") - ).rows - foreign_keys_to_insert.extend( - { - **{""database_name"": database_name, ""table_name"": table_name}, - **dict(foreign_key), - } - for foreign_key in foreign_keys - ) - indexes_to_delete.append((database_name, table_name)) - indexes = (await db.execute(f""PRAGMA index_list([{table_name}])"")).rows - indexes_to_insert.extend( - { - **{""database_name"": database_name, ""table_name"": table_name}, - **dict(index), - } - for index in indexes + def collect_info(conn): + tables_to_insert = [] + columns_to_delete = [] + columns_to_insert = [] + foreign_keys_to_delete = [] + foreign_keys_to_insert = [] + indexes_to_delete = [] + indexes_to_insert = [] + + for table in tables: + table_name = table[""name""] + tables_to_insert.append( + (database_name, table_name, table[""rootpage""], table[""sql""]) + ) + columns_to_delete.append((database_name, table_name)) + columns = table_column_details(conn, table_name) + columns_to_insert.extend( + { + **{""database_name"": database_name, ""table_name"": table_name}, + **column._asdict(), + } + for column in columns + ) + foreign_keys_to_delete.append((database_name, table_name)) + foreign_keys = conn.execute( + f""PRAGMA foreign_key_list([{table_name}])"" + ).fetchall() + foreign_keys_to_insert.extend( + { + **{""database_name"": database_name, ""table_name"": table_name}, + **dict(foreign_key), + } + for foreign_key in foreign_keys + ) + indexes_to_delete.append((database_name, table_name)) + indexes = conn.execute(f""PRAGMA index_list([{table_name}])"").fetchall() + indexes_to_insert.extend( + { + **{""database_name"": database_name, ""table_name"": table_name}, + **dict(index), + } + for index in indexes + ) + return ( + tables_to_insert, + columns_to_delete, + columns_to_insert, + foreign_keys_to_delete, + foreign_keys_to_insert, + indexes_to_delete, + indexes_to_insert, ) + ( + tables_to_insert, + columns_to_delete, + columns_to_insert, + foreign_keys_to_delete, + foreign_keys_to_insert, + indexes_to_delete, + indexes_to_insert, + ) = await db.execute_fn(collect_info) + await internal_db.execute_write_many( """""" INSERT INTO tables (database_name, table_name, rootpage, sql) ``` First impressions: it looks like this helps **a lot** - as far as I can tell this is now taking around 21ms to get to the point at which all of those internal databases have been populated, where previously it took more than 180ms. ![CleanShot 2021-12-18 at 19 47 22@2x](https://user-images.githubusercontent.com/9599/146663192-bba098d5-e7bd-4e2e-b525-2270867888a0.png) ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1079149656,Optimize all those calls to index_list and foreign_key_list, https://github.com/simonw/datasette/issues/1555#issuecomment-997324156,https://api.github.com/repos/simonw/datasette/issues/1555,997324156,IC_kwDOBm6k_c47cfV8,9599,simonw,2021-12-19T03:40:05Z,2021-12-19T03:40:05Z,OWNER,"Using the prototype of this: - https://github.com/simonw/datasette-pretty-traces/issues/5 I'm seeing about 180ms spent running all of these queries on startup! ![CleanShot 2021-12-18 at 19 38 37@2x](https://user-images.githubusercontent.com/9599/146663045-46bda669-90de-474f-8870-345182725dc1.png) ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1079149656,Optimize all those calls to index_list and foreign_key_list, https://github.com/simonw/datasette/issues/1555#issuecomment-997321767,https://api.github.com/repos/simonw/datasette/issues/1555,997321767,IC_kwDOBm6k_c47cewn,9599,simonw,2021-12-19T03:10:58Z,2021-12-19T03:10:58Z,OWNER,"I wonder how much overhead there is switching between the `async` event loop main code and the thread that runs the SQL queries. Would there be a performance boost if I gathered all of the column/index information in a single function run on the thread using `db.execute_fn()` I wonder? It would eliminate a bunch of switching between threads. Would be great to understand how much of an impact that would have.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1079149656,Optimize all those calls to index_list and foreign_key_list, https://github.com/simonw/datasette/issues/1555#issuecomment-997321653,https://api.github.com/repos/simonw/datasette/issues/1555,997321653,IC_kwDOBm6k_c47ceu1,9599,simonw,2021-12-19T03:09:43Z,2021-12-19T03:09:43Z,OWNER,"On that same documentation page I just spotted this: > This feature is experimental and is subject to change. Further documentation will become available if and when the table-valued functions for PRAGMAs feature becomes officially supported. This makes me nervous to rely on pragma function optimizations in Datasette itself.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1079149656,Optimize all those calls to index_list and foreign_key_list, https://github.com/simonw/datasette/issues/1555#issuecomment-997321477,https://api.github.com/repos/simonw/datasette/issues/1555,997321477,IC_kwDOBm6k_c47cesF,9599,simonw,2021-12-19T03:07:33Z,2021-12-19T03:07:33Z,OWNER,"If I want to continue supporting SQLite prior to 3.16.0 (2017-01-02) I'll need this optimization to only kick in with versions that support table-valued PRAGMA functions, while keeping the old `PRAGMA foreign_key_list(table)` stuff working for those older versions. That's feasible, but it's a bit more work - and I need to make sure I have robust testing in place for SQLite 3.15.0.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1079149656,Optimize all those calls to index_list and foreign_key_list, https://github.com/simonw/datasette/issues/1555#issuecomment-997321327,https://api.github.com/repos/simonw/datasette/issues/1555,997321327,IC_kwDOBm6k_c47cepv,9599,simonw,2021-12-19T03:05:39Z,2021-12-19T03:05:44Z,OWNER,"This caught me out once before in: - https://github.com/simonw/datasette/issues/1276 Turns out Glitch was running SQLite 3.11.0 from 2016-02-15.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1079149656,Optimize all those calls to index_list and foreign_key_list, https://github.com/simonw/datasette/issues/1555#issuecomment-997321217,https://api.github.com/repos/simonw/datasette/issues/1555,997321217,IC_kwDOBm6k_c47ceoB,9599,simonw,2021-12-19T03:04:16Z,2021-12-19T03:04:16Z,OWNER,"One thing to watch out for though, from https://sqlite.org/pragma.html#pragfunc > The table-valued functions for PRAGMA feature was added in SQLite version 3.16.0 (2017-01-02). Prior versions of SQLite cannot use this feature. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1079149656,Optimize all those calls to index_list and foreign_key_list, https://github.com/simonw/datasette/issues/1555#issuecomment-997321115,https://api.github.com/repos/simonw/datasette/issues/1555,997321115,IC_kwDOBm6k_c47cemb,9599,simonw,2021-12-19T03:03:12Z,2021-12-19T03:03:12Z,OWNER,"Table columns is a bit harder, because `table_xinfo` is only in SQLite 3.26.0 or higher: https://github.com/simonw/datasette/blob/d637ed46762fdbbd8e32b86f258cd9a53c1cfdc7/datasette/utils/__init__.py#L565-L581 So if that function is available: https://latest.datasette.io/fixtures?sql=SELECT%0D%0A++sqlite_master.name%2C%0D%0A++table_xinfo.*%0D%0AFROM%0D%0A++sqlite_master%2C%0D%0A++pragma_table_xinfo%28sqlite_master.name%29+AS+table_xinfo%0D%0AWHERE%0D%0A++sqlite_master.type+%3D+%27table%27 ```sql SELECT sqlite_master.name, table_xinfo.* FROM sqlite_master, pragma_table_xinfo(sqlite_master.name) AS table_xinfo WHERE sqlite_master.type = 'table' ``` And otherwise, using `table_info`: https://latest.datasette.io/fixtures?sql=SELECT%0D%0A++sqlite_master.name%2C%0D%0A++table_info.*%2C%0D%0A++0+as+hidden%0D%0AFROM%0D%0A++sqlite_master%2C%0D%0A++pragma_table_info%28sqlite_master.name%29+AS+table_info%0D%0AWHERE%0D%0A++sqlite_master.type+%3D+%27table%27 ```sql SELECT sqlite_master.name, table_info.*, 0 as hidden FROM sqlite_master, pragma_table_info(sqlite_master.name) AS table_info WHERE sqlite_master.type = 'table' ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1079149656,Optimize all those calls to index_list and foreign_key_list, https://github.com/simonw/datasette/issues/1555#issuecomment-997320824,https://api.github.com/repos/simonw/datasette/issues/1555,997320824,IC_kwDOBm6k_c47ceh4,9599,simonw,2021-12-19T02:59:57Z,2021-12-19T03:00:44Z,OWNER,"To list all indexes: https://latest.datasette.io/fixtures?sql=SELECT%0D%0A++sqlite_master.name%2C%0D%0A++index_list.*%0D%0AFROM%0D%0A++sqlite_master%2C%0D%0A++pragma_index_list%28sqlite_master.name%29+AS+index_list%0D%0AWHERE%0D%0A++sqlite_master.type+%3D+%27table%27 ```sql SELECT sqlite_master.name, index_list.* FROM sqlite_master, pragma_index_list(sqlite_master.name) AS index_list WHERE sqlite_master.type = 'table' ``` Foreign keys: https://latest.datasette.io/fixtures?sql=SELECT%0D%0A++sqlite_master.name%2C%0D%0A++foreign_key_list.*%0D%0AFROM%0D%0A++sqlite_master%2C%0D%0A++pragma_foreign_key_list%28sqlite_master.name%29+AS+foreign_key_list%0D%0AWHERE%0D%0A++sqlite_master.type+%3D+%27table%27 ```sql SELECT sqlite_master.name, foreign_key_list.* FROM sqlite_master, pragma_foreign_key_list(sqlite_master.name) AS foreign_key_list WHERE sqlite_master.type = 'table' ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1079149656,Optimize all those calls to index_list and foreign_key_list,