
issue_comments


13 rows where issue = 1079149656, "updated_at" is on date 2021-12-19 and user = 9599 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
997459958 https://github.com/simonw/datasette/issues/1555#issuecomment-997459958 https://api.github.com/repos/simonw/datasette/issues/1555 IC_kwDOBm6k_c47dAf2 simonw 9599 2021-12-19T20:55:59Z 2021-12-19T20:55:59Z OWNER

Closing this issue because I've optimized this a whole bunch, and it's definitely good enough for the moment.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Optimize all those calls to index_list and foreign_key_list 1079149656  
997325189 https://github.com/simonw/datasette/issues/1555#issuecomment-997325189 https://api.github.com/repos/simonw/datasette/issues/1555 IC_kwDOBm6k_c47cfmF simonw 9599 2021-12-19T03:55:01Z 2021-12-19T20:54:51Z OWNER

It's a bit annoying that the queries no longer show up in the trace at all now, thanks to running in .execute_fn(). I wonder if there's something smart I can do about that - maybe have trace() record that function with a traceback even though it doesn't have the executed SQL string?

5fac26aa221a111d7633f2dd92014641f7c0ade9 has the same problem.
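One possible shape for that idea, as a hypothetical sketch (this is not Datasette's actual `trace()` API - the `trace_fn` helper and the record fields are invented for illustration): record the function's name and a captured traceback in the trace entry, in the slot where a query trace would normally carry its SQL string.

```python
import time
import traceback

# Hypothetical trace log; Datasette's real tracer has a different structure
trace_log = []


def trace_fn(fn):
    """Run fn, recording its name and a traceback instead of a SQL string."""
    start = time.perf_counter()
    try:
        return fn()
    finally:
        trace_log.append(
            {
                "type": "execute_fn",
                "name": getattr(fn, "__name__", repr(fn)),
                # A short traceback shows *where* the function was dispatched from
                "traceback": traceback.format_stack(limit=3),
                "duration_ms": (time.perf_counter() - start) * 1000,
            }
        )


result = trace_fn(lambda: 2 + 2)
```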

997459637 https://github.com/simonw/datasette/issues/1555#issuecomment-997459637 https://api.github.com/repos/simonw/datasette/issues/1555 IC_kwDOBm6k_c47dAa1 simonw 9599 2021-12-19T20:53:46Z 2021-12-19T20:53:46Z OWNER

Using #1571 showed me that the `DELETE FROM columns/foreign_keys/indexes WHERE database_name = ? and table_name = ?` queries were running way more times than I expected. I came up with a new optimization that just does `DELETE FROM columns/foreign_keys/indexes WHERE database_name = ?` instead.
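The shape of that optimization can be sketched with plain `sqlite3` (the table and rows here are illustrative, not Datasette's actual internal schema): one `DELETE` keyed on the database name replaces N per-table deletes.

```python
import sqlite3

# A toy stand-in for one of the internal schema tables
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE columns (database_name TEXT, table_name TEXT, name TEXT)"
)
conn.executemany(
    "INSERT INTO columns VALUES (?, ?, ?)",
    [
        ("fixtures", "facetable", "id"),
        ("fixtures", "searchable", "id"),
    ],
)

# Before: one DELETE per (database, table) pair - N queries for N tables
# for table_name in table_names:
#     conn.execute(
#         "DELETE FROM columns WHERE database_name = ? AND table_name = ?",
#         ("fixtures", table_name),
#     )

# After: a single DELETE per database, however many tables it has
conn.execute("DELETE FROM columns WHERE database_name = ?", ("fixtures",))
```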

997342494 https://github.com/simonw/datasette/issues/1555#issuecomment-997342494 https://api.github.com/repos/simonw/datasette/issues/1555 IC_kwDOBm6k_c47cj0e simonw 9599 2021-12-19T07:22:04Z 2021-12-19T07:22:04Z OWNER

Another option would be to provide an abstraction that makes it easier to run a group of SQL queries in the same thread at the same time, and have them traced correctly.

997324666 https://github.com/simonw/datasette/issues/1555#issuecomment-997324666 https://api.github.com/repos/simonw/datasette/issues/1555 IC_kwDOBm6k_c47cfd6 simonw 9599 2021-12-19T03:47:51Z 2021-12-19T03:48:09Z OWNER

Here's a hacked-together prototype of running all of that stuff inside a single function passed to `.execute_fn()`:

```diff
diff --git a/datasette/utils/internal_db.py b/datasette/utils/internal_db.py
index 95055d8..58f9982 100644
--- a/datasette/utils/internal_db.py
+++ b/datasette/utils/internal_db.py
@@ -1,4 +1,5 @@
 import textwrap
+from datasette.utils import table_column_details
 
 
 async def init_internal_db(db):
@@ -70,49 +71,70 @@ async def populate_schema_tables(internal_db, db):
         "DELETE FROM tables WHERE database_name = ?", [database_name], block=True
     )
     tables = (await db.execute("select * from sqlite_master WHERE type = 'table'")).rows
-    tables_to_insert = []
-    columns_to_delete = []
-    columns_to_insert = []
-    foreign_keys_to_delete = []
-    foreign_keys_to_insert = []
-    indexes_to_delete = []
-    indexes_to_insert = []
-    for table in tables:
-        table_name = table["name"]
-        tables_to_insert.append(
-            (database_name, table_name, table["rootpage"], table["sql"])
-        )
-        columns_to_delete.append((database_name, table_name))
-        columns = await db.table_column_details(table_name)
-        columns_to_insert.extend(
-            {
-                **{"database_name": database_name, "table_name": table_name},
-                **column._asdict(),
-            }
-            for column in columns
-        )
-        foreign_keys_to_delete.append((database_name, table_name))
-        foreign_keys = (
-            await db.execute(f"PRAGMA foreign_key_list([{table_name}])")
-        ).rows
-        foreign_keys_to_insert.extend(
-            {
-                **{"database_name": database_name, "table_name": table_name},
-                **dict(foreign_key),
-            }
-            for foreign_key in foreign_keys
-        )
-        indexes_to_delete.append((database_name, table_name))
-        indexes = (await db.execute(f"PRAGMA index_list([{table_name}])")).rows
-        indexes_to_insert.extend(
-            {
-                **{"database_name": database_name, "table_name": table_name},
-                **dict(index),
-            }
-            for index in indexes
-        )
+    def collect_info(conn):
+        tables_to_insert = []
+        columns_to_delete = []
+        columns_to_insert = []
+        foreign_keys_to_delete = []
+        foreign_keys_to_insert = []
+        indexes_to_delete = []
+        indexes_to_insert = []
+
+        for table in tables:
+            table_name = table["name"]
+            tables_to_insert.append(
+                (database_name, table_name, table["rootpage"], table["sql"])
+            )
+            columns_to_delete.append((database_name, table_name))
+            columns = table_column_details(conn, table_name)
+            columns_to_insert.extend(
+                {
+                    **{"database_name": database_name, "table_name": table_name},
+                    **column._asdict(),
+                }
+                for column in columns
+            )
+            foreign_keys_to_delete.append((database_name, table_name))
+            foreign_keys = conn.execute(
+                f"PRAGMA foreign_key_list([{table_name}])"
+            ).fetchall()
+            foreign_keys_to_insert.extend(
+                {
+                    **{"database_name": database_name, "table_name": table_name},
+                    **dict(foreign_key),
+                }
+                for foreign_key in foreign_keys
+            )
+            indexes_to_delete.append((database_name, table_name))
+            indexes = conn.execute(f"PRAGMA index_list([{table_name}])").fetchall()
+            indexes_to_insert.extend(
+                {
+                    **{"database_name": database_name, "table_name": table_name},
+                    **dict(index),
+                }
+                for index in indexes
+            )
+        return (
+            tables_to_insert,
+            columns_to_delete,
+            columns_to_insert,
+            foreign_keys_to_delete,
+            foreign_keys_to_insert,
+            indexes_to_delete,
+            indexes_to_insert,
+        )
+
+    (
+        tables_to_insert,
+        columns_to_delete,
+        columns_to_insert,
+        foreign_keys_to_delete,
+        foreign_keys_to_insert,
+        indexes_to_delete,
+        indexes_to_insert,
+    ) = await db.execute_fn(collect_info)
+
     await internal_db.execute_write_many(
         """
     INSERT INTO tables (database_name, table_name, rootpage, sql)
```

First impressions: it looks like this helps a lot - as far as I can tell this is now taking around 21ms to get to the point at which all of those internal databases have been populated, where previously it took more than 180ms.

997324156 https://github.com/simonw/datasette/issues/1555#issuecomment-997324156 https://api.github.com/repos/simonw/datasette/issues/1555 IC_kwDOBm6k_c47cfV8 simonw 9599 2021-12-19T03:40:05Z 2021-12-19T03:40:05Z OWNER

Using the prototype of this:

  • https://github.com/simonw/datasette-pretty-traces/issues/5

I'm seeing about 180ms spent running all of these queries on startup!

997321767 https://github.com/simonw/datasette/issues/1555#issuecomment-997321767 https://api.github.com/repos/simonw/datasette/issues/1555 IC_kwDOBm6k_c47cewn simonw 9599 2021-12-19T03:10:58Z 2021-12-19T03:10:58Z OWNER

I wonder how much overhead there is switching between the async event loop main code and the thread that runs the SQL queries.

Would there be a performance boost if I gathered all of the column/index information in a single function run on the thread using db.execute_fn() I wonder? It would eliminate a bunch of switching between threads.

Would be great to understand how much of an impact that would have.
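The idea can be sketched with plain `sqlite3` and an executor thread (names here are illustrative - this is not Datasette's `db.execute_fn()` implementation): hand a whole function to the database thread so every PRAGMA runs with a single thread hop, instead of one hop per query.

```python
import asyncio
import sqlite3


def gather_schema_info(conn):
    """Run all the per-table PRAGMA queries in one batch on one thread."""
    info = {}
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    ]
    for table in tables:
        info[table] = {
            "indexes": conn.execute(f"PRAGMA index_list([{table}])").fetchall(),
            "foreign_keys": conn.execute(
                f"PRAGMA foreign_key_list([{table}])"
            ).fetchall(),
        }
    return info


async def main():
    # check_same_thread=False because the executor thread, not this one,
    # runs the queries
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
    loop = asyncio.get_running_loop()
    # One round-trip to the thread pool covers every query above
    return await loop.run_in_executor(None, gather_schema_info, conn)


info = asyncio.run(main())
```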

997321653 https://github.com/simonw/datasette/issues/1555#issuecomment-997321653 https://api.github.com/repos/simonw/datasette/issues/1555 IC_kwDOBm6k_c47ceu1 simonw 9599 2021-12-19T03:09:43Z 2021-12-19T03:09:43Z OWNER

On that same documentation page I just spotted this:

> This feature is experimental and is subject to change. Further documentation will become available if and when the table-valued functions for PRAGMAs feature becomes officially supported.

This makes me nervous to rely on pragma function optimizations in Datasette itself.

997321477 https://github.com/simonw/datasette/issues/1555#issuecomment-997321477 https://api.github.com/repos/simonw/datasette/issues/1555 IC_kwDOBm6k_c47cesF simonw 9599 2021-12-19T03:07:33Z 2021-12-19T03:07:33Z OWNER

If I want to continue supporting SQLite prior to 3.16.0 (2017-01-02) I'll need this optimization to only kick in with versions that support table-valued PRAGMA functions, while keeping the old PRAGMA foreign_key_list(table) stuff working for those older versions.

That's feasible, but it's a bit more work - and I need to make sure I have robust testing in place for SQLite 3.15.0.
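A sketch of what that gating could look like (the version check and fallback structure are illustrative, not Datasette's actual code): table-valued PRAGMA functions only kick in on SQLite 3.16.0+, with per-table `PRAGMA index_list(table)` calls for older versions.

```python
import sqlite3

# Table-valued PRAGMA functions were added in SQLite 3.16.0 (2017-01-02)
SUPPORTS_PRAGMA_FUNCTIONS = sqlite3.sqlite_version_info >= (3, 16, 0)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")

if SUPPORTS_PRAGMA_FUNCTIONS:
    # Fast path: one query joining sqlite_master against pragma_index_list
    rows = conn.execute(
        "SELECT sqlite_master.name, index_list.* "
        "FROM sqlite_master, pragma_index_list(sqlite_master.name) AS index_list "
        "WHERE sqlite_master.type = 'table'"
    ).fetchall()
else:
    # Fallback for pre-3.16.0: one PRAGMA per table
    rows = []
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ):
        rows.extend(
            (name,) + tuple(r)
            for r in conn.execute(f"PRAGMA index_list([{name}])")
        )
```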

997321327 https://github.com/simonw/datasette/issues/1555#issuecomment-997321327 https://api.github.com/repos/simonw/datasette/issues/1555 IC_kwDOBm6k_c47cepv simonw 9599 2021-12-19T03:05:39Z 2021-12-19T03:05:44Z OWNER

This caught me out once before in:

  • https://github.com/simonw/datasette/issues/1276

Turns out Glitch was running SQLite 3.11.0 from 2016-02-15.

997321217 https://github.com/simonw/datasette/issues/1555#issuecomment-997321217 https://api.github.com/repos/simonw/datasette/issues/1555 IC_kwDOBm6k_c47ceoB simonw 9599 2021-12-19T03:04:16Z 2021-12-19T03:04:16Z OWNER

One thing to watch out for though, from https://sqlite.org/pragma.html#pragfunc:

> The table-valued functions for PRAGMA feature was added in SQLite version 3.16.0 (2017-01-02). Prior versions of SQLite cannot use this feature.

997321115 https://github.com/simonw/datasette/issues/1555#issuecomment-997321115 https://api.github.com/repos/simonw/datasette/issues/1555 IC_kwDOBm6k_c47cemb simonw 9599 2021-12-19T03:03:12Z 2021-12-19T03:03:12Z OWNER

Table columns is a bit harder, because table_xinfo is only in SQLite 3.26.0 or higher: https://github.com/simonw/datasette/blob/d637ed46762fdbbd8e32b86f258cd9a53c1cfdc7/datasette/utils/__init__.py#L565-L581

So if that function is available: https://latest.datasette.io/fixtures?sql=SELECT%0D%0A++sqlite_master.name%2C%0D%0A++table_xinfo.*%0D%0AFROM%0D%0A++sqlite_master%2C%0D%0A++pragma_table_xinfo%28sqlite_master.name%29+AS+table_xinfo%0D%0AWHERE%0D%0A++sqlite_master.type+%3D+%27table%27

```sql
SELECT
  sqlite_master.name,
  table_xinfo.*
FROM
  sqlite_master,
  pragma_table_xinfo(sqlite_master.name) AS table_xinfo
WHERE
  sqlite_master.type = 'table'
```

And otherwise, using table_info: https://latest.datasette.io/fixtures?sql=SELECT%0D%0A++sqlite_master.name%2C%0D%0A++table_info.*%2C%0D%0A++0+as+hidden%0D%0AFROM%0D%0A++sqlite_master%2C%0D%0A++pragma_table_info%28sqlite_master.name%29+AS+table_info%0D%0AWHERE%0D%0A++sqlite_master.type+%3D+%27table%27

```sql
SELECT
  sqlite_master.name,
  table_info.*,
  0 as hidden
FROM
  sqlite_master,
  pragma_table_info(sqlite_master.name) AS table_info
WHERE
  sqlite_master.type = 'table'
```
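That version-dependent fallback can be sketched in Python (the branch structure is illustrative, not Datasette's actual helper): pick table_xinfo when SQLite is 3.26.0+, otherwise use table_info and fake `hidden` as 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")

if sqlite3.sqlite_version_info >= (3, 26, 0):
    # table_xinfo includes the real "hidden" column
    sql = (
        "SELECT sqlite_master.name, table_xinfo.* "
        "FROM sqlite_master, pragma_table_xinfo(sqlite_master.name) AS table_xinfo "
        "WHERE sqlite_master.type = 'table'"
    )
else:
    # Older SQLite: table_info plus a constant 0 standing in for "hidden"
    sql = (
        "SELECT sqlite_master.name, table_info.*, 0 AS hidden "
        "FROM sqlite_master, pragma_table_info(sqlite_master.name) AS table_info "
        "WHERE sqlite_master.type = 'table'"
    )

columns = conn.execute(sql).fetchall()  # one row per column, per table
```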

997320824 https://github.com/simonw/datasette/issues/1555#issuecomment-997320824 https://api.github.com/repos/simonw/datasette/issues/1555 IC_kwDOBm6k_c47ceh4 simonw 9599 2021-12-19T02:59:57Z 2021-12-19T03:00:44Z OWNER

To list all indexes: https://latest.datasette.io/fixtures?sql=SELECT%0D%0A++sqlite_master.name%2C%0D%0A++index_list.*%0D%0AFROM%0D%0A++sqlite_master%2C%0D%0A++pragma_index_list%28sqlite_master.name%29+AS+index_list%0D%0AWHERE%0D%0A++sqlite_master.type+%3D+%27table%27

```sql
SELECT
  sqlite_master.name,
  index_list.*
FROM
  sqlite_master,
  pragma_index_list(sqlite_master.name) AS index_list
WHERE
  sqlite_master.type = 'table'
```

Foreign keys: https://latest.datasette.io/fixtures?sql=SELECT%0D%0A++sqlite_master.name%2C%0D%0A++foreign_key_list.*%0D%0AFROM%0D%0A++sqlite_master%2C%0D%0A++pragma_foreign_key_list%28sqlite_master.name%29+AS+foreign_key_list%0D%0AWHERE%0D%0A++sqlite_master.type+%3D+%27table%27

```sql
SELECT
  sqlite_master.name,
  foreign_key_list.*
FROM
  sqlite_master,
  pragma_foreign_key_list(sqlite_master.name) AS foreign_key_list
WHERE
  sqlite_master.type = 'table'
```
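Those two queries run fine from Python's `sqlite3` module on any build with table-valued PRAGMA support (3.16.0+); the schema below is a made-up two-table example, not the fixtures database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE issues (id INTEGER PRIMARY KEY, user INTEGER REFERENCES users(id));
    CREATE INDEX idx_issues_user ON issues (user);
    """
)

# All indexes across all tables, one query
indexes = conn.execute(
    "SELECT sqlite_master.name, index_list.* "
    "FROM sqlite_master, pragma_index_list(sqlite_master.name) AS index_list "
    "WHERE sqlite_master.type = 'table'"
).fetchall()

# All foreign keys across all tables, one query
foreign_keys = conn.execute(
    "SELECT sqlite_master.name, foreign_key_list.* "
    "FROM sqlite_master, pragma_foreign_key_list(sqlite_master.name) AS foreign_key_list "
    "WHERE sqlite_master.type = 'table'"
).fetchall()
```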


```sql
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
```
Powered by Datasette · Queries took 472.932ms · About: github-to-sqlite