home / github

Menu
  • Search all tables
  • GraphQL API

issue_comments

Table actions
  • GraphQL API for issue_comments

9 rows where issue = 849978964, "updated_at" is on date 2021-04-04 and user = 9599 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: created_at (date), updated_at (date)

user 1

  • simonw · 9 ✖

issue 1

  • Show column metadata plus links for foreign keys on arbitrary query results · 9 ✖

author_association 1

  • OWNER 9
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
813116177 https://github.com/simonw/datasette/issues/1293#issuecomment-813116177 https://api.github.com/repos/simonw/datasette/issues/1293 MDEyOklzc3VlQ29tbWVudDgxMzExNjE3Nw== simonw 9599 2021-04-04T23:31:00Z 2021-04-04T23:31:00Z OWNER

Sadly it doesn't do what I need. This query should only return one column, but instead I get back every column that was consulted by the query:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Show column metadata plus links for foreign keys on arbitrary query results 849978964  
813115607 https://github.com/simonw/datasette/issues/1293#issuecomment-813115607 https://api.github.com/repos/simonw/datasette/issues/1293 MDEyOklzc3VlQ29tbWVudDgxMzExNTYwNw== simonw 9599 2021-04-04T23:25:15Z 2021-04-04T23:25:15Z OWNER

Oh wow, I just spotted https://github.com/macbre/sql-metadata

Uses tokenized query returned by python-sqlparse and generates query metadata. Extracts column names and tables used by the query. Provides a helper for normalization of SQL queries and tables aliases resolving.

It's for MySQL, PostgreSQL and Hive right now but maybe getting it working with SQLite wouldn't be too hard?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Show column metadata plus links for foreign keys on arbitrary query results 849978964  
813115414 https://github.com/simonw/datasette/issues/1293#issuecomment-813115414 https://api.github.com/repos/simonw/datasette/issues/1293 MDEyOklzc3VlQ29tbWVudDgxMzExNTQxNA== simonw 9599 2021-04-04T23:23:34Z 2021-04-04T23:23:34Z OWNER

The other approach I considered for this was to have my own SQL query parser running in Python, which could pick apart a complex query and figure out which column was sourced from which table. I dropped this idea because it felt that the moment select * came into play a pure parsing approach wouldn't work - I'd need knowledge of the schema in order to resolve the *.

A Python parser approach might be good enough to handle a subset of queries - those that don't use select * for example - and maybe that would be worth shipping? The feature doesn't have to be perfect for it to be useful.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Show column metadata plus links for foreign keys on arbitrary query results 849978964  
813114933 https://github.com/simonw/datasette/issues/1293#issuecomment-813114933 https://api.github.com/repos/simonw/datasette/issues/1293 MDEyOklzc3VlQ29tbWVudDgxMzExNDkzMw== simonw 9599 2021-04-04T23:19:22Z 2021-04-04T23:19:22Z OWNER

I asked about this on the SQLite forum: https://sqlite.org/forum/forumpost/0180277fb7

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Show column metadata plus links for foreign keys on arbitrary query results 849978964  
813113653 https://github.com/simonw/datasette/issues/1293#issuecomment-813113653 https://api.github.com/repos/simonw/datasette/issues/1293 MDEyOklzc3VlQ29tbWVudDgxMzExMzY1Mw== simonw 9599 2021-04-04T23:10:49Z 2021-04-04T23:10:49Z OWNER

One option I've not fully explored yet: could I write my own custom SQLite C extension which exposes this functionality as a callable function?

Then I could load that extension and run a SQL query something like this:

select database, table, column from analyze_query(:sql_query) Where analyze_query(...) would be a fancy virtual table function of some sort that uses the underlying sqlite3_column_database_name() C functions to analyze the SQL query and return details of what it would return.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Show column metadata plus links for foreign keys on arbitrary query results 849978964  
813113403 https://github.com/simonw/datasette/issues/1293#issuecomment-813113403 https://api.github.com/repos/simonw/datasette/issues/1293 MDEyOklzc3VlQ29tbWVudDgxMzExMzQwMw== simonw 9599 2021-04-04T23:08:48Z 2021-04-04T23:08:48Z OWNER

Worth noting that adding limit 0 to the query still causes it to conduct the permission checks, hopefully while avoiding doing any of the actual work of executing the query: pycon In [20]: db.execute('select * from compound_primary_key join facetable on facetable.rowid = compound_primary_key.rowid limit 0').fetchall() ...: args (21, None, None, None, None) kwargs {} args (20, 'compound_primary_key', 'pk1', 'main', None) kwargs {} args (20, 'compound_primary_key', 'pk2', 'main', None) kwargs {} args (20, 'compound_primary_key', 'content', 'main', None) kwargs {} args (20, 'facetable', 'pk', 'main', None) kwargs {} args (20, 'facetable', 'created', 'main', None) kwargs {} args (20, 'facetable', 'planet_int', 'main', None) kwargs {} args (20, 'facetable', 'on_earth', 'main', None) kwargs {} args (20, 'facetable', 'state', 'main', None) kwargs {} args (20, 'facetable', 'city_id', 'main', None) kwargs {} args (20, 'facetable', 'neighborhood', 'main', None) kwargs {} args (20, 'facetable', 'tags', 'main', None) kwargs {} args (20, 'facetable', 'complex_array', 'main', None) kwargs {} args (20, 'facetable', 'distinct_some_null', 'main', None) kwargs {} args (20, 'facetable', 'pk', 'main', None) kwargs {} args (20, 'compound_primary_key', 'ROWID', 'main', None) kwargs {} Out[20]: []

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Show column metadata plus links for foreign keys on arbitrary query results 849978964  
813113218 https://github.com/simonw/datasette/issues/1293#issuecomment-813113218 https://api.github.com/repos/simonw/datasette/issues/1293 MDEyOklzc3VlQ29tbWVudDgxMzExMzIxOA== simonw 9599 2021-04-04T23:07:25Z 2021-04-04T23:07:25Z OWNER

Here are all of the available constants: pycon In [3]: for k in dir(sqlite3): ...: if k.startswith("SQLITE_"): ...: print(k, getattr(sqlite3, k)) ...: SQLITE_ALTER_TABLE 26 SQLITE_ANALYZE 28 SQLITE_ATTACH 24 SQLITE_CREATE_INDEX 1 SQLITE_CREATE_TABLE 2 SQLITE_CREATE_TEMP_INDEX 3 SQLITE_CREATE_TEMP_TABLE 4 SQLITE_CREATE_TEMP_TRIGGER 5 SQLITE_CREATE_TEMP_VIEW 6 SQLITE_CREATE_TRIGGER 7 SQLITE_CREATE_VIEW 8 SQLITE_CREATE_VTABLE 29 SQLITE_DELETE 9 SQLITE_DENY 1 SQLITE_DETACH 25 SQLITE_DONE 101 SQLITE_DROP_INDEX 10 SQLITE_DROP_TABLE 11 SQLITE_DROP_TEMP_INDEX 12 SQLITE_DROP_TEMP_TABLE 13 SQLITE_DROP_TEMP_TRIGGER 14 SQLITE_DROP_TEMP_VIEW 15 SQLITE_DROP_TRIGGER 16 SQLITE_DROP_VIEW 17 SQLITE_DROP_VTABLE 30 SQLITE_FUNCTION 31 SQLITE_IGNORE 2 SQLITE_INSERT 18 SQLITE_OK 0 SQLITE_PRAGMA 19 SQLITE_READ 20 SQLITE_RECURSIVE 33 SQLITE_REINDEX 27 SQLITE_SAVEPOINT 32 SQLITE_SELECT 21 SQLITE_TRANSACTION 22 SQLITE_UPDATE 23

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Show column metadata plus links for foreign keys on arbitrary query results 849978964  
813113175 https://github.com/simonw/datasette/issues/1293#issuecomment-813113175 https://api.github.com/repos/simonw/datasette/issues/1293 MDEyOklzc3VlQ29tbWVudDgxMzExMzE3NQ== simonw 9599 2021-04-04T23:07:01Z 2021-04-04T23:07:01Z OWNER

A more promising route I found involved the db.set_authorizer method. This can be used to log the permission checks that SQLite uses, including checks for permission to access specific columns of specific tables. For a while I thought this could work!

```pycon

def print_args(args, *kwargs): ... print("args", args, "kwargs", kwargs) ... return sqlite3.SQLITE_OK

db = sqlite3.connect("fixtures.db") db.execute('select * from compound_primary_key join facetable on rowid').fetchall() args (21, None, None, None, None) kwargs {} args (20, 'compound_primary_key', 'pk1', 'main', None) kwargs {} args (20, 'compound_primary_key', 'pk2', 'main', None) kwargs {} args (20, 'compound_primary_key', 'content', 'main', None) kwargs {} args (20, 'facetable', 'pk', 'main', None) kwargs {} args (20, 'facetable', 'created', 'main', None) kwargs {} args (20, 'facetable', 'planet_int', 'main', None) kwargs {} args (20, 'facetable', 'on_earth', 'main', None) kwargs {} args (20, 'facetable', 'state', 'main', None) kwargs {} args (20, 'facetable', 'city_id', 'main', None) kwargs {} args (20, 'facetable', 'neighborhood', 'main', None) kwargs {} args (20, 'facetable', 'tags', 'main', None) kwargs {} args (20, 'facetable', 'complex_array', 'main', None) kwargs {} args (20, 'facetable', 'distinct_some_null', 'main', None) kwargs {} `` Those20values (where 20 isSQLITE_READ`) looked like they were checking permissions for the columns in the order they would be returned!

Then I found a snag:

pycon In [18]: db.execute('select 1 + 1 + (select max(rowid) from facetable)') args (21, None, None, None, None) kwargs {} args (31, None, 'max', None, None) kwargs {} args (20, 'facetable', 'pk', 'main', None) kwargs {} args (21, None, None, None, None) kwargs {} args (20, 'facetable', '', None, None) kwargs {} Once a subselect is involved the order of the 20 checks no longer matches the order in which the columns are returned from the query.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Show column metadata plus links for foreign keys on arbitrary query results 849978964  
813112546 https://github.com/simonw/datasette/issues/1293#issuecomment-813112546 https://api.github.com/repos/simonw/datasette/issues/1293 MDEyOklzc3VlQ29tbWVudDgxMzExMjU0Ng== simonw 9599 2021-04-04T23:02:45Z 2021-04-04T23:02:45Z OWNER

I've done various pieces of research into this over the past few years. Capturing what I've discovered in this ticket.

The SQLite C API has functions that can help with this: https://www.sqlite.org/c3ref/column_database_name.html details those. But they're not exposed in the Python SQLite library.

Maybe it would be possible to use them via ctypes? My hunch is that I would have to re-implement the full sqlite3 module with ctypes, which sounds daunting.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Show column metadata plus links for foreign keys on arbitrary query results 849978964  

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 508.122ms · About: github-to-sqlite