
issue_comments


6 rows where "updated_at" is on date 2021-03-12 and user = 9599 sorted by updated_at descending



Suggested facets: issue_url, reactions, created_at (date), updated_at (date)

issue 4

  • Research using CTEs for faster facet counts 3
  • Prototoype for Datasette on PostgreSQL 1
  • Research plugin hook for alternative database backends 1
  • Research: Plugin hook for alternative database connections 1

user 1

  • simonw · 6

author_association 1

  • OWNER 6
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
797804869 https://github.com/simonw/datasette/issues/1259#issuecomment-797804869 https://api.github.com/repos/simonw/datasette/issues/1259 MDEyOklzc3VlQ29tbWVudDc5NzgwNDg2OQ== simonw 9599 2021-03-12T23:05:05Z 2021-03-12T23:05:05Z OWNER

I wonder if I could optimize facet suggestion in the same way?

One challenge: the query time limit will apply to the full CTE query, not to the individual columns.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research using CTEs for faster facet counts 830567275  
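
The idea in the comment above could look something like the following sketch (hypothetical only, not Datasette's actual implementation; the table and columns are just examples): run the filtered table query once as a CTE, then take one distinct-value count per candidate column from it, so the whole suggestion pass is a single query - which is also why a single time limit would cover all of the columns.

    WITH data AS (
      -- the already-filtered table query (example table)
      SELECT * FROM [global-power-plants]
    )
    SELECT 'country_long' AS col, count(DISTINCT country_long) AS n FROM data
    UNION ALL
    SELECT 'owner', count(DISTINCT owner) FROM data
    UNION ALL
    SELECT 'primary_fuel', count(DISTINCT primary_fuel) FROM data;

The suggestion logic could then compare each n against its existing thresholds.
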
797801075 https://github.com/simonw/datasette/issues/1259#issuecomment-797801075 https://api.github.com/repos/simonw/datasette/issues/1259 MDEyOklzc3VlQ29tbWVudDc5NzgwMTA3NQ== simonw 9599 2021-03-12T22:53:56Z 2021-03-12T22:55:16Z OWNER

OK, a better comparison:

https://global-power-plants.datasettes.com/global-power-plants?sql=WITH+data+as+%28%0D%0A++select%0D%0A++++%0D%0A++from%0D%0A++++%5Bglobal-power-plants%5D%0D%0A%29%2C%0D%0Acountry_long+as+%28select+%0D%0A++%27country_long%27+as+col%2C+country_long+as+value%2C+count%28%29+as+c+from+data+group+by+country_long%0D%0A++order+by+c+desc+limit+31%0D%0A%29%2C%0D%0Aprimary_fuel+as+%28%0D%0Aselect%0D%0A++%27primary_fuel%27+as+col%2C+primary_fuel+as+value%2C+count%28%29+as+c+from+data+group+by+primary_fuel%0D%0A++order+by+c+desc+limit+31%0D%0A%29%2C%0D%0Aowner+as+%28%0D%0Aselect%0D%0A++%27owner%27+as+col%2C+owner+as+value%2C+count%28%29+as+c+from+data+group+by+owner%0D%0A++order+by+c+desc+limit+31%0D%0A%29%0D%0Aselect++from+primary_fuel+union+select++from+country_long%0D%0Aunion+select++from+owner+order+by+col%2C+c+desc calculates facets against three columns. It takes 78.5ms* (and 34.5ms when I refreshed it, presumably after warming some SQLite caches of some sort).

https://global-power-plants.datasettes.com/global-power-plants/global-power-plants?_facet=country_long&_facet=primary_fuel&_trace=1&_size=0 shows those facets with size=0 on the SQL query - and shows a SQL trace at the bottom of the page.

The country_long facet query takes 45.36ms, owner takes 38.45ms, primary_fuel takes 49.04ms - so a total of 132.85ms

That's against SQLite 3.27.3 (according to https://global-power-plants.datasettes.com/-/versions) - so even on a SQLite version that doesn't materialize the CTEs there's a significant performance boost from doing all three facets in a single CTE query.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research using CTEs for faster facet counts 830567275  
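
Decoded for readability, the URL-encoded query in the comment above is roughly as follows (the select lists and count() calls appear to have lost their * characters in rendering; they are restored here):

    WITH data AS (
      SELECT * FROM [global-power-plants]
    ),
    country_long AS (
      SELECT 'country_long' AS col, country_long AS value, count(*) AS c
      FROM data GROUP BY country_long ORDER BY c DESC LIMIT 31
    ),
    primary_fuel AS (
      SELECT 'primary_fuel' AS col, primary_fuel AS value, count(*) AS c
      FROM data GROUP BY primary_fuel ORDER BY c DESC LIMIT 31
    ),
    owner AS (
      SELECT 'owner' AS col, owner AS value, count(*) AS c
      FROM data GROUP BY owner ORDER BY c DESC LIMIT 31
    )
    SELECT * FROM primary_fuel
    UNION SELECT * FROM country_long
    UNION SELECT * FROM owner
    ORDER BY col, c DESC;

Each facet becomes its own CTE over the shared data subquery, and the UNIONed result returns (col, value, c) rows for all three facets in one round trip.
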
797790017 https://github.com/simonw/datasette/issues/1259#issuecomment-797790017 https://api.github.com/repos/simonw/datasette/issues/1259 MDEyOklzc3VlQ29tbWVudDc5Nzc5MDAxNw== simonw 9599 2021-03-12T22:22:12Z 2021-03-12T22:22:12Z OWNER

https://sqlite.org/lang_with.html

Prior to SQLite 3.35.0, all CTEs were treated as if the NOT MATERIALIZED phrase was present

It looks like this optimization is completely unavailable on SQLite prior to 3.35.0 (released 12th March 2021). But I could still rewrite the faceting to work in this way, using the exact same SQL - it would just be significantly faster on 3.35.0+ (assuming it's actually faster in practice - would need to benchmark).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research using CTEs for faster facet counts 830567275  
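
For context, 3.35.0 is also the release that added explicit materialization hints, so on new enough SQLite the shared subquery can be pinned down rather than left to the planner (a sketch of the syntax, not the query used above):

    -- SQLite 3.35.0+ only: compute the CTE once and reuse the result
    WITH data AS MATERIALIZED (
      SELECT * FROM [global-power-plants]
    )
    SELECT count(*) FROM data;

On older versions the MATERIALIZED keyword is a syntax error, so any rewrite would need to fall back to the plain WITH data AS (...) form there.
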
797159434 https://github.com/simonw/datasette/issues/1193#issuecomment-797159434 https://api.github.com/repos/simonw/datasette/issues/1193 MDEyOklzc3VlQ29tbWVudDc5NzE1OTQzNA== simonw 9599 2021-03-12T01:01:54Z 2021-03-12T01:01:54Z OWNER

DuckDB has a read-only mechanism: https://duckdb.org/docs/api/python

    import duckdb
    con = duckdb.connect(database="/tmp/blah.db", read_only=True)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research plugin hook for alternative database backends 787173276  
797159221 https://github.com/simonw/datasette/issues/1250#issuecomment-797159221 https://api.github.com/repos/simonw/datasette/issues/1250 MDEyOklzc3VlQ29tbWVudDc5NzE1OTIyMQ== simonw 9599 2021-03-12T01:01:17Z 2021-03-12T01:01:17Z OWNER

This is a duplicate of #1193.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: Plugin hook for alternative database connections 824067604  
797158641 https://github.com/simonw/datasette/issues/670#issuecomment-797158641 https://api.github.com/repos/simonw/datasette/issues/670 MDEyOklzc3VlQ29tbWVudDc5NzE1ODY0MQ== simonw 9599 2021-03-12T00:59:49Z 2021-03-12T00:59:49Z OWNER

Challenge: what's the equivalent for PostgreSQL of opening a database in read only mode? Will I have to talk users through creating read only credentials?

It looks like the answer to this is yes - I'll need users to set up read-only credentials. Here's a TIL about that: https://til.simonwillison.net/postgresql/read-only-postgresql-user

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
Prototoype for Datasette on PostgreSQL 564833696  
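
A minimal sketch of the kind of read-only credentials that comment describes (role, database and schema names here are placeholders; the linked TIL covers the full walkthrough):

    CREATE ROLE datasette_ro WITH LOGIN PASSWORD 'replace-me';
    GRANT CONNECT ON DATABASE mydb TO datasette_ro;
    GRANT USAGE ON SCHEMA public TO datasette_ro;
    GRANT SELECT ON ALL TABLES IN SCHEMA public TO datasette_ro;
    -- make tables created later readable too
    ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO datasette_ro;

A client connecting as that role can run SELECT queries but cannot modify the data.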


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);