issue_comments

2 rows where author_association = "OWNER", issue = 1096558279 and "updated_at" is on date 2022-01-08 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1008163050 https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008163050 https://api.github.com/repos/simonw/sqlite-utils/issues/365 IC_kwDOCGYnMM48F1jq simonw 9599 2022-01-08T22:10:51Z 2022-01-08T22:10:51Z OWNER

Is there a downside to having a sqlite_stat1 table if it has wildly incorrect statistics in it?

Imagine the following sequence of events:

  • User imports a few records, creating the table, using sqlite-utils insert
  • User runs sqlite-utils create-index ... which also creates and populates the sqlite_stat1 table
  • User runs insert again to populate several million new records

The user now has a database file with several million records and a statistics table that is wildly out of date, having been populated when they only had a few.
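The sequence above can be reproduced with Python's built-in sqlite3 module. This is a minimal sketch, not sqlite-utils itself: the table name `records`, the index name `idx_records_category`, and the row counts are assumptions chosen for illustration. It only shows that the row in sqlite_stat1 stays as it was until ANALYZE is run again; it does not measure whether query performance suffers.

```python
import sqlite3


def stale_stats_demo(initial_rows=5, later_rows=10_000):
    """Return (stat_before, stat_after, stat_refreshed, total_rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, category TEXT)")

    def insert(n):
        conn.executemany(
            "INSERT INTO records (category) VALUES (?)",
            [(f"c{i % 3}",) for i in range(n)],
        )

    def read_stat():
        return conn.execute(
            "SELECT stat FROM sqlite_stat1 WHERE idx = 'idx_records_category'"
        ).fetchone()[0]

    # Step 1: import a few records.
    insert(initial_rows)
    # Step 2: create an index, then ANALYZE, which creates sqlite_stat1.
    conn.execute("CREATE INDEX idx_records_category ON records (category)")
    conn.execute("ANALYZE")
    stat_before = read_stat()
    # Step 3: insert many more records; the statistics are not touched.
    insert(later_rows)
    stat_after = read_stat()
    # Running ANALYZE again brings the statistics up to date.
    conn.execute("ANALYZE")
    stat_refreshed = read_stat()
    total_rows = conn.execute("SELECT count(*) FROM records").fetchone()[0]
    conn.close()
    return stat_before, stat_after, stat_refreshed, total_rows


if __name__ == "__main__":
    print(stale_stats_demo())
```

The first integer in each `stat` string is the row count SQLite recorded for the index, so comparing it with `total_rows` shows how far out of date the statistics have become.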

Will this result in surprisingly bad query performance compared to if that statistics table did not exist at all?

If so, I lean much harder towards ANALYZE as a strictly opt-in optimization, maybe with the --analyze option added to sqlite-utils insert too, to help users opt in to updating their statistics after running big inserts.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
create-index should run analyze after creating index 1096558279  
1008158357 https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008158357 https://api.github.com/repos/simonw/sqlite-utils/issues/365 IC_kwDOCGYnMM48F0aV simonw 9599 2022-01-08T21:33:07Z 2022-01-08T21:33:07Z OWNER

The one thing that worries me a little bit about doing this by default is that it adds a surprising new table to the database - it may be confusing to users if they run create-index and their database suddenly has a new sqlite_stat1 table, see https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1008157132

Options here are:

  • Do it anyway. People can tolerate a surprise table appearing when they create an index.
  • Only run ANALYZE if the user says sqlite-utils create-index ... --analyze
  • Use the --analyze option, but also automatically run ANALYZE if they create an index and the database they are working with already has a sqlite_stat1 table

I'm currently leaning towards that third option - @fgregg any thoughts?
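The third option could be sketched like this with Python's built-in sqlite3 module. This is an illustration of the decision logic only, not the sqlite-utils implementation: the helper names `has_stat1` and `create_index`, the `analyze` parameter, and the `idx_<table>_<column>` naming are all hypothetical.

```python
import sqlite3


def has_stat1(conn):
    """True if the database already contains a sqlite_stat1 table."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = 'sqlite_stat1'"
    ).fetchone()
    return row is not None


def create_index(conn, table, column, analyze=False):
    """Create an index; run ANALYZE on it if asked to, or if the
    database already has statistics. Returns True if ANALYZE ran.

    Hypothetical helper: identifiers are interpolated directly, so it
    is only suitable for trusted table and column names.
    """
    index_name = f"idx_{table}_{column}"
    conn.execute(f'CREATE INDEX "{index_name}" ON "{table}" ("{column}")')
    if analyze or has_stat1(conn):
        conn.execute(f'ANALYZE "{index_name}"')
        return True
    return False
```

With this logic a database that has never been analyzed does not gain a sqlite_stat1 table unless the user opts in, while a database that already has one gets statistics for each new index.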

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
create-index should run analyze after creating index 1096558279  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 30.275ms · About: github-to-sqlite