issue_comments

10 rows where issue = 695319258 sorted by updated_at descending

user: simonw (10 comments)

issue: FTS table with 7 rows has _fts_docsize table with 9,141 rows (10 comments)

author_association: OWNER (10 comments)
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
688501064 https://github.com/simonw/sqlite-utils/issues/149#issuecomment-688501064 https://api.github.com/repos/simonw/sqlite-utils/issues/149 MDEyOklzc3VlQ29tbWVudDY4ODUwMTA2NA== simonw 9599 2020-09-07T20:30:15Z 2020-09-07T20:30:38Z OWNER

The second challenge here is cleaning up all of those junk rows in existing *_fts_docsize tables. Doing that just to the demo database from https://github-to-sqlite.dogsheep.net/github.db dropped its size from 22MB to 16MB! Here's the SQL:

    DELETE FROM [licenses_fts_docsize] WHERE id NOT IN (
        SELECT rowid FROM [licenses_fts]);

I can do that as part of the existing table.optimize() method, which optimizes FTS tables.
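A minimal sketch of that cleanup, using Python's built-in sqlite3 module rather than sqlite-utils itself. It assumes the local SQLite build includes FTS5. The trimmed-down two-column licenses table, the trigger names and the helper names (build_demo_db, count, delete_orphaned_docsize_rows) are all made up for illustration and are not taken from the sqlite-utils source.

```python
import sqlite3


def build_demo_db():
    """Build an in-memory database that accumulates orphaned docsize rows."""
    conn = sqlite3.connect(":memory:")
    # Make the usual default explicit: with recursive triggers off, REPLACE
    # deletes the old row without firing the AFTER DELETE trigger.
    conn.execute("PRAGMA recursive_triggers = off")
    conn.executescript(
        """
        CREATE TABLE [licenses] ([key] TEXT PRIMARY KEY, [name] TEXT);
        CREATE VIRTUAL TABLE [licenses_fts] USING fts5(
            [name], content='licenses'
        );
        CREATE TRIGGER [licenses_ai] AFTER INSERT ON [licenses] BEGIN
            INSERT INTO [licenses_fts] (rowid, [name])
            VALUES (new.rowid, new.[name]);
        END;
        CREATE TRIGGER [licenses_ad] AFTER DELETE ON [licenses] BEGIN
            INSERT INTO [licenses_fts] ([licenses_fts], rowid, [name])
            VALUES ('delete', old.rowid, old.[name]);
        END;
        """
    )
    # One plain insert followed by ten replacements of the same key.
    for _ in range(11):
        conn.execute(
            "INSERT OR REPLACE INTO [licenses] ([key], [name]) VALUES (?, ?)",
            ("mit", "MIT License"),
        )
    conn.commit()
    return conn


def count(conn, table):
    """Return the number of rows in a table."""
    return conn.execute(f"SELECT count(*) FROM [{table}]").fetchone()[0]


def delete_orphaned_docsize_rows(conn, fts_table):
    """Run the DELETE from the comment above; return how many rows it removed."""
    with conn:
        cursor = conn.execute(
            f"DELETE FROM [{fts_table}_docsize] "
            f"WHERE id NOT IN (SELECT rowid FROM [{fts_table}])"
        )
    return cursor.rowcount
```

Calling count(conn, "licenses_fts_docsize") before and after delete_orphaned_docsize_rows(conn, "licenses_fts") shows how many junk rows the statement removed.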

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
FTS table with 7 rows has _fts_docsize table with 9,141 rows 695319258  
688499924 https://github.com/simonw/sqlite-utils/issues/149#issuecomment-688499924 https://api.github.com/repos/simonw/sqlite-utils/issues/149 MDEyOklzc3VlQ29tbWVudDY4ODQ5OTkyNA== simonw 9599 2020-09-07T20:25:40Z 2020-09-07T20:25:50Z OWNER

https://www.sqlite.org/pragma.html#pragma_recursive_triggers says:

Prior to SQLite version 3.6.18 (2009-09-11), recursive triggers were not supported. The behavior of SQLite was always as if this pragma was set to OFF. Support for recursive triggers was added in version 3.6.18 but was initially turned OFF by default, for compatibility. Recursive triggers may be turned on by default in future versions of SQLite.

So I think the fix is to turn on recursive_triggers globally by default for sqlite-utils.
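As a rough illustration of what "on by default" could look like at the connection level, here is a sketch using Python's sqlite3 module. The connect and recursive_triggers_enabled helpers are hypothetical names for this example; they are not the sqlite-utils API.

```python
import sqlite3


def connect(path=":memory:", recursive_triggers=True):
    """Open a connection, switching recursive triggers on unless told not to."""
    conn = sqlite3.connect(path)
    # The pragma is per-connection, so it has to be set each time a
    # connection is opened; it is not stored in the database file.
    setting = "on" if recursive_triggers else "off"
    conn.execute(f"PRAGMA recursive_triggers = {setting}")
    return conn


def recursive_triggers_enabled(conn):
    """Read the current setting back from SQLite."""
    return bool(conn.execute("PRAGMA recursive_triggers").fetchone()[0])
```

Because the setting lives on the connection, any other tool that opens the same file would still need to set the pragma itself.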

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
FTS table with 7 rows has _fts_docsize table with 9,141 rows 695319258  
688499650 https://github.com/simonw/sqlite-utils/issues/149#issuecomment-688499650 https://api.github.com/repos/simonw/sqlite-utils/issues/149 MDEyOklzc3VlQ29tbWVudDY4ODQ5OTY1MA== simonw 9599 2020-09-07T20:24:35Z 2020-09-07T20:24:35Z OWNER

This replicates the problem:

    (github-to-sqlite) /tmp % sqlite-utils tables --counts github.db | grep licenses
    {"table": "licenses", "count": 7},
    {"table": "licenses_fts_data", "count": 35},
    {"table": "licenses_fts_idx", "count": 16},
    {"table": "licenses_fts_docsize", "count": 9151},
    {"table": "licenses_fts_config", "count": 1},
    {"table": "licenses_fts", "count": 7},
    (github-to-sqlite) /tmp % github-to-sqlite repos github.db dogsheep
    (github-to-sqlite) /tmp % sqlite-utils tables --counts github.db | grep licenses
    {"table": "licenses", "count": 7},
    {"table": "licenses_fts_data", "count": 45},
    {"table": "licenses_fts_idx", "count": 26},
    {"table": "licenses_fts_docsize", "count": 9161},
    {"table": "licenses_fts_config", "count": 1},
    {"table": "licenses_fts", "count": 7},

Note how the number of rows in licenses_fts_docsize goes from 9151 to 9161.

The number went up by ten. I used tracing from #151 to show that the following SQL executed ten times:

    INSERT OR REPLACE INTO [licenses] ([key], [name], [node_id], [spdx_id], [url]) VALUES (?, ?, ?, ?, ?);

Then I tried executing PRAGMA recursive_triggers=on; at the start of the script. This fixed the problem - running the script did not increase the number of rows in licenses_fts_docsize.
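The tracing mentioned above is the sqlite-utils feature from #151. As a stand-in, the same kind of count can be sketched with the trace callback in Python's built-in sqlite3 module. The helper names, the placeholder values and the simplified table are assumptions for illustration only.

```python
import sqlite3


def trace_sql(conn, run):
    """Call run(conn) and return every SQL statement SQLite reported executing."""
    statements = []
    conn.set_trace_callback(statements.append)
    try:
        run(conn)
    finally:
        # Passing None removes the callback again.
        conn.set_trace_callback(None)
    return statements


def replace_license_ten_times(conn):
    """Stand-in for the script under test: ten INSERT OR REPLACE calls."""
    for _ in range(10):
        conn.execute(
            "INSERT OR REPLACE INTO [licenses] "
            "([key], [name], [node_id], [spdx_id], [url]) "
            "VALUES (?, ?, ?, ?, ?)",
            ("mit", "MIT License", "node", "MIT", "https://example.com/mit"),
        )


def count_with_prefix(statements, prefix):
    """Count traced statements that start with the given SQL prefix."""
    return sum(1 for sql in statements if sql.startswith(prefix))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE [licenses] ([key] TEXT PRIMARY KEY, [name] TEXT, "
    "[node_id] TEXT, [spdx_id] TEXT, [url] TEXT)"
)
statements = trace_sql(conn, replace_license_ten_times)
replace_count = count_with_prefix(statements, "INSERT OR REPLACE")
```

The trace also picks up statements the sqlite3 module issues on its own, such as the implicit BEGIN, which is why the count filters on a prefix instead of using the length of the list.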

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
FTS table with 7 rows has _fts_docsize table with 9,141 rows 695319258  
688482355 https://github.com/simonw/sqlite-utils/issues/149#issuecomment-688482355 https://api.github.com/repos/simonw/sqlite-utils/issues/149 MDEyOklzc3VlQ29tbWVudDY4ODQ4MjM1NQ== simonw 9599 2020-09-07T19:22:51Z 2020-09-07T19:22:51Z OWNER

And the SQLite documentation says:

When the REPLACE conflict resolution strategy deletes rows in order to satisfy a constraint, delete triggers fire if and only if recursive triggers are enabled.
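That sentence can be checked without FTS at all. The sketch below uses an ordinary AFTER DELETE trigger that writes to a log table; the delete_log table, the trigger name and the function name are hypothetical and exist only for this example.

```python
import sqlite3


def deletes_seen_by_trigger(recursive_triggers):
    """Replace the same row repeatedly; return how often the delete trigger fired."""
    conn = sqlite3.connect(":memory:")
    setting = "on" if recursive_triggers else "off"
    conn.execute(f"PRAGMA recursive_triggers = {setting}")
    conn.executescript(
        """
        CREATE TABLE [licenses] ([key] TEXT PRIMARY KEY, [name] TEXT);
        CREATE TABLE [delete_log] ([key] TEXT);
        CREATE TRIGGER [licenses_ad] AFTER DELETE ON [licenses] BEGIN
            INSERT INTO [delete_log] ([key]) VALUES (old.[key]);
        END;
        """
    )
    # The first statement is a plain insert; the next nine each replace
    # the existing row, which means an implicit delete of the old row.
    for _ in range(10):
        conn.execute(
            "INSERT OR REPLACE INTO [licenses] ([key], [name]) VALUES (?, ?)",
            ("mit", "MIT License"),
        )
    return conn.execute("SELECT count(*) FROM [delete_log]").fetchone()[0]
```

With the pragma off the log stays empty; with it on, each of the nine replacements is logged.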

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
FTS table with 7 rows has _fts_docsize table with 9,141 rows 695319258  
688482055 https://github.com/simonw/sqlite-utils/issues/149#issuecomment-688482055 https://api.github.com/repos/simonw/sqlite-utils/issues/149 MDEyOklzc3VlQ29tbWVudDY4ODQ4MjA1NQ== simonw 9599 2020-09-07T19:21:42Z 2020-09-07T19:21:42Z OWNER

Using replace=True there executes INSERT OR REPLACE - and Dan Kennedy (SQLite maintainer) on the SQLite forums said this:

Are you using "REPLACE INTO", or "UPDATE OR REPLACE" on the "licenses" table without having first executed "PRAGMA recursive_triggers = 1"? The docs note that delete triggers will not be fired in this case, which would explain things. Second paragraph under "REPLACE" here:

https://www.sqlite.org/lang_conflict.html

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
FTS table with 7 rows has _fts_docsize table with 9,141 rows 695319258  
688481374 https://github.com/simonw/sqlite-utils/issues/149#issuecomment-688481374 https://api.github.com/repos/simonw/sqlite-utils/issues/149 MDEyOklzc3VlQ29tbWVudDY4ODQ4MTM3NA== simonw 9599 2020-09-07T19:19:08Z 2020-09-07T19:19:08Z OWNER

Reading through the code for github-to-sqlite repos, one of the things it does is call save_license for each repo:

https://github.com/dogsheep/github-to-sqlite/blob/39b2234253096bd579feed4e25104698b8ccd2ba/github_to_sqlite/utils.py#L259-L262

    def save_license(db, license):
        if license is None:
            return None
        return db["licenses"].insert(license, pk="key", replace=True).last_pk

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
FTS table with 7 rows has _fts_docsize table with 9,141 rows 695319258  
688480665 https://github.com/simonw/sqlite-utils/issues/149#issuecomment-688480665 https://api.github.com/repos/simonw/sqlite-utils/issues/149 MDEyOklzc3VlQ29tbWVudDY4ODQ4MDY2NQ== simonw 9599 2020-09-07T19:16:20Z 2020-09-07T19:16:20Z OWNER

Aha! I have managed to replicate the bug:

    (github-to-sqlite) /tmp % sqlite-utils tables --counts github.db | grep licenses
    {"table": "licenses", "count": 7},
    {"table": "licenses_fts_data", "count": 35},
    {"table": "licenses_fts_idx", "count": 16},
    {"table": "licenses_fts_docsize", "count": 9151},
    {"table": "licenses_fts_config", "count": 1},
    {"table": "licenses_fts", "count": 7},
    (github-to-sqlite) /tmp % github-to-sqlite repos github.db dogsheep
    (github-to-sqlite) /tmp % sqlite-utils tables --counts github.db | grep licenses
    {"table": "licenses", "count": 7},
    {"table": "licenses_fts_data", "count": 45},
    {"table": "licenses_fts_idx", "count": 26},
    {"table": "licenses_fts_docsize", "count": 9161},
    {"table": "licenses_fts_config", "count": 1},
    {"table": "licenses_fts", "count": 7},

Note that the number of records in licenses_fts_docsize went from 9151 to 9161.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
FTS table with 7 rows has _fts_docsize table with 9,141 rows 695319258  
688464181 https://github.com/simonw/sqlite-utils/issues/149#issuecomment-688464181 https://api.github.com/repos/simonw/sqlite-utils/issues/149 MDEyOklzc3VlQ29tbWVudDY4ODQ2NDE4MQ== simonw 9599 2020-09-07T18:19:54Z 2020-09-07T18:19:54Z OWNER

Even though that table doesn't declare an integer primary key it does have a rowid column: https://github-to-sqlite.dogsheep.net/github?sql=select+rowid%2C+%5Bkey%5D%2C+name%2C+spdx_id%2C+url%2C+node_id+from+licenses+order+by+%5Bkey%5D+limit+101

| rowid | key | name | spdx_id | url | node_id |
| --- | --- | --- | --- | --- | --- |
| 9150 | apache-2.0 | Apache License 2.0 | Apache-2.0 | https://api.github.com/licenses/apache-2.0 | MDc6TGljZW5zZTI= |
| 112 | bsd-3-clause | BSD 3-Clause "New" or "Revised" License | BSD-3-Clause | https://api.github.com/licenses/bsd-3-clause | MDc6TGljZW5zZTU= |

https://www.sqlite.org/rowidtable.html has this clue:

If the rowid is not aliased by INTEGER PRIMARY KEY then it is not persistent and might change. In particular the VACUUM command will change rowids for tables that do not declare an INTEGER PRIMARY KEY. Therefore, applications should not normally access the rowid directly, but instead use an INTEGER PRIMARY KEY.
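One plausible reading of the large rowid values in the table above is that each INSERT OR REPLACE removes the old row and writes a new one with a fresh rowid. The sketch below shows that behaviour on a tiny table with a text primary key; the two-row table and the function name are illustrative assumptions, not data from the real database.

```python
import sqlite3


def rowids_before_and_after_replace():
    """Show how INSERT OR REPLACE moves a row to a new rowid."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE [licenses] ([key] TEXT PRIMARY KEY, [name] TEXT)")
    conn.executemany(
        "INSERT INTO [licenses] ([key], [name]) VALUES (?, ?)",
        [("apache-2.0", "Apache License 2.0"), ("mit", "MIT License")],
    )
    query = "SELECT [key], rowid FROM [licenses] ORDER BY [key]"
    before = dict(conn.execute(query))
    # Replacing an existing key deletes the old row and inserts a new one,
    # and the new row is given the next rowid after the current maximum.
    conn.execute(
        "INSERT OR REPLACE INTO [licenses] ([key], [name]) VALUES (?, ?)",
        ("apache-2.0", "Apache License 2.0"),
    )
    after = dict(conn.execute(query))
    return before, after
```

If the FTS index still holds an entry for the old rowid, that entry no longer matches any row in licenses, which is consistent with the orphaned rows seen in licenses_fts_docsize.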

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
FTS table with 7 rows has _fts_docsize table with 9,141 rows 695319258  
688460865 https://github.com/simonw/sqlite-utils/issues/149#issuecomment-688460865 https://api.github.com/repos/simonw/sqlite-utils/issues/149 MDEyOklzc3VlQ29tbWVudDY4ODQ2MDg2NQ== simonw 9599 2020-09-07T18:07:14Z 2020-09-07T18:07:14Z OWNER

Another likely culprit: licenses has a text primary key, so it's not using rowid:

    CREATE TABLE [licenses] (
       [key] TEXT PRIMARY KEY,
       [name] TEXT,
       [spdx_id] TEXT,
       [url] TEXT,
       [node_id] TEXT
    );

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
FTS table with 7 rows has _fts_docsize table with 9,141 rows 695319258  
688460729 https://github.com/simonw/sqlite-utils/issues/149#issuecomment-688460729 https://api.github.com/repos/simonw/sqlite-utils/issues/149 MDEyOklzc3VlQ29tbWVudDY4ODQ2MDcyOQ== simonw 9599 2020-09-07T18:06:44Z 2020-09-07T18:06:44Z OWNER

First posted on SQLite forum here but I'm pretty sure this is a bug in how sqlite-utils created those tables: https://sqlite.org/forum/forumpost/51aada1b45

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
FTS table with 7 rows has _fts_docsize table with 9,141 rows 695319258  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);