
issue_comments


3 rows where "created_at" is on 2021-08-25 and issue = 642572841, sorted by updated_at descending



Issue: Database page loads too slowly with many large tables (due to table counts) (642572841) · 3 comments
905904540 · brandonrobertz (CONTRIBUTOR) · created 2021-08-25T21:59:14Z · updated 2021-08-25T21:59:55Z
https://github.com/simonw/datasette/issues/859#issuecomment-905904540

I did two tests: one with 1,000 5–30 MB DBs and a second with 20 multi-gigabyte DBs. For the second, I created them like so:

    for i in {1..20}; do sqlite-generate db$i.db --tables ${i}00 --rows 100,2000 --columns 5,100 --pks 0 --fks 0; done

This was for deciding whether to use lots of small DBs or to group things into a smaller number of bigger DBs. The second strategy wins.

By simply persisting the _internal DB to disk, I was able to avoid most of the performance issues I was experiencing previously. (To do this, I changed the CREATE TABLE statements in datasette/internal_db.py:init_internal_db to CREATE TABLE IF NOT EXISTS, and changed the _internal DB instantiation in datasette/app.py:Datasette.__init__ to point at a file path with is_mutable=True.) Super rough, but the pages now load, so I can continue testing ideas.
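The two changes described above can be sketched in plain `sqlite3` — this is a minimal illustration of the idea (an idempotent, on-disk catalog DB that survives restarts), not Datasette's actual code; the path constant and the `tables` schema here are hypothetical:

```python
import os
import sqlite3
import tempfile

# Hypothetical on-disk location for the catalog; Datasette's real change
# passed a file path plus is_mutable=True instead of an in-memory DB.
INTERNAL_DB_PATH = os.path.join(tempfile.gettempdir(), "datasette_internal.db")


def init_internal_db(conn):
    # "CREATE TABLE IF NOT EXISTS" makes schema creation a no-op when the
    # file already exists, so a restart reuses the populated catalog
    # instead of rebuilding it from scratch.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS tables (
            database_name TEXT,
            table_name TEXT,
            row_count INTEGER,
            PRIMARY KEY (database_name, table_name)
        )
        """
    )
    conn.commit()


def open_internal_db(path=INTERNAL_DB_PATH):
    # Opening by path (rather than ":memory:") is the persistence part.
    conn = sqlite3.connect(path)
    init_internal_db(conn)  # safe to call on every startup
    return conn
```

Run it twice and the second startup skips the expensive population step, which is the effect reported above.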

905900807 · simonw (OWNER) · created 2021-08-25T21:51:10Z
https://github.com/simonw/datasette/issues/859#issuecomment-905900807

10-20 minutes to populate _internal! How many databases and tables is that for?

I may have to rethink the _internal mechanism entirely. One possible alternative would be for the Datasette homepage to just show a list of available databases (maybe only if there are more than X connected) and then load in their metadata only the first time they are accessed.

I need to get my own stress-testing rig set up for this.
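The alternative floated above — list database names cheaply and introspect each database's tables only on first access — could look something like this sketch. The `LazyCatalog` class and its method names are hypothetical, not Datasette's API:

```python
import sqlite3
from typing import Dict, List


class LazyCatalog:
    """Hypothetical lazy catalog: metadata per database is loaded on demand."""

    def __init__(self, connections: Dict[str, sqlite3.Connection]):
        self.connections = connections
        self._table_cache: Dict[str, List[str]] = {}

    def database_names(self) -> List[str]:
        # Cheap enough for a homepage: no per-database introspection at all.
        return sorted(self.connections)

    def tables(self, db_name: str) -> List[str]:
        # Expensive introspection runs once per database, the first time
        # that database is actually visited, then is served from the cache.
        if db_name not in self._table_cache:
            conn = self.connections[db_name]
            rows = conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            ).fetchall()
            self._table_cache[db_name] = [name for (name,) in rows]
        return self._table_cache[db_name]
```

With hundreds of connected databases, the 10–20 minute up-front cost would be spread across first visits instead of blocking startup.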

905899177 · brandonrobertz (CONTRIBUTOR) · created 2021-08-25T21:48:00Z
https://github.com/simonw/datasette/issues/859#issuecomment-905899177

Upon first stab, there are two issues here:
- DB/table/row counts (as discussed above). This isn't too bad if the DBs are actually above the MAX limit check.
- Populating the internal DB. On first load of a giant set of DBs, it can take 10–20 minutes to populate. By altering Datasette to persist the internal DB to disk, this problem is vastly improved, but I'm sure this will cause problems elsewhere.
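The "MAX limit check" idea above — avoid a full COUNT(*) on huge tables by counting at most `limit + 1` rows and reporting "more than limit" otherwise — can be illustrated like this. This is a generic sketch of the capped-count pattern, not Datasette's actual implementation, and `capped_count` is a hypothetical helper:

```python
import sqlite3


def capped_count(conn, table, limit=1000):
    """Return an exact count up to `limit`, or None meaning "more than limit".

    The inner SELECT stops producing rows once limit + 1 have been seen,
    so a billion-row table costs roughly the same as a 1,001-row one.
    """
    (n,) = conn.execute(
        f"SELECT count(*) FROM (SELECT 1 FROM [{table}] LIMIT {limit + 1})"
    ).fetchone()
    return n if n <= limit else None
```

Rendering `None` as "1,000+" keeps the database page responsive regardless of table size.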



CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id]),
   [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 40.977ms · About: github-to-sqlite