issue_comments
13 rows where author_association = "CONTRIBUTOR" and issue = 642572841 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
905904540 | https://github.com/simonw/datasette/issues/859#issuecomment-905904540 | https://api.github.com/repos/simonw/datasette/issues/859 | IC_kwDOBm6k_c41_wGc | brandonrobertz 2670795 | 2021-08-25T21:59:14Z | 2021-08-25T21:59:55Z | CONTRIBUTOR | I did two tests: one with 1000 5-30mb DBs and a second with 20 multi gig DBs. For the second, I created them like so:
This was for deciding whether to use lots of small DBs or to group things into a smaller number of bigger DBs. The second strategy wins. By simply persisting the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Database page loads too slowly with many large tables (due to table counts) 642572841 | |
905899177 | https://github.com/simonw/datasette/issues/859#issuecomment-905899177 | https://api.github.com/repos/simonw/datasette/issues/859 | IC_kwDOBm6k_c41_uyp | brandonrobertz 2670795 | 2021-08-25T21:48:00Z | 2021-08-25T21:48:00Z | CONTRIBUTOR | On a first stab, there are two issues here: - DB/table/row counts (as discussed above). This isn't too bad if the DBs are actually above the MAX limit check. - Populating the internal DB. On first load of a giant set of DBs, it can take 10-20 mins to populate. By altering Datasette and persisting the internal DB to disk, this problem is vastly improved, but I'm sure this will cause problems elsewhere. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Database page loads too slowly with many large tables (due to table counts) 642572841 | |
904982056 | https://github.com/simonw/datasette/issues/859#issuecomment-904982056 | https://api.github.com/repos/simonw/datasette/issues/859 | IC_kwDOBm6k_c418O4o | brandonrobertz 2670795 | 2021-08-24T21:15:04Z | 2021-08-24T21:15:30Z | CONTRIBUTOR | I'm running into issues with this as well. All other pages seem to work with lots of DBs except the home page, which absolutely tanks. I would be willing to put some work into this if there's been any progress on a concept for how this ought to work. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Database page loads too slowly with many large tables (due to table counts) 642572841 | |
647922203 | https://github.com/simonw/datasette/issues/859#issuecomment-647922203 | https://api.github.com/repos/simonw/datasette/issues/859 | MDEyOklzc3VlQ29tbWVudDY0NzkyMjIwMw== | abdusco 3243482 | 2020-06-23T05:44:58Z | 2021-01-05T08:22:43Z | CONTRIBUTOR | I'm seeing the problem on the database page. The index page and table page run quite fast.
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Database page loads too slowly with many large tables (due to table counts) 642572841 | |
652160909 | https://github.com/simonw/datasette/issues/859#issuecomment-652160909 | https://api.github.com/repos/simonw/datasette/issues/859 | MDEyOklzc3VlQ29tbWVudDY1MjE2MDkwOQ== | abdusco 3243482 | 2020-07-01T03:09:32Z | 2020-07-01T03:10:21Z | CONTRIBUTOR | I've just realized Datasette tries to count hidden tables too. There are 5 visible tables and 25 hidden tables, and I hadn't thought to consider the effect of the hidden ones earlier. I've turned off counting for hidden tables to see if it has any effect. What's the point of counting FTS tables? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Database page loads too slowly with many large tables (due to table counts) 642572841 | |
648669523 | https://github.com/simonw/datasette/issues/859#issuecomment-648669523 | https://api.github.com/repos/simonw/datasette/issues/859 | MDEyOklzc3VlQ29tbWVudDY0ODY2OTUyMw== | abdusco 3243482 | 2020-06-24T08:13:23Z | 2020-06-24T10:30:36Z | CONTRIBUTOR | I tried setting
It feels like 10 seconds is a magic number, as if a processing timeout fires and Datasette gives up and returns the page. The index page loads instantly, as do the table and query pages. But when I return to the database page after some time, it loads in 10s. EDIT: It's always about 10s + 0.3s: a 10s wait until the timeout, then 300ms to render the page. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Database page loads too slowly with many large tables (due to table counts) 642572841 | |
648232645 | https://github.com/simonw/datasette/issues/859#issuecomment-648232645 | https://api.github.com/repos/simonw/datasette/issues/859 | MDEyOklzc3VlQ29tbWVudDY0ODIzMjY0NQ== | abdusco 3243482 | 2020-06-23T15:19:53Z | 2020-06-23T15:19:53Z | CONTRIBUTOR | The issue seems to appear sporadically, for example when I return to the database page after a while, during which some records have been added to the database. I've just visited the database page: the first visit took ~10s, and subsequent visits took 0.3s. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Database page loads too slowly with many large tables (due to table counts) 642572841 | |
647925594 | https://github.com/simonw/datasette/issues/859#issuecomment-647925594 | https://api.github.com/repos/simonw/datasette/issues/859 | MDEyOklzc3VlQ29tbWVudDY0NzkyNTU5NA== | abdusco 3243482 | 2020-06-23T05:55:21Z | 2020-06-23T06:28:29Z | CONTRIBUTOR | Hmm, not seeing the problem now. I have a couple of workers that check some pages regularly, scrape new content and save it to the DB. Could it be that Datasette tries to recount tables every time the database size changes? Normally it keeps a count cache, but as the DB gets updated so often (new content every 5 min or so), is it practically recounting every time I go to the database page? EDIT: It turns out it doesn't keep a cache for mutable databases. I'll update the issue with more findings and a better way to reproduce the problem if I encounter it again. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Database page loads too slowly with many large tables (due to table counts) 642572841 | |
647936117 | https://github.com/simonw/datasette/issues/859#issuecomment-647936117 | https://api.github.com/repos/simonw/datasette/issues/859 | MDEyOklzc3VlQ29tbWVudDY0NzkzNjExNw== | abdusco 3243482 | 2020-06-23T06:25:17Z | 2020-06-23T06:25:17Z | CONTRIBUTOR |
Try chunking write operations into batches every 1000 records or so. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Database page loads too slowly with many large tables (due to table counts) 642572841 | |
647935300 | https://github.com/simonw/datasette/issues/859#issuecomment-647935300 | https://api.github.com/repos/simonw/datasette/issues/859 | MDEyOklzc3VlQ29tbWVudDY0NzkzNTMwMA== | abdusco 3243482 | 2020-06-23T06:23:01Z | 2020-06-23T06:23:01Z | CONTRIBUTOR |
Ah that was a typo, I meant 50k. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Database page loads too slowly with many large tables (due to table counts) 642572841 | |
647923666 | https://github.com/simonw/datasette/issues/859#issuecomment-647923666 | https://api.github.com/repos/simonw/datasette/issues/859 | MDEyOklzc3VlQ29tbWVudDY0NzkyMzY2Ng== | abdusco 3243482 | 2020-06-23T05:49:31Z | 2020-06-23T05:49:31Z | CONTRIBUTOR | I think I should mention that having FTS on all tables means I have 5 visible and 25 hidden (FTS) tables displayed on the database page. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Database page loads too slowly with many large tables (due to table counts) 642572841 | |
647194131 | https://github.com/simonw/datasette/issues/859#issuecomment-647194131 | https://api.github.com/repos/simonw/datasette/issues/859 | MDEyOklzc3VlQ29tbWVudDY0NzE5NDEzMQ== | abdusco 3243482 | 2020-06-21T23:15:54Z | 2020-06-21T23:26:09Z | CONTRIBUTOR | I'm not sure if table counts are to blame. There shouldn't be a ~3 orders of magnitude difference. ```fish user@klein /a/w/scrapyard (master)> set sql "select count(*) from table_1; select count(*) from table_2; select count(*) from table_3;" user@klein /a/w/scrapyard (master)> time sqlite3 scrapyard.db "$sql" 187489 46492 2229 Executed in 25.57 millis fish external usr time 3.55 millis 0.00 micros 3.55 millis sys time 22.42 millis 1123.00 micros 21.30 millis ``` but not letting Datasette count the tables definitely helps. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Database page loads too slowly with many large tables (due to table counts) 642572841 | |
647135713 | https://github.com/simonw/datasette/issues/859#issuecomment-647135713 | https://api.github.com/repos/simonw/datasette/issues/859 | MDEyOklzc3VlQ29tbWVudDY0NzEzNTcxMw== | abdusco 3243482 | 2020-06-21T14:30:02Z | 2020-06-21T14:30:02Z | CONTRIBUTOR | Oops, the same method is called from both the index and database pages. But removing the select count queries speeds up the page load quite a bit. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Database page loads too slowly with many large tables (due to table counts) 642572841 |
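The comparison made in the comments above (timing raw `select count(*)` queries against the page load time) can also be reproduced from Python. The following is a minimal sketch using the standard `sqlite3` module, not Datasette's own code: `quote_identifier`, `time_table_counts` and `bounded_count` are hypothetical helper names, and the default cap of 10000 is an arbitrary assumption. The second helper illustrates one general way to keep a count cheap on very large tables, by asking SQLite for at most `cap + 1` rows.

```python
import sqlite3
import time


def quote_identifier(name):
    # Table names cannot be bound as parameters, so quote them explicitly.
    return '"' + name.replace('"', '""') + '"'


def time_table_counts(conn, tables):
    """Return {table: (row_count, elapsed_ms)} for each table name."""
    results = {}
    for table in tables:
        start = time.perf_counter()
        count = conn.execute(
            f"select count(*) from {quote_identifier(table)}"
        ).fetchone()[0]
        results[table] = (count, (time.perf_counter() - start) * 1000)
    return results


def bounded_count(conn, table, cap=10000):
    """Count at most cap + 1 rows, so the result is min(row_count, cap + 1)."""
    sql = (
        "select count(*) from "
        f"(select 1 from {quote_identifier(table)} limit ?)"
    )
    return conn.execute(sql, (cap + 1,)).fetchone()[0]
```

A result of `cap + 1` from `bounded_count` means the table holds more than `cap` rows, which is enough to display something like ">10,000 rows" without scanning the whole table.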
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id]),
   [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
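The filter described at the top of this page (`author_association = "CONTRIBUTOR"` and `issue = 642572841`, sorted by `updated_at` descending) maps onto a plain SQL query against this schema. Below is a minimal sketch using Python's built-in `sqlite3` module. It assumes a simplified copy of the table with the `REFERENCES` clauses omitted, and `contributor_comments` is a hypothetical helper name, not part of Datasette.

```python
import sqlite3

# Simplified copy of the schema above (REFERENCES clauses omitted for brevity).
SCHEMA = """
CREATE TABLE [issue_comments] (
    [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY,
    [node_id] TEXT, [user] INTEGER, [created_at] TEXT, [updated_at] TEXT,
    [author_association] TEXT, [body] TEXT, [reactions] TEXT,
    [issue] INTEGER, [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
"""


def contributor_comments(conn, issue_id):
    """Return (id, updated_at) for CONTRIBUTOR comments on an issue, newest first."""
    # ISO 8601 timestamps in a single time zone sort correctly as plain text.
    return conn.execute(
        "select id, updated_at from issue_comments "
        "where author_association = ? and issue = ? "
        "order by updated_at desc",
        ("CONTRIBUTOR", issue_id),
    ).fetchall()
```

Calling `contributor_comments(conn, 642572841)` on a database loaded with this table would return the same 13 rows listed above, in the same order.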