github
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/simonw/datasette/issues/859#issuecomment-905904540 | https://api.github.com/repos/simonw/datasette/issues/859 | 905904540 | IC_kwDOBm6k_c41_wGc | 2670795 | 2021-08-25T21:59:14Z | 2021-08-25T21:59:55Z | CONTRIBUTOR | I did two tests: one with 1000 5-30mb DBs and a second with 20 multi gig DBs. For the second, I created them like so: `for i in {1..20}; do sqlite-generate db$i.db --tables ${i}00 --rows 100,2000 --columns 5,100 --pks 0 --fks 0; done` This was for deciding whether to use lots of small DBs or to group things into a smaller number of bigger DBs. The second strategy wins. By simply persisting the `_internal` DB to disk, I was able to avoid most of the performance issues I was experiencing previously. (To do this, I changed the `datasette/internal_db.py:init_internal_db` creates to if not exists, and changed the `_internal` DB instantiation in `datasette/app.py:Datasette.__init__` to a path with `is_mutable=True`.) Super rough, but the pages now load so I can continue testing ideas. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
642572841 | |
https://github.com/simonw/datasette/issues/859#issuecomment-905900807 | https://api.github.com/repos/simonw/datasette/issues/859 | 905900807 | IC_kwDOBm6k_c41_vMH | 9599 | 2021-08-25T21:51:10Z | 2021-08-25T21:51:10Z | OWNER | 10-20 minutes to populate `_internal`! How many databases and tables is that for? I may have to rethink the `_internal` mechanism entirely. One possible alternative would be for the Datasette homepage to just show a list of available databases (maybe only if there are more than X connected) and then load in their metadata only the first time they are accessed. I need to get my own stress testing rig setup for this. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
642572841 | |
https://github.com/simonw/datasette/issues/859#issuecomment-905899177 | https://api.github.com/repos/simonw/datasette/issues/859 | 905899177 | IC_kwDOBm6k_c41_uyp | 2670795 | 2021-08-25T21:48:00Z | 2021-08-25T21:48:00Z | CONTRIBUTOR | Upon first stab, there's two issues here: - DB/table/row counts (as discussed above). This isn't too bad if the DBs are actually above the MAX limit check. - Populating the internal DB. On first load of a giant set of DBs, it can take 10-20 mins to populate. By altering datasette and persisting the internal DB to disk, this problem is vastly improved, but I'm sure this will cause problems elsewhere. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
642572841 |