github

This data as json, CSV

html_url	issue_url	id	node_id	user	created_at	updated_at	author_association	body	reactions	issue	performed_via_github_app
https://github.com/simonw/datasette/issues/859#issuecomment-905904540	https://api.github.com/repos/simonw/datasette/issues/859	905904540	IC_kwDOBm6k_c41_wGc	2670795	2021-08-25T21:59:14Z	2021-08-25T21:59:55Z	CONTRIBUTOR	I did two tests: one with 1000 5-30mb DBs and a second with 20 multi gig DBs. For the second, I created them like so: `for i in {1..20}; do sqlite-generate db$i.db --tables ${i}00 --rows 100,2000 --columns 5,100 --pks 0 --fks 0; done` This was for deciding whether to use lots of small DBs or to group things into a smaller number of bigger DBs. The second strategy wins. By simply persisting the `_internal` DB to disk, I was able to avoid most of the performance issues I was experiencing previously. (To do this, I changed the `datasette/internal_db.py:init_internal_db` creates to if not exists, and changed the `_internal` DB instantiation in `datasette/app.py:Datasette.__init__` to a path with `is_mutable=True`.) Super rough, but the pages now load so I can continue testing ideas.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	642572841
https://github.com/simonw/datasette/issues/859#issuecomment-905900807	https://api.github.com/repos/simonw/datasette/issues/859	905900807	IC_kwDOBm6k_c41_vMH	9599	2021-08-25T21:51:10Z	2021-08-25T21:51:10Z	OWNER	10-20 minutes to populate `_internal`! How many databases and tables is that for? I may have to rethink the `_internal` mechanism entirely. One possible alternative would be for the Datasette homepage to just show a list of available databases (maybe only if there are more than X connected) and then load in their metadata only the first time they are accessed. I need to get my own stress testing rig setup for this.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	642572841
https://github.com/simonw/datasette/issues/859#issuecomment-905899177	https://api.github.com/repos/simonw/datasette/issues/859	905899177	IC_kwDOBm6k_c41_uyp	2670795	2021-08-25T21:48:00Z	2021-08-25T21:48:00Z	CONTRIBUTOR	Upon first stab, there's two issues here: - DB/table/row counts (as discussed above). This isn't too bad if the DBs are actually above the MAX limit check. - Populating the internal DB. On first load of a giant set of DBs, it can take 10-20 mins to populate. By altering datasette and persisting the internal DB to disk, this problem is vastly improved, but I'm sure this will cause problems elsewhere.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	642572841