{"html_url": "https://github.com/simonw/datasette/issues/859#issuecomment-905904540", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/859", "id": 905904540, "node_id": "IC_kwDOBm6k_c41_wGc", "user": {"value": 2670795, "label": "brandonrobertz"}, "created_at": "2021-08-25T21:59:14Z", "updated_at": "2021-08-25T21:59:55Z", "author_association": "CONTRIBUTOR", "body": "I did two tests: one with 1000 5-30 MB DBs and a second with 20 multi-gigabyte DBs. For the second, I created them like so:\r\n`for i in {1..20}; do sqlite-generate db$i.db --tables ${i}00 --rows 100,2000 --columns 5,100 --pks 0 --fks 0; done`\r\n\r\nThis was for deciding whether to use lots of small DBs or to group things into a smaller number of bigger DBs. The second strategy wins.\r\n\r\nBy simply persisting the `_internal` DB to disk, I was able to avoid most of the performance issues I was experiencing previously. (To do this, I changed the `CREATE TABLE` statements in `datasette/internal_db.py:init_internal_db` to `CREATE TABLE IF NOT EXISTS`, and changed the `_internal` DB instantiation in `datasette/app.py:Datasette.__init__` to a path with `is_mutable=True`.) Super rough, but the pages now load so I can continue testing ideas.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 642572841, "label": "Database page loads too slowly with many large tables (due to table counts)"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/859#issuecomment-905899177", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/859", "id": 905899177, "node_id": "IC_kwDOBm6k_c41_uyp", "user": {"value": 2670795, "label": "brandonrobertz"}, "created_at": "2021-08-25T21:48:00Z", "updated_at": "2021-08-25T21:48:00Z", "author_association": "CONTRIBUTOR", "body": "Upon first stab, there are two issues here:\r\n- DB/table/row counts (as discussed above). 
This isn't too bad if the DBs are actually above the MAX limit check.\r\n- Populating the internal DB. On first load of a giant set of DBs, it can take 10-20 mins to populate. By altering datasette and persisting the internal DB to disk, this problem is vastly improved, but I'm sure this will cause problems elsewhere.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 642572841, "label": "Database page loads too slowly with many large tables (due to table counts)"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/859#issuecomment-904982056", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/859", "id": 904982056, "node_id": "IC_kwDOBm6k_c418O4o", "user": {"value": 2670795, "label": "brandonrobertz"}, "created_at": "2021-08-24T21:15:04Z", "updated_at": "2021-08-24T21:15:30Z", "author_association": "CONTRIBUTOR", "body": "I'm running into issues with this as well. All other pages seem to work with lots of DBs except the home page, which absolutely tanks. 
I'd be willing to put some work into this if there's been any progress on how this ought to work.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 642572841, "label": "Database page loads too slowly with many large tables (due to table counts)"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1168#issuecomment-869076254", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1168", "id": 869076254, "node_id": "MDEyOklzc3VlQ29tbWVudDg2OTA3NjI1NA==", "user": {"value": 2670795, "label": "brandonrobertz"}, "created_at": "2021-06-27T00:03:16Z", "updated_at": "2021-06-27T00:05:51Z", "author_association": "CONTRIBUTOR", "body": "> Related: Here's an implementation of a `get_metadata()` plugin hook by @brandonrobertz [next-LI@3fd8ce9](https://github.com/next-LI/datasette/commit/3fd8ce91f3108c82227bf65ff033929426c60437)\r\n\r\nHere's a plugin that implements metadata-within-DBs: [next-LI/datasette-live-config](https://github.com/next-LI/datasette-live-config)\r\n\r\nHow it works: If a database has a `__metadata` table, then it gets parsed and included in the global metadata. 
It also implements a database-action hook with a UI for managing config.\r\n\r\nMore context: https://github.com/next-LI/datasette-live-config/blob/72e335e887f1c69c54c6c2441e07148955b0fc9f/datasette_live_config/__init__.py#L109-L140", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 777333388, "label": "Mechanism for storing metadata in _metadata tables"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1384#issuecomment-869074701", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1384", "id": 869074701, "node_id": "MDEyOklzc3VlQ29tbWVudDg2OTA3NDcwMQ==", "user": {"value": 2670795, "label": "brandonrobertz"}, "created_at": "2021-06-26T23:45:18Z", "updated_at": "2021-06-26T23:45:37Z", "author_association": "CONTRIBUTOR", "body": "> Here's where the plugin hook is called, demonstrating the `fallback=` argument:\r\n> \r\n> https://github.com/simonw/datasette/blob/05a312caf3debb51aa1069939923a49e21cd2bd1/datasette/app.py#L426-L472\r\n> \r\n> I'm not convinced of the use-case for passing `fallback=` to the hook here - is there a reason a plugin might care whether fallback is `True` or `False`, seeing as the `metadata()` method already respects that fallback logic on line 459?\r\n\r\nI think you're right. 
I can't think of a reason why the plugin would care about the `fallback` parameter since plugins are currently mandated to return a full, global metadata dict.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 930807135, "label": "Plugin hook for dynamic metadata"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1384#issuecomment-869074182", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1384", "id": 869074182, "node_id": "MDEyOklzc3VlQ29tbWVudDg2OTA3NDE4Mg==", "user": {"value": 2670795, "label": "brandonrobertz"}, "created_at": "2021-06-26T23:37:42Z", "updated_at": "2021-06-26T23:37:42Z", "author_association": "CONTRIBUTOR", "body": "> > Hmmm... that's tricky, since one of the most obvious ways to use this hook is to load metadata from database tables using SQL queries.\r\n> > @brandonrobertz do you have a working example of using this hook to populate metadata from database tables I can try?\r\n> \r\n> Answering my own question: here's how Brandon implements it in his `datasette-live-config` plugin: https://github.com/next-LI/datasette-live-config/blob/72e335e887f1c69c54c6c2441e07148955b0fc9f/datasette_live_config/__init__.py#L50-L160\r\n> \r\n> That's using a completely separate SQLite connection (actually wrapped in `sqlite-utils`) and making blocking synchronous calls to it.\r\n> \r\n> This is a pragmatic solution, which works - and likely performs just fine, because SQL queries like this against a small database are so fast that not running them asynchronously isn't actually a problem.\r\n> \r\n> But... it's weird. 
Everywhere else in Datasette land uses `await db.execute(...)` - but here's an example where users are encouraged to use blocking calls instead.\r\n\r\n_Ideally_ this hook would be asynchronous, but when I started down that path I quickly realized how large a change this would be, since metadata gets used synchronously across the entire Datasette codebase. (And calling async code from sync is non-trivial.)\r\n\r\nIn my live-configuration implementation I use synchronous reads via a persistent SQLite connection. This works pretty well in practice, but I agree it's limiting. My thinking around this was to go with the path of least change, as `Datasette.metadata()` is a critical core function.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 930807135, "label": "Plugin hook for dynamic metadata"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1368#issuecomment-865204472", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1368", "id": 865204472, "node_id": "MDEyOklzc3VlQ29tbWVudDg2NTIwNDQ3Mg==", "user": {"value": 2670795, "label": "brandonrobertz"}, "created_at": "2021-06-21T17:11:37Z", "updated_at": "2021-06-21T17:11:37Z", "author_association": "CONTRIBUTOR", "body": "If this is a concept ACK then I will move on to fixing the tests (adding new ones) and updating the documentation for the new plugin hook.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 913865304, "label": "DRAFT: A new plugin hook for dynamic metadata"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1368#issuecomment-856182547", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1368", "id": 856182547, "node_id": 
"MDEyOklzc3VlQ29tbWVudDg1NjE4MjU0Nw==", "user": {"value": 2670795, "label": "brandonrobertz"}, "created_at": "2021-06-07T18:59:47Z", "updated_at": "2021-06-07T23:04:25Z", "author_association": "CONTRIBUTOR", "body": "Note that if we went with an \"update_metadata\" hook, the hook signature would look something like this (it would return nothing):\r\n\r\n```\r\nupdate_metadata(\r\n datasette=self, metadata=metadata, key=key, database=database, table=table,\r\n fallback=fallback\r\n)\r\n```\r\n\r\nThe Datasette function `_metadata_recursive_update(self, orig, updated)` would disappear into the plugins. Doing this, though, we'd lose the easy ability to make the local metadata.yaml immutable (since we'd no longer have the recursive update).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 913865304, "label": "DRAFT: A new plugin hook for dynamic metadata"}, "performed_via_github_app": null}