{"html_url": "https://github.com/simonw/datasette/issues/1150#issuecomment-747768112", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1150", "id": 747768112, "node_id": "MDEyOklzc3VlQ29tbWVudDc0Nzc2ODExMg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-12-17T23:25:21Z", "updated_at": "2020-12-17T23:25:21Z", "author_association": "OWNER", "body": "Next challenge: figure out how to use the `Database` class from https://github.com/simonw/datasette/blob/0.53/datasette/database.py for an in-memory database which persists data for the duration of the lifetime of the server, and allows access to that in-memory database from multiple threads in a way that lets them see each other's changes.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 770436876, "label": "Maintain an in-memory SQLite table of connected databases and their tables"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1150#issuecomment-747767598", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1150", "id": 747767598, "node_id": "MDEyOklzc3VlQ29tbWVudDc0Nzc2NzU5OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-12-17T23:24:03Z", "updated_at": "2020-12-17T23:24:03Z", "author_association": "OWNER", "body": "I'm going to assume that even the heaviest user will have trouble going beyond a few hundred database files, so this is fine.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 770436876, "label": "Maintain an in-memory SQLite table of connected databases and their tables"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1150#issuecomment-747767499", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1150", "id": 747767499, "node_id": "MDEyOklzc3VlQ29tbWVudDc0Nzc2NzQ5OQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-12-17T23:23:44Z", "updated_at": "2020-12-17T23:23:44Z", "author_association": "OWNER", "body": "Grabbing the schema version of 380 files in the root directory takes 70ms.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 770436876, "label": "Maintain an in-memory SQLite table of connected databases and their tables"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1150#issuecomment-747767055", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1150", "id": 747767055, "node_id": "MDEyOklzc3VlQ29tbWVudDc0Nzc2NzA1NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-12-17T23:22:41Z", "updated_at": "2020-12-17T23:22:41Z", "author_association": "OWNER", "body": "It's just recursion that's expensive. I created 380 empty SQLite databases in a folder and timed `list(pathlib.Path(\"/tmp\").glob(\"*.db\"));` and it took 0.002s.\r\n\r\nSo maybe I tell users that all SQLite databases have to be in the root folder.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 770436876, "label": "Maintain an in-memory SQLite table of connected databases and their tables"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1150#issuecomment-747766310", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1150", "id": 747766310, "node_id": "MDEyOklzc3VlQ29tbWVudDc0Nzc2NjMxMA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-12-17T23:20:49Z", "updated_at": "2020-12-17T23:20:49Z", "author_association": "OWNER", "body": "I tried against my entire `~/Development/Dropbox` folder - deeply nested with 381 SQLite database files in sub-folders - and it took 25s! But it turned out 23.9s of that was the call to `pathlib.Path(\"/Users/simon/Dropbox/Development\").glob('**/*.db')`.\r\n\r\nSo it looks like connecting to a SQLite database file and getting the schema version is extremely fast. Scanning directories is slower.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 770436876, "label": "Maintain an in-memory SQLite table of connected databases and their tables"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1150#issuecomment-747764712", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1150", "id": 747764712, "node_id": "MDEyOklzc3VlQ29tbWVudDc0Nzc2NDcxMg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-12-17T23:16:31Z", "updated_at": "2020-12-17T23:16:31Z", "author_association": "OWNER", "body": "Quick micro-benchmark, run against a folder with 46 database files adding up to 1.4GB total:\r\n```python\r\nimport pathlib, sqlite3, time\r\n\r\npaths = list(pathlib.Path(\".\").glob('*.db'))\r\n\r\ndef schema_version(path):\r\n db = sqlite3.connect(path)\r\n version = db.execute(\"PRAGMA schema_version\").fetchall()[0]\r\n db.close()\r\n return version\r\n\r\ndef all():\r\n versions = {}\r\n for path in paths:\r\n versions[path.name] = schema_version(path)\r\n return versions\r\n\r\nstart = time.time(); all(); print(time.time() - start)\r\n# 0.012346982955932617\r\n```\r\nSo that's 12ms.\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 770436876, "label": "Maintain an in-memory SQLite table of connected databases and their tables"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1150#issuecomment-747754229", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1150", "id": 747754229, "node_id": "MDEyOklzc3VlQ29tbWVudDc0Nzc1NDIyOQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-12-17T23:04:38Z", "updated_at": "2020-12-17T23:04:38Z", "author_association": "OWNER", "body": "Open question: will this work for hundreds of database files, or is the overhead of connecting to each of 100 databases in turn to run `PRAGMA schema_version` too high?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 770436876, "label": "Maintain an in-memory SQLite table of connected databases and their tables"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1150#issuecomment-747754082", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1150", "id": 747754082, "node_id": "MDEyOklzc3VlQ29tbWVudDc0Nzc1NDA4Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-12-17T23:04:13Z", "updated_at": "2020-12-17T23:04:13Z", "author_association": "OWNER", "body": "Pages that need a list of all databases - the index page and /-/databases for example - could trigger a \"check for new directories in the configured directories\" scan.\r\n\r\nThat scan would run at most once every 5 (n) seconds - the check is triggered if it\u2019s run more recently than that it doesn\u2019t run.\r\n\r\nHopefully this means it could be done as a blocking operation, rather than trying to run it in a thread.\r\n\r\nWhen it runs it scans for *.db or *.sqlite files (maybe one or two other extensions) that it hasn\u2019t seen before. It also checks that the existing list of known database files still exists.\r\n\r\nIf it finds any new ones it connects to them once to run `.schema`. It also runs `PRAGMA schema_version` on each known database so that it can compare the schema version number to the last one it saw. That's how it detects if there are new tables or if the cached schema needs to be updated.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 770436876, "label": "Maintain an in-memory SQLite table of connected databases and their tables"}, "performed_via_github_app": null}