html_url,issue_url,id,node_id,user,user_label,created_at,updated_at,author_association,body,reactions,issue,issue_label,performed_via_github_app
https://github.com/simonw/datasette/issues/1150#issuecomment-747768112,https://api.github.com/repos/simonw/datasette/issues/1150,747768112,MDEyOklzc3VlQ29tbWVudDc0Nzc2ODExMg==,9599,simonw,2020-12-17T23:25:21Z,2020-12-17T23:25:21Z,OWNER,"Next challenge: figure out how to use the `Database` class from https://github.com/simonw/datasette/blob/0.53/datasette/database.py for an in-memory database which persists data for the lifetime of the server, and allows access to that in-memory database from multiple threads in a way that lets them see each other's changes.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",770436876,Maintain an in-memory SQLite table of connected databases and their tables,
https://github.com/simonw/datasette/issues/1150#issuecomment-747767598,https://api.github.com/repos/simonw/datasette/issues/1150,747767598,MDEyOklzc3VlQ29tbWVudDc0Nzc2NzU5OA==,9599,simonw,2020-12-17T23:24:03Z,2020-12-17T23:24:03Z,OWNER,"I'm going to assume that even the heaviest user will have trouble going beyond a few hundred database files, so this is fine.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",770436876,Maintain an in-memory SQLite table of connected databases and their tables,
https://github.com/simonw/datasette/issues/1150#issuecomment-747767499,https://api.github.com/repos/simonw/datasette/issues/1150,747767499,MDEyOklzc3VlQ29tbWVudDc0Nzc2NzQ5OQ==,9599,simonw,2020-12-17T23:23:44Z,2020-12-17T23:23:44Z,OWNER,Grabbing the schema version of 380 files in the root directory takes 70ms.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",770436876,Maintain an in-memory SQLite table of connected databases and their tables,
https://github.com/simonw/datasette/issues/1150#issuecomment-747767055,https://api.github.com/repos/simonw/datasette/issues/1150,747767055,MDEyOklzc3VlQ29tbWVudDc0Nzc2NzA1NQ==,9599,simonw,2020-12-17T23:22:41Z,2020-12-17T23:22:41Z,OWNER,"It's just recursion that's expensive. I created 380 empty SQLite databases in a folder and timed `list(pathlib.Path(""/tmp"").glob(""*.db""));` and it took 0.002s. So maybe I tell users that all SQLite databases have to be in the root folder.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",770436876,Maintain an in-memory SQLite table of connected databases and their tables,
https://github.com/simonw/datasette/issues/1150#issuecomment-747766310,https://api.github.com/repos/simonw/datasette/issues/1150,747766310,MDEyOklzc3VlQ29tbWVudDc0Nzc2NjMxMA==,9599,simonw,2020-12-17T23:20:49Z,2020-12-17T23:20:49Z,OWNER,"I tried against my entire `~/Dropbox/Development` folder - deeply nested with 381 SQLite database files in sub-folders - and it took 25s! But it turned out 23.9s of that was the call to `pathlib.Path(""/Users/simon/Dropbox/Development"").glob('**/*.db')`. So it looks like connecting to a SQLite database file and getting the schema version is extremely fast.
Scanning directories is slower.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",770436876,Maintain an in-memory SQLite table of connected databases and their tables,
https://github.com/simonw/datasette/issues/1150#issuecomment-747764712,https://api.github.com/repos/simonw/datasette/issues/1150,747764712,MDEyOklzc3VlQ29tbWVudDc0Nzc2NDcxMg==,9599,simonw,2020-12-17T23:16:31Z,2020-12-17T23:16:31Z,OWNER,"Quick micro-benchmark, run against a folder with 46 database files adding up to 1.4GB total:
```python
import pathlib, sqlite3, time

paths = list(pathlib.Path(""."").glob('*.db'))

def schema_version(path):
    # Open a fresh connection per file, read the schema version, then close
    db = sqlite3.connect(path)
    version = db.execute(""PRAGMA schema_version"").fetchall()[0]
    db.close()
    return version

def all():
    # Map each database filename to its schema version row
    versions = {}
    for path in paths:
        versions[path.name] = schema_version(path)
    return versions

start = time.time(); all(); print(time.time() - start)
# 0.012346982955932617
```
So that's 12ms.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",770436876,Maintain an in-memory SQLite table of connected databases and their tables,
https://github.com/simonw/datasette/issues/1150#issuecomment-747754229,https://api.github.com/repos/simonw/datasette/issues/1150,747754229,MDEyOklzc3VlQ29tbWVudDc0Nzc1NDIyOQ==,9599,simonw,2020-12-17T23:04:38Z,2020-12-17T23:04:38Z,OWNER,"Open question: will this work for hundreds of database files, or is the overhead of connecting to each of 100 databases in turn to run `PRAGMA schema_version` too high?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",770436876,Maintain an in-memory SQLite table of connected databases and their tables,
https://github.com/simonw/datasette/issues/1150#issuecomment-747754082,https://api.github.com/repos/simonw/datasette/issues/1150,747754082,MDEyOklzc3VlQ29tbWVudDc0Nzc1NDA4Mg==,9599,simonw,2020-12-17T23:04:13Z,2020-12-17T23:04:13Z,OWNER,"Pages that need a list of all databases - the index page and /-/databases for example - could trigger a ""check for new directories in the configured directories"" scan. That scan would run at most once every 5 (n) seconds - when the check is triggered, if the scan has run more recently than that it doesn’t run again. Hopefully this means it could be done as a blocking operation, rather than trying to run it in a thread. When it runs it scans for *.db or *.sqlite files (maybe one or two other extensions) that it hasn’t seen before. It also checks that the known database files still exist. If it finds any new ones it connects to them once to run `.schema`. It also runs `PRAGMA schema_version` on each known database so that it can compare the schema version number to the last one it saw.
That's how it detects if there are new tables or if the cached schema needs to be updated.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",770436876,Maintain an in-memory SQLite table of connected databases and their tables,