html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,issue,performed_via_github_app
https://github.com/simonw/datasette/issues/673#issuecomment-586455321,https://api.github.com/repos/simonw/datasette/issues/673,586455321,MDEyOklzc3VlQ29tbWVudDU4NjQ1NTMyMQ==,9599,2020-02-14T20:13:59Z,2020-02-14T20:13:59Z,OWNER,Closing this in favour of rethinking how sanity checks work.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565518772,
https://github.com/simonw/datasette/issues/673#issuecomment-586454371,https://api.github.com/repos/simonw/datasette/issues/673,586454371,MDEyOklzc3VlQ29tbWVudDU4NjQ1NDM3MQ==,9599,2020-02-14T20:11:02Z,2020-02-14T20:11:02Z,OWNER,"The technique from `run_sanity_checks` of running `PRAGMA table_info({})` for every table seems to work just fine. It failed for the Apple Photos database for example:
```
sqlite> pragma table_info(RKSceneInVersion_VirtualBufferReader);
Error: no such module: VirtualBufferReaderModule
```
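A minimal sketch of that per-table check, assuming a plain `sqlite3` connection. The function name `tables_failing_table_info` is hypothetical and this is not the existing `run_sanity_checks` code:

```python
import sqlite3


def tables_failing_table_info(conn):
    '''Return {table name: error message} for tables where PRAGMA table_info fails.'''
    failures = {}
    # Virtual tables are listed in sqlite_master with type 'table' as well.
    names = [
        row[0]
        for row in conn.execute(
            'select name from sqlite_master where type = ?', ('table',)
        )
    ]
    for name in names:
        try:
            # Square-bracket quoting; assumes the name contains no closing bracket.
            conn.execute('PRAGMA table_info([{}])'.format(name)).fetchall()
        except sqlite3.OperationalError as e:
            failures[name] = str(e)
    return failures


conn = sqlite3.connect(':memory:')
conn.execute('create table places (id integer primary key, name text)')
print(tables_failing_table_info(conn))
```

An empty result would mean every table could be introspected; any entry points at a table whose module is probably missing.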
So I think the solution to this ticket is going to be moving that logic into a new utility function.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565518772,
https://github.com/simonw/datasette/issues/673#issuecomment-586450571,https://api.github.com/repos/simonw/datasette/issues/673,586450571,MDEyOklzc3VlQ29tbWVudDU4NjQ1MDU3MQ==,9599,2020-02-14T19:59:41Z,2020-02-14T20:01:14Z,OWNER,"This helped:
```
$ sqlite3 /tmp/hearst.db
SQLite version 3.24.0 2018-06-04 14:10:15
Enter "".help"" for usage hints.
sqlite> delete from spatial_ref_sys where srid != 4326;
sqlite> delete from spatial_ref_sys_aux where srid != 4326;
sqlite> vacuum;
sqlite> ^D
$ ls -lah /tmp/hearst.db
-rw-r--r-- 1 simonw wheel 216K Feb 14 12:01 /tmp/hearst.db
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565518772,
https://github.com/simonw/datasette/issues/673#issuecomment-586449286,https://api.github.com/repos/simonw/datasette/issues/673,586449286,MDEyOklzc3VlQ29tbWVudDU4NjQ0OTI4Ng==,9599,2020-02-14T19:56:00Z,2020-02-14T19:57:17Z,OWNER,"I tried to make the smallest SpatiaLite database file I could (to use for the tests), but it ended up over 5MB!
```
$ echo '{""type"":""Feature"",""properties"":{""name"":""Hearst Castle""},""geometry"":{""type"":""Point"",""coordinates"":[-121.1686,35.685]}}' | geojson-to-sqlite /tmp/hearst.db places - --spatialite
$ ls -lah /tmp/hearst.db
-rw-r--r-- 1 simonw wheel 5.3M Feb 14 11:54 /tmp/hearst.db
```
I imagine that's because of these tables:
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565518772,
https://github.com/simonw/datasette/issues/673#issuecomment-586448292,https://api.github.com/repos/simonw/datasette/issues/673,586448292,MDEyOklzc3VlQ29tbWVudDU4NjQ0ODI5Mg==,9599,2020-02-14T19:53:05Z,2020-02-14T19:53:05Z,OWNER,"I may be re-inventing this code at the moment:
https://github.com/simonw/datasette/blob/3ffb8f3b98252531d11897fd431711e9b8045ace/datasette/app.py#L219-L237","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565518772,
https://github.com/simonw/datasette/issues/673#issuecomment-586445210,https://api.github.com/repos/simonw/datasette/issues/673,586445210,MDEyOklzc3VlQ29tbWVudDU4NjQ0NTIxMA==,9599,2020-02-14T19:44:27Z,2020-02-14T19:44:27Z,OWNER,For the unit tests I think I'm going to have to create minimal binary SQLite file examples and include them in the repo.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565518772,
https://github.com/simonw/datasette/issues/673#issuecomment-586444970,https://api.github.com/repos/simonw/datasette/issues/673,586444970,MDEyOklzc3VlQ29tbWVudDU4NjQ0NDk3MA==,9599,2020-02-14T19:43:46Z,2020-02-14T19:43:46Z,OWNER,`is_openable_sqlite` perhaps?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565518772,
https://github.com/simonw/datasette/issues/673#issuecomment-586444835,https://api.github.com/repos/simonw/datasette/issues/673,586444835,MDEyOklzc3VlQ29tbWVudDU4NjQ0NDgzNQ==,9599,2020-02-14T19:43:27Z,2020-02-14T19:43:27Z,OWNER,"I can extend this function (maybe also rename it):
https://github.com/simonw/datasette/blob/52ba34701cdbf510236de87d35b0e6df330626d1/datasette/utils/__init__.py#L595-L610","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565518772,
https://github.com/simonw/datasette/issues/673#issuecomment-586443837,https://api.github.com/repos/simonw/datasette/issues/673,586443837,MDEyOklzc3VlQ29tbWVudDU4NjQ0MzgzNw==,9599,2020-02-14T19:40:42Z,2020-02-14T19:41:56Z,OWNER,"Here's how to test if the `rtree` virtual table is supported:
```
>>> import sqlite3
>>> c = sqlite3.connect("":memory:"")
>>> c.execute(""create virtual table blah using rtree (a, b, c)"")
>>> c.execute(""create virtual table blah2 using rtree2 (a, b, c)"")
Traceback (most recent call last):
  File ""<stdin>"", line 1, in <module>
sqlite3.OperationalError: no such module: rtree2
```
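The same probe could be wrapped in a helper, roughly like this. The name `virtual_table_module_available` and the probe table name are made up for illustration, and the module name is interpolated directly into the SQL, so it should only ever be a trusted string:

```python
import sqlite3


def virtual_table_module_available(conn, module):
    '''Hypothetical helper: False only if SQLite reports the module as missing.'''
    try:
        # Create the probe in the temp schema so the main database is untouched.
        conn.execute(
            'create virtual table temp.module_probe using {} (a, b, c)'.format(module)
        )
    except sqlite3.OperationalError as e:
        if 'no such module' in str(e):
            return False
        # Any other error (for example bad arguments) means the module was found.
        return True
    conn.execute('drop table temp.module_probe')
    return True


c = sqlite3.connect(':memory:')
print(virtual_table_module_available(c, 'rtree2'))
```

Modules loaded through an extension are registered per connection, so the probe needs to run on the same connection that will later open the tables.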
Also:
```
>>> c.execute('''CREATE VIRTUAL TABLE SpatialIndex USING VirtualSpatialIndex()''')
Traceback (most recent call last):
  File ""<stdin>"", line 1, in <module>
sqlite3.OperationalError: no such module: VirtualSpatialIndex
>>> c.enable_load_extension(
... True)
>>>
>>> c.load_extension(""/usr/local/lib/mod_spatialite.dylib"")
>>> c.execute('''CREATE VIRTUAL TABLE SpatialIndex USING VirtualSpatialIndex()''')
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565518772,
https://github.com/simonw/datasette/issues/673#issuecomment-586442978,https://api.github.com/repos/simonw/datasette/issues/673,586442978,MDEyOklzc3VlQ29tbWVudDU4NjQ0Mjk3OA==,9599,2020-02-14T19:38:19Z,2020-02-14T19:38:19Z,OWNER,"Amazingly, I get 0 search results on Google for `RidList_VirtualReaderModule`! I guess no-one has reverse engineered the Apple Photos SQLite database at that level yet.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565518772,
https://github.com/simonw/datasette/issues/673#issuecomment-586442292,https://api.github.com/repos/simonw/datasette/issues/673,586442292,MDEyOklzc3VlQ29tbWVudDU4NjQ0MjI5Mg==,9599,2020-02-14T19:36:37Z,2020-02-14T19:36:37Z,OWNER,This can be a function in `utils/__init__.py`.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565518772,
https://github.com/simonw/datasette/pull/672#issuecomment-586441484,https://api.github.com/repos/simonw/datasette/issues/672,586441484,MDEyOklzc3VlQ29tbWVudDU4NjQ0MTQ4NA==,9599,2020-02-14T19:34:25Z,2020-02-14T19:34:25Z,OWNER,"I've figured out how to tell if a database is safe to open or not:
```sql
select sql from sqlite_master where sql like 'CREATE VIRTUAL TABLE%';
```
This returns the SQL definitions for virtual tables. The bit after `using` tells you what they need.
Run this against a SpatiaLite database and you get the following:
```sql
CREATE VIRTUAL TABLE SpatialIndex USING VirtualSpatialIndex()
CREATE VIRTUAL TABLE ElementaryGeometries USING VirtualElementary()
```
Run it against an Apple Photos `photos.db` file (found with `find ~/Library | grep photos.db`) and you get this (partial list):
```sql
CREATE VIRTUAL TABLE RidList_VirtualReader using RidList_VirtualReaderModule
CREATE VIRTUAL TABLE Array_VirtualReader using Array_VirtualReaderModule
CREATE VIRTUAL TABLE LiGlobals_VirtualBufferReader using VirtualBufferReaderModule
CREATE VIRTUAL TABLE RKPlace_RTree using rtree (modelId,minLongitude,maxLongitude,minLatitude,maxLatitude)
```
For a database with FTS4 you get:
```sql
CREATE VIRTUAL TABLE ""docs_fts"" USING FTS4 (
[title], [content], content=""docs""
)
```
FTS5:
```sql
CREATE VIRTUAL TABLE [FARA_All_Registrants_fts] USING FTS5 (
[Name], [Address_1], [Address_2],
content=[FARA_All_Registrants]
)
```
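A rough sketch of pulling the module name out of each of those definitions. The regular expression and the function names are assumptions for illustration; names are lowercased to make comparison against a list easier:

```python
import re
import sqlite3

# Matches: CREATE VIRTUAL TABLE <name> USING <module> ...
USING_RE = re.compile(
    r'^\s*create\s+virtual\s+table\s+.+?\s+using\s+(\w+)',
    re.IGNORECASE | re.DOTALL,
)


def module_from_sql(sql):
    '''Return the lowercased module name from a CREATE VIRTUAL TABLE statement, or None.'''
    match = USING_RE.match(sql or '')
    return match.group(1).lower() if match else None


def virtual_table_modules(conn):
    '''Return the set of module names used by virtual tables in this database.'''
    rows = conn.execute(
        'select sql from sqlite_master where sql like ?', ('CREATE VIRTUAL TABLE%',)
    ).fetchall()
    return {module_from_sql(sql) for (sql,) in rows} - {None}


print(module_from_sql('CREATE VIRTUAL TABLE SpatialIndex USING VirtualSpatialIndex()'))
print(module_from_sql('CREATE VIRTUAL TABLE RKPlace_RTree using rtree (modelId,minLongitude)'))
```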
So I can use this to figure out all of the `using` pieces and then compare them to a list of known supported ones.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565064079,
https://github.com/simonw/datasette/pull/672#issuecomment-586112662,https://api.github.com/repos/simonw/datasette/issues/672,586112662,MDEyOklzc3VlQ29tbWVudDU4NjExMjY2Mg==,9599,2020-02-14T06:05:27Z,2020-02-14T06:05:27Z,OWNER,I think the fix is to use an old-fashioned `threading` module daemon thread directly. That should exit cleanly when the program exits.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565064079,
https://github.com/simonw/datasette/pull/672#issuecomment-586111619,https://api.github.com/repos/simonw/datasette/issues/672,586111619,MDEyOklzc3VlQ29tbWVudDU4NjExMTYxOQ==,9599,2020-02-14T06:01:24Z,2020-02-14T06:01:24Z,OWNER,https://gist.github.com/clchiou/f2608cbe54403edb0b13 might work.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565064079,
https://github.com/simonw/datasette/pull/672#issuecomment-586111102,https://api.github.com/repos/simonw/datasette/issues/672,586111102,MDEyOklzc3VlQ29tbWVudDU4NjExMTEwMg==,9599,2020-02-14T05:59:24Z,2020-02-14T06:00:36Z,OWNER,"Interesting new problem: hitting Ctrl+C no longer terminates the program if the `scan_dirs()` thread is still running.
https://stackoverflow.com/questions/49992329/the-workers-in-threadpoolexecutor-is-not-really-daemon has clues. The workers are only meant to exit when their worker queues are empty.
But... I want to run the worker every 10 seconds. How do I do that without having it loop forever and hence never quit?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565064079,
https://github.com/simonw/datasette/pull/672#issuecomment-586109784,https://api.github.com/repos/simonw/datasette/issues/672,586109784,MDEyOklzc3VlQ29tbWVudDU4NjEwOTc4NA==,9599,2020-02-14T05:53:50Z,2020-02-14T05:54:21Z,OWNER,"... cheating like this seems to work:
```
for name, db in list(self.ds.databases.items()):
```
Python built-in operations are supposedly threadsafe, so in this case I can grab a copy of the list atomically (I think) and then safely iterate over it.
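A single-threaded illustration of the failure mode and of the `list()` workaround. This only shows that iterating over a snapshot tolerates additions to the dictionary; it does not prove anything about atomicity across threads. The function name and the dictionary contents are placeholders:

```python
from collections import OrderedDict


def mutate_while_iterating(snapshot):
    '''Add keys to an OrderedDict while looping over its items.'''
    databases = OrderedDict([('fixtures', 1), ('trees', 2)])
    # Either iterate over a copied list or over the live items view.
    items = list(databases.items()) if snapshot else databases.items()
    seen = []
    for name, db in items:
        seen.append(name)
        # Stand-in for the scanning thread attaching a newly found database.
        databases['found_' + name] = db
    return seen


try:
    mutate_while_iterating(snapshot=False)
except RuntimeError as e:
    print('live view failed:', e)

print(mutate_while_iterating(snapshot=True))
```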
Seems to work in my testing. Wish I could prove it with a unit test though.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565064079,
https://github.com/simonw/datasette/pull/672#issuecomment-586109238,https://api.github.com/repos/simonw/datasette/issues/672,586109238,MDEyOklzc3VlQ29tbWVudDU4NjEwOTIzOA==,9599,2020-02-14T05:51:12Z,2020-02-14T05:51:12Z,OWNER,"... or maybe I can cheat and wrap the access to `self.ds.databases.items()` in `list()`, so I'm iterating over an atomically-created list of those things instead? I'll try that first.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565064079,
https://github.com/simonw/datasette/pull/672#issuecomment-586109032,https://api.github.com/repos/simonw/datasette/issues/672,586109032,MDEyOklzc3VlQ29tbWVudDU4NjEwOTAzMg==,9599,2020-02-14T05:50:15Z,2020-02-14T05:50:15Z,OWNER,"So I need to ensure the `ds.databases` data structure is manipulated in a thread-safe manner.
Mainly I need to ensure that it is locked during iterations over it, then unlocked at the end.
Trickiest part is probably ensuring there is a test that proves this is working - I feel like I got lucky encountering that `RuntimeError` as early as I did.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565064079,
https://github.com/simonw/datasette/pull/672#issuecomment-586107989,https://api.github.com/repos/simonw/datasette/issues/672,586107989,MDEyOklzc3VlQ29tbWVudDU4NjEwNzk4OQ==,9599,2020-02-14T05:45:12Z,2020-02-14T05:45:12Z,OWNER,"I tried running the `scan_dirs()` method in a thread and got an interesting error while trying to load the homepage: `RuntimeError: OrderedDict mutated during iteration`
Makes sense - I had a thread that added an item to that dictionary right while the homepage was attempting to run this code:
https://github.com/simonw/datasette/blob/efa54b439fd0394440c302602b919255047b59c5/datasette/views/index.py#L24-L27
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565064079,
https://github.com/simonw/datasette/pull/672#issuecomment-586069529,https://api.github.com/repos/simonw/datasette/issues/672,586069529,MDEyOklzc3VlQ29tbWVudDU4NjA2OTUyOQ==,9599,2020-02-14T02:37:17Z,2020-02-14T02:37:17Z,OWNER,"Another problem: if any of the found databases use SpatiaLite then Datasette will fail to start at all.
It should skip them instead.
The `select * from sqlite_master` check apparently isn't quite enough to catch this case.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565064079,
https://github.com/simonw/datasette/pull/672#issuecomment-586068095,https://api.github.com/repos/simonw/datasette/issues/672,586068095,MDEyOklzc3VlQ29tbWVudDU4NjA2ODA5NQ==,9599,2020-02-14T02:30:37Z,2020-02-14T02:30:46Z,OWNER,"This can take a LONG time to run, and at the moment it's blocking and prevents Datasette from starting up.
It would be much better if this ran in a thread, or an asyncio task. Probably have to be a thread because there's no easy `async` version of `pathlib.Path.glob()` that I've seen.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565064079,
https://github.com/simonw/datasette/pull/672#issuecomment-586067794,https://api.github.com/repos/simonw/datasette/issues/672,586067794,MDEyOklzc3VlQ29tbWVudDU4NjA2Nzc5NA==,9599,2020-02-14T02:29:16Z,2020-02-14T02:29:16Z,OWNER,"One design issue: how to pick neat unique names for database files in a file hierarchy?
Here's what I have so far:
https://github.com/simonw/datasette/blob/fe6f9e6a7397cab2e4bc57745a8da9d824dad218/datasette/app.py#L231-L237
For these files:
```
../travel-old.db
../sf-tree-history/trees.db
../library-of-congress/records-from-df.db
```
It made these names:
```
travel-old
sf-tree-history_trees
library-of-congress_records-from-df
```
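For reference, a mapping like the one above can be reproduced with a few lines of `pathlib`. This is a guess at the behaviour, not the code linked from `app.py`, and `database_name` is a made-up name:

```python
from pathlib import PurePosixPath


def database_name(path, base):
    '''Join the path components below base with underscores, minus the file extension.'''
    relative = PurePosixPath(path).relative_to(base)
    return '_'.join(relative.with_suffix('').parts)


for path in (
    '../travel-old.db',
    '../sf-tree-history/trees.db',
    '../library-of-congress/records-from-df.db',
):
    print(database_name(path, '..'))
```

One thing a scheme like this does not handle is collisions: `a_b/c.db` and `a/b_c.db` would both become `a_b_c`.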
Maybe this is good enough? Needs some tests.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565064079,
https://github.com/simonw/datasette/issues/417#issuecomment-586066798,https://api.github.com/repos/simonw/datasette/issues/417,586066798,MDEyOklzc3VlQ29tbWVudDU4NjA2Njc5OA==,9599,2020-02-14T02:24:54Z,2020-02-14T02:24:54Z,OWNER,I'm going to move this over to a draft pull request.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",421546944,
https://github.com/simonw/datasette/issues/417#issuecomment-586065843,https://api.github.com/repos/simonw/datasette/issues/417,586065843,MDEyOklzc3VlQ29tbWVudDU4NjA2NTg0Mw==,9599,2020-02-14T02:20:53Z,2020-02-14T02:20:53Z,OWNER,"MVP for this feature: just do it once on startup, don't scan for new files every X seconds.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",421546944,
https://github.com/simonw/datasette/issues/417#issuecomment-586047525,https://api.github.com/repos/simonw/datasette/issues/417,586047525,MDEyOklzc3VlQ29tbWVudDU4NjA0NzUyNQ==,9599,2020-02-14T01:03:43Z,2020-02-14T01:59:02Z,OWNER,"OK, I have a plan. I'm going to try and implement this as a core Datasette feature (no plugins) with the following design:
- You can tell Datasette ""load any databases you find in this directory"" by passing the `--dir=path/to/dir` option to `datasette`. Datasette will scan that directory for valid SQLite files and attach them.
- Every 10 seconds Datasette will re-scan those directories to see if any new files have been added
- That 10s will be the default for a new `--config directory_scan_s:10` config option. You can set this to `0` to disable scanning entirely, at which point Datasette will only run the scan once on startup.
To check if a file is valid SQLite, Datasette will first check if the first few bytes of the file are `b""SQLite format 3\x00""`. If they are, it will open a connection to the file and attempt to run `select * from sqlite_master` against it. If that runs without any errors it will assume the file is usable and connect it.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",421546944,
https://github.com/simonw/datasette/issues/671#issuecomment-586054154,https://api.github.com/repos/simonw/datasette/issues/671,586054154,MDEyOklzc3VlQ29tbWVudDU4NjA1NDE1NA==,9599,2020-02-14T01:30:35Z,2020-02-14T01:30:35Z,OWNER,Documented here: https://datasette.readthedocs.io/en/latest/datasette.html,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",565041624,
https://github.com/simonw/datasette/issues/417#issuecomment-586047995,https://api.github.com/repos/simonw/datasette/issues/417,586047995,MDEyOklzc3VlQ29tbWVudDU4NjA0Nzk5NQ==,9599,2020-02-14T01:05:20Z,2020-02-14T01:05:20Z,OWNER,"I'm going to add two methods to the Datasette class to help support this work (and to enable exciting new plugin opportunities in the future):
- `datasette.add_database(name, db)` - adds a new named database to the list of connected databases. `db` will be a `Database()` object, which may prove useful in the future for things like #670 and could also allow some plugins to provide in-memory SQLite databases.
- `datasette.remove_database(name)`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",421546944,