html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,issue,performed_via_github_app https://github.com/simonw/datasette/issues/1235#issuecomment-781736855,https://api.github.com/repos/simonw/datasette/issues/1235,781736855,MDEyOklzc3VlQ29tbWVudDc4MTczNjg1NQ==,9599,2021-02-19T00:52:47Z,2021-02-19T01:47:53Z,OWNER,"I bumped the two lines in the `Dockerfile` to `FROM python:3.7.10-slim-stretch as build` and ran this to build it: docker build -f Dockerfile -t datasetteproject/datasette:python-3-7-10 . Then I ran it with: docker run -p 8001:8001 -v `pwd`:/mnt datasetteproject/datasette:python-3-7-10 datasette -p 8001 -h 0.0.0.0 /mnt/fixtures.db http://0.0.0.0:8001/-/versions confirmed that it was now running Python 3.7.10","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",811589344, https://github.com/simonw/datasette/issues/1235#issuecomment-781735887,https://api.github.com/repos/simonw/datasette/issues/1235,781735887,MDEyOklzc3VlQ29tbWVudDc4MTczNTg4Nw==,9599,2021-02-19T00:50:21Z,2021-02-19T00:50:55Z,OWNER,"I'll bump to `3.7.10` for the moment - the fix for 3.8 isn't out until March 1st according to https://news.ycombinator.com/item?id=26186434 https://www.python.org/downloads/release/python-3710/","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",811589344, https://github.com/simonw/datasette/issues/283#issuecomment-781670827,https://api.github.com/repos/simonw/datasette/issues/283,781670827,MDEyOklzc3VlQ29tbWVudDc4MTY3MDgyNw==,9599,2021-02-18T22:16:46Z,2021-02-18T22:16:46Z,OWNER,"Demo is now live here: https://latest.datasette.io/_memory The documentation is at https://docs.datasette.io/en/latest/sql_queries.html#cross-database-queries - it links to this example query: https://latest.datasette.io/_memory?sql=select%0D%0A++%27fixtures%27+as+database%2C+*%0D%0Afrom%0D%0A++%5Bfixtures%5D.sqlite_master%0D%0Aunion%0D%0Aselect%0D%0A++%27extra_database%27+as+database%2C+*%0D%0Afrom%0D%0A++%5Bextra_database%5D.sqlite_master","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",325958506, https://github.com/simonw/datasette/pull/1232#issuecomment-781599929,https://api.github.com/repos/simonw/datasette/issues/1232,781599929,MDEyOklzc3VlQ29tbWVudDc4MTU5OTkyOQ==,22429695,2021-02-18T19:59:54Z,2021-02-18T22:06:42Z,NONE,"# [Codecov](https://codecov.io/gh/simonw/datasette/pull/1232?src=pr&el=h1) Report > Merging [#1232](https://codecov.io/gh/simonw/datasette/pull/1232?src=pr&el=desc) (8876499) into [main](https://codecov.io/gh/simonw/datasette/commit/4df548e7668b5b21d64a267964951e67894f4712?el=desc) (4df548e) will **increase** coverage by `0.03%`. > The diff coverage is `100.00%`. [![Impacted file tree graph](https://codecov.io/gh/simonw/datasette/pull/1232/graphs/tree.svg?width=650&height=150&src=pr&token=eSahVY7kw1)](https://codecov.io/gh/simonw/datasette/pull/1232?src=pr&el=tree) ```diff @@ Coverage Diff @@ ## main #1232 +/- ## ========================================== + Coverage 91.42% 91.46% +0.03% ========================================== Files 32 32 Lines 3955 3970 +15 ========================================== + Hits 3616 3631 +15 Misses 339 339 ``` | [Impacted Files](https://codecov.io/gh/simonw/datasette/pull/1232?src=pr&el=tree) | Coverage Δ | | |---|---|---| | [datasette/app.py](https://codecov.io/gh/simonw/datasette/pull/1232/diff?src=pr&el=tree#diff-ZGF0YXNldHRlL2FwcC5weQ==) | `95.68% <100.00%> (+0.06%)` | :arrow_up: | | [datasette/cli.py](https://codecov.io/gh/simonw/datasette/pull/1232/diff?src=pr&el=tree#diff-ZGF0YXNldHRlL2NsaS5weQ==) | `76.62% <100.00%> (+0.36%)` | :arrow_up: | | [datasette/views/database.py](https://codecov.io/gh/simonw/datasette/pull/1232/diff?src=pr&el=tree#diff-ZGF0YXNldHRlL3ZpZXdzL2RhdGFiYXNlLnB5) | `97.19% <100.00%> (+0.01%)` | :arrow_up: | ------ [Continue to review full report at Codecov](https://codecov.io/gh/simonw/datasette/pull/1232?src=pr&el=continue). > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta) > `Δ = absolute (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov.io/gh/simonw/datasette/pull/1232?src=pr&el=footer). Last update [4df548e...8876499](https://codecov.io/gh/simonw/datasette/pull/1232?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments). ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",811407131, https://github.com/simonw/datasette/issues/283#issuecomment-781665560,https://api.github.com/repos/simonw/datasette/issues/283,781665560,MDEyOklzc3VlQ29tbWVudDc4MTY2NTU2MA==,9599,2021-02-18T22:06:14Z,2021-02-18T22:06:14Z,OWNER,"The implementation in #1232 is ready to land. It's the simplest-thing-that-could-possibly-work: you can run `datasette one.db two.db three.db --crossdb` and then use the `/_memory` page to run joins across tables from multiple databases. It only works on the first 10 databases that were passed to the command-line. This means that if you have a Datasette instance with hundreds of attached databases (see [Datasette Library](https://github.com/simonw/datasette/issues/417)) this won't be particularly useful for you. So... a better, future version of this feature would be one that lets you join across databases on command - maybe by hitting `/_memory?attach=db1&attach=db2` to get a special connection. Also worth noting: plugins that implement the [prepare_connection()](https://docs.datasette.io/en/stable/plugin_hooks.html#prepare-connection-conn-database-datasette) hook can attach additional databases - so if you need better, customized support for this one way to handle that would be with a custom plugin.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",325958506, https://github.com/simonw/datasette/pull/1232#issuecomment-781651283,https://api.github.com/repos/simonw/datasette/issues/1232,781651283,MDEyOklzc3VlQ29tbWVudDc4MTY1MTI4Mw==,9599,2021-02-18T21:37:55Z,2021-02-18T21:37:55Z,OWNER,"UI listing the attached tables: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",811407131, https://github.com/simonw/datasette/pull/1232#issuecomment-781641728,https://api.github.com/repos/simonw/datasette/issues/1232,781641728,MDEyOklzc3VlQ29tbWVudDc4MTY0MTcyOA==,9599,2021-02-18T21:19:34Z,2021-02-18T21:19:34Z,OWNER,"I tested the demo deployment like this: ``` datasette publish cloudrun fixtures.db extra_database.db \ -m fixtures.json \ --plugins-dir=plugins \ --branch=crossdb \ --extra-options=""--setting template_debug 1 --crossdb"" \ --install=pysqlite3-binary \ --service=datasette-latest-crossdb ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",811407131, https://github.com/simonw/datasette/pull/1232#issuecomment-781637292,https://api.github.com/repos/simonw/datasette/issues/1232,781637292,MDEyOklzc3VlQ29tbWVudDc4MTYzNzI5Mg==,9599,2021-02-18T21:11:31Z,2021-02-18T21:11:31Z,OWNER,Due to bug #1233 I'm going to publish the additional database as `extra_database.db` rather than `extra database.db` as it is used in the tests.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",811407131, https://github.com/simonw/datasette/issues/1233#issuecomment-781636590,https://api.github.com/repos/simonw/datasette/issues/1233,781636590,MDEyOklzc3VlQ29tbWVudDc4MTYzNjU5MA==,9599,2021-02-18T21:10:08Z,2021-02-18T21:10:08Z,OWNER,I think the bug is here: https://github.com/simonw/datasette/blob/640ac7071b73111ba4423812cd683756e0e1936b/datasette/utils/__init__.py#L349-L373,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",811458446, https://github.com/simonw/datasette/pull/1232#issuecomment-781634819,https://api.github.com/repos/simonw/datasette/issues/1232,781634819,MDEyOklzc3VlQ29tbWVudDc4MTYzNDgxOQ==,9599,2021-02-18T21:06:43Z,2021-02-18T21:06:43Z,OWNER,I'll document this option on https://docs.datasette.io/en/stable/sql_queries.html,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",811407131, https://github.com/simonw/datasette/pull/1232#issuecomment-781629841,https://api.github.com/repos/simonw/datasette/issues/1232,781629841,MDEyOklzc3VlQ29tbWVudDc4MTYyOTg0MQ==,9599,2021-02-18T20:57:23Z,2021-02-18T20:57:23Z,OWNER,"The new warning looks like this: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",811407131, https://github.com/simonw/datasette/pull/1232#issuecomment-781598585,https://api.github.com/repos/simonw/datasette/issues/1232,781598585,MDEyOklzc3VlQ29tbWVudDc4MTU5ODU4NQ==,9599,2021-02-18T19:57:30Z,2021-02-18T19:57:30Z,OWNER,It would also be neat if https://latest.datasette.io/ had multiple databases attached in order to demonstrate this feature.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",811407131, https://github.com/simonw/datasette/pull/1232#issuecomment-781594632,https://api.github.com/repos/simonw/datasette/issues/1232,781594632,MDEyOklzc3VlQ29tbWVudDc4MTU5NDYzMg==,9599,2021-02-18T19:50:21Z,2021-02-18T19:50:21Z,OWNER,"It would be neat if the `/_memory` page showed a list of attached databases, to indicate that the `--crossdb` option is working and give people links to click to start running queries.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",811407131, https://github.com/simonw/datasette/issues/283#issuecomment-781593169,https://api.github.com/repos/simonw/datasette/issues/283,781593169,MDEyOklzc3VlQ29tbWVudDc4MTU5MzE2OQ==,9599,2021-02-18T19:47:34Z,2021-02-18T19:47:34Z,OWNER,"I have a working version now, moving development to a pull request.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",325958506, https://github.com/simonw/datasette/issues/283#issuecomment-781591015,https://api.github.com/repos/simonw/datasette/issues/283,781591015,MDEyOklzc3VlQ29tbWVudDc4MTU5MTAxNQ==,9599,2021-02-18T19:44:02Z,2021-02-18T19:44:02Z,OWNER,For the moment I'm going to hard-code a `SQLITE_LIMIT_ATTACHED=10` constant and only attach the first 10 databases to the `_memory` connection.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",325958506, https://github.com/simonw/datasette/issues/283#issuecomment-781574786,https://api.github.com/repos/simonw/datasette/issues/283,781574786,MDEyOklzc3VlQ29tbWVudDc4MTU3NDc4Ng==,9599,2021-02-18T19:15:37Z,2021-02-18T19:15:37Z,OWNER,`select * from pragma_database_list();` is useful - shows all attached databases for the current connection.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",325958506, https://github.com/simonw/datasette/issues/283#issuecomment-781573676,https://api.github.com/repos/simonw/datasette/issues/283,781573676,MDEyOklzc3VlQ29tbWVudDc4MTU3MzY3Ng==,9599,2021-02-18T19:13:30Z,2021-02-18T19:13:30Z,OWNER,"It turns out SQLite defaults to a maximum of 10 attached databases. This can be increased using a compile-time constant, but even with that it cannot be more than 62: https://stackoverflow.com/questions/9845448/attach-limit-10","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",325958506, https://github.com/simonw/datasette/issues/1231#issuecomment-781560989,https://api.github.com/repos/simonw/datasette/issues/1231,781560989,MDEyOklzc3VlQ29tbWVudDc4MTU2MDk4OQ==,9599,2021-02-18T18:50:53Z,2021-02-18T18:50:53Z,OWNER,Ideally I'd figure out a way to replicate this error in a concurrent unit test.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",811367257, https://github.com/simonw/datasette/issues/1231#issuecomment-781560865,https://api.github.com/repos/simonw/datasette/issues/1231,781560865,MDEyOklzc3VlQ29tbWVudDc4MTU2MDg2NQ==,9599,2021-02-18T18:50:38Z,2021-02-18T18:50:38Z,OWNER,"I started trying to use locks to resolve this but I've not figured out the right way to do that yet - here's my first experiment: ```diff diff --git a/datasette/app.py b/datasette/app.py index 9e15a16..1681c9d 100644 --- a/datasette/app.py +++ b/datasette/app.py @@ -217,6 +217,7 @@ class Datasette: self.inspect_data = inspect_data self.immutables = set(immutables or []) self.databases = collections.OrderedDict() + self._refresh_schemas_lock = threading.Lock() if memory or not self.files: self.add_database(Database(self, is_memory=True), name=""_memory"") # memory_name is a random string so that each Datasette instance gets its own @@ -324,6 +325,13 @@ class Datasette: self.client = DatasetteClient(self) async def refresh_schemas(self): + return + if self._refresh_schemas_lock.locked(): + return + with self._refresh_schemas_lock: + await self._refresh_schemas() + + async def _refresh_schemas(self): internal_db = self.databases[""_internal""] if not self.internal_db_created: await init_internal_db(internal_db) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",811367257, https://github.com/simonw/datasette/issues/1226#issuecomment-781546512,https://api.github.com/repos/simonw/datasette/issues/1226,781546512,MDEyOklzc3VlQ29tbWVudDc4MTU0NjUxMg==,9599,2021-02-18T18:26:19Z,2021-02-18T18:26:19Z,OWNER,This broke CI: https://github.com/simonw/datasette/runs/1929355965?check_suite_focus=true,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",808843401, https://github.com/simonw/datasette/issues/1226#issuecomment-781530157,https://api.github.com/repos/simonw/datasette/issues/1226,781530157,MDEyOklzc3VlQ29tbWVudDc4MTUzMDE1Nw==,9599,2021-02-18T18:00:15Z,2021-02-18T18:00:15Z,OWNER,"I can use `click.IntRange(min=None, max=None)` for this. https://click.palletsprojects.com/en/7.x/options/#ranges - inclusive on both edges.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",808843401, https://github.com/dogsheep/google-takeout-to-sqlite/issues/4#issuecomment-781451701,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/4,781451701,MDEyOklzc3VlQ29tbWVudDc4MTQ1MTcwMQ==,203343,2021-02-18T16:06:21Z,2021-02-18T16:06:21Z,NONE,Awesome!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",778380836, https://github.com/simonw/datasette/issues/1230#issuecomment-781330466,https://api.github.com/repos/simonw/datasette/issues/1230,781330466,MDEyOklzc3VlQ29tbWVudDc4MTMzMDQ2Ng==,7107523,2021-02-18T13:06:22Z,2021-02-18T15:22:15Z,NONE,"[Edit] Oh, I just saw the ""Load all"" button under the cluster map as well as the [setting to alter the max number or results](https://docs.datasette.io/en/stable/settings.html#max-returned-rows). So I guess this issue only is about the Vega charts.
Note that datasette-cluster-map also seems to be limited to 998 displayed points: ![ss-2021-02-18_140548](https://user-images.githubusercontent.com/7107523/108361225-15fb2a80-71ea-11eb-9a19-d885e8513f55.png)
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",811054000, https://github.com/simonw/datasette/issues/283#issuecomment-781077127,https://api.github.com/repos/simonw/datasette/issues/283,781077127,MDEyOklzc3VlQ29tbWVudDc4MTA3NzEyNw==,9599,2021-02-18T05:56:30Z,2021-02-18T05:57:34Z,OWNER,I'm going to to try prototyping the `--crossdb` option that causes `/_memory` to connect to all databases as a starting point and see how well that works.,"{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 1, ""eyes"": 0}",325958506, https://github.com/simonw/datasette/issues/283#issuecomment-780991910,https://api.github.com/repos/simonw/datasette/issues/283,780991910,MDEyOklzc3VlQ29tbWVudDc4MDk5MTkxMA==,9308268,2021-02-18T02:13:56Z,2021-02-18T02:13:56Z,NONE,"I was going ask you about this issue when we talk during your office-hours schedule this Friday, but was there any support ever added for doing this cross-database joining? I have a use-case where could be pretty neat to do analysis using this tool on time-specific databases from snapshots https://ilsweb.cincinnatilibrary.org/collection-analysis/ ![image](https://user-images.githubusercontent.com/9308268/108294883-ba3a8e00-7164-11eb-9206-fcd5a8cdd883.png) and thanks again for such an amazing tool!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",325958506, https://github.com/dogsheep/google-takeout-to-sqlite/issues/4#issuecomment-780817596,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/4,780817596,MDEyOklzc3VlQ29tbWVudDc4MDgxNzU5Ng==,306240,2021-02-17T20:01:35Z,2021-02-17T20:01:35Z,NONE,I've got this almost working. Just needs some polish,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",778380836, https://github.com/simonw/sqlite-utils/issues/227#issuecomment-779785638,https://api.github.com/repos/simonw/sqlite-utils/issues/227,779785638,MDEyOklzc3VlQ29tbWVudDc3OTc4NTYzOA==,295329,2021-02-16T11:48:03Z,2021-02-16T11:48:03Z,NONE,Thank you @simonw ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",807174161, https://github.com/simonw/datasette/issues/1226#issuecomment-779467451,https://api.github.com/repos/simonw/datasette/issues/1226,779467451,MDEyOklzc3VlQ29tbWVudDc3OTQ2NzQ1MQ==,9599,2021-02-15T22:02:46Z,2021-02-15T22:02:46Z,OWNER,"I'm OK with the current error message shown if you try to use too low a port: ``` datasette fivethirtyeight.db -p 800 INFO: Started server process [45511] INFO: Waiting for application startup. INFO: Application startup complete. ERROR: [Errno 13] error while attempting to bind on address ('127.0.0.1', 800): permission denied INFO: Waiting for application shutdown. INFO: Application shutdown complete. ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",808843401, https://github.com/simonw/datasette/issues/1226#issuecomment-779467160,https://api.github.com/repos/simonw/datasette/issues/1226,779467160,MDEyOklzc3VlQ29tbWVudDc3OTQ2NzE2MA==,9599,2021-02-15T22:01:53Z,2021-02-15T22:01:53Z,OWNER,"This check needs to happen in two places: https://github.com/simonw/datasette/blob/9603d893b9b72653895318c9104d754229fdb146/datasette/cli.py#L222-L227 https://github.com/simonw/datasette/blob/9603d893b9b72653895318c9104d754229fdb146/datasette/cli.py#L328-L333","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",808843401, https://github.com/simonw/sqlite-utils/issues/147#issuecomment-779416619,https://api.github.com/repos/simonw/sqlite-utils/issues/147,779416619,MDEyOklzc3VlQ29tbWVudDc3OTQxNjYxOQ==,9599,2021-02-15T19:40:57Z,2021-02-15T21:27:55Z,OWNER,"Tried this experiment (not proper binary search, it only searches downwards): ```python import sqlite3 db = sqlite3.connect("":memory:"") def tryit(n): sql = ""select 1 where 1 in ({})"".format("", "".join(""?"" for i in range(n))) db.execute(sql, [0 for i in range(n)]) def find_limit(min=0, max=5_000_000): value = max while True: print('Trying', value) try: tryit(value) return value except: value = value // 2 ``` Running `find_limit()` with those default parameters takes about 1.47s on my laptop: ``` In [9]: %timeit find_limit() Trying 5000000 Trying 2500000... 1.47 s ± 28 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) ``` Interestingly the value it suggested was 156250 - suggesting that the macOS `sqlite3` binary with a 500,000 limit isn't the same as whatever my Python is using here.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",688670158, https://github.com/simonw/sqlite-utils/issues/147#issuecomment-779448912,https://api.github.com/repos/simonw/sqlite-utils/issues/147,779448912,MDEyOklzc3VlQ29tbWVudDc3OTQ0ODkxMg==,9599,2021-02-15T21:09:50Z,2021-02-15T21:09:50Z,OWNER,"I fiddled around and replaced that line with `batch_size = SQLITE_MAX_VARS // num_columns` - which evaluated to `10416` for this particular file. That got me this: 40.71s user 1.81s system 98% cpu 43.081 total 43s is definitely better than 56s, but it's still not as big as the ~26.5s to ~3.5s improvement described by @simonwiles at the top of this issue. I wonder what I'm missing here.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",688670158, https://github.com/simonw/sqlite-utils/issues/147#issuecomment-779446652,https://api.github.com/repos/simonw/sqlite-utils/issues/147,779446652,MDEyOklzc3VlQ29tbWVudDc3OTQ0NjY1Mg==,9599,2021-02-15T21:04:19Z,2021-02-15T21:04:19Z,OWNER,"... but it looks like `batch_size` is hard-coded to 100, rather than `None` - which means it's not being calculated using that value: https://github.com/simonw/sqlite-utils/blob/1f49f32814a942fa076cfe5f504d1621188097ed/sqlite_utils/db.py#L704 And https://github.com/simonw/sqlite-utils/blob/1f49f32814a942fa076cfe5f504d1621188097ed/sqlite_utils/db.py#L1877","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",688670158, https://github.com/simonw/sqlite-utils/issues/147#issuecomment-779445423,https://api.github.com/repos/simonw/sqlite-utils/issues/147,779445423,MDEyOklzc3VlQ29tbWVudDc3OTQ0NTQyMw==,9599,2021-02-15T21:00:44Z,2021-02-15T21:01:09Z,OWNER,"I tried changing the hard-coded value from 999 to 156_250 and running `sqlite-utils insert` against a 500MB CSV file, with these results: ``` (sqlite-utils) sqlite-utils % time sqlite-utils insert slow-ethos.db ethos ../ethos-datasette/ethos.csv --no-headers [###################################-] 99% 00:00:00sqlite-utils insert slow-ethos.db ethos ../ethos-datasette/ethos.csv 44.74s user 7.61s system 92% cpu 56.601 total # Increased the setting here (sqlite-utils) sqlite-utils % time sqlite-utils insert fast-ethos.db ethos ../ethos-datasette/ethos.csv --no-headers [###################################-] 99% 00:00:00sqlite-utils insert fast-ethos.db ethos ../ethos-datasette/ethos.csv 39.40s user 5.15s system 96% cpu 46.320 total ``` Not as big a difference as I was expecting.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",688670158, https://github.com/simonw/sqlite-utils/issues/147#issuecomment-779417723,https://api.github.com/repos/simonw/sqlite-utils/issues/147,779417723,MDEyOklzc3VlQ29tbWVudDc3OTQxNzcyMw==,9599,2021-02-15T19:44:02Z,2021-02-15T19:47:00Z,OWNER,"`%timeit find_limit(max=1_000_000)` took 378ms on my laptop `%timeit find_limit(max=500_000)` took 197ms `%timeit find_limit(max=200_000)` reported 53ms per loop `%timeit find_limit(max=100_000)` reported 26.8ms per loop. All of these are still slow enough that I'm not comfortable running this search for every time the library is imported. Allowing users to opt-in to this as a performance enhancement might be better.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",688670158, https://github.com/simonw/sqlite-utils/issues/147#issuecomment-779409770,https://api.github.com/repos/simonw/sqlite-utils/issues/147,779409770,MDEyOklzc3VlQ29tbWVudDc3OTQwOTc3MA==,9599,2021-02-15T19:23:11Z,2021-02-15T19:23:11Z,OWNER,"On my Mac right now I'm seeing a limit of 500,000: ``` % sqlite3 -cmd "".limits variable_number"" variable_number 500000 ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",688670158, https://github.com/simonw/sqlite-utils/issues/227#issuecomment-778854808,https://api.github.com/repos/simonw/sqlite-utils/issues/227,778854808,MDEyOklzc3VlQ29tbWVudDc3ODg1NDgwOA==,9599,2021-02-14T22:46:54Z,2021-02-14T22:46:54Z,OWNER,Fix is released in 3.5.,"{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",807174161, https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778851721,https://api.github.com/repos/simonw/sqlite-utils/issues/228,778851721,MDEyOklzc3VlQ29tbWVudDc3ODg1MTcyMQ==,9599,2021-02-14T22:23:46Z,2021-02-14T22:23:46Z,OWNER,I called this `--no-headers` for consistency with the existing output option: https://github.com/simonw/sqlite-utils/blob/427dace184c7da57f4a04df07b1e84cdae3261e8/sqlite_utils/cli.py#L61-L64,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",807437089, https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778849394,https://api.github.com/repos/simonw/sqlite-utils/issues/228,778849394,MDEyOklzc3VlQ29tbWVudDc3ODg0OTM5NA==,9599,2021-02-14T22:06:53Z,2021-02-14T22:06:53Z,OWNER,"For the moment I think just adding `--no-header` - which causes column names ""unknown1,unknown2,..."" to be used - should be enough. Users can import with that option, then use `sqlite-utils transform --rename` to rename them.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",807437089, https://github.com/simonw/sqlite-utils/issues/229#issuecomment-778844016,https://api.github.com/repos/simonw/sqlite-utils/issues/229,778844016,MDEyOklzc3VlQ29tbWVudDc3ODg0NDAxNg==,9599,2021-02-14T21:22:45Z,2021-02-14T21:22:45Z,OWNER,"I'm going to use this pattern from https://stackoverflow.com/a/15063941 ```python import sys import csv maxInt = sys.maxsize while True: # decrease the maxInt value by factor 10 # as long as the OverflowError occurs. try: csv.field_size_limit(maxInt) break except OverflowError: maxInt = int(maxInt/10) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",807817197, https://github.com/simonw/sqlite-utils/issues/229#issuecomment-778843503,https://api.github.com/repos/simonw/sqlite-utils/issues/229,778843503,MDEyOklzc3VlQ29tbWVudDc3ODg0MzUwMw==,9599,2021-02-14T21:18:51Z,2021-02-14T21:18:51Z,OWNER,"I want to set this to the maximum allowed limit, which seems to be surprisingly hard! That StackOverflow thread is full of ideas for that, many of them involving `ctypes`. I'm a bit loathe to add a dependency on `ctypes` though - even though it's in the Python standard library I worry that it might not be available on some architectures.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",807817197, https://github.com/simonw/sqlite-utils/issues/229#issuecomment-778843362,https://api.github.com/repos/simonw/sqlite-utils/issues/229,778843362,MDEyOklzc3VlQ29tbWVudDc3ODg0MzM2Mg==,9599,2021-02-14T21:17:53Z,2021-02-14T21:17:53Z,OWNER,Same issue as #227.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",807817197, https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778811746,https://api.github.com/repos/simonw/sqlite-utils/issues/228,778811746,MDEyOklzc3VlQ29tbWVudDc3ODgxMTc0Ng==,9599,2021-02-14T17:39:30Z,2021-02-14T21:16:54Z,OWNER,"I'm going to detach this from the #131 column types idea. The three things I need to handle here are: - The CSV file doesn't have a header row at all, so I need to specify what the column names should be - The CSV file DOES have a header row but I want to ignore it and use alternative column names - The CSV doesn't have a header row at all and I want to automatically use `unknown1,unknown2...` so I can start exploring it as quickly as possible. Here's a potential design that covers the first two: `--replace-header=""foo,bar,baz""` - ignore whatever is in the first row and pretend it was this instead `--add-header=""foo,bar,baz""` - add a first row with these details, to use as the header It doesn't cover the ""give me unknown column names"" case though.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",807437089, https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778843086,https://api.github.com/repos/simonw/sqlite-utils/issues/228,778843086,MDEyOklzc3VlQ29tbWVudDc3ODg0MzA4Ng==,9599,2021-02-14T21:15:43Z,2021-02-14T21:15:43Z,OWNER,"I'm not convinced the `.has_header()` rules are useful for the kind of CSV files I work with: https://github.com/python/cpython/blob/63298930fb531ba2bb4f23bc3b915dbf1e17e9e1/Lib/csv.py#L383 ```python def has_header(self, sample): # Creates a dictionary of types of data in each column. If any # column is of a single type (say, integers), *except* for the first # row, then the first row is presumed to be labels. If the type # can't be determined, it is assumed to be a string in which case # the length of the string is the determining factor: if all of the # rows except for the first are the same length, it's a header. # Finally, a 'vote' is taken at the end for each column, adding or # subtracting from the likelihood of the first row being a header. ``` ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",807437089, https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778842982,https://api.github.com/repos/simonw/sqlite-utils/issues/228,778842982,MDEyOklzc3VlQ29tbWVudDc3ODg0Mjk4Mg==,9599,2021-02-14T21:15:11Z,2021-02-14T21:15:11Z,OWNER,"Implementation tip: I have code that reads the first row and uses it as headers here: https://github.com/simonw/sqlite-utils/blob/8f042ae1fd323995d966a94e8e6df85cc843b938/sqlite_utils/cli.py#L689-L691 So If I want to use `unknown1,unknown2...` I can do that by reading the first row, counting the number of columns, generating headers based on that range and then continuing to build that generator (maybe with `itertools.chain()` to replay the record we already read). ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",807437089, https://github.com/simonw/sqlite-utils/issues/227#issuecomment-778841704,https://api.github.com/repos/simonw/sqlite-utils/issues/227,778841704,MDEyOklzc3VlQ29tbWVudDc3ODg0MTcwNA==,9599,2021-02-14T21:05:20Z,2021-02-14T21:05:20Z,OWNER,This has also been reported in #229.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",807174161, https://github.com/simonw/sqlite-utils/pull/225#issuecomment-778841547,https://api.github.com/repos/simonw/sqlite-utils/issues/225,778841547,MDEyOklzc3VlQ29tbWVudDc3ODg0MTU0Nw==,9599,2021-02-14T21:04:13Z,2021-02-14T21:04:13Z,OWNER,I added a test and fixed this in #234 - thanks for the fix.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",797159961, https://github.com/simonw/sqlite-utils/issues/234#issuecomment-778841278,https://api.github.com/repos/simonw/sqlite-utils/issues/234,778841278,MDEyOklzc3VlQ29tbWVudDc3ODg0MTI3OA==,9599,2021-02-14T21:02:11Z,2021-02-14T21:02:11Z,OWNER,"I managed to replicate this in a test: ```python def test_insert_all_with_extra_columns_in_later_chunks(fresh_db): chunk = [ {""record"": ""Record 1""}, {""record"": ""Record 2""}, {""record"": ""Record 3""}, {""record"": ""Record 4"", ""extra"": 1}, ] fresh_db[""t""].insert_all(chunk, batch_size=2, alter=True) assert list(fresh_db[""t""].rows) == [ {""record"": ""Record 1"", ""extra"": None}, {""record"": ""Record 2"", ""extra"": None}, {""record"": ""Record 3"", ""extra"": None}, {""record"": ""Record 4"", ""extra"": 1}, ] ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",808046597, https://github.com/simonw/sqlite-utils/pull/225#issuecomment-778834504,https://api.github.com/repos/simonw/sqlite-utils/issues/225,778834504,MDEyOklzc3VlQ29tbWVudDc3ODgzNDUwNA==,9599,2021-02-14T20:09:30Z,2021-02-14T20:09:30Z,OWNER,Thanks for this. I'm going to try and get the test suite to run in Windows on GitHub Actions.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",797159961, https://github.com/simonw/sqlite-utils/issues/231#issuecomment-778829456,https://api.github.com/repos/simonw/sqlite-utils/issues/231,778829456,MDEyOklzc3VlQ29tbWVudDc3ODgyOTQ1Ng==,9599,2021-02-14T19:37:52Z,2021-02-14T19:37:52Z,OWNER,"I'm going to add `limit` and `offset` to the following methods: - `rows_where()` - `search_sql()` - `search()`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",808028757, https://github.com/simonw/sqlite-utils/issues/231#issuecomment-778828758,https://api.github.com/repos/simonw/sqlite-utils/issues/231,778828758,MDEyOklzc3VlQ29tbWVudDc3ODgyODc1OA==,9599,2021-02-14T19:33:14Z,2021-02-14T19:33:14Z,OWNER,The `limit=` parameter is currently only available on the `.search()` method - it would make sense to add this to other methods as well.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",808028757, https://github.com/simonw/sqlite-utils/pull/224#issuecomment-778828495,https://api.github.com/repos/simonw/sqlite-utils/issues/224,778828495,MDEyOklzc3VlQ29tbWVudDc3ODgyODQ5NQ==,9599,2021-02-14T19:31:06Z,2021-02-14T19:31:06Z,OWNER,I'm going to add a `offset=` parameter to support this case. Thanks for the suggestion!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",792297010, https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778827570,https://api.github.com/repos/simonw/sqlite-utils/issues/230,778827570,MDEyOklzc3VlQ29tbWVudDc3ODgyNzU3MA==,9599,2021-02-14T19:24:20Z,2021-02-14T19:24:20Z,OWNER,Here's the implementation in Python: https://github.com/python/cpython/blob/63298930fb531ba2bb4f23bc3b915dbf1e17e9e1/Lib/csv.py#L204-L225,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",808008305, https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778824361,https://api.github.com/repos/simonw/sqlite-utils/issues/230,778824361,MDEyOklzc3VlQ29tbWVudDc3ODgyNDM2MQ==,9599,2021-02-14T18:59:22Z,2021-02-14T18:59:22Z,OWNER,"I think I've got it. I can use `io.BufferedReader()` to get an object I can run `.peek(2048)` on, then wrap THAT in `io.TextIOWrapper`: ```python encoding = encoding or ""utf-8"" buffered = io.BufferedReader(json_file, buffer_size=4096) decoded = io.TextIOWrapper(buffered, encoding=encoding, line_buffering=True) if pk and len(pk) == 1: pk = pk[0] if csv or tsv: if sniff: # Read first 2048 bytes and use that to detect first_bytes = buffered.peek(2048) print('first_bytes', first_bytes) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",808008305, https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778821403,https://api.github.com/repos/simonw/sqlite-utils/issues/230,778821403,MDEyOklzc3VlQ29tbWVudDc3ODgyMTQwMw==,9599,2021-02-14T18:38:16Z,2021-02-14T18:38:16Z,OWNER,"There are two code paths here that matter: - For a regular file, can read the first 2048 bytes, then `.seek(0)` before continuing. That's easy. - `stdin` is harder. I need to read and buffer the first 2048 bytes, then pass an object to `csv.reader()` which will replay that chunk and then play the rest of stdin. I'm a bit stuck on the second one. Ideally I could use something like `itertools.chain()` but I can't find an alternative for file-like objects.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",808008305, https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778818639,https://api.github.com/repos/simonw/sqlite-utils/issues/230,778818639,MDEyOklzc3VlQ29tbWVudDc3ODgxODYzOQ==,9599,2021-02-14T18:22:38Z,2021-02-14T18:22:38Z,OWNER,Maybe I shouldn't be using `StreamReader` at all - https://www.python.org/dev/peps/pep-0400/ suggests that it should be deprecated in favour of `io.TextIOWrapper`. I'm using `StreamReader` due to this line: https://github.com/simonw/sqlite-utils/blob/726219c3503e77440975cd15b74d006639feb0f8/sqlite_utils/cli.py#L667-L668,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",808008305, https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778817494,https://api.github.com/repos/simonw/sqlite-utils/issues/230,778817494,MDEyOklzc3VlQ29tbWVudDc3ODgxNzQ5NA==,9599,2021-02-14T18:16:06Z,2021-02-14T18:16:06Z,OWNER,"Types involved: ``` (Pdb) type(json_file.raw) (Pdb) type(json_file) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",808008305, https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778816333,https://api.github.com/repos/simonw/sqlite-utils/issues/230,778816333,MDEyOklzc3VlQ29tbWVudDc3ODgxNjMzMw==,9599,2021-02-14T18:08:44Z,2021-02-14T18:08:44Z,OWNER,"No, you can't `.seek(0)` on stdin: ``` File ""/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/cli.py"", line 678, in insert_upsert_implementation json_file.raw.seek(0) OSError: [Errno 29] Illegal seek ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",808008305, https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778815740,https://api.github.com/repos/simonw/sqlite-utils/issues/230,778815740,MDEyOklzc3VlQ29tbWVudDc3ODgxNTc0MA==,9599,2021-02-14T18:05:03Z,2021-02-14T18:05:03Z,OWNER,"The challenge here is how to read the first 2048 bytes and then reset the incoming file. The Python docs example looks like this: ```python with open('example.csv', newline='') as csvfile: dialect = csv.Sniffer().sniff(csvfile.read(1024)) csvfile.seek(0) reader = csv.reader(csvfile, dialect) ``` Here's the relevant code in `sqlite-utils`: https://github.com/simonw/sqlite-utils/blob/726219c3503e77440975cd15b74d006639feb0f8/sqlite_utils/cli.py#L671-L679 The challenge is going to be having the `--sniff` option work with the progress bar. Here's how `file_progress()` works: https://github.com/simonw/sqlite-utils/blob/726219c3503e77440975cd15b74d006639feb0f8/sqlite_utils/utils.py#L106-L113 If `file.raw` is `stdin` can I do the equivalent of `csvfile.seek(0)` on it?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",808008305, https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778812684,https://api.github.com/repos/simonw/sqlite-utils/issues/230,778812684,MDEyOklzc3VlQ29tbWVudDc3ODgxMjY4NA==,9599,2021-02-14T17:45:16Z,2021-02-14T17:45:16Z,OWNER,"Running this could take any CSV (or TSV) file and automatically detect the delimiter. If no header row is detected it could add `unknown1,unknown2` headers: sqlite-utils insert db.db data file.csv --sniff (Using `--sniff` would imply `--csv`) This could be called `--sniffer` instead but I like `--sniff` better.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",808008305, https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778812050,https://api.github.com/repos/simonw/sqlite-utils/issues/228,778812050,MDEyOklzc3VlQ29tbWVudDc3ODgxMjA1MA==,9599,2021-02-14T17:41:30Z,2021-02-14T17:41:30Z,OWNER,"I just spotted that `csv.Sniffer` in the Python standard library has a `.has_header(sample)` method which detects if the first row appears to be a header or not, which is interesting. https://docs.python.org/3/library/csv.html#csv.Sniffer","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",807437089, https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778811934,https://api.github.com/repos/simonw/sqlite-utils/issues/228,778811934,MDEyOklzc3VlQ29tbWVudDc3ODgxMTkzNA==,9599,2021-02-14T17:40:48Z,2021-02-14T17:40:48Z,OWNER,"Another pattern that might be useful is to generate a header that is just ""unknown1,unknown2,unknown3"" for each of the columns in the rest of the file. This makes it easy to e.g. facet-explore within Datasette to figure out the correct names, then use `sqlite-utils transform --rename` to rename the columns. I needed to do that for the https://bl.iro.bl.uk/work/ns/3037474a-761c-456d-a00c-9ef3c6773f4c example.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",807437089, https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778511347,https://api.github.com/repos/simonw/sqlite-utils/issues/228,778511347,MDEyOklzc3VlQ29tbWVudDc3ODUxMTM0Nw==,9599,2021-02-12T23:27:50Z,2021-02-12T23:27:50Z,OWNER,"For the moment, a workaround can be to `cat` an additional row onto the start of the file. echo ""name,url,description"" | cat - missing_headings.csv | sqlite-utils insert blah.db table - --csv","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",807437089, https://github.com/simonw/sqlite-utils/issues/131#issuecomment-778510528,https://api.github.com/repos/simonw/sqlite-utils/issues/131,778510528,MDEyOklzc3VlQ29tbWVudDc3ODUxMDUyOA==,9599,2021-02-12T23:25:06Z,2021-02-12T23:25:06Z,OWNER,"If `-c` isn't available, maybe `-t` or `--type` would work for specifying column types: ``` sqlite-utils insert db.db images images.tsv \ --tsv \ --type id int \ --type score float ``` or ``` sqlite-utils insert db.db images images.tsv \ --tsv \ -t id int \ -t score float ```","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",675753042, https://github.com/simonw/sqlite-utils/issues/131#issuecomment-778508887,https://api.github.com/repos/simonw/sqlite-utils/issues/131,778508887,MDEyOklzc3VlQ29tbWVudDc3ODUwODg4Nw==,9599,2021-02-12T23:20:11Z,2021-02-12T23:20:11Z,OWNER,"Annoyingly `-c` is currently a shortcut for `--csv` - so I'd have to do a major version bump to use that. https://github.com/simonw/sqlite-utils/blob/726219c3503e77440975cd15b74d006639feb0f8/sqlite_utils/cli.py#L601-L603 Particularly annoying because I attempted to remove the `-c` shortcut in https://github.com/simonw/sqlite-utils/commit/2c00567aac6d9c79087cfff0d054f64922b1473d#diff-76294b3d4afeb27e74e738daa01c26dd4dc9ccb6f4477451483a2ece1095902eL48 but forgot to remove it from the input options (I removed it from the output options).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",675753042, https://github.com/simonw/datasette/issues/1220#issuecomment-778467759,https://api.github.com/repos/simonw/datasette/issues/1220,778467759,MDEyOklzc3VlQ29tbWVudDc3ODQ2Nzc1OQ==,30607,2021-02-12T21:35:17Z,2021-02-12T21:35:17Z,NONE,Thank you,"{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",806743116, https://github.com/simonw/datasette/issues/1220#issuecomment-778439617,https://api.github.com/repos/simonw/datasette/issues/1220,778439617,MDEyOklzc3VlQ29tbWVudDc3ODQzOTYxNw==,7476523,2021-02-12T20:33:27Z,2021-02-12T20:33:27Z,CONTRIBUTOR,"That Docker command will mount your current directory inside the Docker container at `/mnt` - so you shouldn't need to change anything locally, just run ``` docker run -p 8001:8001 -v `pwd`:/mnt \ datasetteproject/datasette \ datasette -p 8001 -h 0.0.0.0 /mnt/fixtures.db ``` and it will use the `fixtures.db` file within your current directory","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",806743116, https://github.com/dogsheep/github-to-sqlite/issues/60#issuecomment-770069864,https://api.github.com/repos/dogsheep/github-to-sqlite/issues/60,770069864,MDEyOklzc3VlQ29tbWVudDc3MDA2OTg2NA==,22578954,2021-01-29T21:52:05Z,2021-02-12T18:29:43Z,CONTRIBUTOR,"For the purposes below I am assuming the organization I would get all the repositories and their related commits from is called `gh-organization`. The github's owner id of gh-orgnization is `123456789`. ```bash github-to-sqlite repos github.db gh-organization ``` I'm on a windows computer running git bash to be able to use the `|` command. This works for me ```bash sqlite3 github.db ""SELECT full_name FROM repos WHERE owner = '123456789';"" | tr '\n\r' ' ' | xargs | { read repos; github-to-sqlite commits github.db $repos; } ``` On a pure linux system I think this would work because the new line character is normally `\n` ```bash sqlite3 github.db ""SELECT full_name FROM repos WHERE owner = '123456789';"" | tr '\n' ' ' | xargs | { read repos; github-to-sqlite commits github.db $repos; }` ``` As expected I ran into rate limit issues #51 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",797097140, https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778349672,https://api.github.com/repos/simonw/sqlite-utils/issues/228,778349672,MDEyOklzc3VlQ29tbWVudDc3ODM0OTY3Mg==,9599,2021-02-12T18:00:43Z,2021-02-12T18:00:43Z,OWNER,"I could combine this with #131 to allow types to be specified in addition to column names. Probably need an option that means ""ignore the existing heading row and use this one instead"".","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",807437089, https://github.com/simonw/sqlite-utils/issues/227#issuecomment-778349142,https://api.github.com/repos/simonw/sqlite-utils/issues/227,778349142,MDEyOklzc3VlQ29tbWVudDc3ODM0OTE0Mg==,9599,2021-02-12T17:59:35Z,2021-02-12T17:59:35Z,OWNER,It looks like I can at least bump this size limit up to the maximum allowed by Python - I'll take a look at that. ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",807174161, https://github.com/dogsheep/dogsheep-photos/issues/33#issuecomment-778246347,https://api.github.com/repos/dogsheep/dogsheep-photos/issues/33,778246347,MDEyOklzc3VlQ29tbWVudDc3ODI0NjM0Nw==,41546558,2021-02-12T15:00:43Z,2021-02-12T15:00:43Z,CONTRIBUTOR,"Yes, Big Sur Photos database doesn't have `ZGENERICASSET` table. PR #31 will fix this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",803338729, https://github.com/dogsheep/dogsheep-photos/issues/33#issuecomment-778014990,https://api.github.com/repos/dogsheep/dogsheep-photos/issues/33,778014990,MDEyOklzc3VlQ29tbWVudDc3ODAxNDk5MA==,675335,2021-02-12T06:54:14Z,2021-02-12T06:54:14Z,NONE,"Ahh, that might be because macOS Big Sur has changed the structure of the photos db. Might need to wait for a later release, there is a PR which adds support for Big Sur. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",803338729, https://github.com/simonw/datasette/issues/1220#issuecomment-778008752,https://api.github.com/repos/simonw/datasette/issues/1220,778008752,MDEyOklzc3VlQ29tbWVudDc3ODAwODc1Mg==,30607,2021-02-12T06:37:34Z,2021-02-12T06:37:34Z,NONE,"I have used my path, I'm running it from the folder in wich I have the db. Do I must an absolute path? Do I must create exactly that folder?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",806743116, https://github.com/dogsheep/dogsheep-photos/issues/33#issuecomment-778002092,https://api.github.com/repos/dogsheep/dogsheep-photos/issues/33,778002092,MDEyOklzc3VlQ29tbWVudDc3ODAwMjA5Mg==,11855322,2021-02-12T06:19:32Z,2021-02-12T06:19:32Z,NONE,"hi @leafgarland that results in a new error: ``` (venv) (base) Robins-MacBook:datasette robin$ dogsheep-photos apple-photos photos.db Traceback (most recent call last): File ""/Users/robin/datasette/venv/bin/dogsheep-photos"", line 8, in sys.exit(cli()) File ""/Users/robin/datasette/venv/lib/python3.8/site-packages/click/core.py"", line 829, in __call__ return self.main(*args, **kwargs) File ""/Users/robin/datasette/venv/lib/python3.8/site-packages/click/core.py"", line 782, in main rv = self.invoke(ctx) File ""/Users/robin/datasette/venv/lib/python3.8/site-packages/click/core.py"", line 1259, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File ""/Users/robin/datasette/venv/lib/python3.8/site-packages/click/core.py"", line 1066, in invoke return ctx.invoke(self.callback, **ctx.params) File ""/Users/robin/datasette/venv/lib/python3.8/site-packages/click/core.py"", line 610, in invoke return callback(*args, **kwargs) File ""/Users/robin/datasette/venv/lib/python3.8/site-packages/dogsheep_photos/cli.py"", line 206, in apple_photos db.conn.execute( sqlite3.OperationalError: no such table: attached.ZGENERICASSET ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",803338729, https://github.com/dogsheep/dogsheep-photos/issues/33#issuecomment-777951854,https://api.github.com/repos/dogsheep/dogsheep-photos/issues/33,777951854,MDEyOklzc3VlQ29tbWVudDc3Nzk1MTg1NA==,675335,2021-02-12T03:54:39Z,2021-02-12T03:54:39Z,NONE,"I think that is a typo in the docs, you can use > dogsheep-photos apple-photos photos.db","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",803338729, https://github.com/simonw/datasette/pull/1223#issuecomment-777949755,https://api.github.com/repos/simonw/datasette/issues/1223,777949755,MDEyOklzc3VlQ29tbWVudDc3Nzk0OTc1NQ==,22429695,2021-02-12T03:45:31Z,2021-02-12T03:45:31Z,NONE,"# [Codecov](https://codecov.io/gh/simonw/datasette/pull/1223?src=pr&el=h1) Report > Merging [#1223](https://codecov.io/gh/simonw/datasette/pull/1223?src=pr&el=desc) (d1cd1f2) into [main](https://codecov.io/gh/simonw/datasette/commit/9603d893b9b72653895318c9104d754229fdb146?el=desc) (9603d89) will **not change** coverage. > The diff coverage is `n/a`. [![Impacted file tree graph](https://codecov.io/gh/simonw/datasette/pull/1223/graphs/tree.svg?width=650&height=150&src=pr&token=eSahVY7kw1)](https://codecov.io/gh/simonw/datasette/pull/1223?src=pr&el=tree) ```diff @@ Coverage Diff @@ ## main #1223 +/- ## ======================================= Coverage 91.42% 91.42% ======================================= Files 32 32 Lines 3955 3955 ======================================= Hits 3616 3616 Misses 339 339 ``` ------ [Continue to review full report at Codecov](https://codecov.io/gh/simonw/datasette/pull/1223?src=pr&el=continue). > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta) > `Δ = absolute (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov.io/gh/simonw/datasette/pull/1223?src=pr&el=footer). Last update [9603d89...d1cd1f2](https://codecov.io/gh/simonw/datasette/pull/1223?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments). ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",806918878, https://github.com/simonw/datasette/issues/1220#issuecomment-777927946,https://api.github.com/repos/simonw/datasette/issues/1220,777927946,MDEyOklzc3VlQ29tbWVudDc3NzkyNzk0Ng==,7476523,2021-02-12T02:29:54Z,2021-02-12T02:29:54Z,CONTRIBUTOR,"According to https://github.com/simonw/datasette/blob/master/docs/installation.rst#using-docker it should be ``` docker run -p 8001:8001 -v `pwd`:/mnt \ datasetteproject/datasette \ datasette -p 8001 -h 0.0.0.0 /mnt/fixtures.db ``` This uses `/mnt/fixtures.db` whereas you're using `fixtures.db` - did you try using this path instead?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",806743116, https://github.com/simonw/datasette/issues/1221#issuecomment-777901052,https://api.github.com/repos/simonw/datasette/issues/1221,777901052,MDEyOklzc3VlQ29tbWVudDc3NzkwMTA1Mg==,9599,2021-02-12T01:09:54Z,2021-02-12T01:09:54Z,OWNER,"I also tested this manually. I generated certificate files like so: cd /tmp python -m trustme This created `/tmp/server.pem`, `/tmp/client.pem` and `/tmp/server.key` Then I started Datasette like this: datasette --memory --ssl-keyfile=/tmp/server.key --ssl-certfile=/tmp/server.pem And exercise it using `curl` like so: /tmp % curl --cacert /tmp/client.pem 'https://localhost:8001/_memory.json' {""database"": ""_memory"", ""path"": ""/_memory"", ""size"": 0, ""tables"": [], ""hidden_count"": 0, ""views"": [], ""queries"": [], ""private"": false, ""allow_execute_sql"": true, ""query_ms"": 0.8843200000114848} Note that without the `--cacert` option I get an error: ``` /tmp % curl 'https://localhost:8001/_memory.json' curl: (60) SSL certificate problem: Invalid certificate chain More details here: https://curl.haxx.se/docs/sslcerts.html curl failed to verify the legitimacy of the server and therefore could not establish a secure connection to it. To learn more about this situation and how to fix it, please visit the web page mentioned above. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",806849424, https://github.com/simonw/datasette/issues/1221#issuecomment-777887190,https://api.github.com/repos/simonw/datasette/issues/1221,777887190,MDEyOklzc3VlQ29tbWVudDc3Nzg4NzE5MA==,9599,2021-02-12T00:29:18Z,2021-02-12T00:29:18Z,OWNER,I can use this recipe to start a `datasette` server in a sub-process during the pytest run and exercise it with real HTTP requests: https://til.simonwillison.net/pytest/subprocess-server,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",806849424, https://github.com/simonw/datasette/issues/1221#issuecomment-777883452,https://api.github.com/repos/simonw/datasette/issues/1221,777883452,MDEyOklzc3VlQ29tbWVudDc3Nzg4MzQ1Mg==,9599,2021-02-12T00:19:30Z,2021-02-12T00:19:40Z,OWNER,"Uvicorn supports these options: https://www.uvicorn.org/#command-line-options ``` --ssl-keyfile TEXT SSL key file --ssl-certfile TEXT SSL certificate file --ssl-keyfile-password TEXT SSL keyfile password --ssl-version INTEGER SSL version to use (see stdlib ssl module's) [default: 2] --ssl-cert-reqs INTEGER Whether client certificate is required (see stdlib ssl module's) [default: 0] --ssl-ca-certs TEXT CA certificates file --ssl-ciphers TEXT Ciphers to use (see stdlib ssl module's) [default: TLSv1] ``` For the moment I'm going to support just `--ssl-keyfile` and `--ssl-certfile` as arguments to `datasette serve`. I'll add other options if people ask for them.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",806849424, https://github.com/dogsheep/evernote-to-sqlite/pull/10#issuecomment-777839351,https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/10,777839351,MDEyOklzc3VlQ29tbWVudDc3NzgzOTM1MQ==,9599,2021-02-11T22:37:55Z,2021-02-11T22:37:55Z,MEMBER,"I've merged these changes by hand now, thanks!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",770712149, https://github.com/dogsheep/evernote-to-sqlite/issues/7#issuecomment-777827396,https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/7,777827396,MDEyOklzc3VlQ29tbWVudDc3NzgyNzM5Ng==,9599,2021-02-11T22:13:14Z,2021-02-11T22:13:14Z,MEMBER,My best guess is that you have an older version of `sqlite-utils` installed here - the `replace=True` argument was added in version 2.0. I've bumped the dependency in `setup.py`.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",743297582, https://github.com/dogsheep/evernote-to-sqlite/issues/9#issuecomment-777821383,https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/9,777821383,MDEyOklzc3VlQ29tbWVudDc3NzgyMTM4Mw==,9599,2021-02-11T22:01:28Z,2021-02-11T22:01:28Z,MEMBER,"Aha! I think I've figured out what's going on here. The CData blocks containing the notes look like this: `
This note includes two images.

...` The DTD at http://xml.evernote.com/pub/enml2.dtd includes some entities: ``` %HTMLlat1; %HTMLsymbol; %HTMLspecial; ``` So I need to be able to handle all of those different entities. I think I can do that using `html.entities.entitydefs` from the Python standard library, which looks a bit like this: ```python {'Aacute': 'Á', 'aacute': 'á', 'Aacute;': 'Á', 'aacute;': 'á', 'Abreve;': 'Ă', 'abreve;': 'ă', 'ac;': '∾', 'acd;': '∿', # ... } ``` ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",748372469, https://github.com/dogsheep/evernote-to-sqlite/issues/11#issuecomment-777798330,https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/11,777798330,MDEyOklzc3VlQ29tbWVudDc3Nzc5ODMzMA==,9599,2021-02-11T21:18:58Z,2021-02-11T21:18:58Z,MEMBER,Thanks for the fix!,"{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",792851444, https://github.com/dogsheep/evernote-to-sqlite/issues/11#issuecomment-777690332,https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/11,777690332,MDEyOklzc3VlQ29tbWVudDc3NzY5MDMzMg==,3613583,2021-02-11T18:16:01Z,2021-02-11T18:16:01Z,NONE,"I solved this issue by modifying line 31 of utils.py content = ET.tostring(ET.fromstring(content_xml.strip())).decode(""utf-8"")","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",792851444, https://github.com/simonw/datasette/issues/1200#issuecomment-777178728,https://api.github.com/repos/simonw/datasette/issues/1200,777178728,MDEyOklzc3VlQ29tbWVudDc3NzE3ODcyOA==,9599,2021-02-11T03:13:59Z,2021-02-11T03:13:59Z,OWNER,"I came up with the need for this while playing with this tool: https://calands.datasettes.com/calands?sql=select%0D%0A++AsGeoJSON(geometry)%2C+*%0D%0Afrom%0D%0A++CPAD_2020a_SuperUnits%0D%0Awhere%0D%0A++PARK_NAME+like+'%25mini%25'+and%0D%0A++Intersects(GeomFromGeoJSON(%3Afreedraw)%2C+geometry)+%3D+1%0D%0A++and+CPAD_2020a_SuperUnits.rowid+in+(%0D%0A++++select%0D%0A++++++rowid%0D%0A++++from%0D%0A++++++SpatialIndex%0D%0A++++where%0D%0A++++++f_table_name+%3D+'CPAD_2020a_SuperUnits'%0D%0A++++++and+search_frame+%3D+GeomFromGeoJSON(%3Afreedraw)%0D%0A++)&freedraw={""type""%3A""MultiPolygon""%2C""coordinates""%3A[[[[-122.42202758789064%2C37.82280243352759]%2C[-122.39868164062501%2C37.823887203271454]%2C[-122.38220214843751%2C37.81846319511331]%2C[-122.35061645507814%2C37.77071473849611]%2C[-122.34924316406251%2C37.74465712069939]%2C[-122.37258911132814%2C37.703380457832374]%2C[-122.39044189453125%2C37.690340943717715]%2C[-122.41241455078126%2C37.680559803205135]%2C[-122.44262695312501%2C37.67295135774715]%2C[-122.47283935546876%2C37.67295135774715]%2C[-122.52502441406251%2C37.68382032669382]%2C[-122.53463745117189%2C37.6892542140253]%2C[-122.54699707031251%2C37.690340943717715]%2C[-122.55798339843751%2C37.72945260537781]%2C[-122.54287719726564%2C37.77831314799672]%2C[-122.49893188476564%2C37.81303878836991]%2C[-122.46185302734376%2C37.82822612280363]%2C[-122.42889404296876%2C37.82822612280363]%2C[-122.42202758789064%2C37.82280243352759]]]]} - before I fixed https://github.com/simonw/datasette-leaflet-geojson/issues/16 it was loading a LOT of maps, which felt bad. I wanted to be able to link people to that page with a hard limit on the number of rows displayed on that page. It's mainly to guard against unexpected behaviour from limit-less queries though. It's not a very high priority feature!","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",792890765, https://github.com/simonw/datasette/issues/1200#issuecomment-777132761,https://api.github.com/repos/simonw/datasette/issues/1200,777132761,MDEyOklzc3VlQ29tbWVudDc3NzEzMjc2MQ==,7476523,2021-02-11T00:29:52Z,2021-02-11T00:29:52Z,CONTRIBUTOR,I'm probably missing something but what's the use case here - what would this offer over adding `limit 10` to the query?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",792890765, https://github.com/simonw/datasette/issues/1219#issuecomment-775442039,https://api.github.com/repos/simonw/datasette/issues/1219,775442039,MDEyOklzc3VlQ29tbWVudDc3NTQ0MjAzOQ==,9599,2021-02-08T20:39:52Z,2021-02-08T22:13:00Z,OWNER,"This comment helped me find a pattern for running Scalene against the Datasette test suite: https://github.com/emeryberger/scalene/issues/70#issuecomment-755245858 ``` pip install scalene ``` Then I created a file called `run_tests.py` with the following contents: ```python if __name__ == ""__main__"": import sys, pytest pytest.main(sys.argv) ``` Then I ran this: ``` scalene --profile-all run_tests.py -sv -x . ``` But... it quit with a segmentation fault! ``` (datasette) datasette % scalene --profile-all run_tests.py -sv -x . ======================================================================== test session starts ======================================================================== platform darwin -- Python 3.8.6, pytest-6.0.1, py-1.9.0, pluggy-0.13.1 -- python cachedir: .pytest_cache rootdir: /Users/simon/Dropbox/Development/datasette, configfile: pytest.ini plugins: asyncio-0.14.0, timeout-1.4.2 collecting ... Fatal Python error: Segmentation fault Current thread 0x0000000110c1edc0 (most recent call first): File ""/Users/simon/Dropbox/Development/datasette/datasette/utils/__init__.py"", line 553 in detect_json1 File ""/Users/simon/Dropbox/Development/datasette/datasette/filters.py"", line 168 in Filters File ""/Users/simon/Dropbox/Development/datasette/datasette/filters.py"", line 94 in File """", line 219 in _call_with_frames_removed File """", line 783 in exec_module File """", line 671 in _load_unlocked File """", line 975 in _find_and_load_unlocked File """", line 991 in _find_and_load File ""/Users/simon/Dropbox/Development/datasette/datasette/views/table.py"", line 27 in File """", line 219 in _call_with_frames_removed File """", line 783 in exec_module File """", line 671 in _load_unlocked File """", line 975 in _find_and_load_unlocked File """", line 991 in _find_and_load File ""/Users/simon/Dropbox/Development/datasette/datasette/app.py"", line 42 in File """", line 219 in _call_with_frames_removed File """", line 783 in exec_module File """", line 671 in _load_unlocked File """", line 975 in _find_and_load_unlocked File """", line 991 in _find_and_load File ""/Users/simon/Dropbox/Development/datasette/tests/test_api.py"", line 1 in File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/assertion/rewrite.py"", line 170 in exec_module File """", line 671 in _load_unlocked File """", line 975 in _find_and_load_unlocked File """", line 991 in _find_and_load File """", line 1014 in _gcd_import File ""/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/importlib/__init__.py"", line 127 in import_module File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/pathlib.py"", line 520 in import_path File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/python.py"", line 552 in _importtestmodule File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/python.py"", line 484 in _getobj File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/python.py"", line 288 in obj File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/python.py"", line 500 in _inject_setup_module_fixture File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/python.py"", line 487 in collect File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/runner.py"", line 324 in File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/runner.py"", line 294 in from_call File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/runner.py"", line 324 in pytest_make_collect_report File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/callers.py"", line 187 in _multicall File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/manager.py"", line 84 in File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/manager.py"", line 93 in _hookexec File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/hooks.py"", line 286 in __call__ File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/runner.py"", line 441 in collect_one_node File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/main.py"", line 768 in genitems File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/main.py"", line 771 in genitems File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/main.py"", line 568 in _perform_collect File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/main.py"", line 516 in perform_collect File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/main.py"", line 306 in pytest_collection File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/callers.py"", line 187 in _multicall File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/manager.py"", line 84 in File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/manager.py"", line 93 in _hookexec File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/hooks.py"", line 286 in __call__ File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/main.py"", line 295 in _main File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/main.py"", line 240 in wrap_session File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/main.py"", line 289 in pytest_cmdline_main File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/callers.py"", line 187 in _multicall File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/manager.py"", line 84 in File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/manager.py"", line 93 in _hookexec File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/hooks.py"", line 286 in __call__ File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/config/__init__.py"", line 157 in main File ""run_tests.py"", line 3 in File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/scalene/scalene_profiler.py"", line 1525 in main File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/scalene/__main__.py"", line 7 in main File ""/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/scalene/__main__.py"", line 14 in File ""/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/runpy.py"", line 87 in _run_code File ""/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/runpy.py"", line 194 in _run_module_as_main Scalene error: received signal SIGSEGV ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",803929694, https://github.com/simonw/datasette/issues/1219#issuecomment-775497449,https://api.github.com/repos/simonw/datasette/issues/1219,775497449,MDEyOklzc3VlQ29tbWVudDc3NTQ5NzQ0OQ==,9599,2021-02-08T22:11:34Z,2021-02-08T22:11:34Z,OWNER,"https://github.com/emeryberger/scalene/issues/110 reports a ""received signal SIGSEGV"" error that was fixed by upgrading to the latest Scalene version, but I'm running that already.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",803929694, https://github.com/dogsheep/pocket-to-sqlite/issues/9#issuecomment-774730656,https://api.github.com/repos/dogsheep/pocket-to-sqlite/issues/9,774730656,MDEyOklzc3VlQ29tbWVudDc3NDczMDY1Ng==,635179,2021-02-07T18:45:04Z,2021-02-07T18:45:04Z,NONE,"That URL uses TLS 1.3, but maybe only if the client supports it. It could be your Python version or your SSL library that’s not recent enough.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",801780625, https://github.com/dogsheep/pocket-to-sqlite/issues/9#issuecomment-774726123,https://api.github.com/repos/dogsheep/pocket-to-sqlite/issues/9,774726123,MDEyOklzc3VlQ29tbWVudDc3NDcyNjEyMw==,12669260,2021-02-07T18:21:08Z,2021-02-07T18:21:08Z,NONE,@simonw any ideas here?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",801780625, https://github.com/simonw/datasette/issues/1217#issuecomment-774528913,https://api.github.com/repos/simonw/datasette/issues/1217,774528913,MDEyOklzc3VlQ29tbWVudDc3NDUyODkxMw==,639730,2021-02-06T19:23:41Z,2021-02-06T19:23:41Z,NONE,I've had a lot of success running it as an OpenFaaS lambda.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",802513359, https://github.com/simonw/datasette/issues/1217#issuecomment-774385092,https://api.github.com/repos/simonw/datasette/issues/1217,774385092,MDEyOklzc3VlQ29tbWVudDc3NDM4NTA5Mg==,6165713,2021-02-06T02:49:11Z,2021-02-06T02:49:11Z,NONE,"A good reference seems to be the note to run `datasette` as a module in https://github.com/simonw/datasette/pull/556 ","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",802513359, https://github.com/simonw/sqlite-utils/issues/223#issuecomment-774373829,https://api.github.com/repos/simonw/sqlite-utils/issues/223,774373829,MDEyOklzc3VlQ29tbWVudDc3NDM3MzgyOQ==,9599,2021-02-06T01:39:47Z,2021-02-06T01:39:47Z,OWNER,Documentation: https://sqlite-utils.datasette.io/en/stable/cli.html#cli-insert-csv-tsv-delimiter,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",788527932, https://github.com/simonw/datasette/issues/1208#issuecomment-774286962,https://api.github.com/repos/simonw/datasette/issues/1208,774286962,MDEyOklzc3VlQ29tbWVudDc3NDI4Njk2Mg==,4488943,2021-02-05T21:02:39Z,2021-02-05T21:02:39Z,CONTRIBUTOR,@simonw could you please take a look at the PR 1211 that fixes this issue?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",794554881, https://github.com/simonw/sqlite-utils/pull/203#issuecomment-774217792,https://api.github.com/repos/simonw/sqlite-utils/issues/203,774217792,MDEyOklzc3VlQ29tbWVudDc3NDIxNzc5Mg==,1049910,2021-02-05T18:44:13Z,2021-02-05T18:44:13Z,NONE,"Thanks for looking at this - home schooling kids has prevented me from replying. I'd struggled with how to adapt the API for the foreign keys too - I definitely tried the String/Tuple approach. I hadn't considered the breaking changes that would introduce though. I can take a look at this and try and make the change - see which of your options works best. I've got a workaround for the use-case I was looking at this for, so it wouldn't be a problem for me if it was put on the back burner until a hypothetical v4.0 anyway. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",743384829, https://github.com/simonw/datasette/issues/1210#issuecomment-773977128,https://api.github.com/repos/simonw/datasette/issues/1210,773977128,MDEyOklzc3VlQ29tbWVudDc3Mzk3NzEyOA==,525780,2021-02-05T11:30:34Z,2021-02-05T11:30:34Z,NONE,"Thanks for your quick reply! Having changed my `metadata.yml`, queries AND database I can't really reproduce it anymore, sorry. But at least I'm happy to say that it works now! :) Thanks again for the super nifty tool, very appreciated.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",796234313, https://github.com/simonw/datasette/issues/1216#issuecomment-772796111,https://api.github.com/repos/simonw/datasette/issues/1216,772796111,MDEyOklzc3VlQ29tbWVudDc3Mjc5NjExMQ==,9599,2021-02-03T20:20:48Z,2021-02-03T20:20:48Z,OWNER,Relevant code: https://github.com/simonw/datasette/blob/1600d2a3ec3ada1f6fb5b1eb73bdaeccb5f80530/datasette/app.py#L620-L632,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",800669347, https://github.com/dogsheep/twitter-to-sqlite/issues/56#issuecomment-772408273,https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/56,772408273,MDEyOklzc3VlQ29tbWVudDc3MjQwODI3Mw==,42315895,2021-02-03T10:36:36Z,2021-02-03T10:36:36Z,NONE,"I figured it out. Those tweets are in database, because somebody quote tweeted them, or retweeted them. And if you grab quoted tweet or reweeted tweet from other tweet json, It doesn't grab all of the details. So if someone quote tweeted a quote tweet, the second quote tweet won't have `quoted_status`. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",796736607, https://github.com/simonw/datasette/issues/1212#issuecomment-772007663,https://api.github.com/repos/simonw/datasette/issues/1212,772007663,MDEyOklzc3VlQ29tbWVudDc3MjAwNzY2Mw==,4488943,2021-02-02T21:36:56Z,2021-02-02T21:36:56Z,CONTRIBUTOR,"How do you get 4-5 minutes? I run my tests in WSL 2, so may be i need to try a real linux VM.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",797651831, https://github.com/simonw/datasette/issues/1214#issuecomment-772001787,https://api.github.com/repos/simonw/datasette/issues/1214,772001787,MDEyOklzc3VlQ29tbWVudDc3MjAwMTc4Nw==,9599,2021-02-02T21:28:53Z,2021-02-02T21:28:53Z,OWNER,"Fix is now live on https://latest.datasette.io/fixtures/searchable?_search=terry - clearing ""terry"" and re-submitting the form now works as expected.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",799693777, https://github.com/simonw/datasette/issues/1214#issuecomment-771992628,https://api.github.com/repos/simonw/datasette/issues/1214,771992628,MDEyOklzc3VlQ29tbWVudDc3MTk5MjYyOA==,9599,2021-02-02T21:15:18Z,2021-02-02T21:15:18Z,OWNER,"The cause of this bug is form fields which begin with `_` but ARE displayed as form inputs on the page - hence should not be duplicated in an `` element.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",799693777,