{"html_url": "https://github.com/simonw/datasette/issues/283#issuecomment-781077127", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/283", "id": 781077127, "node_id": "MDEyOklzc3VlQ29tbWVudDc4MTA3NzEyNw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-18T05:56:30Z", "updated_at": "2021-02-18T05:57:34Z", "author_association": "OWNER", "body": "I'm going to to try prototyping the `--crossdb` option that causes `/_memory` to connect to all databases as a starting point and see how well that works.", "reactions": "{\"total_count\": 1, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 1, \"eyes\": 0}", "issue": {"value": 325958506, "label": "Support cross-database joins"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1226#issuecomment-779467451", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1226", "id": 779467451, "node_id": "MDEyOklzc3VlQ29tbWVudDc3OTQ2NzQ1MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-15T22:02:46Z", "updated_at": "2021-02-15T22:02:46Z", "author_association": "OWNER", "body": "I'm OK with the current error message shown if you try to use too low a port:\r\n```\r\ndatasette fivethirtyeight.db -p 800 \r\nINFO: Started server process [45511]\r\nINFO: Waiting for application startup.\r\nINFO: Application startup complete.\r\nERROR: [Errno 13] error while attempting to bind on address ('127.0.0.1', 800): permission denied\r\nINFO: Waiting for application shutdown.\r\nINFO: Application shutdown complete.\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 808843401, "label": "--port option should validate port is between 0 and 65535"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1226#issuecomment-779467160", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1226", "id": 779467160, "node_id": "MDEyOklzc3VlQ29tbWVudDc3OTQ2NzE2MA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-15T22:01:53Z", "updated_at": "2021-02-15T22:01:53Z", "author_association": "OWNER", "body": "This check needs to happen in two places:\r\n\r\nhttps://github.com/simonw/datasette/blob/9603d893b9b72653895318c9104d754229fdb146/datasette/cli.py#L222-L227\r\n\r\nhttps://github.com/simonw/datasette/blob/9603d893b9b72653895318c9104d754229fdb146/datasette/cli.py#L328-L333", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 808843401, "label": "--port option should validate port is between 0 and 65535"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/147#issuecomment-779416619", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/147", "id": 779416619, "node_id": "MDEyOklzc3VlQ29tbWVudDc3OTQxNjYxOQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-15T19:40:57Z", "updated_at": "2021-02-15T21:27:55Z", "author_association": "OWNER", "body": "Tried this experiment (not proper binary search, it only searches downwards):\r\n```python\r\nimport sqlite3\r\n\r\ndb = sqlite3.connect(\":memory:\")\r\n\r\ndef tryit(n):\r\n sql = \"select 1 where 1 in ({})\".format(\", \".join(\"?\" for i in range(n)))\r\n db.execute(sql, [0 for i in 
range(n)])\r\n\r\n\r\ndef find_limit(min=0, max=5_000_000):\r\n value = max\r\n while True:\r\n print('Trying', value)\r\n try:\r\n tryit(value)\r\n return value\r\n except:\r\n value = value // 2\r\n```\r\nRunning `find_limit()` with those default parameters takes about 1.47s on my laptop:\r\n```\r\nIn [9]: %timeit find_limit()\r\nTrying 5000000\r\nTrying 2500000...\r\n1.47 s \u00b1 28 ms per loop (mean \u00b1 std. dev. of 7 runs, 1 loop each)\r\n```\r\nInterestingly the value it suggested was 156250 - suggesting that the macOS `sqlite3` binary with a 500,000 limit isn't the same as whatever my Python is using here.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 688670158, "label": "SQLITE_MAX_VARS maybe hard-coded too low"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/147#issuecomment-779448912", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/147", "id": 779448912, "node_id": "MDEyOklzc3VlQ29tbWVudDc3OTQ0ODkxMg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-15T21:09:50Z", "updated_at": "2021-02-15T21:09:50Z", "author_association": "OWNER", "body": "I fiddled around and replaced that line with `batch_size = SQLITE_MAX_VARS // num_columns` - which evaluated to `10416` for this particular file. That got me this:\r\n\r\n 40.71s user 1.81s system 98% cpu 43.081 total\r\n\r\n43s is definitely better than 56s, but it's still not as big as the ~26.5s to ~3.5s improvement described by @simonwiles at the top of this issue. I wonder what I'm missing here.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 688670158, "label": "SQLITE_MAX_VARS maybe hard-coded too low"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/147#issuecomment-779446652", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/147", "id": 779446652, "node_id": "MDEyOklzc3VlQ29tbWVudDc3OTQ0NjY1Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-15T21:04:19Z", "updated_at": "2021-02-15T21:04:19Z", "author_association": "OWNER", "body": "... 
but it looks like `batch_size` is hard-coded to 100, rather than `None` - which means it's not being calculated using that value:\r\n\r\nhttps://github.com/simonw/sqlite-utils/blob/1f49f32814a942fa076cfe5f504d1621188097ed/sqlite_utils/db.py#L704\r\n\r\nAnd\r\n\r\nhttps://github.com/simonw/sqlite-utils/blob/1f49f32814a942fa076cfe5f504d1621188097ed/sqlite_utils/db.py#L1877", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 688670158, "label": "SQLITE_MAX_VARS maybe hard-coded too low"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/147#issuecomment-779445423", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/147", "id": 779445423, "node_id": "MDEyOklzc3VlQ29tbWVudDc3OTQ0NTQyMw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-15T21:00:44Z", "updated_at": "2021-02-15T21:01:09Z", "author_association": "OWNER", "body": "I tried changing the hard-coded value from 999 to 156_250 and running `sqlite-utils insert` against a 500MB CSV file, with these results:\r\n```\r\n(sqlite-utils) sqlite-utils % time sqlite-utils insert slow-ethos.db ethos ../ethos-datasette/ethos.csv --no-headers\r\n [###################################-] 99% 00:00:00sqlite-utils insert slow-ethos.db ethos ../ethos-datasette/ethos.csv\r\n44.74s user 7.61s system 92% cpu 56.601 total\r\n# Increased the setting here\r\n(sqlite-utils) sqlite-utils % time sqlite-utils insert fast-ethos.db ethos ../ethos-datasette/ethos.csv --no-headers\r\n [###################################-] 99% 00:00:00sqlite-utils insert fast-ethos.db ethos ../ethos-datasette/ethos.csv\r\n39.40s user 5.15s system 96% cpu 46.320 total\r\n```\r\nNot as big a difference as I was expecting.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 688670158, "label": "SQLITE_MAX_VARS maybe hard-coded too low"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/147#issuecomment-779417723", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/147", "id": 779417723, "node_id": "MDEyOklzc3VlQ29tbWVudDc3OTQxNzcyMw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-15T19:44:02Z", "updated_at": "2021-02-15T19:47:00Z", "author_association": "OWNER", "body": "`%timeit find_limit(max=1_000_000)` took 378ms on my laptop\r\n\r\n`%timeit find_limit(max=500_000)` took 197ms\r\n\r\n`%timeit find_limit(max=200_000)` reported 53ms per loop\r\n\r\n`%timeit find_limit(max=100_000)` reported 26.8ms per loop.\r\n\r\nAll of these are still slow enough that I'm not comfortable running this search for every time the library is imported. 
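For comparison, a true bisection over the same probe (a sketch only - `find_limit_bisect` is a hypothetical name, not sqlite-utils code) would converge in about 23 probes over the 0-5,000,000 range by narrowing between the largest known-good and smallest known-bad variable counts:

```python
import sqlite3

db = sqlite3.connect(":memory:")

def tryit(n):
    # Same probe as the experiment above: n placeholder variables in one query
    sql = "select 1 where 1 in ({})".format(", ".join("?" for _ in range(n)))
    db.execute(sql, [0] * n)

def find_limit_bisect(lo=1, hi=5_000_000):
    # Invariant: lo placeholders are known to work; everything above hi has failed
    while lo < hi:
        mid = (lo + hi + 1) // 2
        try:
            tryit(mid)
            lo = mid  # mid worked, so the limit is at least mid
        except sqlite3.OperationalError:  # "too many SQL variables"
            hi = mid - 1  # mid failed, so the limit is below mid
    return lo
```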
Allowing users to opt-in to this as a performance enhancement might be better.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 688670158, "label": "SQLITE_MAX_VARS maybe hard-coded too low"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/147#issuecomment-779409770", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/147", "id": 779409770, "node_id": "MDEyOklzc3VlQ29tbWVudDc3OTQwOTc3MA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-15T19:23:11Z", "updated_at": "2021-02-15T19:23:11Z", "author_association": "OWNER", "body": "On my Mac right now I'm seeing a limit of 500,000:\r\n```\r\n% sqlite3 -cmd \".limits variable_number\"\r\n variable_number 500000\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 688670158, "label": "SQLITE_MAX_VARS maybe hard-coded too low"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/227#issuecomment-778854808", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/227", "id": 778854808, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODg1NDgwOA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T22:46:54Z", "updated_at": "2021-02-14T22:46:54Z", "author_association": "OWNER", "body": "Fix is released in 3.5.", "reactions": "{\"total_count\": 1, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 1, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 807174161, "label": "Error reading csv files with large column data"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778851721", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/228", "id": 778851721, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODg1MTcyMQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T22:23:46Z", "updated_at": "2021-02-14T22:23:46Z", "author_association": "OWNER", "body": "I called this `--no-headers` for consistency with the existing output option: https://github.com/simonw/sqlite-utils/blob/427dace184c7da57f4a04df07b1e84cdae3261e8/sqlite_utils/cli.py#L61-L64", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 807437089, "label": "--no-headers option for CSV and TSV"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778849394", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/228", "id": 778849394, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODg0OTM5NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T22:06:53Z", "updated_at": "2021-02-14T22:06:53Z", "author_association": "OWNER", "body": "For the moment I think just adding `--no-header` - which causes column names \"unknown1,unknown2,...\" to be used - should be enough.\r\n\r\nUsers can import with that option, then use `sqlite-utils transform --rename` to rename them.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 807437089, "label": "--no-headers option for CSV and TSV"}, 
"performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/229#issuecomment-778844016", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/229", "id": 778844016, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODg0NDAxNg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T21:22:45Z", "updated_at": "2021-02-14T21:22:45Z", "author_association": "OWNER", "body": "I'm going to use this pattern from https://stackoverflow.com/a/15063941\r\n```python\r\nimport sys\r\nimport csv\r\nmaxInt = sys.maxsize\r\n\r\nwhile True:\r\n # decrease the maxInt value by factor 10 \r\n # as long as the OverflowError occurs.\r\n\r\n try:\r\n csv.field_size_limit(maxInt)\r\n break\r\n except OverflowError:\r\n maxInt = int(maxInt/10)\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 807817197, "label": "Hitting `_csv.Error: field larger than field limit (131072)`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/229#issuecomment-778843503", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/229", "id": 778843503, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODg0MzUwMw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T21:18:51Z", "updated_at": "2021-02-14T21:18:51Z", "author_association": "OWNER", "body": "I want to set this to the maximum allowed limit, which seems to be surprisingly hard! That StackOverflow thread is full of ideas for that, many of them involving `ctypes`. I'm a bit loathe to add a dependency on `ctypes` though - even though it's in the Python standard library I worry that it might not be available on some architectures.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 807817197, "label": "Hitting `_csv.Error: field larger than field limit (131072)`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/229#issuecomment-778843362", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/229", "id": 778843362, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODg0MzM2Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T21:17:53Z", "updated_at": "2021-02-14T21:17:53Z", "author_association": "OWNER", "body": "Same issue as #227.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 807817197, "label": "Hitting `_csv.Error: field larger than field limit (131072)`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778811746", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/228", "id": 778811746, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODgxMTc0Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T17:39:30Z", "updated_at": "2021-02-14T21:16:54Z", "author_association": "OWNER", "body": "I'm going to detach this from the #131 column types idea.\r\n\r\nThe three things I need to handle here are:\r\n\r\n- The CSV file doesn't have a header row at all, so I need to specify what the column names should be\r\n- The CSV file DOES have a header row but I want to ignore it and use alternative column names\r\n- The CSV doesn't have a header row at all and 
I want to automatically use `unknown1,unknown2...` so I can start exploring it as quickly as possible.\r\n\r\nHere's a potential design that covers the first two:\r\n\r\n`--replace-header=\"foo,bar,baz\"` - ignore whatever is in the first row and pretend it was this instead\r\n`--add-header=\"foo,bar,baz\"` - add a first row with these details, to use as the header\r\n\r\nIt doesn't cover the \"give me unknown column names\" case though.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 807437089, "label": "--no-headers option for CSV and TSV"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778843086", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/228", "id": 778843086, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODg0MzA4Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T21:15:43Z", "updated_at": "2021-02-14T21:15:43Z", "author_association": "OWNER", "body": "I'm not convinced the `.has_header()` rules are useful for the kind of CSV files I work with: https://github.com/python/cpython/blob/63298930fb531ba2bb4f23bc3b915dbf1e17e9e1/Lib/csv.py#L383\r\n\r\n```python\r\n def has_header(self, sample):\r\n # Creates a dictionary of types of data in each column. If any\r\n # column is of a single type (say, integers), *except* for the first\r\n # row, then the first row is presumed to be labels. If the type\r\n # can't be determined, it is assumed to be a string in which case\r\n # the length of the string is the determining factor: if all of the\r\n # rows except for the first are the same length, it's a header.\r\n # Finally, a 'vote' is taken at the end for each column, adding or\r\n # subtracting from the likelihood of the first row being a header.\r\n```\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 807437089, "label": "--no-headers option for CSV and TSV"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778842982", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/228", "id": 778842982, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODg0Mjk4Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T21:15:11Z", "updated_at": "2021-02-14T21:15:11Z", "author_association": "OWNER", "body": "Implementation tip: I have code that reads the first row and uses it as headers here: https://github.com/simonw/sqlite-utils/blob/8f042ae1fd323995d966a94e8e6df85cc843b938/sqlite_utils/cli.py#L689-L691\r\n\r\nSo If I want to use `unknown1,unknown2...` I can do that by reading the first row, counting the number of columns, generating headers based on that range and then continuing to build that generator (maybe with `itertools.chain()` to replay the record we already read).\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 807437089, "label": "--no-headers option for CSV and TSV"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/227#issuecomment-778841704", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/227", "id": 778841704, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODg0MTcwNA==", "user": 
{"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T21:05:20Z", "updated_at": "2021-02-14T21:05:20Z", "author_association": "OWNER", "body": "This has also been reported in #229.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 807174161, "label": "Error reading csv files with large column data"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/pull/225#issuecomment-778841547", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/225", "id": 778841547, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODg0MTU0Nw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T21:04:13Z", "updated_at": "2021-02-14T21:04:13Z", "author_association": "OWNER", "body": "I added a test and fixed this in #234 - thanks for the fix.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 797159961, "label": "fix for problem in Table.insert_all on search for columns per chunk of rows"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/234#issuecomment-778841278", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/234", "id": 778841278, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODg0MTI3OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T21:02:11Z", "updated_at": "2021-02-14T21:02:11Z", "author_association": "OWNER", "body": "I managed to replicate this in a test:\r\n```python\r\ndef test_insert_all_with_extra_columns_in_later_chunks(fresh_db):\r\n chunk = [\r\n {\"record\": \"Record 1\"},\r\n {\"record\": \"Record 2\"},\r\n {\"record\": \"Record 3\"},\r\n {\"record\": \"Record 4\", \"extra\": 1},\r\n ]\r\n fresh_db[\"t\"].insert_all(chunk, batch_size=2, alter=True)\r\n assert list(fresh_db[\"t\"].rows) == [\r\n {\"record\": \"Record 1\", \"extra\": None},\r\n {\"record\": \"Record 2\", \"extra\": None},\r\n {\"record\": \"Record 3\", \"extra\": None},\r\n {\"record\": \"Record 4\", \"extra\": 1},\r\n ]\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 808046597, "label": ".insert_all() fails if subsequent chunks contain additional columns"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/pull/225#issuecomment-778834504", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/225", "id": 778834504, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODgzNDUwNA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T20:09:30Z", "updated_at": "2021-02-14T20:09:30Z", "author_association": "OWNER", "body": "Thanks for this. 
I'm going to try and get the test suite to run in Windows on GitHub Actions.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 797159961, "label": "fix for problem in Table.insert_all on search for columns per chunk of rows"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/231#issuecomment-778829456", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/231", "id": 778829456, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODgyOTQ1Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T19:37:52Z", "updated_at": "2021-02-14T19:37:52Z", "author_association": "OWNER", "body": "I'm going to add `limit` and `offset` to the following methods:\r\n\r\n- `rows_where()`\r\n- `search_sql()`\r\n- `search()`", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 808028757, "label": "limit=X, offset=Y parameters for more Python methods"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/231#issuecomment-778828758", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/231", "id": 778828758, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODgyODc1OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T19:33:14Z", "updated_at": "2021-02-14T19:33:14Z", "author_association": "OWNER", "body": "The `limit=` parameter is currently only available on the `.search()` method - it would make sense to add this to other methods as well.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 808028757, "label": "limit=X, offset=Y parameters for more Python methods"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/pull/224#issuecomment-778828495", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/224", "id": 778828495, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODgyODQ5NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T19:31:06Z", "updated_at": "2021-02-14T19:31:06Z", "author_association": "OWNER", "body": "I'm going to add a `offset=` parameter to support this case. 
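Roughly how the combination would read from Python once both parameters exist (a sketch of the planned API; `limit=` is already supported on `.search()`, `offset=` is the addition):

```python
import sqlite_utils

db = sqlite_utils.Database(memory=True)
db["articles"].insert_all({"title": "Dog story {}".format(i)} for i in range(25))
db["articles"].enable_fts(["title"])

# Page through full-text search results ten at a time
page_one = list(db["articles"].search("dog", limit=10))
page_two = list(db["articles"].search("dog", limit=10, offset=10))
```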
Thanks for the suggestion!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 792297010, "label": "Add fts offset docs."}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778827570", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/230", "id": 778827570, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODgyNzU3MA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T19:24:20Z", "updated_at": "2021-02-14T19:24:20Z", "author_association": "OWNER", "body": "Here's the implementation in Python: https://github.com/python/cpython/blob/63298930fb531ba2bb4f23bc3b915dbf1e17e9e1/Lib/csv.py#L204-L225", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 808008305, "label": "--sniff option for sniffing delimiters"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778824361", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/230", "id": 778824361, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODgyNDM2MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T18:59:22Z", "updated_at": "2021-02-14T18:59:22Z", "author_association": "OWNER", "body": "I think I've got it. I can use `io.BufferedReader()` to get an object I can run `.peek(2048)` on, then wrap THAT in `io.TextIOWrapper`:\r\n\r\n```python\r\n encoding = encoding or \"utf-8\"\r\n buffered = io.BufferedReader(json_file, buffer_size=4096)\r\n decoded = io.TextIOWrapper(buffered, encoding=encoding, line_buffering=True)\r\n if pk and len(pk) == 1:\r\n pk = pk[0]\r\n if csv or tsv:\r\n if sniff:\r\n # Read first 2048 bytes and use that to detect\r\n first_bytes = buffered.peek(2048)\r\n print('first_bytes', first_bytes)\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 808008305, "label": "--sniff option for sniffing delimiters"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778821403", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/230", "id": 778821403, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODgyMTQwMw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T18:38:16Z", "updated_at": "2021-02-14T18:38:16Z", "author_association": "OWNER", "body": "There are two code paths here that matter:\r\n\r\n- For a regular file, can read the first 2048 bytes, then `.seek(0)` before continuing. That's easy.\r\n- `stdin` is harder. I need to read and buffer the first 2048 bytes, then pass an object to `csv.reader()` which will replay that chunk and then play the rest of stdin.\r\n\r\nI'm a bit stuck on the second one. 
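A sketch of that buffer-and-replay idea for a non-seekable stream (hypothetical helper, not what the library shipped; the `peek()` approach above sidesteps the problem for buffered stdin):

```python
import io
import sys

def replay_prefix(prefix, rest):
    # Hypothetical helper: serve the already-read `prefix` bytes first,
    # then keep reading from `rest` (e.g. sys.stdin.buffer)
    class _Replay(io.RawIOBase):
        def __init__(self):
            self._prefix = io.BytesIO(prefix)
        def readable(self):
            return True
        def readinto(self, b):
            n = self._prefix.readinto(b)
            if n:
                return n
            return rest.readinto(b)  # prefix exhausted; read the real stream
    return io.BufferedReader(_Replay())

first_bytes = sys.stdin.buffer.read(2048)  # sniff the dialect on these bytes
stream = replay_prefix(first_bytes, sys.stdin.buffer)
```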
Ideally I could use something like `itertools.chain()` but I can't find an alternative for file-like objects.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 808008305, "label": "--sniff option for sniffing delimiters"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778818639", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/230", "id": 778818639, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODgxODYzOQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T18:22:38Z", "updated_at": "2021-02-14T18:22:38Z", "author_association": "OWNER", "body": "Maybe I shouldn't be using `StreamReader` at all - https://www.python.org/dev/peps/pep-0400/ suggests that it should be deprecated in favour of `io.TextIOWrapper`. I'm using `StreamReader` due to this line: https://github.com/simonw/sqlite-utils/blob/726219c3503e77440975cd15b74d006639feb0f8/sqlite_utils/cli.py#L667-L668", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 808008305, "label": "--sniff option for sniffing delimiters"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778817494", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/230", "id": 778817494, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODgxNzQ5NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T18:16:06Z", "updated_at": "2021-02-14T18:16:06Z", "author_association": "OWNER", "body": "Types involved:\r\n```\r\n(Pdb) type(json_file.raw)\r\n<class '_io.FileIO'>\r\n(Pdb) type(json_file)\r\n<class '_io.BufferedReader'>\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 808008305, "label": "--sniff option for sniffing delimiters"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778816333", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/230", "id": 778816333, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODgxNjMzMw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T18:08:44Z", "updated_at": "2021-02-14T18:08:44Z", "author_association": "OWNER", "body": "No, you can't `.seek(0)` on stdin:\r\n```\r\n File \"/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/cli.py\", line 678, in insert_upsert_implementation\r\n json_file.raw.seek(0)\r\nOSError: [Errno 29] Illegal seek\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 808008305, "label": "--sniff option for sniffing delimiters"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778815740", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/230", "id": 778815740, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODgxNTc0MA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T18:05:03Z", "updated_at": "2021-02-14T18:05:03Z", "author_association": "OWNER", "body": "The challenge here is how to read the first 2048 bytes and then reset the incoming file.\r\n\r\nThe Python docs example looks like this:\r\n\r\n```python\r\nwith 
open('example.csv', newline='') as csvfile:\r\n dialect = csv.Sniffer().sniff(csvfile.read(1024))\r\n csvfile.seek(0)\r\n reader = csv.reader(csvfile, dialect)\r\n```\r\nHere's the relevant code in `sqlite-utils`: https://github.com/simonw/sqlite-utils/blob/726219c3503e77440975cd15b74d006639feb0f8/sqlite_utils/cli.py#L671-L679\r\n\r\nThe challenge is going to be having the `--sniff` option work with the progress bar. Here's how `file_progress()` works: https://github.com/simonw/sqlite-utils/blob/726219c3503e77440975cd15b74d006639feb0f8/sqlite_utils/utils.py#L106-L113\r\n\r\nIf `file.raw` is `stdin` can I do the equivalent of `csvfile.seek(0)` on it?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 808008305, "label": "--sniff option for sniffing delimiters"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/230#issuecomment-778812684", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/230", "id": 778812684, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODgxMjY4NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T17:45:16Z", "updated_at": "2021-02-14T17:45:16Z", "author_association": "OWNER", "body": "Running this could take any CSV (or TSV) file and automatically detect the delimiter. If no header row is detected it could add `unknown1,unknown2` headers:\r\n\r\n sqlite-utils insert db.db data file.csv --sniff\r\n\r\n(Using `--sniff` would imply `--csv`)\r\n\r\nThis could be called `--sniffer` instead but I like `--sniff` better.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 808008305, "label": "--sniff option for sniffing delimiters"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778812050", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/228", "id": 778812050, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODgxMjA1MA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T17:41:30Z", "updated_at": "2021-02-14T17:41:30Z", "author_association": "OWNER", "body": "I just spotted that `csv.Sniffer` in the Python standard library has a `.has_header(sample)` method which detects if the first row appears to be a header or not, which is interesting. https://docs.python.org/3/library/csv.html#csv.Sniffer", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 807437089, "label": "--no-headers option for CSV and TSV"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778811934", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/228", "id": 778811934, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODgxMTkzNA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-14T17:40:48Z", "updated_at": "2021-02-14T17:40:48Z", "author_association": "OWNER", "body": "Another pattern that might be useful is to generate a header that is just \"unknown1,unknown2,unknown3\" for each of the columns in the rest of the file. This makes it easy to e.g. 
facet-explore within Datasette to figure out the correct names, then use `sqlite-utils transform --rename` to rename the columns.\r\n\r\nI needed to do that for the https://bl.iro.bl.uk/work/ns/3037474a-761c-456d-a00c-9ef3c6773f4c example.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 807437089, "label": "--no-headers option for CSV and TSV"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778511347", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/228", "id": 778511347, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODUxMTM0Nw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-12T23:27:50Z", "updated_at": "2021-02-12T23:27:50Z", "author_association": "OWNER", "body": "For the moment, a workaround can be to `cat` an additional row onto the start of the file.\r\n\r\n echo \"name,url,description\" | cat - missing_headings.csv | sqlite-utils insert blah.db table - --csv", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 807437089, "label": "--no-headers option for CSV and TSV"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/131#issuecomment-778510528", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/131", "id": 778510528, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODUxMDUyOA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-12T23:25:06Z", "updated_at": "2021-02-12T23:25:06Z", "author_association": "OWNER", "body": "If `-c` isn't available, maybe `-t` or `--type` would work for specifying column types:\r\n```\r\nsqlite-utils insert db.db images images.tsv \\\r\n --tsv \\\r\n --type id int \\\r\n --type score float\r\n```\r\nor\r\n```\r\nsqlite-utils insert db.db images images.tsv \\\r\n --tsv \\\r\n -t id int \\\r\n -t score float\r\n```", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 675753042, "label": "sqlite-utils insert: options for column types"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/131#issuecomment-778508887", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/131", "id": 778508887, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODUwODg4Nw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-12T23:20:11Z", "updated_at": "2021-02-12T23:20:11Z", "author_association": "OWNER", "body": "Annoyingly `-c` is currently a shortcut for `--csv` - so I'd have to do a major version bump to use that.\r\n\r\nhttps://github.com/simonw/sqlite-utils/blob/726219c3503e77440975cd15b74d006639feb0f8/sqlite_utils/cli.py#L601-L603\r\n\r\nParticularly annoying because I attempted to remove the `-c` shortcut in https://github.com/simonw/sqlite-utils/commit/2c00567aac6d9c79087cfff0d054f64922b1473d#diff-76294b3d4afeb27e74e738daa01c26dd4dc9ccb6f4477451483a2ece1095902eL48 but forgot to remove it from the input options (I removed it from the output options).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 675753042, "label": "sqlite-utils insert: options for column types"}, 
"performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778349672", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/228", "id": 778349672, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODM0OTY3Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-12T18:00:43Z", "updated_at": "2021-02-12T18:00:43Z", "author_association": "OWNER", "body": "I could combine this with #131 to allow types to be specified in addition to column names.\r\n\r\nProbably need an option that means \"ignore the existing heading row and use this one instead\".", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 807437089, "label": "--no-headers option for CSV and TSV"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/227#issuecomment-778349142", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/227", "id": 778349142, "node_id": "MDEyOklzc3VlQ29tbWVudDc3ODM0OTE0Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-12T17:59:35Z", "updated_at": "2021-02-12T17:59:35Z", "author_association": "OWNER", "body": "It looks like I can at least bump this size limit up to the maximum allowed by Python - I'll take a look at that. ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 807174161, "label": "Error reading csv files with large column data"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1221#issuecomment-777901052", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1221", "id": 777901052, "node_id": "MDEyOklzc3VlQ29tbWVudDc3NzkwMTA1Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-12T01:09:54Z", "updated_at": "2021-02-12T01:09:54Z", "author_association": "OWNER", "body": "I also tested this manually. I generated certificate files like so:\r\n\r\n cd /tmp\r\n python -m trustme\r\n\r\nThis created `/tmp/server.pem`, `/tmp/client.pem` and `/tmp/server.key`\r\n\r\nThen I started Datasette like this:\r\n\r\n datasette --memory --ssl-keyfile=/tmp/server.key --ssl-certfile=/tmp/server.pem\r\n\r\nAnd exercise it using `curl` like so:\r\n\r\n /tmp % curl --cacert /tmp/client.pem 'https://localhost:8001/_memory.json'\r\n {\"database\": \"_memory\", \"path\": \"/_memory\", \"size\": 0, \"tables\": [], \"hidden_count\": 0, \"views\": [], \"queries\": [],\r\n \"private\": false, \"allow_execute_sql\": true, \"query_ms\": 0.8843200000114848}\r\n\r\nNote that without the `--cacert` option I get an error:\r\n\r\n```\r\n/tmp % curl 'https://localhost:8001/_memory.json' \r\ncurl: (60) SSL certificate problem: Invalid certificate chain\r\nMore details here: https://curl.haxx.se/docs/sslcerts.html\r\n\r\ncurl failed to verify the legitimacy of the server and therefore could not\r\nestablish a secure connection to it. 
To learn more about this situation and\r\nhow to fix it, please visit the web page mentioned above.\r\n```\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 806849424, "label": "Support SSL/TLS directly"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1221#issuecomment-777887190", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1221", "id": 777887190, "node_id": "MDEyOklzc3VlQ29tbWVudDc3Nzg4NzE5MA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-12T00:29:18Z", "updated_at": "2021-02-12T00:29:18Z", "author_association": "OWNER", "body": "I can use this recipe to start a `datasette` server in a sub-process during the pytest run and exercise it with real HTTP requests: https://til.simonwillison.net/pytest/subprocess-server", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 806849424, "label": "Support SSL/TLS directly"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1221#issuecomment-777883452", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1221", "id": 777883452, "node_id": "MDEyOklzc3VlQ29tbWVudDc3Nzg4MzQ1Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-12T00:19:30Z", "updated_at": "2021-02-12T00:19:40Z", "author_association": "OWNER", "body": "Uvicorn supports these options: https://www.uvicorn.org/#command-line-options\r\n```\r\n --ssl-keyfile TEXT SSL key file\r\n --ssl-certfile TEXT SSL certificate file\r\n --ssl-keyfile-password TEXT SSL keyfile password\r\n --ssl-version INTEGER SSL version to use (see stdlib ssl module's)\r\n [default: 2]\r\n\r\n --ssl-cert-reqs INTEGER Whether client certificate is required (see\r\n stdlib ssl module's) [default: 0]\r\n\r\n --ssl-ca-certs TEXT CA certificates file\r\n --ssl-ciphers TEXT Ciphers to use (see stdlib ssl module's)\r\n [default: TLSv1]\r\n```\r\nFor the moment I'm going to support just `--ssl-keyfile` and `--ssl-certfile` as arguments to `datasette serve`. 
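Those two options map directly onto Uvicorn keyword arguments; a minimal standalone sketch (toy ASGI app, certificate paths assumed from the trustme example above):

```python
import uvicorn

async def app(scope, receive, send):
    # Minimal ASGI app: answer every HTTP request with plain text
    assert scope["type"] == "http"
    await send({"type": "http.response.start", "status": 200,
                "headers": [(b"content-type", b"text/plain")]})
    await send({"type": "http.response.body", "body": b"Hello over TLS"})

if __name__ == "__main__":
    uvicorn.run(
        app,
        host="127.0.0.1",
        port=8001,
        ssl_keyfile="/tmp/server.key",    # generated by python -m trustme
        ssl_certfile="/tmp/server.pem",
    )
```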
I'll add other options if people ask for them.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 806849424, "label": "Support SSL/TLS directly"}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/evernote-to-sqlite/pull/10#issuecomment-777839351", "issue_url": "https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/10", "id": 777839351, "node_id": "MDEyOklzc3VlQ29tbWVudDc3NzgzOTM1MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-11T22:37:55Z", "updated_at": "2021-02-11T22:37:55Z", "author_association": "MEMBER", "body": "I've merged these changes by hand now, thanks!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 770712149, "label": "BugFix for encoding and not update info."}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/evernote-to-sqlite/issues/7#issuecomment-777827396", "issue_url": "https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/7", "id": 777827396, "node_id": "MDEyOklzc3VlQ29tbWVudDc3NzgyNzM5Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-11T22:13:14Z", "updated_at": "2021-02-11T22:13:14Z", "author_association": "MEMBER", "body": "My best guess is that you have an older version of `sqlite-utils` installed here - the `replace=True` argument was added in version 2.0. I've bumped the dependency in `setup.py`.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 743297582, "label": "evernote-to-sqlite on windows 10 give this error: TypeError: insert() got an unexpected keyword argument 'replace'"}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/evernote-to-sqlite/issues/9#issuecomment-777821383", "issue_url": "https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/9", "id": 777821383, "node_id": "MDEyOklzc3VlQ29tbWVudDc3NzgyMTM4Mw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-11T22:01:28Z", "updated_at": "2021-02-11T22:01:28Z", "author_association": "MEMBER", "body": "Aha! I think I've figured out what's going on here.\r\n\r\nThe CData blocks containing the notes look like this:\r\n\r\n`
<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?><!DOCTYPE en-note SYSTEM \"http://xml.evernote.com/pub/enml2.dtd\"><en-note><div>This note includes two images.</div>
...`\r\n\r\nThe DTD at http://xml.evernote.com/pub/enml2.dtd includes some entities:\r\n\r\n```\r\n<!ENTITY % HTMLlat1 PUBLIC\r\n \"-//W3C//ENTITIES Latin 1 for XHTML//EN\"\r\n \"http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent\">\r\n%HTMLlat1;\r\n<!ENTITY % HTMLsymbol PUBLIC\r\n \"-//W3C//ENTITIES Symbols for XHTML//EN\"\r\n \"http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent\">\r\n%HTMLsymbol;\r\n<!ENTITY % HTMLspecial PUBLIC\r\n \"-//W3C//ENTITIES Special for XHTML//EN\"\r\n \"http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent\">\r\n%HTMLspecial;\r\n```\r\nSo I need to be able to handle all of those different entities. I think I can do that using `html.entities.entitydefs` from the Python standard library, which looks a bit like this:\r\n\r\n```python\r\n{'Aacute': '\u00c1',\r\n 'aacute': '\u00e1',\r\n 'Aacute;': '\u00c1',\r\n 'aacute;': '\u00e1',\r\n 'Abreve;': '\u0102',\r\n 'abreve;': '\u0103',\r\n 'ac;': '\u223e',\r\n 'acd;': '\u223f',\r\n# ...\r\n}\r\n```\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 748372469, "label": "ParseError: undefined entity š"}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/evernote-to-sqlite/issues/11#issuecomment-777798330", "issue_url": "https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/11", "id": 777798330, "node_id": "MDEyOklzc3VlQ29tbWVudDc3Nzc5ODMzMA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-11T21:18:58Z", "updated_at": "2021-02-11T21:18:58Z", "author_association": "MEMBER", "body": "Thanks for the fix!", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 792851444, "label": "XML parse error"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1200#issuecomment-777178728", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1200", "id": 777178728, "node_id": "MDEyOklzc3VlQ29tbWVudDc3NzE3ODcyOA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-11T03:13:59Z", "updated_at": "2021-02-11T03:13:59Z", "author_association": "OWNER", "body": "I came up with the need for this while playing with this tool: https://calands.datasettes.com/calands?sql=select%0D%0A++AsGeoJSON(geometry)%2C+*%0D%0Afrom%0D%0A++CPAD_2020a_SuperUnits%0D%0Awhere%0D%0A++PARK_NAME+like+'%25mini%25'+and%0D%0A++Intersects(GeomFromGeoJSON(%3Afreedraw)%2C+geometry)+%3D+1%0D%0A++and+CPAD_2020a_SuperUnits.rowid+in+(%0D%0A++++select%0D%0A++++++rowid%0D%0A++++from%0D%0A++++++SpatialIndex%0D%0A++++where%0D%0A++++++f_table_name+%3D+'CPAD_2020a_SuperUnits'%0D%0A++++++and+search_frame+%3D+GeomFromGeoJSON(%3Afreedraw)%0D%0A++)&freedraw={\"type\"%3A\"MultiPolygon\"%2C\"coordinates\"%3A[[[[-122.42202758789064%2C37.82280243352759]%2C[-122.39868164062501%2C37.823887203271454]%2C[-122.38220214843751%2C37.81846319511331]%2C[-122.35061645507814%2C37.77071473849611]%2C[-122.34924316406251%2C37.74465712069939]%2C[-122.37258911132814%2C37.703380457832374]%2C[-122.39044189453125%2C37.690340943717715]%2C[-122.41241455078126%2C37.680559803205135]%2C[-122.44262695312501%2C37.67295135774715]%2C[-122.47283935546876%2C37.67295135774715]%2C[-122.52502441406251%2C37.68382032669382]%2C[-122.53463745117189%2C37.6892542140253]%2C[-122.54699707031251%2C37.690340943717715]%2C[-122.55798339843751%2C37.72945260537781]%2C[-122.54287719726564%2C37.77831314799672]%2C[-122.49893188476564%2C37.81303878836991]%2C[-122.46185302734376%2C37.82822612280363]%2C[-122.42889404296876%2C37.82822612280363]%2C[-122.42202758789064%2C37.82280243352759]]]]} - before I fixed https://github.com/simonw/datasette-leaflet-geojson/issues/16 it was loading a LOT of maps, which felt bad. 
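(Back on the Evernote entity handling above: one hedged way to apply `html.entities.entitydefs`, shown here as a pre-parse substitution sketch rather than necessarily the fix the tool shipped:)

```python
import re
import html.entities

def resolve_named_entities(xml_text):
    # Swap &name; for its character using the standard library's table,
    # leaving XML's own built-in entities (and anything unknown) untouched
    keep = {"amp", "lt", "gt", "quot", "apos"}
    def replace(match):
        name = match.group(1)
        if name in keep or name not in html.entities.entitydefs:
            return match.group(0)
        return html.entities.entitydefs[name]
    return re.sub(r"&(\w+);", replace, xml_text)

print(resolve_named_entities("caf&eacute; &scaron;tyle &amp; more"))
# café štyle &amp; more
```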
I wanted to be able to link people to that page with a hard limit on the number of rows displayed on that page.\r\n\r\nIt's mainly to guard against unexpected behaviour from limit-less queries though. It's not a very high priority feature!", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 792890765, "label": "?_size=10 option for the arbitrary query page would be useful"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1219#issuecomment-775442039", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1219", "id": 775442039, "node_id": "MDEyOklzc3VlQ29tbWVudDc3NTQ0MjAzOQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-08T20:39:52Z", "updated_at": "2021-02-08T22:13:00Z", "author_association": "OWNER", "body": "This comment helped me find a pattern for running Scalene against the Datasette test suite: https://github.com/emeryberger/scalene/issues/70#issuecomment-755245858\r\n\r\n```\r\npip install scalene\r\n```\r\nThen I created a file called `run_tests.py` with the following contents:\r\n```python\r\nif __name__ == \"__main__\":\r\n import sys, pytest\r\n pytest.main(sys.argv)\r\n```\r\nThen I ran this:\r\n```\r\nscalene --profile-all run_tests.py -sv -x .\r\n```\r\nBut... it quit with a segmentation fault!\r\n```\r\n(datasette) datasette % scalene --profile-all run_tests.py -sv -x .\r\n======================================================================== test session starts ========================================================================\r\nplatform darwin -- Python 3.8.6, pytest-6.0.1, py-1.9.0, pluggy-0.13.1 -- python\r\ncachedir: .pytest_cache\r\nrootdir: /Users/simon/Dropbox/Development/datasette, configfile: pytest.ini\r\nplugins: asyncio-0.14.0, timeout-1.4.2\r\ncollecting ... 
Fatal Python error: Segmentation fault\r\n\r\nCurrent thread 0x0000000110c1edc0 (most recent call first):\r\n File \"/Users/simon/Dropbox/Development/datasette/datasette/utils/__init__.py\", line 553 in detect_json1\r\n File \"/Users/simon/Dropbox/Development/datasette/datasette/filters.py\", line 168 in Filters\r\n File \"/Users/simon/Dropbox/Development/datasette/datasette/filters.py\", line 94 in <module>\r\n File \"<frozen importlib._bootstrap>\", line 219 in _call_with_frames_removed\r\n File \"<frozen importlib._bootstrap_external>\", line 783 in exec_module\r\n File \"<frozen importlib._bootstrap>\", line 671 in _load_unlocked\r\n File \"<frozen importlib._bootstrap>\", line 975 in _find_and_load_unlocked\r\n File \"<frozen importlib._bootstrap>\", line 991 in _find_and_load\r\n File \"/Users/simon/Dropbox/Development/datasette/datasette/views/table.py\", line 27 in <module>\r\n File \"<frozen importlib._bootstrap>\", line 219 in _call_with_frames_removed\r\n File \"<frozen importlib._bootstrap_external>\", line 783 in exec_module\r\n File \"<frozen importlib._bootstrap>\", line 671 in _load_unlocked\r\n File \"<frozen importlib._bootstrap>\", line 975 in _find_and_load_unlocked\r\n File \"<frozen importlib._bootstrap>\", line 991 in _find_and_load\r\n File \"/Users/simon/Dropbox/Development/datasette/datasette/app.py\", line 42 in <module>\r\n File \"<frozen importlib._bootstrap>\", line 219 in _call_with_frames_removed\r\n File \"<frozen importlib._bootstrap_external>\", line 783 in exec_module\r\n File \"<frozen importlib._bootstrap>\", line 671 in _load_unlocked\r\n File \"<frozen importlib._bootstrap>\", line 975 in _find_and_load_unlocked\r\n File \"<frozen importlib._bootstrap>\", line 991 in _find_and_load\r\n File \"/Users/simon/Dropbox/Development/datasette/tests/test_api.py\", line 1 in <module>\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/assertion/rewrite.py\", line 170 in exec_module\r\n File \"<frozen importlib._bootstrap>\", line 671 in _load_unlocked\r\n File \"<frozen importlib._bootstrap>\", line 975 in _find_and_load_unlocked\r\n File \"<frozen importlib._bootstrap>\", line 991 in _find_and_load\r\n File \"<frozen importlib._bootstrap>\", line 1014 in _gcd_import\r\n File \"/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/importlib/__init__.py\", line 127 in import_module\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/pathlib.py\", line 520 in import_path\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/python.py\", line 552 in _importtestmodule\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/python.py\", line 484 in _getobj\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/python.py\", line 288 in obj\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/python.py\", line 500 in _inject_setup_module_fixture\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/python.py\", line 487 in collect\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/runner.py\", line 324 in <lambda>\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/runner.py\", line 294 in from_call\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/runner.py\", line 324 in pytest_make_collect_report\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/callers.py\", line 187 in _multicall\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/manager.py\", line 84 in <lambda>\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/manager.py\", line 93 in _hookexec\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/hooks.py\", line 286 in __call__\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/runner.py\", line 441 in collect_one_node\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/main.py\", line 768 in genitems\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/main.py\", line 771 in genitems\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/main.py\", line 568 in _perform_collect\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/main.py\", line 516 in perform_collect\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/main.py\", line 306 in pytest_collection\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/callers.py\", line 187 in _multicall\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/manager.py\", line 84 in <lambda>\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/manager.py\", line 93 in _hookexec\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/hooks.py\", line 286 in __call__\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/main.py\", line 295 in _main\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/main.py\", line 240 in wrap_session\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/main.py\", line 289 in pytest_cmdline_main\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/callers.py\", line 187 in _multicall\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/manager.py\", line 84 in <lambda>\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/manager.py\", line 93 in _hookexec\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/pluggy/hooks.py\", line 286 in __call__\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/_pytest/config/__init__.py\", line 157 in main\r\n File \"run_tests.py\", line 3 in <module>\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/scalene/scalene_profiler.py\", line 1525 in main\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/scalene/__main__.py\", line 7 in main\r\n File \"/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/scalene/__main__.py\", line 14 in <module>\r\n File \"/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/runpy.py\", line 87 in _run_code\r\n File \"/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/runpy.py\", line 194 in _run_module_as_main\r\nScalene error: received signal SIGSEGV\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 
803929694, "label": "Try profiling Datasette using scalene"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1219#issuecomment-775497449", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1219", "id": 775497449, "node_id": "MDEyOklzc3VlQ29tbWVudDc3NTQ5NzQ0OQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-08T22:11:34Z", "updated_at": "2021-02-08T22:11:34Z", "author_association": "OWNER", "body": "https://github.com/emeryberger/scalene/issues/110 reports a \"received signal SIGSEGV\" error that was fixed by upgrading to the latest Scalene version, but I'm running that already.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 803929694, "label": "Try profiling Datasette using scalene"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/223#issuecomment-774373829", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/223", "id": 774373829, "node_id": "MDEyOklzc3VlQ29tbWVudDc3NDM3MzgyOQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-06T01:39:47Z", "updated_at": "2021-02-06T01:39:47Z", "author_association": "OWNER", "body": "Documentation: https://sqlite-utils.datasette.io/en/stable/cli.html#cli-insert-csv-tsv-delimiter", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 788527932, "label": "--delimiter option for CSV import"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1216#issuecomment-772796111", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1216", "id": 772796111, "node_id": "MDEyOklzc3VlQ29tbWVudDc3Mjc5NjExMQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-03T20:20:48Z", "updated_at": "2021-02-03T20:20:48Z", "author_association": "OWNER", "body": "Relevant code: https://github.com/simonw/datasette/blob/1600d2a3ec3ada1f6fb5b1eb73bdaeccb5f80530/datasette/app.py#L620-L632", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 800669347, "label": "/-/databases should reflect connection order, not alphabetical order"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1214#issuecomment-772001787", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1214", "id": 772001787, "node_id": "MDEyOklzc3VlQ29tbWVudDc3MjAwMTc4Nw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-02T21:28:53Z", "updated_at": "2021-02-02T21:28:53Z", "author_association": "OWNER", "body": "Fix is now live on https://latest.datasette.io/fixtures/searchable?_search=terry - clearing \"terry\" and re-submitting the form now works as expected.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 799693777, "label": "Re-submitting filter form duplicates _x querystring arguments"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1214#issuecomment-771992628", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1214", "id": 771992628, "node_id": "MDEyOklzc3VlQ29tbWVudDc3MTk5MjYyOA==", 
"user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-02T21:15:18Z", "updated_at": "2021-02-02T21:15:18Z", "author_association": "OWNER", "body": "The cause of this bug is form fields which begin with `_` but ARE displayed as form inputs on the page - hence should not be duplicated in an `` element.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 799693777, "label": "Re-submitting filter form duplicates _x querystring arguments"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1214#issuecomment-771992025", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1214", "id": 771992025, "node_id": "MDEyOklzc3VlQ29tbWVudDc3MTk5MjAyNQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-02T21:14:16Z", "updated_at": "2021-02-02T21:14:16Z", "author_association": "OWNER", "body": "As a result, navigating to https://github-to-sqlite.dogsheep.net/github/labels?_search=help and clearing out the `_search` field then submitting the form does NOT clear the search term.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 799693777, "label": "Re-submitting filter form duplicates _x querystring arguments"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1212#issuecomment-771976561", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1212", "id": 771976561, "node_id": "MDEyOklzc3VlQ29tbWVudDc3MTk3NjU2MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-02T20:53:27Z", "updated_at": "2021-02-02T20:53:27Z", "author_association": "OWNER", "body": "It would be great if we could get `python-xdist` to run too - I tried it in the past and gave up when I ran into those race conditions, but I've not done any further digging to see if there's a way to fix that.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 797651831, "label": "Tests are very slow. "}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1212#issuecomment-771975941", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1212", "id": 771975941, "node_id": "MDEyOklzc3VlQ29tbWVudDc3MTk3NTk0MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-02T20:52:36Z", "updated_at": "2021-02-02T20:52:36Z", "author_association": "OWNER", "body": "37 minutes, wow! They're a little slow for me (4-5 minutes perhaps) but not nearly that bad.\r\n\r\nThanks for running that profile. I think you're right: figuring out how to use more session scopes would definitely help.\r\n\r\nThe `:memory:` idea is interesting too. The new `memory_name=` feature added in #1151 (released in Datasette 0.54) could help a lot here, since it allows Datasette instances to share the same in-memory database across multiple HTTP requests and connections.\r\n\r\nNote that `memory_name=` also persists within test runs themselves, independently of any `scope=` options on the fixtures. 
That might actually help us here!\r\n\r\nI'd be delighted if you explored this issue further, especially the option of using `memory_name=` for the fixtures databases used by the tests.\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 797651831, "label": "Tests are very slow. "}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1213#issuecomment-771968675", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1213", "id": 771968675, "node_id": "MDEyOklzc3VlQ29tbWVudDc3MTk2ODY3NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-02T20:41:55Z", "updated_at": "2021-02-02T20:41:55Z", "author_association": "OWNER", "body": "So maybe I could add a special response header which ASGI middleware can pick up that means \"Don't attempt to gzip this, just stream it through\".", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 799663959, "label": "gzip support for HTML (and JSON) responses"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1213#issuecomment-771968177", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1213", "id": 771968177, "node_id": "MDEyOklzc3VlQ29tbWVudDc3MTk2ODE3Nw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-02T20:41:13Z", "updated_at": "2021-02-02T20:41:13Z", "author_association": "OWNER", "body": "Starlette accumulates the full response body in a `body` variable and then does this:\r\n```python\r\n        elif message_type == \"http.response.body\":\r\n            # Remaining body in streaming GZip response.\r\n            body = message.get(\"body\", b\"\")\r\n            more_body = message.get(\"more_body\", False)\r\n\r\n            self.gzip_file.write(body)\r\n            if not more_body:\r\n                self.gzip_file.close()\r\n\r\n            message[\"body\"] = self.gzip_buffer.getvalue()\r\n            self.gzip_buffer.seek(0)\r\n            self.gzip_buffer.truncate()\r\n\r\n            await self.send(message)\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 799663959, "label": "gzip support for HTML (and JSON) responses"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1213#issuecomment-771965281", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1213", "id": 771965281, "node_id": "MDEyOklzc3VlQ29tbWVudDc3MTk2NTI4MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-02-02T20:37:08Z", "updated_at": "2021-02-02T20:39:24Z", "author_association": "OWNER", "body": "Starlette's gzip middleware implementation is here: https://github.com/encode/starlette/blob/0.14.2/starlette/middleware/gzip.py", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 799663959, "label": "gzip support for HTML (and JSON) responses"}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/github-to-sqlite/issues/60#issuecomment-770071568", "issue_url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/60", "id": 770071568, "node_id": "MDEyOklzc3VlQ29tbWVudDc3MDA3MTU2OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-01-29T21:56:15Z", 
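To illustrate the header idea from the comment above (a sketch only — the `x-no-gzip` header name is invented here for illustration, not an actual Datasette or Starlette convention):

```python
# Sketch: ASGI middleware that honours a hypothetical "x-no-gzip" marker
# header by streaming the response through untouched instead of buffering
# it for compression. The marker is stripped before the client sees it.
class SelectiveGzipPassthrough:
    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            return await self.app(scope, receive, send)

        async def wrapped_send(message):
            if message["type"] == "http.response.start":
                headers = [
                    (name, value)
                    for name, value in message.get("headers", [])
                    if name != b"x-no-gzip"
                ]
                # A real implementation would decide here whether to gzip
                # the subsequent http.response.body messages, based on
                # whether the marker header was present.
                message = {**message, "headers": headers}
            await send(message)

        await self.app(scope, receive, wrapped_send)
```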
"updated_at": "2021-01-29T21:56:15Z", "author_association": "MEMBER", "body": "I really like the way you're using pipes here - really smart. It's similar to how I build the demo database in this GitHub Actions workflow:\r\n\r\nhttps://github.com/dogsheep/github-to-sqlite/blob/62dfd3bc4014b108200001ef4bc746feb6f33b45/.github/workflows/deploy-demo.yml#L52-L82\r\n\r\n`twitter-to-sqlite` actually has a mechanism for doing this kind of thing, documented at https://github.com/dogsheep/twitter-to-sqlite#providing-input-from-a-sql-query-with---sql-and---attach\r\n\r\nIt lets you do things like:\r\n\r\n```\r\n$ twitter-to-sqlite users-lookup my.db --sql=\"select follower_id from following\" --ids\r\n```\r\nMaybe I should add something similar to `github-to-sqlite`? Feels like it could be really useful.", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 797097140, "label": "Use Data from SQLite in other commands"}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/twitter-to-sqlite/issues/56#issuecomment-769957751", "issue_url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/56", "id": 769957751, "node_id": "MDEyOklzc3VlQ29tbWVudDc2OTk1Nzc1MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-01-29T17:59:40Z", "updated_at": "2021-01-29T17:59:40Z", "author_association": "MEMBER", "body": "This is interesting - how did you create that initial table? Was this using the `twitter-to-sqlite import archive.db ~/Downloads/twitter-2019-06-25-b31f2.zip` command, or something else?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 796736607, "label": "Not all quoted statuses get fetched?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1207#issuecomment-769534187", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1207", "id": 769534187, "node_id": "MDEyOklzc3VlQ29tbWVudDc2OTUzNDE4Nw==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-01-29T02:37:19Z", "updated_at": "2021-01-29T02:37:19Z", "author_association": "OWNER", "body": "https://docs.datasette.io/en/latest/testing_plugins.html#using-pdb-for-errors-thrown-inside-datasette", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 793881756, "label": "Document the Datasette(..., pdb=True) testing pattern"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1209#issuecomment-769455370", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1209", "id": 769455370, "node_id": "MDEyOklzc3VlQ29tbWVudDc2OTQ1NTM3MA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-01-28T23:00:21Z", "updated_at": "2021-01-28T23:00:21Z", "author_association": "OWNER", "body": "Good catch on the workaround here. The root problem is that `datasette-template-sql` looks for the first available databsae if you don't provide it with a `database=` argument, and in Datasette 0.54 the first available database changed to being the new `_internal` database.\r\n\r\nIs this a bug? 
I think it is - because the documented behaviour on https://docs.datasette.io/en/stable/internals.html#get-database-name is this:\r\n\r\n> `name` - string, optional\r\n>\r\n> The name to be used for this database - this will be used in the URL path, e.g. `/dbname`. If not specified Datasette will pick one based on the filename or memory name.\r\n\r\nSince the new behaviour differs from what was in the documentation I'm going to treat this as a bug and fix it.", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 795367402, "label": "v0.54 500 error from sql query in custom template; code worked in v0.53; found a workaround"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1205#issuecomment-769453074", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1205", "id": 769453074, "node_id": "MDEyOklzc3VlQ29tbWVudDc2OTQ1MzA3NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-01-28T22:54:49Z", "updated_at": "2021-01-28T22:55:02Z", "author_association": "OWNER", "body": "\r\nI also checked that the following works:\r\n\r\n    echo '{\"foo\": \"bar\"}' | sqlite-utils insert _memory.db demo -\r\n    datasette _memory.db --memory\r\n\r\nSure enough, it results in the following Datasette homepage - thanks to #509\r\n\r\n[screenshot: Datasette___memory___memory_2]", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 793027837, "label": "Rename /:memory: to /_memory"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1205#issuecomment-769452084", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1205", "id": 769452084, "node_id": "MDEyOklzc3VlQ29tbWVudDc2OTQ1MjA4NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-01-28T22:52:23Z", "updated_at": "2021-01-28T22:52:23Z", "author_association": "OWNER", "body": "Here are the redirect tests: https://github.com/simonw/datasette/blob/1600d2a3ec3ada1f6fb5b1eb73bdaeccb5f80530/tests/test_api.py#L635-L648", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 793027837, "label": "Rename /:memory: to /_memory"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1205#issuecomment-769442165", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1205", "id": 769442165, "node_id": "MDEyOklzc3VlQ29tbWVudDc2OTQ0MjE2NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-01-28T22:30:16Z", "updated_at": "2021-01-28T22:30:27Z", "author_association": "OWNER", "body": "I'm going to do this, with redirects from `/:memory:*`.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 793027837, "label": "Rename /:memory: to /_memory"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1210#issuecomment-769274591", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1210", "id": 769274591, "node_id": "MDEyOklzc3VlQ29tbWVudDc2OTI3NDU5MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-01-28T18:10:02Z", "updated_at": "2021-01-28T18:10:02Z", 
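Returning to the `_internal`-as-default bug discussed above: one plausible shape for the fix (a sketch, not the actual Datasette patch) is to skip the internal schema database whenever no name is supplied:

```python
# Sketch only: pick the first *user* database, never the internal one,
# when a caller asks for "whichever database comes first".
def first_user_database(databases):
    """databases: dict mapping database name -> Database object."""
    return next(db for name, db in databases.items() if name != "_internal")
```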
"author_association": "OWNER", "body": "That definitely sounds like a bug! Can you provide a copy of your `metadata.JSON` and the command-line you are using to launch Datasette?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 796234313, "label": "Immutable Database w/ Canned Queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1208#issuecomment-767823684", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1208", "id": 767823684, "node_id": "MDEyOklzc3VlQ29tbWVudDc2NzgyMzY4NA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-01-26T20:58:51Z", "updated_at": "2021-01-26T20:58:51Z", "author_association": "OWNER", "body": "This is a good catch - I've been lazy about this, but you're right that it's an issue that needs cleaning up. Would be very happy to apply a PR, thanks!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 794554881, "label": "A lot of open(file) functions are used without a context manager thus producing ResourceWarning: unclosed file <_io.TextIOWrapper"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1151#issuecomment-767762551", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1151", "id": 767762551, "node_id": "MDEyOklzc3VlQ29tbWVudDc2Nzc2MjU1MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-01-26T19:07:44Z", "updated_at": "2021-01-26T19:07:44Z", "author_association": "OWNER", "body": "Mentioned in https://simonwillison.net/2021/Jan/25/datasette/", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 770448622, "label": "Database class mechanism for cross-connection in-memory databases"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/991#issuecomment-767761155", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/991", "id": 767761155, "node_id": "MDEyOklzc3VlQ29tbWVudDc2Nzc2MTE1NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-01-26T19:05:21Z", "updated_at": "2021-01-26T19:06:36Z", "author_association": "OWNER", "body": "Idea: implement this using the existing table view, with a custom template called `table-internal-bb0ec0-tables.html` - that's the custom template listed in the HTML comments at the bottom of https://latest.datasette.io/_internal/tables", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 714377268, "label": "Redesign application homepage"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1201#issuecomment-766991680", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1201", "id": 766991680, "node_id": "MDEyOklzc3VlQ29tbWVudDc2Njk5MTY4MA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-01-25T17:42:21Z", "updated_at": "2021-01-25T17:42:21Z", "author_association": "OWNER", "body": "https://docs.datasette.io/en/stable/changelog.html#v0-54", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 
0}", "issue": {"value": 792904595, "label": "Release notes for Datasette 0.54"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1206#issuecomment-766588371", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1206", "id": 766588371, "node_id": "MDEyOklzc3VlQ29tbWVudDc2NjU4ODM3MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-01-25T06:49:06Z", "updated_at": "2021-01-25T06:49:06Z", "author_association": "OWNER", "body": "Last thing to do: write up the annotated version of these release notes, assign it a URL on my blog and link to it from the release notes here so I can publish them simultaneously.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 793086333, "label": "Release 0.54"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1206#issuecomment-766588020", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1206", "id": 766588020, "node_id": "MDEyOklzc3VlQ29tbWVudDc2NjU4ODAyMA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-01-25T06:48:20Z", "updated_at": "2021-01-25T06:48:20Z", "author_association": "OWNER", "body": "Issues to reference in the commit message: #509, #1091, #1150, #1151, #1166, #1167, #1178, #1181, #1182, #1184, #1185, #1186, #1187, #1194, #1198", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 793086333, "label": "Release 0.54"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1201#issuecomment-766586151", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1201", "id": 766586151, "node_id": "MDEyOklzc3VlQ29tbWVudDc2NjU4NjE1MQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-01-25T06:44:43Z", "updated_at": "2021-01-25T06:44:43Z", "author_association": "OWNER", "body": "OK, release notes are ready to merge from that branch. I'll ship the release in the morning, to give me time to write the accompanying annotated release notes.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 792904595, "label": "Release notes for Datasette 0.54"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1201#issuecomment-766545604", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1201", "id": 766545604, "node_id": "MDEyOklzc3VlQ29tbWVudDc2NjU0NTYwNA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2021-01-25T05:14:31Z", "updated_at": "2021-01-25T05:14:31Z", "author_association": "OWNER", "body": "The two big ticket items are `