{"html_url": "https://github.com/simonw/datasette/issues/2213#issuecomment-1843072926", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2213", "id": 1843072926, "node_id": "IC_kwDOBm6k_c5t2w-e", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2023-12-06T15:05:44Z", "updated_at": "2023-12-06T15:05:44Z", "author_association": "CONTRIBUTOR", "body": "it probably does not make sense to gzip large sqlite database files on the fly. it can take many seconds to gzip a large file and you either have to have this big thing in memory, or write it to disk, which some deployment environments will not like.\r\n\r\ni wonder if it would make sense to gzip the databases as part of the datasette publish process. it would be very cool to statically serve those as if they dynamically zipped (i.e. serve the filename example.db, not example.db.zip, and rely on the browser to expand).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 2028698018, "label": "feature request: gzip compression of database downloads"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1655#issuecomment-1767219901", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1655", "id": 1767219901, "node_id": "IC_kwDOBm6k_c5pVaK9", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2023-10-17T21:29:03Z", "updated_at": "2023-10-17T21:29:03Z", "author_association": "CONTRIBUTOR", "body": "@yejiyang why don\u2019t you move this discussion to my fork to spare simon\u2019s notifications ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1163369515, "label": "query result page is using 400mb of browser memory 40x size of html page and 400x size of csv data"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1655#issuecomment-1766994810", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1655", "id": 1766994810, "node_id": "IC_kwDOBm6k_c5pUjN6", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2023-10-17T19:01:59Z", "updated_at": "2023-10-17T19:01:59Z", "author_association": "CONTRIBUTOR", "body": "hi @yejiyang, have your tried using my fork of datasette: https://github.com/fgregg/datasette/tree/no_limit_csv_publish\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1163369515, "label": "query result page is using 400mb of browser memory 40x size of html page and 400x size of csv data"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/523#issuecomment-1407264466", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/523", "id": 1407264466, "node_id": "IC_kwDOCGYnMM5T4SbS", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2023-01-28T02:41:14Z", "updated_at": "2023-01-28T02:41:14Z", "author_association": "CONTRIBUTOR", "body": "I also often then run another little script to cast all empty strings to null, but i save that for another issue if this gets accepted.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1560651350, "label": "Feature 
request: trim all leading and trailing white space for all columns for all tables in a database"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/pull/203#issuecomment-1404070841", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/203", "id": 1404070841, "node_id": "IC_kwDOCGYnMM5TsGu5", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2023-01-25T18:47:18Z", "updated_at": "2023-01-25T18:47:18Z", "author_association": "CONTRIBUTOR", "body": "i'll adopt this PR to make the changes @simonw suggested https://github.com/simonw/sqlite-utils/pull/203#issuecomment-753567932", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 743384829, "label": "changes to allow for compound foreign keys"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/2003#issuecomment-1404065571", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2003", "id": 1404065571, "node_id": "IC_kwDOBm6k_c5TsFcj", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2023-01-25T18:44:42Z", "updated_at": "2023-01-25T18:44:42Z", "author_association": "CONTRIBUTOR", "body": "see this related discussion to a change in API in sqlite-utils https://github.com/simonw/sqlite-utils/pull/203#issuecomment-753567932", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1555701851, "label": "Show referring tables and rows when the referring foreign key is compound"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1099#issuecomment-1402900354", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1099", "id": 1402900354, "node_id": "IC_kwDOBm6k_c5Tno-C", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2023-01-25T00:58:26Z", "updated_at": "2023-01-25T00:58:26Z", "author_association": "CONTRIBUTOR", "body": "> My original idea for compound foreign keys was to turn both of those columns into links, but that doesn't fit here because `database_name` is already part of a different foreign key.\r\n\r\nit's pretty hard to know what the right thing to do is if a field is part of multiple foreign keys. \r\n\r\nbut, if that's not the case, what about making each of the columns a link. 
seems like an improvement over the status quo.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 743371103, "label": "Support linking to compound foreign keys"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1099#issuecomment-1402898291", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1099", "id": 1402898291, "node_id": "IC_kwDOBm6k_c5Tnodz", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2023-01-25T00:55:06Z", "updated_at": "2023-01-25T00:55:06Z", "author_association": "CONTRIBUTOR", "body": "I went ahead and spiked something together, in #2003 ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 743371103, "label": "Support linking to compound foreign keys"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/2003#issuecomment-1402898033", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/2003", "id": 1402898033, "node_id": "IC_kwDOBm6k_c5TnoZx", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2023-01-25T00:54:41Z", "updated_at": "2023-01-25T00:54:41Z", "author_association": "CONTRIBUTOR", "body": "@simonw, let me know what you think about this approach!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1555701851, "label": "Show referring tables and rows when the referring foreign key is compound"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1099#issuecomment-1402563930", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1099", "id": 1402563930, "node_id": "IC_kwDOBm6k_c5TmW1a", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2023-01-24T20:11:11Z", "updated_at": "2023-01-24T20:11:11Z", "author_association": "CONTRIBUTOR", "body": "hi @simonw, this bug bit me today.\r\n\r\nthe UX for linking from a table to the foreign key seems tough! 
\r\n\r\nthe design in the other direction seems a lot easier, for a given primary key detail page, add links back to the tables that refer to the row.\r\n\r\nwould you be open to a PR that solved the second problem but not the first?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 743371103, "label": "Support linking to compound foreign keys"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1614#issuecomment-1364345119", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1614", "id": 1364345119, "node_id": "IC_kwDOBm6k_c5RUkEf", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-12-23T21:27:10Z", "updated_at": "2022-12-23T21:27:10Z", "author_association": "CONTRIBUTOR", "body": "is this issue closed by #1893?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1115435536, "label": "Try again with SQLite codemirror support"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1796#issuecomment-1364345071", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1796", "id": 1364345071, "node_id": "IC_kwDOBm6k_c5RUkDv", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-12-23T21:27:02Z", "updated_at": "2022-12-23T21:27:02Z", "author_association": "CONTRIBUTOR", "body": "@simonw is this issue closed by #1893?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1355148385, "label": "Research an upgrade to CodeMirror 6"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1886#issuecomment-1321241426", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1886", "id": 1321241426, "node_id": "IC_kwDOBm6k_c5OwItS", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-11-20T20:58:54Z", "updated_at": "2022-11-20T20:58:54Z", "author_association": "CONTRIBUTOR", "body": "i wrote up a blog post of how i'm using it! https://bunkum.us/2022/11/20/mgdo-stack.html", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1447050738, "label": "Call for birthday presents: if you're using Datasette, let us know how you're using it here"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1890#issuecomment-1317889323", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1890", "id": 1317889323, "node_id": "IC_kwDOBm6k_c5OjWUr", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-11-17T00:47:36Z", "updated_at": "2022-11-17T00:47:36Z", "author_association": "CONTRIBUTOR", "body": "amazing! 
thanks @simonw ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1448143294, "label": "Autocomplete text entry for filter values that correspond to facets"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1870#issuecomment-1295667649", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1870", "id": 1295667649, "node_id": "IC_kwDOBm6k_c5NOlHB", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-29T00:52:43Z", "updated_at": "2022-10-29T00:53:43Z", "author_association": "CONTRIBUTOR", "body": "> Are you saying that I can build a container, but then when I run it and it does `datasette serve -i data.db ...` it will somehow modify the image, or create a new modified filesystem layer in the runtime environment, as a result of running that `serve` command?\r\n\r\nSomehow, `datasette serve -i data.db` will lead to the `data.db` being modified, which will trigger a [copy-on-write](https://docs.docker.com/storage/storagedriver/#the-copy-on-write-cow-strategy) of `data.db` into the read-write layer of the container.\r\n\r\nI don't understand **how** that happens.\r\n\r\nit kind of feels like a bug in sqlite, but i can't quite follow the sqlite code.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1426379903, "label": "don't use immutable=1, only mode=ro"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1870#issuecomment-1294285471", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1870", "id": 1294285471, "node_id": "IC_kwDOBm6k_c5NJTqf", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-28T01:06:03Z", "updated_at": "2022-10-28T01:06:03Z", "author_association": "CONTRIBUTOR", "body": "as far as i can tell, [this is where the \"immutable\" argument is used](https://github.com/sqlite/sqlite/blob/c97bb14fab566f6fa8d967c8fd1e90f3702d5b73/src/pager.c#L4926-L4931) in sqlite:\r\n\r\n```c\r\n pPager->noLock = sqlite3_uri_boolean(pPager->zFilename, \"nolock\", 0);\r\n if( (iDc & SQLITE_IOCAP_IMMUTABLE)!=0\r\n || sqlite3_uri_boolean(pPager->zFilename, \"immutable\", 0) ){\r\n vfsFlags |= SQLITE_OPEN_READONLY;\r\n goto act_like_temp_file;\r\n }\r\n```\r\n\r\nso it does set the read only flag, but then has a goto.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1426379903, "label": "don't use immutable=1, only mode=ro"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1870#issuecomment-1294237783", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1870", "id": 1294237783, "node_id": "IC_kwDOBm6k_c5NJIBX", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-27T23:42:18Z", "updated_at": "2022-10-27T23:42:18Z", "author_association": "CONTRIBUTOR", "body": "Relevant sqlite forum thread: https://www.sqlite.org/forum/forumpost/02f7bda329f41e30451472421cf9ce7f715b768ce3db02797db1768e47950d48", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1426379903, "label": "don't use immutable=1, only mode=ro"}, 
"performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1836#issuecomment-1272357976", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1836", "id": 1272357976, "node_id": "IC_kwDOBm6k_c5L1qRY", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-08T16:56:51Z", "updated_at": "2022-10-08T16:56:51Z", "author_association": "CONTRIBUTOR", "body": "when you are running from docker, you **always** will want to run as `mode=ro` because the same thing that is causing duplication in the inspect layer will cause duplication in the final container read/write layer when `datasette serve` runs.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1400374908, "label": "docker image is duplicating db files somehow"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1836#issuecomment-1271103097", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1836", "id": 1271103097, "node_id": "IC_kwDOBm6k_c5Lw355", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T04:43:41Z", "updated_at": "2022-10-07T04:43:41Z", "author_association": "CONTRIBUTOR", "body": "@simonw, should i open up a new issue for investigating the differences between \"immutable=1\" and \"mode=ro\" and possibly switching to \"mode=ro\". Or would you like to keep that conversation in this issue?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1400374908, "label": "docker image is duplicating db files somehow"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1480#issuecomment-1271101072", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1480", "id": 1271101072, "node_id": "IC_kwDOBm6k_c5Lw3aQ", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T04:39:10Z", "updated_at": "2022-10-07T04:39:10Z", "author_association": "CONTRIBUTOR", "body": "switching from `immutable=1` to `mode=ro` completely addressed this. see https://github.com/simonw/datasette/issues/1836#issuecomment-1271100651 for details.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1015646369, "label": "Exceeding Cloud Run memory limits when deploying a 4.8G database"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1836#issuecomment-1271100651", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1836", "id": 1271100651, "node_id": "IC_kwDOBm6k_c5Lw3Tr", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T04:38:14Z", "updated_at": "2022-10-07T04:38:14Z", "author_association": "CONTRIBUTOR", "body": "> yes, and i also think that this is causing the apparent memory problems in #1480. when the container starts up, it will make some operation on the database in `immutable` mode which apparently makes some small change to the db file. 
if that's so, then the db files will be copied to the read/write layer which counts against cloudrun's memory allocation!\r\n> \r\n> running a test of that now.\r\n\r\nthis completely addressed #1480 ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1400374908, "label": "docker image is duplicating db files somehow"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1301#issuecomment-1271035998", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1301", "id": 1271035998, "node_id": "IC_kwDOBm6k_c5Lwnhe", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T02:38:04Z", "updated_at": "2022-10-07T02:38:04Z", "author_association": "CONTRIBUTOR", "body": "the only mode that `publish cloudrun` supports right now is immutable", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 860722711, "label": "Publishing to cloudrun with immutable mode?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1836#issuecomment-1271020193", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1836", "id": 1271020193, "node_id": "IC_kwDOBm6k_c5Lwjqh", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T02:15:05Z", "updated_at": "2022-10-07T02:21:08Z", "author_association": "CONTRIBUTOR", "body": "when i hack the connect method to open non mutable files with \"mode=ro\" and not \"immutable=1\" https://github.com/simonw/datasette/blob/eff112498ecc499323c26612d707908831446d25/datasette/database.py#L79\r\n\r\nthen: \r\n\r\n```bash\r\n870 B RUN /bin/sh -c datasette inspect nlrb.db --inspect-file inspect-data.json\r\n```\r\n\r\nthe `datasette inspect` layer is only the size of the json file!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1400374908, "label": "docker image is duplicating db files somehow"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1836#issuecomment-1271008997", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1836", "id": 1271008997, "node_id": "IC_kwDOBm6k_c5Lwg7l", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T02:00:37Z", "updated_at": "2022-10-07T02:00:49Z", "author_association": "CONTRIBUTOR", "body": "yes, and i also think that this is causing the apparent memory problems in #1480. when the container starts up, it will make some operation on the database in `immutable` mode which apparently makes some small change to the db file. if that's so, then the db files will be copied to the read/write layer which counts against cloudrun's memory allocation!\r\n\r\nrunning a test of that now. 
", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1400374908, "label": "docker image is duplicating db files somehow"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1836#issuecomment-1271003212", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1836", "id": 1271003212, "node_id": "IC_kwDOBm6k_c5LwfhM", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T01:52:04Z", "updated_at": "2022-10-07T01:52:04Z", "author_association": "CONTRIBUTOR", "body": "and if we try immutable mode, which is how things are opened by `datasette inspect` we duplicate the files!!!\r\n\r\n```python\r\n# test_sql_immutable.py\r\nimport sqlite3\r\nimport sys\r\n\r\ndb_name = sys.argv[1]\r\nconn = sqlite3.connect(f'file:/app/{db_name}?immutable=1', uri=True)\r\ncur = conn.cursor()\r\ncur.execute('select count(*) from filing')\r\nprint(cur.fetchone())\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1400374908, "label": "docker image is duplicating db files somehow"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1836#issuecomment-1270992795", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1836", "id": 1270992795, "node_id": "IC_kwDOBm6k_c5Lwc-b", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T01:29:15Z", "updated_at": "2022-10-07T01:50:14Z", "author_association": "CONTRIBUTOR", "body": "fascinatingly, telling python to open sqlite in read only mode makes this layer have a size of 0\r\n\r\n```python\r\n# test_sql_ro.py\r\nimport sqlite3\r\nimport sys\r\n\r\ndb_name = sys.argv[1]\r\nconn = sqlite3.connect(f'file:/app/{db_name}?mode=ro', uri=True)\r\ncur = conn.cursor()\r\ncur.execute('select count(*) from filing')\r\nprint(cur.fetchone())\r\n```\r\n\r\nthat's quite weird because setting the file permissions to read only didn't do anything. (on reflection, that chmod isn't doing anything because the dockerfile commands are run as root)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1400374908, "label": "docker image is duplicating db files somehow"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1836#issuecomment-1270988081", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1836", "id": 1270988081, "node_id": "IC_kwDOBm6k_c5Lwb0x", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T01:19:01Z", "updated_at": "2022-10-07T01:27:35Z", "author_association": "CONTRIBUTOR", "body": "okay, some progress!! 
running some sql against a database file causes that file to get duplicated even if it doesn't apparently change the file.\r\n\r\nmake a little test script like this:\r\n\r\n```python\r\n# test_sql.py\r\nimport sqlite3\r\nimport sys\r\n\r\ndb_name = sys.argv[1]\r\nconn = sqlite3.connect(f'file:/app/{db_name}', uri=True)\r\ncur = conn.cursor()\r\ncur.execute('select count(*) from filing')\r\nprint(cur.fetchone())\r\n```\r\n\r\nthen \r\n\r\n```docker\r\nRUN python test_sql.py nlrb.db\r\n```\r\n\r\nproduced a layer that's the same size as `nlrb.db`!!\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1400374908, "label": "docker image is duplicating db files somehow"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1836#issuecomment-1270936982", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1836", "id": 1270936982, "node_id": "IC_kwDOBm6k_c5LwPWW", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T00:52:41Z", "updated_at": "2022-10-07T00:52:41Z", "author_association": "CONTRIBUTOR", "body": "it's not that the inspect command is somehow changing the db files. if i set them to only read-only, the \"inspect\" layer still has the same very large size.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1400374908, "label": "docker image is duplicating db files somehow"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1836#issuecomment-1270923537", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1836", "id": 1270923537, "node_id": "IC_kwDOBm6k_c5LwMER", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T00:46:08Z", "updated_at": "2022-10-07T00:46:08Z", "author_association": "CONTRIBUTOR", "body": "i thought it was maybe to do with reading through all the files, but that does not seem to be the case\r\n\r\nif i make a little test file like:\r\n\r\n```python\r\n# test_read.py\r\nimport hashlib\r\nimport sys\r\nimport pathlib\r\n\r\nHASH_BLOCK_SIZE = 1024 * 1024\r\n\r\ndef inspect_hash(path):\r\n \"\"\"Calculate the hash of a database, efficiently.\"\"\"\r\n m = hashlib.sha256()\r\n with path.open(\"rb\") as fp:\r\n while True:\r\n data = fp.read(HASH_BLOCK_SIZE)\r\n if not data:\r\n break\r\n m.update(data)\r\n\r\n return m.hexdigest()\r\n\r\ninspect_hash(pathlib.Path(sys.argv[1]))\r\n```\r\n\r\nthen a line in the Dockerfile like\r\n\r\n```docker\r\nRUN python test_read.py nlrb.db && echo \"[]\" > /etc/inspect.json\r\n```\r\n\r\njust produes a layer of `3B`\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1400374908, "label": "docker image is duplicating db files somehow"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1480#issuecomment-1269847461", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1480", "id": 1269847461, "node_id": "IC_kwDOBm6k_c5LsFWl", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-06T11:21:49Z", "updated_at": "2022-10-06T11:21:49Z", "author_association": "CONTRIBUTOR", "body": "thanks @simonw, i'll spend a little more time trying to figure out why this isn't working on 
cloudrun, and then will flip over to fly if i can't.\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1015646369, "label": "Exceeding Cloud Run memory limits when deploying a 4.8G database"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1480#issuecomment-1268629159", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1480", "id": 1268629159, "node_id": "IC_kwDOBm6k_c5Lnb6n", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-05T16:00:55Z", "updated_at": "2022-10-05T16:00:55Z", "author_association": "CONTRIBUTOR", "body": "as a next step, i'll fetch the docker image from the google registry, and see what memory and disk usage looks like when i run it locally.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1015646369, "label": "Exceeding Cloud Run memory limits when deploying a 4.8G database"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1480#issuecomment-1268613335", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1480", "id": 1268613335, "node_id": "IC_kwDOBm6k_c5LnYDX", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-05T15:45:49Z", "updated_at": "2022-10-05T15:45:49Z", "author_association": "CONTRIBUTOR", "body": "running into this as i continue to grow my labor data warehouse.\r\n\r\nHere a CloudRun PM says the container size should **not** count against memory: https://stackoverflow.com/a/56570717", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1015646369, "label": "Exceeding Cloud Run memory limits when deploying a 4.8G database"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1062#issuecomment-1260909128", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1062", "id": 1260909128, "node_id": "IC_kwDOBm6k_c5LJ_JI", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-28T13:22:53Z", "updated_at": "2022-09-28T14:09:54Z", "author_association": "CONTRIBUTOR", "body": "if you went this route:\r\n\r\n```python\r\nwith sqlite_timelimit(conn, time_limit_ms):\r\n c.execute(query)\r\n for chunk in c.fetchmany(chunk_size):\r\n yield from chunk\r\n```\r\n\r\nthen `time_limit_ms` would probably have to be greatly extended, because the time spent in the loop will depend on the downstream processing.\r\n\r\ni wonder if this was why you were thinking this feature would need a dedicated connection?\r\n\r\n---\r\n\r\nreading more, there's no real limit i can find on the number of active cursors (or more precisely active prepared statements objects, because sqlite doesn't really have cursors). 
\r\n\r\nmaybe something like this would be okay?\r\n\r\n```python\r\nwith sqlite_timelimit(conn, time_limit_ms):\r\n c.execute(query)\r\n # step through at least one to evaluate the statement, not sure if this is necessary\r\n yield c.execute.fetchone()\r\nfor chunk in c.fetchmany(chunk_size):\r\n yield from chunk\r\n```\r\n\r\nthis seems quite weird that there's not more of limit of the number of active prepared statements, but i haven't been able to find one.\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 732674148, "label": "Refactor .csv to be an output renderer - and teach register_output_renderer to stream all rows"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1062#issuecomment-1260829829", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1062", "id": 1260829829, "node_id": "IC_kwDOBm6k_c5LJryF", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-28T12:27:19Z", "updated_at": "2022-09-28T12:27:19Z", "author_association": "CONTRIBUTOR", "body": "for teaching `register_output_renderer` to stream it seems like the two options are to\r\n\r\n1. a [nested query technique ](https://github.com/simonw/datasette/issues/526#issuecomment-505162238)to paginate through\r\n2. a fetching model that looks like something\r\n```python\r\nwith sqlite_timelimit(conn, time_limit_ms):\r\n c.execute(query)\r\n for chunk in c.fetchmany(chunk_size):\r\n yield from chunk\r\n```\r\ncurrently `db.execute` is not a generator, so this would probably need a new method?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 732674148, "label": "Refactor .csv to be an output renderer - and teach register_output_renderer to stream all rows"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/526#issuecomment-1259718517", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/526", "id": 1259718517, "node_id": "IC_kwDOBm6k_c5LFcd1", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-27T16:02:51Z", "updated_at": "2022-09-27T16:04:46Z", "author_association": "CONTRIBUTOR", "body": "i think that `max_returned_rows` **is** a defense mechanism, just not for connection exhaustion. 
`max_returned_rows` is a defense mechanism against **memory bombs**.\r\n\r\nif you are potentially yielding out hundreds of thousands or even millions of rows, you need to be quite careful about data flow to not run out of memory on the server, or on the client.\r\n\r\nyou have a lot of places in your code that are protective of that right now, but `max_returned_rows` acts as the final backstop.\r\n\r\nso, given that, it makes sense to have removing `max_returned_rows` altogether be a non-goal, but instead allow for for specific codepaths (like streaming csv's) be able to bypass.\r\n\r\nthat could dramatically lower the surface area for a memory-bomb attack.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 459882902, "label": "Stream all results for arbitrary SQL and canned queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/526#issuecomment-1258910228", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/526", "id": 1258910228, "node_id": "IC_kwDOBm6k_c5LCXIU", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-27T03:11:07Z", "updated_at": "2022-09-27T03:11:07Z", "author_association": "CONTRIBUTOR", "body": "i think this feature would be safe, as its really only the time limit that can, and imo, should protect against long running queries, as it is pretty easy to make very expensive queries that don't return many rows.\r\n\r\nmoving away from `max_returned_rows` will requires some thinking about:\r\n\r\n1. memory usage and data flows to handle potentially very large result sets\r\n2. how to avoid rendering tens or hundreds of thousands of [html rows](#1655).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 459882902, "label": "Stream all results for arbitrary SQL and canned queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/526#issuecomment-1258878311", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/526", "id": 1258878311, "node_id": "IC_kwDOBm6k_c5LCPVn", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-27T02:19:48Z", "updated_at": "2022-09-27T02:19:48Z", "author_association": "CONTRIBUTOR", "body": "this sql query doesn't trip up `maximum_returned_rows` but does timeout\r\n\r\n```sql\r\nwith recursive counter(x) as (\r\n select 0\r\n union\r\n select x + 1 from counter\r\n )\r\n select * from counter LIMIT 10 OFFSET 100000000 \r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 459882902, "label": "Stream all results for arbitrary SQL and canned queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/526#issuecomment-1258871525", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/526", "id": 1258871525, "node_id": "IC_kwDOBm6k_c5LCNrl", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-27T02:09:32Z", "updated_at": "2022-09-27T02:14:53Z", "author_association": "CONTRIBUTOR", "body": "thanks @simonw, i learned something i didn't know about sqlite's execution model!\r\n\r\n> Imagine if Datasette CSVs did allow unlimited retrievals. 
Someone could hit the CSV endpoint for that recursive query and tie up Datasette's SQL connection effectively forever.\r\n\r\nwhy wouldn't the `sqlite_timelimit` guard prevent that?\r\n\r\n--- \r\non my local version which has the code to [turn off truncations for query csv](#1820), `sqlite_timelimit` does protect me.\r\n\r\n![Screenshot 2022-09-26 at 22-14-31 Error 500](https://user-images.githubusercontent.com/536941/192415680-94b32b7f-868f-4b89-8194-5752d45f6009.png)\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 459882902, "label": "Stream all results for arbitrary SQL and canned queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/526#issuecomment-1258849766", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/526", "id": 1258849766, "node_id": "IC_kwDOBm6k_c5LCIXm", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-27T01:27:03Z", "updated_at": "2022-09-27T01:27:03Z", "author_association": "CONTRIBUTOR", "body": "i agree with that concern! but if i'm understanding the code correctly, `maximum_returned_rows` does not protect against long-running queries in any way.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 459882902, "label": "Stream all results for arbitrary SQL and canned queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1820#issuecomment-1258803261", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1820", "id": 1258803261, "node_id": "IC_kwDOBm6k_c5LB9A9", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-27T00:03:09Z", "updated_at": "2022-09-27T00:03:09Z", "author_association": "CONTRIBUTOR", "body": "the pattern in this PR `max_returned_rows` control the maximum rows rendered through html and json, and the csv render bypasses that.\r\n\r\ni think it would be better to have each of these different query renderers have more direct control for how many rows to fetch, instead of relying on the internals of the `execute` method.\r\n\r\ngenerally, users will not want to paginate through tens of thousands of results, but often will want to download a full query as json or as csv. 
", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1386456717, "label": "[SPIKE] Don't truncate query CSVs"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/526#issuecomment-1258337011", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/526", "id": 1258337011, "node_id": "IC_kwDOBm6k_c5LALLz", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-26T16:49:48Z", "updated_at": "2022-09-26T16:49:48Z", "author_association": "CONTRIBUTOR", "body": "i think the smallest change that gets close to what i want is to change the behavior so that `max_returned_rows` is not applied in the `execute` method when we are are asking for a csv of query.\r\n\r\nthere are some infelicities for that approach, but i'll make a PR to make it easier to discuss.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 459882902, "label": "Stream all results for arbitrary SQL and canned queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/526#issuecomment-1258167564", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/526", "id": 1258167564, "node_id": "IC_kwDOBm6k_c5K_h0M", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-26T14:57:44Z", "updated_at": "2022-09-26T15:08:36Z", "author_association": "CONTRIBUTOR", "body": "reading the database execute method i have a few questions.\r\n\r\nhttps://github.com/simonw/datasette/blob/cb1e093fd361b758120aefc1a444df02462389a3/datasette/database.py#L229-L242\r\n\r\n---\r\nunless i'm missing something (which is very likely!!), the `max_returned_rows` argument doesn't actually offer any protections against running very expensive queries. \r\n\r\nIt's not like adding a `LIMIT max_rows` argument. it make sense that it isn't because, the query could already have an `LIMIT` argument. Doing something like `select * from (query) limit {max_returned_rows}` **might** be protective but wouldn't always.\r\n\r\nInstead the code executes the full original query, and if still has time it fetches out the first `max_rows + 1` rows. \r\n\r\nthis *does* offer some protection of memory exhaustion, as you won't hydrate a huge result set into python (however, there are [data flow patterns](https://github.com/simonw/datasette/issues/1727#issuecomment-1258129113) that could avoid that too)\r\n\r\ngiven the current architecture, i don't see how creating a new connection would be use?\r\n\r\n---\r\n\r\nIf we just removed the `max_return_rows` limitation, then i think most things would be fine **except** for the QueryViews. 
Right now rendering, just [5000 rows takes a lot of client-side memory](https://github.com/simonw/datasette/issues/1655) so some form of pagination would be required.\r\n\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 459882902, "label": "Stream all results for arbitrary SQL and canned queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1655#issuecomment-1258166572", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1655", "id": 1258166572, "node_id": "IC_kwDOBm6k_c5K_hks", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-26T14:57:04Z", "updated_at": "2022-09-26T14:57:04Z", "author_association": "CONTRIBUTOR", "body": "I think that paginating, even in javascript, could be very helpful. Maybe render json or csv into the page and let javascript loading that into the dom?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1163369515, "label": "query result page is using 400mb of browser memory 40x size of html page and 400x size of csv data"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1727#issuecomment-1258129113", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1727", "id": 1258129113, "node_id": "IC_kwDOBm6k_c5K_YbZ", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-26T14:30:11Z", "updated_at": "2022-09-26T14:48:31Z", "author_association": "CONTRIBUTOR", "body": "from your analysis, it seems like the GIL is blocking on loading of the data from sqlite to python, (particularly in the `fetchmany` call)\r\n\r\nthis is probably a simplistic idea, but what if you had the python code in the `execute` method iterate over the cursor and yield out rows or small chunks of rows.\r\n\r\nsomething like: \r\n```python\r\n with sqlite_timelimit(conn, time_limit_ms):\r\n try:\r\n cursor = conn.cursor()\r\n cursor.execute(sql, params if params is not None else {})\r\n except:\r\n ...\r\n max_returned_rows = self.ds.max_returned_rows\r\n if max_returned_rows == page_size:\r\n max_returned_rows += 1\r\n if max_returned_rows and truncate:\r\n for i, row in enumerate(cursor):\r\n yield row\r\n if i == max_returned_rows - 1:\r\n break\r\n else:\r\n for row in cursor:\r\n yield row\r\n truncated = False \r\n```\r\n\r\nthis kind of thing works well with a postgres server side cursor, but i'm not sure if it will hold for sqlite. \r\n\r\nyou would still spend about the same amount of time in python and would be contending for the gil, but it would be could be non blocking.\r\n\r\ndepending on the data flow, this could also some benefit for memory. 
(data stays in more compact sqlite-land until you need it)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1217759117, "label": "Research: demonstrate if parallel SQL queries are worthwhile"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/526#issuecomment-1254064260", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/526", "id": 1254064260, "node_id": "IC_kwDOBm6k_c5Kv4CE", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-21T18:17:04Z", "updated_at": "2022-09-21T18:18:01Z", "author_association": "CONTRIBUTOR", "body": "hi @simonw, this is becoming more of a bother for my [labor data warehouse](https://labordata.bunkum.us/). Is there any research or a spike i could do that would help you investigate this issue?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 459882902, "label": "Stream all results for arbitrary SQL and canned queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1779#issuecomment-1214437408", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1779", "id": 1214437408, "node_id": "IC_kwDOBm6k_c5IYtgg", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-08-14T19:42:58Z", "updated_at": "2022-08-14T19:42:58Z", "author_association": "CONTRIBUTOR", "body": "thanks @simonw!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1334628400, "label": "google cloudrun updated their limits on maxscale based on memory and cpu count"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1779#issuecomment-1210675046", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1779", "id": 1210675046, "node_id": "IC_kwDOBm6k_c5IKW9m", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-08-10T13:28:37Z", "updated_at": "2022-08-10T13:28:37Z", "author_association": "CONTRIBUTOR", "body": "maybe a simpler solution is to set the maxscale to like 2? 
since datasette is not set up to make use of container scaling anyway?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1334628400, "label": "google cloudrun updated their limits on maxscale based on memory and cpu count"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/456#issuecomment-1190277829", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/456", "id": 1190277829, "node_id": "IC_kwDOCGYnMM5G8jLF", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-07-20T13:19:15Z", "updated_at": "2022-07-20T13:19:15Z", "author_association": "CONTRIBUTOR", "body": "hadley wickham's melt and reshape could be good inspo: http://had.co.nz/reshape/introduction.pdf", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1310243385, "label": "feature request: pivot command"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/456#issuecomment-1190272780", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/456", "id": 1190272780, "node_id": "IC_kwDOCGYnMM5G8h8M", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-07-20T13:14:54Z", "updated_at": "2022-07-20T13:14:54Z", "author_association": "CONTRIBUTOR", "body": "for example, i have data on votes that look like this:\r\n\r\n| ballot_id | option_id | choice |\r\n|-|-|-|\r\n| 1 | 1 | 0 | \r\n| 1 | 2 | 1 |\r\n| 1 | 3 | 0 |\r\n| 1 | 4 | 1 |\r\n| 2 | 1 | 1 |\r\n| 2 | 2 | 0 |\r\n| 2 | 3 | 1 |\r\n| 2 | 4 | 0 |\r\n\r\nand i want to reshape from this long form to this wide form:\r\n\r\n| ballot_id | option_id_1 | option_id_2 | option_id_3 | option_id_ 4|\r\n|-|-|-|-| -|\r\n| 1 | 0 | 1 | 0 | 1 | \r\n| 2 | 1 | 0 | 1| 0 | \r\n\r\ni could do such a think like this.\r\n\r\n```sql\r\nselect ballot_id, \r\nsum(choice) filter (where option_id = 1) as option_id_1,\r\nsum(choice) filter (where option_id = 2) as option_id_2,\r\nsum(choice) filter (where option_id = 3) as option_id_3,\r\nsum(choice) filter (where option_id = 4) as option_id_4\r\nfrom vote\r\ngroup by ballot_id\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1310243385, "label": "feature request: pivot command"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/423#issuecomment-1189010812", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/423", "id": 1189010812, "node_id": "IC_kwDOCGYnMM5G3t18", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-07-19T12:47:39Z", "updated_at": "2022-07-19T12:47:39Z", "author_association": "CONTRIBUTOR", "body": "just ran into this!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1199158210, "label": ".extract() doesn't set foreign key when extracted columns contain NULL value"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1713#issuecomment-1103312860", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1713", "id": 1103312860, "node_id": "IC_kwDOBm6k_c5Bwzfc", "user": {"value": 536941, "label": "fgregg"}, 
"created_at": "2022-04-20T00:52:19Z", "updated_at": "2022-04-20T00:52:19Z", "author_association": "CONTRIBUTOR", "body": "feels related to #1402 ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1203943272, "label": "Datasette feature for publishing snapshots of query results"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1549#issuecomment-1087428593", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1549", "id": 1087428593, "node_id": "IC_kwDOBm6k_c5A0Nfx", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-04-04T11:17:13Z", "updated_at": "2022-04-04T11:17:13Z", "author_association": "CONTRIBUTOR", "body": "another way to get the behavior of downloading the file is to use the download attribute of the anchor tag\r\n\r\nhttps://developer.mozilla.org/en-US/docs/Web/HTML/Element/a#attr-download", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1077620955, "label": "Redesign CSV export to improve usability"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1684#issuecomment-1078126065", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1684", "id": 1078126065, "node_id": "IC_kwDOBm6k_c5AQuXx", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-03-24T20:08:56Z", "updated_at": "2022-03-24T20:13:19Z", "author_association": "CONTRIBUTOR", "body": "would be nice if the behavior was\r\n\r\n1. try to facet all the columns\r\n2. for bigger tables try to facet the indexed columns\r\n3. for the biggest tables, turn off autofacetting completely\r\n\r\nThis is based on my assumption that what determines autofaceting is the rarity of unique values. 
Which may not be true!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1179998071, "label": "Mechanism for disabling faceting on large tables only"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1581#issuecomment-1077047295", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1581", "id": 1077047295, "node_id": "IC_kwDOBm6k_c5AMm__", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-03-24T04:08:18Z", "updated_at": "2022-03-24T04:08:18Z", "author_association": "CONTRIBUTOR", "body": "this has been addressed by the datasette-hashed-urls plugin", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1089529555, "label": "when hashed urls are turned on, the _memory db has improperly long-lived cache expiry"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1582#issuecomment-1077047152", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1582", "id": 1077047152, "node_id": "IC_kwDOBm6k_c5AMm9w", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-03-24T04:07:58Z", "updated_at": "2022-03-24T04:07:58Z", "author_association": "CONTRIBUTOR", "body": "this has been obviated by the datasette-hashed-urls plugin", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1090055810, "label": "don't set far expiry if hash is '000'"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1655#issuecomment-1062450649", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1655", "id": 1062450649, "node_id": "IC_kwDOBm6k_c4_U7XZ", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-03-09T01:10:46Z", "updated_at": "2022-03-09T01:10:46Z", "author_association": "CONTRIBUTOR", "body": "i increased the max_returned_row, because I have some scripts that get CSVs from this site, and this makes doing pagination of CSVs less annoying for many cases. i know that's streaming csvs is something you are hoping to address in 1.0. let me know if there's anything i can do to help with that.\r\n\r\nas for what if anything can be done about the size of the dom, I don't have any ideas right now, but i'll poke around.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1163369515, "label": "query result page is using 400mb of browser memory 40x size of html page and 400x size of csv data"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1641#issuecomment-1049879118", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1641", "id": 1049879118, "node_id": "IC_kwDOBm6k_c4-k-JO", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-02-24T13:49:26Z", "updated_at": "2022-02-24T13:49:26Z", "author_association": "CONTRIBUTOR", "body": "maybe worth considering adding buttons for paren, asterisk, etc. 
under the input text box on mobile?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1149310456, "label": "Tweak mobile keyboard settings"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/403#issuecomment-1033332570", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/403", "id": 1033332570, "node_id": "IC_kwDOCGYnMM49l2da", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-02-09T04:22:43Z", "updated_at": "2022-02-09T04:22:43Z", "author_association": "CONTRIBUTOR", "body": "dddoooope", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1126692066, "label": "Document how to add a primary key to a rowid table using `sqlite-utils transform --pk`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/403#issuecomment-1032126353", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/403", "id": 1032126353, "node_id": "IC_kwDOCGYnMM49hP-R", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-02-08T01:45:15Z", "updated_at": "2022-02-08T01:45:31Z", "author_association": "CONTRIBUTOR", "body": "you can hack something like this to achieve this result:\r\n\r\n`sqlite-utils convert my_database my_table rowid \"{'id': value}\" --multi`", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1126692066, "label": "Document how to add a primary key to a rowid table using `sqlite-utils transform --pk`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/26#issuecomment-1032120014", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/26", "id": 1032120014, "node_id": "IC_kwDOCGYnMM49hObO", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-02-08T01:32:34Z", "updated_at": "2022-02-08T01:32:34Z", "author_association": "CONTRIBUTOR", "body": "if you are curious about prior art, https://github.com/jsnell/json-to-multicsv is really good!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 455486286, "label": "Mechanism for turning nested JSON into foreign keys / many-to-many"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1009548580", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/365", "id": 1009548580, "node_id": "IC_kwDOCGYnMM48LH0k", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-01-11T02:43:34Z", "updated_at": "2022-01-11T02:43:34Z", "author_association": "CONTRIBUTOR", "body": "thanks so much! 
always a pleasure to see how you work through these things", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1096558279, "label": "create-index should run analyze after creating index"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008275546", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/365", "id": 1008275546, "node_id": "IC_kwDOCGYnMM48GRBa", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-01-09T11:01:15Z", "updated_at": "2022-01-09T13:37:51Z", "author_association": "CONTRIBUTOR", "body": "i don\u2019t want to be such a partisan for analyze, but the query planner deciding *not* to use an index based on information collected by analyze is not necessarily a bug, but could be the correct choice.\r\n\r\nthe original poster in that stack overflow doesn\u2019t say there\u2019s a performance regression ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1096558279, "label": "create-index should run analyze after creating index"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008166084", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/365", "id": 1008166084, "node_id": "IC_kwDOCGYnMM48F2TE", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-01-08T22:32:47Z", "updated_at": "2022-01-08T22:32:47Z", "author_association": "CONTRIBUTOR", "body": "or using \u201c pragma optimize\u201d", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1096558279, "label": "create-index should run analyze after creating index"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008164786", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/365", "id": 1008164786, "node_id": "IC_kwDOCGYnMM48F1-y", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-01-08T22:24:19Z", "updated_at": "2022-01-08T22:24:19Z", "author_association": "CONTRIBUTOR", "body": "the out-of-date scenario you describe could be addressed by automatically adding an analyze to the insert or convert commands if they implicate an index", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1096558279, "label": "create-index should run analyze after creating index"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008164116", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/365", "id": 1008164116, "node_id": "IC_kwDOCGYnMM48F10U", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-01-08T22:18:57Z", "updated_at": "2022-01-08T22:18:57Z", "author_association": "CONTRIBUTOR", "body": "the table with the query ran so bad was about 50k. \r\n\r\ni think the scenario should not be worse than no stats. 
\r\n\r\ni also did not know that sqlite was so different from postgres and needed an explicit analyze call.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1096558279, "label": "create-index should run analyze after creating index"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008161965", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/365", "id": 1008161965, "node_id": "IC_kwDOCGYnMM48F1St", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-01-08T22:02:56Z", "updated_at": "2022-01-08T22:02:56Z", "author_association": "CONTRIBUTOR", "body": "for options 2 and 3, i would worry about discoverablity. \r\n\r\nin other db\u2019s it is not necessary to explicitly call analyze for most indices. ie for postgres\r\n\r\n> The system regularly collects statistics on all of a table's columns. Newly-created non-expression indexes can immediately use these statistics to determine an index's usefulness.\r\n\r\ni suppose i would propose raising a warning if the stats table is created that explains what is going on and informs users about a \u2014no-analyze argument.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1096558279, "label": "create-index should run analyze after creating index"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1574#issuecomment-1007844190", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1574", "id": 1007844190, "node_id": "IC_kwDOBm6k_c48Ente", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-01-08T00:42:12Z", "updated_at": "2022-01-08T00:42:12Z", "author_association": "CONTRIBUTOR", "body": "is there a reason to not always use the slim option?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1084193403, "label": "introduce new option for datasette package to use a slim base image"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1007636709", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/365", "id": 1007636709, "node_id": "IC_kwDOCGYnMM48D1Dl", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-01-07T18:28:33Z", "updated_at": "2022-01-07T18:29:43Z", "author_association": "CONTRIBUTOR", "body": "i added an index to one table with sqlite-utils, and then a query that used to take about 1 second started taking hundreds of seconds. 
\r\n\r\nrunning analyze got me back to sub second speed.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1096558279, "label": "create-index should run analyze after creating index"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1583#issuecomment-1002825217", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1583", "id": 1002825217, "node_id": "IC_kwDOBm6k_c47xeYB", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2021-12-30T00:34:16Z", "updated_at": "2021-12-30T00:34:16Z", "author_association": "CONTRIBUTOR", "body": "if that is not desirable, it might be good to document that users might want to set up a lifecycle rule to automatically delete these build artifacts. something like https://stackoverflow.com/questions/59937542/can-i-delete-container-images-from-google-cloud-storage-artifacts-bucket", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1090810196, "label": "consider adding deletion step of cloudbuild artifacts to gcloud publish"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1561#issuecomment-997128712", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1561", "id": 997128712, "node_id": "IC_kwDOBm6k_c47bvoI", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2021-12-18T02:35:48Z", "updated_at": "2021-12-18T02:35:48Z", "author_association": "CONTRIBUTOR", "body": "interesting! i love this feature. this + full caching with cloudflare is really super!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1082765654, "label": "add hash id to \"_memory\" url if hashed url mode is turned on and crossdb is also turned on"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/526#issuecomment-993078038", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/526", "id": 993078038, "node_id": "IC_kwDOBm6k_c47MSsW", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2021-12-14T01:46:52Z", "updated_at": "2021-12-14T01:46:52Z", "author_association": "CONTRIBUTOR", "body": "the nested query idea is very nice, and i stole if for [my client side paginator](https://observablehq.com/d/1d5da3a3c3f2f347#DatasetteClient). 
However, it won't do the right thing if the original query orders by random().\r\n\r\nIf you go the nested query route, maybe raise a 4XX status code if the query has such a clause?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 459882902, "label": "Stream all results for arbitrary SQL and canned queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1553#issuecomment-993014772", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1553", "id": 993014772, "node_id": "IC_kwDOBm6k_c47MDP0", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2021-12-13T23:46:18Z", "updated_at": "2021-12-13T23:46:18Z", "author_association": "CONTRIBUTOR", "body": "these headers would also be relevant for json exports of custom queries", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1079111498, "label": "if csv export is truncated in non streaming mode set informative response header"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1553#issuecomment-992986587", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1553", "id": 992986587, "node_id": "IC_kwDOBm6k_c47L8Xb", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2021-12-13T22:57:04Z", "updated_at": "2021-12-13T22:57:04Z", "author_association": "CONTRIBUTOR", "body": "would also be good if the header said the what the max row limit was", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1079111498, "label": "if csv export is truncated in non streaming mode set informative response header"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/526#issuecomment-992971072", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/526", "id": 992971072, "node_id": "IC_kwDOBm6k_c47L4lA", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2021-12-13T22:29:34Z", "updated_at": "2021-12-13T22:29:34Z", "author_association": "CONTRIBUTOR", "body": "just came by to open this issue. 
would make my data analysis in observable a lot better!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 459882902, "label": "Stream all results for arbitrary SQL and canned queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1549#issuecomment-991754237", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1549", "id": 991754237, "node_id": "IC_kwDOBm6k_c47HPf9", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2021-12-11T19:14:39Z", "updated_at": "2021-12-11T19:14:39Z", "author_association": "CONTRIBUTOR", "body": "that option is not available on [custom queries](https://labordata.bunkum.us/odpr-962a140?sql=with+local_union_filings+as+%28%0D%0A++select+*+from+lm_data+%0D%0A++where%0D%0A++++yr_covered+%3E+cast%28strftime%28%27%25Y%27%2C+%27now%27%2C+%27-5+years%27%29+as+int%29%0D%0A++++and+desig_name+%3D+%27LU%27%0D%0A++order+by+yr_covered+desc%0D%0A%29%2C%0D%0Amost_recent_filing+as+%28%0D%0A++select%0D%0A++++*%0D%0A++from+local_union_filings%0D%0A++group+by%0D%0A++++f_num%0D%0A%29%0D%0Aselect%0D%0A++*%0D%0Afrom%0D%0A++most_recent_filing%0D%0Awhere%0D%0A++next_election+%3E%3D+strftime%28%27%25Y-%25m%27%2C+%27now%27%29%0D%0A++and+next_election+%3C+strftime%28%27%25Y-%25m%27%2C+%27now%27%2C+%27%2B1+year%27%29%0D%0Aorder+by%0D%0A++members+desc%3B).\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1077620955, "label": "Redesign CSV export to improve usability"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/353#issuecomment-991405755", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/353", "id": 991405755, "node_id": "IC_kwDOCGYnMM47F6a7", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2021-12-11T01:38:29Z", "updated_at": "2021-12-11T01:38:29Z", "author_association": "CONTRIBUTOR", "body": "wow! that's awesome! 
thanks so much, @simonw!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1077102934, "label": "Allow passing a file of code to \"sqlite-utils convert\""}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/26#issuecomment-964205475", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/26", "id": 964205475, "node_id": "IC_kwDOCGYnMM45eJuj", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2021-11-09T14:31:29Z", "updated_at": "2021-11-09T14:31:29Z", "author_association": "CONTRIBUTOR", "body": "i was just reaching for a tool to do this this morning", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 455486286, "label": "Mechanism for turning nested JSON into foreign keys / many-to-many"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1495#issuecomment-954384496", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1495", "id": 954384496, "node_id": "IC_kwDOBm6k_c444sBw", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2021-10-29T03:07:13Z", "updated_at": "2021-10-29T03:07:13Z", "author_association": "CONTRIBUTOR", "body": "okay @simonw, made the requested changes. tests are running locally. i think this is ready for you to look at again.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1033678984, "label": "Allow routes to have extra options"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1284#issuecomment-949604763", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1284", "id": 949604763, "node_id": "IC_kwDOBm6k_c44mdGb", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2021-10-22T12:54:34Z", "updated_at": "2021-10-22T12:54:34Z", "author_association": "CONTRIBUTOR", "body": "i'm going to take a swing at this today. we'll see.", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 845794436, "label": "Feature or Documentation Request: Individual table as home page template"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1419#issuecomment-893114612", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1419", "id": 893114612, "node_id": "IC_kwDOBm6k_c41O9j0", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2021-08-05T02:29:06Z", "updated_at": "2021-08-05T02:29:06Z", "author_association": "CONTRIBUTOR", "body": "there's a lot of complexity here, that's probably not worth addressing. 
i got what i needed by patching the dockerfile that cloudrun uses to install a newer version of sqlite.\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 959710008, "label": "`publish cloudrun` should deploy a more recent SQLite version"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1419#issuecomment-892276385", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1419", "id": 892276385, "node_id": "IC_kwDOBm6k_c41Lw6h", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2021-08-04T00:58:49Z", "updated_at": "2021-08-04T00:58:49Z", "author_association": "CONTRIBUTOR", "body": "yes, [filter clause on aggregate queries were added to sqlite3 in 3.30](https://www.sqlite.org/releaselog/3_30_1.html)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 959710008, "label": "`publish cloudrun` should deploy a more recent SQLite version"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1401#issuecomment-884910320", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1401", "id": 884910320, "node_id": "IC_kwDOBm6k_c40vqjw", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2021-07-22T13:26:01Z", "updated_at": "2021-07-22T13:26:01Z", "author_association": "CONTRIBUTOR", "body": "ordered lists didn't work either, btw", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 950664971, "label": "unordered list is not rendering bullet points in description_html on database page"}, "performed_via_github_app": null}
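The `sqlite-utils convert my_database my_table rowid "{'id': value}" --multi` hack quoted above copies the implicit rowid into a new `id` column so it can become a real primary key. A minimal Python sketch of the same idea using the sqlite-utils library; the file name `my_database.db` and table `my_table` are hypothetical placeholders carried over from that comment, not names from an actual project:

```python
import sqlite_utils

db = sqlite_utils.Database("my_database.db")  # hypothetical database file
table = db["my_table"]                        # hypothetical rowid table

# Materialize the implicit rowid into an explicit integer column.
table.add_column("id", int)
db.execute("UPDATE my_table SET id = rowid")
db.conn.commit()

# Rebuild the table with that column as its primary key.
table.transform(pk="id")
```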
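On the create-index/ANALYZE thread above: the slowdown described (a roughly one-second query taking hundreds of seconds after an index was added) happens because the planner has no statistics for the new index and can pick it even when a scan would be faster; running `ANALYZE`, or the gentler `PRAGMA optimize`, refreshes those statistics. A small self-contained sqlite3 sketch with a stand-in table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, category TEXT)")  # stand-in table
conn.executemany("INSERT INTO events (category) VALUES (?)", [("a",), ("b",), ("a",)])

# Right after CREATE INDEX the planner has no statistics for the new index,
# so on a skewed real-world table it can make a poor plan choice.
conn.execute("CREATE INDEX idx_events_category ON events(category)")

# Option 1: rebuild statistics (populates sqlite_stat1).
conn.execute("ANALYZE")
# Option 2: let SQLite re-analyze only what it thinks is stale.
conn.execute("PRAGMA optimize")

conn.commit()
```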
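For the streaming/pagination comment above (wrapping the user's SQL in a nested query breaks when the inner query uses `order by random()`), here is a sketch of the suggested 4xx guard. The function name and the crude regex check are illustrative only, not Datasette's actual implementation:

```python
import re

RANDOM_ORDER = re.compile(r"order\s+by\s+random\s*\(\s*\)", re.IGNORECASE)

def paginated_sql(user_sql: str, limit: int, offset: int) -> str:
    """Wrap an arbitrary query so it can be fetched page by page."""
    if RANDOM_ORDER.search(user_sql):
        # Re-running the inner query for each page would reshuffle the rows,
        # so pages would overlap and skip; better to refuse (e.g. HTTP 400).
        raise ValueError("cannot paginate a query ordered by random()")
    return f"select * from ({user_sql}) limit {limit} offset {offset}"

print(paginated_sql("select * from lm_data order by yr_covered desc", 100, 0))
```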
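On the truncated-export comments above (a response header that flags truncation and states the row limit, for JSON as well as CSV exports), a sketch of what that could look like; the header names here are made up for illustration and are not an existing Datasette API:

```python
def truncation_headers(truncated: bool, max_returned_rows: int) -> dict:
    """Headers to attach to a CSV or JSON export response."""
    if not truncated:
        return {}
    return {
        # Hypothetical header names: tell the client the export was cut off
        # and at what limit, so it knows to paginate or stream instead.
        "x-export-truncated": "1",
        "x-export-max-returned-rows": str(max_returned_rows),
    }

print(truncation_headers(True, 1000))
```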
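And for the `publish cloudrun` SQLite-version comment: the FILTER clause on aggregate functions mentioned above landed in SQLite 3.30, so the snippet below only runs on a Python build linked against SQLite 3.30 or newer (on older builds the query raises `sqlite3.OperationalError`). The table and values are stand-ins modelled on the labor-data query linked earlier:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
print("SQLite version:", sqlite3.sqlite_version)  # needs >= 3.30 for FILTER

conn.execute("CREATE TABLE lm_data (desig_name TEXT, members INTEGER)")
conn.executemany(
    "INSERT INTO lm_data VALUES (?, ?)",
    [("LU", 120), ("LU", 45), ("NHQ", 3000)],
)

# Aggregate with a FILTER clause, the feature that motivated shipping a newer SQLite.
rows = conn.execute(
    """
    SELECT desig_name,
           count(*) FILTER (WHERE members > 100) AS big_filings
    FROM lm_data
    GROUP BY desig_name
    """
).fetchall()
print(rows)
```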