{"html_url": "https://github.com/simonw/sqlite-utils/issues/510#issuecomment-1318777114", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/510", "id": 1318777114, "node_id": "IC_kwDOCGYnMM5OmvEa", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-11-17T15:09:47Z", "updated_at": "2022-11-17T15:09:47Z", "author_association": "CONTRIBUTOR", "body": "why close? is the only problem that the _config table that incorrectly says 4 for fts5? if so, that's still something that should be fixed", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1434911255, "label": "Cannot enable FTS5 despite it being available"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1890#issuecomment-1317889323", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1890", "id": 1317889323, "node_id": "IC_kwDOBm6k_c5OjWUr", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-11-17T00:47:36Z", "updated_at": "2022-11-17T00:47:36Z", "author_association": "CONTRIBUTOR", "body": "amazing! 
thanks @simonw ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1448143294, "label": "Autocomplete text entry for filter values that correspond to facets"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1899#issuecomment-1317873458", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1899", "id": 1317873458, "node_id": "IC_kwDOBm6k_c5OjScy", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-17T00:31:07Z", "updated_at": "2022-11-17T00:31:07Z", "author_association": "CONTRIBUTOR", "body": "This is one way to fix it\r\n\r\n```patch\r\nr.html\r\ndiff --git a/datasette/static/cm-editor-6.0.1.js b/datasette/static/cm-editor-6.0.1.js\r\nindex c1fd2ab..68cf398 100644\r\n--- a/datasette/static/cm-editor-6.0.1.js\r\n+++ b/datasette/static/cm-editor-6.0.1.js\r\n@@ -22,7 +22,14 @@ export function editorFromTextArea(textarea, conf = {}) {\r\n // https://github.com/codemirror/lang-sql#user-content-sqlconfig.tables\r\n let view = new EditorView({\r\n doc: textarea.value,\r\n+\r\n extensions: [\r\n+ EditorView.theme({\r\n+ \".cm-content\": {\r\n+ // Height on cm-content ensures the editor is focusable by clicking beyond the height of the text\r\n+ minHeight: \"70px\",\r\n+ },\r\n+ }),\r\n keymap.of([\r\n {\r\n key: \"Shift-Enter\",\r\ndiff --git a/datasette/templates/_codemirror.html b/datasette/templates/_codemirror.html\r\nindex dea4710..c4629ae 100644\r\n--- a/datasette/templates/_codemirror.html\r\n+++ b/datasette/templates/_codemirror.html\r\n@@ -4,7 +4,6 @@\r\n .cm-editor {\r\n resize: both;\r\n overflow: hidden;\r\n- min-height: 70px;\r\n width: 80%;\r\n border: 1px solid #ddd;\r\n }\r\n```\r\n\r\nI don't love it but it seems to work for the default case. 
You can still retrigger the bug by resizing the editor to be > 70px high.\r\n\r\nThe other approach would be to listen for a click on that empty region and move focus to the editor, or something", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1452495049, "label": "Clicking within the CodeMirror area below the SQL (i.e. when there's only a single line) doesn't cause the editor to get focused "}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1893#issuecomment-1317834838", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1893", "id": 1317834838, "node_id": "IC_kwDOBm6k_c5OjJBW", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-16T23:50:58Z", "updated_at": "2022-11-16T23:50:58Z", "author_association": "CONTRIBUTOR", "body": "Should we empty out the fixture schema to avoid fixture autocomplete showing up on live databases in the interim, or are you planning to tackle #1897 shortly?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1450363982, "label": "Upgrade to CodeMirror 6, add SQL autocomplete"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1893#issuecomment-1317805482", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1893", "id": 1317805482, "node_id": "IC_kwDOBm6k_c5OjB2q", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-16T23:18:17Z", "updated_at": "2022-11-16T23:18:17Z", "author_association": "CONTRIBUTOR", "body": "Alright with https://github.com/simonw/datasette/pull/1893/commits/f254be4b38936e95e7a7f25866e7c6b0520db96f we should be getting autocomplete on fixture data. 
Give that a test and see what you think", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1450363982, "label": "Upgrade to CodeMirror 6, add SQL autocomplete"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1893#issuecomment-1317789308", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1893", "id": 1317789308, "node_id": "IC_kwDOBm6k_c5Oi958", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-16T22:59:57Z", "updated_at": "2022-11-16T22:59:57Z", "author_association": "CONTRIBUTOR", "body": "I can push up a commit that uses the static fixtures schema for testing, but given that the query used to generate it is authed we would still need some work to make that work on live data, right? Ideally it could come down to db and query views directly to avoid waiting on an extra xhr and managing that state change.On Nov 16, 2022, at 2:16 PM, Simon Willison ***@***.***> wrote:\ufeff\nHonestly I'm not too bothered if table names with weird characters don't work correctly here - I care about those in the Datasette fixtures.db database because Datasette aims to support ANY valid SQLite database, so I need stuff in the test suite that includes weird edge cases like this. 
But I would hope very few people actually create tables with spaces in their names, so it's not a huge concern to me if autocompletion doesn't work properly for those.\n\n\u2014Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you authored the thread.Message ID: ***@***.***>", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1450363982, "label": "Upgrade to CodeMirror 6, add SQL autocomplete"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1893#issuecomment-1317715580", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1893", "id": 1317715580, "node_id": "IC_kwDOBm6k_c5Oir58", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-16T21:49:51Z", "updated_at": "2022-11-16T21:49:51Z", "author_association": "CONTRIBUTOR", "body": "I think the table completion still has some quirks to work out. Something like\r\n\r\n```\r\n schema: {\r\n \"[123_starts_with_digits]\": [\"content\"],\r\n }\r\n```\r\n\r\nSeems to work alright, although it will append it after any other numbers you've started typing - so you end up with `select * from 12[123_starts_with_digits]` if you typed \"12\" to get the completion to appear. This might just be an issue with numeric names, I haven't tested it in a lot of detail.\r\n\r\nYou can do \r\n\r\n```\r\n searchable: [\r\n {\r\n label: \"name with . and spaces\",\r\n apply: \"[name with . and spaces]\",\r\n },\r\n \"pk\",\r\n \"text1\",\r\n \"text2\",\r\n ],\r\n```\r\n\r\nWhich is pretty neat and will show the non-escaped string but complete to the escaped one. 
You can't easily do that with the table names themselves (you can pass a `tables` array like so https://github.com/codemirror/lang-sql/blob/ebf115fffdbe07f91465ccbd82868c587f8182bc/src/sql.ts#L121 but it will overwrite the columns from the schema ).\r\n\r\nIt's buggy enough (bad output for these unusual table names) that I'd suggest that work gets moved into a follow up to the upgrade to 6. That would give space to sort out how to deliver that to the view directly, figure out where name escaping should happen, and have overall testing to uncover bugs and fix papercuts before enabling it.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1450363982, "label": "Upgrade to CodeMirror 6, add SQL autocomplete"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1893#issuecomment-1317681193", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1893", "id": 1317681193, "node_id": "IC_kwDOBm6k_c5Oijgp", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-16T21:19:13Z", "updated_at": "2022-11-16T21:19:13Z", "author_association": "CONTRIBUTOR", "body": "Alright, added Cmd+Enter to submit (Ctrl+Enter on Windows as well bc of using Meta-Enter on codemirror). 
We can make that MacOS only by changing the combo to Cmd+Enter specifically but I think it's probably fine to have both.", "reactions": "{\"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1450363982, "label": "Upgrade to CodeMirror 6, add SQL autocomplete"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1893#issuecomment-1317522323", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1893", "id": 1317522323, "node_id": "IC_kwDOBm6k_c5Oh8uT", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-16T18:59:49Z", "updated_at": "2022-11-16T18:59:49Z", "author_association": "CONTRIBUTOR", "body": "Or I guess you could return only the escaped table name and then we could derive the unescaped from the client side (removing the outer `[]` when present)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1450363982, "label": "Upgrade to CodeMirror 6, add SQL autocomplete"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1893#issuecomment-1317520304", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1893", "id": 1317520304, "node_id": "IC_kwDOBm6k_c5Oh8Ow", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-16T18:58:43Z", "updated_at": "2022-11-16T18:58:43Z", "author_association": "CONTRIBUTOR", "body": "Nice. And is it possible to include another field which is an escaped table name (only when necessary) - i.e. `[123_starts_with_digits]`. Or is that easy enough to derive on the client? 
I'm thinking we'd map those to Completion objects so that CM would show the non escaped text but complete to escaped.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1450363982, "label": "Upgrade to CodeMirror 6, add SQL autocomplete"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1893#issuecomment-1317329157", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1893", "id": 1317329157, "node_id": "IC_kwDOBm6k_c5OhNkF", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-16T16:46:52Z", "updated_at": "2022-11-16T16:46:52Z", "author_association": "CONTRIBUTOR", "body": "> \"Screenshot\r\n> \r\n> UI issue I see on the autocomplete popup with overlapping icon & text. Screenshot's from Firefox, it seems even a little more pronounced on Safari\r\n\r\nI checked and if I empty out app.css the bug goes away, so there's some kind of inheritance issue there. It's hard to debug bc the autocomplete popup goes away on blur (i.e. 
when trying to inspect it in devtools), but at least it's narrowed down a bit.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1450363982, "label": "Upgrade to CodeMirror 6, add SQL autocomplete"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1893#issuecomment-1317326406", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1893", "id": 1317326406, "node_id": "IC_kwDOBm6k_c5OhM5G", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-16T16:45:09Z", "updated_at": "2022-11-16T16:45:09Z", "author_association": "CONTRIBUTOR", "body": "For escaped table names it looks like we could pass a Completion object (https://codemirror.net/docs/ref/#autocomplete) instead of a string which would allow the non escaped name to be a label and then the escaped name to actually complete in the editor, which might help with some of the funkiness I was seeing w/ completion", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1450363982, "label": "Upgrade to CodeMirror 6, add SQL autocomplete"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1893#issuecomment-1317314064", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1893", "id": 1317314064, "node_id": "IC_kwDOBm6k_c5OhJ4Q", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-16T16:36:46Z", "updated_at": "2022-11-16T16:36:46Z", "author_association": "CONTRIBUTOR", "body": "With\r\n\r\n```patch\r\ndiff --git a/datasette/templates/_codemirror_foot.html b/datasette/templates/_codemirror_foot.html\r\nindex ed709b3..74fe18e 100644\r\n--- a/datasette/templates/_codemirror_foot.html\r\n+++ b/datasette/templates/_codemirror_foot.html\r\n@@ -7,7 +7,11 @@\r\n 
sqlFormat.hidden = false;\r\n }\r\n if (sqlInput) {\r\n- var editor = (window.editor = cm.editorFromTextArea(sqlInput));\r\n+ var editor = (window.editor = cm.editorFromTextArea(sqlInput, {\r\n+ schema: {\r\n+ compound_three_primary_keys: [\"pk1\", \"pk2\", \"pk3\", \"content\"],\r\n+ },\r\n+ }));\r\n```\r\n\r\nwe get table autocompletion and column completion if you name the table in the query (see screencast). I do see bugs with escaped table names like `\"'123_starts_with_digits'\": [\"col1\", \"col2\"]` or `\"[123_starts_with_digits]\": [\"col1\", \"col2\"]` where it doesn't seem to pick up the column names though. I think it needs some further testing and debugging. \r\n\r\nhttps://user-images.githubusercontent.com/95570/202238521-e613b4e2-ba92-4418-9068-fc022edaee93.mp4\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1450363982, "label": "Upgrade to CodeMirror 6, add SQL autocomplete"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1893#issuecomment-1317281292", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1893", "id": 1317281292, "node_id": "IC_kwDOBm6k_c5OhB4M", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-16T16:19:16Z", "updated_at": "2022-11-16T16:19:16Z", "author_association": "CONTRIBUTOR", "body": "Ha, nice idea! Updating the dialect with that list.\r\n\r\nI'm thinking of also adding `count` to the list since that's a common thing people would want to autocomplete. 
I notice BQ console highlights `count` in the same manner as other keywords like `select` as well.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1450363982, "label": "Upgrade to CodeMirror 6, add SQL autocomplete"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1893#issuecomment-1316387382", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1893", "id": 1316387382, "node_id": "IC_kwDOBm6k_c5Odno2", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-16T05:33:55Z", "updated_at": "2022-11-16T05:33:55Z", "author_association": "CONTRIBUTOR", "body": "I added a commit to make our own dialect at https://github.com/simonw/datasette/pull/1893/commits/e273fc8ed5341bdf0b622e722d761bd2acc30a90. Pulled in the full list of keywords from https://www.sqlite.org/lang_keywords.html but haven't gone through and pruned it to only include common select keywords. @simonw you'll have better knowledge than me on that - do you want to take a first shot at narrowing that down to the set that people will be using in the editor?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1450363982, "label": "Upgrade to CodeMirror 6, add SQL autocomplete"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1893#issuecomment-1316339035", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1893", "id": 1316339035, "node_id": "IC_kwDOBm6k_c5Odb1b", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-16T04:47:11Z", "updated_at": "2022-11-16T04:47:11Z", "author_association": "CONTRIBUTOR", "body": "> Have you ever seen CodeMirror correctly auto-completing columns? 
I'm not entirely sure I believe that the feature works anywhere else.\r\n\r\nI was thinking of the BigQuery console, like \r\n\r\n\"Screenshot\r\n\r\nBut they must be doing something pretty custom & appears to be using Monaco anyway. I suspect some kind of lower level autocomplete integration could make this work, but if the table completion is a good-enough starting point I think it's not too hard. The main issue is that we don't pass the relevant table data down to QueryView.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1450363982, "label": "Upgrade to CodeMirror 6, add SQL autocomplete"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1893#issuecomment-1316320521", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1893", "id": 1316320521, "node_id": "IC_kwDOBm6k_c5OdXUJ", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-16T04:29:23Z", "updated_at": "2022-11-16T04:29:23Z", "author_association": "CONTRIBUTOR", "body": "\"Screenshot\r\n\r\nUI issue I see on the autocomplete popup with overlapping icon & text. 
Screenshot's from Firefox, it seems even a little more pronounced on Safari", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1450363982, "label": "Upgrade to CodeMirror 6, add SQL autocomplete"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1893#issuecomment-1316318961", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1893", "id": 1316318961, "node_id": "IC_kwDOBm6k_c5OdW7x", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-16T04:27:51Z", "updated_at": "2022-11-16T04:27:51Z", "author_association": "CONTRIBUTOR", "body": "> The resize handle doesn't appear on Mobile Safari on iPhone - I don't think that particularly matters though.\r\n> \r\n> The textarea does get a weird border around it when focused on iPhone though.\r\n\r\nThe default focus styles appear to be\r\n\r\n```\r\n.c1.cm-editor.cm-focused {\r\n outline: 1px dotted #212121;\r\n}\r\n```\r\n\r\nWhich I also see on desktop. Would be nice to changed to whatever the default UA textarea styles are to blend in better but I wouldn't recommend removing it entirely - just to keep the visual indication that the element is focused. 
Maybe followup material to have a theming pass", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1450363982, "label": "Upgrade to CodeMirror 6, add SQL autocomplete"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1893#issuecomment-1316256386", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1893", "id": 1316256386, "node_id": "IC_kwDOBm6k_c5OdHqC", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-16T03:18:06Z", "updated_at": "2022-11-16T03:18:06Z", "author_association": "CONTRIBUTOR", "body": "> If you can get a version of this working with table and column autocompletion just using a static JavaScript object in the source code with the right tables and columns, I'm happy to take on the work of turning that static object into something that Datasette includes in the page itself with all of the correct values.\r\n\r\nThis version \"sort of\" works when on the main database page where the template passes the relevant data https://github.com/bgrins/datasette/commit/8431c98850c7a552dbcde2a4dd0c3dc942a97d25 by doing this and passing that into the `schema` object:\r\n\r\n```\r\n let TABLES_DATA = [];\r\n {% if tables is defined %} \r\n TABLES_DATA = {{ tables | tojson(indent=2) }};\r\n {% endif %}\r\n\r\n // Turn into an object, shaped like https://github.com/codemirror/lang-sql/blob/ebf115fffdbe07f91465ccbd82868c587f8182bc/test/test-complete.ts#L27.\r\n const TABLES_SCHEMA = Object.fromEntries(\r\n new Map(\r\n TABLES_DATA.map((table) => {\r\n return [table.name, table.columns];\r\n })\r\n ).entries()\r\n );\r\n```\r\n\r\nBut there are a number of papercuts with it - it's not escaping table names with spaces (likely be fixable from the data being passed into the view) but mainly it doesn't seem to autocomplete columns. 
I think it might only want to do it when you first type the table name from my read of https://github.com/codemirror/lang-sql/blob/ebf115fffdbe07f91465ccbd82868c587f8182bc/test/test-complete.ts#L37. It's possible I'm just passing something wrong, but it may end up being something that needs feature work upstream.\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1450363982, "label": "Upgrade to CodeMirror 6, add SQL autocomplete"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1893#issuecomment-1316243602", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1893", "id": 1316243602, "node_id": "IC_kwDOBm6k_c5OdEiS", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-16T03:11:46Z", "updated_at": "2022-11-16T03:11:46Z", "author_association": "CONTRIBUTOR", "body": "Was just reviewing the SQL options and there's an [upperCaseKeywords](https://github.com/codemirror/lang-sql#user-content-sqlconfig.uppercasekeywords) if we'd rather have SELECT vs select. 
Datasette seems to prefer lowercase so probably best to keep it as-is", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1450363982, "label": "Upgrade to CodeMirror 6, add SQL autocomplete"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1893#issuecomment-1316041828", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1893", "id": 1316041828, "node_id": "IC_kwDOBm6k_c5OcTRk", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-15T23:51:35Z", "updated_at": "2022-11-15T23:51:35Z", "author_association": "CONTRIBUTOR", "body": "I experimented with autocompleting the actual schema in https://github.com/bgrins/datasette/commit/8431c98850c7a552dbcde2a4dd0c3dc942a97d25, but it would need some work (current problems with it listed in the commit message there)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1450363982, "label": "Upgrade to CodeMirror 6, add SQL autocomplete"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1893#issuecomment-1315869946", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1893", "id": 1315869946, "node_id": "IC_kwDOBm6k_c5ObpT6", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-15T21:12:38Z", "updated_at": "2022-11-15T21:12:38Z", "author_association": "CONTRIBUTOR", "body": "https://github.com/Sphinxxxx/cm-resize isn't compatible with 6. 
There's a suggestion to try using CSS resize in https://discuss.codemirror.net/t/resizing-codemirror-6/3265/2", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1450363982, "label": "Upgrade to CodeMirror 6, add SQL autocomplete"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1893#issuecomment-1315869040", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1893", "id": 1315869040, "node_id": "IC_kwDOBm6k_c5ObpFw", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-15T21:11:42Z", "updated_at": "2022-11-15T21:11:42Z", "author_association": "CONTRIBUTOR", "body": "extraKeys is done - Shift+Enter is added in the helper function, and it appears that the Tab behavior now defaults to what the `Tab: false` setting was doing (allowing it to escape to the form)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1450363982, "label": "Upgrade to CodeMirror 6, add SQL autocomplete"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1893#issuecomment-1315853097", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1893", "id": 1315853097, "node_id": "IC_kwDOBm6k_c5OblMp", "user": {"value": 95570, "label": "bgrins"}, "created_at": "2022-11-15T20:55:40Z", "updated_at": "2022-11-15T20:55:40Z", "author_association": "CONTRIBUTOR", "body": "Should also minify the bundled output", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1450363982, "label": "Upgrade to CodeMirror 6, add SQL autocomplete"}, "performed_via_github_app": null} {"html_url": 
"https://github.com/simonw/datasette/issues/1886#issuecomment-1314241058", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1886", "id": 1314241058, "node_id": "IC_kwDOBm6k_c5OVboi", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-11-14T19:06:35Z", "updated_at": "2022-11-14T19:06:35Z", "author_association": "CONTRIBUTOR", "body": "This probably counts as a case study: https://github.com/eyeseast/spatial-data-cooking-show. Even has video.\r\n\r\nSeriously, though, this workflow has become integral to my work with reporters and editors across USA TODAY Network. Very often, I get sent a folder of data in mixed formats, with a vague ask of how we should communicate some part of it to users. Datasette and its constellation of tools makes it easy to get a quick look at that data, run exploratory queries, map it and ask questions to figure out what's important to show. And then I export a version of the data that's exactly what I need for display.\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1447050738, "label": "Call for birthday presents: if you're using Datasette, let us know how you're using it here"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1884#issuecomment-1314066229", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1884", "id": 1314066229, "node_id": "IC_kwDOBm6k_c5OUw81", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-11-14T16:48:35Z", "updated_at": "2022-11-14T16:48:35Z", "author_association": "CONTRIBUTOR", "body": "I'm realizing I don't know if a virtual table will ever return a count. Maybe it depends on the implementation. For these three, just checking now, it'll always return zero.\r\n\r\nThat said, I'm not sure there's any downside to having them return zero and caching that. (They're hidden, too.) 
", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1439009231, "label": "Exclude virtual tables from datasette inspect"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1884#issuecomment-1313962183", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1884", "id": 1313962183, "node_id": "IC_kwDOBm6k_c5OUXjH", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-11-14T15:46:32Z", "updated_at": "2022-11-14T15:46:32Z", "author_association": "CONTRIBUTOR", "body": "It does work, though I think it's probably still worth excluding virtual tables that will always be zero. Here's the same inspection as before, now with `--load-extension spatialite`:\r\n\r\n```json\r\n{\r\n \"alltheplaces\": {\r\n \"hash\": \"0843cfe414439ab903c22d1121b7ddbc643418c35c7f0edbcec82ef1452411df\",\r\n \"size\": 963375104,\r\n \"file\": \"alltheplaces.db\",\r\n \"tables\": {\r\n \"spatial_ref_sys\": {\r\n \"count\": 6215\r\n },\r\n \"spatialite_history\": {\r\n \"count\": 18\r\n },\r\n \"sqlite_sequence\": {\r\n \"count\": 2\r\n },\r\n \"geometry_columns\": {\r\n \"count\": 3\r\n },\r\n \"spatial_ref_sys_aux\": {\r\n \"count\": 6164\r\n },\r\n \"views_geometry_columns\": {\r\n \"count\": 0\r\n },\r\n \"virts_geometry_columns\": {\r\n \"count\": 0\r\n },\r\n \"geometry_columns_statistics\": {\r\n \"count\": 3\r\n },\r\n \"views_geometry_columns_statistics\": {\r\n \"count\": 0\r\n },\r\n \"virts_geometry_columns_statistics\": {\r\n \"count\": 0\r\n },\r\n \"geometry_columns_field_infos\": {\r\n \"count\": 0\r\n },\r\n \"views_geometry_columns_field_infos\": {\r\n \"count\": 0\r\n },\r\n \"virts_geometry_columns_field_infos\": {\r\n \"count\": 0\r\n },\r\n \"geometry_columns_time\": {\r\n \"count\": 3\r\n },\r\n \"geometry_columns_auth\": {\r\n \"count\": 3\r\n },\r\n \"views_geometry_columns_auth\": {\r\n 
\"count\": 0\r\n },\r\n \"virts_geometry_columns_auth\": {\r\n \"count\": 0\r\n },\r\n \"data_licenses\": {\r\n \"count\": 10\r\n },\r\n \"sql_statements_log\": {\r\n \"count\": 0\r\n },\r\n \"states\": {\r\n \"count\": 56\r\n },\r\n \"counties\": {\r\n \"count\": 3234\r\n },\r\n \"idx_states_geometry_rowid\": {\r\n \"count\": 56\r\n },\r\n \"idx_states_geometry_node\": {\r\n \"count\": 3\r\n },\r\n \"idx_states_geometry_parent\": {\r\n \"count\": 2\r\n },\r\n \"idx_counties_geometry_rowid\": {\r\n \"count\": 3234\r\n },\r\n \"idx_counties_geometry_node\": {\r\n \"count\": 98\r\n },\r\n \"idx_counties_geometry_parent\": {\r\n \"count\": 97\r\n },\r\n \"idx_places_geometry_rowid\": {\r\n \"count\": 1236796\r\n },\r\n \"idx_places_geometry_node\": {\r\n \"count\": 38163\r\n },\r\n \"idx_places_geometry_parent\": {\r\n \"count\": 38162\r\n },\r\n \"places\": {\r\n \"count\": 1332609\r\n },\r\n \"SpatialIndex\": {\r\n \"count\": 0\r\n },\r\n \"ElementaryGeometries\": {\r\n \"count\": 0\r\n },\r\n \"KNN\": {\r\n \"count\": 0\r\n },\r\n \"idx_states_geometry\": {\r\n \"count\": 56\r\n },\r\n \"idx_counties_geometry\": {\r\n \"count\": 3234\r\n },\r\n \"idx_places_geometry\": {\r\n \"count\": 1236796\r\n }\r\n }\r\n }\r\n}\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1439009231, "label": "Exclude virtual tables from datasette inspect"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1886#issuecomment-1313252879", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1886", "id": 1313252879, "node_id": "IC_kwDOBm6k_c5ORqYP", "user": {"value": 883348, "label": "adipasquale"}, "created_at": "2022-11-14T08:10:23Z", "updated_at": "2022-11-14T08:10:23Z", "author_association": "CONTRIBUTOR", "body": "Hi @simonw and thanks for the great tools you're publishing, your dedication is 
inspiring!\r\n\r\nI work for the French Ministry of Culture on a surveying tool for objects protected for their historical value. It is part of a program building modern public services called [beta.gouv.fr](https://beta.gouv.fr/).\r\n\r\nIn that context I'm using data published by the Ministry that I have ingested into datasette and published on a free Fly instance : https://collectif-objets-datasette.fly.dev . I have also ingested another data set with infos about french cities on this instance so that I can perform joined queries.\r\n\r\nThe surveying tool synchronizes its data regularly from this datasette instance, and I also use it to perform queries when asked generic questions about the distribution of objects. (The data is not very accessible as it's undocumented and for internal usage mostly)", "reactions": "{\"total_count\": 3, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 3, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1447050738, "label": "Call for birthday presents: if you're using Datasette, let us know how you're using it here"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1884#issuecomment-1309735529", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1884", "id": 1309735529, "node_id": "IC_kwDOBm6k_c5OEPpp", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-11-10T03:57:23Z", "updated_at": "2022-11-10T03:57:23Z", "author_association": "CONTRIBUTOR", "body": "Here's how to get a list of virtual tables: https://stackoverflow.com/questions/46617118/how-to-fetch-names-of-virtual-tables", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1439009231, "label": "Exclude virtual tables from datasette inspect"}, "performed_via_github_app": null} {"html_url": 
"https://github.com/simonw/datasette/issues/1871#issuecomment-1309650806", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1871", "id": 1309650806, "node_id": "IC_kwDOBm6k_c5OD692", "user": {"value": 3556, "label": "davidbgk"}, "created_at": "2022-11-10T01:38:58Z", "updated_at": "2022-11-10T01:38:58Z", "author_association": "CONTRIBUTOR", "body": "> Realized the API explorer doesn't need the API key piece at all - it can work with standard cookie-based auth.\r\n> \r\n> This also reflects how most plugins are likely to use this API, where they'll be adding JavaScript that uses `fetch()` to call the write API directly.\r\n\r\nI agree (that's what I did with the previous insert plugin), maybe a complete example using `fetch()` in the documentation would be valuable as a \u201cGetting started with the API\u201d or similar?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1427293909, "label": "API explorer tool"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/511#issuecomment-1304320521", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/511", "id": 1304320521, "node_id": "IC_kwDOCGYnMM5NvloJ", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-11-04T22:54:09Z", "updated_at": "2022-11-04T22:59:54Z", "author_association": "CONTRIBUTOR", "body": "I ran `PRAGMA integrity_check` and it returned `ok`. but then I tried restoring from a backup and I didn't get this `IntegrityError: constraint failed` error. So I think it was just something wrong with my database. 
If it happens again I will first try to reindex and see if that fixes the issue", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1436539554, "label": "[insert_all, upsert_all] IntegrityError: constraint failed"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/511#issuecomment-1304078945", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/511", "id": 1304078945, "node_id": "IC_kwDOCGYnMM5Nuqph", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-11-04T19:38:36Z", "updated_at": "2022-11-04T20:13:17Z", "author_association": "CONTRIBUTOR", "body": "Even more bizarre, the source db only has one record and the target table has no conflicting record:\r\n\r\n```\r\n875 0.3s lb:/ (main|\u271a2) [0|0]\ud83c\udf3a sqlite-utils tube_71.db 'select * from media where path = \"https://archive.org/details/088ghostofachanceroygetssackedrevengeofthelivinglunchdvdripxvidphz\"' | jq\r\n[\r\n {\r\n \"size\": null,\r\n \"time_created\": null,\r\n \"play_count\": 1,\r\n \"language\": null,\r\n \"view_count\": null,\r\n \"width\": null,\r\n \"height\": null,\r\n \"fps\": null,\r\n \"average_rating\": null,\r\n \"live_status\": null,\r\n \"age_limit\": null,\r\n \"uploader\": null,\r\n \"time_played\": 0,\r\n \"path\": \"https://archive.org/details/088ghostofachanceroygetssackedrevengeofthelivinglunchdvdripxvidphz\",\r\n \"id\": \"088ghostofachanceroygetssackedrevengeofthelivinglunchdvdripxvidphz/074 - Home Away from Home, Rainy Day Robot, Odie the Amazing DVDRip XviD [PhZ].mkv\",\r\n \"ie_key\": \"ArchiveOrg\",\r\n \"playlist_path\": \"https://archive.org/details/088ghostofachanceroygetssackedrevengeofthelivinglunchdvdripxvidphz\",\r\n \"duration\": 1424.05,\r\n \"tags\": null,\r\n \"title\": \"074 - Home Away from Home, Rainy Day Robot, Odie the Amazing DVDRip XviD [PhZ].mkv\"\r\n 
}\r\n]\r\n875 0.3s lb:/ (main|\u271a2) [0|0]\ud83e\udd67 sqlite-utils video.db 'select * from media where path = \"https://archive.org/details/088ghostofachanceroygetssackedrevengeofthelivinglunchdvdripxvidphz\"' | jq\r\n[]\r\n```\r\n\r\nI've been able to use this code successfully several times before so not sure what's causing the issue.\r\n\r\nI guess the way that I'm handling multiple databases is an issue, though it hasn't ever inserted into the source db, not sure what's different. The only reasonable explanation is that it is trying to insert into the source db from the source db for some reason? Or maybe sqlite3 is checking the source db for primary key violation because the table name is the same", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1436539554, "label": "[insert_all, upsert_all] IntegrityError: constraint failed"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/50#issuecomment-1303660293", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/50", "id": 1303660293, "node_id": "IC_kwDOCGYnMM5NtEcF", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-11-04T14:38:36Z", "updated_at": "2022-11-04T14:38:36Z", "author_association": "CONTRIBUTOR", "body": "where did you see the limit as 999? I believe the limit has been 32766 for quite some time. 
If you could detect which one this could speed up batch insert of some types of data significantly", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 473083260, "label": "\"Too many SQL variables\" on large inserts"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/507#issuecomment-1297859539", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/507", "id": 1297859539, "node_id": "IC_kwDOCGYnMM5NW8PT", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-11-01T00:40:16Z", "updated_at": "2022-11-01T00:40:16Z", "author_association": "CONTRIBUTOR", "body": "Ideally people could fix their data if they run into this issue.\r\n\r\nIf you are using filenames try [convmv](https://linux.die.net/man/1/convmv)\r\n\r\n```\r\nconvmv --preserve-mtimes -f utf8 -t utf8 --notest -i -r .\r\n```\r\n\r\nmaybe this script will also help: \r\n\r\n```py\r\nimport argparse, shutil\r\nfrom pathlib import Path\r\n\r\nimport ftfy\r\n\r\nfrom xklb import utils\r\nfrom xklb.utils import log\r\n\r\n\r\ndef parse_args() -> argparse.Namespace:\r\n    parser = argparse.ArgumentParser()\r\n    parser.add_argument(\"paths\", nargs='*')\r\n    parser.add_argument(\"--verbose\", \"-v\", action=\"count\", default=0)\r\n    args = parser.parse_args()\r\n\r\n    log.info(utils.dict_filter_bool(args.__dict__))\r\n    return args\r\n\r\n\r\ndef rename_invalid_paths() -> None:\r\n    args = parse_args()\r\n\r\n    for path in args.paths:\r\n        log.info(path)\r\n        for p in sorted([str(p) for p in Path(path).rglob(\"*\")], key=len):\r\n            fixed = ftfy.fix_text(p, uncurl_quotes=False).replace(\"\\r\\n\", \"\\n\").replace(\"\\r\", \"\\n\").replace(\"\\n\", \"\")\r\n            if p != fixed:\r\n                try:\r\n                    shutil.move(p, fixed)\r\n                except FileNotFoundError:\r\n                    log.warning(\"FileNotFound. 
%s\", p)\r\n                else:\r\n                    log.info(fixed)\r\n\r\n\r\nif __name__ == \"__main__\":\r\n    rename_invalid_paths()\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1430325103, "label": "conn.execute: UnicodeEncodeError: 'utf-8' codec can't encode character"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/448#issuecomment-1297703307", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/448", "id": 1297703307, "node_id": "IC_kwDOCGYnMM5NWWGL", "user": {"value": 167893, "label": "mcarpenter"}, "created_at": "2022-10-31T21:23:51Z", "updated_at": "2022-10-31T21:27:32Z", "author_association": "CONTRIBUTOR", "body": "The Windows aspect is a red herring: OP's sample above produces the same error on Linux. (Though I don't know what's going on with the CI).\r\n\r\nThe same error can also be obtained by passing an `io` from a file opened in non-binary mode (`'r'` as opposed to `'rb'`) to `rows_from_file()`. This is how I got here.\r\n\r\nThe fix for my case is easy: open the file in mode `'rb'`. The analogous fix for OP's problem also works: use `BytesIO` in place of `StringIO`.\r\n\r\nMinimal test case (derived from [utils.py](https://github.com/simonw/sqlite-utils/blob/main/sqlite_utils/utils.py#L304)):\r\n\r\n``` python\r\nimport io\r\nfrom typing import cast\r\n\r\n#fp = io.StringIO(\"id,name\\n1,Cleo\") # error\r\nfp = io.BytesIO(bytes(\"id,name\\n1,Cleo\", encoding='utf-8')) # okay\r\nreader = io.BufferedReader(cast(io.RawIOBase, fp))\r\nreader.peek(1) # exception thrown here\r\n```\r\nI see the signature of `rows_from_file()` correctly has `fp: BinaryIO` but I guess you'd need either a runtime type check for that (not all `io`s have `mode()`), or to catch the `AttributeError` on `peek()` to produce a better error for users. 
Neither option is ideal.\r\n\r\nSome thoughts on testing binary-ness of `io`s in this SO question: https://stackoverflow.com/questions/44584829/how-to-determine-if-file-is-opened-in-binary-or-text-mode", "reactions": "{\"total_count\": 2, \"+1\": 2, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1279144769, "label": "Reading rows from a file => AttributeError: '_io.StringIO' object has no attribute 'readinto'"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1872#issuecomment-1296080804", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1872", "id": 1296080804, "node_id": "IC_kwDOBm6k_c5NQJ-k", "user": {"value": 192568, "label": "mroswell"}, "created_at": "2022-10-30T03:06:32Z", "updated_at": "2022-10-30T03:06:32Z", "author_association": "CONTRIBUTOR", "body": "I updated datasette-publish-vercel to 0.14.2 in requirements.txt\r\n\r\nAnd the site is back up!\r\n\r\nIs there a way that we can get some sort of notice when something like this will have critical impact on website function?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1428560020, "label": "SITE-BUSTING ERROR: \"render_template() called before await ds.invoke_startup()\""}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1872#issuecomment-1296076803", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1872", "id": 1296076803, "node_id": "IC_kwDOBm6k_c5NQJAD", "user": {"value": 192568, "label": "mroswell"}, "created_at": "2022-10-30T02:50:34Z", "updated_at": "2022-10-30T02:50:34Z", "author_association": "CONTRIBUTOR", "body": "should this issue be under https://github.com/simonw/datasette-publish-vercel/issues ?\r\n\r\nPerhaps I just need to update: \r\ndatasette-publish-vercel==0.11\r\nin 
requirements.txt?\r\n \r\n I'll try that and see what happens...\r\n ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1428560020, "label": "SITE-BUSTING ERROR: \"render_template() called before await ds.invoke_startup()\""}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1870#issuecomment-1295667649", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1870", "id": 1295667649, "node_id": "IC_kwDOBm6k_c5NOlHB", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-29T00:52:43Z", "updated_at": "2022-10-29T00:53:43Z", "author_association": "CONTRIBUTOR", "body": "> Are you saying that I can build a container, but then when I run it and it does `datasette serve -i data.db ...` it will somehow modify the image, or create a new modified filesystem layer in the runtime environment, as a result of running that `serve` command?\r\n\r\nSomehow, `datasette serve -i data.db` will lead to the `data.db` being modified, which will trigger a [copy-on-write](https://docs.docker.com/storage/storagedriver/#the-copy-on-write-cow-strategy) of `data.db` into the read-write layer of the container.\r\n\r\nI don't understand **how** that happens.\r\n\r\nit kind of feels like a bug in sqlite, but i can't quite follow the sqlite code.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1426379903, "label": "don't use immutable=1, only mode=ro"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1870#issuecomment-1294285471", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1870", "id": 1294285471, "node_id": "IC_kwDOBm6k_c5NJTqf", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-28T01:06:03Z", "updated_at": 
"2022-10-28T01:06:03Z", "author_association": "CONTRIBUTOR", "body": "as far as i can tell, [this is where the \"immutable\" argument is used](https://github.com/sqlite/sqlite/blob/c97bb14fab566f6fa8d967c8fd1e90f3702d5b73/src/pager.c#L4926-L4931) in sqlite:\r\n\r\n```c\r\n pPager->noLock = sqlite3_uri_boolean(pPager->zFilename, \"nolock\", 0);\r\n if( (iDc & SQLITE_IOCAP_IMMUTABLE)!=0\r\n || sqlite3_uri_boolean(pPager->zFilename, \"immutable\", 0) ){\r\n vfsFlags |= SQLITE_OPEN_READONLY;\r\n goto act_like_temp_file;\r\n }\r\n```\r\n\r\nso it does set the read only flag, but then has a goto.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1426379903, "label": "don't use immutable=1, only mode=ro"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1870#issuecomment-1294237783", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1870", "id": 1294237783, "node_id": "IC_kwDOBm6k_c5NJIBX", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-27T23:42:18Z", "updated_at": "2022-10-27T23:42:18Z", "author_association": "CONTRIBUTOR", "body": "Relevant sqlite forum thread: https://www.sqlite.org/forum/forumpost/02f7bda329f41e30451472421cf9ce7f715b768ce3db02797db1768e47950d48", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1426379903, "label": "don't use immutable=1, only mode=ro"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1851#issuecomment-1292592210", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1851", "id": 1292592210, "node_id": "IC_kwDOBm6k_c5NC2RS", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-10-26T20:03:46Z", "updated_at": "2022-10-26T20:03:46Z", "author_association": 
"CONTRIBUTOR", "body": "Yeah, every time I see something cool done with triggers, I remember that I need to start using triggers.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1421544654, "label": "API to insert a single record into an existing table"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1851#issuecomment-1292519956", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1851", "id": 1292519956, "node_id": "IC_kwDOBm6k_c5NCkoU", "user": {"value": 15178711, "label": "asg017"}, "created_at": "2022-10-26T19:20:33Z", "updated_at": "2022-10-26T19:20:33Z", "author_association": "CONTRIBUTOR", "body": "> This could use a new plugin hook, too. I don't want to complicate your life too much, but for things like GIS, I'd want a way to turn regular JSON into SpatiaLite geometries or combine X/Y coordinates into point geometries and such. Happy to help however I can.\r\n\r\n @eyeseast Maybe you could do this with triggers? 
Like you can insert JSON-friendly data into a \"raw\" table, and create a trigger that transforms that inserted data into the proper table\r\n\r\nHere's an example:\r\n\r\n```sql\r\n-- meant to be updated from a Datasette insert\r\ncreate table points_raw(longitude int, latitude int);\r\n\r\n-- the target table with proper SpatiaLite geometries\r\ncreate table points(point geometry);\r\n\r\nCREATE TRIGGER insert_points_raw INSERT ON points_raw \r\n BEGIN\r\n insert into points(point) values (makepoint(new.longitude, new.latitude));\r\n END;\r\n```\r\n\r\nYou could then POST a new row to `points_raw` like this:\r\n```\r\nPOST /db/points_raw\r\nAuthorization: Bearer xxx\r\nContent-Type: application/json\r\n{\r\n \"row\": {\r\n \"longitude\": 27.64356,\r\n \"latitude\": -47.29384\r\n }\r\n}\r\n```\r\n\r\nThen SQLite will run the trigger and insert a new row in `points` with the correct geometry point. Downside is you'd have duplicated data with `points_raw`, but maybe it could be a `TEMP` table (or have a cron that deletes all rows from that table every so often?)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1421544654, "label": "API to insert a single record into an existing table"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/pull/499#issuecomment-1292401308", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/499", "id": 1292401308, "node_id": "IC_kwDOCGYnMM5NCHqc", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-10-26T17:54:26Z", "updated_at": "2022-10-26T17:54:51Z", "author_association": "CONTRIBUTOR", "body": "The problem with how it is currently is that the transformed fts table _will_ return incorrect results (unless the table was only 1 row or something), even if create_triggers was enabled previously. 
Maybe the simplest solution is to disable fts on a transformed table rather than try to recreate it? Thoughts?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1405196044, "label": "feat: recreate fts triggers after table transform"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1851#issuecomment-1291228502", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1851", "id": 1291228502, "node_id": "IC_kwDOBm6k_c5M9pVW", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-10-25T23:02:10Z", "updated_at": "2022-10-25T23:02:10Z", "author_association": "CONTRIBUTOR", "body": "That's reasonable. Canned queries and custom endpoints are certainly going to give more room for specific needs. ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1421544654, "label": "API to insert a single record into an existing table"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1851#issuecomment-1290615599", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1851", "id": 1290615599, "node_id": "IC_kwDOBm6k_c5M7Tsv", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-10-25T14:05:12Z", "updated_at": "2022-10-25T14:05:12Z", "author_association": "CONTRIBUTOR", "body": "This could use a new plugin hook, too. I don't want to complicate your life too much, but for things like GIS, I'd want a way to turn regular JSON into SpatiaLite geometries or combine X/Y coordinates into point geometries and such. 
Happy to help however I can.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1421544654, "label": "API to insert a single record into an existing table"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/pull/498#issuecomment-1274153135", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/498", "id": 1274153135, "node_id": "IC_kwDOCGYnMM5L8giv", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-10-11T06:34:31Z", "updated_at": "2022-10-11T06:34:31Z", "author_association": "CONTRIBUTOR", "body": "nevermind it was because I was running `db[table].transform`. The fts tables would still be there but the triggers would be dropped", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1404013495, "label": "fix: enable-fts permanently save triggers"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1836#issuecomment-1272357976", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1836", "id": 1272357976, "node_id": "IC_kwDOBm6k_c5L1qRY", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-08T16:56:51Z", "updated_at": "2022-10-08T16:56:51Z", "author_association": "CONTRIBUTOR", "body": "when you are running from docker, you **always** will want to run as `mode=ro` because the same thing that is causing duplication in the inspect layer will cause duplication in the final container read/write layer when `datasette serve` runs.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1400374908, "label": "docker image is duplicating db files somehow"}, "performed_via_github_app": null} 
{"html_url": "https://github.com/simonw/datasette/issues/1836#issuecomment-1271103097", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1836", "id": 1271103097, "node_id": "IC_kwDOBm6k_c5Lw355", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T04:43:41Z", "updated_at": "2022-10-07T04:43:41Z", "author_association": "CONTRIBUTOR", "body": "@simonw, should i open up a new issue for investigating the differences between \"immutable=1\" and \"mode=ro\" and possibly switching to \"mode=ro\". Or would you like to keep that conversation in this issue?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1400374908, "label": "docker image is duplicating db files somehow"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1480#issuecomment-1271101072", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1480", "id": 1271101072, "node_id": "IC_kwDOBm6k_c5Lw3aQ", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T04:39:10Z", "updated_at": "2022-10-07T04:39:10Z", "author_association": "CONTRIBUTOR", "body": "switching from `immutable=1` to `mode=ro` completely addressed this. 
see https://github.com/simonw/datasette/issues/1836#issuecomment-1271100651 for details.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1015646369, "label": "Exceeding Cloud Run memory limits when deploying a 4.8G database"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1836#issuecomment-1271100651", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1836", "id": 1271100651, "node_id": "IC_kwDOBm6k_c5Lw3Tr", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T04:38:14Z", "updated_at": "2022-10-07T04:38:14Z", "author_association": "CONTRIBUTOR", "body": "> yes, and i also think that this is causing the apparent memory problems in #1480. when the container starts up, it will make some operation on the database in `immutable` mode which apparently makes some small change to the db file. if that's so, then the db files will be copied to the read/write layer which counts against cloudrun's memory allocation!\r\n> \r\n> running a test of that now.\r\n\r\nthis completely addressed #1480 ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1400374908, "label": "docker image is duplicating db files somehow"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1301#issuecomment-1271035998", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1301", "id": 1271035998, "node_id": "IC_kwDOBm6k_c5Lwnhe", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T02:38:04Z", "updated_at": "2022-10-07T02:38:04Z", "author_association": "CONTRIBUTOR", "body": "the only mode that `publish cloudrun` supports right now is immutable", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, 
\"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 860722711, "label": "Publishing to cloudrun with immutable mode?"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1836#issuecomment-1271020193", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1836", "id": 1271020193, "node_id": "IC_kwDOBm6k_c5Lwjqh", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T02:15:05Z", "updated_at": "2022-10-07T02:21:08Z", "author_association": "CONTRIBUTOR", "body": "when i hack the connect method to open non mutable files with \"mode=ro\" and not \"immutable=1\" https://github.com/simonw/datasette/blob/eff112498ecc499323c26612d707908831446d25/datasette/database.py#L79\r\n\r\nthen: \r\n\r\n```bash\r\n870 B RUN /bin/sh -c datasette inspect nlrb.db --inspect-file inspect-data.json\r\n```\r\n\r\nthe `datasette inspect` layer is only the size of the json file!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1400374908, "label": "docker image is duplicating db files somehow"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1836#issuecomment-1271008997", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1836", "id": 1271008997, "node_id": "IC_kwDOBm6k_c5Lwg7l", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T02:00:37Z", "updated_at": "2022-10-07T02:00:49Z", "author_association": "CONTRIBUTOR", "body": "yes, and i also think that this is causing the apparent memory problems in #1480. when the container starts up, it will make some operation on the database in `immutable` mode which apparently makes some small change to the db file. 
if that's so, then the db files will be copied to the read/write layer which counts against cloudrun's memory allocation!\r\n\r\nrunning a test of that now. ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1400374908, "label": "docker image is duplicating db files somehow"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1836#issuecomment-1271003212", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1836", "id": 1271003212, "node_id": "IC_kwDOBm6k_c5LwfhM", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T01:52:04Z", "updated_at": "2022-10-07T01:52:04Z", "author_association": "CONTRIBUTOR", "body": "and if we try immutable mode, which is how things are opened by `datasette inspect` we duplicate the files!!!\r\n\r\n```python\r\n# test_sql_immutable.py\r\nimport sqlite3\r\nimport sys\r\n\r\ndb_name = sys.argv[1]\r\nconn = sqlite3.connect(f'file:/app/{db_name}?immutable=1', uri=True)\r\ncur = conn.cursor()\r\ncur.execute('select count(*) from filing')\r\nprint(cur.fetchone())\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1400374908, "label": "docker image is duplicating db files somehow"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1836#issuecomment-1270992795", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1836", "id": 1270992795, "node_id": "IC_kwDOBm6k_c5Lwc-b", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T01:29:15Z", "updated_at": "2022-10-07T01:50:14Z", "author_association": "CONTRIBUTOR", "body": "fascinatingly, telling python to open sqlite in read only mode makes this layer have a size of 0\r\n\r\n```python\r\n# test_sql_ro.py\r\nimport 
sqlite3\r\nimport sys\r\n\r\ndb_name = sys.argv[1]\r\nconn = sqlite3.connect(f'file:/app/{db_name}?mode=ro', uri=True)\r\ncur = conn.cursor()\r\ncur.execute('select count(*) from filing')\r\nprint(cur.fetchone())\r\n```\r\n\r\nthat's quite weird because setting the file permissions to read only didn't do anything. (on reflection, that chmod isn't doing anything because the dockerfile commands are run as root)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1400374908, "label": "docker image is duplicating db files somehow"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1836#issuecomment-1270988081", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1836", "id": 1270988081, "node_id": "IC_kwDOBm6k_c5Lwb0x", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T01:19:01Z", "updated_at": "2022-10-07T01:27:35Z", "author_association": "CONTRIBUTOR", "body": "okay, some progress!! 
running some sql against a database file causes that file to get duplicated even if it doesn't apparently change the file.\r\n\r\nmake a little test script like this:\r\n\r\n```python\r\n# test_sql.py\r\nimport sqlite3\r\nimport sys\r\n\r\ndb_name = sys.argv[1]\r\nconn = sqlite3.connect(f'file:/app/{db_name}', uri=True)\r\ncur = conn.cursor()\r\ncur.execute('select count(*) from filing')\r\nprint(cur.fetchone())\r\n```\r\n\r\nthen \r\n\r\n```docker\r\nRUN python test_sql.py nlrb.db\r\n```\r\n\r\nproduced a layer that's the same size as `nlrb.db`!!\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1400374908, "label": "docker image is duplicating db files somehow"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1836#issuecomment-1270936982", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1836", "id": 1270936982, "node_id": "IC_kwDOBm6k_c5LwPWW", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T00:52:41Z", "updated_at": "2022-10-07T00:52:41Z", "author_association": "CONTRIBUTOR", "body": "it's not that the inspect command is somehow changing the db files. 
if i set them to read-only, the \"inspect\" layer still has the same very large size.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1400374908, "label": "docker image is duplicating db files somehow"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1836#issuecomment-1270923537", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1836", "id": 1270923537, "node_id": "IC_kwDOBm6k_c5LwMER", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-07T00:46:08Z", "updated_at": "2022-10-07T00:46:08Z", "author_association": "CONTRIBUTOR", "body": "i thought it was maybe to do with reading through all the files, but that does not seem to be the case\r\n\r\nif i make a little test file like:\r\n\r\n```python\r\n# test_read.py\r\nimport hashlib\r\nimport sys\r\nimport pathlib\r\n\r\nHASH_BLOCK_SIZE = 1024 * 1024\r\n\r\ndef inspect_hash(path):\r\n \"\"\"Calculate the hash of a database, efficiently.\"\"\"\r\n m = hashlib.sha256()\r\n with path.open(\"rb\") as fp:\r\n while True:\r\n data = fp.read(HASH_BLOCK_SIZE)\r\n if not data:\r\n break\r\n m.update(data)\r\n\r\n return m.hexdigest()\r\n\r\ninspect_hash(pathlib.Path(sys.argv[1]))\r\n```\r\n\r\nthen a line in the Dockerfile like\r\n\r\n```docker\r\nRUN python test_read.py nlrb.db && echo \"[]\" > /etc/inspect.json\r\n```\r\n\r\njust produces a layer of `3B`\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1400374908, "label": "docker image is duplicating db files somehow"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1480#issuecomment-1269847461", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1480", "id": 1269847461, "node_id": 
"IC_kwDOBm6k_c5LsFWl", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-06T11:21:49Z", "updated_at": "2022-10-06T11:21:49Z", "author_association": "CONTRIBUTOR", "body": "thanks @simonw, i'll spend a little more time trying to figure out why this isn't working on cloudrun, and then will flip over to fly if i can't.\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1015646369, "label": "Exceeding Cloud Run memory limits when deploying a 4.8G database"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1480#issuecomment-1268629159", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1480", "id": 1268629159, "node_id": "IC_kwDOBm6k_c5Lnb6n", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-05T16:00:55Z", "updated_at": "2022-10-05T16:00:55Z", "author_association": "CONTRIBUTOR", "body": "as a next step, i'll fetch the docker image from the google registry, and see what memory and disk usage looks like when i run it locally.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1015646369, "label": "Exceeding Cloud Run memory limits when deploying a 4.8G database"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1480#issuecomment-1268613335", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1480", "id": 1268613335, "node_id": "IC_kwDOBm6k_c5LnYDX", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-10-05T15:45:49Z", "updated_at": "2022-10-05T15:45:49Z", "author_association": "CONTRIBUTOR", "body": "running into this as i continue to grow my labor data warehouse.\r\n\r\nHere a CloudRun PM says the container size should **not** count against memory: 
https://stackoverflow.com/a/56570717", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1015646369, "label": "Exceeding Cloud Run memory limits when deploying a 4.8G database"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/409#issuecomment-1264223554", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/409", "id": 1264223554, "node_id": "IC_kwDOCGYnMM5LWoVC", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-10-01T03:42:50Z", "updated_at": "2022-10-01T03:42:50Z", "author_association": "CONTRIBUTOR", "body": "oh weird. it inserts into db2", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1149661489, "label": "`with db:` for transactions"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/409#issuecomment-1264223363", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/409", "id": 1264223363, "node_id": "IC_kwDOCGYnMM5LWoSD", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-10-01T03:41:45Z", "updated_at": "2022-10-01T03:41:45Z", "author_association": "CONTRIBUTOR", "body": "```\r\npytest xklb/check.py --pdb\r\n\r\nxklb/check.py:11: in test_transaction\r\n assert list(db2[\"t\"].rows) == []\r\nE AssertionError: assert [{'foo': 1}] == []\r\nE + where [{'foo': 1}] = list()\r\nE + where = .rows\r\n>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>\r\n\r\n>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PDB post_mortem (IO-capturing turned off) 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>\r\n> /home/xk/github/xk/lb/xklb/check.py(11)test_transaction()\r\n 9 with db1.conn:\r\n 10 db1[\"t\"].insert({\"foo\": 1})\r\n---> 11 assert list(db2[\"t\"].rows) == []\r\n 12 assert list(db2[\"t\"].rows) == [{\"foo\": 1}]\r\n```\r\n\r\nIt fails because it is already inserted.\r\n\r\nbtw if you put these two lines in you pyproject.toml you can get `ipdb` in pytest\r\n\r\n```\r\n[tool.pytest.ini_options]\r\naddopts = \"--pdbcls=IPython.terminal.debugger:TerminalPdb --ignore=tests/data --capture=tee-sys --log-cli-level=ERROR\"\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1149661489, "label": "`with db:` for transactions"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/493#issuecomment-1264219650", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/493", "id": 1264219650, "node_id": "IC_kwDOCGYnMM5LWnYC", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-10-01T03:22:50Z", "updated_at": "2022-10-01T03:23:58Z", "author_association": "CONTRIBUTOR", "body": "this is likely what you are looking for: https://stackoverflow.com/a/51076749/697964\r\n\r\nbut yeah I would say just disable smart quotes", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1386562662, "label": "Tiny typographical error in install/uninstall docs"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/370#issuecomment-1261930179", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/370", "id": 1261930179, "node_id": "IC_kwDOBm6k_c5LN4bD", "user": {"value": 72577720, "label": "MichaelTiemannOSC"}, "created_at": "2022-09-29T08:17:46Z", 
"updated_at": "2022-09-29T08:17:46Z", "author_association": "CONTRIBUTOR", "body": "Just watched this video which demonstrates the integration of *any* webapp into JupyterLab: https://youtu.be/FH1dKKmvFtc\r\n\r\nMaybe this is the answer?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 377155320, "label": "Integration with JupyterLab"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1062#issuecomment-1260909128", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1062", "id": 1260909128, "node_id": "IC_kwDOBm6k_c5LJ_JI", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-28T13:22:53Z", "updated_at": "2022-09-28T14:09:54Z", "author_association": "CONTRIBUTOR", "body": "if you went this route:\r\n\r\n```python\r\nwith sqlite_timelimit(conn, time_limit_ms):\r\n c.execute(query)\r\n for chunk in c.fetchmany(chunk_size):\r\n yield from chunk\r\n```\r\n\r\nthen `time_limit_ms` would probably have to be greatly extended, because the time spent in the loop will depend on the downstream processing.\r\n\r\ni wonder if this was why you were thinking this feature would need a dedicated connection?\r\n\r\n---\r\n\r\nreading more, there's no real limit i can find on the number of active cursors (or more precisely active prepared statements objects, because sqlite doesn't really have cursors). 
\r\n\r\nmaybe something like this would be okay?\r\n\r\n```python\r\nwith sqlite_timelimit(conn, time_limit_ms):\r\n c.execute(query)\r\n # step through at least one to evaluate the statement, not sure if this is necessary\r\n yield c.fetchone()\r\nfor chunk in c.fetchmany(chunk_size):\r\n yield from chunk\r\n```\r\n\r\nit seems quite weird that there's not more of a limit on the number of active prepared statements, but i haven't been able to find one.\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 732674148, "label": "Refactor .csv to be an output renderer - and teach register_output_renderer to stream all rows"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1062#issuecomment-1260829829", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1062", "id": 1260829829, "node_id": "IC_kwDOBm6k_c5LJryF", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-28T12:27:19Z", "updated_at": "2022-09-28T12:27:19Z", "author_association": "CONTRIBUTOR", "body": "for teaching `register_output_renderer` to stream it seems like the two options are\r\n\r\n1. a [nested query technique](https://github.com/simonw/datasette/issues/526#issuecomment-505162238) to paginate through\r\n2. 
a fetching model that looks like something\r\n```python\r\nwith sqlite_timelimit(conn, time_limit_ms):\r\n c.execute(query)\r\n for chunk in c.fetchmany(chunk_size):\r\n yield from chunk\r\n```\r\ncurrently `db.execute` is not a generator, so this would probably need a new method?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 732674148, "label": "Refactor .csv to be an output renderer - and teach register_output_renderer to stream all rows"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/526#issuecomment-1259718517", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/526", "id": 1259718517, "node_id": "IC_kwDOBm6k_c5LFcd1", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-27T16:02:51Z", "updated_at": "2022-09-27T16:04:46Z", "author_association": "CONTRIBUTOR", "body": "i think that `max_returned_rows` **is** a defense mechanism, just not for connection exhaustion. 
`max_returned_rows` is a defense mechanism against **memory bombs**.\r\n\r\nif you are potentially yielding out hundreds of thousands or even millions of rows, you need to be quite careful about data flow to not run out of memory on the server, or on the client.\r\n\r\nyou have a lot of places in your code that are protective of that right now, but `max_returned_rows` acts as the final backstop.\r\n\r\nso, given that, it makes sense to have removing `max_returned_rows` altogether be a non-goal, but instead allow specific codepaths (like streaming csv's) to bypass it.\r\n\r\nthat could dramatically lower the surface area for a memory-bomb attack.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 459882902, "label": "Stream all results for arbitrary SQL and canned queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/526#issuecomment-1258910228", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/526", "id": 1258910228, "node_id": "IC_kwDOBm6k_c5LCXIU", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-27T03:11:07Z", "updated_at": "2022-09-27T03:11:07Z", "author_association": "CONTRIBUTOR", "body": "i think this feature would be safe, as it's really only the time limit that can, and imo should, protect against long running queries, as it is pretty easy to make very expensive queries that don't return many rows.\r\n\r\nmoving away from `max_returned_rows` will require some thinking about:\r\n\r\n1. memory usage and data flows to handle potentially very large result sets\r\n2. 
how to avoid rendering tens or hundreds of thousands of [html rows](#1655).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 459882902, "label": "Stream all results for arbitrary SQL and canned queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/526#issuecomment-1258878311", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/526", "id": 1258878311, "node_id": "IC_kwDOBm6k_c5LCPVn", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-27T02:19:48Z", "updated_at": "2022-09-27T02:19:48Z", "author_association": "CONTRIBUTOR", "body": "this sql query doesn't trip up `maximum_returned_rows` but does timeout\r\n\r\n```sql\r\nwith recursive counter(x) as (\r\n select 0\r\n union\r\n select x + 1 from counter\r\n )\r\n select * from counter LIMIT 10 OFFSET 100000000 \r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 459882902, "label": "Stream all results for arbitrary SQL and canned queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/526#issuecomment-1258871525", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/526", "id": 1258871525, "node_id": "IC_kwDOBm6k_c5LCNrl", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-27T02:09:32Z", "updated_at": "2022-09-27T02:14:53Z", "author_association": "CONTRIBUTOR", "body": "thanks @simonw, i learned something i didn't know about sqlite's execution model!\r\n\r\n> Imagine if Datasette CSVs did allow unlimited retrievals. 
Someone could hit the CSV endpoint for that recursive query and tie up Datasette's SQL connection effectively forever.\r\n\r\nwhy wouldn't the `sqlite_timelimit` guard prevent that?\r\n\r\n--- \r\non my local version which has the code to [turn off truncations for query csv](#1820), `sqlite_timelimit` does protect me.\r\n\r\n![Screenshot 2022-09-26 at 22-14-31 Error 500](https://user-images.githubusercontent.com/536941/192415680-94b32b7f-868f-4b89-8194-5752d45f6009.png)\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 459882902, "label": "Stream all results for arbitrary SQL and canned queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/526#issuecomment-1258849766", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/526", "id": 1258849766, "node_id": "IC_kwDOBm6k_c5LCIXm", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-27T01:27:03Z", "updated_at": "2022-09-27T01:27:03Z", "author_association": "CONTRIBUTOR", "body": "i agree with that concern! 
but if i'm understanding the code correctly, `maximum_returned_rows` does not protect against long-running queries in any way.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 459882902, "label": "Stream all results for arbitrary SQL and canned queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1820#issuecomment-1258803261", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1820", "id": 1258803261, "node_id": "IC_kwDOBm6k_c5LB9A9", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-27T00:03:09Z", "updated_at": "2022-09-27T00:03:09Z", "author_association": "CONTRIBUTOR", "body": "the pattern in this PR is that `max_returned_rows` controls the maximum rows rendered through html and json, and the csv renderer bypasses that.\r\n\r\ni think it would be better to have each of these different query renderers have more direct control over how many rows to fetch, instead of relying on the internals of the `execute` method.\r\n\r\ngenerally, users will not want to paginate through tens of thousands of results, but often will want to download a full query as json or as csv. ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1386456717, "label": "[SPIKE] Don't truncate query CSVs"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/491#issuecomment-1258712931", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/491", "id": 1258712931, "node_id": "IC_kwDOCGYnMM5LBm9j", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-09-26T22:31:58Z", "updated_at": "2022-09-26T22:31:58Z", "author_association": "CONTRIBUTOR", "body": "Right. 
The backup command will copy tables completely, but in the case of conflicting table names, the destination gets overwritten silently. That might not be what you want here. ", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1383646615, "label": "Ability to merge databases and tables"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/491#issuecomment-1258508215", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/491", "id": 1258508215, "node_id": "IC_kwDOCGYnMM5LA0-3", "user": {"value": 25778, "label": "eyeseast"}, "created_at": "2022-09-26T19:22:14Z", "updated_at": "2022-09-26T19:22:14Z", "author_association": "CONTRIBUTOR", "body": "This might be fairly straightforward using SQLite's backup utility: https://docs.python.org/3/library/sqlite3.html#sqlite3.Connection.backup\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1383646615, "label": "Ability to merge databases and tables"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/526#issuecomment-1258337011", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/526", "id": 1258337011, "node_id": "IC_kwDOBm6k_c5LALLz", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-26T16:49:48Z", "updated_at": "2022-09-26T16:49:48Z", "author_association": "CONTRIBUTOR", "body": "i think the smallest change that gets close to what i want is to change the behavior so that `max_returned_rows` is not applied in the `execute` method when we are asking for a csv of a query.\r\n\r\nthere are some infelicities for that approach, but i'll make a PR to make it easier to discuss.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, 
\"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 459882902, "label": "Stream all results for arbitrary SQL and canned queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/526#issuecomment-1258167564", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/526", "id": 1258167564, "node_id": "IC_kwDOBm6k_c5K_h0M", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-26T14:57:44Z", "updated_at": "2022-09-26T15:08:36Z", "author_association": "CONTRIBUTOR", "body": "reading the database execute method i have a few questions.\r\n\r\nhttps://github.com/simonw/datasette/blob/cb1e093fd361b758120aefc1a444df02462389a3/datasette/database.py#L229-L242\r\n\r\n---\r\nunless i'm missing something (which is very likely!!), the `max_returned_rows` argument doesn't actually offer any protections against running very expensive queries. \r\n\r\nIt's not like adding a `LIMIT max_rows` argument. it make sense that it isn't because, the query could already have an `LIMIT` argument. Doing something like `select * from (query) limit {max_returned_rows}` **might** be protective but wouldn't always.\r\n\r\nInstead the code executes the full original query, and if still has time it fetches out the first `max_rows + 1` rows. \r\n\r\nthis *does* offer some protection of memory exhaustion, as you won't hydrate a huge result set into python (however, there are [data flow patterns](https://github.com/simonw/datasette/issues/1727#issuecomment-1258129113) that could avoid that too)\r\n\r\ngiven the current architecture, i don't see how creating a new connection would be use?\r\n\r\n---\r\n\r\nIf we just removed the `max_return_rows` limitation, then i think most things would be fine **except** for the QueryViews. 
Right now, rendering just [5000 rows takes a lot of client-side memory](https://github.com/simonw/datasette/issues/1655), so some form of pagination would be required.\r\n\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 459882902, "label": "Stream all results for arbitrary SQL and canned queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1655#issuecomment-1258166572", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1655", "id": 1258166572, "node_id": "IC_kwDOBm6k_c5K_hks", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-26T14:57:04Z", "updated_at": "2022-09-26T14:57:04Z", "author_association": "CONTRIBUTOR", "body": "I think that paginating, even in javascript, could be very helpful. Maybe render json or csv into the page and let javascript load that into the dom?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1163369515, "label": "query result page is using 400mb of browser memory 40x size of html page and 400x size of csv data"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1727#issuecomment-1258129113", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1727", "id": 1258129113, "node_id": "IC_kwDOBm6k_c5K_YbZ", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-26T14:30:11Z", "updated_at": "2022-09-26T14:48:31Z", "author_association": "CONTRIBUTOR", "body": "from your analysis, it seems like the GIL is blocking on loading of the data from sqlite to python, (particularly in the `fetchmany` call)\r\n\r\nthis is probably a simplistic idea, but what if you had the python code in the `execute` method iterate over the cursor and yield out rows or small 
chunks of rows.\r\n\r\nsomething like: \r\n```python\r\n with sqlite_timelimit(conn, time_limit_ms):\r\n try:\r\n cursor = conn.cursor()\r\n cursor.execute(sql, params if params is not None else {})\r\n except:\r\n ...\r\n max_returned_rows = self.ds.max_returned_rows\r\n if max_returned_rows == page_size:\r\n max_returned_rows += 1\r\n if max_returned_rows and truncate:\r\n for i, row in enumerate(cursor):\r\n yield row\r\n if i == max_returned_rows - 1:\r\n break\r\n else:\r\n for row in cursor:\r\n yield row\r\n truncated = False \r\n```\r\n\r\nthis kind of thing works well with a postgres server side cursor, but i'm not sure if it will hold for sqlite. \r\n\r\nyou would still spend about the same amount of time in python and would be contending for the gil, but it could be non-blocking.\r\n\r\ndepending on the data flow, this could also have some benefit for memory. (data stays in more compact sqlite-land until you need it)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1217759117, "label": "Research: demonstrate if parallel SQL queries are worthwhile"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/491#issuecomment-1256858763", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/491", "id": 1256858763, "node_id": "IC_kwDOCGYnMM5K6iSL", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-09-24T04:50:59Z", "updated_at": "2022-09-24T04:52:08Z", "author_association": "CONTRIBUTOR", "body": "Instead of outputting binary data to stdout the interface might be better like this\r\n\r\n```\r\nsqlite-utils merge animals.db cats.db dogs.db\r\n```\r\n\r\nsimilar to `zip`, `ogr2ogr`, etc\r\n\r\nActually I think this might already be possible within `ogr2ogr`. 
I don't believe spatial data is a requirement though it might add an `ogc_id` column or something\r\n\r\n```\r\ncp cats.db animals.db\r\nogr2ogr -append animals.db dogs.db\r\nogr2ogr -append animals.db another.db\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1383646615, "label": "Ability to merge databases and tables"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1817#issuecomment-1256781274", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1817", "id": 1256781274, "node_id": "IC_kwDOBm6k_c5K6PXa", "user": {"value": 50527, "label": "jefftriplett"}, "created_at": "2022-09-23T22:59:46Z", "updated_at": "2022-09-23T22:59:46Z", "author_association": "CONTRIBUTOR", "body": "While you are adding features, would you be future-proofing your APIs if you switched over some arguments over to keyword-only arguments or would that be too disruptive?\r\n\r\nThinking out loud:\r\n\r\n```\r\nasync def render_template( \r\n self, templates, *, context=None, plugin_context=None, request=None, view_name=None \r\n ): \r\n```\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1384273985, "label": "Expose `sql` and `params` arguments to various plugin hooks"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/526#issuecomment-1254064260", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/526", "id": 1254064260, "node_id": "IC_kwDOBm6k_c5Kv4CE", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-09-21T18:17:04Z", "updated_at": "2022-09-21T18:18:01Z", "author_association": "CONTRIBUTOR", "body": "hi @simonw, this is becoming more of a bother for my [labor data warehouse](https://labordata.bunkum.us/). 
Is there any research or a spike i could do that would help you investigate this issue?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 459882902, "label": "Stream all results for arbitrary SQL and canned queries"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/433#issuecomment-1252898131", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/433", "id": 1252898131, "node_id": "IC_kwDOCGYnMM5KrbVT", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-09-20T20:51:21Z", "updated_at": "2022-09-20T20:56:07Z", "author_association": "CONTRIBUTOR", "body": "When I run `reset` it fixes my terminal. I suspect it is related to the progress bar\r\n\r\nhttps://linux.die.net/man/1/reset\r\n\r\n```\r\n950 1s /m/d/03_Downloads \ud83d\udc11 echo $TERM\r\nxterm-kitty\r\n\u2593\u2591\u2592\u2591 /m/d/03_Downloads \ud83c\udf0f kitty -v\r\nkitty 0.26.2 created by Kovid Goyal\r\n$ sqlite-utils insert test.db facility facility-boundary-us-all.csv --csv\r\nblah blah blah (no offense)\r\n$ \r\n$ reset\r\n$ \r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1239034903, "label": "CLI eats my cursor"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1813#issuecomment-1250901367", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1813", "id": 1250901367, "node_id": "IC_kwDOBm6k_c5Kjz13", "user": {"value": 883348, "label": "adipasquale"}, "created_at": "2022-09-19T11:34:45Z", "updated_at": "2022-09-19T11:34:45Z", "author_association": "CONTRIBUTOR", "body": "oh and by writing this I just realized the difference: the URL on fly.io is with a custom SQL command whereas the local one is without. 
\r\nIt seems that there is no pagination when using custom SQL commands which makes sense\r\n\r\nSorry for this useless issue, maybe this can be useful for someone else / me in the future.\r\n\r\nThanks again for this wonderful project !", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1377811868, "label": "missing next and next_url in JSON responses from an instance deployed on Fly "}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1810#issuecomment-1248204219", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1810", "id": 1248204219, "node_id": "IC_kwDOBm6k_c5KZhW7", "user": {"value": 82988, "label": "psychemedia"}, "created_at": "2022-09-15T14:44:47Z", "updated_at": "2022-09-15T14:46:26Z", "author_association": "CONTRIBUTOR", "body": "A couple+ of possible use case examples:\r\n\r\n- someone has a collection of articles indexed with FTS; they want to publish a simple search tool over the results;\r\n- someone has an image collection and they want to be able to search over description text to return images;\r\n- someone has a set of locations with descriptions, and wants to run a query over places and descriptions and get results as a listing or on a map;\r\n- someone has a set of audio or video files with titles, descriptions and/or transcripts, and wants to be able to search over them and return playable versions of returned items.\r\n\r\nIn many cases, I suspect the raw content will be in one table, but the search table will be a second (eg FTS) table. 
Generally, the search may be over one or more joined tables, and the results constructed from one or more tables (which may or may not be distinct from the search tables).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1374626873, "label": "Featured table(s) on the homepage"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1685#issuecomment-1237381620", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1685", "id": 1237381620, "node_id": "IC_kwDOBm6k_c5JwPH0", "user": {"value": 49699333, "label": "dependabot[bot]"}, "created_at": "2022-09-05T18:36:47Z", "updated_at": "2022-09-05T18:36:47Z", "author_association": "CONTRIBUTOR", "body": "Looks like jinja2 is no longer updatable, so this is no longer needed.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1180778860, "label": "Update jinja2 requirement from <3.1.0,>=2.10.3 to >=2.10.3,<3.2.0"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1799#issuecomment-1237381569", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1799", "id": 1237381569, "node_id": "IC_kwDOBm6k_c5JwPHB", "user": {"value": 49699333, "label": "dependabot[bot]"}, "created_at": "2022-09-05T18:36:42Z", "updated_at": "2022-09-05T18:36:42Z", "author_association": "CONTRIBUTOR", "body": "Looks like aiofiles is no longer updatable, so this is no longer needed.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1362242558, "label": "Update aiofiles requirement from <0.9,>=0.4 to >=0.4,<22.2"}, "performed_via_github_app": null} {"html_url": 
"https://github.com/simonw/sqlite-utils/pull/480#issuecomment-1232356302", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/480", "id": 1232356302, "node_id": "IC_kwDOCGYnMM5JdEPO", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-08-31T01:51:49Z", "updated_at": "2022-08-31T01:51:49Z", "author_association": "CONTRIBUTOR", "body": "Thanks for pointing me to the right place", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1355433619, "label": "search_sql add include_rank option"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/467#issuecomment-1224382336", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/467", "id": 1224382336, "node_id": "IC_kwDOCGYnMM5I-peA", "user": {"value": 50527, "label": "jefftriplett"}, "created_at": "2022-08-23T17:16:13Z", "updated_at": "2022-08-23T17:16:13Z", "author_association": "CONTRIBUTOR", "body": "> Should passing `alter=True` also drop any columns that aren't included in the new table structure?\r\n> \r\n> It could even spot column types that aren't correct and fix those.\r\n> \r\n> Is that consistent with the expectations set by how `alter=True` works elsewhere?\r\n\r\nI would lean towards not dropping them (or making a `drop=True` or `drop_columns=True`or `drop_missing_columns=True`) to work with existing tables easier. \r\n\r\nI do like that sqlite-utils mostly just works with existing tables but it's also nice to add to existing fields in a few cases. 
\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1348169997, "label": "Mechanism for ensuring a table has all the columns"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1789#issuecomment-1223347322", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1789", "id": 1223347322, "node_id": "IC_kwDOBm6k_c5I6sx6", "user": {"value": 15178711, "label": "asg017"}, "created_at": "2022-08-23T00:03:20Z", "updated_at": "2022-08-23T00:03:20Z", "author_association": "CONTRIBUTOR", "body": "@simonw to build the extension on ubuntu, you can run:\r\n\r\n```\r\napt-get update && apt-get install libsqlite3-dev gcc\r\ngcc ext.c -fPIC -shared -o ext.so\r\n```\r\n\r\nI'm not the best with Actions, but if you set the cache key to `ext.c`, run those two commands to download dependencies + compile to `ext.so`, then the unit test should pick it up and run it correctly. Let me know if you want me to update the PR with that added", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1344823170, "label": "Add new entrypoint option to `--load-extension`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/pull/1789#issuecomment-1221576460", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1789", "id": 1221576460, "node_id": "IC_kwDOBm6k_c5Iz8cM", "user": {"value": 15178711, "label": "asg017"}, "created_at": "2022-08-21T16:16:42Z", "updated_at": "2022-08-21T16:16:42Z", "author_association": "CONTRIBUTOR", "body": "Rebased, Read the docs failure should now now fixed\r\n\r\nRe docs - ya that's a pretty ambitious page, I'm still not 100% sure what the best practices are/should be... 
Would be happy to make that page in a future PR", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1344823170, "label": "Add new entrypoint option to `--load-extension`"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1779#issuecomment-1214437408", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1779", "id": 1214437408, "node_id": "IC_kwDOBm6k_c5IYtgg", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-08-14T19:42:58Z", "updated_at": "2022-08-14T19:42:58Z", "author_association": "CONTRIBUTOR", "body": "thanks @simonw!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1334628400, "label": "google cloudrun updated their limits on maxscale based on memory and cpu count"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1779#issuecomment-1210675046", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1779", "id": 1210675046, "node_id": "IC_kwDOBm6k_c5IKW9m", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-08-10T13:28:37Z", "updated_at": "2022-08-10T13:28:37Z", "author_association": "CONTRIBUTOR", "body": "maybe a simpler solution is to set the maxscale to like 2? 
since datasette is not set up to make use of container scaling anyway?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1334628400, "label": "google cloudrun updated their limits on maxscale based on memory and cpu count"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1191#issuecomment-1200732975", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1191", "id": 1200732975, "node_id": "IC_kwDOBm6k_c5Hkbsv", "user": {"value": 2670795, "label": "brandonrobertz"}, "created_at": "2022-08-01T05:39:27Z", "updated_at": "2022-08-01T05:39:27Z", "author_association": "CONTRIBUTOR", "body": "I've got a URL shortening plugin that I would like to embed on the query page but I'd like avoid capturing the entire `query.html` template. A feature like this would solve it. Where's this at and how can I help?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 787098345, "label": "Ability for plugins to collaborate when adding extra HTML to blocks in default templates"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/456#issuecomment-1190277829", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/456", "id": 1190277829, "node_id": "IC_kwDOCGYnMM5G8jLF", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-07-20T13:19:15Z", "updated_at": "2022-07-20T13:19:15Z", "author_association": "CONTRIBUTOR", "body": "hadley wickham's melt and reshape could be good inspo: http://had.co.nz/reshape/introduction.pdf", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1310243385, "label": "feature request: pivot 
command"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/456#issuecomment-1190272780", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/456", "id": 1190272780, "node_id": "IC_kwDOCGYnMM5G8h8M", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-07-20T13:14:54Z", "updated_at": "2022-07-20T13:14:54Z", "author_association": "CONTRIBUTOR", "body": "for example, i have data on votes that look like this:\r\n\r\n| ballot_id | option_id | choice |\r\n|-|-|-|\r\n| 1 | 1 | 0 | \r\n| 1 | 2 | 1 |\r\n| 1 | 3 | 0 |\r\n| 1 | 4 | 1 |\r\n| 2 | 1 | 1 |\r\n| 2 | 2 | 0 |\r\n| 2 | 3 | 1 |\r\n| 2 | 4 | 0 |\r\n\r\nand i want to reshape from this long form to this wide form:\r\n\r\n| ballot_id | option_id_1 | option_id_2 | option_id_3 | option_id_ 4|\r\n|-|-|-|-| -|\r\n| 1 | 0 | 1 | 0 | 1 | \r\n| 2 | 1 | 0 | 1| 0 | \r\n\r\ni could do such a think like this.\r\n\r\n```sql\r\nselect ballot_id, \r\nsum(choice) filter (where option_id = 1) as option_id_1,\r\nsum(choice) filter (where option_id = 2) as option_id_2,\r\nsum(choice) filter (where option_id = 3) as option_id_3,\r\nsum(choice) filter (where option_id = 4) as option_id_4\r\nfrom vote\r\ngroup by ballot_id\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1310243385, "label": "feature request: pivot command"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/423#issuecomment-1189010812", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/423", "id": 1189010812, "node_id": "IC_kwDOCGYnMM5G3t18", "user": {"value": 536941, "label": "fgregg"}, "created_at": "2022-07-19T12:47:39Z", "updated_at": "2022-07-19T12:47:39Z", "author_association": "CONTRIBUTOR", "body": "just ran into this!", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, 
\"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1199158210, "label": ".extract() doesn't set foreign key when extracted columns contain NULL value"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/449#issuecomment-1179579878", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/449", "id": 1179579878, "node_id": "IC_kwDOCGYnMM5GTvXm", "user": {"value": 1690072, "label": "davidleejy"}, "created_at": "2022-07-09T17:41:32Z", "updated_at": "2022-07-09T17:41:50Z", "author_association": "CONTRIBUTOR", "body": "Learnt that the types in Sqlite-utils differ somewhat from those in Sqlite. I've changed my test to account for this difference and the test has passed successfully. I will submit a PR.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1279863844, "label": "Utilities for duplicating tables and creating a table with the results of a query"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/449#issuecomment-1174027079", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/449", "id": 1174027079, "node_id": "IC_kwDOCGYnMM5F-jtH", "user": {"value": 1690072, "label": "davidleejy"}, "created_at": "2022-07-04T17:33:04Z", "updated_at": "2022-07-04T17:48:43Z", "author_association": "CONTRIBUTOR", "body": "I've written the code and test. Would you be able to advise how to compare table columns in a pytest function properly? 
Experiencing a challenge when comparing columns.\r\n\r\nTest:\r\n```python\r\ndef test_duplicate(fresh_db):\r\n table = fresh_db.create_table(\r\n \"table1\",\r\n {\r\n \"text_col\": str,\r\n \"float_col\": float,\r\n \"int_col\": int,\r\n \"bool_col\": bool,\r\n \"bytes_col\": bytes,\r\n \"datetime_col\": datetime.datetime,\r\n },\r\n )\r\n dt = datetime.datetime.now()\r\n b = bytes('hello world', 'utf-8')\r\n data = {\"text_col\": \"Cleo\", \r\n \"float_col\": 3.14,\r\n \"int_col\": -2,\r\n \"bool_col\": True,\r\n \"bytes_col\": b,\r\n \"datetime_col\": str(dt)}\r\n table1 = fresh_db[\"table1\"]\r\n row_id = table1.insert(data).last_rowid\r\n table1.duplicate('table2')\r\n table2 = fresh_db[\"table2\"]\r\n assert data == table2.get(row_id)\r\n assert table1.columns == table2.columns # FAILS HERE\r\n```\r\n\r\nResult:\r\n![Screenshot 2022-07-05 at 1 31 55 AM](https://user-images.githubusercontent.com/1690072/177198814-daac48c9-5746-49d0-a14a-14fe181c5a2f.png)\r\n\r\nFailure is due to column types being named differently -- e.g. 'FLOAT' vs 'REAL', 'INTEGER' vs 'INT'. How should I go about comparing columns while accounting for equivalent types?\r\n\r\nOr did I miss out something in my duplication code correctly? 
Here's how I did it: in `db.py`, I've added the following code:\r\n```python\r\nclass Table(Queryable):\r\n [...]\r\n def duplicate(\r\n self, \r\n name_new: str\r\n ) -> \"Table\":\r\n \"\"\"\r\n Duplicate this table in this database.\r\n\r\n :param name_new: Name of new table.\r\n \"\"\"\r\n assert self.exists()\r\n with self.db.conn:\r\n sql = \"CREATE TABLE [{new_table}] AS SELECT * FROM [{table}];\".format(\r\n new_table = name_new,\r\n table = self.name,\r\n )\r\n self.db.execute(sql)\r\n return self.db[name_new]\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1279863844, "label": "Utilities for duplicating tables and creating a table with the results of a query"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/datasette/issues/1713#issuecomment-1173358747", "issue_url": "https://api.github.com/repos/simonw/datasette/issues/1713", "id": 1173358747, "node_id": "IC_kwDOBm6k_c5F8Aib", "user": {"value": 2670795, "label": "brandonrobertz"}, "created_at": "2022-07-04T05:16:35Z", "updated_at": "2022-07-04T05:16:35Z", "author_association": "CONTRIBUTOR", "body": "This feature is pretty important and would be nice if it would be all within Datasette (no separate CLI/deploy required). My workflow now is to basically just copy the result and paste into a Google Sheet, which works, but then it's not discoverable to other journalists browsing the Datasette instance. 
I started building a plugin similar to [datasette-saved-queries](https://datasette.io/plugins/datasette-saved-queries) but one that maintains its own DB (required if you're working with all immutable DBs), but got bogged down in details.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1203943272, "label": "Datasette feature for publishing snapshots of query results"}, "performed_via_github_app": null}