github: issue_comments: 616 rows where author_association = "CONTRIBUTOR" sorted by updated

616 rows where author_association = "CONTRIBUTOR" sorted by updated_at descending

id	html_url	issue_url	node_id	user	created_at	updated_at ▲	author_association	body	reactions	issue
1316256386	https://github.com/simonw/datasette/pull/1893#issuecomment-1316256386	https://api.github.com/repos/simonw/datasette/issues/1893	IC_kwDOBm6k_c5OdHqC	bgrins 95570	2022-11-16T03:18:06Z	2022-11-16T03:18:06Z	CONTRIBUTOR	If you can get a version of this working with table and column autocompletion just using a static JavaScript object in the source code with the right tables and columns, I'm happy to take on the work of turning that static object into something that Datasette includes in the page itself with all of the correct values. This version "sort of" works when on the main database page where the template passes the relevant data https://github.com/bgrins/datasette/commit/8431c98850c7a552dbcde2a4dd0c3dc942a97d25 by doing this and passing that into the `schema` object: ``` let TABLES_DATA = []; {% if tables is defined %} TABLES_DATA = {{ tables | tojson(indent=2) }}; {% endif %} // Turn into an object, shaped like https://github.com/codemirror/lang-sql/blob/ebf115fffdbe07f91465ccbd82868c587f8182bc/test/test-complete.ts#L27. const TABLES_SCHEMA = Object.fromEntries( new Map( TABLES_DATA.map((table) => { return [table.name, table.columns]; }) ).entries() ); ``` But there are a number of papercuts with it - it's not escaping table names with spaces (likely fixable from the data being passed into the view) but mainly it doesn't seem to autocomplete columns. I think it might only want to do it when you first type the table name from my read of https://github.com/codemirror/lang-sql/blob/ebf115fffdbe07f91465ccbd82868c587f8182bc/test/test-complete.ts#L37. It's possible I'm just passing something wrong, but it may end up being something that needs feature work upstream.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Upgrade to CodeMirror 6, add SQL autocomplete 1450363982
1316243602	https://github.com/simonw/datasette/pull/1893#issuecomment-1316243602	https://api.github.com/repos/simonw/datasette/issues/1893	IC_kwDOBm6k_c5OdEiS	bgrins 95570	2022-11-16T03:11:46Z	2022-11-16T03:11:46Z	CONTRIBUTOR	Was just reviewing the SQL options and there's an upperCaseKeywords if we'd rather have SELECT vs select. Datasette seems to prefer lowercase so probably best to keep it as-is	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Upgrade to CodeMirror 6, add SQL autocomplete 1450363982
1316041828	https://github.com/simonw/datasette/pull/1893#issuecomment-1316041828	https://api.github.com/repos/simonw/datasette/issues/1893	IC_kwDOBm6k_c5OcTRk	bgrins 95570	2022-11-15T23:51:35Z	2022-11-15T23:51:35Z	CONTRIBUTOR	I experimented with autocompleting the actual schema in https://github.com/bgrins/datasette/commit/8431c98850c7a552dbcde2a4dd0c3dc942a97d25, but it would need some work (current problems with it listed in the commit message there)	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Upgrade to CodeMirror 6, add SQL autocomplete 1450363982
1315869946	https://github.com/simonw/datasette/pull/1893#issuecomment-1315869946	https://api.github.com/repos/simonw/datasette/issues/1893	IC_kwDOBm6k_c5ObpT6	bgrins 95570	2022-11-15T21:12:38Z	2022-11-15T21:12:38Z	CONTRIBUTOR	https://github.com/Sphinxxxx/cm-resize isn't compatible with 6. There's a suggestion to try using CSS resize in https://discuss.codemirror.net/t/resizing-codemirror-6/3265/2	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Upgrade to CodeMirror 6, add SQL autocomplete 1450363982
1315869040	https://github.com/simonw/datasette/pull/1893#issuecomment-1315869040	https://api.github.com/repos/simonw/datasette/issues/1893	IC_kwDOBm6k_c5ObpFw	bgrins 95570	2022-11-15T21:11:42Z	2022-11-15T21:11:42Z	CONTRIBUTOR	extraKeys is done - Shift+Enter is added in the helper function, and it appears that the Tab behavior now defaults to what the `Tab: false` setting was doing (allowing it to escape to the form)	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Upgrade to CodeMirror 6, add SQL autocomplete 1450363982
1315853097	https://github.com/simonw/datasette/pull/1893#issuecomment-1315853097	https://api.github.com/repos/simonw/datasette/issues/1893	IC_kwDOBm6k_c5OblMp	bgrins 95570	2022-11-15T20:55:40Z	2022-11-15T20:55:40Z	CONTRIBUTOR	Should also minify the bundled output	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Upgrade to CodeMirror 6, add SQL autocomplete 1450363982
1314241058	https://github.com/simonw/datasette/issues/1886#issuecomment-1314241058	https://api.github.com/repos/simonw/datasette/issues/1886	IC_kwDOBm6k_c5OVboi	eyeseast 25778	2022-11-14T19:06:35Z	2022-11-14T19:06:35Z	CONTRIBUTOR	This probably counts as a case study: https://github.com/eyeseast/spatial-data-cooking-show. Even has video. Seriously, though, this workflow has become integral to my work with reporters and editors across USA TODAY Network. Very often, I get sent a folder of data in mixed formats, with a vague ask of how we should communicate some part of it to users. Datasette and its constellation of tools makes it easy to get a quick look at that data, run exploratory queries, map it and ask questions to figure out what's important to show. And then I export a version of the data that's exactly what I need for display.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Call for birthday presents: if you're using Datasette, let us know how you're using it here 1447050738
1314066229	https://github.com/simonw/datasette/issues/1884#issuecomment-1314066229	https://api.github.com/repos/simonw/datasette/issues/1884	IC_kwDOBm6k_c5OUw81	eyeseast 25778	2022-11-14T16:48:35Z	2022-11-14T16:48:35Z	CONTRIBUTOR	I'm realizing I don't know if a virtual table will ever return a count. Maybe it depends on the implementation. For these three, just checking now, it'll always return zero. That said, I'm not sure there's any downside to having them return zero and caching that. (They're hidden, too.)	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Exclude virtual tables from datasette inspect 1439009231
1313962183	https://github.com/simonw/datasette/issues/1884#issuecomment-1313962183	https://api.github.com/repos/simonw/datasette/issues/1884	IC_kwDOBm6k_c5OUXjH	eyeseast 25778	2022-11-14T15:46:32Z	2022-11-14T15:46:32Z	CONTRIBUTOR	It does work, though I think it's probably still worth excluding virtual tables that will always be zero. Here's the same inspection as before, now with `--load-extension spatialite`: json { "alltheplaces": { "hash": "0843cfe414439ab903c22d1121b7ddbc643418c35c7f0edbcec82ef1452411df", "size": 963375104, "file": "alltheplaces.db", "tables": { "spatial_ref_sys": { "count": 6215 }, "spatialite_history": { "count": 18 }, "sqlite_sequence": { "count": 2 }, "geometry_columns": { "count": 3 }, "spatial_ref_sys_aux": { "count": 6164 }, "views_geometry_columns": { "count": 0 }, "virts_geometry_columns": { "count": 0 }, "geometry_columns_statistics": { "count": 3 }, "views_geometry_columns_statistics": { "count": 0 }, "virts_geometry_columns_statistics": { "count": 0 }, "geometry_columns_field_infos": { "count": 0 }, "views_geometry_columns_field_infos": { "count": 0 }, "virts_geometry_columns_field_infos": { "count": 0 }, "geometry_columns_time": { "count": 3 }, "geometry_columns_auth": { "count": 3 }, "views_geometry_columns_auth": { "count": 0 }, "virts_geometry_columns_auth": { "count": 0 }, "data_licenses": { "count": 10 }, "sql_statements_log": { "count": 0 }, "states": { "count": 56 }, "counties": { "count": 3234 }, "idx_states_geometry_rowid": { "count": 56 }, "idx_states_geometry_node": { "count": 3 }, "idx_states_geometry_parent": { "count": 2 }, "idx_counties_geometry_rowid": { "count": 3234 }, "idx_counties_geometry_node": { "count": 98 }, "idx_counties_geometry_parent": { "count": 97 }, "idx_places_geometry_rowid": { "count": 1236796 }, "idx_places_geometry_node": { "count": 38163 }, "idx_places_geometry_parent": { "count": 38162 }, "places": { "count": 1332609 }, "SpatialIndex": { "count": 0 }, "ElementaryGeometries": { "count": 0 }, 
"KNN": { "count": 0 }, "idx_states_geometry": { "count": 56 }, "idx_counties_geometry": { "count": 3234 }, "idx_places_geometry": { "count": 1236796 } } } }	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Exclude virtual tables from datasette inspect 1439009231
1313252879	https://github.com/simonw/datasette/issues/1886#issuecomment-1313252879	https://api.github.com/repos/simonw/datasette/issues/1886	IC_kwDOBm6k_c5ORqYP	adipasquale 883348	2022-11-14T08:10:23Z	2022-11-14T08:10:23Z	CONTRIBUTOR	Hi @simonw and thanks for the great tools you're publishing, your dedication is inspiring! I work for the French Ministry of Culture on a surveying tool for objects protected for their historical value. It is part of a program building modern public services called beta.gouv.fr. In that context I'm using data published by the Ministry that I have ingested into datasette and published on a free Fly instance : https://collectif-objets-datasette.fly.dev . I have also ingested another data set with infos about french cities on this instance so that I can perform joined queries. The surveying tool synchronizes its data regularly from this datasette instance, and I also use it to perform queries when asked generic questions about the distribution of objects. (The data is not very accessible as it's undocumented and for internal usage mostly)	{ "total_count": 3, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 3, "rocket": 0, "eyes": 0 }	Call for birthday presents: if you're using Datasette, let us know how you're using it here 1447050738
1309735529	https://github.com/simonw/datasette/issues/1884#issuecomment-1309735529	https://api.github.com/repos/simonw/datasette/issues/1884	IC_kwDOBm6k_c5OEPpp	eyeseast 25778	2022-11-10T03:57:23Z	2022-11-10T03:57:23Z	CONTRIBUTOR	Here's how to get a list of virtual tables: https://stackoverflow.com/questions/46617118/how-to-fetch-names-of-virtual-tables	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Exclude virtual tables from datasette inspect 1439009231
1309650806	https://github.com/simonw/datasette/issues/1871#issuecomment-1309650806	https://api.github.com/repos/simonw/datasette/issues/1871	IC_kwDOBm6k_c5OD692	davidbgk 3556	2022-11-10T01:38:58Z	2022-11-10T01:38:58Z	CONTRIBUTOR	Realized the API explorer doesn't need the API key piece at all - it can work with standard cookie-based auth. This also reflects how most plugins are likely to use this API, where they'll be adding JavaScript that uses `fetch()` to call the write API directly. I agree (that's what I did with the previous insert plugin), maybe a complete example using `fetch()` in the documentation would be valuable as a “Getting started with the API” or similar?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	API explorer tool 1427293909
1304320521	https://github.com/simonw/sqlite-utils/issues/511#issuecomment-1304320521	https://api.github.com/repos/simonw/sqlite-utils/issues/511	IC_kwDOCGYnMM5NvloJ	chapmanjacobd 7908073	2022-11-04T22:54:09Z	2022-11-04T22:59:54Z	CONTRIBUTOR	I ran `PRAGMA integrity_check` and it returned `ok`. but then I tried restoring from a backup and I didn't get this `IntegrityError: constraint failed` error. So I think it was just something wrong with my database. If it happens again I will first try to reindex and see if that fixes the issue	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	[insert_all, upsert_all] IntegrityError: constraint failed 1436539554
1304078945	https://github.com/simonw/sqlite-utils/issues/511#issuecomment-1304078945	https://api.github.com/repos/simonw/sqlite-utils/issues/511	IC_kwDOCGYnMM5Nuqph	chapmanjacobd 7908073	2022-11-04T19:38:36Z	2022-11-04T20:13:17Z	CONTRIBUTOR	Even more bizarre, the source db only has one record and the target table has no conflicting record: 875 0.3s lb:/ (main\|✚2) [0\|0]🌺 sqlite-utils tube_71.db 'select * from media where path = "https://archive.org/details/088ghostofachanceroygetssackedrevengeofthelivinglunchdvdripxvidphz"' \| jq [ { "size": null, "time_created": null, "play_count": 1, "language": null, "view_count": null, "width": null, "height": null, "fps": null, "average_rating": null, "live_status": null, "age_limit": null, "uploader": null, "time_played": 0, "path": "https://archive.org/details/088ghostofachanceroygetssackedrevengeofthelivinglunchdvdripxvidphz", "id": "088ghostofachanceroygetssackedrevengeofthelivinglunchdvdripxvidphz/074 - Home Away from Home, Rainy Day Robot, Odie the Amazing DVDRip XviD [PhZ].mkv", "ie_key": "ArchiveOrg", "playlist_path": "https://archive.org/details/088ghostofachanceroygetssackedrevengeofthelivinglunchdvdripxvidphz", "duration": 1424.05, "tags": null, "title": "074 - Home Away from Home, Rainy Day Robot, Odie the Amazing DVDRip XviD [PhZ].mkv" } ] 875 0.3s lb:/ (main\|✚2) [0\|0]🥧 sqlite-utils video.db 'select * from media where path = "https://archive.org/details/088ghostofachanceroygetssackedrevengeofthelivinglunchdvdripxvidphz"' \| jq [] I've been able to use this code successfully several times before so not sure what's causing the issue. I guess the way that I'm handling multiple databases is an issue, though it hasn't ever inserted into the source db, not sure what's different. The only reasonable explanation is that it is trying to insert into the source db from the source db for some reason? 
Or maybe sqlite3 is checking the source db for primary key violation because the table name is the same	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	[insert_all, upsert_all] IntegrityError: constraint failed 1436539554
1303660293	https://github.com/simonw/sqlite-utils/issues/50#issuecomment-1303660293	https://api.github.com/repos/simonw/sqlite-utils/issues/50	IC_kwDOCGYnMM5NtEcF	chapmanjacobd 7908073	2022-11-04T14:38:36Z	2022-11-04T14:38:36Z	CONTRIBUTOR	where did you see the limit as 999? I believe the limit has been 32766 for quite some time. If you could detect which one this could speed up batch insert of some types of data significantly	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	"Too many SQL variables" on large inserts 473083260
1297859539	https://github.com/simonw/sqlite-utils/issues/507#issuecomment-1297859539	https://api.github.com/repos/simonw/sqlite-utils/issues/507	IC_kwDOCGYnMM5NW8PT	chapmanjacobd 7908073	2022-11-01T00:40:16Z	2022-11-01T00:40:16Z	CONTRIBUTOR	Ideally people could fix their data if they run into this issue. If you are using filenames, try convmv: `convmv --preserve-mtimes -f utf8 -t utf8 --notest -i -r .` Maybe this script will also help: ```py import argparse, shutil from pathlib import Path import ftfy from xklb import utils from xklb.utils import log def parse_args() -> argparse.Namespace: parser = argparse.ArgumentParser() parser.add_argument("paths", nargs='*') parser.add_argument("--verbose", "-v", action="count", default=0) args = parser.parse_args() log.info(utils.dict_filter_bool(args.__dict__)) return args def rename_invalid_paths() -> None: args = parse_args() for path in args.paths: log.info(path) for p in sorted([str(p) for p in Path(path).rglob("*")], key=len): fixed = ftfy.fix_text(p, uncurl_quotes=False).replace("\r\n", "\n").replace("\r", "\n").replace("\n", "") if p != fixed: try: shutil.move(p, fixed) except FileNotFoundError: log.warning("FileNotFound. %s", p) else: log.info(fixed) if __name__ == "__main__": rename_invalid_paths() ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	conn.execute: UnicodeEncodeError: 'utf-8' codec can't encode character 1430325103
1297703307	https://github.com/simonw/sqlite-utils/issues/448#issuecomment-1297703307	https://api.github.com/repos/simonw/sqlite-utils/issues/448	IC_kwDOCGYnMM5NWWGL	mcarpenter 167893	2022-10-31T21:23:51Z	2022-10-31T21:27:32Z	CONTRIBUTOR	The Windows aspect is a red herring: OP's sample above produces the same error on Linux. (Though I don't know what's going on with the CI). The same error can also be obtained by passing an `io` from a file opened in non-binary mode (`'r'` as opposed to `'rb'`) to `rows_from_file()`. This is how I got here. The fix for my case is easy: open the file in mode `'rb'`. The analogous fix for OP's problem also works: use `BytesIO` in place of `StringIO`. Minimal test case (derived from utils.py): ``` python import io from typing import cast fp = io.StringIO("id,name\n1,Cleo") # error fp = io.BytesIO(bytes("id,name\n1,Cleo", encoding='utf-8')) # okay reader = io.BufferedReader(cast(io.RawIOBase, fp)) reader.peek(1) # exception thrown here ``` I see the signature of `rows_from_file()` correctly has `fp: BinaryIO` but I guess you'd need either a runtime type check for that (not all `io`s have `mode()`), or to catch the `AttributeError` on `peek()` to produce a better error for users. Neither option is ideal. Some thoughts on testing binary-ness of `io`s in this SO question: https://stackoverflow.com/questions/44584829/how-to-determine-if-file-is-opened-in-binary-or-text-mode	{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Reading rows from a file => AttributeError: '_io.StringIO' object has no attribute 'readinto' 1279144769
1296080804	https://github.com/simonw/datasette/issues/1872#issuecomment-1296080804	https://api.github.com/repos/simonw/datasette/issues/1872	IC_kwDOBm6k_c5NQJ-k	mroswell 192568	2022-10-30T03:06:32Z	2022-10-30T03:06:32Z	CONTRIBUTOR	I updated datasette-publish-vercel to 0.14.2 in requirements.txt And the site is back up! Is there a way that we can get some sort of notice when something like this will have critical impact on website function?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	SITE-BUSTING ERROR: "render_template() called before await ds.invoke_startup()" 1428560020
1296076803	https://github.com/simonw/datasette/issues/1872#issuecomment-1296076803	https://api.github.com/repos/simonw/datasette/issues/1872	IC_kwDOBm6k_c5NQJAD	mroswell 192568	2022-10-30T02:50:34Z	2022-10-30T02:50:34Z	CONTRIBUTOR	should this issue be under https://github.com/simonw/datasette-publish-vercel/issues ? Perhaps I just need to update: datasette-publish-vercel==0.11 in requirements.txt? I'll try that and see what happens...	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	SITE-BUSTING ERROR: "render_template() called before await ds.invoke_startup()" 1428560020
1295667649	https://github.com/simonw/datasette/pull/1870#issuecomment-1295667649	https://api.github.com/repos/simonw/datasette/issues/1870	IC_kwDOBm6k_c5NOlHB	fgregg 536941	2022-10-29T00:52:43Z	2022-10-29T00:53:43Z	CONTRIBUTOR	Are you saying that I can build a container, but then when I run it and it does `datasette serve -i data.db ...` it will somehow modify the image, or create a new modified filesystem layer in the runtime environment, as a result of running that `serve` command? Somehow, `datasette serve -i data.db` will lead to the `data.db` being modified, which will trigger a copy-on-write of `data.db` into the read-write layer of the container. I don't understand how that happens. it kind of feels like a bug in sqlite, but i can't quite follow the sqlite code.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	don't use immutable=1, only mode=ro 1426379903
1294285471	https://github.com/simonw/datasette/pull/1870#issuecomment-1294285471	https://api.github.com/repos/simonw/datasette/issues/1870	IC_kwDOBm6k_c5NJTqf	fgregg 536941	2022-10-28T01:06:03Z	2022-10-28T01:06:03Z	CONTRIBUTOR	as far as i can tell, this is where the "immutable" argument is used in sqlite: `c pPager->noLock = sqlite3_uri_boolean(pPager->zFilename, "nolock", 0); if( (iDc & SQLITE_IOCAP_IMMUTABLE)!=0 || sqlite3_uri_boolean(pPager->zFilename, "immutable", 0) ){ vfsFlags |= SQLITE_OPEN_READONLY; goto act_like_temp_file; }` so it does set the read only flag, but then has a goto.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	don't use immutable=1, only mode=ro 1426379903
1294237783	https://github.com/simonw/datasette/pull/1870#issuecomment-1294237783	https://api.github.com/repos/simonw/datasette/issues/1870	IC_kwDOBm6k_c5NJIBX	fgregg 536941	2022-10-27T23:42:18Z	2022-10-27T23:42:18Z	CONTRIBUTOR	Relevant sqlite forum thread: https://www.sqlite.org/forum/forumpost/02f7bda329f41e30451472421cf9ce7f715b768ce3db02797db1768e47950d48	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	don't use immutable=1, only mode=ro 1426379903
1292592210	https://github.com/simonw/datasette/issues/1851#issuecomment-1292592210	https://api.github.com/repos/simonw/datasette/issues/1851	IC_kwDOBm6k_c5NC2RS	eyeseast 25778	2022-10-26T20:03:46Z	2022-10-26T20:03:46Z	CONTRIBUTOR	Yeah, every time I see something cool done with triggers, I remember that I need to start using triggers.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	API to insert a single record into an existing table 1421544654
1292519956	https://github.com/simonw/datasette/issues/1851#issuecomment-1292519956	https://api.github.com/repos/simonw/datasette/issues/1851	IC_kwDOBm6k_c5NCkoU	asg017 15178711	2022-10-26T19:20:33Z	2022-10-26T19:20:33Z	CONTRIBUTOR	This could use a new plugin hook, too. I don't want to complicate your life too much, but for things like GIS, I'd want a way to turn regular JSON into SpatiaLite geometries or combine X/Y coordinates into point geometries and such. Happy to help however I can. @eyeseast Maybe you could do this with triggers? Like you can insert JSON-friendly data into a "raw" table, and create a trigger that transforms that inserted data into the proper table. Here's an example: ```sql -- meant to be updated from a Datasette insert create table points_raw(longitude int, latitude int); -- the target table with proper SpatiaLite geometries create table points(point geometry); CREATE TRIGGER insert_points_raw INSERT ON points_raw BEGIN insert into points(point) values (makepoint(new.longitude, new.latitude)); END; ``` You could then POST a new row to `points_raw` like this: `POST /db/points_raw Authorization: Bearer xxx Content-Type: application/json { "row": { "longitude": 27.64356, "latitude": -47.29384 } }` Then SQLite will run the trigger and insert a new row in `points` with the correct geometry point. Downside is you'd have duplicated data with `points_raw`, but maybe it could be a `TEMP` table (or have a cron that deletes all rows from that table every so often?)	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	API to insert a single record into an existing table 1421544654
1292401308	https://github.com/simonw/sqlite-utils/pull/499#issuecomment-1292401308	https://api.github.com/repos/simonw/sqlite-utils/issues/499	IC_kwDOCGYnMM5NCHqc	chapmanjacobd 7908073	2022-10-26T17:54:26Z	2022-10-26T17:54:51Z	CONTRIBUTOR	The problem with how it is currently is that the transformed fts table will return incorrect results (unless the table was only 1 row or something), even if create_triggers was enabled previously. Maybe the simplest solution is to disable fts on a transformed table rather than try to recreate it? Thoughts?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	feat: recreate fts triggers after table transform 1405196044
1291228502	https://github.com/simonw/datasette/issues/1851#issuecomment-1291228502	https://api.github.com/repos/simonw/datasette/issues/1851	IC_kwDOBm6k_c5M9pVW	eyeseast 25778	2022-10-25T23:02:10Z	2022-10-25T23:02:10Z	CONTRIBUTOR	That's reasonable. Canned queries and custom endpoints are certainly going to give more room for specific needs.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	API to insert a single record into an existing table 1421544654
1290615599	https://github.com/simonw/datasette/issues/1851#issuecomment-1290615599	https://api.github.com/repos/simonw/datasette/issues/1851	IC_kwDOBm6k_c5M7Tsv	eyeseast 25778	2022-10-25T14:05:12Z	2022-10-25T14:05:12Z	CONTRIBUTOR	This could use a new plugin hook, too. I don't want to complicate your life too much, but for things like GIS, I'd want a way to turn regular JSON into SpatiaLite geometries or combine X/Y coordinates into point geometries and such. Happy to help however I can.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	API to insert a single record into an existing table 1421544654
1274153135	https://github.com/simonw/sqlite-utils/pull/498#issuecomment-1274153135	https://api.github.com/repos/simonw/sqlite-utils/issues/498	IC_kwDOCGYnMM5L8giv	chapmanjacobd 7908073	2022-10-11T06:34:31Z	2022-10-11T06:34:31Z	CONTRIBUTOR	nevermind it was because I was running `db[table].transform`. The fts tables would still be there but the triggers would be dropped	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	fix: enable-fts permanently save triggers 1404013495
1272357976	https://github.com/simonw/datasette/issues/1836#issuecomment-1272357976	https://api.github.com/repos/simonw/datasette/issues/1836	IC_kwDOBm6k_c5L1qRY	fgregg 536941	2022-10-08T16:56:51Z	2022-10-08T16:56:51Z	CONTRIBUTOR	when you are running from docker, you always will want to run as `mode=ro` because the same thing that is causing duplication in the inspect layer will cause duplication in the final container read/write layer when `datasette serve` runs.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	docker image is duplicating db files somehow 1400374908
1271103097	https://github.com/simonw/datasette/issues/1836#issuecomment-1271103097	https://api.github.com/repos/simonw/datasette/issues/1836	IC_kwDOBm6k_c5Lw355	fgregg 536941	2022-10-07T04:43:41Z	2022-10-07T04:43:41Z	CONTRIBUTOR	@simonw, should i open up a new issue for investigating the differences between "immutable=1" and "mode=ro" and possibly switching to "mode=ro". Or would you like to keep that conversation in this issue?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	docker image is duplicating db files somehow 1400374908
1271101072	https://github.com/simonw/datasette/issues/1480#issuecomment-1271101072	https://api.github.com/repos/simonw/datasette/issues/1480	IC_kwDOBm6k_c5Lw3aQ	fgregg 536941	2022-10-07T04:39:10Z	2022-10-07T04:39:10Z	CONTRIBUTOR	switching from `immutable=1` to `mode=ro` completely addressed this. see https://github.com/simonw/datasette/issues/1836#issuecomment-1271100651 for details.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Exceeding Cloud Run memory limits when deploying a 4.8G database 1015646369
1271100651	https://github.com/simonw/datasette/issues/1836#issuecomment-1271100651	https://api.github.com/repos/simonw/datasette/issues/1836	IC_kwDOBm6k_c5Lw3Tr	fgregg 536941	2022-10-07T04:38:14Z	2022-10-07T04:38:14Z	CONTRIBUTOR	yes, and i also think that this is causing the apparent memory problems in #1480. when the container starts up, it will make some operation on the database in `immutable` mode which apparently makes some small change to the db file. if that's so, then the db files will be copied to the read/write layer which counts against cloudrun's memory allocation! running a test of that now. this completely addressed #1480	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	docker image is duplicating db files somehow 1400374908
1271035998	https://github.com/simonw/datasette/issues/1301#issuecomment-1271035998	https://api.github.com/repos/simonw/datasette/issues/1301	IC_kwDOBm6k_c5Lwnhe	fgregg 536941	2022-10-07T02:38:04Z	2022-10-07T02:38:04Z	CONTRIBUTOR	the only mode that `publish cloudrun` supports right now is immutable	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Publishing to cloudrun with immutable mode? 860722711
1271020193	https://github.com/simonw/datasette/issues/1836#issuecomment-1271020193	https://api.github.com/repos/simonw/datasette/issues/1836	IC_kwDOBm6k_c5Lwjqh	fgregg 536941	2022-10-07T02:15:05Z	2022-10-07T02:21:08Z	CONTRIBUTOR	when i hack the connect method to open non mutable files with "mode=ro" and not "immutable=1" https://github.com/simonw/datasette/blob/eff112498ecc499323c26612d707908831446d25/datasette/database.py#L79 then: `bash 870 B RUN /bin/sh -c datasette inspect nlrb.db --inspect-file inspect-data.json` the `datasette inspect` layer is only the size of the json file!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	docker image is duplicating db files somehow 1400374908
1271008997	https://github.com/simonw/datasette/issues/1836#issuecomment-1271008997	https://api.github.com/repos/simonw/datasette/issues/1836	IC_kwDOBm6k_c5Lwg7l	fgregg 536941	2022-10-07T02:00:37Z	2022-10-07T02:00:49Z	CONTRIBUTOR	yes, and i also think that this is causing the apparent memory problems in #1480. when the container starts up, it will make some operation on the database in `immutable` mode which apparently makes some small change to the db file. if that's so, then the db files will be copied to the read/write layer which counts against cloudrun's memory allocation! running a test of that now.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	docker image is duplicating db files somehow 1400374908
1271003212	https://github.com/simonw/datasette/issues/1836#issuecomment-1271003212	https://api.github.com/repos/simonw/datasette/issues/1836	IC_kwDOBm6k_c5LwfhM	fgregg 536941	2022-10-07T01:52:04Z	2022-10-07T01:52:04Z	CONTRIBUTOR	and if we try immutable mode, which is how things are opened by `datasette inspect` we duplicate the files!!! ```python test_sql_immutable.py import sqlite3 import sys db_name = sys.argv[1] conn = sqlite3.connect(f'file:/app/{db_name}?immutable=1', uri=True) cur = conn.cursor() cur.execute('select count(*) from filing') print(cur.fetchone()) ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	docker image is duplicating db files somehow 1400374908
1270992795	https://github.com/simonw/datasette/issues/1836#issuecomment-1270992795	https://api.github.com/repos/simonw/datasette/issues/1836	IC_kwDOBm6k_c5Lwc-b	fgregg 536941	2022-10-07T01:29:15Z	2022-10-07T01:50:14Z	CONTRIBUTOR	fascinatingly, telling python to open sqlite in read only mode makes this layer have a size of 0 ```python test_sql_ro.py import sqlite3 import sys db_name = sys.argv[1] conn = sqlite3.connect(f'file:/app/{db_name}?mode=ro', uri=True) cur = conn.cursor() cur.execute('select count(*) from filing') print(cur.fetchone()) ``` that's quite weird because setting the file permissions to read only didn't do anything. (on reflection, that chmod isn't doing anything because the dockerfile commands are run as root)	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	docker image is duplicating db files somehow 1400374908
1270988081	https://github.com/simonw/datasette/issues/1836#issuecomment-1270988081	https://api.github.com/repos/simonw/datasette/issues/1836	IC_kwDOBm6k_c5Lwb0x	fgregg 536941	2022-10-07T01:19:01Z	2022-10-07T01:27:35Z	CONTRIBUTOR	okay, some progress!! running some sql against a database file causes that file to get duplicated even if it doesn't apparently change the file. make a little test script like this: ```python test_sql.py import sqlite3 import sys db_name = sys.argv[1] conn = sqlite3.connect(f'file:/app/{db_name}', uri=True) cur = conn.cursor() cur.execute('select count(*) from filing') print(cur.fetchone()) ``` then `docker RUN python test_sql.py nlrb.db` produced a layer that's the same size as `nlrb.db`!!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	docker image is duplicating db files somehow 1400374908
1270936982	https://github.com/simonw/datasette/issues/1836#issuecomment-1270936982	https://api.github.com/repos/simonw/datasette/issues/1836	IC_kwDOBm6k_c5LwPWW	fgregg 536941	2022-10-07T00:52:41Z	2022-10-07T00:52:41Z	CONTRIBUTOR	it's not that the inspect command is somehow changing the db files. if i set them to read-only, the "inspect" layer still has the same very large size.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	docker image is duplicating db files somehow 1400374908
1270923537	https://github.com/simonw/datasette/issues/1836#issuecomment-1270923537	https://api.github.com/repos/simonw/datasette/issues/1836	IC_kwDOBm6k_c5LwMER	fgregg 536941	2022-10-07T00:46:08Z	2022-10-07T00:46:08Z	CONTRIBUTOR	i thought it was maybe to do with reading through all the files, but that does not seem to be the case if i make a little test file like: ```python test_read.py import hashlib import sys import pathlib HASH_BLOCK_SIZE = 1024 * 1024 def inspect_hash(path): """Calculate the hash of a database, efficiently.""" m = hashlib.sha256() with path.open("rb") as fp: while True: data = fp.read(HASH_BLOCK_SIZE) if not data: break m.update(data) return m.hexdigest() inspect_hash(pathlib.Path(sys.argv[1])) ``` then a line in the Dockerfile like `docker RUN python test_read.py nlrb.db && echo "[]" > /etc/inspect.json` just produces a layer of `3B`	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	docker image is duplicating db files somehow 1400374908
1269847461	https://github.com/simonw/datasette/issues/1480#issuecomment-1269847461	https://api.github.com/repos/simonw/datasette/issues/1480	IC_kwDOBm6k_c5LsFWl	fgregg 536941	2022-10-06T11:21:49Z	2022-10-06T11:21:49Z	CONTRIBUTOR	thanks @simonw, i'll spend a little more time trying to figure out why this isn't working on cloudrun, and then will flip over to fly if i can't.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Exceeding Cloud Run memory limits when deploying a 4.8G database 1015646369
1268629159	https://github.com/simonw/datasette/issues/1480#issuecomment-1268629159	https://api.github.com/repos/simonw/datasette/issues/1480	IC_kwDOBm6k_c5Lnb6n	fgregg 536941	2022-10-05T16:00:55Z	2022-10-05T16:00:55Z	CONTRIBUTOR	as a next step, i'll fetch the docker image from the google registry, and see what memory and disk usage looks like when i run it locally.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Exceeding Cloud Run memory limits when deploying a 4.8G database 1015646369
1268613335	https://github.com/simonw/datasette/issues/1480#issuecomment-1268613335	https://api.github.com/repos/simonw/datasette/issues/1480	IC_kwDOBm6k_c5LnYDX	fgregg 536941	2022-10-05T15:45:49Z	2022-10-05T15:45:49Z	CONTRIBUTOR	running into this as i continue to grow my labor data warehouse. Here a CloudRun PM says the container size should not count against memory: https://stackoverflow.com/a/56570717	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Exceeding Cloud Run memory limits when deploying a 4.8G database 1015646369
1264223554	https://github.com/simonw/sqlite-utils/issues/409#issuecomment-1264223554	https://api.github.com/repos/simonw/sqlite-utils/issues/409	IC_kwDOCGYnMM5LWoVC	chapmanjacobd 7908073	2022-10-01T03:42:50Z	2022-10-01T03:42:50Z	CONTRIBUTOR	oh weird. it inserts into db2	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	`with db:` for transactions 1149661489
1264223363	https://github.com/simonw/sqlite-utils/issues/409#issuecomment-1264223363	https://api.github.com/repos/simonw/sqlite-utils/issues/409	IC_kwDOCGYnMM5LWoSD	chapmanjacobd 7908073	2022-10-01T03:41:45Z	2022-10-01T03:41:45Z	CONTRIBUTOR	``` pytest xklb/check.py --pdb xklb/check.py:11: in test_transaction assert list(db2["t"].rows) == [] E AssertionError: assert [{'foo': 1}] == [] E + where [{'foo': 1}] = list(<generator object Queryable.rows_where at 0x7f2d84d1f0d0>) E + where <generator object Queryable.rows_where at 0x7f2d84d1f0d0> = <Table t (foo)>.rows entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PDB post_mortem (IO-capturing turned off) >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /home/xk/github/xk/lb/xklb/check.py(11)test_transaction() 9 with db1.conn: 10 db1["t"].insert({"foo": 1}) ---> 11 assert list(db2["t"].rows) == [] 12 assert list(db2["t"].rows) == [{"foo": 1}] ``` It fails because the row is already inserted. btw if you put these two lines in your pyproject.toml you can get `ipdb` in pytest `[tool.pytest.ini_options] addopts = "--pdbcls=IPython.terminal.debugger:TerminalPdb --ignore=tests/data --capture=tee-sys --log-cli-level=ERROR"`	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	`with db:` for transactions 1149661489
1264219650	https://github.com/simonw/sqlite-utils/issues/493#issuecomment-1264219650	https://api.github.com/repos/simonw/sqlite-utils/issues/493	IC_kwDOCGYnMM5LWnYC	chapmanjacobd 7908073	2022-10-01T03:22:50Z	2022-10-01T03:23:58Z	CONTRIBUTOR	this is likely what you are looking for: https://stackoverflow.com/a/51076749/697964 but yeah I would say just disable smart quotes	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Tiny typographical error in install/uninstall docs 1386562662
1261930179	https://github.com/simonw/datasette/issues/370#issuecomment-1261930179	https://api.github.com/repos/simonw/datasette/issues/370	IC_kwDOBm6k_c5LN4bD	MichaelTiemannOSC 72577720	2022-09-29T08:17:46Z	2022-09-29T08:17:46Z	CONTRIBUTOR	Just watched this video which demonstrates the integration of any webapp into JupyterLab: https://youtu.be/FH1dKKmvFtc Maybe this is the answer?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Integration with JupyterLab 377155320
1260909128	https://github.com/simonw/datasette/issues/1062#issuecomment-1260909128	https://api.github.com/repos/simonw/datasette/issues/1062	IC_kwDOBm6k_c5LJ_JI	fgregg 536941	2022-09-28T13:22:53Z	2022-09-28T14:09:54Z	CONTRIBUTOR	if you went this route: `python with sqlite_timelimit(conn, time_limit_ms): c.execute(query) for chunk in c.fetchmany(chunk_size): yield from chunk` then `time_limit_ms` would probably have to be greatly extended, because the time spent in the loop will depend on the downstream processing. i wonder if this was why you were thinking this feature would need a dedicated connection? reading more, there's no real limit i can find on the number of active cursors (or more precisely active prepared statement objects, because sqlite doesn't really have cursors). maybe something like this would be okay? `python with sqlite_timelimit(conn, time_limit_ms): c.execute(query) # step through at least one to evaluate the statement, not sure if this is necessary yield c.fetchone() for chunk in c.fetchmany(chunk_size): yield from chunk` it seems quite weird that there's not more of a limit on the number of active prepared statements, but i haven't been able to find one.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Refactor .csv to be an output renderer - and teach register_output_renderer to stream all rows 732674148
1260829829	https://github.com/simonw/datasette/issues/1062#issuecomment-1260829829	https://api.github.com/repos/simonw/datasette/issues/1062	IC_kwDOBm6k_c5LJryF	fgregg 536941	2022-09-28T12:27:19Z	2022-09-28T12:27:19Z	CONTRIBUTOR	for teaching `register_output_renderer` to stream it seems like the two options are: a nested query technique to paginate through, or a fetching model that looks something like `python with sqlite_timelimit(conn, time_limit_ms): c.execute(query) for chunk in c.fetchmany(chunk_size): yield from chunk` currently `db.execute` is not a generator, so this would probably need a new method?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Refactor .csv to be an output renderer - and teach register_output_renderer to stream all rows 732674148
1259718517	https://github.com/simonw/datasette/issues/526#issuecomment-1259718517	https://api.github.com/repos/simonw/datasette/issues/526	IC_kwDOBm6k_c5LFcd1	fgregg 536941	2022-09-27T16:02:51Z	2022-09-27T16:04:46Z	CONTRIBUTOR	i think that `max_returned_rows` is a defense mechanism, just not for connection exhaustion. `max_returned_rows` is a defense mechanism against memory bombs. if you are potentially yielding out hundreds of thousands or even millions of rows, you need to be quite careful about data flow to not run out of memory on the server, or on the client. you have a lot of places in your code that are protective of that right now, but `max_returned_rows` acts as the final backstop. so, given that, it makes sense to have removing `max_returned_rows` altogether be a non-goal, but instead allow specific codepaths (like streaming csv's) to bypass it. that could dramatically lower the surface area for a memory-bomb attack.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Stream all results for arbitrary SQL and canned queries 459882902
1258910228	https://github.com/simonw/datasette/issues/526#issuecomment-1258910228	https://api.github.com/repos/simonw/datasette/issues/526	IC_kwDOBm6k_c5LCXIU	fgregg 536941	2022-09-27T03:11:07Z	2022-09-27T03:11:07Z	CONTRIBUTOR	i think this feature would be safe, as it's really only the time limit that can, and imo should, protect against long running queries, as it is pretty easy to make very expensive queries that don't return many rows. moving away from `max_returned_rows` will require some thinking about: memory usage and data flows to handle potentially very large result sets; how to avoid rendering tens or hundreds of thousands of html rows.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Stream all results for arbitrary SQL and canned queries 459882902
1258878311	https://github.com/simonw/datasette/issues/526#issuecomment-1258878311	https://api.github.com/repos/simonw/datasette/issues/526	IC_kwDOBm6k_c5LCPVn	fgregg 536941	2022-09-27T02:19:48Z	2022-09-27T02:19:48Z	CONTRIBUTOR	this sql query doesn't trip up `max_returned_rows` but does timeout `sql with recursive counter(x) as ( select 0 union select x + 1 from counter ) select * from counter LIMIT 10 OFFSET 100000000`	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Stream all results for arbitrary SQL and canned queries 459882902
1258871525	https://github.com/simonw/datasette/issues/526#issuecomment-1258871525	https://api.github.com/repos/simonw/datasette/issues/526	IC_kwDOBm6k_c5LCNrl	fgregg 536941	2022-09-27T02:09:32Z	2022-09-27T02:14:53Z	CONTRIBUTOR	thanks @simonw, i learned something i didn't know about sqlite's execution model! Imagine if Datasette CSVs did allow unlimited retrievals. Someone could hit the CSV endpoint for that recursive query and tie up Datasette's SQL connection effectively forever. why wouldn't the `sqlite_timelimit` guard prevent that? on my local version which has the code to turn off truncations for query csv, `sqlite_timelimit` does protect me.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Stream all results for arbitrary SQL and canned queries 459882902
1258849766	https://github.com/simonw/datasette/issues/526#issuecomment-1258849766	https://api.github.com/repos/simonw/datasette/issues/526	IC_kwDOBm6k_c5LCIXm	fgregg 536941	2022-09-27T01:27:03Z	2022-09-27T01:27:03Z	CONTRIBUTOR	i agree with that concern! but if i'm understanding the code correctly, `max_returned_rows` does not protect against long-running queries in any way.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Stream all results for arbitrary SQL and canned queries 459882902
1258803261	https://github.com/simonw/datasette/pull/1820#issuecomment-1258803261	https://api.github.com/repos/simonw/datasette/issues/1820	IC_kwDOBm6k_c5LB9A9	fgregg 536941	2022-09-27T00:03:09Z	2022-09-27T00:03:09Z	CONTRIBUTOR	the pattern in this PR is that `max_returned_rows` controls the maximum rows rendered through html and json, and the csv renderer bypasses that. i think it would be better to have each of these different query renderers have more direct control over how many rows to fetch, instead of relying on the internals of the `execute` method. generally, users will not want to paginate through tens of thousands of results, but often will want to download a full query as json or as csv.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	[SPIKE] Don't truncate query CSVs 1386456717
1258712931	https://github.com/simonw/sqlite-utils/issues/491#issuecomment-1258712931	https://api.github.com/repos/simonw/sqlite-utils/issues/491	IC_kwDOCGYnMM5LBm9j	eyeseast 25778	2022-09-26T22:31:58Z	2022-09-26T22:31:58Z	CONTRIBUTOR	Right. The backup command will copy tables completely, but in the case of conflicting table names, the destination gets overwritten silently. That might not be what you want here.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Ability to merge databases and tables 1383646615
1258508215	https://github.com/simonw/sqlite-utils/issues/491#issuecomment-1258508215	https://api.github.com/repos/simonw/sqlite-utils/issues/491	IC_kwDOCGYnMM5LA0-3	eyeseast 25778	2022-09-26T19:22:14Z	2022-09-26T19:22:14Z	CONTRIBUTOR	This might be fairly straightforward using SQLite's backup utility: https://docs.python.org/3/library/sqlite3.html#sqlite3.Connection.backup	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Ability to merge databases and tables 1383646615
1258337011	https://github.com/simonw/datasette/issues/526#issuecomment-1258337011	https://api.github.com/repos/simonw/datasette/issues/526	IC_kwDOBm6k_c5LALLz	fgregg 536941	2022-09-26T16:49:48Z	2022-09-26T16:49:48Z	CONTRIBUTOR	i think the smallest change that gets close to what i want is to change the behavior so that `max_returned_rows` is not applied in the `execute` method when we are asking for a csv of a query. there are some infelicities in that approach, but i'll make a PR to make it easier to discuss.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Stream all results for arbitrary SQL and canned queries 459882902
1258167564	https://github.com/simonw/datasette/issues/526#issuecomment-1258167564	https://api.github.com/repos/simonw/datasette/issues/526	IC_kwDOBm6k_c5K_h0M	fgregg 536941	2022-09-26T14:57:44Z	2022-09-26T15:08:36Z	CONTRIBUTOR	reading the database execute method i have a few questions. https://github.com/simonw/datasette/blob/cb1e093fd361b758120aefc1a444df02462389a3/datasette/database.py#L229-L242 unless i'm missing something (which is very likely!!), the `max_returned_rows` argument doesn't actually offer any protections against running very expensive queries. It's not like adding a `LIMIT max_rows` argument. it makes sense that it isn't, because the query could already have a `LIMIT` argument. Doing something like `select * from (query) limit {max_returned_rows}` might be protective but wouldn't always be. Instead the code executes the full original query, and if it still has time it fetches out the first `max_rows + 1` rows. this does offer some protection against memory exhaustion, as you won't hydrate a huge result set into python (however, there are data flow patterns that could avoid that too) given the current architecture, i don't see how creating a new connection would be of use? If we just removed the `max_returned_rows` limitation, then i think most things would be fine except for the QueryViews. Right now, rendering just 5000 rows takes a lot of client-side memory so some form of pagination would be required.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Stream all results for arbitrary SQL and canned queries 459882902
1258166572	https://github.com/simonw/datasette/issues/1655#issuecomment-1258166572	https://api.github.com/repos/simonw/datasette/issues/1655	IC_kwDOBm6k_c5K_hks	fgregg 536941	2022-09-26T14:57:04Z	2022-09-26T14:57:04Z	CONTRIBUTOR	I think that paginating, even in javascript, could be very helpful. Maybe render json or csv into the page and let javascript load that into the dom?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	query result page is using 400mb of browser memory 40x size of html page and 400x size of csv data 1163369515
1258129113	https://github.com/simonw/datasette/issues/1727#issuecomment-1258129113	https://api.github.com/repos/simonw/datasette/issues/1727	IC_kwDOBm6k_c5K_YbZ	fgregg 536941	2022-09-26T14:30:11Z	2022-09-26T14:48:31Z	CONTRIBUTOR	from your analysis, it seems like the GIL is blocking on loading of the data from sqlite to python (particularly in the `fetchmany` call). this is probably a simplistic idea, but what if you had the python code in the `execute` method iterate over the cursor and yield out rows or small chunks of rows. something like: `python with sqlite_timelimit(conn, time_limit_ms): try: cursor = conn.cursor() cursor.execute(sql, params if params is not None else {}) except: ... max_returned_rows = self.ds.max_returned_rows if max_returned_rows == page_size: max_returned_rows += 1 if max_returned_rows and truncate: for i, row in enumerate(cursor): yield row if i == max_returned_rows - 1: break else: for row in cursor: yield row truncated = False` this kind of thing works well with a postgres server side cursor, but i'm not sure if it will hold for sqlite. you would still spend about the same amount of time in python and would be contending for the gil, but it could be non-blocking. depending on the data flow, this could also have some benefit for memory. (data stays in more compact sqlite-land until you need it)	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Research: demonstrate if parallel SQL queries are worthwhile 1217759117
1256858763	https://github.com/simonw/sqlite-utils/issues/491#issuecomment-1256858763	https://api.github.com/repos/simonw/sqlite-utils/issues/491	IC_kwDOCGYnMM5K6iSL	chapmanjacobd 7908073	2022-09-24T04:50:59Z	2022-09-24T04:52:08Z	CONTRIBUTOR	Instead of outputting binary data to stdout the interface might be better like this `sqlite-utils merge animals.db cats.db dogs.db` similar to `zip`, `ogr2ogr`, etc Actually I think this might already be possible within `ogr2ogr`. I don't believe spatial data is a requirement though it might add an `ogc_id` column or something `cp cats.db animals.db ogr2ogr -append animals.db dogs.db ogr2ogr -append animals.db another.db`	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Ability to merge databases and tables 1383646615
1256781274	https://github.com/simonw/datasette/issues/1817#issuecomment-1256781274	https://api.github.com/repos/simonw/datasette/issues/1817	IC_kwDOBm6k_c5K6PXa	jefftriplett 50527	2022-09-23T22:59:46Z	2022-09-23T22:59:46Z	CONTRIBUTOR	While you are adding features, would you be future-proofing your APIs if you switched over some arguments over to keyword-only arguments or would that be too disruptive? Thinking out loud: `async def render_template( self, templates, *, context=None, plugin_context=None, request=None, view_name=None ):`	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose `sql` and `params` arguments to various plugin hooks 1384273985
1254064260	https://github.com/simonw/datasette/issues/526#issuecomment-1254064260	https://api.github.com/repos/simonw/datasette/issues/526	IC_kwDOBm6k_c5Kv4CE	fgregg 536941	2022-09-21T18:17:04Z	2022-09-21T18:18:01Z	CONTRIBUTOR	hi @simonw, this is becoming more of a bother for my labor data warehouse. Is there any research or a spike i could do that would help you investigate this issue?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Stream all results for arbitrary SQL and canned queries 459882902
1252898131	https://github.com/simonw/sqlite-utils/issues/433#issuecomment-1252898131	https://api.github.com/repos/simonw/sqlite-utils/issues/433	IC_kwDOCGYnMM5KrbVT	chapmanjacobd 7908073	2022-09-20T20:51:21Z	2022-09-20T20:56:07Z	CONTRIBUTOR	When I run `reset` it fixes my terminal. I suspect it is related to the progress bar https://linux.die.net/man/1/reset `950 1s /m/d/03_Downloads 🐑 echo $TERM xterm-kitty ▓░▒░ /m/d/03_Downloads 🌏 kitty -v kitty 0.26.2 created by Kovid Goyal $ sqlite-utils insert test.db facility facility-boundary-us-all.csv --csv blah blah blah (no offense) $ <no cursor> $ reset $ <cursor lives again (resurrection [explicit])>`	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	CLI eats my cursor 1239034903
1250901367	https://github.com/simonw/datasette/issues/1813#issuecomment-1250901367	https://api.github.com/repos/simonw/datasette/issues/1813	IC_kwDOBm6k_c5Kjz13	adipasquale 883348	2022-09-19T11:34:45Z	2022-09-19T11:34:45Z	CONTRIBUTOR	oh and by writing this I just realized the difference: the URL on fly.io is with a custom SQL command whereas the local one is without. It seems that there is no pagination when using custom SQL commands which makes sense Sorry for this useless issue, maybe this can be useful for someone else / me in the future. Thanks again for this wonderful project !	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	missing next and next_url in JSON responses from an instance deployed on Fly 1377811868
1248204219	https://github.com/simonw/datasette/issues/1810#issuecomment-1248204219	https://api.github.com/repos/simonw/datasette/issues/1810	IC_kwDOBm6k_c5KZhW7	psychemedia 82988	2022-09-15T14:44:47Z	2022-09-15T14:46:26Z	CONTRIBUTOR	A couple+ of possible use case examples: someone has a collection of articles indexed with FTS; they want to publish a simple search tool over the results; someone has an image collection and they want to be able to search over description text to return images; someone has a set of locations with descriptions, and wants to run a query over places and descriptions and get results as a listing or on a map; someone has a set of audio or video files with titles, descriptions and/or transcripts, and wants to be able to search over them and return playable versions of returned items. In many cases, I suspect the raw content will be in one table, but the search table will be a second (eg FTS) table. Generally, the search may be over one or more joined tables, and the results constructed from one or more tables (which may or may not be distinct from the search tables).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Featured table(s) on the homepage 1374626873
1237381620	https://github.com/simonw/datasette/pull/1685#issuecomment-1237381620	https://api.github.com/repos/simonw/datasette/issues/1685	IC_kwDOBm6k_c5JwPH0	dependabot[bot] 49699333	2022-09-05T18:36:47Z	2022-09-05T18:36:47Z	CONTRIBUTOR	Looks like jinja2 is no longer updatable, so this is no longer needed.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Update jinja2 requirement from <3.1.0,>=2.10.3 to >=2.10.3,<3.2.0 1180778860
1237381569	https://github.com/simonw/datasette/pull/1799#issuecomment-1237381569	https://api.github.com/repos/simonw/datasette/issues/1799	IC_kwDOBm6k_c5JwPHB	dependabot[bot] 49699333	2022-09-05T18:36:42Z	2022-09-05T18:36:42Z	CONTRIBUTOR	Looks like aiofiles is no longer updatable, so this is no longer needed.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Update aiofiles requirement from <0.9,>=0.4 to >=0.4,<22.2 1362242558
1232356302	https://github.com/simonw/sqlite-utils/pull/480#issuecomment-1232356302	https://api.github.com/repos/simonw/sqlite-utils/issues/480	IC_kwDOCGYnMM5JdEPO	chapmanjacobd 7908073	2022-08-31T01:51:49Z	2022-08-31T01:51:49Z	CONTRIBUTOR	Thanks for pointing me to the right place	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	search_sql add include_rank option 1355433619
1224382336	https://github.com/simonw/sqlite-utils/issues/467#issuecomment-1224382336	https://api.github.com/repos/simonw/sqlite-utils/issues/467	IC_kwDOCGYnMM5I-peA	jefftriplett 50527	2022-08-23T17:16:13Z	2022-08-23T17:16:13Z	CONTRIBUTOR	Should passing `alter=True` also drop any columns that aren't included in the new table structure? It could even spot column types that aren't correct and fix those. Is that consistent with the expectations set by how `alter=True` works elsewhere? I would lean towards not dropping them (or making a `drop=True` or `drop_columns=True` or `drop_missing_columns=True`) to work with existing tables easier. I do like that sqlite-utils mostly just works with existing tables but it's also nice to add to existing fields in a few cases.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Mechanism for ensuring a table has all the columns 1348169997
1223347322	https://github.com/simonw/datasette/pull/1789#issuecomment-1223347322	https://api.github.com/repos/simonw/datasette/issues/1789	IC_kwDOBm6k_c5I6sx6	asg017 15178711	2022-08-23T00:03:20Z	2022-08-23T00:03:20Z	CONTRIBUTOR	@simonw to build the extension on ubuntu, you can run: `apt-get update && apt-get install libsqlite3-dev gcc gcc ext.c -fPIC -shared -o ext.so` I'm not the best with Actions, but if you set the cache key to `ext.c`, run those two commands to download dependencies + compile to `ext.so`, then the unit test should pick it up and run it correctly. Let me know if you want me to update the PR with that added	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Add new entrypoint option to `--load-extension` 1344823170
1221576460	https://github.com/simonw/datasette/pull/1789#issuecomment-1221576460	https://api.github.com/repos/simonw/datasette/issues/1789	IC_kwDOBm6k_c5Iz8cM	asg017 15178711	2022-08-21T16:16:42Z	2022-08-21T16:16:42Z	CONTRIBUTOR	Rebased, Read the docs failure should now be fixed Re docs - ya that's a pretty ambitious page, I'm still not 100% sure what the best practices are/should be... Would be happy to make that page in a future PR	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Add new entrypoint option to `--load-extension` 1344823170
1214437408	https://github.com/simonw/datasette/issues/1779#issuecomment-1214437408	https://api.github.com/repos/simonw/datasette/issues/1779	IC_kwDOBm6k_c5IYtgg	fgregg 536941	2022-08-14T19:42:58Z	2022-08-14T19:42:58Z	CONTRIBUTOR	thanks @simonw!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	google cloudrun updated their limits on maxscale based on memory and cpu count 1334628400
1210675046	https://github.com/simonw/datasette/issues/1779#issuecomment-1210675046	https://api.github.com/repos/simonw/datasette/issues/1779	IC_kwDOBm6k_c5IKW9m	fgregg 536941	2022-08-10T13:28:37Z	2022-08-10T13:28:37Z	CONTRIBUTOR	maybe a simpler solution is to set the maxscale to like 2? since datasette is not set up to make use of container scaling anyway?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	google cloudrun updated their limits on maxscale based on memory and cpu count 1334628400
1200732975	https://github.com/simonw/datasette/issues/1191#issuecomment-1200732975	https://api.github.com/repos/simonw/datasette/issues/1191	IC_kwDOBm6k_c5Hkbsv	brandonrobertz 2670795	2022-08-01T05:39:27Z	2022-08-01T05:39:27Z	CONTRIBUTOR	I've got a URL shortening plugin that I would like to embed on the query page but I'd like avoid capturing the entire `query.html` template. A feature like this would solve it. Where's this at and how can I help?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Ability for plugins to collaborate when adding extra HTML to blocks in default templates 787098345
1190277829	https://github.com/simonw/sqlite-utils/issues/456#issuecomment-1190277829	https://api.github.com/repos/simonw/sqlite-utils/issues/456	IC_kwDOCGYnMM5G8jLF	fgregg 536941	2022-07-20T13:19:15Z	2022-07-20T13:19:15Z	CONTRIBUTOR	hadley wickham's melt and reshape could be good inspo: http://had.co.nz/reshape/introduction.pdf	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	feature request: pivot command 1310243385
1190272780	https://github.com/simonw/sqlite-utils/issues/456#issuecomment-1190272780	https://api.github.com/repos/simonw/sqlite-utils/issues/456	IC_kwDOCGYnMM5G8h8M	fgregg 536941	2022-07-20T13:14:54Z	2022-07-20T13:14:54Z	CONTRIBUTOR	for example, i have data on votes that look like this: \| ballot_id \| option_id \| choice \| \|-\|-\|-\| \| 1 \| 1 \| 0 \| \| 1 \| 2 \| 1 \| \| 1 \| 3 \| 0 \| \| 1 \| 4 \| 1 \| \| 2 \| 1 \| 1 \| \| 2 \| 2 \| 0 \| \| 2 \| 3 \| 1 \| \| 2 \| 4 \| 0 \| and i want to reshape from this long form to this wide form: \| ballot_id \| option_id_1 \| option_id_2 \| option_id_3 \| option_id_4 \| \|-\|-\|-\|-\|-\| \| 1 \| 0 \| 1 \| 0 \| 1 \| \| 2 \| 1 \| 0 \| 1 \| 0 \| i could do such a thing like this. `sql select ballot_id, sum(choice) filter (where option_id = 1) as option_id_1, sum(choice) filter (where option_id = 2) as option_id_2, sum(choice) filter (where option_id = 3) as option_id_3, sum(choice) filter (where option_id = 4) as option_id_4 from vote group by ballot_id`	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	feature request: pivot command 1310243385
1189010812	https://github.com/simonw/sqlite-utils/issues/423#issuecomment-1189010812	https://api.github.com/repos/simonw/sqlite-utils/issues/423	IC_kwDOCGYnMM5G3t18	fgregg 536941	2022-07-19T12:47:39Z	2022-07-19T12:47:39Z	CONTRIBUTOR	just ran into this!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	.extract() doesn't set foreign key when extracted columns contain NULL value 1199158210
1179579878	https://github.com/simonw/sqlite-utils/issues/449#issuecomment-1179579878	https://api.github.com/repos/simonw/sqlite-utils/issues/449	IC_kwDOCGYnMM5GTvXm	davidleejy 1690072	2022-07-09T17:41:32Z	2022-07-09T17:41:50Z	CONTRIBUTOR	Learnt that the types in Sqlite-utils differ somewhat from those in Sqlite. I've changed my test to account for this difference and the test has passed successfully. I will submit a PR.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Utilities for duplicating tables and creating a table with the results of a query 1279863844
1174027079	https://github.com/simonw/sqlite-utils/issues/449#issuecomment-1174027079	https://api.github.com/repos/simonw/sqlite-utils/issues/449	IC_kwDOCGYnMM5F-jtH	davidleejy 1690072	2022-07-04T17:33:04Z	2022-07-04T17:48:43Z	CONTRIBUTOR	I've written the code and test. Would you be able to advise how to compare table columns in a pytest function properly? Experiencing a challenge when comparing columns. Test: python def test_duplicate(fresh_db): table = fresh_db.create_table( "table1", { "text_col": str, "float_col": float, "int_col": int, "bool_col": bool, "bytes_col": bytes, "datetime_col": datetime.datetime, }, ) dt = datetime.datetime.now() b = bytes('hello world', 'utf-8') data = {"text_col": "Cleo", "float_col": 3.14, "int_col": -2, "bool_col": True, "bytes_col": b, "datetime_col": str(dt)} table1 = fresh_db["table1"] row_id = table1.insert(data).last_rowid table1.duplicate('table2') table2 = fresh_db["table2"] assert data == table2.get(row_id) assert table1.columns == table2.columns # FAILS HERE Result: Failure is due to column types being named differently -- e.g. 'FLOAT' vs 'REAL', 'INTEGER' vs 'INT'. How should I go about comparing columns while accounting for equivalent types? Or did I miss out something in my duplication code correctly? Here's how I did it: in `db.py`, I've added the following code: ```python class Table(Queryable): [...] def duplicate( self, name_new: str ) -> "Table": """ Duplicate this table in this database. `:param name_new: Name of new table. """ assert self.exists() with self.db.conn: sql = "CREATE TABLE [{new_table}] AS SELECT * FROM [{table}];".format( new_table = name_new, table = self.name, ) self.db.execute(sql) return self.db[name_new]` ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Utilities for duplicating tables and creating a table with the results of a query 1279863844
1173358747	https://github.com/simonw/datasette/issues/1713#issuecomment-1173358747	https://api.github.com/repos/simonw/datasette/issues/1713	IC_kwDOBm6k_c5F8Aib	brandonrobertz 2670795	2022-07-04T05:16:35Z	2022-07-04T05:16:35Z	CONTRIBUTOR	This feature is pretty important and would be nice if it would be all within Datasette (no separate CLI/deploy required). My workflow now is to basically just copy the result and paste into a Google Sheet, which works, but then it's not discoverable to other journalists browsing the Datasette instance. I started building a plugin similar to datasette-saved-queries but one that maintains its own DB (required if you're working with all immutable DBs), but got bogged down in details.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Datasette feature for publishing snapshots of query results 1203943272
1168704157	https://github.com/simonw/datasette/pull/1693#issuecomment-1168704157	https://api.github.com/repos/simonw/datasette/issues/1693	IC_kwDOBm6k_c5FqQKd	dependabot[bot] 49699333	2022-06-28T13:11:36Z	2022-06-28T13:11:36Z	CONTRIBUTOR	Superseded by #1763.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Bump black from 22.1.0 to 22.3.0 1184850337
1163091750	https://github.com/simonw/datasette/pull/1753#issuecomment-1163091750	https://api.github.com/repos/simonw/datasette/issues/1753	IC_kwDOBm6k_c5FU18m	dependabot[bot] 49699333	2022-06-22T13:22:34Z	2022-06-22T13:22:34Z	CONTRIBUTOR	Superseded by #1760.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Bump furo from 2022.4.7 to 2022.6.4.1 1261826957
1151887842	https://github.com/simonw/datasette/issues/1528#issuecomment-1151887842	https://api.github.com/repos/simonw/datasette/issues/1528	IC_kwDOBm6k_c5EqGni	eyeseast 25778	2022-06-10T03:23:08Z	2022-06-10T03:23:08Z	CONTRIBUTOR	I just put together a version of this in a plugin: https://github.com/eyeseast/datasette-query-files. Happy to have any feedback.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Add new `"sql_file"` key to Canned Queries in metadata? 1060631257
1128064864	https://github.com/simonw/datasette/issues/1742#issuecomment-1128064864	https://api.github.com/repos/simonw/datasette/issues/1742	IC_kwDOBm6k_c5DPOdg	eyeseast 25778	2022-05-16T19:42:13Z	2022-05-16T19:42:13Z	CONTRIBUTOR	Just to add a wrinkle here, this loads fine: https://alltheplaces-datasette.fly.dev/alltheplaces/places.geojson?_trace=1 But also, this doesn't add any trace data: https://alltheplaces-datasette.fly.dev/alltheplaces/places.json?_trace=1 What am I missing?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	?_trace=1 fails with datasette-geojson for some reason 1237586379
1128049716	https://github.com/simonw/datasette/issues/1742#issuecomment-1128049716	https://api.github.com/repos/simonw/datasette/issues/1742	IC_kwDOBm6k_c5DPKw0	eyeseast 25778	2022-05-16T19:24:44Z	2022-05-16T19:24:44Z	CONTRIBUTOR	Where is `_trace` getting injected? And is it something a plugin should be able to handle? (If it is, I guess I should handle it in this case.)	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	?_trace=1 fails with datasette-geojson for some reason 1237586379
1125342229	https://github.com/simonw/datasette/issues/741#issuecomment-1125342229	https://api.github.com/repos/simonw/datasette/issues/741	IC_kwDOBm6k_c5DE1wV	eyeseast 25778	2022-05-12T19:21:16Z	2022-05-12T19:21:16Z	CONTRIBUTOR	Came here to check if this had been flagged already. Was helping a colleague get something on Cloud Run and had to dig to find `--extra-options="--setting sql_time_limit_ms 2500"`. If I get some time next week, maybe I'll try to tackle it. Would definitely make things easier to be able to do something like this: `sh datasette publish cloudrun something.db --setting sql_time_limit_ms 2500`	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Replace "datasette publish --extra-options" with "--setting" 607223136
1111752676	https://github.com/simonw/datasette/issues/1728#issuecomment-1111752676	https://api.github.com/repos/simonw/datasette/issues/1728	IC_kwDOBm6k_c5CQ__k	wragge 127565	2022-04-28T05:11:54Z	2022-04-28T05:11:54Z	CONTRIBUTOR	And in terms of the bug, yep I agree that option 2 would be the most useful and least frustrating.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Writable canned queries fail with useless non-error against immutable databases 1218133366
1111751734	https://github.com/simonw/datasette/issues/1728#issuecomment-1111751734	https://api.github.com/repos/simonw/datasette/issues/1728	IC_kwDOBm6k_c5CQ_w2	wragge 127565	2022-04-28T05:09:59Z	2022-04-28T05:09:59Z	CONTRIBUTOR	Thanks, I'll give it a try!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Writable canned queries fail with useless non-error against immutable databases 1218133366
1111712953	https://github.com/simonw/datasette/issues/1728#issuecomment-1111712953	https://api.github.com/repos/simonw/datasette/issues/1728	IC_kwDOBm6k_c5CQ2S5	wragge 127565	2022-04-28T03:48:36Z	2022-04-28T03:48:36Z	CONTRIBUTOR	I don't think that'd work for this project. The db is very big, and my aim was to have an environment where researchers could be making use of the data, but be easily able to add corrections to the HTR/OCR extracted data when they came across problems. It's in its immutable (!) form here: https://sydney-stock-exchange-xqtkxtd5za-ts.a.run.app/stock_exchange/stocks	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Writable canned queries fail with useless non-error against immutable databases 1218133366
1111705323	https://github.com/simonw/datasette/issues/1728#issuecomment-1111705323	https://api.github.com/repos/simonw/datasette/issues/1728	IC_kwDOBm6k_c5CQ0br	wragge 127565	2022-04-28T03:32:06Z	2022-04-28T03:32:06Z	CONTRIBUTOR	Ah, that would be it! I have a core set of data which doesn't change to which I want authorised users to be able to submit corrections. I was going to deal with the persistence issue by just grabbing the user corrections at regular intervals and saving to GitHub. I might need to rethink. Thanks!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Writable canned queries fail with useless non-error against immutable databases 1218133366
1105642187	https://github.com/simonw/datasette/issues/1101#issuecomment-1105642187	https://api.github.com/repos/simonw/datasette/issues/1101	IC_kwDOBm6k_c5B5sLL	eyeseast 25778	2022-04-21T18:59:08Z	2022-04-21T18:59:08Z	CONTRIBUTOR	Ha! That was your idea (and a good one). But it's probably worth measuring to see what overhead it adds. It did require both passing in the database and making the whole thing `async`. Just timing the queries themselves: Using `AsGeoJSON(geometry) as geometry` takes 10.235 ms Leaving as binary takes 8.63 ms Looking at the network panel: Takes about 200 ms for the `fetch` request Takes about 300 ms I'm not sure how best to time the GeoJSON generation, but it would be interesting to check. Maybe I'll write a plugin to add query times to response headers. The other thing to consider with async streaming is that it might be well-suited for a slower response. When I have to get the whole result and send a response in a fixed amount of time, I need the most efficient query possible. If I can hang onto a connection and get things one chunk at a time, maybe it's ok if there's some overhead.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	register_output_renderer() should support streaming data 749283032
1105588651	https://github.com/simonw/datasette/issues/1101#issuecomment-1105588651	https://api.github.com/repos/simonw/datasette/issues/1101	IC_kwDOBm6k_c5B5fGr	eyeseast 25778	2022-04-21T18:15:39Z	2022-04-21T18:15:39Z	CONTRIBUTOR	What if you split rendering and streaming into two things: `render` is a function that returns a response `stream` is a function that sends chunks, or yields chunks passed to an ASGI `send` callback That way current plugins still work, and streaming is purely additive. A `stream` function could get a cursor or iterator of rows, instead of a list, so it could more efficiently handle large queries.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	register_output_renderer() should support streaming data 749283032
1103312860	https://github.com/simonw/datasette/issues/1713#issuecomment-1103312860	https://api.github.com/repos/simonw/datasette/issues/1713	IC_kwDOBm6k_c5Bwzfc	fgregg 536941	2022-04-20T00:52:19Z	2022-04-20T00:52:19Z	CONTRIBUTOR	feels related to #1402	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Datasette feature for publishing snapshots of query results 1203943272
1099540225	https://github.com/simonw/datasette/issues/1713#issuecomment-1099540225	https://api.github.com/repos/simonw/datasette/issues/1713	IC_kwDOBm6k_c5BiacB	eyeseast 25778	2022-04-14T19:09:57Z	2022-04-14T19:09:57Z	CONTRIBUTOR	I wonder if this overlaps with what I outlined in #1605. You could run something like this: `sh datasette freeze -d exports/ aws s3 cp exports/ s3://my-export-bucket/$(date)` And maybe that does what you need. Of course, that plugin isn't built yet. But that's the idea.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Datasette feature for publishing snapshots of query results 1203943272
1094453751	https://github.com/simonw/datasette/issues/1699#issuecomment-1094453751	https://api.github.com/repos/simonw/datasette/issues/1699	IC_kwDOBm6k_c5BPAn3	eyeseast 25778	2022-04-11T01:32:12Z	2022-04-11T01:32:12Z	CONTRIBUTOR	Was looking through old issues and realized a bunch of this got discussed in #1101 (including by me!), so sorry to rehash all this. Happy to help with whatever piece of it I can. Would be very excited to be able to use format plugins with exports.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Proposal: datasette query 1193090967
1092386254	https://github.com/simonw/datasette/issues/1699#issuecomment-1092386254	https://api.github.com/repos/simonw/datasette/issues/1699	IC_kwDOBm6k_c5BHH3O	eyeseast 25778	2022-04-08T02:39:25Z	2022-04-08T02:39:25Z	CONTRIBUTOR	And just to think this through a little more, here's what `stream_geojson` might look like: `python async def stream_geojson(datasette, columns, rows, database, stream): db = datasette.get_database(database) for row in rows: feature = await row_to_geojson(row, db) stream.write(feature + "\n") # just assuming newline mode for now` Alternately, that could be an async generator, like this: `python async def stream_geojson(datasette, columns, rows, database): db = datasette.get_database(database) for row in rows: feature = await row_to_geojson(row, db) yield feature` Not sure which makes more sense, but I think this pattern would open up a lot of possibility. If you had your stream_indented_json function, you could do `yield from stream_indented_json(rows, 2)` and be on your way.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Proposal: datasette query 1193090967
1092370880	https://github.com/simonw/datasette/issues/1699#issuecomment-1092370880	https://api.github.com/repos/simonw/datasette/issues/1699	IC_kwDOBm6k_c5BHEHA	eyeseast 25778	2022-04-08T02:07:40Z	2022-04-08T02:07:40Z	CONTRIBUTOR	So maybe `register_output_renderer` returns something like this: `python @hookimpl def register_output_renderer(datasette): return { "extension": "geojson", "render": render_geojson, "stream": stream_geojson, "can_render": can_render_geojson, }` And stream gets an iterator, instead of a list of rows, so it can efficiently handle large queries. Maybe it also gets passed a destination stream, or it returns an iterator. I'm not sure what makes more sense. Either way, that might cover both CLI exports and streaming responses.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Proposal: datasette query 1193090967
1092357672	https://github.com/simonw/datasette/issues/1699#issuecomment-1092357672	https://api.github.com/repos/simonw/datasette/issues/1699	IC_kwDOBm6k_c5BHA4o	eyeseast 25778	2022-04-08T01:39:40Z	2022-04-08T01:39:40Z	CONTRIBUTOR	My best thought on how to differentiate them so far is plugins: if Datasette plugins that provide alternative outputs - like .geojson and .yml and suchlike - also work for the datasette query command that would make a lot of sense to me. That's my thinking, too. It's really the thing I've been wanting since writing `datasette-geojson`, since I'm always exporting with `datasette --get`. The workflow I'm always looking for is something like this: `sh cd alltheplaces-datasette datasette query dunkin_in_suffolk -f geojson -o dunkin_in_suffolk.geojson` I think this probably needs either a new plugin hook separate from `register_output_renderer` or a way to use that without going through the HTTP stack. Or maybe a render mode that writes to a stream instead of a response. Maybe there's a new key in the dictionary that `register_output_renderer` returns that handles CLI exports.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Proposal: datasette query 1193090967
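
The long-to-wide reshape described in the "feature request: pivot command" comment above can be sketched with Python's built-in `sqlite3` module. This is only an illustration, not the sqlite-utils implementation: the `pivot_votes` helper is a hypothetical name, the `vote` table and its rows are taken from the comment, and `CASE` is swapped in for the comment's `FILTER` clause so the query also runs on older SQLite builds.

```python
import sqlite3

# Sample data copied from the comment: one row per (ballot, option) pair.
VOTES = [
    (1, 1, 0), (1, 2, 1), (1, 3, 0), (1, 4, 1),
    (2, 1, 1), (2, 2, 0), (2, 3, 1), (2, 4, 0),
]


def pivot_votes(conn):
    """Hypothetical helper: reshape vote from long form to wide form."""
    # Discover the distinct option ids so the column list is not hard-coded.
    option_ids = [
        row[0]
        for row in conn.execute(
            "SELECT DISTINCT option_id FROM vote ORDER BY option_id"
        )
    ]
    # One aggregate column per option id. int() guards the interpolated value.
    columns = ", ".join(
        "sum(CASE WHEN option_id = {0} THEN choice END) AS option_id_{0}".format(
            int(option_id)
        )
        for option_id in option_ids
    )
    sql = (
        "SELECT ballot_id, " + columns +
        " FROM vote GROUP BY ballot_id ORDER BY ballot_id"
    )
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vote (ballot_id INTEGER, option_id INTEGER, choice INTEGER)"
)
conn.executemany("INSERT INTO vote VALUES (?, ?, ?)", VOTES)
wide = pivot_votes(conn)
```

Each tuple in `wide` holds a `ballot_id` followed by one summed `choice` per option, which corresponds to the wide table in the comment.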

Advanced export

JSON shape: default, array, newline-delimited, object

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
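
The filter and sort that this page applies (`author_association = "CONTRIBUTOR"`, sorted by `updated_at` descending) can be reproduced against the schema above with a plain `sqlite3` query. The following is a minimal sketch under stated assumptions: `contributor_comments` is a hypothetical helper, only a subset of the columns is created, and the `MEMBER` row is invented purely to show the filter working.

```python
import sqlite3

# Assumption: a trimmed-down copy of the issue_comments schema, keeping
# only the columns this sketch reads.
SCHEMA = """
CREATE TABLE [issue_comments] (
   [id] INTEGER PRIMARY KEY,
   [user] INTEGER,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT
);
"""


def contributor_comments(conn, limit=100):
    """Hypothetical helper: newest-first comments written by contributors."""
    sql = (
        "SELECT id, user, updated_at, body FROM issue_comments "
        "WHERE author_association = ? "
        "ORDER BY updated_at DESC LIMIT ?"
    )
    return conn.execute(sql, ("CONTRIBUTOR", limit)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO issue_comments VALUES (?, ?, ?, ?, ?)",
    [
        # Two rows taken from the table on this page.
        (1189010812, 536941, "2022-07-19T12:47:39Z", "CONTRIBUTOR",
         "just ran into this!"),
        (1190277829, 536941, "2022-07-20T13:19:15Z", "CONTRIBUTOR",
         "hadley wickham's melt and reshape could be good inspo: "
         "http://had.co.nz/reshape/introduction.pdf"),
        # Hypothetical row, not from the source data; it should be filtered out.
        (1, 1, "2022-07-21T00:00:00Z", "MEMBER", "hypothetical row"),
    ],
)
rows = contributor_comments(conn)
```

Because `updated_at` is stored as ISO 8601 text in a single time zone, ordering the strings also orders the timestamps chronologically.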
