github

This data as json, CSV

id	node_id	number	title	user	state	assignee	milestone	comments	created_at	updated_at	author_association	pull_request	body	repo	type	reactions	draft	state_reason
849978964	MDU6SXNzdWU4NDk5Nzg5NjQ=	1293	Show column metadata plus links for foreign keys on arbitrary query results	9599	open			51	2021-04-04T22:59:42Z	2022-09-02T17:34:09Z	OWNER		Related to #620. It would be _really_ cool if Datasette could magically detect the source of the data displayed in an arbitrary query and, if that data represents a foreign key, display it as a hyperlink. Compare https://latest.datasette.io/fixtures/facetable <img width="1202" alt="fixtures__facetable__15_rows" src="https://user-images.githubusercontent.com/9599/113523748-a7d2b300-955e-11eb-904d-8cf2639b0b5f.png"> To https://latest.datasette.io/fixtures?sql=select+pk%2C+created%2C+planet_int%2C+on_earth%2C+state%2C+city_id%2C+neighborhood%2C+tags%2C+complex_array%2C+distinct_some_null+from+facetable+order+by+pk+limit+101 <img width="998" alt="fixtures__select_pk__created__planet_int__on_earth__state__city_id__neighborhood__tags__complex_array__distinct_some_null_from_facetable_order_by_pk_limit_101" src="https://user-images.githubusercontent.com/9599/113523761-be790a00-955e-11eb-8c82-36b9d0226bc8.png">	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1293/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		reopened
712260429	MDU6SXNzdWU3MTIyNjA0Mjk=	983	JavaScript plugin hooks mechanism similar to pluggy	9599	open			47	2020-09-30T20:32:43Z	2021-01-25T04:43:58Z	OWNER		> It would be neat to provide a JavaScript plugin hook that plugins can use to add their own options to this menu. No idea what that would look like though. _Originally posted by @simonw in https://github.com/simonw/datasette/issues/981#issuecomment-701616922_	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/983/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1058072543	I_kwDOBm6k_c4_EOff	1518	Complete refactor of TableView and table.html template	9599	open		3268330	45	2021-11-19T02:55:16Z	2022-03-15T18:35:49Z	OWNER		Split from #878. The current `TableView` class is by far the most complex part of Datasette, and the most difficult to work on: https://github.com/simonw/datasette/blob/0.59.2/datasette/views/table.py In #878 I started exploring a new pattern for building views. In doing so it became clear that `TableView` is the first beast that I need to slay - if I can refactor that into something neat the pattern for building other views will emerge as a natural consequence. I've been trying to build this as a `register_routes()` plugin, as originally suggested in #870 - though unfortunately it looks like those plugins can't replace existing Datasette default views at the moment, see #1517. [UPDATE: I was wrong about this, plugins can over-ride default views just fine] I also know that I want to have a fully documented template context for `table.html` as a major step on the way to Datasette 1.0, see #1510. All of this adds up to the `TableView` factor being a major project that will unblock a whole flurry of other things - so I'm going to work on that in this separate issue.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1518/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1217759117	I_kwDOBm6k_c5IlYeN	1727	Research: demonstrate if parallel SQL queries are worthwhile	9599	open			32	2022-04-27T18:54:21Z	2022-09-26T14:48:31Z	OWNER		I added parallel SQL query execution here: - https://github.com/simonw/datasette/issues/1723 My hunch is that this will take advantage of multiple cores, since Python's `sqlite3` module releases the GIL once a query is passed to SQLite. I'd really like to prove this is the case though. Just not sure how to do it! Larger question: is this performance optimization actually improving performance at all? Under what circumstances is it worthwhile?	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1727/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
323658641	MDU6SXNzdWUzMjM2NTg2NDE=	262	Add ?_extra= mechanism for requesting extra properties in JSON	9599	open		3268330	27	2018-05-16T14:55:42Z	2023-03-29T06:22:22Z	OWNER		Datasette views currently work by creating a set of data that should be returned as JSON, then defining an additional, optional `template_data()` function which is called if the view is being rendered as HTML. This `template_data()` function calculates extra template context variables which are necessary for the HTML view but should not be included in the JSON. Example of how that is used today: https://github.com/simonw/datasette/blob/2b79f2bdeb1efa86e0756e741292d625f91cb93d/datasette/views/table.py#L672-L704 With features like Facets in #255 I'm beginning to want to move more items into the `template_data()` - in the case of facets it's the `suggested_facets` array. This saves that feature from being calculated (involving several SQL queries) for the JSON case where it is unlikely to be used. But... as an API user, I want to still optionally be able to access that information. Solution: Add a `?_extra=suggested_facets&_extra=table_metadata` argument which can be used to optionally request additional blocks to be added to the JSON API. Then redefine as many of the current `template_data()` features as extra arguments instead, and teach Datasette to return certain extras by default when rendering templates. This could allow the JSON representation to be slimmed down further (removing e.g. the `table_definition` and `view_definition` keys) while still making that information available to API users who need it.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/262/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
648435885	MDU6SXNzdWU2NDg0MzU4ODU=	878	New pattern for views that return either JSON or HTML, available for plugins	9599	open		3268330	26	2020-06-30T19:26:13Z	2022-03-19T16:19:30Z	OWNER		Can be part of #870 - refactoring existing views to use `register_routes()`. > I'm going to put the new `check_permissions()` method on `BaseView` as well. If I want that method to be available to plugins I can do so by turning that `BaseView` class into a documented API that plugins are encouraged to use themselves. _Originally posted by @simonw in https://github.com/simonw/datasette/issues/832#issuecomment-651995453_	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/878/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
639993467	MDU6SXNzdWU2Mzk5OTM0Njc=	850	Proof of concept for Datasette on AWS Lambda with EFS	9599	open			25	2020-06-16T21:48:31Z	2020-06-16T23:52:16Z	OWNER		https://aws.amazon.com/about-aws/whats-new/2020/06/aws-lambda-support-for-amazon-elastic-file-system-now-generally-/ If Datasette can run on Lambda with access to EFS it could both read AND write large databases there.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/850/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
813880401	MDExOlB1bGxSZXF1ZXN0NTc3OTUzNzI3	5	WIP: Add Gmail takeout mbox import	306240	open			25	2021-02-22T21:30:40Z	2021-07-28T07:18:56Z	FIRST_TIME_CONTRIBUTOR	dogsheep/google-takeout-to-sqlite/pulls/5	WIP This PR adds the ability to import emails from a Gmail mbox export from Google Takeout. This is my first PR to a datasette/dogsheep repo. I've tested this on my personal Google Takeout mbox with ~520,000 emails going back to 2004. This took around ~20 minutes to process. To provide some feedback on the progress of the import I added the "rich" python module. I'm happy to remove that if adding a dependency is discouraged. However, I think it makes a nice addition to give feedback on the progress of a long import. Do we want to log emails that have errors when trying to import them? Dealing with encodings with emails is a bit tricky. I'm very open to feedback on how to deal with those better. As well as any other feedback for improvements.	206649770	pull	{ "url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1124731464	I_kwDOCGYnMM5DCgpI	399	Make it easier to insert geometries, with documentation and maybe code	9599	open			25	2022-02-05T00:11:26Z	2023-05-16T03:11:52Z	OWNER		In playing with the new SpatiaLite helpers from #385 I noticed that actually populating geometry columns is still a little bit tricky. Here's what I ended up doing: ```python import httpx, sqlite_utils db = sqlite_utils.Database("/tmp/spatial.db") attractions = httpx.get("https://latest.datasette.io/fixtures/roadside_attractions.json?_shape=array").json() db["attractions"].insert_all(attractions, pk="pk") # Schema of that table is now: # CREATE TABLE [attractions] ( # [pk] INTEGER PRIMARY KEY, # [name] TEXT, # [address] TEXT, # [latitude] FLOAT, # [longitude] FLOAT # ) db.init_spatialite() db["attractions"].add_geometry_column("point", "POINT") db.execute(""" update attractions set point = GeomFromText( 'POINT(' \|\| longitude \|\| ' ' \|\| latitude \|\| ')', 4326 ) """) ``` That last line took some figuring out - especially the need for the SRID of `4326`, without which I got this error: > `IntegrityError: attractions.point violates Geometry constraint [geom-type or SRID not allowed]` It would be good to both document this in more detail, but ideally also to come up with a more obvious pattern for inserting common types of spatial data. Also related: - #398 - #79	140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/399/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1855885427	I_kwDOBm6k_c5unpBz	2143	De-tangling Metadata before Datasette 1.0	15178711	open			24	2023-08-18T00:51:50Z	2023-08-24T18:28:27Z	CONTRIBUTOR		Metadata in Datasette is a really powerful feature, but is a bit difficult to work with. It was initially a way to add "metadata" about your "data" in Datasette instances, like descriptions for databases/tables/columns, titles, source URLs, licenses, etc. But it later became the go-to spot for other Datasette features that have nothing to do with metadata, like permissions/plugins/canned queries. Specifically, I've found the following problems when working with Datasette metadata: 1. Metadata cannot be updated without re-starting the entire Datasette instance. 2. The `metadata.json`/`metadata.yaml` has become a kitchen sink of unrelated (imo) features like plugin config, authentication config, canned queries 3. The Python APIs for defining extra metadata are a bit awkward (the `datasette.metadata()` class, `get_metadata()` hook, etc.) ## Possible solutions Here's a few ideas of Datasette core changes we can make to address these problems. ### Re-vamp the Datasette Python metadata APIs The Datasette object has a single `datasette.metadata()` method that's a bit difficult to work with. There's also no Python API for inserted new metadata, so plugins have to rely on the `get_metadata()` hook. The `get_metadata()` hook can also be improved - it doesn't work with async functions yet, so you're quite limited to what you can do. (I'm a bit fuzzy on what to actually do here, but I imagine it'll be very small breaking changes to a few Python methods) ### Add an optional `datasette_metadata` table Datasette should detect and use metadata stored in a new special table called `datasette_metadata`. This would be a regular table that a user can edit on their own, and would serve as a "live updating" source of metadata, than can be changed while the Datasette instance is running. Not too sure what the schema would look like, but I'd imagine: ```sql CREATE TABLE datasette_metadata( level text, target any, key text, value any, primary key (level, target) ) ``` Every row…	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/2143/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
459882902	MDU6SXNzdWU0NTk4ODI5MDI=	526	Stream all results for arbitrary SQL and canned queries	50578294	open			23	2019-06-24T13:09:45Z	2022-09-28T04:01:25Z	NONE		I think that there is a difficulty with canned queries. When I want to stream all results of a canned query TwoDays I get only first 1.000 records. Example: `http://myserver/history_sample/two_days.csv?_stream=on` returns only first 1.000 records. If I do the same with the whole database i.e. `http://myserver/history_sample/database.csv?_stream=on` I get correctly all records. Any ideas?	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/526/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
775666296	MDU6SXNzdWU3NzU2NjYyOTY=	1160	"datasette insert" command and plugin hook	9599	open			23	2020-12-29T02:37:03Z	2021-06-17T18:12:32Z	OWNER		Tools for loading data into Datasette currently mostly exist as separate utilities - `yaml-to-sqlite` and `csvs-to-sqlite` and suchlike. Bringing these into Datasette could have some interesting properties: - A `datasette insert` command could be extended with plugins to handle more formats - Any format that can be inserted on the command-line could also be inserted using a web UI or web API - which would benefit from new format plugin hooks - If Datasette ever grows beyond SQLite (see #670) a built-in import mechanism could work for those other databases as well - without me needing to write `yaml-to-postgresql` and suchlike	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1160/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
944846776	MDU6SXNzdWU5NDQ4NDY3NzY=	297	Option for importing CSV data using the SQLite .import mechanism	9599	open			23	2021-07-14T22:36:41Z	2023-09-22T20:49:52Z	OWNER		As seen in https://til.simonwillison.net/sqlite/import-csv - `.mode csv` and then `.import school.csv schools` is hugely faster than importing via `sqlite-utils insert` and doing the work in Python - but it can only be implemented by shelling out to the `sqlite3` CLI tool, it's not functionality that is exposed to the Python `sqlite3` module. An option to use this would be useful - maybe something like this: sqlite-utils insert blah.db blah blah.csv --fast	140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/297/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1857234285	I_kwDOBm6k_c5usyVt	2145	If a row has a primary key of `null` various things break	9599	open			23	2023-08-18T20:06:28Z	2023-08-21T17:30:01Z	OWNER		Stumbled across this while experimenting with `datasette-write-ui`. The error I got was a 500 on the `/db` page: > `'NoneType' object has no attribute 'encode'` Tracked it down to this code, which assembles the URL for a row page: https://github.com/simonw/datasette/blob/943df09dcca93c3b9861b8c96277a01320db8662/datasette/utils/__init__.py#L120-L134 That's because `tilde_encode` can't handle `None`: https://github.com/simonw/datasette/blob/943df09dcca93c3b9861b8c96277a01320db8662/datasette/utils/__init__.py#L1175-L1178	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/2145/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
930807135	MDU6SXNzdWU5MzA4MDcxMzU=	1384	Plugin hook for dynamic metadata	9599	open			22	2021-06-26T22:36:03Z	2022-03-14T00:36:42Z	OWNER		@brandonrobertz contributed an implementation of this in PR #1368, which I just merged. Opening this ticket to track further work on this before it goes out in a Datasette release (likely preceded by an alpha).	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1384/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
642572841	MDU6SXNzdWU2NDI1NzI4NDE=	859	Database page loads too slowly with many large tables (due to table counts)	3243482	open			21	2020-06-21T14:23:17Z	2021-08-25T21:59:55Z	CONTRIBUTOR		Hey, I have a database that I save in HTML from couple of web scrapers. There are around 200k+, 50+ rows in a couple of tables, with sqlite file weighing around 600MB. The app runs on a VPS with 2 core CPU, 4GB RAM and refreshing database page regularly takes more than 10 seconds. I was suspecting that counting tables was the culprit, but manually running `select count() from table_name` for the largest table finishes under a second. I've looked at the source code. There's a check for index page for mutable databases larger than 100MB https://github.com/simonw/datasette/blob/799c5d53570d773203527f19530cf772dc2eeb24/datasette/views/index.py#L15 but this check is not performed for database page. I've manually crippled `Database::table_counts` method ```py async def table_counts(self, limit=10): if not self.is_mutable and self.cached_table_counts is not None: return self.cached_table_counts # Try to get counts for each table, $limit timeout for each count counts = {} for table in await self.table_names(): try: # table_count = ( # await self.execute( # "select count() from [{}]".format(table), # custom_time_limit=limit, # ) # ).rows[0][0] counts[table] = 10 # table_count # In some cases I saw "SQL Logic Error" here in addition to # QueryInterrupted - so we catch that too: except (QueryInterrupted, sqlite3.OperationalError, sqlite3.DatabaseError): counts[table] = None if not self.is_mutable: self.cached_table_counts = counts return counts ``` now the page loads in <100ms. Is it possible to apply size check on database page too? <details> <summary> /-/versions output </summary> <pre> { "python": { "version": "3.8.0", "full": "3.8.0 (default, Oct 28 2019, 16:14:01) \n[GCC 8.3.0]" }, "datasette": { "version": "0.44" }, "asgi": "3.…	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/859/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
657572753	MDU6SXNzdWU2NTc1NzI3NTM=	894	?sort=colname~numeric to sort by by column cast to real	9599	open			21	2020-07-15T18:47:48Z	2021-08-20T02:07:53Z	OWNER		If a text column actually contains numbers, being able to "sort by column, treated as numeric" would be really useful. Probably depends on column actions enabled by #690	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/894/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
777333388	MDU6SXNzdWU3NzczMzMzODg=	1168	Mechanism for storing metadata in _metadata tables	9599	open			21	2021-01-01T18:47:27Z	2023-09-28T18:29:05Z	OWNER		_Original title: Perhaps metadata should all live in a `_metadata` in-memory database_ Inspired by #1150 - metadata should be exposed as an API, and for large Datasette instances that API may need to be paginated. So why not expose it through an in-memory database table? One catch to this: plugins. #860 aims to add a plugin hook for metadata. But if the metadata comes from an in-memory table, how do the plugins interact with it? The need to paginate over metadata does make a plugin hook that returns metadata for an individual table seem less wise, since we don't want to have to do 10,000 plugin hook invocations to show a list of all metadata. If those plugins write directly to the in-memory table how can their contributions survive the server restarting?	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1168/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
912864936	MDU6SXNzdWU5MTI4NjQ5MzY=	1362	Consider using CSP to protect against future XSS	9599	open			17	2021-06-06T15:32:20Z	2022-10-08T18:42:09Z	OWNER		The XSS in #1360 would have been a lot less damaging if Datasette used CSP to protect against such vulnerabilities: https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1362/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
816526538	MDU6SXNzdWU4MTY1MjY1Mzg=	239	sqlite-utils extract could handle nested objects	9599	open			16	2021-02-25T15:10:28Z	2022-09-03T23:46:02Z	OWNER		Imagine a table (imported from a nested JSON file) where one of the columns contains values that look like this: {"email": "anonymous@noreply.airtable.com", "id": "usrROSHARE0000000", "name": "Anonymous"} The `sqlite-utils extract` command already uses single text values in a column to populate a new table. It would not be much of a stretch for it to be able to use JSON instead, including specifying which of those values should be used as the primary key in the new table.	140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/239/reactions", "total_count": 6, "+1": 5, "-1": 0, "laugh": 0, "hooray": 1, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1408757705	I_kwDOBm6k_c5T9-_J	1843	Intermittent "Too many open files" error running tests	9599	open			16	2022-10-14T04:45:01Z	2022-12-17T22:02:41Z	OWNER		Partial stack trace from one of them: ``` /Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/jinja2/loaders.py:200: in get_source f = open_if_exists(filename) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ filename = '/Users/simon/Dropbox/Development/datasette/datasette/templates/error.html', mode = 'rb' def open_if_exists(filename: str, mode: str = "rb") -> t.Optional[t.IO]: """Returns a file descriptor for the filename if that file exists, otherwise ``None``. """ if not os.path.isfile(filename): return None > return open(filename, mode) E OSError: [Errno 24] Too many open files: '/Users/simon/Dropbox/Development/datasette/datasette/templates/error.html' ```	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1843/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		reopened
564833696	MDU6SXNzdWU1NjQ4MzM2OTY=	670	Prototoype for Datasette on PostgreSQL	9599	open			15	2020-02-13T17:17:55Z	2023-11-17T15:32:21Z	OWNER		I thought this would never happen, but now that I'm deep in the weeds of running SQLite in production for Datasette Cloud I'm starting to reconsider my policy of only supporting SQLite. Some of the factors making me think PostgreSQL support could be worth the effort: - Serverless. I'm getting increasingly excited about writable-database use-cases for Datasette. If it could talk to PostgreSQL then users could easily deploy it on Heroku or other serverless providers that can talk to a managed RDS-style PostgreSQL. - Existing databases. Plenty of organizations have PostgreSQL databases. They can export to SQLite using [db-to-sqlite](https://github.com/simonw/db-to-sqlite) but that's a pretty big barrier to getting started - being able to run `datasette postgresql://connection-string` and start trying it out would be a massively better experience. - Data size. I keep running into use-cases where I want to run Datasette against many GBs of data. SQLite can do this but PostgreSQL is much more optimized for large data, especially given the existence of tools like Citus. - Marketing. Convincing people to trust their data to SQLite is potentially a big barrier to adoption. Even if I've convinced myself it's trustworthy I still have to convince everyone else. - It might not be that hard? If this required a ground-up rewrite it wouldn't be worth the effort, but I have a hunch that it may not be too hard - most of the SQL in Datasette should work on both databases since it's almost all portable SELECT statements. If Datasette did DML this would be a lot harder, but it doesn't. - Plugins! This feels like a natural surface for a plugin - at which point people could add MySQL support and suchlike in the future. The above reasons feel strong enough to justify a prototype.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/670/reactions", "total_count": 19, "+1": 14, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 5, "rocket": 0, "eyes": 0 }
565064079	MDExOlB1bGxSZXF1ZXN0Mzc1MTgwODMy	672	--dirs option for scanning directories for SQLite databases	9599	open			15	2020-02-14T02:25:52Z	2020-03-27T01:03:53Z	OWNER	simonw/datasette/pulls/672	Refs #417.	107914493	pull	{ "url": "https://api.github.com/repos/simonw/datasette/issues/672/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
455486286	MDU6SXNzdWU0NTU0ODYyODY=	26	Mechanism for turning nested JSON into foreign keys / many-to-many	9599	open			14	2019-06-13T00:52:06Z	2022-06-29T23:35:29Z	OWNER		The GitHub JSON APIs have a really interesting convention with respect to related objects. Consider https://api.github.com/repos/simonw/sqlite-utils/issues - here's a truncated subset: ```json { "id": 449818897, "node_id": "MDU6SXNzdWU0NDk4MTg4OTc=", "number": 24, "title": "Additional Column Constraints?", "user": { "login": "IgnoredAmbience", "id": 98555, "node_id": "MDQ6VXNlcjk4NTU1", "avatar_url": "https://avatars0.githubusercontent.com/u/98555?v=4", "gravatar_id": "" }, "labels": [ { "id": 993377884, "node_id": "MDU6TGFiZWw5OTMzNzc4ODQ=", "url": "https://api.github.com/repos/simonw/sqlite-utils/labels/enhancement", "name": "enhancement", "color": "a2eeef", "default": true } ], "state": "open" } ``` The `user` column lists a complete user. The `labels` column has a list of labels. Since both user and label have populated `id` field this is actually enough information for us to create records for them AND set up the corresponding foreign key (for user) and m2m relationships (for labels). It would be really neat if `sqlite-utils` had some kind of mechanism for correctly processing these kind of patterns. Thanks to `jq` there's not much need for extra customization of the shape here - if we support a narrowly defined structure users can use `jq` to reshape arbitrary JSON to match.	140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/26/reactions", "total_count": 4, "+1": 4, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
647095487	MDU6SXNzdWU2NDcwOTU0ODc=	873	"datasette -p 0 --root" gives the wrong URL	9599	open			14	2020-06-29T04:03:06Z	2020-08-18T17:26:10Z	OWNER		``` $ datasette -p 0 --root http://127.0.0.1:0/-/auth-token?token=2d498c... ``` The port is incorrect.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/873/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1125297737	I_kwDOCGYnMM5DEq5J	402	Advanced class-based `conversions=` mechanism	9599	open			14	2022-02-06T19:47:41Z	2022-02-16T10:18:55Z	OWNER		The `conversions=` parameter works like this at the moment: https://sqlite-utils.datasette.io/en/3.23/python-api.html#converting-column-values-using-sql-functions ```python db["places"].insert( {"name": "Wales", "geometry": wkt}, conversions={"geometry": "GeomFromText(?, 4326)"}, ) ``` This proposal is to support values in that dictionary that are objects, not strings, which can represent more complex conversions - spun out from #399. New proposed mechanism: ```python from sqlite_utils.utils import LongitudeLatitude db["places"].insert( { "name": "London", "point": (-0.118092, 51.509865) }, conversions={"point": LongitudeLatitude}, ) ``` Here `LongitudeLatitude` is a magical value which does TWO things: it sets up the `GeomFromText(?, 4326)` SQL function, and it handles converting the `(51.509865, -0.118092)` tuple into a `POINT({} {})` string. This would involve a change to the `conversions=` contract - where it usually expects a SQL string fragment, but it can also take an object which combines that SQL string fragment with a Python conversion function. Best of all... this resolves the `lat, lon` v.s. `lon, lat` dilemma because you can use `from sqlite_utils.utils import LongitudeLatitude` OR `from sqlite_utils.utils import LatitudeLongitude` depending on which you prefer! _Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030739566_	140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/402/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1573424830	I_kwDOBm6k_c5dyI6-	2019	Refactor out the keyset pagination code	9599	open			14	2023-02-06T23:04:00Z	2023-02-08T01:40:46Z	OWNER		While working on: - #1999 I noticed that some of the most complex code in the existing table view is the code that implements keyset pagination: https://github.com/simonw/datasette/blob/0b4a28691468b5c758df74fa1d72a823813c96bf/datasette/views/table.py#L417-L493 Extracting that into a utility function would simplify that code a lot.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/2019/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1615692818	I_kwDOBm6k_c5gTYQS	2035	Potential feature: special support for `?a=1&a=2` on the query page	9599	open		3268330	14	2023-03-08T18:05:03Z	2023-03-31T16:09:08Z	OWNER		From a discussion on Discord: https://discord.com/channels/823971286308356157/996877076982415491/1082789517062320138 The key idea is to make it easier for people to implement `where id in (...)` that's populated from query string arguments. What if you could add `?id=11&id=32&id=62` to the URL and have that made available as a list that can be used in the query?	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/2035/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
456578474	MDU6SXNzdWU0NTY1Nzg0NzQ=	511	Get Datasette tests passing on Windows in GitHub Actions	9599	open			13	2019-06-15T21:41:58Z	2021-07-11T17:23:05Z	OWNER		This should almost happen as a side-effect or moving from Sanic to Uvicorn during the port to ASGI: #272 Additional steps: - test it manually - update documentation - set up some form of Windows CI	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/511/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
612287234	MDU6SXNzdWU2MTIyODcyMzQ=	16	Import machine-learning detected labels (dog, llama etc) from Apple Photos	9599	open			13	2020-05-05T02:45:43Z	2020-05-05T05:38:16Z	MEMBER		Follow-on from #1. Apple Photos runs some very sophisticated machine learning on-device to figure out if photos are of dogs, llamas and so on. I really want to extract those labels out into my own database.	256834907	issue	{ "url": "https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16/reactions", "total_count": 2, "+1": 0, "-1": 0, "laugh": 1, "hooray": 1, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
749283032	MDU6SXNzdWU3NDkyODMwMzI=	1101	register_output_renderer() should support streaming data	9599	open		3268330	13	2020-11-24T02:17:09Z	2023-01-21T22:07:19Z	OWNER		> I'd like to implement this by first extending the `register_output_renderer()` hook to support streaming huge responses, then switching CSV to use the plugin hook in addition to TSV using it. _Originally posted by @simonw in https://github.com/simonw/datasette/issues/1096#issuecomment-732542285_	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1101/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
817989436	MDU6SXNzdWU4MTc5ODk0MzY=	242	Async support	25778	open			13	2021-02-27T18:29:38Z	2021-10-28T14:37:56Z	CONTRIBUTOR		Following our conversation last week, want to note this here before I forget. I've had a couple situations where I'd like to do a bunch of updates in an async event loop, but I run into SQLite's issues with concurrent writes. This feels like something sqlite-utils could help with. PeeWee ORM has a [SQLite write queue](http://docs.peewee-orm.com/en/latest/peewee/playhouse.html#sqliteq) that might be a good model. It's using threads or gevent, but I _think_ that approach would translate well enough to asyncio. Happy to help with this, too.	140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/242/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1160182768	I_kwDOCGYnMM5FJvvw	412	Optional Pandas integration	9599	open			13	2022-03-05T01:49:27Z	2022-06-14T15:36:29Z	OWNER		It would be neat if there was a way to use this more seamlessly with Pandas, in particular Pandas dataframes - but without making Pandas a required dependency.	140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/412/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1400374908	I_kwDOBm6k_c5TeAZ8	1836	docker image is duplicating db files somehow	536941	open			13	2022-10-06T22:35:54Z	2022-10-08T16:56:51Z	CONTRIBUTOR		if you look into the docker image created by docker publish, the `datasette inspect` line is duplicating the db files. here's the result of the inspect command: <img width="490" alt="Screen Shot 2022-10-06 at 2 58 08 PM" src="https://user-images.githubusercontent.com/536941/194430909-9303ee1a-9bdd-4212-b54b-de28f43768d4.png">	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1836/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1447050738	I_kwDOBm6k_c5WQD3y	1886	Call for birthday presents: if you're using Datasette, let us know how you're using it here	9599	open			13	2022-11-13T19:25:51Z	2022-12-18T17:34:20Z	OWNER		Datasette is 5 years old today. To celebrate, I'm asking the community for birthday presents: https://simonwillison.net/2022/Nov/13/datasette-birthday/ > To celebrate this open source project’s birthday, I’ve decided to try something new: I’m going to ask for birthday presents. > > An aspect of Datastte’s marketing that I’ve so far neglected is social proof. I think it’s time to change that: I know people are using the software to do cool things, but this often happens behind closed doors. > > For Datastte’s birthday, I’m looking for endorsements and case studies and just general demonstrations that show how people are using it do so cool stuff. > > So: if you’ve used Datasette to solve a problem, and you’re willing to publicize it, please give us the gift of your endorsement! > > [...] > > Add a comment to [this issue thread](https://github.com/simonw/datasette/issues/1886) describing what you’re doing. Just a few sentences is fine—though a screenshot or even a link to a live instance would be even better	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1886/reactions", "total_count": 2, "+1": 0, "-1": 0, "laugh": 0, "hooray": 2, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1657861026	I_kwDOBm6k_c5i0POi	2054	Make detailed notes on how table, query and row views work right now	9599	open			13	2023-04-06T18:21:09Z	2023-04-07T20:14:38Z	OWNER		Research to help influence the following: - #2049 - #2053 - #2050 - #262	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/2054/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
274615452	MDU6SXNzdWUyNzQ2MTU0NTI=	111	Add “updated” to metadata	9599	open			12	2017-11-16T18:22:20Z	2021-09-21T22:48:27Z	OWNER		To give an indication as to when the data was last updated. This should be a field in the metadata that is then shown on the index page and in the footer, if it is set. Also support setting it using an option to “datasette publish” and “datasette package” - which can either be a string or can be the magic string “today” to set it to today’s date: datasette publish file.db --updated=today	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/111/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
299760684	MDU6SXNzdWUyOTk3NjA2ODQ=	185	Metadata should be a nested arbitrary KV store	222245	open			12	2018-02-23T16:02:07Z	2019-05-13T18:33:33Z	NONE		I started using the metadata feature and was surprised to find that values are not inherited from the root object down to specific databases and tables. This makes metadata much less useful and requires a lot of pointless duplication. Ideally, metadata should allow arbitrary key-value pairs, and there should be a way of accessing metadata either in an inherited or non-inherited manner. Something like `metadata.page.key` vs. `metadata.this.key` might work as an interface.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/185/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
323718842	MDU6SXNzdWUzMjM3MTg4NDI=	268	Mechanism for ranking results from SQLite full-text search	9599	open			12	2018-05-16T17:36:40Z	2022-01-13T22:21:28Z	OWNER		This isn't particularly straight-forward - all the more reason for Datasette to implement it for you. This article is helpful: http://charlesleifer.com/blog/using-sqlite-full-text-search-with-python/	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/268/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
421546944	MDU6SXNzdWU0MjE1NDY5NDQ=	417	Datasette Library	9599	open			12	2019-03-15T14:30:22Z	2020-12-29T14:34:50Z	OWNER		The ability to run Datasette in a mode where it automatically picks up new (or modified) files in a directory tree without needing to restart the server. Suggested command: datasette library /path/to/mydbs/	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/417/reactions", "total_count": 8, "+1": 8, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
463544206	MDU6SXNzdWU0NjM1NDQyMDY=	537	Populate "endpoint" key in ASGI scope	9599	open			12	2019-07-03T04:54:47Z	2019-07-22T06:03:18Z	OWNER		This is a trick used by Starlette so that other layers of ASGI middleware can see which route was selected. They added it here: https://github.com/encode/starlette/commit/34d0097feb6f057bd050d5057df5a2f96b97384e If Datasette supports it as well we can benefit from it if we integrate this sentry_asgi middleware (probably as a `datasette-sentry` plugin): https://github.com/encode/sentry-asgi/blob/c6a42d44d31f85885b79e4ee898683ecf8104971/sentry_asgi/middleware.py#L34-L35	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/537/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
567902704	MDU6SXNzdWU1Njc5MDI3MDQ=	675	--cp option for datasette publish and datasette package for shipping additional files and directories	141844	open			12	2020-02-19T22:55:56Z	2020-12-28T18:49:21Z	NONE		I’m working on integrating Datasette into a documentation-oriented publishing workflow internally in my company, and in order to deploy the Docker image created by `datasette package` I need to add an additional file to the image — in my case, it’s a sort of a deployment directive. I’ve worked out a way to do this after the image has been created, but it’s convoluted and brittle. So it’d be excellent if there was an additional option for this command, something like, like, `--copy`. I’d envision it looking something like: ```shell $ datasette package --copy /the/source/path:/the/target/path data.db ``` I’d be happy to help design, specify, implement, and test this feature, if you’d be interested. Thanks for the fantastic tools!	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/675/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
770598024	MDU6SXNzdWU3NzA1OTgwMjQ=	1152	Efficiently calculate list of databases/tables a user can view	9599	open			12	2020-12-18T06:13:01Z	2021-12-27T23:04:31Z	OWNER		> The homepage currently performs a massive flurry of permission checks - one for each, database, table and view: https://github.com/simonw/datasette/blob/0.53/datasette/views/index.py#L21-L75 > > A paginated version of this is a little daunting as the permission checks would have to be carried out in every single table just to calculate the count that will be paginated. _Originally posted by @simonw in https://github.com/simonw/datasette/issues/1150#issuecomment-747864831_	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1152/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
787098345	MDU6SXNzdWU3ODcwOTgzNDU=	1191	Ability for plugins to collaborate when adding extra HTML to blocks in default templates	9599	open		3268330	12	2021-01-15T18:18:51Z	2023-09-18T06:55:52Z	OWNER		Sometimes a plugin may want to add content to an existing default template - for example `datasette-search-all` adds a new search box at the top of `index.html`. I also want `datasette-upload-csvs` to add a CTA on the `database.html` page: https://github.com/simonw/datasette-upload-csvs/issues/18 Currently plugins can do this by providing a new version of the `index.html` template - but if multiple plugins try to do that only one of them will succeed. It would be better if there were known areas of those templates which plugins could add additional content to, such that multiple plugins can use the same spot.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1191/reactions", "total_count": 4, "+1": 4, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1219385669	I_kwDOBm6k_c5IrllF	1729	Implement ?_extra and new API design for TableView	9599	open		8755003	12	2022-04-28T22:28:14Z	2022-12-13T05:29:07Z	OWNER		Part of: - #262 - #1518	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1729/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1250495688	I_kwDOCGYnMM5KiQzI	439	Misleading progress bar against utf-16-le CSV input	4068	open			12	2022-05-27T08:34:49Z	2022-06-15T03:53:43Z	NONE		The program crashes without any error. ``` wget "https://artsdatabanken.no/Fab2018/api/export/csv" sqlite-utils create-database test.db sqlite-utils insert --csv --delimiter ";" --encoding "utf-16-le" test test.db csv [------------------------------------] 0% [#################-------------------] 49% 00:00:01 ``` I would like to highlight various issues: 1. sqlite-utils catches exceptions without printing the stacktrace and/or reraising the exception, so there is no easy way to use `pdb` or similar to debug the program, solution: add a debug option 2. Silent crash: this is related to (1.), and it happens when there is a catch-all mechanism; solution: let the program fail.	140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/439/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
317001500	MDU6SXNzdWUzMTcwMDE1MDA=	236	datasette publish lambda plugin	9599	open			11	2018-04-23T22:10:30Z	2023-03-12T14:04:15Z	OWNER		Refs #217 - create a publish plugin that can deploy to AWS Lambda. https://docs.aws.amazon.com/lambda/latest/dg/limits.html says lambda packages can be up to 50 MB, so this would only work with smaller databases (the command can check the filesize before attempting to package and deploy it). Lambdas do get a 512 MB `/tmp` directory too, so for larger databases the function could start and then download up to 512MB from an S3 bucket - so the plugin could take an optional S3 bucket to write to and know how to upload the `.db` file there and then have the lambda download it on startup.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/236/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
632724154	MDU6SXNzdWU2MzI3MjQxNTQ=	805	Writable canned queries live demo on Glitch	9599	open			11	2020-06-06T20:52:13Z	2020-07-01T22:44:01Z	OWNER		Needs to run somewhere with a mutable disk drive, so not Cloud Run or Heroku or Vercel. I think I'll put it on Glitch.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/805/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1363766973	I_kwDOCGYnMM5RSW69	484	Expose convert recipes to `sqlite-utils --functions`	9599	open			11	2022-09-06T20:15:08Z	2022-09-07T19:09:52Z	OWNER		`--functions` was added in: - #471 It would be useful if the `r.jsonsplit()` and similar recipes for `sqlite-utils convert` could be used in these blocks of code too: https://sqlite-utils.datasette.io/en/stable/cli.html#sqlite-utils-convert-recipes	140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/484/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1428630253	I_kwDOBm6k_c5VJyrt	1873	Ensure insert API has good tests for rowid and compound primark key tables	9599	open		8755003	11	2022-10-30T06:22:17Z	2022-12-13T05:29:08Z	OWNER		Following: - #1866 I need to design and implement various edge-cases or primary keys: - Table without an auto-incrementing primary key - Table with compound primary keys - Table with just a `rowid`	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1873/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		reopened
1452572348	I_kwDOBm6k_c5WlH68	1900	datasette package --spatialite throws error during build	419145	open			11	2022-11-17T02:03:28Z	2022-11-18T08:00:38Z	NONE		Hello! Attempting to use `datasette package` to bundle up a SpatiaLite DB and I'm getting this error during the `docker build`: ``` sqlite3.OperationalError: /usr/lib/x86_64-linux-gnu/mod_spatialite.so.so: cannot open shared object file: No such file or directory ``` Seems to be throwing when this step is ran: ``` ERROR [6/6] RUN datasette inspect results.db --inspect-file inspect-data.json ``` This is with `v0.63.1`.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1900/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
346027040	MDU6SXNzdWUzNDYwMjcwNDA=	355	Table view should support filtering via many-to-many relationships	9599	open			10	2018-07-31T04:04:16Z	2019-05-23T06:04:03Z	OWNER		Parent: #354	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/355/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
447469253	MDU6SXNzdWU0NDc0NjkyNTM=	485	Improvements to table label detection	9599	open	9599		10	2019-05-23T06:19:49Z	2022-10-03T00:04:42Z	OWNER		Label detection doesn't work if the primary key is called pk rather than id, so this page doesn't work: https://latest.datasette.io/fixtures/roadside_attraction_characteristics Code is here: https://github.com/simonw/datasette/blob/cccea85be6aaaeadb31f3b588ec7f732628815f5/datasette/app.py#L644-L653	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/485/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
472115381	MDU6SXNzdWU0NzIxMTUzODE=	49	extracts= should support multiple-column extracts	9599	open			10	2019-07-24T07:06:41Z	2020-10-16T19:18:19Z	OWNER		Lookup tables can be constructed on compound columns, but the `extracts=` option doesn't currently support that. Right now extracts can be defined in two ways: ```python # Extract these columns into tables with the same name: dogs = db.table("dogs", extracts=["breed", "most_recent_trophy"]) # Same as above but with custom table names: dogs = db.table("dogs", extracts={"breed": "Breeds", "most_recent_trophy": "Trophies"}) ``` Need some kind of syntax for much more complicated extractions, like when two columns (say "source" and "source_version") are extracted into a single table.	140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/49/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
646737558	MDU6SXNzdWU2NDY3Mzc1NTg=	870	Refactor default views to use register_routes	9599	open			10	2020-06-27T18:53:12Z	2022-03-15T20:07:18Z	OWNER		It would be much cleaner if Datasette's default views were all registered using the new `register_routes()` plugin hook. Could dramatically reduce the code in `datasette/app.py`. > The ideal fix here would be to rework my `BaseView` subclass mechanism to work with `register_routes()` so that those views don't have any special privileges above plugin-provided views. _Originally posted by @simonw in https://github.com/simonw/datasette/issues/864#issuecomment-648580556_	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/870/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
777140799	MDU6SXNzdWU3NzcxNDA3OTk=	1166	Adopt Prettier for JavaScript code formatting	9599	open			10	2020-12-31T21:25:27Z	2022-01-13T22:22:18Z	OWNER		https://prettier.io/ - I'm going to go with 2 spaces.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1166/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
903986178	MDU6SXNzdWU5MDM5ODYxNzg=	1344	Test Datasette Docker images built for different architectures	9599	open			10	2021-05-27T16:52:29Z	2022-09-06T00:07:58Z	OWNER		Continuing on from #1319 - now that we have the ability to build Datasette's Docker image against multiple architectures we should test that it works. We can do this with QEMU emulation, see https://twitter.com/nevali/status/1397958044571602945	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1344/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
910092577	MDU6SXNzdWU5MTAwOTI1Nzc=	1356	Research: syntactic sugar for using --get with SQL queries, maybe "datasette query"	9599	open			10	2021-06-03T04:49:42Z	2022-01-20T01:06:37Z	OWNER		Inspired by https://github.com/simonw/sqlite-utils/issues/264 - in particular this example: ``` datasette covid.db --get='/covid.yaml?sql=select * from ny_times_us_counties limit 1' - date: '2020-01-21' county: Snohomish state: Washington fips: 53061 cases: 1 deaths: 0 ``` Having to construct that URL - including potentially URL escaping the SQL query - isn't a great developer experience. Imagine if you could do this instead: datasette covid.db --query "select * from ny_times_us_counties limit 1" --format yaml	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1356/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1108671952	I_kwDOBm6k_c5CFP3Q	1605	Scripted exports	25778	open			10	2022-01-19T23:45:55Z	2022-11-30T15:06:38Z	CONTRIBUTOR		Posting this while I'm thinking about it: I mentioned at the end of [this thread](https://twitter.com/eyeseast/status/1483893011658551299) that I'm usually doing `datasette --get` to export canned queries. I used to use a tool called [datafreeze](https://github.com/pudo/datafreeze) to do scripted exports, but that project looks dead now. The ergonomics of it are pretty nice, though, and the `Freezefile.yml` structure is actually not too far from Datasette's canned queries. This is related to the idea for `datasette query` (#1356) but I think it's a distinct feature. It's most likely a plugin, but I want to raise it here because it's probably something other people have thought about.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1605/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1493471221	I_kwDOBm6k_c5ZBI_1	1949	`.json` errors should be returned as JSON	9599	open		8755003	10	2022-12-13T06:14:12Z	2022-12-15T00:46:27Z	OWNER		Eg the error in this issue: - #1945	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1949/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
275125561	MDU6SXNzdWUyNzUxMjU1NjE=	123	Datasette serve should accept paths/URLs to CSVs and other file formats	9599	open			9	2017-11-19T02:05:48Z	2021-07-19T00:04:32Z	OWNER		This would remove the csvs-to-sqlite step which I end up using for almost everything. I'm hesitant to introduce pandas as a required dependency though since it require compiling numpy. Could build it so this option is only available if you have pandas installed.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/123/reactions", "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 0 }
496415321	MDU6SXNzdWU0OTY0MTUzMjE=	1	Figure out some interesting example SQL queries	9599	open			9	2019-09-20T15:28:07Z	2021-05-03T03:46:23Z	MEMBER		My knowledge of genetics has left me short here. I'd love to be able to provide some interesting example SELECT queries - maybe one that spots if you are [likely to have red hair?](https://www.snpedia.com/index.php/Rs1805007)	209590345	issue	{ "url": "https://api.github.com/repos/dogsheep/genome-to-sqlite/issues/1/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
507454958	MDU6SXNzdWU1MDc0NTQ5NTg=	596	Handle really wide tables better	9599	open			9	2019-10-15T20:05:46Z	2022-09-07T00:58:41Z	OWNER		If a table has hundreds of columns the Datasette UI starts getting unwieldy. Addressing this would be neat. One option would be to only select the first 30 columns by default and provide a UI for selecting more.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/596/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
607223136	MDU6SXNzdWU2MDcyMjMxMzY=	741	Replace "datasette publish --extra-options" with "--setting"	9599	open		3268330	9	2020-04-27T04:29:04Z	2022-05-12T19:21:16Z	OWNER		See https://github.com/simonw/datasette-publish-now/issues/9#issuecomment-618155764 - the `--extra-options` mechanism is in practice just used to set `--config` options in data that you publish, but that means you end up with pretty messy looking commands: datasette publish my.db --extra-options="--config default_page_size:50 --config sql_time_limit_ms:3500" A neater design would be to support `--config` as an option for `datasette publish` directly: datasette publish my.db --config default_page_size:50 --config sql_time_limit_ms:3500	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/741/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
702386948	MDU6SXNzdWU3MDIzODY5NDg=	159	.delete_where() does not auto-commit (unlike .insert() or .upsert())	11712349	open			9	2020-09-16T01:55:52Z	2023-04-01T17:21:05Z	NONE		When you use the delete_where() function on a table, it never commits.... Is that intentional?	140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/159/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
763361458	MDU6SXNzdWU3NjMzNjE0NTg=	1142	"Stream all rows" is not at all obvious	9599	open			9	2020-12-12T06:24:57Z	2021-06-17T18:12:31Z	OWNER		Got a question about how to download all rows - the current option isn't at all clear. <img width="668" alt="loans__ppp_loans__9_511_rows_where_where_search_matches__tech__sorted_by_rowid" src="https://user-images.githubusercontent.com/9599/101977057-ac660b00-3bff-11eb-88f4-c93ffd03d3e0.png">	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1142/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
774332247	MDExOlB1bGxSZXF1ZXN0NTQ1MjY0NDM2	1159	Improve the display of facets information	552629	open		3268330	9	2020-12-24T11:01:47Z	2023-07-31T18:57:59Z	FIRST_TIME_CONTRIBUTOR	simonw/datasette/pulls/1159	This PR changes the display of facets to hopefully make them more readable. Before \| After ---\|--- ![image](https://user-images.githubusercontent.com/552629/103084609-b1ec2980-45df-11eb-85bc-68ab8df3e8d9.png) \| ![image](https://user-images.githubusercontent.com/552629/103085220-620e6200-45e1-11eb-8189-5dd5d3e2569e.png)	107914493	pull	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1159/reactions", "total_count": 4, "+1": 4, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
776634318	MDU6SXNzdWU3NzY2MzQzMTg=	1164	Mechanism for minifying JavaScript that ships with Datasette	9599	open			9	2020-12-30T20:59:06Z	2022-01-13T22:21:29Z	OWNER		> If I'm going to minify it I'll need to figure out a build step in Datasette itself so that I can easily work on that minified version. _Originally posted by @simonw in https://github.com/simonw/datasette/issues/983#issuecomment-752748496_	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1164/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
776635426	MDU6SXNzdWU3NzY2MzU0MjY=	1165	Mechanism for executing JavaScript unit tests	9599	open			9	2020-12-30T21:02:34Z	2022-01-13T22:21:29Z	OWNER		> I'm going to need to add JavaScript unit tests for this new plugin system. _Originally posted by @simonw in https://github.com/simonw/datasette/issues/983#issuecomment-752757289_	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1165/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
964322136	MDU6SXNzdWU5NjQzMjIxMzY=	1426	Manage /robots.txt in Datasette core, block robots by default	9599	open			9	2021-08-09T19:56:56Z	2021-12-04T07:11:29Z	OWNER		See accompanying Twitter thread: https://twitter.com/simonw/status/1424820203603431439 > Datasette currently has a plugin for configuring robots.txt, but I'm beginning to think it should be part of core and crawlers should be blocked by default - having people explicitly opt-in to having their sites crawled and indexed feels a lot safer https://datasette.io/plugins/datasette-block-robots I have a lot of Datasettes deployed now, and tailing logs shows that they are being hammered by search engine crawlers even though many of them are not interesting enough to warrant indexing. I'm starting to think blocking crawlers would actually be a better default for most people, provided it was well documented and easy to understand how to allow them. Default-deny is usually a better policy than default-allow!	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1426/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1015646369	I_kwDOBm6k_c48iYih	1480	Exceeding Cloud Run memory limits when deploying a 4.8G database	110420	open			9	2021-10-04T21:20:24Z	2022-10-07T04:39:10Z	CONTRIBUTOR		When I try to deploy a 4.8G SQLite database to Google Cloud Run, I get this error message: > Memory limit of 8192M exceeded with 8826M used. Consider increasing the memory limit, see https://cloud.google.com/run/docs/configuring/memory-limits Unfortunately, the maximum amount of memory that can be allocated to an instance is 8192M. Naively profiling the memory usage of running Datasette with this database locally on my MacBook shows the following memory usage (using Activity Monitor) when I just start up Datasette locally: - Real Memory Size: 70.6 MB - Virtual Memory Size: 4.51 GB - Shared Memory Size: 2.5 MB - Private Memory Size: 57.4 MB I'm trying to understand if there's a query or other operation that gets run during container deployment that causes memory use to be so large and if this can be avoided somehow. This is somewhat related to #1082, but on a different platform, so I decided to open a new issue.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1480/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1066603133	PR_kwDOCGYnMM4vKAzW	347	Test against pysqlite3 running SQLite 3.37	9599	open			9	2021-11-29T23:17:57Z	2021-12-11T01:02:19Z	OWNER	simonw/sqlite-utils/pulls/347	Refs #346 and #344.	140912432	pull	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/347/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1294641696	I_kwDOBm6k_c5NKqog	1767	Ability to set a custom favicon	9599	open			9	2022-07-05T18:41:12Z	2022-07-05T18:56:43Z	OWNER		If you're running a website on Datasette, like https://www.niche-museums.com/ or https://til.simonwillison.net/ - you should have the ability to easily specify a custom favicon. Currently the `/favicon.ico` view is hard-coded to do this: https://github.com/simonw/datasette/blob/9f1eb0d4eac483b953392157bd9fd6cc4df37de7/datasette/app.py#L179-L188	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1767/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1323346408	I_kwDOBm6k_c5O4Kno	1775	i18n support	428820	open			9	2022-07-31T02:51:04Z	2023-02-10T18:04:40Z	NONE		I want contribute for translate UI to es, de, de and it if you share strings	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1775/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1374939463	I_kwDOCGYnMM5R8-lH	489	Ability to load JSON records held in a file with a single top level key that is a list of objects	9599	open			9	2022-09-15T18:46:03Z	2022-09-15T20:56:10Z	OWNER		It's very common for JSON to look like this: ```json { "Version": "5.5.52.6", "List": [ { "Description": "Nonpartisan", "Id": 1, "ExternalId": "" }, { "Description": "Undeclared", "Id": 2, "ExternalId": "" } ] } ``` This example taken from the records downloaded from https://www.elections.alaska.gov/election-results/e/ Right now you can't import this into `sqlite-utils` - you need to run it through `jq .List` first. But since this is so common, it would be neat if `sqlite-utils` could have a rule of thumb that says "if it's an object, but it has a single key that is is a list of objects, use that instead".	140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/489/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1663399821	I_kwDOBm6k_c5jJXeN	2058	500 "attempt to write a readonly database" error caused by "PRAGMA schema_version"	9599	open			9	2023-04-11T23:57:50Z	2023-04-13T16:35:21Z	OWNER		I've not been able to replicate this myself yet, but I've seen log files from a user affected by it. ``` File "/usr/local/lib/python3.11/site-packages/datasette/views/base.py", line 89, in dispatch_request await self.ds.refresh_schemas() File "/usr/local/lib/python3.11/site-packages/datasette/app.py", line 371, in refresh_schemas await self._refresh_schemas() File "/usr/local/lib/python3.11/site-packages/datasette/app.py", line 386, in _refresh_schemas schema_version = (await db.execute("PRAGMA schema_version")).first()[0] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/datasette/database.py", line 267, in execute results = await self.execute_fn(sql_operation_in_thread) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/datasette/database.py", line 213, in execute_fn return await asyncio.get_event_loop().run_in_executor( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run result = self.fn(self.args, *self.kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/datasette/database.py", line 211, in in_thread return fn(conn) ^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/datasette/database.py", line 237, in sql_operation_in_thread cursor.execute(sql, params if params is not None else {}) sqlite3.OperationalError: attempt to write a readonly database ``` That's running the official Datasette Docker image on https://fly.io/ - it's causing 500 errors on every page of their site.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/2058/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1907765514	I_kwDOBm6k_c5xtjEK	2195	`datasette publish` needs support for the new config/metadata split	9599	open			9	2023-09-21T21:08:12Z	2023-09-21T22:57:48Z	OWNER		> ... which raises the challenge that `datasette publish` doesn't yet know what to do with a config file! _Originally posted by @simonw in https://github.com/simonw/datasette/issues/2194#issuecomment-1730259871_	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/2195/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
285168503	MDU6SXNzdWUyODUxNjg1MDM=	176	Add GraphQL endpoint	173848	open			8	2017-12-29T23:21:01Z	2020-04-21T14:16:24Z	NONE		Would make it much easier to build React & similar frontends. Maybe with https://github.com/graphql-python/sanic-graphql ?	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/176/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
459509126	MDU6SXNzdWU0NTk1MDkxMjY=	516	Enforce import sort order with isort	9599	open			8	2019-06-22T20:35:50Z	2023-08-23T02:15:36Z	OWNER		I want to use isort to order imports. A few steps here: - [x] Add a .isort.cfg file (see below) - [x] Use `isort -rc` to reformat existing code - [ ] Commit this change - [x] Add a unit test that ensures future changes remain isort compatible	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/516/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
602533300	MDU6SXNzdWU2MDI1MzMzMDA=	1	Import photo metadata from Apple Photos into SQLite	9599	open		5324096	8	2020-04-18T19:23:26Z	2020-05-04T02:41:40Z	MEMBER		Faces, albums, locations, that kind of thing.	256834907	issue	{ "url": "https://api.github.com/repos/dogsheep/dogsheep-photos/issues/1/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
634663505	MDU6SXNzdWU2MzQ2NjM1MDU=	815	Group permission checks by request on /-/permissions debug page	9599	open			8	2020-06-08T14:25:23Z	2020-12-17T22:06:48Z	OWNER		Now that we're making a LOT more permission checks (on the DB index page we do a check for every listed table for example) the `/-/permissions` page gets filled up pretty quickly. Can make this more readable by grouping permission checks by request. Have most recent request at the top of the page but the permission requests within that page sorted chronologically by most recent last.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/815/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
724878151	MDU6SXNzdWU3MjQ4NzgxNTE=	1032	Bring date parsing into Datasette core	9599	open			8	2020-10-19T18:30:45Z	2020-10-19T19:37:55Z	OWNER		Currently this is mainly handled by a plugin - https://github.com/simonw/datasette-dateutil - but I realise now that this really needs to be core functionality. See also Twitter thread: https://twitter.com/simonw/status/1318234808653213696	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1032/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
778450486	MDU6SXNzdWU3Nzg0NTA0ODY=	1171	GitHub Actions workflow to build and sign macOS binary executables	9599	open			8	2021-01-04T23:36:59Z	2021-01-07T19:36:00Z	OWNER		Using PyInstaller, as explored in #93 and https://til.simonwillison.net/python/packaging-pyinstaller The bigger challenge will be the code signing bit. I'll need a Apple Developer account ($99/year) and some extensive CI fiddling.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1171/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
780278550	MDU6SXNzdWU3ODAyNzg1NTA=	1179	Make original path available to render hooks	9599	open			8	2021-01-06T08:31:45Z	2021-01-25T04:44:33Z	OWNER		https://github.com/simonw/datasette-export-notebook/blob/0.1/datasette_export_notebook/__init__.py ```python async def render_notebook(datasette, request): return Response.html( await datasette.render_template( "export_notebook.html", { "csv_stream_url": datasette.absolute_url( request, path_with_format( request=request, format="csv", extra_qs={"_stream": "on"} ), ), "json_url": datasette.absolute_url( request, path_with_format( request=request, format="json", extra_qs={"_shape": "array"} ), ), "json": json, }, ) ) ``` This results in https://latest-with-plugins.datasette.io/github/issue_comments.Notebook showing `http://latest-with-plugins.datasette.io/github/issue_comments.Notebook?_format=json&_shape=array`	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1179/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
969855774	MDU6SXNzdWU5Njk4NTU3NzQ=	1432	Rename Datasette.__init__(config=) parameter to settings=	9599	open			8	2021-08-13T01:00:27Z	2021-10-19T01:16:41Z	OWNER		> While I'm doing this I should rename this internal variable to avoid confusion in the future: > > https://github.com/simonw/datasette/blob/e837095ef35ae155b4c78cc9a8b7133a48c94f03/datasette/app.py#L203 _Originally posted by @simonw in https://github.com/simonw/datasette/issues/1431#issuecomment-898072940_	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1432/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1083657868	I_kwDOBm6k_c5Al06M	1565	Documented JavaScript variables on different templates made available for plugins	9599	open			8	2021-12-17T22:30:51Z	2021-12-19T22:37:29Z	OWNER		While working on https://github.com/simonw/datasette-leaflet-freedraw/issues/10 I found myself writing this atrocity to figure out the SQL query used for a specific table page: ```javascript let innerSql = Array.from(document.getElementsByTagName("span")).filter( el => el.innerText == "View and edit SQL" )[0].parentElement.getAttribute("title") ``` This is obviously bad - it's very brittle, and will break if I ever change the text on that link (like localizing it for example). Instead, I think pages like that one should have a block of script at the bottom something like this: ```javascript window.datasette = window.datasette \|\| {}; datasette.view_name = 'table'; datasette.table_sql = 'select * from ...'; ```	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1565/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1160034488	I_kwDOCGYnMM5FJLi4	411	Support for generated columns	25778	open			8	2022-03-04T20:41:33Z	2022-03-11T22:32:43Z	CONTRIBUTOR		This is a fairly new feature -- SQLite version 3.31.0 (2020-01-22) -- that I, admittedly, haven't gotten to work yet. But it looks _incredibly_ useful: https://dgl.cx/2020/06/sqlite-json-support I'm not sure if this is an option on `add-column` or a separate command like `add-generated-column`. Either way, it needs an argument to populate it. It could be something like this: ```sh sqlite-utils add-column data.db table-name generated --as 'json_extract(data, "$.field")' --virtual ``` More here: https://www.sqlite.org/gencol.html	140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/411/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1163369515	I_kwDOBm6k_c5FV5wr	1655	query result page is using 400mb of browser memory 40x size of html page and 400x size of csv data	536941	open			8	2022-03-09T00:56:40Z	2023-10-17T21:53:17Z	CONTRIBUTOR		[this page](https://labordata.bunkum.us/opdr-8335ea3?sql=with+most_recent_lu+as+%28%0D%0A++select%0D%0A++++%0D%0A++from%0D%0A++++%28%0D%0A++++++select%0D%0A++++++++%0D%0A++++++from%0D%0A++++++++lm_data%0D%0A++++++order+by%0D%0A++++++++f_num%2C%0D%0A++++++++receive_date+desc%0D%0A++++%29+t%0D%0A++group+by%0D%0A++++f_num%0D%0A%29%0D%0Aselect%0D%0A++aff_abbr+%7C%7C+coalesce%28%27+local+%27+%7C%7C+desig_num%2C+%27+%27+%7C%7C+unit_name%29+as+abbr_local_name%2C%0D%0A++coalesce%28%0D%0A++++regexp_match%28%27%28.%3F%29%28%2C%3F+AFL-CIO%24%29%27%2C+union_name%29%2C%0D%0A++++regexp_match%28%27%28.%3F%29%28+IND%24%29%27%2C+union_name%29%2C%0D%0A++++union_name%0D%0A++%29+%7C%7C+coalesce%28%27+local+%27+%7C%7C+desig_num%2C+%27+%27+%7C%7C+unit_name%29+as+full_local_name%2C%0D%0A++*%0D%0Afrom%0D%0A++most_recent_lu%0D%0Awhere+%28desig_num+IS+NOT+NULL+OR+unit_name+IS+NOT+NULL%29+AND+desig_name+%21%3D+%27HQ%27%0D%0Alimit%0D%0A++5000+offset+0) is using about 400 mb in firefox 97 on mac os x. if you download the html for the page, it's about 11mb and if you get the csv for the data its about 1mb. it's using over a 1G on chrome 99. i found this because, i was trying to figure out why editing the SQL was getting very slow.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1655/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1174655187	I_kwDOBm6k_c5GA9DT	1671	Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply	9308268	open			8	2022-03-20T19:17:24Z	2022-03-22T17:43:12Z	NONE		I found a strange behavior, and I'm not sure if it's related to views and boolean values perhaps, or if there's something else weird going on here, but I'll provide an example that may help show what I'm seeing happen. ```bash #!/bin/bash echo "\"id\",\"expiration_date\" 0,2018-01-04 1,2019-01-05 2,2020-01-06 3,2021-01-07 4,2022-01-08 5,2023-01-09 6,2024-01-10 7,2025-01-11 8,2026-01-12 9,2027-01-13 " > test.csv csvs-to-sqlite test.csv test.db sqlite-utils create-view --replace test.db test_view "select id, expiration_date, case when julianday('NOW') >= julianday(expiration_date) then 1 else 0 end as has_expired FROM test" ``` ```bash datasette test.db ``` ![image](https://user-images.githubusercontent.com/9308268/159178745-9c6152f7-eac6-4bf9-bef5-a2d63d3ee13f.png) ![image](https://user-images.githubusercontent.com/9308268/159178824-c8952137-270c-42a4-ad1c-f6ad2c51e499.png) ![image](https://user-images.githubusercontent.com/9308268/159178877-23e00b36-443a-43ef-83e5-e0bdddd3fdcd.png) ![image](https://user-images.githubusercontent.com/9308268/159178918-65922cc7-2514-4735-a72d-4904b99976d4.png) Thanks again and let me know if you want me to provide anything else!	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/1671/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1560982210	PR_kwDOBm6k_c5IvYKw	2008	array facet: don't materialize unnecessary columns	193185	open			8	2023-01-28T19:33:40Z	2023-01-29T18:17:40Z	CONTRIBUTOR	simonw/datasette/pulls/2008	The presence of `inner.*` causes SQLite to materialize a row with all the columns. Those columns will be discarded later. Instead, we can select only the column we'll use. This lets SQLite's optimizer realize that the other columns in the CTE definition aren't needed. On a test table with 278K rows, 98K of which had an array, this speeds up the facet calculation from 4 sec to 1 sec. <!-- readthedocs-preview datasette start --> ---- :books: Documentation preview :books:: https://datasette--2008.org.readthedocs.build/en/2008/ <!-- readthedocs-preview datasette end -->	107914493	pull	{ "url": "https://api.github.com/repos/simonw/datasette/issues/2008/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1781530343	I_kwDOBm6k_c5qL_7n	2093	Proposal: Combine settings, metadata, static, etc. into a single `datasette.yaml` File	15178711	open			8	2023-06-29T21:18:23Z	2023-09-11T20:19:32Z	CONTRIBUTOR		Very often I get tripped up when trying to configure my Datasette instances. For example: if I want to change the port my app listen too, do I do that with a CLI flag, a `--setting` flag, inside `metadata.json`, or an env var? If I want to up the time limit of SQL statements, is that under `metadata.json` or a setting? Where does my plugin configuration go? Normally I need to look it up in Datasette docs, and I quickly find my answer, but the number of places where "config" goes it overwhelming. - Flat CLI flags like `--port`, `--host`, `--cors`, etc. - `--setting`, like `default_page_size`, `sql_time_limit_ms` etc - Inside `metadata.json`, including plugin configuration Typically my Datasette deploys are extremely long shell commands, with multiple `--setting` and other CLI flags. ## Proposal: Consolidate all "config" into `datasette.toml` I propose that we add a new `datasette.toml` that combines "settings", "metadata", and other common CLI flags like `--port` and `--cors` into a single file. It would be similar to "Cargo.toml" in Rust projects, "package.json" in Node projects, and "pyproject.toml" in Python, etc. A sample of what it could look like: ```toml # "top level" configuration that are currently CLI flags on `datasette serve` [config] port = 8020 host = "0.0.0.0" cors = true # replaces multiple `--setting` flags [settings] base_url = "/app/datasette/" default_allow_sql = true sql_time_limit_ms = 3500 # replaces `metadata.json`. # The contents of datasette-metadata.json could be defined in this file instead, but supporting separate files is nice (since those are easy to machine-generate) [metadata] include="./datasette-metadata.json" # plugin-specific [plugins] [plugins.datasette-auth-github] client_id = {env = "DATASETTE_AUTH_GITHUB_CLIENT_ID"} client_secret = {env = "GITHUB_CLIENT_SECRET"} [plugins.datasette-cluster-map] latitude_column = "lat" longitude_column = "lon" ``` ## Pros - Instead of multiple files and CLI flags, everything could b…	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/2093/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
323223872	MDU6SXNzdWUzMjMyMjM4NzI=	260	Validate metadata.json on startup	9599	open			7	2018-05-15T13:42:56Z	2023-06-21T12:51:22Z	OWNER		It's easy to misspell the name of a database or table and then be puzzled when the metadata settings silently fail. To avoid this, let's sanity check the provided metadata.json on startup and quit with a useful error message if we find any obvious mistakes.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/260/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
327365110	MDU6SXNzdWUzMjczNjUxMTA=	294	inspect should record column types	9599	open			7	2018-05-29T15:10:41Z	2019-06-28T16:45:28Z	OWNER		For each table we want to know the columns, their order and what type they are. I'm going to break with SQLite defaults a little on this one and allow datasette to define additional types - to start with just a `geometry` type for columns that are detected as SpatiaLite geometries. Possible JSON design: "columns": [{ "name": "title", "type": "text" }, ...] Refs #276	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/294/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
520681725	MDU6SXNzdWU1MjA2ODE3MjU=	621	Syntax for ?_through= that works as a form field	9599	open			7	2019-11-11T00:19:03Z	2021-12-18T01:42:33Z	OWNER		The current syntax for `?_through=` uses JSON to avoid any risk of confusion with table or column names that contain special characters. This means you can't target a form field at it. We should be able to support both - `?x.y.z=value` for tables and columns with "regular" names, falling back to the current JSON syntax for columns or tables that won't work with the key/value syntax.	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/621/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
642388564	MDU6SXNzdWU2NDIzODg1NjQ=	858	publish heroku does not work on Windows 10	870912	open			7	2020-06-20T14:40:28Z	2021-06-10T17:44:09Z	NONE		When executing "datasette publish heroku schools.db" on Windows 10, I get the following error ```shell File "c:\users\dell\.virtualenvs\sec-schools-jn-cwk8z\lib\site-packages\datasette\publish\heroku.py", line 54, in heroku line.split()[0] for line in check_output(["heroku", "plugins"]).splitlines() File "c:\python38\lib\subprocess.py", line 411, in check_output return run(popenargs, stdout=PIPE, timeout=timeout, check=True, File "c:\python38\lib\subprocess.py", line 489, in run with Popen(popenargs, **kwargs) as process: File "c:\python38\lib\subprocess.py", line 854, in __init__ self._execute_child(args, executable, preexec_fn, close_fds, File "c:\python38\lib\subprocess.py", line 1307, in _execute_child hp, ht, pid, tid = _winapi.CreateProcess(executable, args, FileNotFoundError: [WinError 2] The system cannot find the file specified ``` Changing https://github.com/simonw/datasette/blob/55a6ffb93c57680e71a070416baae1129a0243b8/datasette/publish/heroku.py#L54 to ```python line.split()[0] for line in check_output(["heroku", "plugins"], shell=True).splitlines() ``` as well as the other `check_output()` and `call()` within the same file leads me to another recursive error about temp files	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/858/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
672421411	MDU6SXNzdWU2NzI0MjE0MTE=	916	Support reverse pagination (previous page, has-previous-items)	9599	open			7	2020-08-04T00:32:06Z	2021-04-03T23:43:11Z	OWNER		I need this for `datasette-graphql` for full compatibility with the way Relay likes to paginate - using cursors for paginating backwards as well as for paginating forwards. > This may be the kick I need to get Datasette pagination to work in reverse too. _Originally posted by @simonw in https://github.com/simonw/datasette-graphql/issues/2#issuecomment-668305853_	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/916/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
688670158	MDU6SXNzdWU2ODg2NzAxNTg=	147	SQLITE_MAX_VARS maybe hard-coded too low	96218	open			7	2020-08-30T07:26:45Z	2021-02-15T21:27:55Z	CONTRIBUTOR		I came across this while about to open an issue and PR against the documentation for `batch_size`, which is a bit incomplete. As mentioned in #145, while: > [`SQLITE_MAX_VARIABLE_NUMBER`](https://www.sqlite.org/limits.html#max_variable_number) ... defaults to 999 for SQLite versions prior to 3.32.0 (2020-05-22) or 32766 for SQLite versions after 3.32.0. it is common that it is increased at compile time. Debian-based systems, for example, seem to ship with a version of sqlite compiled with SQLITE_MAX_VARIABLE_NUMBER set to 250,000, and I believe this is the case for homebrew installations too. In working to understand what `batch_size` was actually doing and why, I realized that by setting `SQLITE_MAX_VARS` in `db.py` to match the value my sqlite was compiled with (I'm on Debian), I was able to decrease the time to `insert_all()` my test data set (~128k records across 7 tables) from ~26.5s to ~3.5s. Given that this about .05% of my total dataset, this is time I am keen to save... Unfortunately, it seems that `sqlite3` in the python standard library doesn't expose the `get_limit()` C API (even though `pysqlite` used to), so it's hard to know what value sqlite has been compiled with (note that this could mean, I suppose, that it's less than 999, and even hardcoding `SQLITE_MAX_VARS` to the conservative default might not be adequate. It can also be lowered -- but not raised -- at runtime). The best I could come up with is `echo "" \| sqlite3 -cmd ".limits variable_number"` (only available in `sqlite >= 2015-05-07 (3.8.10)`). Obviously this couldn't be relied upon in `sqlite_utils`, but I wonder what your opinion would be about exposing `SQLITE_MAX_VARS` as a user-configurable parameter (with suitable "here be dragons" warnings)? I'm going to go ahead and monkey-patch it for my purposes in any event, but it seems like it might be worth considering.	140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/147/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
703218756	MDU6SXNzdWU3MDMyMTg3NTY=	50	Commands for making authenticated API calls	9599	open			7	2020-09-17T02:39:07Z	2020-10-19T05:01:29Z	MEMBER		Similar to `twitter-to-sqlite fetch`, see https://github.com/dogsheep/twitter-to-sqlite/issues/51	207052882	issue	{ "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/50/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
705215230	MDU6SXNzdWU3MDUyMTUyMzA=	26	Pagination	9599	open			7	2020-09-21T00:14:37Z	2020-09-21T02:55:54Z	MEMBER		Useful for #16 (timeline view) since you can now filter to just the items on a specific day - but if there are more than 50 items you can't see them all.	197431109	issue	{ "url": "https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
712984738	MDU6SXNzdWU3MTI5ODQ3Mzg=	987	Documented HTML hooks for JavaScript plugin authors	9599	open			7	2020-10-01T16:10:14Z	2021-01-25T04:00:03Z	OWNER		In #981 I added `data-column=` attributes to the `<th>` on the table page. These should become part of Datasette's documented API so JavaScript plugin authors can use them to derive things about the tables shown on a page (`datasette-cluster-map uses them as-of https://github.com/simonw/datasette-cluster-map/issues/18).	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/987/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
714377268	MDU6SXNzdWU3MTQzNzcyNjg=	991	Redesign application homepage	9599	open			7	2020-10-04T18:48:45Z	2021-01-26T19:06:36Z	OWNER		Most Datasette instances only host a single database, but the current homepage design assumes that it should leave plenty of space for multiple databases: <img width="878" alt="Datasette_Fixtures__fixtures" src="https://user-images.githubusercontent.com/9599/95024344-5b51fd80-0637-11eb-8a11-40bad16f6907.png"> Reconsider this design - should the default show more information? The Covid-19 Datasette homepage looks particularly sparse I think: https://covid-19.datasettes.com/ <img width="782" alt="COVID-19_cases__using_data_from_Johns_Hopkins_CSSE__the_New_York_Times_and_the_LA_Times__covid" src="https://user-images.githubusercontent.com/9599/95024391-876d7e80-0637-11eb-8f19-ef38e4c87d2a.png">	107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/991/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }

github

Custom SQL query returning 101 rows (hide)

Query parameters