github
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | pull_request | body | repo | type | active_lock_reason | performed_via_github_app | reactions | draft | state_reason |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
849978964 | MDU6SXNzdWU4NDk5Nzg5NjQ= | 1293 | Show column metadata plus links for foreign keys on arbitrary query results | 9599 | open | 0 | 51 | 2021-04-04T22:59:42Z | 2022-09-02T17:34:09Z | OWNER | Related to #620. It would be _really_ cool if Datasette could magically detect the source of the data displayed in an arbitrary query and, if that data represents a foreign key, display it as a hyperlink. Compare https://latest.datasette.io/fixtures/facetable <img width="1202" alt="fixtures__facetable__15_rows" src="https://user-images.githubusercontent.com/9599/113523748-a7d2b300-955e-11eb-904d-8cf2639b0b5f.png"> To https://latest.datasette.io/fixtures?sql=select+pk%2C+created%2C+planet_int%2C+on_earth%2C+state%2C+city_id%2C+neighborhood%2C+tags%2C+complex_array%2C+distinct_some_null+from+facetable+order+by+pk+limit+101 <img width="998" alt="fixtures__select_pk__created__planet_int__on_earth__state__city_id__neighborhood__tags__complex_array__distinct_some_null_from_facetable_order_by_pk_limit_101" src="https://user-images.githubusercontent.com/9599/113523761-be790a00-955e-11eb-8c82-36b9d0226bc8.png"> | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1293/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
reopened | |||||||
712260429 | MDU6SXNzdWU3MTIyNjA0Mjk= | 983 | JavaScript plugin hooks mechanism similar to pluggy | 9599 | open | 0 | 47 | 2020-09-30T20:32:43Z | 2021-01-25T04:43:58Z | OWNER | > It would be neat to provide a JavaScript plugin hook that plugins can use to add their own options to this menu. No idea what that would look like though. _Originally posted by @simonw in https://github.com/simonw/datasette/issues/981#issuecomment-701616922_ | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/983/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1058072543 | I_kwDOBm6k_c4_EOff | 1518 | Complete refactor of TableView and table.html template | 9599 | open | 0 | 3268330 | 45 | 2021-11-19T02:55:16Z | 2022-03-15T18:35:49Z | OWNER | Split from #878. The current `TableView` class is by far the most complex part of Datasette, and the most difficult to work on: https://github.com/simonw/datasette/blob/0.59.2/datasette/views/table.py In #878 I started exploring a new pattern for building views. In doing so it became clear that `TableView` is the first beast that I need to slay - if I can refactor that into something neat the pattern for building other views will emerge as a natural consequence. I've been trying to build this as a `register_routes()` plugin, as originally suggested in #870 - though unfortunately it looks like those plugins can't replace existing Datasette default views at the moment, see #1517. [UPDATE: I was wrong about this, plugins can override default views just fine] I also know that I want to have a fully documented template context for `table.html` as a major step on the way to Datasette 1.0, see #1510. All of this adds up to the `TableView` refactor being a major project that will unblock a whole flurry of other things - so I'm going to work on that in this separate issue. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1518/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
|||||||
1217759117 | I_kwDOBm6k_c5IlYeN | 1727 | Research: demonstrate if parallel SQL queries are worthwhile | 9599 | open | 0 | 32 | 2022-04-27T18:54:21Z | 2022-09-26T14:48:31Z | OWNER | I added parallel SQL query execution here: - https://github.com/simonw/datasette/issues/1723 My hunch is that this will take advantage of multiple cores, since Python's `sqlite3` module releases the GIL once a query is passed to SQLite. I'd really like to prove this is the case though. Just not sure how to do it! Larger question: is this performance optimization actually improving performance at all? Under what circumstances is it worthwhile? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1727/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
323658641 | MDU6SXNzdWUzMjM2NTg2NDE= | 262 | Add ?_extra= mechanism for requesting extra properties in JSON | 9599 | open | 0 | 3268330 | 27 | 2018-05-16T14:55:42Z | 2023-03-29T06:22:22Z | OWNER | Datasette views currently work by creating a set of data that should be returned as JSON, then defining an additional, optional `template_data()` function which is called if the view is being rendered as HTML. This `template_data()` function calculates extra template context variables which are necessary for the HTML view but should not be included in the JSON. Example of how that is used today: https://github.com/simonw/datasette/blob/2b79f2bdeb1efa86e0756e741292d625f91cb93d/datasette/views/table.py#L672-L704 With features like Facets in #255 I'm beginning to want to move more items into the `template_data()` - in the case of facets it's the `suggested_facets` array. This saves that feature from being calculated (involving several SQL queries) for the JSON case where it is unlikely to be used. But... as an API user, I want to still optionally be able to access that information. Solution: Add a `?_extra=suggested_facets&_extra=table_metadata` argument which can be used to optionally request additional blocks to be added to the JSON API. Then redefine as many of the current `template_data()` features as extra arguments instead, and teach Datasette to return certain extras by default when rendering templates. This could allow the JSON representation to be slimmed down further (removing e.g. the `table_definition` and `view_definition` keys) while still making that information available to API users who need it. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/262/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
|||||||
648435885 | MDU6SXNzdWU2NDg0MzU4ODU= | 878 | New pattern for views that return either JSON or HTML, available for plugins | 9599 | open | 0 | 3268330 | 26 | 2020-06-30T19:26:13Z | 2022-03-19T16:19:30Z | OWNER | Can be part of #870 - refactoring existing views to use `register_routes()`. > I'm going to put the new `check_permissions()` method on `BaseView` as well. If I want that method to be available to plugins I can do so by turning that `BaseView` class into a documented API that plugins are encouraged to use themselves. _Originally posted by @simonw in https://github.com/simonw/datasette/issues/832#issuecomment-651995453_ | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/878/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
|||||||
639993467 | MDU6SXNzdWU2Mzk5OTM0Njc= | 850 | Proof of concept for Datasette on AWS Lambda with EFS | 9599 | open | 0 | 25 | 2020-06-16T21:48:31Z | 2020-06-16T23:52:16Z | OWNER | https://aws.amazon.com/about-aws/whats-new/2020/06/aws-lambda-support-for-amazon-elastic-file-system-now-generally-/ If Datasette can run on Lambda with access to EFS it could both read AND write large databases there. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/850/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
813880401 | MDExOlB1bGxSZXF1ZXN0NTc3OTUzNzI3 | 5 | WIP: Add Gmail takeout mbox import | 306240 | open | 0 | 25 | 2021-02-22T21:30:40Z | 2021-07-28T07:18:56Z | FIRST_TIME_CONTRIBUTOR | dogsheep/google-takeout-to-sqlite/pulls/5 | WIP This PR adds the ability to import emails from a Gmail mbox export from Google Takeout. This is my first PR to a datasette/dogsheep repo. I've tested this on my personal Google Takeout mbox with ~520,000 emails going back to 2004. This took around ~20 minutes to process. To provide some feedback on the progress of the import I added the "rich" python module. I'm happy to remove that if adding a dependency is discouraged. However, I think it makes a nice addition to give feedback on the progress of a long import. Do we want to log emails that have errors when trying to import them? Dealing with encodings with emails is a bit tricky. I'm very open to feedback on how to deal with those better. As well as any other feedback for improvements. | 206649770 | pull | { "url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
0 | ||||||
1124731464 | I_kwDOCGYnMM5DCgpI | 399 | Make it easier to insert geometries, with documentation and maybe code | 9599 | open | 0 | 25 | 2022-02-05T00:11:26Z | 2023-05-16T03:11:52Z | OWNER | In playing with the new SpatiaLite helpers from #385 I noticed that actually populating geometry columns is still a little bit tricky. Here's what I ended up doing: ```python import httpx, sqlite_utils db = sqlite_utils.Database("/tmp/spatial.db") attractions = httpx.get("https://latest.datasette.io/fixtures/roadside_attractions.json?_shape=array").json() db["attractions"].insert_all(attractions, pk="pk") # Schema of that table is now: # CREATE TABLE [attractions] ( # [pk] INTEGER PRIMARY KEY, # [name] TEXT, # [address] TEXT, # [latitude] FLOAT, # [longitude] FLOAT # ) db.init_spatialite() db["attractions"].add_geometry_column("point", "POINT") db.execute(""" update attractions set point = GeomFromText( 'POINT(' || longitude || ' ' || latitude || ')', 4326 ) """) ``` That last line took some figuring out - especially the need for the SRID of `4326`, without which I got this error: > `IntegrityError: attractions.point violates Geometry constraint [geom-type or SRID not allowed]` It would be good to both document this in more detail, but ideally also to come up with a more obvious pattern for inserting common types of spatial data. Also related: - #398 - #79 | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/399/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1855885427 | I_kwDOBm6k_c5unpBz | 2143 | De-tangling Metadata before Datasette 1.0 | 15178711 | open | 0 | 24 | 2023-08-18T00:51:50Z | 2023-08-24T18:28:27Z | CONTRIBUTOR | Metadata in Datasette is a really powerful feature, but is a bit difficult to work with. It was initially a way to add "metadata" about your "data" in Datasette instances, like descriptions for databases/tables/columns, titles, source URLs, licenses, etc. But it later became the go-to spot for other Datasette features that have nothing to do with metadata, like permissions/plugins/canned queries. Specifically, I've found the following problems when working with Datasette metadata: 1. Metadata cannot be updated without re-starting the entire Datasette instance. 2. The `metadata.json`/`metadata.yaml` has become a kitchen sink of unrelated (imo) features like plugin config, authentication config, canned queries 3. The Python APIs for defining extra metadata are a bit awkward (the `datasette.metadata()` method, `get_metadata()` hook, etc.) ## Possible solutions Here are a few ideas for Datasette core changes we can make to address these problems. ### Re-vamp the Datasette Python metadata APIs The Datasette object has a single `datasette.metadata()` method that's a bit difficult to work with. There's also no Python API for inserting new metadata, so plugins have to rely on the `get_metadata()` hook. The `get_metadata()` hook can also be improved - it doesn't work with async functions yet, so you're quite limited in what you can do. (I'm a bit fuzzy on what to actually do here, but I imagine it'll be very small breaking changes to a few Python methods) ### Add an optional `datasette_metadata` table Datasette should detect and use metadata stored in a new special table called `datasette_metadata`. This would be a regular table that a user can edit on their own, and would serve as a "live updating" source of metadata, that can be changed while the Datasette instance is running. 
Not too sure what the schema would look like, but I'd imagine: ```sql CREATE TABLE datasette_metadata( level text, target any, key text, value any, primary key (level, target) ) ``` Every row… | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2143/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
459882902 | MDU6SXNzdWU0NTk4ODI5MDI= | 526 | Stream all results for arbitrary SQL and canned queries | 50578294 | open | 0 | 23 | 2019-06-24T13:09:45Z | 2022-09-28T04:01:25Z | NONE | I think that there is a difficulty with canned queries. When I want to stream all results of a canned query TwoDays I get only first 1.000 records. Example: `http://myserver/history_sample/two_days.csv?_stream=on` returns only first 1.000 records. If I do the same with the whole database i.e. `http://myserver/history_sample/database.csv?_stream=on` I get correctly all records. Any ideas? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/526/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
775666296 | MDU6SXNzdWU3NzU2NjYyOTY= | 1160 | "datasette insert" command and plugin hook | 9599 | open | 0 | 23 | 2020-12-29T02:37:03Z | 2021-06-17T18:12:32Z | OWNER | Tools for loading data into Datasette currently mostly exist as separate utilities - `yaml-to-sqlite` and `csvs-to-sqlite` and suchlike. Bringing these into Datasette could have some interesting properties: - A `datasette insert` command could be extended with plugins to handle more formats - Any format that can be inserted on the command-line could also be inserted using a web UI or web API - which would benefit from new format plugin hooks - If Datasette ever grows beyond SQLite (see #670) a built-in import mechanism could work for those other databases as well - without me needing to write `yaml-to-postgresql` and suchlike | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1160/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
944846776 | MDU6SXNzdWU5NDQ4NDY3NzY= | 297 | Option for importing CSV data using the SQLite .import mechanism | 9599 | open | 0 | 23 | 2021-07-14T22:36:41Z | 2023-09-22T20:49:52Z | OWNER | As seen in https://til.simonwillison.net/sqlite/import-csv - `.mode csv` and then `.import school.csv schools` is hugely faster than importing via `sqlite-utils insert` and doing the work in Python - but it can only be implemented by shelling out to the `sqlite3` CLI tool, it's not functionality that is exposed to the Python `sqlite3` module. An option to use this would be useful - maybe something like this: sqlite-utils insert blah.db blah blah.csv --fast | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/297/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1857234285 | I_kwDOBm6k_c5usyVt | 2145 | If a row has a primary key of `null` various things break | 9599 | open | 0 | 23 | 2023-08-18T20:06:28Z | 2023-08-21T17:30:01Z | OWNER | Stumbled across this while experimenting with `datasette-write-ui`. The error I got was a 500 on the `/db` page: > `'NoneType' object has no attribute 'encode'` Tracked it down to this code, which assembles the URL for a row page: https://github.com/simonw/datasette/blob/943df09dcca93c3b9861b8c96277a01320db8662/datasette/utils/__init__.py#L120-L134 That's because `tilde_encode` can't handle `None`: https://github.com/simonw/datasette/blob/943df09dcca93c3b9861b8c96277a01320db8662/datasette/utils/__init__.py#L1175-L1178 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2145/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
930807135 | MDU6SXNzdWU5MzA4MDcxMzU= | 1384 | Plugin hook for dynamic metadata | 9599 | open | 0 | 22 | 2021-06-26T22:36:03Z | 2022-03-14T00:36:42Z | OWNER | @brandonrobertz contributed an implementation of this in PR #1368, which I just merged. Opening this ticket to track further work on this before it goes out in a Datasette release (likely preceded by an alpha). | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1384/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
642572841 | MDU6SXNzdWU2NDI1NzI4NDE= | 859 | Database page loads too slowly with many large tables (due to table counts) | 3243482 | open | 0 | 21 | 2020-06-21T14:23:17Z | 2021-08-25T21:59:55Z | CONTRIBUTOR | Hey, I have a database that I save in HTML from a couple of web scrapers. There are around 200k+, 50+ rows in a couple of tables, with the SQLite file weighing around 600MB. The app runs on a VPS with a 2 core CPU and 4GB RAM, and refreshing the database page regularly takes more than 10 seconds. I was suspecting that counting tables was the culprit, but manually running `select count(*) from table_name` for the largest table finishes in under a second. I've looked at the source code. There's a check on the index page for mutable databases larger than 100MB https://github.com/simonw/datasette/blob/799c5d53570d773203527f19530cf772dc2eeb24/datasette/views/index.py#L15 but this check is not performed for the database page. I've manually crippled `Database::table_counts` method ```py async def table_counts(self, limit=10): if not self.is_mutable and self.cached_table_counts is not None: return self.cached_table_counts # Try to get counts for each table, $limit timeout for each count counts = {} for table in await self.table_names(): try: # table_count = ( # await self.execute( # "select count(*) from [{}]".format(table), # custom_time_limit=limit, # ) # ).rows[0][0] counts[table] = 10 # table_count # In some cases I saw "SQL Logic Error" here in addition to # QueryInterrupted - so we catch that too: except (QueryInterrupted, sqlite3.OperationalError, sqlite3.DatabaseError): counts[table] = None if not self.is_mutable: self.cached_table_counts = counts return counts ``` now the page loads in <100ms. Is it possible to apply the size check to the database page too? 
<details> <summary> /-/versions output </summary> <pre> { "python": { "version": "3.8.0", "full": "3.8.0 (default, Oct 28 2019, 16:14:01) \n[GCC 8.3.0]" }, "datasette": { "version": "0.44" }, "asgi": "3.… | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/859/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
657572753 | MDU6SXNzdWU2NTc1NzI3NTM= | 894 | ?sort=colname~numeric to sort by column cast to real | 9599 | open | 0 | 21 | 2020-07-15T18:47:48Z | 2021-08-20T02:07:53Z | OWNER | If a text column actually contains numbers, being able to "sort by column, treated as numeric" would be really useful. Probably depends on column actions enabled by #690 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/894/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
777333388 | MDU6SXNzdWU3NzczMzMzODg= | 1168 | Mechanism for storing metadata in _metadata tables | 9599 | open | 0 | 21 | 2021-01-01T18:47:27Z | 2023-09-28T18:29:05Z | OWNER | _Original title: Perhaps metadata should all live in a `_metadata` in-memory database_ Inspired by #1150 - metadata should be exposed as an API, and for large Datasette instances that API may need to be paginated. So why not expose it through an in-memory database table? One catch to this: plugins. #860 aims to add a plugin hook for metadata. But if the metadata comes from an in-memory table, how do the plugins interact with it? The need to paginate over metadata does make a plugin hook that returns metadata for an individual table seem less wise, since we don't want to have to do 10,000 plugin hook invocations to show a list of all metadata. If those plugins write directly to the in-memory table how can their contributions survive the server restarting? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1168/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
912864936 | MDU6SXNzdWU5MTI4NjQ5MzY= | 1362 | Consider using CSP to protect against future XSS | 9599 | open | 0 | 17 | 2021-06-06T15:32:20Z | 2022-10-08T18:42:09Z | OWNER | The XSS in #1360 would have been a lot less damaging if Datasette used CSP to protect against such vulnerabilities: https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1362/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
816526538 | MDU6SXNzdWU4MTY1MjY1Mzg= | 239 | sqlite-utils extract could handle nested objects | 9599 | open | 0 | 16 | 2021-02-25T15:10:28Z | 2022-09-03T23:46:02Z | OWNER | Imagine a table (imported from a nested JSON file) where one of the columns contains values that look like this: {"email": "anonymous@noreply.airtable.com", "id": "usrROSHARE0000000", "name": "Anonymous"} The `sqlite-utils extract` command already uses single text values in a column to populate a new table. It would not be much of a stretch for it to be able to use JSON instead, including specifying which of those values should be used as the primary key in the new table. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/239/reactions", "total_count": 6, "+1": 5, "-1": 0, "laugh": 0, "hooray": 1, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1408757705 | I_kwDOBm6k_c5T9-_J | 1843 | Intermittent "Too many open files" error running tests | 9599 | open | 0 | 16 | 2022-10-14T04:45:01Z | 2022-12-17T22:02:41Z | OWNER | Partial stack trace from one of them: ``` /Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/jinja2/loaders.py:200: in get_source f = open_if_exists(filename) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ filename = '/Users/simon/Dropbox/Development/datasette/datasette/templates/error.html', mode = 'rb' def open_if_exists(filename: str, mode: str = "rb") -> t.Optional[t.IO]: """Returns a file descriptor for the filename if that file exists, otherwise ``None``. """ if not os.path.isfile(filename): return None > return open(filename, mode) E OSError: [Errno 24] Too many open files: '/Users/simon/Dropbox/Development/datasette/datasette/templates/error.html' ``` | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1843/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
reopened | |||||||
564833696 | MDU6SXNzdWU1NjQ4MzM2OTY= | 670 | Prototype for Datasette on PostgreSQL | 9599 | open | 0 | 15 | 2020-02-13T17:17:55Z | 2023-11-17T15:32:21Z | OWNER | I thought this would never happen, but now that I'm deep in the weeds of running SQLite in production for Datasette Cloud I'm starting to reconsider my policy of only supporting SQLite. Some of the factors making me think PostgreSQL support could be worth the effort: - Serverless. I'm getting increasingly excited about writable-database use-cases for Datasette. If it could talk to PostgreSQL then users could easily deploy it on Heroku or other serverless providers that can talk to a managed RDS-style PostgreSQL. - Existing databases. Plenty of organizations have PostgreSQL databases. They can export to SQLite using [db-to-sqlite](https://github.com/simonw/db-to-sqlite) but that's a pretty big barrier to getting started - being able to run `datasette postgresql://connection-string` and start trying it out would be a massively better experience. - Data size. I keep running into use-cases where I want to run Datasette against many GBs of data. SQLite can do this but PostgreSQL is much more optimized for large data, especially given the existence of tools like Citus. - Marketing. Convincing people to trust their data to SQLite is potentially a big barrier to adoption. Even if I've convinced myself it's trustworthy I still have to convince everyone else. - It might not be that hard? If this required a ground-up rewrite it wouldn't be worth the effort, but I have a hunch that it may not be too hard - most of the SQL in Datasette should work on both databases since it's almost all portable SELECT statements. If Datasette did DML this would be a lot harder, but it doesn't. - Plugins! This feels like a natural surface for a plugin - at which point people could add MySQL support and suchlike in the future. The above reasons feel strong enough to justify a prototype. 
| 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/670/reactions", "total_count": 19, "+1": 14, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 5, "rocket": 0, "eyes": 0 } |
||||||||
565064079 | MDExOlB1bGxSZXF1ZXN0Mzc1MTgwODMy | 672 | --dirs option for scanning directories for SQLite databases | 9599 | open | 0 | 15 | 2020-02-14T02:25:52Z | 2020-03-27T01:03:53Z | OWNER | simonw/datasette/pulls/672 | Refs #417. | 107914493 | pull | { "url": "https://api.github.com/repos/simonw/datasette/issues/672/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
0 | ||||||
455486286 | MDU6SXNzdWU0NTU0ODYyODY= | 26 | Mechanism for turning nested JSON into foreign keys / many-to-many | 9599 | open | 0 | 14 | 2019-06-13T00:52:06Z | 2022-06-29T23:35:29Z | OWNER | The GitHub JSON APIs have a really interesting convention with respect to related objects. Consider https://api.github.com/repos/simonw/sqlite-utils/issues - here's a truncated subset: ```json { "id": 449818897, "node_id": "MDU6SXNzdWU0NDk4MTg4OTc=", "number": 24, "title": "Additional Column Constraints?", "user": { "login": "IgnoredAmbience", "id": 98555, "node_id": "MDQ6VXNlcjk4NTU1", "avatar_url": "https://avatars0.githubusercontent.com/u/98555?v=4", "gravatar_id": "" }, "labels": [ { "id": 993377884, "node_id": "MDU6TGFiZWw5OTMzNzc4ODQ=", "url": "https://api.github.com/repos/simonw/sqlite-utils/labels/enhancement", "name": "enhancement", "color": "a2eeef", "default": true } ], "state": "open" } ``` The `user` column lists a complete user. The `labels` column has a list of labels. Since both user and label have populated `id` field this is actually enough information for us to create records for them AND set up the corresponding foreign key (for user) and m2m relationships (for labels). It would be really neat if `sqlite-utils` had some kind of mechanism for correctly processing these kind of patterns. Thanks to `jq` there's not much need for extra customization of the shape here - if we support a narrowly defined structure users can use `jq` to reshape arbitrary JSON to match. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/26/reactions", "total_count": 4, "+1": 4, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
647095487 | MDU6SXNzdWU2NDcwOTU0ODc= | 873 | "datasette -p 0 --root" gives the wrong URL | 9599 | open | 0 | 14 | 2020-06-29T04:03:06Z | 2020-08-18T17:26:10Z | OWNER | ``` $ datasette -p 0 --root http://127.0.0.1:0/-/auth-token?token=2d498c... ``` The port is incorrect. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/873/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1125297737 | I_kwDOCGYnMM5DEq5J | 402 | Advanced class-based `conversions=` mechanism | 9599 | open | 0 | 14 | 2022-02-06T19:47:41Z | 2022-02-16T10:18:55Z | OWNER | The `conversions=` parameter works like this at the moment: https://sqlite-utils.datasette.io/en/3.23/python-api.html#converting-column-values-using-sql-functions ```python db["places"].insert( {"name": "Wales", "geometry": wkt}, conversions={"geometry": "GeomFromText(?, 4326)"}, ) ``` This proposal is to support values in that dictionary that are objects, not strings, which can represent more complex conversions - spun out from #399. New proposed mechanism: ```python from sqlite_utils.utils import LongitudeLatitude db["places"].insert( { "name": "London", "point": (-0.118092, 51.509865) }, conversions={"point": LongitudeLatitude}, ) ``` Here `LongitudeLatitude` is a magical value which does TWO things: it sets up the `GeomFromText(?, 4326)` SQL function, and it handles converting the `(-0.118092, 51.509865)` tuple into a `POINT({} {})` string. This would involve a change to the `conversions=` contract - where it usually expects a SQL string fragment, but it can also take an object which combines that SQL string fragment with a Python conversion function. Best of all... this resolves the `lat, lon` v.s. `lon, lat` dilemma because you can use `from sqlite_utils.utils import LongitudeLatitude` OR `from sqlite_utils.utils import LatitudeLongitude` depending on which you prefer! _Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030739566_ | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/402/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1573424830 | I_kwDOBm6k_c5dyI6- | 2019 | Refactor out the keyset pagination code | 9599 | open | 0 | 14 | 2023-02-06T23:04:00Z | 2023-02-08T01:40:46Z | OWNER | While working on: - #1999 I noticed that some of the most complex code in the existing table view is the code that implements keyset pagination: https://github.com/simonw/datasette/blob/0b4a28691468b5c758df74fa1d72a823813c96bf/datasette/views/table.py#L417-L493 Extracting that into a utility function would simplify that code a lot. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2019/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1615692818 | I_kwDOBm6k_c5gTYQS | 2035 | Potential feature: special support for `?a=1&a=2` on the query page | 9599 | open | 0 | 3268330 | 14 | 2023-03-08T18:05:03Z | 2023-03-31T16:09:08Z | OWNER | From a discussion on Discord: https://discord.com/channels/823971286308356157/996877076982415491/1082789517062320138 The key idea is to make it easier for people to implement `where id in (...)` that's populated from query string arguments. What if you could add `?id=11&id=32&id=62` to the URL and have that made available as a list that can be used in the query? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2035/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
|||||||
456578474 | MDU6SXNzdWU0NTY1Nzg0NzQ= | 511 | Get Datasette tests passing on Windows in GitHub Actions | 9599 | open | 0 | 13 | 2019-06-15T21:41:58Z | 2021-07-11T17:23:05Z | OWNER | This should almost happen as a side-effect of moving from Sanic to Uvicorn during the port to ASGI: #272 Additional steps: - test it manually - update documentation - set up some form of Windows CI | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/511/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
612287234 | MDU6SXNzdWU2MTIyODcyMzQ= | 16 | Import machine-learning detected labels (dog, llama etc) from Apple Photos | 9599 | open | 0 | 13 | 2020-05-05T02:45:43Z | 2020-05-05T05:38:16Z | MEMBER | Follow-on from #1. Apple Photos runs some very sophisticated machine learning on-device to figure out if photos are of dogs, llamas and so on. I really want to extract those labels out into my own database. | 256834907 | issue | { "url": "https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16/reactions", "total_count": 2, "+1": 0, "-1": 0, "laugh": 1, "hooray": 1, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
749283032 | MDU6SXNzdWU3NDkyODMwMzI= | 1101 | register_output_renderer() should support streaming data | 9599 | open | 0 | 3268330 | 13 | 2020-11-24T02:17:09Z | 2023-01-21T22:07:19Z | OWNER | > I'd like to implement this by first extending the `register_output_renderer()` hook to support streaming huge responses, then switching CSV to use the plugin hook in addition to TSV using it. _Originally posted by @simonw in https://github.com/simonw/datasette/issues/1096#issuecomment-732542285_ | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1101/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
|||||||
817989436 | MDU6SXNzdWU4MTc5ODk0MzY= | 242 | Async support | 25778 | open | 0 | 13 | 2021-02-27T18:29:38Z | 2021-10-28T14:37:56Z | CONTRIBUTOR | Following our conversation last week, want to note this here before I forget. I've had a couple situations where I'd like to do a bunch of updates in an async event loop, but I run into SQLite's issues with concurrent writes. This feels like something sqlite-utils could help with. PeeWee ORM has a [SQLite write queue](http://docs.peewee-orm.com/en/latest/peewee/playhouse.html#sqliteq) that might be a good model. It's using threads or gevent, but I _think_ that approach would translate well enough to asyncio. Happy to help with this, too. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/242/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1160182768 | I_kwDOCGYnMM5FJvvw | 412 | Optional Pandas integration | 9599 | open | 0 | 13 | 2022-03-05T01:49:27Z | 2022-06-14T15:36:29Z | OWNER | It would be neat if there was a way to use this more seamlessly with Pandas, in particular Pandas dataframes - but without making Pandas a required dependency. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/412/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1400374908 | I_kwDOBm6k_c5TeAZ8 | 1836 | docker image is duplicating db files somehow | 536941 | open | 0 | 13 | 2022-10-06T22:35:54Z | 2022-10-08T16:56:51Z | CONTRIBUTOR | if you look into the docker image created by docker publish, the `datasette inspect` line is duplicating the db files. here's the result of the inspect command: <img width="490" alt="Screen Shot 2022-10-06 at 2 58 08 PM" src="https://user-images.githubusercontent.com/536941/194430909-9303ee1a-9bdd-4212-b54b-de28f43768d4.png"> | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1836/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1447050738 | I_kwDOBm6k_c5WQD3y | 1886 | Call for birthday presents: if you're using Datasette, let us know how you're using it here | 9599 | open | 0 | 13 | 2022-11-13T19:25:51Z | 2022-12-18T17:34:20Z | OWNER | Datasette is 5 years old today. To celebrate, I'm asking the community for birthday presents: https://simonwillison.net/2022/Nov/13/datasette-birthday/ > To celebrate this open source project’s birthday, I’ve decided to try something new: I’m going to ask for birthday presents. > > An aspect of Datasette’s marketing that I’ve so far neglected is social proof. I think it’s time to change that: I know people are using the software to do cool things, but this often happens behind closed doors. > > For Datasette’s birthday, I’m looking for endorsements and case studies and just general demonstrations that show how people are using it to do cool stuff. > > So: if you’ve used Datasette to solve a problem, and you’re willing to publicize it, please give us the gift of your endorsement! > > [...] > > Add a comment to [this issue thread](https://github.com/simonw/datasette/issues/1886) describing what you’re doing. Just a few sentences is fine, though a screenshot or even a link to a live instance would be even better | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1886/reactions", "total_count": 2, "+1": 0, "-1": 0, "laugh": 0, "hooray": 2, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1657861026 | I_kwDOBm6k_c5i0POi | 2054 | Make detailed notes on how table, query and row views work right now | 9599 | open | 0 | 13 | 2023-04-06T18:21:09Z | 2023-04-07T20:14:38Z | OWNER | Research to help influence the following: - #2049 - #2053 - #2050 - #262 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2054/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
274615452 | MDU6SXNzdWUyNzQ2MTU0NTI= | 111 | Add “updated” to metadata | 9599 | open | 0 | 12 | 2017-11-16T18:22:20Z | 2021-09-21T22:48:27Z | OWNER | To give an indication as to when the data was last updated. This should be a field in the metadata that is then shown on the index page and in the footer, if it is set. Also support setting it using an option to “datasette publish” and “datasette package” - which can either be a string or can be the magic string “today” to set it to today’s date: datasette publish file.db --updated=today | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/111/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
299760684 | MDU6SXNzdWUyOTk3NjA2ODQ= | 185 | Metadata should be a nested arbitrary KV store | 222245 | open | 0 | 12 | 2018-02-23T16:02:07Z | 2019-05-13T18:33:33Z | NONE | I started using the metadata feature and was surprised to find that values are not inherited from the root object down to specific databases and tables. This makes metadata much less useful and requires a lot of pointless duplication. Ideally, metadata should allow arbitrary key-value pairs, and there should be a way of accessing metadata either in an inherited or non-inherited manner. Something like `metadata.page.key` vs. `metadata.this.key` might work as an interface. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/185/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
323718842 | MDU6SXNzdWUzMjM3MTg4NDI= | 268 | Mechanism for ranking results from SQLite full-text search | 9599 | open | 0 | 12 | 2018-05-16T17:36:40Z | 2022-01-13T22:21:28Z | OWNER | This isn't particularly straight-forward - all the more reason for Datasette to implement it for you. This article is helpful: http://charlesleifer.com/blog/using-sqlite-full-text-search-with-python/ | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/268/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
421546944 | MDU6SXNzdWU0MjE1NDY5NDQ= | 417 | Datasette Library | 9599 | open | 0 | 12 | 2019-03-15T14:30:22Z | 2020-12-29T14:34:50Z | OWNER | The ability to run Datasette in a mode where it automatically picks up new (or modified) files in a directory tree without needing to restart the server. Suggested command: datasette library /path/to/mydbs/ | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/417/reactions", "total_count": 8, "+1": 8, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
463544206 | MDU6SXNzdWU0NjM1NDQyMDY= | 537 | Populate "endpoint" key in ASGI scope | 9599 | open | 0 | 12 | 2019-07-03T04:54:47Z | 2019-07-22T06:03:18Z | OWNER | This is a trick used by Starlette so that other layers of ASGI middleware can see which route was selected. They added it here: https://github.com/encode/starlette/commit/34d0097feb6f057bd050d5057df5a2f96b97384e If Datasette supports it as well we can benefit from it if we integrate this sentry_asgi middleware (probably as a `datasette-sentry` plugin): https://github.com/encode/sentry-asgi/blob/c6a42d44d31f85885b79e4ee898683ecf8104971/sentry_asgi/middleware.py#L34-L35 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/537/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
567902704 | MDU6SXNzdWU1Njc5MDI3MDQ= | 675 | --cp option for datasette publish and datasette package for shipping additional files and directories | 141844 | open | 0 | 12 | 2020-02-19T22:55:56Z | 2020-12-28T18:49:21Z | NONE | I’m working on integrating Datasette into a documentation-oriented publishing workflow internally in my company, and in order to deploy the Docker image created by `datasette package` I need to add an additional file to the image; in my case, it’s a sort of deployment directive. I’ve worked out a way to do this after the image has been created, but it’s convoluted and brittle. So it’d be excellent if there was an additional option for this command, something like `--copy`. I’d envision it looking something like: ```shell $ datasette package --copy /the/source/path:/the/target/path data.db ``` I’d be happy to help design, specify, implement, and test this feature, if you’d be interested. Thanks for the fantastic tools! | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/675/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
770598024 | MDU6SXNzdWU3NzA1OTgwMjQ= | 1152 | Efficiently calculate list of databases/tables a user can view | 9599 | open | 0 | 12 | 2020-12-18T06:13:01Z | 2021-12-27T23:04:31Z | OWNER | > The homepage currently performs a massive flurry of permission checks - one for each, database, table and view: https://github.com/simonw/datasette/blob/0.53/datasette/views/index.py#L21-L75 > > A paginated version of this is a little daunting as the permission checks would have to be carried out in every single table just to calculate the count that will be paginated. _Originally posted by @simonw in https://github.com/simonw/datasette/issues/1150#issuecomment-747864831_ | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1152/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
787098345 | MDU6SXNzdWU3ODcwOTgzNDU= | 1191 | Ability for plugins to collaborate when adding extra HTML to blocks in default templates | 9599 | open | 0 | 3268330 | 12 | 2021-01-15T18:18:51Z | 2023-09-18T06:55:52Z | OWNER | Sometimes a plugin may want to add content to an existing default template - for example `datasette-search-all` adds a new search box at the top of `index.html`. I also want `datasette-upload-csvs` to add a CTA on the `database.html` page: https://github.com/simonw/datasette-upload-csvs/issues/18 Currently plugins can do this by providing a new version of the `index.html` template - but if multiple plugins try to do that only one of them will succeed. It would be better if there were known areas of those templates which plugins could add additional content to, such that multiple plugins can use the same spot. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1191/reactions", "total_count": 4, "+1": 4, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
|||||||
1219385669 | I_kwDOBm6k_c5IrllF | 1729 | Implement ?_extra and new API design for TableView | 9599 | open | 0 | 8755003 | 12 | 2022-04-28T22:28:14Z | 2022-12-13T05:29:07Z | OWNER | Part of: - #262 - #1518 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1729/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
|||||||
1250495688 | I_kwDOCGYnMM5KiQzI | 439 | Misleading progress bar against utf-16-le CSV input | 4068 | open | 0 | 12 | 2022-05-27T08:34:49Z | 2022-06-15T03:53:43Z | NONE | The program crashes without any error. ``` wget "https://artsdatabanken.no/Fab2018/api/export/csv" sqlite-utils create-database test.db sqlite-utils insert --csv --delimiter ";" --encoding "utf-16-le" test test.db csv [------------------------------------] 0% [#################-------------------] 49% 00:00:01 ``` I would like to highlight various issues: 1. sqlite-utils catches exceptions without printing the stacktrace and/or reraising the exception, so there is no easy way to use `pdb` or similar to debug the program, solution: add a debug option 2. Silent crash: this is related to (1.), and it happens when there is a catch-all mechanism; solution: let the program fail. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/439/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
317001500 | MDU6SXNzdWUzMTcwMDE1MDA= | 236 | datasette publish lambda plugin | 9599 | open | 0 | 11 | 2018-04-23T22:10:30Z | 2023-03-12T14:04:15Z | OWNER | Refs #217 - create a publish plugin that can deploy to AWS Lambda. https://docs.aws.amazon.com/lambda/latest/dg/limits.html says lambda packages can be up to 50 MB, so this would only work with smaller databases (the command can check the filesize before attempting to package and deploy it). Lambdas do get a 512 MB `/tmp` directory too, so for larger databases the function could start and then download up to 512MB from an S3 bucket - so the plugin could take an optional S3 bucket to write to and know how to upload the `.db` file there and then have the lambda download it on startup. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/236/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
632724154 | MDU6SXNzdWU2MzI3MjQxNTQ= | 805 | Writable canned queries live demo on Glitch | 9599 | open | 0 | 11 | 2020-06-06T20:52:13Z | 2020-07-01T22:44:01Z | OWNER | Needs to run somewhere with a mutable disk drive, so not Cloud Run or Heroku or Vercel. I think I'll put it on Glitch. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/805/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1363766973 | I_kwDOCGYnMM5RSW69 | 484 | Expose convert recipes to `sqlite-utils --functions` | 9599 | open | 0 | 11 | 2022-09-06T20:15:08Z | 2022-09-07T19:09:52Z | OWNER | `--functions` was added in: - #471 It would be useful if the `r.jsonsplit()` and similar recipes for `sqlite-utils convert` could be used in these blocks of code too: https://sqlite-utils.datasette.io/en/stable/cli.html#sqlite-utils-convert-recipes | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/484/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1428630253 | I_kwDOBm6k_c5VJyrt | 1873 | Ensure insert API has good tests for rowid and compound primary key tables | 9599 | open | 0 | 8755003 | 11 | 2022-10-30T06:22:17Z | 2022-12-13T05:29:08Z | OWNER | Following: - #1866 I need to design and implement various edge-cases for primary keys: - Table without an auto-incrementing primary key - Table with compound primary keys - Table with just a `rowid` | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1873/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
reopened | ||||||
1452572348 | I_kwDOBm6k_c5WlH68 | 1900 | datasette package --spatialite throws error during build | 419145 | open | 0 | 11 | 2022-11-17T02:03:28Z | 2022-11-18T08:00:38Z | NONE | Hello! Attempting to use `datasette package` to bundle up a SpatiaLite DB and I'm getting this error during the `docker build`: ``` sqlite3.OperationalError: /usr/lib/x86_64-linux-gnu/mod_spatialite.so.so: cannot open shared object file: No such file or directory ``` Seems to be throwing when this step is run: ``` ERROR [6/6] RUN datasette inspect results.db --inspect-file inspect-data.json ``` This is with `v0.63.1`. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1900/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
346027040 | MDU6SXNzdWUzNDYwMjcwNDA= | 355 | Table view should support filtering via many-to-many relationships | 9599 | open | 0 | 10 | 2018-07-31T04:04:16Z | 2019-05-23T06:04:03Z | OWNER | Parent: #354 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/355/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
447469253 | MDU6SXNzdWU0NDc0NjkyNTM= | 485 | Improvements to table label detection | 9599 | open | 0 | 9599 | 10 | 2019-05-23T06:19:49Z | 2022-10-03T00:04:42Z | OWNER | Label detection doesn't work if the primary key is called pk rather than id, so this page doesn't work: https://latest.datasette.io/fixtures/roadside_attraction_characteristics Code is here: https://github.com/simonw/datasette/blob/cccea85be6aaaeadb31f3b588ec7f732628815f5/datasette/app.py#L644-L653 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/485/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
|||||||
472115381 | MDU6SXNzdWU0NzIxMTUzODE= | 49 | extracts= should support multiple-column extracts | 9599 | open | 0 | 10 | 2019-07-24T07:06:41Z | 2020-10-16T19:18:19Z | OWNER | Lookup tables can be constructed on compound columns, but the `extracts=` option doesn't currently support that. Right now extracts can be defined in two ways: ```python # Extract these columns into tables with the same name: dogs = db.table("dogs", extracts=["breed", "most_recent_trophy"]) # Same as above but with custom table names: dogs = db.table("dogs", extracts={"breed": "Breeds", "most_recent_trophy": "Trophies"}) ``` Need some kind of syntax for much more complicated extractions, like when two columns (say "source" and "source_version") are extracted into a single table. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/49/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
646737558 | MDU6SXNzdWU2NDY3Mzc1NTg= | 870 | Refactor default views to use register_routes | 9599 | open | 0 | 10 | 2020-06-27T18:53:12Z | 2022-03-15T20:07:18Z | OWNER | It would be much cleaner if Datasette's default views were all registered using the new `register_routes()` plugin hook. Could dramatically reduce the code in `datasette/app.py`. > The ideal fix here would be to rework my `BaseView` subclass mechanism to work with `register_routes()` so that those views don't have any special privileges above plugin-provided views. _Originally posted by @simonw in https://github.com/simonw/datasette/issues/864#issuecomment-648580556_ | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/870/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
777140799 | MDU6SXNzdWU3NzcxNDA3OTk= | 1166 | Adopt Prettier for JavaScript code formatting | 9599 | open | 0 | 10 | 2020-12-31T21:25:27Z | 2022-01-13T22:22:18Z | OWNER | https://prettier.io/ - I'm going to go with 2 spaces. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1166/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
903986178 | MDU6SXNzdWU5MDM5ODYxNzg= | 1344 | Test Datasette Docker images built for different architectures | 9599 | open | 0 | 10 | 2021-05-27T16:52:29Z | 2022-09-06T00:07:58Z | OWNER | Continuing on from #1319 - now that we have the ability to build Datasette's Docker image against multiple architectures we should test that it works. We can do this with QEMU emulation, see https://twitter.com/nevali/status/1397958044571602945 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1344/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
910092577 | MDU6SXNzdWU5MTAwOTI1Nzc= | 1356 | Research: syntactic sugar for using --get with SQL queries, maybe "datasette query" | 9599 | open | 0 | 10 | 2021-06-03T04:49:42Z | 2022-01-20T01:06:37Z | OWNER | Inspired by https://github.com/simonw/sqlite-utils/issues/264 - in particular this example: ``` datasette covid.db --get='/covid.yaml?sql=select * from ny_times_us_counties limit 1' - date: '2020-01-21' county: Snohomish state: Washington fips: 53061 cases: 1 deaths: 0 ``` Having to construct that URL - including potentially URL escaping the SQL query - isn't a great developer experience. Imagine if you could do this instead: datasette covid.db --query "select * from ny_times_us_counties limit 1" --format yaml | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1356/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1108671952 | I_kwDOBm6k_c5CFP3Q | 1605 | Scripted exports | 25778 | open | 0 | 10 | 2022-01-19T23:45:55Z | 2022-11-30T15:06:38Z | CONTRIBUTOR | Posting this while I'm thinking about it: I mentioned at the end of [this thread](https://twitter.com/eyeseast/status/1483893011658551299) that I'm usually doing `datasette --get` to export canned queries. I used to use a tool called [datafreeze](https://github.com/pudo/datafreeze) to do scripted exports, but that project looks dead now. The ergonomics of it are pretty nice, though, and the `Freezefile.yml` structure is actually not too far from Datasette's canned queries. This is related to the idea for `datasette query` (#1356) but I think it's a distinct feature. It's most likely a plugin, but I want to raise it here because it's probably something other people have thought about. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1605/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1493471221 | I_kwDOBm6k_c5ZBI_1 | 1949 | `.json` errors should be returned as JSON | 9599 | open | 0 | 8755003 | 10 | 2022-12-13T06:14:12Z | 2022-12-15T00:46:27Z | OWNER | Eg the error in this issue: - #1945 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1949/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
|||||||
275125561 | MDU6SXNzdWUyNzUxMjU1NjE= | 123 | Datasette serve should accept paths/URLs to CSVs and other file formats | 9599 | open | 0 | 9 | 2017-11-19T02:05:48Z | 2021-07-19T00:04:32Z | OWNER | This would remove the csvs-to-sqlite step which I end up using for almost everything. I'm hesitant to introduce pandas as a required dependency though since it requires compiling numpy. Could build it so this option is only available if you have pandas installed. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/123/reactions", "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 0 } |
||||||||
496415321 | MDU6SXNzdWU0OTY0MTUzMjE= | 1 | Figure out some interesting example SQL queries | 9599 | open | 0 | 9 | 2019-09-20T15:28:07Z | 2021-05-03T03:46:23Z | MEMBER | My knowledge of genetics has left me short here. I'd love to be able to provide some interesting example SELECT queries - maybe one that spots if you are [likely to have red hair?](https://www.snpedia.com/index.php/Rs1805007) | 209590345 | issue | { "url": "https://api.github.com/repos/dogsheep/genome-to-sqlite/issues/1/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
507454958 | MDU6SXNzdWU1MDc0NTQ5NTg= | 596 | Handle really wide tables better | 9599 | open | 0 | 9 | 2019-10-15T20:05:46Z | 2022-09-07T00:58:41Z | OWNER | If a table has hundreds of columns the Datasette UI starts getting unwieldy. Addressing this would be neat. One option would be to only select the first 30 columns by default and provide a UI for selecting more. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/596/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
607223136 | MDU6SXNzdWU2MDcyMjMxMzY= | 741 | Replace "datasette publish --extra-options" with "--setting" | 9599 | open | 0 | 3268330 | 9 | 2020-04-27T04:29:04Z | 2022-05-12T19:21:16Z | OWNER | See https://github.com/simonw/datasette-publish-now/issues/9#issuecomment-618155764 - the `--extra-options` mechanism is in practice just used to set `--config` options in data that you publish, but that means you end up with pretty messy looking commands: datasette publish my.db --extra-options="--config default_page_size:50 --config sql_time_limit_ms:3500" A neater design would be to support `--config` as an option for `datasette publish` directly: datasette publish my.db --config default_page_size:50 --config sql_time_limit_ms:3500 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/741/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
|||||||
702386948 | MDU6SXNzdWU3MDIzODY5NDg= | 159 | .delete_where() does not auto-commit (unlike .insert() or .upsert()) | 11712349 | open | 0 | 9 | 2020-09-16T01:55:52Z | 2023-04-01T17:21:05Z | NONE | When you use the delete_where() function on a table, it never commits.... Is that intentional? | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/159/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
763361458 | MDU6SXNzdWU3NjMzNjE0NTg= | 1142 | "Stream all rows" is not at all obvious | 9599 | open | 0 | 9 | 2020-12-12T06:24:57Z | 2021-06-17T18:12:31Z | OWNER | Got a question about how to download all rows - the current option isn't at all clear. <img width="668" alt="loans__ppp_loans__9_511_rows_where_where_search_matches__tech__sorted_by_rowid" src="https://user-images.githubusercontent.com/9599/101977057-ac660b00-3bff-11eb-88f4-c93ffd03d3e0.png"> | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1142/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
774332247 | MDExOlB1bGxSZXF1ZXN0NTQ1MjY0NDM2 | 1159 | Improve the display of facets information | 552629 | open | 0 | 3268330 | 9 | 2020-12-24T11:01:47Z | 2023-07-31T18:57:59Z | FIRST_TIME_CONTRIBUTOR | simonw/datasette/pulls/1159 | This PR changes the display of facets to hopefully make them more readable. Before | After ---|--- ![image](https://user-images.githubusercontent.com/552629/103084609-b1ec2980-45df-11eb-85bc-68ab8df3e8d9.png) | ![image](https://user-images.githubusercontent.com/552629/103085220-620e6200-45e1-11eb-8189-5dd5d3e2569e.png) | 107914493 | pull | { "url": "https://api.github.com/repos/simonw/datasette/issues/1159/reactions", "total_count": 4, "+1": 4, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
0 | |||||
776634318 | MDU6SXNzdWU3NzY2MzQzMTg= | 1164 | Mechanism for minifying JavaScript that ships with Datasette | 9599 | open | 0 | 9 | 2020-12-30T20:59:06Z | 2022-01-13T22:21:29Z | OWNER | > If I'm going to minify it I'll need to figure out a build step in Datasette itself so that I can easily work on that minified version. _Originally posted by @simonw in https://github.com/simonw/datasette/issues/983#issuecomment-752748496_ | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1164/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
776635426 | MDU6SXNzdWU3NzY2MzU0MjY= | 1165 | Mechanism for executing JavaScript unit tests | 9599 | open | 0 | 9 | 2020-12-30T21:02:34Z | 2022-01-13T22:21:29Z | OWNER | > I'm going to need to add JavaScript unit tests for this new plugin system. _Originally posted by @simonw in https://github.com/simonw/datasette/issues/983#issuecomment-752757289_ | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1165/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
964322136 | MDU6SXNzdWU5NjQzMjIxMzY= | 1426 | Manage /robots.txt in Datasette core, block robots by default | 9599 | open | 0 | 9 | 2021-08-09T19:56:56Z | 2021-12-04T07:11:29Z | OWNER | See accompanying Twitter thread: https://twitter.com/simonw/status/1424820203603431439 > Datasette currently has a plugin for configuring robots.txt, but I'm beginning to think it should be part of core and crawlers should be blocked by default - having people explicitly opt-in to having their sites crawled and indexed feels a lot safer https://datasette.io/plugins/datasette-block-robots I have a lot of Datasettes deployed now, and tailing logs shows that they are being *hammered* by search engine crawlers even though many of them are not interesting enough to warrant indexing. I'm starting to think blocking crawlers would actually be a better default for most people, provided it was well documented and easy to understand how to allow them. Default-deny is usually a better policy than default-allow! | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1426/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1015646369 | I_kwDOBm6k_c48iYih | 1480 | Exceeding Cloud Run memory limits when deploying a 4.8G database | 110420 | open | 0 | 9 | 2021-10-04T21:20:24Z | 2022-10-07T04:39:10Z | CONTRIBUTOR | When I try to deploy a 4.8G SQLite database to Google Cloud Run, I get this error message: > Memory limit of 8192M exceeded with 8826M used. Consider increasing the memory limit, see https://cloud.google.com/run/docs/configuring/memory-limits Unfortunately, the maximum amount of memory that can be allocated to an instance is 8192M. Naively profiling the memory usage of running Datasette with this database locally on my MacBook shows the following memory usage (using Activity Monitor) when I just start up Datasette locally: - Real Memory Size: 70.6 MB - Virtual Memory Size: 4.51 GB - Shared Memory Size: 2.5 MB - Private Memory Size: 57.4 MB I'm trying to understand if there's a query or other operation that gets run during container deployment that causes memory use to be so large and if this can be avoided somehow. This is somewhat related to #1082, but on a different platform, so I decided to open a new issue. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1480/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1066603133 | PR_kwDOCGYnMM4vKAzW | 347 | Test against pysqlite3 running SQLite 3.37 | 9599 | open | 0 | 9 | 2021-11-29T23:17:57Z | 2021-12-11T01:02:19Z | OWNER | simonw/sqlite-utils/pulls/347 | Refs #346 and #344. | 140912432 | pull | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/347/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
0 | ||||||
1294641696 | I_kwDOBm6k_c5NKqog | 1767 | Ability to set a custom favicon | 9599 | open | 0 | 9 | 2022-07-05T18:41:12Z | 2022-07-05T18:56:43Z | OWNER | If you're running a website on Datasette, like https://www.niche-museums.com/ or https://til.simonwillison.net/ - you should have the ability to easily specify a custom favicon. Currently the `/favicon.ico` view is hard-coded to do this: https://github.com/simonw/datasette/blob/9f1eb0d4eac483b953392157bd9fd6cc4df37de7/datasette/app.py#L179-L188 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1767/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1323346408 | I_kwDOBm6k_c5O4Kno | 1775 | i18n support | 428820 | open | 0 | 9 | 2022-07-31T02:51:04Z | 2023-02-10T18:04:40Z | NONE | I want to contribute translations of the UI into es, de and it, if you share the strings | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1775/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1374939463 | I_kwDOCGYnMM5R8-lH | 489 | Ability to load JSON records held in a file with a single top level key that is a list of objects | 9599 | open | 0 | 9 | 2022-09-15T18:46:03Z | 2022-09-15T20:56:10Z | OWNER | It's very common for JSON to look like this: ```json { "Version": "5.5.52.6", "List": [ { "Description": "Nonpartisan", "Id": 1, "ExternalId": "" }, { "Description": "Undeclared", "Id": 2, "ExternalId": "" } ] } ``` This example is taken from the records downloaded from https://www.elections.alaska.gov/election-results/e/ Right now you can't import this into `sqlite-utils` - you need to run it through `jq .List` first. But since this is so common, it would be neat if `sqlite-utils` could have a rule of thumb that says "if it's an object, but it has a single key that is a list of objects, use that instead". | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/489/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1663399821 | I_kwDOBm6k_c5jJXeN | 2058 | 500 "attempt to write a readonly database" error caused by "PRAGMA schema_version" | 9599 | open | 0 | 9 | 2023-04-11T23:57:50Z | 2023-04-13T16:35:21Z | OWNER | I've not been able to replicate this myself yet, but I've seen log files from a user affected by it. ``` File "/usr/local/lib/python3.11/site-packages/datasette/views/base.py", line 89, in dispatch_request await self.ds.refresh_schemas() File "/usr/local/lib/python3.11/site-packages/datasette/app.py", line 371, in refresh_schemas await self._refresh_schemas() File "/usr/local/lib/python3.11/site-packages/datasette/app.py", line 386, in _refresh_schemas schema_version = (await db.execute("PRAGMA schema_version")).first()[0] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/datasette/database.py", line 267, in execute results = await self.execute_fn(sql_operation_in_thread) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/datasette/database.py", line 213, in execute_fn return await asyncio.get_event_loop().run_in_executor( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run result = self.fn(*self.args, **self.kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/datasette/database.py", line 211, in in_thread return fn(conn) ^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/datasette/database.py", line 237, in sql_operation_in_thread cursor.execute(sql, params if params is not None else {}) sqlite3.OperationalError: attempt to write a readonly database ``` That's running the official Datasette Docker image on https://fly.io/ - it's causing 500 errors on every page of their site. 
| 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2058/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1907765514 | I_kwDOBm6k_c5xtjEK | 2195 | `datasette publish` needs support for the new config/metadata split | 9599 | open | 0 | 9 | 2023-09-21T21:08:12Z | 2023-09-21T22:57:48Z | OWNER | > ... which raises the challenge that `datasette publish` doesn't yet know what to do with a config file! _Originally posted by @simonw in https://github.com/simonw/datasette/issues/2194#issuecomment-1730259871_ | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2195/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
285168503 | MDU6SXNzdWUyODUxNjg1MDM= | 176 | Add GraphQL endpoint | 173848 | open | 0 | 8 | 2017-12-29T23:21:01Z | 2020-04-21T14:16:24Z | NONE | Would make it much easier to build React & similar frontends. Maybe with https://github.com/graphql-python/sanic-graphql ? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/176/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
459509126 | MDU6SXNzdWU0NTk1MDkxMjY= | 516 | Enforce import sort order with isort | 9599 | open | 0 | 8 | 2019-06-22T20:35:50Z | 2023-08-23T02:15:36Z | OWNER | I want to use isort to order imports. A few steps here: - [x] Add a .isort.cfg file (see below) - [x] Use `isort -rc` to reformat existing code - [ ] Commit this change - [x] Add a unit test that ensures future changes remain isort compatible | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/516/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
602533300 | MDU6SXNzdWU2MDI1MzMzMDA= | 1 | Import photo metadata from Apple Photos into SQLite | 9599 | open | 0 | 5324096 | 8 | 2020-04-18T19:23:26Z | 2020-05-04T02:41:40Z | MEMBER | Faces, albums, locations, that kind of thing. | 256834907 | issue | { "url": "https://api.github.com/repos/dogsheep/dogsheep-photos/issues/1/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
|||||||
634663505 | MDU6SXNzdWU2MzQ2NjM1MDU= | 815 | Group permission checks by request on /-/permissions debug page | 9599 | open | 0 | 8 | 2020-06-08T14:25:23Z | 2020-12-17T22:06:48Z | OWNER | Now that we're making a LOT more permission checks (on the DB index page we do a check for every listed table for example) the `/-/permissions` page gets filled up pretty quickly. Can make this more readable by grouping permission checks by request. Have most recent request at the top of the page but the permission requests within that page sorted chronologically by most recent last. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/815/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
724878151 | MDU6SXNzdWU3MjQ4NzgxNTE= | 1032 | Bring date parsing into Datasette core | 9599 | open | 0 | 8 | 2020-10-19T18:30:45Z | 2020-10-19T19:37:55Z | OWNER | Currently this is mainly handled by a plugin - https://github.com/simonw/datasette-dateutil - but I realise now that this really needs to be core functionality. See also Twitter thread: https://twitter.com/simonw/status/1318234808653213696 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1032/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
778450486 | MDU6SXNzdWU3Nzg0NTA0ODY= | 1171 | GitHub Actions workflow to build and sign macOS binary executables | 9599 | open | 0 | 8 | 2021-01-04T23:36:59Z | 2021-01-07T19:36:00Z | OWNER | Using PyInstaller, as explored in #93 and https://til.simonwillison.net/python/packaging-pyinstaller The bigger challenge will be the code signing bit. I'll need an Apple Developer account ($99/year) and some extensive CI fiddling. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1171/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
780278550 | MDU6SXNzdWU3ODAyNzg1NTA= | 1179 | Make original path available to render hooks | 9599 | open | 0 | 8 | 2021-01-06T08:31:45Z | 2021-01-25T04:44:33Z | OWNER | https://github.com/simonw/datasette-export-notebook/blob/0.1/datasette_export_notebook/__init__.py ```python async def render_notebook(datasette, request): return Response.html( await datasette.render_template( "export_notebook.html", { "csv_stream_url": datasette.absolute_url( request, path_with_format( request=request, format="csv", extra_qs={"_stream": "on"} ), ), "json_url": datasette.absolute_url( request, path_with_format( request=request, format="json", extra_qs={"_shape": "array"} ), ), "json": json, }, ) ) ``` This results in https://latest-with-plugins.datasette.io/github/issue_comments.Notebook showing `http://latest-with-plugins.datasette.io/github/issue_comments.Notebook?_format=json&_shape=array` | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1179/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
969855774 | MDU6SXNzdWU5Njk4NTU3NzQ= | 1432 | Rename Datasette.__init__(config=) parameter to settings= | 9599 | open | 0 | 8 | 2021-08-13T01:00:27Z | 2021-10-19T01:16:41Z | OWNER | > While I'm doing this I should rename this internal variable to avoid confusion in the future: > > https://github.com/simonw/datasette/blob/e837095ef35ae155b4c78cc9a8b7133a48c94f03/datasette/app.py#L203 _Originally posted by @simonw in https://github.com/simonw/datasette/issues/1431#issuecomment-898072940_ | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1432/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1083657868 | I_kwDOBm6k_c5Al06M | 1565 | Documented JavaScript variables on different templates made available for plugins | 9599 | open | 0 | 8 | 2021-12-17T22:30:51Z | 2021-12-19T22:37:29Z | OWNER | While working on https://github.com/simonw/datasette-leaflet-freedraw/issues/10 I found myself writing this atrocity to figure out the SQL query used for a specific table page: ```javascript let innerSql = Array.from(document.getElementsByTagName("span")).filter( el => el.innerText == "View and edit SQL" )[0].parentElement.getAttribute("title") ``` This is obviously bad - it's very brittle, and will break if I ever change the text on that link (like localizing it for example). Instead, I think pages like that one should have a block of script at the bottom something like this: ```javascript window.datasette = window.datasette || {}; datasette.view_name = 'table'; datasette.table_sql = 'select * from ...'; ``` | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1565/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1160034488 | I_kwDOCGYnMM5FJLi4 | 411 | Support for generated columns | 25778 | open | 0 | 8 | 2022-03-04T20:41:33Z | 2022-03-11T22:32:43Z | CONTRIBUTOR | This is a fairly new feature -- SQLite version 3.31.0 (2020-01-22) -- that I, admittedly, haven't gotten to work yet. But it looks _incredibly_ useful: https://dgl.cx/2020/06/sqlite-json-support I'm not sure if this is an option on `add-column` or a separate command like `add-generated-column`. Either way, it needs an argument to populate it. It could be something like this: ```sh sqlite-utils add-column data.db table-name generated --as 'json_extract(data, "$.field")' --virtual ``` More here: https://www.sqlite.org/gencol.html | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/411/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1163369515 | I_kwDOBm6k_c5FV5wr | 1655 | query result page is using 400mb of browser memory 40x size of html page and 400x size of csv data | 536941 | open | 0 | 8 | 2022-03-09T00:56:40Z | 2023-10-17T21:53:17Z | CONTRIBUTOR | [this page](https://labordata.bunkum.us/opdr-8335ea3?sql=with+most_recent_lu+as+%28%0D%0A++select%0D%0A++++*%0D%0A++from%0D%0A++++%28%0D%0A++++++select%0D%0A++++++++*%0D%0A++++++from%0D%0A++++++++lm_data%0D%0A++++++order+by%0D%0A++++++++f_num%2C%0D%0A++++++++receive_date+desc%0D%0A++++%29+t%0D%0A++group+by%0D%0A++++f_num%0D%0A%29%0D%0Aselect%0D%0A++aff_abbr+%7C%7C+coalesce%28%27+local+%27+%7C%7C+desig_num%2C+%27+%27+%7C%7C+unit_name%29+as+abbr_local_name%2C%0D%0A++coalesce%28%0D%0A++++regexp_match%28%27%28.*%3F%29%28%2C%3F+AFL-CIO%24%29%27%2C+union_name%29%2C%0D%0A++++regexp_match%28%27%28.*%3F%29%28+IND%24%29%27%2C+union_name%29%2C%0D%0A++++union_name%0D%0A++%29+%7C%7C+coalesce%28%27+local+%27+%7C%7C+desig_num%2C+%27+%27+%7C%7C+unit_name%29+as+full_local_name%2C%0D%0A++*%0D%0Afrom%0D%0A++most_recent_lu%0D%0Awhere+%28desig_num+IS+NOT+NULL+OR+unit_name+IS+NOT+NULL%29+AND+desig_name+%21%3D+%27HQ%27%0D%0Alimit%0D%0A++5000+offset+0) is using about 400 mb in firefox 97 on mac os x. if you download the html for the page, it's about 11mb and if you get the csv for the data its about 1mb. it's using over a 1G on chrome 99. i found this because, i was trying to figure out why editing the SQL was getting very slow. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1655/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1174655187 | I_kwDOBm6k_c5GA9DT | 1671 | Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply | 9308268 | open | 0 | 8 | 2022-03-20T19:17:24Z | 2022-03-22T17:43:12Z | NONE | I found a strange behavior, and I'm not sure if it's related to views and boolean values perhaps, or if there's something else weird going on here, but I'll provide an example that may help show what I'm seeing happen. ```bash #!/bin/bash echo "\"id\",\"expiration_date\" 0,2018-01-04 1,2019-01-05 2,2020-01-06 3,2021-01-07 4,2022-01-08 5,2023-01-09 6,2024-01-10 7,2025-01-11 8,2026-01-12 9,2027-01-13 " > test.csv csvs-to-sqlite test.csv test.db sqlite-utils create-view --replace test.db test_view "select id, expiration_date, case when julianday('NOW') >= julianday(expiration_date) then 1 else 0 end as has_expired FROM test" ``` ```bash datasette test.db ``` ![image](https://user-images.githubusercontent.com/9308268/159178745-9c6152f7-eac6-4bf9-bef5-a2d63d3ee13f.png) ![image](https://user-images.githubusercontent.com/9308268/159178824-c8952137-270c-42a4-ad1c-f6ad2c51e499.png) ![image](https://user-images.githubusercontent.com/9308268/159178877-23e00b36-443a-43ef-83e5-e0bdddd3fdcd.png) ![image](https://user-images.githubusercontent.com/9308268/159178918-65922cc7-2514-4735-a72d-4904b99976d4.png) Thanks again and let me know if you want me to provide anything else! | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1671/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1560982210 | PR_kwDOBm6k_c5IvYKw | 2008 | array facet: don't materialize unnecessary columns | 193185 | open | 0 | 8 | 2023-01-28T19:33:40Z | 2023-01-29T18:17:40Z | CONTRIBUTOR | simonw/datasette/pulls/2008 | The presence of `inner.*` causes SQLite to materialize a row with all the columns. Those columns will be discarded later. Instead, we can select only the column we'll use. This lets SQLite's optimizer realize that the other columns in the CTE definition aren't needed. On a test table with 278K rows, 98K of which had an array, this speeds up the facet calculation from 4 sec to 1 sec. <!-- readthedocs-preview datasette start --> ---- :books: Documentation preview :books:: https://datasette--2008.org.readthedocs.build/en/2008/ <!-- readthedocs-preview datasette end --> | 107914493 | pull | { "url": "https://api.github.com/repos/simonw/datasette/issues/2008/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
0 | ||||||
1781530343 | I_kwDOBm6k_c5qL_7n | 2093 | Proposal: Combine settings, metadata, static, etc. into a single `datasette.yaml` File | 15178711 | open | 0 | 8 | 2023-06-29T21:18:23Z | 2023-09-11T20:19:32Z | CONTRIBUTOR | Very often I get tripped up when trying to configure my Datasette instances. For example: if I want to change the port my app listens to, do I do that with a CLI flag, a `--setting` flag, inside `metadata.json`, or an env var? If I want to up the time limit of SQL statements, is that under `metadata.json` or a setting? Where does my plugin configuration go? Normally I need to look it up in Datasette docs, and I quickly find my answer, but the number of places where "config" goes is overwhelming. - Flat CLI flags like `--port`, `--host`, `--cors`, etc. - `--setting`, like `default_page_size`, `sql_time_limit_ms` etc - Inside `metadata.json`, including plugin configuration Typically my Datasette deploys are extremely long shell commands, with multiple `--setting` and other CLI flags. ## Proposal: Consolidate all "config" into `datasette.toml` I propose that we add a new `datasette.toml` that combines "settings", "metadata", and other common CLI flags like `--port` and `--cors` into a single file. It would be similar to "Cargo.toml" in Rust projects, "package.json" in Node projects, and "pyproject.toml" in Python, etc. A sample of what it could look like: ```toml # "top level" configuration that are currently CLI flags on `datasette serve` [config] port = 8020 host = "0.0.0.0" cors = true # replaces multiple `--setting` flags [settings] base_url = "/app/datasette/" default_allow_sql = true sql_time_limit_ms = 3500 # replaces `metadata.json`. 
# The contents of datasette-metadata.json could be defined in this file instead, but supporting separate files is nice (since those are easy to machine-generate) [metadata] include="./datasette-metadata.json" # plugin-specific [plugins] [plugins.datasette-auth-github] client_id = {env = "DATASETTE_AUTH_GITHUB_CLIENT_ID"} client_secret = {env = "GITHUB_CLIENT_SECRET"} [plugins.datasette-cluster-map] latitude_column = "lat" longitude_column = "lon" ``` ## Pros - Instead of multiple files and CLI flags, everything could b… | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2093/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
323223872 | MDU6SXNzdWUzMjMyMjM4NzI= | 260 | Validate metadata.json on startup | 9599 | open | 0 | 7 | 2018-05-15T13:42:56Z | 2023-06-21T12:51:22Z | OWNER | It's easy to misspell the name of a database or table and then be puzzled when the metadata settings silently fail. To avoid this, let's sanity check the provided metadata.json on startup and quit with a useful error message if we find any obvious mistakes. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/260/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
327365110 | MDU6SXNzdWUzMjczNjUxMTA= | 294 | inspect should record column types | 9599 | open | 0 | 7 | 2018-05-29T15:10:41Z | 2019-06-28T16:45:28Z | OWNER | For each table we want to know the columns, their order and what type they are. I'm going to break with SQLite defaults a little on this one and allow datasette to define additional types - to start with just a `geometry` type for columns that are detected as SpatiaLite geometries. Possible JSON design: "columns": [{ "name": "title", "type": "text" }, ...] Refs #276 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/294/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
520681725 | MDU6SXNzdWU1MjA2ODE3MjU= | 621 | Syntax for ?_through= that works as a form field | 9599 | open | 0 | 7 | 2019-11-11T00:19:03Z | 2021-12-18T01:42:33Z | OWNER | The current syntax for `?_through=` uses JSON to avoid any risk of confusion with table or column names that contain special characters. This means you can't target a form field at it. We should be able to support both - `?x.y.z=value` for tables and columns with "regular" names, falling back to the current JSON syntax for columns or tables that won't work with the key/value syntax. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/621/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
642388564 | MDU6SXNzdWU2NDIzODg1NjQ= | 858 | publish heroku does not work on Windows 10 | 870912 | open | 0 | 7 | 2020-06-20T14:40:28Z | 2021-06-10T17:44:09Z | NONE | When executing "datasette publish heroku schools.db" on Windows 10, I get the following error ```shell File "c:\users\dell\.virtualenvs\sec-schools-jn-cwk8z\lib\site-packages\datasette\publish\heroku.py", line 54, in heroku line.split()[0] for line in check_output(["heroku", "plugins"]).splitlines() File "c:\python38\lib\subprocess.py", line 411, in check_output return run(*popenargs, stdout=PIPE, timeout=timeout, check=True, File "c:\python38\lib\subprocess.py", line 489, in run with Popen(*popenargs, **kwargs) as process: File "c:\python38\lib\subprocess.py", line 854, in __init__ self._execute_child(args, executable, preexec_fn, close_fds, File "c:\python38\lib\subprocess.py", line 1307, in _execute_child hp, ht, pid, tid = _winapi.CreateProcess(executable, args, FileNotFoundError: [WinError 2] The system cannot find the file specified ``` Changing https://github.com/simonw/datasette/blob/55a6ffb93c57680e71a070416baae1129a0243b8/datasette/publish/heroku.py#L54 to ```python line.split()[0] for line in check_output(["heroku", "plugins"], shell=True).splitlines() ``` as well as the other `check_output()` and `call()` within the same file leads me to another recursive error about temp files | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/858/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
672421411 | MDU6SXNzdWU2NzI0MjE0MTE= | 916 | Support reverse pagination (previous page, has-previous-items) | 9599 | open | 0 | 7 | 2020-08-04T00:32:06Z | 2021-04-03T23:43:11Z | OWNER | I need this for `datasette-graphql` for full compatibility with the way Relay likes to paginate - using cursors for paginating backwards as well as for paginating forwards. > This may be the kick I need to get Datasette pagination to work in reverse too. _Originally posted by @simonw in https://github.com/simonw/datasette-graphql/issues/2#issuecomment-668305853_ | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/916/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
688670158 | MDU6SXNzdWU2ODg2NzAxNTg= | 147 | SQLITE_MAX_VARS maybe hard-coded too low | 96218 | open | 0 | 7 | 2020-08-30T07:26:45Z | 2021-02-15T21:27:55Z | CONTRIBUTOR | I came across this while about to open an issue and PR against the documentation for `batch_size`, which is a bit incomplete. As mentioned in #145, while: > [`SQLITE_MAX_VARIABLE_NUMBER`](https://www.sqlite.org/limits.html#max_variable_number) ... defaults to 999 for SQLite versions prior to 3.32.0 (2020-05-22) or 32766 for SQLite versions after 3.32.0. it is common that it is increased at compile time. Debian-based systems, for example, seem to ship with a version of sqlite compiled with SQLITE_MAX_VARIABLE_NUMBER set to 250,000, and I believe this is the case for homebrew installations too. In working to understand what `batch_size` was actually doing and why, I realized that by setting `SQLITE_MAX_VARS` in `db.py` to match the value my sqlite was compiled with (I'm on Debian), I was able to decrease the time to `insert_all()` my test data set (~128k records across 7 tables) from ~26.5s to ~3.5s. Given that this is about .05% of my total dataset, this is time I am keen to save... Unfortunately, it seems that `sqlite3` in the python standard library doesn't expose the `get_limit()` C API (even though `pysqlite` used to), so it's hard to know what value sqlite has been compiled with (note that this could mean, I suppose, that it's less than 999, and even hardcoding `SQLITE_MAX_VARS` to the conservative default might not be adequate. It can also be lowered -- but not raised -- at runtime). The best I could come up with is `echo "" | sqlite3 -cmd ".limits variable_number"` (only available in `sqlite >= 2015-05-07 (3.8.10)`). Obviously this couldn't be relied upon in `sqlite_utils`, but I wonder what your opinion would be about exposing `SQLITE_MAX_VARS` as a user-configurable parameter (with suitable "here be dragons" warnings)? 
I'm going to go ahead and monkey-patch it for my purposes in any event, but it seems like it might be worth considering. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/147/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
703218756 | MDU6SXNzdWU3MDMyMTg3NTY= | 50 | Commands for making authenticated API calls | 9599 | open | 0 | 7 | 2020-09-17T02:39:07Z | 2020-10-19T05:01:29Z | MEMBER | Similar to `twitter-to-sqlite fetch`, see https://github.com/dogsheep/twitter-to-sqlite/issues/51 | 207052882 | issue | { "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/50/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
705215230 | MDU6SXNzdWU3MDUyMTUyMzA= | 26 | Pagination | 9599 | open | 0 | 7 | 2020-09-21T00:14:37Z | 2020-09-21T02:55:54Z | MEMBER | Useful for #16 (timeline view) since you can now filter to just the items on a specific day - but if there are more than 50 items you can't see them all. | 197431109 | issue | { "url": "https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
712984738 | MDU6SXNzdWU3MTI5ODQ3Mzg= | 987 | Documented HTML hooks for JavaScript plugin authors | 9599 | open | 0 | 7 | 2020-10-01T16:10:14Z | 2021-01-25T04:00:03Z | OWNER | In #981 I added `data-column=` attributes to the `<th>` on the table page. These should become part of Datasette's documented API so JavaScript plugin authors can use them to derive things about the tables shown on a page (`datasette-cluster-map` uses them as of https://github.com/simonw/datasette-cluster-map/issues/18). | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/987/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
714377268 | MDU6SXNzdWU3MTQzNzcyNjg= | 991 | Redesign application homepage | 9599 | open | 0 | 7 | 2020-10-04T18:48:45Z | 2021-01-26T19:06:36Z | OWNER | Most Datasette instances only host a single database, but the current homepage design assumes that it should leave plenty of space for multiple databases: <img width="878" alt="Datasette_Fixtures__fixtures" src="https://user-images.githubusercontent.com/9599/95024344-5b51fd80-0637-11eb-8a11-40bad16f6907.png"> Reconsider this design - should the default show more information? The Covid-19 Datasette homepage looks particularly sparse I think: https://covid-19.datasettes.com/ <img width="782" alt="COVID-19_cases__using_data_from_Johns_Hopkins_CSSE__the_New_York_Times_and_the_LA_Times__covid" src="https://user-images.githubusercontent.com/9599/95024391-876d7e80-0637-11eb-8f19-ef38e4c87d2a.png"> | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/991/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |