github
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | pull_request | body | repo | type | active_lock_reason | performed_via_github_app | reactions | draft | state_reason |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
627794879 | MDU6SXNzdWU2Mjc3OTQ4Nzk= | 782 | Redesign default .json format | 9599 | closed | 0 | 8755003 | 55 | 2020-05-30T18:47:07Z | 2023-08-10T00:07:17Z | 2023-08-10T00:07:17Z | OWNER | The default JSON just isn't right. I find myself using `?_shape=array` for almost everything I build against the API. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/782/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
849978964 | MDU6SXNzdWU4NDk5Nzg5NjQ= | 1293 | Show column metadata plus links for foreign keys on arbitrary query results | 9599 | open | 0 | 51 | 2021-04-04T22:59:42Z | 2022-09-02T17:34:09Z | OWNER | Related to #620. It would be _really_ cool if Datasette could magically detect the source of the data displayed in an arbitrary query and, if that data represents a foreign key, display it as a hyperlink. Compare https://latest.datasette.io/fixtures/facetable <img width="1202" alt="fixtures__facetable__15_rows" src="https://user-images.githubusercontent.com/9599/113523748-a7d2b300-955e-11eb-904d-8cf2639b0b5f.png"> To https://latest.datasette.io/fixtures?sql=select+pk%2C+created%2C+planet_int%2C+on_earth%2C+state%2C+city_id%2C+neighborhood%2C+tags%2C+complex_array%2C+distinct_some_null+from+facetable+order+by+pk+limit+101 <img width="998" alt="fixtures__select_pk__created__planet_int__on_earth__state__city_id__neighborhood__tags__complex_array__distinct_some_null_from_facetable_order_by_pk_limit_101" src="https://user-images.githubusercontent.com/9599/113523761-be790a00-955e-11eb-8c82-36b9d0226bc8.png"> | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1293/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
reopened | |||||||
973139047 | MDU6SXNzdWU5NzMxMzkwNDc= | 1439 | Rethink how .ext formats (v.s. ?_format=) works before 1.0 | 9599 | closed | 0 | 3268330 | 48 | 2021-08-17T23:32:51Z | 2022-03-15T20:51:26Z | 2022-03-15T20:51:26Z | OWNER | Datasette currently has surprising special behaviour for if a table name ends in `.csv` - which can happen when a tool like `csvs-to-sqlite` creates tables that match the filename that they were imported from. https://latest.datasette.io/fixtures/table%2Fwith%2Fslashes.csv illustrates this behaviour: it links to `.csv` and `.json` that look like this: - https://latest.datasette.io/fixtures/table%2Fwith%2Fslashes.csv?_format=json - https://latest.datasette.io/fixtures/table%2Fwith%2Fslashes.csv?_format=csv&_size=max Where normally Datasette would add the `.csv` or `.json` extension to the path component of the URL (as seen on other pages such as https://latest.datasette.io/fixtures/facet_cities) here the [path_with_format() function](https://github.com/simonw/datasette/blob/adb5b70de5cec3c3dd37184defe606a082c232cf/datasette/utils/__init__.py#L710) notices that there is already a `.` in the path and instead adds `?_format=csv` to the query string instead. The problem with this mechanism is that it's pretty surprising. Anyone writing external code to Datasette who wants to get back the `.csv` or `.json` version giving the URL to a table page will need to know about and implement this behaviour themselves. That's likely to cause all kinds of bugs in the future. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1439/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
712260429 | MDU6SXNzdWU3MTIyNjA0Mjk= | 983 | JavaScript plugin hooks mechanism similar to pluggy | 9599 | open | 0 | 47 | 2020-09-30T20:32:43Z | 2021-01-25T04:43:58Z | OWNER | > It would be neat to provide a JavaScript plugin hook that plugins can use to add their own options to this menu. No idea what that would look like though. _Originally posted by @simonw in https://github.com/simonw/datasette/issues/981#issuecomment-701616922_ | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/983/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
824064069 | MDU6SXNzdWU4MjQwNjQwNjk= | 1249 | Updated Dockerfile with SpatiaLite version 5.0 | 9599 | closed | 0 | 45 | 2021-03-08T00:17:36Z | 2022-01-20T21:29:43Z | 2021-03-29T00:57:13Z | OWNER | The version bundled in Datasette's Docker image right now is 4.4.0-RC0 https://github.com/simonw/datasette/blob/d0fd833b8cdd97e1b91d0f97a69b494895d82bee/Dockerfile#L16-L17 5 has been out for a couple of months and has a bunch of big improvements, most notable stable KNN support. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1249/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1058072543 | I_kwDOBm6k_c4_EOff | 1518 | Complete refactor of TableView and table.html template | 9599 | open | 0 | 3268330 | 45 | 2021-11-19T02:55:16Z | 2022-03-15T18:35:49Z | OWNER | Split from #878. The current `TableView` class is by far the most complex part of Datasette, and the most difficult to work on: https://github.com/simonw/datasette/blob/0.59.2/datasette/views/table.py In #878 I started exploring a new pattern for building views. In doing so it became clear that `TableView` is the first beast that I need to slay - if I can refactor that into something neat the pattern for building other views will emerge as a natural consequence. I've been trying to build this as a `register_routes()` plugin, as originally suggested in #870 - though unfortunately it looks like those plugins can't replace existing Datasette default views at the moment, see #1517. [UPDATE: I was wrong about this, plugins can over-ride default views just fine] I also know that I want to have a fully documented template context for `table.html` as a major step on the way to Datasette 1.0, see #1510. All of this adds up to the `TableView` factor being a major project that will unblock a whole flurry of other things - so I'm going to work on that in this separate issue. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1518/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
|||||||
324188953 | MDU6SXNzdWUzMjQxODg5NTM= | 272 | Port Datasette to ASGI | 9599 | closed | 0 | 9599 | 3268330 | 42 | 2018-05-17T21:16:32Z | 2019-06-24T04:54:15Z | 2019-06-24T03:33:06Z | OWNER | Datasette doesn't take much advantage of Sanic, and I'm increasingly having to work around parts of it because of idiosyncrasies that are specific to Datasette - caring about the exact order of querystring arguments for example. Since Datasette is GET-only our needs from a web framework are actually pretty slim. This becomes more important as I expand the plugins #14 framework. Am I sure I want the plugin ecosystem to depend on a Sanic if I might move away from it in the future? If Datasette wasn't all about async/await I would use WSGI, but today it makes more sense to use ASGI. I'd like to be confident that switching to ASGI would still give me the excellent performance that Sanic provides. https://github.com/django/asgiref/blob/master/specs/asgi.rst | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/272/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||
582526961 | MDU6SXNzdWU1ODI1MjY5NjE= | 699 | Authentication (and permissions) as a core concept | 9599 | closed | 0 | 5512395 | 40 | 2020-03-16T18:48:00Z | 2020-06-06T19:42:11Z | 2020-06-06T19:42:11Z | OWNER | Right now Datasette authentication is provided exclusively by plugins: * https://github.com/simonw/datasette-auth-github * https://github.com/simonw/datasette-auth-existing-cookies This is an all-or-nothing approach: either your Datasette instance requires authentication at the top level or it does not. But... as I build new plugins like https://github.com/simonw/datasette-configure-fts and https://github.com/simonw/datasette-edit-tables I increasingly have individual features which should be reserved for logged-in users while still wanting other parts of Datasette to be open to all. This is too much for plugins to own independently of Datasette core. Datasette needs to ship a single "user is authenticated" concept (independent of how users actually sign in) so that different plugins can integrate with it. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/699/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
1496652622 | I_kwDOBm6k_c5ZNRtO | 1955 | invoke_startup() is not run in some conditions, e.g. gunicorn/uvicorn workers, breaking lots of things | 32839123 | closed | 0 | 36 | 2022-12-14T13:39:56Z | 2022-12-19T04:34:16Z | 2022-12-18T02:45:18Z | NONE | In the past (pre-september 14, #1809) I had a running deployment of Datasette on Azure WebApps by emulating the call in cli.py to Gunicorn: `gunicorn -w 2 -k uvicorn.workers.UvicornWorker app:app`. My most recent deployment, however, fails loudly by shouting that `Datasette.invoke_startup()` was not called. It does not seem to be possible to call `invoke_startup` when running using a uvicorn command directly like this (I've reproduced this locally using `uvicorn`). Two candidates that I have tried: * Uvicorn has a `--factory` option, but the app factory has to be synchronous, so no `await invoke_startup` there * `asyncio.get_event_loop().run_until_complete` is also not an option because `uvicorn` already has the event loop running. One additional option is: * Use Gunicorn's [server hooks](https://docs.gunicorn.org/en/stable/settings.html#server-hooks) to call `invoke_startup`. These are also synchronous, but I might be able to get ahead of the event loop starting here. In my current deployment setup, it does not appear to be possible to use `datasette serve` directly, so I'm stuck either * Trying to rework my complete deployment setup, for instance, using Azure functions as described [here](https://github.com/simonw/azure-functions-datasette)) * Or dig into the ASGI spec and write a wrapper for the sole purpose of launching Datasette using a direct Uvicorn invocation. Questions for the maintainers: * Is this intended behaviour/will not support/etc.? If so, I'd be happy to add a PR with a couple lines in the documentation. * if this is not intended behaviour, what is a good way to fix it? I could have a go at the ASGI spec thing (I think the Azure Functions thing is related) and provide a PR with the wrapper here, but I'm all ears! Almost forgot, minimal reproducer: ```python from datasette import Datasette ds = Datasette(files=['./global-power-plants.db'])] app = ds.app() ``` Save as app.py in the same folder as global-power-plants.db, and then try running `uvicorn app:app`. O… | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1955/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1058896236 | I_kwDOBm6k_c4_HXls | 1522 | Deploy a live instance of demos/apache-proxy | 9599 | closed | 0 | 34 | 2021-11-19T20:32:55Z | 2021-11-23T03:00:34Z | 2021-11-20T18:51:56Z | OWNER | > I'll get this working on my laptop first, but then I want to get it up and running on Cloud Run - maybe with a GitHub Actions workflow in this repo that re-deploys it on manual execution. _Originally posted by @simonw in https://github.com/simonw/datasette/issues/1521#issuecomment-974322178_ I started by following https://ahmet.im/blog/cloud-run-multiple-processes-easy-way/ - see example in https://github.com/ahmetb/multi-process-container-lazy-solution | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1522/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
681375466 | MDU6SXNzdWU2ODEzNzU0NjY= | 943 | await datasette.client.get(path) mechanism for executing internal requests | 9599 | closed | 0 | 5971510 | 33 | 2020-08-18T22:17:42Z | 2020-10-09T17:22:55Z | 2020-10-09T16:11:26Z | OWNER | `datasette-graphql` works by making internal requests to the TableView class (in order to take advantage of existing pagination logic, plus options like `?_search=` and `?_where=`) - see #915 I want to support a `mod_rewrite` style mechanism for putting nicer URLs on top of Datasette pages - I botched that together for a project here using an internal ASGI proxying trick: https://github.com/natbat/tidepools_near_me/commit/ec102c6da5a5d86f17628740d90b6365b671b5e1 If the `datasette` object provided a documented method for executing internal requests (in a way that makes sense with logging etc - i.e. doesn't get logged as a separate request) both of these use-cases would be much neater. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/943/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
770436876 | MDU6SXNzdWU3NzA0MzY4NzY= | 1150 | Maintain an in-memory SQLite table of connected databases and their tables | 9599 | closed | 0 | 3268330 | 32 | 2020-12-17T23:02:13Z | 2020-12-27T14:51:39Z | 2020-12-18T22:34:12Z | OWNER | I want Datasette to have its own internal metadata about connected tables, to power features like a paginated searchable homepage in #461. I want this to be a SQLite table. This could also be part of the directory scanning mechanism prototyped in #672 - where Datasette can be set to continually scan a directory for new database files that it can serve. Also relevant to the Datasette Library concept in #417. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1150/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
1217759117 | I_kwDOBm6k_c5IlYeN | 1727 | Research: demonstrate if parallel SQL queries are worthwhile | 9599 | open | 0 | 32 | 2022-04-27T18:54:21Z | 2022-09-26T14:48:31Z | OWNER | I added parallel SQL query execution here: - https://github.com/simonw/datasette/issues/1723 My hunch is that this will take advantage of multiple cores, since Python's `sqlite3` module releases the GIL once a query is passed to SQLite. I'd really like to prove this is the case though. Just not sure how to do it! Larger question: is this performance optimization actually improving performance at all? Under what circumstances is it worthwhile? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1727/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
309471814 | MDU6SXNzdWUzMDk0NzE4MTQ= | 189 | Ability to sort (and paginate) by column | 9599 | closed | 0 | 9599 | 31 | 2018-03-28T18:04:51Z | 2018-04-15T18:54:22Z | 2018-04-09T05:16:02Z | OWNER | As requested in https://github.com/simonw/datasette/issues/185#issuecomment-376614973 I've previously avoided this for performance reasons: sort-by-column on a column without an index is likely to perform badly for hundreds of thousands of rows. That's not a good enough reason to avoid the feature entirely though. A few options: * Allow sort-by-column by default, give users the option to disable it for specific tables/columns * Disallow sort-by-column by default, give users option (probably in `metadata.json`) to enable it for specific tables/columns * Automatically detect if a column either has an index on it OR a table has less than X rows in it We already have the mechanism in place to cut off SQL queries that take more than X seconds, so if someone DOES try to sort by a column that's too expensive it won't actually hurt anything - but it would be nice to not show people a "sort" option which is guaranteed to throw a timeout error. The vast majority of datasette usage that I've seen so far is on smaller datasets where the performance penalties of sort-by-column are extremely unlikely to show up. ---- Still left to do: - [x] UI that shows which sort order is currently being applied (in HTML and in JSON) - [x] UI for applying a sort order (with rel=nofollow to avoid Google crawling it) - [x] Sort column names should be escaped correctly in generated SQL - [x] Validation that the selected sort order is a valid column - [x] Throw error if user attempts to apply _sort AND _sort_desc at the same time - [x] Ability to disable sorting (or sort only for specific columns) in metadata.json - [x] Fix "201 rows where sorted by sortable_with_nulls " bug | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/189/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
1901416155 | I_kwDOBm6k_c5xVU7b | 2189 | Server hang on parallel execution of queries to named in-memory databases | 9599 | closed | 0 | 31 | 2023-09-18T17:23:18Z | 2023-09-21T22:26:21Z | 2023-09-21T22:26:21Z | OWNER | I've started to encounter a bug where queries to tables inside named in-memory databases sometimes trigger server hangs. I'm still trying to figure out what's going on here - on one occasion I managed to Ctrl+C the server and saw an exception that mentioned a thread lock, but usually hitting Ctrl+C does nothing and I have to `kill -9` the PID instead. This is all running on my M2 Mac. I've seen the bug in the Datasette 1.0 alphas and in Datasette 0.64.3 - but reverting to 0.61 appeared to fix it. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2189/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1421552095 | I_kwDOBm6k_c5Uuynf | 1852 | Default API token authentication mechanism | 9599 | closed | 0 | 8658075 | 30 | 2022-10-24T22:31:07Z | 2022-11-15T19:57:00Z | 2022-10-26T02:19:54Z | OWNER | API authentication will be via `Authorization: Bearer XXX` request headers. I'm inclined to add a default token mechanism to Datasette based on tokens that are signed with the `DATASETTE_SECRET`. Maybe the root user can access `/-/create-token` which provides a UI for generating a time-limited signed token? Could also have a `datasette token` command for creating such tokens at the command-line. Plugins can then define alternative ways of creating tokens, such as the existing https://datasette.io/plugins/datasette-auth-tokens plugin. _Originally posted by @simonw in https://github.com/simonw/datasette/issues/1850#issuecomment-1289706439_ | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1852/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
608058890 | MDU6SXNzdWU2MDgwNTg4OTA= | 744 | link_or_copy_directory() error - Invalid cross-device link | 30607 | closed | 0 | 28 | 2020-04-28T06:26:45Z | 2020-05-28T14:32:53Z | 2020-05-27T06:01:28Z | NONE | Hi, when I run ``` datasette publish heroku -n myapp --template-dir ./template mydb.db ``` I have this error ``` Traceback (most recent call last): File "/home/aborruso/.local/lib/python3.7/site-packages/datasette/utils/__init__.py", line 607, in link_or_copy_directory shutil.copytree(src, dst, copy_function=os.link) File "/usr/lib/python3.7/shutil.py", line 365, in copytree raise Error(errors) shutil.Error: [('/myfolder/youtubeComunePalermo/processing/./template/base.html', '/tmp/tmps9_4mzc4/templates/base.html', "[Errno 18] Invalid cross-device link: '/myfolder/youtubeComunePalermo/processing/./template/base.html' -> '/tmp/tmps9_4mzc4/templates/base.html'"), ('/myfolder/youtubeComunePalermo/processing/./template/index.html', '/tmp/tmps9_4mzc4/templates/index.html', "[Errno 18] Invalid cross-device link: '/myfolder/youtubeComunePalermo/processing/./template/index.html' -> '/tmp/tmps9_4mzc4/templates/index.html'")] During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/home/aborruso/.local/bin/datasette", line 8, in <module> sys.exit(cli()) File "/home/aborruso/.local/lib/python3.7/site-packages/click/core.py", line 829, in __call__ return self.main(*args, **kwargs) File "/home/aborruso/.local/lib/python3.7/site-packages/click/core.py", line 782, in main rv = self.invoke(ctx) File "/home/aborruso/.local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/home/aborruso/.local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/home/aborruso/.local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke return ctx.invoke(self.callback, **ctx.params) File "/home/aborruso/.local/lib/python3.7/site-packages/click/core.py", line 610, in invoke return callback(*args, **kwargs) File "/home/aborruso/.local/lib/pytho… | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/744/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
323658641 | MDU6SXNzdWUzMjM2NTg2NDE= | 262 | Add ?_extra= mechanism for requesting extra properties in JSON | 9599 | open | 0 | 3268330 | 27 | 2018-05-16T14:55:42Z | 2023-03-29T06:22:22Z | OWNER | Datasette views currently work by creating a set of data that should be returned as JSON, then defining an additional, optional `template_data()` function which is called if the view is being rendered as HTML. This `template_data()` function calculates extra template context variables which are necessary for the HTML view but should not be included in the JSON. Example of how that is used today: https://github.com/simonw/datasette/blob/2b79f2bdeb1efa86e0756e741292d625f91cb93d/datasette/views/table.py#L672-L704 With features like Facets in #255 I'm beginning to want to move more items into the `template_data()` - in the case of facets it's the `suggested_facets` array. This saves that feature from being calculated (involving several SQL queries) for the JSON case where it is unlikely to be used. But... as an API user, I want to still optionally be able to access that information. Solution: Add a `?_extra=suggested_facets&_extra=table_metadata` argument which can be used to optionally request additional blocks to be added to the JSON API. Then redefine as many of the current `template_data()` features as extra arguments instead, and teach Datasette to return certain extras by default when rendering templates. This could allow the JSON representation to be slimmed down further (removing e.g. the `table_definition` and `view_definition` keys) while still making that information available to API users who need it. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/262/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
|||||||
323681589 | MDU6SXNzdWUzMjM2ODE1ODk= | 266 | Export to CSV | 9599 | closed | 0 | 27 | 2018-05-16T15:50:24Z | 2021-06-17T18:14:24Z | 2018-06-18T06:05:25Z | OWNER | Datasette needs to be able to export data to CSV. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/266/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
396212021 | MDU6SXNzdWUzOTYyMTIwMjE= | 394 | base_url configuration setting | 9599 | closed | 0 | 5234079 | 27 | 2019-01-05T23:48:48Z | 2020-06-11T09:15:20Z | 2020-03-25T00:18:45Z | OWNER | I've identified a couple of use-cases for running Datasette in a way that over-rides the default way that internal URLs are generated. 1. Running behind a reverse proxy. I tried running Datasette behind a proxy and found that some of the generated internal links incorrectly referenced `http://127.0.0.1:8001/fixtures/...` - when they should have been referencing `http://my-host.my-domain.com/fixtures/...` - this is a problem both for links within the HTML interface but also for the `toggle_url` keys returned in the JSON as part of the facets datastructure. 2. I would like it to be possible to host a Datasette instance at e.g. `https://www.mynewspaper.com/interactives/2018/election-results/` - either through careful HTTP proxying or, once Datasette has been ported to ASGI, by mounting a Datasette ASGI instance deep within an existing set of URL routes. I'm going to add a `url_prefix` configuration option. This will default to `""`, which means Datasette will behave as it does at the moment - it will use `/` for most URL prefixes in the HTML version, and an absolute URL derived from the incoming `Host` header for URLs that are returned as part of the JSON output. If `url_prefix` is set to another value (either a full URL or a path) then this path will be appended to all generated URLs. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/394/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
459397625 | MDU6SXNzdWU0NTkzOTc2MjU= | 514 | Documentation with recommendations on running Datasette in production without using Docker | 7936571 | closed | 0 | 5971510 | 27 | 2019-06-21T22:48:12Z | 2020-10-08T23:55:53Z | 2020-10-08T23:33:05Z | NONE | I've got some SQLite databases too big to push to Heroku or the other services with built-in support in datasette. So instead I moved my datasette code and databases to a remote server on Kimsufi. In the folder containing the SQLite databases I run the following code. `nohup datasette serve -h 0.0.0.0 *.db --cors --port 8000 --metadata metadata.json > output.log 2>&1 &`. When I go to `http://my-remote-server.com:8000`, the site loads. But I know this is not a good long-term solution to running datasette on this server. What is the "correct" way to have this site run, preferably on server port 80? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/514/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
1079149656 | I_kwDOBm6k_c5AUoRY | 1555 | Optimize all those calls to index_list and foreign_key_list | 9599 | closed | 0 | 7571612 | 27 | 2021-12-13T23:50:56Z | 2022-01-13T22:27:32Z | 2021-12-19T20:55:59Z | OWNER | On the first hit to a restarted index I'm seeing this in the SQL traces: https://latest-with-plugins.datasette.io/github/commits?_trace=1 <img width="826" alt="image" src="https://user-images.githubusercontent.com/9599/145907011-e4abc93c-8dbf-468d-afef-13d5e7463f92.png"> I imagine this could be sped up a lot using tricks like this one from the SQLite documentation: https://sqlite.org/pragma.html#pragfunc ```sql SELECT DISTINCT m.name || '.' || ii.name AS 'indexed-columns' FROM sqlite_schema AS m, pragma_index_list(m.name) AS il, pragma_index_info(il.name) AS ii WHERE m.type='table' ORDER BY 1; ``` https://latest-with-plugins.datasette.io/fixtures?sql=SELECT+DISTINCT+m.name+%7C%7C+%27.%27+%7C%7C+ii.name+AS+%27indexed-columns%27%0D%0A++FROM+sqlite_schema+AS+m%2C%0D%0A+++++++pragma_index_list%28m.name%29+AS+il%2C%0D%0A+++++++pragma_index_info%28il.name%29+AS+ii%0D%0A+WHERE+m.type%3D%27table%27%0D%0A+ORDER+BY+1%3B | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1555/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
325958506 | MDU6SXNzdWUzMjU5NTg1MDY= | 283 | Support cross-database joins | 9599 | closed | 0 | 26 | 2018-05-24T04:18:39Z | 2021-06-06T09:40:18Z | 2021-02-18T22:16:46Z | OWNER | SQLite has the ability to attach multiple databases to a single connection and then run joins across multiple databases. Since Datasette supports more than one database, this would make a pretty neat feature. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/283/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
582517965 | MDU6SXNzdWU1ODI1MTc5NjU= | 698 | Ability for a canned query to write to the database | 9599 | closed | 0 | 5512395 | 26 | 2020-03-16T18:31:59Z | 2020-06-06T19:43:49Z | 2020-06-06T19:43:48Z | OWNER | Canned queries are currently read-only: https://datasette.readthedocs.io/en/0.38/sql_queries.html#canned-queries Add a `"write": true` option to their definition in `metadata.json` which turns them into queries that are submitted via POST and send their queries to the write queue. Then they can be used as a really quick way to define a writable interface and JSON API! | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/698/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
621989740 | MDU6SXNzdWU2MjE5ODk3NDA= | 114 | table.transform() method for advanced alter table | 9599 | closed | 0 | 5897911 | 26 | 2020-05-20T18:20:46Z | 2020-09-22T07:51:37Z | 2020-09-22T04:20:02Z | OWNER | SQLite's `ALTER TABLE` can only do the following: * Rename a table * Rename a column * Add a column Notably, it cannot drop columns - so tricks like "add a float version of this text column, populate it, then drop the old one and rename" won't work. The docs here https://www.sqlite.org/lang_altertable.html#making_other_kinds_of_table_schema_changes describe a way of implementing full alters safely within a transaction, but it's fiddly. 1. Create new table 2. Copy data 3. Drop old table 4. Rename new into old It would be great if `sqlite-utils` provided an abstraction to help make these kinds of changes safely. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/114/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
648435885 | MDU6SXNzdWU2NDg0MzU4ODU= | 878 | New pattern for views that return either JSON or HTML, available for plugins | 9599 | open | 0 | 3268330 | 26 | 2020-06-30T19:26:13Z | 2022-03-19T16:19:30Z | OWNER | Can be part of #870 - refactoring existing views to use `register_routes()`. > I'm going to put the new `check_permissions()` method on `BaseView` as well. If I want that method to be available to plugins I can do so by turning that `BaseView` class into a documented API that plugins are encouraged to use themselves. _Originally posted by @simonw in https://github.com/simonw/datasette/issues/832#issuecomment-651995453_ | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/878/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
|||||||
639993467 | MDU6SXNzdWU2Mzk5OTM0Njc= | 850 | Proof of concept for Datasette on AWS Lambda with EFS | 9599 | open | 0 | 25 | 2020-06-16T21:48:31Z | 2020-06-16T23:52:16Z | OWNER | https://aws.amazon.com/about-aws/whats-new/2020/06/aws-lambda-support-for-amazon-elastic-file-system-now-generally-/ If Datasette can run on Lambda with access to EFS it could both read AND write large databases there. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/850/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1124731464 | I_kwDOCGYnMM5DCgpI | 399 | Make it easier to insert geometries, with documentation and maybe code | 9599 | open | 0 | 25 | 2022-02-05T00:11:26Z | 2023-05-16T03:11:52Z | OWNER | In playing with the new SpatiaLite helpers from #385 I noticed that actually populating geometry columns is still a little bit tricky. Here's what I ended up doing: ```python import httpx, sqlite_utils db = sqlite_utils.Database("/tmp/spatial.db") attractions = httpx.get("https://latest.datasette.io/fixtures/roadside_attractions.json?_shape=array").json() db["attractions"].insert_all(attractions, pk="pk") # Schema of that table is now: # CREATE TABLE [attractions] ( # [pk] INTEGER PRIMARY KEY, # [name] TEXT, # [address] TEXT, # [latitude] FLOAT, # [longitude] FLOAT # ) db.init_spatialite() db["attractions"].add_geometry_column("point", "POINT") db.execute(""" update attractions set point = GeomFromText( 'POINT(' || longitude || ' ' || latitude || ')', 4326 ) """) ``` That last line took some figuring out - especially the need for the SRID of `4326`, without which I got this error: > `IntegrityError: attractions.point violates Geometry constraint [geom-type or SRID not allowed]` It would be good to both document this in more detail, but ideally also to come up with a more obvious pattern for inserting common types of spatial data. Also related: - #398 - #79 | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/399/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1662951875 | I_kwDOBm6k_c5jHqHD | 2057 | DeprecationWarning: pkg_resources is deprecated as an API | 9599 | closed | 0 | 25 | 2023-04-11T17:41:20Z | 2023-09-21T22:09:10Z | 2023-09-21T22:09:10Z | OWNER | Got this running tests against Python 3.11. ``` ../../../.local/share/virtualenvs/datasette-big-local-6Yn-280V/lib/python3.11/site-packages/datasette/app.py:14: in <module> import pkg_resources ../../../.local/share/virtualenvs/datasette-big-local-6Yn-280V/lib/python3.11/site-packages/pkg_resources/__init__.py:121: in <module> warnings.warn("pkg_resources is deprecated as an API", DeprecationWarning) E DeprecationWarning: pkg_resources is deprecated as an API ``` I ran with `pytest -Werror --pdb -x` to get the debugger for that warning, but it turned out searching the code worked better. It's used in these two places: https://github.com/simonw/datasette/blob/5890a20c374fb0812d88c9b0ef26a838bfa06c76/datasette/plugins.py#L43-L50 https://github.com/simonw/datasette/blob/5890a20c374fb0812d88c9b0ef26a838bfa06c76/datasette/app.py#L1037 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2057/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
502993509 | MDU6SXNzdWU1MDI5OTM1MDk= | 581 | Redesign register_output_renderer callback | 9599 | closed | 0 | 5471110 | 24 | 2019-10-05T17:43:23Z | 2020-05-28T02:24:14Z | 2020-05-28T02:21:50Z | OWNER | In building https://github.com/simonw/datasette-atom it became clear that the callback function (which currently accepts just args, data and view_name) would also benefit from access to a mechanism to render templates and a `datasette` instance so it can execute SQL. To maintain backwards compatibility with existing plugins, we can introspect the callback function to see if it wants those new arguments or not. At a minimum I want to make `datasette` and ASGI `scope` available. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/581/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
1427293909 | I_kwDOBm6k_c5VEsbV | 1871 | API explorer tool | 9599 | closed | 0 | 8658075 | 24 | 2022-10-28T13:49:11Z | 2022-11-15T19:59:05Z | 2022-11-14T04:59:59Z | OWNER | The API will be much easier to develop if there's a page that helps you execute JSON POSTs against it. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1871/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
1855885427 | I_kwDOBm6k_c5unpBz | 2143 | De-tangling Metadata before Datasette 1.0 | 15178711 | open | 0 | 24 | 2023-08-18T00:51:50Z | 2023-08-24T18:28:27Z | CONTRIBUTOR | Metadata in Datasette is a really powerful feature, but is a bit difficult to work with. It was initially a way to add "metadata" about your "data" in Datasette instances, like descriptions for databases/tables/columns, titles, source URLs, licenses, etc. But it later became the go-to spot for other Datasette features that have nothing to do with metadata, like permissions/plugins/canned queries. Specifically, I've found the following problems when working with Datasette metadata: 1. Metadata cannot be updated without re-starting the entire Datasette instance. 2. The `metadata.json`/`metadata.yaml` has become a kitchen sink of unrelated (imo) features like plugin config, authentication config, canned queries 3. The Python APIs for defining extra metadata are a bit awkward (the `datasette.metadata()` class, `get_metadata()` hook, etc.) ## Possible solutions Here's a few ideas of Datasette core changes we can make to address these problems. ### Re-vamp the Datasette Python metadata APIs The Datasette object has a single `datasette.metadata()` method that's a bit difficult to work with. There's also no Python API for inserted new metadata, so plugins have to rely on the `get_metadata()` hook. The `get_metadata()` hook can also be improved - it doesn't work with async functions yet, so you're quite limited to what you can do. (I'm a bit fuzzy on what to actually do here, but I imagine it'll be very small breaking changes to a few Python methods) ### Add an optional `datasette_metadata` table Datasette should detect and use metadata stored in a new special table called `datasette_metadata`. This would be a regular table that a user can edit on their own, and would serve as a "live updating" source of metadata, than can be changed while the Datasette instance is running. Not too sure what the schema would look like, but I'd imagine: ```sql CREATE TABLE datasette_metadata( level text, target any, key text, value any, primary key (level, target) ) ``` Every row… | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2143/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
459882902 | MDU6SXNzdWU0NTk4ODI5MDI= | 526 | Stream all results for arbitrary SQL and canned queries | 50578294 | open | 0 | 23 | 2019-06-24T13:09:45Z | 2022-09-28T04:01:25Z | NONE | I think that there is a difficulty with canned queries. When I want to stream all results of a canned query TwoDays I get only first 1.000 records. Example: `http://myserver/history_sample/two_days.csv?_stream=on` returns only first 1.000 records. If I do the same with the whole database i.e. `http://myserver/history_sample/database.csv?_stream=on` I get correctly all records. Any ideas? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/526/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
775666296 | MDU6SXNzdWU3NzU2NjYyOTY= | 1160 | "datasette insert" command and plugin hook | 9599 | open | 0 | 23 | 2020-12-29T02:37:03Z | 2021-06-17T18:12:32Z | OWNER | Tools for loading data into Datasette currently mostly exist as separate utilities - `yaml-to-sqlite` and `csvs-to-sqlite` and suchlike. Bringing these into Datasette could have some interesting properties: - A `datasette insert` command could be extended with plugins to handle more formats - Any format that can be inserted on the command-line could also be inserted using a web UI or web API - which would benefit from new format plugin hooks - If Datasette ever grows beyond SQLite (see #670) a built-in import mechanism could work for those other databases as well - without me needing to write `yaml-to-postgresql` and suchlike | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1160/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
944846776 | MDU6SXNzdWU5NDQ4NDY3NzY= | 297 | Option for importing CSV data using the SQLite .import mechanism | 9599 | open | 0 | 23 | 2021-07-14T22:36:41Z | 2023-09-22T20:49:52Z | OWNER | As seen in https://til.simonwillison.net/sqlite/import-csv - `.mode csv` and then `.import school.csv schools` is hugely faster than importing via `sqlite-utils insert` and doing the work in Python - but it can only be implemented by shelling out to the `sqlite3` CLI tool, it's not functionality that is exposed to the Python `sqlite3` module. An option to use this would be useful - maybe something like this: sqlite-utils insert blah.db blah blah.csv --fast | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/297/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1857234285 | I_kwDOBm6k_c5usyVt | 2145 | If a row has a primary key of `null` various things break | 9599 | open | 0 | 23 | 2023-08-18T20:06:28Z | 2023-08-21T17:30:01Z | OWNER | Stumbled across this while experimenting with `datasette-write-ui`. The error I got was a 500 on the `/db` page: > `'NoneType' object has no attribute 'encode'` Tracked it down to this code, which assembles the URL for a row page: https://github.com/simonw/datasette/blob/943df09dcca93c3b9861b8c96277a01320db8662/datasette/utils/__init__.py#L120-L134 That's because `tilde_encode` can't handle `None`: https://github.com/simonw/datasette/blob/943df09dcca93c3b9861b8c96277a01320db8662/datasette/utils/__init__.py#L1175-L1178 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2145/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
267707940 | MDU6SXNzdWUyNjc3MDc5NDA= | 14 | Datasette Plugins | 9599 | closed | 0 | 22 | 2017-10-23T15:15:28Z | 2019-05-13T18:58:20Z | 2019-05-13T18:58:19Z | OWNER | It would be neat if additional functionality could be opted-in to the system in the form of easy-to-add plugins, hosted as separate packages. First example: a Google Analytics plugin, which adds GA tracking code with your tracking ID to the web interface for your dataset. This may be an opportunity to experiment with entry points: http://amir.rachum.com/blog/2017/07/28/python-entry-points/ | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/14/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
742011049 | MDU6SXNzdWU3NDIwMTEwNDk= | 1091 | .json and .csv exports fail to apply base_url | 9599 | closed | 0 | 6346396 | 22 | 2020-11-12T23:45:16Z | 2021-01-24T21:20:24Z | 2021-01-09T22:19:29Z | OWNER | > Just tested with the latest Docker image, and it works pretty much everywhere! THANK YOU! > > I did notice that if I try to export json or csv, the base is not applied. Not sure if I should reopen this issue or open a new one. > > To see this, go here: https://corpora.tika.apache.org/datasette/corpora-metadata/REF_PARSE_EXCEPTION_TYPES > > Click/hover over json or CSV and you'll see that the 'datasette' base is not included. _Originally posted by @tballison in https://github.com/simonw/datasette/issues/865#issuecomment-726385422_ | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1091/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
771202454 | MDU6SXNzdWU3NzEyMDI0NTQ= | 1153 | Use YAML examples in documentation by default, not JSON | 9599 | closed | 0 | 22 | 2020-12-18T22:20:15Z | 2023-07-08T20:09:48Z | 2023-07-08T20:08:13Z | OWNER | YAML configuration is much better for multi-line strings, and I'm increasingly adding configuration options to Datasette that benefit from that - fragments of HTML in `description_html` or SQL queries used to configure things like https://github.com/simonw/datasette-atom for example. Rather than confusing things by showing both in the documentation, I should switch all of the default examples to use YAML instead. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1153/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
921878733 | MDU6SXNzdWU5MjE4Nzg3MzM= | 272 | Idea: import CSV to memory, run SQL, export in a single command | 9599 | closed | 0 | 22 | 2021-06-15T23:02:48Z | 2021-06-19T23:36:48Z | 2021-06-18T15:05:03Z | OWNER | I quite often load a CSV file into a SQLite DB, then do stuff with it (like export results back out again as a new CSV) without any intention of keeping the CSV file around afterwards. What if `sqlite-utils` could do this for me? Something like this: sqlite-utils --csv blah.csv --csv baz.csv "select * from blah join baz ..." | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/272/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
930807135 | MDU6SXNzdWU5MzA4MDcxMzU= | 1384 | Plugin hook for dynamic metadata | 9599 | open | 0 | 22 | 2021-06-26T22:36:03Z | 2022-03-14T00:36:42Z | OWNER | @brandonrobertz contributed an implementation of this in PR #1368, which I just merged. Opening this ticket to track further work on this before it goes out in a Datasette release (likely preceded by an alpha). | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1384/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1058790545 | I_kwDOBm6k_c4_G9yR | 1519 | base_url is omitted in JSON and CSV views | 157158 | closed | 0 | 22 | 2021-11-19T18:10:45Z | 2021-12-01T17:50:09Z | 2021-11-20T19:11:21Z | NONE | I have a datasette deployment, using Apache2 to reverse proxy: ProxyPass /ged http://thor.phfactor.net:8001 ProxyPreserveHost On In settings.json I have ```json { "base_url": "/ged/", "trace_debug": 1, "template_debug": 1 } ``` and datasette works correctly. However, if you view a query and then click on the 'This data as json, CSV' both links omit the base_url prefix and are therefore 404. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1519/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1421544654 | I_kwDOBm6k_c5UuwzO | 1851 | API to insert a single record into an existing table | 9599 | closed | 0 | 8658075 | 22 | 2022-10-24T22:24:21Z | 2022-11-15T19:59:18Z | 2022-10-28T00:59:25Z | OWNER | Controlled by a new `insert-row` permission. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1851/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
1493390939 | I_kwDOBm6k_c5ZA1Zb | 1947 | UI to create reduced scope tokens from the `/-/create-token` page | 9599 | closed | 0 | 8711695 | 22 | 2022-12-13T05:10:48Z | 2022-12-14T05:22:00Z | 2022-12-14T05:13:24Z | OWNER | Split from: - #1855 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1947/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
324835838 | MDU6SXNzdWUzMjQ4MzU4Mzg= | 276 | Handle spatialite geometry columns better | 45057 | closed | 0 | 21 | 2018-05-21T08:46:55Z | 2022-03-21T22:22:20Z | 2022-03-21T22:22:20Z | CONTRIBUTOR | I'd like to see spatialite geometry columns rendered more sensibly - at the moment they come through as well-known-binary unless you use custom SQL, and WKB isn't of much use to anyone on the web. In HTML: they should be shown either as simple lat/long (if it's just a point, for example), or as a sensible placeholder if they're more complex geometries. In JSON: they should be GeoJSON geometries, (which means they can be automatically fed into a leaflet map with no further messing around). In CSV: they should be WKT. I briefly wondered if this should go into a plugin, but I suspect it needs hooking in at a deeper level than the plugin architecture will support any time soon. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/276/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
470345929 | MDU6SXNzdWU0NzAzNDU5Mjk= | 42 | table.extract(...) method and "sqlite-utils extract" command | 9599 | closed | 0 | 5897911 | 21 | 2019-07-19T14:09:36Z | 2020-09-22T23:39:31Z | 2020-09-22T23:37:49Z | OWNER | One of my favourite features of [csvs-to-sqlite](https://github.com/simonw/csvs-to-sqlite) is that it can "extract" columns into a separate lookup table - for example: csvs-to-sqlite big_csv_file.csv -c country output.db This will turn the `country` column in the resulting table into a integer foreign key against a new `country` table. You can see an example of what that looks like here: https://san-francisco.datasettes.com/registered-business-locations-3d50679/Business+Corridor was extracted from https://san-francisco.datasettes.com/registered-business-locations-3d50679/Registered_Business_Locations_-_San_Francisco?Business%20Corridor=1 I'd like to have the same capability in `sqlite-utils` - but with the ability to run it against an existing SQLite table rather than just against a CSV. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/42/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
642572841 | MDU6SXNzdWU2NDI1NzI4NDE= | 859 | Database page loads too slowly with many large tables (due to table counts) | 3243482 | open | 0 | 21 | 2020-06-21T14:23:17Z | 2021-08-25T21:59:55Z | CONTRIBUTOR | Hey, I have a database that I save in HTML from couple of web scrapers. There are around 200k+, 50+ rows in a couple of tables, with sqlite file weighing around 600MB. The app runs on a VPS with 2 core CPU, 4GB RAM and refreshing database page regularly takes more than 10 seconds. I was suspecting that counting tables was the culprit, but manually running `select count(*) from table_name` for the largest table finishes under a second. I've looked at the source code. There's a check for index page for mutable databases larger than 100MB https://github.com/simonw/datasette/blob/799c5d53570d773203527f19530cf772dc2eeb24/datasette/views/index.py#L15 but this check is not performed for database page. I've manually crippled `Database::table_counts` method ```py async def table_counts(self, limit=10): if not self.is_mutable and self.cached_table_counts is not None: return self.cached_table_counts # Try to get counts for each table, $limit timeout for each count counts = {} for table in await self.table_names(): try: # table_count = ( # await self.execute( # "select count(*) from [{}]".format(table), # custom_time_limit=limit, # ) # ).rows[0][0] counts[table] = 10 # table_count # In some cases I saw "SQL Logic Error" here in addition to # QueryInterrupted - so we catch that too: except (QueryInterrupted, sqlite3.OperationalError, sqlite3.DatabaseError): counts[table] = None if not self.is_mutable: self.cached_table_counts = counts return counts ``` now the page loads in <100ms. Is it possible to apply size check on database page too? <details> <summary> /-/versions output </summary> <pre> { "python": { "version": "3.8.0", "full": "3.8.0 (default, Oct 28 2019, 16:14:01) \n[GCC 8.3.0]" }, "datasette": { "version": "0.44" }, "asgi": "3.… | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/859/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
657572753 | MDU6SXNzdWU2NTc1NzI3NTM= | 894 | ?sort=colname~numeric to sort by by column cast to real | 9599 | open | 0 | 21 | 2020-07-15T18:47:48Z | 2021-08-20T02:07:53Z | OWNER | If a text column actually contains numbers, being able to "sort by column, treated as numeric" would be really useful. Probably depends on column actions enabled by #690 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/894/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
777333388 | MDU6SXNzdWU3NzczMzMzODg= | 1168 | Mechanism for storing metadata in _metadata tables | 9599 | open | 0 | 21 | 2021-01-01T18:47:27Z | 2023-09-28T18:29:05Z | OWNER | _Original title: Perhaps metadata should all live in a `_metadata` in-memory database_ Inspired by #1150 - metadata should be exposed as an API, and for large Datasette instances that API may need to be paginated. So why not expose it through an in-memory database table? One catch to this: plugins. #860 aims to add a plugin hook for metadata. But if the metadata comes from an in-memory table, how do the plugins interact with it? The need to paginate over metadata does make a plugin hook that returns metadata for an individual table seem less wise, since we don't want to have to do 10,000 plugin hook invocations to show a list of all metadata. If those plugins write directly to the in-memory table how can their contributions survive the server restarting? | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1168/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1243498298 | I_kwDOBm6k_c5KHkc6 | 1746 | Switch documentation theme to Furo | 9599 | closed | 0 | 21 | 2022-05-20T18:42:17Z | 2022-05-20T21:28:29Z | 2022-05-20T21:28:29Z | OWNER | https://github.com/pradyunsg/furo I just did this for `shot-scraper` and I really like it: https://shot-scraper.datasette.io/en/latest/ - https://github.com/simonw/shot-scraper/issues/77 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1746/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
628499086 | MDU6SXNzdWU2Mjg0OTkwODY= | 790 | "flash messages" mechanism | 9599 | closed | 0 | 5512395 | 20 | 2020-06-01T14:55:44Z | 2020-06-08T19:33:59Z | 2020-06-02T21:14:03Z | OWNER | > Passing `?_success` like this isn't necessarily the best approach. Potential improvements include: > > - Signing this message so it can't be tampered with (I could generate a signing secret on startup) > - Using a cookie with a temporary flash message in it instead > - Using HTML5 history API to remove the `?_success=` from the URL bar when the user lands on the page > > If I add an option to redirect the user to another page after success I may need a mechanism to show a flash message on that page as well, in which case I'll need a general flash message solution that works for any page. _Originally posted by @simonw in https://github.com/simonw/datasette/pull/703_ | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/790/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
679808124 | MDU6SXNzdWU2Nzk4MDgxMjQ= | 940 | Move CI to GitHub Issues | 9599 | closed | 0 | 5818042 | 20 | 2020-08-16T19:06:08Z | 2020-09-14T22:09:35Z | 2020-09-14T22:09:35Z | OWNER | It looks like the tests take 3m33s to run in GitHub Actions, but they're taking more than 8 minutes in Travis | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/940/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
733499930 | MDU6SXNzdWU3MzM0OTk5MzA= | 1072 | load_template hook doesn't work for include/extends | 9599 | closed | 0 | 6026070 | 20 | 2020-10-30T20:33:44Z | 2020-10-31T20:48:18Z | 2020-10-30T22:50:57Z | OWNER | Includes like this one always go to disk, without hitting the `load_template` plugin hook: ```html+jinja <footer class="ft">{% block footer %}{% include "_footer.html" %}{% endblock %}</footer> ``` | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1072/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
1174306154 | I_kwDOBm6k_c5F_n1q | 1668 | Introduce concept of a database `route`, separate from its name | 9599 | closed | 0 | 3268330 | 20 | 2022-03-19T16:48:28Z | 2022-03-20T16:43:16Z | 2022-03-20T16:43:16Z | OWNER | Some issues came up in the new `datasette-hashed-urls` plugin relating to the way it renames databases on startup to achieve unique URLs that depend on the database SHA-256 content: - https://github.com/simonw/datasette-hashed-urls/issues/10 - https://github.com/simonw/datasette-hashed-urls/issues/9 - https://github.com/simonw/datasette-hashed-urls/issues/8 All three of these could be addressed by making the "path" concept for a database (the `/foo` bit where it is served) work independently of the database's name, which would be used for default display and also as the alias when configuring cross-database aliases. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1668/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
1250629388 | I_kwDOCGYnMM5KixcM | 440 | CSV files with too many values in a row cause errors | 4068 | closed | 0 | 20 | 2022-05-27T10:54:44Z | 2022-06-14T22:23:01Z | 2022-06-14T20:12:46Z | NONE | *Original title: csv.DictReader can have None as key* In some cases, `csv.DictReader` can have `None` as key for unnamed columns, and a list of values as value. `sqlite_utils.utils.rows_from_file` cannot handle that: ```python url="https://artsdatabanken.no/Fab2018/api/export/csv" db = sqlite_utils.Database(":memory") with urlopen(url) as fab: reader, _ = sqlite_utils.utils.rows_from_file(fab, encoding="utf-16le") db["fab2018"].insert_all(reader, pk="Id") ``` Result: ``` Traceback (most recent call last): File "<stdin>", line 3, in <module> File "/home/user/.local/pipx/venvs/sqlite-utils/lib/python3.8/site-packages/sqlite_utils/db.py", line 2924, in insert_all chunk = list(chunk) File "/home/user/.local/pipx/venvs/sqlite-utils/lib/python3.8/site-packages/sqlite_utils/db.py", line 3454, in fix_square_braces if any("[" in key or "]" in key for key in record.keys()): File "/home/user/.local/pipx/venvs/sqlite-utils/lib/python3.8/site-packages/sqlite_utils/db.py", line 3454, in <genexpr> if any("[" in key or "]" in key for key in record.keys()): TypeError: argument of type 'NoneType' is not iterable ``` Code: https://github.com/simonw/sqlite-utils/blob/59be60c471fd7a2c4be7f75e8911163e618ff5ca/sqlite_utils/db.py#L3454 `sqlite-utils insert` from command line is not affected by this issue. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/440/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1485757511 | I_kwDOBm6k_c5YjtxH | 1939 | register_permissions(datasette) plugin hook | 9599 | closed | 0 | 8711695 | 20 | 2022-12-09T01:33:25Z | 2022-12-13T02:07:50Z | 2022-12-13T02:05:56Z | OWNER | A plugin hook that adds more named permissions to the list which is initially populated here: https://github.com/simonw/datasette/blob/e539c1c024bc62d88df91d9107cbe37e7f0fe55f/datasette/permissions.py#L1-L19 Originally imagined this hook in this comment: - https://github.com/simonw/datasette/issues/1881#issuecomment-1301639370 I need this for a few reasons: - https://github.com/simonw/datasette/issues/1636 - Needs it in order to validate that permissions defined in `metadata.json` are set in the right place (don't set an instance permissions at table level for example) - https://github.com/simonw/datasette/issues/1855 - Needs it to be able to register additional abbreviations for use in signed cookies - And for validation when you use `datasette create-token` and pass in extra permissions - The https://latest.datasette.io/-/permissions debug interface needs it to add extra debug options to the `<select>` | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1939/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
1805076818 | I_kwDOBm6k_c5rl0lS | 2102 | API tokens with view-table but not view-database/view-instance cannot access the table | 9599 | closed | 0 | 9599 | 20 | 2023-07-14T15:34:27Z | 2023-08-29T16:32:36Z | 2023-08-29T16:32:35Z | OWNER | > Spotted a problem while working on this: if you grant a token access to view table for a specific table but don't also grant view database and view instance permissions, that token is useless. > > This was a deliberate design decision in Datasette - it's documented on https://docs.datasette.io/en/1.0a2/authentication.html#access-permissions-in-metadata > >> If a user cannot access a specific database, they will not be able to access tables, views or queries within that database. If a user cannot access the instance they will not be able to access any of the databases, tables, views or queries. > > I'm now second-guessing if this was a good decision. _Originally posted by @simonw in https://github.com/simonw/datasette-auth-tokens/issues/7#issuecomment-1636031702_ | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2102/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
725184645 | MDU6SXNzdWU3MjUxODQ2NDU= | 1034 | Better way of representing binary data in .csv output | 9599 | closed | 0 | 6026070 | 19 | 2020-10-20T04:28:58Z | 2021-06-17T18:13:21Z | 2020-10-29T22:47:46Z | OWNER | I just noticed this: https://latest.datasette.io/fixtures/binary_data.csv ```csv rowid,data 1,b'\x15\x1c\x02\xc7\xad\x05\xfe' 2,b'\x15\x1c\x03\xc7\xad\x05\xfe' ``` There's no good way to represent binary data in a CSV file, but this seems like one of the more-bad options. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1034/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
736520310 | MDU6SXNzdWU3MzY1MjAzMTA= | 196 | Introspect if table is FTS4 or FTS5 | 9599 | closed | 0 | 19 | 2020-11-05T00:45:50Z | 2020-11-05T03:54:07Z | 2020-11-05T03:54:07Z | OWNER | > I want `.search()` to work against both FTS5 and FTS4 tables - but sort by rank should only work for FTS5. > > This means I need to be able to introspect and tell if a table is FTS4 or FTS5. _Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/192#issuecomment-722054264_ | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/196/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1108235694 | I_kwDOBm6k_c5CDlWu | 1603 | A proper favicon | 9599 | closed | 0 | 3268330 | 19 | 2022-01-19T15:24:55Z | 2022-03-19T04:04:49Z | 2022-01-20T06:07:31Z | OWNER | Tips here: https://adamj.eu/tech/2022/01/18/how-to-add-a-favicon-to-your-django-site/ - I think a PNG served at `/favicon.ico` is the best option, since safari doesn't support SVG yet. Relevant code: https://github.com/simonw/datasette/blob/cb29119db9115b1f40de2fb45263ed77e3bfbb3e/datasette/app.py#L182-L183 I can reuse the icon for https://datasette.io/desktop | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1603/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
1423336089 | I_kwDOBm6k_c5U1mKZ | 1855 | `datasette create-token` ability to create tokens with a reduced set of permissions | 9599 | closed | 0 | 8711695 | 19 | 2022-10-26T02:20:52Z | 2022-12-14T01:24:49Z | 2022-12-13T05:20:24Z | OWNER | Initial design ideas: https://github.com/simonw/datasette/issues/1852#issuecomment-1289733483 > Token design concept: > > ```json > { > "t": { > "a": ["ir", "ur", "dr"], > "d": { > "fixtures": ["ir", "ur", "dr"] > }, > "t": { > "fixtures": { > "searchable": ["ir"] > } > } > } > } > ``` > > That JSON would be minified and signed. > > Minified version of the above looks like this (101 characters): > > `{"t":{"a":["ir","ur","dr"],"d":{"fixtures":["ir","ur","dr"]},"t":{"fixtures":{"searchable":["ir"]}}}}` > > The `"t"` key shows this is a token that as a default API key. > > `"a"` means "all" - these are permissions that have been granted on all tables and databases. > > `"d"` means "databases" - this is a way to set permissions for all tables in a specific database. > > `"t"` means "tables" - this lets you set permissions at a finely grained table level. > > Then the permissions themselves are two character codes which are shortened versions - so: > > * `ir` = `insert-row` > * `ur` = `update-row` > * `dr` = `delete-row` ## Remaining tasks - [x] Add these options to the `datasette create-token` command - [x] Tests for `datasette create-token` options - [x] Documentation for those options at https://docs.datasette.io/en/latest/authentication.html#datasette-create-token - [x] A way to handle permissions that don't have known abbreviations (permissions added by plugins). Probably need to solve the plugin permission registration problem as part of that - [x] Stop hard-coding names of actions in the `permission_allowed_actor_restrictions` function | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1855/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
273944952 | MDU6SXNzdWUyNzM5NDQ5NTI= | 93 | Package as standalone binary | 67420 | closed | 0 | 18 | 2017-11-14T21:14:07Z | 2021-11-21T07:00:23Z | 2021-11-21T07:00:23Z | NONE | hint: more than the docker image a standalone and multiplatform binary (containing the app and the database) could be simpler to distribute. i would like to investigate the possibility to package everything with [pyinstaller](http://www.pyinstaller.org/) adding the database as a [data file](https://pythonhosted.org/PyInstaller/spec-files.html#adding-data-files) | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/93/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
632843030 | MDU6SXNzdWU2MzI4NDMwMzA= | 807 | Ability to ship alpha and beta releases | 9599 | closed | 0 | 5533512 | 18 | 2020-06-07T00:12:55Z | 2020-06-18T21:41:16Z | 2020-06-18T21:41:16Z | OWNER | I'd like to be able to ship alphas and betas to PyPI so in-development plugins can depend on them and help test unreleased plugin hooks. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/807/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
638212085 | MDU6SXNzdWU2MzgyMTIwODU= | 842 | Magic parameters for canned queries | 9599 | closed | 0 | 5533512 | 18 | 2020-06-13T18:50:08Z | 2020-06-28T03:30:31Z | 2020-06-28T02:58:18Z | OWNER | Now that writable canned queries (#698) have landed, it would be neat if they supported "magic" parameters - parameters that are automatically populated with: - the current actor ID / other actor properties - the current date and time - the user's IP or user-agent And maybe other things potentially added by plugins. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/842/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
681334912 | MDU6SXNzdWU2ODEzMzQ5MTI= | 942 | Support column descriptions in metadata.json | 9599 | closed | 0 | 18 | 2020-08-18T20:52:00Z | 2022-01-13T22:21:42Z | 2021-08-12T23:53:24Z | OWNER | Could look something like this: ```json { "title": "Five Thirty Eight", "license": "CC Attribution 4.0 License", "license_url": "https://creativecommons.org/licenses/by/4.0/", "source": "fivethirtyeight/data on GitHub", "source_url": "https://github.com/fivethirtyeight/data", "databases": { "fivethirtyeight": { "tables": { "mueller-polls/mueller-approval-polls": { "description_html": "<p>....</p>", "columns": { "name_of_column": "column_description goes here" } ``` | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/942/reactions", "total_count": 4, "+1": 4, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
810618495 | MDU6SXNzdWU4MTA2MTg0OTU= | 235 | Extract columns cannot create foreign key relation: sqlite3.OperationalError: table sqlite_master may not be modified | 6913891 | closed | 0 | 18 | 2021-02-17T23:33:23Z | 2023-06-26T01:47:01Z | 2023-06-25T23:25:53Z | NONE | Thanks for what seems like a truly great suite of libraries. I wanted to try out Datasette, but never got more than half way through your YouTube video with the SF tree dataset. Whenever I try to extract a column, I get a `sqlite3.OperationalError: table sqlite_master may not be modified` error from Python. This snippet reproduces the error on my system, Python 3.9.1 and sqlite-utils 3.5 on an M1 Macbook Pro running in rosetta mode: ``` curl "https://data.nasa.gov/resource/y77d-th95.json" | \ sqlite-utils insert meteorites.db meteorites - --pk=id sqlite-utils extract meteorites.db meteorites recclass ``` I have tried googling the problem, but all I've found is that this *might* be a problem with the sqlite3 database running in defensive mode, but I definitely can't know for sure. Does the problem seem familiar to you? | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/235/reactions", "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
837308703 | MDU6SXNzdWU4MzczMDg3MDM= | 1268 | Figure out why SpatiaLite 5.0 hangs the database page on Linux | 9599 | closed | 0 | 18 | 2021-03-22T04:44:16Z | 2022-01-20T21:29:43Z | 2021-03-22T17:41:12Z | OWNER | See detailed notes in https://github.com/simonw/datasette/issues/1249 - for some reason SpatiaLite 5.0 hangs the `/dbname` page on Linux (inside Docker containers, both with a custom compiled SpatiaLite and one installed from the Ubuntu 20.10 package repository). This doesn't happen on macOS with SpatiaLite 5 installed using Homebrew. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1268/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1034535001 | I_kwDOBm6k_c49qcBZ | 1497 | Publish to Docker Hub failing with "libcrypt.so.1: cannot open shared object file" | 9599 | closed | 0 | 18 | 2021-10-24T22:57:07Z | 2023-01-18T17:13:45Z | 2021-10-24T23:36:55Z | OWNER | This means the Datasette 0.59.1 release has not been published to Docker Hub. Here's where that failed: https://github.com/simonw/datasette/runs/3991043374?check_suite_focus=true ``` Preparing to unpack .../libc6_2.32-4_amd64.deb ... debconf: unable to initialize frontend: Dialog debconf: (TERM is not set, so the dialog frontend is not usable.) debconf: falling back to frontend: Readline debconf: unable to initialize frontend: Readline debconf: (Can't locate Term/ReadLine.pm in @INC (you may need to install the Term::ReadLine module) (@INC contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.28.1 /usr/local/share/perl/5.28.1 /usr/lib/x86_64-linux-gnu/perl5/5.28 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.28 /usr/share/perl/5.28 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base) at /usr/share/perl5/Debconf/FrontEnd/Readline.pm line 7.) debconf: falling back to frontend: Teletype Checking for services that may need to be restarted... Checking init scripts... Unpacking libc6:amd64 (2.32-4) over (2.28-10) ... Setting up libc6:amd64 (2.32-4) ... /usr/bin/perl: error while loading shared libraries: libcrypt.so.1: cannot open shared object file: No such file or directory dpkg: error processing package libc6:amd64 (--configure): installed libc6:amd64 package post-installation script subprocess returned error exit status 127 Errors were encountered while processing: libc6:amd64 E: Sub-process /usr/bin/dpkg returned an error code (1) The command '/bin/sh -c apt-get update && apt-get -y --no-install-recommends install software-properties-common && add-apt-repository "deb http://httpredir.debian.org/debian sid main" && apt-get update && apt-get -t sid install -y --no-install-recommends libsqlite3-mod-spatialite && apt-get remove -y software-properties-common && apt clean && rm -rf /var/lib/apt && rm -rf /var/lib/dpkg/info/*' returned a non-zero code: 100 ``` Same problem when I attempted to publish using the "Push specific Docker tag" workflow… | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1497/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1409679008 | I_kwDOBm6k_c5UBf6g | 1844 | Update screenshots in documentation to match latest designs | 9599 | closed | 0 | 18 | 2022-10-14T18:01:18Z | 2022-10-14T23:51:46Z | 2022-10-14T19:57:17Z | OWNER | https://docs.datasette.io/en/0.62/full_text_search.html has this out-of-date screenshot: <img width="1191" alt="image" src="https://user-images.githubusercontent.com/9599/195911747-386f4cd2-5239-4c83-8e0c-072e6ae56ff6.png"> | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1844/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
912864936 | MDU6SXNzdWU5MTI4NjQ5MzY= | 1362 | Consider using CSP to protect against future XSS | 9599 | open | 0 | 17 | 2021-06-06T15:32:20Z | 2022-10-08T18:42:09Z | OWNER | The XSS in #1360 would have been a lot less damaging if Datasette used CSP to protect against such vulnerabilities: https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1362/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
322477187 | MDU6SXNzdWUzMjI0NzcxODc= | 255 | Facets | 9599 | closed | 0 | 16 | 2018-05-12T03:00:07Z | 2019-05-29T21:39:12Z | 2018-05-16T15:32:12Z | OWNER | Ability to display facets and facet counts on the table view. Facets can be specified in the URL with `?_facet=column&_facet=othercolumn` or the default facets for a table can be set using a new `"facets": [...]` property in `metadata.json` - [x] Implement `?_facet=` - [x] Implement `metadata.json` `facets` key - [x] Design for how facets should be presented - [x] Facets should be able to toggle off as well as on - [x] Expand labels for facets that are foreign keys - [x] Suggest potential facets (if we can do so within a tight time limit) - [x] Documentation | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/255/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
517451234 | MDU6SXNzdWU1MTc0NTEyMzQ= | 615 | ?_col= and ?_nocol= support for toggling columns on table view | 9599 | closed | 0 | 16 | 2019-11-04T22:55:41Z | 2021-05-27T04:26:10Z | 2021-05-27T04:17:44Z | OWNER | Split off from #292 (I guess this is a re-opening of #312). | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/615/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
633578769 | MDU6SXNzdWU2MzM1Nzg3Njk= | 811 | Support "allow" block on root, databases and tables, not just queries | 9599 | closed | 0 | 5512395 | 16 | 2020-06-07T17:01:09Z | 2020-06-08T19:34:00Z | 2020-06-08T19:32:36Z | OWNER | No reason not to expand the "allow" mechanism [described here](https://github.com/simonw/datasette/blob/86dec9e8fffd6c4efec928ae9b5713748dec7e74/docs/authentication.rst#permissions-for-canned-queries) to the root of `metadata.json` plus to databases and tables. Refs #810 and #800. ```json { "databases": { "mydatabase": { "allow": { "id": ["root"] } } } } ``` TODO: - [x] Instance level - [x] Database level - [x] Table level - [x] Query level - [x] Affects list of queries - [x] Affects list of tables on database page - [x] Affects truncated list of tables on index page - [x] Affects list of SQL views on database page - [x] Affects list of databases on index page - [x] Show 🔒 in header on index page for private instances - [x] Show 🔒 in header on private database page - [x] Show 🔒 in header on private table page - [x] Show 🔒 in header on private query page - [x] Move `assert_permissions_checked()` calls from `test_html.py` to `test_permissions.py` - [x] Update documentation | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/811/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
711627628 | MDU6SXNzdWU3MTE2Mjc2Mjg= | 981 | Action menu for table columns | 9599 | closed | 0 | 5971510 | 16 | 2020-09-30T04:45:38Z | 2020-10-08T23:55:00Z | 2020-09-30T23:58:17Z | OWNER | At the very least I'd like a menu on each table column that lets me select sort-asc v.s. sort-desc without having to click twice. I'd also like to be able to indicate that a column should be used for faceting (possibly only for columns that are not floating point and do not have a unique index on them). This needs to be built with accessibility in mind - I don't want screenreaders to read out the contents of a menu as the "th" label for any given cell. Related: #690 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/981/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
725996507 | MDU6SXNzdWU3MjU5OTY1MDc= | 1036 | Make it possible to download BLOB data from the Datasette UI | 9599 | closed | 0 | 6026070 | 16 | 2020-10-20T22:47:56Z | 2021-01-18T17:45:00Z | 2020-10-25T00:14:52Z | OWNER | Currently you can only extract binary BLOB data as base64-encoded JSON, which is not user friendly at all. It should always be possible for end-users to get the binary data out. I'm worried about XSS vulnerabilities here, but hopefully sending `Content-Type: application/octet-stream` helps there? Need to research that. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1036/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
816526538 | MDU6SXNzdWU4MTY1MjY1Mzg= | 239 | sqlite-utils extract could handle nested objects | 9599 | open | 0 | 16 | 2021-02-25T15:10:28Z | 2022-09-03T23:46:02Z | OWNER | Imagine a table (imported from a nested JSON file) where one of the columns contains values that look like this: {"email": "anonymous@noreply.airtable.com", "id": "usrROSHARE0000000", "name": "Anonymous"} The `sqlite-utils extract` command already uses single text values in a column to populate a new table. It would not be much of a stretch for it to be able to use JSON instead, including specifying which of those values should be used as the primary key in the new table. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/239/reactions", "total_count": 6, "+1": 5, "-1": 0, "laugh": 0, "hooray": 1, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1095570074 | I_kwDOCGYnMM5BTRKa | 364 | `--batch-size 1` doesn't seem to commit for every item | 9599 | closed | 0 | 7558727 | 16 | 2022-01-06T18:18:50Z | 2022-01-10T19:27:17Z | 2022-01-10T05:36:19Z | OWNER | I'm trying this, but it doesn't seem to write anything to the database file until I hit `CTRL+C`: ``` heroku logs --app=simonwillisonblog --tail | grep 'measure#nginx.service' | \ sqlite-utils insert /tmp/herokutail.db log - --import re --convert "$(cat <<EOD r = re.compile(r'([^\s=]+)=(?:"(.*?)"|(\S+))') pairs = {} for key, value1, value2 in r.findall(line): pairs[key] = value1 or value2 return pairs EOD )" --lines --batch-size 1 ``` | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/364/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
1096558279 | I_kwDOCGYnMM5BXCbH | 365 | create-index should run analyze after creating index | 536941 | closed | 0 | 7558727 | 16 | 2022-01-07T18:21:25Z | 2022-01-11T02:43:34Z | 2022-01-11T01:36:48Z | CONTRIBUTOR | sqlite's query planner depends upon analyze to make good use of indices. It would be nice if analyze was run as part of the create-index command. If data is inserted later, things can get out date, but it would still probably be a net win. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/365/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
1408757705 | I_kwDOBm6k_c5T9-_J | 1843 | Intermittent "Too many open files" error running tests | 9599 | open | 0 | 16 | 2022-10-14T04:45:01Z | 2022-12-17T22:02:41Z | OWNER | Partial stack trace from one of them: ``` /Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/jinja2/loaders.py:200: in get_source f = open_if_exists(filename) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ filename = '/Users/simon/Dropbox/Development/datasette/datasette/templates/error.html', mode = 'rb' def open_if_exists(filename: str, mode: str = "rb") -> t.Optional[t.IO]: """Returns a file descriptor for the filename if that file exists, otherwise ``None``. """ if not os.path.isfile(filename): return None > return open(filename, mode) E OSError: [Errno 24] Too many open files: '/Users/simon/Dropbox/Development/datasette/datasette/templates/error.html' ``` | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1843/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
reopened | |||||||
1425029242 | I_kwDOBm6k_c5U8Dh6 | 1863 | Update a single record in an existing table | 9599 | closed | 0 | 8658075 | 16 | 2022-10-27T04:53:17Z | 2022-11-29T18:08:53Z | 2022-11-29T18:06:37Z | OWNER | API design: ``` POST /db/table/row-pks/-/update { "field": "updated_value" } ``` Only the fields that you pass will be updated. Maybe this is the wrong design though? The design for insert currently looks like this: - https://github.com/simonw/datasette/issues/1851#issuecomment-1294224185 ``` POST /db/table/-/insert Authorization: Bearer xxx Content-Type: application/json { "row": { "id": 1, "name": "New name" } } ``` I could use the same format for `/-/update`, but in this case the API doesn't require you to pass every field so `"row"` doesn't seem like the right key. I think I'll go with this: ``` POST /db/table/1/-/update Authorization: Bearer xxx Content-Type: application/json { "update": { "name": "New name" } } ``` The benefit of having an `"update"` key is that it allows me to use other keys in the future. Maybe a `"alter": true` key to indicate that new columns should be added if they are missing. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1863/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
1448143294 | I_kwDOBm6k_c5WUOm- | 1890 | Autocomplete text entry for filter values that correspond to facets | 536941 | closed | 0 | 16 | 2022-11-14T14:11:31Z | 2022-11-17T00:47:36Z | 2022-11-16T03:23:01Z | CONTRIBUTOR | datasette allows users to enter in the value for named parameters into a free-text form field. I think it would add a lot of usability, if the form field could be a drop down of options when query value is already a faceted column. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1890/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1726236847 | I_kwDOBm6k_c5m5Eiv | 2078 | Resolve the difference between `wrap_view()` and `BaseView` | 9599 | closed | 0 | 16 | 2023-05-25T17:44:32Z | 2023-05-26T00:18:46Z | 2023-05-26T00:18:46Z | OWNER | There are two patterns for implementing views in Datasette at the moment. I want to combine those. Part of: - #2053 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/2078/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
314665147 | MDU6SXNzdWUzMTQ2NjUxNDc= | 216 | Bug: Sort by column with NULL in next_page URL | 222245 | closed | 0 | 15 | 2018-04-16T14:03:18Z | 2018-04-17T01:45:24Z | 2018-04-17T01:45:24Z | NONE | Copy-pasting from https://github.com/simonw/datasette/issues/189#issuecomment-381429213, since that issue is closed: I think I found a bug. I tried to sort by middle initial in my salaries set, and many middle initials are null. The `next_url` gets set by Datasette to: http://localhost:8001/salaries-d3a5631/2017+Maryland+state+salaries?_next=None%2C391&_sort=middle_initial But then None is interpreted literally and it tries to find a name with the middle initial "None" and ends up skipping ahead to O on page 2. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/216/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
326800219 | MDU6SXNzdWUzMjY4MDAyMTk= | 292 | Mechanism for customizing the SQL used to select specific columns in the table view | 9599 | closed | 0 | 15 | 2018-05-27T09:05:52Z | 2021-05-27T04:25:01Z | 2021-05-27T04:25:01Z | OWNER | Some columns don't make a lot of sense in their default representation - binary blobs such as SpatiaLite geometries for example, or lengthy columns that really should be truncated somehow. We may also find that there are tables where we don't want to show all of the columns - so a mechanism to select a subset of columns would be nice. I think there are two features here: * the ability to request a subset of columns on the table view * the ability to override the SQL for a specific column and/or add extra columns - `AsGeoJSON(Geometry)` for example Both features should be available via both querystring arguments and in `metadata.json` The querystring argument for custom SQL should only work if `allow_sql` config is turned on. Refs #276 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/292/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
521868864 | MDU6SXNzdWU1MjE4Njg4NjQ= | 66 | The ".upsert()" method is misnamed | 9599 | closed | 0 | 15 | 2019-11-12T23:48:28Z | 2019-12-31T01:30:21Z | 2019-12-31T01:30:20Z | OWNER | This thread here is illuminating: https://stackoverflow.com/questions/3634984/insert-if-not-exists-else-update The term `UPSERT` in SQLite has a specific meaning as-of 3.24.0 (2018-06-04): https://www.sqlite.org/lang_UPSERT.html It means "behave as an UPDATE or a no-op if the INSERT would violate a uniqueness constraint". The syntax in 3.24.0+ looks like this (confusingly it does not use the term "upsert"): ```sql INSERT INTO phonebook(name,phonenumber) VALUES('Alice','704-555-1212') ON CONFLICT(name) DO UPDATE SET phonenumber=excluded.phonenumber ``` Here's the problem: the `sqlite-utils` `.upsert()` and `.upsert_all()` methods don't do this. They use the following SQL: ```sql INSERT OR REPLACE INTO [{table}] ({columns}) VALUES {rows}; ``` If the record already exists, it will be entirely replaced by a new record - as opposed to updating any specified fields but leaving existing fields as they are (the behaviour of "upsert" in SQLite itself). | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/66/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
564833696 | MDU6SXNzdWU1NjQ4MzM2OTY= | 670 | Prototoype for Datasette on PostgreSQL | 9599 | open | 0 | 15 | 2020-02-13T17:17:55Z | 2023-11-17T15:32:21Z | OWNER | I thought this would never happen, but now that I'm deep in the weeds of running SQLite in production for Datasette Cloud I'm starting to reconsider my policy of only supporting SQLite. Some of the factors making me think PostgreSQL support could be worth the effort: - Serverless. I'm getting increasingly excited about writable-database use-cases for Datasette. If it could talk to PostgreSQL then users could easily deploy it on Heroku or other serverless providers that can talk to a managed RDS-style PostgreSQL. - Existing databases. Plenty of organizations have PostgreSQL databases. They can export to SQLite using [db-to-sqlite](https://github.com/simonw/db-to-sqlite) but that's a pretty big barrier to getting started - being able to run `datasette postgresql://connection-string` and start trying it out would be a massively better experience. - Data size. I keep running into use-cases where I want to run Datasette against many GBs of data. SQLite can do this but PostgreSQL is much more optimized for large data, especially given the existence of tools like Citus. - Marketing. Convincing people to trust their data to SQLite is potentially a big barrier to adoption. Even if I've convinced myself it's trustworthy I still have to convince everyone else. - It might not be that hard? If this required a ground-up rewrite it wouldn't be worth the effort, but I have a hunch that it may not be too hard - most of the SQL in Datasette should work on both databases since it's almost all portable SELECT statements. If Datasette did DML this would be a lot harder, but it doesn't. - Plugins! This feels like a natural surface for a plugin - at which point people could add MySQL support and suchlike in the future. The above reasons feel strong enough to justify a prototype. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/670/reactions", "total_count": 19, "+1": 14, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 5, "rocket": 0, "eyes": 0 } |
||||||||
570309546 | MDU6SXNzdWU1NzAzMDk1NDY= | 685 | Document (and reconsider design of) Database.execute() and Database.execute_against_connection_in_thread() | 9599 | closed | 0 | 3268330 | 15 | 2020-02-25T04:49:44Z | 2020-05-30T13:20:50Z | 2020-05-08T17:42:18Z | OWNER | In #683 I started a new section of internals documentation covering the `Database` class: https://datasette.readthedocs.io/en/latest/internals.html#database-class I decided not to document `.execute()` and `.execute_against_connection_in_thread()` yet because I'm not 100% happy with their API design yet. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/685/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
585626199 | MDU6SXNzdWU1ODU2MjYxOTk= | 705 | latest.datasette.io is no longer updating | 9599 | closed | 0 | 5234079 | 15 | 2020-03-22T01:59:30Z | 2020-03-25T02:30:24Z | 2020-03-25T02:30:24Z | OWNER | https://latest.datasette.io/-/versions is stuck on 0.35. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/705/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
841377702 | MDU6SXNzdWU4NDEzNzc3MDI= | 251 | "sqlite-utils convert" command to replace the separate "sqlite-transform" tool | 9599 | closed | 0 | 15 | 2021-03-25T22:36:36Z | 2021-08-02T22:39:46Z | 2021-08-02T04:47:40Z | OWNER | See https://github.com/simonw/sqlite-transform/issues/11 - I built a separate `sqlite-transform` tool a while ago that uses the word "transform" to means something entirely different from `sqlite-utils transform` - I'd like to resolve this by merging the two tools. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/251/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1109808154 | I_kwDOBm6k_c5CJlQa | 1608 | Documentation should clarify /stable/ vs /latest/ | 9599 | closed | 0 | 15 | 2022-01-20T22:02:59Z | 2023-03-26T23:41:12Z | 2022-01-20T22:53:17Z | OWNER | It's not currently clear what the difference between https://docs.datasette.io/en/latest/ and https://docs.datasette.io/en/stable/ is - I should fix that. On Twitter: https://twitter.com/simonw/status/1484285006243528705 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1608/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1362402998 | I_kwDOBm6k_c5RNJ62 | 1802 | Tests reliably failing on Python 3.7 | 9599 | closed | 0 | 15 | 2022-09-05T19:21:16Z | 2022-09-06T00:40:20Z | 2022-09-06T00:40:20Z | OWNER | <img width="819" alt="image" src="https://user-images.githubusercontent.com/9599/188504608-5be0f941-dca2-4f42-83bc-3d04bbad221d.png"> https://github.com/simonw/datasette/runs/8194907739?check_suite_focus=true I thought this might be an intermittent failure but attempts to re-run the tests have not made it pass. End of that trace is: ``` /home/runner/work/datasette/datasette/datasette/app.py:234: in __init__ self._refresh_schemas_lock = asyncio.Lock() /opt/hostedtoolcache/Python/3.7.13/x64/lib/python3.7/asyncio/locks.py:161: in __init__ self._loop = events.get_event_loop() _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = <asyncio.unix_events._UnixDefaultEventLoopPolicy object at 0x7fb1fc799fd0> def get_event_loop(self): """Get the event loop for the current context. Returns an instance of EventLoop or raises an exception. """ if (self._local._loop is None and not self._local._set_called and isinstance(threading.current_thread(), threading._MainThread)): self.set_event_loop(self.new_event_loop()) if self._local._loop is None: raise RuntimeError('There is no current event loop in thread %r.' > % threading.current_thread().name) E RuntimeError: There is no current event loop in thread 'MainThread'. ``` | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1802/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1572766460 | I_kwDOCGYnMM5dvoL8 | 524 | Transformation type `--type DATETIME` | 21095447 | closed | 0 | 15 | 2023-02-06T15:18:42Z | 2023-02-15T12:10:54Z | 2023-02-15T12:10:54Z | NONE | Hey. Currently i do transformation with the type `--type TEXT`, but i noticed using the sqlalchemy based library [dataset](https://github.com/pudo/dataset) that is reading and writing differ depending on the column types `TEXT`, `DATETIME`. Is it possible to alter a column type to `DATETIME` somehow using Sqlite-Utils? | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/524/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
276842536 | MDU6SXNzdWUyNzY4NDI1MzY= | 153 | Ability to customize presentation of specific columns in HTML view | 20264 | closed | 0 | 2949431 | 14 | 2017-11-26T17:46:11Z | 2017-12-10T02:08:45Z | 2017-12-07T06:17:33Z | NONE | This ties into https://github.com/simonw/datasette/issues/3 in some ways. It would be great to have some adaptability in the HTML views and to specific some columns as displaying in certain ways. - [x] 1. **Auto-parsing URIs into in-browser links.** Why? Lots of public data around cultural commons stuff links to a specific URL. This would be a great utility to turn on at the command line, just parse everything for URLs. Maybe they need to be underlined or represented in a different way than internal URLs. - [x] 2. **Ability to identify a column as plain/preformatted text.** Why? Was trying to import the Enron emails, the body collapses. Hard to read. These fields also tend to screw up the ability to scan a table view. If you knew it was text the system could set an `overflow` property on the relevant CSS, so you could still scan. - [x] 3. **Ability to identify a column as HTML.** Why? I want to spider some stuff and drop sections into SQLite, and just keep them as HTML. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/153/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
314506669 | MDU6SXNzdWUzMTQ1MDY2Njk= | 215 | Allow plugins to define additional URL routes and views | 9599 | closed | 0 | 5512395 | 14 | 2018-04-16T05:31:09Z | 2020-06-09T03:14:32Z | 2020-06-09T03:12:08Z | OWNER | Might be as simple as having plugins get passed the `app` after the other routes have been defined: https://github.com/simonw/datasette/blob/b2955d9065ea019500c7d072bcd9d49d1967f051/datasette/app.py#L1270-L1274 Refs #14 | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/215/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
455486286 | MDU6SXNzdWU0NTU0ODYyODY= | 26 | Mechanism for turning nested JSON into foreign keys / many-to-many | 9599 | open | 0 | 14 | 2019-06-13T00:52:06Z | 2022-06-29T23:35:29Z | OWNER | The GitHub JSON APIs have a really interesting convention with respect to related objects. Consider https://api.github.com/repos/simonw/sqlite-utils/issues - here's a truncated subset: ```json { "id": 449818897, "node_id": "MDU6SXNzdWU0NDk4MTg4OTc=", "number": 24, "title": "Additional Column Constraints?", "user": { "login": "IgnoredAmbience", "id": 98555, "node_id": "MDQ6VXNlcjk4NTU1", "avatar_url": "https://avatars0.githubusercontent.com/u/98555?v=4", "gravatar_id": "" }, "labels": [ { "id": 993377884, "node_id": "MDU6TGFiZWw5OTMzNzc4ODQ=", "url": "https://api.github.com/repos/simonw/sqlite-utils/labels/enhancement", "name": "enhancement", "color": "a2eeef", "default": true } ], "state": "open" } ``` The `user` column lists a complete user. The `labels` column has a list of labels. Since both user and label have populated `id` field this is actually enough information for us to create records for them AND set up the corresponding foreign key (for user) and m2m relationships (for labels). It would be really neat if `sqlite-utils` had some kind of mechanism for correctly processing these kind of patterns. Thanks to `jq` there's not much need for extra customization of the shape here - if we support a narrowly defined structure users can use `jq` to reshape arbitrary JSON to match. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/26/reactions", "total_count": 4, "+1": 4, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
520655983 | MDU6SXNzdWU1MjA2NTU5ODM= | 619 | "Invalid SQL" page should let you edit the SQL | 9599 | closed | 0 | 14 | 2019-11-10T20:54:12Z | 2022-01-13T22:21:42Z | 2021-06-02T04:15:54Z | OWNER | https://latest.datasette.io/fixtures?sql=select%0D%0A++*%0D%0Afrom%0D%0A++%5Bfoo%5D <img src="https://user-images.githubusercontent.com/9599/68550604-1ab31b00-03b9-11ea-8a33-5dc339b8c5f5.jpeg" width="500"> Would be useful if this page showed you the invalid SQL you entered so you can edit it and try again. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/619/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
602533539 | MDU6SXNzdWU2MDI1MzM1Mzk= | 4 | Upload all my photos to a secure S3 bucket | 9599 | closed | 0 | 5324096 | 14 | 2020-04-18T19:24:50Z | 2020-04-18T21:58:11Z | 2020-04-18T21:57:13Z | MEMBER | - [x] Create a bucket with bucket credentials - [x] Programmatically upload some recent photos to it (from a notebook) - [x] Turn this into a script | 256834907 | issue | { "url": "https://api.github.com/repos/dogsheep/dogsheep-photos/issues/4/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
631931408 | MDU6SXNzdWU2MzE5MzE0MDg= | 800 | Canned query permissions mechanism | 9599 | closed | 0 | 5512395 | 14 | 2020-06-05T20:28:21Z | 2020-06-07T16:22:53Z | 2020-06-07T16:22:53Z | OWNER | > Idea: default is anyone can execute a query. > > Or you can specify the following: > > ```json > > { > "databases": { > "my-database": { > "queries": { > "add_twitter_handle": { > "sql": "insert into twitter_handles (username) values (:username)", > "write": true, > "allow": { > "id": ["simon"], > "role": ["staff"] > } > } > } > } > } > } > ``` > These get matched against the actor JSON. If any of the fields in any of the keys of `"allow"` match a key on the actor, the query is allowed. > > `"id": "*"` matches any actor with an `id` key. _Originally posted by @simonw in https://github.com/simonw/datasette/issues/698#issuecomment-639784651_ | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/800/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
637395097 | MDU6SXNzdWU2MzczOTUwOTc= | 838 | Incorrect URLs when served behind a proxy with base_url set | 79913 | closed | 0 | 6026070 | 14 | 2020-06-11T23:58:55Z | 2021-11-20T19:35:48Z | 2021-11-20T19:35:48Z | NONE | I'm running `datasette serve --config base_url:/foo/ …`, proxying to it with this Apache config: ProxyPass /foo/ http://localhost:8001/ ProxyPassReverse /foo/ http://localhost:8001/ and then accessing it via `https://example.com/foo/`. Although many of the URLs in the pages are correct (presumably because they either use absolute paths which include `base_url` or relative paths), the faceting and pagination links still use fully-qualified URLs pointing at `http://localhost:8001`. I looked into this a little in the source code, and it seems to be an issue anywhere `request.url` or `request.path` is used, as these contain the values for the request between the frontend (Apache) and backend (Datasette) server. Those properties are primarily used via the `path_with_…` family of utility functions and the `Datasette.absolute_url` method. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/838/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | |||||
647095487 | MDU6SXNzdWU2NDcwOTU0ODc= | 873 | "datasette -p 0 --root" gives the wrong URL | 9599 | open | 0 | 14 | 2020-06-29T04:03:06Z | 2020-08-18T17:26:10Z | OWNER | ``` $ datasette -p 0 --root http://127.0.0.1:0/-/auth-token?token=2d498c... ``` The port is incorrect. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/873/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
727802081 | MDU6SXNzdWU3Mjc4MDIwODE= | 1042 | Plugin hook for loading templates | 9599 | closed | 0 | 6026070 | 14 | 2020-10-23T00:18:39Z | 2020-10-30T17:47:21Z | 2020-10-30T17:47:20Z | OWNER | This can work with the Jinja template loaders. It would unlock things like storing templates in SQLite. | 107914493 | issue | { "url": "https://api.github.com/repos/simonw/datasette/issues/1042/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed |