{"id": 1855885427, "node_id": "I_kwDOBm6k_c5unpBz", "number": 2143, "title": "De-tangling Metadata before Datasette 1.0", "user": {"value": 15178711, "label": "asg017"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 24, "created_at": "2023-08-18T00:51:50Z", "updated_at": "2023-08-24T18:28:27Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": null, "body": "Metadata in Datasette is a really powerful feature, but is a bit difficult to work with. It was initially a way to add \"metadata\" about your \"data\" in Datasette instances, like descriptions for databases/tables/columns, titles, source URLs, licenses, etc. But it later became the go-to spot for other Datasette features that have nothing to do with metadata, like permissions/plugins/canned queries. \r\n\r\nSpecifically, I've found the following problems when working with Datasette metadata:\r\n\r\n1. Metadata cannot be updated without re-starting the entire Datasette instance.\r\n2. The `metadata.json`/`metadata.yaml` has become a kitchen sink of unrelated (imo) features like plugin config, authentication config, canned queries\r\n3. The Python APIs for defining extra metadata are a bit awkward (the `datasette.metadata()` class, `get_metadata()` hook, etc.)\r\n\r\n## Possible solutions\r\n\r\nHere's a few ideas of Datasette core changes we can make to address these problems. \r\n\r\n### Re-vamp the Datasette Python metadata APIs\r\n\r\nThe Datasette object has a single `datasette.metadata()` method that's a bit difficult to work with. There's also no Python API for inserted new metadata, so plugins have to rely on the `get_metadata()` hook.\r\n\r\nThe `get_metadata()` hook can also be improved - it doesn't work with async functions yet, so you're quite limited to what you can do.\r\n\r\n(I'm a bit fuzzy on what to actually do here, but I imagine it'll be very small breaking changes to a few Python methods)\r\n\r\n### Add an optional `datasette_metadata` table\r\n\r\nDatasette should detect and use metadata stored in a new special table called `datasette_metadata`. This would be a regular table that a user can edit on their own, and would serve as a \"live updating\" source of metadata, than can be changed while the Datasette instance is running.\r\n\r\nNot too sure what the schema would look like, but I'd imagine:\r\n\r\n```sql\r\nCREATE TABLE datasette_metadata(\r\n level text,\r\n target any,\r\n key text,\r\n value any,\r\n primary key (level, target)\r\n)\r\n```\r\n\r\nEvery row in this table would map to a single metadata \"entry\".\r\n\r\n- `level` would be one of \"datasette\", \"database\", \"table\", \"column\", which is the \"level\" the entry describes. For example, `level=\"table\"` means it is metadata about a specific table, `level=\"database\"` for a specific database, or `level=\"datasette\"` for the entire Datasette instance.\r\n- `target` would \"point\" to the specific object the entry metadata is about, and would depend on what `level` is specific. \r\n - `level=\"database\"`: `target` would be the string name of the database that the metadata entry is about. ex `\"fixtures\"`\r\n - `level=\"table\"`: `target` would be a JSON array of two strings. The first element would be the database name, and the second would be the table name. ex `[\"fixtures\", \"students\"]`\r\n - `level=\"column\"`: `target` would be a JSON array of 3 strings: The database name, table name, and column name. Ex `[\"fixtures\", \"students\", \"student_id\"`]\r\n- `key` would be the type of metadata entry the row has, similar to the current \"keys\" that exist in `metadata.json`. Ex `\"about_url\"`, `\"source\"`, `\"description\"`, etc\r\n- `value` would be the text value of be metadata entry. The literal text value of a description, about_url, column_label, etc\r\n\r\nA quick sample:\r\n\r\nlevel | target | key | value\r\n-- | -- | -- | --\r\ndatasette | NULL | title | my datasette title...\r\ndb | fixtures | source | \r\ntable | [\"fixtures\", \"students\"] | label_column | student_name\r\ncolumn | [\"fixtures\", \"students\", \"birthdate\"] | description | \r\n\r\nThis `datasette_metadata` would be configured with other tools, and hopefully not manually by end users. Datasette Core could also offer a UI for editing entries in `datasette_metadata`, to update descriptions/columns on the fly.\r\n\r\n### Re-vamp `metadata.json` and move non-metadata config to another place\r\n\r\nThe motivation behind this is that it's awkward that `metadata.json` contains config about things that are not strictly metadata, including:\r\n\r\n- Plugin configuration\r\n- [Authentication/permissions](https://docs.datasette.io/en/latest/authentication.html#access-permissions-in-metadata) (ex the `allow` key on datasettes/databases/tables\r\n- Canned queries. might be controversial, but in my mind, canned queries are application-specific code and configuration, and don't describe the data that exists in SQLite databases. \r\n\r\nI think we should move these outside of `metadata.json` and into a different file. The `datasette.json` idea in #2093 may be a good solution here: plugin/permissions/canned queries can be defined in `datasette.json`, while `metadata.json`/`datasette_metadata` will strictly be about documenting databases/tables/columns. \r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2143/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1427293909, "node_id": "I_kwDOBm6k_c5VEsbV", "number": 1871, "title": "API explorer tool", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": {"value": 8658075, "label": "Datasette 1.0a0"}, "comments": 24, "created_at": "2022-10-28T13:49:11Z", "updated_at": "2022-11-15T19:59:05Z", "closed_at": "2022-11-14T04:59:59Z", "author_association": "OWNER", "pull_request": null, "body": "The API will be much easier to develop if there's a page that helps you execute JSON POSTs against it.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/1871/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 502993509, "node_id": "MDU6SXNzdWU1MDI5OTM1MDk=", "number": 581, "title": "Redesign register_output_renderer callback", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": {"value": 5471110, "label": "Datasette 0.43"}, "comments": 24, "created_at": "2019-10-05T17:43:23Z", "updated_at": "2020-05-28T02:24:14Z", "closed_at": "2020-05-28T02:21:50Z", "author_association": "OWNER", "pull_request": null, "body": "In building https://github.com/simonw/datasette-atom it became clear that the callback function (which currently accepts just args, data and view_name) would also benefit from access to a mechanism to render templates and a `datasette` instance so it can execute SQL.\r\n\r\nTo maintain backwards compatibility with existing plugins, we can introspect the callback function to see if it wants those new arguments or not.\r\n\r\nAt a minimum I want to make `datasette` and ASGI `scope` available.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/581/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"}