issue_comments
24 rows where issue = 1855885427 sorted by updated_at descending
issue 1
- De-tangling Metadata before Datasette 1.0 · 24 ✖
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
1692210044 | https://github.com/simonw/datasette/issues/2143#issuecomment-1692210044 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5k3RN8 | simonw 9599 | 2023-08-24T18:28:27Z | 2023-08-24T18:28:27Z | OWNER | Just spotted this: https://github.com/simonw/datasette/blob/17ec309e14f9c2e90035ba33f2f38ecc5afba2fa/datasette/app.py#L328-L332 Looks to me like that second bit of code doesn't yet handle This code does though:
https://github.com/simonw/datasette/blob/d97e82df3c8a3f2e97038d7080167be9bb74a68d/datasette/utils/__init__.py#L980-L990 That So we should rename it to something better like |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | |
1692182910 | https://github.com/simonw/datasette/issues/2143#issuecomment-1692182910 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5k3Kl- | simonw 9599 | 2023-08-24T18:06:57Z | 2023-08-24T18:08:17Z | OWNER | The other thing that could work is something like this:
I quite like this, because it could replace the really ugly |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | |
1692180683 | https://github.com/simonw/datasette/issues/2143#issuecomment-1692180683 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5k3KDL | simonw 9599 | 2023-08-24T18:05:17Z | 2023-08-24T18:05:17Z | OWNER | That's a really good call, thanks @rclement - environment variable configuration totally makes sense here. Need to figure out the right syntax for that. Something like this perhaps:
I checked and |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | |
1691094870 | https://github.com/simonw/datasette/issues/2143#issuecomment-1691094870 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5kzA9W | rclement 1238873 | 2023-08-24T06:43:40Z | 2023-08-24T06:43:40Z | NONE | If I may, the "path-like" configuration is great but one thing that would be even greater: allowing the same configuration to be provided using environment variables. For instance:
could also be provided using:
(I do not like mixing FYI, you could take some inspiration from another great open source data project, Metabase: https://www.metabase.com/docs/latest/configuring-metabase/config-file https://www.metabase.com/docs/latest/configuring-metabase/environment-variables |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | |
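One way the environment-variable idea above could feed the same key-value machinery discussed in this thread is sketched below. The `MYAPP_` prefix, the `__` separator and the `env_to_pairs` name are all hypothetical choices for illustration; nothing in the thread settles on a syntax, and this is not Datasette's behaviour.

```python
from typing import List, Mapping, Tuple


def env_to_pairs(
    environ: Mapping[str, str],
    prefix: str = "MYAPP_",  # hypothetical prefix, not a Datasette convention
    sep: str = "__",         # hypothetical nesting separator
) -> List[Tuple[str, str]]:
    """Translate prefixed environment variables into (dotted_key, raw_value) pairs."""
    pairs = []
    for name in sorted(environ):
        # Ignore unrelated variables and a bare prefix with no key after it.
        if not name.startswith(prefix) or len(name) == len(prefix):
            continue
        # MYAPP_SETTINGS__DEFAULT_PAGE_SIZE -> settings.default_page_size
        dotted_key = ".".join(name[len(prefix):].lower().split(sep))
        pairs.append((dotted_key, environ[name]))
    return pairs


pairs = env_to_pairs(
    {"MYAPP_SETTINGS__DEFAULT_PAGE_SIZE": "10", "PATH": "/usr/bin"}
)
```

The resulting pairs have the same shape as command-line key-value pairs, so both sources could share one parser. A known limitation of this sketch: lowercasing the name loses case, and keys containing hyphens or dots cannot be expressed this way.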
1690800119 | https://github.com/simonw/datasette/issues/2143#issuecomment-1690800119 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5kx4_3 | simonw 9599 | 2023-08-24T00:10:32Z | 2023-08-24T00:39:00Z | OWNER | Something notable about this design is that, because the values in the key-value pairs are treated as JSON first and then strings only if they don't parse cleanly as JSON, it's possible to represent any structure (including nesting structures) using this syntax. You can do things like this if you need to (settings for an imaginary plugin):
That previous design was meant to support round-trips, so you could take any nested JSON object and turn it into an HTML form or query string where every value can have its own form field, then turn the result back again. For the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | |
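The "JSON first, then string" rule described in this comment can be illustrated with a minimal sketch. The function names and the `hypothetical-plugin` settings are made up for the example; this is one possible reading of the design, not the code from the thread.

```python
import json
from typing import Any, Dict, Iterable, Tuple


def parse_value(raw: str) -> Any:
    """Try JSON first; fall back to the raw string if it does not parse."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return raw


def pairs_to_config(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Any]:
    """Build one nested dict from dotted keys and JSON-or-string values."""
    config: Dict[str, Any] = {}
    for dotted_key, raw in pairs:
        *parents, leaf = dotted_key.split(".")
        node = config
        for part in parents:
            # Create intermediate dicts on demand, replacing any non-dict value.
            if not isinstance(node.get(part), dict):
                node[part] = {}
            node = node[part]
        node[leaf] = parse_value(raw)
    return config


example = pairs_to_config(
    [
        ("settings.default_page_size", "10"),
        ("plugins.hypothetical-plugin.tags", '["a", "b"]'),
        ("plugins.hypothetical-plugin.label", "plain text"),
    ]
)
```

Because the value side accepts any JSON, a whole nested object can also be passed as a single value. In this sketch a later pair simply overwrites whatever an earlier pair stored at the same path.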
1690800641 | https://github.com/simonw/datasette/issues/2143#issuecomment-1690800641 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5kx5IB | simonw 9599 | 2023-08-24T00:11:16Z | 2023-08-24T00:11:16Z | OWNER |
That's a neat example thanks! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | |
1690799608 | https://github.com/simonw/datasette/issues/2143#issuecomment-1690799608 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5kx434 | pkulchenko 77071 | 2023-08-24T00:09:47Z | 2023-08-24T00:10:41Z | NONE | @simonw, FWIW, I do exactly the same thing for one of my projects (both to allow multiple configuration files to be passed on the command line and setting individual values) and it works quite well for me and my users. I even use the same parameter name for both (https://studio.zerobrane.com/doc-configuration#configuration-via-command-line), but I understand why you may want to use different ones for files and individual values. There is one small difference that I accept code snippets, but I don't think it matters much in this case. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | |
1690792514 | https://github.com/simonw/datasette/issues/2143#issuecomment-1690792514 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5kx3JC | simonw 9599 | 2023-08-24T00:00:16Z | 2023-08-24T00:02:55Z | OWNER | I've been thinking about what it might look like to allow command-line arguments to be used to define any of the configuration options in Here's what I've come up with:
import json
from typing import Any, List, Tuple


def _handle_pair(key: str, value: str) -> dict:
    """
    Turn a key-value pair into a nested dictionary.
    foo, bar => {'foo': 'bar'}
    foo.bar, baz => {'foo': {'bar': 'baz'}}
    foo.bar, [1, 2, 3] => {'foo': {'bar': [1, 2, 3]}}
    foo.bar, "baz" => {'foo': {'bar': 'baz'}}
    foo.bar, '{"baz": "qux"}' => {'foo': {'bar': "{'baz': 'qux'}"}}
    """
    try:
        value = json.loads(value)
    except json.JSONDecodeError:
        # If it doesn't parse as JSON, treat it as a string
        pass

    # The rest of this function is missing from this export; it is
    # reconstructed here from the docstring: split the dotted key and
    # nest the value under each segment.
    keys = key.split(".")
    result = current = {}
    for k in keys[:-1]:
        current[k] = {}
        current = current[k]
    current[keys[-1]] = value
    return result


def _combine(base: dict, update: dict) -> dict:
    """
    Recursively merge two dictionaries.
    """
    for key, value in update.items():
        if isinstance(value, dict) and key in base and isinstance(base[key], dict):
            base[key] = _combine(base[key], value)
        else:
            base[key] = value
    return base


def handle_pairs(pairs: List[Tuple[str, Any]]) -> dict:
    """
    Parse a list of key-value pairs into a nested dictionary.
    """
    result = {}
    for key, value in pairs:
        parsed_pair = _handle_pair(key, value)
        result = _combine(result, parsed_pair)
    return result
Although... we could keep compatibility by saying that if you call |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | |
1690787394 | https://github.com/simonw/datasette/issues/2143#issuecomment-1690787394 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5kx15C | simonw 9599 | 2023-08-23T23:52:02Z | 2023-08-23T23:52:02Z | OWNER |
Having multiple configs that combine in that way is a really interesting direction.
I'm very keen on separating out the "metadata" - where metadata is the slimmest possible set of things, effectively the data license and the source and the column and table descriptions - from everything else, mainly because I want metadata to be able to travel with the data. One idea that's been discussed before is having an optional mechanism for storing metadata in the SQLite database file itself - potentially in a That's why I'm so keen on splitting out metadata from all of the other stuff - settings and plugin configuration and authentication rules. So really it becomes "true metadata" vs. "all of the other junk that's accumulated in metadata and |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | |
1685263948 | https://github.com/simonw/datasette/issues/2143#issuecomment-1685263948 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5kcxZM | dvizard 11784304 | 2023-08-20T11:50:10Z | 2023-08-20T11:50:10Z | NONE | This also makes it simple to separate out secrets.
settings.yaml
secrets.yaml
db-docs.yaml
db-fixtures.yaml
|
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | |
1685260624 | https://github.com/simonw/datasette/issues/2143#issuecomment-1685260624 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5kcwlQ | dvizard 11784304 | 2023-08-20T11:31:16Z | 2023-08-20T11:31:16Z | NONE | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | ||
1685260244 | https://github.com/simonw/datasette/issues/2143#issuecomment-1685260244 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5kcwfU | dvizard 11784304 | 2023-08-20T11:29:00Z | 2023-08-20T11:29:00Z | NONE | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | ||
1685259985 | https://github.com/simonw/datasette/issues/2143#issuecomment-1685259985 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5kcwbR | dvizard 11784304 | 2023-08-20T11:27:21Z | 2023-08-20T11:27:21Z | NONE | To chime in from a poweruser perspective: I'm worried that this is an overengineering trap. Yes, the current solution is somewhat messy. But there are datasette-wide settings, there are database-scope settings, there are table-scope settings etc, but then there are database-scope metadata and table-scope metadata. Trying to cleanly separate "settings" from "configuration" is, I believe, an uphill fight. Even separating db/table-scope settings from pure descriptive metadata is not always easy. Like, do canned queries belong to database metadata or to settings? Do I need two separate files for this? One pragmatic solution I used in a project is stacking yaml configuration files. Basically, have an arbitrary number of yaml or json settings files that you load in a specified order. Every file adds to the corresponding settings in the earlier-loaded file (if it already existed). I implemented this myself but found later that there is an existing Python "cascading dict" type of thing, I forget what it's called. There is a bit of a challenge deciding whether there is "replacement" or "addition" (I think I pragmatically ran This way, one allows separation of settings into different blocks, while not imposing a specific idea of what belongs where that might not apply equally to all cases. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | |
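The stacking approach described in this comment might look roughly like the sketch below. It assumes each file has already been loaded into a plain dict, and it settles the "replacement or addition" question one particular way: nested dicts are merged key by key, while every other value (lists included) is replaced by the later layer. The function names and the `hypothetical-plugin` keys are illustrative only.

```python
import copy
from typing import Any, Dict, Iterable


def merge_layer(base: Dict[str, Any], layer: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict with `layer` applied on top of `base`."""
    merged = dict(base)
    for key, value in layer.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge them key by key.
            merged[key] = merge_layer(merged[key], value)
        else:
            # Anything else (scalars, lists, type mismatches) is replaced.
            merged[key] = copy.deepcopy(value)
    return merged


def stack_configs(layers: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Apply layers in order; later layers win on conflicts."""
    result: Dict[str, Any] = {}
    for layer in layers:
        result = merge_layer(result, layer)
    return result


settings = {
    "settings": {"default_page_size": 10},
    "plugins": {"hypothetical-plugin": {"enabled": True}},
}
secrets = {"plugins": {"hypothetical-plugin": {"api_key": "not-a-real-key"}}}
overrides = {"settings": {"default_page_size": 50}}

stacked = stack_configs([settings, secrets, overrides])
```

The input layers are left untouched, so the same secrets layer could be reused across several stacks.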
1684496274 | https://github.com/simonw/datasette/issues/2143#issuecomment-1684496274 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5kZ1-S | asg017 15178711 | 2023-08-18T22:30:45Z | 2023-08-18T22:30:45Z | CONTRIBUTOR |
Does this include things like Well it could work with |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | |
1684488526 | https://github.com/simonw/datasette/issues/2143#issuecomment-1684488526 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5kZ0FO | simonw 9599 | 2023-08-18T22:18:39Z | 2023-08-18T22:18:39Z | OWNER |
I'm not a fan of that. I feel like software history is full of examples of projects that implemented configuration-as-code and then later regretted it - the most recent example is I don't think having people dynamically generate JSON/YAML for their configuration is a big burden. I'd have to see some very compelling use-cases to convince me otherwise. That said, I do really like a bias towards settings that can be changed at runtime. Datasette has suffered a bit from some settings that can't be easily changed at runtime already - hence my gnarly https://github.com/simonw/datasette-remote-metadata plugin. For things like Datasette Cloud for example the more people can configure without rebooting their container the better! I don't think live reconfiguration at runtime is incompatible with JSON/YAML configuration though. Caddy is one of my favourite examples of software that can be entirely re-configured at runtime by POSTING a big blob of JSON to it: https://caddyserver.com/docs/quick-starts/api |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | |
1684485591 | https://github.com/simonw/datasette/issues/2143#issuecomment-1684485591 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5kZzXX | simonw 9599 | 2023-08-18T22:14:35Z | 2023-08-18T22:14:35Z | OWNER | Actually there is one thing that I'm not comfortable about with respect to the existing design: the way the database / tables stuff is nested. They assume that the user will attach the database to Datasette using a fixed name - But what if we want to support users downloading databases from each other and attaching them to Datasette where those DBs might carry some of their own configuration? Moving metadata into the databases makes sense there, but what about database-specific settings like the default sort order for a table, or configured canned queries? Having those tied to the filename of the database itself feels unpleasant to me. But how else could we handle this? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | |
1684484426 | https://github.com/simonw/datasette/issues/2143#issuecomment-1684484426 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5kZzFK | simonw 9599 | 2023-08-18T22:12:52Z | 2023-08-18T22:12:52Z | OWNER | Yeah, I'm convinced by that. There's no point in having both I like Here's a thought for how it could look - I'll go with the YAML format because I expect that to be the default most people use, just because it supports multi-line strings better. I based this on the big example at https://docs.datasette.io/en/1.0a3/metadata.html#using-yaml-for-metadata - and combined some bits from https://docs.datasette.io/en/1.0a3/authentication.html as well. ```yaml
title: Demonstrating Metadata from YAML
description_html: |-
  This description includes a long HTML string
settings:
  default_page_size: 10
  max_returned_rows: 3000
  sql_time_limit_ms: 8000
databases:
  docs:
    permissions:
      create-table:
        id: editor
  fixtures:
    tables:
      no_primary_key:
        hidden: true
    queries:
      neighborhood_search:
        sql: |-
          select neighborhood, facet_cities.name, state
          from facetable
            join facet_cities on facetable.city_id = facet_cities.id
          where neighborhood like '%' || :text || '%'
          order by neighborhood;
        title: Search neighborhoods
        description_html: |-
          This demonstrates basic LIKE search
permissions:
  debug-menu:
    id: '*'
plugins:
  datasette-ripgrep:
    path: /usr/local/lib/python3.11/site-packages
```
In this example I've mixed in one extra concept: that There are some things in there that look a little bit like metadata - the But are they metadata? The title and description of the overall instance feels like it could be described as general configuration. The stuff for the Note that queries can be defined by a plugin hook too: https://docs.datasette.io/en/1.0a3/plugin_hooks.html#canned-queries-datasette-database-actor What do you think? Is this the right direction, or are you thinking there's a more radical redesign that would make sense here? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | |
1684205563 | https://github.com/simonw/datasette/issues/2143#issuecomment-1684205563 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5kYu_7 | asg017 15178711 | 2023-08-18T17:12:54Z | 2023-08-18T17:12:54Z | CONTRIBUTOR | Another option would be, instead of flat Though I imagine Python imports might make this complex to do, and json/yaml is already supported and pretty easy to write |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | |
1684202932 | https://github.com/simonw/datasette/issues/2143#issuecomment-1684202932 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5kYuW0 | asg017 15178711 | 2023-08-18T17:10:21Z | 2023-08-18T17:10:21Z | CONTRIBUTOR | I agree with all your points! I think the best solution would be having a Then optionally, you have a Everything in We could even completely remove |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | |
1683429959 | https://github.com/simonw/datasette/issues/2143#issuecomment-1683429959 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5kVxpH | simonw 9599 | 2023-08-18T06:43:33Z | 2023-08-18T15:19:07Z | OWNER | The single biggest design challenge I've had with metadata relates to how it should or should not be inherited. If you apply a license to a Datasette instance, it feels like that should flow down to cover all of the databases and all of the tables within those databases. If the license is at the database level, it should cover all tables. But... should source do the same thing? I made it behave the same way as license, but it's presumably common for one database to have a single license but multiple different sources of data. Then there's title - should that inherit? It feels like title should apply to only one level - you may want a title that applies to the instance, then a different custom title for databases and tables. Here's the current state of play for metadata: https://docs.datasette.io/en/1.0a3/metadata.html So there's There's Then there are these six:
I added Tables can also have column descriptions - just a string for each column. There's a demo of those here: https://latest.datasette.io/fixtures/roadside_attractions And then there's all of the other stuff, most of which feels much more like "settings" than "metadata":
And the authentication stuff! And the new I think that might be everything (excluding the And to make things even more confusing... I believe you can add arbitrary key/value pairs to your metadata and then use them in your templates! I think I've heard from at least one person who uses that ability. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | |
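To make the inheritance question in this comment concrete, here is a small sketch of a lookup in which some keys flow down from instance to database to table while others apply at one level only. The `INHERITED_KEYS` set and the `resolve` function are hypothetical. Treating only the license keys as inherited is a choice made for illustration; the comment says Datasette currently makes source behave like license.

```python
from typing import Any, Dict, Optional

# Hypothetical choice: which keys flow down to more specific levels.
INHERITED_KEYS = {"license", "license_url"}


def resolve(
    metadata: Dict[str, Any],
    key: str,
    database: Optional[str] = None,
    table: Optional[str] = None,
) -> Any:
    """Look up `key` at the requested level, falling back only for inherited keys."""
    levels = [metadata]  # instance level first
    if database is not None:
        db_meta = metadata.get("databases", {}).get(database, {})
        levels.append(db_meta)
        if table is not None:
            levels.append(db_meta.get("tables", {}).get(table, {}))

    most_specific = levels[-1]
    if key in most_specific:
        return most_specific[key]
    if key in INHERITED_KEYS:
        # Walk outwards: table -> database -> instance.
        for level in reversed(levels[:-1]):
            if key in level:
                return level[key]
    return None


metadata = {
    "title": "Instance title",
    "license": "CC BY 4.0",
    "databases": {
        "fixtures": {
            "source": "Example source",
            "tables": {"facetable": {"title": "Facet table"}},
        }
    },
}
```

With this data, the license set on the instance is visible from the `facetable` table, while the instance title is not reported as the title of the `fixtures` database.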
1683420879 | https://github.com/simonw/datasette/issues/2143#issuecomment-1683420879 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5kVvbP | simonw 9599 | 2023-08-18T06:33:24Z | 2023-08-18T15:15:34Z | OWNER | I completely agree: metadata is a mess, and it deserves our attention.
That's not completely true - there are hacks around that. I have a plugin that applies one set of gnarly hacks for that here: https://github.com/simonw/datasette-remote-metadata - it's pretty grim though!
100% this: it's a complete mess. Datasette used to have a
Yes, they're not pretty at all. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | |
1683443891 | https://github.com/simonw/datasette/issues/2143#issuecomment-1683443891 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5kV1Cz | simonw 9599 | 2023-08-18T06:58:15Z | 2023-08-18T06:58:15Z | OWNER | Hah, that Hence the whole
If configuration and metadata were separate we could ditch that whole messy situation - configuration can stay hidden, metadata can stay public. Though I have been thinking that Datasette might benefit from a "secrets" mechanism that's separate from configuration and metadata... kind of like what LLM has: https://llm.datasette.io/en/stable/help.html#llm-keys-help |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | |
1683440597 | https://github.com/simonw/datasette/issues/2143#issuecomment-1683440597 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5kV0PV | simonw 9599 | 2023-08-18T06:54:49Z | 2023-08-18T06:54:49Z | OWNER | A related point that I've been considering a lot recently: it turns out that sometimes I really want to define settings on the CLI instead of in a file, purely for convenience. It's pretty annoying when I want to try out a new plugin but I have to create a dedicated
So maybe there's a world in which all of the settings can be applied in a That gets trickier when you need to pass a nested structure or similar, but we could always support those as JSON:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 | |
1683435579 | https://github.com/simonw/datasette/issues/2143#issuecomment-1683435579 | https://api.github.com/repos/simonw/datasette/issues/2143 | IC_kwDOBm6k_c5kVzA7 | simonw 9599 | 2023-08-18T06:49:39Z | 2023-08-18T06:49:39Z | OWNER | My ideal situation then would be something like this:
Currently we have three types of things:
Should settings and configuration be separate? I'm not 100% sure that they should - maybe those two concepts should be combined somehow. Configuration directory mode needs to be considered too: https://docs.datasette.io/en/stable/settings.html#configuration-directory-mode - interestingly it already has a thing where it can pick up settings from a |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
De-tangling Metadata before Datasette 1.0 1855885427 |
CREATE TABLE [issue_comments] ( [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY, [node_id] TEXT, [user] INTEGER REFERENCES [users]([id]), [created_at] TEXT, [updated_at] TEXT, [author_association] TEXT, [body] TEXT, [reactions] TEXT, [issue] INTEGER REFERENCES [issues]([id]) , [performed_via_github_app] TEXT); CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]); CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);