issue_comments


51 rows where "created_at" is on date 2023-08-18 sorted by updated_at descending


issue 8

  • If a row has a primary key of `null` various things break 20
  • De-tangling Metadata before Datasette 1.0 11
  • .transform() instead of modifying sqlite_master for add_foreign_keys 7
  • CLI equivalents to `transform(add_foreign_keys=)` 7
  • .transform() fails to drop column if table is part of a view 3
  • Get `add_foreign_keys()` to work without modifying `sqlite_master` 1
  • Bump the python-packages group with 2 updates 1
  • Bump the python-packages group with 3 updates 1

user 4

  • simonw 46
  • asg017 3
  • codecov[bot] 1
  • dependabot[bot] 1

author_association 3

  • OWNER 46
  • CONTRIBUTOR 4
  • NONE 1
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1684530060 https://github.com/simonw/datasette/issues/2145#issuecomment-1684530060 https://api.github.com/repos/simonw/datasette/issues/2145 IC_kwDOBm6k_c5kZ-OM simonw 9599 2023-08-18T23:09:03Z 2023-08-18T23:09:14Z OWNER

Ran a quick benchmark on ChatGPT Code Interpreter: https://chat.openai.com/share/8357dc01-a97e-48ae-b35a-f06249935124

Conclusion from there is that this query returns fast no matter how much the table grows:

```sql
SELECT EXISTS(SELECT 1 FROM "nasty" WHERE "id" IS NULL)
```
So detecting if a table contains any null primary keys is definitely feasible without a performance hit.
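A minimal sketch of that check using Python's built-in `sqlite3` module. The `nasty` table and `id` column follow the test database used in this thread; the helper name `has_null_pk` is hypothetical and not part of Datasette.

```python
import sqlite3


def has_null_pk(conn, table, pk):
    """Return True if any row in `table` has a NULL in column `pk`."""
    # Identifiers cannot be bound as parameters, so quote them by hand.
    quoted_table = '"' + table.replace('"', '""') + '"'
    quoted_pk = '"' + pk.replace('"', '""') + '"'
    sql = f"SELECT EXISTS(SELECT 1 FROM {quoted_table} WHERE {quoted_pk} IS NULL)"
    return bool(conn.execute(sql).fetchone()[0])


conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "nasty" ("id" TEXT PRIMARY KEY)')
conn.execute('INSERT INTO "nasty" ("id") VALUES (?)', ("a",))
before = has_null_pk(conn, "nasty", "id")  # no NULL keys yet
conn.execute('INSERT INTO "nasty" ("id") VALUES (NULL)')
after = has_null_pk(conn, "nasty", "id")  # one NULL key present
print(before, after)
```

`EXISTS` can stop at the first matching row, which is presumably part of why the query stays fast as the table grows.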

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
If a row has a primary key of `null` various things break 1857234285  
1684526447 https://github.com/simonw/datasette/issues/2145#issuecomment-1684526447 https://api.github.com/repos/simonw/datasette/issues/2145 IC_kwDOBm6k_c5kZ9Vv simonw 9599 2023-08-18T23:05:02Z 2023-08-18T23:05:02Z OWNER

How expensive is it to detect if a SQLite table contains at least one null primary key?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
If a row has a primary key of `null` various things break 1857234285  
1684525943 https://github.com/simonw/datasette/issues/2145#issuecomment-1684525943 https://api.github.com/repos/simonw/datasette/issues/2145 IC_kwDOBm6k_c5kZ9N3 simonw 9599 2023-08-18T23:04:14Z 2023-08-18T23:04:14Z OWNER

This is hard. I tried this:
```python
def path_from_row_pks(row, pks, use_rowid, quote=True):
    """Generate an optionally tilde-encoded unique identifier for a row from its primary keys."""
    if use_rowid or any(row[pk] is None for pk in pks):
        bits = [row["rowid"]]
    else:
        bits = [
            row[pk]["value"] if isinstance(row[pk], dict) else row[pk] for pk in pks
        ]
    if quote:
        bits = [tilde_encode(str(bit)) for bit in bits]
    else:
        bits = [str(bit) for bit in bits]

    return ",".join(bits)
```
The `if use_rowid or any(row[pk] is None for pk in pks)` bit is new.

But I got this error on http://127.0.0.1:8003/nulls/nasty :

```
  File "/Users/simon/Dropbox/Development/datasette/datasette/views/table.py", line 1364, in run_display_columns_and_rows
    display_columns, display_rows = await display_columns_and_rows(
  File "/Users/simon/Dropbox/Development/datasette/datasette/views/table.py", line 186, in display_columns_and_rows
    pk_path = path_from_row_pks(row, pks, not pks, False)
  File "/Users/simon/Dropbox/Development/datasette/datasette/utils/__init__.py", line 124, in path_from_row_pks
    bits = [row["rowid"]]
IndexError: No item with that key
```
Because the SQL query I ran to populate the page didn't know that it would need to select rowid as well.
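The failure can be reproduced outside Datasette with the standard-library `sqlite3.Row` type, which raises the same `IndexError` when asked for a column that was not selected. This is a sketch under the assumption that the page query was effectively `SELECT *`; the table mirrors the `nasty` test table from this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE nasty (id TEXT PRIMARY KEY)")
conn.execute("INSERT INTO nasty (id) VALUES (NULL)")

# Without rowid in the SELECT list, looking it up by name fails.
row = conn.execute("SELECT * FROM nasty").fetchone()
try:
    row["rowid"]
    missing = False
except IndexError:
    missing = True

# Selecting rowid explicitly makes it available on the row.
row_with_rowid = conn.execute("SELECT rowid, * FROM nasty").fetchone()
rowid = row_with_rowid["rowid"]
print(missing, rowid)
```

So any fallback to `rowid` has to be decided before the query is built, not while rendering each row.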

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
If a row has a primary key of `null` various things break 1857234285  
1684525054 https://github.com/simonw/datasette/issues/2145#issuecomment-1684525054 https://api.github.com/repos/simonw/datasette/issues/2145 IC_kwDOBm6k_c5kZ8_- simonw 9599 2023-08-18T23:02:26Z 2023-08-18T23:02:26Z OWNER

Creating a quick test database:
```bash
sqlite-utils create-table nulls.db nasty id text --pk id
sqlite-utils nulls.db 'insert into nasty (id) values (null)'
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
If a row has a primary key of `null` various things break 1857234285  
1684523322 https://github.com/simonw/datasette/issues/2145#issuecomment-1684523322 https://api.github.com/repos/simonw/datasette/issues/2145 IC_kwDOBm6k_c5kZ8k6 simonw 9599 2023-08-18T22:59:14Z 2023-08-18T22:59:14Z OWNER

Except it looks like the Links from other tables section is broken:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
If a row has a primary key of `null` various things break 1857234285  
1684522567 https://github.com/simonw/datasette/issues/2145#issuecomment-1684522567 https://api.github.com/repos/simonw/datasette/issues/2145 IC_kwDOBm6k_c5kZ8ZH simonw 9599 2023-08-18T22:58:07Z 2023-08-18T22:58:07Z OWNER

Here's a prototype of that:
```diff
diff --git a/datasette/app.py b/datasette/app.py
index b2644ace..acc55249 100644
--- a/datasette/app.py
+++ b/datasette/app.py
@@ -1386,7 +1386,7 @@ class Datasette:
         )
         add_route(
             RowView.as_view(self),
-            r"/(?P<database>[^\/.]+)/(?P<table>[^/]+?)/(?P<pks>[^/]+?)(\.(?P<format>\w+))?$",
+            r"/(?P<database>[^\/.]+)/(?P<table>[^/]+?)/(?P<pks>[A-Za-z0-9_-\~]+|\.\d+)(\.(?P<format>\w+))?$",
         )
         add_route(
             TableInsertView.as_view(self),
@@ -1440,7 +1440,15 @@ class Datasette:
     async def resolve_row(self, request):
         db, table_name, _ = await self.resolve_table(request)
         pk_values = urlsafe_components(request.url_vars["pks"])
-        sql, params, pks = await row_sql_params_pks(db, table_name, pk_values)
+
+        if len(pk_values) == 1 and pk_values[0].startswith("."):
+            # It's a special .rowid value
+            pk_values = (pk_values[0][1:],)
+            sql, params, pks = await row_sql_params_pks(
+                db, table_name, pk_values, rowid=True
+            )
+        else:
+            sql, params, pks = await row_sql_params_pks(db, table_name, pk_values)
         results = await db.execute(sql, params, truncate=True)
         row = results.first()
         if row is None:
diff --git a/datasette/utils/__init__.py b/datasette/utils/__init__.py
index c388673d..96669281 100644
--- a/datasette/utils/__init__.py
+++ b/datasette/utils/__init__.py
@@ -1206,9 +1206,12 @@ def truncate_url(url, length):
     return url[: length - 1] + "…"
 
 
-async def row_sql_params_pks(db, table, pk_values):
+async def row_sql_params_pks(db, table, pk_values, rowid=False):
     pks = await db.primary_keys(table)
-    use_rowid = not pks
+    if rowid:
+        use_rowid = True
+    else:
+        use_rowid = not pks
     select = "*"
     if use_rowid:
         select = "rowid, *"
```
It works:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
If a row has a primary key of `null` various things break 1857234285  
1684505071 https://github.com/simonw/datasette/issues/2145#issuecomment-1684505071 https://api.github.com/repos/simonw/datasette/issues/2145 IC_kwDOBm6k_c5kZ4Hv simonw 9599 2023-08-18T22:44:35Z 2023-08-18T22:44:35Z OWNER

Also relevant: https://github.com/simonw/datasette/blob/943df09dcca93c3b9861b8c96277a01320db8662/datasette/utils/__init__.py#L1147-L1153

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
If a row has a primary key of `null` various things break 1857234285  
1684504398 https://github.com/simonw/datasette/issues/2145#issuecomment-1684504398 https://api.github.com/repos/simonw/datasette/issues/2145 IC_kwDOBm6k_c5kZ39O simonw 9599 2023-08-18T22:43:31Z 2023-08-18T22:43:46Z OWNER

(?P<pks>[^/]+?) could instead be a regex that is restricted to the tilde-encoded set of characters, or \.\d+.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
If a row has a primary key of `null` various things break 1857234285  
1684504051 https://github.com/simonw/datasette/issues/2145#issuecomment-1684504051 https://api.github.com/repos/simonw/datasette/issues/2145 IC_kwDOBm6k_c5kZ33z simonw 9599 2023-08-18T22:43:06Z 2023-08-18T22:43:06Z OWNER

Here's the regex in question at the moment: https://github.com/simonw/datasette/blob/943df09dcca93c3b9861b8c96277a01320db8662/datasette/app.py#L1387-L1390

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
If a row has a primary key of `null` various things break 1857234285  
1684503587 https://github.com/simonw/datasette/issues/2145#issuecomment-1684503587 https://api.github.com/repos/simonw/datasette/issues/2145 IC_kwDOBm6k_c5kZ3wj simonw 9599 2023-08-18T22:42:28Z 2023-08-18T22:42:39Z OWNER

I could set a rule that extensions (including custom render extensions set by plugins) must not be valid integers, and teach Datasette that /\.\d+ is the indication of a rowid.
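To make that rule concrete, here is a rough sketch of how the last URL segment could be split into a primary-key part and an optional format. The pattern, the character class for tilde-encoded values, and the `parse_segment` helper are all assumptions for illustration; this is not Datasette's routing code.

```python
import re

# Hypothetical pattern for the final URL segment: either ".<digits>" (a rowid)
# or comma-separated tilde-encoded values, followed by an optional ".format"
# suffix that must not consist purely of digits.
LAST_SEGMENT = re.compile(
    r"^(?P<pks>\.\d+|[A-Za-z0-9_\-~,]+)"
    r"(?:\.(?P<format>(?!\d+$)\w+))?$"
)


def parse_segment(segment):
    """Return (pks, format, is_rowid), or None if the segment does not match."""
    match = LAST_SEGMENT.match(segment)
    if match is None:
        return None
    pks = match.group("pks")
    return pks, match.group("format"), pks.startswith(".")


for example in [".4", ".4.json", "a~2Fb,~2Ec-d", "a~2Fb,~2Ec-d.csv", ".4.5"]:
    print(example, parse_segment(example))
```

Because an all-digit extension is rejected, `.4.5` does not parse at all rather than being read as rowid 4 with format `5`.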

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
If a row has a primary key of `null` various things break 1857234285  
1684503189 https://github.com/simonw/datasette/issues/2145#issuecomment-1684503189 https://api.github.com/repos/simonw/datasette/issues/2145 IC_kwDOBm6k_c5kZ3qV simonw 9599 2023-08-18T22:41:51Z 2023-08-18T22:41:51Z OWNER

```pycon
>>> tilde_encode("~")
'~7E'
>>> tilde_encode(".")
'~2E'
>>> tilde_encode("-")
'-'
```
I think `.` might be the way to do this:

 /database/table/.4

But... I worry about that colliding with my URL routing code that spots the difference between these:

 /database/table/.4
 /database/table/.4.json
 /database/table/.4.csv

etc.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
If a row has a primary key of `null` various things break 1857234285  
1684502278 https://github.com/simonw/datasette/issues/2145#issuecomment-1684502278 https://api.github.com/repos/simonw/datasette/issues/2145 IC_kwDOBm6k_c5kZ3cG simonw 9599 2023-08-18T22:40:20Z 2023-08-18T22:40:20Z OWNER

From reviewing https://simonwillison.net/2022/Mar/19/weeknotes/

unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"

That's how I chose the tilde character - but it also suggests that I could use - or . or _ for my new rowid encoding.

So maybe /database/table/_4 could indicate "the row with rowid of 4".

No, that doesn't work:
```pycon
>>> from datasette.utils import tilde_encode
>>> tilde_encode("_")
'_'
```
I need a character which tilde-encoding does indeed encode.
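The reasoning can be illustrated with a simplified stand-in for tilde-encoding. The function below is a sketch written for this example, not Datasette's `tilde_encode`, and it ignores any special cases the real function may have; the safe-character set is an assumption based on the outputs shown in this thread.

```python
# Sketch only: approximates tilde-encoding, not Datasette's implementation.
SAFE = set("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_-")


def tilde_encode_sketch(value):
    """Leave SAFE characters alone; write every other byte as ~XX."""
    out = []
    for byte in value.encode("utf-8"):
        char = chr(byte)
        out.append(char if char in SAFE else "~{:02X}".format(byte))
    return "".join(out)


# A primary key whose value is literally "_4" encodes to "_4", so a "_4" URL
# segment could not also mean "rowid 4".
print(tilde_encode_sketch("_4"))
# A primary key of ".4" encodes to "~2E4", so a raw ".4" segment is unambiguous.
print(tilde_encode_sketch(".4"))
```

A prefix character is only usable as a rowid marker if it can never appear unescaped in an encoded primary key.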

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
If a row has a primary key of `null` various things break 1857234285  
1684500540 https://github.com/simonw/datasette/issues/2145#issuecomment-1684500540 https://api.github.com/repos/simonw/datasette/issues/2145 IC_kwDOBm6k_c5kZ3A8 simonw 9599 2023-08-18T22:37:37Z 2023-08-18T22:37:37Z OWNER

I just found this and panicked, thinking maybe tilde encoding is a bad idea after all! https://jkorpela.fi/tilde.html

But... "Date of last update: 1999-08-27" - I think I'm OK.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
If a row has a primary key of `null` various things break 1857234285  
1684500172 https://github.com/simonw/datasette/issues/2145#issuecomment-1684500172 https://api.github.com/repos/simonw/datasette/issues/2145 IC_kwDOBm6k_c5kZ27M simonw 9599 2023-08-18T22:37:04Z 2023-08-18T22:37:04Z OWNER

Looking at the way these URLs work: because the components themselves in a~2Fb,~2Ec-d are tilde-encoded, any character that's "safe" in tilde-encoding could be used to indicate "this is actually a rowid".

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
If a row has a primary key of `null` various things break 1857234285  
1684498947 https://github.com/simonw/datasette/issues/2145#issuecomment-1684498947 https://api.github.com/repos/simonw/datasette/issues/2145 IC_kwDOBm6k_c5kZ2oD simonw 9599 2023-08-18T22:35:04Z 2023-08-18T22:35:04Z OWNER

The most interesting row URL in the fixtures database right now is this one:

https://latest.datasette.io/fixtures/compound_primary_key/a~2Fb,~2Ec-d

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
If a row has a primary key of `null` various things break 1857234285  
1684497642 https://github.com/simonw/datasette/issues/2145#issuecomment-1684497642 https://api.github.com/repos/simonw/datasette/issues/2145 IC_kwDOBm6k_c5kZ2Tq simonw 9599 2023-08-18T22:32:53Z 2023-08-18T22:32:53Z OWNER

Here's a potential solution: make it so ALL rowid tables in SQLite can be optionally addressed by their rowid instead of by their primary key.

Then teach the code that outputs the URL to a row page to spot if there are null primary keys and switch to that alternative addressing mechanism instead.
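A rough sketch of that idea with the standard-library `sqlite3` module. The `row_path` helper and the `.`-prefixed rowid form are hypothetical conventions used only for illustration, and real code would also tilde-encode each primary key value.

```python
import sqlite3


def row_path(row, pks):
    """Build the last URL segment for a row: primary keys when they are all
    non-null, otherwise a '.'-prefixed rowid (a hypothetical convention)."""
    if not pks or any(row[pk] is None for pk in pks):
        return ".{}".format(row["rowid"])
    # Real code would also tilde-encode each value; omitted here.
    return ",".join(str(row[pk]) for pk in pks)


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE nasty (id TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO nasty (id) VALUES (?)", [("a",), (None,)])
# Always select rowid so the fallback is available for every row.
rows = conn.execute("SELECT rowid, * FROM nasty ORDER BY rowid").fetchall()
paths = [row_path(row, ["id"]) for row in rows]
print(paths)
```

Only the row with the null key switches to the rowid form; rows with usable primary keys keep their existing URLs.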

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
If a row has a primary key of `null` various things break 1857234285  
1684497000 https://github.com/simonw/datasette/issues/2145#issuecomment-1684497000 https://api.github.com/repos/simonw/datasette/issues/2145 IC_kwDOBm6k_c5kZ2Jo simonw 9599 2023-08-18T22:31:53Z 2023-08-18T22:31:53Z OWNER

So it sounds like SQLite does ensure that a table has a rowid before it allows a primary key to be null.

So one solution here would be to detect a null primary key and switch that table over to using rowid URLs instead. The key problem we're trying to solve here after all is how to link to a row:

https://latest.datasette.io/fixtures/infinity/1

But when would we run that check? And does every row in the table get a new /rowid/ URL just because someone messed up and inserted a null by mistake?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
If a row has a primary key of `null` various things break 1857234285  
1684496274 https://github.com/simonw/datasette/issues/2143#issuecomment-1684496274 https://api.github.com/repos/simonw/datasette/issues/2143 IC_kwDOBm6k_c5kZ1-S asg017 15178711 2023-08-18T22:30:45Z 2023-08-18T22:30:45Z CONTRIBUTOR

> That said, I do really like a bias towards settings that can be changed at runtime

Does this include things like --setting values or plugin config? I can totally see being able to update metadata without restarting, but I'm not sure that would work well with --setting, plugin config, or auth/permissions stuff.

Well it could work with --setting and auth/permissions, with a lot of core changes. But changing plugin config on the fly could be challenging, for plugin authors.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
De-tangling Metadata before Datasette 1.0 1855885427  
1684495674 https://github.com/simonw/datasette/issues/2145#issuecomment-1684495674 https://api.github.com/repos/simonw/datasette/issues/2145 IC_kwDOBm6k_c5kZ106 simonw 9599 2023-08-18T22:29:47Z 2023-08-18T22:29:47Z OWNER

https://www.sqlite.org/lang_createtable.html#the_primary_key says:

According to the SQL standard, PRIMARY KEY should always imply NOT NULL. Unfortunately, due to a bug in some early versions, this is not the case in SQLite. Unless the column is an INTEGER PRIMARY KEY or the table is a WITHOUT ROWID table or a STRICT table or the column is declared NOT NULL, SQLite allows NULL values in a PRIMARY KEY column. SQLite could be fixed to conform to the standard, but doing so might break legacy applications. Hence, it has been decided to merely document the fact that SQLite allows NULLs in most PRIMARY KEY columns.
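The cases in that paragraph can be checked with Python's built-in `sqlite3` module. This is a small sketch with made-up table names; it covers the plain, `INTEGER PRIMARY KEY` and `WITHOUT ROWID` cases and leaves out `STRICT` tables, which need a newer SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Ordinary rowid table with a TEXT primary key: NULL is accepted.
conn.execute("CREATE TABLE plain (id TEXT PRIMARY KEY)")
conn.execute("INSERT INTO plain (id) VALUES (NULL)")
plain_nulls = conn.execute(
    "SELECT count(*) FROM plain WHERE id IS NULL"
).fetchone()[0]

# INTEGER PRIMARY KEY is an alias for rowid: NULL becomes a generated integer.
conn.execute("CREATE TABLE aliased (id INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO aliased (id) VALUES (NULL)")
aliased_id = conn.execute("SELECT id FROM aliased").fetchone()[0]

# WITHOUT ROWID table: NULL in the primary key is rejected.
conn.execute("CREATE TABLE no_rowid (id TEXT PRIMARY KEY) WITHOUT ROWID")
try:
    conn.execute("INSERT INTO no_rowid (id) VALUES (NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(plain_nulls, aliased_id, rejected)
```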

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
If a row has a primary key of `null` various things break 1857234285  
1684494464 https://github.com/simonw/datasette/issues/2145#issuecomment-1684494464 https://api.github.com/repos/simonw/datasette/issues/2145 IC_kwDOBm6k_c5kZ1iA simonw 9599 2023-08-18T22:27:51Z 2023-08-18T22:28:40Z OWNER

Oh wow, null primary keys are bad news... SQLite lets you insert multiple rows with the same null value!
```pycon
>>> import sqlite_utils
>>> db = sqlite_utils.Database(memory=True)
>>> db["foo"].insert({"id": None, "name": "No ID"}, pk="id")
<Table foo (id, name)>
>>> db.schema
'CREATE TABLE [foo] (\n [id] TEXT PRIMARY KEY,\n [name] TEXT\n);'
>>> db["foo"].insert({"id": None, "name": "No ID"}, pk="id")
<Table foo (id, name)>
>>> db.schema
'CREATE TABLE [foo] (\n [id] TEXT PRIMARY KEY,\n [name] TEXT\n);'
>>> list(db["foo"].rows)
[{'id': None, 'name': 'No ID'}, {'id': None, 'name': 'No ID'}]
>>> list(db.query('select * from foo where id = null'))
[]
>>> list(db.query('select * from foo where id is null'))
[{'id': None, 'name': 'No ID'}, {'id': None, 'name': 'No ID'}]
```
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
If a row has a primary key of `null` various things break 1857234285  
1684488526 https://github.com/simonw/datasette/issues/2143#issuecomment-1684488526 https://api.github.com/repos/simonw/datasette/issues/2143 IC_kwDOBm6k_c5kZ0FO simonw 9599 2023-08-18T22:18:39Z 2023-08-18T22:18:39Z OWNER

> Another option would be, instead of flat datasette.json/datasette.yaml files, we could instead use a Python file, like datasette_config.py. That way one could dynamically generate config (ex dev vs prod, auto-discover credentials, etc.). Kinda like Django settings.

I'm not a fan of that. I feel like software history is full of examples of projects that implemented configuration-as-code and then later regretted it - the most recent example is setup.py in Python turning into pyproject.toml, but I feel like I've seen that pattern play out elsewhere too.

I don't think having people dynamically generate JSON/YAML for their configuration is a big burden. I'd have to see some very compelling use-cases to convince me otherwise.
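As a rough illustration of what "dynamically generate JSON" could look like in practice, here is a sketch that builds a config dictionary from an environment mapping and serializes it. The `APP_ENV` variable, the `build_config` helper and the chosen values are hypothetical; the setting names are taken from the examples in this thread, and the overall file layout is still only a proposal.

```python
import json


def build_config(env):
    """Return a config dict that varies with the (hypothetical) APP_ENV value."""
    production = env.get("APP_ENV") == "production"
    return {
        "title": "Demo instance",
        "settings": {
            "default_page_size": 10 if production else 100,
            "sql_time_limit_ms": 1000 if production else 8000,
        },
    }


# In a real script `env` would be os.environ and the text would be written
# to a file before starting Datasette.
config_text = json.dumps(build_config({"APP_ENV": "production"}), indent=2)
print(config_text)
```

The logic lives in a small build step, while the file Datasette reads stays plain data.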

That said, I do really like a bias towards settings that can be changed at runtime. Datasette has suffered a bit from some settings that can't be easily changed at runtime already - hence my gnarly https://github.com/simonw/datasette-remote-metadata plugin.

For things like Datasette Cloud for example the more people can configure without rebooting their container the better!

I don't think live reconfiguration at runtime is incompatible with JSON/YAML configuration though. Caddy is one of my favourite examples of software that can be entirely re-configured at runtime by POSTING a big blob of JSON to it: https://caddyserver.com/docs/quick-starts/api

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
De-tangling Metadata before Datasette 1.0 1855885427  
1684485591 https://github.com/simonw/datasette/issues/2143#issuecomment-1684485591 https://api.github.com/repos/simonw/datasette/issues/2143 IC_kwDOBm6k_c5kZzXX simonw 9599 2023-08-18T22:14:35Z 2023-08-18T22:14:35Z OWNER

Actually there is one thing that I'm not comfortable about with respect to the existing design: the way the database / tables stuff is nested.

They assume that the user will attach the database to Datasette using a fixed name - docs.db or whatever.

But what if we want to support users downloading databases from each other and attaching them to Datasette where those DBs might carry some of their own configuration?

Moving metadata into the databases makes sense there, but what about database-specific settings like the default sort order for a table, or configured canned queries?

Having those tied to the filename of the database itself feels unpleasant to me. But how else could we handle this?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
De-tangling Metadata before Datasette 1.0 1855885427  
1684484426 https://github.com/simonw/datasette/issues/2143#issuecomment-1684484426 https://api.github.com/repos/simonw/datasette/issues/2143 IC_kwDOBm6k_c5kZzFK simonw 9599 2023-08-18T22:12:52Z 2023-08-18T22:12:52Z OWNER

Yeah, I'm convinced by that. There's no point in having both settings.json and datasette.json.

I like datasette.json ( / datasette.yml) as a name. That can be the file that lives in your config directory too, so if you run datasette . in a folder containing datasette.yml all of those settings get picked up.

Here's a thought for how it could look - I'll go with the YAML format because I expect that to be the default most people use, just because it supports multi-line strings better.

I based this on the big example at https://docs.datasette.io/en/1.0a3/metadata.html#using-yaml-for-metadata - and combined some bits from https://docs.datasette.io/en/1.0a3/authentication.html as well.

```yaml
title: Demonstrating Metadata from YAML
description_html: |-
  <p>This description includes a long HTML string</p>
  <ul>
    <li>YAML is better for embedding HTML strings than JSON!</li>
  </ul>
settings:
  default_page_size: 10
  max_returned_rows: 3000
  sql_time_limit_ms: 8000
databases:
  docs:
    permissions:
      create-table:
        id: editor
  fixtures:
    tables:
      no_primary_key:
        hidden: true
    queries:
      neighborhood_search:
        sql: |-
          select neighborhood, facet_cities.name, state
          from facetable join facet_cities on facetable.city_id = facet_cities.id
          where neighborhood like '%' || :text || '%'
          order by neighborhood;
        title: Search neighborhoods
        description_html: |-
          <p>This demonstrates basic LIKE search</p>
permissions:
  debug-menu:
    id: '*'
plugins:
  datasette-ripgrep:
    path: /usr/local/lib/python3.11/site-packages
```
I'm inclined to say we try to be a super-set of the existing `metadata.yml` format, at least where it makes sense to do so. That way the upgrade path is smooth for people. Also, I don't think the format itself is terrible - it's the name that's the big problem.

In this example I've mixed in one extra concept: that settings: block with a bunch of settings in it.

There are some things in there that look a little bit like metadata - the title and description_html fields.

But are they metadata? The title and description of the overall instance feels like it could be described as general configuration. The stuff for the query should live where the query itself is defined.

Note that queries can be defined by a plugin hook too: https://docs.datasette.io/en/1.0a3/plugin_hooks.html#canned-queries-datasette-database-actor

What do you think? Is this the right direction, or are you thinking there's a more radical redesign that would make sense here?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
De-tangling Metadata before Datasette 1.0 1855885427  
1684384750 https://github.com/simonw/datasette/issues/2145#issuecomment-1684384750 https://api.github.com/repos/simonw/datasette/issues/2145 IC_kwDOBm6k_c5kZavu simonw 9599 2023-08-18T20:07:18Z 2023-08-18T20:07:18Z OWNER

The big challenge here is what the URL to that row page should look like. How can I encode a None in a form that can be encoded and decoded without clashing with primary keys that are the string None or null?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
If a row has a primary key of `null` various things break 1857234285  
1684235760 https://github.com/simonw/sqlite-utils/issues/577#issuecomment-1684235760 https://api.github.com/repos/simonw/sqlite-utils/issues/577 IC_kwDOCGYnMM5kY2Xw simonw 9599 2023-08-18T17:43:11Z 2023-08-18T17:43:11Z OWNER

Here's a new plugin that brings back the sqlite_master modifying version, for those that can use it:

https://github.com/simonw/sqlite-utils-fast-fks

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Get `add_foreign_keys()` to work without modifying `sqlite_master` 1817289521  
1684205563 https://github.com/simonw/datasette/issues/2143#issuecomment-1684205563 https://api.github.com/repos/simonw/datasette/issues/2143 IC_kwDOBm6k_c5kYu_7 asg017 15178711 2023-08-18T17:12:54Z 2023-08-18T17:12:54Z CONTRIBUTOR

Another option would be, instead of flat datasette.json/datasette.yaml files, we could instead use a Python file, like datasette_config.py. That way one could dynamically generate config (ex dev vs prod, auto-discover credentials, etc.). Kinda like Django settings.

Though I imagine Python imports might make this complex to do, and json/yaml is already supported and pretty easy to write

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
De-tangling Metadata before Datasette 1.0 1855885427  
1684202932 https://github.com/simonw/datasette/issues/2143#issuecomment-1684202932 https://api.github.com/repos/simonw/datasette/issues/2143 IC_kwDOBm6k_c5kYuW0 asg017 15178711 2023-08-18T17:10:21Z 2023-08-18T17:10:21Z CONTRIBUTOR

I agree with all your points!

I think the best solution would be having a datasette.json config file, where you "configure" your datasette instances, with settings, permissions/auth, plugin configuration, and table settings (sortable column, label columns, etc.). Which #2093 would do.

Then optionally, you have a metadata.json, or use datasette_metadata, or some other plugin to define metadata (ex the future sqlite-docs plugin).

Everything in datasette.json could also be overwritten by CLI flags, like --setting key value, --plugin xxxx key value.

We could even completely remove settings.json in favor of just datasette.json. Mostly because I think the fewer files the better, especially if they have generic names like settings.json or config.json.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
De-tangling Metadata before Datasette 1.0 1855885427  
1683429959 https://github.com/simonw/datasette/issues/2143#issuecomment-1683429959 https://api.github.com/repos/simonw/datasette/issues/2143 IC_kwDOBm6k_c5kVxpH simonw 9599 2023-08-18T06:43:33Z 2023-08-18T15:19:07Z OWNER

The single biggest design challenge I've had with metadata relates to how it should or should not be inherited.

If you apply a license to a Datasette instance, it feels like that should flow down to cover all of the databases and all of the tables within those databases.

If the license is at the database level, it should cover all tables.

But... should source do the same thing? I made it behave the same way as license, but it's presumably common for one database to have a single license but multiple different sources of data.

Then there's title - should that inherit? It feels like title should apply to only one level - you may want a title that applies to the instance, then a different custom title for databases and tables.
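One way to picture the inheritance question is a lookup that only falls back to parent levels for a fixed set of keys. The sketch below is an assumption-laden illustration, not Datasette's metadata API: the `resolve` helper, the `INHERITED_KEYS` set and the sample metadata are all made up for this example.

```python
# Keys that flow down from instance to database to table (an assumption
# based on the discussion; title is deliberately left out).
INHERITED_KEYS = {"license", "license_url", "source", "source_url"}


def resolve(metadata, key, database=None, table=None):
    """Look up `key` at the most specific level, falling back to parent
    levels only when the key is in INHERITED_KEYS."""
    levels = [metadata]
    if database:
        db_meta = metadata.get("databases", {}).get(database, {})
        levels.insert(0, db_meta)
        if table:
            levels.insert(0, db_meta.get("tables", {}).get(table, {}))
    if key not in INHERITED_KEYS:
        levels = levels[:1]  # only the most specific level counts
    for level in levels:
        if key in level:
            return level[key]
    return None


metadata = {
    "title": "My instance",
    "license": "CC BY 4.0",
    "databases": {"docs": {"tables": {"pages": {}}}},
}
print(resolve(metadata, "license", "docs", "pages"))  # inherited from instance
print(resolve(metadata, "title", "docs", "pages"))  # not inherited
```

Whether `source` belongs in the inherited set is exactly the open question raised above; moving it out is a one-line change in this model.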

Here's the current state of play for metadata: https://docs.datasette.io/en/1.0a3/metadata.html

So there's title and description - and I'll be honest, I'm not 100% sure even I understand how those should be inherited down by tables/etc.

There's description_html which over-rides the description if it is set. It's a useful customization hack, but a bit surprising.

Then there are these six:

  • license
  • license_url
  • source
  • source_url
  • about
  • about_url

I added about later than the others, because I realized that plenty of my own projects needed a link to an article explaining them somewhere - e.g. https://scotrail.datasette.io/

Tables can also have column descriptions - just a string for each column. There's a demo of those here: https://latest.datasette.io/fixtures/roadside_attractions

And then there's all of the other stuff, most of which feels much more like "settings" than "metadata":

  • sort: created - the custom sort order
  • size: 10 for a custom page size for a specific table
  • sortable_columns to set which columns can be used to sort
  • hidden: true to hide a table
  • label_column: title is an interesting one - it lets you hint to Datasette which column should be displayed when there is a foreign key relationship. It's sort-of-metadata and sort-of-a-setting.
  • facets sets default facets, see https://docs.datasette.io/en/1.0a3/facets.html#facets-in-metadata
  • facet_size sets the number of facets to display
  • fts_table and fts_pk can be used to configure FTS, especially for views: https://docs.datasette.io/en/1.0a3/full_text_search.html

And the authentication stuff! allow and allow_sql blocks: https://docs.datasette.io/en/1.0a3/authentication.html#defining-permissions-with-allow-blocks

And the new permissions key in the 1.0 alphas: https://docs.datasette.io/en/1.0a3/authentication.html#other-permissions-in-metadata

I think that might be everything (excluding the plugins settings stuff, which is also a bad fit for metadata.)

And to make things even more confusing... I believe you can add arbitrary key/value pairs to your metadata and then use them in your templates! I think I've heard from at least one person who uses that ability.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
De-tangling Metadata before Datasette 1.0 1855885427  
1683420879 https://github.com/simonw/datasette/issues/2143#issuecomment-1683420879 https://api.github.com/repos/simonw/datasette/issues/2143 IC_kwDOBm6k_c5kVvbP simonw 9599 2023-08-18T06:33:24Z 2023-08-18T15:15:34Z OWNER

I completely agree: metadata is a mess, and it deserves our attention.

  1. Metadata cannot be updated without re-starting the entire Datasette instance.

That's not completely true - there are hacks around that. I have a plugin that applies one set of gnarly hacks for that here: https://github.com/simonw/datasette-remote-metadata - it's pretty grim though!

  1. The metadata.json/metadata.yaml has become a kitchen sink of unrelated (imo) features like plugin config, authentication config, canned queries

100% this: it's a complete mess.

Datasette used to have a datasette --config foo:bar mechanism, which I deprecated in favour of datasette --setting foo bar partly because I wanted to free up --config for pointing at a real config file, so we could stop dropping everything in --metadata metadata.yml.

  1. The Python APIs for defining extra metadata are a bit awkward (the datasette.metadata() class, get_metadata() hook, etc.)

Yes, they're not pretty at all.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
De-tangling Metadata before Datasette 1.0 1855885427  
1683963463 https://github.com/simonw/datasette/pull/2144#issuecomment-1683963463 https://api.github.com/repos/simonw/datasette/issues/2144 IC_kwDOBm6k_c5kXz5H codecov[bot] 22429695 2023-08-18T13:58:39Z 2023-08-18T13:58:39Z NONE

Codecov Report

Patch and project coverage have no change.

Comparison is base (943df09) 92.06% compared to head (3a97755) 92.06%.

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main    #2144   +/-   ##
=======================================
  Coverage   92.06%   92.06%
=======================================
  Files          40       40
  Lines        5937     5937
=======================================
  Hits         5466     5466
  Misses        471      471
```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Bump the python-packages group with 3 updates 1856760386  
1683950031 https://github.com/simonw/datasette/pull/2142#issuecomment-1683950031 https://api.github.com/repos/simonw/datasette/issues/2142 IC_kwDOBm6k_c5kXwnP dependabot[bot] 49699333 2023-08-18T13:49:24Z 2023-08-18T13:49:24Z CONTRIBUTOR

Looks like these dependencies are updatable in another way, so this is no longer needed.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Bump the python-packages group with 2 updates 1854970601  
1683443891 https://github.com/simonw/datasette/issues/2143#issuecomment-1683443891 https://api.github.com/repos/simonw/datasette/issues/2143 IC_kwDOBm6k_c5kV1Cz simonw 9599 2023-08-18T06:58:15Z 2023-08-18T06:58:15Z OWNER

Hah, that --plugin-secret thing was a messy solution I came up with to the problem that all metadata is visible at /-/metadata - so if you need to stash a secret you need a way to keep it not-visible in there!

Hence the whole $env mess: https://docs.datasette.io/en/stable/plugins.html#secret-configuration-values

    {
        "plugins": {
            "datasette-auth-github": {
                "client_secret": {
                    "$env": "GITHUB_CLIENT_SECRET"
                }
            }
        }
    }
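
A minimal sketch of how that kind of `$env` indirection could be resolved. The `resolve_env_refs()` helper is hypothetical and is not Datasette's actual implementation; it just walks a nested structure and swaps any `{"$env": "NAME"}` dictionary for the value of that environment variable:

```python
import os
from typing import Any, Mapping


def resolve_env_refs(value: Any, environ: Mapping[str, str] = os.environ) -> Any:
    """Return a copy of value with {"$env": "NAME"} dicts replaced by environ["NAME"].

    Hypothetical helper for illustration; missing variables resolve to None.
    """
    if isinstance(value, dict):
        if set(value.keys()) == {"$env"}:
            return environ.get(value["$env"])
        return {key: resolve_env_refs(item, environ) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve_env_refs(item, environ) for item in value]
    return value


config = {
    "plugins": {
        "datasette-auth-github": {
            "client_secret": {"$env": "GITHUB_CLIENT_SECRET"}
        }
    }
}

# Pass an explicit mapping here so the example does not depend on the real environment
resolved = resolve_env_refs(config, environ={"GITHUB_CLIENT_SECRET": "s3cret"})
```

The original `config` is left untouched, so the unresolved version (with no secret in it) is the one that would be safe to display publicly.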

If configuration and metadata were separate we could ditch that whole messy situation - configuration can stay hidden, metadata can stay public.

Though I have been thinking that Datasette might benefit from a "secrets" mechanism that's separate from configuration and metadata... kind of like what LLM has: https://llm.datasette.io/en/stable/help.html#llm-keys-help

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
De-tangling Metadata before Datasette 1.0 1855885427  
1683440597 https://github.com/simonw/datasette/issues/2143#issuecomment-1683440597 https://api.github.com/repos/simonw/datasette/issues/2143 IC_kwDOBm6k_c5kV0PV simonw 9599 2023-08-18T06:54:49Z 2023-08-18T06:54:49Z OWNER

A related point that I've been considering a lot recently: it turns out that sometimes I really want to define settings on the CLI instead of in a file, purely for convenience.

It's pretty annoying when I want to try out a new plugin but I have to create a dedicated metadata.yml file for it just to set up a single option - I'd love to be able to run this instead:

    datasette data.db --plugin-setting datasette-upload-csvs default-database data

So maybe there's a world in which all of the settings can be applied in a datasette.yml file OR with command-line options.

That gets trickier when you need to pass a nested structure or similar, but we could always support those as JSON:

    datasette data.db --plugin-setting datasette-emoji-reactions emoji '["😼", "🐺"]'

Note that we kind of have precedent for this in datasette publish: https://docs.datasette.io/en/stable/publish.html#custom-metadata-and-plugins

    datasette publish heroku my_database.db \
        --name my-heroku-app-demo \
        --install=datasette-auth-github \
        --plugin-secret datasette-auth-github client_id your_client_id \
        --plugin-secret datasette-auth-github client_secret your_client_secret
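
The "support those as JSON" idea for a proposed --plugin-setting option could be sketched roughly like this. Both function names are hypothetical, and the rule used here (try JSON first, fall back to the raw string) is only one possible design, not something Datasette is confirmed to do:

```python
import json
from typing import Any


def parse_setting_value(raw: str) -> Any:
    """Treat the raw CLI string as JSON if it parses, otherwise keep it as a string."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return raw


def apply_plugin_setting(config: dict, plugin: str, key: str, raw: str) -> dict:
    """Store one plugin/key/value triple under config["plugins"][plugin][key]."""
    config.setdefault("plugins", {}).setdefault(plugin, {})[key] = parse_setting_value(raw)
    return config


config: dict = {}
apply_plugin_setting(config, "datasette-upload-csvs", "default-database", "data")
apply_plugin_setting(config, "datasette-emoji-reactions", "emoji", '["😼", "🐺"]')
```

One trade-off of the JSON-first rule: a bare value such as 4000 or true would arrive as a number or boolean rather than a string, which may or may not be what a plugin expects.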

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
De-tangling Metadata before Datasette 1.0 1855885427  
1683435579 https://github.com/simonw/datasette/issues/2143#issuecomment-1683435579 https://api.github.com/repos/simonw/datasette/issues/2143 IC_kwDOBm6k_c5kVzA7 simonw 9599 2023-08-18T06:49:39Z 2023-08-18T06:49:39Z OWNER

My ideal situation then would be something like this:

  • Metadata itself is VERY clearly described, including sensible rules for metadata inheritance where it makes sense. There is a datasette.X method for accessing it which is much more intuitive than datasette.metadata().
  • It's possible that method should be an async method, because that would better support things like plugins that look up metadata in database tables.
  • All templates etc switch to the new, clean, intuitive metadata mechanism before 1.0.
  • I'm interested in the option of metadata being able to live in a _datasette_metadata table in the databases themselves - either as a plugin or as a core feature. I think it makes a lot of sense for metadata to optionally live with the data that it describes.
  • Configuration gets split from metadata. The stuff that configures Datasette no longer lives in the metadata.yml file - it lives in config.yml (or even datasette.yml).

Currently we have three types of things:

  • Metadata - information about the data
  • Configuration - stuff like "these columns should be sortable" and "this is configured as fts_table" and suchlike
  • Settings - the stuff that you pass to datasette --setting x y on server start.

Should settings and configuration be separate? I'm not 100% sure that they should - maybe those two concepts should be combined somehow.

Configuration directory mode needs to be considered too: https://docs.datasette.io/en/stable/settings.html#configuration-directory-mode - interestingly it already has a thing where it can pick up settings from a settings.json file - where settings are things like datasette --setting sql_time_limit_ms 4000.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
De-tangling Metadata before Datasette 1.0 1855885427  
1683404978 https://github.com/simonw/sqlite-utils/issues/586#issuecomment-1683404978 https://api.github.com/repos/simonw/sqlite-utils/issues/586 IC_kwDOCGYnMM5kVriy simonw 9599 2023-08-18T06:13:46Z 2023-08-18T06:13:46Z OWNER

I shipped the view recreating fix in datasette-edit-schema, so at least I can start exercising that fix and see if it has any weird issues.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
.transform() fails to drop column if table is part of a view 1856075668  
1683398866 https://github.com/simonw/sqlite-utils/issues/586#issuecomment-1683398866 https://api.github.com/repos/simonw/sqlite-utils/issues/586 IC_kwDOCGYnMM5kVqDS simonw 9599 2023-08-18T06:05:50Z 2023-08-18T06:06:42Z OWNER

Options:

  • Provide a recreate_views: bool parameter to table.transform() controlling whether views that might reference this table are stashed, dropped and recreated within a transaction as part of the operation. But should that be True or False by default?
  • Read that PRAGMA and automatically do that view workaround if it's turned on.
  • Toggle that PRAGMA off for the duration of the .transform() operation and on again at the end. Does it only affect the current connection?
  • Try the transform() in a transaction and detect the "error in view" / "no such table" error; if spotted, do the VIEW workaround and try again.

I'm on the fence as to which of these I like the most. I'm tempted to go with the one which just drops VIEWS and recreates them all the time, because it feels simpler.
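
A rough sketch of the "always drop and recreate views" option using only the standard library sqlite3 module. The `run_with_views_recreated()` helper and the table and view names are made up for illustration, and this is not the code that shipped in sqlite-utils or datasette-edit-schema:

```python
import sqlite3
from typing import Sequence


def run_with_views_recreated(conn: sqlite3.Connection, statements: Sequence[str]) -> None:
    """Stash every view, drop them, run the statements, then recreate the views.

    Assumes conn was opened with isolation_level=None so that BEGIN, COMMIT
    and ROLLBACK are issued explicitly here.
    """
    views = conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'view'"
    ).fetchall()
    conn.execute("BEGIN")
    try:
        for name, _ in views:
            conn.execute('DROP VIEW "{}"'.format(name.replace('"', '""')))
        for sql in statements:
            conn.execute(sql)
        for _, sql in views:
            conn.execute(sql)
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise


conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE places (id INTEGER PRIMARY KEY, name TEXT, notes TEXT)")
conn.execute("INSERT INTO places VALUES (1, 'Paris', 'draft')")
conn.execute("CREATE VIEW place_names AS SELECT id, name FROM places")

# Drop the notes column by rebuilding the table, as a transform-style operation would
run_with_views_recreated(conn, [
    "CREATE TABLE places_new (id INTEGER PRIMARY KEY, name TEXT)",
    "INSERT INTO places_new (id, name) SELECT id, name FROM places",
    "DROP TABLE places",
    "ALTER TABLE places_new RENAME TO places",
])

columns = [row[1] for row in conn.execute("PRAGMA table_info(places)")]
view_rows = conn.execute("SELECT id, name FROM place_names").fetchall()
```

Because no view exists while the table is being swapped, the rename does not depend on how the PRAGMA is set. The cost is that a view referencing a dropped column would fail to recreate and roll the whole operation back.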

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
.transform() fails to drop column if table is part of a view 1856075668  
1683396150 https://github.com/simonw/sqlite-utils/issues/586#issuecomment-1683396150 https://api.github.com/repos/simonw/sqlite-utils/issues/586 IC_kwDOCGYnMM5kVpY2 simonw 9599 2023-08-18T06:02:18Z 2023-08-18T06:06:31Z OWNER

More notes in here: https://github.com/simonw/datasette-edit-schema/issues/35#issuecomment-1683392873

Not all Python/SQLite installations exhibit this problem by default!

It turns out this is controlled by the legacy_alter_table pragma: https://sqlite.org/pragma.html#pragma_legacy_alter_table

If that PRAGMA is turned on (default in newer SQLites) then alter table will error if you try to rename a table that is referenced in a view.

Here's a one-liner to test if it is on or not:

    python -c 'import sqlite3; print(sqlite3.connect(":memory:").execute("PRAGMA legacy_alter_table").fetchall())'

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
.transform() fails to drop column if table is part of a view 1856075668  
1683217284 https://github.com/simonw/sqlite-utils/issues/585#issuecomment-1683217284 https://api.github.com/repos/simonw/sqlite-utils/issues/585 IC_kwDOCGYnMM5kU9uE simonw 9599 2023-08-18T01:50:21Z 2023-08-18T01:50:21Z OWNER

And a test of the --sql option:

    sqlite-utils create-table /tmp/t.db places id integer name text country integer city integer continent integer --pk id
    sqlite-utils create-table /tmp/t.db country id integer name text
    sqlite-utils create-table /tmp/t.db city id integer name text
    sqlite-utils create-table /tmp/t.db continent id integer name text
    sqlite-utils transform /tmp/t.db places --add-foreign-key country country id --add-foreign-key continent continent id --sql

Outputs:

    CREATE TABLE [places_new_6a705d2f5a13] (
       [id] INTEGER PRIMARY KEY,
       [name] TEXT,
       [country] INTEGER REFERENCES [country]([id]),
       [city] INTEGER,
       [continent] INTEGER REFERENCES [continent]([id])
    );
    INSERT INTO [places_new_6a705d2f5a13] ([id], [name], [country], [city], [continent])
       SELECT [id], [name], [country], [city], [continent] FROM [places];
    DROP TABLE [places];
    ALTER TABLE [places_new_6a705d2f5a13] RENAME TO [places];
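
The same create, copy, drop, rename sequence can be tried out with the standard library sqlite3 module. This is a cut-down sketch under stated assumptions (a two-table schema and a fixed places_new name instead of the generated suffix), not the sqlite-utils implementation:

```python
import sqlite3

# isolation_level=None keeps the connection in autocommit mode, so the
# BEGIN/COMMIT inside the script below are the only transaction boundaries
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
CREATE TABLE country (id INTEGER, name TEXT);
CREATE TABLE places (id INTEGER PRIMARY KEY, name TEXT, country INTEGER);
INSERT INTO country VALUES (1, 'France');
INSERT INTO places VALUES (1, 'Paris', 1);
""")

# Build a replacement table carrying the foreign key, copy the rows across,
# then swap it into place under the original name
conn.executescript("""
BEGIN;
CREATE TABLE places_new (
    id INTEGER PRIMARY KEY,
    name TEXT,
    country INTEGER REFERENCES country(id)
);
INSERT INTO places_new (id, name, country) SELECT id, name, country FROM places;
DROP TABLE places;
ALTER TABLE places_new RENAME TO places;
COMMIT;
""")

# Each foreign_key_list row is (id, seq, table, from, to, on_update, on_delete, match)
foreign_keys = conn.execute("PRAGMA foreign_key_list(places)").fetchall()
rows = conn.execute("SELECT id, name, country FROM places").fetchall()
```

The foreign key shows up through ordinary introspection as soon as the script commits, with no write to sqlite_master involved.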

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CLI equivalents to `transform(add_foreign_keys=)` 1855894222  
1683212074 https://github.com/simonw/sqlite-utils/issues/585#issuecomment-1683212074 https://api.github.com/repos/simonw/sqlite-utils/issues/585 IC_kwDOCGYnMM5kU8cq simonw 9599 2023-08-18T01:43:54Z 2023-08-18T01:43:54Z OWNER

Some manual testing:

    sqlite-utils create-table /tmp/t.db places id integer name text country integer city integer continent integer --pk id
    sqlite-utils schema /tmp/t.db

    CREATE TABLE [places] (
       [id] INTEGER PRIMARY KEY,
       [name] TEXT,
       [country] INTEGER,
       [city] INTEGER,
       [continent] INTEGER
    );

    sqlite-utils create-table /tmp/t.db country id integer name text
    sqlite-utils create-table /tmp/t.db city id integer name text
    sqlite-utils create-table /tmp/t.db continent id integer name text
    sqlite-utils schema /tmp/t.db

    CREATE TABLE [places] (
       [id] INTEGER PRIMARY KEY,
       [name] TEXT,
       [country] INTEGER,
       [city] INTEGER,
       [continent] INTEGER
    );
    CREATE TABLE [country] (
       [id] INTEGER,
       [name] TEXT
    );
    CREATE TABLE [city] (
       [id] INTEGER,
       [name] TEXT
    );
    CREATE TABLE [continent] (
       [id] INTEGER,
       [name] TEXT
    );

    sqlite-utils transform /tmp/t.db places --add-foreign-key country country id --add-foreign-key continent continent id
    sqlite-utils schema /tmp/t.db

    CREATE TABLE [country] (
       [id] INTEGER,
       [name] TEXT
    );
    CREATE TABLE [city] (
       [id] INTEGER,
       [name] TEXT
    );
    CREATE TABLE [continent] (
       [id] INTEGER,
       [name] TEXT
    );
    CREATE TABLE "places" (
       [id] INTEGER PRIMARY KEY,
       [name] TEXT,
       [country] INTEGER REFERENCES [country]([id]),
       [city] INTEGER,
       [continent] INTEGER REFERENCES [continent]([id])
    );

    sqlite-utils transform /tmp/t.db places --drop-foreign-key country
    sqlite-utils schema /tmp/t.db places

    CREATE TABLE "places" (
       [id] INTEGER PRIMARY KEY,
       [name] TEXT,
       [country] INTEGER,
       [city] INTEGER,
       [continent] INTEGER REFERENCES [continent]([id])
    )

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CLI equivalents to `transform(add_foreign_keys=)` 1855894222  
1683201239 https://github.com/simonw/sqlite-utils/issues/585#issuecomment-1683201239 https://api.github.com/repos/simonw/sqlite-utils/issues/585 IC_kwDOCGYnMM5kU5zX simonw 9599 2023-08-18T01:30:46Z 2023-08-18T01:30:46Z OWNER

Help can now look like this:

    --drop-foreign-key TEXT         Drop foreign key constraint for this column
    --add-foreign-key <TEXT TEXT TEXT>...
                                    Add a foreign key constraint from a column to
                                    another table with another column

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CLI equivalents to `transform(add_foreign_keys=)` 1855894222  
1683200128 https://github.com/simonw/sqlite-utils/issues/585#issuecomment-1683200128 https://api.github.com/repos/simonw/sqlite-utils/issues/585 IC_kwDOCGYnMM5kU5iA simonw 9599 2023-08-18T01:29:00Z 2023-08-18T01:29:00Z OWNER

I'm not going to implement the foreign_keys= option that entirely replaces existing foreign keys - I'll just do a --add-foreign-key multi-option.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CLI equivalents to `transform(add_foreign_keys=)` 1855894222  
1683198740 https://github.com/simonw/sqlite-utils/issues/585#issuecomment-1683198740 https://api.github.com/repos/simonw/sqlite-utils/issues/585 IC_kwDOCGYnMM5kU5MU simonw 9599 2023-08-18T01:26:47Z 2023-08-18T01:26:47Z OWNER

The only CLI feature that supports providing just the column name appears to be this:

    sqlite-utils add-foreign-key --help

```
Usage: sqlite-utils add-foreign-key [OPTIONS] PATH TABLE COLUMN [OTHER_TABLE] [OTHER_COLUMN]

  Add a new foreign key constraint to an existing table

  Example:

      sqlite-utils add-foreign-key my.db books author_id authors id

  WARNING: Could corrupt your database! Back up your database file first.
```

I can drop that WARNING now since I'm not writing to sqlite_master any more.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CLI equivalents to `transform(add_foreign_keys=)` 1855894222  
1683197882 https://github.com/simonw/sqlite-utils/issues/585#issuecomment-1683197882 https://api.github.com/repos/simonw/sqlite-utils/issues/585 IC_kwDOCGYnMM5kU4-6 simonw 9599 2023-08-18T01:25:53Z 2023-08-18T01:25:53Z OWNER

Probably most relevant here is this snippet from:

    sqlite-utils create-table --help

    --default <TEXT TEXT>...    Default value that should be set for a column
    --fk <TEXT TEXT TEXT>...    Column, other table, other column to set as a
                                foreign key

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CLI equivalents to `transform(add_foreign_keys=)` 1855894222  
1683195669 https://github.com/simonw/sqlite-utils/issues/585#issuecomment-1683195669 https://api.github.com/repos/simonw/sqlite-utils/issues/585 IC_kwDOCGYnMM5kU4cV simonw 9599 2023-08-18T01:24:57Z 2023-08-18T01:24:57Z OWNER

Currently:

    sqlite-utils transform --help

```
Usage: sqlite-utils transform [OPTIONS] PATH TABLE

  Transform a table beyond the capabilities of ALTER TABLE

  Example:

      sqlite-utils transform mydb.db mytable \
          --drop column1 \
          --rename column2 column_renamed

Options:
  --type <TEXT CHOICE>...     Change column type to INTEGER, TEXT, FLOAT or BLOB
  --drop TEXT                 Drop this column
  --rename <TEXT TEXT>...     Rename this column to X
  -o, --column-order TEXT     Reorder columns
  --not-null TEXT             Set this column to NOT NULL
  --not-null-false TEXT       Remove NOT NULL from this column
  --pk TEXT                   Make this column the primary key
  --pk-none                   Remove primary key (convert to rowid table)
  --default <TEXT TEXT>...    Set default value for this column
  --default-none TEXT         Remove default from this column
  --drop-foreign-key TEXT     Drop foreign key constraint for this column
  --sql                       Output SQL without executing it
  --load-extension TEXT       Path to SQLite extension, with optional :entrypoint
  -h, --help                  Show this message and exit.
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CLI equivalents to `transform(add_foreign_keys=)` 1855894222  
1683164661 https://github.com/simonw/sqlite-utils/pull/584#issuecomment-1683164661 https://api.github.com/repos/simonw/sqlite-utils/issues/584 IC_kwDOCGYnMM5kUw31 simonw 9599 2023-08-18T00:45:53Z 2023-08-18T00:45:53Z OWNER

More updated documentation:

  • https://sqlite-utils--584.org.readthedocs.build/en/584/reference.html#sqlite_utils.db.Table.transform
  • https://sqlite-utils--584.org.readthedocs.build/en/584/python-api.html#python-api-transform-add-foreign-key-constraints

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
.transform() instead of modifying sqlite_master for add_foreign_keys 1855838223  
1683145819 https://github.com/simonw/sqlite-utils/pull/584#issuecomment-1683145819 https://api.github.com/repos/simonw/sqlite-utils/issues/584 IC_kwDOCGYnMM5kUsRb simonw 9599 2023-08-18T00:17:26Z 2023-08-18T00:17:26Z OWNER

Updated documentation: https://sqlite-utils--584.org.readthedocs.build/en/584/python-api.html#adding-foreign-key-constraints

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
.transform() instead of modifying sqlite_master for add_foreign_keys 1855838223  
1683145110 https://github.com/simonw/sqlite-utils/pull/584#issuecomment-1683145110 https://api.github.com/repos/simonw/sqlite-utils/issues/584 IC_kwDOCGYnMM5kUsGW simonw 9599 2023-08-18T00:16:28Z 2023-08-18T00:16:48Z OWNER

One last piece of documentation: need to document the new option to table.transform() and table.transform_sql():

https://github.com/simonw/sqlite-utils/blob/0771ac61fe5c2aca74075b20b1a99b9bd4c65661/sqlite_utils/db.py#L1706-L1708

I should write tests for them too.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
.transform() instead of modifying sqlite_master for add_foreign_keys 1855838223  
1683143723 https://github.com/simonw/sqlite-utils/pull/584#issuecomment-1683143723 https://api.github.com/repos/simonw/sqlite-utils/issues/584 IC_kwDOCGYnMM5kUrwr simonw 9599 2023-08-18T00:14:52Z 2023-08-18T00:14:52Z OWNER

Another docs update: this bit in here https://sqlite-utils.datasette.io/en/3.34/python-api.html#adding-multiple-foreign-key-constraints-at-once talks about how .add_foreign_keys() is a performance optimization to avoid having to run VACUUM a bunch of separate times:

The final step in adding a new foreign key to a SQLite database is to run VACUUM, to ensure the new foreign key is available in future introspection queries.

VACUUM against a large (multi-GB) database can take several minutes or longer. If you are adding multiple foreign keys using table.add_foreign_key(...) these can quickly add up.

Instead, you can use db.add_foreign_keys(...) to add multiple foreign keys within a single transaction. This method takes a list of four-tuples, each one specifying a table, column, other_table and other_column.

That doesn't apply any more - the new mechanism using .transform() works completely differently, so this issue around running VACUUM no longer applies.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
.transform() instead of modifying sqlite_master for add_foreign_keys 1855838223  
1683139304 https://github.com/simonw/sqlite-utils/pull/584#issuecomment-1683139304 https://api.github.com/repos/simonw/sqlite-utils/issues/584 IC_kwDOCGYnMM5kUqro simonw 9599 2023-08-18T00:09:56Z 2023-08-18T00:09:56Z OWNER

Upgrading flake8 locally replicated the error:

    pip install -U flake8
    flake8

    ./tests/test_recipes.py:99:9: F811 redefinition of unused 'fn' from line 96
    ./tests/test_recipes.py:127:9: F811 redefinition of unused 'fn' from line 124

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
.transform() instead of modifying sqlite_master for add_foreign_keys 1855838223  
1683138953 https://github.com/simonw/sqlite-utils/pull/584#issuecomment-1683138953 https://api.github.com/repos/simonw/sqlite-utils/issues/584 IC_kwDOCGYnMM5kUqmJ simonw 9599 2023-08-18T00:09:20Z 2023-08-18T00:09:20Z OWNER

Weird, I'm getting a flake8 problem in CI which doesn't occur on my laptop:

    ./tests/test_recipes.py:99:9: F811 redefinition of unused 'fn' from line 96
    ./tests/test_recipes.py:127:9: F811 redefinition of unused 'fn' from line 124

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
.transform() instead of modifying sqlite_master for add_foreign_keys 1855838223  
1683137259 https://github.com/simonw/sqlite-utils/pull/584#issuecomment-1683137259 https://api.github.com/repos/simonw/sqlite-utils/issues/584 IC_kwDOCGYnMM5kUqLr simonw 9599 2023-08-18T00:06:59Z 2023-08-18T00:06:59Z OWNER

The docs still describe the old trick, I need to update that: https://sqlite-utils.datasette.io/en/3.34/python-api.html#adding-foreign-key-constraints

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
.transform() instead of modifying sqlite_master for add_foreign_keys 1855838223  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 1143.035ms · About: github-to-sqlite