{"id": 1570375808, "node_id": "I_kwDODFdgUs5dmgiA", "number": 79, "title": "Deploy demo job is failing due to rate limit", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2023-02-03T20:05:01Z", "updated_at": "2023-12-08T14:50:15Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "https://github.com/dogsheep/github-to-sqlite/actions/runs/4080058087/jobs/7032116511", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/79/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1983600865, "node_id": "PR_kwDOBm6k_c5e7WH7", "number": 2206, "title": "Bump the python-packages group with 1 update", "user": {"value": 49699333, "label": "dependabot[bot]"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2023-11-08T13:18:56Z", "updated_at": "2023-12-08T13:46:24Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": "simonw/datasette/pulls/2206", "body": "Bumps the python-packages group with 1 update: [black](https://github.com/psf/black).\n\n
\nRelease notes\n

Sourced from black's releases.

\n
\n

23.11.0

\n

Highlights

\n\n

Stable style

\n\n

Preview style

\n\n

Configuration

\n\n

Performance

\n\n

Integrations

\n\n

23.10.1

\n

Highlights

\n\n

Preview style

\n\n
\n

... (truncated)

\n
\n
\nChangelog\n

Sourced from black's changelog.

\n
\n

23.11.0

\n

Highlights

\n\n

Stable style

\n\n

Preview style

\n\n

Configuration

\n\n

Performance

\n\n

Integrations

\n\n

23.10.1

\n

Highlights

\n\n\n
\n

... (truncated)

\n
\n
\nCommits\n\n
\n
\n\n\n[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=black&package-manager=pip&previous-version=23.9.1&new-version=23.11.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)\n\nDependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.\n\n[//]: # (dependabot-automerge-start)\n[//]: # (dependabot-automerge-end)\n\n---\n\n
\nDependabot commands and options\n
\n\nYou can trigger Dependabot actions by commenting on this PR:\n- `@dependabot rebase` will rebase this PR\n- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it\n- `@dependabot merge` will merge this PR after your CI passes on it\n- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it\n- `@dependabot cancel merge` will cancel a previously requested merge and block automerging\n- `@dependabot reopen` will reopen this PR if it is closed\n- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually\n- `@dependabot show ignore conditions` will show all of the ignore conditions of the specified dependency\n- `@dependabot ignore major version` will close this group update PR and stop Dependabot creating any more for the specific dependency's major version (unless you unignore this specific dependency's major version or upgrade to it yourself)\n- `@dependabot ignore minor version` will close this group update PR and stop Dependabot creating any more for the specific dependency's minor version (unless you unignore this specific dependency's minor version or upgrade to it yourself)\n- `@dependabot ignore ` will close this group update PR and stop Dependabot creating any more for the specific dependency (unless you unignore this specific dependency or upgrade to it yourself)\n- `@dependabot unignore ` will remove all of the ignore conditions of the specified dependency\n- `@dependabot unignore ` will remove the ignore condition of the specified dependency and ignore conditions\n\n\n
\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://datasette--2206.org.readthedocs.build/en/2206/\n\r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2206/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1988525411, "node_id": "I_kwDOCGYnMM52hn1j", "number": 603, "title": "Pyhton 3.12 Bug report", "user": {"value": 1324252, "label": "constantinedev"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2023-11-10T22:57:48Z", "updated_at": "2023-12-08T05:10:31Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "I start with new python3 verison 3.12.0\r\nAlso have the error where connect DataBase\r\n\r\n```\r\nTraceback (most recent call last):\r\n File \"/home/t/Development/python/FKPJ/ClinicSYS/run.py\", line 1, in \r\n import re, os, io, json, sqlite_utils, requests, pytz, logging\r\n File \"/home/t/.local/lib/python3.12/site-packages/sqlite_utils/__init__.py\", line 1, in \r\n from .db import Database\r\n File \"/home/t/.local/lib/python3.12/site-packages/sqlite_utils/db.py\", line 277, in \r\n class Database:\r\n File \"/home/t/.local/lib/python3.12/site-packages/sqlite_utils/db.py\", line 306, in Database\r\n filename_or_conn: Optional[Union[str, pathlib.Path, sqlite3.Connection]] = None,\r\n ^^^^^^^^^^^^^^^^^^\r\n```\r\nThis bug come from `sqlite-utils` since's v3.33.\r\nAnyone get the same ?\r\n\r\nAs well now of the resolved plan just keep the sqlite-utils version in python3.12 with v3.32.1 [tested]\r\nbut where are the sqlite3.Connection problem.... \r\n\r\nThis won't happen on python version down to 3.11[tested]\r\nJust the python3.12.0, I have test this error are come from the sqlite3 connection\r\nThe error say from `sqlite_utils` and with the sqlite3 Connection, what can I do.\r\n\r\nLet fix together.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/603/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 2029908157, "node_id": "I_kwDOBm6k_c54_fC9", "number": 2214, "title": "CSV export fails for some `text` foreign key references", "user": {"value": 2874, "label": "precipice"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2023-12-07T05:04:34Z", "updated_at": "2023-12-07T07:36:34Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "I'm starting this issue without a clear reproduction in case someone else has seen this behavior, and to use the issue as a notebook for research. \r\n\r\nI'm using Datasette with the [SWITRS](https://iswitrs.chp.ca.gov/) data set, which is a California Highway Patrol collection of traffic incident data from the past decade or so. I receive data from them in CSV and want to work with it in Datasette, then export it to CSV for mapping in Felt.com.\r\n\r\nTheir data makes extensive use of codes for incident column data (`1` for `Monday` and so on), some of it integer codes and some of it letter/text codes. The text codes are sometimes blank or `-`. During import, I'm creating lookup tables for foreign key references to make the Datasette UI presentation of the data easier to read.\r\n\r\nIf I import the data and set up the integer foreign keys, everything works fine, but if I set up the text foreign keys, CSV export starts to fail. \r\n\r\nThe foreign key configuration is as follows:\r\n\r\n```\r\n# Some tables use integer ids, like sensible tables do. Let's import them first\r\n# since we favor them.\r\n\r\nfor TABLE in DAY_OF_WEEK CHP_SHIFT POPULATION SPECIAL_COND BEAT_TYPE COLLISION_SEVERITY\r\ndo\r\n\tsqlite-utils create-table records.db $TABLE id integer name text --pk=id\r\n\tsqlite-utils insert records.db $TABLE lookup-tables/$TABLE.csv --csv\r\n\tsqlite-utils add-foreign-key records.db collisions $TABLE $TABLE id\r\n\tsqlite-utils create-index records.db collisions $TABLE\r\ndone\r\n\r\n# *Other* tables use letter keys, like they were raised by WOLVES. Let's put them\r\n# at the end of the import queue.\r\n\r\nfor TABLE in WEATHER_1 WEATHER_2 LOCATION_TYPE RAMP_INTERSECTION SIDE_OF_HWY \\\r\nPRIMARY_COLL_FACTOR PCF_CODE_OF_VIOL PCF_VIOL_CATEGORY TYPE_OF_COLLISION MVIW \\\r\nPED_ACTION ROAD_SURFACE ROAD_COND_1 ROAD_COND_2 LIGHTING CONTROL_DEVICE \\\r\nSTWD_VEHTYPE_AT_FAULT CHP_VEHTYPE_AT_FAULT PRIMARY_RAMP SECONDARY_RAMP\r\ndo\r\n\tsqlite-utils create-table records.db $TABLE key text name text --pk=key\r\n\tsqlite-utils insert records.db $TABLE lookup-tables/$TABLE.csv --csv\r\n\tsqlite-utils add-foreign-key records.db collisions $TABLE $TABLE key\r\n\tsqlite-utils create-index records.db collisions $TABLE\r\ndone\r\n```\r\n\r\nYou can see the full code and import script here: https://github.com/radical-bike-lobby/switrs-db\r\n\r\nIf I run this code and then hit the CSV export link in the Datasette interface (the simple link or the \"advanced\" dialog), export fails after a small number of CSV rows are written. I am not seeing any detailed error messages but this appears in the logging output:\r\n\r\n```\r\nINFO: 127.0.0.1:57885 - \"GET /records/collisions.csv?_facet=PRIMARY_RD&PRIMARY_RD=ASHBY+AV&_labels=on&_size=max HTTP/1.1\" 200 OK\r\nCaught this error: \r\n\r\n```\r\n\r\n(No other output follows `error:` other than a blank line.)\r\n\r\nI've stared at the rows directly after the error occurs and can't yet see what is causing the problem. I'm going to set up a development environment and see if I get any more detailed error output, and then stare more at some problematic lines to see if I can get a simple reproduction.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2214/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 2028698018, "node_id": "I_kwDOBm6k_c5463mi", "number": 2213, "title": "feature request: gzip compression of database downloads", "user": {"value": 536941, "label": "fgregg"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2023-12-06T14:35:03Z", "updated_at": "2023-12-06T15:05:46Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": null, "body": "At the bottom of database pages, datasette gives users the opportunity to download the underlying sqlite database. It would be great if that could be served gzip compressed. \r\n\r\nthis is similar to #1213, but for me, i don't need datasette to compress html and json because my CDN layer does it for me, however, cloudflare at least, will not compress a mimetype of \"application\"\r\n\r\n(see list of mimetype: https://developers.cloudflare.com/speed/optimization/content/brotli/content-compression/)", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2213/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 2023057255, "node_id": "I_kwDOBm6k_c54lWdn", "number": 2212, "title": "Can't filter with numbers", "user": {"value": 605070, "label": "fzakaria"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-12-04T05:26:29Z", "updated_at": "2023-12-04T05:26:29Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "I have a schema that uses numbers for a column (actually it's a boolean 1 or 0 but SQLite doesn't have Boolean).\r\nI can't seem to get the facet to work or even filtering on this column.\r\n\r\nMy guess is that Datasette is \"stringifying\" the number and it's not matching?\r\nExample: https://debian-sqlelf.fly.dev/debian/elf_symbols?_sort_desc=name&_facet=exported&exported=0", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2212/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 2019811176, "node_id": "I_kwDOBm6k_c54Y99o", "number": 2211, "title": "Unreachable exception handlers for `sqlite3.OperationalError`", "user": {"value": 1214074, "label": "mattparmett"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-12-01T00:50:22Z", "updated_at": "2023-12-01T00:50:22Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "There are several places where `sqlite3.OperationalError` is caught as part of an exception handler which catches multiple exceptions, but is then caught again immediately afterwards by a dedicated exception handler.\r\n\r\nBecause the exception will be caught by the first handler, the logic in the second handler is unreachable and will never be executed. If this is intended behavior, the second handler can be removed. If this is not intended, and the second handler should be the one that catches this exception, then `sqlite3.OperationalError` should be removed from the tuple of exceptions in the first handler.\r\n\r\nThis issue was found via a CodeQL query on the repository, and I've listed the occurrences found by the query below. There may be other instances of this issue in the code that were not surfaced by the query. I'd be happy to share the query if others would like to view or run it.\r\n\r\nOne example:\r\n\r\nhttps://github.com/simonw/datasette/blob/452a587e236ef642cbc6ae345b58767ea8420cb5/datasette/views/database.py#L534-L537\r\n\r\nOther instances:\r\n\r\nhttps://github.com/simonw/datasette/blob/main/datasette/views/base.py#L266-L270\r\nhttps://github.com/simonw/datasette/blob/main/datasette/views/base.py#L452-L456", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2211/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 564833696, "node_id": "MDU6SXNzdWU1NjQ4MzM2OTY=", "number": 670, "title": "Prototoype for Datasette on PostgreSQL", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 15, "created_at": "2020-02-13T17:17:55Z", "updated_at": "2023-11-17T15:32:21Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "I thought this would never happen, but now that I'm deep in the weeds of running SQLite in production for Datasette Cloud I'm starting to reconsider my policy of only supporting SQLite.\r\n\r\nSome of the factors making me think PostgreSQL support could be worth the effort:\r\n- Serverless. I'm getting increasingly excited about writable-database use-cases for Datasette. If it could talk to PostgreSQL then users could easily deploy it on Heroku or other serverless providers that can talk to a managed RDS-style PostgreSQL.\r\n- Existing databases. Plenty of organizations have PostgreSQL databases. They can export to SQLite using [db-to-sqlite](https://github.com/simonw/db-to-sqlite) but that's a pretty big barrier to getting started - being able to run `datasette postgresql://connection-string` and start trying it out would be a massively better experience.\r\n- Data size. I keep running into use-cases where I want to run Datasette against many GBs of data. SQLite can do this but PostgreSQL is much more optimized for large data, especially given the existence of tools like Citus.\r\n- Marketing. Convincing people to trust their data to SQLite is potentially a big barrier to adoption. Even if I've convinced myself it's trustworthy I still have to convince everyone else.\r\n- It might not be that hard? If this required a ground-up rewrite it wouldn't be worth the effort, but I have a hunch that it may not be too hard - most of the SQL in Datasette should work on both databases since it's almost all portable SELECT statements. If Datasette did DML this would be a lot harder, but it doesn't.\r\n- Plugins! This feels like a natural surface for a plugin - at which point people could add MySQL support and suchlike in the future.\r\n\r\nThe above reasons feel strong enough to justify a prototype.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/670/reactions\", \"total_count\": 19, \"+1\": 14, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 5, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1994861266, "node_id": "PR_kwDOBm6k_c5fhgOS", "number": 2209, "title": "Fix query for suggested facets with column named value", "user": {"value": 198537, "label": "rgieseke"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2023-11-15T14:13:30Z", "updated_at": "2023-11-15T15:31:12Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": "simonw/datasette/pulls/2209", "body": "See discussion in https://github.com/simonw/datasette/issues/2208\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://datasette--2209.org.readthedocs.build/en/2209/\n\r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2209/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1994857251, "node_id": "I_kwDOBm6k_c525xsj", "number": 2208, "title": "No suggested facets when a column named 'value' is included", "user": {"value": 198537, "label": "rgieseke"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2023-11-15T14:11:17Z", "updated_at": "2023-11-15T14:18:59Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": null, "body": "When a column named 'value' is included there are no suggested facets is shown as the query uses an alias of 'value'.\r\n\r\nhttps://github.com/simonw/datasette/blob/452a587e236ef642cbc6ae345b58767ea8420cb5/datasette/facets.py#L168-L174\r\n\r\nCurrently the following is shown (from https://latest.datasette.io/fixtures/facetable)\r\n\r\n![image](https://github.com/simonw/datasette/assets/198537/a919509a-ea88-461b-b25b-8b776720c7c5)\r\n\r\nWhen I add a column named 'value' only the JSON facets are processed.\r\n\r\n![image](https://github.com/simonw/datasette/assets/198537/092bd0b3-4c20-434e-88f8-47e2b8994a1d)\r\n\r\nI think that not using aliases could be a solution (except if someone wants to use a column named `count(*)` though this seems to be unlikely). I'll open a PR with that.\r\n\r\nThere is also a TODO with a similar question in the same file. I have not looked into that yet.\r\n\r\nhttps://github.com/simonw/datasette/blob/452a587e236ef642cbc6ae345b58767ea8420cb5/datasette/facets.py#L512", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2208/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1994845152, "node_id": "I_kwDOBm6k_c525uvg", "number": 2207, "title": "ModuleNotFoundError: No module named 'click_default_group", "user": {"value": 283441, "label": "honzajavorek"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-11-15T14:04:32Z", "updated_at": "2023-11-15T14:04:32Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "No matter what I do, I'm getting this error:\r\n\r\n```\r\n$ datasette\r\nTraceback (most recent call last):\r\n File \"/Users/honza/Library/Caches/pypoetry/virtualenvs/juniorguru-Lgaxwd2n-py3.11/bin/datasette\", line 5, in \r\n from datasette.cli import cli\r\n File \"/Users/honza/Library/Caches/pypoetry/virtualenvs/juniorguru-Lgaxwd2n-py3.11/lib/python3.11/site-packages/datasette/cli.py\", line 6, in \r\n from click_default_group import DefaultGroup\r\nModuleNotFoundError: No module named 'click_default_group'\r\n```\r\n\r\nI have datasette in my dependencies like this:\r\n\r\n```toml\r\n[tool.poetry.group.dev.dependencies]\r\ndatasette = {version = \"1.0a7\", allow-prereleases = true}\r\n```\r\n\r\nI had the latest regular version (not pre-release) there originally, but the result was the same:\r\n\r\n```toml\r\n[tool.poetry.group.dev.dependencies]\r\ndatasette = \"0.64.5\"\r\n```\r\n\r\nFull pyproject.toml is at https://github.com/honzajavorek/junior.guru/ Previously datasette worked for me, but I guess something had to upgrade and now I can't even launch it.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2207/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1978603203, "node_id": "I_kwDOCGYnMM517xbD", "number": 602, "title": "`sqlite-utils transform` removes the `AUTOINCREMENT` keyword", "user": {"value": 4472046, "label": "ArsTapatun"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-11-06T08:48:43Z", "updated_at": "2023-11-06T08:48:43Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "### Context\r\n\r\nWe ran into this bug randomly, noticing that deleted `ROWID` would get reused after migrating the DB. Using `transform` to change any column in the table will also unexpectedly strip away the `AUTOINCREMENT` keyword from the primary key definition, even if it was not the transformation target.\r\n\r\n### Reproducible example\r\n\r\n**Original database**\r\n\r\n```sql\r\n$ sqlite3 test.db << EOF\r\nCREATE TABLE mytable (\r\n col1 INTEGER PRIMARY KEY AUTOINCREMENT,\r\n col2 TEXT NOT NULL\r\n)\r\nEOF\r\n\r\n$ sqlite3 test.db \".schema mytable\"\r\nCREATE TABLE mytable (\r\n col1 INTEGER PRIMARY KEY AUTOINCREMENT,\r\n col2 TEXT NOT NULL\r\n);\r\n```\r\n\r\n**Modified database after sqlite-utils**\r\n\r\n```sql\r\n$ sqlite-utils transform test.db mytable --rename col2 renamedcol2\r\n\r\n$ sqlite3 test.db \"SELECT sql FROM sqlite_master WHERE name = 'mytable';\"\r\nCREATE TABLE IF NOT EXISTS \"mytable\" (\r\n [col1] INTEGER PRIMARY KEY,\r\n [renamedcol2] TEXT NOT NULL\r\n);\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/602/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1978023780, "node_id": "I_kwDOBm6k_c515j9k", "number": 2205, "title": "request.post_vars() method obliterates form keys with multiple values", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": {"value": 8755003, "label": "Datasette 1.0a-next"}, "comments": 3, "created_at": "2023-11-05T23:25:08Z", "updated_at": "2023-11-06T04:10:34Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "https://github.com/simonw/datasette/blob/452a587e236ef642cbc6ae345b58767ea8420cb5/datasette/utils/asgi.py#L137-L139\r\n\r\nIn GET requests you can do `?foo=1&foo=2` - you can do the same in POST requests, but the `dict()` call here eliminates those duplicates.\r\n\r\nYou can't even try calling `post_body()` and implement your own custom parsing because of:\r\n- #2204", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2205/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1978022687, "node_id": "I_kwDOBm6k_c515jsf", "number": 2204, "title": "request.post_body() can only be called once", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-11-05T23:22:03Z", "updated_at": "2023-11-05T23:23:23Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "This code here:\r\n\r\nhttps://github.com/simonw/datasette/blob/452a587e236ef642cbc6ae345b58767ea8420cb5/datasette/utils/asgi.py#L127-L135\r\n\r\nIt consumes the messages, which means if you try to call it a second time you won't be able to get at the body.\r\n\r\nThis is efficient - we don't end up with a `request` object property with potentially megabytes of content that we never look at again - but it's inconvenient for cases like middleware or functions where we don't know if the body has been consumed yet or not.\r\n\r\nPotential solution: set `request._body` the first time it is called, and return that on subsequent calls.\r\n\r\nPotential optimization: only do this for bodies that are shorter than a certain threshold - maybe 1MB - and raise an exception if you attempt to call `post_body()` multiple times against one of those larger bodies.\r\n\r\nI'm a bit nervous about that option though, since it could result in errors that don't show up in testing but do show up in production.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2204/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 959137143, "node_id": "MDU6SXNzdWU5NTkxMzcxNDM=", "number": 1415, "title": "feature request: document minimum permissions for service account for cloudrun", "user": {"value": 536941, "label": "fgregg"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 4, "created_at": "2021-08-03T13:48:43Z", "updated_at": "2023-11-05T16:46:59Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": null, "body": "Thanks again for such a powerful project.\r\n\r\nFor deploying to cloudrun from github actions, I'd like to create a service account with minimal permissions.\r\n\r\nIt would be great to document what those minimum permission that need to be set in the IAM.\r\n\r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/1415/reactions\", \"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1977726056, "node_id": "I_kwDOBm6k_c514bRo", "number": 2203, "title": "custom plugin not seen as sql function", "user": {"value": 7113541, "label": "LyzardKing"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-11-05T10:30:19Z", "updated_at": "2023-11-05T10:30:19Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "Hi, I'm not sure if this is the right repo for this issue.\r\n\r\nI'm using datasette with the parquet (to read a duckdb), and jellyfish plugins. Both work perfectly.\r\n\r\nNow I need to create a simple plugin that uses the python rouge package and returns a similarity score (similarly to how the jellyfish plugin works).\r\nIf I create a custom plugin, even the example hello_world one, copied directly from the tutorial, I get the following error:\r\n```duckdb.duckdb.CatalogException: Catalog Error: Scalar Function with name hello_world does not exist!```\r\n\r\nSince the jellyfish plugin doesn't do anything more complex, I'm wondering if there is some other kind of issue with my setup.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2203/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1977155641, "node_id": "I_kwDOCGYnMM512QA5", "number": 601, "title": "Move plugin directory into documentation", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-11-04T04:07:52Z", "updated_at": "2023-11-04T04:07:52Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "https://github.com/simonw/sqlite-utils-plugins should be in the official documentation.\r\n\r\nI can use the same pattern as https://llm.datasette.io/en/stable/plugins/directory.html\r\n\r\nhttps://til.simonwillison.net/readthedocs/stable-docs", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/601/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1955676270, "node_id": "I_kwDOBm6k_c50kUBu", "number": 2201, "title": "Discord invite link is invalid", "user": {"value": 11708906, "label": "andrewsanchez"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-10-21T21:50:05Z", "updated_at": "2023-10-21T21:50:05Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "https://datasette.io/discord leads to https://discord.com/invite/ktd74dm5mw and returns the following:\r\n\r\n\"CleanShot\r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2201/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1163369515, "node_id": "I_kwDOBm6k_c5FV5wr", "number": 1655, "title": "query result page is using 400mb of browser memory 40x size of html page and 400x size of csv data", "user": {"value": 536941, "label": "fgregg"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 8, "created_at": "2022-03-09T00:56:40Z", "updated_at": "2023-10-17T21:53:17Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": null, "body": "[this page](https://labordata.bunkum.us/opdr-8335ea3?sql=with+most_recent_lu+as+%28%0D%0A++select%0D%0A++++*%0D%0A++from%0D%0A++++%28%0D%0A++++++select%0D%0A++++++++*%0D%0A++++++from%0D%0A++++++++lm_data%0D%0A++++++order+by%0D%0A++++++++f_num%2C%0D%0A++++++++receive_date+desc%0D%0A++++%29+t%0D%0A++group+by%0D%0A++++f_num%0D%0A%29%0D%0Aselect%0D%0A++aff_abbr+%7C%7C+coalesce%28%27+local+%27+%7C%7C+desig_num%2C+%27+%27+%7C%7C+unit_name%29+as+abbr_local_name%2C%0D%0A++coalesce%28%0D%0A++++regexp_match%28%27%28.*%3F%29%28%2C%3F+AFL-CIO%24%29%27%2C+union_name%29%2C%0D%0A++++regexp_match%28%27%28.*%3F%29%28+IND%24%29%27%2C+union_name%29%2C%0D%0A++++union_name%0D%0A++%29+%7C%7C+coalesce%28%27+local+%27+%7C%7C+desig_num%2C+%27+%27+%7C%7C+unit_name%29+as+full_local_name%2C%0D%0A++*%0D%0Afrom%0D%0A++most_recent_lu%0D%0Awhere+%28desig_num+IS+NOT+NULL+OR+unit_name+IS+NOT+NULL%29+AND+desig_name+%21%3D+%27HQ%27%0D%0Alimit%0D%0A++5000+offset+0)\r\n\r\nis using about 400 mb in firefox 97 on mac os x. if you download the html for the page, it's about 11mb and if you get the csv for the data its about 1mb.\r\n\r\nit's using over a 1G on chrome 99.\r\n\r\ni found this because, i was trying to figure out why editing the SQL was getting very slow.\r\n\r\n\r\n\r\n\r\n\r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/1655/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1943259395, "node_id": "I_kwDOEhK-wc5z08kD", "number": 16, "title": " time data '2014-11-21T11:44:12.000Z' does not match format '%Y%m%dT%H%M%SZ'", "user": {"value": 3746270, "label": "linonetwo"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-10-14T13:24:39Z", "updated_at": "2023-10-14T13:24:39Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "\r\n```\r\nevernote-to-sqlite enex evernote.db ./\u6211\u7684\u7b14\u8bb0.enex\r\nImporting from ENEX [#####-------------------------------] 14%\r\nTraceback (most recent call last):\r\n File \"/usr/local/bin/evernote-to-sqlite\", line 8, in \r\n sys.exit(cli())\r\n ^^^^^\r\n File \"/usr/local/lib/python3.11/site-packages/click/core.py\", line 1157, in __call__\r\n return self.main(*args, **kwargs)\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/usr/local/lib/python3.11/site-packages/click/core.py\", line 1078, in main\r\n rv = self.invoke(ctx)\r\n ^^^^^^^^^^^^^^^^\r\n File \"/usr/local/lib/python3.11/site-packages/click/core.py\", line 1688, in invoke\r\n return _process_result(sub_ctx.command.invoke(sub_ctx))\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/usr/local/lib/python3.11/site-packages/click/core.py\", line 1434, in invoke\r\n return ctx.invoke(self.callback, **ctx.params)\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/usr/local/lib/python3.11/site-packages/click/core.py\", line 783, in invoke\r\n return __callback(*args, **kwargs)\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/usr/local/lib/python3.11/site-packages/evernote_to_sqlite/cli.py\", line 31, in enex\r\n save_note(db, note)\r\n File \"/usr/local/lib/python3.11/site-packages/evernote_to_sqlite/utils.py\", line 46, in save_note\r\n \"created\": convert_datetime(created),\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/usr/local/lib/python3.11/site-packages/evernote_to_sqlite/utils.py\", line 111, in convert_datetime\r\n return datetime.datetime.strptime(s, \"%Y%m%dT%H%M%SZ\").isoformat()\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/usr/local/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/_strptime.py\", line 568, in _strptime_datetime\r\n tt, fraction, gmtoff_fraction = _strptime(data_string, format)\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/usr/local/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/_strptime.py\", line 349, in _strptime\r\n raise ValueError(\"time data %r does not match format %r\" %\r\nValueError: time data '2014-11-21T11:44:12.000Z' does not match format '%Y%m%dT%H%M%SZ'\r\n```\r\n\r\nenex is exported by evernote mac client ", "repo": {"value": 303218369, "label": "evernote-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/16/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1940346034, "node_id": "I_kwDOBm6k_c5zp1Sy", "number": 2199, "title": "Detailed upgrade instructions for metadata.yaml -> datasette.yaml", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": {"value": 3268330, "label": "Datasette 1.0"}, "comments": 7, "created_at": "2023-10-12T16:21:25Z", "updated_at": "2023-10-12T22:08:42Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "> `Exception: Datasette no longer accepts plugin configuration in --metadata. Move your \"plugins\" configuration blocks to a separate file - we suggest calling that datasette..json - and start Datasette with datasette -c datasette..json. See https://docs.datasette.io/en/latest/configuration.html for more details.`\r\n>\r\n> I think we should link directly to documentation that tells people how to perform this upgrade.\r\n\r\n_Originally posted by @simonw in https://github.com/simonw/datasette/issues/2190#issuecomment-1759947021_\r\n ", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2199/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1931794126, "node_id": "I_kwDOBm6k_c5zJNbO", "number": 2198, "title": "--load-extension=spatialite not working with Windows", "user": {"value": 363004, "label": "hcarter333"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-10-08T12:50:22Z", "updated_at": "2023-10-08T12:50:22Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "Using each of\r\n`python -m datasette counties.db -m metadata.yml --load-extension=SpatiaLite`\r\n\r\nand \r\n\r\n`python -m datasette counties.db --load-extension=\"C:\\Windows\\System32\\mod_spatialite.dll\"`\r\n\r\nand\r\n\r\n`python -m datasette counties.db --load-extension=C:\\Windows\\System32\\mod_spatialite.dll`\r\n\r\nI got the error:\r\n\r\n```\r\n File \"C:\\Users\\m3n7es\\AppData\\Local\\Packages\\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\\LocalCache\\local-packages\\Python311\\site-packages\\datasette\\database.py\", line 209, in in_thread\r\n self.ds._prepare_connection(conn, self.name)\r\n File \"C:\\Users\\m3n7es\\AppData\\Local\\Packages\\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\\LocalCache\\local-packages\\Python311\\site-packages\\datasette\\app.py\", line 596, in _prepare_connection\r\n conn.execute(\"SELECT load_extension(?, ?)\", [path, entrypoint])\r\nsqlite3.OperationalError: The specified module could not be found.\r\n\r\n```\r\n\r\nI finally tried modifying the code in app.py to read:\r\n\r\n```\r\n def _prepare_connection(self, conn, database):\r\n conn.row_factory = sqlite3.Row\r\n conn.text_factory = lambda x: str(x, \"utf-8\", \"replace\")\r\n if self.sqlite_extensions:\r\n conn.enable_load_extension(True)\r\n for extension in self.sqlite_extensions:\r\n # \"extension\" is either a string path to the extension\r\n # or a 2-item tuple that specifies which entrypoint to load.\r\n #if isinstance(extension, tuple):\r\n # path, entrypoint = extension\r\n # conn.execute(\"SELECT load_extension(?, ?)\", [path, entrypoint])\r\n #else:\r\n conn.execute(\"SELECT load_extension('C:\\Windows\\System32\\mod_spatialite.dll')\")\r\n\r\n```\r\nAt which point the counties example worked. \r\n\r\nIs there a correct way to install/use the extension on Windows? My method will cause issues if there's a second extension to be used.\r\n\r\nOn an unrelated note, my next step is to figure out how to write a query across the two loaded databases supplied from the command line:\r\n`python -m datasette rm_toucans_23_10_07.db counties.db -m metadata.yml --load-extension=SpatiaLite`\r\n\r\n\r\n\r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2198/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1426379903, "node_id": "PR_kwDOBm6k_c5BtJNn", "number": 1870, "title": "don't use immutable=1, only mode=ro", "user": {"value": 536941, "label": "fgregg"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 7, "created_at": "2022-10-27T23:33:04Z", "updated_at": "2023-10-03T19:12:37Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": "simonw/datasette/pulls/1870", "body": "Opening db files in immutable mode sometimes leads to the file being mutated, which causes duplication in the docker image layers: see #1836, #1480\r\n\r\nThat this happens in \"immutable\" mode is surprising, because the sqlite docs say that setting this should open the database as read only. \r\n\r\nhttps://www.sqlite.org/c3ref/open.html\r\n\r\n> immutable: The immutable parameter is a boolean query parameter that indicates that the database file is stored on read-only media. When immutable is set, SQLite assumes that the database file cannot be changed, even by a process with higher privilege, and so the database is opened read-only and all locking and change detection is disabled. Caution: Setting the immutable property on a database file that does in fact change can result in incorrect query results and/or [SQLITE_CORRUPT](https://www.sqlite.org/rescode.html#corrupt) errors. See also: [SQLITE_IOCAP_IMMUTABLE](https://www.sqlite.org/c3ref/c_iocap_atomic.html).\r\n\r\nPerhaps this is a bug in sqlite?\r\n\r\n\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://datasette--1870.org.readthedocs.build/en/1870/\n\r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/1870/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1920416843, "node_id": "I_kwDOCGYnMM5ydzxL", "number": 597, "title": "sqlite-utils insert-files should be able to convert fields", "user": {"value": 1737541, "label": "grimnight"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-09-30T22:20:47Z", "updated_at": "2023-09-30T22:20:47Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "Currently using both `insert-files` and `convert` is needed in order to create sqlar files, it would be more convenient if it could be done with just one command.\r\n\r\n```shell\r\n~\r\n\u276f cat test.py\r\nimport os\r\n\r\nclass Example:\r\n def __init__(self, arg1, arg2):\r\n self.arg1 = arg1\r\n\r\n~\r\n\u276f sqlite-utils insert-files test.sqlar sqlar test.py -c name:name -c data:content -c mode:mode -c mtime:mtime -c sz:size --pk=name\r\n [####################################] 100%\r\n\r\n~\r\n\u276f sqlite-utils convert test.sqlar sqlar data \"zlib.compress(value)\" --import=zlib --where \"name = 'test.py'\"\r\n[####################################] 100%\r\n\r\n~\r\n\u276f cat test.py | sqlite-utils convert test.sqlar sqlar data \"zlib.compress(sys.stdin.buffer.read())\" --import=zlib --import=sys --where \"name = 'test.py'\" # Alternative way\r\n [####################################] 100%\r\n\r\n~\r\n\u276f sqlite3 test.sqlar \"SELECT hex(data) FROM sqlar WHERE name = 'test.py';\" | python3 -c \"import sys, zlib; sys.stdout.buffer.write(zlib.decompress(bytes.fromhex(sys.stdin.read())))\"\r\nimport os\r\n\r\nclass Example:\r\n def __init__(self, arg1, arg2):\r\n self.arg1 = arg1\r\n\r\n~\r\n\u276f rm test.py\r\n\r\n~\r\n\u276f sqlar -l test.sqlar\r\ntest.py\r\n\r\n~\r\n\u276f sqlar -x test.sqlar\r\n\r\n~\r\n\u276f cat test.py\r\nimport os\r\n\r\nclass Example:\r\n def __init__(self, arg1, arg2):\r\n self.arg1 = arg1\r\n\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/597/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 777333388, "node_id": "MDU6SXNzdWU3NzczMzMzODg=", "number": 1168, "title": "Mechanism for storing metadata in _metadata tables", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 21, "created_at": "2021-01-01T18:47:27Z", "updated_at": "2023-09-28T18:29:05Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "_Original title: Perhaps metadata should all live in a `_metadata` in-memory database_\r\n\r\nInspired by #1150 - metadata should be exposed as an API, and for large Datasette instances that API may need to be paginated. So why not expose it through an in-memory database table?\r\n\r\nOne catch to this: plugins. #860 aims to add a plugin hook for metadata. But if the metadata comes from an in-memory table, how do the plugins interact with it?\r\n\r\nThe need to paginate over metadata does make a plugin hook that returns metadata for an individual table seem less wise, since we don't want to have to do 10,000 plugin hook invocations to show a list of all metadata.\r\n\r\nIf those plugins write directly to the in-memory table how can their contributions survive the server restarting?", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/1168/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1865572575, "node_id": "PR_kwDOBm6k_c5Yt2eO", "number": 2155, "title": "Fix hupper.start_reloader entry point", "user": {"value": 79087, "label": "cadeef"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2023-08-24T17:14:08Z", "updated_at": "2023-09-27T18:44:02Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "simonw/datasette/pulls/2155", "body": "Update hupper's entry point so that click commands are processed properly.\r\n\r\nFixes #2123\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://datasette--2155.org.readthedocs.build/en/2155/\n\r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2155/reactions\", \"total_count\": 2, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 2, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 944846776, "node_id": "MDU6SXNzdWU5NDQ4NDY3NzY=", "number": 297, "title": "Option for importing CSV data using the SQLite .import mechanism", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 23, "created_at": "2021-07-14T22:36:41Z", "updated_at": "2023-09-22T20:49:52Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "As seen in https://til.simonwillison.net/sqlite/import-csv - `.mode csv` and then `.import school.csv schools` is hugely faster than importing via `sqlite-utils insert` and doing the work in Python - but it can only be implemented by shelling out to the `sqlite3` CLI tool, it's not functionality that is exposed to the Python `sqlite3` module.\r\n\r\nAn option to use this would be useful - maybe something like this:\r\n\r\n sqlite-utils insert blah.db blah blah.csv --fast", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/297/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1907765514, "node_id": "I_kwDOBm6k_c5xtjEK", "number": 2195, "title": "`datasette publish` needs support for the new config/metadata split", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 9, "created_at": "2023-09-21T21:08:12Z", "updated_at": "2023-09-21T22:57:48Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "> ... which raises the challenge that `datasette publish` doesn't yet know what to do with a config file!\r\n\r\n_Originally posted by @simonw in https://github.com/simonw/datasette/issues/2194#issuecomment-1730259871_\r\n ", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2195/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1825007061, "node_id": "I_kwDOBm6k_c5sx2XV", "number": 2123, "title": "datasette serve when invoked with --reload interprets the serve command as a file", "user": {"value": 79087, "label": "cadeef"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2023-07-27T19:07:22Z", "updated_at": "2023-09-18T13:02:46Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "When running `datasette serve` with the `--reload` flag, the serve command is picked up as a file argument:\r\n\r\n```\r\n$ datasette serve --reload test_db\r\nStarting monitor for PID 13574.\r\nError: Invalid value for '[FILES]...': Path 'serve' does not exist.\r\nPress ENTER or change a file to reload.\r\n```\r\n\r\nIf a 'serve' file is created it launches properly (albeit with an empty database called serve):\r\n\r\n```\r\n$ touch serve; datasette serve --reload test_db\r\nStarting monitor for PID 13628.\r\nINFO: Started server process [13628]\r\nINFO: Waiting for application startup.\r\nINFO: Application startup complete.\r\nINFO: Uvicorn running on http://127.0.0.1:8001 (Press CTRL+C to quit)\r\n```\r\n\r\nVersion (running from HEAD on main):\r\n\r\n```\r\n$ datasette --version\r\ndatasette, version 1.0a2\r\n```\r\n\r\nThis issue appears to have existed for awhile as https://github.com/simonw/datasette/issues/1380#issuecomment-953366110 mentions the error in a different context.\r\n\r\nI'm happy to debug and land a patch if it's welcome.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2123/reactions\", \"total_count\": 2, \"+1\": 2, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1900026059, "node_id": "I_kwDOBm6k_c5xQBjL", "number": 2188, "title": "Plugin Hooks for \"compile to SQL\" languages", "user": {"value": 15178711, "label": "asg017"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2023-09-18T01:37:15Z", "updated_at": "2023-09-18T06:58:53Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": null, "body": "There's a ton of tools/languages that compile to SQL, which may be nice in Datasette. Some examples:\r\n\r\n- Logica https://logica.dev\r\n- PRQL https://prql-lang.org\r\n- Malloy, but not sure if it works with SQLite? https://github.com/malloydata/malloy\r\n\r\nIt would be cool if plugins could extend Datasette to use these languages, in both the code editor and API usage.\r\n\r\nA few things I'd imagine a `datasette-prql` or `datasette-logica` plugin would do:\r\n\r\n- `prql=` instead of `sql=`\r\n- Code editor support (syntax highlighting, autocomplete)\r\n- Hide/show SQL", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2188/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 787098345, "node_id": "MDU6SXNzdWU3ODcwOTgzNDU=", "number": 1191, "title": "Ability for plugins to collaborate when adding extra HTML to blocks in default templates", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": {"value": 3268330, "label": "Datasette 1.0"}, "comments": 12, "created_at": "2021-01-15T18:18:51Z", "updated_at": "2023-09-18T06:55:52Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "Sometimes a plugin may want to add content to an existing default template - for example `datasette-search-all` adds a new search box at the top of `index.html`. I also want `datasette-upload-csvs` to add a CTA on the `database.html` page: https://github.com/simonw/datasette-upload-csvs/issues/18\r\n\r\nCurrently plugins can do this by providing a new version of the `index.html` template - but if multiple plugins try to do that only one of them will succeed.\r\n\r\nIt would be better if there were known areas of those templates which plugins could add additional content to, such that multiple plugins can use the same spot.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/1191/reactions\", \"total_count\": 4, \"+1\": 4, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1899310542, "node_id": "I_kwDOBm6k_c5xNS3O", "number": 2187, "title": "Datasette for serving JSON only", "user": {"value": 19705106, "label": "geofinder"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-09-16T05:48:29Z", "updated_at": "2023-09-16T05:48:29Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "Hi, is there any way to use datasette for serving json only without displaying webpage? I've tried to search about this in documentation but didn't get any information", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2187/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1898927976, "node_id": "I_kwDOBm6k_c5xL1do", "number": 2186, "title": "Mechanism for register_output_renderer hooks to access full count", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": {"value": 3268330, "label": "Datasette 1.0"}, "comments": 2, "created_at": "2023-09-15T18:57:54Z", "updated_at": "2023-09-15T19:27:59Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "The cause of this bug:\r\n- https://github.com/simonw/datasette-export-notebook/issues/17\r\n\r\nIs that `datasette-export-notebook` was consulting `data[\"filtered_table_rows_count\"]` in the render output plugin function in order to show the total number of rows that would be exported.\r\n\r\nThat field is no longer available by default - the `\"count\"` field is only available if `?_extra=count` was passed.\r\n\r\nIt would be useful if plugins like this could access the total count on demand, should they need to.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2186/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1895266807, "node_id": "I_kwDOBm6k_c5w93n3", "number": 2184, "title": "Design decision - should configuration be exposed at /-/config ?", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-09-13T21:07:08Z", "updated_at": "2023-09-13T21:07:38Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "> This made me think. That `{\"$env\": \"ENV_VAR\"}` hack was introduced back here:\r\n>\r\n> - https://github.com/simonw/datasette/issues/538\r\n>\r\n> The problem it was solving was that metadata was visible to everyone with access to the instance at `/-/metadata` but plugins clearly needed a way to set secret settings.\r\n>\r\n> Now that this stuff is moving to config, we have some decisions to make:\r\n>\r\n> 1. Add `/-/config` to let people see the configuration of their instance, and keep the `$env` trick for secret settings.\r\n> 2. Say all configuration aside from metadata is secret and make `$env` optional or ditch it entirely.\r\n> 3. Allow plugins to announce which of their configuration options are secret so we can automatically redact them from `/-/config`\r\n>\r\n> I've found `/-/metadata` extraordinarily useful as a user of Datasette - it really helps me understand exactly what's going on if I run into any problems with a plugin, if I can quickly check what the settings look like.\r\n>\r\n> So I'm leaning towards option 1 or 3.\r\n\r\n_Originally posted by @simonw in https://github.com/simonw/datasette/pull/2183#discussion_r1325076924_\r\n\r\nAlso refs:\r\n- #2093", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2184/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1891614971, "node_id": "I_kwDOCGYnMM5wv8D7", "number": 594, "title": "Represent compound foreign keys in table.foreign_keys output", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2023-09-12T03:48:24Z", "updated_at": "2023-09-12T03:51:13Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "Given this schema:\r\n```sql\r\nCREATE TABLE departments (\r\n campus_name TEXT NOT NULL,\r\n dept_code TEXT NOT NULL,\r\n dept_name TEXT,\r\n PRIMARY KEY (campus_name, dept_code)\r\n);\r\nCREATE TABLE courses (\r\n course_code TEXT PRIMARY KEY,\r\n course_name TEXT,\r\n campus_name TEXT NOT NULL,\r\n dept_code TEXT NOT NULL,\r\n FOREIGN KEY (campus_name, dept_code) REFERENCES departments(campus_name, dept_code)\r\n);\r\n```\r\nThe output of `db[\"courses\"].foreign_keys` right now is:\r\n```\r\n[ForeignKey(table='courses', column='campus_name', other_table='departments', other_column='campus_name'),\r\n ForeignKey(table='courses', column='dept_code', other_table='departments', other_column='dept_code')]\r\n```\r\nWhich suggests two normal foreign keys, not one compound foreign key.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/594/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1781530343, "node_id": "I_kwDOBm6k_c5qL_7n", "number": 2093, "title": "Proposal: Combine settings, metadata, static, etc. into a single `datasette.yaml` File", "user": {"value": 15178711, "label": "asg017"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 8, "created_at": "2023-06-29T21:18:23Z", "updated_at": "2023-09-11T20:19:32Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": null, "body": "Very often I get tripped up when trying to configure my Datasette instances. For example: if I want to change the port my app listen too, do I do that with a CLI flag, a `--setting` flag, inside `metadata.json`, or an env var? If I want to up the time limit of SQL statements, is that under `metadata.json` or a setting? Where does my plugin configuration go?\r\n\r\nNormally I need to look it up in Datasette docs, and I quickly find my answer, but the number of places where \"config\" goes it overwhelming.\r\n\r\n- Flat CLI flags like `--port`, `--host`, `--cors`, etc.\r\n- `--setting`, like `default_page_size`, `sql_time_limit_ms` etc\r\n- Inside `metadata.json`, including plugin configuration\r\n\r\nTypically my Datasette deploys are extremely long shell commands, with multiple `--setting` and other CLI flags.\r\n\r\n## Proposal: Consolidate all \"config\" into `datasette.toml`\r\n\r\nI propose that we add a new `datasette.toml` that combines \"settings\", \"metadata\", and other common CLI flags like `--port` and `--cors` into a single file. It would be similar to \"Cargo.toml\" in Rust projects, \"package.json\" in Node projects, and \"pyproject.toml\" in Python, etc.\r\n\r\nA sample of what it could look like:\r\n\r\n```toml\r\n# \"top level\" configuration that are currently CLI flags on `datasette serve`\r\n[config]\r\nport = 8020\r\nhost = \"0.0.0.0\"\r\ncors = true\r\n\r\n# replaces multiple `--setting` flags\r\n[settings]\r\nbase_url = \"/app/datasette/\"\r\ndefault_allow_sql = true\r\nsql_time_limit_ms = 3500\r\n\r\n# replaces `metadata.json`.\r\n# The contents of datasette-metadata.json could be defined in this file instead, but supporting separate files is nice (since those are easy to machine-generate)\r\n[metadata]\r\ninclude=\"./datasette-metadata.json\"\r\n\r\n# plugin-specific \r\n[plugins]\r\n[plugins.datasette-auth-github]\r\nclient_id = {env = \"DATASETTE_AUTH_GITHUB_CLIENT_ID\"}\r\nclient_secret = {env = \"GITHUB_CLIENT_SECRET\"}\r\n\r\n[plugins.datasette-cluster-map]\r\n\r\nlatitude_column = \"lat\"\r\nlongitude_column = \"lon\"\r\n```\r\n\r\n## Pros\r\n- Instead of multiple files and CLI flags, everything could be in one tidy file\r\n- Editing config in a separate file is easier than editing CLI flags, since you don't have to kill a process + edit a command every time\r\n- New users will know \"just edit my `datasette.toml` instead of needing to learn metadata + settings + CLI flags\r\n- Better dev experience for multiple environment. For example, could have `datasette -c datasette-dev.toml` for local dev environments (enables SQL, debug plugins, long timeouts, etc.), and a `datasette -c datasette-prod.toml` for \"production\" (lower timeouts, less plugins, monitoring plugins, etc.)\r\n\r\n## Cons\r\n- Yet another config-management system. Now Datasette users will need to know about metadata, settings, CLI flags, _and_ `datasette.toml`. However with enough documentation + announcements + examples, I think we can get ahead of it.\r\n- If toml is chosen, would need to add a toml parser for Python version <3.11\r\n- Multiple sources of config require priority. For example: Would `--setting default_allow_sql off` override the value inside `[settings]`? What about `--port`? \r\n\r\n## Other Notes\r\n\r\n### Toml\r\n\r\nI chose toml over json because toml supports comments. I chose toml over yaml because Python 3.11 has builtin support for it. I also find toml easier to work with since it doesn't have the odd \"gotchas\" that YAML has (\"ex `3.10` resolving to `3.1`, Norway `NO` resolving to `false`, etc.). It also mimics `pyproject.toml` which is nice. Happy to change my mind about this however\r\n\r\n\r\n### Plugin config will be difficult\r\n\r\nPlugin config is currently in `metadata.json` in two places:\r\n\r\n1. Top level, under `\"plugins.[plugin-name]\"`. This fits well into `datasette.toml` as `[plugins.plugin-name]`\r\n2. Table level, under `\"databases.[db-name].tables.[table-name].plugins.[plugin-name]`. This doesn't fit that well into `datasette.toml`, unless it's nested under `[metadata]`?\r\n\r\n### Extensions, static, one-off plugins?\r\n\r\nWe could also include equivalents of `--plugins-dir`, `--static`, and `--load-extension` into `datasette.toml`, but I'd imagine there's a few security concerns there to think through. \r\n\r\n\r\n### Explicitly list with plugins to use?\r\n\r\nI believe Datasette by default will load all install plugins on startup, but maybe `datasette.toml` can specify a list of plugins to use? For example, a dev version of `datasette.toml` can specify `datasette-pretty-traces`, but the prod version can leave it out", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2093/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1876353656, "node_id": "I_kwDOBm6k_c5v1uJ4", "number": 2168, "title": "Consider a request/response wrapping hook slightly higher level than asgi_wrapper()", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 6, "created_at": "2023-08-31T21:42:04Z", "updated_at": "2023-09-10T17:54:08Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "There's a long justification for why this might be needed here:\r\n- https://github.com/simonw/datasette-auth-tokens/issues/10#issuecomment-1701820001\r\n\r\nShort version: it would be neat if it was possible to stash some data on the `request` object such that a later plugin/middleware-type-thing could use that to influence the final returned response - similar to the kinds of things you can do with Django middleware.\r\n\r\nThe `asgi_wrapper()` mechanism doesn't have access to the request or response objects - it gets `scope` and can mess around with `receive` and `send`, but those are pretty low-level primitives.\r\n\r\nSince Datasette has well-defined `request` and `response` objects now it might be nice to have a middleware layer that can manipulate those directly.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2168/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1888477283, "node_id": "I_kwDOC8SPRc5wj-Bj", "number": 38, "title": "Run `rebuild_fts` after building the index", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-09-08T23:17:45Z", "updated_at": "2023-09-08T23:17:45Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "In:\r\n- https://github.com/simonw/datasette.io/issues/152#issuecomment-1712323347\r\n\r\nThis turned out to be the fix:\r\n\r\n```bash\r\ndogsheep-beta index dogsheep-index.db templates/dogsheep-beta.yml\r\nsqlite-utils rebuild-fts dogsheep-index.db\r\n```", "repo": {"value": 197431109, "label": "dogsheep-beta"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-beta/issues/38/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 954546309, "node_id": "MDExOlB1bGxSZXF1ZXN0Njk4NDIzNjY3", "number": 8, "title": "Add Gmail takeout mbox import (v2)", "user": {"value": 28565, "label": "maxhawkins"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 7, "created_at": "2021-07-28T07:05:32Z", "updated_at": "2023-09-08T01:22:49Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/google-takeout-to-sqlite/pulls/8", "body": "WIP\r\n\r\nThis PR builds on #5 to continue implementing gmail import support.\r\n\r\nBuilding on @UtahDave's work, these commits add a few performance and bug fixes:\r\n\r\n* Decreased memory overhead for import by manually parsing mbox headers.\r\n* Fixed error where some messages in the mbox would yield a row with NULL in all columns.\r\n\r\nI will send more commits to fix any errors I encounter as I run the importer on my personal takeout data.", "repo": {"value": 206649770, "label": "google-takeout-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1884330740, "node_id": "PR_kwDOBm6k_c5ZszDF", "number": 2174, "title": "Use $DATASETTE_INTERNAL in absence of --internal", "user": {"value": 15178711, "label": "asg017"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2023-09-06T16:07:15Z", "updated_at": "2023-09-08T00:46:13Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": "simonw/datasette/pulls/2174", "body": "#refs 2157, specifically [this comment](https://github.com/simonw/datasette/issues/2157#issuecomment-1700291967)\r\n\r\nPassing in `--internal my_internal.db` over and over again can get repetitive. \r\n\r\nThis PR adds a new configurable env variable `DATASETTE_INTERNAL_DB_PATH`. If it's defined, then it takes place as the path to the internal database. Users can still overwrite this behavior by passing in their own `--internal internal.db` flag.\r\n\r\nIn draft mode for now, needs tests and documentation. \r\n\r\nSide note: Maybe we can have a sections in the docs that lists all the \"configuration environment variables\" that Datasette respects? I did a quick grep and found:\r\n\r\n- `DATASETTE_LOAD_PLUGINS`\r\n- `DATASETTE_SECRETS`\r\n\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://datasette--2174.org.readthedocs.build/en/2174/\n\r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2174/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1010112818, "node_id": "I_kwDOBm6k_c48NRky", "number": 1479, "title": "Win32 \"used by another process\" error with datasette publish", "user": {"value": 76450761, "label": "kirajano"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 7, "created_at": "2021-09-28T19:12:00Z", "updated_at": "2023-09-07T02:14:16Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "I unfortunately was not successful to deploy to fly.io. Please see the details above of the three scenarios that I took. I am also new to datasette.\r\n\r\nFailed to deploy. Attaching logs:\r\n1. Tried with an app created via `flyctl apps create frosty-fog-8565` and the ran `datasette publish fly covid.db --app frosty-fog-8565` \r\n``` \r\nDeploying frosty-fog-8565\r\n==> Validating app configuration\r\n--> Validating app configuration done\r\nServices\r\nTCP 80/443 \u21e2 8080\r\n\r\nError error connecting to docker: An unknown error occured.\r\n\r\nTraceback (most recent call last):\r\n File \"c:\\users\\grott\\anaconda3\\lib\\runpy.py\", line 193, in _run_module_as_main\r\n \"__main__\", mod_spec)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\runpy.py\", line 85, in _run_code\r\n exec(code, run_globals)\r\n File \"C:\\Users\\grott\\Anaconda3\\Scripts\\datasette.exe\\__main__.py\", line 7, in \r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\click\\core.py\", line 829, in __call__\r\n return self.main(*args, **kwargs)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\click\\core.py\", line 782, in main\r\n rv = self.invoke(ctx)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\click\\core.py\", line 1259, in invoke\r\n return _process_result(sub_ctx.command.invoke(sub_ctx))\r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\click\\core.py\", line 1259, in invoke\r\n return _process_result(sub_ctx.command.invoke(sub_ctx))\r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\click\\core.py\", line 1066, in invoke\r\n return ctx.invoke(self.callback, **ctx.params)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\click\\core.py\", line 610, in invoke\r\n return callback(*args, **kwargs)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\datasette_publish_fly\\__init__.py\", line 156, in fly\r\n \"--remote-only\",\r\n File \"c:\\users\\grott\\anaconda3\\lib\\contextlib.py\", line 119, in __exit__\r\n next(self.gen)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\datasette\\utils\\__init__.py\", line 451, in temporary_docker_directory\r\n tmp.cleanup()\r\n File \"c:\\users\\grott\\anaconda3\\lib\\tempfile.py\", line 811, in cleanup\r\n _shutil.rmtree(self.name)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\shutil.py\", line 516, in rmtree\r\n return _rmtree_unsafe(path, onerror)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\shutil.py\", line 395, in _rmtree_unsafe\r\n _rmtree_unsafe(fullname, onerror)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\shutil.py\", line 404, in _rmtree_unsafe\r\n onerror(os.rmdir, path, sys.exc_info())\r\n File \"c:\\users\\grott\\anaconda3\\lib\\shutil.py\", line 402, in _rmtree_unsafe\r\n os.rmdir(path)\r\nPermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\\\Users\\\\grott\\\\AppData\\\\Local\\\\Temp\\\\tmpgcm8cz66\\\\frosty-fog-8565'\r\n```\r\n\r\n2. Tried also with an app that gets autogenerate when running `flyctl launch`. This also generates the .toml file. Ran then `datasette publish fly covid.db --app dark-feather-168` **but different error now**\r\n```Deploying dark-feather-168\r\n==> Validating app configuration\r\n\r\nError not possible to validate configuration: server returned Post \"https://api.fly.io/graphql\": unexpected EOF\r\n\r\nTraceback (most recent call last):\r\n File \"c:\\users\\grott\\anaconda3\\lib\\runpy.py\", line 193, in _run_module_as_main \r\n \"__main__\", mod_spec)\r\n exec(code, run_globals)\r\n File \"C:\\Users\\grott\\Anaconda3\\Scripts\\datasette.exe\\__main__.py\", line 7, in \r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\click\\core.py\", line 829, in __call__\r\n return self.main(*args, **kwargs)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\click\\core.py\", line 782, in main\r\n rv = self.invoke(ctx)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\click\\core.py\", line 1259, in invoke\r\n return _process_result(sub_ctx.command.invoke(sub_ctx))\r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\click\\core.py\", line 1259, in invoke\r\n return _process_result(sub_ctx.command.invoke(sub_ctx))\r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\click\\core.py\", line 1066, in invoke\r\n return ctx.invoke(self.callback, **ctx.params)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\click\\core.py\", line 610, in invoke\r\n return callback(*args, **kwargs)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\datasette_publish_fly\\__init__.py\", line 156, in fly\r\n \"--remote-only\",\r\n File \"c:\\users\\grott\\anaconda3\\lib\\contextlib.py\", line 119, in __exit__\r\n next(self.gen)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\datasette\\utils\\__init__.py\", line 451, in temporary_docker_directory\r\n tmp.cleanup()\r\n File \"c:\\users\\grott\\anaconda3\\lib\\tempfile.py\", line 811, in cleanup\r\n _shutil.rmtree(self.name)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\shutil.py\", line 516, in rmtree\r\n return _rmtree_unsafe(path, onerror)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\shutil.py\", line 395, in _rmtree_unsafe\r\n _rmtree_unsafe(fullname, onerror)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\shutil.py\", line 404, in _rmtree_unsafe\r\n onerror(os.rmdir, path, sys.exc_info())\r\n File \"c:\\users\\grott\\anaconda3\\lib\\shutil.py\", line 402, in _rmtree_unsafe\r\n os.rmdir(path)\r\nPermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\\\Users\\\\grott\\\\AppData\\\\Local\\\\Temp\\\\tmpnoyewcre\\\\dark-feather-168'\r\n```\r\n\r\nThese are also the contents of the generated **.toml file** in 2 scenario:\r\n\r\n```\r\n# fly.toml file generated for dark-feather-168 on 2021-09-28T20:35:44+02:00\r\n\r\napp = \"dark-feather-168\"\r\n\r\nkill_signal = \"SIGINT\"\r\nkill_timeout = 5\r\nprocesses = []\r\n\r\n[env]\r\n\r\n[experimental]\r\n allowed_public_ports = []\r\n auto_rollback = true\r\n\r\n[[services]]\r\n http_checks = []\r\n internal_port = 8080\r\n processes = [\"app\"]\r\n protocol = \"tcp\"\r\n script_checks = []\r\n\r\n [services.concurrency]\r\n hard_limit = 25\r\n soft_limit = 20\r\n type = \"connections\"\r\n\r\n [[services.ports]]\r\n handlers = [\"http\"]\r\n port = 80\r\n\r\n [[services.ports]]\r\n handlers = [\"tls\", \"http\"]\r\n port = 443\r\n\r\n [[services.tcp_checks]]\r\n grace_period = \"1s\"\r\n interval = \"15s\"\r\n restart_limit = 6\r\n timeout = \"2s\"\r\n```\r\n\r\n3. But also trying `datasette package covid.db` to create a local DOCKERFILE to later try to push it via `flyctl deploy` fails as well.\r\n\r\n```[+] Building 147.3s (11/11) FINISHED\r\n => [internal] load build definition from Dockerfile 0.2s \r\n => => transferring dockerfile: 396B 0.0s \r\n => [internal] load .dockerignore 0.1s \r\n => => transferring context: 2B 0.0s \r\n => [internal] load metadata for docker.io/library/python:3.8 4.7s \r\n => [auth] library/python:pull token for registry-1.docker.io 0.0s \r\n => [internal] load build context 0.1s \r\n => => transferring context: 82.37kB 0.0s \r\n => [1/5] FROM docker.io/library/python:3.8@sha256:530de807b46a11734e2587a784573c12c5034f2f14025f838589e6c0e3 108.3s \r\n => => resolve docker.io/library/python:3.8@sha256:530de807b46a11734e2587a784573c12c5034f2f14025f838589e6c0e3b5 0.0s \r\n => => sha256:56182bcdf4d4283aa1f46944b4ef7ac881e28b4d5526720a4e9ba03a4730846a 2.22kB / 2.22kB 0.0s \r\n => => sha256:955615a668ce169f8a1443fc6b6e6215f43fe0babfb4790712a2d3171f34d366 54.93MB / 54.93MB 21.6s \r\n => => sha256:911ea9f2bd51e53a455297e0631e18a72a86d7e2c8e1807176e80f991bde5d64 10.87MB / 10.87MB 15.5s \r\n => => sha256:530de807b46a11734e2587a784573c12c5034f2f14025f838589e6c0e3b5c5b6 1.86kB / 1.86kB 0.0s \r\n => => sha256:ff08f08727e50193dcf499afc30594c47e70cc96f6fcfd1a01240524624264d0 8.65kB / 8.65kB 0.0s \r\n => => sha256:2756ef5f69a5190f4308619e0f446d95f5515eef4a814dbad0bcebbbbc7b25a8 5.15MB / 5.15MB 6.4s \r\n => => sha256:27b0a22ee906271a6ce9ddd1754fdd7d3b59078e0b57b6cc054c7ed7ac301587 54.57MB / 54.57MB 37.7s \r\n => => sha256:8584d51a9262f9a3a436dea09ba40fa50f85802018f9bd299eee1bf538481077 196.45MB / 196.45MB 82.3s \r\n => => sha256:524774b7d3638702fe9ae0ea3fcfb81b027dfd75cc2fc14f0119e764b9543d58 6.29MB / 6.29MB 26.6s \r\n => => extracting sha256:955615a668ce169f8a1443fc6b6e6215f43fe0babfb4790712a2d3171f34d366 5.4s \r\n => => sha256:9460f6b75036e38367e2f27bb15e85777c5d6cd52ad168741c9566186415aa26 16.81MB / 16.81MB 40.5s \r\n => => extracting sha256:2756ef5f69a5190f4308619e0f446d95f5515eef4a814dbad0bcebbbbc7b25a8 0.6s \r\n => => extracting sha256:911ea9f2bd51e53a455297e0631e18a72a86d7e2c8e1807176e80f991bde5d64 0.6s \r\n => => sha256:9bc548096c181514aa1253966a330134d939496027f92f57ab376cd236eb280b 232B / 232B 40.1s \r\n => => extracting sha256:27b0a22ee906271a6ce9ddd1754fdd7d3b59078e0b57b6cc054c7ed7ac301587 5.8s \r\n => => sha256:1d87379b86b89fd3b8bb1621128f00c8f962756e6aaaed264ec38db733273543 2.35MB / 2.35MB 41.8s \r\n => => extracting sha256:8584d51a9262f9a3a436dea09ba40fa50f85802018f9bd299eee1bf538481077 18.8s \r\n => => extracting sha256:524774b7d3638702fe9ae0ea3fcfb81b027dfd75cc2fc14f0119e764b9543d58 1.2s \r\n => => extracting sha256:9460f6b75036e38367e2f27bb15e85777c5d6cd52ad168741c9566186415aa26 2.9s \r\n => => extracting sha256:9bc548096c181514aa1253966a330134d939496027f92f57ab376cd236eb280b 0.0s \r\n => => extracting sha256:1d87379b86b89fd3b8bb1621128f00c8f962756e6aaaed264ec38db733273543 0.8s \r\n => [2/5] COPY . /app 2.3s \r\n => [3/5] WORKDIR /app 0.2s \r\n => [4/5] RUN pip install -U datasette 26.9s \r\n => [5/5] RUN datasette inspect covid.db --inspect-file inspect-data.json 3.1s\r\n => exporting to image 1.2s \r\n => => exporting layers 1.2s \r\n => => writing image sha256:b5db0c205cd3454c21fbb00ecf6043f261540bcf91c2dfc36d418f1a23a75d7a 0.0s\r\n\r\nUse 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them\r\nTraceback (most recent call last):\r\n \"__main__\", mod_spec)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\runpy.py\", line 85, in _run_code\r\n exec(code, run_globals)\r\n File \"C:\\Users\\grott\\Anaconda3\\Scripts\\datasette.exe\\__main__.py\", line 7, in \r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\click\\core.py\", line 829, in __call__\r\n return self.main(*args, **kwargs)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\click\\core.py\", line 782, in main\r\n rv = self.invoke(ctx)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\click\\core.py\", line 1259, in invoke\r\n return _process_result(sub_ctx.command.invoke(sub_ctx))\r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\click\\core.py\", line 1066, in invoke\r\n return ctx.invoke(self.callback, **ctx.params)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\click\\core.py\", line 610, in invoke\r\n return callback(*args, **kwargs)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\datasette\\cli.py\", line 283, in package\r\n call(args)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\contextlib.py\", line 119, in __exit__\r\n next(self.gen)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\site-packages\\datasette\\utils\\__init__.py\", line 451, in temporary_docker_directory\r\n tmp.cleanup()\r\n File \"c:\\users\\grott\\anaconda3\\lib\\tempfile.py\", line 811, in cleanup\r\n _shutil.rmtree(self.name)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\shutil.py\", line 516, in rmtree\r\n return _rmtree_unsafe(path, onerror)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\shutil.py\", line 395, in _rmtree_unsafe\r\n _rmtree_unsafe(fullname, onerror)\r\n File \"c:\\users\\grott\\anaconda3\\lib\\shutil.py\", line 404, in _rmtree_unsafe\r\n onerror(os.rmdir, path, sys.exc_info())\r\n File \"c:\\users\\grott\\anaconda3\\lib\\shutil.py\", line 402, in _rmtree_unsafe\r\n os.rmdir(path)\r\nPermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\\\Users\\\\grott\\\\AppData\\\\Local\\\\Temp\\\\tmpkb27qid3\\\\datasette'```", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/1479/reactions\", \"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1884499674, "node_id": "PR_kwDODFE5qs5ZtYMc", "number": 13, "title": "use poetry for packages, asdf for versioning, and gh actions for ci", "user": {"value": 150855, "label": "iloveitaly"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-09-06T17:59:16Z", "updated_at": "2023-09-06T17:59:16Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/google-takeout-to-sqlite/pulls/13", "body": "- build: use poetry for package management, asdf for python version\n- build: cleanup poetry config, add keywords, ignore dist\n- ci: migrate circleci to gh actions\n- fix: dup method definition\n", "repo": {"value": 206649770, "label": "google-takeout-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/13/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1884408624, "node_id": "I_kwDOBm6k_c5wUcsw", "number": 2177, "title": "Move schema tables from _internal to _catalog", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2023-09-06T16:58:33Z", "updated_at": "2023-09-06T17:04:30Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "This came up in discussion over:\r\n- https://github.com/simonw/datasette/pull/2174\r\n\r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2177/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1880968405, "node_id": "PR_kwDOJHON9s5ZhYny", "number": 14, "title": "fix: fix the problem of Chinese character garbling", "user": {"value": 2698003, "label": "barretlee"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-09-04T23:48:28Z", "updated_at": "2023-09-04T23:48:28Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/apple-notes-to-sqlite/pulls/14", "body": "1. The code uses two different ways of writing encoding formats, `mac_roman` and `macroman`. It is uncertain whether there are any typo errors.\r\n2. When there are Chinese characters in the content, exporting it results in garbled code. Changing it to `utf8` can fix the issue.", "repo": {"value": 611552758, "label": "apple-notes-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/14/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1879214365, "node_id": "I_kwDOCGYnMM5wAokd", "number": 590, "title": "Ability to tell if a Database is an in-memory one", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2023-09-03T19:50:15Z", "updated_at": "2023-09-03T19:50:36Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "Currently the constructor accepts `memory=True` or `memory_name=...` and uses those to create a connection, but does not record what those values were:\r\n\r\nhttps://github.com/simonw/sqlite-utils/blob/1260bdc7bfe31c36c272572c6389125f8de6ef71/sqlite_utils/db.py#L307-L349\r\n\r\nThis makes it hard to tell if a database object is to an in-memory or a file-based database, which is sometimes useful to know.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/590/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1879209560, "node_id": "I_kwDOCGYnMM5wAnZY", "number": 589, "title": "Mechanism for de-registering registered SQL functions", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2023-09-03T19:32:39Z", "updated_at": "2023-09-03T19:36:34Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "I used a custom SQL function in a migration script and then realized that it should be de-registered before the end of the script to avoid leaking into the calling code.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/589/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1875739055, "node_id": "I_kwDOBm6k_c5vzYGv", "number": 2167, "title": "Document return type of await ds.permission_allowed()", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-08-31T15:14:23Z", "updated_at": "2023-08-31T15:14:23Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "The return type isn't documented here: https://github.com/simonw/datasette/blob/4c3ef033110407f3b3dbce501659d523724985e0/docs/internals.rst#L327-L350\r\n\r\nOn inspecting the code I'm not 100% sure if it's possible for this. method to return `None`, or if it can only return `True` or `False`. Need to confirm that.\r\n\r\nhttps://github.com/simonw/datasette/blob/4c3ef033110407f3b3dbce501659d523724985e0/datasette/app.py#L822C15-L853", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2167/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1865869205, "node_id": "I_kwDOBm6k_c5vNueV", "number": 2157, "title": "Proposal: Make the `_internal` database persistent, customizable, and hidden", "user": {"value": 15178711, "label": "asg017"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2023-08-24T20:54:29Z", "updated_at": "2023-08-31T02:45:56Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": null, "body": "The current `_internal` database is used by Datasette core to cache info about databases/tables/columns/foreign keys of databases in a Datasette instance. It's a temporary database created at startup, that can only be seen by the root user. See an [example `_internal` DB here](https://latest.datasette.io/_internal), after [logging in as root](https://latest.datasette.io/login-as-root).\r\n\r\nThe current `_internal` database has a few rough edges:\r\n\r\n- It's part of `datasette.databases`, so many plugins have to specifically exclude `_internal` from their queries [examples here](https://github.com/search?q=datasette+hookimpl+%22_internal%22+language%3APython+-path%3Adatasette%2F&ref=opensearch&type=code)\r\n- It's only used by Datasette core and can't be used by plugins or 3rd parties\r\n- It's created from scratch at startup and stored in memory. Why is fine, the performance is great, but persistent storage would be nice.\r\n\r\nAdditionally, it would be really nice if plugins could use this `_internal` database to store their own configuration, secrets, and settings. For example:\r\n\r\n- `datasette-auth-tokens` [creates a `_datasette_auth_tokens` table](https://github.com/simonw/datasette-auth-tokens/blob/main/datasette_auth_tokens/__init__.py#L15) to store auth token metadata. This could be moved into the `_internal` database to avoid writing to the gues database\r\n- `datasette-socrata` [creates a `socrata_imports`](https://github.com/simonw/datasette-socrata/blob/1409aa9b4d2fc3aff286b52e73af33b5786d56d0/datasette_socrata/__init__.py#L190-L198) table, which also can be in `_internal`\r\n- `datasette-upload-csvs` [creates a `_csv_progress_`](https://github.com/simonw/datasette-upload-csvs/blob/main/datasette_upload_csvs/__init__.py#L154) table, which can be in `_internal`\r\n- `datasette-write-ui` wants to have the ability for users to toggle whether a table appears editable, which can be either in `datasette.yaml` or on-the-fly by storing config in `_internal`\r\n\r\n\r\nIn general, these are specific features that Datasette plugins would have access to if there was a central internal database they could read/write to:\r\n\r\n- **Dynamic configuration**. Changing the `datasette.yaml` file works, but can be tedious to restart the server every time. Plugins can define their own configuration table in `_internal`, and could read/write to it to store configuration based on user actions (cell menu click, API access, etc.)\r\n- **Caching**. If a plugin or Datasette Core needs to cache some expensive computation, they can store it inside `_internal` (possibly as a temporary table) instead of managing their own caching solution.\r\n- **Audit logs**. If a plugin performs some sensitive operations, they can log usage info to `_internal` for others to audit later. \r\n- **Long running process status**. Many plugins (`datasette-upload-csvs`, `datasette-litestream`, `datasette-socrata`) perform tasks that run for a really long time, and want to give continue status updates to the user. They can store this info inside` _internal`\r\n- **Safer authentication**. Passwords and authentication plugins usually store credentials/hashed secrets in configuration files or environment variables, which can be difficult to handle. Now, they can store them in `_internal` \r\n\r\n## Proposal\r\n\r\n- We remove `_internal` from [`datasette.databases`](https://docs.datasette.io/en/latest/internals.html#databases) property.\r\n- We add new `datasette.get_internal_db()` method that returns the `_internal` database, for plugins to use\r\n- We add a new `--internal internal.db` flag. If provided, then the `_internal` DB will be sourced from that file, and further updates will be persisted to that file (instead of an in-memory database)\r\n- When creating internal.db, create a new `_datasette_internal` table to mark it a an \"datasette internal database\"\r\n- In `datasette serve`, we check for the existence of the `_datasette_internal` table. If it exists, we assume the user provided that file in error and raise an error. This is to limit the chance that someone accidentally publishes their internal database to the internet. We could optionally add a `--unsafe-allow-internal` flag (or database plugin) that allows someone to do this if they really want to.\r\n\r\n\r\n## New features unlocked with this\r\n\r\nThese features don't really need a standardized `_internal` table per-say (plugins could currently configure their own long-time storage features if they really wanted to), but it would make it much simpler to create these kinds of features with a persistent application database.\r\n\r\n- **`datasette-comments`** : A plugin for commenting on rows or specific values in a database. Comment contents + threads + email notification info can be stored in `_internal`\r\n- **Bookmarks**: \"Bookmarking\" an SQL query could be stored in `_internal`, or a URL link shortener\r\n- **Webhooks**: If a plugin wants to either consume a webhook or create a new one, they can store hashed credentials/API endpoints in `_internal`", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2157/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 594237015, "node_id": "MDU6SXNzdWU1OTQyMzcwMTU=", "number": 718, "title": "Plugin idea: datasette-redirects", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2020-04-05T03:41:38Z", "updated_at": "2023-08-30T22:17:31Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "I just had to write a one-off custom plugin to redirect niche-musems.com to www.niche-museums.com (https://github.com/simonw/museums/issues/21) - it would be great if this kind of thing could be handled by a configurable plugin.\r\n\r\nhttps://github.com/simonw/museums/blob/6b1faf00c463b2228860d4d62d104b11935e01b1/plugins/redirect_www.py", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/718/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "reopened"} {"id": 1865649347, "node_id": "I_kwDOBm6k_c5vM4zD", "number": 2156, "title": "datasette -s/--setting option for setting nested configuration options", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 4, "created_at": "2023-08-24T18:09:27Z", "updated_at": "2023-08-28T19:33:05Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "> I've been thinking about what it might look like to allow command-line arguments to be used to define _any_ of the configuration options in `datasette.yml`, as alternative and more convenient syntax.\r\n>\r\n> Here's what I've come up with:\r\n> ```\r\n> datasette \\\r\n> -s settings.sql_time_limit_ms 1000 \\\r\n> -s plugins.datasette-auth-tokens.manage_tokens true \\\r\n> -s plugins.datasette-auth-tokens.manage_tokens_database tokens \\\r\n> mydatabase.db tokens.db\r\n> ```\r\n> Which would be equivalent to `datasette.yml` containing this:\r\n> ```yaml\r\n> plugins:\r\n> datasette-auth-tokens:\r\n> manage_tokens: true\r\n> manage_tokens_database: tokens\r\n> settings:\r\n> sql_time_limit_ms: 1000\r\n> ```\r\nMore details in https://github.com/simonw/datasette/issues/2143#issuecomment-1690792514\r\n ", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2156/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1868713944, "node_id": "I_kwDOCGYnMM5vYk_Y", "number": 588, "title": "`table.get(column=value)` option for retrieving things not by their primary key", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2023-08-28T00:41:23Z", "updated_at": "2023-08-28T00:41:54Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "This came up working on this feature:\r\n- https://github.com/simonw/llm/pull/186\r\n\r\nI have a table with this schema:\r\n```sql\r\nCREATE TABLE [collections] (\r\n [id] INTEGER PRIMARY KEY,\r\n [name] TEXT,\r\n [model] TEXT\r\n);\r\nCREATE UNIQUE INDEX [idx_collections_name]\r\n ON [collections] ([name]);\r\n```\r\nSo the primary key is an integer (because it's going to have a huge number of rows foreign key related to it, and I don't want to store a larger text value thousands of times), but there is a unique constraint on the `name` - that would be the primary key column if not for all of those foreign keys.\r\n\r\nProblem is, fetching the collection by name is actually pretty inconvenient.\r\n\r\nFetch by numeric ID:\r\n\r\n```python\r\ntry:\r\n table[\"collections\"].get(1)\r\nexcept NotFoundError:\r\n # It doesn't exist\r\n```\r\nFetching by name:\r\n```python\r\ndef get_collection(db, collection):\r\n rows = db[\"collections\"].rows_where(\"name = ?\", [collection])\r\n try:\r\n return next(rows)\r\n except StopIteration:\r\n raise NotFoundError(\"Collection not found: {}\".format(collection))\r\n```\r\nIt would be neat if, for columns where we know that we should always get 0 or one result, we could do this instead:\r\n```python\r\ntry:\r\n collection = table[\"collections\"].get(name=\"entries\")\r\nexcept NotFoundError:\r\n # It doesn't exist\r\n```\r\nThe existing `.get()` method doesn't have any non-positional arguments, so using `**kwargs` like that should work:\r\n\r\nhttps://github.com/simonw/sqlite-utils/blob/1260bdc7bfe31c36c272572c6389125f8de6ef71/sqlite_utils/db.py#L1495", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/588/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1866815458, "node_id": "PR_kwDOBm6k_c5YyF-C", "number": 2159, "title": "Implement Dark Mode colour scheme", "user": {"value": 3315059, "label": "jamietanna"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-08-25T10:46:23Z", "updated_at": "2023-08-25T10:46:35Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "simonw/datasette/pulls/2159", "body": "Closes #2095.\n\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://datasette--2159.org.readthedocs.build/en/2159/\n\r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2159/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 1, "state_reason": null} {"id": 1865983069, "node_id": "PR_kwDOBm6k_c5YvQSi", "number": 2158, "title": "add brand option to metadata.json.", "user": {"value": 52261150, "label": "publicmatt"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-08-24T22:37:41Z", "updated_at": "2023-08-24T22:37:57Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "simonw/datasette/pulls/2158", "body": "This adds a brand link to the top navbar if 'brand' key is populated in metadata.json. The link will be either '#' or use the contents of 'brand_url' in metadata.json for href.\r\n\r\nI was able to get this done on my own site by replacing `templates/_crumbs.html` with a custom version, but I thought it would be nice to incorporate this in the tool directly.\r\n\r\n![image](https://github.com/simonw/datasette/assets/52261150/fdfe9bb5-fee4-466c-8074-6132071d94e6)\r\n\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://datasette--2158.org.readthedocs.build/en/2158/\n\r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2158/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 814595021, "node_id": "MDU6SXNzdWU4MTQ1OTUwMjE=", "number": 1241, "title": "Share button for copying current URL", "user": {"value": 7107523, "label": "Kabouik"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 6, "created_at": "2021-02-23T15:55:40Z", "updated_at": "2023-08-24T20:09:52Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "I use datasette in an `iframe` inside another HTML file that contains other ways to represent my data (mostly leaflets maps built with R on summarized data), and the datasette `iframe` is a tab in that page. \r\n\r\nThis particular use prevents users to access the full URLs of their datasette views and queries, which is a shame because the way datasette handles URLs to make every view or query easy to share is awesome. I know how to get the URL from the context menu of my browser, but I don't think many visitors would do it or even notice that datasette uses permalinks for pretty much every action they do. Would it be possible to add a \"Share link\" button to the interface, either in datasette itself or in a plugin?", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/1241/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1855885427, "node_id": "I_kwDOBm6k_c5unpBz", "number": 2143, "title": "De-tangling Metadata before Datasette 1.0", "user": {"value": 15178711, "label": "asg017"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 24, "created_at": "2023-08-18T00:51:50Z", "updated_at": "2023-08-24T18:28:27Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": null, "body": "Metadata in Datasette is a really powerful feature, but is a bit difficult to work with. It was initially a way to add \"metadata\" about your \"data\" in Datasette instances, like descriptions for databases/tables/columns, titles, source URLs, licenses, etc. But it later became the go-to spot for other Datasette features that have nothing to do with metadata, like permissions/plugins/canned queries. \r\n\r\nSpecifically, I've found the following problems when working with Datasette metadata:\r\n\r\n1. Metadata cannot be updated without re-starting the entire Datasette instance.\r\n2. The `metadata.json`/`metadata.yaml` has become a kitchen sink of unrelated (imo) features like plugin config, authentication config, canned queries\r\n3. The Python APIs for defining extra metadata are a bit awkward (the `datasette.metadata()` class, `get_metadata()` hook, etc.)\r\n\r\n## Possible solutions\r\n\r\nHere's a few ideas of Datasette core changes we can make to address these problems. \r\n\r\n### Re-vamp the Datasette Python metadata APIs\r\n\r\nThe Datasette object has a single `datasette.metadata()` method that's a bit difficult to work with. There's also no Python API for inserted new metadata, so plugins have to rely on the `get_metadata()` hook.\r\n\r\nThe `get_metadata()` hook can also be improved - it doesn't work with async functions yet, so you're quite limited to what you can do.\r\n\r\n(I'm a bit fuzzy on what to actually do here, but I imagine it'll be very small breaking changes to a few Python methods)\r\n\r\n### Add an optional `datasette_metadata` table\r\n\r\nDatasette should detect and use metadata stored in a new special table called `datasette_metadata`. This would be a regular table that a user can edit on their own, and would serve as a \"live updating\" source of metadata, than can be changed while the Datasette instance is running.\r\n\r\nNot too sure what the schema would look like, but I'd imagine:\r\n\r\n```sql\r\nCREATE TABLE datasette_metadata(\r\n level text,\r\n target any,\r\n key text,\r\n value any,\r\n primary key (level, target)\r\n)\r\n```\r\n\r\nEvery row in this table would map to a single metadata \"entry\".\r\n\r\n- `level` would be one of \"datasette\", \"database\", \"table\", \"column\", which is the \"level\" the entry describes. For example, `level=\"table\"` means it is metadata about a specific table, `level=\"database\"` for a specific database, or `level=\"datasette\"` for the entire Datasette instance.\r\n- `target` would \"point\" to the specific object the entry metadata is about, and would depend on what `level` is specific. \r\n - `level=\"database\"`: `target` would be the string name of the database that the metadata entry is about. ex `\"fixtures\"`\r\n - `level=\"table\"`: `target` would be a JSON array of two strings. The first element would be the database name, and the second would be the table name. ex `[\"fixtures\", \"students\"]`\r\n - `level=\"column\"`: `target` would be a JSON array of 3 strings: The database name, table name, and column name. Ex `[\"fixtures\", \"students\", \"student_id\"`]\r\n- `key` would be the type of metadata entry the row has, similar to the current \"keys\" that exist in `metadata.json`. Ex `\"about_url\"`, `\"source\"`, `\"description\"`, etc\r\n- `value` would be the text value of be metadata entry. The literal text value of a description, about_url, column_label, etc\r\n\r\nA quick sample:\r\n\r\nlevel | target | key | value\r\n-- | -- | -- | --\r\ndatasette | NULL | title | my datasette title...\r\ndb | fixtures | source | \r\ntable | [\"fixtures\", \"students\"] | label_column | student_name\r\ncolumn | [\"fixtures\", \"students\", \"birthdate\"] | description | \r\n\r\nThis `datasette_metadata` would be configured with other tools, and hopefully not manually by end users. Datasette Core could also offer a UI for editing entries in `datasette_metadata`, to update descriptions/columns on the fly.\r\n\r\n### Re-vamp `metadata.json` and move non-metadata config to another place\r\n\r\nThe motivation behind this is that it's awkward that `metadata.json` contains config about things that are not strictly metadata, including:\r\n\r\n- Plugin configuration\r\n- [Authentication/permissions](https://docs.datasette.io/en/latest/authentication.html#access-permissions-in-metadata) (ex the `allow` key on datasettes/databases/tables\r\n- Canned queries. might be controversial, but in my mind, canned queries are application-specific code and configuration, and don't describe the data that exists in SQLite databases. \r\n\r\nI think we should move these outside of `metadata.json` and into a different file. The `datasette.json` idea in #2093 may be a good solution here: plugin/permissions/canned queries can be defined in `datasette.json`, while `metadata.json`/`datasette_metadata` will strictly be about documenting databases/tables/columns. \r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2143/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1858228057, "node_id": "I_kwDOBm6k_c5uwk9Z", "number": 2147, "title": "Plugin hook for database queries that are run", "user": {"value": 18899, "label": "jackowayed"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 6, "created_at": "2023-08-20T18:43:50Z", "updated_at": "2023-08-24T03:54:35Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "I'm interested in making a plugin that saves every query that gets run to a table in the database. (I know about datasette-query-history but thought it would be good to have a server-side option.)\r\n\r\nAs far as I can tell reading the docs, there isn't really a hook setup to allow this.\r\n\r\nMaybe I could hack it with some of the hooks that are passed requests, but that doesn't seem good.\r\n\r\nI'm a little surprised this isn't possible, so I thought I would open an issue and see if that's a deeply considered decision or just \"haven't needed it yet.\" I'm potentially interested in implementing the hook if the latter.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2147/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1864112887, "node_id": "PR_kwDOBm6k_c5Yo7bk", "number": 2151, "title": "Test Datasette on multiple SQLite versions", "user": {"value": 15178711, "label": "asg017"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2023-08-23T22:42:51Z", "updated_at": "2023-08-23T22:58:13Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": "simonw/datasette/pulls/2151", "body": "still testing, hope it works!\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://datasette--2151.org.readthedocs.build/en/2151/\n\r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2151/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 1, "state_reason": null} {"id": 459509126, "node_id": "MDU6SXNzdWU0NTk1MDkxMjY=", "number": 516, "title": "Enforce import sort order with isort", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 8, "created_at": "2019-06-22T20:35:50Z", "updated_at": "2023-08-23T02:15:36Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "I want to use isort to order imports. A few steps here:\r\n\r\n- [x] Add a .isort.cfg file (see below)\r\n- [x] Use `isort -rc` to reformat existing code\r\n- [ ] Commit this change\r\n- [x] Add a unit test that ensures future changes remain isort compatible", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/516/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1857234285, "node_id": "I_kwDOBm6k_c5usyVt", "number": 2145, "title": "If a row has a primary key of `null` various things break", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 23, "created_at": "2023-08-18T20:06:28Z", "updated_at": "2023-08-21T17:30:01Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "Stumbled across this while experimenting with `datasette-write-ui`. The error I got was a 500 on the `/db` page:\r\n\r\n> `'NoneType' object has no attribute 'encode'`\r\n\r\nTracked it down to this code, which assembles the URL for a row page:\r\n\r\nhttps://github.com/simonw/datasette/blob/943df09dcca93c3b9861b8c96277a01320db8662/datasette/utils/__init__.py#L120-L134\r\n\r\nThat's because `tilde_encode` can't handle `None`: https://github.com/simonw/datasette/blob/943df09dcca93c3b9861b8c96277a01320db8662/datasette/utils/__init__.py#L1175-L1178\r\n\r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2145/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1856075668, "node_id": "I_kwDOCGYnMM5uoXeU", "number": 586, "title": ".transform() fails to drop column if table is part of a view", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2023-08-18T05:25:22Z", "updated_at": "2023-08-18T06:13:47Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "I got this error trying to drop a column from a table that was part of a SQL view:\r\n\r\n> error in view plugins: no such table: main.pypi_releases\r\n\r\nUpon further investigation I found that this pattern seemed to fix it:\r\n```python\r\ndef transform_the_table(conn):\r\n # Run this in a transaction:\r\n with conn:\r\n # We have to read all the views first, because we need to drop and recreate them\r\n db = sqlite_utils.Database(conn)\r\n views = {v.name: v.schema for v in db.views if table.lower() in v.schema.lower()}\r\n for view in views.keys():\r\n db[view].drop()\r\n db[table].transform(\r\n types=types,\r\n rename=rename,\r\n drop=drop,\r\n column_order=[p[0] for p in order_pairs],\r\n )\r\n # Now recreate the views\r\n for name, schema in views.items():\r\n db.create_view(name, schema)\r\n```\r\nSo grab a copy of any view that might reference this table, start a transaction, drop those views, run the transform, recreate the views again.\r\n\r\n> I wonder if this should become an option in `sqlite-utils`? Maybe a `recreate_views=True` argument for `table.tranform(...)`? Should it be opt-in or opt-out?\r\n\r\n_Originally posted by @simonw in https://github.com/simonw/datasette-edit-schema/issues/35#issuecomment-1683370548_\r\n ", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/586/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1754174496, "node_id": "I_kwDOCGYnMM5ojpQg", "number": 558, "title": "Ability to define unique columns when creating a table", "user": {"value": 1910303, "label": "aguinane"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-06-13T06:56:19Z", "updated_at": "2023-08-18T01:06:03Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "When creating a new table, it would be good to have an option to set unique columns similar to how not_null is set.\r\n\r\n```python\r\nfrom sqlite_utils import Database\r\n\r\ncolumns = {\"mRID\": str, \"name\": str}\r\ndb = Database(\"example.db\")\r\ndb[\"ExampleTable\"].create(columns, pk=\"mRID\", not_null=[\"mRID\"], if_not_exists=True)\r\ndb[\"ExampleTable\"].create_index([\"mRID\"], unique=True, if_not_exists=True)\r\n```\r\n\r\nSo something like this would add the UNIQUE flag to the table definition. \r\n\r\n```python\r\ndb[\"ExampleTable\"].create(columns, pk=\"mRID\", not_null=[\"mRID\"], unique=[\"mRID\"], if_not_exists=True)\r\n```\r\n\r\n```sql\r\nCREATE TABLE ExampleTable (\r\n mRID TEXT PRIMARY KEY\r\n NOT NULL\r\n UNIQUE,\r\n name TEXT\r\n);\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/558/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 476852861, "node_id": "MDU6SXNzdWU0NzY4NTI4NjE=", "number": 568, "title": "Add database_color as a configurable option", "user": {"value": 50906992, "label": "LBHELewis"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2019-08-05T13:14:45Z", "updated_at": "2023-08-11T05:19:42Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "This would be really useful as it would allow us to tie in with colour schemes.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/568/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1802613340, "node_id": "PR_kwDOBm6k_c5VZhfw", "number": 2100, "title": "Make primary key view accessible to render_cell hook", "user": {"value": 1563881, "label": "meowcat"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-07-13T09:30:36Z", "updated_at": "2023-08-10T13:15:41Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "simonw/datasette/pulls/2100", "body": "\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://datasette--2100.org.readthedocs.build/en/2100/\n\r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2100/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1838469176, "node_id": "I_kwDOBm6k_c5tlNA4", "number": 2127, "title": "Context base class to support documenting the context", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": {"value": 3268330, "label": "Datasette 1.0"}, "comments": 3, "created_at": "2023-08-07T00:01:02Z", "updated_at": "2023-08-10T01:30:25Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "This idea first came up here:\r\n- https://github.com/simonw/datasette/issues/2112#issuecomment-1652751140\r\n\r\nIf `datasette.render_template(...)` takes an optional `Context` subclass as an alternative to a context dictionary, I could then use dataclasses to define the context made available to specific templates - which then gives me something I can use to help document what they are.\r\n\r\nAlso refs:\r\n- https://github.com/simonw/datasette/issues/1510", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2127/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1843821954, "node_id": "I_kwDOBm6k_c5t5n2C", "number": 2137, "title": "Redesign row default JSON", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": {"value": 8755003, "label": "Datasette 1.0a-next"}, "comments": 1, "created_at": "2023-08-09T18:49:11Z", "updated_at": "2023-08-09T19:02:47Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "This URL here:\r\n\r\nhttps://latest.datasette.io/fixtures/simple_primary_key/1.json?_extras=foreign_key_tables\r\n\r\n```json\r\n{\r\n \"database\": \"fixtures\",\r\n \"table\": \"simple_primary_key\",\r\n \"rows\": [\r\n {\r\n \"id\": \"1\",\r\n \"content\": \"hello\"\r\n }\r\n ],\r\n \"columns\": [\r\n \"id\",\r\n \"content\"\r\n ],\r\n \"primary_keys\": [\r\n \"id\"\r\n ],\r\n \"primary_key_values\": [\r\n \"1\"\r\n ],\r\n \"units\": {},\r\n \"foreign_key_tables\": [\r\n {\r\n \"other_table\": \"foreign_key_references\",\r\n \"column\": \"id\",\r\n \"other_column\": \"foreign_key_with_blank_label\",\r\n \"count\": 0,\r\n \"link\": \"/fixtures/foreign_key_references?foreign_key_with_blank_label=1\"\r\n },\r\n {\r\n \"other_table\": \"foreign_key_references\",\r\n \"column\": \"id\",\r\n \"other_column\": \"foreign_key_with_label\",\r\n \"count\": 1,\r\n \"link\": \"/fixtures/foreign_key_references?foreign_key_with_label=1\"\r\n },\r\n {\r\n \"other_table\": \"complex_foreign_keys\",\r\n \"column\": \"id\",\r\n \"other_column\": \"f3\",\r\n \"count\": 1,\r\n \"link\": \"/fixtures/complex_foreign_keys?f3=1\"\r\n },\r\n {\r\n \"other_table\": \"complex_foreign_keys\",\r\n \"column\": \"id\",\r\n \"other_column\": \"f2\",\r\n \"count\": 0,\r\n \"link\": \"/fixtures/complex_foreign_keys?f2=1\"\r\n },\r\n {\r\n \"other_table\": \"complex_foreign_keys\",\r\n \"column\": \"id\",\r\n \"other_column\": \"f1\",\r\n \"count\": 1,\r\n \"link\": \"/fixtures/complex_foreign_keys?f1=1\"\r\n }\r\n ],\r\n \"query_ms\": 4.226590999678592,\r\n \"source\": \"tests/fixtures.py\",\r\n \"source_url\": \"https://github.com/simonw/datasette/blob/main/tests/fixtures.py\",\r\n \"license\": \"Apache License 2.0\",\r\n \"license_url\": \"https://github.com/simonw/datasette/blob/main/LICENSE\",\r\n \"ok\": true,\r\n \"truncated\": false\r\n}\r\n```\r\n\r\nThat `?_extras=` should be `?_extra=` - plus the row JSON should be redesigned to fit the new default JSON representation.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2137/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1822939274, "node_id": "I_kwDOBm6k_c5sp9iK", "number": 2113, "title": "Implement and document extras for the new query view page", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": {"value": 8755003, "label": "Datasette 1.0a-next"}, "comments": 3, "created_at": "2023-07-26T18:24:01Z", "updated_at": "2023-08-09T17:35:22Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "- #2109 ", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2113/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1840417903, "node_id": "I_kwDOBm6k_c5tsoxv", "number": 2131, "title": "Refactor code that supports templates_considered comment", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": {"value": 3268330, "label": "Datasette 1.0"}, "comments": 1, "created_at": "2023-08-08T01:28:36Z", "updated_at": "2023-08-09T15:27:41Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "I ended up duplicating it here: https://github.com/simonw/datasette/blob/7532feb424b1dce614351e21b2265c04f9669fe2/datasette/views/database.py#L164-L167\r\n\r\nI think it should move to `datasette.render_template()` - and maybe have a renamed template variable too.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2131/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 771511344, "node_id": "MDExOlB1bGxSZXF1ZXN0NTQzMDE1ODI1", "number": 31, "title": "Update for Big Sur", "user": {"value": 41546558, "label": "RhetTbull"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 7, "created_at": "2020-12-20T04:36:45Z", "updated_at": "2023-08-08T15:52:52Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": "dogsheep/dogsheep-photos/pulls/31", "body": "Refactored out the SQL for extracting aesthetic scores to use osxphotos -- adds compatbility for Big Sur via osxphotos which has been updated for new table names in Big Sur. Have not yet refactored the SQL for extracting labels which is still compatible with Big Sur.", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/31/reactions\", \"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1840324765, "node_id": "I_kwDOBm6k_c5tsSCd", "number": 2129, "title": "CSV ?sql= should indicate errors", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": {"value": 3268330, "label": "Datasette 1.0"}, "comments": 1, "created_at": "2023-08-07T23:13:04Z", "updated_at": "2023-08-08T02:02:21Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "> https://latest.datasette.io/_memory.csv?sql=select+blah is a blank page right now:\r\n\r\n```bash\r\ncurl -I 'https://latest.datasette.io/_memory.csv?sql=select+blah'\r\n```\r\n```\r\nHTTP/2 200 \r\naccess-control-allow-origin: *\r\naccess-control-allow-headers: Authorization, Content-Type\r\naccess-control-expose-headers: Link\r\naccess-control-allow-methods: GET, POST, HEAD, OPTIONS\r\naccess-control-max-age: 3600\r\ncontent-type: text/plain; charset=utf-8\r\nx-databases: _memory, _internal, fixtures, fixtures2, extra_database, ephemeral\r\ndate: Mon, 07 Aug 2023 23:12:15 GMT\r\nserver: Google Frontend\r\n```\r\n\r\n_Originally posted by @simonw in https://github.com/simonw/datasette/issues/2118#issuecomment-1668688947_", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2129/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1818838294, "node_id": "I_kwDOCGYnMM5saUUW", "number": 578, "title": "Plugin hook for adding new output formats", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 5, "created_at": "2023-07-24T17:29:18Z", "updated_at": "2023-08-07T15:41:49Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "> What would it take to add a format hook? I'm still thinking about my GIS workflow, and being able to do `sqlite-utils query ... --geojson` would be nice. It's the one place my Datasette workflow is messy, having to do `datasette . --get /path/to/query.geojson --setting max_rows_returned 10000 --load-extension spatialite`.\r\n> I know the current pattern is `--csv`, but maybe `--format geojson` is more future-proof.\r\n\r\nhttps://discord.com/channels/823971286308356157/997738192360964156/1133076679011602432", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/578/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1839344979, "node_id": "I_kwDOCGYnMM5toi1T", "number": 582, "title": "Handling CSV/file input that contains NUL bytes", "user": {"value": 1448859, "label": "betatim"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-08-07T12:24:14Z", "updated_at": "2023-08-07T12:24:14Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "I was using sqlite-utils to create a DB from a CSV and it turns out the CSV contains a NUL byte.\r\n\r\nWhen the processing reaches the line that contains the NUL an exception is raised.\r\n\r\nI'm wondering if there is something that can be done in `sqlite-utils` to say \"skip lines with encoding errors\" or some such. I think it isn't super straightforward though as the exception comes from inside the `csv` module that does all the parsing.\r\n\r\nConcretely the file is the `KernelVersions.csv` from https://www.kaggle.com/datasets/kaggle/meta-kaggle\r\n\r\nThis is the command and output:\r\n```\r\n$ sqlite-utils insert --csv kaggle.db kaggle KernelVersions.csv\r\n [------------------------------------] 0%\r\n [#####################---------------] 60% 00:04:24Traceback (most recent call last):\r\n File \"/home/foobar/miniconda/envs/meta-kaggle/bin/sqlite-utils\", line 10, in \r\n sys.exit(cli())\r\n File \"/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py\", line 1128, in __call__\r\n return self.main(*args, **kwargs)\r\n File \"/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py\", line 1053, in main\r\n rv = self.invoke(ctx)\r\n File \"/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py\", line 1659, in invoke\r\n return _process_result(sub_ctx.command.invoke(sub_ctx))\r\n File \"/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py\", line 1395, in invoke\r\n return ctx.invoke(self.callback, **ctx.params)\r\n File \"/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py\", line 754, in invoke\r\n return __callback(*args, **kwargs)\r\n File \"/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py\", line 1223, in insert\r\n insert_upsert_implementation(\r\n File \"/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py\", line 1085, in insert_upsert_implementation\r\n db[table].insert_all(\r\n File \"/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/db.py\", line 3198, in insert_all\r\n chunk = list(chunk)\r\n File \"/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/db.py\", line 3742, in fix_square_braces\r\n for record in records:\r\n File \"/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py\", line 1071, in \r\n docs = (decode_base64_values(doc) for doc in docs)\r\n File \"/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py\", line 1068, in \r\n docs = (verify_is_dict(doc) for doc in docs)\r\n File \"/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py\", line 1003, in \r\n docs = (dict(zip(headers, row)) for row in reader)\r\n_csv.Error: line contains NUL\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/582/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 774332247, "node_id": "MDExOlB1bGxSZXF1ZXN0NTQ1MjY0NDM2", "number": 1159, "title": "Improve the display of facets information", "user": {"value": 552629, "label": "lovasoa"}, "state": "open", "locked": 0, "assignee": null, "milestone": {"value": 3268330, "label": "Datasette 1.0"}, "comments": 9, "created_at": "2020-12-24T11:01:47Z", "updated_at": "2023-07-31T18:57:59Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "simonw/datasette/pulls/1159", "body": "This PR changes the display of facets to hopefully make them more readable.\r\n\r\nBefore | After\r\n---|---\r\n![image](https://user-images.githubusercontent.com/552629/103084609-b1ec2980-45df-11eb-85bc-68ab8df3e8d9.png) | ![image](https://user-images.githubusercontent.com/552629/103085220-620e6200-45e1-11eb-8189-5dd5d3e2569e.png)\r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/1159/reactions\", \"total_count\": 4, \"+1\": 4, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1827436260, "node_id": "PR_kwDOD079W85WtVyk", "number": 39, "title": "Missing option in datasette instructions", "user": {"value": 319473, "label": "coldclimate"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-07-29T10:34:48Z", "updated_at": "2023-07-29T10:34:48Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/dogsheep-photos/pulls/39", "body": "Gotta tell it where to look", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/39/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1824457306, "node_id": "I_kwDOBm6k_c5svwJa", "number": 2122, "title": "Parameters on canned queries: fixed or query-generated list?", "user": {"value": 1563881, "label": "meowcat"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-07-27T14:07:07Z", "updated_at": "2023-07-27T14:07:07Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "Hi,\r\n\r\ncurrently parameters in canned queries are just text fields. It would be cool to have one of the options below. Would you accept a PR doing something in this direction? (Possibly this could even work as a plugin.)\r\n\r\n* adding facets, which would work like facets on tables or views, giving a list of selectable options (and leaving parameters as is)\r\n* making it possible to provide a query which returns selectable values for a parameter, e.g.\r\n``` \r\ncalendar_entries_current_instrument:\r\n sql: | \r\n select * from calendar_entries \r\n where \r\n DTEND_UNIX > UNIXEPOCH() and\r\n DTSTART_UNIX < UNIXEPOCH() + :days *24*60*60 and\r\n current = 1 and\r\n MACHINE = :instrument\r\n order by\r\n DTSTART_UNIX\r\n params:\r\n days: \r\n sql: \"SELECT VALUE FROM generate_series(1, 30, 1)\"\r\n # this obviously requires the corresponding sqlite extension\r\n instrument:\r\n sql: \"SELECT DISTINCT MACHINE FROM calendar_entries\"\r\n```\r\n* making it possible to provide a fixed list of parameters\r\n``` \r\ncalendar_entries_current_instrument:\r\n sql: | \r\n select * from calendar_entries \r\n where \r\n DTEND_UNIX > UNIXEPOCH() and\r\n DTSTART_UNIX < UNIXEPOCH() + :days *24*60*60 and\r\n current = 1 and\r\n MACHINE = :instrument\r\n order by\r\n DTSTART_UNIX\r\n params:\r\n days: \r\n values: [1, 2, 3, 5, 10, 20, 30]\r\n instrument:\r\n values: [supermachine, crappymachine, boringmachine]\r\n```", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2122/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1823428714, "node_id": "I_kwDOBm6k_c5sr1Bq", "number": 2120, "title": "Add __all__ to datasette/__init__.py", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-07-27T01:07:10Z", "updated_at": "2023-07-27T01:07:10Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "Currently looks like this: https://github.com/simonw/datasette/blob/08181823990a71ffa5a1b57b37259198eaa43e06/datasette/__init__.py#L1-L6\r\n\r\nAdding `__all__ = [\"Permission\", \"Forbidden\"...]` would let me get rid of those `# noqa` comments.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2120/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1822918995, "node_id": "I_kwDOCGYnMM5sp4lT", "number": 580, "title": "Add way to export to a csv file using the Python library", "user": {"value": 44324811, "label": "kevinlinxc"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-07-26T18:09:26Z", "updated_at": "2023-07-26T18:09:26Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "According to the documentation, we can make a csv output using the CLI tool, but not the Python library. Could we have the latter?", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/580/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1822813627, "node_id": "I_kwDOBm6k_c5spe27", "number": 2108, "title": "some (many?) SQL syntax errors are not throwing errors with a .csv endpoint", "user": {"value": 536941, "label": "fgregg"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-07-26T16:57:45Z", "updated_at": "2023-07-26T16:58:07Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": null, "body": "here's a CTE query that should always fail with a syntax error:\r\n\r\n```sql\r\nwith foo as (nonsense)\r\nselect\r\n *\r\nfrom\r\n foo;\r\n```\r\n\r\nwhen we make this query against the default endpoint, we do indeed get a 400 status code the problem is returned to the user: https://global-power-plants.datasettes.com/global-power-plants?sql=with+foo+as+%28nonsense%29+select+*+from+foo%3B\r\n\r\nbut, if we use the csv endpoint, we get a 200 status code and no indication of a problem: https://global-power-plants.datasettes.com/global-power-plants.csv?sql=with+foo+as+%28nonsense%29+select+*+from+foo%3B\r\n\r\nsame with this bad sql\r\n\r\n```sql\r\nselect\r\n a,\r\nfrom\r\n foo;\r\n```\r\n\r\nhttps://global-power-plants.datasettes.com/global-power-plants?sql=select%0D%0A++a%2C%0D%0Afrom%0D%0A++foo%3B\r\n\r\nvs \r\n\r\nhttps://global-power-plants.datasettes.com/global-power-plants.csv?sql=select%0D%0A++a%2C%0D%0Afrom%0D%0A++foo%3B\r\n\r\nbut, datasette catches this bad sql at both endpoints:\r\n\r\n```sql\r\nslect\r\n a\r\nfrom\r\n foo;\r\n```\r\n\r\nhttps://global-power-plants.datasettes.com/global-power-plants?sql=slect%0D%0A++a%0D%0Afrom%0D%0A++foo%3B\r\nhttps://global-power-plants.datasettes.com/global-power-plants.csv?sql=slect%0D%0A++a%0D%0Afrom%0D%0A++foo%3B\r\n\r\n\r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2108/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1821108702, "node_id": "I_kwDOCGYnMM5si-ne", "number": 579, "title": "Special handling for SQLite column of type `JSON`", "user": {"value": 15178711, "label": "asg017"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-07-25T20:37:23Z", "updated_at": "2023-07-25T20:37:23Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": null, "body": "`sqlite-utils` should detect and have specially handling for column with a `JSON` column. For example:\r\n\r\n```sql\r\nCREATE TABLE \"dogs\" (\r\n id INTEGER PRIMARY KEY,\r\n name TEXT,\r\n friends JSON \r\n);\r\n```\r\n\r\n## Automatic Nesting\r\n\r\nAccording to [\"Nested JSON Values\"](https://sqlite-utils.datasette.io/en/stable/cli.html#nested-json-values), sqlite-utils will only expand JSON if the `--json-cols` flag is passed. It looks like it'll try to `json.load` all text column to test if its JSON, which can get expensive on non-json columns. \r\n\r\nInstead, `sqlite-utils` should be default (ie without the `--json-cols` flags) do the `maybe_json()` operation on columns with a declared `JSON` type. So the above table would expand the `\"friends\"` column as expected, withoutthe `--json-cols` flag:\r\n\r\n```bash\r\nsqlite-utils dogs.db \"select * from dogs\" | python -mjson.tool\r\n```\r\n\r\n```\r\n[\r\n {\r\n \"id\": 1,\r\n \"name\": \"Cleo\",\r\n \"friends\": [\r\n {\r\n \"name\": \"Pancakes\"\r\n },\r\n {\r\n \"name\": \"Bailey\"\r\n }\r\n ]\r\n }\r\n]\r\n```\r\n\r\n---\r\n\r\nI'm sure there's other ways `sqlite-utils` can specially handle JSON columns, so keeping this open while I think of more", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/579/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1816830546, "node_id": "I_kwDODEm0Qs5sSqJS", "number": 73, "title": "Twitter v1 API shutdown", "user": {"value": 6341745, "label": "david-perez"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-07-22T16:57:41Z", "updated_at": "2023-07-22T16:57:41Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "I've been using this project reliably over the past two years to periodically download my liked tweets, but unfortunately since 19th July I get:\r\n\r\n```\r\n[2023-07-19 21:00:04.937536] File \"/home/pi/code/liked-tweets/lib/python3.7/site-packages/twitter_to_sqlite/utils.py\", line 202, in fetch_timeline\r\n[2023-07-19 21:00:04.937606] raise Exception(str(tweets[\"errors\"]))\r\n[2023-07-19 21:00:04.937678] Exception: [{'message': 'You currently have access to a subset of Twitter API v2 endpoints and limited v1.1 endpoints (e.g. media post, oauth) only. If you need access to this endpoint, you may need a different access level. You can learn more here: https://developer.twitter.com/en/portal/product', 'code': 453}]\r\n```\r\n\r\nIt appears like Twitter has now shut down their v1 endpoints, which is rather gracious of them, considering they [announced they'd be deprecated on 29th April](https://twittercommunity.com/t/reminder-to-migrate-to-the-new-free-basic-or-enterprise-plans-of-the-twitter-api/189737).\r\n\r\nUnfortunately [retrieving likes using the v2 API](https://developer.twitter.com/en/docs/twitter-api/tweets/likes/introduction) is not part of their [free plan](https://developer.twitter.com/en/portal/products). In fact, with the free plan one can only post and delete tweets and retrieve information about oneself.\r\n\r\nSo I'm afraid this is the end of this very nice project. It was very useful, thank you!\r\n", "repo": {"value": 206156866, "label": "twitter-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/73/reactions\", \"total_count\": 1, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 1}", "draft": null, "state_reason": null} {"id": 1811824307, "node_id": "I_kwDOBm6k_c5r_j6z", "number": 2105, "title": "When reverse proxying datasette with nginx an URL element gets erronously added", "user": {"value": 2235371, "label": "aki-k"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2023-07-19T12:16:53Z", "updated_at": "2023-07-21T21:17:09Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "I use this nginx config:\r\n```\r\n location /datasette-llm {\r\n return 302 /datasette-llm/;\r\n }\r\n\r\n location /datasette-llm/ {\r\n proxy_set_header Upgrade $http_upgrade;\r\n proxy_set_header Connection \"Upgrade\";\r\n proxy_http_version 1.1;\r\n proxy_set_header X-Real-IP $remote_addr;\r\n proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;\r\n proxy_set_header X-Forwarded-Proto https;\r\n proxy_set_header X-Forwarded-Host $http_host;\r\n proxy_set_header Host $host;\r\n proxy_max_temp_file_size 0;\r\n proxy_pass http://127.0.0.1:8001/datasette-llm/;\r\n proxy_redirect http:// https://;\r\n proxy_buffering off;\r\n proxy_request_buffering off;\r\n proxy_set_header Origin '';\r\n client_max_body_size 0;\r\n auth_basic \"datasette-llm\";\r\n auth_basic_user_file /etc/nginx/custom-userdb;\r\n }\r\n```\r\nThen I start datasette with this command:\r\n```\r\ndatasette serve --setting base_url /datasette-llm/ $(llm logs path)\r\n```\r\nEverything else works right, except the links in \"This data as json, CSV\".\r\nThey get an extra URL element \"datasette-llm\" like this:\r\n\r\nhttps://192.168.1.3:5432/datasette-llm/datasette-llm/logs.json?sql=select+*+from+_llm_migrations\r\n\r\nhttps://192.168.1.3:5432/datasette-llm/datasette-llm/logs.csv?sql=select+*+from+_llm_migrations&_size=max\r\n\r\nWhen I remove that extra \"datasette-llm\" from the URL, those links work too.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2105/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1808215339, "node_id": "I_kwDOBm6k_c5rxy0r", "number": 2104, "title": "Tables starting with an underscore should be treated as hidden", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2023-07-17T17:13:53Z", "updated_at": "2023-07-18T22:41:37Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "Plugins can then take advantage of this pattern, for example:\r\n- https://github.com/simonw/datasette-auth-tokens/pull/8", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2104/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1808116827, "node_id": "I_kwDOBm6k_c5rxaxb", "number": 2103, "title": "data attribute on Datasette tables exposing the primary key of the row", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-07-17T16:18:25Z", "updated_at": "2023-07-17T16:18:25Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "Maybe put it on the `` but probably better to go on the `td.type-pk`.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2103/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1765870617, "node_id": "I_kwDOBm6k_c5pQQwZ", "number": 2087, "title": "`--settings settings.json` option", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2023-06-20T17:48:45Z", "updated_at": "2023-07-14T17:02:03Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "https://discord.com/channels/823971286308356157/823971286941302908/1120705940728066080\r\n\r\n> May I add a request to the whole metadata / settings ? Allow to pass `--settings path/to/settings.json` instead of having to rely exclusively on directory mode to centralize settings (this would reflect the behavior of providing metadata)", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2087/reactions\", \"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1803264272, "node_id": "I_kwDOBm6k_c5re6EQ", "number": 2101, "title": "alter: true support for JSON write API", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2023-07-13T15:24:11Z", "updated_at": "2023-07-13T15:24:18Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "Requested here: https://discord.com/channels/823971286308356157/823971286941302908/1129034187073134642\r\n\r\n> The former datasette-insert plugin had an option `?alter=1` to auto-add new columns. Does the JSON write API also have this?", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2101/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 771608692, "node_id": "MDU6SXNzdWU3NzE2MDg2OTI=", "number": 14, "title": "UNIQUE constraint failed: workouts.id", "user": {"value": 1234956, "label": "n8henrie"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 5, "created_at": "2020-12-20T15:11:20Z", "updated_at": "2023-07-10T14:46:52Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "I'm getting an error on my initial attempt to import data:\r\n\r\n```console\r\n$ healthkit-to-sqlite 20201119\\ healthkit\\ export.zip healthkit.db\r\nImporting from HealthKit [###################################-] 98% 00:00:01\r\nTraceback (most recent call last):\r\n File \"venv/bin/healthkit-to-sqlite\", line 8, in \r\n sys.exit(cli())\r\n File \"venv/lib/python3.9/site-packages/click/core.py\", line 829, in __call__\r\n return self.main(*args, **kwargs)\r\n File \"venv/lib/python3.9/site-packages/click/core.py\", line 782, in main\r\n rv = self.invoke(ctx)\r\n File \"venv/lib/python3.9/site-packages/click/core.py\", line 1066, in invoke\r\n return ctx.invoke(self.callback, **ctx.params)\r\n File \"venv/lib/python3.9/site-packages/click/core.py\", line 610, in invoke\r\n return callback(*args, **kwargs)\r\n File \"venv/lib/python3.9/site-packages/healthkit_to_sqlite/cli.py\", line 57, in cli\r\n convert_xml_to_sqlite(fp, db, progress_callback=bar.update, zipfile=zf)\r\n File \"venv/lib/python3.9/site-packages/healthkit_to_sqlite/utils.py\", line 34, in convert_xml_to_sqlite\r\n workout_to_db(el, db, zipfile)\r\n File \"venv/lib/python3.9/site-packages/healthkit_to_sqlite/utils.py\", line 57, in workout_to_db\r\n pk = db[\"workouts\"].insert(record, alter=True, hash_id=\"id\").last_pk\r\n File \"venv/lib/python3.9/site-packages/sqlite_utils/db.py\", line 1660, in insert\r\n return self.insert_all(\r\n File \"venv/lib/python3.9/site-packages/sqlite_utils/db.py\", line 1778, in insert_all\r\n self.insert_chunk(\r\n File \"venv/lib/python3.9/site-packages/sqlite_utils/db.py\", line 1588, in insert_chunk\r\n result = self.db.execute(query, params)\r\n File \"venv/lib/python3.9/site-packages/sqlite_utils/db.py\", line 213, in execute\r\n return self.conn.execute(sql, parameters)\r\nsqlite3.IntegrityError: UNIQUE constraint failed: workouts.id\r\n```", "repo": {"value": 197882382, "label": "healthkit-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/14/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1795219865, "node_id": "I_kwDOCGYnMM5rAOGZ", "number": 566, "title": "`--no-headers` doesn't work on most formats", "user": {"value": 33625, "label": "zellyn"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2023-07-09T03:43:36Z", "updated_at": "2023-07-09T04:13:35Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "Version 3.33\r\n\r\n```\r\nsqlite-utils query library.db 'select asin from audible' --fmt plain --no-headers | head -3\r\nasin\r\n0062804006\r\n0062891421\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/566/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1794604602, "node_id": "PR_kwDOBm6k_c5U-akg", "number": 2096, "title": "Clarify docs for descriptions in metadata", "user": {"value": 15906, "label": "garthk"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-07-08T01:57:58Z", "updated_at": "2023-07-08T01:58:13Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "simonw/datasette/pulls/2096", "body": "G'day! I got confused while debugging, earlier today. That's on me, but it does strike me a little repetition in the metadata documentation might help those flicking around it rather than reading it from top to bottom. No worries if you think otherwise.\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://datasette--2096.org.readthedocs.build/en/2096/\n\r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2096/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1794097871, "node_id": "I_kwDOBm6k_c5q78LP", "number": 2095, "title": "Introduce \"dark mode\" CSS", "user": {"value": 3315059, "label": "jamietanna"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-07-07T19:15:58Z", "updated_at": "2023-07-07T19:15:58Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "Using [the CSS media query `prefers-color-scheme`](https://developer.mozilla.org/en-US/docs/Web/CSS/@media/prefers-color-scheme) we can provide a dark-mode version of Datasette", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2095/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1784794489, "node_id": "I_kwDOCGYnMM5qYc15", "number": 562, "title": "Explore the intersection between sqlite-utils and dataclasses", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2023-07-02T19:23:08Z", "updated_at": "2023-07-02T19:26:39Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "> Aside: this makes me think it might be cool if `sqlite-utils` had a way of working with dataclasses rather than just dicts, and knew how to create a SQLite table to match a dataclass and maybe how to code-generate dataclasses for a specific table schema (dynamically or even using code-generation that can be written to disk, for better editor integrations).\r\n\r\n_Originally posted by @simonw in https://github.com/simonw/llm/issues/65#issuecomment-1616742529_\r\n ", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/562/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1783304750, "node_id": "I_kwDOBm6k_c5qSxIu", "number": 2094, "title": "JS Plugin Hooks for the Code Editor", "user": {"value": 15178711, "label": "asg017"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-07-01T00:51:57Z", "updated_at": "2023-07-01T00:51:57Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": null, "body": "When #2052 merges, I'd like to add support to add extensions/functions to the Datasette code editor. \r\n\r\nI'd eventually like to build a JS plugin for [`sqlite-docs`](https://github.com/asg017/sqlite-docs), to add things like:\r\n\r\n- Inline documentation for tables/columns on hover\r\n- Inline docs for custom functions that are loaded in\r\n- More detailed autocomplete for tables/columns/functions\r\n\r\nI did some hacking to see what this would look like, see here:\r\n\r\n\"image\"\r\n\"image\"\r\n\r\nThere can be a new hook that allows JS plugins to add new \"extension\" in the CodeMirror editorview here:\r\n\r\nhttps://github.com/simonw/datasette/blob/8cd60fd1d899952f1153460469b3175465f33f80/datasette/static/cm-editor-6.0.1.js#L25\r\n\r\nWill need some more planning. For example, the Codemirror bundle in Datasette has functions that we could re-export for plugins to use (so we don't load 2 version of `\"@codemirror/autocomplete\"`, for example. ", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2094/reactions\", \"total_count\": 1, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 1, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1781005740, "node_id": "I_kwDOBm6k_c5qJ_2s", "number": 2090, "title": "Adopt ruff for linting", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2023-06-29T14:56:43Z", "updated_at": "2023-06-29T15:05:04Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "https://beta.ruff.rs/docs/", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2090/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1054244712, "node_id": "I_kwDOBm6k_c4-1n9o", "number": 1510, "title": "Datasette 1.0 documented template context (maybe via API docs)", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": {"value": 3268330, "label": "Datasette 1.0"}, "comments": 3, "created_at": "2021-11-15T23:23:58Z", "updated_at": "2023-06-28T02:05:21Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "Documented context plus protective unit tests. Goal is that custom templates built for 1.x will not break without a 2.x release.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/1510/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 323223872, "node_id": "MDU6SXNzdWUzMjMyMjM4NzI=", "number": 260, "title": "Validate metadata.json on startup", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 7, "created_at": "2018-05-15T13:42:56Z", "updated_at": "2023-06-21T12:51:22Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "It's easy to misspell the name of a database or table and then be puzzled when the metadata settings silently fail.\r\n\r\nTo avoid this, let's sanity check the provided metadata.json on startup and quit with a useful error message if we find any obvious mistakes.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/260/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1764792125, "node_id": "I_kwDOBm6k_c5pMJc9", "number": 2086, "title": "Show information on startup in directory configuration mode", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-06-20T07:13:33Z", "updated_at": "2023-06-20T07:13:33Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "https://discord.com/channels/823971286308356157/823971286941302908/1120516587036889098\r\n\r\n> One thing that would be helpful would be message at launch indicating a metadata.json is getting picked up. I'm using directory mode and was editing the wrong file for awhile before I realize nothing I was doing was having any effect.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2086/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1762180409, "node_id": "I_kwDOBm6k_c5pCL05", "number": 2085, "title": "Interactive row selection in Datasette ", "user": {"value": 24938923, "label": "learning4life"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-06-18T08:29:45Z", "updated_at": "2023-06-18T08:31:23Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "Simon did a excellent [prototype](https://til.simonwillison.net/datasette/row-selection-prototype) of an interactive row selection in Datasette.\r\n\r\nI hope this [functionality](https://camo.githubusercontent.com/3d4a0f31fb6a27fd279f809af5b53dc3b76faa63c7721e228951c5252b645a77/68747470733a2f2f7374617469632e73696d6f6e77696c6c69736f6e2e6e65742f7374617469632f323032332f6461746173657474652d7069636b65722e676966) can be turned into a Datasette plugin.\r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2085/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1761613778, "node_id": "I_kwDOBm6k_c5pABfS", "number": 2084, "title": "Support facets for columns that contain timestamps", "user": {"value": 19492893, "label": "devxpy"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-06-17T03:33:54Z", "updated_at": "2023-06-17T03:33:54Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "\r\nDjango has this very nice filter for datetime fields -\r\n\r\n\"image\"\r\n\r\nIt would be nice to have something similar to facet by a field that contains a timestamp in datasette too - Which doesn't seem to do anything with timestamps right now...\r\n\r\n\"image\"\r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/datasette/issues/2084/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1383646615, "node_id": "I_kwDOCGYnMM5SeMWX", "number": 491, "title": "Ability to merge databases and tables", "user": {"value": 8904453, "label": "sgraaf"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 7, "created_at": "2022-09-23T11:10:55Z", "updated_at": "2023-06-14T22:14:24Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "Hi! Let me firstly say that I am a big fan of your work -- I follow your tweets and blog posts with great interest \ud83d\ude04.\r\n\r\nNow onto the matter at hand: I think it would be great if `sqlite-utils` included a `merge` or `combine` command, with the purpose of combining different SQLite databases into a single SQLite database. This way, the newly \"merged\" database would contain all differently named tables contained in the databases to be merged as-is, as well a concatenation of all tables of the same name.\r\n\r\nThis could look something like this:\r\n\r\n```bash\r\nsqlite-utils merge cats.db dogs.db > animals.db\r\n```\r\n\r\nI imagine this is rather straightforward if all databases involved in the merge contain differently named tables (i.e. no chance of conflicts), but things get slightly more complicated if two or more of the databases to be merged contain tables with the same name. Not only do you have to \"do something\" with the primary key(s), but these tables could also simply have different schemas (and therefore be incompatible for concatenation to begin with).\r\n\r\nAnyhow, I would love your thoughts on this, and, if you are open to it, work together on the design and implementation!", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/491/reactions\", \"total_count\": 2, \"+1\": 2, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1733198948, "node_id": "I_kwDOCGYnMM5nToRk", "number": 555, "title": "Filter table by a large bunch of ids", "user": {"value": 10843208, "label": "redraw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2023-05-31T00:29:51Z", "updated_at": "2023-06-14T22:01:57Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "Hi! this might be a question related to both SQLite & sqlite-utils, and you might be more experienced with them.\r\n\r\nI have a large bunch of ids, and I'm wondering which is the best way to query them in terms of performance, and simplicity if possible.\r\n\r\nThe naive approach would be something like `select * from table where rowid in (?, ?, ?...)` but that wouldn't scale if ids are >1k.\r\n\r\nAnother approach might be creating a temp table, or in-memory db table, insert all ids in that table and then join with the target one.\r\n\r\nI failed to attach an in-memory db both using sqlite-utils, and plain sql's execute(), so my closest approach is something like,\r\n\r\n```python\r\ndef filter_existing_video_ids(video_ids):\r\n db = get_db() # contains a \"videos\" table\r\n db.execute(\"CREATE TEMPORARY TABLE IF NOT EXISTS tmp (video_id TEXT NOT NULL PRIMARY KEY)\")\r\n db[\"tmp\"].insert_all([{\"video_id\": video_id} for video_id in video_ids])\r\n for row in db[\"tmp\"].rows_where(\"video_id not in (select video_id from videos)\"):\r\n yield row[\"video_id\"]\r\n db[\"tmp\"].drop()\r\n```\r\n\r\nThat kinda worked, I couldn't find an option in sqlite-utils's `create_table()` to tell it's a temporary table. Also, `tmp` table is not dropped finally, neither using `.drop()` despite being created with the keyword `TEMPORARY`. I believe it should be automatically dropped after connection/session ends though I read.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/555/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1751214236, "node_id": "I_kwDOC8SPRc5oYWic", "number": 36, "title": "Getting sqlite_master may not be modified when creating dogsheep index", "user": {"value": 8711912, "label": "khushmeeet"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-06-11T03:21:53Z", "updated_at": "2023-06-11T03:21:53Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "When creating a `dogsheep` index from `config.yml` file on pocket.db (created using pocket-to-sqlite), I am getting this error\r\n\r\n```\r\nTraceback (most recent call last):\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/bin/dogsheep-beta\", line 8, in \r\n sys.exit(cli())\r\n ^^^^^\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py\", line 1130, in __call__\r\n return self.main(*args, **kwargs)\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py\", line 1055, in main\r\n rv = self.invoke(ctx)\r\n ^^^^^^^^^^^^^^^^\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py\", line 1657, in invoke\r\n return _process_result(sub_ctx.command.invoke(sub_ctx))\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py\", line 1404, in invoke\r\n return ctx.invoke(self.callback, **ctx.params)\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py\", line 760, in invoke\r\n return __callback(*args, **kwargs)\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/dogsheep_beta/cli.py\", line 36, in index\r\n run_indexer(\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/dogsheep_beta/utils.py\", line 32, in run_indexer\r\n ensure_table_and_indexes(db, tokenize)\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/dogsheep_beta/utils.py\", line 91, in ensure_table_and_indexes\r\n table.add_foreign_key(*fk)\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/sqlite_utils/db.py\", line 2155, in add_foreign_key\r\n self.db.add_foreign_keys([(self.name, column, other_table, other_column)])\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/sqlite_utils/db.py\", line 1116, in add_foreign_keys\r\n cursor.execute(\r\nsqlite3.OperationalError: table sqlite_master may not be modified\r\n```\r\n\r\nCommand I ran to get this error\r\n```\r\ndogsheep-beta index pocket.db config.yml\r\n```\r\n\r\nDogsheep version\r\n```\r\ndogsheep-beta, version 0.10.2\r\n```\r\n\r\nPython version \r\n```\r\nPython 3.11.2\r\n```", "repo": {"value": 197431109, "label": "dogsheep-beta"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-beta/issues/36/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1740026046, "node_id": "I_kwDOCGYnMM5ntrC-", "number": 556, "title": "Support storing incrementally piped values", "user": {"value": 601708, "label": "mcint"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2023-06-04T00:45:23Z", "updated_at": "2023-06-04T01:21:15Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": null, "body": "I'm trying to use sqlite-utils to data generated incrementally. There are a few\naspects of this that I don't currently know how to handle. I would like an option\nto apply writes incrementally, line-by-line as they are received. I would like an\noption to echo incremental progress. And, it would be nice to have \n\nIn particular, I'm using CoreLocationCLI -w -j to generate, newline-delimited JSON.\n\nOne variant of the command \n\n`stdbuf -oL CoreLocationCLI -w -j | pee 'sqlite-utils insert loc.db loc -' nl`\n\n`pee`, from `moreutils`, is like `tee` but spawns and pipes to the processes\ncreated by invoking each of its arguments, so, for gratuitous demonstration,\n`pee 'sponge out.log' cat` would behave like `tee`.\n\nIt looks like I can get what I want with:\n`stdbuf -oL CoreLocationCLI -w -j | while read line; do <<<\"$line\" sqlite-utils insert loc.db loc -; echo \"$line\"; done | nl`\n\n\n\n\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/556/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null}