{"id": 1114543475, "node_id": "I_kwDOCGYnMM5CbpVz", "number": 388, "title": "Link to stable docs from older versions", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 7, "created_at": "2022-01-26T01:55:46Z", "updated_at": "2023-03-26T23:43:12Z", "closed_at": "2022-01-26T02:00:22Z", "author_association": "OWNER", "pull_request": null, "body": "https://sqlite-utils.datasette.io/en/2.14.1/ isn't showing a link to the stable release right now.\r\n\r\nI should also apply the same fix I used for Datasette in:\r\n- https://github.com/simonw/datasette/issues/1608\r\n\r\nTIL: https://til.simonwillison.net/readthedocs/link-from-latest-to-stable", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/388/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1620516340, "node_id": "I_kwDOCGYnMM5glx30", "number": 533, "title": "ReadTheDocs error: not all arguments converted during string formatting", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2023-03-12T21:21:05Z", "updated_at": "2023-03-12T21:25:33Z", "closed_at": "2023-03-12T21:25:33Z", "author_association": "OWNER", "pull_request": null, "body": "This came up as a failure running tests for:\r\n- #531\r\n\r\nTraceback on https://readthedocs.org/projects/sqlite-utils/builds/19749348/\r\n\r\n```\r\n File \"/home/docs/checkouts/readthedocs.org/user_builds/sqlite-utils/envs/531/lib/python3.8/site-packages/docutils/parsers/rst/states.py\", line 889, in interpreted\r\n nodes, messages2 = role_fn(role, rawsource, text, lineno, self)\r\n File 
\"/home/docs/checkouts/readthedocs.org/user_builds/sqlite-utils/envs/531/lib/python3.8/site-packages/sphinx/ext/extlinks.py\", line 103, in role\r\n title = caption % part\r\nTypeError: not all arguments converted during string formatting\r\n\r\nException occurred:\r\n File \"/home/docs/checkouts/readthedocs.org/user_builds/sqlite-utils/envs/531/lib/python3.8/site-packages/sphinx/ext/extlinks.py\", line 103, in role\r\n title = caption % part\r\nTypeError: not all arguments converted during string formatting\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/533/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1572766460, "node_id": "I_kwDOCGYnMM5dvoL8", "number": 524, "title": "Transformation type `--type DATETIME`", "user": {"value": 21095447, "label": "4l1fe"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 15, "created_at": "2023-02-06T15:18:42Z", "updated_at": "2023-02-15T12:10:54Z", "closed_at": "2023-02-15T12:10:54Z", "author_association": "NONE", "pull_request": null, "body": "Hey. 
Currently i do transformation with the type `--type TEXT`, but i noticed using the sqlalchemy based library [dataset](https://github.com/pudo/dataset) that is reading and writing differ depending on the column types `TEXT`, `DATETIME`.\r\n\r\nIs it possible to alter a column type to `DATETIME` somehow using Sqlite-Utils?", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/524/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1560651350, "node_id": "I_kwDOCGYnMM5dBaZW", "number": 523, "title": "Feature request: trim all leading and trailing white space for all columns for all tables in a database", "user": {"value": 536941, "label": "fgregg"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2023-01-28T02:40:10Z", "updated_at": "2023-01-28T02:41:14Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": null, "body": "It's pretty common that i need to trim leading or trailing white space from lots of columns in a database a part of an initial ETL.\r\n\r\nI use the following recipe a lot, and it would be great to include this functionality into sqlite-utils\r\n\r\n`trimify.sql`\r\n```sql\r\nselect 'select group_concat(''update [' || name || '] set ['' || name || ''] = trim(['' || name || ''])'', '';\r\n'') || '';\r\n'' as sql_to_run from pragma_table_info('''||name||''');' from sqlite_schema;\r\n```\r\n\r\nthen something like:\r\n\r\n```bash\r\n\tsqlite3 example.db < scripts/trimify.sql > table_trim.sql && \\\r\n sqlite3 $example.db < table_trim.sql > trim.sql && \\\r\n sqlite3 $example.db < trim.sql\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", 
"active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/523/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 743384829, "node_id": "MDExOlB1bGxSZXF1ZXN0NTIxMjg3OTk0", "number": 203, "title": "changes to allow for compound foreign keys", "user": {"value": 1049910, "label": "drkane"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 7, "created_at": "2020-11-16T00:30:10Z", "updated_at": "2023-01-25T18:47:18Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "simonw/sqlite-utils/pulls/203", "body": "Add support for compound foreign keys, as per issue #117 \r\n\r\nNot sure if this is the right approach. In particular I'm unsure about:\r\n\r\n - the new `ForeignKey` class, which replaces the namedtuple in order to ensure that `column` and `other_column` are forced into tuples. The class does the job, but doesn't feel very elegant.\r\n - I haven't rewritten `guess_foreign_table` to take account of multiple columns, so it just checks for the first column in the foreign key definition. 
This isn't ideal.\r\n - I haven't added any ability to the CLI to add compound foreign keys, it's only in the python API at the moment.\r\n\r\nThe PR also contains a minor related change that columns and tables are always quoted in foreign key definitions.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/203/reactions\", \"total_count\": 1, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 1, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1550536442, "node_id": "I_kwDOCGYnMM5ca076", "number": 521, "title": "Custom JSON encoder", "user": {"value": 31504, "label": "janrito"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-01-20T09:19:40Z", "updated_at": "2023-01-20T09:19:40Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "It would be nice if we could specify a custom encoder (and decoder) for types that will need extra deserialisation \u2013 e.g., sets, enums or sparse matrices \u2013 or even project-specific types", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/521/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1000275035, "node_id": "PR_kwDOCGYnMM4r7n-9", "number": 327, "title": "Extract expand: Support JSON Arrays", "user": {"value": 101753, "label": "phaer"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2021-09-19T10:34:30Z", "updated_at": "2022-12-29T09:05:36Z", "closed_at": "2022-12-29T09:05:36Z", 
"author_association": "NONE", "pull_request": "simonw/sqlite-utils/pulls/327", "body": "Hi,\r\n\r\nI needed to extract data in JSON Arrays to normalize data imports. I've quickly hacked the following together based on #241 which refers to #239 where you, @simonw, wrote:\r\n\r\n> Could this handle lists of objects too? That would be pretty amazing - if the column has a [{...}, {...}] list in it could turn that into a many-to-many.\r\n\r\nThey way this works in my work is that many-to-many relationships are created for anything that maps to an dictionary in a list, and many-to-one relations for everything else (assumed to be scalar values). Not sure what the best approach here would be? Are many-to-one relationships are at all useful here?\r\n\r\nWhat do you think about this approach? I could try to add it to the cli interface and documentation if wanted.\r\n\r\nThanks for this awesome piece of software in any case! :sun_with_face: ", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/327/reactions\", \"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1373224657, "node_id": "I_kwDOCGYnMM5R2b7R", "number": 488, "title": "`sqlite-utils transform` should set empty strings to null when converting text columns to integer/float", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 5, "created_at": "2022-09-14T15:51:30Z", "updated_at": "2022-12-23T17:38:55Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "```\r\n/tmp % echo \"id,age,weight\\n1,3,2.5\\n2,,\" | sqlite-utils insert test.db test - --csv\r\n/tmp % sqlite-utils schema test.db \r\nCREATE TABLE [test] (\r\n [id] TEXT,\r\n [age] 
TEXT,\r\n [weight] TEXT\r\n);\r\n/tmp % sqlite-utils transform test.db test --type age integer --type weight float \r\n/tmp % sqlite-utils schema test.db \r\nCREATE TABLE \"test\" (\r\n [id] TEXT,\r\n [age] INTEGER,\r\n [weight] FLOAT\r\n);\r\n/tmp % sqlite-utils rows test.db test\r\n[{\"id\": \"1\", \"age\": 3, \"weight\": 2.5},\r\n {\"id\": \"2\", \"age\": \"\", \"weight\": \"\"}]\r\n```\r\nIt would be neat if this resulted in the following instead:\r\n```\r\n {\"id\": \"2\", \"age\": null, \"weight\": null}\r\n```\r\nRelated Discord discussion: https://discord.com/channels/823971286308356157/823971286941302908/1019635490833567794", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/488/reactions\", \"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1487764628, "node_id": "I_kwDOCGYnMM5YrXyU", "number": 518, "title": "flake8 ValueError: Error code '#' supplied to 'extend-ignore' option...", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2022-12-10T01:30:24Z", "updated_at": "2022-12-10T01:36:46Z", "closed_at": "2022-12-10T01:36:46Z", "author_association": "OWNER", "pull_request": null, "body": "> `Error code '#' supplied to 'extend-ignore' option does not match '^[A-Z]{1,3}[0-9]{0,3}$'`\r\n\r\nhttps://github.com/simonw/sqlite-utils/actions/runs/3662011265/jobs/6190770361\r\n\r\nI think from this:\r\n\r\nhttps://github.com/simonw/sqlite-utils/blob/e660635cea6c32f4022818380b1e1ee88e7c93a6/setup.cfg#L1-L3\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": 
\"https://api.github.com/repos/simonw/sqlite-utils/issues/518/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1487757143, "node_id": "I_kwDOCGYnMM5YrV9X", "number": 517, "title": "Drop support for Python 3.6", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-12-10T01:23:31Z", "updated_at": "2022-12-10T01:36:36Z", "closed_at": "2022-12-10T01:36:36Z", "author_association": "OWNER", "pull_request": null, "body": "CI has started failing for Python 3.6: https://github.com/simonw/sqlite-utils/actions/runs/3576322798\r\n\r\n\"image\"\r\n\r\nIt's fixable by swiching away from `ubuntu-latest` according to:\r\n\r\n- https://github.com/actions/setup-python/issues/355#issuecomment-1335042510\r\n\r\nBut https://endoflife.date/python says that 3.6 end of life was almost 6 years ago, and end of security support nearly 1 year ago.\r\n\r\nSo I'm OK dropping support entirely - Python 3.6 users will still be able to install version 3.30, just not any releases that come next.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/517/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1479914599, "node_id": "I_kwDOCGYnMM5YNbRn", "number": 516, "title": "Feature request: output number of ignored/replaced rows for insert command", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 4, "created_at": "2022-12-06T18:59:21Z", "updated_at": "2022-12-06T19:08:14Z", "closed_at": 
null, "author_association": "OWNER", "pull_request": null, "body": "https://hachyderm.io/@briandorsey/109468185742876820\r\n\r\n> I'm fiddling with piping json to `insert -ignore` I'd love to see the count of records inserted & ignored, but didn't see a way to do that in the help/docs.\r\n>\r\n> Example: `xh \"https://hachyderm.io/api/v1/timelines/tag/rust?max_id=109443380308326328\" | sqlite-utils insert aoc.db aoc - --pk=id --ignore`", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/516/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1434911255, "node_id": "I_kwDOCGYnMM5VhwIX", "number": 510, "title": "Cannot enable FTS5 despite it being available", "user": {"value": 1176293, "label": "ar-jan"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2022-11-03T16:03:49Z", "updated_at": "2022-11-18T18:37:52Z", "closed_at": "2022-11-17T10:36:28Z", "author_association": "NONE", "pull_request": null, "body": "When I do `sqlite-utils enable-fts my.db table_name column_name` (with or without `--fts5`), I get an FTS4 virtual table instead of the expected FTS5.\r\n\r\nFTS5 is however available and Python/SQLite versions do not seem to be the issue. 
I can manually create the FTS5 virtual table, and then Datasette also works with it from this same Python environment.\r\n\r\n`>>> sqlite3.version`\r\n`2.6.0`\r\n`>>> sqlite3.sqlite_version`\r\n`3.39.4`\r\n\r\n`PRAGMA compile_options;` includes `ENABLE_FTS5`.\r\n\r\n`sqlite-utils, version 3.30`.\r\n\r\nAny ideas what's happening and how to fix?", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/510/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1430563092, "node_id": "PR_kwDOCGYnMM5B6_6K", "number": 508, "title": "Allow surrogates in parameters", "user": {"value": 7908073, "label": "chapmanjacobd"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2022-10-31T22:11:49Z", "updated_at": "2022-11-17T15:11:16Z", "closed_at": "2022-10-31T22:55:36Z", "author_association": "CONTRIBUTOR", "pull_request": "simonw/sqlite-utils/pulls/508", "body": "closes #507\r\n\r\nhttps://dwheeler.com/essays/fixing-unix-linux-filenames.html\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://sqlite-utils--508.org.readthedocs.build/en/508/\n\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/508/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1453134846, "node_id": "I_kwDOCGYnMM5WnRP-", "number": 513, "title": "Add or document streamlined workflow for importing Datasette csv / json exports", "user": {"value": 19328961, 
"label": "henry501"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2022-11-17T10:54:47Z", "updated_at": "2022-11-17T10:54:47Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "I'm working on some small front-end enhancements to the laion-aesthetic-datasette project, and I wanted to partially populate a database directly using exports from the existing Datasette instance instead of downloading the parquet files and creating my own multi-GB database.\r\n\r\nThere have been a number of small issues that are certainly related to my relative lack of familiarity with the toolkit, but that are still surprising. \r\n\r\nFor example: a CSV export of the images table (http://laion-aesthetic.datasette.io/laion-aesthetic-6pls.csv?sql=select+rowid%2C+url%2C+text%2C+domain_id%2C+width%2C+height%2C+similarity%2C+punsafe%2C+pwatermark%2C+aesthetic%2C+hash%2C+__index_level_0__+from+images+order+by+random%28%29+limit+100) has nested single quotes, double quotes, and commas that aren't handled by rows_from_file. 
Similarly, the json output has to be manually transformed to add the column names and remove extraneous information before sqlite_utils can import it.\r\n\r\nI was able to work through these issues, but as an enhancement it would be really helpful to create or document a clear workflow that avoids the friction of this data transformation.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/513/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1450952393, "node_id": "I_kwDOCGYnMM5We8bJ", "number": 512, "title": "mypy failures in CI", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2022-11-16T06:22:48Z", "updated_at": "2022-11-16T07:49:51Z", "closed_at": "2022-11-16T07:49:50Z", "author_association": "OWNER", "pull_request": null, "body": "https://github.com/simonw/sqlite-utils/actions/runs/3472012235 failed on Python 3.11:\r\n\r\nTruncated output:\r\n```\r\nsqlite_utils/db.py:2467: note: PEP 484 prohibits implicit Optional. Accordingly, mypy has changed its default to no_implicit_optional=True\r\nsqlite_utils/db.py:2467: note: Use https://github.com/hauntsaninja/no_implicit_optional to automatically upgrade your codebase\r\nsqlite_utils/db.py:2530: error: Incompatible default for argument \"where\" (default has type \"None\", argument has type \"str\") [assignment]\r\nsqlite_utils/db.py:2530: note: PEP 484 prohibits implicit Optional. 
Accordingly, mypy has changed its default to no_implicit_optional=True\r\nsqlite_utils/db.py:2530: note: Use https://github.com/hauntsaninja/no_implicit_optional to automatically upgrade your codebase\r\nsqlite_utils/db.py:2658: error: Argument 1 to \"count_where\" of \"Queryable\" has incompatible type \"Optional[str]\"; expected \"str\" [arg-type]\r\nFound 23 errors in 1 file (checked 51 source files)\r\n```\r\nBest look at https://github.com/hauntsaninja/no_implicit_optional", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/512/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1436539554, "node_id": "I_kwDOCGYnMM5Vn9qi", "number": 511, "title": "[insert_all, upsert_all] IntegrityError: constraint failed", "user": {"value": 7908073, "label": "chapmanjacobd"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2022-11-04T19:21:48Z", "updated_at": "2022-11-04T22:59:54Z", "closed_at": "2022-11-04T22:54:09Z", "author_association": "CONTRIBUTOR", "pull_request": null, "body": "My understand is that `INSERT OR IGNORE` will ignore when inserts would cause duplicate keys so I'm not sure exactly why the error is raised from `sqlite3`.\r\n\r\n```\r\nimport argparse\r\nfrom pathlib import Path\r\n\r\nfrom xklb import db, utils\r\nfrom xklb.utils import log\r\n\r\n\r\ndef parse_args() -> argparse.Namespace:\r\n parser = argparse.ArgumentParser()\r\n parser.add_argument(\"database\")\r\n parser.add_argument(\"dbs\", nargs=\"*\")\r\n parser.add_argument(\"--upsert\")\r\n parser.add_argument(\"--db\", \"-db\", help=argparse.SUPPRESS)\r\n parser.add_argument(\"--verbose\", \"-v\", action=\"count\", default=0)\r\n args = 
parser.parse_args()\r\n\r\n if args.db:\r\n args.database = args.db\r\n Path(args.database).touch()\r\n args.db = db.connect(args)\r\n log.info(utils.dict_filter_bool(args.__dict__))\r\n\r\n return args\r\n\r\n\r\ndef merge_db(args, source_db):\r\n source_db = str(Path(source_db).resolve())\r\n\r\n s_db = db.connect(argparse.Namespace(database=source_db, verbose=args.verbose))\r\n for table in [s for s in s_db.table_names() if not \"_fts\" in s and not s.startswith(\"sqlite_\")]:\r\n log.info(\"[%s]: %s\", source_db, table)\r\n with s_db.conn:\r\n data = s_db[table].rows\r\n\r\n with args.db.conn:\r\n if args.upsert:\r\n args.db[table].upsert_all(data, pk=args.upsert.split(\",\"), alter=True)\r\n else:\r\n args.db[table].insert_all(data, alter=True, replace=True)\r\n\r\n\r\ndef merge_dbs():\r\n args = parse_args()\r\n for s_db in args.dbs:\r\n merge_db(args, s_db)\r\n\r\n\r\nif __name__ == \"__main__\":\r\n merge_dbs()\r\n\r\n```\r\n\r\n```\r\n$ lb-dev merge video.db tube_71.db --upsert path -vv\r\nSQL: INSERT OR IGNORE INTO [media]([path]) VALUES(?); - params: ['https://archive.org/details/088ghostofachanceroygetssackedrevengeofthelivinglunchdvdripxvidphz']\r\n...\r\nFile ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:3122, in Table.insert_all(self, records, pk, foreign_keys, column_order, not_null, defaults, batch_size, hash_id, hash_id_columns, alter, ignore, replace, truncate, extracts, conversions, columns, upsert, analyze)\r\n 3116 all_columns += [\r\n 3117 column for column in record if column not in all_columns\r\n 3118 ]\r\n 3120 first = False\r\n-> 3122 self.insert_chunk(\r\n 3123 alter,\r\n 3124 extracts,\r\n 3125 chunk,\r\n 3126 all_columns,\r\n 3127 hash_id,\r\n 3128 hash_id_columns,\r\n 3129 upsert,\r\n 3130 pk,\r\n 3131 conversions,\r\n 3132 num_records_processed,\r\n 3133 replace,\r\n 3134 ignore,\r\n 3135 )\r\n 3137 if analyze:\r\n 3138 self.analyze()\r\n\r\nFile ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:2887, in 
Table.insert_chunk(self, alter, extracts, chunk, all_columns, hash_id, hash_id_columns, upsert, pk, conversions, num_records_processed, replace, ignore)\r\n 2885 for query, params in queries_and_params:\r\n 2886 try:\r\n-> 2887 result = self.db.execute(query, params)\r\n 2888 except OperationalError as e:\r\n 2889 if alter and (\" column\" in e.args[0]):\r\n 2890 # Attempt to add any missing columns, then try again\r\n\r\nFile ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:484, in Database.execute(self, sql, parameters)\r\n 482 self._tracer(sql, parameters)\r\n 483 if parameters is not None:\r\n--> 484 return self.conn.execute(sql, parameters)\r\n 485 else:\r\n 486 return self.conn.execute(sql)\r\n\r\nIntegrityError: constraint failed\r\n> /home/xk/.local/lib/python3.10/site-packages/sqlite_utils/db.py(484)execute()\r\n 482 self._tracer(sql, parameters)\r\n 483 if parameters is not None:\r\n--> 484 return self.conn.execute(sql, parameters)\r\n 485 else:\r\n 486 return self.conn.execute(sql)\r\n```\r\n\r\n```\r\nsqlite3 --version\r\n3.36.0 2021-06-18 18:36:39\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/511/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 473083260, "node_id": "MDU6SXNzdWU0NzMwODMyNjA=", "number": 50, "title": "\"Too many SQL variables\" on large inserts", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 4, "created_at": "2019-07-25T21:43:31Z", "updated_at": "2022-11-04T14:38:36Z", "closed_at": "2019-07-28T11:59:33Z", "author_association": "OWNER", "pull_request": null, "body": "Reported here: 
https://github.com/dogsheep/healthkit-to-sqlite/issues/9\r\n\r\nIt looks like there's a default limit of 999 variables - we need to be smart about that, maybe dynamically lower the batch size based on the number of columns.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/50/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1429029604, "node_id": "I_kwDOCGYnMM5VLULk", "number": 506, "title": "Make `cursor.rowcount` accessible (wontfix)", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2022-10-30T21:51:55Z", "updated_at": "2022-11-01T17:37:47Z", "closed_at": "2022-11-01T17:37:13Z", "author_association": "OWNER", "pull_request": null, "body": "In building this Datasette feature on top of `sqlite-utils` I thought it might be useful to expose the number of rows that had been affected by a bulk insert or update - the `cursor.rowcount`:\r\n\r\n- https://github.com/simonw/datasette/issues/1866\r\n\r\nThis isn't currently exposed by `sqlite-utils`.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/506/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1430325103, "node_id": "I_kwDOCGYnMM5VQQdv", "number": 507, "title": "conn.execute: UnicodeEncodeError: 'utf-8' codec can't encode character", "user": {"value": 7908073, "label": "chapmanjacobd"}, "state": "closed", 
"locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-10-31T18:49:51Z", "updated_at": "2022-11-01T00:40:17Z", "closed_at": "2022-11-01T00:40:16Z", "author_association": "CONTRIBUTOR", "pull_request": null, "body": "I'm not really sure what caused this and it happened in the middle of my program (after running for 35775 seconds).\r\n\r\n```\r\nExtracting metadata 49.9% (chunk 9893 of 19831)\r\n...\r\n File \"/home/xk/.local/lib/python3.10/site-packages/xklb/fs_extract.py\", line 90, in extract_chunk\r\n args.db[\"media\"].insert_all(utils.list_dict_filter_bool(media), pk=\"path\", alter=True, replace=True)\r\n File \"/home/xk/.local/lib/python3.10/site-packages/sqlite_utils/db.py\", line 3107, in insert_all\r\n self.insert_chunk(\r\n File \"/home/xk/.local/lib/python3.10/site-packages/sqlite_utils/db.py\", line 2872, in insert_chunk\r\n result = self.db.execute(query, params)\r\n File \"/home/xk/.local/lib/python3.10/site-packages/sqlite_utils/db.py\", line 483, in execute\r\n return self.conn.execute(sql, parameters)\r\nUnicodeEncodeError: 'utf-8' codec can't encode character '\\udcc3' in position 62: surrogates not allowed\r\n```\r\n\r\nThis might be relevant: https://stackoverflow.com/questions/31898353/python-cant-encode-with-surrogateescape\r\n\r\nI'm going to try re-running with \r\n\r\n```py\r\n def execute(\r\n self, sql: str, parameters: Optional[Union[Iterable, dict]] = None\r\n ) -> sqlite3.Cursor:\r\n \"\"\"\r\n Execute SQL query and return a ``sqlite3.Cursor``.\r\n\r\n :param sql: SQL query to execute\r\n :param parameters: Parameters to use in that query - an iterable for ``where id = ?``\r\n parameters, or a dictionary for ``where id = :id``\r\n \"\"\"\r\n try:\r\n if self._tracer:\r\n self._tracer(sql, parameters)\r\n if parameters is not None:\r\n return self.conn.execute(sql, parameters)\r\n else:\r\n return self.conn.execute(sql)\r\n except UnicodeEncodeError:\r\n sql = sql.encode('utf-8', 
'surrogatepass').decode('utf-8')\r\n if parameters is not None:\r\n parameters = parameters.encode('utf-8', 'surrogatepass').decode('utf-8')\r\n return self.execute(sql, parameters)\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/507/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1405196044, "node_id": "PR_kwDOCGYnMM5AmYzy", "number": 499, "title": "feat: recreate fts triggers after table transform", "user": {"value": 7908073, "label": "chapmanjacobd"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2022-10-11T20:35:39Z", "updated_at": "2022-10-26T17:54:51Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": "simonw/sqlite-utils/pulls/499", "body": "https://github.com/simonw/sqlite-utils/pull/498\r\n\r\n\r\n----\r\n:books: Documentation preview :books:: https://sqlite-utils--499.org.readthedocs.build/en/499/\r\n\r\n\r\n\r\nalternatively, `self.disable_fts()`", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/499/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1423182778, "node_id": "I_kwDOCGYnMM5U1Au6", "number": 505, "title": "Release sqlite-utils 3.30", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2022-10-25T22:20:05Z", "updated_at": "2022-10-25T22:41:26Z", "closed_at": 
"2022-10-25T22:41:16Z", "author_association": "OWNER", "pull_request": null, "body": "https://github.com/simonw/sqlite-utils/compare/3.29...defa2974c6d3abc19be28d6b319649b8028dc966", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/505/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1386562662, "node_id": "I_kwDOCGYnMM5SpURm", "number": 493, "title": "Tiny typographical error in install/uninstall docs", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2022-09-26T19:00:42Z", "updated_at": "2022-10-25T21:31:15Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "Added in:\r\n- #483\r\n\r\nI don't know how to fix this in Sphinx: I'm getting this: https://sqlite-utils.datasette.io/en/latest/cli.html#cli-install\r\n\r\n> The [insert \u2013convert](https://sqlite-utils.datasette.io/en/latest/cli.html#cli-insert-convert) and [query \u2013functions](https://sqlite-utils.datasette.io/en/latest/cli.html#cli-query-functions) options\r\n\r\n\"image\"\r\n\r\nBut I want it to display `insert --convert` and not `insert \u2013convert` there.\r\n\r\nHere's the code: https://github.com/simonw/sqlite-utils/blob/85247038f70d7eb2f3e272cfeaa4c44459cafba8/docs/cli.rst#L2125", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/493/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, 
"state_reason": null} {"id": 1392690202, "node_id": "I_kwDOCGYnMM5TAsQa", "number": 495, "title": "Support JSON values returned from .convert() functions", "user": {"value": 649467, "label": "mhalle"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2022-09-30T16:33:49Z", "updated_at": "2022-10-25T21:23:37Z", "closed_at": "2022-10-25T21:23:28Z", "author_association": "NONE", "pull_request": null, "body": "When using the convert function on a JSON column, the result of the conversion function must be a string. If the return value is either a dict (object) or a list (array), the convert call will error out with an unhelpful user defined function exception. \r\n\r\nIt makes sense that since the original column value was a string and required conversion to data structures, the result should be converted back into a JSON string as well. However, other functions auto-convert to JSON string representation, so the fact that convert doesn't could be surprising.\r\n\r\nAt least the documentation should note this requirement, because the sqlite error messages won't readily reveal the issue.\r\n\r\nIf only sqlite's JSON column type meant something :)", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/495/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1393212964, "node_id": "I_kwDOCGYnMM5TCr4k", "number": 497, "title": "column_names", "user": {"value": 7908073, "label": "chapmanjacobd"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-10-01T03:34:21Z", "updated_at": "2022-10-25T21:09:28Z", "closed_at": "2022-10-25T21:09:28Z", "author_association": "CONTRIBUTOR", 
"pull_request": null, "body": "It would be nice to have a `column_names`. Similar to `table_names`.\r\n\r\nOr if you could get one or all of the following syntax to work for both Database and Table that might be even better: \r\n\r\nStyle 1\r\n- `if 'table1' in db`\r\n- `if 'col1' in db['table1']`\r\n\r\nStyle 2\r\n- `if 'table1' in db.tables`\r\n- `if 'col1' in db['table1'].columns`\r\n\r\nmaybe the table ones actually work but I'm too lazy to check. I just know that I have to do:\r\n\r\n `[c.name for c in db['table1'].columns]`\r\n\r\nEdit: This is possible with `columns_dict`. I have actually used that before but I forgot about it. Feel free to close, but I do think accessing this data could be more consistent and intuitive.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/497/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1423069384, "node_id": "I_kwDOCGYnMM5U0lDI", "number": 504, "title": "db.close() method, calling db.conn.close()", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-10-25T20:50:50Z", "updated_at": "2022-10-25T21:00:29Z", "closed_at": "2022-10-25T20:57:47Z", "author_association": "OWNER", "pull_request": null, "body": "I ended up needing to use `db.conn.close()` to fix this issue:\r\n- #503\r\n\r\nI think `.close()` should be a method on `Database` itself.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/504/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, 
\"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1423000702, "node_id": "I_kwDOCGYnMM5U0UR-", "number": 503, "title": "test_recreate failing on Windows Python 3.11", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 10, "created_at": "2022-10-25T20:01:41Z", "updated_at": "2022-10-25T20:47:34Z", "closed_at": "2022-10-25T20:45:43Z", "author_association": "OWNER", "pull_request": null, "body": "https://github.com/simonw/sqlite-utils/actions/runs/3323672128/jobs/5494726927\r\n\r\nRelated:\r\n- #502\r\n\r\n```\r\nFAILED tests/test_recreate.py::test_recreate[True-True] - \r\n PermissionError: [WinError 32] The process cannot access the file because it is being used by another process:\r\n 'C:\\\\Users\\\\runneradmin\\\\AppData\\\\Local\\\\Temp\\\\pytest-of-runneradmin\\\\pytest-0\\\\test_recreate_True_True_0\\\\data.db'\r\nFAILED tests/test_recreate.py::test_recreate[False-True] - \r\n PermissionError: [WinError 32] The process cannot access the file because it is being used by another process:\r\n 'C:\\\\Users\\\\runneradmin\\\\AppData\\\\Local\\\\Temp\\\\pytest-of-runneradmin\\\\pytest-0\\\\test_recreate_False_True_0\\\\data.db'\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/503/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1422954582, "node_id": "I_kwDOCGYnMM5U0JBW", "number": 502, "title": "Fix tests for Python 3.11", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-10-25T19:20:31Z", 
"updated_at": "2022-10-25T19:23:47Z", "closed_at": "2022-10-25T19:23:47Z", "author_association": "OWNER", "pull_request": null, "body": "The way errors are represented has changed: https://github.com/simonw/sqlite-utils/actions/runs/3323588047/jobs/5494127154\r\n```\r\n_________________________ test_query_invalid_function __________________________\r\n\r\ndb_path = '/tmp/pytest-of-runner/pytest-0/test_query_invalid_function0/test.db'\r\n\r\n def test_query_invalid_function(db_path):\r\n result = CliRunner().invoke(\r\n cli.cli, [db_path, \"select bad()\", \"--functions\", \"def invalid_python\"]\r\n )\r\n assert result.exit_code == 1\r\n> assert (\r\n result.output.strip()\r\n == \"Error: Error in functions definition: invalid syntax (, line 1)\"\r\n )\r\nE AssertionError: assert 'Error: Error...ing>, line 1)' == 'Error: Error...ing>, line 1)'\r\nE - Error: Error in functions definition: invalid syntax (, line 1)\r\nE ? ^^^^^^ ^^^^^^\r\nE + Error: Error in functions definition: expected '(' (, line 1)\r\nE ? 
^^^^^^^ ^^^\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/502/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1413641049, "node_id": "I_kwDOCGYnMM5UQnNZ", "number": 501, "title": "Tests failing due to updated tabulate library", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 4, "created_at": "2022-10-18T18:07:52Z", "updated_at": "2022-10-18T18:23:40Z", "closed_at": "2022-10-18T18:23:40Z", "author_association": "OWNER", "pull_request": null, "body": "Failure here: https://github.com/simonw/sqlite-utils/actions/runs/3275786702/jobs/5391063221\r\n\r\nI figured out the problem:\r\n\r\n```diff\r\ndiff --git a/docs/cli-reference.rst b/docs/cli-reference.rst\r\nindex b88e38a..82b4b6c 100644\r\n--- a/docs/cli-reference.rst\r\n+++ b/docs/cli-reference.rst\r\n@@ -112,11 +112,15 @@ See :ref:`cli_query`.\r\n --tsv Output TSV\r\n --no-headers Omit CSV headers\r\n -t, --table Output as a formatted table\r\n- --fmt TEXT Table format - one of fancy_grid, fancy_outline,\r\n- github, grid, html, jira, latex, latex_booktabs,\r\n- latex_longtable, latex_raw, mediawiki, moinmoin,\r\n- orgtbl, pipe, plain, presto, pretty, psql, rst,\r\n- simple, textile, tsv, unsafehtml, youtrack\r\n+ --fmt TEXT Table format - one of asciidoc, double_grid,\r\n+ double_outline, fancy_grid, fancy_outline, github,\r\n+ grid, heavy_grid, heavy_outline, html, jira,\r\n+ latex, latex_booktabs, latex_longtable, latex_raw,\r\n+ mediawiki, mixed_grid, mixed_outline, moinmoin,\r\n+ orgtbl, outline, pipe, plain, presto, pretty,\r\n+ psql, rounded_grid, rounded_outline, rst, simple,\r\n+ simple_grid, simple_outline, 
textile, tsv,\r\n+ unsafehtml, youtrack\r\n --json-cols Detect JSON cols and output them as JSON, not\r\n escaped strings\r\n -r, --raw Raw output, first column of first row\r\n@@ -176,11 +180,15 @@ See :ref:`cli_memory`.\r\n --tsv Output TSV\r\n --no-headers Omit CSV headers\r\n -t, --table Output as a formatted table\r\n- --fmt TEXT Table format - one of fancy_grid, fancy_outline,\r\n- github, grid, html, jira, latex, latex_booktabs,\r\n- latex_longtable, latex_raw, mediawiki, moinmoin,\r\n- orgtbl, pipe, plain, presto, pretty, psql, rst,\r\n- simple, textile, tsv, unsafehtml, youtrack\r\n+ --fmt TEXT Table format - one of asciidoc, double_grid,\r\n+ double_outline, fancy_grid, fancy_outline, github,\r\n+ grid, heavy_grid, heavy_outline, html, jira,\r\n+ latex, latex_booktabs, latex_longtable, latex_raw,\r\n+ mediawiki, mixed_grid, mixed_outline, moinmoin,\r\n+ orgtbl, outline, pipe, plain, presto, pretty,\r\n+ psql, rounded_grid, rounded_outline, rst, simple,\r\n+ simple_grid, simple_outline, textile, tsv,\r\n+ unsafehtml, youtrack\r\n --json-cols Detect JSON cols and output them as JSON, not\r\n escaped strings\r\n -r, --raw Raw output, first column of first row\r\n@@ -401,11 +409,14 @@ See :ref:`cli_search`.\r\n --tsv Output TSV\r\n --no-headers Omit CSV headers\r\n -t, --table Output as a formatted table\r\n- --fmt TEXT Table format - one of fancy_grid, fancy_outline,\r\n- github, grid, html, jira, latex, latex_booktabs,\r\n- latex_longtable, latex_raw, mediawiki, moinmoin,\r\n- orgtbl, pipe, plain, presto, pretty, psql, rst, simple,\r\n- textile, tsv, unsafehtml, youtrack\r\n+ --fmt TEXT Table format - one of asciidoc, double_grid,\r\n+ double_outline, fancy_grid, fancy_outline, github,\r\n+ grid, heavy_grid, heavy_outline, html, jira, latex,\r\n+ latex_booktabs, latex_longtable, latex_raw, mediawiki,\r\n+ mixed_grid, mixed_outline, moinmoin, orgtbl, outline,\r\n+ pipe, plain, presto, pretty, psql, rounded_grid,\r\n+ rounded_outline, rst, simple, 
simple_grid,\r\n+ simple_outline, textile, tsv, unsafehtml, youtrack\r\n --json-cols Detect JSON cols and output them as JSON, not escaped\r\n strings\r\n --load-extension TEXT Path to SQLite extension, with optional :entrypoint\r\n@@ -651,11 +662,14 @@ See :ref:`cli_tables`.\r\n --tsv Output TSV\r\n --no-headers Omit CSV headers\r\n -t, --table Output as a formatted table\r\n- --fmt TEXT Table format - one of fancy_grid, fancy_outline,\r\n- github, grid, html, jira, latex, latex_booktabs,\r\n- latex_longtable, latex_raw, mediawiki, moinmoin,\r\n- orgtbl, pipe, plain, presto, pretty, psql, rst, simple,\r\n- textile, tsv, unsafehtml, youtrack\r\n+ --fmt TEXT Table format - one of asciidoc, double_grid,\r\n+ double_outline, fancy_grid, fancy_outline, github,\r\n+ grid, heavy_grid, heavy_outline, html, jira, latex,\r\n+ latex_booktabs, latex_longtable, latex_raw, mediawiki,\r\n+ mixed_grid, mixed_outline, moinmoin, orgtbl, outline,\r\n+ pipe, plain, presto, pretty, psql, rounded_grid,\r\n+ rounded_outline, rst, simple, simple_grid,\r\n+ simple_outline, textile, tsv, unsafehtml, youtrack\r\n --json-cols Detect JSON cols and output them as JSON, not escaped\r\n strings\r\n --columns Include list of columns for each table\r\n@@ -689,11 +703,14 @@ See :ref:`cli_views`.\r\n --tsv Output TSV\r\n --no-headers Omit CSV headers\r\n -t, --table Output as a formatted table\r\n- --fmt TEXT Table format - one of fancy_grid, fancy_outline,\r\n- github, grid, html, jira, latex, latex_booktabs,\r\n- latex_longtable, latex_raw, mediawiki, moinmoin,\r\n- orgtbl, pipe, plain, presto, pretty, psql, rst, simple,\r\n- textile, tsv, unsafehtml, youtrack\r\n+ --fmt TEXT Table format - one of asciidoc, double_grid,\r\n+ double_outline, fancy_grid, fancy_outline, github,\r\n+ grid, heavy_grid, heavy_outline, html, jira, latex,\r\n+ latex_booktabs, latex_longtable, latex_raw, mediawiki,\r\n+ mixed_grid, mixed_outline, moinmoin, orgtbl, outline,\r\n+ pipe, plain, presto, pretty, psql, 
rounded_grid,\r\n+ rounded_outline, rst, simple, simple_grid,\r\n+ simple_outline, textile, tsv, unsafehtml, youtrack\r\n --json-cols Detect JSON cols and output them as JSON, not escaped\r\n strings\r\n --columns Include list of columns for each view\r\n@@ -732,11 +749,15 @@ See :ref:`cli_rows`.\r\n --tsv Output TSV\r\n --no-headers Omit CSV headers\r\n -t, --table Output as a formatted table\r\n- --fmt TEXT Table format - one of fancy_grid, fancy_outline,\r\n- github, grid, html, jira, latex, latex_booktabs,\r\n- latex_longtable, latex_raw, mediawiki, moinmoin,\r\n- orgtbl, pipe, plain, presto, pretty, psql, rst,\r\n- simple, textile, tsv, unsafehtml, youtrack\r\n+ --fmt TEXT Table format - one of asciidoc, double_grid,\r\n+ double_outline, fancy_grid, fancy_outline, github,\r\n+ grid, heavy_grid, heavy_outline, html, jira,\r\n+ latex, latex_booktabs, latex_longtable, latex_raw,\r\n+ mediawiki, mixed_grid, mixed_outline, moinmoin,\r\n+ orgtbl, outline, pipe, plain, presto, pretty,\r\n+ psql, rounded_grid, rounded_outline, rst, simple,\r\n+ simple_grid, simple_outline, textile, tsv,\r\n+ unsafehtml, youtrack\r\n --json-cols Detect JSON cols and output them as JSON, not\r\n escaped strings\r\n --load-extension TEXT Path to SQLite extension, with optional\r\n@@ -768,11 +789,14 @@ See :ref:`cli_triggers`.\r\n --tsv Output TSV\r\n --no-headers Omit CSV headers\r\n -t, --table Output as a formatted table\r\n- --fmt TEXT Table format - one of fancy_grid, fancy_outline,\r\n- github, grid, html, jira, latex, latex_booktabs,\r\n- latex_longtable, latex_raw, mediawiki, moinmoin,\r\n- orgtbl, pipe, plain, presto, pretty, psql, rst, simple,\r\n- textile, tsv, unsafehtml, youtrack\r\n+ --fmt TEXT Table format - one of asciidoc, double_grid,\r\n+ double_outline, fancy_grid, fancy_outline, github,\r\n+ grid, heavy_grid, heavy_outline, html, jira, latex,\r\n+ latex_booktabs, latex_longtable, latex_raw, mediawiki,\r\n+ mixed_grid, mixed_outline, moinmoin, orgtbl, outline,\r\n+ 
pipe, plain, presto, pretty, psql, rounded_grid,\r\n+ rounded_outline, rst, simple, simple_grid,\r\n+ simple_outline, textile, tsv, unsafehtml, youtrack\r\n --json-cols Detect JSON cols and output them as JSON, not escaped\r\n strings\r\n --load-extension TEXT Path to SQLite extension, with optional :entrypoint\r\n@@ -804,11 +828,14 @@ See :ref:`cli_indexes`.\r\n --tsv Output TSV\r\n --no-headers Omit CSV headers\r\n -t, --table Output as a formatted table\r\n- --fmt TEXT Table format - one of fancy_grid, fancy_outline,\r\n- github, grid, html, jira, latex, latex_booktabs,\r\n- latex_longtable, latex_raw, mediawiki, moinmoin,\r\n- orgtbl, pipe, plain, presto, pretty, psql, rst, simple,\r\n- textile, tsv, unsafehtml, youtrack\r\n+ --fmt TEXT Table format - one of asciidoc, double_grid,\r\n+ double_outline, fancy_grid, fancy_outline, github,\r\n+ grid, heavy_grid, heavy_outline, html, jira, latex,\r\n+ latex_booktabs, latex_longtable, latex_raw, mediawiki,\r\n+ mixed_grid, mixed_outline, moinmoin, orgtbl, outline,\r\n+ pipe, plain, presto, pretty, psql, rounded_grid,\r\n+ rounded_outline, rst, simple, simple_grid,\r\n+ simple_outline, textile, tsv, unsafehtml, youtrack\r\n --json-cols Detect JSON cols and output them as JSON, not escaped\r\n strings\r\n --load-extension TEXT Path to SQLite extension, with optional :entrypoint\r\ndiff --git a/docs/cli.rst b/docs/cli.rst\r\nindex 8bc4176..1d67e88 100644\r\n--- a/docs/cli.rst\r\n+++ b/docs/cli.rst\r\n@@ -187,10 +187,15 @@ Available ``--fmt`` options are:\r\n cog.out(\"\\n\" + \"\\n\".join('- ``{}``'.format(t) for t in tabulate.tabulate_formats) + \"\\n\\n\")\r\n .. 
]]]\r\n \r\n+- ``asciidoc``\r\n+- ``double_grid``\r\n+- ``double_outline``\r\n - ``fancy_grid``\r\n - ``fancy_outline``\r\n - ``github``\r\n - ``grid``\r\n+- ``heavy_grid``\r\n+- ``heavy_outline``\r\n - ``html``\r\n - ``jira``\r\n - ``latex``\r\n@@ -198,15 +203,22 @@ Available ``--fmt`` options are:\r\n - ``latex_longtable``\r\n - ``latex_raw``\r\n - ``mediawiki``\r\n+- ``mixed_grid``\r\n+- ``mixed_outline``\r\n - ``moinmoin``\r\n - ``orgtbl``\r\n+- ``outline``\r\n - ``pipe``\r\n - ``plain``\r\n - ``presto``\r\n - ``pretty``\r\n - ``psql``\r\n+- ``rounded_grid``\r\n+- ``rounded_outline``\r\n - ``rst``\r\n - ``simple``\r\n+- ``simple_grid``\r\n+- ``simple_outline``\r\n - ``textile``\r\n - ``tsv``\r\n - ``unsafehtml``\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/501/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1413610718, "node_id": "I_kwDOCGYnMM5UQfze", "number": 500, "title": "Turn --flatten into a documented utility function", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 4, "created_at": "2022-10-18T17:43:36Z", "updated_at": "2022-10-18T18:02:10Z", "closed_at": "2022-10-18T18:00:40Z", "author_association": "OWNER", "pull_request": null, "body": "The `--flatten` implementation isn't currently available to Python code - people have to roll their own implementation. 
Feedback from a conversation at DjangoCon.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/500/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1404013495, "node_id": "PR_kwDOCGYnMM5AicIh", "number": 498, "title": "fix: enable-fts permanently save triggers", "user": {"value": 7908073, "label": "chapmanjacobd"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2022-10-11T05:10:51Z", "updated_at": "2022-10-15T04:33:08Z", "closed_at": "2022-10-11T06:34:31Z", "author_association": "CONTRIBUTOR", "pull_request": "simonw/sqlite-utils/pulls/498", "body": "I was wondering why my all my databases were giving wild search results. Turns out create_trigger was not sticking!\r\n\r\nRunning `sqlite-utils triggers x.db` shows `[]` after running `enable-fts` using the python api. Looking at the counts trigger it seems that is the right way to save triggers. 
triggers show up now\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://sqlite-utils--498.org.readthedocs.build/en/498/\n\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/498/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1361355564, "node_id": "I_kwDOCGYnMM5RJKMs", "number": 482, "title": "balanced table default column_order", "user": {"value": 7908073, "label": "chapmanjacobd"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-09-05T03:00:18Z", "updated_at": "2022-10-10T17:43:02Z", "closed_at": "2022-09-06T20:17:27Z", "author_association": "CONTRIBUTOR", "pull_request": null, "body": "Is there any performance or size difference with column order in SQLITE ? similar to this https://www.cybertec-postgresql.com/en/column-order-in-postgresql-does-matter/\r\n\r\nIt might be interesting to have an option to create with an optimized column order. I'm assuming this would look something like INTEGER columns, REAL columns, BLOB columns, TEXT columns, NULL columns. 
NULL columns at the end because they are more likely to be TEXT and it is impossible to know if they will become INTEGER\r\n\r\n(Of course, any schema evolution would reduce optimization but maybe column order could also be re-evaluated when schema changes)\r\n\r\nedit:\r\n\r\nthis is easy to accomplish with the existing `transform` method:\r\n\r\n```\r\nint_columns = [k for k, v in table_columns.items() if v == int]\r\ndb[table].transform(column_order=[*int_columns])\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/482/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1149661489, "node_id": "I_kwDOCGYnMM5EhnEx", "number": 409, "title": "`with db:` for transactions", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2022-02-24T19:22:06Z", "updated_at": "2022-10-01T03:42:50Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "This can be a documented wrapper around `with db.conn:`.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/409/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1386593843, "node_id": "I_kwDOCGYnMM5Spb4z", "number": 494, "title": "Document how to use Just", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": 
"2022-09-26T19:25:12Z", "updated_at": "2022-09-26T19:32:36Z", "closed_at": "2022-09-26T19:26:39Z", "author_association": "OWNER", "pull_request": null, "body": "I'm using `just` a lot now, based on this file - I should add that to https://sqlite-utils.datasette.io/en/latest/contributing.html\r\n\r\nhttps://github.com/simonw/sqlite-utils/blob/afbd2b2cba45cccb305c3d4638d18db4dd3d4bbd/Justfile#L1-L24", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/494/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1363765916, "node_id": "I_kwDOCGYnMM5RSWqc", "number": 483, "title": "`sqlite-utils install` command", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2022-09-06T20:13:55Z", "updated_at": "2022-09-26T19:04:43Z", "closed_at": "2022-09-26T18:57:15Z", "author_association": "OWNER", "pull_request": null, "body": "With the addition of `--functions` in:\r\n- #471\r\n\r\nIn addition to the existing `convert` command, there are now very good reasons to want to install additional packages into the same virtual environment as `sqlite-utils` itself, to allow them to be used with those features.\r\n\r\nThis isn't easy if you installed the tool with `pipx` or `brew install sqlite-utils`.\r\n\r\nDatasette solved this problem with the `datasette install` command:\r\n\r\n- https://github.com/simonw/datasette/issues/925\r\n\r\n`sqlite-utils` could benefit from the same idea.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": 
\"https://api.github.com/repos/simonw/sqlite-utils/issues/483/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1386530156, "node_id": "I_kwDOCGYnMM5SpMVs", "number": 492, "title": "Idea: ability to pass extra variables to `--convert` scripts", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-09-26T18:30:45Z", "updated_at": "2022-09-26T18:33:19Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "Got this idea from this example in https://jeqo.github.io/notes/2022-09-24-ingest-logs-sqlite/\r\n\r\n```bash\r\nsqlite-utils insert /tmp/kafka-logs.db logs server.log.2022-09-24-21 --text --convert \"\r\nimport re\r\nr = re.compile(r'^\\[(?P\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3})\\] (?P\\w+) (?P(.+(\\n(?\\!\\[).+|)+))', re.MULTILINE)\r\ndef convert(text):\r\n rows = [m.groupdict() for m in r.finditer(text)]\r\n for row in rows:\r\n row.update({'server': 'localhost'})\r\n row.update({'component': 'broker'})\r\n return rows\r\n\"\r\n```\r\nAnd the accompanying note:\r\n\r\n> The `row.update` allows to label rows as I\u2019m planning to ingest logs from different hosts and potentially different components.\r\n\r\nThis made me think: it might be neat if you could inject additional variable values into that script with extra command-line options, to make this kind of reuse easier. 
Something like this:\r\n\r\n```bash\r\nsqlite-utils insert /tmp/kafka-logs.db logs server.log.2022-09-24-21 --text --convert \"\r\nimport re\r\nr = re.compile(r'^\\[(?P\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3})\\] (?P\\w+) (?P(.+(\\n(?\\!\\[).+|)+))', re.MULTILINE)\r\ndef convert(text):\r\n rows = [m.groupdict() for m in r.finditer(text)]\r\n for row in rows:\r\n row.update({'server': server})\r\n row.update({'component': component})\r\n return rows\r\n\" --var server \"localhost\" --var component \"broker\"\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/492/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1382457780, "node_id": "I_kwDOCGYnMM5SZqG0", "number": 490, "title": "Ability to insert multi-line files", "user": {"value": 6180701, "label": "jeqo"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 4, "created_at": "2022-09-22T13:29:22Z", "updated_at": "2022-09-26T18:24:44Z", "closed_at": "2022-09-23T16:37:58Z", "author_association": "NONE", "pull_request": null, "body": "I was looking into how to parse application log files that contain multiline text (e.g. Java stack traces) into sqlite. \r\nI can see that at the moment `--lines` helps, but falls short when processing multi-line texts.\r\n\r\nI wonder if this functionality would be useful for sqlite-utils. 
A similar approach to Elastic logstash/filebeat can be adopted: https://www.elastic.co/guide/en/beats/filebeat/current/multiline-examples.html \r\n\r\nPotential changes:\r\n\r\n- add a `--multiline` option\r\n- additional properties for\r\n - multiline-pattern (regular expression)\r\n - multiline-negate: true/false\r\n - multiline-what: previous or next\r\n\r\nOr if this is achievable in a different way, please share. Thanks!", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/490/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1082651698, "node_id": "I_kwDOCGYnMM5Ah_Qy", "number": 358, "title": "Support for CHECK constraints", "user": {"value": 11597658, "label": "luxint"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 7, "created_at": "2021-12-16T21:19:45Z", "updated_at": "2022-09-25T07:15:59Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "Hi,\r\n\r\nI noticed the `table.transform()` method doesn't have an option to add, change, or drop a check constraint (see https://sqlite.org/lang_createtable.html -> 3.7 Check Constraints). It would be great to have this as an option! 
\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/358/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1374939463, "node_id": "I_kwDOCGYnMM5R8-lH", "number": 489, "title": "Ability to load JSON records held in a file with a single top level key that is a list of objects", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 9, "created_at": "2022-09-15T18:46:03Z", "updated_at": "2022-09-15T20:56:10Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "It's very common for JSON to look like this:\r\n```json\r\n{\r\n \"Version\": \"5.5.52.6\",\r\n \"List\": [\r\n {\r\n \"Description\": \"Nonpartisan\",\r\n \"Id\": 1,\r\n \"ExternalId\": \"\"\r\n },\r\n {\r\n \"Description\": \"Undeclared\",\r\n \"Id\": 2,\r\n \"ExternalId\": \"\"\r\n }\r\n ]\r\n}\r\n```\r\nThis example is taken from the records downloaded from https://www.elections.alaska.gov/election-results/e/\r\n\r\nRight now you can't import this into `sqlite-utils` - you need to run it through `jq .List` first.\r\n\r\nBut since this is so common, it would be neat if `sqlite-utils` could have a rule of thumb that says \"if it's an object, but it has a single key that is a list of objects, use that instead\".", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/489/reactions\", \"total_count\": 2, \"+1\": 2, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} 
{"id": 1366512990, "node_id": "PR_kwDOCGYnMM4-nBs9", "number": 486, "title": "progressbar for inserts/upserts of all fileformats, closes #485", "user": {"value": 99098079, "label": "MischaU8"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 7, "created_at": "2022-09-08T14:58:02Z", "updated_at": "2022-09-15T20:40:03Z", "closed_at": "2022-09-15T20:37:51Z", "author_association": "CONTRIBUTOR", "pull_request": "simonw/sqlite-utils/pulls/486", "body": "\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://sqlite-utils--486.org.readthedocs.build/en/486/\n\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/486/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1366423176, "node_id": "I_kwDOCGYnMM5RcfaI", "number": 485, "title": "Progressbar not shown when inserting/upserting jsonlines file", "user": {"value": 99098079, "label": "MischaU8"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-09-08T14:13:18Z", "updated_at": "2022-09-15T20:39:52Z", "closed_at": "2022-09-15T20:37:52Z", "author_association": "CONTRIBUTOR", "pull_request": null, "body": "When inserting or upserting a jsonlines file, no progressbar is shown. 
Expected behavior is that, just like with .csv/.tsv files, also for a jsonlines file (--nl), unless --silent is provided, a progressbar is shown.\r\n\r\n```bash\r\nsqlite-utils upsert mydb.db posts posts.jl --nl --pk post_id\r\n(silence)\r\n```\r\n\r\nCurrently `file_progress` is only called within the tsv/csv logic, however I think it can be safely wrapped around all the input formats that use `decoded`: https://github.com/simonw/sqlite-utils/blob/main/sqlite_utils/cli.py#L963", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/485/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1128466114, "node_id": "I_kwDOCGYnMM5DQwbC", "number": 406, "title": "Creating tables with custom datatypes", "user": {"value": 82988, "label": "psychemedia"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 5, "created_at": "2022-02-09T12:16:31Z", "updated_at": "2022-09-15T18:13:50Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "Via https://stackoverflow.com/a/18622264/454773 I note the ability to register custom handlers for novel datatypes that can map into and out of things like sqlite `BLOB`s.\r\n\r\nFrom a quick look and a quick play, I didn't spot a way to do this in `sqlite_utils`?\r\n\r\nFor example:\r\n\r\n```python\r\n# Via https://stackoverflow.com/a/18622264/454773\r\nimport sqlite3\r\nimport numpy as np\r\nimport io\r\n\r\ndef adapt_array(arr):\r\n \"\"\"\r\n http://stackoverflow.com/a/31312102/190597 (SoulNibbler)\r\n \"\"\"\r\n out = io.BytesIO()\r\n np.save(out, arr)\r\n out.seek(0)\r\n return sqlite3.Binary(out.read())\r\n\r\ndef convert_array(text):\r\n out = io.BytesIO(text)\r\n 
out.seek(0)\r\n return np.load(out)\r\n\r\n\r\n# Converts np.array to TEXT when inserting\r\nsqlite3.register_adapter(np.ndarray, adapt_array)\r\n\r\n# Converts TEXT to np.array when selecting\r\nsqlite3.register_converter(\"array\", convert_array)\r\n```\r\n\r\n```python\r\nfrom sqlite_utils import Database\r\ndb = Database('test.db')\r\n\r\n# Reset the database connection to used the parsed datatype\r\n# sqlite_utils doesn't seem to support eg:\r\n# Database('test.db', detect_types=sqlite3.PARSE_DECLTYPES)\r\ndb.conn = sqlite3.connect(db_name, detect_types=sqlite3.PARSE_DECLTYPES)\r\n\r\n# Create a table the old fashioned way\r\n# but using the new custom data type\r\nvector_table_create = \"\"\"\r\nCREATE TABLE dummy \r\n (title TEXT, vector array );\r\n\"\"\"\r\n\r\ncur = db.conn.cursor()\r\ncur.execute(vector_table_create)\r\n\r\n\r\n# sqlite_utils doesn't appear to support custom types (yet?!)\r\n# The following errors on the \"array\" datatype\r\n\"\"\"\r\ndb[\"dummy\"].create({\r\n \"title\": str,\r\n \"vector\": \"array\",\r\n})\r\n\"\"\"\r\n```\r\n\r\nWe can then add / retrieve records from the database where the datatype of the `vector` field is a custom registered `array` type (which is to say, a `numpy` array):\r\n\r\n```python\r\nimport numpy as np\r\n\r\ndb[\"dummy\"].insert({'title':\"test1\", 'vector':np.array([1,2,3])})\r\n\r\nfor row in db.query(\"SELECT * FROM dummy\"):\r\n print(row['title'], row['vector'], type(row['vector']))\r\n\r\n\"\"\"\r\ntest1 [1 2 3] \r\n\"\"\"\r\n```\r\n\r\nIt would be handy to be able to do this idiomatically in `sqlite_utils`.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/406/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": 
null} {"id": 1367835380, "node_id": "I_kwDOCGYnMM5Rh4L0", "number": 487, "title": "Specify foreign key against compound key in other table", "user": {"value": 540968, "label": "ryanfox"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2022-09-09T13:32:09Z", "updated_at": "2022-09-11T04:00:44Z", "closed_at": "2022-09-11T04:00:44Z", "author_association": "NONE", "pull_request": null, "body": "When inserting rows via the library, is it possible to specify a foreign key to a compound primary key?\r\n\r\nFor example, suppose I create a table:\r\n```\r\ndb = Database('events.db')\r\ndb['events'].insert_all([\r\n {'venue': 'Times Square', 'date': '2022-12-31', 'title': 'Rockin New Year Eve'},\r\n {'venue': 'Wembley Stadium', 'date': '2022-06-05', 'title': 'FA Cup'},\r\n {'venue': 'Times Square', 'date': '2021-12-31', 'title': 'Rockin New Year Eve'},\r\n], pk=('date', 'venue'))\r\n```\r\n\r\nAnd I want to add related data in another table:\r\n```\r\nact = {'name': 'Rick Astley', 'venue': 'Times Square', 'date': '2021-12-31' }\r\ndb['performers'].insert(act, pk=)\r\n```\r\n\r\nIs it possible to specify a value for `pk` that will point to the compound primary key in `events`?\r\n\r\nSQLite does support it:\r\nhttps://www.sqlite.org/foreignkeys.html#fk_composite", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/487/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1363766973, "node_id": "I_kwDOCGYnMM5RSW69", "number": 484, "title": "Expose convert recipes to `sqlite-utils --functions`", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 11, 
"created_at": "2022-09-06T20:15:08Z", "updated_at": "2022-09-07T19:09:52Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "`--functions` was added in:\r\n- #471 \r\n\r\nIt would be useful if the `r.jsonsplit()` and similar recipes for `sqlite-utils convert` could be used in these blocks of code too: https://sqlite-utils.datasette.io/en/stable/cli.html#sqlite-utils-convert-recipes", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/484/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1353441389, "node_id": "I_kwDOCGYnMM5Qq-Bt", "number": 477, "title": "Conda Forge", "user": {"value": 49702524, "label": "thewchan"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2022-08-28T19:03:08Z", "updated_at": "2022-09-07T03:46:55Z", "closed_at": "2022-09-07T03:46:55Z", "author_association": "NONE", "pull_request": null, "body": "Hello! I have successfully put this package on to Conda Forge, and I have extending the invitation for the owner/maintainers of this package to be maintainers on Conda Forge as well. Let me know if you are interested! 
Thanks.\r\nhttps://github.com/conda-forge/sqlite-utils-feedstock", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/477/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1352932716, "node_id": "I_kwDOCGYnMM5QpB1s", "number": 471, "title": "sqlite-utils query --functions mechanism for registering extra functions", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": {"value": 8355157, "label": "3.29"}, "comments": 12, "created_at": "2022-08-27T03:57:53Z", "updated_at": "2022-09-07T03:46:26Z", "closed_at": "2022-08-27T05:10:57Z", "author_association": "OWNER", "pull_request": null, "body": "It would be really cool if you could register additional custom SQL functions for use with the `sqlite-utils query` command - something like this:\r\n\r\n```\r\nsqlite-utils data.db 'update images set domain = extract_domain(url)' --functions '\r\nfrom urllib.parse import urlparse\r\n\r\ndef extract_domain(url):\r\n return urlparse(url).netloc\r\n'\r\n```\r\nEvery function defined in that code block would be registered with the connection, unless the name began with an underscore.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/471/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 816526538, "node_id": "MDU6SXNzdWU4MTY1MjY1Mzg=", "number": 239, "title": "sqlite-utils extract could handle nested objects", 
"user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 16, "created_at": "2021-02-25T15:10:28Z", "updated_at": "2022-09-03T23:46:02Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "Imagine a table (imported from a nested JSON file) where one of the columns contains values that look like this:\r\n\r\n {\"email\": \"anonymous@noreply.airtable.com\", \"id\": \"usrROSHARE0000000\", \"name\": \"Anonymous\"}\r\n\r\nThe `sqlite-utils extract` command already uses single text values in a column to populate a new table. It would not be much of a stretch for it to be able to use JSON instead, including specifying which of those values should be used as the primary key in the new table.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/239/reactions\", \"total_count\": 6, \"+1\": 5, \"-1\": 0, \"laugh\": 0, \"hooray\": 1, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1359604075, "node_id": "I_kwDOCGYnMM5RCelr", "number": 481, "title": "Idea: `sqlite-utils create-table tablename --sql \"select ...\"`", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2022-09-02T01:41:24Z", "updated_at": "2022-09-02T01:42:08Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "Could offer syntactic sugar for:\r\n\r\n```sql\r\ncreate table foo as select * from bar\r\n```\r\n\r\n```\r\nsqlite-utils create-table data.db foo --sql \"select * from bar\"\r\n```\r\nhttps://sqlite-utils.datasette.io/en/stable/cli-reference.html#create-table", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, 
"performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/481/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1353074021, "node_id": "I_kwDOCGYnMM5QpkVl", "number": 474, "title": "Add an option for specifying column names when inserting CSV data", "user": {"value": 14294, "label": "hubgit"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2022-08-27T15:29:59Z", "updated_at": "2022-08-31T03:42:36Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "https://sqlite-utils.datasette.io/en/stable/cli.html#csv-files-without-a-header-row\r\n\r\n> The first row of any CSV or TSV file is expected to contain the names of the columns in that file.\r\n\r\n> If your file does not include this row, you can use the `--no-headers` option to specify that the tool should not use that fist row as headers.\r\n\r\n> If you do this, the table will be created with column names called `untitled_1` and `untitled_2` and so on. You can then rename them using the `sqlite-utils transform ... 
--rename` command.\r\n\r\nIt would be nice to be able to specify the column names when importing CSV/TSV without a header row, via an extra command line option.\r\n\r\n(renaming a column of a large table can take a long time, which makes it an inconvenient workaround)", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/474/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1355433619, "node_id": "PR_kwDOCGYnMM4-B7Mc", "number": 480, "title": "search_sql add include_rank option", "user": {"value": 7908073, "label": "chapmanjacobd"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 4, "created_at": "2022-08-30T09:10:29Z", "updated_at": "2022-08-31T03:40:35Z", "closed_at": "2022-08-31T03:40:35Z", "author_association": "CONTRIBUTOR", "pull_request": "simonw/sqlite-utils/pulls/480", "body": "I haven't tested this yet but wanted to get a heads-up whether this kind of change would be useful or if I should just duplicate the function and tweak it within my code\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://sqlite-utils--480.org.readthedocs.build/en/480/\n\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/480/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1355193529, "node_id": "I_kwDOCGYnMM5Qxpy5", "number": 479, "title": "OperationalError: cannot VACUUM from within a transaction", "user": {"value": 7908073, "label": 
"chapmanjacobd"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2022-08-30T05:34:24Z", "updated_at": "2022-08-30T05:34:24Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": null, "body": "Maybe when calling `.vacuum()` and other DB-level write-lock operations `sqlite_utils` could guard against this error message by automatically committing first?\r\n\r\n```\r\n 46 db[\"media\"].optimize() # type: ignore\r\n---> 47 db.vacuum()\r\n\r\nFile ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:1047, in Database.vacuum(self)\r\n 1045 def vacuum(self):\r\n 1046 \"Run a SQLite ``VACUUM`` against the database.\"\r\n-> 1047 self.execute(\"VACUUM;\")\r\n\r\nFile ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:470, in Database.execute(self, sql, parameters)\r\n 468 return self.conn.execute(sql, parameters)\r\n 469 else:\r\n--> 470 return self.conn.execute(sql)\r\n\r\nOperationalError: cannot VACUUM from within a transaction\r\n```\r\n\r\nIt might also be nice to add a sentence or two about how transactions are committed on the [docs page](https://sqlite-utils.datasette.io/en/latest/python-api.html#detect-fts). 
When I was swapping out my sqlite3 code for this library it was nice that everything was pretty much drop-in but I was/am unsure what to do about the places I explicitly call `.commit()` in my code\r\n\r\nRelated to https://github.com/simonw/sqlite-utils/issues/121", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/479/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1353481513, "node_id": "I_kwDOCGYnMM5QrH0p", "number": 478, "title": "`sqlite-utils tables data.db table1 table2`", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-08-28T22:05:53Z", "updated_at": "2022-08-28T22:22:35Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "The `sqlite-utils tables` command currently lists all tables.\r\n\r\nIf you have a huge table in there then running it with `--counts` can get expensive, because of the huge table.\r\n\r\nWould be useful if it could accept an optional list of tables that it should execute against, as an alternative to the default of all of them.\r\n\r\nThis should be a backwards compatible change. 
Current design is: https://sqlite-utils.datasette.io/en/stable/cli-reference.html#tables\r\n\r\n```\r\nUsage: sqlite-utils tables [OPTIONS] PATH\r\n\r\n List the tables in the database\r\n\r\n Example:\r\n\r\n sqlite-utils tables trees.db\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/478/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1178546862, "node_id": "I_kwDOCGYnMM5GPzKu", "number": 420, "title": "Document how to use a `--convert` function that runs initialization code first", "user": {"value": 770231, "label": "strada"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 12, "created_at": "2022-03-23T19:07:36Z", "updated_at": "2022-08-28T11:34:37Z", "closed_at": "2022-03-25T20:07:33Z", "author_association": "NONE", "pull_request": null, "body": "When I have an insert command with transform like this:\r\n\r\n```\r\ncat items.json | jq '.data' | sqlite-utils insert listings.db listings - --convert '\r\nd = enchant.Dict(\"en_US\")\r\nrow[\"is_dictionary_word\"] = d.check(row[\"name\"])\r\n' --import=enchant --ignore\r\n```\r\n\r\nI noticed as the number of rows increases the operation becomes quite slow, likely due to the creation of the `d = enchant.Dict(\"en_US\")` object for each row. 
Is there a way to share that instance `d` between transform function calls, like a shared context?", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/420/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1353196970, "node_id": "I_kwDOCGYnMM5QqCWq", "number": 476, "title": "Release notes for 3.29", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": {"value": 8355157, "label": "3.29"}, "comments": 2, "created_at": "2022-08-27T23:21:21Z", "updated_at": "2022-08-28T04:07:15Z", "closed_at": "2022-08-28T04:07:03Z", "author_association": "OWNER", "pull_request": null, "body": "https://github.com/simonw/sqlite-utils/compare/3.28...104f37fa4d2e7e5999c1d829267b62c737f74d3e", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/476/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1348169997, "node_id": "I_kwDOCGYnMM5QW3EN", "number": 467, "title": "Mechanism for ensuring a table has all the columns", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": {"value": 8355157, "label": "3.29"}, "comments": 13, "created_at": "2022-08-23T15:50:23Z", "updated_at": "2022-08-27T23:19:41Z", "closed_at": "2022-08-27T23:17:56Z", "author_association": "OWNER", "pull_request": null, "body": "Suggested by @jefftriplett on Discord: 
https://discord.com/channels/823971286308356157/997738192360964156/1011655389063958600", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/467/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1348294436, "node_id": "PR_kwDOCGYnMM49qP2V", "number": 468, "title": "db[table].create(..., transform=True) and create-table --transform", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": {"value": 8355157, "label": "3.29"}, "comments": 6, "created_at": "2022-08-23T17:27:58Z", "updated_at": "2022-08-27T23:17:55Z", "closed_at": "2022-08-27T23:17:55Z", "author_association": "OWNER", "pull_request": "simonw/sqlite-utils/pulls/468", "body": "Work in progress. 
Still needs documentation and tests (and to cover more cases of things that might have changed).\r\n\r\nRefs:\r\n- #467\r\n\r\n\r\n----\r\n:books: Documentation preview :books:: https://sqlite-utils--468.org.readthedocs.build/en/468/\r\n\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/468/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1353189941, "node_id": "I_kwDOCGYnMM5QqAo1", "number": 475, "title": "table.default_values introspection property", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": {"value": 8355157, "label": "3.29"}, "comments": 1, "created_at": "2022-08-27T22:33:31Z", "updated_at": "2022-08-27T22:44:46Z", "closed_at": "2022-08-27T22:43:02Z", "author_association": "OWNER", "pull_request": null, "body": "> Interesting challenge with `default_value`: I need to be able to tell if the default values passed to `.create()` differ from those in the database already.\r\n>\r\n> Introspecting that is a bit tricky:\r\n>\r\n> ```pycon\r\n> >>> import sqlite_utils\r\n> >>> db = sqlite_utils.Database(memory=True)\r\n> >>> db[\"blah\"].create({\"id\": int, \"name\": str}, not_null=(\"name\",), defaults={\"name\": \"bob\"})\r\n> \r\n> >>> db[\"blah\"].columns\r\n> [Column(cid=0, name='id', type='INTEGER', notnull=0, default_value=None, is_pk=0), Column(cid=1, name='name', type='TEXT', notnull=1, default_value=\"'bob'\", is_pk=0)]\r\n> ```\r\n> Note how a default value of the Python string `bob` is represented in the results of `PRAGMA table_info()` as `default_value=\"'bob'\"` - it's got single quotes added to it!\r\n> \r\n> So comparing default values from introspecting the database needs me to first 
parse that syntax. This may require a new table introspection method.\r\n\r\n_Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/468#issuecomment-1229279539_", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/475/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1199158210, "node_id": "I_kwDOCGYnMM5HebPC", "number": 423, "title": ".extract() doesn't set foreign key when extracted columns contain NULL value", "user": {"value": 37447552, "label": "jlieth"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-04-10T20:05:30Z", "updated_at": "2022-08-27T14:45:04Z", "closed_at": "2022-08-27T14:45:04Z", "author_association": "NONE", "pull_request": null, "body": "I've run into an issue with `extract` and I don't believe this is the intended behaviour.\r\n\r\nI'm working with a database with music listening information. Currently it has one large table `listens` that contains all information. I'm trying to normalize the database by extracting relevant columns to separate tables (`artists`, `tracks`, `albums`). Not every track has an album.\r\n\r\nA simplified demonstration with just `track_title` and `album_title` columns:\r\n```ipython\r\nIn [1]: import sqlite_utils\r\n\r\nIn [2]: db = sqlite_utils.Database(memory=True)\r\n\r\nIn [3]: db[\"listens\"].insert_all([\r\n ...: {\"id\": 1, \"track_title\": \"foo\", \"album_title\": \"bar\"},\r\n ...: {\"id\": 2, \"track_title\": \"baz\", \"album_title\": None}\r\n ...: ], pk=\"id\")\r\nOut[3]:
\r\n```\r\n\r\nThe track in the first row has an album, the second track doesn't. Now I extract album information into a separate column:\r\n```ipython\r\nIn [4]: db[\"listens\"].extract(columns=[\"album_title\"], table=\"albums\", fk_column=\"album_id\")\r\nOut[4]:
\r\n\r\nIn [5]: list(db[\"albums\"].rows)\r\nOut[5]: [{'id': 1, 'album_title': 'bar'}, {'id': 2, 'album_title': None}]\r\n\r\nIn [6]: list(db[\"listens\"].rows)\r\nOut[6]: \r\n[{'id': 1, 'track_title': 'foo', 'album_id': 1},\r\n {'id': 2, 'track_title': 'baz', 'album_id': None}]\r\n```\r\n\r\nThis behaves as expected -- the `album` table contains entries for both the existing album and the NULL album. The `listens` table has a foreign key only for the first row (since the album in the second row was empty).\r\n\r\nNow I want to extract the track information as well. Album information belongs to the track so I want to extract both columns to a new table.\r\n```ipython\r\nIn [7]: db[\"listens\"].extract(columns=[\"track_title\", \"album_id\"], table=\"tracks\", fk_column=\"track_id\")\r\nOut[7]:
\r\n\r\nIn [8]: list(db[\"tracks\"].rows)\r\nOut[8]: \r\n[{'id': 1, 'track_title': 'foo', 'album_id': 1},\r\n {'id': 2, 'track_title': 'baz', 'album_id': None}]\r\n\r\nIn [9]: list(db[\"listens\"].rows)\r\nOut[9]: [{'id': 1, 'track_id': 1}, {'id': 2, 'track_id': None}]\r\n```\r\n\r\nExtracting to the `tracks` table worked fine (both tracks are present with correct columns). However, the `listens` table only has a foreign key to the newly created tracks for the first row, the foreign key in the second row is NULL.\r\n\r\nChanging the order of extracts doesn't help.\r\n\r\nI poked around in the source a bit and I believe [this line](https://github.com/simonw/sqlite-utils/blob/433813612ff9b4b501739fd7543bef0040dd51fe/sqlite_utils/db.py#L1737) (essentially comparing `NULL = NULL`) is the problem, but I don't know enough about SQL to create a reliable fix myself.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/423/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1309542173, "node_id": "PR_kwDOCGYnMM47pwAb", "number": 455, "title": "in extract code, check equality with IS instead of = for nulls", "user": {"value": 536941, "label": "fgregg"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2022-07-19T13:40:25Z", "updated_at": "2022-08-27T14:45:03Z", "closed_at": "2022-08-27T14:45:03Z", "author_association": "CONTRIBUTOR", "pull_request": "simonw/sqlite-utils/pulls/455", "body": "sqlite \"IS\" is equivalent to SQL \"IS NOT DISTINCT FROM\"\r\n\r\ncloses #423", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": 
"{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/455/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1352953535, "node_id": "PR_kwDOCGYnMM4950Az", "number": 473, "title": "Support entrypoints for `--load-extension`", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-08-27T05:53:59Z", "updated_at": "2022-08-27T05:55:52Z", "closed_at": "2022-08-27T05:55:47Z", "author_association": "OWNER", "pull_request": "simonw/sqlite-utils/pulls/473", "body": "Refs #470\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://sqlite-utils--473.org.readthedocs.build/en/473/\n\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/473/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1352932038, "node_id": "I_kwDOCGYnMM5QpBrG", "number": 470, "title": "Upgrade `--load-extension` to accept entrypoints like Datasette", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": {"value": 8355157, "label": "3.29"}, "comments": 6, "created_at": "2022-08-27T03:53:20Z", "updated_at": "2022-08-27T05:55:49Z", "closed_at": "2022-08-27T05:55:48Z", "author_association": "OWNER", "pull_request": null, "body": "Imitate:\r\n- https://github.com/simonw/datasette/pull/1789\r\n```\r\n# would load default entrypoint like before\r\ndatasette data.db --load-extension ext\r\n\r\n# loads the extensions with the \"sqlite3_foo_init\" entrpoint\r\ndatasette data.db --load-extension 
ext:sqlite3_foo_init\r\n\r\n# loads the extensions with the \"sqlite3_bar_init\" entrpoint\r\ndatasette data.db --load-extension ext:sqlite3_bar_init\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/470/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1352946135, "node_id": "I_kwDOCGYnMM5QpFHX", "number": 472, "title": "Reuse the locals/globals fix from --functions for other code accepting options", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": {"value": 8355157, "label": "3.29"}, "comments": 2, "created_at": "2022-08-27T05:12:05Z", "updated_at": "2022-08-27T05:20:12Z", "closed_at": "2022-08-27T05:20:12Z", "author_association": "OWNER", "pull_request": null, "body": "I figured out a workaround for the ugly `global x` hack here:\r\n- https://github.com/simonw/sqlite-utils/issues/471#issuecomment-1229120653", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/472/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1352931464, "node_id": "I_kwDOCGYnMM5QpBiI", "number": 469, "title": "sqlite-utils rows --order option", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": {"value": 8355157, "label": "3.29"}, "comments": 1, "created_at": "2022-08-27T03:49:51Z", "updated_at": "2022-08-27T04:30:49Z", "closed_at": "2022-08-27T04:10:32Z", 
"author_association": "OWNER", "pull_request": null, "body": "For consistency with `search`: https://sqlite-utils.datasette.io/en/stable/cli-reference.html#search\r\n\r\n```\r\n -o, --order TEXT Order by ('column' or 'column desc')\r\n```\r\n\r\nI wanted to run `sqlite-utils rows db.db mytable --order 'rowid desc'` to see the most recently imported rows.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/469/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1320243134, "node_id": "I_kwDOCGYnMM5OsU--", "number": 458, "title": "Support custom names for registered functions", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": {"value": 8355157, "label": "3.29"}, "comments": 1, "created_at": "2022-07-28T00:13:00Z", "updated_at": "2022-08-27T03:56:01Z", "closed_at": "2022-07-28T00:13:57Z", "author_association": "OWNER", "pull_request": null, "body": "In this example:\r\n\r\n```python\r\n @db.register_function\r\n def reverse_string(s):\r\n return \"\".join(reversed(list(s)))\r\n\r\n print(db.execute('select reverse_string(\"hello\")').fetchone()[0])\r\n```\r\nThere's currently no way to over-ride the automatically selected name for the SQL function.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/458/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1319881016, "node_id": "PR_kwDOCGYnMM48Mmde", 
"number": 457, "title": "Link to installation instructions", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": {"value": 8355157, "label": "3.29"}, "comments": 2, "created_at": "2022-07-27T17:38:36Z", "updated_at": "2022-08-27T03:55:52Z", "closed_at": "2022-07-27T17:57:50Z", "author_association": "OWNER", "pull_request": "simonw/sqlite-utils/pulls/457", "body": "Also testing https://docs.readthedocs.io/en/stable/pull-requests.html", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/457/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1326087800, "node_id": "PR_kwDOCGYnMM48hI-_", "number": 460, "title": "Cross-link CLI to Python docs", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 4, "created_at": "2022-08-02T16:18:28Z", "updated_at": "2022-08-18T21:58:10Z", "closed_at": "2022-08-18T21:58:07Z", "author_association": "OWNER", "pull_request": "simonw/sqlite-utils/pulls/460", "body": "Work in progress, partly to test the ReadTheDocs preview link action.\r\n\r\nRefs:\r\n- #426\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://readthedocs-preview--460.org.readthedocs.build/en/460/\n\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/460/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1334416486, "node_id": 
"PR_kwDOCGYnMM488n6D", "number": 463, "title": "Use Read the Docs action v1", "user": {"value": 244656, "label": "humitos"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-08-10T10:31:47Z", "updated_at": "2022-08-18T08:30:14Z", "closed_at": "2022-08-17T23:11:16Z", "author_association": "CONTRIBUTOR", "pull_request": "simonw/sqlite-utils/pulls/463", "body": "Read the Docs repository was renamed from `readthedocs/readthedocs-preview` to `readthedocs/actions/`. Now, the `preview` action is under `readthedocs/actions/preview` and is tagged as `v1`\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://sqlite-utils--463.org.readthedocs.build/en/463/\n\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/463/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1342357149, "node_id": "PR_kwDOCGYnMM49Wsnq", "number": 465, "title": "beanbag-docutils>=2.0", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2022-08-17T22:41:39Z", "updated_at": "2022-08-17T23:38:07Z", "closed_at": "2022-08-17T23:38:02Z", "author_association": "OWNER", "pull_request": "simonw/sqlite-utils/pulls/465", "body": "Refs #464", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/465/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1338001039, "node_id": 
"I_kwDOCGYnMM5PwEaP", "number": 464, "title": "Link from documentation to source code", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 5, "created_at": "2022-08-13T16:19:57Z", "updated_at": "2022-08-17T23:38:03Z", "closed_at": "2022-08-17T23:38:03Z", "author_association": "OWNER", "pull_request": null, "body": "Twitter conversation asking for ways to automate this here: https://twitter.com/simonw/status/1558260492015046656", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/464/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1342374388, "node_id": "PR_kwDOCGYnMM49Wv9T", "number": 466, "title": "Use Read the Docs action v1 (#463)", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2022-08-17T23:11:50Z", "updated_at": "2022-08-17T23:11:54Z", "closed_at": "2022-08-17T23:11:54Z", "author_association": "OWNER", "pull_request": "simonw/sqlite-utils/pulls/466", "body": "Read the Docs repository was renamed from `readthedocs/readthedocs-preview` to `readthedocs/actions/`. 
Now, the `preview` action is under `readthedocs/actions/preview` and is tagged as `v1`", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/466/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1326391841, "node_id": "PR_kwDOCGYnMM48iLGF", "number": 462, "title": "Discord badge", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2022-08-02T20:56:04Z", "updated_at": "2022-08-02T21:15:57Z", "closed_at": "2022-08-02T21:15:52Z", "author_association": "OWNER", "pull_request": "simonw/sqlite-utils/pulls/462", "body": "Also testing fix for:\r\n- https://github.com/readthedocs/readthedocs-preview/issues/10\r\n\r\n\r\n----\n:books: Documentation preview :books:: https://sqlite-utils--462.org.readthedocs.build/en/462/\n\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/462/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1326349129, "node_id": "I_kwDOCGYnMM5PDntJ", "number": 461, "title": "Consider including animated SVG console demos", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-08-02T20:10:04Z", "updated_at": "2022-08-02T20:12:14Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "I recorded this one using https://github.com/nbedos/termtosvg - with `pipx 
install termtosvg` and then `termtosvg` - execute demo - `exit` to save.\r\n\r\n![sqlite-utils-insert-json](https://user-images.githubusercontent.com/9599/182464206-f4976af4-eda8-4020-8257-4ada1867fb44.svg)\r\n\r\n```json\r\n[\r\n {\r\n \"id\": 1,\r\n \"name\": \"Catimus\"\r\n },\r\n {\r\n \"id\": 2,\r\n \"name\": \"Feliopia\"\r\n }\r\n]\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/461/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1324659241, "node_id": "I_kwDOCGYnMM5O9LIp", "number": 459, "title": "Single quoted transform recipes on Windows do not work as expected ", "user": {"value": 19921, "label": "shakeel"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2022-08-01T16:14:54Z", "updated_at": "2022-08-01T16:14:54Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": null, "body": "Trying to follow the tutorial for sqlite-utils and datasette https://datasette.io/tutorials/clean-data on Windows 11 OS `Microsoft Windows [Version 10.0.22622.440]`, with sqlite-utils and datasette installed using pipx.\r\n\r\n```\r\npipx list\r\npackage datasette 0.61.1, installed using Python 3.10.4\r\n - datasette.exe\r\npackage sqlite-utils 3.28, installed using Python 3.10.4\r\n - sqlite-utils.exe\r\n``` \r\n\r\nIn the step to transform dates into ISO dates the quoted value `'r.parsedatetime(value)'` is copied verbatim into the columns instead of applying the output of the Python recipe.\r\n\r\n```\r\nsqlite-utils convert manatees.db locations \\\r\n REPDATE created_date last_edited_date \\\r\n 'r.parsedatetime(value)' --dry-run\r\n\r\n1975/01/31 00:00:00+00\r\n --- 
becomes:\r\nr.parsedatetime(value)\r\n\r\nWould affect 13568 rows\r\n```\r\n\r\nHowever, if I change the code from single quotes to double quotes, it works as expected.\r\n\r\n```\r\nsqlite-utils convert manatees.db locations \\\r\n REPDATE created_date last_edited_date \\\r\n \"r.parsedatetime(value)\" --dry-run\r\n\r\n1975/01/31 00:00:00+00\r\n --- becomes:\r\n1975-01-31T00:00:00+00:00\r\n\r\nWould affect 13568 rows\r\n```\r\n\r\nSpecifying the transform code recipe should work with single quotes on Windows.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/459/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1310243385, "node_id": "I_kwDOCGYnMM5OGLo5", "number": 456, "title": "feature request: pivot command", "user": {"value": 536941, "label": "fgregg"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 5, "created_at": "2022-07-20T00:58:08Z", "updated_at": "2022-07-20T17:50:50Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": null, "body": "pivoting long-format table to wide-format tables is pretty common and kind of pain. 
would love to see this feature in sqlite-utils!", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/456/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1292060682, "node_id": "I_kwDOCGYnMM5NA0gK", "number": 450, "title": "Add --ignore option to more commands", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 9, "created_at": "2022-07-02T13:52:02Z", "updated_at": "2022-07-15T22:39:09Z", "closed_at": "2022-07-15T22:37:45Z", "author_association": "OWNER", "pull_request": null, "body": "As seen in https://sqlite-utils.datasette.io/en/stable/cli-reference.html#add-foreign-key\r\n\r\nCould make this TIL trick unnecessary: https://til.simonwillison.net/bash/ignore-errors", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/450/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1298531653, "node_id": "I_kwDOCGYnMM5NZgVF", "number": 451, "title": "Make sqlite_utils.utils.chunks a documented function", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2022-07-08T06:01:04Z", "updated_at": "2022-07-15T22:09:34Z", "closed_at": "2022-07-15T21:59:33Z", "author_association": "OWNER", "pull_request": null, "body": "I want to use it in another project: 
https://github.com/simonw/sqlite-utils/blob/8a9fe6498faf783a1fdeb1793e661ad194a05267/sqlite_utils/utils.py#L471-L474", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/451/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1303169663, "node_id": "I_kwDOCGYnMM5NrMp_", "number": 453, "title": "'unclosed file' warning when using insert_upsert_implementation from Python", "user": {"value": 311257, "label": "makkus"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-07-13T09:34:35Z", "updated_at": "2022-07-15T21:52:25Z", "closed_at": "2022-07-15T21:52:21Z", "author_association": "NONE", "pull_request": null, "body": "I'm using the `[insert_upsert_implementation](https://github.com/simonw/sqlite-utils/blob/main/sqlite_utils/cli.py)` function directly in my Python code to import a csv file with all the bells and whistles `sqlite-utils` provides, but I'm getting a resource warning that a io.TextWrapper object is not closed.\r\n\r\nThe warning goes away when wrapping the code from [this line](https://github.com/simonw/sqlite-utils/blob/42440d6345c242ee39778045e29143fb550bd2c2/sqlite_utils/cli.py#L924) in a try/finally block like:\r\n\r\n```\r\ntry:\r\n ...\r\n ...\r\nfinally:\r\n decoded.close()\r\n```\r\n(might be that `sniff_buffer` must also be closed if non null, but I might be wrong)\r\n\r\nI suspect Python closes the reference automatically when the sqlite-utils cli run is done, but since my code doesn't exit, I'm getting the warning.\r\n\r\nAlternatively, it'd be cool if the 'import csv/tsv' functionality could be added directly to the Database class.", "repo": {"value": 140912432, "label": "sqlite-utils"}, 
"type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/453/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1306548397, "node_id": "I_kwDOCGYnMM5N4Fit", "number": 454, "title": "CLI command for duplicating tables", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-07-15T21:31:27Z", "updated_at": "2022-07-15T21:48:23Z", "closed_at": "2022-07-15T21:45:51Z", "author_association": "OWNER", "pull_request": null, "body": "CLI equivalent of:\r\n- #449", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/454/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1279863844, "node_id": "I_kwDOCGYnMM5MSSwk", "number": 449, "title": "Utilities for duplicating tables and creating a table with the results of a query", "user": {"value": 1690072, "label": "davidleejy"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 4, "created_at": "2022-06-22T09:41:43Z", "updated_at": "2022-07-15T21:46:13Z", "closed_at": "2022-07-15T21:21:36Z", "author_association": "CONTRIBUTOR", "pull_request": null, "body": "is there a duplicate table functionality? 
Otherwise, I'd be happy to submit a PR.\r\n\r\nIn sqlite3 it would look like:\r\n\r\n```python\r\nimport sqlite3 as sl\r\n\r\ncon = sl.connect('prompt-tune.db')\r\n\r\ndef db_duplicate_table(table_name, table_name_new, con=con):\r\n # Duplicates table `table_name` to a new table `table_name_new`.\r\n try:\r\n cur = con.cursor()\r\n cur.execute(f\"\"\"CREATE TABLE {table_name_new} AS SELECT * FROM {table_name}\"\"\")\r\n except Exception as e:\r\n print(e)\r\n finally:\r\n cur.close()\r\n\r\ndb_duplicate_table('orig_table', 'new_table')\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/449/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1299760627, "node_id": "PR_kwDOCGYnMM47JUun", "number": 452, "title": "Add duplicate table feature", "user": {"value": 1690072, "label": "davidleejy"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-07-09T20:24:31Z", "updated_at": "2022-07-15T21:21:37Z", "closed_at": "2022-07-15T21:21:36Z", "author_association": "CONTRIBUTOR", "pull_request": "simonw/sqlite-utils/pulls/452", "body": "This PR addresses a feature request raised in issue #449. Specifically this PR adds a functionality that lets users duplicate a table via:\r\n\r\n```python\r\ntable_new = db[\"my_table\"].duplicate(\"new_table\")\r\n```\r\n\r\nTest added in file `tests/test_duplicate.py`.\r\n\r\nHappy to make changes to meet maintainers' feedback, if any. 
", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/452/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1271426387, "node_id": "I_kwDOCGYnMM5LyG1T", "number": 444, "title": "CSV `extras_key=` and `ignore_extras=` equivalents for CLI tool", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 5, "created_at": "2022-06-14T22:22:47Z", "updated_at": "2022-07-07T16:39:18Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "> I forgot to add equivalents of `extras_key=` and `ignore_extras=` to the CLI tool - will do that in a separate issue.\r\n\r\n_Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1155767915_", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/444/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1212701569, "node_id": "I_kwDOCGYnMM5ISFuB", "number": 427, "title": "sqlite-utils convert date parsing recipe complains about trying to parse \"*\"", "user": {"value": 1385831, "label": "wdccdw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-04-22T19:27:10Z", "updated_at": "2022-07-02T13:59:59Z", "closed_at": "2022-07-02T13:59:32Z", "author_association": "NONE", "pull_request": null, "body": "Missing values in my dataset are denoted by a single asterisk. 
I am trying to parse string dates into dates. This works fine for columns without missing values, but, when the column contains \"*\", I get the following:\r\n\r\n```\r\n$ sqlite-utils convert ${dbfile} details dob 'r.parsedate(value)' \r\n [------------------------------------] 0%Traceback (most recent call last):\r\n File \"/usr/local/Cellar/sqlite-utils/3.25.1/libexec/lib/python3.9/site-packages/sqlite_utils/db.py\", line 2508, in convert_value\r\n return fn(v)\r\n File \"\", line 2, in fn\r\n File \"/usr/local/Cellar/sqlite-utils/3.25.1/libexec/lib/python3.9/site-packages/sqlite_utils/recipes.py\", line 8, in parsedate\r\n parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst).date().isoformat()\r\n File \"/usr/local/Cellar/sqlite-utils/3.25.1/libexec/lib/python3.9/site-packages/dateutil/parser/_parser.py\", line 1368, in parse\r\n return DEFAULTPARSER.parse(timestr, **kwargs)\r\n File \"/usr/local/Cellar/sqlite-utils/3.25.1/libexec/lib/python3.9/site-packages/dateutil/parser/_parser.py\", line 643, in parse\r\n raise ParserError(\"Unknown string format: %s\", timestr)\r\ndateutil.parser._parser.ParserError: Unknown string format: *\r\n\r\nTraceback (most recent call last):\r\n File \"/usr/local/bin/sqlite-utils\", line 33, in \r\n sys.exit(load_entry_point('sqlite-utils==3.25.1', 'console_scripts', 'sqlite-utils')())\r\n File \"/usr/local/Cellar/sqlite-utils/3.25.1/libexec/lib/python3.9/site-packages/click/core.py\", line 1128, in __call__\r\n return self.main(*args, **kwargs)\r\n File \"/usr/local/Cellar/sqlite-utils/3.25.1/libexec/lib/python3.9/site-packages/click/core.py\", line 1053, in main\r\n rv = self.invoke(ctx)\r\n File \"/usr/local/Cellar/sqlite-utils/3.25.1/libexec/lib/python3.9/site-packages/click/core.py\", line 1659, in invoke\r\n return _process_result(sub_ctx.command.invoke(sub_ctx))\r\n File \"/usr/local/Cellar/sqlite-utils/3.25.1/libexec/lib/python3.9/site-packages/click/core.py\", line 1395, in invoke\r\n return 
ctx.invoke(self.callback, **ctx.params)\r\n File \"/usr/local/Cellar/sqlite-utils/3.25.1/libexec/lib/python3.9/site-packages/click/core.py\", line 754, in invoke\r\n return __callback(*args, **kwargs)\r\n File \"/usr/local/Cellar/sqlite-utils/3.25.1/libexec/lib/python3.9/site-packages/sqlite_utils/cli.py\", line 2698, in convert\r\n db[table].convert(\r\n File \"/usr/local/Cellar/sqlite-utils/3.25.1/libexec/lib/python3.9/site-packages/sqlite_utils/db.py\", line 2524, in convert\r\n self.db.execute(sql, where_args or [])\r\n File \"/usr/local/Cellar/sqlite-utils/3.25.1/libexec/lib/python3.9/site-packages/sqlite_utils/db.py\", line 458, in execute\r\n return self.conn.execute(sql, parameters)\r\nsqlite3.OperationalError: user-defined function raised exception\r\n```\r\n\r\n\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/427/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 455486286, "node_id": "MDU6SXNzdWU0NTU0ODYyODY=", "number": 26, "title": "Mechanism for turning nested JSON into foreign keys / many-to-many", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 14, "created_at": "2019-06-13T00:52:06Z", "updated_at": "2022-06-29T23:35:29Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "The GitHub JSON APIs have a really interesting convention with respect to related objects.\r\n\r\nConsider https://api.github.com/repos/simonw/sqlite-utils/issues - here's a truncated subset:\r\n```json\r\n {\r\n \"id\": 449818897,\r\n \"node_id\": \"MDU6SXNzdWU0NDk4MTg4OTc=\",\r\n \"number\": 24,\r\n \"title\": \"Additional Column Constraints?\",\r\n \"user\": 
{\r\n \"login\": \"IgnoredAmbience\",\r\n \"id\": 98555,\r\n \"node_id\": \"MDQ6VXNlcjk4NTU1\",\r\n \"avatar_url\": \"https://avatars0.githubusercontent.com/u/98555?v=4\",\r\n \"gravatar_id\": \"\"\r\n },\r\n \"labels\": [\r\n {\r\n \"id\": 993377884,\r\n \"node_id\": \"MDU6TGFiZWw5OTMzNzc4ODQ=\",\r\n \"url\": \"https://api.github.com/repos/simonw/sqlite-utils/labels/enhancement\",\r\n \"name\": \"enhancement\",\r\n \"color\": \"a2eeef\",\r\n \"default\": true\r\n }\r\n ],\r\n \"state\": \"open\"\r\n }\r\n```\r\nThe `user` column lists a complete user. The `labels` column has a list of labels.\r\n\r\nSince both user and label have populated `id` field this is actually enough information for us to create records for them AND set up the corresponding foreign key (for user) and m2m relationships (for labels).\r\n\r\nIt would be really neat if `sqlite-utils` had some kind of mechanism for correctly processing these kind of patterns.\r\n\r\nThanks to `jq` there's not much need for extra customization of the shape here - if we support a narrowly defined structure users can use `jq` to reshape arbitrary JSON to match.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/26/reactions\", \"total_count\": 4, \"+1\": 4, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1227571375, "node_id": "I_kwDOCGYnMM5JK0Cv", "number": 431, "title": "Allow making m2m relation of a table to itself", "user": {"value": 738408, "label": "rafguns"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2022-05-06T08:30:43Z", "updated_at": "2022-06-23T14:12:51Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "I am building a database, in which one of the tables has 
a many-to-many relationship to itself. As far as I can see, this is not (yet) possible using `.m2m()` in sqlite-utils. This may be a bit of a niche use case, so feel free to close this issue if you feel it would introduce too much complexity compared to the benefits.\r\n\r\nExample: suppose I have a table of people, and I want to store the information that John and Mary have two children, Michael and Suzy. It would be neat if I could do something like this:\r\n\r\n```python\r\nfrom sqlite_utils import Database\r\n\r\ndb = Database(memory=True)\r\ndb[\"people\"].insert({\"name\": \"John\"}, pk=\"name\").m2m(\r\n \"people\", [{\"name\": \"Michael\"}, {\"name\": \"Suzy\"}], m2m_table=\"parent_child\", pk=\"name\"\r\n)\r\ndb[\"people\"].insert({\"name\": \"Mary\"}, pk=\"name\").m2m(\r\n \"people\", [{\"name\": \"Michael\"}, {\"name\": \"Suzy\"}], m2m_table=\"parent_child\", pk=\"name\"\r\n)\r\n```\r\n\r\nBut if I do that, the many-to-many table `parent_child` has only one column:\r\n```\r\nCREATE TABLE [parent_child] (\r\n [people_id] TEXT REFERENCES [people]([name]),\r\n PRIMARY KEY ([people_id], [people_id])\r\n)\r\n```\r\n\r\nThis could be solved by adding one or two keyword_arguments to `.m2m()`, e.g. 
`.m2m(..., left_name=None, right_name=None)` or `.m2m(..., names=(None, None))`.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/431/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1277328147, "node_id": "I_kwDOCGYnMM5MInsT", "number": 446, "title": "Use Just to automate running tests and linters locally", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2022-06-20T19:51:09Z", "updated_at": "2022-06-21T19:28:35Z", "closed_at": "2022-06-20T19:54:50Z", "author_association": "OWNER", "pull_request": null, "body": "I keep committing code that fails additional tests like `mypy` and `flake8` and `black`. 
Automate those using Just.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/446/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1278571700, "node_id": "I_kwDOCGYnMM5MNXS0", "number": 447, "title": "Incorrect syntax highlighting in docs CLI reference", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2022-06-21T14:53:10Z", "updated_at": "2022-06-21T18:48:47Z", "closed_at": "2022-06-21T18:48:46Z", "author_association": "OWNER", "pull_request": null, "body": "https://sqlite-utils.datasette.io/en/stable/cli-reference.html#insert\r\n\r\n![CE020DDA-27FB-49C3-9EA6-37457DC4C321](https://user-images.githubusercontent.com/9599/174830380-06530537-b870-41c0-a8af-03c7fa720c6f.jpeg)\r\n\r\nIt looks like Python keywords are being incorrectly highlighted here.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/447/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1269998342, "node_id": "I_kwDOCGYnMM5LsqMG", "number": 443, "title": "Make `utils.rows_from_file()` a documented API", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2022-06-13T21:53:24Z", "updated_at": "2022-06-20T19:49:37Z", "closed_at": "2022-06-14T20:12:46Z", "author_association": "OWNER", "pull_request": null, 
"body": "> `rows_from_file()` isn't part of the documented API but maybe it should be!\r\n\r\n_Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154385916_", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/443/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1277295119, "node_id": "I_kwDOCGYnMM5MIfoP", "number": 445, "title": "`sqlite_utils.utils.TypeTracker` should be a documented API", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2022-06-20T19:08:28Z", "updated_at": "2022-06-20T19:49:02Z", "closed_at": "2022-06-20T19:46:58Z", "author_association": "OWNER", "pull_request": null, "body": "I've used it in a couple of external places now:\r\n\r\n- https://github.com/simonw/datasette-socrata/blob/32fb256a461bf0e790eca10bdc7dd9d96c20f7c4/datasette_socrata/__init__.py#L264-L280\r\n- https://github.com/simonw/datasette-lite/blob/caa8eade10f0321c64f9f65c4561186f02d57c5b/webworker.js#L55-L64\r\n\r\nRefs:\r\n- https://github.com/simonw/datasette-lite/issues/32", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/445/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1250495688, "node_id": "I_kwDOCGYnMM5KiQzI", "number": 439, "title": "Misleading progress bar against utf-16-le CSV input", "user": {"value": 4068, "label": 
"frafra"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 12, "created_at": "2022-05-27T08:34:49Z", "updated_at": "2022-06-15T03:53:43Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "The program crashes without any error.\r\n```\r\nwget \"https://artsdatabanken.no/Fab2018/api/export/csv\"\r\nsqlite-utils create-database test.db\r\nsqlite-utils insert --csv --delimiter \";\" --encoding \"utf-16-le\" test test.db csv \r\n [------------------------------------] 0%\r\n [#################-------------------] 49% 00:00:01\r\n```\r\nI would like to highlight various issues:\r\n1. sqlite-utils catches exceptions without printing the stacktrace and/or reraising the exception, so there is no easy way to use `pdb` or similar to debug the program, solution: add a debug option\r\n2. Silent crash: this is related to (1.), and it happens when there is a catch-all mechanism; solution: let the program fail.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/439/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1224112817, "node_id": "I_kwDOCGYnMM5I9nqx", "number": 430, "title": "Document how to use `PRAGMA temp_store` to avoid errors when running VACUUM against huge databases", "user": {"value": 9308268, "label": "rayvoelker"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2022-05-03T13:33:58Z", "updated_at": "2022-06-14T23:26:37Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "I'm trying to figure out a way to get the `table.extract()` method to complete successfully -- I'm not sure if maybe the cause (and a possible solution) of 
this on Ubuntu Server 22.04 is to adjust some of the PRAGMA values within SQLite itself ... on another Linux system (PopOS), using this method on this same database appears to work just fine.\r\n\r\nHere's the bit that's causing the error, and the resulting error output:\r\n```python\r\n# combine these columns into 1 table \"bib_properties\" :\r\n# best_title\r\n# bib_level_code\r\n# mat_type\r\n# material_code\r\n# best_author\r\ndb[\"circ_trans\"].extract(\r\n [\"best_title\", \"bib_level_code\", \"mat_type\", \"material_code\", \"best_author\"], \r\n table=\"bib_properties\", \r\n fk_column=\"bib_properties_id\"\r\n)\r\n\r\ndb[\"circ_trans\"].extract(\r\n [\"call_number\"], \r\n table=\"call_number\", \r\n fk_column=\"call_number_id\",\r\n rename={\"call_number\": \"value\"}\r\n)\r\n```\r\n\r\n```python\r\n---------------------------------------------------------------------------\r\nOperationalError Traceback (most recent call last)\r\nInput In [17], in ()\r\n 1 # combine these columns into 1 table \"bib_properties\" :\r\n 2 # best_title\r\n 3 # bib_level_code\r\n 4 # mat_type\r\n 5 # material_code\r\n 6 # best_author\r\n----> 7 db[\"circ_trans\"].extract(\r\n 8 [\"best_title\", \"bib_level_code\", \"mat_type\", \"material_code\", \"best_author\"], \r\n 9 table=\"bib_properties\", \r\n 10 fk_column=\"bib_properties_id\"\r\n 11 )\r\n 13 db[\"circ_trans\"].extract(\r\n 14 [\"call_number\"], \r\n 15 table=\"call_number\", \r\n 16 fk_column=\"call_number_id\",\r\n 17 rename={\"call_number\": \"value\"}\r\n 18 )\r\n\r\nFile ~/jupyter/venv/lib/python3.10/site-packages/sqlite_utils/db.py:1764, in Table.extract(self, columns, table, fk_column, rename)\r\n 1761 column_order.append(c.name)\r\n 1763 # Drop the unnecessary columns and rename lookup column\r\n-> 1764 self.transform(\r\n 1765 drop=set(columns),\r\n 1766 rename={magic_lookup_column: fk_column},\r\n 1767 column_order=column_order,\r\n 1768 )\r\n 1770 # And add the foreign key constraint\r\n 1771 
self.add_foreign_key(fk_column, table, \"id\")\r\n\r\nFile ~/jupyter/venv/lib/python3.10/site-packages/sqlite_utils/db.py:1526, in Table.transform(self, types, rename, drop, pk, not_null, defaults, drop_foreign_keys, column_order)\r\n 1524 with self.db.conn:\r\n 1525 for sql in sqls:\r\n-> 1526 self.db.execute(sql)\r\n 1527 # Run the foreign_key_check before we commit\r\n 1528 if pragma_foreign_keys_was_on:\r\n\r\nFile ~/jupyter/venv/lib/python3.10/site-packages/sqlite_utils/db.py:465, in Database.execute(self, sql, parameters)\r\n 463 return self.conn.execute(sql, parameters)\r\n 464 else:\r\n--> 465 return self.conn.execute(sql)\r\n\r\nOperationalError: database or disk is full\r\n```\r\n\r\nThis database is about 17G in total size, so I'm assuming the error is coming from the vacuum ... where i'm assuming it's maybe trying to do the temp storage in a location that doesn't have sufficient room. The disk space is more than ample on the host in question (1.8T is free in the directory where the sqlite db resides) The `/tmp` directory however is limited on a smaller disk associated with the OS\r\n\r\nI'm trying to think if there's a way to set the `PRAGMA temp_store` or maybe if it's `temp_store_directory` that I'm after ... to use the same local directory of where the file is located (maybe this is a property of the version of sqlite on the system?) 
\r\n\r\n```python\r\n# SET the temp file store to be a file ...\r\nprint(db.execute('PRAGMA temp_store').fetchall())\r\nprint(db.execute('PRAGMA temp_store=FILE').fetchall())\r\n\r\nprint(db.execute('PRAGMA temp_store').fetchall())\r\n\r\n# the users home directory ...\r\nprint(db.execute(\"PRAGMA temp_store_directory='/home/plchuser/'\").fetchall())\r\nprint(db.execute(\"PRAGMA sqlite3_temp_directory='/home/plchuser/'\").fetchall())\r\n\r\nprint(db.execute(\"PRAGMA temp_store_directory\").fetchall())\r\nprint(db.execute(\"PRAGMA sqlite3_temp_directory\").fetchall())\r\n```\r\n```text\r\n[(1,)]\r\n[]\r\n[(1,)]\r\n[]\r\n[]\r\n[('/home/plchuser/',)]\r\n[]\r\n```\r\n\r\nHere's the docs on the Temporary File Storage Locations \r\nhttps://www.sqlite.org/tempfiles.html", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/430/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1243151184, "node_id": "I_kwDOCGYnMM5KGPtQ", "number": 434, "title": "`detect_fts()` identifies the wrong table if tables have names that are subsets of each other", "user": {"value": 559711, "label": "ryascott"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2022-05-20T13:28:31Z", "updated_at": "2022-06-14T23:24:09Z", "closed_at": "2022-06-14T23:24:09Z", "author_association": "NONE", "pull_request": null, "body": "Windows 10\r\nPython 3.9.6\r\n\r\nWhen I was running a full text search through the Python library, I noticed that the query was being run on a different full text search table than the one I was trying to search.\r\n\r\nI took a look at the following 
function\r\n\r\nhttps://github.com/simonw/sqlite-utils/blob/841ad44bacaff05ec79ef78166d12e80c82ba6d7/sqlite_utils/db.py#L2213\r\n\r\nand noticed:\r\n\r\n```python\r\nsql LIKE '%VIRTUAL TABLE%USING FTS%content=%{table}%'\r\n```\r\n\r\nMy database contains tables with similar names and %{table}% was matching another table that ended differently in its name.\r\nI have included a sample test that shows this occurring:\r\n\r\nI search for Marsupials in db[\"books\"] and The Clue of the Broken Blade is returned. \r\n\r\nThis occurs since the search for Marsupials was \"successfully\" done against db[\"booksb\"] and rowid 1 is returned. \"The Clue of the Broken Blade\" has a rowid of 1 in db[\"books\"] and this is what is returned from the search.\r\n\r\n```python\r\ndef test_fts_search_with_similar_table_names(fresh_db):\r\n db = Database(memory=True)\r\n db[\"books\"].insert_all(\r\n [\r\n {\r\n \"title\": \"The Clue of the Broken Blade\",\r\n \"author\": \"Franklin W. Dixon\",\r\n },\r\n {\r\n \"title\": \"Habits of Australian Marsupials\",\r\n \"author\": \"Marlee Hawkins\",\r\n },\r\n ]\r\n )\r\n db[\"booksb\"].insert(\r\n {\r\n \"title\": \"Habits of Australian Marsupials\",\r\n \"author\": \"Marlee Hawkins\",\r\n }\r\n )\r\n\r\n db[\"booksb\"].enable_fts([\"title\", \"author\"])\r\n db[\"books\"].enable_fts([\"title\", \"author\"])\r\n\r\n\r\n query = \"Marsupials\"\r\n\r\n assert [\r\n { \"rowid\": 1,\r\n \"title\": \"Habits of Australian Marsupials\",\r\n \"author\": \"Marlee Hawkins\",\r\n },\r\n ] == list(db[\"books\"].search(query))\r\n```\r\n\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/434/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 
1250629388, "node_id": "I_kwDOCGYnMM5KixcM", "number": 440, "title": "CSV files with too many values in a row cause errors", "user": {"value": 4068, "label": "frafra"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 20, "created_at": "2022-05-27T10:54:44Z", "updated_at": "2022-06-14T22:23:01Z", "closed_at": "2022-06-14T20:12:46Z", "author_association": "NONE", "pull_request": null, "body": "*Original title: csv.DictReader can have None as key*\r\n\r\nIn some cases, `csv.DictReader` can have `None` as key for unnamed columns, and a list of values as value.\r\n`sqlite_utils.utils.rows_from_file` cannot handle that:\r\n\r\n```python\r\nurl=\"https://artsdatabanken.no/Fab2018/api/export/csv\"\r\ndb = sqlite_utils.Database(\":memory:\")\r\n\r\nwith urlopen(url) as fab:\r\n reader, _ = sqlite_utils.utils.rows_from_file(fab, encoding=\"utf-16le\") \r\n db[\"fab2018\"].insert_all(reader, pk=\"Id\")\r\n```\r\n\r\nResult:\r\n```\r\nTraceback (most recent call last):\r\n File \"\", line 3, in \r\n File \"/home/user/.local/pipx/venvs/sqlite-utils/lib/python3.8/site-packages/sqlite_utils/db.py\", line 2924, in insert_all\r\n chunk = list(chunk)\r\n File \"/home/user/.local/pipx/venvs/sqlite-utils/lib/python3.8/site-packages/sqlite_utils/db.py\", line 3454, in fix_square_braces\r\n if any(\"[\" in key or \"]\" in key for key in record.keys()):\r\n File \"/home/user/.local/pipx/venvs/sqlite-utils/lib/python3.8/site-packages/sqlite_utils/db.py\", line 3454, in \r\n if any(\"[\" in key or \"]\" in key for key in record.keys()):\r\nTypeError: argument of type 'NoneType' is not iterable\r\n```\r\n\r\nCode:\r\nhttps://github.com/simonw/sqlite-utils/blob/59be60c471fd7a2c4be7f75e8911163e618ff5ca/sqlite_utils/db.py#L3454\r\n\r\n`sqlite-utils insert` from command line is not affected by this issue.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": 
\"https://api.github.com/repos/simonw/sqlite-utils/issues/440/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1236693079, "node_id": "I_kwDOCGYnMM5JtnBX", "number": 432, "title": "Support `rows_where()`, `delete_where()` etc for attached alias databases", "user": {"value": 11597658, "label": "luxint"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 5, "created_at": "2022-05-16T06:38:58Z", "updated_at": "2022-06-14T22:16:48Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "Hi,\r\n\r\nI noticed `rows_where()` doesn't return any rows from tables which are from attached databases. The `exists()` function returns false. As far as I can see this is because the `table_names()` function only looks for table names in the current database and not in attached (or temp) databases.\r\n\r\nBesides, `rows_where()`, also `insert_all()` and `delete_where()` didn't do what I was expecting because of this. For the moment I've patched `table_names()` for myself, see below but I'm not sure what the total impact is on the other functions like lookup truncate etc which all use `exists()`. Also `view_names()` doesn't look for views in attached or temp databases. 
\r\n```python\r\n def table_names(self, fts4: bool = False, fts5: bool = False) -> List[str]:\r\n \"A list of string table names in this database.\"\r\n where = [\"type = 'table'\"]\r\n if fts4:\r\n where.append(\"sql like '%USING FTS4%'\")\r\n if fts5:\r\n where.append(\"sql like '%USING FTS5%'\")\r\n dbs = [x[1] for x in self.execute('pragma database_list').fetchall()] \r\n lst=[]\r\n for db in dbs: \r\n sql = \"select name from {} where {}\".format(db+\".sqlite_master\",\" AND \".join(where))\r\n lst.extend(r[0] for r in self.execute(sql).fetchall())\r\n return lst\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/432/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1257724585, "node_id": "I_kwDOCGYnMM5K91qp", "number": 441, "title": "Combining `rows_where()` and `search()` to limit which rows are searched", "user": {"value": 1448859, "label": "betatim"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 4, "created_at": "2022-06-02T06:01:55Z", "updated_at": "2022-06-14T21:57:57Z", "closed_at": "2022-06-14T21:54:38Z", "author_association": "NONE", "pull_request": null, "body": "What is the right way to limit a full text search query to some rows of a table?\r\n\r\nFor example, I have a table that contains the following columns: `title`, `content`, `owner` (each row represents a document). The `owner` column is a username. It feels right to store all documents in one table, instead of having one table per owner. 
In particular, I'd like to run full-text search across all documents, across only the documents owned by one user, and across the documents owned by a set of users.\r\n\r\nI tried to combine `.rows_where(\"owner = ?\", [\"1234\"])` and `.search()` from the `Table` class but I don't think that is meant to work. I discovered `.search_sql()` as a way to generate the FTS SQL statement. By hand I can edit it to add an `AND [original].[owner] = :owner` to the `where` clause. This seems to do what I want.\r\n\r\nMy two questions:\r\n1. is adding an `AND ...` to the `where` clause actually the right thing to do or should I be doing something else (my SQL skills are low)?\r\n2. is there a way to achieve this that is built in to sqlite-utils?\r\n\r\nRight now I am thinking I will make my own version of `search_sql()` that generates a query that contains an additional `owner = :owner` for my particular use-case.\r\n\r\nBonus question: is this generally useful/something to add to sqlite-utils or too niche?", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/441/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1269886084, "node_id": "I_kwDOCGYnMM5LsOyE", "number": 442, "title": "`maximize_csv_field_size_limit()` utility function", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2022-06-13T19:54:54Z", "updated_at": "2022-06-14T21:55:15Z", "closed_at": "2022-06-14T21:31:49Z", "author_association": "OWNER", "pull_request": null, "body": "This code here runs only if `cli.py` is imported: https://github.com/simonw/sqlite-utils/blob/7ddf5300886a32d6daf60cf1d71efe492b65c87e/sqlite_utils/cli.py#L50-L59\r\n\r\nI found myself 
needing the same fix in another library:\r\n\r\n- https://github.com/simonw/datasette-socrata/issues/13\r\n\r\nIt should be a documented utility function.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/442/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1160182768, "node_id": "I_kwDOCGYnMM5FJvvw", "number": 412, "title": "Optional Pandas integration", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 13, "created_at": "2022-03-05T01:49:27Z", "updated_at": "2022-06-14T15:36:29Z", "closed_at": null, "author_association": "OWNER", "pull_request": null, "body": "It would be neat if there was a way to use this more seamlessly with Pandas, in particular Pandas dataframes - but without making Pandas a required dependency.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/412/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1244294227, "node_id": "PR_kwDOCGYnMM44P4GG", "number": 437, "title": "docs to dogs", "user": {"value": 114388, "label": "yurivish"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-05-22T15:50:33Z", "updated_at": "2022-05-30T21:32:41Z", "closed_at": "2022-05-30T21:32:41Z", "author_association": "CONTRIBUTOR", "pull_request": "simonw/sqlite-utils/pulls/437", "body": "Fixes a typo.", "repo": {"value": 140912432, 
"label": "sqlite-utils"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/437/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1250161887, "node_id": "I_kwDOCGYnMM5Kg_Tf", "number": 438, "title": "illegal UTF-16 surrogate", "user": {"value": 4068, "label": "frafra"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2022-05-26T22:49:52Z", "updated_at": "2022-05-27T08:21:53Z", "closed_at": "2022-05-27T08:21:53Z", "author_association": "NONE", "pull_request": null, "body": "I am trying to insert `https://artsdatabanken.no/Fab2018/api/export/csv` into a SQLite database, but I have an error when using `sqlite-utils`:\r\n\r\n```\r\nsqlite-utils insert --csv --delimiter \";\" --encoding=\"utf-16-le\" --pk \"Id\" csv fremmedart test.db\r\n [------------------------------------] 0%\r\nError: 'utf-16-le' codec can't decode bytes in position 98-99: illegal UTF-16 surrogate\r\n\r\nThe input you provided uses a character encoding other than utf-8.\r\n\r\nYou can fix this by passing the --encoding= option with the encoding of the file.\r\n\r\nIf you do not know the encoding, running 'file filename.csv' may tell you.\r\n\r\nIt's often worth trying: --encoding=latin-1\r\n```\r\n\r\nI tried to convert the file using `iconv -f \"utf-16le\" -t \"utf-8\"`, but I still get a similar error (slightly different position):\r\n\r\n```\r\nsqlite-utils insert --csv --delimiter \";\" --encoding=utf-8 --pk \"Id\" csv_utf8 fremmedart test.db\r\n [------------------------------------] 0%\r\nError: 'utf-8' codec can't decode byte 0xd9 in position 99: invalid continuation byte\r\n\r\nThe input you provided uses a character encoding other than utf-8.\r\n\r\nYou can fix this by passing the --encoding= 
option with the encoding of the file.\r\n\r\nIf you do not know the encoding, running 'file filename.csv' may tell you.\r\n\r\nIt's often worth trying: --encoding=latin-1\r\n```\r\n\r\nI have no issues reading this file using this Python code:\r\n```python\r\ncontent = open('csv', encoding='utf-16-le').read()\r\n```\r\n\r\n`in2csv` works too.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/438/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1243715381, "node_id": "I_kwDOCGYnMM5KIZc1", "number": 436, "title": "Add \"copy to clipboard\" button to code examples in documentation", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2022-05-20T21:53:23Z", "updated_at": "2022-05-20T21:57:53Z", "closed_at": "2022-05-20T21:57:53Z", "author_association": "OWNER", "pull_request": null, "body": "Follows:\r\n- #435\r\n\r\nImitates:\r\n- https://github.com/simonw/datasette/issues/1748\r\n\r\nI'll use https://github.com/executablebooks/sphinx-copybutton - here's the Datasette commit: https://github.com/simonw/datasette/commit/1465fea4798599eccfe7e8f012bd8d9adfac3039", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/436/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1243704847, "node_id": "I_kwDOCGYnMM5KIW4P", "number": 435, "title": "Switch to Furo documentation 
theme", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2022-05-20T21:46:39Z", "updated_at": "2022-05-20T21:56:10Z", "closed_at": "2022-05-20T21:54:43Z", "author_association": "OWNER", "pull_request": null, "body": "As seen in:\r\n- https://github.com/simonw/datasette/issues/1746\r\n- https://github.com/simonw/shot-scraper/issues/77", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/435/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1173023272, "node_id": "I_kwDOCGYnMM5F6uoo", "number": 416, "title": "Options for how `r.parsedate()` should handle invalid dates", "user": {"value": 638427, "label": "mattkiefer"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 11, "created_at": "2022-03-17T23:29:55Z", "updated_at": "2022-05-03T21:36:49Z", "closed_at": "2022-03-21T04:01:39Z", "author_association": "NONE", "pull_request": null, "body": "Exceptions are normal expected behavior when typecasting an invalid format. However, r.parsedate() is really just re-formatting strings and keeping the type as text. So it may be better to print-and-pass on exception so the user can see a complete list of invalid values -- while also allowing for the parser to reformat the remaining valid values. 
\r\n```\r\nsqlite-utils convert idfpr.db license \"Expiration Date\" \"r.parsedate(value)\"\r\n [#######-----------------------------] 21% 00:01:57Traceback (most recent call last):\r\n File \"/usr/local/lib/python3.9/dist-packages/sqlite_utils/db.py\", line 2336, in convert_value\r\n return fn(v)\r\n File \"\", line 2, in fn\r\n File \"/usr/local/lib/python3.9/dist-packages/sqlite_utils/recipes.py\", line 8, in parsedate\r\n parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst).date().isoformat()\r\n File \"/usr/lib/python3/dist-packages/dateutil/parser/_parser.py\", line 1374, in parse\r\n return DEFAULTPARSER.parse(timestr, **kwargs)\r\n File \"/usr/lib/python3/dist-packages/dateutil/parser/_parser.py\", line 652, in parse\r\n raise ParserError(\"String does not contain a date: %s\", timestr)\r\ndateutil.parser._parser.ParserError: String does not contain a date: / / \r\n```\r\nIn this case, I had just one variation of an invalid date: ' / / '. But theoretically there could be many values that would have to be fixed one at a time with the current exception handling. ", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/416/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"}