github
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | pull_request | body | repo | type | active_lock_reason | performed_via_github_app | reactions | draft | state_reason |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1821108702 | I_kwDOCGYnMM5si-ne | 579 | Special handling for SQLite column of type `JSON` | 15178711 | open | 0 | 0 | 2023-07-25T20:37:23Z | 2023-07-25T20:37:23Z | CONTRIBUTOR | `sqlite-utils` should detect and have specially handling for column with a `JSON` column. For example: ```sql CREATE TABLE "dogs" ( id INTEGER PRIMARY KEY, name TEXT, friends JSON ); ``` ## Automatic Nesting According to ["Nested JSON Values"](https://sqlite-utils.datasette.io/en/stable/cli.html#nested-json-values), sqlite-utils will only expand JSON if the `--json-cols` flag is passed. It looks like it'll try to `json.load` all text column to test if its JSON, which can get expensive on non-json columns. Instead, `sqlite-utils` should be default (ie without the `--json-cols` flags) do the `maybe_json()` operation on columns with a declared `JSON` type. So the above table would expand the `"friends"` column as expected, withoutthe `--json-cols` flag: ```bash sqlite-utils dogs.db "select * from dogs" | python -mjson.tool ``` ``` [ { "id": 1, "name": "Cleo", "friends": [ { "name": "Pancakes" }, { "name": "Bailey" } ] } ] ``` --- I'm sure there's other ways `sqlite-utils` can specially handle JSON columns, so keeping this open while I think of more | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/579/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1740026046 | I_kwDOCGYnMM5ntrC- | 556 | Support storing incrementally piped values | 601708 | open | 0 | 1 | 2023-06-04T00:45:23Z | 2023-06-04T01:21:15Z | CONTRIBUTOR | I'm trying to use sqlite-utils to data generated incrementally. There are a few aspects of this that I don't currently know how to handle. I would like an option to apply writes incrementally, line-by-line as they are received. I would like an option to echo incremental progress. And, it would be nice to have In particular, I'm using CoreLocationCLI -w -j to generate, newline-delimited JSON. One variant of the command `stdbuf -oL CoreLocationCLI -w -j | pee 'sqlite-utils insert loc.db loc -' nl` `pee`, from `moreutils`, is like `tee` but spawns and pipes to the processes created by invoking each of its arguments, so, for gratuitous demonstration, `pee 'sponge out.log' cat` would behave like `tee`. It looks like I can get what I want with: `stdbuf -oL CoreLocationCLI -w -j | while read line; do <<<"$line" sqlite-utils insert loc.db loc -; echo "$line"; done | nl` | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/556/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1595340692 | I_kwDOCGYnMM5fFveU | 530 | add ability to configure "on delete" and "on update" attributes of foreign keys: | 536941 | open | 0 | 2 | 2023-02-22T15:44:14Z | 2023-05-08T20:39:01Z | CONTRIBUTOR | sqlite supports these, and it would be quite nice to be able to add them with sqlite-utils. https://www.sqlite.org/foreignkeys.html#fk_actions | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/530/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1393202060 | I_kwDOCGYnMM5TCpOM | 496 | devrel/python api: Pylance type hinting | 7908073 | open | 0 | 4 | 2022-10-01T03:03:34Z | 2023-05-03T05:53:27Z | CONTRIBUTOR | Pylance is generally pretty good at figuring out stuff but `sqlite-utils` has some quirks which make type hinting kinda useless. Maybe you don't care but I thought I would bring it to your attention. For example: ``` db["subs"].insert_all(subs, pk="index") ``` ``` Cannot access member "insert_all" for type "View" Member "insert_all" is unknown ``` `insert_all` and all the other methods show up as a type issues because the program can't know whether something is a View or a Table. Fair enough. But that basically throws all type checking out the window. `pk="index"` also shows up as a type issue: ``` Argument of type "Literal['index']" cannot be assigned to parameter "pk" of type "Default" in function "insert_all" "Literal['index']" is incompatible with "Default" ``` I think this is because DEFAULT is an empty class? maybe a few small changes could be made to make the library more type-friendly The interim solution is of course to turn off type hints completely for the line ``` db["subs"].insert_all(subs, pk="index") # type: ignore ``` | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/496/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1560651350 | I_kwDOCGYnMM5dBaZW | 523 | Feature request: trim all leading and trailing white space for all columns for all tables in a database | 536941 | open | 0 | 1 | 2023-01-28T02:40:10Z | 2023-01-28T02:41:14Z | CONTRIBUTOR | It's pretty common that i need to trim leading or trailing white space from lots of columns in a database a part of an initial ETL. I use the following recipe a lot, and it would be great to include this functionality into sqlite-utils `trimify.sql` ```sql select 'select group_concat(''update [' || name || '] set ['' || name || ''] = trim(['' || name || ''])'', ''; '') || ''; '' as sql_to_run from pragma_table_info('''||name||''');' from sqlite_schema; ``` then something like: ```bash sqlite3 example.db < scripts/trimify.sql > table_trim.sql && \ sqlite3 $example.db < table_trim.sql > trim.sql && \ sqlite3 $example.db < trim.sql ``` | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/523/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1405196044 | PR_kwDOCGYnMM5AmYzy | 499 | feat: recreate fts triggers after table transform | 7908073 | open | 0 | 2 | 2022-10-11T20:35:39Z | 2022-10-26T17:54:51Z | CONTRIBUTOR | simonw/sqlite-utils/pulls/499 | https://github.com/simonw/sqlite-utils/pull/498 <!-- readthedocs-preview sqlite-utils start --> ---- :books: Documentation preview :books:: https://sqlite-utils--499.org.readthedocs.build/en/499/ <!-- readthedocs-preview sqlite-utils end --> alternatively, `self.disable_fts()` | 140912432 | pull | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/499/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
0 | ||||||
1355193529 | I_kwDOCGYnMM5Qxpy5 | 479 | OperationalError: cannot VACUUM from within a transaction | 7908073 | open | 0 | 0 | 2022-08-30T05:34:24Z | 2022-08-30T05:34:24Z | CONTRIBUTOR | Maybe when calling `.vacuum()` and other DB-level write-lock operations `sqlite_utils` could guard against this error message by automatically committing first? ``` 46 db["media"].optimize() # type: ignore ---> 47 db.vacuum() File ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:1047, in Database.vacuum(self) 1045 def vacuum(self): 1046 "Run a SQLite ``VACUUM`` against the database." -> 1047 self.execute("VACUUM;") File ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:470, in Database.execute(self, sql, parameters) 468 return self.conn.execute(sql, parameters) 469 else: --> 470 return self.conn.execute(sql) OperationalError: cannot VACUUM from within a transaction ``` It might also be nice to add a sentence or two about how transactions are committed on the [docs page](https://sqlite-utils.datasette.io/en/latest/python-api.html#detect-fts). When I was swapping out my sqlite3 code for this library it was nice that everything was pretty much drop-in but I was/am unsure what to do about the places I explicitly call `.commit()` in my code Related to https://github.com/simonw/sqlite-utils/issues/121 | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/479/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1324659241 | I_kwDOCGYnMM5O9LIp | 459 | Single quoted transform recipes on Windows do not work as expected | 19921 | open | 0 | 0 | 2022-08-01T16:14:54Z | 2022-08-01T16:14:54Z | CONTRIBUTOR | Trying to follow the tutorial for sqlite-utils and datasette https://datasette.io/tutorials/clean-data on Windows 11 OS `Microsoft Windows [Version 10.0.22622.440]`, with sqlite-utils and datasette installed using pipx. ``` pipx list package datasette 0.61.1, installed using Python 3.10.4 - datasette.exe package sqlite-utils 3.28, installed using Python 3.10.4 - sqlite-utils.exe ``` In the step to transform dates into ISO dates the quoted value `'r.parsedatetime(value)'` is copied verbatim into the columns instead of applying the output of the Python recipe. ``` sqlite-utils convert manatees.db locations \ REPDATE created_date last_edited_date \ 'r.parsedatetime(value)' --dry-run 1975/01/31 00:00:00+00 --- becomes: r.parsedatetime(value) Would affect 13568 rows ``` However, if I change the code from single quotes to double quotes, it works as expected. ``` sqlite-utils convert manatees.db locations \ REPDATE created_date last_edited_date \ "r.parsedatetime(value)" --dry-run 1975/01/31 00:00:00+00 --- becomes: 1975-01-31T00:00:00+00:00 Would affect 13568 rows ``` Specifying the transform code recipe should work with single quotes on Windows. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/459/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1310243385 | I_kwDOCGYnMM5OGLo5 | 456 | feature request: pivot command | 536941 | open | 0 | 5 | 2022-07-20T00:58:08Z | 2022-07-20T17:50:50Z | CONTRIBUTOR | pivoting long-format table to wide-format tables is pretty common and kind of pain. would love to see this feature in sqlite-utils! | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/456/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
1160034488 | I_kwDOCGYnMM5FJLi4 | 411 | Support for generated columns | 25778 | open | 0 | 8 | 2022-03-04T20:41:33Z | 2022-03-11T22:32:43Z | CONTRIBUTOR | This is a fairly new feature -- SQLite version 3.31.0 (2020-01-22) -- that I, admittedly, haven't gotten to work yet. But it looks _incredibly_ useful: https://dgl.cx/2020/06/sqlite-json-support I'm not sure if this is an option on `add-column` or a separate command like `add-generated-column`. Either way, it needs an argument to populate it. It could be something like this: ```sh sqlite-utils add-column data.db table-name generated --as 'json_extract(data, "$.field")' --virtual ``` More here: https://www.sqlite.org/gencol.html | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/411/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
817989436 | MDU6SXNzdWU4MTc5ODk0MzY= | 242 | Async support | 25778 | open | 0 | 13 | 2021-02-27T18:29:38Z | 2021-10-28T14:37:56Z | CONTRIBUTOR | Following our conversation last week, want to note this here before I forget. I've had a couple situations where I'd like to do a bunch of updates in an async event loop, but I run into SQLite's issues with concurrent writes. This feels like something sqlite-utils could help with. PeeWee ORM has a [SQLite write queue](http://docs.peewee-orm.com/en/latest/peewee/playhouse.html#sqliteq) that might be a good model. It's using threads or gevent, but I _think_ that approach would translate well enough to asyncio. Happy to help with this, too. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/242/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
688670158 | MDU6SXNzdWU2ODg2NzAxNTg= | 147 | SQLITE_MAX_VARS maybe hard-coded too low | 96218 | open | 0 | 7 | 2020-08-30T07:26:45Z | 2021-02-15T21:27:55Z | CONTRIBUTOR | I came across this while about to open an issue and PR against the documentation for `batch_size`, which is a bit incomplete. As mentioned in #145, while: > [`SQLITE_MAX_VARIABLE_NUMBER`](https://www.sqlite.org/limits.html#max_variable_number) ... defaults to 999 for SQLite versions prior to 3.32.0 (2020-05-22) or 32766 for SQLite versions after 3.32.0. it is common that it is increased at compile time. Debian-based systems, for example, seem to ship with a version of sqlite compiled with SQLITE_MAX_VARIABLE_NUMBER set to 250,000, and I believe this is the case for homebrew installations too. In working to understand what `batch_size` was actually doing and why, I realized that by setting `SQLITE_MAX_VARS` in `db.py` to match the value my sqlite was compiled with (I'm on Debian), I was able to decrease the time to `insert_all()` my test data set (~128k records across 7 tables) from ~26.5s to ~3.5s. Given that this about .05% of my total dataset, this is time I am keen to save... Unfortunately, it seems that `sqlite3` in the python standard library doesn't expose the `get_limit()` C API (even though `pysqlite` used to), so it's hard to know what value sqlite has been compiled with (note that this could mean, I suppose, that it's less than 999, and even hardcoding `SQLITE_MAX_VARS` to the conservative default might not be adequate. It can also be lowered -- but not raised -- at runtime). The best I could come up with is `echo "" | sqlite3 -cmd ".limits variable_number"` (only available in `sqlite >= 2015-05-07 (3.8.10)`). Obviously this couldn't be relied upon in `sqlite_utils`, but I wonder what your opinion would be about exposing `SQLITE_MAX_VARS` as a user-configurable parameter (with suitable "here be dragons" warnings)? I'm going to go ahead and monkey-patch it for my purposes in any event, but it seems like it might be worth considering. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/147/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
||||||||
546073980 | MDU6SXNzdWU1NDYwNzM5ODA= | 74 | Test failures on openSUSE 15.1: AssertionError: Explicit other_table and other_column | 15092 | open | 0 | 3 | 2020-01-07T04:35:50Z | 2020-01-12T07:21:17Z | CONTRIBUTOR | openSUSE 15.1 is using python 3.6.5 and click-7.0 , however it has test failures while openSUSE Tumbleweed on py37 passes. Most fail on the cli exit code like ```py [ 74s] =================================== FAILURES =================================== [ 74s] _________________________________ test_tables __________________________________ [ 74s] [ 74s] db_path = '/tmp/pytest-of-abuild/pytest-0/test_tables0/test.db' [ 74s] [ 74s] def test_tables(db_path): [ 74s] result = CliRunner().invoke(cli.cli, ["tables", db_path]) [ 74s] > assert '[{"table": "Gosh"},\n {"table": "Gosh2"}]' == result.output.strip() [ 74s] E assert '[{"table": "...e": "Gosh2"}]' == '' [ 74s] E - [{"table": "Gosh"}, [ 74s] E - {"table": "Gosh2"}] [ 74s] [ 74s] tests/test_cli.py:28: AssertionError ``` packaging project at https://build.opensuse.org/package/show/home:jayvdb:py-new/python-sqlite-utils I'll keep digging into this after I have github-to-sqlite working on Tumbleweed, as I'll need openSUSE Leap 15.1 working before I can submit this into the main python repo. | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/74/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |