github
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | pull_request | body | repo | type | active_lock_reason | performed_via_github_app | reactions | draft | state_reason |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1655860104 | I_kwDOCGYnMM5ismuI | 535 | rows: --transpose or psql extended view-like functionality | 7908073 | closed | 0 | 2 | 2023-04-05T15:37:33Z | 2023-06-15T08:39:49Z | 2023-06-14T22:05:28Z | CONTRIBUTOR | It would be nice if the rows subcommand had a flag, perhaps called `--transpose` which would print in long form instead of wide. Similar to extended display mode in psql (`\x`) In other words instead of this: ``` sqlite-utils rows --limit 5 --fmt github track_metadata.db songs ``` | track_id | title | song_id | release | artist_id | artist_mbid | artist_name | duration | artist_familiarity | artist_hotttnesss | year | track_7digitalid | shs_perf | shs_work | |--------------------|-------------------|--------------------|--------------------------------------|--------------------|--------------------------------------|------------------|------------|----------------------|---------------------|--------|--------------------|------------|------------| | TRMMMYQ128F932D901 | Silent Night | SOQMMHC12AB0180CB8 | Monster Ballads X-Mas | ARYZTJS1187B98C555 | 357ff05d-848a-44cf-b608-cb34b5701ae5 | Faster Pussy cat | 252.055 | 0.649822 | 0.394032 | 2003 | 7032331 | -1 | 0 | | TRMMMKD128F425225D | Tanssi vaan | SOVFVAK12A8C1350D9 | Karkuteillä | ARMVN3U1187FB3A1EB | 8d7ef530-a6fd-4f8f-b2e2-74aec765e0f9 | Karkkiautomaatti | 156.551 | 0.439604 | 0.356992 | 1995 | 1514808 | -1 | 0 | | TRMMMRX128F93187D9 | No One Could Ever | SOGTUKN12AB017F4F1 | Butter | ARGEKB01187FB50750 | 3d403d44-36ce-465c-ad43-ae877e65adc4 | Hudson Mohawke | 138.971 | 0.643681 | 0.437504 | 2006 | 6945353 | -1 | 0 | | TRMMMCH128F425532C | Si Vos Querés | SOBNYVR12A8C13558C | De Culo | ARNWYLR1187B9B2F9C | 12be7648-7094-495f-90e6-df4189d68615 | Yerba Brava | 145.058 | 0.448501 | 0.372349 | 2003 | 2168257 |… | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/535/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1740150327 | I_kwDOCGYnMM5nuJY3 | 557 | Aliased ROWID option for tables created from alter=True commands | 7908073 | closed | 0 | 2 | 2023-06-04T05:29:28Z | 2023-06-14T06:09:21Z | 2023-06-05T19:26:26Z | CONTRIBUTOR | > If you use INTEGER PRIMARY KEY column, the VACUUM does not change the values of that column. However, if you use unaliased rowid, the VACUUM command will reset the rowid values. ROWID should never be used with foreign keys but the simple act of aliasing rowid to id (which is what happens when one does `id integer primary key` DDL) makes it OK. It would be convenient if there were more options to use a string column (eg. filepath) as the PK, and be able to use it during upserts, but when creating a foreign key, to create an integer column which aliases rowid I made an attempt to switch to integer primary keys here but it is not going well... In my usecase the path column is a business key. Yes, it should be as simple as including the `id` column in any select statement where I plan on using `upsert` but it would be nice if this could be abstracted away somehow https://github.com/chapmanjacobd/library/commit/788cd125be01d76f0fe2153335d9f6b21db1343c https://github.com/chapmanjacobd/library/actions/runs/5173602136/jobs/9319024777 | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/557/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
1436539554 | I_kwDOCGYnMM5Vn9qi | 511 | [insert_all, upsert_all] IntegrityError: constraint failed | 7908073 | closed | 0 | 2 | 2022-11-04T19:21:48Z | 2022-11-04T22:59:54Z | 2022-11-04T22:54:09Z | CONTRIBUTOR | My understand is that `INSERT OR IGNORE` will ignore when inserts would cause duplicate keys so I'm not sure exactly why the error is raised from `sqlite3`. ``` import argparse from pathlib import Path from xklb import db, utils from xklb.utils import log def parse_args() -> argparse.Namespace: parser = argparse.ArgumentParser() parser.add_argument("database") parser.add_argument("dbs", nargs="*") parser.add_argument("--upsert") parser.add_argument("--db", "-db", help=argparse.SUPPRESS) parser.add_argument("--verbose", "-v", action="count", default=0) args = parser.parse_args() if args.db: args.database = args.db Path(args.database).touch() args.db = db.connect(args) log.info(utils.dict_filter_bool(args.__dict__)) return args def merge_db(args, source_db): source_db = str(Path(source_db).resolve()) s_db = db.connect(argparse.Namespace(database=source_db, verbose=args.verbose)) for table in [s for s in s_db.table_names() if not "_fts" in s and not s.startswith("sqlite_")]: log.info("[%s]: %s", source_db, table) with s_db.conn: data = s_db[table].rows with args.db.conn: if args.upsert: args.db[table].upsert_all(data, pk=args.upsert.split(","), alter=True) else: args.db[table].insert_all(data, alter=True, replace=True) def merge_dbs(): args = parse_args() for s_db in args.dbs: merge_db(args, s_db) if __name__ == "__main__": merge_dbs() ``` ``` $ lb-dev merge video.db tube_71.db --upsert path -vv SQL: INSERT OR IGNORE INTO [media]([path]) VALUES(?); - params: ['https://archive.org/details/088ghostofachanceroygetssackedrevengeofthelivinglunchdvdripxvidphz'] ... File ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:3122, in Table.insert_all(self, records, pk, foreign_keys, column_order, not_null, defaults, batch_size, hash_id, hash_id_columns, alter, ignore, re… | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/511/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | ||||||
688659182 | MDU6SXNzdWU2ODg2NTkxODI= | 145 | Bug when first record contains fewer columns than subsequent records | 96218 | closed | 0 | 2 | 2020-08-30T05:44:44Z | 2020-09-08T23:21:23Z | 2020-09-08T23:21:23Z | CONTRIBUTOR | `insert_all()` selects the maximum batch size based on the number of fields in the first record. If the first record has fewer fields than subsequent records (and `alter=True` is passed), this can result in SQL statements with more than the maximum permitted number of host parameters. This situation is perhaps unlikely to occur, but could happen if the first record had, say, 10 columns, such that `batch_size` (based on `SQLITE_MAX_VARIABLE_NUMBER = 999`) would be 99. If the next 98 rows had 11 columns, the resulting SQL statement for the first batch would have `10 * 1 + 11 * 98 = 1088` host parameters (and subsequent batches, if the data were consistent from thereon out, would have `99 * 11 = 1089`). I suspect that this bug is masked somewhat by the fact that while: > [`SQLITE_MAX_VARIABLE_NUMBER`](https://www.sqlite.org/limits.html#max_variable_number) ... defaults to 999 for SQLite versions prior to 3.32.0 (2020-05-22) or 32766 for SQLite versions after 3.32.0. it is common that it is increased at compile time. Debian-based systems, for example, seem to ship with a version of sqlite compiled with `SQLITE_MAX_VARIABLE_NUMBER` set to 250,000, and I believe this is the case for homebrew installations too. A test for this issue might look like this: ```python def test_columns_not_in_first_record_should_not_cause_batch_to_be_too_large(fresh_db): # sqlite on homebrew and Debian/Ubuntu etc. is typically compiled with # SQLITE_MAX_VARIABLE_NUMBER set to 250,000, so we need to exceed this value to # trigger the error on these systems. THRESHOLD = 250000 extra_columns = 1 + (THRESHOLD - 1) // 99 records = [ {"c0": "first record"}, # one column in first record -> batch_size = 100 # fill out the batch with 99 records with enough columns to exceed THRESHOLD *[ dict([("c{}".format(i), j) for i in range(extra_columns)]) for j in range(99) ] ] try: fresh_db["too_many_columns"].insert_all(records, a… | 140912432 | issue | { "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/145/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed |