id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,pull_request,body,repo,type,active_lock_reason,performed_via_github_app,reactions,draft,state_reason 1988525411,I_kwDOCGYnMM52hn1j,603,Pyhton 3.12 Bug report,1324252,open,0,,,1,2023-11-10T22:57:48Z,2023-12-08T05:10:31Z,,NONE,,"I start with new python3 verison 3.12.0 Also have the error where connect DataBase ``` Traceback (most recent call last): File ""/home/t/Development/python/FKPJ/ClinicSYS/run.py"", line 1, in import re, os, io, json, sqlite_utils, requests, pytz, logging File ""/home/t/.local/lib/python3.12/site-packages/sqlite_utils/__init__.py"", line 1, in from .db import Database File ""/home/t/.local/lib/python3.12/site-packages/sqlite_utils/db.py"", line 277, in class Database: File ""/home/t/.local/lib/python3.12/site-packages/sqlite_utils/db.py"", line 306, in Database filename_or_conn: Optional[Union[str, pathlib.Path, sqlite3.Connection]] = None, ^^^^^^^^^^^^^^^^^^ ``` This bug come from `sqlite-utils` since's v3.33. Anyone get the same ? As well now of the resolved plan just keep the sqlite-utils version in python3.12 with v3.32.1 [tested] but where are the sqlite3.Connection problem.... This won't happen on python version down to 3.11[tested] Just the python3.12.0, I have test this error are come from the sqlite3 connection The error say from `sqlite_utils` and with the sqlite3 Connection, what can I do. Let fix together.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/603/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1978603203,I_kwDOCGYnMM517xbD,602,`sqlite-utils transform` removes the `AUTOINCREMENT` keyword,4472046,open,0,,,0,2023-11-06T08:48:43Z,2023-11-06T08:48:43Z,,NONE,,"### Context We ran into this bug randomly, noticing that deleted `ROWID` would get reused after migrating the DB. Using `transform` to change any column in the table will also unexpectedly strip away the `AUTOINCREMENT` keyword from the primary key definition, even if it was not the transformation target. ### Reproducible example **Original database** ```sql $ sqlite3 test.db << EOF CREATE TABLE mytable ( col1 INTEGER PRIMARY KEY AUTOINCREMENT, col2 TEXT NOT NULL ) EOF $ sqlite3 test.db "".schema mytable"" CREATE TABLE mytable ( col1 INTEGER PRIMARY KEY AUTOINCREMENT, col2 TEXT NOT NULL ); ``` **Modified database after sqlite-utils** ```sql $ sqlite-utils transform test.db mytable --rename col2 renamedcol2 $ sqlite3 test.db ""SELECT sql FROM sqlite_master WHERE name = 'mytable';"" CREATE TABLE IF NOT EXISTS ""mytable"" ( [col1] INTEGER PRIMARY KEY, [renamedcol2] TEXT NOT NULL ); ```",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/602/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1920416843,I_kwDOCGYnMM5ydzxL,597,sqlite-utils insert-files should be able to convert fields,1737541,open,0,,,0,2023-09-30T22:20:47Z,2023-09-30T22:20:47Z,,NONE,,"Currently using both `insert-files` and `convert` is needed in order to create sqlar files, it would be more convenient if it could be done with just one command. ```shell ~ ❯ cat test.py import os class Example: def __init__(self, arg1, arg2): self.arg1 = arg1 ~ ❯ sqlite-utils insert-files test.sqlar sqlar test.py -c name:name -c data:content -c mode:mode -c mtime:mtime -c sz:size --pk=name [####################################] 100% ~ ❯ sqlite-utils convert test.sqlar sqlar data ""zlib.compress(value)"" --import=zlib --where ""name = 'test.py'"" [####################################] 100% ~ ❯ cat test.py | sqlite-utils convert test.sqlar sqlar data ""zlib.compress(sys.stdin.buffer.read())"" --import=zlib --import=sys --where ""name = 'test.py'"" # Alternative way [####################################] 100% ~ ❯ sqlite3 test.sqlar ""SELECT hex(data) FROM sqlar WHERE name = 'test.py';"" | python3 -c ""import sys, zlib; sys.stdout.buffer.write(zlib.decompress(bytes.fromhex(sys.stdin.read())))"" import os class Example: def __init__(self, arg1, arg2): self.arg1 = arg1 ~ ❯ rm test.py ~ ❯ sqlar -l test.sqlar test.py ~ ❯ sqlar -x test.sqlar ~ ❯ cat test.py import os class Example: def __init__(self, arg1, arg2): self.arg1 = arg1 ```",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/597/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1754174496,I_kwDOCGYnMM5ojpQg,558,Ability to define unique columns when creating a table,1910303,open,0,,,0,2023-06-13T06:56:19Z,2023-08-18T01:06:03Z,,NONE,,"When creating a new table, it would be good to have an option to set unique columns similar to how not_null is set. ```python from sqlite_utils import Database columns = {""mRID"": str, ""name"": str} db = Database(""example.db"") db[""ExampleTable""].create(columns, pk=""mRID"", not_null=[""mRID""], if_not_exists=True) db[""ExampleTable""].create_index([""mRID""], unique=True, if_not_exists=True) ``` So something like this would add the UNIQUE flag to the table definition. ```python db[""ExampleTable""].create(columns, pk=""mRID"", not_null=[""mRID""], unique=[""mRID""], if_not_exists=True) ``` ```sql CREATE TABLE ExampleTable ( mRID TEXT PRIMARY KEY NOT NULL UNIQUE, name TEXT ); ```",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/558/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1839344979,I_kwDOCGYnMM5toi1T,582,Handling CSV/file input that contains NUL bytes,1448859,open,0,,,0,2023-08-07T12:24:14Z,2023-08-07T12:24:14Z,,NONE,,"I was using sqlite-utils to create a DB from a CSV and it turns out the CSV contains a NUL byte. When the processing reaches the line that contains the NUL an exception is raised. I'm wondering if there is something that can be done in `sqlite-utils` to say ""skip lines with encoding errors"" or some such. I think it isn't super straightforward though as the exception comes from inside the `csv` module that does all the parsing. Concretely the file is the `KernelVersions.csv` from https://www.kaggle.com/datasets/kaggle/meta-kaggle This is the command and output: ``` $ sqlite-utils insert --csv kaggle.db kaggle KernelVersions.csv [------------------------------------] 0% [#####################---------------] 60% 00:04:24Traceback (most recent call last): File ""/home/foobar/miniconda/envs/meta-kaggle/bin/sqlite-utils"", line 10, in sys.exit(cli()) File ""/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py"", line 1128, in __call__ return self.main(*args, **kwargs) File ""/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py"", line 1053, in main rv = self.invoke(ctx) File ""/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py"", line 1659, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File ""/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py"", line 1395, in invoke return ctx.invoke(self.callback, **ctx.params) File ""/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py"", line 754, in invoke return __callback(*args, **kwargs) File ""/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py"", line 1223, in insert insert_upsert_implementation( File ""/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py"", line 1085, in insert_upsert_implementation db[table].insert_all( File ""/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/db.py"", line 3198, in insert_all chunk = list(chunk) File ""/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/db.py"", line 3742, in fix_square_braces for record in records: File ""/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py"", line 1071, in docs = (decode_base64_values(doc) for doc in docs) File ""/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py"", line 1068, in docs = (verify_is_dict(doc) for doc in docs) File ""/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py"", line 1003, in docs = (dict(zip(headers, row)) for row in reader) _csv.Error: line contains NUL ```",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/582/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1822918995,I_kwDOCGYnMM5sp4lT,580,Add way to export to a csv file using the Python library,44324811,open,0,,,0,2023-07-26T18:09:26Z,2023-07-26T18:09:26Z,,NONE,,"According to the documentation, we can make a csv output using the CLI tool, but not the Python library. Could we have the latter?",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/580/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1795219865,I_kwDOCGYnMM5rAOGZ,566,`--no-headers` doesn't work on most formats,33625,open,0,,,2,2023-07-09T03:43:36Z,2023-07-09T04:13:35Z,,NONE,,"Version 3.33 ``` sqlite-utils query library.db 'select asin from audible' --fmt plain --no-headers | head -3 asin 0062804006 0062891421 ```",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/566/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1383646615,I_kwDOCGYnMM5SeMWX,491,Ability to merge databases and tables,8904453,open,0,,,7,2022-09-23T11:10:55Z,2023-06-14T22:14:24Z,,NONE,,"Hi! Let me firstly say that I am a big fan of your work -- I follow your tweets and blog posts with great interest 😄. Now onto the matter at hand: I think it would be great if `sqlite-utils` included a `merge` or `combine` command, with the purpose of combining different SQLite databases into a single SQLite database. This way, the newly ""merged"" database would contain all differently named tables contained in the databases to be merged as-is, as well a concatenation of all tables of the same name. This could look something like this: ```bash sqlite-utils merge cats.db dogs.db > animals.db ``` I imagine this is rather straightforward if all databases involved in the merge contain differently named tables (i.e. no chance of conflicts), but things get slightly more complicated if two or more of the databases to be merged contain tables with the same name. Not only do you have to ""do something"" with the primary key(s), but these tables could also simply have different schemas (and therefore be incompatible for concatenation to begin with). Anyhow, I would love your thoughts on this, and, if you are open to it, work together on the design and implementation!",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/491/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1733198948,I_kwDOCGYnMM5nToRk,555,Filter table by a large bunch of ids,10843208,open,0,,,1,2023-05-31T00:29:51Z,2023-06-14T22:01:57Z,,NONE,,"Hi! this might be a question related to both SQLite & sqlite-utils, and you might be more experienced with them. I have a large bunch of ids, and I'm wondering which is the best way to query them in terms of performance, and simplicity if possible. The naive approach would be something like `select * from table where rowid in (?, ?, ?...)` but that wouldn't scale if ids are >1k. Another approach might be creating a temp table, or in-memory db table, insert all ids in that table and then join with the target one. I failed to attach an in-memory db both using sqlite-utils, and plain sql's execute(), so my closest approach is something like, ```python def filter_existing_video_ids(video_ids): db = get_db() # contains a ""videos"" table db.execute(""CREATE TEMPORARY TABLE IF NOT EXISTS tmp (video_id TEXT NOT NULL PRIMARY KEY)"") db[""tmp""].insert_all([{""video_id"": video_id} for video_id in video_ids]) for row in db[""tmp""].rows_where(""video_id not in (select video_id from videos)""): yield row[""video_id""] db[""tmp""].drop() ``` That kinda worked, I couldn't find an option in sqlite-utils's `create_table()` to tell it's a temporary table. Also, `tmp` table is not dropped finally, neither using `.drop()` despite being created with the keyword `TEMPORARY`. I believe it should be automatically dropped after connection/session ends though I read.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/555/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1720096994,I_kwDOCGYnMM5mhpji,554,"`IndexError` when doing `.insert(..., pk='id')` after `insert_all`",1231935,open,0,,,1,2023-05-22T17:13:02Z,2023-05-22T17:18:33Z,,NONE,,"I believe this is related to https://github.com/simonw/sqlite-utils/issues/98. When `pk` is specified by table A's `insert` call, it throws an index error if a different table has written a row with a higher rowid than exists in the first table. Here's a basic example: ```py from sqlite_utils import Database def test_pk_for_insert(fresh_db): user = {""id"": ""abc"", ""name"": ""david""} fresh_db[""users""].insert(user, pk=""id"") fresh_db[""comments""].insert_all( [ {""id"": ""def"", ""text"": ""ok""}, {""id"": ""ghi"", ""text"": ""great""}, ], ) fresh_db[""users""].insert( user, ignore=True, # BUG: when specifying pk on the second insert call # db.py goes into a block it doesn't expect and we get the error pk=""id"", ) if __name__ == ""__main__"": db = Database(""bug.db"") if db[""users""].exists(): raise ValueError( ""bug only shows on a new database - remove bug.db before running the script"" ) test_pk_for_insert(db) ``` The error is: ```py File ""/Users/david/projects/reddit-to-sqlite/.venv/lib/python3.11/site-packages/sqlite_utils/db.py"", line 2960, in insert_chunk row = list(self.rows_where(""rowid = ?"", [self.last_rowid]))[0] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^ IndexError: list index out of range ``` The issue is in this block: https://github.com/simonw/sqlite-utils/blob/2747257a3334d55e890b40ec58fada57ae8cfbfd/sqlite_utils/db.py#L2954-L2958 relevant locals are: - `pk`: `'id'` - `result.lastrowid`: `2` What's most interesting is the comment `# self.last_rowid will be 0 if a ""INSERT OR IGNORE"" happened`, which doesn't seem to be the case here. ",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/554/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 907795562,MDU6SXNzdWU5MDc3OTU1NjI=,265,Using enable_fts before search term,36287,open,0,,,1,2021-06-01T01:43:34Z,2023-04-01T17:27:18Z,,NONE,,"Many thanks for the sqlite-utils suite of utilities. Has made my life much much easier. I used this to create a table and enable FTS. All works fine. The datasette utility detects FTS and shows a text box. Searching for a term using that interface works well. However, when I start to use features by following https://www.sqlite.org/fts5.html section **""3. Full-text Query Syntax""** I seem to run into issues that I suspect is due to `escape_fts` wrapper function. As an example, if i search for the term `""^குகை"" `on the text box in datasette it produces 140 results. However, when i tweak the query produced by datasette to not use ""escape_fts"" it produces 5 results. Similarly, when I try to restrict the search to a single column in FTS using a spec like `{title : ^குகை}` it returns no rows. The same thing pulls results when used without `escape_fts`. The text in the table is in Tamil language and the search term is a Tamil word. ``` ... where posts_fts match escape_fts(:search) ``` vs ``` ... where posts_fts match (:search) ``` Any ideas why? How can I get the benefits of both escaping as well as utilizing different facets of providing / controlling search terms? Thanks.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/265/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 702386948,MDU6SXNzdWU3MDIzODY5NDg=,159,.delete_where() does not auto-commit (unlike .insert() or .upsert()),11712349,open,0,,,9,2020-09-16T01:55:52Z,2023-04-01T17:21:05Z,,NONE,,"When you use the delete_where() function on a table, it never commits.... Is that intentional?",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/159/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1550536442,I_kwDOCGYnMM5ca076,521,Custom JSON encoder,31504,open,0,,,0,2023-01-20T09:19:40Z,2023-01-20T09:19:40Z,,NONE,,"It would be nice if we could specify a custom encoder (and decoder) for types that will need extra deserialisation – e.g., sets, enums or sparse matrices – or even project-specific types",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/521/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1453134846,I_kwDOCGYnMM5WnRP-,513,Add or document streamlined workflow for importing Datasette csv / json exports,19328961,open,0,,,0,2022-11-17T10:54:47Z,2022-11-17T10:54:47Z,,NONE,,"I'm working on some small front-end enhancements to the laion-aesthetic-datasette project, and I wanted to partially populate a database directly using exports from the existing Datasette instance instead of downloading the parquet files and creating my own multi-GB database. There have been a number of small issues that are certainly related to my relative lack of familiarity with the toolkit, but that are still surprising. For example: a CSV export of the images table (http://laion-aesthetic.datasette.io/laion-aesthetic-6pls.csv?sql=select+rowid%2C+url%2C+text%2C+domain_id%2C+width%2C+height%2C+similarity%2C+punsafe%2C+pwatermark%2C+aesthetic%2C+hash%2C+__index_level_0__+from+images+order+by+random%28%29+limit+100) has nested single quotes, double quotes, and commas that aren't handled by rows_from_file. Similarly, the json output has to be manually transformed to add the column names and remove extraneous information before sqlite_utils can import it. I was able to work through these issues, but as an enhancement it would be really helpful to create or document a clear workflow that avoids the friction of this data transformation.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/513/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1082651698,I_kwDOCGYnMM5Ah_Qy,358,Support for CHECK constraints,11597658,open,0,,,7,2021-12-16T21:19:45Z,2022-09-25T07:15:59Z,,NONE,,"Hi, I noticed the `transform.table()` method doesn't have an option to add/change or drop a check constraint (see https://sqlite.org/lang_createtable.html -> 3.7 Check Constraints. would be great to have this as an option! ",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/358/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1128466114,I_kwDOCGYnMM5DQwbC,406,Creating tables with custom datatypes,82988,open,0,,,5,2022-02-09T12:16:31Z,2022-09-15T18:13:50Z,,NONE,,"Via https://stackoverflow.com/a/18622264/454773 I note the ability to register custom handlers for novel datatypes that can map into and out of things like sqlite `BLOB`s. From a quick look and a quick play, I didn't spot a way to do this in `sqlite_utils`? For example: ```python # Via https://stackoverflow.com/a/18622264/454773 import sqlite3 import numpy as np import io def adapt_array(arr): """""" http://stackoverflow.com/a/31312102/190597 (SoulNibbler) """""" out = io.BytesIO() np.save(out, arr) out.seek(0) return sqlite3.Binary(out.read()) def convert_array(text): out = io.BytesIO(text) out.seek(0) return np.load(out) # Converts np.array to TEXT when inserting sqlite3.register_adapter(np.ndarray, adapt_array) # Converts TEXT to np.array when selecting sqlite3.register_converter(""array"", convert_array) ``` ```python from sqlite_utils import Database db = Database('test.db') # Reset the database connection to used the parsed datatype # sqlite_utils doesn't seem to support eg: # Database('test.db', detect_types=sqlite3.PARSE_DECLTYPES) db.conn = sqlite3.connect(db_name, detect_types=sqlite3.PARSE_DECLTYPES) # Create a table the old fashioned way # but using the new custom data type vector_table_create = """""" CREATE TABLE dummy (title TEXT, vector array ); """""" cur = db.conn.cursor() cur.execute(vector_table_create) # sqlite_utils doesn't appear to support custom types (yet?!) # The following errors on the ""array"" datatype """""" db[""dummy""].create({ ""title"": str, ""vector"": ""array"", }) """""" ``` We can then add / retrieve records from the database where the datatype of the `vector` field is a custom registered `array` type (which is to say, a `numpy` array): ```python import numpy as np db[""dummy""].insert({'title':""test1"", 'vector':np.array([1,2,3])}) for row in db.query(""SELECT * FROM dummy""): print(row['title'], row['vector'], type(row['vector'])) """""" test1 [1 2 3] """""" ``` It would be handy to be able to do this idiomatically in `sqlite_utils`.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/406/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1353074021,I_kwDOCGYnMM5QpkVl,474,Add an option for specifying column names when inserting CSV data,14294,open,0,,,3,2022-08-27T15:29:59Z,2022-08-31T03:42:36Z,,NONE,,"https://sqlite-utils.datasette.io/en/stable/cli.html#csv-files-without-a-header-row > The first row of any CSV or TSV file is expected to contain the names of the columns in that file. > If your file does not include this row, you can use the `--no-headers` option to specify that the tool should not use that fist row as headers. > If you do this, the table will be created with column names called `untitled_1` and `untitled_2` and so on. You can then rename them using the `sqlite-utils transform ... --rename` command. It would be nice to be able to specify the column names when importing CSV/TSV without a header row, via an extra command line option. (renaming a column of a large table can take a long time, which makes it an inconvenient workaround)",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/474/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1227571375,I_kwDOCGYnMM5JK0Cv,431,Allow making m2m relation of a table to itself,738408,open,0,,,3,2022-05-06T08:30:43Z,2022-06-23T14:12:51Z,,NONE,,"I am building a database, in which one of the tables has a many-to-many relationship to itself. As far as I can see, this is not (yet) possible using `.m2m()` in sqlite-utils. This may be a bit of a niche use case, so feel free to close this issue if you feel it would introduce too much complexity compared to the benefits. Example: suppose I have a table of people, and I want to store the information that John and Mary have two children, Michael and Suzy. It would be neat if I could do something like this: ```python from sqlite_utils import Database db = Database(memory=True) db[""people""].insert({""name"": ""John""}, pk=""name"").m2m( ""people"", [{""name"": ""Michael""}, {""name"": ""Suzy""}], m2m_table=""parent_child"", pk=""name"" ) db[""people""].insert({""name"": ""Mary""}, pk=""name"").m2m( ""people"", [{""name"": ""Michael""}, {""name"": ""Suzy""}], m2m_table=""parent_child"", pk=""name"" ) ``` But if I do that, the many-to-many table `parent_child` has only one column: ``` CREATE TABLE [parent_child] ( [people_id] TEXT REFERENCES [people]([name]), PRIMARY KEY ([people_id], [people_id]) ) ``` This could be solved by adding one or two keyword_arguments to `.m2m()`, e.g. `.m2m(..., left_name=None, right_name=None)` or `.m2m(..., names=(None, None))`.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/431/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1250495688,I_kwDOCGYnMM5KiQzI,439,Misleading progress bar against utf-16-le CSV input,4068,open,0,,,12,2022-05-27T08:34:49Z,2022-06-15T03:53:43Z,,NONE,,"The program crashes without any error. ``` wget ""https://artsdatabanken.no/Fab2018/api/export/csv"" sqlite-utils create-database test.db sqlite-utils insert --csv --delimiter "";"" --encoding ""utf-16-le"" test test.db csv [------------------------------------] 0% [#################-------------------] 49% 00:00:01 ``` I would like to highlight various issues: 1. sqlite-utils catches exceptions without printing the stacktrace and/or reraising the exception, so there is no easy way to use `pdb` or similar to debug the program, solution: add a debug option 2. Silent crash: this is related to (1.), and it happens when there is a catch-all mechanism; solution: let the program fail.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/439/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1224112817,I_kwDOCGYnMM5I9nqx,430,Document how to use `PRAGMA temp_store` to avoid errors when running VACUUM against huge databases,9308268,open,0,,,2,2022-05-03T13:33:58Z,2022-06-14T23:26:37Z,,NONE,,"I'm trying to figure out a way to get the `table.extract()` method to complete successfully -- I'm not sure if maybe the cause (and a possible solution) of this on Ubuntu Server 22.04 is to adjust some of the PRAGMA values within SQLite itself ... on another Linux system (PopOS), using this method on this same database appears to work just fine. Here's the bit that's causing the error, and the resulting error output: ```python # combine these columns into 1 table ""bib_properties"" : # best_title # bib_level_code # mat_type # material_code # best_author db[""circ_trans""].extract( [""best_title"", ""bib_level_code"", ""mat_type"", ""material_code"", ""best_author""], table=""bib_properties"", fk_column=""bib_properties_id"" ) db[""circ_trans""].extract( [""call_number""], table=""call_number"", fk_column=""call_number_id"", rename={""call_number"": ""value""} ) ``` ```python --------------------------------------------------------------------------- OperationalError Traceback (most recent call last) Input In [17], in () 1 # combine these columns into 1 table ""bib_properties"" : 2 # best_title 3 # bib_level_code 4 # mat_type 5 # material_code 6 # best_author ----> 7 db[""circ_trans""].extract( 8 [""best_title"", ""bib_level_code"", ""mat_type"", ""material_code"", ""best_author""], 9 table=""bib_properties"", 10 fk_column=""bib_properties_id"" 11 ) 13 db[""circ_trans""].extract( 14 [""call_number""], 15 table=""call_number"", 16 fk_column=""call_number_id"", 17 rename={""call_number"": ""value""} 18 ) File ~/jupyter/venv/lib/python3.10/site-packages/sqlite_utils/db.py:1764, in Table.extract(self, columns, table, fk_column, rename) 1761 column_order.append(c.name) 1763 # Drop the unnecessary columns and rename lookup column -> 1764 self.transform( 1765 drop=set(columns), 1766 rename={magic_lookup_column: fk_column}, 1767 column_order=column_order, 1768 ) 1770 # And add the foreign key constraint 1771 self.add_foreign_key(fk_column, table, ""id"") File ~/jupyter/venv/lib/python3.10/site-packages/sqlite_utils/db.py:1526, in Table.transform(self, types, rename, drop, pk, not_null, defaults, drop_foreign_keys, column_order) 1524 with self.db.conn: 1525 for sql in sqls: -> 1526 self.db.execute(sql) 1527 # Run the foreign_key_check before we commit 1528 if pragma_foreign_keys_was_on: File ~/jupyter/venv/lib/python3.10/site-packages/sqlite_utils/db.py:465, in Database.execute(self, sql, parameters) 463 return self.conn.execute(sql, parameters) 464 else: --> 465 return self.conn.execute(sql) OperationalError: database or disk is full ``` This database is about 17G in total size, so I'm assuming the error is coming from the vacuum ... where i'm assuming it's maybe trying to do the temp storage in a location that doesn't have sufficient room. The disk space is more than ample on the host in question (1.8T is free in the directory where the sqlite db resides) The `/tmp` directory however is limited on a smaller disk associated with the OS I'm trying to think if there's a way to set the `PRAGMA temp_store` or maybe if it's `temp_store_directory` that I'm after ... to use the same local directory of where the file is located (maybe this is a property of the version of sqlite on the system?) ```python # SET the temp file store to be a file ... print(db.execute('PRAGMA temp_store').fetchall()) print(db.execute('PRAGMA temp_store=FILE').fetchall()) print(db.execute('PRAGMA temp_store').fetchall()) # the users home directory ... print(db.execute(""PRAGMA temp_store_directory='/home/plchuser/'"").fetchall()) print(db.execute(""PRAGMA sqlite3_temp_directory='/home/plchuser/'"").fetchall()) print(db.execute(""PRAGMA temp_store_directory"").fetchall()) print(db.execute(""PRAGMA sqlite3_temp_directory"").fetchall()) ``` ```text [(1,)] [] [(1,)] [] [] [('/home/plchuser/',)] [] ``` Here's the docs on the Temporary File Storage Locations https://www.sqlite.org/tempfiles.html",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/430/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1236693079,I_kwDOCGYnMM5JtnBX,432,"Support `rows_where()`, `delete_where()` etc for attached alias databases",11597658,open,0,,,5,2022-05-16T06:38:58Z,2022-06-14T22:16:48Z,,NONE,,"Hi, I noticed `rows_where()` doesn't return any rows from tables which are from attached databases. The `exists()` function returns false. As far as I can see this is because the `table_names()` function only looks for table names in the current database and not in attached (or temp) databases. Besides, `rows_where()`, also `insert_all()` and `delete_where()` didn't do what I was expecting because of this. For the moment I've patched `table_names()` for myself, see below but I'm not sure what the total impact is on the other functions like lookup truncate etc which all use `exists()`. Also `view_names()` doesn't look for views in attached or temp databases. ```python def table_names(self, fts4: bool = False, fts5: bool = False) -> List[str]: ""A list of string table names in this database."" where = [""type = 'table'""] if fts4: where.append(""sql like '%USING FTS4%'"") if fts5: where.append(""sql like '%USING FTS5%'"") dbs = [x[1] for x in self.execute('pragma database_list').fetchall()] lst=[] for db in dbs: sql = ""select name from {} where {}"".format(db+"".sqlite_master"","" AND "".join(where)) lst.extend(r[0] for r in self.execute(sql).fetchall()) return lst ```",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/432/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 836829560,MDU6SXNzdWU4MzY4Mjk1NjA=,248,support for Apache Arrow / parquet files I/O,649467,open,0,,,1,2021-03-20T14:59:30Z,2021-10-28T23:46:48Z,,NONE,,"I just started looking at Apache Arrow using pyarrow for import and export of tabular datasets, and it looks quite compelling. It might be worth looking at for sqlite-utils and/or datasette. As a test, I took a random jsonl data dump of a dataset I have with floats, strings, and ints and converted it to arrow's parquet format using the naive `pyarrow.parquet.write_file()` command, which has automatic type inferrence. It compressed down to 7% of the original size. Conversion of a 26MB JSON file and serializing it to parquet was eyeblink instantaneous. Parquet files are portable and can be directly imported into pandas and other analytics software. The only hangup is the automatic type inference of the naive reader. It's great for general laziness and for parsing JSON columns (it correctly interpreted a table of mine with a JSON array). However, I did get an exception for a string column where most entries looked integer-like but had a couple values that weren't -- the reader tried to coerce all of them for some reason, even though the JSON type is string. Since the writer optionally takes a schema, it shouldn't be too hard to grab the sqlite header types. With some additional hinting, you might get datetime columns and JSON, which are native Arrow types. Somewhat tangentially, someone even wrote an sqlite vfs extension for Parquet: https://cldellow.com/2018/06/22/sqlite-parquet-vtable.html ",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/248/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 915421499,MDU6SXNzdWU5MTU0MjE0OTk=,267,row.update() or row.pk,12721157,open,0,,,4,2021-06-08T19:56:00Z,2021-06-22T17:27:27Z,,NONE,,"Hi, fantastic framework for working with Sqlite3 databases!!! I tried to update spezific rows in a table and used for row in db[tablename]: newValue = row[""counter""] * row[""prize""] row.update({""Fieldname"": newValue}) print(row) This updates the value in the printet row, but not in the database. So I switched to db[tablename].update(id, {""Filedname"": newValue}) This works fine. But row.update would be nicer, because no need for the id (its that row), no need for the tablename and the db (all defined in the for row ... loop). Thx ",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/267/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 818684978,MDU6SXNzdWU4MTg2ODQ5Nzg=,243,How can i use this utils to deal with fts on column meta of tables ?,27874014,open,0,,,0,2021-03-01T09:45:05Z,2021-03-01T09:45:05Z,,NONE,,"Thank you to release this bravo project. When i use this project on multi table db, I want to implement convenient search on column name from different tables. I want to develop a meta table to save the meta data of different columns of different tables and search on this meta table to get rows from the data table (which the meta table describes) does this project provide some simple function on it ? You can think a have a knowledge graph about the table in the db, and i save this knowledge graph into the db with fts enabled.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/243/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 539204432,MDU6SXNzdWU1MzkyMDQ0MzI=,70,Implement ON DELETE and ON UPDATE actions for foreign keys,26292069,open,0,,,2,2019-12-17T17:19:10Z,2020-02-27T04:18:53Z,,NONE,,"Hi! I did not find any mention on the library about ON DELETE and ON UPDATE actions for foreign keys. Are those expected to be implemented? If not, it would be a nice thing to include!",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/70/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,