id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,pull_request,body,repo,type,active_lock_reason,performed_via_github_app,reactions,draft,state_reason 1128466114,I_kwDOCGYnMM5DQwbC,406,Creating tables with custom datatypes,82988,open,0,,,5,2022-02-09T12:16:31Z,2022-09-15T18:13:50Z,,NONE,,"Via https://stackoverflow.com/a/18622264/454773 I note the ability to register custom handlers for novel datatypes that can map into and out of things like sqlite `BLOB`s. From a quick look and a quick play, I didn't spot a way to do this in `sqlite_utils`? For example: ```python # Via https://stackoverflow.com/a/18622264/454773 import sqlite3 import numpy as np import io def adapt_array(arr): """""" http://stackoverflow.com/a/31312102/190597 (SoulNibbler) """""" out = io.BytesIO() np.save(out, arr) out.seek(0) return sqlite3.Binary(out.read()) def convert_array(text): out = io.BytesIO(text) out.seek(0) return np.load(out) # Converts np.array to TEXT when inserting sqlite3.register_adapter(np.ndarray, adapt_array) # Converts TEXT to np.array when selecting sqlite3.register_converter(""array"", convert_array) ``` ```python from sqlite_utils import Database db = Database('test.db') # Reset the database connection to used the parsed datatype # sqlite_utils doesn't seem to support eg: # Database('test.db', detect_types=sqlite3.PARSE_DECLTYPES) db.conn = sqlite3.connect(db_name, detect_types=sqlite3.PARSE_DECLTYPES) # Create a table the old fashioned way # but using the new custom data type vector_table_create = """""" CREATE TABLE dummy (title TEXT, vector array ); """""" cur = db.conn.cursor() cur.execute(vector_table_create) # sqlite_utils doesn't appear to support custom types (yet?!) # The following errors on the ""array"" datatype """""" db[""dummy""].create({ ""title"": str, ""vector"": ""array"", }) """""" ``` We can then add / retrieve records from the database where the datatype of the `vector` field is a custom registered `array` type (which is to say, a `numpy` array): ```python import numpy as np db[""dummy""].insert({'title':""test1"", 'vector':np.array([1,2,3])}) for row in db.query(""SELECT * FROM dummy""): print(row['title'], row['vector'], type(row['vector'])) """""" test1 [1 2 3] """""" ``` It would be handy to be able to do this idiomatically in `sqlite_utils`.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/406/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1063388037,I_kwDOCGYnMM4_YgOF,343,Provide function to generate hash_id from specified columns,82988,closed,0,,,4,2021-11-25T10:12:12Z,2022-03-02T04:25:25Z,2022-03-02T04:25:25Z,NONE,,"Hi I note that you define `_hash()` to create a `hash_id` from non-id column values in a table [here](https://github.com/simonw/sqlite-utils/blob/8f386a0d300d1b1c76132bb75972b755049fb742/sqlite_utils/db.py#L2996). It would be useful to be able to call a complementary function to generate a corresponding `_id` from a subset of specified columns when adding items to another table, eg to support the creation of foreign keys. Or is there a better pattern for doing that?",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/343/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed 545407916,MDU6SXNzdWU1NDU0MDc5MTY=,73,upsert_all() throws issue when upserting to empty table,82988,closed,0,,,6,2020-01-05T11:58:57Z,2020-01-31T14:21:09Z,2020-01-05T17:20:18Z,NONE,,"If I try to add a list of `dict`s to an empty table using `upsert_all`, I get an error: ```python import sqlite3 from sqlite_utils import Database import pandas as pd conx = sqlite3.connect(':memory') cx = conx.cursor() cx.executescript('CREATE TABLE ""test"" (""Col1"" TEXT);') q=""SELECT * FROM test;"" pd.read_sql(q, conx) #shows empty table db = Database(conx) db['test'].upsert_all([{'Col1':'a'},{'Col1':'b'}]) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) in 1 db = Database(conx) ----> 2 db['test'].upsert_all([{'Col1':'a'},{'Col1':'b'}]) /usr/local/lib/python3.7/site-packages/sqlite_utils/db.py in upsert_all(self, records, pk, foreign_keys, column_order, not_null, defaults, batch_size, hash_id, alter, extracts) 1157 alter=alter, 1158 extracts=extracts, -> 1159 upsert=True, 1160 ) 1161 /usr/local/lib/python3.7/site-packages/sqlite_utils/db.py in insert_all(self, records, pk, foreign_keys, column_order, not_null, defaults, batch_size, hash_id, alter, ignore, replace, extracts, upsert) 1040 sql = ""INSERT OR IGNORE INTO [{table}]({pks}) VALUES({pk_placeholders});"".format( 1041 table=self.name, -> 1042 pks="", "".join([""[{}]"".format(p) for p in pks]), 1043 pk_placeholders="", "".join([""?"" for p in pks]), 1044 ) TypeError: 'NoneType' object is not iterable ``` A hacky workaround in use is: ```python try: db['test'].upsert_all([{'Col1':'a'},{'Col1':'b'}]) except: db['test'].insert_all([{'Col1':'a'},{'Col1':'b'}]) ```",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/73/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed 403922644,MDU6SXNzdWU0MDM5MjI2NDQ=,8,Problems handling column names containing spaces or - ,82988,closed,0,,,3,2019-01-28T17:23:28Z,2019-04-14T15:29:33Z,2019-02-23T21:09:03Z,NONE,,"Irrrespective of whether using column names containing a space or - character is good practice, SQLite does allow it, but `sqlite-utils` throws an error in the following cases: ```python from sqlite_utils import Database dbname = 'test.db' DB = Database(sqlite3.connect(dbname)) import pandas as pd df = pd.DataFrame({'col1':range(3), 'col2':range(3)}) #Convert pandas dataframe to appropriate list/dict format DB['test1'].insert_all( df.to_dict(orient='records') ) #Works fine ``` However: ```python df = pd.DataFrame({'col 1':range(3), 'col2':range(3)}) DB['test1'].insert_all(df.to_dict(orient='records')) ``` throws: ``` --------------------------------------------------------------------------- OperationalError Traceback (most recent call last) in () 1 import pandas as pd 2 df = pd.DataFrame({'col 1':range(3), 'col2':range(3)}) ----> 3 DB['test1'].insert_all(df.to_dict(orient='records')) /usr/local/lib/python3.7/site-packages/sqlite_utils/db.py in insert_all(self, records, pk, foreign_keys, upsert, batch_size, column_order) 327 jsonify_if_needed(record.get(key, None)) for key in all_columns 328 ) --> 329 result = self.db.conn.execute(sql, values) 330 self.db.conn.commit() 331 self.last_id = result.lastrowid OperationalError: near ""1"": syntax error ``` and: ```python df = pd.DataFrame({'col-1':range(3), 'col2':range(3)}) DB['test1'].upsert_all(df.to_dict(orient='records')) ``` results in: ``` --------------------------------------------------------------------------- OperationalError Traceback (most recent call last) in () 1 import pandas as pd 2 df = pd.DataFrame({'col-1':range(3), 'col2':range(3)}) ----> 3 DB['test1'].insert_all(df.to_dict(orient='records')) /usr/local/lib/python3.7/site-packages/sqlite_utils/db.py in insert_all(self, records, pk, foreign_keys, upsert, batch_size, column_order) 327 jsonify_if_needed(record.get(key, None)) for key in all_columns 328 ) --> 329 result = self.db.conn.execute(sql, values) 330 self.db.conn.commit() 331 self.last_id = result.lastrowid OperationalError: near ""-"": syntax error ```",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/8/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed 411066700,MDU6SXNzdWU0MTEwNjY3MDA=,10,Error in upsert if column named 'order',82988,closed,0,,,1,2019-02-16T12:05:18Z,2019-02-24T16:55:38Z,2019-02-24T16:55:37Z,NONE,,"The following works fine: ``` connX = sqlite3.connect('DELME.db', timeout=10) dfX=pd.DataFrame({'col1':range(3),'col2':range(3)}) DBX = Database(connX) DBX['test'].upsert_all(dfX.to_dict(orient='records')) ``` But if a column is named `order`: ``` connX = sqlite3.connect('DELME.db', timeout=10) dfX=pd.DataFrame({'order':range(3),'col2':range(3)}) DBX = Database(connX) DBX['test'].upsert_all(dfX.to_dict(orient='records')) ``` it throws an error: ``` --------------------------------------------------------------------------- OperationalError Traceback (most recent call last) in 3 dfX=pd.DataFrame({'order':range(3),'col2':range(3)}) 4 DBX = Database(connX) ----> 5 DBX['test'].upsert_all(dfX.to_dict(orient='records')) /usr/local/lib/python3.7/site-packages/sqlite_utils/db.py in upsert_all(self, records, pk, foreign_keys, column_order) 347 foreign_keys=foreign_keys, 348 upsert=True, --> 349 column_order=column_order, 350 ) 351 /usr/local/lib/python3.7/site-packages/sqlite_utils/db.py in insert_all(self, records, pk, foreign_keys, upsert, batch_size, column_order) 327 jsonify_if_needed(record.get(key, None)) for key in all_columns 328 ) --> 329 result = self.db.conn.execute(sql, values) 330 self.db.conn.commit() 331 self.last_id = result.lastrowid OperationalError: near ""order"": syntax error ```",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/10/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed