{"html_url": "https://github.com/simonw/sqlite-utils/issues/406#issuecomment-1248440137", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/406", "id": 1248440137, "node_id": "IC_kwDOCGYnMM5Kaa9J", "user": {"value": 82988, "label": "psychemedia"}, "created_at": "2022-09-15T18:13:50Z", "updated_at": "2022-09-15T18:13:50Z", "author_association": "NONE", "body": "I was wondering if you have any more thoughts on this? I have a tangible use case now: adding a \"vector\" column to a database to support semantic search using doc2vec embeddings ([example](https://psychemedia.github.io/storynotes/Lang_Doc2Vec.html); note that the `vtfunc` package may no longer be reliable...).", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1128466114, "label": "Creating tables with custom datatypes"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/406#issuecomment-1041363433", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/406", "id": 1041363433, "node_id": "IC_kwDOCGYnMM4-EfHp", "user": {"value": 82988, "label": "psychemedia"}, "created_at": "2022-02-16T10:57:03Z", "updated_at": "2022-02-16T10:57:19Z", "author_association": "NONE", "body": "Wondering if this actually relates to https://github.com/simonw/sqlite-utils/issues/402 ?\r\n\r\nI also wonder if this would be a sensible approach for eg registering `pint` based quantity conversions into and out of the db, perhaps storing the quantity as a serialised `magnitude measurement` single column string?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1128466114, "label": "Creating tables with custom datatypes"}, "performed_via_github_app": null} {"html_url": 
"https://github.com/simonw/sqlite-utils/issues/402#issuecomment-1041325398", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/402", "id": 1041325398, "node_id": "IC_kwDOCGYnMM4-EV1W", "user": {"value": 82988, "label": "psychemedia"}, "created_at": "2022-02-16T10:12:48Z", "updated_at": "2022-02-16T10:18:55Z", "author_association": "NONE", "body": "> My hunch is that the case where you want to consider input from more than one column will actually be pretty rare - the only case I can think of where I would want to do that is for latitude/longitude columns\r\n\r\nOther possible pairs: unconventional date/datetime and timezone pairs eg `2022-02-16::17.00, London`; or more generally, numerical value and unit of measurement pairs (eg if you want to cast into and out of different measurement units using packages like `pint`) or currencies etc. Actually, in that case, I guess you may be presenting things that are unit typed already, and so a conversion would need to parse things into an appropriate, possibly two column `value, unit` format.\r\n\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1125297737, "label": "Advanced class-based `conversions=` mechanism"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/406#issuecomment-1041313679", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/406", "id": 1041313679, "node_id": "IC_kwDOCGYnMM4-ES-P", "user": {"value": 82988, "label": "psychemedia"}, "created_at": "2022-02-16T09:59:51Z", "updated_at": "2022-02-16T10:00:10Z", "author_association": "NONE", "body": "The `CustomColumnType()` approach looks good. 
This pushes you into the mindspace that you are defining and working with a custom column type.\r\n\r\nWhen creating the table, you could then error, or at least warn, if someone wasn't setting a column on a `type` or a custom column type, which I guess is where `mypy` comes in?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1128466114, "label": "Creating tables with custom datatypes"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/pull/203#issuecomment-1033641009", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/203", "id": 1033641009, "node_id": "IC_kwDOCGYnMM49nBwx", "user": {"value": 82988, "label": "psychemedia"}, "created_at": "2022-02-09T11:06:18Z", "updated_at": "2022-02-09T11:06:18Z", "author_association": "NONE", "body": "Is there any progress elsewhere on the handling of compound / composite foreign keys, or is this PR still effectively open?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 743384829, "label": "changes to allow for compound foreign keys"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/73#issuecomment-580745213", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/73", "id": 580745213, "node_id": "MDEyOklzc3VlQ29tbWVudDU4MDc0NTIxMw==", "user": {"value": 82988, "label": "psychemedia"}, "created_at": "2020-01-31T14:02:38Z", "updated_at": "2020-01-31T14:21:09Z", "author_association": "NONE", "body": "So the conundrum continues.. 
The simple test case above now runs, but if I upsert a large number of new records (successfully) and then try to upsert a smaller number of new records to a different table, I get the same error.\r\n\r\nIf I run the same upserts again (which in the first case means there are no new records to add, because they were already added), the second upsert works correctly.\r\n\r\nIt feels as if, when the number of items added via an upsert >> the number of items I try to add in an upsert immediately after, I get the error.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 545407916, "label": "upsert_all() throws issue when upserting to empty table"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/73#issuecomment-573047321", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/73", "id": 573047321, "node_id": "MDEyOklzc3VlQ29tbWVudDU3MzA0NzMyMQ==", "user": {"value": 82988, "label": "psychemedia"}, "created_at": "2020-01-10T14:02:56Z", "updated_at": "2020-01-10T14:09:23Z", "author_association": "NONE", "body": "Hmmm... just tried with installs from pip and the repo (v2.0.0 and v2.0.1) and I get the error each time (start of second run through the second loop).\r\n\r\nCould it be sqlite3? 
I'm on 3.30.1.\r\n\r\nUPDATE: just tried it on jupyter.org/try and I get the error there, too.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 545407916, "label": "upsert_all() throws issue when upserting to empty table"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/73#issuecomment-571138093", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/73", "id": 571138093, "node_id": "MDEyOklzc3VlQ29tbWVudDU3MTEzODA5Mw==", "user": {"value": 82988, "label": "psychemedia"}, "created_at": "2020-01-06T13:28:31Z", "updated_at": "2020-01-06T13:28:31Z", "author_association": "NONE", "body": "I think I actually had several issues in play...\r\n\r\nThe missing key was one, but I think there is also an issue as per below.\r\n\r\nFor example, in the following:\r\n\r\n```python\r\ndef init_testdb(dbname='test.db'):\r\n \r\n if os.path.exists(dbname):\r\n os.remove(dbname)\r\n\r\n conn = sqlite3.connect(dbname)\r\n db = Database(conn)\r\n \r\n return conn, db\r\n\r\nconn, db = init_testdb()\r\n\r\nc = conn.cursor()\r\nc.executescript('CREATE TABLE \"test1\" (\"Col1\" TEXT, \"Col2\" TEXT, PRIMARY KEY (\"Col1\"));')\r\nc.executescript('CREATE TABLE \"test2\" (\"Col1\" TEXT, \"Col2\" TEXT, PRIMARY KEY (\"Col1\"));')\r\n\r\nprint('Test 1...')\r\nfor i in range(3):\r\n db['test1'].upsert_all([{'Col1':'a', 'Col2':'x'},{'Col1':'b', 'Col2':'x'}], pk=('Col1'))\r\n db['test2'].upsert_all([{'Col1':'a', 'Col2':'x'},{'Col1':'b', 'Col2':'x'}], pk=('Col1'))\r\n\r\nprint('Test 2...')\r\nfor i in range(3):\r\n db['test1'].upsert_all([{'Col1':'a', 'Col2':'x'},{'Col1':'b', 'Col2':'x'}], pk=('Col1'))\r\n db['test2'].upsert_all([{'Col1':'a', 'Col2':'x'},{'Col1':'b', 'Col2':'x'},\r\n {'Col1':'c','Col2':'x'}], 
pk=('Col1'))\r\nprint('Done...')\r\n\r\n---------------------------------------------------------------------------\r\nTest 1...\r\nTest 2...\r\n IndexError: list index out of range \r\n---------------------------------------------------------------------------\r\nIndexError Traceback (most recent call last)\r\n in \r\n 22 print('Test 2...')\r\n 23 for i in range(3):\r\n---> 24 db['test1'].upsert_all([{'Col1':'a', 'Col2':'x'},{'Col1':'b', 'Col2':'x'}], pk=('Col1'))\r\n 25 db['test2'].upsert_all([{'Col1':'a', 'Col2':'x'},{'Col1':'b', 'Col2':'x'},\r\n 26 {'Col1':'c','Col2':'x'}], pk=('Col1'))\r\n\r\n/usr/local/lib/python3.7/site-packages/sqlite_utils/db.py in upsert_all(self, records, pk, foreign_keys, column_order, not_null, defaults, batch_size, hash_id, alter, extracts)\r\n 1157 alter=alter,\r\n 1158 extracts=extracts,\r\n-> 1159 upsert=True,\r\n 1160 )\r\n 1161 \r\n\r\n/usr/local/lib/python3.7/site-packages/sqlite_utils/db.py in insert_all(self, records, pk, foreign_keys, column_order, not_null, defaults, batch_size, hash_id, alter, ignore, replace, extracts, upsert)\r\n 1097 # self.last_rowid will be 0 if a \"INSERT OR IGNORE\" happened\r\n 1098 if (hash_id or pk) and self.last_rowid:\r\n-> 1099 row = list(self.rows_where(\"rowid = ?\", [self.last_rowid]))[0]\r\n 1100 if hash_id:\r\n 1101 self.last_pk = row[hash_id]\r\n\r\nIndexError: list index out of range\r\n```\r\n\r\nthe first test works but the second fails. 
Is the length of the list of items being upserted leaking somewhere?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 545407916, "label": "upsert_all() throws issue when upserting to empty table"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/8#issuecomment-482994231", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/8", "id": 482994231, "node_id": "MDEyOklzc3VlQ29tbWVudDQ4Mjk5NDIzMQ==", "user": {"value": 82988, "label": "psychemedia"}, "created_at": "2019-04-14T15:04:07Z", "updated_at": "2019-04-14T15:29:33Z", "author_association": "NONE", "body": "\r\n\r\nPLEASE IGNORE THE BELOW... I did a package update and rebuilt the kernel I was working in... may just have been an old version of sqlite_utils, seems to be working now. (Too many containers / too many environments!)\r\n\r\n\r\nHas an issue been reintroduced here with FTS? 
eg I'm getting an error thrown by spaces in column names here:\r\n\r\n```\r\n/usr/local/lib/python3.7/site-packages/sqlite_utils/db.py in insert_all(self, records, pk, foreign_keys, upsert, batch_size, column_order)\r\n\r\ndef enable_fts(self, columns, fts_version=\"FTS5\"):\r\n--> 329 \"Enables FTS on the specified columns\"\r\n 330 sql = \"\"\"\r\n 331 CREATE VIRTUAL TABLE \"{table}_fts\" USING {fts_version} (\r\n```\r\n\r\nwhen trying an `insert_all`.\r\n\r\nAlso, if a col has a `.` in it, I seem to get:\r\n\r\n```\r\n/usr/local/lib/python3.7/site-packages/sqlite_utils/db.py in insert_all(self, records, pk, foreign_keys, upsert, batch_size, column_order)\r\n 327 jsonify_if_needed(record.get(key, None)) for key in all_columns\r\n 328 )\r\n--> 329 result = self.db.conn.execute(sql, values)\r\n 330 self.db.conn.commit()\r\n 331 self.last_id = result.lastrowid\r\n\r\nOperationalError: near \".\": syntax error\r\n```\r\n\r\n(Can't post a worked minimal example right now; racing trying to build something against a live timing screen that will stop until next weekend in an hour or two...)\r\n\r\nPS Hmmm I did a test and they seem to work; I must be messing up s/where else...\r\n\r\n```\r\nimport sqlite3\r\nfrom sqlite_utils import Database\r\n\r\ndbname='testingDB_sqlite_utils.db'\r\n\r\n#!rm $dbname\r\nconn = sqlite3.connect(dbname, timeout=10)\r\n\r\n\r\n#Setup database tables\r\nc = conn.cursor()\r\n\r\nsetup='''\r\nCREATE TABLE IF NOT EXISTS \"test1\" (\r\n \"NO\" INTEGER,\r\n \"NAME\" TEXT\r\n);\r\n\r\nCREATE TABLE IF NOT EXISTS \"test2\" (\r\n \"NO\" INTEGER,\r\n `TIME OF DAY` TEXT\r\n);\r\n\r\nCREATE TABLE IF NOT EXISTS \"test3\" (\r\n \"NO\" INTEGER,\r\n `AVG. 
SPEED (MPH)` FLOAT\r\n);\r\n'''\r\n\r\nc.executescript(setup)\r\n\r\n\r\nDB = Database(conn)\r\n\r\nimport pandas as pd\r\n\r\ndf1 = pd.DataFrame({'NO':[1,2],'NAME':['a','b']})\r\nDB['test1'].insert_all(df1.to_dict(orient='records'))\r\n\r\ndf2 = pd.DataFrame({'NO':[1,2],'TIME OF DAY':['early on','late']})\r\nDB['test2'].insert_all(df2.to_dict(orient='records'))\r\n\r\ndf3 = pd.DataFrame({'NO':[1,2],'AVG. SPEED (MPH)':['123.3','123.4']})\r\nDB['test3'].insert_all(df3.to_dict(orient='records'))\r\n```\r\n\r\nall seem to work ok. I'm still getting errors in my setup though, which is not too different from the test cases?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 403922644, "label": "Problems handling column names containing spaces or - "}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/18#issuecomment-480621924", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/18", "id": 480621924, "node_id": "MDEyOklzc3VlQ29tbWVudDQ4MDYyMTkyNA==", "user": {"value": 82988, "label": "psychemedia"}, "created_at": "2019-04-07T19:31:42Z", "updated_at": "2019-04-07T19:31:42Z", "author_association": "NONE", "body": "I've just noticed that SQLite lets you IGNORE inserts that collide with a pre-existing key. This can be quite handy if you have a dataset that keeps changing in part, and you don't want to upsert and replace pre-existing PK rows but you do want to ignore collisions to existing PK rows.\r\n\r\nDoes `sqlite_utils` support such (cavalier!) 
behaviour?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 413871266, "label": ".insert/.upsert/.insert_all/.upsert_all should add missing columns"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/8#issuecomment-464341721", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/8", "id": 464341721, "node_id": "MDEyOklzc3VlQ29tbWVudDQ2NDM0MTcyMQ==", "user": {"value": 82988, "label": "psychemedia"}, "created_at": "2019-02-16T12:08:41Z", "updated_at": "2019-02-16T12:08:41Z", "author_association": "NONE", "body": "We also get an error if a column name contains a `.`", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 403922644, "label": "Problems handling column names containing spaces or - "}, "performed_via_github_app": null}