{"id": 1250495688, "node_id": "I_kwDOCGYnMM5KiQzI", "number": 439, "title": "Misleading progress bar against utf-16-le CSV input", "user": {"value": 4068, "label": "frafra"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 12, "created_at": "2022-05-27T08:34:49Z", "updated_at": "2022-06-15T03:53:43Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "The program crashes without any error.\r\n```\r\nwget \"https://artsdatabanken.no/Fab2018/api/export/csv\"\r\nsqlite-utils create-database test.db\r\nsqlite-utils insert --csv --delimiter \";\" --encoding \"utf-16-le\" test test.db csv \r\n [------------------------------------] 0%\r\n [#################-------------------] 49% 00:00:01\r\n```\r\nI would like to highlight various issues:\r\n1. sqlite-utils catches exceptions without printing the stacktrace and/or reraising the exception, so there is no easy way to use `pdb` or similar to debug the program, solution: add a debug option\r\n2. 
Silent crash: this is related to (1.), and it happens when there is a catch-all mechanism; solution: let the program fail.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/439/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1250629388, "node_id": "I_kwDOCGYnMM5KixcM", "number": 440, "title": "CSV files with too many values in a row cause errors", "user": {"value": 4068, "label": "frafra"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 20, "created_at": "2022-05-27T10:54:44Z", "updated_at": "2022-06-14T22:23:01Z", "closed_at": "2022-06-14T20:12:46Z", "author_association": "NONE", "pull_request": null, "body": "*Original title: csv.DictReader can have None as key*\r\n\r\nIn some cases, `csv.DictReader` can have `None` as key for unnamed columns, and a list of values as value.\r\n`sqlite_utils.utils.rows_from_file` cannot handle that:\r\n\r\n```python\r\nurl=\"https://artsdatabanken.no/Fab2018/api/export/csv\"\r\ndb = sqlite_utils.Database(\":memory\")\r\n\r\nwith urlopen(url) as fab:\r\n reader, _ = sqlite_utils.utils.rows_from_file(fab, encoding=\"utf-16le\") \r\n db[\"fab2018\"].insert_all(reader, pk=\"Id\")\r\n```\r\n\r\nResult:\r\n```\r\nTraceback (most recent call last):\r\n File \"\", line 3, in \r\n File \"/home/user/.local/pipx/venvs/sqlite-utils/lib/python3.8/site-packages/sqlite_utils/db.py\", line 2924, in insert_all\r\n chunk = list(chunk)\r\n File \"/home/user/.local/pipx/venvs/sqlite-utils/lib/python3.8/site-packages/sqlite_utils/db.py\", line 3454, in fix_square_braces\r\n if any(\"[\" in key or \"]\" in key for key in record.keys()):\r\n File 
\"/home/user/.local/pipx/venvs/sqlite-utils/lib/python3.8/site-packages/sqlite_utils/db.py\", line 3454, in \r\n if any(\"[\" in key or \"]\" in key for key in record.keys()):\r\nTypeError: argument of type 'NoneType' is not iterable\r\n```\r\n\r\nCode:\r\nhttps://github.com/simonw/sqlite-utils/blob/59be60c471fd7a2c4be7f75e8911163e618ff5ca/sqlite_utils/db.py#L3454\r\n\r\n`sqlite-utils insert` from command line is not affected by this issue.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/440/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 1250161887, "node_id": "I_kwDOCGYnMM5Kg_Tf", "number": 438, "title": "illegal UTF-16 surrogate", "user": {"value": 4068, "label": "frafra"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2022-05-26T22:49:52Z", "updated_at": "2022-05-27T08:21:53Z", "closed_at": "2022-05-27T08:21:53Z", "author_association": "NONE", "pull_request": null, "body": "I am trying to insert `https://artsdatabanken.no/Fab2018/api/export/csv` into a SQLite database, but I have an error when using `sqlite-utils`:\r\n\r\n```\r\nsqlite-utils insert --csv --delimiter \";\" --encoding=\"utf-16-le\" --pk \"Id\" csv fremmedart test.db\r\n [------------------------------------] 0%\r\nError: 'utf-16-le' codec can't decode bytes in position 98-99: illegal UTF-16 surrogate\r\n\r\nThe input you provided uses a character encoding other than utf-8.\r\n\r\nYou can fix this by passing the --encoding= option with the encoding of the file.\r\n\r\nIf you do not know the encoding, running 'file filename.csv' may tell you.\r\n\r\nIt's often worth trying: --encoding=latin-1\r\n```\r\n\r\nI tried to convert the file 
using `iconv -f \"utf-16le\" -t \"utf-8\"`, but I still get a similar error (slightly different position):\r\n\r\n```\r\nsqlite-utils insert --csv --delimiter \";\" --encoding=utf-8 --pk \"Id\" csv_utf8 fremmedart test.db\r\n [------------------------------------] 0%\r\nError: 'utf-8' codec can't decode byte 0xd9 in position 99: invalid continuation byte\r\n\r\nThe input you provided uses a character encoding other than utf-8.\r\n\r\nYou can fix this by passing the --encoding= option with the encoding of the file.\r\n\r\nIf you do not know the encoding, running 'file filename.csv' may tell you.\r\n\r\nIt's often worth trying: --encoding=latin-1\r\n```\r\n\r\nI have no issues reading such file using this Python code:\r\n```python\r\ncontent = open('csv', encoding='utf-16-le').read()\r\n```\r\n\r\n`in2csv` works too.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/438/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 919314806, "node_id": "MDU6SXNzdWU5MTkzMTQ4MDY=", "number": 270, "title": "Cannot set type JSON", "user": {"value": 4068, "label": "frafra"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 4, "created_at": "2021-06-11T23:53:22Z", "updated_at": "2021-06-16T17:34:49Z", "closed_at": "2021-06-16T15:47:06Z", "author_association": "NONE", "pull_request": null, "body": "It would be great if the column type could be set to JSON. That would not be different from handling a regular string. 
It would be something like `repr(value)` and it would work with both JSON and CSV inputs, no matter if `value` is a real list or just a string representing a list.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/270/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 919250621, "node_id": "MDU6SXNzdWU5MTkyNTA2MjE=", "number": 269, "title": "bool type not supported", "user": {"value": 4068, "label": "frafra"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2021-06-11T22:00:36Z", "updated_at": "2021-06-15T01:34:10Z", "closed_at": "2021-06-15T01:34:10Z", "author_association": "NONE", "pull_request": null, "body": "Hi! Thank you for sharing this very nice tool :)\r\nIt would be nice to have support for more types, like `bool`: it is not possible to convert to boolean at the moment. My suggestion would be to handle it as `bool(int(value))`, like csvkit does.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/269/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"}