{"id": 1250161887, "node_id": "I_kwDOCGYnMM5Kg_Tf", "number": 438, "title": "illegal UTF-16 surrogate", "user": {"value": 4068, "label": "frafra"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2022-05-26T22:49:52Z", "updated_at": "2022-05-27T08:21:53Z", "closed_at": "2022-05-27T08:21:53Z", "author_association": "NONE", "pull_request": null, "body": "I am trying to insert `https://artsdatabanken.no/Fab2018/api/export/csv` into a SQLite database, but I get an error when using `sqlite-utils`:\r\n\r\n```\r\nsqlite-utils insert --csv --delimiter \";\" --encoding=\"utf-16-le\" --pk \"Id\" csv fremmedart test.db\r\n [------------------------------------] 0%\r\nError: 'utf-16-le' codec can't decode bytes in position 98-99: illegal UTF-16 surrogate\r\n\r\nThe input you provided uses a character encoding other than utf-8.\r\n\r\nYou can fix this by passing the --encoding= option with the encoding of the file.\r\n\r\nIf you do not know the encoding, running 'file filename.csv' may tell you.\r\n\r\nIt's often worth trying: --encoding=latin-1\r\n```\r\n\r\nI tried to convert the file using `iconv -f \"utf-16le\" -t \"utf-8\"`, but I still get a similar error (at a slightly different position):\r\n\r\n```\r\nsqlite-utils insert --csv --delimiter \";\" --encoding=utf-8 --pk \"Id\" csv_utf8 fremmedart test.db\r\n [------------------------------------] 0%\r\nError: 'utf-8' codec can't decode byte 0xd9 in position 99: invalid continuation byte\r\n\r\nThe input you provided uses a character encoding other than utf-8.\r\n\r\nYou can fix this by passing the --encoding= option with the encoding of the file.\r\n\r\nIf you do not know the encoding, running 'file filename.csv' may tell you.\r\n\r\nIt's often worth trying: --encoding=latin-1\r\n```\r\n\r\nI have no issues reading the file with this Python code:\r\n```python\r\ncontent = open('csv', encoding='utf-16-le').read()\r\n```\r\n\r\n`in2csv` works too.", "repo": {"value": 
140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/438/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"}