{"html_url": "https://github.com/simonw/sqlite-utils/issues/507#issuecomment-1297859539", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/507", "id": 1297859539, "node_id": "IC_kwDOCGYnMM5NW8PT", "user": {"value": 7908073, "label": "chapmanjacobd"}, "created_at": "2022-11-01T00:40:16Z", "updated_at": "2022-11-01T00:40:16Z", "author_association": "CONTRIBUTOR", "body": "Ideally people could fix their data if they run into this issue.\r\n\r\nIf you are using filenames try [convmv](https://linux.die.net/man/1/convmv)\r\n\r\n```\r\nconvmv --preserve-mtimes -f utf8 -t utf8 --notest -i -r .\r\n```\r\n\r\nmaybe this script will also help: \r\n\r\n```py\r\nimport argparse, shutil\r\nfrom pathlib import Path\r\n\r\nimport ftfy\r\n\r\nfrom xklb import utils\r\nfrom xklb.utils import log\r\n\r\n\r\ndef parse_args() -> argparse.Namespace:\r\n parser = argparse.ArgumentParser()\r\n parser.add_argument(\"paths\", nargs='*')\r\n parser.add_argument(\"--verbose\", \"-v\", action=\"count\", default=0)\r\n args = parser.parse_args()\r\n\r\n log.info(utils.dict_filter_bool(args.__dict__))\r\n return args\r\n\r\n\r\ndef rename_invalid_paths() -> None:\r\n args = parse_args()\r\n\r\n for path in args.paths:\r\n log.info(path)\r\n for p in sorted([str(p) for p in Path(path).rglob(\"*\")], key=len):\r\n fixed = ftfy.fix_text(p, uncurl_quotes=False).replace(\"\\r\\n\", \"\\n\").replace(\"\\r\", \"\\n\").replace(\"\\n\", \"\")\r\n if p != fixed:\r\n try:\r\n shutil.move(p, fixed)\r\n except FileNotFoundError:\r\n log.warning(\"FileNotFound. %s\", p)\r\n else:\r\n log.info(fixed)\r\n\r\n\r\nif __name__ == \"__main__\":\r\n rename_invalid_paths()\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1430325103, "label": "conn.execute: UnicodeEncodeError: 'utf-8' codec can't encode character"}, "performed_via_github_app": null}