{"html_url": "https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1116684581", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/416", "id": 1116684581, "node_id": "IC_kwDOCGYnMM5Cj0El", "user": {"value": 638427, "label": "mattkiefer"}, "created_at": "2022-05-03T21:36:49Z", "updated_at": "2022-05-03T21:36:49Z", "author_association": "NONE", "body": "Thanks for addressing this @simonw! However, I just reinstalled sqlite-utils 3.26.1 and get a `ParserError: Unknown string format: None`:\r\n```\r\nsqlite-utils --version\r\nsqlite-utils, version 3.26.1\r\n```\r\n```\r\nsqlite-utils convert idfpr.db license \"Original Issue Date\" \"r.parsedate(value)\"\r\nTraceback (most recent call last):\r\n File \"/home/matt/.local/lib/python3.9/site-packages/sqlite_utils/db.py\", line 2514, in convert_value\r\n return fn(v)\r\n File \"<string>\", line 2, in fn\r\n File \"/home/matt/.local/lib/python3.9/site-packages/sqlite_utils/recipes.py\", line 19, in parsedate\r\n parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst)\r\n File \"/usr/lib/python3/dist-packages/dateutil/parser/_parser.py\", line 1374, in parse\r\n return DEFAULTPARSER.parse(timestr, **kwargs)\r\n File \"/usr/lib/python3/dist-packages/dateutil/parser/_parser.py\", line 649, in parse\r\n raise ParserError(\"Unknown string format: %s\", timestr)\r\ndateutil.parser._parser.ParserError: Unknown string format: None\r\nTraceback (most recent call last):\r\n File \"/home/matt/.local/bin/sqlite-utils\", line 8, in <module>\r\n sys.exit(cli())\r\n File \"/usr/lib/python3/dist-packages/click/core.py\", line 829, in __call__\r\n return self.main(*args, **kwargs)\r\n File \"/usr/lib/python3/dist-packages/click/core.py\", line 782, in main\r\n rv = self.invoke(ctx)\r\n File \"/usr/lib/python3/dist-packages/click/core.py\", line 1259, in invoke\r\n return _process_result(sub_ctx.command.invoke(sub_ctx))\r\n File \"/usr/lib/python3/dist-packages/click/core.py\", line 1066, in invoke\r\n return 
ctx.invoke(self.callback, **ctx.params)\r\n File \"/usr/lib/python3/dist-packages/click/core.py\", line 610, in invoke\r\n return callback(*args, **kwargs)\r\n File \"/home/matt/.local/lib/python3.9/site-packages/sqlite_utils/cli.py\", line 2707, in convert\r\n db[table].convert(\r\n File \"/home/matt/.local/lib/python3.9/site-packages/sqlite_utils/db.py\", line 2530, in convert\r\n self.db.execute(sql, where_args or [])\r\n File \"/home/matt/.local/lib/python3.9/site-packages/sqlite_utils/db.py\", line 463, in execute\r\n return self.conn.execute(sql, parameters)\r\nsqlite3.OperationalError: user-defined function raised exception\r\n```\r\nI definitely have some invalid data in the db. Happy to send a copy if it's helpful.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1173023272, "label": "Options for how `r.parsedate()` should handle invalid dates"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073456222", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/416", "id": 1073456222, "node_id": "IC_kwDOCGYnMM4_-6Re", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T03:45:52Z", "updated_at": "2022-03-21T03:45:52Z", "author_association": "OWNER", "body": "Needs tests and documentation.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1173023272, "label": "Options for how `r.parsedate()` should handle invalid dates"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073456155", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/416", "id": 1073456155, "node_id": "IC_kwDOCGYnMM4_-6Qb", "user": {"value": 9599, "label": "simonw"}, "created_at": 
"2022-03-21T03:45:37Z", "updated_at": "2022-03-21T03:45:37Z", "author_association": "OWNER", "body": "Prototype:\r\n```diff\r\ndiff --git a/sqlite_utils/cli.py b/sqlite_utils/cli.py\r\nindex 8255b56..0a3693e 100644\r\n--- a/sqlite_utils/cli.py\r\n+++ b/sqlite_utils/cli.py\r\n@@ -2583,7 +2583,11 @@ def _generate_convert_help():\r\n \"\"\"\r\n ).strip()\r\n recipe_names = [\r\n- n for n in dir(recipes) if not n.startswith(\"_\") and n not in (\"json\", \"parser\")\r\n+ n\r\n+ for n in dir(recipes)\r\n+ if not n.startswith(\"_\")\r\n+ and n not in (\"json\", \"parser\")\r\n+ and callable(getattr(recipes, n))\r\n ]\r\n for name in recipe_names:\r\n fn = getattr(recipes, name)\r\ndiff --git a/sqlite_utils/recipes.py b/sqlite_utils/recipes.py\r\nindex 6918661..569c30d 100644\r\n--- a/sqlite_utils/recipes.py\r\n+++ b/sqlite_utils/recipes.py\r\n@@ -1,17 +1,38 @@\r\n from dateutil import parser\r\n import json\r\n \r\n+IGNORE = object()\r\n+SET_NULL = object()\r\n \r\n-def parsedate(value, dayfirst=False, yearfirst=False):\r\n+\r\n+def parsedate(value, dayfirst=False, yearfirst=False, errors=None):\r\n \"Parse a date and convert it to ISO date format: yyyy-mm-dd\"\r\n- return (\r\n- parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst).date().isoformat()\r\n- )\r\n+ try:\r\n+ return (\r\n+ parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst)\r\n+ .date()\r\n+ .isoformat()\r\n+ )\r\n+ except parser.ParserError:\r\n+ if errors is IGNORE:\r\n+ return value\r\n+ elif errors is SET_NULL:\r\n+ return None\r\n+ else:\r\n+ raise\r\n \r\n \r\n-def parsedatetime(value, dayfirst=False, yearfirst=False):\r\n+def parsedatetime(value, dayfirst=False, yearfirst=False, errors=None):\r\n \"Parse a datetime and convert it to ISO datetime format: yyyy-mm-ddTHH:MM:SS\"\r\n- return parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst).isoformat()\r\n+ try:\r\n+ return parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst).isoformat()\r\n+ except parser.ParserError:\r\n+ 
if errors is IGNORE:\r\n+ return value\r\n+ elif errors is SET_NULL:\r\n+ return None\r\n+ else:\r\n+ raise\r\n \r\n \r\n def jsonsplit(value, delimiter=\",\", type=str):\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1173023272, "label": "Options for how `r.parsedate()` should handle invalid dates"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073455905", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/416", "id": 1073455905, "node_id": "IC_kwDOCGYnMM4_-6Mh", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T03:44:47Z", "updated_at": "2022-03-21T03:45:00Z", "author_association": "OWNER", "body": "This is quite nice:\r\n```\r\n% sqlite-utils convert test-dates.db dates date \"r.parsedate(value, errors=r.IGNORE)\"\r\n [####################################] 100%\r\n% sqlite-utils rows test-dates.db dates \r\n[{\"id\": 1, \"date\": \"2016-03-15\"},\r\n {\"id\": 2, \"date\": \"2016-03-16\"},\r\n {\"id\": 3, \"date\": \"2016-03-17\"},\r\n {\"id\": 4, \"date\": \"2016-03-18\"},\r\n {\"id\": 5, \"date\": \"2016-03-19\"},\r\n {\"id\": 6, \"date\": \"2016-03-20\"},\r\n {\"id\": 7, \"date\": \"2016-03-21\"},\r\n {\"id\": 8, \"date\": \"2016-03-22\"},\r\n {\"id\": 9, \"date\": \"2016-03-23\"},\r\n {\"id\": 10, \"date\": \"//\"},\r\n {\"id\": 11, \"date\": \"2016-03-25\"},\r\n {\"id\": 12, \"date\": \"2016-03-26\"},\r\n {\"id\": 13, \"date\": \"2016-03-27\"},\r\n {\"id\": 14, \"date\": \"2016-03-28\"},\r\n {\"id\": 15, \"date\": \"2016-03-29\"},\r\n {\"id\": 16, \"date\": \"2016-03-30\"},\r\n {\"id\": 17, \"date\": \"2016-03-31\"},\r\n {\"id\": 18, \"date\": \"2016-04-01\"}]\r\n% sqlite-utils convert test-dates.db dates date \"r.parsedate(value, errors=r.SET_NULL)\"\r\n [####################################] 100%\r\n% sqlite-utils rows 
test-dates.db dates \r\n[{\"id\": 1, \"date\": \"2016-03-15\"},\r\n {\"id\": 2, \"date\": \"2016-03-16\"},\r\n {\"id\": 3, \"date\": \"2016-03-17\"},\r\n {\"id\": 4, \"date\": \"2016-03-18\"},\r\n {\"id\": 5, \"date\": \"2016-03-19\"},\r\n {\"id\": 6, \"date\": \"2016-03-20\"},\r\n {\"id\": 7, \"date\": \"2016-03-21\"},\r\n {\"id\": 8, \"date\": \"2016-03-22\"},\r\n {\"id\": 9, \"date\": \"2016-03-23\"},\r\n {\"id\": 10, \"date\": null},\r\n {\"id\": 11, \"date\": \"2016-03-25\"},\r\n {\"id\": 12, \"date\": \"2016-03-26\"},\r\n {\"id\": 13, \"date\": \"2016-03-27\"},\r\n {\"id\": 14, \"date\": \"2016-03-28\"},\r\n {\"id\": 15, \"date\": \"2016-03-29\"},\r\n {\"id\": 16, \"date\": \"2016-03-30\"},\r\n {\"id\": 17, \"date\": \"2016-03-31\"},\r\n {\"id\": 18, \"date\": \"2016-04-01\"}]\r\n```", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1173023272, "label": "Options for how `r.parsedate()` should handle invalid dates"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073453370", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/416", "id": 1073453370, "node_id": "IC_kwDOCGYnMM4_-5k6", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T03:41:06Z", "updated_at": "2022-03-21T03:41:06Z", "author_association": "OWNER", "body": "I'm going to try the `errors=r.IGNORE` option and see what that looks like once implemented.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1173023272, "label": "Options for how `r.parsedate()` should handle invalid dates"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073453230", "issue_url": 
"https://api.github.com/repos/simonw/sqlite-utils/issues/416", "id": 1073453230, "node_id": "IC_kwDOCGYnMM4_-5iu", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T03:40:37Z", "updated_at": "2022-03-21T03:40:37Z", "author_association": "OWNER", "body": "I think the options here should be:\r\n\r\n- On error, raise an exception and revert the transaction (the current default)\r\n- On error, leave the value as-is\r\n- On error, set the value to `None`\r\n\r\nThese need to be indicated by parameters to the `r.parsedate()` function.\r\n\r\nSome design options:\r\n\r\n- `ignore=True` to ignore errors - but how does it know if it should leave the value or set it to `None`? This is similar to other `ignore=True` parameters elsewhere in the Python API.\r\n- `errors=\"ignore\"`, `errors=\"set-null\"` - I don't like magic string values very much, but this is similar to Python's `str.encode(errors=)` mechanism\r\n- `errors=r.IGNORE` - using constants, which at least avoids magic strings. The other one could be `errors=r.SET_NULL`\r\n- `error=lambda v: None` or `error=lambda v: v` - this is a bit confusing though, introducing another callback that gets to have a go at converting the error if the first callback failed? 
And what happens if that lambda itself raises an error?", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1173023272, "label": "Options for how `r.parsedate()` should handle invalid dates"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073451659", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/416", "id": 1073451659, "node_id": "IC_kwDOCGYnMM4_-5KL", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T03:35:01Z", "updated_at": "2022-03-21T03:35:01Z", "author_association": "OWNER", "body": "I confirmed that if it fails for any value ALL values are left alone, since it runs in a transaction.\r\n\r\nHere's the code that does that:\r\n\r\nhttps://github.com/simonw/sqlite-utils/blob/433813612ff9b4b501739fd7543bef0040dd51fe/sqlite_utils/db.py#L2523-L2526", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1173023272, "label": "Options for how `r.parsedate()` should handle invalid dates"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073450588", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/416", "id": 1073450588, "node_id": "IC_kwDOCGYnMM4_-45c", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T03:32:58Z", "updated_at": "2022-03-21T03:32:58Z", "author_association": "OWNER", "body": "Then I ran this to convert `2016-03-27` etc to `2016/03/27` so I could see which ones were later converted:\r\n\r\n sqlite-utils convert test-dates.db dates date 'value.replace(\"-\", \"/\")'\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", 
"issue": {"value": 1173023272, "label": "Options for how `r.parsedate()` should handle invalid dates"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073448904", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/416", "id": 1073448904, "node_id": "IC_kwDOCGYnMM4_-4fI", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-21T03:28:12Z", "updated_at": "2022-03-21T03:30:37Z", "author_association": "OWNER", "body": "Generating a test database using a pattern from https://www.geekytidbits.com/date-range-table-sqlite/\r\n```\r\nsqlite-utils create-database test-dates.db\r\nsqlite-utils create-table test-dates.db dates id integer date text --pk id\r\nsqlite-utils test-dates.db \"WITH RECURSIVE\r\n cnt(x) AS (\r\n SELECT 0\r\n UNION ALL\r\n SELECT x+1 FROM cnt\r\n LIMIT (SELECT ((julianday('2016-04-01') - julianday('2016-03-15'))) + 1)\r\n )\r\ninsert into dates (date) select date(julianday('2016-03-15'), '+' || x || ' days') as date FROM cnt;\"\r\n```\r\nAfter running that:\r\n```\r\n% sqlite-utils rows test-dates.db dates\r\n[{\"id\": 1, \"date\": \"2016-03-15\"},\r\n {\"id\": 2, \"date\": \"2016-03-16\"},\r\n {\"id\": 3, \"date\": \"2016-03-17\"},\r\n {\"id\": 4, \"date\": \"2016-03-18\"},\r\n {\"id\": 5, \"date\": \"2016-03-19\"},\r\n {\"id\": 6, \"date\": \"2016-03-20\"},\r\n {\"id\": 7, \"date\": \"2016-03-21\"},\r\n {\"id\": 8, \"date\": \"2016-03-22\"},\r\n {\"id\": 9, \"date\": \"2016-03-23\"},\r\n {\"id\": 10, \"date\": \"2016-03-24\"},\r\n {\"id\": 11, \"date\": \"2016-03-25\"},\r\n {\"id\": 12, \"date\": \"2016-03-26\"},\r\n {\"id\": 13, \"date\": \"2016-03-27\"},\r\n {\"id\": 14, \"date\": \"2016-03-28\"},\r\n {\"id\": 15, \"date\": \"2016-03-29\"},\r\n {\"id\": 16, \"date\": \"2016-03-30\"},\r\n {\"id\": 17, \"date\": \"2016-03-31\"},\r\n {\"id\": 18, \"date\": \"2016-04-01\"}]\r\n```\r\nThen to make one of them invalid:\r\n\r\n sqlite-utils test-dates.db 
\"update dates set date = '//' where id = 10\"", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1173023272, "label": "Options for how `r.parsedate()` should handle invalid dates"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1072834273", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/416", "id": 1072834273, "node_id": "IC_kwDOCGYnMM4_8ibh", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-18T21:36:05Z", "updated_at": "2022-03-18T21:36:05Z", "author_association": "OWNER", "body": "Python's `str.encode()` method has an `errors=` parameter that does something along these lines: https://docs.python.org/3/library/stdtypes.html#str.encode\r\n\r\n> *errors* may be given to set a different error handling scheme. The default for *errors* is `'strict'`, meaning that encoding errors raise a [`UnicodeError`](https://docs.python.org/3/library/exceptions.html#UnicodeError \"UnicodeError\"). 
Other possible values are `'ignore'`, `'replace'`, `'xmlcharrefreplace'`, `'backslashreplace'` and any other name registered via [`codecs.register_error()`](https://docs.python.org/3/library/codecs.html#codecs.register_error \"codecs.register_error\"),\r\n\r\nImitating this might be the way to go.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1173023272, "label": "Options for how `r.parsedate()` should handle invalid dates"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1072833174", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/416", "id": 1072833174, "node_id": "IC_kwDOCGYnMM4_8iKW", "user": {"value": 9599, "label": "simonw"}, "created_at": "2022-03-18T21:34:06Z", "updated_at": "2022-03-18T21:34:06Z", "author_association": "OWNER", "body": "Good call-out: right now the `parsedate()` and `parsedatetime()` functions both terminate with an exception if they hit something invalid: https://sqlite-utils.datasette.io/en/stable/cli.html#sqlite-utils-convert-recipes\r\n\r\nIt would be better if this was configurable by the user (and properly documented) - options could include \"set null if date is invalid\" and \"leave the value as it is if invalid\" in addition to throwing an error.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 1173023272, "label": "Options for how `r.parsedate()` should handle invalid dates"}, "performed_via_github_app": null}