{"html_url": "https://github.com/dogsheep/evernote-to-sqlite/issues/6#issuecomment-706785201", "issue_url": "https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/6", "id": 706785201, "node_id": "MDEyOklzc3VlQ29tbWVudDcwNjc4NTIwMQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-10-11T23:29:39Z", "updated_at": "2020-10-11T23:29:39Z", "author_association": "MEMBER", "body": "It looks to me like each of those `<item>` blocks has a number of guesses in order of confidence:\r\n```xml\r\n \r\n wonders,\r\n wanders,\r\n wonders ?\r\n wonders\r\n wonders.\r\n \r\n```\r\nSo maybe the best approach here is to just take the first `t` element within each `item`.\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 718949182, "label": "Better handling of OCR data"}, "performed_via_github_app": null} {"html_url": "https://github.com/dogsheep/evernote-to-sqlite/issues/6#issuecomment-706785086", "issue_url": "https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/6", "id": 706785086, "node_id": "MDEyOklzc3VlQ29tbWVudDcwNjc4NTA4Ng==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-10-11T23:28:50Z", "updated_at": "2020-10-11T23:28:50Z", "author_association": "MEMBER", "body": "The XML for the OCR stuff is a bit weird. Currently I'm doing this to it:\r\n\r\nhttps://github.com/dogsheep/evernote-to-sqlite/blob/c33d7b043a45eb3e88676e5fa3ce31755199d9f8/evernote_to_sqlite/utils.py#L70-L78\r\n\r\nThis can produce some odd results, for example:\r\n\r\n> Sure 'Sure, 'Sure. Sure, Sure. sure sure. sure ? If you If Yau [you live jive In m 1n an area devoid of natural wonders, wanders, wonders ? wonders wonders. your mind will be blown, blown' blown. blown ? -e i ? ,1 IL it ? at ? KY ? 
fl ft bat at\r\n\r\nWhich came from this image:\r\n\r\n![image](https://user-images.githubusercontent.com/9599/95692952-5dd7c880-0bde-11eb-939a-d10b800a4105.png)\r\n\r\nThe XML for that is:\r\n\r\n```xml\r\n\r\n \r\n Sure\r\n 'Sure,\r\n 'Sure.\r\n Sure,\r\n Sure.\r\n \r\n \r\n sure\r\n sure.\r\n sure ?\r\n \r\n \r\n If you\r\n If Yau\r\n [you\r\n \r\n \r\n live\r\n jive\r\n \r\n \r\n In\r\n m\r\n 1n\r\n \r\n \r\n an\r\n \r\n \r\n area\r\n \r\n \r\n devoid\r\n \r\n \r\n of\r\n \r\n \r\n natural\r\n \r\n \r\n wonders,\r\n wanders,\r\n wonders ?\r\n wonders\r\n wonders.\r\n \r\n \r\n your\r\n \r\n \r\n mind\r\n \r\n \r\n will\r\n \r\n \r\n be\r\n \r\n \r\n blown,\r\n blown'\r\n blown.\r\n blown ?\r\n \r\n \r\n -e\r\n \r\n \r\n i ?\r\n \r\n \r\n ,1\r\n \r\n \r\n IL\r\n \r\n \r\n it ?\r\n at ?\r\n KY ?\r\n \r\n \r\n fl\r\n ft\r\n bat\r\n at\r\n \r\n\r\n```\r\n", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 718949182, "label": "Better handling of OCR data"}, "performed_via_github_app": null}
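
The approach suggested in the comments above, taking the first `t` element within each `item`, could look roughly like the sketch below. It is not the code in `evernote_to_sqlite/utils.py`. The function name `first_guesses` and the `<ocr>` wrapper element in the sample are hypothetical. The sketch assumes only what the comments state: that the OCR XML consists of `item` elements, each holding `t` elements ordered from most to least confident guess.

```python
import xml.etree.ElementTree as ET


def first_guesses(ocr_xml):
    """Return the text of the first <t> in every <item>, joined by spaces.

    Hypothetical helper: assumes the <t> children of each <item> are
    already ordered by confidence, so the first one is the best guess.
    """
    root = ET.fromstring(ocr_xml)
    words = []
    for item in root.iter("item"):
        first = item.find("t")  # first <t> child in document order
        if first is not None and first.text:
            words.append(first.text.strip())
    return " ".join(words)


# The <ocr> wrapper is a stand-in; the real root element may differ.
sample = """
<ocr>
  <item><t>wonders,</t><t>wanders,</t><t>wonders ?</t><t>wonders</t><t>wonders.</t></item>
  <item><t>your</t></item>
  <item><t>blown,</t><t>blown'</t><t>blown.</t><t>blown ?</t></item>
</ocr>
"""

print(first_guesses(sample))  # wonders, your blown,
```

Keeping only the first guess drops the alternates (`wanders,`, `blown'` and so on) that pad out the example output quoted above, at the cost of losing a later guess in cases where it happens to be the correct one.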