home / github / issue_comments

Menu
  • Search all tables
  • GraphQL API

issue_comments: 905021010

This data as json

html_url issue_url id node_id user created_at updated_at author_association body reactions issue performed_via_github_app
https://github.com/simonw/sqlite-utils/issues/319#issuecomment-905021010 https://api.github.com/repos/simonw/sqlite-utils/issues/319 905021010 IC_kwDOCGYnMM418YZS 66709385 2021-08-24T22:33:42Z 2021-08-24T22:33:42Z NONE

Oh, I misread. Yes some files will not be valid UTF-8, I'd throw a warning and continue (not adding that file) but if you want to get more elaborate you could allow to define a policy on what to do. Not adding the file, index binary content or use a conversion policy like the ones available on Python's decode.

From https://stackoverflow.com/questions/24616678/unicodedecodeerror-in-python-when-reading-a-file-how-to-ignore-the-error-and-ju : - 'ignore' ignores errors. Note that ignoring encoding errors can lead to data loss. - 'replace' causes a replacement marker (such as '?') to be inserted where there is malformed data. - 'surrogateescape' will represent any incorrect bytes as code points in the Unicode Private Use Area ranging from U+DC80 to U+DCFF. These private code points will then be turned back into the same bytes when the surrogateescape error handler is used when writing data. This is useful for processing files in an unknown encoding. - 'xmlcharrefreplace' is only supported when writing to a file. Characters not supported by the encoding are replaced with the appropriate XML character reference &#nnn;. - 'backslashreplace' (also only supported when writing) replaces unsupported characters with Python’s backslashed escape sequences.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
976399638  
Powered by Datasette · Queries took 1.185ms · About: github-to-sqlite