github
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/simonw/sqlite-utils/issues/281#issuecomment-864323438 | https://api.github.com/repos/simonw/sqlite-utils/issues/281 | 864323438 | MDEyOklzc3VlQ29tbWVudDg2NDMyMzQzOA== | 9599 | 2021-06-18T23:55:06Z | 2021-06-18T23:55:06Z | OWNER | The `-:json` idea is flawed: Click thinks that's the syntax for an option called `:json`. I'm going to do `stdin:json` - which means you can't open a file called `stdin` - but you could use `cat stdin \| sqlite-utils memory stdin:json ...` instead which is an OK workaround. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | 924992318 | |
https://github.com/simonw/sqlite-utils/issues/279#issuecomment-864208476 | https://api.github.com/repos/simonw/sqlite-utils/issues/279 | 864208476 | MDEyOklzc3VlQ29tbWVudDg2NDIwODQ3Ng== | 9599 | 2021-06-18T18:30:08Z | 2021-06-18T23:30:19Z | OWNER | So maybe this is a function which can either be told the format or, if none is provided, it detects one for itself. ```python def rows_from_file(fp, format=None): # ... yield from rows ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | 924990677 | |
https://github.com/simonw/sqlite-utils/issues/279#issuecomment-864207841 | https://api.github.com/repos/simonw/sqlite-utils/issues/279 | 864207841 | MDEyOklzc3VlQ29tbWVudDg2NDIwNzg0MQ== | 9599 | 2021-06-18T18:28:40Z | 2021-06-18T18:28:46Z | OWNER | ```python def detect_format(fp): # ... return "csv", fp, dialect # or return "json", fp, parsed_data # or return "json-nl", fp, docs ``` The mixed return types here are ugly. In all of these cases what we really want is to return a generator of `{...}` objects. So maybe it returns that instead. ```python def filepointer_to_documents(fp): # ... yield from documents ``` I can refactor `sqlite-utils insert` to use this new code too. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | 924990677 | |
https://github.com/simonw/sqlite-utils/issues/279#issuecomment-864206308 | https://api.github.com/repos/simonw/sqlite-utils/issues/279 | 864206308 | MDEyOklzc3VlQ29tbWVudDg2NDIwNjMwOA== | 9599 | 2021-06-18T18:25:04Z | 2021-06-18T18:25:04Z | OWNER | Or... since I'm not using a streaming JSON parser at the moment, if I think something is JSON I can load the entire thing into memory to validate it. I still need to detect newline-delimited JSON. For that I can consume the first line of the input to see if it's a valid JSON object, then maybe sniff the second line too? This does mean that if the input is a single line of GIANT JSON it will all be consumed into memory at once, but that's going to happen anyway. So I need a function which, given a file pointer, consumes from it, detects the type, then returns that type AND a file pointer to the beginning of the file again. I can use `io.BufferedReader` for this. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | 924990677 | |
https://github.com/simonw/sqlite-utils/issues/279#issuecomment-864129273 | https://api.github.com/repos/simonw/sqlite-utils/issues/279 | 864129273 | MDEyOklzc3VlQ29tbWVudDg2NDEyOTI3Mw== | 9599 | 2021-06-18T15:47:47Z | 2021-06-18T15:47:47Z | OWNER | Detecting valid JSON is tricky - just because a stream starts with `[` or `{` doesn't mean the entire stream is valid JSON. You need to parse the entire stream to determine that for sure. One way to solve this would be with a custom state machine. Another would be to use the `ijson` streaming parser - annoyingly it throws the same exception class for invalid JSON for different reasons, but the `e.args[0]` for that exception includes human-readable text about the error - if it's anything other than `parse error: premature EOF` then it probably means the JSON was invalid. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | 924990677 | |
https://github.com/simonw/sqlite-utils/issues/278#issuecomment-864128489 | https://api.github.com/repos/simonw/sqlite-utils/issues/278 | 864128489 | MDEyOklzc3VlQ29tbWVudDg2NDEyODQ4OQ== | 9599 | 2021-06-18T15:46:24Z | 2021-06-18T15:46:24Z | OWNER | A workaround could be to define a bash or zsh alias of some sort. | { "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | 923697888 | |
https://github.com/simonw/sqlite-utils/issues/278#issuecomment-864126781 | https://api.github.com/repos/simonw/sqlite-utils/issues/278 | 864126781 | MDEyOklzc3VlQ29tbWVudDg2NDEyNjc4MQ== | 9599 | 2021-06-18T15:43:19Z | 2021-06-18T15:43:19Z | OWNER | I don't think it's possible to do this without breaking backwards compatibility, unfortunately. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | 923697888 | |
https://github.com/simonw/sqlite-utils/issues/279#issuecomment-864103005 | https://api.github.com/repos/simonw/sqlite-utils/issues/279 | 864103005 | MDEyOklzc3VlQ29tbWVudDg2NDEwMzAwNQ== | 9599 | 2021-06-18T15:04:15Z | 2021-06-18T15:04:15Z | OWNER | To detect JSON, check to see if the stream starts with `[` or `{` - maybe do something more sophisticated than that. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | 924990677 | |
https://github.com/simonw/sqlite-utils/issues/272#issuecomment-864101267 | https://api.github.com/repos/simonw/sqlite-utils/issues/272 | 864101267 | MDEyOklzc3VlQ29tbWVudDg2NDEwMTI2Nw== | 9599 | 2021-06-18T15:01:41Z | 2021-06-18T15:01:41Z | OWNER | I'll split the remaining work out into separate issues. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | 921878733 | |
https://github.com/simonw/sqlite-utils/pull/273#issuecomment-864099764 | https://api.github.com/repos/simonw/sqlite-utils/issues/273 | 864099764 | MDEyOklzc3VlQ29tbWVudDg2NDA5OTc2NA== | 9599 | 2021-06-18T14:59:27Z | 2021-06-18T14:59:27Z | OWNER | I'm going to merge this as-is and work on the JSON/TSV support in a separate issue. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | 922099793 | |
https://github.com/simonw/sqlite-utils/pull/277#issuecomment-864092515 | https://api.github.com/repos/simonw/sqlite-utils/issues/277 | 864092515 | MDEyOklzc3VlQ29tbWVudDg2NDA5MjUxNQ== | 9599 | 2021-06-18T14:47:57Z | 2021-06-18T14:47:57Z | OWNER | This is a neat improvement. | { "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 0 } | 923612361 |
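
The comments on issue 279 above describe a `rows_from_file(fp, format=None)` function that peeks at a stream with `io.BufferedReader`, guesses the format, and yields `{...}` rows. A minimal sketch of that idea follows. It is not the actual sqlite-utils implementation: the 2048-byte peek size, the UTF-8 decoding, the choice to treat newline-delimited JSON as a fallback under the `"json"` format name, and the assumption that anything not starting with `[` or `{` is CSV are all illustrative choices.

```python
import csv
import io
import json


def rows_from_file(fp, format=None):
    """Yield dict rows from a binary file object.

    Hypothetical sketch: if no format is given, guess "json" when the
    stream starts with "[" or "{", otherwise assume "csv".
    """
    buffered = io.BufferedReader(fp, buffer_size=4096)
    if format is None:
        # peek() returns buffered bytes without advancing the stream,
        # so the later read still starts at the beginning.
        first = buffered.peek(2048).lstrip()[:1]
        format = "json" if first in (b"[", b"{") else "csv"

    if format == "json":
        # No streaming parser here: the whole input is loaded into memory.
        text = buffered.read().decode("utf-8")
        try:
            data = json.loads(text)
            docs = data if isinstance(data, list) else [data]
        except json.JSONDecodeError:
            # Assumed fallback: treat the input as newline-delimited JSON.
            docs = [json.loads(line) for line in text.splitlines() if line.strip()]
        yield from docs
    elif format == "csv":
        text_fp = io.TextIOWrapper(buffered, encoding="utf-8", newline="")
        yield from csv.DictReader(text_fp)
    else:
        raise ValueError("Unsupported format: {}".format(format))


# Usage: each call consumes a fresh binary stream.
json_rows = list(rows_from_file(io.BytesIO(b'[{"id": 1}, {"id": 2}]')))
nl_rows = list(rows_from_file(io.BytesIO(b'{"id": 1}\n{"id": 2}\n')))
csv_rows = list(rows_from_file(io.BytesIO(b"id,name\n1,Cleo\n")))
```

Because the function is a generator, nothing is read (and no `ValueError` is raised) until the caller starts iterating. Returning rows rather than a `(format, fp, data)` tuple avoids the mixed return types the comments object to.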