github
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006219956 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006219956 | IC_kwDOCGYnMM47-bK0 | 22429695 | 2022-01-06T01:51:54Z | 2022-01-06T06:22:25Z | NONE | # [Codecov](https://codecov.io/gh/simonw/sqlite-utils/pull/361?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) Report > Merging [#361](https://codecov.io/gh/simonw/sqlite-utils/pull/361?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (b7f0b88) into [main](https://codecov.io/gh/simonw/sqlite-utils/commit/f3fd8613113d21d44238a6ec54b375f5aa72c4e0?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (f3fd861) will **decrease** coverage by `0.05%`. > The diff coverage is `92.85%`. [![Impacted file tree graph](https://codecov.io/gh/simonw/sqlite-utils/pull/361/graphs/tree.svg?width=650&height=150&src=pr&token=O0X3703L9P&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)](https://codecov.io/gh/simonw/sqlite-utils/pull/361?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) ```diff @@ Coverage Diff @@ ## main #361 +/- ## ========================================== - Coverage 96.49% 96.44% -0.06% ========================================== Files 5 5 Lines 2283 2306 +23 ========================================== + Hits 2203 2224 +21 - Misses 80 82 +2 ``` | [Impacted Files](https://codecov.io/gh/simonw/sqlite-utils/pull/361?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) | Coverage Δ | | |---|---|---| | [sqlite\_utils/cli.py](https://codecov.io/gh/simonw/sqlite-utils/pull/361/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison#diff-c3FsaXRlX3V0aWxzL2NsaS5weQ==) | `95.49% <92.00%> (-0.1… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006315145 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006315145 | IC_kwDOCGYnMM47-yaJ | 9599 | 2022-01-06T06:20:51Z | 2022-01-06T06:20:51Z | OWNER | This is all documented. I'm going to rebase-merge it to keep the individual commits. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006311742 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006311742 | IC_kwDOCGYnMM47-xk- | 9599 | 2022-01-06T06:12:19Z | 2022-01-06T06:12:19Z | OWNER | Got that working: ``` % echo 'This is cool' | sqlite-utils insert words.db words - --text --convert '({"word": w} for w in text.split())' % sqlite-utils dump words.db BEGIN TRANSACTION; CREATE TABLE [words] ( [word] TEXT ); INSERT INTO "words" VALUES('This'); INSERT INTO "words" VALUES('is'); INSERT INTO "words" VALUES('cool'); COMMIT; ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006309834 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006309834 | IC_kwDOCGYnMM47-xHK | 9599 | 2022-01-06T06:08:01Z | 2022-01-06T06:08:01Z | OWNER | For `--text` the conversion function should be allowed to return an iterable instead of a dictionary, in which case it will be treated as the full list of records to be inserted. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006301546 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006301546 | IC_kwDOCGYnMM47-vFq | 9599 | 2022-01-06T05:44:47Z | 2022-01-06T05:44:47Z | OWNER | Just need documentation for `--convert` now against the various different types of input. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006300280 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006300280 | IC_kwDOCGYnMM47-ux4 | 9599 | 2022-01-06T05:40:45Z | 2022-01-06T05:40:45Z | OWNER | I'm going to rename `--all` to `--text`: > - Use `--text` to write the entire input to a column called "text" To avoid that clash with Python's `all()` function. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006299778 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006299778 | IC_kwDOCGYnMM47-uqC | 9599 | 2022-01-06T05:39:10Z | 2022-01-06T05:39:10Z | OWNER | `all` is a bad variable name because it clashes with the Python `all()` built-in function. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006295276 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006295276 | IC_kwDOCGYnMM47-tjs | 9599 | 2022-01-06T05:26:11Z | 2022-01-06T05:26:11Z | OWNER | Here's the traceback if your `--convert` function doesn't return a dict right now: ``` % sqlite-utils insert /tmp/all.db blah /tmp/log.log --convert 'all.upper()' --all Traceback (most recent call last): File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/bin/sqlite-utils", line 33, in <module> sys.exit(load_entry_point('sqlite-utils', 'console_scripts', 'sqlite-utils')()) File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 1137, in __call__ return self.main(*args, **kwargs) File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 1062, in main rv = self.invoke(ctx) File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 1668, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 1404, in invoke return ctx.invoke(self.callback, **ctx.params) File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 763, in invoke return __callback(*args, **kwargs) File "/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/cli.py", line 949, in insert insert_upsert_implementation( File "/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/cli.py", line 834, in insert_upsert_implementation db[table].insert_all( File "/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/db.py", line 2602, in insert_all first_record = next(records) File "/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/db.py", line 3044, in fix_square_braces for record in records: File "/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/cli.py", line 831, in <genexpr> docs = (decode_base64_values(doc) for doc in docs) File "/Users/simon/Dropbox/Development/s… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006294777 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006294777 | IC_kwDOCGYnMM47-tb5 | 9599 | 2022-01-06T05:24:54Z | 2022-01-06T05:24:54Z | OWNER | > I added a custom error message for if the user's `--convert` code doesn't return a dict. That turned out to be a bad idea because it meant exhausting the iterator early for the check - before we got to the `.insert_all()` code that breaks the iterator up into chunks. I tried fixing that with `itertools.tee()` to run the generator twice but that's grossly memory-inefficient for large imports. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006288444 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006288444 | IC_kwDOCGYnMM47-r48 | 9599 | 2022-01-06T05:07:10Z | 2022-01-06T05:07:10Z | OWNER | And here's a demo of `--convert` used with `--all` - I added a custom error message for if the user's `--convert` code doesn't return a dict. ``` % sqlite-utils insert /tmp/all.db blah /tmp/log.log --convert 'all.upper()' --all Error: Records returned by your --convert function must be dicts % sqlite-utils insert /tmp/all.db blah /tmp/log.log --convert '{"all": all.upper()}' --all % sqlite-utils dump /tmp/all.db BEGIN TRANSACTION; CREATE TABLE [blah] ( [all] TEXT ); INSERT INTO "blah" VALUES('INFO: 127.0.0.1:60581 - "GET / HTTP/1.1" 200 OK INFO: 127.0.0.1:60581 - "GET /FOO/-/STATIC/APP.CSS?CEAD5A HTTP/1.1" 200 OK INFO: 127.0.0.1:60581 - "GET /FAVICON.ICO HTTP/1.1" 200 OK INFO: 127.0.0.1:60581 - "GET /FOO/TIDDLYWIKI HTTP/1.1" 200 OK INFO: 127.0.0.1:60581 - "GET /FOO/-/STATIC/APP.CSS?CEAD5A HTTP/1.1" 200 OK INFO: 127.0.0.1:60584 - "GET /FOO/-/STATIC/SQL-FORMATTER-2.3.3.MIN.JS HTTP/1.1" 200 OK INFO: 127.0.0.1:60586 - "GET /FOO/-/STATIC/CODEMIRROR-5.57.0.MIN.JS HTTP/1.1" 200 OK INFO: 127.0.0.1:60585 - "GET /FOO/-/STATIC/CODEMIRROR-5.57.0.MIN.CSS HTTP/1.1" 200 OK INFO: 127.0.0.1:60588 - "GET /FOO/-/STATIC/CODEMIRROR-5.57.0-SQL.MIN.JS HTTP/1.1" 200 OK INFO: 127.0.0.1:60587 - "GET /FOO/-/STATIC/CM-RESIZE-1.0.1.MIN.JS HTTP/1.1" 200 OK INFO: 127.0.0.1:60586 - "GET /FOO/TIDDLYWIKI/TIDDLERS HTTP/1.1" 200 OK INFO: 127.0.0.1:60586 - "GET /FOO/-/STATIC/APP.CSS?CEAD5A HTTP/1.1" 200 OK INFO: 127.0.0.1:60584 - "GET /FOO/-/STATIC/TABLE.JS HTTP/1.1" 200 OK '); COMMIT; ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006284673 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006284673 | IC_kwDOCGYnMM47-q-B | 9599 | 2022-01-06T04:55:52Z | 2022-01-06T04:55:52Z | OWNER | Test code that just worked for me: ``` sqlite-utils insert /tmp/blah.db blah /tmp/log.log --convert ' bits = line.split() return dict([("b_{}".format(i), bit) for i, bit in enumerate(bits)])' --lines ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006232013 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006232013 | IC_kwDOCGYnMM47-eHN | 9599 | 2022-01-06T02:21:35Z | 2022-01-06T02:21:35Z | OWNER | I'm having second thoughts about this bit: > Your Python code will be passed a "row" variable representing the imported row, and can return a modified row. > > If you are using `--lines` your code will be passed a "line" variable, and for `--all` an "all" variable. The code in question is this: https://github.com/simonw/sqlite-utils/blob/500a35ad4d91c8a6232134ce9406efec11bedff8/sqlite_utils/utils.py#L296-L303 Do I really want to add the complexity of supporting different variable names there? I think always using `value` might be better. Except... `value` made sense for the existing `sqlite-utils convert` command where you are running a conversion function against the value for the column in the current row - is it confusing if applied to lines or documents or `all`? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006230411 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006230411 | IC_kwDOCGYnMM47-duL | 9599 | 2022-01-06T02:17:35Z | 2022-01-06T02:17:35Z | OWNER | Documentation: https://github.com/simonw/sqlite-utils/blob/33223856ff7fe746b7b77750fbe5b218531d0545/docs/cli.rst#inserting-unstructured-data-with---lines-and---all - I went with a single section titled "Inserting unstructured data with --lines and --all" | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006220129 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006220129 | IC_kwDOCGYnMM47-bNh | 9599 | 2022-01-06T01:52:26Z | 2022-01-06T01:52:26Z | OWNER | I'm going to refactor all of the tests for `sqlite-utils insert` into a new `test_cli_insert.py` module. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 | |
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006219848 | https://api.github.com/repos/simonw/sqlite-utils/issues/361 | 1006219848 | IC_kwDOCGYnMM47-bJI | 9599 | 2022-01-06T01:51:36Z | 2022-01-06T01:51:36Z | OWNER | So far I've just implemented the new help: ``` % sqlite-utils insert --help Usage: sqlite-utils insert [OPTIONS] PATH TABLE FILE Insert records from FILE into a table, creating the table if it does not already exist. By default the input is expected to be a JSON array of objects. Or: - Use --nl for newline-delimited JSON objects - Use --csv or --tsv for comma-separated or tab-separated input - Use --lines to write each incoming line to a column called "line" - Use --all to write the entire input to a column called "all" You can also use --convert to pass a fragment of Python code that will be used to convert each input. Your Python code will be passed a "row" variable representing the imported row, and can return a modified row. If you are using --lines your code will be passed a "line" variable, and for --all an "all" variable. Options: --pk TEXT Columns to use as the primary key, e.g. id --flatten Flatten nested JSON objects, so {"a": {"b": 1}} becomes {"a_b": 1} --nl Expect newline-delimited JSON -c, --csv Expect CSV input --tsv Expect TSV input --lines Treat each line as a single value called 'line' --all Treat input as a single value called 'all' --convert TEXT Python code to convert each item --import TEXT Python modules to import --delimiter TEXT Delimiter to use for CSV files --quotechar TEXT Quote character to use for CSV/TSV --sniff Detect delimiter and quote character --no-headers CSV file has no header row --batch-size INTEGER Commit every X records --alter Alter existing table to add any missing columns --not-null TEXT Columns that should be created as NOT NULL --default <TEXT TEXT>... Default value that should be set for a column --e… | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1094890366 |