github
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1229438242 | https://api.github.com/repos/simonw/sqlite-utils/issues/420 | 1229438242 | IC_kwDOCGYnMM5JR70i | 9599 | 2022-08-28T11:34:21Z | 2022-08-28T11:34:37Z | OWNER | I found a fix that makes that `global` workaround unnecessary: - #472 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1178546862 | |
https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1082476727 | https://api.github.com/repos/simonw/sqlite-utils/issues/420 | 1082476727 | IC_kwDOCGYnMM5AhUi3 | 770231 | 2022-03-29T23:52:38Z | 2022-03-29T23:52:38Z | NONE | @simonw Thanks for looking into it and documenting the solution! | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1178546862 | |
https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1081047053 | https://api.github.com/repos/simonw/sqlite-utils/issues/420 | 1081047053 | IC_kwDOCGYnMM5Ab3gN | 9599 | 2022-03-28T19:22:37Z | 2022-03-28T19:22:37Z | OWNER | Wrote about this in my weeknotes: https://simonwillison.net/2022/Mar/28/datasette-auth0/#new-features-as-documentation | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1178546862 | |
https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1080141111 | https://api.github.com/repos/simonw/sqlite-utils/issues/420 | 1080141111 | IC_kwDOCGYnMM5AYaU3 | 9599 | 2022-03-28T03:25:57Z | 2022-03-28T03:54:37Z | OWNER | So now this should solve your problem: ``` echo '[{"name": "notaword"}, {"name": "word"}] ' | python3 -m sqlite_utils insert listings.db listings - --convert ' import enchant d = enchant.Dict("en_US") def convert(row): global d row["is_dictionary_word"] = d.check(row["name"]) ' ``` | { "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 1, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1178546862 | |
https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1079404281 | https://api.github.com/repos/simonw/sqlite-utils/issues/420 | 1079404281 | IC_kwDOCGYnMM5AVmb5 | 9599 | 2022-03-25T20:19:50Z | 2022-03-25T20:19:50Z | OWNER | Now documented here: https://sqlite-utils.datasette.io/en/latest/cli.html#using-a-convert-function-to-execute-initialization | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1178546862 | |
https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1079384771 | https://api.github.com/repos/simonw/sqlite-utils/issues/420 | 1079384771 | IC_kwDOCGYnMM5AVhrD | 9599 | 2022-03-25T19:51:34Z | 2022-03-25T19:53:01Z | OWNER | This works: ``` % sqlite-utils insert dogs.db dogs dogs.json --convert ' import random print("seeding") random.seed(10) print(random.random()) def convert(row): global random print(row) row["random_score"] = random.random() ' seeding 0.5714025946899135 {'id': 1, 'name': 'Cleo'} {'id': 2, 'name': 'Pancakes'} {'id': 3, 'name': 'New dog'} (sqlite-utils) sqlite-utils % sqlite-utils rows dogs.db dogs [{"id": 1, "name": "Cleo", "random_score": 0.4288890546751146}, {"id": 2, "name": "Pancakes", "random_score": 0.5780913011344704}, {"id": 3, "name": "New dog", "random_score": 0.20609823213950174}] ``` Having to use `global random` inside the function is frustrating but apparently necessary. https://stackoverflow.com/a/56552138/6083 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1178546862 | |
https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1079376283 | https://api.github.com/repos/simonw/sqlite-utils/issues/420 | 1079376283 | IC_kwDOCGYnMM5AVfmb | 9599 | 2022-03-25T19:39:30Z | 2022-03-25T19:43:35Z | OWNER | Actually this doesn't work as I thought. This demo shows that the initialization code is run once per item, not a single time at the start of the run: ``` % sqlite-utils insert dogs.db dogs dogs.json --convert ' import random print("seeding") random.seed(10) print(random.random()) def convert(row): print(row) row["random_score"] = random.random() ' seeding 0.5714025946899135 seeding 0.5714025946899135 seeding 0.5714025946899135 seeding 0.5714025946899135 ``` Also that `print(row)` line is not being printed anywhere that gets to the console for some reason. ... my mistake, that happened because I changed this line in order to try to get local imports to work: ```python try: exec(code, globals, locals) return globals["convert"] except (AttributeError, SyntaxError, NameError, KeyError, TypeError): ``` It should be `locals["convert"]` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1178546862 | |
https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1079243535 | https://api.github.com/repos/simonw/sqlite-utils/issues/420 | 1079243535 | IC_kwDOCGYnMM5AU_MP | 9599 | 2022-03-25T17:25:12Z | 2022-03-25T17:25:12Z | OWNER | That documentation is split across a few places. This is the only bit that talks about the `def convert()` pattern right now: - https://sqlite-utils.datasette.io/en/stable/cli.html#converting-data-in-columns But that's for `sqlite-utils convert` - the documentation for `sqlite-utils insert --convert` at https://sqlite-utils.datasette.io/en/stable/cli.html#applying-conversions-while-inserting-data doesn't mention it. Since both `sqlite-utils convert` and `sqlite-utils insert --convert` apply the same rules to the code, they should link to a shared explanation in the documentation. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1178546862 | |
https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1078343231 | https://api.github.com/repos/simonw/sqlite-utils/issues/420 | 1078343231 | IC_kwDOCGYnMM5ARjY_ | 9599 | 2022-03-24T21:16:10Z | 2022-03-24T21:17:20Z | OWNER | Aha! This may be possible already: https://github.com/simonw/sqlite-utils/blob/396f80fcc60da8dd844577114f7920830a2e5403/sqlite_utils/utils.py#L311-L316 And yes, this does indeed work - you can do something like this: ``` echo '{"name": "harry"}' | sqlite-utils insert db.db people - --convert ' import time # Simulate something expensive time.sleep(1) def convert(row): row["upper"] = row["name"].upper() ' ``` And after running that: ``` sqlite-utils dump db.db BEGIN TRANSACTION; CREATE TABLE [people] ( [name] TEXT, [upper] TEXT ); INSERT INTO "people" VALUES('harry','HARRY'); COMMIT; ``` So this is a documentation issue - there's a trick for it but I didn't know what the trick was! | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1178546862 | |
https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1078328774 | https://api.github.com/repos/simonw/sqlite-utils/issues/420 | 1078328774 | IC_kwDOCGYnMM5ARf3G | 9599 | 2022-03-24T21:12:33Z | 2022-03-24T21:12:33Z | OWNER | Here's how the `_compile_code()` mechanism works at the moment: https://github.com/simonw/sqlite-utils/blob/396f80fcc60da8dd844577114f7920830a2e5403/sqlite_utils/utils.py#L308-L342 At the end it does this: ```python return locals["fn"] ``` So it's already building and then returning a function. The question is if there's a sensible way to allow people to further customize that function by executing some code first, in a way that's easy to explain. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1178546862 | |
https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1078322301 | https://api.github.com/repos/simonw/sqlite-utils/issues/420 | 1078322301 | IC_kwDOCGYnMM5AReR9 | 9599 | 2022-03-24T21:10:52Z | 2022-03-24T21:10:52Z | OWNER | I can think of three ways forward: - Figure out a pattern that gets that local file import workaround to work - Add another option such as `--convert-init` that lets you pass code that will be executed once at the start - Come up with a pattern where the `--convert` code can run some initialization code and then return a function which will be called against each value I quite like the idea of that third option - I'm going to prototype it and see if I can work something out. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1178546862 | |
https://github.com/simonw/sqlite-utils/issues/420#issuecomment-1078315922 | https://api.github.com/repos/simonw/sqlite-utils/issues/420 | 1078315922 | IC_kwDOCGYnMM5ARcuS | 9599 | 2022-03-24T21:09:27Z | 2022-03-24T21:09:27Z | OWNER | Yeah, this is WAY harder than it should be. There's a clumsy workaround you could use which looks something like this: create a file `my_enchant.py` containing: ```python import enchant d = enchant.Dict("en_US") def check(word): return d.check(word) ``` Then run `sqlite-utils` like this: ``` PYTHONPATH=. cat items.json | jq '.data' | sqlite-utils insert listings.db listings - --convert 'my_enchant.check(value)' --import my_enchant ``` Except I tried that and it doesn't work! I don't know the right pattern for getting `--import` to work with modules in the same directory. So yeah, this is definitely a big feature gap. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1178546862 |
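
The `global` workaround in the comments above follows from how Python's `exec()` behaves when it is given separate globals and locals dictionaries: top-level names such as an imported module are stored in the locals dictionary, while a function defined by that code looks up names in the globals dictionary. The following is a minimal sketch in plain Python that reproduces the effect outside of sqlite-utils. The names `CODE`, `compile_separate` and `compile_shared` are hypothetical and exist only for illustration; this is not the sqlite-utils implementation, and it is not necessarily the fix that #472 applied.

```python
# Hypothetical --convert style snippet: one-time setup, then a convert() function.
CODE = """
import random
random.seed(10)

def convert(row):
    row["random_score"] = random.random()
    return row
"""


def compile_separate(code):
    # Assumption for illustration: separate globals and locals dictionaries,
    # as in the exec(code, globals, locals) call quoted in the thread.
    # "import random" binds the name in locals_, but convert() resolves
    # names through globals_, so calling it raises NameError.
    globals_, locals_ = {}, {}
    exec(code, globals_, locals_)
    return locals_["convert"]


def compile_shared(code):
    # One shared namespace: top-level names and the function's globals are
    # the same dictionary, so convert() can see "random" without a
    # "global random" statement.
    namespace = {}
    exec(code, namespace)
    return namespace["convert"]


if __name__ == "__main__":
    print(compile_shared(CODE)({"id": 1, "name": "Cleo"}))
    try:
        compile_separate(CODE)({"id": 1, "name": "Cleo"})
    except NameError as error:
        print("separate dictionaries:", error)
```

In both helpers the setup code at the top of the snippet runs once, when `exec()` is called, and only `convert()` runs per row, which is the initialization pattern this thread documents.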