github
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697037974 | https://api.github.com/repos/simonw/sqlite-utils/issues/42 | 697037974 | MDEyOklzc3VlQ29tbWVudDY5NzAzNzk3NA== | 9599 | 2020-09-22T23:39:31Z | 2020-09-22T23:39:31Z | OWNER | Documentation for `sqlite-utils extract`: https://sqlite-utils.readthedocs.io/en/latest/cli.html#extracting-columns-into-a-separate-table | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
470345929 | |
https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697031174 | https://api.github.com/repos/simonw/sqlite-utils/issues/42 | 697031174 | MDEyOklzc3VlQ29tbWVudDY5NzAzMTE3NA== | 9599 | 2020-09-22T23:16:00Z | 2020-09-22T23:16:00Z | OWNER | Trying this demo again: ``` wget 'https://raw.githubusercontent.com/wri/global-power-plant-database/master/output_database/global_power_plant_database.csv' sqlite-utils insert global.db power_plants global_power_plant_database.csv --csv sqlite-utils extract global.db power_plants country country_long --table countries --rename country_long name ``` It worked! | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
470345929 | |
https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697025403 | https://api.github.com/repos/simonw/sqlite-utils/issues/42 | 697025403 | MDEyOklzc3VlQ29tbWVudDY5NzAyNTQwMw== | 9599 | 2020-09-22T22:57:53Z | 2020-09-22T22:57:53Z | OWNER | The documentation for the `.extract()` method is here: https://sqlite-utils.readthedocs.io/en/latest/python-api.html#extracting-columns-into-a-separate-table | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
470345929 | |
https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697019944 | https://api.github.com/repos/simonw/sqlite-utils/issues/42 | 697019944 | MDEyOklzc3VlQ29tbWVudDY5NzAxOTk0NA== | 9599 | 2020-09-22T22:40:00Z | 2020-09-22T22:40:00Z | OWNER | I tried out the prototype of the CLI on the Global Power Plants data: ``` wget 'https://raw.githubusercontent.com/wri/global-power-plant-database/master/output_database/global_power_plant_database.csv' sqlite-utils insert global.db power_plants global_power_plant_database.csv --csv sqlite-utils extract global.db power_plants country country_long ``` This threw an error because `rowid` columns are not yet supported. I fixed that like so: ``` sqlite-utils transform global.db power_plants --rename rowid id sqlite-utils extract global.db power_plants country country_long ``` That worked! But it didn't play great with Datasette, because the resulting extracted table had columns `country` and `country_long` and neither of those are called `name` or `value` or `title`. Based on this I need to add `rowid` table support AND I need to implement the proposed `rename=` argument for renaming columns on their way into the new table. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
470345929 | |
https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697013681 | https://api.github.com/repos/simonw/sqlite-utils/issues/42 | 697013681 | MDEyOklzc3VlQ29tbWVudDY5NzAxMzY4MQ== | 9599 | 2020-09-22T22:22:49Z | 2020-09-22T22:22:49Z | OWNER | The command-line version of this needs to accept a table and one or more columns, then a `--table` and `--fk-column` option. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
470345929 | |
https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697012111 | https://api.github.com/repos/simonw/sqlite-utils/issues/42 | 697012111 | MDEyOklzc3VlQ29tbWVudDY5NzAxMjExMQ== | 9599 | 2020-09-22T22:18:13Z | 2020-09-22T22:18:13Z | OWNER | Here's how I'm generating the examples for the documentation: ``` In [2]: import sqlite_utils In [3]: db = sqlite_utils.Database(memory=True) In [4]: db["Trees"].insert({"id": 1, "TreeAddress": "52 Vine St", "CommonName": ...: "Palm", "LatinName": "foo"}, pk="id") Out[4]: <Table Trees (id, TreeAddress, CommonName, LatinName)> In [5]: db["Trees"].extract(["CommonName", "LatinName"], table="Species", fk_col ...: umn="species_id") In [6]: print(db["Trees"].schema) CREATE TABLE "Trees" ( [id] INTEGER PRIMARY KEY, [TreeAddress] TEXT, [species_id] INTEGER, FOREIGN KEY(species_id) REFERENCES Species(id) ) In [7]: print(db["Species"].schema) CREATE TABLE [Species] ( [id] INTEGER PRIMARY KEY, [CommonName] TEXT, [LatinName] TEXT ) ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
470345929 | |
https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696987925 | https://api.github.com/repos/simonw/sqlite-utils/issues/42 | 696987925 | MDEyOklzc3VlQ29tbWVudDY5Njk4NzkyNQ== | 9599 | 2020-09-22T21:19:04Z | 2020-09-22T21:19:04Z | OWNER | Need to make sure this works correctly for `rowid` tables. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
470345929 | |
https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696987257 | https://api.github.com/repos/simonw/sqlite-utils/issues/42 | 696987257 | MDEyOklzc3VlQ29tbWVudDY5Njk4NzI1Nw== | 9599 | 2020-09-22T21:17:34Z | 2020-09-22T21:17:34Z | OWNER | What to do if the table already exists? The `.lookup()` function already knows how to modify an existing table to create the correct constraints etc, so I'll rely on that mechanism. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
470345929 | |
https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696980709 | https://api.github.com/repos/simonw/sqlite-utils/issues/42 | 696980709 | MDEyOklzc3VlQ29tbWVudDY5Njk4MDcwOQ== | 9599 | 2020-09-22T21:05:07Z | 2020-09-22T21:05:07Z | OWNER | So `.extract()` probably takes a `batch_size=` argument too, which defaults to maybe 1000. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
470345929 | |
https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696980503 | https://api.github.com/repos/simonw/sqlite-utils/issues/42 | 696980503 | MDEyOklzc3VlQ29tbWVudDY5Njk4MDUwMw== | 9599 | 2020-09-22T21:04:45Z | 2020-09-22T21:04:45Z | OWNER | `table.extract()` can take an optional `progress=` argument which is a callback which will be used to report progress - called after each batch with `(num_done, total)`. It will get called with `(0, total)` once at the start to allow progress bars to be initialized. The command-line progress bar will use this. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
470345929 | |
https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696979626 | https://api.github.com/repos/simonw/sqlite-utils/issues/42 | 696979626 | MDEyOklzc3VlQ29tbWVudDY5Njk3OTYyNg== | 9599 | 2020-09-22T21:03:11Z | 2020-09-22T21:03:11Z | OWNER | And if you want to rename some of the columns in the new table: ```python db["trees"].extract(["common_name", "latin_name"], table="species", rename={"common_name": "name"}) ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
470345929 | |
https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696979168 | https://api.github.com/repos/simonw/sqlite-utils/issues/42 | 696979168 | MDEyOklzc3VlQ29tbWVudDY5Njk3OTE2OA== | 9599 | 2020-09-22T21:02:24Z | 2020-09-22T21:02:24Z | OWNER | In Python it looks like this: ```python # Simple case - species column species_id pointing to species table db["trees"].extract("species") # Setting a custom table db["trees"].extract("species", table="Species") # Custom foreign key column on trees db["trees"].extract("species", fk_column="species") # Extracting multiple columns db["trees"].extract(["common_name", "latin_name"]) # (this creates a lookup table called common_name_latin_name ref'd by common_name_latin_name_id) # Or with explicit table (fk_column here defaults to species_id because of the table name) db["trees"].extract(["common_name", "latin_name"], table="species") ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
470345929 | |
https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696976678 | https://api.github.com/repos/simonw/sqlite-utils/issues/42 | 696976678 | MDEyOklzc3VlQ29tbWVudDY5Njk3NjY3OA== | 9599 | 2020-09-22T20:57:57Z | 2020-09-22T20:57:57Z | OWNER | I think I understand the shape of this feature now. It lets you specify one or more columns on the source table which will be extracted into another table. It uses the `.lookup()` mechanism to populate that other table, which means each unique column value / pair / triple will be assigned an integer ID. That integer ID gets written back into the first of the columns that are being transformed. A `.transform()` call then converts that column to an integer (and drops the additional columns). Finally we set up the new foreign key relationship. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
470345929 | |
https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696893774 | https://api.github.com/repos/simonw/sqlite-utils/issues/42 | 696893774 | MDEyOklzc3VlQ29tbWVudDY5Njg5Mzc3NA== | 9599 | 2020-09-22T18:15:33Z | 2020-09-22T18:15:33Z | OWNER | I think the new foreign key column is called `company_name_id` by default in this example but can be customized by passing `--fk-column=xxx` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
470345929 | |
https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696893244 | https://api.github.com/repos/simonw/sqlite-utils/issues/42 | 696893244 | MDEyOklzc3VlQ29tbWVudDY5Njg5MzI0NA== | 9599 | 2020-09-22T18:14:33Z | 2020-09-22T18:14:45Z | OWNER | Thinking more about this one: ``` $ sqlite-utils extract my.db \ dea_sales company_name company_address \ --table companies ``` The goal here is to pull the company name and address pair out into a separate table. Some questions: - should this first verify that every company_name has just one company_address? I like the idea of a unique constraint on the created table for this. - what should the foreign key column that gets added to the `companies` table be called? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
470345929 | |
https://github.com/simonw/sqlite-utils/issues/42#issuecomment-513262013 | https://api.github.com/repos/simonw/sqlite-utils/issues/42 | 513262013 | MDEyOklzc3VlQ29tbWVudDUxMzI2MjAxMw== | 9599 | 2019-07-19T14:58:23Z | 2020-09-22T18:12:11Z | OWNER | CLI design idea: $ sqlite-utils extract my.db \ dea_sales company_name Here we just specify the original table and column - the new extracted table will automatically be called "company_name" and will have "id" and "value" columns, by default. To set a custom extract table: $ sqlite-utils extract my.db \ dea_sales company_name \ --table companies And for extracting multiple columns and renaming them on the created table, maybe something like this: $ sqlite-utils extract my.db \ dea_sales company_name company_address \ --table companies \ --column company_name name \ --column company_address address | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
470345929 | |
https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696567460 | https://api.github.com/repos/simonw/sqlite-utils/issues/42 | 696567460 | MDEyOklzc3VlQ29tbWVudDY5NjU2NzQ2MA== | 9599 | 2020-09-22T07:56:42Z | 2020-09-22T07:56:42Z | OWNER | `.transform()` has landed now which should make this a lot easier to solve. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
470345929 |