html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,issue,performed_via_github_app https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697037974,https://api.github.com/repos/simonw/sqlite-utils/issues/42,697037974,MDEyOklzc3VlQ29tbWVudDY5NzAzNzk3NA==,9599,2020-09-22T23:39:31Z,2020-09-22T23:39:31Z,OWNER,Documentation for `sqlite-utils extract`: https://sqlite-utils.readthedocs.io/en/latest/cli.html#extracting-columns-into-a-separate-table,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",470345929, https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697031174,https://api.github.com/repos/simonw/sqlite-utils/issues/42,697031174,MDEyOklzc3VlQ29tbWVudDY5NzAzMTE3NA==,9599,2020-09-22T23:16:00Z,2020-09-22T23:16:00Z,OWNER,"Trying this demo again: ``` wget 'https://raw.githubusercontent.com/wri/global-power-plant-database/master/output_database/global_power_plant_database.csv' sqlite-utils insert global.db power_plants global_power_plant_database.csv --csv sqlite-utils extract global.db power_plants country country_long --table countries --rename country_long name ``` It worked!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",470345929, https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697025403,https://api.github.com/repos/simonw/sqlite-utils/issues/42,697025403,MDEyOklzc3VlQ29tbWVudDY5NzAyNTQwMw==,9599,2020-09-22T22:57:53Z,2020-09-22T22:57:53Z,OWNER,The documentation for the `.extract()` method is here: https://sqlite-utils.readthedocs.io/en/latest/python-api.html#extracting-columns-into-a-separate-table,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",470345929, https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697019944,https://api.github.com/repos/simonw/sqlite-utils/issues/42,697019944,MDEyOklzc3VlQ29tbWVudDY5NzAxOTk0NA==,9599,2020-09-22T22:40:00Z,2020-09-22T22:40:00Z,OWNER,"I tried out the prototype of the CLI on the Global Power Plants data: ``` wget 'https://raw.githubusercontent.com/wri/global-power-plant-database/master/output_database/global_power_plant_database.csv' sqlite-utils insert global.db power_plants global_power_plant_database.csv --csv sqlite-utils extract global.db power_plants country country_long ``` This threw an error because `rowid` columns are not yet supported. I fixed that like so: ``` sqlite-utils transform global.db power_plants --rename rowid id sqlite-utils extract global.db power_plants country country_long ``` That worked! But it didn't play great with Datasette, because the resulting extracted table had columns `country` and `country_long` and neither of those are called `name` or `value` or `title`. Based on this I need to add `rowid` table support AND I need to implement the proposed `rename=` argument for renaming columns on their way into the new table. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",470345929, https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697013681,https://api.github.com/repos/simonw/sqlite-utils/issues/42,697013681,MDEyOklzc3VlQ29tbWVudDY5NzAxMzY4MQ==,9599,2020-09-22T22:22:49Z,2020-09-22T22:22:49Z,OWNER,"The command-line version of this needs to accept a table and one or more columns, then a `--table` and `--fk-column` option.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",470345929, https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697012111,https://api.github.com/repos/simonw/sqlite-utils/issues/42,697012111,MDEyOklzc3VlQ29tbWVudDY5NzAxMjExMQ==,9599,2020-09-22T22:18:13Z,2020-09-22T22:18:13Z,OWNER,"Here's how I'm generating the examples for the documentation: ``` In [2]: import sqlite_utils In [3]: db = sqlite_utils.Database(memory=True) In [4]: db[""Trees""].insert({""id"": 1, ""TreeAddress"": ""52 Vine St"", ""CommonName"": ...: ""Palm"", ""LatinName"": ""foo""}, pk=""id"") Out[4]: In [5]: db[""Trees""].extract([""CommonName"", ""LatinName""], table=""Species"", fk_col ...: umn=""species_id"") In [6]: print(db[""Trees""].schema) CREATE TABLE ""Trees"" ( [id] INTEGER PRIMARY KEY, [TreeAddress] TEXT, [species_id] INTEGER, FOREIGN KEY(species_id) REFERENCES Species(id) ) In [7]: print(db[""Species""].schema) CREATE TABLE [Species] ( [id] INTEGER PRIMARY KEY, [CommonName] TEXT, [LatinName] TEXT ) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",470345929, https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696987925,https://api.github.com/repos/simonw/sqlite-utils/issues/42,696987925,MDEyOklzc3VlQ29tbWVudDY5Njk4NzkyNQ==,9599,2020-09-22T21:19:04Z,2020-09-22T21:19:04Z,OWNER,Need to make sure this works correctly for `rowid` tables.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",470345929, https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696987257,https://api.github.com/repos/simonw/sqlite-utils/issues/42,696987257,MDEyOklzc3VlQ29tbWVudDY5Njk4NzI1Nw==,9599,2020-09-22T21:17:34Z,2020-09-22T21:17:34Z,OWNER,"What to do if the table already exists? The `.lookup()` function already knows how to modify an existing table to create the correct constraints etc, so I'll rely on that mechanism.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",470345929, https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696980709,https://api.github.com/repos/simonw/sqlite-utils/issues/42,696980709,MDEyOklzc3VlQ29tbWVudDY5Njk4MDcwOQ==,9599,2020-09-22T21:05:07Z,2020-09-22T21:05:07Z,OWNER,"So `.extract()` probably takes a `batch_size=` argument too, which defaults to maybe 1000.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",470345929, https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696980503,https://api.github.com/repos/simonw/sqlite-utils/issues/42,696980503,MDEyOklzc3VlQ29tbWVudDY5Njk4MDUwMw==,9599,2020-09-22T21:04:45Z,2020-09-22T21:04:45Z,OWNER,"`table.extract()` can take an optional `progress=` argument which is a callback which will be used to report progress - called after each batch with `(num_done, total)`. It will get called with `(0, total)` once at the start to allow progress bars to be initialized. The command-line progress bar will use this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",470345929, https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696979626,https://api.github.com/repos/simonw/sqlite-utils/issues/42,696979626,MDEyOklzc3VlQ29tbWVudDY5Njk3OTYyNg==,9599,2020-09-22T21:03:11Z,2020-09-22T21:03:11Z,OWNER,"And if you want to rename some of the columns in the new table: ```python db[""trees""].extract([""common_name"", ""latin_name""], table=""species"", rename={""common_name"": ""name""}) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",470345929, https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696979168,https://api.github.com/repos/simonw/sqlite-utils/issues/42,696979168,MDEyOklzc3VlQ29tbWVudDY5Njk3OTE2OA==,9599,2020-09-22T21:02:24Z,2020-09-22T21:02:24Z,OWNER,"In Python it looks like this: ```python # Simple case - species column species_id pointing to species table db[""trees""].extract(""species"") # Setting a custom table db[""trees""].extract(""species"", table=""Species"") # Custom foreign key column on trees db[""trees""].extract(""species"", fk_column=""species"") # Extracting multiple columns db[""trees""].extract([""common_name"", ""latin_name""]) # (this creates a lookup table called common_name_latin_name ref'd by common_name_latin_name_id) # Or with explicit table (fk_column here defaults to species_id because of the table name) db[""trees""].extract([""common_name"", ""latin_name""], table=""species"") ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",470345929, https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696976678,https://api.github.com/repos/simonw/sqlite-utils/issues/42,696976678,MDEyOklzc3VlQ29tbWVudDY5Njk3NjY3OA==,9599,2020-09-22T20:57:57Z,2020-09-22T20:57:57Z,OWNER,"I think I understand the shape of this feature now. It lets you specify one or more columns on the source table which will be extracted into another table. It uses the `.lookup()` mechanism to populate that other table, which means each unique column value / pair / triple will be assigned an integer ID. That integer ID gets written back into the first of the columns that are being transformed. A `.transform()` call then converts that column to an integer (and drops the additional columns). Finally we set up the new foreign key relationship.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",470345929, https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696893774,https://api.github.com/repos/simonw/sqlite-utils/issues/42,696893774,MDEyOklzc3VlQ29tbWVudDY5Njg5Mzc3NA==,9599,2020-09-22T18:15:33Z,2020-09-22T18:15:33Z,OWNER,I think the new foreign key column is called `company_name_id` by default in this example but can be customized by passing `--fk-column=xxx`,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",470345929, https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696893244,https://api.github.com/repos/simonw/sqlite-utils/issues/42,696893244,MDEyOklzc3VlQ29tbWVudDY5Njg5MzI0NA==,9599,2020-09-22T18:14:33Z,2020-09-22T18:14:45Z,OWNER,"Thinking more about this one: ``` $ sqlite-utils extract my.db \ dea_sales company_name company_address \ --table companies ``` The goal here is to pull the company name and address pair out into a separate table. Some questions: - should this first verify that every company_name has just one company_address? I like the idea of a unique constraint on the created table for this. - what should the foreign key column that gets added to the `companies` table be called?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",470345929, https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696567460,https://api.github.com/repos/simonw/sqlite-utils/issues/42,696567460,MDEyOklzc3VlQ29tbWVudDY5NjU2NzQ2MA==,9599,2020-09-22T07:56:42Z,2020-09-22T07:56:42Z,OWNER,`.transform()` has landed now which should make this a lot easier to solve.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",470345929,