html_url,issue_url,id,node_id,user,user_label,created_at,updated_at,author_association,body,reactions,issue,issue_label,performed_via_github_app
https://github.com/simonw/sqlite-utils/issues/42#issuecomment-513246831,https://api.github.com/repos/simonw/sqlite-utils/issues/42,513246831,MDEyOklzc3VlQ29tbWVudDUxMzI0NjgzMQ==,9599,simonw,2019-07-19T14:20:15Z,2019-07-19T14:20:49Z,OWNER,"Since these operations could take a long time against large tables, it would be neat if there was a progress bar option for the CLI command. The operations are full table scans so calculating progress shouldn't be too difficult.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",470345929,"table.extract(...) method and ""sqlite-utils extract"" command",
https://github.com/simonw/sqlite-utils/issues/42#issuecomment-513246124,https://api.github.com/repos/simonw/sqlite-utils/issues/42,513246124,MDEyOklzc3VlQ29tbWVudDUxMzI0NjEyNA==,9599,simonw,2019-07-19T14:18:35Z,2019-07-19T14:19:40Z,OWNER,"How about the Python version? That should be easier to design.

```python
db[""dea_sales""].extract(
    columns=[""company_name"", ""company_address""],
    to_table=""companies""
)
```

If we want to transform the extracted data (e.g. rename those columns) maybe support a `transform=` argument?

```python
db[""dea_sales""].extract(
    columns=[""company_name"", ""company_address""],
    to_table=""companies"",
    transform=lambda extracted: {
        ""name"": extracted[""company_name""],
        ""address"": extracted[""company_address""],
    }
)
```

This would create a new ""companies"" table with three columns: id, name and address.

Would also be nice if there was a syntax for saying ""... and use the value from this column as the primary key column in the newly created table"".","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",470345929,"table.extract(...)
method and ""sqlite-utils extract"" command",
https://github.com/simonw/sqlite-utils/issues/42#issuecomment-513244121,https://api.github.com/repos/simonw/sqlite-utils/issues/42,513244121,MDEyOklzc3VlQ29tbWVudDUxMzI0NDEyMQ==,9599,simonw,2019-07-19T14:13:33Z,2019-07-19T14:13:33Z,OWNER,"So what could the interface to this look like? Especially for the CLI?

One option:

    sqlite-utils extract dea_sales company_name companies name

Tricky thing here is that it's quite a large number of positional arguments:

    sqlite-utils extract dea_sales company_name companies  name
                         Table     column       New table  New column (maybe optional?)

It would be great if this could support multiple columns, for example if a spreadsheet has a “Company Name”, “Company Address” pair of fields that always match each other and are duplicated many times.

This could be handled by creating the new table with two columns that are indexed as a unique compound key. Then you can easily get-or-create on the pairs (or triples or whatever) from the original table.

Challenge here is what does the CLI syntax look like. Something like this?

    $ sqlite-utils extract dea_sales -c company_name -c company_address \
        --to companies --to-col name --to-col address

Perhaps the columns in the new table are FORCED to be the same as the old ones, hence avoiding some options? Bit restrictive… maybe they default to the same but you can customize?

    $ sqlite-utils extract dea_sales -c company_name -c company_address -t companies","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",470345929,"table.extract(...) method and ""sqlite-utils extract"" command",