html_url,issue_url,id,node_id,user,user_label,created_at,updated_at,author_association,body,reactions,issue,issue_label,performed_via_github_app https://github.com/simonw/sqlite-utils/issues/26#issuecomment-696566750,https://api.github.com/repos/simonw/sqlite-utils/issues/26,696566750,MDEyOklzc3VlQ29tbWVudDY5NjU2Njc1MA==,9599,simonw,2020-09-22T07:55:00Z,2020-09-22T07:55:00Z,OWNER,"Problem: `extract` means something else now, see #47 and the upcoming work in #42.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",455486286,Mechanism for turning nested JSON into foreign keys / many-to-many, https://github.com/simonw/sqlite-utils/issues/26#issuecomment-507051670,https://api.github.com/repos/simonw/sqlite-utils/issues/26,507051670,MDEyOklzc3VlQ29tbWVudDUwNzA1MTY3MA==,9599,simonw,2019-06-30T17:04:09Z,2019-06-30T17:04:09Z,OWNER,I think the implementation of this will benefit from #23 (syntactic sugar for creating m2m records),"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",455486286,Mechanism for turning nested JSON into foreign keys / many-to-many, https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501541902,https://api.github.com/repos/simonw/sqlite-utils/issues/26,501541902,MDEyOklzc3VlQ29tbWVudDUwMTU0MTkwMg==,9599,simonw,2019-06-13T04:15:22Z,2019-06-13T16:55:42Z,OWNER,"So maybe something like this: ``` curl https://api.github.com/repos/simonw/datasette/pulls?state=all | \ sqlite-utils insert git.db pulls - \ --flatten=base \ --flatten=head \ --extract=user:users:id \ --extract=head_repo.license:licenses:key \ --extract=head_repo.owner:users \ --extract=head_repo --extract=base_repo.license:licenses:key \ --extract=base_repo.owner:users \ --extract=base_repo ``` Is the order of those nested `--extract` lines significant I wonder? 
It would be nice if the order didn't matter and the code figured out the right execution plan on its own.","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 1, ""rocket"": 0, ""eyes"": 0}",455486286,Mechanism for turning nested JSON into foreign keys / many-to-many, https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501543688,https://api.github.com/repos/simonw/sqlite-utils/issues/26,501543688,MDEyOklzc3VlQ29tbWVudDUwMTU0MzY4OA==,9599,simonw,2019-06-13T04:26:15Z,2019-06-13T04:26:15Z,OWNER,"I may ignore `--flatten` for the moment - users can do their own flattening using `jq` if they need that. ``` curl https://api.github.com/repos/simonw/datasette/pulls?state=all | jq "" [.[] | . + { base_label: .base.label, base_ref: .base.ref, base_sha: .base.sha, base_user: .base.user, base_repo: .base.repo, head_label: .head.label, head_ref: .head.ref, head_sha: .head.sha, head_user: .head.user, head_repo: .head.repo } | del(.base, .head, ._links)] "" ``` Output: https://gist.github.com/simonw/2703ed43fcfe96eb8cfeee7b558b61e1","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",455486286,Mechanism for turning nested JSON into foreign keys / many-to-many, https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501542025,https://api.github.com/repos/simonw/sqlite-utils/issues/26,501542025,MDEyOklzc3VlQ29tbWVudDUwMTU0MjAyNQ==,9599,simonw,2019-06-13T04:16:10Z,2019-06-13T04:16:42Z,OWNER,"So for `--extract` the format is `path-to-property:table-to-extract-to:primary-key` If we find an array (as opposed to a direct nested object) at the end of the dotted path we do a m2m table. 
And if `primary-key` is omitted maybe we do the rowid thing with a foreign key back to ourselves.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",455486286,Mechanism for turning nested JSON into foreign keys / many-to-many, https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501539452,https://api.github.com/repos/simonw/sqlite-utils/issues/26,501539452,MDEyOklzc3VlQ29tbWVudDUwMTUzOTQ1Mg==,9599,simonw,2019-06-13T03:59:32Z,2019-06-13T03:59:32Z,OWNER,"Another complexity from the https://api.github.com/repos/simonw/datasette/pulls example: We don't actually want `head` and `base` to be pulled out into a separate table. Our ideal table design would probably look something like this: - `url`: ... - `id`: `285698310` - ... - `user_id`: `9599` -> refs `users` - `head_label`: `simonw:travis-38dev` - `head_ref`: `travis-38dev` - `head_sha`: `f274f9004302c5ca75ce89d0abfd648457957e31` - `head_user_id`: `9599` -> refs `users` - `head_repo_id`: `107914493` -> refs `repos` - `base_label`: `simonw:master` - `base_ref`: `master` - `base_sha`: `5e8fbf7f6fbc0b63d0479da3806dd9ccd6aaa945` - `base_user_id`: `9599` -> refs `users` - `base_repo_id`: `107914493` -> refs `repos` So the nested `head` and `base` sections here, instead of being extracted into another table, were flattened into their own columns. 
So perhaps we need a flatten-nested-into-columns mechanism which can be used in conjunction with an extract-to-tables mechanism.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",455486286,Mechanism for turning nested JSON into foreign keys / many-to-many, https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501538100,https://api.github.com/repos/simonw/sqlite-utils/issues/26,501538100,MDEyOklzc3VlQ29tbWVudDUwMTUzODEwMA==,9599,simonw,2019-06-13T03:51:27Z,2019-06-13T03:51:27Z,OWNER,"I like the term ""extract"" for what we are doing here, partly because that's the terminology I used in `csvs-to-sqlite`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",455486286,Mechanism for turning nested JSON into foreign keys / many-to-many, https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501537812,https://api.github.com/repos/simonw/sqlite-utils/issues/26,501537812,MDEyOklzc3VlQ29tbWVudDUwMTUzNzgxMg==,9599,simonw,2019-06-13T03:49:37Z,2019-06-13T03:50:39Z,OWNER,"There's an interesting difference here between nested objects with a primary-key style ID and nested objects without. If a nested object does not have a primary key, we could still shift it out to another table but it would need to be in a context where it has an automatic foreign key back to our current record. 
A good example of something where that would be useful is the `outageDevices` key in https://github.com/simonw/pge-outages/blob/d890d09ff6e2997948028528e06c82e1efe30365/pge-outages.json#L13-L25 ```json { ""outageNumber"": ""407367"", ""outageStartTime"": ""1560355216"", ""crewCurrentStatus"": ""PG&E repair crew is on-site working to restore power."", ""currentEtor"": ""1560376800"", ""cause"": ""Our preliminary determination is that your outage was caused by scheduled maintenance work."", ""estCustAffected"": ""3"", ""lastUpdateTime"": ""1560355709"", ""hazardFlag"": ""0"", ""latitude"": ""37.35629"", ""longitude"": ""-119.70469"", ""outageDevices"": [ { ""latitude"": ""37.35409"", ""longitude"": ""-119.70575"" }, { ""latitude"": ""37.35463"", ""longitude"": ""-119.70525"" }, { ""latitude"": ""37.35562"", ""longitude"": ""-119.70467"" } ], ""regionName"": ""Ahwahnee"" } ``` These could either be inserted into an `outageDevices` table that uses `rowid`... or we could have a mechanism where we automatically derive a primary key for them based on a hash of their data, hence avoiding creating duplicates even though we don't have a provided primary key.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",455486286,Mechanism for turning nested JSON into foreign keys / many-to-many, https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501536495,https://api.github.com/repos/simonw/sqlite-utils/issues/26,501536495,MDEyOklzc3VlQ29tbWVudDUwMTUzNjQ5NQ==,9599,simonw,2019-06-13T03:40:21Z,2019-06-13T03:40:21Z,OWNER,I think I can do something here with a very simple `head.repo.owner` path syntax. 
Normally this kind of syntax would have to take the difference between dictionaries and lists into account but I don't think that matters here.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",455486286,Mechanism for turning nested JSON into foreign keys / many-to-many, https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501508302,https://api.github.com/repos/simonw/sqlite-utils/issues/26,501508302,MDEyOklzc3VlQ29tbWVudDUwMTUwODMwMg==,9599,simonw,2019-06-13T00:57:52Z,2019-06-13T00:57:52Z,OWNER,"Two challenges here: 1. We need a way to specify which tables should be used - e.g. ""put records from the `""user""` key in a `users` table, put multiple records from the `""labels""` key in a table called `labels`"" (we can pick an automatic name for the m2m table, though it might be nice to have an option to customize it) 2. Should we deal with nested objects? Consider https://api.github.com/repos/simonw/datasette/pulls for example: Here we have `head.user` as a user, `head.repo` as a repo, and `head.repo.owner` as another user. Ideally our mechanism for specifying which table things should be pulled out into would handle this, but it's getting a bit complicated.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",455486286,Mechanism for turning nested JSON into foreign keys / many-to-many,