{"html_url": "https://github.com/simonw/sqlite-utils/issues/26#issuecomment-696566750", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/26", "id": 696566750, "node_id": "MDEyOklzc3VlQ29tbWVudDY5NjU2Njc1MA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2020-09-22T07:55:00Z", "updated_at": "2020-09-22T07:55:00Z", "author_association": "OWNER", "body": "Problem: `extract` means something else now, see #47 and the upcoming work in #42.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 455486286, "label": "Mechanism for turning nested JSON into foreign keys / many-to-many"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/26#issuecomment-507051670", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/26", "id": 507051670, "node_id": "MDEyOklzc3VlQ29tbWVudDUwNzA1MTY3MA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-06-30T17:04:09Z", "updated_at": "2019-06-30T17:04:09Z", "author_association": "OWNER", "body": "I think the implementation of this will benefit from #23 (syntactic sugar for creating m2m records)", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 455486286, "label": "Mechanism for turning nested JSON into foreign keys / many-to-many"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501541902", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/26", "id": 501541902, "node_id": "MDEyOklzc3VlQ29tbWVudDUwMTU0MTkwMg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-06-13T04:15:22Z", "updated_at": "2019-06-13T16:55:42Z", "author_association": "OWNER", "body": "So maybe something like this:\r\n```\r\ncurl 
https://api.github.com/repos/simonw/datasette/pulls?state=all | \\\r\n sqlite-utils insert git.db pulls - \\\r\n --flatten=base \\\r\n --flatten=head \\\r\n --extract=user:users:id \\\r\n --extract=head_repo.license:licenses:key \\\r\n --extract=head_repo.owner:users \\\r\n --extract=head_repo \\\r\n --extract=base_repo.license:licenses:key \\\r\n --extract=base_repo.owner:users \\\r\n --extract=base_repo\r\n```\r\nIs the order of those nested `--extract` lines significant I wonder? It would be nice if the order didn't matter and the code figured out the right execution plan on its own.", "reactions": "{\"total_count\": 1, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 1, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 455486286, "label": "Mechanism for turning nested JSON into foreign keys / many-to-many"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501543688", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/26", "id": 501543688, "node_id": "MDEyOklzc3VlQ29tbWVudDUwMTU0MzY4OA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-06-13T04:26:15Z", "updated_at": "2019-06-13T04:26:15Z", "author_association": "OWNER", "body": "I may ignore `--flatten` for the moment - users can do their own flattening using `jq` if they need that.\r\n\r\n```\r\ncurl https://api.github.com/repos/simonw/datasette/pulls?state=all | jq \"\r\n [.[] | . 
+ {\r\n base_label: .base.label,\r\n base_ref: .base.ref,\r\n base_sha: .base.sha,\r\n base_user: .base.user,\r\n base_repo: .base.repo,\r\n head_label: .head.label,\r\n head_ref: .head.ref,\r\n head_sha: .head.sha,\r\n head_user: .head.user,\r\n head_repo: .head.repo\r\n } | del(.base, .head, ._links)]\r\n\"\r\n```\r\nOutput: https://gist.github.com/simonw/2703ed43fcfe96eb8cfeee7b558b61e1", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 455486286, "label": "Mechanism for turning nested JSON into foreign keys / many-to-many"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501542025", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/26", "id": 501542025, "node_id": "MDEyOklzc3VlQ29tbWVudDUwMTU0MjAyNQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-06-13T04:16:10Z", "updated_at": "2019-06-13T04:16:42Z", "author_association": "OWNER", "body": "So for `--extract` the format is `path-to-property:table-to-extract-to:primary-key`\r\n\r\nIf we find an array (as opposed to a direct nested object) at the end of the dotted path we do a m2m table.\r\n\r\nAnd if `primary-key` is omitted maybe we do the rowid thing with a foreign key back to ourselves.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 455486286, "label": "Mechanism for turning nested JSON into foreign keys / many-to-many"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501539452", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/26", "id": 501539452, "node_id": "MDEyOklzc3VlQ29tbWVudDUwMTUzOTQ1Mg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-06-13T03:59:32Z", 
"updated_at": "2019-06-13T03:59:32Z", "author_association": "OWNER", "body": "Another complexity from the https://api.github.com/repos/simonw/datasette/pulls example:\r\n\r\nWe don't actually want `head` and `base` to be pulled out into a separate table. Our ideal table design would probably look something like this:\r\n\r\n- `url`: ...\r\n- `id`: `285698310`\r\n- ...\r\n- `user_id`: `9599` -> refs `users`\r\n- `head_label`: `simonw:travis-38dev`\r\n- `head_ref`: `travis-38dev`\r\n- `head_sha`: `f274f9004302c5ca75ce89d0abfd648457957e31`\r\n- `head_user_id`: `9599` -> refs `users`\r\n- `head_repo_id`: `107914493` -> refs `repos`\r\n- `base_label`: `simonw:master`\r\n- `base_ref`: `master`\r\n- `base_sha`: `5e8fbf7f6fbc0b63d0479da3806dd9ccd6aaa945`\r\n- `base_user_id`: `9599` -> refs `users`\r\n- `base_repo_id`: `107914493` -> refs `repos`\r\n\r\nSo the nested `head` and `base` sections here, instead of being extracted into another table, were flattened into their own columns.\r\n\r\nSo perhaps we need a flatten-nested-into-columns mechanism which can be used in conjunction with an extract-to-tables mechanism.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 455486286, "label": "Mechanism for turning nested JSON into foreign keys / many-to-many"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501538100", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/26", "id": 501538100, "node_id": "MDEyOklzc3VlQ29tbWVudDUwMTUzODEwMA==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-06-13T03:51:27Z", "updated_at": "2019-06-13T03:51:27Z", "author_association": "OWNER", "body": "I like the term \"extract\" for what we are doing here, partly because that's the terminology I used in `csvs-to-sqlite`.", "reactions": "{\"total_count\": 0, 
\"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 455486286, "label": "Mechanism for turning nested JSON into foreign keys / many-to-many"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501537812", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/26", "id": 501537812, "node_id": "MDEyOklzc3VlQ29tbWVudDUwMTUzNzgxMg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-06-13T03:49:37Z", "updated_at": "2019-06-13T03:50:39Z", "author_association": "OWNER", "body": "There's an interesting difference here between nested objects with a primary-key style ID and nested objects without.\r\n\r\nIf a nested object does not have a primary key, we could still shift it out to another table but it would need to be in a context where it has an automatic foreign key back to our current record.\r\n\r\nA good example of something where that would be useful is the `outageDevices` key in https://github.com/simonw/pge-outages/blob/d890d09ff6e2997948028528e06c82e1efe30365/pge-outages.json#L13-L25 \r\n\r\n```json\r\n {\r\n \"outageNumber\": \"407367\",\r\n \"outageStartTime\": \"1560355216\",\r\n \"crewCurrentStatus\": \"PG&E repair crew is on-site working to restore power.\",\r\n \"currentEtor\": \"1560376800\",\r\n \"cause\": \"Our preliminary determination is that your outage was caused by scheduled maintenance work.\",\r\n \"estCustAffected\": \"3\",\r\n \"lastUpdateTime\": \"1560355709\",\r\n \"hazardFlag\": \"0\",\r\n \"latitude\": \"37.35629\",\r\n \"longitude\": \"-119.70469\",\r\n \"outageDevices\": [\r\n {\r\n \"latitude\": \"37.35409\",\r\n \"longitude\": \"-119.70575\"\r\n },\r\n {\r\n \"latitude\": \"37.35463\",\r\n \"longitude\": \"-119.70525\"\r\n },\r\n {\r\n \"latitude\": \"37.35562\",\r\n \"longitude\": \"-119.70467\"\r\n }\r\n ],\r\n \"regionName\": \"Ahwahnee\"\r\n }\r\n```\r\n\r\nThese could 
either be inserted into an `outageDevices` table that uses `rowid`... or we could have a mechanism where we automatically derive a primary key for them based on a hash of their data, hence avoiding creating duplicates even though we don't have a provided primary key.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 455486286, "label": "Mechanism for turning nested JSON into foreign keys / many-to-many"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501536495", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/26", "id": 501536495, "node_id": "MDEyOklzc3VlQ29tbWVudDUwMTUzNjQ5NQ==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-06-13T03:40:21Z", "updated_at": "2019-06-13T03:40:21Z", "author_association": "OWNER", "body": "I think I can do something here with a very simple `head.repo.owner` path syntax. Normally this kind of syntax would have to take the difference between dictionaries and lists into account but I don't think that matters here.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 455486286, "label": "Mechanism for turning nested JSON into foreign keys / many-to-many"}, "performed_via_github_app": null} {"html_url": "https://github.com/simonw/sqlite-utils/issues/26#issuecomment-501508302", "issue_url": "https://api.github.com/repos/simonw/sqlite-utils/issues/26", "id": 501508302, "node_id": "MDEyOklzc3VlQ29tbWVudDUwMTUwODMwMg==", "user": {"value": 9599, "label": "simonw"}, "created_at": "2019-06-13T00:57:52Z", "updated_at": "2019-06-13T00:57:52Z", "author_association": "OWNER", "body": "Two challenges here:\r\n\r\n1. We need a way to specify which tables should be used - e.g. 
\"put records from the `\"user\"` key in a `users` table, put multiple records from the `\"labels\"` key in a table called `labels`\" (we can pick an automatic name for the m2m table, though it might be nice to have an option to customize it)\r\n\r\n2. Should we deal with nested objects? Consider https://api.github.com/repos/simonw/datasette/pulls for example:\r\n\r\nHere we have `head.user` as a user, `head.repo` as a repo, and `head.repo.owner` as another user.\r\n\r\nIdeally our mechanism for specifying which table things should be pulled out into would handle this, but it's getting a bit complicated.", "reactions": "{\"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "issue": {"value": 455486286, "label": "Mechanism for turning nested JSON into foreign keys / many-to-many"}, "performed_via_github_app": null}