```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",986829194,
https://github.com/dogsheep/genome-to-sqlite/issues/1#issuecomment-765495861,https://api.github.com/repos/dogsheep/genome-to-sqlite/issues/1,765495861,MDEyOklzc3VlQ29tbWVudDc2NTQ5NTg2MQ==,25372415,2021-01-22T15:44:00Z,2021-01-22T15:44:00Z,NONE,"Risk of autoimmune disorders: https://www.snpedia.com/index.php/Genotype
```
select rsid, genotype, case genotype
when 'AA' then '2x risk of rheumatoid arthritis and other autoimmune diseases'
when 'GG' then 'Normal risk for autoimmune disorders'
end as interpretation from genome where rsid = 'rs2476601'
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",496415321,
https://github.com/dogsheep/genome-to-sqlite/issues/1#issuecomment-765498984,https://api.github.com/repos/dogsheep/genome-to-sqlite/issues/1,765498984,MDEyOklzc3VlQ29tbWVudDc2NTQ5ODk4NA==,25372415,2021-01-22T15:48:25Z,2021-01-22T15:49:33Z,NONE,"The ""Warrior Gene"" https://www.snpedia.com/index.php/Rs4680
```
select rsid, genotype, case genotype
when 'AA' then '(worrier) advantage in memory and attention tasks'
when 'AG' then 'Intermediate dopamine levels, other effects'
when 'GG' then '(warrior) multiple associations, see details'
end as interpretation from genome where rsid = 'rs4680'
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",496415321,
https://github.com/dogsheep/genome-to-sqlite/issues/1#issuecomment-765502845,https://api.github.com/repos/dogsheep/genome-to-sqlite/issues/1,765502845,MDEyOklzc3VlQ29tbWVudDc2NTUwMjg0NQ==,25372415,2021-01-22T15:53:19Z,2021-01-22T15:53:19Z,NONE,"rs7903146 Influences risk of Type-2 diabetes
https://www.snpedia.com/index.php/Rs7903146
```
select rsid, genotype, case genotype
when 'CC' then 'Normal (lower) risk of Type 2 Diabetes and Gestational Diabetes.'
when 'CT' then '1.4x increased risk for diabetes (and perhaps colon cancer).'
when 'TT' then '2x increased risk for Type-2 diabetes'
end as interpretation from genome where rsid = 'rs7903146'
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",496415321,
https://github.com/dogsheep/genome-to-sqlite/issues/1#issuecomment-765506901,https://api.github.com/repos/dogsheep/genome-to-sqlite/issues/1,765506901,MDEyOklzc3VlQ29tbWVudDc2NTUwNjkwMQ==,25372415,2021-01-22T15:58:41Z,2021-01-22T15:58:58Z,NONE,"rs10757274 and rs2383206 can both indicate a higher risk of heart disease
https://www.snpedia.com/index.php/Rs2383206
```
select rsid, genotype, case genotype
when 'AA' then 'Normal'
when 'AG' then '~1.2x increased risk for heart disease'
when 'GG' then '~1.3x increased risk for heart disease'
end as interpretation from genome where rsid = 'rs10757274'
```
```
select rsid, genotype, case genotype
when 'AA' then 'Normal'
when 'AG' then '1.4x increased risk for heart disease'
when 'GG' then '1.7x increased risk for heart disease'
end as interpretation from genome where rsid = 'rs2383206'
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",496415321,
https://github.com/dogsheep/genome-to-sqlite/issues/1#issuecomment-765523517,https://api.github.com/repos/dogsheep/genome-to-sqlite/issues/1,765523517,MDEyOklzc3VlQ29tbWVudDc2NTUyMzUxNw==,25372415,2021-01-22T16:20:25Z,2021-01-22T16:20:25Z,NONE,"rs53576: the oxytocin receptor (OXTR) gene
```
select rsid, genotype, case genotype
when 'AA' then 'Lack of empathy?'
when 'AG' then 'Lack of empathy?'
when 'GG' then 'Optimistic and empathetic; handle stress well'
end as interpretation from genome where rsid = 'rs53576'
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",496415321,
https://github.com/dogsheep/genome-to-sqlite/issues/1#issuecomment-765525338,https://api.github.com/repos/dogsheep/genome-to-sqlite/issues/1,765525338,MDEyOklzc3VlQ29tbWVudDc2NTUyNTMzOA==,25372415,2021-01-22T16:22:44Z,2021-01-22T16:22:44Z,NONE,"rs1333049 associated with coronary artery disease
https://www.snpedia.com/index.php/Rs1333049
```
select rsid, genotype, case genotype
when 'CC' then '1.9x increased risk for coronary artery disease'
when 'CG' then '1.5x increased risk for CAD'
when 'GG' then 'normal'
end as interpretation from genome where rsid = 'rs1333049'
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",496415321,
https://github.com/dogsheep/genome-to-sqlite/issues/1#issuecomment-831004775,https://api.github.com/repos/dogsheep/genome-to-sqlite/issues/1,831004775,MDEyOklzc3VlQ29tbWVudDgzMTAwNDc3NQ==,25372415,2021-05-03T03:46:23Z,2021-05-03T03:46:23Z,NONE,"RS1800955 is related to novelty seeking and ADHD
https://www.snpedia.com/index.php/Rs1800955
```
select rsid, genotype, case genotype
when 'CC' then 'increased susceptibility to novelty seeking'
when 'CT' then 'increased susceptibility to novelty seeking'
when 'TT' then 'normal'
end as interpretation from genome where rsid = 'rs1800955'
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",496415321,
https://github.com/dogsheep/github-to-sqlite/issues/15#issuecomment-605439685,https://api.github.com/repos/dogsheep/github-to-sqlite/issues/15,605439685,MDEyOklzc3VlQ29tbWVudDYwNTQzOTY4NQ==,2029,2020-03-28T12:17:01Z,2020-03-28T12:17:01Z,NONE,"That looks great, thanks!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",544571092,
https://github.com/dogsheep/github-to-sqlite/issues/16#issuecomment-571412923,https://api.github.com/repos/dogsheep/github-to-sqlite/issues/16,571412923,MDEyOklzc3VlQ29tbWVudDU3MTQxMjkyMw==,15092,2020-01-07T03:06:46Z,2020-01-07T03:06:46Z,NONE,"I re-tried after doing `auth`, and I get the same result.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",546051181,
https://github.com/dogsheep/github-to-sqlite/issues/16#issuecomment-602136481,https://api.github.com/repos/dogsheep/github-to-sqlite/issues/16,602136481,MDEyOklzc3VlQ29tbWVudDYwMjEzNjQ4MQ==,15092,2020-03-22T02:08:57Z,2020-03-22T02:08:57Z,NONE,"I'd love to be using your library as a better cached gh layer for a new library I have built, replacing large parts of the very ugly https://github.com/jayvdb/pypidb/blob/master/pypidb/_github.py , and then probably being able to rebuild the setuppy chunk as a feature here at a later stage.
I would also need tokenless and netrc support, but I would be happy to add those bits.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",546051181,
https://github.com/dogsheep/github-to-sqlite/issues/33#issuecomment-622279374,https://api.github.com/repos/dogsheep/github-to-sqlite/issues/33,622279374,MDEyOklzc3VlQ29tbWVudDYyMjI3OTM3NA==,2029,2020-05-01T07:12:47Z,2020-05-01T07:12:47Z,NONE,"I also got it working with:
```yaml
run: echo ${{ secrets.github_token }} | github-to-sqlite auth
```","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",609950090,
https://github.com/dogsheep/github-to-sqlite/issues/38#issuecomment-623038148,https://api.github.com/repos/dogsheep/github-to-sqlite/issues/38,623038148,MDEyOklzc3VlQ29tbWVudDYyMzAzODE0OA==,5779832,2020-05-03T01:18:57Z,2020-05-03T01:18:57Z,NONE,"Thanks, @simonw!
I feel a little foolish in hindsight, but I'm on the same page now and am glad to have discovered first-hand a motivation for this `repos_starred` use case.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",611284481,
https://github.com/dogsheep/github-to-sqlite/issues/38#issuecomment-623044643,https://api.github.com/repos/dogsheep/github-to-sqlite/issues/38,623044643,MDEyOklzc3VlQ29tbWVudDYyMzA0NDY0Mw==,5779832,2020-05-03T02:34:32Z,2020-05-03T02:34:32Z,NONE,"1. More than glad to share feedback from the sidelines as a [starrer](https://github-to-sqlite.dogsheep.net/github?sql=select%0D%0A++starred_at%2C%0D%0A++starred_by%2C%0D%0A++full_name+as+repo_name%0D%0Afrom%0D%0A++repos_starred%0D%0Awhere%0D%0A++starred_by+%3D+%22zzeleznick%22%0D%0Aorder+by%0D%0A++starred_at+desc).
```
-- Motivation:
-- Datasette is a data hammer and I'm looking for nails
-- e.g. Find which repos a user has starred => trigger a TBD downstream action
select
starred_at,
starred_by,
full_name as repo_name
from
repos_starred
where
starred_by = 'zzeleznick'
order by
starred_at desc
```
| starred_at | starred_by | repo_name |
| --- | --- | --- |
| 2020-02-11T01:08:59Z | zzeleznick | dogsheep/twitter-to-sqlite |
| 2020-01-11T21:57:34Z | zzeleznick | simonw/datasette |
2. In my day job, I use [airflow](https://github.com/apache/airflow), and that's the mental model I'm bringing to [datasette](https://github.com/simonw/datasette).
3. I see your projects, like [twitter-to-sqlite](https://github.com/dogsheep/twitter-to-sqlite), as akin to [Operators](https://airflow.apache.org/docs/stable/_api/index.html#pythonapi-operators) in the Airflow world.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",611284481,
https://github.com/dogsheep/github-to-sqlite/issues/46#issuecomment-1359468823,https://api.github.com/repos/dogsheep/github-to-sqlite/issues/46,1359468823,IC_kwDODFdgUs5RB9kX,1839645,2022-12-20T14:39:39Z,2022-12-20T14:40:15Z,NONE,"Just a quick +1 to this one from me - I would like to do a better job of tracking who is reviewing one another's pull requests in repositories, since this is a specific kind of maintenance work that I think often goes unrewarded. I can't seem to figure this out just by looking at the `pull_request` or `issue_comments` tables, so I think it would be helpful to support PR reviews natively (even if just for summary statistics). Alternatively if there is a way in the API to tell if an issue comment is part of a review, then perhaps you could quickly calculate the number of unique reviews that an author performed. But that was beyond my SQL-foo :-) ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",664485022,
https://github.com/dogsheep/github-to-sqlite/issues/51#issuecomment-1208757153,https://api.github.com/repos/dogsheep/github-to-sqlite/issues/51,1208757153,IC_kwDODFdgUs5IDCuh,9020979,2022-08-09T00:29:44Z,2022-08-09T00:29:44Z,NONE,"I've been looking into how to get this data out of GitHub (especially now that there are ""secondary rate limits"" without an advertised allowance, separate from the regular rate limits).
I've had decent success with the Airbyte GitHub extractor (aside from one data quality issue https://github.com/airbytehq/airbyte/pull/15420 ). Airbyte splits data extraction between the GraphQL and REST endpoints depending on the resource type, but they're very comprehensive.
https://github.com/airbytehq/airbyte/blob/306a75ef5370728e0912cf52a1a898a530db0c90/airbyte-integrations/connectors/source-github/source_github/streams.py#L22-L122
Before this, I tried a few solutions in my own custom wrapper mentioned in this thread + its children https://github.com/PyGithub/PyGithub/issues/1989 , but they weren't working as expected.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",703246031,
https://github.com/dogsheep/github-to-sqlite/issues/51#issuecomment-1279224780,https://api.github.com/repos/dogsheep/github-to-sqlite/issues/51,1279224780,IC_kwDODFdgUs5MP2vM,7908073,2022-10-14T16:34:07Z,2022-10-14T16:34:07Z,NONE,"also, it says that authenticated requests have a much higher ""rate limit"". Unauthenticated requests only get 60 req/hour ?? seems more like a quota than a ""rate limit"" (although I guess that is semantic equivalence)
You would want to use `x-ratelimit-reset`
```
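# assumes r is a requests.Response; header values are strings, so convert to int
time.sleep(max(0, int(r.headers['x-ratelimit-reset']) + 1 - time.time()))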
```
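As a slightly more defensive sketch of the same idea (the helper name `seconds_until_reset` is made up for illustration, and it assumes the header holds epoch seconds as a string):

```python
import time


def seconds_until_reset(headers, now=None):
    # headers: any mapping of response headers (hypothetical input shape).
    now = time.time() if now is None else now
    # The header value arrives as a string of epoch seconds.
    reset_at = int(headers['x-ratelimit-reset'])
    # One second of slack, and never a negative delay.
    return max(0.0, reset_at + 1 - now)
```

It could then be used as `time.sleep(seconds_until_reset(response.headers))`.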
But a more complete solution would bring authenticated requests to the other subcommands. I'm surprised only `github-to-sqlite get` is using the `--auth=` CLI flag","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",703246031,
https://github.com/dogsheep/github-to-sqlite/issues/64#issuecomment-860895838,https://api.github.com/repos/dogsheep/github-to-sqlite/issues/64,860895838,MDEyOklzc3VlQ29tbWVudDg2MDg5NTgzOA==,231498,2021-06-14T18:23:21Z,2021-06-14T21:37:35Z,NONE,"i have a basic working version at https://github.com/khimaros/github-to-sqlite
this can be tested with `github-to-sqlite events.db khimaros/events`
caveat: the GitHub API doesn't seem to provide a complete history of events.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",920636216,
https://github.com/dogsheep/github-to-sqlite/issues/64#issuecomment-861035862,https://api.github.com/repos/dogsheep/github-to-sqlite/issues/64,861035862,MDEyOklzc3VlQ29tbWVudDg2MTAzNTg2Mg==,231498,2021-06-14T22:29:20Z,2021-06-14T22:29:20Z,NONE,"it looks like the v4 GraphQL API is the only way to get data beyond 90 days from GitHub.
this is a significant change, but may be worth considering in the future.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",920636216,
https://github.com/dogsheep/github-to-sqlite/issues/64#issuecomment-861087651,https://api.github.com/repos/dogsheep/github-to-sqlite/issues/64,861087651,MDEyOklzc3VlQ29tbWVudDg2MTA4NzY1MQ==,231498,2021-06-15T00:48:37Z,2021-06-15T00:48:37Z,NONE,"@simonw -- i've created an omega-query that fetched most of what was interesting to me for a single user.
found by poking around in the ""Explorer"" tab in https://docs.github.com/en/graphql/overview/explorer
note: pagination is still required via `first` and `last` but it seems to allow unlimited history.
```
query MyQuery {
__typename
user(login: ""
"") {
id
pinnedItems(first: 100) {
edges {
node
}
}
pullRequests(first: 100) {
nodes {
body
title
state
createdAt
}
}
createdAt
issues(first: 100) {
pageInfo {
endCursor
startCursor
}
nodes {
title
url
createdAt
body
}
}
issueComments(first: 100) {
edges {
node {
id
updatedAt
url
body
}
}
}
repositories(first: 100) {
nodes {
createdAt
description
parent {
name
}
pinnedIssues(first: 100) {
edges {
node {
id
}
}
}
pinnedDiscussions(first: 100) {
edges {
node {
id
}
}
}
}
}
starredRepositories(first: 100) {
edges {
node {
id
}
}
}
}
}
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",920636216,
https://github.com/dogsheep/github-to-sqlite/issues/79#issuecomment-1847317568,https://api.github.com/repos/dogsheep/github-to-sqlite/issues/79,1847317568,IC_kwDODFdgUs5uG9RA,23789,2023-12-08T14:50:13Z,2023-12-08T14:50:13Z,NONE,Adding `&per_page=100` would reduce the number of API requests by 3x.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1570375808,
https://github.com/dogsheep/github-to-sqlite/pull/65#issuecomment-1266141699,https://api.github.com/repos/dogsheep/github-to-sqlite/issues/65,1266141699,IC_kwDODFdgUs5Ld8oD,231498,2022-10-03T22:35:03Z,2022-10-03T22:35:03Z,NONE,"@simonw rebased against latest, please let me know if i should drop this PR.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",923270900,
https://github.com/dogsheep/github-to-sqlite/pull/65#issuecomment-885964242,https://api.github.com/repos/dogsheep/github-to-sqlite/issues/65,885964242,IC_kwDODFdgUs40zr3S,231498,2021-07-23T23:45:35Z,2021-07-23T23:45:35Z,NONE,@simonw is this PR of interest to you?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",923270900,
https://github.com/dogsheep/github-to-sqlite/pull/66#issuecomment-929651819,https://api.github.com/repos/dogsheep/github-to-sqlite/issues/66,929651819,IC_kwDODFdgUs43aVxr,30531572,2021-09-28T21:50:31Z,2021-09-28T21:50:31Z,NONE,@simonw any feedback/thoughts? ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",975161924,
https://github.com/dogsheep/github-to-sqlite/pull/76#issuecomment-1238190601,https://api.github.com/repos/dogsheep/github-to-sqlite/issues/76,1238190601,IC_kwDODFdgUs5JzUoJ,2757699,2022-09-06T13:58:20Z,2022-09-06T13:59:08Z,NONE,"Tested PR just now in private org, fetched >2k repos infos flawlessly!
poetry run github-to-sqlite repos --organization github.db MYORG","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1363280254,
https://github.com/dogsheep/google-takeout-to-sqlite/issues/10#issuecomment-1073152522,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/10,1073152522,IC_kwDODFE5qs4_9wIK,9290214,2022-03-20T02:38:07Z,2022-03-20T02:38:07Z,NONE,"[This line](https://github.com/dogsheep/google-takeout-to-sqlite/blob/e54e544427f1cc3ea8189f0e95f54046301a8645/google_takeout_to_sqlite/utils.py) needs to say `""MyActivity.json""` instead of `""My Activity.json""`. Google must have changed the file name.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1123393829,
https://github.com/dogsheep/google-takeout-to-sqlite/issues/2#issuecomment-747130908,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/2,747130908,MDEyOklzc3VlQ29tbWVudDc0NzEzMDkwOA==,231498,2020-12-17T00:47:04Z,2020-12-17T00:47:43Z,NONE,"it looks like almost all of the memory consumption is coming from `json.load()`.
another direction here may be to use the new ""Semantic Location History"" data which is already broken down by year and month.
it also provides much more interesting data, such as estimated address, form of travel, etc.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",769376447,
https://github.com/dogsheep/google-takeout-to-sqlite/issues/4#issuecomment-780817596,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/4,780817596,MDEyOklzc3VlQ29tbWVudDc4MDgxNzU5Ng==,306240,2021-02-17T20:01:35Z,2021-02-17T20:01:35Z,NONE,I've got this almost working. Just needs some polish,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",778380836,
https://github.com/dogsheep/google-takeout-to-sqlite/issues/4#issuecomment-781451701,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/4,781451701,MDEyOklzc3VlQ29tbWVudDc4MTQ1MTcwMQ==,203343,2021-02-18T16:06:21Z,2021-02-18T16:06:21Z,NONE,Awesome!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",778380836,
https://github.com/dogsheep/google-takeout-to-sqlite/issues/4#issuecomment-783688547,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/4,783688547,MDEyOklzc3VlQ29tbWVudDc4MzY4ODU0Nw==,306240,2021-02-22T21:31:28Z,2021-02-22T21:31:28Z,NONE,"@Btibert3 I've opened a PR with my initial attempt at this. Would you be willing to give this a try?
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",778380836,
https://github.com/dogsheep/google-takeout-to-sqlite/issues/4#issuecomment-790198930,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/4,790198930,MDEyOklzc3VlQ29tbWVudDc5MDE5ODkzMA==,203343,2021-03-04T00:58:40Z,2021-03-04T00:58:40Z,NONE,"Sorry, I am just seeing this. Yes! I will kick the tires later on tonight. My apologies for the delay.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",778380836,
https://github.com/dogsheep/google-takeout-to-sqlite/issues/4#issuecomment-790934616,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/4,790934616,MDEyOklzc3VlQ29tbWVudDc5MDkzNDYxNg==,203343,2021-03-04T20:54:44Z,2021-03-04T20:54:44Z,NONE,"Sorry for the delay, I got sidetracked after class last night. I am getting the following error:
```
/content# google-takeout-to-sqlite mbox takeout.db Takeout/Mail/gmail.mbox
Usage: google-takeout-to-sqlite [OPTIONS] COMMAND [ARGS]...
Try 'google-takeout-to-sqlite --help' for help.
Error: No such command 'mbox'.
```
On the box, I installed with pip after cloning: https://github.com/UtahDave/google-takeout-to-sqlite.git","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",778380836,
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-783794520,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5,783794520,MDEyOklzc3VlQ29tbWVudDc4Mzc5NDUyMA==,306240,2021-02-23T01:13:54Z,2021-02-23T01:13:54Z,NONE,"Also, @simonw I created a test based off the existing tests. I think it's working correctly","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",813880401,
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-784638394,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5,784638394,MDEyOklzc3VlQ29tbWVudDc4NDYzODM5NA==,306240,2021-02-24T00:36:18Z,2021-02-24T00:36:18Z,NONE,I noticed that @simonw is using black for formatting. I ran black on my additions in this PR.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",813880401,
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790389335,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5,790389335,MDEyOklzc3VlQ29tbWVudDc5MDM4OTMzNQ==,306240,2021-03-04T07:32:04Z,2021-03-04T07:32:04Z,NONE,"> The command takes quite a while to start running, presumably because this line causes it to have to scan the WHOLE file in order to generate a count:
>
> https://github.com/dogsheep/google-takeout-to-sqlite/blob/a3de045eba0fae4b309da21aa3119102b0efc576/google_takeout_to_sqlite/utils.py#L66-L67
>
> I'm fine with waiting though. It's not like this is a command people run every day - and without that count we can't show a progress bar, which seems pretty important for a process that takes this long.

The wait is from Python loading the mbox file. This happens regardless of whether you're getting the length of the mbox. The mbox module is on the slow side. It is possible to do one's own parsing of the mbox, but I kind of wanted to avoid doing that.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",813880401,
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-790391711,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5,790391711,MDEyOklzc3VlQ29tbWVudDc5MDM5MTcxMQ==,306240,2021-03-04T07:36:24Z,2021-03-04T07:36:24Z,NONE,"> Looks like you're doing this:
>
> ```python
> elif message.get_content_type() == ""text/plain"":
> body = message.get_payload(decode=True)
> ```
>
> So presumably that decodes to a unicode string?
>
> I imagine the reason the column is a `BLOB` for me is that `sqlite-utils` determines the column type based on the first batch of items - https://github.com/simonw/sqlite-utils/blob/09c3386f55f766b135b6a1c00295646c4ae29bec/sqlite_utils/db.py#L1927-L1928 - and I got unlucky and had something in my first batch that wasn't a unicode string.

Ah, that's good to know. I think explicitly creating the tables will be a great improvement. I'll add that.
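A rough sketch of what I mean by creating the table up front, using the stdlib `sqlite3` module rather than `sqlite-utils` just to keep it self-contained (the table and column names here are placeholders, not the ones in the PR):

```python
import sqlite3


def create_emails_table(conn):
    # Hypothetical table and column names, for illustration only.
    # Declaring the types up front means they no longer depend on
    # whatever happens to be in the first batch of rows.
    conn.execute(
        'CREATE TABLE IF NOT EXISTS mbox_emails ('
        'id TEXT PRIMARY KEY, subject TEXT, body TEXT)'
    )


conn = sqlite3.connect(':memory:')
create_emails_table(conn)
conn.execute('INSERT INTO mbox_emails VALUES (?, ?, ?)', ('1', 'hi', 'hello'))
# Map column name -> declared type, read back from the schema.
column_types = {row[1]: row[2] for row in conn.execute('PRAGMA table_info(mbox_emails)')}
```

The body values would still need to be decoded to `str` before inserting, since a declared `TEXT` column does not convert bytes on its own.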
Also, I noticed after I opened this PR that the `message.get_payload()` is being deprecated in favor of `message.get_content()` or something like that. I'll see if that handles the decoding better, too.
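For reference, `get_payload(decode=True)` returns `bytes`, which would explain the `BLOB` column. Something along these lines might work instead (a sketch only, assuming the message is parsed with `email.policy.default`; `plain_text_body` is a made-up helper name):

```python
import email
import email.policy


def plain_text_body(raw_bytes):
    # Parsing with the default policy yields an EmailMessage,
    # which provides get_body() and get_content().
    msg = email.message_from_bytes(raw_bytes, policy=email.policy.default)
    part = msg.get_body(preferencelist=('plain',))
    # get_content() on a text part returns a decoded str, not bytes.
    return part.get_content() if part is not None else None
```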
Thanks for the feedback. I should have time tomorrow to put together some improvements.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",813880401,
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-791089881,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5,791089881,MDEyOklzc3VlQ29tbWVudDc5MTA4OTg4MQ==,28565,2021-03-05T02:03:19Z,2021-03-05T02:03:19Z,NONE,"I just tried to run this on a small VPS instance with 2GB of memory and it crashed out of memory while processing a 12GB mbox from Takeout.
Is it possible to stream the emails to sqlite instead of loading it all into memory and upserting at once?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",813880401,
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-791530093,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5,791530093,MDEyOklzc3VlQ29tbWVudDc5MTUzMDA5Mw==,306240,2021-03-05T16:28:07Z,2021-03-05T16:28:07Z,NONE,"> I just tried to run this on a small VPS instance with 2GB of memory and it crashed out of memory while processing a 12GB mbox from Takeout.
>
> Is it possible to stream the emails to sqlite instead of loading it all into memory and upserting at once?

@maxhawkins a limitation of the Python mbox module is that it loads the entire mbox into memory. I did find another approach to this problem that didn't use the builtin Python mbox module and created a generator so that it didn't have to load the whole mbox into memory. I was hoping to use standard library modules, but this might be a good reason to investigate that approach a bit more. My worry is making sure a custom processor handles all the ins and outs of the mbox format correctly.
Hm. As I'm writing this, I thought of something. I think I can parse each message one at a time, and then use an mbox function to load each message using the python mbox module. That way the mbox module can still deal with the specifics of the mbox format, but I can use a generator.
I'll give that a try. Thanks for the feedback, @maxhawkins and @simonw.
@simonw can we hold off on merging this until I can test this new approach?","{""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",813880401,
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-849708617,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5,849708617,MDEyOklzc3VlQ29tbWVudDg0OTcwODYxNw==,28565,2021-05-27T15:01:42Z,2021-05-27T15:01:42Z,NONE,Any updates?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",813880401,
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-884672647,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5,884672647,IC_kwDODFE5qs40uwiH,28565,2021-07-22T05:56:31Z,2021-07-22T14:03:08Z,NONE,"How does this commit look? https://github.com/maxhawkins/google-takeout-to-sqlite/commit/72802a83fee282eb5d02d388567731ba4301050d
It seems that Takeout's mbox format is pretty simple, so we can get away with just splitting the file on lines beginning with `From `. My commit starts a new chunk every time a line starts with `From ` and uses `email.message_from_bytes` to parse each chunk.
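Roughly, the idea looks like this (a simplified sketch, not the exact code in the commit; `iter_mbox_messages` is a name made up for illustration, and it assumes any body line starting with `From ` has already been escaped by the exporter):

```python
import email
import io


def iter_mbox_messages(fileobj):
    # fileobj: a binary file-like object positioned at the start of the mbox.
    chunk = []
    for line in fileobj:
        if line.startswith(b'From '):
            # A separator line closes the previous message, if there is one.
            if chunk:
                yield email.message_from_bytes(b''.join(chunk))
                chunk = []
            continue  # the separator itself is not part of the message
        chunk.append(line)
    # Flush the final message once the file is exhausted.
    if chunk:
        yield email.message_from_bytes(b''.join(chunk))
```

Only one message's lines are held in memory at a time, which is why memory use stays flat regardless of the size of the mbox.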
I was able to load a 12GB takeout mbox without the program using more than a couple hundred MB of memory during the import process. It does make us lose the progress bar, but maybe I can add that back in a later commit.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",813880401,
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-885022230,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5,885022230,IC_kwDODFE5qs40wF4W,28565,2021-07-22T15:51:46Z,2021-07-22T15:51:46Z,NONE,One thing I noticed is this importer doesn't save attachments along with the body of the emails. It would be nice if those got stored as blobs in a separate attachments table so attachments can be included while fetching search results.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",813880401,
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-885094284,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5,885094284,IC_kwDODFE5qs40wXeM,28565,2021-07-22T17:41:32Z,2021-07-22T17:41:32Z,NONE,I added a follow-up commit that deals with emails that don't have a `Date` header: https://github.com/maxhawkins/google-takeout-to-sqlite/commit/4bc70103582c10802c85a523ef1e99a8a2154aa9,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",813880401,
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-885098025,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5,885098025,IC_kwDODFE5qs40wYYp,306240,2021-07-22T17:47:50Z,2021-07-22T17:47:50Z,NONE,"Hi @maxhawkins , I'm sorry, I haven't had any time to work on this. I'll have some time tomorrow to test your commits. I think they look great. I'm great with your commits superseding my initial attempt here.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",813880401,
https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-888075098,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5,888075098,IC_kwDODFE5qs407vNa,28565,2021-07-28T07:18:56Z,2021-07-28T07:18:56Z,NONE,"> I'm not sure why but my most recent import, when displayed in Datasette, looks like this:
>
>
I did some investigation into this issue and made a fix [here](https://github.com/dogsheep/google-takeout-to-sqlite/pull/8/commits/8ee555c2889a38ff42b95664ee074b4a01a82f06). The problem was that some messages (like gchat logs) don't have a `Message-Id` and we need to use `X-GM-THRID` as the pkey instead.
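In sketch form the fallback is roughly the following (`message_key` is an illustrative name, not necessarily what the commit uses):

```python
import email


def message_key(message):
    # Header lookup returns None when the header is absent,
    # so fall back to the Gmail thread id in that case.
    return message['Message-Id'] or message['X-GM-THRID']
```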
@simonw While looking into this I found something unexpected about how sqlite_utils handles upserts if the pkey column is `None`. When the pkey is NULL I'd expect the function to either use rowid or throw an exception. Instead, it seems upsert_all creates a row where all columns are NULL instead of using the values provided as parameters.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",813880401,
https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-1002735370,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8,1002735370,IC_kwDODFE5qs47xIcK,203343,2021-12-29T18:58:23Z,2021-12-29T18:58:23Z,NONE,"@maxhawkins how hard would it be to add an entry to the table that includes the HTML version of the email, if it exists? I just tried your PR branch on a very small mbox file, and it worked great. My use case is a research project and I need to access more than just the body plain text.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",954546309,
https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-1003437288,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8,1003437288,IC_kwDODFE5qs47zzzo,28565,2021-12-31T19:06:20Z,2021-12-31T19:06:20Z,NONE,"> @maxhawkins how hard would it be to add an entry to the table that includes the HTML version of the email, if it exists? I just tried your PR branch on a very small mbox file, and it worked great. My use case is a research project and I need to access more than just the body plain text.
Shouldn't be hard. The easiest way is probably to remove the `if body.content_type == ""text/html""` clause from [utils.py:254](https://github.com/dogsheep/google-takeout-to-sqlite/pull/8/commits/8e6d487b697ce2e8ad885acf613a157bfba84c59#diff-25ad9dd1ced1b8bfc37fda8444819c803232c08891e4af3d4064aa205d8174eaR254) and just return content directly without parsing.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",954546309,
https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-1708945716,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8,1708945716,IC_kwDODFE5qs5l3HE0,150855,2023-09-06T19:12:33Z,2023-09-06T19:12:33Z,NONE,@maxhawkins curious why you didn't use the stdlib `mailbox` to parse the `mbox` files?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",954546309,
https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-1710380941,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8,1710380941,IC_kwDODFE5qs5l8leN,28565,2023-09-07T15:39:59Z,2023-09-07T15:39:59Z,NONE,"> @maxhawkins curious why you didn't use the stdlib `mailbox` to parse the `mbox` files?
Mailbox parses the entire mbox into memory. Using the lower level library lets us stream the emails in one at a time to support larger archives. Both libraries are in the stdlib.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",954546309,
https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-1710950671,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8,1710950671,IC_kwDODFE5qs5l-wkP,150855,2023-09-08T01:22:49Z,2023-09-08T01:22:49Z,NONE,"Makes sense, thanks for explaining!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",954546309,
https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-894581223,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8,894581223,IC_kwDODFE5qs41Ujnn,28565,2021-08-07T00:57:48Z,2021-08-07T00:57:48Z,NONE,"Just added two more fixes:
* Added parsing for rfc 2047 encoded unicode headers
* Body is now stored as TEXT rather than a BLOB regardless of what order the messages are parsed in.
I was able to run this on my Takeout export and everything seems to work fine. @simonw let me know if this looks good to merge.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",954546309,
https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-896378525,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8,896378525,IC_kwDODFE5qs41baad,28565,2021-08-10T23:28:45Z,2021-08-10T23:28:45Z,NONE,"I added parsing of text/html emails using BeautifulSoup.
Around half of the emails in my archive don't include a text/plain payload so adding html parsing makes a good chunk of them searchable.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",954546309,
https://github.com/dogsheep/hacker-news-to-sqlite/pull/6#issuecomment-1489110168,https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/6,1489110168,IC_kwDODtX3eM5YwgSY,1231935,2023-03-29T18:36:16Z,2023-03-29T18:36:16Z,NONE,@simonw can you take a look when you have a chance?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1641117021,
https://github.com/dogsheep/healthkit-to-sqlite/issues/11#issuecomment-711083698,https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/11,711083698,MDEyOklzc3VlQ29tbWVudDcxMTA4MzY5OA==,572,2020-10-17T21:39:15Z,2020-10-17T21:39:15Z,NONE,Nice! Works perfectly. Thanks for the quick response and great tooling in general.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",723838331,
https://github.com/dogsheep/healthkit-to-sqlite/issues/12#issuecomment-1163917719,https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/12,1163917719,IC_kwDOC8tyDs5FX_mX,956433,2022-06-23T04:35:02Z,2022-06-23T04:35:02Z,NONE,In terms of unique identifiers - could you use values stored in `HKMetadataKeySyncIdentifier`?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",727848625,
https://github.com/dogsheep/healthkit-to-sqlite/issues/12#issuecomment-877805513,https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/12,877805513,MDEyOklzc3VlQ29tbWVudDg3NzgwNTUxMw==,956433,2021-07-11T14:03:01Z,2021-07-11T14:03:01Z,NONE,"Hi Simon -- just experimenting with your excellent software!
Up to this point in time I have been using the (paid) [HealthFit App](https://apps.apple.com/au/app/healthfit/id1202650514) to export my workouts from my Apple Watch, one walk at a time into either .GPX or .FIT format and then using another library to suck it into Python and eventually here to my ""Emmaus Walking"" app:
https://share.streamlit.io/mjboothaus/emmaus_walking/emmaus_walking/app.py
I just used `healthkit-to-sqlite` to convert my export.zip file and it all ""just worked"".
I did notice the issue with various numeric fields being stored in the SQLite db as TEXT for now and just thought I'd flag it - but you've already self-reported this issue.
Keep up the great work!
I was curious if you have any thoughts about periodically exporting ""export.zip"" and how to just update the SQLite file instead of re-creating it each time. Hopefully Apple will give some thought to managing this data in a more sensible fashion as it grows over time. Ideally one could pull it from iCloud (where it is allegedly being backed up).
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",727848625,
https://github.com/dogsheep/healthkit-to-sqlite/issues/12#issuecomment-877874117,https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/12,877874117,MDEyOklzc3VlQ29tbWVudDg3Nzg3NDExNw==,956433,2021-07-11T23:03:37Z,2021-07-11T23:03:37Z,NONE,P.s. wondering if you have explored using the spatialite functionality with the location data in workouts?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",727848625,
https://github.com/dogsheep/healthkit-to-sqlite/issues/14#issuecomment-1073123231,https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/14,1073123231,IC_kwDOC8tyDs4_9o-f,343884,2022-03-19T22:39:29Z,2022-03-19T22:39:29Z,NONE,"I have this issue, too, with a fresh export. None of my `Workout` entries in `export.xml` have an `id` key, though [the sample `export.xml` in the tests folder doesn’t either](https://github.com/dogsheep/healthkit-to-sqlite/blob/main/tests/zip_contents/apple_health_export/export.xml#L14-L21), so I don’t think this is the culprit. Indeed, it seems @simonw is using the [`hash_id` function from `sqlite_utils`](https://sqlite-utils.datasette.io/en/stable/python-api.html#setting-an-id-based-on-the-hash-of-the-row-contents), which creates a column (`id`, in this case) based on a hash of the row’s contents.
When I run the script, a `workouts` table is created, with one entry: my first workout. No `workout_points` table is created, as [I’d expect from `utils.py`](https://github.com/dogsheep/healthkit-to-sqlite/blob/main/healthkit_to_sqlite/utils.py#L89-L90). I then get essentially the same error as noted in this thread:
```
Importing from HealthKit [###################################-] 98% 00:00:01
Traceback (most recent call last):
File ""/Users/lchski/.pyenv/versions/3.10.3/bin/healthkit-to-sqlite"", line 8, in <module>
sys.exit(cli())
File ""/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/click/core.py"", line 1128, in __call__
return self.main(*args, **kwargs)
File ""/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/click/core.py"", line 1053, in main
rv = self.invoke(ctx)
File ""/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/click/core.py"", line 1395, in invoke
return ctx.invoke(self.callback, **ctx.params)
File ""/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/click/core.py"", line 754, in invoke
return __callback(*args, **kwargs)
File ""/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/healthkit_to_sqlite/cli.py"", line 57, in cli
convert_xml_to_sqlite(fp, db, progress_callback=bar.update, zipfile=zf)
File ""/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/healthkit_to_sqlite/utils.py"", line 34, in convert_xml_to_sqlite
workout_to_db(el, db, zipfile)
File ""/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/healthkit_to_sqlite/utils.py"", line 57, in workout_to_db
pk = db[""workouts""].insert(record, alter=True, hash_id=""id"").last_pk
File ""/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/sqlite_utils/db.py"", line 2822, in insert
return self.insert_all(
File ""/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/sqlite_utils/db.py"", line 2950, in insert_all
self.insert_chunk(
File ""/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/sqlite_utils/db.py"", line 2715, in insert_chunk
result = self.db.execute(query, params)
File ""/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/sqlite_utils/db.py"", line 458, in execute
return self.conn.execute(sql, parameters)
sqlite3.IntegrityError: UNIQUE constraint failed: workouts.id
```
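For reference, a content-derived `id` collides whenever the same row is inserted twice. The sketch below uses only the standard library and mimics the idea of a hashed primary key; it is not the actual sqlite-utils implementation, and the `content_hash` helper, table layout, and sample row are made up.

```python
import hashlib
import json
import sqlite3


def content_hash(record):
    # Hypothetical stand-in for a content-derived primary key.
    payload = json.dumps(record, sort_keys=True).encode('utf-8')
    return hashlib.sha1(payload).hexdigest()


db = sqlite3.connect(':memory:')
db.execute('create table workouts (id text primary key, duration text)')

record = {'duration': '35.3'}  # made-up sample row


def insert(rec):
    db.execute(
        'insert into workouts (id, duration) values (?, ?)',
        (content_hash(rec), rec['duration']),
    )


insert(record)
try:
    insert(record)  # same contents, same hash, same id
    outcome = 'inserted twice'
except sqlite3.IntegrityError as exc:
    outcome = str(exc)

print(outcome)
```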
Are there maybe duplicate workouts in the data, which’d cause multiple rows to share the same `id`? It’s strange, though, that no `workout_points` is created at all. Export created from iOS 15.3.1.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",771608692,
https://github.com/dogsheep/healthkit-to-sqlite/issues/14#issuecomment-1073139067,https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/14,1073139067,IC_kwDOC8tyDs4_9s17,343884,2022-03-20T00:54:18Z,2022-03-20T00:54:18Z,NONE,"Update: this appears to be because of running the command twice without clearing the DB in between. Tries to insert a Workout that already exists, causing a collision on the (auto-generated) `id` column. Had a different error with a clean DB, likely due to the workout points format; will make a new issue for that.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",771608692,
https://github.com/dogsheep/healthkit-to-sqlite/issues/14#issuecomment-1629123734,https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/14,1629123734,IC_kwDOC8tyDs5hGnSW,44622670,2023-07-10T14:46:52Z,2023-07-10T14:46:52Z,NONE,@simonw any chance to get this fixed soon? ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",771608692,
https://github.com/dogsheep/healthkit-to-sqlite/issues/14#issuecomment-798436026,https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/14,798436026,MDEyOklzc3VlQ29tbWVudDc5ODQzNjAyNg==,1234956,2021-03-13T14:23:16Z,2021-03-13T14:23:16Z,NONE,"This PR allows my import to succeed.
It looks like some events don't have an `id`, but do have `HKExternalUUID` (which gets turned into `metadata_HKExternalUUID`), so I use this as a fallback.
If a record has neither of these, I changed it to just print the record (for debugging) and `return`.
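Roughly, the fallback looks like the sketch below. This is a simplified illustration rather than the exact PR code; the helper name `pick_workout_id` is hypothetical.

```python
def pick_workout_id(record):
    # Prefer id, then fall back to metadata_HKExternalUUID.
    for key in ('id', 'metadata_HKExternalUUID'):
        value = record.get(key)
        if value:
            return value
    # Neither key is present: print the record for debugging and skip it.
    print(record)
    return None


# Made-up records, for illustration only.
print(pick_workout_id({'id': 'abc'}))
print(pick_workout_id({'metadata_HKExternalUUID': 'uuid-1'}))
print(pick_workout_id({'sourceName': 'Strava'}))
```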
For some odd reason this ran fine at first, and now (after removing the generated db and trying again) I'm getting a different error (duplicate column name).
Looks like it may have run when I had two successive runs without remembering to delete the db in between. Will try to refactor.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",771608692,
https://github.com/dogsheep/healthkit-to-sqlite/issues/14#issuecomment-798468572,https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/14,798468572,MDEyOklzc3VlQ29tbWVudDc5ODQ2ODU3Mg==,1234956,2021-03-13T14:47:31Z,2021-03-13T14:47:31Z,NONE,"Ok, new PR works. I'm not `git` enough so I just force-pushed over the old one.
I still end up with a lot of activities that are missing an `id` and therefore skipped (since this is used as the primary key). For example:
```
{'workoutActivityType': 'HKWorkoutActivityTypeRunning', 'duration': '35.31666666666667', 'durationUnit': 'min', 'totalDistance': '4.010870267636999', 'totalDistanceUnit': 'mi', 'totalEnergyBurned': '660.3516235351562', 'totalEnergyBurnedUnit': 'Cal', 'sourceName': 'Strava', 'sourceVersion': '22810', 'creationDate': '2020-07-16 13:38:26 -0700', 'startDate': '2020-07-16 06:38:26 -0700', 'endDate': '2020-07-16 07:13:45 -0700'}
```
I also end up with some unhappy characters (in the skipped events), such as: `'sourceName': 'Nathan’s Apple\xa0Watch',`.
But it's successfully making it through the file, and the resulting db opens in datasette, so I'd call that progress.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",771608692,
https://github.com/dogsheep/healthkit-to-sqlite/issues/21#issuecomment-903950096,https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/21,903950096,IC_kwDOC8tyDs414S8Q,32016596,2021-08-23T17:00:59Z,2021-08-23T17:00:59Z,NONE,"I think the issue is that I have records like these:
```xml
```
And if sqlite is case insensitive, then `metadata_meal` and `metadata_Meal` result in the same column.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",977128935,
https://github.com/dogsheep/healthkit-to-sqlite/issues/24#issuecomment-1464786643,https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/24,1464786643,IC_kwDOC8tyDs5XTt7T,956433,2023-03-11T02:01:27Z,2023-03-11T02:01:27Z,NONE,Thanks for reporting this and providing a solution -- I was puzzled by this error when I revisited my walking data and experienced this issue. I haven't tried the fix yet.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1515883470,
https://github.com/dogsheep/healthkit-to-sqlite/issues/24#issuecomment-1464796494,https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/24,1464796494,IC_kwDOC8tyDs5XTwVO,956433,2023-03-11T02:23:42Z,2023-03-11T02:23:42Z,NONE,@simonw - maybe put in some error handling to trap for poorly formed XML (from Apple engineers) so that it suggests that there are problems with export.zip rather than odd looking Python errors :),"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1515883470,
https://github.com/dogsheep/healthkit-to-sqlite/issues/9#issuecomment-514745798,https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/9,514745798,MDEyOklzc3VlQ29tbWVudDUxNDc0NTc5OA==,166463,2019-07-24T18:25:36Z,2019-07-24T18:25:36Z,NONE,"This is on macOS 10.14.6, with Python 3.7.4, packages in the virtual environment:
```
Package Version
------------------- -------
aiofiles 0.4.0
Click 7.0
click-default-group 1.2.1
datasette 0.29.2
h11 0.8.1
healthkit-to-sqlite 0.3.1
httptools 0.0.13
hupper 1.8.1
importlib-metadata 0.18
Jinja2 2.10.1
MarkupSafe 1.1.1
Pint 0.8.1
pip 19.2.1
pluggy 0.12.0
setuptools 41.0.1
sqlite-utils 1.7
tabulate 0.8.3
uvicorn 0.8.4
uvloop 0.12.2
websockets 7.0
zipp 0.5.2
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",472429048,
https://github.com/dogsheep/healthkit-to-sqlite/issues/9#issuecomment-515370687,https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/9,515370687,MDEyOklzc3VlQ29tbWVudDUxNTM3MDY4Nw==,166463,2019-07-26T09:01:19Z,2019-07-26T09:01:19Z,NONE,"Yes, that did fix the issue I was seeing — it will now import my complete HealthKit data.
Thorsten
> On Jul 25, 2019, at 23:07, Simon Willison wrote:
>
> @tholo this should be fixed in just-released version 0.3.2 - could you run a pip install -U healthkit-to-sqlite and let me know if it works for you now?
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",472429048,
https://github.com/dogsheep/healthkit-to-sqlite/pull/13#issuecomment-904642396,https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/13,904642396,IC_kwDOC8tyDs41679c,32016596,2021-08-24T13:27:40Z,2021-08-24T13:28:26Z,NONE,This would fix #21 and make #22 obsolete.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",743071410,
https://github.com/dogsheep/healthkit-to-sqlite/pull/22#issuecomment-904641261,https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/22,904641261,IC_kwDOC8tyDs4167rt,32016596,2021-08-24T13:26:20Z,2021-08-24T13:26:20Z,NONE,Did not see that #13 fixes the same issue in a similar way. You can decide which one to merge ;),"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",978086284,
https://github.com/dogsheep/pocket-to-sqlite/issues/10#issuecomment-1239516561,https://api.github.com/repos/dogsheep/pocket-to-sqlite/issues/10,1239516561,IC_kwDODLZ_YM5J4YWR,11887,2022-09-07T15:07:38Z,2022-09-07T15:07:38Z,NONE,Thanks!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1246826792,
https://github.com/dogsheep/pocket-to-sqlite/issues/11#issuecomment-1221521377,https://api.github.com/repos/dogsheep/pocket-to-sqlite/issues/11,1221521377,IC_kwDODLZ_YM5Izu_h,2467,2022-08-21T10:51:37Z,2022-08-21T10:51:37Z,NONE,I didn't see there is a PR about this: https://github.com/dogsheep/pocket-to-sqlite/pull/7,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1345452427,
https://github.com/dogsheep/pocket-to-sqlite/issues/9#issuecomment-774726123,https://api.github.com/repos/dogsheep/pocket-to-sqlite/issues/9,774726123,MDEyOklzc3VlQ29tbWVudDc3NDcyNjEyMw==,12669260,2021-02-07T18:21:08Z,2021-02-07T18:21:08Z,NONE,@simonw any ideas here?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",801780625,
https://github.com/dogsheep/pocket-to-sqlite/issues/9#issuecomment-774730656,https://api.github.com/repos/dogsheep/pocket-to-sqlite/issues/9,774730656,MDEyOklzc3VlQ29tbWVudDc3NDczMDY1Ng==,635179,2021-02-07T18:45:04Z,2021-02-07T18:45:04Z,NONE,"That URL uses TLS 1.3, but maybe only if the client supports it.
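A quick way to see what the local client supports is to ask the standard-library `ssl` module. This is only a diagnostic sketch; it does not contact the URL in question.

```python
import ssl

# Version string of the OpenSSL (or compatible) library Python was built with.
openssl_version = ssl.OPENSSL_VERSION
# True when that library supports TLS 1.3 (attribute added in Python 3.7).
has_tls13 = ssl.HAS_TLSv1_3

print(openssl_version)
print(has_tls13)
```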
It could be your Python version or your SSL library that’s not recent enough.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",801780625,
https://github.com/dogsheep/swarm-to-sqlite/issues/12#issuecomment-941274088,https://api.github.com/repos/dogsheep/swarm-to-sqlite/issues/12,941274088,IC_kwDODD6af844GrPo,33631,2021-10-12T18:31:57Z,2021-10-12T18:31:57Z,NONE,I am running into the same problem. Is there any workaround?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",951817328,