{"id": 795367402, "node_id": "MDU6SXNzdWU3OTUzNjc0MDI=", "number": 1209, "title": "v0.54 500 error from sql query in custom template; code worked in v0.53; found a workaround", "user": {"value": 11788561, "label": "jrdmb"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2021-01-27T19:08:13Z", "updated_at": "2021-01-28T23:00:27Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "v0.54 500 error in sql query template; code worked in v0.53; found a workaround\r\n\r\n**schema:** \r\nCREATE TABLE \"talks\" (\"talk\" TEXT,\"series\" INTEGER, \"talkdate\" TEXT) \r\nCREATE TABLE \"series\" (\"id\" INTEGER PRIMARY KEY, \"series\" TEXT, talks_list TEXT default '', website TEXT default '');\r\n\r\n**Live example of correctly rendered template in v.053:** https://cosmotalks-cy6xkkbezq-uw.a.run.app/cosmotalks/talks/1\r\n\r\n**Description of problem:** I needed 'sql select' code in a custom row-mydatabase-mytable.html template to lookup the series name for a foreign key integer value in the talks table. So `metadata.json` specifies the `datasette-template-sql` plugin.\r\n\r\nThe code below worked perfectly in v0.53 (just the relevant sql statement part is shown; full code is [here](https://github.com/jrdmb/cosmotalks-datasette/blob/main/templates/row-cosmotalks-talks.html)):\r\n\r\n```\r\n{# custom addition #} \r\n{% for row in display_rows %} \r\n ... \r\n {% set sname = sql(\"select series from series where id = ?\", [row.series]) %} \r\n Series name: {{ sname[0].series }} \r\n ... \r\n{% endfor %} \r\n{# End of custom addition #} \r\n```\r\n\r\n**In v0.54, that code resulted in a 500 error with a 'no such table series' message.** A second query in that template also did not work but the above is fully illustrative of the problem.\r\n\r\nAll templates were up-to-date along with datasette v0.54.\r\n\r\n**Workaround:** After fiddling around with trying different things, what worked was the syntax from [Querying a different database from the datasette-template-sql github repo](https://github.com/simonw/datasette-template-sql#querying-a-different-database) to add the database name to the sql statement:\r\n\r\n`{% set sname = sql(\"select series from series where id = ?\", [row.series], database=\"cosmotalks\") %}`\r\n\r\nThough this was found to work, it should not be necessary to add `database=\"cosmotalks\"` since per the `datasette-template-sql` README, it's only needed when querying a different database, but here it's a table within the same database.\r\n", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": null, "draft": null, "state_reason": null} {"id": 811054000, "node_id": "MDU6SXNzdWU4MTEwNTQwMDA=", "number": 1230, "title": "Vega charts are plotted only for rows on the visible page, cluster maps only for rows in the remaining pages", "user": {"value": 7107523, "label": "Kabouik"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2021-02-18T12:27:02Z", "updated_at": "2021-02-18T15:22:15Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "I filtered a data set on some criteria and obtain 265 results, split over three pages (100, 100, 65), and reazlized that Vega plots are only applied to the results displayed on the current page, instead of the whole filtered data, _e.g._, 100 on page 1, 100 on page 2, 65 on page 3. Is there a way to force the graphs to consider all results instead of just the page, considering that pages rarely represent sensible information?\r\n\r\nLikewise, while the cluster map does show all results on the first page, if you go to next pages, it will show all remaining results except the previous page(s), _e.g._, 265 on page 1, 165 on page 2, 65 on page 3.\r\n\r\nIn both cases, I don't see many situations where one would like to represent the data this way, and it might even lead to interpretation errors when viewing the data. Am I missing some cases where this would be best? Perhaps a clickable option to subset visual representations according visible pages _vs._ display all search results would do?\r\n\r\n[Edit] Oh, I just saw the \"Load all\" button under the cluster map as well as the [setting to alter the max number or results](https://docs.datasette.io/en/stable/settings.html#max-returned-rows). So I guess this issue only is about the Vega charts.", "repo": {"value": 107914493, "label": "datasette"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": null, "draft": null, "state_reason": null} {"id": 1618130434, "node_id": "I_kwDOJHON9s5gcrYC", "number": 11, "title": "Implement a SQL view to make it easier to query files in a nested folder", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2023-03-09T23:19:28Z", "updated_at": "2023-03-09T23:24:01Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "Working with nested data in SQL is tricky, can I make it easier with a view or canned query?", "repo": {"value": 611552758, "label": "apple-notes-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/11/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1650981564, "node_id": "I_kwDOJHON9s5iZ_q8", "number": 12, "title": "Error running pytest", "user": {"value": 14314871, "label": "amlestin"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-04-02T15:02:36Z", "updated_at": "2023-04-02T15:07:10Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "`______________________________________________________ ERROR collecting tests/test_apple_notes_to_sqlite.py _______________________________________________________\r\nImportError while importing test module '/Users/lol/development/apple-notes-to-sqlite/tests/test_apple_notes_to_sqlite.py'.\r\nHint: make sure your test modules/packages have valid Python names.\r\nTraceback:\r\n/opt/homebrew/Cellar/python@3.9/3.9.16/Frameworks/Python.framework/Versions/3.9/lib/python3.9/importlib/__init__.py:127: in import_module\r\n return _bootstrap._gcd_import(name[level:], package, level)\r\ntests/test_apple_notes_to_sqlite.py:2: in \r\n from apple_notes_to_sqlite.cli import cli, COUNT_SCRIPT, FOLDERS_SCRIPT\r\nE ModuleNotFoundError: No module named 'apple_notes_to_sqlite'`\r\n\r\nSolution:\r\nThis is likely a PYTHONPATH issue due to having pytest installed both globally and in the venv. We can guarantee the tests run by adding the current directory to sys.path automatically using\r\n\r\n`python -m pytest`\r\n\r\nThe alternative is to activate the venv, install pytest, deactivate, then activate the venv again (https://stackoverflow.com/questions/35045038/how-do-i-use-pytest-with-virtualenv)", "repo": {"value": 611552758, "label": "apple-notes-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/12/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1650984552, "node_id": "PR_kwDOJHON9s5NbyYN", "number": 13, "title": "use universal command", "user": {"value": 14314871, "label": "amlestin"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-04-02T15:10:54Z", "updated_at": "2023-04-02T15:37:34Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/apple-notes-to-sqlite/pulls/13", "body": null, "repo": {"value": 611552758, "label": "apple-notes-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/13/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1880968405, "node_id": "PR_kwDOJHON9s5ZhYny", "number": 14, "title": "fix: fix the problem of Chinese character garbling", "user": {"value": 2698003, "label": "barretlee"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-09-04T23:48:28Z", "updated_at": "2023-09-04T23:48:28Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/apple-notes-to-sqlite/pulls/14", "body": "1. The code uses two different ways of writing encoding formats, `mac_roman` and `macroman`. It is uncertain whether there are any typo errors.\r\n2. When there are Chinese characters in the content, exporting it results in garbled code. Changing it to `utf8` can fix the issue.", "repo": {"value": 611552758, "label": "apple-notes-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/14/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1616429236, "node_id": "I_kwDOJHON9s5gWMC0", "number": 4, "title": "Support incremental updates", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2023-03-09T05:14:00Z", "updated_at": "2023-03-09T18:20:56Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "Running this script can take several hours against a large notes database.\r\n\r\nWould be neat if it could run against just the notes that have been modified since it last ran. Could pull the max `updated` date and then keep on looping until it finds one modified before then.\r\n\r\nProblem is I don't actually know what order it iterates over the notes in.", "repo": {"value": 611552758, "label": "apple-notes-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/4/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1616440856, "node_id": "I_kwDOJHON9s5gWO4Y", "number": 5, "title": "Configure full text search", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-03-09T05:20:46Z", "updated_at": "2023-03-09T05:20:46Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "FTS would be useful.\r\n\r\nMaybe even extract the plain text from the notes to make that index easier to create, rather than creating it against the HTML. Can use the `plaintext` property for that.", "repo": {"value": 611552758, "label": "apple-notes-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/5/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1617602868, "node_id": "I_kwDOJHON9s5gaqk0", "number": 6, "title": "Character encoding problem", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2023-03-09T16:44:34Z", "updated_at": "2023-04-14T15:22:09Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "I ran against a recent note with this in it:\r\n\r\n> Or just \"Actions \u2699\ufe0f \"\r\n\r\nAnd got back:\r\n\r\n\"image\"\r\n\r\n> `Actions \u201a\u00f6\u00f4\u00d4\u220f\u00e8`\r\n\r\nPasting that into https://ftfy.vercel.app/?s=Actions+%E2%80%9A%C3%B6%C3%B4%C3%94%E2%88%8F%C3%A8+ gives this:\r\n\r\n```python\r\ns = 'Actions \u00e2\\x80\\x9a\u00c3\u00b6\u00c3\u00b4\u00c3\\x94\u00e2\\x88\\x8f\u00c3\u00a8'\r\ns = s.encode('latin-1')\r\ns = s.decode('utf-8')\r\ns = s.encode('macroman')\r\ns = s.decode('utf-8')\r\nprint(s)\r\n```\r\n\"image\"\r\n", "repo": {"value": 611552758, "label": "apple-notes-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/6/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1617938730, "node_id": "I_kwDOJHON9s5gb8kq", "number": 9, "title": "Default to just storing plaintext, store HTML if `--html` is passed", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-03-09T20:19:06Z", "updated_at": "2023-03-09T20:19:06Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "The full `body` version of the notes can get HUGE, due to embedded images. It turns out for my own purposes I'm usually happy with just the `plaintext` version.\r\n\r\nI'm tempted to say you don't get HTML unless you pass a `--html` option.", "repo": {"value": 611552758, "label": "apple-notes-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/9/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 692202408, "node_id": "MDU6SXNzdWU2OTIyMDI0MDg=", "number": 12, "title": "Idea: maps and GeoJSON support", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2020-09-03T18:47:10Z", "updated_at": "2020-09-04T01:45:03Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "It would be cool if the `display_sql` could return a column populated with GeoJSON which would the automatically be displayed on a map in the results (or maybe default JS would look for a `class=\"geojson\"` element output by the `display` template) - ala https://github.com/simonw/datasette-leaflet-geojson\r\n\r\nThen I could render workout routes on a map, or Swarm checkin points.", "repo": {"value": 197431109, "label": "dogsheep-beta"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-beta/issues/12/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 694136490, "node_id": "MDU6SXNzdWU2OTQxMzY0OTA=", "number": 15, "title": "Add a bunch of config examples", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2020-09-05T17:58:43Z", "updated_at": "2020-09-18T23:17:39Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "I can bring these over from my personal Dogsheep.", "repo": {"value": 197431109, "label": "dogsheep-beta"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-beta/issues/15/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 694493566, "node_id": "MDU6SXNzdWU2OTQ0OTM1NjY=", "number": 16, "title": "Timeline view", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2020-09-06T19:13:58Z", "updated_at": "2020-09-21T02:42:29Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "Ability to browse (and facet) by date.", "repo": {"value": 197431109, "label": "dogsheep-beta"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-beta/issues/16/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 695553522, "node_id": "MDU6SXNzdWU2OTU1NTM1MjI=", "number": 18, "title": "Deleted records stay in the search index", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2020-09-08T05:14:23Z", "updated_at": "2020-09-08T05:15:51Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "Here's why: https://github.com/dogsheep/dogsheep-beta/blob/24f7898d41a39218058f174c75ba62f7c0fcfff6/dogsheep_beta/utils.py#L44-L53\r\n\r\nThat should probably do `DELETE FROM index1.search_index WHERE [table] = ?` first.", "repo": {"value": 197431109, "label": "dogsheep-beta"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-beta/issues/18/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 695556681, "node_id": "MDU6SXNzdWU2OTU1NTY2ODE=", "number": 19, "title": "Figure out incremental re-indexing", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2020-09-08T05:23:31Z", "updated_at": "2020-09-08T05:27:07Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "As tables get bigger reindexing everything on a schedule (essentially recreating the entire index from scratch) will start to become a performance bottleneck.", "repo": {"value": 197431109, "label": "dogsheep-beta"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-beta/issues/19/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 697162939, "node_id": "MDU6SXNzdWU2OTcxNjI5Mzk=", "number": 20, "title": "Add more tags so people can find your project.", "user": {"value": 7902810, "label": "ran88dom99"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2020-09-09T21:14:09Z", "updated_at": "2020-09-09T21:14:09Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "quantified-self habit-tracking google-fit time-tracking wearables quantifiedself \r\nfor example", "repo": {"value": 197431109, "label": "dogsheep-beta"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-beta/issues/20/reactions\", \"total_count\": 1, \"+1\": 0, \"-1\": 1, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 705215230, "node_id": "MDU6SXNzdWU3MDUyMTUyMzA=", "number": 26, "title": "Pagination", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 7, "created_at": "2020-09-21T00:14:37Z", "updated_at": "2020-09-21T02:55:54Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "Useful for #16 (timeline view) since you can now filter to just the items on a specific day - but if there are more than 50 items you can't see them all.", "repo": {"value": 197431109, "label": "dogsheep-beta"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 709789634, "node_id": "MDU6SXNzdWU3MDk3ODk2MzQ=", "number": 27, "title": "Sort order is not persisted by facet filter links", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2020-09-27T18:22:07Z", "updated_at": "2020-09-27T18:22:07Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "A link to `/-/beta?category=1×tamp__date=2018-08-01&q=swedish` should be to `/-/beta?category=1×tamp__date=2018-08-01&q=swedish&sort=newest`", "repo": {"value": 197431109, "label": "dogsheep-beta"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-beta/issues/27/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 724759588, "node_id": "MDU6SXNzdWU3MjQ3NTk1ODg=", "number": 29, "title": "Add search highlighting snippets", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 5, "created_at": "2020-10-19T16:00:48Z", "updated_at": "2021-08-26T20:23:11Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "Like on https://til.simonwillison.net/til/search?q=Snippet", "repo": {"value": 197431109, "label": "dogsheep-beta"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-beta/issues/29/reactions\", \"total_count\": 1, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 1, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 836923194, "node_id": "MDU6SXNzdWU4MzY5MjMxOTQ=", "number": 32, "title": "JSON API for search results", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2021-03-20T22:21:36Z", "updated_at": "2021-03-20T22:21:36Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "Refs https://github.com/simonw/datasette/issues/878", "repo": {"value": 197431109, "label": "dogsheep-beta"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-beta/issues/32/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 983221851, "node_id": "MDU6SXNzdWU5ODMyMjE4NTE=", "number": 34, "title": "Data folder as index command parameter", "user": {"value": 1223625, "label": "humrochagf"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2021-08-30T21:29:33Z", "updated_at": "2021-08-30T21:29:33Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "Hi,\r\n\r\nFirst of all, thank you for this wonderful project :smile:\r\n\r\nI started to use dogsheep to make my personal data searchable, and by using the project I noticed an issue with the index command.\r\n\r\nIt always expects you are running it from the root folder from where the data is located, so I got some errors while trying to make it work on my setup.\r\n\r\nI separate all databases inside a `data` folder (I published my setup to be easier to follow: https://github.com/humrochagf/my-dogsheep)\r\n\r\nBefore, I configured `dogsheep.yml` to add the data folder to its path like this:\r\n\r\n```yml\r\ndata/twitter.db:\r\n tweets:\r\n sql: |-\r\n...\r\n```\r\n\r\nAnd running the index command like this:\r\n\r\n```\r\ndogsheep-beta index data/dogsheep.db dogsheep.yml\r\n```\r\n\r\nIt worked to the normal search feature with no problem this way, but when I started adding `display_sql` rules the app started to crash, because at datasette `get_database` it was looking for `data/twitter` and it only had a db called `twitter` there.\r\n\r\nSo my workaround to that was to cd into the data folder and run the indexer. You can check the way I'm doing it at this line of the makefile: https://github.com/humrochagf/my-dogsheep/blob/main/makefile#L3\r\n\r\nIt works but it would be nice to have an option to pass the path where the data is located to the index function.", "repo": {"value": 197431109, "label": "dogsheep-beta"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-beta/issues/34/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 987985935, "node_id": "MDExOlB1bGxSZXF1ZXN0NzI2OTkwNjgw", "number": 35, "title": "Support for Datasette's --base-url setting", "user": {"value": 2670795, "label": "brandonrobertz"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2021-09-03T17:47:45Z", "updated_at": "2021-09-03T17:47:45Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/dogsheep-beta/pulls/35", "body": "This makes it so you can use Dogsheep if you're using Datasette with the `--base-url /some-path/` setting.", "repo": {"value": 197431109, "label": "dogsheep-beta"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-beta/issues/35/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1751214236, "node_id": "I_kwDOC8SPRc5oYWic", "number": 36, "title": "Getting sqlite_master may not be modified when creating dogsheep index", "user": {"value": 8711912, "label": "khushmeeet"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-06-11T03:21:53Z", "updated_at": "2023-06-11T03:21:53Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "When creating a `dogsheep` index from `config.yml` file on pocket.db (created using pocket-to-sqlite), I am getting this error\r\n\r\n```\r\nTraceback (most recent call last):\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/bin/dogsheep-beta\", line 8, in \r\n sys.exit(cli())\r\n ^^^^^\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py\", line 1130, in __call__\r\n return self.main(*args, **kwargs)\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py\", line 1055, in main\r\n rv = self.invoke(ctx)\r\n ^^^^^^^^^^^^^^^^\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py\", line 1657, in invoke\r\n return _process_result(sub_ctx.command.invoke(sub_ctx))\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py\", line 1404, in invoke\r\n return ctx.invoke(self.callback, **ctx.params)\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py\", line 760, in invoke\r\n return __callback(*args, **kwargs)\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/dogsheep_beta/cli.py\", line 36, in index\r\n run_indexer(\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/dogsheep_beta/utils.py\", line 32, in run_indexer\r\n ensure_table_and_indexes(db, tokenize)\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/dogsheep_beta/utils.py\", line 91, in ensure_table_and_indexes\r\n table.add_foreign_key(*fk)\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/sqlite_utils/db.py\", line 2155, in add_foreign_key\r\n self.db.add_foreign_keys([(self.name, column, other_table, other_column)])\r\n File \"/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/sqlite_utils/db.py\", line 1116, in add_foreign_keys\r\n cursor.execute(\r\nsqlite3.OperationalError: table sqlite_master may not be modified\r\n```\r\n\r\nCommand I ran to get this error\r\n```\r\ndogsheep-beta index pocket.db config.yml\r\n```\r\n\r\nDogsheep version\r\n```\r\ndogsheep-beta, version 0.10.2\r\n```\r\n\r\nPython version \r\n```\r\nPython 3.11.2\r\n```", "repo": {"value": 197431109, "label": "dogsheep-beta"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-beta/issues/36/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1888477283, "node_id": "I_kwDOC8SPRc5wj-Bj", "number": 38, "title": "Run `rebuild_fts` after building the index", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-09-08T23:17:45Z", "updated_at": "2023-09-08T23:17:45Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "In:\r\n- https://github.com/simonw/datasette.io/issues/152#issuecomment-1712323347\r\n\r\nThis turned out to be the fix:\r\n\r\n```bash\r\ndogsheep-beta index dogsheep-index.db templates/dogsheep-beta.yml\r\nsqlite-utils rebuild-fts dogsheep-index.db\r\n```", "repo": {"value": 197431109, "label": "dogsheep-beta"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-beta/issues/38/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 689850810, "node_id": "MDU6SXNzdWU2ODk4NTA4MTA=", "number": 6, "title": "Set up a demo instance", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2020-09-01T06:20:24Z", "updated_at": "2020-09-01T06:20:24Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "Once I've got the Datasette plugin to a state where it's worth building a demo: #3\r\n\r\nI can use data from my public https://github-to-sqlite.dogsheep.net/ demo plus the Pocket data subset I use for the demo in https://github.com/dogsheep/pocket-to-sqlite/issues/5 - I could pull in the https://dogsheep-photos.dogsheep.net/ photos data too.", "repo": {"value": 197431109, "label": "dogsheep-beta"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-beta/issues/6/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 602533300, "node_id": "MDU6SXNzdWU2MDI1MzMzMDA=", "number": 1, "title": "Import photo metadata from Apple Photos into SQLite", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": {"value": 5324096, "label": "Apple Photos online and securely browsable"}, "comments": 8, "created_at": "2020-04-18T19:23:26Z", "updated_at": "2020-05-04T02:41:40Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "Faces, albums, locations, that kind of thing.", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/1/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 606033104, "node_id": "MDU6SXNzdWU2MDYwMzMxMDQ=", "number": 12, "title": "If less than 500MB, show size in MB not GB", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2020-04-24T04:35:01Z", "updated_at": "2020-04-24T04:35:25Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "Just saw this:\r\n```\r\nUploading 0.05 GB\r\n```", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/12/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 607888367, "node_id": "MDU6SXNzdWU2MDc4ODgzNjc=", "number": 13, "title": "Also upload movie files", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2020-04-27T22:11:25Z", "updated_at": "2020-04-28T00:39:45Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "The `upload` command currently only handles static images:\r\n\r\nhttps://github.com/dogsheep/photos-to-sqlite/blob/d939455af00e07866686457ee2fcb9b2d1b7194e/photos_to_sqlite/utils.py#L26-L33\r\n\r\nNeed to cover movies taken by my phone and DSLR too.", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/13/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 608512747, "node_id": "MDU6SXNzdWU2MDg1MTI3NDc=", "number": 14, "title": "Annotate photos using the Google Cloud Vision API", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 5, "created_at": "2020-04-28T18:09:03Z", "updated_at": "2020-04-28T18:19:06Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "It can detect faces, run OCR, do image labeling (it knows what a lemur is!) and do object localization where it identifies objects and returns bounding polygons for them.", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/14/reactions\", \"total_count\": 3, \"+1\": 2, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 1, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 612287234, "node_id": "MDU6SXNzdWU2MTIyODcyMzQ=", "number": 16, "title": "Import machine-learning detected labels (dog, llama etc) from Apple Photos", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 13, "created_at": "2020-05-05T02:45:43Z", "updated_at": "2020-05-05T05:38:16Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "Follow-on from #1. Apple Photos runs some very sophisticated machine learning on-device to figure out if photos are of dogs, llamas and so on. I really want to extract those labels out into my own database.", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/16/reactions\", \"total_count\": 2, \"+1\": 0, \"-1\": 0, \"laugh\": 1, \"hooray\": 1, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 612860758, "node_id": "MDU6SXNzdWU2MTI4NjA3NTg=", "number": 18, "title": "Switch CI solution to GitHub Actions with a macOS runner", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2020-05-05T20:03:50Z", "updated_at": "2020-05-05T23:49:18Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "Refs #17.", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/18/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 615626118, "node_id": "MDU6SXNzdWU2MTU2MjYxMTg=", "number": 22, "title": "Try out ExifReader", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 4, "created_at": "2020-05-11T06:32:13Z", "updated_at": "2020-05-14T05:59:53Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "https://pypi.org/project/ExifReader/\r\n\r\nNew fork that should be able to handle EXIF in HEIC files.\r\n\r\nForked here: https://github.com/ianare/exif-py/issues/102#issuecomment-626376522\r\n\r\nRefs #3 ", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/22/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 621323348, "node_id": "MDU6SXNzdWU2MjEzMjMzNDg=", "number": 24, "title": "Configurable URL for images", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2020-05-19T22:25:56Z", "updated_at": "2020-05-20T06:00:29Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "This is hard-coded at the moment, which is bad:\r\nhttps://github.com/dogsheep/photos-to-sqlite/blob/d5d69b9019703c47bc251444838578dd752801e2/photos_to_sqlite/cli.py#L269-L272", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/24/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 621486115, "node_id": "MDU6SXNzdWU2MjE0ODYxMTU=", "number": 27, "title": "photos_with_apple_metadata view should include labels", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2020-05-20T06:06:17Z", "updated_at": "2020-05-20T06:06:17Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?place_city=New+Orleans&_facet=place_city&_facet_array=albums&_facet_array=persons\r\n\r\nHere's one way to add that:\r\n```sql\r\n select\r\n rowid,\r\n photo,\r\n (\r\n select\r\n json_group_array(\r\n json_object(\r\n 'label',\r\n normalized_string,\r\n 'href',\r\n '/photos/labelled?_hide_sql=1&label=' || normalized_string\r\n )\r\n )\r\n from\r\n labels\r\n where\r\n labels.uuid = photos_with_apple_metadata.uuid\r\n ) as labels,\r\n date,\r\n```", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/27/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 624490929, "node_id": "MDU6SXNzdWU2MjQ0OTA5Mjk=", "number": 28, "title": "Invalid SQL no such table: main.uploads", "user": {"value": 41439, "label": "dmd"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2020-05-25T21:25:39Z", "updated_at": "2020-12-24T22:26:22Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "http://127.0.0.1:8001/photos/photos_with_apple_metadata gives \"Invalid SQL no such table: main.uploads\"", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/28/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 602533481, "node_id": "MDU6SXNzdWU2MDI1MzM0ODE=", "number": 3, "title": "Import EXIF data into SQLite - lens used, ISO, aperture etc", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": {"value": 5324096, "label": "Apple Photos online and securely browsable"}, "comments": 2, "created_at": "2020-04-18T19:24:31Z", "updated_at": "2021-10-05T12:38:24Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/3/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 655974395, "node_id": "MDExOlB1bGxSZXF1ZXN0NDQ4MzU1Njgw", "number": 30, "title": "Handle empty bucket on first upload. Allow specifying the endpoint_url for services other than S3 (like b2 and digitalocean spaces)", "user": {"value": 110038, "label": "scanner"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2020-07-13T16:15:26Z", "updated_at": "2020-07-13T16:15:26Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/dogsheep-photos/pulls/30", "body": "Finally got around to trying dogsheep-photos but I want to use backblaze's b2 service instead of AWS S3.\r\nHad to add a way to optionally specify the endpoint_url to connect to. Then with the bucket being empty the initial key retrieval would fail. Probably a better way to see that the bucket is empty than doing a test inside the paginator loop.\r\n\r\nAlso probably a better way to specify the endpoint_url as we get and test for it twice using the same code in two different places but did not want to spend too much time worrying about it.", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/30/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 771511344, "node_id": "MDExOlB1bGxSZXF1ZXN0NTQzMDE1ODI1", "number": 31, "title": "Update for Big Sur", "user": {"value": 41546558, "label": "RhetTbull"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 7, "created_at": "2020-12-20T04:36:45Z", "updated_at": "2023-08-08T15:52:52Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": "dogsheep/dogsheep-photos/pulls/31", "body": "Refactored out the SQL for extracting aesthetic scores to use osxphotos -- adds compatbility for Big Sur via osxphotos which has been updated for new table names in Big Sur. Have not yet refactored the SQL for extracting labels which is still compatible with Big Sur.", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/31/reactions\", \"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 803333769, "node_id": "MDU6SXNzdWU4MDMzMzM3Njk=", "number": 32, "title": "KeyError: 'Contents' on running upload", "user": {"value": 11855322, "label": "robmarkcole"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2021-02-08T08:36:37Z", "updated_at": "2021-07-22T06:40:25Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "Following the readme, on big sur, and having entered my auth creds via `dogsheep-photos s3-auth`:\r\n\r\n```\r\n(venv) (base) Robins-MacBook:datasette robin$ dogsheep-photos upload photos.db ~/Pictures/Photos\\ /Users/robin/Pictures/Library.photoslibrary --dry-run\r\n\r\nFetching existing keys from S3...\r\nTraceback (most recent call last):\r\n File \"/Users/robin/datasette/venv/bin/dogsheep-photos\", line 8, in \r\n sys.exit(cli())\r\n File \"/Users/robin/datasette/venv/lib/python3.8/site-packages/click/core.py\", line 829, in __call__\r\n return self.main(*args, **kwargs)\r\n File \"/Users/robin/datasette/venv/lib/python3.8/site-packages/click/core.py\", line 782, in main\r\n rv = self.invoke(ctx)\r\n File \"/Users/robin/datasette/venv/lib/python3.8/site-packages/click/core.py\", line 1259, in invoke\r\n return _process_result(sub_ctx.command.invoke(sub_ctx))\r\n File \"/Users/robin/datasette/venv/lib/python3.8/site-packages/click/core.py\", line 1066, in invoke\r\n return ctx.invoke(self.callback, **ctx.params)\r\n File \"/Users/robin/datasette/venv/lib/python3.8/site-packages/click/core.py\", line 610, in invoke\r\n return callback(*args, **kwargs)\r\n File \"/Users/robin/datasette/venv/lib/python3.8/site-packages/dogsheep_photos/cli.py\", line 96, in upload\r\n key.split(\".\")[0] for key in get_all_keys(client, creds[\"photos_s3_bucket\"])\r\n File \"/Users/robin/datasette/venv/lib/python3.8/site-packages/dogsheep_photos/utils.py\", line 46, in get_all_keys\r\n for row in page[\"Contents\"]:\r\nKeyError: 'Contents'\r\n```\r\n\r\nPossibly since the bucket is in `EU (London) eu-west-2` and this into is not requested?", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/32/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 803338729, "node_id": "MDU6SXNzdWU4MDMzMzg3Mjk=", "number": 33, "title": "photo-to-sqlite: command not found", "user": {"value": 11855322, "label": "robmarkcole"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 4, "created_at": "2021-02-08T08:42:57Z", "updated_at": "2021-02-12T15:00:44Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "Having installed in a venv I get:\r\n```\r\n(venv) (base) Robins-MacBook:datasette robin$ photo-to-sqlite apple-photos photos.db\r\n\r\n-bash: photo-to-sqlite: command not found\r\n```", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/33/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 830283447, "node_id": "MDU6SXNzdWU4MzAyODM0NDc=", "number": 34, "title": "bucket name", "user": {"value": 6213, "label": "dsisnero"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2021-03-12T16:40:57Z", "updated_at": "2021-03-12T16:40:57Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "I followed the instructions to setup credentials but I am getting a invalid bucket name. Can you put a sample auth.json file in the base that shows the correct format for this? Thanks", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/34/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 842695374, "node_id": "MDU6SXNzdWU4NDI2OTUzNzQ=", "number": 35, "title": "Support to annotate photos on other than macOS OSes", "user": {"value": 1151557, "label": "ligurio"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2021-03-28T09:01:25Z", "updated_at": "2021-04-05T07:37:57Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "dogsheep-photos allows to annotate photos using Apple Photo's db. It would be nice to have such ability on other OSes too. For example using trained local model or using Google Vision API (see #14).", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/35/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 988493790, "node_id": "MDExOlB1bGxSZXF1ZXN0NzI3MzkwODM1", "number": 36, "title": "Correct naming of tool in readme", "user": {"value": 2129, "label": "badboy"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2021-09-05T12:05:40Z", "updated_at": "2022-01-06T16:04:46Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/dogsheep-photos/pulls/36", "body": null, "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/36/reactions\", \"total_count\": 2, \"+1\": 2, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1293698966, "node_id": "PR_kwDOD079W84600uh", "number": 37, "title": "Fix former command name in readme", "user": {"value": 578773, "label": "DanLipsitt"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2022-07-05T02:09:13Z", "updated_at": "2022-07-05T02:09:13Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/dogsheep-photos/pulls/37", "body": "Looks like a previous commit missed a `photo-to-sqlite`\u2192 `dogsheep-photos` replacement.", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/37/reactions\", \"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1827436260, "node_id": "PR_kwDOD079W85WtVyk", "number": 39, "title": "Missing option in datasette instructions", "user": {"value": 319473, "label": "coldclimate"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-07-29T10:34:48Z", "updated_at": "2023-07-29T10:34:48Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/dogsheep-photos/pulls/39", "body": "Gotta tell it where to look", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/39/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 602585497, "node_id": "MDU6SXNzdWU2MDI1ODU0OTc=", "number": 7, "title": "Integrate image content hashing", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2020-04-19T00:36:58Z", "updated_at": "2021-08-26T02:01:01Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "To spot duplicate images (where the file content differs such that the sha256 is no longer a match) it would be useful to calculate and store perceptual hashes of some sort.", "repo": {"value": 256834907, "label": "dogsheep-photos"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep-photos/issues/7/reactions\", \"total_count\": 1, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 1, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 541274681, "node_id": "MDU6SXNzdWU1NDEyNzQ2ODE=", "number": 2, "title": "Add linkedin-to-sqlite", "user": {"value": 881925, "label": "mnp"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2019-12-21T03:13:40Z", "updated_at": "2019-12-21T03:13:40Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "There is an API available. https://developer.linkedin.com/docs/rest-api#\r\n\r\nAt the minimum, I would think contact list and messages would be of interest.", "repo": {"value": 214746582, "label": "dogsheep.github.io"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep.github.io/issues/2/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 723499985, "node_id": "MDExOlB1bGxSZXF1ZXN0NTA1MDc2NDE4", "number": 5, "title": "Add fitbit-to-sqlite", "user": {"value": 4632208, "label": "mrphil007"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2020-10-16T20:04:05Z", "updated_at": "2020-10-16T20:04:05Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/dogsheep.github.io/pulls/5", "body": "", "repo": {"value": 214746582, "label": "dogsheep.github.io"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep.github.io/issues/5/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 925384329, "node_id": "MDExOlB1bGxSZXF1ZXN0NjczODcyOTc0", "number": 7, "title": "Add instagram-to-sqlite", "user": {"value": 36654812, "label": "gavindsouza"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2021-06-19T12:26:16Z", "updated_at": "2021-07-28T07:58:59Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/dogsheep.github.io/pulls/7", "body": "The tool covers only chat imports at the time of opening this PR but I'm planning to import everything else that I feel inquisitive about\r\n\r\nref: https://github.com/gavindsouza/instagram-to-sqlite", "repo": {"value": 214746582, "label": "dogsheep.github.io"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep.github.io/issues/7/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 927385540, "node_id": "MDU6SXNzdWU5MjczODU1NDA=", "number": 8, "title": "any guidance / experience on imessage-to-sqlite ?", "user": {"value": 2675621, "label": "Casyfill"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2021-06-22T15:46:16Z", "updated_at": "2021-06-22T15:46:16Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "", "repo": {"value": 214746582, "label": "dogsheep.github.io"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/dogsheep.github.io/issues/8/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 718934942, "node_id": "MDU6SXNzdWU3MTg5MzQ5NDI=", "number": 1, "title": "Documentation on how to use this with Datasette", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2020-10-11T21:56:27Z", "updated_at": "2020-10-11T22:14:00Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "In particular how to use `datasette-render-images` to see the images.", "repo": {"value": 303218369, "label": "evernote-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/1/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 892383270, "node_id": "MDExOlB1bGxSZXF1ZXN0NjQ1MTAwODQ4", "number": 12, "title": "Recovering of malformed ENEX file", "user": {"value": 8431437, "label": "engdan77"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2021-05-15T07:49:31Z", "updated_at": "2021-05-15T19:57:50Z", "closed_at": null, "author_association": "FIRST_TIMER", "pull_request": "dogsheep/evernote-to-sqlite/pulls/12", "body": "Hey .. Awesome work developing this project, that I found very useful to me and saved me some work.. Thanks.. :)\r\n\r\nSome background to this PR... \r\nI've been searching around for a tool allowing me to transforming my personal collection of Evernote notes to a format easier to search and potentially easier import to future services. \r\n\r\nNow I discovered problem processing my large data ~5GB using the existing source using Pythons builtin xml-parser that unfortunately was unable to succeed without exception breaking the process. \r\n\r\nMy first attempt I tried to adapt to more robust lxml package allowing huge data and with \"recover\", but even if it worked better it also failed processing the whole data. Even using the memory efficient etree.iterparse() it also unfortunately got into trouble.\r\n\r\nAnd with no luck finding any other libraries successfully parsing this enormous file I instead chose to build a \"hugexmlparser\" module that allows parsing this huge file using yield (on a byte-to-byte-level) and allows you to set a maximum size for to cater for potential malformed or undesirable large attachments to export, should succeed covering potential exceptions. Some cases found where the parses discover malformed XML within so also in those cases try to save as much as possible by escaping (to be dealt at a later stage, better than nothing), and if a missing end before new (malformed?) it would add this after encounter a new start-tag.\r\n\r\nThe code for the recovery process is a bit rough and for certain room for refactoring, but at the moment is seem to achieve what I wanted.\r\n\r\nNow with the above we pass this a minor changed version of save_note_recovery() assure the existing works.\r\nAlso adding this as a new recover-enex command to click and kept the original options. \r\nA couple of new tests was added as well to check against using this command.\r\n\r\nNow this currently works to me, but thought I might share a PR in such as you find use for this yourself or found useful to others finding this repository.\r\n\r\nAs a second step .. When the time allows it would have been nice to also be able to easily export from SQLite to formatted HTML/MD and attachments saved... but that might perhaps be better a separate project ... or if you or someone else have something that might shared to save some trouble, I would be interested ;-) ", "repo": {"value": 303218369, "label": "evernote-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/12/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 986829194, "node_id": "MDU6SXNzdWU5ODY4MjkxOTQ=", "number": 14, "title": "xml.etree.ElementTree.Parse Error - mismatched tag", "user": {"value": 46968, "label": "step21"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2021-09-02T14:46:36Z", "updated_at": "2021-09-02T14:53:11Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "This is an error message I get upon parsing the enex file of my Inbox. Please find the full error message below. Any hints welcome.\r\n\r\n```\r\nImporting from ENEX [##################------------------] 50% 00:00:50\r\nTraceback (most recent call last):\r\n File \"/Users/utopist/.virtualenvs/evernote-to-sqlite-Og2PIW3Y/bin/evernote-to-sqlite\", line 8, in \r\n sys.exit(cli())\r\n File \"/Users/utopist/.virtualenvs/evernote-to-sqlite-Og2PIW3Y/lib/python3.9/site-packages/click/core.py\", line 1137, in __call__\r\n return self.main(*args, **kwargs)\r\n File \"/Users/utopist/.virtualenvs/evernote-to-sqlite-Og2PIW3Y/lib/python3.9/site-packages/click/core.py\", line 1062, in main\r\n rv = self.invoke(ctx)\r\n File \"/Users/utopist/.virtualenvs/evernote-to-sqlite-Og2PIW3Y/lib/python3.9/site-packages/click/core.py\", line 1668, in invoke\r\n return _process_result(sub_ctx.command.invoke(sub_ctx))\r\n File \"/Users/utopist/.virtualenvs/evernote-to-sqlite-Og2PIW3Y/lib/python3.9/site-packages/click/core.py\", line 1404, in invoke\r\n return ctx.invoke(self.callback, **ctx.params)\r\n File \"/Users/utopist/.virtualenvs/evernote-to-sqlite-Og2PIW3Y/lib/python3.9/site-packages/click/core.py\", line 763, in invoke\r\n return __callback(*args, **kwargs)\r\n File \"/Users/utopist/.virtualenvs/evernote-to-sqlite-Og2PIW3Y/lib/python3.9/site-packages/evernote_to_sqlite/cli.py\", line 30, in enex\r\n for tag, note in find_all_tags(fp, [\"note\"], progress_callback=bar.update):\r\n File \"/Users/utopist/.virtualenvs/evernote-to-sqlite-Og2PIW3Y/lib/python3.9/site-packages/evernote_to_sqlite/utils.py\", line 17, in find_all_tags\r\n for event, el in parser.read_events():\r\n File \"/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/xml/etree/ElementTree.py\", line 1329, in read_events\r\n raise event\r\n File \"/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/xml/etree/ElementTree.py\", line 1301, in feed\r\n self._parser.feed(data)\r\nxml.etree.ElementTree.ParseError: mismatched tag: line 6837961, column 2\r\n```\r\n", "repo": {"value": 303218369, "label": "evernote-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/14/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1042759769, "node_id": "PR_kwDOEhK-wc4uAJb9", "number": 15, "title": "include note tags in the export", "user": {"value": 436138, "label": "d-rep"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2021-11-02T20:04:31Z", "updated_at": "2021-11-02T20:04:31Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/evernote-to-sqlite/pulls/15", "body": "When parsing the Evernote `` elements, the script will now also parse any nested `` elements, writing them out into a separate sqlite table.\r\n\r\nHere is an example of how to query the data after the script has run:\r\n```\r\nselect notes.*,\r\n\t(select group_concat(tag) from notes_tags where notes_tags.note_id=notes.id) as tags\r\nfrom notes;\r\n```\r\n\r\nMy .enex source file is 3+ years old so I am assuming the structure hasn't changed. Interestingly, my _notebook names_ show up in the _tags_ list where the tag name is prefixed with `notebook_`, so this could maybe help work around the first limitation mentioned in the [evernote-to-sqlite blog post](https://simonwillison.net/2020/Oct/16/building-evernote-sqlite-exporter/).\r\n", "repo": {"value": 303218369, "label": "evernote-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/15/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1943259395, "node_id": "I_kwDOEhK-wc5z08kD", "number": 16, "title": " time data '2014-11-21T11:44:12.000Z' does not match format '%Y%m%dT%H%M%SZ'", "user": {"value": 3746270, "label": "linonetwo"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-10-14T13:24:39Z", "updated_at": "2023-10-14T13:24:39Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "\r\n```\r\nevernote-to-sqlite enex evernote.db ./\u6211\u7684\u7b14\u8bb0.enex\r\nImporting from ENEX [#####-------------------------------] 14%\r\nTraceback (most recent call last):\r\n File \"/usr/local/bin/evernote-to-sqlite\", line 8, in \r\n sys.exit(cli())\r\n ^^^^^\r\n File \"/usr/local/lib/python3.11/site-packages/click/core.py\", line 1157, in __call__\r\n return self.main(*args, **kwargs)\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/usr/local/lib/python3.11/site-packages/click/core.py\", line 1078, in main\r\n rv = self.invoke(ctx)\r\n ^^^^^^^^^^^^^^^^\r\n File \"/usr/local/lib/python3.11/site-packages/click/core.py\", line 1688, in invoke\r\n return _process_result(sub_ctx.command.invoke(sub_ctx))\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/usr/local/lib/python3.11/site-packages/click/core.py\", line 1434, in invoke\r\n return ctx.invoke(self.callback, **ctx.params)\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/usr/local/lib/python3.11/site-packages/click/core.py\", line 783, in invoke\r\n return __callback(*args, **kwargs)\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/usr/local/lib/python3.11/site-packages/evernote_to_sqlite/cli.py\", line 31, in enex\r\n save_note(db, note)\r\n File \"/usr/local/lib/python3.11/site-packages/evernote_to_sqlite/utils.py\", line 46, in save_note\r\n \"created\": convert_datetime(created),\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/usr/local/lib/python3.11/site-packages/evernote_to_sqlite/utils.py\", line 111, in convert_datetime\r\n return datetime.datetime.strptime(s, \"%Y%m%dT%H%M%SZ\").isoformat()\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/usr/local/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/_strptime.py\", line 568, in _strptime_datetime\r\n tt, fraction, gmtoff_fraction = _strptime(data_string, format)\r\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\r\n File \"/usr/local/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/_strptime.py\", line 349, in _strptime\r\n raise ValueError(\"time data %r does not match format %r\" %\r\nValueError: time data '2014-11-21T11:44:12.000Z' does not match format '%Y%m%dT%H%M%SZ'\r\n```\r\n\r\nenex is exported by evernote mac client ", "repo": {"value": 303218369, "label": "evernote-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/16/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 718938889, "node_id": "MDU6SXNzdWU3MTg5Mzg4ODk=", "number": 5, "title": "Figure out how to display images from tags inline in Datasette", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 6, "created_at": "2020-10-11T22:17:03Z", "updated_at": "2020-10-16T20:16:28Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "Relates to #1. Evernote XML looks like this:\r\n\r\n```xml\r\n\r\n\r\n
This note includes two images.
\r\n
\r\n The Python logo\r\n
\r\n
\r\n \r\n
\r\n
\r\n The Evernote logo\r\n
\r\n
\r\n \r\n
\r\n
\r\n```\r\nThat hash is the md5 we use to store resources. It should be possible to turn these into embedded image tags, especially if done in conjunction with the https://github.com/simonw/datasette-media plugin.", "repo": {"value": 303218369, "label": "evernote-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/5/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 496415321, "node_id": "MDU6SXNzdWU0OTY0MTUzMjE=", "number": 1, "title": "Figure out some interesting example SQL queries", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 9, "created_at": "2019-09-20T15:28:07Z", "updated_at": "2021-05-03T03:46:23Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "My knowledge of genetics has left me short here. I'd love to be able to provide some interesting example SELECT queries - maybe one that spots if you are [likely to have red hair?](https://www.snpedia.com/index.php/Rs1805007)", "repo": {"value": 209590345, "label": "genome-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/genome-to-sqlite/issues/1/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 664793260, "node_id": "MDU6SXNzdWU2NjQ3OTMyNjA=", "number": 2, "title": "Yak shave", "user": {"value": 145425, "label": "ekg"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2020-07-23T22:04:18Z", "updated_at": "2020-07-23T22:04:18Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "Just a quick note... The 23andme data is not exactly your genome, but a SNP chip of your genome. It's \"some of your genotypes.\" Or about 0.1% of your genome. Nice work in any case! It deserves to be liberated!!!!!", "repo": {"value": 209590345, "label": "genome-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/genome-to-sqlite/issues/2/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 530491074, "node_id": "MDU6SXNzdWU1MzA0OTEwNzQ=", "number": 14, "title": "Command for importing events", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2019-11-29T21:28:58Z", "updated_at": "2020-04-14T19:38:34Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "Eg from https://api.github.com/users/simonw/events\r\n\r\nDocs here: https://developer.github.com/v3/activity/events/#list-events-performed-by-a-user", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/14/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 599776345, "node_id": "MDU6SXNzdWU1OTk3NzYzNDU=", "number": 24, "title": "Feature idea: github-to-sqlite everything ...", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2020-04-14T18:34:00Z", "updated_at": "2020-04-14T18:34:00Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "At the moment if you want to pull all your repos, issues, issues comments etc you have to do it with a sequence of separate commands.\r\n\r\nConsider adding a `everything` or `all` command which fetches everything that the tool knows how to fetch, and is designed to be run on a cron in a way that fetches just new stuff each time.", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/24/reactions\", \"total_count\": 7, \"+1\": 7, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 664485022, "node_id": "MDU6SXNzdWU2NjQ0ODUwMjI=", "number": 46, "title": "Feature: pull request reviews and comments", "user": {"value": 1326704, "label": "bhrutledge"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 6, "created_at": "2020-07-23T13:43:45Z", "updated_at": "2022-12-20T14:40:15Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "Hi there! I saw your [presentation at Boston Python](https://www.meetup.com/bostonpython/events/271887195). I'm already a light user of Datasette (thank you!), but wasn't aware of this project.\r\n\r\nI've been working on a \"pull request dashboard\" to get a comprehensive view of the state of open PR's, esp. related to reviews (i.e., pending, approved, changes requested). Currently it's a CLI command, but I thought a Datasette UI might be fun.\r\n\r\nI see that PR's are available from the `issues` command, but I don't see reviews anywhere. From the [API docs](https://docs.github.com/en/rest/reference/pulls#reviews), it looks like there are separate endpoints for those (as well as pull requests in general). What do you think about adding that? Would you accept a PR? Any sense of the level of effort?", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/46/reactions\", \"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 703216044, "node_id": "MDU6SXNzdWU3MDMyMTYwNDQ=", "number": 49, "title": "Feature: gists and starred gists", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2020-09-17T02:30:52Z", "updated_at": "2020-09-17T02:30:52Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "https://developer.github.com/v3/gists/#list-starred-gists", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/49/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 703218756, "node_id": "MDU6SXNzdWU3MDMyMTg3NTY=", "number": 50, "title": "Commands for making authenticated API calls", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 7, "created_at": "2020-09-17T02:39:07Z", "updated_at": "2020-10-19T05:01:29Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "Similar to `twitter-to-sqlite fetch`, see https://github.com/dogsheep/twitter-to-sqlite/issues/51", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/50/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 703246031, "node_id": "MDU6SXNzdWU3MDMyNDYwMzE=", "number": 51, "title": "github-to-sqlite should handle rate limits better", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 4, "created_at": "2020-09-17T04:01:50Z", "updated_at": "2022-10-14T16:34:07Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "From #50 - right now it will crash with an error of it hits the rate limit. Since the rate limit information (including reset time) is available in the headers it could automatically sleep and try again instead.", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/51/reactions\", \"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 753000405, "node_id": "MDU6SXNzdWU3NTMwMDA0MDU=", "number": 53, "title": "Command for fetching file contents", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2020-11-29T20:31:04Z", "updated_at": "2020-11-30T00:36:09Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "Something like this:\r\n\r\n github-to-sqlite files github.db simonw/datasette\r\n\r\nThis would fetch all files from the `main` branch into a `files` table.\r\n\r\nAdditional options could handle things like pulling files from a branch or tag, or just pulling files that match a specific glob or that exist in a specific directory.", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/53/reactions\", \"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 797097140, "node_id": "MDU6SXNzdWU3OTcwOTcxNDA=", "number": 60, "title": "Use Data from SQLite in other commands", "user": {"value": 22578954, "label": "daniel-butler"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2021-01-29T18:35:52Z", "updated_at": "2021-02-12T18:29:43Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": null, "body": "As a total beginner here how could you access data from the sqlite table to run other commands.\r\n\r\nWhat I am thinking is I want to get all the repos in an organization then using the repo list pull all the commit messages for each repo. \r\n\r\nI love this project by the way!", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/60/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 797784080, "node_id": "MDU6SXNzdWU3OTc3ODQwODA=", "number": 62, "title": "Stargazers and workflows commands always require an auth file when using GITHUB_TOKEN ", "user": {"value": 631242, "label": "frosencrantz"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2021-01-31T18:56:05Z", "updated_at": "2021-01-31T18:56:05Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": null, "body": "Requested fix in https://github.com/dogsheep/github-to-sqlite/pull/59\r\n\r\nThe stargazers and workflows commands always require an auth file, even when using a `GITHUB_TOKEN`. Other commands don't require the auth file.\r\n\r\n", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/62/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 897212458, "node_id": "MDU6SXNzdWU4OTcyMTI0NTg=", "number": 63, "title": "Ability to fetch commits from branches other than the default", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2021-05-20T17:58:08Z", "updated_at": "2021-05-20T17:58:08Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "This tool is currently almost entirely ignorant of the concept of branches. One example: you can't retrieve commits from any branch other than the default (usually main).", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/63/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 920636216, "node_id": "MDU6SXNzdWU5MjA2MzYyMTY=", "number": 64, "title": "feature: support \"events\"", "user": {"value": 231498, "label": "khimaros"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 5, "created_at": "2021-06-14T17:42:49Z", "updated_at": "2021-06-15T00:48:37Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "the GitHub API provides the ability to fetch all events for a given user, organization, or repository: https://docs.github.com/en/rest/reference/activity#list-events-for-the-authenticated-user\r\n\r\nthis would allow users to export all of the issue comments, new issues, etc. that they created. something which is currently missing from the GitHub takeout exports.", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/64/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 923270900, "node_id": "MDExOlB1bGxSZXF1ZXN0NjcyMDUzODEx", "number": 65, "title": "basic support for events", "user": {"value": 231498, "label": "khimaros"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2021-06-17T00:51:30Z", "updated_at": "2022-10-03T22:35:03Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/github-to-sqlite/pulls/65", "body": "a quick first pass at implementing the feature requested in https://github.com/dogsheep/github-to-sqlite/issues/64\r\n\r\ntesting instructions:\r\n\r\n```\r\n$ github-to-sqlite events events.db user/khimaros\r\n```\r\n\r\nif the specified user is the authenticated user, it will also include private events.\r\n\r\ncaveat: pagination appears to be broken (i don't see `next` in the response JSON from GitHub)", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/65/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 975161924, "node_id": "MDExOlB1bGxSZXF1ZXN0NzE2MzU3OTgy", "number": 66, "title": "Add --merged-by flag to pull-requests sub command", "user": {"value": 30531572, "label": "sarcasticadmin"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2021-08-20T00:57:55Z", "updated_at": "2021-09-28T21:50:31Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/github-to-sqlite/pulls/66", "body": "## Description\r\n\r\nProposing a solution to the API limitation for `merged_by` in pull_requests. Specifically the following called out in the readme:\r\n\r\n```\r\nNote that the merged_by column on the pull_requests table will only be populated for pull requests that are loaded using the --pull-request option - the GitHub API does not return this field for pull requests that are loaded in bulk.\r\n```\r\n\r\nThis approach might cause larger repos to hit rate limits called out in https://github.com/dogsheep/github-to-sqlite/issues/51 but seems to work well in the repos I tested and included below.\r\n\r\n## Old Behavior\r\n- Had to list out the pull-requests individually via multiple `--pull-request` flags\r\n\r\n## New Behavior\r\n\r\n- `--merged-by` flag for getting 'merge_by' information out of pull-requests without having to specify individual PR numbers.\r\n\r\n# Testing\r\n\r\nPicking some repo that has more than one merger (datasette only has 1 \ud83d\ude09 )\r\n\r\n```\r\n$ github-to-sqlite pull-requests ./github.db opnsense/tools --merged-by\r\n$ echo \"select id, url, merged_by from pull_requests;\" | sqlite3 ./github.db \r\n83533612|https://github.com/opnsense/tools/pull/39|1915288\r\n102632885|https://github.com/opnsense/tools/pull/43|1915288\r\n149114810|https://github.com/opnsense/tools/pull/57|1915288\r\n160394495|https://github.com/opnsense/tools/pull/64|1915288\r\n163308408|https://github.com/opnsense/tools/pull/67|1915288\r\n169723264|https://github.com/opnsense/tools/pull/69|1915288\r\n171381422|https://github.com/opnsense/tools/pull/72|1915288\r\n179938195|https://github.com/opnsense/tools/pull/77|1915288\r\n196233824|https://github.com/opnsense/tools/pull/82|1915288\r\n215289964|https://github.com/opnsense/tools/pull/93|\r\n219696100|https://github.com/opnsense/tools/pull/97|1915288\r\n223664843|https://github.com/opnsense/tools/pull/99|\r\n228446172|https://github.com/opnsense/tools/pull/103|1915288\r\n238930434|https://github.com/opnsense/tools/pull/110|1915288\r\n255507110|https://github.com/opnsense/tools/pull/119|1915288\r\n255980675|https://github.com/opnsense/tools/pull/120|1915288\r\n261906770|https://github.com/opnsense/tools/pull/125|\r\n263800503|https://github.com/opnsense/tools/pull/127|1915288\r\n264038685|https://github.com/opnsense/tools/pull/128|1915288\r\n264696704|https://github.com/opnsense/tools/pull/129|1915288\r\n266660547|https://github.com/opnsense/tools/pull/130|1915288\r\n273120409|https://github.com/opnsense/tools/pull/133|1915288\r\n274370803|https://github.com/opnsense/tools/pull/135|\r\n276600629|https://github.com/opnsense/tools/pull/139|\r\n277303655|https://github.com/opnsense/tools/pull/141|1915288\r\n293033714|https://github.com/opnsense/tools/pull/145|\r\n294827649|https://github.com/opnsense/tools/pull/146|\r\n295140008|https://github.com/opnsense/tools/pull/147|1915288\r\n305690829|https://github.com/opnsense/tools/pull/150|9783985\r\n307077931|https://github.com/opnsense/tools/pull/152|1915288\r\n321782100|https://github.com/opnsense/tools/pull/155|\r\n337265672|https://github.com/opnsense/tools/pull/160|\r\n337267484|https://github.com/opnsense/tools/pull/161|1915288\r\n368251763|https://github.com/opnsense/tools/pull/169|\r\n428262505|https://github.com/opnsense/tools/pull/181|\r\n437557011|https://github.com/opnsense/tools/pull/182|1915288\r\n447079893|https://github.com/opnsense/tools/pull/185|\r\n461822092|https://github.com/opnsense/tools/pull/191|\r\n463290142|https://github.com/opnsense/tools/pull/193|1915288\r\n470112962|https://github.com/opnsense/tools/pull/194|1915288\r\n472644649|https://github.com/opnsense/tools/pull/195|1915288\r\n488696898|https://github.com/opnsense/tools/pull/198|\r\n513289902|https://github.com/opnsense/tools/pull/201|\r\n522530265|https://github.com/opnsense/tools/pull/203|\r\n564443347|https://github.com/opnsense/tools/pull/213|\r\n597579516|https://github.com/opnsense/tools/pull/220|1915288\r\n602860357|https://github.com/opnsense/tools/pull/221|1915288\r\n608744738|https://github.com/opnsense/tools/pull/222|1915288\r\n623279673|https://github.com/opnsense/tools/pull/228|1915288\r\n664656182|https://github.com/opnsense/tools/pull/233|\r\n664781786|https://github.com/opnsense/tools/pull/234|1915288\r\n670683636|https://github.com/opnsense/tools/pull/235|1915288\r\n683150764|https://github.com/opnsense/tools/pull/237|\r\n685016233|https://github.com/opnsense/tools/pull/238|\r\n687099825|https://github.com/opnsense/tools/pull/239|1915288\r\n715705652|https://github.com/opnsense/tools/pull/244|1915288\r\n715721248|https://github.com/opnsense/tools/pull/245|1915288\r\n```\r\n`userid` are now present for those PRs that were merged.\r\n\r\nWithout the flag the `merged_by` behavior remains missing as expected when get PRs bulk:\r\n\r\n```\r\n$ github-to-sqlite pull-requests ./github.db opnsense/tools\r\n$ echo \"select id, url, merged_by from pull_requests;\" | sqlite3 ./github.db \r\n83533612|https://github.com/opnsense/tools/pull/39|\r\n102632885|https://github.com/opnsense/tools/pull/43|\r\n149114810|https://github.com/opnsense/tools/pull/57|\r\n160394495|https://github.com/opnsense/tools/pull/64|\r\n163308408|https://github.com/opnsense/tools/pull/67|\r\n169723264|https://github.com/opnsense/tools/pull/69|\r\n171381422|https://github.com/opnsense/tools/pull/72|\r\n179938195|https://github.com/opnsense/tools/pull/77|\r\n196233824|https://github.com/opnsense/tools/pull/82|\r\n215289964|https://github.com/opnsense/tools/pull/93|\r\n219696100|https://github.com/opnsense/tools/pull/97|\r\n223664843|https://github.com/opnsense/tools/pull/99|\r\n228446172|https://github.com/opnsense/tools/pull/103|\r\n238930434|https://github.com/opnsense/tools/pull/110|\r\n255507110|https://github.com/opnsense/tools/pull/119|\r\n255980675|https://github.com/opnsense/tools/pull/120|\r\n261906770|https://github.com/opnsense/tools/pull/125|\r\n263800503|https://github.com/opnsense/tools/pull/127|\r\n264038685|https://github.com/opnsense/tools/pull/128|\r\n264696704|https://github.com/opnsense/tools/pull/129|\r\n266660547|https://github.com/opnsense/tools/pull/130|\r\n273120409|https://github.com/opnsense/tools/pull/133|\r\n274370803|https://github.com/opnsense/tools/pull/135|\r\n276600629|https://github.com/opnsense/tools/pull/139|\r\n277303655|https://github.com/opnsense/tools/pull/141|\r\n293033714|https://github.com/opnsense/tools/pull/145|\r\n294827649|https://github.com/opnsense/tools/pull/146|\r\n295140008|https://github.com/opnsense/tools/pull/147|\r\n305690829|https://github.com/opnsense/tools/pull/150|\r\n307077931|https://github.com/opnsense/tools/pull/152|\r\n321782100|https://github.com/opnsense/tools/pull/155|\r\n337265672|https://github.com/opnsense/tools/pull/160|\r\n337267484|https://github.com/opnsense/tools/pull/161|\r\n368251763|https://github.com/opnsense/tools/pull/169|\r\n428262505|https://github.com/opnsense/tools/pull/181|\r\n437557011|https://github.com/opnsense/tools/pull/182|\r\n447079893|https://github.com/opnsense/tools/pull/185|\r\n461822092|https://github.com/opnsense/tools/pull/191|\r\n463290142|https://github.com/opnsense/tools/pull/193|\r\n470112962|https://github.com/opnsense/tools/pull/194|\r\n472644649|https://github.com/opnsense/tools/pull/195|\r\n488696898|https://github.com/opnsense/tools/pull/198|\r\n513289902|https://github.com/opnsense/tools/pull/201|\r\n522530265|https://github.com/opnsense/tools/pull/203|\r\n564443347|https://github.com/opnsense/tools/pull/213|\r\n597579516|https://github.com/opnsense/tools/pull/220|\r\n602860357|https://github.com/opnsense/tools/pull/221|\r\n608744738|https://github.com/opnsense/tools/pull/222|\r\n623279673|https://github.com/opnsense/tools/pull/228|\r\n664656182|https://github.com/opnsense/tools/pull/233|\r\n664781786|https://github.com/opnsense/tools/pull/234|\r\n670683636|https://github.com/opnsense/tools/pull/235|\r\n683150764|https://github.com/opnsense/tools/pull/237|\r\n685016233|https://github.com/opnsense/tools/pull/238|\r\n687099825|https://github.com/opnsense/tools/pull/239|\r\n715705652|https://github.com/opnsense/tools/pull/244|\r\n715721248|https://github.com/opnsense/tools/pull/245|\r\n```\r\n\r\nIndividual PRs passed via `--pull-request` flag behaves as expected (unchanged):\r\n\r\n```\r\n$ github-to-sqlite pull-requests ./github.db opnsense/tools --pull-request 39 --pull-request 237\r\n$ echo \"select id, url, merged_by from pull_requests;\" | sqlite3 ./github.db\r\n83533612|https://github.com/opnsense/tools/pull/39|1915288\r\n683150764|https://github.com/opnsense/tools/pull/237|\r\n```\r\n> Picking 1 PR that has a merged_by (39) and one that does not (237)", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/66/reactions\", \"total_count\": 3, \"+1\": 2, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 1, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 981690086, "node_id": "MDExOlB1bGxSZXF1ZXN0NzIxNjg2NzIx", "number": 67, "title": "Replacing step ID key with step_id", "user": {"value": 16374374, "label": "jshcmpbll"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2021-08-28T01:26:41Z", "updated_at": "2021-08-28T01:27:00Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/github-to-sqlite/pulls/67", "body": "Workflows that have an `id` in any step result in the following error when running `workflows`:\r\n\r\ne.g.`github-to-sqlite workflows github.db nixos/nixpkgs`\r\n\r\n```Traceback (most recent call last):\r\n File \"/usr/local/bin/github-to-sqlite\", line 8, in \r\n sys.exit(cli())\r\n File \"/usr/local/lib/python3.8/dist-packages/click/core.py\", line 1137, in __call__\r\n return self.main(*args, **kwargs)\r\n File \"/usr/local/lib/python3.8/dist-packages/click/core.py\", line 1062, in main\r\n rv = self.invoke(ctx)\r\n File \"/usr/local/lib/python3.8/dist-packages/click/core.py\", line 1668, in invoke```Traceback (most recent call last):\r\n File \"/usr/local/bin/github-to-sqlite\", line 8, in \r\n sys.exit(cli())\r\n File \"/usr/local/lib/python3.8/dist-packages/click/core.py\", line 1137, in __call__\r\n return self.main(*args, **kwargs)\r\n File \"/usr/local/lib/python3.8/dist-packages/click/core.py\", line 1062, in main\r\n rv = self.invoke(ctx)\r\n File \"/usr/local/lib/python3.8/dist-packages/click/core.py\", line 1668, in invoke\r\n return _process_result(sub_ctx.command.invoke(sub_ctx))\r\n File \"/usr/local/lib/python3.8/dist-packages/click/core.py\", line 1404, in invoke\r\n return ctx.invoke(self.callback, **ctx.params)\r\n File \"/usr/local/lib/python3.8/dist-packages/click/core.py\", line 763, in invoke\r\n return __callback(*args, **kwargs)\r\n File \"/usr/local/lib/python3.8/dist-packages/github_to_sqlite/cli.py\", line 601, in workflows\r\n utils.save_workflow(db, repo_id, filename, content)\r\n File \"/usr/local/lib/python3.8/dist-packages/github_to_sqlite/utils.py\", line 865, in save_workflow\r\n db[\"steps\"].insert_all(\r\n File \"/usr/local/lib/python3.8/dist-packages/sqlite_utils/db.py\", line 2596, in insert_all\r\n self.insert_chunk(\r\n File \"/usr/local/lib/python3.8/dist-packages/sqlite_utils/db.py\", line 2378, in insert_chunk\r\n result = self.db.execute(query, params)\r\n File \"/usr/local/lib/python3.8/dist-packages/sqlite_utils/db.py\", line 419, in execute\r\n return self.conn.execute(sql, parameters)\r\nsqlite3.IntegrityError: datatype mismatch\r\n```\r\n\r\n - [Information about the ID key in a step for GHA](https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#jobsjob_idstepsid)\r\n - [An example workflow from a public repo](https://github.com/NixOS/nixpkgs/blob/b4cc66827745e525ce7bb54659845ac89788a597/.github/workflows/direct-push.yml#L16)\r\n\r\n# Changes\r\nI'm proposing that the key for `id` in step is replaced with `step_id` so that it no longer interferes with the table `id` for tracking the record.\r\n\r\nSpecial thanks to @sarcasticadmin @egiffen and @ruebenramirez for helping a bit on this \ud83d\ude04 ", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/67/reactions\", \"total_count\": 1, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 1, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1013506559, "node_id": "PR_kwDODFdgUs4skaNS", "number": 68, "title": "Add support for retrieving teams / members", "user": {"value": 68329, "label": "philwills"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2021-10-01T15:55:02Z", "updated_at": "2021-10-01T15:59:53Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/github-to-sqlite/pulls/68", "body": "Adds a method for retrieving all the teams within an organisation and all the members in those teams. The latter is stored as a join table `team_members` beteween `teams` and `users`.", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/68/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1071071397, "node_id": "I_kwDODFdgUs4_10Cl", "number": 69, "title": "View that combines issues and issue comments", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2021-12-04T00:34:33Z", "updated_at": "2021-12-04T00:34:52Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "I want to see a reverse chronologically ordered interface onto both issues and comments - essentially a unified log of comments and issues opened across one or multiple projects.", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/69/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1149402080, "node_id": "PR_kwDODFdgUs4zaUta", "number": 70, "title": "scrape-dependents: enable paging through package menu option if present", "user": {"value": 36061055, "label": "stanbiryukov"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2022-02-24T15:07:25Z", "updated_at": "2022-02-24T15:07:25Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/github-to-sqlite/pulls/70", "body": "Some repos organize network dependents by a Package toggle. This PR adds the ability to page through those options and scrape underlying dependents.", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/70/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1211283427, "node_id": "I_kwDODFdgUs5IMrfj", "number": 72, "title": "feature: display progress bar when downloading multi-page responses", "user": {"value": 9020979, "label": "hydrosquall"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-04-21T16:37:12Z", "updated_at": "2022-04-21T17:29:31Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "## Motivation\r\n\r\nFor a long running command (longer than 1 minute) for a big table (like pull requests or commits), it can be tricky to know if the script is still running, or if a rate limit/error was encountered\r\n\r\nWe know how many pages there are, so it may be possible to indicate how many remain.\r\n\r\nhttps://github.com/dogsheep/github-to-sqlite/blob/a6e237f75a4b86963d91dcb5c9582e3a1b3349d6/github_to_sqlite/utils.py#L367\r\n\r\n## Resources\r\n\r\n- Using the existing Click API: \r\n - https://click.palletsprojects.com/en/5.x/utils/#showing-progress-bars\r\n- Loading spinner: https://github.com/pavdmyt/yaspin\r\n- Progress bar: https://github.com/tqdm/tqdm", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/72/reactions\", \"total_count\": 3, \"+1\": 3, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1363244199, "node_id": "I_kwDODFdgUs5RQXSn", "number": 75, "title": "Fetch repos doesn't support organisations", "user": {"value": 2757699, "label": "OverkillGuy"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2022-09-06T12:55:06Z", "updated_at": "2022-09-06T12:55:06Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "Say I want to get all my Github Org's repos info, for data analysis. Not just the public repos, but also the private/internal repos.\r\n\r\nThe endpoints are different for organisation, and this tool doesn't take it into account:\r\nhttps://github.com/dogsheep/github-to-sqlite/blob/ace13ec3d98090d99bd71871c286a4a612c96a50/github_to_sqlite/utils.py#L453\r\nhttps://github.com/dogsheep/github-to-sqlite/blob/ace13ec3d98090d99bd71871c286a4a612c96a50/github_to_sqlite/utils.py#L455\r\n\r\nThe endpoints for organisation repos is instead ([source](https://docs.github.com/en/rest/repos/repos#list-organization-repositories)):\r\n`url = \"https://api.github.com/orgs/{}/repos\".format(username)`\r\n\r\nLet's add support for organisations repo scraping.", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/75/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1363280254, "node_id": "PR_kwDODFdgUs4-cIa_", "number": 76, "title": "Add organization support to repos command", "user": {"value": 2757699, "label": "OverkillGuy"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-09-06T13:21:42Z", "updated_at": "2022-09-06T13:59:08Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/github-to-sqlite/pulls/76", "body": "New --organization flag to signify all given \"usernames\" are private\r\norgs. Adapts API URL to the organization path instead.\r\n\r\nNot the best implementation, but a first draft to talk around\r\n\r\nFixes #75 (badly, no tests, overly vague, untested)", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/76/reactions\", \"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1410548368, "node_id": "I_kwDODFdgUs5UE0KQ", "number": 77, "title": "Feature: Support GitHub discussions", "user": {"value": 631242, "label": "frosencrantz"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2022-10-16T16:53:38Z", "updated_at": "2022-10-16T16:53:38Z", "closed_at": null, "author_association": "CONTRIBUTOR", "pull_request": null, "body": "Hi @simonw I've been a happy user of this tool. Thank you for writing it and sharing it.\r\n\r\nI wanted to suggest a feature request to support Discussions. For example the VisiData project has discussions https://github.com/saulpw/visidata/discussions , and it would be useful if there was a way to pull that data into the database.\r\n\r\nHowever, I'm not offering a pull request.", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/77/reactions\", \"total_count\": 2, \"+1\": 2, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1505411725, "node_id": "I_kwDODFdgUs5ZusKN", "number": 78, "title": "self-hosted or corp github enterprise", "user": {"value": 549431, "label": "ebdavison"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2022-12-20T22:51:45Z", "updated_at": "2022-12-20T22:51:45Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "We use github enterprise at work and I would like to use this tool to pull info from that site rather than the public github.com instance. Is there an option for this? If not, can one be added for a custom repo URL?", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/78/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1570375808, "node_id": "I_kwDODFdgUs5dmgiA", "number": 79, "title": "Deploy demo job is failing due to rate limit", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2023-02-03T20:05:01Z", "updated_at": "2023-12-08T14:50:15Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "https://github.com/dogsheep/github-to-sqlite/actions/runs/4080058087/jobs/7032116511", "repo": {"value": 207052882, "label": "github-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/github-to-sqlite/issues/79/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 504720731, "node_id": "MDU6SXNzdWU1MDQ3MjA3MzE=", "number": 1, "title": "Add more details on how to request data from google takeout correctly.", "user": {"value": 1055831, "label": "dazzag24"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2019-10-09T15:17:34Z", "updated_at": "2019-10-09T15:17:34Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "The default is to download everything. This can result in an enormous amount of data when you only really need 2 types of data for now:\r\n\r\n- My Activity\r\n- Location History\r\n\r\nIn addition unless you specify that \"My Activity\" is downloaded in JSON format the default is HTML. This then causes the \r\n\r\n`google-takeout-to-sqlite my-activity takeout.db takeout.zip`\r\n\r\ncommand to fail as it only contains html files not json files.\r\n\r\nThanks", "repo": {"value": 206649770, "label": "google-takeout-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/1/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1123393829, "node_id": "I_kwDODFE5qs5C9aEl", "number": 10, "title": "sqlite3.OperationalError: no such table: main.my_activity", "user": {"value": 69208826, "label": "glxblt14"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2022-02-03T17:59:29Z", "updated_at": "2022-03-20T02:38:07Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "Hello,\r\nWhen i run the command `google-takeout-to-sqlite my-activity db.db takeout-20220203T174446Z-001.zip`, i get this error :\r\n```\r\nTraceback (most recent call last):\r\n File \"c:\\users\\julie\\appdata\\local\\programs\\python\\python39-32\\lib\\runpy.py\", line 197, in _run_module_as_main\r\n return _run_code(code, main_globals, None,\r\n File \"c:\\users\\julie\\appdata\\local\\programs\\python\\python39-32\\lib\\runpy.py\", line 87, in _run_code\r\n exec(code, run_globals)\r\n File \"C:\\Users\\julie\\AppData\\Local\\Programs\\Python\\Python39-32\\Scripts\\google-takeout-to-sqlite.exe\\__main__.py\", line 7, in \r\n File \"c:\\users\\julie\\appdata\\local\\programs\\python\\python39-32\\lib\\site-packages\\click\\core.py\", line 1128, in __call__\r\n return self.main(*args, **kwargs)\r\n File \"c:\\users\\julie\\appdata\\local\\programs\\python\\python39-32\\lib\\site-packages\\click\\core.py\", line 1053, in main\r\n rv = self.invoke(ctx)\r\n File \"c:\\users\\julie\\appdata\\local\\programs\\python\\python39-32\\lib\\site-packages\\click\\core.py\", line 1659, in invoke\r\n return _process_result(sub_ctx.command.invoke(sub_ctx))\r\n File \"c:\\users\\julie\\appdata\\local\\programs\\python\\python39-32\\lib\\site-packages\\click\\core.py\", line 1395, in invoke\r\n return ctx.invoke(self.callback, **ctx.params)\r\n File \"c:\\users\\julie\\appdata\\local\\programs\\python\\python39-32\\lib\\site-packages\\click\\core.py\", line 754, in invoke\r\n return __callback(*args, **kwargs)\r\n File \"c:\\users\\julie\\appdata\\local\\programs\\python\\python39-32\\lib\\site-packages\\google_takeout_to_sqlite\\cli.py\", line 31, in my_activity\r\n utils.save_my_activity(db, zf)\r\n File \"c:\\users\\julie\\appdata\\local\\programs\\python\\python39-32\\lib\\site-packages\\google_takeout_to_sqlite\\utils.py\", line 19, in save_my_activity\r\n db[\"my_activity\"].create_index([\"time\"])\r\n File \"c:\\users\\julie\\appdata\\local\\programs\\python\\python39-32\\lib\\site-packages\\sqlite_utils\\db.py\", line 629, in create_index\r\n self.db.conn.execute(sql)\r\nsqlite3.OperationalError: no such table: main.my_activity\r\n```\r\nThank you for your help\r\nSorry for my bad English\r\nEDIT: i used the json format", "repo": {"value": 206649770, "label": "google-takeout-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/10/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1250287607, "node_id": "PR_kwDODFE5qs44jvRV", "number": 11, "title": "Update README.md", "user": {"value": 11887, "label": "ashanan"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2022-05-27T03:13:59Z", "updated_at": "2022-05-27T03:13:59Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/google-takeout-to-sqlite/pulls/11", "body": "Fix typo", "repo": {"value": 206649770, "label": "google-takeout-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/11/reactions\", \"total_count\": 1, \"+1\": 1, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1557599877, "node_id": "I_kwDODFE5qs5c1xaF", "number": 12, "title": "location history changes", "user": {"value": 14809320, "label": "gerardrbentley"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-01-26T03:57:25Z", "updated_at": "2023-01-26T03:57:25Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "not sure if each download is unique, but I had to change some things to work with the takeout zip I made 2023-01-25\r\n\r\nfilename changed from \"Location History.json\" to \"Records.json\"\r\n\r\n`\"timestampMs\"` is not present, `\"timestamp\"` is roughly iso timestamp\r\n\r\n```py\r\ndef get_timestamp_ms(raw_timestamp):\r\n try:\r\n return datetime.datetime.strptime(raw_timestamp, \"%Y-%m-%dT%H:%M:%SZ\").timestamp()\r\n except ValueError:\r\n return datetime.datetime.strptime(raw_timestamp, \"%Y-%m-%dT%H:%M:%S.%fZ\").timestamp()\r\n\r\ndef save_location_history(db, zf):\r\n location_history = json.load(\r\n zf.open(\"Takeout/Location History/Records.json\")\r\n )\r\n db[\"location_history\"].upsert_all(\r\n (\r\n {\r\n \"id\": id_for_location_history(row),\r\n \"latitude\": row[\"latitudeE7\"] / 1e7,\r\n \"longitude\": row[\"longitudeE7\"] / 1e7,\r\n \"accuracy\": row[\"accuracy\"],\r\n \"timestampMs\": get_timestamp_ms(row[\"timestamp\"]),\r\n \"when\": row[\"timestamp\"],\r\n }\r\n for row in location_history[\"locations\"]\r\n ),\r\n pk=\"id\",\r\n )\r\n\r\n\r\ndef id_for_location_history(row):\r\n # We want an ID that is unique but can be sorted by in\r\n # date order - so we use the isoformat date + the first\r\n # 6 characters of a hash of the JSON\r\n first_six = hashlib.sha1(\r\n json.dumps(row, separators=(\",\", \":\"), sort_keys=True).encode(\"utf8\")\r\n ).hexdigest()[:6]\r\n return \"{}-{}\".format(\r\n row['timestamp'],\r\n first_six,\r\n )\r\n```\r\n\r\nexample locations from mine\r\n\r\n```json\r\n{\r\n \"latitudeE7\": 427220206,\r\n \"longitudeE7\": -923423972,\r\n \"accuracy\": 10,\r\n \"deviceTag\": -1312429967,\r\n \"deviceDesignation\": \"PRIMARY\",\r\n \"timestamp\": \"2019-01-08T23:31:50.867Z\"\r\n }\r\n```\r\n\r\n```json\r\n{\r\n \"latitudeE7\": 427011317,\r\n \"longitudeE7\": -923448300,\r\n \"accuracy\": 5,\r\n \"deviceTag\": -1312429967,\r\n \"deviceDesignation\": \"PRIMARY\",\r\n \"timestamp\": \"2019-01-08T23:33:53Z\"\r\n }, \r\n```", "repo": {"value": 206649770, "label": "google-takeout-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/12/reactions\", \"total_count\": 2, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 2}", "draft": null, "state_reason": null} {"id": 1884499674, "node_id": "PR_kwDODFE5qs5ZtYMc", "number": 13, "title": "use poetry for packages, asdf for versioning, and gh actions for ci", "user": {"value": 150855, "label": "iloveitaly"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2023-09-06T17:59:16Z", "updated_at": "2023-09-06T17:59:16Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/google-takeout-to-sqlite/pulls/13", "body": "- build: use poetry for package management, asdf for python version\n- build: cleanup poetry config, add keywords, ignore dist\n- ci: migrate circleci to gh actions\n- fix: dup method definition\n", "repo": {"value": 206649770, "label": "google-takeout-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/13/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 769376447, "node_id": "MDU6SXNzdWU3NjkzNzY0NDc=", "number": 2, "title": "killed by oomkiller on large location-history", "user": {"value": 231498, "label": "khimaros"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 2, "created_at": "2020-12-17T00:32:24Z", "updated_at": "2020-12-17T00:48:32Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "memory seems to grow unbounded and is oom-killed after about 20GB memory usage.\r\n\r\nthis is happening while loading a ~1GB uncompressed location history.", "repo": {"value": 206649770, "label": "google-takeout-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/2/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 769397742, "node_id": "MDU6SXNzdWU3NjkzOTc3NDI=", "number": 3, "title": "sqlite-utils error on takeout import", "user": {"value": 231498, "label": "khimaros"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2020-12-17T01:18:48Z", "updated_at": "2020-12-17T01:19:04Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "```\r\n$ google-takeout-to-sqlite my-activity takeout.db /path/to/zip\r\n...\r\nsqlite3.OperationalError: no such table: main.my_activity\r\n```\r\n\r\nthere is no table create in `utils.py`, unlike other importers such as github-to-sqlite\r\n\r\nadditionally, this package and hackernews-to-sqlite have conflicting `sqlite-utils` dep with datasette and dogsheep-beta", "repo": {"value": 206649770, "label": "google-takeout-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/3/reactions\", \"total_count\": 2, \"+1\": 2, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 778380836, "node_id": "MDU6SXNzdWU3NzgzODA4MzY=", "number": 4, "title": "Feature Request: Gmail", "user": {"value": 203343, "label": "Btibert3"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 5, "created_at": "2021-01-04T21:31:09Z", "updated_at": "2021-03-04T20:54:44Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "From takeout, I only exported my Gmail account. Ideally I could parse this into sqlite via this tool.", "repo": {"value": 206649770, "label": "google-takeout-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/4/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 813880401, "node_id": "MDExOlB1bGxSZXF1ZXN0NTc3OTUzNzI3", "number": 5, "title": "WIP: Add Gmail takeout mbox import", "user": {"value": 306240, "label": "UtahDave"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 25, "created_at": "2021-02-22T21:30:40Z", "updated_at": "2021-07-28T07:18:56Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/google-takeout-to-sqlite/pulls/5", "body": "WIP\r\n\r\nThis PR adds the ability to import emails from a Gmail mbox export from Google Takeout.\r\n\r\nThis is my first PR to a datasette/dogsheep repo. I've tested this on my personal Google Takeout mbox with ~520,000 emails going back to 2004. This took around ~20 minutes to process.\r\n\r\nTo provide some feedback on the progress of the import I added the \"rich\" python module. I'm happy to remove that if adding a dependency is discouraged. However, I think it makes a nice addition to give feedback on the progress of a long import.\r\n\r\nDo we want to log emails that have errors when trying to import them?\r\n\r\nDealing with encodings with emails is a bit tricky. I'm very open to feedback on how to deal with those better. As well as any other feedback for improvements.", "repo": {"value": 206649770, "label": "google-takeout-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 821841046, "node_id": "MDU6SXNzdWU4MjE4NDEwNDY=", "number": 6, "title": "Upgrade to latest sqlite-utils", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2021-03-04T07:21:54Z", "updated_at": "2021-03-04T07:22:51Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "This is pinned to v1 at the moment.", "repo": {"value": 206649770, "label": "google-takeout-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/6/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 930946817, "node_id": "MDU6SXNzdWU5MzA5NDY4MTc=", "number": 7, "title": "KeyError: 'accuracy' when processing Location History", "user": {"value": 403152, "label": "davidwilemski"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2021-06-27T14:39:43Z", "updated_at": "2021-06-27T14:39:43Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "I'm new to both the dogsheep tools and datasette but have been experimenting a bit the last few days and these are really cool tools!\r\n\r\nI encountered a problem running my Google location history through this tool running the latest release in a docker container:\r\n\r\n```\r\nTraceback (most recent call last):\r\n File \"/usr/local/bin/google-takeout-to-sqlite\", line 8, in \r\n sys.exit(cli())\r\n File \"/usr/local/lib/python3.9/site-packages/click/core.py\", line 829, in __call__\r\n return self.main(*args, **kwargs)\r\n File \"/usr/local/lib/python3.9/site-packages/click/core.py\", line 782, in main\r\n rv = self.invoke(ctx)\r\n File \"/usr/local/lib/python3.9/site-packages/click/core.py\", line 1259, in invoke\r\n return _process_result(sub_ctx.command.invoke(sub_ctx))\r\n File \"/usr/local/lib/python3.9/site-packages/click/core.py\", line 1066, in invoke\r\n return ctx.invoke(self.callback, **ctx.params)\r\n File \"/usr/local/lib/python3.9/site-packages/click/core.py\", line 610, in invoke\r\n return callback(*args, **kwargs)\r\n File \"/usr/local/lib/python3.9/site-packages/google_takeout_to_sqlite/cli.py\", line 49, in my_activity\r\n utils.save_location_history(db, zf)\r\n File \"/usr/local/lib/python3.9/site-packages/google_takeout_to_sqlite/utils.py\", line 27, in save_location_history\r\n db[\"location_history\"].upsert_all(\r\n File \"/usr/local/lib/python3.9/site-packages/sqlite_utils/db.py\", line 1105, in upsert_all\r\n return self.insert_all(\r\n File \"/usr/local/lib/python3.9/site-packages/sqlite_utils/db.py\", line 990, in insert_all\r\n chunk = list(chunk)\r\n File \"/usr/local/lib/python3.9/site-packages/google_takeout_to_sqlite/utils.py\", line 33, in \r\n \"accuracy\": row[\"accuracy\"],\r\nKeyError: 'accuracy'\r\n```\r\n\r\nIt looks like the tool assumes the `accuracy` key will be in every location history entry. \r\n\r\nMy first attempt at a local patch to get myself going was to convert accessing the `accuracy` key to a `.get` instead to hopefully make the row nullable but I wasn't quite sure what `sqlite_utils` would do there. That did work in that the import happened and so I was going to propose a patch that made that change but in updating the existing test to include an entry with a missing accuracy entry, I noticed the expected type of the field appeared to be changing to a string in the test (and from a quick scan through the sqlite_utils code, probably TEXT in the database). Given this change in column type, it seemed that opening an issue first before proposing a fix seemed warranted. It seems the schema would need to be explicitly specified if you wanted a nullable integer column.\r\n\r\nNow that I've done a successful import run using my initial fix of calling `.get` on the row dict, I can see with datasette that I only have 7 data points (out of ~250k) that have a null accuracy column. They are all from 2011-2012 in an import that includes points spanning ~2010-2016 so perhaps another approach might be to filter those entries out during import if it really is that infrequent?\r\n\r\nI'm happy to provide a PR for a fix but figured I'd ask about which direction is preferred first.", "repo": {"value": 206649770, "label": "google-takeout-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/7/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 954546309, "node_id": "MDExOlB1bGxSZXF1ZXN0Njk4NDIzNjY3", "number": 8, "title": "Add Gmail takeout mbox import (v2)", "user": {"value": 28565, "label": "maxhawkins"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 7, "created_at": "2021-07-28T07:05:32Z", "updated_at": "2023-09-08T01:22:49Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/google-takeout-to-sqlite/pulls/8", "body": "WIP\r\n\r\nThis PR builds on #5 to continue implementing gmail import support.\r\n\r\nBuilding on @UtahDave's work, these commits add a few performance and bug fixes:\r\n\r\n* Decreased memory overhead for import by manually parsing mbox headers.\r\n* Fixed error where some messages in the mbox would yield a row with NULL in all columns.\r\n\r\nI will send more commits to fix any errors I encounter as I run the importer on my personal takeout data.", "repo": {"value": 206649770, "label": "google-takeout-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1046887492, "node_id": "PR_kwDODFE5qs4uMsMJ", "number": 9, "title": "Removed space from filename My Activity.json", "user": {"value": 91880982, "label": "widadmogral"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2021-11-08T00:04:31Z", "updated_at": "2021-11-08T00:04:31Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/google-takeout-to-sqlite/pulls/9", "body": "File name from google takeout has no space. The code only runs without error if filename is \"MyActivity.json\" and not \"My Activity.json\". Is it a new change by Google?", "repo": {"value": 206649770, "label": "google-takeout-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/9/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 952179830, "node_id": "MDU6SXNzdWU5NTIxNzk4MzA=", "number": 2, "title": "Command for fetching Hacker News threads from the search API", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 4, "created_at": "2021-07-25T02:00:45Z", "updated_at": "2021-07-25T03:12:57Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "I want to be able to fetch every item for a domain, e.g. https://news.ycombinator.com/from?site=simonwillison.net", "repo": {"value": 248903544, "label": "hacker-news-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/2/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 952189173, "node_id": "MDU6SXNzdWU5NTIxODkxNzM=", "number": 3, "title": "Use HN algolia endpoint to retrieve trees", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2021-07-25T03:35:27Z", "updated_at": "2021-07-25T18:41:17Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "The `trees` command currently has to make a request for every single comment. Algolia have an endpoint that bundles the entire thread together into a single request.\r\n\r\n`https://hn.algolia.com/api/v1/items/ID`\r\n\r\nHere's an example that loads quickly, with about 50 comments: https://hn.algolia.com/api/v1/items/27941108\r\n\r\nIt doesn't appear to use pagination at all - if a thread is big then the response is big.\r\n\r\nI ran this search to find some stories with more than 1000 comments: https://hn.algolia.com/api/v1/search?tags=story&numericFilters=num_comments%3E=1000\r\n\r\nHere's one: https://news.ycombinator.com/item?id=25015967 with 4759 comments. Hitting the API takes 41s and returns 3.7 MB of JSON!\r\n```\r\nwget 'https://hn.algolia.com/api/v1/items/25015967' 0.03s user 0.04s system 0% cpu 41.368 total\r\n/tmp % ls -lah 25015967 \r\n-rw-r--r-- 1 simon wheel 3.7M Jul 24 20:31 25015967\r\n```", "repo": {"value": 248903544, "label": "hacker-news-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/3/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1205867842, "node_id": "I_kwDODtX3eM5H4BVC", "number": 4, "title": "Retrieve the top-level story for a comment", "user": {"value": 1755789, "label": "telotortium"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2022-04-15T20:25:39Z", "updated_at": "2022-04-15T20:25:39Z", "closed_at": null, "author_association": "NONE", "pull_request": null, "body": "I think that each comment inserted into the database should include a column `onstory` that contains the ID of the story on which the comment was made. This is exactly equivalent to the link after \"on:\" at the top of an HN comment page ([example](https://news.ycombinator.com/item?id=18358028)). We could do this either by directly retrieving the HTML page and using Beautiful Soup to find that link, or alternatively recurse up the tree in the Firebase API using the `parent` field (probably using `functools.lru_cache` in case a person has commented a bunch of times on the same story).", "repo": {"value": 248903544, "label": "hacker-news-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/4/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null} {"id": 1353418822, "node_id": "PR_kwDODtX3eM497MOV", "number": 5, "title": "The program fails when the user has no submissions", "user": {"value": 2467, "label": "fernand0"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 0, "created_at": "2022-08-28T17:25:45Z", "updated_at": "2022-08-28T17:25:45Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/hacker-news-to-sqlite/pulls/5", "body": "Tested with:\r\n \r\n hacker-news-to-sqlite user hacker-news.db fernand0\r\n\r\nResult:\r\n`\r\nTraceback (most recent call last):\r\n File \"/home/ftricas/.pyenv/versions/3.10.6/bin/hacker-news-to-sqlite\", line 8, in \r\n sys.exit(cli())\r\n File \"/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py\", line 1130, in __call__\r\n return self.main(*args, **kwargs)\r\n File \"/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py\", line 1055, in main\r\n rv = self.invoke(ctx)\r\n File \"/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py\", line 1657, in invoke\r\n return _process_result(sub_ctx.command.invoke(sub_ctx))\r\n File \"/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py\", line 1404, in invoke\r\n return ctx.invoke(self.callback, **ctx.params)\r\n File \"/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py\", line 760, in invoke\r\n return __callback(*args, **kwargs)\r\n File \"/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/hacker_news_to_sqlite/cli.py\", line 27, in user\r\n submitted = user.pop(\"submitted\", None) or []\r\nAttributeError: 'NoneType' object has no attribute 'pop'\r\n`\r\n\r\nThere is a problem of style with the patch (but not sure what to do) because with the new inicialization ( submitted = []) the part \r\n\r\n or []\r\n\r\nis not needed. Maybe there is a more adequate way of doing this.", "repo": {"value": 248903544, "label": "hacker-news-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/5/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 1641117021, "node_id": "PR_kwDODtX3eM5M66op", "number": 6, "title": "Add permalink virtual field to items table", "user": {"value": 1231935, "label": "xavdid"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2023-03-26T22:22:38Z", "updated_at": "2023-03-29T18:38:52Z", "closed_at": null, "author_association": "FIRST_TIME_CONTRIBUTOR", "pull_request": "dogsheep/hacker-news-to-sqlite/pulls/6", "body": "I added a virtual column (no storage overhead) to the output that easily links back to the source. It works nicely out of the box with datasette:\r\n\r\n![](https://cdn.zappy.app/faf43661d539ee0fee02c0421de22d65.png)\r\n\r\nI got bit a bit by https://github.com/simonw/sqlite-utils/issues/411, so I went with a manual `table_xinfo` and creating the table via execute. Happy to adjust if that issue moves, but this seems like it works.\r\n\r\nI also added my best-guess instructions for local development on this package. I'm shooting in the dark, so feel free to replace with how you work on it locally.", "repo": {"value": 248903544, "label": "hacker-news-to-sqlite"}, "type": "pull", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/6/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": 0, "state_reason": null} {"id": 727848625, "node_id": "MDU6SXNzdWU3Mjc4NDg2MjU=", "number": 12, "title": "Some workout columns should be float, not text", "user": {"value": 9599, "label": "simonw"}, "state": "open", "locked": 0, "assignee": null, "milestone": null, "comments": 4, "created_at": "2020-10-23T02:47:02Z", "updated_at": "2022-06-23T04:35:02Z", "closed_at": null, "author_association": "MEMBER", "pull_request": null, "body": "Columns `duration`, `totalDistance` and `totalEnergyBurned` should be converted to float.\r\n\r\nhttps://github.com/dogsheep/healthkit-to-sqlite/blob/71e36e1cf034b96de2a8e6652265d782d3fdf63b/healthkit_to_sqlite/utils.py#L50-L57", "repo": {"value": 197882382, "label": "healthkit-to-sqlite"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/12/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": null}