github

html_url	issue_url	id	node_id	user	created_at	updated_at	author_association	body	reactions	issue	performed_via_github_app
https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-1710380941	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8	1710380941	IC_kwDODFE5qs5l8leN	28565	2023-09-07T15:39:59Z	2023-09-07T15:39:59Z	NONE	> @maxhawkins curious why you didn't use the stdlib `mailbox` to parse the `mbox` files? Mailbox parses the entire mbox into memory. Using the lower level library lets us stream the emails in one at a time to support larger archives. Both libraries are in the stdlib.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	954546309
https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-1003437288	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8	1003437288	IC_kwDODFE5qs47zzzo	28565	2021-12-31T19:06:20Z	2021-12-31T19:06:20Z	NONE	> @maxhawkins how hard would it be to add an entry to the table that includes the HTML version of the email, if it exists? I just tried your PR branch on a very small mbox file, and it worked great. My use case is a research project and I need to access more than just the body plain text. Shouldn't be hard. The easiest way is probably to remove the `if body.content_type == "text/html"` clause from [utils.py:254](https://github.com/dogsheep/google-takeout-to-sqlite/pull/8/commits/8e6d487b697ce2e8ad885acf613a157bfba84c59#diff-25ad9dd1ced1b8bfc37fda8444819c803232c08891e4af3d4064aa205d8174eaR254) and just return the content directly without parsing.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	954546309
https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-896378525	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8	896378525	IC_kwDODFE5qs41baad	28565	2021-08-10T23:28:45Z	2021-08-10T23:28:45Z	NONE	I added parsing of text/html emails using BeautifulSoup. Around half of the emails in my archive don't include a text/plain payload so adding html parsing makes a good chunk of them searchable.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	954546309
https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-894581223	https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8	894581223	IC_kwDODFE5qs41Ujnn	28565	2021-08-07T00:57:48Z	2021-08-07T00:57:48Z	NONE	Just added two more fixes: * Added parsing for RFC 2047 encoded Unicode headers * Body is now stored as TEXT rather than a BLOB regardless of what order the messages are parsed in. I was able to run this on my Takeout export and everything seems to work fine. @simonw let me know if this looks good to merge.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	954546309
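
The comments above mention two techniques: streaming messages out of an mbox file one at a time with the lower-level stdlib `email` package, and decoding RFC 2047 encoded headers. The following is a minimal sketch of both ideas, not the code from the PR. The function names `iter_mbox_messages` and `decode_rfc2047` are hypothetical, and the splitter assumes that every line starting with `From ` begins a new message (that is, body lines are already escaped as `>From `).

```python
import email
from email.header import decode_header, make_header
from email.message import Message
from typing import BinaryIO, Iterator


def iter_mbox_messages(fp: BinaryIO) -> Iterator[Message]:
    """Yield parsed messages one at a time from a binary mbox stream.

    Hypothetical helper: only the lines of the current message are
    buffered, so memory use is bounded by the largest single message.
    """
    buffer = []
    for line in fp:
        # Assumption: a line starting with "From " is a message separator.
        if line.startswith(b"From "):
            if buffer:
                yield email.message_from_bytes(b"".join(buffer))
            buffer = []
        else:
            buffer.append(line)
    # Flush the final message, which has no separator after it.
    if buffer:
        yield email.message_from_bytes(b"".join(buffer))


def decode_rfc2047(value: str) -> str:
    """Decode RFC 2047 encoded words in a header value to a plain str."""
    return str(make_header(decode_header(value)))
```

A caller would open the archive in binary mode and loop over the generator, for example `for msg in iter_mbox_messages(open(path, "rb")): print(decode_rfc2047(msg["Subject"] or ""))`, where `path` is a placeholder for the mbox file location.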