github: issues: 252 rows where comments = 0 and state = "open" sorted by number

id	node_id	number ▼	title	user	state	created_at	updated_at	author_association	pull_request	body	repo	type	reactions	draft
504720731	MDU6SXNzdWU1MDQ3MjA3MzE=	1	Add more details on how to request data from google takeout correctly.	dazzag24 1055831	open	2019-10-09T15:17:34Z	2019-10-09T15:17:34Z	NONE		The default is to download everything. This can result in an enormous amount of data when you only really need two types of data for now: My Activity and Location History. In addition, unless you specify that "My Activity" is downloaded in JSON format, the default is HTML. This then causes the `google-takeout-to-sqlite my-activity takeout.db takeout.zip` command to fail, as the archive contains only HTML files, not JSON files. Thanks	google-takeout-to-sqlite 206649770	issue	{ "url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/1/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1353411865	I_kwDODEpn8M5Qq20Z	1	Problem with my user	fernand0 2467	open	2022-08-28T16:59:37Z	2022-08-28T16:59:37Z	NONE		If I call the program with: inaturalist-to-sqlite inaturalist.db ftricas the program exits with an error: Importing 36 observations Traceback (most recent call last): File "/home/ftricas/.pyenv/versions/3.10.6/bin/inaturalist-to-sqlite", line 8, in <module> sys.exit(cli()) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 1130, in __call__ return self.main(args, kwargs) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 1055, in main rv = self.invoke(ctx) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 1404, in invoke return ctx.invoke(self.callback, ctx.params) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 760, in invoke return __callback(args, **kwargs) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/inaturalist_to_sqlite/cli.py", line 51, in cli save_observation(observation, db) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/inaturalist_to_sqlite/utils.py", line 34, in save_observation db["observations"] File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/sqlite_utils/db.py", line 2965, in insert return self.insert_all( File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/sqlite_utils/db.py", line 3068, in insert_all self.create( File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/sqlite_utils/db.py", line 1564, in create self.db.create_table( File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/sqlite_utils/db.py", line 951, in create_table sql = self.create_table_sql( File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/sqlite_utils/db.py", line 765, in create_table_sql foreign_keys = self.resolve_foreign_keys(name, 
foreign_keys or []) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/sqlite_utils/db.py", line 702, in resolve_foreign_keys other_table = table.guess_foreign_table(column) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/sqlite_utils/db.py", line 2061, in guess_foreign_table raise NoObviousTable( sqlite_utils.db.NoObviousTable: No obvious foreign key table for column 'taxon' - tried ['taxon', 'taxons'] If I call the program with your user, everything seems to go well, and afterwards I can call the program with my own user without problems. Moreover, I can call the program again with my own user and everything goes well now. Additional info: the command `sqlite-utils tables inaturalist.db` shows that the correct name may be 'taxons'. There is another small problem, a warning: warnings.warn("urllib3 ({}) or chardet ({})/charset_normalizer ({}) doesn't match a supported "	inaturalist-to-sqlite 206202864	issue	{ "url": "https://api.github.com/repos/dogsheep/inaturalist-to-sqlite/issues/1/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
541274681	MDU6SXNzdWU1NDEyNzQ2ODE=	2	Add linkedin-to-sqlite	mnp 881925	open	2019-12-21T03:13:40Z	2019-12-21T03:13:40Z	NONE		There is an API available. https://developer.linkedin.com/docs/rest-api# At the minimum, I would think contact list and messages would be of interest.	dogsheep.github.io 214746582	issue	{ "url": "https://api.github.com/repos/dogsheep/dogsheep.github.io/issues/2/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
664793260	MDU6SXNzdWU2NjQ3OTMyNjA=	2	Yak shave	ekg 145425	open	2020-07-23T22:04:18Z	2020-07-23T22:04:18Z	NONE		Just a quick note... The 23andme data is not exactly your genome, but a SNP chip of your genome. It's "some of your genotypes." Or about 0.1% of your genome. Nice work in any case! It deserves to be liberated!!!!!	genome-to-sqlite 209590345	issue	{ "url": "https://api.github.com/repos/dogsheep/genome-to-sqlite/issues/2/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1485017981	I_kwDODEpn8M5Yg5N9	2	table identifications has no column named previous_observation_taxon	heaversm 520541	open	2022-12-08T16:47:17Z	2022-12-08T16:47:17Z	NONE		Installed successfully with pip and ran `inaturalist-to-sqlite inaturalist.db simonw` and got the error: `sqlite3.OperationalError: table identifications has no column named previous_observation_taxon`	inaturalist-to-sqlite 206202864	issue	{ "url": "https://api.github.com/repos/dogsheep/inaturalist-to-sqlite/issues/2/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
769397742	MDU6SXNzdWU3NjkzOTc3NDI=	3	sqlite-utils error on takeout import	khimaros 231498	open	2020-12-17T01:18:48Z	2020-12-17T01:19:04Z	NONE		`$ google-takeout-to-sqlite my-activity takeout.db /path/to/zip ... sqlite3.OperationalError: no such table: main.my_activity` there is no table create in `utils.py`, unlike other importers such as github-to-sqlite additionally, this package and hackernews-to-sqlite have conflicting `sqlite-utils` dep with datasette and dogsheep-beta	google-takeout-to-sqlite 206649770	issue	{ "url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/3/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1205867842	I_kwDODtX3eM5H4BVC	4	Retrieve the top-level story for a comment	telotortium 1755789	open	2022-04-15T20:25:39Z	2022-04-15T20:25:39Z	NONE		I think that each comment inserted into the database should include a column `onstory` that contains the ID of the story on which the comment was made. This is exactly equivalent to the link after "on:" at the top of an HN comment page (example). We could do this either by directly retrieving the HTML page and using Beautiful Soup to find that link, or alternatively recurse up the tree in the Firebase API using the `parent` field (probably using `functools.lru_cache` in case a person has commented a bunch of times on the same story).	hacker-news-to-sqlite 248903544	issue	{ "url": "https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/4/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
723499985	MDExOlB1bGxSZXF1ZXN0NTA1MDc2NDE4	5	Add fitbit-to-sqlite	mrphil007 4632208	open	2020-10-16T20:04:05Z	2020-10-16T20:04:05Z	FIRST_TIME_CONTRIBUTOR	dogsheep/dogsheep.github.io/pulls/5		dogsheep.github.io 214746582	pull	{ "url": "https://api.github.com/repos/dogsheep/dogsheep.github.io/issues/5/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1353418822	PR_kwDODtX3eM497MOV	5	The program fails when the user has no submissions	fernand0 2467	open	2022-08-28T17:25:45Z	2022-08-28T17:25:45Z	FIRST_TIME_CONTRIBUTOR	dogsheep/hacker-news-to-sqlite/pulls/5	Tested with: `hacker-news-to-sqlite user hacker-news.db fernand0` Result: Traceback (most recent call last): File "/home/ftricas/.pyenv/versions/3.10.6/bin/hacker-news-to-sqlite", line 8, in <module> sys.exit(cli()) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 1130, in __call__ return self.main(*args, **kwargs) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 1055, in main rv = self.invoke(ctx) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 1657, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 1404, in invoke return ctx.invoke(self.callback, **ctx.params) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 760, in invoke return __callback(*args, **kwargs) File "/home/ftricas/.pyenv/versions/3.10.6/lib/python3.10/site-packages/hacker_news_to_sqlite/cli.py", line 27, in user submitted = user.pop("submitted", None) or [] AttributeError: 'NoneType' object has no attribute 'pop' There is a style problem with the patch (I am not sure what to do about it): with the new initialization (`submitted = []`) the `or []` part is not needed. Maybe there is a more appropriate way of doing this.	hacker-news-to-sqlite 248903544	pull	{ "url": "https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/5/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1616440856	I_kwDOJHON9s5gWO4Y	5	Configure full text search	simonw 9599	open	2023-03-09T05:20:46Z	2023-03-09T05:20:46Z	MEMBER		FTS would be useful. Maybe even extract the plain text from the notes to make that index easier to create, rather than creating it against the HTML. Can use the `plaintext` property for that.	apple-notes-to-sqlite 611552758	issue	{ "url": "https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/5/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
689848827	MDU6SXNzdWU2ODk4NDg4Mjc=	6	ISO timestamps	simonw 9599	open	2020-09-01T06:16:42Z	2020-09-01T06:16:42Z	MEMBER		The `time_added`, `time_updated` and `time_read` columns currently store data like this: `September 19, 2019 - 00:30:30 UTC` Should use ISO instead, e.g. `2020-07-26T01:05:24+00:00`	pocket-to-sqlite 213286752	issue	{ "url": "https://api.github.com/repos/dogsheep/pocket-to-sqlite/issues/6/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
689850810	MDU6SXNzdWU2ODk4NTA4MTA=	6	Set up a demo instance	simonw 9599	open	2020-09-01T06:20:24Z	2020-09-01T06:20:24Z	MEMBER		Once I've got the Datasette plugin to a state where it's worth building a demo: #3 I can use data from my public https://github-to-sqlite.dogsheep.net/ demo plus the Pocket data subset I use for the demo in https://github.com/dogsheep/pocket-to-sqlite/issues/5 - I could pull in the https://dogsheep-photos.dogsheep.net/ photos data too.	dogsheep-beta 197431109	issue	{ "url": "https://api.github.com/repos/dogsheep/dogsheep-beta/issues/6/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
925384329	MDExOlB1bGxSZXF1ZXN0NjczODcyOTc0	7	Add instagram-to-sqlite	gavindsouza 36654812	open	2021-06-19T12:26:16Z	2021-07-28T07:58:59Z	FIRST_TIME_CONTRIBUTOR	dogsheep/dogsheep.github.io/pulls/7	The tool covers only chat imports at the time of opening this PR but I'm planning to import everything else that I feel inquisitive about ref: https://github.com/gavindsouza/instagram-to-sqlite	dogsheep.github.io 214746582	pull	{ "url": "https://api.github.com/repos/dogsheep/dogsheep.github.io/issues/7/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
930946817	MDU6SXNzdWU5MzA5NDY4MTc=	7	KeyError: 'accuracy' when processing Location History	davidwilemski 403152	open	2021-06-27T14:39:43Z	2021-06-27T14:39:43Z	NONE		I'm new to both the dogsheep tools and datasette but have been experimenting a bit the last few days and these are really cool tools! I encountered a problem running my Google location history through this tool running the latest release in a docker container: Traceback (most recent call last): File "/usr/local/bin/google-takeout-to-sqlite", line 8, in <module> sys.exit(cli()) File "/usr/local/lib/python3.9/site-packages/click/core.py", line 829, in __call__ return self.main(args, kwargs) File "/usr/local/lib/python3.9/site-packages/click/core.py", line 782, in main rv = self.invoke(ctx) File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1259, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1066, in invoke return ctx.invoke(self.callback, ctx.params) File "/usr/local/lib/python3.9/site-packages/click/core.py", line 610, in invoke return callback(args, **kwargs) File "/usr/local/lib/python3.9/site-packages/google_takeout_to_sqlite/cli.py", line 49, in my_activity utils.save_location_history(db, zf) File "/usr/local/lib/python3.9/site-packages/google_takeout_to_sqlite/utils.py", line 27, in save_location_history db["location_history"].upsert_all( File "/usr/local/lib/python3.9/site-packages/sqlite_utils/db.py", line 1105, in upsert_all return self.insert_all( File "/usr/local/lib/python3.9/site-packages/sqlite_utils/db.py", line 990, in insert_all chunk = list(chunk) File "/usr/local/lib/python3.9/site-packages/google_takeout_to_sqlite/utils.py", line 33, in <genexpr> "accuracy": row["accuracy"], KeyError: 'accuracy' It looks like the tool assumes the `accuracy` key will be in every location history entry. 
My first attempt at a local patch to get myself going was to convert accessing the `accuracy` key to a `.get` instead to hopefully make the row nullable but I wasn't quite sure what `sqlite_utils` would do there. That did work in that the import happened and so I was going to propose a patch that made that change but in updating the existing test to include an entry with a missing accuracy entry, I noticed the expected type of the field appeared to be changing to a string in the test (and from a quick scan through the sqlite_utils code, probably TEXT in the database). Given this change in column type, it seemed that opening an issue first before proposing a fix seemed warranted. It seems the schema would need to be explicitly specified if you wanted a nullable integer column. Now that I've done a successful import run using my initial fix of calling `.get` on the row dict, I can see with datasette that I only have 7 data points (out of ~250k) that have a null accuracy column. They are all from 2011-2012 in an import that includes points spanning ~2010-2016 so perhaps another approach might be to filter those entries out during import if it really is that infrequent? I'm happy to provide a PR for a fix but figured I'd ask about which direction is preferred first.	google-takeout-to-sqlite 206649770	issue	{ "url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/7/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
797728929	MDU6SXNzdWU3OTc3Mjg5Mjk=	8	QUESTION: extract full text	darribas 417363	open	2021-01-31T14:50:10Z	2021-01-31T14:50:10Z	NONE		This may be solved or a feature already, but I couldn't figure it out, is it possible to extract and store also full text from the saved pages? The same way that Pocket parses the text, it'd be amazing to be able to store (and thus make searchable later) the text. Thank you very much for the project, it's such an amazing idea!	pocket-to-sqlite 213286752	issue	{ "url": "https://api.github.com/repos/dogsheep/pocket-to-sqlite/issues/8/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
927385540	MDU6SXNzdWU5MjczODU1NDA=	8	any guidance / experience on imessage-to-sqlite ?	Casyfill 2675621	open	2021-06-22T15:46:16Z	2021-06-22T15:46:16Z	NONE			dogsheep.github.io 214746582	issue	{ "url": "https://api.github.com/repos/dogsheep/dogsheep.github.io/issues/8/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
673602857	MDU6SXNzdWU2NzM2MDI4NTc=	9	Define a view that displays photos correctly	simonw 9599	open	2020-08-05T14:53:39Z	2020-08-05T14:53:39Z	MEMBER		The `photos` table stores data like this: id \| createdAt \| source \| prefix \| suffix \| width \| height \| visibility \| created ▲ \| user -- \| -- \| -- \| -- \| -- \| -- \| -- \| -- \| -- \| -- 5e12c9708506bc000840262a \| January 06, 2020 - 05:45:20 UTC \| Swarm for iOS 1 \| https://fastly.4sqi.net/img/general/ \| /15889193_AXxGk4I1nbzUZuyYqObgbXdJNyEHiwj6AUDq0tPZWtw.jpg \| 1920 \| 1440 \| public \| 2020-01-06T05:45:20 \| 15889193 The photo URL can be derived from those pieces - define a SQL view which does that (using `datasette-json-html` to display the pictures)	swarm-to-sqlite 205429375	issue	{ "url": "https://api.github.com/repos/dogsheep/swarm-to-sqlite/issues/9/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1046887492	PR_kwDODFE5qs4uMsMJ	9	Removed space from filename My Activity.json	widadmogral 91880982	open	2021-11-08T00:04:31Z	2021-11-08T00:04:31Z	FIRST_TIME_CONTRIBUTOR	dogsheep/google-takeout-to-sqlite/pulls/9	File name from google takeout has no space. The code only runs without error if filename is "MyActivity.json" and not "My Activity.json". Is it a new change by Google?	google-takeout-to-sqlite 206649770	pull	{ "url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/9/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1617938730	I_kwDOJHON9s5gb8kq	9	Default to just storing plaintext, store HTML if `--html` is passed	simonw 9599	open	2023-03-09T20:19:06Z	2023-03-09T20:19:06Z	MEMBER		The full `body` version of the notes can get HUGE, due to embedded images. It turns out for my own purposes I'm usually happy with just the `plaintext` version. I'm tempted to say you don't get HTML unless you pass a `--html` option.	apple-notes-to-sqlite 611552758	issue	{ "url": "https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/9/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1250287607	PR_kwDODFE5qs44jvRV	11	Update README.md	ashanan 11887	open	2022-05-27T03:13:59Z	2022-05-27T03:13:59Z	FIRST_TIME_CONTRIBUTOR	dogsheep/google-takeout-to-sqlite/pulls/11	Fix typo	google-takeout-to-sqlite 206649770	pull	{ "url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/11/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
692202408	MDU6SXNzdWU2OTIyMDI0MDg=	12	Idea: maps and GeoJSON support	simonw 9599	open	2020-09-03T18:47:10Z	2020-09-04T01:45:03Z	MEMBER		It would be cool if the `display_sql` could return a column populated with GeoJSON which would then automatically be displayed on a map in the results (or maybe default JS would look for a `class="geojson"` element output by the `display` template), à la https://github.com/simonw/datasette-leaflet-geojson. Then I could render workout routes on a map, or Swarm checkin points.	dogsheep-beta 197431109	issue	{ "url": "https://api.github.com/repos/dogsheep/dogsheep-beta/issues/12/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
892383270	MDExOlB1bGxSZXF1ZXN0NjQ1MTAwODQ4	12	Recovering of malformed ENEX file	engdan77 8431437	open	2021-05-15T07:49:31Z	2021-05-15T19:57:50Z	FIRST_TIMER	dogsheep/evernote-to-sqlite/pulls/12	Hey, awesome work developing this project. I found it very useful and it saved me some work. Thanks :) Some background to this PR: I had been searching for a tool that would let me transform my personal collection of Evernote notes into a format that is easier to search and potentially easier to import into future services. I ran into a problem processing my large export (~5GB) with the existing code, which uses Python's built-in XML parser: it raised an exception that broke the process. In my first attempt I tried the more robust lxml package, which allows huge data and has a "recover" option, but even though it worked better it also failed to process the whole file. Even the memory-efficient etree.iterparse() unfortunately ran into trouble. Having had no luck finding any other library that could parse this enormous file, I instead chose to build a "hugexmlparser" module that parses the file using yield (at the byte level) and lets you set a maximum size for <note>, to cater for potentially malformed or undesirably large attachments; this should cover the potential exceptions. In some cases the parser discovers malformed XML within <content>; in those cases it tries to save as much as possible by escaping it (to be dealt with at a later stage, which is better than nothing). If an end </note> is missing before a new (malformed?) note, it adds one after encountering the new start tag. The code for the recovery process is a bit rough and certainly has room for refactoring, but at the moment it seems to achieve what I wanted. With the above in place, the notes are passed to save_note_recovery(), a slightly changed version of the existing save function, to ensure the existing code keeps working. I also added this as a new recover-enex command in click and kept the original options. 
A couple of new tests were added as well to check this command. This currently works for me, but I thought I would share a PR in case you find it useful yourself or it is useful to others who find this repository. As a second step, when time allows, it would be nice to also be able to easily export from SQLite to formatted HTML/MD with attachments saved, but that might be better as a separate project. If you or someone else has something that could be shared to save some trouble, I would be interested ;-)	evernote-to-sqlite 303218369	pull	{ "url": "https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/12/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1557599877	I_kwDODFE5qs5c1xaF	12	location history changes	gerardrbentley 14809320	open	2023-01-26T03:57:25Z	2023-01-26T03:57:25Z	NONE		not sure if each download is unique, but I had to change some things to work with the takeout zip I made 2023-01-25 filename changed from "Location History.json" to "Records.json" `"timestampMs"` is not present, `"timestamp"` is roughly iso timestamp ```py def get_timestamp_ms(raw_timestamp): try: return datetime.datetime.strptime(raw_timestamp, "%Y-%m-%dT%H:%M:%SZ").timestamp() except ValueError: return datetime.datetime.strptime(raw_timestamp, "%Y-%m-%dT%H:%M:%S.%fZ").timestamp() def save_location_history(db, zf): location_history = json.load( zf.open("Takeout/Location History/Records.json") ) db["location_history"].upsert_all( ( { "id": id_for_location_history(row), "latitude": row["latitudeE7"] / 1e7, "longitude": row["longitudeE7"] / 1e7, "accuracy": row["accuracy"], "timestampMs": get_timestamp_ms(row["timestamp"]), "when": row["timestamp"], } for row in location_history["locations"] ), pk="id", ) def id_for_location_history(row): # We want an ID that is unique but can be sorted by in # date order - so we use the isoformat date + the first # 6 characters of a hash of the JSON first_six = hashlib.sha1( json.dumps(row, separators=(",", ":"), sort_keys=True).encode("utf8") ).hexdigest()[:6] return "{}-{}".format( row['timestamp'], first_six, ) ``` example locations from mine `json { "latitudeE7": 427220206, "longitudeE7": -923423972, "accuracy": 10, "deviceTag": -1312429967, "deviceDesignation": "PRIMARY", "timestamp": "2019-01-08T23:31:50.867Z" }` `json { "latitudeE7": 427011317, "longitudeE7": -923448300, "accuracy": 5, "deviceTag": -1312429967, "deviceDesignation": "PRIMARY", "timestamp": "2019-01-08T23:33:53Z" },`	google-takeout-to-sqlite 206649770	issue	{ "url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/12/reactions", "total_count": 2, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, 
"heart": 0, "rocket": 0, "eyes": 2 }
1650981564	I_kwDOJHON9s5iZ_q8	12	Error running pytest	amlestin 14314871	open	2023-04-02T15:02:36Z	2023-04-02T15:07:10Z	NONE		______________________________________________________ ERROR collecting tests/test_apple_notes_to_sqlite.py _______________________________________________________ ImportError while importing test module '/Users/lol/development/apple-notes-to-sqlite/tests/test_apple_notes_to_sqlite.py'. Hint: make sure your test modules/packages have valid Python names. Traceback: /opt/homebrew/Cellar/python@3.9/3.9.16/Frameworks/Python.framework/Versions/3.9/lib/python3.9/importlib/__init__.py:127: in import_module return _bootstrap._gcd_import(name[level:], package, level) tests/test_apple_notes_to_sqlite.py:2: in <module> from apple_notes_to_sqlite.cli import cli, COUNT_SCRIPT, FOLDERS_SCRIPT E ModuleNotFoundError: No module named 'apple_notes_to_sqlite' Solution: This is likely a PYTHONPATH issue due to having pytest installed both globally and in the venv. We can guarantee the tests run by adding the current directory to sys.path automatically using `python -m pytest` The alternative is to activate the venv, install pytest, deactivate, then activate the venv again (https://stackoverflow.com/questions/35045038/how-do-i-use-pytest-with-virtualenv)	apple-notes-to-sqlite 611552758	issue	{ "url": "https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/12/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1650984552	PR_kwDOJHON9s5NbyYN	13	use universal command	amlestin 14314871	open	2023-04-02T15:10:54Z	2023-04-02T15:37:34Z	FIRST_TIME_CONTRIBUTOR	dogsheep/apple-notes-to-sqlite/pulls/13		apple-notes-to-sqlite 611552758	pull	{ "url": "https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/13/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1884499674	PR_kwDODFE5qs5ZtYMc	13	use poetry for packages, asdf for versioning, and gh actions for ci	iloveitaly 150855	open	2023-09-06T17:59:16Z	2023-09-06T17:59:16Z	FIRST_TIME_CONTRIBUTOR	dogsheep/google-takeout-to-sqlite/pulls/13	build: use poetry for package management, asdf for python version; build: cleanup poetry config, add keywords, ignore dist; ci: migrate circleci to gh actions; fix: dup method definition	google-takeout-to-sqlite 206649770	pull	{ "url": "https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/13/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1393330070	PR_kwDODD6af84__DNJ	14	Photo links	redmanmale 6782721	open	2022-10-01T09:44:15Z	2022-11-18T17:10:49Z	FIRST_TIME_CONTRIBUTOR	dogsheep/swarm-to-sqlite/pulls/14	Adds a new column to the `checkin_details` view for calculated photo links; supports multiple links, split by newline; creates the `events` table if there are no events in the history, to avoid SQL errors. Fixes #9.	swarm-to-sqlite 205429375	pull	{ "url": "https://api.github.com/repos/dogsheep/swarm-to-sqlite/issues/14/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1880968405	PR_kwDOJHON9s5ZhYny	14	fix: fix the problem of Chinese character garbling	barretlee 2698003	open	2023-09-04T23:48:28Z	2023-09-04T23:48:28Z	FIRST_TIME_CONTRIBUTOR	dogsheep/apple-notes-to-sqlite/pulls/14	The code spells the encoding name in two different ways, `mac_roman` and `macroman`. It is unclear whether one of them is a typo. When the content contains Chinese characters, the exported text is garbled. Changing the encoding to `utf8` fixes the issue.	apple-notes-to-sqlite 611552758	pull	{ "url": "https://api.github.com/repos/dogsheep/apple-notes-to-sqlite/issues/14/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
793907673	MDExOlB1bGxSZXF1ZXN0NTYxNTEyNTAz	15	added try / except to write_records	ryancheley 9857779	open	2021-01-26T03:56:21Z	2021-01-26T03:56:21Z	FIRST_TIME_CONTRIBUTOR	dogsheep/healthkit-to-sqlite/pulls/15	to keep the data write from failing if it came across an error during processing. In particular when trying to convert my HealthKit zip file (and that of my wife's) it would consistently error out with the following: ``` db.py 1709 insert_chunk result = self.db.execute(query, params) db.py 226 execute return self.conn.execute(sql, parameters) sqlite3.OperationalError: too many SQL variables db.py 1709 insert_chunk result = self.db.execute(query, params) db.py 226 execute return self.conn.execute(sql, parameters) sqlite3.OperationalError: too many SQL variables db.py 1709 insert_chunk result = self.db.execute(query, params) db.py 226 execute return self.conn.execute(sql, parameters) sqlite3.OperationalError: table rBodyMass has no column named metadata_HKWasUserEntered healthkit-to-sqlite 8 <module> sys.exit(cli()) core.py 829 call return self.main(args, kwargs) core.py 782 main rv = self.invoke(ctx) core.py 1066 invoke return ctx.invoke(self.callback, ctx.params) core.py 610 invoke return callback(args,* *kwargs) cli.py 57 cli convert_xml_to_sqlite(fp, db, progress_callback=bar.update, zipfile=zf) utils.py 42 convert_xml_to_sqlite write_records(records, db) utils.py 143 write_records db[table].insert_all( db.py 1899 insert_all self.insert_chunk( db.py 1720 insert_chunk self.insert_chunk( db.py 1720 insert_chunk self.insert_chunk( db.py 1714 insert_chunk result = self.db.execute(query, params) db.py 226 execute return self.conn.execute(sql, parameters) sqlite3.OperationalError: table rBodyMass has no column named metadata_HKWasUserEntered ``` Adding the try / except in the `write_records` seems to fix that issue.	
healthkit-to-sqlite 197882382	pull	{ "url": "https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/15/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1042759769	PR_kwDOEhK-wc4uAJb9	15	include note tags in the export	d-rep 436138	open	2021-11-02T20:04:31Z	2021-11-02T20:04:31Z	FIRST_TIME_CONTRIBUTOR	dogsheep/evernote-to-sqlite/pulls/15	When parsing the Evernote `<note>` elements, the script will now also parse any nested `<tag>` elements, writing them out into a separate sqlite table. Here is an example of how to query the data after the script has run: `select notes.*, (select group_concat(tag) from notes_tags where notes_tags.note_id=notes.id) as tags from notes;` My .enex source file is 3+ years old so I am assuming the structure hasn't changed. Interestingly, my notebook names show up in the tags list where the tag name is prefixed with `notebook_`, so this could maybe help work around the first limitation mentioned in the evernote-to-sqlite blog post.	evernote-to-sqlite 303218369	pull	{ "url": "https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/15/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
505673645	MDU6SXNzdWU1MDU2NzM2NDU=	16	Do a better job with archived direct message threads	simonw 9599	open	2019-10-11T06:55:21Z	2019-10-11T06:55:27Z	MEMBER		https://github.com/dogsheep/twitter-to-sqlite/blob/fb2698086d766e0333a55bb73435e7283feeb438/twitter_to_sqlite/archive.py#L98-L99	twitter-to-sqlite 206156866	issue	{ "url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/16/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
830901133	MDExOlB1bGxSZXF1ZXN0NTkyMzY0MjU1	16	Add a fallback ID, print if no ID found	n8henrie 1234956	open	2021-03-13T13:38:29Z	2021-03-13T14:44:04Z	FIRST_TIME_CONTRIBUTOR	dogsheep/healthkit-to-sqlite/pulls/16	Fixes https://github.com/dogsheep/healthkit-to-sqlite/issues/14	healthkit-to-sqlite 197882382	pull	{ "url": "https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/16/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1943259395	I_kwDOEhK-wc5z08kD	16	time data '2014-11-21T11:44:12.000Z' does not match format '%Y%m%dT%H%M%SZ'	linonetwo 3746270	open	2023-10-14T13:24:39Z	2023-10-14T13:24:39Z	NONE		evernote-to-sqlite enex evernote.db ./我的笔记.enex Importing from ENEX [#####-------------------------------] 14% Traceback (most recent call last): File "/usr/local/bin/evernote-to-sqlite", line 8, in <module> sys.exit(cli()) ^^^^^ File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1157, in __call__ return self.main(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1078, in main rv = self.invoke(ctx) ^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1688, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1434, in invoke return ctx.invoke(self.callback, ctx.params) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/click/core.py", line 783, in invoke return __callback(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/evernote_to_sqlite/cli.py", line 31, in enex save_note(db, note) File "/usr/local/lib/python3.11/site-packages/evernote_to_sqlite/utils.py", line 46, in save_note "created": convert_datetime(created), ^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/evernote_to_sqlite/utils.py", line 111, in convert_datetime return datetime.datetime.strptime(s, "%Y%m%dT%H%M%SZ").isoformat() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/_strptime.py", line 568, in _strptime_datetime tt, fraction, gmtoff_fraction = _strptime(data_string, format) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File 
"/usr/local/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/_strptime.py", line 349, in _strptime raise ValueError("time data %r does not match format %r" % ValueError: time data '2014-11-21T11:44:12.000Z' does not match format '%Y%m%dT%H%M%SZ' enex is exported by evernote mac client	evernote-to-sqlite 303218369	issue	{ "url": "https://api.github.com/repos/dogsheep/evernote-to-sqlite/issues/16/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
836063389	MDU6SXNzdWU4MzYwNjMzODk=	17	Datetime columns are not properly formatted to be recognized as datetime	n8henrie 1234956	open	2021-03-19T14:33:04Z	2021-03-19T14:33:04Z	NONE		Currently, the datetimes are formatted in a way that is not recognized by datasette-vega for plotting with a `Date/time` type for the axis. For example, if you have datasette running locally with `datasette-vega` installed and have a database that includes resting heart rate: `http://localhost:8001/healthkit/rRestingHeartRate#g.mark=line&g.x_column=startDate&g.x_type=temporal&g.y_column=value&g.y_type=quantitative` The plot is blank unless you choose `Label` as the type for the date data. The `startDate` (and `creationDate` and `endDate`) columns appear like: `2019-11-14 18:22:18 -0700` If instead the format for this column is changed slightly: `2019-11-14T18:22:18-07:00` they are recognized as proper dates and the charting works as expected. I have a PR that addresses this issue, will submit shortly.	healthkit-to-sqlite 197882382	issue	{ "url": "https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/17/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
836064851	MDExOlB1bGxSZXF1ZXN0NTk2NjI3Nzgw	18	Add datetime parsing	n8henrie 1234956	open	2021-03-19T14:34:22Z	2021-03-19T14:34:22Z	FIRST_TIME_CONTRIBUTOR	dogsheep/healthkit-to-sqlite/pulls/18	Parses the datetime columns so they are subsequently properly recognized as datetime. Fixes https://github.com/dogsheep/healthkit-to-sqlite/issues/17	healthkit-to-sqlite 197882382	pull	{ "url": "https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/18/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
697162939	MDU6SXNzdWU2OTcxNjI5Mzk=	20	Add more tags so people can find your project.	ran88dom99 7902810	open	2020-09-09T21:14:09Z	2020-09-09T21:14:09Z	NONE		quantified-self habit-tracking google-fit time-tracking wearables quantifiedself for example	dogsheep-beta 197431109	issue	{ "url": "https://api.github.com/repos/dogsheep/dogsheep-beta/issues/20/reactions", "total_count": 1, "+1": 0, "-1": 1, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1515717718	PR_kwDOC8tyDs5Gc-VH	23	Include workout statistics	badboy 2129	open	2023-01-01T17:29:57Z	2023-01-01T17:29:57Z	FIRST_TIME_CONTRIBUTOR	dogsheep/healthkit-to-sqlite/pulls/23	Not sure when this changed (iOS 16 maybe?), but the `WorkoutStatistics` now has a whole bunch of information about workouts, e.g. for runs it contains the distance (as a `<WorkoutStatistics type="HKQuantityTypeIdentifierDistanceWalkingRunning ...>` element). Adding it as another column at least allows me to pull these out (using SQLite's JSON support). I'm running with this patch on my own data now.	healthkit-to-sqlite 197882382	pull	{ "url": "https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/23/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
599776345	MDU6SXNzdWU1OTk3NzYzNDU=	24	Feature idea: github-to-sqlite everything ...	simonw 9599	open	2020-04-14T18:34:00Z	2020-04-14T18:34:00Z	MEMBER		At the moment if you want to pull all your repos, issues, issues comments etc you have to do it with a sequence of separate commands. Consider adding a `everything` or `all` command which fetches everything that the tool knows how to fetch, and is designed to be run on a cron in a way that fetches just new stuff each time.	github-to-sqlite 207052882	issue	{ "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/24/reactions", "total_count": 7, "+1": 7, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
621486115	MDU6SXNzdWU2MjE0ODYxMTU=	27	photos_with_apple_metadata view should include labels	simonw 9599	open	2020-05-20T06:06:17Z	2020-05-20T06:06:17Z	MEMBER		https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?place_city=New+Orleans&_facet=place_city&_facet_array=albums&_facet_array=persons Here's one way to add that: `sql select rowid, photo, ( select json_group_array( json_object( 'label', normalized_string, 'href', '/photos/labelled?_hide_sql=1&label=' \|\| normalized_string ) ) from labels where labels.uuid = photos_with_apple_metadata.uuid ) as labels, date,`	dogsheep-photos 256834907	issue	{ "url": "https://api.github.com/repos/dogsheep/dogsheep-photos/issues/27/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
709789634	MDU6SXNzdWU3MDk3ODk2MzQ=	27	Sort order is not persisted by facet filter links	simonw 9599	open	2020-09-27T18:22:07Z	2020-09-27T18:22:07Z	MEMBER		A link to `/-/beta?category=1&timestamp__date=2018-08-01&q=swedish` should be to `/-/beta?category=1&timestamp__date=2018-08-01&q=swedish&sort=newest`	dogsheep-beta 197431109	issue	{ "url": "https://api.github.com/repos/dogsheep/dogsheep-beta/issues/27/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
655974395	MDExOlB1bGxSZXF1ZXN0NDQ4MzU1Njgw	30	Handle empty bucket on first upload. Allow specifying the endpoint_url for services other than S3 (like b2 and digitalocean spaces)	scanner 110038	open	2020-07-13T16:15:26Z	2020-07-13T16:15:26Z	FIRST_TIME_CONTRIBUTOR	dogsheep/dogsheep-photos/pulls/30	Finally got around to trying dogsheep-photos but I want to use backblaze's b2 service instead of AWS S3. Had to add a way to optionally specify the endpoint_url to connect to. Then with the bucket being empty the initial key retrieval would fail. Probably a better way to see that the bucket is empty than doing a test inside the paginator loop. Also probably a better way to specify the endpoint_url as we get and test for it twice using the same code in two different places but did not want to spend too much time worrying about it.	dogsheep-photos 256834907	pull	{ "url": "https://api.github.com/repos/dogsheep/dogsheep-photos/issues/30/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
836923194	MDU6SXNzdWU4MzY5MjMxOTQ=	32	JSON API for search results	simonw 9599	open	2021-03-20T22:21:36Z	2021-03-20T22:21:36Z	MEMBER		Refs https://github.com/simonw/datasette/issues/878	dogsheep-beta 197431109	issue	{ "url": "https://api.github.com/repos/dogsheep/dogsheep-beta/issues/32/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
268110769	MDU6SXNzdWUyNjgxMTA3Njk=	33	Use locust for benchmarking and load tests	simonw 9599	open	2017-10-24T17:00:09Z	2017-12-10T03:12:16Z	OWNER		https://github.com/locustio/locust Needed for #32	datasette 107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/33/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
830283447	MDU6SXNzdWU4MzAyODM0NDc=	34	bucket name	dsisnero 6213	open	2021-03-12T16:40:57Z	2021-03-12T16:40:57Z	NONE		I followed the instructions to set up credentials but I am getting an invalid bucket name. Can you put a sample auth.json file in the base that shows the correct format for this? Thanks	dogsheep-photos 256834907	issue	{ "url": "https://api.github.com/repos/dogsheep/dogsheep-photos/issues/34/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
983221851	MDU6SXNzdWU5ODMyMjE4NTE=	34	Data folder as index command parameter	humrochagf 1223625	open	2021-08-30T21:29:33Z	2021-08-30T21:29:33Z	NONE		Hi, First of all, thank you for this wonderful project :smile: I started to use dogsheep to make my personal data searchable, and by using the project I noticed an issue with the index command. It always expects you to run it from the root folder where the data is located, so I got some errors while trying to make it work on my setup. I keep all databases inside a `data` folder (I published my setup to make it easier to follow: https://github.com/humrochagf/my-dogsheep) Before, I configured `dogsheep.yml` to add the data folder to its path like this: `yml data/twitter.db: tweets: sql: \|- ...` And ran the index command like this: `dogsheep-beta index data/dogsheep.db dogsheep.yml` This worked for the normal search feature with no problem, but when I started adding `display_sql` rules the app started to crash, because in datasette's `get_database` it was looking for `data/twitter` and it only had a db called `twitter` there. So my workaround was to cd into the data folder and run the indexer. You can check the way I'm doing it at this line of the makefile: https://github.com/humrochagf/my-dogsheep/blob/main/makefile#L3 It works but it would be nice to have an option to pass the path where the data is located to the index function.	dogsheep-beta 197431109	issue	{ "url": "https://api.github.com/repos/dogsheep/dogsheep-beta/issues/34/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
987985935	MDExOlB1bGxSZXF1ZXN0NzI2OTkwNjgw	35	Support for Datasette's --base-url setting	brandonrobertz 2670795	open	2021-09-03T17:47:45Z	2021-09-03T17:47:45Z	FIRST_TIME_CONTRIBUTOR	dogsheep/dogsheep-beta/pulls/35	This makes it so you can use Dogsheep if you're using Datasette with the `--base-url /some-path/` setting.	dogsheep-beta 197431109	pull	{ "url": "https://api.github.com/repos/dogsheep/dogsheep-beta/issues/35/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1751214236	I_kwDOC8SPRc5oYWic	36	Getting sqlite_master may not be modified when creating dogsheep index	khushmeeet 8711912	open	2023-06-11T03:21:53Z	2023-06-11T03:21:53Z	NONE		When creating a `dogsheep` index from `config.yml` file on pocket.db (created using pocket-to-sqlite), I am getting this error Traceback (most recent call last): File "/Users/khushmeeet/.pyenv/versions/3.11.2/bin/dogsheep-beta", line 8, in <module> sys.exit(cli()) ^^^^^ File "/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py", line 1130, in __call__ return self.main(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py", line 1055, in main rv = self.invoke(ctx) ^^^^^^^^^^^^^^^^ File "/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py", line 1657, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py", line 1404, in invoke return ctx.invoke(self.callback, ctx.params) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py", line 760, in invoke return __callback(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/dogsheep_beta/cli.py", line 36, in index run_indexer( File "/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/dogsheep_beta/utils.py", line 32, in run_indexer ensure_table_and_indexes(db, tokenize) File "/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/dogsheep_beta/utils.py", line 91, in ensure_table_and_indexes table.add_foreign_key(fk) File "/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/sqlite_utils/db.py", line 2155, in add_foreign_key self.db.add_foreign_keys([(self.name, column, other_table, 
other_column)]) File "/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/sqlite_utils/db.py", line 1116, in add_foreign_keys cursor.execute( sqlite3.OperationalError: table sqlite_master may not be modified Command I ran to get this error `dogsheep-beta index pocket.db config.yml` Dogsheep version `dogsheep-beta, version 0.10.2` Python version `Python 3.11.2`	dogsheep-beta 197431109	issue	{ "url": "https://api.github.com/repos/dogsheep/dogsheep-beta/issues/36/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1293698966	PR_kwDOD079W84600uh	37	Fix former command name in readme	DanLipsitt 578773	open	2022-07-05T02:09:13Z	2022-07-05T02:09:13Z	FIRST_TIME_CONTRIBUTOR	dogsheep/dogsheep-photos/pulls/37	Looks like a previous commit missed a `photo-to-sqlite`→ `dogsheep-photos` replacement.	dogsheep-photos 256834907	pull	{ "url": "https://api.github.com/repos/dogsheep/dogsheep-photos/issues/37/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1888477283	I_kwDOC8SPRc5wj-Bj	38	Run `rebuild_fts` after building the index	simonw 9599	open	2023-09-08T23:17:45Z	2023-09-08T23:17:45Z	MEMBER		In: - https://github.com/simonw/datasette.io/issues/152#issuecomment-1712323347 This turned out to be the fix: `bash dogsheep-beta index dogsheep-index.db templates/dogsheep-beta.yml sqlite-utils rebuild-fts dogsheep-index.db`	dogsheep-beta 197431109	issue	{ "url": "https://api.github.com/repos/dogsheep/dogsheep-beta/issues/38/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1827436260	PR_kwDOD079W85WtVyk	39	Missing option in datasette instructions	coldclimate 319473	open	2023-07-29T10:34:48Z	2023-07-29T10:34:48Z	FIRST_TIME_CONTRIBUTOR	dogsheep/dogsheep-photos/pulls/39	Gotta tell it where to look	dogsheep-photos 256834907	pull	{ "url": "https://api.github.com/repos/dogsheep/dogsheep-photos/issues/39/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
703216044	MDU6SXNzdWU3MDMyMTYwNDQ=	49	Feature: gists and starred gists	simonw 9599	open	2020-09-17T02:30:52Z	2020-09-17T02:30:52Z	MEMBER		https://developer.github.com/v3/gists/#list-starred-gists	github-to-sqlite 207052882	issue	{ "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/49/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
703218448	MDU6SXNzdWU3MDMyMTg0NDg=	51	Documentation for twitter-to-sqlite fetch	simonw 9599	open	2020-09-17T02:38:10Z	2020-09-17T02:38:10Z	MEMBER		It's mentioned in passing in the README but it deserves its own section: `$ twitter-to-sqlite fetch \ "https://api.twitter.com/1.1/account/verify_credentials.json" \ \| grep '"id"' \| head -n 1`	twitter-to-sqlite 206156866	issue	{ "url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/51/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
797784080	MDU6SXNzdWU3OTc3ODQwODA=	62	Stargazers and workflows commands always require an auth file when using GITHUB_TOKEN	frosencrantz 631242	open	2021-01-31T18:56:05Z	2021-01-31T18:56:05Z	CONTRIBUTOR		Requested fix in https://github.com/dogsheep/github-to-sqlite/pull/59 The stargazers and workflows commands always require an auth file, even when using a `GITHUB_TOKEN`. Other commands don't require the auth file.	github-to-sqlite 207052882	issue	{ "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/62/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
897212458	MDU6SXNzdWU4OTcyMTI0NTg=	63	Ability to fetch commits from branches other than the default	simonw 9599	open	2021-05-20T17:58:08Z	2021-05-20T17:58:08Z	MEMBER		This tool is currently almost entirely ignorant of the concept of branches. One example: you can't retrieve commits from any branch other than the default (usually main).	github-to-sqlite 207052882	issue	{ "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/63/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1091850530	I_kwDODEm0Qs5BFFEi	63	Import archive error 'withheld_in_countries'	pauloxnet 521097	open	2022-01-01T16:58:59Z	2022-01-01T16:58:59Z	NONE		Importing the twitter archive I received this error: bash $ twitter-to-sqlite import archive.db twitter-2021-12-31-<hash>.zip birdwatch-note-rating: not yet implemented birdwatch-note: not yet implemented branch-links: not yet implemented community-tweet: not yet implemented contact: not yet implemented device-token: not yet implemented direct-message-mute: not yet implemented mute: not yet implemented periscope-account-information: not yet implemented periscope-ban-information: not yet implemented periscope-broadcast-metadata: not yet implemented periscope-comments-made-by-user: not yet implemented periscope-expired-broadcasts: not yet implemented periscope-followers: not yet implemented periscope-profile-description: not yet implemented professional-data: not yet implemented protected-history: not yet implemented reply-prompt: not yet implemented screen-name-change: not yet implemented smartblock: not yet implemented spaces-metadata: not yet implemented sso: not yet implemented Traceback (most recent call last): File "/home/paulox/.virtualenvs/dogsheep/bin/twitter-to-sqlite", line 8, in <module> sys.exit(cli()) File "/home/paulox/.virtualenvs/dogsheep/lib/python3.9/site-packages/click/core.py", line 1128, in __call__ return self.main(*args, **kwargs) File "/home/paulox/.virtualenvs/dogsheep/lib/python3.9/site-packages/click/core.py", line 1053, in main rv = self.invoke(ctx) File "/home/paulox/.virtualenvs/dogsheep/lib/python3.9/site-packages/click/core.py", line 1659, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/home/paulox/.virtualenvs/dogsheep/lib/python3.9/site-packages/click/core.py", line 1395, in invoke return ctx.invoke(self.callback, ctx.params) File "/home/paulox/.virtualenvs/dogsheep/lib/python3.9/site-packages/click/core.py", line 754, in invoke return __callback(*args, **kwargs) 
File "/home/paulox/.virtualenvs/dogsheep/lib/python3.9/site-packages/twitter_to_sqlite/cli.py", line 759, in import_ archive.import_from_file(db, filename, content) File "/home/paulox/.virtualenvs/dogsheep/lib/python3.9/site-packages/twitter_to_sqlite/archive.py", line 246, in import_from_file db[table_name].insert_all(rows, pk=pk, replace=True) File "/home/paulox/.virtualenvs/dogsheep/lib/python3.9/site-packages/sqlite_utils/db.py", line 2625, in insert_all self.insert_chunk( File "/home/paulox/.virtualenvs/dogsheep/lib/python3.9/site-packages/sqlite_utils/db.py", line 2406, in insert_chunk result = self.db.execute(query, params) File "/home/paulox/.virtualenvs/dogsheep/lib/python3.9/site-packages/sqlite_utils/db.py", line 422, in execute return self.conn.execute(sql, parameters) sqlite3.OperationalError: table archive_tweet has no column named withheld_in_countries I found only a single tweet with the key `withheld_in_countries` in `tweet.js` that seems the problems: JSON [ { "tweet" : { "retweeted" : false, "source" : "<a href=\"http://twitter.com/download/android\" rel=\"nofollow\">Twitter for Android</a>", "entities" : { "hashtags" : [ { "text" : "NowOnAndroid", "indices" : [ "64", "77" ] } ], "symbols" : [ ], "user_mentions" : [ { "name" : "Periscope", "screen_name" : "PeriscopeCo", "indices" : [ "3", "15" ], "id_str" : "1111111111", "id" : "222222222" } ], "urls" : [ { "url" : "https://t.co/xxxxxxxxx", "expanded_url" : "https://vine.co/v/xxxxxxxxx", "display_url" : "vine.co/v/xxxxxxxxxx", "indices" : [ "78", "101" ] } ] }, "display_text_range" : [ "0", "101" ], "favorite_count" : "0", "id_str" : "1111111111111111111111", "truncated" : false, "retweet_count" : "0", "withheld_in_countries" : [ "TR" ], "id" : "000000000000000000", "possibly_sensitive" : false, "created_at" : "Fri Aug 14 06:04:03 +0000 2015", "favorited" : false, "full_text" : "RT @periscopeco: Travel the world. LIVE. 
The Global Map is here #NowOnAndroid https://t.co/NZXdsPWROk", "lang" : "en" } } ] I solved the error by removing the key from `tweet.js`, but I'm reporting it to help improve the project.	twitter-to-sqlite 206156866	issue	{ "url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/63/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1097332098	I_kwDODEm0Qs5BZ_WC	64	Include all entities for tweets	max 111631	open	2022-01-09T23:35:28Z	2022-01-09T23:35:28Z	NONE		Per our conversation on Twitter: It would be neat if all entities (including URLs) were captured. This way you can ensure that URLs are parsed out exactly the same way Twitter parses URLs – we all know parsing URLs with a regex ain't fun. Right now, I believe the tool filters out all entities that are not of type `media`.	twitter-to-sqlite 206156866	issue	{ "url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/64/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1160327106	PR_kwDODEm0Qs4z_V3w	65	Update Twitter dev link, clarify apps vs projects	rixx 2657547	open	2022-03-05T11:56:08Z	2022-03-05T11:56:08Z	FIRST_TIME_CONTRIBUTOR	dogsheep/twitter-to-sqlite/pulls/65	Twitter pushes you heavily towards v2 projects instead of v1 apps – I know the README mentions v1 API compatibility at the top, but I still nearly got turned around here.	twitter-to-sqlite 206156866	pull	{ "url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/65/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1244082183	PR_kwDODEm0Qs44PPLy	66	Ageinfo workaround	ashanan 11887	open	2022-05-21T21:08:29Z	2022-05-21T21:09:16Z	FIRST_TIME_CONTRIBUTOR	dogsheep/twitter-to-sqlite/pulls/66	I'm not sure if this is due to a new format or just because my ageinfo file is blank, but trying to import an archive would crash when it got to that file. This PR adds a guard clause in the `ageinfo` transformer and sets a default value that doesn't throw an exception. Seems likely to be the same issue mentioned by danp in https://github.com/dogsheep/twitter-to-sqlite/issues/54, my ageinfo file looks the same. Added that same ageinfo file to the test archive as well to help confirm my workaround didn't break anything. Let me know if you want any changes!	twitter-to-sqlite 206156866	pull	{ "url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/66/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
981690086	MDExOlB1bGxSZXF1ZXN0NzIxNjg2NzIx	67	Replacing step ID key with step_id	jshcmpbll 16374374	open	2021-08-28T01:26:41Z	2021-08-28T01:27:00Z	FIRST_TIME_CONTRIBUTOR	dogsheep/github-to-sqlite/pulls/67	Workflows that have an `id` in any step result in the following error when running `workflows`, e.g. `github-to-sqlite workflows github.db nixos/nixpkgs`: ``` Traceback (most recent call last): File "/usr/local/bin/github-to-sqlite", line 8, in <module> sys.exit(cli()) File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1137, in __call__ return self.main(*args, **kwargs) File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1062, in main rv = self.invoke(ctx) File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1668, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1404, in invoke return ctx.invoke(self.callback, ctx.params) File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 763, in invoke return __callback(*args, **kwargs) File "/usr/local/lib/python3.8/dist-packages/github_to_sqlite/cli.py", line 601, in workflows utils.save_workflow(db, repo_id, filename, content) File "/usr/local/lib/python3.8/dist-packages/github_to_sqlite/utils.py", line 865, in save_workflow db["steps"].insert_all( File "/usr/local/lib/python3.8/dist-packages/sqlite_utils/db.py", line 2596, in insert_all self.insert_chunk( File "/usr/local/lib/python3.8/dist-packages/sqlite_utils/db.py", line 2378, in insert_chunk result = self.db.execute(query, params) File 
"/usr/local/lib/python3.8/dist-packages/sqlite_utils/db.py", line 419, in execute return self.conn.execute(sql, parameters) sqlite3.IntegrityError: datatype mismatch ``` Information about the ID key in a step for GHA An example workflow from a public repo Changes I'm proposing that the key for `id` in step is replaced with `step_id` so that it no longer interferes with the table `id` for tracking the record. Special thanks to @sarcasticadmin @egiffen and @ruebenramirez for helping a bit on this 😄	github-to-sqlite 207052882	pull	{ "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/67/reactions", "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 1, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1513237712	PR_kwDODEm0Qs5GUoG_	67	Add support for app-only bearer tokens	sometimes-i-send-pull-requests 26161409	open	2022-12-28T23:31:20Z	2022-12-28T23:31:20Z	FIRST_TIME_CONTRIBUTOR	dogsheep/twitter-to-sqlite/pulls/67	Previously, twitter-to-sqlite only supported OAuth1 authentication, and the token must be on behalf of a user. However, Twitter also supports application-only bearer tokens, documented here: https://developer.twitter.com/en/docs/authentication/oauth-2-0/bearer-tokens This PR adds support to twitter-to-sqlite for using application-only bearer tokens. To use, the auth.json file just needs to contain a "bearer_token" key instead of "api_key", "api_secret_key", etc.	twitter-to-sqlite 206156866	pull	{ "url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/67/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1013506559	PR_kwDODFdgUs4skaNS	68	Add support for retrieving teams / members	philwills 68329	open	2021-10-01T15:55:02Z	2021-10-01T15:59:53Z	FIRST_TIME_CONTRIBUTOR	dogsheep/github-to-sqlite/pulls/68	Adds a method for retrieving all the teams within an organisation and all the members in those teams. The latter is stored as a join table `team_members` between `teams` and `users`.	github-to-sqlite 207052882	pull	{ "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/68/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1513237982	PR_kwDODEm0Qs5GUoKL	68	Archive: Import mute table	sometimes-i-send-pull-requests 26161409	open	2022-12-28T23:32:06Z	2022-12-28T23:32:06Z	FIRST_TIME_CONTRIBUTOR	dogsheep/twitter-to-sqlite/pulls/68		twitter-to-sqlite 206156866	pull	{ "url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/68/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1513238152	PR_kwDODEm0Qs5GUoMM	69	Archive: Import new tweets table name	sometimes-i-send-pull-requests 26161409	open	2022-12-28T23:32:44Z	2022-12-28T23:32:44Z	FIRST_TIME_CONTRIBUTOR	dogsheep/twitter-to-sqlite/pulls/69	Given the code here, it seems like in the past this file was named "tweet.js". In recent exports, it's named "tweets.js". The archive importer needs to be modified to take this into account. Existing logic is reused for importing this table. (However, the resulting table name will be different, matching the different file name -- archive_tweets, rather than archive_tweet).	twitter-to-sqlite 206156866	pull	{ "url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/69/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1149402080	PR_kwDODFdgUs4zaUta	70	scrape-dependents: enable paging through package menu option if present	stanbiryukov 36061055	open	2022-02-24T15:07:25Z	2022-02-24T15:07:25Z	FIRST_TIME_CONTRIBUTOR	dogsheep/github-to-sqlite/pulls/70	Some repos organize network dependents by a Package toggle. This PR adds the ability to page through those options and scrape underlying dependents.	github-to-sqlite 207052882	pull	{ "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/70/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1513238314	PR_kwDODEm0Qs5GUoN6	70	Archive: Import Twitter Circle data	sometimes-i-send-pull-requests 26161409	open	2022-12-28T23:33:09Z	2022-12-28T23:33:09Z	FIRST_TIME_CONTRIBUTOR	dogsheep/twitter-to-sqlite/pulls/70		twitter-to-sqlite 206156866	pull	{ "url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/70/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1513238455	PR_kwDODEm0Qs5GUoPm	71	Archive: Fix "ni devices" typo in importer	sometimes-i-send-pull-requests 26161409	open	2022-12-28T23:33:31Z	2022-12-28T23:33:31Z	FIRST_TIME_CONTRIBUTOR	dogsheep/twitter-to-sqlite/pulls/71		twitter-to-sqlite 206156866	pull	{ "url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/71/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1524431805	I_kwDODEm0Qs5a3Pu9	72	Import thread, including self- and others' replies	mcint 601708	open	2023-01-08T09:51:06Z	2023-01-08T09:51:06Z	NONE		statuses-lookup, home-timeline, mentions (only for auth'ed user) don't cover this. `twitter-to-sqlite fetch-thread tw-group1.db 1234123412341234` twitter-to-sqlite focuses on archiving users, but does not easily support archiving conversations or community activity. For reference, this is implemented in twarc, using a search, optionally recursively. Other research suggests that this formerly, or currently, requires a search query, use of undocumented `related_results` api, or with requested inclusion of newer conversation_id with subsequent query.	twitter-to-sqlite 206156866	issue	{ "url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/72/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1816830546	I_kwDODEm0Qs5sSqJS	73	Twitter v1 API shutdown	david-perez 6341745	open	2023-07-22T16:57:41Z	2023-07-22T16:57:41Z	NONE		I've been using this project reliably over the past two years to periodically download my liked tweets, but unfortunately since 19th July I get: [2023-07-19 21:00:04.937536] File "/home/pi/code/liked-tweets/lib/python3.7/site-packages/twitter_to_sqlite/utils.py", line 202, in fetch_timeline [2023-07-19 21:00:04.937606] raise Exception(str(tweets["errors"])) [2023-07-19 21:00:04.937678] Exception: [{'message': 'You currently have access to a subset of Twitter API v2 endpoints and limited v1.1 endpoints (e.g. media post, oauth) only. If you need access to this endpoint, you may need a different access level. You can learn more here: https://developer.twitter.com/en/portal/product', 'code': 453}] It appears like Twitter has now shut down their v1 endpoints, which is rather gracious of them, considering they announced they'd be deprecated on 29th April. Unfortunately retrieving likes using the v2 API is not part of their free plan. In fact, with the free plan one can only post and delete tweets and retrieve information about oneself. So I'm afraid this is the end of this very nice project. It was very useful, thank you!	twitter-to-sqlite 206156866	issue	{ "url": "https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/73/reactions", "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 1 }
1363244199	I_kwDODFdgUs5RQXSn	75	Fetch repos doesn't support organisations	OverkillGuy 2757699	open	2022-09-06T12:55:06Z	2022-09-06T12:55:06Z	NONE		Say I want to get all my Github Org's repos info, for data analysis. Not just the public repos, but also the private/internal repos. The endpoints are different for organisation, and this tool doesn't take it into account: https://github.com/dogsheep/github-to-sqlite/blob/ace13ec3d98090d99bd71871c286a4a612c96a50/github_to_sqlite/utils.py#L453 https://github.com/dogsheep/github-to-sqlite/blob/ace13ec3d98090d99bd71871c286a4a612c96a50/github_to_sqlite/utils.py#L455 The endpoints for organisation repos is instead (source): `url = "https://api.github.com/orgs/{}/repos".format(username)` Let's add support for organisations repo scraping.	github-to-sqlite 207052882	issue	{ "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/75/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1410548368	I_kwDODFdgUs5UE0KQ	77	Feature: Support GitHub discussions	frosencrantz 631242	open	2022-10-16T16:53:38Z	2022-10-16T16:53:38Z	CONTRIBUTOR		Hi @simonw I've been a happy user of this tool. Thank you for writing it and sharing it. I wanted to suggest a feature request to support Discussions. For example the VisiData project has discussions https://github.com/saulpw/visidata/discussions , and it would be useful if there was a way to pull that data into the database. However, I'm not offering a pull request.	github-to-sqlite 207052882	issue	{ "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/77/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1505411725	I_kwDODFdgUs5ZusKN	78	self-hosted or corp github enterprise	ebdavison 549431	open	2022-12-20T22:51:45Z	2022-12-20T22:51:45Z	NONE		We use github enterprise at work and I would like to use this tool to pull info from that site rather than the public github.com instance. Is there an option for this? If not, can one be added for a custom repo URL?	github-to-sqlite 207052882	issue	{ "url": "https://api.github.com/repos/dogsheep/github-to-sqlite/issues/78/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
581795570	MDU6SXNzdWU1ODE3OTU1NzA=	93	Support more string values for types in .add_column()	simonw 9599	open	2020-03-15T19:32:49Z	2020-09-24T20:36:46Z	OWNER		https://sqlite-utils.readthedocs.io/en/2.4.2/python-api.html#adding-columns says: SQLite types you can specify are "TEXT", "INTEGER", "FLOAT" or "BLOB". As discovered in #92 this isn't the right list of values. I should expand this to match https://www.sqlite.org/datatype3.html	sqlite-utils 140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/93/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
275159710	MDU6SXNzdWUyNzUxNTk3MTA=	128	Every visualization should have an "embed" button	simonw 9599	open	2017-11-19T13:38:13Z	2019-05-13T18:33:51Z	OWNER		At least for the first round of visualizations, any time you construct one using the UI the result should include an "embed this" button that returns source code to copy and paste. These examples should use unpkg.com (or similar) URLs with SRI hashes, e.g. https://www.srihash.org - and should load data from the datasette JSON API.	datasette 107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/128/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
688352145	MDU6SXNzdWU2ODgzNTIxNDU=	141	insert-files support for compressed values	simonw 9599	open	2020-08-28T20:59:46Z	2020-09-24T20:36:08Z	OWNER		The `sqlar` format supports this, it would be useful if `insert-files` could support this too. https://www.sqlite.org/sqlar.html	sqlite-utils 140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/141/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
312395790	MDU6SXNzdWUzMTIzOTU3OTA=	197	Ability to sort by more than one column	simonw 9599	open	2018-04-09T05:13:30Z	2018-07-10T17:45:37Z	OWNER		Split off from #189. I'd like to support "sort by X descending, then by Y ascending if there are dupes for X" as well. Suggested syntax for that: `?_sort_desc=X&_sort=Y` we currently only allow one argument to be sent. We should allow as many arguments as there are columns, for example: `?_sort=department&_sort_desc=precinct&_sort=age&_sort_desc=size`	datasette 107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/197/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
312396095	MDU6SXNzdWUzMTIzOTYwOTU=	198	Ability to sort with nulls last	simonw 9599	open	2018-04-09T05:15:40Z	2018-07-10T17:45:37Z	OWNER		Split off from #189 Here's how to do that in SQL: https://fivethirtyeight.datasettes.com/fivethirtyeight-2628db9?sql=select+rowid%2C+*+from+%5Bnfl-wide-receivers%2Fadvanced-historical%5D%0D%0Aorder+by+case+when+career_ranypa+is+null+then+1+else+0+end%2C+career_ranypa%2C+rowid `order by case when career_ranypa is null then 1 else 0 end, career_ranypa`	datasette 107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/198/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
314771615	MDU6SXNzdWUzMTQ3NzE2MTU=	218	Support custom unit display in order to handle "$10,000"	simonw 9599	open	2018-04-16T18:39:31Z	2018-07-10T17:45:38Z	OWNER		I tried to get Datasette to display `$10,000` using the new units support but we currently only display units as a suffix: https://github.com/simonw/datasette/blob/10a34f995c70daa37a8a2aa02c3135a4b023a24c/datasette/app.py#L563-L572 It would be neat if there was a mechanism for specifying a custom unit display - maybe something like this: `{ "custom_units": { "us_dollar": { "unit": "us_dollar = [] = $", "format": "${:,}" } } }`	datasette 107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/218/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
314834783	MDU6SXNzdWUzMTQ4MzQ3ODM=	219	Expose units in the JSON API?	russss 45057	open	2018-04-16T22:04:25Z	2018-04-16T22:04:25Z	CONTRIBUTOR		From #203: it would be nice for the JSON API to (optionally) return columns rendered with units in them - if, for example, you're consuming the JSON to render the rows on a map. I'm not entirely sure how useful this will be though - at the moment my map queries are custom SQL queries (a few have joins in, the rest might be fetching large amounts of data so it makes sense to limit columns fetched). Perhaps the SQL function is a better approach in general.	datasette 107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/219/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
318490133	MDU6SXNzdWUzMTg0OTAxMzM=	241	Default datasette logging format should be JSON	simonw 9599	open	2018-04-27T17:32:48Z	2018-07-10T17:45:40Z	OWNER		Structured logs are better. Datasette should default to outputting its HTTP access log lines as newline-delimited JSON instead of the Sanic default format it uses at the moment. For improved greppability these logs should have keys ordered in a consistent way. Python's JSON module can do this with ordered dictionaries.	datasette 107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/241/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
816601354	MDExOlB1bGxSZXF1ZXN0NTgwMjM1NDI3	241	Extract expand - work in progress	simonw 9599	open	2021-02-25T16:36:38Z	2021-02-25T16:36:38Z	OWNER	simonw/sqlite-utils/pulls/241	Refs #239. Still needs documentation and CLI implementation.	sqlite-utils 140912432	pull	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/241/reactions", "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	1
818684978	MDU6SXNzdWU4MTg2ODQ5Nzg=	243	How can i use this utils to deal with fts on column meta of tables ?	svjack 27874014	open	2021-03-01T09:45:05Z	2021-03-01T09:45:05Z	NONE		Thank you for releasing this great project. When I use this project on a multi-table db, I want to implement convenient search on column names from different tables. I want to develop a meta table to save the metadata of different columns of different tables, and search this meta table to get rows from the data table (which the meta table describes). Does this project provide a simple function for this? You can think of it as having a knowledge graph about the tables in the db, which I save into the db with FTS enabled.	sqlite-utils 140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/243/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
320132682	MDU6SXNzdWUzMjAxMzI2ODI=	250	Setup some issue templates	simonw 9599	open	2018-05-04T01:49:07Z	2018-05-04T01:49:07Z	OWNER		https://twitter.com/left_pad/status/99216385740464537 I like the idea of using these to help people understand some of the ways I want to use issues.	datasette 107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/250/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
326778161	MDU6SXNzdWUzMjY3NzgxNjE=	290	Consider increasing the default for num_sql_threads (currently 3)	simonw 9599	open	2018-05-27T00:52:41Z	2018-05-27T00:52:41Z	OWNER		I ran a very rough micro-benchmark on the new `num_sql_threads` config option (added in #285) `datasette --config num_sql_threads:1 fivethirtyeight.db` Then `ab -n 100 -c 10 'http://127.0.0.1:8011/fivethirtyeight-2628db9/twitter-ratio%2Fsenators'` \| Number of threads \| Requests/second \| \|---\|---\| \| 1 \| 4.57 \| \| 3 \| 9.77 \| \| 10 \| 13.53 \| \| 20 \| 15.24 \| \| 50 \| 8.21 \| This was on my early 2018 OS X laptop. Need to benchmark in other common environments before making a decision on changing the default. That said, the default of 3 was a number I plucked out of thin air.	datasette 107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/290/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
344654623	MDU6SXNzdWUzNDQ2NTQ2MjM=	347	Rename "datasette package" to "datasette publish docker"	simonw 9599	open	2018-07-26T00:42:46Z	2018-07-26T00:42:46Z	OWNER			datasette 107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/347/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
346026869	MDU6SXNzdWUzNDYwMjY4Njk=	354	Handle many-to-many relationships	simonw 9599	open	2018-07-31T04:03:13Z	2020-11-24T19:51:18Z	OWNER		This is a master tracking ticket for various many-2-many features.	datasette 107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/354/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1090798237	I_kwDOCGYnMM5BBEKd	359	Use RETURNING if available to populate last_pk	simonw 9599	open	2021-12-29T23:43:23Z	2021-12-29T23:43:23Z	OWNER		Inspired by this: https://news.ycombinator.com/item?id=29729283 Because SQLite is effectively serializing all the writes for us, we have zero locking in our code. We used to have to lock when inserting new items (to get the LastInsertRowId), but the newer version of SQLite supports the RETURNING keyword, so we don't even have to lock on inserts now.	sqlite-utils 140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/359/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
359075028	MDExOlB1bGxSZXF1ZXN0MjE0NjUzNjQx	364	Support for other types of databases using external connectors	jsancho-gpl 11912854	open	2018-09-11T14:31:47Z	2018-09-11T14:31:47Z	FIRST_TIME_CONTRIBUTOR	simonw/datasette/pulls/364	This PR is related to #293, but now all commits have been merged. The purpose is to support other file formats that aren't SQLite, like files with PyTables format. I've tried to accomplish that using external connectors published with entry points. The modifications in the original datasette code are minimal and many are in a separated file.	datasette 107914493	pull	{ "url": "https://api.github.com/repos/simonw/datasette/issues/364/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
377166793	MDU6SXNzdWUzNzcxNjY3OTM=	372	Docker build tools	psychemedia 82988	open	2018-11-04T16:02:35Z	2018-11-04T16:02:35Z	CONTRIBUTOR		In terms of small pieces lightly joined, I note that there are several tools starting to appear for building generating Dockerfiles and building Docker containers from simpler components such as `requirements.txt` files. If plugin/extensions builders want to include additional packages, then things like incremental builds of composable builds that add additional items into a base `datasette` container may be required. Examples of Dockerfile generators / container builders: openshift/source-to-image (s2i) jupyter/repo2docker stencila/dockter Discussions / threads (via Binderhub gitter) on: - why `repo2docker` not `s2i` - why `dockter` not `repo2docker` - composability in `s2i` Relates to things like: https://github.com/simonw/datasette/pull/280	datasette 107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/372/reactions", "total_count": 2, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 2, "rocket": 0, "eyes": 0 }
426722204	MDU6SXNzdWU0MjY3MjIyMDQ=	423	?_search_col=X not reflected correctly in the UI	simonw 9599	open	2019-03-28T21:48:19Z	2020-11-03T19:01:59Z	OWNER		e.g. https://latest.datasette.io/fixtures/searchable?_search_text1=barry	datasette 107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/423/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
440325850	MDExOlB1bGxSZXF1ZXN0Mjc1OTIzMDY2	452	SQL builder utility classes	russss 45057	open	2019-05-04T13:57:47Z	2019-05-04T14:03:04Z	CONTRIBUTOR	simonw/datasette/pulls/452	This adds a straightforward set of classes to aid in the construction of SQL queries. My plan for this was to allow plugins to manipulate the Datasette-generated SQL in a more structured way. I'm not sure that's going to work, but I feel like this is still a step forward - it reduces the number of intermediate variables in `TableView.data` which aids readability, and also factors out a lot of the boring string concatenation. There are a fair number of minor structure changes in here too as I've tried to make the ordering of `TableView.data` a bit more logical. As far as I can tell, I haven't broken anything...	datasette 107914493	pull	{ "url": "https://api.github.com/repos/simonw/datasette/issues/452/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	0
1324659241	I_kwDOCGYnMM5O9LIp	459	Single quoted transform recipes on Windows do not work as expected	shakeel 19921	open	2022-08-01T16:14:54Z	2022-08-01T16:14:54Z	CONTRIBUTOR		Trying to follow the tutorial for sqlite-utils and datasette https://datasette.io/tutorials/clean-data on Windows 11 OS `Microsoft Windows [Version 10.0.22622.440]`, with sqlite-utils and datasette installed using pipx. `pipx list package datasette 0.61.1, installed using Python 3.10.4 - datasette.exe package sqlite-utils 3.28, installed using Python 3.10.4 - sqlite-utils.exe` In the step to transform dates into ISO dates the quoted value `'r.parsedatetime(value)'` is copied verbatim into the columns instead of applying the output of the Python recipe. ``` sqlite-utils convert manatees.db locations \ REPDATE created_date last_edited_date \ 'r.parsedatetime(value)' --dry-run 1975/01/31 00:00:00+00 --- becomes: r.parsedatetime(value) Would affect 13568 rows ``` However, if I change the code from single quotes to double quotes, it works as expected. ``` sqlite-utils convert manatees.db locations \ REPDATE created_date last_edited_date \ "r.parsedatetime(value)" --dry-run 1975/01/31 00:00:00+00 --- becomes: 1975-01-31T00:00:00+00:00 Would affect 13568 rows ``` Specifying the transform code recipe should work with single quotes on Windows.	sqlite-utils 140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/459/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1355193529	I_kwDOCGYnMM5Qxpy5	479	OperationalError: cannot VACUUM from within a transaction	chapmanjacobd 7908073	open	2022-08-30T05:34:24Z	2022-08-30T05:34:24Z	CONTRIBUTOR		Maybe when calling `.vacuum()` and other DB-level write-lock operations `sqlite_utils` could guard against this error message by automatically committing first? ``` 46 db["media"].optimize() # type: ignore ---> 47 db.vacuum() File ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:1047, in Database.vacuum(self) 1045 def vacuum(self): 1046 "Run a SQLite `VACUUM` against the database." -> 1047 self.execute("VACUUM;") File ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:470, in Database.execute(self, sql, parameters) 468 return self.conn.execute(sql, parameters) 469 else: --> 470 return self.conn.execute(sql) OperationalError: cannot VACUUM from within a transaction ``` It might also be nice to add a sentence or two about how transactions are committed on the docs page. When I was swapping out my sqlite3 code for this library it was nice that everything was pretty much drop-in but I was/am unsure what to do about the places I explicitly call `.commit()` in my code Related to https://github.com/simonw/sqlite-utils/issues/121	sqlite-utils 140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/479/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1359604075	I_kwDOCGYnMM5RCelr	481	Idea: `sqlite-utils create-table tablename --sql "select ..."`	simonw 9599	open	2022-09-02T01:41:24Z	2022-09-02T01:42:08Z	OWNER		Could offer syntactic sugar for: `sql create table foo as select * from bar` `sqlite-utils create-table data.db foo --sql "select * from bar"` https://sqlite-utils.datasette.io/en/stable/cli-reference.html#create-table	sqlite-utils 140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/481/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
449445715	MDU6SXNzdWU0NDk0NDU3MTU=	491	Figure out how to use Firebase with cloudrun to enable vanity URLs and CDN caching	simonw 9599	open	2019-05-28T19:48:06Z	2019-05-28T19:48:35Z	OWNER		It looks like Firebase can solve a couple of problems with the existing `datasette publish cloudrun` hosting mechanism: The URLs it produces aren't pretty enough. Firebase offers more control over vanity URLs. CDN caching (as seen in `datasette publish now`) is great for improving performance and saving money on Cloud Run execution time. https://firebase.google.com/docs/hosting/cloud-run looks like it can help with both of these. Lots of interesting questions: Should this be a new `datasette publish firebase` command or should it instead be implemented as additional custom options to `datasette publish cloudrun`? How much harder does it become to do account setup? How much will this option cost users?	datasette 107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/491/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1453134846	I_kwDOCGYnMM5WnRP-	513	Add or document streamlined workflow for importing Datasette csv / json exports	henry501 19328961	open	2022-11-17T10:54:47Z	2022-11-17T10:54:47Z	NONE		I'm working on some small front-end enhancements to the laion-aesthetic-datasette project, and I wanted to partially populate a database directly using exports from the existing Datasette instance instead of downloading the parquet files and creating my own multi-GB database. There have been a number of small issues that are certainly related to my relative lack of familiarity with the toolkit, but that are still surprising. For example: a CSV export of the images table (http://laion-aesthetic.datasette.io/laion-aesthetic-6pls.csv?sql=select+rowid%2C+url%2C+text%2C+domain_id%2C+width%2C+height%2C+similarity%2C+punsafe%2C+pwatermark%2C+aesthetic%2C+hash%2C+index_level_0+from+images+order+by+random%28%29+limit+100) has nested single quotes, double quotes, and commas that aren't handled by rows_from_file. Similarly, the json output has to be manually transformed to add the column names and remove extraneous information before sqlite_utils can import it. I was able to work through these issues, but as an enhancement it would be really helpful to create or document a clear workflow that avoids the friction of this data transformation.	sqlite-utils 140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/513/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
459469278	MDU6SXNzdWU0NTk0NjkyNzg=	515	Try shrinking official image with docker-slim	simonw 9599	open	2019-06-22T12:25:37Z	2019-06-22T12:25:37Z	OWNER		This looks really promising: https://github.com/docker-slim/docker-slim If it can shave substantial size from our official container reliably we could add it to the automated build process.	datasette 107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/515/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1550536442	I_kwDOCGYnMM5ca076	521	Custom JSON encoder	janrito 31504	open	2023-01-20T09:19:40Z	2023-01-20T09:19:40Z	NONE		It would be nice if we could specify a custom encoder (and decoder) for types that will need extra deserialisation – e.g., sets, enums or sparse matrices – or even project-specific types	sqlite-utils 140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/521/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
460095928	MDU6SXNzdWU0NjAwOTU5Mjg=	528	Establish a pattern for Datasette plugins built on top of Pandas	simonw 9599	open	2019-06-24T21:05:52Z	2019-06-24T21:05:52Z	OWNER		The Pandas ecosystem is huge, varied and full of tools that are really good at doing interesting analysis on top of tabular data. Pandas should not be a dependency of Datasette core, but I think there is a lot of potential in having plugins which use Pandas to apply interesting analysis to data sucked out of Datasette's SQLite tables. One example (thanks, Tony): https://github.com/ResidentMario/missingno could form the basis of a fantastic plugin for getting a high-level overview of how complete each column in a table is. Some thought is needed here about what shape these kind of plugins might take, and what plugin hooks they would use.	datasette 107914493	issue	{ "url": "https://api.github.com/repos/simonw/datasette/issues/528/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1754174496	I_kwDOCGYnMM5ojpQg	558	Ability to define unique columns when creating a table	aguinane 1910303	open	2023-06-13T06:56:19Z	2023-08-18T01:06:03Z	NONE		When creating a new table, it would be good to have an option to set unique columns similar to how not_null is set. ```python from sqlite_utils import Database columns = {"mRID": str, "name": str} db = Database("example.db") db["ExampleTable"].create(columns, pk="mRID", not_null=["mRID"], if_not_exists=True) db["ExampleTable"].create_index(["mRID"], unique=True, if_not_exists=True) ``` So something like this would add the UNIQUE flag to the table definition. `python db["ExampleTable"].create(columns, pk="mRID", not_null=["mRID"], unique=["mRID"], if_not_exists=True)` `sql CREATE TABLE ExampleTable ( mRID TEXT PRIMARY KEY NOT NULL UNIQUE, name TEXT );`	sqlite-utils 140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/558/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
1821108702	I_kwDOCGYnMM5si-ne	579	Special handling for SQLite column of type `JSON`	asg017 15178711	open	2023-07-25T20:37:23Z	2023-07-25T20:37:23Z	CONTRIBUTOR		`sqlite-utils` should detect and have special handling for columns with a declared `JSON` type. For example: `sql CREATE TABLE "dogs" ( id INTEGER PRIMARY KEY, name TEXT, friends JSON );` Automatic Nesting According to "Nested JSON Values", sqlite-utils will only expand JSON if the `--json-cols` flag is passed. It looks like it'll try to `json.load` all text columns to test if they are JSON, which can get expensive on non-JSON columns. Instead, `sqlite-utils` should by default (i.e. without the `--json-cols` flag) do the `maybe_json()` operation on columns with a declared `JSON` type. So the above table would expand the `"friends"` column as expected, without the `--json-cols` flag: `bash sqlite-utils dogs.db "select * from dogs" \| python -mjson.tool` `[ { "id": 1, "name": "Cleo", "friends": [ { "name": "Pancakes" }, { "name": "Bailey" } ] } ]` I'm sure there are other ways `sqlite-utils` can specially handle JSON columns, so keeping this open while I think of more	sqlite-utils 140912432	issue	{ "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/579/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [pull_request] TEXT,
   [body] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
, [active_lock_reason] TEXT, [performed_via_github_app] TEXT, [reactions] TEXT, [draft] INTEGER, [state_reason] TEXT);
CREATE INDEX [idx_issues_repo]
                ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
                ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
                ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
                ON [issues] ([user]);
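
The filter named in the page title (comments = 0, state = "open", sorted by number) can be reproduced against this schema with a plain SQL query. The following is a minimal sketch using Python's standard `sqlite3` module; the helper name `open_uncommented_issues` and the cut-down in-memory table are assumptions made for illustration, and the last two sample rows are hypothetical, added only to show the filter excluding something.

```python
import sqlite3


def open_uncommented_issues(conn):
    """Return (number, title) for open issues with zero comments, sorted by number."""
    return conn.execute(
        "SELECT number, title FROM issues "
        "WHERE comments = 0 AND state = 'open' "
        "ORDER BY number"
    ).fetchall()


# Assumption: a reduced in-memory copy of the issues table, holding only
# the columns the query touches.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE issues ("
    "id INTEGER PRIMARY KEY, number INTEGER, title TEXT, "
    "state TEXT, comments INTEGER)"
)
conn.executemany(
    "INSERT INTO issues (id, number, title, state, comments) VALUES (?, ?, ?, ?, ?)",
    [
        # Two rows taken from the listing above.
        (1090798237, 359, "Use RETURNING if available to populate last_pk", "open", 0),
        (1816830546, 73, "Twitter v1 API shutdown", "open", 0),
        # Hypothetical rows that the filter should exclude.
        (1, 900, "Hypothetical closed issue", "closed", 0),
        (2, 901, "Hypothetical issue with comments", "open", 4),
    ],
)

rows = open_uncommented_issues(conn)
for number, title in rows:
    print(number, title)
```

Issue numbers repeat across repositories in this table, so rows with the same number come back in no guaranteed order unless a second sort key such as `id` is added.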
