id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,pull_request,body,repo,type,active_lock_reason,performed_via_github_app,reactions,draft,state_reason 1751214236,I_kwDOC8SPRc5oYWic,36,Getting sqlite_master may not be modified when creating dogsheep index,8711912,open,0,,,0,2023-06-11T03:21:53Z,2023-06-11T03:21:53Z,,NONE,,"When creating a `dogsheep` index from `config.yml` file on pocket.db (created using pocket-to-sqlite), I am getting this error ``` Traceback (most recent call last): File ""/Users/khushmeeet/.pyenv/versions/3.11.2/bin/dogsheep-beta"", line 8, in sys.exit(cli()) ^^^^^ File ""/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py"", line 1130, in __call__ return self.main(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^ File ""/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py"", line 1055, in main rv = self.invoke(ctx) ^^^^^^^^^^^^^^^^ File ""/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py"", line 1657, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File ""/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py"", line 1404, in invoke return ctx.invoke(self.callback, **ctx.params) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File ""/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/click/core.py"", line 760, in invoke return __callback(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^ File ""/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/dogsheep_beta/cli.py"", line 36, in index run_indexer( File ""/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/dogsheep_beta/utils.py"", line 32, in run_indexer ensure_table_and_indexes(db, tokenize) File ""/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/dogsheep_beta/utils.py"", line 91, in ensure_table_and_indexes table.add_foreign_key(*fk) File ""/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/sqlite_utils/db.py"", line 2155, in add_foreign_key self.db.add_foreign_keys([(self.name, column, other_table, other_column)]) File ""/Users/khushmeeet/.pyenv/versions/3.11.2/lib/python3.11/site-packages/sqlite_utils/db.py"", line 1116, in add_foreign_keys cursor.execute( sqlite3.OperationalError: table sqlite_master may not be modified ``` Command I ran to get this error ``` dogsheep-beta index pocket.db config.yml ``` Dogsheep version ``` dogsheep-beta, version 0.10.2 ``` Python version ``` Python 3.11.2 ```",197431109,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/dogsheep-beta/issues/36/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1293698966,PR_kwDOD079W84600uh,37,Fix former command name in readme,578773,open,0,,,0,2022-07-05T02:09:13Z,2022-07-05T02:09:13Z,,FIRST_TIME_CONTRIBUTOR,dogsheep/dogsheep-photos/pulls/37,Looks like a previous commit missed a `photo-to-sqlite`→ `dogsheep-photos` replacement.,256834907,pull,,,"{""url"": ""https://api.github.com/repos/dogsheep/dogsheep-photos/issues/37/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",0, 1888477283,I_kwDOC8SPRc5wj-Bj,38,Run `rebuild_fts` after building the index,9599,open,0,,,0,2023-09-08T23:17:45Z,2023-09-08T23:17:45Z,,MEMBER,,"In: - 
https://github.com/simonw/datasette.io/issues/152#issuecomment-1712323347 This turned out to be the fix: ```bash dogsheep-beta index dogsheep-index.db templates/dogsheep-beta.yml sqlite-utils rebuild-fts dogsheep-index.db ```",197431109,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/dogsheep-beta/issues/38/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1827436260,PR_kwDOD079W85WtVyk,39,Missing option in datasette instructions,319473,open,0,,,0,2023-07-29T10:34:48Z,2023-07-29T10:34:48Z,,FIRST_TIME_CONTRIBUTOR,dogsheep/dogsheep-photos/pulls/39,Gotta tell it where to look,256834907,pull,,,"{""url"": ""https://api.github.com/repos/dogsheep/dogsheep-photos/issues/39/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",0, 602619330,MDU6SXNzdWU2MDI2MTkzMzA=,45,Use raise_for_status() everywhere,9599,open,0,,,1,2020-04-19T04:38:28Z,2020-04-19T04:39:22Z,,MEMBER,,"I keep seeing errors which I think are caused by authentication or rate limit problems but which appear to be unexpected JSON responses - presumably because they are actually an error message. Recent example: https://github.com/simonw/jsk-fellows-on-twitter/runs/598892575 Using `response.raise_for_status()` everywhere will make these errors less confusing.",206156866,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/45/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 664485022,MDU6SXNzdWU2NjQ0ODUwMjI=,46,Feature: pull request reviews and comments,1326704,open,0,,,6,2020-07-23T13:43:45Z,2022-12-20T14:40:15Z,,NONE,,"Hi there! I saw your [presentation at Boston Python](https://www.meetup.com/bostonpython/events/271887195). I'm already a light user of Datasette (thank you!), but wasn't aware of this project. I've been working on a ""pull request dashboard"" to get a comprehensive view of the state of open PR's, esp. related to reviews (i.e., pending, approved, changes requested). Currently it's a CLI command, but I thought a Datasette UI might be fun. I see that PR's are available from the `issues` command, but I don't see reviews anywhere. From the [API docs](https://docs.github.com/en/rest/reference/pulls#reviews), it looks like there are separate endpoints for those (as well as pull requests in general). What do you think about adding that? Would you accept a PR? Any sense of the level of effort?",207052882,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/github-to-sqlite/issues/46/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 639542974,MDU6SXNzdWU2Mzk1NDI5NzQ=,47,Fall back to FTS4 if FTS5 is not available,73579,open,0,,,3,2020-06-16T10:11:23Z,2020-06-17T20:13:48Z,,NONE,,"got this with version 0.21.1 from pypi. twitter-to-sqlite auth worked but then ""twitter-to-sqlite user-timeline USER.db"" produced a traceback ending in ""no such module: FTS5"". 
",206156866,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/47/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 472115381,MDU6SXNzdWU0NzIxMTUzODE=,49,extracts= should support multiple-column extracts,9599,open,0,,,10,2019-07-24T07:06:41Z,2020-10-16T19:18:19Z,,OWNER,,"Lookup tables can be constructed on compound columns, but the `extracts=` option doesn't currently support that. Right now extracts can be defined in two ways: ```python # Extract these columns into tables with the same name: dogs = db.table(""dogs"", extracts=[""breed"", ""most_recent_trophy""]) # Same as above but with custom table names: dogs = db.table(""dogs"", extracts={""breed"": ""Breeds"", ""most_recent_trophy"": ""Trophies""}) ``` Need some kind of syntax for much more complicated extractions, like when two columns (say ""source"" and ""source_version"") are extracted into a single table.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/49/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 703216044,MDU6SXNzdWU3MDMyMTYwNDQ=,49,Feature: gists and starred gists,9599,open,0,,,0,2020-09-17T02:30:52Z,2020-09-17T02:30:52Z,,MEMBER,,https://developer.github.com/v3/gists/#list-starred-gists,207052882,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/github-to-sqlite/issues/49/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 698791218,MDU6SXNzdWU2OTg3OTEyMTg=,50,"favorites --stop_after=N stops after min(N, 200)",370930,open,0,,,2,2020-09-11T03:38:14Z,2020-09-13T05:11:14Z,,CONTRIBUTOR,,"For any number greater than 200, `favorites --stop_after` stops after getting 200 tweets, e.g. 
``` $ twitter-to-sqlite favorites tweets.db --stop_after=300 Importing favorites [####################################] 199 $ ``` I don't _think_ this is a limitation of the API (if you omit `--stop_after` you get some very large number, possibly all of them), so I _think_ this is a bug.",206156866,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/50/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 703218756,MDU6SXNzdWU3MDMyMTg3NTY=,50,Commands for making authenticated API calls,9599,open,0,,,7,2020-09-17T02:39:07Z,2020-10-19T05:01:29Z,,MEMBER,,"Similar to `twitter-to-sqlite fetch`, see https://github.com/dogsheep/twitter-to-sqlite/issues/51",207052882,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/github-to-sqlite/issues/50/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 703218448,MDU6SXNzdWU3MDMyMTg0NDg=,51,Documentation for twitter-to-sqlite fetch,9599,open,0,,,0,2020-09-17T02:38:10Z,2020-09-17T02:38:10Z,,MEMBER,,"It's mentioned in passing in the README but it deserves its own section: ``` $ twitter-to-sqlite fetch \ ""https://api.twitter.com/1.1/account/verify_credentials.json"" \ | grep '""id""' | head -n 1 ```",206156866,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/51/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 703246031,MDU6SXNzdWU3MDMyNDYwMzE=,51,github-to-sqlite should handle rate limits better,9599,open,0,,,4,2020-09-17T04:01:50Z,2022-10-14T16:34:07Z,,MEMBER,,From #50 - right now it will crash with an error if it hits the rate limit. Since the rate limit information (including reset time) is available in the headers it could automatically sleep and try again instead.,207052882,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/github-to-sqlite/issues/51/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 753000405,MDU6SXNzdWU3NTMwMDA0MDU=,53,Command for fetching file contents,9599,open,0,,,1,2020-11-29T20:31:04Z,2020-11-30T00:36:09Z,,MEMBER,,"Something like this: github-to-sqlite files github.db simonw/datasette This would fetch all files from the `main` branch into a `files` table. Additional options could handle things like pulling files from a branch or tag, or just pulling files that match a specific glob or that exist in a specific directory.",207052882,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/github-to-sqlite/issues/53/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 779088071,MDU6SXNzdWU3NzkwODgwNzE=,54,Archive import appears to be broken on recent exports,21148,open,0,,,5,2021-01-05T14:18:01Z,2023-01-04T11:06:55Z,,CONTRIBUTOR,,"I requested a Twitter export yesterday, and unfortunately they seem to have changed it such that `twitter-to-sqlite import` can't handle it anymore 😢 So far I've run into two issues. The first was easy to work around, but the second will take more investigation. If I can find the time I'll keep working on it and update this issue accordingly. The issues (so far): ### 1. 
Data seems to have moved to a `data/` subdirectory Running `twitter-to-sqlite import` on the raw zip file reports a bunch of ""not yet implemented"" errors, and then exits without actually importing anything: ``` ❯ twitter-to-sqlite import tarchive.db twitter.zip ... data/manifest: not yet implemented data/account-creation-ip: not yet implemented data/account-suspension: not yet implemented ... (dozens of more lines like this, including critical stuff like data/tweets) ... ``` (`tarchive.db` now exists, but is empty) Workaround: unpack the zip file, and run `twitter-to-sqlite import tarchive.db path/to/archive/data` That gets further, but: ### 2. Some schema(s?) have changed At least, the `blocks` schema seems different now: ``` ❯ twitter-to-sqlite import tarchive.db archive/data direct-messages-group: not yet implemented branch-links: not yet implemented periscope-expired-broadcasts: not yet implemented direct-messages: not yet implemented mute: not yet implemented Traceback (most recent call last): File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/bin/twitter-to-sqlite"", line 8, in sys.exit(cli()) File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/lib/python3.8/site-packages/click/core.py"", line 829, in __call__ return self.main(*args, **kwargs) File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/lib/python3.8/site-packages/click/core.py"", line 782, in main rv = self.invoke(ctx) File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/lib/python3.8/site-packages/click/core.py"", line 1259, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/lib/python3.8/site-packages/click/core.py"", line 1066, in invoke return ctx.invoke(self.callback, **ctx.params) File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/lib/python3.8/site-packages/click/core.py"", line 610, in invoke return callback(*args, **kwargs) File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/lib/python3.8/site-packages/twitter_to_sqlite/cli.py"", line 772, in import_ archive.import_from_file(db, filepath.name, open(filepath, ""rb"").read()) File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/lib/python3.8/site-packages/twitter_to_sqlite/archive.py"", line 215, in import_from_file to_insert = transformer(data) File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/lib/python3.8/site-packages/twitter_to_sqlite/archive.py"", line 115, in lists_member return {""lists-member"": _list_from_common(data)} File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/lib/python3.8/site-packages/twitter_to_sqlite/archive.py"", line 200, in _list_from_common for url in block[""userListInfo""][""urls""]: KeyError: 'urls' ``` That's as far as I got before I needed to work on something else. 
I'll report back if I get further!",206156866,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/54/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 797097140,MDU6SXNzdWU3OTcwOTcxNDA=,60,Use Data from SQLite in other commands,22578954,open,0,,,3,2021-01-29T18:35:52Z,2021-02-12T18:29:43Z,,CONTRIBUTOR,,"As a total beginner here how could you access data from the sqlite table to run other commands. What I am thinking is I want to get all the repos in an organization then using the repo list pull all the commit messages for each repo. I love this project by the way!",207052882,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/github-to-sqlite/issues/60/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1063982712,I_kwDODEm0Qs4_axZ4,60,Execution on Windows,1733616,open,0,,,1,2021-11-26T00:24:34Z,2022-10-14T16:58:27Z,,NONE,,"My installation on Windows using pip has been successful. I have Python 3.6. How do I run twitter-to-sqlite? I cannot even figure out how ""auth"" is a command. I have python on my path: C:\prog\python\Python36;C:\prog\python\Python36\Scripts Where should the commands be executed, and where are the files created? Could some basics please be added to the documentation to get beginners started?",206156866,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/60/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1077560091,I_kwDODEm0Qs5AOkMb,61,"Data Pull fails for ""Essential"" level access to the Twitter API (for Documentation)",57161638,open,0,,,1,2021-12-11T14:59:41Z,2022-10-31T14:47:58Z,,NONE,,"Per Twitter documentation: https://developer.twitter.com/en/docs/twitter-api/getting-started/about-twitter-api#v2-access-leve This isn't any fault of twitter-to-sqlite of course, but it should probably be documented as a side-note. ![image](https://user-images.githubusercontent.com/57161638/145681272-8c85b3b9-be95-44ff-9760-1bafa4917ce2.png) And this is how I'm surfacing the message from utils.py: ![image](https://user-images.githubusercontent.com/57161638/145681005-2776c0ad-9822-4461-b43a-450ab2e828eb.png) ",206156866,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/61/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 797784080,MDU6SXNzdWU3OTc3ODQwODA=,62,Stargazers and workflows commands always require an auth file when using GITHUB_TOKEN ,631242,open,0,,,0,2021-01-31T18:56:05Z,2021-01-31T18:56:05Z,,CONTRIBUTOR,,"Requested fix in https://github.com/dogsheep/github-to-sqlite/pull/59 The stargazers and workflows commands always require an auth file, even when using a `GITHUB_TOKEN`. Other commands don't require the auth file. 
",207052882,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/github-to-sqlite/issues/62/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 897212458,MDU6SXNzdWU4OTcyMTI0NTg=,63,Ability to fetch commits from branches other than the default,9599,open,0,,,0,2021-05-20T17:58:08Z,2021-05-20T17:58:08Z,,MEMBER,,This tool is currently almost entirely ignorant of the concept of branches. One example: you can't retrieve commits from any branch other than the default (usually main).,207052882,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/github-to-sqlite/issues/63/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1091850530,I_kwDODEm0Qs5BFFEi,63,Import archive error 'withheld_in_countries',521097,open,0,,,0,2022-01-01T16:58:59Z,2022-01-01T16:58:59Z,,NONE,,"Importing the twitter archive I received this error: ```bash $ twitter-to-sqlite import archive.db twitter-2021-12-31-.zip birdwatch-note-rating: not yet implemented birdwatch-note: not yet implemented branch-links: not yet implemented community-tweet: not yet implemented contact: not yet implemented device-token: not yet implemented direct-message-mute: not yet implemented mute: not yet implemented periscope-account-information: not yet implemented periscope-ban-information: not yet implemented periscope-broadcast-metadata: not yet implemented periscope-comments-made-by-user: not yet implemented periscope-expired-broadcasts: not yet implemented periscope-followers: not yet implemented periscope-profile-description: not yet implemented professional-data: not yet implemented protected-history: not yet implemented reply-prompt: not yet implemented screen-name-change: not yet implemented smartblock: not yet implemented spaces-metadata: not yet implemented sso: not yet implemented Traceback (most recent call last): File ""/home/paulox/.virtualenvs/dogsheep/bin/twitter-to-sqlite"", line 8, in sys.exit(cli()) File ""/home/paulox/.virtualenvs/dogsheep/lib/python3.9/site-packages/click/core.py"", line 1128, in __call__ return self.main(*args, **kwargs) File ""/home/paulox/.virtualenvs/dogsheep/lib/python3.9/site-packages/click/core.py"", line 1053, in main rv = self.invoke(ctx) File ""/home/paulox/.virtualenvs/dogsheep/lib/python3.9/site-packages/click/core.py"", line 1659, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File ""/home/paulox/.virtualenvs/dogsheep/lib/python3.9/site-packages/click/core.py"", line 1395, in invoke return ctx.invoke(self.callback, **ctx.params) File ""/home/paulox/.virtualenvs/dogsheep/lib/python3.9/site-packages/click/core.py"", line 754, in invoke return __callback(*args, **kwargs) File ""/home/paulox/.virtualenvs/dogsheep/lib/python3.9/site-packages/twitter_to_sqlite/cli.py"", line 759, in import_ archive.import_from_file(db, filename, content) File ""/home/paulox/.virtualenvs/dogsheep/lib/python3.9/site-packages/twitter_to_sqlite/archive.py"", line 246, in import_from_file db[table_name].insert_all(rows, pk=pk, replace=True) File ""/home/paulox/.virtualenvs/dogsheep/lib/python3.9/site-packages/sqlite_utils/db.py"", line 2625, in insert_all self.insert_chunk( File ""/home/paulox/.virtualenvs/dogsheep/lib/python3.9/site-packages/sqlite_utils/db.py"", line 2406, in insert_chunk result = self.db.execute(query, params) File 
""/home/paulox/.virtualenvs/dogsheep/lib/python3.9/site-packages/sqlite_utils/db.py"", line 422, in execute return self.conn.execute(sql, parameters) sqlite3.OperationalError: table archive_tweet has no column named withheld_in_countries ``` I found only a single tweet with the key `withheld_in_countries` in `tweet.js` that seems the problems: ```JSON [ { ""tweet"" : { ""retweeted"" : false, ""source"" : ""Twitter for Android"", ""entities"" : { ""hashtags"" : [ { ""text"" : ""NowOnAndroid"", ""indices"" : [ ""64"", ""77"" ] } ], ""symbols"" : [ ], ""user_mentions"" : [ { ""name"" : ""Periscope"", ""screen_name"" : ""PeriscopeCo"", ""indices"" : [ ""3"", ""15"" ], ""id_str"" : ""1111111111"", ""id"" : ""222222222"" } ], ""urls"" : [ { ""url"" : ""https://t.co/xxxxxxxxx"", ""expanded_url"" : ""https://vine.co/v/xxxxxxxxx"", ""display_url"" : ""vine.co/v/xxxxxxxxxx"", ""indices"" : [ ""78"", ""101"" ] } ] }, ""display_text_range"" : [ ""0"", ""101"" ], ""favorite_count"" : ""0"", ""id_str"" : ""1111111111111111111111"", ""truncated"" : false, ""retweet_count"" : ""0"", ""withheld_in_countries"" : [ ""TR"" ], ""id"" : ""000000000000000000"", ""possibly_sensitive"" : false, ""created_at"" : ""Fri Aug 14 06:04:03 +0000 2015"", ""favorited"" : false, ""full_text"" : ""RT @periscopeco: Travel the world. LIVE. The Global Map is here #NowOnAndroid https://t.co/NZXdsPWROk"", ""lang"" : ""en"" } } ] ``` I solved the error removing the key from the `tweet.js` but I'm reporting this error to improve the project.",206156866,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/63/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 920636216,MDU6SXNzdWU5MjA2MzYyMTY=,64,"feature: support ""events""",231498,open,0,,,5,2021-06-14T17:42:49Z,2021-06-15T00:48:37Z,,NONE,,"the GitHub API provides the ability to fetch all events for a given user, organization, or repository: https://docs.github.com/en/rest/reference/activity#list-events-for-the-authenticated-user this would allow users to export all of the issue comments, new issues, etc. that they created. something which is currently missing from the GitHub takeout exports.",207052882,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/github-to-sqlite/issues/64/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1097332098,I_kwDODEm0Qs5BZ_WC,64,Include all entities for tweets,111631,open,0,,,0,2022-01-09T23:35:28Z,2022-01-09T23:35:28Z,,NONE,,"Per our conversation [on Twitter](https://twitter.com/mschoening/status/1480312477246054401): It would be neat if all entities (including URLs) were captured. This way you can ensure, that URLs are parsed out exactly the same way Twitter parses URLs – we all know parsing URLs with a regex ain't fun. 
Right now, I believe the tool filters out all entities that are not of type `media`.",206156866,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/64/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 923270900,MDExOlB1bGxSZXF1ZXN0NjcyMDUzODEx,65,basic support for events,231498,open,0,,,2,2021-06-17T00:51:30Z,2022-10-03T22:35:03Z,,FIRST_TIME_CONTRIBUTOR,dogsheep/github-to-sqlite/pulls/65,"a quick first pass at implementing the feature requested in https://github.com/dogsheep/github-to-sqlite/issues/64 testing instructions: ``` $ github-to-sqlite events events.db user/khimaros ``` if the specified user is the authenticated user, it will also include private events. caveat: pagination appears to be broken (i don't see `next` in the response JSON from GitHub)",207052882,pull,,,"{""url"": ""https://api.github.com/repos/dogsheep/github-to-sqlite/issues/65/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",0, 1160327106,PR_kwDODEm0Qs4z_V3w,65,"Update Twitter dev link, clarify apps vs projects",2657547,open,0,,,0,2022-03-05T11:56:08Z,2022-03-05T11:56:08Z,,FIRST_TIME_CONTRIBUTOR,dogsheep/twitter-to-sqlite/pulls/65,"Twitter pushes you heavily towards v2 projects instead of v1 apps – I know the README mentions v1 API compatibility at the top, but I still nearly got turned around here.",206156866,pull,,,"{""url"": ""https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/65/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",0, 975161924,MDExOlB1bGxSZXF1ZXN0NzE2MzU3OTgy,66,Add --merged-by flag to pull-requests sub command,30531572,open,0,,,1,2021-08-20T00:57:55Z,2021-09-28T21:50:31Z,,FIRST_TIME_CONTRIBUTOR,dogsheep/github-to-sqlite/pulls/66,"## Description Proposing a solution to the API limitation for `merged_by` in pull_requests. Specifically the following called out in the readme: ``` Note that the merged_by column on the pull_requests table will only be populated for pull requests that are loaded using the --pull-request option - the GitHub API does not return this field for pull requests that are loaded in bulk. ``` This approach might cause larger repos to hit rate limits called out in https://github.com/dogsheep/github-to-sqlite/issues/51 but seems to work well in the repos I tested and included below. ## Old Behavior - Had to list out the pull-requests individually via multiple `--pull-request` flags ## New Behavior - `--merged-by` flag for getting 'merge_by' information out of pull-requests without having to specify individual PR numbers. 
# Testing Picking some repo that has more than one merger (datasette only has 1 😉 ) ``` $ github-to-sqlite pull-requests ./github.db opnsense/tools --merged-by $ echo ""select id, url, merged_by from pull_requests;"" | sqlite3 ./github.db 83533612|https://github.com/opnsense/tools/pull/39|1915288 102632885|https://github.com/opnsense/tools/pull/43|1915288 149114810|https://github.com/opnsense/tools/pull/57|1915288 160394495|https://github.com/opnsense/tools/pull/64|1915288 163308408|https://github.com/opnsense/tools/pull/67|1915288 169723264|https://github.com/opnsense/tools/pull/69|1915288 171381422|https://github.com/opnsense/tools/pull/72|1915288 179938195|https://github.com/opnsense/tools/pull/77|1915288 196233824|https://github.com/opnsense/tools/pull/82|1915288 215289964|https://github.com/opnsense/tools/pull/93| 219696100|https://github.com/opnsense/tools/pull/97|1915288 223664843|https://github.com/opnsense/tools/pull/99| 228446172|https://github.com/opnsense/tools/pull/103|1915288 238930434|https://github.com/opnsense/tools/pull/110|1915288 255507110|https://github.com/opnsense/tools/pull/119|1915288 255980675|https://github.com/opnsense/tools/pull/120|1915288 261906770|https://github.com/opnsense/tools/pull/125| 263800503|https://github.com/opnsense/tools/pull/127|1915288 264038685|https://github.com/opnsense/tools/pull/128|1915288 264696704|https://github.com/opnsense/tools/pull/129|1915288 266660547|https://github.com/opnsense/tools/pull/130|1915288 273120409|https://github.com/opnsense/tools/pull/133|1915288 274370803|https://github.com/opnsense/tools/pull/135| 276600629|https://github.com/opnsense/tools/pull/139| 277303655|https://github.com/opnsense/tools/pull/141|1915288 293033714|https://github.com/opnsense/tools/pull/145| 294827649|https://github.com/opnsense/tools/pull/146| 295140008|https://github.com/opnsense/tools/pull/147|1915288 305690829|https://github.com/opnsense/tools/pull/150|9783985 307077931|https://github.com/opnsense/tools/pull/152|1915288 321782100|https://github.com/opnsense/tools/pull/155| 337265672|https://github.com/opnsense/tools/pull/160| 337267484|https://github.com/opnsense/tools/pull/161|1915288 368251763|https://github.com/opnsense/tools/pull/169| 428262505|https://github.com/opnsense/tools/pull/181| 437557011|https://github.com/opnsense/tools/pull/182|1915288 447079893|https://github.com/opnsense/tools/pull/185| 461822092|https://github.com/opnsense/tools/pull/191| 463290142|https://github.com/opnsense/tools/pull/193|1915288 470112962|https://github.com/opnsense/tools/pull/194|1915288 472644649|https://github.com/opnsense/tools/pull/195|1915288 488696898|https://github.com/opnsense/tools/pull/198| 513289902|https://github.com/opnsense/tools/pull/201| 522530265|https://github.com/opnsense/tools/pull/203| 564443347|https://github.com/opnsense/tools/pull/213| 597579516|https://github.com/opnsense/tools/pull/220|1915288 602860357|https://github.com/opnsense/tools/pull/221|1915288 608744738|https://github.com/opnsense/tools/pull/222|1915288 623279673|https://github.com/opnsense/tools/pull/228|1915288 664656182|https://github.com/opnsense/tools/pull/233| 664781786|https://github.com/opnsense/tools/pull/234|1915288 670683636|https://github.com/opnsense/tools/pull/235|1915288 683150764|https://github.com/opnsense/tools/pull/237| 685016233|https://github.com/opnsense/tools/pull/238| 687099825|https://github.com/opnsense/tools/pull/239|1915288 715705652|https://github.com/opnsense/tools/pull/244|1915288 
715721248|https://github.com/opnsense/tools/pull/245|1915288 ``` `userid` are now present for those PRs that were merged. Without the flag the `merged_by` behavior remains missing as expected when get PRs bulk: ``` $ github-to-sqlite pull-requests ./github.db opnsense/tools $ echo ""select id, url, merged_by from pull_requests;"" | sqlite3 ./github.db 83533612|https://github.com/opnsense/tools/pull/39| 102632885|https://github.com/opnsense/tools/pull/43| 149114810|https://github.com/opnsense/tools/pull/57| 160394495|https://github.com/opnsense/tools/pull/64| 163308408|https://github.com/opnsense/tools/pull/67| 169723264|https://github.com/opnsense/tools/pull/69| 171381422|https://github.com/opnsense/tools/pull/72| 179938195|https://github.com/opnsense/tools/pull/77| 196233824|https://github.com/opnsense/tools/pull/82| 215289964|https://github.com/opnsense/tools/pull/93| 219696100|https://github.com/opnsense/tools/pull/97| 223664843|https://github.com/opnsense/tools/pull/99| 228446172|https://github.com/opnsense/tools/pull/103| 238930434|https://github.com/opnsense/tools/pull/110| 255507110|https://github.com/opnsense/tools/pull/119| 255980675|https://github.com/opnsense/tools/pull/120| 261906770|https://github.com/opnsense/tools/pull/125| 263800503|https://github.com/opnsense/tools/pull/127| 264038685|https://github.com/opnsense/tools/pull/128| 264696704|https://github.com/opnsense/tools/pull/129| 266660547|https://github.com/opnsense/tools/pull/130| 273120409|https://github.com/opnsense/tools/pull/133| 274370803|https://github.com/opnsense/tools/pull/135| 276600629|https://github.com/opnsense/tools/pull/139| 277303655|https://github.com/opnsense/tools/pull/141| 293033714|https://github.com/opnsense/tools/pull/145| 294827649|https://github.com/opnsense/tools/pull/146| 295140008|https://github.com/opnsense/tools/pull/147| 305690829|https://github.com/opnsense/tools/pull/150| 307077931|https://github.com/opnsense/tools/pull/152| 321782100|https://github.com/opnsense/tools/pull/155| 337265672|https://github.com/opnsense/tools/pull/160| 337267484|https://github.com/opnsense/tools/pull/161| 368251763|https://github.com/opnsense/tools/pull/169| 428262505|https://github.com/opnsense/tools/pull/181| 437557011|https://github.com/opnsense/tools/pull/182| 447079893|https://github.com/opnsense/tools/pull/185| 461822092|https://github.com/opnsense/tools/pull/191| 463290142|https://github.com/opnsense/tools/pull/193| 470112962|https://github.com/opnsense/tools/pull/194| 472644649|https://github.com/opnsense/tools/pull/195| 488696898|https://github.com/opnsense/tools/pull/198| 513289902|https://github.com/opnsense/tools/pull/201| 522530265|https://github.com/opnsense/tools/pull/203| 564443347|https://github.com/opnsense/tools/pull/213| 597579516|https://github.com/opnsense/tools/pull/220| 602860357|https://github.com/opnsense/tools/pull/221| 608744738|https://github.com/opnsense/tools/pull/222| 623279673|https://github.com/opnsense/tools/pull/228| 664656182|https://github.com/opnsense/tools/pull/233| 664781786|https://github.com/opnsense/tools/pull/234| 670683636|https://github.com/opnsense/tools/pull/235| 683150764|https://github.com/opnsense/tools/pull/237| 685016233|https://github.com/opnsense/tools/pull/238| 687099825|https://github.com/opnsense/tools/pull/239| 715705652|https://github.com/opnsense/tools/pull/244| 715721248|https://github.com/opnsense/tools/pull/245| ``` Individual PRs passed via `--pull-request` flag behaves as expected (unchanged): ``` $ github-to-sqlite pull-requests ./github.db 
opnsense/tools --pull-request 39 --pull-request 237 $ echo ""select id, url, merged_by from pull_requests;"" | sqlite3 ./github.db 83533612|https://github.com/opnsense/tools/pull/39|1915288 683150764|https://github.com/opnsense/tools/pull/237| ``` > Picking 1 PR that has a merged_by (39) and one that does not (237)",207052882,pull,,,"{""url"": ""https://api.github.com/repos/dogsheep/github-to-sqlite/issues/66/reactions"", ""total_count"": 3, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 1, ""rocket"": 0, ""eyes"": 0}",0, 1244082183,PR_kwDODEm0Qs44PPLy,66,Ageinfo workaround,11887,open,0,,,0,2022-05-21T21:08:29Z,2022-05-21T21:09:16Z,,FIRST_TIME_CONTRIBUTOR,dogsheep/twitter-to-sqlite/pulls/66,"I'm not sure if this is due to a new format or just because my ageinfo file is blank, but trying to import an archive would crash when it got to that file. This PR adds a guard clause in the `ageinfo` transformer and sets a default value that doesn't throw an exception. Seems likely to be the same issue mentioned by danp in https://github.com/dogsheep/twitter-to-sqlite/issues/54, my ageinfo file looks the same. Added that same ageinfo file to the test archive as well to help confirm my workaround didn't break anything. Let me know if you want any changes!",206156866,pull,,,"{""url"": ""https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/66/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",0, 981690086,MDExOlB1bGxSZXF1ZXN0NzIxNjg2NzIx,67,Replacing step ID key with step_id,16374374,open,0,,,0,2021-08-28T01:26:41Z,2021-08-28T01:27:00Z,,FIRST_TIME_CONTRIBUTOR,dogsheep/github-to-sqlite/pulls/67,"Workflows that have an `id` in any step result in the following error when running `workflows`: e.g.`github-to-sqlite workflows github.db nixos/nixpkgs` ```Traceback (most recent call last): File ""/usr/local/bin/github-to-sqlite"", line 8, in sys.exit(cli()) File ""/usr/local/lib/python3.8/dist-packages/click/core.py"", line 1137, in __call__ return self.main(*args, **kwargs) File ""/usr/local/lib/python3.8/dist-packages/click/core.py"", line 1062, in main rv = self.invoke(ctx) File ""/usr/local/lib/python3.8/dist-packages/click/core.py"", line 1668, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File ""/usr/local/lib/python3.8/dist-packages/click/core.py"", line 1404, in invoke return ctx.invoke(self.callback, **ctx.params) File ""/usr/local/lib/python3.8/dist-packages/click/core.py"", line 763, in invoke return __callback(*args, **kwargs) File ""/usr/local/lib/python3.8/dist-packages/github_to_sqlite/cli.py"", line 601, in workflows utils.save_workflow(db, repo_id, filename, content) File ""/usr/local/lib/python3.8/dist-packages/github_to_sqlite/utils.py"", line 865, in save_workflow db[""steps""].insert_all( File ""/usr/local/lib/python3.8/dist-packages/sqlite_utils/db.py"", line 2596, in insert_all self.insert_chunk( File ""/usr/local/lib/python3.8/dist-packages/sqlite_utils/db.py"", line 2378, in insert_chunk result = self.db.execute(query, params) 
File ""/usr/local/lib/python3.8/dist-packages/sqlite_utils/db.py"", line 419, in execute return self.conn.execute(sql, parameters) sqlite3.IntegrityError: datatype mismatch ``` - [Information about the ID key in a step for GHA](https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#jobsjob_idstepsid) - [An example workflow from a public repo](https://github.com/NixOS/nixpkgs/blob/b4cc66827745e525ce7bb54659845ac89788a597/.github/workflows/direct-push.yml#L16) # Changes I'm proposing that the key for `id` in step is replaced with `step_id` so that it no longer interferes with the table `id` for tracking the record. Special thanks to @sarcasticadmin @egiffen and @ruebenramirez for helping a bit on this 😄 ",207052882,pull,,,"{""url"": ""https://api.github.com/repos/dogsheep/github-to-sqlite/issues/67/reactions"", ""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",0, 1513237712,PR_kwDODEm0Qs5GUoG_,67,Add support for app-only bearer tokens,26161409,open,0,,,0,2022-12-28T23:31:20Z,2022-12-28T23:31:20Z,,FIRST_TIME_CONTRIBUTOR,dogsheep/twitter-to-sqlite/pulls/67,"Previously, twitter-to-sqlite only supported OAuth1 authentication, and the token must be on behalf of a user. However, Twitter also supports application-only bearer tokens, documented here: https://developer.twitter.com/en/docs/authentication/oauth-2-0/bearer-tokens This PR adds support to twitter-to-sqlite for using application-only bearer tokens. To use, the auth.json file just needs to contain a ""bearer_token"" key instead of ""api_key"", ""api_secret_key"", etc.",206156866,pull,,,"{""url"": ""https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/67/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",0, 1013506559,PR_kwDODFdgUs4skaNS,68,Add support for retrieving teams / members,68329,open,0,,,0,2021-10-01T15:55:02Z,2021-10-01T15:59:53Z,,FIRST_TIME_CONTRIBUTOR,dogsheep/github-to-sqlite/pulls/68,Adds a method for retrieving all the teams within an organisation and all the members in those teams. 
The latter is stored as a join table `team_members` between `teams` and `users`.,207052882,pull,,,"{""url"": ""https://api.github.com/repos/dogsheep/github-to-sqlite/issues/68/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",0, 1513237982,PR_kwDODEm0Qs5GUoKL,68,Archive: Import mute table,26161409,open,0,,,0,2022-12-28T23:32:06Z,2022-12-28T23:32:06Z,,FIRST_TIME_CONTRIBUTOR,dogsheep/twitter-to-sqlite/pulls/68,,206156866,pull,,,"{""url"": ""https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/68/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",0, 1071071397,I_kwDODFdgUs4_10Cl,69,View that combines issues and issue comments,9599,open,0,,,1,2021-12-04T00:34:33Z,2021-12-04T00:34:52Z,,MEMBER,,I want to see a reverse chronologically ordered interface onto both issues and comments - essentially a unified log of comments and issues opened across one or multiple projects.,207052882,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/github-to-sqlite/issues/69/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1513238152,PR_kwDODEm0Qs5GUoMM,69,Archive: Import new tweets table name,26161409,open,0,,,0,2022-12-28T23:32:44Z,2022-12-28T23:32:44Z,,FIRST_TIME_CONTRIBUTOR,dogsheep/twitter-to-sqlite/pulls/69,"Given the code here, it seems like in the past this file was named ""tweet.js"". In recent exports, it's named ""tweets.js"". The archive importer needs to be modified to take this into account. Existing logic is reused for importing this table. (However, the resulting table name will be different, matching the different file name -- archive_tweets, rather than archive_tweet).",206156866,pull,,,"{""url"": ""https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/69/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",0, 539204432,MDU6SXNzdWU1MzkyMDQ0MzI=,70,Implement ON DELETE and ON UPDATE actions for foreign keys,26292069,open,0,,,2,2019-12-17T17:19:10Z,2020-02-27T04:18:53Z,,NONE,,"Hi! I did not find any mention on the library about ON DELETE and ON UPDATE actions for foreign keys. Are those expected to be implemented? If not, it would be a nice thing to include!",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/70/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1149402080,PR_kwDODFdgUs4zaUta,70,scrape-dependents: enable paging through package menu option if present,36061055,open,0,,,0,2022-02-24T15:07:25Z,2022-02-24T15:07:25Z,,FIRST_TIME_CONTRIBUTOR,dogsheep/github-to-sqlite/pulls/70,Some repos organize network dependents by a Package toggle. 
This PR adds the ability to page through those options and scrape underlying dependents.,207052882,pull,,,"{""url"": ""https://api.github.com/repos/dogsheep/github-to-sqlite/issues/70/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",0, 1513238314,PR_kwDODEm0Qs5GUoN6,70,Archive: Import Twitter Circle data,26161409,open,0,,,0,2022-12-28T23:33:09Z,2022-12-28T23:33:09Z,,FIRST_TIME_CONTRIBUTOR,dogsheep/twitter-to-sqlite/pulls/70,,206156866,pull,,,"{""url"": ""https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/70/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",0, 1513238455,PR_kwDODEm0Qs5GUoPm,71,"Archive: Fix ""ni devices"" typo in importer",26161409,open,0,,,0,2022-12-28T23:33:31Z,2022-12-28T23:33:31Z,,FIRST_TIME_CONTRIBUTOR,dogsheep/twitter-to-sqlite/pulls/71,,206156866,pull,,,"{""url"": ""https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/71/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",0, 1211283427,I_kwDODFdgUs5IMrfj,72,feature: display progress bar when downloading multi-page responses,9020979,open,0,,,1,2022-04-21T16:37:12Z,2022-04-21T17:29:31Z,,NONE,,"## Motivation For a long running command (longer than 1 minute) for a big table (like pull requests or commits), it can be tricky to know if the script is still running, or if a rate limit/error was encountered We know how many pages there are, so it may be possible to indicate how many remain. https://github.com/dogsheep/github-to-sqlite/blob/a6e237f75a4b86963d91dcb5c9582e3a1b3349d6/github_to_sqlite/utils.py#L367 ## Resources - Using the existing Click API: - https://click.palletsprojects.com/en/5.x/utils/#showing-progress-bars - Loading spinner: https://github.com/pavdmyt/yaspin - Progress bar: https://github.com/tqdm/tqdm",207052882,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/github-to-sqlite/issues/72/reactions"", ""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1524431805,I_kwDODEm0Qs5a3Pu9,72,"Import thread, including self- and others' replies",601708,open,0,,,0,2023-01-08T09:51:06Z,2023-01-08T09:51:06Z,,NONE,,"statuses-lookup, home-timeline, mentions (only for auth'ed user) don't cover this. `twitter-to-sqlite fetch-thread tw-group1.db 1234123412341234` twitter-to-sqlite focuses on archiving users, but does not easily support archiving conversations or community activity. For reference, this is [implemented in twarc](https://sourcegraph.com/github.com/DocNow/twarc/-/blob/twarc/client.py?L708-766&subtree=true), using a search, optionally recursively. Other research suggests that this formerly, or currently, requires a [search query](https://stackoverflow.com/a/30480103/1020467), use of [undocumented `related_results` api](https://stackoverflow.com/a/9419346/1020467), or with requested inclusion of [newer conversation_id](https://stackoverflow.com/a/68115718/1020467) with subsequent query. 
",206156866,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/72/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1816830546,I_kwDODEm0Qs5sSqJS,73,Twitter v1 API shutdown,6341745,open,0,,,0,2023-07-22T16:57:41Z,2023-07-22T16:57:41Z,,NONE,,"I've been using this project reliably over the past two years to periodically download my liked tweets, but unfortunately since 19th July I get: ``` [2023-07-19 21:00:04.937536] File ""/home/pi/code/liked-tweets/lib/python3.7/site-packages/twitter_to_sqlite/utils.py"", line 202, in fetch_timeline [2023-07-19 21:00:04.937606] raise Exception(str(tweets[""errors""])) [2023-07-19 21:00:04.937678] Exception: [{'message': 'You currently have access to a subset of Twitter API v2 endpoints and limited v1.1 endpoints (e.g. media post, oauth) only. If you need access to this endpoint, you may need a different access level. You can learn more here: https://developer.twitter.com/en/portal/product', 'code': 453}] ``` It appears like Twitter has now shut down their v1 endpoints, which is rather gracious of them, considering they [announced they'd be deprecated on 29th April](https://twittercommunity.com/t/reminder-to-migrate-to-the-new-free-basic-or-enterprise-plans-of-the-twitter-api/189737). Unfortunately [retrieving likes using the v2 API](https://developer.twitter.com/en/docs/twitter-api/tweets/likes/introduction) is not part of their [free plan](https://developer.twitter.com/en/portal/products). In fact, with the free plan one can only post and delete tweets and retrieve information about oneself. So I'm afraid this is the end of this very nice project. It was very useful, thank you! ",206156866,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/73/reactions"", ""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 1}",, 546073980,MDU6SXNzdWU1NDYwNzM5ODA=,74,Test failures on openSUSE 15.1: AssertionError: Explicit other_table and other_column,15092,open,0,,,3,2020-01-07T04:35:50Z,2020-01-12T07:21:17Z,,CONTRIBUTOR,,"openSUSE 15.1 is using python 3.6.5 and click-7.0 , however it has test failures while openSUSE Tumbleweed on py37 passes. 
Most fail on the cli exit code like ```py [ 74s] =================================== FAILURES =================================== [ 74s] _________________________________ test_tables __________________________________ [ 74s] [ 74s] db_path = '/tmp/pytest-of-abuild/pytest-0/test_tables0/test.db' [ 74s] [ 74s] def test_tables(db_path): [ 74s] result = CliRunner().invoke(cli.cli, [""tables"", db_path]) [ 74s] > assert '[{""table"": ""Gosh""},\n {""table"": ""Gosh2""}]' == result.output.strip() [ 74s] E assert '[{""table"": ""...e"": ""Gosh2""}]' == '' [ 74s] E - [{""table"": ""Gosh""}, [ 74s] E - {""table"": ""Gosh2""}] [ 74s] [ 74s] tests/test_cli.py:28: AssertionError ``` packaging project at https://build.opensuse.org/package/show/home:jayvdb:py-new/python-sqlite-utils I'll keep digging into this after I have github-to-sqlite working on Tumbleweed, as I'll need openSUSE Leap 15.1 working before I can submit this into the main python repo.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/74/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1363244199,I_kwDODFdgUs5RQXSn,75,Fetch repos doesn't support organisations,2757699,open,0,,,0,2022-09-06T12:55:06Z,2022-09-06T12:55:06Z,,NONE,,"Say I want to get all my Github Org's repos info, for data analysis. Not just the public repos, but also the private/internal repos. The endpoints are different for organisation, and this tool doesn't take it into account: https://github.com/dogsheep/github-to-sqlite/blob/ace13ec3d98090d99bd71871c286a4a612c96a50/github_to_sqlite/utils.py#L453 https://github.com/dogsheep/github-to-sqlite/blob/ace13ec3d98090d99bd71871c286a4a612c96a50/github_to_sqlite/utils.py#L455 The endpoints for organisation repos is instead ([source](https://docs.github.com/en/rest/repos/repos#list-organization-repositories)): `url = ""https://api.github.com/orgs/{}/repos"".format(username)` Let's add support for organisations repo scraping.",207052882,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/github-to-sqlite/issues/75/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1363280254,PR_kwDODFdgUs4-cIa_,76,Add organization support to repos command,2757699,open,0,,,1,2022-09-06T13:21:42Z,2022-09-06T13:59:08Z,,FIRST_TIME_CONTRIBUTOR,dogsheep/github-to-sqlite/pulls/76,"New --organization flag to signify all given ""usernames"" are private orgs. Adapts API URL to the organization path instead. Not the best implementation, but a first draft to talk around Fixes #75 (badly, no tests, overly vague, untested)",207052882,pull,,,"{""url"": ""https://api.github.com/repos/dogsheep/github-to-sqlite/issues/76/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",0, 1410548368,I_kwDODFdgUs5UE0KQ,77,Feature: Support GitHub discussions,631242,open,0,,,0,2022-10-16T16:53:38Z,2022-10-16T16:53:38Z,,CONTRIBUTOR,,"Hi @simonw I've been a happy user of this tool. Thank you for writing it and sharing it. I wanted to suggest a feature request to support Discussions. For example the VisiData project has discussions https://github.com/saulpw/visidata/discussions , and it would be useful if there was a way to pull that data into the database. 
However, I'm not offering a pull request.",207052882,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/github-to-sqlite/issues/77/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1505411725,I_kwDODFdgUs5ZusKN,78,self-hosted or corp github enterprise,549431,open,0,,,0,2022-12-20T22:51:45Z,2022-12-20T22:51:45Z,,NONE,,"We use github enterprise at work and I would like to use this tool to pull info from that site rather than the public github.com instance. Is there an option for this? If not, can one be added for a custom repo URL?",207052882,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/github-to-sqlite/issues/78/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1570375808,I_kwDODFdgUs5dmgiA,79,Deploy demo job is failing due to rate limit,9599,open,0,,,2,2023-02-03T20:05:01Z,2023-12-08T14:50:15Z,,MEMBER,,https://github.com/dogsheep/github-to-sqlite/actions/runs/4080058087/jobs/7032116511,207052882,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/github-to-sqlite/issues/79/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 573578548,MDU6SXNzdWU1NzM1Nzg1NDg=,89,Ability to customize columns used by extracts= feature,9599,open,0,,,3,2020-03-01T16:54:48Z,2020-10-16T19:17:50Z,,OWNER,,"@simonw any thoughts on allowing extracts to specify the lookup column name? If I'm understanding the documentation right, `.lookup()` allows you to define the ""value"" column (the documentation uses name), but when you use `extracts` keyword as part of `.insert()`, `.upsert()` etc. the lookup must be done against a column named ""value"". I have an existing lookup table that I've populated with columns ""id"" and ""name"" as opposed to ""id"" and ""value"", and seems I can't use `extracts=`, unless I'm missing something... Initial thought on how to do this would be to allow the dictionary value to be a tuple of table name column pair... so: ``` table = db.table(""trees"", extracts={""species_id"": (""Species"", ""name"")}) ``` I haven't dug too much into the existing code yet, but does this make sense? Worth doing? _Originally posted by @chrishas35 in https://github.com/simonw/sqlite-utils/issues/46#issuecomment-592999503_",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/89/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 581795570,MDU6SXNzdWU1ODE3OTU1NzA=,93,Support more string values for types in .add_column(),9599,open,0,,,0,2020-03-15T19:32:49Z,2020-09-24T20:36:46Z,,OWNER,,"https://sqlite-utils.readthedocs.io/en/2.4.2/python-api.html#adding-columns says: > SQLite types you can specify are ""TEXT"", ""INTEGER"", ""FLOAT"" or ""BLOB"". As discovered in #92 this isn't the right list of values. 
I should expand this to match https://www.sqlite.org/datatype3.html",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/93/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 274615452,MDU6SXNzdWUyNzQ2MTU0NTI=,111,Add “updated” to metadata,9599,open,0,,,12,2017-11-16T18:22:20Z,2021-09-21T22:48:27Z,,OWNER,,"To give an indication as to when the data was last updated. This should be a field in the metadata that is then shown on the index page and in the footer, if it is set. Also support setting it using an option to “datasette publish” and “datasette package” - which can either be a string or can be the magic string “today” to set it to today’s date: datasette publish file.db --updated=today",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/111/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 644161221,MDU6SXNzdWU2NDQxNjEyMjE=,117,Support for compound (composite) foreign keys,9599,open,0,,,3,2020-06-23T21:33:42Z,2020-06-23T21:40:31Z,,OWNER,,"It turns out SQLite supports composite foreign keys: https://www.sqlite.org/foreignkeys.html#fk_composite Their example looks like this: ```sql CREATE TABLE album( albumartist TEXT, albumname TEXT, albumcover BINARY, PRIMARY KEY(albumartist, albumname) ); CREATE TABLE song( songid INTEGER, songartist TEXT, songalbum TEXT, songname TEXT, FOREIGN KEY(songartist, songalbum) REFERENCES album(albumartist, albumname) ); ``` Here's what that looks like in sqlite-utils: ``` In [1]: import sqlite_utils In [2]: import sqlite3 In [3]: conn = sqlite3.connect("":memory:"") In [4]: conn Out[4]: In [5]: conn.executescript("""""" ...: CREATE TABLE album( ...: albumartist TEXT, ...: albumname TEXT, ...: albumcover BINARY, ...: PRIMARY KEY(albumartist, albumname) ...: ); ...: ...: CREATE TABLE song( ...: songid INTEGER, ...: songartist TEXT, ...: songalbum TEXT, ...: songname TEXT, ...: FOREIGN KEY(songartist, songalbum) REFERENCES album(albumartist, albumname) ...: ); ...: """""") Out[5]: In [6]: db = sqlite_utils.Database(conn) In [7]: db.tables Out[7]: [,
] In [8]: db.tables[0].foreign_keys Out[8]: [] In [9]: db.tables[1].foreign_keys Out[9]: [ForeignKey(table='song', column='songartist', other_table='album', other_column='albumartist'), ForeignKey(table='song', column='songalbum', other_table='album', other_column='albumname')] ``` The table appears to have two separate foreign keys, when actually it has a single compound (composite) foreign key.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/117/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 652961907,MDU6SXNzdWU2NTI5NjE5MDc=,121,Improved (and better documented) support for transactions,9599,open,0,,,3,2020-07-08T04:56:51Z,2020-09-24T20:36:46Z,,OWNER,,"_Originally posted by @simonw in https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655283393_ We should put some thought into how this library supports and encourages smart use of transactions.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/121/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 275125561,MDU6SXNzdWUyNzUxMjU1NjE=,123,Datasette serve should accept paths/URLs to CSVs and other file formats,9599,open,0,,,9,2017-11-19T02:05:48Z,2021-07-19T00:04:32Z,,OWNER,,"This would remove the csvs-to-sqlite step which I end up using for almost everything. I'm hesitant to introduce pandas as a required dependency though since it requires compiling numpy. Could build it so this option is only available if you have pandas installed.",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/123/reactions"", ""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 1, ""rocket"": 0, ""eyes"": 0}",, 275159710,MDU6SXNzdWUyNzUxNTk3MTA=,128,"Every visualization should have an ""embed"" button",9599,open,0,,,0,2017-11-19T13:38:13Z,2019-05-13T18:33:51Z,,OWNER,,"At least for the first round of visualizations, any time you construct one using the UI the result should include an ""embed this"" button that returns source code to copy and paste. These examples should use unpkg.com (or similar) URLs with SRI hashes, e.g. https://www.srihash.org - and should load data from the datasette JSON API.",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/128/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 675753042,MDU6SXNzdWU2NzU3NTMwNDI=,131,sqlite-utils insert: options for column types,9599,open,0,,,5,2020-08-09T18:59:11Z,2022-03-15T13:21:42Z,,OWNER,,"The `insert` command currently results in string types for every column - at least when used against CSV or TSV inputs.
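For context, here is a rough sketch of the kind of per-column type detection this would need - assumed logic, nothing the `insert` command does today:

```python
# Guess a SQLite column type by trying progressively looser conversions
# over a sample of the incoming string values. Sketch only.
def detect_type(values, sample_size=1000):
    sample = [v for v in list(values)[:sample_size] if v not in (None, '')]
    for func, type_name in ((int, 'INTEGER'), (float, 'FLOAT')):
        try:
            for v in sample:
                func(v)
            return type_name
        except (ValueError, TypeError):
            continue
    return 'TEXT'

print(detect_type(['1', '2', '3']))  # INTEGER
print(detect_type(['1.5', '2']))     # FLOAT
print(detect_type(['x', '1']))       # TEXT
```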
It would be useful if you could do the following: - automatically detect the column types based on e.g. the first 1,000 records - explicitly state the rule for specific columns `--detect-types` could work for the former - or it could do that by default and allow opt-out using `--no-detect-types` For specific columns maybe this: sqlite-utils insert db.db images images.tsv \ --tsv \ -c id int \ -c score float",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/131/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 275415799,MDU6SXNzdWUyNzU0MTU3OTk=,137,Ability to combine multiple SQL queries on a single graph,9599,open,0,,,1,2017-11-20T16:26:57Z,2019-05-13T18:33:51Z,,OWNER,,This would make visualizations significantly more powerful. The interesting challenge will be around the URL design. It would be useful to be able to combine either multiple explicit SQL queries or multiple queries based on the filter string parameters passed to one or more table views.,107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/137/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 275755475,MDU6SXNzdWUyNzU3NTU0NzU=,140,Heatmap visualization plugin,9599,open,0,,,2,2017-11-21T15:34:23Z,2019-05-13T18:33:51Z,,OWNER,,Could use https://github.com/scottbedard/svelte-heatmap,107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/140/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 688351054,MDU6SXNzdWU2ODgzNTEwNTQ=,140,Idea: insert-files mechanism for adding extra columns with fixed values,9599,open,0,,,1,2020-08-28T20:57:36Z,2022-03-20T19:45:45Z,,OWNER,,"Say for example you want to populate a `file_type` column with the value `gif`. That could work like this: ``` sqlite-utils insert-files gifs.db images *.gif \ -c path -c md5 -c last_modified:mtime \ -c file_type:text:gif --pk=path ``` So a column defined as a `text` column with a value that follows a second colon.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/140/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 688352145,MDU6SXNzdWU2ODgzNTIxNDU=,141,insert-files support for compressed values,9599,open,0,,,0,2020-08-28T20:59:46Z,2020-09-24T20:36:08Z,,OWNER,,"The `sqlar` format supports this; it would be useful if `insert-files` could support this too. https://www.sqlite.org/sqlar.html",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/141/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 688670158,MDU6SXNzdWU2ODg2NzAxNTg=,147,SQLITE_MAX_VARS maybe hard-coded too low,96218,open,0,,,7,2020-08-30T07:26:45Z,2021-02-15T21:27:55Z,,CONTRIBUTOR,,"I came across this while about to open an issue and PR against the documentation for `batch_size`, which is a bit incomplete. As mentioned in #145, while: > [`SQLITE_MAX_VARIABLE_NUMBER`](https://www.sqlite.org/limits.html#max_variable_number) ... defaults to 999 for SQLite versions prior to 3.32.0 (2020-05-22) or 32766 for SQLite versions after 3.32.0.
it is common that it is increased at compile time. Debian-based systems, for example, seem to ship with a version of sqlite compiled with SQLITE_MAX_VARIABLE_NUMBER set to 250,000, and I believe this is the case for homebrew installations too. In working to understand what `batch_size` was actually doing and why, I realized that by setting `SQLITE_MAX_VARS` in `db.py` to match the value my sqlite was compiled with (I'm on Debian), I was able to decrease the time to `insert_all()` my test data set (~128k records across 7 tables) from ~26.5s to ~3.5s. Given that this is about .05% of my total dataset, this is time I am keen to save... Unfortunately, it seems that `sqlite3` in the python standard library doesn't expose the `get_limit()` C API (even though `pysqlite` used to), so it's hard to know what value sqlite has been compiled with (note that this could mean, I suppose, that it's less than 999, and even hardcoding `SQLITE_MAX_VARS` to the conservative default might not be adequate. It can also be lowered -- but not raised -- at runtime). The best I could come up with is `echo """" | sqlite3 -cmd "".limits variable_number""` (only available in `sqlite >= 2015-05-07 (3.8.10)`). Obviously this couldn't be relied upon in `sqlite_utils`, but I wonder what your opinion would be about exposing `SQLITE_MAX_VARS` as a user-configurable parameter (with suitable ""here be dragons"" warnings)? I'm going to go ahead and monkey-patch it for my purposes in any event, but it seems like it might be worth considering.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/147/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 695441530,MDU6SXNzdWU2OTU0NDE1MzA=,154,OperationalError: cannot change into wal mode from within a transaction,9599,open,0,,,2,2020-09-07T23:42:44Z,2020-09-07T23:47:10Z,,OWNER,,"I'm getting this error when running: sqlite-utils enable-wal beta.db `OperationalError: cannot change into wal mode from within a transaction` I'm worried that maybe that's because of this new code from #152: https://github.com/simonw/sqlite-utils/blob/deb2eb013ff85bbc828ebc244a9654f0d9c3139e/sqlite_utils/db.py#L128-L129",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/154/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 702386948,MDU6SXNzdWU3MDIzODY5NDg=,159,.delete_where() does not auto-commit (unlike .insert() or .upsert()),11712349,open,0,,,9,2020-09-16T01:55:52Z,2023-04-01T17:21:05Z,,NONE,,"When you use the delete_where() function on a table, it never commits.... Is that intentional?",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/159/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 706001517,MDU6SXNzdWU3MDYwMDE1MTc=,163,Idea: conversions= could take Python functions,9599,open,0,,,4,2020-09-22T00:37:12Z,2021-12-20T00:56:52Z,,OWNER,,"Right now you use `conversions=` like this: ```python db[""example""].insert({ ""name"": ""The Bigfoot Discovery Museum"" }, conversions={""name"": ""upper(?)""}) ``` How about if you could optionally provide a Python function (or a lambda) like this?
```python db[""example""].insert({ ""name"": ""The Bigfoot Discovery Museum"" }, conversions={""name"": lambda s: s.upper()}) ``` This would work by creating a random name for that function, registering it (similar to #162), executing the SQL and then un-registering the custom function at the end.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/163/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 281110295,MDU6SXNzdWUyODExMTAyOTU=,173,I18n and L10n support,50138,open,0,,,2,2017-12-11T17:49:58Z,2021-04-26T12:10:01Z,,NONE,,It would be less geeky and more user friendly if the display strings in the filter menu and possibly other parts could be localized.,107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/173/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 285168503,MDU6SXNzdWUyODUxNjg1MDM=,176,Add GraphQL endpoint,173848,open,0,,,8,2017-12-29T23:21:01Z,2020-04-21T14:16:24Z,,NONE,,Would make it much easier to build React & similar frontends. Maybe with https://github.com/graphql-python/sanic-graphql ?,107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/176/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 288438570,MDU6SXNzdWUyODg0Mzg1NzA=,179,More metadata options for template authors ,9599,open,0,,,2,2018-01-14T20:51:04Z,2019-05-13T18:33:33Z,,OWNER,,See this thread on Twitter: https://twitter.com/simonw/status/952637152797458432,107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/179/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 299760684,MDU6SXNzdWUyOTk3NjA2ODQ=,185,Metadata should be a nested arbitrary KV store,222245,open,0,,,12,2018-02-23T16:02:07Z,2019-05-13T18:33:33Z,,NONE,,"I started using the metadata feature and was surprised to find that values are not inherited from the root object down to specific databases and tables. This makes metadata much less useful and requires a lot of pointless duplication. Ideally, metadata should allow arbitrary key-value pairs, and there should be a way of accessing metadata either in an inherited or non-inherited manner. Something like `metadata.page.key` vs. `metadata.this.key` might work as an interface.",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/185/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 722816436,MDU6SXNzdWU3MjI4MTY0MzY=,186,.extract() shouldn't extract null values,9599,open,0,,,7,2020-10-16T02:41:08Z,2021-08-12T12:32:14Z,,OWNER,,"This almost works, but it creates a rogue `type` record with a value of None. ``` In [1]: import sqlite_utils In [2]: db = sqlite_utils.Database(memory=True) In [5]: db[""creatures""].insert_all([ {""id"": 1, ""name"": ""Simon"", ""type"": None}, {""id"": 2, ""name"": ""Natalie"", ""type"": None}, {""id"": 3, ""name"": ""Cleo"", ""type"": ""dog""}], pk=""id"") Out[5]:
In [7]: db[""creatures""].extract(""type"") Out[7]:
In [8]: list(db[""creatures""].rows) Out[8]: [{'id': 1, 'name': 'Simon', 'type_id': None}, {'id': 2, 'name': 'Natalie', 'type_id': None}, {'id': 3, 'name': 'Cleo', 'type_id': 2}] In [9]: db[""type""] Out[9]:
In [10]: list(db[""type""].rows) Out[10]: [{'id': 1, 'type': None}, {'id': 2, 'type': 'dog'}] ```",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/186/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 309047460,MDU6SXNzdWUzMDkwNDc0NjA=,188,Ability to bundle metadata and templates inside the SQLite file,9599,open,0,,,4,2018-03-27T16:42:07Z,2020-12-04T17:18:34Z,,OWNER,,"One of the nicest qualities of SQLite as a data format is that you get a single file which you can then backup or share with other people. Datasette breaks this a little once you start including custom metadata.json or template files and CSS. It would be cool if there was an optional mechanism for baking that extra configuration into the SQLite file itself. That way entire datasette mini-applications (including canned queries and custom HTML and CSS) could be constructed as single .db files. Since datasette configuration is all file-based, one way to achieve that would be to support a ""datasette_files"" table which, if present is used to search for file contents by path. This is inline with the philosophy described by https://www.sqlite.org/appfileformat.html ",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/188/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 312395790,MDU6SXNzdWUzMTIzOTU3OTA=,197,Ability to sort by more than one column,9599,open,0,,,0,2018-04-09T05:13:30Z,2018-07-10T17:45:37Z,,OWNER,,"Split off from #189. I'd like to support ""sort by X descending, then by Y ascending if there are dupes for X"" as well. Suggested syntax for that: ?_sort_desc=X&_sort=Y we currently only allow one argument to be sent. We should allow as many arguments as there are columns, for example: ?_sort=department&_sort_desc=precinct&_sort=age&_sort_desc=size",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/197/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 312396095,MDU6SXNzdWUzMTIzOTYwOTU=,198,Ability to sort with nulls last,9599,open,0,,,0,2018-04-09T05:15:40Z,2018-07-10T17:45:37Z,,OWNER,,"Split off from #189 Here's how to do that in SQL: https://fivethirtyeight.datasettes.com/fivethirtyeight-2628db9?sql=select+rowid%2C+*+from+%5Bnfl-wide-receivers%2Fadvanced-historical%5D%0D%0Aorder+by+case+when+career_ranypa+is+null+then+1+else+0+end%2C+career_ranypa%2C+rowid order by case when career_ranypa is null then 1 else 0 end, career_ranypa",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/198/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 743384829,MDExOlB1bGxSZXF1ZXN0NTIxMjg3OTk0,203,changes to allow for compound foreign keys,1049910,open,0,,,7,2020-11-16T00:30:10Z,2023-01-25T18:47:18Z,,FIRST_TIME_CONTRIBUTOR,simonw/sqlite-utils/pulls/203,"Add support for compound foreign keys, as per issue #117 Not sure if this is the right approach. In particular I'm unsure about: - the new `ForeignKey` class, which replaces the namedtuple in order to ensure that `column` and `other_column` are forced into tuples. The class does the job, but doesn't feel very elegant. 
- I haven't rewritten `guess_foreign_table` to take account of multiple columns, so it just checks for the first column in the foreign key definition. This isn't ideal. - I haven't added any ability to the CLI to add compound foreign keys, it's only in the python API at the moment. The PR also contains a minor related change: columns and tables are always quoted in foreign key definitions.",140912432,pull,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/203/reactions"", ""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",0, 314771615,MDU6SXNzdWUzMTQ3NzE2MTU=,218,"Support custom unit display in order to handle ""$10,000""",9599,open,0,,,0,2018-04-16T18:39:31Z,2018-07-10T17:45:38Z,,OWNER,,"I tried to get Datasette to display `$10,000` using the new units support but we currently only display units as a suffix: https://github.com/simonw/datasette/blob/10a34f995c70daa37a8a2aa02c3135a4b023a24c/datasette/app.py#L563-L572 It would be neat if there was a mechanism for specifying a custom unit display - maybe something like this: ``` { ""custom_units"": { ""us_dollar"": { ""unit"": ""us_dollar = [] = $"", ""format"": ""${:,}"" } } } ```",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/218/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 314834783,MDU6SXNzdWUzMTQ4MzQ3ODM=,219,Expose units in the JSON API?,45057,open,0,,,0,2018-04-16T22:04:25Z,2018-04-16T22:04:25Z,,CONTRIBUTOR,,"From #203: it would be nice for the JSON API to (optionally) return columns rendered with units in them - if, for example, you're consuming the JSON to render the rows on a map. I'm not entirely sure how useful this will be though - at the moment my map queries are custom SQL queries (a few have joins in, the rest might be fetching large amounts of data so it makes sense to limit columns fetched). Perhaps the SQL function is a better approach in general.",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/219/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 316621102,MDU6SXNzdWUzMTY2MjExMDI=,235,Add limit on the size in KB of data returned from a single query,9599,open,0,,,2,2018-04-22T23:01:15Z,2018-04-24T00:30:02Z,,OWNER,,"Datasette limits the number of rows returned to 1,000 and limits the time spent executing a SQL query to 1000ms - and both of these limits can be customized. It does not have a limit on the size of the response returned. It's possible to compose maliciously large SQL responses in a small number of rows using mechanisms like the `group_concat()` aggregate function. It would be good to avoid malicious SQL creating 100MB+ responses and potentially crashing the server. I think the easiest place to implement that is here: https://github.com/simonw/datasette/blob/f3f42957128c1e7ece584d45d9167f2ac003a3b8/datasette/app.py#L175-L190 Currently we use `cursor.fetchmany()` to fetch up to 1,001 rows at once. Instead, we could switch to iterating through `cursor.fetchone()` (or just using `for row in cursor`) and keeping a running tally of the size of the response as we go - maybe just using `rough_response_size += len(str(row))`. If that goes above a certain threshold we can terminate the response with an error, like we do with time limits.
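A minimal sketch of that approach - the names here are assumptions, not actual Datasette code:

```python
# Iterate one row at a time, keeping a rough running total of the
# response size, and bail out once a threshold is crossed. Sketch only.
class ResponseTooLarge(Exception):
    pass

def fetch_rows_with_size_limit(cursor, max_kb=100):
    rows = []
    rough_response_size = 0
    for row in cursor:  # one row at a time instead of fetchmany()
        rough_response_size += len(str(row))
        if rough_response_size > max_kb * 1024:
            raise ResponseTooLarge('response exceeded %d KB' % max_kb)
        rows.append(row)
    return rows
```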
The bigger challenge here is understanding how well this approach works and what impact it will have on overall Datasette performance. I think I need #33 for this.",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/235/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 317001500,MDU6SXNzdWUzMTcwMDE1MDA=,236,datasette publish lambda plugin,9599,open,0,,,11,2018-04-23T22:10:30Z,2023-03-12T14:04:15Z,,OWNER,,"Refs #217 - create a publish plugin that can deploy to AWS Lambda. https://docs.aws.amazon.com/lambda/latest/dg/limits.html says lambda packages can be up to 50 MB, so this would only work with smaller databases (the command can check the filesize before attempting to package and deploy it). Lambdas do get a 512 MB `/tmp` directory too, so for larger databases the function could start and then download up to 512MB from an S3 bucket - so the plugin could take an optional S3 bucket to write to and know how to upload the `.db` file there and then have the lambda download it on startup.",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/236/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 816526538,MDU6SXNzdWU4MTY1MjY1Mzg=,239,sqlite-utils extract could handle nested objects,9599,open,0,,,16,2021-02-25T15:10:28Z,2022-09-03T23:46:02Z,,OWNER,,"Imagine a table (imported from a nested JSON file) where one of the columns contains values that look like this: {""email"": ""anonymous@noreply.airtable.com"", ""id"": ""usrROSHARE0000000"", ""name"": ""Anonymous""} The `sqlite-utils extract` command already uses single text values in a column to populate a new table. It would not be much of a stretch for it to be able to use JSON instead, including specifying which of those values should be used as the primary key in the new table.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/239/reactions"", ""total_count"": 6, ""+1"": 5, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 318490133,MDU6SXNzdWUzMTg0OTAxMzM=,241,Default datasette logging format should be JSON,9599,open,0,,,0,2018-04-27T17:32:48Z,2018-07-10T17:45:40Z,,OWNER,,"Structured logs are better. Datasette should default to outputting it's HTTP access log lines as newline delimited JSON instead of the Sanic default format it uses at the moment. For improved greppability these logs should have keys ordered in a consistent way. Python's JSON module can do this with ordered dictionaries.",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/241/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 816601354,MDExOlB1bGxSZXF1ZXN0NTgwMjM1NDI3,241,Extract expand - work in progress,9599,open,0,,,0,2021-02-25T16:36:38Z,2021-02-25T16:36:38Z,,OWNER,simonw/sqlite-utils/pulls/241,Refs #239. 
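For anyone following along, a hand-rolled sketch of the behaviour #239 asks for, using only the existing public API - this is not the code in this branch:

```python
# Pull a nested JSON column out into a lookup table, replacing the blob
# with a foreign key to it. Sketch only - assumed table and column names.
import json
import sqlite_utils

db = sqlite_utils.Database(memory=True)
db['rows'].insert({'item': 1, 'user': json.dumps(
    {'id': 'usrROSHARE0000000', 'name': 'Anonymous'})}, pk='item')

users = {}
for row in list(db['rows'].rows):
    user = json.loads(row['user'])
    users[user['id']] = user
    # Replace the JSON blob with a reference into the new lookup table
    db['rows'].update(row['item'], {'user': user['id']})

db['users'].insert_all(users.values(), pk='id')
db['rows'].add_foreign_key('user', 'users', 'id')
```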
Still needs documentation and CLI implementation.",140912432,pull,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/241/reactions"", ""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1, 817989436,MDU6SXNzdWU4MTc5ODk0MzY=,242,Async support,25778,open,0,,,13,2021-02-27T18:29:38Z,2021-10-28T14:37:56Z,,CONTRIBUTOR,,"Following our conversation last week, want to note this here before I forget. I've had a couple of situations where I'd like to do a bunch of updates in an async event loop, but I run into SQLite's issues with concurrent writes. This feels like something sqlite-utils could help with. PeeWee ORM has a [SQLite write queue](http://docs.peewee-orm.com/en/latest/peewee/playhouse.html#sqliteq) that might be a good model. It's using threads or gevent, but I _think_ that approach would translate well enough to asyncio. Happy to help with this, too.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/242/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 818684978,MDU6SXNzdWU4MTg2ODQ5Nzg=,243,How can i use this utils to deal with fts on column meta of tables ?,27874014,open,0,,,0,2021-03-01T09:45:05Z,2021-03-01T09:45:05Z,,NONE,,"Thank you for releasing this excellent project. When I use this project on a multi-table db, I want to implement convenient search on column names from different tables. I want to develop a meta table to save the metadata of different columns of different tables, and search on this meta table to get rows from the data table (which the meta table describes). Does this project provide a simple function for this? You can think of it as a knowledge graph about the tables in the db, and I save this knowledge graph into the db with FTS enabled.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/243/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 319449852,MDU6SXNzdWUzMTk0NDk4NTI=,247,SQLite code decoupled from Datasette,11912854,open,0,,,1,2018-05-02T08:03:28Z,2018-05-21T15:29:31Z,,NONE,,"I'm working on the possibility of using Datasette with other file formats that aren't SQLite, like files with [PyTables](https://github.com/PyTables/PyTables) format. In order to accomplish that, I've started [a fork for decoupling the code related to SQLite](https://github.com/jsancho-gpl/datasette/tree/feature/db-type-plugin) and putting it in an external connector to allow future connectors for a lot of file formats. It'd be nice if you could look at it and suggest improvements for a possible PR.",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/247/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 836829560,MDU6SXNzdWU4MzY4Mjk1NjA=,248,support for Apache Arrow / parquet files I/O,649467,open,0,,,1,2021-03-20T14:59:30Z,2021-10-28T23:46:48Z,,NONE,,"I just started looking at Apache Arrow using pyarrow for import and export of tabular datasets, and it looks quite compelling. It might be worth looking at for sqlite-utils and/or datasette.
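To make that concrete, a sketch of exporting a SQLite table to Parquet with an explicit schema derived from the declared column types - assumed file and table names, and not a feature of either project today:

```python
# Map SQLite column affinities to Arrow types so pyarrow does not have
# to guess. Sketch only; unrecognized types fall back to string.
import sqlite3
import pyarrow as pa
import pyarrow.parquet as pq

AFFINITY_TO_ARROW = {
    'INTEGER': pa.int64(), 'REAL': pa.float64(),
    'TEXT': pa.string(), 'BLOB': pa.binary(),
}

conn = sqlite3.connect('demo.db')
info = conn.execute('PRAGMA table_info(mytable)').fetchall()
names = [row[1] for row in info]
schema = pa.schema([
    (row[1], AFFINITY_TO_ARROW.get((row[2] or 'TEXT').upper(), pa.string()))
    for row in info
])
data = conn.execute('select * from mytable').fetchall()
columns = {name: [r[i] for r in data] for i, name in enumerate(names)}
pq.write_table(pa.table(columns, schema=schema), 'mytable.parquet')
```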
As a test, I took a random jsonl data dump of a dataset I have with floats, strings, and ints and converted it to arrow's parquet format using the naive `pyarrow.parquet.write_table()` function, which has automatic type inference. It compressed down to 7% of the original size. Converting a 26MB JSON file and serializing it to parquet was near-instantaneous. Parquet files are portable and can be directly imported into pandas and other analytics software. The only hangup is the automatic type inference of the naive reader. It's great for general laziness and for parsing JSON columns (it correctly interpreted a table of mine with a JSON array). However, I did get an exception for a string column where most entries looked integer-like but had a couple of values that weren't -- the reader tried to coerce all of them for some reason, even though the JSON type is string. Since the writer optionally takes a schema, it shouldn't be too hard to grab the sqlite header types. With some additional hinting, you might get datetime columns and JSON, which are native Arrow types. Somewhat tangentially, someone even wrote an sqlite vfs extension for Parquet: https://cldellow.com/2018/06/22/sqlite-parquet-vtable.html ",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/248/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 320132682,MDU6SXNzdWUzMjAxMzI2ODI=,250,Setup some issue templates,9599,open,0,,,0,2018-05-04T01:49:07Z,2018-05-04T01:49:07Z,,OWNER,,"https://twitter.com/left_pad/status/99216385740464537 I like the idea of using these to help people understand some of the ways I want to use issues.",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/250/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 323223872,MDU6SXNzdWUzMjMyMjM4NzI=,260,Validate metadata.json on startup,9599,open,0,,,7,2018-05-15T13:42:56Z,2023-06-21T12:51:22Z,,OWNER,,"It's easy to misspell the name of a database or table and then be puzzled when the metadata settings silently fail. To avoid this, let's sanity check the provided metadata.json on startup and quit with a useful error message if we find any obvious mistakes.",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/260/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 323658641,MDU6SXNzdWUzMjM2NTg2NDE=,262,Add ?_extra= mechanism for requesting extra properties in JSON,9599,open,0,,3268330,27,2018-05-16T14:55:42Z,2023-03-29T06:22:22Z,,OWNER,,"Datasette views currently work by creating a set of data that should be returned as JSON, then defining an additional, optional `template_data()` function which is called if the view is being rendered as HTML. This `template_data()` function calculates extra template context variables which are necessary for the HTML view but should not be included in the JSON. Example of how that is used today: https://github.com/simonw/datasette/blob/2b79f2bdeb1efa86e0756e741292d625f91cb93d/datasette/views/table.py#L672-L704 With features like Facets in #255 I'm beginning to want to move more items into the `template_data()` - in the case of facets it's the `suggested_facets` array.
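A simplified sketch of the proposed mechanism - the names below are assumptions, not Datasette internals:

```python
# Only compute registered extras that the request explicitly asks for
# via one or more ?_extra= parameters. Sketch only.
EXTRAS = {
    'suggested_facets': lambda data: ['state', 'user'],  # placeholder computation
    'table_metadata': lambda data: {'source': 'demo'},   # placeholder computation
}

def apply_extras(data, extra_params):
    # extra_params: the repeated ?_extra= values from the query string
    for name in extra_params:
        if name in EXTRAS:
            data[name] = EXTRAS[name](data)
    return data

print(apply_extras({'rows': []}, ['suggested_facets']))
```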
This saves that feature from being calculated (involving several SQL queries) for the JSON case where it is unlikely to be used. But... as an API user, I want to still optionally be able to access that information. Solution: Add a `?_extra=suggested_facets&_extra=table_metadata` argument which can be used to optionally request additional blocks to be added to the JSON API. Then redefine as many of the current `template_data()` features as extra arguments instead, and teach Datasette to return certain extras by default when rendering templates. This could allow the JSON representation to be slimmed down further (removing e.g. the `table_definition` and `view_definition` keys) while still making that information available to API users who need it.",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/262/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 907795562,MDU6SXNzdWU5MDc3OTU1NjI=,265,Using enable_fts before search term,36287,open,0,,,1,2021-06-01T01:43:34Z,2023-04-01T17:27:18Z,,NONE,,"Many thanks for the sqlite-utils suite of utilities. Has made my life much much easier. I used this to create a table and enable FTS. All works fine. The datasette utility detects FTS and shows a text box. Searching for a term using that interface works well. However, when I start to use features by following https://www.sqlite.org/fts5.html section **""3. Full-text Query Syntax""** I seem to run into issues that I suspect are due to the `escape_fts` wrapper function. As an example, if I search for the term `""^குகை""` on the text box in datasette it produces 140 results. However, when I tweak the query produced by datasette to not use ""escape_fts"" it produces 5 results. Similarly, when I try to restrict the search to a single column in FTS using a spec like `{title : ^குகை}` it returns no rows. The same thing pulls results when used without `escape_fts`. The text in the table is in the Tamil language and the search term is a Tamil word. ``` ... where posts_fts match escape_fts(:search) ``` vs ``` ... where posts_fts match (:search) ``` Any ideas why? How can I get the benefits of both escaping as well as utilizing different facets of providing / controlling search terms? Thanks.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/265/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 915421499,MDU6SXNzdWU5MTU0MjE0OTk=,267,row.update() or row.pk,12721157,open,0,,,4,2021-06-08T19:56:00Z,2021-06-22T17:27:27Z,,NONE,,"Hi, fantastic framework for working with Sqlite3 databases!!! I tried to update specific rows in a table and used for row in db[tablename]: newValue = row[""counter""] * row[""prize""] row.update({""Fieldname"": newValue}) print(row) This updates the value in the printed row, but not in the database. So I switched to db[tablename].update(id, {""Fieldname"": newValue}) This works fine. But row.update would be nicer, because no need for the id (it's that row), no need for the tablename and the db (all defined in the for row ... loop).
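In the meantime, a sketch of the pattern that works with the current API - pass the primary key to `Table.update()` inside the loop:

```python
# Update rows while looping over them by handing the pk to update().
import sqlite_utils

db = sqlite_utils.Database(memory=True)
db['items'].insert_all([{'id': 1, 'counter': 2, 'prize': 10}], pk='id')

for row in list(db['items'].rows):
    # alter=True adds the missing 'total' column on first use
    db['items'].update(row['id'], {'total': row['counter'] * row['prize']}, alter=True)

print(list(db['items'].rows))
# [{'id': 1, 'counter': 2, 'prize': 10, 'total': 20}]
```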
Thx ",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/267/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 323718842,MDU6SXNzdWUzMjM3MTg4NDI=,268,Mechanism for ranking results from SQLite full-text search,9599,open,0,,,12,2018-05-16T17:36:40Z,2022-01-13T22:21:28Z,,OWNER,,This isn't particularly straight-forward - all the more reason for Datasette to implement it for you. This article is helpful: http://charlesleifer.com/blog/using-sqlite-full-text-search-with-python/,107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/268/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 326599525,MDU6SXNzdWUzMjY1OTk1MjU=,286,Database hash should include current datasette version,9599,open,0,,,2,2018-05-25T17:03:42Z,2018-05-25T17:07:36Z,,OWNER,,"Right now deploying a new version of datasette doesn't invalidate existing URLs, so users may still see a cached copy of the old templates. We can fix this by including the current datasette version in the input to the hash function (which is currently just the database file contents).",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/286/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 326778161,MDU6SXNzdWUzMjY3NzgxNjE=,290,Consider increasing the default for num_sql_threads (currently 3),9599,open,0,,,0,2018-05-27T00:52:41Z,2018-05-27T00:52:41Z,,OWNER,,"I ran a very rough micro-benchmark on the new `num_sql_threads` config option (added in #285) datasette --config num_sql_threads:1 fivethirtyeight.db Then ab -n 100 -c 10 'http://127.0.0.1:8011/fivethirtyeight-2628db9/twitter-ratio%2Fsenators' | Number of threads | Requests/second | |---|---| | 1 | 4.57 | | 3 | 9.77 | | 10 | 13.53 | | 20 | 15.24 | | 50 | 8.21 | This was on my early 2018 OS X laptop. Need to benchmark in other common environments before making a decision on changing the default. That said, the default of 3 was a number I plucked out of thin air.",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/290/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 327365110,MDU6SXNzdWUzMjczNjUxMTA=,294,inspect should record column types,9599,open,0,,,7,2018-05-29T15:10:41Z,2019-06-28T16:45:28Z,,OWNER,,"For each table we want to know the columns, their order and what type they are. I'm going to break with SQLite defaults a little on this one and allow datasette to define additional types - to start with just a `geometry` type for columns that are detected as SpatiaLite geometries. Possible JSON design: ""columns"": [{ ""name"": ""title"", ""type"": ""text"" }, ...]
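A minimal sketch of gathering that information with `PRAGMA table_info` - assumed database and table names:

```python
# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk)
# for each column, in order. Sketch only.
import sqlite3

conn = sqlite3.connect('fivethirtyeight.db')
columns = [
    {'name': row[1], 'type': (row[2] or 'text').lower()}
    for row in conn.execute('PRAGMA table_info(mytable)')
]
print(columns)  # e.g. [{'name': 'title', 'type': 'text'}, ...]
```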
Refs #276",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/294/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 327395270,MDU6SXNzdWUzMjczOTUyNzA=,296,Per-database and per-table /-/ URL namespace,9599,open,0,,,3,2018-05-29T16:23:13Z,2019-06-28T16:46:34Z,,OWNER,,"Initially this will be for subsets of `/-/inspect` and `/-/metadata` but it will also give us a URL namespace for future features like `/-/facet` (expanded list of a specific facet, linked to from `...`) and `/-/graph` To start: * `/dbname/-/inspect` * `/dbname/-/metadata` * `/dbname/tablename/-/inspect` * `/dbname/tablename/-/metadata` This means we will no longer allow databases or tables to have the name `""-""` - I think that's OK We will continue to support rows with a primary key of `""-""` at the following URL: * `/dbname/tablename/-`",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/296/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 944846776,MDU6SXNzdWU5NDQ4NDY3NzY=,297,Option for importing CSV data using the SQLite .import mechanism,9599,open,0,,,23,2021-07-14T22:36:41Z,2023-09-22T20:49:52Z,,OWNER,,"As seen in https://til.simonwillison.net/sqlite/import-csv - `.mode csv` and then `.import school.csv schools` is hugely faster than importing via `sqlite-utils insert` and doing the work in Python - but it can only be implemented by shelling out to the `sqlite3` CLI tool, it's not functionality that is exposed to the Python `sqlite3` module. An option to use this would be useful - maybe something like this: sqlite-utils insert blah.db blah blah.csv --fast",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/297/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 328155946,MDU6SXNzdWUzMjgxNTU5NDY=,301,"--spatialite option for ""datasette publish heroku""",9599,open,0,,,1,2018-05-31T14:13:09Z,2022-01-20T21:28:50Z,,OWNER,,Split off from #243. Need to figure out how to install and configure SpatiaLite on Heroku.,107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/301/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 330826972,MDU6SXNzdWUzMzA4MjY5NzI=,308,"Support extra Heroku apps:create options - region, space, team",78156,open,0,,,2,2018-06-08T23:08:33Z,2018-09-21T14:09:28Z,,NONE,,"It would be useful to document how to pass Heroku CLI options on `datasette publish`, e.g. 
`--region eu`.",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/308/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 961008507,MDU6SXNzdWU5NjEwMDg1MDc=,308,Add an interactive tutorial as a Jupyter notebook,9599,open,0,,,2,2021-08-04T20:34:22Z,2021-08-04T21:30:59Z,,OWNER,,Can show people how to open this up in Binder.,140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/308/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 974067156,MDU6SXNzdWU5NzQwNjcxNTY=,318,Research: handle gzipped CSV directly,9599,open,0,,,2,2021-08-18T21:23:04Z,2021-08-18T21:25:30Z,,OWNER,,"Would it be worthwhile for the `sqlite-utils` command-line tool to grow features to efficiently directly interact with gzipped CSV data? Maybe add `--gz` options to both `insert` and to the various commands that output query results.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/318/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 335200136,MDU6SXNzdWUzMzUyMDAxMzY=,327,Explore if SquashFS can be used to shrink size of packaged Docker containers,9599,open,0,,,4,2018-06-24T18:15:16Z,2022-02-17T23:37:24Z,,OWNER,,"Inspired by this article: https://cldellow.com/2018/06/22/sqlite-parquet-vtable.html#sqlite-database-indexed--squashed https://en.wikipedia.org/wiki/SquashFS is ""a compressed read-only file system for Linux"" - which means it could be a really nice fit for Datasette and its read-only SQLite databases. It would be interesting to explore a Dockerfile recipe that used SquashFS to compress the SQLite database file that was bundled up by `datasette package` and friends.",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/327/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 341228846,MDU6SXNzdWUzNDEyMjg4NDY=,343,Render boolean fields better by default,45057,open,0,,,1,2018-07-14T11:10:29Z,2018-07-14T14:17:14Z,,CONTRIBUTOR,,These show up as 0 or 1 because sqlite. I think Yes/No would be fine in most cases?,107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/343/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 1066563554,I_kwDOCGYnMM4_knfi,346,Way to test SQLite 3.37 (and potentially other versions) in CI,9599,open,0,,,5,2021-11-29T22:21:06Z,2021-11-29T23:12:49Z,,OWNER,,"> Need to figure out a good pattern for testing this in CI too - it will currently skip the new tests if it doesn't have SQLite 3.37 or higher. 
_Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/344#issuecomment-982076924_",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/346/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",, 344654623,MDU6SXNzdWUzNDQ2NTQ2MjM=,347,"Rename ""datasette package"" to ""datasette publish docker""",9599,open,0,,,0,2018-07-26T00:42:46Z,2018-07-26T00:42:46Z,,OWNER,,,107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/347/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,