issue_comments

54 rows where issue = 627794879 sorted by updated_at descending

user 4

  • simonw 49
  • frankieroberto 3
  • carlmjohnson 1
  • simonrjones 1

author_association 2

  • OWNER 49
  • NONE 5

issue 1

  • Redesign default .json format · 54
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1368269732 https://github.com/simonw/datasette/issues/782#issuecomment-1368269732 https://api.github.com/repos/simonw/datasette/issues/782 IC_kwDOBm6k_c5RjiOk simonw 9599 2022-12-31T19:32:33Z 2023-01-17T02:05:45Z OWNER

New thinking on the trimmed-down default. Previously I was going to use "rows" and "next_url" - I now want to do this instead:

    {
        "ok": true,
        "rows": [
            {"pk1": "a", "pk2": "a", "pk3": "a", "content": "a-a-a"},
            {"pk1": "a", "pk2": "a", "pk3": "b", "content": "a-a-b"}
        ],
        "next": "a,a,b"
    }

If there isn't a next page it will return "next": null.

This is even more succinct. I'm OK with people having to request next_url if they don't want to construct the new URL themselves.

The "ok": true is there so it can be false for errors, consistently.
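A client consuming this shape might page through results along the following lines. This is only a minimal sketch, not Datasette client code: fetch_json is a hypothetical callable supplied by the caller, and sending the token back as ?_next= is an assumption based on the _next=5 URL that appears elsewhere in this thread.

```python
from urllib.parse import urlencode


def iter_rows(base_url, fetch_json, page_size=2):
    """Yield every row, following the "next" token until it is null.

    fetch_json is a hypothetical callable: it takes a URL and returns the
    decoded JSON object. The _next and _size parameter names are assumptions.
    """
    token = None
    while True:
        params = {"_size": page_size}
        if token is not None:
            params["_next"] = token
        page = fetch_json(base_url + "?" + urlencode(params))
        # "ok" is false for errors, so stop before touching "rows".
        if not page.get("ok", False):
            raise RuntimeError("API reported an error: %r" % (page,))
        yield from page["rows"]
        token = page.get("next")
        if token is None:
            break
```

Because the HTTP call is injected, the loop can be exercised against canned pages without any network access.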

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
1368285442 https://github.com/simonw/datasette/issues/782#issuecomment-1368285442 https://api.github.com/repos/simonw/datasette/issues/782 IC_kwDOBm6k_c5RjmEC simonw 9599 2022-12-31T22:02:16Z 2022-12-31T22:02:16Z OWNER

https://latest.datasette.io/fixtures/compound_three_primary_keys.json?_size=2 now returns count:

    {
        "database": "fixtures",
        "table": "compound_three_primary_keys",
        "is_view": false,
        "human_description_en": "",
        "rows": [
            {"pk1": "a", "pk2": "a", "pk3": "a", "content": "a-a-a"},
            {"pk1": "a", "pk2": "a", "pk3": "b", "content": "a-a-b"}
        ],
        "truncated": false,
        "count": 1001,

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
1368278278 https://github.com/simonw/datasette/issues/782#issuecomment-1368278278 https://api.github.com/repos/simonw/datasette/issues/782 IC_kwDOBm6k_c5RjkUG simonw 9599 2022-12-31T20:49:38Z 2022-12-31T20:49:38Z OWNER

I'm going to rename filtered_table_rows_count to count - to match the SQL count(*) function.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
1368269811 https://github.com/simonw/datasette/issues/782#issuecomment-1368269811 https://api.github.com/repos/simonw/datasette/issues/782 IC_kwDOBm6k_c5RjiPz simonw 9599 2022-12-31T19:33:09Z 2022-12-31T19:33:09Z OWNER

Here's the so-far updated documentation for this change: https://github.com/simonw/datasette/blob/a2dca62360ad4a961d4c46f68eae41b7d5c7b2c9/docs/json_api.rst#different-shapes

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
1368269283 https://github.com/simonw/datasette/issues/782#issuecomment-1368269283 https://api.github.com/repos/simonw/datasette/issues/782 IC_kwDOBm6k_c5RjiHj simonw 9599 2022-12-31T19:29:45Z 2022-12-31T19:29:45Z OWNER

https://latest.datasette.io/fixtures/compound_three_primary_keys.json?_size=2 now shows the new default:

    {
        "database": "fixtures",
        "table": "compound_three_primary_keys",
        "is_view": false,
        "human_description_en": "",
        "rows": [
            {"pk1": "a", "pk2": "a", "pk3": "a", "content": "a-a-a"},
            {"pk1": "a", "pk2": "a", "pk3": "b", "content": "a-a-b"}
        ],

The old format can be had like this: https://latest.datasette.io/fixtures/compound_three_primary_keys.json?_size=2&_shape=arrays

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
1368268148 https://github.com/simonw/datasette/issues/782#issuecomment-1368268148 https://api.github.com/repos/simonw/datasette/issues/782 IC_kwDOBm6k_c5Rjh10 simonw 9599 2022-12-31T19:22:07Z 2022-12-31T19:22:07Z OWNER

It turned out the most significant part of this change - switching from an array of arrays to an array of objects for the "rows" key - was really easy: Datasette already had a ?_shape=arrays vs. ?_shape=objects mechanism, so I switched which one was the default in https://github.com/simonw/datasette/commit/234230e59574ccb8d8a24c45ccd325f725812377
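The difference between the two shapes can be illustrated with a small conversion. This is a sketch for illustration only, and it assumes the array-of-arrays shape delivers the column names as a separate list; arrays_to_objects is a hypothetical helper, not part of Datasette.

```python
def arrays_to_objects(columns, rows):
    """Convert array-of-arrays rows into array-of-objects rows."""
    # zip pairs each column name with the value at the same position.
    return [dict(zip(columns, row)) for row in rows]


columns = ["pk1", "pk2", "pk3", "content"]
rows = [["a", "a", "a", "a-a-a"], ["a", "a", "b", "a-a-b"]]
objects = arrays_to_objects(columns, rows)
```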

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
970554697 https://github.com/simonw/datasette/issues/782#issuecomment-970554697 https://api.github.com/repos/simonw/datasette/issues/782 IC_kwDOBm6k_c452X1J simonw 9599 2021-11-16T18:32:03Z 2021-11-16T18:32:03Z OWNER

I'm going to take another look at this: - https://github.com/simonw/datasette/issues/878

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
970553780 https://github.com/simonw/datasette/issues/782#issuecomment-970553780 https://api.github.com/repos/simonw/datasette/issues/782 IC_kwDOBm6k_c452Xm0 simonw 9599 2021-11-16T18:30:51Z 2021-11-16T18:30:58Z OWNER

OK, I'm ready to start working on this today.

I'm going to go with a default representation that looks like this:

    {
        "rows": [
            {"id": 1, "name": "One"},
            {"id": 2, "name": "Two"}
        ],
        "next_url": null
    }

Note that there's no count - all it provides is the current selection of results and an indication as to how the next can be retrieved (null if there are no more results).

I'll implement ?_extra= to provide everything else.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
803469623 https://github.com/simonw/datasette/issues/782#issuecomment-803469623 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDgwMzQ2OTYyMw== simonw 9599 2021-03-20T22:01:23Z 2021-03-20T22:01:23Z OWNER

I'm going to keep ?_shape=array working on the assumption that many existing uses of the Datasette API are already using that option, so it would be nice not to break them.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
783265830 https://github.com/simonw/datasette/issues/782#issuecomment-783265830 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4MzI2NTgzMA== frankieroberto 30665 2021-02-22T10:21:14Z 2021-02-22T10:21:14Z NONE

@simonw:

> The problem there is that ?_size=x isn't actually doing the same thing as the SQL limit keyword.

Interesting! Although I don't think it matters too much what the underlying implementation is - I more meant that limit is familiar to developers conceptually as "up to and including this number, if they exist", whereas "size" is potentially more ambiguous. However, it's probably no big deal either way.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
782789598 https://github.com/simonw/datasette/issues/782#issuecomment-782789598 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc4OTU5OA== simonw 9599 2021-02-21T03:30:02Z 2021-02-21T03:30:02Z OWNER

Another benefit to default:object - I could include a key that shows a list of available extras. I could then use that to power an interactive API explorer.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
782765665 https://github.com/simonw/datasette/issues/782#issuecomment-782765665 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc2NTY2NQ== simonw 9599 2021-02-20T23:34:41Z 2021-02-20T23:34:41Z OWNER

OK, I'm back to the "top level object as the default" side of things now - it's pretty much unanimous at this point, and it's certainly true that it's not a decision you'll ever regret.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
782756398 https://github.com/simonw/datasette/issues/782#issuecomment-782756398 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc1NjM5OA== simonrjones 601316 2021-02-20T22:05:48Z 2021-02-20T22:05:48Z NONE

> I think it’s a good idea if the top level item of the response JSON is always an object, rather than an array, at least as the default.

I agree it is more predictable if the top level item is an object with a rows or data object that contains an array of data, which then allows for other top-level meta data.

I can see the argument for removing this and just using an array for convenience - but I think that's OK as an option (as you have now).

Rather than have lots of top-level keys you could have a "meta" object to contain non-data stuff. You could use something like "links" for API endpoint URLs (or use a standard like HAL). Which would then leave the top level a bit cleaner - if that's what you want.

Have you had much feedback from users who use the Datasette API a lot?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
782748501 https://github.com/simonw/datasette/issues/782#issuecomment-782748501 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0ODUwMQ== simonw 9599 2021-02-20T20:58:18Z 2021-02-20T20:58:18Z OWNER

Yet another option: support a ?_path=x option which returns a nested path from the result. So you could do this:

/github/commits.json?_path=rows - to get back a top-level array pulled from the "rows" key.
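As a rough illustration of what a ?_path= option would do to a response, here is a minimal sketch. The dot-separated syntax for deeper paths is purely an assumption made for this example, and extract_path is a hypothetical helper.

```python
def extract_path(payload, path):
    """Return the value found by walking dot-separated keys into payload."""
    current = payload
    for key in path.split("."):
        if isinstance(current, list):
            # Numeric segments index into arrays (an assumption of this sketch).
            current = current[int(key)]
        else:
            current = current[key]
    return current


payload = {"rows": [{"id": 1, "name": "One"}], "next_url": None}
top_level_array = extract_path(payload, "rows")
```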

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
782748093 https://github.com/simonw/datasette/issues/782#issuecomment-782748093 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0ODA5Mw== simonw 9599 2021-02-20T20:54:52Z 2021-02-20T20:54:52Z OWNER

> Have you given any thought as to whether to pretty print (format with spaces) the output or not? Can be useful for debugging/exploring in a browser or other basic tools which don’t parse the JSON. Could be default (can’t be much bigger with gzip?) or opt-in.

Adding a ?_pretty=1 option that does that is a great idea, I'm filing a ticket for it: #1237

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
782747878 https://github.com/simonw/datasette/issues/782#issuecomment-782747878 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0Nzg3OA== simonw 9599 2021-02-20T20:53:11Z 2021-02-20T20:53:11Z OWNER

... though thinking about this further, I could re-implement the select * from commits (but only return a max of 10 results) feature using a nested select * from (select * from commits) limit 10 query.
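The nested-select idea can be tried against an in-memory SQLite database. This is a sketch under stated assumptions: run_limited is a hypothetical helper, the commits table here is a stand-in, and the user SQL is assumed to be a single select statement with no trailing semicolon.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table commits (sha text)")
conn.executemany(
    "insert into commits (sha) values (?)", [(str(i),) for i in range(25)]
)


def run_limited(conn, user_sql, limit=10):
    """Wrap arbitrary SQL in an outer select so at most `limit` rows come back."""
    # user_sql is interpolated as-is, so it must be a single select statement.
    wrapped = "select * from ({}) limit {}".format(user_sql, int(limit))
    return conn.execute(wrapped).fetchall()


limited = run_limited(conn, "select * from commits")
```

With this approach the limit is applied by SQLite itself rather than by discarding rows after they have been fetched.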

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
782747743 https://github.com/simonw/datasette/issues/782#issuecomment-782747743 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0Nzc0Mw== simonw 9599 2021-02-20T20:52:10Z 2021-02-20T20:52:10Z OWNER

> Minor suggestion: rename size query param to limit, to better reflect that it’s a maximum number of rows returned rather than a guarantee of getting that number, and also for consistency with the SQL keyword?

The problem there is that ?_size=x isn't actually doing the same thing as the SQL limit keyword. Consider this query:

https://latest-with-plugins.datasette.io/github?sql=select+*+from+commits - select * from commits

Datasette returns 1,000 results, and shows a "Custom SQL query returning more than 1,000 rows" message at the top. That's the size kicking in - I only fetch the first 1,000 results from the cursor to avoid exhausting resources. In the JSON version of that at https://latest-with-plugins.datasette.io/github.json?sql=select+*+from+commits there's a "truncated": true key to let you know what happened.

I find myself using ?_size=2 against Datasette occasionally if I know the rows being returned are really big and I don't want to load 10+MB of HTML.

This is only really a concern for arbitrary SQL queries though - for table pages such as https://latest-with-plugins.datasette.io/github/commits?_size=10 adding ?_size=10 actually puts a limit 10 on the underlying SQL query.
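The behaviour described above for arbitrary SQL (read a bounded number of rows from the cursor and report truncation) could be sketched like this. Fetching one row more than the maximum to detect truncation is an assumption of this example, not a confirmed description of Datasette's internals, and fetch_truncated is a hypothetical name.

```python
import sqlite3


def fetch_truncated(conn, sql, max_rows=1000):
    """Return (rows, truncated) without reading past max_rows + 1 rows."""
    cursor = conn.execute(sql)
    # Ask for one extra row: if it arrives, there was more data than we keep.
    rows = cursor.fetchmany(max_rows + 1)
    truncated = len(rows) > max_rows
    return rows[:max_rows], truncated


conn = sqlite3.connect(":memory:")
conn.execute("create table commits (id integer primary key)")
conn.executemany("insert into commits (id) values (?)", [(i,) for i in range(5)])
rows, truncated = fetch_truncated(conn, "select * from commits", max_rows=3)
```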

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
782747164 https://github.com/simonw/datasette/issues/782#issuecomment-782747164 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0NzE2NA== simonw 9599 2021-02-20T20:47:16Z 2021-02-20T20:47:16Z OWNER

(I started a thread on Twitter about this: https://twitter.com/simonw/status/1363220355318358016)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
782746755 https://github.com/simonw/datasette/issues/782#issuecomment-782746755 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0Njc1NQ== frankieroberto 30665 2021-02-20T20:44:05Z 2021-02-20T20:44:05Z NONE

Minor suggestion: rename size query param to limit, to better reflect that it’s a maximum number of rows returned rather than a guarantee of getting that number, and also for consistency with the SQL keyword?

I like the idea of specifying a limit of 0 if you don’t want any rows data - and returning an empty array under the rows key seems fine.

Have you given any thought as to whether to pretty print (format with spaces) the output or not? Can be useful for debugging/exploring in a browser or other basic tools which don’t parse the JSON. Could be default (can’t be much bigger with gzip?) or opt-in.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
782746633 https://github.com/simonw/datasette/issues/782#issuecomment-782746633 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0NjYzMw== simonw 9599 2021-02-20T20:43:07Z 2021-02-20T20:43:07Z OWNER

Another option: .json always returns an object with a list of keys that gets increased through adding ?_extra= parameters.

.jsona always returns a JSON array of objects

I had something similar to this in Datasette a few years ago - a .jsono extension, which still redirects to the shape=array version.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
782745199 https://github.com/simonw/datasette/issues/782#issuecomment-782745199 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0NTE5OQ== frankieroberto 30665 2021-02-20T20:32:03Z 2021-02-20T20:32:03Z NONE

I think it’s a good idea if the top level item of the response JSON is always an object, rather than an array, at least as the default. Mainly because it allows you to add extra keys in a backwards-compatible way. Also just seems more expected somehow.

The API design guidance for the UK government also recommends this: https://www.gov.uk/guidance/gds-api-technical-and-data-standards#use-json

I also strongly dislike having versioned APIs (eg with a /v1/ path prefix), as it invariably means that old versions stop working at some point, even though the bit of the API you’re using might not have changed at all.

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 1
}
Redesign default .json format 627794879  
782742233 https://github.com/simonw/datasette/issues/782#issuecomment-782742233 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0MjIzMw== simonw 9599 2021-02-20T20:09:16Z 2021-02-20T20:09:16Z OWNER

I just noticed that https://latest-with-plugins.datasette.io/github/commits.json-preview?_extra=total&_size=0&_trace=1 executes 35 SQL queries at the moment! A great reminder that a big improvement from this change will be a reduction in queries through not calculating things like suggested facets unless they are explicitly requested.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
782741719 https://github.com/simonw/datasette/issues/782#issuecomment-782741719 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0MTcxOQ== simonw 9599 2021-02-20T20:05:04Z 2021-02-20T20:05:04Z OWNER

The only advantage of headers is that you don’t need to do .rows, but that’s actually good as a data validation step anyway—if .rows is missing assume there’s an error and do your error handling path instead of parsing the rest.

This is something I've not thought very hard about. If there's an error, I need to return a top-level object, not a top-level array, so I can provide details of the error.

But this means that client code will have to handle this difference - it will have to know that the returned data can be array-shaped if nothing went wrong, and object-shaped if there's an error.

The HTTP status code helps here - calling client code can know that a 200 status code means there will be an array, but an error status code means an object.

If developers really hate that the shape could be different, they can always use ?_extra=next to ensure that the top level item is an object whether or not an error occurred. So I think this is OK.
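Client code coping with the two possible shapes might look roughly like the sketch below. It assumes, as described above, that a 200 response carries either a bare array or an object with a "rows" key, and that any other status carries an object with error details; rows_from_response is a hypothetical helper.

```python
def rows_from_response(status_code, payload):
    """Return the list of rows, or raise if the response describes an error."""
    if status_code != 200:
        # Error responses are assumed to be objects carrying error details.
        raise RuntimeError(
            "Request failed with status {}: {!r}".format(status_code, payload)
        )
    if isinstance(payload, list):
        # Bare array: the payload is the rows.
        return payload
    # Object shape (for example when ?_extra= was used): rows live under "rows".
    return payload["rows"]
```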

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
782741107 https://github.com/simonw/datasette/issues/782#issuecomment-782741107 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0MTEwNw== simonw 9599 2021-02-20T20:00:22Z 2021-02-20T20:00:22Z OWNER

A really exciting opportunity this opens up is for parallel execution - the facets() and suggested_facets() and total() async functions could be called in parallel, which could speed things up if I'm confident the SQLite thread pool can execute on multiple CPU cores (it should be able to because the Python sqlite3 module releases the GIL while it's executing C code).
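Calling the three async functions concurrently could be sketched with asyncio.gather. The function bodies below are placeholders standing in for real queries, and their return values are invented for illustration; any actual speed-up would depend on the queries running in a thread pool as discussed above.

```python
import asyncio


async def facets():
    await asyncio.sleep(0.01)  # stand-in for running facet queries
    return {"state": ["CA"]}


async def suggested_facets():
    await asyncio.sleep(0.01)  # stand-in for the suggestion queries
    return ["city"]


async def total():
    await asyncio.sleep(0.01)  # stand-in for select count(*)
    return 4


async def gather_extras():
    # gather() schedules all three coroutines concurrently and returns
    # their results in the order they were passed in.
    names = ["facets", "suggested_facets", "total"]
    results = await asyncio.gather(facets(), suggested_facets(), total())
    return dict(zip(names, results))


extras = asyncio.run(gather_extras())
```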

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
782740985 https://github.com/simonw/datasette/issues/782#issuecomment-782740985 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0MDk4NQ== simonw 9599 2021-02-20T19:59:21Z 2021-02-20T19:59:21Z OWNER

This design should be influenced by how it's implemented.

One implementation that could be nice is that each of the keys that can be requested - next_url, total etc - maps to an async def function which can do the work. So that expensive count(*) will only be executed by the async def total function if it is requested.

This raises more questions: Both next and next_url work off the same underlying data, so if they are both requested can we re-use the work that next does somehow? Maybe by letting these functions depend on each other (so next_url() knows to first call next(), but only if it hasn't been called already).

I think I need to flesh out the full default collection of ?_extra= parameters in order to design how they will work under the hood.
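One way the "async function per key, with dependencies" idea could work is sketched below. Everything here is hypothetical: the Extras class, the extra_ method prefix and the placeholder return values are assumptions made for the example, not the eventual implementation.

```python
import asyncio


class Extras:
    """Resolve requested keys lazily, running each async function at most once."""

    def __init__(self):
        self._cache = {}
        self.calls = []  # records execution order, for illustration only

    async def get(self, name):
        if name not in self._cache:
            self.calls.append(name)
            self._cache[name] = await getattr(self, "extra_" + name)()
        return self._cache[name]

    async def extra_next(self):
        return "a,a,b"  # placeholder pagination token

    async def extra_next_url(self):
        # Depends on next; get() reuses the cached value if already computed.
        token = await self.get("next")
        return None if token is None else "/db/table.json?_next=" + token

    async def extra_total(self):
        return 1001  # placeholder for an expensive count(*)


async def resolve(requested):
    extras = Extras()
    result = {name: await extras.get(name) for name in requested}
    return result, extras.calls


result, calls = asyncio.run(resolve(["next", "next_url"]))
```

In this sketch next runs only once even though next_url also needs it, and total never runs because it was not requested.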

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
782740604 https://github.com/simonw/datasette/issues/782#issuecomment-782740604 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0MDYwNA== simonw 9599 2021-02-20T19:56:21Z 2021-02-20T19:56:33Z OWNER

I think I want to support ?_extra=next_url,total in addition to ?_extra=next_url&_extra=total - partly because it's fewer characters to type, and also because I know there exist URL handling libraries that don't know how to handle the same parameter multiple times (though they're going to break against Datasette already, so it's not a big deal).
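Accepting both spellings could be handled along these lines. This is a minimal sketch using the standard library; requested_extras is a hypothetical helper name.

```python
from urllib.parse import parse_qs


def requested_extras(query_string):
    """Return the requested extras in order, without duplicates."""
    extras = []
    # parse_qs collects repeated parameters into a list of values.
    for value in parse_qs(query_string).get("_extra", []):
        # Each value may itself be a comma-separated list.
        for name in value.split(","):
            name = name.strip()
            if name and name not in extras:
                extras.append(name)
    return extras
```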

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
782740488 https://github.com/simonw/datasette/issues/782#issuecomment-782740488 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4Mjc0MDQ4OA== simonw 9599 2021-02-20T19:55:23Z 2021-02-20T19:55:23Z OWNER

Am I saying you won't get back a key in the response unless you explicitly request it, either by name or by specifying a bundle of extras (e.g. all or paginated)?

The "truncated": true key that tells you that your arbitrary query returned more than X results but was truncated is pretty important, do I really want people to have to opt-in to that one?

Also: having bundles like all or paginated live in the same namespace as single keys like next_url or total is a little odd - you can't tell by looking at them if they'll add a key called all or if they'll add a bunch of other stuff.

Maybe bundles could be prefixed with something, perhaps an underscore? ?_extra=_all and ?_extra=_paginated for example.
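Underscore-prefixed bundles could be expanded into individual keys before any work is done. The sketch below is hypothetical throughout: the BUNDLES mapping and its members are illustrative, with _paginated following the total plus next_url alias floated elsewhere in this thread.

```python
# Hypothetical bundle definitions; the members are illustrative only.
BUNDLES = {
    "_paginated": ["total", "next_url"],
    "_all": ["total", "next_url", "truncated"],
}


def expand_extras(names):
    """Replace underscore-prefixed bundle names with their member keys."""
    expanded = []
    for name in names:
        if name.startswith("_"):
            if name not in BUNDLES:
                raise ValueError("Unknown bundle: " + name)
            members = BUNDLES[name]
        else:
            members = [name]
        for member in members:
            if member not in expanded:
                expanded.append(member)
    return expanded
```

The prefix makes it possible to tell a bundle from a single key by name alone, which is the ambiguity raised above.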

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
782739926 https://github.com/simonw/datasette/issues/782#issuecomment-782739926 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4MjczOTkyNg== simonw 9599 2021-02-20T19:51:30Z 2021-02-20T19:52:19Z OWNER

Demos:

  • https://latest-with-plugins.datasette.io/github/commits.json-preview
  • https://latest-with-plugins.datasette.io/github/commits.json-preview?_extra=next_url
  • https://latest-with-plugins.datasette.io/github/commits.json-preview?_extra=total
  • https://latest-with-plugins.datasette.io/github/commits.json-preview?_extra=next_url&_extra=total
  • https://latest-with-plugins.datasette.io/github/commits.json-preview?_extra=total&_size=0
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
782709425 https://github.com/simonw/datasette/issues/782#issuecomment-782709425 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4MjcwOTQyNQ== simonw 9599 2021-02-20T16:24:54Z 2021-02-20T16:24:54Z OWNER

Having shortcuts means I could support ?_extra=all for returning ALL possible keys.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
782709270 https://github.com/simonw/datasette/issues/782#issuecomment-782709270 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4MjcwOTI3MA== simonw 9599 2021-02-20T16:23:51Z 2021-02-20T16:24:11Z OWNER

Also how would you opt out of returning the "rows" key? I sometimes want to do this - if I want to get back just the count or just the facets for example.

Some options:

  • /fixtures/roadside_attractions.json?_extra=total&_extra=-rows
  • /fixtures/roadside_attractions.json?_extra=total&_skip=rows
  • /fixtures/roadside_attractions.json?_extra=total&_size=0

I quite like that last one with ?_size=0. I think it would still return "rows": [] but that's OK.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
782708938 https://github.com/simonw/datasette/issues/782#issuecomment-782708938 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc4MjcwODkzOA== simonw 9599 2021-02-20T16:22:14Z 2021-02-20T16:22:14Z OWNER

I'm leaning back in the direction of a flat JSON array of objects as the default - this:

/fixtures/roadside_attractions.json

Would return:

    [
        {"pk": 1, "name": "The Mystery Spot", "address": "465 Mystery Spot Road, Santa Cruz, CA 95065", "latitude": 37.0167, "longitude": -122.0024},
        {"pk": 2, "name": "Winchester Mystery House", "address": "525 South Winchester Boulevard, San Jose, CA 95128", "latitude": 37.3184, "longitude": -121.9511},
        {"pk": 3, "name": "Burlingame Museum of PEZ Memorabilia", "address": "214 California Drive, Burlingame, CA 94010", "latitude": 37.5793, "longitude": -122.3442},
        {"pk": 4, "name": "Bigfoot Discovery Museum", "address": "5497 Highway 9, Felton, CA 95018", "latitude": 37.0414, "longitude": -122.0725}
    ]

To get the version that includes pagination information you would use the ?_extra= parameter. For example:

/fixtures/roadside_attractions.json?_extra=total&_extra=next_url

    {
        "rows": [
            {"pk": 1, "name": "The Mystery Spot", "address": "465 Mystery Spot Road, Santa Cruz, CA 95065", "latitude": 37.0167, "longitude": -122.0024},
            {"pk": 2, "name": "Winchester Mystery House", "address": "525 South Winchester Boulevard, San Jose, CA 95128", "latitude": 37.3184, "longitude": -121.9511},
            {"pk": 3, "name": "Burlingame Museum of PEZ Memorabilia", "address": "214 California Drive, Burlingame, CA 94010", "latitude": 37.5793, "longitude": -122.3442},
            {"pk": 4, "name": "Bigfoot Discovery Museum", "address": "5497 Highway 9, Felton, CA 95018", "latitude": 37.0414, "longitude": -122.0725}
        ],
        "total": 4,
        "next_url": null
    }

ANY usage of the ?_extra= parameter would turn the list into an object with a "rows" key.

Opting in to the total is nice because it's actually expensive to run a count, so only doing a count if the user requests it feels good.

But... having to add ?_extra=total&_extra=next_url for the common case of wanting both the total count and the URL to get the next page of results is a bit verbose. So maybe support aliases, like ?_extra=paginated which is a shortcut for ?_extra=total&_extra=next_url?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
755484384 https://github.com/simonw/datasette/issues/782#issuecomment-755484384 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc1NTQ4NDM4NA== simonw 9599 2021-01-06T18:31:14Z 2021-01-06T18:31:57Z OWNER

In building https://latest-with-plugins.datasette.io/github/issue_comments.Notebook?_labels=on I discovered the following patterns for importing data into both Pandas and Observable/d3:

    import pandas
    df = pandas.read_json(
        "https://latest-with-plugins.datasette.io/github/issue_comments.json?_shape=array"
    )

And:

    d3 = require("d3@5")
    rows = d3.json(
        "https://latest-with-plugins.datasette.io/github/issue_comments.json?_shape=array"
    )

Once again I find myself torn on the best possible default. A list of JSON objects is instantly compatible with both pandas.read_json() and d3.json() - but it leaves nowhere to put the extra information like pagination and suchlike!

Even given this I still think the correct default is an object with "rows", "total" and "next_url" keys. I should commit to that and implement it - this thought exercise has been running for far too long.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
754958610 https://github.com/simonw/datasette/issues/782#issuecomment-754958610 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDc1NDk1ODYxMA== simonw 9599 2021-01-05T23:15:24Z 2021-01-05T23:15:24Z OWNER

https://latest-with-plugins.datasette.io/fixtures/roadside_attraction_characteristics/1.json-preview returns a 500 error at the moment - a KeyError on 'filtered_table_rows_count'.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
720028476 https://github.com/simonw/datasette/issues/782#issuecomment-720028476 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDcyMDAyODQ3Ng== simonw 9599 2020-11-01T05:00:05Z 2020-11-01T05:00:05Z OWNER

This should be the key focus for Datasette 0.52.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
712986115 https://github.com/simonw/datasette/issues/782#issuecomment-712986115 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDcxMjk4NjExNQ== simonw 9599 2020-10-20T16:28:46Z 2020-10-20T16:29:51Z OWNER

I think this all comes down to how the ?_extras= mechanism works (see #262), as first hinted at in a30c5b220c15360d575e94b0e67f3255e120b916 (see commit message) when I added this long-forgotten undocumented feature: https://latest.datasette.io/fixtures/attraction_characteristic/2.json?_extras=foreign_key_tables

Extras need to be able to execute additional SQL, since that would solve the problem we have now where the expensive "suggested facets" code runs on all .json output even when its results are not being shown.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
712590398 https://github.com/simonw/datasette/issues/782#issuecomment-712590398 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDcxMjU5MDM5OA== simonw 9599 2020-10-20T05:03:46Z 2020-10-20T05:04:09Z OWNER

OK, https://latest-with-plugins.datasette.io/ is running that now - e.g. https://latest-with-plugins.datasette.io/fixtures/roadside_attractions.json-preview or https://latest-with-plugins.datasette.io/fixtures/compound_three_primary_keys.json-preview

json { "rows": [ { "pk": 1, "name": "The Mystery Spot", "address": "465 Mystery Spot Road, Santa Cruz, CA 95065", "latitude": 37.0167, "longitude": -122.0024 }, { "pk": 2, "name": "Winchester Mystery House", "address": "525 South Winchester Boulevard, San Jose, CA 95128", "latitude": 37.3184, "longitude": -121.9511 }, { "pk": 3, "name": "Burlingame Museum of PEZ Memorabilia", "address": "214 California Drive, Burlingame, CA 94010", "latitude": 37.5793, "longitude": -122.3442 }, { "pk": 4, "name": "Bigfoot Discovery Museum", "address": "5497 Highway 9, Felton, CA 95018", "latitude": 37.0414, "longitude": -122.0725 } ], "total": 4, "next_url": null }

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
712585921 https://github.com/simonw/datasette/issues/782#issuecomment-712585921 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDcxMjU4NTkyMQ== simonw 9599 2020-10-20T04:48:01Z 2020-10-20T04:48:01Z OWNER

I'll update datasette-json-preview with that now.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
712585687 https://github.com/simonw/datasette/issues/782#issuecomment-712585687 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDcxMjU4NTY4Nw== simonw 9599 2020-10-20T04:47:02Z 2020-10-20T04:47:12Z OWNER

Great point about CORS, I hadn't considered that.

I think I'm going to keep the Link: header (added in #1014) because I quite enjoy using it with GitHub and WordPress, but I'm not going to have it be the default way of doing pagination. For the default shape I'm now leaning towards this:

json { "total": 36, "rows": [{"id": 1, "name": "Cleo"}], "next_url": "https://latest-with-plugins.datasette.io/fixtures/facetable.json?_next=5" }

So three keys: total, rows and next_url. Then extra keys can be added using ?_extra= with various named bundles.
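As a rough sketch of how a client might consume that three-key shape, the loop below keeps following next_url until it comes back as null. The `fetch` callable and the fake pages are assumptions made for illustration (a real client would make HTTP requests); only the key names total, rows and next_url come from the proposal above.

```python
from typing import Callable, Iterator, Optional


def iter_rows(first_url: str, fetch: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every row, following next_url until it is null/None."""
    url: Optional[str] = first_url
    while url:
        page = fetch(url)
        yield from page["rows"]
        url = page.get("next_url")


# Stand-in for an HTTP client: two fake pages keyed by URL (hypothetical data).
FAKE_PAGES = {
    "/fixtures/facetable.json": {
        "total": 3,
        "rows": [{"id": 1, "name": "Cleo"}, {"id": 2, "name": "Pancakes"}],
        "next_url": "/fixtures/facetable.json?_next=2",
    },
    "/fixtures/facetable.json?_next=2": {
        "total": 3,
        "rows": [{"id": 3, "name": "Fluffy"}],
        "next_url": None,
    },
}

all_rows = list(iter_rows("/fixtures/facetable.json", FAKE_PAGES.__getitem__))
```

Because the next page is a complete URL, the client never has to know how the pagination token is encoded.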

{
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
712569695 https://github.com/simonw/datasette/issues/782#issuecomment-712569695 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDcxMjU2OTY5NQ== carlmjohnson 222245 2020-10-20T03:45:48Z 2020-10-20T03:46:14Z NONE

I vote against headers. They have a lot of strikes against them: poor discoverability, new developers often don’t know how to use them, they make CORS harder, they are hard to use with tools such as jq, they need an ad hoc specification for each bit of metadata, etc.

The only advantage of headers is that you don’t need to do .rows, but that’s actually good as a data validation step anyway—if .rows is missing assume there’s an error and do your error handling path instead of parsing the rest.
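A minimal sketch of that validation step, assuming the response body is an object with a rows key on success. The function name `parse_rows` and the error message are made up for this example.

```python
import json


def parse_rows(body: str) -> list:
    """Return the rows from a response body, or raise if .rows is missing."""
    data = json.loads(body)
    if not isinstance(data, dict) or "rows" not in data:
        # No rows key: treat the whole response as an error payload.
        raise ValueError("response has no 'rows' key: {!r}".format(data))
    return data["rows"]


rows = parse_rows('{"rows": [{"id": 1, "name": "Cleo"}], "next_url": null}')
```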

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
706745236 https://github.com/simonw/datasette/issues/782#issuecomment-706745236 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDcwNjc0NTIzNg== simonw 9599 2020-10-11T18:16:05Z 2020-10-11T18:16:05Z OWNER

Here's the datasette-json-preview plugin I'll be using to experiment with different formats: https://github.com/simonw/datasette-json-preview

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
706740250 https://github.com/simonw/datasette/issues/782#issuecomment-706740250 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDcwNjc0MDI1MA== simonw 9599 2020-10-11T17:40:48Z 2020-10-11T17:43:07Z OWNER

Building this plugin reminded me of an oddity of the register_output_renderer() plugin hook: one of the arguments that can be passed to it is data, which is the default internal data structure created by Datasette - but I deliberately avoided documenting that on https://docs.datasette.io/en/stable/plugin_hooks.html#register-output-renderer-datasette because it's not a stable interface.

That's not ideal. I'd like custom renderers to be able to access this data to get at things like suggested facets, on an opt-in basis.

So maybe that kind of stuff is re-implemented as "extras" which are awaitable callables - then renderer plugins can call the extras that they need to as part of their execution.

To illustrate the problem (in this case the need to access next_url) here's my first prototype of the plugin:

```python
from datasette import hookimpl
from datasette.utils.asgi import Response


@hookimpl
def register_output_renderer(datasette):
    return {
        "extension": "json-preview",
        "render": json_preview,
    }


def json_preview(data, columns, rows):
    next_url = data.get("next_url")
    headers = {}
    if next_url:
        headers["link"] = '<{}>; rel="next"'.format(next_url)
    return Response.json([dict(zip(columns, row)) for row in rows], headers=headers)
```
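Here is one way the "extras as awaitable callables" idea could look, written as plain asyncio code and not against any real Datasette API. The `EXTRAS` registry, the `context` dict and both extra functions are hypothetical. The point is that a renderer awaits only the extras it needs, so the expensive one never runs.

```python
import asyncio

calls = []  # records which extras actually ran


async def extra_next_url(context):
    # Cheap extra: build the next page URL from the pagination token.
    calls.append("next_url")
    token = context.get("next")
    return "{}?_next={}".format(context["path"], token) if token else None


async def extra_suggested_facets(context):
    # Stand-in for the expensive suggested facets calculation.
    calls.append("suggested_facets")
    return ["state", "city"]


# Hypothetical registry mapping extra names to awaitable callables.
EXTRAS = {
    "next_url": extra_next_url,
    "suggested_facets": extra_suggested_facets,
}


async def render_with_link_header(rows, context):
    # This renderer only needs next_url, so it awaits only that extra.
    next_url = await EXTRAS["next_url"](context)
    headers = {}
    if next_url:
        headers["link"] = '<{}>; rel="next"'.format(next_url)
    return {"body": rows, "headers": headers}


context = {"path": "/fixtures/facetable.json-preview", "next": "5"}
result = asyncio.run(render_with_link_header([{"id": 1}], context))
```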

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
706738020 https://github.com/simonw/datasette/issues/782#issuecomment-706738020 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDcwNjczODAyMA== simonw 9599 2020-10-11T17:23:18Z 2020-10-11T17:23:48Z OWNER

I'm going to prototype what it would look like if the default shape was a list of objects and ?_extra= turns that into an object with a rows key, in a plugin. As a separate extension (maybe .json-preview).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
706735341 https://github.com/simonw/datasette/issues/782#issuecomment-706735341 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDcwNjczNTM0MQ== simonw 9599 2020-10-11T17:03:29Z 2020-10-11T17:15:34Z OWNER

Maybe .jsonfull becomes a new renderer that returns ALL of the defined ?_extra= blocks.

Or... ?_extra=all turns on ALL of the available information blocks (some of which can come from plugins).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
706735200 https://github.com/simonw/datasette/issues/782#issuecomment-706735200 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDcwNjczNTIwMA== simonw 9599 2020-10-11T17:02:11Z 2020-10-11T17:14:51Z OWNER

Since the total count can be expensive to calculate, I'm inclined to make that an opt-in extra - maybe ?_extra=count.

Based on that, the default JSON shape could look something like this:

json { "rows": [{"id": 1}, {"id": 2}], "next": "2", "next_url": "/db/table?_next=2" } And with ?_extra=count: json { "rows": [{"id": 1}, {"id": 2}], "next": "2", "next_url": "/db/table?_next=2", "count": 31 }
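To make the opt-in count concrete, here is a small sqlite3 sketch that produces both shapes shown above. It is not how Datasette implements this: the table `t`, the function `table_page` and the keyset pagination on an integer id are all assumptions for the example. The count(*) query only runs when "count" is requested.

```python
import sqlite3


def table_page(conn, size=2, after=None, extras=frozenset()):
    """Return one page of rows, adding "count" only when asked for."""
    where, params = "", []
    if after is not None:
        where = "where id > ?"
        params.append(after)
    # Fetch one extra row to find out whether there is a next page.
    params.append(size + 1)
    sql = "select id from t {} order by id limit ?".format(where)
    rows = [{"id": r[0]} for r in conn.execute(sql, params).fetchall()]
    has_more = len(rows) > size
    rows = rows[:size]
    next_token = str(rows[-1]["id"]) if has_more else None
    response = {
        "rows": rows,
        "next": next_token,
        "next_url": "/db/table?_next={}".format(next_token) if next_token else None,
    }
    if "count" in extras:
        # The potentially expensive query, executed only on request.
        response["count"] = conn.execute("select count(*) from t").fetchone()[0]
    return response


conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer primary key)")
conn.executemany("insert into t (id) values (?)", [(i,) for i in range(1, 32)])

default_page = table_page(conn)
counted_page = table_page(conn, extras={"count"})
```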

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
706736541 https://github.com/simonw/datasette/issues/782#issuecomment-706736541 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDcwNjczNjU0MQ== simonw 9599 2020-10-11T17:12:27Z 2020-10-11T17:12:27Z OWNER

The core issue that I keep reconsidering is whether the default .json representation should be an object or a list.

Arguments in favour of a list:

  • It's what I always want. Almost all of the code that I've written against the API myself uses ?_shape=array.
  • It's really easy to use. You can pipe it to e.g. sqlite-utils insert, you can load it into JavaScript without thinking about it.

Arguments against:

  • Nowhere to put pagination or total counts. I added pagination to the link: HTTP header in #1014 (inspired by the WordPress and GitHub APIs) but I haven't solved for total count, and there's other stuff that's useful like "truncated": true to indicate that more than 1000 results were returned and they were truncated.
  • An array is inherently non-extensible: if the root item is an object it's easy to add new features to it in a backwards-compatible way in the future. An array is a fixed format.

But maybe that last point is a positive? It ensures the default .json format remains completely predictable forever.

If .json DID default to an array of objects, the ?_shape= argument could still be used to get back alternative formats.

Maybe .json?_extra=total changes the shape of that default to be this instead:

json { "rows": [{"id": 1}, {"id": 2}], "total": 104 }

The thing I care about most though is next_url. That could be provided like so:

.json?_extra=total&_extra=next - alternative syntax .json?_extra=total,next:

json { "rows": [{"id": 1}, {"id": 2}], "total": 104, "next": "2", "next_url": "/db/table.json?_extra=total&_extra=next&_next=2" } This is feeling a bit verbose for a common combination though.
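Supporting both spellings of the parameter is cheap. This sketch, using only the standard library, treats repeated ?_extra= values and comma-separated values the same way. The helper name `requested_extras` is made up for the example.

```python
from urllib.parse import parse_qs, urlsplit


def requested_extras(url: str) -> set:
    """Collect extras from ?_extra=a&_extra=b and from ?_extra=a,b."""
    extras = set()
    for value in parse_qs(urlsplit(url).query).get("_extra", []):
        extras.update(part for part in value.split(",") if part)
    return extras


repeated = requested_extras("/db/table.json?_extra=total&_extra=next")
comma_separated = requested_extras("/db/table.json?_extra=total,next")
```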

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
706735280 https://github.com/simonw/datasette/issues/782#issuecomment-706735280 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDcwNjczNTI4MA== simonw 9599 2020-10-11T17:03:01Z 2020-10-11T17:03:01Z OWNER

Should that default also include "columns" as a list of strings? That would be duplicate data of the keys in the "rows" list of objects, and I've never found myself wanting it in my own code - so I'm going to say no.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
691554088 https://github.com/simonw/datasette/issues/782#issuecomment-691554088 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDY5MTU1NDA4OA== simonw 9599 2020-09-12T21:39:03Z 2020-09-12T21:39:03Z OWNER

Plan: ship a new release of Datasette (probably 0.49) with the new JSON API design, but provide a plugin called something like datasette-api-0-48 which runs as ASGI wrapping middleware and internally rewrites incoming requests to e.g. /db/table.json so that they behave as if they had the ?_extra= params on them necessary to produce the 0.48 version of the JSON.

Anyone who has built applications against 0.48 can install that plugin.
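A sketch of what that wrapping middleware might look like, written as a bare ASGI wrapper without using Datasette itself. The extras in `LEGACY_EXTRAS` are placeholders, since the set of extras needed to reproduce the 0.48 output has not been decided, and `add_legacy_extras` is a hypothetical name.

```python
import asyncio
from urllib.parse import parse_qsl, urlencode

# Placeholder names: the real list of extras for the 0.48 shape is not known.
LEGACY_EXTRAS = ["count", "columns"]


def add_legacy_extras(app):
    """Wrap an ASGI app so .json requests gain the legacy ?_extra= params."""

    async def wrapped(scope, receive, send):
        if scope["type"] == "http" and scope["path"].endswith(".json"):
            raw = scope.get("query_string", b"").decode("latin-1")
            pairs = parse_qsl(raw, keep_blank_values=True)
            present = {value for key, value in pairs if key == "_extra"}
            pairs += [("_extra", name) for name in LEGACY_EXTRAS if name not in present]
            # Copy the scope instead of mutating the caller's dict.
            scope = dict(scope, query_string=urlencode(pairs).encode("latin-1"))
        await app(scope, receive, send)

    return wrapped


seen = {}


async def inner_app(scope, receive, send):
    # Stand-in for the wrapped application: record what it was given.
    seen["query_string"] = scope["query_string"]


request_scope = {"type": "http", "path": "/db/table.json", "query_string": b"_size=2"}
asyncio.run(add_legacy_extras(inner_app)(request_scope, None, None))
```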

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
691526878 https://github.com/simonw/datasette/issues/782#issuecomment-691526878 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDY5MTUyNjg3OA== simonw 9599 2020-09-12T18:21:41Z 2020-09-12T18:22:20Z OWNER

Would it be so bad if the default format had a "rows" key containing the array of rows? Maybe it wouldn't. The reason I always use ?_shape=array is because I want an array of objects, rather than an array of arrays that I have to match up again with their columns.

A default format that's an object rather than array also gives something for the ?_extra= parameter to add its extras to.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
691526762 https://github.com/simonw/datasette/issues/782#issuecomment-691526762 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDY5MTUyNjc2Mg== simonw 9599 2020-09-12T18:20:19Z 2020-09-12T18:20:19Z OWNER

I'd like to revisit the idea of using ?_extra=x to opt-in to extra blocks of JSON, from #262

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
691526489 https://github.com/simonw/datasette/issues/782#issuecomment-691526489 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDY5MTUyNjQ4OQ== simonw 9599 2020-09-12T18:17:16Z 2020-09-12T18:17:16Z OWNER

(I think I may have been over-thinking the details of this for a couple of years now.)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
691526416 https://github.com/simonw/datasette/issues/782#issuecomment-691526416 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDY5MTUyNjQxNg== simonw 9599 2020-09-12T18:16:36Z 2020-09-12T18:16:36Z OWNER

I'm going to hack together a preview of this in a branch and deploy it somewhere so people can see what I've got planned. Much easier to evaluate a working prototype than static examples.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
691323302 https://github.com/simonw/datasette/issues/782#issuecomment-691323302 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDY5MTMyMzMwMg== simonw 9599 2020-09-11T21:38:27Z 2020-09-11T21:40:04Z OWNER

Another idea: the default output could be the list of dicts: json [ { "pk1": "a", "pk2": "a", "pk3": "a", "content": "a-a-a" }, ... ] BUT... I could include pagination information in the HTTP headers - as seen in the WordPress REST API or the GitHub API:

```
~ % curl -s -i 'https://api.github.com/repos/simonw/datasette/commits' | head -n 40
HTTP/1.1 200 OK
server: GitHub.com
date: Fri, 11 Sep 2020 21:37:46 GMT
content-type: application/json; charset=utf-8
status: 200 OK
cache-control: public, max-age=60, s-maxage=60
vary: Accept, Accept-Encoding, Accept, X-Requested-With
etag: W/"71c99379743513394e880c6306b66bf9"
last-modified: Fri, 11 Sep 2020 21:32:54 GMT
x-github-media-type: github.v3; format=json
link: <https://api.github.com/repositories/107914493/commits?page=2>; rel="next", <https://api.github.com/repositories/107914493/commits?page=44>; rel="last"
access-control-expose-headers: ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Used, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, Deprecation, Sunset
access-control-allow-origin: *
strict-transport-security: max-age=31536000; includeSubdomains; preload
x-frame-options: deny
x-content-type-options: nosniff
x-xss-protection: 1; mode=block
referrer-policy: origin-when-cross-origin, strict-origin-when-cross-origin
content-security-policy: default-src 'none'
X-Ratelimit-Limit: 60
X-Ratelimit-Remaining: 55
X-Ratelimit-Reset: 1599863850
X-Ratelimit-Used: 5
Accept-Ranges: bytes
Content-Length: 118240
X-GitHub-Request-Id: EC76:0EAD:313F40:5291A4:5F5BEE37

[
  {
    "sha": "d02f6151dae073135a22d0123e8abdc6cbef7c50",
    "node_id": "MDY6Q29tbWl0MTA3OTE0NDkzOmQwMmY2MTUxZGFlMDczMTM1YTIyZDAxMjNlOGFiZGM2Y2JlZjdjNTA=",
    "commit": {
```

Alternative shapes would provide the pagination information (and other extensions) in the JSON, e.g.:

/squirrels/squirrels.json?_shape=paginated json { "rows": [ { "pk1": "a", "pk2": "a", "pk3": "a", "content": "a-a-a" } ], "pagination": { "next": "234", "count": 442 } }
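For comparison, this is roughly the extra work a client takes on when pagination lives in the Link header. The parser below is a deliberately small sketch: it assumes the common `<url>; rel="name"` form with quoted rel values and ignores other link parameters. The name `parse_link_header` is made up for the example.

```python
import re

# Matches entries of the form <url>; rel="name" (other parameters are ignored).
LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def parse_link_header(value: str) -> dict:
    """Return a mapping of rel name to URL for a Link header value."""
    return {rel: url for url, rel in LINK_RE.findall(value)}


header = (
    '<https://api.github.com/repositories/107914493/commits?page=2>; rel="next", '
    '<https://api.github.com/repositories/107914493/commits?page=44>; rel="last"'
)
links = parse_link_header(header)
```

With the pagination object in the JSON body, the same information is a plain key lookup.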

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
636370064 https://github.com/simonw/datasette/issues/782#issuecomment-636370064 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDYzNjM3MDA2NA== simonw 9599 2020-05-30T18:51:19Z 2020-05-30T18:51:19Z OWNER

https://latest.datasette.io/fixtures/compound_three_primary_keys.json?_size=2&_shape=array returns this: json [ { "pk1": "a", "pk2": "a", "pk3": "a", "content": "a-a-a" }, { "pk1": "a", "pk2": "a", "pk3": "b", "content": "a-a-b" } ] There's one big problem with this format: it doesn't provide any space for pagination information.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  
636369978 https://github.com/simonw/datasette/issues/782#issuecomment-636369978 https://api.github.com/repos/simonw/datasette/issues/782 MDEyOklzc3VlQ29tbWVudDYzNjM2OTk3OA== simonw 9599 2020-05-30T18:50:31Z 2020-05-30T18:50:31Z OWNER

Here's the default JSON at the moment: https://latest.datasette.io/fixtures/compound_three_primary_keys.json?_size=2

json { "database": "fixtures", "table": "compound_three_primary_keys", "is_view": false, "human_description_en": "", "rows": [ [ "a", "a", "a", "a-a-a" ], [ "a", "a", "b", "a-a-b" ] ], "truncated": false, "filtered_table_rows_count": 1001, "expanded_columns": [], "expandable_columns": [], "columns": [ "pk1", "pk2", "pk3", "content" ], "primary_keys": [ "pk1", "pk2", "pk3" ], "units": {}, "query": { "sql": "select pk1, pk2, pk3, content from compound_three_primary_keys order by pk1, pk2, pk3 limit 3", "params": {} }, "facet_results": {}, "suggested_facets": [ { "name": "pk1", "toggle_url": "http://latest.datasette.io/fixtures/compound_three_primary_keys.json?_size=2&_facet=pk1" }, { "name": "pk2", "toggle_url": "http://latest.datasette.io/fixtures/compound_three_primary_keys.json?_size=2&_facet=pk2" }, { "name": "pk3", "toggle_url": "http://latest.datasette.io/fixtures/compound_three_primary_keys.json?_size=2&_facet=pk3" } ], "next": "a,a,b", "next_url": "http://latest.datasette.io/fixtures/compound_three_primary_keys.json?_size=2&_next=a%2Ca%2Cb", "query_ms": 17.56119728088379, "source": "tests/fixtures.py", "source_url": "https://github.com/simonw/datasette/blob/master/tests/fixtures.py", "license": "Apache License 2.0", "license_url": "https://github.com/simonw/datasette/blob/master/LICENSE" } There's a lot of stuff in there. This increases the risk that future minor changes might break existing API consumers.

It returns rows as a list of lists of values, and expects you to correlate these with the list of columns. I originally designed it like this because I thought this was a more efficient representation than repeating the column names in a dictionary for every row. With hindsight this was a bad optimization - I always use ?_shape=array because it's more convenient, and gzip encoding of the response means there's no bandwidth saving. Users who want that efficiency should request it using a custom ?_shape=.
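This is the correlation step that every consumer of the current default has to write for themselves. It is shown here as a small sketch against a trimmed copy of the response above; the helper name `rows_as_dicts` is made up for the example.

```python
def rows_as_dicts(data: dict) -> list:
    """Pair each row's values with the column names from the same response."""
    return [dict(zip(data["columns"], row)) for row in data["rows"]]


# Trimmed copy of the current default response shown above.
current_default = {
    "rows": [["a", "a", "a", "a-a-a"], ["a", "a", "b", "a-a-b"]],
    "columns": ["pk1", "pk2", "pk3", "content"],
}

as_dicts = rows_as_dicts(current_default)
```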

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Redesign default .json format 627794879  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 1.2ms · About: github-to-sqlite