issue_comments: 474888132

Comment [474888132](https://github.com/simonw/datasette/issues/422#issuecomment-474888132) by user 9599 (author association: OWNER) on [simonw/datasette issue #422](https://api.github.com/repos/simonw/datasette/issues/422), created 2019-03-20T15:34:37Z.

Here's a trick for lower-bound counts that looks like it might actually work. Consider the following queries:

```sql
select count(*) from (
  select rowid from [most-common-name/surnames] limit 1000
)
```

https://fivethirtyeight.datasettes.com/fivethirtyeight-b76415d?sql=select+count%28*%29+from+%28%0D%0A++select+rowid+from+%5Bmost-common-name%2Fsurnames%5D+limit+1000%0D%0A%29

Takes 0.827ms (it took longer with `select * from` in the subquery).

Same query but with limit 10,000:

```sql
select count(*) from (
  select rowid from [most-common-name/surnames] limit 10000
)
```

https://fivethirtyeight.datasettes.com/fivethirtyeight-b76415d?sql=select+count%28*%29+from+%28%0D%0A++select+rowid++from+%5Bmost-common-name%2Fsurnames%5D+limit+10000%0D%0A%29

Took 2.335ms

With 100,000 limit:

```sql
select count(*) from (
  select rowid from [most-common-name/surnames] limit 100000
)
```

https://fivethirtyeight.datasettes.com/fivethirtyeight-b76415d?sql=select+count%28*%29+from+%28%0D%0A++select+rowid++from+%5Bmost-common-name%2Fsurnames%5D+limit+100000%0D%0A%29

Took 27.558ms

So one solution here would be to pick an upper bound (maybe 100,001) and use this query, which should give an accurate count below that upper bound but allow us to show "100,000+" as a count if the table exceeds that boundary.
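A minimal sketch of that bounded-count idea using Python's `sqlite3` module. The `bounded_count` helper and the demo table are hypothetical names for illustration; the key point is counting `limit + 1` rows so we can tell an exact count apart from "hit the boundary":

```python
import sqlite3

COUNT_LIMIT = 100_000  # the upper bound; could be a config setting

def bounded_count(conn, table, limit=COUNT_LIMIT):
    """Count up to limit + 1 rows; return (count, is_exact).

    If more than `limit` rows exist, return (limit, False) so the
    caller can display "100,000+" instead of an exact count.
    """
    sql = f"select count(*) from (select rowid from [{table}] limit {limit + 1})"
    (count,) = conn.execute(sql).fetchone()
    if count > limit:
        return limit, False
    return count, True

# Demo against a small in-memory table:
conn = sqlite3.connect(":memory:")
conn.execute("create table surnames (name text)")
conn.executemany("insert into surnames values (?)", [("Smith",)] * 500)
count, exact = bounded_count(conn, "surnames")
label = f"{count:,}" if exact else f"{count:,}+"
print(label)  # → 500
```

Because the subquery selects only `rowid` and stops at the limit, SQLite never has to scan the whole table once the boundary is reached, which is what keeps the worst case bounded.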

Maybe the boundary is a config setting? Also, if a tighter timeout (maybe 20ms) is exceeded for that boundary we could halve it and try again.
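The halve-and-retry idea could be sketched with `sqlite3`'s progress handler, which lets us interrupt a query that exceeds a time budget (the 20ms budget and the halving strategy come from the comment above; `count_with_time_budget` and the minimum boundary of 1,000 are assumptions for illustration):

```python
import sqlite3
import time

def count_with_time_budget(conn, table, limit=100_000, budget_ms=20):
    """Bounded count with a time budget; on timeout, halve the limit and retry.

    Returns (count, is_exact), or (None, False) if even the smallest
    boundary timed out.
    """
    while limit >= 1_000:  # assumed floor for the boundary
        deadline = time.monotonic() + budget_ms / 1000
        # A progress handler returning nonzero aborts the running query
        # with an "interrupted" OperationalError.
        conn.set_progress_handler(
            lambda: 1 if time.monotonic() > deadline else 0, 1_000
        )
        try:
            sql = f"select count(*) from (select rowid from [{table}] limit {limit + 1})"
            (count,) = conn.execute(sql).fetchone()
            return (limit, False) if count > limit else (count, True)
        except sqlite3.OperationalError:
            limit //= 2  # timed out: halve the boundary and try again
        finally:
            conn.set_progress_handler(None, 0)
    return None, False

# Demo: a tiny table finishes well within the budget.
conn = sqlite3.connect(":memory:")
conn.execute("create table names (n text)")
conn.executemany("insert into names values (?)", [("x",)] * 50)
print(count_with_time_budget(conn, "names"))  # → (50, True)
```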
