issues

2 rows where comments = 31 and user = 9599 sorted by updated_at descending
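
The filter above can be expressed as a plain SQL query. The following is a minimal sketch using Python's `sqlite3` module against a trimmed-down in-memory copy of the `issues` table. The reduced column set, the helper name `matching_issue_numbers`, and the third row are assumptions for illustration; the first two rows use values shown on this page.

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
# Assumption: only the columns needed for this filter are included.
demo_conn.execute(
    "CREATE TABLE issues (id INTEGER PRIMARY KEY, number INTEGER, "
    "title TEXT, user INTEGER, comments INTEGER, updated_at TEXT)"
)
demo_conn.executemany(
    "INSERT INTO issues VALUES (?, ?, ?, ?, ?, ?)",
    [
        (309471814, 189, "Ability to sort (and paginate) by column",
         9599, 31, "2018-04-15T18:54:22Z"),
        (1901416155, 2189,
         "Server hang on parallel execution of queries to named in-memory databases",
         9599, 31, "2023-09-21T22:26:21Z"),
        # Hypothetical row that should NOT match the filter.
        (1, 1, "Hypothetical issue without comments", 9599, 0, "2020-01-01T00:00:00Z"),
    ],
)


def matching_issue_numbers(conn, comments, user):
    """Return issue numbers matching the filter, most recently updated first."""
    sql = (
        "SELECT number FROM issues "
        "WHERE comments = ? AND user = ? "
        "ORDER BY updated_at DESC"
    )
    return [row[0] for row in conn.execute(sql, (comments, user))]


numbers = matching_issue_numbers(demo_conn, 31, 9599)
print(numbers)
```

Sorting on `updated_at` works here because the timestamps are ISO 8601 strings in a single format, so text order matches chronological order.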

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association pull_request body repo type active_lock_reason performed_via_github_app reactions draft state_reason
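id: 1901416155
node_id: I_kwDOBm6k_c5xVU7b
number: 2189
title: Server hang on parallel execution of queries to named in-memory databases
user: simonw 9599
state: closed
locked: 0
comments: 31
created_at: 2023-09-18T17:23:18Z
updated_at: 2023-09-21T22:26:21Z
closed_at: 2023-09-21T22:26:21Z
author_association: OWNER
body: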

I've started to encounter a bug where queries to tables inside named in-memory databases sometimes trigger server hangs.

I'm still trying to figure out what's going on here - on one occasion I managed to Ctrl+C the server and saw an exception that mentioned a thread lock, but usually hitting Ctrl+C does nothing and I have to kill -9 the PID instead.

This is all running on my M2 Mac.

I've seen the bug in the Datasette 1.0 alphas and in Datasette 0.64.3 - but reverting to 0.61 appeared to fix it.
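
For context, here is a minimal sketch of what parallel queries against a named in-memory database can look like with Python's `sqlite3` module. It does not reproduce the hang described above. The shared-cache URI form, the database name `demo_named_memory`, and the table `t` are all assumptions for illustration, not details taken from the issue.

```python
import sqlite3
import threading

# Assumption: a named in-memory database is opened through a shared-cache URI,
# so several connections in one process see the same data.
DB_URI = "file:demo_named_memory?mode=memory&cache=shared"


def connect():
    # Each thread opens its own connection, so check_same_thread stays at its default.
    return sqlite3.connect(DB_URI, uri=True)


# Keep one connection open for the whole demo: the in-memory database
# is discarded once its last connection is closed.
keeper = connect()
keeper.execute("CREATE TABLE IF NOT EXISTS t (n INTEGER)")
keeper.execute("DELETE FROM t")
keeper.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(100)])
keeper.commit()

results = []
results_lock = threading.Lock()


def worker():
    conn = connect()
    try:
        total = conn.execute("SELECT sum(n) FROM t").fetchone()[0]
    finally:
        conn.close()
    with results_lock:
        results.append(total)


threads = [threading.Thread(target=worker) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    # A join timeout keeps the demo itself from blocking forever.
    thread.join(timeout=10)

print(results)
```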

repo: datasette 107914493
type: issue
reactions:
{
    "url": "https://api.github.com/repos/simonw/datasette/issues/2189/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
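id: 309471814
node_id: MDU6SXNzdWUzMDk0NzE4MTQ=
number: 189
title: Ability to sort (and paginate) by column
user: simonw 9599
state: closed
locked: 0
assignee: simonw 9599
comments: 31
created_at: 2018-03-28T18:04:51Z
updated_at: 2018-04-15T18:54:22Z
closed_at: 2018-04-09T05:16:02Z
author_association: OWNER
body: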

As requested in https://github.com/simonw/datasette/issues/185#issuecomment-376614973

I've previously avoided this for performance reasons: sort-by-column on a column without an index is likely to perform badly for hundreds of thousands of rows.

That's not a good enough reason to avoid the feature entirely though. A few options:

  • Allow sort-by-column by default, give users the option to disable it for specific tables/columns
  • Disallow sort-by-column by default, give users the option (probably in metadata.json) to enable it for specific tables/columns
  • Automatically detect if a column either has an index on it OR a table has fewer than X rows in it
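
The third option could be sketched roughly as follows, using SQLite's `PRAGMA index_list` and `PRAGMA index_info`. This is an illustrative sketch, not the implementation that was adopted: the function names, the `max_rows` parameter (standing in for "X"), and the `demo` table are hypothetical, and columns that are an `INTEGER PRIMARY KEY` (already cheap to sort) are not special-cased.

```python
import sqlite3


def quote_identifier(name):
    """Quote a SQLite identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def leading_indexed_columns(conn, table):
    """Return columns that are the first column of at least one index."""
    columns = set()
    index_rows = conn.execute(
        f"PRAGMA index_list({quote_identifier(table)})"
    ).fetchall()
    for index_row in index_rows:
        index_name = index_row[1]
        info_rows = conn.execute(
            f"PRAGMA index_info({quote_identifier(index_name)})"
        ).fetchall()
        for seqno, _cid, column_name in info_rows:
            # column_name is None for expression indexes.
            if seqno == 0 and column_name is not None:
                columns.add(column_name)
    return columns


def is_sortable(conn, table, column, max_rows):
    """True if the column is indexed or the table has fewer than max_rows rows."""
    if column in leading_indexed_columns(conn, table):
        return True
    # Count at most max_rows rows so the check stays cheap on large tables.
    sql = f"SELECT count(*) FROM (SELECT 1 FROM {quote_identifier(table)} LIMIT ?)"
    (count,) = conn.execute(sql, (max_rows,)).fetchone()
    return count < max_rows


sort_conn = sqlite3.connect(":memory:")
sort_conn.execute("CREATE TABLE demo (a INTEGER, b INTEGER)")
sort_conn.execute("CREATE INDEX idx_demo_a ON demo (a)")
sort_conn.executemany("INSERT INTO demo VALUES (?, ?)", [(i, i) for i in range(5)])

print(is_sortable(sort_conn, "demo", "a", max_rows=3))
print(is_sortable(sort_conn, "demo", "b", max_rows=3))
print(is_sortable(sort_conn, "demo", "b", max_rows=100))
```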

We already have the mechanism in place to cut off SQL queries that take more than X seconds, so if someone DOES try to sort by a column that's too expensive it won't actually hurt anything - but it would be nice to not show people a "sort" option which is guaranteed to throw a timeout error.
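
One way such a cut-off can be built on top of `sqlite3` is a progress handler that aborts the statement once a deadline has passed. The sketch below assumes that approach. The helper name `execute_with_time_limit` and the parameter `limit_seconds` are hypothetical, and nothing here is claimed to match the existing mechanism line for line.

```python
import sqlite3
import time


def execute_with_time_limit(conn, sql, params=(), limit_seconds=1.0):
    """Run a query, aborting it if it runs longer than limit_seconds."""
    deadline = time.monotonic() + limit_seconds

    def handler():
        # A non-zero return value tells SQLite to interrupt the statement.
        return 1 if time.monotonic() > deadline else 0

    # Call the handler every 1000 SQLite virtual machine instructions.
    conn.set_progress_handler(handler, 1000)
    try:
        return conn.execute(sql, params).fetchall()
    finally:
        # Passing None removes the handler again.
        conn.set_progress_handler(None, 1000)


limit_conn = sqlite3.connect(":memory:")

fast_rows = execute_with_time_limit(limit_conn, "SELECT 1")

# An unbounded recursive query that would otherwise never finish.
slow_sql = (
    "WITH RECURSIVE counter(n) AS "
    "(SELECT 1 UNION ALL SELECT n + 1 FROM counter) "
    "SELECT count(*) FROM counter"
)
try:
    execute_with_time_limit(limit_conn, slow_sql, limit_seconds=0.05)
    timed_out = False
except sqlite3.OperationalError:
    # SQLite reports the aborted statement as an OperationalError.
    timed_out = True

print(fast_rows, timed_out)
```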

The vast majority of datasette usage that I've seen so far is on smaller datasets where the performance penalties of sort-by-column are extremely unlikely to show up.


Still left to do:

  • [x] UI that shows which sort order is currently being applied (in HTML and in JSON)
  • [x] UI for applying a sort order (with rel=nofollow to avoid Google crawling it)
  • [x] Sort column names should be escaped correctly in generated SQL
  • [x] Validation that the selected sort order is a valid column
  • [x] Throw error if user attempts to apply _sort AND _sort_desc at the same time
  • [x] Ability to disable sorting (or sort only for specific columns) in metadata.json
  • [x] Fix "201 rows where sorted by sortable_with_nulls " bug
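
Several of the checklist items (escaping, validating the column, rejecting `_sort` together with `_sort_desc`, and restricting sortable columns) can be sketched in one small function. This is an illustration only: `build_order_by`, `SortError`, and the `sortable_columns` parameter are hypothetical names, and only the `_sort` and `_sort_desc` parameter names come from the list above.

```python
class SortError(ValueError):
    """Raised when the requested sort order is not allowed."""


def escape_identifier(name):
    """Quote a SQLite identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_order_by(args, table_columns, sortable_columns=None):
    """Build an ORDER BY clause from query string arguments.

    args: mapping of query string parameters, e.g. {"_sort": "title"}
    table_columns: every column that exists in the table
    sortable_columns: optional allow-list; None means all columns are sortable
    """
    sort = args.get("_sort")
    sort_desc = args.get("_sort_desc")
    if sort and sort_desc:
        raise SortError("Cannot use _sort and _sort_desc at the same time")
    column = sort or sort_desc
    if not column:
        return ""
    if column not in table_columns:
        raise SortError(f"Cannot sort table by {column}")
    if sortable_columns is not None and column not in sortable_columns:
        raise SortError(f"Sorting by {column} is disabled")
    clause = "ORDER BY " + escape_identifier(column)
    if sort_desc:
        clause += " DESC"
    return clause


columns = ["id", "title", 'odd"name']
print(build_order_by({"_sort": "title"}, columns))
print(build_order_by({"_sort_desc": 'odd"name'}, columns))
```

Checking the column against the known column list before quoting it means the quoting is a second line of defence, not the only one.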
repo: datasette 107914493
type: issue
reactions:
{
    "url": "https://api.github.com/repos/simonw/datasette/issues/189/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [pull_request] TEXT,
   [body] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT,
   [active_lock_reason] TEXT,
   [performed_via_github_app] TEXT,
   [reactions] TEXT,
   [draft] INTEGER,
   [state_reason] TEXT
);
CREATE INDEX [idx_issues_repo] ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone] ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee] ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);
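
To see how these indexes relate to a filter such as `comments = 31 and user = 9599`, the query plan can be inspected with `EXPLAIN QUERY PLAN`. The sketch below recreates only a subset of the schema in an empty in-memory database (an assumption made for brevity), so the planner's choice on a real, populated database may differ.

```python
import sqlite3

plan_conn = sqlite3.connect(":memory:")
# Assumption: a reduced version of the schema with only the relevant columns.
plan_conn.executescript(
    """
    CREATE TABLE [issues] (
       [id] INTEGER PRIMARY KEY,
       [user] INTEGER REFERENCES [users]([id]),
       [comments] INTEGER,
       [updated_at] TEXT
    );
    CREATE INDEX [idx_issues_user] ON [issues] ([user]);
    """
)

plan_rows = plan_conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM issues WHERE comments = ? AND user = ? "
    "ORDER BY updated_at DESC",
    (31, 9599),
).fetchall()

# The human-readable description is the last column of each plan row.
plan_details = [row[3] for row in plan_rows]
for detail in plan_details:
    print(detail)
```

There is no index on `comments` or `updated_at` in the schema above, so only the `user` condition can be served by an index; the remaining filter and the sort have to be handled on the rows that index returns.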
Powered by Datasette · Queries took 581.695ms · About: github-to-sqlite