github
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/simonw/datasette/issues/485#issuecomment-496039483 | https://api.github.com/repos/simonw/datasette/issues/485 | 496039483 | MDEyOklzc3VlQ29tbWVudDQ5NjAzOTQ4Mw== | 9599 | 2019-05-26T23:22:53Z | 2019-05-26T23:22:53Z | OWNER | Comparing these two SQL queries (the one with union and the one without) using explain: With union: https://latest.datasette.io/fixtures?sql=explain+select+%27name%27+as+column%2C+count+%28distinct+name%29+as+count_distinct%2C+avg%28length%28name%29%29+as+avg_length+from+roadside_attractions%0D%0A++union%0D%0Aselect+%27address%27+as+column%2C+count%28distinct+address%29+as+count_distinct%2C+avg%28length%28address%29%29+as+avg_length+from+roadside_attractions produces 52 rows Without union: https://latest.datasette.io/fixtures?sql=explain+select%0D%0A++count+(distinct+name)+as+count_distinct_column_1%2C%0D%0A++avg(length(name))+as+avg_length_column_1%2C%0D%0A++count(distinct+address)+as+count_distinct_column_2%2C%0D%0A++avg(length(address))+as+avg_length_column_2%0D%0Afrom+roadside_attractions produces 32 rows So I'm going to use the one without the union. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
447469253 | |
https://github.com/simonw/datasette/issues/485#issuecomment-496039267 | https://api.github.com/repos/simonw/datasette/issues/485 | 496039267 | MDEyOklzc3VlQ29tbWVudDQ5NjAzOTI2Nw== | 9599 | 2019-05-26T23:19:38Z | 2019-05-26T23:20:10Z | OWNER | Thinking about that union query: I imagine doing this with union could encourage multiple full table scans. Maybe this query would only do one? https://latest.datasette.io/fixtures?sql=select%0D%0A++count+%28distinct+name%29+as+count_distinct_column_1%2C%0D%0A++avg%28length%28name%29%29+as+avg_length_column_1%2C%0D%0A++count%28distinct+address%29+as+count_distinct_column_2%2C%0D%0A++avg%28length%28address%29%29+as+avg_length_column_2%0D%0Afrom+roadside_attractions ``` select count (distinct name) as count_distinct_column_1, avg(length(name)) as avg_length_column_1, count(distinct address) as count_distinct_column_2, avg(length(address)) as avg_length_column_2 from roadside_attractions ``` <img width="800" alt="fixtures__select_count__distinct_name__as_count_distinct_column_1__avg_length_name___as_avg_length_column_1__count_distinct_address__as_count_distinct_column_2__avg_length_address___as_avg_length_column_2_from_roadside_attractions" src="https://user-images.githubusercontent.com/9599/58388316-201ad580-7fd2-11e9-95c3-c98e2758fc1e.png"> | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
447469253 | |
https://github.com/simonw/datasette/issues/485#issuecomment-495085021 | https://api.github.com/repos/simonw/datasette/issues/485 | 495085021 | MDEyOklzc3VlQ29tbWVudDQ5NTA4NTAyMQ== | 9599 | 2019-05-23T06:27:57Z | 2019-05-26T23:15:51Z | OWNER | I could attempt to calculate the statistics needed for this in a time limited SQL query something like this one: https://latest.datasette.io/fixtures?sql=select+%27name%27+as+column%2C+count+%28distinct+name%29+as+count_distinct%2C+avg%28length%28name%29%29+as+avg_length+from+roadside_attractions%0D%0A++union%0D%0Aselect+%27address%27+as+column%2C+count%28distinct+address%29+as+count_distinct%2C+avg%28length%28address%29%29+as+avg_length+from+roadside_attractions ``` select 'name' as column, count (distinct name) as count_distinct, avg(length(name)) as avg_length from roadside_attractions union select 'address' as column, count(distinct address) as count_distinct, avg(length(address)) as avg_length from roadside_attractions ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
447469253 | |
https://github.com/simonw/datasette/issues/485#issuecomment-496038601 | https://api.github.com/repos/simonw/datasette/issues/485 | 496038601 | MDEyOklzc3VlQ29tbWVudDQ5NjAzODYwMQ== | 9599 | 2019-05-26T23:08:41Z | 2019-05-26T23:08:41Z | OWNER | The code currently assumes the primary key is called "id" or "pk" - improving it to detect the primary key using database introspection should work much better. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
447469253 |