github

html_url	issue_url	id	node_id	user	created_at	updated_at	author_association	body	reactions	issue	performed_via_github_app
https://github.com/simonw/datasette/issues/485#issuecomment-496039483	https://api.github.com/repos/simonw/datasette/issues/485	496039483	MDEyOklzc3VlQ29tbWVudDQ5NjAzOTQ4Mw==	9599	2019-05-26T23:22:53Z	2019-05-26T23:22:53Z	OWNER	Comparing these two SQL queries (the one with union and the one without) using explain: With union: https://latest.datasette.io/fixtures?sql=explain+select+%27name%27+as+column%2C+count+%28distinct+name%29+as+count_distinct%2C+avg%28length%28name%29%29+as+avg_length+from+roadside_attractions%0D%0A++union%0D%0Aselect+%27address%27+as+column%2C+count%28distinct+address%29+as+count_distinct%2C+avg%28length%28address%29%29+as+avg_length+from+roadside_attractions produces 52 rows Without union: https://latest.datasette.io/fixtures?sql=explain+select%0D%0A++count+(distinct+name)+as+count_distinct_column_1%2C%0D%0A++avg(length(name))+as+avg_length_column_1%2C%0D%0A++count(distinct+address)+as+count_distinct_column_2%2C%0D%0A++avg(length(address))+as+avg_length_column_2%0D%0Afrom+roadside_attractions produces 32 rows So I'm going to use the one without the union.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	447469253
https://github.com/simonw/datasette/issues/485#issuecomment-496039267	https://api.github.com/repos/simonw/datasette/issues/485	496039267	MDEyOklzc3VlQ29tbWVudDQ5NjAzOTI2Nw==	9599	2019-05-26T23:19:38Z	2019-05-26T23:20:10Z	OWNER	Thinking about that union query: I imagine doing this with union could encourage multiple full table scans. Maybe this query would only do one? https://latest.datasette.io/fixtures?sql=select%0D%0A++count+%28distinct+name%29+as+count_distinct_column_1%2C%0D%0A++avg%28length%28name%29%29+as+avg_length_column_1%2C%0D%0A++count%28distinct+address%29+as+count_distinct_column_2%2C%0D%0A++avg%28length%28address%29%29+as+avg_length_column_2%0D%0Afrom+roadside_attractions ``` select count (distinct name) as count_distinct_column_1, avg(length(name)) as avg_length_column_1, count(distinct address) as count_distinct_column_2, avg(length(address)) as avg_length_column_2 from roadside_attractions ``` <img width="800" alt="fixtures__select_count__distinct_name__as_count_distinct_column_1__avg_length_name___as_avg_length_column_1__count_distinct_address__as_count_distinct_column_2__avg_length_address___as_avg_length_column_2_from_roadside_attractions" src="https://user-images.githubusercontent.com/9599/58388316-201ad580-7fd2-11e9-95c3-c98e2758fc1e.png">	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	447469253
https://github.com/simonw/datasette/issues/485#issuecomment-495085021	https://api.github.com/repos/simonw/datasette/issues/485	495085021	MDEyOklzc3VlQ29tbWVudDQ5NTA4NTAyMQ==	9599	2019-05-23T06:27:57Z	2019-05-26T23:15:51Z	OWNER	I could attempt to calculate the statistics needed for this in a time limited SQL query something like this one: https://latest.datasette.io/fixtures?sql=select+%27name%27+as+column%2C+count+%28distinct+name%29+as+count_distinct%2C+avg%28length%28name%29%29+as+avg_length+from+roadside_attractions%0D%0A++union%0D%0Aselect+%27address%27+as+column%2C+count%28distinct+address%29+as+count_distinct%2C+avg%28length%28address%29%29+as+avg_length+from+roadside_attractions ``` select 'name' as column, count (distinct name) as count_distinct, avg(length(name)) as avg_length from roadside_attractions union select 'address' as column, count(distinct address) as count_distinct, avg(length(address)) as avg_length from roadside_attractions ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	447469253
https://github.com/simonw/datasette/issues/485#issuecomment-496038601	https://api.github.com/repos/simonw/datasette/issues/485	496038601	MDEyOklzc3VlQ29tbWVudDQ5NjAzODYwMQ==	9599	2019-05-26T23:08:41Z	2019-05-26T23:08:41Z	OWNER	The code currently assumes the primary key is called "id" or "pk" - improving it to detect the primary key using database introspection should work much better.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	447469253
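
The last comment above proposes detecting the primary key through database introspection instead of assuming it is called "id" or "pk". A minimal sketch of that idea follows, using SQLite's `PRAGMA table_info` from Python's standard `sqlite3` module. The helper name `detect_primary_keys` and the table schemas are hypothetical, chosen only for illustration; this is not Datasette's actual implementation.

```python
import sqlite3


def detect_primary_keys(conn, table):
    """Return the primary key column names for `table`, in key order.

    Hypothetical helper for illustration only.
    """
    # Quote the table name as an identifier so unusual names are handled.
    quoted = '"' + table.replace('"', '""') + '"'
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    # pk is 0 for non-key columns, otherwise the 1-based position of the
    # column within the primary key.
    rows = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
    key_columns = sorted((row[5], row[1]) for row in rows if row[5] > 0)
    return [name for _, name in key_columns]


# Assumed schemas, invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table roadside_attractions "
    "(pk integer primary key, name text, address text)"
)
conn.execute("create table compound (a text, b text, c text, primary key (b, a))")
conn.execute("create table no_key (x text)")

for table in ("roadside_attractions", "compound", "no_key"):
    print(table, detect_primary_keys(conn, table))
```

A table with no declared primary key yields an empty list, so a caller would still need a fallback for that case, such as SQLite's implicit `rowid`.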