github: issues: 1 row where body = "Datasette limits the number of rows returned to 1,000 and limits the time spent executing a SQL query to 1000ms - and both of these limits can be customized. It does not have a limit on the size of the response returned. It's possible to compose maliciously large SQL responses in a small number of rows using mechanisms like the `group_concat()` aggregate function. It would be good to avoid malicious SQL creating 100MB+ responses and potentially crashing the server. I think the easiest place to implement that is here: https://github.com/simonw/datasette/blob/f3f42957128c1e7ece584d45d9167f2ac003a3b8/datasette/app.py#L175-L190 Currently we use `cursor.fetchmany()` to fetch up to 1,001 rows at once. Instead, we could switch to iterating through `cursor.fetchone()` (or just using `for row in cursor`) and keeping a running tally of the size of the response as we go - maybe just using `rough_response_size += len(str(row))`. If that goes above a certain threshold we can terminate the response with an error, like we do with timelimits. The bigger challenge here is understanding how well this approach works and what impact it will have on overall Datasette performance. I think I need #33 for this.", repo = 107914493, state = "open" and type = "issue" sorted by updated

1 row where body = "Datasette limits the number of rows returned to 1,000 and limits the time spent executing a SQL query to 1000ms - and both of these limits can be customized. It does not have a limit on the size of the response returned. It's possible to compose maliciously large SQL responses in a small number of rows using mechanisms like the `group_concat()` aggregate function. It would be good to avoid malicious SQL creating 100MB+ responses and potentially crashing the server. I think the easiest place to implement that is here: https://github.com/simonw/datasette/blob/f3f42957128c1e7ece584d45d9167f2ac003a3b8/datasette/app.py#L175-L190 Currently we use `cursor.fetchmany()` to fetch up to 1,001 rows at once. Instead, we could switch to iterating through `cursor.fetchone()` (or just using `for row in cursor`) and keeping a running tally of the size of the response as we go - maybe just using `rough_response_size += len(str(row))`. If that goes above a certain threshold we can terminate the response with an error, like we do with timelimits. The bigger challenge here is understanding how well this approach works and what impact it will have on overall Datasette performance. I think I need #33 for this.", repo = 107914493, state = "open" and type = "issue" sorted by updated_at descending

Search:

descending

id	node_id	number	title	user	state	locked	assignee	milestone	comments	created_at	updated_at ▲	closed_at	author_association	pull_request	body	repo	type	active_lock_reason	performed_via_github_app	reactions	draft	state_reason
316621102	MDU6SXNzdWUzMTY2MjExMDI=	235	Add limit on the size in KB of data returned from a single query	simonw 9599	open	0			2	2018-04-22T23:01:15Z	2018-04-24T00:30:02Z		OWNER		Datasette limits the number of rows returned to 1,000 and limits the time spent executing a SQL query to 1000ms - and both of these limits can be customized. It does not have a limit on the size of the response returned. It's possible to compose maliciously large SQL responses in a small number of rows using mechanisms like the `group_concat()` aggregate function. It would be good to avoid malicious SQL creating 100MB+ responses and potentially crashing the server. I think the easiest place to implement that is here: https://github.com/simonw/datasette/blob/f3f42957128c1e7ece584d45d9167f2ac003a3b8/datasette/app.py#L175-L190 Currently we use `cursor.fetchmany()` to fetch up to 1,001 rows at once. Instead, we could switch to iterating through `cursor.fetchone()` (or just using `for row in cursor`) and keeping a running tally of the size of the response as we go - maybe just using `rough_response_size += len(str(row))`. If that goes above a certain threshold we can terminate the response with an error, like we do with timelimits. The bigger challenge here is understanding how well this approach works and what impact it will have on overall Datasette performance. I think I need #33 for this.	datasette 107914493	issue			{ "url": "https://api.github.com/repos/simonw/datasette/issues/235/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }

Advanced export

JSON shape: default, array, newline-delimited, object

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [pull_request] TEXT,
   [body] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
, [active_lock_reason] TEXT, [performed_via_github_app] TEXT, [reactions] TEXT, [draft] INTEGER, [state_reason] TEXT);
CREATE INDEX [idx_issues_repo]
                ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
                ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
                ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
                ON [issues] ([user]);