(impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/simonw/sqlite-utils/pull/247?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). Last update [1fe73c8...af989af](https://codecov.io/gh/simonw/sqlite-utils/pull/247?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison).
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",832687563,
https://github.com/simonw/sqlite-utils/pull/247#issuecomment-901338988,https://api.github.com/repos/simonw/sqlite-utils/issues/247,901338988,IC_kwDOCGYnMM41uVds,9599,2021-08-18T18:33:39Z,2021-08-18T18:33:39Z,OWNER,This was also requested in #296.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",832687563,
https://github.com/simonw/sqlite-utils/issues/296#issuecomment-901338356,https://api.github.com/repos/simonw/sqlite-utils/issues/296,901338356,IC_kwDOCGYnMM41uVT0,9599,2021-08-18T18:32:39Z,2021-08-18T18:32:39Z,OWNER,This is a good call. I have a fix for this in Datasette but it's not in `sqlite-utils` yet: https://github.com/simonw/datasette/blob/adb5b70de5cec3c3dd37184defe606a082c232cf/datasette/utils/__init__.py#L824-L835,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",944326512,
https://github.com/simonw/sqlite-utils/issues/317#issuecomment-901337305,https://api.github.com/repos/simonw/sqlite-utils/issues/317,901337305,IC_kwDOCGYnMM41uVDZ,9599,2021-08-18T18:30:59Z,2021-08-18T18:30:59Z,OWNER,"I'm just going to remove this - I added it when the library was mostly undocumented, but it has comprehensive documentation now.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",972827346,
https://github.com/simonw/datasette/issues/1439#issuecomment-900715375,https://api.github.com/repos/simonw/datasette/issues/1439,900715375,IC_kwDOBm6k_c41r9Nv,9599,2021-08-18T00:15:28Z,2021-08-18T00:15:28Z,OWNER,"Maybe I should use `-/` to encode forward slashes too, to defend against any ASGI servers that might not implement `raw_path` correctly.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,
https://github.com/simonw/datasette/issues/1439#issuecomment-900714630,https://api.github.com/repos/simonw/datasette/issues/1439,900714630,IC_kwDOBm6k_c41r9CG,9599,2021-08-18T00:13:33Z,2021-08-18T00:13:33Z,OWNER,"The documentation should definitely cover how table names become URLs, in case any third party code needs to be able to calculate this themselves.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,
https://github.com/simonw/datasette/issues/1439#issuecomment-900712981,https://api.github.com/repos/simonw/datasette/issues/1439,900712981,IC_kwDOBm6k_c41r8oV,9599,2021-08-18T00:09:59Z,2021-08-18T00:12:32Z,OWNER,"So given the original examples, a table called `table.csv` would have the following URLs:
- `/db/table-.csv` - the HTML version
- `/db/table-.csv.csv` - the CSV version
- `/db/table-.csv.json` - the JSON version
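
A minimal sketch of how the three URLs above could be built and parsed back, assuming the dot dash encoding proposed in this thread. The helper names `table_url` and `split_format` are hypothetical and are not Datasette functions:

```python
def dot_dash_encode(s):
    # Escape literal dashes first, then mark literal dots with a leading dash
    return s.replace('-', '--').replace('.', '-.')


def dot_dash_decode(s):
    return s.replace('-.', '.').replace('--', '-')


def table_url(database, table, fmt=None):
    # Hypothetical helper: the only unescaped dot introduces the format suffix
    path = '/{}/{}'.format(database, dot_dash_encode(table))
    return path if fmt is None else '{}.{}'.format(path, fmt)


def split_format(encoded):
    # Hypothetical helper: scan from the right for a dot preceded by an even
    # number of dashes; an odd count means the dot is escaped and is part of
    # the table name
    for i in range(len(encoded) - 1, -1, -1):
        if encoded[i] != '.':
            continue
        dashes = 0
        while i - dashes - 1 >= 0 and encoded[i - dashes - 1] == '-':
            dashes += 1
        if dashes % 2 == 0:
            return dot_dash_decode(encoded[:i]), encoded[i + 1:]
    return dot_dash_decode(encoded), None


print(table_url('db', 'table.csv'))
print(table_url('db', 'table.csv', 'csv'))
print(split_format('table-.csv.json'))
```

Counting the dashes in front of a dot is what separates an escaped dot from a format separator: `a--.csv` is the table `a-` rendered as CSV, while `a-.csv` is the table `a.csv` rendered as HTML.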
And if for some horrific reason you had a table with the name `/db/table-.csv.csv` (so `/db/` was the first part of the actual table name in SQLite) the URLs would look like this:
- `/db/%2Fdb%2Ftable---.csv-.csv` - the HTML version
- `/db/%2Fdb%2Ftable---.csv-.csv.csv` - the CSV version
- `/db/%2Fdb%2Ftable---.csv-.csv.json` - the JSON version","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,
https://github.com/simonw/datasette/issues/1439#issuecomment-900711967,https://api.github.com/repos/simonw/datasette/issues/1439,900711967,IC_kwDOBm6k_c41r8Yf,9599,2021-08-18T00:08:09Z,2021-08-18T00:08:09Z,OWNER,"Here's an alternative I just made up which I'm calling ""dot dash"" encoding:
```python
def dot_dash_encode(s):
    return s.replace(""-"", ""--"").replace(""."", ""-."")
def dot_dash_decode(s):
    return s.replace(""-."", ""."").replace(""--"", ""-"")
```
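
Beyond hand-picked cases, a brute-force round trip over every short string drawn from `-`, `.` and one ordinary letter gives some extra confidence. This is only a sketch: the function name `check_round_trip` and the choice of alphabet and maximum length are assumptions made for illustration:

```python
from itertools import product


def dot_dash_encode(s):
    return s.replace('-', '--').replace('.', '-.')


def dot_dash_decode(s):
    return s.replace('-.', '.').replace('--', '-')


def check_round_trip(alphabet='-.a', max_length=6):
    # Try every string over the alphabet, up to max_length characters long
    seen = {}
    count = 0
    for length in range(max_length + 1):
        for chars in product(alphabet, repeat=length):
            original = ''.join(chars)
            encoded = dot_dash_encode(original)
            # Decoding must give back exactly what was encoded
            assert dot_dash_decode(encoded) == original, original
            # Two different names must never share an encoding
            assert seen.setdefault(encoded, original) == original
            count += 1
    return count


print(check_round_trip())
```
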
And some examples:
```python
for example in (
    ""hello"",
    ""hello.csv"",
    ""hello-and-so-on.csv"",
    ""hello-.csv"",
    ""hello--and--so--on-.csv"",
    ""hello.csv."",
    ""hello.csv.-"",
    ""hello.csv.--"",
):
    print(example)
    print(dot_dash_encode(example))
    print(example == dot_dash_decode(dot_dash_encode(example)))
    print()
```
Outputs:
```
hello
hello
True
hello.csv
hello-.csv
True
hello-and-so-on.csv
hello--and--so--on-.csv
True
hello-.csv
hello---.csv
True
hello--and--so--on-.csv
hello----and----so----on---.csv
True
hello.csv.
hello-.csv-.
True
hello.csv.-
hello-.csv-.--
True
hello.csv.--
hello-.csv-.----
True
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,
https://github.com/simonw/datasette/issues/1439#issuecomment-900709703,https://api.github.com/repos/simonw/datasette/issues/1439,900709703,IC_kwDOBm6k_c41r71H,9599,2021-08-18T00:03:09Z,2021-08-18T00:03:09Z,OWNER,"But... what if I invent my own escaping scheme?
I actually did this once before, in https://github.com/simonw/datasette/commit/9fdb47ca952b93b7b60adddb965ea6642b1ff523 - while I was working on porting Datasette to ASGI in https://github.com/simonw/datasette/issues/272#issuecomment-494192779 because ASGI didn't yet have the `raw_path` mechanism.
I could bring that back - it looked like this:
```
""table/and/slashes"" => ""tableU+002FandU+002Fslashes""
""~table"" => ""U+007Etable""
""+bobcats!"" => ""U+002Bbobcats!""
""U+007Etable"" => ""UU+002B007Etable""
```
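
A minimal sketch that reproduces the four examples above. It is not the code from that commit: the set of characters left unescaped is an assumption chosen only to match the examples, and the names `uplus_encode` and `uplus_decode` are hypothetical:

```python
import re
import string

# Assumption: characters passed through unescaped. The original commit may
# have used a different set.
SAFE = set(string.ascii_letters + string.digits + '_-!')

ESCAPE_RE = re.compile(r'U\+([0-9A-F]{4})')


def uplus_encode(s):
    parts = []
    for char in s:
        if char in SAFE:
            parts.append(char)
        elif ord(char) > 0xFFFF:
            # Keeps the sketch simple: escapes are always four hex digits
            raise ValueError('code point does not fit in four hex digits')
        else:
            parts.append('U+{:04X}'.format(ord(char)))
    return ''.join(parts)


def uplus_decode(s):
    # A literal + is always escaped, so every U+XXXX in the input is an escape
    return ESCAPE_RE.sub(lambda m: chr(int(m.group(1), 16)), s)


for name in ('table/and/slashes', '~table', '+bobcats!', 'U+007Etable'):
    print(name, '=>', uplus_encode(name))
```

Escaping `+` itself is what keeps the scheme reversible: a table whose real name contains `U+007E` cannot be confused with an escaped `~`.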
But I didn't particularly like it - it was quite verbose.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,
https://github.com/simonw/datasette/issues/1439#issuecomment-900705226,https://api.github.com/repos/simonw/datasette/issues/1439,900705226,IC_kwDOBm6k_c41r6vK,9599,2021-08-17T23:50:32Z,2021-08-17T23:50:47Z,OWNER,"An alternative solution would be to use some form of escaping for the characters that form the name of the table.
The obvious way to do this would be URL-encoding - but it doesn't hold for `.` characters. The hex for that is `%2E` but watch what happens with that in a URL:
```
# Against Cloud Run:
curl -s 'https://datasette.io/-/asgi-scope/foo/bar%2Fbaz%2E' | rg path
'path': '/-/asgi-scope/foo/bar/baz.',
'raw_path': b'/-/asgi-scope/foo/bar%2Fbaz.',
'root_path': '',
# Against Vercel:
curl -s 'https://til.simonwillison.net/-/asgi-scope/foo/bar%2Fbaz%2E' | rg path
'path': '/-/asgi-scope/foo/bar%2Fbaz%2E',
'raw_path': b'/-/asgi-scope/foo/bar%2Fbaz%2E',
'root_path': '',
```
Surprisingly in this case Vercel DOES keep it intact, but Cloud Run does not.
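
One plausible reason for the difference: `.` is an unreserved character in RFC 3986, so `%2E` and `.` are treated as equivalent and an intermediary is free to normalize one into the other. Python's standard library behaves the same way. This small illustration says nothing about what Cloud Run or Vercel actually run internally:

```python
from urllib.parse import quote, unquote

name = 'bar/baz.'

# quote() never escapes unreserved characters, so the dot is left as-is
print(quote(name, safe=''))

# A hand-written %2E decodes to a plain dot, and re-encoding does not restore it
decoded = unquote('bar%2Fbaz%2E')
print(decoded)
print(quote(decoded, safe=''))
```
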
It's still no good though: I need a solution that works on Vercel, Cloud Run and every other potential hosting provider too.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,
https://github.com/simonw/datasette/issues/1439#issuecomment-900699670,https://api.github.com/repos/simonw/datasette/issues/1439,900699670,IC_kwDOBm6k_c41r5YW,9599,2021-08-17T23:34:23Z,2021-08-17T23:34:23Z,OWNER,"The challenge comes down to telling the difference between the following:
- `/db/table` - an HTML table page
- `/db/table.csv` - the CSV version of `/db/table`
- `/db/table.csv` - no this one is actually a database table called `table.csv`
- `/db/table.csv.csv` - the CSV version of `/db/table.csv`
- `/db/table.csv.csv.csv` and so on...","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,
https://github.com/simonw/datasette/issues/1438#issuecomment-900690998,https://api.github.com/repos/simonw/datasette/issues/1438,900690998,IC_kwDOBm6k_c41r3Q2,9599,2021-08-17T23:11:16Z,2021-08-17T23:12:25Z,OWNER,"I have completely failed to replicate this initial bug - but it's still there on the `thesession.vercel.app` deployment (even though my own deployments to Vercel do not exhibit it). Here's a one-liner to replicate it against that deployment:
`curl -s 'https://thesession.vercel.app/thesession?sql=select+*+from+tunes+where+name+like+%22%25wise+maid%25%22' | rg '.csv'`
Which outputs this:
`This data as json, CSV
`
It looks like, rather than being URL-encoded, the original query string is somehow making it through to Jinja and then being auto-escaped there.
The weird thing is that the equivalent query executed against my `til.simonwillison.net` Vercel instance does this:
`curl -s 'https://til.simonwillison.net/fixtures?sql=select+*+from+searchable+where+text1+like+%22%25a%25%22' | rg '.csv'`
`This data as json, CSV
`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",972918533,
https://github.com/simonw/datasette/issues/1438#issuecomment-900681413,https://api.github.com/repos/simonw/datasette/issues/1438,900681413,IC_kwDOBm6k_c41r07F,9599,2021-08-17T22:47:44Z,2021-08-17T22:47:44Z,OWNER,I deployed another copy of `fixtures.db` on Vercel at https://til.simonwillison.net/fixtures so I can compare it with `fixtures.db` on Cloud Run at https://latest.datasette.io/fixtures,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",972918533,
https://github.com/simonw/datasette/issues/1438#issuecomment-900518343,https://api.github.com/repos/simonw/datasette/issues/1438,900518343,IC_kwDOBm6k_c41rNHH,9599,2021-08-17T18:04:42Z,2021-08-17T18:04:42Z,OWNER,Here's how `request.query_string` works: https://github.com/simonw/datasette/blob/adb5b70de5cec3c3dd37184defe606a082c232cf/datasette/utils/asgi.py#L86-L88,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",972918533,
https://github.com/simonw/datasette/issues/1438#issuecomment-900516826,https://api.github.com/repos/simonw/datasette/issues/1438,900516826,IC_kwDOBm6k_c41rMva,9599,2021-08-17T18:02:27Z,2021-08-17T18:02:27Z,OWNER,"The key difference I can spot between Vercel and Cloud Run is that `+` in a query string gets converted to `%20` by Vercel before it gets to my app, but does not for Cloud Run:
```
# Vercel
~ % curl -s 'https://til.simonwillison.net/-/asgi-scope?sql=select+*+from+tunes+where+name+like+%22%25wise+maid%25%22%0D%0A' | rg 'query_string' -C 2
'method': 'GET',
'path': '/-/asgi-scope',
'query_string': b'sql=select%20*%20from%20tunes%20where%20name%20like%20%22%25'
b'wise%20maid%25%22%0D%0A',
'raw_path': b'/-/asgi-scope',
# Cloud Run
~ % curl -s 'https://latest-with-plugins.datasette.io/-/asgi-scope?sql=select+*+from+tunes+where+name+like+%22%25wise+maid%25%22%0D%0A' | rg 'query_string' -C 2
'method': 'GET',
'path': '/-/asgi-scope',
'query_string': b'sql=select+*+from+tunes+where+name+like+%22%25wise+maid%25%2'
b'2%0D%0A',
'raw_path': b'/-/asgi-scope',
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",972918533,
https://github.com/simonw/datasette/issues/1438#issuecomment-900513267,https://api.github.com/repos/simonw/datasette/issues/1438,900513267,IC_kwDOBm6k_c41rL3z,9599,2021-08-17T17:57:05Z,2021-08-17T17:57:05Z,OWNER,"I'm having trouble replicating this bug outside of Vercel. Against Cloud Run: view-source:https://latest.datasette.io/fixtures?sql=select+*+from+searchable+where+text1+like+%22%25cat%25%22
The HTML here is:
```html
This data as
json,
...
CSV
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",972918533,
https://github.com/simonw/datasette/issues/1438#issuecomment-900502364,https://api.github.com/repos/simonw/datasette/issues/1438,900502364,IC_kwDOBm6k_c41rJNc,9599,2021-08-17T17:40:41Z,2021-08-17T17:40:41Z,OWNER,Bug is likely in `path_with_format` itself: https://github.com/simonw/datasette/blob/adb5b70de5cec3c3dd37184defe606a082c232cf/datasette/utils/__init__.py#L710-L729,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",972918533,
https://github.com/simonw/datasette/issues/1438#issuecomment-900500824,https://api.github.com/repos/simonw/datasette/issues/1438,900500824,IC_kwDOBm6k_c41rI1Y,9599,2021-08-17T17:38:16Z,2021-08-17T17:38:16Z,OWNER,"Relevant template code: https://github.com/simonw/datasette/blob/adb5b70de5cec3c3dd37184defe606a082c232cf/datasette/templates/query.html#L71
`renderers` comes from here: https://github.com/simonw/datasette/blob/2883098770fc66e50183b2b231edbde20848d4d6/datasette/views/base.py#L593-L608","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",972918533,
https://github.com/simonw/datasette/issues/1293#issuecomment-899915829,https://api.github.com/repos/simonw/datasette/issues/1293,899915829,IC_kwDOBm6k_c41o6A1,9599,2021-08-17T01:02:35Z,2021-08-17T01:02:35Z,OWNER,"New approach: this time I'm building a simplified executor for the bytecode operations themselves.
```python
from typing import Any, Dict, Tuple
def execute_operations(operations, max_iterations=100, trace=None):
    trace = trace or (lambda *args: None)
    registers: Dict[int, Any] = {}
    cursors: Dict[int, Tuple[str, Dict]] = {}
    instruction_pointer = 0
    iterations = 0
    result_row = None
    while True:
        # Guard against bytecode loops that never reach Halt
        iterations += 1
        if iterations > max_iterations:
            break
        operation = operations[instruction_pointer]
        trace(instruction_pointer, dict(operation))
        opcode = operation[""opcode""]
        if opcode == ""Init"":
            if operation[""p2""] != 0:
                instruction_pointer = operation[""p2""]
                continue
            else:
                instruction_pointer += 1
                continue
        elif opcode == ""Goto"":
            instruction_pointer = operation[""p2""]
            continue
        elif opcode == ""Halt"":
            break
        elif opcode == ""OpenRead"":
            cursors[operation[""p1""]] = (""database_table"", {
                ""rootpage"": operation[""p2""],
                ""connection"": operation[""p3""],
            })
        elif opcode == ""OpenEphemeral"":
            cursors[operation[""p1""]] = (""ephemeral"", {
                ""num_columns"": operation[""p2""],
                ""index_keys"": [],
            })
        elif opcode == ""MakeRecord"":
            registers[operation[""p3""]] = (""MakeRecord"", {
                ""registers"": list(range(operation[""p1""] + operation[""p2""]))
            })
        elif opcode == ""IdxInsert"":
            record = registers[operation[""p2""]]
            cursors[operation[""p1""]][1][""index_keys""].append(record)
        elif opcode == ""Rowid"":
            registers[operation[""p2""]] = (""rowid"", {
                ""table"": operation[""p1""]
            })
        elif opcode == ""Sequence"":
            registers[operation[""p2""]] = (""sequence"", {
                ""next_from_cursor"": operation[""p1""]
            })
        elif opcode == ""Column"":
            registers[operation[""p3""]] = (""column"", {
                ""cursor"": operation[""p1""],
                ""column_offset"": operation[""p2""]
            })
        elif opcode == ""ResultRow"":
            p1 = operation[""p1""]
            p2 = operation[""p2""]
            trace(""ResultRow: "", list(range(p1, p1 + p2)), registers)
            result_row = [registers.get(i) for i in range(p1, p1 + p2)]
        elif opcode == ""Integer"":
            registers[operation[""p2""]] = (""Integer"", operation[""p1""])
        elif opcode == ""String8"":
            registers[operation[""p2""]] = (""String"", operation[""p4""])
        # Any opcode that did not jump falls through to the next instruction
        instruction_pointer += 1
    return {""registers"": registers, ""cursors"": cursors, ""result_row"": result_row}
```
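
The function looks up each operation by column name (`operation['opcode']`) and calls `dict(operation)`, so the rows it receives need to be mapping-like. A minimal sketch of one way to produce such rows with the standard library, assuming an in-memory database and a made-up schema standing in for the `searchable` table:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
# sqlite3.Row supports both operation['opcode'] and dict(operation)
conn.row_factory = sqlite3.Row
# Hypothetical stand-in for the searchable table in the fixtures database
conn.execute(
    'create table searchable (pk integer primary key, text1 text, text2 text)'
)

operations = conn.execute('explain select rowid, * from searchable').fetchall()
for operation in operations:
    print(
        operation['addr'],
        operation['opcode'],
        operation['p1'],
        operation['p2'],
        operation['p3'],
    )
```

The exact opcodes printed depend on the SQLite version, so treat any particular listing as illustrative.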
Results are promising!
```
execute_operations(db.execute(""explain select 'hello', 55, rowid, * from searchable"").fetchall())
{'registers': {1: ('String', 'hello'),
2: ('Integer', 55),
3: ('rowid', {'table': 0}),
4: ('rowid', {'table': 0}),
5: ('column', {'cursor': 0, 'column_offset': 1}),
6: ('column', {'cursor': 0, 'column_offset': 2}),
7: ('column', {'cursor': 0, 'column_offset': 3})},
'cursors': {0: ('database_table', {'rootpage': 32, 'connection': 0})},
'result_row': [('String', 'hello'),
('Integer', 55),
('rowid', {'table': 0}),
('rowid', {'table': 0}),
('column', {'cursor': 0, 'column_offset': 1}),
('column', {'cursor': 0, 'column_offset': 2}),
('column', {'cursor': 0, 'column_offset': 3})]}
```
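
The cursors only carry a `rootpage` number. A sketch of how that could be turned back into a table name using `sqlite_master`, which records the root page of every table and index. The helper names and the hand-written input are assumptions for illustration:

```python
import sqlite3


def table_for_rootpage(conn, rootpage):
    # sqlite_master stores the root page number of every table and index
    row = conn.execute(
        'select name from sqlite_master where rootpage = ?', (rootpage,)
    ).fetchone()
    return row[0] if row else None


def source_tables(conn, result_row, cursors):
    # Hypothetical helper: one entry per output column, None when the value
    # does not come straight from a real table (literals, ephemeral cursors)
    sources = []
    for entry in result_row:
        table = None
        if entry is not None and entry[0] in ('column', 'rowid'):
            kind, detail = entry
            cursor_id = detail['cursor'] if kind == 'column' else detail['table']
            cursor_kind, cursor_detail = cursors.get(cursor_id, (None, None))
            if cursor_kind == 'database_table':
                table = table_for_rootpage(conn, cursor_detail['rootpage'])
        sources.append(table)
    return sources


conn = sqlite3.connect(':memory:')
conn.execute('create table searchable (pk integer primary key, text1 text)')
rootpage = conn.execute(
    'select rootpage from sqlite_master where name = ?', ('searchable',)
).fetchone()[0]

# Hand-written input in the same shape as the output above
cursors = {0: ('database_table', {'rootpage': rootpage, 'connection': 0})}
result_row = [
    ('String', 'hello'),
    ('rowid', {'table': 0}),
    ('column', {'cursor': 0, 'column_offset': 1}),
]
print(source_tables(conn, result_row, cursors))
```

This sketch ignores the `connection` value, so it only consults the main database and would need extending for attached databases.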
Here's what happens with a union across three tables:
```
execute_operations(db.execute(""""""
explain select data as content from binary_data
union
select pk as content from complex_foreign_keys
union
select name as content from facet_cities
"""""").fetchall())
{'registers': {1: ('column', {'cursor': 4, 'column_offset': 0}),
2: ('MakeRecord', {'registers': [0, 1, 2, 3]}),
3: ('column', {'cursor': 0, 'column_offset': 1}),
4: ('column', {'cursor': 3, 'column_offset': 0})},
'cursors': {3: ('ephemeral',
{'num_columns': 1,
'index_keys': [('MakeRecord', {'registers': [0, 1]}),
('MakeRecord', {'registers': [0, 1]}),
('MakeRecord', {'registers': [0, 1, 2, 3]})]}),
2: ('database_table', {'rootpage': 44, 'connection': 0}),
4: ('database_table', {'rootpage': 24, 'connection': 0}),
0: ('database_table', {'rootpage': 42, 'connection': 0})},
'result_row': [('column', {'cursor': 3, 'column_offset': 0})]}
```
Note how the result_row refers to cursor 3, which is an ephemeral table which had three different sets of `MakeRecord` index keys assigned to it - indicating that the output column is NOT from the same underlying table source.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",849978964,