html_url,issue_url,id,node_id,user,user_label,created_at,updated_at,author_association,body,reactions,issue,issue_label,performed_via_github_app https://github.com/simonw/datasette/issues/1439#issuecomment-1059854864,https://api.github.com/repos/simonw/datasette/issues/1439,1059854864,IC_kwDOBm6k_c4_LBoQ,9599,simonw,2022-03-05T23:59:05Z,2022-03-05T23:59:05Z,OWNER,"OK, for that percentage thing: the Python core implementation of URL percentage escaping deliberately ignores two of the characters we want to escape: `.` and `-`: https://github.com/python/cpython/blob/6927632492cbad86a250aa006c1847e03b03e70b/Lib/urllib/parse.py#L780-L783 ```python _ALWAYS_SAFE = frozenset(b'ABCDEFGHIJKLMNOPQRSTUVWXYZ' b'abcdefghijklmnopqrstuvwxyz' b'0123456789' b'_.-~') ``` It also defaults to skipping `/` (passed as a `safe=` parameter to various things). I'm going to try borrowing and modifying the core of the Python implementation: https://github.com/python/cpython/blob/6927632492cbad86a250aa006c1847e03b03e70b/Lib/urllib/parse.py#L795-L814 ```python class _Quoter(dict): """"""A mapping from bytes numbers (in range(0,256)) to strings. String values are percent-encoded byte values, unless the key < 128, and in either of the specified safe set, or the always safe set. """""" # Keeps a cache internally, via __missing__, for efficiency (lookups # of cached keys don't call Python code at all). def __init__(self, safe): """"""safe: bytes object."""""" self.safe = _ALWAYS_SAFE.union(safe) def __repr__(self): return f"""" def __missing__(self, b): # Handle a cache miss. Store quoted string in cache and return. res = chr(b) if b in self.safe else '%{:02X}'.format(b) self[b] = res return res ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1059853526,https://api.github.com/repos/simonw/datasette/issues/1439,1059853526,IC_kwDOBm6k_c4_LBTW,9599,simonw,2022-03-05T23:49:59Z,2022-03-05T23:49:59Z,OWNER,"I want to try regular percentage encoding, except that it also encodes both the `-` and the `.` characters, AND it uses `-` instead of `%` as the encoding character. Should check what it does with emoji too.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1059851259,https://api.github.com/repos/simonw/datasette/issues/1439,1059851259,IC_kwDOBm6k_c4_LAv7,9599,simonw,2022-03-05T23:35:47Z,2022-03-05T23:35:59Z,OWNER,"This [comment from glyph](https://twitter.com/glyph/status/1500244937312329730) got me thinking: > Have you considered replacing % with some other character and then using percent-encoding? What happens if a table name includes a `%` character and that ends up getting mangled by a misbehaving proxy? I should consider `%` in the escaping system too. And maybe go with that suggestion of using percent-encoding directly but with a different character.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1059850369,https://api.github.com/repos/simonw/datasette/issues/1439,1059850369,IC_kwDOBm6k_c4_LAiB,9599,simonw,2022-03-05T23:28:56Z,2022-03-05T23:28:56Z,OWNER,"Lots of great conversations about the dash encoding implementation on Twitter: https://twitter.com/simonw/status/1500228316309061633 @dracos helped me figure out a simpler regex: https://twitter.com/dracos/status/1500236433809973248 `^/(?P[^/]+)/(?P[^\/\-\.]*|\-/|\-\.|\-\-)*(?P\.\w+)?$` ![image](https://user-images.githubusercontent.com/9599/156903088-c01933ae-4713-4e91-8d71-affebf70b945.png) ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1059836599,https://api.github.com/repos/simonw/datasette/issues/1439,1059836599,IC_kwDOBm6k_c4_K9K3,9599,simonw,2022-03-05T21:52:10Z,2022-03-05T21:52:10Z,OWNER,Blogged about this here: https://simonwillison.net/2022/Mar/5/dash-encoding/,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1059822391,https://api.github.com/repos/simonw/datasette/issues/1439,1059822391,IC_kwDOBm6k_c4_K5s3,9599,simonw,2022-03-05T19:50:12Z,2022-03-05T19:50:12Z,OWNER,I'm going to move this work to a PR.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1059822151,https://api.github.com/repos/simonw/datasette/issues/1439,1059822151,IC_kwDOBm6k_c4_K5pH,9599,simonw,2022-03-05T19:48:35Z,2022-03-05T19:48:35Z,OWNER,Those new docs: https://github.com/simonw/datasette/blob/d1cb73180b4b5a07538380db76298618a5fc46b6/docs/internals.rst#dash-encoding,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0, https://github.com/simonw/datasette/issues/1439#issuecomment-1059802318,https://api.github.com/repos/simonw/datasette/issues/1439,1059802318,IC_kwDOBm6k_c4_K0zO,9599,simonw,2022-03-05T17:34:33Z,2022-03-05T17:34:33Z,OWNER,"Wrote documentation: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",973139047,Rethink how .ext formats (v.s. ?_format=) works before 1.0,