github
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/simonw/datasette/pull/1648#issuecomment-1059823151 | https://api.github.com/repos/simonw/datasette/issues/1648 | 1059823151 | IC_kwDOBm6k_c4_K54v | 22429695 | 2022-03-05T19:56:41Z | 2022-03-07T15:38:08Z | NONE | # [Codecov](https://codecov.io/gh/simonw/datasette/pull/1648?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) Report > Merging [#1648](https://codecov.io/gh/simonw/datasette/pull/1648?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (32548b8) into [main](https://codecov.io/gh/simonw/datasette/commit/7d24fd405f3c60e4c852c5d746c91aa2ba23cf5b?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (7d24fd4) will **increase** coverage by `0.02%`. > The diff coverage is `100.00%`. [![Impacted file tree graph](https://codecov.io/gh/simonw/datasette/pull/1648/graphs/tree.svg?width=650&height=150&src=pr&token=eSahVY7kw1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)](https://codecov.io/gh/simonw/datasette/pull/1648?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) ```diff @@ Coverage Diff @@ ## main #1648 +/- ## ========================================== + Coverage 92.03% 92.05% +0.02% ========================================== Files 34 34 Lines 4557 4570 +13 ========================================== + Hits 4194 4207 +13 Misses 363 363 ``` | [Impacted Files](https://codecov.io/gh/simonw/datasette/pull/1648?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) | Coverage Δ | | |---|---|---| | 
[datasette/url\_builder.py](https://codecov.io/gh/simonw/datasette/pull/1648/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison#diff-ZGF0YXNldHRlL3VybF9idWlsZGVyLnB5) | `100.00% <100.00%> (ø)` | … | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1160432941 | |
https://github.com/simonw/datasette/issues/1439#issuecomment-1059854864 | https://api.github.com/repos/simonw/datasette/issues/1439 | 1059854864 | IC_kwDOBm6k_c4_LBoQ | 9599 | 2022-03-05T23:59:05Z | 2022-03-05T23:59:05Z | OWNER | OK, for that percentage thing: the Python core implementation of URL percentage escaping deliberately ignores two of the characters we want to escape: `.` and `-`: https://github.com/python/cpython/blob/6927632492cbad86a250aa006c1847e03b03e70b/Lib/urllib/parse.py#L780-L783 ```python _ALWAYS_SAFE = frozenset(b'ABCDEFGHIJKLMNOPQRSTUVWXYZ' b'abcdefghijklmnopqrstuvwxyz' b'0123456789' b'_.-~') ``` It also defaults to skipping `/` (passed as a `safe=` parameter to various things). I'm going to try borrowing and modifying the core of the Python implementation: https://github.com/python/cpython/blob/6927632492cbad86a250aa006c1847e03b03e70b/Lib/urllib/parse.py#L795-L814 ```python class _Quoter(dict): """A mapping from bytes numbers (in range(0,256)) to strings. String values are percent-encoded byte values, unless the key < 128, and in either of the specified safe set, or the always safe set. """ # Keeps a cache internally, via __missing__, for efficiency (lookups # of cached keys don't call Python code at all). def __init__(self, safe): """safe: bytes object.""" self.safe = _ALWAYS_SAFE.union(safe) def __repr__(self): return f"<Quoter {dict(self)!r}>" def __missing__(self, b): # Handle a cache miss. Store quoted string in cache and return. res = chr(b) if b in self.safe else '%{:02X}'.format(b) self[b] = res return res ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
973139047 | |
https://github.com/simonw/datasette/issues/1439#issuecomment-1059853526 | https://api.github.com/repos/simonw/datasette/issues/1439 | 1059853526 | IC_kwDOBm6k_c4_LBTW | 9599 | 2022-03-05T23:49:59Z | 2022-03-05T23:49:59Z | OWNER | I want to try regular percentage encoding, except that it also encodes both the `-` and the `.` characters, AND it uses `-` instead of `%` as the encoding character. Should check what it does with emoji too. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
973139047 | |
https://github.com/simonw/datasette/issues/1439#issuecomment-1059851259 | https://api.github.com/repos/simonw/datasette/issues/1439 | 1059851259 | IC_kwDOBm6k_c4_LAv7 | 9599 | 2022-03-05T23:35:47Z | 2022-03-05T23:35:59Z | OWNER | This [comment from glyph](https://twitter.com/glyph/status/1500244937312329730) got me thinking: > Have you considered replacing % with some other character and then using percent-encoding? What happens if a table name includes a `%` character and that ends up getting mangled by a misbehaving proxy? I should consider `%` in the escaping system too. And maybe go with that suggestion of using percent-encoding directly but with a different character. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
973139047 | |
https://github.com/simonw/datasette/issues/1439#issuecomment-1059850369 | https://api.github.com/repos/simonw/datasette/issues/1439 | 1059850369 | IC_kwDOBm6k_c4_LAiB | 9599 | 2022-03-05T23:28:56Z | 2022-03-05T23:28:56Z | OWNER | Lots of great conversations about the dash encoding implementation on Twitter: https://twitter.com/simonw/status/1500228316309061633 @dracos helped me figure out a simpler regex: https://twitter.com/dracos/status/1500236433809973248 `^/(?P<database>[^/]+)/(?P<table>[^\/\-\.]*|\-/|\-\.|\-\-)*(?P<format>\.\w+)?$` ![image](https://user-images.githubusercontent.com/9599/156903088-c01933ae-4713-4e91-8d71-affebf70b945.png) | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
973139047 | |
https://github.com/simonw/datasette/issues/1439#issuecomment-1059836599 | https://api.github.com/repos/simonw/datasette/issues/1439 | 1059836599 | IC_kwDOBm6k_c4_K9K3 | 9599 | 2022-03-05T21:52:10Z | 2022-03-05T21:52:10Z | OWNER | Blogged about this here: https://simonwillison.net/2022/Mar/5/dash-encoding/ | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
973139047 | |
https://github.com/simonw/datasette/issues/1647#issuecomment-1059823119 | https://api.github.com/repos/simonw/datasette/issues/1647 | 1059823119 | IC_kwDOBm6k_c4_K54P | 9599 | 2022-03-05T19:56:27Z | 2022-03-05T19:56:27Z | OWNER | Updated this TIL with extra patterns I figured out: https://til.simonwillison.net/sqlite/ld-preload | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1160407071 | |
https://github.com/simonw/datasette/issues/1439#issuecomment-1059822391 | https://api.github.com/repos/simonw/datasette/issues/1439 | 1059822391 | IC_kwDOBm6k_c4_K5s3 | 9599 | 2022-03-05T19:50:12Z | 2022-03-05T19:50:12Z | OWNER | I'm going to move this work to a PR. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
973139047 | |
https://github.com/simonw/datasette/issues/1439#issuecomment-1059822151 | https://api.github.com/repos/simonw/datasette/issues/1439 | 1059822151 | IC_kwDOBm6k_c4_K5pH | 9599 | 2022-03-05T19:48:35Z | 2022-03-05T19:48:35Z | OWNER | Those new docs: https://github.com/simonw/datasette/blob/d1cb73180b4b5a07538380db76298618a5fc46b6/docs/internals.rst#dash-encoding | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
973139047 | |
https://github.com/simonw/datasette/issues/1647#issuecomment-1059821674 | https://api.github.com/repos/simonw/datasette/issues/1647 | 1059821674 | IC_kwDOBm6k_c4_K5hq | 9599 | 2022-03-05T19:44:32Z | 2022-03-05T19:44:32Z | OWNER | I thought I'd need to introduce https://dirty-equals.helpmanual.io/types/string/ to help write tests for this, but I think I've found a good alternative that doesn't need a new dependency. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1160407071 | |
https://github.com/simonw/datasette/issues/1647#issuecomment-1059819628 | https://api.github.com/repos/simonw/datasette/issues/1647 | 1059819628 | IC_kwDOBm6k_c4_K5Bs | 9599 | 2022-03-05T19:28:54Z | 2022-03-05T19:28:54Z | OWNER | OK, using that trick worked for testing this: docker run -it -p 8001:8001 ubuntu Then inside that container: apt-get install -y python3 build-essential tcl wget python3-pip git python3.8-venv For each version of SQLite I wanted to test I needed to figure out the tarball URL - for example, for `3.38.0` I navigated to https://www.sqlite.org/src/timeline?t=version-3.38.0 and clicked the "checkin" link and copied the tarball link: https://www.sqlite.org/src/tarball/40fa792d/SQLite-40fa792d.tar.gz Then to build it (the `CPPFLAGS` took some trial and error): ``` cd /tmp wget https://www.sqlite.org/src/tarball/40fa792d/SQLite-40fa792d.tar.gz tar -xzvf SQLite-40fa792d.tar.gz cd SQLite-40fa792d CPPFLAGS="-DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_FTS3_PARENTHESIS -DSQLITE_ENABLE_RTREE=1" ./configure make ``` Then to test with Datasette: ``` cd /tmp git clone https://github.com/simonw/datasette cd datasette python3 -m venv venv source venv/bin/activate pip install wheel # So bdist_wheel works in next step pip install -e '.[test]' LD_PRELOAD=/tmp/SQLite-40fa792d/.libs/libsqlite3.so pytest ``` After some trial and error I proved that those tests passed with 3.36.0: ``` cd /tmp wget https://www.sqlite.org/src/tarball/5c9a6c06/SQLite-5c9a6c06.tar.gz tar -xzvf SQLite-5c9a6c06.tar.gz cd SQLite-5c9a6c06 CPPFLAGS="-DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_FTS3_PARENTHESIS -DSQLITE_ENABLE_RTREE=1" ./configure make cd /tmp/datasette LD_PRELOAD=/tmp/SQLite-5c9a6c06/.libs/libsqlite3.so pytest tests/test_internals_database.py ``` BUT failed with 3.37.0: ``` # 3.37.0 cd /tmp wget https://www.sqlite.org/src/tarball/bd41822c/SQLite-bd41822c.tar.gz tar -xzvf SQLite-bd41822c.tar.gz cd SQLite-bd41822c CPPFLAGS="-DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_FTS3_PARENTHESIS 
-DSQLITE_ENABLE_RTREE=1" ./configure make cd /tmp/datasette LD_PRELOAD=/tmp/SQLite-bd41822c/.libs/libsqlite3.so pytest tests/test_internals_database.py ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1160407071 | |
https://github.com/simonw/datasette/issues/1647#issuecomment-1059807598 | https://api.github.com/repos/simonw/datasette/issues/1647 | 1059807598 | IC_kwDOBm6k_c4_K2Fu | 9599 | 2022-03-05T18:06:56Z | 2022-03-05T18:08:00Z | OWNER | Had a look through the commits in https://github.com/sqlite/sqlite/compare/version-3.37.2...version-3.38.0 but couldn't see anything obvious that might have caused this. Really wish I had a good mechanism for running the test suite against different SQLite versions! May have to revisit this old trick: https://til.simonwillison.net/sqlite/ld-preload | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1160407071 | |
https://github.com/simonw/datasette/issues/1647#issuecomment-1059804577 | https://api.github.com/repos/simonw/datasette/issues/1647 | 1059804577 | IC_kwDOBm6k_c4_K1Wh | 9599 | 2022-03-05T17:49:46Z | 2022-03-05T17:49:46Z | OWNER | My best guess is that this is an undocumented change in SQLite 3.38 - I get that test failure with that SQLite version. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1160407071 | |
https://github.com/simonw/datasette/issues/1439#issuecomment-1059802318 | https://api.github.com/repos/simonw/datasette/issues/1439 | 1059802318 | IC_kwDOBm6k_c4_K0zO | 9599 | 2022-03-05T17:34:33Z | 2022-03-05T17:34:33Z | OWNER | Wrote documentation: <img width="741" alt="Dash encoding. Datasette uses a custom encoding scheme in some places, called dash encoding. This is primarily used for table names and row primary keys, to avoid any confusion between / characters in those values and the Datasette URL that references them. Dash encoding applies the following rules, in order: 1. All single - characters are replaced by -- 2. . characters are replaced by -. 3. / characters are replaced by -/ These rules are applied in reverse order to decode a dash encoded string." src="https://user-images.githubusercontent.com/9599/156893903-5723f60e-e054-4365-84bc-f3084d11183d.png"> | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
973139047 | |
https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059652538 | https://api.github.com/repos/simonw/sqlite-utils/issues/412 | 1059652538 | IC_kwDOCGYnMM4_KQO6 | 9599 | 2022-03-05T02:13:17Z | 2022-03-05T02:13:17Z | OWNER | > It looks like the existing `pd.read_sql_query()` method has an optional dependency on SQLAlchemy: > > ``` > ... > import pandas as pd > pd.read_sql_query(db.conn, "select * from articles") > # ImportError: Using URI string without sqlalchemy installed. > ``` Hah, no I was wrong about this: SQLAlchemy is not needed for SQLite to work, I just had the arguments the wrong way round: ```python pd.read_sql_query("select * from articles", db.conn) # Shows a DataFrame ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1160182768 | |
https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059651306 | https://api.github.com/repos/simonw/sqlite-utils/issues/412 | 1059651306 | IC_kwDOCGYnMM4_KP7q | 9599 | 2022-03-05T02:10:49Z | 2022-03-05T02:10:49Z | OWNER | I could teach `.insert_all()` and `.upsert_all()` to optionally accept a DataFrame. A challenge there is `mypy` - if Pandas is an optional dependency, is it possible to declare types that accept a Union that includes DataFrame? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1160182768 | |
https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059651056 | https://api.github.com/repos/simonw/sqlite-utils/issues/412 | 1059651056 | IC_kwDOCGYnMM4_KP3w | 9599 | 2022-03-05T02:09:38Z | 2022-03-05T02:09:38Z | OWNER | OK, so reading results from existing `sqlite-utils` into a Pandas DataFrame turns out to be trivial. How about writing a DataFrame to a database table? That feels like it could be a lot more useful. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1160182768 | |
https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059649803 | https://api.github.com/repos/simonw/sqlite-utils/issues/412 | 1059649803 | IC_kwDOCGYnMM4_KPkL | 9599 | 2022-03-05T02:02:41Z | 2022-03-05T02:02:41Z | OWNER | It looks like the existing `pd.read_sql_query()` method has an optional dependency on SQLAlchemy: ``` ... import pandas as pd pd.read_sql_query(db.conn, "select * from articles") # ImportError: Using URI string without sqlalchemy installed. ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1160182768 | |
https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059649213 | https://api.github.com/repos/simonw/sqlite-utils/issues/412 | 1059649213 | IC_kwDOCGYnMM4_KPa9 | 9599 | 2022-03-05T02:00:10Z | 2022-03-05T02:00:10Z | OWNER | Requested feedback on Twitter here: https://twitter.com/simonw/status/1499927075930578948 | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1160182768 | |
https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059649193 | https://api.github.com/repos/simonw/sqlite-utils/issues/412 | 1059649193 | IC_kwDOCGYnMM4_KPap | 9599 | 2022-03-05T02:00:02Z | 2022-03-05T02:00:02Z | OWNER | Yeah, I imagine there are plenty of ways to do this with Pandas already - I'm opportunistically looking for a way to provide better integration with the rest of the Pandas situation from the work I've done in `sqlite-utils` already. Might be that this isn't worth doing at all. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1160182768 | |
https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059647114 | https://api.github.com/repos/simonw/sqlite-utils/issues/412 | 1059647114 | IC_kwDOCGYnMM4_KO6K | 25778 | 2022-03-05T01:54:24Z | 2022-03-05T01:54:24Z | CONTRIBUTOR | I haven't tried this, but it looks like Pandas has a method for this: https://pandas.pydata.org/docs/reference/api/pandas.read_sql_query.html | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1160182768 | |
https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059646645 | https://api.github.com/repos/simonw/sqlite-utils/issues/412 | 1059646645 | IC_kwDOCGYnMM4_KOy1 | 9599 | 2022-03-05T01:53:10Z | 2022-03-05T01:53:10Z | OWNER | I'm not an experienced enough Pandas user to know if this design is right or not. I'm going to leave this open for a while and solicit some feedback. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1160182768 | |
https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059646543 | https://api.github.com/repos/simonw/sqlite-utils/issues/412 | 1059646543 | IC_kwDOCGYnMM4_KOxP | 9599 | 2022-03-05T01:52:47Z | 2022-03-05T01:52:47Z | OWNER | I built a prototype of that second option and it looks pretty good: <img width="1108" alt="image" src="https://user-images.githubusercontent.com/9599/156862952-57cc937c-7f40-4137-b78f-ecc921a5d9a7.png"> Here's the `pandas.py` prototype: ```python from .db import Database as _Database, Table as _Table, View as _View import pandas as pd from typing import ( Iterable, Union, Optional, ) class Database(_Database): def query( self, sql: str, params: Optional[Union[Iterable, dict]] = None ) -> pd.DataFrame: return pd.DataFrame(super().query(sql, params)) def table(self, table_name: str, **kwargs) -> Union["Table", "View"]: "Return a table object, optionally configured with default options." klass = View if table_name in self.view_names() else Table return klass(self, table_name, **kwargs) class PandasQueryable: def rows_where( self, where: str = None, where_args: Optional[Union[Iterable, dict]] = None, order_by: str = None, select: str = "*", limit: int = None, offset: int = None, ) -> pd.DataFrame: return pd.DataFrame( super().rows_where( where, where_args, order_by=order_by, select=select, limit=limit, offset=offset, ) ) class Table(PandasQueryable, _Table): pass class View(PandasQueryable, _View): pass ``` | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1160182768 | |
https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1059646247 | https://api.github.com/repos/simonw/sqlite-utils/issues/412 | 1059646247 | IC_kwDOCGYnMM4_KOsn | 9599 | 2022-03-05T01:51:03Z | 2022-03-05T01:51:03Z | OWNER | I considered two ways of doing this. First, have methods such as `db.query_df()` and `table.rows_df` which do the same as `.query()` and `table.rows` but return a DataFrame instead of a generator of dictionaries. Second, have a compatibility class that is imported separately such as: ```python from sqlite_utils.pandas import Database ``` Then have the `.query()` and `.rows` and other similar methods return dataframes. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1160182768 | |
https://github.com/simonw/datasette/issues/1640#issuecomment-1059638778 | https://api.github.com/repos/simonw/datasette/issues/1640 | 1059638778 | IC_kwDOBm6k_c4_KM36 | 9599 | 2022-03-05T01:19:00Z | 2022-03-05T01:19:00Z | OWNER | The reason I implemented it like this was to support things like the `curl` progress bar if users decide to serve up large files using the `--static` mechanism. Here's the code that hooks it up to the URL resolver: https://github.com/simonw/datasette/blob/458f03ad3a454d271f47a643f4530bd8b60ddb76/datasette/app.py#L1001-L1005 Which uses this function: https://github.com/simonw/datasette/blob/a6ff123de5464806441f6a6f95145c9a83b7f20b/datasette/utils/asgi.py#L285-L310 One option here would be to support a workaround that looks something like this: `http://localhost:8001/my-static/log.txt?_unknown_size=1` The URL routing code could then look out for that `?_unknown_size=1` option and, if it's present, omit the `content-length` header entirely. It's a bit of a kludge, but it would be pretty straightforward to implement. Would that work for you @broccolihighkicks? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1148725876 | |
https://github.com/simonw/datasette/issues/1640#issuecomment-1059636420 | https://api.github.com/repos/simonw/datasette/issues/1640 | 1059636420 | IC_kwDOBm6k_c4_KMTE | 9599 | 2022-03-05T01:13:26Z | 2022-03-05T01:13:26Z | OWNER | Hah, this is certainly unexpected. It looks like this is the code in question: https://github.com/simonw/datasette/blob/a6ff123de5464806441f6a6f95145c9a83b7f20b/datasette/utils/asgi.py#L259-L266 You're right: it assumes that the file it is serving won't change length while it is serving it. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1148725876 | |
https://github.com/simonw/datasette/issues/1642#issuecomment-1059635969 | https://api.github.com/repos/simonw/datasette/issues/1642 | 1059635969 | IC_kwDOBm6k_c4_KMMB | 9599 | 2022-03-05T01:11:17Z | 2022-03-05T01:11:17Z | OWNER | `pip install datasette` in a fresh virtual environment doesn't show any warnings. Neither does `pip install -e '.'` in a fresh checkout. Or `pip install -e '.[test]'`. Closing this as can't reproduce. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1152072027 | |
https://github.com/simonw/datasette/issues/1645#issuecomment-1059634688 | https://api.github.com/repos/simonw/datasette/issues/1645 | 1059634688 | IC_kwDOBm6k_c4_KL4A | 9599 | 2022-03-05T01:06:08Z | 2022-03-05T01:06:08Z | OWNER | It sounds like you can work around this with Varnish configuration for the moment, but I'm going to bump this up the list of things to fix - it's particularly relevant now as I'd like to get a solution in place before Datasette 1.0, since it's likely to be beneficial to plugins and hence should be part of the stable, documented plugin interface. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1154399841 | |
https://github.com/simonw/datasette/issues/1645#issuecomment-1059634412 | https://api.github.com/repos/simonw/datasette/issues/1645 | 1059634412 | IC_kwDOBm6k_c4_KLzs | 9599 | 2022-03-05T01:04:53Z | 2022-03-05T01:04:53Z | OWNER | The existing `app_css_hash` already isn't good enough, because I built that before `table.js` existed, and that file should obviously be smartly cached too. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1154399841 | |
https://github.com/simonw/datasette/issues/1645#issuecomment-1059633902 | https://api.github.com/repos/simonw/datasette/issues/1645 | 1059633902 | IC_kwDOBm6k_c4_KLru | 9599 | 2022-03-05T01:03:06Z | 2022-03-05T01:03:06Z | OWNER | I agree: this is bad. Ideally, content served from `/static/` would apply best practices for static content serving - which to my mind means the following: - Where possible, serve with a far-future cache expiry header and use an asset URL that changes when the file itself changes - For assets without that, support conditional GET to avoid transferring the whole asset if it hasn't changed - Some kind of sensible mechanism for setting cache TTLs on assets that don't have a unique-file-per-version - in particular assets that might be served from plugins. Datasette half-implemented the first of these: if you view source on https://latest.datasette.io/ you'll see it links to `/-/static/app.css?cead5a` - which in the template looks like this: https://github.com/simonw/datasette/blob/dd94157f8958bdfe9f45575add934ccf1aba6d63/datasette/templates/base.html#L5 I had forgotten I had implemented this! Here is how it is calculated: https://github.com/simonw/datasette/blob/458f03ad3a454d271f47a643f4530bd8b60ddb76/datasette/app.py#L510-L516 So `app.css` right now could be safely served with a far-future cache header... only it isn't: ``` ~ % curl -i 'https://latest.datasette.io/-/static/app.css?cead5a' HTTP/2 200 content-type: text/css x-databases: _memory, _internal, fixtures, extra_database x-cloud-trace-context: 9ddc825620eb53d30fc127d1c750f342 date: Sat, 05 Mar 2022 01:01:53 GMT server: Google Frontend content-length: 16178 ``` The larger question though is what to do about other assets. I'm particularly interested in plugin assets, since visualization plugins like `datasette-vega` and `datasette-cluster-map` ship with large amounts of JavaScript and I'd really like that to be sensibly cached by default. 
| { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1154399841 |
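
The dash encoding rules quoted in the issue 1439 comments above (`-` becomes `--`, `.` becomes `-.`, `/` becomes `-/`, applied in that order and undone in reverse) can be sketched roughly as follows. This is a minimal illustration of those three rules only; the function names `dash_encode` and `dash_decode` are hypothetical and are not claimed to match Datasette's actual implementation, nor the later percent-style variant discussed in the same thread.

```python
def dash_encode(value: str) -> str:
    """Apply the three documented rules in order (hypothetical helper)."""
    # Double existing dashes first, so the dashes introduced by the
    # next two rules are never doubled themselves.
    return value.replace("-", "--").replace(".", "-.").replace("/", "-/")


def dash_decode(value: str) -> str:
    """Undo the rules in reverse order (hypothetical helper)."""
    return value.replace("-/", "/").replace("-.", ".").replace("--", "-")


if __name__ == "__main__":
    for name in ["foo/bar.csv", "table-name", "a-/b-.c"]:
        encoded = dash_encode(name)
        print(name, "->", encoded)
        # Each value should survive a round trip unchanged.
        assert dash_decode(encoded) == name
```

Because every `.` and `/` in the encoded form is preceded by its own escaping `-`, an encoded table name never contains a bare `/` or `.`, which is what lets the URL routing regex quoted earlier separate the table name from a trailing `.format` suffix.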