
issue_comments

46 rows where "updated_at" is on date 2022-03-21 sorted by updated_at descending


issue 14

  • Research: how much overhead does the n=1 time limit have? 11
  • Options for how `r.parsedate()` should handle invalid dates 8
  • Extract out `check_permissions()` from `BaseView` 7
  • Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply 4
  • Refactor and simplify Datasette routing and views 3
  • Reconsider ensure_permissions() logic, can it be less confusing? 3
  • Convert with `--multi` and `--dry-run` flag does not work 2
  • insert fails on JSONL with whitespace 2
  • Handle spatialite geometry columns better 1
  • Expose SANIC_RESPONSE_TIMEOUT config option in a sensible way 1
  • Stream all results for arbitrary SQL and canned queries 1
  • Ability to stream all rows as newline-delimited JSON 1
  • Remove `check_permission()` from `BaseView` 1
  • Make `check_visibility()` a documented API 1

user 2

  • simonw 45
  • blaine 1

author_association 2

  • OWNER 45
  • NONE 1
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1074479932 https://github.com/simonw/datasette/issues/339#issuecomment-1074479932 https://api.github.com/repos/simonw/datasette/issues/339 IC_kwDOBm6k_c5AC0M8 simonw 9599 2022-03-21T22:22:34Z 2022-03-21T22:22:34Z OWNER

Closing this as obsolete since Datasette no longer uses Sanic.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Expose SANIC_RESPONSE_TIMEOUT config option in a sensible way 340396247  
1074479768 https://github.com/simonw/datasette/issues/276#issuecomment-1074479768 https://api.github.com/repos/simonw/datasette/issues/276 IC_kwDOBm6k_c5AC0KY simonw 9599 2022-03-21T22:22:20Z 2022-03-21T22:22:20Z OWNER

I'm closing this issue because this is now solved by a number of neat plugins:

  • https://datasette.io/plugins/datasette-geojson-map shows the geometry from SpatiaLite columns on a map
  • https://datasette.io/plugins/datasette-leaflet-geojson can be used to display inline maps next to each column
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Handle spatialite geometry columns better 324835838  
1074478299 https://github.com/simonw/datasette/issues/1671#issuecomment-1074478299 https://api.github.com/repos/simonw/datasette/issues/1671 IC_kwDOBm6k_c5ACzzb simonw 9599 2022-03-21T22:20:26Z 2022-03-21T22:20:26Z OWNER

Thinking about options for fixing this...

The following query works fine:

```sql
select * from test_view where cast(has_expired as text) = '1'
```

I don't want to start using this for every query, because one of the goals of Datasette is to help people who are learning SQL:

- #1613

If someone clicks on "View and edit SQL" from a filtered table page I don't want them to have to wonder why that cast is there.

But... for querying views, the cast turns out to be necessary.

So one fix would be to get the SQL generating logic to use casts like this any time it is operating against a view.

An even better fix would be to detect which columns in a view come from a table and which ones might not, and only use casts for the columns that aren't definitely from a table.

The trick I was exploring here might be able to help with that: - #1293
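A minimal sketch of what that per-column cast logic could look like. `build_filter_clause` and its arguments are hypothetical illustrations, not Datasette's actual API:

```python
def build_filter_clause(column, value, is_view_column):
    # View columns are expressions with no type affinity, so compare as
    # text to make '1' and 1 behave consistently; leave table columns alone.
    if is_view_column:
        return f"cast([{column}] as text) = :value", {"value": str(value)}
    return f"[{column}] = :value", {"value": value}

clause, params = build_filter_clause("has_expired", 1, is_view_column=True)
# clause == "cast([has_expired] as text) = :value"
```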

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply 1174655187  
1074470568 https://github.com/simonw/datasette/issues/1671#issuecomment-1074470568 https://api.github.com/repos/simonw/datasette/issues/1671 IC_kwDOBm6k_c5ACx6o simonw 9599 2022-03-21T22:11:14Z 2022-03-21T22:12:49Z OWNER

I wonder if this will be a problem with generated columns, or with SQLite strict tables?

My hunch is that strict tables will continue to work without any changes, because https://www.sqlite.org/stricttables.html says nothing about their impact on comparison operations. I should test this to make absolutely sure though.

Generated columns have a type, so my hunch is they will continue to work fine too.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply 1174655187  
1074468450 https://github.com/simonw/datasette/issues/1671#issuecomment-1074468450 https://api.github.com/repos/simonw/datasette/issues/1671 IC_kwDOBm6k_c5ACxZi simonw 9599 2022-03-21T22:08:35Z 2022-03-21T22:10:00Z OWNER

Relevant section of the SQLite documentation: 3.2. Affinity Of Expressions:

When an expression is a simple reference to a column of a real table (not a VIEW or subquery) then the expression has the same affinity as the table column.

In your example, has_expired is no longer a simple reference to a column of a real table, hence the bug.

Then 4.2. Type Conversions Prior To Comparison fills in the rest:

SQLite may attempt to convert values between the storage classes INTEGER, REAL, and/or TEXT before performing a comparison. Whether or not any conversions are attempted before the comparison takes place depends on the type affinity of the operands.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply 1174655187  
1074465536 https://github.com/simonw/datasette/issues/1671#issuecomment-1074465536 https://api.github.com/repos/simonw/datasette/issues/1671 IC_kwDOBm6k_c5ACwsA simonw 9599 2022-03-21T22:04:31Z 2022-03-21T22:04:31Z OWNER

Oh this is fascinating! I replicated the bug (thanks for the steps to reproduce) and it looks like this is down to the following:

Against views, where has_expired = 1 returns different results from where has_expired = '1'

This doesn't happen against tables because of SQLite's type affinity mechanism, which handles the type conversion automatically.
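A minimal reproduction of that difference, using a hypothetical schema (not the reporter's actual tables):

```python
import sqlite3

# A view with a calculated integer column - the expression has no
# type affinity, unlike a real table column.
db = sqlite3.connect(":memory:")
db.executescript("""
create table subscriptions (id integer primary key, expires_at integer);
insert into subscriptions values (1, 100), (2, 200);
create view test_view as
    select id, (expires_at < 150) as has_expired from subscriptions;
""")

# Integer comparison matches the calculated column:
print(db.execute(
    "select count(*) from test_view where has_expired = 1"
).fetchone()[0])  # 1

# Text comparison silently matches nothing - no affinity, no conversion:
print(db.execute(
    "select count(*) from test_view where has_expired = '1'"
).fetchone()[0])  # 0
```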

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply 1174655187  
1074459746 https://github.com/simonw/datasette/issues/1679#issuecomment-1074459746 https://api.github.com/repos/simonw/datasette/issues/1679 IC_kwDOBm6k_c5ACvRi simonw 9599 2022-03-21T21:55:45Z 2022-03-21T21:55:45Z OWNER

I'm going to change the original logic to set n=1 for times that are <= 20ms - and update the comments to make it more obvious what is happening.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: how much overhead does the n=1 time limit have? 1175854982  
1074458506 https://github.com/simonw/datasette/issues/1679#issuecomment-1074458506 https://api.github.com/repos/simonw/datasette/issues/1679 IC_kwDOBm6k_c5ACu-K simonw 9599 2022-03-21T21:53:47Z 2022-03-21T21:53:47Z OWNER

Oh interesting, it turns out there is ONE place in the code that sets the ms to less than 20 - this test fixture: https://github.com/simonw/datasette/blob/4e47a2d894b96854348343374c8e97c9d7055cf6/tests/fixtures.py#L224-L226

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: how much overhead does the n=1 time limit have? 1175854982  
1074454687 https://github.com/simonw/datasette/issues/1679#issuecomment-1074454687 https://api.github.com/repos/simonw/datasette/issues/1679 IC_kwDOBm6k_c5ACuCf simonw 9599 2022-03-21T21:48:02Z 2022-03-21T21:48:02Z OWNER

Here's another microbenchmark that measures how many nanoseconds it takes to run 1,000 vmops:

```python
import sqlite3
import time

db = sqlite3.connect(":memory:")

i = 0
out = []

def count():
    global i
    i += 1000
    out.append((i, time.perf_counter_ns()))

db.set_progress_handler(count, 1000)

print("Start:", time.perf_counter_ns())
all_rows = db.execute("""
with recursive counter(x) as (
    select 0
    union
    select x + 1 from counter
)
select * from counter limit 10000;
""").fetchall()
print("End:", time.perf_counter_ns())

print()
print("So how long does it take to execute 1000 ops?")

prev_time_ns = None
for i, time_ns in out:
    if prev_time_ns is not None:
        print(time_ns - prev_time_ns, "ns")
    prev_time_ns = time_ns
```

Running it:

```
% python nanobench.py
Start: 330877620374821
End: 330877632515822

So how long does it take to execute 1000 ops?
47290 ns
49573 ns
48226 ns
45674 ns
53238 ns
47313 ns
52346 ns
48689 ns
47092 ns
87596 ns
69999 ns
52522 ns
52809 ns
53259 ns
52478 ns
53478 ns
65812 ns
```

87596ns is 0.087596ms - so even a measurement rate of every 1,000 ops is easily fine-grained enough to capture differences of less than 0.1ms.

If anything I could bump that default 1000 up - and I can definitely eliminate the `if ms < 50` branch entirely.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: how much overhead does the n=1 time limit have? 1175854982  
1074446576 https://github.com/simonw/datasette/issues/1679#issuecomment-1074446576 https://api.github.com/repos/simonw/datasette/issues/1679 IC_kwDOBm6k_c5ACsDw simonw 9599 2022-03-21T21:38:27Z 2022-03-21T21:38:27Z OWNER

OK here's a microbenchmark script:

```python
import sqlite3
import timeit

db = sqlite3.connect(":memory:")
db_with_progress_handler_1 = sqlite3.connect(":memory:")
db_with_progress_handler_1000 = sqlite3.connect(":memory:")

db_with_progress_handler_1.set_progress_handler(lambda: None, 1)
db_with_progress_handler_1000.set_progress_handler(lambda: None, 1000)

def execute_query(db):
    cursor = db.execute("""
    with recursive counter(x) as (
        select 0
        union
        select x + 1 from counter
    )
    select * from counter limit 10000;
    """)
    list(cursor.fetchall())

print("Without progress_handler")
print(timeit.timeit(lambda: execute_query(db), number=100))

print("progress_handler every 1000 ops")
print(timeit.timeit(lambda: execute_query(db_with_progress_handler_1000), number=100))

print("progress_handler every 1 op")
print(timeit.timeit(lambda: execute_query(db_with_progress_handler_1), number=100))
```

Results:

```
% python3 bench.py
Without progress_handler
0.8789225700311363
progress_handler every 1000 ops
0.8829826560104266
progress_handler every 1 op
2.8892734259716235
```

So running every 1000 ops makes almost no difference at all, but running every single op is a 3.2x performance degradation.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: how much overhead does the n=1 time limit have? 1175854982  
1074439309 https://github.com/simonw/datasette/issues/1679#issuecomment-1074439309 https://api.github.com/repos/simonw/datasette/issues/1679 IC_kwDOBm6k_c5ACqSN simonw 9599 2022-03-21T21:28:58Z 2022-03-21T21:28:58Z OWNER

David Raymond solved it there: https://sqlite.org/forum/forumpost/330c8532d8a88bcd

Don't forget to step through the results. All .execute() has done is prepared it.

db.execute(query).fetchall()

Sure enough, adding that gets the VM steps number up to 190,007 which is close enough that I'm happy.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: how much overhead does the n=1 time limit have? 1175854982  
1074378472 https://github.com/simonw/datasette/issues/1676#issuecomment-1074378472 https://api.github.com/repos/simonw/datasette/issues/1676 IC_kwDOBm6k_c5ACbbo simonw 9599 2022-03-21T20:18:10Z 2022-03-21T20:18:10Z OWNER

Maybe there is a better name for this method that helps emphasize its cascading nature.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Reconsider ensure_permissions() logic, can it be less confusing? 1175690070  
1074347023 https://github.com/simonw/datasette/issues/1679#issuecomment-1074347023 https://api.github.com/repos/simonw/datasette/issues/1679 IC_kwDOBm6k_c5ACTwP simonw 9599 2022-03-21T19:48:59Z 2022-03-21T19:48:59Z OWNER

Posed a question about that here: https://sqlite.org/forum/forumpost/de9ff10fa7

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: how much overhead does the n=1 time limit have? 1175854982  
1074341924 https://github.com/simonw/datasette/issues/1679#issuecomment-1074341924 https://api.github.com/repos/simonw/datasette/issues/1679 IC_kwDOBm6k_c5ACSgk simonw 9599 2022-03-21T19:42:08Z 2022-03-21T19:42:08Z OWNER

Here's the Python-C implementation of set_progress_handler: https://github.com/python/cpython/blob/4674fd4e938eb4a29ccd5b12c15455bd2a41c335/Modules/_sqlite/connection.c#L1177-L1201

It calls sqlite3_progress_handler(self->db, n, progress_callback, ctx);

https://www.sqlite.org/c3ref/progress_handler.html says:

The parameter N is the approximate number of virtual machine instructions that are evaluated between successive invocations of the callback X

So maybe VM-steps and virtual machine instructions are different things?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: how much overhead does the n=1 time limit have? 1175854982  
1074337997 https://github.com/simonw/datasette/issues/1679#issuecomment-1074337997 https://api.github.com/repos/simonw/datasette/issues/1679 IC_kwDOBm6k_c5ACRjN simonw 9599 2022-03-21T19:37:08Z 2022-03-21T19:37:08Z OWNER

This is weird:

```python
import sqlite3

db = sqlite3.connect(":memory:")

i = 0

def count():
    global i
    i += 1

db.set_progress_handler(count, 1)

db.execute("""
with recursive counter(x) as (
    select 0
    union
    select x + 1 from counter
)
select * from counter limit 10000;
""")

print(i)
```

Outputs `24`. But if you try the same thing in the SQLite console:

```
sqlite> .stats vmstep
sqlite> with recursive counter(x) as (
   ...>   select 0
   ...>   union
   ...>   select x + 1 from counter
   ...> )
   ...> select * from counter limit 10000;
...
VM-steps: 200007
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: how much overhead does the n=1 time limit have? 1175854982  
1074332718 https://github.com/simonw/datasette/issues/1679#issuecomment-1074332718 https://api.github.com/repos/simonw/datasette/issues/1679 IC_kwDOBm6k_c5ACQQu simonw 9599 2022-03-21T19:31:10Z 2022-03-21T19:31:10Z OWNER

How long does it take for SQLite to execute 1000 opcodes anyway?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: how much overhead does the n=1 time limit have? 1175854982  
1074332325 https://github.com/simonw/datasette/issues/1679#issuecomment-1074332325 https://api.github.com/repos/simonw/datasette/issues/1679 IC_kwDOBm6k_c5ACQKl simonw 9599 2022-03-21T19:30:44Z 2022-03-21T19:30:44Z OWNER

So it looks like even for facet suggestion n=1000 always - it's never reduced to n=1.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: how much overhead does the n=1 time limit have? 1175854982  
1074331743 https://github.com/simonw/datasette/issues/1679#issuecomment-1074331743 https://api.github.com/repos/simonw/datasette/issues/1679 IC_kwDOBm6k_c5ACQBf simonw 9599 2022-03-21T19:30:05Z 2022-03-21T19:30:05Z OWNER

https://github.com/simonw/datasette/blob/1a7750eb29fd15dd2eea3b9f6e33028ce441b143/datasette/app.py#L118-L122 sets it to 50ms for facet suggestion, but that's not going to pass `ms < 50`:

```python
Setting(
    "facet_suggest_time_limit_ms",
    50,
    "Time limit for calculating a suggested facet",
),
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research: how much overhead does the n=1 time limit have? 1175854982  
1074321862 https://github.com/simonw/datasette/issues/1660#issuecomment-1074321862 https://api.github.com/repos/simonw/datasette/issues/1660 IC_kwDOBm6k_c5ACNnG simonw 9599 2022-03-21T19:19:01Z 2022-03-21T19:19:01Z OWNER

I've simplified this a ton now. I'm going to keep working on this in the long-term but I think this issue can be closed.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Refactor and simplify Datasette routing and views 1170144879  
1074302559 https://github.com/simonw/datasette/issues/1678#issuecomment-1074302559 https://api.github.com/repos/simonw/datasette/issues/1678 IC_kwDOBm6k_c5ACI5f simonw 9599 2022-03-21T19:04:03Z 2022-03-21T19:04:03Z OWNER

Documentation: https://docs.datasette.io/en/latest/internals.html#await-check-visibility-actor-action-resource-none

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Make `check_visibility()` a documented API 1175715988  
1074287177 https://github.com/simonw/datasette/issues/1660#issuecomment-1074287177 https://api.github.com/repos/simonw/datasette/issues/1660 IC_kwDOBm6k_c5ACFJJ simonw 9599 2022-03-21T18:51:42Z 2022-03-21T18:51:42Z OWNER

BaseView is looking a LOT slimmer now that I've moved all of the permissions stuff out of it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Refactor and simplify Datasette routing and views 1170144879  
1074256603 https://github.com/simonw/sqlite-utils/issues/417#issuecomment-1074256603 https://api.github.com/repos/simonw/sqlite-utils/issues/417 IC_kwDOCGYnMM5AB9rb blaine 9954 2022-03-21T18:19:41Z 2022-03-21T18:19:41Z NONE

That makes sense; just a little hint that points folks towards doing the right thing might be helpful!

fwiw, the reason I was using jq in the first place was just a quick way to extract one attribute from an actual JSON array. When I initially imported it, I got a table with a bunch of embedded JSON values, rather than a native table, because each array entry had two attributes, one with the data I actually wanted. Not sure how common a use-case this is, though (and easily fixed, aside from the jq weirdness!)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
insert fails on JSONL with whitespace 1175744654  
1074243540 https://github.com/simonw/sqlite-utils/issues/417#issuecomment-1074243540 https://api.github.com/repos/simonw/sqlite-utils/issues/417 IC_kwDOCGYnMM5AB6fU simonw 9599 2022-03-21T18:08:03Z 2022-03-21T18:08:03Z OWNER

I've not really thought about standards as much here as I should. It looks like there are two competing specs for newline-delimited JSON!

http://ndjson.org/ is the one I've been using in sqlite-utils - and https://github.com/ndjson/ndjson-spec#31-serialization says:

The JSON texts MUST NOT contain newlines or carriage returns.

https://jsonlines.org/ is the other one. It is slightly less clear, but it does say this:

  1. Each Line is a Valid JSON Value

The most common values will be objects or arrays, but any JSON value is permitted.

My interpretation of both of these is that newlines in the middle of a JSON object shouldn't be allowed.

So what's jq doing here? It looks to me like that jq format is its own thing - it's not actually compatible with either of those two loose specs described above.

The jq docs seem to call this "whitespace-separated JSON": https://stedolan.github.io/jq/manual/v1.6/#Invokingjq

The thing I like about newline-delimited JSON is that it's really trivial to parse - loop through each line, run it through json.loads() and that's it. No need to try and unwrap JSON objects that might span multiple lines.

Unless someone has written a robust Python implementation of a jq-compatible whitespace-separated JSON parser, I'm inclined to leave this as is. I'd be fine adding some documentation that helps point people towards jq -c though.
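To illustrate how trivial that parsing is, here's a sketch (`parse_ndjson` is an illustrative helper, not part of sqlite-utils):

```python
import json

def parse_ndjson(text):
    # One JSON value per line; skip blank lines. A value that was
    # pretty-printed across multiple lines makes json.loads() raise,
    # surfacing the problem immediately.
    return [json.loads(line) for line in text.splitlines() if line.strip()]

rows = parse_ndjson('{"id": 1, "name": "one"}\n{"id": 2, "name": "two"}\n')
# rows == [{"id": 1, "name": "one"}, {"id": 2, "name": "two"}]
```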

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
insert fails on JSONL with whitespace 1175744654  
1074184240 https://github.com/simonw/datasette/issues/1677#issuecomment-1074184240 https://api.github.com/repos/simonw/datasette/issues/1677 IC_kwDOBm6k_c5ABsAw simonw 9599 2022-03-21T17:20:17Z 2022-03-21T17:20:17Z OWNER

https://github.com/simonw/datasette/blob/e627510b760198ccedba9e5af47a771e847785c9/datasette/views/base.py#L69-L77

This is weirdly different from how check_permissions() used to work, in that it doesn't differentiate between None and False.

https://github.com/simonw/datasette/blob/4a4164b81191dec35e423486a208b05a9edc65e4/datasette/views/base.py#L79-L103

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Remove `check_permission()` from `BaseView` 1175694248  
1074180312 https://github.com/simonw/datasette/issues/1676#issuecomment-1074180312 https://api.github.com/repos/simonw/datasette/issues/1676 IC_kwDOBm6k_c5ABrDY simonw 9599 2022-03-21T17:16:45Z 2022-03-21T17:16:45Z OWNER

When looking at this code earlier I assumed that the following would check each permission in turn and fail if any of them failed:

```python
await self.ds.ensure_permissions(
    request.actor,
    [
        ("view-table", (database, table)),
        ("view-database", database),
        "view-instance",
    ]
)
```

But it's not quite that simple: if any of them fail, it fails... but if an earlier one returns True the whole stack passes even if there would have been a failure later on!

If that is indeed the right abstraction, I need to work to make the documentation as clear as possible.
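A rough sketch of those semantics with the permission check stubbed out - this illustrates the cascading behaviour described above, it is not Datasette's implementation:

```python
class Forbidden(Exception):
    pass

def ensure_permissions(actor, permissions, permission_allowed):
    # Walk the checks in order: an explicit True grants the whole stack,
    # an explicit False raises, and None falls through to the next check.
    # (What happens when every check returns None is left out of this sketch.)
    for permission in permissions:
        action, resource = (
            permission if isinstance(permission, tuple) else (permission, None)
        )
        result = permission_allowed(actor, action, resource)
        if result is True:
            return  # an earlier allow masks any later denial
        if result is False:
            raise Forbidden(action)

# An early True means a later False is never even consulted:
checks = [("view-table", ("db", "t")), ("view-database", "db")]
ensure_permissions(
    {"id": "root"},
    checks,
    lambda actor, action, resource: action == "view-table",
)  # passes, even though view-database would have been denied
```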

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Reconsider ensure_permissions() logic, can it be less confusing? 1175690070  
1074178865 https://github.com/simonw/datasette/issues/1676#issuecomment-1074178865 https://api.github.com/repos/simonw/datasette/issues/1676 IC_kwDOBm6k_c5ABqsx simonw 9599 2022-03-21T17:15:27Z 2022-03-21T17:15:27Z OWNER

This method here: https://github.com/simonw/datasette/blob/e627510b760198ccedba9e5af47a771e847785c9/datasette/app.py#L632-L664

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Reconsider ensure_permissions() logic, can it be less confusing? 1175690070  
1074177827 https://github.com/simonw/datasette/issues/1675#issuecomment-1074177827 https://api.github.com/repos/simonw/datasette/issues/1675 IC_kwDOBm6k_c5ABqcj simonw 9599 2022-03-21T17:14:31Z 2022-03-21T17:14:31Z OWNER

Updated documentation: https://github.com/simonw/datasette/blob/e627510b760198ccedba9e5af47a771e847785c9/docs/internals.rst#await-ensure_permissionsactor-permissions

This method allows multiple permissions to be checked at once. It raises a datasette.Forbidden exception if any of the checks are denied before one of them is explicitly granted.

This is useful when you need to check multiple permissions at once. For example, an actor should be able to view a table if either one of the following checks returns True or not a single one of them returns False:

That's pretty hard to understand! I'm going to open a separate issue to reconsider if this is a useful enough abstraction given how confusing it is.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Extract out `check_permissions()` from `BaseView` 1175648453
1074161523 https://github.com/simonw/datasette/issues/1675#issuecomment-1074161523 https://api.github.com/repos/simonw/datasette/issues/1675 IC_kwDOBm6k_c5ABmdz simonw 9599 2022-03-21T16:59:55Z 2022-03-21T17:00:03Z OWNER

Also calling that function permissions_allowed() is confusing because there is a plugin hook with a similar name already: https://docs.datasette.io/en/stable/plugin_hooks.html#permission-allowed-datasette-actor-action-resource

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Extract out `check_permissions()` from `BaseView` 1175648453
1074158890 https://github.com/simonw/datasette/issues/1675#issuecomment-1074158890 https://api.github.com/repos/simonw/datasette/issues/1675 IC_kwDOBm6k_c5ABl0q simonw 9599 2022-03-21T16:57:15Z 2022-03-21T16:57:15Z OWNER

Idea: ds.permission_allowed() continues to just return True or False.

A new ds.ensure_permissions(...) method is added which raises a Forbidden exception if a check fails (hence the different name).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Extract out `check_permissions()` from `BaseView` 1175648453
1074156779 https://github.com/simonw/datasette/issues/1675#issuecomment-1074156779 https://api.github.com/repos/simonw/datasette/issues/1675 IC_kwDOBm6k_c5ABlTr simonw 9599 2022-03-21T16:55:08Z 2022-03-21T16:56:02Z OWNER

One benefit of the current design of check_permissions that raises an exception is that the exception includes information on WHICH of the permission checks failed. Returning just True or False loses that information.

I could return an object which evaluates to False but also carries extra information? Bit weird, I've never seen anything like that in other Python code.
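For what it's worth, an object that evaluates to False but carries extra information is possible via `__bool__` - a hypothetical sketch:

```python
class PermissionResult:
    """Truthy/falsy like a bool, but remembers which check failed."""

    def __init__(self, allowed, failed_action=None):
        self.allowed = allowed
        self.failed_action = failed_action

    def __bool__(self):
        # Lets the object be used directly in "if not result:" checks
        return self.allowed

    def __repr__(self):
        if self.allowed:
            return "<PermissionResult: allowed>"
        return f"<PermissionResult: denied on {self.failed_action!r}>"

result = PermissionResult(False, failed_action="view-table")
if not result:
    print(result.failed_action)  # view-table
```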

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Extract out `check_permissions()` from `BaseView` 1175648453
1074143209 https://github.com/simonw/datasette/issues/1675#issuecomment-1074143209 https://api.github.com/repos/simonw/datasette/issues/1675 IC_kwDOBm6k_c5ABh_p simonw 9599 2022-03-21T16:46:05Z 2022-03-21T16:46:05Z OWNER

The other difference though is that ds.permission_allowed(...) works against an actor, while check_permission() works against a request (though just to access request.actor).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Extract out `check_permissions()` from `BaseView` 1175648453
1074142617 https://github.com/simonw/datasette/issues/1675#issuecomment-1074142617 https://api.github.com/repos/simonw/datasette/issues/1675 IC_kwDOBm6k_c5ABh2Z simonw 9599 2022-03-21T16:45:27Z 2022-03-21T16:45:27Z OWNER

Though at that point check_permission is such a light wrapper around self.ds.permission_allowed() that there's little point in it existing at all.

So maybe check_permissions() becomes ds.permissions_allowed().

permission_allowed() v.s. permissions_allowed() is a bit of a subtle naming difference, but I think it works.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Extract out `check_permissions()` from `BaseView` 1175648453
1074141457 https://github.com/simonw/datasette/issues/1675#issuecomment-1074141457 https://api.github.com/repos/simonw/datasette/issues/1675 IC_kwDOBm6k_c5ABhkR simonw 9599 2022-03-21T16:44:09Z 2022-03-21T16:44:09Z OWNER

A slightly odd thing about these methods is that they either fail silently or they raise a Forbidden exception.

Maybe they should instead return True or False and the calling code could decide if it wants to raise the exception? That would make them more usable and a little less surprising.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Extract out `check_permissions()` from `BaseView` 1175648453
1074136176 https://github.com/simonw/datasette/issues/1660#issuecomment-1074136176 https://api.github.com/repos/simonw/datasette/issues/1660 IC_kwDOBm6k_c5ABgRw simonw 9599 2022-03-21T16:38:46Z 2022-03-21T16:38:46Z OWNER

I'm going to refactor this stuff out and document it so it can be easily used by plugins:

https://github.com/simonw/datasette/blob/4a4164b81191dec35e423486a208b05a9edc65e4/datasette/views/base.py#L69-L103

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Refactor and simplify Datasette routing and views 1170144879  
1074019047 https://github.com/simonw/datasette/issues/526#issuecomment-1074019047 https://api.github.com/repos/simonw/datasette/issues/526 IC_kwDOBm6k_c5ABDrn simonw 9599 2022-03-21T15:09:56Z 2022-03-21T15:09:56Z OWNER

I should research how much overhead creating a new connection costs - it may be that an easy way to solve this is to create a dedicated connection for the query and then close that connection at the end.
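A quick, rough way to measure that overhead (the temporary file stands in for a real on-disk database):

```python
import os
import sqlite3
import tempfile
import timeit

# Amortize connection open/close cost over 1,000 iterations against
# an on-disk database file.
fd, path = tempfile.mkstemp(suffix=".db")
os.close(fd)
sqlite3.connect(path).close()  # make sure the file is a valid database

seconds = timeit.timeit(lambda: sqlite3.connect(path).close(), number=1000)
print(f"{seconds / 1000 * 1_000_000:.1f} µs per connection")
os.remove(path)
```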

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Stream all results for arbitrary SQL and canned queries 459882902  
1074017633 https://github.com/simonw/datasette/issues/1177#issuecomment-1074017633 https://api.github.com/repos/simonw/datasette/issues/1177 IC_kwDOBm6k_c5ABDVh simonw 9599 2022-03-21T15:08:51Z 2022-03-21T15:08:51Z OWNER

Related: - #1062

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Ability to stream all rows as newline-delimited JSON 780153562  
1073468996 https://github.com/simonw/sqlite-utils/issues/415#issuecomment-1073468996 https://api.github.com/repos/simonw/sqlite-utils/issues/415 IC_kwDOCGYnMM4_-9ZE simonw 9599 2022-03-21T04:14:42Z 2022-03-21T04:14:42Z OWNER

I can fix this like so:

```
% sqlite-utils convert demo.db demo foo '{"foo": "bar"}' --multi --dry-run
abc
 --- becomes:
{"foo": "bar"}

Would affect 1 row
```

Diff is this:

```diff
diff --git a/sqlite_utils/cli.py b/sqlite_utils/cli.py
index 0cf0468..b2a0440 100644
--- a/sqlite_utils/cli.py
+++ b/sqlite_utils/cli.py
@@ -2676,7 +2676,10 @@ def convert(
         raise click.ClickException(str(e))
     if dry_run:
         # Pull first 20 values for first column and preview them
-        db.conn.create_function("preview_transform", 1, lambda v: fn(v) if v else v)
+        preview = lambda v: fn(v) if v else v
+        if multi:
+            preview = lambda v: json.dumps(fn(v), default=repr) if v else v
+        db.conn.create_function("preview_transform", 1, preview)
         sql = """
             select
                 [{column}] as value,
```

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 1,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Convert with `--multi` and `--dry-run` flag does not work 1171599874  
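The fix can be exercised on its own: in `--multi` mode the conversion function returns a dict, which a SQLite user-defined function cannot return directly, so the dry-run preview has to serialize it with `json.dumps` first. A standalone sketch of the two preview paths from the diff (the `fn` stand-in for the user-supplied conversion code is hypothetical):

```python
import json
import sqlite3

def fn(value):
    # Stand-in for the user-supplied --multi conversion code,
    # which returns a dict of new column values
    return {"foo": "bar"}

multi = True

# Mirrors the patched preview_transform logic: only --multi mode
# needs the json.dumps wrapper
if multi:
    preview = lambda v: json.dumps(fn(v), default=repr) if v else v
else:
    preview = lambda v: fn(v) if v else v

conn = sqlite3.connect(":memory:")
conn.create_function("preview_transform", 1, preview)
print(conn.execute("select preview_transform('abc')").fetchone()[0])
# → {"foo": "bar"}
```

Without the `json.dumps` wrapper, SQLite would reject the dict return value and the dry run would error out.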
1073463375 https://github.com/simonw/sqlite-utils/issues/415#issuecomment-1073463375 https://api.github.com/repos/simonw/sqlite-utils/issues/415 IC_kwDOCGYnMM4_-8BP simonw 9599 2022-03-21T04:02:36Z 2022-03-21T04:02:36Z OWNER

Thanks for the really clear steps to reproduce!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Convert with `--multi` and `--dry-run` flag does not work 1171599874  
1073456222 https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073456222 https://api.github.com/repos/simonw/sqlite-utils/issues/416 IC_kwDOCGYnMM4_-6Re simonw 9599 2022-03-21T03:45:52Z 2022-03-21T03:45:52Z OWNER

Needs tests and documentation.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Options for how `r.parsedate()` should handle invalid dates 1173023272  
1073456155 https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073456155 https://api.github.com/repos/simonw/sqlite-utils/issues/416 IC_kwDOCGYnMM4_-6Qb simonw 9599 2022-03-21T03:45:37Z 2022-03-21T03:45:37Z OWNER

Prototype:

```diff
diff --git a/sqlite_utils/cli.py b/sqlite_utils/cli.py
index 8255b56..0a3693e 100644
--- a/sqlite_utils/cli.py
+++ b/sqlite_utils/cli.py
@@ -2583,7 +2583,11 @@ def generate_convert_help():
         """
     ).strip()
     recipe_names = [
-        n for n in dir(recipes) if not n.startswith("_") and n not in ("json", "parser")
+        n
+        for n in dir(recipes)
+        if not n.startswith("_")
+        and n not in ("json", "parser")
+        and callable(getattr(recipes, n))
     ]
     for name in recipe_names:
         fn = getattr(recipes, name)
diff --git a/sqlite_utils/recipes.py b/sqlite_utils/recipes.py
index 6918661..569c30d 100644
--- a/sqlite_utils/recipes.py
+++ b/sqlite_utils/recipes.py
@@ -1,17 +1,38 @@
 from dateutil import parser
 import json
 
+IGNORE = object()
+SET_NULL = object()
+
 
-def parsedate(value, dayfirst=False, yearfirst=False):
+def parsedate(value, dayfirst=False, yearfirst=False, errors=None):
     "Parse a date and convert it to ISO date format: yyyy-mm-dd"
-    return (
-        parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst).date().isoformat()
-    )
+    try:
+        return (
+            parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst)
+            .date()
+            .isoformat()
+        )
+    except parser.ParserError:
+        if errors is IGNORE:
+            return value
+        elif errors is SET_NULL:
+            return None
+        else:
+            raise
 
 
-def parsedatetime(value, dayfirst=False, yearfirst=False):
+def parsedatetime(value, dayfirst=False, yearfirst=False, errors=None):
     "Parse a datetime and convert it to ISO datetime format: yyyy-mm-ddTHH:MM:SS"
-    return parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst).isoformat()
+    try:
+        return parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst).isoformat()
+    except parser.ParserError:
+        if errors is IGNORE:
+            return value
+        elif errors is SET_NULL:
+            return None
+        else:
+            raise
 
 
 def jsonsplit(value, delimiter=",", type=str):
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Options for how `r.parsedate()` should handle invalid dates 1173023272  
1073455905 https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073455905 https://api.github.com/repos/simonw/sqlite-utils/issues/416 IC_kwDOCGYnMM4_-6Mh simonw 9599 2022-03-21T03:44:47Z 2022-03-21T03:45:00Z OWNER

This is quite nice:

```
% sqlite-utils convert test-dates.db dates date "r.parsedate(value, errors=r.IGNORE)"
[####################################]  100%
% sqlite-utils rows test-dates.db dates
[{"id": 1, "date": "2016-03-15"}, {"id": 2, "date": "2016-03-16"}, {"id": 3, "date": "2016-03-17"}, {"id": 4, "date": "2016-03-18"}, {"id": 5, "date": "2016-03-19"}, {"id": 6, "date": "2016-03-20"}, {"id": 7, "date": "2016-03-21"}, {"id": 8, "date": "2016-03-22"}, {"id": 9, "date": "2016-03-23"}, {"id": 10, "date": "//"}, {"id": 11, "date": "2016-03-25"}, {"id": 12, "date": "2016-03-26"}, {"id": 13, "date": "2016-03-27"}, {"id": 14, "date": "2016-03-28"}, {"id": 15, "date": "2016-03-29"}, {"id": 16, "date": "2016-03-30"}, {"id": 17, "date": "2016-03-31"}, {"id": 18, "date": "2016-04-01"}]
% sqlite-utils convert test-dates.db dates date "r.parsedate(value, errors=r.SET_NULL)"
[####################################]  100%
% sqlite-utils rows test-dates.db dates
[{"id": 1, "date": "2016-03-15"}, {"id": 2, "date": "2016-03-16"}, {"id": 3, "date": "2016-03-17"}, {"id": 4, "date": "2016-03-18"}, {"id": 5, "date": "2016-03-19"}, {"id": 6, "date": "2016-03-20"}, {"id": 7, "date": "2016-03-21"}, {"id": 8, "date": "2016-03-22"}, {"id": 9, "date": "2016-03-23"}, {"id": 10, "date": null}, {"id": 11, "date": "2016-03-25"}, {"id": 12, "date": "2016-03-26"}, {"id": 13, "date": "2016-03-27"}, {"id": 14, "date": "2016-03-28"}, {"id": 15, "date": "2016-03-29"}, {"id": 16, "date": "2016-03-30"}, {"id": 17, "date": "2016-03-31"}, {"id": 18, "date": "2016-04-01"}]
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Options for how `r.parsedate()` should handle invalid dates 1173023272  
1073453370 https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073453370 https://api.github.com/repos/simonw/sqlite-utils/issues/416 IC_kwDOCGYnMM4_-5k6 simonw 9599 2022-03-21T03:41:06Z 2022-03-21T03:41:06Z OWNER

I'm going to try the `errors=r.IGNORE` option and see what that looks like once implemented.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Options for how `r.parsedate()` should handle invalid dates 1173023272  
1073453230 https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073453230 https://api.github.com/repos/simonw/sqlite-utils/issues/416 IC_kwDOCGYnMM4_-5iu simonw 9599 2022-03-21T03:40:37Z 2022-03-21T03:40:37Z OWNER

I think the options here should be:

  • On error, raise an exception and revert the transaction (the current default)
  • On error, leave the value as-is
  • On error, set the value to None

These need to be indicated by parameters to the `r.parsedate()` function.

Some design options:

  • `ignore=True` to ignore errors - but how does it know if it should leave the value or set it to `None`? This is similar to other `ignore=True` parameters elsewhere in the Python API.
  • `errors="ignore"`, `errors="set-null"` - I don't like magic string values very much, but this is similar to Python's `str.encode(errors=)` mechanism.
  • `errors=r.IGNORE` - using constants, which at least avoids magic strings. The other one could be `errors=r.SET_NULL`.
  • `error=lambda v: None` or `error=lambda v: v` - this is a bit confusing though, introducing another callback that gets to have a go at converting the error if the first callback failed? And what happens if that lambda itself raises an error?
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Options for how `r.parsedate()` should handle invalid dates 1173023272  
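The sentinel-constant option can be sketched with module-level `object()` instances, which allow cheap identity comparison without magic strings. This sketch substitutes `datetime.strptime` for dateutil to stay self-contained, so the fixed parsing format is an assumption:

```python
from datetime import datetime

# Sentinel constants - identity comparison with `is` avoids magic strings
IGNORE = object()
SET_NULL = object()

def parsedate(value, errors=None):
    """Parse a date to ISO format; sentinels control error handling."""
    try:
        return datetime.strptime(value, "%d %B %Y").date().isoformat()
    except ValueError:
        if errors is IGNORE:
            return value       # leave the original value untouched
        elif errors is SET_NULL:
            return None        # store NULL instead
        raise                  # default: propagate, reverting the transaction

print(parsedate("10 October 2019"))      # → 2019-10-10
print(parsedate("//", errors=IGNORE))    # → //
print(parsedate("//", errors=SET_NULL))  # → None
```

Because the sentinels are compared by identity, a caller cannot accidentally trigger a mode with a stray string value.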
1073451659 https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073451659 https://api.github.com/repos/simonw/sqlite-utils/issues/416 IC_kwDOCGYnMM4_-5KL simonw 9599 2022-03-21T03:35:01Z 2022-03-21T03:35:01Z OWNER

I confirmed that if it fails for any value, ALL values are left alone, since it runs in a transaction.

Here's the code that does that:

https://github.com/simonw/sqlite-utils/blob/433813612ff9b4b501739fd7543bef0040dd51fe/sqlite_utils/db.py#L2523-L2526

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Options for how `r.parsedate()` should handle invalid dates 1173023272  
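That all-or-nothing behavior can be demonstrated directly with `sqlite3`: if the user-defined function raises for any row, the whole UPDATE fails and no rows change. A minimal sketch (the table contents and failing value are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table dates (id integer primary key, date text);
    insert into dates (date) values ('2016-03-15'), ('//'), ('2016-03-17');
""")

def convert(value):
    if value == "//":
        raise ValueError("bad date")  # simulate a parse failure
    return value.replace("-", "/")

conn.create_function("convert_value", 1, convert)

try:
    with conn:  # transaction: commits on success, rolls back on exception
        conn.execute("update dates set date = convert_value(date)")
except sqlite3.Error:
    pass  # the UDF exception surfaces as a sqlite3 error

# Every value is left alone - the one failing row reverted the whole update
print([row[0] for row in conn.execute("select date from dates")])
# → ['2016-03-15', '//', '2016-03-17']
```

SQLite also applies statement-level rollback here, so even partial changes made by the UPDATE before the error are undone.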
1073450588 https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073450588 https://api.github.com/repos/simonw/sqlite-utils/issues/416 IC_kwDOCGYnMM4_-45c simonw 9599 2022-03-21T03:32:58Z 2022-03-21T03:32:58Z OWNER

Then I ran this to convert `2016-03-27` etc to `2016/03/27` so I could see which ones were later converted:

```
sqlite-utils convert test-dates.db dates date 'value.replace("-", "/")'
```
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Options for how `r.parsedate()` should handle invalid dates 1173023272  
1073448904 https://github.com/simonw/sqlite-utils/issues/416#issuecomment-1073448904 https://api.github.com/repos/simonw/sqlite-utils/issues/416 IC_kwDOCGYnMM4_-4fI simonw 9599 2022-03-21T03:28:12Z 2022-03-21T03:30:37Z OWNER

Generating a test database using a pattern from https://www.geekytidbits.com/date-range-table-sqlite/

```
sqlite-utils create-database test-dates.db
sqlite-utils create-table test-dates.db dates id integer date text --pk id
sqlite-utils test-dates.db "WITH RECURSIVE cnt(x) AS (
  SELECT 0
  UNION ALL
  SELECT x+1 FROM cnt
  LIMIT (SELECT ((julianday('2016-04-01') - julianday('2016-03-15'))) + 1)
)
insert into dates (date) select date(julianday('2016-03-15'), '+' || x || ' days') as date FROM cnt;"
```

After running that:

```
% sqlite-utils rows test-dates.db dates
[{"id": 1, "date": "2016-03-15"}, {"id": 2, "date": "2016-03-16"}, {"id": 3, "date": "2016-03-17"}, {"id": 4, "date": "2016-03-18"}, {"id": 5, "date": "2016-03-19"}, {"id": 6, "date": "2016-03-20"}, {"id": 7, "date": "2016-03-21"}, {"id": 8, "date": "2016-03-22"}, {"id": 9, "date": "2016-03-23"}, {"id": 10, "date": "2016-03-24"}, {"id": 11, "date": "2016-03-25"}, {"id": 12, "date": "2016-03-26"}, {"id": 13, "date": "2016-03-27"}, {"id": 14, "date": "2016-03-28"}, {"id": 15, "date": "2016-03-29"}, {"id": 16, "date": "2016-03-30"}, {"id": 17, "date": "2016-03-31"}, {"id": 18, "date": "2016-04-01"}]
```

Then to make one of them invalid:

```
sqlite-utils test-dates.db "update dates set date = '//' where id = 10"
```
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Options for how `r.parsedate()` should handle invalid dates 1173023272  
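The same recursive-CTE date generator can be run through Python's `sqlite3` module; a sketch reproducing the 18-row range:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table dates (id integer primary key, date text)")

# Recursive CTE counts 0..17, then each x is added as days
# to the start date via julianday arithmetic
conn.execute("""
    WITH RECURSIVE cnt(x) AS (
        SELECT 0
        UNION ALL
        SELECT x + 1 FROM cnt
        LIMIT (SELECT julianday('2016-04-01') - julianday('2016-03-15') + 1)
    )
    insert into dates (date)
    select date(julianday('2016-03-15'), '+' || x || ' days') FROM cnt
""")

rows = conn.execute("select date from dates order by id").fetchall()
print(len(rows), rows[0][0], rows[-1][0])
# → 18 2016-03-15 2016-04-01
```

The `LIMIT` bounds the recursion to the number of days in the range, inclusive of both endpoints.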

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 1172.762ms · About: github-to-sqlite