issue_comments

3,398 rows where user = 9599 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
697047591 https://github.com/simonw/sqlite-utils/issues/170#issuecomment-697047591 https://api.github.com/repos/simonw/sqlite-utils/issues/170 MDEyOklzc3VlQ29tbWVudDY5NzA0NzU5MQ== simonw 9599 2020-09-23T00:14:52Z 2020-09-23T00:14:52Z OWNER

@simonw
@db.register_function decorator, closes #162
4824775
@simonw
table.transform() method - closes #114
987dd12
@simonw
Keyword only arguments for transform()
f8e10df

Also renamed columns= to types=

Closes #165

Commits on Sep 22, 2020
@simonw
Implemented sqlite-utils transform command, closes #164
752d261
@simonw
Applied Black
f29f682
@simonw
table.extract() method, refs #42
f855379
@simonw
Docstring for sqlite-utils transform
c755f28
@simonw
Added table.extract(rename=) option, refs #42
c3210f2
@simonw
Applied Black
317071a
@simonw
New .rows_where(select=) argument
7178231
@simonw
table.extract() now works with rowid tables, refs #42
2db6c5b
@simonw
sqlite-utils extract, closes #42
55cf928
@simonw
Progress bar for "sqlite-utils extract", closes #169
5c4d58d
@simonw
Fixed PRAGMA foreign_keys handling for .transform, closes #167

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Release notes for 2.20 706768798  
697037974 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697037974 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5NzAzNzk3NA== simonw 9599 2020-09-22T23:39:31Z 2020-09-22T23:39:31Z OWNER

Documentation for sqlite-utils extract: https://sqlite-utils.readthedocs.io/en/latest/cli.html#extracting-columns-into-a-separate-table

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
697031174 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697031174 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5NzAzMTE3NA== simonw 9599 2020-09-22T23:16:00Z 2020-09-22T23:16:00Z OWNER

Trying this demo again:

wget 'https://raw.githubusercontent.com/wri/global-power-plant-database/master/output_database/global_power_plant_database.csv'
sqlite-utils insert global.db power_plants global_power_plant_database.csv --csv
sqlite-utils extract global.db power_plants country country_long --table countries --rename country_long name

It worked!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
697025403 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697025403 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5NzAyNTQwMw== simonw 9599 2020-09-22T22:57:53Z 2020-09-22T22:57:53Z OWNER

The documentation for the .extract() method is here: https://sqlite-utils.readthedocs.io/en/latest/python-api.html#extracting-columns-into-a-separate-table

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
697019944 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697019944 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5NzAxOTk0NA== simonw 9599 2020-09-22T22:40:00Z 2020-09-22T22:40:00Z OWNER

I tried out the prototype of the CLI on the Global Power Plants data:

wget 'https://raw.githubusercontent.com/wri/global-power-plant-database/master/output_database/global_power_plant_database.csv'
sqlite-utils insert global.db power_plants global_power_plant_database.csv --csv
sqlite-utils extract global.db power_plants country country_long

This threw an error because rowid columns are not yet supported. I fixed that like so:

sqlite-utils transform global.db power_plants --rename rowid id
sqlite-utils extract global.db power_plants country country_long

That worked! But it didn't play great with Datasette, because the resulting extracted table had columns country and country_long and neither of those are called name or value or title.

Based on this I need to add rowid table support AND I need to implement the proposed rename= argument for renaming columns on their way into the new table.
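A "rowid table" here is one with no explicit primary key, so its rows are only addressable through SQLite's implicit rowid. As a rough sketch of how that case could be detected (using only the standard library sqlite3 module; the helper name explicit_primary_keys is made up for illustration and is not part of sqlite-utils):

```python
import sqlite3


def explicit_primary_keys(conn, table):
    """Return the declared primary key column names for a table, in key order.

    An empty list means the table relies on SQLite's implicit rowid.
    (Hypothetical helper for illustration only.)
    """
    # PRAGMA table_info rows are: cid, name, type, notnull, dflt_value, pk
    columns = conn.execute('PRAGMA table_info("{}")'.format(table)).fetchall()
    pk_columns = [(col[5], col[1]) for col in columns if col[5]]
    return [name for _, name in sorted(pk_columns)]


conn = sqlite3.connect(":memory:")
# A table created from a CSV import typically has no declared primary key
conn.execute("CREATE TABLE power_plants (country TEXT, country_long TEXT)")
conn.execute("INSERT INTO power_plants VALUES ('USA', 'United States of America')")
print(explicit_primary_keys(conn, "power_plants"))
# The implicit rowid is still available for addressing rows
print(conn.execute("SELECT rowid, country FROM power_plants").fetchall())
```

Code that writes foreign key values back to the source table would need to fall back to rowid whenever this list is empty.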

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
697013681 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697013681 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5NzAxMzY4MQ== simonw 9599 2020-09-22T22:22:49Z 2020-09-22T22:22:49Z OWNER

The command-line version of this needs to accept a table and one or more columns, then a --table and --fk-column option.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
697012111 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-697012111 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5NzAxMjExMQ== simonw 9599 2020-09-22T22:18:13Z 2020-09-22T22:18:13Z OWNER

Here's how I'm generating the examples for the documentation:

In [2]: import sqlite_utils

In [3]: db = sqlite_utils.Database(memory=True)

In [4]: db["Trees"].insert({"id": 1, "TreeAddress": "52 Vine St", "CommonName":
   ...: "Palm", "LatinName": "foo"}, pk="id")
Out[4]: <Table Trees (id, TreeAddress, CommonName, LatinName)>

In [5]: db["Trees"].extract(["CommonName", "LatinName"], table="Species",
   ...: fk_column="species_id")

In [6]: print(db["Trees"].schema)
CREATE TABLE "Trees" (
   [id] INTEGER PRIMARY KEY,
   [TreeAddress] TEXT,
   [species_id] INTEGER,
   FOREIGN KEY(species_id) REFERENCES Species(id)
)

In [7]: print(db["Species"].schema)
CREATE TABLE [Species] (
   [id] INTEGER PRIMARY KEY,
   [CommonName] TEXT,
   [LatinName] TEXT
)
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
696987925 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696987925 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5Njk4NzkyNQ== simonw 9599 2020-09-22T21:19:04Z 2020-09-22T21:19:04Z OWNER

Need to make sure this works correctly for rowid tables.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
696987257 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696987257 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5Njk4NzI1Nw== simonw 9599 2020-09-22T21:17:34Z 2020-09-22T21:17:34Z OWNER

What to do if the table already exists? The .lookup() function already knows how to modify an existing table to create the correct constraints etc, so I'll rely on that mechanism.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
696980709 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696980709 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5Njk4MDcwOQ== simonw 9599 2020-09-22T21:05:07Z 2020-09-22T21:05:07Z OWNER

So .extract() probably takes a batch_size= argument too, which defaults to maybe 1000.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
696980503 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696980503 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5Njk4MDUwMw== simonw 9599 2020-09-22T21:04:45Z 2020-09-22T21:04:45Z OWNER

table.extract() can take an optional progress= argument which is a callback which will be used to report progress - called after each batch with (num_done, total). It will get called with (0, total) once at the start to allow progress bars to be initialized. The command-line progress bar will use this.
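The callback contract described here (one call with (0, total) up front, then one call per batch) can be sketched independently of sqlite-utils. The function below is a hypothetical stand-in, not the library's implementation, and the default batch size of 1000 is only the figure floated in this thread:

```python
def process_in_batches(rows, handle_batch, batch_size=1000, progress=None):
    """Process rows in batches, reporting progress as (num_done, total).

    Hypothetical helper: `handle_batch` stands in for whatever work
    is done for each batch of rows.
    """
    total = len(rows)
    if progress is not None:
        # Initial call lets a progress bar set itself up with the total
        progress(0, total)
    done = 0
    for start in range(0, total, batch_size):
        batch = rows[start:start + batch_size]
        handle_batch(batch)
        done += len(batch)
        if progress is not None:
            progress(done, total)
    return done


calls = []
process_in_batches(
    list(range(2500)),
    handle_batch=lambda batch: None,
    batch_size=1000,
    progress=lambda done, total: calls.append((done, total)),
)
print(calls)
```

A command-line progress bar could use the first call to set its length and each later call to advance to num_done.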

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
696979626 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696979626 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5Njk3OTYyNg== simonw 9599 2020-09-22T21:03:11Z 2020-09-22T21:03:11Z OWNER

And if you want to rename some of the columns in the new table:

db["trees"].extract(["common_name", "latin_name"], table="species", rename={"common_name": "name"})
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
696979168 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696979168 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5Njk3OTE2OA== simonw 9599 2020-09-22T21:02:24Z 2020-09-22T21:02:24Z OWNER

In Python it looks like this:

# Simple case - species column species_id pointing to species table
db["trees"].extract("species")

# Setting a custom table
db["trees"].extract("species", table="Species")

# Custom foreign key column on trees
db["trees"].extract("species", fk_column="species")

# Extracting multiple columns
db["trees"].extract(["common_name", "latin_name"])
# (this creates a lookup table called common_name_latin_name ref'd by common_name_latin_name_id)

# Or with explicit table (fk_column here defaults to species_id because of the table name)
db["trees"].extract(["common_name", "latin_name"], table="species")
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
696976678 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696976678 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5Njk3NjY3OA== simonw 9599 2020-09-22T20:57:57Z 2020-09-22T20:57:57Z OWNER

I think I understand the shape of this feature now. It lets you specify one or more columns on the source table which will be extracted into another table. It uses the .lookup() mechanism to populate that other table, which means each unique column value / pair / triple will be assigned an integer ID.

That integer ID gets written back into the first of the columns that are being transformed. A .transform() call then converts that column to an integer (and drops the additional columns). Finally we set up the new foreign key relationship.
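To make the shape of this concrete, here is a simplified sketch of the same idea using only the standard library sqlite3 module. It is not how sqlite-utils does it: instead of overwriting the original column and calling .transform(), it adds a new foreign key column and leaves the original text column in place. The function name extract_column and the "value" column name are assumptions for illustration.

```python
import sqlite3


def extract_column(conn, table, column, lookup_table):
    """Move the distinct values of `column` into `lookup_table` and add
    a `<column>_id` foreign key column pointing at it.

    Simplified sketch: the original column is kept, not dropped.
    """
    fk_column = "{}_id".format(column)
    # The UNIQUE constraint means each distinct value gets exactly one integer ID
    conn.execute(
        'CREATE TABLE IF NOT EXISTS "{}" '
        "(id INTEGER PRIMARY KEY, value TEXT UNIQUE)".format(lookup_table)
    )
    conn.execute(
        'INSERT OR IGNORE INTO "{lookup}" (value) '
        'SELECT DISTINCT "{col}" FROM "{table}" WHERE "{col}" IS NOT NULL'.format(
            lookup=lookup_table, col=column, table=table
        )
    )
    # New nullable column carrying the foreign key relationship
    conn.execute(
        'ALTER TABLE "{table}" ADD COLUMN "{fk}" INTEGER '
        'REFERENCES "{lookup}"(id)'.format(table=table, fk=fk_column, lookup=lookup_table)
    )
    # Write the integer IDs back to the source table
    conn.execute(
        'UPDATE "{table}" SET "{fk}" = ('
        'SELECT id FROM "{lookup}" WHERE value = "{table}"."{col}")'.format(
            table=table, fk=fk_column, lookup=lookup_table, col=column
        )
    )
    return fk_column
```

Dropping the now-redundant text column is the step that needs a full table rebuild, which is where .transform() comes in.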

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
696893774 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696893774 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5Njg5Mzc3NA== simonw 9599 2020-09-22T18:15:33Z 2020-09-22T18:15:33Z OWNER

I think the new foreign key column is called company_name_id by default in this example but can be customized by passing --fk-column=xxx

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
696893244 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696893244 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5Njg5MzI0NA== simonw 9599 2020-09-22T18:14:33Z 2020-09-22T18:14:45Z OWNER

Thinking more about this one:

$ sqlite-utils extract my.db \
    dea_sales company_name company_address \
    --table companies

The goal here is to pull the company name and address pair out into a separate table.

Some questions:
- should this first verify that every company_name has just one company_address? I like the idea of a unique constraint on the created table for this.
- what should the foreign key column that gets added to the dea_sales table (pointing at companies) be called?
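The first question could be answered with a query before extracting anything. A minimal sketch, assuming the standard library sqlite3 module and a hypothetical helper name conflicting_values:

```python
import sqlite3


def conflicting_values(conn, table, key_column, dependent_column):
    """Return (key, count) pairs where one key value maps to more than
    one distinct value of the dependent column. (Hypothetical helper.)
    """
    sql = (
        'SELECT "{key}", COUNT(DISTINCT "{dep}") AS n '
        'FROM "{table}" GROUP BY "{key}" HAVING n > 1'
    ).format(key=key_column, dep=dependent_column, table=table)
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dea_sales (company_name TEXT, company_address TEXT)")
conn.executemany(
    "INSERT INTO dea_sales VALUES (?, ?)",
    [
        ("Acme", "1 Main St"),
        ("Acme", "2 Side St"),  # same company, different address
        ("Bolt", "9 High St"),
        ("Bolt", "9 High St"),
    ],
)
print(conflicting_values(conn, "dea_sales", "company_name", "company_address"))
```

An empty result means every company_name has a single company_address, so a unique constraint on the name column of the extracted table would hold.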

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
513262013 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-513262013 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDUxMzI2MjAxMw== simonw 9599 2019-07-19T14:58:23Z 2020-09-22T18:12:11Z OWNER

CLI design idea:

$ sqlite-utils extract my.db \
    dea_sales company_name

Here we just specify the original table and column - the new extracted table will automatically be called "company_name" and will have "id" and "value" columns, by default.

To set a custom extract table:

$ sqlite-utils extract my.db \
    dea_sales company_name \
    --table companies

And for extracting multiple columns and renaming them on the created table, maybe something like this:

$ sqlite-utils extract my.db \
    dea_sales company_name company_address \
    --table companies \
    --column company_name name \
    --column company_address address
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
696800410 https://github.com/simonw/datasette/issues/973#issuecomment-696800410 https://api.github.com/repos/simonw/datasette/issues/973 MDEyOklzc3VlQ29tbWVudDY5NjgwMDQxMA== simonw 9599 2020-09-22T15:35:28Z 2020-09-22T15:35:28Z OWNER

Confirmed in local dev:

% datasette fixtures.db --inspect-file inspect.json
Traceback (most recent call last):
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/bin/datasette", line 11, in <module>
    load_entry_point('datasette', 'console_scripts', 'datasette')()
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/simon/Dropbox/Development/datasette/datasette/cli.py", line 406, in serve
    inspect_data = json.load(open(inspect_file))
TypeError: 'bool' object is not callable
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
'bool' object is not callable error 706486323  
696798114 https://github.com/simonw/datasette/issues/973#issuecomment-696798114 https://api.github.com/repos/simonw/datasette/issues/973 MDEyOklzc3VlQ29tbWVudDY5Njc5ODExNA== simonw 9599 2020-09-22T15:31:25Z 2020-09-22T15:31:25Z OWNER

D'oh, this is because I have a new variable called open, which shadows the built-in open() function.
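A minimal reproduction of this kind of bug, assuming the new variable is a boolean function argument (the function names and the open_browser rename below are made up for illustration and are not Datasette's actual code):

```python
import json
import os
import tempfile


def load_inspect_broken(inspect_file, open=False):
    # The boolean argument named `open` shadows the built-in open(),
    # so this line calls False(...) and raises TypeError
    return json.load(open(inspect_file))


def load_inspect_fixed(inspect_file, open_browser=False):
    # Renaming the argument leaves the built-in open() reachable
    with open(inspect_file) as fp:
        return json.load(fp)


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "inspect.json")
    with open(path, "w") as fp:
        json.dump({"fixtures": {}}, fp)
    try:
        load_inspect_broken(path)
        error_message = None
    except TypeError as ex:
        error_message = str(ex)
    fixed_result = load_inspect_fixed(path)

print(error_message)
```

Another option, if the argument name has to stay, would be to call builtins.open explicitly.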

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
'bool' object is not callable error 706486323  
696788109 https://github.com/simonw/datasette/issues/969#issuecomment-696788109 https://api.github.com/repos/simonw/datasette/issues/969 MDEyOklzc3VlQ29tbWVudDY5Njc4ODEwOQ== simonw 9599 2020-09-22T15:15:14Z 2020-09-22T15:15:14Z OWNER

I don't think a standard "pass these extra arguments to the publish tool" mechanism will work because there's no guarantee that a publisher uses a CLI tool - or if it does, it might make several calls to different CLI tools. The Cloud Run one runs a couple of commands, as illustrated by this test:

https://github.com/simonw/datasette/blob/a648bb82bac201c7658f6fdb499ff8ac17ebd2e8/tests/test_publish_cloudrun.py#L63-L73

Adding a --tar option for datasette publish heroku is a good fix for this though.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Passing additional flags to tools used during publishing 705057955  
696778735 https://github.com/simonw/datasette/issues/943#issuecomment-696778735 https://api.github.com/repos/simonw/datasette/issues/943 MDEyOklzc3VlQ29tbWVudDY5Njc3ODczNQ== simonw 9599 2020-09-22T15:00:13Z 2020-09-22T15:00:39Z OWNER

Am I going to rewrite ALL of my tests to use this instead? It would clean up a lot of test code, at the cost of quite a bit of work.

It would make for much neater plugin tests too, and neater testing documentation: https://docs.datasette.io/en/stable/testing_plugins.html

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
await datasette.client.get(path) mechanism for executing internal requests 681375466  
696777886 https://github.com/simonw/datasette/issues/943#issuecomment-696777886 https://api.github.com/repos/simonw/datasette/issues/943 MDEyOklzc3VlQ29tbWVudDY5Njc3Nzg4Ng== simonw 9599 2020-09-22T14:58:54Z 2020-09-22T14:58:54Z OWNER
class DatasetteClient:
    def __init__(self, ds):
        self._client = httpx.AsyncClient(app=ds.app())

    def _fix(self, path):
        if path.startswith("/"):
            path = "http://localhost{}".format(path)
        return path

    async def get(self, path, **kwargs):
        return await self._client.get(self._fix(path), **kwargs)

    async def options(self, path, **kwargs):
        return await self._client.options(self._fix(path), **kwargs)

    async def head(self, path, **kwargs):
        return await self._client.head(self._fix(path), **kwargs)

    async def post(self, path, **kwargs):
        return await self._client.post(self._fix(path), **kwargs)

    async def put(self, path, **kwargs):
        return await self._client.put(self._fix(path), **kwargs)

    async def patch(self, path, **kwargs):
        return await self._client.patch(self._fix(path), **kwargs)

    async def delete(self, path, **kwargs):
        return await self._client.delete(self._fix(path), **kwargs)
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
await datasette.client.get(path) mechanism for executing internal requests 681375466  
696776828 https://github.com/simonw/datasette/issues/943#issuecomment-696776828 https://api.github.com/repos/simonw/datasette/issues/943 MDEyOklzc3VlQ29tbWVudDY5Njc3NjgyOA== simonw 9599 2020-09-22T14:57:13Z 2020-09-22T14:57:13Z OWNER

I may as well implement all of the HTTP methods supported by the httpx client:

  • get
  • options
  • head
  • post
  • put
  • patch
  • delete
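One way to cover all seven methods without writing each wrapper by hand would be to delegate through __getattr__. This is only a sketch of that alternative: EchoClient is a made-up stand-in so the example runs without httpx installed, and PathFixingClient is a hypothetical name.

```python
import asyncio

HTTP_METHODS = ("get", "options", "head", "post", "put", "patch", "delete")


class EchoClient:
    """Stand-in for httpx.AsyncClient: echoes back the method and URL."""

    def __getattr__(self, name):
        async def request(url, **kwargs):
            return (name.upper(), url)

        return request


class PathFixingClient:
    def __init__(self, client):
        self._client = client

    def _fix(self, path):
        # Same path handling as the prototype: prefix bare paths with a host
        if path.startswith("/"):
            path = "http://localhost{}".format(path)
        return path

    def __getattr__(self, name):
        # Only called for attributes not found normally, e.g. .get or .post
        if name not in HTTP_METHODS:
            raise AttributeError(name)
        method = getattr(self._client, name)

        async def call(path, **kwargs):
            return await method(self._fix(path), **kwargs)

        return call


client = PathFixingClient(EchoClient())
print(asyncio.run(client.get("/-/config.json")))
```

Explicit methods are easier to document and introspect, so spelling them out may still be the better trade-off.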
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
await datasette.client.get(path) mechanism for executing internal requests 681375466  
696775516 https://github.com/simonw/datasette/issues/943#issuecomment-696775516 https://api.github.com/repos/simonw/datasette/issues/943 MDEyOklzc3VlQ29tbWVudDY5Njc3NTUxNg== simonw 9599 2020-09-22T14:55:10Z 2020-09-22T14:55:10Z OWNER

Even smaller DatasetteClient implementation:

class DatasetteClient:
    def __init__(self, ds):
        self._client = httpx.AsyncClient(app=ds.app())

    def _fix(self, path):
        if path.startswith("/"):
            path = "http://localhost{}".format(path)
        return path

    async def get(self, path, **kwargs):
        return await self._client.get(self._fix(path), **kwargs)

    async def post(self, path, **kwargs):
        return await self._client.post(self._fix(path), **kwargs)

    async def options(self, path, **kwargs):
        return await self._client.options(self._fix(path), **kwargs)
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
await datasette.client.get(path) mechanism for executing internal requests 681375466  
696774711 https://github.com/simonw/datasette/issues/943#issuecomment-696774711 https://api.github.com/repos/simonw/datasette/issues/943 MDEyOklzc3VlQ29tbWVudDY5Njc3NDcxMQ== simonw 9599 2020-09-22T14:53:56Z 2020-09-22T14:53:56Z OWNER

How important is it to use httpx.AsyncClient with a context manager?

https://www.python-httpx.org/async/#opening-and-closing-clients says:

Alternatively, use await client.aclose() if you want to close a client explicitly:

client = httpx.AsyncClient()
...
await client.aclose()

The .aclose() method has a comment saying "Close transport and proxies" - I'm not using proxies, so the relevant implementation seems to be a call to await self._transport.aclose() in https://github.com/encode/httpx/blob/f932af9172d15a803ad40061a4c2c0cd891645cf/httpx/_client.py#L1741-L1751

The transport I am using is a class called ASGITransport in https://github.com/encode/httpx/blob/master/httpx/_transports/asgi.py

The aclose() method on that class does nothing. So it looks like I can instantiate a client without bothering with the async with httpx.AsyncClient bit.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
await datasette.client.get(path) mechanism for executing internal requests 681375466  
696769853 https://github.com/simonw/datasette/issues/943#issuecomment-696769853 https://api.github.com/repos/simonw/datasette/issues/943 MDEyOklzc3VlQ29tbWVudDY5Njc2OTg1Mw== simonw 9599 2020-09-22T14:46:21Z 2020-09-22T14:46:21Z OWNER

This adds httpx as a dependency - I think I'm OK with that. I use it for testing in all of my plugins anyway.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
await datasette.client.get(path) mechanism for executing internal requests 681375466  
696769501 https://github.com/simonw/datasette/issues/943#issuecomment-696769501 https://api.github.com/repos/simonw/datasette/issues/943 MDEyOklzc3VlQ29tbWVudDY5Njc2OTUwMQ== simonw 9599 2020-09-22T14:45:49Z 2020-09-22T14:45:49Z OWNER

I put together a minimal prototype of this and it feels pretty good:

diff --git a/datasette/app.py b/datasette/app.py
index 20aae7d..fb3bdad 100644
--- a/datasette/app.py
+++ b/datasette/app.py
@@ -4,6 +4,7 @@ import collections
 import datetime
 import glob
 import hashlib
+import httpx
 import inspect
 import itertools
 from itsdangerous import BadSignature
@@ -312,6 +313,7 @@ class Datasette:
         self._register_renderers()
         self._permission_checks = collections.deque(maxlen=200)
         self._root_token = secrets.token_hex(32)
+        self.client = DatasetteClient(self)

     async def invoke_startup(self):
         for hook in pm.hook.startup(datasette=self):
@@ -1209,3 +1211,25 @@ def route_pattern_from_filepath(filepath):

 class NotFoundExplicit(NotFound):
     pass
+
+
+class DatasetteClient:
+    def __init__(self, ds):
+        self.app = ds.app()
+
+    def _fix(self, path):
+        if path.startswith("/"):
+            path = "http://localhost{}".format(path)
+        return path
+
+    async def get(self, path, **kwargs):
+        async with httpx.AsyncClient(app=self.app) as client:
+            return await client.get(self._fix(path), **kwargs)
+
+    async def post(self, path, **kwargs):
+        async with httpx.AsyncClient(app=self.app) as client:
+            return await client.post(self._fix(path), **kwargs)
+
+    async def options(self, path, **kwargs):
+        async with httpx.AsyncClient(app=self.app) as client:
+            return await client.options(self._fix(path), **kwargs)

Used like this in ipython:

In [1]: from datasette.app import Datasette

In [2]: ds = Datasette(["fixtures.db"])

In [3]: (await ds.client.get("/-/config.json")).json()
Out[3]: 
{'default_page_size': 100,
 'max_returned_rows': 1000,
 'num_sql_threads': 3,
 'sql_time_limit_ms': 1000,
 'default_facet_size': 30,
 'facet_time_limit_ms': 200,
 'facet_suggest_time_limit_ms': 50,
 'hash_urls': False,
 'allow_facet': True,
 'allow_download': True,
 'suggest_facets': True,
 'default_cache_ttl': 5,
 'default_cache_ttl_hashed': 31536000,
 'cache_size_kb': 0,
 'allow_csv_stream': True,
 'max_csv_mb': 100,
 'truncate_cells_html': 2048,
 'force_https_urls': False,
 'template_debug': False,
 'base_url': '/'}

In [4]: (await ds.client.get("/fixtures/facetable.json?_shape=array")).json()
Out[4]: 
[{'pk': 1,
  'created': '2019-01-14 08:00:00',
  'planet_int': 1,
  'on_earth': 1,
  'state': 'CA',
  'city_id': 1,
  'neighborhood': 'Mission',
  'tags': '["tag1", "tag2"]',
  'complex_array': '[{"foo": "bar"}]',
  'distinct_some_null': 'one'},
 {'pk': 2,
  'created': '2019-01-14 08:00:00',
  'planet_int': 1,
  'on_earth': 1,
  'state': 'CA',
  'city_id': 1,
  'neighborhood': 'Dogpatch',
  'tags': '["tag1", "tag3"]',
  'complex_array': '[]',
  'distinct_some_null': 'two'},
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
await datasette.client.get(path) mechanism for executing internal requests 681375466  
693009048 https://github.com/simonw/datasette/issues/943#issuecomment-693009048 https://api.github.com/repos/simonw/datasette/issues/943 MDEyOklzc3VlQ29tbWVudDY5MzAwOTA0OA== simonw 9599 2020-09-15T22:17:30Z 2020-09-22T14:37:00Z OWNER

Maybe instead of implementing datasette.get() and datasette.post() and datasette.request() and datasette.stream() I could instead have a nested object called datasette.client which is a preconfigured AsyncClient instance.

response = await datasette.client.get("/")

Or perhaps this should be a method in case I ever need to be able to await it:

response = await (await datasette.client()).get("/")

This is a bit cosmetically ugly though, I'd rather avoid that if possible.

Maybe I could get this working by returning an object from .client() which provides a await obj.get() method:

response = await datasette.client().get("/")

I don't think there's any benefit to that over await datasette.client.get() though.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
await datasette.client.get(path) mechanism for executing internal requests 681375466  
696573944 https://github.com/simonw/sqlite-utils/issues/168#issuecomment-696573944 https://api.github.com/repos/simonw/sqlite-utils/issues/168 MDEyOklzc3VlQ29tbWVudDY5NjU3Mzk0NA== simonw 9599 2020-09-22T08:11:30Z 2020-09-22T08:11:30Z OWNER

Huh... maybe I don't need to do anything here? It looks like it's been kept up to date: https://github.com/Homebrew/homebrew-core/commits/master/Formula/sqlite-utils.rb

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Automate (as much as possible) updates published to Homebrew 706167456  
696567988 https://github.com/simonw/sqlite-utils/issues/164#issuecomment-696567988 https://api.github.com/repos/simonw/sqlite-utils/issues/164 MDEyOklzc3VlQ29tbWVudDY5NjU2Nzk4OA== simonw 9599 2020-09-22T07:57:50Z 2020-09-22T07:57:50Z OWNER

Documentation: https://sqlite-utils.readthedocs.io/en/latest/cli.html#transforming-tables

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils transform sub-command 706017416  
696567460 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-696567460 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5NjU2NzQ2MA== simonw 9599 2020-09-22T07:56:42Z 2020-09-22T07:56:42Z OWNER

.transform() has landed now which should make this a lot easier to solve.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
696566750 https://github.com/simonw/sqlite-utils/issues/26#issuecomment-696566750 https://api.github.com/repos/simonw/sqlite-utils/issues/26 MDEyOklzc3VlQ29tbWVudDY5NjU2Njc1MA== simonw 9599 2020-09-22T07:55:00Z 2020-09-22T07:55:00Z OWNER

Problem: extract means something else now, see #47 and the upcoming work in #42.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for turning nested JSON into foreign keys / many-to-many 455486286  
696565981 https://github.com/simonw/sqlite-utils/issues/167#issuecomment-696565981 https://api.github.com/repos/simonw/sqlite-utils/issues/167 MDEyOklzc3VlQ29tbWVudDY5NjU2NTk4MQ== simonw 9599 2020-09-22T07:53:13Z 2020-09-22T07:53:13Z OWNER

Confirmed this is a bug, https://www.sqlite.org/lang_altertable.html#making_other_kinds_of_table_schema_changes explicitly says you should do the PRAGMA foreign_keys bits before and after the transaction, not during.

Right now my code does this INSIDE the transaction: https://github.com/simonw/sqlite-utils/blob/f29f6821f2d08e91c5c6d65d885a1bbc0c743bdd/sqlite_utils/db.py#L790-L793
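The reason this matters: SQLite documents PRAGMA foreign_keys as a no-op while a transaction is open. A small sketch that demonstrates this with the standard library sqlite3 module (the function names are made up for illustration; the connection uses isolation_level=None so that BEGIN and COMMIT are issued explicitly):

```python
import sqlite3


def foreign_keys_setting(conn):
    """Return 1 if foreign key enforcement is on for this connection, else 0."""
    return conn.execute("PRAGMA foreign_keys").fetchone()[0]


def try_disable_inside_transaction(conn):
    """Attempt PRAGMA foreign_keys=OFF inside a transaction and report
    the setting that is actually in effect afterwards."""
    conn.execute("BEGIN")
    conn.execute("PRAGMA foreign_keys=OFF")  # silently ignored here
    inside = foreign_keys_setting(conn)
    conn.execute("COMMIT")
    return inside


conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys=ON")
setting_inside = try_disable_inside_transaction(conn)

# Outside a transaction the same statement takes effect
conn.execute("PRAGMA foreign_keys=OFF")
setting_outside = foreign_keys_setting(conn)
print(setting_inside, setting_outside)
```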

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Review the foreign key pragma stuff 706098005  
696520928 https://github.com/simonw/sqlite-utils/issues/164#issuecomment-696520928 https://api.github.com/repos/simonw/sqlite-utils/issues/164 MDEyOklzc3VlQ29tbWVudDY5NjUyMDkyOA== simonw 9599 2020-09-22T05:50:17Z 2020-09-22T05:50:17Z OWNER

Idea for CLI options:

--type age integer
--drop colname
--rename oldname newname
--not-null col
--not-null-false col
--pk new_id
--pk-none
--default col value
--default-none column
--drop-foreign-key col other_table other_column
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils transform sub-command 706017416  
696500922 https://github.com/simonw/sqlite-utils/issues/164#issuecomment-696500922 https://api.github.com/repos/simonw/sqlite-utils/issues/164 MDEyOklzc3VlQ29tbWVudDY5NjUwMDkyMg== simonw 9599 2020-09-22T04:22:40Z 2020-09-22T04:22:40Z OWNER

Documentation for the .transform() method #114 (now landed) is here: https://sqlite-utils.readthedocs.io/en/latest/python-api.html#transforming-a-table

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils transform sub-command 706017416  
696500767 https://github.com/simonw/sqlite-utils/issues/114#issuecomment-696500767 https://api.github.com/repos/simonw/sqlite-utils/issues/114 MDEyOklzc3VlQ29tbWVudDY5NjUwMDc2Nw== simonw 9599 2020-09-22T04:21:45Z 2020-09-22T04:21:45Z OWNER

Documentation: https://sqlite-utils.readthedocs.io/en/latest/python-api.html#transforming-a-table

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method for advanced alter table 621989740  
696494070 https://github.com/simonw/sqlite-utils/pull/161#issuecomment-696494070 https://api.github.com/repos/simonw/sqlite-utils/issues/161 MDEyOklzc3VlQ29tbWVudDY5NjQ5NDA3MA== simonw 9599 2020-09-22T03:48:58Z 2020-09-22T03:48:58Z OWNER

One last thing. https://www.sqlite.org/lang_altertable.html#making_other_kinds_of_table_schema_changes says that the first step should be:

If foreign key constraints are enabled, disable them using PRAGMA foreign_keys=OFF.

And the last steps should be:

If foreign key constraints were originally enabled then run PRAGMA foreign_key_check to verify that the schema change did not break any foreign key constraints.

Commit the transaction started in step 2.

If foreign keys constraints were originally enabled, reenable them now.

I need to implement that.
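
The ordering quoted above could be sketched like this with the standard sqlite3 module. This is an assumption-laden illustration, not the sqlite-utils implementation: `run_schema_change` is a hypothetical helper, and the connection is assumed to be in autocommit mode (`isolation_level=None`) so the transaction boundaries are explicit.

```python
import sqlite3


def run_schema_change(conn, statements):
    """Run statements in one transaction, toggling foreign keys around it."""
    fk_was_on = conn.execute("PRAGMA foreign_keys").fetchone()[0]
    if fk_was_on:
        conn.execute("PRAGMA foreign_keys = OFF")  # before the transaction
    try:
        conn.execute("BEGIN")
        try:
            for sql in statements:
                conn.execute(sql)
            if fk_was_on:
                # Verify the change did not break any foreign key constraints.
                problems = conn.execute("PRAGMA foreign_key_check").fetchall()
                if problems:
                    raise sqlite3.IntegrityError(f"foreign key check failed: {problems}")
            conn.execute("COMMIT")
        except Exception:
            conn.execute("ROLLBACK")
            raise
    finally:
        if fk_was_on:
            conn.execute("PRAGMA foreign_keys = ON")  # after the transaction


conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, author_id TEXT)")
conn.execute("INSERT INTO author VALUES (1, 'Ann')")
conn.execute("INSERT INTO book VALUES (1, '1')")

run_schema_change(conn, [
    "CREATE TABLE book_new (id INTEGER PRIMARY KEY,"
    " author_id INTEGER REFERENCES author(id))",
    "INSERT INTO book_new SELECT id, author_id FROM book",
    "DROP TABLE book",
    "ALTER TABLE book_new RENAME TO book",
])

fk_after = conn.execute("PRAGMA foreign_keys").fetchone()[0]
author_id_type = conn.execute("PRAGMA table_info(book)").fetchall()[1][2]
```

The `try`/`finally` is a design choice so that enforcement is restored even when the change is rolled back.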

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method 705975133  
696490851 https://github.com/simonw/sqlite-utils/pull/161#issuecomment-696490851 https://api.github.com/repos/simonw/sqlite-utils/issues/161 MDEyOklzc3VlQ29tbWVudDY5NjQ5MDg1MQ== simonw 9599 2020-09-22T03:33:54Z 2020-09-22T03:33:54Z OWNER

It would be neat if .transform(pk=None) converted a primary key table to a rowid table.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method 705975133  
696488201 https://github.com/simonw/sqlite-utils/pull/161#issuecomment-696488201 https://api.github.com/repos/simonw/sqlite-utils/issues/161 MDEyOklzc3VlQ29tbWVudDY5NjQ4ODIwMQ== simonw 9599 2020-09-22T03:21:16Z 2020-09-22T03:21:16Z OWNER

Just needs documentation now.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method 705975133  
696485791 https://github.com/simonw/sqlite-utils/pull/161#issuecomment-696485791 https://api.github.com/repos/simonw/sqlite-utils/issues/161 MDEyOklzc3VlQ29tbWVudDY5NjQ4NTc5MQ== simonw 9599 2020-09-22T03:10:15Z 2020-09-22T03:10:15Z OWNER

Design decision needed on foreign keys: what does the syntax look like for removing an existing foreign key?

Since I already have a good implementation of add_foreign_key() I'm tempted to only support dropping them. Maybe like this:

table.transform(drop_foreign_keys=[("author_id", "author", "id")])

It's a bit crufty but it's such a rare use-case that I think this will be good enough.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method 705975133  
696480925 https://github.com/simonw/sqlite-utils/pull/161#issuecomment-696480925 https://api.github.com/repos/simonw/sqlite-utils/issues/161 MDEyOklzc3VlQ29tbWVudDY5NjQ4MDkyNQ== simonw 9599 2020-09-22T02:45:47Z 2020-09-22T02:45:47Z OWNER

I'm not going to do conversions= because it would be inconsistent with how they work elsewhere. The SQL generated by this function looks like this:

INSERT INTO dogs_new_tmp (a, b) SELECT a, b FROM dogs;

So passing conversions={"name": "upper(?)"} wouldn't make sense: the statement takes no arguments, so there is nowhere for that ? to go.
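
To illustrate the difference with the standard sqlite3 module (nothing here is sqlite-utils code, and the table contents are made up): an insert-time conversion fills each ? from a Python value, whereas the table copy moves values entirely inside SQL with no parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dogs (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO dogs (id, name) VALUES (?, ?)", [(1, "Cleo"), (2, "Pancakes")]
)
conn.execute("CREATE TABLE dogs_new_tmp (id INTEGER PRIMARY KEY, name TEXT)")

# Insert-time conversion: each "?" is bound to a value passed from Python.
conn.execute("INSERT INTO dogs (id, name) VALUES (?, upper(?))", (3, "Marnie"))

# Table copy: rows flow from one table to the other, no parameters involved.
conn.execute("INSERT INTO dogs_new_tmp (id, name) SELECT id, name FROM dogs")

rows = conn.execute("SELECT id, name FROM dogs_new_tmp ORDER BY id").fetchall()
```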

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method 705975133  
696473559 https://github.com/simonw/sqlite-utils/issues/164#issuecomment-696473559 https://api.github.com/repos/simonw/sqlite-utils/issues/164 MDEyOklzc3VlQ29tbWVudDY5NjQ3MzU1OQ== simonw 9599 2020-09-22T02:10:37Z 2020-09-22T02:10:37Z OWNER

Maybe something like this:

sqlite-utils transform mydb.db mytable -c age integer --rename age dog_age
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils transform sub-command 706017416  
696465788 https://github.com/simonw/sqlite-utils/issues/163#issuecomment-696465788 https://api.github.com/repos/simonw/sqlite-utils/issues/163 MDEyOklzc3VlQ29tbWVudDY5NjQ2NTc4OA== simonw 9599 2020-09-22T01:33:04Z 2020-09-22T01:33:04Z OWNER

This would apply to .transform() in #114 too.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Idea: conversions= could take Python functions 706001517  
696454485 https://github.com/simonw/sqlite-utils/issues/114#issuecomment-696454485 https://api.github.com/repos/simonw/sqlite-utils/issues/114 MDEyOklzc3VlQ29tbWVudDY5NjQ1NDQ4NQ== simonw 9599 2020-09-22T00:42:35Z 2020-09-22T00:42:35Z OWNER

The reason I'm working on this now is that I'd like to support many more options for data cleanup in the Datasette ecosystem - so being able to do things like convert the type of existing columns becomes increasingly important.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method for advanced alter table 621989740  
696454084 https://github.com/simonw/sqlite-utils/issues/162#issuecomment-696454084 https://api.github.com/repos/simonw/sqlite-utils/issues/162 MDEyOklzc3VlQ29tbWVudDY5NjQ1NDA4NA== simonw 9599 2020-09-22T00:40:44Z 2020-09-22T00:40:44Z OWNER

Documentation: https://sqlite-utils.readthedocs.io/en/latest/python-api.html#registering-custom-sql-functions

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
A decorator for registering custom SQL functions 705995722  
696449345 https://github.com/simonw/sqlite-utils/issues/162#issuecomment-696449345 https://api.github.com/repos/simonw/sqlite-utils/issues/162 MDEyOklzc3VlQ29tbWVudDY5NjQ0OTM0NQ== simonw 9599 2020-09-22T00:22:46Z 2020-09-22T00:22:46Z OWNER

Inspired by the idea of adding conversions= to #114 - since this would make it easy to register custom Python functions that can be used to convert the values in a table.
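
A rough sketch of how such a decorator could work on top of the standard library's `sqlite3.Connection.create_function`. The `register_function` helper below is hypothetical and is not the sqlite-utils implementation; it assumes the SQL function should take the same name and argument count as the Python function.

```python
import inspect
import sqlite3


def register_function(conn):
    """Return a decorator that registers a Python function as a SQL function."""
    def decorator(fn):
        # The SQL function takes as many arguments as the Python function.
        arg_count = len(inspect.signature(fn).parameters)
        conn.create_function(fn.__name__, arg_count, fn)
        return fn
    return decorator


conn = sqlite3.connect(":memory:")


@register_function(conn)
def reverse_string(s):
    return s[::-1]


result = conn.execute("SELECT reverse_string('hello')").fetchone()[0]
```

Once registered, the function can be called from any SQL run on that connection, which is what makes it useful for converting values in a table.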

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
A decorator for registering custom SQL functions 705995722  
696446658 https://github.com/simonw/sqlite-utils/pull/161#issuecomment-696446658 https://api.github.com/repos/simonw/sqlite-utils/issues/161 MDEyOklzc3VlQ29tbWVudDY5NjQ0NjY1OA== simonw 9599 2020-09-22T00:13:55Z 2020-09-22T00:14:21Z OWNER

Idea: allow a conversions= parameter, as seen on .insert_all() and friends, which lets you apply a SQL transformation function as part of the operation. E.g.:

table.transform({"age": int}, conversions={"name": "upper(?)"})

https://sqlite-utils.readthedocs.io/en/stable/python-api.html#converting-column-values-using-sql-functions

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method 705975133  
696445766 https://github.com/simonw/sqlite-utils/pull/161#issuecomment-696445766 https://api.github.com/repos/simonw/sqlite-utils/issues/161 MDEyOklzc3VlQ29tbWVudDY5NjQ0NTc2Ng== simonw 9599 2020-09-22T00:10:50Z 2020-09-22T00:11:12Z OWNER

A less horrible interface might be the following:

# Ensure the 'age' column is not null:
table.transform(not_null={"age"})
# The 'age' column is not null but I don't want it to be:
table.transform(not_null={"age": False})

So if the argument is a set it means "make sure these are all not null" - if the argument is a dictionary it means "set these to be null or not null depending on if their dictionary value is true or false".
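
The set-versus-dictionary rule could be resolved along these lines. `resolve_not_null` is a hypothetical helper written for illustration, assuming the currently NOT NULL columns are already known as a set.

```python
def resolve_not_null(not_null, current):
    """Return the set of NOT NULL columns after applying `not_null`.

    current:  set of column names that are NOT NULL today.
    not_null: a set (make these NOT NULL) or a dict mapping
              column name -> bool (True sets NOT NULL, False removes it).
    """
    result = set(current)
    if isinstance(not_null, dict):
        for column, flag in not_null.items():
            if flag:
                result.add(column)
            else:
                result.discard(column)
    else:
        result.update(not_null)
    return result


added = resolve_not_null({"age"}, current={"name"})
removed = resolve_not_null({"age": False}, current={"name", "age"})
```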

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method 705975133  
696444842 https://github.com/simonw/sqlite-utils/pull/161#issuecomment-696444842 https://api.github.com/repos/simonw/sqlite-utils/issues/161 MDEyOklzc3VlQ29tbWVudDY5NjQ0NDg0Mg== simonw 9599 2020-09-22T00:07:43Z 2020-09-22T00:09:05Z OWNER

Syntax challenge: I could use .transform(defaults={"age": None}) to indicate that the age column should have its default removed, but how would I tell .transform() that the age column, currently not null, should have the not null removed from it?

I could do this: .transform(not_not_null={"age"}) - it's a bit gross but it's also kind of funny. I actually like it!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method 705975133  
696444353 https://github.com/simonw/sqlite-utils/pull/161#issuecomment-696444353 https://api.github.com/repos/simonw/sqlite-utils/issues/161 MDEyOklzc3VlQ29tbWVudDY5NjQ0NDM1Mw== simonw 9599 2020-09-22T00:06:12Z 2020-09-22T00:06:12Z OWNER

I should support not_null= and default= arguments to the .transform() method because it looks like you can't use ALTER TABLE to change those.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method 705975133  
696443845 https://github.com/simonw/sqlite-utils/pull/161#issuecomment-696443845 https://api.github.com/repos/simonw/sqlite-utils/issues/161 MDEyOklzc3VlQ29tbWVudDY5NjQ0Mzg0NQ== simonw 9599 2020-09-22T00:04:31Z 2020-09-22T00:04:44Z OWNER

Good news: the .columns introspection does tell me those things:

>>> import sqlite_utils
>>> db = sqlite_utils.Database(memory=True)
>>> db.create_table("foo", {"id": int, "name": str, "age": int}, defaults={"age": 1}, not_null={"name", "age"})
<Table foo (id, name, age)>
>>> db["foo"]
<Table foo (id, name, age)>
>>> print(db["foo"].schema)
CREATE TABLE [foo] (
   [id] INTEGER,
   [name] TEXT NOT NULL,
   [age] INTEGER NOT NULL DEFAULT 1
)
>>> db["foo"].columns
[Column(cid=0, name='id', type='INTEGER', notnull=0, default_value=None, is_pk=0),
 Column(cid=1, name='name', type='TEXT', notnull=1, default_value=None, is_pk=0),
 Column(cid=2, name='age', type='INTEGER', notnull=1, default_value='1', is_pk=0)]
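
The same information is available without sqlite-utils through SQLite's PRAGMA table_info. A small sketch with the standard sqlite3 module, recreating the foo table from the session above by hand:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE foo (id INTEGER, name TEXT NOT NULL,"
    " age INTEGER NOT NULL DEFAULT 1)"
)

# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
columns = conn.execute("PRAGMA table_info(foo)").fetchall()
not_null = {name for _cid, name, _type, notnull, _default, _pk in columns if notnull}
defaults = {
    name: default
    for _cid, name, _type, _notnull, default, _pk in columns
    if default is not None
}
```

Note that the default comes back as the SQL text of the expression, so the integer default is reported as a string.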
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method 705975133  
696443190 https://github.com/simonw/sqlite-utils/pull/161#issuecomment-696443190 https://api.github.com/repos/simonw/sqlite-utils/issues/161 MDEyOklzc3VlQ29tbWVudDY5NjQ0MzE5MA== simonw 9599 2020-09-22T00:02:22Z 2020-09-22T00:02:22Z OWNER

How would I detect which columns are not_null and what their defaults are? I don't think my introspection logic handles that yet.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method 705975133  
696443042 https://github.com/simonw/sqlite-utils/pull/161#issuecomment-696443042 https://api.github.com/repos/simonw/sqlite-utils/issues/161 MDEyOklzc3VlQ29tbWVudDY5NjQ0MzA0Mg== simonw 9599 2020-09-22T00:01:50Z 2020-09-22T00:01:50Z OWNER

When you transform a table, it should keep its primary key, foreign keys, not_null and defaults. I don't think it needs to care about hash_id or extracts= since those don't affect the structure of the table as it is being created - well, hash_id does but if we are transforming an existing table we will get the hash_id column for free.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method 705975133  
696442621 https://github.com/simonw/sqlite-utils/pull/161#issuecomment-696442621 https://api.github.com/repos/simonw/sqlite-utils/issues/161 MDEyOklzc3VlQ29tbWVudDY5NjQ0MjYyMQ== simonw 9599 2020-09-22T00:00:23Z 2020-09-22T00:00:23Z OWNER

I still need to figure out what to do about these various other table properties: https://github.com/simonw/sqlite-utils/blob/b34c9b40c206d7a9d7ee57a8c1f198ff1f522735/sqlite_utils/db.py#L775-L787

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method 705975133  
696435194 https://github.com/simonw/sqlite-utils/issues/114#issuecomment-696435194 https://api.github.com/repos/simonw/sqlite-utils/issues/114 MDEyOklzc3VlQ29tbWVudDY5NjQzNTE5NA== simonw 9599 2020-09-21T23:34:14Z 2020-09-21T23:35:00Z OWNER

I think the fiddliest part of the implementation here is code that takes the existing columns_dict of the table and the incoming columns= and drop= and rename= parameters and produces the columns dictionary for the new table, ready to be fed to .create_table().

This logic probably also needs to return a structure that can be used to build the INSERT INTO ... SELECT ... FROM query.
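
One possible shape for that logic, as a sketch only: `plan_transform` and `copy_sql` are hypothetical names, the column order simply follows the existing table, and renames are applied last as discussed in this thread.

```python
def plan_transform(existing, columns=None, drop=None, rename=None):
    """Work out the new table's columns and the old -> new copy mapping.

    existing: dict of column name -> type for the current table.
    Returns (new_columns, copy_pairs), where copy_pairs lists
    (old_name, new_name) for every column whose data is copied.
    """
    columns = columns or {}
    drop = set(drop or ())
    rename = rename or {}

    new_columns = {}
    copy_pairs = []
    for name, col_type in existing.items():
        if name in drop:
            continue
        new_name = rename.get(name, name)
        new_columns[new_name] = columns.get(name, col_type)
        copy_pairs.append((name, new_name))
    # Columns that do not exist yet are added at the end, with no data to copy.
    for name, col_type in columns.items():
        if name not in existing:
            new_columns[rename.get(name, name)] = col_type
    return new_columns, copy_pairs


def copy_sql(old_table, new_table, copy_pairs):
    """Build the INSERT INTO ... SELECT ... FROM statement for the copy."""
    old_cols = ", ".join(f"[{old}]" for old, _ in copy_pairs)
    new_cols = ", ".join(f"[{new}]" for _, new in copy_pairs)
    return f"INSERT INTO [{new_table}] ({new_cols}) SELECT {old_cols} FROM [{old_table}]"


new_columns, pairs = plan_transform(
    {"id": int, "name": str, "age": str},
    columns={"age": int, "size": float},
    drop=["name"],
    rename={"age": "dog_age"},
)
sql = copy_sql("dogs", "dogs_new_tmp", pairs)
```

The first return value is in the form that could be passed on to .create_table(); the second drives the copy query.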

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method for advanced alter table 621989740  
696434638 https://github.com/simonw/sqlite-utils/issues/114#issuecomment-696434638 https://api.github.com/repos/simonw/sqlite-utils/issues/114 MDEyOklzc3VlQ29tbWVudDY5NjQzNDYzOA== simonw 9599 2020-09-21T23:32:26Z 2020-09-21T23:32:26Z OWNER

A test that confirms that this mechanism can turn a rowid into a non-rowid table would be good too.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method for advanced alter table 621989740  
696434237 https://github.com/simonw/sqlite-utils/issues/114#issuecomment-696434237 https://api.github.com/repos/simonw/sqlite-utils/issues/114 MDEyOklzc3VlQ29tbWVudDY5NjQzNDIzNw== simonw 9599 2020-09-21T23:31:07Z 2020-09-21T23:31:57Z OWNER

Does it make sense to support the pk= argument for changing the primary key?

If the user requests a primary key that doesn't make sense I think an integrity error will be raised when the SQL is being executed, which should hopefully cancel the transaction and raise an error. Need to check that this is what happens.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method for advanced alter table 621989740  
696434097 https://github.com/simonw/sqlite-utils/issues/114#issuecomment-696434097 https://api.github.com/repos/simonw/sqlite-utils/issues/114 MDEyOklzc3VlQ29tbWVudDY5NjQzNDA5Nw== simonw 9599 2020-09-21T23:30:40Z 2020-09-21T23:30:40Z OWNER

Since I have a column_order=None argument already, maybe I can ignore the order of the columns in that first argument and use that instead?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method for advanced alter table 621989740  
696433778 https://github.com/simonw/sqlite-utils/issues/114#issuecomment-696433778 https://api.github.com/repos/simonw/sqlite-utils/issues/114 MDEyOklzc3VlQ29tbWVudDY5NjQzMzc3OA== simonw 9599 2020-09-21T23:29:39Z 2020-09-21T23:29:39Z OWNER

The columns= argument is optional - so you can do just a rename operation like so:

table.transform(rename={"age": "dog_age"})
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method for advanced alter table 621989740  
696433542 https://github.com/simonw/sqlite-utils/issues/114#issuecomment-696433542 https://api.github.com/repos/simonw/sqlite-utils/issues/114 MDEyOklzc3VlQ29tbWVudDY5NjQzMzU0Mg== simonw 9599 2020-09-21T23:28:58Z 2020-09-21T23:28:58Z OWNER

If you want to both change the type of a column AND rename it in the same operation, how would you do that? I think like this:

table.transform({"age": int}, rename={"age": "dog_age"})

So any rename logic is applied at the end, after the type transformation or re-ordering logic.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method for advanced alter table 621989740  
696432690 https://github.com/simonw/sqlite-utils/issues/114#issuecomment-696432690 https://api.github.com/repos/simonw/sqlite-utils/issues/114 MDEyOklzc3VlQ29tbWVudDY5NjQzMjY5MA== simonw 9599 2020-09-21T23:26:32Z 2020-09-21T23:27:38Z OWNER

To expand on what that first argument - the columns argument - does. Say you have a table like this:

id integer
name text
age text

Any columns omitted from the columns= argument are left alone - they have to be explicitly dropped using drop= if you want to drop them.

Any new columns are added (at the end of the table):

table.transform({"size": float})

Any columns that have their type changed will have their type changed:

table.transform({"age": int})

Should I also re-order columns if the order doesn't match? I think so. Open question as to what happens to columns that aren't mentioned at all in the dictionary though - what order should they go in?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method for advanced alter table 621989740  
696431058 https://github.com/simonw/sqlite-utils/issues/114#issuecomment-696431058 https://api.github.com/repos/simonw/sqlite-utils/issues/114 MDEyOklzc3VlQ29tbWVudDY5NjQzMTA1OA== simonw 9599 2020-09-21T23:21:37Z 2020-09-21T23:21:37Z OWNER

I may need to do something special for rowid tables to ensure that the rowid values in the transformed table match those from the old table.
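
One way this might be handled, sketched with the standard sqlite3 module: name rowid explicitly on both sides of the copy so gaps left by deleted rows are preserved. Table names and data are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dogs (name TEXT)")
conn.executemany(
    "INSERT INTO dogs (name) VALUES (?)", [("Cleo",), ("Pancakes",), ("Marnie",)]
)
conn.execute("DELETE FROM dogs WHERE name = 'Pancakes'")  # leaves rowids 1 and 3

conn.execute("CREATE TABLE dogs_new_tmp (name TEXT)")
# Selecting rowid explicitly carries the original values across.
conn.execute("INSERT INTO dogs_new_tmp (rowid, name) SELECT rowid, name FROM dogs")

old_ids = [r[0] for r in conn.execute("SELECT rowid FROM dogs ORDER BY rowid")]
new_ids = [r[0] for r in conn.execute("SELECT rowid FROM dogs_new_tmp ORDER BY rowid")]
```

Without the explicit rowid column, the copied rows would be renumbered 1 and 2.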

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method for advanced alter table 621989740  
696430843 https://github.com/simonw/sqlite-utils/issues/114#issuecomment-696430843 https://api.github.com/repos/simonw/sqlite-utils/issues/114 MDEyOklzc3VlQ29tbWVudDY5NjQzMDg0Mw== simonw 9599 2020-09-21T23:21:00Z 2020-09-21T23:21:00Z OWNER

For FTS tables associated with the table that is being transformed, should I automatically drop the old FTS table and recreate it against the new one or will it just magically continue to work after the table is renamed?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method for advanced alter table 621989740  
696423138 https://github.com/simonw/sqlite-utils/issues/114#issuecomment-696423138 https://api.github.com/repos/simonw/sqlite-utils/issues/114 MDEyOklzc3VlQ29tbWVudDY5NjQyMzEzOA== simonw 9599 2020-09-21T22:59:17Z 2020-09-21T23:01:06Z OWNER

I'm going to sketch out a prototype of this new API design in that branch.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method for advanced alter table 621989740  
696423066 https://github.com/simonw/sqlite-utils/issues/114#issuecomment-696423066 https://api.github.com/repos/simonw/sqlite-utils/issues/114 MDEyOklzc3VlQ29tbWVudDY5NjQyMzA2Ng== simonw 9599 2020-09-21T22:59:01Z 2020-09-21T22:59:01Z OWNER

I'm rethinking the API design now. Maybe it could look like this:

To change the type of the author_id column from text to int:

books.transform({"author_id": int})

This would leave the existing columns alone, but would change the type of this column.

To rename author_id to author_identifier:

books.transform(rename={"author_id": "author_identifier"})

To drop a column:

books.transform(drop=["author_id"])

Since the parameters all operate on columns they don't need to be called drop_column and rename_column.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method for advanced alter table 621989740  
696421240 https://github.com/simonw/sqlite-utils/issues/114#issuecomment-696421240 https://api.github.com/repos/simonw/sqlite-utils/issues/114 MDEyOklzc3VlQ29tbWVudDY5NjQyMTI0MA== simonw 9599 2020-09-21T22:53:48Z 2020-09-21T22:53:48Z OWNER

I've decided to call this table.transform() - I was over-thinking whether people would remember that .transform() actually transforms the table, but that's what documentation is for.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.transform() method for advanced alter table 621989740  
696308847 https://github.com/simonw/datasette/issues/972#issuecomment-696308847 https://api.github.com/repos/simonw/datasette/issues/972 MDEyOklzc3VlQ29tbWVudDY5NjMwODg0Nw== simonw 9599 2020-09-21T19:01:25Z 2020-09-21T19:01:25Z OWNER

I did a bunch of initial work for this in #427.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support faceting against arbitrary SQL queries 705840673  
696307922 https://github.com/simonw/datasette/issues/971#issuecomment-696307922 https://api.github.com/repos/simonw/datasette/issues/971 MDEyOklzc3VlQ29tbWVudDY5NjMwNzkyMg== simonw 9599 2020-09-21T18:59:52Z 2020-09-21T19:00:02Z OWNER

Given dbstat isn't as widely available as I thought I'm going to let people who want to use dbstat run their own select * from dbstat queries rather than bake support directly into Datasette.

The experience of exploring dbstat will improve if I land support for running facets against arbitrary custom SQL queries, which is half-done in that facets now execute against wrapped subqueries as of ea66c45df96479ef66a89caa71fff1a97a862646

https://github.com/simonw/datasette/blob/ea66c45df96479ef66a89caa71fff1a97a862646/datasette/facets.py#L192-L200

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support the dbstat table 705827457  
696304108 https://github.com/simonw/datasette/issues/971#issuecomment-696304108 https://api.github.com/repos/simonw/datasette/issues/971 MDEyOklzc3VlQ29tbWVudDY5NjMwNDEwOA== simonw 9599 2020-09-21T18:52:50Z 2020-09-21T18:52:50Z OWNER

Looks like the pysqlite3-binary package doesn't support dbstat either.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support the dbstat table 705827457  
696302868 https://github.com/simonw/datasette/issues/971#issuecomment-696302868 https://api.github.com/repos/simonw/datasette/issues/971 MDEyOklzc3VlQ29tbWVudDY5NjMwMjg2OA== simonw 9599 2020-09-21T18:50:40Z 2020-09-21T18:50:40Z OWNER

Easiest way to get this may be to run create view dbstat_view as select * from dbstat on databases that support it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support the dbstat table 705827457  
696302020 https://github.com/simonw/datasette/issues/971#issuecomment-696302020 https://api.github.com/repos/simonw/datasette/issues/971 MDEyOklzc3VlQ29tbWVudDY5NjMwMjAyMA== simonw 9599 2020-09-21T18:49:09Z 2020-09-21T18:49:09Z OWNER

... made harder to work on because I apparently don't have the DBSTAT_VTAB module on macOS.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support the dbstat table 705827457  
696298614 https://github.com/simonw/datasette/issues/971#issuecomment-696298614 https://api.github.com/repos/simonw/datasette/issues/971 MDEyOklzc3VlQ29tbWVudDY5NjI5ODYxNA== simonw 9599 2020-09-21T18:43:07Z 2020-09-21T18:43:07Z OWNER

Or, do this:

SELECT 1 FROM dbstat limit 1;

And see if it returns a "table does not exist" error.
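
Both detection ideas in this thread (probing the table, and reading the compile-time options) can be sketched with the standard sqlite3 module. The function names are hypothetical, and the exact error text may vary between SQLite builds.

```python
import sqlite3


def supports_dbstat(conn):
    """Return True if the dbstat virtual table can be queried."""
    try:
        conn.execute("SELECT 1 FROM dbstat LIMIT 1").fetchall()
        return True
    except sqlite3.OperationalError:
        # Typically "no such table: dbstat" when the module is not compiled in.
        return False


def compile_options(conn):
    """Return the SQLite compile-time options as a list of strings."""
    return [row[0] for row in conn.execute("PRAGMA compile_options")]


conn = sqlite3.connect(":memory:")
available = supports_dbstat(conn)
listed = "ENABLE_DBSTAT_VTAB" in compile_options(conn)
```

Probing the table has the advantage of testing the actual capability rather than relying on how the build reports its options.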

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support the dbstat table 705827457  
696297930 https://github.com/simonw/datasette/issues/971#issuecomment-696297930 https://api.github.com/repos/simonw/datasette/issues/971 MDEyOklzc3VlQ29tbWVudDY5NjI5NzkzMA== simonw 9599 2020-09-21T18:41:47Z 2020-09-21T18:41:47Z OWNER

https://www.sqlite.org/dbstat.html

The DBSTAT virtual table is an eponymous virtual table, meaning that is not necessary to run CREATE VIRTUAL TABLE to create an instance of the dbstat virtual table before using it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support the dbstat table 705827457  
696297601 https://github.com/simonw/datasette/issues/971#issuecomment-696297601 https://api.github.com/repos/simonw/datasette/issues/971 MDEyOklzc3VlQ29tbWVudDY5NjI5NzYwMQ== simonw 9599 2020-09-21T18:41:07Z 2020-09-21T18:41:07Z OWNER

How to detect it? Looks like it's visible in SQLite compile time options: https://latest.datasette.io/-/versions

        "compile_options": [
            "COMPILER=gcc-8.3.0",
            "ENABLE_COLUMN_METADATA",
            "ENABLE_DBSTAT_VTAB",
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support the dbstat table 705827457  
695896557 https://github.com/simonw/datasette/issues/970#issuecomment-695896557 https://api.github.com/repos/simonw/datasette/issues/970 MDEyOklzc3VlQ29tbWVudDY5NTg5NjU1Nw== simonw 9599 2020-09-21T04:40:12Z 2020-09-21T04:40:12Z OWNER

The Python standard library has a module for this: https://docs.python.org/3/library/webbrowser.html
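
A minimal sketch of how that module might be used for this feature. `open_in_browser` and its parameters are hypothetical, not Datasette's code; the `opener` argument exists only so the function can be exercised without launching a real browser.

```python
import webbrowser


def open_in_browser(host, port, opener=webbrowser.open):
    """Build the server URL and hand it to `opener` (webbrowser.open by default)."""
    url = f"http://{host}:{port}/"
    opener(url)
    return url
```

Calling `open_in_browser("127.0.0.1", 8001)` once the server is listening would open the default browser at that address.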

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
request an "-o" option on "datasette server" to open the default browser at the running url 705108492  
695895960 https://github.com/simonw/datasette/issues/970#issuecomment-695895960 https://api.github.com/repos/simonw/datasette/issues/970 MDEyOklzc3VlQ29tbWVudDY5NTg5NTk2MA== simonw 9599 2020-09-21T04:36:45Z 2020-09-21T04:36:45Z OWNER

I like this. It could work with the --root option too and automatically sign you in as the root user.

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 1,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
request an "-o" option on "datasette server" to open the default browser at the running url 705108492  
695879531 https://github.com/dogsheep/dogsheep-beta/issues/26#issuecomment-695879531 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26 MDEyOklzc3VlQ29tbWVudDY5NTg3OTUzMQ== simonw 9599 2020-09-21T02:55:28Z 2020-09-21T02:55:54Z MEMBER

Actually for the tie-breaker it should be something like https://latest.datasette.io/fixtures?sql=select+pk%2C+created%2C+planet_int%2C+on_earth%2C+state%2C+city_id%2C+neighborhood%2C+tags%2C+complex_array%2C+distinct_some_null+from+facetable+where+%28created+%3E+%3Ap1+or+%28created+%3D+%3Ap1+and+%28%28pk+%3E+%3Ap0%29%29%29%29+order+by+created%2C+pk+limit+11&p0=10&p1=2019-01-16+08%3A00%3A00

where
  (
    created > :p1
    or (
      created = :p1
      and ((pk > :p0))
    )
  )

But with rowid and timestamp in place of pk and created.
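
With rowid and timestamp substituted in, the tie-breaker could look like the following sketch against an in-memory table. It uses ascending order for simplicity; the table contents and the `next_page` helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE search_index (title TEXT, timestamp TEXT)")
conn.executemany(
    "INSERT INTO search_index (title, timestamp) VALUES (?, ?)",
    [
        ("a", "2019-01-15T08:00:00"),
        ("b", "2019-01-16T08:00:00"),
        ("c", "2019-01-16T08:00:00"),  # same timestamp as "b": needs a tie-breaker
        ("d", "2019-01-17T08:00:00"),
    ],
)


def next_page(conn, last_timestamp, last_rowid, size):
    """Return rows strictly after the (timestamp, rowid) of the previous page's last row."""
    return conn.execute(
        """
        select rowid, title, timestamp from search_index
        where timestamp > :ts or (timestamp = :ts and rowid > :rowid)
        order by timestamp, rowid
        limit :size
        """,
        {"ts": last_timestamp, "rowid": last_rowid, "size": size},
    ).fetchall()


# The previous page ended on row "b" (rowid 2).
page = next_page(conn, "2019-01-16T08:00:00", 2, 10)
```

Row "c" shares a timestamp with "b" but is still returned, because the rowid comparison breaks the tie.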

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Pagination 705215230  
695879237 https://github.com/dogsheep/dogsheep-beta/issues/26#issuecomment-695879237 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26 MDEyOklzc3VlQ29tbWVudDY5NTg3OTIzNw== simonw 9599 2020-09-21T02:53:29Z 2020-09-21T02:53:29Z MEMBER

If previous page ended at 2018-02-11T16:32:53+00:00:

select
  search_index.rowid,
  search_index.type,
  search_index.key,
  search_index.title,
  search_index.category,
  search_index.timestamp,
  search_index.search_1
from
  search_index
 where 
  date("timestamp") = '2018-02-11'
 and timestamp < '2018-02-11T16:32:53+00:00'
order by
  search_index.timestamp desc, rowid
limit 41
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Pagination 705215230  
695877627 https://github.com/dogsheep/dogsheep-beta/issues/16#issuecomment-695877627 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/16 MDEyOklzc3VlQ29tbWVudDY5NTg3NzYyNw== simonw 9599 2020-09-21T02:42:29Z 2020-09-21T02:42:29Z MEMBER

Fun twist: assuming timestamp is always stored as UTC, I need the interface to be timezone aware so I can see e.g. everything from 4th July 2020 as that day is defined in the San Francisco timezone.
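
A sketch of the conversion involved: turn a local calendar day into a UTC range that can be compared against the stored timestamps. For simplicity this assumes a fixed UTC-7 offset, which is what San Francisco observes on that date; the standard library's zoneinfo module could supply the correct offset for arbitrary dates. The helper name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Pacific Daylight Time, in effect in San Francisco in July.
PDT = timezone(timedelta(hours=-7))


def utc_bounds_for_local_day(year, month, day, tz):
    """Return (start, end) of a local calendar day as UTC ISO 8601 strings."""
    start_local = datetime(year, month, day, tzinfo=tz)
    end_local = start_local + timedelta(days=1)
    return (
        start_local.astimezone(timezone.utc).isoformat(),
        end_local.astimezone(timezone.utc).isoformat(),
    )


start, end = utc_bounds_for_local_day(2020, 7, 4, PDT)
```

Rows for that local day would then be those with `timestamp >= start and timestamp < end`.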

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Timeline view 694493566  
695875274 https://github.com/dogsheep/dogsheep-beta/issues/26#issuecomment-695875274 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26 MDEyOklzc3VlQ29tbWVudDY5NTg3NTI3NA== simonw 9599 2020-09-21T02:28:58Z 2020-09-21T02:28:58Z MEMBER

Datasette's implementation is complex because it has to support compound primary keys: https://github.com/simonw/datasette/blob/a258339a935d8d29a95940ef1db01e98bb85ae63/datasette/utils/__init__.py#L88-L114 - but that's not something that's needed for dogsheep-beta.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Pagination 705215230  
695856967 https://github.com/dogsheep/dogsheep-beta/issues/26#issuecomment-695856967 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26 MDEyOklzc3VlQ29tbWVudDY5NTg1Njk2Nw== simonw 9599 2020-09-21T00:26:59Z 2020-09-21T00:26:59Z MEMBER

It's a shame Datasette doesn't currently have an easy way to implement sorted-by-rank keyset pagination using a TableView or QueryView. I'll have to do this using the custom SQL query constructed in the plugin: https://github.com/dogsheep/dogsheep-beta/blob/bed9df2b3ef68189e2e445427721a28f4e9b4887/dogsheep_beta/__init__.py#L8-L43

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Pagination 705215230  
695856398 https://github.com/dogsheep/dogsheep-beta/issues/26#issuecomment-695856398 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26 MDEyOklzc3VlQ29tbWVudDY5NTg1NjM5OA== simonw 9599 2020-09-21T00:22:20Z 2020-09-21T00:22:20Z MEMBER

I'm going to try for keyset pagination sorted by relevance just as a learning exercise.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Pagination 705215230  
695855723 https://github.com/dogsheep/dogsheep-beta/issues/26#issuecomment-695855723 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26 MDEyOklzc3VlQ29tbWVudDY5NTg1NTcyMw== simonw 9599 2020-09-21T00:16:52Z 2020-09-21T00:17:53Z MEMBER

It feels a bit weird to implement keyset pagination against results sorted by rank because the ranks could change substantially if the search index gets updated while the user is paginating.

I may just ignore that though. If you want reliable pagination you can get it by sorting by date. Maybe it doesn't even make sense to offer pagination if you sort by relevance?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Pagination 705215230  
695855646 https://github.com/dogsheep/dogsheep-beta/issues/26#issuecomment-695855646 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/26 MDEyOklzc3VlQ29tbWVudDY5NTg1NTY0Ng== simonw 9599 2020-09-21T00:16:11Z 2020-09-21T00:16:11Z MEMBER

Should I do this with offset/limit or should I do proper keyset pagination?

I think keyset because then it will work well for the full search interface with no filters or search string.
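For comparison with offset/limit, here is a rough sketch of keyset pagination using only the standard-library sqlite3 module. The items table, its columns and the fetch_page helper are hypothetical stand-ins, not the dogsheep-beta schema or code. The cursor is the (timestamp, rowid) pair of the last row on the page, with rowid acting as a tie-breaker for identical timestamps:

```python
import sqlite3


def fetch_page(conn, page_size, after=None):
    """Return (rows, next_cursor), newest first, using keyset pagination."""
    if after is None:
        sql = (
            "select rowid, timestamp, title from items "
            "order by timestamp desc, rowid desc limit ?"
        )
        params = (page_size + 1,)
    else:
        # Only rows strictly "after" the cursor in the sort order
        sql = (
            "select rowid, timestamp, title from items "
            "where timestamp < ? or (timestamp = ? and rowid < ?) "
            "order by timestamp desc, rowid desc limit ?"
        )
        params = (after[0], after[0], after[1], page_size + 1)
    rows = conn.execute(sql, params).fetchall()
    # Fetching one extra row tells us whether another page exists
    has_more = len(rows) > page_size
    rows = rows[:page_size]
    next_cursor = (rows[-1][1], rows[-1][0]) if has_more else None
    return rows, next_cursor


conn = sqlite3.connect(":memory:")
conn.execute("create table items (timestamp text, title text)")
conn.executemany(
    "insert into items (timestamp, title) values (?, ?)",
    [
        ("2020-09-01T00:00:00+00:00", "a"),
        ("2020-09-02T00:00:00+00:00", "b"),
        ("2020-09-02T00:00:00+00:00", "c"),
        ("2020-09-03T00:00:00+00:00", "d"),
        ("2020-09-04T00:00:00+00:00", "e"),
    ],
)

titles, cursor = [], None
while True:
    rows, cursor = fetch_page(conn, 2, cursor)
    titles.extend(row[2] for row in rows)
    if cursor is None:
        break
print(titles)  # ['e', 'd', 'c', 'b', 'a']
```

Unlike offset/limit, each page only needs to find the rows that sort after the cursor rather than scanning past every row on the earlier pages, which is why it tends to hold up better on large unfiltered result sets.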

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Pagination 705215230  
695851036 https://github.com/dogsheep/dogsheep-beta/issues/16#issuecomment-695851036 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/16 MDEyOklzc3VlQ29tbWVudDY5NTg1MTAzNg== simonw 9599 2020-09-20T23:34:57Z 2020-09-20T23:34:57Z MEMBER

Really basic starting point is to add facet by date.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Timeline view 694493566  
695839557 https://github.com/simonw/sqlite-utils/issues/160#issuecomment-695839557 https://api.github.com/repos/simonw/sqlite-utils/issues/160 MDEyOklzc3VlQ29tbWVudDY5NTgzOTU1Nw== simonw 9599 2020-09-20T21:37:03Z 2020-09-20T21:37:03Z OWNER

Should this support ignore=True as well? I'm tempted to skip that - I think replace=True is more useful because it implies "ignore if the options are already the same, but replace if they are different".

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.enable_fts(..., replace=True) 705190723  
695698227 https://github.com/simonw/sqlite-utils/issues/42#issuecomment-695698227 https://api.github.com/repos/simonw/sqlite-utils/issues/42 MDEyOklzc3VlQ29tbWVudDY5NTY5ODIyNw== simonw 9599 2020-09-20T04:27:26Z 2020-09-20T04:28:26Z OWNER

This is going to need #114 (the transform_table() method) in order to convert string columns into integer foreign key columns.
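To illustrate why a table rebuild is involved, here is a sketch of the underlying steps in plain SQL run through the standard-library sqlite3 module. This is not the sqlite-utils implementation, and the trees and species tables are made-up examples. SQLite cannot change a column's type or add a foreign key in place, so the table is recreated and the data copied across:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table trees (id integer primary key, name text, species text);
    insert into trees (name, species) values
        ('Tree 1', 'Oak'), ('Tree 2', 'Palm'), ('Tree 3', 'Oak');

    -- 1. Lookup table holding each distinct string once
    create table species (id integer primary key, name text unique);
    insert into species (name) select distinct species from trees;

    -- 2. Replacement table with an integer foreign key column
    create table trees_new (
        id integer primary key,
        name text,
        species_id integer references species(id)
    );
    insert into trees_new (id, name, species_id)
        select trees.id, trees.name, species.id
        from trees join species on species.name = trees.species;

    -- 3. Swap the new table into place
    drop table trees;
    alter table trees_new rename to trees;
    """
)

rows = conn.execute(
    "select trees.name, species.name from trees "
    "join species on trees.species_id = species.id order by trees.id"
).fetchall()
print(rows)  # [('Tree 1', 'Oak'), ('Tree 2', 'Palm'), ('Tree 3', 'Oak')]
```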

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
table.extract(...) method and "sqlite-utils extract" command 470345929  
695695776 https://github.com/simonw/sqlite-utils/issues/68#issuecomment-695695776 https://api.github.com/repos/simonw/sqlite-utils/issues/68 MDEyOklzc3VlQ29tbWVudDY5NTY5NTc3Ng== simonw 9599 2020-09-20T04:25:47Z 2020-09-20T04:25:47Z OWNER

This is a dupe of #130

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add support for porter stemming in FTS 531583658  
695133768 https://github.com/simonw/datasette/issues/943#issuecomment-695133768 https://api.github.com/repos/simonw/datasette/issues/943 MDEyOklzc3VlQ29tbWVudDY5NTEzMzc2OA== simonw 9599 2020-09-19T00:06:56Z 2020-09-19T00:07:35Z OWNER

dogsheep-beta could do with this too. It currently makes a call to TableView in a similar way to datasette-graphql in order to calculate facets.

dogsheep-beta would benefit from a mechanism for changing the facet timeout setting during that call (as would datasette-graphql, see the DatasetteSpecialConfig mechanism it uses).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
await datasette.client.get(path) mechanism for executing internal requests 681375466  
695124698 https://github.com/dogsheep/dogsheep-beta/issues/15#issuecomment-695124698 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/15 MDEyOklzc3VlQ29tbWVudDY5NTEyNDY5OA== simonw 9599 2020-09-18T23:17:38Z 2020-09-18T23:17:38Z MEMBER

This can be part of the demo instance in #6.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add a bunch of config examples 694136490  
695113871 https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-695113871 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 MDEyOklzc3VlQ29tbWVudDY5NTExMzg3MQ== simonw 9599 2020-09-18T22:30:17Z 2020-09-18T22:30:17Z MEMBER

I think I know what's going on here:

https://github.com/dogsheep/dogsheep-beta/blob/0f1b951c5131d16f3c8559a8e4d79ed5c559e3cb/dogsheep_beta/__init__.py#L166-L171

This is a logic bug - the compiled variable could be the template from the previous loop!
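The linked lines are not reproduced here, so the following is only a hypothetical reconstruction of this class of bug, not the actual dogsheep-beta code. Plain callables stand in for compiled Jinja templates, and the function names and the fallback behaviour are assumptions made for illustration:

```python
def render_all_buggy(results, templates):
    output = []
    for result in results:
        template = templates.get(result["type"])
        if template is not None:
            compiled = template
        # Bug: when this result has no template, `compiled` still refers to
        # the template from an earlier iteration (or is unbound on the first).
        output.append(compiled(result))
    return output


def render_all_fixed(results, templates):
    output = []
    for result in results:
        compiled = templates.get(result["type"])  # rebound on every iteration
        if compiled is None:
            output.append(result["type"])  # hypothetical fallback rendering
        else:
            output.append(compiled(result))
    return output


templates = {"tweets": lambda row: "Tweet: {}".format(row.get("text"))}
results = [
    {"type": "tweets", "text": "hello"},
    {"type": "photos", "url": "photo.jpg"},
]

print(render_all_buggy(results, templates))  # ['Tweet: hello', 'Tweet: None']
print(render_all_fixed(results, templates))  # ['Tweet: hello', 'photos']
```

In the buggy version the photo row is rendered with the tweet template, so a field the template expects is missing. That would be consistent with a template seeing an Undefined value even though the row being inspected in the debugger looks complete.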

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814  
695109140 https://github.com/dogsheep/dogsheep-beta/issues/25#issuecomment-695109140 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/25 MDEyOklzc3VlQ29tbWVudDY5NTEwOTE0MA== simonw 9599 2020-09-18T22:12:20Z 2020-09-18T22:12:20Z MEMBER

Documented here: https://github.com/dogsheep/dogsheep-beta/blob/534fc9689227eba70e69a45da0cee5820bbda9e1/README.md#datasette-plugin

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
template_debug mechanism 704685890  
695108895 https://github.com/dogsheep/dogsheep-beta/issues/25#issuecomment-695108895 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/25 MDEyOklzc3VlQ29tbWVudDY5NTEwODg5NQ== simonw 9599 2020-09-18T22:11:32Z 2020-09-18T22:11:32Z MEMBER

I'm going to make this a new plugin configuration setting, template_debug.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
template_debug mechanism 704685890  
694557425 https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-694557425 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 MDEyOklzc3VlQ29tbWVudDY5NDU1NzQyNQ== simonw 9599 2020-09-17T23:41:01Z 2020-09-17T23:41:01Z MEMBER

I removed all of the json.loads() calls and I'm still getting that Undefined error.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814  
694554584 https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-694554584 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 MDEyOklzc3VlQ29tbWVudDY5NDU1NDU4NA== simonw 9599 2020-09-17T23:31:25Z 2020-09-17T23:31:25Z MEMBER

I'd prefer it if errors in these template fragments were displayed as errors inline where the fragment should have been inserted, rather than 500ing the whole page - especially since the template fragments are user-provided and could have all kinds of odd errors in them which should be as easy to debug as possible.
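One possible shape for this, sketched with only the standard library: wrap each fragment render in a try/except and return an escaped error block in place of the fragment. The render_fragment_safely helper and the fragment-error class name are hypothetical, and a plain callable stands in for a compiled template's render method:

```python
import html


def render_fragment_safely(render, context):
    """Render one fragment, returning an inline error block if it fails."""
    try:
        return render(context)
    except Exception as ex:
        # Escape the message because it may echo user-provided template content
        return '<pre class="fragment-error">{}: {}</pre>'.format(
            html.escape(type(ex).__name__), html.escape(str(ex))
        )


def broken_fragment(context):
    raise TypeError("the JSON object must be str, bytes or bytearray")


good = render_fragment_safely(
    lambda context: "<p>{}</p>".format(context["title"]), {"title": "ok"}
)
bad = render_fragment_safely(broken_fragment, {"title": "ok"})
print(good)
print(bad)
```

Catching Exception broadly is deliberate in this sketch, since the fragments are user-provided and any failure in one of them should stay local to that result.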

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814  
694553579 https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-694553579 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 MDEyOklzc3VlQ29tbWVudDY5NDU1MzU3OQ== simonw 9599 2020-09-17T23:28:37Z 2020-09-17T23:28:37Z MEMBER

More investigation in pdb:

(dogsheep-beta) dogsheep-beta % datasette . --get '/-/beta?q=pycon&sort=oldest' --pdb
> /usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/__init__.py(341)loads()
-> raise TypeError(f'the JSON object must be str, bytes or bytearray, '
(Pdb) list
336             if s.startswith('\ufeff'):
337                 raise JSONDecodeError("Unexpected UTF-8 BOM (decode using utf-8-sig)",
338                                       s, 0)
339         else:
340             if not isinstance(s, (bytes, bytearray)):
341  ->             raise TypeError(f'the JSON object must be str, bytes or bytearray, '
342                                 f'not {s.__class__.__name__}')
343             s = s.decode(detect_encoding(s), 'surrogatepass')
344     
345         if "encoding" in kw:
346             import warnings
(Pdb) bytes
<class 'bytes'>
(Pdb) locals()['s']
Undefined
(Pdb) type(locals()['s'])
<class 'jinja2.runtime.Undefined'>
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814  
694552681 https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-694552681 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 MDEyOklzc3VlQ29tbWVudDY5NDU1MjY4MQ== simonw 9599 2020-09-17T23:25:54Z 2020-09-17T23:25:54Z MEMBER

This is the template fragment it's rendering:

            <div style="overflow: hidden;">
              <p>Tweet by <a href="https://twitter.com/{{ display.screen_name }}">@{{ display.screen_name }}</a> ({{ display.user_name }}, {{ "{:,}".format(display.followers_count or 0) }} followers)
                on <a href="https://twitter.com/{{ display.screen_name }}/status/{{ display.tweet_id }}">{{ display.created_at }}</a></p>
              </p>
              <blockquote>{{ display.full_text }}</blockquote>
              {% if display.media_urls and json.loads(display.media_urls) %}
                {% for url in json.loads(display.media_urls) %}
                  <img src="{{ url }}" style="height: 200px;">
                {% endfor %}
              {% endif %}
            </div>
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814  
694552393 https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-694552393 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 MDEyOklzc3VlQ29tbWVudDY5NDU1MjM5Mw== simonw 9599 2020-09-17T23:25:01Z 2020-09-17T23:25:17Z MEMBER

Ran locals() in the debugger:
{'range': <class 'range'>, 'dict': <class 'dict'>, 'lipsum': <function generate_lorem_ipsum at 0x10aeff430>, 'cycler': <class 'jinja2.utils.Cycler'>, 'joiner': <class 'jinja2.utils.Joiner'>, 'namespace': <class 'jinja2.utils.Namespace'>, 'rank': -9.383801886431414, 'rowid': 14297, 'type': 'twitter.db/tweets', 'key': '312658917933076480', 'title': 'Tweet by @chrisstreeter', 'category': 2, 'timestamp': '2013-03-15T20:17:49+00:00', 'search_1': '@simonw are you at pycon? Would love to meet you.', 'display': {'avatar_url': 'https://pbs.twimg.com/profile_images/806275088597204993/38yLHfJi_normal.jpg', 'user_name': 'Chris Streeter', 'screen_name': 'chrisstreeter', 'followers_count': 280, 'tweet_id': 312658917933076480, 'created_at': '2013-03-15T20:17:49+00:00', 'full_text': '@simonw are you at pycon? Would love to meet you.', 'media_urls_2': '[]', 'media_urls': '[]'}, 'json': <module 'json' from '/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/__init__.py'>}

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814  
694551646 https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-694551646 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 MDEyOklzc3VlQ29tbWVudDY5NDU1MTY0Ng== simonw 9599 2020-09-17T23:22:48Z 2020-09-17T23:22:48Z MEMBER

Looks like it's happening in a Jinja fragment template for one of the results:

  /Users/simon/Dropbox/Development/dogsheep-beta/dogsheep_beta/__init__.py(169)process_results()
-> output = compiled.render({**result, **{"json": json}})
  /Users/simon/.local/share/virtualenvs/dogsheep-beta-u_po4Rpj/lib/python3.8/site-packages/jinja2/asyncsupport.py(71)render()
-> return original_render(self, *args, **kwargs)
  /Users/simon/.local/share/virtualenvs/dogsheep-beta-u_po4Rpj/lib/python3.8/site-packages/jinja2/environment.py(1090)render()
-> self.environment.handle_exception()
  /Users/simon/.local/share/virtualenvs/dogsheep-beta-u_po4Rpj/lib/python3.8/site-packages/jinja2/environment.py(832)handle_exception()
-> reraise(*rewrite_traceback_stack(source=source))
  /Users/simon/.local/share/virtualenvs/dogsheep-beta-u_po4Rpj/lib/python3.8/site-packages/jinja2/_compat.py(28)reraise()
-> raise value.with_traceback(tb)
  <template>(5)top-level template code()
> /usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/__init__.py(341)loads()
-> raise TypeError(f'the JSON object must be str, bytes or bytearray, '
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814  
694551406 https://github.com/dogsheep/dogsheep-beta/issues/24#issuecomment-694551406 https://api.github.com/repos/dogsheep/dogsheep-beta/issues/24 MDEyOklzc3VlQ29tbWVudDY5NDU1MTQwNg== simonw 9599 2020-09-17T23:22:07Z 2020-09-17T23:22:07Z MEMBER

Neat, I can debug this with the new --pdb option:

datasette . --get '/-/beta?q=pycon&sort=oldest' --pdb
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
the JSON object must be str, bytes or bytearray, not 'Undefined' 703970814  


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id]),
   [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Query took 900.552ms · About: github-to-sqlite