id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,pull_request,body,repo,type,active_lock_reason,performed_via_github_app,reactions,draft,state_reason
1655860104,I_kwDOCGYnMM5ismuI,535,rows: --transpose or psql extended view-like functionality,7908073,closed,0,,,2,2023-04-05T15:37:33Z,2023-06-15T08:39:49Z,2023-06-14T22:05:28Z,CONTRIBUTOR,,"It would be nice if the rows subcommand had a flag, perhaps called `--transpose` which would print in long form instead of wide. Similar to extended display mode in psql (`\x`)

In other words instead of this:

```
sqlite-utils rows  --limit 5 --fmt github track_metadata.db songs
```

| track_id           | title             | song_id            | release                              | artist_id          | artist_mbid                          | artist_name      |   duration |   artist_familiarity |   artist_hotttnesss |   year |   track_7digitalid |   shs_perf |   shs_work |
|--------------------|-------------------|--------------------|--------------------------------------|--------------------|--------------------------------------|------------------|------------|----------------------|---------------------|--------|--------------------|------------|------------|
| TRMMMYQ128F932D901 | Silent Night      | SOQMMHC12AB0180CB8 | Monster Ballads X-Mas                | ARYZTJS1187B98C555 | 357ff05d-848a-44cf-b608-cb34b5701ae5 | Faster Pussy cat |    252.055 |             0.649822 |            0.394032 |   2003 |            7032331 |         -1 |          0 |
| TRMMMKD128F425225D | Tanssi vaan       | SOVFVAK12A8C1350D9 | Karkuteillä                          | ARMVN3U1187FB3A1EB | 8d7ef530-a6fd-4f8f-b2e2-74aec765e0f9 | Karkkiautomaatti |    156.551 |             0.439604 |            0.356992 |   1995 |            1514808 |         -1 |          0 |
| TRMMMRX128F93187D9 | No One Could Ever | SOGTUKN12AB017F4F1 | Butter                               | ARGEKB01187FB50750 | 3d403d44-36ce-465c-ad43-ae877e65adc4 | Hudson Mohawke   |    138.971 |             0.643681 |            0.437504 |   2006 |            6945353 |         -1 |          0 |
| TRMMMCH128F425532C | Si Vos Querés     | SOBNYVR12A8C13558C | De Culo                              | ARNWYLR1187B9B2F9C | 12be7648-7094-495f-90e6-df4189d68615 | Yerba Brava      |    145.058 |             0.448501 |            0.372349 |   2003 |            2168257 |         -1 |          0 |
| TRMMMWA128F426B589 | Tangle Of Aspens  | SOHSBXH12A8C13B0DF | Rene Ablaze Presents Winter Sessions | AREQDTE1269FB37231 |                                      | Der Mystic       |    514.298 |             0        |            0        |      0 |            2264873 |         -1 |          0 |


The output would look something like this:

```
$ for col in (sqlite-columns track_metadata.db songs)
    sqlite-utils --fmt github track_metadata.db ""select $col from songs order by rowid desc limit 5""
end
```

| track_id           |
|--------------------|
| TRYYYVU12903CD01E3 |
| TRYYYDJ128F9310A21 |
| TRYYYMG128F4260ECA |
| TRYYYJO128F426DA37 |
| TRYYYUS12903CD2DF0 |
| title                               |
|-------------------------------------|
| Fernweh feat. Sektion Kuchikäschtli |
| Faraday                             |
| Novemba                             |
| Jago Chhadeo                        |
| O Samba Da Vida                     |
| song_id            |
|--------------------|
| SOWXJXQ12AB0189F43 |
| SOLXGOR12A81C21EB7 |
| SOHODZI12A8C137BB3 |
| SOXQYIQ12A8C137FBB |
| SOTXAME12AB018F136 |
| release                         |
|---------------------------------|
| So Oder So                      |
| The Trance Collection Vol. 2    |
| Dub_Connected: electronic music |
| Naale Baba Lassi Pee Gya        |
| Pacha V.I.P.                    |
| artist_id          |
|--------------------|
| AR7PLM21187B990D08 |
| ARCMCOK1187B9B1073 |
| ARZ3R6M1187B9AF750 |
| ART5FZD1187B9A7FCF |
| AR7Z4J81187FB3FC59 |
| artist_mbid                          |
|--------------------------------------|
| 3af2b07e-c91c-4160-9bda-f0b9e3144ed3 |
| 4ac5f3de-c5ad-475e-ad50-41f1ef9dba20 |
| 8b97e9c8-61f5-4615-9a96-276f24204e34 |
| 2357c400-9109-42b6-b3fe-9e2d9f8e3872 |
| 9d50cb20-7e42-45cc-b0dd-154c3e92a577 |
| artist_name    |
|----------------|
| Texta          |
| Elude          |
| Gabriel Le Mar |
| Kuldeep Manak  |
| Kiko Navarro   |
|   duration |
|------------|
|    295.079 |
|    484.519 |
|    553.038 |
|    244.166 |
|    217.443 |
|   artist_familiarity |
|----------------------|
|             0.552977 |
|             0.403668 |
|             0.556918 |
|             0.4015   |
|             0.528617 |
|   artist_hotttnesss |
|---------------------|
|            0.454869 |
|            0.256935 |
|            0.336914 |
|            0.374866 |
|            0.411595 |
|   year |
|--------|
|   2004 |
|      0 |
|      0 |
|      0 |
|      0 |
|   track_7digitalid |
|--------------------|
|            8486723 |
|            5472456 |
|            2219291 |
|            1632096 |
|            7522478 |
|   shs_perf |
|------------|
|         -1 |
|         -1 |
|         -1 |
|         -1 |
|         -1 |
|   shs_work |
|------------|
|          0 |
|          0 |
|          0 |
|          0 |
|          0 |
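
For illustration, a rough Python sketch of a psql-style extended view, using only the standard `sqlite3` module. `format_extended` is a hypothetical helper, not an existing sqlite-utils function, and the three-column `songs` table is a cut-down stand-in for the real one.

```python
import sqlite3


def format_extended(cursor):
    # Column names come from the cursor description
    names = [d[0] for d in cursor.description]
    width = max(len(n) for n in names)
    lines = []
    for i, row in enumerate(cursor, start=1):
        # One header per record, then one line per column
        lines.append(f'-[ RECORD {i} ]')
        for name, value in zip(names, row):
            lines.append(f'{name.ljust(width)} | {value}')
    return '\n'.join(lines)


conn = sqlite3.connect(':memory:')
conn.execute('create table songs (track_id text, title text, year integer)')
conn.execute(
    'insert into songs values (?, ?, ?)',
    ('TRMMMYQ128F932D901', 'Silent Night', 2003),
)
print(format_extended(conn.execute('select * from songs limit 5')))
```

Each record becomes a block of `column | value` lines, which stays readable however many columns the table has.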
",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/535/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1740150327,I_kwDOCGYnMM5nuJY3,557,Aliased ROWID option for tables created from alter=True commands,7908073,closed,0,,,2,2023-06-04T05:29:28Z,2023-06-14T06:09:21Z,2023-06-05T19:26:26Z,CONTRIBUTOR,,"> If you use INTEGER PRIMARY KEY column, the VACUUM does not change the values of that column. However, if you use unaliased rowid, the VACUUM command will reset the rowid values.

ROWID should never be used with foreign keys but the simple act of aliasing rowid to id (which is what happens when one does `id integer primary key` DDL) makes it OK.
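
To illustrate the aliasing, a small sketch using Python's standard `sqlite3` module. The `media` table and its columns are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
# Declaring id as INTEGER PRIMARY KEY makes it an alias for rowid
conn.execute('create table media (id integer primary key, path text unique)')
conn.execute('insert into media (path) values (?)', ('/tmp/a.mp4',))
conn.execute('insert into media (path) values (?)', ('/tmp/b.mp4',))

# rowid and id refer to the same stored value for every row
rows = conn.execute('select rowid, id, path from media order by id').fetchall()
print(rows)
```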

It would be convenient if there were more options to use a string column (eg. filepath) as the PK, and be able to use it during upserts, but when creating a foreign key, to create an integer column which aliases rowid

I made an attempt to switch to integer primary keys here but it is not going well... In my use case the path column is a business key. Yes, it should be as simple as including the `id` column in any select statement where I plan on using `upsert` but it would be nice if this could be abstracted away somehow: https://github.com/chapmanjacobd/library/commit/788cd125be01d76f0fe2153335d9f6b21db1343c

https://github.com/chapmanjacobd/library/actions/runs/5173602136/jobs/9319024777",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/557/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1522778923,I_kwDOBm6k_c5aw8Mr,1978,Document datasette.urls.row and row_blob,25778,closed,0,,,2,2023-01-06T15:45:51Z,2023-01-09T14:30:00Z,2023-01-09T14:30:00Z,CONTRIBUTOR,,"These are in the codebase but not in documentation. I think everything else in this class is documented.

```python
class Urls:
    ...
    def row(self, database, table, row_path, format=None):
        path = f""{self.table(database, table)}/{row_path}""
        if format is not None:
            path = path_with_format(path=path, format=format)
        return PrefixedUrlString(path)

    def row_blob(self, database, table, row_path, column):
        return self.table(database, table) + ""/{}.blob?_blob_column={}"".format(
            row_path, urllib.parse.quote_plus(column)
        )
```
",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/1978/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,not_planned
1436539554,I_kwDOCGYnMM5Vn9qi,511,"[insert_all, upsert_all] IntegrityError: constraint failed",7908073,closed,0,,,2,2022-11-04T19:21:48Z,2022-11-04T22:59:54Z,2022-11-04T22:54:09Z,CONTRIBUTOR,,"My understanding is that `INSERT OR IGNORE` skips rows whose insert would cause a duplicate key, so I'm not sure exactly why the error is raised from `sqlite3`.
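
A minimal sketch of the expected behaviour, using only the standard `sqlite3` module. The one-column `media` table is a stand-in for illustration, not the real schema.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('create table media (path text primary key)')
# The second insert collides on the primary key and is skipped silently
conn.execute('insert or ignore into media (path) values (?)', ('a',))
conn.execute('insert or ignore into media (path) values (?)', ('a',))
count = conn.execute('select count(*) from media').fetchone()[0]
print(count)
```

The SQLite documentation says `OR IGNORE` applies to UNIQUE, NOT NULL, CHECK and PRIMARY KEY conflicts, and that foreign key violations are not ignored, so that may be one place to look.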

```
import argparse
from pathlib import Path

from xklb import db, utils
from xklb.utils import log


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    parser.add_argument(""database"")
    parser.add_argument(""dbs"", nargs=""*"")
    parser.add_argument(""--upsert"")
    parser.add_argument(""--db"", ""-db"", help=argparse.SUPPRESS)
    parser.add_argument(""--verbose"", ""-v"", action=""count"", default=0)
    args = parser.parse_args()

    if args.db:
        args.database = args.db
    Path(args.database).touch()
    args.db = db.connect(args)
    log.info(utils.dict_filter_bool(args.__dict__))

    return args


def merge_db(args, source_db):
    source_db = str(Path(source_db).resolve())

    s_db = db.connect(argparse.Namespace(database=source_db, verbose=args.verbose))
    for table in [s for s in s_db.table_names() if ""_fts"" not in s and not s.startswith(""sqlite_"")]:
        log.info(""[%s]: %s"", source_db, table)
        with s_db.conn:
            data = s_db[table].rows

        with args.db.conn:
            if args.upsert:
                args.db[table].upsert_all(data, pk=args.upsert.split("",""), alter=True)
            else:
                args.db[table].insert_all(data, alter=True, replace=True)


def merge_dbs():
    args = parse_args()
    for s_db in args.dbs:
        merge_db(args, s_db)


if __name__ == ""__main__"":
    merge_dbs()

```

```
$ lb-dev merge video.db tube_71.db --upsert path -vv
SQL: INSERT OR IGNORE INTO [media]([path]) VALUES(?); - params: ['https://archive.org/details/088ghostofachanceroygetssackedrevengeofthelivinglunchdvdripxvidphz']
...
File ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:3122, in Table.insert_all(self, records, pk, foreign_keys, column_order, not_null, defaults, batch_size, hash_id, hash_id_columns, alter, ignore, replace, truncate, extracts, conversions, columns, upsert, analyze)
   3116             all_columns += [
   3117                 column for column in record if column not in all_columns
   3118             ]
   3120     first = False
-> 3122     self.insert_chunk(
   3123         alter,
   3124         extracts,
   3125         chunk,
   3126         all_columns,
   3127         hash_id,
   3128         hash_id_columns,
   3129         upsert,
   3130         pk,
   3131         conversions,
   3132         num_records_processed,
   3133         replace,
   3134         ignore,
   3135     )
   3137 if analyze:
   3138     self.analyze()

File ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:2887, in Table.insert_chunk(self, alter, extracts, chunk, all_columns, hash_id, hash_id_columns, upsert, pk, conversions, num_records_processed, replace, ignore)
   2885 for query, params in queries_and_params:
   2886     try:
-> 2887         result = self.db.execute(query, params)
   2888     except OperationalError as e:
   2889         if alter and ("" column"" in e.args[0]):
   2890             # Attempt to add any missing columns, then try again

File ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:484, in Database.execute(self, sql, parameters)
    482     self._tracer(sql, parameters)
    483 if parameters is not None:
--> 484     return self.conn.execute(sql, parameters)
    485 else:
    486     return self.conn.execute(sql)

IntegrityError: constraint failed
> /home/xk/.local/lib/python3.10/site-packages/sqlite_utils/db.py(484)execute()
    482                 self._tracer(sql, parameters)
    483             if parameters is not None:
--> 484                 return self.conn.execute(sql, parameters)
    485             else:
    486                 return self.conn.execute(sql)
```

```
sqlite3 --version
3.36.0 2021-06-18 18:36:39
```",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/511/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1339663518,I_kwDOBm6k_c5P2aSe,1784,"Include ""entrypoint"" option on `--load-extension`?",15178711,closed,0,,,2,2022-08-16T00:22:57Z,2022-08-23T18:34:31Z,2022-08-23T18:34:31Z,CONTRIBUTOR,,"## Problem 

SQLite extensions have the option to define multiple ""entrypoints"" in each loadable extension. For example, the upcoming version of `sqlite-lines` will have 2 entrypoints: the default `sqlite3_lines_init` (which SQLite will automatically guess for) and `sqlite3_lines_noread_init`. The `sqlite3_lines_noread_init` version omits functions that read from the filesystem, which is necessary for security purposes when running untrusted SQL (which Datasette does).

(Similar multiple entrypoints will also be added for sqlite-http).



The `--load-extension` flag, however, doesn't give the option to specify a different entrypoint, so the default one is always used. 

## Proposal

I'd like the `--load-extension` flag to accept an optional custom entrypoint, like so:
```
datasette my.db \
  --load-extension ./lines0 sqlite3_lines0_noread_init
```

Then, under the hood, this line of code:

https://github.com/simonw/datasette/blob/7af67b54b7d9bca43e948510fc62f6db2b748fa8/datasette/app.py#L562

Would look something like this:

```python
 conn.execute(""SELECT load_extension(?, ?)"", [extension, entrypoint]) 
```

One potential problem: for backward compatibility, I'm not sure if Click allows CLI flags to take a variable number of arguments (""arity""). So I guess it could also use a `:` delimiter like `--static`:

```
datasette my.db \
  --load-extension ./lines0:sqlite3_lines0_noread_init
```
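
A rough sketch of how the `:` form could be parsed and turned into the `load_extension` call. `parse_extension_arg` and `load_extension_sql` are hypothetical names, not existing Datasette functions, and splitting on the last `:` is an assumption that would need more care for Windows paths with drive letters.

```python
def parse_extension_arg(value):
    # Split on the last ':' so './lines0:entry' gives ('./lines0', 'entry')
    path, sep, entrypoint = value.rpartition(':')
    if not sep:
        # No delimiter: keep the default entrypoint behaviour
        return value, None
    return path, entrypoint


def load_extension_sql(value):
    # Build the SQL and parameters to pass to conn.execute()
    path, entrypoint = parse_extension_arg(value)
    if entrypoint is None:
        return 'SELECT load_extension(?)', [path]
    return 'SELECT load_extension(?, ?)', [path, entrypoint]


print(load_extension_sql('./lines0:sqlite3_lines0_noread_init'))
```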

Or maybe even a new flag name?

```
datasette my.db \
  --load-extension-entrypoint ./lines0 sqlite3_lines0_noread_init
```


Personally I prefer the `:` option... and maybe even `--load-extension` -> `--load`? Definitely out of scope for this issue tho

```
datasette my.db \
  --load ./lines0:sqlite3_lines0_noread_init
```",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/1784/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
336936010,MDU6SXNzdWUzMzY5MzYwMTA=,331,Datasette throws error when loading spatialite db without extension loaded,82988,closed,0,,,2,2018-06-29T09:51:14Z,2022-01-20T21:29:40Z,2018-07-10T15:13:36Z,CONTRIBUTOR,,"When starting datasette on a SpatiaLite database *without* loading the SpatiaLite extension (using e.g. `--load-extension=/usr/local/lib/mod_spatialite.dylib`) an error is thrown and the server fails to start:

```
datasette -p 8003 adminboundaries.db 
Serve! files=('adminboundaries.db',) on port 8003
Traceback (most recent call last):
  File ""/Users/ajh59/anaconda3/bin/datasette"", line 11, in <module>
    sys.exit(cli())
  File ""/Users/ajh59/anaconda3/lib/python3.6/site-packages/click/core.py"", line 722, in __call__
    return self.main(*args, **kwargs)
  File ""/Users/ajh59/anaconda3/lib/python3.6/site-packages/click/core.py"", line 697, in main
    rv = self.invoke(ctx)
  File ""/Users/ajh59/anaconda3/lib/python3.6/site-packages/click/core.py"", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File ""/Users/ajh59/anaconda3/lib/python3.6/site-packages/click/core.py"", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File ""/Users/ajh59/anaconda3/lib/python3.6/site-packages/click/core.py"", line 535, in invoke
    return callback(*args, **kwargs)
  File ""/Users/ajh59/anaconda3/lib/python3.6/site-packages/datasette/cli.py"", line 552, in serve
    ds.inspect()
  File ""/Users/ajh59/anaconda3/lib/python3.6/site-packages/datasette/app.py"", line 273, in inspect
    ""tables"": inspect_tables(conn, self.metadata.get(""databases"", {}).get(name, {}))
  File ""/Users/ajh59/anaconda3/lib/python3.6/site-packages/datasette/inspect.py"", line 79, in inspect_tables
    ""PRAGMA table_info({});"".format(escape_sqlite(table))
sqlite3.OperationalError: no such module: VirtualSpatialIndex
``` 

It would be nice to trap this and return a message saying something like:

```
It looks like you're trying to load a SpatiaLite database? Make sure you load in the SpatiaLite extension when starting datasette.

Read more: https://datasette.readthedocs.io/en/latest/spatialite.html
```
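
A minimal sketch of how the error could be trapped, assuming the check is done by matching the message text from the traceback above. `friendly_error` and `SPATIALITE_HINT` are hypothetical names, not part of Datasette.

```python
import sqlite3

SPATIALITE_HINT = (
    'It looks like you are trying to load a SpatiaLite database? '
    'Make sure you load in the SpatiaLite extension when starting datasette.'
)


def friendly_error(exc):
    # Only rewrite the specific missing-module error seen in the traceback
    if isinstance(exc, sqlite3.OperationalError) and (
        'no such module: VirtualSpatialIndex' in str(exc)
    ):
        return SPATIALITE_HINT
    # Anything else should be reported unchanged by the caller
    return None


print(friendly_error(sqlite3.OperationalError('no such module: VirtualSpatialIndex')))
```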

",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/331/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1059549523,I_kwDOBm6k_c4_J3FT,1526,"Add to vercel.json, rather than overwriting it.",192568,closed,0,,,2,2021-11-22T00:47:12Z,2021-11-22T04:49:45Z,2021-11-22T04:13:47Z,CONTRIBUTOR,,"I'd like to be able to add to vercel.json. But Datasette overwrites whatever I put in that file. I originally reported this here:
https://github.com/simonw/datasette-publish-vercel/issues/51

In that case, I wanted to do a rewrite... and now I need to do 301 redirects (because we had to rename our site).

Can this be addressed?
",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/1526/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
994390593,MDU6SXNzdWU5OTQzOTA1OTM=,1468,Faceting for custom SQL queries,72577720,closed,0,,,2,2021-09-13T02:52:16Z,2021-09-13T04:54:22Z,2021-09-13T04:54:17Z,CONTRIBUTOR,,"Facets are awesome.  But not when I need to join two tidy tables together.  Or even when I just explicitly run the default SQL query that simply lists all the rows and columns of a table (up to SIZE).  That is to say, when I browse a table, I see facets:

https://latest.datasette.io/fixtures/compound_three_primary_keys

But when I run a custom query, I don't:

https://latest.datasette.io/fixtures?sql=select+pk1%2C+pk2%2C+pk3%2C+content+from+compound_three_primary_keys+order+by+pk1%2C+pk2%2C+pk3+limit+101

Is there an idiom to cause custom SQL to come back with facet suggestions?",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/1468/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
839008371,MDU6SXNzdWU4MzkwMDgzNzE=,1274,Might there be some way to comment metadata.json?,192568,closed,0,,,2,2021-03-23T18:33:00Z,2021-03-23T20:14:54Z,2021-03-23T20:14:54Z,CONTRIBUTOR,,"I don't know what license to use... Would be nice to be able to add a comment regarding that uncertainty in my metadata.json file

I like laktak's little video comment in favor of Human json (Hjson)
https://stackoverflow.com/questions/244777/can-comments-be-used-in-json

Hmmm... one of the commenters there said comments are allowed in yaml... so that's a good argument for yaml.

Anyhow, just came to mind, and thought I'd mention it here. Looks like https://hjson.github.io/ has the details.",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/1274/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
794554881,MDU6SXNzdWU3OTQ1NTQ4ODE=,1208,A lot of open(file) functions are used without a context manager thus producing ResourceWarning: unclosed file <_io.TextIOWrapper,4488943,closed,0,,,2,2021-01-26T20:56:28Z,2021-03-11T16:15:49Z,2021-03-11T16:15:49Z,CONTRIBUTOR,,"Your code is full of open files that are never closed, especially when you deal with reading/writing json/yaml files.

If you run python with warnings enabled this problem becomes evident.
This probably contributes to some memory leaks in long running datasettes if the GC will not 'collect' those resources properly.

This is easily fixed by using a context manager instead of just using open:
```python
with open('some_file', 'w') as opened_file:
    opened_file.write('string')
```

In some newer parts of the code you use the Path object's `read_text` and `write_text` methods, which close the file properly and are preferred in some cases.
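
As a sketch of both patterns applied to a JSON read like the ones listed below: `load_json` and `load_json_via_path` are hypothetical helper names, and the temporary `settings.json` file exists only to make the example runnable.

```python
import json
import tempfile
from pathlib import Path


def load_json(path):
    # The with block closes the file even if json.load raises
    with open(path) as fp:
        return json.load(fp)


def load_json_via_path(path):
    # read_text opens, reads and closes the file in one call
    return json.loads(Path(path).read_text())


with tempfile.TemporaryDirectory() as tmp:
    settings_path = Path(tmp) / 'settings.json'
    settings_path.write_text(json.dumps({'sql_time_limit_ms': 1000}))
    first = load_json(settings_path)
    second = load_json_via_path(settings_path)

print(first == second)
```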


If you want, I can create a PR for all the places I found this pattern in.


Below is a fraction of the places where I found a ResourceWarning:
```python

update-docs-help.py:
  20          actual = actual.replace(""Usage: cli "", ""Usage: datasette "")
  21:         open(docs_path / filename, ""w"").write(actual)
  22  

datasette\app.py:
  210          ):
  211:             inspect_data = json.load((config_dir / ""inspect-data.json"").open())
  212              if immutables is None:

  266          if config_dir and (config_dir / ""settings.json"").exists() and not config:
  267:             config = json.load((config_dir / ""settings.json"").open())
  268          self._settings = dict(DEFAULT_SETTINGS, **(config or {}))

  445              self._app_css_hash = hashlib.sha1(
  446:                 open(os.path.join(str(app_root), ""datasette/static/app.css""))
  447                  .read()

datasette\cli.py:
  130      else:
  131:         out = open(inspect_file, ""w"")
  132      loop = asyncio.get_event_loop()

  459      if inspect_file:
  460:         inspect_data = json.load(open(inspect_file))
  461  

```

",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/1208/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
688659182,MDU6SXNzdWU2ODg2NTkxODI=,145,Bug when first record contains fewer columns than subsequent records,96218,closed,0,,,2,2020-08-30T05:44:44Z,2020-09-08T23:21:23Z,2020-09-08T23:21:23Z,CONTRIBUTOR,,"`insert_all()` selects the maximum batch size based on the number of fields in the first record.  If the first record has fewer fields than subsequent records (and `alter=True` is passed), this can result in SQL statements with more than the maximum permitted number of host parameters.  This situation is perhaps unlikely to occur, but could happen if the first record had, say, 10 columns, such that `batch_size` (based on  `SQLITE_MAX_VARIABLE_NUMBER = 999`) would be 99.  If the next 98 rows had 11 columns, the resulting SQL statement for the first batch would have `10 * 1 + 11 * 98 = 1088` host parameters (and subsequent batches, if the data were consistent from thereon out, would have `99 * 11 = 1089`).
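
The arithmetic above can be checked with a short sketch. `batch_size_for` is a hypothetical stand-in that assumes the batch size is capped at `SQLITE_MAX_VARIABLE_NUMBER` divided by the column count of the first record, as described; it is not the actual sqlite-utils code.

```python
SQLITE_MAX_VARIABLE_NUMBER = 999


def batch_size_for(first_record, requested=100):
    # Assumed rule: size the batch from the first record's column count only
    return max(1, min(requested, SQLITE_MAX_VARIABLE_NUMBER // len(first_record)))


def host_parameters(batch):
    # One host parameter per column value in the batch
    return sum(len(record) for record in batch)


first = {f'c{i}': 0 for i in range(10)}
wider = {f'c{i}': 0 for i in range(11)}
records = [first] + [wider] * 200

size = batch_size_for(records[0])
first_batch = host_parameters(records[:size])
second_batch = host_parameters(records[size:2 * size])
print(size, first_batch, second_batch)
```

Both batches come out above the 999 limit even though the batch size was chosen to respect it.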

I suspect that this bug is masked somewhat by the fact that while:
> [`SQLITE_MAX_VARIABLE_NUMBER`](https://www.sqlite.org/limits.html#max_variable_number) ... defaults to 999 for SQLite versions prior to 3.32.0 (2020-05-22) or 32766 for SQLite versions after 3.32.0.

it is common that it is increased at compile time.  Debian-based systems, for example, seem to ship with a version of sqlite compiled with `SQLITE_MAX_VARIABLE_NUMBER` set to 250,000, and I believe this is the case for homebrew installations too.

A test for this issue might look like this:
```python
def test_columns_not_in_first_record_should_not_cause_batch_to_be_too_large(fresh_db):
    # sqlite on homebrew and Debian/Ubuntu etc. is typically compiled with
    #  SQLITE_MAX_VARIABLE_NUMBER set to 250,000, so we need to exceed this value to
    #  trigger the error on these systems.
    THRESHOLD = 250000
    extra_columns = 1 + (THRESHOLD - 1) // 99
    records = [
        {""c0"": ""first record""},  # one column in first record -> batch_size = 100
        # fill out the batch with 99 records with enough columns to exceed THRESHOLD
        *[
            dict([(""c{}"".format(i), j) for i in range(extra_columns)])
            for j in range(99)
        ]
    ]
    try:
        fresh_db[""too_many_columns""].insert_all(records, alter=True)
    except sqlite3.OperationalError:
        raise
```

The best solution, I think, is simply to process all the records when determining columns, column types, and the batch size.  In my tests this doesn't seem to be particularly costly at all, and cuts out a lot of complications (including obviating my implementation of #139 at #142).  I'll raise a PR for your consideration.

",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/145/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
620969465,MDU6SXNzdWU2MjA5Njk0NjU=,767,Allow to specify a URL fragment for canned queries,2657547,closed,0,,5471110,2,2020-05-19T13:17:42Z,2020-05-27T21:52:25Z,2020-05-27T21:52:25Z,CONTRIBUTOR,,"Canned queries are very useful to direct users to prepared data and views. I like to use them with charts using datasette-vega a lot, because people get a direct impression at first glance.

datasette-vega doesn't show up by default though, and users have to click through to it. Also, datasette-vega does not always guess the best way to render columns, so it would be nice if I could specify a URL fragment in my canned queries to make sure people see what I want them to see.

My current workaround is to include a fragment link in ``description_html`` and ask people to reload the page, like [here](https://data.rixx.de/songs/show_by_bpm#g.mark=bar&g.x_column=bpm_floor&g.x_type=ordinal&g.y_column=bpm_count&g.y_type=quantitative), which is a bit hacky.",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/767/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
518739697,MDU6SXNzdWU1MTg3Mzk2OTc=,30,`followers` fails because `transform_user` is called twice,21148,closed,0,,,2,2019-11-06T20:44:52Z,2019-11-09T20:15:28Z,2019-11-09T19:55:52Z,CONTRIBUTOR,,"Trying to run `twitter-to-sqlite followers` errors out:

```
Traceback (most recent call last):
  File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jkm-dogsheep-ezLnyXZS-py3.7/bin/twitter-to-sqlite"", line 10, in <module>
    sys.exit(cli())
  File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jkm-dogsheep-ezLnyXZS-py3.7/lib/python3.7/site-packages/click/core.py"", line 764, in __call__
    return self.main(*args, **kwargs)
  File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jkm-dogsheep-ezLnyXZS-py3.7/lib/python3.7/site-packages/click/core.py"", line 717, in main
    rv = self.invoke(ctx)
  File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jkm-dogsheep-ezLnyXZS-py3.7/lib/python3.7/site-packages/click/core.py"", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jkm-dogsheep-ezLnyXZS-py3.7/lib/python3.7/site-packages/click/core.py"", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jkm-dogsheep-ezLnyXZS-py3.7/lib/python3.7/site-packages/click/core.py"", line 555, in invoke
    return callback(*args, **kwargs)
  File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jkm-dogsheep-ezLnyXZS-py3.7/lib/python3.7/site-packages/twitter_to_sqlite/cli.py"", line 130, in followers
    go(bar.update)
  File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jkm-dogsheep-ezLnyXZS-py3.7/lib/python3.7/site-packages/twitter_to_sqlite/cli.py"", line 116, in go
    utils.save_users(db, [profile])
  File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jkm-dogsheep-ezLnyXZS-py3.7/lib/python3.7/site-packages/twitter_to_sqlite/utils.py"", line 302, in save_users
    transform_user(user)
  File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jkm-dogsheep-ezLnyXZS-py3.7/lib/python3.7/site-packages/twitter_to_sqlite/utils.py"", line 181, in transform_user
    user[""created_at""] = parser.parse(user[""created_at""])
  File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jkm-dogsheep-ezLnyXZS-py3.7/lib/python3.7/site-packages/dateutil/parser/_parser.py"", line 1374, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jkm-dogsheep-ezLnyXZS-py3.7/lib/python3.7/site-packages/dateutil/parser/_parser.py"", line 646, in parse
    res, skipped_tokens = self._parse(timestr, **kwargs)
  File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jkm-dogsheep-ezLnyXZS-py3.7/lib/python3.7/site-packages/dateutil/parser/_parser.py"", line 725, in _parse
    l = _timelex.split(timestr)         # Splits the timestr into tokens
  File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jkm-dogsheep-ezLnyXZS-py3.7/lib/python3.7/site-packages/dateutil/parser/_parser.py"", line 207, in split
    return list(cls(s))
  File ""/Users/jacob/Library/Caches/pypoetry/virtualenvs/jkm-dogsheep-ezLnyXZS-py3.7/lib/python3.7/site-packages/dateutil/parser/_parser.py"", line 76, in __init__
    '{itype}'.format(itype=instream.__class__.__name__))
TypeError: Parser must be a string or character stream, not datetime
```

This appears to be because https://github.com/dogsheep/twitter-to-sqlite/blob/master/twitter_to_sqlite/cli.py#L111 calls `transform_user`, and then https://github.com/dogsheep/twitter-to-sqlite/blob/master/twitter_to_sqlite/cli.py#L116 calls `transform_user` again, which fails because the user is already transformed.

I was able to work around this by commenting out https://github.com/dogsheep/twitter-to-sqlite/blob/master/twitter_to_sqlite/cli.py#L116. 
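
Another option might be to make the transform safe to call twice. The sketch below is only an illustration: `transform_user_once` is a hypothetical name, and it swaps in `datetime.strptime` with an assumed Twitter timestamp format in place of the `dateutil` parser the real code uses.

```python
from datetime import datetime

# Assumed format of the created_at strings returned by the Twitter API
TWITTER_FORMAT = '%a %b %d %H:%M:%S %z %Y'


def transform_user_once(user):
    # Hypothetical guard: only parse created_at while it is still a string
    created_at = user['created_at']
    if isinstance(created_at, str):
        user['created_at'] = datetime.strptime(created_at, TWITTER_FORMAT)
    return user


user = {'created_at': 'Wed Nov 06 20:44:52 +0000 2019'}
transform_user_once(user)
transform_user_once(user)  # second call is a no-op instead of a TypeError
print(user['created_at'].isoformat())
```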

Shall I work up a patch for that, or is there a better approach?",206156866,issue,,,"{""url"": ""https://api.github.com/repos/dogsheep/twitter-to-sqlite/issues/30/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
291639118,MDU6SXNzdWUyOTE2MzkxMTg=,183,Custom Queries - escaping strings,82988,closed,0,,,2,2018-01-25T16:49:13Z,2019-06-24T06:45:07Z,2019-06-24T06:45:07Z,CONTRIBUTOR,,"If a SQLite table column name contains spaces, they are usually referred to in double quotes:

`SELECT * FROM mytable WHERE ""gappy column name""=""my value"";`

In the JSON metadata file, this is passed by escaping the double quotes:

`""queries"": {""my query"": ""SELECT * FROM mytable WHERE \""gappy column name\""=\""my value\"";""}`

When specifying a custom query in `metadata.json` using double quotes, these are then rendered in the *datasette* query box using single quotes:

`SELECT * FROM mytable WHERE 'gappy column name'='my value';`

which does not work.

Alternatively, a valid custom query can be passed using backticks (\`) to quote the column name and single (unescaped) quotes for the matched value:

``""queries"": {""my query"": ""SELECT * FROM mytable WHERE `gappy column name`='my value';""}``
",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/183/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
459627549,MDU6SXNzdWU0NTk2Mjc1NDk=,523,Show total/unfiltered row count when filtering,2657547,closed,0,,,2,2019-06-23T22:56:48Z,2019-06-24T01:38:14Z,2019-06-24T01:38:14Z,CONTRIBUTOR,,"When I'm seeing a filtered view of a table, I'd like to be able to see something like '2 rows where status != ""closed"" (of 1000 total)' to have a context for the data I'm seeing – e.g. currently my database is being filled by an importer, so this information would be super helpful.

Since this information would be a performance hit, maybe something like '12 rows where status != ""closed"" (of ??? total)' with lazy-loading on-click(?) could be applied (Or via a ""How many total?"" tooltip, or …)",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/523/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
438200529,MDU6SXNzdWU0MzgyMDA1Mjk=,438,Plugins are loaded when running pytest,45057,closed,0,,,2,2019-04-29T08:25:58Z,2019-05-02T05:09:18Z,2019-05-02T05:09:11Z,CONTRIBUTOR,,"If I have a datasette plugin installed on my system, its hooks are called when running the main datasette tests. This is probably undesirable, especially with the inspect hook in #437, as the plugin may rely on inspected state that the tests don't know about.",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/438/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
369716228,MDU6SXNzdWUzNjk3MTYyMjg=,366,Default built image size over Zeit Now 100MiB limit,416374,closed,0,,,2,2018-10-12T21:27:17Z,2018-11-05T06:23:32Z,2018-11-05T06:23:32Z,CONTRIBUTOR,,"Using `datasette publish now` with no other custom options on a small (43KB) sqlite database leads to the error ""The built image size (373.5M) exceeds the 100MiB limit"". I think this is because of a recent Zeit change: https://github.com/zeit/now-cli/issues/1523",107914493,issue,,,"{""url"": ""https://api.github.com/repos/simonw/datasette/issues/366/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed