id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,pull_request,body,repo,type,active_lock_reason,performed_via_github_app,reactions,draft,state_reason
1976986318,I_kwDOCGYnMM511mrO,599,Cannot find spatialite on arm64 linux,37802088,closed,0,,,1,2023-11-03T22:05:51Z,2023-11-04T01:06:31Z,2023-11-04T00:33:28Z,CONTRIBUTOR,,"Initially, I found an issue in `datasette` where it wouldn’t find `spatialite` when running on my Radxa Rock 5B - an RK3588 powered SBC, running the arm64 build of Debian Bullseye. I confirmed the same behaviour on my Raspberry Pi 4 - a BCM2711 powered SBC, running the arm64 build of Debian Bookworm.

```
$ datasette --load-extension=spatialite example.db
Error: Could not find SpatiaLite extension
```

I did some digging and realised the issue originates in this project. Even with the `libsqlite3-mod-spatialite` package installed, `pytest` skips all of the GIS tests in the project.

```
$ apt list --installed | grep spatial
[…]
libsqlite3-mod-spatialite/stable,now 5.0.1-3 arm64 [installed]

$ ls -l /usr/lib/*/*spatial*
lrwxrwxrwx 1 root root      23 Dec  1  2022 /usr/lib/aarch64-linux-gnu/mod_spatialite.so -> mod_spatialite.so.7.1.0
lrwxrwxrwx 1 root root      23 Dec  1  2022 /usr/lib/aarch64-linux-gnu/mod_spatialite.so.7 -> mod_spatialite.so.7.1.0
-rw-r--r-- 1 root root 7348584 Dec  1  2022 /usr/lib/aarch64-linux-gnu/mod_spatialite.so.7.1.0
```

```
$ pytest
tests/test_get.py ......                                                 [ 73%]
tests/test_gis.py ssssssssssss                                           [ 75%]
tests/test_hypothesis.py ....                                            [ 75%]
```

I tracked the issue down to the [`find_sqlite()` function in the `utils.py`](https://github.com/simonw/sqlite-utils/blob/622c3a5a7dd53a09c029e2af40c2643fe7579340/sqlite_utils/utils.py#L60) file. The [`SPATIALITE_PATHS`](https://github.com/simonw/sqlite-utils/blob/main/sqlite_utils/utils.py#L34-L39) array doesn’t have an entry for the location of this module on arm64 linux.
",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/599/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1239034903,I_kwDOCGYnMM5J2iwX,433,CLI eats my cursor,7908073,closed,0,,,10,2022-05-17T18:52:52Z,2023-11-04T00:46:30Z,2023-11-04T00:46:30Z,CONTRIBUTOR,,"I'm not sure why this happens but `sqlite-utils` makes my terminal cursor disappear after running commands like `sqlite-utils insert`. I've only noticed this behavior in `sqlite-utils`, not in any other CLI tools

I can still type commands after it runs but the text cursor is invisible",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/433/reactions"", ""total_count"": 5, ""+1"": 5, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1821108702,I_kwDOCGYnMM5si-ne,579,Special handling for SQLite column of type `JSON`,15178711,open,0,,,0,2023-07-25T20:37:23Z,2023-07-25T20:37:23Z,,CONTRIBUTOR,,"`sqlite-utils` should detect and have specially handling for column with a `JSON` column. For example:

```sql
CREATE TABLE ""dogs"" (
  id INTEGER PRIMARY KEY,
  name TEXT,
  friends JSON 
);
```

## Automatic Nesting

According to [""Nested JSON Values""](https://sqlite-utils.datasette.io/en/stable/cli.html#nested-json-values), sqlite-utils will only expand JSON if the `--json-cols` flag is passed. It looks like it'll try to `json.load` all text column to test if its JSON, which can get expensive on non-json columns. 

Instead, `sqlite-utils` should be default (ie without the `--json-cols` flags) do the `maybe_json()` operation on columns with a declared `JSON` type. So the above table would expand the `""friends""` column as expected, withoutthe `--json-cols` flag:

```bash
sqlite-utils dogs.db ""select * from dogs"" | python -mjson.tool
```

```
[
    {
        ""id"": 1,
        ""name"": ""Cleo"",
        ""friends"": [
            {
                ""name"": ""Pancakes""
            },
            {
                ""name"": ""Bailey""
            }
        ]
    }
]
```

---

I'm sure there's other ways `sqlite-utils` can specially handle JSON columns, so keeping this open while I think of more",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/579/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,
1801394744,I_kwDOCGYnMM5rXxo4,567,Plugin system,15178711,closed,0,,,9,2023-07-12T17:02:14Z,2023-07-22T22:59:37Z,2023-07-22T22:59:36Z,CONTRIBUTOR,,"I'd like there to be a plugin system for sqlite-utils, similar to the datasette/llm plugins. I'd like to make plugins that would do things like:

- Register SQLite extensions for more SQL functions + virtual tables
- Register new subcommands
- Different input file formats for `sqlite-utils memory`
- Different output file formats (in addition to `--csv` `--tsv` `--nl` etc.

A few real-world use-cases of plugins I'd like to see in sqlite-utils:

- Register many of my sqlite extensions in sqlite-utils (`sqlite-http`, `sqlite-lines`, `sqlite-regex`, etc.)
- New subcommands to work with `sqlite-vss` vector tables
- Input/ouput Parquet/Avro/Arrow IPC files with `sqlite-arrow`",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/567/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1655860104,I_kwDOCGYnMM5ismuI,535,rows: --transpose or psql extended view-like functionality,7908073,closed,0,,,2,2023-04-05T15:37:33Z,2023-06-15T08:39:49Z,2023-06-14T22:05:28Z,CONTRIBUTOR,,"It would be nice if the rows subcommand had a flag, perhaps called `--transpose` which would print in long form instead of wide. Similar to extended display mode in psql (`\x`)

In other words instead of this:

```
sqlite-utils rows  --limit 5 --fmt github track_metadata.db songs
```

| track_id           | title             | song_id            | release                              | artist_id          | artist_mbid                          | artist_name      |   duration |   artist_familiarity |   artist_hotttnesss |   year |   track_7digitalid |   shs_perf |   shs_work |
|--------------------|-------------------|--------------------|--------------------------------------|--------------------|--------------------------------------|------------------|------------|----------------------|---------------------|--------|--------------------|------------|------------|
| TRMMMYQ128F932D901 | Silent Night      | SOQMMHC12AB0180CB8 | Monster Ballads X-Mas                | ARYZTJS1187B98C555 | 357ff05d-848a-44cf-b608-cb34b5701ae5 | Faster Pussy cat |    252.055 |             0.649822 |            0.394032 |   2003 |            7032331 |         -1 |          0 |
| TRMMMKD128F425225D | Tanssi vaan       | SOVFVAK12A8C1350D9 | Karkuteillä                          | ARMVN3U1187FB3A1EB | 8d7ef530-a6fd-4f8f-b2e2-74aec765e0f9 | Karkkiautomaatti |    156.551 |             0.439604 |            0.356992 |   1995 |            1514808 |         -1 |          0 |
| TRMMMRX128F93187D9 | No One Could Ever | SOGTUKN12AB017F4F1 | Butter                               | ARGEKB01187FB50750 | 3d403d44-36ce-465c-ad43-ae877e65adc4 | Hudson Mohawke   |    138.971 |             0.643681 |            0.437504 |   2006 |            6945353 |         -1 |          0 |
| TRMMMCH128F425532C | Si Vos Querés     | SOBNYVR12A8C13558C | De Culo                              | ARNWYLR1187B9B2F9C | 12be7648-7094-495f-90e6-df4189d68615 | Yerba Brava      |    145.058 |             0.448501 |            0.372349 |   2003 |            2168257 |         -1 |          0 |
| TRMMMWA128F426B589 | Tangle Of Aspens  | SOHSBXH12A8C13B0DF | Rene Ablaze Presents Winter Sessions | AREQDTE1269FB37231 |                                      | Der Mystic       |    514.298 |             0        |            0        |      0 |            2264873 |         -1 |          0 |


The output would look something like this:

```
$ for col in (sqlite-columns track_metadata.db songs)
    sqlite-utils --fmt github track_metadata.db ""select $col from songs order by rowid desc limit 5""
end
```

| track_id           |
|--------------------|
| TRYYYVU12903CD01E3 |
| TRYYYDJ128F9310A21 |
| TRYYYMG128F4260ECA |
| TRYYYJO128F426DA37 |
| TRYYYUS12903CD2DF0 |
| title                               |
|-------------------------------------|
| Fernweh feat. Sektion Kuchikäschtli |
| Faraday                             |
| Novemba                             |
| Jago Chhadeo                        |
| O Samba Da Vida                     |
| song_id            |
|--------------------|
| SOWXJXQ12AB0189F43 |
| SOLXGOR12A81C21EB7 |
| SOHODZI12A8C137BB3 |
| SOXQYIQ12A8C137FBB |
| SOTXAME12AB018F136 |
| release                         |
|---------------------------------|
| So Oder So                      |
| The Trance Collection Vol. 2    |
| Dub_Connected: electronic music |
| Naale Baba Lassi Pee Gya        |
| Pacha V.I.P.                    |
| artist_id          |
|--------------------|
| AR7PLM21187B990D08 |
| ARCMCOK1187B9B1073 |
| ARZ3R6M1187B9AF750 |
| ART5FZD1187B9A7FCF |
| AR7Z4J81187FB3FC59 |
| artist_mbid                          |
|--------------------------------------|
| 3af2b07e-c91c-4160-9bda-f0b9e3144ed3 |
| 4ac5f3de-c5ad-475e-ad50-41f1ef9dba20 |
| 8b97e9c8-61f5-4615-9a96-276f24204e34 |
| 2357c400-9109-42b6-b3fe-9e2d9f8e3872 |
| 9d50cb20-7e42-45cc-b0dd-154c3e92a577 |
| artist_name    |
|----------------|
| Texta          |
| Elude          |
| Gabriel Le Mar |
| Kuldeep Manak  |
| Kiko Navarro   |
|   duration |
|------------|
|    295.079 |
|    484.519 |
|    553.038 |
|    244.166 |
|    217.443 |
|   artist_familiarity |
|----------------------|
|             0.552977 |
|             0.403668 |
|             0.556918 |
|             0.4015   |
|             0.528617 |
|   artist_hotttnesss |
|---------------------|
|            0.454869 |
|            0.256935 |
|            0.336914 |
|            0.374866 |
|            0.411595 |
|   year |
|--------|
|   2004 |
|      0 |
|      0 |
|      0 |
|      0 |
|   track_7digitalid |
|--------------------|
|            8486723 |
|            5472456 |
|            2219291 |
|            1632096 |
|            7522478 |
|   shs_perf |
|------------|
|         -1 |
|         -1 |
|         -1 |
|         -1 |
|         -1 |
|   shs_work |
|------------|
|          0 |
|          0 |
|          0 |
|          0 |
|          0 |
",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/535/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1581090327,I_kwDOCGYnMM5ePYYX,529,Microsoft line endings,7908073,closed,0,,,1,2023-02-12T02:20:48Z,2023-06-14T23:12:12Z,2023-06-14T23:11:47Z,CONTRIBUTOR,,"sqlite-utils prints `\r\n` but [it should probably](https://devblogs.microsoft.com/commandline/extended-eol-in-notepad/) print `\n` (unless the platform is detected as Windows?)

It has tripped me up a few times when piping the output of sqlite-utils to other programs:

```
$ sqlite-utils --no-headers --csv  ~/lb/fs/d.db 'select path from media limit 1' | cat -A
/mnt/d7/file^M$
$ sqlite-utils --no-headers --csv  ~/lb/fs/d.db 'select path from media limit 1' | tr -d '\r' | cat -A
/mnt/d7/file$
```",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/529/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1740150327,I_kwDOCGYnMM5nuJY3,557,Aliased ROWID option for tables created from alter=True commands,7908073,closed,0,,,2,2023-06-04T05:29:28Z,2023-06-14T06:09:21Z,2023-06-05T19:26:26Z,CONTRIBUTOR,,"> If you use INTEGER PRIMARY KEY column, the VACUUM does not change the values of that column. However, if you use unaliased rowid, the VACUUM command will reset the rowid values.

ROWID should never be used with foreign keys but the simple act of aliasing rowid to id (which is what happens when one does `id integer primary key` DDL) makes it OK.

It would be convenient if there were more options to use a string column (eg. filepath) as the PK, and be able to use it during upserts, but when creating a foreign key, to create an integer column which aliases rowid

I made an attempt to switch to integer primary keys here but it is not going well... In my usecase the path column is a business key. Yes, it should be as simple as including the `id` column in any select statement where I plan on using `upsert` but it would be nice if this could be abstracted away somehow  https://github.com/chapmanjacobd/library/commit/788cd125be01d76f0fe2153335d9f6b21db1343c

https://github.com/chapmanjacobd/library/actions/runs/5173602136/jobs/9319024777",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/557/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1740026046,I_kwDOCGYnMM5ntrC-,556,Support storing incrementally piped values,601708,open,0,,,1,2023-06-04T00:45:23Z,2023-06-04T01:21:15Z,,CONTRIBUTOR,,"I'm trying to use sqlite-utils to data generated incrementally. There are a few
aspects of this that I don't currently know how to handle. I would like an option
to apply writes incrementally, line-by-line as they are received. I would like an
option to echo incremental progress. And, it would be nice to have 

In particular, I'm using CoreLocationCLI -w -j to generate, newline-delimited JSON.

One variant of the command 

`stdbuf -oL CoreLocationCLI -w -j | pee 'sqlite-utils insert loc.db loc -' nl`

`pee`, from `moreutils`, is like `tee` but spawns and pipes to the processes
created by invoking each of its arguments, so, for gratuitous demonstration,
`pee 'sponge out.log' cat` would behave like `tee`.

It looks like I can get what I want with:
`stdbuf -oL CoreLocationCLI -w -j | while read line; do <<<""$line"" sqlite-utils insert loc.db loc -; echo ""$line""; done | nl`


",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/556/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,
1578790070,I_kwDOCGYnMM5eGmy2,527,`Table.convert()` skips falsey values,167893,closed,0,,,5,2023-02-10T00:00:52Z,2023-05-09T21:15:05Z,2023-05-08T21:03:24Z,CONTRIBUTOR,,"# Summary

By design, `Table.convert()` does [not attempt](https://github.com/simonw/sqlite-utils/blob/fc221f9b62ed8624b1d2098e564f525c84497969/sqlite_utils/db.py#L2663) conversion of falsey values (`None`, `""""`, `0`, ...). This is surprising (directly contradicts the docstring) and `convert()` may quietly skip cells where the user assumed a conversion would take place. 

# Example
Increment a column of integers by one

``` python
from sqlite_utils import Database

db = Database(memory=True)
table = db['table']
col = 'x'
table.insert_all([{col: 0}, {col:1}])
print(table.get(1)) # 0
print(table.get(2)) # 1
print()

table.convert(col, lambda x: x+1)
print(table.get(1)) # got 0, expected 1 ⚠⚠⚠
print(table.get(2)) # got 2, expected 2
```

Another example might be, say, transforming cells containing empty string to `NULL`.

# Discussion

This was, I think, a pragmatic choice so that consumers can skip writing guard clauses for these falsey values (particularly from the CLI). But this surprising undocumented behavior can lead to incorrect data. I don't think this is a good trade-off between convenience and correctness.

In the absence of this convenience users will either have to write guard clauses into their conversion expressions (or adapt the called function to do the same), so: 
``` python
    fn(value) if value else value
```
instead of:
``` python
    fn(value)
```
This is more typing and sometimes I will forget, and there will be errors. (But they will be noisy errors, which is a good thing).

Such a change will certainly inconvenience some existing consumers; there will be some breakage. But I think this is worth it to avoid quietly not converting some values by default, which can lead to quietly bad data.

I have a PR that I will attach, please take a look and see what you think.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/527/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1575131737,I_kwDOCGYnMM5d4ppZ,525,Repeated calls to `Table.convert()` fail,167893,closed,0,,,4,2023-02-07T22:40:47Z,2023-05-08T21:59:41Z,2023-05-08T21:54:02Z,CONTRIBUTOR,,"## Summary
When using the API, repeated calls to `Table.convert()`  do not work correctly since all conversions quietly use the callable (function, lambda) from the first call to `convert()` only. Subsequent invocations with different callables use the callable from the first invocation only.

## Example
```python
from sqlite_utils import Database

db = Database(memory=True)
table = db['table']
col = 'x'
table.insert_all([{col: 1}])
print(table.get(1))

table.convert(col, lambda x: x*2)
print(table.get(1))

def zeroize(x):
    return 0
#zeroize = lambda x: 0
#zeroize.__name__ = 'zeroize'
table.convert(col, zeroize)
print(table.get(1))
```

Output:
```
{'x': 1}
{'x': 2}
{'x': 4}
```
Expected:
```
{'x': 1}
{'x': 2}
{'x': 0}
```

## Explanation
This is some relevant [documentation](https://github.com/simonw/sqlite-utils/blob/1491b66dd7439dd87cd5cd4c4684f46eb3c5751b/docs/python-api.rst#registering-custom-sql-functions:~:text=By%20default%20registering%20a%20function%20with%20the%20same%20name%20and%20number%20of%20arguments%20will%20have%20no%20effect).

 * `Table.convert()` takes a `Callable` to perform data conversion on a column
 * The `Callable` is passed to `Database.register_function()`
 * `Database.register_function()` uses the callable's `__name__` attribute for registration
 * (Aside: all lambdas have a `__name__` of `<lambda>`: I thought this was the problem, and it was close, but not quite)
 * However `convert()` first wraps the callable by local function [`convert_value()`](https://github.com/simonw/sqlite-utils/blob/fc221f9b62ed8624b1d2098e564f525c84497969/sqlite_utils/db.py#L2661)
 * Consequently `register_function()` sees name `convert_value` for all invocations from `convert()`
 * `register_function()` silently ignores registrations using the same name, retaining only the first such registration

There's a mismatch between the comments and the code: https://github.com/simonw/sqlite-utils/blob/fc221f9b62ed8624b1d2098e564f525c84497969/sqlite_utils/db.py#L404

but actually the existing function is returned/used instead (as the ""registering custom sql functions"" doc I linked above says too). Seems like this can be rectified to match the comment?

## Suggested fix
I think there are four things:
1. The call to `register_function()` from `convert()`should have an explicit `name=` parameter (to continue using `convert_value()` and the progress bar).
2. For functions, this name can be the real function name. (I understand the sqlite api needs a name, and it's nice if those are recognizable names where possible). For lambdas would `'lambda-{uuid}'` or similar be acceptable? 
3. `register_function()` really should throw an error on repeated attempts to register a duplicate (function, arity)-pair.
4. A test? I haven't looked at the test framework here but seems this should be testable.

## See also 
- #458 ",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/525/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1044267332,I_kwDOCGYnMM4-PkFE,336,"sqlite-util tranform --column-order mangles columns of type ""timestamp""",536941,closed,0,,,1,2021-11-04T01:15:38Z,2023-05-08T21:13:38Z,2023-05-08T21:13:38Z,CONTRIBUTOR,,"Reproducible code below:

```bash
> echo 'create table bar (baz text, created_at timestamp default CURRENT_TIMESTAMP)' | sqlite3 foo.db
> sqlite3 foo.db
SQLite version 3.36.0 2021-06-18 18:36:39
Enter "".help"" for usage hints.
sqlite> .schema bar
CREATE TABLE bar (baz text, created_at timestamp default CURRENT_TIMESTAMP);
sqlite> .exit
> sqlite-utils transform foo.db bar --column-order baz
sqlite3 foo.db
SQLite version 3.36.0 2021-06-18 18:36:39
Enter "".help"" for usage hints.
sqlite> .schema bar
CREATE TABLE IF NOT EXISTS ""bar"" (
   [baz] TEXT,
   [created_at] FLOAT DEFAULT 'CURRENT_TIMESTAMP'
);
sqlite> .exit
> sqlite-utils transform foo.db bar --column-order baz
> sqlite3 foo.db
SQLite version 3.36.0 2021-06-18 18:36:39
Enter "".help"" for usage hints.
sqlite> .schema bar
CREATE TABLE IF NOT EXISTS ""bar"" (
   [baz] TEXT,
   [created_at] FLOAT DEFAULT '''CURRENT_TIMESTAMP'''
);
```

",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/336/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1595340692,I_kwDOCGYnMM5fFveU,530,"add ability to configure ""on delete"" and ""on update"" attributes of foreign keys:",536941,open,0,,,2,2023-02-22T15:44:14Z,2023-05-08T20:39:01Z,,CONTRIBUTOR,,"sqlite supports these, and it would be quite nice to be able to add them with sqlite-utils.

https://www.sqlite.org/foreignkeys.html#fk_actions",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/530/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,
1393202060,I_kwDOCGYnMM5TCpOM,496,devrel/python api: Pylance type hinting,7908073,open,0,,,4,2022-10-01T03:03:34Z,2023-05-03T05:53:27Z,,CONTRIBUTOR,,"Pylance is generally pretty good at figuring out stuff but `sqlite-utils` has some quirks which make type hinting kinda useless. Maybe you don't care but I thought I would bring it to your attention.

For example:

```
db[""subs""].insert_all(subs, pk=""index"")
```

```
Cannot access member ""insert_all"" for type ""View""
  Member ""insert_all"" is unknown
```

`insert_all` and all the other methods show up as a type issues because the program can't know whether something is a View or a Table. Fair enough. But that basically throws all type checking out the window.

`pk=""index""` also shows up as a type issue:

```
Argument of type ""Literal['index']"" cannot be assigned to parameter ""pk"" of type ""Default"" in function ""insert_all""
  ""Literal['index']"" is incompatible with ""Default""
```

I think this is because DEFAULT is an empty class? 

maybe a few small changes could be made to make the library more type-friendly

The interim solution is of course to turn off type hints completely for the line
```
db[""subs""].insert_all(subs, pk=""index"")  # type: ignore
```
",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/496/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,
1560651350,I_kwDOCGYnMM5dBaZW,523,Feature request: trim all leading and trailing white space for all columns for all tables in a database,536941,open,0,,,1,2023-01-28T02:40:10Z,2023-01-28T02:41:14Z,,CONTRIBUTOR,,"It's pretty common that i need to trim leading or trailing white space from lots of columns in a database a part of an initial ETL.

I use the following recipe a lot, and it would be great to include this functionality into sqlite-utils

`trimify.sql`
```sql
select 'select group_concat(''update [' || name || '] set ['' || name || ''] = trim(['' || name || ''])'', '';
'') || '';
'' as sql_to_run from pragma_table_info('''||name||''');' from sqlite_schema;
```

then something like:

```bash
	sqlite3 example.db < scripts/trimify.sql > table_trim.sql && \
             sqlite3 $example.db < table_trim.sql > trim.sql && \
             sqlite3 $example.db < trim.sql
```",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/523/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,
1436539554,I_kwDOCGYnMM5Vn9qi,511,"[insert_all, upsert_all] IntegrityError: constraint failed",7908073,closed,0,,,2,2022-11-04T19:21:48Z,2022-11-04T22:59:54Z,2022-11-04T22:54:09Z,CONTRIBUTOR,,"My understand is that `INSERT OR IGNORE` will ignore when inserts would cause duplicate keys so I'm not sure exactly why the error is raised from `sqlite3`.

```
import argparse
from pathlib import Path

from xklb import db, utils
from xklb.utils import log


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    parser.add_argument(""database"")
    parser.add_argument(""dbs"", nargs=""*"")
    parser.add_argument(""--upsert"")
    parser.add_argument(""--db"", ""-db"", help=argparse.SUPPRESS)
    parser.add_argument(""--verbose"", ""-v"", action=""count"", default=0)
    args = parser.parse_args()

    if args.db:
        args.database = args.db
    Path(args.database).touch()
    args.db = db.connect(args)
    log.info(utils.dict_filter_bool(args.__dict__))

    return args


def merge_db(args, source_db):
    source_db = str(Path(source_db).resolve())

    s_db = db.connect(argparse.Namespace(database=source_db, verbose=args.verbose))
    for table in [s for s in s_db.table_names() if not ""_fts"" in s and not s.startswith(""sqlite_"")]:
        log.info(""[%s]: %s"", source_db, table)
        with s_db.conn:
            data = s_db[table].rows

        with args.db.conn:
            if args.upsert:
                args.db[table].upsert_all(data, pk=args.upsert.split("",""), alter=True)
            else:
                args.db[table].insert_all(data, alter=True, replace=True)


def merge_dbs():
    args = parse_args()
    for s_db in args.dbs:
        merge_db(args, s_db)


if __name__ == ""__main__"":
    merge_dbs()

```

```
$ lb-dev merge video.db tube_71.db --upsert path -vv
SQL: INSERT OR IGNORE INTO [media]([path]) VALUES(?); - params: ['https://archive.org/details/088ghostofachanceroygetssackedrevengeofthelivinglunchdvdripxvidphz']
...
File ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:3122, in Table.insert_all(self, records, pk, foreign_keys, column_order, not_null, defaults, batch_size, hash_id, hash_id_columns, alter, ignore, replace, truncate, extracts, conversions, columns, upsert, analyze)
   3116             all_columns += [
   3117                 column for column in record if column not in all_columns
   3118             ]
   3120     first = False
-> 3122     self.insert_chunk(
   3123         alter,
   3124         extracts,
   3125         chunk,
   3126         all_columns,
   3127         hash_id,
   3128         hash_id_columns,
   3129         upsert,
   3130         pk,
   3131         conversions,
   3132         num_records_processed,
   3133         replace,
   3134         ignore,
   3135     )
   3137 if analyze:
   3138     self.analyze()

File ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:2887, in Table.insert_chunk(self, alter, extracts, chunk, all_columns, hash_id, hash_id_columns, upsert, pk, conversions, num_records_processed, replace, ignore)
   2885 for query, params in queries_and_params:
   2886     try:
-> 2887         result = self.db.execute(query, params)
   2888     except OperationalError as e:
   2889         if alter and ("" column"" in e.args[0]):
   2890             # Attempt to add any missing columns, then try again

File ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:484, in Database.execute(self, sql, parameters)
    482     self._tracer(sql, parameters)
    483 if parameters is not None:
--> 484     return self.conn.execute(sql, parameters)
    485 else:
    486     return self.conn.execute(sql)

IntegrityError: constraint failed
> /home/xk/.local/lib/python3.10/site-packages/sqlite_utils/db.py(484)execute()
    482                 self._tracer(sql, parameters)
    483             if parameters is not None:
--> 484                 return self.conn.execute(sql, parameters)
    485             else:
    486                 return self.conn.execute(sql)
```

```
sqlite3 --version
3.36.0 2021-06-18 18:36:39
```",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/511/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1430325103,I_kwDOCGYnMM5VQQdv,507,conn.execute: UnicodeEncodeError: 'utf-8' codec can't encode character,7908073,closed,0,,,1,2022-10-31T18:49:51Z,2022-11-01T00:40:17Z,2022-11-01T00:40:16Z,CONTRIBUTOR,,"I'm not really sure what caused this and it happened in the middle of my program (after running for 35775 seconds).

```
Extracting metadata 49.9% (chunk 9893 of 19831)
...
  File ""/home/xk/.local/lib/python3.10/site-packages/xklb/fs_extract.py"", line 90, in extract_chunk
    args.db[""media""].insert_all(utils.list_dict_filter_bool(media), pk=""path"", alter=True, replace=True)
  File ""/home/xk/.local/lib/python3.10/site-packages/sqlite_utils/db.py"", line 3107, in insert_all
    self.insert_chunk(
  File ""/home/xk/.local/lib/python3.10/site-packages/sqlite_utils/db.py"", line 2872, in insert_chunk
    result = self.db.execute(query, params)
  File ""/home/xk/.local/lib/python3.10/site-packages/sqlite_utils/db.py"", line 483, in execute
    return self.conn.execute(sql, parameters)
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcc3' in position 62: surrogates not allowed
```

This might be relevant: https://stackoverflow.com/questions/31898353/python-cant-encode-with-surrogateescape

I'm going to try re-running with 

```py
    def execute(
        self, sql: str, parameters: Optional[Union[Iterable, dict]] = None
    ) -> sqlite3.Cursor:
        """"""
        Execute SQL query and return a ``sqlite3.Cursor``.

        :param sql: SQL query to execute
        :param parameters: Parameters to use in that query - an iterable for ``where id = ?``
          parameters, or a dictionary for ``where id = :id``
        """"""
        try:
            if self._tracer:
                self._tracer(sql, parameters)
            if parameters is not None:
                return self.conn.execute(sql, parameters)
            else:
                return self.conn.execute(sql)
        except UnicodeEncodeError:
            sql = sql.encode('utf-8', 'surrogatepass').decode('utf-8')
            if parameters is not None:
                parameters = parameters.encode('utf-8', 'surrogatepass').decode('utf-8')
            return self.execute(sql, parameters)
```",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/507/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1393212964,I_kwDOCGYnMM5TCr4k,497,column_names,7908073,closed,0,,,1,2022-10-01T03:34:21Z,2022-10-25T21:09:28Z,2022-10-25T21:09:28Z,CONTRIBUTOR,,"It would be nice to have a `column_names`. Similar to `table_names`.

Or if you could get one or all of the following syntax to work for both Database and Table that might be even better: 

Style 1
- `if 'table1' in db`
- `if 'col1' in db['table1']`

Style 2
- `if 'table1' in db.tables`
- `if 'col1' in db['table1'].columns`

maybe the table ones actually work but I'm too lazy to check. I just know that I have to do:

 `[c.name for c in db['table1'].columns]`

Edit: This is possible with `columns_dict`. I have actually used that before but I forgot about it. Feel free to close, but I do think accessing this data could be more consistent and intuitive.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/497/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1361355564,I_kwDOCGYnMM5RJKMs,482,balanced table default column_order,7908073,closed,0,,,1,2022-09-05T03:00:18Z,2022-10-10T17:43:02Z,2022-09-06T20:17:27Z,CONTRIBUTOR,,"Is there any performance or size difference with column order in SQLITE ? similar to this https://www.cybertec-postgresql.com/en/column-order-in-postgresql-does-matter/

It might be interesting to have an option to create with an optimized column order. I'm assuming this would look something like INTEGER columns, REAL columns, BLOB columns, TEXT columns, NULL columns. NULL columns at the end because they are more likely to be TEXT and it is impossible to know if they will become INTEGER

(Of course, any schema evolution would reduce optimization but maybe column order could also be re-evaluated when schema changes)

edit:

this is easy to accomplish with the existing `transform` method:

```
int_columns = [k for k, v in table_columns.items() if v == int]
db[table].transform(column_order=[*int_columns])
```",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/482/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1366423176,I_kwDOCGYnMM5RcfaI,485,Progressbar not shown when inserting/upserting jsonlines file,99098079,closed,0,,,1,2022-09-08T14:13:18Z,2022-09-15T20:39:52Z,2022-09-15T20:37:52Z,CONTRIBUTOR,,"When inserting or upserting a jsonlines file, no progressbar is shown. Expected behavior is that, just like with .csv/.tsv files, also for a jsonlines file (--nl), unless --silent is provided, a progressbar is shown.

```bash
sql-utils upsert mydb.db posts posts.jl --nl --pk post_id
(silence)
```

Currently `file_progress` is only called within the tsv/csv logic, however I think it can be safely wrapped around all the all the input formats that use `decoded`: https://github.com/simonw/sqlite-utils/blob/main/sqlite_utils/cli.py#L963",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/485/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1355193529,I_kwDOCGYnMM5Qxpy5,479,OperationalError: cannot VACUUM from within a transaction,7908073,open,0,,,0,2022-08-30T05:34:24Z,2022-08-30T05:34:24Z,,CONTRIBUTOR,,"Maybe when calling `.vacuum()` and other DB-level write-lock operations `sqlite_utils` could guard against this error message by automatically committing first?

```
     46 db[""media""].optimize()  # type: ignore
---> 47 db.vacuum()

File ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:1047, in Database.vacuum(self)
   1045 def vacuum(self):
   1046     ""Run a SQLite ``VACUUM`` against the database.""
-> 1047     self.execute(""VACUUM;"")

File ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:470, in Database.execute(self, sql, parameters)
    468     return self.conn.execute(sql, parameters)
    469 else:
--> 470     return self.conn.execute(sql)

OperationalError: cannot VACUUM from within a transaction
```

It might also be nice to add a sentence or two about how transactions are committed on the [docs page](https://sqlite-utils.datasette.io/en/latest/python-api.html#detect-fts). When I was swapping out my sqlite3 code for this library it was nice that everything was pretty much drop-in but I was/am unsure what to do about the places I explicitly call `.commit()` in my code

Related to https://github.com/simonw/sqlite-utils/issues/121",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/479/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,
1324659241,I_kwDOCGYnMM5O9LIp,459,Single quoted transform recipes on Windows do not work as expected ,19921,open,0,,,0,2022-08-01T16:14:54Z,2022-08-01T16:14:54Z,,CONTRIBUTOR,,"Trying to follow the tutorial for sqlite-utils and datasette https://datasette.io/tutorials/clean-data on Windows 11 OS `Microsoft Windows [Version 10.0.22622.440]`, with sqlite-utils and datasette installed using pipx.

```
pipx list
package datasette 0.61.1, installed using Python 3.10.4
    - datasette.exe
package sqlite-utils 3.28, installed using Python 3.10.4
    - sqlite-utils.exe
```  

In the step to transform dates into ISO dates the quoted value `'r.parsedatetime(value)'` is copied verbatim into the columns instead of applying the output of the Python recipe.

```
sqlite-utils convert manatees.db locations \
  REPDATE created_date last_edited_date \
  'r.parsedatetime(value)' --dry-run

1975/01/31 00:00:00+00
 --- becomes:
r.parsedatetime(value)

Would affect 13568 rows
```

However, if I change the code from single quotes to double quotes, it works as expected.

```
sqlite-utils convert manatees.db locations \
  REPDATE created_date last_edited_date \
  ""r.parsedatetime(value)"" --dry-run

1975/01/31 00:00:00+00
 --- becomes:
1975-01-31T00:00:00+00:00

Would affect 13568 rows
```

Specifying the transform code recipe should work with single quotes on Windows.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/459/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,
1310243385,I_kwDOCGYnMM5OGLo5,456,feature request: pivot command,536941,open,0,,,5,2022-07-20T00:58:08Z,2022-07-20T17:50:50Z,,CONTRIBUTOR,,pivoting long-format table to wide-format tables is pretty common and kind of pain. would love to see this feature in sqlite-utils!,140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/456/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,
1279863844,I_kwDOCGYnMM5MSSwk,449,Utilities for duplicating tables and creating a table with the results of a query,1690072,closed,0,,,4,2022-06-22T09:41:43Z,2022-07-15T21:46:13Z,2022-07-15T21:21:36Z,CONTRIBUTOR,,"is there a duplicate table functionality? Otherwise, I'd be happy to submit a PR.

In sqlite3 it would look like:

```python
import sqlite3 as sl

con = sl.connect('prompt-tune.db')

def db_duplicate_table(table_name, table_name_new, con=con):
    # Duplicates table `table_name` to a new table `table_name_new`.
    try:
        cur = con.cursor()
        cur.execute(f""""""CREATE TABLE {table_name_new} AS SELECT * FROM {table_name}"""""")
    except Exception as e:
        print(e)
    finally:
        cur.close()

db_duplicate_table('orig_table', 'new_table')
```",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/449/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1178456794,I_kwDOCGYnMM5GPdLa,418,Add generated files to .gitignore,25778,closed,0,,,0,2022-03-23T17:48:12Z,2022-03-24T21:01:44Z,2022-03-24T21:01:44Z,CONTRIBUTOR,,"I end up with these in my local directory:

	.hypothesis/
	Pipfile
	Pipfile.lock
	pyproject.toml

Might as well gitignore them.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/418/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1160034488,I_kwDOCGYnMM5FJLi4,411,Support for generated columns,25778,open,0,,,8,2022-03-04T20:41:33Z,2022-03-11T22:32:43Z,,CONTRIBUTOR,,"This is a fairly new feature -- SQLite version 3.31.0 (2020-01-22) -- that I, admittedly, haven't gotten to work yet. But it looks _incredibly_ useful: https://dgl.cx/2020/06/sqlite-json-support

I'm not sure if this is an option on `add-column` or a separate command like `add-generated-column`. Either way, it needs an argument to populate it. It could be something like this:

```sh
sqlite-utils add-column data.db table-name generated --as 'json_extract(data, ""$.field"")' --virtual
```

More here: https://www.sqlite.org/gencol.html",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/411/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,
1124237013,I_kwDOCGYnMM5DAn7V,398,Add SpatiaLite helpers to CLI,25778,closed,0,,,9,2022-02-04T14:01:28Z,2022-02-16T01:02:29Z,2022-02-16T00:58:07Z,CONTRIBUTOR,,"Now that #385 is merged, add CLI versions of those methods.

```sh
# init spatialite
sqlite-utils init-spatialite database.db

# or maybe/also
sqlite-utils create database.db --enable-wal --spatialite

# add geometry columns
# needs a database, table, geometry column name, type, with optional SRID and not-null
# this needs to create a table if it doesn't already exist
sqlite-utils add-geometry-column database.db table-name geometry --srid 4326 --not-null

# spatial index an existing table/column
sqlite-utils create-spatial-index database.db table-name geometry
```

Should be mostly straightforward. The one thing worth highlighting in docs is that geometry columns can only be added to existing tables. Trying to add a geometry column to a table that doesn't exist yet might mean you have a schema like `{""rowid"": int, ""geometry"": bytes}`. Might be worth nudging people to explicitly create a table first, then add geometry columns.
",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/398/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1126692066,I_kwDOCGYnMM5DJ_Ti,403,Document how to add a primary key to a rowid table using `sqlite-utils transform --pk`,536941,closed,0,,,4,2022-02-08T01:39:40Z,2022-02-09T04:22:43Z,2022-02-08T19:33:59Z,CONTRIBUTOR,,"*Original title: Add option for adding a new, serial, primary key*

sometimes we have tables that don't have primary keys, but ought to have them. we *can* use rowid for that, but it would often be nicer to have an explicit primary key. using the current value of rowid would be fine.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/403/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1096558279,I_kwDOCGYnMM5BXCbH,365,create-index should run analyze after creating index,536941,closed,0,,7558727,16,2022-01-07T18:21:25Z,2022-01-11T02:43:34Z,2022-01-11T01:36:48Z,CONTRIBUTOR,,"sqlite's query planner depends upon analyze to make good use of indices. It would be nice if analyze was run as part of the create-index command.

If data is inserted later, things can get out date, but it would still probably be a net win. ",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/365/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
1077102934,I_kwDOCGYnMM5AM0lW,353,"Allow passing a file of code to ""sqlite-utils convert""",536941,closed,0,,,8,2021-12-10T18:06:14Z,2021-12-11T01:38:29Z,2021-12-11T01:09:39Z,CONTRIBUTOR,,"sqlite-utils is so nice, but the ergonomics of the multiline code in kind of tough. It's really hard (maybe impossible) to make the newlines play well with Makefiles.

it would be great to write your code fragment in a separate file and direct it into the sqlite-utils

either like

```sqlite-utils convert my.db my_table my_column < custom_code.py```

or

```sqlite-utils convert my.db my_table my_column --custom-code=custom_code.py```

Thanks, as ever, for these great tools!",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/353/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
817989436,MDU6SXNzdWU4MTc5ODk0MzY=,242,Async support,25778,open,0,,,13,2021-02-27T18:29:38Z,2021-10-28T14:37:56Z,,CONTRIBUTOR,,"Following our conversation last week, want to note this here before I forget.

I've had a couple situations where I'd like to do a bunch of updates in an async event loop, but I run into SQLite's issues with concurrent writes. This feels like something sqlite-utils could help with.

PeeWee ORM has a [SQLite write queue](http://docs.peewee-orm.com/en/latest/peewee/playhouse.html#sqliteq) that might be a good model. It's using threads or gevent, but I _think_ that approach would translate well enough to asyncio. 

Happy to help with this, too.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/242/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,
831751367,MDU6SXNzdWU4MzE3NTEzNjc=,246,Escaping FTS search strings,16001974,closed,0,,,4,2021-03-15T12:15:09Z,2021-08-18T18:57:13Z,2021-08-18T18:43:12Z,CONTRIBUTOR,,"
Thanks for the excellent library, it's very nice to use!

I've been building some in memory search functionality for a data annotation tool i'm making, and I got tripped up a little bit with escaping the full text search queries. First I tried using `db.quote(q)`, which doesn't work, because sqlite FTS has it's own (separate)[ query syntax](https://www2.sqlite.org/fts5.html#full_text_query_syntax). You can see this happening here also:

http://search-24ways.herokuapp.com/24ways-f8f455f/articles?_search=acces%2A

I got around this by aggressively escaping quotes inside the query string like this:

```python
        quoted = q.replace('""', '""""')
        quoted = f'""{quoted}""'
        print(quoted)
        results = db[""data""].search(quoted, columns=[""id""])
        return [x[""id""] for x in results]

```

This works in the sense it doesn't crash, but it also removes access to the search query syntax. Given the well specified definition, it might be possible for sqlite-utils to provide a `db.quote_query(q)` which would intelligently escape a query whilst leaving the syntax intact. This would be very nice!


",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/246/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
923697888,MDU6SXNzdWU5MjM2OTc4ODg=,278,"Support db as first parameter before subcommand, or as environment variable",601708,closed,0,,,3,2021-06-17T09:26:29Z,2021-06-20T22:39:57Z,2021-06-18T15:43:19Z,CONTRIBUTOR,,,140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/278/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
923602693,MDU6SXNzdWU5MjM2MDI2OTM=,276,support small help flag -h,601708,closed,0,,,0,2021-06-17T07:59:31Z,2021-06-18T14:56:59Z,2021-06-18T14:56:59Z,CONTRIBUTOR,,,140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/276/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
907642546,MDU6SXNzdWU5MDc2NDI1NDY=,264,"Supporting additional output formats, like GeoJSON",25778,closed,0,,,3,2021-05-31T18:03:32Z,2021-06-03T05:12:21Z,2021-06-03T05:12:21Z,CONTRIBUTOR,,"I have a project going where it would be useful to do some spatial processing in SQLite (instead of large files) and then output GeoJSON. So my workflow would be something like this:

1. Read Shapefiles, GeoJSON, CSVs into a SQLite database
2. Join, filter, prune as needed
3. Export GeoJSON for just the stuff I need at that moment, while still having a database of things that will be useful later

I'm wondering if this is worth adding to SQLite-utils itself (GeoJSON, at least), or if it's better to make a counterpart to the ecosystem of `*-to-sqlite` tools, say a suite of `sqlite-to-*` things. Or would it be crazy to have a plugin system?",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/264/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
868188068,MDU6SXNzdWU4NjgxODgwNjg=,257,"Insert from JSON containing strings with non-ascii characters are escaped as unicode for lists, tuples, dicts.",6586811,closed,0,,,0,2021-04-26T20:46:25Z,2021-05-19T02:57:05Z,2021-05-19T02:57:05Z,CONTRIBUTOR,,"JSON Test File (test.json):

```json
[
    {
        ""id"": 123,
        ""text"": ""FR Théâtre""
    },
    {
        ""id"": 223,
        ""text"": [
            ""FR Théâtre""
        ]
    }
]
```

Command to import:

```bash
sqlite-utils insert test.db text test.json --pk=id
```

Resulting table view from datasette:

![image](https://user-images.githubusercontent.com/6586811/116147833-cdf2fb00-a6a5-11eb-8412-0aae81b6e6dd.png)

Original, db.py line 2225:

```python
        return json.dumps(value, default=repr)
```

Fix, db.py line 2225:

```python
        return json.dumps(value, default=repr, ensure_ascii=False)
```",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/257/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
688670158,MDU6SXNzdWU2ODg2NzAxNTg=,147,SQLITE_MAX_VARS maybe hard-coded too low,96218,open,0,,,7,2020-08-30T07:26:45Z,2021-02-15T21:27:55Z,,CONTRIBUTOR,,"I came across this while about to open an issue and PR against the documentation for `batch_size`, which is a bit incomplete.

As mentioned in #145, while:

> [`SQLITE_MAX_VARIABLE_NUMBER`](https://www.sqlite.org/limits.html#max_variable_number) ... defaults to 999 for SQLite versions prior to 3.32.0 (2020-05-22) or 32766 for SQLite versions after 3.32.0.

it is common that it is increased at compile time. Debian-based systems, for example, seem to ship with a version of sqlite compiled with SQLITE_MAX_VARIABLE_NUMBER set to 250,000, and I believe this is the case for homebrew installations too.

In working to understand what `batch_size` was actually doing and why, I realized that by setting `SQLITE_MAX_VARS` in `db.py` to match the value my sqlite was compiled with (I'm on Debian), I was able to decrease the time to `insert_all()` my test data set (~128k records across 7 tables) from ~26.5s to ~3.5s.  Given that this about .05% of my total dataset, this is time I am keen to save...

Unfortunately, it seems that `sqlite3` in the python standard library doesn't expose the `get_limit()` C API (even though `pysqlite` used to), so it's hard to know what value sqlite has been compiled with (note that this could mean, I suppose, that it's less than 999, and even hardcoding `SQLITE_MAX_VARS` to the conservative default might not be adequate.  It can also be lowered -- but not raised -- at runtime).  The best I could come up with is `echo """" | sqlite3 -cmd "".limits variable_number""` (only available in `sqlite >= 2015-05-07 (3.8.10)`).

Obviously this couldn't be relied upon in `sqlite_utils`, but I wonder what your opinion would be about exposing `SQLITE_MAX_VARS` as a user-configurable parameter (with suitable ""here be dragons"" warnings)?  I'm going to go ahead and monkey-patch it for my purposes in any event, but it seems like it might be worth considering.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/147/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,
766156875,MDU6SXNzdWU3NjYxNTY4NzU=,209,Test failure with sqlite 3.34 in test_cli.py::test_optimize,191622,closed,0,,,1,2020-12-14T08:58:18Z,2021-01-01T23:52:46Z,2021-01-01T23:52:46Z,CONTRIBUTOR,,"pytest output:
```
...
============================== short test summary info ===============================
FAILED tests/test_cli.py::test_optimize[tables0] - assert 1662976 < 1662976
FAILED tests/test_cli.py::test_optimize[tables1] - assert 1667072 < 1662976
===================== 2 failed, 538 passed, 3 skipped in 34.32s ======================
```

Came across this while packaging `sqlite-utils` for NixOS, but it can be recreated it using the `alpine:edge` docker image as well as follows:

```
docker run --rm -it alpine:edge /bin/sh
# apk update && apk add git sqlite python3 gcc python3-dev musl-dev && python3 -m ensurepip
# git clone https://github.com/simonw/sqlite-utils.git
# cd sqlite-utils/
# pip3 install -e .[test]
# pytest
```

This definitely works on sqlite v3.33.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/209/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
488293926,MDU6SXNzdWU0ODgyOTM5MjY=,58,Support enabling FTS on views,49260,closed,0,,,1,2019-09-02T18:56:36Z,2020-10-16T18:39:36Z,2020-10-16T18:39:31Z,CONTRIBUTOR,,"Right now enable_fts() is only implemented for Table(). Technically sqlite supports enabling fts on views. But it requires deeper thought since views don't have `rowid` and the current implementation of enable_fts() relies on the presence of `rowid` column. 

It is possible to provide an alternative rowid using the `content_rowid` option to the FTS5() function. 

Ref: https://sqlite.org/fts5.html#fts5_table_creation_and_initialization

>  The ""content_rowid"" option, used to set the rowid field of an external content table. 

This will further complicate `enable_fts()` function by adding an extra argument. I'm wondering if that is outside the scope of this tool or should I work on that feature and send a PR? ",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/58/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
688659182,MDU6SXNzdWU2ODg2NTkxODI=,145,Bug when first record contains fewer columns than subsequent records,96218,closed,0,,,2,2020-08-30T05:44:44Z,2020-09-08T23:21:23Z,2020-09-08T23:21:23Z,CONTRIBUTOR,,"`insert_all()` selects the maximum batch size based on the number of fields in the first record.  If the first record has fewer fields than subsequent records (and `alter=True` is passed), this can result in SQL statements with more than the maximum permitted number of host parameters.  This situation is perhaps unlikely to occur, but could happen if the first record had, say, 10 columns, such that `batch_size` (based on  `SQLITE_MAX_VARIABLE_NUMBER = 999`) would be 99.  If the next 98 rows had 11 columns, the resulting SQL statement for the first batch would have `10 * 1 + 11 * 98 = 1088` host parameters (and subsequent batches, if the data were consistent from thereon out, would have `99 * 11 = 1089`).

I suspect that this bug is masked somewhat by the fact that while:
> [`SQLITE_MAX_VARIABLE_NUMBER`](https://www.sqlite.org/limits.html#max_variable_number) ... defaults to 999 for SQLite versions prior to 3.32.0 (2020-05-22) or 32766 for SQLite versions after 3.32.0.

it is common that it is increased at compile time.  Debian-based systems, for example, seem to ship with a version of sqlite compiled with `SQLITE_MAX_VARIABLE_NUMBER` set to 250,000, and I believe this is the case for homebrew installations too.

A test for this issue might look like this:
```python
def test_columns_not_in_first_record_should_not_cause_batch_to_be_too_large(fresh_db):
    # sqlite on homebrew and Debian/Ubuntu etc. is typically compiled with
    #  SQLITE_MAX_VARIABLE_NUMBER set to 250,000, so we need to exceed this value to
    #  trigger the error on these systems.
    THRESHOLD = 250000
    extra_columns = 1 + (THRESHOLD - 1) // 99
    records = [
        {""c0"": ""first record""},  # one column in first record -> batch_size = 100
        # fill out the batch with 99 records with enough columns to exceed THRESHOLD
        *[
            dict([(""c{}"".format(i), j) for i in range(extra_columns)])
            for j in range(99)
        ]
    ]
    try:
        fresh_db[""too_many_columns""].insert_all(records, alter=True)
    except sqlite3.OperationalError:
        raise
```

The best solution, I think, is simply to process all the records when determining columns, column types, and the batch size.  In my tests this doesn't seem to be particularly costly at all, and cuts out a lot of complications (including obviating my implementation of #139 at #142).  I'll raise a PR for your consideration.

",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/145/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
686978131,MDU6SXNzdWU2ODY5NzgxMzE=,139,"insert_all(..., alter=True) should work for new columns introduced after the first 100 records",96218,closed,0,,,7,2020-08-27T06:25:25Z,2020-08-28T22:48:51Z,2020-08-28T22:30:14Z,CONTRIBUTOR,,"Is there a way to make `.insert_all()` work properly when new columns are introduced outside the first 100 records (with or without the `alter=True` argument)?

I'm using `.insert_all()` to bulk insert ~3-4k records at a time and it is common for records to need to introduce new columns.  However, if new columns are introduced after the first 100 records, `sqlite_utils` doesn't even raise the `OperationalError: table ... has no column named ...` exception; it just silently drops the extra data and moves on.

It took me a while to find this little snippet in the [documentation for `.insert_all()`](https://sqlite-utils.readthedocs.io/en/stable/python-api.html#bulk-inserts) (it's not mentioned under [Adding columns automatically on insert/update](https://sqlite-utils.readthedocs.io/en/stable/python-api.html#bulk-inserts)):

> The column types used in the CREATE TABLE statement are automatically derived from the types of data in that first batch of rows. **_Any additional or missing columns in subsequent batches will be ignored._**

I tried changing the `batch_size` argument to the total number of records, but it seems only to effect the number of rows that are committed at a time, and has no influence on this problem.

Is there a way around this that you would suggest?  It seems like it should raise an exception at least.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/139/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
610517472,MDU6SXNzdWU2MTA1MTc0NzI=,103,sqlite3.OperationalError: too many SQL variables in insert_all when using rows with varying numbers of columns,32605365,closed,0,,,8,2020-05-01T02:26:14Z,2020-05-14T00:18:57Z,2020-05-14T00:18:57Z,CONTRIBUTOR,,"If using insert_all to put in 1000 rows of data with varying number of columns, it comes up with this message `sqlite3.OperationalError: too many SQL variables` if the number of columns is larger in later records (past the first row)

I've reduced `SQLITE_MAX_VARS` by 100 to 899 at the top of `db.py` to add wiggle room, so that if the column count increases it wont go past SQLite's batch limit as calculated by this line of code based on the count of the first row's dict keys

        batch_size = max(1, min(batch_size, SQLITE_MAX_VARS // num_columns))",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/103/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed
546073980,MDU6SXNzdWU1NDYwNzM5ODA=,74,Test failures on openSUSE 15.1: AssertionError: Explicit other_table and other_column,15092,open,0,,,3,2020-01-07T04:35:50Z,2020-01-12T07:21:17Z,,CONTRIBUTOR,,"openSUSE 15.1 is using python 3.6.5 and click-7.0 , however it has test failures while openSUSE Tumbleweed on py37 passes.

Most fail on the cli exit code like
```py
[   74s] =================================== FAILURES ===================================
[   74s] _________________________________ test_tables __________________________________
[   74s] 
[   74s] db_path = '/tmp/pytest-of-abuild/pytest-0/test_tables0/test.db'
[   74s] 
[   74s]     def test_tables(db_path):
[   74s]         result = CliRunner().invoke(cli.cli, [""tables"", db_path])
[   74s] >       assert '[{""table"": ""Gosh""},\n {""table"": ""Gosh2""}]' == result.output.strip()
[   74s] E       assert '[{""table"": ""...e"": ""Gosh2""}]' == ''
[   74s] E         - [{""table"": ""Gosh""},
[   74s] E         -  {""table"": ""Gosh2""}]
[   74s] 
[   74s] tests/test_cli.py:28: AssertionError
```

packaging project at https://build.opensuse.org/package/show/home:jayvdb:py-new/python-sqlite-utils

I'll keep digging into this after I have github-to-sqlite working on Tumbleweed, as I'll need openSUSE Leap 15.1 working before I can submit this into the main python repo.",140912432,issue,,,"{""url"": ""https://api.github.com/repos/simonw/sqlite-utils/issues/74/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,