issues

132 rows where repo = 140912432 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association pull_request body repo type active_lock_reason performed_via_github_app
675839512 MDU6SXNzdWU2NzU4Mzk1MTI= 132 Features for enabling and disabling WAL mode simonw 9599 closed 0     5 2020-08-10T03:25:44Z 2020-08-10T18:59:35Z 2020-08-10T18:59:35Z OWNER  

I finally figured out how to enable WAL - turns out it's a property of the database file itself: https://github.com/simonw/til/blob/master/sqlite/enabling-wal-mode.md
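Because it is a property of the file, a single PRAGMA is enough; a minimal sketch with the stdlib sqlite3 module (the filename is illustrative):

```python
import sqlite3

# WAL mode is a property of the database file itself, so this only
# needs to run once per file - not once per connection.
conn = sqlite3.connect("data.db")  # illustrative filename
conn.execute("PRAGMA journal_mode=wal")
print(conn.execute("PRAGMA journal_mode").fetchone()[0])  # wal
```

Every subsequent connection to that file will see journal_mode as wal until it is explicitly switched back.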

sqlite-utils 140912432 issue    
675753042 MDU6SXNzdWU2NzU3NTMwNDI= 131 "insert" command options for column types simonw 9599 open 0     1 2020-08-09T18:59:11Z 2020-08-09T19:00:41Z   OWNER  

The insert command currently results in string types for every column.

It would be useful if you could do the following:

  • automatically detect the column types based on e.g. the first 1000 records
  • explicitly state the rule for specific columns

--detect-types could work for the former - or it could do that by default and allow opt-out using --no-detect-types

For specific columns maybe this:

sqlite-utils insert db.db images images.tsv \
  --tsv \
  -c id int \
  -c score float
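A hypothetical --detect-types implementation could sample values and pick the narrowest SQLite type that fits all of them; a rough sketch (detect_column_type is an illustrative name, not part of sqlite-utils):

```python
def detect_column_type(values):
    """Pick the narrowest SQLite type that fits every sampled string value."""
    candidates = ["INTEGER", "FLOAT", "TEXT"]
    for value in values:
        if value is None or value == "":
            continue  # blanks and nulls don't narrow the type
        try:
            int(value)
            continue  # still compatible with INTEGER
        except ValueError:
            pass
        try:
            float(value)
            # Not an integer, but still numeric - rule out INTEGER
            candidates = [c for c in candidates if c != "INTEGER"]
            continue
        except ValueError:
            pass
        return "TEXT"  # any non-numeric value forces TEXT
    return candidates[0]

print(detect_column_type(["1", "2.5", "3"]))  # FLOAT
```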
sqlite-utils 140912432 issue    
671130371 MDU6SXNzdWU2NzExMzAzNzE= 130 Support tokenize option for FTS simonw 9599 closed 0     3 2020-08-01T19:27:22Z 2020-08-01T20:51:28Z 2020-08-01T20:51:14Z OWNER  

FTS5 supports things like porter stemming using a tokenize= option:

https://www.sqlite.org/fts5.html#tokenizers

Something like this in code:

            CREATE VIRTUAL TABLE [{table}_fts] USING {fts_version} (
                {columns},
                tokenize='porter',
                content=[{table}]
            );

I tried this out just now and it worked exactly as expected.
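The experiment can be reproduced with the stdlib sqlite3 module (assuming FTS5 is compiled in, as it is in most builds):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE docs (body TEXT);
CREATE VIRTUAL TABLE docs_fts USING fts5(
    body,
    tokenize='porter',
    content=[docs]
);
""")
conn.execute("INSERT INTO docs (body) VALUES (?)", ("running dogs",))
# External-content tables need the index populated explicitly
conn.execute("INSERT INTO docs_fts (rowid, body) SELECT rowid, body FROM docs")

# Porter stemming means a search for 'run' matches 'running'
rows = conn.execute(
    "SELECT body FROM docs_fts WHERE docs_fts MATCH 'run'"
).fetchall()
print(rows)  # [('running dogs',)]
```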

So... db[table].enable_fts(...) should accept a tokenize= argument, and sqlite-utils enable-fts ... should support a --tokenize option.

sqlite-utils 140912432 issue    
665802405 MDU6SXNzdWU2NjU4MDI0MDU= 124 sqlite-utils query should support named parameters simonw 9599 closed 0     1 2020-07-26T15:25:10Z 2020-07-30T22:57:51Z 2020-07-27T03:53:58Z OWNER  

To help out with escaping - so you can run this:

sqlite-utils query "insert into foo (blah) values (:blah)" --param blah `something here`
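In Python's sqlite3 the equivalent escaping-safe call binds a mapping to the named placeholders; a --param option would just pass that mapping through:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (blah TEXT)")
# :blah is bound safely by SQLite - no manual escaping needed
conn.execute("INSERT INTO foo (blah) VALUES (:blah)", {"blah": "3.5 million"})
print(conn.execute("SELECT blah FROM foo").fetchone()[0])  # 3.5 million
```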
sqlite-utils 140912432 issue    
668308777 MDU6SXNzdWU2NjgzMDg3Nzc= 129 "insert-files --sqlar" for creating SQLite archives simonw 9599 closed 0     2 2020-07-30T02:28:29Z 2020-07-30T22:41:01Z 2020-07-30T22:40:55Z OWNER  

A --sqlar option could cause insert-files to behave in the same way as SQLite's own sqlar mechanism.

https://www.sqlite.org/sqlar.html and https://sqlite.org/sqlar/doc/trunk/README.md

sqlite-utils 140912432 issue    
666040390 MDU6SXNzdWU2NjYwNDAzOTA= 127 Ability to insert files piped to insert-files stdin simonw 9599 closed 0     3 2020-07-27T07:09:33Z 2020-07-30T03:08:52Z 2020-07-30T03:08:18Z OWNER  

Inserting files by piping them in should work - but since a filename cannot be derived this will need a --name blah.gif option.

cat blah.gif | sqlite-utils insert-files files.db files - --name=blah.gif

_Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/122#issuecomment-664128071_

sqlite-utils 140912432 issue    
666639051 MDU6SXNzdWU2NjY2MzkwNTE= 128 Support UUID and memoryview types simonw 9599 closed 0     1 2020-07-27T23:08:34Z 2020-07-30T01:10:43Z 2020-07-30T01:10:43Z OWNER  

psycopg2 can return data from PostgreSQL as uuid.UUID or memoryview objects. These should be supported by sqlite-utils - mainly for https://github.com/simonw/db-to-sqlite

sqlite-utils 140912432 issue    
665700495 MDU6SXNzdWU2NjU3MDA0OTU= 122 CLI utility for inserting binary files into SQLite simonw 9599 closed 0     10 2020-07-26T03:27:39Z 2020-07-27T07:10:41Z 2020-07-27T07:09:03Z OWNER  

SQLite BLOB columns can store entire binary files. The challenge is inserting them, since they don't neatly fit into JSON objects.

It would be great if the sqlite-utils CLI had a trick for helping with this.

Inspired by https://github.com/simonw/datasette-media/issues/14

sqlite-utils 140912432 issue    
621989740 MDU6SXNzdWU2MjE5ODk3NDA= 114 table.transform_table() method for advanced alter table simonw 9599 open 0     12 2020-05-20T18:20:46Z 2020-07-27T04:01:13Z   OWNER  

SQLite's ALTER TABLE can only do the following:

  • Rename a table
  • Rename a column
  • Add a column

Notably, it cannot drop columns - so tricks like "add a float version of this text column, populate it, then drop the old one and rename" won't work.

The docs here https://www.sqlite.org/lang_altertable.html describe a way of implementing full alters safely within a transaction, but it's fiddly.

  1. Create new table
  2. Copy data
  3. Drop old table
  4. Rename new into old

It would be great if sqlite-utils provided an abstraction to help make these kinds of changes safely.
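The four documented steps, sketched as raw SQL inside a single transaction (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (id INTEGER PRIMARY KEY, score TEXT, junk TEXT)")
conn.execute("INSERT INTO points VALUES (1, '3.5', 'drop me')")

with conn:  # wrap the whole dance in one transaction
    # 1. Create new table (score becomes FLOAT, junk is dropped)
    conn.execute("CREATE TABLE points_new (id INTEGER PRIMARY KEY, score FLOAT)")
    # 2. Copy data, casting as we go
    conn.execute("INSERT INTO points_new SELECT id, CAST(score AS FLOAT) FROM points")
    # 3. Drop old table
    conn.execute("DROP TABLE points")
    # 4. Rename new into old
    conn.execute("ALTER TABLE points_new RENAME TO points")

print(conn.execute("SELECT * FROM points").fetchone())  # (1, 3.5)
```

The fiddly parts an abstraction would need to handle on top of this are indexes, triggers, views and foreign keys referencing the old table.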

sqlite-utils 140912432 issue    
665819048 MDU6SXNzdWU2NjU4MTkwNDg= 126 Ability to insert binary data on the CLI using JSON simonw 9599 closed 0     2 2020-07-26T16:54:14Z 2020-07-27T04:00:33Z 2020-07-27T03:59:45Z OWNER  

I could solve round tripping (at least a bit) by allowing insert to be run with a flag that says "these columns are base64 encoded, store the decoded data in a BLOB".

That would solve inserting binary data using JSON too.
_Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/125#issuecomment-664012247_

sqlite-utils 140912432 issue    
665817570 MDU6SXNzdWU2NjU4MTc1NzA= 125 Output binary columns in "sqlite-utils query" JSON simonw 9599 closed 0     4 2020-07-26T16:47:02Z 2020-07-27T00:49:41Z 2020-07-27T00:48:45Z OWNER  

You get an error if you try to run a query that returns data from a BLOB.

sqlite-utils 140912432 issue    
665701216 MDU6SXNzdWU2NjU3MDEyMTY= 123 --raw option for outputting binary content simonw 9599 closed 0     0 2020-07-26T03:35:39Z 2020-07-26T16:44:11Z 2020-07-26T16:44:11Z OWNER  

Related to the insert-files work in #122 - it should be easy to get binary data back out of the database again.

One way to do that could be:

sqlite-utils files.db "select content from files where key = 'foo.jpg'" --raw

The --raw option would cause just the contents of the first column to be output directly to stdout.

sqlite-utils 140912432 issue    
652961907 MDU6SXNzdWU2NTI5NjE5MDc= 121 Improved (and better documented) support for transactions simonw 9599 open 0     3 2020-07-08T04:56:51Z 2020-07-09T22:40:48Z   OWNER  

_Originally posted by @simonw in https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655283393_

We should put some thought into how this library supports and encourages smart use of transactions.

sqlite-utils 140912432 issue    
652700770 MDU6SXNzdWU2NTI3MDA3NzA= 119 Ability to remove a foreign key simonw 9599 open 0     1 2020-07-07T22:31:37Z 2020-07-08T18:10:18Z   OWNER  

Useful if you add one but make a mistake and need to undo it without recreating the database from scratch.

sqlite-utils 140912432 issue    
651844316 MDExOlB1bGxSZXF1ZXN0NDQ1MDIzMzI2 118 Add insert --truncate option tsibley 79913 closed 0     9 2020-07-06T21:58:40Z 2020-07-08T17:26:21Z 2020-07-08T17:26:21Z CONTRIBUTOR simonw/sqlite-utils/pulls/118

Deletes all rows in the table (if it exists) before inserting new rows.
SQLite doesn't implement a TRUNCATE TABLE statement but does optimize an
unqualified DELETE FROM.

This can be handy if you want to refresh the entire contents of a table
but a) don't have a PK (so can't use --replace), b) don't want the table
to disappear (even briefly) for other connections, and c) have to handle
records that used to exist being deleted.

Ideally the replacement of rows would appear instantaneous to other
connections by putting the DELETE + INSERT in a transaction, but this is
very difficult without breaking other code as the current transaction
handling is inconsistent and non-systematic. There exists the
possibility for the DELETE to succeed but the INSERT to fail, leaving an
empty table. This is not much worse, however, than the current
possibility of one chunked INSERT succeeding and being committed while
the next chunked INSERT fails, leaving a partially complete operation.
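The ideal transactional version described above would look roughly like this (a sketch of the idea, not the pull request's actual code):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
conn.execute("INSERT INTO t VALUES (1, 'old')")

new_rows = [(2, "fresh"), (3, "fresher")]
with conn:  # DELETE and INSERT commit together, or not at all
    conn.execute("DELETE FROM t")  # unqualified DELETE: SQLite's fast "truncate"
    conn.executemany("INSERT INTO t VALUES (?, ?)", new_rows)

print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 2
```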

sqlite-utils 140912432 pull    
652816158 MDExOlB1bGxSZXF1ZXN0NDQ1ODMzOTA4 120 Fix query command's support for DML tsibley 79913 closed 0     1 2020-07-08T01:36:34Z 2020-07-08T05:14:04Z 2020-07-08T05:14:04Z CONTRIBUTOR simonw/sqlite-utils/pulls/120

See commit messages for details. I ran into this while investigating another feature/issue.

sqlite-utils 140912432 pull    
644161221 MDU6SXNzdWU2NDQxNjEyMjE= 117 Support for compound (composite) foreign keys simonw 9599 open 0     3 2020-06-23T21:33:42Z 2020-06-23T21:40:31Z   OWNER  

It turns out SQLite supports composite foreign keys: https://www.sqlite.org/foreignkeys.html#fk_composite

Their example looks like this:

CREATE TABLE album(
  albumartist TEXT,
  albumname TEXT,
  albumcover BINARY,
  PRIMARY KEY(albumartist, albumname)
);

CREATE TABLE song(
  songid     INTEGER,
  songartist TEXT,
  songalbum TEXT,
  songname   TEXT,
  FOREIGN KEY(songartist, songalbum) REFERENCES album(albumartist, albumname)
);

Here's what that looks like in sqlite-utils:

In [1]: import sqlite_utils                                                                                                                

In [2]: import sqlite3                                                                                                                     

In [3]: conn = sqlite3.connect(":memory:")                                                                                                 

In [4]: conn                                                                                                                               
Out[4]: <sqlite3.Connection at 0x1087186c0>

In [5]: conn.executescript(""" 
   ...: CREATE TABLE album( 
   ...:   albumartist TEXT, 
   ...:   albumname TEXT, 
   ...:   albumcover BINARY, 
   ...:   PRIMARY KEY(albumartist, albumname) 
   ...: ); 
   ...:  
   ...: CREATE TABLE song( 
   ...:   songid     INTEGER, 
   ...:   songartist TEXT, 
   ...:   songalbum TEXT, 
   ...:   songname   TEXT, 
   ...:   FOREIGN KEY(songartist, songalbum) REFERENCES album(albumartist, albumname) 
   ...: ); 
   ...: """)                                                                                                                               
Out[5]: <sqlite3.Cursor at 0x1088def10>

In [6]: db = sqlite_utils.Database(conn)                                                                                                   

In [7]: db.tables                                                                                                                          
Out[7]: 
[<Table album (albumartist, albumname, albumcover)>,
 <Table song (songid, songartist, songalbum, songname)>]

In [8]: db.tables[0].foreign_keys                                                                                                          
Out[8]: []

In [9]: db.tables[1].foreign_keys                                                                                                          
Out[9]: 
[ForeignKey(table='song', column='songartist', other_table='album', other_column='albumartist'),
 ForeignKey(table='song', column='songalbum', other_table='album', other_column='albumname')]

The table appears to have two separate foreign keys, when actually it has a single compound foreign key.

sqlite-utils 140912432 issue    
644122661 MDU6SXNzdWU2NDQxMjI2NjE= 116 Documentation for table.pks introspection property simonw 9599 closed 0     2 2020-06-23T20:27:24Z 2020-06-23T21:21:33Z 2020-06-23T21:03:14Z OWNER  

https://github.com/simonw/sqlite-utils/blob/4d9a3204361d956440307a57bd18c829a15861db/sqlite_utils/db.py#L535-L540

sqlite-utils 140912432 issue    
637889964 MDU6SXNzdWU2Mzc4ODk5NjQ= 115 Ability to execute insert/update statements with the CLI simonw 9599 closed 0     1 2020-06-12T17:01:17Z 2020-06-12T17:51:11Z 2020-06-12T17:41:10Z OWNER  
$ sqlite-utils github.db "update stars set starred_at = ''"
Traceback (most recent call last):
  File "/Users/simon/.local/bin/sqlite-utils", line 8, in <module>
    sys.exit(cli())
  File "/Users/simon/.local/pipx/venvs/sqlite-utils/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/simon/.local/pipx/venvs/sqlite-utils/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/simon/.local/pipx/venvs/sqlite-utils/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/simon/.local/pipx/venvs/sqlite-utils/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/simon/.local/pipx/venvs/sqlite-utils/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/simon/.local/pipx/venvs/sqlite-utils/lib/python3.8/site-packages/sqlite_utils/cli.py", line 673, in query
    headers = [c[0] for c in cursor.description]
TypeError: 'NoneType' object is not iterable
sqlite-utils 140912432 issue    
621286870 MDU6SXNzdWU2MjEyODY4NzA= 113 Syntactic sugar for ATTACH DATABASE simonw 9599 open 0     1 2020-05-19T21:10:00Z 2020-05-19T21:11:22Z   OWNER  

https://www.sqlite.org/lang_attach.html

Maybe something like this:

db.attach("other_db", "other_db.db")
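Under the hood that would presumably run ATTACH DATABASE; with plain sqlite3 it looks like this (db.attach is the proposed API, not an existing one; paths are illustrative):

```python
import os
import sqlite3
import tempfile

# Create a second database file to attach
other_path = os.path.join(tempfile.mkdtemp(), "other_db.db")
other = sqlite3.connect(other_path)
other.execute("CREATE TABLE pets (name TEXT)")
other.execute("INSERT INTO pets VALUES ('Cleo')")
other.commit()
other.close()

conn = sqlite3.connect(":memory:")
# The filename can be bound as a parameter; the alias cannot
conn.execute("ATTACH DATABASE ? AS other_db", (other_path,))
print(conn.execute("SELECT name FROM other_db.pets").fetchone()[0])  # Cleo
```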
sqlite-utils 140912432 issue    
610517472 MDU6SXNzdWU2MTA1MTc0NzI= 103 sqlite3.OperationalError: too many SQL variables in insert_all when using rows with varying numbers of columns b0b5h4rp13 32605365 closed 0     8 2020-05-01T02:26:14Z 2020-05-14T00:18:57Z 2020-05-14T00:18:57Z CONTRIBUTOR  

If you use insert_all to insert 1000 rows of data with a varying number of columns, it fails with sqlite3.OperationalError: too many SQL variables when the number of columns is larger in later records (past the first row).

I've reduced SQLITE_MAX_VARS by 100 to 899 at the top of db.py to add wiggle room, so that if the column count increases it won't go past SQLite's batch limit as calculated by this line of code, based on the count of the first row's dict keys:

    batch_size = max(1, min(batch_size, SQLITE_MAX_VARS // num_columns))
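An alternative fix is to size batches from the widest record rather than only the first one; a sketch (safe_batch_size is an illustrative name, and SQLITE_MAX_VARS stands in for the library's constant):

```python
SQLITE_MAX_VARS = 999  # SQLite's default limit on bound parameters

def safe_batch_size(records, batch_size=100):
    """Size batches from the widest record, not just the first one."""
    num_columns = max(len(record) for record in records)
    return max(1, min(batch_size, SQLITE_MAX_VARS // num_columns))

# A wide row later in the list now shrinks the batch size up front
rows = [{"a": 1}, {"a": 1, "b": 2, "c": 3}]
print(safe_batch_size(rows, batch_size=500))  # 333
```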
sqlite-utils 140912432 issue    
616271236 MDU6SXNzdWU2MTYyNzEyMzY= 112 add_foreign_key(...., ignore=True) simonw 9599 open 0     4 2020-05-12T00:24:00Z 2020-05-12T00:27:24Z   OWNER  

When using this library I often find myself wanting to "add this foreign key, but only if it doesn't exist yet". The ignore=True parameter is increasingly being used for this elsewhere in the library (e.g. in create_view()).

sqlite-utils 140912432 issue    
461215118 MDU6SXNzdWU0NjEyMTUxMTg= 30 Option to open database in read-only mode simonw 9599 closed 0     1 2019-06-26T22:50:38Z 2020-05-11T19:17:17Z 2020-05-11T19:17:17Z OWNER  

Would this make it 100% safe to run reads against a database file that is being written to by another process?

sqlite-utils 140912432 issue    
615477131 MDU6SXNzdWU2MTU0NzcxMzE= 111 sqlite-utils drop-table and drop-view commands simonw 9599 closed 0     2 2020-05-10T21:10:42Z 2020-05-11T01:58:36Z 2020-05-11T00:44:26Z OWNER  

Would be useful to be able to drop views and tables from the CLI.

sqlite-utils 140912432 issue    
613755043 MDU6SXNzdWU2MTM3NTUwNDM= 110 Support decimal.Decimal type dvhthomas 134771 closed 0     6 2020-05-07T03:57:19Z 2020-05-11T01:58:20Z 2020-05-11T01:50:11Z NONE  

Decimal types in Postgres cause a failure in db.py data type selection

I have a Django app using a MoneyField, which uses a numeric(14,0) data type in Postgres (https://www.postgresql.org/docs/9.3/datatype-numeric.html). When attempting to export that table I get the following error:

$ db-to-sqlite --table isaweb_proposal "postgres://connection" test.db
....
    column_type=COLUMN_TYPE_MAPPING[column_type],
KeyError: <class 'decimal.Decimal'>

Looking at sqlite_utils/db.py around line 292 it's clear that there is no matching type for what I assume SQLAlchemy interprets as Python decimal.Decimal.

From the SQLite docs it looks like DECIMAL in other DBs are considered numeric.

I'm not quite sure if it's as simple as adding a data type to that list or if there are repercussions beyond it.

Thanks for a great tool!
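One workaround while the type is unsupported: register an adapter with sqlite3 so decimal.Decimal values are stored as text (a sketch of the idea, not db-to-sqlite's eventual fix):

```python
import decimal
import sqlite3

# Store Decimals as their exact string form rather than a lossy float
sqlite3.register_adapter(decimal.Decimal, str)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE money (amount TEXT)")
conn.execute("INSERT INTO money VALUES (?)", (decimal.Decimal("14.00"),))
print(conn.execute("SELECT amount FROM money").fetchone()[0])  # 14.00
```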

sqlite-utils 140912432 issue    
612658444 MDU6SXNzdWU2MTI2NTg0NDQ= 109 table.create_index(..., ignore=True) simonw 9599 closed 0     1 2020-05-05T14:44:21Z 2020-05-05T14:46:53Z 2020-05-05T14:46:53Z OWNER  

Option to silently do nothing if the index already exists.

sqlite-utils 140912432 issue    
611222968 MDU6SXNzdWU2MTEyMjI5Njg= 107 sqlite-utils create-view CLI command simonw 9599 closed 0     2 2020-05-02T16:15:13Z 2020-05-03T15:36:58Z 2020-05-03T15:36:37Z OWNER  

Can go with #27 - sqlite-utils create-table.

sqlite-utils 140912432 issue    
455496504 MDU6SXNzdWU0NTU0OTY1MDQ= 27 sqlite-utils create-table command simonw 9599 closed 0     8 2019-06-13T01:43:30Z 2020-05-03T15:26:15Z 2020-05-03T15:26:15Z OWNER  

Spun off from #24 - it would be useful if CLI users could create new tables (with explicit column types, not null rules and defaults) without having to insert an example record.

  • Get it working
  • Support --pk
  • Support --not-null
  • Support --default
  • Support --fk colname othertable othercol
  • Support --replace and --ignore
  • Documentation
sqlite-utils 140912432 issue    
611326701 MDU6SXNzdWU2MTEzMjY3MDE= 108 Documentation unit tests for CLI commands simonw 9599 closed 0     2 2020-05-03T03:58:42Z 2020-05-03T04:13:57Z 2020-05-03T04:13:57Z OWNER  

Have a test that ensures all CLI commands are documented.

sqlite-utils 140912432 issue    
611216862 MDU6SXNzdWU2MTEyMTY4NjI= 106 create_view(..., ignore=True, replace=True) parameters simonw 9599 closed 0     1 2020-05-02T15:45:21Z 2020-05-02T16:04:51Z 2020-05-02T16:02:10Z OWNER  

Two new parameters which specify what should happen if the view already exists. I want this for https://github.com/dogsheep/github-to-sqlite/issues/37

Here's the current create_view() implementation:

https://github.com/simonw/sqlite-utils/blob/b4d953d3ccef28bb81cea40ca165a647b59971fa/sqlite_utils/db.py#L325-L332

ignore=True will not do anything if the view exists already.

replace=True will drop and redefine the view - but only if its SQL definition differs, otherwise it will be left alone.

sqlite-utils 140912432 issue    
602569315 MDU6SXNzdWU2MDI1NjkzMTU= 102 Can't store an array or dictionary containing a bytes value simonw 9599 closed 0     0 2020-04-18T22:49:21Z 2020-05-01T20:45:45Z 2020-05-01T20:45:45Z OWNER  
In [1]: import sqlite_utils                                                     

In [2]: db = sqlite_utils.Database(memory=True)                                 

In [3]: db["t"].insert({"id": 1, "data": {"foo": b"bytes"}})                    
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-a8ab1f72c72c> in <module>
----> 1 db["t"].insert({"id": 1, "data": {"foo": b"bytes"}})

~/Dropbox/Development/sqlite-utils/sqlite_utils/db.py in insert(self, record, pk, foreign_keys, column_order, not_null, defaults, hash_id, alter, ignore, replace, extracts, conversions, columns)
    950             extracts=extracts,
    951             conversions=conversions,
--> 952             columns=columns,
    953         )
    954 

~/Dropbox/Development/sqlite-utils/sqlite_utils/db.py in insert_all(self, records, pk, foreign_keys, column_order, not_null, defaults, batch_size, hash_id, alter, ignore, replace, extracts, conversions, columns, upsert)
   1052                 for key in all_columns:
   1053                     value = jsonify_if_needed(
-> 1054                         record.get(key, None if key != hash_id else _hash(record))
   1055                     )
   1056                     if key in extracts:

~/Dropbox/Development/sqlite-utils/sqlite_utils/db.py in jsonify_if_needed(value)
   1318 def jsonify_if_needed(value):
   1319     if isinstance(value, (dict, list, tuple)):
-> 1320         return json.dumps(value)
   1321     elif isinstance(value, (datetime.time, datetime.date, datetime.datetime)):
   1322         return value.isoformat()

/usr/local/Cellar/python/3.7.4_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/__init__.py in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    229         cls is None and indent is None and separators is None and
    230         default is None and not sort_keys and not kw):
--> 231         return _default_encoder.encode(obj)
    232     if cls is None:
    233         cls = JSONEncoder

/usr/local/Cellar/python/3.7.4_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/encoder.py in encode(self, o)
    197         # exceptions aren't as detailed.  The list call should be roughly
    198         # equivalent to the PySequence_Fast that ''.join() would do.
--> 199         chunks = self.iterencode(o, _one_shot=True)
    200         if not isinstance(chunks, (list, tuple)):
    201             chunks = list(chunks)

/usr/local/Cellar/python/3.7.4_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/encoder.py in iterencode(self, o, _one_shot)
    255                 self.key_separator, self.item_separator, self.sort_keys,
    256                 self.skipkeys, _one_shot)
--> 257         return _iterencode(o, 0)
    258 
    259 def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,

/usr/local/Cellar/python/3.7.4_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/encoder.py in default(self, o)
    177 
    178         """
--> 179         raise TypeError(f'Object of type {o.__class__.__name__} '
    180                         f'is not JSON serializable')
    181 

TypeError: Object of type bytes is not JSON serializable
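One way out is to base64-encode bytes values during JSON serialization via the default= hook; a minimal sketch (encode_bytes and the tagged-dictionary shape are illustrative, not necessarily what the library settled on):

```python
import base64
import json

def encode_bytes(obj):
    """json.dumps default= hook: represent bytes as tagged base64."""
    if isinstance(obj, bytes):
        return {"$base64": True, "encoded": base64.b64encode(obj).decode("ascii")}
    raise TypeError("not serializable: {!r}".format(obj))

dumped = json.dumps({"foo": b"bytes"}, default=encode_bytes)
print(dumped)  # {"foo": {"$base64": true, "encoded": "Ynl0ZXM="}}
```

The tag makes the encoding reversible: anything reading the JSON back can spot the $base64 key and decode the value to bytes again.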
sqlite-utils 140912432 issue    
610853576 MDU6SXNzdWU2MTA4NTM1NzY= 105 "sqlite-utils views" command simonw 9599 closed 0     1 2020-05-01T16:56:11Z 2020-05-01T20:40:07Z 2020-05-01T20:38:36Z OWNER  

Similar to sqlite-utils tables. See also #104.

sqlite-utils 140912432 issue    
610853393 MDU6SXNzdWU2MTA4NTMzOTM= 104 --schema option to "sqlite-utils tables" simonw 9599 closed 0     0 2020-05-01T16:55:49Z 2020-05-01T17:12:37Z 2020-05-01T17:12:37Z OWNER  

Adds output showing the table schema.

sqlite-utils 140912432 issue    
573578548 MDU6SXNzdWU1NzM1Nzg1NDg= 89 Ability to customize columns used by extracts= feature simonw 9599 open 0     2 2020-03-01T16:54:48Z 2020-04-18T00:00:42Z   OWNER  

@simonw any thoughts on allowing extracts to specify the lookup column name? If I'm understanding the documentation right, .lookup() allows you to define the "value" column (the documentation uses name), but when you use the extracts keyword as part of .insert(), .upsert() etc. the lookup must be done against a column named "value". I have an existing lookup table that I've populated with columns "id" and "name" as opposed to "id" and "value", and it seems I can't use extracts=, unless I'm missing something...

Initial thought on how to do this would be to allow the dictionary value to be a (table name, column name) tuple... so:

table = db.table("trees", extracts={"species_id": ("Species", "name")})

I haven't dug too much into the existing code yet, but does this make sense? Worth doing?

_Originally posted by @chrishas35 in https://github.com/simonw/sqlite-utils/issues/46#issuecomment-592999503_

sqlite-utils 140912432 issue    
601392318 MDU6SXNzdWU2MDEzOTIzMTg= 101 README should include an example of CLI data insertion simonw 9599 closed 0     0 2020-04-16T19:45:37Z 2020-04-17T23:59:49Z 2020-04-17T23:59:49Z OWNER  

Maybe using curl from the GitHub API.

sqlite-utils 140912432 issue    
601358649 MDU6SXNzdWU2MDEzNTg2NDk= 100 Mechanism for forcing column-type, over-riding auto-detection simonw 9599 closed 0     3 2020-04-16T19:12:52Z 2020-04-17T23:53:32Z 2020-04-17T23:53:32Z OWNER  

As seen in https://github.com/dogsheep/github-to-sqlite/issues/27#issuecomment-614843406 - there's a problem where you insert a record with a None value for a column and that column is created as TEXT - but actually you intended it to be an INT (as later examples will demonstrate).

Some kind of mechanism for over-riding the detected types of columns would be useful here.

sqlite-utils 140912432 issue    
549287310 MDU6SXNzdWU1NDkyODczMTA= 76 order_by mechanism metab0t 10501166 closed 0     4 2020-01-14T02:06:03Z 2020-04-16T06:23:29Z 2020-04-16T03:13:06Z NONE  

In some cases, I want to iterate rows in a table with an ORDER BY clause. It would be nice to have a rows_order_by function similar to rows_where.
In a more general case, a rows_filter function might be added to allow more customized filtering when iterating rows.

sqlite-utils 140912432 issue    
593751293 MDU6SXNzdWU1OTM3NTEyOTM= 97 Adding a "recreate" flag to the `Database` constructor betatim 1448859 closed 0     4 2020-04-04T05:41:10Z 2020-04-15T14:29:31Z 2020-04-13T03:52:29Z NONE  

I have a script that imports data into a sqlite DB. When I re-run that script I'd like to remove the existing sqlite DB, instead of adding to it. The pragmatic answer is to add the check and file deletion to my script.

However I thought it would be easy and useful for others to add a recreate=True flag to db = sqlite_utils.Database("binder-launches.db"). After taking a look at the code for it I am not so sure any more. This is because the connection string could be a URL (or "connection string") like "file:///tmp/foo.db". I don't know what the equivalent of os.path.exists() is for a connection string or how to detect that something is a connection string and raise an error "can't use recreate=True and conn_string at the same time".

Does anyone have an idea/suggestion where to start investigating?

sqlite-utils 140912432 issue    
597671518 MDU6SXNzdWU1OTc2NzE1MTg= 98 Only set .last_rowid and .last_pk for single update/inserts, not for .insert_all()/.upsert_all() with multiple records simonw 9599 closed 0     6 2020-04-10T03:19:40Z 2020-04-13T03:29:15Z 2020-04-13T03:29:15Z OWNER   sqlite-utils 140912432 issue    
598640234 MDU6SXNzdWU1OTg2NDAyMzQ= 99 .upsert_all() should maybe error if dictionaries passed to it do not have the same keys simonw 9599 closed 0     2 2020-04-13T03:02:25Z 2020-04-13T03:05:20Z 2020-04-13T03:05:04Z OWNER  

While investigating #98 I stumbled across this:

    def test_upsert_compound_primary_key(fresh_db):
        table = fresh_db["table"]
        table.upsert_all(
            [
                {"species": "dog", "id": 1, "name": "Cleo", "age": 4},
                {"species": "cat", "id": 1, "name": "Catbag"},
            ],
            pk=("species", "id"),
        )
        table.upsert_all(
            [
                {"species": "dog", "id": 1, "age": 5},
                {"species": "dog", "id": 2, "name": "New Dog", "age": 1},
            ],
            pk=("species", "id"),
        )
>       assert [
            {"species": "dog", "id": 1, "name": "Cleo", "age": 5},
            {"species": "cat", "id": 1, "name": "Catbag", "age": None},
            {"species": "dog", "id": 2, "name": "New Dog", "age": 1},
        ] == list(table.rows)
E       AssertionError: assert [{'age': 5, '...cies': 'dog'}] == [{'age': 5, '...cies': 'dog'}]
E         At index 0 diff: {'species': 'dog', 'id': 1, 'name': 'Cleo', 'age': 5} != {'species': 'dog', 'id': 1, 'name': None, 'age': 5}
E         Full diff:
E         - [{'age': 5, 'id': 1, 'name': 'Cleo', 'species': 'dog'},
E         ?                              ^^^ --
E         + [{'age': 5, 'id': 1, 'name': None, 'species': 'dog'},
E         ?                              ^^^
E         {'age': None, 'id': 1, 'name': 'Catbag', 'species': 'cat'},
E         {'age': 1, 'id': 2, 'name': 'New Dog', 'species': 'dog'}]

If you run .upsert_all() with multiple dictionaries it doesn't quite have the effect you might expect.

sqlite-utils 140912432 issue    
589801352 MDExOlB1bGxSZXF1ZXN0Mzk1MjU4Njg3 96 Add type conversion for Panda's Timestamp b0b5h4rp13 32605365 closed 0     2 2020-03-29T14:13:09Z 2020-03-31T04:40:49Z 2020-03-31T04:40:48Z CONTRIBUTOR simonw/sqlite-utils/pulls/96

Add type conversion for pandas' Timestamp, if the pandas library is present in the system
(thanks for this project, I was about to do the same thing from scratch)

sqlite-utils 140912432 pull    
586486367 MDU6SXNzdWU1ODY0ODYzNjc= 95 Columns with only null values are no longer created in the database simonw 9599 closed 0     0 2020-03-23T20:07:42Z 2020-03-23T20:31:15Z 2020-03-23T20:31:15Z OWNER  

Bug introduced in #94, and released in 2.4.3.

sqlite-utils 140912432 issue    
586477757 MDU6SXNzdWU1ODY0Nzc3NTc= 94 If column data is a mixture of integers and nulls, detected type should be INTEGER simonw 9599 closed 0     0 2020-03-23T19:51:46Z 2020-03-23T19:57:10Z 2020-03-23T19:57:10Z OWNER  

It looks like the detected type for that case is TEXT at the moment.

sqlite-utils 140912432 issue    
471818939 MDU6SXNzdWU0NzE4MTg5Mzk= 48 Jupyter notebook demo of the library, launchable on Binder simonw 9599 open 0     0 2019-07-23T17:05:05Z 2020-03-21T15:21:46Z   OWNER   sqlite-utils 140912432 issue    
581795570 MDU6SXNzdWU1ODE3OTU1NzA= 93 Support more string values for types in .add_column() simonw 9599 open 0     0 2020-03-15T19:32:49Z 2020-03-16T18:15:42Z   OWNER  

https://sqlite-utils.readthedocs.io/en/2.4.2/python-api.html#adding-columns says:

SQLite types you can specify are "TEXT", "INTEGER", "FLOAT" or "BLOB".

As discovered in #92 this isn't the right list of values. I should expand this to match https://www.sqlite.org/datatype3.html

sqlite-utils 140912432 issue    
581339961 MDU6SXNzdWU1ODEzMzk5NjE= 92 .columns_dict doesn't work for all possible column types simonw 9599 closed 0     7 2020-03-14T19:30:35Z 2020-03-15T18:37:43Z 2020-03-14T20:04:14Z OWNER  

Got this error:

  File ".../python3.7/site-packages/sqlite_utils/db.py", line 462, in <dictcomp>
    for column in self.columns
KeyError: 'REAL'

.columns_dict uses REVERSE_COLUMN_TYPE_MAPPING:
https://github.com/simonw/sqlite-utils/blob/43f1c6ab4e3a6b76531fb6f5447adb83d26f3971/sqlite_utils/db.py#L457-L463
REVERSE_COLUMN_TYPE_MAPPING defines FLOAT, not REAL:
https://github.com/simonw/sqlite-utils/blob/43f1c6ab4e3a6b76531fb6f5447adb83d26f3971/sqlite_utils/db.py#L68-L74

sqlite-utils 140912432 issue    
577302229 MDU6SXNzdWU1NzczMDIyMjk= 91 Enable ordering FTS results by rank slygent 416374 open 0     0 2020-03-07T08:43:51Z 2020-03-07T08:43:51Z   NONE  

According to https://www.sqlite.org/fts5.html (not sure about FTS4) results can be sorted by relevance. At the moment results are returned in rowid order by default. Perhaps a flag can be added to the search method?
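FTS5 exposes relevance through its built-in rank column (bm25 under the hood); trying it directly, assuming FTS5 is available:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(body)")
conn.executemany(
    "INSERT INTO docs VALUES (?)",
    [("pelican pelican pelican",), ("one pelican here",)],
)
# ORDER BY rank returns the best bm25 matches first
rows = conn.execute(
    "SELECT body FROM docs WHERE docs MATCH 'pelican' ORDER BY rank"
).fetchall()
print(rows[0][0])  # pelican pelican pelican
```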

sqlite-utils 140912432 issue    
573740712 MDU6SXNzdWU1NzM3NDA3MTI= 90 Cannot .enable_fts() for columns with spaces in their names simonw 9599 closed 0     0 2020-03-02T06:06:03Z 2020-03-02T06:10:49Z 2020-03-02T06:10:49Z OWNER  
import sqlite_utils
db = sqlite_utils.Database(memory=True)                                 
db["test"].insert({"space in name": "hello"})                           
db["test"].enable_fts(["space in name"])                                
---------------------------------------------------------------------------
OperationalError                          Traceback (most recent call last)
<ipython-input-8-ce4b87dd1c7a> in <module>
----> 1 db['test'].enable_fts(["space in name"])

/usr/local/lib/python3.7/site-packages/sqlite_utils/db.py in enable_fts(self, columns, fts_version, create_triggers)
    755         )
    756         self.db.conn.executescript(sql)
--> 757         self.populate_fts(columns)
    758 
    759         if create_triggers:

/usr/local/lib/python3.7/site-packages/sqlite_utils/db.py in populate_fts(self, columns)
    787             table=self.name, columns=", ".join(columns)
    788         )
--> 789         self.db.conn.executescript(sql)
    790         return self
    791 

OperationalError: near "in": syntax error
sqlite-utils 140912432 issue    
471780443 MDU6SXNzdWU0NzE3ODA0NDM= 46 extracts= option for insert/update/etc simonw 9599 closed 0     3 2019-07-23T15:55:46Z 2020-03-01T16:53:40Z 2019-07-23T17:00:44Z OWNER  

Relates to #42 and #44. I want the ability to extract values out into lookup tables during bulk insert/upsert operations.

db.insert_all(rows, extracts=["species"])

  • creates species table for values in the species column

db.insert_all(rows, extracts={"species": "Species"})

  • as above but the new table is called Species.
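The desired behavior can be hand-rolled today with an insert-or-fetch helper; a sketch of what extracts= would automate (hypothetical lookup() helper, simplified schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE species (id INTEGER PRIMARY KEY, value TEXT UNIQUE);
    CREATE TABLE animals (
        id INTEGER PRIMARY KEY, name TEXT,
        species INTEGER REFERENCES species(id)
    );
""")

def lookup(conn, table, value):
    # Insert-or-fetch the lookup row, returning its integer id
    conn.execute("INSERT OR IGNORE INTO [{}] (value) VALUES (?)".format(table), (value,))
    return conn.execute(
        "SELECT id FROM [{}] WHERE value = ?".format(table), (value,)
    ).fetchone()[0]

rows = [{"name": "Cleo", "species": "dog"}, {"name": "Pancakes", "species": "dog"}]
for row in rows:
    conn.execute(
        "INSERT INTO animals (name, species) VALUES (?, ?)",
        (row["name"], lookup(conn, "species", row["species"])),
    )
species_count = conn.execute("SELECT COUNT(*) FROM species").fetchone()[0]
fk_ids = [r[0] for r in conn.execute("SELECT species FROM animals ORDER BY id")]
```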
sqlite-utils 140912432 issue    
571805300 MDU6SXNzdWU1NzE4MDUzMDA= 88 table.disable_fts() method and "sqlite-utils disable-fts ..." command simonw 9599 closed 0     5 2020-02-27T04:00:50Z 2020-02-27T04:40:44Z 2020-02-27T04:40:44Z OWNER  

This would make it easier to iterate on the FTS configuration for a database without having to wipe and recreate the database each time.
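A disable_fts() would roughly need to drop the _fts virtual table (its shadow tables go with it) plus any triggers created by enable_fts(create_triggers=True). A sketch; the _ai/_ad/_au trigger names are an assumption about the naming convention, not confirmed from the source:

```python
import sqlite3

def disable_fts(conn, table):
    # Dropping the FTS virtual table also drops its shadow tables;
    # trigger names follow the convention assumed from enable_fts
    conn.executescript("""
        DROP TABLE IF EXISTS [{table}_fts];
        DROP TRIGGER IF EXISTS [{table}_ai];
        DROP TRIGGER IF EXISTS [{table}_ad];
        DROP TRIGGER IF EXISTS [{table}_au];
    """.format(table=table))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (body TEXT)")
conn.execute("CREATE VIRTUAL TABLE [docs_fts] USING fts4 (body, content=[docs])")
disable_fts(conn, "docs")
remaining_tables = [
    r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
]
```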

sqlite-utils 140912432 issue    
539204432 MDU6SXNzdWU1MzkyMDQ0MzI= 70 Implement ON DELETE and ON UPDATE actions for foreign keys LucasElArruda 26292069 open 0     2 2019-12-17T17:19:10Z 2020-02-27T04:18:53Z   NONE  

Hi! I did not find any mention on the library about ON DELETE and ON UPDATE actions for foreign keys. Are those expected to be implemented? If not, it would be a nice thing to include!
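For reference, this is what the feature would need to emit: ON DELETE / ON UPDATE clauses in the foreign key definition, which SQLite only enforces once the foreign_keys pragma is on. A sketch of the desired end state in plain sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES authors(id) ON DELETE CASCADE
    );
    INSERT INTO authors VALUES (1, 'Ursula K. Le Guin');
    INSERT INTO books VALUES (1, 'The Dispossessed', 1);
""")
# Deleting the author cascades to the referencing book row
conn.execute("DELETE FROM authors WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]
```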

sqlite-utils 140912432 issue    
559197745 MDU6SXNzdWU1NTkxOTc3NDU= 82 Tutorial command no longer works petey284 10350886 closed 0     3 2020-02-03T16:36:11Z 2020-02-27T04:16:43Z 2020-02-27T04:16:30Z NONE  

Issue with command on tutorial on Simon's site.

The following command no longer works, and breaks with the previous too many variables error: #50

> curl "https://data.nasa.gov/resource/y77d-th95.json" | \
    sqlite-utils insert meteorites.db meteorites - --pk=id

Output:

Traceback (most recent call last):
  File "continuum\miniconda3\envs\main\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "continuum\miniconda3\envs\main\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "Continuum\miniconda3\envs\main\Scripts\sqlite-utils.exe\__main__.py", line 9, in <module>
  File "continuum\miniconda3\envs\main\lib\site-packages\click\core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "continuum\miniconda3\envs\main\lib\site-packages\click\core.py", line 717, in main
    rv = self.invoke(ctx)
  File "continuum\miniconda3\envs\main\lib\site-packages\click\core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "continuum\miniconda3\envs\main\lib\site-packages\click\core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "continuum\miniconda3\envs\main\lib\site-packages\click\core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "continuum\miniconda3\envs\main\lib\site-packages\sqlite_utils\cli.py", line 434, in insert
    default=default,
  File "continuum\miniconda3\envs\main\lib\site-packages\sqlite_utils\cli.py", line 384, in insert_upsert_implementation
    docs, pk=pk, batch_size=batch_size, alter=alter, **extra_kwargs
  File "continuum\miniconda3\envs\main\lib\site-packages\sqlite_utils\db.py", line 1081, in insert_all
    result = self.db.conn.execute(query, params)
sqlite3.OperationalError: too many SQL variables

My thought is that maybe the dataset grew over the last few years, so this issue didn't surface before.

No error when I reduce the count of entries to 83. Once the number of entries hits 84 the command fails.

// This passes

type meteorite_83.txt | sqlite-utils insert meteorites.db meteorites - --pk=id

// But this fails

type meteorite_84.txt | sqlite-utils insert meteorites.db meteorites - --pk=id

A potential fix might be to chunk the incoming data? I can work on a PR if pointed in right direction.

sqlite-utils 140912432 issue    
564579430 MDU6SXNzdWU1NjQ1Nzk0MzA= 86 Problem with square bracket in CSV column name foscoj 8149512 closed 0     7 2020-02-13T10:19:57Z 2020-02-27T04:16:08Z 2020-02-27T04:16:07Z NONE  

While testing some data from the European power information site (entsoe.eu), I found that the title of a CSV column contains square brackets.
As I am playing with Glitch, sqlite-utils is used for creating the db.

Traceback (most recent call last):
  File "/app/.local/bin/sqlite-utils", line 8, in <module>
    sys.exit(cli())
  File "/app/.local/lib/python3.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/app/.local/lib/python3.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/app/.local/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/app/.local/lib/python3.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/app/.local/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/app/.local/lib/python3.7/site-packages/sqlite_utils/cli.py", line 434, in insert
    default=default,
  File "/app/.local/lib/python3.7/site-packages/sqlite_utils/cli.py", line 384, in insert_upsert_implementation
    docs, pk=pk, batch_size=batch_size, alter=alter, **extra_kwargs
  File "/app/.local/lib/python3.7/site-packages/sqlite_utils/db.py", line 997, in insert_all
    extracts=extracts,
  File "/app/.local/lib/python3.7/site-packages/sqlite_utils/db.py", line 618, in create
    extracts=extracts,
  File "/app/.local/lib/python3.7/site-packages/sqlite_utils/db.py", line 310, in create_table
    self.conn.execute(sql)
sqlite3.OperationalError: unrecognized token: "]"

entsoe_2016.csv

renamed to txt for uploading compatibility

entsoe_2016.txt

code is remixed directly from your https://glitch.com/edit/#!/datasette-csvs repo
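The underlying problem is that [bracket] quoting cannot represent a name containing ], while SQLite's standard double-quote quoting can handle any column name. A sketch of such an escaping helper (hypothetical, not the library's actual fix):

```python
import sqlite3

def quote_identifier(name):
    # Double-quote quoting works for any identifier;
    # embedded double quotes are escaped by doubling them
    return '"{}"'.format(name.replace('"', '""'))

conn = sqlite3.connect(":memory:")
col = "Price [MW]"  # bracketed unit, in the style of the entsoe.eu CSV headers
conn.execute("CREATE TABLE t ({} TEXT)".format(quote_identifier(col)))
conn.execute("INSERT INTO t VALUES (?)", ("42",))
value = conn.execute("SELECT {} FROM t".format(quote_identifier(col))).fetchone()[0]
```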

sqlite-utils 140912432 issue    
565837965 MDU6SXNzdWU1NjU4Mzc5NjU= 87 Should detect collections.OrderedDict as a regular dictionary simonw 9599 closed 0     2 2020-02-16T02:06:34Z 2020-02-16T02:20:59Z 2020-02-16T02:20:59Z OWNER  
  File "...python3.7/site-packages/sqlite_utils/db.py", line 292, in create_table
    column_type=COLUMN_TYPE_MAPPING[column_type],
KeyError: <class 'collections.OrderedDict'>
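The fix amounts to looking up column types with isinstance() instead of keying on the exact class, so dict subclasses such as OrderedDict resolve to the dict entry. A minimal sketch with a trimmed-down mapping (not the library's full COLUMN_TYPE_MAPPING):

```python
from collections import OrderedDict

COLUMN_TYPE_MAPPING = {str: "TEXT", int: "INTEGER", float: "REAL", dict: "TEXT"}

def column_type_for(value):
    # isinstance-based lookup: OrderedDict (and any other dict subclass)
    # matches the dict entry instead of raising KeyError
    for python_type, sql_type in COLUMN_TYPE_MAPPING.items():
        if isinstance(value, python_type):
            return sql_type
    raise KeyError(type(value))

t = column_type_for(OrderedDict(a=1))
```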
sqlite-utils 140912432 issue    
562911863 MDU6SXNzdWU1NjI5MTE4NjM= 85 Create index doesn't work for columns containing spaces simonw 9599 closed 0     1 2020-02-11T00:34:46Z 2020-02-11T05:13:20Z 2020-02-11T05:13:20Z OWNER   sqlite-utils 140912432 issue    
559374410 MDU6SXNzdWU1NTkzNzQ0MTA= 83 Make db["table"].exists a documented API simonw 9599 closed 0     1 2020-02-03T22:31:44Z 2020-02-08T23:58:35Z 2020-02-08T23:56:23Z OWNER  

Right now it's a static thing which might get out-of-sync with the database. It should probably be a live check. Maybe call it .exists() instead?

sqlite-utils 140912432 issue    
561460274 MDU6SXNzdWU1NjE0NjAyNzQ= 84 .upsert() with hash_id throws error simonw 9599 closed 0     0 2020-02-07T07:08:19Z 2020-02-07T07:17:11Z 2020-02-07T07:17:11Z OWNER  
db[table_name].upsert_all(rows, hash_id="pk")

This throws an error: PrimaryKeyRequired('upsert() requires a pk')

The problem is, if you try this:

db[table_name].upsert_all(rows, hash_id="pk", pk="pk")

You get this error: AssertionError('Use either pk= or hash_id=')

hash_id= should imply that pk= is that column.

sqlite-utils 140912432 issue    
558600274 MDU6SXNzdWU1NTg2MDAyNzQ= 81 Remove .detect_column_types() from table, make it a documented API simonw 9599 closed 0     4 2020-02-01T21:25:54Z 2020-02-01T21:55:35Z 2020-02-01T21:55:35Z OWNER  

I used it in geojson-to-sqlite here: https://github.com/simonw/geojson-to-sqlite/blob/f10e44264712dd59ae7dfa2e6fd5a904b682fb33/geojson_to_sqlite/utils.py#L45-L50

It would make more sense for this method to live on the Database rather than the Table - or even to exist as a separate utility method entirely.

Then it should be documented.

sqlite-utils 140912432 issue    
545407916 MDU6SXNzdWU1NDU0MDc5MTY= 73 upsert_all() throws issue when upserting to empty table psychemedia 82988 closed 0     6 2020-01-05T11:58:57Z 2020-01-31T14:21:09Z 2020-01-05T17:20:18Z NONE  

If I try to add a list of dicts to an empty table using upsert_all, I get an error:

import sqlite3
from sqlite_utils import Database
import pandas as pd

conx = sqlite3.connect(':memory:')
cx = conx.cursor()
cx.executescript('CREATE TABLE "test" ("Col1" TEXT);')

q="SELECT * FROM test;"
pd.read_sql(q, conx) #shows empty table

db = Database(conx)
db['test'].upsert_all([{'Col1':'a'},{'Col1':'b'}])

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-74-8c26d93d7587> in <module>
      1 db = Database(conx)
----> 2 db['test'].upsert_all([{'Col1':'a'},{'Col1':'b'}])

/usr/local/lib/python3.7/site-packages/sqlite_utils/db.py in upsert_all(self, records, pk, foreign_keys, column_order, not_null, defaults, batch_size, hash_id, alter, extracts)
   1157             alter=alter,
   1158             extracts=extracts,
-> 1159             upsert=True,
   1160         )
   1161 

/usr/local/lib/python3.7/site-packages/sqlite_utils/db.py in insert_all(self, records, pk, foreign_keys, column_order, not_null, defaults, batch_size, hash_id, alter, ignore, replace, extracts, upsert)
   1040                     sql = "INSERT OR IGNORE INTO [{table}]({pks}) VALUES({pk_placeholders});".format(
   1041                         table=self.name,
-> 1042                         pks=", ".join(["[{}]".format(p) for p in pks]),
   1043                         pk_placeholders=", ".join(["?" for p in pks]),
   1044                     )

TypeError: 'NoneType' object is not iterable

A hacky workaround in use is:

try:
  db['test'].upsert_all([{'Col1':'a'},{'Col1':'b'}])
except:
  db['test'].insert_all([{'Col1':'a'},{'Col1':'b'}])
sqlite-utils 140912432 issue    
557892819 MDExOlB1bGxSZXF1ZXN0MzY5Mzk0MDQz 80 on_create mechanism for after table creation simonw 9599 closed 0     5 2020-01-31T03:38:48Z 2020-01-31T05:08:04Z 2020-01-31T05:08:04Z OWNER simonw/sqlite-utils/pulls/80

I need this for geojson-to-sqlite, in particular https://github.com/simonw/geojson-to-sqlite/issues/6

sqlite-utils 140912432 pull    
557842245 MDU6SXNzdWU1NTc4NDIyNDU= 79 Helper methods for working with SpatiaLite simonw 9599 open 0     0 2020-01-31T00:39:19Z 2020-01-31T00:39:19Z   OWNER  

As demonstrated by this piece of documentation, using SpatiaLite with sqlite-utils requires a fair bit of boilerplate:
https://github.com/simonw/sqlite-utils/blob/f7289174e66ae4d91d57de94bbd9d09fabf7aff4/docs/python-api.rst#L880-L909

sqlite-utils 140912432 issue    
557825032 MDU6SXNzdWU1NTc4MjUwMzI= 77 Ability to insert data that is transformed by a SQL function simonw 9599 closed 0     2 2020-01-30T23:45:55Z 2020-01-31T00:34:02Z 2020-01-31T00:24:32Z OWNER  

I want to be able to run the equivalent of this SQL insert:

# Convert to "Well Known Text" format
wkt = shape(geojson['geometry']).wkt
# Insert and commit the record
conn.execute("INSERT INTO places (id, name, geom) VALUES(null, ?, GeomFromText(?, 4326))", (
   "Wales", wkt
))
conn.commit()

From the Datasette SpatiaLite docs: https://datasette.readthedocs.io/en/stable/spatialite.html

To do this, I need a way of telling sqlite-utils that a specific column should be wrapped in GeomFromText(?, 4326).

sqlite-utils 140912432 issue    
557830332 MDExOlB1bGxSZXF1ZXN0MzY5MzQ4MDg0 78 New conversions= feature, refs #77 simonw 9599 closed 0     0 2020-01-31T00:02:33Z 2020-01-31T00:24:31Z 2020-01-31T00:24:31Z OWNER simonw/sqlite-utils/pulls/78 sqlite-utils 140912432 pull    
546078359 MDExOlB1bGxSZXF1ZXN0MzU5ODIyNzcz 75 Explicitly include tests and docs in sdist jayvdb 15092 closed 0     1 2020-01-07T04:53:20Z 2020-01-31T00:21:27Z 2020-01-31T00:21:27Z CONTRIBUTOR simonw/sqlite-utils/pulls/75

Also exclude 'tests' from runtime installation.

sqlite-utils 140912432 pull    
546073980 MDU6SXNzdWU1NDYwNzM5ODA= 74 Test failures on openSUSE 15.1: AssertionError: Explicit other_table and other_column jayvdb 15092 open 0     3 2020-01-07T04:35:50Z 2020-01-12T07:21:17Z   CONTRIBUTOR  

openSUSE 15.1 uses Python 3.6.5 and click 7.0; it has test failures, while openSUSE Tumbleweed on py37 passes.

Most fail on the CLI exit code, like:

[   74s] =================================== FAILURES ===================================
[   74s] _________________________________ test_tables __________________________________
[   74s] 
[   74s] db_path = '/tmp/pytest-of-abuild/pytest-0/test_tables0/test.db'
[   74s] 
[   74s]     def test_tables(db_path):
[   74s]         result = CliRunner().invoke(cli.cli, ["tables", db_path])
[   74s] >       assert '[{"table": "Gosh"},\n {"table": "Gosh2"}]' == result.output.strip()
[   74s] E       assert '[{"table": "...e": "Gosh2"}]' == ''
[   74s] E         - [{"table": "Gosh"},
[   74s] E         -  {"table": "Gosh2"}]
[   74s] 
[   74s] tests/test_cli.py:28: AssertionError

packaging project at https://build.opensuse.org/package/show/home:jayvdb:py-new/python-sqlite-utils

I'll keep digging into this after I have github-to-sqlite working on Tumbleweed, as I'll need openSUSE Leap 15.1 working before I can submit this into the main python repo.

sqlite-utils 140912432 issue    
521868864 MDU6SXNzdWU1MjE4Njg4NjQ= 66 The ".upsert()" method is misnamed simonw 9599 closed 0     15 2019-11-12T23:48:28Z 2019-12-31T01:30:21Z 2019-12-31T01:30:20Z OWNER  

This thread here is illuminating: https://stackoverflow.com/questions/3634984/insert-if-not-exists-else-update

The term UPSERT in SQLite has a specific meaning as-of 3.24.0 (2018-06-04): https://www.sqlite.org/lang_UPSERT.html

It means "behave as an UPDATE or a no-op if the INSERT would violate a uniqueness constraint". The syntax in 3.24.0+ looks like this (confusingly it does not use the term "upsert"):

INSERT INTO phonebook(name,phonenumber) VALUES('Alice','704-555-1212')
  ON CONFLICT(name) DO UPDATE SET phonenumber=excluded.phonenumber

Here's the problem: the sqlite-utils .upsert() and .upsert_all() methods don't do this. They use the following SQL:

INSERT OR REPLACE INTO [{table}] ({columns}) VALUES {rows};

If the record already exists, it will be entirely replaced by a new record - as opposed to updating any specified fields but leaving existing fields as they are (the behaviour of "upsert" in SQLite itself).
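The difference is easy to observe in plain sqlite3 (the ON CONFLICT form needs SQLite 3.24.0+): INSERT OR REPLACE discards columns you did not supply, while a true upsert preserves them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE phonebook (name TEXT PRIMARY KEY, phonenumber TEXT, notes TEXT)"
)
conn.execute("INSERT INTO phonebook VALUES ('Alice', '704-555-1212', 'VIP')")

# INSERT OR REPLACE: the whole row is replaced, so notes is lost
conn.execute(
    "INSERT OR REPLACE INTO phonebook (name, phonenumber) VALUES ('Alice', '555-0000')"
)
notes_after_replace = conn.execute(
    "SELECT notes FROM phonebook WHERE name = 'Alice'"
).fetchone()[0]

# True UPSERT: only the listed column is updated, notes survives
conn.execute("UPDATE phonebook SET notes = 'VIP' WHERE name = 'Alice'")
conn.execute(
    "INSERT INTO phonebook (name, phonenumber) VALUES ('Alice', '555-1111') "
    "ON CONFLICT(name) DO UPDATE SET phonenumber = excluded.phonenumber"
)
row = conn.execute(
    "SELECT phonenumber, notes FROM phonebook WHERE name = 'Alice'"
).fetchone()
```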

sqlite-utils 140912432 issue    
529376481 MDExOlB1bGxSZXF1ZXN0MzQ2MjY0OTI2 67 Run tests against 3.5 too simonw 9599 closed 0     2 2019-11-27T14:20:35Z 2019-12-31T01:29:44Z 2019-12-31T01:29:43Z OWNER simonw/sqlite-utils/pulls/67 sqlite-utils 140912432 pull    
543738004 MDExOlB1bGxSZXF1ZXN0MzU3OTkyNTg4 72 Fixed implementation of upsert simonw 9599 closed 0     0 2019-12-30T05:08:05Z 2019-12-30T05:29:24Z 2019-12-30T05:29:24Z OWNER simonw/sqlite-utils/pulls/72

Refs #66

sqlite-utils 140912432 pull    
542814756 MDU6SXNzdWU1NDI4MTQ3NTY= 71 Tests are failing due to missing FTS5 simonw 9599 closed 0     3 2019-12-27T09:41:16Z 2019-12-27T09:49:37Z 2019-12-27T09:49:37Z OWNER  

https://travis-ci.com/simonw/sqlite-utils/jobs/268436167

This is a recent change: 2 months ago they worked fine.

I'm not sure what changed here. Maybe something to do with https://launchpad.net/~jonathonf/+archive/ubuntu/backports ?

sqlite-utils 140912432 issue    
534507142 MDU6SXNzdWU1MzQ1MDcxNDI= 69 Feature request: enable extensions loading aborruso 30607 open 0     1 2019-12-08T08:06:25Z 2019-12-26T20:40:21Z   NONE  

Hi, it would be great to add a parameter that enables loading any SQLite extension you need.

Something like "-ext modspatialite".

In this way your great tool would be even more comfortable and powerful.

Thank you very much

sqlite-utils 140912432 issue    
531583658 MDU6SXNzdWU1MzE1ODM2NTg= 68 Add support for porter stemming in FTS simonw 9599 open 0     0 2019-12-02T22:35:52Z 2019-12-02T22:35:52Z   OWNER  

FTS5 can have porter stemming enabled.

sqlite-utils 140912432 issue    
519039316 MDExOlB1bGxSZXF1ZXN0MzM3ODUzMzk0 65 Release 1.12.1 simonw 9599 closed 0     0 2019-11-07T04:51:29Z 2019-11-07T04:58:48Z 2019-11-07T04:58:47Z OWNER simonw/sqlite-utils/pulls/65 sqlite-utils 140912432 pull    
476413293 MDU6SXNzdWU0NzY0MTMyOTM= 52 Throws error if .insert_all() / .upsert_all() called with empty list simonw 9599 closed 0     1 2019-08-03T04:09:00Z 2019-11-07T04:32:39Z 2019-11-07T04:32:39Z OWNER  

See also https://github.com/simonw/db-to-sqlite/issues/18

sqlite-utils 140912432 issue    
519032008 MDExOlB1bGxSZXF1ZXN0MzM3ODQ3NTcz 64 test_insert_upsert_all_empty_list simonw 9599 closed 0     0 2019-11-07T04:24:45Z 2019-11-07T04:32:38Z 2019-11-07T04:32:38Z OWNER simonw/sqlite-utils/pulls/64 sqlite-utils 140912432 pull    
500783373 MDU6SXNzdWU1MDA3ODMzNzM= 62 [enhancement] Method to delete a row in python Sergeileduc 4454869 closed 0     5 2019-10-01T09:45:47Z 2019-11-04T16:30:34Z 2019-11-04T16:18:18Z NONE  

Hi !
Thanks for the lib !

Obviously, not every possible SQL query will have a dedicated method.

But I was thinking: a method to delete a row, maybe delete_where() or something (I'm terrible with names), would be useful.

I have a Database, with primary key.

For the moment, I use :

db.conn.execute(f"DELETE FROM table WHERE key = {key_id}")
db.conn.commit()

to delete a row I don't need anymore, given its primary key.

Works like a charm.

Just an idea :

table.delete_where_pkey({'key': key_id})

or something (I know, I'm terrible at naming methods...).

Pros: well, no need to write a SQL query.

Cons: WHERE normally allows many more things (operators =, <>, >, <, BETWEEN), not to mention AND, OR, etc...
The method is maybe too specific, and/or a pain to make more flexible.

Again, just a thought. Writing your own SQL works too, so...

Thanks again.
See yah.
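A sketch of what such a method could look like, parameterized so values never get interpolated into the SQL string (names here are hypothetical, not a committed API):

```python
import sqlite3

def delete_where(conn, table, where, args=()):
    # Parameterized to avoid interpolating values into the SQL string
    conn.execute("DELETE FROM [{}] WHERE {}".format(table, where), args)
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dogs (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO dogs VALUES (?, ?)", [(1, "Cleo"), (2, "Pancakes")])
delete_where(conn, "dogs", "id = ?", (1,))
count = conn.execute("SELECT COUNT(*) FROM dogs").fetchone()[0]
survivor = conn.execute("SELECT name FROM dogs").fetchone()[0]
```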

sqlite-utils 140912432 issue    
491219910 MDU6SXNzdWU0OTEyMTk5MTA= 61 importing CSV to SQLite as library witeshadow 17739 closed 0     2 2019-09-09T17:12:40Z 2019-11-04T16:25:01Z 2019-11-04T16:25:01Z NONE  

CSV can be imported to SQLite when using the CLI, but I don't see documentation for doing it when using sqlite-utils as a library.
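Pending documentation, the building blocks are all in the standard library: csv.DictReader produces exactly the kind of dictionaries that sqlite-utils's insert_all() accepts. A stdlib-only sketch of the same idea (columns default to TEXT, as with the CLI import):

```python
import csv
import io
import sqlite3

csv_text = "id,name\n1,Cleo\n2,Pancakes\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))
# With sqlite-utils, these dicts could be passed straight to
# db["dogs"].insert_all(rows); here we insert them by hand instead
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dogs (id TEXT, name TEXT)")
conn.executemany("INSERT INTO dogs VALUES (:id, :name)", rows)
count = conn.execute("SELECT COUNT(*) FROM dogs").fetchone()[0]
```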

sqlite-utils 140912432 issue    
517241040 MDU6SXNzdWU1MTcyNDEwNDA= 63 ensure_index() method simonw 9599 closed 0     1 2019-11-04T15:51:22Z 2019-11-04T16:20:36Z 2019-11-04T16:20:35Z OWNER  
db["table"].ensure_index(["col1", "col2"])

This will do the following:
- if the specified table or column does not exist, do nothing
- if they exist and already have an index, do nothing
- otherwise, create the index

I want this for tools like twitter-to-sqlite search where the search_runs table may or may not have been created yet but, if it IS created, I want to put an index on the hash column.
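A sketch of the proposed semantics: bail out silently unless the table and columns exist, then lean on CREATE INDEX IF NOT EXISTS for idempotency (the index-naming scheme here is an assumption):

```python
import sqlite3

def ensure_index(conn, table, columns):
    # Do nothing unless the table and all named columns already exist
    existing = [r[1] for r in conn.execute("PRAGMA table_info([{}])".format(table))]
    if not existing or not all(c in existing for c in columns):
        return
    name = "idx_{}_{}".format(table, "_".join(columns))
    conn.execute(
        "CREATE INDEX IF NOT EXISTS [{}] ON [{}] ({})".format(
            name, table, ", ".join("[{}]".format(c) for c in columns)
        )
    )

conn = sqlite3.connect(":memory:")
ensure_index(conn, "search_runs", ["hash"])  # table absent: silently does nothing
conn.execute("CREATE TABLE search_runs (id INTEGER PRIMARY KEY, hash TEXT)")
ensure_index(conn, "search_runs", ["hash"])
ensure_index(conn, "search_runs", ["hash"])  # second call is a no-op
index_names = [
    r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='index' AND tbl_name='search_runs'"
    )
]
```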

sqlite-utils 140912432 issue    
488338965 MDU6SXNzdWU0ODgzMzg5NjU= 59 Ability to introspect triggers simonw 9599 closed 0     0 2019-09-02T23:47:16Z 2019-09-03T01:52:36Z 2019-09-03T00:09:42Z OWNER  

Now that we're creating triggers (thanks to @amjith in #57) it would be neat if we could introspect them too.

I'm thinking:

db.triggers - lists all triggers for the database
db["tablename"].triggers - lists triggers for that table

The underlying query for this is select * from sqlite_master where type = 'trigger'

I'll return the trigger information in a new namedtuple, similar to how Indexes and ForeignKeys work.
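The underlying query can be exercised directly; a minimal sketch of the introspection, showing the fields a Trigger namedtuple would presumably wrap:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TRIGGER docs_ai AFTER INSERT ON docs BEGIN
        UPDATE docs SET body = new.body WHERE id = new.id;
    END;
""")
# sqlite_master rows for triggers carry name, owning table and SQL text
triggers = conn.execute(
    "SELECT name, tbl_name, sql FROM sqlite_master WHERE type = 'trigger'"
).fetchall()
```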

sqlite-utils 140912432 issue    
487987958 MDExOlB1bGxSZXF1ZXN0MzEzMTA1NjM0 57 Add triggers while enabling FTS amjith 49260 closed 0     4 2019-09-02T04:23:40Z 2019-09-03T01:03:59Z 2019-09-02T23:42:29Z CONTRIBUTOR simonw/sqlite-utils/pulls/57

This adds the option for a user to set up triggers in the database to keep their FTS table in sync with the parent table.

Ref: https://sqlite.org/fts5.html#external_content_and_contentless_tables

I would prefer to make the creation of triggers the default behavior, but that will break existing usage where people have been calling populate_fts after inserting new rows.

I am happy to make changes to the PR as you see fit.

sqlite-utils 140912432 pull    
488341021 MDExOlB1bGxSZXF1ZXN0MzEzMzgzMzE3 60 db.triggers and table.triggers introspection simonw 9599 closed 0     0 2019-09-03T00:04:32Z 2019-09-03T00:09:42Z 2019-09-03T00:09:42Z OWNER simonw/sqlite-utils/pulls/60

Closes #59

sqlite-utils 140912432 pull    
488293926 MDU6SXNzdWU0ODgyOTM5MjY= 58 Support enabling FTS on views amjith 49260 open 0     0 2019-09-02T18:56:36Z 2019-09-02T18:56:36Z   CONTRIBUTOR  

Right now enable_fts() is only implemented for Table(). Technically sqlite supports enabling fts on views. But it requires deeper thought since views don't have a rowid and the current implementation of enable_fts() relies on the presence of the rowid column.

It is possible to provide an alternative rowid using the content_rowid option to the FTS5() function.

Ref: https://sqlite.org/fts5.html#fts5_table_creation_and_initialization

The "content_rowid" option, used to set the rowid field of an external content table.

This will further complicate enable_fts() function by adding an extra argument. I'm wondering if that is outside the scope of this tool or should I work on that feature and send a PR?

sqlite-utils 140912432 issue    
487847945 MDExOlB1bGxSZXF1ZXN0MzEzMDA3NDgz 56 Escape the table name in populate_fts and search. amjith 49260 closed 0     2 2019-09-01T06:29:05Z 2019-09-02T17:23:21Z 2019-09-02T17:23:21Z CONTRIBUTOR simonw/sqlite-utils/pulls/56

The table names weren't escaped using double quotes in the populate_fts method.

Reproducible case:

>>> import sqlite_utils
>>> db = sqlite_utils.Database("abc.db")
>>> db["http://example.com"].insert_all([
...     {"id": 1, "age": 4, "name": "Cleo"},
...     {"id": 2, "age": 2, "name": "Pancakes"}
... ], pk="id")
<Table http://example.com (id, age, name)>
>>> db["http://example.com"].enable_fts(["name"])
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    db["http://example.com"].enable_fts(["name"])
  File "/home/amjith/.virtualenvs/itsysearch/lib/python3.7/site-packages/sqlite_utils/db.py", l
ine 705, in enable_fts
    self.populate_fts(columns)
  File "/home/amjith/.virtualenvs/itsysearch/lib/python3.7/site-packages/sqlite_utils/db.py", l
ine 715, in populate_fts
    self.db.conn.executescript(sql)
sqlite3.OperationalError: unrecognized token: ":"
>>> 
sqlite-utils 140912432 pull    
480961330 MDU6SXNzdWU0ODA5NjEzMzA= 54 Ability to list views, and to access db["view_name"].rows / rows_where / etc ftrain 20264 closed 0     5 2019-08-15T02:00:28Z 2019-08-23T12:41:09Z 2019-08-23T12:20:15Z NONE  

The docs show me how to create a view via db.create_view() but I can't seem to get back to that view post-creation; if I query it as a table it returns None, and it doesn't appear in the table listing, even though querying the view works fine from inside the sqlite3 command-line.

It'd be great to have the view as a pseudo-table, or if the python/sqlite3 module makes that hard to pull off (I couldn't figure it out), to have that edge-case documented next to the db.create_view() docs.

sqlite-utils 140912432 issue    
481887482 MDExOlB1bGxSZXF1ZXN0MzA4MjkyNDQ3 55 Ability to introspect and run queries against views simonw 9599 closed 0     1 2019-08-17T13:40:56Z 2019-08-23T12:19:42Z 2019-08-23T12:19:42Z OWNER simonw/sqlite-utils/pulls/55

See #54

sqlite-utils 140912432 pull    
449565204 MDU6SXNzdWU0NDk1NjUyMDQ= 23 Syntactic sugar for creating m2m records simonw 9599 closed 0     10 2019-05-29T02:17:48Z 2019-08-04T03:54:58Z 2019-08-04T03:37:34Z OWNER  

Python library only. What would be a syntactically pleasant way of creating a m2m record?

sqlite-utils 140912432 issue    
476436920 MDExOlB1bGxSZXF1ZXN0MzAzOTkwNjgz 53 Work in progress: m2m() method for creating many-to-many records simonw 9599 closed 0     0 2019-08-03T10:03:56Z 2019-08-04T03:38:10Z 2019-08-04T03:37:33Z OWNER simonw/sqlite-utils/pulls/53
  • table.insert({"name": "Barry"}).m2m("tags", lookup={"tag": "Coworker"})
  • Explicit table name .m2m("humans", ..., m2m_table="relationships")
  • Automatically use an existing m2m table if a single obvious candidate exists (a table with two foreign keys in the correct directions)
  • Require the explicit m2m_table= argument if multiple candidates for the m2m table exist
  • Documentation

Refs #23

sqlite-utils 140912432 pull    
462430920 MDU6SXNzdWU0NjI0MzA5MjA= 35 table.update(...) method simonw 9599 closed 0     2 2019-06-30T18:06:15Z 2019-07-28T15:43:52Z 2019-07-28T15:43:52Z OWNER  

Spun off from #23 - this method will allow a user to update a specific row.

Currently the only way to do that it is to call .upsert({full record}) with the primary key field matching an existing record - but this does not support partial updates.

db["events"].update(3, {"name": "Renamed"})

This method only works on an existing table, so there's no need for a pk="id" specifier - it can detect the primary key by looking at the table.

If the primary key is compound the first argument can be a tuple:

db["events_venues"].update((3, 2), {"custom_label": "Label"})

The method can be called without the second dictionary argument. Doing this selects the row specified by the primary key (throwing an error if it does not exist) and remembers it so that chained operations can be carried out - see proposal in https://github.com/simonw/sqlite-utils/issues/23#issuecomment-507055345
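A sketch of the proposed behavior: detect the primary key columns from PRAGMA table_info and build a partial UPDATE (hypothetical standalone helper, not the eventual implementation):

```python
import sqlite3

def update(conn, table, pk_values, updates):
    # pk_values: a single value, or a tuple for compound primary keys;
    # pk columns are detected via PRAGMA table_info (row[5] is the pk flag)
    if not isinstance(pk_values, tuple):
        pk_values = (pk_values,)
    pks = [r[1] for r in conn.execute("PRAGMA table_info([{}])".format(table)) if r[5]]
    sets = ", ".join("[{}] = ?".format(c) for c in updates)
    wheres = " AND ".join("[{}] = ?".format(c) for c in pks)
    conn.execute(
        "UPDATE [{}] SET {} WHERE {}".format(table, sets, wheres),
        list(updates.values()) + list(pk_values),
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO events VALUES (3, 'Original')")
update(conn, "events", 3, {"name": "Renamed"})
name = conn.execute("SELECT name FROM events WHERE id = 3").fetchone()[0]
```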

sqlite-utils 140912432 issue    
467862459 MDExOlB1bGxSZXF1ZXN0Mjk3NDEyNDY0 38 table.update() method simonw 9599 closed 0     2 2019-07-14T17:03:49Z 2019-07-28T15:43:51Z 2019-07-28T15:43:51Z OWNER simonw/sqlite-utils/pulls/38

Refs #35

Still to do:

  • Unit tests
  • Switch to using .get()
  • Better exceptions, plus unit tests for what happens if pk does not exist
  • Documentation
  • Ensure compound primary keys work properly
  • alter=True support
sqlite-utils 140912432 pull    
473083260 MDU6SXNzdWU0NzMwODMyNjA= 50 "Too many SQL variables" on large inserts simonw 9599 closed 0     3 2019-07-25T21:43:31Z 2019-07-28T11:59:33Z 2019-07-28T11:59:33Z OWNER  

Reported here: https://github.com/dogsheep/healthkit-to-sqlite/issues/9

It looks like there's a default limit of 999 variables - we need to be smart about that, maybe dynamically lower the batch size based on the number of columns.
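The dynamic batch size suggested above is simple arithmetic: with a 999-variable ceiling and one variable per column per row, the row budget per INSERT is 999 divided by the column count. A sketch:

```python
def batch_size_for(columns, max_vars=999):
    # SQLITE_MAX_VARIABLE_NUMBER defaults to 999 in older SQLite builds;
    # each row consumes one variable per column
    return max(1, max_vars // len(columns))

def chunks(rows, size):
    # Yield successive slices of at most `size` rows
    for i in range(0, len(rows), size):
        yield rows[i:i + size]

rows = [{"a": 1, "b": 2, "c": 3}] * 500
size = batch_size_for(["a", "b", "c"])
batches = list(chunks(rows, size))
```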

sqlite-utils 140912432 issue    
473733752 MDExOlB1bGxSZXF1ZXN0MzAxODI0MDk3 51 Fix for too many SQL variables, closes #50 simonw 9599 closed 0     1 2019-07-28T11:30:30Z 2019-07-28T11:59:32Z 2019-07-28T11:59:32Z OWNER simonw/sqlite-utils/pulls/51 sqlite-utils 140912432 pull    
472115381 MDU6SXNzdWU0NzIxMTUzODE= 49 extracts= should support multiple-column extracts simonw 9599 open 0     1 2019-07-24T07:06:41Z 2019-07-24T07:10:21Z   OWNER  

Lookup tables can be constructed on compound columns, but the extracts= option doesn't currently support that.

Right now extracts can be defined in two ways:

# Extract these columns into tables with the same name:
dogs = db.table("dogs", extracts=["breed", "most_recent_trophy"])

# Same as above but with custom table names:
dogs = db.table("dogs", extracts={"breed": "Breeds", "most_recent_trophy": "Trophies"})

Need some kind of syntax for much more complicated extractions, like when two columns (say "source" and "source_version") are extracted into a single table.

sqlite-utils 140912432 issue    
471797101 MDExOlB1bGxSZXF1ZXN0MzAwMzc3NTk5 47 extracts= table parameter simonw 9599 closed 0     0 2019-07-23T16:30:29Z 2019-07-23T17:00:43Z 2019-07-23T17:00:43Z OWNER simonw/sqlite-utils/pulls/47

Still needs docs. Refs #46

sqlite-utils 140912432 pull    
470691999 MDU6SXNzdWU0NzA2OTE5OTk= 43 .add_column() doesn't match indentation of initial creation simonw 9599 closed 0     3 2019-07-20T16:33:10Z 2019-07-23T13:09:11Z 2019-07-23T13:09:05Z OWNER  

I spotted a table which was created once and then had columns added to it and the formatted SQL looks like this:

CREATE TABLE [records] (
   [type] TEXT,
   [sourceName] TEXT,
   [sourceVersion] TEXT,
   [unit] TEXT,
   [creationDate] TEXT,
   [startDate] TEXT,
   [endDate] TEXT,
   [value] TEXT,
   [metadata_Health Mate App Version] TEXT,
   [metadata_Withings User Identifier] TEXT,
   [metadata_Modified Date] TEXT,
   [metadata_Withings Link] TEXT,
   [metadata_HKWasUserEntered] TEXT
, [device] TEXT, [metadata_HKMetadataKeyHeartRateMotionContext] TEXT, [metadata_HKDeviceManufacturerName] TEXT, [metadata_HKMetadataKeySyncVersion] TEXT, [metadata_HKMetadataKeySyncIdentifier] TEXT, [metadata_HKSwimmingStrokeStyle] TEXT, [metadata_HKVO2MaxTestType] TEXT, [metadata_HKTimeZone] TEXT, [metadata_Average HR] TEXT, [metadata_Recharge] TEXT, [metadata_Lights] TEXT, [metadata_Asleep] TEXT, [metadata_Rating] TEXT, [metadata_Energy Threshold] TEXT, [metadata_Deep Sleep] TEXT, [metadata_Nap] TEXT, [metadata_Edit Slots] TEXT, [metadata_Tags] TEXT, [metadata_Daytime HR] TEXT)

It would be nice if the columns that were added later matched the indentation of the initial columns.

sqlite-utils 140912432 issue    
471628483 MDU6SXNzdWU0NzE2Mjg0ODM= 44 Utilities for building lookup tables simonw 9599 closed 0     2 2019-07-23T10:59:58Z 2019-07-23T13:07:01Z 2019-07-23T13:07:01Z OWNER  

While building https://github.com/dogsheep/healthkit-to-sqlite I found a need for a neat mechanism for easily building lookup tables - tables where each unique value in a column is replaced by a foreign key to a separate table.

csvs-to-sqlite currently creates those with its "extract" mechanism - but that's written as custom code against Pandas. I'd like to eventually replace Pandas with sqlite-utils there.

See also #42

sqlite-utils 140912432 issue    
471684708 MDExOlB1bGxSZXF1ZXN0MzAwMjg2NTM1 45 Implemented table.lookup(...), closes #44 simonw 9599 closed 0     0 2019-07-23T13:03:30Z 2019-07-23T13:07:00Z 2019-07-23T13:07:00Z OWNER simonw/sqlite-utils/pulls/45 sqlite-utils 140912432 pull    
351845423 MDU6SXNzdWUzNTE4NDU0MjM= 3 Experiment with contentless FTS tables simonw 9599 closed 0     1 2018-08-18T19:31:01Z 2019-07-22T20:58:55Z 2019-07-22T20:58:55Z OWNER  

Could greatly reduce size of resulting database for large datasets: http://cocoamine.net/blog/2015/09/07/contentless-fts4-for-large-immutable-documents/

sqlite-utils 140912432 issue    
470345929 MDU6SXNzdWU0NzAzNDU5Mjk= 42 table.extract(...) method and "sqlite-utils extract" command simonw 9599 open 0     4 2019-07-19T14:09:36Z 2019-07-19T14:58:23Z   OWNER  

One of my favourite features of csvs-to-sqlite is that it can "extract" columns into a separate lookup table - for example:

csvs-to-sqlite big_csv_file.csv -c country output.db

This will turn the country column in the resulting table into a integer foreign key against a new country table. You can see an example of what that looks like here: https://san-francisco.datasettes.com/registered-business-locations-3d50679/Business+Corridor was extracted from https://san-francisco.datasettes.com/registered-business-locations-3d50679/Registered_Business_Locations_-_San_Francisco?Business%20Corridor=1

I'd like to have the same capability in sqlite-utils - but with the ability to run it against an existing SQLite table rather than just against a CSV.

sqlite-utils 140912432 issue    
470131537 MDU6SXNzdWU0NzAxMzE1Mzc= 41 sqlite-utils insert --tsv option simonw 9599 closed 0     0 2019-07-19T04:27:21Z 2019-07-19T04:50:47Z 2019-07-19T04:50:47Z OWNER  

Right now we only support ingesting CSV, but sometimes interesting data is released as TSV.

https://www.washingtonpost.com/national/2019/07/18/how-download-use-dea-pain-pills-database/ for example.

sqlite-utils 140912432 issue    
467928674 MDExOlB1bGxSZXF1ZXN0Mjk3NDU5Nzk3 40 .get() method plus support for compound primary keys simonw 9599 closed 0     1 2019-07-15T03:43:13Z 2019-07-15T04:28:57Z 2019-07-15T04:28:52Z OWNER simonw/sqlite-utils/pulls/40
  • Tests for the NotFoundError exception
  • Documentation for .get() method
  • Support --pk multiple times to define CLI compound primary keys
  • Documentation for compound primary keys
sqlite-utils 140912432 pull    
467864071 MDU6SXNzdWU0Njc4NjQwNzE= 39 table.get(...) method simonw 9599 closed 0     0 2019-07-14T17:20:51Z 2019-07-15T04:28:53Z 2019-07-15T04:28:53Z OWNER  

Utility method for fetching a record by its primary key.

Accepts a single value (for primary key / rowid tables) or a list/tuple of values (for compound primary keys, refs #36).

Raises a NotFoundError if the record cannot be found.
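A sketch of the proposed method: accept a single value or a tuple, detect the primary key columns from PRAGMA table_info, and raise when nothing matches (hypothetical standalone version):

```python
import sqlite3

class NotFoundError(Exception):
    pass

def get(conn, table, pk_values):
    # Accepts a single value, or a list/tuple for compound primary keys
    if not isinstance(pk_values, (list, tuple)):
        pk_values = (pk_values,)
    pks = [r[1] for r in conn.execute("PRAGMA table_info([{}])".format(table)) if r[5]]
    wheres = " AND ".join("[{}] = ?".format(c) for c in pks)
    row = conn.execute(
        "SELECT * FROM [{}] WHERE {}".format(table, wheres), tuple(pk_values)
    ).fetchone()
    if row is None:
        raise NotFoundError(pk_values)
    return row

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dogs (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO dogs VALUES (1, 'Cleo')")
row = get(conn, "dogs", 1)
try:
    get(conn, "dogs", 2)
    missing_raised = False
except NotFoundError:
    missing_raised = True
```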

sqlite-utils 140912432 issue    

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [pull_request] TEXT,
   [body] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
, [active_lock_reason] TEXT, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issues_repo]
                ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
                ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
                ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
                ON [issues] ([user]);
Powered by Datasette · Query took 70.977ms · About: github-to-sqlite