
issues


69 rows where comments = 3 and repo = 140912432 sorted by updated_at descending


type 2

  • issue 59
  • pull 10

state 2

  • closed 57
  • open 12

repo 1

  • sqlite-utils · 69
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association pull_request body repo type active_lock_reason performed_via_github_app reactions draft state_reason
1884335789 PR_kwDOCGYnMM5Zs0KB 591 Test against Python 3.12 preview simonw 9599 closed 0     3 2023-09-06T16:10:00Z 2023-11-04T00:58:03Z 2023-11-04T00:58:02Z OWNER simonw/sqlite-utils/pulls/591

https://dev.to/hugovk/help-test-python-312-beta-1508/


:books: Documentation preview :books:: https://sqlite-utils--591.org.readthedocs.build/en/591/

sqlite-utils 140912432 pull    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/591/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 1,
    "eyes": 0
}
0  
1879209560 I_kwDOCGYnMM5wAnZY 589 Mechanism for de-registering registered SQL functions simonw 9599 open 0     3 2023-09-03T19:32:39Z 2023-09-03T19:36:34Z   OWNER  

I used a custom SQL function in a migration script and then realized that it should be de-registered before the end of the script to avoid leaking into the calling code.
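There is no sqlite-utils API for this yet, but the underlying `sqlite3` module can already remove a registered function: calling `create_function()` again with `None` under the same name and arity de-registers it. A minimal sketch of the idea (the function name here is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def reverse_string(s):
    return s[::-1]

# Register a custom SQL function for the migration
conn.create_function("reverse_string", 1, reverse_string)
assert conn.execute("select reverse_string('abc')").fetchone()[0] == "cba"

# De-register it before handing the connection back, by registering None
conn.create_function("reverse_string", 1, None)
try:
    conn.execute("select reverse_string('abc')")
    leaked = True
except sqlite3.OperationalError:
    leaked = False  # the function is no longer available
```

A `db.deregister_function()` wrapper could presumably be built on exactly this mechanism.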

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/589/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1857851384 I_kwDOCGYnMM5uvI_4 587 New .add_foreign_key() can break if PRAGMA legacy_alter_table=ON and there's an invalid foreign key reference simonw 9599 closed 0     3 2023-08-19T20:01:26Z 2023-08-19T20:04:33Z 2023-08-19T20:04:32Z OWNER  

Extremely detailed story of how I got to this point:

  • https://github.com/simonw/llm/issues/162

Steps to reproduce (only if that pragma is on though):

```bash
python -c '
import sqlite_utils
db = sqlite_utils.Database(memory=True)
db.execute("""
CREATE TABLE "logs" (
   [id] INTEGER PRIMARY KEY,
   [model] TEXT,
   [prompt] TEXT,
   [system] TEXT,
   [prompt_json] TEXT,
   [options_json] TEXT,
   [response] TEXT,
   [response_json] TEXT,
   [reply_to_id] INTEGER,
   [chat_id] INTEGER REFERENCES [log]([id]),
   [duration_ms] INTEGER,
   [datetime_utc] TEXT
);
""")
db["logs"].add_foreign_key("reply_to_id", "logs", "id")
'
```

This succeeds in some environments, fails in others.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/587/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1856075668 I_kwDOCGYnMM5uoXeU 586 .transform() fails to drop column if table is part of a view simonw 9599 open 0     3 2023-08-18T05:25:22Z 2023-08-18T06:13:47Z   OWNER  

I got this error trying to drop a column from a table that was part of a SQL view:

error in view plugins: no such table: main.pypi_releases

Upon further investigation I found that this pattern seemed to fix it:

```python
def transform_the_table(conn):
    # Run this in a transaction:
    with conn:
        # We have to read all the views first, because we need to drop and recreate them
        db = sqlite_utils.Database(conn)
        views = {v.name: v.schema for v in db.views if table.lower() in v.schema.lower()}
        for view in views.keys():
            db[view].drop()
        db[table].transform(
            types=types,
            rename=rename,
            drop=drop,
            column_order=[p[0] for p in order_pairs],
        )
        # Now recreate the views
        for name, schema in views.items():
            db.create_view(name, schema)
```

So grab a copy of any view that might reference this table, start a transaction, drop those views, run the transform, recreate the views again.

I wonder if this should become an option in sqlite-utils? Maybe a recreate_views=True argument for table.transform(...)? Should it be opt-in or opt-out?

Originally posted by @simonw in https://github.com/simonw/datasette-edit-schema/issues/35#issuecomment-1683370548

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/586/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1816918185 I_kwDOCGYnMM5sS_ip 574 `prepare_connection()` plugin hook simonw 9599 closed 0     3 2023-07-22T22:52:47Z 2023-07-22T23:13:14Z 2023-07-22T22:59:10Z OWNER  

Splitting off an issue for prepare_connection() since Alex got the PR in seconds before I shipped 3.34!

Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/567#issuecomment-1646686424

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/574/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1816852402 I_kwDOCGYnMM5sSvey 569 register_command plugin hook simonw 9599 closed 0     3 2023-07-22T18:17:27Z 2023-07-22T19:19:35Z 2023-07-22T19:19:35Z OWNER  

I'm going to start by adding the register_command hook using the exact same pattern as Datasette and LLM.

Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/567#issuecomment-1646643450

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/569/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1718595700 I_kwDOCGYnMM5mb7B0 550 AttributeError: 'EntryPoints' object has no attribute 'get' for flake8 on Python 3.7 simonw 9599 closed 0     3 2023-05-21T18:24:39Z 2023-05-21T18:42:25Z 2023-05-21T18:41:58Z OWNER  

https://github.com/simonw/sqlite-utils/actions/runs/5039064797/jobs/9036965488

```
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.7.16/x64/bin/flake8", line 8, in <module>
    sys.exit(main())
  File "/opt/hostedtoolcache/Python/3.7.16/x64/lib/python3.7/site-packages/flake8/main/cli.py", line 22, in main
    app.run(argv)
  File "/opt/hostedtoolcache/Python/3.7.16/x64/lib/python3.7/site-packages/flake8/main/application.py", line 363, in run
    self._run(argv)
  File "/opt/hostedtoolcache/Python/3.7.16/x64/lib/python3.7/site-packages/flake8/main/application.py", line 350, in _run
    self.initialize(argv)
  File "/opt/hostedtoolcache/Python/3.7.16/x64/lib/python3.7/site-packages/flake8/main/application.py", line 330, in initialize
    self.find_plugins(config_finder)
  File "/opt/hostedtoolcache/Python/3.7.16/x64/lib/python3.7/site-packages/flake8/main/application.py", line 153, in find_plugins
    self.check_plugins = plugin_manager.Checkers(local_plugins.extension)
  File "/opt/hostedtoolcache/Python/3.7.16/x64/lib/python3.7/site-packages/flake8/plugins/manager.py", line 357, in __init__
    self.namespace, local_plugins=local_plugins
  File "/opt/hostedtoolcache/Python/3.7.16/x64/lib/python3.7/site-packages/flake8/plugins/manager.py", line 238, in __init__
    self._load_entrypoint_plugins()
  File "/opt/hostedtoolcache/Python/3.7.16/x64/lib/python3.7/site-packages/flake8/plugins/manager.py", line 254, in _load_entrypoint_plugins
    eps = importlib_metadata.entry_points().get(self.namespace, ())
AttributeError: 'EntryPoints' object has no attribute 'get'
Error: Process completed with exit code 1.
```

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/550/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1718586377 PR_kwDOCGYnMM5Q9cAv 549 TUI powered by Trogon simonw 9599 closed 0     3 2023-05-21T17:55:42Z 2023-05-21T18:42:00Z 2023-05-21T18:41:56Z OWNER simonw/sqlite-utils/pulls/549

Refs: - #545


:books: Documentation preview :books:: https://sqlite-utils--549.org.readthedocs.build/en/549/

sqlite-utils 140912432 pull    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/549/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
0  
1665200812 PR_kwDOCGYnMM5OKveS 537 Support self-referencing FKs in `Table.create` numist 544011 closed 0     3 2023-04-12T20:26:59Z 2023-05-08T22:45:33Z 2023-05-08T21:10:01Z CONTRIBUTOR simonw/sqlite-utils/pulls/537

:books: Documentation preview :books:: https://sqlite-utils--537.org.readthedocs.build/en/537/

sqlite-utils 140912432 pull    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/537/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
0  
1516644980 I_kwDOCGYnMM5aZip0 520 rows_from_file() raises confusing error if file-like object is not in binary mode simonw 9599 closed 0     3 2023-01-02T19:00:14Z 2023-05-08T22:08:07Z 2023-05-08T22:08:07Z OWNER  

I got this error:

```
  File "/Users/simon/Dropbox/Development/openai-to-sqlite/openai_to_sqlite/cli.py", line 27, in embeddings
    rows, _ = rows_from_file(input)
              ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/simon/.local/share/virtualenvs/openai-to-sqlite-jt4obeb2/lib/python3.11/site-packages/sqlite_utils/utils.py", line 305, in rows_from_file
    first_bytes = buffered.peek(2048).strip()
                  ^^^^^^^^^^^^^^^^^^^
```

From this code:

```python
@cli.command()
@click.argument(
    "db_path",
    type=click.Path(file_okay=True, dir_okay=False, allow_dash=False),
)
@click.option(
    "-i",
    "--input",
    type=click.File("r"),
    default="-",
)
def embeddings(db_path, input):
    "Store embeddings for one or more text documents"
    click.echo("Here is some output")
    db = sqlite_utils.Database(db_path)
    rows, _ = rows_from_file(input)
    print(list(rows))
```

The error went away when I changed it to `type=click.File("rb")`.

This should either be called out in the documentation or rows_from_file() should be fixed to handle text-mode files in addition to binary files.
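One possible shape for such a fix (a sketch, not the library's actual behaviour): detect text-mode file objects and fall back to their underlying binary stream, re-encoding in-memory text when no such stream exists:

```python
import io

def ensure_binary(fp):
    """Return a binary file-like object, wrapping text-mode inputs."""
    if isinstance(fp, io.TextIOBase):
        # Real text files expose the underlying binary stream as .buffer;
        # in-memory StringIO objects do not, so re-encode their contents
        buffer = getattr(fp, "buffer", None)
        if buffer is not None:
            return buffer
        return io.BytesIO(fp.read().encode("utf-8"))
    return fp

binary = ensure_binary(io.StringIO('{"a": 1}'))
first_bytes = binary.read(2048)
```

With a wrapper like this, `peek()` on the binary stream would work regardless of how the caller opened the file.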

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/520/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1434911255 I_kwDOCGYnMM5VhwIX 510 Cannot enable FTS5 despite it being available ar-jan 1176293 closed 0     3 2022-11-03T16:03:49Z 2022-11-18T18:37:52Z 2022-11-17T10:36:28Z NONE  

When I do sqlite-utils enable-fts my.db table_name column_name (with or without --fts5), I get an FTS4 virtual table instead of the expected FTS5.

FTS5 is however available and Python/SQLite versions do not seem to be the issue. I can manually create the FTS5 virtual table, and then Datasette also works with it from this same Python environment.

```
>>> sqlite3.version
2.6.0
>>> sqlite3.sqlite_version
3.39.4
```

PRAGMA compile_options; includes ENABLE_FTS5.

sqlite-utils, version 3.30.

Any ideas what's happening and how to fix?
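A quick stdlib check, independent of sqlite-utils, that confirms whether the SQLite library Python is actually loading can create FTS5 tables (the probe table name is illustrative):

```python
import sqlite3

def fts5_available():
    # Try to create a throwaway FTS5 table; an OperationalError means
    # the loaded SQLite library was built without ENABLE_FTS5
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create virtual table fts5_probe using fts5(content)")
        return True
    except sqlite3.OperationalError:
        return False
```

If this returns True in the same environment, the problem lies in how sqlite-utils chooses FTS4 vs FTS5 rather than in SQLite itself.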

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/510/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1450952393 I_kwDOCGYnMM5We8bJ 512 mypy failures in CI simonw 9599 closed 0     3 2022-11-16T06:22:48Z 2022-11-16T07:49:51Z 2022-11-16T07:49:50Z OWNER  

https://github.com/simonw/sqlite-utils/actions/runs/3472012235 failed on Python 3.11:

Truncated output:

```
sqlite_utils/db.py:2467: note: PEP 484 prohibits implicit Optional. Accordingly, mypy has changed its default to no_implicit_optional=True
sqlite_utils/db.py:2467: note: Use https://github.com/hauntsaninja/no_implicit_optional to automatically upgrade your codebase
sqlite_utils/db.py:2530: error: Incompatible default for argument "where" (default has type "None", argument has type "str")  [assignment]
sqlite_utils/db.py:2530: note: PEP 484 prohibits implicit Optional. Accordingly, mypy has changed its default to no_implicit_optional=True
sqlite_utils/db.py:2530: note: Use https://github.com/hauntsaninja/no_implicit_optional to automatically upgrade your codebase
sqlite_utils/db.py:2658: error: Argument 1 to "count_where" of "Queryable" has incompatible type "Optional[str]"; expected "str"  [arg-type]
Found 23 errors in 1 file (checked 51 source files)
```

Best look at https://github.com/hauntsaninja/no_implicit_optional

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/512/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1429029604 I_kwDOCGYnMM5VLULk 506 Make `cursor.rowcount` accessible (wontfix) simonw 9599 closed 0     3 2022-10-30T21:51:55Z 2022-11-01T17:37:47Z 2022-11-01T17:37:13Z OWNER  

In building this Datasette feature on top of sqlite-utils I thought it might be useful to expose the number of rows that had been affected by a bulk insert or update - the cursor.rowcount:

  • https://github.com/simonw/datasette/issues/1866

This isn't currently exposed by sqlite-utils.
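For reference, the underlying `sqlite3` cursor already tracks this; a stdlib sketch of the value a sqlite-utils API could surface (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table docs (id integer primary key, done integer)")
conn.executemany("insert into docs (done) values (?)", [(0,), (0,), (1,)])

# cursor.rowcount reports how many rows a bulk UPDATE actually touched
cursor = conn.execute("update docs set done = 1 where done = 0")
affected = cursor.rowcount
```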

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/506/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1386562662 I_kwDOCGYnMM5SpURm 493 Tiny typographical error in install/uninstall docs simonw 9599 open 0     3 2022-09-26T19:00:42Z 2022-10-25T21:31:15Z   OWNER  

Added in: - #483

I don't know how to fix this in Sphinx: I'm getting this: https://sqlite-utils.datasette.io/en/latest/cli.html#cli-install

The insert –convert and query –functions options

But I want it to display insert --convert and not insert –convert there.

Here's the code: https://github.com/simonw/sqlite-utils/blob/85247038f70d7eb2f3e272cfeaa4c44459cafba8/docs/cli.rst#L2125

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/493/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1392690202 I_kwDOCGYnMM5TAsQa 495 Support JSON values returned from .convert() functions mhalle 649467 closed 0     3 2022-09-30T16:33:49Z 2022-10-25T21:23:37Z 2022-10-25T21:23:28Z NONE  

When using the convert function on a JSON column, the result of the conversion function must be a string. If the return value is either a dict (object) or a list (array), the convert call will error out with an unhelpful user defined function exception.

It makes sense that since the original column value was a string and required conversion to data structures, the result should be converted back into a JSON string as well. However, other functions auto-convert to JSON string representation, so the fact that convert doesn't could be surprising.

At least the documentation should note this requirement, because the sqlite error messages won't readily reveal the issue.

If only SQLite's JSON column type meant something :)

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/495/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1149661489 I_kwDOCGYnMM5EhnEx 409 `with db:` for transactions simonw 9599 open 0     3 2022-02-24T19:22:06Z 2022-10-01T03:42:50Z   OWNER  

This can be a documented wrapper around `with db.conn:`.
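The standard library connection object already has the desired semantics, which is what such a wrapper would delegate to: commit on success, roll back if the block raises. A stdlib demonstration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table counts (n integer)")

# The sqlite3 connection is a context manager: it commits on
# success and rolls back if the block raises an exception
try:
    with conn:
        conn.execute("insert into counts values (1)")
        raise ValueError("simulated failure")
except ValueError:
    pass

rows_after_rollback = conn.execute("select count(*) from counts").fetchone()[0]

with conn:
    conn.execute("insert into counts values (1)")
rows_after_commit = conn.execute("select count(*) from counts").fetchone()[0]
```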

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/409/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1353074021 I_kwDOCGYnMM5QpkVl 474 Add an option for specifying column names when inserting CSV data hubgit 14294 open 0     3 2022-08-27T15:29:59Z 2022-08-31T03:42:36Z   NONE  

https://sqlite-utils.datasette.io/en/stable/cli.html#csv-files-without-a-header-row

The first row of any CSV or TSV file is expected to contain the names of the columns in that file.

If your file does not include this row, you can use the --no-headers option to specify that the tool should not use that first row as headers.

If you do this, the table will be created with column names called untitled_1 and untitled_2 and so on. You can then rename them using the sqlite-utils transform ... --rename command.

It would be nice to be able to specify the column names when importing CSV/TSV without a header row, via an extra command line option.

(renaming a column of a large table can take a long time, which makes it an inconvenient workaround)
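Until such an option exists, the equivalent is easy to express in Python, since `csv.DictReader` accepts explicit `fieldnames` for headerless files (the table and column names below are illustrative):

```python
import csv
import io
import sqlite3

data = "1,Cleo\n2,Pancakes\n"

# fieldnames= supplies the column names the missing header row would have given
reader = csv.DictReader(io.StringIO(data), fieldnames=["id", "name"])

conn = sqlite3.connect(":memory:")
conn.execute("create table chickens (id text, name text)")
conn.executemany("insert into chickens values (:id, :name)", list(reader))
name = conn.execute("select name from chickens where id = '2'").fetchone()[0]
```

A hypothetical `--columns id,name` style flag could feed the same mechanism during import, avoiding the post-hoc rename entirely.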

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/474/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1309542173 PR_kwDOCGYnMM47pwAb 455 in extract code, check equality with IS instead of = for nulls fgregg 536941 closed 0     3 2022-07-19T13:40:25Z 2022-08-27T14:45:03Z 2022-08-27T14:45:03Z CONTRIBUTOR simonw/sqlite-utils/pulls/455

SQLite's `IS` is equivalent to SQL's `IS NOT DISTINCT FROM`

closes #423

sqlite-utils 140912432 pull    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/455/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
0  
1227571375 I_kwDOCGYnMM5JK0Cv 431 Allow making m2m relation of a table to itself rafguns 738408 open 0     3 2022-05-06T08:30:43Z 2022-06-23T14:12:51Z   NONE  

I am building a database, in which one of the tables has a many-to-many relationship to itself. As far as I can see, this is not (yet) possible using .m2m() in sqlite-utils. This may be a bit of a niche use case, so feel free to close this issue if you feel it would introduce too much complexity compared to the benefits.

Example: suppose I have a table of people, and I want to store the information that John and Mary have two children, Michael and Suzy. It would be neat if I could do something like this:

```python
from sqlite_utils import Database

db = Database(memory=True)
db["people"].insert({"name": "John"}, pk="name").m2m(
    "people", [{"name": "Michael"}, {"name": "Suzy"}], m2m_table="parent_child", pk="name"
)
db["people"].insert({"name": "Mary"}, pk="name").m2m(
    "people", [{"name": "Michael"}, {"name": "Suzy"}], m2m_table="parent_child", pk="name"
)
```

But if I do that, the many-to-many table parent_child has only one column:

```sql
CREATE TABLE [parent_child] (
   [people_id] TEXT REFERENCES [people]([name]),
   PRIMARY KEY ([people_id], [people_id])
)
```

This could be solved by adding one or two keyword_arguments to .m2m(), e.g. .m2m(..., left_name=None, right_name=None) or .m2m(..., names=(None, None)).
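For comparison, this is the schema a self-referencing `.m2m()` would need to produce: two distinctly named columns, both referencing the same table (column names here are illustrative, matching the proposed `left_name`/`right_name` idea):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table people (name text primary key)")

# A self-referencing m2m table needs two distinctly named columns,
# which is what left_name/right_name style arguments would generate
conn.execute("""
    create table parent_child (
        parent_id text references people(name),
        child_id text references people(name),
        primary key (parent_id, child_id)
    )
""")
conn.executemany("insert into people values (?)", [("John",), ("Michael",), ("Suzy",)])
conn.executemany(
    "insert into parent_child values (?, ?)",
    [("John", "Michael"), ("John", "Suzy")],
)
children = [r[0] for r in conn.execute(
    "select child_id from parent_child where parent_id = 'John' order by child_id"
)]
```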

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/431/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1278571700 I_kwDOCGYnMM5MNXS0 447 Incorrect syntax highlighting in docs CLI reference simonw 9599 closed 0     3 2022-06-21T14:53:10Z 2022-06-21T18:48:47Z 2022-06-21T18:48:46Z OWNER  

https://sqlite-utils.datasette.io/en/stable/cli-reference.html#insert

It looks like Python keywords are being incorrectly highlighted here.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/447/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1277295119 I_kwDOCGYnMM5MIfoP 445 `sqlite_utils.utils.TypeTracker` should be a documented API simonw 9599 closed 0     3 2022-06-20T19:08:28Z 2022-06-20T19:49:02Z 2022-06-20T19:46:58Z OWNER  

I've used it in a couple of external places now:

  • https://github.com/simonw/datasette-socrata/blob/32fb256a461bf0e790eca10bdc7dd9d96c20f7c4/datasette_socrata/init.py#L264-L280
  • https://github.com/simonw/datasette-lite/blob/caa8eade10f0321c64f9f65c4561186f02d57c5b/webworker.js#L55-L64

Refs: - https://github.com/simonw/datasette-lite/issues/32

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/445/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1243151184 I_kwDOCGYnMM5KGPtQ 434 `detect_fts()` identifies the wrong table if tables have names that are subsets of each other ryascott 559711 closed 0     3 2022-05-20T13:28:31Z 2022-06-14T23:24:09Z 2022-06-14T23:24:09Z NONE  

Windows 10 Python 3.9.6

When I was running a full text search through the Python library, I noticed that the query was being run on a different full text search table than the one I was trying to search.

I took a look at the following function

https://github.com/simonw/sqlite-utils/blob/841ad44bacaff05ec79ef78166d12e80c82ba6d7/sqlite_utils/db.py#L2213

and noticed:

```python
sql LIKE '%VIRTUAL TABLE%USING FTS%content=%{table}%'
```

My database contains tables with similar names and %{table}% was matching another table that ended differently in its name. I have included a sample test that shows this occurring:

I search for Marsupials in db["books"] and The Clue of the Broken Blade is returned.

This occurs since the search for Marsupials was "successfully" done against db["booksb"] and rowid 1 is returned. "The Clue of the Broken Blade" has a rowid of 1 in db["books"] and this is what is returned from the search.

```python
def test_fts_search_with_similar_table_names(fresh_db):
    db = Database(memory=True)
    db["books"].insert_all(
        [
            {
                "title": "The Clue of the Broken Blade",
                "author": "Franklin W. Dixon",
            },
            {
                "title": "Habits of Australian Marsupials",
                "author": "Marlee Hawkins",
            },
        ]
    )
    db["booksb"].insert(
        {
            "title": "Habits of Australian Marsupials",
            "author": "Marlee Hawkins",
        }
    )

    db["booksb"].enable_fts(["title", "author"])
    db["books"].enable_fts(["title", "author"])

    query = "Marsupials"

    assert [
        {
            "rowid": 1,
            "title": "Habits of Australian Marsupials",
            "author": "Marlee Hawkins",
        },
    ] == list(db["books"].search(query))
```
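The false match can be demonstrated directly with stdlib SQLite, since `%` matches any run of characters, including the quote that closes the `content=` value (the schema string below is a simplified illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The %content=%books% pattern also matches content="booksb",
# because % happily matches the trailing b and closing quote
matches = conn.execute(
    "select 'CREATE VIRTUAL TABLE ... USING FTS5(content=\"booksb\")' "
    "like '%content=%books%'"
).fetchone()[0]
```

Anchoring the match on the delimiters around the table name would avoid this.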

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/434/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1175744654 I_kwDOCGYnMM5GFHCO 417 insert fails on JSONL with whitespace blaine 9954 closed 0     3 2022-03-21T17:58:14Z 2022-03-25T21:19:06Z 2022-03-25T21:17:13Z NONE  

Any JSON that is newline-delimited and has whitespace (newlines) between the start of a JSON object and an attribute fails due to a parse error.

e.g. given the valid JSONL:

```json
{
  "attribute": "value"
}
{
  "attribute": "value2"
}
```

I would expect that sqlite-utils insert --nl my.db mytable file.jsonl would properly import the data into mytable. However, the following error is thrown instead:

json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 2 column 1 (char 2)

It makes sense that since the file is intended to be newline separated, the thing being parsed is "{" (which obviously fails), however the default newline-separated output of jq isn't compact. Using jq -c avoids this problem, but the fix is unintuitive and undocumented.

Proposed solutions: 1. Default to a "loose" newline-separated parse; this could be implemented internally as [the equivalent of] a jq -c filter ahead of the insert step. 2. Catch the JSONDecodeError (or pre-empt it in the case of a record === "{\n") and give the user a "it looks like your json isn't actually newline-delimited; try running it through jq -c instead" error message.

It might just have been too early in the morning when I was playing with this, but running pipes of data through sqlite-utils without the 'knack' of it led to some false starts.
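Proposed solution 1 ("loose" parsing) can be sketched with the stdlib alone: `json.JSONDecoder.raw_decode` pulls consecutive objects out of a stream regardless of where the newlines fall (a sketch of the idea, not sqlite-utils' actual parser):

```python
import json

def loose_objects(text):
    """Yield consecutive JSON objects even when newlines occur inside them."""
    decoder = json.JSONDecoder()
    idx = 0
    while idx < len(text):
        # Skip whitespace between objects
        while idx < len(text) and text[idx].isspace():
            idx += 1
        if idx >= len(text):
            break
        obj, idx = decoder.raw_decode(text, idx)
        yield obj

doc = '{\n  "attribute": "value"\n}\n{\n  "attribute": "value2"\n}\n'
parsed = list(loose_objects(doc))
```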

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/417/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1123903919 I_kwDOCGYnMM5C_Wmv 397 Support IF NOT EXISTS for table creation rafguns 738408 closed 0     3 2022-02-04T07:41:15Z 2022-02-06T01:30:46Z 2022-02-06T01:29:01Z NONE  

Currently, I have a bunch of code that looks like this:

```python
subjects = db["subjects"] if db["subjects"].exists() else db["subjects"].create({
    ...
})
```

It would be neat if sqlite-utils could simplify that by supporting CREATE TABLE IF NOT EXISTS, so that I'd be able to write, e.g.

```python
subjects = db["subjects"].create({...}, if_not_exists=True)
```
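The SQL side of this is already idempotent, which is what such a keyword argument would lean on (table name below is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Running the same CREATE TABLE IF NOT EXISTS twice is safe:
# the second execution is a no-op rather than an error
for _ in range(2):
    conn.execute(
        "create table if not exists subjects (id integer primary key, name text)"
    )
table_count = conn.execute(
    "select count(*) from sqlite_master where type = 'table' and name = 'subjects'"
).fetchone()[0]
```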

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/397/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
683805434 MDU6SXNzdWU2ODM4MDU0MzQ= 135 Code for finding SpatiaLite in the usual locations simonw 9599 closed 0     3 2020-08-21T20:15:34Z 2022-02-05T00:04:26Z 2020-08-21T20:30:13Z OWNER  

I built this for shapefile-to-sqlite but it would be useful in sqlite-utils too:

https://github.com/simonw/shapefile-to-sqlite/blob/e754d0747ca2facf9a7433e2d5d15a6a37a9cf6e/shapefile_to_sqlite/utils.py#L16-L19

```python
SPATIALITE_PATHS = (
    "/usr/lib/x86_64-linux-gnu/mod_spatialite.so",
    "/usr/local/lib/mod_spatialite.dylib",
)
```

https://github.com/simonw/shapefile-to-sqlite/blob/e754d0747ca2facf9a7433e2d5d15a6a37a9cf6e/shapefile_to_sqlite/utils.py#L105-L109

```python
def find_spatialite():
    for path in SPATIALITE_PATHS:
        if os.path.exists(path):
            return path
    return None
```

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/135/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
683812642 MDU6SXNzdWU2ODM4MTI2NDI= 136 --load-extension=spatialite shortcut option simonw 9599 closed 0     3 2020-08-21T20:31:25Z 2022-02-05T00:04:26Z 2020-10-16T19:14:32Z OWNER  

In conjunction with #135 - this would do the same thing as --load-extension=path-to-spatialite (see #134)

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/136/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
534507142 MDU6SXNzdWU1MzQ1MDcxNDI= 69 Feature request: enable extensions loading aborruso 30607 closed 0     3 2019-12-08T08:06:25Z 2022-02-05T00:04:25Z 2020-10-16T18:42:49Z NONE  

Hi, it would be great to add a parameter that enables the load of a sqlite extension you need.

Something like "-ext modspatialite".

In this way your great tool would be even more comfortable and powerful.

Thank you very much

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/69/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1097251014 I_kwDOCGYnMM5BZrjG 375 `sqlite-utils bulk` command simonw 9599 closed 0   3.21 7558727 3 2022-01-09T17:12:38Z 2022-01-11T02:12:58Z 2022-01-11T02:10:55Z OWNER  

The .executemany() method is a very efficient way to execute the same SQL query against a huge list of parameters.

sqlite-utils insert supports a bunch of ways of loading a list of dictionaries - from CSV, TSV, JSON, newline JSON and more thanks to: - #361

What if you could load a list of dictionaries and provide a SQL query with :named parameters that correspond to keys in those dictionaries instead?

This would need to be a new command - I thought about adding a --sql option to insert but that doesn't make sense as that command already requires a table name.
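The core mechanism the issue describes is plain `executemany()` with `:named` parameters bound from dictionary keys; a stdlib sketch of what the new command would wrap (table and column names are illustrative):

```python
import sqlite3

rows = [{"id": 1, "name": "Cleo"}, {"id": 2, "name": "Pancakes"}]

conn = sqlite3.connect(":memory:")
conn.execute("create table entries (id integer primary key, name text)")

# executemany() prepares the statement once and binds each dictionary's
# keys to the :named parameters - this is what makes it so efficient
conn.executemany("insert into entries (id, name) values (:id, :name)", rows)
inserted = conn.execute("select count(*) from entries").fetchone()[0]
```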

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/375/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1097477582 PR_kwDOCGYnMM4wtl17 377 `sqlite-utils bulk` command simonw 9599 closed 0   3.21 7558727 3 2022-01-10T05:34:24Z 2022-01-11T02:10:57Z 2022-01-11T02:10:54Z OWNER simonw/sqlite-utils/pulls/377

Refs #375

Still needs:

  • [x] Refactor @insert_upsert_options so that it doesn't duplicate @import_options
  • [x] Tests
  • [x] Documentation
  • [x] Try it against a really big file
sqlite-utils 140912432 pull    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/377/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
0  
1097135732 I_kwDOCGYnMM5BZPZ0 373 List `--fmt` options in the docs simonw 9599 closed 0   3.21 7558727 3 2022-01-09T08:22:11Z 2022-01-10T19:27:24Z 2022-01-09T17:49:00Z OWNER  

https://sqlite-utils.datasette.io/en/stable/cli.html#table-formatted-output currently cheats and tells the user to run --help - can fix this using cog.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/373/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1097087280 I_kwDOCGYnMM5BZDkw 368 Offer `python -m sqlite_utils` as an alternative to `sqlite-utils` simonw 9599 closed 0   3.21 7558727 3 2022-01-09T02:29:30Z 2022-01-10T19:27:20Z 2022-01-09T02:40:50Z OWNER  

Add this to sqlite_utils/cli.py:

```python
if __name__ == "__main__":
    cli()
```

Now the tool can be run using `python -m sqlite_utils.cli --help`

Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008214998

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/368/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1071531082 I_kwDOCGYnMM4_3kRK 349 A way of creating indexes on newly created tables simonw 9599 open 0     3 2021-12-05T18:56:12Z 2021-12-07T01:04:37Z   OWNER  

I'm writing code for https://github.com/simonw/git-history/issues/33 that creates a table inside a loop:

```python
item_pk = db[item_table].lookup(
    {"_item_id": item_id},
    item_to_insert,
    column_order=("_id", "_item_id"),
    pk="_id",
)
```

I need to look things up by _item_id on this table, which means I need an index on that column (the table can get very big).

But there's no mechanism in SQLite utils to detect if the table was created for the first time and add an index to it. And I don't want to run CREATE INDEX IF NOT EXISTS every time through the loop.

This should work like the foreign_keys= mechanism.
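Until something like this lands, one workaround is to check `sqlite_master` once and only create the index when the table is first created. A minimal sketch with plain `sqlite3` (not the sqlite-utils API; `ensure_table_with_index` is a hypothetical helper):

```python
import sqlite3

def ensure_table_with_index(conn, table):
    # Only pay the CREATE INDEX cost the first time through the loop:
    # detect whether the table already exists, create table + index if not.
    exists = conn.execute(
        "SELECT count(*) FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()[0]
    if not exists:
        conn.execute(
            f"CREATE TABLE [{table}] (_id INTEGER PRIMARY KEY, _item_id TEXT)"
        )
        conn.execute(
            f"CREATE INDEX [idx_{table}__item_id] ON [{table}] (_item_id)"
        )
    return bool(exists)

conn = sqlite3.connect(":memory:")
created_first_time = not ensure_table_with_index(conn, "items")
# The second call sees the table and skips the DDL entirely
skipped_second_time = ensure_table_with_index(conn, "items")
index_names = [
    r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'items'"
    )
]
```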

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/349/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1072435124 I_kwDOCGYnMM4_7A-0 350 Optional caching mechanism for table.lookup() simonw 9599 open 0     3 2021-12-06T17:54:25Z 2021-12-06T17:56:57Z   OWNER  

Inspired by work on git-history where I used this pattern:

```python
column_name_to_id = {}

def column_id(column):
    if column not in column_name_to_id:
        id = db["columns"].lookup(
            {"namespace": namespace_id, "name": column},
            foreign_keys=(("namespace", "namespaces", "id"),),
        )
        column_name_to_id[column] = id
    return column_name_to_id[column]
```

If you're going to be doing a large number of `table.lookup(...)` calls and you know that no other script will be modifying the database at the same time you can presumably get a big speedup using a Python in-memory cache - maybe even an LRU one to avoid memory bloat.
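A minimal sketch of what such a cache could look like, using `functools.lru_cache` over a plain `sqlite3` lookup-or-create. The `column_id`/`columns` names here are illustrative, echoing the git-history pattern above, and the memoization is only safe under the stated assumption that no other process writes to the table:

```python
import functools
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE columns (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")

calls = {"db_hits": 0}

@functools.lru_cache(maxsize=1024)
def column_id(name):
    # Safe to memoize only if no other writer touches the table meanwhile
    calls["db_hits"] += 1
    conn.execute("INSERT OR IGNORE INTO columns (name) VALUES (?)", (name,))
    return conn.execute(
        "SELECT id FROM columns WHERE name = ?", (name,)
    ).fetchone()[0]

first = column_id("latency")
second = column_id("latency")  # served from the LRU cache, no SQL executed
```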

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/350/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1039037439 PR_kwDOCGYnMM4t0uaI 333 Add functionality to read Parquet files. Florents-Tselai 2118708 closed 0     3 2021-10-28T23:43:19Z 2021-11-25T19:47:35Z 2021-11-25T19:47:35Z NONE simonw/sqlite-utils/pulls/333

I needed this for a project of mine, and I thought it'd be useful to have it in sqlite-utils (it's also mentioned in #248). The current implementation works (data is read & data types are inferred correctly). I've added a single straightforward test case, but @simonw please let me know if there are any non-obvious flags/combinations I should test too.

sqlite-utils 140912432 pull    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/333/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
0  
964400482 MDU6SXNzdWU5NjQ0MDA0ODI= 310 `sqlite-utils insert --flatten` option to flatten nested JSON simonw 9599 closed 0     3 2021-08-09T21:23:08Z 2021-10-16T13:54:56Z 2021-08-09T21:44:06Z OWNER  

I had to do this with a jq recipe today: https://til.simonwillison.net/cloudrun/tailing-cloud-run-request-logs

```
cat log.json | jq -c '[leaf_paths as $path | { "key": $path | join("_"), "value": getpath($path) }] | from_entries' \
  | sqlite-utils insert /tmp/logs.db logs - --nl --alter --batch-size 1
```

That was to turn something like this:

```json
{
    "httpRequest": {
        "latency": "0.112114537s",
        "requestMethod": "GET",
        "requestSize": "534",
        "status": 200
    },
    "insertId": "6111722f000b5b4c4d4071e2",
    "labels": {
        "service": "datasette-io"
    }
}
```

Into this instead:

```json
{
    "httpRequest_latency": "0.112114537s",
    "httpRequest_requestMethod": "GET",
    "httpRequest_requestSize": "534",
    "httpRequest_status": 200,
    "insertId": "6111722f000b5b4c4d4071e2",
    "labels_service": "datasette-io"
}
```

I have to do this often enough that I think it should be an option, `--flatten` - so I can do this instead:

```
cat log.json | sqlite-utils insert /tmp/logs.db logs - --flatten
```
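In Python the transformation itself is small. A sketch of the kind of helper `--flatten` could wrap (`flatten` is a hypothetical name, not the eventual sqlite-utils implementation; it joins nested keys with `_`, matching the jq recipe above):

```python
def flatten(row, prefix=None):
    # Recursively flatten nested dicts, joining key segments with "_"
    flat = {}
    for key, value in row.items():
        name = key if prefix is None else f"{prefix}_{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name))
        else:
            flat[name] = value
    return flat

log = {
    "httpRequest": {"latency": "0.112114537s", "status": 200},
    "insertId": "6111722f000b5b4c4d4071e2",
    "labels": {"service": "datasette-io"},
}
flattened = flatten(log)
```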

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/310/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
929748885 MDExOlB1bGxSZXF1ZXN0Njc3NTU0OTI5 293 Test against Python 3.10-dev simonw 9599 closed 0     3 2021-06-25T01:40:39Z 2021-10-13T21:49:33Z 2021-10-13T21:49:33Z OWNER simonw/sqlite-utils/pulls/293
sqlite-utils 140912432 pull    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/293/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
0  
1004613267 I_kwDOCGYnMM474S6T 328 Invalid JSON output when no rows gravis 12752 closed 0     3 2021-09-22T18:37:26Z 2021-09-22T20:21:34Z 2021-09-22T20:20:18Z NONE  

`sqlite-utils query` generates a JSON output with the result from the query:

```json
[{...},{...}]
```

If no rows are returned by the query, I'm expecting an empty JSON array:

```json
[]
```

But actually I'm getting an empty string. To be consistent, the output should be `[]` when the request succeeds (return code == 0).
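The consistent behavior is easy to get by always serializing the row list, even when it's empty. A sketch with plain `sqlite3` (not the sqlite-utils internals):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER, msg TEXT)")

cursor = conn.execute("SELECT * FROM logs WHERE id > 100")
columns = [desc[0] for desc in cursor.description]
rows = [dict(zip(columns, row)) for row in cursor.fetchall()]

# json.dumps of an empty list yields "[]", never an empty string
output = json.dumps(rows)
```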

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/328/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
913135723 MDU6SXNzdWU5MTMxMzU3MjM= 266 Add some types, enforce with mypy simonw 9599 closed 0     3 2021-06-07T06:05:56Z 2021-08-18T22:25:38Z 2021-08-18T22:25:38Z OWNER  

A good starting point would be adding type information to the members of these named tuples and the introspection methods that return them:

https://github.com/simonw/sqlite-utils/blob/9dff7a38831d471b1dff16d40d89eb5c3b4e84d6/sqlite_utils/db.py#L51-L75

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/266/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
957731178 MDU6SXNzdWU5NTc3MzExNzg= 304 `table.convert(..., where=)` and `sqlite-utils convert ... --where=` simonw 9599 closed 0     3 2021-08-02T04:27:23Z 2021-08-02T19:00:00Z 2021-08-02T18:58:10Z OWNER  

For applying the conversion to a subset of rows selected using the where clause.

Should also take optional arguments, as seen in db["dogs"].delete_where("age < ?", [3]).

Follows #302 and #251. This was originally https://github.com/simonw/sqlite-transform/issues/9
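Under the hood this can be modeled with a registered SQL function plus a parameterized UPDATE. A hedged sketch of the mechanism with plain `sqlite3`, not the eventual sqlite-utils API (`convert_fn` is an illustrative name):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dogs (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO dogs VALUES (?, ?)",
    [("cleo", 5), ("pancakes", 4), ("rex", 2)],
)

# Register the Python conversion function, then apply it only to the
# rows matching the where clause - the shape a convert(..., where=) would take
conn.create_function("convert_fn", 1, lambda value: value.upper())
conn.execute("UPDATE dogs SET name = convert_fn(name) WHERE age < ?", [3])

names = [row[0] for row in conn.execute("SELECT name FROM dogs ORDER BY age")]
```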

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/304/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
925677191 MDU6SXNzdWU5MjU2NzcxOTE= 289 Mypy fixes for rows_from_file() adamchainz 857609 closed 0     3 2021-06-20T20:34:59Z 2021-06-22T18:44:36Z 2021-06-22T18:13:26Z NONE  

Following https://github.com/simonw/sqlite-utils/issues/279#issuecomment-864328927

You had two mypy errors.

The first:

sqlite_utils/utils.py:157: error: Argument 1 to "BufferedReader" has incompatible type "BinaryIO"; expected "RawIOBase"

Looking at the BufferedReader docs, it seems to expect a RawIOBase, and this has been copied into typeshed. There may be scope to change how BufferedReader is documented and typed upstream, but for now it wouldn't be too bad to use a typing.cast():

```python
# Detect the format, then call this recursively
buffered = io.BufferedReader(
    cast(io.RawIOBase, fp),  # Undocumented BufferedReader support for BinaryIO
    buffer_size=4096,
)
```

The second error seems to be flagging a legitimate bug in your code:

sqlite_utils/utils.py:163: error: Argument 1 to "decode" of "bytes" has incompatible type "Optional[str]"; expected "str"

From your type hints, encoding may be None. In the CSV format block, you use encoding or "utf-8-sig" to set a default, maybe that's desirable in this case too?

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/289/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 1,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
923697888 MDU6SXNzdWU5MjM2OTc4ODg= 278 Support db as first parameter before subcommand, or as environment variable mcint 601708 closed 0     3 2021-06-17T09:26:29Z 2021-06-20T22:39:57Z 2021-06-18T15:43:19Z CONTRIBUTOR  
sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/278/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
919250621 MDU6SXNzdWU5MTkyNTA2MjE= 269 bool type not supported frafra 4068 closed 0     3 2021-06-11T22:00:36Z 2021-06-15T01:34:10Z 2021-06-15T01:34:10Z NONE  

Hi! Thank you for sharing this very nice tool :) It would be nice to have support for more types, like bool: it is not possible to convert to boolean at the moment. My suggestion would be to handle it as bool(int(value)), like csvkit does.
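The suggested coercion is tiny; a quick sketch of how it would behave. Note that SQLite itself has no boolean type, so the value ends up stored as an INTEGER 0/1:

```python
import sqlite3

def to_bool(value):
    # csvkit-style coercion: "0"/"1" strings become real booleans
    return bool(int(value))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (flag INTEGER)")  # SQLite stores booleans as 0/1
conn.execute("INSERT INTO t VALUES (?)", (to_bool("1"),))
stored = conn.execute("SELECT flag FROM t").fetchone()[0]
```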

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/269/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
907642546 MDU6SXNzdWU5MDc2NDI1NDY= 264 Supporting additional output formats, like GeoJSON eyeseast 25778 closed 0     3 2021-05-31T18:03:32Z 2021-06-03T05:12:21Z 2021-06-03T05:12:21Z CONTRIBUTOR  

I have a project going where it would be useful to do some spatial processing in SQLite (instead of large files) and then output GeoJSON. So my workflow would be something like this:

  1. Read Shapefiles, GeoJSON, CSVs into a SQLite database
  2. Join, filter, prune as needed
  3. Export GeoJSON for just the stuff I need at that moment, while still having a database of things that will be useful later

I'm wondering if this is worth adding to sqlite-utils itself (GeoJSON, at least), or if it's better to make a counterpart to the ecosystem of *-to-sqlite tools, say a suite of sqlite-to-* things. Or would it be crazy to have a plugin system?

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/264/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
838148087 MDU6SXNzdWU4MzgxNDgwODc= 250 Handle byte order marks (BOMs) in CSV files simonw 9599 closed 0     3 2021-03-22T22:13:18Z 2021-05-29T05:34:21Z 2021-05-29T05:34:21Z OWNER  

I often find sqlite-utils insert ... --csv creates a first column with a weird character at the start of it - which it turns out is the UTF-8 BOM. Fix that.
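The usual fix is to decode with the `utf-8-sig` codec, which strips a leading BOM if present. A quick demonstration of the difference:

```python
import csv
import io

# A CSV file written with a UTF-8 BOM, as Excel often produces
raw = b"\xef\xbb\xbfid,name\r\n1,Cleo\r\n"

# Decoding with plain utf-8 leaves the BOM glued to the first header...
bad_header = next(csv.reader(io.StringIO(raw.decode("utf-8"))))[0]

# ...while utf-8-sig strips it transparently
good_header = next(csv.reader(io.StringIO(raw.decode("utf-8-sig"))))[0]
```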

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/250/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
815554385 MDU6SXNzdWU4MTU1NTQzODU= 237 db["my_table"].drop(ignore=True) parameter, plus sqlite-utils drop-table --ignore and drop-view --ignore mhalle 649467 closed 0     3 2021-02-24T14:55:06Z 2021-02-25T17:11:41Z 2021-02-25T17:11:41Z NONE  

When I'm generating a derived table in python, I often drop the table and create it from scratch. However, the first time I generate the table, it doesn't exist, so the drop raises an exception. That means more boilerplate.

I was going to submit a pull request that adds an "if_exists" option to the drop method of tables and views.

However, for a utility like sqlite_utils, perhaps the "IF EXISTS" SQL semantics is what you want most of the time, and thus should be the default.

What do you think?
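The SQL semantics the proposed `ignore=True` would lean on is `DROP TABLE IF EXISTS`, which is safe to run whether or not the table exists yet:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# First run: the table doesn't exist yet, but IF EXISTS makes this a no-op
conn.execute("DROP TABLE IF EXISTS derived")
conn.execute("CREATE TABLE derived (id INTEGER PRIMARY KEY)")

# Second run: the table is dropped and rebuilt from scratch, no exception raised
conn.execute("DROP TABLE IF EXISTS derived")
conn.execute("CREATE TABLE derived (id INTEGER PRIMARY KEY)")

count = conn.execute(
    "SELECT count(*) FROM sqlite_master WHERE type = 'table' AND name = 'derived'"
).fetchone()[0]
```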

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/237/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
783778672 MDU6SXNzdWU3ODM3Nzg2NzI= 220 Better error message for *_fts methods against views mhalle 649467 closed 0     3 2021-01-11T23:24:00Z 2021-02-22T20:44:51Z 2021-02-14T22:34:26Z NONE  

enable_fts and its related methods only work on tables, not views.

Could those methods and possibly others move up to the Queryable superclass?

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/220/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
807817197 MDU6SXNzdWU4MDc4MTcxOTc= 229 Hitting `_csv.Error: field larger than field limit (131072)` frosencrantz 631242 closed 0     3 2021-02-13T19:52:44Z 2021-02-14T21:33:33Z 2021-02-14T21:33:33Z NONE  

I have a csv file where one of the fields is so large it is throwing an exception with this error and stops loading: _csv.Error: field larger than field limit (131072)

The stack trace occurs here: https://github.com/simonw/sqlite-utils/blob/3.1/sqlite_utils/cli.py#L633

There is a way to handle this that helps: https://stackoverflow.com/questions/15063936/csv-error-field-larger-than-field-limit-131072

One issue I had with this problem was sqlite-utils only provides limited context as to where the problem line is. There is the progress bar, but that is by percent rather than by line number. It would have been helpful if it could have provided a line number.

Also, it would have been useful if it had allowed the loading to continue with later lines.
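The Stack Overflow workaround boils down to raising the `csv` module's limit with `csv.field_size_limit()` before parsing. A minimal reproduction (the `10_000_000` ceiling here is an arbitrary illustrative choice):

```python
import csv
import io

# A field well past the default 131072-character limit
big_field = "x" * 200_000
data = io.StringIO(f"id,blob\r\n1,{big_field}\r\n")

# Raise the limit before parsing; without this, csv.reader raises
# _csv.Error: field larger than field limit (131072)
csv.field_size_limit(10_000_000)
rows = list(csv.reader(data))
field_length = len(rows[1][1])
```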

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/229/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
735650864 MDU6SXNzdWU3MzU2NTA4NjQ= 194 3.0 release with some minor breaking changes simonw 9599 closed 0   3.0 6079500 3 2020-11-03T21:36:31Z 2020-11-08T17:19:35Z 2020-11-08T17:19:34Z OWNER  

While working on search (#192) I've spotted a few small changes I would like to make that would break backwards compatibility in minor ways, hence requiring a 3.x release.

db[table].search() - I would like this to default to sorting by rank

Also I'd like to free up the -c and -f options for other purposes from the standard output formats here:

https://github.com/simonw/sqlite-utils/blob/43eae8b193d362f2b292df73e087ed6f10838144/sqlite_utils/cli.py#L48-L58

I'd like -f to be used to indicate a full-text search column during an insert and -c to indicate a column (so you can specify which columns you want to output).

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/194/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
735663855 MDExOlB1bGxSZXF1ZXN0NTE1MDE0ODgz 195 table.search() improvements plus sqlite-utils search command simonw 9599 closed 0     3 2020-11-03T22:02:08Z 2020-11-06T18:30:49Z 2020-11-06T18:30:42Z OWNER simonw/sqlite-utils/pulls/195

Refs #192. Still needs tests.

sqlite-utils 140912432 pull    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/195/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
0  
729818242 MDExOlB1bGxSZXF1ZXN0NTEwMjM1OTA5 189 Allow iterables other than Lists in m2m records adamwolf 35681 closed 0     3 2020-10-26T18:47:44Z 2020-10-27T16:28:37Z 2020-10-27T16:24:21Z CONTRIBUTOR simonw/sqlite-utils/pulls/189

I was playing around with sqlite-utils, creating a Roam Research dogsheep-style importer for Datasette, and ran into a slight snag.

I wanted to use a generator to add an order column in an importer. It looked something like:

```python
def order_generator(iterable, attr=None):
    if attr is None:
        attr = "order"
    order: int = 0

    for i in iterable:
        i[attr] = order
        order += 1
        yield i
```

When I used this with insert_all and other things, it worked fine--but it didn't work as the records argument to m2m. I dug into it, and sqlite-utils is explicitly checking if the records argument is a list or a tuple. I flipped the check upside down, and now it checks if the argument is a mapping. If it's a mapping, it wraps it in a list, otherwise it leaves it alone.

(I get that it might not really make sense to put the order column on the second table. I changed my import schema a bit, and no longer have a real example, but maybe this change still makes sense.)

The automated tests still pass, but I did not add any new ones.

Let me know what you think! I'm really loving Datasette and its ecosystem; thanks for everything!

sqlite-utils 140912432 pull    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/189/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
0  
573578548 MDU6SXNzdWU1NzM1Nzg1NDg= 89 Ability to customize columns used by extracts= feature simonw 9599 open 0     3 2020-03-01T16:54:48Z 2020-10-16T19:17:50Z   OWNER  

@simonw any thoughts on allow extracts to specify the lookup column name? If I'm understanding the documentation right, .lookup() allows you to define the "value" column (the documentation uses name), but when you use extracts keyword as part of .insert(), .upsert() etc. the lookup must be done against a column named "value". I have an existing lookup table that I've populated with columns "id" and "name" as opposed to "id" and "value", and seems I can't use extracts=, unless I'm missing something...

Initial thought on how to do this would be to allow the dictionary value to be a (table name, column name) tuple... so: `table = db.table("trees", extracts={"species_id": ("Species", "name")})`

I haven't dug too much into the existing code yet, but does this make sense? Worth doing?

Originally posted by @chrishas35 in https://github.com/simonw/sqlite-utils/issues/46#issuecomment-592999503

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/89/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
652700770 MDU6SXNzdWU2NTI3MDA3NzA= 119 Ability to remove a foreign key simonw 9599 closed 0     3 2020-07-07T22:31:37Z 2020-09-24T20:36:59Z 2020-09-24T20:36:59Z OWNER  

Useful if you add one but make a mistake and need to undo it without recreating the database from scratch.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/119/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
652961907 MDU6SXNzdWU2NTI5NjE5MDc= 121 Improved (and better documented) support for transactions simonw 9599 open 0     3 2020-07-08T04:56:51Z 2020-09-24T20:36:46Z   OWNER  

Originally posted by @simonw in https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655283393

We should put some thought into how this library supports and encourages smart use of transactions.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/121/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
708261775 MDU6SXNzdWU3MDgyNjE3NzU= 175 Add docs for .transform(column_order=) simonw 9599 closed 0     3 2020-09-24T15:19:04Z 2020-09-24T20:35:48Z 2020-09-24T16:00:56Z OWNER  

Need to update docs for .transform() now that column_order= is available. Originally posted by @simonw in https://github.com/simonw/sqlite-utils/pull/174#discussion_r494403327

Maybe also add this as an option to sqlite-utils transform - since reordering columns is actually a pretty nice capability.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/175/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
695377804 MDU6SXNzdWU2OTUzNzc4MDQ= 153 table.optimize() should delete junk rows from *_fts_docsize simonw 9599 closed 0     3 2020-09-07T20:31:09Z 2020-09-24T20:35:46Z 2020-09-07T21:16:33Z OWNER  

The second challenge here is cleaning up all of those junk rows in existing `*_fts_docsize` tables. Doing that just to the demo database from https://github-to-sqlite.dogsheep.net/github.db dropped its size from 22MB to 16MB! Here's the SQL:

```sql
DELETE FROM [licenses_fts_docsize] WHERE id NOT IN (
    SELECT rowid FROM [licenses_fts]);
```

I can do that as part of the existing `table.optimize()` method, which optimizes FTS tables. Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/149#issuecomment-688501064

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/153/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
688386219 MDExOlB1bGxSZXF1ZXN0NDc1NjY1OTg0 142 insert_all(..., alter=True) should work for new columns introduced after the first 100 records simonwiles 96218 closed 0     3 2020-08-28T22:22:57Z 2020-08-30T07:28:23Z 2020-08-28T22:30:14Z CONTRIBUTOR simonw/sqlite-utils/pulls/142

Closes #139.

sqlite-utils 140912432 pull    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/142/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
0  
671130371 MDU6SXNzdWU2NzExMzAzNzE= 130 Support tokenize option for FTS simonw 9599 closed 0     3 2020-08-01T19:27:22Z 2020-08-01T20:51:28Z 2020-08-01T20:51:14Z OWNER  

FTS5 supports things like porter stemming using a tokenize= option:

https://www.sqlite.org/fts5.html#tokenizers

Something like this in code:

```sql
CREATE VIRTUAL TABLE [{table}_fts] USING {fts_version} (
    {columns},
    tokenize='porter',
    content=[{table}]
);
```

I tried this out just now and it worked exactly as expected.

So... `db[table].enable_fts(...)` should accept a `tokenize=` argument, and `sqlite-utils enable-fts ...` should support a `--tokenize` option.
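Assuming the bundled SQLite was compiled with FTS5, the effect of `tokenize='porter'` can be seen with plain `sqlite3` - stemming lets a search for "run" match "running":

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Standalone FTS5 table (no content= table) with the porter stemmer enabled
conn.execute("CREATE VIRTUAL TABLE docs_fts USING fts5(title, tokenize='porter')")
conn.execute("INSERT INTO docs_fts (title) VALUES ('running shoes')")

# Both the indexed text and the query are stemmed, so 'run' matches 'running'
matches = conn.execute(
    "SELECT title FROM docs_fts WHERE docs_fts MATCH 'run'"
).fetchall()
```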

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/130/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
666040390 MDU6SXNzdWU2NjYwNDAzOTA= 127 Ability to insert files piped to insert-files stdin simonw 9599 closed 0     3 2020-07-27T07:09:33Z 2020-07-30T03:08:52Z 2020-07-30T03:08:18Z OWNER  

Inserting files by piping them in should work - but since a filename cannot be derived this will need a `--name blah.gif` option.

```
cat blah.gif | sqlite-utils insert-files files.db files - --name=blah.gif
```

Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/122#issuecomment-664128071
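The core of the feature can be sketched with plain `sqlite3`: read the binary stream, and require the caller to supply the name that would normally come from the filename (`insert_file` and the `files` schema here are illustrative, not the sqlite-utils implementation):

```python
import io
import sqlite3

def insert_file(conn, fp, name):
    # When reading from stdin there's no filename to derive, so the
    # caller must supply one - the motivation for the proposed --name option
    content = fp.read()
    conn.execute(
        "INSERT INTO files (name, content, size) VALUES (?, ?, ?)",
        (name, content, len(content)),
    )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files (name TEXT PRIMARY KEY, content BLOB, size INTEGER)"
)
# io.BytesIO stands in for sys.stdin.buffer in this sketch
insert_file(conn, io.BytesIO(b"GIF89a..."), name="blah.gif")
row = conn.execute("SELECT name, size FROM files").fetchone()
```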

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/127/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
644161221 MDU6SXNzdWU2NDQxNjEyMjE= 117 Support for compound (composite) foreign keys simonw 9599 open 0     3 2020-06-23T21:33:42Z 2020-06-23T21:40:31Z   OWNER  

It turns out SQLite supports composite foreign keys: https://www.sqlite.org/foreignkeys.html#fk_composite

Their example looks like this:

```sql
CREATE TABLE album(
    albumartist TEXT,
    albumname TEXT,
    albumcover BINARY,
    PRIMARY KEY(albumartist, albumname)
);

CREATE TABLE song(
    songid INTEGER,
    songartist TEXT,
    songalbum TEXT,
    songname TEXT,
    FOREIGN KEY(songartist, songalbum) REFERENCES album(albumartist, albumname)
);
```

Here's what that looks like in sqlite-utils:

```
In [1]: import sqlite_utils

In [2]: import sqlite3

In [3]: conn = sqlite3.connect(":memory:")

In [4]: conn
Out[4]: <sqlite3.Connection at 0x1087186c0>

In [5]: conn.executescript("""
   ...: CREATE TABLE album(
   ...:   albumartist TEXT,
   ...:   albumname TEXT,
   ...:   albumcover BINARY,
   ...:   PRIMARY KEY(albumartist, albumname)
   ...: );
   ...:
   ...: CREATE TABLE song(
   ...:   songid INTEGER,
   ...:   songartist TEXT,
   ...:   songalbum TEXT,
   ...:   songname TEXT,
   ...:   FOREIGN KEY(songartist, songalbum) REFERENCES album(albumartist, albumname)
   ...: );
   ...: """)
Out[5]: <sqlite3.Cursor at 0x1088def10>

In [6]: db = sqlite_utils.Database(conn)

In [7]: db.tables
Out[7]:
[<Table album (albumartist, albumname, albumcover)>,
 <Table song (songid, songartist, songalbum, songname)>]

In [8]: db.tables[0].foreign_keys
Out[8]: []

In [9]: db.tables[1].foreign_keys
Out[9]:
[ForeignKey(table='song', column='songartist', other_table='album', other_column='albumartist'),
 ForeignKey(table='song', column='songalbum', other_table='album', other_column='albumname')]
```

The table appears to have two separate foreign keys, when actually it has a single composite foreign key.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/117/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
601358649 MDU6SXNzdWU2MDEzNTg2NDk= 100 Mechanism for forcing column-type, over-riding auto-detection simonw 9599 closed 0     3 2020-04-16T19:12:52Z 2020-04-17T23:53:32Z 2020-04-17T23:53:32Z OWNER  

As seen in https://github.com/dogsheep/github-to-sqlite/issues/27#issuecomment-614843406 - there's a problem where you insert a record with a None value for a column and that column is created as TEXT - but actually you intended it to be an INT (as later examples will demonstrate).

Some kind of mechanism for over-riding the detected types of columns would be useful here.
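Until an override mechanism exists, one workaround is to create the table explicitly before the first insert, pinning the type up front so a leading `None` can't cause TEXT detection. A plain `sqlite3` sketch (not the sqlite-utils API):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Declaring the column explicitly fixes its type regardless of what the
# first inserted value happens to be
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, score INTEGER)")
conn.execute("INSERT INTO events VALUES (1, NULL)")
conn.execute("INSERT INTO events VALUES (2, 42)")

# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk)
column_type = conn.execute("PRAGMA table_info(events)").fetchall()[1][2]
```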

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/100/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
471780443 MDU6SXNzdWU0NzE3ODA0NDM= 46 extracts= option for insert/update/etc simonw 9599 closed 0     3 2019-07-23T15:55:46Z 2020-03-01T16:53:40Z 2019-07-23T17:00:44Z OWNER  

Relates to #42 and #44. I want the ability to extract values out into lookup tables during bulk insert/upsert operations.

db.insert_all(rows, extracts=["species"])

  • creates species table for values in the species column

db.insert_all(rows, extracts={"species": "Species"})

  • as above but the new table is called Species.
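The mechanics behind `extracts=` can be sketched with plain `sqlite3`: look up (or create) the lookup-table row, then store its id in place of the text value. The `insert_with_extract` helper and schema here are illustrative, not the sqlite-utils implementation:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE species (id INTEGER PRIMARY KEY, value TEXT UNIQUE)")
conn.execute(
    "CREATE TABLE trees (name TEXT, species_id INTEGER REFERENCES species(id))"
)

def insert_with_extract(conn, name, species):
    # Look up (or create) the species row, then store its id instead of the text
    conn.execute("INSERT OR IGNORE INTO species (value) VALUES (?)", (species,))
    species_id = conn.execute(
        "SELECT id FROM species WHERE value = ?", (species,)
    ).fetchone()[0]
    conn.execute("INSERT INTO trees VALUES (?, ?)", (name, species_id))

insert_with_extract(conn, "oak-1", "Quercus")
insert_with_extract(conn, "oak-2", "Quercus")  # reuses the existing lookup row
species_count = conn.execute("SELECT count(*) FROM species").fetchone()[0]
tree_count = conn.execute("SELECT count(*) FROM trees").fetchone()[0]
```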
sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/46/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
559197745 MDU6SXNzdWU1NTkxOTc3NDU= 82 Tutorial command no longer works petey284 10350886 closed 0     3 2020-02-03T16:36:11Z 2020-02-27T04:16:43Z 2020-02-27T04:16:30Z NONE  

Issue with command on tutorial on Simon's site.

The following command no longer works, and breaks with the previous too many variables error: #50

```
curl "https://data.nasa.gov/resource/y77d-th95.json" | \
  sqlite-utils insert meteorites.db meteorites - --pk=id
```

Output:

```
Traceback (most recent call last):
  File "continuum\miniconda3\envs\main\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "continuum\miniconda3\envs\main\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "Continuum\miniconda3\envs\main\Scripts\sqlite-utils.exe\__main__.py", line 9, in <module>
  File "continuum\miniconda3\envs\main\lib\site-packages\click\core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "continuum\miniconda3\envs\main\lib\site-packages\click\core.py", line 717, in main
    rv = self.invoke(ctx)
  File "continuum\miniconda3\envs\main\lib\site-packages\click\core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "continuum\miniconda3\envs\main\lib\site-packages\click\core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "continuum\miniconda3\envs\main\lib\site-packages\click\core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "continuum\miniconda3\envs\main\lib\site-packages\sqlite_utils\cli.py", line 434, in insert
    default=default,
  File "continuum\miniconda3\envs\main\lib\site-packages\sqlite_utils\cli.py", line 384, in insert_upsert_implementation
    docs, pk=pk, batch_size=batch_size, alter=alter, **extra_kwargs
  File "continuum\miniconda3\envs\main\lib\site-packages\sqlite_utils\db.py", line 1081, in insert_all
    result = self.db.conn.execute(query, params)
sqlite3.OperationalError: too many SQL variables
```

My thought is that maybe the dataset grew over the last few years and so didn't run into this issue before.

No error when I reduce the count of entries to 83. Once the number of entries hits 84 the command fails.

This passes:

```
type meteorite_83.txt | sqlite-utils insert meteorites.db meteorites - --pk=id
```

But this fails:

```
type meteorite_84.txt | sqlite-utils insert meteorites.db meteorites - --pk=id
```

A potential fix might be to chunk the incoming data? I can work on a PR if pointed in right direction.
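The chunking fix can be sketched with plain `sqlite3`: cap each multi-row INSERT so the number of `?` placeholders stays under SQLite's variable limit (999 by default in older builds; the batching helper here is illustrative):

```python
import itertools
import sqlite3

SQLITE_MAX_VARS = 999  # default compile-time limit in older SQLite builds

def chunks(iterable, size):
    # Yield successive batches of at most `size` items
    it = iter(iterable)
    while True:
        batch = list(itertools.islice(it, size))
        if not batch:
            return
        yield batch

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE meteorites (id INTEGER PRIMARY KEY, name TEXT)")

records = [(i, f"rock-{i}") for i in range(2500)]
num_columns = 2
batch_size = SQLITE_MAX_VARS // num_columns  # stay under the variable limit

for batch in chunks(records, batch_size):
    placeholders = ", ".join(["(?, ?)"] * len(batch))
    flat = [value for row in batch for value in row]
    conn.execute(f"INSERT INTO meteorites VALUES {placeholders}", flat)

total = conn.execute("SELECT count(*) FROM meteorites").fetchone()[0]
```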

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/82/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
546073980 MDU6SXNzdWU1NDYwNzM5ODA= 74 Test failures on openSUSE 15.1: AssertionError: Explicit other_table and other_column jayvdb 15092 open 0     3 2020-01-07T04:35:50Z 2020-01-12T07:21:17Z   CONTRIBUTOR  

openSUSE 15.1 uses Python 3.6.5 and click 7.0; it has test failures, while openSUSE Tumbleweed on py37 passes.

Most fail on the cli exit code like:

```
[   74s] =================================== FAILURES ===================================
[   74s] _________________________________ test_tables __________________________________
[   74s]
[   74s] db_path = '/tmp/pytest-of-abuild/pytest-0/test_tables0/test.db'
[   74s]
[   74s]     def test_tables(db_path):
[   74s]         result = CliRunner().invoke(cli.cli, ["tables", db_path])
[   74s] >       assert '[{"table": "Gosh"},\n {"table": "Gosh2"}]' == result.output.strip()
[   74s] E       assert '[{"table": "...e": "Gosh2"}]' == ''
[   74s] E         - [{"table": "Gosh"},
[   74s] E         -  {"table": "Gosh2"}]
[   74s]
[   74s] tests/test_cli.py:28: AssertionError
```

packaging project at https://build.opensuse.org/package/show/home:jayvdb:py-new/python-sqlite-utils

I'll keep digging into this after I have github-to-sqlite working on Tumbleweed, as I'll need openSUSE Leap 15.1 working before I can submit this into the main python repo.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/74/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
542814756 MDU6SXNzdWU1NDI4MTQ3NTY= 71 Tests are failing due to missing FTS5 simonw 9599 closed 0     3 2019-12-27T09:41:16Z 2019-12-27T09:49:37Z 2019-12-27T09:49:37Z OWNER  

https://travis-ci.com/simonw/sqlite-utils/jobs/268436167

This is a recent change: 2 months ago they worked fine.

I'm not sure what changed here. Maybe something to do with https://launchpad.net/~jonathonf/+archive/ubuntu/backports ?

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/71/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
470691999 MDU6SXNzdWU0NzA2OTE5OTk= 43 .add_column() doesn't match indentation of initial creation simonw 9599 closed 0     3 2019-07-20T16:33:10Z 2019-07-23T13:09:11Z 2019-07-23T13:09:05Z OWNER  

I spotted a table which was created once and then had columns added to it and the formatted SQL looks like this:

```sql
CREATE TABLE [records] (
   [type] TEXT,
   [sourceName] TEXT,
   [sourceVersion] TEXT,
   [unit] TEXT,
   [creationDate] TEXT,
   [startDate] TEXT,
   [endDate] TEXT,
   [value] TEXT,
   [metadata_Health Mate App Version] TEXT,
   [metadata_Withings User Identifier] TEXT,
   [metadata_Modified Date] TEXT,
   [metadata_Withings Link] TEXT,
   [metadata_HKWasUserEntered] TEXT
, [device] TEXT, [metadata_HKMetadataKeyHeartRateMotionContext] TEXT, [metadata_HKDeviceManufacturerName] TEXT, [metadata_HKMetadataKeySyncVersion] TEXT, [metadata_HKMetadataKeySyncIdentifier] TEXT, [metadata_HKSwimmingStrokeStyle] TEXT, [metadata_HKVO2MaxTestType] TEXT, [metadata_HKTimeZone] TEXT, [metadata_Average HR] TEXT, [metadata_Recharge] TEXT, [metadata_Lights] TEXT, [metadata_Asleep] TEXT, [metadata_Rating] TEXT, [metadata_Energy Threshold] TEXT, [metadata_Deep Sleep] TEXT, [metadata_Nap] TEXT, [metadata_Edit Slots] TEXT, [metadata_Tags] TEXT, [metadata_Daytime HR] TEXT)
```

It would be nice if the columns that were added later matched the indentation of the initial columns.
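The mismatch stems from how SQLite implements `ALTER TABLE ... ADD COLUMN`: it appends the new column definition to the stored `CREATE TABLE` text verbatim rather than reformatting it. A minimal sketch of that behaviour:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Create a table whose stored SQL uses one-column-per-line indentation
conn.execute("CREATE TABLE [t] (\n   [a] TEXT\n)")
conn.execute("ALTER TABLE [t] ADD COLUMN [b] TEXT")
# SQLite appends ", [b] TEXT" to the stored schema text as-is,
# without matching the original indentation
sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 't'"
).fetchone()[0]
print(sql)
```

So matching the indentation would mean sqlite-utils rewriting the schema itself rather than relying on what SQLite stores.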

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/43/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
448391492 MDU6SXNzdWU0NDgzOTE0OTI= 21 Option to ignore inserts if primary key exists already simonw 9599 closed 0     3 2019-05-25T00:17:12Z 2019-05-29T05:09:01Z 2019-05-29T04:18:26Z OWNER  

I've just noticed that SQLite lets you IGNORE inserts that collide with a pre-existing key. This can be quite handy if you have a dataset that keeps changing in part, and you don't want to upsert and replace pre-existing PK rows but you do want to ignore collisions to existing PK rows.

Does sqlite_utils support such (cavalier!) behaviour?

Originally posted by @psychemedia in https://github.com/simonw/sqlite-utils/issues/18#issuecomment-480621924
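For reference, the underlying SQLite behaviour being asked about is `INSERT OR IGNORE`, which silently skips rows whose primary key already exists; a minimal sketch using the stdlib `sqlite3` module (sqlite-utils had no option for this at the time of this issue):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dogs (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO dogs (id, name) VALUES (1, 'Cleo')")
# Without OR IGNORE this second insert would raise an IntegrityError;
# with it, the colliding row is skipped and the existing row kept
conn.execute("INSERT OR IGNORE INTO dogs (id, name) VALUES (1, 'Pancakes')")
name = conn.execute("SELECT name FROM dogs WHERE id = 1").fetchone()[0]
print(name)
```

This is distinct from an upsert (`INSERT OR REPLACE`), which would overwrite the pre-existing row instead of leaving it alone.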

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/21/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
403922644 MDU6SXNzdWU0MDM5MjI2NDQ= 8 Problems handling column names containing spaces or - psychemedia 82988 closed 0     3 2019-01-28T17:23:28Z 2019-04-14T15:29:33Z 2019-02-23T21:09:03Z NONE  

Irrespective of whether using column names containing a space or - character is good practice, SQLite does allow it, but sqlite-utils throws an error in the following cases:

```python
import sqlite3
from sqlite_utils import Database

dbname = 'test.db'
DB = Database(sqlite3.connect(dbname))

import pandas as pd
df = pd.DataFrame({'col1': range(3), 'col2': range(3)})

# Convert pandas dataframe to appropriate list/dict format
DB['test1'].insert_all(df.to_dict(orient='records'))

# Works fine
```

However:

```python
df = pd.DataFrame({'col 1': range(3), 'col2': range(3)})
DB['test1'].insert_all(df.to_dict(orient='records'))
```

throws:

```
---------------------------------------------------------------------------
OperationalError                          Traceback (most recent call last)
<ipython-input-27-070b758f4f92> in <module>()
      1 import pandas as pd
      2 df = pd.DataFrame({'col 1':range(3), 'col2':range(3)})
----> 3 DB['test1'].insert_all(df.to_dict(orient='records'))

/usr/local/lib/python3.7/site-packages/sqlite_utils/db.py in insert_all(self, records, pk, foreign_keys, upsert, batch_size, column_order)
    327                 jsonify_if_needed(record.get(key, None)) for key in all_columns
    328             )
--> 329             result = self.db.conn.execute(sql, values)
    330             self.db.conn.commit()
    331             self.last_id = result.lastrowid

OperationalError: near "1": syntax error
```

and:

```python
df = pd.DataFrame({'col-1': range(3), 'col2': range(3)})
DB['test1'].upsert_all(df.to_dict(orient='records'))
```

results in:

```
---------------------------------------------------------------------------
OperationalError                          Traceback (most recent call last)
<ipython-input-28-654523549d20> in <module>()
      1 import pandas as pd
      2 df = pd.DataFrame({'col-1':range(3), 'col2':range(3)})
----> 3 DB['test1'].insert_all(df.to_dict(orient='records'))

/usr/local/lib/python3.7/site-packages/sqlite_utils/db.py in insert_all(self, records, pk, foreign_keys, upsert, batch_size, column_order)
    327                 jsonify_if_needed(record.get(key, None)) for key in all_columns
    328             )
--> 329             result = self.db.conn.execute(sql, values)
    330             self.db.conn.commit()
    331             self.last_id = result.lastrowid

OperationalError: near "-": syntax error
```
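For context, SQLite itself accepts such column names as long as they are quoted, so the errors above point at how sqlite-utils builds its SQL rather than a SQLite limitation; a minimal stdlib sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Square brackets (or double quotes) let SQLite accept identifiers
# containing spaces and hyphens
conn.execute("CREATE TABLE [test1] ([col 1] INTEGER, [col-1] INTEGER)")
conn.execute("INSERT INTO [test1] ([col 1], [col-1]) VALUES (?, ?)", (1, 2))
row = conn.execute("SELECT [col 1], [col-1] FROM [test1]").fetchone()
print(row)
```

A likely fix, then, is for sqlite-utils to quote every column name it interpolates into generated SQL.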

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/8/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
413842611 MDU6SXNzdWU0MTM4NDI2MTE= 14 Utilities for adding indexes simonw 9599 closed 0     3 2019-02-24T16:57:28Z 2019-02-24T19:11:28Z 2019-02-24T19:11:28Z OWNER  

Both in the Python API and the CLI tool. For the CLI tool this should work:

    $ sqlite-utils create-index mydb.db mytable col1 col2

This will create a compound index across col1 and col2. The name of the index will be automatically chosen unless you use the --name=... option.

Support a --unique option too.
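The CLI command above would boil down to SQL along these lines; the auto-generated index name used here (`idx_mytable_col1_col2`) is an illustrative guess at a naming scheme, not a confirmed one:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE [mytable] ([col1] TEXT, [col2] TEXT)")
# Hypothetical auto-chosen name; --unique would add the UNIQUE keyword
# before INDEX, and --name=... would override the generated name
conn.execute(
    "CREATE INDEX [idx_mytable_col1_col2] ON [mytable] ([col1], [col2])"
)
index_names = [row[1] for row in conn.execute("PRAGMA index_list(mytable)")]
print(index_names)
```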

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/14/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
403625674 MDU6SXNzdWU0MDM2MjU2NzQ= 7 .insert_all() should accept a generator and process it efficiently simonw 9599 closed 0     3 2019-01-28T02:11:58Z 2019-01-28T06:26:53Z 2019-01-28T06:26:53Z OWNER  

Right now you have to load every record into memory before passing the list to .insert_all() and friends.

If you want to process millions of rows, this is inefficient. Python has generators - we should use them!

The only catch here is that part of the magic of sqlite-utils is that it guesses the column types and creates the table for you. This code will need to be updated to notice if the table needs creating and, if it does, create it using the first X records (where X = 1,000 by default but can be customized).

If a record outside of those first 1,000 has a rogue column, we can crash with an error.

This will free us up to make the --nl option added in #6 much more efficient.
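The "inspect the first X records, then put them back" step described above can be sketched with itertools; `peek_first_batch` is a hypothetical helper name, not the actual sqlite-utils implementation:

```python
import itertools


def peek_first_batch(records, batch_size=1000):
    # Pull the first batch off a generator (enough to guess column
    # types and create the table), then chain it back onto the rest
    # so no records are lost when inserting.
    it = iter(records)
    first = list(itertools.islice(it, batch_size))
    return first, itertools.chain(first, it)


def records():
    for i in range(2500):
        yield {"id": i}


first, restored = peek_first_batch(records())
total = sum(1 for _ in restored)
print(len(first), total)
```

Since the generator is consumed lazily after the peek, only one batch ever needs to be held in memory at a time.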

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/7/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [pull_request] TEXT,
   [body] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
, [active_lock_reason] TEXT, [performed_via_github_app] TEXT, [reactions] TEXT, [draft] INTEGER, [state_reason] TEXT);
CREATE INDEX [idx_issues_repo]
                ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
                ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
                ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
                ON [issues] ([user]);
Powered by Datasette · Queries took 76.271ms · About: github-to-sqlite