issue_comments

8,069 rows sorted by updated_at descending


user >30

  • simonw 6,978
  • codecov[bot] 146
  • eyeseast 53
  • russss 39
  • psychemedia 32
  • fgregg 32
  • abdusco 26
  • mroswell 20
  • aborruso 19
  • chrismp 18
  • jacobian 14
  • carlmjohnson 14
  • RhetTbull 14
  • tballison 13
  • wragge 12
  • brandonrobertz 12
  • tsibley 11
  • rixx 11
  • frafra 10
  • terrycojones 10
  • stonebig 10
  • rayvoelker 10
  • maxhawkins 9
  • clausjuhl 9
  • bobwhitelock 9
  • dependabot[bot] 9
  • 20after4 8
  • dracos 8
  • UtahDave 8
  • tomchristie 8
  • …

updated_at (date) >30

  • 2021-03-22 66
  • 2021-11-19 60
  • 2020-09-22 53
  • 2020-10-15 52
  • 2020-10-30 49
  • 2022-03-21 46
  • 2020-06-09 43
  • 2022-01-09 42
  • 2020-10-20 41
  • 2020-06-18 39
  • 2020-12-18 39
  • 2021-11-16 39
  • 2021-12-16 39
  • 2022-06-14 39
  • 2020-05-27 38
  • 2020-12-30 38
  • 2020-10-09 37
  • 2022-03-19 37
  • 2021-11-20 36
  • 2022-01-20 36
  • 2021-05-27 34
  • 2020-06-01 33
  • 2020-06-08 33
  • 2020-09-15 33
  • 2021-01-04 33
  • 2021-11-29 33
  • 2021-08-13 32
  • 2022-03-05 32
  • 2019-06-24 31
  • 2020-09-21 31
  • …

issue >30

  • Show column metadata plus links for foreign keys on arbitrary query results 50
  • Redesign default .json format 48
  • Rethink how .ext formats (v.s. ?_format=) works before 1.0 48
  • JavaScript plugin hooks mechanism similar to pluggy 47
  • Updated Dockerfile with SpatiaLite version 5.0 45
  • Complete refactor of TableView and table.html template 45
  • Port Datasette to ASGI 42
  • Authentication (and permissions) as a core concept 40
  • Deploy a live instance of demos/apache-proxy 34
  • await datasette.client.get(path) mechanism for executing internal requests 33
  • Maintain an in-memory SQLite table of connected databases and their tables 32
  • Ability to sort (and paginate) by column 31
  • Research: demonstrate if parallel SQL queries are worthwhile 31
  • link_or_copy_directory() error - Invalid cross-device link 28
  • Export to CSV 27
  • base_url configuration setting 27
  • Documentation with recommendations on running Datasette in production without using Docker 27
  • Optimize all those calls to index_list and foreign_key_list 27
  • Support cross-database joins 26
  • Ability for a canned query to write to the database 26
  • table.transform() method for advanced alter table 26
  • New pattern for views that return either JSON or HTML, available for plugins 26
  • Proof of concept for Datasette on AWS Lambda with EFS 25
  • WIP: Add Gmail takeout mbox import 25
  • Redesign register_output_renderer callback 24
  • Make it easier to insert geometries, with documentation and maybe code 24
  • "datasette insert" command and plugin hook 23
  • Datasette Plugins 22
  • .json and .csv exports fail to apply base_url 22
  • Idea: import CSV to memory, run SQL, export in a single command 22
  • …

author_association 4

  • OWNER 6,492
  • NONE 732
  • MEMBER 486
  • CONTRIBUTOR 359
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1170595021 https://github.com/simonw/sqlite-utils/issues/26#issuecomment-1170595021 https://api.github.com/repos/simonw/sqlite-utils/issues/26 IC_kwDOCGYnMM5FxdzN izzues 60892516 2022-06-29T23:35:29Z 2022-06-29T23:35:29Z NONE

Have you seen MakeTypes? Not the exact same thing but it may be relevant.

And it's inspired by the paper "Types from Data: Making Structured Data First-Class Citizens in F#".

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for turning nested JSON into foreign keys / many-to-many 455486286  
1168715058 https://github.com/simonw/datasette/pull/1763#issuecomment-1168715058 https://api.github.com/repos/simonw/datasette/issues/1763 IC_kwDOBm6k_c5FqS0y codecov[bot] 22429695 2022-06-28T13:19:28Z 2022-06-28T13:19:28Z NONE

Codecov Report

Merging #1763 (fd6a817) into main (00e59ec) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #1763   +/-   ##
=======================================
  Coverage   91.67%   91.67%           
=======================================
  Files          36       36           
  Lines        4658     4658           
=======================================
  Hits         4270     4270           
  Misses        388      388           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 00e59ec...fd6a817. Read the comment docs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Bump black from 22.1.0 to 22.6.0 1287325944  
1168704157 https://github.com/simonw/datasette/pull/1693#issuecomment-1168704157 https://api.github.com/repos/simonw/datasette/issues/1693 IC_kwDOBm6k_c5FqQKd dependabot[bot] 49699333 2022-06-28T13:11:36Z 2022-06-28T13:11:36Z CONTRIBUTOR

Superseded by #1763.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Bump black from 22.1.0 to 22.3.0 1184850337  
1164460052 https://github.com/simonw/sqlite-utils/issues/431#issuecomment-1164460052 https://api.github.com/repos/simonw/sqlite-utils/issues/431 IC_kwDOCGYnMM5FaEAU rafguns 738408 2022-06-23T14:12:51Z 2022-06-23T14:12:51Z NONE

Yeah, I think I prefer your suggestion: it seems cleaner than my initial left_name=/right_name= idea. Perhaps one downside is that it's less obvious what the role of each field is: in this example, is people_id_1 a reference to parent or child?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Allow making m2m relation of a table to itself 1227571375  
1163917719 https://github.com/dogsheep/healthkit-to-sqlite/issues/12#issuecomment-1163917719 https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/12 IC_kwDOC8tyDs5FX_mX Mjboothaus 956433 2022-06-23T04:35:02Z 2022-06-23T04:35:02Z NONE

In terms of unique identifiers - could you use values stored in HKMetadataKeySyncIdentifier?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Some workout columns should be float, not text 727848625  
1163097455 https://github.com/simonw/datasette/pull/1760#issuecomment-1163097455 https://api.github.com/repos/simonw/datasette/issues/1760 IC_kwDOBm6k_c5FU3Vv codecov[bot] 22429695 2022-06-22T13:27:08Z 2022-06-22T13:27:08Z NONE

Codecov Report

Merging #1760 (69951ee) into main (00e59ec) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #1760   +/-   ##
=======================================
  Coverage   91.67%   91.67%           
=======================================
  Files          36       36           
  Lines        4658     4658           
=======================================
  Hits         4270     4270           
  Misses        388      388           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 00e59ec...69951ee. Read the comment docs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Bump furo from 2022.4.7 to 2022.6.21 1280136357  
1163091750 https://github.com/simonw/datasette/pull/1753#issuecomment-1163091750 https://api.github.com/repos/simonw/datasette/issues/1753 IC_kwDOBm6k_c5FU18m dependabot[bot] 49699333 2022-06-22T13:22:34Z 2022-06-22T13:22:34Z CONTRIBUTOR

Superseded by #1760.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Bump furo from 2022.4.7 to 2022.6.4.1 1261826957  
1162500525 https://github.com/simonw/sqlite-utils/issues/448#issuecomment-1162500525 https://api.github.com/repos/simonw/sqlite-utils/issues/448 IC_kwDOCGYnMM5FSlmt mungewell 236907 2022-06-22T00:46:43Z 2022-06-22T00:46:43Z NONE

log.txt

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Reading rows from a file => AttributeError: '_io.StringIO' object has no attribute 'readinto' 1279144769  
1162498734 https://github.com/simonw/sqlite-utils/issues/448#issuecomment-1162498734 https://api.github.com/repos/simonw/sqlite-utils/issues/448 IC_kwDOCGYnMM5FSlKu mungewell 236907 2022-06-22T00:43:45Z 2022-06-22T00:43:45Z NONE

Attempted to test on a machine with a new version of Python, but install failed with an error message for the 'click' package.

C:\WINDOWS\system32>"c:\Program Files\Python310\python.exe"
Python 3.10.2 (tags/v3.10.2:a58ebcc, Jan 17 2022, 14:12:15) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> quit()

C:\WINDOWS\system32>cd C:\Users\swood\Downloads\sqlite-utils-main-20220621\sqlite-utils-main

C:\Users\swood\Downloads\sqlite-utils-main-20220621\sqlite-utils-main>"c:\Program Files\Python310\python.exe" setup.py install
running install
running bdist_egg
running egg_info

...

Installed c:\program files\python310\lib\site-packages\click_default_group_wheel-1.2.2-py3.10.egg
Searching for click
Downloading https://files.pythonhosted.org/packages/3d/da/f3bbf30f7e71d881585d598f67f4424b2cc4c68f39849542e81183218017/click-default-group-wheel-1.2.2.tar.gz#sha256=e90da42d92c03e88a12ed0c0b69c8a29afb5d36e3dc8d29c423ba4219e6d7747
Best match: click-default-group-wheel 1.2.2
Processing click-default-group-wheel-1.2.2.tar.gz
Writing C:\Users\swood\AppData\Local\Temp\easy_install-aiaj0_eh\click-default-group-wheel-1.2.2\setup.cfg
Running click-default-group-wheel-1.2.2\setup.py -q bdist_egg --dist-dir C:\Users\swood\AppData\Local\Temp\easy_install-aiaj0_eh\click-default-group-wheel-1.2.2\egg-dist-tmp-z61a4h8n
zip_safe flag not set; analyzing archive contents...
removing 'c:\program files\python310\lib\site-packages\click_default_group_wheel-1.2.2-py3.10.egg' (and everything under it)
Copying click_default_group_wheel-1.2.2-py3.10.egg to c:\program files\python310\lib\site-packages
click-default-group-wheel 1.2.2 is already the active version in easy-install.pth

Installed c:\program files\python310\lib\site-packages\click_default_group_wheel-1.2.2-py3.10.egg
error: The 'click' distribution was not found and is required by click-default-group-wheel, sqlite-utils
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Reading rows from a file => AttributeError: '_io.StringIO' object has no attribute 'readinto' 1279144769  
1162234441 https://github.com/simonw/sqlite-utils/issues/446#issuecomment-1162234441 https://api.github.com/repos/simonw/sqlite-utils/issues/446 IC_kwDOCGYnMM5FRkpJ simonw 9599 2022-06-21T19:28:35Z 2022-06-21T19:28:35Z OWNER

just -l now does this:

% just -l
Available recipes:
    black         # Apply Black
    cog           # Rebuild docs with cog
    default       # Run tests and linters
    lint          # Run linters: black, flake8, mypy, cog
    test *options # Run pytest with supplied options
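
A Justfile consistent with that recipe listing might look roughly like this — a hypothetical reconstruction from the recipe names and comments above (the exact commands, and the pipenv invocations mentioned later in this issue, are assumptions, not the actual file):

```just
# Hypothetical Justfile matching the `just -l` output above

# Run tests and linters
default: lint test

# Apply Black
black:
    pipenv run black .

# Rebuild docs with cog
cog:
    pipenv run cog -r docs/*.rst

# Run linters: black, flake8, mypy, cog
lint:
    pipenv run black . --check
    pipenv run flake8
    pipenv run mypy sqlite_utils tests

# Run pytest with supplied options
test *options:
    pipenv run pytest {{options}}
```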
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Use Just to automate running tests and linters locally 1277328147  
1162231111 https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1162231111 https://api.github.com/repos/simonw/sqlite-utils/issues/297 IC_kwDOCGYnMM5FRj1H simonw 9599 2022-06-21T19:25:44Z 2022-06-21T19:25:44Z OWNER

Pushed that prototype to a branch.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Option for importing CSV data using the SQLite .import mechanism 944846776  
1162223668 https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1162223668 https://api.github.com/repos/simonw/sqlite-utils/issues/297 IC_kwDOCGYnMM5FRiA0 simonw 9599 2022-06-21T19:19:22Z 2022-06-21T19:22:15Z OWNER

Built a prototype of --fast for the sqlite-utils memory command:

% time sqlite-utils memory taxi.csv 'SELECT passenger_count, COUNT(*), AVG(total_amount) FROM taxi GROUP BY passenger_count' --fast
passenger_count  COUNT(*)  AVG(total_amount)
---------------  --------  -----------------
                 128020    32.2371511482553 
0                42228     17.0214016766151 
1                1533197   17.6418833067999 
2                286461    18.0975870711456 
3                72852     17.9153958710923 
4                25510     18.452774990196  
5                50291     17.2709248175672 
6                32623     17.6002964166367 
7                2         87.17            
8                2         95.705           
9                1         113.6            
sqlite-utils memory taxi.csv  --fast  12.71s user 0.48s system 104% cpu 12.627 total

Takes 13s - about the same time as calling sqlite3 :memory: ... directly as seen in https://til.simonwillison.net/sqlite/one-line-csv-operations

Without the --fast option that takes several minutes (262s = 4m20s)!

Here's the prototype so far:

diff --git a/sqlite_utils/cli.py b/sqlite_utils/cli.py
index 86eddfb..1c83ef6 100644
--- a/sqlite_utils/cli.py
+++ b/sqlite_utils/cli.py
@@ -14,6 +14,8 @@ import io
 import itertools
 import json
 import os
+import shutil
+import subprocess
 import sys
 import csv as csv_std
 import tabulate
@@ -1669,6 +1671,7 @@ def query(
     is_flag=True,
     help="Analyze resulting tables and output results",
 )
+@click.option("--fast", is_flag=True, help="Fast mode, only works with CSV and TSV")
 @load_extension_option
 def memory(
     paths,
@@ -1692,6 +1695,7 @@ def memory(
     save,
     analyze,
     load_extension,
+    fast,
 ):
     """Execute SQL query against an in-memory database, optionally populated by imported data

@@ -1719,6 +1723,22 @@ def memory(
     \b
         sqlite-utils memory animals.csv --schema
     """
+    if fast:
+        if (
+            attach
+            or flatten
+            or param
+            or encoding
+            or no_detect_types
+            or analyze
+            or load_extension
+        ):
+            raise click.ClickException(
+                "--fast mode does not support any of the following options: --attach, --flatten, --param, --encoding, --no-detect-types, --analyze, --load-extension"
+            )
+        # TODO: Figure out and pass other supported options
+        memory_fast(paths, sql)
+        return
     db = sqlite_utils.Database(memory=True)
     # If --dump or --save or --analyze used but no paths detected, assume SQL query is a path:
     if (dump or save or schema or analyze) and not paths:
@@ -1791,6 +1811,33 @@ def memory(
     )


+def memory_fast(paths, sql):
+    if not shutil.which("sqlite3"):
+        raise click.ClickException("sqlite3 not found in PATH")
+    args = ["sqlite3", ":memory:", "-cmd", ".mode csv"]
+    table_names = []
+
+    def name(path):
+        base_name = pathlib.Path(path).stem or "t"
+        table_name = base_name
+        prefix = 1
+        while table_name in table_names:
+            prefix += 1
+            table_name = "{}_{}".format(base_name, prefix)
+        return table_name
+
+    for path in paths:
+        table_name = name(path)
+        table_names.append(table_name)
+        args.extend(
+            ["-cmd", ".import {} {}".format(pathlib.Path(path).resolve(), table_name)]
+        )
+
+    args.extend(["-cmd", ".mode column"])
+    args.append(sql)
+    subprocess.run(args)
+
+
 def _execute_query(
     db, sql, param, raw, table, csv, tsv, no_headers, fmt, nl, arrays, json_cols
 ):
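
The table-name de-duplication inside that prototype's `name()` helper can be exercised standalone. This is a minimal extraction for illustration (the function name `dedupe_table_name` is invented here; note the prototype also relies on `cli.py` already importing `pathlib`, which the diff does not add):

```python
import pathlib


def dedupe_table_name(path, existing):
    # Mirrors the name() helper in the prototype above: derive a table
    # name from the file's stem, then suffix _2, _3, ... on collisions
    base_name = pathlib.Path(path).stem or "t"
    table_name = base_name
    prefix = 1
    while table_name in existing:
        prefix += 1
        table_name = "{}_{}".format(base_name, prefix)
    return table_name


names = []
for p in ["taxi.csv", "other/taxi.csv", "misc.csv"]:
    names.append(dedupe_table_name(p, names))
print(names)  # ['taxi', 'taxi_2', 'misc']
```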
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Option for importing CSV data using the SQLite .import mechanism 944846776  
1162186856 https://github.com/simonw/sqlite-utils/issues/447#issuecomment-1162186856 https://api.github.com/repos/simonw/sqlite-utils/issues/447 IC_kwDOCGYnMM5FRZBo simonw 9599 2022-06-21T18:48:46Z 2022-06-21T18:48:46Z OWNER

That fixed it:

https://user-images.githubusercontent.com/9599/174875556-3a569c90-5c92-48eb-935c-470638deb335.png

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Incorrect syntax highlighting in docs CLI reference 1278571700  
1162179354 https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1162179354 https://api.github.com/repos/simonw/sqlite-utils/issues/297 IC_kwDOCGYnMM5FRXMa simonw 9599 2022-06-21T18:44:03Z 2022-06-21T18:44:03Z OWNER

The thing I like about that --fast option is that it could selectively use this alternative mechanism just for the files for which it can work (CSV and TSV files). I could also add a --fast option to sqlite-utils memory which could then kick in only for operations that involve just TSV and CSV files.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Option for importing CSV data using the SQLite .import mechanism 944846776  
1161869859 https://github.com/simonw/sqlite-utils/issues/447#issuecomment-1161869859 https://api.github.com/repos/simonw/sqlite-utils/issues/447 IC_kwDOCGYnMM5FQLoj simonw 9599 2022-06-21T15:00:42Z 2022-06-21T15:00:42Z OWNER

Deploying that to https://sqlite-utils.datasette.io/en/latest/cli-reference.html#insert

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Incorrect syntax highlighting in docs CLI reference 1278571700  
1161857806 https://github.com/simonw/sqlite-utils/issues/447#issuecomment-1161857806 https://api.github.com/repos/simonw/sqlite-utils/issues/447 IC_kwDOCGYnMM5FQIsO simonw 9599 2022-06-21T14:55:51Z 2022-06-21T14:58:14Z OWNER

https://stackoverflow.com/a/44379513 suggests that the fix is:

.. code-block:: text

Or set this in conf.py:

highlight_language = "none"

I like that better - I don't like that all :: blocks default to being treated as Python code.
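
For reference, the conf.py change described above is a one-liner:

```python
# docs/conf.py - stop treating bare `::` literal blocks as Python
highlight_language = "none"
```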

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Incorrect syntax highlighting in docs CLI reference 1278571700  
1161849874 https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1161849874 https://api.github.com/repos/simonw/sqlite-utils/issues/297 IC_kwDOCGYnMM5FQGwS simonw 9599 2022-06-21T14:49:12Z 2022-06-21T14:49:12Z OWNER

Since there are all sorts of existing options for sqlite-utils insert that won't work with this, maybe it would be better to have an entirely separate command - this for example:

sqlite-utils fast-insert data.db mytable data.csv
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Option for importing CSV data using the SQLite .import mechanism 944846776  
882052693 https://github.com/simonw/sqlite-utils/issues/297#issuecomment-882052693 https://api.github.com/repos/simonw/sqlite-utils/issues/297 IC_kwDOCGYnMM40kw5V simonw 9599 2021-07-18T12:57:54Z 2022-06-21T13:17:15Z OWNER

Another implementation option would be to use the CSV virtual table mechanism. This could avoid shelling out to the sqlite3 binary, but requires solving the harder problem of compiling and distributing a loadable SQLite module: https://www.sqlite.org/csv.html

(Would be neat to produce a Python wheel of this, see https://simonwillison.net/2022/May/23/bundling-binary-tools-in-python-wheels/)

This would also help solve the challenge of making this optimization available to the sqlite-utils memory command. That command operates against an in-memory database so it's not obvious how it could shell out to a binary.
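
For contrast with both the virtual-table and shell-out ideas, here is a minimal sketch (not the actual sqlite-utils implementation) of the pure-Python route the optimization aims to speed up — parsing CSV in Python and loading it into an in-memory SQLite database:

```python
import csv
import io
import sqlite3


def memory_query(csv_text, table, sql):
    # Parse the CSV in Python, load it into :memory:, run the query.
    # This Python-level row handling is the slow path discussed above.
    rows = list(csv.reader(io.StringIO(csv_text)))
    headers, data = rows[0], rows[1:]
    conn = sqlite3.connect(":memory:")
    cols = ", ".join("[{}]".format(h) for h in headers)
    conn.execute("CREATE TABLE [{}] ({})".format(table, cols))
    placeholders = ", ".join("?" for _ in headers)
    conn.executemany(
        "INSERT INTO [{}] VALUES ({})".format(table, placeholders), data
    )
    return conn.execute(sql).fetchall()


result = memory_query("a,b\n1,2\n3,4\n", "t", "SELECT COUNT(*) FROM t")
print(result)  # [(2,)]
```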

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Option for importing CSV data using the SQLite .import mechanism 944846776  
1160991031 https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1160991031 https://api.github.com/repos/simonw/sqlite-utils/issues/297 IC_kwDOCGYnMM5FM1E3 simonw 9599 2022-06-21T00:35:20Z 2022-06-21T00:35:20Z OWNER

Relevant TIL: https://til.simonwillison.net/sqlite/one-line-csv-operations

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Option for importing CSV data using the SQLite .import mechanism 944846776  
1160798645 https://github.com/simonw/sqlite-utils/issues/446#issuecomment-1160798645 https://api.github.com/repos/simonw/sqlite-utils/issues/446 IC_kwDOCGYnMM5FMGG1 simonw 9599 2022-06-20T19:55:34Z 2022-06-20T19:55:34Z OWNER

just now defaults to running the tests and linters.

just test runs the tests - it can take arguments, e.g. just test -k transform

just lint runs all of the linters.

just black applies Black.

In all cases it assumes you are using pipenv, at least for the moment.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Use Just to automate running tests and linters locally 1277328147  
1160794604 https://github.com/simonw/sqlite-utils/issues/443#issuecomment-1160794604 https://api.github.com/repos/simonw/sqlite-utils/issues/443 IC_kwDOCGYnMM5FMFHs simonw 9599 2022-06-20T19:49:37Z 2022-06-20T19:49:37Z OWNER

Also now shows up here: https://sqlite-utils.datasette.io/en/latest/reference.html#sqlite-utils-utils-rows-from-file

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Make `utils.rows_from_file()` a documented API 1269998342  
1160794175 https://github.com/simonw/sqlite-utils/issues/445#issuecomment-1160794175 https://api.github.com/repos/simonw/sqlite-utils/issues/445 IC_kwDOCGYnMM5FMFA_ simonw 9599 2022-06-20T19:49:02Z 2022-06-20T19:49:02Z OWNER

New documentation:

  • https://sqlite-utils.datasette.io/en/latest/python-api.html#detecting-column-types-using-typetracker
  • https://sqlite-utils.datasette.io/en/latest/reference.html#sqlite-utils-utils-typetracker
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`sqlite_utils.utils.TypeTracker` should be a documented API 1277295119  
1160793114 https://github.com/simonw/sqlite-utils/issues/445#issuecomment-1160793114 https://api.github.com/repos/simonw/sqlite-utils/issues/445 IC_kwDOCGYnMM5FMEwa simonw 9599 2022-06-20T19:47:36Z 2022-06-20T19:47:36Z OWNER

I also added inline documentation and types: https://github.com/simonw/sqlite-utils/blob/773f2b6b20622bb986984a1c3161d5b3aaa1046b/sqlite_utils/utils.py#L318-L360

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`sqlite_utils.utils.TypeTracker` should be a documented API 1277295119  
1160763268 https://github.com/simonw/sqlite-utils/issues/445#issuecomment-1160763268 https://api.github.com/repos/simonw/sqlite-utils/issues/445 IC_kwDOCGYnMM5FL9eE simonw 9599 2022-06-20T19:09:21Z 2022-06-20T19:09:21Z OWNER

Code to document: https://github.com/simonw/sqlite-utils/blob/3fbe8a784cc2f3fa0bfa8612fec9752ff9068a2b/sqlite_utils/utils.py#L318-L331

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`sqlite_utils.utils.TypeTracker` should be a documented API 1277295119  
1160717784 https://github.com/simonw/datasette/pull/1759#issuecomment-1160717784 https://api.github.com/repos/simonw/datasette/issues/1759 IC_kwDOBm6k_c5FLyXY codecov[bot] 22429695 2022-06-20T18:04:46Z 2022-06-20T18:04:46Z NONE

Codecov Report

Merging #1759 (b901bb0) into main (2e97516) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #1759   +/-   ##
=======================================
  Coverage   91.67%   91.67%           
=======================================
  Files          36       36           
  Lines        4658     4658           
=======================================
  Hits         4270     4270           
  Misses        388      388           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2e97516...b901bb0. Read the comment docs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Extract facet portions of table.html out into included templates 1275523220  
1160717735 https://github.com/simonw/datasette/pull/1759#issuecomment-1160717735 https://api.github.com/repos/simonw/datasette/issues/1759 IC_kwDOBm6k_c5FLyWn simonw 9599 2022-06-20T18:04:41Z 2022-06-20T18:04:41Z OWNER

I don't think this change needs any changes to the documentation: https://docs.datasette.io/en/stable/custom_templates.html#custom-templates

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Extract facet portions of table.html out into included templates 1275523220  
1160712911 https://github.com/simonw/datasette/pull/1759#issuecomment-1160712911 https://api.github.com/repos/simonw/datasette/issues/1759 IC_kwDOBm6k_c5FLxLP simonw 9599 2022-06-20T17:58:37Z 2022-06-20T17:58:37Z OWNER

This is a great idea.

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
Extract facet portions of table.html out into included templates 1275523220  
1155966234 https://github.com/simonw/sqlite-utils/issues/444#issuecomment-1155966234 https://api.github.com/repos/simonw/sqlite-utils/issues/444 IC_kwDOCGYnMM5E5qUa simonw 9599 2022-06-15T04:18:05Z 2022-06-15T04:18:05Z OWNER

I'm going to push a branch with my not-yet-working code (which does at least include a test).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV `extras_key=` and `ignore_extras=` equivalents for CLI tool 1271426387  
1155815956 https://github.com/simonw/sqlite-utils/issues/444#issuecomment-1155815956 https://api.github.com/repos/simonw/sqlite-utils/issues/444 IC_kwDOCGYnMM5E5FoU simonw 9599 2022-06-14T23:49:56Z 2022-06-15T03:58:10Z OWNER

Yeah my initial implementation there makes no sense:

            csv_reader_args = {"dialect": dialect}
            if delimiter:
                csv_reader_args["delimiter"] = delimiter
            if quotechar:
                csv_reader_args["quotechar"] = quotechar
            reader = _extra_key_strategy(
                csv_std.reader(decoded, **csv_reader_args), ignore_extras, extras_key
            )
            first_row = next(reader)
            if no_headers:
                headers = ["untitled_{}".format(i + 1) for i in range(len(first_row))]
                reader = itertools.chain([first_row], reader)
            else:
                headers = first_row
            docs = (dict(zip(headers, row)) for row in reader)

Because my _extra_key_strategy() helper function is designed to work against csv.DictReader - not against csv.reader(), which returns a sequence of lists, not a sequence of dictionaries.

In fact, what's happening here is that dict(zip(headers, row)) is ignoring anything in the row that doesn't correspond to a header:

>>> list(zip(["a", "b"], [1, 2, 3]))
[('a', 1), ('b', 2)]
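
A list-based variant of that helper would need to pull the surplus values out of each row before `dict(zip(...))` discards them. A hypothetical sketch (the name `rows_with_extras` is invented here, and this is not the actual `_extra_key_strategy()` code):

```python
import csv
import io


def rows_with_extras(reader, headers, extras_key):
    # Instead of letting dict(zip(...)) silently drop surplus values,
    # collect anything past the known headers under extras_key
    for row in reader:
        doc = dict(zip(headers, row))
        extras = row[len(headers):]
        if extras:
            doc[extras_key] = list(extras)
        yield doc


reader = csv.reader(io.StringIO("1,2,3\n4,5\n"))
docs = list(rows_with_extras(reader, ["a", "b"], "extras"))
print(docs)
# [{'a': '1', 'b': '2', 'extras': ['3']}, {'a': '4', 'b': '5'}]
```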
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV `extras_key=` and `ignore_extras=` equivalents for CLI tool 1271426387  
1155953345 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155953345 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E5nLB simonw 9599 2022-06-15T03:53:43Z 2022-06-15T03:53:43Z OWNER

I tried fixing this by using .tell() to read the file position as I was iterating through it:

diff --git a/sqlite_utils/utils.py b/sqlite_utils/utils.py
index d2ccc5f..29ad12e 100644
--- a/sqlite_utils/utils.py
+++ b/sqlite_utils/utils.py
@@ -149,10 +149,13 @@ class UpdateWrapper:
     def __init__(self, wrapped, update):
         self._wrapped = wrapped
         self._update = update
+        self._tell = wrapped.tell()

     def __iter__(self):
         for line in self._wrapped:
-            self._update(len(line))
+            tell = self._wrapped.tell()
+            self._update(self._tell - tell)
+            self._tell = tell
             yield line
This did not work - I get this error:

  File "/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/utils.py", line 206, in _extra_key_strategy
    for row in reader:
  File "/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/utils.py", line 156, in __iter__
    tell = self._wrapped.tell()
OSError: telling position disabled by next() call

It looks like you can't use `.tell()` during iteration: https://stackoverflow.com/questions/29618936/how-to-solve-oserror-telling-position-disabled-by-next-call
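
The limitation is easy to reproduce outside sqlite-utils: a TextIOWrapper disables tell() as soon as line iteration (next()) has started. A minimal standalone sketch:

```python
import os
import tempfile

# Write a small text file to iterate over
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "w") as f:
    f.write("one\ntwo\n")

caught = None
with open(path) as f:
    try:
        for line in f:
            f.tell()  # raises OSError once iteration has begun
    except OSError as exc:
        caught = str(exc)
os.remove(path)
print(caught)
```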

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155815186 https://github.com/simonw/sqlite-utils/issues/444#issuecomment-1155815186 https://api.github.com/repos/simonw/sqlite-utils/issues/444 IC_kwDOCGYnMM5E5FcS simonw 9599 2022-06-14T23:48:16Z 2022-06-14T23:48:16Z OWNER

This is tricky to implement because of this code: https://github.com/simonw/sqlite-utils/blob/b8af3b96f5c72317cc8783dc296a94f6719987d9/sqlite_utils/cli.py#L938-L945

It's reconstructing each document using the known headers here:

docs = (dict(zip(headers, row)) for row in reader)

So my first attempt at this - the diff here - did not have the desired result:

diff --git a/sqlite_utils/cli.py b/sqlite_utils/cli.py
index 86eddfb..00b920b 100644
--- a/sqlite_utils/cli.py
+++ b/sqlite_utils/cli.py
@@ -6,7 +6,7 @@ import hashlib
 import pathlib
 import sqlite_utils
 from sqlite_utils.db import AlterError, BadMultiValues, DescIndex
-from sqlite_utils.utils import maximize_csv_field_size_limit
+from sqlite_utils.utils import maximize_csv_field_size_limit, _extra_key_strategy
 from sqlite_utils import recipes
 import textwrap
 import inspect
@@ -797,6 +797,15 @@ _import_options = (
         "--encoding",
         help="Character encoding for input, defaults to utf-8",
     ),
+    click.option(
+        "--ignore-extras",
+        is_flag=True,
+        help="If a CSV line has more than the expected number of values, ignore the extras",
+    ),
+    click.option(
+        "--extras-key",
+        help="If a CSV line has more than the expected number of values put them in a list in this column",
+    ),
 )


@@ -885,6 +894,8 @@ def insert_upsert_implementation(
     sniff,
     no_headers,
     encoding,
+    ignore_extras,
+    extras_key,
     batch_size,
     alter,
     upsert,
@@ -909,6 +920,10 @@ def insert_upsert_implementation(
         raise click.ClickException("--flatten cannot be used with --csv or --tsv")
     if encoding and not (csv or tsv):
         raise click.ClickException("--encoding must be used with --csv or --tsv")
+    if ignore_extras and extras_key:
+        raise click.ClickException(
+            "--ignore-extras and --extras-key cannot be used together"
+        )
     if pk and len(pk) == 1:
         pk = pk[0]
     encoding = encoding or "utf-8-sig"
@@ -935,7 +950,9 @@ def insert_upsert_implementation(
                 csv_reader_args["delimiter"] = delimiter
             if quotechar:
                 csv_reader_args["quotechar"] = quotechar
-            reader = csv_std.reader(decoded, **csv_reader_args)
+            reader = _extra_key_strategy(
+                csv_std.reader(decoded, **csv_reader_args), ignore_extras, extras_key
+            )
             first_row = next(reader)
             if no_headers:
                 headers = ["untitled_{}".format(i + 1) for i in range(len(first_row))]
@@ -1101,6 +1118,8 @@ def insert(
     sniff,
     no_headers,
     encoding,
+    ignore_extras,
+    extras_key,
     batch_size,
     alter,
     detect_types,
@@ -1176,6 +1195,8 @@ def insert(
             sniff,
             no_headers,
             encoding,
+            ignore_extras,
+            extras_key,
             batch_size,
             alter=alter,
             upsert=False,
@@ -1214,6 +1235,8 @@ def upsert(
     sniff,
     no_headers,
     encoding,
+    ignore_extras,
+    extras_key,
     alter,
     not_null,
     default,
@@ -1254,6 +1277,8 @@ def upsert(
             sniff,
             no_headers,
             encoding,
+            ignore_extras,
+            extras_key,
             batch_size,
             alter=alter,
             upsert=True,
@@ -1297,6 +1322,8 @@ def bulk(
     sniff,
     no_headers,
     encoding,
+    ignore_extras,
+    extras_key,
     load_extension,
 ):
     """
@@ -1331,6 +1358,8 @@ def bulk(
             sniff=sniff,
             no_headers=no_headers,
             encoding=encoding,
+            ignore_extras=ignore_extras,
+            extras_key=extras_key,
             batch_size=batch_size,
             alter=False,
             upsert=False,
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV `extras_key=` and `ignore_extras=` equivalents for CLI tool 1271426387  
1155804591 https://github.com/simonw/sqlite-utils/issues/444#issuecomment-1155804591 https://api.github.com/repos/simonw/sqlite-utils/issues/444 IC_kwDOCGYnMM5E5C2v simonw 9599 2022-06-14T23:28:36Z 2022-06-14T23:28:36Z OWNER

I'm going with --extras-key and --ignore-extras as the two new options.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV `extras_key=` and `ignore_extras=` equivalents for CLI tool 1271426387  
1155804459 https://github.com/simonw/sqlite-utils/issues/444#issuecomment-1155804459 https://api.github.com/repos/simonw/sqlite-utils/issues/444 IC_kwDOCGYnMM5E5C0r simonw 9599 2022-06-14T23:28:18Z 2022-06-14T23:28:18Z OWNER

I think these become part of the _import_options list which is used in a few places:

https://github.com/simonw/sqlite-utils/blob/b8af3b96f5c72317cc8783dc296a94f6719987d9/sqlite_utils/cli.py#L765-L800

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV `extras_key=` and `ignore_extras=` equivalents for CLI tool 1271426387  
1155803262 https://github.com/simonw/sqlite-utils/issues/430#issuecomment-1155803262 https://api.github.com/repos/simonw/sqlite-utils/issues/430 IC_kwDOCGYnMM5E5Ch- simonw 9599 2022-06-14T23:26:11Z 2022-06-14T23:26:11Z OWNER

It looks like PRAGMA temp_store was the right option to use here: https://www.sqlite.org/pragma.html#pragma_temp_store

temp_store_directory is listed as deprecated here: https://www.sqlite.org/pragma.html#pragma_temp_store_directory

I'm going to turn this into a help-wanted documentation issue.
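A quick sketch of the PRAGMA in action via Python's sqlite3 module (the documentation fix itself is still to be written): setting temp_store to MEMORY keeps temporary tables and indices in RAM instead of temp files, which is what avoids filling a small temp partition during VACUUM. Querying the pragma back returns 2 for MEMORY.

```python
import sqlite3

# temp_store values: 0 = DEFAULT, 1 = FILE, 2 = MEMORY
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA temp_store = MEMORY")
print(conn.execute("PRAGMA temp_store").fetchone()[0])  # 2
```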

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Document how to use `PRAGMA temp_store` to avoid errors when running VACUUM against huge databases 1224112817  
1155801812 https://github.com/simonw/sqlite-utils/issues/434#issuecomment-1155801812 https://api.github.com/repos/simonw/sqlite-utils/issues/434 IC_kwDOCGYnMM5E5CLU simonw 9599 2022-06-14T23:23:32Z 2022-06-14T23:23:32Z OWNER

Since table names can be quoted like this:

CREATE VIRTUAL TABLE "searchable_fts"
    USING FTS4 (text1, text2, [name with . and spaces], content="searchable")

OR like this:

CREATE VIRTUAL TABLE "searchable_fts"
    USING FTS4 (text1, text2, [name with . and spaces], content=[searchable])

This fix looks to be correct to me (copying from the updated test_with_trace() test):

            (
                "SELECT name FROM sqlite_master\n"
                "    WHERE rootpage = 0\n"
                "    AND (\n"
                "        sql LIKE :like\n"
                "        OR sql LIKE :like2\n"
                "        OR (\n"
                "            tbl_name = :table\n"
                "            AND sql LIKE '%VIRTUAL TABLE%USING FTS%'\n"
                "        )\n"
                "    )",
                {
                    "like": "%VIRTUAL TABLE%USING FTS%content=[dogs]%",
                    "like2": '%VIRTUAL TABLE%USING FTS%content="dogs"%',
                    "table": "dogs",
                },
            )
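The two LIKE patterns can be sanity-checked with a throwaway sqlite3 session - a sketch reusing the table and column names from the comment, confirming that between them the patterns match both quoting styles of content=:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
creates = [
    'CREATE VIRTUAL TABLE "searchable_fts" USING FTS4 (text1, content="searchable")',
    'CREATE VIRTUAL TABLE "searchable_fts" USING FTS4 (text1, content=[searchable])',
]
like = "%VIRTUAL TABLE%USING FTS%content=[searchable]%"
like2 = '%VIRTUAL TABLE%USING FTS%content="searchable"%'
results = []
for sql in creates:
    # LIKE only treats % and _ specially, so [ ] and " match literally
    matched = conn.execute(
        "SELECT :sql LIKE :like OR :sql LIKE :like2",
        {"sql": sql, "like": like, "like2": like2},
    ).fetchone()[0]
    results.append(matched)
print(results)  # [1, 1]
```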
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`detect_fts()` identifies the wrong table if tables have names that are subsets of each other 1243151184  
1155794149 https://github.com/simonw/sqlite-utils/issues/434#issuecomment-1155794149 https://api.github.com/repos/simonw/sqlite-utils/issues/434 IC_kwDOCGYnMM5E5ATl simonw 9599 2022-06-14T23:09:54Z 2022-06-14T23:09:54Z OWNER

A test that demonstrates the problem:

@pytest.mark.parametrize("reverse_order", (True, False))
def test_detect_fts_similar_tables(fresh_db, reverse_order):
    # https://github.com/simonw/sqlite-utils/issues/434
    table1, table2 = ("demo", "demo2")
    if reverse_order:
        table1, table2 = table2, table1

    fresh_db[table1].insert({"title": "Hello"}).enable_fts(
        ["title"], fts_version="FTS4"
    )
    fresh_db[table2].insert({"title": "Hello"}).enable_fts(
        ["title"], fts_version="FTS4"
    )
    assert fresh_db[table1].detect_fts() == "{}_fts".format(table1)
    assert fresh_db[table2].detect_fts() == "{}_fts".format(table2)

The order matters - so this test currently passes in one direction and fails in the other:

>       assert fresh_db[table2].detect_fts() == "{}_fts".format(table2)
E       AssertionError: assert 'demo2_fts' == 'demo_fts'
E         - demo_fts
E         + demo2_fts
E         ?     +

tests/test_introspect.py:53: AssertionError
========================================================================================= short test summary info =========================================================================================
FAILED tests/test_introspect.py::test_detect_fts_similar_tables[True] - AssertionError: assert 'demo2_fts' == 'demo_fts'
=============================================================================== 1 failed, 1 passed, 855 deselected in 1.00s ===============================================================================
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`detect_fts()` identifies the wrong table if tables have names that are subsets of each other 1243151184  
1155791109 https://github.com/simonw/sqlite-utils/issues/434#issuecomment-1155791109 https://api.github.com/repos/simonw/sqlite-utils/issues/434 IC_kwDOCGYnMM5E4_kF simonw 9599 2022-06-14T23:04:40Z 2022-06-14T23:04:40Z OWNER

Definitely a bug - thanks for the detailed write-up!

You're right, the code at fault is here:

https://github.com/simonw/sqlite-utils/blob/1b09538bc6c1fda773590f3e600993ef06591041/sqlite_utils/db.py#L2213-L2231

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`detect_fts()` identifies the wrong table if tables have names that are subsets of each other 1243151184  
1155789101 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155789101 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E4_Et simonw 9599 2022-06-14T23:00:45Z 2022-06-14T23:00:45Z OWNER

I'm going to mark this as "help wanted" and leave it open. I'm glad that it's not actually a bug where errors get swallowed.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155788944 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155788944 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E4_CQ simonw 9599 2022-06-14T23:00:24Z 2022-06-14T23:00:24Z OWNER

The progress bar only works if the file-like object passed to it has a fp.fileno() that isn't 0 (for stdin) - that's how it detects that the file is something which it can measure the size of in order to show progress.

If we know the file size in bytes AND we know the character encoding, can we change UpdateWrapper to update the number of bytes-per-character instead?

I don't think so: I can't see a way of definitively saying "for this encoding the number of bytes per character is X" - and in fact I'm pretty sure that question doesn't even make sense since variable-length encodings exist.
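A few characters make the point - neither utf-8 nor utf-16-le has a fixed bytes-per-character ratio:

```python
# Bytes-per-character is not a constant for common encodings:
# utf-8 uses 1-4 bytes per code point, utf-16-le uses 2 or 4
for ch in ("a", "é", "€", "𝄞"):
    print(ch, len(ch.encode("utf-8")), len(ch.encode("utf-16-le")))
```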

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155784284 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155784284 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E495c simonw 9599 2022-06-14T22:51:03Z 2022-06-14T22:52:13Z OWNER

Yes, this is the problem. The progress bar length is set to the length in bytes of the file - os.path.getsize(file.name) - but it's then incremented by the length of each DECODED line in turn.

So if the file is in utf-16-le (twice the size of utf-8) the progress bar will finish at 50%!
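The mismatch in one line: the bar's total is the byte length on disk, but the increments are decoded-character counts.

```python
# For ASCII content, utf-16-le is exactly twice the size of the decoded
# text - so the bar total is 1200 while only 600 ever gets added to it
text = "hello\n" * 100
raw = text.encode("utf-16-le")
print(len(raw), len(text))  # 1200 600
```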

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155782835 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155782835 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E49iz simonw 9599 2022-06-14T22:48:22Z 2022-06-14T22:49:53Z OWNER

Here's the code that implements the progress bar in question: https://github.com/simonw/sqlite-utils/blob/1b09538bc6c1fda773590f3e600993ef06591041/sqlite_utils/cli.py#L918-L932

It calls file_progress() which looks like this:

https://github.com/simonw/sqlite-utils/blob/1b09538bc6c1fda773590f3e600993ef06591041/sqlite_utils/utils.py#L159-L175

Which uses this:

https://github.com/simonw/sqlite-utils/blob/1b09538bc6c1fda773590f3e600993ef06591041/sqlite_utils/utils.py#L148-L156

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155781399 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155781399 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E49MX simonw 9599 2022-06-14T22:45:41Z 2022-06-14T22:45:41Z OWNER

TIL how to use iconv: https://til.simonwillison.net/linux/iconv

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155776023 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155776023 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E474X simonw 9599 2022-06-14T22:36:07Z 2022-06-14T22:36:07Z OWNER

Wait! The arguments in that are the wrong way round. This is correct:

sqlite-utils insert --csv --delimiter ";" --encoding "utf-16-le" test.db test csv

It still outputs the following:

[------------------------------------] 0%
[#################-------------------] 49% 00:00:02%

But it creates a test.db file that is 6.2MB.

That database has 3142 rows in it:

% sqlite-utils tables test.db --counts -t
table      count
-------  -------
test        3142

I converted that csv file to utf-8 like so:

iconv -f UTF-16LE -t UTF-8 csv > utf8.csv

And it contains 3142 lines:

% wc -l utf8.csv 
    3142 utf8.csv

So my hunch here is that the problem is actually that the progress bar doesn't know how to correctly measure files in utf-16-le encoding!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155772244 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155772244 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E469U simonw 9599 2022-06-14T22:30:03Z 2022-06-14T22:30:03Z OWNER

Tried this:

% python -i $(which sqlite-utils) insert --csv --delimiter ";" --encoding "utf-16-le" test test.db csv
  [------------------------------------]    0%
  [#################-------------------]   49%  00:00:01Traceback (most recent call last):
  File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 1072, in main
    ctx.exit()
  File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 692, in exit
    raise Exit(code)
click.exceptions.Exit: 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/bin/sqlite-utils", line 33, in <module>
    sys.exit(load_entry_point('sqlite-utils', 'console_scripts', 'sqlite-utils')())
  File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 1090, in main
    sys.exit(e.exit_code)
SystemExit: 0
>>> 
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155771462 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155771462 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E46xG simonw 9599 2022-06-14T22:28:38Z 2022-06-14T22:28:38Z OWNER

Maybe this isn't a CSV field value problem - I tried this patch and didn't seem to hit the new breakpoints:

diff --git a/sqlite_utils/utils.py b/sqlite_utils/utils.py
index d2ccc5f..f1b823a 100644
--- a/sqlite_utils/utils.py
+++ b/sqlite_utils/utils.py
@@ -204,13 +204,17 @@ def _extra_key_strategy(
         # DictReader adds a 'None' key with extra row values
         if None not in row:
             yield row
-        elif ignore_extras:
+            continue
+        else:
+            breakpoint()
+        if ignore_extras:
             # ignoring row.pop(none) because of this issue:
             # https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1155358637
             row.pop(None)  # type: ignore
             yield row
         elif not extras_key:
             extras = row.pop(None)  # type: ignore
+            breakpoint()
             raise RowError(
                 "Row {} contained these extra values: {}".format(row, extras)
             )
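The DictReader behaviour that _extra_key_strategy checks for can be seen directly - surplus values are collected under the restkey, which defaults to None:

```python
import csv
import io

# Two headers but three values: csv.DictReader puts the surplus in a
# list under the None key (the default restkey)
reader = csv.DictReader(io.StringIO("a,b\n1,2,3\n"))
row = next(reader)
print(row)  # {'a': '1', 'b': '2', None: ['3']}
```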
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155769216 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155769216 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E46OA simonw 9599 2022-06-14T22:24:49Z 2022-06-14T22:25:06Z OWNER

I have a hunch that this crash may be caused by a CSV value which is too long, as addressed at the library level in:
- #440

But not yet addressed in the CLI tool, see:

- #444

Either way though, I really don't like that errors like this are swallowed!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155767915 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1155767915 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5E455r simonw 9599 2022-06-14T22:22:27Z 2022-06-14T22:22:27Z OWNER

I forgot to add equivalents of extras_key= and ignore_extras= to the CLI tool - will do that in a separate issue.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1155767202 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155767202 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E45ui simonw 9599 2022-06-14T22:21:10Z 2022-06-14T22:21:10Z OWNER

I can't figure out why that error is being swallowed like that. The most likely culprit was this code:

https://github.com/simonw/sqlite-utils/blob/1b09538bc6c1fda773590f3e600993ef06591041/sqlite_utils/cli.py#L1021-L1043

But I tried changing it like this:

diff --git a/sqlite_utils/cli.py b/sqlite_utils/cli.py
index 86eddfb..ed26fdd 100644
--- a/sqlite_utils/cli.py
+++ b/sqlite_utils/cli.py
@@ -1023,6 +1023,7 @@ def insert_upsert_implementation(
             docs, pk=pk, batch_size=batch_size, alter=alter, **extra_kwargs
         )
     except Exception as e:
+        raise
         if (
             isinstance(e, sqlite3.OperationalError)
             and e.args

And your steps to reproduce still got to 49% and then failed silently.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155764428 https://github.com/simonw/sqlite-utils/issues/432#issuecomment-1155764428 https://api.github.com/repos/simonw/sqlite-utils/issues/432 IC_kwDOCGYnMM5E45DM simonw 9599 2022-06-14T22:16:21Z 2022-06-14T22:16:21Z OWNER

Initial idea of how the .table() method would change:

diff --git a/sqlite_utils/db.py b/sqlite_utils/db.py
index 7a06304..3ecb40b 100644
--- a/sqlite_utils/db.py
+++ b/sqlite_utils/db.py
@@ -474,11 +474,12 @@ class Database:
             self._tracer(sql, None)
         return self.conn.executescript(sql)

-    def table(self, table_name: str, **kwargs) -> Union["Table", "View"]:
+    def table(self, table_name: str, alias: Optional[str] = None, **kwargs) -> Union["Table", "View"]:
         """
         Return a table object, optionally configured with default options.

         :param table_name: Name of the table
+        :param alias: The database alias to use, if referring to a table in another connected database
         """
         klass = View if table_name in self.view_names() else Table
         return klass(self, table_name, **kwargs)
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support `rows_where()`, `delete_where()` etc for attached alias databases 1236693079  
1155764064 https://github.com/simonw/sqlite-utils/issues/432#issuecomment-1155764064 https://api.github.com/repos/simonw/sqlite-utils/issues/432 IC_kwDOCGYnMM5E449g simonw 9599 2022-06-14T22:15:44Z 2022-06-14T22:15:44Z OWNER

Implementing this would be a pretty big change - initial instinct is that I'd need to introduce a self.alias property to Queryable (the subclass of Table and View) and a new self.name_with_alias getter which returns alias.tablename if alias is set to a not-None value. Then I'd need to rewrite every piece of code like this:

https://github.com/simonw/sqlite-utils/blob/1b09538bc6c1fda773590f3e600993ef06591041/sqlite_utils/db.py#L1161

To look like this instead:

        sql = "select {} from [{}]".format(select, self.name_with_alias)

But some parts would be harder - for example:

https://github.com/simonw/sqlite-utils/blob/1b09538bc6c1fda773590f3e600993ef06591041/sqlite_utils/db.py#L1227-L1231

Would have to know to query alias.sqlite_master instead.

The cached table counts logic like this would need a bunch of changes too:

https://github.com/simonw/sqlite-utils/blob/1b09538bc6c1fda773590f3e600993ef06591041/sqlite_utils/db.py#L644-L657
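A hypothetical sketch of the name_with_alias idea - the Queryable class here is a stand-in, not the real sqlite_utils class, and the bracket-quoting choice is just one possible design:

```python
# Sketch only: a minimal Queryable with the proposed alias support
class Queryable:
    def __init__(self, name, alias=None):
        self.name = name
        self.alias = alias  # alias of an ATTACHed database, or None

    @property
    def name_with_alias(self):
        # Quote the alias and table name separately so a "." in either
        # one cannot be misread as a database qualifier
        if self.alias is not None:
            return "[{}].[{}]".format(self.alias, self.name)
        return "[{}]".format(self.name)

print("select * from {}".format(Queryable("dogs", "otherdb").name_with_alias))
# select * from [otherdb].[dogs]
```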

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support `rows_where()`, `delete_where()` etc for attached alias databases 1236693079  
1155759857 https://github.com/simonw/sqlite-utils/issues/432#issuecomment-1155759857 https://api.github.com/repos/simonw/sqlite-utils/issues/432 IC_kwDOCGYnMM5E437x simonw 9599 2022-06-14T22:09:07Z 2022-06-14T22:09:07Z OWNER

Third option, and I think the one I like the best:

rows = db.table("tablename", alias="otherdb").rows_where()

The db.table(tablename) method already exists as an alternative to db[tablename]: https://sqlite-utils.datasette.io/en/stable/python-api.html#python-api-table-configuration

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support `rows_where()`, `delete_where()` etc for attached alias databases 1236693079  
1155758664 https://github.com/simonw/sqlite-utils/issues/432#issuecomment-1155758664 https://api.github.com/repos/simonw/sqlite-utils/issues/432 IC_kwDOCGYnMM5E43pI simonw 9599 2022-06-14T22:07:50Z 2022-06-14T22:07:50Z OWNER

Another potential fix: add a alias= parameter to rows_where() and other similar methods. Then you could do this:

rows = db["tablename"].rows_where(alias="otherdb")

This feels wrong to me: db["tablename"] is the bit that is supposed to return a table object. Having part of what that table object is exist as a parameter to other methods is confusing.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support `rows_where()`, `delete_where()` etc for attached alias databases 1236693079  
1155756742 https://github.com/simonw/sqlite-utils/issues/432#issuecomment-1155756742 https://api.github.com/repos/simonw/sqlite-utils/issues/432 IC_kwDOCGYnMM5E43LG simonw 9599 2022-06-14T22:05:38Z 2022-06-14T22:05:49Z OWNER

I don't like the idea of table_names() returning names of tables from connected databases as well, because it feels like it could lead to surprising behaviour - especially if those connected databases turn to have table names that are duplicated in the main connected database.

It would be neat if functions like .rows_where() worked though.

One thought would be to support something like this:

rows = db["otherdb.tablename"].rows_where()

But... `.` is a valid character in a SQLite table name. So "otherdb.tablename" might ambiguously refer to a table called tablename in a connected database with the alias otherdb, OR a table in the current database with the name otherdb.tablename.
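The ambiguity is easy to demonstrate - a quoted identifier may contain a dot, so this creates one table whose name is literally otherdb.tablename:

```python
import sqlite3

# "." is legal inside a quoted identifier: this is a single table named
# otherdb.tablename, not table tablename in an attached database otherdb
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "otherdb.tablename" (id INTEGER)')
print(conn.execute("SELECT name FROM sqlite_master").fetchone()[0])
# otherdb.tablename
```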

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support `rows_where()`, `delete_where()` etc for attached alias databases 1236693079  
1155753397 https://github.com/simonw/sqlite-utils/issues/431#issuecomment-1155753397 https://api.github.com/repos/simonw/sqlite-utils/issues/431 IC_kwDOCGYnMM5E42W1 simonw 9599 2022-06-14T22:01:38Z 2022-06-14T22:01:38Z OWNER

Yeah, I think it would be neat if the library could support self-referential many-to-many in a nice way.

I'm not sure about the left_name/right_name design though. Would it be possible to have this work as the user intends, by spotting that the other table name "people" matches the name of the current table?

db["people"].insert({"name": "Mary"}, pk="name").m2m(
    "people", [{"name": "Michael"}, {"name": "Suzy"}], m2m_table="parent_child", pk="name"
)

The created table could look like this:

CREATE TABLE [parent_child] (
   [people_id_1] TEXT REFERENCES [people]([name]),
   [people_id_2] TEXT REFERENCES [people]([name]),
   PRIMARY KEY ([people_id_1], [people_id_2])
)

I've not thought very hard about this, so the design I'm proposing here might not work.

Are there other reasons people might want the left_name= and right_name= parameters? If so then I'm much happier with those.
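As a sanity check, the proposed schema works fine as plain SQL - a sketch of what the generated table could behave like, not what .m2m() currently produces:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE people (name TEXT PRIMARY KEY);
    -- Self-referential m2m: both columns point back at people
    CREATE TABLE parent_child (
        people_id_1 TEXT REFERENCES people(name),
        people_id_2 TEXT REFERENCES people(name),
        PRIMARY KEY (people_id_1, people_id_2)
    );
    INSERT INTO people VALUES ('Mary'), ('Michael'), ('Suzy');
    INSERT INTO parent_child VALUES ('Mary', 'Michael'), ('Mary', 'Suzy');
    """
)
print(conn.execute("SELECT COUNT(*) FROM parent_child").fetchone()[0])  # 2
```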

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Allow making m2m relation of a table to itself 1227571375  
1155750270 https://github.com/simonw/sqlite-utils/issues/441#issuecomment-1155750270 https://api.github.com/repos/simonw/sqlite-utils/issues/441 IC_kwDOCGYnMM5E41l- simonw 9599 2022-06-14T21:57:57Z 2022-06-14T21:57:57Z OWNER

I added where= and where_args= parameters to that .search() method - updated documentation is here: https://sqlite-utils.datasette.io/en/latest/python-api.html#searching-with-table-search

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Combining `rows_where()` and `search()` to limit which rows are searched 1257724585  
1155749696 https://github.com/simonw/sqlite-utils/issues/433#issuecomment-1155749696 https://api.github.com/repos/simonw/sqlite-utils/issues/433 IC_kwDOCGYnMM5E41dA simonw 9599 2022-06-14T21:57:05Z 2022-06-14T21:57:05Z OWNER

Marking this as help wanted because I can't figure out how to replicate it!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CLI eats my cursor 1239034903  
1155748444 https://github.com/simonw/sqlite-utils/issues/442#issuecomment-1155748444 https://api.github.com/repos/simonw/sqlite-utils/issues/442 IC_kwDOCGYnMM5E41Jc simonw 9599 2022-06-14T21:55:15Z 2022-06-14T21:55:15Z OWNER

Documentation: https://sqlite-utils.datasette.io/en/latest/python-api.html#setting-the-maximum-csv-field-size-limit

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`maximize_csv_field_size_limit()` utility function 1269886084  
1155714131 https://github.com/simonw/sqlite-utils/issues/442#issuecomment-1155714131 https://api.github.com/repos/simonw/sqlite-utils/issues/442 IC_kwDOCGYnMM5E4sxT simonw 9599 2022-06-14T21:07:50Z 2022-06-14T21:07:50Z OWNER

Here's the commit where I added that originally, including a test: https://github.com/simonw/sqlite-utils/commit/1a93b72ba710ea2271eaabc204685a27d2469374

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`maximize_csv_field_size_limit()` utility function 1269886084  
1155672675 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1155672675 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5E4ipj simonw 9599 2022-06-14T20:19:07Z 2022-06-14T20:19:07Z OWNER

Documentation: https://sqlite-utils.datasette.io/en/latest/python-api.html#reading-rows-from-a-file

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 1,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1155672522 https://github.com/simonw/sqlite-utils/issues/443#issuecomment-1155672522 https://api.github.com/repos/simonw/sqlite-utils/issues/443 IC_kwDOCGYnMM5E4inK simonw 9599 2022-06-14T20:18:58Z 2022-06-14T20:18:58Z OWNER

New documentation: https://sqlite-utils.datasette.io/en/latest/python-api.html#reading-rows-from-a-file

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Make `utils.rows_from_file()` a documented API 1269998342  
1155666672 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1155666672 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5E4hLw simonw 9599 2022-06-14T20:11:52Z 2022-06-14T20:11:52Z OWNER

I'm going to rename restkey to extras_key for consistency with ignore_extras.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1155515426 https://github.com/simonw/sqlite-utils/issues/441#issuecomment-1155515426 https://api.github.com/repos/simonw/sqlite-utils/issues/441 IC_kwDOCGYnMM5E38Qi betatim 1448859 2022-06-14T17:53:43Z 2022-06-14T17:53:43Z NONE

That would be handy (additional where filters) but I think the trick with the `with` statement is already an order of magnitude better than what I had thought of, so my problem is solved by it (plus I got to learn about `with` today!)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Combining `rows_where()` and `search()` to limit which rows are searched 1257724585  
1155421299 https://github.com/simonw/sqlite-utils/issues/441#issuecomment-1155421299 https://api.github.com/repos/simonw/sqlite-utils/issues/441 IC_kwDOCGYnMM5E3lRz simonw 9599 2022-06-14T16:23:52Z 2022-06-14T16:23:52Z OWNER

Actually I have a thought for something that could help here: I could add a mechanism for inserting additional where filters and parameters into that .search() method.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Combining `rows_where()` and `search()` to limit which rows are searched 1257724585  
1155389614 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1155389614 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5E3diu simonw 9599 2022-06-14T15:54:03Z 2022-06-14T15:54:03Z OWNER

Filed an issue against python/typeshed:

  • https://github.com/python/typeshed/issues/8075
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1155364367 https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1155364367 https://api.github.com/repos/simonw/sqlite-utils/issues/412 IC_kwDOCGYnMM5E3XYP simonw 9599 2022-06-14T15:36:28Z 2022-06-14T15:36:28Z OWNER

Here's as far as I got with my initial prototype, in sqlite_utils/pandas.py:

from .db import Database as _Database, Table as _Table, View as _View
import pandas as pd
from typing import (
    Iterable,
    Union,
    Optional,
)


class Database(_Database):
    def query(
        self, sql: str, params: Optional[Union[Iterable, dict]] = None
    ) -> pd.DataFrame:
        return pd.DataFrame(super().query(sql, params))

    def table(self, table_name: str, **kwargs) -> Union["Table", "View"]:
        "Return a table object, optionally configured with default options."
        klass = View if table_name in self.view_names() else Table
        return klass(self, table_name, **kwargs)


class PandasQueryable:
    def rows_where(
        self,
        where: Optional[str] = None,
        where_args: Optional[Union[Iterable, dict]] = None,
        order_by: Optional[str] = None,
        select: str = "*",
        limit: Optional[int] = None,
        offset: Optional[int] = None,
    ) -> pd.DataFrame:
        return pd.DataFrame(
            super().rows_where(
                where,
                where_args,
                order_by=order_by,
                select=select,
                limit=limit,
                offset=offset,
            )
        )


class Table(PandasQueryable, _Table):
    pass


class View(PandasQueryable, _View):
    pass
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Optional Pandas integration 1160182768  
1155358637 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1155358637 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5E3V-t simonw 9599 2022-06-14T15:31:34Z 2022-06-14T15:31:34Z OWNER

Getting this past mypy is really hard!

% mypy sqlite_utils
sqlite_utils/utils.py:189: error: No overload variant of "pop" of "MutableMapping" matches argument type "None"
sqlite_utils/utils.py:189: note: Possible overload variants:
sqlite_utils/utils.py:189: note:     def pop(self, key: str) -> str
sqlite_utils/utils.py:189: note:     def [_T] pop(self, key: str, default: Union[str, _T] = ...) -> Union[str, _T]

That's because of this line:

row.pop(None)

Which is legit here - we have a dictionary where one of the keys is None and we want to remove that key. But the baked in type is apparently def pop(self, key: str) -> str.
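The runtime behaviour in question, isolated — the mypy complaint is purely about the declared key type, since None is a perfectly valid dict key:

```python
# This is the shape DictReader produces for a row with too many values:
# the surplus values land under the key None.
row = {"id": "1", "name": "Cleo", None: ["oops"]}

# Removing that key works fine at runtime; mypy only objects because the
# inferred mapping type declares str keys.
extras = row.pop(None)
# extras == ["oops"], and the None key is gone from row
```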

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1155350755 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1155350755 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5E3UDj simonw 9599 2022-06-14T15:25:18Z 2022-06-14T15:25:18Z OWNER

That broke mypy:

sqlite_utils/utils.py:229: error: Incompatible types in assignment (expression has type "Iterable[Dict[Any, Any]]", variable has type "DictReader[str]")

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1155317293 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1155317293 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5E3L4t simonw 9599 2022-06-14T15:04:01Z 2022-06-14T15:04:01Z OWNER

I think that's unavoidable: it looks like csv.Sniffer only works if you feed it a CSV file with an equal number of values in each row, which is understandable.
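For contrast, a quick check that Sniffer behaves well when every row has the same number of values (sample data is made up):

```python
import csv

# Three rows, each with exactly one ";" - the frequency analysis is
# consistent, so the delimiter is detected correctly.
sample = "id;name\n1;Cleo\n2;Barry\n"
dialect = csv.Sniffer().sniff(sample)
```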

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1155310521 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1155310521 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5E3KO5 simonw 9599 2022-06-14T14:58:50Z 2022-06-14T14:58:50Z OWNER

Interesting challenge in writing tests for this: if you give csv.Sniffer a short example with an invalid row in it sometimes it picks the wrong delimiter!

id,name\r\n1,Cleo,oops

It decided the delimiter there was e.
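A reproducible snippet of that misdetection (behaviour observed with CPython's csv.Sniffer; the exact guess could in principle vary between Python versions):

```python
import csv

# "," appears once in the first row but twice in the second, so it fails
# Sniffer's consistency check - while "e" appears exactly once per row,
# so the frequency analysis picks "e" as the delimiter.
dialect = csv.Sniffer().sniff("id,name\r\n1,Cleo,oops")
```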

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154475454 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154475454 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ez-W- simonw 9599 2022-06-13T21:52:03Z 2022-06-13T21:52:03Z OWNER

The exception will be called RowError.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154474482 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154474482 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ez-Hy simonw 9599 2022-06-13T21:50:59Z 2022-06-13T21:51:24Z OWNER

Decision: I'm going to default to raising an exception if a row has too many values in it.

You'll be able to pass ignore_extras=True to ignore those extra values, or pass restkey="the_rest" to stick them in a list in the restkey column.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154457893 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154457893 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ez6El simonw 9599 2022-06-13T21:29:02Z 2022-06-13T21:29:02Z OWNER

Here's the current function signature for rows_from_file():

https://github.com/simonw/sqlite-utils/blob/26e6d2622c57460a24ffdd0128bbaac051d51a5f/sqlite_utils/utils.py#L174-L179

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154457028 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154457028 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ez53E simonw 9599 2022-06-13T21:28:03Z 2022-06-13T21:28:03Z OWNER

Whatever I decide, I can implement it in rows_from_file(), maybe as an optional parameter - then decide how to call it from the sqlite-utils insert CLI (perhaps with a new option there too).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154456183 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154456183 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ez5p3 simonw 9599 2022-06-13T21:26:55Z 2022-06-13T21:26:55Z OWNER

So I need to make a design decision here: what should sqlite-utils do with CSV files that have rows with more values than there are headings?

Some options:

  • Ignore those extra fields entirely - silently drop that data. I'm not keen on this.
  • Throw an error. The library does this already, but the error is incomprehensible - it could turn into a useful, human-readable error instead.
  • Put the data in a JSON list in a column with a known name (None is not a valid column name, so not that). This could be something like _restkey or _values_with_no_heading. This feels like a better option, but I'd need to carefully pick a name for it - and come up with an answer for what to do if the CSV file being imported already uses that heading name for something else.
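The stdlib already half-implements that third option: DictReader accepts a restkey parameter that collects the extra values as a list under a heading of your choosing (the _values_with_no_heading name here is just one of the candidates mentioned above):

```python
import csv
import io

rows = list(csv.DictReader(
    io.StringIO("id,name\n1,Cleo,oops\n2,Barry"),
    restkey="_values_with_no_heading",
))
# [{'id': '1', 'name': 'Cleo', '_values_with_no_heading': ['oops']},
#  {'id': '2', 'name': 'Barry'}]
```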
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154454127 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154454127 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ez5Jv simonw 9599 2022-06-13T21:24:18Z 2022-06-13T21:24:18Z OWNER

That weird behaviour is documented here: https://docs.python.org/3/library/csv.html#csv.DictReader

If a row has more fields than fieldnames, the remaining data is put in a list and stored with the fieldname specified by restkey (which defaults to None). If a non-blank row has fewer fields than fieldnames, the missing values are filled-in with the value of restval (which defaults to None).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154453319 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154453319 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ez49H simonw 9599 2022-06-13T21:23:16Z 2022-06-13T21:23:16Z OWNER

Aha! I think I see what's happening here. Here's what DictReader does if one of the lines has too many items in it:

>>> import csv, io
>>> list(csv.DictReader(io.StringIO("id,name\n1,Cleo,nohead\n2,Barry")))
[{'id': '1', 'name': 'Cleo', None: ['nohead']}, {'id': '2', 'name': 'Barry'}]

See how that row with too many items gets this:
[{'id': '1', 'name': 'Cleo', None: ['nohead']}

That's a None for the key and (weirdly) a list containing the single item for the value!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154449442 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154449442 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ez4Ai simonw 9599 2022-06-13T21:18:26Z 2022-06-13T21:20:12Z OWNER

Here are full steps to replicate the bug:

from urllib.request import urlopen
import sqlite_utils
db = sqlite_utils.Database(memory=True)
with urlopen("https://artsdatabanken.no/Fab2018/api/export/csv") as fab:
    reader, other = sqlite_utils.utils.rows_from_file(fab, encoding="utf-16le")
    db["fab2018"].insert_all(reader, pk="Id")
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154396400 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154396400 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5EzrDw simonw 9599 2022-06-13T20:28:25Z 2022-06-13T20:28:25Z OWNER

Fixing that key thing (to ignore any key that is None) revealed a new bug:

File ~/Dropbox/Development/sqlite-utils/sqlite_utils/utils.py:376, in hash_record(record, keys)
    373 if keys is not None:
    374     to_hash = {key: record[key] for key in keys}
    375 return hashlib.sha1(
--> 376     json.dumps(to_hash, separators=(",", ":"), sort_keys=True, default=repr).encode(
    377         "utf8"
    378     )
    379 ).hexdigest()

File ~/.pyenv/versions/3.8.2/lib/python3.8/json/__init__.py:234, in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    232 if cls is None:
    233     cls = JSONEncoder
--> 234 return cls(
    235     skipkeys=skipkeys, ensure_ascii=ensure_ascii,
    236     check_circular=check_circular, allow_nan=allow_nan, indent=indent,
    237     separators=separators, default=default, sort_keys=sort_keys,
    238     **kw).encode(obj)

File ~/.pyenv/versions/3.8.2/lib/python3.8/json/encoder.py:199, in JSONEncoder.encode(self, o)
    195         return encode_basestring(o)
    196 # This doesn't pass the iterator directly to ''.join() because the
    197 # exceptions aren't as detailed.  The list call should be roughly
    198 # equivalent to the PySequence_Fast that ''.join() would do.
--> 199 chunks = self.iterencode(o, _one_shot=True)
    200 if not isinstance(chunks, (list, tuple)):
    201     chunks = list(chunks)

File ~/.pyenv/versions/3.8.2/lib/python3.8/json/encoder.py:257, in JSONEncoder.iterencode(self, o, _one_shot)
    252 else:
    253     _iterencode = _make_iterencode(
    254         markers, self.default, _encoder, self.indent, floatstr,
    255         self.key_separator, self.item_separator, self.sort_keys,
    256         self.skipkeys, _one_shot)
--> 257 return _iterencode(o, 0)

TypeError: '<' not supported between instances of 'NoneType' and 'str'
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154387591 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154387591 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ezo6H simonw 9599 2022-06-13T20:17:51Z 2022-06-13T20:17:51Z OWNER

I don't understand why that works but calling insert_all() does not.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154386795 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154386795 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ezotr simonw 9599 2022-06-13T20:16:53Z 2022-06-13T20:16:53Z OWNER

Steps to demonstrate that sqlite-utils insert is not affected:

curl -o artsdatabanken.csv https://artsdatabanken.no/Fab2018/api/export/csv
sqlite-utils insert arts.db artsdatabanken artsdatabanken.csv --sniff --csv --encoding utf-16le
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154385916 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154385916 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ezof8 simonw 9599 2022-06-13T20:15:49Z 2022-06-13T20:15:49Z OWNER

rows_from_file() isn't part of the documented API but maybe it should be!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154373361 https://github.com/simonw/sqlite-utils/issues/441#issuecomment-1154373361 https://api.github.com/repos/simonw/sqlite-utils/issues/441 IC_kwDOCGYnMM5Ezlbx simonw 9599 2022-06-13T20:01:25Z 2022-06-13T20:01:25Z OWNER

Yeah, at the moment the best way to do this is with search_sql(), but you're right it really isn't very intuitive.

Here's how I would do this, using a CTE trick to combine the queries:

search_sql = db["articles"].search_sql(columns=["title", "author"])
sql = f"""
with search_results as ({search_sql})
select * from search_results where owner = :owner
"""
results = db.query(sql, {"query": "my search query", "owner": "my owner"})

I'm not sure if sqlite-utils should ever evolve to provide a better way of doing this kind of thing to be honest - if it did, it would turn into more of an ORM. Something like PeeWee may be a better option here.
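The same CTE pattern in plain sqlite3, with a LIKE filter standing in for the FTS query that search_sql() would generate (table, column names, and data are illustrative):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    create table articles (id integer primary key, title text, owner text);
    insert into articles values
        (1, 'sqlite tips', 'alice'),
        (2, 'sqlite tricks', 'bob');
""")

# Wrap the "search" query in a CTE, then apply the extra filter on top of it.
sql = """
    with search_results as (
        select * from articles where title like :query
    )
    select id, title, owner from search_results where owner = :owner
"""
rows = db.execute(sql, {"query": "%sqlite%", "owner": "alice"}).fetchall()
# rows == [(1, 'sqlite tips', 'alice')]
```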

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Combining `rows_where()` and `search()` to limit which rows are searched 1257724585  
1151887842 https://github.com/simonw/datasette/issues/1528#issuecomment-1151887842 https://api.github.com/repos/simonw/datasette/issues/1528 IC_kwDOBm6k_c5EqGni eyeseast 25778 2022-06-10T03:23:08Z 2022-06-10T03:23:08Z CONTRIBUTOR

I just put together a version of this in a plugin: https://github.com/eyeseast/datasette-query-files. Happy to have any feedback.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add new `"sql_file"` key to Canned Queries in metadata? 1060631257  
1147435032 https://github.com/simonw/datasette/pull/1753#issuecomment-1147435032 https://api.github.com/repos/simonw/datasette/issues/1753 IC_kwDOBm6k_c5EZHgY codecov[bot] 22429695 2022-06-06T13:15:11Z 2022-06-06T13:15:11Z NONE

Codecov Report

Merging #1753 (23a8515) into main (2e97516) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #1753   +/-   ##
=======================================
  Coverage   91.67%   91.67%           
=======================================
  Files          36       36           
  Lines        4658     4658           
=======================================
  Hits         4270     4270           
  Misses        388      388           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2e97516...23a8515. Read the comment docs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Bump furo from 2022.4.7 to 2022.6.4.1 1261826957  
1142556455 https://github.com/simonw/datasette/pull/1740#issuecomment-1142556455 https://api.github.com/repos/simonw/datasette/issues/1740 IC_kwDOBm6k_c5EGgcn simonw 9599 2022-05-31T19:25:49Z 2022-05-31T19:25:49Z OWNER

Thanks, this looks like a good idea to me.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
chore: Set permissions for GitHub actions 1226106354  
1141711418 https://github.com/simonw/sqlite-utils/issues/26#issuecomment-1141711418 https://api.github.com/repos/simonw/sqlite-utils/issues/26 IC_kwDOCGYnMM5EDSI6 nileshtrivedi 19304 2022-05-31T06:21:15Z 2022-05-31T06:21:15Z NONE

I ran into this. My use case has a JSON file with an array of book objects, each with a key called reviews which is itself an array of objects. My JSON is human-edited and does not specify IDs for either books or reviews. Because sqlite-utils does not support inserting nested objects, I instead have to maintain two separate CSV files, with an id column in books.csv and a book_id column in reviews.csv.

I think the right way to handle this when inserting JSON might be to declare the relationships explicitly:

sqlite-utils insert data.db books mydata.json --hasmany reviews --hasone author --manytomany tags

This relies on the assumption that foreign keys can point to the rowid primary key.
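A plain-sqlite3 sketch of the current workaround: flattening the nested reviews into a second table that points back at books via the rowid primary key (table and column names here are illustrative, and this is not a sqlite-utils API):

```python
import sqlite3

books = [{"title": "Dune", "reviews": [{"stars": 5}, {"stars": 4}]}]

db = sqlite3.connect(":memory:")
db.execute("create table books (id integer primary key, title text)")
db.execute("""
    create table reviews (
        id integer primary key,
        book_id integer references books(id),
        stars integer
    )
""")

for book in books:
    cur = db.execute("insert into books (title) values (?)", (book["title"],))
    book_id = cur.lastrowid  # the rowid primary key the foreign key points at
    db.executemany(
        "insert into reviews (book_id, stars) values (?, ?)",
        [(book_id, r["stars"]) for r in book["reviews"]],
    )
```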

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for turning nested JSON into foreign keys / many-to-many 455486286  
1141488533 https://github.com/simonw/sqlite-utils/pull/437#issuecomment-1141488533 https://api.github.com/repos/simonw/sqlite-utils/issues/437 IC_kwDOCGYnMM5ECbuV simonw 9599 2022-05-30T21:32:36Z 2022-05-30T21:32:36Z OWNER

Thanks!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
docs to dogs 1244294227  
1140321380 https://github.com/simonw/datasette/issues/1751#issuecomment-1140321380 https://api.github.com/repos/simonw/datasette/issues/1751 IC_kwDOBm6k_c5D9-xk knutwannheden 408765 2022-05-28T19:52:17Z 2022-05-28T19:52:17Z NONE

Closing in favor of existing issue #1298.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add scrollbars to table presentation in default layout 1251710928  
1139426398 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1139426398 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5D6kRe frafra 4068 2022-05-27T09:04:05Z 2022-05-27T10:44:54Z NONE

This code works:

import csv
import sqlite_utils
db = sqlite_utils.Database("test.db")
reader = csv.DictReader(open("csv", encoding="utf-16-le").read().split("\r\n"), delimiter=";")
db["test"].insert_all(reader, pk="Id")

I used iconv to change the encoding; sqlite-utils can import the resulting file, even though the progress bar stops at 98%:

sqlite-utils insert --csv test test.db clean 
  [------------------------------------]    0%
  [###################################-]   98%  00:00:00
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1139484453 https://github.com/simonw/sqlite-utils/issues/433#issuecomment-1139484453 https://api.github.com/repos/simonw/sqlite-utils/issues/433 IC_kwDOCGYnMM5D6ycl frafra 4068 2022-05-27T10:20:08Z 2022-05-27T10:20:08Z NONE

I can confirm. This only happens with sqlite-utils. I am using gnome-terminal with bash.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CLI eats my cursor 1239034903  
1139392769 https://github.com/simonw/sqlite-utils/issues/438#issuecomment-1139392769 https://api.github.com/repos/simonw/sqlite-utils/issues/438 IC_kwDOCGYnMM5D6cEB frafra 4068 2022-05-27T08:21:53Z 2022-05-27T08:21:53Z NONE

The arguments were specified in the wrong order. PATH TABLE FILE can be misleading :)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
illegal UTF-16 surrogate 1250161887  
1139379923 https://github.com/simonw/sqlite-utils/issues/438#issuecomment-1139379923 https://api.github.com/repos/simonw/sqlite-utils/issues/438 IC_kwDOCGYnMM5D6Y7T frafra 4068 2022-05-27T08:05:01Z 2022-05-27T08:05:01Z NONE

I tried to debug it using pdb, but it looks like sqlite-utils catches the exception, so it is not easy to figure out where the failure is happening.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
illegal UTF-16 surrogate 1250161887  
1133417432 https://github.com/simonw/sqlite-utils/issues/435#issuecomment-1133417432 https://api.github.com/repos/simonw/sqlite-utils/issues/435 IC_kwDOCGYnMM5DjpPY simonw 9599 2022-05-20T21:56:10Z 2022-05-20T21:56:10Z OWNER

Before:

After:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch to Furo documentation theme 1243704847  
1133416698 https://github.com/simonw/sqlite-utils/issues/435#issuecomment-1133416698 https://api.github.com/repos/simonw/sqlite-utils/issues/435 IC_kwDOCGYnMM5DjpD6 simonw 9599 2022-05-20T21:54:43Z 2022-05-20T21:54:43Z OWNER

Done: https://sqlite-utils.datasette.io/en/latest/reference.html

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch to Furo documentation theme 1243704847  
1133396285 https://github.com/simonw/datasette/issues/1746#issuecomment-1133396285 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjkE9 simonw 9599 2022-05-20T21:28:29Z 2022-05-20T21:28:29Z OWNER

That fixed it:

https://user-images.githubusercontent.com/9599/169614893-ec81fe0e-6043-4d7d-b429-7f087dbeaf61.png

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133348094 https://github.com/simonw/datasette/issues/1746#issuecomment-1133348094 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjYT- simonw 9599 2022-05-20T20:40:09Z 2022-05-20T20:40:09Z OWNER

Relevant JavaScript: https://github.com/simonw/datasette/blob/1d33fd03b3c211e0f48a8f3bde83880af89e4e69/docs/_static/js/custom.js#L20-L24

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133347051 https://github.com/simonw/datasette/issues/1746#issuecomment-1133347051 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjYDr simonw 9599 2022-05-20T20:39:17Z 2022-05-20T20:39:17Z OWNER

Now live at https://docs.datasette.io/en/latest/ - the JavaScript that adds the banner about that not being the stable version doesn't seem to work though.

Before:

https://user-images.githubusercontent.com/9599/169607254-99bc0358-4a08-43bf-9aac-a24cc2121979.png

After:

https://user-images.githubusercontent.com/9599/169607296-bcec1ed2-517c-4acc-a9a9-d1119d0cc589.png

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1081861670 https://github.com/simonw/datasette/pull/1693#issuecomment-1081861670 https://api.github.com/repos/simonw/datasette/issues/1693 IC_kwDOBm6k_c5Ae-Ym codecov[bot] 22429695 2022-03-29T13:18:47Z 2022-05-20T20:36:30Z NONE

Codecov Report

Merging #1693 (65a5d5e) into main (1465fea) will not change coverage.
The diff coverage is n/a.

:exclamation: Current head 65a5d5e differs from pull request most recent head ec2d1e4. Consider uploading reports for the commit ec2d1e4 to get more accurate results

@@           Coverage Diff           @@
##             main    #1693   +/-   ##
=======================================
  Coverage   91.67%   91.67%           
=======================================
  Files          36       36           
  Lines        4658     4658           
=======================================
  Hits         4270     4270           
  Misses        388      388           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1d33fd0...ec2d1e4. Read the comment docs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Bump black from 22.1.0 to 22.3.0 1184850337  
1133335940 https://github.com/simonw/datasette/issues/1746#issuecomment-1133335940 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjVWE simonw 9599 2022-05-20T20:30:29Z 2022-05-20T20:30:29Z OWNER

I think the trick will be to extend Furo's base.html template using the same approach I used in https://til.simonwillison.net/readthedocs/custom-sphinx-templates

https://github.com/pradyunsg/furo/blob/2022.04.07/src/furo/theme/furo/base.html - the site_meta block looks good.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133333144 https://github.com/simonw/datasette/issues/1746#issuecomment-1133333144 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjUqY simonw 9599 2022-05-20T20:28:25Z 2022-05-20T20:28:25Z OWNER

One last question: how to include the Plausible analytics?

Furo doesn't have any specific tools for this:

  • https://github.com/pradyunsg/furo/discussions/243
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 323.553ms · About: github-to-sqlite