issue_comments (table in the github database)

8,069 rows sorted by updated_at descending
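The sort order above comes from Datasette's table-view query-string parameters, and the same view is exposed as a JSON endpoint by appending `.json` to the table URL. A minimal sketch of building that export URL — the base URL here is a placeholder, not the real instance; `_sort_desc` and `_facet` are standard Datasette table-view parameters:

```python
from urllib.parse import urlencode

def datasette_json_url(base, database, table, **params):
    """Build a Datasette JSON export URL for a table view."""
    query = urlencode(params)
    return f"{base}/{database}/{table}.json" + (f"?{query}" if query else "")

# Placeholder host -- substitute the actual Datasette instance.
url = datasette_json_url(
    "https://example-datasette.example.com",
    "github",
    "issue_comments",
    _sort_desc="updated_at",  # sort newest first, as on this page
    _facet="issue",           # request the issue facet in the JSON response
)
print(url)
```

Other table-view parameters (`_size`, `_facet_size`, filter arguments like `issue=1234`) compose the same way.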


Faceted by issue — more than 1,000 distinct values; each issue title below is followed by its comment count:

  • Show column metadata plus links for foreign keys on arbitrary query results 50
  • Redesign default .json format 48
  • Rethink how .ext formats (v.s. ?_format=) works before 1.0 48
  • JavaScript plugin hooks mechanism similar to pluggy 47
  • Updated Dockerfile with SpatiaLite version 5.0 45
  • Complete refactor of TableView and table.html template 45
  • Port Datasette to ASGI 42
  • Authentication (and permissions) as a core concept 40
  • Deploy a live instance of demos/apache-proxy 34
  • await datasette.client.get(path) mechanism for executing internal requests 33
  • Maintain an in-memory SQLite table of connected databases and their tables 32
  • Ability to sort (and paginate) by column 31
  • Research: demonstrate if parallel SQL queries are worthwhile 31
  • link_or_copy_directory() error - Invalid cross-device link 28
  • Export to CSV 27
  • base_url configuration setting 27
  • Documentation with recommendations on running Datasette in production without using Docker 27
  • Optimize all those calls to index_list and foreign_key_list 27
  • Support cross-database joins 26
  • Ability for a canned query to write to the database 26
  • table.transform() method for advanced alter table 26
  • New pattern for views that return either JSON or HTML, available for plugins 26
  • Proof of concept for Datasette on AWS Lambda with EFS 25
  • WIP: Add Gmail takeout mbox import 25
  • Redesign register_output_renderer callback 24
  • Make it easier to insert geometries, with documentation and maybe code 24
  • "datasette insert" command and plugin hook 23
  • Datasette Plugins 22
  • .json and .csv exports fail to apply base_url 22
  • Idea: import CSV to memory, run SQL, export in a single command 22
  • Plugin hook for dynamic metadata 22
  • base_url is omitted in JSON and CSV views 22
  • Handle spatialite geometry columns better 21
  • table.extract(...) method and "sqlite-utils extract" command 21
  • Database page loads too slowly with many large tables (due to table counts) 21
  • ?sort=colname~numeric to sort by by column cast to real 21
  • Switch documentation theme to Furo 21
  • "flash messages" mechanism 20
  • Move CI to GitHub Issues 20
  • load_template hook doesn't work for include/extends 20
  • Mechanism for storing metadata in _metadata tables 20
  • Introduce concept of a database `route`, separate from its name 20
  • CSV files with too many values in a row cause errors 20
  • Better way of representing binary data in .csv output 19
  • Introspect if table is FTS4 or FTS5 19
  • A proper favicon 19
  • Package as standalone binary 18
  • Ability to ship alpha and beta releases 18
  • Magic parameters for canned queries 18
  • Support column descriptions in metadata.json 18
  • datasette.client internal requests mechanism 18
  • Figure out why SpatiaLite 5.0 hangs the database page on Linux 18
  • Publish to Docker Hub failing with "libcrypt.so.1: cannot open shared object file" 17
  • Facets 16
  • ?_col= and ?_nocol= support for toggling columns on table view 16
  • Support "allow" block on root, databases and tables, not just queries 16
  • Action menu for table columns 16
  • Make it possible to download BLOB data from the Datasette UI 16
  • `--batch-size 1` doesn't seem to commit for every item 16
  • create-index should run analyze after creating index 16
  • Add new spatialite helper methods 16
  • Bug: Sort by column with NULL in next_page URL 15
  • Mechanism for customizing the SQL used to select specific columns in the table view 15
  • The ".upsert()" method is misnamed 15
  • --dirs option for scanning directories for SQLite databases 15
  • Document (and reconsider design of) Database.execute() and Database.execute_against_connection_in_thread() 15
  • latest.datasette.io is no longer updating 15
  • "sqlite-utils convert" command to replace the separate "sqlite-transform" tool 15
  • --lines and --text and --convert and --import 15
  • Ability to customize presentation of specific columns in HTML view 14
  • Allow plugins to define additional URL routes and views 14
  • Mechanism for turning nested JSON into foreign keys / many-to-many 14
  • "Invalid SQL" page should let you edit the SQL 14
  • .execute_write() and .execute_write_fn() methods on Database 14
  • Upload all my photos to a secure S3 bucket 14
  • Canned query permissions mechanism 14
  • Incorrect URLs when served behind a proxy with base_url set 14
  • "datasette -p 0 --root" gives the wrong URL 14
  • Plugin hook for loading templates 14
  • sqlite-utils extract could handle nested objects 14
  • Advanced class-based `conversions=` mechanism 14
  • Design plugin hook for extras 14
  • Dockerfile should build more recent SQLite with FTS5 and spatialite support 13
  • Fix all the places that currently use .inspect() data 13
  • Plugin hook: filters_from_request 13
  • Get Datasette tests passing on Windows in GitHub Actions 13
  • If you apply ?_facet_array=tags then &_facet=tags does nothing 13
  • Mechanism for adding arbitrary pages like /about 13
  • Prototoype for Datasette on PostgreSQL 13
  • Import machine-learning detected labels (dog, llama etc) from Apple Photos 13
  • Mechanism for skipping CSRF checks on API posts 13
  • table.transform() method 13
  • Policy on documenting "public" datasette.utils functions 13
  • Async support 13
  • Serve using UNIX domain socket 13
  • `register_commands()` plugin hook to register extra CLI commands 13
  • Fix compatibility with Python 3.10 13
  • Support STRICT tables 13
  • Optional Pandas integration 13
  • Refactor TableView to use asyncinject 13
  • Add “updated” to metadata 12
  • Metadata should be a nested arbitrary KV store 12
  • Mechanism for ranking results from SQLite full-text search 12
  • Sanely handle Infinity/-Infinity values in JSON using ?_json_infinity=1 12
  • Package datasette for installation using homebrew 12
  • Datasette Library 12
  • _facet_array should work against views 12
  • Full text search of all tables at once? 12
  • Port Datasette from Sanic to ASGI + Uvicorn 12
  • Populate "endpoint" key in ASGI scope 12
  • --cp option for datasette publish and datasette package for shipping additional files and directories 12
  • base_url doesn't entirely work for running Datasette inside Binder 12
  • Having view-table permission but NOT view-database should still grant access to /db/table 12
  • register_output_renderer() should support streaming data 12
  • Efficiently calculate list of databases/tables a user can view 12
  • Support creating descending order indexes 12
  • Consider using CSP to protect against future XSS 12
  • Rethink approach to [ and ] in column names (currently throws error) 12
  • Research: CTEs and union all to calculate facets AND query at the same time 12
  • Traces should include SQL executed by subtasks created with `asyncio.gather` 12
  • Ensure "pip install datasette" still works with Python 3.6 12
  • Tilde encoding: use ~ instead of - for dash-encoding 12
  • Code examples in the documentation should be formatted with Black 12
  • Implement ?_extra and new API design for TableView 12
  • Misleading progress bar against utf-16-le CSV input 12
  • Implement command-line tool interface 11
  • Option to expose expanded foreign keys in JSON/CSV 11
  • Mechanism for checking if a SQLite database file is safe to open 11
  • Expand plugins documentation to multiple pages 11
  • Mechanism for plugins to add action menu items for various things 11
  • --since feature can be confused by retweets 11
  • bpylist.archiver.CircularReference: archive has a cycle with uid(13) 11
  • Datasette secret mechanism - initially for signed cookies 11
  • Writable canned queries live demo on Glitch 11
  • base_url doesn't seem to work when adding criteria and clicking "apply" 11
  • POST to /db/canned-query that returns JSON should be supported (for API clients) 11
  • datasette.urls.table() / .instance() / .database() methods for constructing URLs, also exposed to templates 11
  • Writable canned queries with magic parameters fail if POST body is empty 11
  • Database class mechanism for cross-connection in-memory databases 11
  • Race condition errors in new refresh_schemas() mechanism 11
  • Option for importing CSV data using the SQLite .import mechanism 11
  • "Query parameters" form shows wrong input fields if query contains "03:31" style times 11
  • render_cell() hook should support returning an awaitable 11
  • sqlite-utils index-foreign-keys fails due to pre-existing index 11
  • `sqlite-utils insert --convert` option 11
  • Research how much of a difference analyze / sqlite_stat1 makes 11
  • Table+query JSON and CSV links broken when using `base_url` setting 11
  • Options for how `r.parsedate()` should handle invalid dates 11
  • Research: how much overhead does the n=1 time limit have? 11
  • Document how to use a `--convert` function that runs initialization code first 11
  • Writable canned queries fail with useless non-error against immutable databases 11
  • Set up some example datasets on a Cloudflare-backed domain 10
  • Filter UI on table page 10
  • Support for units 10
  • datasette publish lambda plugin 10
  • Add ?_extra= mechanism for requesting extra properties in JSON 10
  • Build Dockerfile with recent Sqlite + Spatialite 10
  • Table view should support filtering via many-to-many relationships 10
  • Default to opening files in mutable mode, special option for immutable files 10
  • New design for facet abstraction, including querystring and metadata.json 10
  • Syntactic sugar for creating m2m records 10
  • Option to display binary data 10
  • extracts= should support multiple-column extracts 10
  • Documented internals API for use in plugins 10
  • Mechanism for writing to database via a queue 10
  • See if I can get Datasette working on Zeit Now v2 10
  • Ability to serve thumbnailed Apple Photo from its place on disk 10
  • Release Datasette 0.44 10
  • Rename master branch to main 10
  • Plugin hook for instance/database/table metadata 10
  • Refactor default views to use register_routes 10
  • CLI utility for inserting binary files into SQLite 10
  • FTS table with 7 rows has _fts_docsize table with 9,141 rows 10
  • Switch to .blob render extension for BLOB downloads 10
  • Navigation menu plus plugin hook 10
  • Use YAML examples in documentation by default, not JSON 10
  • Adopt Prettier for JavaScript code formatting 10
  • Ability for plugins to collaborate when adding extra HTML to blocks in default templates 10
  • --no-headers option for CSV and TSV 10
  • Add support for Jinja2 version 3.0 10
  • Test Datasette Docker images built for different architectures 10
  • Research: syntactic sugar for using --get with SQL queries, maybe "datasette query" 10
  • Add reference page to documentation using Sphinx autodoc 10
  • [Enhancement] Please allow 'insert-files' to insert content as text. 10
  • Docker configuration for exercising Datasette behind Apache mod_proxy 10
  • Python library methods for calling ANALYZE 10
  • Documentation should clarify /stable/ vs /latest/ 10
  • Remove Hashed URL mode 10
  • Config file with support for defining canned queries 9
  • Datasette serve should accept paths/URLs to CSVs and other file formats 9
  • Figure out some interesting example SQL queries 9
  • bump uvicorn to 0.9.0 to be Python-3.8 friendly 9
  • Refactor TableView.data() method 9
  • Set up a live demo Datasette instance 9
  • Move hashed URL mode out to a plugin 9
  • ?_searchmode=raw option for running FTS searches without escaping characters 9
  • Option to automatically configure based on directory layout 9
  • Replace "datasette publish --extra-options" with "--setting" 9
  • New WIP writable canned queries 9
  • Example permissions plugin 9
  • Research feasibility of 100% test coverage 9
  • canned_queries() plugin hook 9
  • Consider dropping explicit CSRF protection entirely? 9
  • Add insert --truncate option 9
  • Improve performance of extract operations 9
  • Figure out how to run an environment that exercises the base_url proxy setting 9
  • sqlite-utils search command 9
  • GENERATED column support 9
  • Datasette on Amazon Linux on ARM returns 404 for static assets 9
  • "Stream all rows" is not at all obvious 9
  • Better internal database_name for _internal database 9
  • Mechanism for minifying JavaScript that ships with Datasette 9
  • Mechanism for executing JavaScript unit tests 9
  • Use _counts to speed up counts 9
  • Use force_https_urls on when deploying with Cloud Run 9
  • Ability to increase size of the SQL editor window 9
  • Custom pages don't work with base_url setting 9
  • CSV ?_stream=on redundantly calculates facets for every page 9
  • "invalid reference format" publishing Docker image 9
  • `default_allow_sql` setting (a re-imagining of the old `allow_sql` setting) 9
  • Show count of facet values if ?_facet_size=max 9
  • Manage /robots.txt in Datasette core, block robots by default 9
  • Test against pysqlite3 running SQLite 3.37 9
  • Allow to set `facets_array` in metadata (like current `facets`) 9
  • Add SpatiaLite helpers to CLI 9
  • Get Datasette compatible with Pyodide 9
  • Make URLs immutable 8
  • datasette publish heroku 8
  • Ability to bundle and serve additional static files 8
  • Add GraphQL endpoint 8
  • prepare_context() plugin hook 8
  • Wildcard support in query parameters 8
  • URL hashing now optional: turn on with --config hash_urls:1 (#418) 8
  • "datasette publish cloudrun" command to publish to Google Cloud Run 8
  • Add register_output_renderer hook 8
  • Improvements to table label detection 8
  • sqlite-utils create-table command 8
  • Stream all results for arbitrary SQL and canned queries 8
  • Add a universal navigation bar which can be modified by plugins 8
  • Command to fetch stargazers for one or more repos 8
  • allow leading comments in SQL input field 8
  • Helper methods for working with SpatiaLite 8
  • datasette publish cloudrun --memory option 8
  • Commits in GitHub API can have null author 8
  • extra_template_vars() sending wrong view_name for index 8
  • Import photo metadata from Apple Photos into SQLite 8
  • Visually distinguish integer and text columns 8
  • sqlite3.OperationalError: too many SQL variables in insert_all when using rows with varying numbers of columns 8
  • Allow-list pragma_table_info(tablename) and similar 8
  • Rename project to dogsheep-photos 8
  • Consolidate request.raw_args and request.args 8
  • Group permission checks by request on /-/permissions debug page 8
  • Upgrade CodeMirror 8
  • Mechanism for defining custom display of results 8
  • .delete_where() does not auto-commit (unlike .insert() or .upsert()) 8
  • the JSON object must be str, bytes or bytearray, not 'Undefined' 8
  • Wide tables should scroll horizontally within the page 8
  • OPTIONS requests return a 500 error 8
  • Bring date parsing into Datasette core 8
  • Establish pattern for release branches to support bug fixes 8
  • GitHub Actions workflow to build and sign macOS binary executables 8
  • Make original path available to render hooks 8
  • --sniff option for sniffing delimiters 8
  • --crossdb option for joining across databases 8
  • sqlite-utils memory command for directly querying CSV/JSON data 8
  • absolute_url() behind a proxy assembles incorrect http://127.0.0.1:8001/ URLs 8
  • Tests failing with FileNotFoundError in runner.isolated_filesystem 8
  • Rename Datasette.__init__(config=) parameter to settings= 8
  • Allow passing a file of code to "sqlite-utils convert" 8
  • Documented JavaScript variables on different templates made available for plugins 8
  • Support for generated columns 8
  • Get rid of the no-longer necessary ?_format=json hack for tables called x.json 8
  • Refactor and simplify Datasette routing and views 8
  • Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply 8
  • "Error: near "(": syntax error" when using sqlite-utils indexes CLI 8
  • ?_group_count=country - return counts by specific column(s) 7
  • Ship a Docker image of the whole thing 7
  • add "format sql" button to query page, uses sql-formatter 7
  • Windows installation error 7
  • Keyset pagination doesn't work correctly for compound primary keys 7
  • inspect should record column types 7
  • Travis should push tagged images to Docker Hub for each release 7
  • Improve and document foreign_keys=... argument to insert/create/etc 7
  • ?_where=sql-fragment parameter for table views 7
  • Define mechanism for plugins to return structured data 7
  • Utility mechanism for plugins to render templates 7
  • Syntax for ?_through= that works as a form field 7
  • Problem with square bracket in CSV column name 7
  • Update SQLite bundled with Docker container 7
  • index.html is not reliably loaded from a plugin 7
  • .columns_dict doesn't work for all possible column types 7
  • Only set .last_rowid and .last_pk for single update/inserts, not for .insert_all()/.upsert_all() with multiple records 7
  • Expose scores from ZCOMPUTEDASSETATTRIBUTES 7
  • publish heroku does not work on Windows 10 7
  • Demo is failing to deploy 7
  • Support reverse pagination (previous page, has-previous-items) 7
  • Docker container is no longer being pushed (it's stuck on 0.45) 7
  • insert_all(..., alter=True) should work for new columns introduced after the first 100 records 7
  • Push to Docker Hub failed - but it shouldn't run for alpha releases anyway 7
  • Simplify imports of common classes 7
  • SQLITE_MAX_VARS maybe hard-coded too low 7
  • Commands for making authenticated API calls 7
  • Pagination 7
  • Support the dbstat table 7
  • Much, much faster extract() implementation 7
  • Documented HTML hooks for JavaScript plugin authors 7
  • Redesign application homepage 7
  • "Edit SQL" button on canned queries 7
  • Fix last remaining links to "/" that do not respect base_url 7
  • .extract() shouldn't extract null values 7
  • export.xml file name varies with different language settings 7
  • "View all" option for facets, to provide a (paginated) list of ALL of the facet counts plus a link to view them 7
  • table.pks_and_rows_where() method returning primary keys along with the rows 7
  • Invalid SQL: "no such table: pragma_database_list" on database page 7
  • Latest Datasette tags missing from Docker Hub 7
  • "More" link for facets that shows _facet_size=max results 7
  • ?_nocol= does not interact well with default facets 7
  • sqlite-utils memory should handle TSV and JSON in addition to CSV 7
  • Introspection property for telling if a table is a rowid table 7
  • Query page .csv and .json links are not correctly URL-encoded on Vercel under unknown specific conditions 7
  • New pattern for async view classes 7
  • Extra options to `lookup()` which get passed to `insert()` 7
  • Columns starting with an underscore behave poorly in filters 7
  • Test failure in test_rebuild_fts 7
  • `.execute_write(... block=True)` should be the default behaviour 7
  • Maybe let plugins define custom serve options? 7
  • Add SpatiaLite helpers to CLI 7
  • Use dash encoding for table names and row primary keys in URLs 7
  • I forgot to include the changelog in the 3.25.1 release 7
  • Remove hashed URL mode 7
  • Extract out `check_permissions()` from `BaseView` 7
  • `--nolock` feature for opening locked databases 7
  • Addressable pages for every row in a table 6
  • Default HTML/CSS needs to look reasonable and be responsive 6
  • Support Django-style filters in querystring arguments 6
  • Detect foreign keys and use them to link HTML pages together 6
  • [WIP] Add publish to heroku support 6
  • Nasty bug: last column not being correctly displayed 6
  • Figure out how to bundle a more up-to-date SQLite 6
  • Don't duplicate simple primary keys in the link column 6
  • Load plugins from a `--plugins-dir=plugins/` directory 6
  • Ability for plugins to define extra JavaScript and CSS 6
  • inspect() should detect many-to-many relationships 6
  • Deploy demo of Datasette on every commit that passes tests 6
  • Plugin hook for loading metadata.json 6
  • Faceted browse against a JSON list of tags 6
  • CSV export in "Advanced export" pane doesn't respect query 6
  • Additional Column Constraints? 6
  • Rename metadata.json to config.json 6
  • Easier way of creating custom row templates 6
  • Experiment with type hints 6
  • Command for running a search and saving tweets for that search 6
  • Handle really wide tables better 6
  • Ways to improve fuzzy search speed on larger data sets? 6
  • Improve UI of "datasette publish cloudrun" to reduce chances of accidentally over-writing a service 6
  • Mechanism for indicating foreign key relationships in the table and query page URLs 6
  • updating metadata.json without recreating the app 6
  • Provide a cookiecutter template for creating new plugins 6
  • upsert_all() throws issue when upserting to empty table 6
  • "Templates considered" comment broken in >=0.35 6
  • Documentation for the "request" object 6
  • Support YAML in metadata - metadata.yaml 6
  • Command for retrieving dependents for a repo 6
  • Question: Access to immutable database-path 6
  • Support decimal.Decimal type 6
  • allow_by_query setting for configuring permissions with a SQL statement 6
  • python tests/fixtures.py command has a bug 6
  • Mechanism for specifying allow_sql permission in metadata.json 6
  • Way to enable a default=False permission for anonymous users 6
  • Ability to set ds_actor cookie such that it expires 6
  • startup() plugin hook 6
  • "Too many open files" error running tests 6
  • datasette.add_message() doesn't work inside plugins 6
  • Datasette sdist is missing templates (hence broken when installing from Homebrew) 6
  • End-user documentation 6
  • extra_ plugin hooks should take the same arguments 6
  • Mechanism for differentiating between "by me" and "liked by me" 6
  • Progress bar for sqlite-utils insert 6
  • Rendering glitch with column headings on mobile 6
  • Change "--config foo:bar" to "--setting foo bar" 6
  • Add Link: pagination HTTP headers 6
  • Figure out how to display images from <en-media> tags inline in Datasette 6
  • Method for datasette.client() to forward on authentication 6
  • Fallback to databases in inspect-data.json when no -i options are passed 6
  • Better display of binary data on arbitrary query results page 6
  • Table actions menu on view pages, not on query pages 6
  • load_template() plugin hook 6
  • PrefixedUrlString mechanism broke everything 6
  • Support order by relevance against FTS4 6
  • changes to allow for compound foreign keys 6
  • Support for generated columns 6
  • sqlite-utils analyze-tables command and table.analyze_column() method 6
  • More flexible CORS support in core, to encourage good security practices 6
  • Improve the display of facets information 6
  • Update Docker Spatialite version to 5.0.1 + add support for Spatialite topology functions 6
  • `sqlite-utils indexes` command 6
  • Error: Use either --since or --since_id, not both 6
  • `db.query()` method (renamed `db.execute_returning_dicts()`) 6
  • "searchmode": "raw" in table metadata 6
  • `table.search(..., quote=True)` parameter and `sqlite-utils search --quote` option 6
  • sqlite-utils insert errors should show SQL and parameters, if possible 6
  • Mechanism to cause specific branches to deploy their own demos 6
  • clean checkout & clean environment has test failures 6
  • Win32 "used by another process" error with datasette publish 6
  • ReadTheDocs build failed for 0.59.2 release 6
  • Command for creating an empty database 6
  • Idea: hover to reveal details of linked row 6
  • Writable canned queries fail to load custom templates 6
  • filters_from_request plugin hook, now used in TableView 6
  • Release Datasette 0.60 6
  • Drop support for Python 3.6 6
  • Support mutating row in `--convert` without returning it 6
  • Scripted exports 6
  • Reconsider policy on blocking queries containing the string "pragma" 6
  • datasette one.db one.db opens database twice, as one and one_2 6
  • `deterministic=True` fails on versions of SQLite prior to 3.8.3 6
  • Ship Datasette 0.61 6
  • Proposal: datasette query 6
  • .db downloads should be served with an ETag 6
  • Experiment with patterns for concurrent long running queries 5
  • Create neat example database 5
  • Redesign JSON output, ditch jsono, offer variants controlled by parameter instead 5
  • Option to open readonly but not immutable 5
  • datasette publish can fail if /tmp is on a different device 5
  • Refactor views 5
  • Add links to example Datasette instances to appropiate places in docs 5
  • Ability to enable/disable specific features via --config 5
  • Custom URL routing with independent tests 5
  • datasette inspect takes a very long time on large dbs 5
  • Get Datasette working with Zeit Now v2's 100MB image size limit 5
  • Hashed URLs should be optional 5
  • Plugin for allowing CORS from specified hosts 5
  • Design changes to homepage to support mutable files 5
  • Option to facet by date using month or year 5
  • extra_template_vars plugin hook 5
  • Ability to list views, and to access db["view_name"].rows / rows_where / etc 5
  • Rethink progress bars for various commands 5
  • [enhancement] Method to delete a row in python 5
  • Testing utilities should be available to plugins 5
  • If you have databases called foo.db and foo-bar.db you cannot visit /foo-bar 5
  • Don't auto-format SQL on page load 5
  • stargazers command, refs #4 5
  • Add this view for seeing new releases 5
  • Escape_fts5_query-hookimplementation does not work with queries to standard tables 5
  • on_create mechanism for after table creation 5
  • Datasette.render_template() method 5
  • Rethink how sanity checks work 5
  • Release automation: automate the bit that posts the GitHub release 5
  • table.disable_fts() method and "sqlite-utils disable-fts ..." command 5
  • twitter-to-sqlite user-timeline [screen_names] --sql / --attach 5
  • Option in metadata.json to set default sort order for a table 5
  • Feature: record history of follower counts 5
  • Custom CSS class on body for styling canned queries 5
  • Repos have a big blob of JSON in the organization column 5
  • Annotate photos using the Google Cloud Vision API 5
  • Create a public demo 5
  • Unit test that checks that all plugin hooks have corresponding unit tests 5
  • Ability to sign in to Datasette as a root account 5
  • CSRF protection 5
  • Consider using enable_callback_tracebacks(True) 5
  • Fix the demo - it breaks because of the tags table change 5
  • Feature: pull request reviews and comments 5
  • Mechanism for passing additional options to `datasette my.db` that affect plugins 5
  • sqlite-utils insert: options for column types 5
  • Features for enabling and disabling WAL mode 5
  • Add homebrew installation to documentation 5
  • 'datasette --get' option, refs #926 5
  • Path parameters for custom pages 5
  • Private/secret databases: database files that are only visible to plugins 5
  • Handle case where subsequent records (after first batch) include extra columns 5
  • Better handling of encodings other than utf-8 for "sqlite-utils insert" 5
  • For 1.0 update trove classifier in setup.py 5
  • How should datasette.client interact with base_url 5
  • Add documentation on serving Datasette behind a proxy using base_url 5
  • Add search highlighting snippets 5
  • datasette.urls.static_plugins(...) method 5
  • Default menu links should check a real permission 5
  • Rethink how table.search() method works 5
  • Foreign key links break for compound foreign keys 5
  • Rename datasette.config() method to datasette.setting() 5
  • Show pysqlite3 version on /-/versions 5
  • Feature Request: Gmail 5
  • Release notes for Datasette 0.54 5
  • 500 error caused by faceting if a column called `n` exists 5
  • Share button for copying current URL 5
  • Research using CTEs for faster facet counts 5
  • Better default display of arrays of items 5
  • Upgrade to Python 3.9.4 5
  • Add Docker multi-arch support with Buildx 5
  • ?_facet_size=X to increase number of facets results on the page 5
  • `table.xindexes` using `PRAGMA index_xinfo(table)` 5
  • DRAFT: A new plugin hook for dynamic metadata 5
  • feature: support "events" 5
  • Serve all db files in a folder 5
  • .transform(types=) turns rowid into a concrete column 5
  • Stop using generated columns in fixtures.db 5
  • `datasette publish cloudrun --cpu X` option 5
  • Ability to search for text across all columns in a table 5
  • Ability to insert file contents as text, in addition to blob 5
  • Upgrade to httpx 0.20.0 (request() got an unexpected keyword argument 'allow_redirects') 5
  • Allow routes to have extra options 5
  • Way to test SQLite 3.37 (and potentially other versions) in CI 5
  • Redesign CSV export to improve usability 5
  • introduce new option for datasette package to use a slim base image 5
  • Add KNN and data_licenses to hidden tables list 5
  • Move canned queries closer to the SQL input area 5
  • Improvements to help make Datasette a better tool for learning SQL 5
  • Test failures with SQLite 3.37.0+ due to column affinity case 5
  • Implement redirects from old % encoding to new dash encoding 5
  • Adopt a code of conduct 5
  • Display autodoc type information more legibly 5
  • Research running SQL in table view in parallel using `asyncio.gather()` 5
  • Support `rows_where()`, `delete_where()` etc for attached alias databases 5
  • CSV `extras_key=` and `ignore_extras=` equivalents for CLI tool 5
  • Protect against malicious SQL that causes damage even though our DB is immutable 4
  • Homepage UI for editing metadata file 4
  • Switch to ujson 4
  • Pick a name 4
  • datasette publish hyper 4
  • Support for title/source/license metadata 4
  • Enforce pagination (or at least limits) for arbitrary custom SQL 4
  • Add NHS England Hospitals example to wiki 4
  • Consider data-package as a format for metadata 4
  • add support for ?field__isnull=1 4
  • Plugin that adds an authentication layer of some sort 4
  • ?_json=foo&_json=bar query string argument 4
  • A primary key column that has foreign key restriction associated won't rendering label column 4
  • Support WITH query 4
  • 500 from missing table name 4
  • Ability to bundle metadata and templates inside the SQLite file 4
  • Ability to apply sort on mobile in portrait mode 4
  • metadata.json support for plugin configuration options 4
  • Explore "distinct values for column" in inspect() 4
  • Escaping named parameters in canned queries 4
  • Mechanism for automatically picking up changes when on-disk .db file changes 4
  • Add version number support with Versioneer 4
  • Support table names ending with .json or .csv 4
  • Explore if SquashFS can be used to shrink size of packaged Docker containers 4
  • Limit text display in cells containing large amounts of text 4
  • Datasette on Zeit Now returns http URLs for facet and next links 4
  • Expose SANIC_RESPONSE_TIMEOUT config option in a sensible way 4
  • Requesting support for query description 4
  • render_cell(value) plugin hook 4
  • Ability to display facet counts for many-to-many relationships 4
  • add_column() should support REFERENCES {other_table}({other_column}) 4
  • Figure out what to do about table counts in a mutable world 4
  • Refactor facets to a class and new plugin, refs #427 4
  • Tracing support for seeing what SQL queries were executed 4
  • Paginate + search for databases/tables on the homepage 4
  • Replace most of `.inspect()` (and `datasette inspect`) with table counting 4
  • Decide what to do about /-/inspect 4
  • Allow .insert(..., foreign_keys=()) to auto-detect table and primary key 4
  • Facets not correctly persisted in hidden form fields 4
  • Every datasette plugin on the ecosystem page should have a screenshot 4
  • Support opening multiple databases with the same stem 4
  • Enforce import sort order with isort 4
  • Decide what goes into Datasette 1.0 4
  • Fix static mounts using relative paths and prevent traversal exploits 4
  • Get tests running on Windows using Travis CI 4
  • Support unicode in url 4
  • Too many SQL variables 4
  • More advanced connection pooling 4
  • Option to fetch only checkins more recent than the current max checkin 4
  • Add triggers while enabling FTS 4
  • --sql and --attach options for feeding commands from SQL queries 4
  • Use better pagination (and implement progress bar) 4
  • Command to import home-timeline 4
  • retweets-of-me command 4
  • `import` command fails on empty files 4
  • Failed to import workout points 4
  • Datasette should work with Python 3.8 (and drop compatibility with Python 3.5) 4
  • Mechanism for register_output_renderer to suggest extension or not 4
  • Assets table with downloads 4
  • Exception running first command: IndexError: list index out of range 4
  • Allow creation of virtual tables at startup 4
  • order_by mechanism 4
  • Remove .detect_column_types() from table, make it a documented API 4
  • Cashe-header missing in http-response 4
  • Add documentation on Database introspection methods to internals.rst 4
  • Adding a "recreate" flag to the `Database` constructor 4
  • Custom pages mechanism, refs #648 4
  • escape_fts() does not correctly escape * wildcards 4
  • Fall back to authentication via ENV 4
  • Directory configuration mode should support metadata.yaml 4
  • Cloud Run fails to serve database files larger than 32MB 4
  • [Feature Request] Support Repo Name in Search 🥺 4
  • Ability to set custom default _size on a per-table basis 4
  • Try out ExifReader 4
  • add_foreign_key(...., ignore=True) 4
  • register_output_renderer can_render mechanism 4
  • Error pages not correctly loading CSS 4
  • Publish secrets 4
  • Example authentication plugin 4
  • /-/metadata and so on should respect view-instance permission 4
  • Log out mechanism for clearing ds_actor cookie 4
  • Take advantage of .coverage being a SQLite database 4
  • Skip counting hidden tables 4
  • Use white-space: pre-wrap on ALL table cell contents 4
  • github-to-sqlite tags command for fetching tags 4
  • Output binary columns in "sqlite-utils query" JSON 4
  • Security issue: read-only canned queries leak CSRF token in URL 4
  • Test failures caused by failed attempts to mock pip 4
  • --load-extension option for sqlite-utils query 4
  • request an "-o" option on "datasette server" to open the default browser at the running url 4
  • Idea: conversions= could take Python functions 4
  • sqlite-utils transform sub-command 4
  • sqlite-utils transform/insert --detect-types 4
  • from_json jinja2 filter 4
  • column name links broken in 0.50.1 4
  • extra_js_urls and extra_css_urls should respect base_url setting 4
  • Some workout columns should be float, not text 4
  • Include LICENSE in sdist 4
  • Add template block prior to extra URL loaders 4
  • .blob output renderer 4
  • Table/database action menu cut off if too short 4
  • Rebrand and redirect config.rst as settings.rst 4
  • --load-extension=spatialite not working with datasetteproject/datasette docker image 4
  • Fix footer not sticking to bottom in short pages 4
  • "_searchmode=raw" throws an index out of range error when combined with "_search_COLUMN" 4
  • sqlite-utils should suggest --csv if JSON parsing fails 4
  • sqlite-utils analyze-tables command 4
  • Searching for "github-to-sqlite" throws an error 4
  • UNIQUE constraint failed: workouts.id 4
  • Modernize code to Python 3.6+ 4
  • Prettier package not actually being cached 4
  • reset_counts() method and command 4
  • Archive import appears to be broken on recent exports 4
  • Certain database names results in 404: "Database not found: None" 4
  • view_name = "query" for the query page 4
  • Tests are very slow. 4
  • photo-to-sqlite: command not found 4
  • Installing datasette via docker: Path 'fixtures.db' does not exist 4
  • Error reading csv files with large column data 4
  • --port option should validate port is between 0 and 65535 4
  • Allow canned query params to specify default values 4
  • Escaping FTS search strings 4
  • Refresh SpatiaLite documentation 4
  • Feature or Documentation Request: Individual table as home page template 4
  • Dockerfile: use Ubuntu 20.10 as base 4
  • improve table horizontal scroll experience 4
  • Document how to send multiple values for "Named parameters" 4
  • Avoid error sorting by relationships if related tables are not allowed 4
  • Can't use apt-get in Dockerfile when using datasetteproj/datasette as base 4
  • Figure out how to publish alpha/beta releases to Docker Hub 4
  • Intermittent CI failure: restore_working_directory FileNotFoundError 4
  • row.update() or row.pk 4
  • db.schema property and sqlite-utils schema command 4
  • Cannot set type JSON 4
  • Automatic type detection for CSV data 4
  • Big performance boost on faceting: skip the inner order by 4
  • Command for fetching Hacker News threads from the search API 4
  • Add Gmail takeout mbox import (v2) 4
  • Ability to default to hiding the SQL for a canned query 4
  • Document exceptions that can be raised by db.execute() and friends 4
  • Add reference documentation generated from docstrings 4
  • xml.etree.ElementTree.ParseError: not well-formed (invalid token) 4
  • sqlite-utils memory can't deal with multiple files with the same name 4
  • ?_sort=rowid with _next= returns error 4
  • Exceeding Cloud Run memory limits when deploying a 4.8G database 4
  • `table.lookup()` option to populate additional columns when creating a record 4
  • Improve Apache proxy documentation, link to demo 4
  • Provide function to generate hash_id from specified columns 4
  • Use datasette-table Web Component to guide the design of the JSON API for 1.0 4
  • Add `Link: rel="alternate"` header pointing to JSON for a table/query 4
  • Maybe return JSON from HTML pages if `Accept: application/json` is sent 4
  • `sqlite-utils insert --extract colname` 4
  • Allow users to pass a full convert() function definition 4
  • Update janus requirement from <0.8,>=0.6.2 to >=0.6.2,<1.1 4
  • Confirm if documented nginx proxy config works for row pages with escaped characters in their primary key 4
  • Better error message if `--convert` code fails to return a dict 4
  • `--fmt` should imply `-t` 4
  • Add documentation page with the output of `--help` 4
  • Release notes for 0.60 4
  • Link to stable docs from older versions 4
  • `sqlite-utils bulk --batch-size` option 4
  • Document how to add a primary key to a rowid table using `sqlite-utils transform --pk` 4
  • Creating tables with custom datatypes 4
  • Update Dockerfile generated by `datasette publish` 4
  • Sensible `cache-control` headers for static assets, including those served by plugins 4
  • `sqlite3.NotSupportedError`: deterministic=True requires SQLite 3.8.3 or higher 4
  • Datasette feature for publishing snapshots of query results 4
  • Automated test for Pyodide compatibility 4
  • ?_trace=1 fails with datasette-geojson for some reason 4
  • Combining `rows_where()` and `search()` to limit which rows are searched 4
  • Implement sensible query pagination 3
  • Command line tool for uploading one or more DBs to Now 3
  • Ability to plot a simple graph 3
  • date, year, month and day querystring lookups 3
  • Implement a better database index page 3
  • Add more detailed API documentation to the README 3
  • UI for editing named parameters 3
  • Link to JSON for the list of tables 3
  • UI support for running FTS searches 3
  • If view is filtered, search should apply within those filtered rows 3
  • ?_search=x should work if used directly against a FTS virtual table 3
  • Show extra instructions with the interrupted 3
  • apsw as alternative sqlite3 binding (for full text search) 3
  • _group_count= feature improvements 3
  • Datasette CSS should include content hash in the URL 3
  • datasette skeleton command for kick-starting database and table metadata 3
  • Custom template for named canned query 3
  • proposal new option to disable user agents cache 3
  • Cleaner mechanism for handling custom errors 3
  • Run pks_for_table in inspect, executing once at build time rather than constantly 3
  • Hide Spatialite system tables 3
  • Support filtering with units and more 3
  • Allow plugins to add new cli sub commands 3
  • datasette publish --install=name-of-plugin 3
  • label_column option in metadata.json 3
  • External metadata.json 3
  • Add new metadata key persistent_urls which removes the hash from all database urls 3
  • Facets should not execute for ?shape=array|object 3
  • Documentation for URL hashing, redirects and cache policy 3
  • Build smallest possible Docker image with Datasette plus recent SQLite (with json1) plus Spatialite 4.4.0 3
  • Support multiple filters of the same type 3
  • ?_ttl= parameter to control caching 3
  • Avoid plugins accidentally loading dependencies twice 3
  • Per-database and per-table /-/ URL namespace 3
  • Ability to configure SQLite cache_size 3
  • Installation instructions, including how to use the docker image 3
  • Ensure --help examples in docs are always up to date 3
  • Use pysqlite3 if available 3
  • Integration with JupyterLab 3
  • datasette publish digitalocean plugin 3
  • Update official datasetteproject/datasette Docker container to SQLite 3.26.0 3
  • Ensure downloading a 100+MB SQLite database file works 3
  • How to pass configuration to plugins? 3
  • Use SQLITE_DBCONFIG_DEFENSIVE plus other recommendations from SQLite security docs 3
  • Experiment: run Jinja in async mode 3
  • .insert_all() should accept a generator and process it efficiently 3
  • Problems handling column names containing spaces or - 3
  • Zeit API v1 does not work for new users - need to migrate to v2 3
  • Utilities for adding indexes 3
  • Add query parameter to hide SQL textarea 3
  • Datasette doesn't reload when database file changes 3
  • Installing installs the tests package 3
  • Fix the "datasette now publish ... --alias=x" option 3
  • Make it so Docker build doesn't delay PyPI release 3
  • Option to ignore inserts if primary key exists already 3
  • Accessibility for non-techie newsies? 3
  • Test against Python 3.8-dev using Travis 3
  • Exporting sqlite database(s)? 3
  • "about" parameter in metadata does not appear when alone 3
  • asgi_wrapper plugin hook 3
  • Unable to use rank when fts-table generated with csvs-to-sqlite 3
  • Mechanism for secrets in plugin configuration 3
  • datasette publish option for setting plugin configuration secrets 3
  • Potential improvements to facet-by-date 3
  • CodeMirror fails to load on database page 3
  • .add_column() doesn't match indentation of initial creation 3
  • extracts= option for insert/update/etc 3
  • Script uses a lot of RAM 3
  • "Too many SQL variables" on large inserts 3
  • Datasette Edit 3
  • "twitter-to-sqlite user-timeline" command for pulling tweets by a specific user 3
  • Exposing Datasette via Jupyter-server-proxy 3
  • Added support for multi arch builds 3
  • Extract "source" into a separate lookup table 3
  • Track and use the 'since' value 3
  • Queries per DB table in metadata.json 3
  • Handle spaces in DB names 3
  • since_id support for home-timeline 3
  • make uvicorn optional dependancy (because not ok on windows python yet) 3
  • --since support for various commands for refresh-by-cron 3
  • upgrade to uvicorn-0.9 to be Python-3.8 friendly 3
  • Offer to format readonly SQL 3
  • _where= parameter is not persisted in hidden form fields 3
  • /-/plugins shows incorrect name for plugins 3
  • Static assets no longer loading for installed plugins 3
  • Add this repos_starred view 3
  • Publish to Heroku is broken: "WARNING: You must pass the application as an import string to enable 'reload' or 'workers" 3
  • rowid is not included in dropdown filter menus 3
  • Custom queries with 0 results should say "0 results" 3
  • Don't suggest column for faceting if all values are 1 3
  • Command for importing events 3
  • Make database level information from metadata.json available in the index.html template 3
  • Feature request: enable extensions loading 3
  • Add a glossary to the documentation 3
  • fts5 syntax error when using punctuation 3
  • Template debug mode that outputs template context 3
  • Copy and paste doesn't work reliably on iPhone for SQL editor 3
  • Tests are failing due to missing FTS5 3
  • Test failures on openSUSE 15.1: AssertionError: Explicit other_table and other_column 3
  • --port option to expose a port other than 8001 in "datasette package" 3
  • Tutorial command no longer works 3
  • Use inspect-file, if possible, for total row count 3
  • prepare_connection() plugin hook should accept optional datasette argument 3
  • Ability to customize columns used by extracts= feature 3
  • Variables from extra_template_vars() not exposed in _context=1 3
  • Search box CSS doesn't look great on OS X Safari 3
  • Handle "User not found" error 3
  • WIP implementation of writable canned queries 3
  • --plugin-secret over-rides existing metadata.json plugin config 3
  • Update aiofiles requirement from ~=0.4.0 to >=0.4,<0.6 3
  • Pull repository contributors 3
  • Mechanism for forcing column-type, over-riding auto-detection 3
  • Issue and milestone should have foreign key to repo 3
  • Issue comments don't appear to populate issues foreign key 3
  • strange behavior using accented characters 3
  • Configuration directory mode 3
  • Create index on issue_comments(user) and other foreign keys 3
  • Mechanism for creating views if they don't yet exist 3
  • Add notlike table filter 3
  • Question: Any fixed date for the release with the uft8-encoding fix? 3
  • fts search on a column doesn't work anymore due to escape_fts 3
  • Only install osxphotos if running on macOS 3
  • Way of seeing full schema for a database 3
  • Add PyPI project urls to setup.py 3
  • request.url and request.scheme should obey force_https_urls config setting 3
  • CSRF protection for /-/messages tool and writable canned queries 3
  • Documentation for new "params" setting for canned queries 3
  • Ability to customize what happens when a view permission fails 3
  • Documentation is inconsistent about "id" as required field on actor 3
  • Document the ds_actor signed cookie 3
  • Horizontal scrollbar on changelog page on mobile 3
  • Redesign register_facet_classes plugin hook 3
  • Fall back to FTS4 if FTS5 is not available 3
  • Consider pagination of canned queries 3
  • Script to generate larger SQLite test files 3
  • Support for compound (composite) foreign keys 3
  • initial windows ci setup 3
  • "Logged in as: XXX - logout" navigation item 3
  • Canned query page should show the name of the canned query 3
  • asgi_wrapper plugin hook is crashing at startup 3
  • Ability to remove a foreign key 3
  • Improved (and better documented) support for transactions 3
  • Some links don't honor base_url 3
  • Add a table of contents to the README 3
  • "allow": true for anyone, "allow": false for nobody 3
  • Interactive debugging tool for "allow" blocks 3
  • Ability to insert files piped to insert-files stdin 3
  • Support tokenize option for FTS 3
  • Refactor TableView class so things like datasette-graphql can reuse the logic 3
  • Travis should not build the master branch, only the main branch 3
  • "datasette install" and "datasette uninstall" commands 3
  • db.execute_write_fn(create_tables, block=True) hangs a thread if connection fails 3
  • Pass columns to extra CSS/JS/etc plugin hooks 3
  • Code for finding SpatiaLite in the usual locations 3
  • --load-extension=spatialite shortcut option 3
  • Try out CodeMirror SQL hints 3
  • insert_all(..., alter=True) should work for new columns introduced after the first 100 records 3
  • Datasette plugin to provide custom page for running faceted, ranked searches 3
  • Timeline view 3
  • table.optimize() should delete junk rows from *_fts_docsize 3
  • Documentation for 404.html, 500.html templates 3
  • Add --tar option to "datasette publish heroku" 3
  • Add docs for .transform(column_order=) 3
  • Better handling of multiple matching template wildcard paths 3
  • Documentation covering buildpack deployment 3
  • Datasette should default to running Uvicorn with workers=1 3
  • Remove xfail tests when new httpx is released 3
  • json / CSV links are broken in Datasette 0.50 3
  • Add a "delete" icon next to filters (in addition to "remove filter") 3
  • Fix issues relating to base_url 3
  • datasette.urls.table(..., format="json") argument 3
  • Add horizontal scrollbar to tables 3
  • /db/table/-/blob/pk/column.blob download URL 3
  • Allow iterables other than Lists in m2m records 3
  • Refactor .csv to be an output renderer - and teach register_output_renderer to stream all rows 3
  • .csv should link to .blob downloads 3
  • Table actions menu plus plugin hook 3
  • latest.datasette.io should include plugins from fixtures 3
  • database_actions plugin hook 3
  • 3.0 release with some minor breaking changes 3
  • table.search() improvements plus sqlite-utils search command 3
  • DigitalOcean buildpack memory errors for large sqlite db? 3
  • Foreign keys with blank titles result in non-clickable links 3
  • OperationalError('interrupted') can 500 on row page 3
  • Custom widgets for canned query forms 3
  • Support linking to compound foreign keys 3
  • Accessing a database's `.json` is slow for very large SQLite files 3
  • Fix --metadata doc usage 3
  • github-to-sqlite workflows command 3
  • "datasette inspect" outputs invalid JSON if an error is logged 3
  • Make it easier to theme Datasette with CSS 3
  • Update for Big Sur 3
  • Remove unneeded exists=True for -a/--auth flag. 3
  • Add Prettier to contributing documentation 3
  • Install Prettier via package.json 3
  • Use structlog for logging 3
  • Retire "Ecosystem" page in favour of datasette.io/plugins and /tools 3
  • Better error message for *_fts methods against views 3
  • "Statement may not contain PRAGMA" error is not strictly true 3
  • ?_size= argument is not persisted by hidden form fields in the table filters 3
  • WIP: Plugin includes 3
  • Rename /:memory: to /_memory 3
  • Release 0.54 3
  • Not all quoted statuses get fetched? 3
  • Use Data from SQLite in other commands 3
  • Use context manager instead of plain open 3
  • gzip support for HTML (and JSON) responses 3
  • Re-submitting filter form duplicates _x querystring arguments 3
  • KeyError: 'Contents' on running upload 3
  • Support SSL/TLS directly 3
  • Hitting `_csv.Error: field larger than field limit (131072)` 3
  • ensure immutable databses when starting in configuration directory mode with 3
  • Allow facetting on custom queries 3
  • db["my_table"].drop(ignore=True) parameter, plus sqlite-utils drop-table --ignore and drop-view --ignore 3
  • Suggest for ArrayFacet possibly confused by blank values 3
  • Minor type in IP adress 3
  • Some links aren't properly URL encoded. 3
  • Plugin hook that could support 'order by random()' for table view 3
  • Support for HTTP Basic Authentication 3
  • Try implementing SQLite timeouts using .interrupt() instead of using .set_progress_handler() 3
  • Use SQLite conn.interrupt() instead of sqlite_timelimit() 3
  • Handle byte order marks (BOMs) in CSV files 3
  • Speed up tests with pytest-xdist 3
  • Generating URL for a row inside `render_cell` hook 3
  • Columns named "link" display in bold 3
  • Improve `path_with_replaced_args()` and friends and document them 3
  • Re-display user's query with an error message if an error occurs 3
  • Supporting additional output formats, like GeoJSON 3
  • Release Datasette 0.57 3
  • Add some types, enforce with mypy 3
  • bool type not supported 3
  • Official Datasette Docker image should use SQLite >= 3.31.0 (for generated columns) 3
  • Mechanism for plugins to exclude certain paths from CSRF checks 3
  • Support db as first parameter before subcommand, or as environment variable 3
  • Mypy fixes for rows_from_file() 3
  • Test against Python 3.10-dev 3
  • Use HN algolia endpoint to retrieve trees 3
  • utils.parse_metadata() should be a documented internal function 3
  • `table.convert(..., where=)` and `sqlite-utils convert ... --where=` 3
  • `publish cloudrun` should deploy a more recent SQLite version 3
  • `sqlite-utils insert --flatten` option to flatten nested JSON 3
  • Modify base.html template to support optional sticky footer 3
  • Add scientists to target groups 3
  • Try blacken-docs 3
  • Add Authorization header when CORS flag is set 3
  • Invalid JSON output when no rows 3
  • Update pyyaml requirement from ~=5.3 to >=5.3,<7.0 3
  • Add functionality to read Parquet files. 3
  • Datasette 1.0 JSON API (and documentation) 3
  • "Links from other tables" broken for columns starting with underscore 3
  • Add new `"sql_file"` key to Canned Queries in metadata? 3
  • Research pattern for re-registering existing Click tools with register_commands 3
  • A way of creating indexes on newly created tables 3
  • Optional caching mechanism for table.lookup() 3
  • Custom pages don't work on windows 3
  • `keep_blank_values=True` when parsing `request.args` 3
  • if csv export is truncated in non streaming mode set informative response header 3
  • TableView refactor 3
  • add hash id to "_memory" url if hashed url mode is turned on and crossdb is also turned on 3
  • KeyError: 'created_at' for private accounts? 3
  • Offer `python -m sqlite_utils` as an alternative to `sqlite-utils` 3
  • `explain query plan select` is too strict about whitespace 3
  • List `--fmt` options in the docs 3
  • `sqlite-utils bulk` command 3
  • Add a CLI reference page to the docs, inspired by sqlite-utils 3
  • Tests failing against Python 3.6 3
  • Ensure template_path always uses "/" to match jinja 3
  • Link: rel="alternate" to JSON for queries too 3
  • Try test suite against macOS and Windows 3
  • Support IF NOT EXISTS for table creation 3
  • Refactor URL routing to enable testing 3
  • Make route matched pattern groups more consistent 3
  • Reconsider ensure_permissions() logic, can it be less confusing? 3
  • insert fails on JSONL with whitespace 3
  • [plugins][documentation] Is it possible to serve per-plugin static folders when writing one-off (single file) plugins? 3
  • Bump black from 22.1.0 to 22.3.0 3
  • Make "<Binary: 2427344 bytes>" easier to read 3
  • Refactor `RowView` and remove `RowTableShared` 3
  • ?_trace=1 doesn't work on Global Power Plants demo 3
  • Remove python-baseconv dependency 3
  • Allow making m2m relation of a table to itself 3
  • `detect_fts()` identifies the wrong table if tables have names that are subsets of each other 3
  • Extract facet portions of table.html out into included templates 3
  • `sqlite_utils.utils.TypeTracker` should be a documented API 3
  • Incorrect syntax highlighting in docs CLI reference 3
  • Initial test suite 2
  • Implement full URL design 2
  • Endpoint that returns SQL ready to be piped into DB 2
  • Ability to serialize massive JSON without blocking event loop 2
  • Unit tests against application itself 2
  • Solution for temporarily uploading DB so it can be built by docker 2
  • Ship first version to PyPI 2
  • Command that builds a local docker container 2
  • _nocache=1 query string option for use with sort-by-random 2
  • Deploy final versions of fivethirtyeight and parlgov datasets (with view pagination) 2
  • :fire: Removes DS_Store 2
  • TemplateAssertionError: no filter named 'tojson' 2
  • Add --load-extension option to datasette for loading extra SQLite extensions 2
  • Plot rows on a map with Leaflet and Leaflet.markercluster 2
  • Filtered tables should show count of all matching rows, if fast enough 2
  • Hide FTS-created tables by default on the database index page 2
  • Build a visualization plugin for Vega 2
  • Heatmap visualization plugin 2
  • datasette publish gcloud 2
  • Set up a pattern portfolio 2
  • Document the querystring argument for setting a different time limit 2
  • metadata.json support for per-database and per-table information 2
  • I18n and L10n support 2
  • More metadata options for template authors 2
  • Custom Queries - escaping strings 2
  • Rename table_rows and filtered_table_rows to have _count suffix 2
  • Raise 404 on nonexistent table URLs 2
  • Investigate syntactic sugar for plugins 2
  • Unit tests for installable plugins 2
  • Add limit on the size in KB of data returned from a single query 2
  • …

user 288

  • simonw 6,978
  • codecov[bot] 146
  • eyeseast 53
  • russss 39
  • psychemedia 32
  • fgregg 32
  • abdusco 26
  • mroswell 20
  • aborruso 19
  • chrismp 18
  • jacobian 14
  • carlmjohnson 14
  • RhetTbull 14
  • tballison 13
  • wragge 12
  • brandonrobertz 12
  • tsibley 11
  • rixx 11
  • frafra 10
  • terrycojones 10
  • stonebig 10
  • rayvoelker 10
  • maxhawkins 9
  • clausjuhl 9
  • bobwhitelock 9
  • dependabot[bot] 9
  • 20after4 8
  • dracos 8
  • UtahDave 8
  • tomchristie 8
  • bsilverm 8
  • rgieseke 7
  • mhalle 7
  • zeluspudding 7
  • cobiadigital 7
  • amjith 6
  • simonwiles 6
  • zaneselvans 6
  • khusmann 5
  • khimaros 5
  • jaywgraves 5
  • MarkusH 5
  • dazzag24 5
  • SteadBytes 5
  • dependabot-preview[bot] 5
  • jayvdb 4
  • bollwyvl 4
  • jefftriplett 4
  • ctb 4
  • Btibert3 4
  • dholth 4
  • lovasoa 4
  • r4vi 4
  • jsfenfen 4
  • glasnt 4
  • jungle-boogie 4
  • ColinMaudry 4
  • kbaikov 4
  • JBPressac 4
  • nitinpaultifr 4
  • Kabouik 4
  • henry501 4
  • benpickles 3
  • frankieroberto 3
  • fs111 3
  • obra 3
  • janimo 3
  • atomotic 3
  • ghing 3
  • pkoppstein 3
  • yozlet 3
  • yschimke 3
  • philroche 3
  • macropin 3
  • camallen 3
  • wsxiaoys 3
  • xrotwang 3
  • Mjboothaus 3
  • robroc 3
  • betatim 3
  • dufferzafar 3
  • Florents-Tselai 3
  • ashishdotme 3
  • Segerberg 3
  • blairdrummond 3
  • jsancho-gpl 3
  • kevindkeogh 3
  • daniel-butler 3
  • learning4life 3
  • FabianHertwig 3
  • polyrand 3
  • pjamargh 3
  • garethr 2
  • danp 2
  • davidbgk 2
  • ftrain 2
  • chrishas35 2
  • tannewt 2
  • ingenieroariel 2
  • coleifer 2
  • gavinband 2
  • aviflax 2
  • tholo 2
  • cldellow 2
  • mungewell 2
  • frankier 2
  • lchski 2
  • tmaier 2
  • slygent 2
  • frosencrantz 2
  • eads 2
  • leafgarland 2
  • glyph 2
  • rafguns 2
  • strada 2
  • eelkevdbos 2
  • ligurio 2
  • n8henrie 2
  • soobrosa 2
  • nathancahill 2
  • bsmithgall 2
  • willingc 2
  • nattaylor 2
  • durkie 2
  • raynae 2
  • wulfmann 2
  • philshem 2
  • bram2000 2
  • zzeleznick 2
  • chris48s 2
  • plpxsk 2
  • henrikek 2
  • sw-yx 2
  • nickvazz 2
  • hydrosquall 2
  • aaronyih1 2
  • jussiarpalahti 2
  • lagolucas 2
  • chekos 2
  • ad-si 2
  • smithdc1 2
  • gsajko 2
  • null92 2
  • rachelll4 2
  • tunguyenatwork 2
  • LVerneyPEReN 2
  • anotherjesse 1
  • jarib 1
  • jokull 1
  • dsisnero 1
  • llimllib 1
  • gijs 1
  • blaine 1
  • gravis 1
  • nkirsch 1
  • tomdyson 1
  • mrchrisadams 1
  • dkam 1
  • harperreed 1
  • nileshtrivedi 1
  • furilo 1
  • adamwolf 1
  • prabhur 1
  • dmd 1
  • rubenv 1
  • Uninen 1
  • carsonyl 1
  • nryberg 1
  • step21 1
  • stefanocudini 1
  • rcoup 1
  • scoates 1
  • hpk42 1
  • annapowellsmith 1
  • aslakr 1
  • thorn0 1
  • yurivish 1
  • jmelloy 1
  • Krazybug 1
  • dvhthomas 1
  • phubbard 1
  • sethvincent 1
  • meatcar 1
  • aitoehigie 1
  • michaelmcandrew 1
  • drewda 1
  • stiles 1
  • saulpw 1
  • thadk 1
  • robintw 1
  • astrojuanlu 1
  • ipmb 1
  • steren 1
  • aidansteele 1
  • mikepqr 1
  • 0x1997 1
  • knutwannheden 1
  • davidszotten 1
  • kevboh 1
  • eaubin 1
  • yunzheng 1
  • karlcow 1
  • heyarne 1
  • simonrjones 1
  • mcint 1
  • justinpinkney 1
  • merwok 1
  • mattkiefer 1
  • virtadpt 1
  • snth 1
  • joshmgrant 1
  • bcongdon 1
  • nickdirienzo 1
  • adamjonas 1
  • hannseman 1
  • kaihendry 1
  • urbas 1
  • brimstone 1
  • adamchainz 1
  • PabloLerma 1
  • heussd 1
  • RayBB 1
  • limar 1
  • drkane 1
  • Gagravarr 1
  • agguser 1
  • dyllan-to-you 1
  • justinallen 1
  • jordaneremieff 1
  • wdccdw 1
  • progpow 1
  • ltrgoddard 1
  • costrouc 1
  • jratike80 1
  • ccorcos 1
  • qqilihq 1
  • QAInsights 1
  • secretGeek 1
  • fkuhn 1
  • jameslittle230 1
  • dskrad 1
  • kwladyka 1
  • Carib0u 1
  • fatihky 1
  • phoenixjun 1
  • JesperTreetop 1
  • bapowell 1
  • louispotok 1
  • ChristopherWilks 1
  • Maltazar 1
  • eumiro 1
  • wuhland 1
  • foscoj 1
  • dvot197007 1
  • kokes 1
  • csusanu 1
  • rprimet 1
  • metab0t 1
  • luxint 1
  • spdkils 1
  • sturzl 1
  • robmarkcole 1
  • jfeiwell 1
  • coisnepe 1
  • chmaynard 1
  • asg017 1
  • noklam 1
  • GmGniap 1
  • rdtq 1
  • AnkitKundariya 1
  • LucasElArruda 1
  • duarteocarmo 1
  • mattiaborsoi 1
  • sarcasticadmin 1
  • abeyerpath 1
  • b0b5h4rp13 1
  • Rik-de-Kort 1
  • patricktrainer 1
  • justmars 1
  • miuku 1
  • jcmkk3 1
  • matt-jensen-wa 1
  • izzues 1
  • thisismyfuckingusername 1
  • MichaelTiemannOSC 1
  • kirajano 1
  • knowledgecamp12 1
  • McEazy2700 1

author_association 4

  • OWNER 6,492
  • NONE 732
  • MEMBER 486
  • CONTRIBUTOR 359
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1170595021 https://github.com/simonw/sqlite-utils/issues/26#issuecomment-1170595021 https://api.github.com/repos/simonw/sqlite-utils/issues/26 IC_kwDOCGYnMM5FxdzN izzues 60892516 2022-06-29T23:35:29Z 2022-06-29T23:35:29Z NONE

Have you seen MakeTypes? Not the exact same thing but it may be relevant.

And it's inspired by the paper "Types from Data: Making Structured Data First-Class Citizens in F#".

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for turning nested JSON into foreign keys / many-to-many 455486286  
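The mechanism discussed in that issue can be sketched in plain `sqlite3` (not the project's actual implementation; table and column names here are illustrative): a repeated nested value is pulled out into its own lookup table and replaced with a foreign key.

```python
import sqlite3

# Illustrative sketch: extract a repeated value ("species") into a lookup
# table and reference it by foreign key from the main table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE species (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.execute(
    "CREATE TABLE animals (id INTEGER PRIMARY KEY, name TEXT, "
    "species_id INTEGER REFERENCES species(id))"
)

rows = [
    {"name": "Cleo", "species": "Dog"},
    {"name": "Azi", "species": "Chicken"},
    {"name": "Pancakes", "species": "Dog"},
]
for row in rows:
    # INSERT OR IGNORE relies on the UNIQUE constraint to deduplicate
    conn.execute("INSERT OR IGNORE INTO species (name) VALUES (?)", (row["species"],))
    species_id = conn.execute(
        "SELECT id FROM species WHERE name = ?", (row["species"],)
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO animals (name, species_id) VALUES (?, ?)",
        (row["name"], species_id),
    )

print(conn.execute("SELECT count(*) FROM species").fetchone()[0])  # 2
```

Three input rows collapse to two lookup rows, which is the many-to-one half of the "foreign keys / many-to-many" mechanism the issue describes.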
1168715058 https://github.com/simonw/datasette/pull/1763#issuecomment-1168715058 https://api.github.com/repos/simonw/datasette/issues/1763 IC_kwDOBm6k_c5FqS0y codecov[bot] 22429695 2022-06-28T13:19:28Z 2022-06-28T13:19:28Z NONE

Codecov Report

Merging #1763 (fd6a817) into main (00e59ec) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #1763   +/-   ##
=======================================
  Coverage   91.67%   91.67%           
=======================================
  Files          36       36           
  Lines        4658     4658           
=======================================
  Hits         4270     4270           
  Misses        388      388           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 00e59ec...fd6a817. Read the comment docs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Bump black from 22.1.0 to 22.6.0 1287325944  
1168704157 https://github.com/simonw/datasette/pull/1693#issuecomment-1168704157 https://api.github.com/repos/simonw/datasette/issues/1693 IC_kwDOBm6k_c5FqQKd dependabot[bot] 49699333 2022-06-28T13:11:36Z 2022-06-28T13:11:36Z CONTRIBUTOR

Superseded by #1763.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Bump black from 22.1.0 to 22.3.0 1184850337  
1164460052 https://github.com/simonw/sqlite-utils/issues/431#issuecomment-1164460052 https://api.github.com/repos/simonw/sqlite-utils/issues/431 IC_kwDOCGYnMM5FaEAU rafguns 738408 2022-06-23T14:12:51Z 2022-06-23T14:12:51Z NONE

Yeah, I think I prefer your suggestion: it seems cleaner than my initial left_name=/right_name= idea. Perhaps one downside is that it's less obvious what the role of each field is: in this example, is people_id_1 a reference to parent or child?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Allow making m2m relation of a table to itself 1227571375  
1163917719 https://github.com/dogsheep/healthkit-to-sqlite/issues/12#issuecomment-1163917719 https://api.github.com/repos/dogsheep/healthkit-to-sqlite/issues/12 IC_kwDOC8tyDs5FX_mX Mjboothaus 956433 2022-06-23T04:35:02Z 2022-06-23T04:35:02Z NONE

In terms of unique identifiers - could you use values stored in HKMetadataKeySyncIdentifier?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Some workout columns should be float, not text 727848625  
1163097455 https://github.com/simonw/datasette/pull/1760#issuecomment-1163097455 https://api.github.com/repos/simonw/datasette/issues/1760 IC_kwDOBm6k_c5FU3Vv codecov[bot] 22429695 2022-06-22T13:27:08Z 2022-06-22T13:27:08Z NONE

Codecov Report

Merging #1760 (69951ee) into main (00e59ec) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #1760   +/-   ##
=======================================
  Coverage   91.67%   91.67%           
=======================================
  Files          36       36           
  Lines        4658     4658           
=======================================
  Hits         4270     4270           
  Misses        388      388           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 00e59ec...69951ee. Read the comment docs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Bump furo from 2022.4.7 to 2022.6.21 1280136357  
1163091750 https://github.com/simonw/datasette/pull/1753#issuecomment-1163091750 https://api.github.com/repos/simonw/datasette/issues/1753 IC_kwDOBm6k_c5FU18m dependabot[bot] 49699333 2022-06-22T13:22:34Z 2022-06-22T13:22:34Z CONTRIBUTOR

Superseded by #1760.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Bump furo from 2022.4.7 to 2022.6.4.1 1261826957  
1162500525 https://github.com/simonw/sqlite-utils/issues/448#issuecomment-1162500525 https://api.github.com/repos/simonw/sqlite-utils/issues/448 IC_kwDOCGYnMM5FSlmt mungewell 236907 2022-06-22T00:46:43Z 2022-06-22T00:46:43Z NONE

log.txt

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Reading rows from a file => AttributeError: '_io.StringIO' object has no attribute 'readinto' 1279144769  
1162498734 https://github.com/simonw/sqlite-utils/issues/448#issuecomment-1162498734 https://api.github.com/repos/simonw/sqlite-utils/issues/448 IC_kwDOCGYnMM5FSlKu mungewell 236907 2022-06-22T00:43:45Z 2022-06-22T00:43:45Z NONE

Attempted to test on a machine with a new version of Python, but install failed with an error message for the 'click' package.

C:\WINDOWS\system32>"c:\Program Files\Python310\python.exe"
Python 3.10.2 (tags/v3.10.2:a58ebcc, Jan 17 2022, 14:12:15) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> quit()

C:\WINDOWS\system32>cd C:\Users\swood\Downloads\sqlite-utils-main-20220621\sqlite-utils-main

C:\Users\swood\Downloads\sqlite-utils-main-20220621\sqlite-utils-main>"c:\Program Files\Python310\python.exe" setup.py install
running install
running bdist_egg
running egg_info

...

Installed c:\program files\python310\lib\site-packages\click_default_group_wheel-1.2.2-py3.10.egg
Searching for click
Downloading https://files.pythonhosted.org/packages/3d/da/f3bbf30f7e71d881585d598f67f4424b2cc4c68f39849542e81183218017/click-default-group-wheel-1.2.2.tar.gz#sha256=e90da42d92c03e88a12ed0c0b69c8a29afb5d36e3dc8d29c423ba4219e6d7747
Best match: click default-group-wheel-1.2.2
Processing click-default-group-wheel-1.2.2.tar.gz
Writing C:\Users\swood\AppData\Local\Temp\easy_install-aiaj0_eh\click-default-group-wheel-1.2.2\setup.cfg
Running click-default-group-wheel-1.2.2\setup.py -q bdist_egg --dist-dir C:\Users\swood\AppData\Local\Temp\easy_install-aiaj0_eh\click-default-group-wheel-1.2.2\egg-dist-tmp-z61a4h8n
zip_safe flag not set; analyzing archive contents...
removing 'c:\program files\python310\lib\site-packages\click_default_group_wheel-1.2.2-py3.10.egg' (and everything under it)
Copying click_default_group_wheel-1.2.2-py3.10.egg to c:\program files\python310\lib\site-packages
click-default-group-wheel 1.2.2 is already the active version in easy-install.pth

Installed c:\program files\python310\lib\site-packages\click_default_group_wheel-1.2.2-py3.10.egg
error: The 'click' distribution was not found and is required by click-default-group-wheel, sqlite-utils
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Reading rows from a file => AttributeError: '_io.StringIO' object has no attribute 'readinto' 1279144769  
1162234441 https://github.com/simonw/sqlite-utils/issues/446#issuecomment-1162234441 https://api.github.com/repos/simonw/sqlite-utils/issues/446 IC_kwDOCGYnMM5FRkpJ simonw 9599 2022-06-21T19:28:35Z 2022-06-21T19:28:35Z OWNER

just -l now does this:

% just -l
Available recipes:
    black         # Apply Black
    cog           # Rebuild docs with cog
    default       # Run tests and linters
    lint          # Run linters: black, flake8, mypy, cog
    test *options # Run pytest with supplied options
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Use Just to automate running tests and linters locally 1277328147  
1162231111 https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1162231111 https://api.github.com/repos/simonw/sqlite-utils/issues/297 IC_kwDOCGYnMM5FRj1H simonw 9599 2022-06-21T19:25:44Z 2022-06-21T19:25:44Z OWNER

Pushed that prototype to a branch.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Option for importing CSV data using the SQLite .import mechanism 944846776  
1162223668 https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1162223668 https://api.github.com/repos/simonw/sqlite-utils/issues/297 IC_kwDOCGYnMM5FRiA0 simonw 9599 2022-06-21T19:19:22Z 2022-06-21T19:22:15Z OWNER

Built a prototype of --fast for the sqlite-utils memory command:

% time sqlite-utils memory taxi.csv 'SELECT passenger_count, COUNT(*), AVG(total_amount) FROM taxi GROUP BY passenger_count' --fast
passenger_count  COUNT(*)  AVG(total_amount)
---------------  --------  -----------------
                 128020    32.2371511482553 
0                42228     17.0214016766151 
1                1533197   17.6418833067999 
2                286461    18.0975870711456 
3                72852     17.9153958710923 
4                25510     18.452774990196  
5                50291     17.2709248175672 
6                32623     17.6002964166367 
7                2         87.17            
8                2         95.705           
9                1         113.6            
sqlite-utils memory taxi.csv  --fast  12.71s user 0.48s system 104% cpu 12.627 total

Takes 13s - about the same time as calling sqlite3 :memory: ... directly as seen in https://til.simonwillison.net/sqlite/one-line-csv-operations

Without the --fast option that takes several minutes (262s = 4m20s)!

Here's the prototype so far:

diff --git a/sqlite_utils/cli.py b/sqlite_utils/cli.py
index 86eddfb..1c83ef6 100644
--- a/sqlite_utils/cli.py
+++ b/sqlite_utils/cli.py
@@ -14,6 +14,8 @@ import io
 import itertools
 import json
 import os
+import shutil
+import subprocess
 import sys
 import csv as csv_std
 import tabulate
@@ -1669,6 +1671,7 @@ def query(
     is_flag=True,
     help="Analyze resulting tables and output results",
 )
+@click.option("--fast", is_flag=True, help="Fast mode, only works with CSV and TSV")
 @load_extension_option
 def memory(
     paths,
@@ -1692,6 +1695,7 @@ def memory(
     save,
     analyze,
     load_extension,
+    fast,
 ):
     """Execute SQL query against an in-memory database, optionally populated by imported data

@@ -1719,6 +1723,22 @@ def memory(
     \b
         sqlite-utils memory animals.csv --schema
     """
+    if fast:
+        if (
+            attach
+            or flatten
+            or param
+            or encoding
+            or no_detect_types
+            or analyze
+            or load_extension
+        ):
+            raise click.ClickException(
+                "--fast mode does not support any of the following options: --attach, --flatten, --param, --encoding, --no-detect-types, --analyze, --load-extension"
+            )
+        # TODO: Figure out and pass other supported options
+        memory_fast(paths, sql)
+        return
     db = sqlite_utils.Database(memory=True)
     # If --dump or --save or --analyze used but no paths detected, assume SQL query is a path:
     if (dump or save or schema or analyze) and not paths:
@@ -1791,6 +1811,33 @@ def memory(
     )


+def memory_fast(paths, sql):
+    if not shutil.which("sqlite3"):
+        raise click.ClickException("sqlite3 not found in PATH")
+    args = ["sqlite3", ":memory:", "-cmd", ".mode csv"]
+    table_names = []
+
+    def name(path):
+        base_name = pathlib.Path(path).stem or "t"
+        table_name = base_name
+        prefix = 1
+        while table_name in table_names:
+            prefix += 1
+            table_name = "{}_{}".format(base_name, prefix)
+        return table_name
+
+    for path in paths:
+        table_name = name(path)
+        table_names.append(table_name)
+        args.extend(
+            ["-cmd", ".import {} {}".format(pathlib.Path(path).resolve(), table_name)]
+        )
+
+    args.extend(["-cmd", ".mode column"])
+    args.append(sql)
+    subprocess.run(args)
+
+
 def _execute_query(
     db, sql, param, raw, table, csv, tsv, no_headers, fmt, nl, arrays, json_cols
 ):
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Option for importing CSV data using the SQLite .import mechanism 944846776  
1162186856 https://github.com/simonw/sqlite-utils/issues/447#issuecomment-1162186856 https://api.github.com/repos/simonw/sqlite-utils/issues/447 IC_kwDOCGYnMM5FRZBo simonw 9599 2022-06-21T18:48:46Z 2022-06-21T18:48:46Z OWNER

That fixed it:

https://user-images.githubusercontent.com/9599/174875556-3a569c90-5c92-48eb-935c-470638deb335.png

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Incorrect syntax highlighting in docs CLI reference 1278571700  
1162179354 https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1162179354 https://api.github.com/repos/simonw/sqlite-utils/issues/297 IC_kwDOCGYnMM5FRXMa simonw 9599 2022-06-21T18:44:03Z 2022-06-21T18:44:03Z OWNER

The thing I like about that --fast option is that it could selectively use this alternative mechanism just for the files for which it can work (CSV and TSV files). I could also add a --fast option to sqlite-utils memory which could then kick in only for operations that involve just TSV and CSV files.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Option for importing CSV data using the SQLite .import mechanism 944846776  
1161869859 https://github.com/simonw/sqlite-utils/issues/447#issuecomment-1161869859 https://api.github.com/repos/simonw/sqlite-utils/issues/447 IC_kwDOCGYnMM5FQLoj simonw 9599 2022-06-21T15:00:42Z 2022-06-21T15:00:42Z OWNER

Deploying that to https://sqlite-utils.datasette.io/en/latest/cli-reference.html#insert

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Incorrect syntax highlighting in docs CLI reference 1278571700  
1161857806 https://github.com/simonw/sqlite-utils/issues/447#issuecomment-1161857806 https://api.github.com/repos/simonw/sqlite-utils/issues/447 IC_kwDOCGYnMM5FQIsO simonw 9599 2022-06-21T14:55:51Z 2022-06-21T14:58:14Z OWNER

https://stackoverflow.com/a/44379513 suggests that the fix is:

.. code-block:: text

Or set this in conf.py:

highlight_language = "none"

I like that better - I don't like that all :: blocks default to being treated as Python code.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Incorrect syntax highlighting in docs CLI reference 1278571700  
1161849874 https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1161849874 https://api.github.com/repos/simonw/sqlite-utils/issues/297 IC_kwDOCGYnMM5FQGwS simonw 9599 2022-06-21T14:49:12Z 2022-06-21T14:49:12Z OWNER

Since there are all sorts of existing options for sqlite-utils insert that won't work with this, maybe it would be better to have an entirely separate command - this for example:

sqlite-utils fast-insert data.db mytable data.csv
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Option for importing CSV data using the SQLite .import mechanism 944846776  
882052693 https://github.com/simonw/sqlite-utils/issues/297#issuecomment-882052693 https://api.github.com/repos/simonw/sqlite-utils/issues/297 IC_kwDOCGYnMM40kw5V simonw 9599 2021-07-18T12:57:54Z 2022-06-21T13:17:15Z OWNER

Another implementation option would be to use the CSV virtual table mechanism. This could avoid shelling out to the sqlite3 binary, but requires solving the harder problem of compiling and distributing a loadable SQLite module: https://www.sqlite.org/csv.html

(Would be neat to produce a Python wheel of this, see https://simonwillison.net/2022/May/23/bundling-binary-tools-in-python-wheels/)

This would also help solve the challenge of making this optimization available to the sqlite-utils memory command. That command operates against an in-memory database so it's not obvious how it could shell out to a binary.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Option for importing CSV data using the SQLite .import mechanism 944846776  
1160991031 https://github.com/simonw/sqlite-utils/issues/297#issuecomment-1160991031 https://api.github.com/repos/simonw/sqlite-utils/issues/297 IC_kwDOCGYnMM5FM1E3 simonw 9599 2022-06-21T00:35:20Z 2022-06-21T00:35:20Z OWNER

Relevant TIL: https://til.simonwillison.net/sqlite/one-line-csv-operations

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Option for importing CSV data using the SQLite .import mechanism 944846776  
1160798645 https://github.com/simonw/sqlite-utils/issues/446#issuecomment-1160798645 https://api.github.com/repos/simonw/sqlite-utils/issues/446 IC_kwDOCGYnMM5FMGG1 simonw 9599 2022-06-20T19:55:34Z 2022-06-20T19:55:34Z OWNER

just now defaults to running the tests and linters.

just test runs the tests - it can take arguments, e.g. just test -k transform

just lint runs all of the linters.

just black applies Black.

In all cases it assumes you are using pipenv, at least for the moment.
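A minimal Justfile consistent with the recipes described above — a sketch only, assuming pipenv wraps each command; the recipe bodies here are illustrative guesses, not the actual file:

```just
# Run tests and linters
default: test lint

# Run pytest with supplied options
test *options:
    pipenv run pytest {{options}}

# Run linters: black, flake8, mypy, cog
lint:
    pipenv run black . --check
    pipenv run flake8
    pipenv run mypy sqlite_utils tests

# Apply Black
black:
    pipenv run black .
```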

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Use Just to automate running tests and linters locally 1277328147  
1160794604 https://github.com/simonw/sqlite-utils/issues/443#issuecomment-1160794604 https://api.github.com/repos/simonw/sqlite-utils/issues/443 IC_kwDOCGYnMM5FMFHs simonw 9599 2022-06-20T19:49:37Z 2022-06-20T19:49:37Z OWNER

Also now shows up here: https://sqlite-utils.datasette.io/en/latest/reference.html#sqlite-utils-utils-rows-from-file

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Make `utils.rows_from_file()` a documented API 1269998342  
1160794175 https://github.com/simonw/sqlite-utils/issues/445#issuecomment-1160794175 https://api.github.com/repos/simonw/sqlite-utils/issues/445 IC_kwDOCGYnMM5FMFA_ simonw 9599 2022-06-20T19:49:02Z 2022-06-20T19:49:02Z OWNER

New documentation:

  • https://sqlite-utils.datasette.io/en/latest/python-api.html#detecting-column-types-using-typetracker
  • https://sqlite-utils.datasette.io/en/latest/reference.html#sqlite-utils-utils-typetracker
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`sqlite_utils.utils.TypeTracker` should be a documented API 1277295119  
1160793114 https://github.com/simonw/sqlite-utils/issues/445#issuecomment-1160793114 https://api.github.com/repos/simonw/sqlite-utils/issues/445 IC_kwDOCGYnMM5FMEwa simonw 9599 2022-06-20T19:47:36Z 2022-06-20T19:47:36Z OWNER

I also added inline documentation and types: https://github.com/simonw/sqlite-utils/blob/773f2b6b20622bb986984a1c3161d5b3aaa1046b/sqlite_utils/utils.py#L318-L360

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`sqlite_utils.utils.TypeTracker` should be a documented API 1277295119  
1160763268 https://github.com/simonw/sqlite-utils/issues/445#issuecomment-1160763268 https://api.github.com/repos/simonw/sqlite-utils/issues/445 IC_kwDOCGYnMM5FL9eE simonw 9599 2022-06-20T19:09:21Z 2022-06-20T19:09:21Z OWNER

Code to document: https://github.com/simonw/sqlite-utils/blob/3fbe8a784cc2f3fa0bfa8612fec9752ff9068a2b/sqlite_utils/utils.py#L318-L331

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`sqlite_utils.utils.TypeTracker` should be a documented API 1277295119  
1160717784 https://github.com/simonw/datasette/pull/1759#issuecomment-1160717784 https://api.github.com/repos/simonw/datasette/issues/1759 IC_kwDOBm6k_c5FLyXY codecov[bot] 22429695 2022-06-20T18:04:46Z 2022-06-20T18:04:46Z NONE

Codecov Report

Merging #1759 (b901bb0) into main (2e97516) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #1759   +/-   ##
=======================================
  Coverage   91.67%   91.67%           
=======================================
  Files          36       36           
  Lines        4658     4658           
=======================================
  Hits         4270     4270           
  Misses        388      388           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2e97516...b901bb0. Read the comment docs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Extract facet portions of table.html out into included templates 1275523220  
1160717735 https://github.com/simonw/datasette/pull/1759#issuecomment-1160717735 https://api.github.com/repos/simonw/datasette/issues/1759 IC_kwDOBm6k_c5FLyWn simonw 9599 2022-06-20T18:04:41Z 2022-06-20T18:04:41Z OWNER

I don't think this change needs any changes to the documentation: https://docs.datasette.io/en/stable/custom_templates.html#custom-templates

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Extract facet portions of table.html out into included templates 1275523220  
1160712911 https://github.com/simonw/datasette/pull/1759#issuecomment-1160712911 https://api.github.com/repos/simonw/datasette/issues/1759 IC_kwDOBm6k_c5FLxLP simonw 9599 2022-06-20T17:58:37Z 2022-06-20T17:58:37Z OWNER

This is a great idea.

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
Extract facet portions of table.html out into included templates 1275523220  
1155966234 https://github.com/simonw/sqlite-utils/issues/444#issuecomment-1155966234 https://api.github.com/repos/simonw/sqlite-utils/issues/444 IC_kwDOCGYnMM5E5qUa simonw 9599 2022-06-15T04:18:05Z 2022-06-15T04:18:05Z OWNER

I'm going to push a branch with my not-yet-working code (which does at least include a test).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV `extras_key=` and `ignore_extras=` equivalents for CLI tool 1271426387  
1155815956 https://github.com/simonw/sqlite-utils/issues/444#issuecomment-1155815956 https://api.github.com/repos/simonw/sqlite-utils/issues/444 IC_kwDOCGYnMM5E5FoU simonw 9599 2022-06-14T23:49:56Z 2022-06-15T03:58:10Z OWNER

Yeah my initial implementation there makes no sense:

            csv_reader_args = {"dialect": dialect}
            if delimiter:
                csv_reader_args["delimiter"] = delimiter
            if quotechar:
                csv_reader_args["quotechar"] = quotechar
            reader = _extra_key_strategy(
                csv_std.reader(decoded, **csv_reader_args), ignore_extras, extras_key
            )
            first_row = next(reader)
            if no_headers:
                headers = ["untitled_{}".format(i + 1) for i in range(len(first_row))]
                reader = itertools.chain([first_row], reader)
            else:
                headers = first_row
            docs = (dict(zip(headers, row)) for row in reader)

Because my _extra_key_strategy() helper function is designed to work against csv.DictReader - not against csv.reader(), which returns a sequence of lists, not a sequence of dictionaries.

In fact, what's happening here is that dict(zip(headers, row)) is ignoring anything in the row that doesn't correspond to a header:

>>> list(zip(["a", "b"], [1, 2, 3]))
[('a', 1), ('b', 2)]
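A sketch of the distinction, using nothing beyond the standard library: itertools.zip_longest keeps the extra values that plain zip() silently drops, which is one way a row-level strategy could collect them under a designated column (the "extras_key" name below is hypothetical, echoing the proposed option):

```python
import itertools

headers = ["a", "b"]
row = [1, 2, 3]

# Plain zip() truncates to the shorter sequence, losing the extra value:
truncated = dict(zip(headers, row))

# zip_longest() pads the headers with None, so the extras can be
# gathered under a designated key instead of being discarded:
pairs = list(itertools.zip_longest(headers, row))
doc = {h: v for h, v in pairs if h is not None}
doc["extras_key"] = [v for h, v in pairs if h is None]

print(truncated)  # {'a': 1, 'b': 2}
print(doc)        # {'a': 1, 'b': 2, 'extras_key': [3]}
```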
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV `extras_key=` and `ignore_extras=` equivalents for CLI tool 1271426387  
1155953345 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155953345 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E5nLB simonw 9599 2022-06-15T03:53:43Z 2022-06-15T03:53:43Z OWNER

I tried fixing this by using .tell() to read the file position as I was iterating through it:

diff --git a/sqlite_utils/utils.py b/sqlite_utils/utils.py
index d2ccc5f..29ad12e 100644
--- a/sqlite_utils/utils.py
+++ b/sqlite_utils/utils.py
@@ -149,10 +149,13 @@ class UpdateWrapper:
     def __init__(self, wrapped, update):
         self._wrapped = wrapped
         self._update = update
+        self._tell = wrapped.tell()

     def __iter__(self):
         for line in self._wrapped:
-            self._update(len(line))
+            tell = self._wrapped.tell()
+            self._update(self._tell - tell)
+            self._tell = tell
             yield line
This did not work - I get this error:

    File "/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/utils.py", line 206, in _extra_key_strategy
      for row in reader:
    File "/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/utils.py", line 156, in __iter__
      tell = self._wrapped.tell()
    OSError: telling position disabled by next() call

It looks like you can't use .tell() during iteration: https://stackoverflow.com/questions/29618936/how-to-solve-oserror-telling-position-disabled-by-next-call

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155815186 https://github.com/simonw/sqlite-utils/issues/444#issuecomment-1155815186 https://api.github.com/repos/simonw/sqlite-utils/issues/444 IC_kwDOCGYnMM5E5FcS simonw 9599 2022-06-14T23:48:16Z 2022-06-14T23:48:16Z OWNER

This is tricky to implement because of this code: https://github.com/simonw/sqlite-utils/blob/b8af3b96f5c72317cc8783dc296a94f6719987d9/sqlite_utils/cli.py#L938-L945

It's reconstructing each document using the known headers here:

docs = (dict(zip(headers, row)) for row in reader)

So my first attempt at this - the diff here - did not have the desired result:

diff --git a/sqlite_utils/cli.py b/sqlite_utils/cli.py
index 86eddfb..00b920b 100644
--- a/sqlite_utils/cli.py
+++ b/sqlite_utils/cli.py
@@ -6,7 +6,7 @@ import hashlib
 import pathlib
 import sqlite_utils
 from sqlite_utils.db import AlterError, BadMultiValues, DescIndex
-from sqlite_utils.utils import maximize_csv_field_size_limit
+from sqlite_utils.utils import maximize_csv_field_size_limit, _extra_key_strategy
 from sqlite_utils import recipes
 import textwrap
 import inspect
@@ -797,6 +797,15 @@ _import_options = (
         "--encoding",
         help="Character encoding for input, defaults to utf-8",
     ),
+    click.option(
+        "--ignore-extras",
+        is_flag=True,
+        help="If a CSV line has more than the expected number of values, ignore the extras",
+    ),
+    click.option(
+        "--extras-key",
+        help="If a CSV line has more than the expected number of values put them in a list in this column",
+    ),
 )


@@ -885,6 +894,8 @@ def insert_upsert_implementation(
     sniff,
     no_headers,
     encoding,
+    ignore_extras,
+    extras_key,
     batch_size,
     alter,
     upsert,
@@ -909,6 +920,10 @@ def insert_upsert_implementation(
         raise click.ClickException("--flatten cannot be used with --csv or --tsv")
     if encoding and not (csv or tsv):
         raise click.ClickException("--encoding must be used with --csv or --tsv")
+    if ignore_extras and extras_key:
+        raise click.ClickException(
+            "--ignore-extras and --extras-key cannot be used together"
+        )
     if pk and len(pk) == 1:
         pk = pk[0]
     encoding = encoding or "utf-8-sig"
@@ -935,7 +950,9 @@ def insert_upsert_implementation(
                 csv_reader_args["delimiter"] = delimiter
             if quotechar:
                 csv_reader_args["quotechar"] = quotechar
-            reader = csv_std.reader(decoded, **csv_reader_args)
+            reader = _extra_key_strategy(
+                csv_std.reader(decoded, **csv_reader_args), ignore_extras, extras_key
+            )
             first_row = next(reader)
             if no_headers:
                 headers = ["untitled_{}".format(i + 1) for i in range(len(first_row))]
@@ -1101,6 +1118,8 @@ def insert(
     sniff,
     no_headers,
     encoding,
+    ignore_extras,
+    extras_key,
     batch_size,
     alter,
     detect_types,
@@ -1176,6 +1195,8 @@ def insert(
             sniff,
             no_headers,
             encoding,
+            ignore_extras,
+            extras_key,
             batch_size,
             alter=alter,
             upsert=False,
@@ -1214,6 +1235,8 @@ def upsert(
     sniff,
     no_headers,
     encoding,
+    ignore_extras,
+    extras_key,
     alter,
     not_null,
     default,
@@ -1254,6 +1277,8 @@ def upsert(
             sniff,
             no_headers,
             encoding,
+            ignore_extras,
+            extras_key,
             batch_size,
             alter=alter,
             upsert=True,
@@ -1297,6 +1322,8 @@ def bulk(
     sniff,
     no_headers,
     encoding,
+    ignore_extras,
+    extras_key,
     load_extension,
 ):
     """
@@ -1331,6 +1358,8 @@ def bulk(
             sniff=sniff,
             no_headers=no_headers,
             encoding=encoding,
+            ignore_extras=ignore_extras,
+            extras_key=extras_key,
             batch_size=batch_size,
             alter=False,
             upsert=False,
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV `extras_key=` and `ignore_extras=` equivalents for CLI tool 1271426387  
1155804591 https://github.com/simonw/sqlite-utils/issues/444#issuecomment-1155804591 https://api.github.com/repos/simonw/sqlite-utils/issues/444 IC_kwDOCGYnMM5E5C2v simonw 9599 2022-06-14T23:28:36Z 2022-06-14T23:28:36Z OWNER

I'm going with --extras-key and --ignore-extras as the two new options.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV `extras_key=` and `ignore_extras=` equivalents for CLI tool 1271426387  
1155804459 https://github.com/simonw/sqlite-utils/issues/444#issuecomment-1155804459 https://api.github.com/repos/simonw/sqlite-utils/issues/444 IC_kwDOCGYnMM5E5C0r simonw 9599 2022-06-14T23:28:18Z 2022-06-14T23:28:18Z OWNER

I think these become part of the _import_options list which is used in a few places:

https://github.com/simonw/sqlite-utils/blob/b8af3b96f5c72317cc8783dc296a94f6719987d9/sqlite_utils/cli.py#L765-L800

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV `extras_key=` and `ignore_extras=` equivalents for CLI tool 1271426387  
1155803262 https://github.com/simonw/sqlite-utils/issues/430#issuecomment-1155803262 https://api.github.com/repos/simonw/sqlite-utils/issues/430 IC_kwDOCGYnMM5E5Ch- simonw 9599 2022-06-14T23:26:11Z 2022-06-14T23:26:11Z OWNER

It looks like PRAGMA temp_store was the right option to use here: https://www.sqlite.org/pragma.html#pragma_temp_store

temp_store_directory is listed as deprecated here: https://www.sqlite.org/pragma.html#pragma_temp_store_directory

I'm going to turn this into a help-wanted documentation issue.
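The documented usage could be sketched like this (a minimal example; per the SQLite PRAGMA docs, temp_store accepts 0 = DEFAULT, 1 = FILE, 2 = MEMORY, and the filename here is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a real database file path in practice

# 2 = MEMORY: keeps VACUUM's temporary structures out of a too-small
# temp directory, at the cost of extra RAM
conn.execute("PRAGMA temp_store = MEMORY")
print(conn.execute("PRAGMA temp_store").fetchone()[0])  # 2

conn.execute("VACUUM")
```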

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Document how to use `PRAGMA temp_store` to avoid errors when running VACUUM against huge databases 1224112817  
1155801812 https://github.com/simonw/sqlite-utils/issues/434#issuecomment-1155801812 https://api.github.com/repos/simonw/sqlite-utils/issues/434 IC_kwDOCGYnMM5E5CLU simonw 9599 2022-06-14T23:23:32Z 2022-06-14T23:23:32Z OWNER

Since table names can be quoted like this:

CREATE VIRTUAL TABLE "searchable_fts"
    USING FTS4 (text1, text2, [name with . and spaces], content="searchable")

OR like this:

CREATE VIRTUAL TABLE "searchable_fts"
    USING FTS4 (text1, text2, [name with . and spaces], content=[searchable])

This fix looks to be correct to me (copying from the updated test_with_trace() test):

            (
                "SELECT name FROM sqlite_master\n"
                "    WHERE rootpage = 0\n"
                "    AND (\n"
                "        sql LIKE :like\n"
                "        OR sql LIKE :like2\n"
                "        OR (\n"
                "            tbl_name = :table\n"
                "            AND sql LIKE '%VIRTUAL TABLE%USING FTS%'\n"
                "        )\n"
                "    )",
                {
                    "like": "%VIRTUAL TABLE%USING FTS%content=[dogs]%",
                    "like2": '%VIRTUAL TABLE%USING FTS%content="dogs"%',
                    "table": "dogs",
                },
            )
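A quick way to sanity-check the fixed query against the scenario from this issue, using plain sqlite3 and hypothetical demo/demo2 tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE demo (title TEXT);
CREATE TABLE demo2 (title TEXT);
CREATE VIRTUAL TABLE demo_fts USING FTS4 (title, content="demo");
CREATE VIRTUAL TABLE demo2_fts USING FTS4 (title, content="demo2");
""")
# The LIKE pattern includes the closing quote/bracket, so content="demo"
# no longer matches content="demo2"
rows = conn.execute(
    """
    SELECT name FROM sqlite_master
    WHERE rootpage = 0
    AND (
        sql LIKE :like
        OR sql LIKE :like2
        OR (tbl_name = :table AND sql LIKE '%VIRTUAL TABLE%USING FTS%')
    )
    """,
    {
        "like": "%VIRTUAL TABLE%USING FTS%content=[demo]%",
        "like2": '%VIRTUAL TABLE%USING FTS%content="demo"%',
        "table": "demo",
    },
).fetchall()
```

Only demo_fts comes back for the demo table, regardless of creation order.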
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`detect_fts()` identifies the wrong table if tables have names that are subsets of each other 1243151184  
1155794149 https://github.com/simonw/sqlite-utils/issues/434#issuecomment-1155794149 https://api.github.com/repos/simonw/sqlite-utils/issues/434 IC_kwDOCGYnMM5E5ATl simonw 9599 2022-06-14T23:09:54Z 2022-06-14T23:09:54Z OWNER

A test that demonstrates the problem:

@pytest.mark.parametrize("reverse_order", (True, False))
def test_detect_fts_similar_tables(fresh_db, reverse_order):
    # https://github.com/simonw/sqlite-utils/issues/434
    table1, table2 = ("demo", "demo2")
    if reverse_order:
        table1, table2 = table2, table1

    fresh_db[table1].insert({"title": "Hello"}).enable_fts(
        ["title"], fts_version="FTS4"
    )
    fresh_db[table2].insert({"title": "Hello"}).enable_fts(
        ["title"], fts_version="FTS4"
    )
    assert fresh_db[table1].detect_fts() == "{}_fts".format(table1)
    assert fresh_db[table2].detect_fts() == "{}_fts".format(table2)

The order matters - so this test currently passes in one direction and fails in the other:

>       assert fresh_db[table2].detect_fts() == "{}_fts".format(table2)
E       AssertionError: assert 'demo2_fts' == 'demo_fts'
E         - demo_fts
E         + demo2_fts
E         ?     +

tests/test_introspect.py:53: AssertionError
========================================================================================= short test summary info =========================================================================================
FAILED tests/test_introspect.py::test_detect_fts_similar_tables[True] - AssertionError: assert 'demo2_fts' == 'demo_fts'
=============================================================================== 1 failed, 1 passed, 855 deselected in 1.00s ===============================================================================
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`detect_fts()` identifies the wrong table if tables have names that are subsets of each other 1243151184  
1155791109 https://github.com/simonw/sqlite-utils/issues/434#issuecomment-1155791109 https://api.github.com/repos/simonw/sqlite-utils/issues/434 IC_kwDOCGYnMM5E4_kF simonw 9599 2022-06-14T23:04:40Z 2022-06-14T23:04:40Z OWNER

Definitely a bug - thanks for the detailed write-up!

You're right, the code at fault is here:

https://github.com/simonw/sqlite-utils/blob/1b09538bc6c1fda773590f3e600993ef06591041/sqlite_utils/db.py#L2213-L2231

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`detect_fts()` identifies the wrong table if tables have names that are subsets of each other 1243151184  
1155789101 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155789101 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E4_Et simonw 9599 2022-06-14T23:00:45Z 2022-06-14T23:00:45Z OWNER

I'm going to mark this as "help wanted" and leave it open. I'm glad that it's not actually a bug where errors get swallowed.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155788944 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155788944 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E4_CQ simonw 9599 2022-06-14T23:00:24Z 2022-06-14T23:00:24Z OWNER

The progress bar only works if the file-like object passed to it has a fp.fileno() that isn't 0 (for stdin) - that's how it detects that the file is something which it can measure the size of in order to show progress.

If we know the file size in bytes AND we know the character encoding, could we change UpdateWrapper to scale its updates by the number of bytes per character instead?

I don't think so: I can't see a way of definitively saying "for this encoding the number of bytes per character is X" - and in fact I'm pretty sure that question doesn't even make sense since variable-length encodings exist.
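A quick illustration of why no fixed bytes-per-character ratio exists:

```python
# Variable-length encodings mean bytes-per-character is not a constant:
assert len("hello".encode("utf-8")) == 5       # 1 byte per character
assert len("héllo".encode("utf-8")) == 6       # "é" takes 2 bytes
assert len("hello".encode("utf-16-le")) == 10  # 2 bytes per character
assert len("𝄞".encode("utf-16-le")) == 4       # astral characters take 4
```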

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155784284 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155784284 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E495c simonw 9599 2022-06-14T22:51:03Z 2022-06-14T22:52:13Z OWNER

Yes, this is the problem. The progress bar length is set to the length in bytes of the file - os.path.getsize(file.name) - but it's then incremented by the length of each DECODED line in turn.

So if the file is in utf-16-le (twice the size of utf-8) the progress bar will finish at 50%!
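One possible fix, as a sketch only (not the current implementation, and ignoring BOMs for simplicity): re-encode each decoded line to count the bytes it occupied on disk, so progress tracks file bytes regardless of encoding:

```python
import os

def iter_with_byte_progress(path, encoding):
    """Yield (fraction_complete, line) pairs, measuring progress in raw bytes."""
    total = os.path.getsize(path)
    seen = 0
    with open(path, encoding=encoding, newline="") as fp:
        for line in fp:
            # count the bytes this line occupied on disk, not its decoded length
            seen += len(line.encode(encoding))
            yield seen / total, line
```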

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155782835 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155782835 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E49iz simonw 9599 2022-06-14T22:48:22Z 2022-06-14T22:49:53Z OWNER

Here's the code that implements the progress bar in question: https://github.com/simonw/sqlite-utils/blob/1b09538bc6c1fda773590f3e600993ef06591041/sqlite_utils/cli.py#L918-L932

It calls file_progress() which looks like this:

https://github.com/simonw/sqlite-utils/blob/1b09538bc6c1fda773590f3e600993ef06591041/sqlite_utils/utils.py#L159-L175

Which uses this:

https://github.com/simonw/sqlite-utils/blob/1b09538bc6c1fda773590f3e600993ef06591041/sqlite_utils/utils.py#L148-L156

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155781399 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155781399 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E49MX simonw 9599 2022-06-14T22:45:41Z 2022-06-14T22:45:41Z OWNER

TIL how to use iconv: https://til.simonwillison.net/linux/iconv

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155776023 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155776023 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E474X simonw 9599 2022-06-14T22:36:07Z 2022-06-14T22:36:07Z OWNER

Wait! The arguments in that are the wrong way round. This is correct:

sqlite-utils insert --csv --delimiter ";" --encoding "utf-16-le" test.db test csv

It still outputs the following:

[------------------------------------] 0%
[#################-------------------] 49% 00:00:02%

But it creates a test.db file that is 6.2MB.

That database has 3141 rows in it:

% sqlite-utils tables test.db --counts -t
table      count
-------  -------
test        3142

I converted that csv file to utf-8 like so:

iconv -f UTF-16LE -t UTF-8 csv > utf8.csv

And it contains 3142 lines:

% wc -l utf8.csv 
    3142 utf8.csv

So my hunch here is that the problem is actually that the progress bar doesn't know how to correctly measure files in utf-16-le encoding!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155772244 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155772244 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E469U simonw 9599 2022-06-14T22:30:03Z 2022-06-14T22:30:03Z OWNER

Tried this:

% python -i $(which sqlite-utils) insert --csv --delimiter ";" --encoding "utf-16-le" test test.db csv
  [------------------------------------]    0%
  [#################-------------------]   49%  00:00:01Traceback (most recent call last):
  File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 1072, in main
    ctx.exit()
  File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 692, in exit
    raise Exit(code)
click.exceptions.Exit: 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/bin/sqlite-utils", line 33, in <module>
    sys.exit(load_entry_point('sqlite-utils', 'console_scripts', 'sqlite-utils')())
  File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py", line 1090, in main
    sys.exit(e.exit_code)
SystemExit: 0
>>> 
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155771462 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155771462 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E46xG simonw 9599 2022-06-14T22:28:38Z 2022-06-14T22:28:38Z OWNER

Maybe this isn't a CSV field value problem - I tried this patch and didn't seem to hit the new breakpoints:

diff --git a/sqlite_utils/utils.py b/sqlite_utils/utils.py
index d2ccc5f..f1b823a 100644
--- a/sqlite_utils/utils.py
+++ b/sqlite_utils/utils.py
@@ -204,13 +204,17 @@ def _extra_key_strategy(
         # DictReader adds a 'None' key with extra row values
         if None not in row:
             yield row
-        elif ignore_extras:
+            continue
+        else:
+            breakpoint()
+        if ignore_extras:
             # ignoring row.pop(none) because of this issue:
             # https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1155358637
             row.pop(None)  # type: ignore
             yield row
         elif not extras_key:
             extras = row.pop(None)  # type: ignore
+            breakpoint()
             raise RowError(
                 "Row {} contained these extra values: {}".format(row, extras)
             )
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155769216 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155769216 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E46OA simonw 9599 2022-06-14T22:24:49Z 2022-06-14T22:25:06Z OWNER

I have a hunch that this crash may be caused by a CSV value which is too long, as addressed at the library level in:
- #440

But not yet addressed in the CLI tool, see:

- #444

Either way though, I really don't like that errors like this are swallowed!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155767915 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1155767915 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5E455r simonw 9599 2022-06-14T22:22:27Z 2022-06-14T22:22:27Z OWNER

I forgot to add equivalents of extras_key= and ignore_extras= to the CLI tool - will do that in a separate issue.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1155767202 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1155767202 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5E45ui simonw 9599 2022-06-14T22:21:10Z 2022-06-14T22:21:10Z OWNER

I can't figure out why that error is being swallowed like that. The most likely culprit was this code:

https://github.com/simonw/sqlite-utils/blob/1b09538bc6c1fda773590f3e600993ef06591041/sqlite_utils/cli.py#L1021-L1043

But I tried changing it like this:

diff --git a/sqlite_utils/cli.py b/sqlite_utils/cli.py
index 86eddfb..ed26fdd 100644
--- a/sqlite_utils/cli.py
+++ b/sqlite_utils/cli.py
@@ -1023,6 +1023,7 @@ def insert_upsert_implementation(
             docs, pk=pk, batch_size=batch_size, alter=alter, **extra_kwargs
         )
     except Exception as e:
+        raise
         if (
             isinstance(e, sqlite3.OperationalError)
             and e.args

And your steps to reproduce still got to 49% and then failed silently.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1155764428 https://github.com/simonw/sqlite-utils/issues/432#issuecomment-1155764428 https://api.github.com/repos/simonw/sqlite-utils/issues/432 IC_kwDOCGYnMM5E45DM simonw 9599 2022-06-14T22:16:21Z 2022-06-14T22:16:21Z OWNER

Initial idea of how the .table() method would change:

diff --git a/sqlite_utils/db.py b/sqlite_utils/db.py
index 7a06304..3ecb40b 100644
--- a/sqlite_utils/db.py
+++ b/sqlite_utils/db.py
@@ -474,11 +474,12 @@ class Database:
             self._tracer(sql, None)
         return self.conn.executescript(sql)

-    def table(self, table_name: str, **kwargs) -> Union["Table", "View"]:
+    def table(self, table_name: str, alias: Optional[str] = None, **kwargs) -> Union["Table", "View"]:
         """
         Return a table object, optionally configured with default options.

         :param table_name: Name of the table
+        :param alias: The database alias to use, if referring to a table in another connected database
         """
         klass = View if table_name in self.view_names() else Table
         return klass(self, table_name, **kwargs)
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support `rows_where()`, `delete_where()` etc for attached alias databases 1236693079  
1155764064 https://github.com/simonw/sqlite-utils/issues/432#issuecomment-1155764064 https://api.github.com/repos/simonw/sqlite-utils/issues/432 IC_kwDOCGYnMM5E449g simonw 9599 2022-06-14T22:15:44Z 2022-06-14T22:15:44Z OWNER

Implementing this would be a pretty big change - initial instinct is that I'd need to introduce a self.alias property to Queryable (the base class shared by Table and View) and a new self.name_with_alias getter which returns alias.tablename if alias is set to a not-None value. Then I'd need to rewrite every piece of code like this:

https://github.com/simonw/sqlite-utils/blob/1b09538bc6c1fda773590f3e600993ef06591041/sqlite_utils/db.py#L1161

To look like this instead:

        sql = "select {} from [{}]".format(select, self.name_with_alias)

But some parts would be harder - for example:

https://github.com/simonw/sqlite-utils/blob/1b09538bc6c1fda773590f3e600993ef06591041/sqlite_utils/db.py#L1227-L1231

Would have to know to query alias.sqlite_master instead.

The cached table counts logic like this would need a bunch of changes too:

https://github.com/simonw/sqlite-utils/blob/1b09538bc6c1fda773590f3e600993ef06591041/sqlite_utils/db.py#L644-L657
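For context, the underlying SQLite mechanics this would build on (plain sqlite3, with a hypothetical otherdb alias):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS otherdb")
conn.execute("CREATE TABLE otherdb.dogs (name TEXT)")
conn.execute("INSERT INTO otherdb.dogs VALUES ('Cleo')")
# queries must use the alias-qualified table name...
rows = conn.execute("SELECT name FROM otherdb.dogs").fetchall()
# ...and introspection must target the alias's own sqlite_master
tables = conn.execute(
    "SELECT name FROM otherdb.sqlite_master WHERE type = 'table'"
).fetchall()
```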

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support `rows_where()`, `delete_where()` etc for attached alias databases 1236693079  
1155759857 https://github.com/simonw/sqlite-utils/issues/432#issuecomment-1155759857 https://api.github.com/repos/simonw/sqlite-utils/issues/432 IC_kwDOCGYnMM5E437x simonw 9599 2022-06-14T22:09:07Z 2022-06-14T22:09:07Z OWNER

Third option, and I think the one I like the best:

rows = db.table("tablename", alias="otherdb").rows_where()

The db.table(tablename) method already exists as an alternative to db[tablename]: https://sqlite-utils.datasette.io/en/stable/python-api.html#python-api-table-configuration

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support `rows_where()`, `delete_where()` etc for attached alias databases 1236693079  
1155758664 https://github.com/simonw/sqlite-utils/issues/432#issuecomment-1155758664 https://api.github.com/repos/simonw/sqlite-utils/issues/432 IC_kwDOCGYnMM5E43pI simonw 9599 2022-06-14T22:07:50Z 2022-06-14T22:07:50Z OWNER

Another potential fix: add a alias= parameter to rows_where() and other similar methods. Then you could do this:

rows = db["tablename"].rows_where(alias="otherdb")

This feels wrong to me: db["tablename"] is the bit that is supposed to return a table object. Having part of what that table object is exist as a parameter to other methods is confusing.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support `rows_where()`, `delete_where()` etc for attached alias databases 1236693079  
1155756742 https://github.com/simonw/sqlite-utils/issues/432#issuecomment-1155756742 https://api.github.com/repos/simonw/sqlite-utils/issues/432 IC_kwDOCGYnMM5E43LG simonw 9599 2022-06-14T22:05:38Z 2022-06-14T22:05:49Z OWNER

I don't like the idea of table_names() returning names of tables from connected databases as well, because it feels like it could lead to surprising behaviour - especially if those connected databases turn out to have table names that are duplicated in the main connected database.

It would be neat if functions like .rows_where() worked though.

One thought would be to support something like this:

rows = db["otherdb.tablename"].rows_where()

But... . is a valid character in a SQLite table name. So "otherdb.tablename" might ambiguously refer to a table called tablename in a connected database with the alias otherdb, OR a table in the current database with the name otherdb.tablename.
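The ambiguity is real - a dot is perfectly legal inside a quoted table name:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# a single table in the main database whose name happens to contain a dot
conn.execute('CREATE TABLE "otherdb.tablename" (id INTEGER)')
conn.execute('INSERT INTO "otherdb.tablename" VALUES (1)')
rows = conn.execute('SELECT id FROM "otherdb.tablename"').fetchall()
```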

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Support `rows_where()`, `delete_where()` etc for attached alias databases 1236693079  
1155753397 https://github.com/simonw/sqlite-utils/issues/431#issuecomment-1155753397 https://api.github.com/repos/simonw/sqlite-utils/issues/431 IC_kwDOCGYnMM5E42W1 simonw 9599 2022-06-14T22:01:38Z 2022-06-14T22:01:38Z OWNER

Yeah, I think it would be neat if the library could support self-referential many-to-many in a nice way.

I'm not sure about the left_name/right_name design though. Would it be possible to have this work as the user intends, by spotting that the other table name "people" matches the name of the current table?

db["people"].insert({"name": "Mary"}, pk="name").m2m(
    "people", [{"name": "Michael"}, {"name": "Suzy"}], m2m_table="parent_child", pk="name"
)

The created table could look like this:

CREATE TABLE [parent_child] (
   [people_id_1] TEXT REFERENCES [people]([name]),
   [people_id_2] TEXT REFERENCES [people]([name]),
   PRIMARY KEY ([people_id_1], [people_id_2])
)

I've not thought very hard about this, so the design I'm proposing here might not work.

Are there other reasons people might want the left_name= and right_name= parameters? If so then I'm much happier with those.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Allow making m2m relation of a table to itself 1227571375  
1155750270 https://github.com/simonw/sqlite-utils/issues/441#issuecomment-1155750270 https://api.github.com/repos/simonw/sqlite-utils/issues/441 IC_kwDOCGYnMM5E41l- simonw 9599 2022-06-14T21:57:57Z 2022-06-14T21:57:57Z OWNER

I added where= and where_args= parameters to that .search() method - updated documentation is here: https://sqlite-utils.datasette.io/en/latest/python-api.html#searching-with-table-search

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Combining `rows_where()` and `search()` to limit which rows are searched 1257724585  
1155749696 https://github.com/simonw/sqlite-utils/issues/433#issuecomment-1155749696 https://api.github.com/repos/simonw/sqlite-utils/issues/433 IC_kwDOCGYnMM5E41dA simonw 9599 2022-06-14T21:57:05Z 2022-06-14T21:57:05Z OWNER

Marking this as help wanted because I can't figure out how to replicate it!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CLI eats my cursor 1239034903  
1155748444 https://github.com/simonw/sqlite-utils/issues/442#issuecomment-1155748444 https://api.github.com/repos/simonw/sqlite-utils/issues/442 IC_kwDOCGYnMM5E41Jc simonw 9599 2022-06-14T21:55:15Z 2022-06-14T21:55:15Z OWNER

Documentation: https://sqlite-utils.datasette.io/en/latest/python-api.html#setting-the-maximum-csv-field-size-limit

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`maximize_csv_field_size_limit()` utility function 1269886084  
1155714131 https://github.com/simonw/sqlite-utils/issues/442#issuecomment-1155714131 https://api.github.com/repos/simonw/sqlite-utils/issues/442 IC_kwDOCGYnMM5E4sxT simonw 9599 2022-06-14T21:07:50Z 2022-06-14T21:07:50Z OWNER

Here's the commit where I added that originally, including a test: https://github.com/simonw/sqlite-utils/commit/1a93b72ba710ea2271eaabc204685a27d2469374

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
`maximize_csv_field_size_limit()` utility function 1269886084  
1155672675 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1155672675 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5E4ipj simonw 9599 2022-06-14T20:19:07Z 2022-06-14T20:19:07Z OWNER

Documentation: https://sqlite-utils.datasette.io/en/latest/python-api.html#reading-rows-from-a-file

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 1,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1155672522 https://github.com/simonw/sqlite-utils/issues/443#issuecomment-1155672522 https://api.github.com/repos/simonw/sqlite-utils/issues/443 IC_kwDOCGYnMM5E4inK simonw 9599 2022-06-14T20:18:58Z 2022-06-14T20:18:58Z OWNER

New documentation: https://sqlite-utils.datasette.io/en/latest/python-api.html#reading-rows-from-a-file

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Make `utils.rows_from_file()` a documented API 1269998342  
1155666672 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1155666672 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5E4hLw simonw 9599 2022-06-14T20:11:52Z 2022-06-14T20:11:52Z OWNER

I'm going to rename restkey to extras_key for consistency with ignore_extras.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1155515426 https://github.com/simonw/sqlite-utils/issues/441#issuecomment-1155515426 https://api.github.com/repos/simonw/sqlite-utils/issues/441 IC_kwDOCGYnMM5E38Qi betatim 1448859 2022-06-14T17:53:43Z 2022-06-14T17:53:43Z NONE

That would be handy (additional where filters) but I think the trick with the with statement is already an order of magnitude better than what I had thought of, so my problem is solved by it (plus I got to learn about with today!)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Combining `rows_where()` and `search()` to limit which rows are searched 1257724585  
1155421299 https://github.com/simonw/sqlite-utils/issues/441#issuecomment-1155421299 https://api.github.com/repos/simonw/sqlite-utils/issues/441 IC_kwDOCGYnMM5E3lRz simonw 9599 2022-06-14T16:23:52Z 2022-06-14T16:23:52Z OWNER

Actually I have a thought for something that could help here: I could add a mechanism for inserting additional where filters and parameters into that .search() method.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Combining `rows_where()` and `search()` to limit which rows are searched 1257724585  
1155389614 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1155389614 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5E3diu simonw 9599 2022-06-14T15:54:03Z 2022-06-14T15:54:03Z OWNER

Filed an issue against python/typeshed:

  • https://github.com/python/typeshed/issues/8075
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1155364367 https://github.com/simonw/sqlite-utils/issues/412#issuecomment-1155364367 https://api.github.com/repos/simonw/sqlite-utils/issues/412 IC_kwDOCGYnMM5E3XYP simonw 9599 2022-06-14T15:36:28Z 2022-06-14T15:36:28Z OWNER

Here's as far as I got with my initial prototype, in sqlite_utils/pandas.py:

from .db import Database as _Database, Table as _Table, View as _View
import pandas as pd
from typing import (
    Iterable,
    Union,
    Optional,
)


class Database(_Database):
    def query(
        self, sql: str, params: Optional[Union[Iterable, dict]] = None
    ) -> pd.DataFrame:
        return pd.DataFrame(super().query(sql, params))

    def table(self, table_name: str, **kwargs) -> Union["Table", "View"]:
        "Return a table object, optionally configured with default options."
        klass = View if table_name in self.view_names() else Table
        return klass(self, table_name, **kwargs)


class PandasQueryable:
    def rows_where(
        self,
        where: str = None,
        where_args: Optional[Union[Iterable, dict]] = None,
        order_by: str = None,
        select: str = "*",
        limit: int = None,
        offset: int = None,
    ) -> pd.DataFrame:
        return pd.DataFrame(
            super().rows_where(
                where,
                where_args,
                order_by=order_by,
                select=select,
                limit=limit,
                offset=offset,
            )
        )


class Table(PandasQueryable, _Table):
    pass


class View(PandasQueryable, _View):
    pass
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Optional Pandas integration 1160182768  
1155358637 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1155358637 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5E3V-t simonw 9599 2022-06-14T15:31:34Z 2022-06-14T15:31:34Z OWNER

Getting this past mypy is really hard!

% mypy sqlite_utils
sqlite_utils/utils.py:189: error: No overload variant of "pop" of "MutableMapping" matches argument type "None"
sqlite_utils/utils.py:189: note: Possible overload variants:
sqlite_utils/utils.py:189: note:     def pop(self, key: str) -> str
sqlite_utils/utils.py:189: note:     def [_T] pop(self, key: str, default: Union[str, _T] = ...) -> Union[str, _T]

That's because of this line:

row.pop(None)

Which is legit here - we have a dictionary where one of the keys is None and we want to remove that key. But the baked-in type is apparently def pop(self, key: str) -> str.
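The runtime behaviour is fine - it's only the stub's key type that objects:

```python
# csv.DictReader collects overflow values under the None key;
# popping that key works perfectly well at runtime:
row = {"id": "1", "name": "Cleo", None: ["oops"]}
extras = row.pop(None)
```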

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1155350755 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1155350755 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5E3UDj simonw 9599 2022-06-14T15:25:18Z 2022-06-14T15:25:18Z OWNER

That broke mypy:

sqlite_utils/utils.py:229: error: Incompatible types in assignment (expression has type "Iterable[Dict[Any, Any]]", variable has type "DictReader[str]")

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1155317293 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1155317293 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5E3L4t simonw 9599 2022-06-14T15:04:01Z 2022-06-14T15:04:01Z OWNER

I think that's unavoidable: it looks like csv.Sniffer only works if you feed it a CSV file with an equal number of values in each row, which is understandable.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1155310521 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1155310521 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5E3KO5 simonw 9599 2022-06-14T14:58:50Z 2022-06-14T14:58:50Z OWNER

Interesting challenge in writing tests for this: if you give csv.Sniffer a short example with an invalid row in it sometimes it picks the wrong delimiter!

id,name\r\n1,Cleo,oops

It decided the delimiter there was e.
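Reproducing that in plain Python ("e" is the only character that appears a consistent number of times in every row, which is what Sniffer keys on):

```python
import csv

dialect = csv.Sniffer().sniff("id,name\r\n1,Cleo,oops")
delimiter = dialect.delimiter  # "e", not ","
```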

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154475454 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154475454 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ez-W- simonw 9599 2022-06-13T21:52:03Z 2022-06-13T21:52:03Z OWNER

The exception will be called RowError.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154474482 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154474482 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ez-Hy simonw 9599 2022-06-13T21:50:59Z 2022-06-13T21:51:24Z OWNER

Decision: I'm going to default to raising an exception if a row has too many values in it.

You'll be able to pass ignore_extras=True to ignore those extra values, or pass restkey="the_rest" to stick them in a list in the restkey column.
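This mirrors how csv.DictReader already handles surplus values - they land in a list under the restkey (None by default):

```python
import csv, io

reader = csv.DictReader(io.StringIO("id,name\n1,Cleo,oops\n"))
row = next(reader)
# the extra value is collected as a list under the None key
```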

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154457893 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154457893 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ez6El simonw 9599 2022-06-13T21:29:02Z 2022-06-13T21:29:02Z OWNER

Here's the current function signature for rows_from_file():

https://github.com/simonw/sqlite-utils/blob/26e6d2622c57460a24ffdd0128bbaac051d51a5f/sqlite_utils/utils.py#L174-L179

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154457028 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154457028 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ez53E simonw 9599 2022-06-13T21:28:03Z 2022-06-13T21:28:03Z OWNER

Whatever I decide, I can implement it in rows_from_file(), maybe as an optional parameter - then decide how to call it from the sqlite-utils insert CLI (perhaps with a new option there too).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154456183 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154456183 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ez5p3 simonw 9599 2022-06-13T21:26:55Z 2022-06-13T21:26:55Z OWNER

So I need to make a design decision here: what should sqlite-utils do with CSV files that have rows with more values than there are headings?

Some options:

  • Ignore those extra fields entirely - silently drop that data. I'm not keen on this.
  • Throw an error. The library does this already, but the error is incomprehensible - it could turn into a useful, human-readable error instead.
  • Put the data in a JSON list in a column with a known name (None is not a valid column name, so not that). This could be something like _restkey or _values_with_no_heading. This feels like a better option, but I'd need to pick that name carefully - and come up with an answer for what to do if the CSV file being imported already uses that heading name for something else.
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154454127 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154454127 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ez5Jv simonw 9599 2022-06-13T21:24:18Z 2022-06-13T21:24:18Z OWNER

That weird behaviour is documented here: https://docs.python.org/3/library/csv.html#csv.DictReader

If a row has more fields than fieldnames, the remaining data is put in a list and stored with the fieldname specified by restkey (which defaults to None). If a non-blank row has fewer fields than fieldnames, the missing values are filled-in with the value of restval (which defaults to None).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154453319 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154453319 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ez49H simonw 9599 2022-06-13T21:23:16Z 2022-06-13T21:23:16Z OWNER

Aha! I think I see what's happening here. Here's what DictReader does if one of the lines has too many items in it:

>>> import csv, io
>>> list(csv.DictReader(io.StringIO("id,name\n1,Cleo,nohead\n2,Barry")))
[{'id': '1', 'name': 'Cleo', None: ['nohead']}, {'id': '2', 'name': 'Barry'}]

See how that row with too many items gets this:
[{'id': '1', 'name': 'Cleo', None: ['nohead']}

That's a None for the key and (weirdly) a list containing the single item for the value!
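One way to keep that overflow key out of downstream code (a sketch of the "ignore any key that is None" fix mentioned in the next comment):

```python
import csv
import io

reader = csv.DictReader(io.StringIO("id,name\n1,Cleo,nohead\n2,Barry"))

# Drop the None key DictReader uses for overflow values, so downstream
# code (hashing, inserting) only ever sees real column names.
cleaned = [
    {key: value for key, value in row.items() if key is not None}
    for row in reader
]
print(cleaned)  # → [{'id': '1', 'name': 'Cleo'}, {'id': '2', 'name': 'Barry'}]
```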

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154449442 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154449442 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ez4Ai simonw 9599 2022-06-13T21:18:26Z 2022-06-13T21:20:12Z OWNER

Here are full steps to replicate the bug:

from urllib.request import urlopen
import sqlite_utils
db = sqlite_utils.Database(memory=True)
with urlopen("https://artsdatabanken.no/Fab2018/api/export/csv") as fab:
    reader, other = sqlite_utils.utils.rows_from_file(fab, encoding="utf-16le")
    db["fab2018"].insert_all(reader, pk="Id")
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154396400 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154396400 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5EzrDw simonw 9599 2022-06-13T20:28:25Z 2022-06-13T20:28:25Z OWNER

Fixing that key thing (to ignore any key that is None) revealed a new bug:

File ~/Dropbox/Development/sqlite-utils/sqlite_utils/utils.py:376, in hash_record(record, keys)
    373 if keys is not None:
    374     to_hash = {key: record[key] for key in keys}
    375 return hashlib.sha1(
--> 376     json.dumps(to_hash, separators=(",", ":"), sort_keys=True, default=repr).encode(
    377         "utf8"
    378     )
    379 ).hexdigest()

File ~/.pyenv/versions/3.8.2/lib/python3.8/json/__init__.py:234, in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    232 if cls is None:
    233     cls = JSONEncoder
--> 234 return cls(
    235     skipkeys=skipkeys, ensure_ascii=ensure_ascii,
    236     check_circular=check_circular, allow_nan=allow_nan, indent=indent,
    237     separators=separators, default=default, sort_keys=sort_keys,
    238     **kw).encode(obj)

File ~/.pyenv/versions/3.8.2/lib/python3.8/json/encoder.py:199, in JSONEncoder.encode(self, o)
    195         return encode_basestring(o)
    196 # This doesn't pass the iterator directly to ''.join() because the
    197 # exceptions aren't as detailed.  The list call should be roughly
    198 # equivalent to the PySequence_Fast that ''.join() would do.
--> 199 chunks = self.iterencode(o, _one_shot=True)
    200 if not isinstance(chunks, (list, tuple)):
    201     chunks = list(chunks)

File ~/.pyenv/versions/3.8.2/lib/python3.8/json/encoder.py:257, in JSONEncoder.iterencode(self, o, _one_shot)
    252 else:
    253     _iterencode = _make_iterencode(
    254         markers, self.default, _encoder, self.indent, floatstr,
    255         self.key_separator, self.item_separator, self.sort_keys,
    256         self.skipkeys, _one_shot)
--> 257 return _iterencode(o, 0)

TypeError: '<' not supported between instances of 'NoneType' and 'str'
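That crash reproduces in isolation: sort_keys=True has to order the dict keys before encoding, and Python 3 refuses to compare NoneType with str.

```python
import json

# A record that still contains DictReader's None overflow key.
record = {None: ["nohead"], "id": "1"}

# Sorting the keys compares None against "id", raising the same
# TypeError that surfaced inside hash_record().
try:
    json.dumps(record, sort_keys=True)
except TypeError as ex:
    print("TypeError:", ex)
```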
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154387591 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154387591 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ezo6H simonw 9599 2022-06-13T20:17:51Z 2022-06-13T20:17:51Z OWNER

I don't understand why that works but calling insert_all() does not.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154386795 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154386795 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ezotr simonw 9599 2022-06-13T20:16:53Z 2022-06-13T20:16:53Z OWNER

Steps to demonstrate that sqlite-utils insert is not affected:

curl -o artsdatabanken.csv https://artsdatabanken.no/Fab2018/api/export/csv
sqlite-utils insert arts.db artsdatabanken artsdatabanken.csv --sniff --csv --encoding utf-16le
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154385916 https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1154385916 https://api.github.com/repos/simonw/sqlite-utils/issues/440 IC_kwDOCGYnMM5Ezof8 simonw 9599 2022-06-13T20:15:49Z 2022-06-13T20:15:49Z OWNER

rows_from_file() isn't part of the documented API but maybe it should be!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CSV files with too many values in a row cause errors 1250629388  
1154373361 https://github.com/simonw/sqlite-utils/issues/441#issuecomment-1154373361 https://api.github.com/repos/simonw/sqlite-utils/issues/441 IC_kwDOCGYnMM5Ezlbx simonw 9599 2022-06-13T20:01:25Z 2022-06-13T20:01:25Z OWNER

Yeah, at the moment the best way to do this is with search_sql(), but you're right it really isn't very intuitive.

Here's how I would do this, using a CTE trick to combine the queries:

search_sql = db["articles"].search_sql(columns=["title", "author"])
sql = f"""
with search_results as ({search_sql})
select * from search_results where owner = :owner
"""
results = db.query(sql, {"query": "my search query", "owner": "my owner"})
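The wrapping pattern itself can be tried with plain sqlite3 - here an ordinary parameterized SELECT stands in for the SQL that search_sql() would return (the articles schema is made up for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table articles (title text, author text, owner text)")
conn.executemany(
    "insert into articles values (?, ?, ?)",
    [("SQLite tips", "Simon", "alice"), ("CSV tricks", "Simon", "bob")],
)

# Stand-in for table.search_sql(): any SELECT with a :query parameter.
search_sql = "select title, author, owner from articles where title like :query"

# Wrap it in a CTE, then filter the search results with an extra condition.
sql = f"""
with search_results as ({search_sql})
select * from search_results where owner = :owner
"""
rows = conn.execute(sql, {"query": "%SQLite%", "owner": "alice"}).fetchall()
print(rows)  # → [('SQLite tips', 'Simon', 'alice')]
```

The same shape works with the real FTS query: anything the inner SQL returns can be filtered, joined or ordered by the outer SELECT.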

I'm not sure if sqlite-utils should ever evolve to provide a better way of doing this kind of thing, to be honest - if it did, it would turn into more of an ORM. Something like Peewee may be a better option here.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Combining `rows_where()` and `search()` to limit which rows are searched 1257724585  
1151887842 https://github.com/simonw/datasette/issues/1528#issuecomment-1151887842 https://api.github.com/repos/simonw/datasette/issues/1528 IC_kwDOBm6k_c5EqGni eyeseast 25778 2022-06-10T03:23:08Z 2022-06-10T03:23:08Z CONTRIBUTOR

I just put together a version of this in a plugin: https://github.com/eyeseast/datasette-query-files. Happy to have any feedback.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add new `"sql_file"` key to Canned Queries in metadata? 1060631257  
1147435032 https://github.com/simonw/datasette/pull/1753#issuecomment-1147435032 https://api.github.com/repos/simonw/datasette/issues/1753 IC_kwDOBm6k_c5EZHgY codecov[bot] 22429695 2022-06-06T13:15:11Z 2022-06-06T13:15:11Z NONE

Codecov Report

Merging #1753 (23a8515) into main (2e97516) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #1753   +/-   ##
=======================================
  Coverage   91.67%   91.67%           
=======================================
  Files          36       36           
  Lines        4658     4658           
=======================================
  Hits         4270     4270           
  Misses        388      388           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2e97516...23a8515. Read the comment docs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Bump furo from 2022.4.7 to 2022.6.4.1 1261826957  
1142556455 https://github.com/simonw/datasette/pull/1740#issuecomment-1142556455 https://api.github.com/repos/simonw/datasette/issues/1740 IC_kwDOBm6k_c5EGgcn simonw 9599 2022-05-31T19:25:49Z 2022-05-31T19:25:49Z OWNER

Thanks, this looks like a good idea to me.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
chore: Set permissions for GitHub actions 1226106354  
1141711418 https://github.com/simonw/sqlite-utils/issues/26#issuecomment-1141711418 https://api.github.com/repos/simonw/sqlite-utils/issues/26 IC_kwDOCGYnMM5EDSI6 nileshtrivedi 19304 2022-05-31T06:21:15Z 2022-05-31T06:21:15Z NONE

I ran into this. My use case is a JSON file containing an array of book objects, each with a key called reviews that is itself an array of objects. My JSON is human-edited and does not specify IDs for either books or reviews. Because sqlite-utils does not support inserting nested objects, I instead have to maintain two separate CSV files, with an id column in books.csv and a book_id column in reviews.csv.

I think the right way to handle this while inserting JSON might be to describe the relationships:

sqlite-utils insert data.db books mydata.json --hasmany reviews --hasone author --manytomany tags

This is relying on the assumption that foreign keys can point to rowid primary key.
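A hypothetical sketch of what such a --hasmany option might do under the hood (the flag does not exist; the books/reviews schema here is invented to match the example above): insert the parent row, then insert each nested object with a foreign key back to the parent's rowid.

```python
import sqlite3

books = [
    {"title": "Book A", "reviews": [{"stars": 5}, {"stars": 3}]},
]

conn = sqlite3.connect(":memory:")
conn.execute("create table books (id integer primary key, title text)")
conn.execute(
    "create table reviews (id integer primary key,"
    " book_id integer references books(id), stars integer)"
)

for book in books:
    # Insert the parent without the nested key, capture its rowid ...
    cur = conn.execute("insert into books (title) values (?)", (book["title"],))
    book_id = cur.lastrowid
    # ... then insert each nested object pointing back at the parent.
    for review in book.get("reviews", []):
        conn.execute(
            "insert into reviews (book_id, stars) values (?, ?)",
            (book_id, review["stars"]),
        )

print(conn.execute("select book_id, stars from reviews").fetchall())  # → [(1, 5), (1, 3)]
```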

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for turning nested JSON into foreign keys / many-to-many 455486286  
1141488533 https://github.com/simonw/sqlite-utils/pull/437#issuecomment-1141488533 https://api.github.com/repos/simonw/sqlite-utils/issues/437 IC_kwDOCGYnMM5ECbuV simonw 9599 2022-05-30T21:32:36Z 2022-05-30T21:32:36Z OWNER

Thanks!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
docs to dogs 1244294227  
1140321380 https://github.com/simonw/datasette/issues/1751#issuecomment-1140321380 https://api.github.com/repos/simonw/datasette/issues/1751 IC_kwDOBm6k_c5D9-xk knutwannheden 408765 2022-05-28T19:52:17Z 2022-05-28T19:52:17Z NONE

Closing in favor of existing issue #1298.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add scrollbars to table presentation in default layout 1251710928  
1139426398 https://github.com/simonw/sqlite-utils/issues/439#issuecomment-1139426398 https://api.github.com/repos/simonw/sqlite-utils/issues/439 IC_kwDOCGYnMM5D6kRe frafra 4068 2022-05-27T09:04:05Z 2022-05-27T10:44:54Z NONE

This code works:

import csv
import sqlite_utils
db = sqlite_utils.Database("test.db")
reader = csv.DictReader(open("csv", encoding="utf-16-le").read().split("\r\n"), delimiter=";")
db["test"].insert_all(reader, pk="Id")

I used iconv to change the encoding; sqlite-utils can import the resulting file, even if it stops at 98 %:

sqlite-utils insert --csv test test.db clean 
  [------------------------------------]    0%
  [###################################-]   98%  00:00:00
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Misleading progress bar against utf-16-le CSV input 1250495688  
1139484453 https://github.com/simonw/sqlite-utils/issues/433#issuecomment-1139484453 https://api.github.com/repos/simonw/sqlite-utils/issues/433 IC_kwDOCGYnMM5D6ycl frafra 4068 2022-05-27T10:20:08Z 2022-05-27T10:20:08Z NONE

I can confirm. This only happens with sqlite-utils. I am using gnome-terminal with bash.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
CLI eats my cursor 1239034903  
1139392769 https://github.com/simonw/sqlite-utils/issues/438#issuecomment-1139392769 https://api.github.com/repos/simonw/sqlite-utils/issues/438 IC_kwDOCGYnMM5D6cEB frafra 4068 2022-05-27T08:21:53Z 2022-05-27T08:21:53Z NONE

Arguments were specified in the wrong order. PATH TABLE FILE can be misleading :)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
illegal UTF-16 surrogate 1250161887  
1139379923 https://github.com/simonw/sqlite-utils/issues/438#issuecomment-1139379923 https://api.github.com/repos/simonw/sqlite-utils/issues/438 IC_kwDOCGYnMM5D6Y7T frafra 4068 2022-05-27T08:05:01Z 2022-05-27T08:05:01Z NONE

I tried to debug it using pdb, but it looks like sqlite-utils catches the exception, so it is not easy to figure out where the failure is happening.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
illegal UTF-16 surrogate 1250161887  
1133417432 https://github.com/simonw/sqlite-utils/issues/435#issuecomment-1133417432 https://api.github.com/repos/simonw/sqlite-utils/issues/435 IC_kwDOCGYnMM5DjpPY simonw 9599 2022-05-20T21:56:10Z 2022-05-20T21:56:10Z OWNER

Before:

After:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch to Furo documentation theme 1243704847  
1133416698 https://github.com/simonw/sqlite-utils/issues/435#issuecomment-1133416698 https://api.github.com/repos/simonw/sqlite-utils/issues/435 IC_kwDOCGYnMM5DjpD6 simonw 9599 2022-05-20T21:54:43Z 2022-05-20T21:54:43Z OWNER

Done: https://sqlite-utils.datasette.io/en/latest/reference.html

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch to Furo documentation theme 1243704847  
1133396285 https://github.com/simonw/datasette/issues/1746#issuecomment-1133396285 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjkE9 simonw 9599 2022-05-20T21:28:29Z 2022-05-20T21:28:29Z OWNER

That fixed it:

https://user-images.githubusercontent.com/9599/169614893-ec81fe0e-6043-4d7d-b429-7f087dbeaf61.png

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133348094 https://github.com/simonw/datasette/issues/1746#issuecomment-1133348094 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjYT- simonw 9599 2022-05-20T20:40:09Z 2022-05-20T20:40:09Z OWNER

Relevant JavaScript: https://github.com/simonw/datasette/blob/1d33fd03b3c211e0f48a8f3bde83880af89e4e69/docs/_static/js/custom.js#L20-L24

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133347051 https://github.com/simonw/datasette/issues/1746#issuecomment-1133347051 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjYDr simonw 9599 2022-05-20T20:39:17Z 2022-05-20T20:39:17Z OWNER

Now live at https://docs.datasette.io/en/latest/ - the JavaScript that adds the banner about that not being the stable version doesn't seem to work though.

Before:

https://user-images.githubusercontent.com/9599/169607254-99bc0358-4a08-43bf-9aac-a24cc2121979.png

After:

https://user-images.githubusercontent.com/9599/169607296-bcec1ed2-517c-4acc-a9a9-d1119d0cc589.png

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1081861670 https://github.com/simonw/datasette/pull/1693#issuecomment-1081861670 https://api.github.com/repos/simonw/datasette/issues/1693 IC_kwDOBm6k_c5Ae-Ym codecov[bot] 22429695 2022-03-29T13:18:47Z 2022-05-20T20:36:30Z NONE

Codecov Report

Merging #1693 (65a5d5e) into main (1465fea) will not change coverage.
The diff coverage is n/a.

:exclamation: Current head 65a5d5e differs from pull request most recent head ec2d1e4. Consider uploading reports for the commit ec2d1e4 to get more accurate results

@@           Coverage Diff           @@
##             main    #1693   +/-   ##
=======================================
  Coverage   91.67%   91.67%           
=======================================
  Files          36       36           
  Lines        4658     4658           
=======================================
  Hits         4270     4270           
  Misses        388      388           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1d33fd0...ec2d1e4. Read the comment docs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Bump black from 22.1.0 to 22.3.0 1184850337  
1133335940 https://github.com/simonw/datasette/issues/1746#issuecomment-1133335940 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjVWE simonw 9599 2022-05-20T20:30:29Z 2022-05-20T20:30:29Z OWNER

I think the solution will be to extend the base.html template from Furo, using the same trick I used in https://til.simonwillison.net/readthedocs/custom-sphinx-templates

https://github.com/pradyunsg/furo/blob/2022.04.07/src/furo/theme/furo/base.html - the site_meta block looks good.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  
1133333144 https://github.com/simonw/datasette/issues/1746#issuecomment-1133333144 https://api.github.com/repos/simonw/datasette/issues/1746 IC_kwDOBm6k_c5DjUqY simonw 9599 2022-05-20T20:28:25Z 2022-05-20T20:28:25Z OWNER

One last question: how to include the Plausible analytics?

Furo doesn't have any specific tools for this:

  • https://github.com/pradyunsg/furo/discussions/243
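Combined with the site_meta block mentioned above, one way this could look as a Sphinx template override - a sketch only; the _templates path and the Plausible script URL are assumptions (use the snippet Plausible provides for your domain):

```html
{# docs/_templates/base.html - requires templates_path = ["_templates"] in conf.py #}
{% extends "!base.html" %}
{% block site_meta %}
  {{ super() }}
  <script defer data-domain="docs.datasette.io"
          src="https://plausible.io/js/plausible.js"></script>
{% endblock %}
```

The `!base.html` prefix tells Sphinx to extend the theme's own template rather than recursing into this override.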
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Switch documentation theme to Furo 1243498298  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 570.302ms · About: github-to-sqlite