Feel free to close this issue - I mostly added it for reference for future folks who run into this :)
I have a CSV file with one column that contains very long strings. When I try to import this file via the `insert` command I get the following error:
```
sqlite-utils insert database.db table_name file_with_large_column.csv
Traceback (most recent call last):
  File "/usr/local/bin/sqlite-utils", line 10, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/sqlite_utils/cli.py", line 774, in insert
    default=default,
  File "/usr/local/lib/python3.7/site-packages/sqlite_utils/cli.py", line 705, in insert_upsert_implementation
    docs, pk=pk, batch_size=batch_size, alter=alter, **extra_kwargs
  File "/usr/local/lib/python3.7/site-packages/sqlite_utils/db.py", line 1852, in insert_all
    first_record = next(records)
  File "/usr/local/lib/python3.7/site-packages/sqlite_utils/cli.py", line 703, in <genexpr>
    docs = (decode_base64_values(doc) for doc in docs)
  File "/usr/local/lib/python3.7/site-packages/sqlite_utils/cli.py", line 681, in <genexpr>
    docs = (dict(zip(headers, row)) for row in reader)
_csv.Error: field larger than field limit (131072)
```
This was run inside the Docker image `datasetteproject/datasette:0.54`, with the following versions:

```
# sqlite-utils --version
sqlite-utils, version 3.4.1
# datasette --version
datasette, version 0.54
```
It appears this is a known limitation of reading CSV files in Python: the standard library `csv` module caps individual fields at 131072 bytes by default, and that limit doesn't look to be modifiable through system or environment variables (I may be very wrong on this) - it can only be raised from Python code via `csv.field_size_limit()`.
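For anyone who can drop into Python rather than the CLI, here is a minimal workaround sketch - not what sqlite-utils itself does, just one way to combine `csv.field_size_limit()` with the `sqlite_utils` Python API. The file and table names are the ones from this issue; the back-off loop is there because `sys.maxsize` can overflow the limit's underlying C type on some platforms:

```python
import csv
import sys

import sqlite_utils

# The csv module rejects any field larger than 131072 bytes by default.
# Raise the limit as high as this platform allows; csv.field_size_limit()
# raises OverflowError if the value doesn't fit in a C long, so back off
# until a value is accepted.
limit = sys.maxsize
while True:
    try:
        csv.field_size_limit(limit)
        break
    except OverflowError:
        limit = limit // 10

db = sqlite_utils.Database("database.db")
with open("file_with_large_column.csv", newline="") as f:
    # DictReader yields one dict per row, keyed by the CSV header row,
    # which is exactly the shape insert_all() expects.
    db["table_name"].insert_all(csv.DictReader(f))
```

Note that, like the CLI's CSV mode, every value comes through as a string, so all columns end up as TEXT.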
Noting that the `sqlite3` `.import` command works without error (it doesn't use the Python CSV reader):
```
sqlite3 database.db
sqlite> .mode csv
sqlite> .import file_with_large_column.csv table_name
```
Sadly I couldn't see an easy way around this while using the CLI, as it appears this value can only be changed from Python code (as in the sketch above). FWIW I've switched to using https://datasette.io/tools/csvs-to-sqlite for importing CSV data and it's working well.
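For reference, the invocation is roughly the one-liner below (if I remember right, csvs-to-sqlite derives the table name from the CSV filename rather than taking it as an argument, so the exact result may differ from the `table_name` used above):

```
csvs-to-sqlite file_with_large_column.csv database.db
```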
Finally, I'm loving https://datasette.io/. Thank you very much for an amazing tool and data ecosystem 🙇‍♀️