{"id": 403922644, "node_id": "MDU6SXNzdWU0MDM5MjI2NDQ=", "number": 8, "title": "Problems handling column names containing spaces or - ", "user": {"value": 82988, "label": "psychemedia"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2019-01-28T17:23:28Z", "updated_at": "2019-04-14T15:29:33Z", "closed_at": "2019-02-23T21:09:03Z", "author_association": "NONE", "pull_request": null, "body": "Irrrespective of whether using column names containing a space or - character is good practice, SQLite does allow it, but `sqlite-utils` throws an error in the following cases:\r\n\r\n```python\r\nfrom sqlite_utils import Database\r\n\r\ndbname = 'test.db'\r\nDB = Database(sqlite3.connect(dbname))\r\n\r\nimport pandas as pd\r\ndf = pd.DataFrame({'col1':range(3), 'col2':range(3)})\r\n\r\n#Convert pandas dataframe to appropriate list/dict format\r\nDB['test1'].insert_all( df.to_dict(orient='records') )\r\n#Works fine\r\n```\r\n\r\nHowever:\r\n\r\n```python\r\ndf = pd.DataFrame({'col 1':range(3), 'col2':range(3)})\r\nDB['test1'].insert_all(df.to_dict(orient='records'))\r\n```\r\n\r\nthrows:\r\n\r\n```\r\n---------------------------------------------------------------------------\r\nOperationalError Traceback (most recent call last)\r\n in ()\r\n 1 import pandas as pd\r\n 2 df = pd.DataFrame({'col 1':range(3), 'col2':range(3)})\r\n----> 3 DB['test1'].insert_all(df.to_dict(orient='records'))\r\n\r\n/usr/local/lib/python3.7/site-packages/sqlite_utils/db.py in insert_all(self, records, pk, foreign_keys, upsert, batch_size, column_order)\r\n 327 jsonify_if_needed(record.get(key, None)) for key in all_columns\r\n 328 )\r\n--> 329 result = self.db.conn.execute(sql, values)\r\n 330 self.db.conn.commit()\r\n 331 self.last_id = result.lastrowid\r\n\r\nOperationalError: near \"1\": syntax error\r\n```\r\n\r\nand:\r\n\r\n```python\r\ndf = pd.DataFrame({'col-1':range(3), 'col2':range(3)})\r\nDB['test1'].upsert_all(df.to_dict(orient='records'))\r\n```\r\n\r\nresults in:\r\n\r\n```\r\n---------------------------------------------------------------------------\r\nOperationalError Traceback (most recent call last)\r\n in ()\r\n 1 import pandas as pd\r\n 2 df = pd.DataFrame({'col-1':range(3), 'col2':range(3)})\r\n----> 3 DB['test1'].insert_all(df.to_dict(orient='records'))\r\n\r\n/usr/local/lib/python3.7/site-packages/sqlite_utils/db.py in insert_all(self, records, pk, foreign_keys, upsert, batch_size, column_order)\r\n 327 jsonify_if_needed(record.get(key, None)) for key in all_columns\r\n 328 )\r\n--> 329 result = self.db.conn.execute(sql, values)\r\n 330 self.db.conn.commit()\r\n 331 self.last_id = result.lastrowid\r\n\r\nOperationalError: near \"-\": syntax error\r\n```", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/8/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 403625674, "node_id": "MDU6SXNzdWU0MDM2MjU2NzQ=", "number": 7, "title": ".insert_all() should accept a generator and process it efficiently", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 3, "created_at": "2019-01-28T02:11:58Z", "updated_at": "2019-01-28T06:26:53Z", "closed_at": "2019-01-28T06:26:53Z", 
"author_association": "OWNER", "pull_request": null, "body": "Right now you have to load every record into memory before passing the list to `.insert_all()` and friends.\r\n\r\nIf you want to process millions of rows, this is inefficient. Python has generators - we should use them!\r\n\r\nThe only catch here is that part of the magic of `sqlite-utils` is that it guesses the column types and creates the table for you. This code will need to be updated to notice if the table needs creating and, if it does, create it using the first X (where x=1,000 but can be customized) records.\r\n\r\nIf a record outside of those first 1,000 has a rogue column, we can crash with an error.\r\n\r\nThis will free us up to make the `--nl` option added in #6 much more efficient.", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/7/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"} {"id": 403624090, "node_id": "MDU6SXNzdWU0MDM2MjQwOTA=", "number": 6, "title": "\"sqlite-utils insert\" should support newline-delimited JSON", "user": {"value": 9599, "label": "simonw"}, "state": "closed", "locked": 0, "assignee": null, "milestone": null, "comments": 1, "created_at": "2019-01-28T02:00:02Z", "updated_at": "2019-01-28T02:17:45Z", "closed_at": "2019-01-28T02:17:45Z", "author_association": "OWNER", "pull_request": null, "body": "We can already export newline delimited JSON. We should learn to import it as well.\r\n\r\nThe neat thing about importing it is that you can import GBs of data without having to read the whole lot into memory in order to decode the wrapping JSON array.\r\n\r\nDatasette can export it now: https://github.com/simonw/datasette/issues/405\r\n\r\nDemo: https://latest.datasette.io/fixtures/facetable.json?_shape=array&_nl=on\r\n\r\nIt should be possible to do this:\r\n\r\n $ curl \"https://latest.datasette.io/fixtures/facetable.json?_shape=array&_nl=on\" \\\r\n | sqlite-utils insert data.db facetable - --nl\r\n", "repo": {"value": 140912432, "label": "sqlite-utils"}, "type": "issue", "active_lock_reason": null, "performed_via_github_app": null, "reactions": "{\"url\": \"https://api.github.com/repos/simonw/sqlite-utils/issues/6/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", "draft": null, "state_reason": "completed"}