home / github / issues

Menu
  • Search all tables
  • GraphQL API

issues: 797159961

This data as json

id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association pull_request body repo type active_lock_reason performed_via_github_app reactions draft state_reason
797159961 MDExOlB1bGxSZXF1ZXN0NTY0MjE1MDEx 225 fix for problem in Table.insert_all on search for columns per chunk of rows 261237 closed 0     2 2021-01-29T20:16:07Z 2021-02-14T21:04:13Z 2021-02-14T21:04:13Z NONE simonw/sqlite-utils/pulls/225

Hi,

I ran into a problem when trying to create a database from my Apple Healthkit data using healthkit-to-sqlite. The program crashed because of an invalid insert statement that was generated for table rDistanceCycling.

The actual problem turned out to be in sqlite-utils. Table.insert_all processes the data to be inserted in chunks of rows and checks for every chunk which columns are used, and it will collect all column names in the variable all_columns. The collection of columns is done using a nested list comprehension that is not completely correct.

I'm using a Windows machine and had to make a few adjustments to the tests in order to be able to run them because they had a posix dependency.

Thanks, kind regards,

Frans

```

this is a (condensed) chunk of data from my Apple healthkit export that caused the problem.

the 3 last items in the chunk have additional keys: metadata_HKMetadataKeySyncVersion and metadata_HKMetadataKeySyncIdentifier

chunk = [{'sourceName': 'AppleÂ\xa0Watch van Frans', 'sourceVersion': '7.0.1', 'device': '<<HKDevice: 0x281cf6c70>, name:Apple Watch, manufacturer:Apple Inc., model:Watch, hardware:Watch3,4, software:7.0.1>', 'unit': 'km', 'creationDate': '2020-10-10 12:29:09 +0100', 'startDate': '2020-10-10 12:29:06 +0100', 'endDate': '2020-10-10 12:29:07 +0100', 'value': '0.00518016'}, {'sourceName': 'AppleÂ\xa0Watch van Frans', 'sourceVersion': '7.0.1', 'device': '<<HKDevice: 0x281cf6c70>, name:Apple Watch, manufacturer:Apple Inc., model:Watch, hardware:Watch3,4, software:7.0.1>', 'unit': 'km', 'creationDate': '2020-10-10 12:29:10 +0100', 'startDate': '2020-10-10 12:29:07 +0100', 'endDate': '2020-10-10 12:29:08 +0100', 'value': '0.00544049'}, {'sourceName': 'AppleÂ\xa0Watch van Frans', 'sourceVersion': '6.2.6', 'device': '<<HKDevice: 0x281cf83e0>, name:Apple Watch, manufacturer:Apple Inc., model:Watch, hardware:Watch3,4, software:6.2.6>', 'unit': 'km', 'creationDate': '2020-10-14 05:54:12 +0100', 'startDate': '2020-07-15 16:40:50 +0100', 'endDate': '2020-07-15 16:42:49 +0100', 'value': '0.952092', 'metadata_HKMetadataKeySyncVersion': '1', 'metadata_HKMetadataKeySyncIdentifier': '3:674DBCDB-3FE8-40D1-9FC1-E54A2B413805:616520450.99823:616520569.99360:119'}, {'sourceName': 'AppleÂ\xa0Watch van Frans', 'sourceVersion': '6.2.6', 'device': '<<HKDevice: 0x281cf83e0>, name:Apple Watch, manufacturer:Apple Inc., model:Watch, hardware:Watch3,4, software:6.2.6>', 'unit': 'km', 'creationDate': '2020-10-14 05:54:12 +0100', 'startDate': '2020-07-15 16:42:49 +0100', 'endDate': '2020-07-15 16:44:51 +0100', 'value': '0.848983', 'metadata_HKMetadataKeySyncVersion': '1', 'metadata_HKMetadataKeySyncIdentifier': '3:674DBCDB-3FE8-40D1-9FC1-E54A2B413805:616520569.99360:616520691.98826:119'}, {'sourceName': 'AppleÂ\xa0Watch van Frans', 'sourceVersion': '6.2.6', 'device': '<<HKDevice: 0x281cf83e0>, name:Apple Watch, manufacturer:Apple Inc., model:Watch, hardware:Watch3,4, software:6.2.6>', 'unit': 'km', 'creationDate': '2020-10-14 05:54:12 +0100', 'startDate': '2020-07-15 16:44:51 +0100', 'endDate': '2020-07-15 16:46:50 +0100', 'value': '0.834403', 'metadata_HKMetadataKeySyncVersion': '1', 'metadata_HKMetadataKeySyncIdentifier': '3:674DBCDB-3FE8-40D1-9FC1-E54A2B413805:616520691.98826:616520810.98305:119'}]

def all_columns_old(): all_columns = [col for col in chunk[0]] all_columns += [column for record in chunk for column in record if column not in all_columns] return all_columns

def all_columns_new(): all_columns = [col for col in chunk[0]] for record in chunk: all_columns += [column for column in record if column not in all_columns] return all_columns

if name == 'main': from pprint import pprint

print('problem: ')
pprint(all_columns_old())
print('\nfix: ')
pprint(all_columns_new())

```

140912432 pull    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/225/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
0  

Links from other tables

  • 0 rows from issues_id in issues_labels
  • 2 rows from issue in issue_comments
Powered by Datasette · Queries took 0.877ms · About: github-to-sqlite