issue_comments

10 rows where issue = 807437089 sorted by updated_at descending


user 2

  • simonw 9
  • agguser 1

author_association 2

  • OWNER 9
  • NONE 1

issue 1

  • --no-headers option for CSV and TSV · 10

id html_url issue_url node_id user created_at updated_at author_association body reactions issue performed_via_github_app
1001115286 https://github.com/simonw/sqlite-utils/issues/228#issuecomment-1001115286 https://api.github.com/repos/simonw/sqlite-utils/issues/228 IC_kwDOCGYnMM47q86W agguser 1206106 2021-12-26T07:01:31Z 2021-12-26T07:01:31Z NONE

--no-headers does not work?

```
$ echo 'a,1\nb,2' | sqlite-utils memory --no-headers -t - 'select * from stdin'
a 1

b 2
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--no-headers option for CSV and TSV 807437089  
778851721 https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778851721 https://api.github.com/repos/simonw/sqlite-utils/issues/228 MDEyOklzc3VlQ29tbWVudDc3ODg1MTcyMQ== simonw 9599 2021-02-14T22:23:46Z 2021-02-14T22:23:46Z OWNER

I called this --no-headers for consistency with the existing output option: https://github.com/simonw/sqlite-utils/blob/427dace184c7da57f4a04df07b1e84cdae3261e8/sqlite_utils/cli.py#L61-L64

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--no-headers option for CSV and TSV 807437089  
778849394 https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778849394 https://api.github.com/repos/simonw/sqlite-utils/issues/228 MDEyOklzc3VlQ29tbWVudDc3ODg0OTM5NA== simonw 9599 2021-02-14T22:06:53Z 2021-02-14T22:06:53Z OWNER

For the moment I think just adding --no-headers, which causes column names "unknown1,unknown2,..." to be used, should be enough.

Users can import with that option, then use sqlite-utils transform --rename to rename them.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--no-headers option for CSV and TSV 807437089  
778811746 https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778811746 https://api.github.com/repos/simonw/sqlite-utils/issues/228 MDEyOklzc3VlQ29tbWVudDc3ODgxMTc0Ng== simonw 9599 2021-02-14T17:39:30Z 2021-02-14T21:16:54Z OWNER

I'm going to detach this from the #131 column types idea.

The three things I need to handle here are:

  • The CSV file doesn't have a header row at all, so I need to specify what the column names should be
  • The CSV file DOES have a header row but I want to ignore it and use alternative column names
  • The CSV doesn't have a header row at all and I want to automatically use unknown1,unknown2... so I can start exploring it as quickly as possible.

Here's a potential design that covers the first two:

  • --replace-header="foo,bar,baz" - ignore whatever is in the first row and pretend it was this instead
  • --add-header="foo,bar,baz" - add a first row with these details, to use as the header

It doesn't cover the "give me unknown column names" case though.
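
The difference between the two proposed options can be illustrated with a minimal Python sketch. This is not the sqlite-utils implementation: the function name `rows_with_header` and its `replace_first_row` parameter are hypothetical, and the option names come only from the proposal above.

```python
import csv
import io


def rows_with_header(lines, header, replace_first_row=False):
    """Yield one dict per CSV row, keyed by the supplied header.

    replace_first_row=True  mimics the proposed --replace-header:
                            the file's own first row is discarded.
    replace_first_row=False mimics the proposed --add-header:
                            every row in the file is treated as data.
    """
    reader = csv.reader(lines)
    if replace_first_row:
        next(reader, None)  # drop the existing header row, if there is one
    for row in reader:
        yield dict(zip(header, row))


sample = "x,y\n1,2\n"
replaced = list(rows_with_header(io.StringIO(sample), ["foo", "bar"], replace_first_row=True))
added = list(rows_with_header(io.StringIO(sample), ["foo", "bar"]))
```

With `replace_first_row=True` the `x,y` row is dropped; without it, `x,y` is kept as the first data row.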

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--no-headers option for CSV and TSV 807437089  
778843086 https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778843086 https://api.github.com/repos/simonw/sqlite-utils/issues/228 MDEyOklzc3VlQ29tbWVudDc3ODg0MzA4Ng== simonw 9599 2021-02-14T21:15:43Z 2021-02-14T21:15:43Z OWNER

I'm not convinced the .has_header() rules are useful for the kind of CSV files I work with: https://github.com/python/cpython/blob/63298930fb531ba2bb4f23bc3b915dbf1e17e9e1/Lib/csv.py#L383

    def has_header(self, sample):
        # Creates a dictionary of types of data in each column. If any
        # column is of a single type (say, integers), *except* for the first
        # row, then the first row is presumed to be labels. If the type
        # can't be determined, it is assumed to be a string in which case
        # the length of the string is the determining factor: if all of the
        # rows except for the first are the same length, it's a header.
        # Finally, a 'vote' is taken at the end for each column, adding or
        # subtracting from the likelihood of the first row being a header.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--no-headers option for CSV and TSV 807437089  
778842982 https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778842982 https://api.github.com/repos/simonw/sqlite-utils/issues/228 MDEyOklzc3VlQ29tbWVudDc3ODg0Mjk4Mg== simonw 9599 2021-02-14T21:15:11Z 2021-02-14T21:15:11Z OWNER

Implementation tip: I have code that reads the first row and uses it as headers here: https://github.com/simonw/sqlite-utils/blob/8f042ae1fd323995d966a94e8e6df85cc843b938/sqlite_utils/cli.py#L689-L691

So if I want to use unknown1,unknown2... I can do that by reading the first row, counting the number of columns, generating headers based on that range and then continuing to build that generator (maybe with itertools.chain() to replay the record we already read).
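
A rough sketch of that approach, using only the standard library. The function name `rows_with_generated_headers` and the `prefix` parameter are assumptions made for illustration; this is not the code in cli.py.

```python
import csv
import io
from itertools import chain


def rows_with_generated_headers(lines, prefix="unknown"):
    """Yield dicts for a header-less CSV, naming columns unknown1, unknown2, ..."""
    reader = csv.reader(lines)
    try:
        first = next(reader)  # read one row just to count the columns
    except StopIteration:
        return  # empty input: nothing to yield
    headers = [f"{prefix}{i}" for i in range(1, len(first) + 1)]
    # chain() replays the row already consumed ahead of the remaining rows
    for row in chain([first], reader):
        yield dict(zip(headers, row))


generated = list(rows_with_generated_headers(io.StringIO("a,1\nb,2\n")))
```

Because the first row is replayed through `chain()`, it still ends up in the output as data rather than being lost to the column count.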

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--no-headers option for CSV and TSV 807437089  
778812050 https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778812050 https://api.github.com/repos/simonw/sqlite-utils/issues/228 MDEyOklzc3VlQ29tbWVudDc3ODgxMjA1MA== simonw 9599 2021-02-14T17:41:30Z 2021-02-14T17:41:30Z OWNER

I just spotted that csv.Sniffer in the Python standard library has a .has_header(sample) method which detects if the first row appears to be a header or not, which is interesting. https://docs.python.org/3/library/csv.html#csv.Sniffer
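
For reference, calling it looks roughly like this. The wrapper name `first_row_looks_like_header` and the sample data are made up for illustration. Falling back to False when the dialect cannot be sniffed is a choice made for this sketch, not something the standard library does itself.

```python
import csv


def first_row_looks_like_header(sample):
    """Return csv.Sniffer's guess about whether the first row is a header."""
    try:
        return csv.Sniffer().has_header(sample)
    except csv.Error:
        # has_header() sniffs the dialect first, and that step can fail
        return False


sample = "name,age\nalice,30\nbob,25\ncarol,41\n"
guess = first_row_looks_like_header(sample)
```

The result is a heuristic guess, so an explicit option is still needed for files where it guesses wrong.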

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--no-headers option for CSV and TSV 807437089  
778811934 https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778811934 https://api.github.com/repos/simonw/sqlite-utils/issues/228 MDEyOklzc3VlQ29tbWVudDc3ODgxMTkzNA== simonw 9599 2021-02-14T17:40:48Z 2021-02-14T17:40:48Z OWNER

Another pattern that might be useful is to generate a header that is just "unknown1,unknown2,unknown3" for each of the columns in the rest of the file. This makes it easy to e.g. facet-explore within Datasette to figure out the correct names, then use sqlite-utils transform --rename to rename the columns.

I needed to do that for the https://bl.iro.bl.uk/work/ns/3037474a-761c-456d-a00c-9ef3c6773f4c example.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--no-headers option for CSV and TSV 807437089  
778511347 https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778511347 https://api.github.com/repos/simonw/sqlite-utils/issues/228 MDEyOklzc3VlQ29tbWVudDc3ODUxMTM0Nw== simonw 9599 2021-02-12T23:27:50Z 2021-02-12T23:27:50Z OWNER

For the moment, a workaround can be to cat an additional row onto the start of the file.

echo "name,url,description" | cat - missing_headings.csv | sqlite-utils insert blah.db table - --csv
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--no-headers option for CSV and TSV 807437089  
778349672 https://github.com/simonw/sqlite-utils/issues/228#issuecomment-778349672 https://api.github.com/repos/simonw/sqlite-utils/issues/228 MDEyOklzc3VlQ29tbWVudDc3ODM0OTY3Mg== simonw 9599 2021-02-12T18:00:43Z 2021-02-12T18:00:43Z OWNER

I could combine this with #131 to allow types to be specified in addition to column names.

Probably need an option that means "ignore the existing heading row and use this one instead".

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
--no-headers option for CSV and TSV 807437089  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 36.015ms · About: github-to-sqlite