issue_comments

11 rows where author_association = "CONTRIBUTOR" and "updated_at" is on date 2022-10-07 sorted by updated_at descending

issue 3

  • docker image is duplicating db files somehow 9
  • Publishing to cloudrun with immutable mode? 1
  • Exceeding Cloud Run memory limits when deploying a 4.8G database 1

user 1

  • fgregg 11

author_association 1

  • CONTRIBUTOR · 11 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1271103097 https://github.com/simonw/datasette/issues/1836#issuecomment-1271103097 https://api.github.com/repos/simonw/datasette/issues/1836 IC_kwDOBm6k_c5Lw355 fgregg 536941 2022-10-07T04:43:41Z 2022-10-07T04:43:41Z CONTRIBUTOR

@simonw, should i open up a new issue for investigating the differences between "immutable=1" and "mode=ro" and possibly switching to "mode=ro"? or would you like to keep that conversation in this issue?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
docker image is duplicating db files somehow 1400374908  
1271101072 https://github.com/simonw/datasette/issues/1480#issuecomment-1271101072 https://api.github.com/repos/simonw/datasette/issues/1480 IC_kwDOBm6k_c5Lw3aQ fgregg 536941 2022-10-07T04:39:10Z 2022-10-07T04:39:10Z CONTRIBUTOR

switching from immutable=1 to mode=ro completely addressed this. see https://github.com/simonw/datasette/issues/1836#issuecomment-1271100651 for details.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Exceeding Cloud Run memory limits when deploying a 4.8G database 1015646369  
1271100651 https://github.com/simonw/datasette/issues/1836#issuecomment-1271100651 https://api.github.com/repos/simonw/datasette/issues/1836 IC_kwDOBm6k_c5Lw3Tr fgregg 536941 2022-10-07T04:38:14Z 2022-10-07T04:38:14Z CONTRIBUTOR

yes, and i also think that this is causing the apparent memory problems in #1480. when the container starts up, it runs some operation on the database in immutable mode, which apparently makes some small change to the db file. if that's so, then the db files will be copied to the read/write layer, which counts against cloudrun's memory allocation!

running a test of that now.

this completely addressed #1480

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
docker image is duplicating db files somehow 1400374908  
1271035998 https://github.com/simonw/datasette/issues/1301#issuecomment-1271035998 https://api.github.com/repos/simonw/datasette/issues/1301 IC_kwDOBm6k_c5Lwnhe fgregg 536941 2022-10-07T02:38:04Z 2022-10-07T02:38:04Z CONTRIBUTOR

the only mode that publish cloudrun supports right now is immutable

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Publishing to cloudrun with immutable mode? 860722711  
1271020193 https://github.com/simonw/datasette/issues/1836#issuecomment-1271020193 https://api.github.com/repos/simonw/datasette/issues/1836 IC_kwDOBm6k_c5Lwjqh fgregg 536941 2022-10-07T02:15:05Z 2022-10-07T02:21:08Z CONTRIBUTOR

when i hack the connect method to open non mutable files with "mode=ro" and not "immutable=1" https://github.com/simonw/datasette/blob/eff112498ecc499323c26612d707908831446d25/datasette/database.py#L79

then:

    870 B RUN /bin/sh -c datasette inspect nlrb.db --inspect-file inspect-data.json

the datasette inspect layer is only the size of the json file!
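
A minimal sketch of what opening a file with "mode=ro" instead of "immutable=1" can look like in plain Python. This is not Datasette's actual connect method: `open_readonly` and `count_rows` are hypothetical helpers introduced here for illustration, and the `filing` table name is borrowed from the test scripts in this thread.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def open_readonly(db_path):
    """Open a SQLite file with mode=ro (hypothetical helper, not Datasette code)."""
    # as_uri() yields an absolute file: URI and percent-encodes special characters
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def count_rows(db_path, table):
    """Return the row count of `table` through a read-only connection."""
    # the table name is interpolated for illustration only, so quote it first
    quoted = '"' + table.replace('"', '""') + '"'
    with closing(open_readonly(db_path)) as conn:
        return conn.execute(f"select count(*) from {quoted}").fetchone()[0]
```

Calling `count_rows("/app/nlrb.db", "filing")` would mirror the query used in the test scripts. Any write attempted through a connection from `open_readonly` raises `sqlite3.OperationalError`.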

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
docker image is duplicating db files somehow 1400374908  
1271008997 https://github.com/simonw/datasette/issues/1836#issuecomment-1271008997 https://api.github.com/repos/simonw/datasette/issues/1836 IC_kwDOBm6k_c5Lwg7l fgregg 536941 2022-10-07T02:00:37Z 2022-10-07T02:00:49Z CONTRIBUTOR

yes, and i also think that this is causing the apparent memory problems in #1480. when the container starts up, it runs some operation on the database in immutable mode, which apparently makes some small change to the db file. if that's so, then the db files will be copied to the read/write layer, which counts against cloudrun's memory allocation!

running a test of that now.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
docker image is duplicating db files somehow 1400374908  
1271003212 https://github.com/simonw/datasette/issues/1836#issuecomment-1271003212 https://api.github.com/repos/simonw/datasette/issues/1836 IC_kwDOBm6k_c5LwfhM fgregg 536941 2022-10-07T01:52:04Z 2022-10-07T01:52:04Z CONTRIBUTOR

and if we try immutable mode, which is how things are opened by datasette inspect, we duplicate the files!!!

```python
# test_sql_immutable.py

import sqlite3
import sys

db_name = sys.argv[1]
# open the database via a URI with immutable=1
conn = sqlite3.connect(f'file:/app/{db_name}?immutable=1', uri=True)
cur = conn.cursor()
cur.execute('select count(*) from filing')
print(cur.fetchone())
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
docker image is duplicating db files somehow 1400374908  
1270992795 https://github.com/simonw/datasette/issues/1836#issuecomment-1270992795 https://api.github.com/repos/simonw/datasette/issues/1836 IC_kwDOBm6k_c5Lwc-b fgregg 536941 2022-10-07T01:29:15Z 2022-10-07T01:50:14Z CONTRIBUTOR

fascinatingly, telling python to open sqlite in read only mode makes this layer have a size of 0

```python
# test_sql_ro.py

import sqlite3
import sys

db_name = sys.argv[1]
# open the database via a URI with mode=ro (read only)
conn = sqlite3.connect(f'file:/app/{db_name}?mode=ro', uri=True)
cur = conn.cursor()
cur.execute('select count(*) from filing')
print(cur.fetchone())
```

that's quite weird because setting the file permissions to read only didn't do anything. (on reflection, that chmod isn't doing anything because the dockerfile commands are run as root)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
docker image is duplicating db files somehow 1400374908  
1270988081 https://github.com/simonw/datasette/issues/1836#issuecomment-1270988081 https://api.github.com/repos/simonw/datasette/issues/1836 IC_kwDOBm6k_c5Lwb0x fgregg 536941 2022-10-07T01:19:01Z 2022-10-07T01:27:35Z CONTRIBUTOR

okay, some progress!! running some sql against a database file causes that file to get duplicated even if it doesn't apparently change the file.

make a little test script like this:

```python
# test_sql.py

import sqlite3
import sys

db_name = sys.argv[1]
# open the database via a plain file: URI, with no mode or immutable flag
conn = sqlite3.connect(f'file:/app/{db_name}', uri=True)
cur = conn.cursor()
cur.execute('select count(*) from filing')
print(cur.fetchone())
```

then

    RUN python test_sql.py nlrb.db

produced a layer that's the same size as nlrb.db!!
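
The claim that the query "doesn't apparently change the file" can be checked outside Docker with a sketch like the one below. It hashes the database before and after running a query through a default connection. `file_sha256` and `query_leaves_file_unchanged` are hypothetical names introduced for illustration. The check covers file contents only and says nothing about how Docker decides what goes into a layer.

```python
import hashlib
import sqlite3
from contextlib import closing


def file_sha256(path, block_size=1024 * 1024):
    """Hash a file in fixed-size blocks so large databases are not loaded whole."""
    m = hashlib.sha256()
    with open(path, "rb") as fp:
        for block in iter(lambda: fp.read(block_size), b""):
            m.update(block)
    return m.hexdigest()


def query_leaves_file_unchanged(db_path, sql):
    """Return True if running `sql` leaves the database file's bytes identical."""
    before = file_sha256(db_path)
    # default connection, i.e. no mode=ro and no immutable=1
    with closing(sqlite3.connect(db_path)) as conn:
        conn.execute(sql).fetchall()
    return file_sha256(db_path) == before
```

For example, `query_leaves_file_unchanged("nlrb.db", "select count(*) from filing")` compares the hashes around the same query used in test_sql.py.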

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
docker image is duplicating db files somehow 1400374908  
1270936982 https://github.com/simonw/datasette/issues/1836#issuecomment-1270936982 https://api.github.com/repos/simonw/datasette/issues/1836 IC_kwDOBm6k_c5LwPWW fgregg 536941 2022-10-07T00:52:41Z 2022-10-07T00:52:41Z CONTRIBUTOR

it's not that the inspect command is somehow changing the db files. if i set them to read-only, the "inspect" layer still has the same very large size.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
docker image is duplicating db files somehow 1400374908  
1270923537 https://github.com/simonw/datasette/issues/1836#issuecomment-1270923537 https://api.github.com/repos/simonw/datasette/issues/1836 IC_kwDOBm6k_c5LwMER fgregg 536941 2022-10-07T00:46:08Z 2022-10-07T00:46:08Z CONTRIBUTOR

i thought it was maybe to do with reading through all the files, but that does not seem to be the case

if i make a little test file like:

```python
# test_read.py

import hashlib
import sys
import pathlib

HASH_BLOCK_SIZE = 1024 * 1024


def inspect_hash(path):
    """Calculate the hash of a database, efficiently."""
    m = hashlib.sha256()
    with path.open("rb") as fp:
        while True:
            data = fp.read(HASH_BLOCK_SIZE)
            if not data:
                break
            m.update(data)

    return m.hexdigest()


inspect_hash(pathlib.Path(sys.argv[1]))
```

then a line in the Dockerfile like

    RUN python test_read.py nlrb.db && echo "[]" > /etc/inspect.json

just produces a layer of 3B

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
docker image is duplicating db files somehow 1400374908  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);