Empty read from gitdb.OStream.read() before EOF #120

Open
@lordmauve

I have code that relies on reading an object from a gitdb stream.

To do this I used a standard .read() loop (as with io.RawIOBase):

stream = db.stream(bytes.fromhex(sha))
while chunk := stream.read(4096):
    yield chunk

The behaviour I expected (by duck-typing with RawIOBase) is to see b'' only at EOF. The RawIOBase.read() documentation says:

If 0 bytes are returned, and size was not 0, this indicates end of file.

However stream.read(4096) can return empty chunks even before the end of the stream, so the loop exits early.

For the file where I first saw this, the behaviour is sensitive to the size parameter: it appears to occur for 0 < size <= 4096.
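As a stopgap, a loop that counts bytes instead of treating an empty chunk as EOF can sidestep the early exit. The following is a minimal sketch, not part of gitdb: iter_chunks and max_empty_reads are hypothetical names, and it assumes the uncompressed size of the object is known up front (for example, from the size reported alongside the stream).

```python
from typing import Callable, Iterator


def iter_chunks(
    read: Callable[[int], bytes],
    total_size: int,
    chunk_size: int = 4096,
    max_empty_reads: int = 8,
) -> Iterator[bytes]:
    """Yield chunks until total_size bytes have been seen.

    Empty chunks are tolerated and retried, because the stream in question
    may return b'' before EOF. max_empty_reads is an arbitrary safety limit
    so a truly exhausted stream cannot cause an endless loop.
    """
    remaining = total_size
    empty_reads = 0
    while remaining > 0:
        chunk = read(min(chunk_size, remaining))
        if not chunk:
            empty_reads += 1
            if empty_reads > max_empty_reads:
                raise EOFError(f"stream ended with {remaining} bytes missing")
            continue
        empty_reads = 0
        remaining -= len(chunk)
        yield chunk
```

With the snippet above this would be called as iter_chunks(stream.read, stream.size), assuming the stream object exposes its size that way.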

Looking at the code there is a condition to repeat a read if we got insufficient bytes:

gitdb/gitdb/stream.py

Lines 316 to 317 in f36c0cc

if dcompdat and (len(dcompdat) - len(dat)) < size and self._br < self._s:
    dcompdat += self.read(size - len(dcompdat))

However, the leading "if dcompdat and" means that the retry is skipped when zero bytes were read. Removing this part of the condition fixes the issue for me (although I understand from the comment that it is there to support compressed_bytes_read()).
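To make the difference concrete, here is a toy model of just that condition, with and without the leading truthiness check. It is only an illustration: the function names are hypothetical, and bytes_read and total stand in for self._br and self._s from the quoted lines.

```python
def needs_retry_original(
    dcompdat: bytes, dat: bytes, size: int, bytes_read: int, total: int
) -> bool:
    # Mirrors the quoted condition: an empty dcompdat short-circuits to False.
    return bool(dcompdat) and (len(dcompdat) - len(dat)) < size and bytes_read < total


def needs_retry_relaxed(
    dcompdat: bytes, dat: bytes, size: int, bytes_read: int, total: int
) -> bool:
    # Same condition with the leading "dcompdat and" removed.
    return (len(dcompdat) - len(dat)) < size and bytes_read < total


# Zero bytes decompressed so far, 4096 requested, stream not yet exhausted.
empty_case = dict(dcompdat=b"", dat=b"", size=4096, bytes_read=0, total=10_000)
print(needs_retry_original(**empty_case))  # False: no retry, caller gets b''
print(needs_retry_relaxed(**empty_case))   # True: the read is repeated
```

Both versions agree whenever at least one byte was produced; they differ only in the zero-byte case that triggers the early exit.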
