Skip to content

SymbolicReference.set_reference may not roll back properly on error #1669

Closed
@EliahKagan

Description

@EliahKagan

SymbolicReference.set_reference contains this code, which is meant to attempt a transactional write:

ok = True
try:
fd.write(write_value.encode("utf-8") + b"\n")
lfd.commit()
ok = True
finally:
if not ok:
lfd.rollback()

However, because the ok flag is initialized to True instead of False--which strongly appears unintentional--it is always True even if the ok = True statement in the try block is not reached. Consequently, lfd.rollback() in the finally block is never called. This seems on first glance like a serious bug, but as detailed below I believe it might actually be quite minor in the context of what kind of cleanup GitPython generally does or does not guarantee.

This can be contrasted to another place where that pattern is used correctly:

GitPython/git/index/base.py

Lines 227 to 233 in a5a6464

ok = False
try:
self._serialize(stream, ignore_extension_data)
ok = True
finally:
if not ok:
lfd.rollback()

This is very simple to fix, since it is sufficient to change True to False in the initial assignment. So most of the work to fix it would be figuring out where a regression test should go and how it should manage to produce a case where cleanup doesn't happen immediately.

(Besides that simple fix, other fixes that also reduce the risk of such bugs coming back and allow the code to be simplified are possible. In particular, the LockedFD object represents a non-memory resource that always can and should be managed deterministically according to where control flow is, so it ought to implement the context manager protocol.)

I am unsure if this bug causes any problems in practice. If I understand LockedFD correctly, the write that is not rolled back is to a temporary file separate from the real target, accessed through a file object specific to that LockedFD object. The lock file name is determined from the target file name, so if a separate attempt to write to the target is made, that could cause problems. LockedFD has a __del__ method that should be called eventually and performs the rollback. On CPython, which has reference counting as its primary garbage collection strategy, cleanup would often happen immediately as a result... though not always, even in the absence of reference cycles, because, in exception handling, an exception object holds a reference to a traceback object which holds references to all stack frames in its call stack, through which local variables thus remain accessible until control leaves the handling except block.

Still, it may be that this shall not be considered a bug--or at least not a bug of its own, or at least not a bug that merits a regression test accompanying its bugfix--on the grounds that it might be considered a special case of the Leakage of System Resources limitation documented prominently in the readme. In support of this view, the third place in the code of GitPython where ldf.rollback is called uses a different pattern and its comments suggest that cleanup via LockedFD.__del__ is considered acceptable to rely on:

GitPython/git/refs/log.py

Lines 261 to 268 in a5a6464

fp = lfd.open(write=True, stream=True)
try:
self._serialize(fp)
lfd.commit()
except Exception:
# on failure it rolls back automatically, but we make it clear
lfd.rollback()
raise

(Interestingly, that is not exactly equivalent to the behavior of the try-finally pattern used in git/index/base.py. If it is intended to be, then it should catch BaseException rather than Exception.)

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions