Description
In git, if I understand correctly, a commit-ish is a git object from which a commit can be reached by dereferencing it zero or more times. That is, all commits are commit-ish; some tag objects are commit-ish (those that, through possibly repeated dereferencing, eventually reach a commit); and no other types of git objects are ever commit-ish.
commit-ish (also committish)
A commit object or an object that can be recursively dereferenced to a commit object. The following are all commit-ishes: a commit object, a tag object that points to a commit object, a tag object that points to a tag object that points to a commit object, etc.
Therefore, all instances of GitPython's `Commit` class, and some instances of GitPython's `TagObject` class, encapsulate git objects that are actually commit-ish.
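The definition above can be sketched with a toy model. This is only an illustration under stated assumptions: `FakeObject`, `FakeTag`, `peel_to_commit`, and `is_commit_ish` are hypothetical stand-ins written for this example, not GitPython classes or functions.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class FakeObject:
    """Hypothetical stand-in for a commit, tree, or blob (not a GitPython class)."""

    type: str  # one of "commit", "tree", "blob"


@dataclass
class FakeTag:
    """Hypothetical stand-in for a tag object that points at another object."""

    object: Union["FakeTag", FakeObject]
    type: str = "tag"


def peel_to_commit(obj: Union[FakeTag, FakeObject]) -> Optional[FakeObject]:
    """Dereference tags zero or more times; return the result only if it is a commit."""
    while obj.type == "tag":
        obj = obj.object
    return obj if obj.type == "commit" else None


def is_commit_ish(obj: Union[FakeTag, FakeObject]) -> bool:
    """True for commits and for tags that eventually reach a commit."""
    return peel_to_commit(obj) is not None
```

In this model a commit and a tag of a tag of a commit are both commit-ish, while a tree, a blob, or a tag that points to a tree is not.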
But GitPython has a `Commit_ish` union type in the `git.types` module, and that `Commit_ish` type is considerably broader:
Line 53 in b2c3d8b
That union lists four classes: `Commit`, `TagObject`, `Tree`, and `Blob`. These are the GitPython classes whose instances encapsulate the four types of git objects (of which blobs and trees are never actually commit-ish):
object type
One of the identifiers "commit", "tree", "tag" or "blob" describing the type of an object.
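As a rough illustration of how broad such a union is, here is a sketch that rebuilds it from the description in this issue. The four classes are empty stand-ins, not the real GitPython classes, the alias is a reconstruction rather than a copy of the referenced line, and `accepts_commit_ish` is a hypothetical function used only to show what a type checker would accept.

```python
from typing import Union, get_args


# Empty stand-ins for the four GitPython object classes (assumption: names only).
class Commit: ...
class TagObject: ...
class Tree: ...
class Blob: ...


# Reconstructed from the description: a union of all four object classes.
Commit_ish = Union[Commit, TagObject, Tree, Blob]


def accepts_commit_ish(obj: Commit_ish) -> str:
    """Hypothetical function; a type checker accepts any of the four classes here."""
    return type(obj).__name__


# The union's members include Tree and Blob, which are never actually commit-ish.
members = {cls.__name__ for cls in get_args(Commit_ish)}
```

With an annotation like this, passing a `Tree` or `Blob` type-checks even though neither can ever be dereferenced to a commit.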
GitPython uses its `Commit_ish` type in accordance with this much broader concept, at least some of the time and possibly always. For example, `Commit_ish` is given as the return type of `Object.new`:
Lines 77 to 78 in b2c3d8b
`Commit_ish` cannot simply be replaced by `Object` because GitPython's `Object` class is also, through `IndexObject`, a superclass of `Submodule` (and the `RootModule` subclass of `Submodule`):
GitPython/git/objects/submodule/base.py, line 82 in b2c3d8b
GitPython/git/objects/submodule/base.py, lines 87 to 88 in b2c3d8b
GitPython/git/objects/submodule/base.py, lines 100 to 101 in b2c3d8b
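The inheritance relationship described above can be sketched with stand-in classes. These mirror only the class names and the subclass relationships stated in this issue; the bodies are empty, and `takes_object` is a hypothetical function added to show the consequence for annotations.

```python
# Stand-ins that mirror only the inheritance described above (not the real classes).
class Object: ...
class IndexObject(Object): ...
class Commit(Object): ...
class Submodule(IndexObject): ...
class RootModule(Submodule): ...


def takes_object(obj: Object) -> str:
    """Hypothetical function annotated with Object instead of a union of git object types."""
    return type(obj).__name__


# Because Submodule derives from Object, an Object annotation also admits submodules,
# which are not one of the four git object types.
submodule_is_object = issubclass(Submodule, Object)
root_module_is_object = isinstance(RootModule(), Object)
```

So annotating with `Object` would widen the accepted types to include `Submodule` and `RootModule`, rather than expressing "one of the four git object types".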
However, elsewhere in GitPython, `Commit_ish` is used where it seems only a commit is intended to be allowed. It is unclear whether this is unintentional, intentional only so that type checkers accept code that can only reasonably be checked at runtime, or intentional for some other reason. For example, the `Repo.commit` method, when called with one argument, looks up a commit in the repository it represents from a `Commit_ish` or string, and returns the commit it finds as a `Commit`:
Line 698 in b2c3d8b
This leads to a situation where one can write code that type checkers allow and that may appear intended to work, but that always fails, and in a way that may be unclear to users less familiar with git concepts:
    >>> import git
    >>> repo = git.Repo()
    >>> tree = repo.tree()
    >>> repo.commit(tree)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Users\ek\source\repos\GitPython\git\repo\base.py", line 709, in commit
        return self.rev_parse(str(rev) + "^0")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\ek\source\repos\GitPython\git\repo\fun.py", line 379, in rev_parse
        obj = to_commit(obj)
              ^^^^^^^^^^^^^^
      File "C:\Users\ek\source\repos\GitPython\git\repo\fun.py", line 221, in to_commit
        raise ValueError("Cannot convert object %r to type commit" % obj)
    ValueError: Cannot convert object <git.Tree "d5538cc6cc8839ccb0168baf9f98aebcedfd9c2c"> to type commit
An argument that this specific situation with `Repo.commit` is not a typing bug is that this operation is fundamentally one that can only be checked at runtime in some cases. After all, an argument of type `str` is also allowed, and it cannot be known until runtime what object a string happens to name. Even so, the method docstring should possibly be expanded to clarify this issue. Or perhaps, if the situation with `Commit_ish` is improved, the potential for confusion will go away.
One way to improve this situation is to clearly document it in a docstring for the `Commit_ish` type. But if possible, it seems to me that more should be done:
- If known, the reason for the current situation should be stated there.
- Its relationship to other types should be clarified where otherwise confusing. For example, `Object` may benefit from greater clarity about what it ideally represents (git objects) versus the entirety of what it represents (that an `Object` can also be a `Submodule`), and the way that `Tree_ish` is narrower than all tree-ish git objects while `Commit_ish` is broader than all commit-ish git objects can be noted in one of their docstrings.
- Maybe `Commit_ish` should be deprecated and one or more new types introduced, replacing all uses of it in GitPython.
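The last idea in the list could look roughly like the following. This is only a sketch of one possible shape, not a proposal for specific names: `AnyGitObject`, `NarrowCommitIsh`, and `lookup_commit` are hypothetical, and the four classes are empty stand-ins rather than the real GitPython classes.

```python
from typing import Union, get_args


# Empty stand-ins for the four GitPython object classes (assumption: names only).
class Commit: ...
class TagObject: ...
class Tree: ...
class Blob: ...


# Hypothetical name for "any of the four git object types".
AnyGitObject = Union[Commit, TagObject, Tree, Blob]

# Hypothetical narrower alias: only the classes whose instances can be commit-ish.
NarrowCommitIsh = Union[Commit, TagObject]


def lookup_commit(rev: Union[NarrowCommitIsh, str]) -> str:
    """Placeholder with the narrower annotation; a type checker would reject a Tree here."""
    return f"would resolve {type(rev).__name__}"


narrow_members = set(get_args(NarrowCommitIsh))
broad_members = set(get_args(AnyGitObject))
```

Even with a narrower alias, a runtime check would still be needed, since a `TagObject` may point to a tree or blob and a `str` may name any object.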
If I am making a fundamental mistake about git concepts here, and GitPython's `Commit_ish` has a closer and more intuitive relationship to commit-ish git objects than I think it does, then I apologize.
I have not figured out very much from GitPython's revision history about why `Commit_ish` is defined as it currently is, or alternatively why this union of all four actual git object types was introduced with the narrower-seeming name `Commit_ish`. However, the `Commit_ish` type was introduced in 82b131c (#1282), where the annotations it was used to replace had listed all four types `Commit`, `TagObject`, `Tree`, and `Blob` as explicit alternatives.