
Commit_ish is much broader than commit-ish #1858

Closed
@EliahKagan

In git, if I understand correctly, a commit-ish is a git object from which a commit can be reached by dereferencing it zero or more times. That is, all commits are commit-ish; some tag objects are commit-ish, namely those that eventually reach a commit through (possibly repeated) dereferencing; and no other types of git objects are ever commit-ish.

As gitglossary(7) says:

commit-ish (also committish)

A commit object or an object that can be recursively dereferenced to a commit object. The following are all commit-ishes: a commit object, a tag object that points to a commit object, a tag object that points to a tag object that points to a commit object, etc.

Therefore, all instances of GitPython's Commit class, and some instances of GitPython's TagObject class, encapsulate git objects that are actually commit-ish.
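To make the definition concrete, here is a minimal sketch of the "dereference zero or more times" rule. It uses toy stand-in classes, not GitPython's real ones; the names `Tag`, `peel_to_commit`, and `is_commit_ish` are hypothetical and exist only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Union


# Toy stand-ins for the four git object types (NOT GitPython's classes).
@dataclass(frozen=True)
class Commit:
    name: str


@dataclass(frozen=True)
class Tree:
    name: str


@dataclass(frozen=True)
class Blob:
    name: str


@dataclass(frozen=True)
class Tag:
    name: str
    target: Union[Commit, Tree, Blob, "Tag"]  # a tag may point at any object


GitObject = Union[Commit, Tree, Blob, Tag]


def peel_to_commit(obj: GitObject) -> Optional[Commit]:
    """Follow tags until a non-tag is reached; return it only if it is a commit."""
    while isinstance(obj, Tag):
        obj = obj.target
    return obj if isinstance(obj, Commit) else None


def is_commit_ish(obj: GitObject) -> bool:
    """True for commits and for tags that eventually dereference to a commit."""
    return peel_to_commit(obj) is not None


# Example objects: a commit, a tag on it, a tag on that tag, and a tag on a tree.
c = Commit("c1")
tag_on_commit = Tag("v1.0", c)
tag_on_tag = Tag("release", tag_on_commit)
tag_on_tree = Tag("tree-tag", Tree("t1"))
```

In this model, `c`, `tag_on_commit`, and `tag_on_tag` are commit-ish, while `tag_on_tree` and any bare tree or blob are not.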

But GitPython has a Commit_ish union type in the git.types module, and that Commit_ish type is considerably broader:

Commit_ish = Union["Commit", "TagObject", "Blob", "Tree"]

These four classes are the GitPython classes whose instances encapsulate any of the four types of git objects (of which blobs and trees are never actually commit-ish):

object type

One of the identifiers "commit", "tree", "tag" or "blob" describing the type of an object.

GitPython uses its Commit_ish type in accordance with this much broader concept, at least some of the time and possibly always. For example, Commit_ish is given as the return type of Object.new:

@classmethod
def new(cls, repo: "Repo", id: Union[str, "Reference"]) -> Commit_ish:

Commit_ish cannot simply be replaced by Object because GitPython's Object class is also, through IndexObject, a superclass of Submodule (and the RootModule subclass of Submodule):

class Submodule(IndexObject, TraversableIterableObj):

The submodule type does not have a string type associated with it, as it exists
solely as a marker in the tree and index.

type: Literal["submodule"] = "submodule" # type: ignore
"""This is a bogus type for base class compatibility."""

However, elsewhere in GitPython, Commit_ish is used where it seems that only a commit is intended to be allowed. It is unclear whether this is unintentional, intentional but only so that type checkers accept some code that can only reasonably be checked at runtime, or intentional for some other reason. For example, the Repo.commit method, when called with one argument, looks up a commit in the repository it represents from a Commit_ish or string, and returns the commit it finds as a Commit:

def commit(self, rev: Union[str, Commit_ish, None] = None) -> Commit:

This leads to a situation where one can write code that type checkers allow and that may appear intended to work, but that always fails, and in a way that may be unclear to users less familiar with git concepts:

>>> import git
>>> repo = git.Repo()
>>> tree = repo.tree()
>>> repo.commit(tree)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\ek\source\repos\GitPython\git\repo\base.py", line 709, in commit
    return self.rev_parse(str(rev) + "^0")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\ek\source\repos\GitPython\git\repo\fun.py", line 379, in rev_parse
    obj = to_commit(obj)
          ^^^^^^^^^^^^^^
  File "C:\Users\ek\source\repos\GitPython\git\repo\fun.py", line 221, in to_commit
    raise ValueError("Cannot convert object %r to type commit" % obj)
ValueError: Cannot convert object <git.Tree "d5538cc6cc8839ccb0168baf9f98aebcedfd9c2c"> to type commit

An argument that this specific situation with Repo.commit is not a typing bug is that this operation is fundamentally one that can only be checked at runtime in some cases. After all, an argument of type str is also allowed, and it cannot be known until runtime what object a string happens to name. Even so, the method docstring should possibly be expanded to clarify this issue. Or perhaps, if the situation with Commit_ish is improved, the potential for confusion will go away.

One way to improve this situation is to clearly document it in a docstring for the Commit_ish type. But if possible it seems to me that more should be done:

  • If known, the reason for the current situation should be stated there.
  • Its relationship to other types should be clarified where otherwise confusing. For example, Object may benefit from greater clarity about what it ideally represents (git objects) versus the entirety of what it represents (that an Object can also be a Submodule), and the way that Tree_ish is narrower than all tree-ish git objects while Commit_ish is broader than all commit-ish git objects can be noted in one of their docstrings.
  • Maybe Commit_ish should be deprecated and one or more new types introduced, replacing all uses of it in GitPython.
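As one possible shape for the last suggestion, here is a rough sketch that splits the current union into two aliases. The classes are empty stand-ins rather than GitPython's real ones, and the alias names `AnyGitObject` and `CommitIsh`, as well as the `commit` function, are hypothetical choices made only for this illustration.

```python
from typing import Union, get_args


# Empty stand-ins for GitPython's object classes (assumption: illustration only).
class Commit: ...
class TagObject: ...
class Tree: ...
class Blob: ...


# Hypothetical alias for "any of the four git object types",
# which is what Commit_ish currently expresses.
AnyGitObject = Union[Commit, TagObject, Tree, Blob]

# Hypothetical narrower alias covering only the classes whose instances
# can be commit-ish. A TagObject may still fail to reach a commit, so a
# runtime check remains necessary for tags (and for strings).
CommitIsh = Union[Commit, TagObject]


def commit(rev: Union[str, CommitIsh, None] = None) -> Commit:
    """Stand-in with a Repo.commit-like signature using the narrower alias."""
    raise NotImplementedError("signature sketch only")


# With the narrower annotation, a type checker such as mypy would be
# expected to reject commit(Tree()), instead of it failing only at runtime.
narrow_members = set(get_args(CommitIsh))
broad_members = set(get_args(AnyGitObject))
```

Under this split, trees and blobs remain expressible through the broad alias (for example as a return type for Object.new), while methods that look up commits advertise only the classes that can actually be commit-ish.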

If I am making a fundamental mistake about git concepts here, and GitPython's Commit_ish has a closer and more intuitive relationship to commit-ish git objects than I think it does, then I apologize.

I have not been able to figure out much from GitPython's revision history about why Commit_ish is defined as it currently is, or why this union of all four actual git object types was introduced under the narrower-seeming name Commit_ish. However, the Commit_ish type was introduced in 82b131c (#1282), where the annotations it was used to replace had listed all four types Commit, TagObject, Tree, and Blob as explicit alternatives.
