GitPython repo.index.commit() spawns persistent git.exe instance, holds handles to repo #553

Closed
@ghost

Description

I am trying to use GitPython for some repo manipulation, but ran into an issue in my app: handles stay open where I wouldn't expect them to.

Bug-jarring the issue, it seems that calling repo.index.commit() results in several (4, consistently) git.exe processes being spawned, each holding a handle to the repo's root directory, newRepo. When the test below goes to delete the TemporaryDirectory, the processes are still running, which causes a failure in the context manager's __exit__(). My app occasionally needs to do a similar cleanup, so it hits the same issue.

On the one hand, there appears to be no context-manager-capable repo wrapper, so it makes sense that any resources left open will typically not be cleaned up or GC'd before the tempdir's __exit__(). On the other hand, if Repo is going to behave like that, it really should not persist resources that have such side effects.

Here is a working unittest:

import unittest
import git
import tempfile
import os.path

class Test(unittest.TestCase):

    def testCreateRepo(self):
        with tempfile.TemporaryDirectory(prefix=(__loader__.name) + "_") as mydir:

            # MAKE NEW REPO 
            repo = git.Repo.init(path=os.path.join(mydir, "newRepo"), mkdir=True)
            self.assertTrue(os.path.isdir(os.path.join(repo.working_dir, ".git")), "Failed to make new repo?")
            
            # MAKE FILE, COMMIT REPO
            testFileName = "testFile.txt"
            open(os.path.join(repo.working_dir, testFileName) , "w").close()
            repo.index.add([testFileName])
            self.assertTrue(repo.is_dirty())
            
            #### 
            # COMMENTING THIS OUT --> TEST PASSES
            repo.index.commit("added initial test file") 
            self.assertFalse(repo.is_dirty())
            #### 
            
            # adding this does not affect the handle
            git.cmd.Git.clear_cache()
            
            
            print("done") # exception thrown right after this, on __exit__
            # I can also os.walk(topdown=False) and delete all files and dirs (including .git/);
            # it is just the newRepo/ folder itself whose handle is held open
            
if __name__ == '__main__':
    unittest.main()

PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\Users\%USER%\AppData\Local\Temp\EXAMPLE_gitpython_v3kbrly_\newRepo'

Digging a little deeper, it seems that GitPython spawns multiple git.exe processes, and each of them holds a handle to the repo's root folder, newRepo.

  • Set a breakpoint immediately before the error and use sysinternals/handle to see the open handles to newRepo: they belong to git.exe (4 separate git.exe PIDs, to be precise).
  • Using sysinternals/procexp I can see that they are all spawned from the python instance.
    -- I'm typically running this from PyDev, but I verified the issue reproduces under a vanilla command-line invocation of python.exe as well.
  • The exception indicates that a handle to newRepo is being held. After adding a little extra code to the above, I think that is the only handle held: I am able to successfully os.remove()/os.rmdir() every dir and file, including all of .git/, and I finally reproduce the error seen on __exit__() in my example manually when I call os.rmdir(newRepo).
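The manual cleanup from the last bullet can be sketched roughly as follows. This is not the exact code I ran; the helper name remove_bottom_up is made up for illustration, and it uses only the standard library. It deletes bottom-up and records every path that fails, so the one entry still held open (newRepo in my case) stands out.

```python
import os
import stat


def remove_bottom_up(root):
    """Delete `root` bottom-up; return a list of (path, exception) for failures.

    Hypothetical helper for illustration, not part of GitPython.
    """
    failures = []
    # topdown=False yields the deepest directories first, so every
    # directory is already empty by the time we try to remove it.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Files under .git/objects are read-only; make them
                # writable first, otherwise os.remove fails on Windows.
                os.chmod(path, stat.S_IWRITE)
                os.remove(path)
            except OSError as exc:
                failures.append((path, exc))
        for name in dirnames:
            path = os.path.join(dirpath, name)
            try:
                os.rmdir(path)
            except OSError as exc:
                failures.append((path, exc))
    # Finally the root itself.
    try:
        os.rmdir(root)
    except OSError as exc:
        failures.append((root, exc))
    return failures
```

Calling remove_bottom_up(mydir) just before the end of the with block and printing the returned list should show which path is still locked, rather than only the single PermissionError that TemporaryDirectory raises.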

Stepping through, it's the call to repo.index.commit() that actually leads to the git.exe process(es) being spawned.
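For reference, the kind of context-manager wrapper I would have expected could look something like the sketch below. The names released and demo are hypothetical. It assumes the Repo object exposes a close() method, which may not exist in every GitPython version, so the call is guarded. I have not confirmed that this actually terminates the git.exe processes in my setup.

```python
import contextlib
import gc


@contextlib.contextmanager
def released(repo):
    """Yield `repo`, then try to release whatever it keeps open."""
    try:
        yield repo
    finally:
        # Assumption: the object has a close() method that shuts down
        # its persistent git child processes. Skip it if absent.
        close = getattr(repo, "close", None)
        if callable(close):
            close()
        # Collect garbage so wrappers around child processes can be
        # finalized before the directory is removed.
        gc.collect()


def demo():
    """Hypothetical usage; needs GitPython and a git executable."""
    import os.path
    import tempfile

    import git  # imported lazily so `released` works without GitPython

    with tempfile.TemporaryDirectory() as mydir:
        repo = git.Repo.init(path=os.path.join(mydir, "newRepo"), mkdir=True)
        with released(repo):
            open(os.path.join(repo.working_dir, "testFile.txt"), "w").close()
            repo.index.add(["testFile.txt"])
            repo.index.commit("added initial test file")
        # The repo has been closed here, before TemporaryDirectory.__exit__ runs.
```

The point of the nesting is ordering: the repo's resources are released inside the TemporaryDirectory block, so nothing should still hold newRepo when the directory is deleted.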
