Description
Hi, I've run into a problem on Windows with file/directory removal using shutil.rmtree(). The problem itself is not new and already has known solutions, including git.util.rmtree(). However, using this function alone was not enough in my case, so I started to dig deeper: there was an error about an open file handle. I discovered extra child git processes hanging in the process tree and, unexpectedly, they were still there after all the library objects had been removed (see the reproduction snippet near the end of this description).
Then I tried to catch and trace Popen calls from the library and found this function:
Lines 1190 to 1199 in 5b3669e
From the comments and names, it looks like this persistent behavior is intended. There is also the AutoInterrupt class here:
Lines 367 to 373 in 5b3669e
returned from:
Line 841 in 5b3669e
The class comment indicates that the process should be killed when the object goes out of scope. To me, it looks like an attempt to imitate the C++ RAII idiom. Unfortunately, it does not work this way in Python because of the garbage collector and the scoping rules. That is, an object with no remaining references may be destroyed at any time after it loses its last reference, not necessarily right away. The "pythonic" way to deal with this is a context manager (which this class is not).
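To illustrate the point about finalization timing, here is a minimal sketch in plain Python, with no GitPython involved. The Handle class is a hypothetical stand-in for an object that owns a child process, and the reference cycle is an assumption used only to show one way cleanup in __del__ can be delayed in CPython. I have not checked whether such a cycle actually exists in the library.

```python
import gc


class Handle:
    """Hypothetical stand-in for an object that owns a child process."""

    finalized = []  # names of the handles whose __del__ has run

    def __init__(self, name):
        self.name = name
        self.me = self  # reference cycle: the object refers to itself

    def __del__(self):
        # In a real class this is where the process would be terminated.
        Handle.finalized.append(self.name)


gc.disable()  # keep the automatic cycle collector out of the demo
h = Handle("cyclic")
del h  # the last external reference is gone, but the cycle remains

after_del = list(Handle.finalized)  # __del__ has not run yet

gc.collect()  # only an explicit collection finalizes the object
after_collect = list(Handle.finalized)
gc.enable()
```

With the cycle in place, del alone does not trigger the cleanup; it happens only once the cycle collector runs, which is presumably why a forced gc call can appear to fix the problem.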
My suggestion is to collect these processes and manage them at the library/repo level, or to wrap this in a context manager (which is better than just calling wait() manually, because it also covers the case of exceptions). BTW, the created process is probably not reused after the first call.
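A rough sketch of what I mean, using only the standard library. ProcessRegistry, spawn() and close() are hypothetical names, not part of GitPython, and the child here is a dummy Python process that blocks on stdin instead of a real git process.

```python
import subprocess
import sys


class ProcessRegistry:
    """Hypothetical owner of long-lived child processes (not a GitPython API)."""

    def __init__(self):
        self._procs = []

    def spawn(self, args):
        # Start a child and remember it so it can be reaped later.
        proc = subprocess.Popen(args, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
        self._procs.append(proc)
        return proc

    def close(self):
        # Shut down every tracked child; calling close() twice is harmless.
        while self._procs:
            proc = self._procs.pop()
            for stream in (proc.stdin, proc.stdout):
                if stream is not None:
                    stream.close()  # closing stdin gives the child an EOF
            try:
                proc.wait(timeout=5)
            except subprocess.TimeoutExpired:
                proc.kill()  # the child ignored EOF, so force it
                proc.wait()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()  # runs on normal exit and on exceptions


# Dummy child that blocks until its stdin is closed.
child_cmd = [sys.executable, "-c", "import sys; sys.stdin.read()"]

with ProcessRegistry() as registry:
    children = [registry.spawn(child_cmd) for _ in range(2)]
    running_inside = all(p.poll() is None for p in children)

exited_after = all(p.poll() is not None for p in children)
```

The cleanup no longer depends on when (or whether) the objects are finalized, so the directory could be removed right after the with block ends.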
import git
r = git.Repo.init('test_repo')
r.index.commit('aa')
del r
Upd: I see there is an undocumented repo.close(), which can help but may have undesired side effects because of a forced gc call.