Description
The git.cmd.Git.refresh
class method includes this code:
Lines 478 to 481 in d28c20b
It is unclear what "the difference between a first import and a second import" is referring to. (I should have noticed this issue when working on #1725, in which I actually edited that comment to improve its formatting, capitalization, and punctuation, but I apparently did not notice it at that time.)
That code can run multiple times, accessing the same class-wide state each time, because Git.refresh
can be called multiple times (and, being public, may reasonably be called by code outside of GitPython). But this is not related to how many imports, or import attempts, occur. The comment would make sense if Python's import system worked differently, but seems at odds with the actual design. As I understand it:
-
If an import succeeds, the resulting module object is cached in
sys.modules
. Subsequent imports use the same module object and do not rerun the module's code.For example, suppose
hello.py
contains:print("Hello, world!")
Then running
import hello
from a REPL prints the greeting the first time it is run, but running it subsequent times produces no repeated side effect. -
If an import fails, the new module object, which may itself be incompletely populated, is removed from
sys.modules
, before control returns via the raised exception propagating up the call stack to the code that attempted the import. Even if that module object turns out still to exist, it will not be reused in subsequent attempts to import the code. A subsequent import attempt will rerun the module's code from scratch, creating a new module object which serves as a new namespace. The state in the old module object is not used; global variables of the module, including classes, and thus also including their attributes, are all new.For example, suppose
throw.py
contains:import sys me = sys.modules[__name__] try: me.x += 1 except AttributeError: me.x = 1 try: a = sys.a except AttributeError: a = sys.a = [] a.append(me) raise ValueError(x)
Then running
import throw
from a REPL raisesValueError
with a value of1
each time it is run, andsys.a
is populated with the abandoned module object from each attempt, all of which are separate objects. (Importingsys
in the REPL allows it to be inspected, sinceimport sys
succeeds the first time it occurs, whether that is in the REPL or inthrow.py
, and thus allimport sys
get the samesys
module object with the samea
attribute.) -
If the contents of
sys.modules
are explicitly modified (rather than implicitly, due to the effect ofimport
andfrom
statements), one of the above two situations still applies, depending on the nature of the mutation. The exception to this would be if a reference to the old module object is obtained and patched back intosys.modules
. That would be a very unusual thing to do, except perhaps for purposes of debugging or experimentation, and I'm reasonably sure the intent of that comment--and the code it documents--is not to cover such cases. Furthermore, that would still not rerun top-level code of the module. -
Python's import system is extensible and can be made to work differently from the usual rules. Of course, this cannot generally be anticipated by library code, and I likewise doubt this is a factor in the design or documentation.
I suspect that the code is correct and the comment incorrect, but since I'm not sure what was originally intended, I am not entirely sure how to fix it. It could be changed to say something like "discern the difference between the initial refresh at import time and subsequent calls," but this is very general and I worry it would paper over whatever really ought to be said. On the other hand, perhaps this really is all that was meant.
I haven't looked into the history of the Python import system but I don't think it operated in a fundamentally different way at the time this explanation was written. This code, and the original accompanying comment, was added in #640 (in 2b3e769). The reasoning given in that pull request, for the actual code change, continues to makes sense, and does not seem to have relied on any incorrect--nor any now-outdated--assumptions about the import system. As far as I can tell, it is only the comment that should be improved.