Description
Methods of installing GitPython that run code from setup.py fail in some locales. This does not affect installing from a wheel, but it does affect installing from an sdist or from a local directory, including the editable install procedure recommended for development in the readme. The same problem happens when building GitPython. Building or installing using the old method of running setup.py directly is also affected. The error is of this form, though the codec will not always be gbk:
UnicodeDecodeError: 'gbk' codec can't decode byte 0xa6 in position 473: illegal multibyte sequence
I am unsure if this ever happens in practice on Unix-like systems, whose locales are usually UTF-8. However, it happens on Windows systems in which README.md cannot be decoded using the system's active ANSI code page. This is a rarely changed systemwide setting, so changing the user's preferred languages, display language, or input method is not a workaround. I discovered this on a Simplified Chinese (zh-CN) build of Windows Server 2022 while using it to test some WSL-related test helper logic in #1745. Such a system uses ANSI code page 936. README.md is UTF-8, but its current contents happen to decode without error under code page 1252, which Windows builds for Western European languages use as their ANSI code page. I expect encodings other than cp936 to fail as well.
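To illustrate why the same UTF-8 bytes can decode without error under one legacy code page and fail under another, here is a minimal sketch. The sample bytes are an assumption chosen for illustration (an ellipsis followed by a space), not the actual contents of README.md, and `try_decode` is a hypothetical helper.

```python
# Hypothetical sample: an ellipsis followed by a space, encoded as UTF-8.
# These are NOT the actual bytes of README.md.
sample = "\u2026 ".encode("utf-8")  # b'\xe2\x80\xa6 '


def try_decode(data: bytes, codec: str) -> str:
    """Return an ASCII-safe repr of the decoded text, or a failure note."""
    try:
        return ascii(data.decode(codec))
    except UnicodeDecodeError as error:
        return f"failed: {error.reason}"


# Try the real encoding and two legacy code pages.
results = {codec: try_decode(sample, codec) for codec in ("utf-8", "cp1252", "gbk")}
for codec, outcome in results.items():
    print(f"{codec}: {outcome}")
```

Under cp1252 every byte in this sample maps to some character, so decoding succeeds but yields mojibake. Under gbk the trailing 0xa6 is a lead byte with no valid trail byte after it, so decoding raises.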
With the PyPI sdist for GitPython 3.1.40 (on Python 3.12 x86-64, though I expect all supported versions to be affected):
(.venv) C:\Users\Administrator\gptest> pip install --no-binary GitPython GitPython
Collecting GitPython
Using cached GitPython-3.1.40.tar.gz (200 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... error
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> [18 lines of output]
Traceback (most recent call last):
File "C:\Users\Administrator\gptest\.venv\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 353, in <module>
main()
File "C:\Users\Administrator\gptest\.venv\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 335, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Administrator\gptest\.venv\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 118, in get_requires_for_build_wheel
return hook(config_settings)
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Administrator\AppData\Local\Temp\pip-build-env-8j1uxmuv\overlay\Lib\site-packages\setuptools\build_meta.py", line 325, in get_requires_for_build_wheel
return self._get_build_requires(config_settings, requirements=['wheel'])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Administrator\AppData\Local\Temp\pip-build-env-8j1uxmuv\overlay\Lib\site-packages\setuptools\build_meta.py", line 295, in _get_build_requires
self.run_setup()
File "C:\Users\Administrator\AppData\Local\Temp\pip-build-env-8j1uxmuv\overlay\Lib\site-packages\setuptools\build_meta.py", line 311, in run_setup
exec(code, locals())
File "<string>", line 20, in <module>
UnicodeDecodeError: 'gbk' codec can't decode byte 0xa6 in position 473: illegal multibyte sequence
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
A workaround is to pass -X utf8 to python. For example, python -X utf8 -m pip install ... can be used for installation.
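As a quick way to confirm what -X utf8 changes, the following sketch starts a child interpreter with that option and reports its UTF-8 mode flag and preferred encoding. The `probe` string is just an illustrative one-liner, not part of any GitPython tooling.

```python
import subprocess
import sys

# One-liner run in the child: report the UTF-8 mode flag and the encoding
# that open() would use by default when no encoding argument is given.
probe = "import locale, sys; print(sys.flags.utf8_mode, locale.getpreferredencoding(False))"

completed = subprocess.run(
    [sys.executable, "-X", "utf8", "-c", probe],
    capture_output=True,
    text=True,
    check=True,
)
flag, encoding = completed.stdout.split()
print(flag, encoding)
```

With UTF-8 mode on, the preferred encoding no longer follows the ANSI code page, which is why the build hook can read README.md.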
The fix should be straightforward. Importing setup.py confirms that the specific cause is reading README.md:
(.venv) C:\Users\Administrator\repos\GitPython [main ≡]> python -c 'import setup'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\Administrator\repos\GitPython\setup.py", line 20, in <module>
long_description = rm_file.read()
^^^^^^^^^^^^^^
UnicodeDecodeError: 'gbk' codec can't decode byte 0xa6 in position 477: illegal multibyte sequence
It can be fixed by passing encoding="utf-8". I have proposed this change in #1748.
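A minimal sketch of the idea, assuming setup.py reads the readme with a plain `open(...)` call: the temporary file below is a stand-in for README.md with made-up contents, and opening it with `encoding="gbk"` simulates what the locale default would be on a cp936 system. The exact change is in the pull request, not here.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for README.md; the contents are made up for illustration.
    readme = Path(tmp) / "README.md"
    readme.write_bytes("GitPython \u2026 docs\n".encode("utf-8"))

    # Explicit encoding: independent of the system's ANSI code page.
    with open(readme, encoding="utf-8") as rm_file:
        long_description = rm_file.read()

    # Simulate a cp936 default by naming the codec explicitly.
    try:
        with open(readme, encoding="gbk") as rm_file:
            rm_file.read()
        legacy_ok = True
    except UnicodeDecodeError:
        legacy_ok = False

print(ascii(long_description), legacy_ok)
```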
gitdb and smmap are unaffected, because gitdb does not open files to read those data in setup.py, while smmap does open README.md but passes encoding="utf-8".