Source and editable installs fail in some locales #1747

Closed
@EliahKagan

Description

Methods of installing GitPython that run code from setup.py fail in some locales. This does not affect installing from a wheel, but it does affect installing from an sdist, or installing from a local directory, including the editable install procedure recommended for development in the readme. The same problem happens when building GitPython. Building or installing using the old method of running setup.py directly is also affected. The error is of this form, though the codec will not always be gbk:

UnicodeDecodeError: 'gbk' codec can't decode byte 0xa6 in position 473: illegal multibyte sequence

I am unsure if this ever happens in practice on Unix-like systems, whose locales are usually UTF-8. However, it happens on Windows systems in which README.md cannot be decoded using the system's active ANSI code page. This is a rarely changed systemwide setting, so changing the user's preferred languages, display language, or input method is not a workaround. I discovered this on a Simplified Chinese (zh-CN) build of Windows Server 2022 while using it to test some WSL-related test helper logic in #1745. Such a system uses ANSI code page 936. README.md is UTF-8, but it currently happens to decode without error under code page 1252, which Windows builds for Western European languages use as their ANSI code page. I expect encodings other than cp936 to fail as well.
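To illustrate the difference between the code pages, here is a minimal sketch. The sample string is hypothetical and is not taken from README.md; it only contains one non-ASCII character (an ellipsis, whose UTF-8 encoding ends in byte 0xa6). The helper name `try_decode` is likewise made up for this example.

```python
# Hypothetical sample text, not taken from README.md. The ellipsis (U+2026)
# is encoded in UTF-8 as the three bytes e2 80 a6.
SAMPLE = "Loading… done"


def try_decode(data: bytes, codec: str) -> str:
    """Return the decoded text, or a short description of the decode failure."""
    try:
        return data.decode(codec)
    except UnicodeDecodeError as error:
        return f"{type(error).__name__}: {error.reason}"


utf8_bytes = SAMPLE.encode("utf-8")
results = {codec: try_decode(utf8_bytes, codec) for codec in ("utf-8", "cp1252", "gbk")}

for codec, outcome in results.items():
    # ascii() keeps the output printable even on a console that uses a legacy code page.
    print(f"{codec:>7}: {ascii(outcome)}")
```

With this sample, cp1252 decodes silently to the wrong characters, while gbk raises `UnicodeDecodeError`, which matches the kind of failure described above.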

With the PyPI sdist for GitPython 3.1.40 (on Python 3.12 x86-64, though I expect all supported versions to be affected):

(.venv) C:\Users\Administrator\gptest> pip install --no-binary GitPython GitPython
Collecting GitPython
  Using cached GitPython-3.1.40.tar.gz (200 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [18 lines of output]
      Traceback (most recent call last):
        File "C:\Users\Administrator\gptest\.venv\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 353, in <module>
          main()
        File "C:\Users\Administrator\gptest\.venv\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "C:\Users\Administrator\gptest\.venv\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
                 ^^^^^^^^^^^^^^^^^^^^^
        File "C:\Users\Administrator\AppData\Local\Temp\pip-build-env-8j1uxmuv\overlay\Lib\site-packages\setuptools\build_meta.py", line 325, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=['wheel'])
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "C:\Users\Administrator\AppData\Local\Temp\pip-build-env-8j1uxmuv\overlay\Lib\site-packages\setuptools\build_meta.py", line 295, in _get_build_requires
          self.run_setup()
        File "C:\Users\Administrator\AppData\Local\Temp\pip-build-env-8j1uxmuv\overlay\Lib\site-packages\setuptools\build_meta.py", line 311, in run_setup
          exec(code, locals())
        File "<string>", line 20, in <module>
      UnicodeDecodeError: 'gbk' codec can't decode byte 0xa6 in position 473: illegal multibyte sequence
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

A workaround is to pass -X utf8 to python. For example, python -X utf8 -m pip install ... can be used for installation.
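As a rough way to see what the workaround changes, the following sketch starts a child interpreter with and without `-X utf8` and reports whether UTF-8 mode is on and which encoding `locale.getpreferredencoding(False)` returns. The names `PROBE` and `probe_child` are made up for this example; the output of the default run depends on the system.

```python
import subprocess
import sys

# One-liner run in the child: prints the UTF-8 mode flag and the preferred encoding.
PROBE = "import locale, sys; print(sys.flags.utf8_mode, locale.getpreferredencoding(False))"


def probe_child(*interpreter_options: str) -> str:
    """Run PROBE in a child interpreter started with the given options."""
    completed = subprocess.run(
        [sys.executable, *interpreter_options, "-c", PROBE],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


print("default :", probe_child())
print("-X utf8 :", probe_child("-X", "utf8"))
```

On an affected system, the default run would be expected to report the legacy code page, and the `-X utf8` run to report UTF-8.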

The fix should be straightforward. Importing setup.py confirms that the specific cause is reading README.md:

(.venv) C:\Users\Administrator\repos\GitPython [main ≡]> python -c 'import setup'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\Administrator\repos\GitPython\setup.py", line 20, in <module>
    long_description = rm_file.read()
                       ^^^^^^^^^^^^^^
UnicodeDecodeError: 'gbk' codec can't decode byte 0xa6 in position 477: illegal multibyte sequence

It can be fixed by passing encoding="utf-8". I have proposed this change in #1748.
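A minimal sketch of that kind of fix, for illustration only (the actual change is in #1748 and may differ). The names `rm_file` and `long_description` follow the traceback above; the function `read_long_description` and the throwaway file are assumptions made for this example.

```python
import tempfile
from pathlib import Path


def read_long_description(readme_path: Path) -> str:
    """Read the readme as UTF-8, regardless of the locale's default encoding."""
    with open(readme_path, encoding="utf-8") as rm_file:
        return rm_file.read()


# Usage with a throwaway file standing in for README.md.
with tempfile.TemporaryDirectory() as tmp_dir:
    readme = Path(tmp_dir) / "README.md"
    readme.write_bytes("Loading… done\n".encode("utf-8"))
    long_description = read_long_description(readme)
```

Because the encoding is stated explicitly, the result no longer depends on the active ANSI code page or on whether `-X utf8` was passed.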

gitdb and smmap are unaffected: gitdb's setup.py does not open any files to read such data, while smmap's setup.py does open README.md but passes encoding="utf-8".
