Choosing a Python interpreter for a Pants project


Introduction

When setting up a Python monorepo managed by Pants, there are many things to consider. How should one structure the source code? How should the requirements be declared? How fine-grained should the packaging approach be? Among these and other questions it is also necessary to determine a strategy for specifying a Python interpreter to be used.

You may want to provide maximum freedom or be very restrictive in how Python interpreters will be found on a developer computer. The approach you take will depend on the IT policies of your organization, number of software engineers, and available resources to support the development infrastructure.

Choosing a compatible Python interpreter carefully matters not only to developers but also to the administrators of a Pants repository. Developers using the Pants build system in their project may see different errors and behavior depending on the Python interpreter used to run the code. Examples include missing system dependencies on Linux, code that requires a framework build of Python on MacOS, and functionality that has been deprecated in a particular Python version.

When deciding how to declare your Python interpreter requirements, you will need to consider a number of things, including for example:

  • Minimum version of Python that your code should be compatible with
  • Origin of Python interpreter to be used and how it is installed in the target environments
  • Frequency of updates for both the supported Python version and the interpreter itself as well as the updating mechanism

Setting the default Python version

If users of a Pants repository can install arbitrary software on their computers, a given user may well have multiple versions of the Python interpreter available on their system. If the runtime environment of the code is known in advance and is tied to a particular Python version, you may be conservative and configure your default Python interpreter compatibility constraints in pants.toml to be tied to a single minor Python version like this:

[python]
interpreter_constraints = ["CPython==3.8.*"]

This will ensure that a particular version of Python is always used, to prevent users taking advantage of syntax and functionality of later versions of Python (making the code incompatible with the runtime environment). But with only interpreter_constraints set, how and where that version of Python will end up installed on a machine is unconstrained.
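To make the meaning of a constraint such as ==3.8.* concrete, here is a minimal sketch in plain Python. It is not how Pants evaluates interpreter constraints; the helper name satisfies_minor_wildcard is hypothetical, and it handles only the "==X.Y.*" form used in the example above.

```python
import sys


def satisfies_minor_wildcard(version, constraint):
    """Return True if `version` (a tuple of ints) matches a constraint like "==3.8.*".

    Hypothetical helper for illustration; only the "==X.Y.*" form is handled.
    """
    if not (constraint.startswith("==") and constraint.endswith(".*")):
        raise ValueError(f"unsupported constraint form: {constraint!r}")
    # "==3.8.*" -> "3.8" -> (3, 8)
    prefix = tuple(int(part) for part in constraint[2:-2].split("."))
    return tuple(version[: len(prefix)]) == prefix


# Any 3.8 patch release matches; a different minor version does not.
print(satisfies_minor_wildcard((3, 8, 10), "==3.8.*"))  # True
print(satisfies_minor_wildcard((3, 9, 1), "==3.8.*"))   # False

# The interpreter running this script, checked against the same constraint.
print(satisfies_minor_wildcard(sys.version_info[:3], "==3.8.*"))
```

The point of the wildcard is that patch releases (3.8.10, 3.8.12, and so on) remain interchangeable, while syntax and library features introduced in 3.9 or later stay out of reach.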

Setting interpreter search path

By default, Pants inspects the $PATH environment variable to discover available Python interpreters. If several compatible interpreters are found, this may result in subtle, hard-to-troubleshoot bugs when the order in which the interpreters are discovered is not deterministic. To be more rigorous, you can set the search_path option in the [python-bootstrap] scope, for example:

[python-bootstrap]
search_path = ["<PYENV_LOCAL>", "/usr/bin"]

<PYENV_LOCAL> is a special notation that Pants understands and it refers to the interpreter specified in the local file .python-version used by pyenv. See Changing the interpreter search path to learn about other special symbols Pants understands when searching for a Python interpreter.

If the responsibility to install a Python interpreter of a particular version is transferred to a developer, then they install the software themselves, optionally using the documentation shared with them. If it is the build system administrators who are in charge, then the software is likely to be pre-installed on a developer's machine and is managed by your organization's IT staff.

It is possible, however, that the expectations of the software engineers will be in conflict because some developers may have certain preferences with regard to which Python interpreter they would want to use or have installed. It may therefore be necessary to find a sensible solution that would scale and cover the needs of most engineers. To learn more about Python interpreter compatibility, please see Interpreter compatibility in the Pants build system documentation.
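Before settling on a search path, it can help to see how many candidate interpreters a developer machine actually exposes. The following is a rough sketch, not Pants's own discovery logic; the function name python3_candidates is hypothetical. It lists python3 executables in the order in which the directories appear on $PATH.

```python
import os
from pathlib import Path


def python3_candidates(path_value):
    """Return python3* executables found in each directory of `path_value`, in order."""
    found = []
    for directory in path_value.split(os.pathsep):
        if not directory or not os.path.isdir(directory):
            continue
        try:
            entries = sorted(Path(directory).glob("python3*"))
        except OSError:
            continue  # unreadable directory
        for entry in entries:
            # Skip helper scripts such as python3-config; keep python3, python3.8, ...
            if "-" in entry.name:
                continue
            if entry.is_file() and os.access(entry, os.X_OK):
                found.append(str(entry))
    return found


for candidate in python3_candidates(os.environ.get("PATH", "")):
    print(candidate)
```

If this prints more than a couple of entries, restricting search_path is probably worth the effort.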

Installing Python on Linux

On Debian-based Linux, Python can be installed with the system python3 package:

$ sudo apt-get install python3

By default, the python3 package installs a predefined version of the interpreter, depending on the operating system version. Installing a specific version is just as simple, provided that the distribution's package repositories offer it:

$ sudo apt-get install python3.8

Be careful, however, when relying on system Python on Linux, particularly with Debian-derived distros, since system Python is often modified from official (python.org) distributions in subtle, but consequential ways. See Python in Debian to learn more. In addition, a global Python installation can have arbitrary packages installed into it and made importable (they end up on sys.path) if users run pip or easy_install without creating a virtual environment, which would break hermeticity and reproducibility.

If you do decide to use a Python interpreter from a system package, it is possible to install multiple versions of Python interpreters side by side. One can choose to update what version of Python will be used when a particular symlink is accessed. For instance, the update-alternatives command would update the /usr/bin/python3 symlink to point to the python3.8 system package interpreter. This would be required, for example, if you want to use by default Python 3.8 on an Ubuntu 18.04 system that ships with Python 3.6:

$ sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.8 1

For engineers writing Python on Linux, you may want to set the expectation that every system is configured so that the python3 command (with no virtual environment activated) points to the interpreter /usr/bin/python3. To avoid confusion, developers should be discouraged from modifying their shell start-up scripts in a way that overrides the python3 interpreter. This may be important if they use the same Python interpreter for development outside of the Pants monorepo or to run Python code locally bypassing Pants for debugging purposes.
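An expectation like this is easier to keep when it can be checked. Below is a small sketch of such a check, which a setup or onboarding script might run; the function names and the EXPECTED constant are illustrative assumptions, not part of Pants.

```python
import shutil
import sys

EXPECTED = "/usr/bin/python3"  # the convention suggested in the text above


def in_virtualenv():
    """True when the running interpreter belongs to a venv or virtualenv."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def check_python3(expected=EXPECTED):
    """Return (ok, resolved), where `resolved` is what `python3` finds on PATH."""
    resolved = shutil.which("python3")
    return resolved == expected, resolved


ok, resolved = check_python3()
if in_virtualenv():
    print("a virtual environment is active; deactivate it before checking")
elif ok:
    print(f"python3 resolves to {resolved}")
else:
    print(f"python3 resolves to {resolved}, expected {EXPECTED}")
```

A shell alias or a modified PATH in a start-up script would show up here as a mismatch.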

If avoiding Python interpreters from system packages on Linux is an option, you may want to look into pyenv, which provides a great alternative to the system Python interpreters. You can learn more about pyenv below, where sources of Python interpreters for MacOS are discussed.

Installing Python on MacOS

At the time of writing, recent MacOS releases such as Big Sur and Monterey still come with Python 2.7. The assumption, however, is that anyone setting up a Pants repository would be interested in Python 3. A fresh installation of MacOS includes a /usr/bin/python3 binary, but it is a stub that prompts you to install the command line developer tools, which provide a Python 3 interpreter.

There are several ways to get a particular version of the Python 3 interpreter installed on a MacOS device, including Homebrew, the Xcode developer tools, and pyenv. Setting up Python on MacOS may be slightly more complicated than on Linux, because a software engineer's MacOS computer is likely to have a variety of Python interpreters installed already.

brew

Most developers on MacOS are likely to use brew, and running brew install python installs the most recent version available. The /usr/local/bin/python3 binary is therefore likely to be managed by brew (on Apple silicon machines, brew installs under /opt/homebrew instead), and developers may have versions ranging from Python 3.7 all the way to Python 3.10 (and later, once newer versions are released). Using brew's Python, however, may prove unreliable because it is updated very frequently.

brew will try to keep this Python interpreter on the latest version. To stop the Python formula from being updated, you can pin it with brew pin. Keep in mind, however, that if a pinned Python formula that another formula depends on becomes outdated, you will need to upgrade it. Therefore, you may want to be careful using the Python coming from brew in the Pants repository as your Python interpreter unless you have a clear strategy for managing brew's updates across your developers' computers.

Xcode

Xcode is free to download and only requires an Apple ID. On a MacOS computer, /usr/bin/python3 is a stub that looks up the active Xcode installation (set with the xcode-select command) and runs the Python interpreter that ships with it. If your developers already have Xcode installed, you may find it convenient to set search_path = ["/usr/bin"], since you can then be sure that a compatible Python interpreter is present on every developer's computer. That is, all developers will have only one Python interpreter they really care about, and its version won't suddenly change (in contrast to how brew updates Python).

Xcode ships a framework build of Python, which may be required to run Python code that needs access to the screen, such as wxPython or Qt applications. Keep in mind, however, that Apple only guarantees that Xcode's Python is sufficient to run the operations and processes in MacOS that need Python, not necessarily user code. This interpreter is therefore often considered nonstandard in subtle ways that can cause various things to break.
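Whether a given interpreter is a framework build can be checked from Python itself. The sketch below relies on the PYTHONFRAMEWORK build-time configuration variable exposed by the standard sysconfig module; the function name is_framework_build is only illustrative.

```python
import sys
import sysconfig


def is_framework_build():
    """True when this interpreter was built as a macOS framework."""
    # PYTHONFRAMEWORK is a non-empty string (usually "Python") on framework
    # builds and empty or missing otherwise, including on Linux.
    return bool(sysconfig.get_config_var("PYTHONFRAMEWORK"))


print(sys.executable)
print("framework build:", is_framework_build())
```

Running this with each interpreter found on a developer machine shows which of them are suitable for GUI code that needs a framework build.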

Relying solely on Xcode's Python interpreter on MacOS will, however, free you from concerns about MacOS upgrades, should Apple decide to start shipping Python 3 with the operating system in the future. If the Xcode installations are managed in your organization, you'll also be able to control the Python interpreter version used by developers (for instance, Xcode 12 ships Python 3.8 and Xcode 13 ships Python 3.9).

Be advised, however, that if Xcode is installed from the App Store, a user can accidentally update it to a later version (which may come with another Python version) as part of the updates offered in the App Store. To avoid this, an organization may give its developers computers that already have everything they need for their work pre-installed. This may be the case for a large organization that needs to enforce the same system environment to ensure a uniform developer experience.

Alternatively, it would be helpful to provide instructions on how to start using the Python interpreter that ships with Xcode, since the Xcode installation can be scripted (with the xip command) and run in quiet mode when updating the software remotely. Installing Xcode as part of a scripted developer environment setup (run by a developer or a system administrator) would ensure that it is not updated along with other system applications via the App Store.

pyenv

The pyenv approach makes it possible to have multiple versions of Python installed, and users can decide which version to use at any given time. pyenv is very popular among Python developers, supports both Linux and MacOS, and provides a clean and stable way to deal with multiple versions of the Python interpreter. To make sure that engineers writing Python programs target a certain Python version, one can install that specific version and select it for the repository:

$ pyenv install 3.8.10
$ pyenv local 3.8.10
$ cat .python-version
3.8.10
$ python -V
Python 3.8.10

You can choose to rely solely on the local repository settings (with the assumption that pyenv is going to be set up for the repository) or be more permissive and search for other interpreters, for example:

[python-bootstrap]
search_path = ["<PYENV_LOCAL>", "/usr/bin"]

In this particular example, if the local .python-version file (created by running the pyenv local <version> command) is not found, Pants will attempt to run a compatible Python interpreter from the /usr/bin directory. You may want to check the .python-version file into the source code repository and provide documentation on how to set up pyenv on a computer and install the required Python interpreter.
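The fallback order described above can be sketched in a few lines of Python. This is an approximation for illustration only, not Pants's implementation: the function names are hypothetical, only the first entry in .python-version is read, and the sketch assumes pyenv's usual layout of $PYENV_ROOT/versions/<version>/bin with ~/.pyenv as the default root.

```python
import os
from pathlib import Path


def pyenv_local_bin(repo_root, pyenv_root=None):
    """Return the bin directory for the version named in .python-version, or None."""
    version_file = Path(repo_root) / ".python-version"
    if not version_file.is_file():
        return None
    versions = version_file.read_text().split()
    if not versions:
        return None
    # Assumption: pyenv keeps each version under <root>/versions/<version>.
    root = Path(pyenv_root or os.environ.get("PYENV_ROOT", Path.home() / ".pyenv"))
    return root / "versions" / versions[0] / "bin"


def candidate_dirs(repo_root, fallback="/usr/bin", pyenv_root=None):
    """Directories to search, in order, mirroring the search_path example above."""
    local = pyenv_local_bin(repo_root, pyenv_root)
    return ([str(local)] if local is not None else []) + [fallback]


print(candidate_dirs("."))
```

With a .python-version file present, the pyenv directory comes first and /usr/bin second; without it, only /usr/bin remains.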

Having your Pants project support a pyenv-driven approach may be sensible if you want to ensure that all engineers use the same Python interpreter, set up in the same way. In addition, with pyenv they can take advantage of any other Python interpreter, should they need one for local development outside of the Pants repository (while using the same pyenv interface). Relying on a compatible Python 3 interpreter found via the .python-version file would also work very well if your Pants repository is used in a mixed environment, where some of your developers are on MacOS and some are on Linux.

Conclusion

Finding the right strategy to declare your Python interpreter in a Pants project requires careful thought. Navigating the intricacies of Python interpreters on various platforms can be very demanding, and the more diverse the set of Python interpreters you claim to support, the more troubleshooting and hand-holding you may end up providing. Also, having a simple way to get a Python interpreter installed may be important if many of the developers who need to interact with the Pants repository and write or run code are not Python programmers or are not familiar with the Python ecosystem and tooling.

We hope you find a strategy that works for you. If you have questions, please feel free to raise a GitHub issue or ask on the Pants community Slack! It's a friendly and supportive community that is happy to respond to your questions and feedback.
