Python Enhancement Proposals

PEP 600 – Future ‘manylinux’ Platform Tags for Portable Linux Built Distributions

Author:
Nathaniel J. Smith <njs at pobox.com>, Thomas Kluyver <thomas at kluyver.me.uk>
Sponsor:
Paul Moore <p.f.moore at gmail.com>
BDFL-Delegate:
Paul Moore <p.f.moore at gmail.com>
Discussions-To:
Discourse thread
Status:
Final
Type:
Standards Track
Topic:
Packaging
Created:
03-May-2019
Post-History:
03-May-2019
Replaces:
513, 571, 599
Resolution:
Discourse message

Abstract

This PEP proposes a scheme for new ‘manylinux’ wheel tags to be defined without requiring a PEP for every specific tag, similar to how Windows and macOS tags already work. This will allow package maintainers to take advantage of new tags more quickly, while making better use of limited volunteer time.

Non-goals include: handling non-glibc-based platforms; integrating with external package managers or handling external dependencies such as CUDA; making manylinux tags more sophisticated than their Windows/macOS equivalents; doing anything besides taking our existing tried-and-tested approach and streamlining it. These are important issues and other PEPs may address them in the future, but for this PEP they’re out of scope.

Rationale

Python users appreciate it when PyPI has pre-compiled packages for their platform, because it makes installation fast and simple. But distributing pre-compiled binaries on Linux is challenging because of the diversity of Linux-based platforms. For example, Debian, Android, and Alpine all use the Linux kernel, but with radically different userspace libraries, which makes it difficult or impossible to create a single wheel that works on all three. This complexity has caused many previous discussions of Linux wheels to stall out.

The “manylinux” project succeeded by adopting a strategy of ruthless pragmatism. We chose a large but tractable set of Linux platforms – specifically, mainstream glibc-based distributions like Debian, OpenSuSE, Ubuntu, RHEL, etc. – and then we did whatever it took to make wheels that work across all these platforms.

This approach requires many compromises. Manylinux wheels can only rely on external libraries that maintain a consistent ABI and are universally available across all these distributions, which in practice restricts them to a small set of core libraries like glibc and a few others. Wheels have to be built on carefully-chosen platforms of the oldest possible vintage, using a Python that is itself built in a carefully-chosen configuration. Other shared library dependencies have to be bundled into the wheel, which requires a complex process to avoid collisions between unrelated wheels. And finally, the details of these requirements change over time, as new distro versions are released, and old ones fall out of use.

It turns out that these requirements are not too onerous: they’re essentially equivalent to what you have to do to ship Windows or macOS wheels, and the manylinux approach has achieved substantial uptake among both package maintainers and end-users. But any manylinux PEP needs some way to address these complexities.

In previous manylinux PEPs (PEP 513, PEP 571, PEP 599), we’ve done this by attempting to write down in the PEP the exact set of libraries, symbol versions, Python configuration, etc. that we believed would lead to wheels that work on all mainstream glibc-based Linux systems. But this created several problems:

First, PEPs are generally supposed to be normative references: if software doesn’t match the PEP, then we fix the software. But in this case, the PEPs are attempting to describe Linux distributions, which are a moving target, and do not consider our PEPs to constrain their behavior. This means that we’ve been taking on an unbounded commitment to keep updating every manylinux PEP whenever the Linux distro landscape changes. This is a substantial commitment for unfunded volunteers to take on, and it’s not clear that this work produces value for our users.

And second, every time we move manylinux forward to a newer range of supported platforms, or add support for a new architecture, we have to go through a fairly elaborate process: writing a new PEP, updating the PyPI and pip codebases to recognize the new tag, waiting for the new pip to percolate to users, etc. None of this happens on Windows/macOS; it’s only a tax on Linux maintainers. This slows deployment of new manylinux versions, and consumes part of our community’s limited PEP review bandwidth, thus slowing progress of the Python packaging ecosystem as a whole. This is especially problematic for less-popular architectures, which have fewer volunteer resources to overcome these barriers.

How can we fix it?

A manylinux PEP has to address three main audiences:

  • Package installers, like pip, need to be able to determine which wheel tags are compatible with the system they find themselves running on. This requires some automated process to introspect the system and match it up with wheel tags.
  • Package indexes, like PyPI, need to be able to determine which wheel tags are valid. Generally, this just requires something like a list of valid tags, or a regex they match, with no need to know anything about the actual semantics for individual tags. (But see the discussion of upload verification below.)
  • Package maintainers need to be able to build wheels that meet the requirements for a given wheel tag.

Here’s the key insight behind this new PEP: it’s crucial that different package installers and package indexes all agree on which manylinux tags are valid and which systems they install on, so we need a PEP to specify these – but, these are straightforward, and don’t really change between manylinux versions. The complicated part that keeps changing is the process of actually building the wheels – but, if there are multiple competing build environments, it doesn’t matter whether they use exactly the same rules as each other, as long as they all produce wheels that work on end-user systems. Therefore, we don’t need an interoperability standard for building wheels, so we don’t need to write the details into a PEP.

To further convince ourselves that this approach will work, let’s look again at how we handle wheels on Windows and macOS: the PEPs describe which tags are valid, and which systems they’re supposed to work on, but not how to actually build wheels for those platforms. And in practice, if you want to distribute Windows or macOS wheels, you might have to jump through some complicated and poorly documented hoops in order to bundle dependencies, target the right range of OS versions, etc. But the system works, and the way to improve it is to write better docs and build better tooling; no-one thinks that the way to make Windows wheels work better is to publish a PEP describing which symbols we think Microsoft should be including in their libraries and how their linker ought to work. This PEP extends that philosophy to manylinux as well.

Specification

Core definition

Tags using the new scheme will look like:

manylinux_2_17_x86_64

Or more generally:

manylinux_${GLIBCMAJOR}_${GLIBCMINOR}_${ARCH}

This tag is a promise: the wheel’s creator promises that the wheel will work on any mainstream Linux distro that uses glibc version ${GLIBCMAJOR}.${GLIBCMINOR} or later, and where the ${ARCH} matches the return value from distutils.util.get_platform(). (For more detail about architecture tags, see PEP 425.)
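As an illustration of the tag layout, here is a minimal sketch that splits a new-style tag into its three components and formats it back. The names ManylinuxTag and parse_tag are hypothetical and not part of this specification, and requiring a non-empty architecture is an assumption of the sketch.

```python
import re
from typing import NamedTuple, Optional


class ManylinuxTag(NamedTuple):
    """Hypothetical container for the three components of a new-style tag."""
    glibc_major: int
    glibc_minor: int
    arch: str

    def __str__(self) -> str:
        return f"manylinux_{self.glibc_major}_{self.glibc_minor}_{self.arch}"


# Assumption: the architecture part must be non-empty.
_TAG_RE = re.compile(r"manylinux_([0-9]+)_([0-9]+)_(.+)")


def parse_tag(tag: str) -> Optional[ManylinuxTag]:
    """Return the parsed tag, or None if it is not a new-style manylinux tag."""
    m = _TAG_RE.fullmatch(tag)
    if m is None:
        return None
    major, minor, arch = m.groups()
    return ManylinuxTag(int(major), int(minor), arch)


tag = parse_tag("manylinux_2_17_x86_64")
print(tag)  # round-trips through __str__
```

Legacy names such as manylinux2014_x86_64 deliberately do not parse here; they would first have to be mapped to their new-style equivalents.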

If a user installs this wheel into an environment that matches these requirements and it doesn’t work, then that wheel does not comply with this specification. This should be considered a bug in the wheel, and it’s the wheel creator’s responsibility to look for a fix (possibly with the help of the broader community).

The word “mainstream” is intentionally somewhat vague, and should be interpreted expansively. The goal is to rule out weird homebrew Linux systems; generally any distro you’ve actually heard of should be considered “mainstream”. We also provide a way for maintainers of “weird” distros to manually override this check, though based on experience with previous manylinux PEPs, we don’t expect this feature to see much use.

And finally, compliant wheels are required to “play well with others”, i.e., installing a manylinux wheel must not cause other unrelated packages to break.

Any method of producing wheels which meets these criteria is acceptable. However, in practice we expect that the auditwheel project will maintain an up-to-date set of tools and build images for producing manylinux wheels, as well as documentation about how they work and how to use them, and that most maintainers will want to use those. For the latest information on building manylinux wheels, including recommendations about which build images to use, see https://packaging.python.org.

Since these requirements are fairly high-level, here are some examples of how they play out in specific situations:

Example: if a wheel is tagged as manylinux_2_17_x86_64, but it uses symbols that were only added in glibc 2.18, then that wheel won’t work on systems with glibc 2.17. Therefore, we can conclude that this wheel is in violation of this specification.
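A check along these lines could be sketched as follows. This is only an illustration: it assumes the glibc version names required by a binary (strings such as GLIBC_2.17) have already been extracted by some other tool, and the function names are hypothetical.

```python
import re


def max_glibc_symbol_version(symbol_versions):
    """Return the highest (major, minor) among names like 'GLIBC_2.17'.

    Names that are not glibc version nodes (for example 'GLIBCXX_3.4.21'
    or 'GLIBC_PRIVATE') are ignored. Returns None if nothing matches.
    """
    best = None
    for name in symbol_versions:
        m = re.fullmatch(r"GLIBC_([0-9]+)\.([0-9]+)(?:\.[0-9]+)?", name)
        if m is None:
            continue
        version = (int(m.group(1)), int(m.group(2)))
        if best is None or version > best:
            best = version
    return best


def violates_tag(symbol_versions, tag_major, tag_minor):
    """True if any required glibc symbol version is newer than the tag allows."""
    needed = max_glibc_symbol_version(symbol_versions)
    return needed is not None and needed > (tag_major, tag_minor)
```

Note that passing this check does not prove compliance; it only detects one specific kind of violation.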

Example: Until ~2017, all major Linux distros included libncursesw.so.5 as part of their default install. Until that date, a wheel that linked to libncursesw.so.5 was compliant with this specification. Then, distros started switching to ncurses 6, which has a different name and incompatible ABI, and stopped installing libncursesw.so.5 by default. So after that date, a wheel that links to libncursesw.so.5 was no longer compliant with this specification.

Example: The Linux ELF linker places all shared library SONAMEs into a single process-global namespace. If independent wheels used the same SONAME for their bundled libraries, they might end up colliding and using the wrong library version, which would violate the “play well with others” rule. Therefore, this specification requires that wheels use globally-unique names for all bundled libraries. (Auditwheel currently accomplishes this by renaming all bundled libraries to include a globally-unique hash.)
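To make the idea of hash-based names concrete, here is a small sketch that derives a new file name from a library’s contents. It is not auditwheel’s actual scheme; the function name, the choice of SHA-256, and the eight-character digest are all assumptions, and rewriting the SONAME and the references to it inside the ELF files is not shown.

```python
import hashlib


def unique_soname(soname: str, contents: bytes, digest_chars: int = 8) -> str:
    """Insert a content-derived hash before the '.so' suffix of a library name."""
    digest = hashlib.sha256(contents).hexdigest()[:digest_chars]
    stem, sep, rest = soname.partition(".so")
    if not sep:
        # No '.so' in the name: just append the digest.
        return f"{soname}-{digest}"
    return f"{stem}-{digest}{sep}{rest}"


# Two wheels bundling different builds of "libfoo.so.1" get different names.
print(unique_soname("libfoo.so.1", b"build A"))
print(unique_soname("libfoo.so.1", b"build B"))
```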

Example: we’ve observed certain wheels using C++ in ways that interfere with other packages via an unclear mechanism. This is also a violation of the “play well with others” rule, so those wheels aren’t compliant with this specification.

Example: The imaginary architecture LEG v7 has both big-endian and little-endian variants. Big-endian binaries require a big-endian system, and little-endian binaries require a little-endian system. But unfortunately, it’s discovered that due to a bug in PEP 425, both variants use the same architecture tag, legv7. This makes it impossible to create a compliant manylinux_2_17_legv7 wheel: no matter what we do, it will crash on some users’ systems. So, we write a new PEP defining architecture tags legv7le and legv7be; now we can ship manylinux LEG v7 wheels.

Example: There’s also a LEG v8. It also has big-endian and little-endian variants. But fortunately, it turns out that PEP 425 already does the right thing for LEG v8, so LEG v8 enthusiasts can start shipping manylinux_2_17_legv8le and manylinux_2_17_legv8be wheels immediately once this PEP is implemented, even though the authors of this PEP don’t know anything at all about LEG v8.

Legacy manylinux tags

The existing manylinux tags are redefined as aliases for new-style tags:

  • manylinux1_x86_64 is now an alias for manylinux_2_5_x86_64
  • manylinux1_i686 is now an alias for manylinux_2_5_i686
  • manylinux2010_x86_64 is now an alias for manylinux_2_12_x86_64
  • manylinux2010_i686 is now an alias for manylinux_2_12_i686
  • manylinux2014_x86_64 is now an alias for manylinux_2_17_x86_64
  • manylinux2014_i686 is now an alias for manylinux_2_17_i686
  • manylinux2014_aarch64 is now an alias for manylinux_2_17_aarch64
  • manylinux2014_armv7l is now an alias for manylinux_2_17_armv7l
  • manylinux2014_ppc64 is now an alias for manylinux_2_17_ppc64
  • manylinux2014_ppc64le is now an alias for manylinux_2_17_ppc64le
  • manylinux2014_s390x is now an alias for manylinux_2_17_s390x

This redefinition is largely a no-op, but does affect a few things:

  • Previously, we had an open-ended and growing commitment to keep updating every manylinux PEP whenever a new Linux distro was released, for the rest of time. By making this PEP normative for the older tags, that obligation goes away. When this PEP is accepted, the previous manylinux PEPs will receive a final update noting that they are no longer maintained and referring to this PEP.
  • The “play well with others” rule was always intended, but previous PEPs didn’t state it explicitly; now it’s explicit.
  • Previous PEPs assumed that glibc 3.x might be incompatible with glibc 2.x, so we checked for compatibility between a system and a tag using logic like:
    sys_major == tag_major and sys_minor >= tag_minor
    

    Recently the glibc maintainers advised us that we should assume that glibc will maintain backwards-compatibility indefinitely, even if they bump the major version number. So the new check for compatibility is:

    (sys_major, sys_minor) >= (tag_major, tag_minor)
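The difference between the two checks only shows up if the glibc major version changes. A short sketch, using a hypothetical glibc 3.0 and hypothetical function names:

```python
def compatible_old(sys_version, tag_version):
    """Previous rule: major versions must match exactly."""
    sys_major, sys_minor = sys_version
    tag_major, tag_minor = tag_version
    return sys_major == tag_major and sys_minor >= tag_minor


def compatible_new(sys_version, tag_version):
    """New rule: plain lexicographic comparison of (major, minor)."""
    return tuple(sys_version) >= tuple(tag_version)


# A system with a hypothetical glibc 3.0, and a wheel tagged manylinux_2_17:
print(compatible_old((3, 0), (2, 17)))  # False: major versions differ
print(compatible_new((3, 0), (2, 17)))  # True: (3, 0) >= (2, 17)
```

For every glibc 2.x system the two rules give the same answer.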
    

Package installers

Generally, package installers should install manylinux wheels on systems that have an appropriate glibc and architecture, and not otherwise. If there are multiple compatible manylinux wheels available, then the wheel with the highest glibc version should be preferred, in order to take advantage of newer compilers and glibc features.
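One way an installer might apply this preference is sketched below. The function name is hypothetical, legacy tag names are assumed to have been normalized to the new style already, and the manual override hook is ignored for brevity.

```python
import re


def best_manylinux_tag(candidate_tags, sys_glibc, sys_arch):
    """Pick the compatible tag with the highest glibc version, or None."""
    best = None
    best_version = None
    for tag in candidate_tags:
        m = re.fullmatch(r"manylinux_([0-9]+)_([0-9]+)_(.+)", tag)
        if m is None:
            continue  # not a new-style manylinux tag
        version = (int(m.group(1)), int(m.group(2)))
        if m.group(3) != sys_arch or version > tuple(sys_glibc):
            continue  # wrong architecture, or needs a newer glibc
        if best_version is None or version > best_version:
            best, best_version = tag, version
    return best


tags = [
    "manylinux_2_5_x86_64",
    "manylinux_2_17_x86_64",
    "manylinux_2_28_x86_64",
    "manylinux_2_17_aarch64",
]
# On an x86_64 system with glibc 2.24, the 2.28 wheel is too new,
# so the 2.17 wheel is preferred over the 2.5 one.
print(best_manylinux_tag(tags, (2, 24), "x86_64"))
```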

In addition, we follow previous specifications, and allow for Python distributors to manually override this check by adding a _manylinux module to their standard library. If this module is importable, and if it defines a function called manylinux_compatible, then package installers should call this function, passing in the major version, minor version, and architecture from the manylinux tag, and it will either return a boolean saying whether wheels with the given tag should be considered compatible with the current system, or else None to indicate that the default logic should be used.

For compatibility with previous specifications, we also check the module for legacy boolean attributes: manylinux1_compatible if the tag is manylinux1 or manylinux_2_5 exactly; manylinux2010_compatible if the tag is manylinux2010 or manylinux_2_12 exactly; and manylinux2014_compatible if the tag is manylinux2014 or manylinux_2_17 exactly. If both the new and old attributes are defined, then manylinux_compatible takes precedence.
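For illustration, a distributor’s override module might look something like the following sketch. The policy shown (an imaginary distro that only trusts x86_64 wheels targeting glibc 2.17 or older) is invented for the example; only the function name and its three arguments come from this specification.

```python
# Contents of a hypothetical _manylinux.py shipped by a Python distributor.

def manylinux_compatible(tag_major, tag_minor, tag_arch):
    """Return True/False to override the default check, or None to defer to it."""
    # Invented policy: on x86_64, accept only wheels that target
    # glibc 2.17 or older.
    if tag_arch == "x86_64":
        return (tag_major, tag_minor) <= (2, 17)
    # No opinion for other architectures: use the installer's default logic.
    return None
```

Returning None rather than True for the remaining cases keeps the installer’s normal glibc and architecture checks in charge.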

Here’s some example code. You don’t have to actually use this code, but you can use it for reference if you have questions about the exact semantics:

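import re

# NOTE: system_uses_glibc(), get_system_glibc_version(), and get_system_arch()
# are not defined here; each installer supplies its own implementation.

LEGACY_ALIASES = {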
    "manylinux1_x86_64": "manylinux_2_5_x86_64",
    "manylinux1_i686": "manylinux_2_5_i686",
    "manylinux2010_x86_64": "manylinux_2_12_x86_64",
    "manylinux2010_i686": "manylinux_2_12_i686",
    "manylinux2014_x86_64": "manylinux_2_17_x86_64",
    "manylinux2014_i686": "manylinux_2_17_i686",
    "manylinux2014_aarch64": "manylinux_2_17_aarch64",
    "manylinux2014_armv7l": "manylinux_2_17_armv7l",
    "manylinux2014_ppc64": "manylinux_2_17_ppc64",
    "manylinux2014_ppc64le": "manylinux_2_17_ppc64le",
    "manylinux2014_s390x": "manylinux_2_17_s390x",
}

def manylinux_tag_is_compatible_with_this_system(tag):
    # Normalize and parse the tag
    tag = LEGACY_ALIASES.get(tag, tag)
    m = re.match("manylinux_([0-9]+)_([0-9]+)_(.*)", tag)
    if not m:
        return False
    tag_major_str, tag_minor_str, tag_arch = m.groups()
    tag_major = int(tag_major_str)
    tag_minor = int(tag_minor_str)

    if not system_uses_glibc():
        return False
    sys_major, sys_minor = get_system_glibc_version()
    if (sys_major, sys_minor) < (tag_major, tag_minor):
        return False
    sys_arch = get_system_arch()
    if sys_arch != tag_arch:
        return False

    # Check for manual override
    try:
        import _manylinux
    except ImportError:
        pass
    else:
        if hasattr(_manylinux, "manylinux_compatible"):
            result = _manylinux.manylinux_compatible(
                tag_major, tag_minor, tag_arch,
            )
            if result is not None:
                return bool(result)
        else:
            if (tag_major, tag_minor) == (2, 5):
                if hasattr(_manylinux, "manylinux1_compatible"):
                    return bool(_manylinux.manylinux1_compatible)
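            if (tag_major, tag_minor) == (2, 12):
                if hasattr(_manylinux, "manylinux2010_compatible"):
                    return bool(_manylinux.manylinux2010_compatible)
            # Legacy attribute for manylinux2014, as described in the text above
            if (tag_major, tag_minor) == (2, 17):
                if hasattr(_manylinux, "manylinux2014_compatible"):
                    return bool(_manylinux.manylinux2014_compatible)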

    return True
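The example code above leaves glibc detection to the installer. One possible approach, sketched here under the assumption that os.confstr("CS_GNU_LIBC_VERSION") yields a string of the form "glibc 2.17" on glibc systems, is shown below. The helper parse_glibc_version is hypothetical, and unlike the helper used above, this version returns None on non-glibc systems instead of relying on a separate system_uses_glibc() check.

```python
import os


def parse_glibc_version(text):
    """Parse a string like 'glibc 2.17' into (2, 17), or return None."""
    parts = text.split()
    if len(parts) != 2 or parts[0] != "glibc":
        return None
    numbers = parts[1].split(".")
    try:
        return int(numbers[0]), int(numbers[1])
    except (IndexError, ValueError):
        # Missing minor version, or a non-numeric component.
        return None


def get_system_glibc_version():
    """Return (major, minor) for the running glibc, or None if unavailable."""
    try:
        text = os.confstr("CS_GNU_LIBC_VERSION")
    except (AttributeError, OSError, ValueError):
        # os.confstr is missing, or this name is unknown on the platform.
        return None
    return parse_glibc_version(text) if text else None
```

This parser is deliberately strict: a vendor-modified version string that does not fit the expected shape is treated as unknown rather than guessed at.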

Package indexes

The exact set of wheel tags accepted by PyPI, or any package index, is a policy question, and up to the maintainers of that index. But, we recommend that package indexes accept any wheels whose platform tag matches the following regexes:

  • manylinux1_(x86_64|i686)
  • manylinux2010_(x86_64|i686)
  • manylinux2014_(x86_64|i686|aarch64|armv7l|ppc64|ppc64le|s390x)
  • manylinux_[0-9]+_[0-9]+_(.*)

Package indexes may impose additional requirements; for example, they might audit uploaded wheels and reject those that contain known problems, such as a manylinux_2_17 wheel that references symbols from later glibc versions, or dependencies on external libraries that are known not to exist on all systems. Or a package index might decide to be conservative and reject wheels tagged manylinux_2_999, on the grounds that no-one knows what the Linux distro landscape will look like when glibc 2.999 is released. We leave the details of any such checks to the discretion of the package index maintainers.
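As a sketch of how an index might apply the recommended regexes together with an optional conservative cap, consider the following. Matching against the whole tag, the function name, and the max_glibc parameter are assumptions of this example rather than requirements of the specification.

```python
import re

RECOMMENDED_PATTERNS = [
    r"manylinux1_(x86_64|i686)",
    r"manylinux2010_(x86_64|i686)",
    r"manylinux2014_(x86_64|i686|aarch64|armv7l|ppc64|ppc64le|s390x)",
    r"manylinux_[0-9]+_[0-9]+_(.*)",
]
_COMPILED = [re.compile(p) for p in RECOMMENDED_PATTERNS]


def is_accepted_platform_tag(tag, max_glibc=None):
    """Accept tags matching the recommended regexes.

    If max_glibc is given as (major, minor), new-style tags that name a
    newer glibc are rejected, as a conservative index policy.
    """
    if not any(p.fullmatch(tag) for p in _COMPILED):
        return False
    m = re.fullmatch(r"manylinux_([0-9]+)_([0-9]+)_.*", tag)
    if m is not None and max_glibc is not None:
        return (int(m.group(1)), int(m.group(2))) <= tuple(max_glibc)
    return True
```

An index using such a cap would need to raise it over time as new glibc releases reach mainstream distros.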

Rejected alternatives

Continuing the manylinux20XX series: As discussed above, this leads to much more effort-intensive, slower, and more complex rollouts of new versions. And while there are two places where it seems at first to have some compensating benefits, if you look more closely this turns out not to be the case.

First, this forces us to produce human-readable descriptions of how Linux distros work, in the text of the PEP. But this is less valuable than it might seem at first, and can actually be handled better by the new “perennial” approach anyway.

If you’re trying to build wheels, the main thing you need is a tutorial on how to use the build images and tooling around them. If you’re trying to add support for a new build profile or create a competitor to auditwheel, then your best resources will be the auditwheel source code and issue tracker, which are always going to be more detailed, precise, and reliable than a summary spec written in English and without tests. Documentation like the old manylinux20XX PEPs does add value! But in both cases, it’s primarily as a secondary reference to provide overview and context.

And furthermore, the PEP process is poorly suited to maintaining this kind of reference documentation – there’s a reason we don’t keep the pip user manual in the PEPs repository! The auditwheel maintainers are the best situated to understand what kinds of documentation are useful to their users, and to maintain that documentation over time. For example, there’s substantial overlap between the different manylinux versions, and the PEP process currently forces us to handle this by copy-pasting everything between a growing list of documents; instead, the auditwheel maintainers might choose to factor out the common parts into a single piece of shared documentation.

A related concern was that with the perennial approach, it may become harder for package maintainers to decide which build profile to target: instead of having to pick between manylinux1, manylinux2010, manylinux2014, …, they now have a wider array of options like manylinux_2_5, manylinux_2_6, …, manylinux_2_20, … But again, we don’t believe this will be a problem in practice. In either system, most package maintainers won’t be starting by reading PEPs and trying to implement them from scratch. If you’re a particularly expert and ambitious package maintainer who needs to target a new version or new architecture, the perennial approach gives you additional flexibility. But for regular everyday maintainers, we expect they’ll start from a tutorial like packaging.python.org, and by choosing from existing build images. A tutorial can just as easily recommend manylinux_2_17 as it can recommend manylinux2014, and we expect the actual set of pre-provided build images to be identical in both cases. And again, by maintaining this documentation in the right place, instead of trying to do it in the PEPs repository, we expect that we’ll end up with documentation that’s higher-quality and more fitted to purpose.

Finally, some participants have pointed out that it’s very nice to be able to look at a wheel and tell definitively whether it meets the requirements of the spec. With the new “perennial” approach, we can never say with 100% certainty that a wheel does meet the spec, because that depends on the Linux distros. As engineers we have a well-justified dislike for that kind of uncertainty.

However: as demonstrated by the examples above, we can still tell definitively when a wheel doesn’t meet the spec, which turns out to be what’s important in practice. And, in practice, with the manylinux20XX approach, whenever distros change, we actually change the spec; it just takes a bit longer. So even if a wheel is compliant today, it might become non-compliant tomorrow. This is frustrating, but unfortunately this uncertainty is unavoidable if what you care about is distributing working wheels to users.

So even on these points where the old approach initially seems to have advantages, we expect the new approach to actually do as well or better.

Switching to perennial tags, but continuing to write a PEP for each version: This was proposed as a kind of hybrid, to try to get some of the advantages of the perennial tagging system – like easier rollouts of new versions – while keeping the advantages of the manylinux20XX scheme, like forcing us to write documentation about Linux distros, simplifying options for package maintainers, and being able to definitively tell when a wheel meets the spec. But as discussed above, on a closer look, it turns out that these advantages are largely illusory. And this also inherits significant disadvantages from the manylinux20XX scheme, like creating indefinite obligations to update a growing list of copy-pasted PEPs.

Making auditwheel normative: Another possibility that was considered was to make auditwheel the normative reference on the definition of manylinux, i.e., a wheel would be compliant if and only if auditwheel check completed without errors. This was rejected because the point of packaging PEPs is to define interoperability between tools, not to bless specific tools.

Adding extra words to the tag string: Another proposal we considered was to add extra words to the wheel tag, e.g. manylinux_glibc_2_17 instead of manylinux_2_17. The motivation would be to leave the door open to other kinds of versioning heuristics in the future – for example, we could have manylinux_glibc_$VERSION and manylinux_alpine_$VERSION.

But “manylinux” has always been a synonym for “broad compatibility with mainstream glibc-based distros”; reusing it for unrelated build profiles like alpine is more confusing than helpful. Also, some early reviewers who aren’t steeped in the details of packaging found the word glibc actively misleading, jumping to the conclusion that it meant they needed a system with exactly that glibc version. And tags like manylinux_$VERSION and alpine_$VERSION also have the advantages of parsimony and directness. So we’ll go with that.


Source: https://github.com/python/peps/blob/main/peps/pep-0600.rst

Last modified: 2023-09-09 17:39:29 GMT