4.3. Implementation Versions

The core Python syntax, as defined for a specific release version of the disttype, is provided by various Python distributions under the same disttype. These comprise the reference implementation CPython [CPython] and various additional distributions, some of which are less compliant with the reference. Deviations of the core syntax are in every case limited to a few specific features, none of them essential. The bundled standard libraries, however, vary considerably in some cases and thus require either ports of standard libraries or adapted code sections.

_images/pythonids-blueprint.png

Figure: Python Infrastructure Services

The distributions apply various versioning schemes: some follow the syntax scheme, while others introduce completely independent schemes. For add-on packages such as IPython and Cython the identification even appears almost redundant at first glance. The syntax versioning is defined by PEP 440, in particular by the numbering scheme for final releases [FINALRELEASE]. See also Python syntax versions.

The complete information required to identify the actual Python variant thus comprises not only the syntax version ‘disttype’,

Python disttype:  (<major>, <minor>, <micro>)

but also the actual distribution including the release version.

Python:             <syntaxcategory==Python> <disttype>
PythonDist:         <dist> <distrel>

While the version of the disttype fits into a 16-bit value, this is no longer the case for the complete information.

In the case of CPython the distribution information is essentially the same as the syntax information; in particular, distrel and disttype are literally the same information. For the reference implementation, syntax changes are reflected in the major and minor version numbers only, while the micro version number in some cases reflects standard library changes. The syntax category is constant for all distributions: category == Python.

The distribution information is therefore designed as a separate 32-bit bitmask containing the major and minor version numbers of the syntax information, the distribution identifier, and the release version of the distribution.

<disttype-major><disttype-minor><dist><distrel>

4.3.1. Hierarchy of Python Categories

The Python syntax versioning follows a well-defined numbering scheme [PEP440], while each distribution defines its own versioning. As a result the schemes do not conform to each other, not even in the case of Cython.

The resulting layout reflects the dependencies of a tree structure in which branches rely on their parent nodes. For example, a release of a Python distribution, which is numbered incrementally, defines the syntax version it is compatible with, and eventually the point from which on this compatibility holds for coming releases.

_images/pythonids-category-hierarchy.png

Figure: Implementations

The main advantage of a versioning hierarchy that includes the versions of multiple layers is its inherent support for comparisons based on the existing order. Compatibility issues and ranges can thus be determined immediately by single integer operations. The traditional handling of arrays and identifiers requires a larger code block and therefore more CPU time for simple comparisons.
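As an illustration of such a single integer operation, the following minimal sketch packs the version fields most significant first, so that the integer order equals the hierarchical order. The helper pack(), the distribution identifier DIST_X and the chosen release numbers are assumptions made for this example and are not part of pythonids. The bit widths are taken from the layout table in this section, while the position of each field within the word is assumed.

```python
def pack(type_major, type_minor, dist, rel_major, rel_minor, rel_micro):
    """Hypothetical helper: pack the fields most significant first."""
    value = 1                          # category bit, constant for Python
    value = (value << 3) | type_major  # disttype-major, 3 bits
    value = (value << 5) | type_minor  # disttype-minor, 5 bits
    value = (value << 5) | dist        # dist, 5 bits
    value = (value << 6) | rel_major   # distrel-major, 6 bits
    value = (value << 6) | rel_minor   # distrel-minor, 6 bits
    value = (value << 6) | rel_micro   # distrel-micro, 6 bits
    return value


DIST_X = 1  # illustrative distribution id, not a pythonids constant

low = pack(3, 6, DIST_X, 3, 6, 0)
high = pack(3, 8, DIST_X, 3, 8, 0)
current = pack(3, 7, DIST_X, 3, 7, 15)

# Range check with plain integer comparisons.
in_range = low <= current < high

# The same check on tuples gives the same answer, but compares
# element by element instead of comparing one machine word.
same_as_tuples = (
    (3, 6, DIST_X, 3, 6, 0) <= (3, 7, DIST_X, 3, 7, 15) < (3, 8, DIST_X, 3, 8, 0)
)
```

The equivalence holds only as long as every value stays within the width of its field, because an overflowing value would spill into the next higher field.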

4.3.1.1. Number Ranges

The current maximum values of the version numbers for the Python distributions are:

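=============  ========================  =============  =============  =============  ===============
dist           distrel                   distrel_major  distrel_minor  distrel_micro  reference
=============  ========================  =============  =============  =============  ===============
CircuitPython  Python-3                  3              0              3              [CircuitPython]
CPython        CPython-2.7, CPython-3.x  3              7              15             [CPython]
Cython         Cython-0.x, Cython-3.x    3              29             4              [Cython]
iPython        iPython-2.7, iPython3.x   5              5              0              [IPython]
IronPython     IronPython-2.7            2              7              7              [IronPython]
Jython         Jython-2.7                2              7              1              [Jython]
MicroPython                                                                           [MicroPython]
PyPy           PyPy-5.x                  5              10             0              [PyPy]
=============  ========================  =============  =============  =============  ===============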

The resulting estimated bit array sizes are:

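==============  ==============  =============  =============  =============
disttype-major  disttype-minor  distrel_major  distrel_minor  distrel_micro
==============  ==============  =============  =============  =============
3 bits          5 bits          6 bits         6 bits         6 bits
==============  ==============  =============  =============  =============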
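The estimate for the distrel fields can be cross-checked with a few lines of Python. The sketch below copies the maximum release numbers from the table of distributions above and derives the minimum number of bits per component with int.bit_length(). The dictionary and variable names are chosen for this example only.

```python
# Maximum (major, minor, micro) release numbers per distribution,
# copied from the table above. MicroPython has no values listed.
observed = {
    "CircuitPython": (3, 0, 3),
    "CPython": (3, 7, 15),
    "Cython": (3, 29, 4),
    "iPython": (5, 5, 0),
    "IronPython": (2, 7, 7),
    "Jython": (2, 7, 1),
    "PyPy": (5, 10, 0),
}

# Column-wise maxima over all distributions.
max_major, max_minor, max_micro = (max(col) for col in zip(*observed.values()))

# int.bit_length() is the minimum number of bits needed for a value.
min_bits = {
    "distrel_major": max_major.bit_length(),
    "distrel_minor": max_minor.bit_length(),
    "distrel_micro": max_micro.bit_length(),
}
```

With these numbers the minimum is below the six bits allotted per distrel field, so the estimate leaves headroom for future releases.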

4.3.1.2. Bit Mask Layout

The following bit-mask encoding layout represents the platform IDs as part of the stack of information system identifiers. The sizes of the bit groups are designed to be sufficient for all supported OS and distributions, which follow various versioning philosophies and therefore differ in their numbering schemes and in the pace at which their numbers are incremented.

_images/bitarray-principle-stack.png

Figure: bit-mask encoding

Note that the distrel field enumerates the releases of the distribution, not the implemented Python syntax release. The required ranges of the version subfields differ from those of the syntax versions, e.g. in the case of PyPy, whose distrel has passed 5.10. Here the major and minor numbers are shared by the Python2 and the Python3 release, while the two differ in the micro version. The syntax release is encoded into the disttype, which represents the major and minor version numbers of the implemented Python syntax. The micro version number of the syntax is available from the 16-bit hex value of the Python syntax version enumeration.

The visualized mapping scheme with the bit allocation within byte boundaries is given as

_images/bitarray-principle-stack-bytes.png

Figure: byte mapping

The bit boundaries are ultimately a compromise, with the main design target of fitting completely into a 32-bit value.

The following table shows the available number ranges for the components of the bit array pythonids.pythondist.PYDIST.

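==============  ======  ===========  ==========  ===================
bit-group       width   number-type  max-values  preferred operators
==============  ======  ===========  ==========  ===================
category        1 bit   constant     1
disttype-major  3 bits  int          7           < > ==
disttype-minor  5 bits  int          31          < > ==
dist            5 bits  int          31          < > ==
distrel-major   6 bits  int          63          < > ==
distrel-minor   6 bits  int          63          < > ==
distrel-micro   6 bits  int          63          < > ==
==============  ======  ===========  ==========  ===================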

Note

As a reminder, the values are hierarchical: each range is a subset of its prefix ranges and has to be combined with all preceding ranges. This also holds for the distrel, which is a specific subset of the disttype.
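The table of bit groups can be turned into a small encoder and decoder. The following is a minimal sketch and not the pythonids implementation: the functions encode() and decode(), the field ordering within the 32-bit word (category in the most significant bit) and the distribution identifier 2 are assumptions for illustration. Only the field names and widths are taken from the table.

```python
# Field names and widths from the table above, most significant first.
# The ordering inside the 32-bit word is an assumption of this sketch.
FIELDS = (
    ("category", 1),
    ("disttype_major", 3),
    ("disttype_minor", 5),
    ("dist", 5),
    ("distrel_major", 6),
    ("distrel_minor", 6),
    ("distrel_micro", 6),
)


def encode(**values):
    """Pack the named fields into one integer, checking each range."""
    result = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError("%s=%d does not fit into %d bits" % (name, value, width))
        result = (result << width) | value
    return result


def decode(packed):
    """Split a packed integer back into a dict of fields."""
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = packed & ((1 << width) - 1)  # mask the lowest field
        packed >>= width                            # drop it for the next round
    return values


# Syntax 3.7, hypothetical distribution id 2, distribution release 3.7.15.
example = encode(category=1, disttype_major=3, disttype_minor=7, dist=2,
                 distrel_major=3, distrel_minor=7, distrel_micro=15)
```

The range check in encode() matters because an oversized value would otherwise overwrite the neighbouring field and silently break the ordering.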

4.3.1.3. Performance of Comparison Operations

The standard information on the Python syntax and on the specific variant given by the distribution release is originally fragmented across several interfaces. The data is presented in several parts with different data types. Some libraries provide a more condensed set of data, but not a comprehensive one. In general the data is not primarily intended for frequent access by high-performance routines, nor for shared modules with several system dependencies.

The pythonids therefore provide the information as numeric values only, in order to enable fast comparisons and range checks on all supported Python distributions. The layout is still a compromise due to the large number of distributions that a generic application has to represent. As a result of the design, the measured access on the various platforms shows speed improvements starting at about 60% and frequently exceeding 300% compared to the use of the standard data. In addition, the numeric representation yields simpler code by avoiding the implementation of specific caching values.

The performance gain is particularly pronounced in comparison to interfaces like str.startswith() applied directly to the standard string values. Here the gain is more than 60% on all supported platforms.
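A comparison of this kind can be tried out with the standard timeit module. The sketch below is not the benchmark behind the figures quoted above; the function names and the 3+5 bit packing of major and minor are assumptions for illustration, and the timings depend on the interpreter and the platform.

```python
import sys
import timeit


def check_string():
    """Standard string interface: prefix test on the version string."""
    return sys.version.startswith("3.")


# Numeric interface: major and minor are packed once into a small
# integer, later checks are plain integer comparisons.
PY_VERSION = (sys.version_info[0] << 5) | sys.version_info[1]
PY3 = 3 << 5  # lowest value of any 3.x version
PY4 = 4 << 5  # lowest value of any 4.x version


def check_numeric():
    """Range check on the precomputed integer."""
    return PY3 <= PY_VERSION < PY4


if __name__ == "__main__":
    for func in (check_string, check_numeric):
        seconds = timeit.timeit(func, number=1000000)
        print("%-14s %.3f s" % (func.__name__, seconds))
```

Both functions answer the same question, so any difference in the printed times is due to the representation alone.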

4.3.2. Distribution Numbering Scheme

The complete bit array describes a release of a specific distribution. It contains the distrel bit field as the version of the distribution referenced by the dist field. The distrel is represented as a tuple of a 3-value version number, the disttype with its major and minor version numbers as a tuple of a 2-value version number.

4.3.2.1. Major and Minor

The disttype information is here reduced to the major and minor version numbers only. This is because the distribution defines the syntax variant including the specific set of the provided standard libraries.

_images/bitarray-major-minor.png

Figure: 2-value versions

4.3.2.2. Three-Value Number

The distribution version numbers distrel are in most cases 3-value tuples. The values range up to 30 for Cython, thus the layout is designed for a value range of 0..63.

_images/bitarray-3num-major-minor.png

Figure: 3-value versions

4.3.2.3. Combined Bitmask

The combined bitmask is

_images/bitarray-complete-to-bytes.png

Figure: basic scheme

For application examples refer to the sections on the Python distributions, e.g. CPython.