Issue #22914: Update the Python 2/3 porting HOWTO to describe a more
automated process.
parent
969175091c
commit
6b335196c5
|
@ -19,600 +19,359 @@ Porting Python 2 Code to Python 3
|
||||||
If you would like to read one core Python developer's take on why Python 3
|
If you would like to read one core Python developer's take on why Python 3
|
||||||
came into existence, you can read Nick Coghlan's `Python 3 Q & A`_.
|
came into existence, you can read Nick Coghlan's `Python 3 Q & A`_.
|
||||||
|
|
||||||
If you prefer to read a (free) book on porting a project to Python 3,
|
|
||||||
consider reading `Porting to Python 3`_ by Lennart Regebro which should cover
|
|
||||||
much of what is discussed in this HOWTO.
|
|
||||||
|
|
||||||
For help with porting, you can email the python-porting_ mailing list with
|
For help with porting, you can email the python-porting_ mailing list with
|
||||||
questions.
|
questions.
|
||||||
|
|
||||||
The Short Version
|
The Short Explanation
|
||||||
=================
|
=====================
|
||||||
|
|
||||||
* Decide what's the oldest version of Python 2 you want to support (if at all)
|
To make your project be single-source Python 2/3 compatible, the basic steps
|
||||||
* Make sure you have a thorough test suite and use continuous integration
|
are:
|
||||||
testing to make sure you stay compatible with the versions of Python you care
|
|
||||||
about
|
#. Update your code to drop support for Python 2.5 or older (supporting only
|
||||||
* If you have dependencies, check their Python 3 status using caniusepython3
|
Python 2.7 is ideal)
|
||||||
(`command-line tool <https://pypi.python.org/pypi/caniusepython3>`__,
|
#. Make sure you have good test coverage (coverage.py_ can help)
|
||||||
`web app <https://caniusepython3.com/>`__)
|
#. Learn the differences between Python 2 & 3
|
||||||
|
#. Use Modernize_ or Futurize_ to update your code
|
||||||
With that done, your options are:
|
#. Use Pylint_ to help make sure you don't regress on your Python 3 support
|
||||||
|
(if only supporting Python 2.7/3.4 or newer)
|
||||||
* If you are dropping Python 2 support, use :ref:`2to3 <2to3-reference>` to port
|
#. Use caniusepython3_ to find out which of your dependencies are blocking your
|
||||||
to Python 3
|
use of Python 3
|
||||||
|
#. Once your dependencies are no longer blocking you, use continuous integration
|
||||||
* If you are keeping Python 2 support, then start writing Python 2/3-compatible
|
to make sure you stay compatible with Python 2 & 3 (tox_ can help test
|
||||||
code starting **TODAY**
|
against multiple versions of Python)
|
||||||
|
|
||||||
+ If you have dependencies that have not been ported, reach out to them to port
|
If you are dropping support for Python 2 entirely, then after you learn the
|
||||||
their project while working to make your code compatible with Python 3 so
|
differences between Python 2 & 3 you can run 2to3_ over your code and skip the
|
||||||
you're ready when your dependencies are all ported
|
rest of the steps outlined above.
|
||||||
+ If all your dependencies have been ported (or you have none), go ahead and
|
|
||||||
port to Python 3
|
|
||||||
|
Details
|
||||||
* If you are creating a new project that wants to have 2/3 compatibility,
|
=======
|
||||||
code in Python 3 and then backport to Python 2
|
|
||||||
|
A key point about supporting Python 2 & 3 simultaneously is that you can start
|
||||||
|
**today**! Even if your dependencies are not supporting Python 3 yet that does
|
||||||
Before You Begin
|
not mean you can't modernize your code **now** to support Python 3. Most changes
|
||||||
================
|
required to support Python 3 lead to cleaner code using newer practices even in
|
||||||
|
Python 2.
|
||||||
If your project is on the Cheeseshop_/PyPI_, make sure it has the proper
|
|
||||||
`trove classifiers`_ to signify what versions of Python it **currently**
|
Another key point is that modernizing your Python 2 code to also support
|
||||||
supports. At minimum you should specify the major version(s), e.g.
|
Python 3 is largely automated for you. While you might have to make some API
|
||||||
``Programming Language :: Python :: 2`` if your project currently only supports
|
decisions thanks to Python 3 clarifying text data versus binary data, the
|
||||||
Python 2. It is preferable that you be as specific as possible by listing every
|
lower-level work is now mostly done for you and thus you can at least benefit from
|
||||||
major/minor version of Python that you support, e.g. if your project supports
|
the automated changes immediately.
|
||||||
Python 2.6 and 2.7, then you want the classifiers of::
|
|
||||||
|
Keep those key points in mind while you read on about the details of porting
|
||||||
Programming Language :: Python :: 2
|
your code to support Python 2 & 3 simultaneously.
|
||||||
Programming Language :: Python :: 2.6
|
|
||||||
Programming Language :: Python :: 2.7
|
|
||||||
|
Drop support for Python 2.5 and older (at least)
|
||||||
Once your project supports Python 3 you will want to go back and add the
|
------------------------------------------------
|
||||||
appropriate classifiers for Python 3 as well. This is important as setting the
|
|
||||||
``Programming Language :: Python :: 3`` classifier will lead to your project
|
While you can make Python 2.5 work with Python 3, it is **much** easier if you
|
||||||
being listed under the `Python 3 Packages`_ section of PyPI.
|
only have to work with Python 2.6 or newer (and easier still if you only have
|
||||||
|
to work with Python 2.7). If dropping Python 2.5 is not an option then the six_
|
||||||
Make sure you have a robust test suite. You need to
|
project can help you support Python 2.5 & 3 simultaneously. Do realize, though,
|
||||||
make sure everything continues to work, just like when you support a new
|
that nearly all the projects listed in this HOWTO will not be available to you.
|
||||||
minor/feature release of Python. This means making sure your test suite is
|
|
||||||
thorough and is ported properly between Python 2 & 3 (consider using coverage_
|
If you are able to only support Python 2.6 or newer, then the required changes
|
||||||
to measure that you have effective test coverage). You will also most likely
|
to your code should continue to look and feel like idiomatic Python code. At
|
||||||
want to use something like tox_ to automate testing between all of your
|
worst you will have to use a function instead of a method in some instances or
|
||||||
supported versions of Python. You will also want to **port your tests first** so
|
have to import a function instead of using a built-in one, but otherwise the
|
||||||
that you can make sure that you detect breakage during the transition. Tests also
|
overall transformation should not feel foreign to you.
|
||||||
tend to be simpler than the code they are testing so it gives you an idea of how
|
|
||||||
easy it can be to port code.
|
But please aim for Python 2.7. Bugfixes for that version of Python will continue
|
||||||
|
until 2020 while Python 2.6 is no longer supported. There are also some tools
|
||||||
Drop support for older Python versions if possible. Python 2.5
|
mentioned in this HOWTO which do not support Python 2.6 (e.g., Pylint_), and
|
||||||
introduced a lot of useful syntax and libraries which have become idiomatic
|
this will become more commonplace as time goes on.
|
||||||
in Python 3. Python 2.6 introduced future statements which makes
|
|
||||||
compatibility much easier if you are going from Python 2 to 3.
|
Make sure you specify the proper version support in your ``setup.py`` file
|
||||||
Python 2.7 continues the trend in the stdlib. Choose the newest version
|
--------------------------------------------------------------------------
|
||||||
of Python which you believe can be your minimum support version
|
|
||||||
and work from there.
|
In your ``setup.py`` file you should have the proper `trove classifier`_
|
||||||
|
specifying what versions of Python you support. As your project does not support
|
||||||
Target the newest version of Python 3 that you can. Beyond just the usual
|
Python 3 yet you should at least have
|
||||||
bugfixes, compatibility has continued to improve between Python 2 and 3 as time
|
``Programming Language :: Python :: 2 :: Only`` specified. Ideally you should
|
||||||
has passed. E.g. Python 3.3 added back the ``u`` prefix for
|
also specify each major/minor version of Python that you do support, e.g.
|
||||||
strings, making source-compatible Python code easier to write.
|
``Programming Language :: Python :: 2.7``.
|
||||||
|
|
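As a rough illustration of the classifier advice above, a ``setup.py`` for a project that currently supports only Python 2.7 might collect its classifiers as follows. The name ``CLASSIFIERS`` is hypothetical; the list would be handed to ``setup()`` through its ``classifiers`` argument.

```python
# Hypothetical classifier list for a project that currently supports
# only Python 2.7; pass it to setup() as ``classifiers=CLASSIFIERS``.
CLASSIFIERS = [
    "Programming Language :: Python :: 2",
    "Programming Language :: Python :: 2.7",
    "Programming Language :: Python :: 2 :: Only",
]
```

Once the project also runs on Python 3, the ``2 :: Only`` entry would be dropped and the matching Python 3 classifiers added.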
||||||
|
Have good test coverage
|
||||||
Writing Source-Compatible Python 2/3 Code
|
-----------------------
|
||||||
=========================================
|
|
||||||
|
Once you have your code supporting the oldest version of Python 2 you want it
|
||||||
Over the years the Python community has discovered that the easiest way to
|
to, you will want to make sure your test suite has good coverage. A good rule of
|
||||||
support both Python 2 and 3 in parallel is to write Python code that works in
|
thumb is that you want to be confident enough in your test suite that any
|
||||||
either version. While this might sound counter-intuitive at first, it actually
|
failures that appear after having tools rewrite your code are actual bugs in the
|
||||||
is not difficult and typically only requires following some select
|
tools and not in your code. If you want a number to aim for, try to get over 80%
|
||||||
(non-idiomatic) practices and using some key projects to help make bridging
|
coverage (and don't feel bad if you can't easily get past 90%). If you
|
||||||
between Python 2 and 3 easier.
|
don't already have a tool to measure test coverage then coverage.py_ is
|
||||||
|
recommended.
|
||||||
Projects to Consider
|
|
||||||
--------------------
|
Learn the differences between Python 2 & 3
|
||||||
|
-------------------------------------------
|
||||||
The lowest level library for supporting Python 2 & 3 simultaneously is six_.
|
|
||||||
Reading through its documentation will give you an idea of where exactly the
|
Once you have your code well-tested you are ready to begin porting your code to
|
||||||
Python language changed between versions 2 & 3 and thus what you will want the
|
Python 3! But to fully understand how your code is going to change and what
|
||||||
library to help you continue to support.
|
you want to look out for while you code, you will want to learn what changes
|
||||||
|
Python 3 makes in terms of Python 2. Typically the two best ways of doing that
|
||||||
To help automate porting your code over to using six, you can use
|
are reading the `"What's New"`_ doc for each release of Python 3 and the
|
||||||
modernize_. This project will attempt to rewrite your code to be as modern as
|
`Porting to Python 3`_ book (which is free online).
|
||||||
possible while using six to smooth out any differences between Python 2 & 3.
|
|
||||||
|
|
||||||
If you want to write your compatible code to feel more like Python 3 there is
|
Update your code
|
||||||
the future_ project. It tries to provide backports of objects from Python 3 so
|
----------------
|
||||||
that you can use them from Python 2-compatible code, e.g. replacing the
|
|
||||||
``bytes`` type from Python 2 with the one from Python 3.
|
Once you feel like you know what is different in Python 3 compared to Python 2,
|
||||||
It also provides a translation script like modernize (its translation code is
|
it's time to update your code! You have a choice between two tools in porting
|
||||||
actually partially based on it) to help start working with a pre-existing code
|
your code automatically: Modernize_ and Futurize_. Which tool you choose will
|
||||||
base. It is also unique in that its translation script will also port Python 3
|
depend on how much like Python 3 you want your code to be. Futurize_ does its
|
||||||
code backwards as well as Python 2 code forwards.
|
best to make Python 3 idioms and practices exist in Python 2, e.g. backporting
|
||||||
|
the ``bytes`` type from Python 3 so that you have semantic parity between the
|
||||||
|
major versions of Python. Modernize_,
|
||||||
Tips & Tricks
|
on the other hand, is more conservative and targets a Python 2/3 subset of
|
||||||
-------------
|
Python, relying on six_ to help provide compatibility.
|
||||||
|
|
||||||
To help with writing source-compatible code using one of the projects mentioned
|
Regardless of which tool you choose, they will update your code to run under
|
||||||
in `Projects to Consider`_, consider following the below suggestions. Some of
|
Python 3 while staying compatible with the version of Python 2 you started with.
|
||||||
them are handled by the suggested projects, so if you do use one of them then
|
Depending on how conservative you want to be, you may want to run the tool over
|
||||||
read their documentation first to see which suggestions below will be taken care of
|
your test suite first and visually inspect the diff to make sure the
|
||||||
for you.
|
transformation is accurate. After you have transformed your test suite and
|
||||||
|
verified that all the tests still pass as expected, then you can transform your
|
||||||
Support Python 2.7
|
application code knowing that any test which fails is a translation failure.
|
||||||
//////////////////
|
|
||||||
|
Unfortunately the tools can't automate everything to make your code work under
|
||||||
As a first step, make sure that your project is compatible with Python 2.7.
|
Python 3 and so there are a handful of things you will need to update manually
|
||||||
This is just good to do as Python 2.7 is the last release of Python 2 and thus
|
to get full Python 3 support (which of these steps are necessary varies between
|
||||||
will be used for a rather long time. It also allows for use of the ``-3`` flag
|
the tools). Read the documentation for the tool you choose to use to see what it
|
||||||
to Python to help discover places in your code where compatibility might be an
|
fixes by default and what it can do optionally to know what will (not) be fixed
|
||||||
issue (the ``-3`` flag is in Python 2.6 but Python 2.7 adds more warnings).
|
for you and what you may have to fix on your own (e.g. using ``io.open()`` over
|
||||||
|
the built-in ``open()`` function is off by default in Modernize). Luckily,
|
||||||
Try to Support Python 2.6 and Newer Only
|
though, there are only a couple of things to watch out for which can be
|
||||||
////////////////////////////////////////
|
considered large issues that may be hard to debug if not watched for.
|
||||||
|
|
||||||
While not possible for all projects, if you can support Python 2.6 and newer
|
Division
|
||||||
**only**, your life will be much easier. Various future statements, stdlib
|
++++++++
|
||||||
additions, etc. exist only in Python 2.6 and later which greatly assist in
|
|
||||||
supporting Python 3. But if your project must keep support for Python 2.5 then
|
In Python 3, ``5 / 2 == 2.5`` and not ``2``; all division between ``int`` values
|
||||||
it is still possible to simultaneously support Python 3.
|
results in a ``float``. This change has actually been planned since Python 2.2
|
||||||
|
which was released in 2002. Since then users have been encouraged to add
|
||||||
Below are the benefits you gain if you only have to support Python 2.6 and
|
``from __future__ import division`` to any and all files which use the ``/`` and
|
||||||
newer. Some of these options are personal choice while others are
|
``//`` operators or to be running the interpreter with the ``-Q`` flag. If you
|
||||||
**strongly** recommended (the ones that are more for personal choice are
|
have not been doing this then you will need to go through your code and do two
|
||||||
labeled as such). If you continue to support older versions of Python then you
|
things:
|
||||||
at least need to watch out for situations that these solutions fix and handle
|
|
||||||
them appropriately (which is where library help from e.g. six_ comes in handy).
|
#. Add ``from __future__ import division`` to your files
|
||||||
|
#. Update any division operator as necessary to either use ``//`` to use floor
|
||||||
|
division or continue using ``/`` and expect a float
|
||||||
``from __future__ import print_function``
|
|
||||||
'''''''''''''''''''''''''''''''''''''''''
|
The reason that ``/`` isn't simply translated to ``//`` automatically is that if
|
||||||
|
an object defines its own ``__div__`` method but not ``__floordiv__`` then your
|
||||||
It will not only get you used to typing ``print()`` as a function instead of a
|
code would begin to fail.
|
||||||
statement, but it will also give you the various benefits the function has over
|
|
||||||
the Python 2 statement (six_ provides a function if you support Python 2.5 or
|
Text versus binary data
|
||||||
older).
|
+++++++++++++++++++++++
|
||||||
|
|
||||||
|
In Python 2 you could use the ``str`` type for both text and binary data.
|
||||||
``from __future__ import unicode_literals``
|
Unfortunately this confluence of two different concepts could lead to brittle
|
||||||
'''''''''''''''''''''''''''''''''''''''''''
|
code which sometimes worked for either kind of data, sometimes not. It also
|
||||||
|
could lead to confusing APIs if people didn't explicitly state that something
|
||||||
If you choose to use this future statement then all string literals in
|
that accepted ``str`` accepted either text or binary data instead of one
|
||||||
Python 2 will be assumed to be Unicode (as is already the case in Python 3).
|
specific type. This complicated the situation especially for anyone supporting
|
||||||
If you choose not to use this future statement then you should mark all of your
|
multiple languages as APIs wouldn't bother explicitly supporting ``unicode``
|
||||||
text strings with a ``u`` prefix and only support Python 3.3 or newer. But you
|
when they claimed text data support.
|
||||||
are **strongly** advised to do one or the other (six_ provides a function in
|
|
||||||
case you don't want to use the future statement **and** you want to support
|
To make the distinction between text and binary data clearer and more
|
||||||
Python 3.2 or older).
|
pronounced, Python 3 did what most languages created in the age of the internet
|
||||||
|
have done and made text and binary data distinct types that cannot blindly be
|
||||||
|
mixed together (Python predates widespread access to the internet). For any code
|
||||||
Bytes/string literals
|
that only deals with text or only binary data, this separation doesn't pose an
|
||||||
'''''''''''''''''''''
|
issue. But for code that has to deal with both, it does mean you might have to
|
||||||
|
now care about when you are using text compared to binary data, which is why
|
||||||
This is a **very** important one. Prefix Python 2 strings that
|
this cannot be entirely automated.
|
||||||
are meant to contain bytes with a ``b`` prefix to very clearly delineate
|
|
||||||
what is and is not a Python 3 text string (six_ provides a function to use for
|
To start, you will need to decide which APIs take text and which take binary
|
||||||
Python 2.5 compatibility).
|
(it is **highly** recommended you don't design APIs that can take both due to
|
||||||
|
the difficulty of keeping the code working; as stated earlier it is difficult to
|
||||||
This point cannot be stressed enough: make sure you know what all of your string
|
do well). In Python 2 this means making sure the APIs that take text can work
|
||||||
literals in Python 2 are meant to be in Python 3. Any string literal that
|
with ``unicode`` in Python 2 and those that work with binary data work with the
|
||||||
should be treated as bytes should have the ``b`` prefix. Any string literal
|
``bytes`` type from Python 3 and thus a subset of ``str`` in Python 2 (which the
|
||||||
that should be Unicode/text in Python 2 should either have the ``u`` literal
|
``bytes`` type in Python 2 is an alias for). Usually the biggest issue is
|
||||||
(supported, but ignored, in Python 3.3 and later) or you should have
|
realizing which methods exist for which types in Python 2 & 3 simultaneously
|
||||||
``from __future__ import unicode_literals`` at the top of the file. But the key
|
(for text that's ``unicode`` in Python 2 and ``str`` in Python 3, for binary
|
||||||
point is you should know how Python 3 will treat every one of your string
|
that's ``str``/``bytes`` in Python 2 and ``bytes`` in Python 3). The following
|
||||||
literals and you should mark them as appropriate.
|
table lists the **unique** methods of each data type across Python 2 & 3
|
||||||
|
(e.g., the ``decode()`` method is usable on the equivalent binary data type in
|
||||||
There are some differences between byte literals in Python 2 and those in
|
either Python 2 or 3, but it can't be used by the text data type consistently
|
||||||
Python 3 thanks to the bytes type just being an alias to ``str`` in Python 2.
|
between Python 2 and 3 because ``str`` in Python 3 doesn't have the method).
|
||||||
See the `Handle Common "Gotchas"`_ section for what to watch out for.
|
|
||||||
|
======================== =====================
|
||||||
``from __future__ import absolute_import``
|
**Text data** **Binary data**
|
||||||
''''''''''''''''''''''''''''''''''''''''''
|
------------------------ ---------------------
|
||||||
Discussed in more detail below, but you should use this future statement to
|
__mod__ (``%`` operator)
|
||||||
prevent yourself from accidentally using implicit relative imports.
|
------------------------ ---------------------
|
||||||
|
\ decode
|
||||||
|
------------------------ ---------------------
|
||||||
Supporting Python 2.5 and Newer Only
|
encode
|
||||||
////////////////////////////////////
|
------------------------ ---------------------
|
||||||
|
format
|
||||||
If you are supporting Python 2.5 and newer there are still some features of
|
------------------------ ---------------------
|
||||||
Python that you can utilize.
|
isdecimal
|
||||||
|
------------------------ ---------------------
|
||||||
|
isnumeric
|
||||||
``from __future__ import absolute_import``
|
======================== =====================
|
||||||
''''''''''''''''''''''''''''''''''''''''''
|
|
||||||
|
Making the distinction easier to handle can be accomplished by encoding and
|
||||||
Implicit relative imports (e.g., importing ``spam.bacon`` from within
|
decoding between binary data and text at the edge of your code. This means that
|
||||||
``spam.eggs`` with the statement ``import bacon``) do not work in Python 3.
|
when you receive text in binary data, you should immediately decode it. And if
|
||||||
This future statement moves away from that and allows the use of explicit
|
your code needs to send text as binary data then encode it as late as possible.
|
||||||
relative imports (e.g., ``from . import bacon``).
|
This allows your code to work with only text internally and thus eliminates
|
||||||
|
having to keep track of what type of data you are working with.
|
||||||
In Python 2.5 you must use
|
|
||||||
the __future__ statement to get to use explicit relative imports and prevent
|
The next issue is making sure you know whether the string literals in your code
|
||||||
implicit ones. In Python 2.6 explicit relative imports are available without
|
represent text or binary data. At minimum you should add a ``b`` prefix to any
|
||||||
the statement, but you still want the __future__ statement to prevent implicit
|
literal that represents binary data. For text you should either use the
|
||||||
relative imports. In Python 2.7 the __future__ statement is not needed. In
|
``from __future__ import unicode_literals`` statement or add a ``u`` prefix to
|
||||||
other words, unless you are only supporting Python 2.7 or a version earlier
|
the text literal.
|
||||||
than Python 2.5, use this __future__ statement.
|
|
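A minimal sketch of marking literals explicitly, assuming Python 2.6 or newer on the Python 2 side and Python 3.3 or newer on the Python 3 side (where the ``u`` prefix is accepted again). The variable names are illustrative only.

```python
# The u prefix marks text; the b prefix marks binary data.
greeting = u"caf\u00e9"
payload = b"caf\xc3\xa9"

# Encoding text yields binary data; decoding binary data yields text.
encoded = greeting.encode("utf-8")
decoded = payload.decode("utf-8")
```

With both prefixes in place, the type of every literal is the same regardless of which major version runs the file.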
||||||
|
As part of this dichotomy you also need to be careful about opening files.
|
||||||
|
|
||||||
Mark all Unicode strings with a ``u`` prefix
|
|
||||||
'''''''''''''''''''''''''''''''''''''''''''''
|
|
||||||
|
|
||||||
While Python 2.6 has a ``__future__`` statement to automatically cause Python 2
|
|
||||||
to treat all string literals as Unicode, Python 2.5 does not have that shortcut.
|
|
||||||
This means you should go through and mark all string literals with a ``u``
|
|
||||||
prefix to turn them explicitly into text strings where appropriate and only
|
|
||||||
support Python 3.3 or newer. Otherwise use a project like six_ which provides a
|
|
||||||
function to pass all text string literals through.
|
|
||||||
|
|
||||||
|
|
||||||
Capturing the Currently Raised Exception
|
|
||||||
''''''''''''''''''''''''''''''''''''''''
|
|
||||||
|
|
||||||
In Python 2.5 and earlier the syntax to access the current exception is::
|
|
||||||
|
|
||||||
try:
|
|
||||||
raise Exception()
|
|
||||||
except Exception, exc:
|
|
||||||
# Current exception is 'exc'.
|
|
||||||
pass
|
|
||||||
|
|
||||||
This syntax changed in Python 3 (and was backported to Python 2.6 and later)
|
|
||||||
to::
|
|
||||||
|
|
||||||
try:
|
|
||||||
raise Exception()
|
|
||||||
except Exception as exc:
|
|
||||||
# Current exception is 'exc'.
|
|
||||||
# In Python 3, 'exc' is restricted to the block; in Python 2.6/2.7 it will "leak".
|
|
||||||
pass
|
|
||||||
|
|
||||||
Because of this syntax change you must change how you capture the current
|
|
||||||
exception in Python 2.5 and earlier to::
|
|
||||||
|
|
||||||
try:
|
|
||||||
raise Exception()
|
|
||||||
except Exception:
|
|
||||||
import sys
|
|
||||||
exc = sys.exc_info()[1]
|
|
||||||
# Current exception is 'exc'.
|
|
||||||
pass
|
|
||||||
|
|
||||||
You can get more information about the raised exception from
|
|
||||||
:func:`sys.exc_info` than simply the current exception instance, but you most
|
|
||||||
likely don't need it.
|
|
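For illustration, the ``sys.exc_info()`` approach described above can be wrapped in a small function that parses on old and new interpreters alike. The function name ``capture_exception`` and the ``ValueError`` it raises are hypothetical.

```python
import sys


def capture_exception():
    """Return the exception instance without version-specific syntax."""
    try:
        raise ValueError("boom")
    except ValueError:
        # sys.exc_info() returns (type, value, traceback); index 1 is
        # the exception instance itself.
        exc = sys.exc_info()[1]
        return exc


caught = capture_exception()
```

Only the instance is kept here; the traceback element is deliberately not stored in a variable.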
||||||
|
|
||||||
.. note::
|
|
||||||
In Python 3, the traceback is attached to the exception instance
|
|
||||||
through the ``__traceback__`` attribute. If the instance is saved in
|
|
||||||
a local variable that persists outside of the ``except`` block, the
|
|
||||||
traceback will create a reference cycle with the current frame and its
|
|
||||||
dictionary of local variables. This will delay reclaiming dead
|
|
||||||
resources until the next cyclic :term:`garbage collection` pass.
|
|
||||||
|
|
||||||
In Python 2, this problem only occurs if you save the traceback itself
|
|
||||||
(e.g. the third element of the tuple returned by :func:`sys.exc_info`)
|
|
||||||
in a variable.
|
|
||||||
|
|
||||||
|
|
||||||
Handle Common "Gotchas"
|
|
||||||
///////////////////////
|
|
||||||
|
|
||||||
These are things to watch out for no matter what version of Python 2 you are
|
|
||||||
supporting which are not syntactic considerations.
|
|
||||||
|
|
||||||
|
|
||||||
``from __future__ import division``
|
|
||||||
'''''''''''''''''''''''''''''''''''
|
|
||||||
|
|
||||||
While the exact same outcome can be had by using the ``-Qnew`` argument to
|
|
||||||
Python, using this future statement lifts the requirement that your users use
|
|
||||||
the flag to get the expected behavior of division in Python 3
|
|
||||||
(e.g., ``1/2 == 0.5; 1//2 == 0``).
|
|
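A short sketch of the behavior the future statement gives you; the variable names are illustrative. The import has to be the first statement in the file and has no effect on Python 3, where this is already the default.

```python
from __future__ import division  # first statement in the file

true_quotient = 1 / 2     # true division, even between two ints
floor_quotient = 1 // 2   # floor division must be asked for explicitly
negative_floor = -7 // 2  # floors toward negative infinity, not toward zero
```

Code that relied on ``/`` truncating integers needs to switch to ``//`` once the import is added.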
||||||
|
|
||||||
|
|
||||||
|
|
||||||
Specify when opening a file as binary
|
|
||||||
'''''''''''''''''''''''''''''''''''''
|
|
||||||
|
|
||||||
Unless you have been working on Windows, there is a chance you have not always
|
Unless you have been working on Windows, there is a chance you have not always
|
||||||
bothered to add the ``b`` mode when opening a binary file (e.g., ``rb`` for
|
bothered to add the ``b`` mode when opening a binary file (e.g., ``rb`` for
|
||||||
binary reading). Under Python 3, binary files and text files are clearly
|
binary reading). Under Python 3, binary files and text files are clearly
|
||||||
distinct and mutually incompatible; see the :mod:`io` module for details.
|
distinct and mutually incompatible; see the :mod:`io` module for details.
|
||||||
Therefore, you **must** make a decision of whether a file will be used for
|
Therefore, you **must** make a decision of whether a file will be used for
|
||||||
binary access (allowing to read and/or write bytes data) or text access
|
binary access (allowing to read and/or write binary data) or text access
|
||||||
(allowing to read and/or write unicode data).
|
(allowing to read and/or write text data). You should also use :func:`io.open`
|
||||||
|
for opening files instead of the built-in :func:`open` function as the :mod:`io`
|
||||||
Text files
|
module is consistent from Python 2 to 3 while the built-in :func:`open` function
|
||||||
''''''''''
|
is not (in Python 3 it's actually :func:`io.open`).
|
||||||
|
|
||||||
Text files created using ``open()`` under Python 2 return byte strings,
|
Finally, the indexing of binary data requires careful handling (slicing does
|
||||||
while under Python 3 they return unicode strings. Depending on your porting
|
**not** require any special handling). In Python 2,
|
||||||
strategy, this can be an issue.
|
``b'123'[1] == b'2'`` while in Python 3 ``b'123'[1] == 50``. Because binary data
|
||||||
|
is simply a collection of binary numbers, Python 3 returns the integer value for
|
||||||
If you want text files to return unicode strings in Python 2, you have two
|
the byte you index on. But in Python 2 because ``bytes == str``, indexing
|
||||||
possibilities:
|
returns a one-item slice of bytes. The six_ project has a function
|
||||||
|
named ``six.indexbytes()`` which will return an integer like in Python 3:
|
||||||
* Under Python 2.6 and higher, use :func:`io.open`. Since :func:`io.open`
|
``six.indexbytes(b'123', 1)``.
|
||||||
is essentially the same function in both Python 2 and Python 3, it will
|
|
||||||
help iron out any issues that might arise.
|
To summarize:
|
||||||
|
|
||||||
* If pre-2.6 compatibility is needed, then you should use :func:`codecs.open`
|
#. Decide which of your APIs take text and which take binary data
|
||||||
instead. This will make sure that you get back unicode strings in Python 2.
|
#. Make sure that your code that works with text also works with ``unicode`` and
|
||||||
|
code for binary data works with ``bytes`` in Python 2 (see the table above
|
||||||
Subclass ``object``
|
for what methods you cannot use for each type)
|
||||||
'''''''''''''''''''
|
#. Mark all binary literals with a ``b`` prefix, use a ``u`` prefix or
|
||||||
|
:mod:`__future__` import statement for text literals
|
||||||
New-style classes have been around since Python 2.2. You need to make sure
|
#. Decode binary data to text as soon as possible, encode text as binary data as
|
||||||
you are subclassing from ``object`` to avoid odd edge cases involving method
|
late as possible
|
||||||
resolution order, etc. This continues to be totally valid in Python 3 (although
|
#. Open files using :func:`io.open` and make sure to specify the ``b`` mode when
|
||||||
unneeded as all classes implicitly inherit from ``object``).
|
appropriate
|
||||||
|
#. Be careful when indexing binary data
|
||||||
|
|
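The text-versus-binary steps summarized in the new version of this section could be sketched as below. The helper names (``write_text``, ``read_text``, ``byte_at``) and the scratch file are hypothetical, and ``byte_at`` is only a stand-in for what ``six.indexbytes()`` provides if you do not want the dependency.

```python
import io
import os
import tempfile


def write_text(path, text, encoding="utf-8"):
    """Encode text as late as possible, right before it is written."""
    with io.open(path, "wb") as handle:
        handle.write(text.encode(encoding))


def read_text(path, encoding="utf-8"):
    """Decode binary data as soon as it has been read."""
    with io.open(path, "rb") as handle:
        return handle.read().decode(encoding)


def byte_at(data, index):
    """Return the integer value of one byte on Python 2 and 3 alike."""
    # Slicing yields a length-1 bytes object on both versions;
    # ord() then turns it into an integer.
    return ord(data[index:index + 1])


scratch_dir = tempfile.mkdtemp()
scratch_path = os.path.join(scratch_dir, "example.txt")
write_text(scratch_path, u"caf\u00e9")
round_tripped = read_text(scratch_path)
```

Everything between ``read_text`` and ``write_text`` can then work with text only, which is the point of doing the conversion at the edges.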
||||||
Deal With the Bytes/String Dichotomy
|
Prevent compatibility regressions
|
||||||
''''''''''''''''''''''''''''''''''''
|
---------------------------------
|
||||||
|
|
||||||
One of the biggest issues people have when porting code to Python 3 is handling
|
Once you have fully translated your code to be compatible with Python 3, you
|
||||||
the bytes/string dichotomy. Because Python 2 allowed the ``str`` type to hold
|
will want to make sure your code doesn't regress and stop working under
|
||||||
textual data, people have over the years been rather loose in their delineation
|
Python 3. This is especially true if you have a dependency which is blocking you
|
||||||
of what ``str`` instances held text compared to bytes. In Python 3 you cannot
|
from actually running under Python 3 at the moment.
|
||||||
be so care-free anymore and need to properly handle the difference. The key to
|
|
||||||
handling this issue is to make sure that **every** string literal in your
|
To help with staying compatible, any new modules you create should have
|
||||||
Python 2 code is either syntactically or functionally marked as either bytes or
|
at least the following block of code at the top of it::
|
||||||
text data. After this is done you then need to make sure your APIs are designed
|
|
||||||
to either handle a specific type or made to be properly polymorphic.
|
from __future__ import absolute_import
|
||||||
|
from __future__ import division
|
||||||
|
from __future__ import print_function
|
||||||
Mark Up Python 2 String Literals
|
from __future__ import unicode_literals
|
||||||
********************************
|
|
||||||
|
You can also run Python 2 with the ``-3`` flag to be warned about various
|
||||||
First thing you must do is designate every single string literal in Python 2
|
compatibility issues your code triggers during execution. If you turn warnings
|
||||||
as either textual or bytes data. If you are only supporting Python 2.6 or
|
into errors with ``-Werror`` then you can make sure that you don't accidentally
|
||||||
newer, this can be accomplished by marking bytes literals with a ``b`` prefix
|
miss a warning.
|
||||||
and then designating textual data with a ``u`` prefix or using the
|
|
||||||
``unicode_literals`` future statement.
|
|
||||||
|
You can also use the Pylint_ project and its ``--py3k`` flag to lint your code
|
||||||
If your project supports versions of Python predating 2.6, then you should use
|
to receive warnings when your code begins to deviate from Python 3
|
||||||
the six_ project and its ``b()`` function to denote bytes literals. For text
|
compatibility. This also prevents you from having to run Modernize_ or Futurize_
|
||||||
literals you can either use six's ``u()`` function or use a ``u`` prefix.
|
over your code regularly to catch compatibility regressions. This does require
|
||||||
|
you only support Python 2.7 and Python 3.4 or newer as that is Pylint's
|
||||||
|
minimum Python version support.
|
||||||
Decide what APIs Will Accept
|
|
||||||
****************************
|
|
||||||
|
Check which dependencies block your transition
|
||||||
In Python 2 it was very easy to accidentally create an API that accepted both
|
----------------------------------------------
|
||||||
bytes and textual data. But in Python 3, thanks to the more strict handling of
|
|
||||||
disparate types, this loose usage of bytes and text together tends to fail.
|
**After** you have made your code compatible with Python 3 you should begin to
|
||||||
|
care about whether your dependencies have also been ported. The caniusepython3_
|
||||||
Take the dict ``{b'a': 'bytes', u'a': 'text'}`` in Python 2.6. It creates the
|
project was created to help you determine which projects
|
||||||
dict ``{u'a': 'text'}`` since ``b'a' == u'a'``. But in Python 3 the equivalent
|
-- directly or indirectly -- are blocking you from supporting Python 3. There
|
||||||
dict creates ``{b'a': 'bytes', 'a': 'text'}``, i.e., no lost data. Similar
|
is both a command-line tool as well as a web interface at
|
||||||
issues can crop up when transitioning Python 2 code to Python 3.
|
https://caniusepython3.com .
|
||||||
|
|
||||||
This means you need to choose what an API is going to accept and create and
|
The project also provides code which you can integrate into your test suite so
|
||||||
consistently stick to that API in both Python 2 and 3.
|
that you will have a failing test when you no longer have dependencies blocking
|
||||||
|
you from using Python 3. This allows you to avoid having to manually check your
|
||||||
|
dependencies and to be notified quickly when you can start running on Python 3.
|
||||||
Bytes / Unicode Comparison
|
|
||||||
**************************
|
Update your ``setup.py`` file to denote Python 3 compatibility
|
||||||
|
--------------------------------------------------------------
|
||||||
In Python 3, mixing bytes and unicode is forbidden in most situations; it
|
|
||||||
will raise a :class:`TypeError` where Python 2 would have attempted an implicit
|
Once your code works under Python 3, you should update the classifiers in
|
||||||
coercion between types. However, there is one case where it doesn't and
|
your ``setup.py`` to contain ``Programming Language :: Python :: 3`` and to not
|
||||||
it can be very misleading::
|
specify sole Python 2 support. This will tell
|
||||||
|
anyone using your code that you support Python 2 **and** 3. Ideally you will
|
||||||
>>> b"" == ""
|
also want to add classifiers for each major/minor version of Python you now
|
||||||
False
|
support.
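As a rough sketch of what this could look like, the classifier list below marks a project as supporting both major versions. The specific minor versions (2.7 and 3.4) and the ``CLASSIFIERS`` name are assumptions chosen for illustration; list whichever versions you actually test against.

```python
# Hypothetical classifier list for a project that supports Python 2.7 and 3.4.
CLASSIFIERS = [
    "Programming Language :: Python :: 2",
    "Programming Language :: Python :: 2.7",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.4",
]

# Sanity check: the project must not claim to be Python 2 only.
claims_python2_only = "Programming Language :: Python :: 2 :: Only" in CLASSIFIERS
```

The list would then be passed as the ``classifiers`` argument of the ``setup()`` call in ``setup.py``.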
|
||||||
|
|
||||||
This is because an equality comparison is required by the language to always
|
Use continuous integration to stay compatible
|
||||||
succeed (and return ``False`` for incompatible types). However, this also
|
---------------------------------------------
|
||||||
means that code incorrectly ported to Python 3 can display buggy behaviour
|
|
||||||
if such comparisons are silently executed. To detect such situations,
|
Once you are able to fully run under Python 3 you will want to make sure your
|
||||||
Python 3 has a ``-b`` flag that will display a warning::
|
code always works under both Python 2 & 3. Probably the best tool for running
|
||||||
|
your tests under multiple Python interpreters is tox_. You can then integrate
|
||||||
$ python3 -b
|
tox with your continuous integration system so that you never accidentally break
|
||||||
>>> b"" == ""
|
Python 2 or 3 support.
|
||||||
__main__:1: BytesWarning: Comparison between bytes and string
|
|
||||||
False
|
You may also want to use the ``-bb`` flag with the Python 3 interpreter to
|
||||||
|
trigger an exception when you are comparing bytes to strings. Usually it's
|
||||||
To turn the warning into an exception, use the ``-bb`` flag instead::
|
simply ``False``, but if you made a mistake in your separation of text/binary
|
||||||
|
data handling you may be accidentally comparing text and binary data. This flag
|
||||||
$ python3 -bb
|
will raise an exception when that occurs to help track down such cases.
|
||||||
>>> b"" == ""
|
|
||||||
Traceback (most recent call last):
|
And that's mostly it! At this point your code base is compatible with both
|
||||||
File "<stdin>", line 1, in <module>
|
Python 2 and 3 simultaneously. Your testing will also be set up so that you
|
||||||
BytesWarning: Comparison between bytes and string
|
don't accidentally break Python 2 or 3 compatibility regardless of which version
|
||||||
|
you typically run your tests under while developing.
|
||||||
|
|
||||||
Indexing bytes objects
|
|
||||||
''''''''''''''''''''''
|
Dropping Python 2 support completely
|
||||||
|
====================================
|
||||||
Another potentially surprising change is the indexing behaviour of bytes
|
|
||||||
objects in Python 3::
|
If you are able to fully drop support for Python 2, then the steps required
|
||||||
|
to transition to Python 3 simplify greatly.
|
||||||
>>> b"xyz"[0]
|
|
||||||
120
|
#. Update your code to only support Python 2.7
|
||||||
|
#. Make sure you have good test coverage (coverage.py_ can help)
|
||||||
Indeed, Python 3 bytes objects (as well as :class:`bytearray` objects)
|
#. Learn the differences between Python 2 & 3
|
||||||
are sequences of integers. But code converted from Python 2 will often
|
#. Use 2to3_ to rewrite your code to run only under Python 3
|
||||||
assume that indexing a bytestring produces another bytestring, not an
|
|
||||||
integer. To reconcile both behaviours, use slicing::
|
After this your code will be fully Python 3 compliant but in a way that is not
|
||||||
|
supported by Python 2. You should also update the classifiers in your
|
||||||
>>> b"xyz"[0:1]
|
``setup.py`` to contain ``Programming Language :: Python :: 3 :: Only``.
|
||||||
b'x'
|
|
||||||
>>> n = 1
|
|
||||||
>>> b"xyz"[n:n+1]
|
.. _2to3: https://docs.python.org/3/library/2to3.html
|
||||||
b'y'
|
.. _caniusepython3:
|
||||||
|
.. _coverage.py: https://pypi.python.org/pypi/coverage
|
||||||
The only remaining gotcha is that an out-of-bounds slice returns an empty
|
.. _Futurize: http://python-future.org/automatic_conversion.html
|
||||||
bytes object instead of raising ``IndexError``::
|
.. _Modernize:
|
||||||
|
|
||||||
>>> b"xyz"[3]
|
|
||||||
Traceback (most recent call last):
|
|
||||||
File "<stdin>", line 1, in <module>
|
|
||||||
IndexError: index out of range
|
|
||||||
>>> b"xyz"[3:4]
|
|
||||||
b''
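One way to get slicing's consistent return type while keeping the ``IndexError`` is a small wrapper. The function name ``byte_at`` is hypothetical; this is only a sketch of the idea, not part of any library.

```python
def byte_at(data, index):
    """Return the byte at *index* as a length-1 bytes object.

    Unlike a bare slice, this raises IndexError when *index* is out of range.
    """
    length = len(data)
    if not -length <= index < length:
        raise IndexError("index out of range")
    if index < 0:
        # Normalize negative indexes so the slice end is computed correctly.
        index += length
    return data[index:index + 1]


first = byte_at(b"xyz", 0)
last = byte_at(b"xyz", -1)
```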
|
|
||||||
|
|
||||||
|
|
||||||
``__str__()``/``__unicode__()``
|
|
||||||
'''''''''''''''''''''''''''''''
|
|
||||||
|
|
||||||
In Python 2, objects can specify both a string and unicode representation of
|
|
||||||
themselves. In Python 3, though, there is only a string representation. This
|
|
||||||
becomes an issue as people can inadvertently do things in their ``__str__()``
|
|
||||||
methods which have unpredictable results (e.g., infinite recursion if you
|
|
||||||
happen to use the ``unicode(self).encode('utf8')`` idiom as the body of your
|
|
||||||
``__str__()`` method).
|
|
||||||
|
|
||||||
You can use a mixin class to work around this. This allows you to only define a
|
|
||||||
``__unicode__()`` method for your class and let the mixin derive
|
|
||||||
``__str__()`` for you (code from
|
|
||||||
http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/)::
|
|
||||||
|
|
||||||
import sys
|
|
||||||
|
|
||||||
class UnicodeMixin(object):
|
|
||||||
|
|
||||||
"""Mixin class to handle defining the proper __str__/__unicode__
|
|
||||||
methods in Python 2 or 3."""
|
|
||||||
|
|
||||||
if sys.version_info[0] >= 3: # Python 3
|
|
||||||
def __str__(self):
|
|
||||||
return self.__unicode__()
|
|
||||||
else: # Python 2
|
|
||||||
def __str__(self):
|
|
||||||
return self.__unicode__().encode('utf8')
|
|
||||||
|
|
||||||
|
|
||||||
class Spam(UnicodeMixin):
|
|
||||||
|
|
||||||
def __unicode__(self):
|
|
||||||
return u'spam-spam-bacon-spam' # 2to3 will remove the 'u' prefix
|
|
||||||
|
|
||||||
|
|
||||||
Don't Index on Exceptions
|
|
||||||
'''''''''''''''''''''''''
|
|
||||||
|
|
||||||
In Python 2, the following worked::
|
|
||||||
|
|
||||||
>>> exc = Exception(1, 2, 3)
|
|
||||||
>>> exc.args[1]
|
|
||||||
2
|
|
||||||
>>> exc[1] # Python 2 only!
|
|
||||||
2
|
|
||||||
|
|
||||||
But in Python 3, indexing directly on an exception is an error. You need to
|
|
||||||
make sure to only index on the :attr:`BaseException.args` attribute which is a
|
|
||||||
sequence containing all arguments passed to the :meth:`__init__` method.
|
|
||||||
|
|
||||||
Even better is to use the documented attributes the exception provides.
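A short sketch of both recommendations follows. The variable names are illustrative only; ``OSError.errno`` is used as an example of a documented attribute.

```python
import errno

exc = ValueError("bad value", 42)
first_arg = exc.args[0]  # portable: index the args tuple, not the exception

try:
    exc[0]
    direct_indexing_works = True   # only reached on Python 2
except TypeError:
    direct_indexing_works = False  # Python 3: exceptions are not subscriptable

# Prefer documented attributes over positional access into args.
err = OSError(errno.ENOENT, "No such file or directory")
error_code = err.errno
```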
|
|
||||||
|
|
||||||
|
|
||||||
Don't use ``__getslice__`` & Friends
|
|
||||||
''''''''''''''''''''''''''''''''''''
|
|
||||||
|
|
||||||
These methods have long been deprecated, and Python 3 finally drops support for
|
|
||||||
``__getslice__()``, etc. Move completely over to :meth:`__getitem__` and
|
|
||||||
friends.
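A minimal sketch of a container that handles slicing through ``__getitem__`` alone is shown below. The ``Window`` class is a made-up example; the relevant part is the check for a :class:`slice` object.

```python
class Window(object):
    """Hypothetical sequence wrapper that supports indexing and slicing."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        # Slices arrive as slice objects; no __getslice__ is needed.
        if isinstance(index, slice):
            return Window(self._items[index])
        return self._items[index]


window = Window(range(5))
single = window[0]
part = window[1:3]
```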
|
|
||||||
|
|
||||||
|
|
||||||
Updating doctests
|
|
||||||
'''''''''''''''''
|
|
||||||
|
|
||||||
Don't forget to make them Python 2/3 compatible as well. If you wrote a
|
|
||||||
monolithic set of doctests (e.g., a single docstring containing all of your
|
|
||||||
doctests), you should at least consider breaking the doctests up into smaller
|
|
||||||
pieces to make it more manageable to fix. Otherwise it might very well be worth
|
|
||||||
your time and effort to port your tests to :mod:`unittest`.
|
|
||||||
|
|
||||||
|
|
||||||
Update ``map`` for imbalanced input sequences
|
|
||||||
'''''''''''''''''''''''''''''''''''''''''''''
|
|
||||||
|
|
||||||
With Python 2, when ``map`` was given more than one input sequence it would pad
|
|
||||||
the shorter sequences with ``None`` values, returning a sequence as long as the
|
|
||||||
longest input sequence.
|
|
||||||
|
|
||||||
With Python 3, if the input sequences to ``map`` are of unequal length, ``map``
|
|
||||||
will stop at the termination of the shortest of the sequences. For full
|
|
||||||
compatibility with ``map`` from Python 2.x, wrap the sequence arguments in
|
|
||||||
:func:`itertools.zip_longest`, e.g. ``map(func, *sequences)`` becomes
|
|
||||||
``list(itertools.starmap(func, itertools.zip_longest(*sequences)))``.
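The padding behaviour can be sketched as a helper that runs on both versions. The name ``map_padded`` is hypothetical. It uses :func:`itertools.starmap` so that ``func`` receives each padded tuple as separate arguments, and it falls back to ``izip_longest``, the Python 2 name of ``zip_longest``.

```python
import itertools

try:
    zip_longest = itertools.zip_longest       # Python 3
except AttributeError:
    zip_longest = itertools.izip_longest      # Python 2


def map_padded(func, *sequences):
    """Emulate Python 2's map(): pad the shorter sequences with None."""
    return list(itertools.starmap(func, zip_longest(*sequences)))


pairs = map_padded(lambda a, b: (a, b), [1, 2, 3], ["x"])
```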
|
|
||||||
|
|
||||||
Eliminate ``-3`` Warnings
|
|
||||||
-------------------------
|
|
||||||
|
|
||||||
When you run your application's test suite, run it using the ``-3`` flag passed
|
|
||||||
to Python. This will cause various warnings to be raised during execution about
|
|
||||||
things that are semantic changes between Python 2 and 3. Try to eliminate those
|
|
||||||
warnings to make your code even more portable to Python 3.
|
|
||||||
|
|
||||||
|
|
||||||
Alternative Approaches
|
|
||||||
======================
|
|
||||||
|
|
||||||
While supporting Python 2 & 3 simultaneously is typically the preferred choice
|
|
||||||
by people so that they can continue to improve code and have it work for the
|
|
||||||
largest number of users, your life may be easier if you only have to support one
|
|
||||||
major version of Python going forward.
|
|
||||||
|
|
||||||
Supporting Only Python 3 Going Forward From Python 2 Code
|
|
||||||
---------------------------------------------------------
|
|
||||||
|
|
||||||
If you have Python 2 code but going forward only want to improve it as Python 3
|
|
||||||
code, then you can use :ref:`2to3 <2to3-reference>` to translate your Python 2
|
|
||||||
code to Python 3 code. This is only recommended, though, if your current
|
|
||||||
version of your project is going into maintenance mode and you want all new
|
|
||||||
features to be exclusive to Python 3.
|
|
||||||
|
|
||||||
|
|
||||||
Backporting Python 3 code to Python 2
|
|
||||||
-------------------------------------
|
|
||||||
|
|
||||||
If you have Python 3 code and have little interest in supporting Python 2 you
|
|
||||||
can use 3to2_ to translate from Python 3 code to Python 2 code. This is only
|
|
||||||
recommended if you don't plan to heavily support Python 2 users. Otherwise
|
|
||||||
write your code for Python 3 and then backport as far back as you want. This
|
|
||||||
is typically easier than going from Python 2 to 3 as you will have worked out
|
|
||||||
any difficulties with e.g. bytes/strings, etc.
|
|
||||||
|
|
||||||
|
|
||||||
Other Resources
|
|
||||||
===============
|
|
||||||
|
|
||||||
The authors of the following blog posts, wiki pages, and books deserve special
|
|
||||||
thanks for making public their tips for porting Python 2 code to Python 3 (and
|
|
||||||
thus helping provide information for this document and its various revisions
|
|
||||||
over the years):
|
|
||||||
|
|
||||||
* https://wiki.python.org/moin/PortingPythonToPy3k
|
|
||||||
* http://python3porting.com/
|
|
||||||
* http://docs.pythonsprints.com/python3_porting/py-porting.html
|
|
||||||
* http://techspot.zzzeek.org/2011/01/24/zzzeek-s-guide-to-python-3-porting/
|
|
||||||
* http://dabeaz.blogspot.com/2011/01/porting-py65-and-my-superboard-to.html
|
|
||||||
* http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/
|
|
||||||
* http://lucumr.pocoo.org/2010/2/11/porting-to-python-3-a-guide/
|
|
||||||
* https://wiki.ubuntu.com/Python/3
|
|
||||||
|
|
||||||
If you feel there is something missing from this document that should be added,
|
|
||||||
please email the python-porting_ mailing list.
|
|
||||||
|
|
||||||
|
|
||||||
.. _3to2: https://pypi.python.org/pypi/3to2
|
|
||||||
.. _Cheeseshop: PyPI_
|
|
||||||
.. _coverage: https://pypi.python.org/pypi/coverage
|
|
||||||
.. _future: http://python-future.org/
|
|
||||||
.. _modernize: https://github.com/mitsuhiko/python-modernize
|
|
||||||
.. _Porting to Python 3: http://python3porting.com/
|
.. _Porting to Python 3: http://python3porting.com/
|
||||||
.. _PyPI: https://pypi.python.org/pypi
|
.. _Pylint: https://pypi.python.org/pypi/pylint
|
||||||
.. _Python 3 Packages: https://pypi.python.org/pypi?:action=browse&c=533&show=all
|
|
||||||
.. _Python 3 Q & A: http://ncoghlan-devs-python-notes.readthedocs.org/en/latest/python3/questions_and_answers.html
|
.. _Python 3 Q & A: http://ncoghlan-devs-python-notes.readthedocs.org/en/latest/python3/questions_and_answers.html
|
||||||
|
|
||||||
|
.. _python-future: http://python-future.org/
|
||||||
.. _python-porting: https://mail.python.org/mailman/listinfo/python-porting
|
.. _python-porting: https://mail.python.org/mailman/listinfo/python-porting
|
||||||
.. _six: https://pypi.python.org/pypi/six
|
.. _six: https://pypi.python.org/pypi/six
|
||||||
.. _tox: https://pypi.python.org/pypi/tox
|
.. _tox: https://pypi.python.org/pypi/tox
|
||||||
.. _trove classifiers: https://pypi.python.org/pypi?%3Aaction=list_classifiers
|
.. _trove classifier: https://pypi.python.org/pypi?%3Aaction=list_classifiers
|
||||||
|
.. _"What's New": https://docs.python.org/3/whatsnew/index.html
|
||||||
|
|
|
@@ -219,6 +219,9 @@ Build
|
||||||
Documentation
|
Documentation
|
||||||
-------------
|
-------------
|
||||||
|
|
||||||
|
- Issue #22914: Update the Python 2/3 porting HOWTO to describe a more automated
|
||||||
|
approach.
|
||||||
|
|
||||||
- Issue #21514: The documentation of the json module now refers to new JSON RFC
|
- Issue #21514: The documentation of the json module now refers to new JSON RFC
|
||||||
7159 instead of obsoleted RFC 4627.
|
7159 instead of obsoleted RFC 4627.
|
||||||
|
|
||||||
|
|