Update porting HOWTO to drop unicode_literals and mention static type checking
parent b1a1619bf0
commit 5866719510
@@ -17,7 +17,8 @@ Porting Python 2 Code to Python 3
 please see :ref:`cporting-howto`.

 If you would like to read one core Python developer's take on why Python 3
-came into existence, you can read Nick Coghlan's `Python 3 Q & A`_.
+came into existence, you can read Nick Coghlan's `Python 3 Q & A`_ or
+Brett Cannon's `Why Python 3 exists`_.

 For help with porting, you can email the python-porting_ mailing list with
 questions.
@@ -32,8 +33,7 @@ are:
 #. Make sure you have good test coverage (coverage.py_ can help;
    ``pip install coverage``)
 #. Learn the differences between Python 2 & 3
-#. Use Modernize_ or Futurize_ to update your code (``pip install modernize`` or
-   ``pip install future``, respectively)
+#. Use Futurize_ (or Modernize_) to update your code (e.g. ``pip install future``)
 #. Use Pylint_ to help make sure you don't regress on your Python 3 support
    (``pip install pylint``)
 #. Use caniusepython3_ to find out which of your dependencies are blocking your
@@ -41,10 +41,9 @@ are:
 #. Once your dependencies are no longer blocking you, use continuous integration
    to make sure you stay compatible with Python 2 & 3 (tox_ can help test
    against multiple versions of Python; ``pip install tox``)
-
-If you are dropping support for Python 2 entirely, then after you learn the
-differences between Python 2 & 3 you can run 2to3_ over your code and skip the
-rest of the steps outlined above.
+#. Consider using optional static type checking to make sure your type usage
+   works in both Python 2 & 3 (e.g. use mypy_ to check your typing under both
+   Python 2 & Python 3).


 Details
@@ -54,7 +53,7 @@ A key point about supporting Python 2 & 3 simultaneously is that you can start
 **today**! Even if your dependencies are not supporting Python 3 yet that does
 not mean you can't modernize your code **now** to support Python 3. Most changes
 required to support Python 3 lead to cleaner code using newer practices even in
-Python 2.
+Python 2 code.

 Another key point is that modernizing your Python 2 code to also support
 Python 3 is largely automated for you. While you might have to make some API
@@ -82,12 +81,13 @@ have to import a function instead of using a built-in one, but otherwise the
 overall transformation should not feel foreign to you.

 But you should aim for only supporting Python 2.7. Python 2.6 is no longer
-supported and thus is not receiving bugfixes. This means **you** will have to
-work around any issues you come across with Python 2.6. There are also some
+freely supported and thus is not receiving bugfixes. This means **you** will have
+to work around any issues you come across with Python 2.6. There are also some
 tools mentioned in this HOWTO which do not support Python 2.6 (e.g., Pylint_),
 and this will become more commonplace as time goes on. It will simply be easier
 for you if you only support the versions of Python that you have to support.

+
 Make sure you specify the proper version support in your ``setup.py`` file
 --------------------------------------------------------------------------

@@ -98,6 +98,7 @@ Python 3 yet you should at least have
 also specify each major/minor version of Python that you do support, e.g.
 ``Programming Language :: Python :: 2.7``.

+
 Have good test coverage
 -----------------------

@@ -106,10 +107,11 @@ to, you will want to make sure your test suite has good coverage. A good rule of
 thumb is that if you want to be confident enough in your test suite that any
 failures that appear after having tools rewrite your code are actual bugs in the
 tools and not in your code. If you want a number to aim for, try to get over 80%
-coverage (and don't feel bad if you can't easily get past 90%). If you
+coverage (and don't feel bad if you can't easily get passed 90%). If you
 don't already have a tool to measure test coverage then coverage.py_ is
 recommended.

+
 Learn the differences between Python 2 & 3
 -------------------------------------------

@@ -127,13 +129,15 @@ Update your code

 Once you feel like you know what is different in Python 3 compared to Python 2,
 it's time to update your code! You have a choice between two tools in porting
-your code automatically: Modernize_ and Futurize_. Which tool you choose will
+your code automatically: Futurize_ and Modernize_. Which tool you choose will
 depend on how much like Python 3 you want your code to be. Futurize_ does its
 best to make Python 3 idioms and practices exist in Python 2, e.g. backporting
 the ``bytes`` type from Python 3 so that you have semantic parity between the
 major versions of Python. Modernize_,
 on the other hand, is more conservative and targets a Python 2/3 subset of
-Python, relying on six_ to help provide compatibility.
+Python, directly relying on six_ to help provide compatibility. As Python 3 is
+the future, it might be best to consider Futurize to begin adjusting to any new
+practices that Python 3 introduces which you are not accustomed to yet.

 Regardless of which tool you choose, they will update your code to run under
 Python 3 while staying compatible with the version of Python 2 you started with.
@@ -153,6 +157,7 @@ the built-in ``open()`` function is off by default in Modernize). Luckily,
 though, there are only a couple of things to watch out for which can be
 considered large issues that may be hard to debug if not watched for.

+
 Division
 ++++++++

@@ -173,6 +178,7 @@ an object defines a ``__truediv__`` method but not ``__floordiv__`` then your
 code would begin to fail (e.g. a user-defined class that uses ``/`` to
 signify some operation but not ``//`` for the same thing or at all).
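
As an illustrative sketch of the pitfall this hunk's context describes: with ``from __future__ import division`` in effect, ``/`` dispatches to ``__truediv__`` and ``//`` to ``__floordiv__``. The ``Ratio`` class is hypothetical and not part of the HOWTO; it defines only ``__truediv__``, so code that was mechanically switched from ``/`` to ``//`` would fail on it.

```python
from __future__ import division


class Ratio(object):
    """Hypothetical class that supports ``/`` but not ``//``."""

    def __init__(self, value):
        self.value = value

    def __truediv__(self, other):
        # Called for ``/`` on Python 3, and on Python 2 once the
        # ``division`` future import is active in the calling module.
        return Ratio(self.value / other)


true_result = 7 / 2    # true division on both Python 2 and 3
floor_result = 7 // 2  # floor division on both Python 2 and 3
half = Ratio(1) / 2

try:
    Ratio(1) // 2
    floor_supported = True
except TypeError:
    # No ``__floordiv__`` is defined, so ``//`` is rejected.
    floor_supported = False
```

The practical consequence is that a blanket rewrite of ``/`` to ``//`` is only safe for operands known to implement floor division.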

+
 Text versus binary data
 +++++++++++++++++++++++

@@ -189,7 +195,7 @@ To make the distinction between text and binary data clearer and more
 pronounced, Python 3 did what most languages created in the age of the internet
 have done and made text and binary data distinct types that cannot blindly be
 mixed together (Python predates widespread access to the internet). For any code
-that only deals with text or only binary data, this separation doesn't pose an
+that deals only with text or only binary data, this separation doesn't pose an
 issue. But for code that has to deal with both, it does mean you might have to
 now care about when you are using text compared to binary data, which is why
 this cannot be entirely automated.
@@ -198,15 +204,15 @@ To start, you will need to decide which APIs take text and which take binary
 (it is **highly** recommended you don't design APIs that can take both due to
 the difficulty of keeping the code working; as stated earlier it is difficult to
 do well). In Python 2 this means making sure the APIs that take text can work
-with ``unicode`` in Python 2 and those that work with binary data work with the
-``bytes`` type from Python 3 and thus a subset of ``str`` in Python 2 (which the
-``bytes`` type in Python 2 is an alias for). Usually the biggest issue is
-realizing which methods exist for which types in Python 2 & 3 simultaneously
+with ``unicode`` and those that work with binary data work with the
+``bytes`` type from Python 3 (which is a subset of ``str`` in Python 2 and acts
+as an alias for ``bytes`` type in Python 2). Usually the biggest issue is
+realizing which methods exist on which types in Python 2 & 3 simultaneously
 (for text that's ``unicode`` in Python 2 and ``str`` in Python 3, for binary
 that's ``str``/``bytes`` in Python 2 and ``bytes`` in Python 3). The following
 table lists the **unique** methods of each data type across Python 2 & 3
 (e.g., the ``decode()`` method is usable on the equivalent binary data type in
-either Python 2 or 3, but it can't be used by the text data type consistently
+either Python 2 or 3, but it can't be used by the textual data type consistently
 between Python 2 and 3 because ``str`` in Python 3 doesn't have the method). Do
 note that as of Python 3.5 the ``__mod__`` method was added to the bytes type.

@@ -232,10 +238,11 @@ This allows your code to work with only text internally and thus eliminates
 having to keep track of what type of data you are working with.

 The next issue is making sure you know whether the string literals in your code
-represent text or binary data. At minimum you should add a ``b`` prefix to any
-literal that presents binary data. For text you should either use the
-``from __future__ import unicode_literals`` statement or add a ``u`` prefix to
-the text literal.
+represent text or binary data. You should add a ``b`` prefix to any
+literal that presents binary data. For text you should add a ``u`` prefix to
+the text literal. (there is a :mod:`__future__` import to force all unspecified
+literals to be Unicode, but usage has shown it isn't as effective as adding a
+``b`` or ``u`` prefix to all literals explicitly)
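
A minimal sketch of the explicit-prefix approach the new wording recommends. The constant names and the ``is_text`` helper are made up for illustration; the ``u`` prefix is accepted on Python 2 and again from Python 3.3 onwards.

```python
# Binary data: ``b`` gives ``str`` on Python 2 and ``bytes`` on Python 3.
PNG_MAGIC = b"\x89PNG"

# Text: ``u`` gives ``unicode`` on Python 2 and ``str`` on Python 3.
GREETING = u"caf\u00e9"

# The text type under whichever interpreter is running this module.
text_type = type(u"")


def is_text(value):
    """Return True if ``value`` is text rather than binary data."""
    return isinstance(value, text_type)
```

With every literal marked, a reader (or a type checker) can tell which kind of data each value holds without consulting the top of the file for a ``__future__`` import.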

 As part of this dichotomy you also need to be careful about opening files.
 Unless you have been working on Windows, there is a chance you have not always
@@ -243,11 +250,13 @@ bothered to add the ``b`` mode when opening a binary file (e.g., ``rb`` for
 binary reading). Under Python 3, binary files and text files are clearly
 distinct and mutually incompatible; see the :mod:`io` module for details.
 Therefore, you **must** make a decision of whether a file will be used for
-binary access (allowing binary data to be read and/or written) or text access
+binary access (allowing binary data to be read and/or written) or textual access
 (allowing text data to be read and/or written). You should also use :func:`io.open`
 for opening files instead of the built-in :func:`open` function as the :mod:`io`
 module is consistent from Python 2 to 3 while the built-in :func:`open` function
-is not (in Python 3 it's actually :func:`io.open`).
+is not (in Python 3 it's actually :func:`io.open`). Do not bother with the
+outdated practice of using :func:`codecs.open` as that's only necessary for
+keeping compatibility with Python 2.5.
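
The following sketch shows one way the ``io.open`` advice might look in practice. The ``round_trip`` helper, the file name and the sample string are assumptions made for the example; only ``io.open`` and its ``encoding`` argument come from the standard library.

```python
import io
import os
import shutil
import tempfile


def round_trip(path, text):
    """Write text to ``path``, then read it back as bytes and as text."""
    # Text access: an explicit encoding avoids platform defaults.
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(text)
    # Binary access: the ``b`` mode yields the raw encoded bytes.
    with io.open(path, "rb") as f:
        raw = f.read()
    with io.open(path, "r", encoding="utf-8") as f:
        decoded = f.read()
    return raw, decoded


tmp_dir = tempfile.mkdtemp()
try:
    raw, decoded = round_trip(os.path.join(tmp_dir, "example.txt"), u"caf\u00e9")
finally:
    shutil.rmtree(tmp_dir)
```

The same file yields ``bytes`` or text depending only on the mode, and that choice has to be made at the ``io.open`` call.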

 The constructors of both ``str`` and ``bytes`` have different semantics for the
 same arguments between Python 2 & 3. Passing an integer to ``bytes`` in Python 2
@@ -274,21 +283,22 @@ To summarize:
 #. Make sure that your code that works with text also works with ``unicode`` and
    code for binary data works with ``bytes`` in Python 2 (see the table above
    for what methods you cannot use for each type)
-#. Mark all binary literals with a ``b`` prefix, use a ``u`` prefix or
-   :mod:`__future__` import statement for text literals
+#. Mark all binary literals with a ``b`` prefix, textual literals with a ``u``
+   prefix
 #. Decode binary data to text as soon as possible, encode text as binary data as
    late as possible
 #. Open files using :func:`io.open` and make sure to specify the ``b`` mode when
    appropriate
-#. Be careful when indexing binary data
+#. Be careful when indexing into binary data
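
Two items from the summary above, decoding early and indexing carefully, can be sketched as follows. The function names are hypothetical, and UTF-8 is assumed as the encoding purely for the example.

```python
def shout(raw):
    """Upper-case UTF-8 encoded input, working on text internally."""
    text = raw.decode("utf-8")   # decode as soon as the data arrives
    loud = text.upper()          # all processing happens on text
    return loud.encode("utf-8")  # encode only when handing the data back


def first_byte(data):
    """Return the first byte of ``data`` as binary data on Python 2 and 3."""
    # ``data[0]`` is a length-1 ``str`` on Python 2 but an ``int`` on
    # Python 3; slicing returns binary data under both.
    return data[0:1]
```

Slicing instead of indexing keeps the return type stable, so callers do not need a version check.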

+
 Use feature detection instead of version detection
 ++++++++++++++++++++++++++++++++++++++++++++++++++

 Inevitably you will have code that has to choose what to do based on what
 version of Python is running. The best way to do this is with feature detection
 of whether the version of Python you're running under supports what you need.
-If for some reason that doesn't work then you should make the version check is
+If for some reason that doesn't work then you should make the version check be
 against Python 2 and not Python 3. To help explain this, let's look at an
 example.

@@ -340,14 +350,12 @@ at least the following block of code at the top of it::
     from __future__ import absolute_import
     from __future__ import division
     from __future__ import print_function
-    from __future__ import unicode_literals

 You can also run Python 2 with the ``-3`` flag to be warned about various
 compatibility issues your code triggers during execution. If you turn warnings
 into errors with ``-Werror`` then you can make sure that you don't accidentally
 miss a warning.

-
 You can also use the Pylint_ project and its ``--py3k`` flag to lint your code
 to receive warnings when your code begins to deviate from Python 3
 compatibility. This also prevents you from having to run Modernize_ or Futurize_
@@ -364,22 +372,23 @@ care about whether your dependencies have also been ported. The caniusepython3_
 project was created to help you determine which projects
 -- directly or indirectly -- are blocking you from supporting Python 3. There
 is both a command-line tool as well as a web interface at
-https://caniusepython3.com .
+https://caniusepython3.com.

 The project also provides code which you can integrate into your test suite so
 that you will have a failing test when you no longer have dependencies blocking
 you from using Python 3. This allows you to avoid having to manually check your
 dependencies and to be notified quickly when you can start running on Python 3.

+
 Update your ``setup.py`` file to denote Python 3 compatibility
 --------------------------------------------------------------

 Once your code works under Python 3, you should update the classifiers in
 your ``setup.py`` to contain ``Programming Language :: Python :: 3`` and to not
-specify sole Python 2 support. This will tell
-anyone using your code that you support Python 2 **and** 3. Ideally you will
-also want to add classifiers for each major/minor version of Python you now
-support.
+specify sole Python 2 support. This will tell anyone using your code that you
+support Python 2 **and** 3. Ideally you will also want to add classifiers for
+each major/minor version of Python you now support.
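
As a rough illustration of the classifier advice, a ``setup.py`` for a project supporting both major versions might carry a list like the one below. The specific minor versions are assumptions chosen for the example; list whichever versions the project actually tests against.

```python
# Hypothetical classifier list for a project that supports Python 2 and 3.
# No ``:: Only`` classifier appears, since neither version is the sole target.
CLASSIFIERS = [
    "Programming Language :: Python :: 2",
    "Programming Language :: Python :: 2.7",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.5",
]

# In setup.py this would be passed along as ``setup(..., classifiers=CLASSIFIERS)``.
```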

+
 Use continuous integration to stay compatible
 ---------------------------------------------
@@ -404,20 +413,17 @@ don't accidentally break Python 2 or 3 compatibility regardless of which version
 you typically run your tests under while developing.


-Dropping Python 2 support completely
-====================================
+Consider using optional static type checking
+--------------------------------------------

-If you are able to fully drop support for Python 2, then the steps required
-to transition to Python 3 simplify greatly.
-
-#. Update your code to only support Python 2.7
-#. Make sure you have good test coverage (coverage.py_ can help)
-#. Learn the differences between Python 2 & 3
-#. Use 2to3_ to rewrite your code to run only under Python 3
-
-After this your code will be fully Python 3 compliant but in a way that is not
-supported by Python 2. You should also update the classifiers in your
-``setup.py`` to contain ``Programming Language :: Python :: 3 :: Only``.
+Another way to help port your code is to use a static type checker like
+mypy_ or pytype_ on your code. These tools can be used to analyze your code as
+if it's being run under Python 2, then you can run the tool a second time as if
+your code is running under Python 3. By running a static type checker twice like
+this you can discover if you're e.g. misusing binary data type in one version
+of Python compared to another. If you add optional type hints to your code you
+can also explicitly state whether your APIs use textual or binary data, helping
+to make sure everything functions as expected in both versions of Python.
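
A small sketch of what such type hints could look like, assuming PEP 484 type comments (which remain valid syntax on Python 2) and the ``typing`` module, which on Python 2 has to be installed from PyPI. The function names are illustrative only, and how a checker is told which Python version to target depends on the tool and its release.

```python
from typing import Text


def decode_name(raw):
    # type: (bytes) -> Text
    """Accept binary data and return text."""
    return raw.decode("utf-8")


def encode_name(name):
    # type: (Text) -> bytes
    """Accept text and return binary data."""
    return name.encode("utf-8")
```

With annotations like these, passing text to ``decode_name`` or binary data to ``encode_name`` is the kind of mistake a checker can report before the code runs.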


 .. _2to3: https://docs.python.org/3/library/2to3.html
@@ -428,13 +434,19 @@ supported by Python 2. You should also update the classifiers in your
 .. _importlib: https://docs.python.org/3/library/importlib.html#module-importlib
 .. _importlib2: https://pypi.python.org/pypi/importlib2
 .. _Modernize: https://python-modernize.readthedocs.org/en/latest/
+.. _mypy: http://mypy-lang.org/
 .. _Porting to Python 3: http://python3porting.com/
 .. _Pylint: https://pypi.python.org/pypi/pylint
+
 .. _Python 3 Q & A: https://ncoghlan-devs-python-notes.readthedocs.org/en/latest/python3/questions_and_answers.html

+.. _pytype: https://github.com/google/pytype
 .. _python-future: http://python-future.org/
 .. _python-porting: https://mail.python.org/mailman/listinfo/python-porting
 .. _six: https://pypi.python.org/pypi/six
 .. _tox: https://pypi.python.org/pypi/tox
 .. _trove classifier: https://pypi.python.org/pypi?%3Aaction=list_classifiers
+
 .. _"What's New": https://docs.python.org/3/whatsnew/index.html
+
+.. _Why Python 3 exists: http://www.snarky.ca/why-python-3-exists