************************************
Idioms and Anti-Idioms in Python
************************************
:Author: Moshe Zadka
This document is placed in the public domain.
.. topic:: Abstract

   This document can be considered a companion to the tutorial. It shows how to use
   Python, and even more importantly, how *not* to use Python.
Language Constructs You Should Not Use
======================================
While Python has relatively few gotchas compared to other languages, it still
has some constructs which are only useful in corner cases, or are plain
dangerous.
from module import \*
---------------------
Inside Function Definitions
^^^^^^^^^^^^^^^^^^^^^^^^^^^
``from module import *`` is *invalid* inside function definitions. While many
versions of Python do not check for the invalidity, that does not make it more
valid, no more than having a smart lawyer makes a man innocent. Do not use it
like that ever. Even in versions where it was accepted, it made the function
execution slower, because the compiler could not be certain which names are
local and which are global. In Python 2.1 this construct causes warnings, and
sometimes even errors; in Python 3 it is always a :exc:`SyntaxError`.
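
As a quick check, the sketch below compiles a function body that contains a star
import. It assumes a Python 3 interpreter, where the compiler is expected to
reject the construct outright; the source string and the ``<example>`` file name
are made up for illustration.

```python
# Source text of a function that uses a star import in its body.
source = "def f():\n    from math import *\n"

try:
    compile(source, "<example>", "exec")
    accepted = True
except SyntaxError:
    # Python 3 refuses the construct at compile time.
    accepted = False

print(accepted)
```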
At Module Level
^^^^^^^^^^^^^^^
While it is valid to use ``from module import *`` at module level it is usually
a bad idea. For one, this loses an important property Python otherwise has ---
you can know where each toplevel name is defined by a simple "search" function
in your favourite editor. You also open yourself to trouble in the future, if
some module grows additional functions or classes.
One of the most awful questions asked on the newsgroup is why this code::

   f = open("www")
   f.read()

does not work. Of course, it works just fine (assuming you have a file called
"www"). But it does not work if somewhere in the module, the statement ``from os
import *`` is present. The :mod:`os` module has a function called :func:`open`
which returns an integer. While it is very useful, shadowing builtins is one of
its least useful properties.
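
The shadowing can be observed directly. This sketch runs the star import inside
a scratch namespace (the ``namespace`` dict is only an illustration device) and
compares the resulting ``open`` with the builtin one.

```python
import builtins
import os

# Execute the star import in an isolated namespace so nothing else is affected.
namespace = {}
exec("from os import *", namespace)

# After the import, the name "open" refers to os.open, not the builtin.
shadowed = namespace["open"] is os.open
same_as_builtin = namespace["open"] is builtins.open
print(shadowed, same_as_builtin)
```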
Remember, you can never know for sure what names a module exports, so either
take what you need --- ``from module import name1, name2``, or keep them in the
module and access on a per-need basis --- ``import module; print(module.name)``.
When It Is Just Fine
^^^^^^^^^^^^^^^^^^^^
There are situations in which ``from module import *`` is just fine:

* The interactive prompt. For example, ``from math import *`` makes Python an
  amazing scientific calculator.

* When extending a module in C with a module in Python.

* When the module advertises itself as ``from import *`` safe.
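
One common way for a module to advertise this is to define ``__all__``, which
limits what a star import copies. Whether that is exactly what the last item
means is an assumption. The sketch below builds a throwaway module in memory;
``shapes``, ``area`` and ``debug`` are hypothetical names.

```python
import sys
import types

# Build a hypothetical module in memory instead of writing shapes.py to disk.
shapes = types.ModuleType("shapes")
exec(
    "__all__ = ['area']\n"
    "def area(w, h):\n"
    "    return w * h\n"
    "def debug():\n"
    "    return 'internal'\n",
    shapes.__dict__,
)
sys.modules["shapes"] = shapes

# Run the star import in a scratch namespace to see which names it copies.
namespace = {}
exec("from shapes import *", namespace)
print(sorted(name for name in namespace if not name.startswith("__")))
```

Only the names listed in ``__all__`` are copied, so ``debug`` stays behind.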
from module import name1, name2
-------------------------------
This is a "don't" which is much weaker than the previous "don't"s but is still
something you should not do if you don't have good reasons to do that. The
reason it is usually a bad idea is because you suddenly have an object which lives
in two separate namespaces. When the binding in one namespace changes, the
binding in the other will not, so there will be a discrepancy between them. This
happens when, for example, one module is reloaded, or changes the definition of
a function at runtime.
Bad example::

   # foo.py
   a = 1

   # bar.py
   from foo import a
   if something():
       a = 2 # danger: foo.a != a

Good example::

   # foo.py
   a = 1

   # bar.py
   import foo
   if something():
       foo.a = 2
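
The discrepancy is easy to reproduce in a single file. In this sketch, an
in-memory module object stands in for ``foo.py`` (an assumption made to keep the
example self-contained), and a plain assignment mimics what ``from foo import a``
does.

```python
import types

# Hypothetical stand-in for foo.py, built in memory.
foo = types.ModuleType("foo")
foo.a = 1

# "from foo import a" binds a second name to the same object...
a = foo.a

# ...so rebinding the module attribute later leaves the local name behind.
foo.a = 2
print(a, foo.a)  # prints: 1 2
```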
except:
-------
Python has the ``except:`` clause, which catches all exceptions. Since *every*
error in Python raises an exception, this makes many programming errors look
like runtime problems, and hinders the debugging process.
The following code is a good example of the problem::

   import sys

   try:
       foo = opne("file") # misspelled "open"
   except:
       sys.exit("could not open file!")
The second line triggers a :exc:`NameError` which is caught by the except
clause. The program will exit, and you will have no idea that this has nothing
to do with the readability of ``"file"``.
The example above is better written ::

   try:
       foo = opne("file") # will be changed to "open" as soon as we run it
   except IOError:
       sys.exit("could not open file")
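
To see the difference a narrower clause makes, the sketch below keeps the typo
and catches only :exc:`OSError` (of which :exc:`IOError` is an alias in Python
3). ``load`` is a hypothetical helper; the outer ``try`` exists only so that the
example can report which exception got through.

```python
def load(path):
    try:
        return opne(path)  # deliberately misspelled "open"
    except OSError:
        # Only genuine I/O failures are handled here.
        return None

try:
    load("file")
    caught = None
except NameError as exc:
    # The typo is not swallowed; it surfaces as a NameError.
    caught = type(exc).__name__

print(caught)
```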
There are some situations in which the ``except:`` clause is useful: for
example, in a framework when running callbacks, it is good not to let any
callback disturb the framework.
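
A minimal sketch of such a framework loop follows. ``run_callbacks`` is a
hypothetical name, and the sketch catches :exc:`Exception` rather than using a
bare ``except:``, an assumed refinement that still lets
:exc:`KeyboardInterrupt` and :exc:`SystemExit` through while logging the
traceback instead of discarding it.

```python
import logging

def run_callbacks(callbacks):
    """Call every callback; a failing one is logged and yields None."""
    results = []
    for callback in callbacks:
        try:
            results.append(callback())
        except Exception:
            # Record the full traceback, then carry on with the next callback.
            logging.exception("callback %r failed", callback)
            results.append(None)
    return results

print(run_callbacks([lambda: 1, lambda: 1 / 0, lambda: 3]))
```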
Exceptions
==========
Exceptions are a useful feature of Python. You should learn to raise them
whenever something unexpected occurs, and catch them only where you can do
something about them.
The following is a very popular anti-idiom ::

   import os
   import sys

   def get_status(file):
       if not os.path.exists(file):
           print("file not found")
           sys.exit(1)
       return open(file).readline()
Consider the case where the file gets deleted between the time the call to
:func:`os.path.exists` is made and the time :func:`open` is called. In that case
the last line will raise an :exc:`IOError`. The same would happen if *file*
exists but has no read permission. Testing this on a normal machine with
existing and non-existing files makes the code seem bug-free, so the test
results will look fine and the code will get shipped. Then an unhandled
:exc:`IOError` escapes to the user, who has to watch the ugly traceback.
Here is a better way to do it. ::

   import sys

   def get_status(file):
       try:
           return open(file).readline()
       except (IOError, OSError):
           print("file not found")
           sys.exit(1)
In this version, *either* the file gets opened and the line is read (so it
works even on flaky NFS or SMB connections), or the message is printed and the
application is aborted.
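
To see why attempting the operation is sturdier than checking
:func:`os.path.exists` first, consider a path that exists but cannot be opened
as a file. The sketch below uses a temporary directory as an assumed stand-in
for such a path, which avoids having to stage a real race.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as path:
    # The pre-check passes: the path does exist.
    exists = os.path.exists(path)
    try:
        open(path).readline()
        opened = True
    except OSError:
        # Opening a directory fails even though the check succeeded.
        opened = False

print(exists, opened)
```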
Still, :func:`get_status` makes too many assumptions --- that it will only be
used in a short running script, and not, say, in a long running server. Sure,
the caller could do something like ::

   try:
       status = get_status(log)
   except SystemExit:
       status = None

So, try to use as few ``except`` clauses as possible in your code --- those you
do write will usually be a catch-all in :func:`main`, or inside calls which
should always succeed.

So, the best version is probably ::

   def get_status(file):
       return open(file).readline()
The caller can deal with the exception if it wants (for example, if it tries
several files in a loop), or just let the exception filter upwards to *its*
caller.
The last version is not very good either --- due to implementation details, the
file would not be closed when an exception is raised until the handler finishes,
and perhaps not at all in non-C implementations (e.g., Jython). Closing the file
explicitly avoids this::

   def get_status(file):
       fp = open(file)
       try:
           return fp.readline()
       finally:
           fp.close()
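
In current Python the same guarantee is usually written with the ``with``
statement. The sketch below does that and adds ``first_status``, a hypothetical
caller that tries several files in a loop and decides for itself what a missing
file means.

```python
def get_status(file):
    # The with statement closes the file on return and on exception alike.
    with open(file) as fp:
        return fp.readline()

def first_status(candidates):
    """Return the first line of the first readable file, or None."""
    for path in candidates:
        try:
            return get_status(path)
        except OSError:
            # This file is missing or unreadable; try the next one.
            continue
    return None
```

Here :func:`get_status` stays free of policy, and the decision to skip
unreadable files lives in the one caller that needs it.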
Using the Batteries
===================
Every so often, people seem to rewrite things that are already in the Python
library, usually poorly. While the occasional module has a poor interface, it is
usually much better to use the rich standard library and data types that come
with Python than to invent your own.

A useful module very few people know about is :mod:`os.path`. It always has the
correct path arithmetic for your operating system, and will usually be much
better than whatever you come up with yourself.
Compare::

   # ugh!
   return dir+"/"+file
   # better
   return os.path.join(dir, file)
More useful functions in :mod:`os.path`: :func:`basename`, :func:`dirname` and
:func:`splitext`.
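
A short sketch of these helpers working together; the directory and file names
are made up for illustration.

```python
import os.path

# join() inserts the separator that is correct for the current platform.
path = os.path.join("logs", "app.tar.gz")

directory = os.path.dirname(path)   # the part before the last separator
name = os.path.basename(path)       # the part after the last separator
root, ext = os.path.splitext(name)  # splits off only the final extension

print(directory, name, root, ext)
```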
There are also many useful builtin functions people seem not to be aware of for
some reason: :func:`min` and :func:`max` can find the minimum/maximum of any
sequence with comparable semantics, for example, yet many people write their own
:func:`max`/:func:`min`. Another highly useful function is
:func:`functools.reduce`. A classical use of :func:`reduce` is something like
::

   import sys, operator, functools
   nums = list(map(float, sys.argv[1:]))
   print(functools.reduce(operator.add, nums) / len(nums))

This cute little script prints the average of all numbers given on the command
line. The :func:`reduce` adds up all the numbers, and the rest is just some
pre- and postprocessing.
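
For a plain sum, the builtin :func:`sum` does the same job as the
:func:`reduce` call, and :func:`max` accepts a ``key`` function. A small sketch
with made-up data in place of ``sys.argv``:

```python
import functools
import operator

nums = [2.0, 4.0, 9.0]

# The reduce-based average and the equivalent spelling with sum().
via_reduce = functools.reduce(operator.add, nums) / len(nums)
via_sum = sum(nums) / len(nums)

# max() with a key function: the longest word rather than the largest one.
words = ["pear", "fig", "banana"]
longest = max(words, key=len)

print(via_reduce, via_sum, min(nums), longest)
```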
On a related note, :func:`float` and :func:`int` accept arguments of
type string, and so are suited to parsing --- assuming you are ready to deal
with the :exc:`ValueError` they raise.
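
A minimal sketch of parsing with :func:`float` while handling the
:exc:`ValueError`; ``parse_numbers`` is a hypothetical helper name.

```python
def parse_numbers(tokens):
    """Split tokens into parsed floats and the strings that failed to parse."""
    numbers, rejected = [], []
    for token in tokens:
        try:
            numbers.append(float(token))
        except ValueError:
            # float() raises ValueError for text that is not a number.
            rejected.append(token)
    return numbers, rejected

print(parse_numbers(["1", "2.5", "abc"]))
```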
Using Backslash to Continue Statements
======================================
Since Python treats a newline as a statement terminator, and since statements
are often longer than is comfortable to put on one line, many people do::

   if foo.bar()['first'][0] == baz.quux(1, 2)[5:9] and \
      calculate_number(10, 20) != forbulate(500, 360):
       pass
You should realize that this is dangerous: a stray space after the ``\`` would
make this line wrong, and stray spaces are notoriously hard to see in editors.
In this case, at least it would be a syntax error. Backslash continuations can
also fail silently; if the code was::

   value = foo.bar()['first'][0]*baz.quux(1, 2)[5:9] \
           + calculate_number(10, 20)*forbulate(500, 360)

and the backslash were lost in an edit with the second line left unindented,
each line would still be a valid statement, and the result would just be subtly
wrong.
It is usually much better to use the implicit continuation inside parentheses.
This version is bulletproof::

   value = (foo.bar()['first'][0]*baz.quux(1, 2)[5:9]
           + calculate_number(10, 20)*forbulate(500, 360))
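
The silent failure mode can be sketched with two small numbers standing in for
the long expressions (``a`` and ``b`` are placeholders). The first assignment
shows what remains when a backslash continuation is lost and the second line
sits at the margin; the parenthesized form cannot break that way.

```python
a, b = 10, 5

# Continuation lost: these are now two separate, valid statements.
value_broken = a
+ b  # a bare expression statement; its result is discarded

# Implicit continuation inside parentheses keeps the expression whole.
value = (a
         + b)

print(value_broken, value)  # prints: 10 15
```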