Issue #13237: Forward port subprocess module updates and explicitly document UTF-8 encoding assumption when universal_newlines=True

This commit is contained in:
Nick Coghlan 2011-11-08 20:49:23 +10:00
parent 23760e97ed
commit c29248f232
2 changed files with 306 additions and 137 deletions

View File

@ -25,7 +25,227 @@ modules and functions can be found in the following sections.
Using the subprocess Module
---------------------------
This module defines one class called :class:`Popen`:
The recommended approach to invoking subprocesses is to use the following
convenience functions for all use cases they can handle. For more advanced
use cases, the underlying :class:`Popen` interface can be used directly.
.. function:: call(args, *, stdin=None, stdout=None, stderr=None, shell=False)
Run the command described by *args*. Wait for command to complete, then
return the :attr:`returncode` attribute.
The arguments shown above are merely the most common ones, described below
in :ref:`frequently-used-arguments` (hence the slightly odd notation in
the abbreviated signature). The full function signature is the same as
that of the :class:`Popen` constructor - this functions passes all
supplied arguments directly through to that interface.
Examples::
>>> subprocess.call(["ls", "-l"])
0
>>> subprocess.call("exit 1", shell=True)
1
.. warning::
Invoking the system shell with ``shell=True`` can be a security hazard
if combined with untrusted input. See the warning under
:ref:`frequently-used-arguments` for details.
.. note::
Do not use ``stdout=PIPE`` or ``stderr=PIPE`` with this function. As
the pipes are not being read in the current process, the child
process may block if it generates enough output to a pipe to fill up
the OS pipe buffer.
.. function:: check_call(args, *, stdin=None, stdout=None, stderr=None, shell=False)
Run command with arguments. Wait for command to complete. If the return
code was zero then return, otherwise raise :exc:`CalledProcessError`. The
:exc:`CalledProcessError` object will have the return code in the
:attr:`returncode` attribute.
The arguments shown above are merely the most common ones, described below
in :ref:`frequently-used-arguments` (hence the slightly odd notation in
the abbreviated signature). The full function signature is the same as
that of the :class:`Popen` constructor - this functions passes all
supplied arguments directly through to that interface.
Examples::
>>> subprocess.check_call(["ls", "-l"])
0
>>> subprocess.check_call("exit 1", shell=True)
Traceback (most recent call last):
...
subprocess.CalledProcessError: Command 'exit 1' returned non-zero exit status 1
.. versionadded:: 2.5
.. warning::
Invoking the system shell with ``shell=True`` can be a security hazard
if combined with untrusted input. See the warning under
:ref:`frequently-used-arguments` for details.
.. note::
Do not use ``stdout=PIPE`` or ``stderr=PIPE`` with this function. As
the pipes are not being read in the current process, the child
process may block if it generates enough output to a pipe to fill up
the OS pipe buffer.
.. function:: check_output(args, *, stdin=None, stderr=None, shell=False, universal_newlines=False)
Run command with arguments and return its output as a byte string.
If the return code was non-zero it raises a :exc:`CalledProcessError`. The
:exc:`CalledProcessError` object will have the return code in the
:attr:`returncode` attribute and any output in the :attr:`output`
attribute.
The arguments shown above are merely the most common ones, described below
in :ref:`frequently-used-arguments` (hence the slightly odd notation in
the abbreviated signature). The full function signature is largely the
same as that of the :class:`Popen` constructor, except that *stdout* is
not permitted as it is used internally. All other supplied arguments are
passed directly through to the :class:`Popen` constructor.
Examples::
>>> subprocess.check_output(["echo", "Hello World!"])
b'Hello World!\n'
>>> subprocess.check_output(["echo", "Hello World!"], universal_newlines=True)
'Hello World!\n'
>>> subprocess.check_output("exit 1", shell=True)
Traceback (most recent call last):
...
subprocess.CalledProcessError: Command 'exit 1' returned non-zero exit status 1
By default, this function will return the data as encoded bytes. The actual
encoding of the output data may depend on the command being invoked, so the
decoding to text will often need to be handled at the application level.
This behaviour may be overridden by setting *universal_newlines* to
:const:`True` as described below in :ref:`frequently-used-arguments`.
To also capture standard error in the result, use
``stderr=subprocess.STDOUT``::
>>> subprocess.check_output(
... "ls non_existent_file; exit 0",
... stderr=subprocess.STDOUT,
... shell=True)
'ls: non_existent_file: No such file or directory\n'
.. versionadded:: 2.7
.. warning::
Invoking the system shell with ``shell=True`` can be a security hazard
if combined with untrusted input. See the warning under
:ref:`frequently-used-arguments` for details.
.. note::
Do not use ``stderr=PIPE`` with this function. As the pipe is not being
read in the current process, the child process may block if it
generates enough output to the pipe to fill up the OS pipe buffer.
.. data:: PIPE
Special value that can be used as the *stdin*, *stdout* or *stderr* argument
to :class:`Popen` and indicates that a pipe to the standard stream should be
opened.
.. data:: STDOUT
Special value that can be used as the *stderr* argument to :class:`Popen` and
indicates that standard error should go into the same handle as standard
output.
.. _frequently-used-arguments:
Frequently Used Arguments
^^^^^^^^^^^^^^^^^^^^^^^^^
To support a wide variety of use cases, the :class:`Popen` constructor (and
the convenience functions) accept a large number of optional arguments. For
most typical use cases, many of these arguments can be safely left at their
default values. The arguments that are most commonly needed are:
*args* is required for all calls and should be a string, or a sequence of
program arguments. Providing a sequence of arguments is generally
preferred, as it allows the module to take care of any required escaping
and quoting of arguments (e.g. to permit spaces in file names). If passing
a single string, either *shell* must be :const:`True` (see below) or else
the string must simply name the program to be executed without specifying
any arguments.
*stdin*, *stdout* and *stderr* specify the executed program's standard input,
standard output and standard error file handles, respectively. Valid values
are :data:`PIPE`, an existing file descriptor (a positive integer), an
existing file object, and ``None``. :data:`PIPE` indicates that a new pipe
to the child should be created. With the default settings of ``None``, no
redirection will occur; the child's file handles will be inherited from the
parent. Additionally, *stderr* can be :data:`STDOUT`, which indicates that
the stderr data from the child process should be captured into the same file
handle as for stdout.
When *stdout* or *stderr* are pipes and *universal_newlines* is
:const:`True` then the output data is assumed to be encoded as UTF-8 and
will automatically be decoded to text. All line endings will be converted
to ``'\n'`` as described for the universal newlines `'U'`` mode argument
to :func:`open`.
If *shell* is :const:`True`, the specified command will be executed through
the shell. This can be useful if you are using Python primarily for the
enhanced control flow it offers over most system shells and still want
access to other shell features such as filename wildcards, shell pipes and
environment variable expansion.
.. warning::
Executing shell commands that incorporate unsanitized input from an
untrusted source makes a program vulnerable to `shell injection
<http://en.wikipedia.org/wiki/Shell_injection#Shell_injection>`_,
a serious security flaw which can result in arbitrary command execution.
For this reason, the use of *shell=True* is **strongly discouraged** in cases
where the command string is constructed from external input::
>>> from subprocess import call
>>> filename = input("What file would you like to display?\n")
What file would you like to display?
non_existent; rm -rf / #
>>> call("cat " + filename, shell=True) # Uh-oh. This will end badly...
``shell=False`` disables all shell based features, but does not suffer
from this vulnerability; see the Note in the :class:`Popen` constructor
documentation for helpful hints in getting ``shell=False`` to work.
These options, along with all of the other options, are described in more
detail in the :class:`Popen` constructor documentation.
Popen Constuctor
^^^^^^^^^^^^^^^^
The underlying process creation and management in this module is handled by
the :class:`Popen` class. It offers a lot of flexibility so that developers
are able to handle the less common cases not covered by the convenience
functions.
.. class:: Popen(args, bufsize=0, executable=None, stdin=None, stdout=None, stderr=None, preexec_fn=None, close_fds=True, shell=False, cwd=None, env=None, universal_newlines=False, startupinfo=None, creationflags=0, restore_signals=True, start_new_session=False, pass_fds=())
@ -78,21 +298,9 @@ This module defines one class called :class:`Popen`:
.. warning::
Executing shell commands that incorporate unsanitized input from an
untrusted source makes a program vulnerable to `shell injection
<http://en.wikipedia.org/wiki/Shell_injection#Shell_injection>`_,
a serious security flaw which can result in arbitrary command execution.
For this reason, the use of *shell=True* is **strongly discouraged** in cases
where the command string is constructed from external input::
>>> from subprocess import call
>>> filename = input("What file would you like to display?\n")
What file would you like to display?
non_existent; rm -rf / #
>>> call("cat " + filename, shell=True) # Uh-oh. This will end badly...
*shell=False* does not suffer from this vulnerability; the above Note may be
helpful in getting code using *shell=False* to work.
Enabling this option can be a security hazard if combined with untrusted
input. See the warning under :ref:`frequently-used-arguments`
for details.
On Windows: the :class:`Popen` class uses CreateProcess() to execute the
child program, which operates on strings. If *args* is a sequence, it will
@ -121,14 +329,15 @@ This module defines one class called :class:`Popen`:
You don't need ``shell=True`` to run a batch file, nor to run a console-based
executable.
*stdin*, *stdout* and *stderr* specify the executed programs' standard input,
*stdin*, *stdout* and *stderr* specify the executed program's standard input,
standard output and standard error file handles, respectively. Valid values
are :data:`PIPE`, an existing file descriptor (a positive integer), an
existing :term:`file object`, and ``None``. :data:`PIPE` indicates that a
new pipe to the child should be created. With ``None``, no redirection will
occur; the child's file handles will be inherited from the parent. Additionally,
*stderr* can be :data:`STDOUT`, which indicates that the stderr data from the
applications should be captured into the same file handle as for stdout.
new pipe to the child should be created. With the default settings of
``None``, no redirection will occur; the child's file handles will be
inherited from the parent. Additionally, *stderr* can be :data:`STDOUT`,
which indicates that the stderr data from the applications should be
captured into the same file handle as for stdout.
If *preexec_fn* is set to a callable object, this object will be called in the
child process just before the child is executed.
@ -242,104 +451,6 @@ This module defines one class called :class:`Popen`:
output.
Convenience Functions
^^^^^^^^^^^^^^^^^^^^^
This module also defines the following shortcut functions:
.. function:: call(*popenargs, **kwargs)
Run command with arguments. Wait for command to complete, then return the
:attr:`returncode` attribute.
The arguments are the same as for the :class:`Popen` constructor. Example::
>>> retcode = subprocess.call(["ls", "-l"])
.. warning::
Like :meth:`Popen.wait`, this will deadlock when using
``stdout=PIPE`` and/or ``stderr=PIPE`` and the child process
generates enough output to a pipe such that it blocks waiting
for the OS pipe buffer to accept more data.
.. function:: check_call(*popenargs, **kwargs)
Run command with arguments. Wait for command to complete. If the exit code was
zero then return, otherwise raise :exc:`CalledProcessError`. The
:exc:`CalledProcessError` object will have the return code in the
:attr:`returncode` attribute.
The arguments are the same as for the :class:`Popen` constructor. Example::
>>> subprocess.check_call(["ls", "-l"])
0
.. warning::
See the warning for :func:`call`.
.. function:: check_output(*popenargs, **kwargs)
Run command with arguments and return its output as a byte string.
If the exit code was non-zero it raises a :exc:`CalledProcessError`. The
:exc:`CalledProcessError` object will have the return code in the
:attr:`returncode`
attribute and output in the :attr:`output` attribute.
The arguments are the same as for the :class:`Popen` constructor. Example::
>>> subprocess.check_output(["ls", "-l", "/dev/null"])
b'crw-rw-rw- 1 root root 1, 3 Oct 18 2007 /dev/null\n'
The stdout argument is not allowed as it is used internally.
To capture standard error in the result, use ``stderr=subprocess.STDOUT``::
>>> subprocess.check_output(
... ["/bin/sh", "-c", "ls non_existent_file; exit 0"],
... stderr=subprocess.STDOUT)
b'ls: non_existent_file: No such file or directory\n'
.. versionadded:: 3.1
.. function:: getstatusoutput(cmd)
Return ``(status, output)`` of executing *cmd* in a shell.
Execute the string *cmd* in a shell with :func:`os.popen` and return a 2-tuple
``(status, output)``. *cmd* is actually run as ``{ cmd ; } 2>&1``, so that the
returned output will contain output or error messages. A trailing newline is
stripped from the output. The exit status for the command can be interpreted
according to the rules for the C function :c:func:`wait`. Example::
>>> subprocess.getstatusoutput('ls /bin/ls')
(0, '/bin/ls')
>>> subprocess.getstatusoutput('cat /bin/junk')
(256, 'cat: /bin/junk: No such file or directory')
>>> subprocess.getstatusoutput('/bin/junk')
(256, 'sh: /bin/junk: not found')
Availability: UNIX.
.. function:: getoutput(cmd)
Return output (stdout and stderr) of executing *cmd* in a shell.
Like :func:`getstatusoutput`, except the exit status is ignored and the return
value is a string containing the command's output. Example::
>>> subprocess.getoutput('ls /bin/ls')
'/bin/ls'
Availability: UNIX.
Exceptions
^^^^^^^^^^
@ -355,16 +466,19 @@ when trying to execute a non-existent file. Applications should prepare for
A :exc:`ValueError` will be raised if :class:`Popen` is called with invalid
arguments.
check_call() will raise :exc:`CalledProcessError`, if the called process returns
a non-zero return code.
:func:`check_call` and :func:`check_output` will raise
:exc:`CalledProcessError` if the called process returns a non-zero return
code.
Security
^^^^^^^^
Unlike some other popen functions, this implementation will never call /bin/sh
implicitly. This means that all characters, including shell metacharacters, can
safely be passed to child processes.
Unlike some other popen functions, this implementation will never call a
system shell implicitly. This means that all characters, including shell
metacharacters, can safely be passed to child processes. Obviously, if the
shell is invoked explicitly, then it is the application's responsibility to
ensure that all whitespace and metacharacters are quoted appropriately.
Popen Objects
@ -592,15 +706,21 @@ The :mod:`subprocess` module exposes the following constants.
Replacing Older Functions with the subprocess Module
----------------------------------------------------
In this section, "a ==> b" means that b can be used as a replacement for a.
In this section, "a becomes b" means that b can be used as a replacement for a.
.. note::
All functions in this section fail (more or less) silently if the executed
program cannot be found; this module raises an :exc:`OSError` exception.
All "a" functions in this section fail (more or less) silently if the
executed program cannot be found; the "b" replacements raise :exc:`OSError`
instead.
In the following examples, we assume that the subprocess module is imported with
"from subprocess import \*".
In addition, the replacements using :func:`check_output` will fail with a
:exc:`CalledProcessError` if the requested operation produces a non-zero
return code. The output is still available as the ``output`` attribute of
the raised exception.
In the following examples, we assume that the relevant functions have already
been imported from the subprocess module.
Replacing /bin/sh shell backquote
@ -609,8 +729,8 @@ Replacing /bin/sh shell backquote
::
output=`mycmd myarg`
==>
output = Popen(["mycmd", "myarg"], stdout=PIPE).communicate()[0]
# becomes
output = check_output(["mycmd", "myarg"])
Replacing shell pipeline
@ -619,7 +739,7 @@ Replacing shell pipeline
::
output=`dmesg | grep hda`
==>
# becomes
p1 = Popen(["dmesg"], stdout=PIPE)
p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE)
p1.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits.
@ -628,22 +748,27 @@ Replacing shell pipeline
The p1.stdout.close() call after starting the p2 is important in order for p1
to receive a SIGPIPE if p2 exits before p1.
Alternatively, for trusted input, the shell's own pipeline support may still
be used directly:
output=`dmesg | grep hda`
# becomes
output=check_output("dmesg | grep hda", shell=True)
Replacing :func:`os.system`
^^^^^^^^^^^^^^^^^^^^^^^^^^^
::
sts = os.system("mycmd" + " myarg")
==>
p = Popen("mycmd" + " myarg", shell=True)
sts = os.waitpid(p.pid, 0)[1]
# becomes
sts = call("mycmd" + " myarg", shell=True)
Notes:
* Calling the program through the shell is usually not required.
* It's easier to look at the :attr:`returncode` attribute than the exit status.
A more realistic example would look like this::
try:
@ -768,6 +893,7 @@ Replacing functions from the :mod:`popen2` module
``close_fds=True`` with :class:`Popen` to guarantee this behavior on
all platforms or past Python versions.
Notes
-----
@ -801,3 +927,43 @@ runtime):
described in rule 3.
Legacy Shell Invocation Functions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This module also provides the following legacy functions from the 2.x
``commands`` module. These operations implicitly invoke the system shell and
none of the guarantees described above regarding security and exception
handling consistency are valid for these functions.
.. function:: getstatusoutput(cmd)
Return ``(status, output)`` of executing *cmd* in a shell.
Execute the string *cmd* in a shell with :func:`os.popen` and return a 2-tuple
``(status, output)``. *cmd* is actually run as ``{ cmd ; } 2>&1``, so that the
returned output will contain output or error messages. A trailing newline is
stripped from the output. The exit status for the command can be interpreted
according to the rules for the C function :c:func:`wait`. Example::
>>> subprocess.getstatusoutput('ls /bin/ls')
(0, '/bin/ls')
>>> subprocess.getstatusoutput('cat /bin/junk')
(256, 'cat: /bin/junk: No such file or directory')
>>> subprocess.getstatusoutput('/bin/junk')
(256, 'sh: /bin/junk: not found')
Availability: UNIX.
.. function:: getoutput(cmd)
Return output (stdout and stderr) of executing *cmd* in a shell.
Like :func:`getstatusoutput`, except the exit status is ignored and the return
value is a string containing the command's output. Example::
>>> subprocess.getoutput('ls /bin/ls')
'/bin/ls'
Availability: UNIX.

View File

@ -259,6 +259,9 @@ Extension Modules
Documentation
-------------
- Issue #13237: Reorganise subprocess documentation to emphasise convenience
functions and the most commonly needed arguments to Popen.
- Issue #13141: Demonstrate recommended style for socketserver examples.