:mod:`filecmp` --- File and Directory Comparisons
=================================================

.. module:: filecmp
   :synopsis: Compare files efficiently.

.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il>


The :mod:`filecmp` module defines functions to compare files and directories,
with various optional time/correctness trade-offs. For comparing files,
see also the :mod:`difflib` module.

The :mod:`filecmp` module defines the following functions:

.. function:: cmp(f1, f2[, shallow])

   Compare the files named *f1* and *f2*, returning ``True`` if they seem equal,
   ``False`` otherwise.

   Unless *shallow* is given and is false, files with identical :func:`os.stat`
   signatures (file type, size, and modification time) are taken to be equal.

   Files that were compared using this function will not be compared again unless
   their :func:`os.stat` signature changes.

   Note that no external programs are called from this function, giving it
   portability and efficiency.

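
The effect of *shallow* can be illustrated with a minimal sketch. It assumes
two hypothetical scratch files, ``first.txt`` and ``second.txt``, that have the
same size and are forced to the same modification time but hold different
contents; the file names and the timestamp are illustrative only.

```python
import filecmp
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file names, used only for this illustration.
    first = os.path.join(tmp, "first.txt")
    second = os.path.join(tmp, "second.txt")
    with open(first, "w") as f:
        f.write("spam")
    with open(second, "w") as f:
        f.write("eggs")

    # Force identical modification times so that both os.stat() signatures
    # (file type, size, modification time) are the same.
    os.utime(first, (1_000_000_000, 1_000_000_000))
    os.utime(second, (1_000_000_000, 1_000_000_000))

    # Default (shallow) comparison: matching signatures are taken as equal.
    shallow_result = filecmp.cmp(first, second)
    # Passing a false value for shallow makes cmp() read the file contents.
    deep_result = filecmp.cmp(first, second, shallow=False)

print(shallow_result, deep_result)
```

Under these assumptions the shallow comparison reports the files as equal even
though their contents differ, which is the time/correctness trade-off mentioned
above.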

.. function:: cmpfiles(dir1, dir2, common[, shallow])

   Compare the files in the two directories *dir1* and *dir2* whose names are
   given by *common*.

   Returns three lists of file names: *match*, *mismatch*, and *errors*.
   *match* lists the files that match, *mismatch* lists those that don't, and
   *errors* lists the files that could not be compared. Files are listed in
   *errors* if they don't exist in one of the directories, if the user lacks
   permission to read them, or if the comparison could not be done for some
   other reason.

   The *shallow* parameter has the same meaning and default value as for
   :func:`filecmp.cmp`.

   For example, ``cmpfiles('a', 'b', ['c', 'd/e'])`` will compare ``a/c`` with
   ``b/c`` and ``a/d/e`` with ``b/d/e``. ``'c'`` and ``'d/e'`` will each be in
   one of the three returned lists.

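
As a rough sketch of how the three lists returned by :func:`cmpfiles` are
filled, the following builds two temporary directories. The directory names,
file names, and the ``write`` helper are hypothetical and exist only for this
illustration.

```python
import filecmp
import os
import tempfile


def write(directory, name, text):
    """Hypothetical helper: create directory/name containing text."""
    with open(os.path.join(directory, name), "w") as f:
        f.write(text)


with tempfile.TemporaryDirectory() as tmp:
    dir1 = os.path.join(tmp, "a")
    dir2 = os.path.join(tmp, "b")
    os.mkdir(dir1)
    os.mkdir(dir2)

    write(dir1, "same.txt", "identical")
    write(dir2, "same.txt", "identical")
    write(dir1, "changed.txt", "old")
    write(dir2, "changed.txt", "newer")
    write(dir1, "missing.txt", "only in a")  # no counterpart in dir2

    # Compare by content rather than by os.stat() signature.
    match, mismatch, errors = filecmp.cmpfiles(
        dir1, dir2, ["same.txt", "changed.txt", "missing.txt"], shallow=False
    )

print(match, mismatch, errors)
```

Each name passed in *common* ends up in exactly one list; here the file that
exists in only one directory is reported in *errors* rather than raising an
exception.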
Example::

   >>> import filecmp
   >>> filecmp.cmp('undoc.rst', 'undoc.rst')
   True
   >>> filecmp.cmp('undoc.rst', 'index.rst')
   False


.. _dircmp-objects:

The :class:`dircmp` class
-------------------------

:class:`dircmp` instances are built using this constructor:


.. class:: dircmp(a, b[, ignore[, hide]])

   Construct a new directory comparison object, to compare the directories *a* and
   *b*. *ignore* is a list of names to ignore, and defaults to ``['RCS', 'CVS',
   'tags']``. *hide* is a list of names to hide, and defaults to ``[os.curdir,
   os.pardir]``.

   The :class:`dircmp` class provides the following methods:

   .. method:: report()

      Print (to ``sys.stdout``) a comparison between *a* and *b*.

   .. method:: report_partial_closure()

      Print a comparison between *a* and *b* and common immediate
      subdirectories.

   .. method:: report_full_closure()

      Print a comparison between *a* and *b* and common subdirectories
      (recursively).

   The :class:`dircmp` class offers a number of attributes that can be used
   to get information about the directory trees being compared.

   Note that via :meth:`__getattr__` hooks, all attributes are computed lazily,
   so there is no speed penalty if only those attributes which are lightweight
   to compute are used.

   .. attribute:: left_list

      Files and subdirectories in *a*, filtered by *hide* and *ignore*.

   .. attribute:: right_list

      Files and subdirectories in *b*, filtered by *hide* and *ignore*.

   .. attribute:: common

      Files and subdirectories in both *a* and *b*.

   .. attribute:: left_only

      Files and subdirectories only in *a*.

   .. attribute:: right_only

      Files and subdirectories only in *b*.

   .. attribute:: common_dirs

      Subdirectories in both *a* and *b*.

   .. attribute:: common_files

      Files in both *a* and *b*.

   .. attribute:: common_funny

      Names in both *a* and *b*, such that the type differs between the
      directories, or names for which :func:`os.stat` reports an error.

   .. attribute:: same_files

      Files which are identical in both *a* and *b*.

   .. attribute:: diff_files

      Files which are in both *a* and *b*, whose contents differ.

   .. attribute:: funny_files

      Files which are in both *a* and *b*, but could not be compared.

   .. attribute:: subdirs

      A dictionary mapping names in :attr:`common_dirs` to :class:`dircmp` objects.

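
The attributes and the :meth:`report` method described above can be combined
as in the following sketch. The directory layout, the file names, and the
helpers ``write`` and ``changed_files`` are hypothetical; ``changed_files`` is
one possible way to walk :attr:`subdirs` recursively, not part of the module.

```python
import contextlib
import filecmp
import io
import os
import tempfile


def write(path, text):
    """Hypothetical helper: create path (and parent directories) with text."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(text)


def changed_files(comparison, prefix=""):
    """Recursively collect relative paths of files listed in diff_files."""
    found = [os.path.join(prefix, name) for name in comparison.diff_files]
    for name, sub in sorted(comparison.subdirs.items()):
        found.extend(changed_files(sub, os.path.join(prefix, name)))
    return found


with tempfile.TemporaryDirectory() as tmp:
    a = os.path.join(tmp, "a")
    b = os.path.join(tmp, "b")
    write(os.path.join(a, "same.txt"), "identical")
    write(os.path.join(b, "same.txt"), "identical")
    write(os.path.join(a, "old.txt"), "only in a")
    write(os.path.join(b, "new.txt"), "only in b")
    write(os.path.join(a, "pkg", "mod.py"), "x = 1")
    write(os.path.join(b, "pkg", "mod.py"), "x = 22")

    comparison = filecmp.dircmp(a, b)

    # Each attribute is computed on first access.
    left_only = comparison.left_only
    right_only = comparison.right_only
    common_dirs = comparison.common_dirs
    same_files = comparison.same_files
    changed = changed_files(comparison)

    # report() prints to sys.stdout, so capture it for inspection.
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        comparison.report()
    report_text = buffer.getvalue()

print(left_only, right_only, common_dirs, same_files, changed)
print(report_text)
```

Because :attr:`subdirs` maps each common subdirectory to another
:class:`dircmp` object, the same attributes are available at every level of
the tree, which is what makes the recursive helper possible.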