cpython

Commit Graph

Author	SHA1	Message	Date
Miss Islington (bot)	28c179094b	bpo-33224: PEP 479 fix for difflib.mdiff() (GH-6381) (GH-6390)	2018-04-05 11:45:33 -07:00
Miss Islington (bot)	0902a2d6b2	bpo-32981: Fix catastrophic backtracking vulns (GH-5955) * Prevent low-grade poplib REDOS (CVE-2018-1060) The regex to test a mail server's timestamp is susceptible to catastrophic backtracking on long evil responses from the server. Happily, the maximum length of malicious inputs is 2K thanks to a limit introduced in the fix for CVE-2013-1752. A 2KB evil response from the mail server would result in small slowdowns (milliseconds vs. microseconds) accumulated over many apop calls. This is a potential DOS vector via accumulated slowdowns. Replace it with a similar non-vulnerable regex. The new regex is RFC compliant. The old regex was non-compliant in edge cases. * Prevent difflib REDOS (CVE-2018-1061) The default regex for IS_LINE_JUNK is susceptible to catastrophic backtracking. This is a potential DOS vector. Replace it with an equivalent non-vulnerable regex. Also introduce unit and REDOS tests for difflib. Co-authored-by: Tim Peters <tim.peters@gmail.com> Co-authored-by: Christian Heimes <christian@python.org> Co-authored-by: Jamie Davis <davisjam@vt.edu> (cherry picked from commit `0e6c8ee235`)	2018-03-03 21:55:07 -08:00
Raymond Hettinger	15f44ab043	Issue #27895: Spelling fixes (Contributed by Ville Skyttä).	2016-08-30 10:47:49 -07:00
Greg Ward	4d9d2563f5	#17445: difflib: add diff_bytes(), to compare bytes rather than str. Some applications (e.g. traditional Unix diff, version control systems) neither know nor care about the encodings of the files they are comparing. They are textual, but to the diff utility they are just bytes. This worked fine under Python 2, because all of the hardcoded strings in difflib.py are ASCII, so could safely be combined with old-style u'' strings. But it stopped working in 3.x. The solution is to use surrogate escapes for a lossless bytes->str->bytes roundtrip. That means {unified,context}_diff() can continue to just handle strings without worrying about bytes. Callers who have to deal with bytes will need to change to using diff_bytes(). Use case: Mercurial's test runner uses difflib to compare current hg output with known good output. But Mercurial's output is just bytes, since it can contain: * file contents (arbitrary unknown encoding) * filenames (arbitrary unknown encoding) * usernames and commit messages (usually UTF-8, but not guaranteed because old versions of Mercurial did not enforce it) * user messages (locale encoding) Since the output of any given hg command can include text in multiple encodings, it is hopeless to try to treat it as decodable Unicode text. It's just bytes, all the way down. This is an elaboration of a patch by Terry Reedy.	2015-04-20 20:21:21 -04:00
Berker Peksag	102029dfd6	Issue #2052: Add charset parameter to HtmlDiff.make_file().	2015-03-15 01:18:47 +02:00
Raymond Hettinger	fabefc3c5b	Issue 21635: Fix caching in difflib.SequenceMatcher.get_matching_blocks().	2014-06-21 11:57:36 -07:00
Raymond Hettinger	9180deb59c	Issue 11747: Fix output format for context diffs.	2011-04-12 15:25:30 -07:00
Raymond Hettinger	49353d0e8f	Issue #11747: Fix range formatting in context and unified diffs.	2011-04-11 12:40:58 -07:00
Terry Reedy	17a59252e8	Issue 10534, difflib: tweak doc; test new SequenceMatcher instance attributes; avoid unneeded lists of SM.b2j keys and items in .__chain_b. Do not backport.	2010-12-15 20:18:10 +00:00
Terry Reedy	99f9637de8	Issue 2986: Add autojunk parameter to SequenceMatcher to turn off heuristic. Patch by Terry Reedy, Eli Bendersky, and Simon Cross.	2010-11-25 06:12:34 +00:00
R. David Murray	b2416e54b1	Merged revisions 80004 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk ........ r80004 | r.david.murray | 2010-04-12 12:35:19 -0400 (Mon, 12 Apr 2010) | 13 lines Issue #7585: use tab between components in unified and context diff headers. Instead of spaces between the filename and date (or whatever the string is that follows the filename, if any) use tabs. This is what the unix 'diff' command does, for example, and difflib was intended to follow the 'standard' way of doing diffs. This improves compatibility with patch tools. The docs and examples are also changed to recommend that the date format used be the ISO 8601 format, which is what modern diff tools emit by default. Patch by Anatoly Techtonik. ........	2010-04-12 16:58:02 +00:00
Senthil Kumaran	758025cb1f	Merged revisions 76464 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk ........ r76464 | senthil.kumaran | 2009-11-24 00:11:31 +0530 (Tue, 24 Nov 2009) | 4 lines Fix for issue1488943 - difflib.Differ() doesn't always add hints for tab characters. ........	2009-11-23 19:02:52 +00:00
Philip Jenvey	a27c5bd2bd	Merged revisions 72979 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk ........ r72979 | philip.jenvey | 2009-05-27 22:58:44 -0700 (Wed, 27 May 2009) | 2 lines explicitly close files ........	2009-05-28 06:09:08 +00:00
Benjamin Peterson	ee8712cda4	#2621 rename test.test_support to test.support	2008-05-20 21:35:26 +00:00
Georg Brandl	a18af4e7a2	PEP 3114: rename .next() to .__next__() and add next() builtin.	2007-04-21 15:47:16 +00:00
Thomas Wouters	49fd7fa443	Merge p3yk branch with the trunk up to revision 45595. This breaks a fair number of tests, all because of the codecs/_multibytecodecs issue described here (it's not a Py3K issue, just something Py3K discovers): http://mail.python.org/pipermail/python-dev/2006-April/064051.html Hye-Shik Chang promised to look for a fix, so no need to fix it here. The tests that are expected to break are: test_codecencodings_cn test_codecencodings_hk test_codecencodings_jp test_codecencodings_kr test_codecencodings_tw test_codecs test_multibytecodec This merge fixes an actual test failure (test_weakref) in this branch, though, so I believe merging is the right thing to do anyway.	2006-04-21 10:40:58 +00:00
Gustavo Niemeyer	548148810b	Patch #1413711: Certain patterns of differences were making difflib touch the recursion limit. The applied patch inlines the recursive __helper method in a non-recursive way.	2006-01-31 18:34:13 +00:00
Tim Peters	48bd7f3a71	Whitespace normalization. test_difflib passes again.	2004-08-29 22:38:38 +00:00
Tim Peters	afb5f94217	Reverting whitespace normalization. test_difflib fails with it -- the test depends on invisible trailing whitespace in .py files. The author will have to repair that.	2004-08-29 19:33:36 +00:00
Tim Peters	45e77c55ff	Whitespace normalization.	2004-08-29 18:47:31 +00:00
Martin v. Löwis	e064b41f5a	Patch #914575: difflib side by side diff support, diff.py s/b/s HTML option.	2004-08-29 16:34:40 +00:00
Brett Cannon	d2c5b4b549	SequenceMatcher(None, [], []).get_grouped_opcodes() now returns a generator that behaves as if both lists have an empty string in each of them. Closes bug #979794 (and duplicate bug #980117).	2004-07-10 23:54:07 +00:00
Tim Peters	58eb11cf62	Whitespace normalization.	2004-01-18 20:29:55 +00:00
Raymond Hettinger	43d790c087	Exercise Jim Fulton's new doctest extension for running doctests in a unittest environment. Since his extension finds docstrings in private functions, it exposed a bug in the difflib doctests.	2003-07-16 04:34:56 +00:00
Neal Norwitz	e7dfe21bed	Fix SF bug #763023, difflib.py: ratio() zero division not caught. Backport candidate.	2003-07-01 14:59:46 +00:00
Barry Warsaw	04f357cffe	Get rid of relative imports in all unittests. Now anything that imports e.g. test_support must do so using an absolute package name such as "import test.test_support" or "from test import test_support". This also updates the README in Lib/test, and gets rid of the duplicate data directory in Lib/test/data (replaced by Lib/email/test/data). Now Tim and Jack can have at it. :)	2002-07-23 19:04:11 +00:00
Tim Peters	a0a6222509	Teach regrtest how to pass on doctest failure msgs. This is done via a horridly inefficient hack in regrtest's Compare class, but it's about as clean as can be: regrtest has to set up the Compare instance before importing a test module, and by the time the module is imported it's too late to change that decision. The good news is that the more tests we convert to unittest and doctest, the less the inefficiency here matters. Even now there are few tests with large expected-output files (the new cost here is a Python-level call per .write() when there's an expected-output file).	2001-09-09 06:12:01 +00:00
Tim Peters	f5f6c436c6	Remove test_doctest's expected-output file. Change test_doctest and test_difflib to pass regrtest's notion of verbosity on to doctest. Add explanation for a dozen "new" things to test/README.	2001-05-23 07:46:36 +00:00
Tim Peters	dec4a6143c	Remove test_difflib's output file and change test_difflib to stop generating it. Since this is purely a doctest, the output file never served a good purpose.	2001-05-23 01:45:19 +00:00
Tim Peters	9ae2148ada	Moved SequenceMatcher from ndiff into new std library module difflib.py. Guido told me to do this <wink>. Greatly expanded docstrings, and fleshed out with examples. New std test. Added new get_close_matches() function for ESR. Needs docs, but LaTeXification of the module docstring is all it needs.	2001-02-10 08:00:53 +00:00

30 Commits
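
The diff_bytes() helper described in commit 4d9d2563f5 can be sketched roughly as below. This is a minimal illustration rather than code taken from the commit: the sample lines, file names, and dates are invented, and it assumes Python 3.5 or later, where difflib.diff_bytes is available. The header lines should also show the tab separator between file name and date that commit b2416e54b1 (issue #7585) describes.

```python
import difflib

# Two versions of a file held as raw bytes. The encoding is unknown:
# 0xe9 on its own is not valid UTF-8, so decoding as UTF-8 would fail.
# (Sample data is hypothetical, chosen only for illustration.)
old = [b"caf\xe9\n", b"tea\n"]
new = [b"caf\xe9\n", b"milk\n"]

# diff_bytes() wraps a str-based diff function (here unified_diff) and
# round-trips the bytes through surrogate escapes, so every argument,
# including file names and dates, must be bytes.
diff = list(
    difflib.diff_bytes(
        difflib.unified_diff,
        old,
        new,
        fromfile=b"old.txt",
        tofile=b"new.txt",
        fromfiledate=b"2015-04-20",
        tofiledate=b"2015-04-21",
    )
)

for line in diff:
    # Each yielded line is bytes; repr() keeps the undecodable byte visible.
    print(repr(line))
```

Passing str for any of these arguments raises TypeError, which is the intended guard against mixing text and bytes in one diff.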