Commit Graph

105 Commits

Author SHA1 Message Date
Terry Jan Reedy 58719a7de7 Merge with 3.3 2014-02-23 23:40:16 -05:00
Terry Jan Reedy f106f8f29c whitespace 2014-02-23 23:39:57 -05:00
Terry Jan Reedy 40f8c6774b Merge with 3.3 2014-02-23 23:33:44 -05:00
Terry Jan Reedy 9dc3a36c84 Issue #9974: When untokenizing, use row info to insert backslash+newline.
Original patches by A. Kuchling and G. Rees (#12691).
2014-02-23 23:33:08 -05:00
Terry Jan Reedy 79bf89986c Merge with 3.3 2014-02-17 23:12:37 -05:00
Terry Jan Reedy 5b8d2c3af7 Issue #8478: Untokenizer.compat now processes first token from iterator input.
Patch based on lines from Georg Brandl, Eric Snow, and Gareth Rees.
2014-02-17 23:12:16 -05:00
Terry Jan Reedy 8c8d77254f Untokenize, bad assert: Merge with 3.3 2014-02-17 16:46:43 -05:00
Terry Jan Reedy 5e6db31368 Untokenize: An logically incorrect assert tested user input validity.
Replace it with correct logic that raises ValueError for bad input.
Issues #8478 and #12691 reported the incorrect logic.
Add an Untokenize test case and an initial test method.
2014-02-17 16:45:48 -05:00
Serhiy Storchaka 7282ff6d5b Issue #18960: Fix bugs with Python source code encoding in the second line.
* The first line of Python script could be executed twice when the source
encoding (not equal to 'utf-8') was specified on the second line.

* Now the source encoding declaration on the second line isn't effective if
the first line contains anything except a comment.

* As a consequence, 'python -x' works now again with files with the source
encoding declarations specified on the second file, and can be used again
to make Python batch files on Windows.

* The tokenize module now ignore the source encoding declaration on the second
line if the first line contains anything except a comment.

* IDLE now ignores the source encoding declaration on the second line if the
first line contains anything except a comment.

* 2to3 and the findnocoding.py script now ignore the source encoding
declaration on the second line if the first line contains anything except
a comment.
2014-01-09 18:41:59 +02:00
Serhiy Storchaka 768c16ce02 Issue #18960: Fix bugs with Python source code encoding in the second line.
* The first line of Python script could be executed twice when the source
encoding (not equal to 'utf-8') was specified on the second line.

* Now the source encoding declaration on the second line isn't effective if
the first line contains anything except a comment.

* As a consequence, 'python -x' works now again with files with the source
encoding declarations specified on the second file, and can be used again
to make Python batch files on Windows.

* The tokenize module now ignore the source encoding declaration on the second
line if the first line contains anything except a comment.

* IDLE now ignores the source encoding declaration on the second line if the
first line contains anything except a comment.

* 2to3 and the findnocoding.py script now ignore the source encoding
declaration on the second line if the first line contains anything except
a comment.
2014-01-09 18:36:09 +02:00
Ezio Melotti 5833c00427 #19620: merge with 3.3. 2013-11-25 05:16:09 +02:00
Ezio Melotti 4bcc796acc #19620: Fix typo in docstring (noticed by Christopher Welborn). 2013-11-25 05:14:51 +02:00
Serhiy Storchaka 935349406a Issue #18873: The tokenize module, IDLE, 2to3, and the findnocoding.py script
now detect Python source code encoding only in comment lines.
2013-09-16 23:57:00 +03:00
Serhiy Storchaka dafea85190 Issue #18873: The tokenize module, IDLE, 2to3, and the findnocoding.py script
now detect Python source code encoding only in comment lines.
2013-09-16 23:51:56 +03:00
Andrew Svetlov f7a17b48d7 Replace IOError with OSError (#16715) 2012-12-25 16:47:37 +02:00
Ezio Melotti fafa8b7797 #16152: merge with 3.2. 2012-11-03 17:46:51 +02:00
Ezio Melotti 2cc3b4ba9f #16152: fix tokenize to ignore whitespace at the end of the code when no newline is found. Patch by Ned Batchelder. 2012-11-03 17:38:43 +02:00
Florent Xicluna fed2c51eea Merge branch 2012-07-07 12:26:56 +02:00
Florent Xicluna 11f0b41e9d Issue #14990: tokenize: correctly fail with SyntaxError on invalid encoding declaration. 2012-07-07 12:13:35 +02:00
Christian Heimes 0b3847de6d Issue #15096: Drop support for the ur string prefix 2012-06-20 11:17:58 +02:00
Meador Inge 8d5c0b8c19 Issue #15054: Fix incorrect tokenization of 'b' string literals.
Patch by Serhiy Storchaka.
2012-06-16 21:49:08 -05:00
Brett Cannon c33f3f2339 Issue #14629: Mention the filename in SyntaxError exceptions from
tokenizer.detect_encoding() (when available).
2012-04-20 13:23:54 -04:00
Martin v. Löwis 63c39fe38e merge 3.2: issue 14629 2012-04-20 14:37:17 +02:00
Martin v. Löwis 63674f4b52 Issue #14629: Raise SyntaxError in tokenizer.detect_encoding
if the first two lines have non-UTF-8 characters without an encoding declaration.
2012-04-20 14:36:47 +02:00
Armin Ronacher c0eaecafe9 Updated tokenize to support the inverse byte literals new in 3.3 2012-03-04 13:07:57 +00:00
Armin Ronacher 6ecf77b3f8 Basic support for PEP 414 without docs or tests. 2012-03-04 12:04:06 +00:00
Meador Inge 00c7f85298 Issue #2134: Add support for tokenize.TokenInfo.exact_type. 2012-01-19 00:44:45 -06:00
Antoine Pitrou 10a99b024d Issue #13150: The tokenize module doesn't compile large regular expressions at startup anymore.
Instead, the re module's standard caching does its work.
2011-10-11 15:45:56 +02:00
Meador Inge 14c0f03b58 Issue #12943: python -m tokenize support has been added to tokenize. 2011-10-07 08:53:38 -05:00
Brett Cannon 45b96d373e Merged revisions 88498 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r88498 | brett.cannon | 2011-02-21 19:25:12 -0800 (Mon, 21 Feb 2011) | 8 lines

  Issue #11074: Make 'tokenize' so it can be reloaded.

  The module stored away the 'open' object as found in the global namespace
  (which fell through to the built-in namespace) since it defined its own 'open'.
  Problem is that if you reloaded the module it then grabbed the 'open' defined
  in the previous load, leading to code that infinite recursed. Switched to
  simply call builtins.open directly.
........
2011-02-22 03:35:18 +00:00
Brett Cannon f3042782af Issue #11074: Make 'tokenize' so it can be reloaded.
The module stored away the 'open' object as found in the global namespace
(which fell through to the built-in namespace) since it defined its own 'open'.
Problem is that if you reloaded the module it then grabbed the 'open' defined
in the previous load, leading to code that infinite recursed. Switched to
simply call builtins.open directly.
2011-02-22 03:25:12 +00:00
Alexander Belopolsky b9d10d08c4 Issue #10386: Added __all__ to token module; this simplifies importing
in tokenize module and prevents leaking of private names through
import *.
2010-11-11 14:07:41 +00:00
Victor Stinner 58c0752a33 Issue #10335: Add tokenize.open(), detect the file encoding using
tokenize.detect_encoding() and open it in read only mode.
2010-11-09 01:08:59 +00:00
Raymond Hettinger a0e79408bc A little bit more readable repr method. 2010-09-09 08:29:05 +00:00
Raymond Hettinger 3fb79c747b Experiment: Let collections.namedtuple() do the work. This should work now that _collections is pre-built. The buildbots will tell us shortly. 2010-09-09 07:15:18 +00:00
Raymond Hettinger 6c60d099e5 Improve the repr for the TokenInfo named tuple. 2010-09-09 04:32:39 +00:00
Florent Xicluna 43e4ea1b17 Remove unused import, fix typo and rewrap docstrings. 2010-09-03 19:54:02 +00:00
Benjamin Peterson 33856de84d handle names starting with non-ascii characters correctly #9712 2010-08-30 14:41:20 +00:00
Benjamin Peterson 1613ed8108 fix for files with coding cookies and BOMs 2010-03-18 22:34:15 +00:00
Benjamin Peterson 689a558098 in tokenize.detect_encoding(), return utf-8-sig when a BOM is found 2010-03-18 22:29:52 +00:00
Benjamin Peterson 81dd8b9594 use some more itertools magic to make '' be yielded after readline is done 2009-11-14 18:09:17 +00:00
Benjamin Peterson 21db77e396 simply by using itertools.chain() 2009-11-14 16:27:26 +00:00
Benjamin Peterson a0dfa82eca Merged revisions 75149,75260-75263,75265-75267,75292,75300,75376,75405,75429-75433,75437,75445,75501,75551,75572,75589-75591,75657,75742,75868,75952-75957,76057,76105,76139,76143,76162,76223 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r75149 | gregory.p.smith | 2009-09-29 16:56:31 -0500 (Tue, 29 Sep 2009) | 3 lines

  Mention issue6972 in extractall docs about overwriting things outside of
  the supplied path.
........
  r75260 | andrew.kuchling | 2009-10-05 16:24:20 -0500 (Mon, 05 Oct 2009) | 1 line

  Wording fix
........
  r75261 | andrew.kuchling | 2009-10-05 16:24:35 -0500 (Mon, 05 Oct 2009) | 1 line

  Fix narkup
........
  r75262 | andrew.kuchling | 2009-10-05 16:25:03 -0500 (Mon, 05 Oct 2009) | 1 line

  Document 'skip' parameter to constructor
........
  r75263 | andrew.kuchling | 2009-10-05 16:25:35 -0500 (Mon, 05 Oct 2009) | 1 line

  Note side benefit of socket.create_connection()
........
  r75265 | andrew.kuchling | 2009-10-05 17:31:11 -0500 (Mon, 05 Oct 2009) | 1 line

  Reword sentence
........
  r75266 | andrew.kuchling | 2009-10-05 17:32:48 -0500 (Mon, 05 Oct 2009) | 1 line

  Use standard comma punctuation; reword some sentences in the docs
........
  r75267 | andrew.kuchling | 2009-10-05 17:42:56 -0500 (Mon, 05 Oct 2009) | 1 line

  Backport r73983: Document the thousands separator.
........
  r75292 | benjamin.peterson | 2009-10-08 22:11:36 -0500 (Thu, 08 Oct 2009) | 1 line

  death to old CVS keyword
........
  r75300 | benjamin.peterson | 2009-10-09 16:48:14 -0500 (Fri, 09 Oct 2009) | 1 line

  fix some coding style
........
  r75376 | benjamin.peterson | 2009-10-11 20:26:07 -0500 (Sun, 11 Oct 2009) | 1 line

  platform we don't care about
........
  r75405 | neil.schemenauer | 2009-10-14 12:17:14 -0500 (Wed, 14 Oct 2009) | 4 lines

  Issue #1754094: Improve the stack depth calculation in the compiler.
  There should be no other effect than a small decrease in memory use.
  Patch by Christopher Tur Lesniewski-Laas.
........
  r75429 | benjamin.peterson | 2009-10-14 20:47:28 -0500 (Wed, 14 Oct 2009) | 1 line

  pep8ify if blocks
........
  r75430 | benjamin.peterson | 2009-10-14 20:49:37 -0500 (Wed, 14 Oct 2009) | 1 line

  use floor division and add a test that exercises the tabsize codepath
........
  r75431 | benjamin.peterson | 2009-10-14 20:56:25 -0500 (Wed, 14 Oct 2009) | 1 line

  change test to what I intended
........
  r75432 | benjamin.peterson | 2009-10-14 22:05:39 -0500 (Wed, 14 Oct 2009) | 1 line

  some cleanups
........
  r75433 | benjamin.peterson | 2009-10-14 22:06:55 -0500 (Wed, 14 Oct 2009) | 1 line

  make inspect.isabstract() always return a boolean; add a test for it, too #7069
........
  r75437 | benjamin.peterson | 2009-10-15 10:44:46 -0500 (Thu, 15 Oct 2009) | 1 line

  only clear a module's __dict__ if the module is the only one with a reference to it #7140
........
  r75445 | vinay.sajip | 2009-10-16 09:06:44 -0500 (Fri, 16 Oct 2009) | 1 line

  Issue #7120: logging: Removed import of multiprocessing which is causing crash in GAE.
........
  r75501 | antoine.pitrou | 2009-10-18 13:37:11 -0500 (Sun, 18 Oct 2009) | 3 lines

  Add a comment about unreachable code, and fix a typo
........
  r75551 | benjamin.peterson | 2009-10-19 22:14:10 -0500 (Mon, 19 Oct 2009) | 1 line

  use property api
........
  r75572 | benjamin.peterson | 2009-10-20 16:55:17 -0500 (Tue, 20 Oct 2009) | 1 line

  clarify buffer arg #7178
........
  r75589 | benjamin.peterson | 2009-10-21 21:26:47 -0500 (Wed, 21 Oct 2009) | 1 line

  whitespace
........
  r75590 | benjamin.peterson | 2009-10-21 21:36:47 -0500 (Wed, 21 Oct 2009) | 1 line

  rewrite to be nice to other implementations
........
  r75591 | benjamin.peterson | 2009-10-21 21:50:38 -0500 (Wed, 21 Oct 2009) | 4 lines

  rewrite for style, clarify, and comments

  Also, use the hasattr() like scheme of allowing BaseException exceptions through.
........
  r75657 | antoine.pitrou | 2009-10-24 07:41:27 -0500 (Sat, 24 Oct 2009) | 3 lines

  Fix compilation error in debug mode.
........
  r75742 | benjamin.peterson | 2009-10-26 17:51:16 -0500 (Mon, 26 Oct 2009) | 1 line

  use 'is' instead of id()
........
  r75868 | benjamin.peterson | 2009-10-27 15:59:18 -0500 (Tue, 27 Oct 2009) | 1 line

  test expect base classes
........
  r75952 | georg.brandl | 2009-10-29 15:38:32 -0500 (Thu, 29 Oct 2009) | 1 line

  Use the correct function name in docstring.
........
  r75953 | georg.brandl | 2009-10-29 15:39:50 -0500 (Thu, 29 Oct 2009) | 1 line

  Remove mention of the old -X command line switch.
........
  r75954 | georg.brandl | 2009-10-29 15:53:00 -0500 (Thu, 29 Oct 2009) | 1 line

  Use constants instead of magic integers for test result.  Do not re-run with --verbose3 for environment changing tests.
........
  r75955 | georg.brandl | 2009-10-29 15:54:03 -0500 (Thu, 29 Oct 2009) | 1 line

  Use a single style for all the docstrings in the math module.
........
  r75956 | georg.brandl | 2009-10-29 16:16:34 -0500 (Thu, 29 Oct 2009) | 1 line

  I do not think the "railroad" program mentioned is still available.
........
  r75957 | georg.brandl | 2009-10-29 16:44:56 -0500 (Thu, 29 Oct 2009) | 1 line

  Fix constant name.
........
  r76057 | benjamin.peterson | 2009-11-02 09:06:45 -0600 (Mon, 02 Nov 2009) | 1 line

  prevent a rather unlikely segfault
........
  r76105 | georg.brandl | 2009-11-04 01:38:12 -0600 (Wed, 04 Nov 2009) | 1 line

  #7259: show correct equivalent for operator.i* operations in docstring; fix minor issues in operator docs.
........
  r76139 | benjamin.peterson | 2009-11-06 19:04:38 -0600 (Fri, 06 Nov 2009) | 1 line

  spelling
........
  r76143 | georg.brandl | 2009-11-07 02:26:07 -0600 (Sat, 07 Nov 2009) | 1 line

  #7271: fix typo.
........
  r76162 | benjamin.peterson | 2009-11-08 22:10:53 -0600 (Sun, 08 Nov 2009) | 1 line

  discuss how to use -p
........
  r76223 | georg.brandl | 2009-11-12 02:29:46 -0600 (Thu, 12 Nov 2009) | 1 line

  Give the profile module a module directive.
........
2009-11-13 02:25:08 +00:00
Benjamin Peterson d3afadaa49 normalize latin-1 and utf-8 variant encodings like the builtin tokenizer does 2009-10-09 21:43:09 +00:00
Raymond Hettinger aa17a7fc98 Remove dependency on the collections module. 2009-04-29 14:21:25 +00:00
Raymond Hettinger a48db39992 Issue #5857: tokenize.tokenize() now returns named tuples. 2009-04-29 00:34:27 +00:00
Benjamin Peterson 9b8d24b17d reuse tokenize.detect_encoding in linecache instead of a custom solution
patch by Victor Stinner #4016
2009-03-24 22:30:15 +00:00
Benjamin Peterson 433f32c3be raise a SyntaxError in detect_encoding() when a codec lookup fails like the builtin parser #4021 2008-12-12 01:25:05 +00:00
Antoine Pitrou fd036451bf #2834: Change re module semantics, so that str and bytes mixing is forbidden,
and str (unicode) patterns get full unicode matching by default. The re.ASCII
flag is also introduced to ask for ASCII matching instead.
2008-08-19 17:56:33 +00:00
Benjamin Peterson 0fe14383a8 use the more idomatic (and Py3k faster) while True 2008-06-05 23:07:42 +00:00