cpython

Commit Graph

Author	SHA1	Message	Date
Victor Stinner	e822e37946	bpo-36020: Remove snprintf macro in pyerrors.h (GH-20889) On Windows, #include "pyerrors.h" no longer defines "snprintf" and "vsnprintf" macros. PyOS_snprintf() and PyOS_vsnprintf() should be used to get portable behavior. Replace snprintf() calls with PyOS_snprintf() and replace vsnprintf() calls with PyOS_vsnprintf().	2020-06-15 21:59:47 +02:00
Lysandros Nikolaou	896f4cf63f	bpo-40847: Consider a line with only a LINECONT a blank line (GH-20769) A line with only a line continuation character should be considered a blank line at tokenizer level so that only a single NEWLINE token gets emitted. The old parser was working around the issue, but the new parser threw a `SyntaxError` for valid input. For example, an empty line following a line continuation character was interpreted as a `SyntaxError`. Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>	2020-06-11 00:56:08 +01:00
Ammar Askar	a2bbedc8b1	Fix peg_generator compiler warnings under MSVC (GH-20405)	2020-05-26 05:33:35 +01:00
Serhiy Storchaka	74ea6b5a75	bpo-40593: Improve syntax errors for invalid characters in source code. (GH-20033)	2020-05-12 12:42:04 +03:00
Lysandros Nikolaou	846d8b28ab	bpo-40246: Revert reporting of invalid string prefixes (GH-19888) Due to backwards compatibility concerns regarding keywords immediately followed by a string without whitespace between them (like in `bg="#d00" if clear else"#fca"`) will fail to parse, commit `41d5b94af4` has to be reverted.	2020-05-04 12:32:18 +01:00
Pablo Galindo	11a7f158ef	bpo-40335: Correctly handle multi-line strings in tokenize error scenarios (GH-19619) Co-authored-by: Guido van Rossum <gvanrossum@gmail.com>	2020-04-21 01:53:04 +01:00
Lysandros Nikolaou	41d5b94af4	bpo-40246: Report a better error message for invalid string prefixes (GH-19476)	2020-04-12 19:21:00 +01:00
Victor Stinner	87d3b9db4a	bpo-39882: Add _Py_FatalErrorFormat() function (GH-19157)	2020-03-25 19:27:36 +01:00
Victor Stinner	9e5d30cc99	bpo-39882: Py_FatalError() logs the function name (GH-18819) The Py_FatalError() function is replaced with a macro which logs automatically the name of the current function, unless the Py_LIMITED_API macro is defined. Changes: * Add _Py_FatalErrorFunc() function. * Remove the function name from the message of Py_FatalError() calls which included the function name. * Update tests.	2020-03-07 00:54:20 +01:00
Andy Lester	384f3c536d	closes bpo-39721: Fix constness of members of tok_state struct. (GH-18600) The function PyTokenizer_FromUTF8 from Parser/tokenizer.c had a comment: /* XXX: constify members. */ This patch addresses that. In the tok_state struct: * end and start were non-const but could be made const * str and input were const but should have been non-const Changes to support this include: * decode_str() now returns a char * since it is allocated. * PyTokenizer_FromString() and PyTokenizer_FromUTF8() each creates a new char * for an allocate string instead of reusing the input const char *. * PyTokenizer_Get() and tok_get() now take const char ** arguments. * Various local vars are const or non-const accordingly. I was able to remove five casts that cast away constness.	2020-02-27 18:44:52 -08:00
Serhiy Storchaka	0cc6b5e559	bpo-39219: Fix SyntaxError attributes in the tokenizer. (GH-17828) * Always set the text attribute. * Correct the offset attribute for non-ascii sources.	2020-02-12 12:17:00 +02:00
Victor Stinner	f3e7ea5b8c	bpo-39500: Document PyUnicode_IsIdentifier() function (GH-18397) PyUnicode_IsIdentifier() does not call Py_FatalError() anymore if the string is not ready.	2020-02-11 14:29:33 +01:00
Pablo Galindo	5ec91f78d5	bpo-39209: Manage correctly multi-line tokens in interactive mode (GH-17860)	2020-01-06 15:59:09 +00:00
Batuhan Taşkaya	109fc2792a	bpo-38673: dont switch to ps2 if the line starts with comment or whitespace (GH-17421) https://bugs.python.org/issue38673	2019-12-08 20:36:27 -08:00
Hansraj Das	69f37bcb28	Indent code inside if block. (GH-15284) Without indentation, seems like strcpy line is parallel to `if` condition.	2019-08-15 09:19:07 -07:00
Anthony Sottile	5b94f3578c	Fix `SyntaxError` indicator printing too many spaces for multi-line strings (GH-14433)	2019-07-29 14:59:13 +01:00
Michael J. Sullivan	d8a82e2897	bpo-36878: Only allow text after `# type: ignore` if first character ASCII (GH-13504) This disallows things like `# type: ignoreé`, which seems wrong. Also switch to using Py_ISALNUM for the alnum check, for consistency with other code (and maybe correctness re: locale issues?). https://bugs.python.org/issue36878	2019-05-22 13:43:36 -07:00
Michael J. Sullivan	933e1509ec	bpo-36878: Track extra text added to 'type: ignore' in the AST (GH-13479) GH-13238 made extra text after a # type: ignore accepted by the parser. This finishes the job and actually plumbs the extra text through the parser and makes it available in the AST.	2019-05-22 15:54:20 +01:00
Anthony Sottile	abea73bf4a	bpo-2180: Treat line continuation at EOF as a `SyntaxError` (GH-13401) This makes the parser consistent with the tokenize module (already the case in `pypy`). sample ------ ```python x = 5\ ``` before ------ ```console $ python3 t.py $ python3 -mtokenize t.py t.py:2:0: error: EOF in multi-line statement ``` after ----- ```console $ ./python t.py File "t.py", line 3 x = 5\ ^ SyntaxError: unexpected EOF while parsing $ ./python -m tokenize t.py t.py:2:0: error: EOF in multi-line statement ``` https://bugs.python.org/issue2180	2019-05-18 11:27:16 -07:00
Michael J. Sullivan	d8320ecb86	bpo-36878: Allow extra text after `# type: ignore` comments (GH-13238) In the parser, when using the type_comments=True option, recognize a TYPE_IGNORE as anything containing `# type: ignore` followed by a non-alphanumeric character. This is to allow ignores such as `# type: ignore[E1000]`.	2019-05-11 19:17:24 +01:00
Pablo Galindo	f2cf1e3e28	bpo-36623: Clean parser headers and include files (GH-12253) After the removal of pgen, multiple header and function prototypes that lack implementation or are unused are still lying around.	2019-04-13 17:05:14 +01:00
Zackery Spytz	cda139d1de	bpo-36459: Fix a possible double PyMem_FREE() due to tokenizer.c's tok_nextc() (12601) Remove the PyMem_FREE() call added in `cb90c89`. The buffer will be freed when PyTokenizer_Free() is called on the tokenizer state.	2019-03-28 15:53:00 +02:00
Pablo Galindo	cb90c89de1	bpo-36367: Free buffer if realloc fails in tokenize.c (GH-12442)	2019-03-19 17:17:58 +00:00
Guido van Rossum	495da29225	bpo-35975: Support parsing earlier minor versions of Python 3 (GH-12086) This adds a `feature_version` flag to `ast.parse()` (documented) and `compile()` (hidden) that allow tweaking the parser to support older versions of the grammar. In particular if `feature_version` is 5 or 6, the hacks for the `async` and `await` keyword from PEP 492 are reinstated. (For 7 or higher, these are unconditionally treated as keywords, but they are still special tokens rather than `NAME` tokens that the parser driver recognizes.) https://bugs.python.org/issue35975	2019-03-07 12:38:08 -08:00
Pablo Galindo	1f24a719e7	bpo-35808: Retire pgen and use pgen2 to generate the parser (GH-11814) Pgen is the oldest piece of technology in the CPython repository, building it requires various #if[n]def PGEN hacks in other parts of the code and it also depends more and more on CPython internals. This commit removes the old pgen C code and replaces it for a new version implemented in pure Python. This is a modified and adapted version of lib2to3/pgen2 that can generate grammar files compatibles with the current parser. This commit also eliminates all the #ifdef and code branches related to pgen, simplifying the code and making it more maintainable. The regen-grammar step now uses $(PYTHON_FOR_REGEN) that can be any version of the interpreter, so the new pgen code maintains compatibility with older versions of the interpreter (this also allows regenerating the grammar with the current CI solution that uses Python3.5). The new pgen Python module also makes use of the Grammar/Tokens file that holds the token specification, so is always kept in sync and avoids having to maintain duplicate token definitions.	2019-03-01 15:34:44 -08:00
Guido van Rossum	dcfcd146f8	bpo-35766: Merge typed_ast back into CPython (GH-11645)	2019-01-31 12:40:27 +01:00
Anthony Sottile	995d9b9297	bpo-16806: Fix `lineno` and `col_offset` for multi-line string tokens (GH-10021)	2019-01-13 13:05:13 +09:00
Serhiy Storchaka	8ac658114d	bpo-30455: Generate all token related code and docs from Grammar/Tokens. (GH-10370) "Include/token.h", "Lib/token.py" (containing now some data moved from "Lib/tokenize.py") and new files "Parser/token.c" (containing the code moved from "Parser/tokenizer.c") and "Doc/library/token-list.inc" (included in "Doc/library/token.rst") are now generated from "Grammar/Tokens" by "Tools/scripts/generate_token.py". The script overwrites files only if needed and can be used on the read-only sources tree. "Lib/symbol.py" is now generated by "Tools/scripts/generate_symbol_py.py" instead of been executable itself. Added new make targets "regen-token" and "regen-symbol" which are now dependencies of "regen-all". The documentation contains now strings for operators and punctuation tokens.	2018-12-22 11:18:40 +02:00
Serhiy Storchaka	94cf308ee2	bpo-33306: Improve SyntaxError messages for unbalanced parentheses. (GH-6516)	2018-12-17 17:34:14 +02:00
Zackery Spytz	4c49da0cb7	bpo-35436: Add missing PyErr_NoMemory() calls and other minor bug fixes. (GH-11015) Set MemoryError when appropriate, add missing failure checks, and fix some potential leaks.	2018-12-07 12:11:30 +02:00
Zackery Spytz	5061a74a4c	Remove unneeded PyUnicode_READY() in tokenizer.c (GH-9114)	2018-09-10 09:27:31 +03:00
Victor Stinner	c884616390	Fix Windows compiler warning in tokenize.c (GH-8359) Fix the following warning on Windows: parser\tokenizer.c(1297): warning C4244: 'function': conversion from '__int64' to 'int', possible loss of data.	2018-07-21 03:36:06 +02:00
Serhiy Storchaka	cf7303ed2a	bpo-33305: Improve SyntaxError for invalid numerical literals. (GH-6517)	2018-07-09 15:09:35 +03:00
Victor Stinner	f2ddc6ac93	tokenizer: Remove unused tabs options (#4422) Remove the following fields from tok_state structure which are now used unused: * altwarning: "Issue warning if alternate tabs don't match" * alterror: "Issue error if alternate tabs don't match" * alttabsize: "Alternate tab spacing" Replace alttabsize variable with ALTTABSIZE define.	2017-11-17 01:25:47 -08:00
Jelle Zijlstra	ac317700ce	bpo-30406: Make async and await proper keywords (#1669) Per PEP 492, 'async' and 'await' should become proper keywords in 3.7.	2017-10-05 23:24:46 -04:00
Albert-Jan Nijburg	c9ccacea3f	bpo-25324: add missing comma in Parser/tokenizer.c (GH-1910)	2017-06-01 13:51:27 -07:00
Albert-Jan Nijburg	fc354f0785	bpo-25324: copy tok_name before changing it (#1608) * add test to check if were modifying token * copy list so import tokenize doesnt have side effects on token * shorten line * add tokenize tokens to token.h to get them to show up in token * move ERRORTOKEN back to its previous location, and fix nitpick * copy comments from token.h automatically * fix whitespace and make more pythonic * change to fix comments from @haypo * update token.rst and Misc/NEWS * change wording * some more wording changes	2017-05-31 16:00:21 +02:00
Berker Peksag	d2f4404bbb	Issue #28489: Merge from 3.6	2017-02-05 04:33:11 +03:00
Berker Peksag	6f80562862	Issue #28489: Fix comment in tokenizer.c Patch by Ryan Gonzalez.	2017-02-05 04:32:39 +03:00
Victor Stinner	a5ed5f000a	Use _PyObject_CallNoArg() Replace: PyObject_CallObject(callable, NULL) with: _PyObject_CallNoArg(callable)	2016-12-06 18:45:50 +01:00
Serhiy Storchaka	06515833fe	Replaced outdated macros _PyUnicode_AsString and _PyUnicode_AsStringAndSize with PyUnicode_AsUTF8 and PyUnicode_AsUTF8AndSize.	2016-11-20 09:13:07 +02:00
Benjamin Peterson	f5e8e8fc2b	merge 3.5 (#24022)	2016-09-18 23:44:02 -07:00
Benjamin Peterson	57bda335e1	merge 3.4	2016-09-18 23:43:18 -07:00
Benjamin Peterson	26d998cfdd	properly handle the single null-byte file (closes #24022)	2016-09-18 23:41:11 -07:00
Benjamin Peterson	5a715cfc57	merge 3.5 (#27981)	2016-09-12 22:07:14 -07:00
Benjamin Peterson	35ee948fa5	restructure fp_setreadl so as to avoid refleaks (closes #27981)	2016-09-12 22:06:58 -07:00
Brett Cannon	a721abac29	Issue #26331: Implement the parsing part of PEP 515. Thanks to Georg Brandl for the patch.	2016-09-09 14:57:09 -07:00
Christian Heimes	c6cc23d0b9	Skip unused value in tokenizer code In the case of an escape character, c is never read. tok_next() is used to advance the pointer. CID 1225097	2016-09-09 00:09:45 +02:00
Serhiy Storchaka	ec39756960	Issue #22570: Renamed Py_SETREF to Py_XSETREF.	2016-04-06 09:50:03 +03:00
Serhiy Storchaka	48842714b9	Issue #22570: Renamed Py_SETREF to Py_XSETREF.	2016-04-06 09:45:48 +03:00
Benjamin Peterson	7285d520e0	remove duplicated check for fractions and complex numbers (closes #26076) Patch by Oren Milman.	2016-03-24 22:43:23 -07:00
Serhiy Storchaka	a051bf3afb	Issue #26581: Use the first coding cookie on a line, not the last one.	2016-03-20 23:47:48 +02:00
Serhiy Storchaka	e431d3c9aa	Issue #26581: Use the first coding cookie on a line, not the last one.	2016-03-20 23:36:29 +02:00
Serhiy Storchaka	ef1585eb9a	Issue #25923: Added more const qualifiers to signatures of static and private functions.	2015-12-25 20:01:53 +02:00
Serhiy Storchaka	f006940351	Issue #20440: Massive replacing unsafe attribute setting code with special macro Py_SETREF.	2015-12-24 10:39:57 +02:00
Serhiy Storchaka	5a57ade58e	Issue #20440: Massive replacing unsafe attribute setting code with special macro Py_SETREF.	2015-12-24 10:35:59 +02:00
Serhiy Storchaka	0304729ec4	Issue #25388: Fixed tokenizer crash when processing undecodable source code with a null byte.	2015-11-14 15:12:04 +02:00
Serhiy Storchaka	7e2b870b85	Issue #25388: Fixed tokenizer crash when processing undecodable source code with a null byte.	2015-11-14 15:11:17 +02:00
Serhiy Storchaka	0d441119f5	Issue #25388: Fixed tokenizer crash when processing undecodable source code with a null byte.	2015-11-14 15:10:35 +02:00
Eric V. Smith	235a6f0984	Issue #24965: Implement PEP 498 "Literal String Interpolation". Documentation is still needed, I'll open an issue for that.	2015-09-19 14:51:32 -04:00
Eric V. Smith	6408dc82fa	Fixed indentation.	2015-09-12 18:53:36 -04:00
Yury Selivanov	96ec934e75	Issue #24619: Simplify async/await tokenization. This commit simplifies async/await tokenization in tokenizer.c, tokenize.py & lib2to3/tokenize.py. Previous solution was to keep a stack of async-def & def blocks, whereas the new approach is just to remember position of the outermost async-def block. This change won't bring any parsing performance improvements, but it makes the code much easier to read and validate.	2015-07-23 15:01:58 +03:00
Yury Selivanov	8fb307cd65	Issue #24619: New approach for tokenizing async/await. This commit fixes how one-line async-defs and defs are tracked by tokenizer. It allows to correctly parse invalid code such as: >>> async def f(): ... def g(): pass ... async = 10 and valid code such as: >>> async def f(): ... async def g(): pass ... await z As a consequence, is is now possible to have one-line 'async def foo(): await ..' functions: >>> async def foo(): return await bar()	2015-07-22 13:33:45 +03:00
Yury Selivanov	8085b80c18	Issue 24226: Fix parsing of many sequential one-line 'def' statements.	2015-05-18 12:50:52 -04:00
Yury Selivanov	7544508f02	PEP 0492 -- Coroutines with async and await syntax. Issue #24017.	2015-05-11 22:57:16 -04:00
Benjamin Peterson	273a720f87	merge 3.4 (#24022)	2015-04-21 12:07:06 -04:00
Benjamin Peterson	d73aca769f	do not call into python api if an exception is set (#24022)	2015-04-21 12:05:19 -04:00
Benjamin Peterson	3e439797ba	merge 3.4 (#21642)	2014-06-07 12:39:51 -07:00
Benjamin Peterson	c416162302	allow the keyword else immediately after (no space) an integer (closes #21642)	2014-06-07 12:36:39 -07:00
Benjamin Peterson	d51374ed78	PEP 465: a dedicated infix operator for matrix multiplication (closes #21176)	2014-04-09 23:55:56 -04:00
Martin v. Löwis	78f1e4c865	Merge with 3.3	2014-02-28 15:43:36 +01:00
Martin v. Löwis	815b41b1cd	Issue #20731: Properly position in source code files even if they are opened in text mode. Patch by Serhiy Storchaka.	2014-02-28 15:27:29 +01:00
Serhiy Storchaka	5940b92909	Do not reset the line number because we already set file position to correct value. (fixes error in patch for issue #18960)	2014-01-09 20:13:52 +02:00
Serhiy Storchaka	1064a13bb0	Do not reset the line number because we already set file position to correct value. (fixes error in patch for issue #18960)	2014-01-09 20:12:49 +02:00
Serhiy Storchaka	7282ff6d5b	Issue #18960: Fix bugs with Python source code encoding in the second line. * The first line of Python script could be executed twice when the source encoding (not equal to 'utf-8') was specified on the second line. * Now the source encoding declaration on the second line isn't effective if the first line contains anything except a comment. * As a consequence, 'python -x' works now again with files with the source encoding declarations specified on the second file, and can be used again to make Python batch files on Windows. * The tokenize module now ignore the source encoding declaration on the second line if the first line contains anything except a comment. * IDLE now ignores the source encoding declaration on the second line if the first line contains anything except a comment. * 2to3 and the findnocoding.py script now ignore the source encoding declaration on the second line if the first line contains anything except a comment.	2014-01-09 18:41:59 +02:00
Serhiy Storchaka	768c16ce02	Issue #18960: Fix bugs with Python source code encoding in the second line. * The first line of Python script could be executed twice when the source encoding (not equal to 'utf-8') was specified on the second line. * Now the source encoding declaration on the second line isn't effective if the first line contains anything except a comment. * As a consequence, 'python -x' works now again with files with the source encoding declarations specified on the second file, and can be used again to make Python batch files on Windows. * The tokenize module now ignore the source encoding declaration on the second line if the first line contains anything except a comment. * IDLE now ignores the source encoding declaration on the second line if the first line contains anything except a comment. * 2to3 and the findnocoding.py script now ignore the source encoding declaration on the second line if the first line contains anything except a comment.	2014-01-09 18:36:09 +02:00
Serhiy Storchaka	c679227e31	Issue #1772673: The type of `char*` arguments now changed to `const char*`.	2013-10-19 21:03:34 +03:00
Victor Stinner	daf455554b	Issue #18571: Implementation of the PEP 446: file descriptors and file handles are now created non-inheritable; add functions os.get/set_inheritable(), os.get/set_handle_inheritable() and socket.socket.get/set_inheritable().	2013-08-28 00:53:59 +02:00
Antoine Pitrou	9ed5f27266	Issue #18722: Remove uses of the "register" keyword in C code.	2013-08-13 20:18:52 +02:00
Benjamin Peterson	cb2226cb69	merge 3.3	2013-07-15 20:50:25 -07:00
Benjamin Peterson	265fba40c8	move declaration to top of block	2013-07-15 20:50:22 -07:00
Benjamin Peterson	fd9c0203de	merge 3.3 (closes #18470)	2013-07-15 20:47:47 -07:00
Benjamin Peterson	2dbfd88245	check the return value of new_string() (closes #18470)	2013-07-15 19:15:34 -07:00
Serhiy Storchaka	9670543a00	Issue #18038: SyntaxError raised during compilation sources with illegal encoding now always contains an encoding name.	2013-06-09 16:53:55 +03:00
Serhiy Storchaka	3af14aaba5	Issue #18038: SyntaxError raised during compilation sources with illegal encoding now always contains an encoding name.	2013-06-09 16:51:52 +03:00
Victor Stinner	796977360f	Issue #9566: Fix compiler warning on Windows 64-bit	2013-06-05 00:44:00 +02:00
Benjamin Peterson	d0845588b8	make _PyParser_TokenNames const	2012-10-24 08:21:52 -07:00
Christian Heimes	0b3847de6d	Issue #15096: Drop support for the ur string prefix	2012-06-20 11:17:58 +02:00
Armin Ronacher	6ecf77b3f8	Basic support for PEP 414 without docs or tests.	2012-03-04 12:04:06 +00:00
Antoine Pitrou	3a5d4cb940	Issue #13748: Raw bytes literals can now be written with the `rb` prefix as well as `br`.	2012-01-12 22:46:19 +01:00
Martin v. Löwis	bd928fef42	Rename _Py_identifier to _Py_IDENTIFIER.	2011-10-14 10:20:37 +02:00
Martin v. Löwis	1ee1b6fe0d	Use identifier API for PyObject_GetAttrString.	2011-10-10 18:11:30 +02:00
Martin v. Löwis	afe55bba33	Add API for static strings, primarily good for identifiers. Thanks to Konrad Schöbel and Jasper Schulz for helping with the mass-editing.	2011-10-09 10:38:36 +02:00
Martin v. Löwis	d63a3b8beb	Implement PEP 393.	2011-09-28 07:41:54 +02:00
Jesus Cea	c1935d2abf	Revert bb62908896fe, but keep the test	2011-04-25 04:03:58 +02:00
Jesus Cea	88f7841be7	Correctly merging #9319 into 3.3?	2011-04-25 03:46:43 +02:00
Victor Stinner	c68b6aaec8	Issue #9319: Fix a crash on parsing a Python source code without encoding cookie and not valid in UTF-8: use "<file>" as the filename instead of reading from NULL.	2011-04-23 00:41:19 +02:00
Victor Stinner	fe7c5b5bdf	Issue #9319: Include the filename in "Non-UTF8 code ..." syntax error.	2011-04-05 01:48:03 +02:00
Victor Stinner	7f2fee3640	Issue #10785: Store the filename as Unicode in the Python parser.	2011-04-05 00:39:01 +02:00
Victor Stinner	034c7537d8	Issue #10841: don't translate newlines for pgen	2011-01-07 18:56:19 +00:00

316 Commits