Commit Graph

1194 Commits

Author SHA1 Message Date
Lysandros Nikolaou fb7843ee89
gh-107450: Raise OverflowError when parser column offset overflows (#110754) 2023-10-12 09:34:12 +00:00
Pablo Galindo Salgado 3d180347ae
gh-110696: Fix incorrect syntax error message for incorrect argument unpacking (#110706) 2023-10-12 09:02:02 +00:00
Lysandros Nikolaou 17d65547df
gh-104169: Fix test_peg_generator after tokenizer refactoring (#110727)
* Fix test_peg_generator after tokenizer refactoring
* Remove references to tokenizer.c in comments etc.
2023-10-12 09:34:35 +02:00
Filipe Laíns 23645420dc
GH-110749: fix unistd.h import in file_tokenizer.c (#110750) 2023-10-12 07:52:13 +02:00
Lysandros Nikolaou 01481f2dc1
gh-104169: Refactor tokenizer into lexer and wrappers (#110684)
* The lexer, which include the actual lexeme producing logic, goes into
  the `lexer` directory.
* The wrappers, one wrapper per input mode (file, string, utf-8, and
  readline), go into the `tokenizer` directory and include logic for
  creating a lexer instance and managing the buffer for different modes.
---------

Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
2023-10-11 15:14:44 +00:00
sunmy2019 2cb62c6437
gh-110309: Prune empty constant in format specs (#110320)
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
2023-10-05 14:08:42 +00:00
Pablo Galindo Salgado cc389ef627
gh-110259: Fix f-strings with multiline expressions and format specs (#110271)
Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
2023-10-05 14:26:44 +01:00
Victor Stinner 7513994c92
gh-110014: Include explicitly <unistd.h> header (#110155)
* Remove unused <locale.h> includes.
* Remove unused <fcntl.h> include in traceback.h.
* Remove redundant <assert.h> and <stddef.h> includes. They  are already
  included by "Python.h".
* Remove <object.h> include in faulthandler.c. Python.h already includes it.
* Add missing <stdbool.h> in pycore_pythread.h if HAVE_PTHREAD_STUBS
  is defined.
* Fix also warnings in pthread_stubs.h: don't redefine macros if they
  are already defined, like the __NEED_pthread_t macro.
2023-09-30 20:06:45 +00:00
Pablo Galindo Salgado b28ffaa193
gh-109596: Ensure repeated rules in the grammar are not allowed and fix incorrect soft keywords (#109606) 2023-09-22 19:03:23 +01:00
Pablo Galindo Salgado 5bda2f637e
gh-109114: Relax the check for invalid lambdas inside f-strings to avoid false positives (#109121) 2023-09-08 17:00:23 +00:00
Victor Stinner b0edf3b98e
GH-91079: Rename C_RECURSION_LIMIT to Py_C_RECURSION_LIMIT (#108507)
Symbols of the C API should be prefixed by "Py_" to avoid conflict
with existing names in 3rd party C extensions on "#include <Python.h>".

test.pythoninfo now logs Py_C_RECURSION_LIMIT constant and other
_testcapi and _testinternalcapi constants.
2023-09-08 09:48:28 +00:00
Serhiy Storchaka b2729e93e9
gh-88943: Improve syntax error for non-ASCII character that follows a numerical literal (GH-109081)
It now points on the invalid non-ASCII character, not on the valid numerical literal.
2023-09-07 17:00:13 +03:00
Victor Stinner 578ebc5d5f
gh-108767: Replace ctype.h functions with pyctype.h functions (#108772)
Replace <ctype.h> locale dependent functions with Python "pyctype.h"
locale independent functions:

* Replace isalpha() with Py_ISALPHA().
* Replace isdigit() with Py_ISDIGIT().
* Replace isxdigit() with Py_ISXDIGIT().
* Replace tolower() with Py_TOLOWER().

Leave Modules/_sre/sre.c unchanged, it uses locale dependent
functions on purpose.

Include explicitly <ctype.h> in _decimal.c to get isascii().
2023-09-01 18:36:53 +02:00
Victor Stinner e59a95238b
gh-108444: Remove _PyLong_AsInt() function (#108461)
* Update Parser/asdl_c.py to regenerate Python/Python-ast.c.
* Remove _PyLong_AsInt() alias to PyLong_AsInt().
2023-08-25 11:13:59 +02:00
Dennis Sweeney 86617518c4
gh-108179: Add error message for parser stack overflows (#108256) 2023-08-22 08:41:50 +01:00
Victor Stinner 0dd3fc2a64
gh-108216: Cleanup #include in internal header files (#108228)
* Add missing includes.
* Remove unused includes.
* Update old include/symbol names to newer names.
* Mention at least one included symbol.
* Sort includes.
* Update Tools/cases_generator/generate_cases.py used to generated
  pycore_opcode_metadata.h.
* Update Parser/asdl_c.py used to generate pycore_ast.h.
* Cleanup also includes in _testcapimodule.c and _testinternalcapi.c.
2023-08-21 18:05:59 +00:00
Irit Katriel 10a91d7e98
gh-108113: Make it possible to create an optimized AST (#108154) 2023-08-21 16:31:30 +00:00
Lysandros Nikolaou d66bc9e8a7
gh-107967: Fix infinite recursion on invalid escape sequence warning (#107968) 2023-08-15 11:26:42 +00:00
Mark Shannon fa45958450
GH-107263: Increase C stack limit for most functions, except `_PyEval_EvalFrameDefault()` (GH-107535)
* Set C recursion limit to 1500, set cost of eval loop to 2 frames, and compiler mutliply to 2.
2023-08-04 10:10:29 +01:00
Pablo Galindo Salgado da8f87b7ea
gh-107015: Remove async_hacks from the tokenizer (#107018) 2023-07-26 16:34:15 +01:00
Victor Stinner 1a3faba9f1
gh-106869: Use new PyMemberDef constant names (#106871)
* Remove '#include "structmember.h"'.
* If needed, add <stddef.h> to get offsetof() function.
* Update Parser/asdl_c.py to regenerate Python/Python-ast.c.
* Replace:

  * T_SHORT => Py_T_SHORT
  * T_INT => Py_T_INT
  * T_LONG => Py_T_LONG
  * T_FLOAT => Py_T_FLOAT
  * T_DOUBLE => Py_T_DOUBLE
  * T_STRING => Py_T_STRING
  * T_OBJECT => _Py_T_OBJECT
  * T_CHAR => Py_T_CHAR
  * T_BYTE => Py_T_BYTE
  * T_UBYTE => Py_T_UBYTE
  * T_USHORT => Py_T_USHORT
  * T_UINT => Py_T_UINT
  * T_ULONG => Py_T_ULONG
  * T_STRING_INPLACE => Py_T_STRING_INPLACE
  * T_BOOL => Py_T_BOOL
  * T_OBJECT_EX => Py_T_OBJECT_EX
  * T_LONGLONG => Py_T_LONGLONG
  * T_ULONGLONG => Py_T_ULONGLONG
  * T_PYSSIZET => Py_T_PYSSIZET
  * T_NONE => _Py_T_NONE
  * READONLY => Py_READONLY
  * PY_AUDIT_READ => Py_AUDIT_READ
  * READ_RESTRICTED => Py_AUDIT_READ
  * PY_WRITE_RESTRICTED => _Py_WRITE_RESTRICTED
  * RESTRICTED => (READ_RESTRICTED | _Py_WRITE_RESTRICTED)
2023-07-25 15:28:30 +02:00
Victor Stinner 7d41ead919
gh-106320: Remove _PyBytes_Join() C API (#107144)
Move private _PyBytes functions to the internal C API
(pycore_bytesobject.h):

* _PyBytes_DecodeEscape()
* _PyBytes_FormatEx()
* _PyBytes_FromHex()
* _PyBytes_Join()

No longer export these functions.
2023-07-23 20:10:12 +00:00
Victor Stinner d228825e08
gh-106320: Remove _PyOS_ReadlineTState API (#107034)
Remove _PyOS_ReadlineTState variable from the public C API.
The symbol is still exported for the readline shared extension.
2023-07-22 14:45:56 +00:00
Menelaos Kotoglou 76e20c361c
gh-106989: Remove tok report warnings (#106993) 2023-07-22 14:23:23 +02:00
Serhiy Storchaka be1b968dc1
gh-106521: Remove _PyObject_LookupAttr() function (GH-106642) 2023-07-12 08:57:10 +03:00
Lysandros Nikolaou dfe4de2038
gh-106396: Special-case empty format spec to gen empty JoinedStr node (#106401) 2023-07-04 14:19:08 +02:00
Victor Stinner d8c5d76da2
gh-106320: Remove private _PyUnicode codecs C API functions (#106385)
Remove private _PyUnicode codecs C API functions: move them to the
internal C API (pycore_unicodeobject.h). No longer export most of
these functions.
2023-07-04 07:29:52 +00:00
Victor Stinner c5afc97fc2
gh-106320: Remove private _PyErr C API functions (#106356)
Remove private _PyErr C API functions: move them to the internal
C API (pycore_pyerrors.h).
2023-07-03 10:48:50 +00:00
Inada Naoki d5bd32fb48
gh-104922: remove PY_SSIZE_T_CLEAN (#106315) 2023-07-02 15:07:46 +09:00
Nikita Sobolev 46c1097868
gh-106145: Make `end_{lineno,col_offset}` required on `type_param` nodes (#106224) 2023-06-30 23:45:08 +00:00
Victor Stinner 8c5f74fc89
gh-106023: Update code using _PyObject_FastCall() (#106257)
Replace _PyObject_FastCall() calls with PyObject_Vectorcall().
2023-06-30 01:05:01 +00:00
Pablo Galindo Salgado 13237a2da8
gh-98931: Add custom error messages to invalid import/from with multiple targets (#105985)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2023-06-22 15:56:40 +00:00
Lysandros Nikolaou 6586cee27f
gh-105938: Emit a SyntaxWarning for escaped braces in an f-string (#105939) 2023-06-20 12:38:46 +00:00
Brandt Bucher a4056c8f9c
GH-105588: Add missing error checks to some obj2ast_* converters (GH-105589) 2023-06-15 15:45:13 -07:00
Lysandros Nikolaou d382ad4915
gh-105820: Fix tok_mode expression buffer in file & readline tokenizer (#105828) 2023-06-15 16:21:24 +00:00
Pablo Galindo Salgado 12b6d844d8
gh-105800: Issue SyntaxWarning in f-strings for invalid escape sequences (#105801) 2023-06-15 01:08:12 +01:00
Lysandros Nikolaou abfbab6415
gh-105718: Fix buffer allocation in tokenizer with readline (#105728) 2023-06-13 16:18:11 +01:00
Pablo Galindo Salgado b047fa5e56
gh-105549: Tokenize separately NUMBER and NAME tokens and allow 0-prefixed literals (#105555) 2023-06-09 21:39:01 +01:00
Pablo Galindo Salgado c0a6ed3934
gh-105259: Ensure we don't show newline characters for trailing NEWLINE tokens (#105364) 2023-06-06 12:52:16 +01:00
Pablo Galindo Salgado 41de54378d
gh-105194: Fix format specifier escaped characters in f-strings (#105231) 2023-06-02 13:33:26 +02:00
Jelle Zijlstra 77d2579586
gh-104799: Default missing lists in AST to the empty list (#104834)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2023-06-01 18:39:39 -07:00
Victor Stinner ef300937c2
gh-92536: Remove PyUnicode_READY() calls (#105210)
Since Python 3.12, PyUnicode_READY() does nothing and always
returns 0.
2023-06-02 01:33:17 +02:00
Lysandros Nikolaou 70f315c2d6
gh-105042: Disable unmatched parens syntax error in python tokenize (#105061) 2023-05-30 22:52:52 +01:00
Pablo Galindo Salgado 9216e69a87
gh-105069: Add a readline-like callable to the tokenizer to consume input iteratively (#105070) 2023-05-30 22:43:34 +01:00
Marta Gómez Macías 96fff35325
gh-105017: Include CRLF lines in strings and column numbers (#105030)
Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
2023-05-28 15:15:53 +01:00
Marta Gómez Macías 86d8f48935
gh-105017: Fix including additional NL token when using CRLF (#105022)
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2023-05-27 16:50:43 +00:00
Petr Vaněk 6e62eb2e70
Fix indentation in Parser/tokenizer.c (#105012) 2023-05-27 12:41:50 +01:00
Jelle Zijlstra ba73473f4c
gh-104799: Move location of type_params AST fields (#104828)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2023-05-26 05:54:37 -07:00
Stepfen Shawn 705e387dd8
Fix typo in the tokenizer (#104950) 2023-05-26 02:50:33 +00:00
Lysandros Nikolaou c90a862cdc
gh-104866: Tokenize should emit NEWLINE after exiting block with comment (#104870) 2023-05-24 17:18:17 +01:00