cpython

Commit Graph

Author	SHA1	Message	Date
Lysandros Nikolaou	f46333b9f5	gh-107450: Remove unnecessary overflow check in parser error handler (#110940)	2023-10-16 22:41:01 +02:00
Lysandros Nikolaou	a1ac5590e0	gh-107450: Check for overflow in the tokenizer and fix overflow test (#110832) Co-authored-by: Filipe Laíns <lains@riseup.net> Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>	2023-10-16 16:42:49 +02:00
Lysandros Nikolaou	fb7843ee89	gh-107450: Raise OverflowError when parser column offset overflows (#110754)	2023-10-12 09:34:12 +00:00
Lysandros Nikolaou	01481f2dc1	gh-104169: Refactor tokenizer into lexer and wrappers (#110684) * The lexer, which include the actual lexeme producing logic, goes into the `lexer` directory. * The wrappers, one wrapper per input mode (file, string, utf-8, and readline), go into the `tokenizer` directory and include logic for creating a lexer instance and managing the buffer for different modes. Co-authored-by: Pablo Galindo <pablogsal@gmail.com> Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>	2023-10-11 15:14:44 +00:00
Pablo Galindo Salgado	b28ffaa193	gh-109596: Ensure repeated rules in the grammar are not allowed and fix incorrect soft keywords (#109606)	2023-09-22 19:03:23 +01:00
Dennis Sweeney	86617518c4	gh-108179: Add error message for parser stack overflows (#108256)	2023-08-22 08:41:50 +01:00
Victor Stinner	c5afc97fc2	gh-106320: Remove private _PyErr C API functions (#106356) Remove private _PyErr C API functions: move them to the internal C API (pycore_pyerrors.h).	2023-07-03 10:48:50 +00:00
Marta Gómez Macías	6715f91edc	gh-102856: Python tokenizer implementation for PEP 701 (#104323) This commit replaces the Python implementation of the tokenize module with an implementation that reuses the real C tokenizer via a private extension module. The tokenize module now implements a compatibility layer that transforms tokens from the C tokenizer into Python tokenize tokens for backward compatibility. As the C tokenizer does not emit some tokens that the Python tokenizer provides (such as comments and non-semantic newlines), a new special mode has been added to the C tokenizer mode that currently is only used via the extension module that exposes it to the Python layer. This new mode forces the C tokenizer to emit these new extra tokens and add the appropriate metadata that is needed to match the old Python implementation. Co-authored-by: Pablo Galindo <pablogsal@gmail.com>	2023-05-21 01:03:02 +01:00
Lysandros Nikolaou	9169a56fad	gh-103656: Transfer f-string buffers to parser to avoid use-after-free (GH-103896) Co-authored-by: Pablo Galindo <pablogsal@gmail.com>	2023-04-27 01:33:31 +00:00
Pablo Galindo Salgado	1ef61cf71a	gh-102856: Initial implementation of PEP 701 (#102855) Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com> Co-authored-by: Batuhan Taskaya <isidentical@gmail.com> Co-authored-by: Marta Gómez Macías <mgmacias@google.com> Co-authored-by: sunmy2019 <59365878+sunmy2019@users.noreply.github.com>	2023-04-19 11:18:16 -05:00
Pablo Galindo Salgado	97e7004cfe	gh-100050: Fix an assertion error when raising unclosed parenthesis errors in the tokenizer (GH-100065) Automerge-Triggered-By: GH:pablogsal	2022-12-06 15:09:56 -08:00
Lysandros Nikolaou	cbf0afd8a1	gh-97973: Return all necessary information from the tokenizer (GH-97984) Right now, the tokenizer only returns type and two pointers to the start and end of the token. This PR modifies the tokenizer to return the type and set all of the necessary information, so that the parser does not have to this.	2022-10-06 16:07:17 -07:00
Christian Heimes	b4c857d0fd	gh-95876: Fix format string in pegen error location code (#95877)	2022-08-11 09:55:57 +01:00
Paul m. p. Peny	bbb2ab70b6	[3.11] bpo-14916: interactive fd is not tied to stdin [type-bug] (#91469) * bpo-14916: interactive fd is not always stdin related to https://github.com/python/cpython/pull/31006 merged bugfix following https://bugs.python.org/issue14916 * 📜🤖 Added by blurb_it. Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>	2022-07-16 09:35:19 +01:00
Pablo Galindo Salgado	36fcde61ba	gh-94360: Fix a tokenizer crash when reading encoded files with syntax errors from stdin (#94386) * gh-94360: Fix a tokenizer crash when reading encoded files with syntax errors from stdin Signed-off-by: Pablo Galindo <pablogsal@gmail.com> * nitty nit Co-authored-by: Łukasz Langa <lukasz@langa.pl>	2022-07-05 17:39:21 +01:00
Pablo Galindo Salgado	26cca8067b	bpo-47117: Don't crash if we fail to decode characters when the tokenizer buffers are uninitialized (GH-32129) Automerge-Triggered-By: GH:pablogsal	2022-03-26 09:29:02 -07:00
Pablo Galindo Salgado	650720a0cf	Fix the caret position in some syntax errors in interactive mode (GH-30718)	2022-01-20 15:34:13 +00:00
Pablo Galindo Salgado	8c2fd09f36	bpo-46339: Include clarification on assert in 'get_error_line_from_tokenizer_buffers' (#30545)	2022-01-18 11:13:00 +00:00
Pablo Galindo Salgado	cedec19be8	bpo-46339: Fix crash in the parser when computing error text for multi-line f-strings (GH-30529) Automerge-Triggered-By: GH:pablogsal	2022-01-11 08:30:39 -08:00
Pablo Galindo Salgado	70f415fb8b	bpo-46240: Correct the error for unclosed parentheses when the tokenizer is not finished (GH-30378)	2022-01-04 10:41:22 +00:00
Pablo Galindo Salgado	24c10d2943	bpo-45727: Only trigger the 'did you forgot a comma' error suggestion if inside parentheses (GH-29757)	2021-11-24 22:21:23 +00:00
Pablo Galindo Salgado	4f006a789a	Ensure the str member of the tokenizer is always initialised (GH-29681)	2021-11-21 02:06:39 +00:00
Pablo Galindo Salgado	c9c4444d9f	Refactor parser compilation units into specific components (GH-29676)	2021-11-21 01:08:50 +00:00

23 Commits