Commit Graph

902 Commits

Author SHA1 Message Date
Pablo Galindo c6483c9896
Raise specialised syntax error for invalid lambda parameters (GH-20776) 2020-06-10 14:07:06 +01:00
Pablo Galindo 9f495908c5
bpo-40903: Handle multiple '=' in invalid assignment rules in the PEG parser (GH-20697)
Automerge-Triggered-By: @pablogsal
2020-06-07 18:57:00 -07:00
Pablo Galindo 972ab03276
bpo-40904: Fix segfault in the new parser with f-string containing yield statements with no value (GH-20701) 2020-06-08 01:47:37 +01:00
Pablo Galindo 2e6593db00
bpo-40880: Fix invalid read in newline_in_string in pegen.c (#20666)
* bpo-40880: Fix invalid read in newline_in_string in pegen.c

* Update Parser/pegen/pegen.c

Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>

* Add NEWS entry

Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
2020-06-06 00:52:27 +01:00
Pablo Galindo a54096e305
bpo-40883: Fix memory leak in fstring_compile_expr in parse_string.c (GH-20667) 2020-06-06 00:52:15 +01:00
Victor Stinner fa7ab6aa0f
bpo-40826: Add _PyOS_InterruptOccurred(tstate) function (GH-20599)
my_fgets() now calls _PyOS_InterruptOccurred(tstate) to check for
pending signals, rather calling PyOS_InterruptOccurred().

my_fgets() is called with the GIL released, whereas
PyOS_InterruptOccurred() must be called with the GIL held.

test_repl: use text=True and avoid SuppressCrashReport in
test_multiline_string_parsing().

Fix my_fgets() on Windows: fgets(fp) does crash if fileno(fp) is closed.
2020-06-03 14:39:59 +02:00
Victor Stinner c353764fd5
bpo-40826: Fix GIL usage in PyOS_Readline() (GH-20579)
Fix GIL usage in PyOS_Readline(): lock the GIL to set an exception.

Pass tstate to my_fgets() and _PyOS_WindowsConsoleReadline(). Cleanup
these functions.
2020-06-01 20:59:35 +02:00
Shantanu c116c94ff1
bpo-40614: Respect feature version for f-string debug expressions (GH-20196)
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
2020-05-27 21:30:38 +01:00
Lysandros Nikolaou 526e23f153
Refactor error handling code in Parser/pegen/pegen.c (GH-20440)
Set p->error_indicator in various places, where it's needed, but it's
not done.

Automerge-Triggered-By: @gvanrossum
2020-05-27 09:04:11 -07:00
Pablo Galindo 1cf15af9a6 bpo-40217: Ensure Py_VISIT(Py_TYPE(self)) is always called for PyType_FromSpec types (reverts GH-19414) (GH-20264)
Heap types now always visit the type in tp_traverse. See added docs for details.

This reverts commit 0169d3003b.

Automerge-Triggered-By: @encukou
2020-05-27 02:03:38 -07:00
Pablo Galindo 404b23b85b
Fix lookahead of soft keywords in the PEG parser (GH-20436)
Automerge-Triggered-By: @gvanrossum
2020-05-26 16:15:52 -07:00
Guido van Rossum b45af1a569
Add soft keywords (GH-20370)
These are like keywords but they only work in context; they are not reserved except when there is an exact match.

This would enable things like match statements without reserving `match` (which would be bad for the `re.match()` function and probably lots of other places).

Automerge-Triggered-By: @gvanrossum
2020-05-26 10:58:44 -07:00
Ammar Askar a2bbedc8b1
Fix peg_generator compiler warnings under MSVC (GH-20405) 2020-05-26 05:33:35 +01:00
Lysandros Nikolaou f7b1e46156
bpo-38964: Print correct filename on a SyntaxError in an fstring (GH-20399)
When a `SyntaxError` in the expression part of a fstring is found,
the filename attribute of the `SyntaxError` is always `<fstring>`.
With this commit, it gets changed to always have the name of the file
the fstring resides in.

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-05-26 01:32:18 +01:00
Pablo Galindo deb4355a37
bpo-40750: Do not expand the new parser debug flags if Py_BUILD_CORE is not defined (GH-20393) 2020-05-25 20:17:12 +01:00
Pablo Galindo 800a35c623
bpo-40750: Support -d flag in the new parser (GH-20340) 2020-05-25 18:38:45 +01:00
Rémi Lapeyre c73914a562
bpo-36290: Fix keytword collision handling in AST node constructors (GH-12382) 2020-05-24 22:12:57 +01:00
Pablo Galindo b23d7adfdf
Use Py_ssize_t for the column number in the PEG support code (GH-20341) 2020-05-24 06:01:34 +01:00
Lysandros Nikolaou ae14583302
bpo-40334: Produce better error messages for non-parenthesized genexps (GH-20153)
The error message, generated for a non-parenthesized generator expression
in function calls, was still the generic `invalid syntax`, when the generator expression wasn't appearing as the first argument in the call. With this patch, even on input like `f(a, b, c for c in d, e)`, the correct error message gets produced.
2020-05-22 01:56:52 +01:00
Batuhan Taskaya b8a65ec1d3
bpo-40715: Reject dict unpacking on dict comprehensions (GH-20292)
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
2020-05-21 23:39:56 +01:00
Batuhan Taskaya 72e0aa2fd2
bpo-40176: Improve error messages for trailing comma on from import (GH-20294) 2020-05-21 21:41:58 +01:00
Pablo Galindo ced4e5c227
Regenerate the parser (#20195) 2020-05-18 23:47:51 +02:00
Lysandros Nikolaou 75b863aa97
bpo-40334: Reproduce error message for type comments on bare '*' in the new parser (GH-20151) 2020-05-18 20:14:47 +01:00
Batuhan Taskaya 63b8e0cba3
bpo-40528: Improve AST generation script to do builds simultaneously (GH-19968)
- Switch from getopt to argparse.
- Removed the limitation of not being able to produce both C and H simultaneously.

This will make it run faster since it parses the asdl definition once and uses the generated tree to generate both the header and the C source.
2020-05-18 18:42:10 +01:00
Lysandros Nikolaou 7b7a21bc4f
bpo-40661: Fix segfault when parsing invalid input (GH-20165)
Fix segfaults when parsing very complex invalid input, like `import äˆ ð£„¯ð¢·žð±‹á”€ð””ð‘©±å®ä±¬ð©¾\n𗶽`.

Co-authored-by: Guido van Rossum <guido@python.org>
Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
2020-05-18 18:32:03 +01:00
Lysandros Nikolaou 2c8cd06afe
bpo-40334: Improvements to error-handling code in the PEG parser (GH-20003)
The following improvements are implemented in this commit:
- `p->error_indicator` is set, in case malloc or realloc fail.
- Avoid memory leaks in the case that realloc fails.
- Call `PyErr_NoMemory()` instead of `PyErr_Format()`, because it requires no memory.

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-05-17 04:19:23 +01:00
Pablo Galindo 16ab07063c
bpo-40334: Correctly identify invalid target in assignment errors (GH-20076)
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
2020-05-15 02:04:52 +01:00
Lysandros Nikolaou ce21cfca7b
bpo-40618: Disallow invalid targets in augassign and except clauses (GH-20083)
This commit fixes the new parser to disallow invalid targets in the
following scenarios:
- Augmented assignments must only accept a single target (Name,
  Attribute or Subscript), but no tuples or lists.
- `except` clauses should only accept a single `Name` as a target.

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-05-14 21:13:50 +01:00
Pablo Galindo bcc3036095
bpo-40619: Correctly handle error lines in programs without file mode (GH-20090) 2020-05-14 21:11:48 +01:00
Lysandros Nikolaou a15c9b3a05
bpo-40334: Always show the caret on SyntaxErrors (GH-20050)
This commit fixes SyntaxError locations when the caret is not displayed,
by doing the following:

- `col_number` always gets set to the location of the offending
  node/expr. When no caret is to be displayed, this gets achieved
  by setting the object holding the error line to None.

- Introduce a new function `_PyPegen_raise_error_known_location`,
  which can be called, when an arbitrary `lineno`/`col_offset`
  needs to be passed. This function then gets used in the grammar
  (through some new macros and inline functions) so that SyntaxError
  locations of the new parser match that of the old.
2020-05-13 20:36:27 +01:00
Serhiy Storchaka 74ea6b5a75
bpo-40593: Improve syntax errors for invalid characters in source code. (GH-20033) 2020-05-12 12:42:04 +03:00
Shantanu 27c0d9b54a
bpo-40334: produce specialized errors for invalid del targets (GH-19911) 2020-05-11 14:53:58 -07:00
Pablo Galindo 5b956ca42d
bpo-40585: Normalize errors messages in codeop when comparing them (GH-20030)
With the new parser, the error message contains always the trailing
newlines, causing the comparison of the repr of the error messages
in codeop to fail. This commit makes the new parser mirror the old parser's
behaviour regarding trailing newlines.
2020-05-11 01:41:26 +01:00
Pablo Galindo ac7a92cc0a
bpo-40334: Avoid collisions between parser variables and grammar variables (GH-19987)
This is for the C generator:
- Disallow rule and variable names starting with `_`
- Rename most local variable names generated by the parser to start with `_`

Exceptions:
- Renaming `p` to `_p` will be a separate PR
- There are still some names that might clash, e.g.
  - anything starting with `Py`
  - C reserved words (`if` etc.)
  - Macros like `EXTRA` and `CHECK`
2020-05-09 21:34:50 -07:00
Joannah Nanjekye d10091aa17
bpo-40502: Initialize n->n_col_offset (GH-19988)
* initialize n->n_col_offset

* 📜🤖 Added by blurb_it.

* Move initialization

Co-authored-by: nanjekyejoannah <joannah.nanjekye@ibm.com>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
2020-05-08 17:58:28 -03:00
Pablo Galindo db9163ceef
bpo-40555: Check for p->error_indicator in loop rules after the main loop is done (GH-19986) 2020-05-08 03:38:44 +01:00
Lysandros Nikolaou 4638c64295
bpo-40334: Error message for invalid default args in function call (GH-19973)
When parsing something like `f(g()=2)`, where the name of a default arg
is not a NAME, but an arbitrary expression, a specialised error message
is emitted.
2020-05-07 11:44:06 +01:00
Lysandros Nikolaou 2f37c355ab
bpo-40334: Fix error location upon parsing an invalid string literal (GH-19962)
When parsing a string with an invalid escape, the old parser used to
point to the beginning of the invalid string. This commit changes the new
parser to match that behaviour, since it's currently pointing to the
end of the string (or to be more precise, to the beginning of the next
token).
2020-05-07 11:37:51 +01:00
Pablo Galindo 470aac4d8e
bpo-40334: Generate comments in the parser code to improve debugging (GH-19966) 2020-05-06 23:14:43 +01:00
Pablo Galindo 99db2a1db7
bpo-40334: Allow trailing comma in parenthesised context managers (GH-19964) 2020-05-06 22:54:34 +01:00
Lysandros Nikolaou 999ec9ab6a
bpo-40334: Add type to the assignment rule in the grammar file (GH-19963) 2020-05-06 19:11:04 +01:00
Batuhan Taskaya 091951a67c
bpo-40528: Improve and clear several aspects of the ASDL definition code for the AST (GH-19952) 2020-05-06 15:29:32 +01:00
Lysandros Nikolaou 846d8b28ab
bpo-40246: Revert reporting of invalid string prefixes (GH-19888)
Due to backwards compatibility concerns regarding keywords immediately followed by a string without whitespace between them (like in `bg="#d00" if clear else"#fca"`) will fail to parse,
commit 41d5b94af4 has to be reverted.
2020-05-04 12:32:18 +01:00
Lysandros Nikolaou e10e7c771b
bpo-40334: Spacialized error message for invalid args after bare '*' (GH-19865)
When parsing things like `def f(*): pass` the old parser used to output `SyntaxError: named arguments must follow bare *`, which the new parser wasn't able to do.
2020-05-04 11:58:31 +01:00
Shantanu c3f001461d
bpo-40491: Fix typo in syntax error for numeric literals (GH-19893) 2020-05-04 11:13:30 +03:00
Shantanu 603d354626
bpo-40493: fix function type comment parsing (GH-19894)
The grammar for func_type_input rejected things like `(*t1) ->t2`. This fixes that.

Automerge-Triggered-By: @gvanrossum
2020-05-03 22:08:14 -07:00
Lysandros Nikolaou 7f06af684a
bpo-40334: Set error_indicator in _PyPegen_raise_error (GH-19887)
Due to PyErr_Occurred not being called at the beginning of each rule, we need to set the error indicator, so that rules do not get expanded after an exception has been thrown
2020-05-04 01:20:09 +01:00
Lysandros Nikolaou 03b7642265
bpo-40334: Make the PyPegen* and PyParser* APIs more consistent (GH-19839)
This commit makes both APIs more consistent by doing the following:
- Remove the `PyPegen_CodeObjectFrom*` functions, which weren't used 
  and will probably not be needed. Functions like `Py_CompileStringObject`
  can be used instead.
- Include a `const char *filename` parameter in `PyPegen_ASTFromString`.
- Rename `PyPegen_ASTFromFile` to `PyPegen_ASTFromFilename`, because
  its signature is not the same with `PyParser_ASTFromFile`.
2020-05-01 18:30:51 +01:00
Guido van Rossum d9d6eadf00
Ensure that tok->type_comments is set on every path (GH-19828) 2020-05-01 17:42:32 +01:00
Guido van Rossum 3941d9700b
bpo-40334: Refactor lambda_parameters similar to parameters (GH-19830) 2020-05-01 17:42:03 +01:00
Pablo Galindo d955241469
bpo-40334: Correct return value of func_type_comment (GH-19833) 2020-05-01 08:32:09 -07:00
Batuhan Taskaya 76c1b4d5c5
bpo-40334: Improve column offsets for thrown syntax errors by Pegen (GH-19782) 2020-05-01 14:13:43 +01:00
Pablo Galindo b796b3fb48
bpo-40334: Simplify type handling in the PEG c_generator (GH-19818) 2020-05-01 12:32:26 +01:00
Lysandros Nikolaou 3e0a6f37df
bpo-40334: Add support for feature_version in new PEG parser (GH-19827)
`ast.parse` and `compile` support a `feature_version` parameter that
tells the parser to parse the input string, as if it were written in
an older Python version.
The `feature_version` is propagated to the tokenizer, which uses it
to handle the three different stages of support for `async` and
`await`. Additionally, it disallows the following at parser level:
- The '@' operator in < 3.5
- Async functions in < 3.5
- Async comprehensions in < 3.6
- Underscores in numeric literals in < 3.6
- Await expression in < 3.5
- Variable annotations in < 3.6
- Async for-loops in < 3.5
- Async with-statements in < 3.5
- F-strings in < 3.6

Closes we-like-parsers/cpython#124.
2020-04-30 20:27:52 -07:00
Guido van Rossum c001c09e90
bpo-40334: Support type comments (GH-19780)
This implements full support for # type: <type> comments, # type: ignore <stuff> comments, and the func_type parsing mode for ast.parse() and compile().

Closes https://github.com/we-like-parsers/cpython/issues/95.

(For now, you need to use the master branch of mypy, since another issue unique to 3.9 had to be fixed there, and there's no mypy release yet.)

The only thing missing is `feature_version=N`, which is being tracked in https://github.com/we-like-parsers/cpython/issues/124.
2020-04-30 12:12:19 -07:00
Pablo Galindo 4db245ee9d
bpo-40334: refactor and cleanup for the PEG generators (GH-19775) 2020-04-29 10:42:21 +01:00
Lysandros Nikolaou 6d65087655
bpo-40334: Disallow invalid single statements in the new parser (GH-19774)
After parsing is done in single statement mode, the tokenizer buffer has to be checked for additional lines and a `SyntaxError` must be raised, in case there are any.

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-04-29 02:42:27 +01:00
Pablo Galindo 2208134918
bpo-40334: Explicitly cast to int in pegen.c to fix a compiler warning (GH-19779) 2020-04-29 02:04:06 +01:00
Lysandros Nikolaou 37af21b667
bpo-40334: Fix shifting of nested f-strings in the new parser (GH-19771)
`JoinedStr`s and `FormattedValue also needs to be shifted, in order to correctly compute the location information of nested f-strings.
2020-04-29 01:43:50 +01:00
Lysandros Nikolaou d55133f49f
bpo-40334: Catch E_EOF error, when the tokenizer returns ERRORTOKEN (GH-19743)
An E_EOF error was only being caught after the parser exited before this commit. There are some cases though, where the tokenizer returns ERRORTOKEN *and* has set an E_EOF error (like when EOF directly follows a line continuation character) which weren't correctly handled before.
2020-04-28 01:23:35 +01:00
Pablo Galindo b94dbd7ac3
bpo-40334: Support PyPARSE_DONT_IMPLY_DEDENT in the new parser (GH-19736) 2020-04-27 18:35:58 +01:00
Pablo Galindo 2b74c835a7
bpo-40334: Support CO_FUTURE_BARRY_AS_BDFL in the new parser (GH-19721)
This commit also allows to pass flags to the new parser in all interfaces and fixes a bug in the parser generator that was causing to inline rules with actions, making them disappear.
2020-04-27 18:02:07 +01:00
Pablo Galindo 9f27dd3e16
Use Py_ssize_t instead of ssize_t (GH-19685) 2020-04-24 01:13:33 +01:00
Lysandros Nikolaou ebebb6429c
bpo-40334: Improve various PEG-Parser related stuff (GH-19669)
The changes in this commit are all related to @vstinner's original review comments of the initial PEP 617 implementation PR.
2020-04-23 16:36:06 +01:00
Pablo Galindo 1df5a9e88c
bpo-40334: Fix build errors and warnings in test_peg_generator (GH-19672) 2020-04-23 12:42:13 +01:00
Pablo Galindo ee40e4b856
bpo-40334: Don't downcast from Py_ssize_t to int (GH-19671) 2020-04-23 03:43:08 +01:00
Pablo Galindo 0b7829e089
Compile extensions in test_peg_generator with C99 (GH-19668) 2020-04-23 03:24:25 +01:00
Pablo Galindo 458004bf79
bpo-40334: Fix errors in parse_string.c with old compilers (GH-19666) 2020-04-23 00:13:47 +01:00
Pablo Galindo c5fc156852
bpo-40334: PEP 617 implementation: New PEG parser for CPython (GH-19503)
Co-authored-by: Guido van Rossum <guido@python.org>
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
2020-04-22 23:29:27 +01:00
Pablo Galindo 11a7f158ef
bpo-40335: Correctly handle multi-line strings in tokenize error scenarios (GH-19619)
Co-authored-by: Guido van Rossum <gvanrossum@gmail.com>
2020-04-21 01:53:04 +01:00
Lysandros Nikolaou 9a4b38f66b
bpo-40267: Fix message when last input character produces a SyntaxError (GH-19521)
When there is a SyntaxError after reading the last input character from
the tokenizer and if no newline follows it, the error message used to be
`unexpected EOF while parsing`, which is wrong.
2020-04-15 11:22:10 -07:00
Victor Stinner 4a21e57fe5
bpo-40268: Remove unused structmember.h includes (GH-19530)
If only offsetof() is needed: include stddef.h instead.

When structmember.h is used, add a comment explaining that
PyMemberDef is used.
2020-04-15 02:35:41 +02:00
Victor Stinner 62183b8d6d
bpo-40268: Remove explicit pythread.h includes (#19529)
Remove explicit pythread.h includes: it is always included
by Python.h.
2020-04-15 02:04:42 +02:00
Victor Stinner e5014be049
bpo-40268: Remove a few pycore_pystate.h includes (GH-19510) 2020-04-14 17:52:15 +02:00
Victor Stinner 81a7be3fa2
bpo-40268: Rename _PyInterpreterState_GET_UNSAFE() (GH-19509)
Rename _PyInterpreterState_GET_UNSAFE() to _PyInterpreterState_GET()
for consistency with _PyThreadState_GET() and to have a shorter name
(help to fit into 80 columns).

Add also "assert(tstate != NULL);" to the function.
2020-04-14 15:14:01 +02:00
Victor Stinner 4a3fe08353
bpo-40268: Include explicitly pycore_interp.h (GH-19505)
pycore_pystate.h no longer includes pycore_interp.h:
it's now included explicitly in files accessing PyInterpreterState.
2020-04-14 14:26:24 +02:00
Lysandros Nikolaou 41d5b94af4
bpo-40246: Report a better error message for invalid string prefixes (GH-19476) 2020-04-12 19:21:00 +01:00
Pablo Galindo 168660b547
bpo-40141: Add line and column information to ast.keyword nodes (GH-19283) 2020-04-02 00:47:39 +01:00
Alexander Riccio 51e3e450fb
bpo-40020: Fix realloc leak on failure in growable_comment_array_add (GH-19083)
Fix a leak and subsequent crash in parsetok.c caused by realloc misuse on a rare codepath. 

Realloc returns a null pointer on failure, and then growable_comment_array_deallocate crashes later when it dereferences it.
2020-03-30 23:15:59 +02:00
Victor Stinner 87d3b9db4a
bpo-39882: Add _Py_FatalErrorFormat() function (GH-19157) 2020-03-25 19:27:36 +01:00
Serhiy Storchaka bace59d8b8
bpo-39999: Improve compatibility of the ast module. (GH-19056)
* Re-add removed classes Suite, slice, Param, AugLoad and AugStore.
* Add docstrings for dummy classes.
* Add docstrings for attribute aliases.
* Set __module__ to "ast" instead of "_ast".
2020-03-22 20:33:34 +02:00
Serhiy Storchaka 6b97598fb6
bpo-39988: Remove ast.AugLoad and ast.AugStore node classes. (GH-19038) 2020-03-17 23:41:08 +02:00
Batuhan Taşkaya 4ab362cec6
bpo-39638: Keep ASDL signatures in the AST nodes (GH-18515) 2020-03-16 10:12:53 +02:00
Batuhan Taşkaya 8689209e03
bpo-39969: Remove ast.Param node class as is no longer used (GH-19020) 2020-03-15 19:32:17 +00:00
Serhiy Storchaka 13d52c2686
bpo-34822: Simplify AST for subscription. (GH-9605)
* Remove the slice type.
* Make Slice a kind of the expr type instead of the slice type.
* Replace ExtSlice(slices) with Tuple(slices, Load()).
* Replace Index(value) with a value itself.

All non-terminal nodes in AST for expressions are now of the expr type.
2020-03-10 18:52:34 +02:00
Serhiy Storchaka b7e9525f9c
bpo-36287: Make ast.dump() not output optional fields and attributes with default values. (GH-18843)
The default values for optional fields and attributes of AST nodes are now set
as class attributes (e.g. Constant.kind is set to None).
2020-03-10 00:07:47 +02:00
xatier d7a04a8425
Fix typo in the parser generator (GH-18603) 2020-03-09 02:58:24 +00:00
Victor Stinner 9e5d30cc99
bpo-39882: Py_FatalError() logs the function name (GH-18819)
The Py_FatalError() function is replaced with a macro which logs
automatically the name of the current function, unless the
Py_LIMITED_API macro is defined.

Changes:

* Add _Py_FatalErrorFunc() function.
* Remove the function name from the message of Py_FatalError() calls
  which included the function name.
* Update tests.
2020-03-07 00:54:20 +01:00
Batuhan Taşkaya d82e469048
bpo-39639: Remove the AST "Suite" node and associated code (GH-18513)
The AST "Suite" node is no longer used and it can be removed from the ASDL definition and related structures (compiler, visitors, ...).

Co-Authored-By: Victor Stinner <vstinner@python.org>
Co-authored-by: Brett Cannon <54418+brettcannon@users.noreply.github.com>
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-03-04 16:16:46 +00:00
Andy Lester 384f3c536d
closes bpo-39721: Fix constness of members of tok_state struct. (GH-18600)
The function PyTokenizer_FromUTF8 from Parser/tokenizer.c had a comment:

    /* XXX: constify members. */

This patch addresses that.

In the tok_state struct:
    * end and start were non-const but could be made const
    * str and input were const but should have been non-const

Changes to support this include:
    * decode_str() now returns a char * since it is allocated.
    * PyTokenizer_FromString() and PyTokenizer_FromUTF8() each creates a
        new char * for an allocate string instead of reusing the input
        const char *.
    * PyTokenizer_Get() and tok_get() now take const char ** arguments.
    * Various local vars are const or non-const accordingly.

I was able to remove five casts that cast away constness.
2020-02-27 18:44:52 -08:00
Serhiy Storchaka 0cc6b5e559
bpo-39219: Fix SyntaxError attributes in the tokenizer. (GH-17828)
* Always set the text attribute.
* Correct the offset attribute for non-ascii sources.
2020-02-12 12:17:00 +02:00
Victor Stinner f3e7ea5b8c
bpo-39500: Document PyUnicode_IsIdentifier() function (GH-18397)
PyUnicode_IsIdentifier() does not call Py_FatalError() anymore if the
string is not ready.
2020-02-11 14:29:33 +01:00
Brandt Bucher d2f9667264
bpo-38823: Fix refleaks in _ast initialization error path (GH-17276) 2020-02-06 15:45:46 +01:00
Pablo Galindo 45cf5db587
Allow pgen to produce a DOT format dump of the grammar (GH-18005)
Originally suggested by Anthony Shaw.
2020-01-14 22:32:55 +00:00
Emmanuel Arias d23f78267a Remove unused functions in Parser/parsetok.c (GH-17365) 2020-01-13 11:58:52 +00:00
Alex Henrie 7ba6f18de2 bpo-39307: Fix memory leak on error path in parsetok (GH-17953) 2020-01-13 10:35:47 +00:00
Pablo Galindo 5ec91f78d5
bpo-39209: Manage correctly multi-line tokens in interactive mode (GH-17860) 2020-01-06 15:59:09 +00:00
Steve Dower a9d0a6a1b9
bpo-36500: Simplify PCbuild/build.bat and prevent path separator changing in comments (GH-17644) 2019-12-17 14:14:13 -08:00
Batuhan Taşkaya 109fc2792a bpo-38673: dont switch to ps2 if the line starts with comment or whitespace (GH-17421)
https://bugs.python.org/issue38673
2019-12-08 20:36:27 -08:00
Vinay Sajip 9def81aa52
bpo-36876: Moved Parser/listnode.c statics to interpreter state. (GH-16328) 2019-11-07 10:08:58 +00:00