Commit Graph

759 Commits

Author SHA1 Message Date
Pablo Galindo 45cf5db587
Allow pgen to produce a DOT format dump of the grammar (GH-18005)
Originally suggested by Anthony Shaw.
2020-01-14 22:32:55 +00:00
Emmanuel Arias d23f78267a Remove unused functions in Parser/parsetok.c (GH-17365) 2020-01-13 11:58:52 +00:00
Alex Henrie 7ba6f18de2 bpo-39307: Fix memory leak on error path in parsetok (GH-17953) 2020-01-13 10:35:47 +00:00
Pablo Galindo 5ec91f78d5
bpo-39209: Manage correctly multi-line tokens in interactive mode (GH-17860) 2020-01-06 15:59:09 +00:00
Steve Dower a9d0a6a1b9
bpo-36500: Simplify PCbuild/build.bat and prevent path separator changing in comments (GH-17644) 2019-12-17 14:14:13 -08:00
Batuhan Taşkaya 109fc2792a bpo-38673: dont switch to ps2 if the line starts with comment or whitespace (GH-17421)
https://bugs.python.org/issue38673
2019-12-08 20:36:27 -08:00
Vinay Sajip 9def81aa52
bpo-36876: Moved Parser/listnode.c statics to interpreter state. (GH-16328) 2019-11-07 10:08:58 +00:00
Max Bernstein bdac32e9fe closes bpo-38648: Remove double tp_free slot in Python-ast.c. (GH-17002)
This looks like a typo due to copy-paste.
2019-10-30 18:08:06 -07:00
Vinay Sajip 0b60f64e43
bpo-11410: Standardize and use symbol visibility attributes across POSIX and Windows. (GH-16347) 2019-10-15 08:26:12 +01:00
Dong-hee Na a05fcd3c7a bpo-38425: Fix ‘res’ may be used uninitialized warning (GH-16688) 2019-10-10 09:41:26 +02:00
Eddie Elizondo 3368f3c6ae bpo-38140: Make dict and weakref offsets opaque for C heap types (#16076)
* Make dict and weakref offsets opaque for C heap types

* Add news
2019-09-19 17:29:05 +01:00
Eddie Elizondo 0247e80f3c Fix leaks in Python-ast.c (#16127) 2019-09-14 14:38:17 +01:00
Zackery Spytz 421a72af4d bpo-21120: Exclude Python-ast.h, ast.h and asdl.h from the limited API (#14634)
The PyArena type is not part of the limited API, so these headers
shouldn't be part of it either.
2019-09-12 10:27:14 +01:00
Dino Viehland ac46eb4ad6 bpo-38113: Update the Python-ast.c generator to PEP384 (gh-15957)
Summary: This mostly migrates Python-ast.c to PEP384 and removes all statics from the whole file. This modifies the generator itself that generates the Python-ast.c. It leaves in the usage of _PyObject_LookupAttr even though it's not fully PEP384 compatible (this could always be shimmed in by anyone who needs it).
2019-09-11 18:16:34 +01:00
Serhiy Storchaka 43c9731334 bpo-38083: Minor improvements in asdl_c.py and Python-ast.c. (GH-15824)
* Use the const qualifier for constant C strings.
* Intern field and attribute names.
* Temporary incref a borrowed reference to a list item.
2019-09-10 03:02:30 -07:00
Greg Price fa3a38d81f Mark files as executable that are meant as scripts. (GH-15354)
This is the converse of GH-15353 -- in addition to plenty of
scripts in the tree that are marked with the executable bit
(and so can be directly executed), there are a few that have
a leading `#!` which could let them be executed, but it doesn't
do anything because they don't have the executable bit set.

Here's a command which finds such files and marks them.  The
first line finds files in the tree with a `#!` line *anywhere*;
the next-to-last step checks that the *first* line is actually of
that form.  In between we filter out files that already have the
bit set, and some files that are meant as fragments to be
consumed by one or another kind of preprocessor.

    $ git grep -l '^#!' \
      | grep -vxFf <( \
          git ls-files --stage \
          | perl -lane 'print $F[3] if (!/^100644/)' \
        ) \
      | grep -ve '\.in$' -e '^Doc/includes/' \
      | while read f; do
          head -c2 "$f" | grep -qxF '#!' \
          && chmod a+x "$f"; \
        done
2019-09-09 07:16:33 -07:00
Pablo Galindo c638521dbf
Fix typo in the algorithm description (GH-15774) 2019-09-09 15:08:23 +01:00
Shashi Ranjan 43710b67b3 Fix typos in the documentation of Parser/pgen (GH-15416)
Co-Authored-By: Antoine <43954001+awecx@users.noreply.github.com>
2019-08-24 19:07:24 +01:00
Pablo Galindo 71876fa438
Refactor Parser/pgen and add documentation and explanations (GH-15373)
* Refactor Parser/pgen and add documentation and explanations

To improve the readability and maintainability of the parser
generator perform the following transformations:

    * Separate the metagrammar parser in its own class to simplify
      the parser generator logic.

    * Create separate classes for DFAs and NFAs and move methods that
      act exclusively on them from the parser generator to these
      classes.

    * Add docstrings and comment documenting the process to go from
      the grammar file into NFAs and then DFAs. Detail some of the
      algorithms and give some background explanations of some concepts
      that will helps readers not familiar with the parser generation
      process.

    * Select more descriptive names for some variables and variables.

    * PEP8 formatting and quote-style homogenization.

The output of the parser generator remains the same (Include/graminit.h
and Python/graminit.c remain untouched by running the new parser generator).
2019-08-22 02:38:39 +01:00
Hansraj Das 69f37bcb28 Indent code inside if block. (GH-15284)
Without indendation, seems like strcpy line is parallel to `if` condition.
2019-08-15 09:19:07 -07:00
Anthony Sottile 5b94f3578c Fix `SyntaxError` indicator printing too many spaces for multi-line strings (GH-14433) 2019-07-29 14:59:13 +01:00
Hansraj Das e018dc52d1 Remove duplicate call to strip method in Parser/pgen/token.py (GH-14938) 2019-07-24 21:31:19 +01:00
Pablo Galindo cd6e83b481 bpo-37593: Swap the positions of posonlyargs and args in the constructor of ast.parameters nodes (GH-14778)
https://bugs.python.org/issue37593
2019-07-14 16:32:18 -07:00
Victor Stinner 022ac0a497
bpo-37253: Remove PyAST_obj2mod_ex() function (GH-14020)
PyAST_obj2mod_ex() is similar to PyAST_obj2mod() with an additional
'feature_version' parameter which is unused.
2019-06-13 09:18:45 +02:00
Jeroen Demeyer 530f506ac9 bpo-36974: tp_print -> tp_vectorcall_offset and tp_reserved -> tp_as_async (GH-13464)
Automatically replace
tp_print -> tp_vectorcall_offset
tp_compare -> tp_as_async
tp_reserved -> tp_as_async
2019-05-30 19:13:39 -07:00
Eric V. Smith 6f6ff8a565
bpo-37050: Remove expr_text from FormattedValue ast node, use Constant node instead (GH-13597)
When using the "=" debug functionality of f-strings, use another Constant node (or a merged constant node) instead of adding expr_text to the FormattedValue node.
2019-05-27 15:31:52 -04:00
Steve Dower b82e17e626
bpo-36842: Implement PEP 578 (GH-12613)
Adds sys.audit, sys.addaudithook, io.open_code, and associated C APIs.
2019-05-23 08:45:22 -07:00
Michael J. Sullivan d8a82e2897 bpo-36878: Only allow text after `# type: ignore` if first character ASCII (GH-13504)
This disallows things like `# type: ignoreé`, which seems wrong.

Also switch to using Py_ISALNUM for the alnum check, for consistency
with other code (and maybe correctness re: locale issues?).


https://bugs.python.org/issue36878
2019-05-22 13:43:36 -07:00
Michael J. Sullivan 933e1509ec bpo-36878: Track extra text added to 'type: ignore' in the AST (GH-13479)
GH-13238 made extra text after a # type: ignore accepted by the parser.
This finishes the job and actually plumbs the extra text through the
parser and makes it available in the AST.
2019-05-22 15:54:20 +01:00
Matthias Bussonnier 565b4f1ac7 bpo-34616: Add PyCF_ALLOW_TOP_LEVEL_AWAIT to allow top-level await (GH-13148)
Co-Authored-By: Yury Selivanov <yury@magic.io>
2019-05-21 16:12:02 -04:00
Anthony Sottile abea73bf4a bpo-2180: Treat line continuation at EOF as a `SyntaxError` (GH-13401)
This makes the parser consistent with the tokenize module (already the case
in `pypy`).

sample
------

```python
x = 5\
```

before
------

```console
$ python3 t.py
$ python3 -mtokenize t.py
t.py:2:0: error: EOF in multi-line statement
```

after
-----

```console
$ ./python t.py
  File "t.py", line 3
    x = 5\

         ^
SyntaxError: unexpected EOF while parsing
$ ./python -m tokenize t.py
t.py:2:0: error: EOF in multi-line statement
```



https://bugs.python.org/issue2180
2019-05-18 11:27:16 -07:00
Michael J. Sullivan d8320ecb86 bpo-36878: Allow extra text after `# type: ignore` comments (GH-13238)
In the parser, when using the type_comments=True option, recognize
a TYPE_IGNORE as anything containing `# type: ignore` followed by
a non-alphanumeric character. This is to allow ignores such as
`# type: ignore[E1000]`.
2019-05-11 19:17:24 +01:00
Eric V. Smith 9a4135e939
bpo-36817: Add f-string debugging using '='. (GH-13123)
If a "=" is specified a the end of an f-string expression, the f-string will evaluate to the text of the expression, followed by '=', followed by the repr of the value of the expression.
2019-05-08 16:28:48 -04:00
Pablo Galindo 8c77b8cb91
bpo-36540: PEP 570 -- Implementation (GH-12701)
This commit contains the implementation of PEP570: Python positional-only parameters.

* Update Grammar/Grammar with new typedarglist and varargslist

* Regenerate grammar files

* Update and regenerate AST related files

* Update code object

* Update marshal.c

* Update compiler and symtable

* Regenerate importlib files

* Update callable objects

* Implement positional-only args logic in ceval.c

* Regenerate frozen data

* Update standard library to account for positional-only args

* Add test file for positional-only args

* Update other test files to account for positional-only args

* Add News entry

* Update inspect module and related tests
2019-04-29 13:36:57 +01:00
Inada Naoki 09415ff0eb
fix warnings by adding more const (GH-12924) 2019-04-23 20:39:37 +09:00
tyomitch 84b4784f12 use `const` in graminit.c (GH-12713) 2019-04-23 18:29:57 +09:00
Pablo Galindo f2cf1e3e28
bpo-36623: Clean parser headers and include files (GH-12253)
After the removal of pgen, multiple header and function prototypes that lack implementation or are unused are still lying around.
2019-04-13 17:05:14 +01:00
Zackery Spytz cda139d1de bpo-36459: Fix a possible double PyMem_FREE() due to tokenizer.c's tok_nextc() (12601)
Remove the PyMem_FREE() call added in cb90c89.  The buffer will be
freed when PyTokenizer_Free() is called on the tokenizer state.
2019-03-28 15:53:00 +02:00
Pablo Galindo 91759d9801
bpo-36143: Regenerate Lib/keyword.py from the Grammar and Tokens file using pgen (GH-12456)
Now that the parser generator is written in Python (Parser/pgen) we can make use of it to regenerate the Lib/keyword file that contains the language keywords instead of parsing the autogenerated grammar files. This also allows checking in the CI that the autogenerated files are up to date.
2019-03-25 22:01:12 +00:00
Emmanuel Arias ed5e29cba5 bpo-36385: Add ``elif`` sentence on to avoid multiple ``if`` (GH-12478)
Currently, when arguments on Parser/asdl_c.py are parsed
``ìf`` sentence is used. This PR Propose to use ``elif``
to avoid multiple evaluting of the ifs.





https://bugs.python.org/issue36385
2019-03-20 21:39:17 -07:00
Pablo Galindo cb90c89de1
bpo-36367: Free buffer if realloc fails in tokenize.c (GH-12442) 2019-03-19 17:17:58 +00:00
Guido van Rossum 10f8ce6688 bpo-36280: Add Constant.kind field (GH-12295)
The value is a string for string and byte literals, None otherwise.
It is 'u' for u"..." literals, 'b' for b"..." literals, '' for "..." literals.
The 'r' (raw) prefix is ignored.
Does not apply to f-strings.

This appears sufficient to make mypy capable of using the stdlib ast module instead of typed_ast (assuming a mypy patch I'm working on).

WIP: I need to make the tests pass. @ilevkivskyi @serhiy-storchaka 



https://bugs.python.org/issue36280
2019-03-13 13:00:46 -07:00
tyomitch 1b304f992d Remove d_initial from the parser as it is unused (GH-12212)
d_initial, the first state of a particular DFA in the parser has always been initialized to 0 in the old pgen as well as the new pgen. As this value is not used and the first state of each DFA is assumed to be the first element in the array representing it, remove d_initial from the parser to reduce complexity.
2019-03-09 15:35:50 +00:00
Guido van Rossum 495da29225 bpo-35975: Support parsing earlier minor versions of Python 3 (GH-12086)
This adds a `feature_version` flag to `ast.parse()` (documented) and `compile()` (hidden) that allow tweaking the parser to support older versions of the grammar. In particular if `feature_version` is 5 or 6, the hacks for the `async` and `await` keyword from PEP 492 are reinstated. (For 7 or higher, these are unconditionally treated as keywords, but they are still special tokens rather than `NAME` tokens that the parser driver recognizes.)



https://bugs.python.org/issue35975
2019-03-07 12:38:08 -08:00
Serhiy Storchaka d8b3a98c90
bpo-36187: Remove NamedStore. (GH-12167)
NamedStore has been replaced with Store. The difference between
NamedStore and Store is handled when precess the NamedExpr node
one level upper.
2019-03-05 20:42:06 +02:00
Pablo Galindo 8bc401a55c
Clean implementation of Parser/pgen and fix some style issues (GH-12156) 2019-03-04 07:26:13 +00:00
Alex Gaynor 8589f14bbe
Remove some code which has been dead since 1994 (#12136) 2019-03-01 23:37:34 -05:00
Pablo Galindo 1f24a719e7
bpo-35808: Retire pgen and use pgen2 to generate the parser (GH-11814)
Pgen is the oldest piece of technology in the CPython repository, building it requires various #if[n]def PGEN hacks in other parts of the code and it also depends more and more on CPython internals. This commit removes the old pgen C code and replaces it for a new version implemented in pure Python. This is a modified and adapted version of lib2to3/pgen2 that can generate grammar files compatibles with the current parser.

This commit also eliminates all the #ifdef and code branches related to pgen, simplifying the code and making it more maintainable. The regen-grammar step now uses $(PYTHON_FOR_REGEN) that can be any version of the interpreter, so the new pgen code maintains compatibility with older versions of the interpreter (this also allows regenerating the grammar with the current CI solution that uses Python3.5). The new pgen Python module also makes use of the Grammar/Tokens file that holds the token specification, so is always kept in sync and avoids having to maintain duplicate token definitions.
2019-03-01 15:34:44 -08:00
Pablo Galindo b9d2e97601
Fix potential memory leak in parsetok.c (GH-11832) 2019-02-13 00:45:53 +00:00
Guido van Rossum d2b4c19d53
bpo-35879: Fix type comment leaks (GH-11728)
* Fix leak for # type: ignore
* Fix the type comment leak
2019-02-01 15:28:13 -08:00