Commit Graph

20 Commits

Author SHA1 Message Date
Marta Gómez Macías 6715f91edc
gh-102856: Python tokenizer implementation for PEP 701 (#104323)
This commit replaces the Python implementation of the tokenize module with an implementation
that reuses the real C tokenizer via a private extension module. The tokenize module now implements
a compatibility layer that transforms tokens from the C tokenizer into Python tokenize tokens for backward
compatibility.

As the C tokenizer does not emit some tokens that the Python tokenizer provides (such as comments and non-semantic newlines), a new special mode has been added to the C tokenizer mode that currently is only used via
the extension module that exposes it to the Python layer. This new mode forces the C tokenizer to emit these new extra tokens and add the appropriate metadata that is needed to match the old Python implementation.

Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
2023-05-21 01:03:02 +01:00
Victor Stinner d3ded08048
bpo-40204: Add :noindex: in the documentation (GH-21859)
Add :noindex: to duplicated documentation to fix "duplicate object
description" errors.

For example, fix this Sphinx 3 issue:

Doc/library/configparser.rst:1146: WARNING: duplicate object
description of configparser.ConfigParser.optionxform, other instance
in library/configparser, use :noindex: for one of them
2020-08-13 21:41:54 +02:00
Guido van Rossum 508ed2d912
Delete remaining references to Grammar/Grammar from docs (#21624)
(Ironically, the file itself remains, see https://github.com/we-like-parsers/cpython/issues/135.)
2020-07-26 08:27:52 -07:00
Shantanu c2f7eb254b
bpo-39718: add TYPE_IGNORE, COLONEQUAL to py38 changes in token (GH-18598) 2020-02-28 18:25:36 -05:00
Serhiy Storchaka 138ccbb022
bpo-38738: Fix formatting of True and False. (GH-17083)
* "Return true/false" is replaced with "Return ``True``/``False``"
  if the function actually returns a bool.
* Fixed formatting of some True and False literals (now in monospace).
* Replaced "True/False" with "true/false" if it can be not only bool.
* Replaced some 1/0 with True/False if it corresponds the code.
* "Returns <bool>" is replaced with "Return <bool>".
2019-11-12 16:57:03 +02:00
Guido van Rossum 495da29225 bpo-35975: Support parsing earlier minor versions of Python 3 (GH-12086)
This adds a `feature_version` flag to `ast.parse()` (documented) and `compile()` (hidden) that allow tweaking the parser to support older versions of the grammar. In particular if `feature_version` is 5 or 6, the hacks for the `async` and `await` keyword from PEP 492 are reinstated. (For 7 or higher, these are unconditionally treated as keywords, but they are still special tokens rather than `NAME` tokens that the parser driver recognizes.)



https://bugs.python.org/issue35975
2019-03-07 12:38:08 -08:00
Guido van Rossum dcfcd146f8 bpo-35766: Merge typed_ast back into CPython (GH-11645) 2019-01-31 12:40:27 +01:00
Serhiy Storchaka 8ac658114d
bpo-30455: Generate all token related code and docs from Grammar/Tokens. (GH-10370)
"Include/token.h", "Lib/token.py" (containing now some data moved from
"Lib/tokenize.py") and new files "Parser/token.c" (containing the code
moved from "Parser/tokenizer.c") and "Doc/library/token-list.inc" (included
in "Doc/library/token.rst") are now generated from "Grammar/Tokens" by
"Tools/scripts/generate_token.py". The script overwrites files only if
needed and can be used on the read-only sources tree.

"Lib/symbol.py" is now generated by "Tools/scripts/generate_symbol_py.py"
instead of been executable itself.

Added new make targets "regen-token" and "regen-symbol" which are now
dependencies of "regen-all".

The documentation contains now strings for operators and punctuation tokens.
2018-12-22 11:18:40 +02:00
Jelle Zijlstra ac317700ce bpo-30406: Make async and await proper keywords (#1669)
Per PEP 492, 'async' and 'await' should become proper keywords in 3.7.
2017-10-05 23:24:46 -04:00
Serhiy Storchaka 5cefb6cfdd bpo-25324: Move the description of tokenize tokens to token.rst. (#1911) 2017-06-06 18:43:35 +03:00
Albert-Jan Nijburg fc354f0785 bpo-25324: copy tok_name before changing it (#1608)
* add test to check if were modifying token

* copy list so import tokenize doesnt have side effects on token

* shorten line

* add tokenize tokens to token.h to get them to show up in token

* move ERRORTOKEN back to its previous location, and fix nitpick

* copy comments from token.h automatically

* fix whitespace and make more pythonic

* change to fix comments from @haypo

* update token.rst and Misc/NEWS

* change wording

* some more wording changes
2017-05-31 16:00:21 +02:00
Berker Peksag 7d1c5efed1 Issue #23322: Remove outdated reference to an example in parser docs
Initial patch by Sahil Chelaramani.
2016-08-08 13:07:08 +03:00
Terry Jan Reedy fa089b9b0b Issue #22558: Add remaining doc links to source code for Python-coded modules.
Reformat header above separator line (added if missing) to a common format.
Patch by Yoni Lavi.
2016-06-11 15:02:54 -04:00
Yury Selivanov 3dc74bf703 docs: Document ASYNC/AWAIT tokens (issue #25580)
Initial patch by SilentGhost
2015-12-17 18:26:41 -05:00
Benjamin Peterson d51374ed78 PEP 465: a dedicated infix operator for matrix multiplication (closes #21176) 2014-04-09 23:55:56 -04:00
Meador Inge ac007ba686 Issue #13632: Update token documentation to reflect actual token types 2011-12-23 22:30:16 -06:00
Raymond Hettinger a199368b23 More source links. 2011-01-27 01:20:32 +00:00
Georg Brandl eec2d768cd #8968: add actual name of token constants. 2010-10-17 09:46:11 +00:00
Georg Brandl 7f01a13e7c Last round of adapting style of documenting argument default values. 2009-09-16 15:58:14 +00:00
Georg Brandl 116aa62bf5 Move the 3k reST doc tree in place. 2007-08-15 14:28:22 +00:00