diff --git a/Doc/library/token.rst b/Doc/library/token.rst
index 4bf15d5a81c..b7ca9dbca72 100644
--- a/Doc/library/token.rst
+++ b/Doc/library/token.rst
@@ -101,18 +101,37 @@ The token constants are:
           AWAIT
           ASYNC
           ERRORTOKEN
-          COMMENT
-          NL
-          ENCODING
           N_TOKENS
           NT_OFFSET
 
-.. versionchanged:: 3.5
-   Added :data:`AWAIT` and :data:`ASYNC` tokens. Starting with
-   Python 3.7, "async" and "await" will be tokenized as :data:`NAME`
-   tokens, and :data:`AWAIT` and :data:`ASYNC` will be removed.
-
-.. versionchanged:: 3.7
-   Added :data:`COMMENT`, :data:`NL` and :data:`ENCODING` to bring
-   the tokens in the C code in line with the tokens needed in
-   :mod:`tokenize` module. These tokens aren't used by the C tokenizer.
\ No newline at end of file
+
+The following token type values aren't used by the C tokenizer but are needed for
+the :mod:`tokenize` module.
+
+.. data:: COMMENT
+
+   Token value used to indicate a comment.
+
+
+.. data:: NL
+
+   Token value used to indicate a non-terminating newline. The
+   :data:`NEWLINE` token indicates the end of a logical line of Python code;
+   ``NL`` tokens are generated when a logical line of code is continued over
+   multiple physical lines.
+
+
+.. data:: ENCODING
+
+   Token value that indicates the encoding used to decode the source bytes
+   into text. The first token returned by :func:`tokenize.tokenize` will
+   always be an ``ENCODING`` token.
+
+
+.. versionchanged:: 3.5
+   Added :data:`AWAIT` and :data:`ASYNC` tokens. Starting with
+   Python 3.7, "async" and "await" will be tokenized as :data:`NAME`
+   tokens, and :data:`AWAIT` and :data:`ASYNC` will be removed.
+
+.. versionchanged:: 3.7
+   Added :data:`COMMENT`, :data:`NL` and :data:`ENCODING` tokens.
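
A quick illustration of the ``NL``/``NEWLINE`` distinction and the leading
``ENCODING`` token documented above (an illustrative sketch, not part of the
patch; the ``source`` literal is invented, the APIs are the stdlib ones):

    import io
    import tokenize

    # One logical line continued over two physical lines, plus a comment.
    source = b"total = (1 +\n         2)\n# done\n"

    for tok in tokenize.tokenize(io.BytesIO(source).readline):
        print(tokenize.tok_name[tok.type], repr(tok.string))

The first token printed is ``ENCODING``; the line break inside the
parentheses shows up as ``NL``, the end of the logical line as ``NEWLINE``,
and the comment line as ``COMMENT`` followed by ``NL``.
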
diff --git a/Doc/library/tokenize.rst b/Doc/library/tokenize.rst
index ff55aacbd44..cd27a101a8f 100644
--- a/Doc/library/tokenize.rst
+++ b/Doc/library/tokenize.rst
@@ -17,7 +17,7 @@ as well, making it useful for implementing "pretty-printers," including
 colorizers for on-screen displays.
 
 To simplify token stream handling, all :ref:`operators` and :ref:`delimiters`
-tokens are returned using the generic :data:`token.OP` token type. The exact
+tokens are returned using the generic :data:`~token.OP` token type. The exact
 type can be determined by checking the ``exact_type`` property on the
 :term:`named tuple` returned from :func:`tokenize.tokenize`.
 
@@ -44,7 +44,7 @@ The primary entry point is a :term:`generator`:
 
    The returned :term:`named tuple` has an additional property named
    ``exact_type`` that contains the exact operator type for
-   :data:`token.OP` tokens. For all other token types ``exact_type``
+   :data:`~token.OP` tokens. For all other token types ``exact_type``
    equals the named tuple ``type`` field.
 
    .. versionchanged:: 3.1
@@ -58,26 +58,7 @@ The primary entry point is a :term:`generator`:
 
 
 All constants from the :mod:`token` module are also exported from
-:mod:`tokenize`, as are three additional token type values:
-
-.. data:: COMMENT
-
-   Token value used to indicate a comment.
-
-
-.. data:: NL
-
-   Token value used to indicate a non-terminating newline. The NEWLINE token
-   indicates the end of a logical line of Python code; NL tokens are generated
-   when a logical line of code is continued over multiple physical lines.
-
-
-.. data:: ENCODING
-
-   Token value that indicates the encoding used to decode the source bytes
-   into text. The first token returned by :func:`.tokenize` will always be an
-   ENCODING token.
-
+:mod:`tokenize`.
 
 Another function is provided to reverse the tokenization process.
 This is useful for creating tools that tokenize a script, modify the token stream, and
@@ -96,8 +77,8 @@ write back the modified script.
    token type and token string as the spacing between tokens (column
   positions) may change.
 
-   It returns bytes, encoded using the ENCODING token, which is the first
-   token sequence output by :func:`.tokenize`.
+   It returns bytes, encoded using the :data:`~token.ENCODING` token, which
+   is the first token sequence output by :func:`.tokenize`.
 
 :func:`.tokenize` needs to detect the encoding of source files it tokenizes. The
 function it uses to do this is available:
@@ -115,7 +96,7 @@ function it uses to do this is available:
 
    It detects the encoding from the presence of a UTF-8 BOM or an encoding
    cookie as specified in :pep:`263`. If both a BOM and a cookie are present,
-   but disagree, a SyntaxError will be raised. Note that if the BOM is found,
+   but disagree, a :exc:`SyntaxError` will be raised. Note that if the BOM is found,
    ``'utf-8-sig'`` will be returned as an encoding.
 
    If no encoding is specified, then the default of ``'utf-8'`` will be
@@ -147,8 +128,8 @@ function it uses to do this is available:
        3
 
 Note that unclosed single-quoted strings do not cause an error to be
-raised. They are tokenized as ``ERRORTOKEN``, followed by the tokenization of
-their contents.
+raised. They are tokenized as :data:`~token.ERRORTOKEN`, followed by the
+tokenization of their contents.
 
 
 .. _tokenize-cli:
@@ -260,7 +241,7 @@ the name of the token, and the final column is the value of the token (if any)
     4,11-4,12:          NEWLINE        '\n'
     5,0-5,0:            ENDMARKER      ''
 
-The exact token type names can be displayed using the ``-e`` option:
+The exact token type names can be displayed using the :option:`-e` option:
 
 .. code-block:: sh
 
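
A sketch of the ``exact_type`` property, the :func:`untokenize` round trip,
and :func:`detect_encoding`, all touched by the tokenize.rst changes above
(illustrative only; the ``a += 1`` and coding-cookie inputs are invented):

    import io
    import tokenize

    source = b"a += 1\n"
    tokens = list(tokenize.tokenize(io.BytesIO(source).readline))

    # All operators and delimiters share the generic OP type; exact_type
    # names the specific operator (PLUSEQUAL here).
    for tok in tokens:
        if tok.type == tokenize.OP:
            print(tokenize.tok_name[tok.exact_type], repr(tok.string))

    # untokenize() consumes the same stream, ENCODING token first, and
    # returns bytes; for this simple input the round trip is exact.
    assert tokenize.untokenize(tokens) == source

    # detect_encoding() reads at most two lines and reports the encoding
    # declared by a PEP 263 coding cookie ('latin-1' is normalized to
    # 'iso-8859-1').
    cookie = b"# -*- coding: latin-1 -*-\nx = 1\n"
    encoding, lines = tokenize.detect_encoding(io.BytesIO(cookie).readline)
    print(encoding)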