There are two copies of the grammar -- the one used by Python itself as
Grammar/Grammar, and the one used by lib2to3 which has necessarily diverged at
Lib/lib2to3/Grammar.txt because it needs to support older syntax an we want it
to be reasonable stable to avoid requiring fixer rewrites.
This brings suport for syntax like `if x:= foo():` to match what the live
Python grammar does.
This should've been added at the time of the walrus operator itself, but lib2to3 being
independent is often overlooked. So we do consider this a bugfix rather than enhancement.
Capturing exceptions into names can lead to reference cycles though the __traceback__ attribute of the exceptions in some obscure cases that have been reported previously and fixed individually. As these variables are not used anyway, we can remove the binding to reduce the chances of creating reference cycles.
See for example GH-13135
* Now uses pickle protocol 4
* Doesn't wrap the grammar's `__dict__` in ordered dictionaries anymore as
dictionaries in Python 3.6+ are ordered by default
This still produces deterministic pickles (that hash the same with MD5).
Tested with different PYTHONHASHSEED values.
New tests also added.
I also made the comments in line with the builtin Grammar/Grammar. PEP 306 was
withdrawn, Kees Blom's railroad program has been lost to the sands of time for
at least 16 years now (I found a python-dev post from people looking for it).
Fix two (in my opinion) spurious failure conditions in the lib2to3.tests.test_parser.TestParserIdempotency test_parser test.
Use the same encoding found in the initial file to write a temp file for a diff. This retains the BOM if the encoding was initially utf-8-sig.
If the file cannot be parsed using the normal grammar, try again with no print statement which should succeed for valid files using future print_function
For case (1), the driver was correctly handling a BOM in a utf-8 file, but then the test was not writing a comparison file using 'utf-8-sig' to diff against, so the BOM got removed. I don't think that is the fault of the parser, and lib2to3 will retain the BOM.
For case (2), lib2to3 pre-detects the use of from __future__ import print_function or allows the user to force this interpretation with a -p flag, and then selects a different grammar with the print statement removed. That makes the test cases unfair to this test as the driver itself doesn't know which grammar to use. As a minimal fix, the test will try using a grammar with the print statement, and if that fails fall back on a grammar without it. A more thorough handling of the idempotency test would to be to parse all files using both grammars and ignore if one of the two failed but otherwise check both. I didn't think this was necessary but can change.
This is more complicated than it should be because we need to preserve the
useful mtime-based regeneration feature that lib2to3.pgen2.driver.load_grammar
has. We only look for the pickled grammar file with pkgutil.get_data and only if
the source file does not exist.
Note: this doesn't unpack f-strings into the underlying JoinedStr AST.
Ideally we'd fully implement JoinedStr here but given its additional
complexity, I think this is worth bandaiding as is. This unblocks tools like
https://github.com/google/yapf to format 3.6 syntax using f-strings.
* Allow underscores in numeric literals in lib2to3.
* Stricter literal parsing for Python 3.6 in lib2to3.pgen2.tokenize.
* Add test case for underscores in literals in Python 3.
* change LBYL key lookup to dict.setdefault
The ``results`` was constructed as a defaultdict and we could simply
delete the check ``if key not in results``. However, I think it's safer
to use dict.setdefault as I'm not sure whether the caller expects a
regular dict or defaultdict.
* add name to the acknowledgements file
* use defaultdict to make the key-lookup cleaner