SF patch #1015989
The basic idea of this patch is to compute lineno attributes for all AST nodes. The actual
implementation lead to a lot of restructing and code cleanup.
The generated AST nodes now have an optional lineno argument to constructor. Remove the
top-level asList(), since it didn't seem to serve any purpose. Add an __iter__ to ast nodes.
Use isinstance() instead of explicit type tests.
Change transformer to use the new lineno attribute, which replaces three lines of code with one.
Use universal newlines so that we can get rid of special-case code for line endings. Use
lookup_node() in a few more frequently called, but simple com_xxx methods(). Change string
exception to class exception.
unicodedata.east_asian_width(). You can still implement your own
simple width() function using it like this:
def width(u):
w = 0
for c in unicodedata.normalize('NFC', u):
cwidth = unicodedata.east_asian_width(c)
if cwidth in ('W', 'F'): w += 2
else: w += 1
return w
The script was originally used to create the initial set of
codecs (and these were (c) CNRI). While the script itself still
is (c) CNRI, the output certainly isn't anymore.
iswide() for east asian width manipulation. (Inspired by David
Goodger, Reviewed by Martin v. Loewis)
- Move _PyUnicode_TypeRecord.flags to the end of the struct so that
no padding is added for UCS-4 builds. (Suggested by Martin v. Loewis)
(Code contributed by Jiwon Seo.)
The documentation portion of the patch is being re-worked and will be
checked-in soon. Likewise, PEP 289 will be updated to reflect Guido's
rationale for the design decisions on binding behavior (as described in
in his patch comments and in discussions on python-dev).
The test file, test_genexps.py, is written in doctest format and is
meant to exercise all aspects of the the patch. Further additions are
welcome from everyone. Please stress test this new feature as much as
possible before the alpha release.
option is not given. If dbfile isn't given and can't be retrieved
from the optionsdb, just initialize it to the first element in
RGB_TXT.
Backport candidate.
* Delimiter mismatch now prints a warning instead of raising an exception.
* Offer style warnings for use of e.g. and i.e.
* Bypass false positive warnings for forward slashes in urls and in /rfc822.
* Put non-LaTex delimiter matching first to make -d option more reliable.
* Added more LaTex cmds from the docs.
* Blocked forward-slash warnings with delimiters-only option.
* Put help message on shorter line to fit an 80 char screen.
I'm finding some pretty baffling output, like reprs consisting entirely
of three left parens. At least this will let us know what type the object
is (it's not str -- there's no quote character in the repr).
New tool combinerefs.py, to combine the two output blocks produced via
PYTHONDUMPREFS.
The Py2.3 updates to the pyclbr module return both Class and Function
objects. The IDLE ClassBrowser module only knew about Class and could
not handle objects which did not define "super".
Fixed by adding a guard.
externally visible name of the module. This is so that type names can be
shown as "Carbon.File.FSSpec" even though the real name of the module is
"_File".
The bug is a reference to co_first_lineno that should be
co_firstlineno. The only other substantial change is to speed up
localtrace_count() by avoiding *costly* calls to inspect module.
It's trivial to get the filename and lineno directly from the frame.
Otherwise, delete commented out debug code and reflow very long lines.
Mark writes in private email:
"Modules listed in the registry was a dumb idea. This whole scheme
can die. AFAIK, no one in the world uses it (including win32all
since the last build)."
(See also SF #643711)
per PEP 291 (although there are currently string methods used).
This patch makes it compatible with 2.2, at least, by detecting
universal newline support.
get PEP-252 style objects in stead of old-fashioned objects.
In stead of defining a GetattrHook you declare a class variable getsetlist,
which contains tuples (name, getcode, setcode, docstring).
Only lightly tested: the code still works if you don't inherit PEP252Mixin
and the code works if you inherit it but don't define any getters
or setters. Also, this will not work together with the "poor mans inheritance"
offered by method chains, so the CF module will remain with old-style
objects until PEP253 is supported too.
contains options, drop them to get the major/minor content type.
Modified from the supplied patch to support more whitespace variation.
Closes SF patch #613605.
build(): Fix the logic here for calculating fallbacks if the dbfile
isn't parseable.
main(): Fix the semantics for -d/--database; this should override any
database value found in the .pynche file.
Update some comments, and author contact info.
Bump to v1.4
Whitespace normalization.
(with one small bugfix in bgen/bgen/scantools.py)
This replaces string module functions with string methods
for the stuff in the Tools directory. Several uses of
string.letters etc. are still remaining.
[ 587993 ] SET_LINENO killer
Remove SET_LINENO. Tracing is now supported by inspecting co_lnotab.
Many sundry changes to document and adapt to this change.
where it was: it is really a configuration file, not a normal module.
By moving it into Mac/Lib we can now also store the location of bgen
itself in there, which is needed because bgen isn't installed.
* globaltrace_lt - handle case where inspect.getmodulename doesn't return
anything useful
* localtrace_trace - handle case where inspect.getframeinfo doesn't return
any context info
I think both of the last two are caused by exec'd or eval'd code
defined and the default was "pre" instead of "sre". Give up on 1.5.2
compatibility, hardcode the sre solution. However, this XXX comment
still applies, AFAIK:
# XXX This code depends on internals of the regular expression
# engine! There's no standard API to do a substitution when you
# have already found the match. One should be added.
The staticforward define was needed to support certain broken C
compilers (notably SCO ODT 3.0, perhaps early AIX as well) botched the
static keyword when it was used with a forward declaration of a static
initialized structure. Standard C allows the forward declaration with
static, and we've decided to stop catering to broken C compilers. (In
fact, we expect that the compilers are all fixed eight years later.)
I'm leaving staticforward and statichere defined in object.h as
static. This is only for backwards compatibility with C extensions
that might still use it.
XXX I haven't updated the documentation.
1. BUGFIX: In function makefile(), strip blanks from the nodename.
This is necesary to match the behavior of parser.makeref() and
parser.do_node().
2. BUGFIX fixed KeyError in end_ifset (well, I may have just made
it go away, rather than fix it)
3. BUGFIX allow @menu and menu items inside @ifset or @ifclear
4. Support added for:
@uref URL reference
@image image file reference (see note below)
@multitable output an HTML table
@vtable
5. Partial support for accents, to match MAKEINFO output
6. I added a new command-line option, '-H basename', to specify
HTML Help output. This will cause three files to be created
in the current directory:
`basename`.hhp HTML Help Workshop project file
`basename`.hhc Contents file for the project
`basename`.hhk Index file for the project
When fed into HTML Help Workshop, the resulting file will be
named `basename`.chm.
7. A new class, HTMLHelp, to accomplish item 6.
8. Various calls to HTMLHelp functions.
A NOTE ON IMAGES: Just as 'outputdirectory' must exist before
running this program, all referenced images must already exist
in outputdirectory.
FLD: wrapped some long lines.
Not sure this is better in all cases.
parse(): Fixed a bug in the output; the dict is referred to in the
code as `countries' not `country'. Also added no-case-fold for the
string "U.S." since the Virgin Islands name no longer wraps those in
parentheses.
main(): Fixed the argument parsing to agree with the docstring, i.e.
--outputdict instead of --output.
In the module docstring:
- updated my email address
- we don't need to explain about Python 1.5 regexps <wink>
We also don't need to wrap the import of re with a try/except.
Other style fixes:
- untabification
- revert back to <> style everywhere (and consistently)
This patch replaces string module functions with string
methods in the Tools/world/world scripts.
It also updates two outdated URLs and the countrycodes
dictionary.
It fixes a bug where result of string.find() was checked
for truth instead of compared with -1.
It also replaces <> with != in two spots.
Assorted crashes on Windows and Linux when trying to display a very
long calltip, most likely a Tk bug. Wormed around by clamping the
calltip display to a maximum of 79 characters (why 79? why not ...).
Bugfix candidate, for all Python releases.
pymalloc, apparently. Fixed, but this means all bgen-generated modules will
have to be re-generated.
I hope (and expect) that the pymalloc fixes aren't bugfix candidates, because
if they are this is one too.
The problem was that an exception can occur in the text.get() call or
in the write() call, when the text buffer contains non-ASCII
characters. This causes the previous contents of the file to be lost.
The provisional fix is to call str(self.text.get(...)) *before*
opening the file, so that if the exception occurs, we never open the
file.
Two orthogonal better solutions have to wait for policy decisions:
1. We could try to encode the data as Latin-1 or as UTF-8; but that
would require IDLE to grow a notion of file encoding which requires
more thought.
2. We could make backups before overwriting a file. This requires
more thought because it needs to be fast and cross-platform and
configurable.
The cause seems to be that when a file URL doesn't exist,
urllib.urlopen() raises OSError instead of IOError. Simply add this
to the except clause. Not elegant, but effective. :-)
(With slight cosmetic improvements to shorten lines and a grammar fix
to a docstring.)
This addes -X and -E options to freeze. From the docstring:
-X module Like -x, except the module can never be imported by
the frozen binary.
-E: Freeze will fail if any modules can't be found (that
were not excluded using -x or -X).
The strerror attribute contained only partial information about the
exception and produced some very confusing error messages. By passing
err (the exception object itself) and letting it convert itself to a
string, the error messages are better.
compile() becomes replacement for builtin compile()
compileFile() generates a .pyc from a .py
both are exported in __init__
compiler.parse() gets optional second argument to specify compilation
mode, e.g. single, eval, exec
Add AbstractCompileMode as parent class and Module, Expression, and
Interactive as concrete subclasses. Each corresponds to a compilation
mode.
THe AbstractCompileMode instances in turn delegate to CodeGeneration
subclasses specialized for their particular functions --
ModuleCodeGenerator, ExpressionCodeGeneration,
InteractiveCodeGenerator.
The argument properties are ordered from easiest to hardest. The
harder the arg, the more complicated that code that must be generated
to return it from getChildren() and/or getChildNodes(). The old
calculation routine was bogus, because it always set hardest_arg to
the hardness of the last argument. Now use max() to always set it to
the hardness of the hardest argument.
Remove the only test in the syntax module. It ends up that the
transformer must handle this error case.
In the transformer, check for a list compression in com_assign_list()
by looking for a list_for node where a comma is expected.
In pycodegen.compile() re-raise the SyntaxError rather than catching
it and exiting
Invoke compiler.syntax.check() after building AST. If a SyntaxError
occurs, print the error and exit without generating a .pyc file.
Refactor code to use compiler.misc.set_filename() rather than passing
filename argument around to each CodeGenerator instance.
introspection incompatibility, but in fact it's just that calltips
always gave up on a docstring that started with a newline (but
didn't realize they were giving up <wink>).
Remove the option to have nested scopes or old LGB scopes. This has a
large impact on the code base, by removing the need for two variants
of each CodeGenerator.
Add a get_module() method to CodeGenerator objects, used to get the
future features for the current module.
Set CO_GENERATOR, CO_GENERATOR_ALLOWED, and CO_FUTURE_DIVISION flags
as appropriate.
Attempt to fix the value of nlocals in newCodeObject(), assuming that
nlocals is 0 if CO_NEWLOCALS is not defined.
operators per line or statement are now on by default, and -m turns
these warnings off.
- Change the way multiple / operators are reported; a regular
recommendation is always emitted after the warning.
- Report ambiguous warnings (both int|long and float|complex used for
the same operator).
- Update the doc string again to clarify all this and describe the
possible messages more precisely.
percolated out, and some general cleanup. The output is still the
same, except it now prints "Index: <file>" instead of "Processing:
<file>", so that the output can be used as input for patch (but only
the diff-style parts of it).
Fix list comp code generation -- emit GET_ITER instead of Const(0)
after the list.
Add CO_GENERATOR flag to generators.
Get CO_xxx flags from the new module
try/except or try/finally.
Previous versions had only track SETUP_LOOP blocks and ignored the
exception part. This meant that it allowed continue inside a
try/except but generated buggy code. Now it does the right thing.
As the doc string for _lookupName() explains:
This routine uses a list instead of a dictionary, because a
dictionary can't store two different keys if the keys have the
same value but different types, e.g. 2 and 2L. The compiler
must treat these two separately, so it does an explicit type
comparison before comparing the values.
Avoid if/elif/elif/else tests where the final else is supposed to
handle exactly one case instead of all other cases. When the list of
operators is extended, the catchall else treats all new operators as
the last operator in the set of tests. Instead, raise an exception if
an unexpected operator occurs.
Use a dictionary instead of a list to map objects to their offsets in
a const/name tuple of a code object.
XXX The conversion is perhaps incomplete, in that we shouldn't have to
do the list2dict to start.
Add support for floor division (// and //=)
The implementation of getChildren() and getChildNodes() is intended to
be faster, because it avoids calling flatten() on every return value.
But it's not clear that it is a lot faster, because constructing a
tuple with just the right values ends up being slow. (Too many
attribute lookups probably.)
The ast.txt file is much more complicated, with funny characters at
the ends of names (*, &, !) to indicate the types of each child node.
The astgen script is also much more complex, making me wonder if it's
still useful.
varnames should list all the local variables (with arguments first).
The XXX_NAME ops typically occur at the module level and assignment
ops should create locals.
(Hard to believe these were never handled before)
Add misc.mangle() that mangles based on the rules in compile.c.
XXX Need to test the corner cases
Update CodeGenerator with a class_name attribute bound to None. If a
particular instance is created within a class scope, the instance's
class_name is bound to that class's name.
Add mangle() method to CodeGenerator that mangles if the class_name
has a class_name in it.
Modify the FunctionCodeGenerator family to handle an extra argument--
the class_name.
Wrap all name ops and attrnames in calls to self.mangle()
Make nested scopes enabled by default
Add is_constant_false() helper so that compiled code and symbols are
consistent with builtin compiler's handling of "if 0:"
Fix doc string handling to be consistent with recent change that
eliminates the doc string from the Module's node attribute.
Add fix to print handling from Evan & Shane.
Track change to visitor api by making "verbose" explicit.
Comment out setting CO_NESTED flag (it's unnecessary in 2.2).
Evan Simpson's fix. And his explanation:
If you defined two nested functions in a row that refer to the
same non-global variable, the second one will be generated as
though the variable were global.