* Speed-up "x in y" where x has more than one character.
The existing code made excessive calls to the expensive memcmp() function.
The new code uses memchr() to rapidly find a start point for memcmp().
In addition to knowing that the first character is a match, the new code
also checks that the last character is a match. This significantly reduces
the incidence of false starts (saving memcmp() calls and making quadratic
behavior less likely).
Improves the timings on:
python -m timeit -r7 -s"x='a'*1000" "'ab' in x"
python -m timeit -r7 -s"x='a'*1000" "'bc' in x"
Once this code has proven itself, then string_find_internal() should refer
to it rather than running its own version. Also, something similar may
apply to unicode objects.
[ 1124295 ] Function's __name__ no longer accessible in restricted mode
which I introduced with a bit of mindless copy-paste when making
__name__ writable. You can't assign to __name__ in restricted mode,
which I'm going to pretend was intentional :)
PyNumber_Check, rather than trying to convert to a float. Reimplemented
writer - now raises exceptions when it sees a quotechar but neither
doublequote or escapechar are set. Doublequote results are now more
consistent (eg, single quote should generate """", rather than "",
which is ambiguous).
when this limit is reached. Limit defaults to 128k, and is changed
by module set_field_limit() method. Previously, an unmatched quote
character could result in the entire file being read into the field
buffer, potentially exhausting virtual memory.
was done because we were previously performing validation of the dialect
from python, but this is now down within the C module. Also, the method
we were using to detect classes did not work with new-style classes.
a delimiter. Previously, the 'network location' (<authority> in RFC 2396) would
become 'www.example.com?query=spam', while RFC 2396 does not allow a '?' in
<authority>. See bug #548176 for further discussion.
`glob.glob()` currently calls itself recursively to build a list of matches of
the dirname part of the pattern and then filters by the basename part. This is
effectively BFS. ``glob.glob('*/*/*/*/*/foo')`` will build a huge list of all
directories 5 levels deep even if only a handful of them contain a ``foo``
entry. A generator-based recusion would never have to store these list at once
by implementing DFS. This patch converts the `glob` function to an `iglob`
recursive generator . `glob()` now just returns ``list(iglob(pattern))``.
I also cleaned up the code a bit (reduced duplicate `has_magic()` checks and
created a second `glob0` helper func so that the main loop need not be
duplicated).
Thanks to Cherniavsky Beni for the patch!
test_threading.test_foreign_thread(): new test does a basic check that
"foreign" threads can using the threading module, and that they create
a _DummyThread instance in at least one use case. This isn't a very
good test, since a thread created by thread.start_new_thread() isn't
particularly "foreign".
trying to return a complete line even if a size parameter was given (see
http://www.python.org/sf/1076985). This leads to buffer overflows with long
source lines under Windows if e.g. cp1252 is used as the source encoding.
This patch reverts the behaviour of readline() to something that behaves more
like Python 2.3: If a size parameter is given, read() is called only once.
As a side effect of this, readline() now supports all types of linebreaks
supported by unicode.splitlines().
Note that the tokenizer is still broken and it's possible to provoke segfaults
(see http://www.python.org/sf/1089395).
more. Thanks to Simon Percivall!
The patch makes changes to inspect.py in two places:
* the pattern to match against functions at line 436 is
modified: lambdas should be matched even if not
preceded by whitespace, as long as "lambda" isn't part
of another word.
* the BlockFinder class is heavily modified. Changes are:
- checking for "def", "class" or "lambda" names
before setting self.started to True. Then checking the
same line for word characters after the colon (if the
colon is on that line). If so, and the line does not
end with a line continuation marker, raise EndOfBlock
immediately.
- adding self.passline to show that the line is to be
included and no more checking is necessary on that
line. Since a NEWLINE token is not generated when a
line continuation marker exists, this allows getsource
to continue with these functions even if the following
line would not be indented.
Also add a bunch of
'quite-unlikely-to-occur-in-real-life-but-working-anyway' tests.
is pointless.
Also add a note to the docs for the 'test' package that test cases should check
first that any conditions needed in the operating system are met before having
a test run.
Closes bug #1077302. THanks, Ian Holsman.
(http://www.cygwin.com/faq/faq_3.html#SEC41).
Also check whether onerror has actually been called so this test will
fail on assertion instead of on trying to chmod a non-existent file.
__getitem__() methods: compute only the new spellings needed to satisfy
the given indexing object. This is purely an optimization (it should
have no effect on visible semantics).
regrtest.py: skip rgbimg and imageop as they are not built on 64-bit systems.
_tkinter.c: replace %.8x with %p for printing pointers.
setup.py: add lib64 into the library directories.
showing that doctest's pdb.set_trace() support was dramatically broken.
doctest.py _OutputRedirectingPdb.trace_dispatch(): Return a local trace
function instead of (implicitly) None. Else interaction with pdb was
bizarre, noticing only 'call' events. Amazingly, the existing set_trace()
tests didn't care.
reliably on WinME with FAT32.
* Native speaker rewrite of the comment block.
* Removed unnecessary backslashes from the multi-line function defintions.
In cyclic gc, clear weakrefs to unreachable objects before allowing any
Python code (weakref callbacks or __del__ methods) to run.
This is a critical bugfix, affecting all versions of Python since weakrefs
were introduced. I'll backport to 2.3.
exposed in header files. Fixed a few comments in these headers.
As we might have expected, writing down invariants systematically exposed a
(minor) bug. In this case, function objects have a writeable func_code
attribute, which could be set to code objects with the wrong number of
free variables. Calling the resulting function segfaulted the interpreter.
Added a corresponding test.
of the year, and day of the week. Was not taking into consideration properly
the issue of when %U is used for the week of the year but the year starts on
Monday.
Closes bug #1045381 again.
The underlying bug still exists, but also existed in 2.3.4:
import.c's load_source_module() returns NULL if
PyOS_GetLastModificationTime() returns -1, but
PyOS_GetLastModificationTime() doesn't set any exception when it returns
-1, and neither does load_source_module() when it gets back -1. This
leads to "SystemError: NULL result without error in PyObject_Call"
on an import that fails in this way.
- Added a chunk of plist data as generated by Cocoa's NSDictionary and
verify we output the same (including formatting)
- Changed the "literal" plist code to match the raw test data
Peepholer could be fooled into misidentifying a tuple_of_constants.
Added code to count consecutive occurrences of LOAD_CONST.
Use the count to weed out the misidentified cases.
Added a unittest.
Turns out the mysterious "expected output" file contained exactly N dots,
because test_poll() has a loop that *usually* went around N times,
printing one dot on each loop trip. But there's no guarantee of that,
because the exact value of N depended on the vagaries of scheduling
time.sleep()s across two different processes. So stopped printing dots,
and got rid of the expected output file. Add a loop counter instead,
and verify that the loop goes around at least a couple of times. Also
cut the minimum time needed for this test from 4 seconds to 1.
tester that a DOS box is expected to flash. Slash the sleep from 2
seconds to a quarter second (why would we want to wait 2 seconds just
to stare at a DOS box?).
what this is trying to do. If it's necessary for it to create > 1000
processes, it should be controlled by a new resource and not run by
default on Windows.
display a test's docstring as "the name" of the test. So changed most
test docstrings to comments, and removed the clearly useless ones. Now
unittest reports the actual names of the test methods.
deque_item(): a performance bug: the linked list of blocks was followed
from the left in most cases, because the test (i < (deque->len >> 1)) was
after "i %= BLOCKLEN".
deque_clear(): replaced a call to deque_len() with deque->len; not sure what
this call was here for, nor if all compilers under the sun would inline it.
deque_traverse(): I belive that it could be called by the GC when the deque
has leftblock==rightblock==NULL, because it is tracked before the first block
is allocated (though closely before). Still, a C extension module subclassing
deque could provide its own tp_alloc that could trigger a GC collection after
the PyObject_GC_Track()...
deque_richcompare(): rewrote to cleanly check for end-of-iterations instead of
relying on deque.__iter__().next() to succeed exactly len(deque) times -- an
assumption which can break if deques are subclassed. Added a test.
I wonder if the length should be explicitely bounded to INT_MAX, with
OverflowErrors, as in listobject.c. On 64-bit machines, adding more than
INT_MAX in the deque will result in trouble. (Note to anyone/me fixing
this: carefully check for overflows if len is close to INT_MAX in the
following functions: deque_rotate(), deque_item(), deque_ass_item())
request. Tim says that "correct 'fuzzy' comparison of floats cannot
be automated." (The motivation behind adding the new option
was verifying interactive examples in Python's latex documentation;
several such examples use numbers that don't print consistently on
different platforms.)
Also, add a testcase.
Formerly, the list_extend() code used several local variables to remember
its state across iterations. Since an iteration could call arbitrary
Python code, it was possible for the list state to be changed. The new
code uses dynamic structure references instead of C locals. So, they
are always up-to-date.
After list_resize() is called, its size has been updated but the new
cells are filled with NULLs. These needed to be filled before arbitrary
iteration code was called; otherwise, that code could attempt to modify
a list that was in a semi-invalid state. The solution was to change
the ob->size field back to a value reflecting the actual number of valid
cells.
When an integer is compared to a float now, the int isn't coerced to float.
This avoids spurious overflow exceptions and insane results. This should
compute correct results, without raising spurious exceptions, in all cases
now -- although I expect that what happens when an int/long is compared to
a NaN is still a platform accident.
Note that we had potential problems here even with "short" ints, on boxes
where sizeof(long)==8. There's #ifdef'ed code here to handle that, but
I can't test it as intended. I tested it by changing the #ifdef to
trigger on my 32-bit box instead.
I suppose this is a bugfix candidate, but I won't backport it. It's
long-winded (for speed) and messy (because the problem is messy). Note
that this also depends on a previous 2.4 patch that introduced
_Py_SwappedOp[] as an extern.
all examples in a given text file. (analagous to "testmod")
- Minor docstring fixes.
- Added module_relative parameter to DocTestFile/DocTestSuite, which
controls whether paths are module-relative & os-independent, or
os-specific.
- Fixed bug in handling of absolute paths.
- If run from an interactive session, make paths relative to the
directory containing sys.argv[0] (since __main__ doesn't have
a __file__ attribute).