Victor Stinner
9566311014
resize_inplace() sets utf8_length to zero if the utf8 is not shared
...
Also clean up the code.
2011-10-04 01:03:50 +02:00
Victor Stinner
9e9d689d85
PyUnicode_New() sets utf8_length to zero for latin1
2011-10-04 01:02:02 +02:00
Victor Stinner
016980454e
Unicode: raise SystemError instead of ValueError or RuntimeError on invalid
...
state
2011-10-04 00:04:26 +02:00
Victor Stinner
7f11ad4594
Unicode: document when the wstr pointer is shared with data
...
Also add related assertions to _PyUnicode_CheckConsistency().
2011-10-04 00:00:20 +02:00
Victor Stinner
03490918b7
Add _PyUnicode_HAS_WSTR_MEMORY() macro
2011-10-03 23:45:12 +02:00
Victor Stinner
9ce5a835bb
PyUnicode_Join() checks output length in debug mode
...
PyUnicode_CopyCharacters() may copy fewer characters than the requested size if
the input string is shorter than the argument. (This is very unlikely, but who
knows!?)
Also avoid calling PyUnicode_CopyCharacters() if the string is empty.
2011-10-03 23:36:02 +02:00
Victor Stinner
b803895355
Fix a compiler warning in PyUnicode_Append()
...
Don't check PyUnicode_CopyCharacters() in release mode. Also rename some
variables.
2011-10-03 23:27:56 +02:00
Victor Stinner
8cfcbed4e3
Improve string forms and PyUnicode_Resize() documentation
...
Also remove the FIXME for resize_copy(): as discussed with Martin, copying the
string on resize when the string is not resizable is just fine.
2011-10-03 23:19:21 +02:00
Amaury Forgeot d'Arc
bbe7b0ad2a
Fix a few ResourceWarnings in idle
2011-10-03 20:33:24 +02:00
Victor Stinner
c3cec7868b
Add asciilib: similar to ucs1, ucs2 and ucs4 library, but specialized to ASCII
...
The ucs1, ucs2 and ucs4 libraries have to scan each created substring to find the
maximum character, whereas this is not needed for ASCII strings. Because ASCII
strings are common, it is useful to optimize for ASCII.
2011-10-05 21:24:08 +02:00
Victor Stinner
14f8f02826
Fix PyUnicode_Partition(): str_in->str_obj
2011-10-05 20:58:25 +02:00
Victor Stinner
31392e741d
Fix my_basename(): make the string ready
2011-10-05 20:14:23 +02:00
Charles-François Natali
b619bb27ed
Issue #13070: Fix a crash when a TextIOWrapper caught in a reference cycle
...
would be finalized after the reference to its underlying BufferedRWPair's
writer got cleared by the GC.
2011-10-05 19:55:56 +02:00
Victor Stinner
bb10a1f759
Ensure that newly created strings use the most efficient store in debug mode
2011-10-05 01:34:17 +02:00
Victor Stinner
4d0d54bcba
Document requirements of Unicode kinds
2011-10-05 01:31:05 +02:00
Victor Stinner
9310abbf40
Replace PyUnicodeObject* with PyObject* where it was inappropriate
2011-10-05 00:59:23 +02:00
Victor Stinner
ce5faf673e
unicodeobject.c doesn't make output strings ready in debug mode
...
Try to only create non-ready strings in debug mode to ensure that all functions
(not only in unicodeobject.c, but everywhere) make input strings ready.
2011-10-05 00:42:43 +02:00
Senthil Kumaran
55a190fbbd
merge from 3.2. Issue #13104 - Fix urllib.request.thishost() utility function.
2011-10-06 00:32:52 +08:00
Senthil Kumaran
91a076a72f
merge from 3.2. Issue #13073 - Address the review comments made by Ezio.
2011-10-05 23:27:37 +08:00
Georg Brandl
07de325672
More fixes.
2011-10-05 16:47:38 +02:00
Georg Brandl
7597addbd4
More typos.
2011-10-05 16:36:47 +02:00
Georg Brandl
c6bc4c6897
Fix a few typos in the unicode header.
2011-10-05 16:23:09 +02:00
Georg Brandl
4975a9b44d
Fix grammar.
2011-10-05 16:12:21 +02:00
Victor Stinner
c80d6d20d5
Speedup str[a:b:step] for step != 1
...
Try to stop the scanner of the maximum character before the end using a limit
depending on the kind (e.g. 256 for PyUnicode_2BYTE_KIND).
2011-10-05 14:13:28 +02:00
Victor Stinner
ae86485517
Speedup find_maxchar_surrogates() for 32-bit wchar_t
...
If we have at least one character in U+10000-U+10FFFF, we know that we must use
PyUnicode_4BYTE_KIND kind.
2011-10-05 14:02:44 +02:00
Victor Stinner
b9275c104e
Speedup str[a:b] and PyUnicode_FromKindAndData
...
* str[a:b] doesn't scan the string for the maximum character if the string
is ascii only
* PyUnicode_FromKindAndData() stops if we are sure that we cannot use a
shorter character type. For example, _PyUnicode_FromUCS1() stops if we
have at least one character in range U+0080-U+00FF
2011-10-05 14:01:42 +02:00
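Note on b9275c104e: the two shortcuts described in that message can be sketched in Python roughly as follows. This is only an illustration of the idea, not the C code from the commit; find_maxchar(), slice_maxchar() and the kind_limit parameter are hypothetical names. The value returned after an early stop is not necessarily the true maximum, only large enough to rule out a narrower storage kind.

```python
def find_maxchar(code_points, kind_limit):
    """Return a code point large enough to pick the storage kind.

    kind_limit (hypothetical parameter) is the smallest value that already
    rules out a narrower kind: 0x80 for a UCS1 buffer, 0x100 for UCS2,
    0x10000 for UCS4. The scan stops as soon as it is reached.
    """
    maxchar = 0
    for cp in code_points:
        if cp > maxchar:
            maxchar = cp
            if maxchar >= kind_limit:
                break  # a narrower kind is already ruled out
    return maxchar


def slice_maxchar(s, start, stop):
    """Upper bound on the largest code point of s[start:stop]."""
    if s.isascii():
        return 0x7F  # any slice of an ASCII string is ASCII: no scan needed
    return find_maxchar((ord(c) for c in s[start:stop]), 0x10000)
```

For a UCS1 buffer, find_maxchar([0x41, 0xE9, 0xFF], 0x80) stops at 0xE9 without looking at 0xFF, because 0xE9 already shows the string is not ASCII.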
Victor Stinner
702c734395
Speedup the ASCII decoder
...
It is faster for long strings and a little bit faster for short strings.
Benchmark on 32-bit Linux, Intel Core i5 @ 3.33GHz:
./python -m timeit 'x=b"a"' 'x.decode("ascii")'
./python -m timeit 'x=b"x"*80' 'x.decode("ascii")'
./python -m timeit 'x=b"abc"*4096' 'x.decode("ascii")'
length |   before   |   after
-------+------------+-----------
     1 | 0.234 usec | 0.229 usec
    80 | 0.381 usec | 0.357 usec
12,288 |  11.2 usec |  3.01 usec
2011-10-05 13:50:52 +02:00
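Note on 702c734395: a rough way to rerun the three measurements from a script instead of the command line. bench_ascii_decode() is a hypothetical helper; the lambda adds call overhead, so the numbers will not match the table above and are only comparable with each other.

```python
import timeit


def bench_ascii_decode(payload, number=10000):
    """Return the average time in microseconds of payload.decode("ascii")."""
    total = timeit.timeit(lambda: payload.decode("ascii"), number=number)
    return total / number * 1e6


if __name__ == "__main__":
    # Same three inputs as the commit message: 1, 80 and 12,288 bytes.
    for payload in (b"a", b"x" * 80, b"abc" * 4096):
        print(f"{len(payload):>6} | {bench_ascii_decode(payload):.3f} usec")
```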
Antoine Pitrou
00b2c86d09
Fix test failures when ctypes is not available
...
(followup to Victor's 85d11cf67aa8 and 7a50e549bd11)
2011-10-05 13:01:41 +02:00
Charles-François Natali
4637309ee6
Merge.
2011-10-04 23:37:43 +02:00
Charles-François Natali
09252c4938
os.geteuid() may not be available...
2011-10-04 23:36:49 +02:00
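Note on 09252c4938: os.geteuid() exists only on Unix-like platforms, so a root check has to tolerate its absence. A minimal sketch, assuming a helper named running_as_root() and a placeholder test class (both hypothetical, not the code from the commit):

```python
import os
import unittest


def running_as_root():
    """Return True only when os.geteuid() exists and reports uid 0."""
    geteuid = getattr(os, "geteuid", None)  # missing on e.g. Windows
    return geteuid is not None and geteuid() == 0


class UnwritableDirectoryTest(unittest.TestCase):  # hypothetical test case
    @unittest.skipIf(running_as_root(),
                     "directory permissions are not enforced for root")
    def test_unwritable_directory(self):
        pass  # placeholder for the real test body
```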
Victor Stinner
e1335c711c
Fix usage of PyUnicode_READY()
2011-10-04 20:53:03 +02:00
Victor Stinner
e06e145943
_PyUnicode_READY_REPLACE() cannot be used in unicode_subtype_new()
2011-10-04 20:52:31 +02:00
Charles-François Natali
5f99c912c8
Issue #11956: Always skip test_import.test_unwritable_directory when run as
...
root, since the semantics vary across Unix variants.
2011-10-04 20:41:52 +02:00
Victor Stinner
17efeed284
Add DONT_MAKE_RESULT_READY to unicodeobject.c to help detect bugs
...
Also use _PyUnicode_READY_REPLACE() when it's applicable.
2011-10-04 20:05:46 +02:00
Victor Stinner
6b56a7fd3d
Add assertion to _Py_ReleaseInternedUnicodeStrings() if READY fails
2011-10-04 20:04:52 +02:00
Antoine Pitrou
875f29bb95
Fix naïve heuristic in unicode slicing (followup to 1b4f886dc9e2)
2011-10-04 20:00:49 +02:00
Charles-François Natali
2b72f83877
Merge.
2011-10-04 19:20:52 +02:00
Charles-François Natali
e39b112aea
Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as
...
root (directory permissions are ignored).
2011-10-04 19:19:21 +02:00
Antoine Pitrou
d9488c6841
Merge
2011-10-04 19:11:34 +02:00
Antoine Pitrou
2242522fde
Add a necessary call to PyUnicode_READY() (followup to ab5086539ab9)
2011-10-04 19:10:51 +02:00
Antoine Pitrou
7aec401966
Optimize string slicing to use the new API
2011-10-04 19:08:01 +02:00
Ezio Melotti
a9860aeb08
#13054: fix usage of sys.maxunicode after PEP-393.
2011-10-04 19:06:00 +03:00
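Note on a9860aeb08: with PEP 393 (Python 3.3 and later) there are no longer narrow and wide builds, and sys.maxunicode is always 0x10FFFF. The snippet below only illustrates that behavior; it is not the change made in the commit, and needs_surrogate_pair() is a hypothetical helper.

```python
import sys


def needs_surrogate_pair(ch):
    """Return True if ch lies outside the Basic Multilingual Plane."""
    return ord(ch) > 0xFFFF


# On Python 3.3+ every build reports the full Unicode range.
FULL_RANGE = sys.maxunicode == 0x10FFFF

astral = "\U0001F600"
# One code point in a str, but two code units once encoded as UTF-16.
utf16_units = len(astral.encode("utf-16-le")) // 2
```

Code that used to test sys.maxunicode > 0xFFFF to detect a wide build should check the character itself instead, as needs_surrogate_pair() does.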
Victor Stinner
77bb47b312
Simplify unicode_resizable(): a singleton's reference count is at least 2
2011-10-03 20:06:05 +02:00
Charles-François Natali
8619cd7376
Issue #13001: Fix test_socket.testRecvmsgTrunc failure on FreeBSD < 8, which
...
doesn't always set the MSG_TRUNC flag when a truncated datagram is received.
2011-10-03 19:43:15 +02:00
Charles-François Natali
87b3c92b5b
Introduce support.requires_freebsd_version decorator.
2011-10-03 19:40:37 +02:00
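Note on 87b3c92b5b: a decorator of this kind could look roughly like the sketch below. It is an assumption about the general shape, not the implementation in test.support; the release-string parsing in particular is a guess at the "8.2-RELEASE" format.

```python
import functools
import platform
import unittest


def requires_freebsd_version(*min_version):
    """Skip the decorated test on FreeBSD releases older than min_version."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if platform.system() == "FreeBSD":
                # Assumed format: "8.2-RELEASE" -> "8.2"
                release = platform.release().split("-", 1)[0]
                try:
                    version = tuple(int(part) for part in release.split("."))
                except ValueError:
                    version = None  # unparsable release string: run the test
                if version is not None and version < min_version:
                    wanted = ".".join(map(str, min_version))
                    raise unittest.SkipTest(
                        f"FreeBSD version {wanted} or higher required, "
                        f"not {release}")
            return func(*args, **kwargs)
        return wrapper
    return decorator
```

A test that depends on the MSG_TRUNC behavior fixed in FreeBSD 8 would then be decorated with @requires_freebsd_version(8); on other platforms the test runs unchanged.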
Victor Stinner
85041a54bd
_PyUnicode_CheckConsistency() checks utf8 field consistency
2011-10-03 14:42:39 +02:00
Victor Stinner
3cf4637e4e
unicode_subtype_new() also copies the ascii flag
2011-10-03 14:42:15 +02:00
Victor Stinner
42dfd71333
unicode_kind_name() doesn't check consistency anymore
...
It is called from _PyUnicode_Dump() and so must not fail.
2011-10-03 14:41:45 +02:00
Victor Stinner
a3b334da6d
PyUnicode_Ready() now sets ascii=1 if maxchar < 128
...
ascii=1 is no longer reserved for PyASCIIObject. Use
PyUnicode_IS_COMPACT_ASCII(obj) to check if obj is a PyASCIIObject (as before).
2011-10-03 13:53:37 +02:00
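Note on a3b334da6d: the relationship between the maximum character, the storage width and the ascii flag can be summarized with a small Python sketch. storage_for_maxchar() is a hypothetical helper that mirrors the PEP 393 rules as described in the message; it is not the C code.

```python
def storage_for_maxchar(maxchar):
    """Return (bytes_per_char, ascii_flag) for a given largest code point."""
    if maxchar < 0x80:
        return 1, True    # ASCII: 1 byte per character, ascii=1
    if maxchar < 0x100:
        return 1, False   # Latin-1: 1 byte per character
    if maxchar < 0x10000:
        return 2, False   # BMP: 2 bytes per character
    return 4, False       # astral planes: 4 bytes per character
```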
Victor Stinner
1b4f9ceca7
Create _PyUnicode_READY_REPLACE() to reuse singleton
...
Only use _PyUnicode_READY_REPLACE() on just created strings.
2011-10-03 13:28:14 +02:00