Commit Graph

67 Commits

Author SHA1 Message Date
Walter Dörwald ee1d24703f Add a test that checks the basic functionality of every encoding. 2004-12-29 16:04:38 +00:00
Walter Dörwald e57d7b179a The changes to the stateful codecs in 2.4 resulted in StreamReader.readline()
trying to return a complete line even if a size parameter was given (see
http://www.python.org/sf/1076985). This leads to buffer overflows with long
source lines under Windows if e.g. cp1252 is used as the source encoding.
This patch reverts the behaviour of readline() to something that behaves more
like Python 2.3: If a size parameter is given, read() is called only once.

As a side effect of this, readline() now supports all types of linebreaks
supported by unicode.splitlines().

Note that the tokenizer is still broken and it's possible to provoke segfaults
(see http://www.python.org/sf/1089395).
2004-12-21 22:24:00 +00:00
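
The side effect noted in this commit (readline() recognizing every linebreak that splitlines() recognizes) can be sketched as follows. This is a minimal illustration that assumes a current Python 3 interpreter rather than the 2.4-era code the commit touched; the sample text and variable names are made up for the example.

```python
import codecs
import io

# Hypothetical sample text mixing several linebreak styles:
# bare CR, CRLF, and U+2028 (LINE SEPARATOR).
text = "one\rtwo\r\nthree\u2028four"

# Wrap a byte stream in the UTF-8 StreamReader.
reader = codecs.getreader("utf-8")(io.BytesIO(text.encode("utf-8")))

# Call readline() until it returns the empty string (end of stream).
lines = list(iter(reader.readline, ""))

# The reader is expected to split at the same places as str.splitlines().
expected = text.splitlines(keepends=True)
```

Comparing against splitlines() instead of a hard-coded list keeps the sketch tied to the claim in the commit message.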
Walter Dörwald 063e1e846d Trigger a few error cases in Modules/_codecsmodule.c. 2004-10-28 13:04:26 +00:00
Hye-Shik Chang af5c7cff56 SF #1048865: Fix a trivial typo that breaks StreamReader.readlines() 2004-10-17 23:51:21 +00:00
Walter Dörwald 69652035bc SF patch #998993: The UTF-8 and the UTF-16 stateful decoders now support
decoding incomplete input (when the input stream is temporarily exhausted).
codecs.StreamReader now implements buffering, which enables proper
readline support for the UTF-16 decoders. codecs.StreamReader.read()
has a new argument chars which specifies the number of characters to
return. codecs.StreamReader.readline() and codecs.StreamReader.readlines()
have a new argument keepends. Trailing "\n"s will be stripped from the lines
if keepends is false. Added C APIs PyUnicode_DecodeUTF8Stateful and
PyUnicode_DecodeUTF16Stateful.
2004-09-07 20:24:22 +00:00
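
The chars and keepends arguments described in this commit can be tried out roughly like this. It is a sketch that assumes a current Python 3 interpreter, where the same argument names still exist on codecs.StreamReader; the payload and variable names are illustrative only.

```python
import codecs
import io

# Hypothetical payload: two lines encoded as UTF-16 (with a BOM).
payload = "first\nsecond\n".encode("utf-16")

# read(chars=N) returns at most N decoded characters, not N bytes.
reader = codecs.getreader("utf-16")(io.BytesIO(payload))
head = reader.read(chars=3)

# readline(keepends=False) strips the trailing line ending;
# keepends=True (the default) leaves it in place.
line_reader = codecs.getreader("utf-16")(io.BytesIO(payload))
stripped = line_reader.readline(keepends=False)
kept = line_reader.readline(keepends=True)
```

Because the reader buffers decoded characters, readline() works here even though each character occupies two bytes in the underlying stream.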
Marc-André Lemburg 3f41974525 Add generic codecs.encode() and .decode() APIs that don't impose
any restriction on the return type (like unicode.encode() et al. do).
2004-07-10 12:06:10 +00:00
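
As a rough illustration of why return-type-agnostic functions are useful, the sketch below calls codecs.encode() and codecs.decode() with codecs that map text to bytes, bytes to bytes, and text to text. It assumes a current Python 3 interpreter and the standard "utf-8", "hex_codec", and "rot_13" codecs; the variable names are hypothetical.

```python
import codecs

# Text to bytes: the usual character-encoding case.
utf8_bytes = codecs.encode("héllo", "utf-8")

# Bytes to bytes: hex_codec maps binary data to its hexadecimal form and back.
hex_bytes = codecs.encode(b"hi", "hex_codec")
round_trip = codecs.decode(hex_bytes, "hex_codec")

# Text to text: rot_13 returns a str, which str.encode() would not allow.
rotated = codecs.encode("hello", "rot_13")
```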
Tim Peters 27f883687b Whitespace normalization. 2004-07-08 04:22:35 +00:00
Martin v. Löwis a1dde13389 Add test case for unicode(somestring, "idna"). 2004-03-24 16:48:24 +00:00
Walter Dörwald 21d3a32b99 Combine the functionality of test_support.run_unittest()
and test_support.run_classtests() into run_unittest()
and use it wherever possible.

Also don't use "from test.test_support import ...", but
"from test import test_support" in a few spots.

From SF patch #662807.
2003-05-01 17:45:56 +00:00
Tim Peters 0eadaac7dc Whitespace normalization. 2003-04-24 16:02:54 +00:00
Martin v. Löwis b5c4b7be3f Skip nameprep test 3.43, as we do allow unassigned characters. The test
fails only in UCS-2 mode, since it tests a non-BMP character.
2003-04-18 20:21:00 +00:00
Martin v. Löwis 2548c730c1 Implement IDNA (Internationalized Domain Names in Applications). 2003-04-18 10:39:54 +00:00
Marc-André Lemburg 29273c87da Fix for [ 543344 ] Interpreter crashes when recoding; suggested
by Michael Stone (mbrierst).

Python 2.1.4, 2.2.2 candidate.
2003-02-04 19:35:03 +00:00
Walter Dörwald 8709a420c4 Check whether a string resize is necessary at the end
of PyString_DecodeEscape(). This prevents a call to
_PyString_Resize() for the empty string, which would
result in a PyErr_BadInternalCall(), because the
empty string has more than one reference.

This closes SF bug http://www.python.org/sf/603937
2002-09-03 13:53:40 +00:00
Barry Warsaw 04f357cffe Get rid of relative imports in all unittests. Now anything that
imports e.g. test_support must do so using an absolute package name
such as "import test.test_support" or "from test import test_support".

This also updates the README in Lib/test, and gets rid of the
duplicate data directory in Lib/test/data (replaced by
Lib/email/test/data).

Now Tim and Jack can have at it. :)
2002-07-23 19:04:11 +00:00
Fred Drake 2e2be3760c Change the PyUnit-based tests to use the test_main() approach. This
allows using the tests with unittest.py as a script.  The tests will
still run when run as a script themselves.
2001-09-20 21:33:42 +00:00
Marc-André Lemburg a37171dd86 Test by Martin v. Loewis for the new UTF-16 codec handling of BOM
marks.
2001-06-19 20:09:28 +00:00