Commit Graph

575 Commits

Author SHA1 Message Date
Benjamin Peterson e249dcab7a merge 3.2 2012-02-21 11:09:13 -05:00
Benjamin Peterson 69e9727657 ensure no one tries to hash things before the random seed is found 2012-02-21 11:08:50 -05:00
Georg Brandl 09a7c72cad Merge from 3.1: Issue #13703: add a way to randomize the hash values of basic types (str, bytes, datetime)
in order to make algorithmic complexity attacks on (e.g.) web apps much more complicated.

The environment variable PYTHONHASHSEED and the new command line flag -R control this
behavior.
2012-02-20 21:31:46 +01:00
Georg Brandl 2daf6ae249 Issue #13703: add a way to randomize the hash values of basic types (str, bytes, datetime)
in order to make algorithmic complexity attacks on (e.g.) web apps much more complicated.

The environment variable PYTHONHASHSEED and the new command line flag -R control this
behavior.
2012-02-20 19:54:16 +01:00
Victor Stinner cbe01342bc Issue #13913: normalize utf-8 codec name in UTF-8 decoder 2012-02-14 01:17:45 +01:00
Antoine Pitrou 1334884ff2 Issue #13848: open() and the FileIO constructor now check for NUL characters in the file name.
Patch by Hynek Schlawack.
2012-01-29 18:36:34 +01:00
Gregory P. Smith 63e6c3222f Consolidate the occurrances of the prime used as the multiplier when hashing
to a single #define instead of having several copies in several files.

This excludes the Modules/ tree (datetime and expat both have a copy
for their own purposes with no need for it to be the same).
2012-01-14 15:31:34 -08:00
Benjamin Peterson 53aa1d7c57 fix possible if unlikely leak 2011-12-20 13:29:45 -06:00
Victor Stinner ab1d16b456 Issue #13093: Fix error handling on PyUnicode_EncodeDecimal()
* Add tests for PyUnicode_EncodeDecimal() and PyUnicode_TransformDecimalToASCII()
 * Remove the unused "e" variable in replace()
2011-11-22 01:45:37 +01:00
Antoine Pitrou 5418ee0b9a Issue #13333: The UTF-7 decoder now accepts lone surrogates
(the encoder already accepts them).
2011-11-15 01:42:21 +01:00
Victor Stinner d88d9836c5 Fix PyUnicode_AsWideCharString() doc: size doesn't contain the null character
Fix also spelling of the null character.
2011-09-06 02:00:05 +02:00
Ezio Melotti 93e7afc5d9 #9200: The str.is* methods now work with strings that contain non-BMP characters even in narrow Unicode builds. 2011-08-22 14:08:38 +03:00
Benjamin Peterson 7a6b44ab62 the named of the character is actually NUL 2011-08-18 13:51:47 -05:00
Benjamin Peterson 5ad517a7d9 NUL -> NULL 2011-08-18 10:48:50 -05:00
Ezio Melotti ee8d998ecf #12266: Fix str.capitalize() to correctly uppercase/lowercase titlecased and cased non-letter characters. 2011-08-15 09:09:57 +03:00
Benjamin Peterson f413b80806 in narrow builds, make sure to test codepoints as identifier characters (closes #12732)
This fixes the use of Unicode identifiers outside the BMP in narrow builds.
2011-08-12 22:17:18 -05:00
Senthil Kumaran 53516a82df Fix closes Issue12621 - Fix docstrings of find and rfind methods of bytes/bytearry/unicodeobject. 2011-07-27 23:33:54 +08:00
Senthil Kumaran 9ebe08d2f6 Fix closes issue12471 - wrong TypeError message when '%i' format spec was used. 2011-07-03 21:03:16 -07:00
Victor Stinner 3cbf14bfb1 Issue #10914: Initialize correctly the filesystem codec when creating a new
subinterpreter to fix a bootstrap issue with codecs implemented in Python, as
the ISO-8859-15 codec.

Add fscodec_initialized attribute to the PyInterpreterState structure.
2011-04-27 00:24:21 +02:00
Ezio Melotti f2b3f780a1 #6780: merge with 3.1. 2011-04-26 06:40:59 +03:00
Ezio Melotti ba42fd5801 #6780: fix starts/endswith error message to mention that tuples are accepted too. 2011-04-26 06:09:45 +03:00
Jesus Cea 6159ee3cf5 MERGE: startswith and endswith don't accept None as slice index. Patch by Torsten Becker. (closes #11828) 2011-04-20 17:42:50 +02:00
Jesus Cea ac4515063c startswith and endswith don't accept None as slice index. Patch by Torsten Becker. (closes #11828) 2011-04-20 17:09:23 +02:00
Victor Stinner 2b574a2332 Merged revisions 88697 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r88697 | victor.stinner | 2011-03-01 23:46:52 +0100 (mar., 01 mars 2011) | 4 lines

  Issue #11246: Fix PyUnicode_FromFormat("%V")

  Decode the byte string from UTF-8 (with replace error handler) instead of
  ISO-8859-1 (in strict mode). Patch written by Ray Allen.
........
2011-03-01 22:48:49 +00:00
Victor Stinner 659eb84457 Merged revisions 88481 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r88481 | victor.stinner | 2011-02-21 22:13:44 +0100 (lun., 21 févr. 2011) | 4 lines

  Fix PyUnicode_FromFormatV("%c") for non-BMP char

  Issue #10830: Fix PyUnicode_FromFormatV("%c") for non-BMP characters on
  narrow build.
........
2011-02-23 12:14:22 +00:00
Alexander Belopolsky b9cc00caab Removed unneeded #include 2010-12-22 02:35:20 +00:00
Benjamin Peterson 28a4dce6a8 remove (un)transform methods 2010-12-12 01:33:04 +00:00
Alexander Belopolsky 942af5a9a4 Issue #10557: Fixed error messages from float() and other numeric
types.  Added a new API function, PyUnicode_TransformDecimalToASCII(),
which transforms non-ASCII decimal digits in a Unicode string to their
ASCII equivalents.
2010-12-04 03:38:46 +00:00
Martin v. Löwis 4d0d471a80 Merge branches/pep-0384. 2010-12-03 20:14:31 +00:00
Georg Brandl 3b9406b08a Remove redundant check for PyBytes in unicode_encode. 2010-12-03 07:54:09 +00:00
Georg Brandl 02524629f3 #7475: add (un)transform method to bytes/bytearray and str, add back codecs that can be used with them from Python 2. 2010-12-02 18:06:51 +00:00
Georg Brandl e5b99f0fb3 Remove redundant includes of headers that are already included by Python.h. 2010-11-30 09:41:01 +00:00
Victor Stinner d5af0a5df0 PyUnicode_DecodeFSDefaultAndSize() raises MemoryError if _Py_char2wchar() fails 2010-11-08 23:34:29 +00:00
Victor Stinner 2f02a51135 PyUnicode_EncodeFS() raises an exception if _Py_wchar2char() fails
* Add error_pos optional argument to _Py_wchar2char()
 * PyUnicode_EncodeFS() raises a UnicodeEncodeError or MemoryError if
   _Py_wchar2char() fails
2010-11-08 22:43:46 +00:00
Victor Stinner c911bbfd5d str, bytes, bytearray docstring: remove unnecessary [...] 2010-11-07 19:04:46 +00:00
Victor Stinner e14e212221 Fix encode/decode method doc of str, bytes, bytearray types
* Specify the default encoding: write 'utf-8' instead of
   sys.getdefaultencoding(), because the default encoding is now constant
 * Specify the default errors value
2010-11-07 18:41:46 +00:00
Eric Smith 16562f41b0 Merged revisions 86277 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r86277 | eric.smith | 2010-11-06 15:27:37 -0400 (Sat, 06 Nov 2010) | 1 line

  Added more to docstrings for str.format, format_map, and __format__.
........
2010-11-06 19:29:45 +00:00
Eric Smith 51d2fd983b Added more to docstrings for str.format, format_map, and __format__. 2010-11-06 19:27:37 +00:00
David Malcolm 9696088b6d Issue #10288: The deprecated family of "char"-handling macros
(ISLOWER()/ISUPPER()/etc) have now been removed: use Py_ISLOWER() etc
instead.
2010-11-05 17:23:41 +00:00
Eric Smith 27bbca6f79 Issue #6081: Add str.format_map. str.format_map(mapping) is similar to str.format(**mapping), except mapping does not get converted to a dict. 2010-11-04 17:06:58 +00:00
Victor Stinner ad15872854 Simplify PyUnicode_Encode/DecodeFSDefault on Windows/Mac OS X
* Windows always uses mbcs
 * Mac OS X always uses utf-8
2010-10-27 00:25:46 +00:00
Victor Stinner f933e1ab6f Issue #4388: On Mac OS X, decode command line arguments from UTF-8, instead of
the locale encoding. If the LANG (and LC_ALL and LC_CTYPE) environment variable
is not set, the locale encoding is ISO-8859-1, whereas most programs (including
Python) expect UTF-8. Python already uses UTF-8 for the filesystem encoding and
to encode command line arguments on this OS.
2010-10-20 22:58:25 +00:00
Victor Stinner 9a90900da5 PyUnicode_FromFormatV(): Fix %A format
It was not completly implemented. Add a test.
2010-10-18 20:59:24 +00:00
Benjamin Peterson 8f67d0893f make hashes always the size of pointers; introduce Py_hash_t #9778 2010-10-17 20:54:53 +00:00
Georg Brandl ded5acf34a Merged revisions 81936 via svnmerge from
svn+ssh://svn.python.org/python/branches/py3k

........
  r81936 | mark.dickinson | 2010-06-12 11:10:14 +0200 (Sa, 12 Jun 2010) | 2 lines

  Silence 'unused variable' gcc warning.  Patch by Éric Araujo.
........
2010-10-17 11:48:07 +00:00
Victor Stinner 168e117e0a Add an optional size argument to _Py_char2wchar()
_Py_char2wchar() callers usually need the result size in characters. Since it's
trivial to compute it in _Py_char2wchar() (O(1) whereas wcslen() is O(n)), add
an option to get it.
2010-10-16 23:16:16 +00:00
Victor Stinner f3170ccef8 Use locale encoding if Py_FileSystemDefaultEncoding is not set
* PyUnicode_EncodeFSDefault(), PyUnicode_DecodeFSDefaultAndSize() and
   PyUnicode_DecodeFSDefault() use the locale encoding instead of UTF-8 if
   Py_FileSystemDefaultEncoding is NULL
 * redecode_filenames() functions and _Py_code_object_list (issue #9630)
   are no more needed: remove them
2010-10-15 12:04:23 +00:00
Georg Brandl 66c221e993 #9418: first step of moving private string methods to _string module. 2010-10-14 07:04:07 +00:00
Victor Stinner beb4135b8c PyUnicode_AsWideCharString() takes a PyObject*, not a PyUnicodeObject*
All unicode functions uses PyObject* except PyUnicode_AsWideChar(). Fix the
prototype for the new function PyUnicode_AsWideCharString().
2010-10-07 01:02:42 +00:00
Victor Stinner 5593d8aeb4 Issue #8670: PyUnicode_AsWideChar() and PyUnicode_AsWideCharString() replace
UTF-16 surrogate pairs by single non-BMP characters for 16 bits Py_UNICODE
and 32 bits wchar_t (eg. Linux in narrow build).
2010-10-02 11:11:27 +00:00