cpython

Commit Graph

Author	SHA1	Message	Date
Victor Stinner	d23d3930ff	Issue #7820: The parser tokenizer restores all bytes in the right if the BOM check fails. Fix an assertion in pydebug mode.	2010-03-02 23:20:02 +00:00
Benjamin Peterson	42d63847c3	rewrite translate_newlines for clarity	2009-12-06 17:37:48 +00:00
Benjamin Peterson	e36199b49d	fix several compile() issues by translating newlines in the tokenizer	2009-11-12 23:39:44 +00:00
Benjamin Peterson	e3383b8e8f	spelling	2009-11-07 01:04:38 +00:00
Benjamin Peterson	9586cf8677	fix some coding style	2009-10-09 21:48:14 +00:00
Benjamin Peterson	08a0bbc846	don't mask encoding errors when decoding a string #6289	2009-06-16 00:29:31 +00:00
Andrew M. Kuchling	110a48cf60	#3367: revert rev. 65539: this change causes test_parser to fail	2008-08-05 02:05:23 +00:00
Andrew M. Kuchling	efa61bc15f	#3367 from Kristjan Valur Jonsson: If a PyTokenizer_FromString() is called with an empty string, the tokenizer's line_start member never gets initialized. Later, it is compared with the token pointer 'a' in parsetok.c:193 and that behavior can result in undefined behavior.	2008-08-05 01:38:08 +00:00
Gregory P. Smith	dd96db63f6	This reverts r63675 based on the discussion in this thread: http://mail.python.org/pipermail/python-dev/2008-June/079988.html Python 2.6 should stick with PyString_* in its codebase. The PyBytes_* names in the spirit of 3.0 are available via a #define only. See the email thread.	2008-06-09 04:58:54 +00:00
Christian Heimes	593daf545b	Renamed PyString to PyBytes	2008-05-26 12:51:38 +00:00
Amaury Forgeot d'Arc	5216721a53	Issue2681: the literal 0o8 was wrongly accepted, and evaluated as float(0.0). This happened only when 8 is the first digit. Credits go to Lukas Meuser.	2008-04-24 18:07:05 +00:00
Neal Norwitz	d183bdd6fb	Revert r61969 which added casts to Py_CHARMASK to avoid compiler warnings. Rather than sprinkle casts throughout the code, change Py_CHARMASK to always cast its result to an unsigned char. This should ensure we do the right thing when accessing an array with the result.	2008-03-28 04:58:51 +00:00
Georg Brandl	d5b635f196	Make Py3k warnings consistent w.r.t. punctuation; also respect the EOL 80 limit and supply more alternatives in warning messages.	2008-03-25 08:29:14 +00:00
Eric Smith	9ff19b5434	Finished backporting PEP 3127, Integer Literal Support and Syntax. Added 0b and 0o literals to tokenizer. Modified PyOS_strtoul to support 0b and 0o inputs. Modified PyLong_FromString to support guessing 0b and 0o inputs. Renamed test_hexoct.py to test_int_literal.py and added binary tests. Added upper and lower case 0b, 0O, and 0X tests to test_int_literal.py	2008-03-17 17:32:20 +00:00
Neal Norwitz	c44af337ce	Add assertion that we do not blow out newl	2008-01-27 17:10:29 +00:00
Christian Heimes	082c9b0267	Fixed bug #1915: Python compiles with --enable-unicode=no again. However several extension methods and modules do not work without unicode support.	2008-01-23 14:20:50 +00:00
Georg Brandl	898f1879e1	Add a "const" to make gcc happy.	2008-01-21 21:14:21 +00:00
Georg Brandl	38d1715b0d	Issue #1882: when compiling code from a string, encoding cookies in the second line of code were not always recognized correctly.	2008-01-21 18:35:49 +00:00
Georg Brandl	14404b68d8	Fix #1679: "0x" was taken as a valid integer literal. Fixes the tokenizer, tokenize.py and int() to reject this. Patches by Malte Helmert.	2008-01-19 19:27:05 +00:00
Christian Heimes	288e89acfc	Added bytes and b'' as aliases for str and ''	2008-01-18 18:24:07 +00:00
Georg Brandl	76b30d1688	Fix #define ordering.	2008-01-07 18:41:34 +00:00
Georg Brandl	dfe5dc8455	Make Python compile with --disable-unicode.	2008-01-07 18:16:36 +00:00
Amaury Forgeot d'Arc	6dae85f409	Warning "<> not supported in 3.x" should be enabled only when the -3 option is set.	2007-11-24 13:20:22 +00:00
Christian Heimes	02c9ab568d	Fixed problems in the last commit. Filenames and line numbers weren't reported correctly. Backquotes still don't report the correct file. The AST nodes only contain the line number but not the file name.	2007-11-23 12:12:02 +00:00
Christian Heimes	729ab15370	Applied patch #1754273 and #1754271 from Thomas Glee. The patches are adding deprecation warnings for back ticks and <>	2007-11-23 09:10:36 +00:00
Guido van Rossum	9fc1b96a19	Change a PyErr_Print() into a PyErr_Clear(), per discussion in issue 1031213.	2007-10-15 15:54:11 +00:00
Martin v. Löwis	a5136196bc	Patch #1031213: Decode source line in SyntaxErrors back to its original source encoding. Will backport to 2.5.	2007-09-04 14:19:28 +00:00
Andrew M. Kuchling	9b3a824097	Comment grammar	2006-10-06 18:51:55 +00:00
Neal Norwitz	71e05f1e0c	Don't truncate if size_t is bigger than uint	2006-06-12 02:07:57 +00:00
Neal Norwitz	d21a7fffb1	Patch #1357836: Prevent an invalid memory read from test_coding in case the done flag is set. In that case, the loop isn't entered. I wonder if rather than setting the done flag in the cases before the loop, if they should just exit early. This code looks like it should be refactored. Backport candidate (also the early break above if decoding_fgets fails)	2006-06-02 06:23:00 +00:00
Skip Montanaro	a0b6338823	C++ compiler cleanup: cast signed to unsigned	2006-04-18 00:53:06 +00:00
Neal Norwitz	08062d6665	As discussed on python-dev, really fix the PyMem_/PyObject_ memory API mismatches. At least I hope this fixes them all. This reverts part of my change from yesterday that converted everything in Parser/*.c to use PyObject_ API. The encoding doesn't really need to use PyMem_, however, it uses new_string() which must return PyMem_ for handling the result of PyOS_Readline() which returns PyMem_* memory. If there were 2 versions of new_string() one that returned PyMem_* for tokens and one that return PyObject_* for encodings that could also fix this problem. I'm not sure which version would be clearer. This seems to fix both Guido's and Phillip's problems, so it's good enough for now. After this change, it would be good to review Parser/*.c for consistent use of the 2 memory APIs.	2006-04-11 08:19:15 +00:00
Anthony Baxter	114900298e	Fix the code in Parser/ to also compile with C++. This was mostly casts for malloc/realloc type functions, as well as renaming one variable called 'new' in tokenizer.c. Still lots more to be done, going to be checking in one chunk at a time or the patch will be massively huge. Still compiles ok with gcc.	2006-04-11 05:39:14 +00:00
Neal Norwitz	2c4e4f9839	SF patch #1467512, fix double free with triple quoted string in standard build. This was the result of inconsistent use of PyMem_* and PyObject_* allocators. By changing to use PyObject_* allocator almost everywhere, this removes the inconsistency.	2006-04-10 06:42:25 +00:00
Tim Peters	c9d78aa470	Years in the making. objimpl.h, pymem.h: Stop mapping PyMem_{Del, DEL} and PyMem_{Free, FREE} to PyObject_{Free, FREE} in a release build. They're aliases for the system free() now. _subprocess.c/sp_handle_dealloc(): Since the memory was originally obtained via PyObject_NEW, it must be released via PyObject_FREE (or _DEL). pythonrun.c, tokenizer.c, parsermodule.c: I lost count of the number of PyObject vs PyMem mismatches in these -- it's like the specific function called at each site was picked at random, sometimes even with memory obtained via PyMem getting released via PyObject. Changed most to use PyObject uniformly, since the blobs allocated are predictably small in most cases, and obmalloc is generally faster than system mallocs then. If extension modules in real life prove as sloppy as Python's front end, we'll have to revert the objimpl.h + pymem.h part of this patch. Note that no problems will show up in a debug build (all calls still go thru obmalloc then). Problems will show up only in a release build, most likely segfaults.	2006-03-26 23:27:58 +00:00
Neal Norwitz	2aa9a5dfdd	Use macro versions instead of function versions when we already know the type. This will hopefully get rid of some Coverity warnings, be a hint to developers, and be marginally faster. Some asserts were added when the type is currently known, but depends on values from another function.	2006-03-20 01:53:23 +00:00
Thomas Wouters	7eaf2aaf48	Fix crashing bug in tokenizer, when tokenizing files with non-ASCII bytes but without a specified encoding: decoding_fgets() (and decoding_feof()) can return NULL and fiddle with the 'tok' struct, making tok->buf NULL. This is okay in the other cases of calls to decoding_*(), it seems, but not in this one. This should get a test added, somewhere, but the testsuite doesn't seem to test encoding anywhere (although plenty of tests use it.) It seems to me that decoding errors in other places in the code (like at the start of a token, instead of in the middle of one) make the code end up adding small integers to NULL pointers, but happen to check for error states before using the calculated new pointers. I haven't been able to trigger any other crashes, in any case. I would nominate this file for a complete rewrite for Py3k. The whole decoding trick is too bolted-on for my tastes.	2006-03-02 20:41:27 +00:00
Martin v. Löwis	49c5da1d88	Patch #1440601: Add col_offset attribute to AST nodes.	2006-03-01 22:49:05 +00:00
Martin v. Löwis	6cba25666c	Change non-ASCII warning into a SyntaxError.	2006-02-28 22:41:29 +00:00
Martin v. Löwis	f5adf1eb72	Use Py_ssize_t to count the length.	2006-02-16 14:35:38 +00:00
Martin v. Löwis	18e165558b	Merge ssize_t branch.	2006-02-15 17:27:45 +00:00
Neal Norwitz	30b5c5d011	Fix SF bug #1072182, problems with signed characters. Most of these can be backported.	2005-12-19 06:05:18 +00:00
Neal Norwitz	db83eb3170	Fix Bug #1378022, UTF-8 files with a leading BOM crashed the interpreter. Needs backport.	2005-12-18 05:29:30 +00:00
Neal Norwitz	dee2fd5448	Fix some more memory leaks. Call error_ret() in decode_str(). It was called in some other places, but seemed inconsistent. It is safe to call PyTokenizer_Free() after calling error_ret().	2005-11-16 05:12:59 +00:00
Neal Norwitz	c0d5faa9b4	Free coding spec (cs) if there was an error to prevent mem leak. Maybe backport candidate	2005-10-21 06:05:33 +00:00
Neal Norwitz	40d3781416	- Fix segfault with invalid coding. - SF Bug #772896, unknown encoding results in MemoryError, which is not helpful. I will only backport the segfault fix. I'll let Anthony decide if he wants the other changes backported. I will do the backport if asked.	2005-10-02 01:48:49 +00:00
Walter Dörwald	c1f5fff2b7	Apply SF patch #1101726: Fix buffer overrun in tokenizer.c when a source file with a PEP 263 encoding declaration results in long decoded line.	2005-07-12 21:53:43 +00:00
Martin v. Löwis	4bf108d74f	Patch #802188: better parser error message for non-EOL following line cont.	2005-03-03 11:45:45 +00:00
Hye-Shik Chang	7df44b384a	SF #941229: Decode source code with sys.stdin.encoding in interactive modes like non-interactive modes. This allows for non-latin-1 users to write unicode strings directly and sets Japanese users free from weird manual escaping <wink> in shift_jis environments. (Reviewed by Martin v. Loewis)	2004-08-04 17:36:41 +00:00
Anthony Baxter	c2a5a63654	PEP-0318, @decorator-style. In Guido's words: "@ seems the syntax that everybody can hate equally" Implementation by Mark Russell, from SF #979728.	2004-08-02 06:10:11 +00:00

127 Commits