Expand on re.split behavior with captured expressions.

Georg Brandl 2008-03-06 07:19:15 +00:00
parent d2bbe526c3
commit 70992c3c83
1 changed files with 13 additions and 1 deletions


@@ -543,14 +543,26 @@ form.
>>> re.split('\W+', 'Words, words, words.', 1)
['Words', 'words, words.']
If there are capturing groups in the separator and it matches at the start of
the string, the result will start with an empty string. The same holds for
the end of the string::
>>> re.split('(\W+)', '...words, words...')
['', '...', 'words', ', ', 'words', '...', '']
That way, separator components are always found at the same relative
indices within the result list (e.g., if there's one capturing group
in the separator, the 1st, the 3rd and so forth).
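As a minimal sketch of why the padding with empty strings is useful: with exactly one capturing group in the pattern, the pieces between separators sit at the even indices of the result and the captured separators at the odd ones, so the two can be pulled apart by slicing. The helper name ``split_with_separators`` is hypothetical and not part of the ``re`` module.

```python
import re


def split_with_separators(pattern, text):
    """Split *text* and return (fields, separators).

    Assumes *pattern* contains exactly one capturing group, so that
    re.split() alternates field, separator, field, ... in its result.
    """
    parts = re.split(pattern, text)
    fields = parts[0::2]      # even indices: text between separators
    separators = parts[1::2]  # odd indices: text matched by the group
    return fields, separators


fields, separators = split_with_separators(r'(\W+)', '...words, words...')
print(fields)      # ['', 'words', 'words', '']
print(separators)  # ['...', ', ', '...']
```

Because nothing is dropped from the result, joining the full list returned by ``re.split`` with an empty string reproduces the original input in this case.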
Note that *split* will never split a string on an empty pattern match.
For example::
>>> re.split('x*', 'foo')
['foo']
>>> re.split("(?m)^$", "foo\n\nbar\n")
['foo\n\nbar\n']
.. function:: findall(pattern, string[, flags])
Return all non-overlapping matches of *pattern* in *string*, as a list of