From 70992c3c83e2324677ad4fee73f372682f036c18 Mon Sep 17 00:00:00 2001
From: Georg Brandl
Date: Thu, 6 Mar 2008 07:19:15 +0000
Subject: [PATCH] Expand on re.split behavior with captured expressions.

---
 Doc/library/re.rst | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/Doc/library/re.rst b/Doc/library/re.rst
index 9f13d06b928..2ab12546e98 100644
--- a/Doc/library/re.rst
+++ b/Doc/library/re.rst
@@ -543,14 +543,26 @@ form.
    >>> re.split('\W+', 'Words, words, words.', 1)
    ['Words', 'words, words.']
 
+   If there are capturing groups in the separator and it matches at the start of
+   the string, the result will start with an empty string. The same holds for
+   the end of the string::
+
+      >>> re.split('(\W+)', '...words, words...')
+      ['', '...', 'words', ', ', 'words', '...', '']
+
+   That way, separator components are always found at the same relative
+   indices within the result list (e.g., if there's one capturing group
+   in the separator, the 1st, the 3rd and so forth).
+
    Note that *split* will never split a string on an empty pattern match.
-   For example ::
+   For example::
 
       >>> re.split('x*', 'foo')
       ['foo']
       >>> re.split("(?m)^$", "foo\n\nbar\n")
       ['foo\n\nbar\n']
 
+
 .. function:: findall(pattern, string[, flags])
 
    Return all non-overlapping matches of *pattern* in *string*, as a list of
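
Reviewer note: the index layout described in the added paragraph can be checked with a short script. This is only a sketch, assuming a separator pattern with exactly one capturing group; the variable names (`text`, `parts`, `fields`, `separators`) are illustrative and not part of the patch. It deliberately avoids patterns that can match the empty string.

```python
import re

# Sample input taken from the example in the patch.
text = '...words, words...'

# One capturing group: the matched separators are kept in the result list.
parts = re.split(r'(\W+)', text)

# Assumption: with exactly one capturing group, the text between
# separators sits at even indices and the captured separators at odd ones.
fields = parts[0::2]
separators = parts[1::2]

print(parts)       # ['', '...', 'words', ', ', 'words', '...', '']
print(fields)      # ['', 'words', 'words', '']
print(separators)  # ['...', ', ', '...']
```

The leading and trailing empty strings in `fields` are what keep the alternation regular: without them, the position of the first separator would depend on whether the string happens to start with a match.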