SF #596434: tweak wordsep_re so the definition of an em-dash is
stricter: specifically, "--" must be preceded by a limited set of characters, not by any non-whitespace character.
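
A minimal sketch of the effect, assuming Python 3's `re` module and assuming the chunks come from `re.split` with empty strings dropped (as the comment in the patched code suggests). The names `old_wordsep_re`, `new_wordsep_re`, and `split_chunks`, and the sample inputs, are hypothetical and used only for illustration; the two patterns are copied from the diff.

```python
import re

# Pattern before this commit: "--" counts as an em-dash after ANY
# non-whitespace character.
old_wordsep_re = re.compile(r'(\s+|'                  # any whitespace
                            r'-*\w{2,}-(?=\w{2,})|'   # hyphenated words
                            r'(?<=\S)-{2,}(?=\w))')   # em-dash

# Pattern after this commit: "--" counts as an em-dash only after a word
# character or one of ! " ' & . , ?
new_wordsep_re = re.compile(r'(\s+|'                  # any whitespace
                            r'-*\w{2,}-(?=\w{2,})|'   # hyphenated words
                            r'(?<=[\w\!\"\'\&\.\,\?])-{2,}(?=\w))')  # em-dash


def split_chunks(pattern, text):
    """Hypothetical helper: split text on the pattern, dropping empty strings."""
    return [chunk for chunk in pattern.split(text) if chunk]


# "--" preceded by "(" is no longer split off as an em-dash.
print(split_chunks(old_wordsep_re, "use (--help)"))
print(split_chunks(new_wordsep_re, "use (--help)"))

# "--" preceded by a word character is still treated as an em-dash.
print(split_chunks(new_wordsep_re, "wait--what"))
```

With the old pattern the first input breaks apart at the `--`, so a wrapper could split a line in the middle of `(--help)`; with the new pattern it stays in one chunk.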
parent cc55cb9539
commit a409f7c491
@@ -75,7 +75,7 @@ class TextWrapper:
     # (after stripping out empty strings).
     wordsep_re = re.compile(r'(\s+|'                  # any whitespace
                             r'-*\w{2,}-(?=\w{2,})|'   # hyphenated words
-                            r'(?<=\S)-{2,}(?=\w))')   # em-dash
+                            r'(?<=[\w\!\"\'\&\.\,\?])-{2,}(?=\w))')   # em-dash
 
     # XXX will there be a locale-or-charset-aware version of
     # string.lowercase in 2.3?