bpo-31671: re: Convert RegexFlag to int before compile (GH-3862)

sre_compile performs bit tests (e.g. `flags & SRE_FLAG_IGNORECASE`) in a loop.
`IntFlag.__and__` and `IntFlag.__new__` made these tests slower.

So this commit converts the flags to a normal int before passing them to `sre_compile()`.
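
A minimal sketch of the overhead described above, assuming Python 3.6 or later, where `re.IGNORECASE` is a `RegexFlag` member. `SRE_FLAG_IGNORECASE` is defined locally as a stand-in for the constant in `sre_constants`, and the variable names are illustrative only:

```python
import re

# Local stand-in for sre_constants.SRE_FLAG_IGNORECASE (an assumption for
# illustration; its value equals re.IGNORECASE.value).
SRE_FLAG_IGNORECASE = 2

flags = re.IGNORECASE | re.MULTILINE

# A bit test on a RegexFlag is dispatched to IntFlag.__and__, and the
# result is wrapped in a RegexFlag again instead of staying a plain int.
enum_result = flags & SRE_FLAG_IGNORECASE

# After conversion, the same test is an ordinary int operation.
int_flags = flags.value
int_result = int_flags & SRE_FLAG_IGNORECASE
```

Both results are truthy, so the bit test behaves the same either way; only the type of the result, and the work needed to produce it, differs.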
INADA Naoki 2017-10-05 17:19:26 +09:00 committed by GitHub
parent af810b35b4
commit c1c47c166b
3 changed files with 11 additions and 0 deletions


@@ -326,6 +326,11 @@ Optimizations
expressions <re>`. Searching some patterns can now be up to 20 times faster.
(Contributed by Serhiy Storchaka in :issue:`30285`.)
* :func:`re.compile` now converts the ``flags`` parameter to an int object if
it is a ``RegexFlag``. It is now as fast as Python 3.5, and about 10% faster
than Python 3.6, depending on the pattern.
(Contributed by INADA Naoki in :issue:`31671`.)
* :meth:`selectors.EpollSelector.modify`, :meth:`selectors.PollSelector.modify`
and :meth:`selectors.DevpollSelector.modify` may be around 10% faster under
heavy loads. (Contributed by Giampaolo Rodola' in :issue:`30014`)


@@ -275,6 +275,8 @@ _cache = OrderedDict()
_MAXCACHE = 512
def _compile(pattern, flags):
    # internal: compile pattern
    if isinstance(flags, RegexFlag):
        flags = flags.value
    try:
        return _cache[type(pattern), pattern, flags]
    except KeyError:
@@ -331,6 +333,8 @@ copyreg.pickle(Pattern, _pickle, _compile)
class Scanner:
    def __init__(self, lexicon, flags=0):
        from sre_constants import BRANCH, SUBPATTERN
        if isinstance(flags, RegexFlag):
            flags = flags.value
        self.lexicon = lexicon
        # combine phrases into a compound pattern
        p = []


@@ -0,0 +1,2 @@
``re.compile()`` now converts a passed ``RegexFlag`` to a normal int object
before compiling. The bm_regex_compile benchmark shows a 14% performance improvement.
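
The conversion can be sketched outside of `re` as follows. `normalize_flags` is a hypothetical helper written for illustration, not a function in the standard library; it only mirrors the `isinstance` check added in this commit:

```python
import re

def normalize_flags(flags):
    """Return flags as a plain int (hypothetical helper, for illustration)."""
    # Same check as the one added to _compile() and Scanner.__init__().
    if isinstance(flags, re.RegexFlag):
        flags = flags.value
    return flags

# RegexFlag members and combinations become plain ints.
as_int = normalize_flags(re.IGNORECASE | re.DOTALL)

# Values that are already plain ints pass through unchanged.
unchanged = normalize_flags(0)

# Callers see no difference: both spellings compile to equivalent patterns.
same_flags = re.compile("abc", re.IGNORECASE).flags == re.compile("abc", 2).flags
```

Because the check runs once per call, callers can keep passing `RegexFlag` values while the compiler's inner loop works with plain ints.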