gh-111259: Optimize recursive wildcards in pathlib (GH-111303)

Regular expression pattern `(?s:.)` is much faster than `[\s\S]`.
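
The claim above can be checked locally. The following is a minimal sketch, not part of the commit: the names `OLD_ANY`, `NEW_ANY`, `PLAIN_DOT`, `matched_length` and `compare_speed` are made up for illustration, and the sample string is an arbitrary newline-separated input. It shows that both patterns match across newlines (unlike a plain `.`) and lets you time them; the actual ratio will depend on the Python version and the input.

```python
import re
import timeit

OLD_ANY = r'[\s\S]*'   # character class: whitespace or non-whitespace
NEW_ANY = r'(?s:.)*'   # '.' with DOTALL enabled only inside the group
PLAIN_DOT = r'.*'      # without DOTALL, '.' stops at a newline

# Arbitrary sample: path components separated by newlines.
sample = 'usr\nlib\npython\n' * 1000


def matched_length(pattern, text):
    """Return how many characters `pattern` consumes from the start of `text`."""
    return re.match(pattern, text).end()


def compare_speed(text, number=200):
    """Time both patterns on `text`; returns seconds per pattern."""
    old = re.compile(OLD_ANY)
    new = re.compile(NEW_ANY)
    return {
        'old': timeit.timeit(lambda: old.match(text), number=number),
        'new': timeit.timeit(lambda: new.match(text), number=number),
    }


if __name__ == '__main__':
    # Both patterns consume the whole string, newlines included.
    print(matched_length(OLD_ANY, sample) == matched_length(NEW_ANY, sample))
    print(compare_speed(sample))
```

The scoped flag form `(?s:...)` keeps DOTALL local to the group, so the rest of a larger pattern can still rely on `.` not matching newlines.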
Serhiy Storchaka 2023-10-26 18:07:06 +03:00 committed by GitHub
parent 67a91f78e4
commit 309efb39dc
2 changed files with 4 additions and 3 deletions

@@ -124,13 +124,13 @@ def _compile_pattern_lines(pattern_lines, case_sensitive):
         elif part == '*':
             part = r'.+'
         elif part == '**\n':
-            # '**/' component: we use '[\s\S]' rather than '.' so that path
+            # '**/' component: we use '(?s:.)' rather than '.' so that path
             # separators (i.e. newlines) are matched. The trailing '^' ensures
             # we terminate after a path separator (i.e. on a new line).
-            part = r'[\s\S]*^'
+            part = r'(?s:.)*^'
         elif part == '**':
             # '**' component.
-            part = r'[\s\S]*'
+            part = r'(?s:.)*'
         elif '**' in part:
             raise ValueError("Invalid pattern: '**' can only be an entire path component")
         else:
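
To see how the rewritten `'**'` branches behave, here is a simplified sketch built around the branches shown in this hunk. It is not the pathlib implementation: the function name `compile_pattern_lines_sketch` is hypothetical, case sensitivity is ignored, and the final `else` branch (not visible in the hunk) is replaced by a plain `re.escape`, so only literal components and the `*` / `**` components shown above are handled. The leading `^`, the trailing `\Z` and the `re.MULTILINE` flag are likewise assumptions made so the sketch runs on its own.

```python
import re


def compile_pattern_lines_sketch(pattern_lines):
    """Compile a pattern whose path separators are newlines (illustrative only)."""
    parts = ['^']  # assumption: anchor at the start of a component
    for part in pattern_lines.splitlines(keepends=True):
        if part == '*\n':
            part = r'.+\n'
        elif part == '*':
            part = r'.+'
        elif part == '**\n':
            # Match anything, newlines included, then stop at a line start.
            part = r'(?s:.)*^'
        elif part == '**':
            part = r'(?s:.)*'
        elif '**' in part:
            raise ValueError("Invalid pattern: '**' can only be an entire path component")
        else:
            # Assumption: treat every other component as a literal string.
            part = re.escape(part)
        parts.append(part)
    parts.append(r'\Z')  # assumption: always match to the end of the path
    # MULTILINE lets '^' match right after each newline (path separator).
    return re.compile(''.join(parts), flags=re.MULTILINE)


if __name__ == '__main__':
    pat = compile_pattern_lines_sketch('src\n**\nmain.py')
    for path in ('src\nmain.py', 'src\na\nb\nmain.py', 'src\nxmain.py'):
        print(repr(path), bool(pat.match(path)))
```

Because `(?s:.)*` may consume newlines, the `'**\n'` component can span zero or more whole path components, while the trailing `^` forces the next component to begin right after a separator.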

@@ -0,0 +1 @@
+Optimize recursive wildcards in :mod:`pathlib`.