Revision 892cd4ff8a1cf96bad9dbd39e0d1f5829e33a1d0 authored by Jameson Nash on 17 February 2023, 15:58:05 UTC, committed by GitHub on 17 February 2023, 15:58:05 UTC
Previously, we might try to interpret the random bytes in a path as
UTF-8 and exclude \n from the match, causing the regex match to fail or
be incomplete in some cases. But such bytes are valid in a path, so we
want PCRE2 to treat them as transparent bytes. Accordingly, change r""a
to specify all flags needed to interpret the values simply as ASCII.

Note that this would be breaking for anyone who was previously matching
a Unicode character with `\u` while also disabling UCP matching of \w
and \s, but that seems an oddly specific combination to need.

    julia> match(r"\u03b1"a, "α")
    ERROR: PCRE compilation error: character code point value in \u.... sequence is too large at offset 6

(This would have previously worked.) Note, however, that explicitly
starting the regex with (*UTF), or using a literal α in the regex,
continues to work as before.

Note that `s` (DOTALL) is a more efficient matcher (if the pattern
contains `.`), as is `a`, so it is often preferable to set both when in
doubt: http://man.he.net/man3/pcre2perform
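
As a rough illustration of the flags discussed above, here is a small
sketch. The sample strings and patterns are made up for this note and
are not taken from the change itself. The final line depends on running
a build that includes this change, so its result is printed rather than
assumed.

```julia
# Illustrative only: the sample strings and patterns below are made up.

# Default flags: \w follows Unicode character properties (UCP).
ucp_word = occursin(r"\w", "α")

# With `a`, \w is restricted to ASCII, so a lone α is not a word character.
ascii_word = occursin(r"\w"a, "α")

# A literal α in the pattern still matches under `a`.
literal_alpha = occursin(r"α"a, "α")

# Opting back into UTF mode makes \u03b1 usable together with `a`.
utf_escape = occursin(r"(*UTF)\u03b1"a, "α")

# `s` (DOTALL) lets `.` match "\n", which may legitimately appear in a path.
dot_default = occursin(r"^a.b$", "a\nb")
dot_all = occursin(r"^a.b$"s, "a\nb")

# A path-like string containing a byte that is not valid UTF-8.
# On a build that includes this change, `sa` is intended to let `.` match
# it as a plain byte; the result is printed rather than assumed here.
raw_path = "dir/\xff\nfile.jl"
println(occursin(r"^dir/.*\.jl$"sa, raw_path))
```

Combining `s` and `a`, as in the last pattern, follows the suggestion
above to set both when the input may be arbitrary bytes.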

Refs: #48648
1 parent cbbfc68
File Mode Size
.devcontainer
.github
base
cli
contrib
deps
doc
etc
src
stdlib
test
.buildkite-external-version -rw-r--r-- 5 bytes
.clang-format -rw-r--r-- 3.3 KB
.clangd -rw-r--r-- 114 bytes
.codecov.yml -rw-r--r-- 52 bytes
.git-blame-ignore-revs -rw-r--r-- 294 bytes
.gitattributes -rw-r--r-- 65 bytes
.gitignore -rw-r--r-- 514 bytes
.mailmap -rw-r--r-- 12.1 KB
CITATION.bib -rw-r--r-- 513 bytes
CITATION.cff -rw-r--r-- 940 bytes
CONTRIBUTING.md -rw-r--r-- 23.1 KB
HISTORY.md -rw-r--r-- 363.4 KB
LICENSE.md -rw-r--r-- 1.3 KB
Make.inc -rw-r--r-- 53.0 KB
Makefile -rw-r--r-- 30.2 KB
NEWS.md -rw-r--r-- 3.3 KB
README.md -rw-r--r-- 7.3 KB
THIRDPARTY.md -rw-r--r-- 3.7 KB
VERSION -rw-r--r-- 11 bytes
julia.spdx.json -rw-r--r-- 35.8 KB
sysimage.mk -rw-r--r-- 4.1 KB
