https://github.com/JuliaLang/julia
Revision d183ee1bc0f65312bbc9406af7dafd0333aa5aa6 authored by Sukera on 09 April 2024, 21:06:49 UTC, committed by GitHub on 09 April 2024, 21:06:49 UTC
This improves the performance of `ncodeunits(::Char)` by simply counting the
number of non-zero bytes (except for `\0`, which is encoded as all zero
bytes). For a performance comparison, see
[this gist](https://gist.github.com/Seelengrab/ebb02d4b8d754700c2869de8daf88cad);
collections of `Char` see up to a 10x improvement, while a single `Char`
sees a minor improvement (with much smaller spread). The
version in this PR is called `nbytesencoded` in the benchmarks.
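
As a rough illustration of the byte-counting idea, here is a minimal sketch. It assumes the usual `Char` layout (the UTF-8 bytes stored left-aligned in a 32-bit word, padded with zero bytes at the low end). The name `nbytes_sketch` is hypothetical, and this is not necessarily the exact code in the PR.

```julia
# Hypothetical helper for illustration only; not part of Base.
function nbytes_sketch(c::Char)
    # Assumption: the encoded bytes sit in the high end of the 32-bit word.
    u = reinterpret(UInt32, c)
    # Whole zero bytes at the low end are padding rather than encoded output.
    n = sizeof(UInt32) - (trailing_zeros(u) >> 3)
    # '\0' is all zero bits, so `n` is 0 here, yet it still occupies one byte.
    return n + iszero(u)
end

# Compare against the built-in for 1- to 4-byte characters and for '\0'.
for c in ('a', '\u00e9', '\u20ac', '\U1F600', '\0')
    println(repr(c), " => ", nbytes_sketch(c), " (Base: ", ncodeunits(c), ")")
end
```

The sketch counts trailing zero bytes rather than inspecting each byte separately, so a zero byte between two non-zero bytes (possible only in a malformed `Char`) still counts toward the length.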

Correctness has been verified with Supposition.jl, using the existing
implementation as an oracle:

```julia
julia> using Supposition

julia> const chars = Data.Characters()

julia> @check max_examples=1_000_000 function bytesenc(i=Data.Integers{UInt32}())
           c = reinterpret(Char, i)
           ncodeunits(c) == nbytesdiv(c)
       end;
Test Summary: | Pass  Total  Time
bytesenc      |    1      1  1.0s

julia> ncodeunits('\0') == nbytesencoded('\0')
true
```

Let's see if CI agrees!

Notably, neither the existing nor the new implementation checks whether
the given `Char` is valid, since the only thing that matters is
how many bytes are written out.
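
To make that point concrete, the following sketch builds a malformed `Char` from a raw bit pattern and compares `ncodeunits` with the number of bytes actually written. The bit pattern `0xc0000000` (a lone lead byte) and the variable names are chosen here purely for illustration.

```julia
# A lone 0xC0 lead byte with no continuation byte: a malformed Char.
bad = reinterpret(Char, 0xc0000000)

# The Char is not valid Unicode...
println(isvalid(bad))

# ...but writing it still emits its raw bytes, and ncodeunits reports that count.
written = write(IOBuffer(), bad)
println(ncodeunits(bad) == written)
```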

---------

Co-authored-by: Sukera <Seelengrab@users.noreply.github.com>
1 parent f870ea0
Improve performance of `ncodeunits(::Char)` (#54001)