Revision d183ee1bc0f65312bbc9406af7dafd0333aa5aa6 authored by Sukera on 09 April 2024, 21:06:49 UTC, committed by GitHub on 09 April 2024, 21:06:49 UTC
This improves performance of `ncodeunits(::Char)` by simply counting the number of non-zero bytes (except for `\0`, which is encoded as all zero bytes). For a performance comparison, see [this gist]( https://gist.github.com/Seelengrab/ebb02d4b8d754700c2869de8daf88cad); there's an up to 10x improvement here for collections of `Char`, with a minor improvement for single `Char` (with much smaller spread). The version in this PR is called `nbytesencoded` in the benchmarks. Correctness has been verified with Supposition.jl, using the existing implementation as an oracle: ```julia julia> using Supposition julia> const chars = Data.Characters() julia> @check max_examples=1_000_000 function bytesenc(i=Data.Integers{UInt32}()) c = reinterpret(Char, i) ncodeunits(c) == nbytesdiv(c) end; Test Summary: | Pass Total Time bytesenc | 1 1 1.0s julia> ncodeunits('\0') == nbytesencoded('\0') true ``` Let's see if CI agrees! Notably, neither the existing nor the new implementation check whether the given `Char` is valid or not, since the only thing that matters is how many bytes are written out. --------- Co-authored-by: Sukera <Seelengrab@users.noreply.github.com>
1 parent f870ea0
File | Mode | Size |
---|---|---|
.devcontainer | ||
.github | ||
base | ||
cli | ||
contrib | ||
deps | ||
doc | ||
etc | ||
src | ||
stdlib | ||
test | ||
.buildkite-external-version | -rw-r--r-- | 5 bytes |
.clang-format | -rw-r--r-- | 3.3 KB |
.clangd | -rw-r--r-- | 114 bytes |
.codecov.yml | -rw-r--r-- | 52 bytes |
.git-blame-ignore-revs | -rw-r--r-- | 371 bytes |
.gitattributes | -rw-r--r-- | 65 bytes |
.gitignore | -rw-r--r-- | 571 bytes |
.mailmap | -rw-r--r-- | 12.7 KB |
CITATION.bib | -rw-r--r-- | 513 bytes |
CITATION.cff | -rw-r--r-- | 1012 bytes |
CONTRIBUTING.md | -rw-r--r-- | 23.4 KB |
HISTORY.md | -rw-r--r-- | 388.0 KB |
LICENSE.md | -rw-r--r-- | 1.3 KB |
Make.inc | -rw-r--r-- | 56.4 KB |
Makefile | -rw-r--r-- | 30.4 KB |
NEWS.md | -rw-r--r-- | 4.4 KB |
README.md | -rw-r--r-- | 7.4 KB |
THIRDPARTY.md | -rw-r--r-- | 3.9 KB |
VERSION | -rw-r--r-- | 11 bytes |
julia.spdx.json | -rw-r--r-- | 37.8 KB |
pkgimage.mk | -rw-r--r-- | 1.4 KB |
sysimage.mk | -rw-r--r-- | 4.2 KB |
typos.toml | -rw-r--r-- | 78 bytes |
Computing file changes ...