https://github.com/JuliaLang/julia

9cbee21 parentindices and parent of substring (#49511) * parentindices and parent of substring * upd docs * Update base/abstractarray.jl Co-authored-by: Jakob Nybo Nissen <jakobnybonissen@gmail.com> * Update test/strings/basic.jl Co-authored-by: Jakob Nybo Nissen <jakobnybonissen@gmail.com> --------- Co-authored-by: Jakob Nybo Nissen <jakobnybonissen@gmail.com> 21 May 2023, 07:58:19 UTC
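The commit above extends `parent` and `parentindices` to `SubString`, analogous to array views. A minimal hedged sketch of the intended usage (the exact return shape of `parentindices` is my assumption from the PR title, not verified against it; requires a Julia build including #49511):

```julia
# A SubString is a view into a parent String.
s = SubString("hello world", 7, 11)   # the "world" part

@show parent(s)          # the underlying String, "hello world"
@show parentindices(s)   # which indices of the parent the view covers
```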
046f610 t0 is in counts_ctx 21 May 2023, 01:00:18 UTC
8e03be1 Improve `isassigned` implementation (#49827) Unless `isassigned` is called on `Array` with `Int`s, it uses a try/catch, which is notoriously slow. This PR changes the default implementation of `isassigned` to coerce the provided indices to `Int`s and convert them to linear or cartesian indices, depending on the array's `IndexStyle`. This also overloads `isassigned` for many of the array types defined in Base. Fixes: https://github.com/JuliaLang/julia/issues/44720 20 May 2023, 20:52:22 UTC
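The semantics that the `isassigned` change above preserves (while replacing the slow try/catch fallback) can be seen in a small example — `isassigned` reports whether a slot holds a value, and never throws:

```julia
# Three slots, none initialized yet.
A = Vector{Vector{Int}}(undef, 3)
A[1] = [1, 2, 3]

@show isassigned(A, 1)   # true: slot 1 was set
@show isassigned(A, 2)   # false: still undefined
@show isassigned(A, 10)  # false: out of bounds, but no throw
```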
d2f5bbd REPLCompletions: use a fixed world age for `REPLInterpreter` inference (#49880) This commit uses a fixed world age for `REPLInterpreter` inference, making `REPLInterpreter` robust against potential invalidations of `Core.Compiler` methods. It also generates code cache for `REPLInterpreter` at the fixed world age so that the time to first completion stays (almost) the same. 20 May 2023, 10:45:15 UTC
4d3000b follow up #49889, pass `sv::AbsIntState` to `concrete_eval_call` (#49904) `sv` is not used by `NativeInterpreter`, but is used by external `AbstractInterpreter` like JET.jl. 20 May 2023, 07:08:45 UTC
1ef9f37 [NFC] cosmetic refactor of `abstract_call_method_with_const_args` (#49889) This commit is a collection of minor NFC (No Functional Change) modifications. Essentially, it is a cosmetic refactor, so there should be no changes in terms of compiler functionality. The specific changes include:
- Making `concrete_eval_eligible` always return a `Symbol`, one of `:concrete_eval`, `:semi_concrete_eval`, or `:none`, clarifying its return value's meaning
- Splitting `abstract_call_method_with_const_args` into more granular subroutines
- Rearranging the subroutines in `abstract_call_method_with_const_args` to ensure that the processing flow of the code can be followed when read from top to bottom
20 May 2023, 04:49:20 UTC
5dafc84 subtype: add a fast-path for Union parameters (#49878) For #49857 performance. The union explosion is caused by the following MWE: `Type{Vector{Union{....}}} <: Type{Array{T}} where {T}`. 280f9993608956f76eac30fc85e1c6ebbca4f5e6 only fixes the case of `Union{......}` without free `TypeVar`s. This fast-path makes sure the remaining cases get fixed. 20 May 2023, 01:14:15 UTC
6d70d2a Attempting to add debug logs for ENQUEUING an invalid object (#49741) * Attempting to add debug logs for ENQUEUING an invalid object Check for the object's validity _before enqueuing_ so that we can hopefully give a more useful error message (which object's pointer was corrupted). --------- Co-authored-by: Diogo Netto <diogonetto.dcn@gmail.com> 19 May 2023, 21:49:47 UTC
6b2ba1d Update src/timing.h Co-authored-by: Cody Tapscott <84105208+topolarity@users.noreply.github.com> 19 May 2023, 21:49:18 UTC
3500ba4 Move t0 init to jl_timing_block_start 19 May 2023, 21:30:43 UTC
a43ca05 limit printing depth of argument types in stack traces (#49795) Co-authored-by: Tim Holy <tim.holy@gmail.com> 19 May 2023, 20:14:26 UTC
1acec74 Make `apply_type_nothrow` robust against `TypeVar`s in upper bounds (#49863) For types like `Foo{S, T<:S}`, `apply_type_nothrow` could in some situations check whether the argument is a subtype of the upper bound of `T`, i.e. `S`, but subtyping against a plain `TypeVar` would fail. Instead, return `false` in this case. Fixes #49785. 19 May 2023, 10:04:55 UTC
863e131 Time events instead of subsystems 19 May 2023, 01:27:01 UTC
c99d839 Merge pull request #49861 from JuliaLang/sf/null_terminate_path [cli] Ensure that probed `libstdc++` path is NULL-terminated 18 May 2023, 19:18:32 UTC
7111597 [cli] Ensure that probed `libstdc++` path is NULL-terminated It appears that we were assuming our path was initialized with zeros, but that is not a safe assumption. 18 May 2023, 17:26:52 UTC
ce3909c inference: prioritize `SlotNumber`-constraint over `MustAlias`-constraint (#49856) Currently an external `AbstractInterpreter` that uses `MustAliasesLattice` can fail to propagate the type constraint on `SlotNumber` in the call-site refinement, e.g. it fails to infer the return type of `firstitem(::ItrList)` in the following code:
```julia
struct ItrList
    list::Union{Tuple{},Vector{Int}}
end
hasitems(list) = length(list) >= 1
function firstitem(ilist::ItrList)
    list = ilist.list
    if hasitems(list)
        return list
    end
    error("list is empty")
end
```
(xref: <https://github.com/aviatesk/JET.jl/issues/509#issuecomment-1546658476>) This commit fixes it up, and also fixes the implementation of `from_interprocedural!` so that it uses the correct lattice. 18 May 2023, 02:54:39 UTC
a612388 reflection: declare keyword arguments types for reflection methods (#49783) 18 May 2023, 01:27:57 UTC
cb7d141 Overwrite random value in root timing slot 17 May 2023, 19:54:18 UTC
5a503b0 Print total time and self timing with timing counts 17 May 2023, 19:48:24 UTC
5774029 Sort timing names 17 May 2023, 19:30:37 UTC
8b4bb89 Allow number of subsystems to exceed 64 17 May 2023, 19:15:15 UTC
8e3a756 jl_timing_enable_mask -> jl_timing_disable_mask 17 May 2023, 19:04:17 UTC
98b64b2 Merge pull request #49842 from JuliaLang/sf/dont_eagerly_load_libgomp Don't depend on `CompilerSupportLibraries_jll` from `OpenBLAS_jll` 17 May 2023, 19:00:26 UTC
4dc683b Make jl_timing_enable_mask atomic 17 May 2023, 18:55:40 UTC
84d4b92 Print timing outputs as CSV 17 May 2023, 18:49:17 UTC
d50e25e Make jl_timing_counts atomic 17 May 2023, 18:31:19 UTC
3583fae Tracy: add source-code information to lowering and macro zones. (#49802) 17 May 2023, 18:14:39 UTC
869c70e follow up #49812, fix the wrong type declaration (#49854) JuliaLang/julia#49812 introduced a bug and broke the CI. This commit fixes it up. 17 May 2023, 17:47:43 UTC
34a2436 Remove CSL from the test suite 17 May 2023, 16:12:36 UTC
becaa78 Add optnone to invoke wrappers (#44590) 17 May 2023, 15:31:14 UTC
0b599ce Fix --image-codegen (#49631) 17 May 2023, 13:16:35 UTC
10dc33e Revert "Dark and light images for README.md" (#49819) 17 May 2023, 09:59:10 UTC
ff012aa improve inferrability of loading.jl (#49812) 17 May 2023, 08:52:36 UTC
c245179 fix missing gc root on store to iparams (#49820) Try to optimize the order of this code a bit more, given that these checks are somewhat infrequently needed. Fix #49762 16 May 2023, 20:14:58 UTC
45748b8 [Profile] fix overhead counts in format=:flat (#49824) Regression caused by #41742, which inverted the loop without inverting the logic. And fix a number of related formatting mistakes. Fix #49732 16 May 2023, 20:14:14 UTC
ee0199f Various improvements to peakflops() (#49833) * Various improvements to peakflops Use 4096 as the default matrix size Add kwarg to pick the type of elements in the matrix Add kwarg for number of trials and pick best time 16 May 2023, 19:44:55 UTC
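Given the `peakflops` changes described above, usage might look like the following sketch. The keyword names `eltype` and `ntrials` are my guesses from the commit message, not verified against the merged API:

```julia
using LinearAlgebra

# Benchmark matrix-multiply throughput. With the changes in #49833 the
# default size is 4096, the element type and number of trials are
# configurable, and the best trial is reported.
# NOTE: kwarg names below are assumptions from the commit message.
flops = LinearAlgebra.peakflops(4096; eltype=Float32, ntrials=3)
println(flops, " FLOPS")
```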
520b639 Merge pull request #49647 from topolarity/timing-refactor Make `jl_timer_block_t` allocation separate from timer-start (JL_TIMING) 16 May 2023, 19:19:12 UTC
4d0f35d Don't depend on `CompilerSupportLibraries_jll` from `OpenBLAS_jll` This is important because CSL_jll loads in many other libraries that we may or may not care that much about, such as `libstdc++` and `libgomp`. We load `libstdc++` eagerly on Linux, so that will already be loaded in all cases that we care about, however on macOS we don't generally want that loaded, and this suppresses that. `libgomp` is needed by BB-provided software that uses OpenMP during compilation, however it can conflict with software compiled by the Intel compilers, such as `MKL`. It's best to allow MKL to load its OpenMP libraries first, so delaying loading `libgomp` until someone actually calls `using CompilerSupportLibraries_jll` is the right thing to do. In the future, we want to rework JLLs such that libraries aren't eagerly loaded at JLL `__init__()` time, but rather they should be JIT loaded upon first usage of the library handle itself. This would allow BB to emit much more fine-grained dependency structures, so that the distribution of a set of libraries can happen together, but the loading of said libraries would be independent. 16 May 2023, 18:35:37 UTC
c55000a add a hash value to Typeofwrapper objects (#49725) Strictly speaking, we probably should not do this, but the performance gain is too great to ignore. 16 May 2023, 18:09:57 UTC
78fbf1b docs: fix code formatting and add some spaces (#49814) 16 May 2023, 06:11:38 UTC
dfbcc45 ensure all `isequal` methods are inferred to return `Bool` (#49800) This would help inference on `Core.Compiler.return_type(isequal, tt)` when `tt` is not well inferred (e.g. `tt` is inferred to `Tuple{Any,Any}`), although JuliaLang/julia#46810 may disable this `Core.Compiler.return_type` improvement for good reasons. In any case, it is explicitly stated in the documentation that `isequal` methods should always return a `Bool`. So, not only does this annotation assist inference, it also serves to ensure the correctness of our code base, and therefore should be beneficial. We may need to take similar measures for `isless` and `isgreater` (in separate PRs). 16 May 2023, 02:10:13 UTC
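The annotation technique described above — asserting the return value so inference concludes `Bool` even with abstract inputs — can be sketched as follows. The names here are illustrative, not the actual Base definitions:

```julia
# A return-type assertion guarantees this method is inferred as Bool,
# even when the arguments are only known as Any.
my_isequal(@nospecialize(x), @nospecialize(y)) = (x == y)::Bool

# Inference on fully abstract argument types still yields Bool.
@show Core.Compiler.return_type(my_isequal, Tuple{Any,Any})
```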
909c57f Merge pull request #49535 from JuliaLang/pc/ittapi-invalidations Count invalidations, JIT memory, and image memory in profiling reports 15 May 2023, 21:46:22 UTC
b9806d6 irinterp: Don't try to rekill fall-through terminators (#49815) If a fall-through terminator was already Bottom, we should not attempt to rekill the successor edge, because it was already deleted. Yet another fix in the #49692, #49750, #49797 series, which is turning out to be quite a rabbit hole. Also fix a typo in the verifier tweak where we were looking at the BB idx rather than the terminator idx. 15 May 2023, 20:47:14 UTC
d489203 Merge pull request #49822 from JuliaLang/jn/jit-dylib-order jitlayers: move the local dylibs ahead of the global one 15 May 2023, 20:33:58 UTC
74addd3 Reset `active` status for timing zone upon block entry 15 May 2023, 19:14:50 UTC
76fbd61 jitlayers: move the local dylibs ahead of the global one 15 May 2023, 18:19:26 UTC
3cadb6c Remove pointer indirection in `_TRACY_STOP` 15 May 2023, 16:29:12 UTC
1f161b4 only time inference if any work is actually done (#49817) 15 May 2023, 14:44:03 UTC
15d7bd8 Simplify `mul!` dispatch (#49806) 15 May 2023, 14:42:34 UTC
fbbe9ed Merge pull request #49664 from JuliaLang/jn/ml-matches-rewritten reorder ml-matches to avoid catastrophic performance case 15 May 2023, 14:38:44 UTC
9dd3090 Fix thread safety in `atexit(f)`: Lock access to atexit_hooks (#49774) - atexit(f) mutates global shared state. - atexit(f) can be called anytime by any thread. - Accesses & mutations to global shared state must be locked if they can be accessed from multiple threads. Add unit test for thread safety of adding many atexit functions in parallel 15 May 2023, 14:09:02 UTC
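The locking pattern described above — guarding a global hook list so registration is safe from any thread — might be sketched as follows. This is a standalone illustration with hypothetical names, not Base's actual implementation:

```julia
# Global mutable state plus a lock guarding every access to it.
const my_hooks = Function[]
const my_hooks_lock = ReentrantLock()

function my_atexit(f::Function)
    lock(my_hooks_lock) do      # serialize concurrent registrations
        pushfirst!(my_hooks, f) # hooks run LIFO, like Base.atexit
    end
end

function run_my_hooks()
    lock(my_hooks_lock) do
        foreach(f -> f(), my_hooks)
    end
end
```

Registering from `Threads.@spawn` tasks is then safe, since every push goes through the lock.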
f7b0cf2 fix cross-reference link in variables.md (#49779) 15 May 2023, 14:05:37 UTC
be33e66 Core.Compiler: remove unused variable `phi_ssas` (#49816) 15 May 2023, 14:04:28 UTC
edf55b9 timing: Create ITTAPI events on the fly Instead of initializing all ITTAPI events during init, this change makes ITTAPI events use a statically-allocated object to track whether the event has been created. This makes our generation of events more similar to the Tracy API, where source locations are generated statically, in-line at each macro call-site instead of constructing them all up front. 15 May 2023, 11:04:35 UTC
f9c9d22 Split GC_Sweep JL_TIMING event into incremental/full versions 15 May 2023, 11:04:33 UTC
2e7f2ef timing: Introduce `JL_TIMING_CREATE_BLOCK` to separate alloc/init This includes several changes to the TIMING API:
- Adds `JL_TIMING_CREATE_BLOCK(block, subsystem, event)` to create a timing block _without_ starting it
- Adds `jl_timing_block_start` to start a timing block which was created with `JL_TIMING_CREATE_BLOCK`
- Removes the C++-specific RAII implementation for JL_TIMING. Although it'd be nice to support JL_TIMING without GCC/Clang, the reality is that the C API prevents that from being achievable.
- Renames `JL_TIMING_CURRENT_BLOCK` to `JL_TIMING_DEFAULT_BLOCK`
To summarize, `JL_TIMING(subsystem, event)` is now equivalent to:
```
JL_TIMING_CREATE_BLOCK(__timing_block, subsystem, event);
jl_timing_block_start(&__timing_block);
```
which also means that conditional events can be supported with:
```
JL_TIMING_CREATE_BLOCK(__timing_block, subsystem, event);
if (condition)
    jl_timing_block_start(&__timing_block);
```
15 May 2023, 11:04:00 UTC
e4924c5 add devdocs how to profile package precompilation with tracy (#49784) 15 May 2023, 07:58:29 UTC
344f1f5 Fixups for the `reinterpret` docstring (#49807) 14 May 2023, 17:56:11 UTC
4ed4195 Merge pull request #49790 from topolarity/tracy-checksums Update LibTracyClient checksums 14 May 2023, 00:32:34 UTC
7e1431f irinterp: Don't introduce invalid CFGs (#49797) This is yet another followup to #49692 and #49750. With the introduced change, we kill the CFG edge from the basic block with the discovered error to its successors. However, we have an invariant in the verifier that the CFG should always match the IR. Turns out this is for good reason, as we assume in a number of places (including, ironically, in the irinterp) that a GotoNode/GotoIfNot terminator means that the BB has the corresponding number of successors in the IR. Fix all this by killing the rest of the basic block when we discover that it is unreachable and, if possible, introducing an unreachable node at the end. However, of course, if the erroring statement is the fallthrough terminator itself, there is no space for an unreachable node. We fix this by tweaking the verification to allow this case, as it's really no worse than the other problems with fall-through terminators (#41476), but of course it would be good to address that as part of a more general IR refactor. 13 May 2023, 21:27:41 UTC
ee86c06 orc::MemProt -> jitlink::MemProt 13 May 2023, 15:56:18 UTC
6fbb9b7 Track loaded image size 13 May 2023, 15:50:56 UTC
60273a5 Add JIT memory counters 13 May 2023, 15:46:12 UTC
e10dbd0 Address reviews 13 May 2023, 15:44:29 UTC
0a05a5b improve type inference of `Base.aligned_sizeof` (#49801) This commit includes a bit of refactoring of `Base.aligned_sizeof` to make it more inference-friendly, especially in cases like `Base.aligned_sizeof(::Union{DataType,Union})`. In particular, it eliminates the chance of inference accounting for a method error of `datatype_alignment(::Union)` in the second branch. xref: <https://github.com/aviatesk/JET.jl/issues/512> 13 May 2023, 14:18:09 UTC
ac1cb1c optimize reordering of ml-matches to avoid unnecessary computations This now chooses the optimal SCC set based on the size of lim, which ensures we can assume this algorithm is now << O(n^2) in all reasonable cases, even though the algorithm we are using is O(n + e), where e may require up to n^2 work to compute in the worst case, but should require only about n*min(lim, log(n)) work in the expected average case. This also further pre-optimizes quick work (checking for existing coverage) and delays unnecessary work (computing for *ambig return). 12 May 2023, 19:02:39 UTC
6a5f51b reorder ml-matches to avoid catastrophic performance case This ordering of the algorithm abandons the elegant insertion in favor of using another copy of Tarjan's SCC code. This enables us to abort the algorithm in O(k*n) time, instead of always running full O(n*n) time, where k is `min(lim,n)`. For example, to sort 1338 methods:
Before:
```julia
julia> @time Base._methods_by_ftype(Tuple{typeof(Core.kwcall), NamedTuple, Any, Vararg{Any}}, 3, Base.get_world_counter());
  0.136609 seconds (22.74 k allocations: 1.104 MiB)

julia> @time Base._methods_by_ftype(Tuple{typeof(Core.kwcall), NamedTuple, Any, Vararg{Any}}, -1, Base.get_world_counter());
  0.046280 seconds (9.95 k allocations: 497.453 KiB)

julia> @time Base._methods_by_ftype(Tuple{typeof(Core.kwcall), NamedTuple, Any, Vararg{Any}}, 30000, Base.get_world_counter());
  0.132588 seconds (22.73 k allocations: 1.103 MiB)

julia> @time Base._methods_by_ftype(Tuple{typeof(Core.kwcall), NamedTuple, Any, Vararg{Any}}, 30000, Base.get_world_counter());
  0.135912 seconds (22.73 k allocations: 1.103 MiB)
```
After:
```julia
julia> @time Base._methods_by_ftype(Tuple{typeof(Core.kwcall), NamedTuple, Any, Vararg{Any}}, 3, Base.get_world_counter());
  0.001040 seconds (1.47 k allocations: 88.375 KiB)

julia> @time Base._methods_by_ftype(Tuple{typeof(Core.kwcall), NamedTuple, Any, Vararg{Any}}, -1, Base.get_world_counter());
  0.039167 seconds (8.24 k allocations: 423.984 KiB)

julia> @time Base._methods_by_ftype(Tuple{typeof(Core.kwcall), NamedTuple, Any, Vararg{Any}}, 30000, Base.get_world_counter());
  0.081354 seconds (8.26 k allocations: 424.734 KiB)

julia> @time Base._methods_by_ftype(Tuple{typeof(Core.kwcall), NamedTuple, Any, Vararg{Any}}, 30000, Base.get_world_counter());
  0.080849 seconds (8.26 k allocations: 424.734 KiB)
```
This also makes inference faster in rare cases (this particular example came up because the expression below appears in `@test` macroexpansion), both before loading more packages, such as OmniPackage, and afterwards; with this change the cost is almost unchanged after loading, versus increasing about 50x before:
```julia
julia> f() = x(args...; kwargs...);

julia> @time @code_typed optimize=false f();
  0.143523 seconds (23.25 k allocations: 1.128 MiB, 99.96% compilation time) # before
  0.001172 seconds (1.86 k allocations: 108.656 KiB, 97.71% compilation time) # after
```
12 May 2023, 19:02:39 UTC
d55314c allow loading extensions when a trigger is loaded from below the parent's load path (#49701) also allow loading extensions of the active project 12 May 2023, 18:54:17 UTC
c6fc12c fix build failure with dyld4 deadlock workaround (#49776) Accidentally missed in #49740 Fixes #49773 12 May 2023, 14:02:12 UTC
e365e57 Add LibTracyClient checksums 12 May 2023, 13:01:34 UTC
7bd3977 Update NEWS.md for grammar (#49759) [skip ci] 12 May 2023, 12:00:44 UTC
46b8a35 remove duplicate gc_try_claim_and_push (#49780) 12 May 2023, 12:00:21 UTC
2f6941f Update stable version in README.md to 1.9.0 (#49767) 12 May 2023, 11:59:41 UTC
ae6484d Update toolchain requirements and LLVM build docs (#49742) 12 May 2023, 11:59:29 UTC
6733197 Artifacts: pull out a recursive function from a closure to a stand alone function (#49755) 12 May 2023, 09:13:17 UTC
021015d Remove 1.9 package extension news item from NEWS.md (#49786) This feature was already shipped with 1.9 so it probably shouldn't be mentioned a second time in the 1.10 NEWS.md. 12 May 2023, 08:13:47 UTC
9002d16 abstractarray: fix `append!(::AbstractVector, ...)` interface (#49754) JuliaLang/julia#47154 mistakenly added the `@_safeindex` macro on the `_append!(a::AbstractVector, ::Union{HasLength,HasShape}, iter)` method, although `@_safeindex` is only valid for builtin vectors, i.e. `Vector`. This commit adds an `isa` check so that `@_safeindex` is only applied to builtin vectors. The `isa` check should be removed at compile time, so it should not affect the runtime performance. closes #49748 12 May 2023, 06:35:06 UTC
1d58f24 adopt `Core.Compiler.get_max_methods` changes from #46810 (#49781) Since this part of refactoring is generally useful and I would like to utilize it for other PRs that may be merged before #46810. 12 May 2023, 05:55:49 UTC
1dc2ed6 experiment with `@nospecializeinfer` on `Core.Compiler` This commit adds the `@nospecializeinfer` macro on various `Core.Compiler` functions and achieves the following sysimage size reduction:

|                                   | this commit | master      | ratio   |
| --------------------------------- | ----------- | ----------- | ------- |
| `Core.Compiler` compilation (sec) | `66.4551`   | `71.0846`   | `0.935` |
| `corecompiler.jl` (bytes)         | `17638080`  | `18407248`  | `0.958` |
| `sys.jl` (bytes)                  | `88736432`  | `89361280`  | `0.993` |
| `sys-o.a` (bytes)                 | `189484400` | `189907096` | `0.998` |

12 May 2023, 05:16:31 UTC
ce2275c introduce `@nospecializeinfer` macro to tell the compiler to avoid excess inference This commit introduces a new compiler annotation called `@nospecializeinfer`, which allows us to request the compiler to avoid excessive inference.

## `@nospecialize` mechanism

To discuss `@nospecializeinfer`, let's first understand the behavior of `@nospecialize`. Its docstring says that

> This is only a hint for the compiler to avoid excess code generation.

and it works by suppressing dispatches with complex runtime occurrences of the annotated arguments. This can be understood with the example below:
```julia
julia> function call_func_itr(func, itr)
           local r = 0
           r += func(itr[1])
           r += func(itr[2])
           r += func(itr[3])
           r
       end;

julia> _isa = isa; # just for the sake of explanation, global variable to prevent inlining

julia> func_specialize(a) = _isa(a, Function);

julia> func_nospecialize(@nospecialize a) = _isa(a, Function);

julia> dispatchonly = Any[sin, muladd, nothing]; # untyped container can cause excessive runtime dispatch

julia> @code_typed call_func_itr(func_specialize, dispatchonly)
CodeInfo(
1 ─ %1  = π (0, Int64)
│   %2  = Base.arrayref(true, itr, 1)::Any
│   %3  = (func)(%2)::Any
│   %4  = (%1 + %3)::Any
│   %5  = Base.arrayref(true, itr, 2)::Any
│   %6  = (func)(%5)::Any
│   %7  = (%4 + %6)::Any
│   %8  = Base.arrayref(true, itr, 3)::Any
│   %9  = (func)(%8)::Any
│   %10 = (%7 + %9)::Any
└──       return %10
) => Any

julia> @code_typed call_func_itr(func_nospecialize, dispatchonly)
CodeInfo(
1 ─ %1  = π (0, Int64)
│   %2  = Base.arrayref(true, itr, 1)::Any
│   %3  = invoke func(%2::Any)::Any
│   %4  = (%1 + %3)::Any
│   %5  = Base.arrayref(true, itr, 2)::Any
│   %6  = invoke func(%5::Any)::Any
│   %7  = (%4 + %6)::Any
│   %8  = Base.arrayref(true, itr, 3)::Any
│   %9  = invoke func(%8::Any)::Any
│   %10 = (%7 + %9)::Any
└──       return %10
) => Any
```
The calls of `func_specialize` remain `:call` expressions (so that they are dispatched and compiled at runtime) while the calls of `func_nospecialize` are resolved as `:invoke` expressions. This is because `@nospecialize` requests the compiler to give up compiling `func_nospecialize` with runtime argument types and instead compile it with the declared argument types, allowing `call_func_itr(func_nospecialize, dispatchonly)` to avoid runtime dispatches and accompanying JIT compilations (i.e. "excess code generation"). The difference is evident when checking `specializations`:
```julia
julia> call_func_itr(func_specialize, dispatchonly)
2

julia> length(Base.specializations(only(methods(func_specialize))))
3 # w/ runtime dispatch, multiple specializations

julia> call_func_itr(func_nospecialize, dispatchonly)
2

julia> length(Base.specializations(only(methods(func_nospecialize))))
1 # w/o runtime dispatch, the single specialization
```
The problem here is that it influences dispatch only, and does not intervene in inference in any way. So there is still a possibility of "excess inference" when the compiler sees a considerable complexity of argument types during inference:
```julia
julia> func_specialize(a) = _isa(a, Function); # redefine func to clear the specializations

julia> @assert length(Base.specializations(only(methods(func_specialize)))) == 0;

julia> func_nospecialize(@nospecialize a) = _isa(a, Function); # redefine func to clear the specializations

julia> @assert length(Base.specializations(only(methods(func_nospecialize)))) == 0;

julia> withinfernce = tuple(sin, muladd, "foo"); # typed container can cause excessive inference

julia> @time @code_typed call_func_itr(func_specialize, withinfernce);
  0.000812 seconds (3.77 k allocations: 217.938 KiB, 94.34% compilation time)

julia> length(Base.specializations(only(methods(func_specialize))))
4 # multiple method instances inferred

julia> @time @code_typed call_func_itr(func_nospecialize, withinfernce);
  0.000753 seconds (3.77 k allocations: 218.047 KiB, 92.42% compilation time)

julia> length(Base.specializations(only(methods(func_nospecialize))))
4 # multiple method instances inferred
```
The purpose of this PR is to implement a mechanism that allows us to avoid excessive inference to reduce the compilation latency when inference sees a considerable complexity of argument types.

## Design

Here are some ideas to implement the functionality:
1. make `@nospecialize` block inference
2. add nospecializeinfer effect when `@nospecialize`d method is annotated as `@noinline`
3. implement as `@pure`-like boolean annotation to request nospecializeinfer effect on top of `@nospecialize`
4. implement as annotation that is orthogonal to `@nospecialize`

After trying 1 ~ 3., I decided to submit 3.

### 1. make `@nospecialize` block inference

This is almost the same as what Jameson has done at <https://github.com/vtjnash/julia/commit/8ab7b6b94079b842b5db9f3fe29eb9d2708f5d1e>. It turned out that this approach performs very badly because some of the `@nospecialize`'d arguments still need inference to perform reasonably. For example, it's obvious that the following definition of `getindex(@nospecialize(t::Tuple), i::Int)` would perform very badly if `@nospecialize` blocked inference, because of a lack of useful type information for succeeding optimizations: <https://github.com/JuliaLang/julia/blob/12d364e8249a07097a233ce7ea2886002459cc50/base/tuple.jl#L29-L30>

### 2. add nospecializeinfer effect when `@nospecialize`d method is annotated as `@noinline`

The important observation is that we often use `@nospecialize` even when we expect inference to forward type and constant information. Conversely, we may be able to exploit the fact that we usually don't expect inference to forward information to a callee when we annotate it with `@noinline` (i.e. when adding `@noinline`, we're usually fine with disabling inter-procedural optimizations other than resolving dispatch). So the idea is to enable the inference suppression when a `@nospecialize`'d method is annotated as `@noinline` too. It's a reasonable choice and can be efficiently implemented with #41922. But it sounds a bit weird to associate a no-infer effect with `@noinline`, and there may also be cases where we want to inline a method while partly avoiding inference, e.g.:
```julia
# the compiler will always infer with `f::Any`
@noinline function twof(@nospecialize(f), n) # this method body is very simple and should be eligible for inlining
    if occursin('+', string(typeof(f).name.name::Symbol))
        2 + n
    elseif occursin('*', string(typeof(f).name.name::Symbol))
        2n
    else
        zero(n)
    end
end
```

### 3. implement as `@pure`-like boolean annotation to request nospecializeinfer effect on top of `@nospecialize`

This is what this commit implements. It basically replaces the previous `@noinline` flag with a newly-introduced annotation named `@nospecializeinfer`. It is still associated with `@nospecialize` and only has an effect when used together with `@nospecialize`, but now it is not associated with `@noinline`, which helps us reason about the behavior of `@nospecializeinfer` and experiment with its effect more safely:
```julia
# the compiler will always infer with `f::Any`
Base.@nospecializeinfer function twof(@nospecialize(f), n) # the compiler may or may not inline this method
    if occursin('+', string(typeof(f).name.name::Symbol))
        2 + n
    elseif occursin('*', string(typeof(f).name.name::Symbol))
        2n
    else
        zero(n)
    end
end
```

### 4. implement as annotation that is orthogonal to `@nospecialize`

Actually, we could have `@nospecialize` and `@nospecializeinfer` separately, which would allow us to configure compilation strategies in a more fine-grained way.
```julia
function noinfspec(Base.@nospecializeinfer(f), @nospecialize(g))
    ...
end
```
I'm fine with this approach, but at the same time I'm afraid of having too many annotations of a related sort (I expect we would annotate both `@nospecializeinfer` and `@nospecialize` in this scheme).
Co-authored-by: Mosè Giordano <giordano@users.noreply.github.com> Co-authored-by: Tim Holy <tim.holy@gmail.com> 12 May 2023, 05:16:31 UTC
6a2e50d add docs for `Base.return_types` (#49744) Co-authored-by: Shuhei Kadowaki <40514306+aviatesk@users.noreply.github.com> 12 May 2023, 02:35:20 UTC
8d0282c Merge pull request #49770 from topolarity/fix-timing-warnings Fix various warnings with JL_TIMING enabled 12 May 2023, 00:12:55 UTC
b21f100 Make `*Triangular` handle units (#43972) 11 May 2023, 19:06:25 UTC
a09f426 Fix visibility of `timing.h` exports 11 May 2023, 17:53:22 UTC
e642cb9 Enable `TRACY_TIMER_FALLBACK` for libTracyClient This fallback is most likely to kick in on VMs that may not have support for the rdtsc instruction. The loss in timer fidelity can be pretty severe (8 ns -> 15.6 ms), but we pack lots of other metadata into our traces, so it can still be useful to run with the fallback (and forcefully crashing the application, as Tracy does now, is just not the Julian way to communicate a hard error anyway). 11 May 2023, 17:52:39 UTC
65c3b41 Initialize `last_alloc` The uninit usage analyzer appears to be thrown off by the `__attribute__((cleanup(*)))` used by the `JL_TIMING` macro, so work around it by explicitly initializing `last_alloc`. 11 May 2023, 17:52:39 UTC
528949f macOS: avoid deadlock inside dyld4 deadlock workaround (#49740) Extend the fix for #43578 (2939272af2ef3fe9d8921f7ed0a6500e31a550c9) to cover the deadlock bug present internally in dyld4 inside the function we use to avoid the previous deadlock issue. Fix #49733 11 May 2023, 17:29:07 UTC
c714e2e Add JITLink ELF debugger support (#47037) 11 May 2023, 17:09:54 UTC
e4633e0 add note about references in `Out` (#49729) 11 May 2023, 17:04:35 UTC
e3e5eaa 🤖 [master] Bump the Pkg stdlib from 94f668cee to daf02a458 (#49764) Co-authored-by: Dilum Aluthge <dilum@aluthge.com> 11 May 2023, 16:26:38 UTC
6618d44 minor follow up on #49692 (#49752) 11 May 2023, 08:40:56 UTC
0b5ec1f irinterp: Fix accidentally introduced deletion of effectful statement (#49750) I moved around some code in #49692 that broadened the replacement of statements by their const results. This is fine for how we're currently using irinterp in base, because we're requiring some fairly strong effects, but some downstream pipelines (and potentially Base in the future) want to use irinterp on code with arbitrary effects, so put in an appropriate check. 11 May 2023, 03:35:44 UTC
7757e46 Excise support for LLVM 13 (#49722) 10 May 2023, 22:05:42 UTC
21d4c2f Merge pull request #48700 from JuliaLang/vc/upgrade_llvm15 Update LLVM to 15.0.7 10 May 2023, 21:59:41 UTC
056112e make better use of visibility attributes (#49600) This pragma enables compilers to generate more optimal code than the identical command line flag, for better performance, by moving objects out of the GOT into direct references and eliminating the unnecessary PLT jump. Note that setting dllimport similarly enables more performance optimizations, at the cost of making duplicate symbols for functions so that they no longer have unique addresses (similar to the side-effect of setting -Bsymbolic-functions on ELF). 10 May 2023, 16:54:47 UTC
24a5dc4 Add `@eval using REPL` to the `atreplinit` do block in REPL documentation. (#49717) 10 May 2023, 15:43:22 UTC
77c13ad Reenable NonTrivial Loop Unswitch 10 May 2023, 14:51:16 UTC
2ddbb5a Fix tests and static analyzer for LLVM 15 Co-authored-by: Gabriel Baraldi <baraldigabriel@gmail.com> Co-authored-by: Prem Chintalapudi <prem.chintalapudi@gmail.com> 10 May 2023, 14:15:09 UTC
9e3da19 Activate NewPM support Co-authored-by: Valentin Churavy <v.churavy@gmail.com> 10 May 2023, 14:15:09 UTC
190f841 Upgrade Julia to LLVM 15.0.7+5 Co-authored-by: Gabriel Baraldi <baraldigabriel@gmail.com> 10 May 2023, 14:15:09 UTC