Revision 3ddceeee167d18dc62e56aab7cd71add5f843e40 authored by Keno Fischer on 29 June 2023, 18:50:53 UTC, committed by GitHub on 29 June 2023, 18:50:53 UTC
This fixes #49715. The fix itself is pretty simple - just remove
the generator expansion that was added in #48766, but the bigger
question here is what the correct behavior should be in the first place.

# Dynamic Semantics, generally

The primary question here is about the semantics of generated functions.
Note that this is quite different to how they are implemented. In
general, the way we think about compiling Julia is that there is a
well defined set of *dynamic semantics* that specify what a particular
piece of Julia code means. Julia's dynamic semantics are generally quite
simple (at every point, call the most specific applicable method).
What happens under the hood may be quite different (e.g. lots of inference,
compilation, constant folding, etc.), but the compilation process should
mostly preserve the semantics (with a few well defined exceptions
around floating point arithmetic, effect assumptions, semantically
unobservable side effects, etc.).
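
As a small illustration of the "most specific applicable method" rule, here is a sketch using a made-up function `describe` (not part of this change). Whatever the compiler does under the hood, the observable result has to match this dispatch behavior:

```julia
# `describe` is a hypothetical function used only for illustration.
describe(x) = "something"
describe(x::Integer) = "an integer"
describe(x::Int8) = "an Int8"

# At each call, the most specific applicable method is selected
# based on the runtime type of the argument.
results = [describe(v) for v in Any[1.5, 1, Int8(1)]]
```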

# The dynamic semantics of generated functions

With that diatribe out of the way, let's think about the dynamic semantics
of generated functions. We haven't always been particularly clear about
this, but I propose it's basically the following:

For a generated function:
```
@generated function f(args...)
    #= generator body =#
end
```

this is semantically equivalent to basically the following:

```
const lno = LineNumberNode(@__LINE__, @__FILE__); function f(args...)
    generator = @opaque @assume_effects :foldable :generator (args...)->#= generator body =#
    body = generator(Base.get_world_counter(), lno, Core.Typeof.(args))
    execute(body, f, args...)
end
```

A couple of notes on this:

1. `@opaque` is used here for the world-age capture semantics of the generator itself
2. There's an effects-assumption `:generator` that doesn't exist but is
   supposed to capture the special allowance for calling generators. This
   is discussed more below.
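
To make the two phases concrete, here is a minimal sketch of an ordinary generated function. The name `describe_arg` is hypothetical. The point is only that the generator sees argument types, while the expression it returns later runs on the argument values:

```julia
# `describe_arg` is a hypothetical example, not code from this change.
@generated function describe_arg(x)
    # Generator phase: here `x` is bound to the *type* of the argument.
    typename = string(x)
    # The returned expression is the body; in it, `x` is the runtime value.
    return :(string($typename, " with value ", x))
end

msg = describe_arg(1)
# Build the expected string from `typeof(1)` so it holds on any platform.
expected = string(typeof(1), " with value ", 1)
```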

## Implementing `execute`

For a long time, we didn't really have a first-class implementation of `execute`.
It's almost (some liberties around the way that the arguments work, but you get
the idea)

```
execute_eval(body, f, args...) = eval(:((args...)->$body))(f, args...)
```

but that doesn't have the correct world age semantics (would error
as written and even if you used invokelatest, the body would run
in the wrong world).
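
The world-age problem can be seen in a minimal sketch (the helper name `eval_and_call` is hypothetical). Calling a freshly `eval`ed function directly from already-running code fails, and `invokelatest` only works around it by running in the latest world rather than the caller's:

```julia
# `eval_and_call` is a hypothetical helper used only for illustration.
function eval_and_call(body, x)
    # Defining the function via `eval` creates a method in a newer world
    # than the one this function is currently running in.
    f = Core.eval(@__MODULE__, :(x -> $body))
    # A direct call is made in the caller's (older) world and fails.
    direct = try
        f(x)
    catch err
        err
    end
    # `invokelatest` succeeds, but it runs the body in the latest world.
    latest = Base.invokelatest(f, x)
    return direct, latest
end

direct, latest = eval_and_call(:(x + 1), 41)
```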

However, with OpaqueClosure we do actually have a mechanism now and
we could write:

```
execute(body, f, args...) = OpaqueClosure(body, f)(args...)
```

Again, I'm not proposing this as an implementation, just to give us an idea
of what the dynamic semantics of generated functions are.

# The particular bug (#49715)

The issue in #49715 is that the following happens:
1. A generated function gets called and inference is attempted.
2. Inference attempts to infer the generated function and call the generator.
3. The generator throws an error.
4. Inference fails.
5. The compiler enters a generic inference-failure fallback path
6. The compiler asks for a generator expansion in the generic world (-1)
7. This gives a different error, confusing the user.
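
A generator that throws, as in step 3 above, can be as simple as the following sketch. The name `only_ints` is hypothetical. Which exact error surfaces depends on the Julia version, so the sketch only records that some exception reaches the caller:

```julia
# `only_ints` is a hypothetical example, not code from the issue.
@generated function only_ints(x)
    # Generator phase: reject unsupported argument types.
    x <: Integer || error("only_ints does not support ", x)
    return :(x + 1)
end

ok = only_ints(1)

# For an unsupported type the generator throws, and the call fails.
caught = try
    only_ints(1.0)
    nothing
catch err
    err
end
```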

There is the additional problem that this error gets thrown at
compilation time, which is not technically legal (and there was an
existing TODO to fix that).

In addition to that, I think there is a separate question of whether it
should be semantically legal to throw an error for a different world age
than the currently running one. Given the semantics proposed above, I
would suggest that the answer should be no. This does depend on the
exact semantics of :generator, but in general, our existing
effects-related notions do not allow particularly strong assumptions on
the particular error being thrown (requiring them to be re-evaluated
at runtime), and I see no reason to depart from this practice here.

Thus, I would suggest that the current behavior should be disallowed
and the expected behavior is that the generic fallback implementation
of generated functions invoke the generator in the runtime world and
expose the appropriate error.

# Should we keep the generic world?

That does leave the question of what to do about the generic world (-1).
I'm not 100% convinced that this is necessarily a useful concept to
have. It is true that most generated functions do not depend on the
world age, but they can already indicate this by returning a value
with bounded world range and no backedges (equivalently returning
a plain expression). On the other hand, keeping the generic world
does risk creating the inverse of the situation that prompted this
issue, in that there is no semantically reachable path to calling
the generator with the generic world, making it hard to debug.

As a result, I am very strongly leaning towards removing this concept,
but I am open to being convinced otherwise.

# This PR

This PR, which is considerably shorter than this commit message, is very
simple: The attempt to invoke the generator with the generic world -1
is removed. Instead, we fall back to the interpreter, which already
has the precise semantics that I want here - invoking the generator
in the dynamic world and interpreting the result.

# The semantics of :generator

That leaves one issue to be resolved which is the semantics of `:generator`.
I don't think it's necessary to be as precise here as we are about the
other effects we expose, but I propose it be something like the following:

For functions with the :generator effects assumption, :consistent-cy is
relaxed as follows:

1. The requisite notion of equality is relaxed to a "same code and
   metadata" equality of code instances. I don't think we have any
   predicate for this (and it's not necessarily computable), but the
   idea should be that the CodeInstance is always computed in the exact
   same way, but may be mutable and such. Note that this is explicitly
   not functional extensionality, because we do analyze the structure of
   the returned code and codegen based on it.

2. The world-age semantics of :consistent are sharpened to require
   our relaxed notion of consistency for any overlapping min_world:max_world
   range returned from the generator.

Co-authored-by: Oscar Smith <oscardssmith@gmail.com>
1 parent f6f3553
pkgimage.mk
SRCDIR := $(abspath $(dir $(lastword $(MAKEFILE_LIST))))
BUILDDIR := .
JULIAHOME := $(SRCDIR)
include $(JULIAHOME)/Make.inc

VERSDIR := v$(shell cut -d. -f1-2 < $(JULIAHOME)/VERSION)

# set some influential environment variables
export JULIA_DEPOT_PATH := $(build_prefix)/share/julia
export JULIA_LOAD_PATH := @stdlib
unexport JULIA_PROJECT :=
unexport JULIA_BINDIR :=

default: release
release: all-release
debug: all-debug
all: release debug

$(JULIA_DEPOT_PATH):
	mkdir -p $@

print-depot-path:
	@$(call PRINT_JULIA, $(call spawn,$(JULIA_EXECUTABLE)) --startup-file=no -e '@show Base.DEPOT_PATH')

STDLIBS := ArgTools Artifacts Base64 CRC32c FileWatching Libdl NetworkOptions SHA Serialization \
		   GMP_jll LLVMLibUnwind_jll LibUV_jll LibUnwind_jll MbedTLS_jll OpenLibm_jll PCRE2_jll \
		   Zlib_jll dSFMT_jll libLLVM_jll libblastrampoline_jll OpenBLAS_jll Printf Random Tar \
		   LibSSH2_jll MPFR_jll LinearAlgebra Dates Distributed Future LibGit2 Profile SparseArrays UUIDs \
		   SharedArrays TOML Test LibCURL Downloads Pkg LazyArtifacts Sockets Unicode Markdown \
		   InteractiveUtils REPL DelimitedFiles

all-release: $(addprefix cache-release-, $(STDLIBS))
all-debug:   $(addprefix cache-debug-, $(STDLIBS))

define pkgimg_builder
$1_SRCS := $$(shell find $$(build_datarootdir)/julia/stdlib/$$(VERSDIR)/$1/src -name \*.jl) \
    $$(wildcard $$(build_prefix)/manifest/$$(VERSDIR)/$1)
$$(BUILDDIR)/stdlib/$1.release.image: $$($1_SRCS) $$(addsuffix .release.image,$$(addprefix $$(BUILDDIR)/stdlib/,$2)) $(build_private_libdir)/sys.$(SHLIB_EXT)
	@$$(call PRINT_JULIA, $$(call spawn,$$(JULIA_EXECUTABLE)) --startup-file=no --check-bounds=yes -e 'Base.compilecache(Base.identify_package("$1"))')
	@$$(call PRINT_JULIA, $$(call spawn,$$(JULIA_EXECUTABLE)) --startup-file=no -e 'Base.compilecache(Base.identify_package("$1"))')
	touch $$@
cache-release-$1: $$(BUILDDIR)/stdlib/$1.release.image
$$(BUILDDIR)/stdlib/$1.debug.image: $$($1_SRCS) $$(addsuffix .debug.image,$$(addprefix $$(BUILDDIR)/stdlib/,$2)) $(build_private_libdir)/sys-debug.$(SHLIB_EXT)
	@$$(call PRINT_JULIA, $$(call spawn,$$(JULIA_EXECUTABLE)) --startup-file=no --check-bounds=yes -e 'Base.compilecache(Base.identify_package("$1"))')
	@$$(call PRINT_JULIA, $$(call spawn,$$(JULIA_EXECUTABLE)) --startup-file=no -e 'Base.compilecache(Base.identify_package("$1"))')
cache-debug-$1: $$(BUILDDIR)/stdlib/$1.debug.image
.SECONDARY: $$(BUILDDIR)/stdlib/$1.release.image $$(BUILDDIR)/stdlib/$1.debug.image
endef

# Used only to define packages in the dependency graph;
# these packages reside in the system image
define sysimg_builder
$$(BUILDDIR)/stdlib/$1.release.image:
	touch $$@
cache-release-$1: $$(BUILDDIR)/stdlib/$1.release.image
$$(BUILDDIR)/stdlib/$1.debug.image:
	touch $$@
cache-debug-$1: $$(BUILDDIR)/stdlib/$1.debug.image
.SECONDARY: $$(BUILDDIR)/stdlib/$1.release.image $$(BUILDDIR)/stdlib/$1.debug.image
endef

# no dependencies
$(eval $(call pkgimg_builder,MozillaCACerts_jll,))
$(eval $(call sysimg_builder,ArgTools,))
$(eval $(call sysimg_builder,Artifacts,))
$(eval $(call sysimg_builder,Base64,))
$(eval $(call sysimg_builder,CRC32c,))
$(eval $(call sysimg_builder,FileWatching,))
$(eval $(call sysimg_builder,Libdl,))
$(eval $(call sysimg_builder,Logging,))
$(eval $(call sysimg_builder,Mmap,))
$(eval $(call sysimg_builder,NetworkOptions,))
$(eval $(call sysimg_builder,SHA,))
$(eval $(call sysimg_builder,Serialization,))
$(eval $(call sysimg_builder,Sockets,))
$(eval $(call sysimg_builder,Unicode,))
$(eval $(call pkgimg_builder,Profile,))

# 1-depth packages
$(eval $(call pkgimg_builder,GMP_jll,Artifacts Libdl))
$(eval $(call pkgimg_builder,LLVMLibUnwind_jll,Artifacts Libdl))
$(eval $(call pkgimg_builder,LibUV_jll,Artifacts Libdl))
$(eval $(call pkgimg_builder,LibUnwind_jll,Artifacts Libdl))
$(eval $(call pkgimg_builder,MbedTLS_jll,Artifacts Libdl))
$(eval $(call pkgimg_builder,nghttp2_jll,Artifacts Libdl))
$(eval $(call pkgimg_builder,OpenLibm_jll,Artifacts Libdl))
$(eval $(call pkgimg_builder,PCRE2_jll,Artifacts Libdl))
$(eval $(call pkgimg_builder,Zlib_jll,Artifacts Libdl))
$(eval $(call pkgimg_builder,dSFMT_jll,Artifacts Libdl))
$(eval $(call pkgimg_builder,libLLVM_jll,Artifacts Libdl))
$(eval $(call sysimg_builder,libblastrampoline_jll,Artifacts Libdl))
$(eval $(call sysimg_builder,OpenBLAS_jll,Artifacts Libdl))
$(eval $(call sysimg_builder,Markdown,Base64))
$(eval $(call sysimg_builder,Printf,Unicode))
$(eval $(call sysimg_builder,Random,SHA))
$(eval $(call sysimg_builder,Tar,ArgTools SHA))
$(eval $(call pkgimg_builder,DelimitedFiles,Mmap))

# 2-depth packages
$(eval $(call pkgimg_builder,LLD_jll,Zlib_jll libLLVM_jll Artifacts Libdl))
$(eval $(call pkgimg_builder,LibSSH2_jll,Artifacts Libdl MbedTLS_jll))
$(eval $(call pkgimg_builder,MPFR_jll,Artifacts Libdl GMP_jll))
$(eval $(call sysimg_builder,LinearAlgebra,Libdl libblastrampoline_jll OpenBLAS_jll))
$(eval $(call sysimg_builder,Dates,Printf))
$(eval $(call pkgimg_builder,Distributed,Random Serialization Sockets))
$(eval $(call sysimg_builder,Future,Random))
$(eval $(call sysimg_builder,InteractiveUtils,Markdown))
$(eval $(call sysimg_builder,LibGit2,NetworkOptions Printf SHA Base64))
$(eval $(call sysimg_builder,UUIDs,Random SHA))

# 3-depth packages
# LibGit2_jll
$(eval $(call pkgimg_builder,LibCURL_jll,LibSSH2_jll nghttp2_jll MbedTLS_jll Zlib_jll Artifacts Libdl))
$(eval $(call sysimg_builder,REPL,InteractiveUtils Markdown Sockets Unicode))
$(eval $(call pkgimg_builder,SharedArrays,Distributed Mmap Random Serialization))
$(eval $(call sysimg_builder,TOML,Dates))
$(eval $(call pkgimg_builder,Test,Logging Random Serialization InteractiveUtils))

# 4-depth packages
$(eval $(call sysimg_builder,LibCURL,LibCURL_jll MozillaCACerts_jll))

# 5-depth packages
$(eval $(call sysimg_builder,Downloads,ArgTools FileWatching LibCURL NetworkOptions))

# 6-depth packages
$(eval $(call sysimg_builder,Pkg,Dates LibGit2 Libdl Logging Printf Random SHA UUIDs)) # Markdown REPL

# 7-depth packages
$(eval $(call pkgimg_builder,LazyArtifacts,Artifacts Pkg))

$(eval $(call pkgimg_builder,SparseArrays,Libdl LinearAlgebra Random Serialization))
# SuiteSparse_jll
# Statistics