https://github.com/apache/spark
Revision fd998c8a6783c0c8aceed8dcde4017cd479e42c8 authored by Bruce Robbins on 04 May 2022, 09:22:11 UTC, committed by Gengliang Wang on 04 May 2022, 09:22:24 UTC
### What changes were proposed in this pull request?

In `DivideYMInterval#doGenCode` and `DivideDTInterval#doGenCode`, rely on the operand variable names provided by `nullSafeCodeGen` rather than calling `genCode` on the operands twice.

### Why are the changes needed?

`DivideYMInterval#doGenCode` and `DivideDTInterval#doGenCode` call `genCode` on the operands twice (once directly, and once indirectly via `nullSafeCodeGen`). However, if you call `genCode` on an operand twice, you might not get back the same variable name for both calls (e.g., when the operand is not a `BoundReference` or if whole-stage codegen is turned off). When that happens, `nullSafeCodeGen` generates initialization code for one set of variables, but the divide expression generates usage code for another set of variables, resulting in compilation errors like this:

```
spark-sql> create or replace temp view v1 as
> select * FROM VALUES
> (interval '10' months, interval '10' day, 2)
> as v1(period, duration, num);
Time taken: 2.81 seconds
spark-sql> cache table v1;
Time taken: 2.184 seconds
spark-sql> select period/(num + 3) from v1;
22/05/03 08:56:37 ERROR CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 40, Column 44: Expression "project_value_2" is not an rvalue
...
22/05/03 08:56:37 WARN UnsafeProjection: Expr codegen error and falling back to interpreter mode
...
0-2
Time taken: 0.149 seconds, Fetched 1 row(s)
spark-sql> select duration/(num + 3) from v1;
22/05/03 08:57:29 ERROR CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 40, Column 54: Expression "project_value_2" is not an rvalue
...
22/05/03 08:57:29 WARN UnsafeProjection: Expr codegen error and falling back to interpreter mode
...
2 00:00:00.000000000
Time taken: 0.089 seconds, Fetched 1 row(s)
```

The error is not fatal (unless you have `spark.sql.codegen.fallback` set to `false`), but it muddies the log and can slow the query (since the expression is interpreted).

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

New unit tests (unit tests run with `spark.sql.codegen.fallback` set to `false`, so the new tests fail without the fix).

Closes #36442 from bersprockets/interval_div_issue.

Authored-by: Bruce Robbins <bersprockets@gmail.com>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
(cherry picked from commit ca87bead23ca32a05c6a404a91cea47178f63e70)
Signed-off-by: Gengliang Wang <gengliang@apache.org>
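
The failure mode described in the commit message can be illustrated with a toy model. The sketch below is not Spark code: `CodegenContext`, `Add`, `null_safe_codegen`, `divide_buggy`, and `divide_fixed` are hypothetical names invented for illustration, and the emitted strings only loosely resemble generated Java. It assumes, as the message states, that each `genCode`-style call on a non-leaf operand may hand back a fresh variable name.

```python
import itertools


class CodegenContext:
    """Toy stand-in for a codegen context that hands out fresh variable names."""

    def __init__(self):
        self._counter = itertools.count()

    def fresh_name(self, prefix):
        return f"{prefix}_{next(self._counter)}"


class Add:
    """Toy non-leaf operand: every gen_code call declares a NEW variable."""

    def __init__(self, left, right):
        self.left, self.right = left, right

    def gen_code(self, ctx):
        name = ctx.fresh_name("value")
        declaration = f"int {name} = {self.left} + {self.right};"
        return declaration, name


def null_safe_codegen(ctx, operand, body):
    """Evaluate the operand once and pass its variable name to `body`."""
    declaration, name = operand.gen_code(ctx)
    return declaration + "\n" + body(name)


def divide_buggy(ctx, operand):
    # First gen_code call: its declaration is discarded, but its name is kept.
    _, stale_name = operand.gen_code(ctx)
    # Second (indirect) call inside null_safe_codegen declares a different name.
    return null_safe_codegen(
        ctx, operand, lambda _name: f"result = interval / {stale_name};"
    )


def divide_fixed(ctx, operand):
    # Rely only on the name supplied by null_safe_codegen.
    return null_safe_codegen(
        ctx, operand, lambda name: f"result = interval / {name};"
    )


buggy = divide_buggy(CodegenContext(), Add("num", 3))
fixed = divide_fixed(CodegenContext(), Add("num", 3))
print(buggy)
print(fixed)
```

In the buggy variant the emitted code declares one variable and divides by another that was never declared, which is the same shape of mismatch that the real compiler reports as `is not an rvalue`. In the fixed variant the declared and used names agree.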
1 parent d3aadb4
Tip revision: fd998c8a6783c0c8aceed8dcde4017cd479e42c8 authored by Bruce Robbins on 04 May 2022, 09:22:11 UTC
[SPARK-39093][SQL] Avoid codegen compilation error when dividing year-month intervals or day-time intervals by an integral
File | Mode | Size |
---|---|---|
.github | ||
.idea | ||
R | ||
assembly | ||
bin | ||
binder | ||
build | ||
common | ||
conf | ||
core | ||
data | ||
dev | ||
docs | ||
examples | ||
external | ||
graphx | ||
hadoop-cloud | ||
launcher | ||
licenses | ||
licenses-binary | ||
mllib | ||
mllib-local | ||
project | ||
python | ||
repl | ||
resource-managers | ||
sbin | ||
sql | ||
streaming | ||
tools | ||
.asf.yaml | -rw-r--r-- | 1.1 KB |
.gitattributes | -rw-r--r-- | 130 bytes |
.gitignore | -rw-r--r-- | 1.8 KB |
CONTRIBUTING.md | -rw-r--r-- | 997 bytes |
LICENSE | -rw-r--r-- | 13.1 KB |
LICENSE-binary | -rw-r--r-- | 22.4 KB |
NOTICE | -rw-r--r-- | 2.0 KB |
NOTICE-binary | -rw-r--r-- | 56.5 KB |
README.md | -rw-r--r-- | 4.4 KB |
appveyor.yml | -rw-r--r-- | 2.7 KB |
pom.xml | -rw-r--r-- | 137.1 KB |
scalastyle-config.xml | -rw-r--r-- | 22.0 KB |