Revision f01eafd79f3bd8a02cdce1422a4b2846b93bdc52 authored by Wenchen Fan on 05 August 2024, 06:57:25 UTC, committed by Wenchen Fan on 05 August 2024, 06:57:25 UTC
### What changes were proposed in this pull request?

We missed the fact that submitting a shuffle or broadcast query stage can be heavy, as it needs to submit subqueries and wait for the results. This blocks the AQE loop and hurts the parallelism of AQE.

This PR fixes the problem by using shuffle/broadcast's own thread pool to wait for subqueries and other preparations.
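The idea can be illustrated outside Spark with a minimal sketch. It is not the actual Spark code: `prepare`, `run_job`, `materialize`, and `submit_all` are hypothetical names, and a `time.sleep` stands in for waiting on subquery results. The point it shows is that the submitting loop only schedules work, while the blocking preparation runs on the pool's threads.

```python
import time
from concurrent.futures import ThreadPoolExecutor

PREPARE_SECONDS = 0.2  # assumed cost of waiting for subqueries


def prepare(stage):
    # Hypothetical stand-in for submitting subqueries and waiting for results.
    time.sleep(PREPARE_SECONDS)
    return stage


def run_job(stage):
    # Hypothetical stand-in for the shuffle or broadcast job itself.
    return f"{stage} done"


def materialize(stage):
    # Preparation and the job both run on a pool thread, not on the caller.
    return run_job(prepare(stage))


def submit_all(stages, pool):
    # The loop only schedules work; it never waits on prepare().
    return [pool.submit(materialize, stage) for stage in stages]


with ThreadPoolExecutor(max_workers=4) as pool:
    start = time.monotonic()
    futures = submit_all(["shuffle-1", "shuffle-2", "broadcast-1"], pool)
    submit_seconds = time.monotonic() - start
    results = [f.result() for f in futures]
```

If `prepare` were called inside `submit_all` instead, each stage would delay the submission of the next one, which is the loss of parallelism described above.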

This PR also re-implements https://github.com/apache/spark/pull/45234 to avoid submitting the shuffle job if the query has failed and all query stages need to be cancelled.
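A rough sketch of the cancellation behaviour, again not the Spark implementation: the `Stage` class, its `cancelled` flag, and the `gate` event are assumptions made for illustration. The stage checks the flag after its preparation finishes and skips the job if the query was cancelled in the meantime.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Stage:
    """Hypothetical query stage with a cancellation flag."""

    def __init__(self, name):
        self.name = name
        self.cancelled = threading.Event()
        self.job_submitted = False

    def cancel(self):
        self.cancelled.set()

    def materialize(self, preparation_gate):
        # Stand-in for waiting on subqueries and other preparations.
        preparation_gate.wait()
        # Check the flag after preparation, before submitting the job.
        if self.cancelled.is_set():
            return None
        self.job_submitted = True
        return f"{self.name} done"


gate = threading.Event()
stage = Stage("shuffle-1")
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(stage.materialize, gate)
    stage.cancel()  # the query fails while the stage is still preparing
    gate.set()      # preparation finishes afterwards
    result = future.result()
```

Here the job is never submitted, because the cancellation arrives before preparation completes.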

### Why are the changes needed?

better parallelism for AQE

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

new test case

### Was this patch authored or co-authored using generative AI tooling?

no

Closes #47533 from cloud-fan/aqe.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
1 parent 94f8872
| File | Mode | Size |
|---|---|---|
| mvn | -rwxr-xr-x | 6.2 KB |
| sbt | -rwxr-xr-x | 5.2 KB |
| sbt-launch-lib.bash | -rwxr-xr-x | 5.2 KB |
| spark-build-info | -rwxr-xr-x | 1.6 KB |
| spark-build-info.ps1 | -rw-r--r-- | 1.7 KB |
| util.sh | -rwxr-xr-x | 1.2 KB |