Revision - 18fc8e8 - [SPARK-39915][SQL][3.3] Dataset.repartition(N) may not create N [...] - origin: https://github.com/apache/spark


Revision 18fc8e8e023868f6e7fab3422c5ce57e690d7834 authored by ulysses-you on 09 September 2022, 21:43:19 UTC, committed by Dongjoon Hyun on 09 September 2022, 21:43:19 UTC

[SPARK-39915][SQL][3.3] Dataset.repartition(N) may not create N partitions Non-AQE part

### What changes were proposed in this pull request?

Backport of https://github.com/apache/spark/pull/37706 to branch-3.3.

Skip optimizing the root user-specified repartition in `PropagateEmptyRelation`.

### Why are the changes needed?

Spark should preserve the final repartition, because it is user-specified and determines the final output partitioning.

For example:

```scala
spark.sql("select * from values(1) where 1 < rand()").repartition(1)

// before:
// == Optimized Logical Plan ==
// LocalTableScan <empty>, [col1#0]

// after:
// == Optimized Logical Plan ==
// Repartition 1, true
// +- LocalRelation <empty>, [col1#0]
```
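
The before/after difference can be illustrated with a toy model of empty-relation propagation. This is only a sketch: `Plan`, `LocalRelation`, `Filter`, `Repartition`, and the `EmptyPropagation` object below are hypothetical stand-ins written for this example, not Spark's Catalyst classes, and the real `PropagateEmptyRelation` rule handles many more operators. It assumes that a user-specified repartition can be recognized by a boolean flag on the node.

```scala
// Toy model only: hypothetical stand-ins, not Spark's Catalyst classes.
sealed trait Plan
case class LocalRelation(rows: Seq[Int]) extends Plan
case class Filter(description: String, child: Plan) extends Plan
// userSpecified marks a repartition requested explicitly by the user.
case class Repartition(numPartitions: Int, userSpecified: Boolean, child: Plan) extends Plan

object EmptyPropagation {
  // A plan is "known empty" here only if it is a local relation with no rows.
  private def isEmpty(plan: Plan): Boolean = plan match {
    case LocalRelation(rows) => rows.isEmpty
    case _ => false
  }

  // "Before" behaviour: every operator over an empty input collapses,
  // including a repartition at the root of the plan.
  def propagateEverywhere(plan: Plan): Plan = plan match {
    case Filter(d, child) =>
      val c = propagateEverywhere(child)
      if (isEmpty(c)) c else Filter(d, c)
    case Repartition(n, u, child) =>
      val c = propagateEverywhere(child)
      if (isEmpty(c)) c else Repartition(n, u, c)
    case leaf: LocalRelation => leaf
  }

  // "After" behaviour: a user-specified repartition at the root is kept,
  // while everything beneath it is still simplified.
  def propagateKeepingRoot(plan: Plan): Plan = plan match {
    case Repartition(n, true, child) => Repartition(n, true, propagateEverywhere(child))
    case other => propagateEverywhere(other)
  }
}
```

For the plan `Repartition(1, userSpecified = true, Filter("col1 > 0", LocalRelation(Seq.empty)))`, `propagateEverywhere` returns the bare empty `LocalRelation`, whereas `propagateKeepingRoot` returns `Repartition(1, true, LocalRelation(Seq.empty))`, mirroring the shape of the two optimized plans shown above.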

### Does this PR introduce _any_ user-facing change?

Yes. The optimized plan for an empty relation may change, because a root user-specified repartition is now preserved.

### How was this patch tested?

Added a test.

Closes #37730 from ulysses-you/empty-3.3.

Authored-by: ulysses-you <ulyssesyou18@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

1 parent 4f69c98

File	Mode	Size
.github
.idea
R
assembly
bin
binder
build
common
conf
core
data
dev
docs
examples
external
graphx
hadoop-cloud
launcher
licenses
licenses-binary
mllib
mllib-local
project
python
repl
resource-managers
sbin
sql
streaming
tools
.asf.yaml	-rw-r--r--	1.1 KB
.gitattributes	-rw-r--r--	130 bytes
.gitignore	-rw-r--r--	2.0 KB
CONTRIBUTING.md	-rw-r--r--	997 bytes
LICENSE	-rw-r--r--	13.1 KB
LICENSE-binary	-rw-r--r--	22.4 KB
NOTICE	-rw-r--r--	2.0 KB
NOTICE-binary	-rw-r--r--	56.5 KB
README.md	-rw-r--r--	4.4 KB
appveyor.yml	-rw-r--r--	2.7 KB
pom.xml	-rw-r--r--	137.4 KB
scalastyle-config.xml	-rw-r--r--	22.0 KB
