https://github.com/apache/spark
[SPARK-22076][SQL] Expand.projections should not be a Stream
Revision 5d10586a0065c6845e0e89afc5f22e09baa185b7 authored by Wenchen Fan on 20 September 2017, 16:00:43 UTC, committed by gatorsmile on 20 September 2017, 16:01:25 UTC
## What changes were proposed in this pull request?

Spark built with Scala 2.10 fails on grouping-analytics queries (GROUP BY CUBE/ROLLUP):
```scala
spark.range(1).select($"id" as "a", $"id" as "b").write.partitionBy("a").mode("overwrite").saveAsTable("rollup_bug")
spark.sql("select 1 from rollup_bug group by rollup ()").show
```

The regression can be traced back to https://github.com/apache/spark/pull/15484, which made `Expand.projections` a lazy `Stream` for GROUP BY CUBE.

In Scala 2.10, a `Stream`'s unevaluated tail is a closure that captures its enclosing scope; in this case it captures the entire query plan, which contains non-serializable parts.
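
To see the mechanism, here is a minimal, self-contained sketch (not code from this patch; `FakePlan` and the other names are hypothetical) of how an unevaluated `Stream` tail can drag a non-serializable enclosing object into the serialized graph, which is the failure mode Scala 2.10 exhibits here:

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for a query plan; deliberately NOT Serializable.
class FakePlan {
  val ids: Seq[Int] = Seq(1, 2, 3)

  // Each unevaluated Stream tail is a thunk that needs `this.ids`, so it
  // closes over the whole (non-serializable) plan object.
  def projections: Stream[Int] = {
    def loop(i: Int): Stream[Int] =
      if (i >= ids.length) Stream.empty
      else ids(i) #:: loop(i + 1)
    loop(0)
  }
}

object StreamCaptureDemo {
  private def trySerialize(label: String, obj: AnyRef): Unit =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream).writeObject(obj)
      println(s"$label: serialized OK")
    } catch {
      case e: NotSerializableException => println(s"$label: failed ($e)")
    }

  def main(args: Array[String]): Unit = {
    val plan = new FakePlan
    // The lazy tail drags `plan` into the serialized object graph.
    trySerialize("lazy Stream", plan.projections)
    // Forcing the Stream into a strict collection drops those closures.
    trySerialize("materialized", plan.projections.toVector)
  }
}
```

Materializing the `Stream` into a strict collection, as in the second call, discards the tail thunks and with them the reference back to the plan.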

This change also benefits the master branch, since it reduces the serialized size of `Expand.projections`.
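
As a rough illustration of that size difference (again a hypothetical sketch, not the patch itself), one can compare the Java-serialized size of the same data held in a fully forced `Stream` versus a strict `IndexedSeq`; the per-element cons cells of the `Stream` add per-element overhead:

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

object SerializedSizeDemo {
  private def serializedSize(obj: AnyRef): Int = {
    val bytes = new ByteArrayOutputStream
    val out = new ObjectOutputStream(bytes)
    out.writeObject(obj)
    out.flush() // make sure buffered bytes reach the underlying stream
    bytes.size
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical stand-ins for per-grouping-set projection lists.
    val projections = (1 to 1000).map(i => Seq(i, i * 2))
    // A fully forced Stream serializes one cons cell per element; an
    // IndexedSeq serializes as a more compact structure.
    println(s"Stream:     ${serializedSize(projections.toStream.force)} bytes")
    println(s"IndexedSeq: ${serializedSize(projections.toIndexedSeq)} bytes")
  }
}
```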

## How was this patch tested?

Manually verified with a Spark build that uses Scala 2.10.

Author: Wenchen Fan <wenchen@databricks.com>

Closes #19289 from cloud-fan/bug.

(cherry picked from commit ce6a71e013c403d0a3690cf823934530ce0ea5ef)
Signed-off-by: gatorsmile <gatorsmile@gmail.com>