https://github.com/apache/spark
Revision 5e3837380ddcac5c4c0b489ee81e3415ce8ca633 authored by Henry Robinson on 05 November 2017, 05:47:25 UTC, committed by gatorsmile on 05 November 2017, 05:54:51 UTC
It's not safe in all cases to push down a LIMIT below a FULL OUTER
JOIN. If the limit is pushed to one side of the FOJ, the physical
join operator can not tell if a row in the non-limited side would have a
match in the other side.

*If* the join operator guarantees that unmatched tuples from the limited
side are emitted before any unmatched tuples from the other side,
pushing down the limit is safe. But this is impractical for some join
implementations, e.g. SortMergeJoin.

For now, disable limit pushdown through a FULL OUTER JOIN, and we can
evaluate whether a more complicated solution is necessary in the future.

Ran org.apache.spark.sql.* tests. Altered full outer join tests in
LimitPushdownSuite.

Author: Henry Robinson <henry@cloudera.com>

Closes #19647 from henryr/spark-22211.

(cherry picked from commit 6c6626614e59b2e8d66ca853a74638d3d6267d73)
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
1 parent 4074ed2
History
Tip revision: 5e3837380ddcac5c4c0b489ee81e3415ce8ca633 authored by Henry Robinson on 05 November 2017, 05:47:25 UTC
[SPARK-22211][SQL] Remove incorrect FOJ limit pushdown
Tip revision: 5e38373

README.md

back to top