https://github.com/apache/spark
Revision 18d141250c8ffcf4f30d2b0edb77b57e3945f3f1 authored by Hui An on 12 December 2022, 10:17:49 UTC, committed by Wenchen Fan on 12 December 2022, 10:20:44 UTC
### What changes were proposed in this pull request?
Make MR job IDs consistent in FileBatchWriter and FileFormatWriter.

### Why are the changes needed?

[SPARK-26873](https://issues.apache.org/jira/browse/SPARK-26873) fixed the job ID consistency issue for FileFormatWriter, but [SPARK-33402](https://issues.apache.org/jira/browse/SPARK-33402) broke this requirement by introducing a random long. We need to address this because correctness depends on task IDs being identical across attempts.

FileBatchWriter does not follow this requirement either, so it needs the same fix.
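
To illustrate the requirement, here is a minimal sketch in plain Python, not Spark's actual code. All function names are hypothetical, and the timestamp format and the `job_<tracker>_<id>` layout are assumptions modeled loosely on Hadoop-style IDs. It shows that an ID derived only from values fixed once per write job is the same wherever it is recomputed, while mixing in a per-call random value makes each computation disagree.

```python
import random
from datetime import datetime


def job_tracker_id(start: datetime) -> str:
    # Hypothetical helper: identifier derived from the job start time,
    # computed once per write job (the format is an assumption).
    return start.strftime("%Y%m%d%H%M%S")


def deterministic_job_id(tracker_id: str, stage_id: int) -> str:
    # Depends only on values shared by every task attempt, so each
    # attempt that recomputes it gets the same string.
    return f"job_{tracker_id}_{stage_id:04d}"


def randomized_job_id(tracker_id: str, rng: random.Random) -> str:
    # Anti-pattern: a per-call random long makes the ID differ
    # between independent computations.
    return f"job_{tracker_id}{rng.getrandbits(63)}_0000"


def task_id(job_id: str, partition: int) -> str:
    # Task ID derived from the job ID plus the partition number.
    return f"task_{job_id[len('job_'):]}_m_{partition:06d}"


start = datetime(2022, 12, 12, 10, 17, 49)
tracker = job_tracker_id(start)

# Two attempts of the same task derive the ID independently.
first_attempt = task_id(deterministic_job_id(tracker, 3), 5)
second_attempt = task_id(deterministic_job_id(tracker, 3), 5)
print(first_attempt == second_attempt)  # True
```

With the randomized variant, two attempts that each draw their own random value would end up with different task IDs, which is the inconsistency this change removes.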

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Closes #38980 from boneanxs/SPARK-41448.

Authored-by: Hui An <hui.an@shopee.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 7801666f3b5ea3bfa0f95571c1d68147ce5240ec)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
1 parent 3beab1f
[SPARK-41448] Make consistent MR job IDs in FileBatchWriter and FileFormatWriter