https://github.com/apache/spark
Revision 9ed64048a740fbcd15d2b830b1edbb728f87c423 authored by Sergei Lebedev on 25 October 2017, 21:15:44 UTC, committed by Wenchen Fan on 25 October 2017, 21:17:40 UTC
Prior to this commit, getAllBlocks implicitly assumed that the directories managed by the DiskBlockManager contain only files corresponding to valid block IDs. In reality, this assumption was violated during shuffle, which produces temporary files in the same directory as the resulting blocks. As a result, calls to getAllBlocks during shuffle were unreliable.

The fix could be made more efficient, but this is probably good enough.

Tested with `DiskBlockManagerSuite`.

Author: Sergei Lebedev <s.lebedev@criteo.com>

Closes #19458 from superbobry/block-id-option.

(cherry picked from commit b377ef133cdc38d49b460b2cc6ece0b5892804cc)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
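The failure mode described in the commit message can be illustrated with a minimal, language-agnostic sketch. It is written in Python for brevity and is not the code from this commit. The names `parse_block_id` and `get_all_blocks`, the simplified name patterns, and the temporary file name are all hypothetical, and Spark's real block ID grammar covers more cases. The idea is that a listing tolerates unrelated files when an unparseable name yields "no block" rather than an error.

```python
import re
from typing import List, Optional

# Hypothetical, simplified patterns for block file names.
# Assumption: Spark's actual BlockId grammar is richer than this.
_BLOCK_PATTERNS = [
    re.compile(r"rdd_\d+_\d+"),
    re.compile(r"shuffle_\d+_\d+_\d+(\.data|\.index)?"),
    re.compile(r"broadcast_\d+"),
]


def parse_block_id(name: str) -> Optional[str]:
    """Return the name if it fully matches a known pattern, else None."""
    if any(p.fullmatch(name) for p in _BLOCK_PATTERNS):
        return name
    return None


def get_all_blocks(file_names: List[str]) -> List[str]:
    """Keep names that parse as block IDs and skip everything else."""
    parsed = (parse_block_id(n) for n in file_names)
    return [b for b in parsed if b is not None]


# A directory listing taken mid-shuffle; the last name is a made-up temp file.
names = ["rdd_1_0", "shuffle_0_1_2.data", "broadcast_7", "shuffle_0_1_2.data.tmp-1234"]
blocks = get_all_blocks(names)
```

In this sketch the temporary file is dropped from the result instead of aborting the whole listing. The cost is one parse attempt per file, which may be the kind of inefficiency the commit message alludes to.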
1 parent 4c1a868
Tip revision: 9ed64048a740fbcd15d2b830b1edbb728f87c423 authored by Sergei Lebedev on 25 October 2017, 21:15:44 UTC
[SPARK-22227][CORE] DiskBlockManager.getAllBlocks now tolerates temp files
File | Mode | Size |
---|---|---|
.github | ||
R | ||
assembly | ||
bin | ||
build | ||
common | ||
conf | ||
core | ||
data | ||
dev | ||
docs | ||
examples | ||
external | ||
graphx | ||
launcher | ||
licenses | ||
mllib | ||
mllib-local | ||
project | ||
python | ||
repl | ||
resource-managers | ||
sbin | ||
sql | ||
streaming | ||
tools | ||
.gitattributes | -rw-r--r-- | 40 bytes |
.gitignore | -rw-r--r-- | 1.2 KB |
.travis.yml | -rw-r--r-- | 1.7 KB |
CONTRIBUTING.md | -rw-r--r-- | 995 bytes |
LICENSE | -rw-r--r-- | 17.5 KB |
NOTICE | -rw-r--r-- | 24.1 KB |
README.md | -rw-r--r-- | 3.7 KB |
appveyor.yml | -rw-r--r-- | 1.9 KB |
pom.xml | -rw-r--r-- | 94.8 KB |
scalastyle-config.xml | -rw-r--r-- | 17.4 KB |