https://github.com/apache/spark
Revision b0b226e79c00d8dc1bee2c9b6818000ab8806f80 authored by Hyukjin Kwon on 24 March 2022, 03:12:38 UTC, committed by Hyukjin Kwon on 24 March 2022, 03:13:11 UTC
### What changes were proposed in this pull request?

This PR proposes using `FileUtil.unTarUsingJava`, a Java implementation for un-tarring `.tar` files, instead of Hadoop's shell-based `unTar`. `unTarUsingJava` is not public, but it exists in all Hadoop versions from 2.1 onward; see HADOOP-9264.
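
Calling a non-public method from outside its class generally means going through reflection. The following is a minimal sketch of that pattern, not the code in this patch: `Archiver` and its `unpack` method are hypothetical stand-ins (not Hadoop's `FileUtil`), and the parameter list is assumed purely for illustration.

```java
import java.lang.reflect.Method;

public class PrivateStaticCall {
    // Hypothetical stand-in for a library class that keeps its helper private.
    public static class Archiver {
        private static String unpack(String archive, String dest, boolean gzipped) {
            return "unpacked " + archive + " into " + dest + " (gzipped=" + gzipped + ")";
        }
    }

    // Looks up a declared (possibly private) static method and invokes it.
    public static Object callPrivateStatic(
            Class<?> owner, String name, Class<?>[] types, Object... args) {
        try {
            Method m = owner.getDeclaredMethod(name, types);
            m.setAccessible(true); // bypass the access check for this Method object
            return m.invoke(null, args); // null receiver because the method is static
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not call " + name, e);
        }
    }

    public static void main(String[] args) {
        Object result = callPrivateStatic(
            Archiver.class, "unpack",
            new Class<?>[] {String.class, String.class, boolean.class},
            "a.tar", "/tmp/out", false);
        System.out.println(result);
    }
}
```

The trade-off of this approach is that the lookup is by name at runtime, so a renamed or removed method surfaces as an exception rather than a compile error.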

Reproducing the security issue requires a non-Windows platform and a file name for a non-gzipped TAR archive (the archive contents don't matter).

### Why are the changes needed?

There is a risk of arbitrary shell command injection via `Utils.unpack` when the filename is controlled by a malicious user. This is due to an issue in Hadoop's `unTar`, which does not properly escape the filename before passing it to a shell command: https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileUtil.java#L904
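
To illustrate the class of bug, here is a minimal sketch of how an unescaped file name changes the meaning of a shell command string. `unsafeCommand` is a hypothetical simplification written for this example, not Hadoop's actual code, and the command is only built and printed, never executed.

```java
public class UntarCommandDemo {
    // Hypothetical shell-based untar: the file name is pasted into the
    // command string as-is, with no quoting or escaping.
    public static String unsafeCommand(String tarFile, String destDir) {
        return "cd '" + destDir + "' && tar -xf " + tarFile;
    }

    public static void main(String[] args) {
        // An attacker-chosen file name containing a shell metacharacter.
        String name = "archive.tar; touch /tmp/injected";
        // A shell would treat everything after ';' as a second command.
        System.out.println(unsafeCommand(name, "/tmp/out"));
    }
}
```

An implementation that never builds a shell command string avoids this problem entirely, since the file name is only ever treated as data.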

### Does this PR introduce _any_ user-facing change?

Yes, it fixes a security issue that previously allowed users to execute arbitrary shell commands.

### How was this patch tested?

Manually tested locally; existing test cases should cover this change.

Closes #35946 from HyukjinKwon/SPARK-38631.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit 057c051285ec32c665fb458d0670c1c16ba536b2)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
1 parent 43c6b91
[SPARK-38631][CORE] Uses Java-based implementation for un-tarring at Utils.unpack