https://github.com/apache/spark
Revision e3f6b6d1e15378860b5e30fb4c40168215b16eea authored by Brennan Stein on 29 August 2022, 01:55:30 UTC, committed by Hyukjin Kwon on 29 August 2022, 01:57:31 UTC
The `castPartValueToDesiredType` function now returns a byte for ByteType and a short for ShortType rather than an int, and a float for FloatType rather than a double.
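
To illustrate the idea only, here is a minimal, standalone Java sketch of parsing a raw partition string into a boxed value whose runtime class matches the declared column type. The names `PartValueSketch`, `PartType`, and `castToDesiredType` are hypothetical and are not Spark's API; the actual change lives in Spark's Scala sources and may be written differently.

```java
// Hypothetical sketch, not Spark code: every name here is invented for illustration.
final class PartValueSketch {

    // Stand-in for the column types discussed above (assumption, not Spark's DataType).
    enum PartType { BYTE, SHORT, INT, FLOAT, DOUBLE }

    // Parse the raw partition string so that the boxed result has the
    // runtime class that matches the declared column type.
    static Object castToDesiredType(String raw, PartType type) {
        switch (type) {
            case BYTE:
                return Byte.valueOf(raw);     // java.lang.Byte, not Integer
            case SHORT:
                return Short.valueOf(raw);    // java.lang.Short, not Integer
            case INT:
                return Integer.valueOf(raw);
            case FLOAT:
                return Float.valueOf(raw);    // java.lang.Float, not Double
            case DOUBLE:
                return Double.valueOf(raw);
            default:
                throw new IllegalArgumentException("Unsupported type: " + type);
        }
    }

    public static void main(String[] args) {
        // Print the runtime class of each parsed value.
        System.out.println(castToDesiredType("7", PartType.BYTE).getClass().getName());
        System.out.println(castToDesiredType("7", PartType.SHORT).getClass().getName());
        System.out.println(castToDesiredType("1.5", PartType.FLOAT).getClass().getName());
    }
}
```

The point of the sketch is that the narrow types get their own branches instead of falling through to the wider int or double representation.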

Previously, reading back a file partitioned on one of these column types threw a ClassCastException at runtime (for Byte, `java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Byte`). I can't see this as anything but a bug, since returning the correct data type prevents the crash.
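
The failure mode can be shown on the JVM without Spark. The following is a minimal sketch, assuming the partition value reaches the reader as a boxed `Object` that is later cast to the column's boxed type; the class `BoxedCastDemo` and its method are hypothetical names for illustration, not part of Spark.

```java
// Hypothetical standalone demo, not Spark code.
final class BoxedCastDemo {

    // Returns true when casting the boxed value to java.lang.Byte fails.
    static boolean castToByteFails(Object boxed) {
        try {
            Byte ignored = (Byte) boxed; // reference cast: checks the runtime class
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object asInt = Integer.valueOf(1);     // value boxed as an Integer
        Object asByte = Byte.valueOf((byte) 1); // value boxed as a Byte

        System.out.println(castToByteFails(asInt));  // prints "true"
        System.out.println(castToByteFails(asByte)); // prints "false"
    }
}
```

A boxed `Integer` is never an instance of `Byte`, even when its value fits in a byte, so the cast can only succeed if the value was boxed with the right class in the first place.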

This is a user-facing change: it changes the observed behavior when reading a file partitioned on a byte, short, or float column.

Added a unit test. Without the `castPartValueToDesiredType` updates, the test fails with the stated exception.

===
I'll note that I'm not familiar enough with the Spark repo to know whether this will have ripple effects elsewhere. Tests pass on my fork, though, and since the very similar https://github.com/apache/spark/pull/36344/files only needed to touch these two files, I expect this change is self-contained as well.

Closes #37659 from BrennanStein/spark40212.

Authored-by: Brennan Stein <brennan.stein@ekata.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit 146f187342140635b83bfe775b6c327755edfbe1)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
1 parent 694e4e7
[SPARK-40212][SQL] SparkSQL castPartValue does not properly handle byte, short, or float