https://github.com/apache/spark
Revision 161ba18a01ba904808379858041fe4f3c95db8e2 authored by Bryan Cutler on 07 November 2017, 20:32:37 UTC, committed by Wenchen Fan on 07 November 2017, 20:38:54 UTC
Currently, when a pandas.DataFrame that contains a timestamp column of type 'datetime64[ns]' is converted to a Spark DataFrame with `createDataFrame`, the values are interpreted as LongType. This fix checks for a timestamp type and converts it to microseconds, which allows Spark to read the column as TimestampType.

Added a unit test to verify that the Spark schema is as expected for TimestampType and DateType when created from pandas.
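
The unit conversion described above can be illustrated with a small standard-library sketch. It is not the code in this patch: it uses plain integers in place of pandas or Spark objects, and the helper names `ns_to_us` and `us_to_datetime` are hypothetical. It assumes only that a 'datetime64[ns]' value is stored as an integer count of nanoseconds since the Unix epoch and that the target type has microsecond resolution.

```python
from datetime import datetime, timedelta, timezone

# Reference point for epoch-based timestamps (UTC).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ns_to_us(ns: int) -> int:
    """Convert nanoseconds since the epoch to microseconds.

    Floor division drops the sub-microsecond digits, because the
    target representation cannot hold them.
    """
    return ns // 1000


def us_to_datetime(us: int) -> datetime:
    """Build a timezone-aware datetime from microseconds since the epoch."""
    return EPOCH + timedelta(microseconds=us)


# Hypothetical raw value: nanoseconds since the epoch, as an integer.
raw_ns = 1510086757123456789

us = ns_to_us(raw_ns)
ts = us_to_datetime(us)

print(us)              # integer microseconds
print(ts.isoformat())  # datetime with microsecond resolution
```

Left as a bare integer, such a value carries nothing to mark it as a point in time, which is consistent with the LongType result described above. Once converted, it is a datetime object with microsecond resolution.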

Author: Bryan Cutler <cutlerb@gmail.com>

Closes #19646 from BryanCutler/pyspark-non-arrow-createDataFrame-ts-fix-SPARK-22417.

(cherry picked from commit 1d341042d6948e636643183da9bf532268592c6a)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
1 parent 2695b92
[SPARK-22417][PYTHON] Fix for createDataFrame from pandas.DataFrame with timestamp