https://github.com/apache/spark
Revision 30d5c9fd8ae1944a94ddedae83433368a02e55e6 authored by Dongjoon Hyun on 13 October 2017, 15:09:12 UTC, committed by Wenchen Fan on 13 October 2017, 15:11:50 UTC
Before Hive 2.0, the ORC file schema has invalid column names like `_col1` and `_col2`. This is a well-known limitation, and there are several related Apache Spark issues when `spark.sql.hive.convertMetastoreOrc=true`. This PR ignores the ORC file schema and uses the Spark schema instead. Tested by passing the newly added test case.
Author: Dongjoon Hyun <dongjoon@apache.org>
Closes #19470 from dongjoon-hyun/SPARK-18355.
(cherry picked from commit e6e36004afc3f9fc8abea98542248e9de11b4435)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
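The idea described in the commit message, trusting the table schema over the names stored in the file, can be illustrated with a minimal sketch. This is not Spark's code: the function `apply_table_schema` and the sample column names are hypothetical, and the sketch assumes the file columns and table columns line up by position.

```python
from typing import Any, Dict, Sequence


def apply_table_schema(
    file_columns: Sequence[str],
    table_columns: Sequence[str],
    row: Dict[str, Any],
) -> Dict[str, Any]:
    """Relabel a row keyed by file column names with the table schema's names.

    Hypothetical helper: columns are matched purely by position, so the
    placeholder names stored in the file (e.g. `_col0`) are never used
    for lookup by name.
    """
    if len(file_columns) != len(table_columns):
        raise ValueError("file and table schemas differ in column count")
    return {
        table_name: row[file_name]
        for file_name, table_name in zip(file_columns, table_columns)
    }


# Illustrative data only: a row as it might look under placeholder names.
file_columns = ["_col0", "_col1"]
table_columns = ["id", "name"]
row = {"_col0": 1, "_col1": "spark"}

print(apply_table_schema(file_columns, table_columns, row))
# {'id': 1, 'name': 'spark'}
```

Matching by position only works when both schemas list columns in the same order, which is why the sketch rejects schemas of different lengths rather than guessing.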
1 parent c9187db
[SPARK-14387][SPARK-16628][SPARK-18355][SQL] Use Spark schema to read ORC table instead of ORC file schema
File | Mode | Size |
---|---|---|
.github | ||
R | ||
assembly | ||
bin | ||
build | ||
common | ||
conf | ||
core | ||
data | ||
dev | ||
docs | ||
examples | ||
external | ||
graphx | ||
launcher | ||
licenses | ||
mllib | ||
mllib-local | ||
project | ||
python | ||
repl | ||
resource-managers | ||
sbin | ||
sql | ||
streaming | ||
tools | ||
.gitattributes | -rw-r--r-- | 40 bytes |
.gitignore | -rw-r--r-- | 1.2 KB |
.travis.yml | -rw-r--r-- | 1.7 KB |
CONTRIBUTING.md | -rw-r--r-- | 995 bytes |
LICENSE | -rw-r--r-- | 17.5 KB |
NOTICE | -rw-r--r-- | 24.1 KB |
README.md | -rw-r--r-- | 3.7 KB |
appveyor.yml | -rw-r--r-- | 1.9 KB |
pom.xml | -rw-r--r-- | 94.8 KB |
scalastyle-config.xml | -rw-r--r-- | 17.4 KB |