Revision b6749ba09724b3ed19166e7bb0b1fdcca79a44ba authored by Xiao Li on 23 June 2017, 12:44:25 UTC, committed by Wenchen Fan on 23 June 2017, 12:44:25 UTC
### What changes were proposed in this pull request?

The schema of the input query of an INSERT AS SELECT statement can change after optimization. For example, the output schema of the following query is changed by the rules `SimplifyCasts` and `RemoveRedundantAliases`.
```SQL
 SELECT word, length, cast(first as string) as first FROM view1
```

This PR fixes the issue in Spark 2.2. Instead of using the analyzed plan of the input query, this PR uses its executed plan to determine the attributes in `FileFormatWriter`.
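
To see the mismatch this change addresses, one can compare the output attributes of the analyzed, optimized, and executed plans for the example query. The following is only a minimal sketch, not part of the patch: the contents of `view1`, the object name, and the local session settings are assumptions made for illustration, and whether the attributes actually differ depends on the Spark version and the optimizer rules in effect.

```scala
import org.apache.spark.sql.SparkSession

object ComparePlanOutputs {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough to inspect the plans.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("compare-plan-outputs")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for view1; the real view's data is not given in this PR.
    Seq(("a", 1, "x"))
      .toDF("word", "length", "first")
      .createOrReplaceTempView("view1")

    val df = spark.sql(
      "SELECT word, length, cast(first as string) as first FROM view1")
    val qe = df.queryExecution

    // Each plan exposes its output attributes (name plus expression ID).
    // Optimizer rules may drop the redundant cast and alias, so the
    // attributes after optimization need not match the analyzed ones.
    println(s"analyzed:  ${qe.analyzed.output}")
    println(s"optimized: ${qe.optimizedPlan.output}")
    println(s"executed:  ${qe.executedPlan.output}")

    spark.stop()
  }
}
```

If the printed expression IDs differ between the analyzed and executed plans, a writer that binds columns using the analyzed attributes would be looking up attributes that the physical plan no longer produces.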

The related issue in the master branch has been fixed by https://github.com/apache/spark/pull/18064. After this PR is merged, I will submit a separate PR to add the test case to the master branch.

### How was this patch tested?
Added a test case

Author: Xiao Li <gatorsmile@gmail.com>
Author: gatorsmile <gatorsmile@gmail.com>

Closes #18386 from gatorsmile/newRC5.
1 parent b99c0e9
File Mode Size
_data
_includes
_layouts
_plugins
css
img
js
README.md -rw-r--r-- 3.6 KB
_config.yml -rw-r--r-- 615 bytes
api.md -rw-r--r-- 314 bytes
building-spark.md -rw-r--r-- 10.3 KB
cluster-overview.md -rw-r--r-- 6.5 KB
configuration.md -rw-r--r-- 84.6 KB
contributing-to-spark.md -rw-r--r-- 308 bytes
ec2-scripts.md -rw-r--r-- 217 bytes
graphx-programming-guide.md -rw-r--r-- 51.7 KB
hadoop-provided.md -rw-r--r-- 1.0 KB
hardware-provisioning.md -rw-r--r-- 3.8 KB
index.md -rw-r--r-- 7.5 KB
java-programming-guide.md -rw-r--r-- 177 bytes
job-scheduling.md -rw-r--r-- 15.5 KB
ml-advanced.md -rw-r--r-- 6.2 KB
ml-ann.md -rw-r--r-- 264 bytes
ml-classification-regression.md -rw-r--r-- 47.5 KB
ml-clustering.md -rw-r--r-- 7.3 KB
ml-collaborative-filtering.md -rw-r--r-- 8.2 KB
ml-decision-tree.md -rw-r--r-- 210 bytes
ml-ensembles.md -rw-r--r-- 224 bytes
ml-features.md -rw-r--r-- 66.0 KB
ml-frequent-pattern-mining.md -rw-r--r-- 4.1 KB
ml-guide.md -rw-r--r-- 6.3 KB
ml-linear-methods.md -rw-r--r-- 195 bytes
ml-migration-guides.md -rw-r--r-- 20.1 KB
ml-pipeline.md -rw-r--r-- 13.0 KB
ml-statistics.md -rw-r--r-- 3.3 KB
ml-survival-regression.md -rw-r--r-- 225 bytes
ml-tuning.md -rw-r--r-- 7.0 KB
mllib-classification-regression.md -rw-r--r-- 1.7 KB
mllib-clustering.md -rw-r--r-- 23.8 KB
mllib-collaborative-filtering.md -rw-r--r-- 6.1 KB
mllib-data-types.md -rw-r--r-- 30.3 KB
mllib-decision-tree.md -rw-r--r-- 14.5 KB
mllib-dimensionality-reduction.md -rw-r--r-- 5.9 KB
mllib-ensembles.md -rw-r--r-- 15.9 KB
mllib-evaluation-metrics.md -rw-r--r-- 25.3 KB
mllib-feature-extraction.md -rw-r--r-- 15.9 KB
mllib-frequent-pattern-mining.md -rw-r--r-- 7.5 KB
mllib-guide.md -rw-r--r-- 2.8 KB
mllib-isotonic-regression.md -rw-r--r-- 4.6 KB
mllib-linear-methods.md -rw-r--r-- 23.4 KB
mllib-migration-guides.md -rw-r--r-- 319 bytes
mllib-naive-bayes.md -rw-r--r-- 4.2 KB
mllib-optimization.md -rw-r--r-- 12.8 KB
mllib-pmml-model-export.md -rw-r--r-- 2.0 KB
mllib-statistics.md -rw-r--r-- 17.3 KB
monitoring.md -rw-r--r-- 20.4 KB
python-programming-guide.md -rw-r--r-- 179 bytes
quick-start.md -rw-r--r-- 17.1 KB
rdd-programming-guide.md -rw-r--r-- 79.8 KB
running-on-mesos.md -rw-r--r-- 23.8 KB
running-on-yarn.md -rw-r--r-- 31.1 KB
scala-programming-guide.md -rw-r--r-- 144 bytes
security.md -rw-r--r-- 10.4 KB
spark-standalone.md -rw-r--r-- 21.3 KB
sparkr.md -rw-r--r-- 25.6 KB
sql-programming-guide.md -rw-r--r-- 94.2 KB
storage-openstack-swift.md -rw-r--r-- 4.8 KB
streaming-custom-receivers.md -rw-r--r-- 9.8 KB
streaming-flume-integration.md -rw-r--r-- 10.1 KB
streaming-kafka-0-10-integration.md -rw-r--r-- 16.9 KB
streaming-kafka-0-8-integration.md -rw-r--r-- 15.6 KB
streaming-kafka-integration.md -rw-r--r-- 1.5 KB
streaming-kinesis-integration.md -rw-r--r-- 14.2 KB
streaming-programming-guide.md -rw-r--r-- 116.5 KB
structured-streaming-kafka-integration.md -rw-r--r-- 20.2 KB
structured-streaming-programming-guide.md -rw-r--r-- 79.1 KB
submitting-applications.md -rw-r--r-- 11.2 KB
tuning.md -rw-r--r-- 19.9 KB
