https://github.com/apache/spark
Revision a3fef2c02f93b48c15feec21515567d6fded19f1 authored by Yin Huai on 02 March 2015, 15:18:07 UTC, committed by Cheng Lian on 02 March 2015, 15:18:31 UTC
[SPARK-6052][SQL]In JSON schema inference, we should always set containsNull of an ArrayType to true
Always set `containsNull = true` when inferring the schema of JSON datasets. If we set `containsNull` based on the records we scanned, we may miss arrays with null values when we sample. Also, because future data can contain arrays with null values, always setting `containsNull = true` is the more robust choice when converting JSON data to Parquet.

JIRA: https://issues.apache.org/jira/browse/SPARK-6052

Author: Yin Huai <yhuai@databricks.com>

Closes #4806 from yhuai/jsonArrayContainsNull and squashes the following commits:

05eab9d [Yin Huai] Change containsNull to true.

(cherry picked from commit 3efd8bb6cf139ce094ff631c7a9c1eb93fdcd566)
Signed-off-by: Cheng Lian <lian@databricks.com>
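
The reasoning in the commit message can be illustrated with a small sketch. This is not Spark's inference code: `ArrayType`, `infer_array_type`, and `conforms` below are hypothetical stand-ins written in plain Python, and the data is made up. It only shows why a flag derived from a sample can be wrong for the full dataset, while an unconditional `contains_null = True` cannot be.

```python
from typing import Iterable, List, NamedTuple, Optional


class ArrayType(NamedTuple):
    """Hypothetical stand-in for an inferred array type (not Spark's class)."""
    element_type: str
    contains_null: bool


def infer_array_type(records: Iterable[List[Optional[int]]],
                     always_nullable: bool) -> ArrayType:
    """Toy inference over arrays of ints and None.

    With always_nullable=False the flag reflects only the records scanned;
    with always_nullable=True it is set unconditionally, as the commit proposes.
    """
    saw_null = any(x is None for arr in records for x in arr)
    return ArrayType("long", True if always_nullable else saw_null)


def conforms(arr: List[Optional[int]], t: ArrayType) -> bool:
    """An array fits the type if nulls are allowed or it has none."""
    return t.contains_null or all(x is not None for x in arr)


# Made-up dataset: only the last record has a null element.
all_records = [[1, 2], [3], [4, None]]
sample = all_records[:2]  # a sample that happens to miss the null

observed = infer_array_type(sample, always_nullable=False)
robust = infer_array_type(sample, always_nullable=True)

# The sample-based schema rejects a record that exists in the full data.
sample_schema_ok = all(conforms(a, observed) for a in all_records)
robust_schema_ok = all(conforms(a, robust) for a in all_records)
```

In this sketch the sample-based schema does not cover the third record, whereas the always-nullable schema covers every record regardless of which subset was scanned.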
1 parent c59871c
File | Mode | Size |
---|---|---|
assembly | ||
bagel | ||
bin | ||
build | ||
conf | ||
core | ||
data | ||
dev | ||
docker | ||
docs | ||
ec2 | ||
examples | ||
external | ||
extras | ||
graphx | ||
mllib | ||
network | ||
project | ||
python | ||
repl | ||
sbin | ||
sbt | ||
sql | ||
streaming | ||
tools | ||
yarn | ||
.gitattributes | -rw-r--r-- | 40 bytes |
.gitignore | -rw-r--r-- | 962 bytes |
.rat-excludes | -rw-r--r-- | 985 bytes |
CONTRIBUTING.md | -rw-r--r-- | 663 bytes |
LICENSE | -rw-r--r-- | 45.0 KB |
NOTICE | -rw-r--r-- | 22.0 KB |
README.md | -rw-r--r-- | 3.5 KB |
make-distribution.sh | -rwxr-xr-x | 8.6 KB |
pom.xml | -rw-r--r-- | 58.3 KB |
scalastyle-config.xml | -rw-r--r-- | 7.6 KB |
tox.ini | -rw-r--r-- | 838 bytes |