Revision c423091432d55fa88c2799181b9d3575213530c8 authored by Huaxin Gao on 17 October 2017, 19:50:41 UTC, committed by gatorsmile on 17 October 2017, 19:53:14 UTC
## What changes were proposed in this pull request?

In Average.scala, it has
```scala
  override lazy val evaluateExpression = child.dataType match {
    case DecimalType.Fixed(p, s) =>
      // increase the precision and scale to prevent precision loss
      val dt = DecimalType.bounded(p + 14, s + 4)
      Cast(Cast(sum, dt) / Cast(count, dt), resultType)
    case _ =>
      Cast(sum, resultType) / Cast(count, resultType)
  }
```
It is possible that `Cast(count, dt)` will make the precision of the decimal number bigger than 38, and this causes overflow. Since `count` is an integer and doesn't need a scale, I will cast it using `DecimalType.bounded(38, 0)`.

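To make the overflow concrete, here is a small standalone model (plain Python, not Spark code) of how Spark clamps decimal precision and scale to the 38-digit cap; the `bounded` helper and the example precision `(32, 2)` are illustrative assumptions, not the actual patch:

```python
MAX_PRECISION = 38  # Spark caps decimal precision at 38 digits

def bounded(precision, scale):
    """Toy model of DecimalType.bounded: clamp precision and scale to the cap."""
    return (min(precision, MAX_PRECISION), min(scale, MAX_PRECISION))

# Hypothetical input column of DecimalType(32, 2); averaging widens it:
p, s = 32, 2
print(bounded(p + 14, s + 4))  # requested (46, 6) is clamped to (38, 6)

# Before the fix, count (a whole number) was also cast to that widened type,
# so the planned precision of sum/count could exceed 38 and overflow.
# After the fix, count uses a scale-0 type that always fits within the cap:
print(bounded(MAX_PRECISION, 0))  # (38, 0), no clamping needed
```

The key point is that `count` never needs fractional digits, so giving it scale 0 leaves the full 38 digits of precision available for the integral part.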
## How was this patch tested?
A test case is added in DataFrameSuite.

Please review http://spark.apache.org/contributing.html before opening a pull request.

Author: Huaxin Gao <huaxing@us.ibm.com>

Closes #19496 from huaxingao/spark-22271.

(cherry picked from commit 28f9f3f22511e9f2f900764d9bd5b90d2eeee773)
Signed-off-by: gatorsmile <gatorsmile@gmail.com>

# Conflicts:
#	sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
1 parent 71d1cb6