https://github.com/apache/spark
Revision c2ce247ead836a3ae593a6e4f2a5758c34a35bb4 authored by jeanlyn on 29 March 2016, 17:51:00 UTC, committed by Andrew Or on 29 March 2016, 17:51:00 UTC
## What changes were proposed in this pull request?

We have a streaming job using `FlumePollInputStream` always driver OOM after few days, here is some driver heap dump before OOM
```
 num     #instances         #bytes  class name
----------------------------------------------
   1:      13845916      553836640  org.apache.spark.storage.BlockStatus
   2:      14020324      336487776  org.apache.spark.storage.StreamBlockId
   3:      13883881      333213144  scala.collection.mutable.DefaultEntry
   4:          8907       89043952  [Lscala.collection.mutable.HashEntry;
   5:         62360       65107352  [B
   6:        163368       24453904  [Ljava.lang.Object;
   7:        293651       20342664  [C
...
```
`BlockStatus` and `StreamBlockId` keep on growing, and the driver OOM in the end.
After investigated, i found the `executorIdToStorageStatus` in `StorageStatusListener` seems never remove the blocks from `StorageStatus`.
In order to fix the issue, i try to use `onBlockUpdated` replace `onTaskEnd ` , so we can update the block informations(add blocks, drop the block from memory to disk and delete the blocks) in time.

## How was this patch tested?

Existing unit tests and manual tests

Author: jeanlyn <jeanlyn92@gmail.com>

Closes #12028 from jeanlyn/fixoom1.6.
1 parent a757944
History
Tip revision: c2ce247ead836a3ae593a6e4f2a5758c34a35bb4 authored by jeanlyn on 29 March 2016, 17:51:00 UTC
[SPARK-13845][CORE][BACKPORT-1.6] Using onBlockUpdated to replace onTaskEnd avioding driver OOM
Tip revision: c2ce247

README.md

back to top