Revision d7b268ab3264b892c4477cf8af30fb78c2694748 authored by herman on 03 December 2019, 10:25:49 UTC, committed by herman on 03 December 2019, 10:25:49 UTC
### What changes were proposed in this pull request?

Observable metrics are named, arbitrary aggregate functions that can be defined on a query (DataFrame). As soon as the execution of a DataFrame reaches a completion point (e.g. it finishes a batch query or reaches a streaming epoch), a named event is emitted that contains the metrics for the data processed since the last completion point.

A user can observe these metrics by attaching a listener to the Spark session. Which listener to attach depends on the execution mode:

- Batch: `QueryExecutionListener`. This will be called when the query completes. A user can access the metrics by using the `QueryExecution.observedMetrics` map.
- (Micro-batch) Streaming: `StreamingQueryListener`. This will be called when the streaming query completes an epoch. A user can access the metrics by using the `StreamingQueryProgress.observedMetrics` map. Please note that we currently do not support continuous execution streaming.

### Why are the changes needed?

This enables observable metrics.

### Does this PR introduce any user-facing change?

Yes. It adds the `observe` method to `Dataset`.

### How was this patch tested?

- Added unit tests for the `CollectMetrics` logical node to the `AnalysisSuite`.
- Added unit tests for `StreamingProgress` JSON serialization to the `StreamingQueryStatusAndProgressSuite`.
- Added integration tests for streaming to the `StreamingQueryListenerSuite`.
- Added integration tests for batch to the `DataFrameCallbackSuite`.

Closes #26127 from hvanhovell/SPARK-29348.

Authored-by: herman <herman@databricks.com>
Signed-off-by: herman <herman@databricks.com>
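As an illustration of the batch path described above, the following is a minimal sketch rather than code taken from this patch. It assumes a Spark build that includes this change, a local master, and hypothetical names (`ObservedMetricsSketch`, `my_metrics`, `rows`, `max_id`) chosen only for the example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.QueryExecution
import org.apache.spark.sql.functions.{col, count, lit, max}
import org.apache.spark.sql.util.QueryExecutionListener

// Hypothetical example object; not part of the Spark code base.
object ObservedMetricsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]") // assumption: local mode, for illustration only
      .appName("observed-metrics-sketch")
      .getOrCreate()

    // Batch mode: a QueryExecutionListener is called when the query completes.
    spark.listenerManager.register(new QueryExecutionListener {
      override def onSuccess(funcName: String, qe: QueryExecution, durationNs: Long): Unit = {
        // observedMetrics maps the name passed to `observe` to a Row of metric values.
        qe.observedMetrics.get("my_metrics").foreach { row =>
          val rows = row.getAs[Long]("rows")
          val maxId = row.getAs[Long]("max_id")
          println(s"rows=$rows, max_id=$maxId")
        }
      }

      override def onFailure(funcName: String, qe: QueryExecution, exception: Exception): Unit = ()
    })

    // Attach two named aggregates to the query; they are collected as a side
    // effect of running it and do not change the query's result.
    val observed = spark.range(100)
      .observe("my_metrics", count(lit(1)).as("rows"), max(col("id")).as("max_id"))

    // Running an action is the completion point that triggers the listener.
    // Depending on the Spark version, the callback may run asynchronously.
    observed.collect()

    spark.stop()
  }
}
```

For micro-batch streaming, the analogous hook would be a `StreamingQueryListener` reading `StreamingQueryProgress.observedMetrics` from each progress event.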
1 parent 075ae1e
File | Mode | Size |
---|---|---|
LICENSE-AnchorJS.txt | -rw-r--r-- | 1.1 KB |
LICENSE-CC0.txt | -rw-r--r-- | 6.9 KB |
LICENSE-bootstrap.txt | -rw-r--r-- | 550 bytes |
LICENSE-cloudpickle.txt | -rw-r--r-- | 1.6 KB |
LICENSE-copybutton.txt | -rw-r--r-- | 2.4 KB |
LICENSE-d3.min.js.txt | -rw-r--r-- | 1.4 KB |
LICENSE-dagre-d3.txt | -rw-r--r-- | 1.0 KB |
LICENSE-datatables.txt | -rw-r--r-- | 1.0 KB |
LICENSE-graphlib-dot.txt | -rw-r--r-- | 1.0 KB |
LICENSE-heapq.txt | -rw-r--r-- | 2.4 KB |
LICENSE-join.txt | -rw-r--r-- | 1.5 KB |
LICENSE-jquery.txt | -rw-r--r-- | 1.1 KB |
LICENSE-json-formatter.txt | -rw-r--r-- | 547 bytes |
LICENSE-matchMedia-polyfill.txt | -rw-r--r-- | 149 bytes |
LICENSE-modernizr.txt | -rw-r--r-- | 1.1 KB |
LICENSE-mustache.txt | -rw-r--r-- | 1.2 KB |
LICENSE-py4j.txt | -rw-r--r-- | 1.4 KB |
LICENSE-respond.txt | -rw-r--r-- | 1.0 KB |
LICENSE-sbt-launch-lib.txt | -rw-r--r-- | 1.5 KB |
LICENSE-sorttable.js.txt | -rw-r--r-- | 937 bytes |
LICENSE-vis.txt | -rw-r--r-- | 406 bytes |