https://github.com/Netflix/atlas

Revision Author Date Message Commit Date
2b0996e fix build settings for 2.13 to use `-release 8` (#1055) Before it was looking specifically for version 12. For now we still need to be able to run on jdk8+. The specific issue encountered was: ``` java.lang.NoSuchMethodError: java.nio.CharBuffer.clear()Ljava/nio/CharBuffer; at com.netflix.atlas.core.model.TaggedItem$.writePair(TaggedItem.scala:59) at com.netflix.atlas.core.model.TaggedItem$.computeId(TaggedItem.scala:105) at com.netflix.atlas.core.model.TimeSeries$.<clinit>(TimeSeries.scala:22) at com.netflix.atlas.core.model.EvalContext.<init>(EvalContext.scala:32) at com.netflix.atlas.druid.DruidDatabaseActorSuite.<init>(DruidDatabaseActorSuite.scala:259) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at java.lang.Class.newInstance(Class.java:442) ``` Also improves the check for recent JDKs. 13 June 2019, 17:38:02 UTC
a9b2821 1.6: update to scala 2.13.0 final (#1053) 13 June 2019, 15:44:32 UTC
562cd85 1.6: cross build for 2.13.0-RC3 (#1049) Backport #1048 to 1.6.x branch. Most changes are minor. For some of the custom collections they needed to be specific to the new version. Bump akka and jackson versions as the older versions do not have a published build for 2.13. 05 June 2019, 23:42:56 UTC
045de8c fix deprecation warnings for ActorPublisher (#1037) Changes the processing for the `/fetch` endpoint to use streams instead of the deprecated ActorPublisher. 01 May 2019, 22:28:44 UTC
c783b88 fix empty data case for math/count (#1036) It wasn't checking if the input was empty like the other aggregate functions. Now it performs the same check which avoids errors like: ``` must have 1 or more time series to perform aggregation ``` 30 April 2019, 23:41:51 UTC
049a878 fix eval state for sparse lines with group by (#1035) Before, if a line was sparse, for example an error counter that has a lot of gaps, the state would be handled incorrectly. There would be one state item per matching grouping when data was present and they would only get moved forward if there was data for that key in a given interval. For intervals with no data, a no data line would get pushed through and have its own independent state. With this change, an empty group by result will push an empty data set through eval and it will now move the state buffers to allow for expiration and correct handling of subsequent intervals with data. If there are no resulting data lines from the overall expression, then a no data line for the overall expression will be emitted to the user. 30 April 2019, 16:54:42 UTC
4550be9 detect empty state and stop tracking it (#1034) Updates the online algorithms used with stateful operators to detect if the state is effectively empty. In that case the state will be dropped from the state map maintained during evaluation. This is useful to avoid leaking memory for tracking the state during long running streaming evaluations. There are currently three exceptions that have an unbounded window and never become empty: ignore-N, sdes, and integral. Integral is generally not very useful in a streaming context so we may just prohibit that operation. For sdes (and ignore-N which is typically only used to align sdes), the current uses do not involve group by operations, so it shouldn't be an issue for now. We'll revisit in the future when there are concrete use-cases that are problematic for that operation. 29 April 2019, 21:47:02 UTC
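The empty-state detection described in 4550be9 can be sketched as follows. This is a minimal illustration in Python rather than the project's Scala, and `RollingCount`, `is_empty`, and `evaluate_interval` are hypothetical names, not Atlas classes. It assumes a rolling buffer counts as empty once every slot holds `NaN`.

```python
import math
from collections import deque


class RollingCount:
    """Hypothetical stateful operator: counts non-NaN values in a window."""

    def __init__(self, window):
        self.buf = deque([math.nan] * window, maxlen=window)

    def next(self, value):
        self.buf.append(value)
        return sum(1 for x in self.buf if not math.isnan(x))

    def is_empty(self):
        # The state carries no information once every slot is NaN.
        return all(math.isnan(x) for x in self.buf)


def evaluate_interval(states, window, values_by_key):
    """Advance every tracked key by one interval and drop empty states."""
    results = {}
    for key in set(states) | set(values_by_key):
        state = states.setdefault(key, RollingCount(window))
        # Keys with no data in this interval still move forward with NaN.
        results[key] = state.next(values_by_key.get(key, math.nan))
        if state.is_empty():
            del states[key]  # stop tracking it to avoid leaking memory
    return results
```

With a window of 2, a key that reports once and then goes silent is removed from `states` after two intervals without data.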
a805a6a fix tag results on MemoryDatabase.execute (#1031) The set of tags for the aggregated lines should be limited to the set that are exact matches in the query or are a part of a group by. Otherwise the grouping can have incorrect duplicates in the result set. Fixes #629. 26 April 2019, 22:19:07 UTC
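The tag restriction described in a805a6a can be illustrated with a short sketch. It is written in Python rather than the project's Scala, and `result_tags` is a hypothetical helper, not the `MemoryDatabase` API. The example query is assumed to match `name` exactly and group by `app`.

```python
def result_tags(tags, exact_keys, group_by_keys):
    """Keep only tags that are exact matches in the query or group-by keys."""
    keep = set(exact_keys) | set(group_by_keys)
    return {k: v for k, v in tags.items() if k in keep}


# Two input series that differ only in a tag the query does not mention.
series = [
    {"name": "cpu", "app": "www", "node": "i-1"},
    {"name": "cpu", "app": "www", "node": "i-2"},
]

# Assumed query: name is matched exactly and the result is grouped by app.
groups = {
    tuple(sorted(result_tags(tags, ["name"], ["app"]).items()))
    for tags in series
}
```

Because `node` is dropped, both series map to the same tag set, so `groups` holds a single entry rather than two duplicates for the same grouping.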
cca15bb expire interned queries after 12h (#1030) Switches from a ConcurrentHashMap to a Cache for the interned queries. This avoids a slow leak over time with churn in the set of queries. In practice, we have not seen enough churn for it to matter, but now queries that are no longer in use will expire and go away. Fixes #729. 26 April 2019, 22:09:09 UTC
bf69de1 improve error message for invalid percentile data (#1029) The spectator-js client had a bug (Netflix/spectator-js#22) leading to duplicate values for a given count. This change updates the error message so it is easier to understand what is happening. Before it would just say `assertion failed`. 26 April 2019, 21:32:05 UTC
180fbaa improve handling of empty legend string (#1028) Before it would fail with an error that is hard for a user to understand: ``` IllegalArgumentException: Can't add attribute to 0-length text ``` This error would only show up if the image legend was rendered, causing further confusion because it wouldn't show up if the legend was suppressed. Now it will fall back to the default legend string if an empty string is specified. 25 April 2019, 02:26:33 UTC
b3206fa remove inline rollup hooks on the api (#1027) Partially reverts #525. This removes the plumbing for the ids and rollups via the api. The usage for the internal DB has been retained. The reason for this is that it adds additional complexity and risk to the backends, and for most use-cases we are moving to the simple aggregator cluster instead (see atlas-aggregator in iep-apps). As we are not planning to exercise this at scale anytime soon, it is being removed from the APIs before the 1.6 release. 25 April 2019, 02:19:29 UTC
49a4717 Add rolling-sum stateful operator (#1025) Sum of the values within a specified window. The sum will only be emitted if there are at least a minimum number of actual values (not `NaN`) within the window. Otherwise `NaN` will be emitted for that time period. 17 April 2019, 02:37:05 UTC
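The rolling-sum behavior described in 49a4717 can be sketched as below. This is an illustration in Python rather than the project's Scala. The function name and the `window` and `min_values` parameters are assumptions for the example, not the operator's actual signature.

```python
import math
from collections import deque


def rolling_sum(values, window, min_values):
    """Sum the non-NaN values in a trailing window of `window` intervals.

    NaN is emitted when fewer than `min_values` actual values are present.
    """
    buf = deque(maxlen=window)
    out = []
    for v in values:
        buf.append(v)
        actual = [x for x in buf if not math.isnan(x)]
        out.append(sum(actual) if len(actual) >= min_values else math.nan)
    return out


data = [1.0, math.nan, 2.0, 3.0]
strict = rolling_sum(data, window=2, min_values=2)
lenient = rolling_sum(data, window=2, min_values=1)
```

With `min_values=2` only the last interval has two actual values in its window, so the first three outputs are `NaN`. With `min_values=1` every interval produces a sum.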
0fb0a09 update dependencies (#1024) 15 April 2019, 21:22:15 UTC
dff6a1a equalsverifier 3.1.8 15 April 2019, 21:02:07 UTC
9bc07aa RoaringBitmap 0.8.0 15 April 2019, 20:58:26 UTC
ce57cda joda-convert 2.2.0 15 April 2019, 20:57:55 UTC
a8bed68 aws-java-sdk 1.11.534 15 April 2019, 20:56:52 UTC
26318f3 iep 2.0.1 15 April 2019, 20:56:28 UTC
e98e25c spectator 0.90.0 15 April 2019, 20:44:03 UTC
48fef1a caffeine 2.7.0 15 April 2019, 20:43:21 UTC
1e2ebb6 akka-http 10.1.8 15 April 2019, 20:13:36 UTC
493714e akka 2.5.22 15 April 2019, 20:10:16 UTC
a3d3a20 fix typo in rolling-{min,max} descriptions (#1023) `s/is can/can/` 14 April 2019, 20:59:09 UTC
3aa0bec preserve offset for named rewrites (#1022) In some cases when the rewrite was not being used like an aggregate function, the offset would get lost. Fixes #1021. 14 April 2019, 16:23:51 UTC
579211f fix unused method warnings (#1020) 12 April 2019, 02:16:39 UTC
c31db8c add test case for stateful windows with no data (#1019) Verify that rolling window for stateful operators moves for intervals that have no data for the expression. 11 April 2019, 23:28:05 UTC
294f301 support generated dataset for eval (#1018) Adds a `evaluator.createDatapointProcessor(DataSources)` method that can be used to process a generated set of datapoints rather than getting data from an LWC cluster. This can be used for tests or for using Atlas expressions over arbitrary event streams. Fixes #798. 05 April 2019, 13:47:45 UTC
179311d Switch 2.12 build back to `osx` (#1017) Experiment showed that the 2.12 build with linux/trusty _does_ result in the correct version. 29 March 2019, 23:17:52 UTC
7971b7c Explicitly fetch tags for build (#1016) Add `--tags` to the `git fetch --unshallow` command. Specifying just `--unshallow` or just `--tags` results in an artifact version of `0.1` when building on Travis with `osx`. 29 March 2019, 23:14:20 UTC
9c917f4 Switch to linux/trusty for 2.12 build (#1015) Switch to linux/trusty for 2.12 build to match the 2.11 build. This is to observe whether the issue determining the version to use for publication has an OS component. 29 March 2019, 22:42:10 UTC
0638240 Revert "Ensure git tags are available for build" (#1014) This did not have the desired effect and seems to have caused the 2.11 build to no longer work. This reverts commit 7f30c26. 29 March 2019, 22:08:01 UTC
646d366 Ensure git tags are available for build (#1012) `git fetch --unshallow` was being used to get tags to set the artifact version. This has started failing on one of the builds. This commit switches to explicitly fetching tags via `git fetch --tags`. 29 March 2019, 21:34:59 UTC
3f5c03d Fix lambda config (#1010) The code that loads `MetricCategory` instances from the config expects the `dimension` field, even if it's empty. This was causing the service to silently fail. In addition to fixing the `lambda` config, I've added tests that load the production config to improve the chances we'll catch this at build time. 29 March 2019, 17:32:41 UTC
02dc9b9 improve logging for poller actor failures (#1011) Clearly log the initialization failure as an error for the poller manager. Avoids this getting overlooked if the detailed actor logging is not enabled. 29 March 2019, 17:03:58 UTC
a1f778c Collect additional lambda metrics (#1008) Add collection of * `ConcurrentExecutions` * `UnreservedConcurrentExecutions` * `DeadLetterErrors` * `IteratorAge` Issue #1005 28 March 2019, 02:10:14 UTC
bf9a99d simplify log/power scale logic (#1009) Simplifies the logic for computing the log and power scales. Now it just applies the mapping function to the input bounds and the value prior to using a normal linear scale. 28 March 2019, 01:27:28 UTC
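The simplification described in bf9a99d, applying a mapping function to the bounds and the value before an ordinary linear scale, can be sketched as follows. This is a Python illustration with hypothetical function names, not the Atlas chart code, and it ignores edge cases such as zero or negative inputs to the logarithm.

```python
import math


def linear_scale(d1, d2, r1, r2):
    """Map the domain [d1, d2] linearly onto the range [r1, r2]."""
    def scale(v):
        return r1 + (v - d1) / (d2 - d1) * (r2 - r1)
    return scale


def mapped_scale(fn, d1, d2, r1, r2):
    """Apply fn to the bounds and to each value, then scale linearly."""
    linear = linear_scale(fn(d1), fn(d2), r1, r2)
    return lambda v: linear(fn(v))


# Log scale: domain 1..1000 mapped onto 0..300 pixels.
log_scale = mapped_scale(math.log10, 1.0, 1000.0, 0.0, 300.0)

# Power scale with exponent 2: domain 0..10 mapped onto 0..100 pixels.
power_scale = mapped_scale(lambda v: v ** 2, 0.0, 10.0, 0.0, 100.0)
```

Here each decade of the log scale covers one third of the range, so `log_scale(10.0)` lands near 100 and `log_scale(100.0)` near 200, while `power_scale(5.0)` lands at a quarter of the range.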
b5b9eae update dependencies (#1006) 20 March 2019, 03:03:18 UTC
824d9ef aws-java-sdk 1.11.521 20 March 2019, 02:52:29 UTC
d546872 equalsverifier 3.1.7 20 March 2019, 02:36:35 UTC
a0bc227 scalatest 3.0.7 20 March 2019, 02:34:44 UTC
f89c5d1 spectator 0.87.0 20 March 2019, 02:31:35 UTC
8c4d12f iep 2.0.0 20 March 2019, 02:29:02 UTC
5f95cba Collect additional aurora replica metrics (#1003) Add collection of `AuroraReplicaLag` and `AuroraReplicaLagMaximum` (fixes #1002). 07 March 2019, 20:52:38 UTC
bab1872 Improve CloudWatch metric lag handling (#1001) This is the first of potentially multiple commits to improve handling of CloudWatch metric lag. This first commit makes the time range of the query configurable and adds a metric to track the age (in CloudWatch periods) of the latest datapoint returned. This will enable assessing the distribution of ages across the namespaces and metrics collected. Those data will influence the approach for datapoints that are older than Atlas will accept. 01 March 2019, 23:09:05 UTC
ed1f1f2 update dependencies (#1000) 28 February 2019, 17:18:32 UTC
782583a log4j 2.11.2 28 February 2019, 17:01:47 UTC
77ce64f roaring bitmap 0.7.42 28 February 2019, 16:54:43 UTC
23d59a2 equalsverifier 3.1.5 28 February 2019, 16:53:49 UTC
7010f52 slf4j 1.7.26 28 February 2019, 16:48:17 UTC
ed81688 scalatest 3.0.6 28 February 2019, 16:36:21 UTC
47ea06c akka 2.5.21 28 February 2019, 16:34:16 UTC
28f7b3a spectator 0.86.0 28 February 2019, 15:10:39 UTC
29c2ad9 aws-java-sdk 1.11.501 28 February 2019, 15:08:43 UTC
6e1be7f Add NetworkELB TLS metrics (#997) See * https://aws.amazon.com/about-aws/whats-new/2019/01/network-load-balancer-now-supports-tls-termination/ * https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-cloudwatch-metrics.html 28 February 2019, 15:04:55 UTC
fd64f34 Add NATGateway Metrics (#998) See https://docs.aws.amazon.com/vpc/latest/userguide/vpc-nat-gateway-cloudwatch.html 28 February 2019, 02:15:28 UTC
9fb970d fix link on eval lib readme (#996) Reference to reactive streams publisher didn't match. 27 February 2019, 14:51:19 UTC
9ad7190 add helpers for accessing materializer for stage (#995) In some cases, such as calling `discardEntityBytes` for an HTTP response, it is useful to access the materializer for the stage so the stream blueprint can be created without needing to pass in a materializer. 22 February 2019, 06:17:23 UTC
b151b50 only use offset notation if there is duplication (#994) Adjusts the logic so that offset notation for the y-axis will only get used if there is an actual duplication for the major tick labels. See #991 for more information. 15 February 2019, 17:55:30 UTC
6c3e280 avoid BoxesRunTime.equals for SmallHashMap (#990) For the QueryIndex on the streaming clusters a hot spot is `SmallHashMap.get`. Flame graphs show a significant and unnecessary overhead being `BoxesRunTime.equals`. This change updates 7 places in the code to avoid that call: **Before** ``` $ javap -verbose ./atlas-core/target/scala-2.12/classes/com/netflix/atlas/core/util/SmallHashMap.class | grep Boxes 46: invokestatic #1101 // Method scala/runtime/BoxesRunTime.equals:(Ljava/lang/Object;Ljava/lang/Object;)Z 79: invokestatic #1101 // Method scala/runtime/BoxesRunTime.equals:(Ljava/lang/Object;Ljava/lang/Object;)Z 105: invokestatic #1101 // Method scala/runtime/BoxesRunTime.equals:(Ljava/lang/Object;Ljava/lang/Object;)Z 40: invokestatic #1155 // Method scala/runtime/BoxesRunTime.unboxToBoolean:(Ljava/lang/Object;)Z 29: invokestatic #1101 // Method scala/runtime/BoxesRunTime.equals:(Ljava/lang/Object;Ljava/lang/Object;)Z 59: invokestatic #1101 // Method scala/runtime/BoxesRunTime.equals:(Ljava/lang/Object;Ljava/lang/Object;)Z 17: invokestatic #1101 // Method scala/runtime/BoxesRunTime.equals:(Ljava/lang/Object;Ljava/lang/Object;)Z 2: invokestatic #1101 // Method scala/runtime/BoxesRunTime.equals:(Ljava/lang/Object;Ljava/lang/Object;)Z ``` **After** ``` $ javap -verbose ./atlas-core/target/scala-2.12/classes/com/netflix/atlas/core/util/SmallHashMap.class | grep Boxes 40: invokestatic #1154 // Method scala/runtime/BoxesRunTime.unboxToBoolean:(Ljava/lang/Object;)Z ``` 01 February 2019, 00:30:38 UTC
d7e92dc QueryIndex: reduce overhead for checking entries (#989) Flame graphs on the prod clusters show a bit of overhead for filter and exists calls on the list of entries. This change converts it to a simple array and avoids using the collections framework methods. For the existing JMH test this resulted in about a 12% improvement. 31 January 2019, 22:31:05 UTC
5f5166a add rolling-mean operator (#987) This is meant as an alternative to `:trend` that fixes a number of issues with that operator. Specifically: 1. The denominator for the average is the number of actual values, that is non-NaN entries, within the rolling buffer. The `:trend` operator always uses the window size which can create confusing drops because `NaN` values are effectively 0. 2. The minimum number of values permitted before emitting a mean can be specified by the user. 3. It is more consistent with other stateful operators in that it works on a window size relative to the step interval rather than a fixed time duration. 4. It is more consistent with similar operators in other tools such as the `rolling_mean` function provided by pandas. Fixes #958. 29 January 2019, 22:30:29 UTC
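The difference in denominators described in 5f5166a can be illustrated with a small sketch. It is written in Python rather than the project's Scala, and both function names are hypothetical. `fixed_window_mean` is a simplified stand-in for the `:trend` behavior the commit describes, not its actual implementation.

```python
import math
from collections import deque


def rolling_mean(values, window, min_values):
    """Mean over the actual (non-NaN) values in the trailing window."""
    buf = deque(maxlen=window)
    out = []
    for v in values:
        buf.append(v)
        actual = [x for x in buf if not math.isnan(x)]
        if actual and len(actual) >= min_values:
            out.append(sum(actual) / len(actual))
        else:
            out.append(math.nan)
    return out


def fixed_window_mean(values, window):
    """Always divide by the window size, so NaN behaves like 0."""
    buf = deque(maxlen=window)
    out = []
    for v in values:
        buf.append(v)
        out.append(sum(x for x in buf if not math.isnan(x)) / window)
    return out


data = [10.0, math.nan, 10.0]
by_actual = rolling_mean(data, window=3, min_values=1)
by_window = fixed_window_mean(data, window=3)
```

For a steady value of 10 with one gap, the final `rolling_mean` output stays at 10, while the fixed-denominator version drops to about 6.67 because the gap is averaged in as a zero.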
80322a9 fix procedure syntax warnings (#988) Procedure syntax is deprecated in 2.13 and results in a lot of warnings when trying to build on that version of Scala. 29 January 2019, 22:26:02 UTC
fb09dd5 update dependencies (#986) 29 January 2019, 20:36:22 UTC
d11f6aa equalsverifier 3.1.4 29 January 2019, 18:31:37 UTC
c262b24 roaring bitmap 0.7.36 29 January 2019, 18:27:49 UTC
52da9ed frigga 0.19.0 29 January 2019, 18:27:00 UTC
4cf5e49 iep 1.2.10 29 January 2019, 18:22:08 UTC
b965033 aws-java-sdk 1.11.482 29 January 2019, 18:21:06 UTC
27c5f33 spectator 0.83.0 29 January 2019, 18:10:14 UTC
8e2007e sbt-scalafmt 1.16 29 January 2019, 18:09:26 UTC
2acb1a3 akka-http 10.1.7 29 January 2019, 18:01:00 UTC
25016b1 akka 2.5.20 29 January 2019, 18:00:02 UTC
ca7ffed consistent state model for all stateful operators (#985) Updates all of the stateful operators to use the same online algorithm base classes. This also gives them a consistent representation of state that can easily be serialized and deserialized. This is a first step to possible future work of persisting the state of streaming evaluations so it can be replayed or the execution can be transitioned to another instance. 29 January 2019, 17:54:49 UTC
ccea49d refactor to avoid AssignOrNamedArg (#984) This class was renamed in scala 2.13 (scala/scala@870131b). 29 January 2019, 14:49:35 UTC
d295803 Throttle calls to CloudWatch (#983) We're hitting CloudWatch rate limits on a regular basis. However, the AWS limits should be sufficient for our overall per second call rate in the majority of cases. The current pattern of calls has bursts when the `Tick` message kicks off a collection, which causes the call rate to spike above the per second limit. This commit introduces call rate limiting to smooth out the request pattern. The Akka documentation for throttling request/response actor communication is incomplete. Through iteration playing around in a local toy app, I arrived at the implementation herein and confirmed that it works as expected. For this use case, it's important to ensure that either all or none of the `MetricMetadata` elements are added for processing. To satisfy that requirement, the full list is sent to the actor `Source` which then uses `flatMapConcat` to send each element individually through the throttle phase. This provides a stronger guarantee than the default, which could drop elements if the queue fills up. In practice, memory is more likely to be the limiting factor, given actors have unbounded mailboxes by default. Case in point, it was difficult to trigger the drop scenario in the local toy app. However, this approach more deterministically provides the stronger guarantee. 28 January 2019, 19:49:54 UTC
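The all-or-none pattern described in d295803, accepting a full list at once and then throttling its elements one by one, can be sketched in plain Python. This is not Akka Streams code. `throttle` is a hypothetical generator that paces elements with a fixed delay, which is a simpler technique than Akka's throttle stage, and the `sleep` parameter is injectable only so the example can be exercised without waiting.

```python
import time


def throttle(batches, max_per_second, sleep=time.sleep):
    """Flatten whole batches and emit elements at a bounded rate."""
    interval = 1.0 / max_per_second
    for batch in batches:      # each list is accepted as a single unit
        for element in batch:  # then flattened, similar to flatMapConcat
            yield element
            sleep(interval)    # pause so bursts are smoothed out


# Record the requested delays instead of actually sleeping.
delays = []
emitted = list(throttle([["m1", "m2"], ["m3"]], 10, sleep=delays.append))
```

Since a batch enters the pipeline as one item, either all of its elements are queued for processing or none are. The per-element pacing happens only after the batch has been accepted.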
c56bc88 use jdk8 for building the scala 2.11 artifacts (#982) Since 2.11 doesn't support the `--release` option, building on a newer version can lead to errors when running on jdk8. Specifically the return type of some methods changed in jdk9+. This should fix errors like: ``` Cause: java.lang.NoSuchMethodError: java.nio.CharBuffer.clear()Ljava/nio/CharBuffer; at com.netflix.atlas.core.model.TaggedItem$.writePair(TaggedItem.scala:59) at com.netflix.atlas.core.model.TaggedItem$.computeId(TaggedItem.scala:105) ``` This means that image tests will not run for the 2.11 build. 23 January 2019, 22:23:54 UTC
8b8881f remove stat vars from output tags (#981) The stat vars are desired for substitutions (#878), but should not be included in the tag maps for the output. The output tags should be stable over time if evaluated incrementally. Including the stats breaks this because the values are dependent on the data for that time slice. 18 January 2019, 17:40:54 UTC
55b900a add helper function to validate a datasource (#980) This can be used as an upfront check to filter out bad data sources rather than getting the failure via the stream. 18 January 2019, 00:50:33 UTC
b014def update default grid colors (#979) This makes the grid colors lighter so they do not distract the viewer as much. These settings have been used internally for many years so this also reduces differences between the internal use at Netflix and OSS settings. 08 January 2019, 21:13:50 UTC
e10c574 use dedicated object for algo state (#978) Before it was using a Config object for convenience. This switches it to a dedicated object that can be easily used with `Json.encode/decode` or other similar tools. This should also be more efficient for the more common use-cases because we can avoid creation of the needless config objects. 04 January 2019, 19:20:31 UTC
14efd74 avoid conversion if already a ConfigValue (#977) In the docs site it is getting config 1.2 in the sbt classpath. There isn't an obvious way to force it to a newer version. The older version will fail if trying to convert a ConfigValue to a ConfigValue. For now we can work around the problem by special casing that to avoid the unnecessary conversion. 04 January 2019, 00:43:09 UTC
87689ad set --release for javac (#976) After switching to use jdk11 for the build, the java classes were getting compiled to class version 55 instead of 52. 03 January 2019, 21:52:59 UTC
bb7b319 remove redis from travis config (#975) This is no longer needed for the Atlas build. 03 January 2019, 20:43:24 UTC
c9cad18 disable scaladoc publishing (#974) On JDK11 with `-release 8` it crashes with: ``` [error] java.lang.AssertionError: assertion failed: [error] type AnyRef in java.lang [error] while compiling: ... [error] during phase: globalPhase=terminal, enteringPhase=typer [error] library version: version 2.12.8 [error] compiler version: version 2.12.8 ``` This is a quick workaround as we do not rely on the published scaladoc jars for anything. 03 January 2019, 19:24:13 UTC
ceced7a build using openjdk11 (#973) This updates the travis builds to use OpenJDK 11. The `-release 8` option is used to ensure the generated bytecode will still work on JDK8. Due to font rendering differences, image tests will fail when running on older versions of the JDK or on operating systems other than Mac OS X. Those checks will now automatically be disabled on systems that are known to fail, but are checked as part of CI validation. 03 January 2019, 17:32:11 UTC
f6c0fb9 enable antialiasing by default (#972) Update the config settings to use antialiasing for the text by default. It is explicitly disabled for tests as it will frequently cause rendering differences across systems. 02 January 2019, 22:25:02 UTC
822beec use RobotoMono font for error images (#971) Follow up to #967. Use RobotoMono font for the png error image utility as well as the graphs. 02 January 2019, 21:14:58 UTC
4c10f3c sbt 1.2.8 (#970) Fixes occasional NPE for Bintray. https://developer.lightbend.com/blog/2018-12-30-sbt-1-2-8/ 02 January 2019, 18:35:26 UTC
34d291b update license headers for 2019 (#969) 02 January 2019, 18:05:06 UTC
de117e9 use standard IIOMetadata classes for PNG (#968) When using the `-release 8` option the internal `PNGMetadata` class is not found in the classpath. Update the usage to rely on the public APIs. 22 December 2018, 04:39:47 UTC
363f2fe switch to RobotoMono font (#967) The Lucida fonts are not included with OpenJDK and have been removed from OracleJDK in version 11. The RobotoMono family is Apache licensed and will now be used as the default to get a more consistent experience across JDK versions. 21 December 2018, 23:49:17 UTC
4662ced fix #852, inconsistent group by behavior (#966) Before, attempting a group by on non-grouped expressions without a math aggregate function would behave differently than a non-grouped expression with a math aggregate. Now they have the same behavior and the group by will be ignored for expression trees that do not support it. 21 December 2018, 21:08:31 UTC
22ae17f fix #763, custom consolidation with rewrites (#965) The rewrites that look like aggregation functions will now work with custom consolidations. If used the rewrite will not be preserved in the model. So the output of converting the parsed expression model to a string will be the expanded expression and not indicate the rewrite was used. 21 December 2018, 20:32:32 UTC
19bcfc0 fix #948, all zeros shown on y-axis (#964) If the upper bound exactly matched `10 * factor` for the selected unit prefix, then it would use a different selection to avoid large numbers for the tick labels. However, for an exact match it is better to use the default prefix to avoid getting zeros due to rounding with the larger prefix. 21 December 2018, 19:01:14 UTC
c4e8820 move PngImage from atlas-core to atlas-chart (#963) The image utilities are only used for charting and this makes it easier to get consistency with upcoming font changes. 21 December 2018, 17:31:14 UTC
b2a28d0 akka-http 10.1.6 (#961) Adds an explicit `akka-stream-testkit` dependency because it is no longer included with `akka-http-testkit`. 21 December 2018, 03:36:00 UTC
a8d0eed update to jackson 2.9.8 (#960) Has a number of security fixes: https://groups.google.com/forum/#!topic/jackson-user/8jdpNS1dQPQ https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.9.8 20 December 2018, 20:55:28 UTC
294e0ba update dependencies (#959) 14 December 2018, 22:29:27 UTC
f94046a sbt 1.2.7 14 December 2018, 20:47:40 UTC