https://github.com/Netflix/atlas

Revision Author Date Message Commit Date
5fbe43a akka-http 10.0.10 (#664) Fixes race condition with the idle timeout on the host connection pool (akka/akka-http#1245). 31 August 2017, 20:06:06 UTC
afe5254 sbt 1.0.1 (#662) Fixes some minor issues, unfortunately just running `project/sbt` will still leave the terminal in a bad state. Need to create a minimal example and report. 31 August 2017, 02:53:28 UTC
f71934c make ShardsSuite more resilient to false positives (#661) Updates the checks for uniform distribution of the ids to compare the average with the min and the max rather than comparing the min to the max. In rare cases we were seeing the min being over the 10% threshold to the max. Fixes #660. 31 August 2017, 02:53:12 UTC
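The revised check described in this entry can be sketched as follows. This is a minimal Python illustration, not the project's Scala test code: the function name `is_roughly_uniform` and its `threshold` parameter are hypothetical, with 0.10 standing in for the 10% threshold mentioned in the message.

```python
def is_roughly_uniform(counts, threshold=0.10):
    """Return True if both the smallest and the largest count are
    within `threshold` (as a fraction) of the average count.

    Assumption: the deviation is measured relative to the average.
    """
    if not counts:
        raise ValueError("counts must not be empty")
    avg = sum(counts) / len(counts)
    lo, hi = min(counts), max(counts)
    # Compare each extreme with the average instead of comparing
    # the min directly with the max.
    return (avg - lo) <= threshold * avg and (hi - avg) <= threshold * avg
```

With counts of 91, 100, and 109 the min is more than 10% below the max, yet both extremes are within 10% of the average, so this form of the check passes.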
a86f5c6 log4j 2.9.0 30 August 2017, 23:00:59 UTC
7e59c21 aws-java-sdk 1.11.185 30 August 2017, 23:00:59 UTC
f532fbc roaringbitmap 0.6.51 30 August 2017, 23:00:59 UTC
deffe69 caffeine 2.5.5 30 August 2017, 23:00:59 UTC
d05bb71 joda convert 1.8.3 30 August 2017, 23:00:59 UTC
b81fa14 akka 2.5.4 30 August 2017, 23:00:59 UTC
cfd77ca equalsverifier 2.3.3 30 August 2017, 23:00:59 UTC
44ba657 jackson 2.8.9 30 August 2017, 23:00:59 UTC
33d730b iep 1.0.4 30 August 2017, 23:00:59 UTC
916627d scalatest 3.0.4 30 August 2017, 23:00:59 UTC
0995a0d scala-logging 3.7.2 30 August 2017, 23:00:59 UTC
735a1c8 recover from errors on response source (#658) The LWC HostSource was not recovering if there was a failure on the entity source from the response. This would cause the overall stream to fail and stop producing data. With this change the error will get logged and it will get retried like any other failure. 30 August 2017, 22:39:16 UTC
5b4654f refactor sub manager to ensure cleanup works (#657) Updates the subscription manager so it is also tracking the expressions. The expression database classes are no longer needed and were removed. This change makes it easier to ensure that all resources for a given stream will be properly cleaned up when the connection goes away. Fixes #651. 30 August 2017, 16:52:15 UTC
7879a06 update ids for lwc to be consistent with metrics (#656) Changes the id strings used for lwc expressions to be more consistent with how metric ids are generated. 29 August 2017, 19:46:04 UTC
1b8066c remove GlobalUUID cruft (#655) This is left over from when redis was used in earlier versions. 29 August 2017, 19:33:00 UTC
fc7659e remove unused class TTLManager (#654) 29 August 2017, 17:46:40 UTC
269a232 support chaining with SmallHashMap.Builder (#653) Updates the return types for `add` and `addAll` to return the Builder so they can be chained. 25 August 2017, 20:22:32 UTC
62ce53c fix endless loop for numProbesPerKey (#652) This method was incorrectly using the data length rather than the capacity when looping over the array. If a key had a collision and was hashed to a position that was larger than the data length, then it would be an endless loop because the key entry would never be found. 25 August 2017, 19:52:06 UTC
e0b32df update to spectator 0.57.1 (#648) Fixes issues when running on jdk9: ``` Cause: java.lang.IllegalAccessException: access to public member failed: com.netflix.spectator.api.RegistryConfig.gaugePollingFrequency()Duration/invokeSpecial, from com.netflix.spectator.api.RegistryConfig/2 (unnamed module @7d3cb5af) at java.base/java.lang.invoke.MemberName.makeAccessException(MemberName.java:914) at java.base/java.lang.invoke.MethodHandles$Lookup.checkAccess(MethodHandles.java:2193) at java.base/java.lang.invoke.MethodHandles$Lookup.checkMethod(MethodHandles.java:2133) at java.base/java.lang.invoke.MethodHandles$Lookup.getDirectMethodCommon(MethodHandles.java:2282) at java.base/java.lang.invoke.MethodHandles$Lookup.getDirectMethodNoSecurityManager(MethodHandles.java:2276) at java.base/java.lang.invoke.MethodHandles$Lookup.unreflectSpecial(MethodHandles.java:1800) at com.netflix.spectator.impl.Config.lambda$createProxy$3(Config.java:131) ``` 12 August 2017, 15:20:30 UTC
84c955b update to sbt 1.0.0 (#647) Requires updating a number of the sbt plugins needed for publishing. These changes have been verified in the iep project already. Dropped some of the non-essential plugins that are not yet compatible. 12 August 2017, 15:10:15 UTC
e797f08 use built in text/event-stream media type (#645) As of 10.0.8 akka-http supports the `text/event-stream` media type. We no longer need it as a custom one. 05 August 2017, 17:24:01 UTC
48a747a handle :all expressions with streaming eval (#644) There is a bigger discussion on whether we should remove `:all`, but for now all DataExpr types should be supported. 05 August 2017, 17:11:16 UTC
0672bad disable eviction warnings for update task (#643) Trying to clean up some of the noise in the build logs. No one is paying attention to these warnings and if there is a real compatibility problem it should be caught in tests or the integration environment. 04 August 2017, 15:47:31 UTC
54283e9 allow the local mapper with just group size (#642) The local mapper doesn't actually need the full group details just the size. Originally it took the group, but there are a number of uses now that just create a fake group with the right size. 04 August 2017, 14:57:56 UTC
5e2aef4 fix prefix for SQS message size metric (#641) s/sns/sqs/ 02 August 2017, 22:29:57 UTC
5adf6d7 docs: fix links for deuteranopia and deuteranomaly (#640) They were linking to the wrong sub-sections. 02 August 2017, 15:09:42 UTC
00e5c8a update default path pattern for StreamSupervisor (#639) We sometimes see paths for the stream supervisor that look like: ``` akka://test/user/StreamSupervisor-99961 ``` This creates a lot of ids which aren't that useful. This change updates the default pattern to just extract the prefix and ignore the counter at the end. So activity for all stream supervisors will show up as a single metric. 01 August 2017, 17:29:49 UTC
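The idea of keeping only the prefix of an actor path and ignoring the trailing counter can be illustrated with a small regular expression. This is a sketch in Python; the pattern, the `actor_id` helper, and the `"uncategorized"` fallback are hypothetical and are not the default pattern shipped with the project.

```python
import re

# Hypothetical pattern: capture the first path element after /user/ or
# /system/, dropping a trailing "-<digits>" counter when one is present.
PATH_PATTERN = re.compile(
    r"^akka://[^/]+/(?:user|system)/([^/]+?)(?:-\d+)?(?:/.*)?$"
)


def actor_id(path):
    """Map an actor path to the id used for grouping its metrics."""
    match = PATH_PATTERN.match(path)
    return match.group(1) if match else "uncategorized"
```

Under this pattern every `StreamSupervisor-<n>` path maps to the single id `StreamSupervisor`, while names that merely contain a hyphen are left intact.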
8c807ee fix key for adding tags to dynamodb errors (#638) The key in the configuration was singular instead of plural. Need to think about a better way to validate this in the future. 01 August 2017, 00:46:30 UTC
0d373a1 add helper for simple shard mapping (#637) Move over the internal helper utility for mapping data to a set of instances. This is typically used with a service like edda that maintains stable slotting for instances within auto-scaling groups. These simple schemes are preferred for the core monitoring system to avoid dependencies on complex infrastructure like zookeeper that are prone to having correlated failure with the key systems we need to monitor. 28 July 2017, 18:14:16 UTC
16421f0 update to scala 2.12.3 (#636) It is supposed to have improved compiler performance. 28 July 2017, 00:20:55 UTC
af3f4b5 cross link the pages for data/math aggregate functions (#635) Adds cross-links so it is easier for the user to get from the math variant of an aggregate function to the data variant docs and vice versa. The distinction between these is usually something the user doesn't need to worry about, but sometimes causes confusion. 26 July 2017, 23:51:37 UTC
61de4f0 add mappings for application load balancer stats (#634) Adds basic mappings for the ApplicationELB metrics in CloudWatch. There was a bit of debate on whether the LoadBalancer and TargetGroup dimensions should be kept as is or extract the name and only show that. The ids are rarely useful to the end user and add quite a bit of noise when selecting in the UI. For now though we just map the values in as they are provided in CloudWatch. 22 July 2017, 16:34:59 UTC
1fe5d32 convert invalid characters for polled metrics (#633) When using an alias with a lambda it will get encoded as a `:alias` suffix on the resource dimension. Before those stats were getting dropped as invalid. 21 July 2017, 22:29:19 UTC
4b23695 fix unit for s3 request timers (#632) The latency is in milliseconds and we were treating it as seconds. 21 July 2017, 21:03:12 UTC
4c7ff56 add mapping for s3 request metrics (#631) Collect s3 request metrics if enabled. They need to get enabled for a given bucket to be available. 20 July 2017, 15:33:59 UTC
e283528 improve error message when missing uri param (#630) Fixes #574. Improves the error message returned if the input URI is missing required parameters. It has also been changed to get returned as a diagnostic message as part of the stream so that a single bad URI will not stop the processor for all evaluations. 13 July 2017, 22:17:14 UTC
6c6204c refresh dependencies (#627) 04 July 2017, 17:17:45 UTC
3250f95 jsr305 3.0.2 04 July 2017, 17:08:37 UTC
c69cd42 spectator 0.56.0 04 July 2017, 17:01:36 UTC
1fb621c akka-http 10.0.9 04 July 2017, 16:59:54 UTC
93c2236 akka 2.5.3 04 July 2017, 16:55:15 UTC
695c429 aws-java-sdk 1.11.158 04 July 2017, 16:52:29 UTC
775706b roaringbitmap 0.6.45 04 July 2017, 16:51:39 UTC
dbc4453 equalsverifier 2.3.1 04 July 2017, 16:50:42 UTC
ef312e7 iep 1.0.2 04 July 2017, 16:49:05 UTC
6dc5e10 update merge to also dedup the list (#626) For the use-cases we care about the input list will already be sorted and uniqued. This fixes the merge to also remove duplicate values across the lists. 29 June 2017, 22:24:03 UTC
2e98374 helper for merging sorted lists (#625) Broken off from larger effort that is still in-progress. Basic helper utility for merging a collection of sorted lists. One such use-case is merging tag list results coming back from many shards. 29 June 2017, 21:29:14 UTC
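Merging a collection of sorted lists while also dropping duplicates across them, as the two entries above describe, can be sketched with a heap-based k-way merge. This is a Python illustration, not the project's Scala helper; the name `merge_sorted` is hypothetical, and each input is assumed to be sorted already.

```python
import heapq


def merge_sorted(lists):
    """Merge already-sorted lists into one sorted list without duplicates."""
    merged = []
    for value in heapq.merge(*lists):
        # Equal values arrive next to each other in the merged stream,
        # so comparing with the last emitted value is enough to dedup.
        if not merged or merged[-1] != value:
            merged.append(value)
    return merged
```

For example, tag values returned by several shards can be combined with `merge_sorted([shard1_values, shard2_values])`, where the shard variable names are placeholders.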
7e7d093 fix error when rendering with a long time range (#624) When trying to render a chart with a long time range a NoSuchElementException would result if the graph width was too small. Sample exception below: ``` java.util.NoSuchElementException: head of empty list at scala.collection.immutable.Nil$.head(List.scala:428) ~[scala-library-2.12.2.jar:?] at scala.collection.immutable.Nil$.head(List.scala:425) ~[scala-library-2.12.2.jar:?] at com.netflix.atlas.chart.graphics.Ticks$.time(Ticks.scala:265) ~[atlas-chart_2.12-1.6.0-SNAPSHOT.jar:1.6.0-SNAPSHOT] ``` The cause is that there were no options for meeting the desired number of major ticks given the options for timeTickSizes. This change uses a simpler selection method with just major ticks for longer ranges setting it to the name of the month or year. 29 June 2017, 16:34:41 UTC
be4cd8b add :clamp-min and :clamp-max (#623) Allows the user to restrict the min and max values of the input respectively. The most common use-case motivating this is to be able to support a bounded auto-scaling for data on an axis. The axis lower and upper limits are either explicit or automatic. These operators give more flexibility and can be set for a given line to tune its behavior. Name was chosen to match the same operators in Prometheus. 20 June 2017, 18:05:17 UTC
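The effect of the two operators on a line's values can be illustrated as below. This is a Python sketch of the general clamping idea, not the operators' implementation; the function names are hypothetical, and passing NaN through unchanged is an assumption made for the example.

```python
import math


def clamp_min(values, lower):
    """Raise every value below `lower` up to `lower` (NaN passes through)."""
    return [v if math.isnan(v) else max(v, lower) for v in values]


def clamp_max(values, upper):
    """Lower every value above `upper` down to `upper` (NaN passes through)."""
    return [v if math.isnan(v) else min(v, upper) for v in values]
```

Applying `clamp_max` to a line with an occasional large spike keeps that spike from stretching an automatically scaled axis.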
1b4d0aa cache most recent datapoints from cloudwatch (#621) In some cases if there are a lot of cloudwatch metrics, then the polling can start to get throttled and cannot keep up at the desired rate. This can cause spotty reporting until the limit can get raised or the poller config is changed to be less aggressive. To minimize the visibility of the problem for a user consuming the metric the last datapoint is now cached and the reported value comes from the local cache. This does mean that when such a problem is happening the wrong value might be propagated for several minutes. From recent testing though in the more typical case it avoids confusion more often until there is a better way to access cloudwatch data. If it is a big concern the ttl can be configured to a smaller value. 19 June 2017, 15:17:12 UTC
346e9c6 add mapping for ApproximateAgeOfOldestMessage (#620) New SQS metric that was added a while back: https://aws.amazon.com/about-aws/whats-new/2016/08/new-amazon-cloudwatch-metric-for-amazon-sqs-monitors-the-age-of-the-oldest-message/ 19 June 2017, 14:48:36 UTC
66439f1 fix IllegalFormatConversionException (#619) Illegal conversion when trying to access the group by key for a data expression. ``` java.util.IllegalFormatConversionException: x != com.netflix.atlas.core.model.ItemId at java.util.Formatter$FormatSpecifier.failConversion(Formatter.java:4302) at java.util.Formatter$FormatSpecifier.printInteger(Formatter.java:2793) at java.util.Formatter$FormatSpecifier.print(Formatter.java:2747) at java.util.Formatter.format(Formatter.java:2520) at java.util.Formatter.format(Formatter.java:2455) at java.lang.String.format(String.java:2940) at scala.collection.immutable.StringLike.format(StringLike.scala:351) at scala.collection.immutable.StringLike.format$(StringLike.scala:350) at scala.collection.immutable.StringOps.format(StringOps.scala:29) at com.netflix.atlas.core.model.DataExpr.groupByKey(DataExpr.scala:35) ``` 15 June 2017, 20:18:20 UTC
938edd6 docs: add Roy's presentation from Monitorama (#618) 13 June 2017, 17:36:25 UTC
4e73843 refresh dependencies (#617) 06 June 2017, 16:38:46 UTC
45b0f76 iep 1.0.1 06 June 2017, 16:27:19 UTC
65d70d1 akka 2.5.2 Note, there was a signature change to `FileIO.toPath` causing the build to break. Looks like the type for the set of options was changed leading to: ``` Error:(115, 34) type mismatch; found : scala.collection.immutable.Set[java.nio.file.StandardOpenOption] required: Set[java.nio.file.OpenOption] Note: java.nio.file.StandardOpenOption <: java.nio.file.OpenOption, but trait Set is invariant in type A. You may wish to investigate a wildcard type such as `_ <: java.nio.file.OpenOption`. (SLS 3.2.10) .toMat(FileIO.toPath(file, options))(Keep.right) ``` 06 June 2017, 16:14:56 UTC
d080359 akka-http 10.0.7 06 June 2017, 15:08:59 UTC
ea8db2b fix spelling in publish-test script (#616) s/anwser/answer/ 06 June 2017, 15:08:36 UTC
8a75a59 jol-core 0.8 06 June 2017, 15:07:49 UTC
bbd4a62 sbt-jmh 0.2.25 06 June 2017, 15:06:49 UTC
975785a caffeine 2.5.2 06 June 2017, 15:05:31 UTC
0e83dac equalsverifier 2.3 06 June 2017, 15:04:46 UTC
4138533 log4j 2.8.2 06 June 2017, 15:04:05 UTC
81e3e37 aws-java-sdk 1.11.133 06 June 2017, 15:01:27 UTC
4985299 add more test cases for cors (#615) Add some test cases for cors on error responses. There was a report that CORS headers were not added to errors. That does not seem to be the case. Most likely they did not handle the rejections and the filter was not applied to the error. 06 June 2017, 14:43:39 UTC
c3964ca docs: update to recommend 1.5.3 (#614) 06 June 2017, 04:40:21 UTC
0dfb445 fix warnings about adapting argument list to tuple (#611) The usage of the parameters directive was giving warnings like: ``` Warning:(48, 17) Adapting argument list by creating a 3-tuple: this may not be what you want. signature: ParameterDirectives.parameters(pdm: akka.http.scaladsl.server.directives.ParameterDirectives.ParamMagnet): pdm.Out given arguments: scala.Symbol("name").$qmark, scala.Symbol("expression").$qmark, scala.Symbol("frequency").$qmark after adaptation: ParameterDirectives.parameters((scala.Symbol("name").$qmark, scala.Symbol("expression").$qmark, scala.Symbol("frequency").$qmark): akka.http.scaladsl.server.directives.ParameterDirectives.ParamMagnet{type Out = akka.http.scaladsl.server.Directive[(Option[String], Option[String], Option[String])]}) parameters('name.?, 'expression.?, 'frequency.?) { (name, expr, frequency) => ``` It is easy to fix, though previous usage does seem to be inline with examples from docs: http://doc.akka.io/docs/akka-http/10.0.7/scala/http/routing-dsl/directives/parameter-directives/parameters.html#optional-parameter-with-default-value 03 June 2017, 17:00:09 UTC
43df9ee fix deprecation warning for use of actor publisher (#610) There are a few more of these in other parts of the code, but they are a bit more involved to cleanup. Will look at those later. 03 June 2017, 16:48:13 UTC
a08e04c use hash code to short-circuit equals check (#609) Since the hash code is precomputed it can be a cheap way to short circuit the equality check on the array. 03 June 2017, 04:21:56 UTC
5b91291 add test case for decoding datapoint (#608) Make sure encoding/decoding of datapoint works as expected. 02 June 2017, 22:57:24 UTC
c0bb87a skip value associated with unknown fields (#607) If an unknown object field was used in a datapoint it would result in a NullPointerException by prematurely hitting the end of the object. 02 June 2017, 22:47:33 UTC
edaaaac change ids to custom class (#606) Before the ids were just a BigInteger. This was mostly for convenience and it was an easy starting point for some modifications. Flame graphs of stateful clusters for some stacks are showing a significant amount of time being spent computing the hash code for the ids when looking up the blocks. The custom class in this change uses a precomputed hash code so that it will be constant time. 02 June 2017, 20:56:36 UTC
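A precomputed hash code, together with the equality short-circuit from #609, can be sketched like this. The class below is a Python illustration of the technique only; the name `PrecomputedId` and its fields are hypothetical and do not mirror the project's Scala class.

```python
class PrecomputedId:
    """Immutable id wrapper that computes its hash once at construction."""

    __slots__ = ("_data", "_hash")

    def __init__(self, data):
        self._data = bytes(data)
        # Computed once here so later hash lookups are constant time.
        self._hash = hash(self._data)

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        if not isinstance(other, PrecomputedId):
            return NotImplemented
        # Different hashes imply different contents, so the byte-by-byte
        # comparison is only needed when the hashes agree.
        return self._hash == other._hash and self._data == other._data
```

The trade-off is one extra stored integer per id in exchange for avoiding repeated hashing on every map lookup.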
4f54d00 fix regex matching when prefix is an exact match (#605) When searching for the prefix it was looking for the position of the first match that was strictly greater than the prefix. For cases where the prefix is an exact match to the full value they would not match. The search now checks for the first match that is greater than or equal to the prefix. 02 June 2017, 20:19:12 UTC
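The difference between "strictly greater than" and "greater than or equal to" when locating a prefix in a sorted list of values can be shown with a binary search. This is a Python sketch of the general technique, not the project's index code; `values_with_prefix` is a hypothetical name.

```python
from bisect import bisect_left


def values_with_prefix(sorted_values, prefix):
    """Return all entries of a sorted list of strings that start with `prefix`."""
    # bisect_left finds the first position whose value is >= prefix, so a
    # value equal to the prefix is included. Starting at the first value
    # strictly greater than the prefix (bisect_right) would skip it.
    i = bisect_left(sorted_values, prefix)
    result = []
    while i < len(sorted_values) and sorted_values[i].startswith(prefix):
        result.append(sorted_values[i])
        i += 1
    return result
```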
9cc4292 fix warning for offset (#604) s/var/val/. 31 May 2017, 01:50:06 UTC
7cfcc2c fix multi-threaded reads from long set/map (#603) Several use-cases of the primitive collections assume multi-threaded read-only access is safe after the collection is built and effectively immutable. That is nothing will ever write to it again. The variants with a Long key type use an internal buffer to avoid allocations when passing the long value to `MurmurHash3.bytesHash`. The multi-threaded access could result in errors such as BufferOverflowException on the buffer. This changes those collections to use a ThreadLocal for keeping track of the buffers. 31 May 2017, 01:28:26 UTC
36f537c update findKeys to just return the strings (#602) Before it was wrapping the string in a TagKey type that could also provide counts for each key. Since this isn't supported or used anymore we can avoid the additional allocations. Also most use-cases were just mapping it back to strings after getting the response. 30 May 2017, 15:41:04 UTC
3864010 improve tag query performance (#601) Drops support for counts in the index and does some initial work to simplify and improve performance. For a lot of the common use-cases we see a 2-3x improvement: **Before** ``` Benchmark Mode Cnt Score Error Units create thrpt 10 13.210 ± 1.041 ops/s findKeysAll thrpt 10 4646.344 ± 176.685 ops/s findKeysQuery thrpt 10 450.728 ± 39.296 ops/s findValuesAllMany thrpt 10 739.919 ± 41.530 ops/s findValuesAllOne thrpt 10 2194.050 ± 248.351 ops/s ``` **After** ``` Benchmark Mode Cnt Score Error Units create thrpt 10 20.270 ± 0.798 ops/s findKeysAll thrpt 10 6382544.499 ± 353321.426 ops/s findKeysQuery thrpt 10 1118.262 ± 370.960 ops/s findValuesAllMany thrpt 10 3849.128 ± 416.156 ops/s findValuesAllOne thrpt 10 4485.419 ± 424.741 ops/s ``` Some of the primitive collections have also been adjusted to use murmur hash instead of `Long.hashCode(v)` because the collision rate was way too high. This adds a bit of memory overhead because we copy the value into a pre-allocated byte buffer to reuse the murmur hash implementation provided in the scala std library. 29 May 2017, 21:51:01 UTC
51f676b drop cross build for 2.11 (#600) We have been running on 2.12 for a while now and internal builds and deployments are mostly on 2.12 now. The few stragglers are using the 1.5.x Atlas builds which is 2.11.x only and that will not change. 27 May 2017, 18:10:52 UTC
73416e3 always sort NaN values to the end (#599) This change alters the behavior of sorting to always move NaN values to the end regardless of the order. Before, NaN values were at the end for ascending order and descending order would just reverse the list causing NaN values to show at the top. Fixes #586. 27 May 2017, 18:01:53 UTC
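Keeping NaN values at the end for both sort directions means the NaN entries cannot simply take part in a reversed ascending sort. A minimal Python sketch of that behavior follows; the function name `sort_nan_last` is hypothetical and the code illustrates the idea rather than the project's implementation.

```python
import math


def sort_nan_last(values, descending=False):
    """Sort numbers in the requested order, always placing NaN at the end."""
    numbers = sorted(
        (v for v in values if not math.isnan(v)), reverse=descending
    )
    nans = [v for v in values if math.isnan(v)]
    # NaN entries are appended after sorting, so reversing the order of
    # the numbers never moves them to the front.
    return numbers + nans
```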
1765646 fix unused variable warnings (#595) 23 May 2017, 21:37:51 UTC
46a5fb2 avoid unused method warnings in guice modules (#594) Since scala 2.12 the private provider methods in the module cause warnings about unused methods. Unfortunately scala doesn't provide a way to suppress these on individual methods (SI-1781). For now the work around is to mark them as protected rather than private. Note the module class is final so there will not be any subclasses. 23 May 2017, 17:01:54 UTC
42de1e4 update flow types to allow DiagnosticMessages (#593) Updates the high level processor to pass JsonSupport messages instead of TimeSeriesMessage specifically. This gives a lot of flexibility to send through other types like DiagnosticMessage or other helpful information when processing the stream. 23 May 2017, 16:50:43 UTC
39d481d docs: discuss low volume conditions for alerting (#592) With traffic failovers internally low volume conditions cause a number of false positives. Update the philosophy doc to discuss that use-case and compensating with a check on the absolute volume. 19 May 2017, 16:58:38 UTC
0d953a3 suppress chunks that are all NaN (#591) If there is no data for an expression in a given chunk window, then drop that data. 18 May 2017, 14:16:49 UTC
162c392 docs: expand description for legend (#590) In particular discuss the use of variables in the legends. 17 May 2017, 22:35:51 UTC
95665c5 fix unit used when partitioning a context (#588) The context was ignoring the unit parameter passed into partition. This would result in an IAE under some situations. See test case for more details. 17 May 2017, 21:22:41 UTC
39f9d30 initial port of fetch api (#585) Starting work on #547. This adds an endpoint compatible with the internal fetch API to the oss repo. There are still a number of things left to do, but this should unblock some other efforts. To be addressed in later PRs: - Test coverage, there are some basic test cases but I still need to debug some issues with one of them and clean them up a bit to remove use of some test data that was dumped from internal sources. - The throttling/limiter logic still needs to be moved over, but is a bit more intertwined. - In general need to move the actors away from the actor publisher that was deprecated in akka 2.5. 16 May 2017, 16:13:11 UTC
e4640c8 docs: update to refer to 1.5.2 (#583) 07 May 2017, 16:07:43 UTC
749e214 akka-http 10.0.6 (#582) Fixes security vulnerability: http://doc.akka.io/docs/akka-http/10.0.6/security/2017-05-03-illegal-media-range-in-accept-header-causes-stackoverflowerror.html 07 May 2017, 15:47:19 UTC
e311d28 allow step size for datapoint to be set explicitly (#581) Before the step would always come from the global config setting. This made it impossible to use the datapoint class with multiple step sizes in play. 07 May 2017, 04:24:46 UTC
cafa8ce fix #578, incorrect formatting for links (#580) Looks like github no longer accepts a line break between the display text and href for the link. 07 May 2017, 04:07:25 UTC
c52f00c search and replace over legend text (#579) This allows some simple manipulation of the legend text using a sed like search and replace. The main use-case is for massaging the results of a group by to be more presentable. 06 May 2017, 18:13:18 UTC
26239b2 create bitmap directly instead of through int set (#573) The use of the int set was a carry over from the lazy set implementation. This removes some steps saving memory and computation cost when building the index. 05 May 2017, 04:02:25 UTC
8c2a809 use int iterator for bitmaps (#572) Avoids boxing the values to Integer. 05 May 2017, 03:53:56 UTC
eeb92af add test case for createStreamsProcessor() Provides a simple example of usage and verifies some of the basic functionality with the file based uri. 04 May 2017, 15:46:58 UTC
f2edd36 included rendered image in v2.json response (#568) Fixes #564. We may move this behind a flag before 1.6 final, but for now it is enabled by default. By including in the v2.json format the UI/dashboard can use the static image and switch to a dynamic rendering without making another request. The message has a type of `graph-image` with a data field that is a data uri of the image without the legend. Sample usage: ```html <html> <body> <div id="content"></div> <script> var content = document.getElementById('content'); fetch('http://localhost:7101/api/v1/graph?q=name,sps,:eq,(,nf.cluster,),:by&format=v2.json') .then(function(response) { return response.json(); }) .then(function(json) { var html = ''; json.forEach(function(msg) { if (msg.type === 'graph-image') { html += '<div><img src="' + msg.data + '"/></div>'; } else if (msg.type === 'timeseries') { html += '<div>' + msg.label + '</div>'; } }); content.innerHTML = html; }); </script> </body> </html> ``` 01 May 2017, 22:19:52 UTC
16a734c remove ExpressionDatabaseActor from sample config (#567) Updates the lwcapi.conf sample config to remove the actor since it went away in #561. Also updates to spectator 0.55.0. 01 May 2017, 21:23:55 UTC