https://github.com/wikimedia/operations-puppet

Revision Author Date Message Commit Date
f9c41a5 contint: jenkins-debian-glue 0.17.0 That version now only has two binary packages. Drop the others from our puppet manifest. Change from 'latest' to 'present'. We have unattended upgrade now and regardless there are only two slaves doing package building. Bug: T141114 Change-Id: I314b6b3b076e407d622abf61df77d76be5ffb99b 22 July 2016, 20:03:58 UTC
9ee71ab Phab: make sure the mail crons have mariadb-client installed Change-Id: Id09a46edff0c58c8a8f6eaf683e98048d6256046 22 July 2016, 19:46:35 UTC
04bd1c2 gerrit: fixup error document config line follow-up to Ic0cb8e2a5060da fixes name of the config option Change-Id: I8553d0ef87e987196cfb8475381bb2eb45db0b54 22 July 2016, 19:13:42 UTC
efac112 Gerrit: Further tweaks to down/maintenance mode When Gerrit is actually down, Apache serves 503s. Let's at least pretty those up rather than spewing a black and white 503 with no context. When we go into maintenance mode, make @maint_mode a string of when we expect it to be back online. This allows us to set nice expectations. Change-Id: Ic0cb8e2a5060da40a5079adf3ed12338a5a72e2b 22 July 2016, 18:59:21 UTC
d2f6abb Gerrit: Store the ssh_host_key in private puppet secrets By default do nothing in the jetty manifest so labs installs can just generate their own keys Change-Id: I75baa1be98d9ab7ab5fa8eca762079b0fd692ab5 22 July 2016, 18:53:22 UTC
b90a61c toollabs: collect stats on grid usage by job Bug: T140999 Change-Id: I0bcf71b7553ae821b11acbe2b5c6049fe9310c0f 22 July 2016, 18:30:20 UTC
bf5a4a2 Gerrit: Set sendemail.connectTimeout to 1 minute Doesn't affect Gerrit 2.8, but with 2.12.x this allows us to specify a limit. The actual value doesn't matter much, we mainly just want there to be some timeout so if a single mail gets hung up it'll eventually drop instead of hanging and holding up all mail. Bug: T131189 Change-Id: Idcc795d8f508dad0810573185858430fd4e890ce 22 July 2016, 17:46:19 UTC
d8e41e9 labs dnsrecursor: add tcy.wiki(pedia) Bug:T140898 Change-Id: Ia5fa4be4783c9bf1bee11644b22e4d289dafd835 22 July 2016, 17:21:25 UTC
b9142c8 Reenable jobs running on the phabricator db slave AFAICS, only the dump process was pending to be reenabled after all others had been enabled previously. Bug: T138460 Change-Id: I8799dadf3b3a7e1ee92a0535688f96b58f62aba8 22 July 2016, 16:25:39 UTC
fc94af7 Formatting followup for I32b30fccaf044ae2865b331f28f9238ac6693f81 Change-Id: Ia148d8dfc2689a2dc2adaaa8faa61eb1972a2d58 22 July 2016, 05:23:46 UTC
0e3bc4c Rearrange arguments for pool_target options This is a secret, undocumented change in config format which I learned about from Tim Simmons on IRC. Change-Id: I32b30fccaf044ae2865b331f28f9238ac6693f81 22 July 2016, 15:16:46 UTC
c90f1d1 Update references to ubuntu.wikimedia.org We use just mirrors.wikimedia.org nowadays. Change-Id: I8487e5de6e11be1c4d3c5ac78ff8aec216d6b4c5 22 July 2016, 15:11:10 UTC
f3af241 mirror1001 -> sodium Change-Id: I8ae0c6ba1c3662f2fe18a9ea9316220c7fefb7f1 22 July 2016, 15:01:40 UTC
117cad4 zuul: explicitly define the Gerrit event delay Zuul is always a fixed amount of seconds behind Gerrit in order to have the search query to be up-to-date after a change has merged. The default is 5 in our current setup and is going to be hardcoded to 10 with a newer Zuul version. We have a cherry pick that lets us finely tune the delay by adding gerrit.event_delay = 5 https://review.openstack.org/#/c/343562/ Add that as a class parameter to zuul::server In the template config file shared by the merger and server, add the event_delay value if exists. Change-Id: I8b78be1577723095f2a369c09c1509d92242e4c5 22 July 2016, 14:24:07 UTC
ae76aeb puppetmaster: temporarily allow rhodium to compile all catalogs At the moment we want to test rhodium itself against other backends, and we'll compile all production catalogs from it. Bug: T98173 Change-Id: I6d2f31c67681c0c1fce16212dd19dc720265909e 22 July 2016, 14:21:34 UTC
feccc01 Increase elasticsearch heap on logstash routing instances It looks like aggregations in es2.3 might be utilizing more memory than 1.7 was for facets. All of the nodes have at least 5G of unused memory in addition to the disk cache, so lets try increasing the heap from 2G to 4G Bug: T141063 Change-Id: Ia0778201d22b4b056bf37666bca4faadb82c175a 22 July 2016, 14:17:42 UTC
349a3dd Enable Cassandra instance restbase2004-c.codfw.wmnet Bug: T134016 Change-Id: I15c52788af6c02f2e82aa59cb571efd9747362e9 22 July 2016, 14:13:36 UTC
518c181 restbase: have systemd restart failed nodes since we're no longer in the nineties and we have init systems able to keep services up for us, avoiding outages in case of mass crashes. Bug: T136957 Change-Id: Idecf0f66ad7f2214efcceaeeca8da6e85defbc0f 22 July 2016, 14:09:39 UTC
9c6a632 puppetmaster: pass the group to git::clone This is needed in order to have consistent status of files that have an undefined owner in the puppet manifests. Bug: T98173 Change-Id: Icda289f8484ad1bf1e285ec70611c280c7e5eb2c 22 July 2016, 13:16:48 UTC
2fbb74e Add extra grants needed on m5 for the dbproxy1010 Bug: T140983 Change-Id: If7116b5d99244eda0d6f5602f6b6dd54844a9b5c 22 July 2016, 12:35:53 UTC
871b104 Add grants on m4 databases for dbproxy1009 Bug: T140983 Change-Id: Iba82d2c6b52b6d20b7771a0ebc64cb6af05be152 22 July 2016, 11:32:27 UTC
ec075d2 admin: add addshore to analytics-wmde-users Bug: T140342 Change-Id: Idb05b55e0f3fd391a8d32c2fbf6fdc948af4fb0c 22 July 2016, 11:21:52 UTC
367ffb6 Add extra grants on s2 for dbproxy1007 Bug: T140983 Change-Id: Ibe7acfa3ee2e80efa8f9ddea47aa5848ecfbc1ab 22 July 2016, 10:56:04 UTC
c3a8df5 cache_upload VTC tests VTC test cases for cache_upload, compatible with both varnish v3 and v4. This commit also introduces a new executable called varnishtest-runner, which passes varnish-version-specific macros to varnishtest. Bug: T128188 Change-Id: I653502c2b7ce0e370754ff8acfeaec81a13842f4 22 July 2016, 10:41:09 UTC
5720caa Extra grants on m1 databases for dbproxy1006 Bug: T140983 Change-Id: Id89a887eb50998b3b95c8b1a9417ba40b8f7ec7e 22 July 2016, 10:27:06 UTC
0cdf699 Change-Prop: Definition rerender bug - don't react to revision change When HTML content changes in RESTBase it emits 2 resource_change events: one for title and one for revision. The definition update rule incorrectly reacted to both. The second one treated title/revision string as a title and so created an additional request that resolved to 404. Change-Id: Ic033d4c26aecbe944223066d102db2ddcffa1622 22 July 2016, 09:58:41 UTC
dbbb436 Change-Prop: Revert the revert - ignore bots on ORES We don't need to precache ORES on bot edits, so ignore those edits. The previous implementation of the `rev_is_bot` flag was wrong, so the patch was reverted. Now that an improvement has been made and deployed we can finally deploy this update to the rules. Change-Id: I040e60cfb5fe00f4d7c9a215ee1eb313c7058955 22 July 2016, 09:56:52 UTC
b80eb22 New m3 database grants for dbproxy1008 Bug: T140983 Change-Id: I9d11475e881e994b7cfe6b20861b5cc0b356ebab 22 July 2016, 09:19:30 UTC
d341b3e Add dbproxy1006-10 to production as redundant instances of 1-5 This first requires reimaging those servers. We will decide later how to coordinate them to provide full HA. Bug: T140983 Change-Id: I4f41736e289184d241e09ea59d4e19482efaaf07 22 July 2016, 07:15:44 UTC
e6db45b planet: "RelEng" (jargon)-> "WMF Release Engineering" Change-Id: I0292137571121afdeac65e7493781e608a82ba2e 22 July 2016, 04:53:24 UTC
7a979ae Reconfigure m3 servers to use modern mysql configuration Also remove unneeded options now that they are all equally upgraded. Bug: T138460 Change-Id: I4d5d9f9be42ce06a30607f7cf6b9bba33c596673 22 July 2016, 04:40:46 UTC
54cb378 Set db1048 as the new phabricator master on config This will fix current heartbeat (replication) alerts. Bug: T138460 Change-Id: I89a646da79bc0b0fe16adf76f3f7b7b023082fe8 22 July 2016, 03:27:46 UTC
d674a1c Designate: Don't specify two nameservers if they're the same. Change-Id: Ic9c9a6ec034ee6860e7442d7e0fe1a0431a5c97a 22 July 2016, 01:39:10 UTC
54c783e gerrit: add missing source for remote recursion follow-up fix to Iff8d4b47e88f62c0 "failed: You cannot specify a remote recursion without a source" Change-Id: I21070bc0720e71981ae1baa0927cac3b7ab4bcf3 22 July 2016, 00:15:08 UTC
24aa659 Rename oslo.config to oslo_config Those OpenStack people sure do like to rename things and break my code. Change-Id: I2b8d6f1077937465bfe016c42252e32645dbdcef 21 July 2016, 23:58:50 UTC
a2710e1 Gerrit: Greatly simplify directory management on host We can recurse => remote all of these directories, mostly the files in etc/, so adding/removing/renaming a new static file, config file, or Header/Footer/CSS file doesn't need a manifest change. recurse => remote is so cool :) Also tidy up some directory permissions (wayyyy too permissive) a tad. Change-Id: Iff8d4b47e88f62c0cc973a6031b6d07101510c51 21 July 2016, 23:36:40 UTC
00761c5 Catch liberty designate.conf up to the state of the art. Change-Id: I5d27bd80efc8015c3872c6c9bc5f4f6fb8288672 21 July 2016, 23:12:41 UTC
c736314 Add new user 'hjiang' for Helen Jiang Bug: T140659 Change-Id: I8501186a1c90ddbc05f46f6dda11f72afac4e3cd 21 July 2016, 23:00:10 UTC
70bc366 puppet @var warning on fastopen_pending_max Change-Id: Iddbedc8e2c14bbac65daf8e7c30fc4f14260afe0 21 July 2016, 21:52:57 UTC
624474a Upgrade labtest to openstack Liberty Change-Id: Id41b939fc71ea23533d36016e381bbe84d35cc13 21 July 2016, 21:48:16 UTC
6cc0aa0 admin: add shell account for Jasmeet Samra Creates a shell user for Jasmeet Samra, consultant for fundraising analytics (T139764) UID matches wikitech LDAP user "jsamra" Bug:T140445 Change-Id: Iad9e8a4a1fdf18f0381c902719947521a0a86d7d 21 July 2016, 21:31:00 UTC
cd3a456 Maps - initial import script Fix wrong script name Bug: T138501 Change-Id: I3d94b111e6bd4bba6d34aebda0f7ce938aecb834 21 July 2016, 21:16:09 UTC
d3546e4 Actually create initial import script for OSM data Bug: T138501 Change-Id: Ib9e228b99c982571c903b11206dbd9aff4e811e8 21 July 2016, 20:59:30 UTC
9cb3945 Script to do the initial data load from OSM for Maps project This script is a naive implementation of the steps taken to get Map's postgres database in working shape. Bug: T138501 Change-Id: Ibb7ce464636c380b5b7f1bed5ffe4a1fe1e1f177 21 July 2016, 20:06:30 UTC
e0af271 tools: Add check for high iowait Possibly catch etcd/tools-worker crashes Bug: T141017 Bug: T140256 Change-Id: I9359cfcb83584e06ae6d01d063dae35714e2a724 21 July 2016, 19:31:59 UTC
b094d63 Add alerting for MediaWiki exceptions and fatals Watch the rate per minute of MediaWiki exceptions and fatals. Warn if it exceeds 25; panic if it exceeds 50. I hope we can gradually make these numbers lower, but for now we need to make sure we're not too noisy or we'll inadvertently train people to ignore the alert. Bug: T140942 Change-Id: I638d270e52a559a5b6bc0f68788172869ca2d888 21 July 2016, 19:15:25 UTC
076185a jsbench: chromium-browser on trusty, chromium on jessie On Ubuntu trusty this package was "chromium-browser", but since the upgrade to jessie it could not be found. E: Package 'chromium-browser' has no installation candidate It's just "chromium" on jessie. https://wiki.debian.org/Chromium Bug:T141023 Change-Id: Ic260c3e7d30a307cc48eca8b7b5dbbec0a6ad6b6 21 July 2016, 18:17:39 UTC
8216653 cold-migrate: activate/deactivate base image as needed. Bug: T139272 Change-Id: I666084bbaf156803a3485c10571b0dc9c9ff2d9a 21 July 2016, 16:54:47 UTC
910af60 cold-migrate: use novaenv.sh for credentials Bug: T139272 Change-Id: I3b98cf4096599cb4f0da5ab4dd2151225e648aff 21 July 2016, 16:54:05 UTC
cf80d59 contint: APPEND unattended upgrade allowed-origins Before: Unattended-Upgrade::Allowed-Origins "Wikimedia:${distro_codename}-wikimedia"; Unattended-Upgrade::Allowed-Origins:: "${distro_id}:${distro_codename}-security"; After: Unattended-Upgrade::Allowed-Origins ""; Unattended-Upgrade::Allowed-Origins:: "${distro_id}:${distro_codename}-security"; Unattended-Upgrade::Allowed-Origins:: "Wikimedia:${distro_codename}-wikimedia"; Adding '::' at the end of our stanza makes unattended upgrade properly recognize the Wikimedia origin. Bug: T98885 Change-Id: Ie386cc044e4c5df4cbac50cbb29c9daf3b517be6 21 July 2016, 16:33:49 UTC
e2619d7 Changed partition scheme for relforge (elasticsearch) servers With 4x3TB disks, we need GPT and RAID10. There is a standard config for this, but it requires moving elasticsearch data to /srv. This brings us closer to the standard used on most servers (data stored in /srv), but further from other elasticsearch servers. The long term plan is to move all data on all servers to /srv, so let's start now. Bug: T137256 Change-Id: I440776fcfcca776cac9df5916b1d564a83cb1923 21 July 2016, 16:27:59 UTC
306f875 Prerequisites for logstash_checker use Added the logstash_checker script to the deployment hosts. Open port 9200 from the $DEPLOYMENT_HOSTS to the logstash host so that the checker script will be able to query logstash. Change-Id: Ic2b16e7e6717a95a9f236957f0506bb58d3900a8 21 July 2016, 16:19:25 UTC
4151f65 Logstash_checker script for canary deploys This patch adds a basic script that compares error rates before a deploy with those after a deploy. To do this, it queries logstash for a given host & service name, and calculates mean error / fatal rates before & after a deploy time, specified via the --delay parameter. If the error ratio exceeds a threshold (2 by default), the script returns an error. Once integrated into the deploy process, this script should prevent badly broken code from being deployed to production. Bug: T110068 Change-Id: I1a900ee1d7eadc4689e14306a2fc72ad2c138a28 21 July 2016, 16:15:53 UTC
23ba26d Disable `streaming_socket_timeout_in_ms` setting This value defaults to 0 (timeout disabled) in Cassandra 2.1, and 1-hour in Cassandra 2.2. We're now seeing stream timeouts in production so reverting this to what worked before seems like a good initial step. Bug: T134016 Change-Id: Id6cda4bc065fd4627704287d0b1f0434f621ae83 21 July 2016, 16:10:46 UTC
75fc27c RESTBase Cassandra: Lower compaction throughput to 20MB/s Bug: T140825 Change-Id: Iece422bb2bca02e15622383537d00c7f68839385 21 July 2016, 16:07:09 UTC
2a06266 Use special monitor-account creds for the rabbitmq collector Change-Id: Ifc08f7b87d7aeda8e8568527e213eaf2b8e6c59a 21 July 2016, 15:31:13 UTC
e8cd6ee Disable instance rebuild in Horizon. The nova bits of this work but it breaks puppet on the rebuilt instance. It's simple enough to delete/create that I think we can live without this. Bug: T140259 Change-Id: Ib39bac55b36a8cb0de55a21ce83d0059de62be0c 21 July 2016, 15:31:13 UTC
a871319 puppetmaster: brown paper bag fix Change-Id: Ie334ee6d1350bb2e53a70efd5f2104f40e6adb62 21 July 2016, 15:09:15 UTC
6a21638 puppetmaster: fix apache vhost syntax Change-Id: Ifb3484b756a305d47c788087720cb34cc499be89 21 July 2016, 15:03:43 UTC
c98b4cc puppetmaster: Apache 2.4/jessie compatibility Change-Id: I1a4f83930df21b8744ba5d026afd15025b3b79c0 21 July 2016, 14:54:13 UTC
91b711c Revert "Change-prop: Ignore bot edits on ORES precache updates." Ic79690da726698ee785e048cbb2959c731bd017d contains a bug whereby all edits are classified as not being by bots. This causes change-prop to take a much longer time to process each message and send it to ORES, causing the backlog to pile up. This reverts commit 8b2c2f7b51c60297ac46dca9874c5502487080f4. Change-Id: I737242a65d0d6b2dfee26ef5dcc1a764347c5210 21 July 2016, 14:44:49 UTC
a08e23c shinken: Use new labs graphite URL Bug: T140976 Bug: T140899 Change-Id: Id5fba1275a467ecd10b07735d943aafc5693de4b 21 July 2016, 14:00:09 UTC
a879817 check_legal: mobile privacy reference is now explicitly https Allow https for other similarly linked external legal notice pages. Change-Id: I3f07af4b7675c8741b36d89e21299d19cc8f7f00 21 July 2016, 13:52:13 UTC
95cdd15 graphite: Move labs graphite to graphite-labs.wikimedia.org Bug: T140899 Change-Id: I6babde830181d8e613735b0088506c24b9e244a5 21 July 2016, 13:09:54 UTC
8b2c2f7 Change-prop: Ignore bot edits on ORES precache updates. ORES precaching doesn't need to be run on edits that were made by bots, so updating the rules to exclude them. Change-Id: I1898f484142d1ab9ffa432f2fb8e3611ba864e07 Depends-On: Ic79690da726698ee785e048cbb2959c731bd017d 21 July 2016, 11:15:54 UTC
dbd0690 Change-Prop: Fix error ignoring config bug An obvious bug in the error ignoring configuration. Change-Id: Ibf8a7605b10f155b0e91c4a0584f6f7bb8ad9182 21 July 2016, 11:14:53 UTC
c01e6b9 fix link to current set of cirrus search dumps The link was an absolute path on an nfs mounted directory, nonexistent on the nfs server which is also the web server. Also one more try at fixing the creation time/cutoff time comparison; bash doesn't like [ -n ... -a ... ] in the same line, apparently it does the substitution of variables in the entire condition and fails on a syntax error instead of bailing on the -n test right away. Bug: T138176 Change-Id: Ic5d62d3d4ac8306404faf5ed01d04127ee0c8e1e 21 July 2016, 10:38:02 UTC
b8f3c73 Adding rack information for new relforge servers Bug: T137256 Change-Id: I9d47876298fc4a717b797c6fb51c65666ade4169 21 July 2016, 09:36:02 UTC
93fe15a Configure new relevance forge servers * adding basic hiera configuration * changing partition scheme to align with similar elasticsearch servers * assigning servers to correct roles in site.pp Bug: T137256 Change-Id: I88e905fa4d28389195c03b0f68087ce06a9c49dd 21 July 2016, 09:16:16 UTC
de9e34b puppetmaster: fix test vhost proxy auth Change-Id: I347dd5e525c0f4da984ea000e703227de0c39bae 21 July 2016, 08:55:45 UTC
8d978f0 puppetmaster: declare NameVirtualHost where expected This will allow both puppet and puppet.test to be used correctly Change-Id: Ie3b94fa21ae56b8b3cb80e95b44d60e436afdddd 21 July 2016, 07:54:40 UTC
f6b2b7c fix up xmlstubs batch jobs setting for en wiki xml dumps Bug: T132279 Change-Id: I11b7d78dc322d1de6e626d1f1d80effdfbdccc07 21 July 2016, 06:33:30 UTC
f29545f admin: add mpany to analytics-privatedata-users,researchers Gives access to new shell user Maximilian Pany which is part of the request described on T139764. Fundraising has hired consultants to help with fundraising analytics. 'bastiononly' group for access to bastion hosts. analytics-privatedata-users and researchers as requested and suggested by ottomata access is granted to work with fundraising's banner history data. approved by MeganHernandez on ticket Bug:T140399 Change-Id: I7d19fb320db21be6d10d8d4a347654ca7787c269 20 July 2016, 21:19:55 UTC
61eeb7b Revert "Disable user creation of new VMs until we increase capacity." We have labvirt1012 and 1013 in the pool now. This reverts commit 83bda78700a5532f40c7e36220275211c466a5b0. Change-Id: I7c3da563bd4647f2277575ecfcdc618c6f1a9e4e 20 July 2016, 21:16:16 UTC
863b0ee Revert "contint: tidy Nodepool slaves config history" Puppet tidy lacks a bunch of features such as being able to do a match against the whole path. It does it solely against the file basename which is not what I wanted to do. This reverts commit 4bbfd3b9ec21e044a15d0c4e78736cac2332af2c. Bug: T126552 Change-Id: Ibf289e4f6cd77a3367f7b5ebe4a9283749b9338c 20 July 2016, 20:49:46 UTC
1ccf68e Setup CirrusSearch continuous saneitization process to run via cron Bug: T139200 Change-Id: I3b934b52b7b67726ba58c3d6c37c605b869202c2 20 July 2016, 20:41:55 UTC
51ae260 Disable 1013-c instance We're experiencing stream timeouts, so putting this on hold until that's been addressed (r/300059). Bug: T134016 Change-Id: Iae35e61398e0664fdf7ced43c8eb43d2a592465e 20 July 2016, 19:52:27 UTC
d8b0f44 grafana: Expand edit access in labs grafana This is anyone with shell access to any project on labs. Bug: T120295 Change-Id: Idc8c17f57bf80160ed908c2ce59c00b5c5a5011c 20 July 2016, 18:37:02 UTC
e959a80 ssl_ciphersuite: drop non-FS AES256 options This is slightly controversial, but I think we should move forward with it for now, and allow reverting if there's any legitimate complaint. The non-forward-secret set of ciphers are the least-secure ones, which we need to eliminate first in the long term. The AES128 and 3DES non-FS ciphers are in this list for pragmatic reasons: there are simply too many clients still connecting with them for us to remove them at this time. The AES256 options here are not pragmatic. By eliminating them, we change the nature of the "compat" non-forward-secret list. It changes from a list of "anything we can reasonably support" to "things we still *have* to support because of minority (but still significant) real client traffic". Stepping through the rationale and data on why the AES256 options aren't pragmatic: 1) Fundamentally, most agree that AES256 doesn't offer any pragmatic crypto-strength benefit over AES128 today. 2) Any software which implements AES256 would also implement AES128, which we already prefer over it for efficiency. 3) Therefore any real client choosing AES256 has actually disabled AES128 for some security policy reason, with the (questionable) rationale that more bits is stronger. However, these same clients apparently do not support basic forward secrecy. It seems ridiculous to consider oneself in a position to set non-default client cipher policy for supposed security reasons while ignoring much more important factors like forward secrecy. (Note: We do continue to support AES256 in our forward-secret cipher lists). 4) We've been gathering detailed stats on our primary clusters for all ciphersuite selections for a year now, and none of these see any significant traffic. Breaking that down by each cipher's history: AES256-GCM-SHA384: Basically no data, except for one isolated, tiny spike of connections back on 2016-02-16.
AES256-SHA256: Had a small non-ignorable population in mid-2015, but the rate abruptly fell to near-zero on 2015-10-08 and never recovered. Prior to that date, logging and sampling showed the bulk was from a single group of US Military proxies, which presumably finally got their software upgraded. It's now intermittent (often days with zero), and the long-term average rate since the dropoff has been roughly 0.005 reqs/sec. AES256-SHA: Also often goes days without stats. Its average rate over the past year is roughly 0.015 reqs/sec. Bug: T118181 Change-Id: I56cadda5211706ff7040b0e968e0ecb80d22245b 20 July 2016, 18:23:44 UTC
2b9bd89 ssl_ciphersuite: auto-downgrade to compat when necc This eliminates some of the confusing caveats about using 'mid' or 'strong'. If 'mid' or 'strong' is used on a host that can't support them properly (apache on trusty/precise), a warning is emitted via the agent and the output is auto-downgraded to 'compat'. This will let us set many of our independent services to 'mid' now and have the actual change pend on their upgrade to jessie (and remind us to get that done). Bug: T118181 Change-Id: I123172ba9e289d15d884caefb9d9b2a5e17d21d7 20 July 2016, 17:55:00 UTC
d7a0d0f service::node: add git as deployment method This is also used in role::parsoid::testing where we don't want deployments via scap/trebuchet. Bug: T90668 Change-Id: Idc1d352b6f5d4aabb499b8208114e78de35f9ab1 20 July 2016, 15:39:39 UTC
97b325b redirector: Pass along request_uri to new location as well Allows redirecting domains while preserving links Change-Id: I5dc1d63984e51f4fbc634f93070da4317234b080 20 July 2016, 15:37:33 UTC
173cafc cache: Add labs grafana behind misc varnish Bug: T120295 Change-Id: I3a5a9415a95fd2e8f1538c2a102659341526ab73 20 July 2016, 15:03:19 UTC
d1afdf0 set up proper dump monitor role and add to snapshot1007 Change-Id: Ibf89f1754c26390f675741cc3d70137a0857954f 20 July 2016, 14:20:43 UTC
b2d6869 grafana: Add and provision labs grafana role Bug: T120295 Change-Id: I24036bd3298f3e6f679d37ee9ed1a69ce39fa1ad 20 July 2016, 14:00:20 UTC
17357ee grafana: Refactor production role into base role In preparation for introducing a labs grafana role Bug: T120295 Change-Id: I362e58ebe0bfee41fa97ed989a5de9c8085a05e8 20 July 2016, 13:47:07 UTC
a1e4f74 grafana: Mark role explicitly as production Bug: T120295 Change-Id: I245e877a68540eb61fe35b37bd1c9329c8213a88 20 July 2016, 13:44:37 UTC
d74b49b grafana: Make role explicitly reference production secrets Bug: T120295 Change-Id: I77b96657edf2c97c274c2d09f745e36d40870f30 20 July 2016, 13:42:57 UTC
4425b42 prometheus: use DOMAIN_NETWORKS not INTERNAL Change-Id: I35023b932a10354ec9cc795b0802b1bce4276a6f 20 July 2016, 13:18:33 UTC
8cf2ea2 remove dump monitor from snapshot1004 Change-Id: I7f11b5941b7a01fb1908570fdcda5e40e8bc02d7 20 July 2016, 13:08:28 UTC
d80e11c tools: Add a kubernetes diamond collector Bug: T140887 Change-Id: I4822045fb8c4a127fec4106658f6cec3180048fd 20 July 2016, 13:05:31 UTC
f237e90 parsoid: move to role::parsoid for all production nodes This is part of I3f4a5 that is being split into multiple commits Bug: T90668 Change-Id: Ie97015083ba56a7acaf5d40043dc04fcd0c33f3a 20 July 2016, 12:27:23 UTC
2d5f24f Include nova mysql password in novaenv.sh Bug: T139272 Change-Id: I148e6cf1c95ec6ec9bbf6e21172c500478d52a0a 20 July 2016, 11:32:37 UTC
601cb54 add-ldap-user: Don't use sillyshell, it's silly (and doesn't exist anymore) Bug: T86668 Change-Id: I09da9e50ad06be54b7947e62171ffb10c1b3a2d7 20 July 2016, 11:31:37 UTC
eec4295 tools: Make toolschecker webservice actions non silent So we can see why they are failing Change-Id: I6cdccf64c507166e5f8e012a0270d769180e0150 20 July 2016, 11:23:22 UTC
80e385f tools: Fix webservice toolschecker check - Use a separate tool to prevent racing with kubernetes webservice check - Fail if any of the steps fail, not only if all of them fail Change-Id: I96715d90e7e569d8527ba11cbeb2aa8cc73b903d 20 July 2016, 11:18:34 UTC
f8937a4 New user for prometheus monitoring Prerequisites: * Setting a password on the non-public repo (DONE) * Setting a fake password on the public repo (DONE) Current grants given are the same as nagios. More can be added if needed. I will add it manually to a single host in order to test it first. Bug: T128185 Change-Id: I85152efcedc70f68ce345aa21b4dcce95663abd6 20 July 2016, 11:13:55 UTC
dae57fa prometheus: fix ferm::service and include node_exporter Change-Id: I17c2cbf0b80a0e32c41348e32e594905d459bf73 20 July 2016, 09:57:31 UTC
d2f8d49 Add special Cassandra compaction configs for aqs100[456] The new AQS nodes need to be loaded with data but we are facing performance issues while compacting SSTables. This is an attempt to leverage better hardware to speed up computations. Change-Id: I9e7a43af5dafabd5160b55c3ad386693d97451f1 20 July 2016, 09:32:37 UTC
7197062 nutcracker: default verbosity to 4 It looks like we're lowering verbosity to 4 across mediawiki machines anyways. Also verbosity 5 has created problems in the past with filling disks given our usage. Bug: T136078 Bug: T139786 Change-Id: Iaca2acde397ef276ab2636db9487fb8f349a0444 20 July 2016, 09:25:08 UTC
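
Commit 4151f65 in the list above describes a check that compares mean error rates before and after a deploy and fails when their ratio exceeds a threshold (2 by default). The following is a rough Python sketch of that comparison only, not the actual logstash_checker script: the function name `error_ratio_exceeded`, the list-of-counts input, and the treatment of a zero baseline are all assumptions made for illustration, and the Logstash query itself is left out.

```python
from statistics import mean


def error_ratio_exceeded(before, after, threshold=2.0):
    """Return True if the mean error rate after a deploy is more than
    `threshold` times the mean error rate before it.

    `before` and `after` are lists of error counts per sampling interval
    (hypothetical input shape; the real script queries Logstash).
    """
    mean_before = mean(before) if before else 0.0
    mean_after = mean(after) if after else 0.0
    if mean_before == 0:
        # Assumption: with no errors before the deploy, any error
        # afterwards is treated as exceeding the threshold.
        return mean_after > 0
    return (mean_after / mean_before) > threshold
```

A deploy wrapper could call this with samples taken on either side of the deploy time and abort when it returns True.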
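
Commit 2b9bd89 in the list above describes an auto-downgrade: when 'mid' or 'strong' is requested on a host that cannot support it (Apache on trusty or precise), a warning is emitted and 'compat' is used instead. The real logic lives in the Puppet code; the Python sketch below only illustrates the decision, and the function name `effective_ciphersuite`, its parameters, and the `LIMITED_RELEASES` set are hypothetical.

```python
import warnings

# Releases on which, per the commit message, Apache cannot properly
# support 'mid' or 'strong' (illustrative constant, not from the repo).
LIMITED_RELEASES = {"trusty", "precise"}


def effective_ciphersuite(requested, server, release):
    """Return the ciphersuite level to use, downgrading to 'compat'
    with a warning when the host cannot support the requested level."""
    if (
        requested in ("mid", "strong")
        and server == "apache"
        and release in LIMITED_RELEASES
    ):
        warnings.warn(
            f"{requested!r} is not supported by {server} on {release}; "
            "falling back to 'compat'"
        )
        return "compat"
    return requested
```

Returning the downgraded level rather than failing lets a service be configured as 'mid' ahead of time, with the stronger setting taking effect once the host is upgraded.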