https://github.com/wikimedia/operations-puppet

Revision Author Date Message Commit Date
2cdd1f0 graphite: hotname is a fact, qualify bug: T97251 Change-Id: I892aea3ed63ec6eb41058dcd4b9d25ab09e84ef8 03 June 2015, 19:25:37 UTC
c604caa glance: qualify vars bug: T97251 Change-Id: I7c12148e3d7a4152afe58a3e32922c9405a54516 03 June 2015, 19:21:39 UTC
c29eba0 Fix-up for I7bc734b58: use rsyslog 5 syntax for declaring template Change-Id: Ibe0db92289ab8122c3b7caf02282de62ef7d47de 03 June 2015, 19:18:30 UTC
7936de3 toollabs: Source hosts file as template, not file Change-Id: I5e07e1691125f5b31f12072da13411e258c2fc31 03 June 2015, 19:06:07 UTC
069ab26 tools: Don't include killed labsdb client /etc/hosts class Change-Id: I0a9afc244470154bf05d94388133e646ff501d1e 03 June 2015, 18:59:05 UTC
0af02a3 admin: add aklapper to bastion-only group He needs this in addition to the phab-admin group to be able to jump via bast1001 to iridium. Bug:T97642 Change-Id: I8031d4d781496f2cc51fa1b5cdcdf5f56f7817b9 03 June 2015, 18:54:58 UTC
17326d8 Merge "Revert "tools: Puppetize database aliases as host resources"" into production 03 June 2015, 18:51:54 UTC
25044c0 Revert "tools: Puppetize database aliases as host resources" The super long lines in /etc/hosts is causing problems for gridengine, causing it to fail in fun ways now and then. Since we have designate now we can actually move to actual DNS entries soon enough, so can revert this. Also sets tools-redis to be set as a template in the hosts file instead of a host {} resource This reverts commit 99ac180dc3252a9fee750b876200bbcd61265d91. Change-Id: Iff9b796ff5d03b5d27d340ec2ecfa6732d5856b2 03 June 2015, 18:49:58 UTC
594b6a9 admin: add aklapper to phabricator-admins Bug:T97642 Change-Id: I953db4ba57bd04fbc103211d063e202e14322d6e 03 June 2015, 18:40:04 UTC
c176c33 admin: create user for aklapper creates a shell account for Andre Klapper, for granting him access to the phabricator server. pending approval from greg and a confirmed SSH key Bug:T97642 Change-Id: I723b54860e853e736b392d47d020224355b4dbd7 03 June 2015, 18:24:11 UTC
e7a4a1a Allow for new labs domain schema in ENC Change-Id: I3282acb02d269841f3286a053cbc3b5193235e36 03 June 2015, 18:19:20 UTC
ab88b31 designate: Just use IP address directly for ferm rule Change-Id: Ibd2866f19bd07f07edb92e58cfcb7947dfeed24f 03 June 2015, 18:17:06 UTC
39c0925 designate: Use ferm's resolve function than puppet ipresolve ipresolve seems a lot more flaky Bug: T101281 Change-Id: I919e6ec7d11ce13cbda9a94d9c1e75b4ca0bdc0f 03 June 2015, 18:13:55 UTC
601720b dnsrecursor: Consistently order aliases Bug: T101281 Change-Id: Idba2c4872050e041830086752c86340e83102fb8 03 June 2015, 17:45:02 UTC
da80053 Merge "gitblit: don't install ssl cert anymore" into production 03 June 2015, 17:42:25 UTC
05d30f2 gitblit: don't install ssl cert anymore This has been moved behind misc-web and the certificate has just been deleted in Iddf579788cec471faf but puppet still tried to install it, leading to puppet fail on antimony. Bug:T100827 Change-Id: I0041f9e0f8f7e73c0eeeaeddd674a8d47c3b9581 03 June 2015, 17:34:42 UTC
5cadf90 Merge "mailman monitoring: adjusting thresholds" into production 03 June 2015, 17:31:15 UTC
ddef015 mailman monitoring: adjusting thresholds We did not have criticals anymore after recent adjustments but 2 soft state warnings were left about KBytes_Written/Sec around 6k. (T84150) Change-Id: Id8b68f29b20a875a042110c2244b602f94b6d00a 03 June 2015, 17:30:06 UTC
b5f9184 git.w.o cert deletion - exists behind misc-web the old git.wikimedia.org certificate is no longer used, as the service lives behind misc-web, and thus uses the wildcard wikimedia.org certificate T100827 Change-Id: Iddf579788cec471faf3cb14b2ac8bd843a2b8bb9 03 June 2015, 17:25:59 UTC
7b6dab4 Fix-up for I7bc734b58: remove spurious '--file' arg to logger Change-Id: I811a336abc91574a3ecc49377317e26d5299ff20 03 June 2015, 16:58:52 UTC
9ee5b1d memkeys-snapshot: forward to fluorine for aggregation * Tee the output of memkeys-snapshot to a file and to syslog, tagged with 'memcached-keys'. Configure rsyslog to forward log records with that tag to fluorine. * Change the run time from 20s to 10s. That's plenty of data. Change-Id: I7bc734b583de540e92329560e3d33c32bf4a3f49 03 June 2015, 16:53:13 UTC
dcac665 strongswan: fqdn is a fact, qualify bug: T97251 Change-Id: I4a9ae87fee24be2665af0805605048e986e0b9b9 03 June 2015, 16:33:43 UTC
703de8c interface: access template variable via '@' Puppet emits a deprecation warning when rendering this template. Change-Id: I7573ea9f2f1b1ae005f49dd59491f5b6346f5238 03 June 2015, 16:31:46 UTC
e5ac9e3 Merge "apache: qualify vars" into production 03 June 2015, 16:28:22 UTC
5b0e925 Merge "install-server: Force VMs to power off after installation" into production 03 June 2015, 16:09:08 UTC
6b64db9 install-server: Force VMs to power off after installation Right now ganeti does not support the once parameter that recent QEMU versions support. To overcome this we shutdown VMs after a succesful installation. Ganeti will autoreboot them anyway with the new boot_order configuration, provided of course the configuration has been changed. This need to be documented in wikitech Change-Id: I40ad37f434d38481de407e66317c5dd405181b8f 03 June 2015, 15:44:29 UTC
2e2823e Merge "ganglia: qualify vars" into production 03 June 2015, 14:56:51 UTC
e5015aa Merge "statistics: qualify vars" into production 03 June 2015, 14:53:28 UTC
6079e80 Merge "nove: qualify vars" into production 03 June 2015, 14:48:38 UTC
576559c nove: qualify vars Change-Id: I88aa6ab38a02616ca19c1cfbf1078abca7e4084f 03 June 2015, 13:59:15 UTC
64f7931 ganglia: qualify vars Change-Id: I2cc29cbd7145e90d31f74ca86ed6c73924a6d586 03 June 2015, 13:52:35 UTC
91add59 apache: qualify vars Change-Id: I1a530a8c951d57133f0db29ffa7bd0d71f5039bf 03 June 2015, 13:06:13 UTC
2ca883e site: remove authdns::server from rubidium rubidium can now be returned back to spares. Change-Id: Id24559ba7ffda7b3a1e71965b00e0819d8d5be9f 03 June 2015, 12:53:08 UTC
9891805 statistics: qualify vars Change-Id: I00de55c4b929ec6bfc8aa2496a4f57d1ac1183ea 03 June 2015, 12:40:22 UTC
790c793 site: add authdns::server role to radon Soon to replace rubidium. Change-Id: I9ecbb4951b0147b15ba56d009ce94d8d2b4042aa 03 June 2015, 12:25:17 UTC
880eff5 site: Set up radon (move to public, jessie) (also cleanup some old MySQL GRANTs from its previous allocation) Unallocated/has no purpose, for now. Change-Id: If94209359bea4fb946e69a7d130650cd6b2ba927 03 June 2015, 12:07:39 UTC
44ca33d install-server: Accomodate virtualization Override grub-installer/bootdev and partman-auto/disk in virtual.cfg. Using the jessie+ supported default argument for the former, see https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=759737 and /dev/vda for the latter. VMs are not expected to have complex partitioning schemes like all physical hardware Change-Id: I7a50cfabb74a1431225c4e6ac1e441eb95563204 03 June 2015, 08:45:43 UTC
d615520 Log a 20s sample of memcached usage to a file once a day Run memkeys once a day, for 20 seconds, at a random minute / hour on each host, logging the output to a file. This way, whenever we spot a spike in memcached usage, we can diff logs to isolate the culprits. Change-Id: Ic1c8f10ff6f934d46b84cb779577441cfd5e1c4f 03 June 2015, 07:18:14 UTC
2f2261c Enable buffer pool load at start and dump at stop Bug: T101009 Change-Id: Ieb8fada4f3aede083979c5fe719370034efb6566 03 June 2015, 06:24:30 UTC
6e49aa2 Revert "Add basic alerts on RESTBase error rates and storage latencies" I prematurely merged this, it broke icinga config reload and it should live in its own class too. This reverts commit 5d04bd78d2263d9de33092b8e2c549968001a9c8. Change-Id: If38a4375e7cafa9468b8c4fef757e59e4f670b24 02 June 2015, 23:57:55 UTC
5d04bd7 Add basic alerts on RESTBase error rates and storage latencies Bug: T78514 Change-Id: Ic4b083cfd49436013b7958e9424c8ff131095726 02 June 2015, 23:15:30 UTC
682f26f HHVM: set light_process_count to 5, light_process_file_prefix to /tmp/hhvm. * /tmp/hhvm is managed by the Upstart script and is guaranteed to exist and to be writable by the HHVM user, so it's a good place for light process socket count. Change-Id: I23ac1a3bbe50f9da4607c172fa6a35f42c57c700 02 June 2015, 22:17:34 UTC
19bf816 Merge "Fix 0a5165b by picking a color alias graphite accepts" into production 02 June 2015, 20:15:24 UTC
ff3da1f Merge "Apply ::varnish::logging::statsd on all varnishes, not just bits" into production 02 June 2015, 20:15:17 UTC
3c844e7 Fix 0a5165b by picking a color alias graphite accepts Change-Id: I683a9973a59df84e7a208c53d4f16ba06db8fa7d 02 June 2015, 20:14:32 UTC
43f7327 Merge "tendril: use SSLCertificateChainFile" into production 02 June 2015, 20:02:12 UTC
abffacd Merge "replace tendril.wikimedia.org's sha1 cert with sha256" into production 02 June 2015, 19:59:27 UTC
1df0dab Merge "Adding dhcp entries for restbase1007-9" into production 02 June 2015, 19:52:17 UTC
514bc5f replace tendril.wikimedia.org's sha1 cert with sha256 this patchset replaces the tendril.wikimedia.org certificate file in our repo with a new one using sha256 separately Ifcf8e63ff7bc8068ab8a33 will change the Apache config to use the chained file Bug:T100835 Change-Id: I57bf486a1e974f2f545d128728b0fbbd89650e8c 02 June 2015, 19:51:38 UTC
0ef2ee9 Merge "Add job ack rates to gdash" into production 02 June 2015, 19:48:40 UTC
5870d99 Adding dhcp entries for restbase1007-9 Change-Id: I5218764b9dddc0dd098a52338b83c38818ba545b 02 June 2015, 19:47:43 UTC
597aa36 Merge "Allow the labs recursor to respond to all wmf networks." into production 02 June 2015, 19:38:40 UTC
0a5165b Add job ack rates to gdash Change-Id: I20127750019e5b1cfc788ff786c31ad37bb201c3 02 June 2015, 19:35:42 UTC
692fbc1 HHVM canaries: set light_process_count to 5 HHVM's LightProcess feature makes it faster to fork subprocesses (when shelling out). This patch sets hhvm.server.light_process_count to 5 on the canary app servers. Change-Id: Iba76a84c2a60f36a6d063d6c803ecc4d0bcee39a 02 June 2015, 19:31:11 UTC
a3756d8 Apply ::varnish::logging::statsd on all varnishes, not just bits Change-Id: Icd84ea3599f56c9fb356f2764130cbcc9972cfb4 02 June 2015, 19:29:13 UTC
a937e78 icinga apache config: Add back missing SSLCertificateFile 4fb3420cce caused this: you don't remove the certfile directive when adding the chained directive, apparently. Change-Id: I3cd452c984c7007d4cb0d3c6138672e1ce24423f 02 June 2015, 19:26:15 UTC
d99f22f HHVM APC: enable item expiration HHVM won't ever purge expired keys from APC unless Server.APC.ExpireOnSets=true. It is set to false by default. When true, HHVM will iterate on expired keys and evict them from APC once every 4096 APC writes by default. Change-Id: I87047dac97cb3bed0dfffd714b18be2b941f4de3 02 June 2015, 19:25:51 UTC
65d7ca3 tendril: use SSLCertificateChainFile use the chained file directly instead of specifying SSLCACertificateFile Change-Id: Ifcf8e63ff7bc8068ab8a3300e44a562b1e81f42e 02 June 2015, 19:24:12 UTC
4fb3420 icinga.wikimedia.org cert sha1 to sha256 + chained file usage Change-Id: I1c873cab17838103c61428b8995fef770d5efb22 02 June 2015, 19:13:45 UTC
5de543c Merge "pywikipedia->pywikibot in mailman" into production 02 June 2015, 18:41:51 UTC
422e7e5 Merge "do not set ca parameter with install_certificate" into production 02 June 2015, 18:31:04 UTC
242f147 do not set ca parameter with install_certificate Change-Id: Ibcde3b9b9c6ee6cb8b1437bb8f9f3d15e8704f66 02 June 2015, 18:28:53 UTC
f1577ad Merge "Revert "icinga.wikimedia.org cert sha1 to sha256"" into production 02 June 2015, 18:21:00 UTC
9ababce Revert "icinga.wikimedia.org cert sha1 to sha256" This reverts commit cce614853a30daae1e3645cc72277784439d9604. Change-Id: I2879531513fe7f22abee8eff9526389024859a24 02 June 2015, 18:20:09 UTC
da27ea9 gdash: fix graphite dashboard colors Change-Id: I84a4f636f1c021d853e8336d74580078b51f1a28 02 June 2015, 18:08:05 UTC
2a74c5c varnishstatsd: don't report stats for bogus HTTP status codes Change-Id: Icbdf0d499e352cbacb16d1894769cf4c5670cd4d 02 June 2015, 18:00:17 UTC
0a90e5d Merge "icinga.wikimedia.org cert sha1 to sha256" into production 02 June 2015, 17:47:05 UTC
55e5ed6 Apply role::cache::statsd on bits * Don't include varnishstatsd in kafka role. The repurposed role::cache::statsd is no longer directly related to kafka, so including it there is misleading. * As part of a gradual general rollout, apply role::cache::statsd on bits. Change-Id: I7ffd74997540a809bf7b3cdfa255ba62da5d5509 02 June 2015, 17:34:45 UTC
cce6148 icinga.wikimedia.org cert sha1 to sha256 DO NOT MERGE THIS PATCHSET without updating it with the associated configuration changes for icinga's use, or you will break icinga https. (dzahn) - changed intermediate cert to RapidSSL_SHA256_CA_-_G3.pem per Robh (dzahn) - do _not_ set the intermediate anymore since now this happens automagically Bug:T100830 Change-Id: Ic8baf0f715da4793ef64525edca2e5fe893f3f14 02 June 2015, 17:16:24 UTC
aa4ac5a Merge "dnsrecursor: ensure => 'present' rather than 'latest'" into production 02 June 2015, 17:14:53 UTC
acfeeef Deployment group for trebuchet Bug: T97775 Change-Id: I4486fd5e69a4cbca1a02817f59a12546ad6c9e4b 02 June 2015, 17:05:56 UTC
19468c9 lists.wikimedia.org certificate sha1 to sha256 This replaces the old SHA1 certificate with a new SHA256 certificate. It also defines the install_certificate parameter for the mailing list role, which seems to have been missing previously. (So the SHA1 cert was installed via another no longer existing declaration, or manually.) SPECIAL NOTE: This cannot merge until the planned maintainance, as the merging opsen will have to remove the chained certificate and have it regenerate on puppet run. Bug:T100832 Change-Id: I7290984d67f9b0b35a7c000e6c02c5737df1bf78 02 June 2015, 17:02:35 UTC
92cac6c Merge "Improve static-bugzilla frontpage (mention Phabricator etc.)" into production 02 June 2015, 16:55:52 UTC
dbdc234 dynamicproxy: Hardcode resolver URL to new designate server Change-Id: I6d23a545c7b92f92433bc4b91e2c85b82aa61056 02 June 2015, 15:48:44 UTC
21dda3a Merge "Remove call to onInstanceActionCompletion.php after signing" into production 02 June 2015, 15:15:49 UTC
1b5a9cd ores: Specify protocol explicitly for nginx backend Change-Id: I26c306a8cf314eb1abbcd1e21727756215472c59 02 June 2015, 15:10:36 UTC
f6dbdf6 Improve static-bugzilla frontpage (mention Phabricator etc.) Change-Id: I518333a4a0b4a768884f781f16b42956b94b5405 02 June 2015, 14:43:23 UTC
3e86d31 sslcert: switch all install_certificate to x509-bundle Change-Id: Ia46eae36b4d6a037275f2366e34ac63448b69593 02 June 2015, 02:23:51 UTC
ee50f75 bugfix for 44c3d7201 Change-Id: I94b632e820b9619b4897058112349801a2eba8ad 02 June 2015, 01:27:48 UTC
44c3d72 add DigiCertSHA2HighAssuranceServerCA Intermediate cert, used by stream.wm.o cert Change-Id: Ie59f89b3c8550576f466519da62062086215dce6 02 June 2015, 01:24:20 UTC
1ae18e7 sslcert: Deploy new x509-bundle script + test output This is a breakup of Faidon's work so I can test it before affecting all install_certificate users. For a given chained cert produced as: /etc/ssl/localcerts/foo.chained.crt This creates a test of the new method with output at: /tmp/foo.chained.crt.x509b-test This can be validated on the hosts and diffs compared, etc, before proceeding with redefining the primary chained.crt file. Change-Id: I9a3da71bc3d0f3cb22b7183bc050970ace581112 02 June 2015, 00:18:17 UTC
90c011a Revert "Add link in gitblit for phabricator" This reverts commit 7c2b3b466aa0a237d36e08cf02a5ebff045ceef1. Change-Id: I7eac6c19ac3b6dc4d11b70725ce1b16434ee6fce 01 June 2015, 23:56:14 UTC
19e9523 Merge "Add link in gitblit for phabricator" into production 01 June 2015, 23:32:38 UTC
4b62682 Turn off recursing for labs pdns/mysql/designate. We have a proper recursor that labs instances point to now. Using the recursor within pdns auth is discouraged in the docs, and we don't need it. Change-Id: I8094dfcf44bdd6336c853759a43ed42e115fbc1b 01 June 2015, 23:25:16 UTC
7c2b3b4 Add link in gitblit for phabricator * Add link in Bug for phabricator in gitblit since it was showing just text no link. Change-Id: Iaf0b8c7b7ff302facba4ebd63ff610e832365783 01 June 2015, 23:22:04 UTC
a06257b Merge "Use the labs dns server for reverse-dns in the labs range" into production 01 June 2015, 23:20:37 UTC
acfc915 Merge "mailman: adjust monitoring thresholds" into production 01 June 2015, 23:17:31 UTC
7af4c4f mailman: adjust monitoring thresholds Bug:T84150 Change-Id: Id8d54024ea64c39a8dc47a7bb5328bacd2fbc8c3 01 June 2015, 23:15:31 UTC
8838317 Don't hardcode dns listening IPs. Change-Id: I47fb18286f0e485433c079d1071f7404c7dd21aa 01 June 2015, 23:03:43 UTC
38d8315 Revert "Feed the puppet host IP directly to dnsmasq." This reverts commit 7e628093eb5b1a7c09c447dea82d6a9f332d9a19. Bug T100317 Change-Id: I5a8312dc7a31a7fdd41e78c5fbb9ac1853842564 01 June 2015, 22:58:15 UTC
f05c139 Allow .s in salt-key names for labs. Now that we have foo.project.eqiad.wmflabs names, we need to accept dots. Note that dots were already accepted for puppet certs, failing to accept them in salt was an oversight. Change-Id: I76d69789b760c5a606e3899c1b4892d61dde610e 01 June 2015, 22:16:15 UTC
1f11ad6 Merge "Fix a typo that was causing the puppetsigner to crash." into production 01 June 2015, 21:47:27 UTC
b23af40 Add all Release-Engineering team as Gerrit admins Gerrit is maintained by the Release-Engineering team. Bug: T100565 Change-Id: Ic5f16846e08440b50e2affb438a8a8f0b8a83af5 01 June 2015, 20:32:13 UTC
ef82b2a strongswan module: don't install ipsec-tools we don't need it, it recently had a vulnerability, and it's unmaintained. Change-Id: Ic2691a136cd4ffed5dec092af7f9c55bfb90900a 01 June 2015, 17:37:14 UTC
4dc955c Allow the labs recursor to respond to all wmf networks. This should fix monitoring without causing any trouble. Change-Id: I71b0d27061bd3d24d9262a51b415f9f1f904301d 01 June 2015, 17:34:19 UTC
6b8e123 Merge "[English Planet] Add Bluerasberry, Nimish Gautam" into production 01 June 2015, 17:29:36 UTC
8f002d5 Merge "people.wikimedia.org: HTTPS only" into production 01 June 2015, 17:15:58 UTC
4ac0432 Merge "puppetmaster: Explicitly include config class before using it" into production 01 June 2015, 16:58:22 UTC
8ba18f6 puppetmaster: Explicitly include config class before using it Change-Id: If88d97b396bc30a05b9fe5c52b93658b01096820 01 June 2015, 16:57:09 UTC
a6dfeb0 Merge "Allow a lax ssh policy on labs controllers." into production 01 June 2015, 16:51:46 UTC