https://github.com/elastic/elasticsearch
Revision 12bbe28649e3313d47f7d77a571918a40e66f700 authored by Boaz Leskes on 17 April 2014, 09:12:36 UTC, committed by Boaz Leskes on 18 April 2014, 16:56:08 UTC
When a replication operation (index/delete/update) fails to execute properly, we fail the replica and allow the master to allocate a new copy of it. At the moment, the node hosting the primary shard is responsible for notifying the master of a failed replica. However, if the replica shard is initializing (`POST_RECOVERY` state), we have a race condition between the failed shard message and moving the shard into the `STARTED` state. If the latter happens first, the master will fail to resolve the failed shard message.

This commit builds on #5800 and fails the engine of the replica shard if a replication operation fails. This protects us against the above, as the shard will reject the `STARTED` command from the master. It also makes us more resilient to other race conditions in this area.
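
The ordering problem described above can be illustrated with a minimal sketch. This is not the code from this commit: `ReplicaShardSketch`, `failLocally` and `onStartedCommand` are hypothetical names, and the state machine is reduced to three states. It only shows why marking the shard as failed on the replica node makes the outcome independent of which message arrives first.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical, simplified model of a replica shard's lifecycle.
// Names are illustrative and do not mirror Elasticsearch's classes.
class ReplicaShardSketch {
    enum State { POST_RECOVERY, STARTED, FAILED }

    private final AtomicReference<State> state =
            new AtomicReference<>(State.POST_RECOVERY);

    // Invoked on the replica node itself when a replicated operation fails.
    // The shard is marked failed regardless of its current state.
    void failLocally() {
        state.set(State.FAILED);
    }

    // Invoked when the master's STARTED command arrives.
    // Returns false (command rejected) unless the shard is still in POST_RECOVERY.
    boolean onStartedCommand() {
        return state.compareAndSet(State.POST_RECOVERY, State.STARTED);
    }

    State state() {
        return state.get();
    }

    public static void main(String[] args) {
        // Order 1: the failure is recorded first, so STARTED is rejected.
        ReplicaShardSketch a = new ReplicaShardSketch();
        a.failLocally();
        System.out.println("STARTED accepted: " + a.onStartedCommand());
        System.out.println("final state: " + a.state());

        // Order 2: STARTED arrives first, the later local failure still wins.
        ReplicaShardSketch b = new ReplicaShardSketch();
        b.onStartedCommand();
        b.failLocally();
        System.out.println("final state: " + b.state());
    }
}
```

In both orderings the sketch ends in `FAILED`, which is the property the commit message relies on: the replica no longer depends on the master resolving the failed shard message before the shard is started.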

Closes #5847
1 parent b6515e2
Fail replica shards locally upon failures
File Mode Size
.settings
bin
config
dev-tools
docs
lib
rest-api-spec
src
.gitignore -rw-r--r-- 816 bytes
.travis.yml -rw-r--r-- 145 bytes
CONTRIBUTING.md -rw-r--r-- 6.1 KB
LICENSE.txt -rw-r--r-- 11.1 KB
NOTICE.txt -rw-r--r-- 150 bytes
README.textile -rw-r--r-- 8.2 KB
TESTING.asciidoc -rw-r--r-- 6.9 KB
core-signatures.txt -rw-r--r-- 2.6 KB
pom.xml -rw-r--r-- 65.8 KB