On a private network, elasticsearch nodes will automatically discover peers using multicast. Nodes configured with a common cluster name will magically find each other when they boot up and form a cluster. It’s wonderful, magical, and a little scary — elasticsearch nodes will likely be the first to become sentient in a robot uprising.
On AWS and most other clouds, multicast is not allowed. (Rackspace supports broadcast and multicast.) This leaves two options: use unicast discovery and explicitly list each node in discovery.zen.ping.unicast.hosts, or use the EC2 discovery method provided by the cloud-aws plugin. The former is fairly brittle due to the dynamic nature of the cloud. The latter uses the EC2 API to enumerate hosts, essentially populating discovery.zen.ping.unicast.hosts dynamically. This guide does a great job of covering the process, so I won’t go into the details here. Instead, I will try to offer a few tips on the setup process.
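To make the two options concrete, here is a minimal sketch of what each might look like in elasticsearch.yml. The IP addresses are placeholders I made up for illustration, and you would pick one option, not both.

```yaml
# elasticsearch.yml (sketch; the addresses below are hypothetical placeholders)

# Option 1: unicast discovery with a hand-maintained host list.
# Every node must be listed, and the list must be updated as instances come and go.
discovery.zen.ping.unicast.hosts: ["10.0.0.11", "10.0.0.12:9300"]

# Option 2: EC2 discovery through the cloud-aws plugin (use instead of option 1).
# The plugin asks the EC2 API for candidate hosts at startup.
# discovery.type: ec2
```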
Versions 1.7.0 and later of the cloud-aws plugin support IAM roles. (IAM stands for Identity and Access Management.) IAM roles can be used to grant access to certain APIs and resources in a secure way. First, a role is created and associated with a set of permissions. Then, an EC2 instance is launched with that role. Applications running on the instance can make an API call to retrieve temporary credentials that grant access to the APIs and resources associated with the role. This removes the need to configure the cloud-aws plugin with a permanent access key and secret key.
As I was new to IAM and the cloud-aws plugin, it took me a while to figure out how everything fit together, how to add an IAM role, what permissions to grant, and how to configure the plugin to use the role.
If you are only using the cloud-aws plugin for EC2 discovery, you only need to grant access to EC2:Describe* (and even then, there are probably more fine-grained permissions that will still allow the plugin to function). The S3 permissions are only required if you use the S3 gateway. (My understanding is the default local gateway is preferred to the S3 gateway.)
To configure the cloud-aws plugin to use IAM roles, simply omit the access_key and secret_key from the configuration file. If the plugin detects that neither is set, it will attempt to fetch temporary credentials for the instance's IAM role from the EC2 instance metadata service. If you are using the elasticsearch chef cookbook, be sure to set node.elasticsearch.cloud, otherwise none of the other cloud-related settings will be written to the config file.
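As a rough illustration, a role-based setup might look something like the fragment below. The region value is an assumption for the example; the point is simply that the credential settings are absent.

```yaml
# elasticsearch.yml (sketch; the region is an assumed example value)
cloud:
  aws:
    region: us-east-1
    # access_key and secret_key are intentionally left out so the plugin
    # falls back to the temporary credentials of the instance's IAM role.

discovery:
  type: ec2
```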
Filter Instances by Group, AZ, or Tag
elasticsearch will try unicast discovery with every instance returned by the EC2 API. If you have a lot of instances in your AWS account, this could take a long time. Worse, if the discovery process times out, a new node may declare itself a master! Configure the plugin to search only instances in certain security groups or availability zones, or instances that have certain tags. See the plugin documentation for details.
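For example, the filters can be combined roughly as follows. The security group name, availability zones, and tag are hypothetical values chosen for illustration; check the plugin documentation for the exact behavior of each setting in your version.

```yaml
# elasticsearch.yml (sketch; group, zone, and tag values are hypothetical)
discovery:
  type: ec2
  ec2:
    # Only consider instances in this security group
    groups: elasticsearch
    # Only consider instances in these availability zones
    availability_zones: us-east-1a,us-east-1b
    # Only consider instances tagged role=elasticsearch
    tag:
      role: elasticsearch
```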
Since multicast won’t work anyway, it should be disabled so elasticsearch doesn’t even try. Set discovery.zen.ping.multicast.enabled to false.