- Elastic Cloud Enterprise - Elastic Cloud on your Infrastructure: other versions:
- What is Elastic Cloud Enterprise?
- Getting started
- Planning your installation
- Preparing your environment
- Installing Elastic Cloud Enterprise
- Configuring your installation
- Securing your installation
- Monitoring your installation
- Administering your installation
- Getting started with deployments
- Administering deployments
- Change your deployment configuration
- Stop routing requests or pause nodes
- Stop a deployment
- Restart a deployment
- Delete a deployment
- Access the Elasticsearch API
- Work with snapshots
- Upgrade versions
- Editing your user settings
- Configure Beats and Logstash with Cloud ID
- Keep your clusters healthy
- Secure your clusters
- Reset the password for the
elastic
user - Manage users and roles in X-Pack
- Manage users and roles in Shield
- Configure the Java Transport client
- Filter IP traffic
- Secure your settings
- Secure your 7.x clusters with LDAP or Active Directory
- Secure your 5.x and 6.x clusters with LDAP or Active Directory
- Secure your clusters with SAML
- Reset the password for the
- Manage your Kibana instance
- Manage your APM Server
- Enable Monitoring (formerly Marvel)
- Enable Graph (versions before 5.0)
- Connect to your cluster
- Enable cross-cluster search
- Troubleshooting
- RESTful API
- Using the API
- API examples
- A first API call: What deployments are there?
- Create a first deployment: Just an Elasticsearch cluster
- Applying a new plan: Resize and add high availability
- Applying a new plan: Checking on progress
- Applying a new deployment configuration: Upgrade
- Enable more stack features: Add Kibana to a deployment
- Dipping a toe into platform automation: Generate a roles token
- Customize your deployment
- Remove unwanted deployment templates and instance configurations
- Secure your settings
- API reference
- Authentication
- Clusters - Apm - CRUD
- Clusters - Apm - CRUD - Configuration
- Clusters - Apm - Commands
- Search clusters
- Restart cluster
- Resynchronize cluster
- Shut down cluster
- Upgrade cluster
- Move instances (advanced)
- Start all instances
- Stop all instances
- Start maintenance mode all instances
- Stop maintenance mode all instances
- Move instances
- Start instances
- Stop instances
- Start maintenance mode
- Stop maintenance mode
- Clusters - Elasticsearch - CRUD
- Clusters - Elasticsearch - CRUD - Configuration
- Get cross-cluster search clusters
- Get remote clusters for cross-cluster search
- Set remote clusters for cross-cluster search
- Get cluster curation settings
- Update cluster curation settings
- Set settings overrides (all instances)
- Set settings overrides
- Get settings from this cluster’s keystore
- Add or remove settings from the cluster keystore
- Set cluster name
- Get cluster metadata
- Set cluster metadata
- Get cluster metadata settings
- Update cluster metadata settings
- Cancel monitoring
- Set monitoring
- Get plan
- Update plan
- Migrate plan
- Get plan activity
- Cancel pending plan
- Get pending plan
- Set legacy security settings
- Get cluster snapshot settings
- Update cluster snapshot settings
- Clusters - Elasticsearch - Commands
- Search clusters
- Restart cluster
- Resynchronize cluster
- Shut down cluster
- Take snapshot
- Move instances (advanced)
- Start all instances
- Stop all instances
- Start maintenance mode all instances
- Stop maintenance mode all instances
- Move instances
- Start instances
- Stop instances
- Start maintenance mode
- Stop maintenance mode
- Clusters - Elasticsearch - Proxy
- Clusters - Elasticsearch - Support
- Clusters - Kibana - CRUD
- Clusters - Kibana - CRUD - Configuration
- Clusters - Kibana - Commands
- Search clusters
- Restart cluster
- Resynchronize cluster
- Shut down cluster
- Upgrade cluster
- Move instances (advanced)
- Start all instances
- Stop all instances
- Start maintenance mode all instances
- Stop maintenance mode all instances
- Move instances
- Start instances
- Stop instances
- Start maintenance mode
- Stop maintenance mode
- Clusters - Kibana - Proxy
- Clusters - Search
- Deployments - IP Filtering - CRUD
- Deployments - Notes
- Platform
- Platform - Allocators
- Get allocators
- Search allocators
- Delete allocator
- Get allocator
- Resynchronize allocator
- Move clusters
- Move clusters by type
- Start maintenance mode
- Stop maintenance mode
- Get allocator metadata
- Set allocator metadata
- Delete allocator metadata item
- Set allocator metadata item
- Get allocator settings
- Update allocator settings
- Set allocator settings
- Platform - Configuration - Instances - CRUD
- Platform - Configuration - Security
- Platform - Configuration - Security Deployment
- Platform - Configuration - Security Realms
- Platform - Configuration - TLS
- Platform - Constructors
- Platform - License
- Platform - Repository - CRUD
- Platform - Runners
- Platform - configuration - Store
- Platform - proxies
- Stack - Instance Types - CRUD
- Stack - Versions - CRUD
- Templates - Deployments
- Definitions
AllocatedInstancePlansInfo
AllocatedInstanceStatus
AllocatorBuildInfo
AllocatorCapacity
AllocatorCapacityMemory
AllocatorHealthStatus
AllocatorInfo
AllocatorMoveRequest
AllocatorOverview
AllocatorSettings
AllocatorZoneInfo
ApmConfiguration
ApmCrudResponse
ApmInfo
ApmPlan
ApmPlanControlConfiguration
ApmPlanInfo
ApmPlansInfo
ApmSettings
ApmSubInfo
ApmSystemSettings
ApmTopologyElement
ApmsInfo
AvailableAuthenticationMethods
BasicFailedReply
BasicFailedReplyElement
BoolQuery
CapacityConstraintsResource
ChangeSourceInfo
ClusterCommandResponse
ClusterCredentials
ClusterCrudResponse
ClusterCurationSettings
ClusterCurationSpec
ClusterInfo
ClusterInstanceConfigurationInfo
ClusterInstanceDiskInfo
ClusterInstanceInfo
ClusterInstanceMemoryInfo
ClusterLicenseInfo
ClusterMetadataCpuResourcesSettings
ClusterMetadataInfo
ClusterMetadataPortInfo
ClusterMetadataResourcesSettings
ClusterMetadataSettings
ClusterPlanMigrationResponse
ClusterPlanStepInfo
ClusterPlanStepLogMessageInfo
ClusterSnapshotRepositoryDefault
ClusterSnapshotRepositoryInfo
ClusterSnapshotRepositoryReference
ClusterSnapshotRepositoryStatic
ClusterSnapshotRequest
ClusterSnapshotResponse
ClusterSnapshotRetention
ClusterSnapshotSettings
ClusterSystemAlert
ClusterTopologyInfo
ClusterUpgradeInfo
ClustersInfo
CompatibleNodeTypesResource
CompatibleVersionResource
ConfigStoreOption
ConfigStoreOptionData
ConfigStoreOptionList
ConstructorHealthStatus
ConstructorInfo
ConstructorOverview
CreateApmInCreateElasticsearchRequest
CreateApmRequest
CreateElasticsearchClusterRequest
CreateKibanaInCreateElasticsearchRequest
CreateKibanaRequest
CrossClusterSearchClusters
CrossClusterSearchInfo
CrossClusterSearchSettings
DeploymentTemplateInfo
DeploymentTemplateReference
DiscreteSizes
ElasticsearchClusterBlockingIssueElement
ElasticsearchClusterBlockingIssues
ElasticsearchClusterInfo
ElasticsearchClusterInstanceSettingsOverrides
ElasticsearchClusterPlan
ElasticsearchClusterPlanInfo
ElasticsearchClusterPlansInfo
ElasticsearchClusterRole
ElasticsearchClusterSecurityInfo
ElasticsearchClusterSettings
ElasticsearchClusterTopologyElement
ElasticsearchClusterUser
ElasticsearchClustersInfo
ElasticsearchConfiguration
ElasticsearchCuration
ElasticsearchInfo
ElasticsearchMasterElement
ElasticsearchMasterInfo
ElasticsearchMonitoringInfo
ElasticsearchNodeType
ElasticsearchPlanControlConfiguration
ElasticsearchReplicaElement
ElasticsearchScriptTypeSettings
ElasticsearchScriptingUserSettings
ElasticsearchShardElement
ElasticsearchShardsInfo
ElasticsearchSystemSettings
ElasticsearchUserBundle
ElasticsearchUserPlugin
ElevatePermissionsRequest
EmptyResponse
EnrollmentTokenRequest
ExistsQuery
ExternalHyperlink
FilterAssociation
GrowShrinkStrategyConfig
Hyperlink
IdResponse
InstanceConfiguration
InstanceMoveRequest
InstanceTypeResource
IpFilterRule
IpFilterRuleset
IpFilterRulesets
IpFilteringSettings
KeystoreContents
KeystoreSecret
KibanaClusterInfo
KibanaClusterPlan
KibanaClusterPlanInfo
KibanaClusterPlansInfo
KibanaClusterSettings
KibanaClusterTopologyElement
KibanaClustersInfo
KibanaConfiguration
KibanaPlanControlConfiguration
KibanaSubClusterInfo
KibanaSystemSettings
LdapGroupSearch
LdapSecurityRealmLoadBalance
LdapSecurityRealmRoleMappingRule
LdapSecurityRealmRoleMappingRules
LdapSettings
LdapUserSearch
LegacySecuritySettings
LicenseInfo
LicenseObject
ListEnrollmentTokenElement
ListEnrollmentTokenReply
LoginRequest
LoginState
ManagedMonitoringSettings
MatchAllQuery
MatchNoneQuery
MatchQuery
MetadataItem
MetadataItemValue
MetadataItems
MoveApmClusterConfiguration
MoveApmClusterDetails
MoveClustersCommandResponse
MoveClustersDetails
MoveClustersRequest
MoveElasticsearchClusterConfiguration
MoveElasticsearchClusterDetails
MoveKibanaClusterConfiguration
MoveKibanaClusterDetails
NestedQuery
NodeTypeResource
Note
Notes
PlanStrategy
PlatformInfo
PlatformServiceImageInfo
PlatformServiceInfo
PrefixQuery
ProxiesAllocationsInfo
ProxiesFilter
ProxiesFilteredGroup
ProxiesFilteredGroupHealth
ProxiesHealth
ProxiesHttpSettings
ProxiesSSOSettings
ProxiesSettings
ProxyAllocationCounts
ProxyAllocationInfo
ProxyInfo
ProxyOverview
QueryContainer
QueryStringQuery
RangeQuery
RemoteClusterInfo
RemoteClusterRef
RepositoryConfig
RepositoryConfigs
RequestEnrollmentTokenReply
RestoreSnapshotApiConfiguration
RestoreSnapshotConfiguration
RestoreSnapshotRepoConfiguration
RollingGrowShrinkStrategyConfig
RollingStrategyConfig
RuleSetResponse
RulesetAssociations
RunnerBuildInfo
RunnerContainerInfo
RunnerInfo
RunnerOverview
RunnerRoleInfo
RunnerRolesInfo
SamlAttributeSettings
SamlIdpSettings
SamlSecurityRealmRoleMappingRule
SamlSecurityRealmRoleMappingRules
SamlSettings
SamlSpSettings
SearchRequest
SecurityDeployment
SecurityDeploymentCreateRequest
SecurityDeploymentTopology
SecurityRealmInfo
SecurityRealmInfoList
SecurityRealmsReorderRequest
SnapshotRepositoryConfiguration
SnapshotStatusInfo
StackVersionApmConfig
StackVersionArchiveProcessingError
StackVersionArchiveProcessingResult
StackVersionConfig
StackVersionConfigPost
StackVersionConfigs
StackVersionElasticsearchConfig
StackVersionInstanceCapacityConstraint
StackVersionKibanaConfig
StackVersionMetadata
StackVersionNodeType
StackVersionTemplateFileHash
StackVersionTemplateInfo
TargetElasticsearchCluster
TermQuery
TiebreakerTopologyElement
TlsPublicCertChain
TokenResponse
TopologySize
TransientApmPlanConfiguration
TransientElasticsearchPlanConfiguration
TransientKibanaPlanConfiguration
UsageStats
- Script reference
- Release notes
- Elastic Cloud Enterprise 2.2.3
- Elastic Cloud Enterprise 2.2.2
- Elastic Cloud Enterprise 2.2.1
- Elastic Cloud Enterprise 2.2.0
- Elastic Cloud Enterprise 2.1.1
- Elastic Cloud Enterprise 2.1.0
- Elastic Cloud Enterprise 2.0.1
- Elastic Cloud Enterprise 2.0.0
- Elastic Cloud Enterprise 1.1.5
- Elastic Cloud Enterprise 1.1.4
- Elastic Cloud Enterprise 1.1.3
- Elastic Cloud Enterprise 1.1.2
- Elastic Cloud Enterprise 1.1.1
- Elastic Cloud Enterprise 1.1.0
- Elastic Cloud Enterprise 1.0.2
- Elastic Cloud Enterprise 1.0.1
- Elastic Cloud Enterprise 1.0.0
- Limitations and known problems
- What’s new with the Elastic Stack
- Elastic Stack 7.5.1 released
- Elastic Stack 7.5.0 released
- Elastic Stack 7.4.2 released
- Elastic Stack 7.4.1 released
- Elastic Stack 7.4.0 released
- Elastic Stack 7.3.2 released
- Elastic Stack 7.3.1 released
- Elastic Stack 7.3.0 released
- Elastic Stack 7.2.1 and 6.8.2 released
- Elastic Stack 7.2.0 released
- Elastic Stack 7.1.1 released
- Elastic Stack 7.1.0 released
- Elastic Stack 7.0.1 released
- Elastic Stack 7.0.0 released
- Elastic Stack 6.8.6 released
- Elastic Stack 6.8.5 released
- Elastic Stack 6.8.4 released
- Elastic Stack 6.8.3 released
- Elastic Stack 6.8.1 released
- Elastic Stack 6.8.0 released
- Elastic Stack 6.7.2 released
- Elastic Stack 6.7.1 released
- Elastic Stack 6.7.0 released
- Elastic Stack 6.6.2 released
- Elastic Stack 6.6.0 released
- Elastic Stack 6.5.4 released
- Elastic Stack 5.6.14 and 6.5.3 released
- Elastic Stack 6.5.2 released
- Elastic Stack 6.5.1 released
- Elastic Stack 6.5.0 released
- Elastic Stack 5.6.13 and 6.4.3 released
- Elastic Stack 5.6.11 and 6.4.0 released
- Elastic Stack 6.3.2 released
- Elastic Stack 6.3.1 released
- Known issue affecting 5.6.10 to 6.3.0 upgrades
- Elastic Stack 5.6.10 and 6.3.0 released
- About this product
It is time to say goodbye: This version of Elastic Cloud Enterprise has reached end-of-life (EOL) and is no longer supported.
The documentation for this version is no longer being maintained. If you are running this version, we strongly advise you to upgrade. For the latest information, see the current release documentation.
Fault tolerance (high availability)
editFault tolerance (high availability)
editFault tolerance for Elastic Cloud Enterprise is based around the concept of availability zones. An availability zone contains resources available to an Elastic Cloud Enterprise installation that are isolated from other availability zones to safeguard against potential failure. In a fault tolerant Elasticsearch cluster, the nodes of a cluster are spread across two or three availability zones to ensure that the cluster can survive the failure of an availability zone. An availability zone could be a rack, a server zone, a zone on a cloud platform such as AWS or GCP, or some other logical constraint that supports the requirement that you could lose the entire availability zone and yet your cluster would still be up and running, because other availability zones are unaffected and can handle the workload.
All hosts that you install Elastic Cloud Enterprise on belong to an availability zone. You can specify which zone a runner should belong to with the --availability-zone
parameter when you install . If you are following our deployment recommendations, the runners in your installation will end up distributed evenly across three availability zones.
When you create or change an Elasticsearch cluster, you can select between one and three availability zones, similar to what our Elastic Cloud hosted offering supports. In broad terms:
- A single availability zone is suitable for testing and development
- Two availability zones are suitable for production use (with a tiebreaker)
- Three availability zones are great for mission critical environments
Elasticsearch nodes are started in each of the availability zones when your cluster is provisioned. When deploying clusters in multiple availability zones, shard allocation awareness ensures that primary shards and their replica shards are spread across different zones to minimize the risk of losing all shard copies at the same time.
Planning for a fault-tolerant installation with multiple availability zones means avoiding any single point of failure that could bring down Elastic Cloud Enterprise. For example, if you create an installation that uses a single physical rack with multiple availability zones placed on the same rack, that rack becomes a potential single point of failure. If the rack suffers an outage or a hardware failure, your entire cluster could be forced offline. To make such an installation more fault tolerant, you should create availability zones that are not dependent on the same physical rack. Similarly, you need to plan for sufficient capacity when an availability zone goes down: If your cluster is barely keeping up with its workload when all availability zones are up, the loss of one availability zone will likely cause the remaining Elasticsearch nodes in your cluster to be overwhelmed.
The main difference between Elastic Cloud Enterprise installations that include two or three availability zones is that three availability zones enable Elastic Cloud Enterprise to create clusters with a tiebreaker. If you have only two availability zones in total in your installation, no tiebreaker is created.
Tiebreakers are used in distributed clusters to avoid cases of split brain, where a cluster splits into multiple, autonomous parts that continue to handle requests independently of each other, at the risk of affecting cluster consistency and data loss. A split-brain scenario is avoided by making sure that a minimum number of master-eligible nodes must be present in order for any part of the cluster to elect a master node and accept user requests. To prevent multiple parts of a cluster from being eligible, there must be a quorum-based majority of (n/2)+1 nodes, where n is the number of nodes in the cluster. The minimum number of master nodes to reach quorum in a two-node cluster is the same as for a three-node cluster: two nodes must be available.
If you have only two availability zones in total in your installation, Elastic Cloud Enterprise disables the creation of a cluster with two availability zones. The reason is simple: If you could create a cluster with nodes in both of these availability zones, there would be nowhere reliable to put the tiebreaker. The tiebreaker would have a 50:50 chance of being part of the surviving availability zone in case of a zone failure, which is not enough to guarantee a quorum for every possible failure. If the availability zone that fails happens to contain the tiebreaker, the remaining availability zone would control less than the required (n/2)+1 nodes in the cluster. Without a quorum, the remaining nodes would not be able to process user requests. Effectively, such clusters provide little or no advantage in fault tolerance over clusters in a single availability zone which is why we disable their creation.
When you create a cluster with nodes in two availability zones when a third zone is available, Elastic Cloud Enterprise can create a tiebreaker in the third availability zone to help establish quorum in case of loss of an availability zone. The extra tiebreaker node that helps to provide quorum does not have to be a full-fledged and expensive node, as it does not hold data. For example: By tagging allocators hosts in Elastic Cloud Enterprise, can you create a cluster with eight nodes each in zones ece-1a
and ece-1b
, for a total of 16 nodes, and one tiebreaker node in zone ece-1c
.This cluster can lose any of the three availability zones whilst maintaining quorum, which means that the cluster can continue to process user requests, provided that there is sufficient capacity available when an availability zone goes down.
By default, each node in an Elasticsearch cluster is a master-eligible node and a data node. In larger clusters, such as production clusters, it’s a good practice to split the roles, so that master nodes are not handling search or indexing work. When you create a cluster, you can specify to use dedicated master-eligible nodes, one per availability zone.