Processing and performance
This documentation refers to the standalone (legacy) method of running APM Server. This method of running APM Server will be deprecated and removed in a future release. Please consider upgrading to the Elastic APM integration. If you’ve already upgraded, please see Processing and performance instead.
APM Server performance depends on a number of factors: memory and CPU available, network latency, transaction sizes, workload patterns, agent and server settings, versions, and protocol.
Let’s look at a simple example that makes the following assumptions:
- The load is generated in the same region as where APM Server and Elasticsearch are deployed.
- We’re using the default settings in Elastic Cloud.
- A small number of agents are reporting.
This leaves us with relevant variables like payload and instance sizes. See the table below for approximations. As a reminder, events are transactions and spans.
| Transaction/Instance | 512 MB Instance | 2 GB Instance | 8 GB Instance |
|---|---|---|---|
| Small transactions (5 spans with 5 stack frames each) | 600 events/second | 1200 events/second | 4800 events/second |
| Medium transactions (15 spans with 15 stack frames each) | 300 events/second | 600 events/second | 2400 events/second |
| Large transactions (30 spans with 30 stack frames each) | 150 events/second | 300 events/second | 1400 events/second |
In other words, a 512 MB instance can process ~3 MB per second, while an 8 GB instance can process ~20 MB per second.
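As a rough cross-check of these figures, you can divide the stated throughput by the event rate to get an implied payload size per event. This is only back-of-the-envelope arithmetic based on the numbers above, not a measured per-event size:

```python
# Rough capacity arithmetic derived from the table above.
# Assumes MB/s throughput divides evenly across events/second,
# which is an approximation for planning purposes only.

def implied_event_size_kb(mb_per_second: float, events_per_second: int) -> float:
    """Approximate payload size per event, in KB."""
    return mb_per_second * 1024 / events_per_second

# 512 MB instance, small transactions: ~3 MB/s at 600 events/s
print(f"{implied_event_size_kb(3, 600):.1f} KB per event")    # ~5.1 KB
# 8 GB instance, small transactions: ~20 MB/s at 4800 events/s
print(f"{implied_event_size_kb(20, 4800):.1f} KB per event")  # ~4.3 KB
```

Both instance sizes land in the same ballpark of a few KB per event, which is consistent with throughput being dominated by event volume rather than instance size.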
APM Server is CPU bound, so it scales better from 2 GB to 8 GB than it does from 512 MB to 2 GB. This is because larger instance types in Elastic Cloud come with much more computing power.
Don’t forget that APM Server is stateless: multiple instances can run without any knowledge of each other. This means that, with a properly sized Elasticsearch cluster behind it, APM Server scales out linearly.
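Because scale-out is linear, capacity planning reduces to simple division. The sketch below is a hypothetical planning helper, not an Elastic tool: the per-instance rates come from the medium-transaction row of the table above, while the target load and headroom factor are made-up inputs you would replace with your own:

```python
import math

# Per-instance capacity for medium transactions, taken from the table above.
EVENTS_PER_SECOND = {
    "512mb": 300,
    "2gb": 600,
    "8gb": 2400,
}

def instances_needed(target_eps: int, size: str, headroom: float = 0.2) -> int:
    """Number of stateless APM Server instances needed for a target
    event rate, keeping a `headroom` fraction of capacity spare."""
    usable = EVENTS_PER_SECOND[size] * (1 - headroom)
    return math.ceil(target_eps / usable)

# e.g. 5000 events/s of medium transactions on 2 GB instances,
# with 20% headroom: 5000 / 480 -> 11 instances
print(instances_needed(5000, "2gb"))
```

Because instances are independent, adding an eleventh instance later requires no coordination with the first ten.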
RUM deserves special consideration. The RUM agent runs in browsers, so many thousands of clients may report to a single APM Server, each with highly variable network latency.