Elasticsearch Notes

Elasticsearch is 2 components.

You need to understand how both work;


 Index Merges

This video displays how index merges occur:
Indexing Mediawiki

Basically when you have enough segments that can be grouped they will be vacuumed and merged.


 Memory Pressure/Heap:

If you monitor the total memory used on the JVM you will typically see a sawtooth pattern where the memory usage steadily increases and then drops suddenly.


The reason for this sawtooth pattern is that the JVM continously needs to allocate memory on the heap as new objects are created as a part of the normal program execution. Most of these objects are however short lived and quickly become available for collection by the garbage collector. When the garbage collector finishes you’ll see a drop on the memory usage graph. This constant state of flux makes the current total memory usage a poor indicator of memory pressure.

The JVM garbage collector is designed such that it draws on the fact that most objects are short lived. There are separate pools in the heap for new objects and old objects and these pools are garbage collected separately. After the collection of the new objects pool, surviving objects are moved to the old objects pool, which is garbage collected less frequently. This is due to the fact that it’s less likely to be any substantial amount of garbage there. If you monitor each of these pools separately, you will see the same sawtooth pattern, but the old pool is fairly steady while the new pool frequently moves between full and empty. This is why we we have based our memory pressure indicator on the fill rate of the old pool.

This is bad, heap is too large:
Notice the gaps between teeth
If the heap is too large, the application will be prone to infrequent long latency spikes from full-heap garbage collections. Infrequent long pauses impact end-user experience as these pauses increase the tail of the latency distribution; user requests will sometimes see unacceptably-long response times. Long pauses are especially detrimental to a distributed system like Elasticsearch because a long pause is indistinguishable from a node that is unreachable because it is hung, or otherwise isolated from the cluster. During a stop-the-world pause, no Elasticsearch server code is executing: it doesn’t call, it doesn’t write, and it doesn’t send flowers. In the case of an elected master, a long garbage collection pause can cause other nodes to stop following the master and elect a new one. In the case of a data node, a long garbage collection pause can lead to the master removing the node from the cluster and reallocating the paused node’s assigned shards. This increases network traffic and disk I/O across the cluster, which hampers normal load. Long garbage collection pauses are a top issue for cluster instability.

This is bad, heap is too small:
If the heap is too small, applications will be prone to the danger of out of memory errors. While that is the most serious risk from an undersized heap, there are additional problems that can arise from a heap that is too small. A heap that is too small relative to the application’s allocation rate leads to frequent small latency spikes and reduced throughput from constant garbage collection pauses. Frequent short pauses impact end-user experience as these pauses effectively shift the latency distribution and reduce the number of operations the application can handle. For Elasticsearch, constant short pauses reduce the number of indexing operations and queries per second that can be handled. A small heap also reduces the memory available for indexing buffers, caches, and memory-hungry features like aggregations and suggesters.

Memory Pressure
Understanding HEAP
Sizing ES


 Node Types:

Client Node:
node.data: off and node.master: off

Master Node:
node.data: off and node.master: on

Data Node:
node.data: on and node.master: off

 Sizing of ES




curl -XGET localhost:9200/_cat

Explore!, you can get a subset of data by listing the headings

curl -XGET 'localhost:9200/_cat/nodeattrs?h=host,ip,attr'

And list the headings

curl -XGET 'localhost:9200/_cat/nodeattrs?v

of course there is ?help too.

Recommended software:


 Best Practices:


Now read this

Failing to monitor, dying without dignity.

Today, I’m going to tell you about the story of an obscure kernel bug, how we missed it, and how we’re still recovering from the effect I should preface this by saying that, generally, I like virtual machines. I have 5 actual servers... Continue →