Elasticsearch engineers found that removing sequence numbers from replicated shards can reduce metrics storage by up to 41%. The optimization relies on the fact that sequence numbers are needed only while operations are being replicated; once replication has completed, they no longer have to be retained. The post covers the implementation, including the changes to the indexing and replication pipeline, and presents benchmarks that show significant storage savings with no impact on query performance. For organizations running large Elasticsearch clusters, the technique offers a straightforward way to cut storage costs. It is particularly valuable for time-series and logging workloads, where storage is a major expense.
Technical deep dive on Elasticsearch storage optimization via sequence number removal.
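
To put the "up to 41%" figure in perspective, the arithmetic can be sketched as below. This is only an illustration: the function names, the 1,000 GB footprint, and the price per GB are hypothetical and do not come from the post, and 41% is the reported upper bound rather than a guaranteed result.

```python
def projected_storage_gb(current_gb: float, reduction: float = 0.41) -> float:
    """Return the storage left after applying a fractional reduction.

    The default of 0.41 is the best-case figure quoted above; real savings
    depend on the workload and may be lower.
    """
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return current_gb * (1.0 - reduction)


def monthly_savings(current_gb: float, price_per_gb_month: float,
                    reduction: float = 0.41) -> float:
    """Return the monthly cost saved for a given (hypothetical) storage price."""
    saved_gb = current_gb - projected_storage_gb(current_gb, reduction)
    return saved_gb * price_per_gb_month


if __name__ == "__main__":
    # Hypothetical cluster: 1,000 GB of metrics data at 0.10 per GB per month.
    remaining = projected_storage_gb(1000.0)
    savings = monthly_savings(1000.0, 0.10)
    print(f"remaining: {remaining:.1f} GB, saved per month: {savings:.2f}")
```

Passing a smaller `reduction` value gives a more conservative estimate for workloads that are unlikely to reach the best case.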