Elasticsearch's native from/size pagination becomes prohibitively slow for deep pages, because every shard must collect and sort the first from + size hits, nearly all of which are then discarded. This article presents a solution from vivo's internet server team that reduces deep pagination query time from 10 minutes to 1 second. The approach was reached in three rounds of optimization: first, segment-based preheating of a Redis cache; second, nearest anchor point positioning combined with search_after; and third, a multi-level anchor cache strategy. The key insight is to cache anchor points (the sort values of documents at regular intervals, e.g., every 1000 documents), use search_after to jump to the nearest cached anchor at or before the requested page, and then scan forward to the exact page. This avoids walking the result set from the beginning while still supporting arbitrary page jumps. The article includes detailed performance benchmarks and discusses trade-offs such as cache invalidation strategies and memory considerations. For engineering teams dealing with large-scale search or log analytics, it offers a production-ready pattern that balances speed, consistency, and resource usage.
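The anchor-plus-search_after idea can be illustrated with a minimal, self-contained sketch. It is not the vivo team's implementation: the sorted Python list stands in for an Elasticsearch index with a unique sort key, the dictionary stands in for the Redis anchor cache, and the names `build_anchors`, `search_after`, `fetch_page`, and `ANCHOR_INTERVAL` are hypothetical, chosen only for illustration.

```python
from bisect import bisect_right
from typing import Dict, List, Optional

ANCHOR_INTERVAL = 1000  # hypothetical spacing, mirroring the "every 1000 documents" example


def build_anchors(sorted_keys: List[int], interval: int = ANCHOR_INTERVAL) -> Dict[int, int]:
    """Map each offset (interval, 2*interval, ...) to the sort key of the document just before it.

    Stand-in for the Redis anchor cache; keys are assumed unique and ascending.
    """
    return {off: sorted_keys[off - 1]
            for off in range(interval, len(sorted_keys) + 1, interval)}


def search_after(sorted_keys: List[int], after: Optional[int], size: int) -> List[int]:
    """Stand-in for an Elasticsearch search_after request.

    Returns up to `size` keys strictly greater than `after` (from the start if `after` is None).
    """
    start = 0 if after is None else bisect_right(sorted_keys, after)
    return sorted_keys[start:start + size]


def fetch_page(sorted_keys: List[int], anchors: Dict[int, int],
               page: int, page_size: int) -> List[int]:
    """Return the 1-based `page` by jumping to the nearest cached anchor, then scanning forward."""
    offset = (page - 1) * page_size
    # Nearest cached anchor at or before the requested offset (0 means "start of results").
    anchor_off = max((a for a in anchors if a <= offset), default=0)
    cursor = anchors.get(anchor_off)  # None when no anchor precedes the offset
    skip = offset - anchor_off
    if skip:
        # Scan forward over the documents between the anchor and the page start.
        skipped = search_after(sorted_keys, cursor, skip)
        if len(skipped) < skip:
            return []  # requested page lies beyond the end of the result set
        cursor = skipped[-1]
    return search_after(sorted_keys, cursor, page_size)
```

With a fully populated cache, the forward scan in this sketch never covers more than `ANCHOR_INTERVAL - 1` documents, regardless of how deep the requested page is. In a real deployment the sort would also need a unique tiebreaker field so that search_after positions are unambiguous, and the cached anchors would have to be invalidated or rebuilt when the underlying data changes.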
In short, the article details a three-round optimization by vivo's internet server team that enables arbitrary page jumps in Elasticsearch deep pagination and reduces query time from 10 minutes to 1 second. The solution combines search_after with a multi-level anchor cache in Redis, working around the fact that search_after on its own can only continue forward from a known sort position. It offers a practical, scalable approach for production environments that need random access to deep result sets.