Why is Sneller Data Storage More Efficient, Reliable and Easier to Manage than Elastic and OpenSearch?

Varun Mehta
November 27, 2022

When you store data in an Elastic or OpenSearch cluster, you are typically trying to optimize along three different axes:

  1. Performance
  2. Cost
  3. Reliability

With these legacy packages there is no good way to get optimal cost, performance and reliability in one solution. Higher performance or reliability always comes at increased cost. Conversely, lower cost results in poorer performance or reliability. As we shall see, Sneller’s architecture does not force you to make these difficult trade-offs: you get low cost, high performance and high reliability all at once.

Reliability

OpenSearch and Elastic store indices in shards, which are distributed across a compute cluster. Running with no shard redundancy is the cheapest option but also the least reliable. With one replica per shard your storage costs double, and with two replicas you pay 3x the amount for storage. Naturally, you must also take care to distribute these redundant shards across availability zones. So shard replication gives you reliability, but at increased cost.

As an alternative to shard replication, you can build a redundant standby cluster. This doubles both your compute and storage costs and increases management overhead as well.
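The cost arithmetic in this section can be written down in a few lines. The following is a minimal Python sketch, not a pricing tool: the function names are hypothetical, costs are expressed as multiples of a single-copy, single-cluster baseline, and the model assumes the standby cluster is an exact duplicate of the primary.

```python
def storage_multiplier(replicas: int) -> int:
    """Storage cost relative to a deployment with no replica shards.

    Each replica is a full extra copy of every primary shard, so the
    number of stored copies is 1 (the primary) plus the replica count.
    """
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return 1 + replicas


def standby_multipliers(replicas: int = 0) -> dict:
    """Relative compute and storage cost once a standby cluster is added.

    Assumption: the standby mirrors the primary cluster exactly, so both
    compute and storage are doubled.
    """
    return {"compute": 2, "storage": 2 * storage_multiplier(replicas)}


if __name__ == "__main__":
    for r in range(3):
        print(f"{r} replica(s): {storage_multiplier(r)}x storage")
    print("standby cluster:", standby_multipliers())
```

With one replica the multiplier is 2 and with two replicas it is 3, matching the figures above. Combining replicas with a standby cluster multiplies the two effects.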

Tiering

For Elastic and OpenSearch, the highest performance is achieved when all your data is resident on expensive, locally attached NVMe SSDs. However, this becomes very costly for more than a few days of data. Elastic and OpenSearch offer storage tiering to reduce costs: only a limited amount of data is stored on expensive NVMe SSDs, while longer-term data is offloaded to warm disks or cold object storage. Many customers we’ve spoken with like the performance of locally attached SSDs but find that query latencies for warm or cold data are painful and have high variance. They also find it hard to size, monitor and administer different tiers of storage.
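To see why tiering lowers cost, and why it adds a sizing problem, it can help to compute a blended storage price. The sketch below is illustrative only: the `Tier` class and `blended_price` function are hypothetical names, the prices are placeholder relative units rather than vendor quotes, and the model assumes a constant daily ingest volume so that each tier's share of the data is proportional to the days it retains.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Tier:
    name: str
    days_retained: float       # days of data kept in this tier
    price_per_gb_month: float  # placeholder relative price, not a quote


def blended_price(tiers: Sequence[Tier]) -> float:
    """Retention-weighted average price per GB-month across all tiers.

    Assumes constant daily ingest, so a tier holding N days of data
    holds N / total_days of all stored bytes.
    """
    total_days = sum(t.days_retained for t in tiers)
    if total_days <= 0:
        raise ValueError("total retention must be positive")
    weighted = sum(t.days_retained * t.price_per_gb_month for t in tiers)
    return weighted / total_days


if __name__ == "__main__":
    # Placeholder numbers chosen only to show the shape of the calculation.
    tiers = [
        Tier("hot", 7, 1.00),
        Tier("warm", 23, 0.50),
        Tier("cold", 60, 0.10),
    ]
    print(f"blended price: {blended_price(tiers):.3f} per GB-month")
```

Every tier boundary is a parameter someone has to choose and revisit as data volumes change, which is the sizing burden described above.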

Elastic offers four different levels of storage tiering: hot, warm, cold and frozen.

Source: https://www.elastic.co/blog/introducing-elasticsearch-frozen-tier-searchbox-on-s3

AWS OpenSearch offers three different levels of storage tiering: hot, UltraWarm and cold.

Source: https://aws.amazon.com/blogs/big-data/choose-the-right-storage-tier-for-your-needs-in-amazon-opensearch-service/

As you can see, there is a lot of complexity that comes with storage tiering.

Sneller Storage Management

Sneller uses object storage exclusively. There is only one tier of storage, and its cost is equivalent to the extremely low-cost options that our competitors offer. The challenge, clearly, is to provide hot-tier performance at cold or frozen tier prices. That is the technical hurdle we’ve overcome with selective, high-throughput data loading from object storage, ultra-fast queries and intelligent DRAM caching. The benefits of this approach are significant. You get the lowest possible cost and eliminate huge management, data protection and data reliability headaches. There are no tiers to size and no extra charges for storage beyond what you already pay your cloud provider for object storage. This allows you to ingest data at very high rates and increase retention to whatever scale you want.

Summary

The table below shows the alternatives available for Elastic and OpenSearch, depending on how much reliability you want versus how much you want to pay.

Users want all the advantages that the rightmost column provides. They want very high availability for both compute and storage. They want low-cost but high-speed access to days, weeks or even years of data, and they want all this in an easy-to-manage solution. The Elastic and OpenSearch cluster-based architecture offers reliability, but at a steep price that can be more than 5 times the cost of a base configuration. The high-availability deployment also requires significant operational overhead to manage and monitor all the nodes and several tiers of storage.

On the other hand, Sneller offers the highest levels of redundancy and reliability at huge cost savings, in an offering that is much easier to use. Because Sneller is compatible with Elastic and OpenSearch, it is easy to run Sneller in parallel and convert at your own pace.

Ready to speed up and simplify your event data analytics?

Sneller is also available as an open source project on GitHub.