This is part one of a two-part blog that compares Sneller's compute and storage efficiency to Elastic and OpenSearch. Sneller's architecture can radically reduce costs, typically by 50% to 90%. The more compute-intensive your queries, or the more data you have to ingest and retain, the more cost-effective Sneller becomes. Keep in mind that comparing costs across two radically different architectures depends on a number of implementation choices. Ultimately, what matters most are your specific workload and reliability requirements.
To start with, we will describe the factors needed to size an Elastic or OpenSearch cluster.
How To Size An Elasticsearch or OpenSearch Cluster
With Elasticsearch and OpenSearch, you need to set up a cluster of systems with locally attached storage. Both compute work (needed for indexing at ingest time and for querying) and data (consisting of indexes) are distributed evenly across the cluster. You choose configurations based on your cost and reliability requirements.
Elastic and AWS each provide a resource sizing tool that you configure based on the parameters below.
You have three choices, with increasing levels of cost and reliability. The first and cheapest is no compute redundancy: all your nodes are configured in a single availability zone (AZ), and you can utilize them up to 100% at peak indexing and query load. This is low cost, but if a node or the AZ goes down, you lose your cluster, so this configuration is only recommended for short test or development runs. Next is dual redundancy, recommended for production workloads: you allocate twice the nodes and cap maximum CPU utilization at 50%, so that if a node goes down, its backup has the capacity to absorb the extra load. Note that with this configuration costs go up by 4x (twice the nodes at half the CPU utilization). Finally, there is triple redundancy, where nodes are spread over three availability zones and maximum CPU utilization is capped at 66% per node; compute costs go up by 3x.
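The cost multipliers above follow from simple arithmetic: cost scales with the number of nodes you provision, divided by the utilization cap you hold them to. A minimal sketch (the `effective_cost_multiplier` helper and the 2x-nodes reading of the triple-redundancy case are my own illustration, not from Elastic's or AWS's sizing tools):

```python
def effective_cost_multiplier(node_multiplier: float, max_utilization: float) -> float:
    """Compute cost relative to a single-AZ cluster run at 100% utilization.

    node_multiplier: how many times more nodes you provision than the baseline
    max_utilization: the CPU cap each node is held to (0.0 to 1.0)
    """
    return node_multiplier / max_utilization

# No redundancy: 1x nodes, usable up to 100% -> baseline cost (1x)
print(effective_cost_multiplier(1, 1.00))        # 1.0
# Dual redundancy: 2x nodes, capped at 50% -> 4x cost, as described above
print(effective_cost_multiplier(2, 0.50))        # 4.0
# Triple redundancy: one reading consistent with the ~3x figure is
# roughly 2x nodes spread over 3 AZs, each capped at 66% utilization
print(round(effective_cost_multiplier(2, 0.66), 1))  # ~3.0
```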
One way to reduce Elastic costs is to run your nodes “hot.” An Elastic cluster is normally provisioned with enough headroom to absorb dynamic changes in ingest or query load and to leave some slack for growth. To save costs, you can keep average CPU utilization above 50% around the clock, but this can lead to failed queries or dropped ingest during traffic spikes. As you move to production and mission-critical deployments, average CPU utilization should be reduced; one of our design partners runs their mission-critical cluster at 20% average utilization. An additional resource to watch is heap space (RAM) utilization.
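The trade-off between running hot and running safe shows up directly in node count. A rough sketch of the sizing math (the `nodes_needed` helper, the 64-vCPU peak, and the 16-vCPU node size are hypothetical values for illustration, not figures from the post):

```python
import math

def nodes_needed(peak_vcpu: float, node_vcpu: int, target_utilization: float) -> int:
    """Nodes required so the cluster absorbs peak load while each node
    stays at or below the target utilization."""
    return math.ceil(peak_vcpu / (node_vcpu * target_utilization))

# Hypothetical workload: peak load equivalent to 64 vCPUs, on 16-vCPU nodes.
# Running "hot" at 50% target utilization:
print(nodes_needed(64, 16, 0.50))   # 8 nodes
# A mission-critical deployment held to 20% average utilization
# (like the design partner mentioned above) needs 2.5x as many:
print(nodes_needed(64, 16, 0.20))   # 20 nodes
```

The same peak workload costs 2.5x more at the conservative utilization target, which is exactly the headroom-versus-cost tension the paragraph describes.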
Sneller monitors the CPU, memory, and network utilization of our clusters, keeping them optimally loaded to reduce costs while providing fast, scalable performance.
Sneller’s serverless compute provides triple-redundant clusters spread across three availability zones in each region by default, even for the smallest workloads, at no extra cost. Sneller leverages AVX-512 to expand compute resources by 16x compared to the competition, again at no extra cost. Sneller compute expands automatically and quickly to serve large workloads. You do not need to babysit CPU or memory utilization, or maintain large margins of either just to ensure reliable operation. You do not pay 24x7 for your clusters as you must with Elastic or AWS OpenSearch; instead, you pay only for the CPU cycles you actually use. Sneller pricing is based on the volume of data scanned at query time. You also do not need to worry about sizing, managing, scaling, replicating, or overprovisioning nodes in case of an AZ failure.
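To make the billing difference concrete, here is a hedged sketch contrasting scan-based pricing with an always-on cluster. All numbers are assumptions for illustration only (the $/TB rate, node price, and node count are invented; consult Sneller's and your cloud provider's actual pricing):

```python
def scan_based_cost(tb_scanned_per_month: float, price_per_tb: float) -> float:
    """Scan-based pricing: you pay per TB of data scanned by your queries,
    with no charge for idle capacity."""
    return tb_scanned_per_month * price_per_tb

def provisioned_cost(nodes: int, price_per_node_hour: float, hours: float = 730) -> float:
    """Provisioned-cluster pricing: nodes are billed 24x7 (about 730
    hours/month) whether or not they are doing work."""
    return nodes * price_per_node_hour * hours

# Hypothetical: 50 TB scanned per month at an assumed $5/TB rate
print(scan_based_cost(50, 5.0))      # 250.0
# Versus a hypothetical 6-node cluster at an assumed $0.50/node-hour
print(provisioned_cost(6, 0.50))     # 2190.0
```

The point is structural, not the specific numbers: an idle provisioned cluster keeps billing, while scan-based pricing goes to zero when no queries run.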
Put all of that together and you find that Sneller’s AVX-512-powered, serverless, fully redundant, and instantly scalable compute is far more efficient and easier to manage than the over-provisioned, hard-to-configure, hard-to-scale, and hard-to-manage compute of Elastic or OpenSearch.
My next blog discusses a similar advantage for storage.