Exasol Update: Lifting the Lid on New Elasticity and Scalability Features

We’re delighted to release our latest product enhancements to Exasol’s no-compromise database. This is the really exciting time for us, when we get to work with customers around the world to put these capabilities into action and make a real difference to their jobs and their organizations.

To uncover the details, we spoke with Jens Graupmann, Exasol’s SVP Product & Innovation. Read on if you want to find out more about:

  • New elasticity and scalability features
  • Separation of storage and compute
  • More functionality to make your life easier

What makes the latest offering different and what was the main driver for it?

With the latest improvements, Exasol has changed the way it operates to meet the elasticity demands of enterprises running data analytics. Thanks to the introduction of our multi-cluster functionality, combined with the decoupling of storage and compute, Exasol users can allocate additional resources to an active database as their business requirements change.

Looking at our existing user base and the market, we see some very typical workload patterns where this elasticity will make customers’ lives easier:

  1. Constant 24/7 usage
  2. On-demand usage
  3. Constant 24/7 usage with workload peaks in recurring and non-recurring situations

You can think of (3) as a combination of (1) and (2).
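To make the combination concrete, here is a minimal sketch that models each pattern as an hourly demand profile over one day. The function names and the "compute units" scale are purely illustrative assumptions, not Exasol terminology or measurements.

```python
HOURS_PER_DAY = 24

def constant_usage(level):
    """Pattern (1): the same demand in every hour of the day."""
    return [level] * HOURS_PER_DAY

def on_demand_usage(level, start_hour, end_hour):
    """Pattern (2): demand only within [start_hour, end_hour)."""
    return [level if start_hour <= h < end_hour else 0
            for h in range(HOURS_PER_DAY)]

def combine(baseline, peaks):
    """Pattern (3): the hour-by-hour sum of a baseline and on-demand peaks."""
    return [b + p for b, p in zip(baseline, peaks)]

# Hypothetical numbers: a baseline of 2 units plus 8 extra units from 09:00 to 17:00.
baseline = constant_usage(2)
peaks = on_demand_usage(8, 9, 17)
combined = combine(baseline, peaks)
```

In this toy model the combined profile stays at the baseline overnight and rises only inside the peak window, which is the shape elastic scaling is meant to follow.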

What will be the biggest changes for customers and what is the most exciting thing about the way they can now work with Exasol?

Customers working with the latest Exasol database offering can benefit from the dynamic administration of multiple clusters that can access the same data, providing not only adaptable scaling but also workload separation in response to user requirements.

So, what does this look like in the real world? Imagine an international Exasol user with two distinct teams requiring separate resources:

  1. BI Team – Standard reporting, operating 24/7
  2. Data Science Team – Model fitting and predictions, active only during business hours

Thanks to the new multi-cluster approach, it’s now possible to allocate dedicated clusters, or compute resources, to each team. For instance, the BI Team could use a medium-sized cluster (M) 24/7, while the Data Science Team could access an extra-large cluster (XL) exclusively during business hours. These independent compute resources prevent interference between the teams and enable more accurate planning and budgeting while still operating on the same data, as you can see in the diagram below.
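The planning and budgeting side of this example can be sketched in a few lines of Python. Everything numeric here is an assumption made for illustration: the relative size units for M and XL, and "business hours" meaning 09:00 to 17:00, Monday to Friday. None of it reflects Exasol’s actual sizing or pricing.

```python
from dataclasses import dataclass

# Hypothetical relative compute units per cluster size (assumption, not Exasol pricing).
SIZE_UNITS = {"M": 4, "XL": 16}

ALL_WEEK = frozenset(range(7))   # 0 = Monday ... 6 = Sunday
WORKDAYS = frozenset(range(5))   # Monday to Friday

@dataclass(frozen=True)
class ClusterPlan:
    team: str
    size: str
    start_hour: int      # inclusive, 0-23
    end_hour: int        # exclusive, 1-24
    weekdays: frozenset

    def is_active(self, weekday, hour):
        """True if this cluster is scheduled to run at the given weekday and hour."""
        return weekday in self.weekdays and self.start_hour <= hour < self.end_hour

    def hours_per_week(self):
        return (self.end_hour - self.start_hour) * len(self.weekdays)

    def unit_hours_per_week(self):
        return SIZE_UNITS[self.size] * self.hours_per_week()

plans = [
    ClusterPlan("BI", "M", 0, 24, ALL_WEEK),             # runs 24/7
    ClusterPlan("Data Science", "XL", 9, 17, WORKDAYS),  # assumed business hours
]

def active_teams(weekday, hour):
    """Teams whose cluster is running at the given time."""
    return [p.team for p in plans if p.is_active(weekday, hour)]

for plan in plans:
    print(plan.team, plan.hours_per_week(), plan.unit_hours_per_week())
```

Under these assumed numbers, the always-on M cluster and the part-time XL cluster consume a similar number of unit-hours per week, which shows why it helps to budget each team’s compute separately.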

Now let’s assume the company is planning to introduce self-service BI capabilities. With the multi-cluster approach, there are different ways organizations can approach this:

  1. Increase the compute resources of one existing cluster and point self-service BI at that cluster
  2. Spin up an additional cluster for this workload

Either way, we’re enhancing Exasol’s analytics database to provide ultimate flexibility for organizations.

How are the new features implemented from a technical perspective?

Decoupling compute from data storage

The following illustration depicts the architectural design of an Exasol Database prior to the latest enhancements.

In this setup, each database runs on a single cluster composed of multiple nodes. Each of these nodes possesses local storage (local disks) to store its segment of the data. This approach is commonly referred to as “shared nothing.”

As local storage is only accessible by a directly connected server, and modern cloud concepts provide highly scalable and cost-effective object storage options, Exasol has now transitioned to utilizing object storage for persistent data storage.

Utilizing object storage offers several advantages:

  • Object storage is more cost-effective than local storage
  • Object storage is heavily optimized to scale with the number of I/O requests

However, as with any benefit, there are also challenges to bear in mind:

  • Object storage exhibits higher latency compared to locally attached disks

How are we addressing this challenge?

Exasol uses high-performance ephemeral SSD storage in compute instances to cache data that demands low latency. This typically includes data used by Exasol server processes, but the ephemeral storage also serves as a secondary cache for user data.
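The general idea of putting a fast local cache in front of slower object storage can be illustrated with a small read-through cache. This is a generic sketch, not Exasol’s implementation: the class names are hypothetical, and the least-recently-used eviction policy is an assumption chosen for simplicity.

```python
from collections import OrderedDict

class ObjectStore:
    """Stand-in for remote object storage: durable, but each read is slow."""
    def __init__(self, blocks):
        self._blocks = dict(blocks)
        self.reads = 0  # counts round trips to the slow tier

    def get(self, key):
        self.reads += 1
        return self._blocks[key]

class ReadThroughCache:
    """Stand-in for a local SSD cache with LRU eviction (assumed policy)."""
    def __init__(self, store, capacity):
        self._store = store
        self._capacity = capacity
        self._cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._cache:
            self._cache.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._cache[key]
        # Cache miss: fetch from object storage and keep a local copy.
        self.misses += 1
        value = self._store.get(key)
        self._cache[key] = value
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict the least recently used block
        return value

store = ObjectStore({"a": b"1", "b": b"2", "c": b"3"})
cache = ReadThroughCache(store, capacity=2)
cache.get("a")  # miss: fetched from object storage
cache.get("a")  # hit: served from the local cache
```

Repeated reads of hot data are served locally, so only cold data pays the object storage latency. Because the cache is ephemeral, losing it costs performance rather than data, since the object store remains the persistent copy.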

With the separation of storage, it is now possible to operate the Exasol Database using multiple, independent clusters accessing the same data. One of these clusters holds a special role as the MAIN Cluster, housing the Exasol Transaction Management System (TMS). The TMS ensures ACID compliance and enables various operations on any of the clusters while maintaining database consistency.

A new SQL command has been introduced, enabling the effortless migration of idle user sessions between sub-clusters, allowing for a completely transparent setup change for the user.

When and where will this be available to customers?

Due to the varying object storage implementations offered by each public cloud provider, we are taking a phased approach to availability. The new offering is currently available for AWS and on-premises installations. Platform-specific storage implementations and optimizations for Google Cloud Platform (GCP) and Microsoft Azure are scheduled for release in the coming months, so stay tuned.

What other functionalities and improvements can we expect?

Our customers can expect a multitude of enhancements and new features, which will be progressively delivered over the coming months.

Just to name a few highlights, users will benefit from:

  • Extended Timestamp datatype
  • Improved compile times
  • Improvements for high concurrency situations
  • An improved optimizer for complex joins
  • Zone Maps for improved memory efficiency (in conjunction with partitioning)
  • A new backup concept (snapshot backups)

If you want to see for yourself how your organization can benefit from this functionality and you are an existing customer, talk with your account manager. If you’re not working with us but like what you see, test us out today through our accelerator program – three months and five terabytes for free.