Exasol users now get Amazon SageMaker connectivity

As an Advanced Technology Partner within the AWS Partner Network, we’re ensuring users can access the full potential of Amazon’s portfolio via our native connectors. So, our high-performance database now includes an extension for Amazon SageMaker, allowing users to load their data efficiently for fully integrated machine learning development.

Exasol now offers next-level analytics performance while unifying AI/ML and BI, by combining the SageMaker extension with our existing connections to AWS Kinesis and DynamoDB.

The power of Exasol’s SageMaker extension

Exasol users connecting with AWS can use the ExaLoader function to efficiently load data into SageMaker, enabling them to train machine learning models to make reliable predictions addressing a range of business scenarios. We’ve made sure the extension is intuitive for end-users, providing easy access to machine learning capabilities within the Exasol environment.

Models can be developed via all standard forms of supervised learning. This includes classification models for categorizing data, regression models for assessing relationships among dependent and changing business variables, and forecasting to uncover trends within data to improve operations at a faster pace.

Data engineers, data analysts, and BI analysts who are already familiar with Exasol can work with the extension to extend their analyses within our database. Using SageMaker endpoints, multiple prediction instances can be deployed with minimal input and called via user-defined functions (UDFs). Once the training phase is complete, these UDFs can be included in SQL queries for BI reporting.
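
To illustrate how a user-defined function might call a deployed endpoint, here is a minimal Python sketch. It assumes the model accepts CSV rows and returns one numeric prediction per line, which depends on the model and is not confirmed by this article. The helper names `rows_to_csv` and `predict_batch` are hypothetical and are not part of the Exasol extension. The client passed in is expected to be a boto3 `sagemaker-runtime` client, and `invoke_endpoint` is the only AWS call used.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize an iterable of rows into a CSV string, one row per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


def predict_batch(runtime_client, endpoint_name, rows):
    """Send rows to a SageMaker endpoint and return one float per row.

    Assumption: the endpoint answers with one numeric prediction per line.
    Real models may return JSON or class labels instead.
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=rows_to_csv(rows),
    )
    body = response["Body"].read().decode("utf-8")
    return [float(line) for line in body.splitlines() if line.strip()]


# Possible usage (needs AWS credentials and a deployed endpoint):
#   import boto3
#   client = boto3.client("sagemaker-runtime")
#   predict_batch(client, "my-endpoint", [[5.1, 3.5], [6.2, 2.9]])
```

Passing the client in as an argument keeps the function easy to test and lets the caller decide how credentials and regions are configured.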

Integration with SageMaker also provides users with an intuitive interface that’s equipped to automatically perform key operations when training models. This includes cleaning and transforming data in the pre-processing stage, deciding on hyperparameters to dictate how models can be properly trained, and determining the correct model type based on prediction requirements.

Models can be developed for all common ML use cases. This includes forecasting stock levels to anticipate supply requirements from vendors, predicting fluctuations in demand by product, identifying at-risk customers to reduce churn rates, detecting outliers to flag fraud, and predicting maintenance needs to plan the upkeep of equipment.

These features combine to create an ideal environment that data scientists and machine learning engineers can use for the duration of their careers, providing all the tools they’ll need to grapple with data and successfully develop strong predictive models. 

What is Amazon SageMaker?

Amazon SageMaker is a public cloud service that helps developers to train reliable machine learning models that predict business outcomes. After connecting to data sources, developers can use SageMaker Data Wrangler to convert, transform and combine raw data into simplified features to streamline the training process. Data sets can then be inspected by connecting to the SageMaker Autopilot service, allowing users to automatically build, train and tune several machine learning models simultaneously.
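
As a rough illustration of what starting an Autopilot job can look like from code, the sketch below assembles the arguments for the boto3 SageMaker client's `create_auto_ml_job` call. The helper name `build_autopilot_request` and every value in the usage comment (bucket paths, role ARN, column name) are hypothetical placeholders. This is not how the Exasol extension itself submits jobs.

```python
def build_autopilot_request(job_name, train_s3_uri, output_s3_uri,
                            target_column, role_arn,
                            problem_type="BinaryClassification",
                            metric_name="F1"):
    """Assemble keyword arguments for SageMaker's create_auto_ml_job."""
    return {
        "AutoMLJobName": job_name,
        "InputDataConfig": [
            {
                # Training data is read from every object under this S3 prefix.
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                    }
                },
                # Column that the candidate models should learn to predict.
                "TargetAttributeName": target_column,
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        # Problem type and objective metric are supplied together.
        "ProblemType": problem_type,
        "AutoMLJobObjective": {"MetricName": metric_name},
        "RoleArn": role_arn,
    }


# Possible usage (needs AWS credentials; all values are placeholders):
#   import boto3
#   request = build_autopilot_request(
#       "churn-job", "s3://my-bucket/train/", "s3://my-bucket/out/",
#       "CHURNED", "arn:aws:iam::123456789012:role/MySageMakerRole")
#   boto3.client("sagemaker").create_auto_ml_job(**request)
```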

Through feature engineering, developers can transform raw data into integrated dimensions that are relevant to specific business scenarios, resulting in more accurate models. SageMaker’s Feature Store can be used to save and manage existing features, which are also accessible to ML developers working on similar projects. Retrieving stored features streamlines projects and supports low-latency predictions that allow users to improve operations at a faster pace.

With SageMaker Clarify, users can ensure values within features are coherent and well represented within the data set. This means models can generalize to different cases and are not skewed toward particular sub-categories of data. Models can also be assessed to understand the role each feature plays in the prediction, which also helps developers to remove biases.
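
To make "well represented" concrete, the sketch below computes a simple class imbalance score of the kind reported as a pre-training bias metric: the difference between the sizes of two groups divided by their total. The function name `class_imbalance` and the sample data are hypothetical. This is a hand-rolled illustration of the idea and does not call SageMaker Clarify.

```python
from collections import Counter


def class_imbalance(facet_values, advantaged_value):
    """Return (n_a - n_d) / (n_a + n_d) for one facet column.

    n_a counts rows equal to advantaged_value; n_d counts all other rows.
    The result ranges from -1 to 1, and 0 means both groups are equal in size.
    """
    if not facet_values:
        raise ValueError("facet_values must not be empty")
    counts = Counter(facet_values)
    n_a = counts[advantaged_value]
    n_d = len(facet_values) - n_a
    return (n_a - n_d) / (n_a + n_d)


# Hypothetical customer segment column: 80 "retail" rows, 20 "wholesale" rows.
segments = ["retail"] * 80 + ["wholesale"] * 20
print(class_imbalance(segments, "retail"))
```

A value far from zero suggests that one group dominates the training data, so a model trained on it may perform worse for the smaller group.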

The SageMaker Debugger is used to refine models by removing errors and speeding up the training process. With SageMaker Pipelines, integration and deployment can be automated, and corrected models can be re-trained with just one click.

Interacting with SageMaker

Users can interact with SageMaker via the AWS console and API, or by writing code in Jupyter notebooks running on SageMaker notebook servers. A range of programming languages and SDKs are supported, including ML-specific libraries such as TensorFlow, PyTorch and scikit-learn, in addition to the SageMaker-specific SDK. These notebooks can be shared with other developers, allowing for effective collaboration. Tutorial notebooks are also available, containing instructions and code that cover all phases of training and can be reused in live projects.

SageMaker Studio has been included as an IDE modelled on the JupyterLab environment, providing a central web-based interface to manage all development phases and improve productivity. SageMaker Studio is also equipped with add-ons and extensions allowing for even tighter integration with the SageMaker platform.