Tech Blog

Aurora Virtualis — using AWS Aurora with Exasol’s Virtual Schema

In today’s article I’m going to explain how to set up a low-cost playground for using Amazon’s AWS Aurora with Exasol’s Virtual Schema. The goal is to demonstrate how to connect to Aurora.

Aurora is client-compatible with:

This tutorial looks at the cheapest installation possible. You’ll certainly choose some different options for a production setup.

Fair warning: the setup isn’t easy due to the fact that establishing a secure connection requires Aurora users to jump through some hoops.

Setting up a low-cost Aurora installation

Aurora is part of the Relational Database Services (RDS) solutions of Amazon’s AWS offer. You’ll get a managed installation that you can create with a reasonable amount of effort.

In this tutorial, we’re going to install the PostgreSQL variant of Aurora.

We also have MySQL Virtual Schema available, so you can also use MySQL Virtual Schema dialect to connect to Aurora.

To save money and reduce complexity, we’re going to:

These aren’t the choices you’d make for a production environment, but they’re more than enough for the sake of this tutorial.

Step-by-step Aurora setup

Many of AWS services are region-based. That means they are hosted in a specific area as opposed to services that are distributed world-wide by default. In this article region-based links — like the ones to the AWS console  — use the Frankfurt region. Choose a region closest to you for your trial for the best network performance.

To start, log into AWS.

Setting up the virtual network

I’m assuming that you have the community edition of Exasol set up on your computer and connect to Aurora via the internet. Another option would be to install Exasol pay-as-you-go on AWS. You could then install it on the same virtual network as Aurora.

  1. Open the VPC console
  2. Create the VPC
  3. Make sure DNS resolution and DNS hostnames options are enabled
  4. Create two subnets in different availability zones
  5. Create an internet gateway
  6. Select the newly created internet gateway
  7. Under Actions, click Attach to VPC
  8. In the dialog box Attach to VPC:
    1. Select the VPC you created earlier
    2. Click Attach

Setting up the Aurora infrastructure

Aurora is a managed service, but the underlying infrastructure shines through. Aurora uses EC2 under the hood to do the computing much like other RDS services. Some of the settings will look familiar if you ever set up an EC2 instance like the instance type and the need to define the VPC where you want to install your cluster.

  1. Open the RDS console
  2. Click Create database
  3. In the dialog box Create database:
    1. Choose Standard Create
    2. Choose Amazon Aurora
    3. Select Amazon Aurora with PostgreSQL compatibility
    4. Select Provisioned capacity type.
    5. Select Dev/Test template.
    6. Set a DB instance identifier such as aurora-tutorial
    7. We can use default postgres as the master username
    8. Enter a secure password for the master user
    9. Choose db.r4.large as this is the smallest available instance type
    10. Multi-AZ deployment should be set for Don’t create an Aurora Replica
    11. Select the VPC you created earlier
    12. Let RDS create the subnets for you
    13. Make the instance publicly accessible
    14. Let RDS create a security group for you
    15. Open Additional configuration
    16. Call your initial Aurora database aurora_tutorial_database
    17. Reduce the backup retention period to one day
    18. Disable Copy tags to snapshots
    19. Disable Encryption
    20. Switch off Performance Insights
    21. Disable Enhanced monitoring
    22. Click Create database

This will take a while — the perfect opportunity to grab a tea or coffee.

Tweaking the security group

Note that RDS will limit the source IP for inbound traffic to your current IP address while it creates the security group for you. You might want to widen it to the address range of your organization later.

Connecting the database with an SQL client

Aurora instance hostname

We’re now going to test our newly installed Aurora database. Start your preferred SQL client and follow these instructions to get a connection.

Before you can connect, find the hostname of the instance you want to connect to. To achieve this:

  1. Open the RDS console
  2. Select your database instance in the list
  3. Switch to the tab Connectivity & security
  4. Copy the hostname under Endpoint

Get the Aurora version number as a first smoke test of your connection:

SELECT AURORA_VERSION();

Establishing a secure connection

Next we should talk a little bit about Transport Layer Security (TLS). If that is new to you I suggest reading up on TLS first before you continue. A starting point is the description section in the Wikipedia article. As it’s a very complex topic, covering all important aspects is way beyond the scope of this tutorial.

I also recommend reading Denis Szczukocki‘s article that explains the difference between a Java keystore and a truststore.

In the past, Aurora had at least one certificate in the certificate validation chain with 1024-bit key length. This is considered too low to be acceptable by some modern Linux distributions. On my Fedora 33 machine, for example, you’ll find the following:  /etc/crypto-policies/back-ends/Loading...java.config:

jdk.certpath.disabledAlgorithms=MD2, MD5, DSA, RSA keySize < 2048

Generally speaking, this configuration is a good idea since it improves system security. If you run into a certificate length problem, you can reduce the mentioned value to < 1024 in order to get TLS working.

Important properties

You need to set the property to ssl=true to switch SSL on, including full certificate chain checking. To use the default Java certificate store, set the property to  sslfactory=org.postgresql.ssl.DefaultJavaSSLFactory. Both settings are essential to successfully establish the secured connection in a way that also works with virtual schemas.

Authentication and certificates

I will now briefly explain the concept of an authentication trust chain built on certificates. I can only scratch the surface in this article. If you’re new to this topic, I strongly recommend reading up on secure network communication via TLS.

TLS uses a cryptographic certificate to authenticate the server your client is talking to. Since you can’t store all certificates of all machines in the world on your computer, certificate chaining was invented. The idea is that you only need to store certificates of root certification agencies (CAs) that are considered trusted on your computer. Operating systems, browsers and Java all come with a — mostly overlapping — set of root CA certificates preinstalled. Next a chain of certificates is established. This helps to trace a certificate your machine doesn’t know all the way down to a root CA. And this establishes a trusted authentication.

If the service provider derives the host certificate of the server you’re talking to from a root CA your computer already knows, you’re fine.

If not, you have to install the missing certificates. Amazon provides a bundle of certificates for their RDS service for download. And this is where the installation becomes inconvenient. Successful authentication against Aurora required the installation of these extra certificates during my trials. For a cluster you’d have to install them on every data node.

Installing the certificates

Let’s prepare a script and install certificates on an Exasol node — provided you have permissions to log into the node as root.

To create a script run the following command:

cat > /tmp/install_certificate  << 'END'#!/bin/bash region=$1cd /tmpwget https://truststore.pki.rds.amazonaws.com/${region}/${region}-bundle.pemopenssl x509 -outform der -in ${region}-bundle.pem -out ${region}-bundle.derkeytool -keystore /etc/pki/ca-trust/extracted/java/cacerts -import -file ${region}-bundle.der -alias aws-${region}updatedbCONTAINER_TRUSTSTORE=$(locate --regex 'ScriptLanguages.*/java/cacerts')keytool -keystore "$CONTAINER_TRUSTSTORE" -import -file ${region}-bundle.der -alias aws-${region}END

Install the certificates for regions you want to access:

bash /tmp/install_certificate region-name

For example:

bash /tmp/install_certificate eu-west-2

You can check a list of all available certificates for AWS regions.

The script first switches to the /tmp directory, then downloads the drivers from the internet. It then converts the certificates from PEM to DER format in order to be able to install them in the Java truststores (central and BucketFS).

If your cluster doesn’t have an internet connection, you have to download the certificates to your machine, upload it to the cluster node via SSH and then install them in the truststore.

Setting up the Virtual Schema

CREATE OR REPLACE CONNECTION AURORA_CONNECTIONTO 'jdbc:postgresql://<cluster id>.<region>.rds.amazonaws.com/<database>?ssl=true&sslfactory=org.postgresql.ssl.DefaultJavaSSLFactory'USER '<user>'IDENTIFIED BY '<password>'

The property ssl=true switches SSL on, including full certificate chain checking. To use the default Java certificate store, set the property to sslfactory=org.postgresql.ssl.DefaultJavaSSLFactory.

Remember to adapt the parameters for user, cluster, region and database to match your setup.

CREATE SCHEMA ADAPTER;CREATE OR REPLACE JAVA ADAPTER SCRIPT ADAPTER.AURORA_ADAPTER AS%scriptclass com.exasol.adapter.RequestDispatcher;%jar /buckets/bfsdefault/jars/<virtual schema jar name>.jar;%jar /buckets/bfsdefault/jars/postgresql-<version>.jar;/;
  • Create the Virtual Schema
CREATE VIRTUAL SCHEMA AURORAUSING ADAPTER.AURORA_ADAPTERWITHCONNECTION_NAME = 'AURORA_CONNECTION'SCHEMA_NAME = 'public';

You now have a TLS-secured JDBC connection from the Exasol Virtual Schema to the Aurora cluster.

Conclusion

AWS Aurora is a managed Loading...relational database that you can access via a PostgreSQL JDBC connection. This lets us create a Virtual Schema on top of an Aurora database. The setup is complex because authentication only works if you install the certificate bundle Amazon provides on each Exasol cluster node. And if you manage to wrap your head around this, you’re rewarded with a secure connection between Exasol and Aurora.

Sebastian Bär