

This guide will assist you during your initial steps with EXASOL, such as setting up the database, loading data from various sources or connecting common BI tools.

Step 1: Setting up your EXASOL Instance

First of all, set up the EXASOL database within your environment by choosing one of the options below. You should provision appropriate RAM and CPU resources for your specific data volume and performance needs.

 Virtual Machine
You can download a virtual machine image (.ova) here. Afterwards, please import it into the virtualization layer of your choice (e.g. VirtualBox, VMware / VMware Player, KVM, ...). If you just want to load data from your local machine (e.g. CSV files), you can leave the network settings at their defaults. Otherwise you might want to set the network mode to either NAT or Bridged.
 Microsoft Azure

You can easily start an EXASOL database on the Azure Marketplace. Details are described in our solution EXASOL on Azure and directly in the Azure Marketplace.

 Amazon Web Services (AWS)

The steps to set up a database in AWS are described in the following solution center article: EXASOL on AWS, and directly on the AWS Marketplace.

Step 2: Connect to the database via EXAplus (SQL-Client)

EXAplus is our SQL client and can be downloaded in the download section. Further details about EXAplus can be found in our solution center: First Steps with EXAplus

The default connection settings are:

  • Connection: the IP address or hostname of your instance
  • User: sys
  • Password: exasol
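For example, assuming EXAplus is installed and the database listens on the default port 8563, a console connection could look like this (the host is a placeholder for your instance's address; flags may differ slightly between EXAplus versions):

    exaplus -c <your_host>:8563 -u sys -p exasol

After connecting you can run SQL statements interactively or pass a script file.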

Step 3: Loading data using the IMPORT statement

You can ingest data either using your ETL tool of choice or with our built-in bulk loader, a simple but powerful SQL command that is fully integrated into our transaction concept.

Directly importing data from databases

To make it as easy as possible, we've prepared scripts for you which can automate the task of creating the schema structure and generating appropriate IMPORT statements. You only need the connection information and a user with appropriate rights for the database you want to load data from. Please find the appropriate scripts for various data sources in our open source GitHub repository:
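As a sketch, a generated statement for a JDBC source might look like the following (the JDBC URL, credentials, and schema/table names are placeholders, and the corresponding JDBC driver must be configured for your EXASOL installation):

    IMPORT INTO my_schema.my_table
    FROM JDBC AT 'jdbc:postgresql://source_host:5432/source_db'
    USER 'src_user' IDENTIFIED BY 'src_password'
    STATEMENT 'SELECT * FROM public.source_table';

The scripts generate one such statement per source table, so you can load a whole schema in a few commands.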

Loading data from files

With the IMPORT command you can load CSV / FBV files directly from your local file system, from ftp(s), sftp or http(s) servers, or from S3 or Hadoop.

 Load data from CSV Files

First you should create a table inside EXASOL with column names and types matching your file.
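For instance, a minimal table for a CSV file with an id, a customer name and a date column could look like this (schema, table and column names are illustrative):

    CREATE TABLE my_schema.sales (
        id        DECIMAL(18,0),
        customer  VARCHAR(200),
        sale_date DATE
    );

The column order should match the column order in the file, unless you specify a column list in the IMPORT statement.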

Load data from the local file system:

  IMPORT into <targettable> from local CSV file '/path/to/filename.csv' <optional options>;

Load data from a server or webservice:

  IMPORT into <targettable> from CSV at 'ftp://<your_server>/path_to_file/' user 'username' identified by 'somepass' file 'filename.csv' <optional options>;

There are several options available to specify the format of the file:

Skip a header:

If your files contain a header: SKIP=1

Set the right linebreak:

  • Linux (default): ROW SEPARATOR='LF'
  • Windows: ROW SEPARATOR='CRLF'

Set the separator between the fields (usually comma):

COLUMN SEPARATOR='<your_character>'

Set the delimiter of the fields (common: double quotes):

COLUMN DELIMITER='<your character>'

If your data (e.g. dates / timestamps) doesn't match the default format, you can either change the default format:

    alter session set NLS_DATE_FORMAT='DD.MM.YYYY'; --or your format
    alter session set NLS_TIMESTAMP_FORMAT='DD.MM.YYYY-HH24:MI.SS.FF3'; --or your format

or you can specify, after the file, the list of columns with the corresponding formats. For example, to specify the format for column 6:

    ... file 'myfile.csv' (1..5,6 format='DD.MM.YYYY',7..12) ...

If you want to monitor the progress of the import, open a new EXAplus session, connect, and query the corresponding system table. The number of rows read so far is displayed in the column ACTIVITY:

    select * from EXA_DBA_SESSIONS;

See also this comprehensive example combining commonly used options (table name, file path, separators and formats are placeholders):

	IMPORT into my_schema.sales from local CSV file '/path/to/sales.csv'
	COLUMN SEPARATOR = ';'
	COLUMN DELIMITER = '"'
	ROW SEPARATOR = 'LF'
	SKIP = 1;

Another way to import files is the Import Wizard in EXAplus (GUI). You can find it in the context menu of a table (right click) in the EXAplus database browser.

 Load data from Amazon S3

You'll find an example of how to load data in parallel from Amazon S3 in the following description: Import/Export from/to Amazon S3™ buckets
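As a sketch, a single-file import from S3 might look like this (bucket name, credentials and file path are placeholders; the linked description covers loading multiple files in parallel):

    IMPORT INTO my_schema.sales
    FROM CSV AT 'https://my-bucket.s3.amazonaws.com'
    USER '<aws_access_key_id>' IDENTIFIED BY '<aws_secret_access_key>'
    FILE 'data/sales.csv';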

In case you would like to import SSE-S3 and SSE-KMS encrypted data please refer to: Support for AWS SSE-S3 and SSE-KMS Encryption for EXPORT to S3

In order to import Hadoop data formats like Parquet from S3 please refer to our Cloud-Storage-UDFs:

 Load data from HIVE

 The following solution describes the process of connecting and loading data from HIVE: HIVE JDBC Connection

Step 4: Have fun with your data

After loading data to the EXASOL database you can start to analyze the data by either executing SQL statements via EXAplus (or other SQL consoles) or you can connect your Business Intelligence tool.
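For example, a first analysis in EXAplus could be as simple as the following aggregation (schema, table and column names are illustrative and should match your own data):

    SELECT customer, COUNT(*) AS orders, SUM(amount) AS revenue
    FROM my_schema.sales
    GROUP BY customer
    ORDER BY revenue DESC
    LIMIT 10;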

 Connect Microstrategy

The steps to connect Microstrategy to EXASOL are described in the solution: How to connect Microstrategy to an EXASOL database

 Connect Business Objects

The steps to connect Business Objects to EXASOL are described in the solution: Business Objects connection

 Connect IBM Cognos

 Since Cognos Business Intelligence 10.2.1 Fix Pack 3, there is a certified JDBC connector for EXASOL: How to connect IBM Cognos with EXASolution

 Connect Microsoft SQLServer Analysis Services (SSAS)

Microsoft SQLServer Analysis Services (SSAS) can be connected via ROLAP mode. You can benefit from near-real-time data integration without needing to generate cubes upfront.

The steps to connect SSAS are described in the solution:  SQLServer Analysis Services (SSAS) Connectivity

Please keep in mind that indexes are automatically generated, reused and discarded by the system as necessary. You don't need to worry about indexes; they are created automatically during the execution of queries, without any need for manual tuning.

Indexes are stored persistently and reused in subsequent executions. They are stored in compressed form and don't need to be decompressed when accessed. You can find more technical details about indexes here.