  1. Solution Center
  2. SOL-636

AWS Installation via CloudFormation Advanced Templates

    Details

    • Type: How To
    • Status: Published
    • Affects Version/s: Exasol 6.1.0
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Solution:

      Introduction

      These AWS CloudFormation configuration templates are meant for expert users.
      Please use these templates only if the standard templates cannot be used (see SOL-633: Cluster, SOL-605: Single Node).

      In the last section of this solution you will find CloudFormation templates for cluster and for single node installation.

      1. Selecting an appropriate instance type and storage size

      A good rule of thumb is that the RAM size of the selected instance should correspond to approximately 10% to 15% of your raw data.

      EXASOL supports all EC2 instances with more than 15 GB of main memory.
      We recommend instance types of category r4, r5, m4 or c5 with more than 30 GB of main memory.

      If we assume that your raw data (uncompressed) is about 500 GB in size, the following instance types could be a good choice for a single-node installation:

      • r4.2xlarge, 8 vcores, 61 GB main memory
      • m4.4xlarge, 16 vcores, 64 GB main memory

      The main difference between these instance types is the number of vcores, that is, the compute power.

      Due to its higher compute power, the m4.4xlarge type can process more concurrent queries than the r4.2xlarge instance.

      This guide therefore uses the m4.4xlarge instance type.

      Please note:

      • The sizing "rule of thumb" used above may not be appropriate for your use case. Based on your specific use case, you may need significantly more memory.
      • For more information on sizing please refer to: EXASOL Sizing and the corresponding section in the FAQ document on installation and configuration
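
      The rule of thumb and the example above can be sketched in Python as follows. This is a minimal illustration, not an Exasol tool: the function names are hypothetical, the candidate list contains only the two instance types named in this guide, and preferring the candidate with the most vcores mirrors the reasoning above rather than a general recommendation.

```python
# Candidate instance types taken from this guide: (name, vcores, main memory in GB).
CANDIDATES = [
    ("r4.2xlarge", 8, 61),
    ("m4.4xlarge", 16, 64),
]


def ram_range_gb(raw_data_gb, low=0.10, high=0.15):
    """Return the (min, max) RAM in GB suggested by the 10% to 15% rule of thumb."""
    return raw_data_gb * low, raw_data_gb * high


def matching_instances(raw_data_gb, candidates=CANDIDATES):
    """Return candidates that meet the lower RAM bound, most vcores first.

    Sorting by vcores is an assumption of this sketch: it favours
    concurrency, as in the m4.4xlarge choice made in this guide.
    """
    low, _ = ram_range_gb(raw_data_gb)
    fits = [c for c in candidates if c[2] >= low]
    return sorted(fits, key=lambda c: c[1], reverse=True)


if __name__ == "__main__":
    low, high = ram_range_gb(500)
    print(f"Suggested main memory: {low:.0f} to {high:.0f} GB")
    for name, vcores, ram in matching_instances(500):
        print(f"{name}: {vcores} vcores, {ram} GB")
```

      For 500 GB of raw data the suggested range is roughly 50 to 75 GB, which both candidates satisfy. As noted above, workloads with special requirements may need significantly more memory than this estimate.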

      The storage must be large enough to hold the compressed data, the index structures, and space for temporary data.

      As a rule of thumb, the disk space for a single node should correspond to approximately 0.7x the amount of raw data managed by the node (equivalent to 7 times the main memory of the node).
      Because a cluster always stores data redundantly for high availability, cluster nodes need more disk space than single-node systems. As a rule of thumb, the disk storage for a cluster node should correspond to 12x the main memory of that node (equivalently, 1.2x the amount of raw data managed by the node).

      In our example: 500 GB * 0.7 = 350 GB. The attached storage should therefore be no smaller than 350 GB.

      Performance Tip for Storage

      EBS storage volumes of at least 214 GB deliver maximum EBS performance, so try to avoid volumes smaller than 214 GB. Furthermore, adding more than one EBS volume might increase the overall throughput.

      Based on this performance tip, we configure our instance with 2x 214 GB EBS volumes (428 GB in total, which covers the 350 GB required in our example).
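
      The storage arithmetic above can be reproduced with a short Python sketch. The function and constant names are hypothetical, and the sketch assumes that all data volumes have the same size and that the 0.7x and 1.2x factors from this guide apply unchanged to your data.

```python
import math

SINGLE_NODE_FACTOR = 0.7   # disk space of about 0.7x the raw data (single node)
CLUSTER_NODE_FACTOR = 1.2  # disk space of about 1.2x the raw data (cluster node)
MIN_VOLUME_GB = 214        # volume size from the performance tip above


def required_disk_gb(raw_data_gb, cluster=False):
    """Return the disk space in GB suggested for the raw data managed by one node."""
    factor = CLUSTER_NODE_FACTOR if cluster else SINGLE_NODE_FACTOR
    return raw_data_gb * factor


def volume_count(required_gb, volume_gb=MIN_VOLUME_GB):
    """Return how many equally sized volumes are needed to cover required_gb."""
    return math.ceil(required_gb / volume_gb)


if __name__ == "__main__":
    needed = required_disk_gb(500)   # single node, 500 GB raw data
    count = volume_count(needed)
    print(f"Required: {needed:.0f} GB, volumes: {count} x {MIN_VOLUME_GB} GB")
```

      For the single-node example this yields two 214 GB volumes, matching the configuration chosen above. Passing cluster=True applies the 1.2x factor to the raw data managed by each cluster node instead.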

      Template Parameters

      Parameter: Usage/Description

      • Stack name: Name of the AWS CloudFormation stack
      • Database Name: Name of the generated database
      • SYS user password: Password for the Exasol database administration user (SYS)
      • ADMIN user Password: Password for the EXAoperation system administration user (ADMIN)
      • VPC CIDR Block: If a new VPC is created, a valid CIDR block has to be specified
      • Database Subnet-Id (only when deploying into an existing VPC): Choose a subnet ID from the dropdown menu
      • Subnet CIDR Block: Choose a valid CIDR block when deploying into a new VPC (optional)
      • Database Placement Group: An existing placement group can be chosen here; otherwise a new one is created
      • Public IPs: If true, public IP addresses are associated with all instances
      • DNS Server: DNS server (default: AWS DNS server 169.254.169.253)
      • System Timezone: Time zone for the database (default: Europe/Berlin)
      • License Server IP: IP address of the license server; the data node IPs count upwards from this one (optional)
      • License Server Instance Type: EC2 instance type of the license server
      • AMI ID: Select the corresponding AMI from the AWS Marketplace
      • First Data Node IP: IP address of the first data node; the remaining data node IPs count upwards from it (FirstDataNodeIP > LicenceServerIP) (optional)
      • Number of Data Nodes: Number of database nodes that store data and process queries (min: 2, max: 64)
      • Data Node Instance Type: Instance type of the data nodes
      • Replication Factor: Defines how many copies of a data block are kept in the cluster (1 means no redundancy)
      • StandByNodes: If the replication factor is > 1, a standby node can automatically replace a failed node
      • Encrypt EBS Volumes: Enable encryption of block storage
      • Size in GB of OS Block Device Volumes: Device size for the OS volume
      • General Purpose SSD (gp2) or Throughput Optimized HDD (st1): Device type for the OS volume
      • Size in GB of Data Block Device Volumes: Size in GB per block device volume (for optimal performance >= 214 GB)
      • General Purpose SSD (gp2) or Throughput Optimized HDD (st1): Device type for the data volume
      • Number of Data Block Devices: Number of storage volumes for each node
      • Remote Access From IP: Use 0.0.0.0/0 to allow access from anywhere
      • AWS Key Pair: Choose the key pair for SSH access to the created instances
      • License: For a BYOL image, an already acquired license can be pasted here
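
      Several cluster parameters constrain each other: the data node IPs count upwards, the first data node IP must be greater than the license server IP, and the number of data nodes must lie between 2 and 64. The following Python sketch checks these constraints before a stack is created. The function name and the example addresses are hypothetical, and the sketch assumes that the data node IPs are assigned contiguously and must all lie inside the given subnet; the templates themselves may behave differently.

```python
import ipaddress


def data_node_ips(subnet_cidr, license_server_ip, first_data_node_ip, node_count):
    """Return the data node IPs, counting upwards from the first data node IP.

    Assumes contiguous addresses inside subnet_cidr (an assumption of this
    sketch, not a documented guarantee of the templates).
    """
    if not 2 <= node_count <= 64:
        raise ValueError("number of data nodes must be between 2 and 64")

    subnet = ipaddress.ip_network(subnet_cidr)
    license_ip = ipaddress.ip_address(license_server_ip)
    first_ip = ipaddress.ip_address(first_data_node_ip)

    if first_ip <= license_ip:
        raise ValueError("first data node IP must be greater than the license server IP")

    ips = [first_ip + offset for offset in range(node_count)]
    for ip in [license_ip] + ips:
        if ip not in subnet:
            raise ValueError(f"{ip} is outside the subnet {subnet}")
    return [str(ip) for ip in ips]


if __name__ == "__main__":
    # Hypothetical example values for a three-node cluster.
    print(data_node_ips("10.0.0.0/24", "10.0.0.10", "10.0.0.11", 3))
```

      Running such a check locally gives faster feedback than waiting for a stack creation to fail on an invalid address.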

      Links to templates

      Single Node, using existing VPC
      Single Node, creating new VPC
      Cluster, using existing VPC
      Cluster, creating new VPC

    • Category 1:
      Platform Support - EXASOL on AWS
    • Category 2:
      Cluster Administration - Installation

            People

            • Assignee:
              CaptainEXA Captain EXASOL
              Reporter:
              CaptainEXA Captain EXASOL