[SOL-171] Database size Created: 04.09.2014  Updated: 22.06.2020  Resolved: 22.06.2020

Status: Obsolete
Project: Solution Center
Component/s: EXASolution
Affects Version/s: EXASolution 4.2.0, EXASolution 5.0, EXASOL 6.0.0, Exasol 6.1.0, Exasol 6.2.x
Fix Version/s: None

Type: Explanation
Reporter: Captain EXASOL Assignee: Captain EXASOL
Labels: None

Issue Links:
Related
is related to SOL-583 Overview of Exasol's data and memory ... Published
Explanation:

Note: This solution is no longer maintained. For the latest information, please visit our Knowledge Base:

https://community.exasol.com/t5/database-features/database-size/ta-p/1040

------------------------------------------------------------------------------------------------------------------------------------------------------------------

Explanation

Database size

The database size can be calculated on the basis of the following system dictionaries: EXA_ALL_OBJECT_SIZES or EXA_DBA_OBJECT_SIZES.

The dictionary lists all tables and schemas. Please note, the size of an object of the type 'SCHEMA' will be calculated as a sum of the sizes of all the objects in this schema. For views, functions, etc., the size represents the corresponding text size.

RAW_OBJECT_SIZE specifies the logical object size based on both data types and content. This size is comparable with the size of a CSV-file containing the same data. The value is calculated as a sum of sizes of stored data:

  • NULL –› 1 byte
  • fixed size type: depending on the type –› 1-80000 bytes
  • variable size type (varchar) –› number of bytes the value really uses.

MEM_OBJECT_SIZE specifies the real size of the database object. The value is calculated as a sum of the following:

  • a sum of all stored values after compression
  • structural overhead, e.g. length information for a VARCHAR value
  • overhead for replication
    Replication: Table content will be held in RAM on each node for better performance. This applies only to small tables (< 100.000 rows).

Please note, that for a new table some data blocks will be reserved. Therefore, MEM_OBJECT_SIZE of empty or very small tables can be bigger than RAW_OBJECT_SIZE. This does not imply a bad data compression ratio.

These system dictionaries provide you the total size of database objects in the cluster.

Example: Objects in the schema 'EXAMPLES'

        OBJECT_NAME              OBJECT_TYPE RAW_OBJE MEM_OBJE
        ------------------------ ----------- -------- --------
        TESTADR                  TABLE         492836  2615072
        ITEMS                    TABLE           1120    23489
        PAYMENTS                 TABLE            208    13956
        PRODUCTS                 TABLE           1191   230942
        NEW_CITIES               TABLE            132    80336
        V_PRODUCT_ORDERING       VIEW             578      578
        V_CUSTOMERS              VIEW             431      431
        V_PRODUCT_RATING_MONTHLY VIEW             755      755
        CUSTOMER_MOVES           TABLE              0    13920
        MYMAX                    FUNCTION         290      290
        DAYS_BETWEEN             FUNCTION         402      402
        COUNTRIES                TABLE             64    74655
        NEW_CUSTOMERS            TABLE            117   217756
        CITIES                   TABLE            167    80336
        ORDERS                   TABLE            399    13996
        TESTADR_CLEANSED         TABLE          41770  1822724
        CUSTOMERS                TABLE            365   222986
        RETURNED_ITEMS           TABLE            178    92179
        V_ORDERS                 VIEW             648      648
        V_RETURNS                VIEW             713      713
        V_PAYMENTS               VIEW             472      472
        V_CUSTOMER_RAITING       VIEW             861      861
        V_TRANSACTIONS           VIEW             615      615
        LAG                      TABLE             45     9299
        MYFUNCS                  PACKAGE          208      208

Data distribution

You can check the data distribution of a table by using an iproc()-function:

Example: Using of iproc()-function

SELECT
count(*), iproc()
FROM mytable
GROUP BY iproc()
ORDER BY 2;
        COUNT(*)            IPROC
        ------------------- -----
                    5327099     0
                    5325780     1
                    5333799     2
                    5319445     3
         
Category 1: Database Administration
Generated at Sat Aug 15 22:20:27 CEST 2020 using Jira 7.13.13#713013-sha1:4c0f5f5e3383570393fbbf3d2fd5de1eb4057c36.