Uploaded image for project: 'Solution Center'
  1. Solution Center
  2. SOL-204

Local and global joins

    XMLWordPrintable

    Details

    • Type: Explanation
    • Status: Published
    • Affects Version/s: EXASolution 4.2.0, EXASolution 5.0, EXASOL 6.0.0, Exasol 6.1.0
    • Fix Version/s: None
    • Component/s: EXASolution
    • Labels:
      None
    • Explanation:
      Hide

      In EXASolution there are two different types of joins: A local join and a global join. The type is not influenced by the SQL statement, but the internal data distribution (see SOL-136).

      Global Join

      If no data distribution is specified or database distribution does not fit for performing a local join (will be discussed later), a global join is executed.
      The following image will illustrate a global join:

      To perform a join, it is necessary to find matches for the join keys within the cluster. The example above illustrates the global match finding using arrows. Since the data is not distributed suitable to find matches on the local node only, it is necessary to use the network to find matches. E.g. Mr. Green (residing on node 3) is living in city 2 (New York) which is residing on another node (node 1). Thus the query requires a global index (see SOL-6) to perform the join.
      A global join causes network traffic by finding matches and transfering needed columns for matches to the responsible node.

      Local Join

      Using distribution keys (SOL-136) on the join columns (CUSTOMER.CITY_ID and CITY.ID in the example) causes a local join.

      ALTER TABLE CUSTOMER DISTRIBUTE BY CITY_ID;
      ALTER TABLE CITIES DISTRIBUTE BY ID;
      

      The optimizer recognizes that both tables have a suitable distribution to perform a local join. It is not necessary to find matches on other nodes and thereby no data needs to be transferred to the responsible nodes.
      The following image illustrates the same join with a suitable data distribution:

      Determining join types

      It is possible to determine the join type using a query profile (SOL-75).
      For a global join the keyword "GLOBAL" is shown in the PART_INFO column. The following table will show the difference between both join types in profiling tables (statement 1: global join, statement 2: local join) :

      STMT_ID PART_ID PART_NAME PART_INFO OBJECT_NAME
      1 1 COMPILE / EXECUTE    
      1 2 SCAN   CITY
      1 3 JOIN GLOBAL CUSTOMER
      2 1 COMPILE / EXECUTE    
      2 2 SCAN   CITY
      2 3 JOIN   CUSTOMER

      Differences in Performance

      Typically a local join can be executed much faster than a global join, since no network traffic is caused.

      Show
      In EXASolution there are two different types of joins: A local join and a global join. The type is not influenced by the SQL statement, but the internal data distribution (see SOL-136 ). Global Join If no data distribution is specified or database distribution does not fit for performing a local join (will be discussed later), a global join is executed. The following image will illustrate a global join: To perform a join, it is necessary to find matches for the join keys within the cluster. The example above illustrates the global match finding using arrows. Since the data is not distributed suitable to find matches on the local node only, it is necessary to use the network to find matches. E.g. Mr. Green (residing on node 3) is living in city 2 (New York) which is residing on another node (node 1). Thus the query requires a global index (see SOL-6 ) to perform the join. A global join causes network traffic by finding matches and transfering needed columns for matches to the responsible node. Local Join Using distribution keys ( SOL-136 ) on the join columns (CUSTOMER.CITY_ID and CITY.ID in the example) causes a local join. ALTER TABLE CUSTOMER DISTRIBUTE BY CITY_ID; ALTER TABLE CITIES DISTRIBUTE BY ID; The optimizer recognizes that both tables have a suitable distribution to perform a local join. It is not necessary to find matches on other nodes and thereby no data needs to be transferred to the responsible nodes. The following image illustrates the same join with a suitable data distribution: Determining join types It is possible to determine the join type using a query profile ( SOL-75 ). For a global join the keyword "GLOBAL" is shown in the PART_INFO column. The following table will show the difference between both join types in profiling tables (statement 1: global join, statement 2: local join) : STMT_ID PART_ID PART_NAME PART_INFO OBJECT_NAME 1 1 COMPILE / EXECUTE     1 2 SCAN   CITY 1 3 JOIN GLOBAL CUSTOMER — — — — 2 1 COMPILE / EXECUTE     2 2 SCAN   CITY 2 3 JOIN   CUSTOMER Differences in Performance Typically a local join can be executed much faster than a global join, since no network traffic is caused.
    • Category 1:
      Database Administration - Performance
    • Category 2:
      Database Administration - Data Organization

      Attachments

        Issue Links

          Activity

            People

            • Assignee:
              CaptainEXA Captain EXASOL
              Reporter:
              CaptainEXA Captain EXASOL
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated: