# Analytical functions

#### Details

• Type: Explanation
• Status: Published
• Affects Version/s: None
• Fix Version/s: None
• Component/s:
• Labels:
None
• Explanation:
Hide

# Background

Analytical functions are designed to address problems similar to the following

• Calculate a running total
• Find percentiles within a group
• Top-N queries
• Compute a moving average

Analytical functions compute an aggregate value based on a group of rows. They differ from aggregate functions in that they do not condense every group to one line, but return a value for every row.

Analytical functions are the last set of operations performed in a query except for the final ORDER BY clause. All joins and all WHERE, GROUP BY, and HAVING clauses are completed before the analytic functions are processed. Therefore, analytic functions can appear only in the select list or ORDER BY clause.

# Explanation

## Analytical Clause

With the help of PARTITION BY you can divide the table into multiple partitions based on the specified criteria with the effect that the results within each partition will be calculated independently from the rest of the table similar to the GROUP BY clause. If no PARTITION BY clause is stated, an analytical function always refers to the entire table.

The ORDER BY clause specifies the sort order of the rows within each partition. Note, that this can differ from the output sort order. If this clause is specified, not all the rows of a partition will be used for the result computation, but only a part of them, a so-called "Window". By default, this window consists of all the rows of the respective partition up to the current row regarding the given sort criteria.

For the following examples, we use a view V_TRANSACTIONS, which shows all orders, returns, and payments of each customer.

### Example 1: Without analytical clause

```SELECT
customer_no
, last_name
, transaction_date 'DATE'
, transaction_type type
, transaction_sum 'SUM'
-- calculates the balance of all customers
, sum(transaction_sum) over() total
FROM v_transactions
ORDER BY last_name, transaction_date;
```

The result will be calculated by using all the table rows, but unlike the aggregation functions will be displayed for each row of the table:

```CUS LAST_NAME    DATE       TYPE    SUM      TOTAL
--- ------------ ---------- ------- -------- --------
2 Adams        2008-10-11 ORDER     -31.32   -77.46
2 Adams        2008-10-25 ORDER     -31.89   -77.46
2 Adams        2008-10-25 RETURN     12.64   -77.46
2 Adams        2008-10-27 PAYMENT     6.04   -77.46
2 Adams        2008-11-02 ORDER      -8.97   -77.46
2 Adams        2008-11-03 ORDER     -19.25   -77.46
2 Adams        2008-11-04 PAYMENT    31.89   -77.46
2 Adams        2008-11-10 RETURN      8.97   -77.46
2 Adams        2008-11-25 ORDER     -11.83   -77.46
2 Adams        2008-11-30 PAYMENT    43.72   -77.46
7 Black        2008-10-20 ORDER     -21.17   -77.46
7 Black        2008-11-04 PAYMENT    21.17   -77.46
7 Black        2008-11-12 ORDER     -17.03   -77.46
7 Black        2008-11-17 PAYMENT    17.03   -77.46
4 Brown        2008-10-19 ORDER     -27.49   -77.46
4 Brown        2008-10-25 ORDER     -15.13   -77.46
4 Brown        2008-11-03 PAYMENT    27.49   -77.46
4 Brown        2008-11-21 ORDER     -21.04   -77.46
4 Brown        2008-11-30 PAYMENT    36.17   -77.46
8 Green        2008-10-14 ORDER     -16.48   -77.46
8 Green        2008-11-03 RETURN      8.24   -77.46
8 Green        2008-11-03 PAYMENT     8.24   -77.46
8 Green        2008-11-09 ORDER     -43.61   -77.46
3 Jones        2008-10-11 ORDER     -42.02   -77.46
3 Jones        2008-10-25 ORDER     -34.38   -77.46
3 Jones        2008-11-03 PAYMENT    26.84   -77.46
3 Jones        2008-11-03 RETURN      7.59   -77.46
3 Jones        2008-11-10 ORDER     -31.89   -77.46
3 Jones        2008-11-15 PAYMENT    73.86   -77.46
6 Marsch       2008-11-20 ORDER     -14.37   -77.46
6 Marsch       2008-12-01 PAYMENT    14.37   -77.46
12 Meier        2008-11-17 ORDER     -22.27   -77.46
5 Peters       2008-11-05 ORDER      -8.97   -77.46
5 Peters       2008-11-15 ORDER      -5.79   -77.46
5 Peters       2008-11-20 PAYMENT     8.97   -77.46
9 White        2008-11-17 ORDER      -5.79   -77.46
10 Young        2008-10-12 ORDER     -20.50   -77.46

38 rows in resultset.
```

### Example 2: Using PARTITION BY clause

```SELECT
customer_no
, last_name
, transaction_date 'DATE'
, transaction_type type
, transaction_sum 'SUM'
-- calculates the balance of all customers
, sum(transaction_sum) over() total
-- calculates the balance of each customer
, sum(transaction_sum) over(partition BY customer_no) cust_balance
FROM v_transactions
ORDER BY last_name, transaction_date;
```

PARTITION BY customer_no causes the sum to be calculated separately for each different customer_no.

```CUS LAST_NAME    DATE       TYPE    SUM      TOTAL    CUST_BAL
--- ------------ ---------- ------- -------- -------- --------
2 Adams        2008-10-11 ORDER     -31.32   -77.46     0.00
2 Adams        2008-10-25 ORDER     -31.89   -77.46     0.00
2 Adams        2008-10-25 RETURN     12.64   -77.46     0.00
2 Adams        2008-10-27 PAYMENT     6.04   -77.46     0.00
2 Adams        2008-11-02 ORDER      -8.97   -77.46     0.00
2 Adams        2008-11-03 ORDER     -19.25   -77.46     0.00
2 Adams        2008-11-04 PAYMENT    31.89   -77.46     0.00
2 Adams        2008-11-10 RETURN      8.97   -77.46     0.00
2 Adams        2008-11-25 ORDER     -11.83   -77.46     0.00
2 Adams        2008-11-30 PAYMENT    43.72   -77.46     0.00
7 Black        2008-10-20 ORDER     -21.17   -77.46     0.00
7 Black        2008-11-04 PAYMENT    21.17   -77.46     0.00
7 Black        2008-11-12 ORDER     -17.03   -77.46     0.00
7 Black        2008-11-17 PAYMENT    17.03   -77.46     0.00
4 Brown        2008-10-19 ORDER     -27.49   -77.46     0.00
4 Brown        2008-10-25 ORDER     -15.13   -77.46     0.00
4 Brown        2008-11-03 PAYMENT    27.49   -77.46     0.00
4 Brown        2008-11-21 ORDER     -21.04   -77.46     0.00
4 Brown        2008-11-30 PAYMENT    36.17   -77.46     0.00
8 Green        2008-10-14 ORDER     -16.48   -77.46   -43.61
8 Green        2008-11-03 RETURN      8.24   -77.46   -43.61
8 Green        2008-11-03 PAYMENT     8.24   -77.46   -43.61
8 Green        2008-11-09 ORDER     -43.61   -77.46   -43.61
3 Jones        2008-10-11 ORDER     -42.02   -77.46     0.00
3 Jones        2008-10-25 ORDER     -34.38   -77.46     0.00
3 Jones        2008-11-03 PAYMENT    26.84   -77.46     0.00
3 Jones        2008-11-03 RETURN      7.59   -77.46     0.00
3 Jones        2008-11-10 ORDER     -31.89   -77.46     0.00
3 Jones        2008-11-15 PAYMENT    73.86   -77.46     0.00
6 Marsch       2008-11-20 ORDER     -14.37   -77.46     0.00
6 Marsch       2008-12-01 PAYMENT    14.37   -77.46     0.00
12 Meier        2008-11-17 ORDER     -22.27   -77.46   -22.27
5 Peters       2008-11-05 ORDER      -8.97   -77.46    -5.79
5 Peters       2008-11-15 ORDER      -5.79   -77.46    -5.79
5 Peters       2008-11-20 PAYMENT     8.97   -77.46    -5.79
9 White        2008-11-17 ORDER      -5.79   -77.46    -5.79
10 Young        2008-10-12 ORDER     -20.50   -77.46     0.00
10 Young        2008-10-20 PAYMENT    20.50   -77.46     0.00

38 rows in resultset.
```

### Example 3: With PARTITION BY and ORDER BY clause

```SELECT
customer_no
, last_name
, transaction_date 'DATE'
, transaction_type type
, transaction_sum 'SUM'
-- calculates the balance of all customers
, sum(transaction_sum) over() total
-- calculates the balance of each customer
, sum(transaction_sum) over(partition BY customer_no) cust_balance
-- calculates the running balance of each customer
, sum(transaction_sum) over(partition BY customer_no ORDER BY transaction_date) running_balance
FROM v_transactions
ORDER BY last_name, transaction_date;
```

The ORDER BY clause causes the calculation of a cumulated sum for each customer and can be used for displaying a detailed customer balance.

```CUS LAST_NAME    DATE       TYPE    SUM      TOTAL    CUST_BAL RUNNING_
--- ------------ ---------- ------- -------- -------- -------- --------
2 Adams        2008-10-11 ORDER     -31.32   -77.46     0.00   -31.32
2 Adams        2008-10-25 ORDER     -31.89   -77.46     0.00   -63.21
2 Adams        2008-10-25 RETURN     12.64   -77.46     0.00   -50.57
2 Adams        2008-10-27 PAYMENT     6.04   -77.46     0.00   -44.53
2 Adams        2008-11-02 ORDER      -8.97   -77.46     0.00   -53.50
2 Adams        2008-11-03 ORDER     -19.25   -77.46     0.00   -72.75
2 Adams        2008-11-04 PAYMENT    31.89   -77.46     0.00   -40.86
2 Adams        2008-11-10 RETURN      8.97   -77.46     0.00   -31.89
2 Adams        2008-11-25 ORDER     -11.83   -77.46     0.00   -43.72
2 Adams        2008-11-30 PAYMENT    43.72   -77.46     0.00     0.00
7 Black        2008-10-20 ORDER     -21.17   -77.46     0.00   -21.17
7 Black        2008-11-04 PAYMENT    21.17   -77.46     0.00     0.00
7 Black        2008-11-12 ORDER     -17.03   -77.46     0.00   -17.03
7 Black        2008-11-17 PAYMENT    17.03   -77.46     0.00     0.00
4 Brown        2008-10-19 ORDER     -27.49   -77.46     0.00   -27.49
4 Brown        2008-10-25 ORDER     -15.13   -77.46     0.00   -42.62
4 Brown        2008-11-03 PAYMENT    27.49   -77.46     0.00   -15.13
4 Brown        2008-11-21 ORDER     -21.04   -77.46     0.00   -36.17
4 Brown        2008-11-30 PAYMENT    36.17   -77.46     0.00     0.00
8 Green        2008-10-14 ORDER     -16.48   -77.46   -43.61   -16.48
8 Green        2008-11-03 RETURN      8.24   -77.46   -43.61    -8.24
8 Green        2008-11-03 PAYMENT     8.24   -77.46   -43.61     0.00
8 Green        2008-11-09 ORDER     -43.61   -77.46   -43.61   -43.61
3 Jones        2008-10-11 ORDER     -42.02   -77.46     0.00   -42.02
3 Jones        2008-10-25 ORDER     -34.38   -77.46     0.00   -76.40
3 Jones        2008-11-03 PAYMENT    26.84   -77.46     0.00   -49.56
3 Jones        2008-11-03 RETURN      7.59   -77.46     0.00   -41.97
3 Jones        2008-11-10 ORDER     -31.89   -77.46     0.00   -73.86
3 Jones        2008-11-15 PAYMENT    73.86   -77.46     0.00     0.00
6 Marsch       2008-11-20 ORDER     -14.37   -77.46     0.00   -14.37
6 Marsch       2008-12-01 PAYMENT    14.37   -77.46     0.00     0.00
12 Meier        2008-11-17 ORDER     -22.27   -77.46   -22.27   -22.27
5 Peters       2008-11-05 ORDER      -8.97   -77.46    -5.79    -8.97
5 Peters       2008-11-15 ORDER      -5.79   -77.46    -5.79   -14.76
5 Peters       2008-11-20 PAYMENT     8.97   -77.46    -5.79    -5.79
9 White        2008-11-17 ORDER      -5.79   -77.46    -5.79    -5.79
10 Young        2008-10-12 ORDER     -20.50   -77.46     0.00   -20.50
10 Young        2008-10-20 PAYMENT    20.50   -77.46     0.00     0.00

38 rows in resultset.

```

## Analytical functions

By using the following functions, which can be used both as analytical and aggregate function, all the analytical clauses are allowed, none is required though:

• AVG: Calculation of the arithmetic mean
• COUNT: Calculation of the count of values
• MIN: Determining of the smallest value
• MAX: Determining of the greatest value
• SUM: Summarizing of values

### DENSE_RANK, RANK, and ROW_NUMBER

These functions have no argument, the ORDER BY clause is required. They return the rank or row number within the partition, whereas the ORDER BY clause determines the ranking or numbering.

For equal values of the sort expression, both RANK and DENSE_RANK return a common rank, however, DENSE_RANK doesn't skip the following values as in the case with RANK. ROW_NUMBER returns unique numbers in any case, equal values will receive a random ordering.

### Example 4: RANK, DENSE_RANK and ROW_NUMBER()

```SELECT
product_no
, product_name
, order_month
, sold_items
, row_number() over( partition BY order_month ORDER BY nvl(sold_items, 0) DESC nulls last) num
, rank() over( partition BY order_month ORDER BY nvl(sold_items, 0) DESC nulls last) rating
, dense_rank() over( partition BY order_month ORDER BY nvl(sold_items, 0) DESC nulls last) dense
FROM
(
SELECT
product_no
, product_name
, trunc(order_date, 'MM') order_month
, sum(count) sold_items
, sum(sum) revenue
FROM V_PRODUCT_ORDERING
GROUP BY  product_no, product_name, trunc(order_date, 'MM') HAVING sum(count) > 0
)
;
```
```PRO PRODUCT_NAME                        ORDER_MONT SOLD_ITE NUM     RATING  DENSE
--- ----------------------------------- ---------- -------- ------- ------- -------
8 The Love Dare                       2008-10-01        4       1       1       1
5 The Tales of Beedle the Bard        2008-10-01        3       2       2       2
7 Barefoot Contessa Back to Basics    2008-10-01        3       3       2       2
1 The Shack                           2008-10-01        2       4       4       3
2 Breaking Dawn                       2008-10-01        2       5       4       3
3 The Eclipse                         2008-10-01        2       6       4       3
10 The Snowball                        2008-10-01        1       7       7       4
6 Twilight The Twilight               2008-10-01        1       8       7       4
9 Brisingr                            2008-10-01        1       9       7       4
4 New Moon                            2008-10-01        1      10       7       4
5 The Tales of Beedle the Bard        2008-11-01        4       1       1       1
17 Dreams from My Father               2008-11-01        4       2       1       1
10 The Snowball                        2008-11-01        2       3       3       2
8 The Love Dare                       2008-11-01        2       4       3       2
12 Outliers                            2008-11-01        2       5       3       2
15 The Last Lecture                    2008-11-01        1       6       6       3
16 A Mercy                             2008-11-01        1       7       6       3
3 The Eclipse                         2008-11-01        1       8       6       3
6 Twilight The Twilight               2008-11-01        1       9       6       3
13 American Lion                       2008-11-01        1      10       6       3
4 New Moon                            2008-11-01        1      11       6       3

21 rows in resultset.
```

### LAG and LEAD

These functions facilitate access to the values of other rows. ORDER BY is mandatory, PARTITION BY is optional. An offset determines, which row will be used:

• offset must be >=0.
• offset = 0 address the current row.
• offset can be an expression
• default value for offset is 1

The default value will be used if there is no row with a given offset. This can, for example, occur, if the current row is the first one in the partition. The default value can also be an expression. If the default value is not set, NULL will be used.

### Syntax

```        LEAD/LAG(<expr>[, <offset>[, <default>]]) OVER([partition BY <expr>] ORDER BY <expr>)

```

### Example 5: LAG

```        WITH basis AS (
SELECT
customer_no, order_no, order_date,
lag(order_date) over(
partition BY customer_no
ORDER BY order_date
) AS prev_date
FROM orders
)
SELECT
customer_no,
count(*) order_count,
avg(days_between(order_date, prev_date)) AS avg_days
FROM basis
GROUP BY customer_no
ORDER BY customer_no;

```

LAG(order_date) will be used in the sub-select 'basis' to get for each order the date of the previous one. According to given partitioning and sorting rules, the result is a value of a previous row (offset=1). If there is no previous row, the result is NULL (=default).

On the basis of both order dates, we can calculate how long was the pause between each two orders. This value can be then used for the aggregation. This second step is necessary, thus analytical function in the select list cannot be used in expressions.

```        CUSTOMER_NO ORDER_COUNT AVG_DAYS
----------- ----------- ---------
2          5       11.25
3          3       15
4          3       16.5
5          2       10
6          1
7          2       23
8          2       26
9          1
10          1
12          1

```
Show
• Category 1:
SQL

#### People

• Assignee:
Captain EXASOL
Reporter:
Captain EXASOL