In the recent 2014 run of the TPC-H benchmarks, EXASOL managed briefly to break the scoreboard on the TPC website by posting performance benchmarks of over 10 million TPC performance units. Our nearest competitors struggle to get a few percent of that.
The most striking figures for me are in the top size category, the 100 TB database.
I remember seeing the press release when Hitachi and researchers at Japanese universities posted the very first benchmark at 100 TB – it was (and remains) a real achievement – you always have to respect people who are the first to do something.
Their score at 100 TB was 82,000 – EXASOL’s new record is 11.6 million. In other words, it’s not just a little faster, it’s over a hundred times faster. It’s like somebody being the first to run a mile in four minutes, and then someone beating that record the year after with a time of 1.7 seconds.
The Hitachi machine was also many times more expensive than EXASOL running on Dell hardware – so our price/performance ratio was even more astounding. The Japanese attempt cost the equivalent of $172.50 per performance unit – the EXASOL attempt cost 37 cents.
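If you want to check those comparisons yourself, the arithmetic is short. Here is a minimal Python sketch that uses only the figures quoted above; the variable names are my own.

```python
# Scores at 100 TB, as quoted in the text (TPC-H performance units).
hitachi_score = 82_000
exasol_score = 11_600_000

# How many times faster the new record is.
speedup = exasol_score / hitachi_score

# The running analogy: a four-minute mile sped up by the same factor.
four_minute_mile_seconds = 4 * 60
equivalent_mile_seconds = four_minute_mile_seconds / speedup

# Price per performance unit in US dollars, as quoted in the text.
hitachi_price_per_unit = 172.50
exasol_price_per_unit = 0.37
price_ratio = hitachi_price_per_unit / exasol_price_per_unit

print(f"Speed-up: about {speedup:.0f}x")
print(f"Equivalent mile time: about {equivalent_mile_seconds:.1f} seconds")
print(f"Price/performance advantage: about {price_ratio:.0f}x")
```

So the speed-up works out at roughly 141 times, and the cost per performance unit is lower by a factor of roughly 466.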
Something new and radically different is happening here – how are EXASOL getting these speeds on true Big Data databases that the other vendors can only dream of?
You are very welcome to find out more here: http://www.exasol.com/en/test-drive/
What exactly is a Performance Unit?
This idea of a “performance unit” is maybe a little difficult to understand.
You can measure the individual components of database hardware:
- Processor speed => Gigahertz
- Disk & network bandwidth => Megabits per second
- Memory access => Gigabits per second
The capacity of a given machine to do simple arithmetic can also be measured in units called “FLOPS” (floating point operations per second).
When we are talking about analytical queries, we have a complex mixture of all of the above hardware components and some very complex software – so FLOPS is not a suitable unit of measure.
What you need is some way of calculating a single score for a range of activities – like measuring an athlete’s performance across a range of running races, jumping and throwing events etc. …
… Which is exactly the basis of the sports of decathlon and heptathlon – the winner is the one who has the best total score from all the different events.
In a similar way, the TPC (Transaction Processing Performance Council) set out to create a score specifically for analytical databases.
To quote their website:
… It consists of a suite of business oriented ad-hoc queries and concurrent data modifications. The queries and the data populating the database have been chosen to have broad industry-wide relevance. This benchmark illustrates decision support systems that examine large volumes of data, execute queries with a high degree of complexity, and give answers to critical business questions.
The performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@Size), and reflects multiple aspects of the capability of the system to process queries. These aspects include the selected database size against which the queries are executed, the query processing power when queries are submitted by a single stream, and the query throughput when queries are submitted by multiple concurrent users. The TPC-H Price/Performance metric is expressed as $/QphH@Size.
In simple terms, you run a standard suite of queries, measure the time taken, and combine the timings, using geometric means rather than simple averages, to calculate a single “performance” value for the database.
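To make that a little more concrete, here is a rough Python sketch of how the composite score is put together, as I read the TPC-H specification: a "power" figure from a single-stream run, a "throughput" figure from a multi-stream run, and the geometric mean of the two as the final QphH@Size. The function names and the sample timings are invented for illustration and are not taken from any published result; the specification itself remains the authority on the exact rules.

```python
import math


def power_at_size(scale_factor, query_seconds, refresh_seconds):
    """Single-stream power: 3600 * SF divided by the geometric mean
    of all query and refresh-function timings (in seconds)."""
    timings = list(query_seconds) + list(refresh_seconds)
    geometric_mean = math.exp(sum(math.log(t) for t in timings) / len(timings))
    return 3600 * scale_factor / geometric_mean


def throughput_at_size(scale_factor, streams, queries_per_stream, elapsed_seconds):
    """Multi-stream throughput: queries completed per hour, scaled by SF."""
    return streams * queries_per_stream * 3600 / elapsed_seconds * scale_factor


def qphh_at_size(power, throughput):
    """Composite metric: geometric mean of power and throughput."""
    return math.sqrt(power * throughput)


# Hypothetical timings, chosen only to keep the arithmetic easy to follow.
scale_factor = 100                  # roughly a 100 GB database
query_seconds = [2.0] * 22          # the 22 benchmark queries
refresh_seconds = [2.0] * 2         # the 2 refresh functions

power = power_at_size(scale_factor, query_seconds, refresh_seconds)
throughput = throughput_at_size(scale_factor, streams=5,
                                queries_per_stream=22, elapsed_seconds=400.0)
score = qphh_at_size(power, throughput)

print(f"Power@Size:      {power:,.0f}")
print(f"Throughput@Size: {throughput:,.0f}")
print(f"QphH@Size:       {score:,.0f}")
```

Because geometric means are used throughout, one very slow query or a weak multi-user result drags the whole score down, much as one poor event does for a decathlete.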
And EXASOL’s score puts it in a league above everyone else.