Massively parallel data processing with the R programming language in EXASolution enables predictive analyses in real time
Nuremberg, 16 April 2013 – EXASOL AG, the specialist for in-memory databases, will in future support the R programming language for statistical calculations with EXASolution. The combination of R – the leading programming language for data scientists and business analysts – and EXASolution will have considerable benefits for data scientists and other analysis specialists in fields such as trade, production and market research: analyses will no longer require expensive computing resources to transfer data from one system to another. Because EXASolution scales linearly, even analyses of extremely large quantities of data can be performed in a fraction of the time previously required. Not least, the parallel execution of R code in EXASolution enables extremely high-performance queries.
Applications of parallel data processing with R
Steffen Weissbarth, CEO at EXASOL, names a few striking examples to illustrate the fields of application of EXASolution in combination with R: “Whether it is customer classification or shopping basket, churn or sentiment analysis, our solution is a real asset in everything involving the evaluation of customer opinions. A publishing house wishing to know which book its customers buy the most can analyse all customer reviews and determine what is said to be the book of the year, for example,” Weissbarth explains. EXASOL draws on the range of functions and the flexibility of the open-source programming language R and combines both with its turbo database, EXASolution. The outcome: high-performance data analysis, even for rapidly growing quantities of data.
With comparable databases, all data often first has to be transferred to the R environment, which costs time and significant amounts of money. EXASolution integrates R into every running cluster node, so incoming queries are distributed over all nodes. They can then be processed in parallel on a massive scale, because a separate R instance runs on each node.
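The principle of running the same script on every node and combining the partial results can be illustrated with a minimal sketch. It is written in Python rather than R, simulates the nodes with local threads, and does not use any EXASolution interface; the names `partition`, `node_script` and `run_on_cluster` are hypothetical and chosen only for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, node_count):
    """Split the rows round-robin into one slice per simulated node."""
    return [rows[i::node_count] for i in range(node_count)]


def node_script(rows):
    """Runs independently on each node and returns a partial (sum, count)."""
    return sum(rows), len(rows)


def run_on_cluster(rows, node_count=4):
    """Distribute the rows, run node_script on every slice in parallel,
    then combine the partial results into one overall average."""
    slices = partition(rows, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(node_script, slices))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


# Hypothetical customer ratings, standing in for a much larger table.
ratings = [5, 3, 4, 4, 1, 5, 2, 4]
result = run_on_cluster(ratings)
```

In this sketch each slice stays with its own worker and only the small partial results are brought together, which mirrors the idea of moving the analysis to the data instead of moving the data to a separate analysis environment.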
R – “the most powerful programming language for data analysis”
R is software for data analysis and visualisation. Developed in 1993 at the University of Auckland in New Zealand, R stands for an entirely new approach to dealing with all kinds of data. R offers a variety of statistical methods (linear and non-linear modelling, classic statistical test procedures, time-series analyses, cluster analyses, etc.) and tools for graphic visualisation. At the same time, R is highly extensible. One of R’s strengths lies in its flexibility: R experts can analyse both big data and smaller studies using the same code, the same tools and the same expertise, and can perform text mining and regression analyses. The standard settings already yield extremely good results, and the user retains control over all algorithms at all times. R is used worldwide in business and science by an estimated 2 million users. The community has developed more than 2,500 packages, which serve as building blocks for creating analytical models.
EXASOL AG, based in Nuremberg, develops and markets the high-performance database EXASolution, which is based on in-memory technology and was designed specifically for enterprise data warehouse applications and business intelligence solutions. This allows even extremely large volumes of data to be analysed and evaluated in a very short time. Thanks to its high performance and low administrative overhead, EXASolution not only gives businesses a valuable basis for decision-making drawn from their data, but also reduces the total cost of ownership. In April 2011, Gartner named EXASOL AG a “Cool Vendor” in the category of “Data Management and Integration 2011”. In 2012 and 2013, the solution was included in the Magic Quadrant for Data Warehouse Database Management Systems.
About EXASOL AG
EXASOL AG develops a data management system (EXASolution) that enables rapid analysis and evaluation of data. By using this sector-independent solution, which can analyse even large quantities of data (“big data” and “value data”), enterprises optimise business processes, generate a reliable basis for decisions, support their daily work and thereby obtain a sustainable competitive edge. EXASolution is a relational high-performance database that was specifically developed for data warehouse applications and business intelligence solutions. The in-memory database is used by companies from all industry sectors for time-critical analyses, comprehensive data research, planning and reporting. By integrating geodata and polystructured data, which can also be added through Hadoop systems, EXASolution offers (i.e. with the EXAPowerlytics module) an additional evaluation dimension for more efficient and ad hoc analyses. The easy-to-manage database can be integrated simply into existing IT infrastructures and requires minimal administrative maintenance while keeping investment and operating costs (TCO) low. EXASolution is also available as an appliance solution and as data warehousing as a service under the name of EXACloud. Companies such as XING, Sony Music, Olympus, media control, Zalando, stayfriends, Coop, IMS Health, Semikron, Webtrekk, econda and xplosion rely on EXASOL technology, which is made in Germany.