Saturday, February 4, 2012

OLAP tools review: Basis for comparison of OLAP tools.


This is a second article from "OLAP tools review" series. To read the first one open this link.

There are two important sides of any OLAP systems: "Which questions can be addressed to OLAP?" and "Which answers can be received from OLAP?" In this article we'll review these topics in details.

Questions to OLAP System

All questions to OLAP Systems are about figures. As we know, OLAP systems are intended to answer business questions like: "What's a company revenue this year?" or "How does it compare to a previous year?". These questions can easily be answered by searching the appropriate information in a Data Warehouse. More advanced question will be: "Which revenue the company may expect next year?". Existing OLAP usually answer this question by calculating a formula provided by an expert.

Rather complex questions can be constructed with modern OLAP systems. For building these complex questions OLAP provides a form of conversation between a system and a user. Suppose you asked for a comparison bar chart for expenses on entertainments for each month of this year. Then next question may simply be "Why in April I spent so much?". System should remember the context of conversation and present to you a list of items under entertainment category for April of this year.

When complex question is built, OLAP usually provide an ability to save it, and then quickly run next time.

Answers of OLAP System

OLAP Systems gives answers in terms of figures. It can be a single number, but usually it's a row of numbers of a table of numbers. For our expenses example a table may contain months by rows and expenses categories by columns. See table below as example. A table helps you to compare numbers to each other. Each number will usually represent an aggregate value. The aggregates are not necessary should be sums, they may be average values, counts, ratios, and a lot more types.


Categories
Months of 2011 Food Entertainment Car
January $407 $224 $734
February $451 $311 $354
March $376 $74 $579

So we can classify all answers by dimensionality. First three types of answers are used in practice, and the forth I introduce as a possible direction of improvement of existing OLAP systems:
  • Single number. Example: Total expenses in May.
  • List of numbers. Example: Total expenses in each month of a year.
  • Table of numbers. Example: Total expenses in each month of a year by categories.
  • Several tables of numbers. Suppose we have a slider component for changing date and table will display data for that day. This may be considered as set of tables.

Answers visualization

A huge role in easy comprehension of these numbers plays visualization. Instead of reviewing a table of figures for several minutes a single glance at a chart or graph will tell you everything you need in a moment. As example system may show you a bar chart with expenses for each month of a year, and you'll easily identify the higher bar which denotes a month with the biggest expenses.

Let's think about visualization of different types of answer dimensionality:
  • Single number. Visualization usually is not required, because nothing to compare this number with.
  • List of numbers. Many ways to visualize (A bar chart, a pie chart, line chart)
  • Table of numbers. Still many ways for great visualization (A stacked-bar chart, multi-line chart, many pie charts)
  • Several tables of numbers. Difficult to visualize, an animation may help.

Initial context of  conversation

OLAP Systems can answer very complex questions, and it's difficult to classify them all. But OLAP usually supports conversations. That means user instead of writing a complex question is asking a series of short questions which leads to desired result. Along this process system should remember the context of conversation.

Many OLAP systems start conversation from grand total values and allow user to "drill down" to details. And it's reasonable that user wants to see the whole picture at first and then investigate the details. But it's also true that business users prefer to look at current situation, like expenses for last year. And seeing total expenses looks useless.

On the other hand, user which opens OLAP system for the first time, may be unfamiliar with Data Warehouse structure and it may be very helpful to see several most recent rows with  fine-grained data from DW, and then start data exploration in a bottom-up fashion. But I don't remember seeing OLAP system which adopts bottom-up style of navigation.

Navigation in figures ocean

Common operations which user can perform in OLAP system can be split into following classes: Result table or chart definition, data filtering, changing detail level, calculations.

Result table or chart setup: 
  • Choose the measures. Example: Show total expenses.
  • Choose Dimension or combination of Dimensions for columns. Example: Compare expenses by categories.
  • Choose Dimension or combination of Dimensions for rows. Example: Compare expenses by categories(on columns) and by months(on rows).
  • Sort data in a table.

Data filtering:
  • Select dimension members for which to display data in the result table or chart.
  • Filter data in result table by members of other dimensions(which don't used in result table definition)
  • Filter data by Measures values, or by arithmetic formulas.

Changing detail level:
  • Drill level down.  Example: If we see total expenses for a year "drill down" will open expenses my months of the year.
  • Drill level up. It’s an opposite of drill down.
  • Drill Through. It shows the most fine-grained rows from data warehouse which comprise selected value.

Calculations:
  • Calculate column or row values with user defined formula. OLAP systems from this review support such functionality in less or more extent. But I think a lot of improvements can be made to existing OLAP systems in this direction.
  • Calculate column or row values with a predefined function.
  • Change aggregation function for Measures.

Let me describe one simple idea which OLAP tools may adopt in future to simplify advanced calculations.

Do you like using Excel? I do. Excel allows entering formulas with mouse clicks. Excel suggests function names to user. Excel highlights cells used in formula with different colors. All you need to learn to start using formulas in Excel is that formulas start with '=' sign.
Implementing functionality like that in OLAP system opens the door to completely new usage scenarios for OLAP systems. Suppose accountant use data warehouse to calculate taxes, and don't need to ask technical specialist to change some formula every time tax rate or tax policies changes.

In the next article I'll review JPivot tool.

0 comments:

Post a Comment