Measures are facts about the business – basically
items that you want to know about – Sales, Revenue, Profit,
Price etc and the Cube is the combination of these two such that
any the Measure at any point in a combination of dimensions is
known. In the vast majority of cases Cubes are sourced from relational
databases so SQL remains a key part of an overall solution.
A cube is constructed through an ETL - (E)xtract, (T)ransform
and (L)oad process
- Extract – Get the data from the source systems
- Transform – Convert the data into a common format
- Load – Get it into the cube
This is typically called the “Build” process and
relies primarily on good relational database design, SQL skills
and a performant build process is essential to a high quality
BI solution. Too often we see Cube build’s taking too long – hours,
days even – and alarm bells ring.
We also recommend to add a final step to the build process – known
as the Validate Step – which basically means that you should
make sure that your cube add’s up to your source data as
red faces ensue if they don’t.
The cube will also contain calculation’s which can be simple
Profit= Sales – Costs or extremely complex - Rolling 7
Day Volume weighted margins. These are exposed to the user as
new measures or members.
End Users then query the cube – rarely directly – but
through a front end tool (such as XLCubed) which constructs the
query and the end user is hidden from the complexity making it
easy to understand.
The technology behind cubes ensures that if you follow good
design principles that the results
of queries are blindingly fast. Navigation through the data becomes
intuitive resulting in speed of thought analysis and discovery
of actionable items that make a difference.
Sounds just what you need for Business Intelligence – and
in many cases you will be right – but there are important
considerations to assess with respect to application design.
Most, if not all, of the edge cases where cubes have challenges
require experience, lateral thought and compromise to be successful.
Our consultants and customers have examples of success in almost
every case.
Much of this discussion is based around our decade
of experience with Microsoft Analysis Services.
- Your breadth of reporting needs – Cubes perform best
with manageable sets of dimensions– rule of thumb 16
Dimensions – from both a manageability (human beings
understanding) and performance. HyperCubes with 100’s
of dimensions can and are successful but require careful design
and typically clarity of explanation.
- Data Volumes and Dimensional Rate of Change - Very high data
volumes (tens of millions of dimension members), 100’s
of millions of facts require careful thought and design particularly
with respect to dimensional hierarchies.
- How up to date do you need your data – The more up
to date the more challenging the cube build becomes.
- How much hierarchical navigation do your end users need – Cubes
excel where there are lots of hierarchies. If there are no
real hierarchies then the benefits lessen. If people only want
to report on detail data then there really is very little point.
- How much text data do you want to report on - Cubes do numbers
great – Text not so well.
- What is the nature of your reporting needs – If you
simply are producing list style reports which have little need
for further interactive analysis then Cubes aren’t for
you.
- What is the nature of what you want to do to your measures:
- Adding Up the Hierarchies – Great
- Averaging Up the Heirachies – Great
- Inception to user defined date – Care to be taken
- On the Fly Volume Weighting – Care to be taken
- Complex Triangulation calculation – Care to be taken
Ultimately it comes down to how many numbers of having to be
added up in the course of a query. If it’s millions to
billions then be careful.
Weighed against the benefits of the cube analytical capabilities
is the “Cost” of creating the cube environment both
from a software cost and a manpower cost. There has been a perception
in the market that Cube and data warehouse projects take a long
time are expensive and don’t deliver return.
This does not have to be the case and personally I think the
failure of such projects is much more to do with the definition
of the business problem and design failures along the way.
Unless a business knows what it wants to do with its information,
understands its organization and operational infrastructure and
the quality of its source data systems (of which there will be
many) then a project is typically doomed.
XLCubed consultants are trained to ask:
- First - the business question’s first (The Why – where
is the benefit going to come from)
- Second – Where does the data come from ( The how – we
try very hard in this question to look at the detail of the
data as I’m afraid the data never lies)
- Third - The technology infrastructure and environment
Business Intelligence projects can be completed in days, weeks
and months but without looking at projects this way a tightrope
is walked. Based on this analysis the cost benefit of the approach
can be taken and a decision taken. Sometimes an evolutionary
approach can be taken with a key focus being realizing benefits
early.
XLCubed products support both OLAP & SQL and our consultants
are fully versed in the costs and benefits.
So in summary:
OLAP = Best for interactive analysis and High
Level Aggregated reporting with average latency
SQL = Best for detail list style analysis and very high latency
In our applications we always look at the costs vs. the benefit and more than
often utilize a combination. |