Thursday, June 26, 2008

OLAP's cube crumbling around the edges


Business intelligence (BI) is essentially a set of best practices for building models to answer business questions. However, today’s BI best practices may be suboptimal for many enterprises’ decision-support requirements.

For most users, BI is a journey that’s been modeled and mapped out in advance by others, following a well-marked path through vast data sets. Data models, which must often be pre-built by specialists, generate or shape the design of such key BI artifacts as queries, reports, and dashboards. Essentially, every BI application is some data modeler’s prediction of the types of questions that users will want to ask of the underlying data marts. Sometimes, those predictions are little more than an educated guess--and are not always on the mark.

BI’s most ubiquitous data-modeling approach is the online analytical processing (OLAP) data structure known as a “cube.” The OLAP cube--essentially a denormalized relational database--sits at the heart of most BI data marts. OLAP cubes, usually implemented as multidimensional “star” or “snowflake” schemas, allow large recordsets to be quickly and efficiently summarized, sorted, queried, and analyzed. However, no matter how well designed the dimensional data models within any particular cube, users eventually outgrow these constraints and demand the ability to drill down, up, and across tabular recordsets in ways not built into the underlying data structures.

The chief disadvantage of multidimensional OLAP cubes is their inflexibility. Cubes are built by pre-joining relational data tables into fixed, subject-specific structures. One way of getting around these constraints is the approach known as relational OLAP, which retains the underlying normalized relational storage approach while speeding multidimensional query access through “projections.” However, relational OLAP also suffers from the need for explicit, upfront modeling of relationships within and among the underlying tabular data structures.
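
To make the inflexibility concrete, here is a minimal Python sketch, not any vendor's implementation. It assumes a tiny, made-up set of sales rows; the names SALES, build_cube, and query_cube are hypothetical. The "cube" is pre-aggregated over two fixed dimensions, so a question about a third dimension cannot be answered from it without a rebuild.

```python
from collections import defaultdict

# Hypothetical fact rows; field names and values are illustrative only.
SALES = [
    {"region": "East", "product": "Widget", "channel": "Web", "amount": 100},
    {"region": "East", "product": "Gadget", "channel": "Store", "amount": 250},
    {"region": "West", "product": "Widget", "channel": "Web", "amount": 75},
    {"region": "West", "product": "Widget", "channel": "Store", "amount": 125},
]


def build_cube(rows, dimensions, measure):
    """Pre-aggregate the measure over a fixed tuple of dimensions."""
    cube = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        cube[key] += row[measure]
    return dict(cube)


def query_cube(cube, dimensions, filters):
    """Sum the cube cells matching the filters; fail on unmodeled dimensions."""
    unknown = set(filters) - set(dimensions)
    if unknown:
        raise KeyError(f"dimension(s) not in cube: {sorted(unknown)}")
    total = 0
    for key, value in cube.items():
        cell = dict(zip(dimensions, key))
        if all(cell[d] == v for d, v in filters.items()):
            total += value
    return total


# The modeler guessed that users would slice by region and product only.
DIMENSIONS = ("region", "product")
CUBE = build_cube(SALES, DIMENSIONS, "amount")

west_widgets = query_cube(CUBE, DIMENSIONS, {"region": "West", "product": "Widget"})
print(west_widgets)  # 200

# "channel" was aggregated away when the cube was built.
try:
    query_cube(CUBE, DIMENSIONS, {"channel": "Web"})
except KeyError as err:
    print("cannot answer from cube:", err)
```

In this toy version the only remedy is to call build_cube again with "channel" added to the dimensions, which loosely mirrors the remodeling work described in this post.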

From the average end user’s point of view, all of this is mere plumbing--invisible and boring--until it prevents them from obtaining the new query tools, structured reports, and dashboards needed to do their jobs. One unfortunate consequence of OLAP cubes’ inflexibility is that requests for new BI applications inevitably wind up in a backlog of IT projects that can take weeks or months to deliver. What might seem a trivial thing to the end user--such as adding a new field or new calculation to an existing report--might represent a time-consuming technical exercise for the data modeling professional. Behind the scenes, this simple decision-support request might, beyond the front-end BI tweaks, also require remodeling of the data mart’s OLAP star schema, re-indexing of the data warehouse, revision of extract-transform-load (ETL) scripts, and retrieval of data from different transactional applications.

No one expects the OLAP cube to vanish completely from the BI landscape, but its role in many decision-support environments has been declining over the past several years. Increasingly, vendors are emphasizing new approaches that, when examined in a broader context, appear to be loosening OLAP’s lockhold on mainstream BI and data warehousing. The emerging paradigm for ad-hoc, flexible, multi-dimensional, user-driven decision support includes the following important approaches:


  • Automated discovery and normalization of dispersed, heterogeneous data sets through a pervasive metadata layer
  • Semantic virtualization middleware, which supports on-demand, logically integrated viewing and query of data from heterogeneous, distributed data sources without need for a data warehouse or any other centralized persistence node
  • On-the-fly report, query, and dashboard creation, which relies on dynamic aggregation of data, organization of that data within relevant hierarchies, and presentation of metrics that have been customized to the user or session context
  • Interactive data visualization tools, which enable user-driven exploration of the full native dimensionality of heterogeneous data sets, thereby eliminating the need for manual modeling and transformation of data to a common schema
  • Guided analytics tools, which support user-driven, ad-hoc creation of sharable, extensible models containing data, visualization, and navigation models for customizable decision-support scenarios
  • Inverted indexing storage engines, which support more flexible, on-the-fly assembly of structured data in response to ad-hoc queries than is possible with traditional row-based or column-based data warehousing persistence layers
  • Distributed in-memory processing, which enables continuous delivery of intelligence being extracted in real-time from millions of rows of data that originates in myriad, distributed data sources
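
The inverted-indexing idea in the list above can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not how any particular storage engine works; RECORDS, build_inverted_index, and lookup are hypothetical names. Every (field, value) pair points to the records that contain it, so an ad-hoc combination of criteria is answered by intersecting sets rather than by consulting a pre-modeled hierarchy.

```python
from collections import defaultdict

# Hypothetical records; field names and values are illustrative only.
RECORDS = [
    {"region": "East", "product": "Widget", "channel": "Web"},
    {"region": "East", "product": "Gadget", "channel": "Store"},
    {"region": "West", "product": "Widget", "channel": "Web"},
    {"region": "West", "product": "Widget", "channel": "Store"},
]


def build_inverted_index(records):
    """Map every (field, value) pair to the set of record ids containing it."""
    index = defaultdict(set)
    for record_id, record in enumerate(records):
        for field, value in record.items():
            index[(field, value)].add(record_id)
    return index


def lookup(index, **criteria):
    """Return the sorted ids of records matching all field=value criteria."""
    postings = [index.get((field, value), set()) for field, value in criteria.items()]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


INDEX = build_inverted_index(RECORDS)
print(lookup(INDEX, product="Widget", channel="Web"))  # [0, 2]
print(lookup(INDEX, region="West", channel="Store"))   # [3]
```

No field is privileged as a "dimension" here: any combination of fields can be queried without rebuilding the index.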

Unfortunately, this new decision-support paradigm has no pithy name or coherent best practices. If we were to call it the “post-OLAP” paradigm, that would give the false impression that OLAP cubes are obsolete, when in fact they are simply being virtualized and embedded within a more flexible Web 2.0 and SOA framework. We could call this the new “hypercube” paradigm, but that might give the mathematical purists among us a case of indigestion.

Whatever we choose to call this new era, look around you. It has already arrived. We can see this trend in the growing adoption of all of these constituent approaches in production BI environments everywhere. However, to date, few enterprises have combined these post-OLAP approaches in a coherent BI architectural framework.

But that day is rapidly coming to mainstream BI and data warehousing environments everywhere. OLAP’s hard-and-fast, cube-based approach is slowly but surely dissolving in this new era of more flexible, user-centric decision support.

Saturday, June 21, 2008

poem Menhir

MENHIR

For Elizabeth,
all of whose eight grown children
today survive her.

For Jean, whose four sons
have all attained middle age
in full possession.

Old or otherwise,
we are never quite ready
for mommy passing.

Friday, June 06, 2008

poem Neural Carpal

NEURAL CARPAL

Catch a breath between scenes.
An escape from dwelling.
A grounding from
flight.

A break from the sheen
of the towering
lights.

The entrancing connections of
repetitive trips.

And the steady burn of
one straight strip.

Thursday, June 05, 2008

poem Jade

JADE

I

Every pitch hits some
Differentiated and
Distinctly sharp mark.

Typically it’s at
Least the fact that this vendor’s
Lawyers nailed the name.

Undeniably
They alone in their space are
Thusly entitled.

II

How nicely every
Press release marks another
Milestone in their rise.

How capital was
Amassed, plans staked firm, and bright
Warm brains brought on board.

Now culminating
In this major or minor
Green innovation.

III

Remind me: I’ve been
Around this block several times.
Perhaps I’m jaded.

We’ve met, haven’t we--
Mandalay or Caesars--or
Purely virtual?

Hard to distinguish
This present polish from the
Glint of bygone gems.

poem Center of Conventions Exhibitions Conferences and Expositions

CENTER OF CONVENTIONS EXHIBITIONS CONFERENCES AND EXPOSITIONS

High ceiling. A day
before booths are built and some
of the comers come.

Facility. A
concrete plain, a room swept, kept
on ready’s near edge.

Walker walks. The guards
regard the span of several
leveled city blocks.

Tuesday, June 03, 2008

poem These Plains

THESE PLAINS

As green and glaring
As brown and bearing as a
Long Las Vegas block.

Monday, June 02, 2008

Relations with Analysts...the fifth of five

All:

Fifth of five questions sent to us before the Forrester Analyst Relations (AR) Council panel this past month, plus my thoughts:

  • Q: What are the one or two things you recommend all AR professionals do right now with respect to blogging?
  • A: One: Read the blogs. Two: Respect the bloggers--and treat them as full analysts--if and only if they behave as professional analysts in this and all other media through which they present their thoughts. In this latter regard, also respect the fact that each professional analyst may choose to surface different analytical perspectives, priorities, and voices through different channels--“color outside the lines,” as it were--show different sides of themselves--and if you wonder how it all coheres with and supports their 9-to-5 selves, just ask the analyst to explain themselves. The best of us are continually self-reinventing/evolving. The same James Kobielus stands behind this and all other things I choose to say, wherever, whenever, however--but don’t expect me to stay stuck in a rut of saying the same things on the same topics over and over. And don’t expect me to stay content broadcasting through only one channel forever. Public speaking, for example, is something I enjoy and haven’t done enough of. I enjoy participating in panel sessions, such as this latest one alongside Messrs. Gardner, Lusher, Hopkins, and Eunice. Actually, I enjoyed listening to them as much as wagging my own tongue. But nobody expects an analyst to just sit there and listen. I’m expected to dispense brilliance on demand. So I’m always spring-loading fresh thoughts to share. Through any channel.

Jim

Relations with Analysts...the fourth

All:

Another panel query, plus my toosense:

  • Q: What impact does microblogging--or Facebook, LinkedIn profiles, discussion groups, etc.--have on AR? Is this something that is around the corner, or not likely to be important?
  • A: First and foremost, you should target the analyst-ish voices who use the broadcast-ish media, such as “macroblogging” (i.e., plain old blogging of the sort you’re reading at this moment) and not the more “narrowcast-ish” voices, such as microblogging. Bottom line: widespread publication is key to influence. After all, that’s why you should target the bigger analyst firms, and the most widely read macrobloggers, with your AR initiatives, while at least keeping everyone else, especially the mid-tier analyst firms, but also the micro-pundits, such as the twitterers, in some inner or outer band of the AR loop. You should look at microbloggers as being part of the “long tail” of the analyst community these days and forever more--collectively they have considerable mass and gravitational sway, but individually they’re barely in your telescope. But in dealing with the long tail of the analyst community, there’s only so deep you’ll want to dive into the Kuiper Belt--much of this dark-matter commentary is undistinguished me-too pontificating, at best, or just flame-intensive blather at worst. Wait till some cometary fragment of it orbits closer to your home planet, and whips its long tail in your face, before giving it sustained attention.

Jim

Relations with Analysts...the third

All:

Another question posed to the panel, plus my response (not verbatim--it wasn’t being recorded--and I don’t have a phonographic memory--rather, this is a paraphrase of things I actually said live, plus stuff I should have said, and perhaps stuff others on the panel said that in my memory is now indistinguishably blurred into my own comments):

  • Q: Does the activity of blogging fundamentally change the definition of “who is an analyst” or “a reliable voice in the industry?” and--as a consequence--who AR professionals should work with?
  • A: Not really. My feeling is that analysts have always primarily been self-designated entities, and that blogging has simply intensified that trend by making it free, fast, and easy to self-publish. We’re all fundamentally self-designated--in the sense that many of us start our core business/career models by self-asserting through some channel(s)--perhaps a newsletter that we put out to see if the world will read/subscribe--and then, if widely read and perhaps even paid to opine, we the analysts, by virtue of that, get progressively validated by others (to greater or lesser degrees). Self-concept is key. We self-designate ourselves as somebody worth paying attention to. Our “know-something” self-image may precede the perception of same by others (hopefully, without much lag, and without too much disconnect between our self-perception and universal regard). We don’t usually start our professional lives as “leading industry analysts”--gosh knows I didn’t (and I’m not sure I completely fit that designator now)--I got initially validated by the good Dr. Peter G.W. Keen, and then by the fine editors of Network World, just three years out of grad school, in the late ’80s, when the former hired me as an IT industry analyst, and the latter started publishing my still-going column. So we pay our dues, sometimes for years, struggling to gain visibility, acceptance, smarts, connections, reputation, differentiation, starpower, etc. Building our brand, and aging into it. But no matter how big we get for our britches, we’re still all just alternative voices, none of us infallible, none indispensable, some of us more or less reliable/valuable than others. It does little good to demonize or marginalize any alternative analyst-ish voice over the long term--it will just stoke ill will against your firm or client.
To the extent that any analyst (multi- or one-person, established or new, widely read or largely ignored) takes an interest (even a cynical/adversarial one) in your firm/client, you, as AR professionals, should respect their right to their viewpoint. You should also work with them--at least to the extent of adding them to your mailing list, and responding to their questions and requests for briefings (within reason). Reliability and professionalism are proved out by people’s track records (e.g., checking facts, responding to objections, defending themselves with sound argumentation, honoring NDAs, etc.). Some of these people just want your attention/validation, and just need to vent/showboat. That’s why they started blogging in the first place. Those who intend to keep doing it long-term quickly adopt the professional ethics of a true analyst.

Jim