Saturday, October 18, 2008

poem An Insinuation


Whether the weather
blows over or not I'll be
sitting here chilling.

Every God-given
day, damned or otherwise, is
another excuse.

A day to rise and
be still in situ and in
my very sinew.

poem Drench


Could rain fall as snow
and in flight form a bow of
crystalline color.

Leaves would float as light
and diffuse the heavenly
silence of autumn.

Jumble the remains
of drifting summer. Soak and
drench the memory.

Wednesday, October 15, 2008

Ellison hypes Oracle's data warehouse appliance

This story appeared on Network World.


By James Kobielus, Network World, 10/07/2008

The high-end data warehousing wars are fast upon us. Vendors are launching ever more scalable DW solutions. And they're delivering them with more aggressive -- and slippery -- performance claims.

The DW industry's new battlefront is petabyte scalability. This refers to a DW platform's ability to ingest, store, process and deliver an order of magnitude more data than today's typical terabyte-size warehouses. In this regard, the competitive high ground is still held by pioneering DW-appliance provider Teradata. That vendor recently released a high-end, shared-nothing, massively parallel processing (MPP) DW appliance that can scale to an astounding 10 petabytes across as many as 1,024 compute/storage nodes.
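To put that Teradata claim in rough perspective, here is a back-of-the-envelope sketch of the average capacity each node would carry. The 10 PB and 1,024-node figures come from the claim above; the assumption that data is distributed evenly across nodes is mine, and real shared-nothing deployments skew from it.

```python
# Back-of-the-envelope: average data per node in a 10 PB, 1,024-node
# shared-nothing MPP cluster, assuming perfectly even distribution.
TOTAL_PETABYTES = 10
NODES = 1024

total_terabytes = TOTAL_PETABYTES * 1024  # 1 PB = 1,024 TB
tb_per_node = total_terabytes / NODES

print(f"{tb_per_node:.1f} TB per node on average")  # prints "10.0 TB per node on average"
```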

Oracle and HP recently joined the petabyte battle with all guns blazing. At Oracle's annual OpenWorld conference, they jointly announced general availability of a new petabyte-scalable DW appliance: the HP Oracle Database Machine, which includes the HP Exadata Storage Server. They touted its "extreme" performance and scaling features, bolstering those claims through public demos and beta-tester testimonials.

Most significant, they enlisted none other than Oracle CEO Larry Ellison and HP honchos Mark Hurd and Ann Livermore to unveil the new offering from the conference's main stage.

Clearly, the HP Oracle Database Machine is highly strategic for both companies. It provides a platform for Oracle to sell more database licenses and for HP to sell more server and storage hardware into DW deployments. It will almost certainly get the partners onto vendor short lists, alongside Teradata, for petabyte-scale DW solutions, which are increasingly being deployed in such vertical markets as telecommunications, government and financial services.

Also, it helps them blunt the momentum of DW appliance up-and-comer Netezza, whose platform, like the new Oracle/HP offering, performs SQL processing in an intelligent storage layer, thereby accelerating queries and table scans against very large data sets.

For sure, the recent Oracle/HP announcement was substantial and has shifted the competitive dynamics in the high-end DW market. But it was also an exercise in pure, albeit well-engineered, marketing hype. Predictably, it triggered an immediate firestorm of heated retorts from aggrieved competitors, which will almost certainly escalate in coming months.

In the fog of war, the first casualty is perspective, and that's certainly the case in this competitive fracas. Buyers of DW solutions should exercise extreme caution when evaluating the new Oracle/HP solution vis-à-vis comparably scalable offerings from Teradata, Sybase, Greenplum, IBM and others. You'll definitely need to apply the standard caveats to Larry Ellison's bold price/performance claims for his new monster DW appliance. And considering that Ellison was employing the native marketing speak of the DW arena, you'll need to apply the same grains of salt to his competitors' tales. Everybody in the DW market presents their self-serving performance story in much the same way as Oracle's big kahuna.

For starters, Ellison studded his talk with what might be regarded as the "virtuous coefficients" of DW performance enhancement: 10x, 20x, 30x, 40x, 50x, as high as 72x speedups have been documented by beta testers of the HP Oracle Database Machine. Of course, every DW professional knows that these performance boosts are extremely sensitive to myriad implementation factors, such as what you put in a SQL "where" clause, how many table joins you perform, whether and how you compress the data and so forth.

The performance enhancements are also relative to whatever DW configuration -- well-engineered or otherwise -- the beta testers had implemented prior to getting their hands on this shiny new uber-appliance. Note the tag line near the end of Ellison's presentation (emphasis added): "10-50x faster than current Oracle data warehouses."

Also, Oracle's big boss hammered Teradata and Netezza with benchmarks that were ostensibly apples-to-apples. However, Ellison's presentation seriously lacked the detailed footnoting that would be necessary to ascertain that he was indeed comparing his product against comparably configured instances of rival offerings that were processing comparable workloads. Where are those fast-talking, TV-commercial pharmaceutical disclaimer readers when we need them?

But even without aid of a magnifying glass, it was clear that Ellison was comparing his appliance directly to the Teradata 2550 and Netezza 10100 on the basis of a single common denominator, configuration-wise: They all have a one-rack footprint. That's an odd basis for comparison. Those competitors do in fact have higher-end DW-appliance models, with more capacity, that might serve as a better basis for performance and price comparisons. Somehow, though, Oracle chose to overlook that fact. Why did it size up a 168-terabyte Oracle/HP machine against 43-terabyte offerings from Teradata and Netezza?

Furthermore, Oracle somehow failed to benchmark these same solutions on the full range of performance criteria that actually matter in DW and business intelligence (BI) deployments, such as query response times, concurrent usage, mixed workload support, load speed and transaction throughput. Of course, even if Oracle had provided reliable, unbiased, third-party benchmarks in all of these areas, they would have been useless unless the company applied them to comparably configured Teradata and Netezza offerings.

And the price-comparison chart -- including those same rival solutions -- was also seriously deficient. Most notably, the HP Oracle Database Machine's overall price, as presented by Ellison, lacked the requisite Oracle Real Application Clusters license fees. However, the stated prices for the Teradata and Netezza solutions definitely included the database management systems that come configured into those offerings (though, of course, Netezza has a free open source database, PostgreSQL, at the heart of its offering). So when you factor in all relevant costs, the new HP Oracle Database Machine doesn't look quite as attractive on the common denominator of acquisition price per usable terabyte of production data.
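The "acquisition price per usable terabyte" metric the column leans on can be sketched in a few lines. All dollar figures and capacities below are illustrative placeholders of my own, not the vendors' actual 2008 list prices; the point is simply how omitting a required software license distorts the comparison.

```python
# Sketch of the price-per-usable-terabyte comparison, with made-up numbers.
def price_per_usable_tb(hardware_cost, software_license_cost, usable_tb):
    """Total acquisition cost divided by usable capacity in terabytes."""
    return (hardware_cost + software_license_cost) / usable_tb

# Quoting an appliance without its required database license fees
# (second argument = 0) makes it look substantially cheaper per TB.
with_license = price_per_usable_tb(650_000, 350_000, 100)
without_license = price_per_usable_tb(650_000, 0, 100)

print(with_license, without_license)  # prints "10000.0 6500.0"
```

The same distortion works in reverse for competitors whose quoted price already bundles the DBMS, which is exactly the asymmetry the column flags.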

Finally, Ellison, like most DW vendors, implicitly presented his solution's architectural approach as the gold standard against which all others must be disparaged. That, of course, is a highly debatable proposition.

For one thing, Oracle Database 11g -- the software heart of the appliance -- is still a general-purpose relational DBMS that has one foot in DW but another solidly planted in online transaction processing (OLTP). By contrast, Teradata, Sybase, Netezza, Greenplum and other competitors have optimized their DBMSs for DW from the get-go, and do not support OLTP.

Also, Oracle's new appliance implements a shared-disk storage-area network architecture. By most accounts, shared-disk approaches are inherently less scalable than the shared-nothing MPP approach at the heart of DW solutions from, among others, Teradata and Greenplum.

And the Exadata storage layer can only parallelize SQL queries, and only against structured relational data. In its present incarnation, the Exadata storage grid cannot be used to execute a wider range of analytic functions or handle unstructured and semi-structured data types. Consequently, it is not applicable to the new generation of "content DWs" or for any of the in-database analytics that might be applied to the myriad nonrelational data types that reside in those warehouses.

Of course, Larry Ellison didn't go into anywhere near this degree of industry context. His job was and is to sell the world on an important new Oracle product and partnership, and he did so quite well. We shouldn't expect his direct competitors to be any more frank about their respective DW solutions' limitations. No commercial DW platform can optimally address every business-analytics requirement, now and future.

Sorting through the field of high-end DW solutions is getting more difficult, due to the diversity of vendor approaches. IT professionals need to read between the lines of DW vendors' increasingly breathtaking product announcements -- and talk to a consultant or analyst in the know -- before deciding if Oracle, HP or any other solution provider is truly breaking new ground.

If you find all of these complexities and caveats extremely confusing, and you're having trouble deciding which high-end appliance-based solutions can support the most extreme petabyte-scale workloads, welcome to the new DW market.

All contents copyright 1995-2008 Network World, Inc.

Wednesday, October 08, 2008

imho Re “Forensic Architecture and other lessons from SOA land”


See Duane Nickull’s post and Anne Thomas Manes’ musings on the same.

First off, I started reading into this because I wasn’t, and still am not, clear on why Nickull is using the term “forensic” in this context. He refers to “forensic architecture” as “the process of describing the architecture of something after it has been built,” as if this were a non-judgmental effort. But he then uses it to refer to his dissection of various failed, ineffectual, and/or underwhelming SOA standardization efforts in which he was involved: ebXML, the W3C Web Services Architecture Working Group, UN/CEFACT eBusiness Architecture, and the OASIS Reference Model for SOA. It becomes clear, in his analysis, that he’s actually deploying “forensic” in its standard negative connotation: a post-mortem on a victim, and the building of an evidence-based case against the perpetrators.

Though, fortunately, Nickull doesn’t lay it on that heavy. And he provides a good analysis of what went wrong and lessons learned from those various efforts. But, reading into this, it’s clear to me that he’s primarily critiquing the applicability of the software development life cycle “waterfall methodology” to committee-based development of standards in sprawling, ill-defined architectural initiatives--of which SOA is perhaps a classic case in point.

What his analysis points to is the value of a retrospective approach to clarifying the core design principles of an emergent architectural phenomenon that simply works--such as the Web, with REST as perhaps the premier textbook example of a “principles clarification” exercise. In contrast to the “waterfall” method, I’d call this the “salmon swimming upstream to reconceptualize in their presumed/intuited spawning place” approach. Or maybe simply the “salmon” methodology.

Which reminds me of a point I need to make. For the past several years, I’ve been focusing on a space--business intelligence (BI), data warehousing (DW), and data integration (DI)--in which SOA (however defined and standardized) has had just a minimal footprint, and primarily as one integration approach in the back-end. But BI/DW/DI continues to grow and innovate at an amazing clip, still hinging on an old, stable, universal standard: SQL (with SOA-ish XQuery/XPath not achieving any significant momentum swimming upstream against this powerful current).

Interestingly, there are few if any industry specification activities in any forum that involve the BI/DW/DI segment. Much of the front-end BI innovation revolves around integrating Web 2.0-style interfaces and services, and much of that relies on REST (the non-architected architecture that has totally eclipsed SOA in real-world adoption).

REST-ive salmon continue to spawn like crazy downstream in the BI and analytics market. Look, for example, at my latest Forrester Information and Knowledge Management blog post on the next-generation OLAP “Project Gemini” features that Microsoft began demonstrating publicly this week. Not much SOA in this approach, but a hefty dose of REST, thanks to its tight integration with Microsoft’s SharePoint portal/collaboration platform, and a lot of SQL, owing to integration with SQL Server Analysis Services.

By the way, people who read James Kobielus’ blog may or may not realize that I now put most of my tech-meaty musings under Forrester’s I&KM blog. Please plug that blog into your reader. I’m one of many Forrester analysts who post to that regularly. Seriously great stuff, all of it.

And you thought all I do nowadays is write pretentious poetry.

Pretentious? Moi? Au contraire, mon frère.


Monday, October 06, 2008

poem Crunchy Analytic


Thought is optional.
Have to have hands to grasp and
tear the text to bits.

Pulp's preferable.
Its immolation is a
blazing face of fire.

An unlocking of
energies that only hard
tedium can free.