Blog: Merv Adrian

Hello and welcome to my BeyeNETWORK blog! I will use this blog to share my thoughts and observations on new analytic business applications and data management: vendor briefings, case studies, events and other activities that stimulate ideas will be the source. I believe the emergence of this new class of application, and new emerging data management tools, herald a next step in the maturity of information technology, and I'm excited to be present for its emergence. I hope my blog entries will stimulate ideas that will serve both the vendors creating these new solutions and the companies that will improve their business prospects as a result of applying them. Please share your thoughts and input on the topics.

December 2009 Archives

Rod Adkins, the SVP and Group Executive of IBM's Systems and Technology Group (STG) took the time to engage the influencer community quite early in his tenure for a well-run event at the Watson Research Lab in Yorktown Heights. "I've been in this position for 38 days," he  reminded us, as STG's AR team widened the usually hardware-focused invited audience to include generalists and more software-focused folk like me.  IBM execs from IBM's Software Group, its Research organization and corporate, joined us  for a look at the science behind the systems, a compelling addition to the agenda. And another pitch for IBM's analytics thrust was a scene-stealer.

Adkins is hardly new to the party: he spent 10 of the last 12 years in STG in addition to the work on pervasive computing he did in the Software Group. "We play a vital role in the Smarter Planet initiative - we have to provide optimized infrastructure to capture, manage and deliver all the data it runs on," he pointed out. He clearly articulated a strategic path for 2010 to match his aggressive influencer communications agenda: growth in workload optimized platforms, systems software value capture, delivery model changes and data center wins. For each, organizational changes, investments, messages and new offerings are well scoped and already underway. STG has 47% of IBM's R&D budget and is investing in process and packaging; technology design; hardware systems; systems software; and client support. Yes, research into client support - a topic well worth its own discussion, but beyond our scope for this post.

Adkins takes the helm of a ship in good order: 2009 performance to date has been steady, and compares well with its peers given the economic environment. Revenues for Q3 were down 12 percent from the third quarter of 2008 (better than HP's 17% decline in its server group). IBM's mainframe, late in the z10 product cycle, decreased 26 percent. It's not unusual to see declines at this stage as the market anticipates the next (usually well known) version. Still, z continues to defy all predictions of its demise: it has almost doubled its share of systems over $250K in the last 8 years, from 17.2% (Q4 2000) to 32.1% (Q4 2008). Adkins touted IDC and Gartner reports of market share gains for Power Systems, System x and disk and tape storage during the third quarter. According to industry analyst sources (IDC and Gartner), revenues from System x servers increased 1 percent; microelectronics OEM revenues were fairly flat at a 1 percent decline. The 45 nanometer business is running ahead of the previous (65 nm) ramp - its production run is already sold out and fab performance is good. IBM's acquired XIV storage has shipped its 1000th unit - 70% of which went to new customers. And IBM claims to have capitalized very effectively on the turmoil in Sun's base: "We've won business from 100 of Sun's top 300 customers." Even system software has a few things to crow about: the Migration Factory solution, IBM asserts, has driven 2000 migrations to IBM POWER systems so far this year.

IBM is very optimistic about upcoming technology introductions: its next generation Power7 chip architecture will feature substantial performance and energy consumption improvements, as well as some interesting new dynamic thread management capabilities. Big bets are being placed around the notion of "workload-optimized systems," pre-integrated solutions such as:

  • Cloudburst  private cloud hardware appliances for secure deployment and management of  application environments
  • The new scale-out NAS (SONAS), for standards-based, large scale file storage
  • Smart Analytics System,  a pre-integrated  analytic appliance that delivers IBM data warehouse software preinstalled on an IBM server and storage

Some innovations are delivered as what IBM is calling "system optimizers." For example, the Smart Analytics brand is extended to the Smart Analytics Optimizer, a software-based accelerator (available now for z and moving to other platforms) that automates the movement of frequently queried data into as much as a terabyte of main memory, where other tricks like vector processing and a new "optimized storage format" (I haven't explored the details of this yet) will, IBM claims, dramatically improve performance. IBM's decades of database optimizer research enable it to drop this in and "know" when to push queries to the in-memory database. This kind of hybrid system will change the architecture of data management if it is proved to work as it rolls out - and I wouldn't bet against it.

IBM has been ramping its geographic coverage with 56 added global branch offices and will add 15 more in 2010. The channel is getting more investment in education, enablement, and training in how to improve profitability. And STG has created a new System Software business unit to drive development, revenue and margin growth, with a service management focus that will take HP on in one of its core areas of strength.

Where STG Fits in the Smarter Planet Story

John Iwata, IBM's Senior Vice President Marketing Communications, gave us a conversational look at how IBM continues to apply deep probes into its own business to understand company customers, their needs, and resulting opportunities. "We looked at 20 showcase IBM solutions that we had built and implemented for clients," he said. "We found that all 20 had the entire portfolio of IBM product and service, technology and in almost all of them the Research division played an important role in the eyes of the client as differentiation as to why they chose IBM, and" - perhaps most interesting - "are  instrumenting their physical systems. By 2010 there will be 1B transistors per human; there has been a hockey stick in deployment. The internet of smart things, and the associated data, are the wellspring of IBM's Smarter Planet thinking."

By now, some of the examples are familiar but Iwata unveiled one I hadn't seen before: smart sewers. Who would have thought? But IBM predicts at least $70B of expected investment in sewer modernization. In San Francisco, smart manholes have a float attached to a wireless transmitter to gauge water flow and water level. Our sewer systems, many of which are a century old, are built for average volume, but when you have above average flow they are easily overwhelmed. It's a familiar phenomenon. And worse things happen: as I write this, a 100-year old main is being repaired after rupturing two days ago and causing massive problems, creating a giant sinkhole, flooding streets and buildings, and closing streets. Perhaps future systems will be able to anticipate these failures better. Iwata pointed out that other instrumentation might let you learn that a certain pump needs to be fixed; doing so beforehand avoids similar failures and catastrophes. And, of course, cities will get smarter than "managing by averages" permits by tracking in real time, and learning from the data.

Iwata noted that "We don't label anything 'dumb' in our messaging. Nobody designed the systems that way." Advances in technology permit new ways of thinking, even though, as he acknowledged, the timing could not have been worse in terms of the economic conditions. It's a leadership agenda, not just an ad campaign. IBM took Smarter Planet to governments all over the world - it can raise job growth. Much economic stimulus money flows here. Iwata has been delighted to see how the field has taken this program up enthusiastically - they don't always do so, he acknowledged wryly. Over 80 field-driven events have been done, with Ho Chi Minh City being one of the most recent, focusing on a "smart food" supply initiative. Emerging markets are looking to leapfrog from traditional systems to post-industrial, information age ones. IBM stands to gain enormous opportunities from all this. It can do well by doing good.

The Mathematician's View of Advanced Analytics

The star turn from the wings at this event came from Brenda Dietrich, a last minute substitute when a snowstorm in Minnesota prevented a Mayo Clinic presenter from joining the meeting. Dietrich, an IBM Fellow, and VP from Research, has made many contributions over an illustrious career; she shared with us that she coded the 1959 algorithm underlying GPS as a new intern and was the largest user of computing power at IBM that year. Today's GPS device makes for a good example of the potential for advanced analytics, when one thinks about it.

Dietrich says analytics can be applied in many places - for example, she and her team have been modeling IBM's biggest customers, and their own sales organization - and have made recommendations on making adjustments to the configuration of their sales force. Some of the changes were implemented - and they've seen large lift (over $1B).

Dietrich pointed out that the state of the art in statistics is to "extract the explainable and extend." Of course, sometimes that's not as easy as it may sound. For example, the Super Bowl occurs on a different day every year - how to put information on its impact on your sales back into your predictive model to leverage it in retail (to sell more big TVs, or just chips and beer) is not obvious. She also talked about the frontier cases - such as streaming data, which opens the door for a new type of statistics. Massive data sets from the mobile web, like GPS and OnStar, could be the basis for exciting new opportunities if used effectively. And her proximity to the STG teams gives her a view of how the science of analytics can benefit from new architectures. She noted that "most of the algorithms in the field assume serial, not parallel, hardware - so a lot of work on things like matrix inversion can be rethought."

Finally, as a truly provocative thought, Dietrich challenged us to consider what would happen if computers got smart enough with type checking at a higher order of computation. What if a built-in help system could tackle the garbage in, garbage out problem by telling users "I'm sorry, but there isn't enough data to make that prediction?" Think of it as "warning labels on mathematical models." Might have been nice for the quants on Wall Street to have had that - although there's no guarantee they wouldn't have ignored it, as they did with other warnings that Dietrich said were not only visible, but had been noted.
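
To make the "warning label" idea concrete, here is a minimal sketch in Python. It is not anything Dietrich or IBM showed; the class name, the 10-point threshold and the rule against extrapolating outside the observed range are all assumptions chosen for illustration. The point is only that a model can decline to answer instead of returning a number its data cannot support.

```python
from statistics import mean


class InsufficientDataError(ValueError):
    """Raised instead of returning a number the data cannot support."""


class LabeledLinearModel:
    """Hypothetical least-squares line that carries its own warning label."""

    MIN_POINTS = 10  # arbitrary threshold, chosen only for illustration

    def __init__(self, xs, ys):
        if len(xs) != len(ys):
            raise ValueError("xs and ys must have the same length")
        self.xs, self.ys = list(xs), list(ys)

    def predict(self, x):
        n = len(self.xs)
        if n < self.MIN_POINTS:
            raise InsufficientDataError(
                "I'm sorry, but there isn't enough data to make that "
                f"prediction ({n} points, need {self.MIN_POINTS})."
            )
        lo, hi = min(self.xs), max(self.xs)
        if not lo <= x <= hi:
            # Assumption: refuse to extrapolate beyond what was observed.
            raise InsufficientDataError(
                f"{x} is outside the observed range [{lo}, {hi}]."
            )
        mx, my = mean(self.xs), mean(self.ys)
        sxx = sum((a - mx) ** 2 for a in self.xs)
        if sxx == 0:
            raise InsufficientDataError("All observations share one x value.")
        slope = sum((a - mx) * (b - my) for a, b in zip(self.xs, self.ys)) / sxx
        return my + slope * (x - mx)
```

A caller would wrap `predict` in a `try`/`except InsufficientDataError` and show the message to the user, which is exactly the step the Wall Street quants could still have skipped.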

It was easy to walk away from this presentation impressed by the inside look at how advanced and predictive analytics are thought about - and many of the attendees were. Even more intriguing was the obvious connection to Smarter Planet - it's in the very nature of these techniques that they can do things with the newly instrumented physical systems that have not been thought of before. Delivering on the promise is one thing, but IBM is after something bigger: finding opportunities that have not been thought of yet. Few would have thought of mathematical work of this kind as creative endeavor, but Dietrich's enthusiasm, command and passion make it clear that it is just that, and the best is likely yet to come.


Posted December 29, 2009 11:01 AM

Oracle has been making much of its recent benchmark results. Its new TPC campaign may backfire, however; its deceptive assertions do it no credit, and obscure some interesting technical advances (such as its first use of flash technology) behind mislabeling and deliberate omission of important facts. The "benchmark wars" are far less active than they were in their heyday, when new leapfrogging results occurred quarterly, or even more often. TPC-C, the transaction processing measure, has long been understood to be a poor representation of today's real transaction types. At various times, most of the DBMS vendors have stopped issuing them - but they come back when they think they can get a headline or two. Some hardware vendors have also been dismissive of the benchmark; in fact, until this one, Sun had been a skeptic for a number of years.

In practice, most production transaction processing requires DBMS features which are routinely turned off to achieve the breakthrough numbers that vendors like to tout. TPC-E was developed to correct some of these issues, but a quick glance at the top results shows that only Microsoft SQL Server results are available so far. Oracle and IBM DB2 have stayed with TPC-C. 

Those busy benchmark days a few years ago were driven by competing hardware architectures as much as database providers, and it's no surprise that recent interesting results are being driven by hardware again. Kickfire has grabbed attention with its hardware-based SQL processing to deliver some extraordinary results on TPC-H benchmarks, and column stores running in memory with "smart storage" are likely to roil the waters there some more in 2010. But the most visible TPC benchmark in Q4 was Oracle's TPC-C, and not surprisingly, it was driven by the desire to tout a hardware play: Exadata v2. Unfortunately, the benchmark Oracle published is not an Exadata benchmark at all - not even V1 - and some of the interesting things about it are obscured by the manner of its description, unless you dig a little.

A quick glance at the published Top Ten TPC-C results as of December 13, 2009 (see http://www.tpc.org/tpcc/results/tpcc_perf_results.asp) shows that Oracle has indeed delivered the top recent result. "Recent" is important here: the report was submitted November 3, 2009, in time for Oracle Open World, but after the first ads ran. The report was preceded by a prime media buy (Wall Street Journal and elsewhere) in October that compared it to IBM's DB2 benchmark from 15 months earlier and promised to beat it.

How that comparison was done publicly is instructive, both for what it says about the performance of the product, and for the manner in which Oracle promoted it. We've seen this style before; it can be characterized as at best overstated, and at worst deceptive. It's based on the premise that the headline is what people remember - few people would read the fine print details if there were any. 

But in fact, there wasn't any fine print on the benchmark details in the advertising, and that takes its meaning into different territory. After paying the (very unusual) fine against Oracle imposed by the TPC, which explicitly forbids comparing unaudited, unsubmitted benchmarks to posted ones, Oracle subsequently issued ads which didn't actually contain the tpmC or price/tpmC numbers - and weren't about the Exadata machine at all. The fine was trivial - amounting to far less than any of the ads themselves cost. One is left to assume that it was just considered part of the cost of the campaign: as a founding and very active member of the Council, Oracle is certainly not unaware of the rules.

Digging a bit deeper (I'm not given to microanalysis of full reports anymore, but a few things jump out from the Executive Summary, compared to the one for IBM's #2 result - both are only 4 pages), the actual results demonstrate several interesting things about the state of the art in transaction processing:

  • Clustered systems have made great strides in delivering results. This is the only clustered system on the list, run on 12 systems; all the other results in the list are on non-clustered systems, and they are all behind in absolute numbers. Clustered architectures are now demonstrably capable of competing at the top end (if there were any doubt).
  • Clustered systems deliver a lot of capacity/power for less. Oracle's benchmark ran on a Sun server with 48 processors, 384 cores and 3072 threads, a configuration with a total cost (including storage and software; more on that below) of $18M in late 2009. IBM's had 32 processors and 64 cores, 128 threads, but it cost $17M in mid 2008. So from the system cost perspective, a little more money (relatively speaking) now buys more than an order of magnitude (24x) more threads. Again, perhaps not surprising, but one nice thing about the TPC is that it represents a stake in the ground; theoretically, anyone can buy this configuration at that price.
  • The results fall far short of the hardware uplift. Though the news is good on the hardware side, the tpmC uplift (7,646,486 for Oracle vs 6,085,166 for DB2, or about 25%) seems almost trivial by comparison. Oracle's statement that "Oracle and Sun were able to set the world record using eight times less hardware than IBM used for its largest benchmark" was one of the questionable choices of phrasing in their messaging about the results.
  • DB2 clearly wins the "per core" comparison. Looked at on a tpmC per-core basis, Oracle got 20K/core, while IBM got 95K/core. I'd be careful about per-core licensing under such circumstances; the good news is that Oracle recently reduced their core multiplier, clearly seeing the value in pushing customers onto machines with more cores. (No doubt it helped the price/performance comparison for the benchmark too.)
  • Memory drives performance, but not as much as one would hope. There's a sizable difference in memory: 6 TB for Sun (at 512 GB per node) and 4 TB for IBM (64 GB per core). A 50% difference, but not a 50% performance uplift. As system architectures become more effective at using multiple cores with dedicated memory, we should see a series of jumps in performance ahead.
  • Solid state storage is transforming economics, if not yet performance: the Sun system used 686.6 TB (with a good deal of new, high-performing solid state disk), which cost $8.4M; IBM, using older disk technology, used 746.5 GB at $11.5M. This is an extraordinary volume difference, and it suggests the economics of storage are shifting as rapidly as those of processors. But again, all the added volume does not seem to have driven a proportional improvement in performance.
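
The ratios in the bullets above are easy to recheck. The short Python sketch below uses only the tpmC, core, thread and system-cost figures quoted in this post (not the full TPC disclosure reports); the class and function names are mine, introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TpccResult:
    """Figures as quoted in this post, not taken from the full TPC reports."""
    name: str
    tpmc: int                # transactions per minute (tpmC)
    cores: int
    threads: int
    system_cost_musd: float  # total system cost, millions of US dollars


ORACLE_SUN = TpccResult("Oracle/Sun", 7_646_486, 384, 3072, 18.0)
IBM_DB2 = TpccResult("IBM DB2", 6_085_166, 64, 128, 17.0)


def tpmc_per_core(result: TpccResult) -> float:
    """Throughput divided by core count."""
    return result.tpmc / result.cores


def uplift_pct(new: TpccResult, old: TpccResult) -> float:
    """Percentage by which the newer result exceeds the older one."""
    return (new.tpmc / old.tpmc - 1.0) * 100.0


if __name__ == "__main__":
    for result in (ORACLE_SUN, IBM_DB2):
        print(f"{result.name}: {tpmc_per_core(result):,.0f} tpmC per core")
    print(f"Thread ratio: {ORACLE_SUN.threads / IBM_DB2.threads:.0f}x")
    print(f"tpmC uplift: {uplift_pct(ORACLE_SUN, IBM_DB2):.1f}%")
```

Run as a script, this reproduces the roughly 20K and 95K per-core figures, the 24x thread ratio, and an uplift a little over 25%.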

Bottom line: this suggests there is a very interesting set of skirmishes ahead, and substantial improvements in results I'd expect to come from both vendors. When IBM delivers benchmarks on its current DB2 release (the one discussed here was on DB2 9.5, not the current 9.7), current hardware, current storage - and perhaps using its own clustering pureScale technology, the game will really be afoot. One hopes IBM's discussion of the results will be somewhat less fuzzy than Oracle's. In the meantime, they can only stand by and watch the big headlines sink in; IBM rarely gets into you said/we said kinds of games. At its recent conferences for analysts, there was no talk of these issues to match the red meat attacks heard at OOW. They need to get their own benchmark to market. Soon. Watch also for a wildcard if VoltDB, the stealth project now in early trials, decides to publish an audited TPC-C.


Posted December 23, 2009 5:26 PM

Xkoto, the database virtualization pioneer, has generated substantial interest since its first deployments in 2006. Still privately held and in investment mode, Xkoto sees profitability on the horizon, but offers no target date, and appears in no hurry. Its progress has been steady: in early 2008, a B round of financing led by GrandBanks Capital allowed a step up to 50 employees as the company crossed the 50 customer mark. 2008 also saw Xkoto adding support for Microsoft SQL Server to its IBM DB2 base. Charlie Ungashick, VP of marketing for Xkoto, says that 2009 has been going well, and the third quarter was quite strong. And at the end of September 2009, Xkoto announced GRIDSCALE version 5.1, which adds new cluster management capabilities to its active-active configuration model, as well as Amazon EC2 availability.

"Traditional" models of passive failover and passive disaster recovery are high cost, inflexible architectures that Xkoto's GRIDSCALE replaces with multiple identical database copies that it manages in an active-active configuration for lower cost scale-out and disaster recovery (DR). Applications don't "see" it; GRIDSCALE captures the SQL statements and replicates them to the copies - both DDL and DML - with an optimistic, non-synchronous protocol: the first successful response goes back to the application.
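
The pattern described above (fan each statement out to every copy, answer the application as soon as one copy succeeds) can be sketched in a few lines of Python. This is emphatically not Xkoto's implementation: the `Replica` and `ReplicatingProxy` classes are hypothetical, in-memory SQLite databases stand in for the DB2 or SQL Server copies, and real concerns such as conflict handling and recovery of a failed copy are ignored.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor, as_completed, wait


class Replica:
    """One database copy (here an independent in-memory SQLite database)."""

    def __init__(self, name):
        self.name = name
        self.conn = sqlite3.connect(":memory:", check_same_thread=False)
        # A single worker thread per copy applies statements in submission order.
        self.worker = ThreadPoolExecutor(max_workers=1)

    def submit(self, sql, params=()):
        return self.worker.submit(self._run, sql, params)

    def _run(self, sql, params):
        rows = self.conn.execute(sql, params).fetchall()
        self.conn.commit()
        return rows


class ReplicatingProxy:
    """Hypothetical middle tier: replicate every statement, reply on first success."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self.pending = []

    def execute(self, sql, params=()):
        futures = [r.submit(sql, params) for r in self.replicas]
        self.pending.extend(futures)
        errors = []
        for future in as_completed(futures):
            try:
                return future.result()  # first successful response wins
            except sqlite3.Error as exc:
                errors.append(exc)
        raise RuntimeError(f"statement failed on every copy: {errors}")

    def drain(self):
        """Block until the slower copies have applied all outstanding statements."""
        wait(self.pending)
        self.pending.clear()
```

A caller would build `ReplicatingProxy([Replica("a"), Replica("b")])`, issue DDL and DML through `execute`, and call `drain()` before inspecting an individual copy. The per-copy worker thread is one simple way to keep statement order identical on every copy while still letting the application continue after the first reply.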

Customers include CNN, Puma, HSBC, and the US Department of Homeland Security. Ungashick says Western Europe is doing very well - half of Q3 revenue came from Europe. The firm has recently added direct sales headcount in South Africa and Asia, and is continuing to add more. The partnership with IBM has been instrumental in some big wins for both parties. Xkoto is arguably the closest thing DB2 has to an answer to Oracle's RAC, and Xkoto participated with IBM in several deals that needed this capability.

GRIDSCALE version 5.1's Amazon EC2 availability enables multiple DB2 databases running in the cloud to work together, and avoid the limitations formerly resulting from the lack of shared storage - allowing load distribution in the cloud for the first time. Version 5.1 also adds automatic recovery, Kerberos support for authentication, and other features as described here.

The SQL Server version of GRIDSCALE 5.1 features a new, "driverless" configuration. Native SQL Server drivers, including ADO, .NET Framework, and OLE DB are now supported; GRIDSCALE has implemented the tabular data stream (TDS) protocol Microsoft inherited and updated from Sybase. Microsoft SQL Server Enterprise Manager and other tools compatible with Microsoft interfaces can be used for management of the server instances. Ungashick says he's seeing more opportunities with Microsoft where the competition is Oracle RAC, similar to what Xkoto had already been seeing with DB2 prospects:

"A number of situations have arisen recently with SQL Server customers recognizing that their data warehouses are not adequately designed for availability and disaster recovery. As the DWs become more important to the business, we think we'll see much more interest in that use case," he said.

Xkoto is hoping to leverage the Microsoft community to drive business there.  The recognition the company has already received - Best of Microsoft Tech•Ed 2009 and Gartner's "Cool Vendor in IT Operations and Virtualization" among them - will go a long way towards boosting its visibility. This is a promising model, and Xkoto has the early lead - which will be a challenge to hold as the big database vendors add their own capabilities in this area. Meanwhile, Ungashick says, there are other database products that could use similar capabilities and we can expect to see announcements with others in the year ahead.


Posted December 21, 2009 11:00 AM

Many SAP and Oracle apps customers would rather leave stable products alone than continually change, or "upgrade," as it is called. For these customers, the cost of maintenance, also known as "buying it all over again every 4 years," seems excessive. The slow pace of innovation from the mammoth firms, and the even slower uptake of those innovations, amplifies this. (For a recent discussion of this problem, see video highlights from Ray Wang's keynote speech from the SAP UK and Ireland User Group. I discussed the resounding thud heard from Oracle's "wait till next year" non-announcement of its Fusion apps here.)

With this backdrop, Rimini Street, one of the pioneering 3rd-party maintenance firms, recently announced stellar Q3 results: revenue up 200% year over year, and sequential quarter-over-quarter growth continuing: it claimed Q3 invoicing doubled the prior calendar quarter. Rimini Street's value proposition has steadily attracted customers willing to try a different way. The company claims hundreds of customers since inception, all over the size spectrum. The offer is simple: their base price is 50% of the vendor's maintenance price.

Rimini Street also offers premium support for sites that have customized the products (and most have), so even wary prospects may want to take a second look. Most customer issues that require attention, the company claims, are fixable. They assert that the average Rimini Streeter has 10-15 years of experience, and each account gets a dedicated engineer. Rimini Street offers a service level agreement of 30 minute response time, and claims an average of 4 minutes. Siebel, PeopleSoft, JD Edwards World and OneWorld are all offered in addition to SAP. Specifics can be found here. In its quarterly results press release, the firm claimed over 90% customer retention. Rimini Street says those who don't renew are not dissatisfied, but rather have moved to next generation products it doesn't (yet) support.

In late September, Barron's reported that Adams Street Partners had invested in Rimini Street for a minority stake, and the funds will be used to grow European operations. Rimini Street has also brought in ex-Sybase CFO and SVP of EMEA operations Pieter Van Der Vorst as its new CFO. In 2010, the firm expects to grow from 130 people to 200 - and to continue to grow its existing footprint in countries like the UK, Germany, Netherlands, Singapore, Australia, and Brazil.

Many of Rimini Street's customers won't be named. The vendors being displaced do not look kindly on competition, and some customers are concerned about their response. But Pepsi, Virgin Mobile, and JB Hunt have all been identified, along with extensive coverage of Siemens' decision to use 3rd party maintenance, and Rimini Street's website continues to show more client logos as more firms and governmental clients become willing to acknowledge - and praise - their relationships. (Perhaps the EU ought to focus on real competitive restraints that big apps vendors have attempted over the years - to customers' material disadvantage - here, rather than on unrealized problems that might happen with acquisitions, as it's doing with the Oracle/Sun deal.)

Risks? To be sure. There is no current litigation against Rimini Street, although Oracle has attempted to bring it into the case against TomorrowNow, which is still in the courts. Seth Ravin, Rimini Street's founder, is widely regarded as extremely knowledgeable about the nuances of contracts, and the company is confident that legal challenges will not become a problem; it is incorporated in Las Vegas, regarded as fairly unfriendly to the kind of challenges it would be expected to see if Oracle and SAP decide to up the ante. Still, as Rimini Street continues to eat away at a key revenue stream, we may see some fireworks soon. Continued SAP and Oracle customer dissatisfaction with maintenance pricing, even as the two giants grow their margins by raising prices for the "locked in," is likely to drive solid growth in the year ahead.


Posted December 15, 2009 10:24 AM

At IBM's 8th annual Connect meeting with analysts, Steve Mills, Senior VP and Group Executive, had much to crow about. Software is the engine driving IBM's profitability, anchoring its customer relationships, and enabling the vaulting ambition to drive the company's Smarter Planet theme into the boardroom. Mills' assets are formidable: 36 labs worldwide have more than 100 SW developers each, plus 49 more with over 20 - 25,000 developers in all. Mills showcased all this in a matter-of-fact, businesslike fashion with minimal hype and little competitor bashing. A research project aimed at extending Hadoop usage to a broader audience was among the highlights. 

Mills gave us a look at his organizing principle:

"We have been working on extending the notion of what middleware is. It's about connecting an organization's applications, the codification of business process and function."

Companies from midmarket to large enterprises run thousands of applications; understanding customers' business scenarios, addressing identified gaps and promoting recommended patterns for success - adoption routes, solution stacks - is the driver. "It's very easy to make a mess if you're not guided," Mills points out. He's an effective, dedicated proponent of IBM's Smarter Planet theme, and returned to it at this event, pointing out how IBM-supported projects that instrument and enhance the world's often aging physical systems pay for themselves in efficiency savings even before the larger goals they enable are considered. He also held forth on other favorite topics: Industry Models, Cloud Computing ("You have to talk about Cloud"), and more, but told us he'd promised not to use all his team's best slides before they could. "Not that I can't talk about all of it," he joked - and we've seen him do it. But no 3 hour keynotes here, mercifully, unlike some other vendors' recent events.

Bringing Hadoop to Business Users

In his presentation, Rod Smith, VP, Emerging Internet Technologies, made it clear that the company is not ignoring the MapReduce/Hadoop phenomenon. He referred graciously to Cloudera's work and picked up their phrase: "big data." With the world creating nearly 15 PB of new data per day, a new class of content-centric WebApps is on the horizon, typically "longer running apps" - customers Smith talks with don't like the word "batch," he noted. But his focus was different from other vendors I've been hearing, where there is an assumption that the "big data" opportunity is limited to the sophisticated programmers who have so far led the way. Instead, "Put the business person in the center of the data," Smith suggested. "They want their own Google" - here meaning not a search engine, but a data interaction tool capable of visualization and other forms of manipulation.

It's clear that the need for such solutions will be there, and someone will fill it. When a firm like Extrabux can process 40Gb/day, loading and indexing 70 million constantly changing input records for MapReduce by processing on Amazon's EC2 cloud for less than $5000 per year - with no DBA - others will follow. (See the September issue of Charles Brett's Insight-Spectra  for details of this case study.) Like other explorers in this new mode, Smith offered his own great examples, including  a Visa risk modeling app using Hadoop with the R statistical libraries that reduced an analysis literally from 1 month to 13 minutes. "This is not incrementally better; it changes everything," he said.

Smith's Big Sheets project showed off analysis performed on over 2 million patent documents - a "one person project, like all my things." He referred to the iTunes interface and showed a similarly clean, intuitive model. And he pointed out that "the data operated on does not always get reduced; here it exploded, because one analysis was of how patents made references to other patents." Similar things happen when analyzing social graphs; it's why focusing on MapReduce alone to describe these cases doesn't always paint the full picture. It's just one step in more complex processes that can be distributed around large systems which scale on demand as needs dictate. Similar thinking about user empowerment, without the elastic scaling (yet), is behind Microsoft's PowerPivot, which treats Excel as the UI, and adds operators to the Excel language which mimic the kinds of things MDX programmers can do with OLAP cubes, among other things.
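
Smith's remark that the data "exploded" is easy to see in a toy version of the patent example. The Python sketch below is not BigSheets or Hadoop code; the four-patent input and the function names are made up for illustration. The map step emits one record per citation, so the intermediate data set is already larger than the input before any reduce step runs.

```python
from collections import defaultdict

# Hypothetical toy input: patent id -> ids of the patents it cites.
PATENTS = {
    "P1": ["P2", "P3", "P4"],
    "P2": ["P3", "P4"],
    "P3": ["P4"],
    "P4": [],
}


def map_citations(patent_id, cited_ids):
    """Map: emit one (cited, citing) pair per reference."""
    for cited in cited_ids:
        yield cited, patent_id


def shuffle(pairs):
    """Group intermediate pairs by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_citations(cited, citing_ids):
    """Reduce: collect the patents that cite a given patent."""
    return cited, sorted(citing_ids)


def run(patents):
    intermediate = [
        pair
        for patent_id, cited_ids in patents.items()
        for pair in map_citations(patent_id, cited_ids)
    ]
    cited_by = dict(
        reduce_citations(cited, citing)
        for cited, citing in shuffle(intermediate).items()
    )
    return intermediate, cited_by


if __name__ == "__main__":
    intermediate, cited_by = run(PATENTS)
    print(f"{len(PATENTS)} input records -> {len(intermediate)} intermediate pairs")
    print(cited_by)
```

With heavily cross-referenced documents the growth is much steeper than in this toy case, which is why sizing such jobs by input volume alone can mislead.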

IBM is looking past today's MR cases, which are often reminiscent of early computing days, when specialists spent days to set up machines for a single program run. The problem then was scale too, and learning how to use machine resources efficiently was job one. Today, the economics have flipped - we understand that the people resources are more valuable and we have to empower them. IBM is looking beyond complex setup, Java coding and single run models for "big data" processing and towards interactive big data analysis - at Web scale. In Smith's view, that's the key to going into an "evidence-based business world." IBM is focused on hiding the complex details of system parallelization, fault tolerance, load balancing, etc. from the user by hiding everything behind the UI. Tech details weren't at the top of Smith's agenda for this presentation, but REST interfaces, the use of Jaql, extensibility via UDFs, integration of Pig, and exporting results into feeds and XML were briefly highlighted. As IBM continues to push at this area, we can expect to see some breakthrough innovations emerge, in larger, end-to-end scenarios.


Posted December 3, 2009 9:41 AM