Six reasons why Pentaho’s support of Apache Hadoop is great news for ‘big data’

May 19, 2010

Earlier today Pentaho announced support for Apache Hadoop – read about it here.

There are many reasons we are doing this:

  1. Hadoop lacks graphical design tools – Pentaho provides pluggable design tools.
  2. Hadoop is Java – Pentaho’s technologies are Java.
  3. Hadoop needs embedded ETL – Pentaho Data Integration is easy to embed.
  4. Pentaho’s open source model enables us to provide technology with great price/performance.
  5. Hadoop lacks visualization tools – Pentaho has those.
  6. Pentaho provides a full suite of ETL, Reporting, Dashboards, Slice ‘n’ Dice Analysis, and Predictive Analytics/Machine Learning.

The thing is, Pentaho is the only technology that satisfies all six of these points in combination.

You can see a few of the upcoming integration points in the demo video (above). The ones shown in the video are only a few of the many integration points we are going to deliver.

Most recently I’ve been working on integrating the Pentaho suite with the Hive database. This enables desktop and web-based reporting, integration with the Pentaho BI platform components, and integration with Pentaho Data Integration. Between these use cases, hundreds of different components and transformation steps can be combined in thousands of different ways with Hive data. I had to make some modifications to the Hive JDBC driver and we’ll be working with the Hive community to get these changes contributed. These changes are the minimal changes required to get some of the Pentaho technologies working with Hive. Currently the changes are in a local branch of the Hive codebase. More specifically they are a ‘Short-term Rapid-Iteration Minimal Patch’ fork – a SHRIMP Fork.

Technically, I think the most interesting Hive-related feature so far is the ability to call an ETL process within a SQL statement (as a Hive UDF). This enables all kinds of complex processing and data manipulation within a Hive SQL statement.
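To make the idea of calling a transformation from inside a SQL statement concrete, here is a minimal sketch. It is not the Pentaho or Hive implementation: it uses Python's built-in sqlite3 module as a stand-in for Hive, and the clean_country function, the weblog table and the alias map are all hypothetical names invented for illustration. The point is only the pattern: register a row-level transformation as a user-defined function, then call it from the query.

```python
import sqlite3


def clean_country(raw):
    """Hypothetical stand-in for an ETL step: trim, upper-case, map aliases."""
    if raw is None:
        return None
    value = raw.strip().upper()
    # Illustrative alias table; a real transformation would be far richer.
    aliases = {"USA": "US", "UNITED STATES": "US", "UK": "GB"}
    return aliases.get(value, value)


def run_demo():
    # In-memory SQLite database used here only as a substitute for Hive.
    conn = sqlite3.connect(":memory:")
    # Register the Python function so SQL can call it as clean_country(x).
    conn.create_function("clean_country", 1, clean_country)
    conn.execute("CREATE TABLE weblog (visitor TEXT, country TEXT)")
    conn.executemany(
        "INSERT INTO weblog VALUES (?, ?)",
        [("a", " usa "), ("b", "United States"), ("c", "uk"), ("d", "de")],
    )
    # The transformation runs inside the SQL statement, once per row.
    rows = conn.execute(
        "SELECT clean_country(country) AS c, COUNT(*) "
        "FROM weblog GROUP BY c ORDER BY c"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run_demo())
```

Run as a script, this prints [('DE', 1), ('GB', 1), ('US', 2)]: the messy country values are cleaned and aggregated in a single query, with no separate pre-processing pass over the data.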

There are many more Hadoop-related ETL and BI features and tools to come from Pentaho.  It’s gonna be a big summer.

James Dixon
Chief Geek
Pentaho Corporation

Learn more - watch the demo



Big data should not mean big cost

May 19, 2010

Data is exploding at rates our industry has never seen before and the huge opportunity to leverage this data is stymied by the archaic licensing practices still in use by the old school software companies.  Currently, the big guys like Oracle, IBM, SAP, Teradata and other proprietary database and data warehouse vendors have a very simple solution to “big data” environments – just keep charging more money, a lot more money. The only “winners” in this scenario are the software sales reps. Our industry (Tech) is artificially slowed in order to support these old school business models – they can’t afford to innovate in licensing and they surely don’t want to kill the golden goose – The Perpetual License fee.

A major gaming company, for example, had been using Oracle for its database and BI tech. With traffic reaching 100 million to 1 billion impressions per day, the database giant’s only answer was to sell more expensive licenses. Even then, the best it could do was analyze four days’ worth of information at a time.

Organizations like Mozilla, Facebook, Amazon, Yahoo, RealNetworks and many others are now collecting immense amounts of structured and unstructured data. The size of weblogs alone can be enormous. Management wants to be able to triangulate what people are doing at their sites in order to do a better job of:

a) Turning prospects into customers
b) Offering customers what they want in a more timely manner
c) Spotting trends and reacting to them in real time

Any company, small or large, that is trying to sift through terabytes of structured and complex data on an hourly, daily or weekly basis for any kind of analytics had better take a long hard look at what it is really paying for. Just as the worldwide recession of 2008-09 brought tremendous attention to lower cost, better value prop alternatives like Pentaho, the “big data” movement is doing the same thing in the DB/DW space. And where do you find some of the best innovations in the tech space? The answer is open source.

Specifically, an open source tech called Apache Hadoop is addressing the “better value proposition for Big Data.” It is also the only tech capable of handling some of these big data applications. Sounds great, right? Well, not exactly. The issue with Hadoop is that it is a very technical product with a command line interface. Once that data gets into Hadoop, how do you get it out? How do you analyze that data? If only there were an ETL and BI product tightly integrated with Hadoop, and available with the right licensing terms…

Today I’m proud to announce that Pentaho has done just that. Early on May 19th we announced our plans to deliver the industry’s first complete end-to-end data integration and business intelligence platform to support Apache Hadoop. Over the next few months we’ll be rolling out versions of our Pentaho Data Integration product and our BI Suite products that will provide Hadoop installations with a rich, visual analytical solution. Early feedback from joint Hadoop-Pentaho sites has been extremely positive and the excitement level is high.

Hadoop came out of the Apache open source camp. It is the best technology around for storing monster data sets. Until recently, only a small number of organizations used it, primarily those with deep technical resources. However, as the tech matures the audience is widening and now with a rich ETL and analytical solution it is about to get even bigger.

Stay tuned to our website and to this blog as I’ll be sharing many success stories over the next 90 days. And most importantly, watch out for the ‘Golden Goose’ licensing schemes from the old school vendors.

Richard

Visit www.pentaho.com/hadoop to watch a demo of Pentaho Enterprise integration with Hadoop and reserve your place in the beta program.


You can run but you can’t hide

May 18, 2010

Social media has changed the whole customer-vendor relationship. If our customers are not happy, they have hundreds of ways to tell a lot of people. Vendors can run but they can’t hide.

Howard Dresner gets it. He tapped into the social media world to gather a sampling of over 450 business intelligence users to learn their real-world experiences with their vendors. He gathered their feedback in the new market study by Dresner Advisory Services, The Wisdom of Crowds Business Intelligence Market Study™. Here are a few of the highlights:

  • Pentaho, in the Emerging Vendor category, was ranked against six other vendors and received the top ranking in value for the price paid and quality of consulting services.
  • Pentaho is the clear leader in OSBI, ranking above other OSBI vendors in every category.
  • Pentaho received the second highest marks overall among BI vendors.
  • 100 percent of responding customers recommended Pentaho.

Red Hat’s slogan comes to mind in this situation: “Truth Happens.” With the growth of social media you cannot hide if you have bad support or a crappy product – social media puts the power back into the users’ hands.

You can’t just say you are the best or largest – you have to be it. It is easy to throw around meaningless statistics about having “over 100,000 community members”, when the 100,000 merely visited a website or forge and registered to get community editions or view a forum. What is worse is when over 75% of these “community members” only visited that site one time and never returned. Making these grossly misleading claims is bad for commercial open source in general and horrible for the specific vendors.

Beyond the Wisdom of Crowds report, I want to point out a few other independent surveys that help show the truth:

Open Source Solutions: Managing, Analyzing and Delivering Business Information
BeyeNETWORK Research Report by Mark Madsen

The End of Enterprise Software: Open Source Finds an Opening
By Wayne Eckerson, Director, TDWI Research

If you want to know what your peers are saying about Pentaho in The Wisdom of Crowds Business Intelligence Market Study™, the survey is available for download at: http://www.pentaho.com/wisdom_of_crowds/

Richard


What is Agile BI? Your answers from business user to fluff

May 12, 2010

Last month, Pentaho sponsored a contest in which people answered the question, “What does agile BI mean?” I was lucky enough to be one of the judges who determined which entries made it to the final five, each winning a Flip Ultra™ camcorder. The results were posted today at http://www.pentaho.com/what_is_agile/. Now it’s up to the community to vote for their favorite answer, and the winner gets an iPad (yes, that means you).

While reading through hundreds of entries I began to see a pattern, and being an old BI guy, that meant I had to make a pie chart. The answers fell into 5 main groups: BI Solution Development, Business Users, Entire Business, Marketing Fluff and Other.

Almost 34% of the entries cited Agile BI as an iterative methodology for developing BI solutions involving the end user as early and often as possible.  It is exemplified by one of the finalists “Agile is about speeding up the design/create/ship/observe cycle. The more you ship and observe, the better you learn to design and do. Whether you’re headed in the wrong direction or the right one, it’s imperative that you find that out as soon as possible.” Exactly what we have started with the PDI 4.0 release and are continuing to focus on.

A full 25% of the responses focused on the business user with quotes like, “Agile is never being caught flat-footed – being able to react and adapt with ease, leaving competitors in your wake.” The ability for end users to explore and analyze business data beyond static reporting is very important.  Applications like Pentaho Analyzer and Web-based Ad hoc Query and Reporting address this need. The modeling perspective added to PDI 4.0 reduces the complexity and learning curve associated with building metadata models and schemas in order to put that analytical power into the end user’s hands.

A little over 12% were not concerned whether the agility was on the development or user side.  They just knew that the business had to react quickly to changing business conditions.  “Agile means being able to rapidly adjust to changing conditions with speed and accuracy” was a typical response in this category.

Exactly 15% of people responded with what I call fluffy messages. These were creative and got the most attention from our marketing people (I wasn’t the only judge): “The antonym to SAP”, “Less work, more money” and “Agile (with Pentaho) means never having to say you’re sorry.”

The last 14% were entries like “agile is eliga read backwards” and the self-referencing “The only way to make agile decisions.” Not sure where they were going with some of these but they were also entertaining.

Out of the entries, there were nine attempts to make a phrase by using words that start with the letters A-G-I-L-E. Two people submitted papers on agile BI.  We even had a submission written in Haiku.  No one went for the extra creativity points by using video or interpretive dance.  We didn’t get any abusive, obscene or SPAM entries, which was nice.

There were only two negative responses complaining that Agile BI was marketing hype. Here is one of them: “Two answers. 1. Personally, ‘Agile BI’ means nothing to me. Sounds like yet another attempt by marketing to create an artificial differentiator. 2. If I had to describe ‘Agile BI’ or die, I’d say, ‘An Agile BI environment enables an organization’s people and processes to quickly adapt to new or changed user requirements, ideally through self-learning and pre-emptive adjustments.’” Answer two is exactly what we intend to enable with this initiative, and when we are successful, that will prove answer one wrong.

Pentaho is committed to Agile BI. We believe our development plan is in line with the majority of respondents to this contest. PDI 4.0 is a great start, but it is just the first step, and we are using this feedback to help set the product roadmap for the next half of this year and beyond. Thank you for participating! Please vote for one of the finalists.

Doug Moran
Pentaho Community Guy


Former Informatica exec joins Pentaho

May 4, 2010

If you have the world’s most popular open source data integration (DI) offering and want to make it even better, you need the top DI market strategist behind the product. That is why we turned to DI and analytics veteran Joe Nicholson. Joe has over 25 years of technology and marketing management experience with companies like Informatica, DecisionPoint and Trintech. I asked Joe the same question I asked Tom Leonard when he joined: “What brought you to Pentaho?” Welcome, Joe!

Guest Blogger: Joe Nicholson

I have been in DI and analytics in one form or another for most of my career. Whether it was marketing the data integration platform at Informatica or doing business development around packaged finance analytics at DecisionPoint, I am really an analytics guy at heart. Part of my passion no doubt comes from solving the enduring complexity and depth of the BI challenge. A decade ago, our challenges were focused on dealing with the complexity of data sources and providing cost effective information and analytics to the emerging generation of managers and mobile workers. At that time, data volumes and sources were exploding (or so we thought), and our primary customer prospects were those who were hand-coding their ETL projects or dumping their finance and other data into Excel and who wanted a better way. What is particularly intriguing to me is that those challenges haven’t really changed over time. Data volumes and sources continue to grow, now at a rate that was impossible to imagine 10 years ago. And, while the use of BI has certainly increased for the business user, it is nowhere near where I would have expected it to be given all of the hype around Pervasive BI, BI for the Masses, the Democratization of BI or the other terms that were being used at that time. As a technology, it is still too expensive, too complex and too slow.

What has changed is the mindset of the business user. Frankly, the ROI for BI was oversold in the early and middle years, and the recent economy has focused end users on TCO like never before. The need for faster, more agile and more diverse projects now drives end user requirements.

I had followed Pentaho for several years, both from the technology and business model angles. First, from a technology perspective, they are solving many of the issues faced by end users by producing a complete, integrated BI suite that includes purpose built ETL, data modeling, reporting and analysis, all within a single platform that handles administration, security and the rest of the functions needed for deployment and maintenance of the BI applications.

Second, the old way of licensing BI technology (and other software) has run its course. The era of open source and commercial open source has arrived. In today’s cost conscious, business driven climate, why would anyone run a long, extensive procurement process, construct costly prototypes and pay $10,000 to $100,000+ of license plus maintenance fees before they can actually see if the technology works for their purpose?

Enter Pentaho and the open source development and distribution model. Pentaho and its thousands of community members, together with its own professional engineering staff, have developed the Pentaho Enterprise BI Suite, which is on par with the offerings of all of the major players in the market across ETL, data modeling, reporting and analysis. The game has changed with Pentaho’s recent announcements of Pentaho Data Integration Version 4.0 and the Agile BI integrated development environment, which forever changes how IT developers and business users collaborate to produce relevant BI applications.

It’s an exciting time to be in the DI and BI markets and even more exciting to be at Pentaho.

Joe Nicholson,
Vice President Product Marketing
Pentaho Corporation

Read the press release Pentaho Names Joe Nicholson Vice President of Product Marketing


Pentaho hires heavy hitter to head Europe

May 4, 2010

This week kicks off our EMEA Agile BI Roadshow. If you are in Turkey, Milan or Madrid make sure to drop by and meet Vinay Joosery, our new VP of Sales for the EMEA (Europe, Middle East and Africa) region. The growth in EMEA is blazing, creating a wealth of sales opportunities, and we are stepping up our game accordingly. Vinay brings the skill set that will propel our European operations forward at this critical time and build a sales operation that can take on and take down the incumbents across the region. Vinay spent seven years at MySQL (Oracle/Sun), so I asked him, “What brought you to Pentaho?” Välkomna Vinay!

Guest blogger Vinay Joosery

The new guy often sees things differently to the ‘lifers’. I thought, after seven years in sales at the pioneering MySQL, it might take a while to settle in. At Pentaho though, life moves fast and there is a strong sense of community, so heading into my second month here, it already feels like I have been here for a while.

One reason for that is the similarity in approach between the two companies. Pentaho is pioneering the commercialization of the leading end-to-end open source BI platform, at the very time that many organizations are looking for just such a solution. For me, there was no better professional challenge.

In my interactions with Pentaho customers, I see a wide range of adoption. We have small, fast-moving organizations that need to stay ahead of the curve and aggressively grow their top line, like Aspiro, the ‘Nordic iTunes’. We also have larger, more conservative shops that, perhaps driven by more austere economic times, need to reduce bottom-line costs. One good example is Swissport, a billion-euro enterprise that employs over 30,000 people. It is great to see first-hand the real business value that customers are getting today from our offerings.

One important factor for these companies was the opportunity to ‘try before they buy’, easily gather independent information about the product and get assistance when they needed it. Businesses are looking to ‘de-risk’ their investments, and this is where the deal is irresistible. In my previous roles dealing with high-end proprietary software, this was not possible. With Open Source it is common practice – one can follow the development of the products, access the public bug database, contribute and help influence the product roadmap.

What I love most about Pentaho is the great product, already in use by thousands of organizations; the engineers, who are passionate about the product; and the large and active community, which keeps us all honest. In EMEA, we are very focused on ramping up marketing and sales.

It will take all of my experience of living on three continents and test my three European language skills, but I am relishing this new challenge. Thanks for welcoming me onboard with such warmth and feel free to reach out. I will do my utmost to deliver for all of you.

Cheers,
Vinay Joosery
VP Sales, EMEA
Pentaho Corporation

Read the press release, Pentaho Hires Heavy Hitter to Head Europe

