Top 10 Reasons Behind Pentaho’s Success

September 2, 2011

To continue our revival of old blog posts, today we have our #2 most popular blog post from last July. Pentaho is now 7 years old, with sales continually moving up and to the right. In a crazy economy, many are asking, “What is the reason behind your growth and success?” Richard Daley reflected on this question after reporting on quarterly results in 2010.

*****Originally posted on July 20, 2010*****

Today we announced our Q2 results. In summary Pentaho:

  • More than doubled new Enterprise Edition Subscriptions from Q2 2009 to Q2 2010.
  • Exceeded goals resulting in Q2 being the strongest quarter in company history and most successful for the 3rd quarter in a row.
  • Became the only vendor that lets customers choose the best way to access BI: on-site, in the cloud, or on the go using an iPad.
  • Led the industry with a series of market firsts including delivering on Agile BI.
  • Expanded globally, received many industry recognitions and added several stars to our executive bench.

How did this happen? Mostly because of our laser focus over the past 5 years to build the leading end-to-end open source BI offering. But if we really look closely over the last 12-18 months there are some clear signs pointing to our success (my top ten list):

Top 10 reasons behind Pentaho’s success:

1.     Customer Value – This is the top of my list. Recent analyst reports explain how we surpassed the $2 billion mark during Q2 in terms of cumulative customer savings on business intelligence and data integration license and maintenance costs. In addition, we ranked #1 in terms of value for price paid and quality of consulting services amongst all Emerging Vendors.

2.     Late 2008-Early 2009 Global Recession – this was completely out of our control but it helped us significantly by forcing companies to look for lower cost BI alternatives that could deliver the same or better results than the high priced mega-vendor BI offerings. This made #1 more attractive to companies worldwide.

3.     Agile BI – we announced our Agile BI initiative in Nov 2009 and received an enormous amount of press and positive reception from the community, partners, and customers. We’ve been showing previews and releasing RCs in Q1-Q2 2010 and put PDI 4.0 in GA at the end of Q2 2010.

4.     Active Community – A major contributing factor to our massive industry adoption is our growing number of developer stars (the Pentaho army) that continue to introduce Pentaho into new BI and data integration projects. Our community triples the amount of work of our QA team, contributes leading plug-ins like CDA and PAT, writes best-selling books about our technologies and self-organizes to spread the word.

5.    BI Suite 3.5 & 3.6 – 3.5 was a huge release for the company and helped boost adoption and sales in Q3-Q4 2009. This brought our reporting up to and beyond that of competitors. In Q2 2010 the Pentaho BI Suite 3.6 GA brought this to another level, including enhancements and new functionality for enterprise security, content management and team development as well as the new Enterprise Edition Data Integration Server. The 3.6 GA also includes the new Agile BI integrated ETL, modeling and data visualization environment.

6.     Analyzer – the addition of Pentaho Analyzer to our product lineup in Sept-Oct 2009 was HUGE for our users – the best web-based query and reporting product on the market.

7.     Enterprise Edition 30-Day Free Evaluation – we started this “low-touch/hassle free” approach in March 2009 and it has eliminated the pains that companies used to have to go through in order to evaluate software.

8.     Sales Leadership – Lars Nordwall officially took over Worldwide Sales in June 2009 and, by a combination of building upon the existing talent and hiring great new team members, he has put a world-class team and best practices in place.

9.     Big Data Analytics – we launched this in May 2010 and have received very strong support and interest in this area. We currently have a Pentaho-Hadoop beta program with over 40 participants. There is a large and unfulfilled requirement for Data Integration and Analytic solutions in this space.

10.   Whole Product & Team – #1-#9 wouldn’t work unless we had all of the key components necessary to succeed – doc, training, services, partners, finance, qa, dev, vibrant community, IT, happy customers and of course a sarcastic CTO ;-)

Thanks to the Pentaho team, community, partners and customers for this great momentum. Everyone should be extremely proud of the fact that we are making history in the BI market. We have a great foundation on which to continue this rapid growth, and with the right team and passion, we’ll push through our next phase of growth over the next 6-12 months.

Quick story to end the note: I was talking and whiteboarding with one of my sons a few weeks ago (yes, I whiteboard with my kids) and he was asking certain questions about our business (how do we make money, why are we different than our competitors, etc.). I explained at a high level how we are basically “on par and in many cases better” than the Big Guys (IBM, ORCL, SAP) with regard to product, and how we provide superior support/services, yet we cost about 10% as much as they do. To which my son replied, “Then why doesn’t everyone buy our product?” Exactly.

Richard
CEO, Pentaho


High availability and scalability with Pentaho Data Integration

March 31, 2011

“Experts often possess more data than judgment.” – Colin Powell. Hmmm, those experts surely are not using a scalable Business Intelligence solution to optimize that data, which could help them make better decisions. :-)

Data is everywhere! The amount of data being collected by organizations today is experiencing explosive growth. In general, ETL (Extract Transform Load) tools have been designed to move, cleanse, integrate, normalize and enrich raw data to make it meaningful and available for knowledge workers and decision support systems. Only once data has been “optimized” can it be turned into “actionable” information using the appropriate business applications or Business Intelligence software. This information could then be used to discover how to increase profits, reduce costs or even write a program that suggests what your next movie on Netflix should be. The capability to pre-process this raw data before making it available to the masses becomes increasingly vital to organizations that must collect, merge and create a centralized repository containing “one version of the truth.” Having an ETL solution that is always available, extensible and highly scalable is an integral part of processing this data.

Pentaho Data Integration

Pentaho Data Integration (PDI) can provide such a solution for many varying ETL needs. Built upon an open Java framework, PDI uses a metadata-driven design approach that eliminates the need to write, compile or maintain code. It provides an intuitive design interface with a rich library of prepackaged, pluggable design components. ETL developers with skill sets that range from the novice to the Data Warehouse expert can take advantage of the robust capabilities available within PDI immediately with little to no training.

The PDI Component Stack

Creating a highly available and scalable solution with Pentaho Data Integration begins with understanding the PDI component stack.

● Spoon – IDE for creating Jobs and Transformations, including the semantic layer for the BI platform
● Pan – command line tool for executing Transformations modeled in Spoon
● Kitchen – command line tool for executing Jobs modeled in Spoon
● Carte – lightweight ETL server for remote execution
● Enterprise Data Integration Server – remote execution, version control repository, enterprise security
● Java API – write your own plug-ins or integrate into your own applications

Spoon is used to create the ETL design flow in the form of a Job or Transformation on a developer’s workstation. A Job coordinates and orchestrates the ETL process with components that control file movement, scripting, conditional flow logic and notification, as well as the execution of other Jobs and Transformations. The Transformation is responsible for the extraction, transformation and loading or movement of the data. The flow is then published or scheduled to the Carte or Data Integration Server for remote execution. Kitchen and Pan can be used to call PDI Jobs and Transformations from your external command line shell scripts or 3rd party programs. There is also a complete Java SDK available to integrate and embed these processes into your Java applications.

Figure 1: Sample Transformation that performs some data quality and exception checks before loading the cleansed data
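
For readers who script these tools, here is a minimal sketch of how an external program might assemble a Pan or Kitchen call. It is an illustration only: the install path, the file names and the `build_pdi_command` helper are hypothetical, and it assumes the `-file` and `-level` command line options and the usual `.ktr` (Transformation) and `.kjb` (Job) file extensions, so check the options against the documentation for your PDI version.

```python
from pathlib import Path

# Assumed convention: Pan runs Transformations (.ktr), Kitchen runs Jobs (.kjb).
EXPECTED_SUFFIX = {"pan": ".ktr", "kitchen": ".kjb"}


def build_pdi_command(tool_script, artifact, log_level="Basic"):
    """Return the argument list for running a PDI file with Pan or Kitchen.

    tool_script: path to the launcher, e.g. a hypothetical /opt/pdi/pan.sh
    artifact:    path to the .ktr or .kjb file modeled in Spoon
    log_level:   value passed to the assumed -level option
    """
    tool = Path(tool_script).stem.lower()
    if tool not in EXPECTED_SUFFIX:
        raise ValueError(f"unknown PDI tool: {tool_script}")
    if Path(artifact).suffix.lower() != EXPECTED_SUFFIX[tool]:
        raise ValueError(f"{tool} expects a {EXPECTED_SUFFIX[tool]} file")
    # Passing a list (not a single string) avoids shell quoting problems.
    return [tool_script, f"-file={artifact}", f"-level={log_level}"]


if __name__ == "__main__":
    print(build_pdi_command("/opt/pdi/pan.sh", "/etl/load_customers.ktr"))
```

To actually launch the run, the returned list could be handed to `subprocess.run(cmd, check=True)`. Assuming the tool reports failure through its exit code, a failed run would then raise an exception that the calling script can handle.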

PDI Remote Execution and Clusters

The core of a scalable/available PDI ETL solution involves the use of multiple Carte or Data Integration servers defined as “Slaves” in the ETL process. The remote Carte servers are started on different systems in the network infrastructure and listen for further instructions. Within the PDI process, a Cluster Scheme can be defined with one Master and multiple Slave nodes. This Cluster Scheme can be used to distribute the ETL workload in parallel appropriately across these multiple systems. It is also possible to define Dynamic Clusters where the Slave servers are only known at run-time. This is very useful in cloud computing scenarios where hosts are added or removed at will. More information on this topic including load statistics can be found here in an independent consulting white paper created by Nick Goodman from Bayon Technologies, “Scaling Out Large Data Volume Processing in the Cloud or on Premise.”

Figure 2: Cx2 means these steps are executed clustered on two Slave servers
All other steps are executed on the Master server
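
To make the Master/Slave idea concrete, here is a small conceptual sketch of splitting rows across workers and gathering the results. It is not how Carte or a PDI Cluster Scheme is implemented: threads stand in for Slave servers, and the names `partition_round_robin` and `run_clustered` are made up for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_round_robin(rows, slave_count):
    """Deal rows out to slave_count partitions, one row at a time."""
    if slave_count < 1:
        raise ValueError("slave_count must be at least 1")
    partitions = [[] for _ in range(slave_count)]
    for index, row in enumerate(rows):
        partitions[index % slave_count].append(row)
    return partitions


def run_clustered(rows, slave_count, step):
    """Apply step to every row, one worker thread per simulated Slave.

    The caller plays the Master role: it splits the work, waits for
    every worker to finish, and concatenates the partial results.
    """
    partitions = partition_round_robin(rows, slave_count)
    with ThreadPoolExecutor(max_workers=slave_count) as pool:
        results = list(pool.map(lambda part: [step(r) for r in part], partitions))
    # Results come back partition by partition, not in original row order.
    return [row for part in results for row in part]


if __name__ == "__main__":
    print(run_clustered(range(6), 2, lambda value: value * 10))
```

Round-robin is only one way to split the rows; partitioning on a key would be the natural choice when rows that belong together must land on the same worker.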

The Concept of High Availability, Recoverability and Scalability

Building a highly available, scalable, recoverable solution with Pentaho Data Integration can involve a number of different parts, concepts and people. It is not a check box that you simply toggle when you want to enable or disable it. It involves careful design and planning to prepare for and anticipate the events that may occur during an ETL process. Did the RDBMS go down? Did the Slave node die? Did I lose network connectivity during the load? Was there a data truncation error at the database? How much data will be processed at peak times? The list can go on and on. Fortunately, PDI arms you with a variety of components, including complete ETL metric logging, web services and dynamic variables, that can be used to build recoverability, availability and scalability scenarios into your PDI ETL solution.

For example, Jens Bleuel, Managing Consultant in EMEA, developed a PDI implementation of the popular Watchdog concept. A solution that uses a Watchdog includes checks that monitor whether everything is on track while its tasks and events execute. Visit the link above for more information on this implementation.
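
The general Watchdog idea can be sketched in a few lines. This is a generic illustration, not Jens Bleuel’s PDI implementation: the `Watchdog` class, its method names and the 30-second timeout are all assumptions chosen for the example.

```python
import time


class Watchdog:
    """Flags a process as stalled when no heartbeat arrives in time."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_beat = clock()

    def heartbeat(self):
        """Called by the monitored task each time it makes progress."""
        self._last_beat = self._clock()

    def expired(self):
        """Return True when the last heartbeat is older than the timeout."""
        return self._clock() - self._last_beat > self._timeout


if __name__ == "__main__":
    dog = Watchdog(timeout_seconds=30)
    dog.heartbeat()          # e.g. after each batch of rows is loaded
    if dog.expired():        # checked periodically by a separate monitor
        print("ETL task appears stalled; raise an alert or restart it")
```

In an ETL setting the monitored process would send a heartbeat after each unit of work, and a separate monitor would poll `expired()` and decide whether to notify someone or trigger a recovery step.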

Putting it all together – (Sample)

Diethard Steiner, active Pentaho Community member and contributor, has written an excellent tutorial that explains how to set up PDI ETL remote execution using the Carte server. He also provides a complete tutorial (including sample files provided by Matt Casters, Chief Architect and founder of Kettle) on setting up a simple “available” solution to process files, using Pentaho Data Integration. You can get it here. Please note that advanced topics such as this are also covered in greater detail (designed by our Managing Consultant Jens Bleuel – EMEA) in our training course available here.

Summary

When attempting to process the vast amounts of data collected on a daily basis, it is critical to have a Data Integration solution that is not only easy to use but also easily extendable. Pentaho Data Integration achieves this extensibility with its open architecture, component stack and object library, which can be used to build a scalable and highly available ETL solution without exhaustive training and with no code to write, compile or maintain.

Happy ETLing.

Regards,

Michael Tarallo
Senior Director of Sales Engineering
Pentaho

This blog was originally published on the Pentaho Evaluation Sandbox, a comprehensive resource for evaluating and testing Pentaho BI.


Happy 1 year anniversary Pentaho Agile BI!

December 8, 2010

Today we celebrate the one year anniversary of launching our Pentaho Agile BI initiative. Over the past year the Pentaho Agile BI initiative has received tremendous traction and praise throughout the industry and, more importantly, has impacted organizations in a positive way.

We thought an appropriate gift for our community is something that you would find insightful. For you, we have three new white papers that cover Pentaho Agile BI from different perspectives:

  • Pentaho Agile BI: An iterative methodology for fast, flexible and cost-effective Business Intelligence, by Pentaho Chief Geek/CTO, James Dixon
  • Business Intelligence at the Crossroads: The Need for Lean, Agile and Effective End-User Solutions, by Joshua Greenbaum, principal, Enterprise Applications Consulting
  • Realizing the Agile BI Opportunity, by Joshua Greenbaum, principal, Enterprise Applications Consulting

You can download the complimentary Agile BI white papers here.


Now available: Pentaho Kettle Solutions

September 21, 2010

Congrats to Matt Casters, Roland Bouman and Jos van Dongen for their new book available today: Pentaho Kettle Solutions: Building Open Source ETL Solutions with Pentaho Data Integration. You can buy it on Amazon or Wiley (they also have free excerpts).

If their names sound familiar it is because Roland and Jos are also the authors of the Amazon best seller Pentaho Solutions: Business Intelligence and Data Warehousing with Pentaho and MySQL, published August 2009. Matt is the Kettle Project founder and Chief Architect of Pentaho Data Integration.

My copy is ordered and I can’t wait to read it.

Congrats
Doug Moran
Pentaho Community Guy


Where is Pentaho this July?

July 12, 2010

This July, Pentaho continues the Worldwide Techcast Series demonstrating how Pentaho’s Agile BI initiative will help you speed development of new BI applications and better ensure that these applications meet the needs of your business users. Learn about Pentaho Data Integration 4.0 and Agile BI in six languages: Italian, German, Portuguese, Spanish, French and Norwegian. Sign up for a live techcast this month or watch the series on-demand.

We are also holding an executive breakfast in the city where high-tech meets southern hospitality, Raleigh, North Carolina, on July 20th at EvoApp Live. If you are in the ‘Triangle’ make sure to reserve your seat for this interactive panel discussion about how business analytics are driving top performing companies to make better, faster, more informed decisions.

Pentaho Featured Events

The highlighted events for the month of July are webcasts for those looking to learn more about OSBI and simple ways to get started. These webcasts are free; however, early registration is recommended.

Comparing the Cost of Business Intelligence – Proprietary Vs. Open Source.
With leading analyst Mark Madsen
July 13 at 14:00 EDT (18:00 GMT)
Register Now

* If you cannot attend the webcast on July 13, register for the July 22 webcast.
* For additional background to this webcast download Mark Madsen’s latest white paper, Lowering the cost of Business Intelligence with Open Source.

BI On-Demand – See your data in a dashboard in just 3 days!
Wednesday, July 28, 2010 14:00 EDT (18:00 GMT)
Register Now

Visit our events page for more details and updated events.
Follow us on Twitter or Facebook for event reminders and updates.


Top 3 reasons to download PDI 4.0 and BI Suite 3.6

June 10, 2010

Today we announced General Availability of Pentaho Data Integration 4.0 and Pentaho BI Suite 3.6. This marks a major step forward on our mission to empower users with the most comprehensive, easy-to-use, and integrated Business Intelligence suite on the market today. All too often, new BI projects end up on IT’s cutting room floor due to a variety of factors:

  • Licensing costs – it’s too expensive to add the additional users/CPUs for the project to reach its intended audience.  A single BI project can have licensing impacts on several software products including data integration (ETL), reporting, dashboards, analytics, database, and on and on…
  • Lengthy time to ROI – a classic pitfall of new BI projects is spending weeks or months ‘getting the data right’ before business users have an opportunity to interact and provide feedback.  Inevitably, this leads to missed requirements, delayed rollouts, and blown budgets.
  • Technical Resources – do you have the right technical resources available and are they knowledgeable in all of the tools and technology involved?  Am I beholden to availability of IT, or can I accomplish this myself?

Our Agile BI initiative is breaking down these barriers by delivering a BI Suite with:

  • All of the functionality you actually need at 20% the cost of a comparable solution from the big guys… bye bye bloat-ware
  • Integrated design environment combining all aspects of a BI solution from Data Integration (ETL) through data visualization, thereby encouraging collaborative, cross-team interactions between solution architects and end users and faster iterations… compare that with a hybrid Informatica/DataStage – Oracle/Business Objects/Cognos solution
  • A modern, standards-based architecture that deploys in minutes and is easy to customize or extend to meet the changing needs of your business… guaranteed 97% less super glue and duct tape under the hood than comparable proprietary BI Suites

So you’re thinking… enough with the marketing bullets.  Why should I download or upgrade to these new releases? The top 3 reasons:

1. Design Perspectives – New perspectives in PDI’s designer (Spoon) providing one click visualization of data and simple, drag-and-drop metadata modeling for OLAP and reporting metadata

2. PDI Enterprise Edition Data Integration Server – All the execution and clustering capabilities of our core offering plus integrated scheduling, advanced security, and enhanced content management including complete revision history on all of your jobs and transformations

3. User Console improvements – There are numerous improvements to our visualization plugins including:

  • Support for scheduling, emailing, member properties and dashboard filter integration in Analyzer
  • Configurable auto-refresh intervals on dashboards and dashboard widgets and integration of Dashboard filters with Pentaho metadata data sources

Get started today by downloading your copy from http://www.pentaho.com/download/

Jake Cornelius
Director of Product Management
Pentaho Corporation

