Announcing Pentaho with Storm and YARN

February 11, 2014

One of Pentaho’s core beliefs is that you can’t prepare for tomorrow with yesterday’s tools. In June of 2013, amidst waves of emerging big data technologies, Pentaho established Pentaho Labs to drive innovation through the incubation of these new technologies. Today, one of our Labs projects hatches.  At the Strata Conference in Santa Clara, we announced native integration of Pentaho Data Integration (PDI) with Storm and YARN. This integration enables developers to process big data and drive analytics in real-time, so businesses can make critical decisions on time-sensitive information.

Read the announcement here.

Here is what people are saying about Pentaho with Storm and YARN:

Pentaho Customer
Bryan Stone, Cloud Platform Lead, Synapse Wireless: “As an M2M leader in the Internet of Everything, our wireless solutions require innovative technology to bring big data insights to business users. The powerful combination of Pentaho Data Integration, Storm and YARN will allow my team to immediately leverage real-time processing, without the delay of batch processing or the overhead of designing additional transformations. No doubt this advancement will have a big impact on the next generation of big data analytics.”

Leading Big Data Industry Analyst
Matt Aslett, Research Director, Data Management and Analytics, 451 Research: “YARN is enabling Hadoop to be used as a flexible multi-purpose data processing and analytics platform. We are seeing growing interest in Hadoop not just as a platform for batch-based MapReduce but also rapid data ingestion and analysis, especially using Apache Storm. Native support of YARN and Storm from companies like Pentaho will encourage users to innovate and drive greater value from Hadoop.”

Pentaho founder and Pentaho Labs Leader
Richard Daley, Founder and Chief Strategy Officer, Pentaho: “Our customers are facing fast technology iterations from the relentless evolution of the big data ecosystem. With Pentaho’s Adaptive Big Data Layer and Big Data Analytical Platform our customers are “future proofed” from the rapid pace of evolution in the big data environment. In 2014, we’re leading the way in big data analytics with Storm, YARN, Spark and predictive, and making it easy for customers to leverage these innovations.”

Learn more about the innovation of Pentaho Data Integration for Storm on YARN in Pentaho Labs at pentaho.com/storm

If you are at the O’Reilly Strata Conference in Santa Clara this week (February 11-13), make sure to stop by Booth 710 to see a live demo of Pentaho Data Integration with Storm and YARN. The Pentaho team of technologists, data scientists and executives will be on hand to share the latest big data innovations from Pentaho Labs.

Donna Prlich
Senior Director, Product Marketing
Pentaho


Analyze 10 years of Chicago Crime with Pentaho, Cloudera Search and Impala

December 23, 2013

Hadoop is a complex technology stack and many people getting started with Hadoop spend an inordinate amount of time focusing on operational aspects – getting the cluster up and running, obtaining foundational training, and ingesting data. Consequently it can be difficult to get a good picture of the true value that Hadoop provides, namely unlocking insight across multiple data streams that add valuable context to the transactional history comprising most of the core data in the enterprise.

At Strata Hadoop World in October, Pentaho’s Lord of 1’s and 0’s or CTO, James Dixon, unveiled a powerful demonstration of the true value that Hadoop – combined with enabling technology from Pentaho and our partner Cloudera – can provide. He took a publicly available data set provided by the City of Chicago and built a demo around it that enables nontechnical end-users to understand how crime patterns have changed over time in Chicago, unlocking insight into the type of crimes being committed in different areas of the city – not only historically but also broken down by time of day and day of week. As a result, citizenry as well as law enforcement have a much better sense of what to expect on the streets of Chicago from the insight the demonstration provides.

In the demo, end-users start with a dashboard that provides a high-level understanding of the mix of crimes historically committed on the streets of Chicago over the last ten years. Watch the demo here:

This kind of top-to-bottom understanding of (in this case) crime patterns is uniquely enabled by the capability Pentaho delivers to the market, combining dashboarding, analytics and data integration into one easily-embedded platform that leverages blending across multiple data sets.

The deep understanding that Pentaho’s solution delivers to end-users is enabled by two key technologies from Cloudera: Cloudera Search and Impala. The original data set provided by the City of Chicago was loaded into a Cloudera Hadoop cluster using Pentaho’s data integration tool, Pentaho Data Integration (“PDI”). End-user drilldown is powered by Cloudera Search, which executes a faceted search on behalf of Pentaho’s dashboard. Once an area of interest has been located, Cloudera’s Impala runs low-latency SQL queries against the raw data stored in the Hadoop cluster to bring up individual crime records.
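The two-stage flow described above (faceted counts for narrowing down, then a record-level lookup) can be sketched in a few lines of plain Python. This is only an illustration of the pattern: it does not use Cloudera Search or Impala, and the sample records, field names and helper functions are hypothetical.

```python
from collections import Counter

# Hypothetical sample rows standing in for the Chicago crime data set.
RECORDS = [
    {"id": 1, "type": "THEFT", "district": "Loop", "hour": 14},
    {"id": 2, "type": "THEFT", "district": "Loop", "hour": 22},
    {"id": 3, "type": "BATTERY", "district": "Loop", "hour": 22},
    {"id": 4, "type": "THEFT", "district": "Lincoln Park", "hour": 9},
    {"id": 5, "type": "NARCOTICS", "district": "Austin", "hour": 22},
]


def matches(record, filters):
    """Return True when the record equals every filter value."""
    return all(record[field] == value for field, value in filters.items())


def facet_counts(records, field, filters=None):
    """Stage 1: count records per value of `field`, within the current filters.

    This mimics what a faceted search returns to a dashboard.
    """
    filters = filters or {}
    return Counter(r[field] for r in records if matches(r, filters))


def drill_down(records, filters):
    """Stage 2: fetch the individual records behind a selected facet.

    In the demo this role is played by a SQL query; here it is a simple filter.
    """
    return [r for r in records if matches(r, filters)]


# Start with the overall mix of crime types, then narrow by type and district.
by_type = facet_counts(RECORDS, "type")
theft_by_district = facet_counts(RECORDS, "district", {"type": "THEFT"})
loop_thefts = drill_down(RECORDS, {"type": "THEFT", "district": "Loop"})
```

The point of the split is that the dashboard only ever needs cheap aggregate counts until the user picks a specific slice, at which point the full records are retrieved.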

Although Hadoop is often perceived as a geek’s playground, the power of Pentaho’s business-friendly interface is readily apparent when engaging this demo. Unlocking the power of Hadoop can be as simple as engaging Pentaho’s integrated approach to analytics together with Cloudera’s foundational platform to deliver an integrated solution whose value is apparent to nontechnical executives wondering whether Hadoop is the right choice for a key initiative.

Rob Rosen
Field Big Data Lead
Pentaho


Big Data, Big Revenue for Marketers

December 12, 2013

Why might Big Data mean millions for marketing?  Because it has the potential to create a more complete picture of the buyer, thereby empowering marketers to more effectively deliver the right message to the right individual at the right time – and ultimately increase sales.  In the following brief video from DMA 2013, Marketo VP/Co-founder Jon Miller and Pentaho CMO Rosanne Saccone provide a crash course on what Big Data means for marketers.  It covers:

  • The defining characteristics of Big Data – Velocity, Variety, & Volume
  • How marketers can leverage Big Data to blend operational information (CRM, ERP) and online data (web activity, social networking interactions) for new insights
  • Sample Big Data use cases that organizations are green-lighting today to optimize customer interactions and drive marketing’s contribution to revenue

Note that this is an excerpt from a larger presentation – for the full video please click here.

We’d also recommend this blog post by Jon Miller for more context on Big Data in marketing.

For additional compelling use cases that leverage Big Data for marketing and other functions, see here.

Ben Hopkins
Product Marketing
Pentaho


Big Data 2014: Powering Up the Curve

December 5, 2013

Last year, I predicted that 2013 would be the year big data analytics started to go into mainstream deployment, and the research we recently commissioned with Enterprise Management Consultants indicates that’s happened. What really surprised me, though, is the extent to which the demand for data blending has powered up the curve, and I believe this trend will accelerate big data growth in 2014.

Prediction one: The big data ‘power curve’ in 2014 will be shaped by business users’ demand for data blending
Customers like Andrew Robbins of Paytronix and Andrea Dommers-Nilgen of TravelTainment, who recently spoke about their Pentaho projects at events in NY and London, both come from the business side and are achieving specific goals for their companies by blending big and relational data. Business users like these are getting inspired by the potential to tap into blended data to gain new insights from a 360 degree customer view, including the ability to analyze customer behavior patterns and predict the likelihood that customers will take advantage of targeted offers.

Prediction two: big data needs to play well with others!
Historically, big data projects have largely sat in the IT departments because of the technical skills needed and the growing and bewildering array of technologies that can be combined to build reference architectures. Customers must choose from the various commercial and open source technologies including Hadoop distributions, NoSQL databases, high-speed databases, analytics platforms and many other tools and plug-ins. But they also need to consider existing infrastructure including relational data and data warehouses and how they’ll fit into the picture.

The plus side of all this choice and diversity is that after decades of tyranny and ‘lock-in’ imposed by enterprise software vendors, even greater buying power will shift to customers in 2014. But there are also challenges. Managing the heterogeneous data environment that big data analytics involves can be cumbersome. That means IT will be looking for Big Data tools to help deploy, manage and simplify these complex emerging reference architectures. It will be incumbent on the Big Data technology vendors to play well with each other and work towards compatibility. After all, it’s the ability to access and manage information from multiple sources that will add value to big data analytics.

Prediction three: you will see even more rapid innovation from the big data open source community
New open source releases like Hadoop 2.0 with YARN, the next-generation Hadoop resource manager, will make the Hadoop infrastructure more interactive. Projects like Storm, a distributed real-time stream processing system, will enable more real-time, on-demand blending of information in the big data ecosystem.

Since we announced the industry’s first native Hadoop connectors in 2010, we’ve been on a mission to make the transition to big data architectures easier and less risky in the context of this expanding ecosystem. In 2013 we made some massive breakthroughs towards this, starting with our most fundamental resource, the adaptive big data layer. This enables IT departments to feel smarter, safer and more confident about their reference architectures and open up big data solutions to people in the business, whether they be data scientists, data analysts, marketing operations analysts or line of business managers.

Prediction four: you can’t prepare for tomorrow with yesterday’s tools
We’re continuing to refine our platform to support the future of analytics. In 2014, we’ll release new functionality, upgrades and plug-ins to make it even easier and faster to move, blend and analyze relational and big data sources. We’re planning to improve the capabilities of the adaptive data layer and make it more secure and easy for customers to manage data flow. On the analytics side, we’re working to simplify data discovery on the fly for all business users and make it easier to find patterns and catch anomalies. In Pentaho Labs, we’ll continue to work with early adopters to cook up new technologies to bring things like predictive, machine data and real-time analytics into mainstream production.

As people in the business continue to see what’s possible with blended big data, I believe we’re going to witness some really exciting breakthroughs and results. I hope you’re as excited as I am about 2014!

Quentin Gallivan, CEO, Pentaho


Pentaho wins Red Herring Global 100 Award

November 22, 2013

Following Pentaho’s success this summer in winning the Red Herring North America competition, which recognizes the most promising private technology companies, Pentaho was invited to participate in this week’s Red Herring 2013 Top 100 Global event and award competition. And we’re pleased to announce that Pentaho won!

We’re honored to be recognized as a Red Herring 2013 Global Top 100 company!

Red Herring Global culminates a full year of work scouring the globe by the Red Herring editorial team and venture capitalists reviewing thousands of privately held companies in regional competitions around the world. The world’s top technology companies are selected based on financial performance, technology innovation and a variety of other metrics that have made the Red Herring Global 100 a mark of distinction for identifying the most promising private companies and entrepreneurs.

Pentaho was joined by a list of impressive company finalists for the Global 100 competition including Pentaho customers Spreadshirt GmbH and Active Broadband Networks (both ended up on the top 100 list as well – our heartfelt congratulations go out to these innovators!). Top private companies from the Red Herring’s Regional competitions in Europe, North America and Asia flew to the Red Herring Global forum in Los Angeles for the final competition on November 19 and 20.

Pentaho CEO, Quentin Gallivan, was asked to present at the forum, where he shared our point of view about how the big data and analytics markets are transforming, particularly with the need to easily blend big data with other data sources for better business insights. Quentin provided insights from our customer front lines about how Pentaho’s Big Data pilot projects are transitioning into widespread deployments with real business impact—with the most powerful insights coming from blending relational and big data sources.

The Red Herring Global forum concluded with an awards gala, where the 2013 Red Herring Global 100 companies were announced.

This is great validation for Pentaho, as well as our customers, partners and community. Together we’re driving the future of analytics, and the Global 100 award provides a very solid foundation to build upon as we push the boundaries of analytics with Storm, YARN, and predictive analytics in 2014.

Rosanne

Rosanne Saccone
Chief Marketing Officer
Pentaho


9 Years Later….

October 8, 2013

Photo taken the day Pentaho was founded – October 8, 2004

On Oct 8, 2004, five guys got some crazy idea to create a commercial open source BI offering to provide customers of all sizes with a better and more affordable solution than existed from proprietary vendors. Nine years later: “BI” became “BA”, the core platform just underwent its biggest overhaul since its inception, our UI/UX is the best ever, the open source community is still key, big data has become our core growth strategy, predictive is awakening, and we went through the biggest financial crisis since the Great Depression and are achieving great Y/Y bookings growth. Last, but not least, we have been very fortunate to attract and retain a fantastic, talented, passionate team to make us the leader in big data analytics. Big Data is one of the biggest business impacts our industry has seen in decades and we’re making it happen.

Congrats to the entire company for making this real. Happy Birthday Pentaho.

Richard

Richard Daley
Co-Founder and Chief Strategy Officer, Pentaho


Pentaho 5 has arrived with something for everyone!

September 18, 2013

I am tremendously excited to announce that Pentaho Business Analytics 5 is available for download! This release represents the culmination of over 30 man years of engineering effort and contains over 250 new features and improvements. There truly is something for everyone in Pentaho 5. Whether you are an end user, administrator, executive or developer, I want to share with you what I think are the top 3 areas of improvement:

  1. Improving productivity for end users and administrators
  2. Empowering organizations to easily and accurately answer questions using blended big data sets
  3. Simplifying the experience for developers integrating with or embedding Pentaho Business Analytics

Improving Productivity for End Users and Administrators

18 months ago, we challenged ourselves to think deeply about the different profiles of users working with the Pentaho suite and identify the top areas where we could significantly improve our ease-of-use.  Based on the feedback from countless customer interviews and usability studies, the first thing you will notice about Pentaho 5 is a dramatically overhauled User Console.  Beyond the fresh, new, modern look and feel, we’ve introduced a new concept called “perspectives” making it easier than ever for end users to:

  • navigate between open documents
  • browse the repository
  • manage scheduled activities

Throughout the User Console, end users will enjoy numerous improvements and better feedback for common workflows such as designing dashboards or scheduling the execution of a parameterized report. Administrators will appreciate that we have consolidated all Administration capabilities directly into the User Console, enhanced security with the ability to create more specific role types that control the types of actions users can perform, and bundled a comprehensive audit mart providing out-of-the-box answers to common questions about usage patterns, performance and errors.

Analytics-ready Big Data Blending

In the dawn of the Big Data era, a wide range of new storage and processing technologies have flooded the market, each bringing specialized characteristics to help solve the next wave of data challenges.   Pentaho has long been a leader and innovator in delivering an end-to-end platform for designing scalable and easily maintainable Big Data solutions.  Powered by the Pentaho Adaptive Big Data Layer, we’ve dramatically expanded our support for Hadoop with all new certifications for the latest distributions from Cloudera, Hortonworks, MapR and Intel.  Furthermore, we’ve integrated our complete analytics platform for use with Cloudera Impala.  Other Big Data highlights in Pentaho 5 include new integration with Splunk and dramatic ease-of-use improvements when working with NoSQL platforms such as MongoDB and Cassandra.

As organizations large and small map out their next generation data architectures, we see best practice design patterns emerging that help organizations target the appropriate data technology for each use case. Evident in all of these design patterns is the fact that Big Data technologies are rarely information silos. Solving common use cases such as optimizing your data warehousing architecture or performing 360 degree analysis on a customer requires that all data be accessible and blended in an accurate way. Pentaho Data Integration provides the connectivity and design ease-of-use to implement all of these emerging patterns, and with Pentaho 5 I’m excited to announce the world’s first SQL (JDBC) driver for runtime transformation. This integration empowers data integration designers to accurately design blended data sets from across the enterprise, and put them directly in the hands of end users using tools they are already familiar with – reporting, dashboards and visual discovery – as well as predictive analytics.
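To make the idea of a blended data set exposed through SQL a little more concrete, here is a minimal conceptual sketch. It does not use the Pentaho JDBC driver or PDI at all; it relies on Python's built-in sqlite3 module, and the table names, columns and sample rows are all hypothetical. The designer's role is played by the view definition, and the end user's tool is played by the final query.

```python
import sqlite3


def build_blended_view(conn):
    """Create two hypothetical sources and a view that blends them."""
    conn.executescript(
        """
        -- Operational source (e.g. CRM) and online source (e.g. web activity).
        CREATE TABLE crm_customers (
            customer_id INTEGER PRIMARY KEY,
            name TEXT,
            segment TEXT
        );
        CREATE TABLE web_events (customer_id INTEGER, page TEXT);

        INSERT INTO crm_customers VALUES
            (1, 'Acme', 'enterprise'),
            (2, 'Birch', 'smb');
        INSERT INTO web_events VALUES
            (1, '/pricing'),
            (1, '/docs'),
            (2, '/pricing');

        -- The blend is defined once, by the designer.
        CREATE VIEW customer_360 AS
            SELECT c.customer_id, c.name, c.segment,
                   COUNT(e.page) AS page_views
            FROM crm_customers AS c
            LEFT JOIN web_events AS e ON e.customer_id = c.customer_id
            GROUP BY c.customer_id, c.name, c.segment;
        """
    )


conn = sqlite3.connect(":memory:")
build_blended_view(conn)

# A reporting or dashboard tool only needs to issue ordinary SQL.
rows = conn.execute(
    "SELECT name, page_views FROM customer_360 ORDER BY name"
).fetchall()
```

The consumer never sees how the sources were joined, which is the property the paragraph above is describing: the blend is designed once and then queried with familiar SQL-speaking tools.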

Simplified Platform for OEMs and Embedders

Finally, I’d like to highlight how this release further solidifies the Pentaho suite as the best platform for enterprises and OEMs who want to enrich their applications with better data processing or business analytics.  Pentaho 5 delivers a more customizable User Console providing developers with complete control over the menu bar and toolbar, improvements to the underlying theming engine and an all new plugin layer for adding custom perspectives.  Furthermore, we’ve dramatically simplified our service architecture by introducing a brand new REST-based API along with a rich library of integration samples and documentation to get you started.

These enhancements are just a few of the many great improvements in Pentaho 5. If you want a more in-depth overview and demonstration, register for the Pentaho 5.0 webinar on September 24th, with two sessions to choose from: North America/LATAM and EMEA. You can also access great resources, from videos to solution briefs, at Pentaho.com/5.0.

Jake Cornelius

SVP Products

