Cloudera Stamp of Approval

April 3, 2014

Yesterday, Cloudera announced the general availability of Cloudera 5 (C5), the latest generation of Cloudera’s unified data platform for the enterprise data hub. Pentaho engineers have been working on certification since the beta was available in early February to make sure we are certified on day one of the GA.

Cloudera and Pentaho have a long-standing strategic relationship, with tested joint technologies that have been deployed time and time again. By using Cloudera Certified products, enterprises significantly reduce risk while taking advantage of the world’s most complete, tested and popular platform powered by Apache Hadoop. This stamp of approval should put customers at ease when deploying C5, knowing that Pentaho and Cloudera have worked together to ensure the highest level of capabilities and compatibility.

Learn more about Pentaho and Cloudera’s joint solution benefits, access the download, resources and recordings.

Paul Vasquez
Senior Product Manager, Technology Partners
Pentaho
@BigDataPaul


Analyze 10 years of Chicago Crime with Pentaho, Cloudera Search and Impala

December 23, 2013

Hadoop is a complex technology stack and many people getting started with Hadoop spend an inordinate amount of time focusing on operational aspects – getting the cluster up and running, obtaining foundational training, and ingesting data. Consequently it can be difficult to get a good picture of the true value that Hadoop provides, namely unlocking insight across multiple data streams that add valuable context to the transactional history comprising most of the core data in the enterprise.

At Strata Hadoop World in October, Pentaho’s Lord of 1’s and 0’s or CTO, James Dixon, unveiled a powerful demonstration of the true value that Hadoop – combined with enabling technology from Pentaho and our partner Cloudera – can provide. He took a publicly available data set provided by the City of Chicago and built a demo around it that enables nontechnical end-users to understand how crime patterns have changed over time in Chicago, unlocking insight into the type of crimes being committed in different areas of the city – not only historically but also broken down by time of day and day of week. As a result, citizenry as well as law enforcement have a much better sense of what to expect on the streets of Chicago from the insight the demonstration provides.

In the demo, end-users start with a dashboard that provides a high-level understanding of the mix of crimes historically committed on the streets of Chicago over the last ten years. Watch the demo here:

This kind of top-to-bottom understanding of (in this case) crime patterns is uniquely enabled by the capability Pentaho delivers to the market, combining dashboarding, analytics and data integration into one easily-embedded platform that leverages blending across multiple data sets.

The deep understanding that Pentaho’s solution delivers to end-users is enabled by two key technologies from Cloudera: Cloudera Search and Impala. The original data set provided by the City of Chicago was loaded into a Cloudera Hadoop cluster using Pentaho’s data integration tool, Pentaho Data Integration (“PDI”). End-user drilldown is powered by Cloudera Search, which executes a faceted search on behalf of Pentaho’s dashboard. Once an area of interest has been located, Cloudera’s Impala runs low-latency SQL queries against the raw data stored in the Hadoop cluster to bring up individual crime records.
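The two-step flow described above (faceted search to narrow down, then SQL to fetch individual records) can be sketched roughly as follows. This is only an illustration, not the demo's actual code: the post does not publish the schema, so the collection URL, the field names `primary_type` and `district`, and the table name are all hypothetical. It assumes the standard Solr query parameters `facet` and `facet.field`, which apply because Cloudera Search is built on Apache Solr. The sketch only builds the request and the query text; it does not contact a cluster.

```python
from urllib.parse import urlencode

# NOTE: collection URL, field names and table name are hypothetical;
# the original post does not describe the demo's schema.

def build_facet_query(collection_url, text="*:*",
                      facet_fields=("primary_type", "district")):
    """Build a Solr-style faceted search URL that asks for counts only."""
    params = [
        ("q", text),        # match-all query by default
        ("rows", 0),        # no documents, only facet counts
        ("wt", "json"),     # JSON response format
        ("facet", "true"),  # turn faceting on
    ]
    # One facet.field parameter per field to count by
    params += [("facet.field", field) for field in facet_fields]
    return f"{collection_url}/select?{urlencode(params)}"


def build_drilldown_sql(table, primary_type, district, limit=100):
    """Return SQL text plus parameters for fetching individual records."""
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    sql = (
        f"SELECT * FROM {table} "
        "WHERE primary_type = %(primary_type)s "
        "AND district = %(district)s "
        "LIMIT %(limit)s"
    )
    params = {"primary_type": primary_type, "district": district, "limit": limit}
    return sql, params


if __name__ == "__main__":
    print(build_facet_query("http://localhost:8983/solr/crimes"))
    print(build_drilldown_sql("crimes", "THEFT", "011"))
```

The facet counts would drive the dashboard's drilldown choices, and the values the user picks would become the parameters of the SQL statement handed to whatever Impala client is in use.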

Although Hadoop is often perceived as a geek’s playground, the power of Pentaho’s business-friendly interface is readily apparent when engaging this demo. Unlocking the power of Hadoop can be as simple as engaging Pentaho’s integrated approach to analytics together with Cloudera’s foundational platform to deliver an integrated solution whose value is apparent to nontechnical executives wondering whether Hadoop is the right choice for a key initiative.

Rob Rosen
Field Big Data Lead
Pentaho


Highlights From Splunk .conf2013 – Machine Data Meets Big Business

October 4, 2013

Eddie White, EVP Business Development, Pentaho

This week, Pentaho was on site for Splunk .conf2013 in Las Vegas and the show was buzzing with excitement. Organizations big and small shared a range of new innovations leveraging machine data.

Eddie White, executive VP of business development at Pentaho, shares his first-hand impressions and insights on the biggest news and trends coming out of .conf2013.

Q: Eddie, what are your impressions of this year’s Splunk conference?

There’s a different feel at the show this year: bigger companies and more business users attended. What traditionally has been more of an “IT Show” has evolved to showcase real business use cases, success stories and post-deployment analysis. It’s apparent that machine data has turned a corner. The industry is moving well beyond simply logging machine data. Users integrate, analyze and leverage their vast resource of device data for business intelligence and competitive advantage.

For example, on the first day ADP shared how they leverage big data for real-time insights. Yahoo! shared details on a deployment of Splunk Enterprise at multi-terabyte scale that is helping to better monitor and manage website properties. Intuit spoke on leveraging Splunk for diagnostics, testing, performance tuning and more. And on the second day, StubHub, Harvard University, Credit Suisse, Sears and Wipro were all featuring compelling uses for Splunk.

What was most exciting to me was the 50+ end users I spoke with who wanted to learn how Pentaho blends data with and in Splunk. Our booth traffic was steady and heavy. Pentaho’s enhanced visualization and reporting demos were a hit not only with the IT attendees, but also with the business users who are searching for ways to harness the power of their Splunk data for deeper insights.

Q: Does attendance indicate a bigger/growing appetite for analysis of machine data?

Splunk is helping to uncover new information and insights by tapping into the myriad data types it can support as a data platform. It’s clearly making an impact in the enterprise. Yet as all these organizations increasingly turn to Splunk to collect, index and harness their machine-generated big data, there is tremendous opportunity for them to turn to Pentaho, a Splunk Powered Technology Partner, to tap and combine Splunk data with any other data source for deeper insights.

Q: How is the market developing for machine data analytics?

We are seeing the market here change from being driven by the technologists, to being driven by the business user.  The technology has advanced and now has the scale, the flexibility and the models to make real business impacts for the enterprise.  The use cases are clearly defined now and the technology fits the customer needs.  The level of collaboration between the major players like Pentaho, Splunk and Hadoop vendors now presents CIOs with real value.

Q: You were invited this year to speak on a CXO Panel addressing Big Data challenges and opportunities. What were some of the highlights?

The CXO panel was fantastic. It was quite an honor to present and be on a panel with four founders and “rock stars” in Big Data: Matt Pfeil (DataStax), M.C. Srivas (MapR), Ari Zilka (Hortonworks) and Amr Awadallah (Cloudera).

Over a panel session that ran for 90 minutes, we tackled subjects on big data challenges. We heard that Splunk users are dealing with quite a few of the same questions and challenges.

Business users and IT professionals just getting started are struggling with which project to pick first and what the first steps should be. My advice is to pick a real business use case and push us vendors to do a proof-of-concept with you and your team, and to show quantifiable results in 30 days.

We also heard a lot of questions about which vendor has the right answer to their individual use scenarios and challenges. It was great to see all of the panelists on the same page in their response. No one vendor has all the answers. As I mentioned on the panel, if any Big Data player tells you they can solve all your Big Data problems, you should disqualify them! Users need Splunk, they need Pentaho and they need Hadoop.

Q: Taking a high level view of the conference, what trends can you identify?

There were two major trends taking center stage. Business people were asking business questions, and almost everyone was looking to map adoption to real business use cases. And again, there’s a clear awareness that no one vendor can answer all of their questions. They are all looking at how best to assemble Hadoop along with Pentaho, and at how to extend their use of Splunk with those technologies.

Q: Pentaho and Splunk are demonstrating the new Pentaho Business Analytics and Splunk Enterprise offering, providing a first look to conference attendees. What kind of reaction are you getting from the demos?

The reaction from the audiences was tremendous. We had two sets of reactions. The end user customers took the time to go in-depth with technology demos and asked questions such as where Splunk ends and where Pentaho begins. The demo we showed drew the business user in too. It was a very powerful visualization of how we can enable a Splunk enterprise to solve business problems.

The Splunk sales teams who visited the booth and saw the demo were able to clearly discuss how to position a total solution for their customer.

Learn more about Splunk and Pentaho.

 


Pentaho and Cloudera Impala in 5 words

April 29, 2013

Today our big data partner, Cloudera, joined us in continuing to deliver innovative, open technologies that bring real business value to customers. Pentaho and Cloudera share a common history and approach to simplifying complex but powerful technologies to integrate and analyze big data. Our common open source heritage means that we can innovate at the speed of our customers’ businesses.

What is Cloudera’s latest innovation? Cloudera Impala powers Cloudera Enterprise RTQ (Real-time Query), the first data management solution that takes Hadoop beyond batch to enable real-time data processing and analysis on any type of data (unstructured and structured) within a centralized, massively scalable system. Impala dramatically improves the economics and performance of large-scale enterprise data management.

Pentaho and Cloudera Impala in 5 words = Affordable scalability meets fast analytics. Cloudera Impala enables any product that is JDBC-enabled to get fast results from Hadoop, making Hadoop an ideal component for a data warehouse strategy. Customers no longer have to pay for expensive proprietary DBMS or analytical DBs to house their entire data warehouse.

Cloudera’s innovation makes it even easier for customers to use common analytic tools that can access and analyze data in all of these formats. What does this really mean? It means you don’t have to buy expensive, proprietary products that can’t work across all of your data platforms.

With Pentaho and Cloudera, you can analyze large volumes of disparate data significantly faster with Impala than with Hive. Take a look at how Cloudera Impala is driving a major evolutionary step in the growth of the company’s Platform for Big Data, Cloudera Enterprise, and the Apache Hadoop ecosystem as a whole.

Richard Daley


Impala – A New Era for BI on Hadoop

November 30, 2012

With the recent announcement of Impala, also known as Cloudera Enterprise RTQ (Real Time Query), I expect the interest in and adoption of Hadoop to go from merely intense to crazy. We applaud Cloudera’s investment in creating Impala, as it is a huge step forward in making Hadoop accessible using existing BI tools.

What is Impala?  Simply put, it enables all of the SQL-based BI and business analytics tools that have been built over the past couple of decades to now work directly on top of Hadoop, providing interactive response times not previously attainable with Hadoop, and many times faster than Hive, the existing SQL-like alternative. And Impala provides pretty complete SQL support, including join and aggregate functions – must-have functions for analytics.
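To make "join and aggregate functions" concrete, here is a minimal sketch of the kind of query a BI tool typically generates. The tables, columns and data are hypothetical. Python's built-in `sqlite3` module is used purely as a local stand-in so the example runs without a cluster; it is not Impala, and the sketch shows only the shape of the SQL, not Impala's performance or its client API.

```python
import sqlite3

# sqlite3 is a local stand-in so the query can run anywhere; it is NOT Impala.
# Table names, column names and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER, region TEXT);
CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'East'), (2, 'West'), (3, 'East');
INSERT INTO orders VALUES (10, 1, 50.0), (11, 2, 20.0), (12, 3, 30.0), (13, 2, 5.0);
""")

# A join plus aggregates: order count and revenue per region.
QUERY = """
SELECT c.region,
       COUNT(*)      AS order_count,
       SUM(o.amount) AS total_amount
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
GROUP BY c.region
ORDER BY c.region
"""

rows = conn.execute(QUERY).fetchall()
for region, order_count, total_amount in rows:
    print(region, order_count, total_amount)
```

A statement of this shape, a join across two tables followed by `GROUP BY` with `COUNT` and `SUM`, is the kind of must-have analytic function the paragraph above refers to.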

For enterprises this analytic query speed and expressiveness is huge – it means they are now much less likely to need to extract data out of Hadoop and load it into a data mart or warehouse for interactive visualization.  Instead they can use their favorite business analytics tool directly against Hadoop. But of course only Pentaho provides the integrated end-to-end data integration and business analytics capability for both ingesting and processing data inside of Hadoop, as well as interactively visualizing and analyzing Hadoop data.

Over the past few months Cloudera and Pentaho have been partnering closely at all levels including marketing, sales and engineering.  We are proud of the role we played in assisting Cloudera with validating and testing Impala against realistic BI workloads and use cases.  Based on the extremely strong interest we’ve seen, as evidenced by the lines at our booth at the recent Strata big data conference in New York City, the combination of Pentaho’s visual development and interactive visualization for Hadoop with the break-through performance of Cloudera Impala is very compelling for a huge number of enterprises.

- Ian Fyfe, Chief Technology Evangelist, Pentaho



Pentaho Joins Dell Partner Program for Big Data

June 13, 2012

The opportunity that big data analytics provides for organizations to innovate quickly, predict events and improve customer relationships is endless. Our own customers including Shareable Ink, Travian Games and TravelTainment are using big data analytics today to analyze clinical data, innovate computer games and design targeted promotional campaigns – just to name a few.

Two key themes in our vision for the future of analytics are to integrate with the leading technology partners in the big data analytics ecosystem and to enable cloud-ready applications. In light of this, I am incredibly pleased to announce that today Pentaho is part of Dell’s Emerging Solutions Ecosystem, a new partnership program announced last April with the aim of focusing on cloud and big data enablement.

Pentaho’s big data analytics software will be offered with the Dell Apache Hadoop Solution, which brings together Cloudera’s distribution of Apache Hadoop, Dell’s hardware reference architecture and Dell’s Crowbar software, which automates and accelerates the deployment, configuration and ongoing operation of your cloud or cluster environment. The combined solution will also be offered with joint services and support.

What does this mean? I’ve had many conversations this year with customers who’ve told us they need to ‘operationalize’ Hadoop and that’s exactly what this partnership is about.  In a nutshell, this partnership makes it faster and easier for organizations to gain total insight from all their data through a single appliance that combines Hadoop integration, data integration and out-of-the-box analytics.

Why did Dell select Pentaho to be part of the Dell Apache Hadoop Solution? A major factor is that we were one of the first movers in the space, announcing support for big data in May 2010, and we have many customers doing real work with big data. We’ve also been working with Cloudera since October 2010 and have a longstanding strategic relationship with them, which includes technology integration.

The partnership means that Dell’s dedicated big data sales team will now be reselling Pentaho directly into existing and new Dell accounts, initially in North America and in other parts of the world very soon. The Dell partnership also further commercializes Pentaho’s relationship with Cloudera by providing a vehicle to bring a big data offering to the market.

We look forward to helping you achieve your business goals through this important new partnership. For more information about Pentaho and Dell’s Emerging Solutions Ecosystem please visit pentaho.com/big-data/dell/.

Quentin

