Is Sage Kotsenburg a Big Data Analyst?

February 13, 2014

I was pleased to be part of a panel discussion Tuesday called “Getting There from Here: Moving Data Science into the Boardroom,” moderated by Alistair Croll with colleagues Chris Selland of HP Vertica and Scott Chastain of SAS Institute.  This panel was part of O’Reilly’s Strata Conference Data-Driven Business Day.

It’s amazing to see how the Strata Conference has changed over the last couple of years. With big data becoming truly operational across many different industries, companies are now looking at how to manage, blend and analyze data to make profitable business decisions and fly past competitors.

One interesting topic we debated was whether being able to process, analyze and predict outcomes with all this data might drive conformity and ‘dull’ innovation. Will we become so reliant on predictive algorithms and models that we lose confidence in our intuition and fail to apply enough business judgment in our decision-making?

In preparing for the panel and reflecting on these and other burning questions, I was inspired by the athletes competing in the Sochi Winter Olympics. Sage Kotsenburg, who won the Gold medal in ‘Slopestyle’ snowboarding, managed to blend creativity and style with calculated risk-taking. He had the same data as everyone else in the competition: the course terrain and conditions, the strengths and weaknesses of his competitors, historical trends on how the judges rate different types of athletic feats and other information. Armed with this data and his ability to (literally!) analyze it ‘on the fly,’ he then layered on his intuition to bust some totally unexpected moves, such as the ‘Holy Crail’, that blew the judges away! This demonstrates that being analytical doesn’t come at the expense of using intuition – the two actually strengthen each other. This New York Times pictorial is a great explanation of Sage’s blend of creativity, style and calculated risk-taking.

Let’s explore further how this relates to business decision-making, data science and big data analytics.  Consider how the best business and operational managers make decisions:

  1. They assess the competitive landscape – which are the other key players? What are their strengths and weaknesses (innovation, execution, service)? How have they historically gained customers and grown their businesses? What might they do in the future based on their historical moves?
  2. They assess what customers will want to reward/buy (the criteria the ‘judges’ will use to assess performance).  Some of that seems obvious based on what criteria they specified, but often what a customer truly wants is more subtle and personal. Some want to be inspired, while others want to remove risk. Some would like to be a hero to their end users, while others value having a ‘cool-factor’ in their product.
  3. They take stock of their company’s strengths and weaknesses to determine how to best meet customers’ needs. Every company has its own ‘DNA’ or approach, just like every Olympic athlete.  Are you great at execution (so go for technical perfection)? Flexible and creative (so showcase the art of the possible)?
  4. They apply business judgment to determine how to take advantage of the opportunities they see – often needing to make a change ‘on the fly’ based on new data…just like Sage decided to create a new move on the fly to change the rules of the game during that particular meet.
  5. Lastly, they have to execute well enough on the strategy they have picked – choosing the right time, right place and right moves.  Sage’s execution wasn’t perfect, but combined with his creativity, energy and style, his overall performance was solid gold.

What’s the implication for those of us whose competitive sports take place in the world of business?  We need to make sure we put consumable, near real-time information into the hands of the operational managers who can apply business judgment, decide on courses of action, and ultimately own the outcome of decisions.  We may not always feel the exhilaration of the wind in our face, jumping at breakneck speeds down steep cliffs, but we can combine analytics with our best business judgment to go for the gold!

Rosanne Saccone
CMO
Pentaho


Announcing Pentaho with Storm and YARN

February 11, 2014

One of Pentaho’s core beliefs is that you can’t prepare for tomorrow with yesterday’s tools. In June of 2013, amidst waves of emerging big data technologies, Pentaho established Pentaho Labs to drive innovation through the incubation of these new technologies. Today, one of our Labs projects hatches.  At the Strata Conference in Santa Clara, we announced native integration of Pentaho Data Integration (PDI) with Storm and YARN. This integration enables developers to process big data and drive analytics in real-time, so businesses can make critical decisions on time-sensitive information.

Read the announcement here.

Here is what people are saying about Pentaho with Storm and YARN:

Pentaho Customer
Bryan Stone, Cloud Platform Lead, Synapse Wireless: “As an M2M leader in the Internet of Everything, our wireless solutions require innovative technology to bring big data insights to business users. The powerful combination of Pentaho Data Integration, Storm and YARN will allow my team to immediately leverage real-time processing, without the delay of batch processing or the overhead of designing additional transformations. No doubt this advancement will have a big impact on the next generation of big data analytics.”

Leading Big Data Industry Analyst
Matt Aslett, Research Director, Data Management and Analytics, 451 Research: “YARN is enabling Hadoop to be used as a flexible multi-purpose data processing and analytics platform. We are seeing growing interest in Hadoop not just as a platform for batch-based MapReduce but also rapid data ingestion and analysis, especially using Apache Storm. Native support of YARN and Storm from companies like Pentaho will encourage users to innovate and drive greater value from Hadoop.”

Pentaho founder and Pentaho Labs Leader
Richard Daley, Founder and Chief Strategy Officer, Pentaho: “Our customers are facing fast technology iterations from the relentless evolution of the big data ecosystem. With Pentaho’s Adaptive Big Data Layer and Big Data Analytical Platform our customers are “future proofed” from the rapid pace of evolution in the big data environment. In 2014, we’re leading the way in big data analytics with Storm, YARN, Spark and predictive, and making it easy for customers to leverage these innovations.”

Learn more about the innovation of Pentaho Data Integration for Storm on YARN in Pentaho Labs at pentaho.com/storm

If you are at the O’Reilly Strata Conference in Santa Clara this week (February 11-13), make sure to stop by booth 710 to see a live demo of Pentaho Data Integration with Storm and YARN. The Pentaho team of technologists, data scientists and executives will be on hand to share the latest big data innovations from Pentaho Labs.

Donna Prlich
Senior Director, Product Marketing
Pentaho


edo optimizes data warehouse, increases loyalty and targets new customers

February 10, 2014

What do you do when you need to track, store, blend, and analyze over 6 billion financial data transactions, with daily growth in the millions? edo Interactive, Inc. is a digital advertising company that leverages payment networks to connect brands with consumers. Their legacy data integration and analysis system took more than 27 hours to run, meaning that meeting daily Service Level Agreements was nearly impossible. However, only a few weeks after implementing a data distribution on Hadoop, with Pentaho for data integration, edo Interactive was able to reduce its processing time to less than 8 hours, often as little as 2 hours.

Minimum time savings of 70% quickly precipitated cost savings. With an optimized data warehouse, edo and its clients also spend less time navigating IT barriers. Pentaho’s graphical user interface removes cumbersome coding of batch process jobs, enabling sophisticated yet simplified conversion of data from PostgreSQL to Hadoop, Hive and HBase. edo and its clients quickly gain insights into customer preferences, refine marketing strategies and provide their customers with improved experience and satisfaction.
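The 70% figure can be checked against the run times quoted above. The snippet below is only a quick sanity check, assuming the legacy run is taken as exactly 27 hours and the new run as 8 hours (worst case) or 2 hours (best case); the function name is purely illustrative.

```python
def percent_time_saved(old_hours: float, new_hours: float) -> float:
    """Return the reduction in run time as a percentage of the old run time."""
    return (old_hours - new_hours) / old_hours * 100


# Worst case quoted in the post: 27 hours down to 8 hours.
worst_case = percent_time_saved(27, 8)
# Best case quoted in the post: 27 hours down to 2 hours.
best_case = percent_time_saved(27, 2)

print(f"worst case: {worst_case:.1f}%")
print(f"best case: {best_case:.1f}%")
```

Because the legacy system took more than 27 hours and the new one takes less than 8, the worst-case result is a floor rather than an exact value, which matches the "minimum" wording.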

edo Interactive successfully navigated many of the obstacles faced when implementing a big data environment and created a lasting and scalable solution. Their vision to provide end-users a better view of their customers has helped shape a new data architecture and embedded analytics capabilities.

To learn more about edo’s Big Data vision and success, read their customer success overview and case study on Pentaho.com. We are excited to announce that Tim Garnto, SVP of Product Engineering at edo, will share his story live when he presents at O’Reilly + Strata on Thursday, February 13th in Santa Clara (11:30AM, Ballroom G).

Strata Santa Clara is already sold out! If you are interested in learning more about edo’s Big Data deployment, leave your questions in the comments section below and we will ask Tim during his speaking session at Strata.

Ben Mayer
Customer Marketing
Pentaho


Bring your Big Data to Life With Pentaho at Strata Santa Clara

January 30, 2014

If you are like most Enterprise IT decision makers, there’s a 50/50 chance you are already knee deep in Big Data or on a path to figuring out how to get started. One of the “must attend” conferences for anyone involved in Big Data is the O’Reilly Strata Conference (Santa Clara, February 11-13, 2014).

Join Us!

Pentaho is excited to return as a sponsor this year and we have a number of ways you can learn more about getting the most out of your Big Data initiatives.

The Pentaho team of executives, technologists and data scientists will be on hand to share the latest big data innovations from Pentaho Labs, such as integration with Apache Hadoop YARN and Storm. Come get answers to all of your big data integration and analytics questions. Let us help you bring your Big Data to life!

Below is a list of all activities for Pentaho in and around the conference. Register with code Pentaho20 and receive 20% off registration.

Exhibit booth

You will find the Pentaho team in the Sponsor Pavilion at booth 710 (located near the O’Reilly Media booth). Learn all about how Pentaho can help bring your Big Data to life! Don’t forget to get your Pentaho t-shirt and enter for the chance to win a GoPro camera.

Meetups

Big Data Science Meet-up at Strata Conference

  • Monday, 2/10 at 5:30-9:30 in Ballroom E
  • Nick Gonzalez, Data Scientist at Pentaho will speak about Real World Big Data Prescriptive Analytics
  • Today’s large and convoluted data landscape coupled with the abundance of available computing resources presents unique opportunities for data scientists around the world. To remain competitive in this landscape, we must go beyond generating predictions to generating solutions from big data that are driven by actions derived from data driven predictions. And we have to do this as fast as possible.  This is the real world of big data prescriptive analytics. This talk will address each one of these challenges and present technical solutions and algorithms to address them.  By the end of this presentation each individual solution will come together in a symphony of code and hardware to form a unified automated process that is the backbone of a successful big data prescriptive analytics solution.

Breakout Sessions

Getting There from Here: Moving Data Science into the Boardroom

  • Rosanne Saccone (Pentaho), Scott Chastain (SAS), Chris Selland (HP Vertica)
  • Tuesday, 2/11 at 11:15 on the Data Driven Business Track, Ballroom CD
  • Pundits and analysts agree: the data-driven enterprise is here to stay. But how will companies balance analysis with action? Will optimization of the current model leave firms more vulnerable than ever to disruption by what’s new and unpredictable? And how do we balance legacy investments in data warehousing and business intelligence with emerging technologies for massive, real-time data processing? Join Scott Chastain, Rosanne Saccone, Chris Selland, and Strata Chair Alistair Croll for a look at the practical concerns facing tomorrow’s data-driven business.

Lessons from the Trenches: edo Interactive Leverages Hadoop to Build Customer Loyalty

  • Thursday, 2/13 at 11:30am, Ballroom G
  • Tim Garnto (edo) & Rob Rosen (Pentaho)
  • Hadoop presents as an enabling technology to better understand customer preferences and behaviors, but organizations often struggle with time-consuming data preparation and analytics processes. edo Interactive – a leader in providing card-linked offers to financial services and retailers – shares how they drive agile, improved decision-making by complementing native Hadoop technologies with analytical databases and ETL optimization and data visualization solutions from vendors such as Pentaho.

We hope to see you soon at Strata in Santa Clara. If you would prefer a private meeting with Pentaho at the conference, send us a message via our contacts page or direct message us on Twitter @Pentaho.


Top 10 Pentaho Stories of 2013

December 31, 2013

2013 has been an exciting year with new products, partnerships and overall company momentum. Before the clock strikes midnight on December 31, and we welcome 2014, we’d like to look back at some of the top stories of 2013. Here are the top 10 most popular news releases of 2013:

10. Pentaho Named a Red Herring Global 100 Winner
Big data integration and analytics leader listed among world’s most promising technology companies (Wednesday, November 27, 2013)

9. Revamped Global Pentaho Partner Programme Rewards Commitment and Lowers Barriers to Entry
Now even easier for channel players to profit from the big data revolution (Wednesday, September 4, 2013)

8. Pentaho Brings Big Data Analytics to Intel® Distribution for Apache Hadoop Software
End-to-end solution combines the Intel Distribution for Apache Hadoop Software with Pentaho’s full range of enterprise data integration and analytics software (Tuesday, February 26, 2013)

7. Rackspace brings ETL to the Cloud with Pentaho
A Hadoop Summit Q&A (Thursday, June 27, 2013)

6. Pentaho Instaview Templates Broaden Big Data Access and Analysis
Big Data delivery is simplified for IT to easily customize and create new templates; Data Analysts empowered to choose, prepare and analyze big data sources in three easy steps (Tuesday, February 26, 2013)

5. Pentaho Announces New Offering for MongoDB-based Business Intelligence
Expanded native integration provides enterprise analytics for MongoDB (Thursday, September 12, 2013)

4. Pentaho and Splunk Partner to Unlock Big Data Value from Machine Generated Data
Extends Splunk Machine Data Insights to Business Users with Pentaho’s Big Data Visualization and Data Integration (Wednesday, August 7, 2013)

3. Pentaho Furthers Innovation in Big Data Integration and Launches Pentaho Labs
Helps companies thrive in the face of relentless big data change (Wednesday, June 26, 2013)

2. Pentaho Acquires Dashboard and UI Specialist Partner Webdetails
Portugal-based consultancy provides visual development expertise, consulting services and a new community leader (Monday, April 22, 2013)

1.  Pentaho Gears Up Analytics Platform for the Future of Big Data
Pentaho Business Analytics 5.0 greatly simplifies the entire analytics experience for everyone and delivers the industry’s first just in time big data blending ‘at the source’ (Thursday, September 12, 2013)

What was your favorite story of 2013? We want to know – you can respond in the comments section below or on Twitter using the hashtag #Pentaho.

Wishing you a very Happy New Year!


Analyze 10 years of Chicago Crime with Pentaho, Cloudera Search and Impala

December 23, 2013

Hadoop is a complex technology stack and many people getting started with Hadoop spend an inordinate amount of time focusing on operational aspects – getting the cluster up and running, obtaining foundational training, and ingesting data. Consequently it can be difficult to get a good picture of the true value that Hadoop provides, namely unlocking insight across multiple data streams that add valuable context to the transactional history comprising most of the core data in the enterprise.

At Strata Hadoop World in October, Pentaho’s Lord of 1’s and 0’s or CTO, James Dixon, unveiled a powerful demonstration of the true value that Hadoop – combined with enabling technology from Pentaho and our partner Cloudera – can provide. He took a publicly available data set provided by the City of Chicago and built a demo around it that enables nontechnical end-users to understand how crime patterns have changed over time in Chicago, unlocking insight into the type of crimes being committed in different areas of the city – not only historically but also broken down by time of day and day of week. As a result, citizenry as well as law enforcement have a much better sense of what to expect on the streets of Chicago from the insight the demonstration provides.

In the demo, end-users start with a dashboard that provides a high-level understanding of the mix of crimes historically committed on the streets of Chicago over the last ten years. Watch the demo here:

This kind of top-to-bottom understanding of (in this case) crime patterns is uniquely enabled by the capability Pentaho delivers to the market, combining dashboarding, analytics and data integration into one easily-embedded platform that leverages blending across multiple data sets.

The deep understanding that Pentaho’s solution delivers to end-users is enabled by two key technologies from Cloudera: Cloudera Search and Impala. The original data set provided by the City of Chicago was loaded into a Cloudera Hadoop cluster using Pentaho’s data integration tool, Pentaho Data Integration (“PDI”). End-user drilldown is powered by Cloudera Search, which executes a faceted search on behalf of Pentaho’s dashboard. Once an area of interest has been located, Cloudera’s Impala runs low-latency SQL queries against the raw data stored in the Hadoop cluster to bring up individual crime records.
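To make the two-step pattern concrete (facet counts first, then a SQL query for the underlying records), here is a minimal sketch in plain Python. It is not the demo's code: an in-memory SQLite database stands in for both Cloudera Search and Impala, and the table name, column names and sample rows are all invented for illustration.

```python
import sqlite3

# Tiny invented sample; the real demo used the City of Chicago data set.
ROWS = [
    ("THEFT", "Loop", "Fri", 22),
    ("THEFT", "Loop", "Sat", 23),
    ("BATTERY", "Austin", "Sat", 1),
    ("THEFT", "Austin", "Mon", 14),
    ("BURGLARY", "Loop", "Tue", 3),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE crimes (crime_type TEXT, area TEXT, day_of_week TEXT, hour INTEGER)"
)
conn.executemany("INSERT INTO crimes VALUES (?, ?, ?, ?)", ROWS)


def facet_counts(column: str) -> dict:
    """Count records per distinct value of one column (a facet)."""
    allowed = {"crime_type", "area", "day_of_week"}
    if column not in allowed:
        raise ValueError(f"unsupported facet: {column}")
    cursor = conn.execute(
        f"SELECT {column}, COUNT(*) FROM crimes GROUP BY {column}"
    )
    return dict(cursor.fetchall())


def drill_down(crime_type: str, area: str) -> list:
    """Fetch the individual records behind one facet combination."""
    cursor = conn.execute(
        "SELECT crime_type, area, day_of_week, hour FROM crimes "
        "WHERE crime_type = ? AND area = ? ORDER BY hour",
        (crime_type, area),
    )
    return cursor.fetchall()


by_type = facet_counts("crime_type")      # what a dashboard would chart
loop_thefts = drill_down("THEFT", "Loop")  # records behind one selection
print(by_type)
print(loop_thefts)
```

In the demo as described, a search engine serves the facet counts and a SQL engine serves the record lookup, so each step can use the tool best suited to it; the sketch collapses both into one database only to stay self-contained.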

Although Hadoop is often perceived as a geek’s playground, the power of Pentaho’s business-friendly interface is readily apparent when engaging this demo. Unlocking the power of Hadoop can be as simple as engaging Pentaho’s integrated approach to analytics together with Cloudera’s foundational platform to deliver an integrated solution whose value is apparent to nontechnical executives wondering whether Hadoop is the right choice for a key initiative.

Rob Rosen
Field Big Data Lead
Pentaho


Big Data, Big Revenue for Marketers

December 12, 2013

Why might Big Data mean millions for marketing?  Because it has the potential to create a more complete picture of the buyer, thereby empowering marketers to more effectively deliver the right message to the right individual at the right time – and ultimately increase sales.  In the following brief video from DMA 2013, Marketo VP/Co-founder Jon Miller and Pentaho CMO Rosanne Saccone provide a crash course on what Big Data means for marketers.  It covers:

  • The defining characteristics of Big Data – Velocity, Variety, & Volume
  • How marketers can leverage Big Data to blend operational information (CRM, ERP) and online data (web activity, social networking interactions) for new insights
  • Sample Big Data use cases that organizations are green-lighting today to optimize customer interactions and drive marketing’s contribution to revenue

Note that this is an excerpt from a larger presentation – for the full video please click here.

We’d also recommend this blog post by Jon Miller for more context on Big Data in marketing.

For additional compelling use cases that leverage Big Data for marketing and other functions, see here.

Ben Hopkins
Product Marketing
Pentaho


Weka goes BIG

December 4, 2013

The beakers are bubbling more violently than usual at Pentaho Labs and this time predictive analytics is the focus.  The lab coat, pocket-protector and taped glasses clad scientists have turned their attention to the Weka machine learning software.

Weka, a collection of machine learning algorithms for predictive analytics and data mining, has a number of useful applications. Examples include scoring credit risk, predicting downtime of machines and analyzing sentiment in social feeds.  The technology can be used to facilitate automatic knowledge discovery by uncovering hidden patterns in complex datasets, or to develop accurate predictive models for forecasting.

Organizations have been building predictive models to aid decision making for a number of years, but the recent explosion in the volume of data being recorded (aka “Big Data”) provides unique challenges for data mining practitioners. Weka is efficient and fast when running against datasets that fit in main memory, but larger datasets often require sampling before processing. Sampling can be an effective mechanism when samples are representative of the underlying problem, but in some cases the loss of information can negatively impact predictive performance.

To combat information loss, and scale Weka’s wide selection of predictive algorithms to large data sets, the folks at Pentaho Labs developed a framework to run Weka in Hadoop. Now the sort of tasks commonly performed during the development of a predictive solution – such as model construction, tuning, evaluation and scoring – can be carried out on large datasets without resorting to down-sampling the data. Hadoop was targeted as the initial distributed platform for the system, but the Weka framework contains generic map-reduce building blocks that can be used to develop similar functionality in other distributed environments.
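The idea of generic map-reduce building blocks can be illustrated with a small sketch. This is not the Weka framework's API and does not run on Hadoop; it is plain Python with hypothetical names (`PartialStats`, `map_chunk`, `merge`), showing how a statistic can be computed over a full dataset by summarizing each partition independently and then merging the summaries, with no down-sampling.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Iterable, List


@dataclass
class PartialStats:
    """Aggregates that can be merged across data partitions."""
    count: int = 0
    total: float = 0.0
    total_sq: float = 0.0

    @property
    def mean(self) -> float:
        return self.total / self.count


def map_chunk(values: Iterable[float]) -> PartialStats:
    """Map step: summarize one partition on its own."""
    stats = PartialStats()
    for v in values:
        stats.count += 1
        stats.total += v
        stats.total_sq += v * v
    return stats


def merge(a: PartialStats, b: PartialStats) -> PartialStats:
    """Reduce step: combine two partial summaries."""
    return PartialStats(
        a.count + b.count, a.total + b.total, a.total_sq + b.total_sq
    )


# Three partitions standing in for blocks of a large file.
partitions: List[List[float]] = [[1.0, 2.0], [3.0, 4.0, 5.0], [6.0]]
overall = reduce(merge, map(map_chunk, partitions))
print(overall.count, overall.mean)
```

Because `merge` is associative, the partial summaries can be combined in any order, which is what allows the map step to run on separate machines.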

If you’re a predictive solution developer or a data scientist, the new Weka framework is a much faster path to solution development and deployment.  Just think of the questions you can ask at scale!

To learn more technical details about the Weka Hadoop framework, I suggest reading the blog post Weka and Hadoop Part 1 by Mark Hall, Weka core developer at Pentaho.

Also, check out Pentaho Labs to learn more about Predictive Analytics from Pentaho, and to see some of the other cool things the team has brewing.

Chuck Yarbrough
Technical Solutions Marketing


Customer Spotlight: WiMP Music, the 2013 BI Award winner for Innovation

December 2, 2013

In today’s blog post, I want to put the spotlight on our Norwegian partner Conduct and congratulate them for their great work implementing Pentaho for WiMP Music, a music streaming service similar to iTunes that is very popular in Northern and Central Europe. The deployment received the 2013 BI Award for Innovation in Norway, given by the prestigious Norwegian Computer Society on October 31st in Oslo.

Through local editorial teams in each country, WiMP provides daily recommendations, tips and playlists for any occasion for its audiences. The ad-free service is available on computers, mobiles, tablets and network players in Denmark, Germany, Norway, Poland and Sweden. The business is entirely digital and data-driven, with music files licensed from a large number of different sources, delivered by a huge partner network and priced dynamically according to where and how they are sold. WiMP’s BI solution was built by Conduct on Pentaho Business Analytics and has been in production since 2010.

Pentaho has become essential for delivering the music streaming service WiMP provides, because it controls income distribution and settlement. It also provides information that WiMP is contractually obliged to provide to content providers and partners. In addition, the solution provides a huge range of easily accessible management information for decision-making based on facts, not gut feelings.

The Jury stated that this year’s winner of the BI Award for Innovation has:

  • Adopted BI as a core component of its business
  • Been thinking outside the box and dared to challenge traditional IT architecture when establishing its business and its IT portfolio. The solution is flexible and meets the changing challenges of a dynamic market.
  • Used Open Source in its solution design and decided to use the data warehouse as the core of its business systems and business model
  • Created a robust solution that satisfies both audits and controls and creates numbers its partners trust

Congratulations to WiMP for its foresight in extracting the real value of data and building its business model on it. And of course a big congrats to our channel partner Conduct, who built this great deployment on our Pentaho Business Analytics platform.

Pentaho also supports its partners with marketing activities. If you have impressive customer stories like WiMP, our team is on hand to help you promote and celebrate them by writing up case studies and press releases, co-hosting a webinar or completing award applications. If you are a Pentaho partner and want to get the most out of your partnership, please contact me.

Erik Nolten
Director Channel EMEA & APAC


Happy Thanksgiving from Pentaho

November 27, 2013

To all of our readers in the US, we hope you have a Happy Thanksgiving and are enjoying some time off with friends and family.

In the spirit of Thanksgiving we want to say that we are thankful for our amazing community, customers, partners and colleagues. Happy Thanksgiving!

Visit our Facebook page to see photos from our Worksgiving potluck in the San Francisco office. Everyone brought in their famous Thanksgiving recipes such as Thai sesame chicken wings, spring rolls, jalapeno corn bread, lemon cream bars and of course a big turkey with stuffing.

Click here to see more photos 

