Big Data in 2015—Power to the People!

December 16, 2014

Last year I speculated that the big data ‘power curve’ in 2014 would be shaped by business demands for data blending. Customers presenting at our debut PentahoWorld conference last October, from Paytronix to RichRelevance to NASDAQ, certainly proved my speculation true. Businesses like these are examples of how increasingly large and varied data sets can be used to deliver high and sustainable ROI. In fact, Ventana Research recently confirmed that 22 percent of organizations now use upwards of 20 data sources, and 19 percent use between 11 and 20 data sources.[1]

Moving into 2015, and fired up by their initial big data bounties, businesses will seek even more power to explore data freely, structure their own data blends, and gain profitable insights faster. They know “there’s gold in them hills” and they want to mine for even more!

With that said, here are my big data predictions for 2015:

Big Data Meets the Big Blender!

The digital universe is exploding at a rate that even Carl Sagan might struggle to articulate. Analysts believe it’s doubling every year, with the unstructured component doubling every three months. While unstructured data gets all the headlines, IDC estimates that by 2025, 40 percent of the digital universe will be generated by machine data and devices.[2] Yet the ROI business use cases we’ve seen require blending that unstructured data with more traditional relational data.

For example, one of the most common use cases we are helping companies create is a 360-degree view of their customers. The de facto reference architecture involves blending relational/transactional data detailing what the customer has bought with unstructured weblog and clickstream data highlighting customer behavior patterns around what they might buy in the future. This blended data set is further mashed up with social media data describing sentiment around the company’s products and customer demographics. The resulting “Big Blend” is fed into recommendation platforms to drive higher conversion rates, increase sales, and improve customer engagement. This “blended data” approach is fundamental to other popular big data use cases such as the Internet of Things, security and intelligence applications, supply chain management, and regulatory and compliance demands in the financial services, healthcare and telco industries.
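To make the reference architecture concrete, here is a minimal toy sketch of that kind of blend in pandas. This is illustrative only, not Pentaho's actual pipeline, and every table, column name, and value is hypothetical:

```python
# Illustrative customer-360 blend: relational purchase records joined
# with clickstream behavior and social sentiment on a shared key.
import pandas as pd

purchases = pd.DataFrame({          # relational/transactional source
    "customer_id": [1, 2],
    "lifetime_spend": [250.0, 90.0],
})
clickstream = pd.DataFrame({        # unstructured weblog source, pre-parsed
    "customer_id": [1, 1, 2],
    "page": ["shoes", "shoes", "hats"],
})
sentiment = pd.DataFrame({          # social media source, pre-scored
    "customer_id": [1, 2],
    "sentiment_score": [0.8, -0.2],
})

# Summarize raw behavior, then blend all three sources on customer_id.
behavior = (clickstream.groupby("customer_id")
            .size().rename("page_views").reset_index())
view360 = (purchases
           .merge(behavior, on="customer_id")
           .merge(sentiment, on="customer_id"))
print(view360)
```

A real deployment would of course feed a blend like this into a recommendation platform rather than printing it.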

Internet of Things Will Fuel the New ‘Industrial Internet’

Early big data adoption drove the birth of new business models at companies like our customers Beachmint and Paytronix. In 2015, I’m convinced that we’ll see big data starting to transform traditional industrial businesses by delivering operational, strategic and competitive advantage. Germany is running an ambitious Industry 4.0 project to create “Smart Factories” that are flexible, resource-efficient, ergonomic and integrated with customers and business partners. The machine data generated by sensors and devices is fueling key opportunities like Smart Homes, Smart Cities, and Smart Medicine, all of which require big data analytics. Much like the ‘Industrial Internet’ movement in the U.S., Industry 4.0 is being defined by the Internet of Things. According to Wikibon, the value of efficiency gains from machine data could reach close to $1.3 trillion and will drive $514B in IT spend by 2020.[3] The bottlenecks are challenges related to data security and governance, data silos, and systems integration.

Big Data Gets Cloudy!

As companies with huge data volumes seek to operate in more elastic environments, we’re starting to see some running all, or part, of their big data infrastructures in the cloud. This says to me that the cloud is now “IT approved” as a safe, secure, and flexible data host. At PentahoWorld, I told a story about a “big data throw-down” that occurred during our Strategic Advisory Board meeting. At one point in the meeting, two enterprise customers in highly regulated industries started one-upping each other about how much data they stored in the Amazon Redshift cloud. One shared that they processed and analyzed 5-7 billion records daily. The next shared that they stored half a petabyte of new data every day and, on top of that, had to hold the data for seven years while still making it available for quick analysis. Both of these customers are held to the highest standards for data governance and compliance – regardless of who won, the forecast for their big data environments is the cloud!

Embedded Analytics is the New BI

Although “classic BI,” which involves a business analyst looking at data with a separate tool outside the flow of the business application, will be around for a while, a new wave is rising in which business users increasingly consume analytics embedded within applications to drive faster, smarter decisions. Gartner’s latest research estimates that more than half the enterprises that use BI now use embedded analytics.[4] Whether it’s a RichRelevance data scientist building a predictive algorithm for a recommendation engine, or a marketing director accessing Marketo to consume analytics related to lead scoring or campaign effectiveness, the way our customers are deploying Pentaho leaves me with no doubt that this prediction will bear out.

As classic BI matured, we witnessed a final “tsunami” in which data visualization and self-service inspired business people to imagine the potential for advanced analytics. Users could finally see all their data – warts and all – and also start to experiment with rudimentary blending techniques. Self-service and data visualization prepared the market for what I firmly expect to be the most significant analytics trend in 2015…

Data Refineries Give Real Power to the People!

The big data stakes are higher than ever before. No longer just about quantifying ‘virtual’ assets like sentiment and preference, analytics are starting to inform how we manage physical assets like inventory, machines and energy. This means companies must turn their focus to the traditional ETL processes that result in safe, clean and trustworthy data. However, for the types of ROI use cases we’re talking about today, this traditional IT process needs to be made fast, easy, highly scalable, cloud-friendly and accessible to business. And this has been a stumbling block – until now. Enter Pentaho’s Streamlined Data Refinery, a market-disrupting innovation that effectively brings the power of governed data delivery to “the people,” unlocking big data’s full operational potential. I’m tremendously excited about 2015 and the journey we’re on with both customers and partners. You’re going to hear a lot more about the Streamlined Data Refinery in 2015 – and that’s a prediction I can guarantee will come true!

Finally, as I promised at PentahoWorld in October, we’re only going to succeed when you tell us you’ve delivered an ROI for your business. Let’s go forth and prosper together in 2015!

Quentin Gallivan, CEO, Pentaho


[1] Ventana Research, Big Data Integration Report, Tony Cosentino and Mark Smith, May 2014.

[2] IDC, The Digital Universe of Opportunities: Rich Data and the Increasing Value of the Internet of Things, Gantz, Minton, Turner, Reinsel, April 2014.

[3] Wikibon, Defining and Sizing the Industrial Internet, David Floyer, June 2013.

[4] Gartner, Use Embedded Analytics to Extend the Reach and Benefits of Business Analytics, Daniel Yuen, October 3, 2014.


Attention Retail Banks, It’s Time for Change!

July 18, 2014


Retail banks, which have been wracked by scandals relating to PPI fraud, LIBOR rigging, unpopular bonus schemes and IT failures, need to think beyond upselling and cross-selling and consider how big data analytics can repair trust and improve the whole customer experience. In the article Monetising Big Data in Retail Banks Starts with a Better Customer Experience, Davy Nys, VP of EMEA & APAC at Pentaho, shares how retail banks can achieve the ‘Valhalla’ of customer value pricing (CVP): maximising the total value of a customer to a bank throughout all interactions and transactions. He explains how big data integration and analytics support CVP in five ways:

  1. Supporting a two-way, 360-degree view
  2. Lowering costs
  3. Making smarter offers
  4. Detecting fraud in a customer-friendly way
  5. Measuring customer sentiment

To learn more about how to achieve the ‘Valhalla’ of CVP, read the full article and register for the live webinar on the topic featuring Forrester analyst Martha Bennett, Making the Most of Your Data in the Financial Sector, on July 22nd at 11am GMT.


Pentaho 5.1 in LEGO

July 16, 2014

Two weeks ago we launched Pentaho Business Analytics 5.1. The new capabilities in Pentaho 5.1 support our ongoing strategy to make the hardest aspects of big data analytics faster, easier and more accessible to all. In honor of our Chief Architect, Will Gorman (also a LEGO Master Builder), we decided to have some fun with LEGO and now present to you the LEGO explanation of new features and functionality in Pentaho 5.1:


Direct Analytics on MongoDB – Unleash the value of MongoDB analytics for IT and Business Analysts with no coding required.


Data Science Pack – Operationalize predictive models for R and Weka, drastically reducing data preparation time and effort.


Full YARN Support – Reduce complexity for big data developers while leveraging the full power of Hadoop.


Visit the 5.1 landing page to learn more about this release and access resources such as videos, data sheets, customer profiles and the download.

 


Dinosaurs Have Had Their Day

June 16, 2014


Once upon a time, (not so) long ago in 2004, two young technologies were born from the same open source origins: Hadoop and Pentaho. Both evolved quickly from the market’s demand for better, larger-scale analytics that could be adopted faster to benefit more players.

Most who adopt Hadoop want to be disruptive leaders in their market without breaking the bank. Earlier this month at Hadoop Summit 2014, I talked to many people who told me, “I’d like to get off of <insert old proprietary software here> for my new big data applications and that’s why we’re looking at Pentaho.” It’s simple – no company is going to adopt Hadoop and then turn around and pay the likes of Informatica, Oracle or SAS outrageous amounts for data engineering or analytics.

Big data is the asteroid that has hit the tech market and changed its landscape forever, giving life to new business models and architectures based on open source technologies. First the ancient dinosaurs ignored open source, then they fought it and now they are trying to embrace it. But the mighty force of evolution had other plans. Dinosaurs are giving way to a more nimble generation that doesn’t depend on a mammoth diet of maintenance revenue, exorbitant license fees and long-term deals just to survive.

In this new world, companies must continually evolve to survive, and dinosaurs have had their day. It’s incredibly rewarding to be part of a new analytics ecosystem that thrives on open standards, high performance and better value for customers. So many positive evolutionary changes have taken place in the last ten years, I can’t wait to see what the next ten will bring.

Richard Daley
Founder and Chief Strategy Officer
Pentaho

Image: #147732373 / gettyimages.com


Good news, your data scientist just got a personal assistant

June 3, 2014

If you are, or have, a data scientist in house, you’re in for good news.

Today at Hadoop Summit in San Jose, Pentaho unveiled a toolkit built specifically for data scientists to simplify the messy, time-consuming data preparation, cleansing and orchestration of analytic data sets. Don’t just take it from us…

The Ventana Research Big Data Analytics Benchmark Research estimates that the top two time-consuming big data tasks are preparing data for integration (52%) and solving data quality and consistency issues (46%). That’s a whopping amount of time spent just getting data prepped and cleansed, not to mention the time spent post-processing results. Imagine if the time spent preparing, managing and orchestrating these processes could be handed off to a personal assistant, leaving more time to focus on analyzing data and applying advanced and predictive algorithms (i.e., doing what a data scientist is paid to do).

Enter the Pentaho Data Science Pack, the personal assistant to the data scientist. Built to help operationalize advanced analytic models as part of a big data flow, the Data Science Pack leverages familiar tools like R, the most-used tool among data scientists, and Weka, a widely used and popular open source collection of machine learning algorithms. No new tools to learn. In the words of our own customer, Ken Krooner, President at ESRG: “There was a gap in the market until now, and people like myself were piecing together solutions to help with the data preparation, cleansing and orchestration of analytic data sets. The Pentaho Data Science Pack fills that gap to operationalize the data integration process for advanced and predictive analytics.”
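To illustrate the “personal assistant” pattern of operationalizing a model, here is a toy sketch in Python. It is not the Data Science Pack itself; the functions, field names, and the hard-coded model weights are all hypothetical, standing in for the prep and scoring steps a real flow would automate:

```python
# Illustrative pattern: automated cleansing feeding a pre-trained model,
# so scoring becomes one repeatable step in a data flow.

def cleanse(records):
    """Prep step: drop incomplete rows and normalize field types."""
    return [
        {"visits": int(r["visits"]), "spend": float(r["spend"])}
        for r in records
        if r.get("visits") is not None and r.get("spend") is not None
    ]

def score(record, weights=None):
    """Scoring step: apply a pre-trained (here: hard-coded) linear model."""
    if weights is None:
        weights = {"visits": 0.3, "spend": 0.01}
    return sum(weights[k] * record[k] for k in weights)

raw = [
    {"visits": "4", "spend": "120.0"},   # strings, as raw feeds often are
    {"visits": None, "spend": "50.0"},   # incomplete -> dropped by cleanse
]
prepped = cleanse(raw)
scores = [score(r) for r in prepped]
```

In a real deployment, the model would come from R or Weka and the cleansing and orchestration would run inside the data integration flow rather than in ad hoc scripts.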

Pentaho is at the forefront of solving big data integration challenges, and we know advanced and predictive analytics are core ingredients for success. Find out how close at hand your data science personal assistant is and take a closer look at the Data Science Pack.

Chuck Yarbrough
Director, Big Data Product Marketing


Cloudera Stamp of Approval

April 3, 2014

Yesterday, Cloudera announced the general availability of Cloudera 5 (C5), the latest generation of Cloudera’s unified data platform for the enterprise data hub. Pentaho engineers have been working on certification since the beta became available in early February to make sure we were certified on day one of the GA.

Cloudera and Pentaho have a long-standing strategic relationship, with tested joint technologies that have been deployed time and again. By using Cloudera Certified products, enterprises significantly reduce risk while taking advantage of the world’s most complete, tested and popular platform powered by Apache Hadoop. This stamp of approval should put customers at ease when deploying C5, knowing that Pentaho and Cloudera have worked together to ensure the highest level of capabilities and compatibility.

Learn more about Pentaho and Cloudera’s joint solution benefits, access the download, resources and recordings.

Paul Vasquez
Senior Product Manager, Technology Partners
Pentaho
@BigDataPaul


Robert Frost on Big Data

March 27, 2014

There is a time to be adventurous and take the path less travelled, as Frost points out in the often-quoted “The Road Not Taken” – but not with big data. I wonder what Frost would have to say about big data, a concept difficult to comprehend in the early 1900s. In reviewing the poem (poetry being an armchair interest of mine), it becomes clear that there is some ambiguity about whether Frost intended to promote taking the road less travelled. In fact, it is thought that the poem was written in jest to his friend Edward Thomas to convey that the true regret is in not making a decision at all.

This ambiguity and indecisiveness is very present in the big data ecosystem. Most organizations are cognizant of the disruptive force of big data – its impact on the competitive business landscape and on the way consumers live their daily lives. Fortunately, they also know there is little, if any, room for regret, whether over taking the Big Data Road Less Travelled or over not choosing a path at all.

To that end, Pentaho has collected and shared four paths that many of our customers have successfully travelled to achieve business value from big data: four common use cases that reduce costs, optimize operations and begin to transform businesses. I highly encourage you to explore these top four big data use cases – the roads most often travelled. No regrets.

Donna Prlich
Sr. Director, Product Marketing

 

