Big Data in 2015—Power to the People!

December 16, 2014

Last year I speculated that the big data ‘power curve’ in 2014 would be shaped by business demands for data blending. Customers presenting at our debut PentahoWorld conference last October – from Paytronix to RichRelevance to NASDAQ – certainly proved that speculation true. Businesses like these are examples of how increasingly large and varied data sets can be used to deliver high and sustainable ROI. In fact, Ventana Research recently confirmed that 22 percent of organizations now use upwards of 20 data sources, and 19 percent use between 11 and 20.[1]

Moving into 2015, and fired up by their initial big data bounties, businesses will seek even more power to explore data freely, structure their own data blends, and gain profitable insights faster. They know “there’s gold in them hills” and they want to mine for even more!

With that said, here are my big data predictions for 2015:

Big Data Meets the Big Blender!

The digital universe is exploding at a rate that even Carl Sagan might struggle to articulate. Analysts believe it's doubling every year, with the unstructured component doubling every three months. Although unstructured data is getting all the headlines, IDC estimates that by 2025, 40 percent of the digital universe will be generated by machine data and devices.[2] The ROI business use cases we've seen require blending unstructured data with more traditional relational data. For example, one of the most common use cases we are helping companies create is a 360-degree view of their customers. The de facto reference architecture involves blending relational/transactional data detailing what the customer has bought with unstructured weblog and clickstream data highlighting customer behavior patterns around what they might buy in the future. This blended data set is further mashed up with social media data describing sentiment around the company's products and customer demographics. This "Big Blend" is fed into recommendation platforms to drive higher conversion rates, increase sales, and improve customer engagement. The blended-data approach is fundamental to other popular big data use cases like the Internet of Things, security and intelligence applications, supply chain management, and regulatory and compliance demands in the financial services, healthcare, and telco industries.
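To make the shape of that architecture concrete, here is a minimal sketch of such a blend in Python with pandas. The file names, column names, and derived features are hypothetical illustrations only; in practice, customers build this kind of flow graphically in Pentaho Data Integration rather than by hand.

```python
import pandas as pd

# Relational/transactional source: what each customer has bought.
orders = pd.read_csv("orders.csv")         # columns: customer_id, product_id, amount

# Clickstream source, pre-parsed from raw weblogs into rows:
# what each customer has been browsing.
clicks = pd.read_json("clickstream.json")  # columns: customer_id, page_url, ts

# Social sentiment, aggregated per customer by an upstream pipeline.
sentiment = pd.read_csv("sentiment.csv")   # columns: customer_id, sentiment_score

# Reduce each source to per-customer features before blending.
order_features = orders.groupby("customer_id").agg(
    total_spend=("amount", "sum"),
    order_count=("product_id", "count"),
)
click_features = clicks.groupby("customer_id").agg(
    page_views=("page_url", "count"),
    last_seen=("ts", "max"),
)

# The blended 360-degree view: one row per customer, ready to feed
# into a recommendation or propensity model.
customer_360 = (
    order_features
    .join(click_features, how="outer")
    .join(sentiment.set_index("customer_id"), how="left")
)
print(customer_360.head())
```

The point of the sketch is the flow, not the specifics: each source is reduced to per-customer features first, so the final blend stays one row per customer no matter how noisy the raw inputs are.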

Internet of Things Will Fuel the New ‘Industrial Internet’

Early big data adoption drove the birth of new business models at companies like our customers Beachmint and Paytronix. In 2015, I'm convinced that we'll see big data starting to transform traditional industrial businesses by delivering operational, strategic and competitive advantage. Germany is running an ambitious Industry 4.0 project to create "Smart Factories" that are flexible, resource-efficient, ergonomic and integrated with customers and business partners. The machine data generated by sensors and devices is fueling key opportunities like Smart Homes, Smart Cities and Smart Medicine, all of which require big data analytics. Much like the 'Industrial Internet' movement in the U.S., Industry 4.0 is being defined by the Internet of Things. According to Wikibon, the value of efficiency from machine data could reach close to $1.3 trillion and will drive $514 billion in IT spend by 2020.[3] The bottlenecks are challenges related to data security and governance, data silos, and systems integration.

Big Data Gets Cloudy!

As companies with huge data volumes seek to operate in more elastic environments, we're starting to see some running all, or part of, their big data infrastructures in the cloud. This says to me that the cloud is now "IT approved" as a safe, secure, and flexible data host. At PentahoWorld, I told a story about a "big data throw-down" that occurred during our Strategic Advisory Board meeting. At one point in the meeting, two enterprise customers in highly regulated industries started one-upping each other about how much data they stored in Amazon Redshift. One shared that they processed and analyzed 5-7 billion records daily. The next shared that they stored half a petabyte of new data every day and, on top of that, had to hold the data for seven years while still making it available for quick analysis. Both of these customers are held to the highest standards for data governance and compliance – regardless of who won, the forecast for their big data environments is the cloud!

Embedded Analytics is the New BI

Although "classic BI," in which a business analyst looks at data with a separate tool outside the flow of the business application, will be around for a while, a new wave is rising in which business users increasingly consume analytics embedded within applications to drive faster, smarter decisions. Gartner's latest research estimates that more than half the enterprises that use BI now use embedded analytics.[4] Whether it's a RichRelevance data scientist building a predictive algorithm for a recommendation engine, or a marketing director accessing Marketo to consume analytics related to lead scoring or campaign effectiveness, the way our customers are deploying Pentaho leaves me with no doubt that this prediction will bear out.

As classic BI matured, we witnessed a final “tsunami” in which data visualization and self-service inspired business people to imagine the potential for advanced analytics. Users could finally see all their data – warts and all – and also start to experiment with rudimentary blending techniques. Self-service and data visualization prepared the market for what I firmly expect to be the most significant analytics trend in 2015….

Data Refineries Give Real Power to the People!

The big data stakes are higher than ever before. No longer just about quantifying ‘virtual’ assets like sentiment and preference, analytics are starting to inform how we manage physical assets like inventory, machines and energy. This means companies must turn their focus to the traditional ETL processes that result in safe, clean and trustworthy data. However, for the types of ROI use cases we’re talking about today, this traditional IT process needs to be made fast, easy, highly scalable, cloud-friendly and accessible to business. And this has been a stumbling block – until now. Enter Pentaho’s Streamlined Data Refinery, a market-disrupting innovation that effectively brings the power of governed data delivery to “the people,” unlocking big data’s full operational potential. I’m tremendously excited about 2015 and the journey we’re on with both customers and partners. You’re going to hear a lot more about the Streamlined Data Refinery in 2015 – and that’s a prediction I can guarantee will come true!

Finally, as I promised at PentahoWorld in October, we’re only going to succeed when you tell us you’ve delivered an ROI for your business. Let’s go forth and prosper together in 2015!

Quentin Gallivan, CEO, Pentaho


[1] Ventana Research, Big Data Integration Report, Tony Cosentino and Mark Smith, May 2014.

[2] IDC, The Digital Universe of Opportunities: Rich Data and the Increasing Value of the Internet of Things, Gantz, Minton, Reinsel, Turner, April 2014.

[3] Wikibon, Defining and Sizing the Industrial Internet, David Floyer, June 2013.

[4] Gartner, Use Embedded Analytics to Extend the Reach and Benefits of Business Analytics, Daniel Yuen, October 3, 2014.


4 Questions to Ask Before You Define Your Cloud BI Strategy

June 25, 2012

These days, when it comes to enterprise software, it seems that it is all about the cloud. Some software applications, such as Salesforce, Marketo, and Workday, have made quite a name for themselves in this space. Can business intelligence follow the same path to success? Does it make sense to house your BI in the cloud? I believe it depends. Let's explore why.

There are four criteria that impact the decision between a cloud and an on-premise BI strategy. Let's take a look at how each affects your approach.

Question 1: Where is the data located?

Your BI strategy should vary depending on the location of your data. If your data is distributed, some of it may already be in the cloud (e.g., web data and clickstreams) and some on-premise, such as corporate data. For real-time or near-real-time analytics, you need to deploy your BI as close to the source as possible. For example, when analyzing supply chain data out of an on-premise SAP system, where your database, application, and infrastructure all sit on-premise, it is expensive and frankly impractical to move the data to the cloud before you start analyzing it.

Your data can also be geographically distributed. Unless your cloud infrastructure is co-located with your data geo zones, your BI experience can suffer from data latency and long refresh intervals.

Question 2: What are the security levels of data?

It’s important to acknowledge that data security levels are different in the cloud. You may not be able to put all your analytics outside of the company firewall. According to Cisco’s 2012 Global Cloud Networking survey, 72% of respondents cited data protection security as the top obstacle to a successful implementation of cloud services.

Question 3: What are your users' preferences?

Customer preference is extremely important today. The balance of power has shifted, and users and customers are now the ones who decide whether an on-premise or a cloud deployment is suitable for them. What’s more, each customer’s maturity model is different. As an application provider or business process automation provider, you need to cater to your individual customers’ business needs.

Question 4: What operational SLAs does your cloud BI vendor bind you to?

Your operational SLAs can depend on your cloud infrastructure providers, binding you to service quality levels different from what you need. Pure cloud BI vendors provide their BI software over the public Internet through a utility pricing and delivery scheme. As much as this model provides an attractive alternative when resources are limited, it's not for everyone. In most cases, the SaaS BI vendor depends on IaaS vendors (such as Amazon, Savvis, or OpSource) for storage, hardware, and networks. As a result, the SaaS BI vendor's operational processes have to align with the infrastructure vendors' processes for housing, running, and backing up and recovering the BI software. Depending on your BI strategy, these nested and complex SLAs may or may not be the right choice.

Large enterprises, or even mid-market companies inspired by growth, typically develop an IT strategy that is provider-agnostic and has the flexibility to be hosted on-premise or in the cloud. This strategy helps companies avoid lock-in and inflexibility down the road.

As cloud technology remains one of the hottest trends in IT today, it is important to assess whether the cloud is the right choice for BI. The reality is that it depends. The center of gravity for BI is still on-premise; however, it will move to the cloud over time, mostly through the embedded BI capabilities of enterprise SaaS applications. Successful organizations will be the ones that can navigate the boundary between the two strategies and provide greater flexibility and choice by offering a product that can be deployed on-premise, in the cloud, or in a hybrid of both.

What is your Business Intelligence Cloud strategy?

- Farnaz Erfan, Product Marketing, Pentaho

Originally posted on SmartData Collective on June 21, 2012


Sex & Sizzle – not without plumbing

November 16, 2010

What sells BI software? Sex and Sizzle! What makes BI projects successful? All of the data work done before any grids or graphs are ever produced. It’s the side of the business most BI vendors don’t talk about as they’d rather just woo and wow people with flash charts and glossy dashboards. Not that there is anything wrong with that – who doesn’t like great looking output? But, if the backend plumbing is either too complicated or non-existent, then it doesn’t matter how sexy this stuff is.

Today Pentaho announced the Pentaho Enterprise Data Services Suite to help make the “plumbing” as easy and efficient as possible. We’ve enabled people to iteratively get from raw data–from virtually any source–all the way through to metadata and onto visualization in less than an hour. We’ve enabled a new set of users to accomplish this by taking away many of the complexities.
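As a rough analogue of that plumbing – and only an analogue, since Pentaho Data Integration does this graphically and none of the names below come from the product – here is what the extract-clean-load step looks like in plain Python, assuming a hypothetical raw_sales.csv:

```python
# Extract raw records, clean and conform them, and load a table that
# reporting and visualization tools can query. This is a hand-written
# sketch of the kind of work the suite automates, not its implementation.
import csv
import sqlite3

conn = sqlite3.connect("analytics.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS sales "
    "(region TEXT, product TEXT, revenue REAL)"
)

with open("raw_sales.csv", newline="") as f:
    for row in csv.DictReader(f):
        # Transform: trim stray whitespace and coerce types, skipping
        # records too malformed to load.
        try:
            record = (
                row["region"].strip().upper(),
                row["product"].strip(),
                float(row["revenue"]),
            )
        except (KeyError, ValueError):
            continue  # real plumbing would route rejects to an error stream
        conn.execute("INSERT INTO sales VALUES (?, ?, ?)", record)

conn.commit()
# Downstream BI tools can now query the conformed table directly, e.g.:
for region, total in conn.execute(
    "SELECT region, SUM(revenue) FROM sales GROUP BY region"
):
    print(region, total)
conn.close()
```

Once a conformed table like this exists, the sexy front-end work – the grids, graphs, and dashboards – finally has something trustworthy to stand on.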

In about 80% of the use cases we encounter, our customers want to quickly create and perform analytics on the fly, work in an iterative approach, and, when satisfied, put their projects into production. You shouldn't need a Ph.D. in data warehousing to accomplish this, yet many tools require extensive knowledge of DW methodologies and practices. It is fine to demand this knowledge for larger Enterprise DWs (EDWs), but why make everyone pay the price, both in software cost and in the experience and training required?

Now it would be one thing to provide data integration with RDBMSs, another thing to integrate with ROLAP, and yet another to integrate with Big Data like Hadoop – but how nice would it be to have a single Data Integration and Business Intelligence platform that works for all of these? Almost as nice as the Florida Gators winning a national championship, but we won't have to worry about that in 2010… had to digress for a moment.

A big part of our product release today centers around Pentaho for Hadoop integration, including the GA of Pentaho Data Integration and BI Suite for Hadoop. Big Data and the whole "data explosion" trend are just starting, so if you aren't there today, give it time and know that Pentaho is already positioned to help in these use cases.

Pentaho allows you to start down an easy path with Agile BI and then scale up to EDW when and if necessary with enterprise data services. Our engineering team and community have spent significant time and effort to bring these services to market, and today is the official release. Please take a few minutes to read up on the new Pentaho Enterprise Data Services Suite and attend the launch webcast. Or, go ahead and download the Pentaho Enterprise Data Services Suite and start making easier, faster, better decisions.

Richard


Where to find Pentaho this June

June 15, 2010

June may be halfway over, but there are still 20 opportunities to learn about Pentaho this month at live and virtual events… and in 6 languages!

This month Pentaho is bringing a ray of Open Source BI sunshine to some of the industry’s most preeminent cloud events. Following the successful announcements of Pentaho’s On-Demand BI Solution and support of Apache Hadoop, we will demonstrate these offerings in action, bringing insight, clarity and flexibility to data in the cloud.

Pentaho Featured Cloud Events

GigaOm Structure 2010, June 23-24, 2010, San Francisco, CA – Join Pentaho's CEO, Richard Daley, and CTO, James Dixon, at Structure 2010 to learn more about using Pentaho's data integration and analytic tools to more quickly and easily load, access, and analyze data in Hadoop, whether it's on-premise or in the cloud.

In the exhibit hall, see a live preview demo of Pentaho's integration with Hadoop and learn about the integration of the Pentaho BI Suite with the Hive database. Take advantage of our 25% sponsor discount code by clicking here.

Hadoop Summit, June 29, 2010, Santa Clara, CA – Pentaho is attending the third annual Hadoop Summit 2010. Organized by Yahoo!, Hadoop Summit sessions span numerous industries and cater to all levels of expertise. Richard Daley and Jake Conelius will be on hand at the conference to demo and discuss Pentaho's integration with Hadoop and Hive and the benefits of "Pentaho becoming the face of Hadoop." They will also pass out limited-edition Hadoop elephants wearing Pentaho sweaters.

Pentaho Agile BI Events

Pentaho's Agile BI initiative is moving full speed ahead, as we recently delivered the GA of Pentaho Data Integration 4.0. To learn how to get started and why, make sure to attend one of these Agile BI-focused events in the US and Europe:

North America

Worldwide webinars

Italy

Germany

Spain

UK

France

Norway

Visit the Events and Webcasts section of our website to stay up-to-date on virtual and live events.

We want to connect with you. Join the conversation on Twitter and Facebook, and get the inside scoop from the BI from the Swamp blog by signing up to receive posts via email or RSS.

Rebecca Goldstein
Director, Corporate Communications
Pentaho Corporation


On-Demand or On-Premise: Yes, Please

June 8, 2010

The great debate about on-demand solutions versus on-premise software has gone on for about a decade now. The consensus is that the hosted approach is more nimble, much faster to deploy, and doesn't require in-house resources or hardware, but that it isn't quite ready for the big leagues due to security issues, cloudy information-ownership issues, limited functionality, very limited flexibility, closed architectures, and a lack of general data scalability. As a result, hosted BI has struggled to gain acceptance.

Well, it's time for a fresh take on this subject – one that finally gets BI where it needs to be. Pentaho has created a hybrid approach that gives you the best of both worlds – the freedom, flexibility, and ease of hosted software in tandem with the security, robustness, and scalability of enterprise-grade solutions. It comes in the form of a suite of On-Demand BI subscriptions that put you in the driver's seat. Instead of having to live with whatever the provider decides to do about information storage, security, and management, you are in complete control of how the hosted solution is deployed and managed. That adds the key element that has been missing – choice. You determine whether the BI application is best managed wholly by Pentaho, internally, or by both parties.

Those of you with more time or internal resources to commit to the specific BI application initiative can develop and administer the infrastructure yourselves. Alternatively, Pentaho can handle any portion of the deployment or application management, as you determine based on what is best for your organization.

What if you change your mind? No problem. This subscription service is deployed via a virtual image, so if you decide to move your BI Suite on-premise, there are no portability issues to overcome – one of the huge barriers to adoption of traditional hosted solutions. No longer is it problematic to shift from hosted to on-premise or back again. And if you want to add more bandwidth, memory, or processing power, it's quickly and easily done. How's that for agility?

Skeptical? Take it for a test drive in our new On-Demand BI evaluation environment. It’s so good that we challenge anyone to provide us with a distinct data set and we’ll get it running within 72 hours with dashboards and relevant KPIs. We’ll even let you access it for three weeks while you explore and expand it.

On-Demand or On-Premise: Yes, please.

Richard

For more information about Pentaho’s On-Demand BI Solution and the 72-Hour Challenge visit www.pentaho.com/services/on-demand.

