Democratizing Analytics

March 27, 2015

Politics is all that stands in the way of democratizing analytics

Following the whole ‘BI for the masses’ movement, today’s buzz is all about democratizing analytics – giving everyone from Alice in the mailroom to Joe CEO the tools to make data-informed decisions. It’s a lively debate. Entrepreneurial types insist that it’s a ‘do or die’ imperative while the more cautious amongst us liken it to running with scissors.

Last Wednesday, I joined the panel of Computing’s “Practical steps towards democratising analytics” web conference, chaired by Stuart Sumner, to explore the topic in more depth. You can read a recap of the event here, but if you can spare half an hour, do watch the replay. The quality of the debate was excellent, reflecting IT’s growing involvement and maturity in the enterprise analytics domain.

Given that 47% of those Computing surveyed said that access to analytics at their organisations was restricted to specialists, my co-panelist Trevor Attridge and I agreed that democratizing analytics should be high on companies’ agendas. And if it is not a do or die imperative today, it almost certainly will be in the not-too-distant future. Even the most traditional companies, from energy suppliers to shipping companies to hospitals, are starting to apply analytics and the Internet of Things to improve productivity, efficiency and growth. Our customer St Antonius Hospital is a great case in point.

The impending election serves as a reminder that healthy democracies depend on strong leadership, cultural acceptance, good governance and transparency. In fact, these were the very things that delegates raised as concerns when it came to rolling analytics out more broadly in their companies. One theme that kept resurfacing in our debate was the importance of a strong “coalition” between IT and the business to address these.

It occurred to me that most of these concerns boil down to company politics – leadership, culture and changing the status quo – and not technology. Technologies from big data blending, to real-time processing, through to predictive data mining algorithms are all out there and in production, as our “Mavericks of Big Data” customers demonstrate.

Even cost and ROI, which delegates raised as their greatest concern, are no longer the barriers they once were. The old-school per-user licence model – wholly unsuited to analytics democracies – is fast being overtaken by more attractive usage-based and subscription models.

Is politics standing in the way of your company democratizing analytics?

Davy Nys
VP EMEA & APAC
Pentaho
@davynys


Pentaho 5.3 – Taming messy and extreme data

February 18, 2015

There has definitely been an evolution of how the industry talks about data. About five years ago the term ‘Big Data’ emerged, focused mainly on sheer volume. Soon after, the definition expanded into a better one that explains what Big Data really is: not just big, but data that moves extremely fast, often lacks structure, varies greatly from existing data, doesn’t fit well with more traditional database technologies, and, frankly, is best described as “messy”.

Fast-forward to 2015 and this week’s announcement of Pentaho 5.3, delivering on-demand big data analytics at scale on Amazon Web Services and Cloudera Impala. This release is driven by what we see in more and more of our customers – a new data term for you – EXTREME data problems! Our customer NASDAQ is a very interesting example of where traditional relational data systems have maxed out and been replaced by cloud architectures that include Hadoop, Pentaho and Amazon Redshift. You can read their story here. What NASDAQ found was that pushing vast amounts of data at extreme levels (10 billion rows every day) was more easily accomplished by combining cloud and big data technologies, creating a more scalable, highly elastic solution.

We’ve seen many of our customers processing vast volumes of data in Hadoop with the help of Pentaho, enabling analytics at scale like never before. The biggest challenge these customers face is getting the results out of Hadoop and into the hands of the users who can make the most of fresh insights. That’s where Pentaho 5.3 comes into play. This release opens the data refinery to Amazon Redshift AND Cloudera Impala to push the limits of analytics through blended and governed data delivery on demand (a sketch of that hand-off follows the feature list below). In addition to adding Redshift and Impala support to the data refinery, 5.3 includes several other key features:

  1. Advanced Auto-Modeling – advances in auto-modeling accelerate the creation, and increase the sophistication, of generated data models for better analytics and greater ease of use.
  2. Additional Hadoop Support – support for the latest Hadoop distributions from Cloudera and MapR, Hadoop cluster naming for simplified connectivity and management, and enhanced performance for scale-out integration jobs.
  3. Analyzer API Enhancements – complete control over the end-user experience for highly tailored, easy-to-deliver embedded analytics.
  4. Simplified Customer Experience – an easier mechanism for embedding analytics, plus documentation improvements that simplify learning.
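
For a feel of the delivery step, here is a minimal sketch – my illustration, not Pentaho’s implementation – of the final hand-off: a Hadoop job has written refined output to S3, and Redshift’s standard COPY command publishes it for on-demand analysis. The cluster, table, bucket and credential values are all hypothetical placeholders.

    # Minimal sketch (hypothetical names throughout): publish refined Hadoop
    # output, already staged in S3, to Amazon Redshift for on-demand analytics.
    import psycopg2  # standard PostgreSQL driver; Redshift speaks its protocol

    conn = psycopg2.connect(
        host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
        port=5439, dbname="analytics", user="refinery", password="<password>",
    )
    with conn, conn.cursor() as cur:
        # Target table for the refined data set (hypothetical schema).
        cur.execute("""
            CREATE TABLE IF NOT EXISTS refined_orders (
                order_id BIGINT, customer_id BIGINT,
                order_ts TIMESTAMP, total NUMERIC(12,2)
            )""")
        # COPY is Redshift's parallel bulk-load path from S3.
        cur.execute("""
            COPY refined_orders
            FROM 's3://example-bucket/refinery-output/orders/'
            CREDENTIALS 'aws_access_key_id=<key>;aws_secret_access_key=<secret>'
            CSV GZIP""")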

If your data is big, messy, extreme or just plain annoying and needs to be tamed, I encourage you to learn more about Pentaho 5.3. Check out the great resources like the video and white paper to get started taming your data today.

Chuck Yarbrough
Product Marketing, Big Data
Pentaho

 


The Pentaho Journey – A Big Data Booster

February 13, 2015

[Infographic: The Pentaho Journey]

Ten years ago we set out to commoditize BI, disrupt the old-school proprietary vendors and give customers a better choice. It’s been an exciting journey: building a scalable analytics platform, growing an open source community, surviving a deep recession, beating the competition, assembling a great team, providing great products and services to our customers, and becoming a major player in the big data market.

Some of the key points along the way:

2008 – the recession hit and, frankly, as painful as it was, it actually helped Pentaho: we were the best value proposition in BI and people were looking to reduce expenditures. It also drastically reduced the cost of commercial office space, and we opened our San Francisco office in Q2 2009.

2009 – at a happy hour in Orlando in November we stumbled upon our ability to natively integrate Pentaho Data Integration with Hadoop (which I had never even heard of prior to that evening). Eight months later we launched our big data initiative.

2011 – based on traction and momentum with our big data capabilities, we decided to go all in on this space. This led to a dedicated big data team, a change in packaging and pricing, and a beefed-up management team.

2013 – we acquired longtime partner Webdetails and immediately gained a world-class UI/UX team, one our customers love working with to build custom dashboards.

2014 – we held our first PentahoWorld user conference with a theme that strongly resonated with the market: Bring your Big Data to Life. We were proud to host 400 attendees from 27 countries around the globe.

2015 – Hitachi Data Systems acquires Pentaho! This is an exciting time for both companies, our customers, and partners. We both share a vision around analytics and in particular around big data opportunities.

So the next part of our exciting journey begins as a wholly owned subsidiary of Hitachi Data Systems, and I couldn’t be happier. We’ll be going after the big data market, and specifically the Internet of Things, and we’ll have a blast doing so. Buckle up, because it’s going to be a fast and thrilling ride!

Richard Daley
Founder and Chief Strategy Officer
Pentaho


A bolder, brighter future for big data analytics and the Internet of Things that matter: Pentaho + HDS

February 10, 2015

Big Data and the Internet of Things are disrupting entire markets, with machine data blurring the virtual and physical worlds. This market matters: a recent Goldman Sachs report cites an astounding $2 trillion opportunity by 2020 for IoT, with the potential to impact everything from new product opportunities, to shop floor optimization, to factory worker efficiency gains that will power top-line and bottom-line growth. The company that delivers high-quality big data solutions fastest, and enables customers to connect people, data and things to transform their industries and organizations, will win.

That is why I am very excited to share with you that today Hitachi Data Systems (HDS) announced its intention to acquire Pentaho. Pentaho and HDS share a common vision of the impact of big data on our industry and society as a whole. This acquisition builds on the existing OEM relationship between Pentaho and HDS, forged to accelerate the HDS IoT initiative known as Social Innovation. Social Innovation enables Hitachi to deliver solutions for the Internet of Things and big data – solutions that enable healthier, safer and smarter societies for generations to come. The Pentaho vision of the interconnectedness of data, people and things, supported by a big data orchestration platform to power embedded analytics, aligns perfectly. Indeed, Social Innovation is a big, bold strategy and Pentaho is a critical part of it.

HDS plans to both retain the existing business model responsible for the success of Pentaho, and use Pentaho software to develop new big data services that will go to market in FY15, accelerating delivery of HDS Social Innovation solutions. Once closed, this acquisition brings together two companies that deliver innovative and proven solutions to enterprises around the globe. Hitachi owns the infrastructure and Pentaho owns the data integration and analytics platform and know-how to harness the value in big data. Together Pentaho + HDS form a powerhouse to deliver on the promise of big data with easier, faster deployments and quicker time to value.

For customers to succeed in this new world of big data and the Internet of Things that matter, both hardware and software must scale flexibly to keep pace with the volume, diversity and velocity of data, regardless of where it is created. No two companies know these challenges better than Pentaho and HDS. Together we are delivering a transformative future for our industry.

Game on,

Quentin

 

 


Union of the State – A Data Lake Use Case

January 22, 2015


Pentaho co-founder and CTO James Dixon is who we have to thank for the term ‘Data Lake.’ He first wrote about the Data Lake concept on his blog in 2010, in Pentaho, Hadoop and Data Lakes. After numerous interpretations and rounds of feedback, he revisited the concept and definition here: Data Lakes Revisited.

Now, in his latest blog, Dixon explores a use case based on the Data Lake concept, calling it the Union of the State. Read the excerpt below to learn how the Union of the State can provide the equivalent of rewind, pause, and fast-forward remote control over the state of your business; a small illustrative sketch follows the excerpt. Let us know what you think, and whether you have deployed one of the four use cases.

Originally posted on James Dixon's Blog:

Many business applications are essentially workflow applications or state machines. This includes CRM systems, ERP systems, asset tracking, case tracking, call center, and some financial systems. The real-world entities (employees, customers, devices, accounts, orders etc.) represented in these systems are stored as a collection of attributes that define their current state. Examples of these attributes include someone’s current address or number of dependents, an account’s current balance, who is in possession of laptop X, which documents for a loan approval have been provided, and the date of Fluffy’s last Feline Distemper vaccination.

State machines are very good at answering questions about the state of things. They are, after all, machines that handle state. But what about reporting on trends and changes over the short and long term? How do we do this? The answer for this is to track changes to the attributes in change logs. These change logs…

View the original post (1,240 more words).
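
To make the change-log idea concrete, here is a minimal sketch – my own illustration, not code from Dixon’s post – of a state machine that records every attribute change in an append-only log, so any entity’s state can be rewound to an arbitrary point in time:

    # Minimal sketch: current state plus an append-only change log,
    # enabling rewind/pause/fast-forward over an entity's history.
    from datetime import datetime

    change_log = []     # (timestamp, entity_id, attribute, old_value, new_value)
    current_state = {}  # entity_id -> {attribute: value}

    def set_attribute(entity_id, attribute, value, ts):
        """Update the current state and log the transition."""
        old = current_state.setdefault(entity_id, {}).get(attribute)
        current_state[entity_id][attribute] = value
        change_log.append((ts, entity_id, attribute, old, value))

    def state_at(entity_id, as_of):
        """Replay the log to reconstruct the entity's state at time as_of."""
        state = {}
        for ts, eid, attribute, _old, new in sorted(change_log, key=lambda c: c[0]):
            if eid == entity_id and ts <= as_of:
                state[attribute] = new
        return state

    # Usage: track a customer's address over time, then rewind.
    set_attribute("cust-42", "address", "1 Main St", datetime(2014, 3, 1))
    set_attribute("cust-42", "address", "9 Oak Ave", datetime(2015, 1, 5))
    print(state_at("cust-42", datetime(2014, 6, 1)))  # {'address': '1 Main St'}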


Big Data in 2015—Power to the People!

December 16, 2014

Last year I speculated that the big data ‘power curve’ in 2014 would be shaped by business demands for data blending. Customers presenting at our debut PentahoWorld conference last October, from Paytronix, to RichRelevance, to NASDAQ, certainly proved my speculation true. Businesses like these are examples of how increasingly large and varied data sets can be used to deliver high and sustainable ROI. In fact, Ventana Research recently confirmed that 22 percent of organizations now use upwards of 20 data sources, and 19 percent use between 11 and 20.[1]

Moving into 2015, and fired up by their initial big data bounties, businesses will seek even more power to explore data freely, structure their own data blends, and gain profitable insights faster. They know “there’s gold in them hills” and they want to mine for even more!

With that said, here are my big data predictions for 2015:

Big Data Meets the Big Blender!

The digital universe is exploding at a rate that even Carl Sagan might struggle to articulate. Analysts believe it’s doubling every year, with the unstructured component doubling every three months. By 2025, IDC estimates that 40 percent of the digital universe will be generated by machine data and devices.[2] Yet while unstructured data gets all the headlines, the ROI business use cases we’ve seen require blending unstructured data with more traditional, relational data. For example, one of the most common use cases we are helping companies create is a 360-degree view of their customers. The de facto reference architecture involves blending relational/transactional data detailing what the customer has bought with unstructured weblog and clickstream data highlighting behavior patterns around what they might buy in the future. This blended data set is further mashed up with social media data describing sentiment around the company’s products and customer demographics, as sketched below. This “Big Blend” is fed into recommendation platforms to drive higher conversion rates, increase sales, and improve customer engagement. The same “blended data” approach is fundamental to other popular big data use cases like the Internet of Things, security and intelligence applications, supply chain management, and regulatory and compliance demands in the Financial Services, Healthcare and Telco industries.
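
As a concrete (if toy) illustration, here is a minimal sketch of that blend – hypothetical schemas and data, not any customer’s actual pipeline – joining pre-aggregated transactional, clickstream and sentiment sources into a single customer view:

    # Minimal sketch of the "Big Blend": relational purchase history,
    # clickstream behavior and social sentiment merged on a customer key.
    import pandas as pd

    orders = pd.DataFrame({            # relational/transactional source
        "customer_id": [1, 2],
        "lifetime_spend": [1200.0, 310.0],
    })
    clicks = pd.DataFrame({            # weblog/clickstream, pre-aggregated
        "customer_id": [1, 2],
        "pages_viewed_30d": [87, 12],
        "top_category": ["cameras", "books"],
    })
    sentiment = pd.DataFrame({         # social media sentiment scores
        "customer_id": [1, 2],
        "sentiment_score": [0.8, -0.2],
    })

    # Blend the three sources into a 360-degree customer view.
    customer_360 = (
        orders.merge(clicks, on="customer_id")
              .merge(sentiment, on="customer_id")
    )
    print(customer_360)  # one row per customer, ready for a recommender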

Internet of Things Will Fuel the New ‘Industrial Internet’

Early big data adoption drove the birth of new business models at companies like our customers Beachmint and Paytronix. In 2015, I’m convinced that we’ll see big data starting to transform traditional industrial businesses by delivering operational, strategic and competitive advantage. Germany is running an ambitious Industry 4.0 project to create “Smart Factories” that are flexible, resource-efficient, ergonomic and integrated with customers and business partners. The machine data generated by sensors and devices is fueling key opportunities like Smart Homes, Smart Cities and Smart Medicine, all of which require big data analytics. Much like the ‘Industrial Internet’ movement in the U.S., Industry 4.0 is being defined by the Internet of Things. According to Wikibon, the value of efficiency gains from machine data could reach close to $1.3 trillion and will drive $514B in IT spend by 2020.[3] The bottlenecks are challenges related to data security and governance, data silos, and systems integration.

Big Data Gets Cloudy!

As companies with huge data volumes seek to operate in more elastic environments, we’re starting to see some running all, or part of, their big data infrastructures in the cloud. This says to me that the cloud is now “IT approved” as a safe, secure, and flexible data host. At PentahoWorld, I told a story about a “big data throw down” that occurred during our Strategic Advisory Board meeting. At one point in the meeting, two enterprise customers in highly regulated industries started one-upping each other about how much data they stored in the Amazon Redshift cloud. One shared that they processed and analyzed 5-7 billion records daily. The other shared that they stored half a petabyte of new data every day and, on top of that, had to hold the data for seven years while still making it available for quick analysis. Both of these customers are held to the highest standards for data governance and compliance – regardless of who won, the forecast for their big data environments is the cloud!

Embedded Analytics is the New BI

Although “classic BI,” which involves a business analyst looking at data with a separate tool outside the flow of the business application, will be around for a while, a new wave is rising in which business users increasingly consume analytics embedded within applications to drive faster, smarter decisions. Gartner’s latest research estimates that more than half the enterprises that use BI now use embedded analytics.[4] Whether it’s a RichRelevance data scientist building a predictive algorithm for a recommendation engine, or a marketing director accessing Marketo to consume analytics related to lead scoring or campaign effectiveness, the way our customers are deploying Pentaho leaves me with no doubt that this prediction will bear out.

As classic BI matured, we witnessed a final “tsunami” in which data visualization and self-service inspired business people to imagine the potential for advanced analytics. Users could finally see all their data – warts and all – and also start to experiment with rudimentary blending techniques. Self-service and data visualization prepared the market for what I firmly expect to be the most significant analytics trend in 2015….

Data Refineries Give Real Power to the People!

The big data stakes are higher than ever before. No longer just about quantifying ‘virtual’ assets like sentiment and preference, analytics are starting to inform how we manage physical assets like inventory, machines and energy. This means companies must turn their focus to the traditional ETL processes that result in safe, clean and trustworthy data. However, for the types of ROI use cases we’re talking about today, this traditional IT process needs to be made fast, easy, highly scalable, cloud-friendly and accessible to business. And this has been a stumbling block – until now. Enter Pentaho’s Streamlined Data Refinery, a market-disrupting innovation that effectively brings the power of governed data delivery to “the people,” unlocking big data’s full operational potential. I’m tremendously excited about 2015 and the journey we’re on with both customers and partners. You’re going to hear a lot more about the Streamlined Data Refinery in 2015 – and that’s a prediction I can guarantee will come true!

Finally, as I promised at PentahoWorld in October, we’re only going to succeed when you tell us you’ve delivered an ROI for your business. Let’s go forth and prosper together in 2015!

Quentin Gallivan, CEO, Pentaho

[Infographic: Quentin’s Big Data Predictions for 2015]

[1] Ventana Research, Big Data Integration Report, Tony Cosentino and Mark Smith, May 2014.

[2] IDC, The Digital Universe of Opportunities: Rich Data and the Increasing Value of the Internet of Things, Gantz, Minton, Turner, Reinsel, April 2014.

[3] Wikibon, Defining and Sizing the Industrial Internet, David Floyer, June 2013.

[4] Gartner, Use Embedded Analytics to Extend the Reach and Benefits of Business Analytics, Daniel Yuen, October 3, 2014.


Ventana Research on PentahoWorld, company success and 5.2

December 15, 2014

It has been a busy year for Pentaho. A few highlights include: PentahoWorld, our inaugural worldwide users conference, two new product releases (5.1 & 5.2) and a refreshed 36-month vision for our product roadmap and the future of analytics.

In the opening PentahoWorld keynote, Pentaho Chairman and CEO Quentin Gallivan shared the top three trends driving Pentaho’s vision for governed data delivered “at the point of business impact.” These include:

  1. The universe of data is exploding and data is connecting people and things
  2. Evolving data architectures are required to blend data sets
  3. Users expect to consume data in new ways

These trends all align with our latest release, Pentaho 5.2, and our vision for a “data orchestration platform.” We were fortunate to have Tony Cosentino, VP and Research Director at Ventana Research, attend PentahoWorld to hear our announcements and keynote firsthand, as well as meet with several customers. Tony and the team at Ventana Research have spent time with us over the years discussing, guiding and watching our future unfold. We encourage you to read his first-hand point of view on PentahoWorld, the reasons behind our success and the highlights of our latest 5.2 release. Tony summed it up best, calling out our focus on “targeted yet flexible” solutions. We know you will be enlightened and impressed.

Read: Pentaho Presents Big Data Orchestration Platform with Governance and Data Refinery


Donna Prlich
VP of Product Marketing
Pentaho

