Using Pentaho to Be Aware, Analyze, & Take Action

September 28, 2011

Be Aware

Denial of Service (DoS) attacks, IP Spoofing, Comment Spamming, Malware… these are all malicious activities designed to disrupt the services used by many people and organizations. If you are taking advantage of the internet to run your business, create awareness of a product or service, or simply keep in touch with friends and family, your systems are at risk of becoming a target.

Successful internet “intrusions” can cost you money and even lead to identity theft. DoS attacks can prevent internet sites from running efficiently and at worst can take them down entirely. IP Spoofing, frequently used in DoS attacks, is a means to “forge” an IP address and make it appear that the internet request or “attack” is coming from some other machine or location. And Comment Spamming, oh brother… that’s where programs or people flood your site with random nonsense comments and links in an attempt to raise their site’s search engine ranking or increase internet traffic to their sites:

“Nice informations for me. Your posts is been helpful. I wish to has valuable posts like yours in my blog. How do you find these posts? Check mind out [link here]”

Huh? – LOL

You may already have defensive measures in place to address some, if not all, of these things. There are programs, filters and services that you can use to look up, track and prevent this sort of activity. However, with the continuous stream of newly produced malware, those programs and services are only as good as the latest “malicious” activity they have captured. No matter what, something will eventually slip through and cause headaches for people and organizations around the globe. Being able to monitor when something is “just not right” is a great step in the right direction.

Analyze

In September of 2010, I introduced the Pentaho Evaluation Sandbox. It was designed as a tool to assist with Pentaho evaluations as well as to showcase many examples of what Pentaho can do. There have been numerous unique visitors to this site, most legitimate and some, as I soon discovered… not. Prior to the site’s launch, using Pentaho’s Reporting, Dashboard and Analysis capabilities, I created a simple Web Analytic Dashboard to highlight metrics and dimensions of the Sandbox’s internet traffic. It was a great example of Pentaho Web Analytics embedded in a hosted application. During my daily review of the Site Activity dashboard, which includes a real-time visit strip chart monitor, I noticed an unusually large spike in page views that occurred within a 1-minute time-frame.

Now that spike can be normal, provided a number of different people are surfing the site at the same time. However, it caught my attention as “unusual” because I knew what normal looked like. The dashboard quickly alerted me to something I should possibly take action on. So I clicked on the point at the peak to drill down into the page visit detail at that time. The detail report revealed that whoever (or whatever) was accessing the Sandbox was rapidly traversing the site’s page map and directories looking for holes in the system. I also noticed that all the page views came from the same IP address in under 1 minute. Hmmm, I thought. “That could be a shared IP, a person or even a bot ignoring my robots.txt rules.” But… as I scrolled down, I discovered there were attempts to access the .htaccess and passwd files that protect the site. I immediately clicked on the IP address data value in the detail report (in my admin version of the report), which linked me to an IP Address Blacklist look-up service. The Blacklist look-up informed me that the IP address had previously been reported and was listed as suspicious for malicious activity. BINGO! Goodbye, whoever you are!

Take Action

I quickly took action on my findings by banning the IP address from the system to prevent any further attempts to access the site. I then began to think of some random questions I needed to ask of the data. I switched gears and turned to Pentaho Analysis. Upon further analysis of the site’s data using Pentaho Analyzer, I was able to see evidence of IP Spoofing and even Comment Spamming coming from certain IP address ranges. The action I took next was to block the IP address ranges that had been accessing the site in this manner. In addition, I created a contact page for those who may be accessing the site legitimately but may have gotten blocked because their IP falls within those ranges.
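For those curious about the mechanics, the ban itself can be as simple as a few lines of Apache configuration. Here is a minimal sketch, assuming an Apache 2.2 host; the addresses are placeholders standing in for the offenders:

# Hypothetical .htaccess rules (Apache 2.2 syntax); addresses are placeholders
order allow,deny
# Ban the single offending address
deny from 203.0.113.45
# Ban an entire range by partial IP prefix
deny from 198.51.100.
allow from all

Requests from a banned address then receive a 403 Forbidden response instead of reaching the site.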

Wow, talk about taking action on your data huh?

It is not a question of if, but when an unwarranted attempt will occur on your systems. Make sure you take the appropriate steps to protect them by using software and services that will make you aware of problems. My experience may be an oversimplification, but it is a great example of how I used Pentaho to become aware of a problem and turn raw data into actionable information.

Special thanks to Marc Batchelor, Chief Engineer and Co-Founder of Pentaho, for helping me explore the corrective actions to take to protect the Pentaho Evaluation Sandbox.

Regards,

Michael Tarallo
Director of Enterprise Solutions
Pentaho


The Right Tool For the Right Job – Part 1

September 20, 2011

All too Common

You have questions. How do you get your answers? The methods and tools used to answer business questions vary per organization. For those without established BI solutions, desktop database query and spreadsheet tools are… all too common. And… if there is a BI tool in place, its usage and longevity depend on its capabilities, the cost to maintain it and its ease of use for both development staff and business users. Decreased BI tool adoption, due to rising costs, lack of functionality and complexity, may increase dependencies on technical resources and other home-grown solutions to get answers. IT departments have numerous responsibilities; running queries and creating reports may be ancillary, which can result in information not getting out in a timely manner, questions going unanswered and decisions being delayed. The organization then is not leveraging its BI investment for what it was originally designed to do… empower business users to create actionable information.

(Read the similar experiences of Pentaho customer Kiva.org here)

Six of One, Half a Dozen of the Other

The BI market is saturated with BI tools, from the well-known proprietary vendors to the established commercial open source leaders and niche players. There are choices that include the “Cloud,” on-premise, hosted (SaaS) and even embedded. Let’s face it and not complicate things… most, if not all, of the BI tools out there can do the same thing in some form or fashion. They are designed to access, optimize and visualize data to aid in answering questions and tracking business performance. Dashboards, Reporting and Analysis fall under a category I refer to as “Content Delivery.” These methods of delivering information are the foundation of a typical BI solution. They provide the most common means for tracking performance and identifying problems that need attention. But… did you know there is usually some sort of prep work to be done before that chart or traffic light is displayed on your screen or printed in that report? That prep work can range from simple ETL scripting to provisioning more robust Data Warehouse and Metadata Repositories.

Data Integration

Content Delivery should begin with some sort of Data Integration. In my 15 years in the BI space, I have not seen one customer or prospect challenge me on this. They all have “data” in multiple silos. They all have a “need” to access it, consolidate it, extrapolate it and make it available for analysis and reporting applications. Whether they load it into an Enterprise Data Warehouse for historical purposes or produce Operational Data Stores, they are using Data Integration. Whether they are writing code to access and move the data, using a proprietary utility or some ETL tool, they are using Data Integration. It is important to realize that not all data needs to be “optimized” out of the gate, as it is not only the data that is important; it is how it will be used in the day-to-day activities supporting the questions that will be asked. This requires careful planning and consideration of the overall objectives that the BI tools will be supporting.

Well, How do I know what tools to use? – Stay Tuned

With so many tools available, how will you know what is right for the organization? Thorough investigation of the tools through RFIs, RFPs, self-evaluation and POCs is a good start. However, make sure you are selecting tools based on their ability to solve your specific current AND future needs, and not solely because they look cool and provide the “sex and sizzle” the executives are after. The typical ask is always reporting, analysis and dashboards, yet few realize there is a lot more to it than those three little words. In the next part of this article I will cover a few of the most common “BI Profiles” found in almost every organization. In each profile I will cover the pains, symptoms and impacts that plague organizations today, as well as the solution strategies and limitations you should be aware of when looking at Pentaho.

Stay tuned!

Regards,

Michael Tarallo
Director of Enterprise Solutions
Pentaho

This blog was originally posted on http://michaeltarallo.blogspot.com/ on September 19, 2011

Facebook and Pentaho Data Integration

July 15, 2011

Social Networking Data

Recently, I have been asked about Pentaho’s product interaction with social network providers such as Twitter and Facebook. The data stored within these “social graphs” can provide their owners with critical metrics around their content. By analyzing trends in user growth and demographics, as well as the consumption and creation of content, owners and developers are better equipped to improve their business with Facebook and Twitter. Social networking data can already be viewed and analyzed using existing tools such as FB Insights, or even purchasable 3rd-party software packages created specifically for this purpose. Now… Pentaho Data Integration in its traditional sense is an ETL (Extract Transform Load) tool. It can be used to extract and extrapolate data from these services and merge or consolidate it with other relevant company data. However, it can also be used to automatically push information about a company’s product or service to the social network platforms. You see this in action today if you have ever used Facebook and “Liked” something a company had to offer: at regular intervals you will sometimes note product offers and advertisements posted to your wall from those companies. A great and cost-effective way to advertise to the masses.

Application Programming Interface

Interacting with these systems is made possible because they provide an API (Application Programming Interface). To keep it simple, a developer can write a program in “some language” to run on one machine which communicates with the social networking system on another machine. The API can leverage a 3GL such as Java or JavaScript or, even simpler, RESTful services. At times, software developers/vendors will write connectors against the native API that can be distributed and used in many software applications. These connectors can offer a quicker and easier approach than writing code alone. A future release of Pentaho Data Integration may include an out-of-the-box Facebook and/or Twitter transformation step – but until then, the RESTful APIs provided work just fine with the simple HTTP POST step. Using Pentaho Data Integration with this out-of-the-box component allows quick access to social network graph data. It can also provide the ability to push content to applications such as Facebook and Twitter without writing any code or purchasing a separate connector.
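To make the idea concrete, here is roughly what the HTTP POST step does under the covers. This is a minimal Java sketch, not Pentaho code; the page ID and OAuth access token are placeholders you would obtain from Facebook:

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

public class GraphApiPost {
    public static void main(String[] args) throws Exception {
        // Placeholders: a real object ID and a valid OAuth access token are required
        String pageId = "YOUR_PAGE_ID";
        String accessToken = "YOUR_ACCESS_TOKEN";

        // Form-encoded body, just as an HTTP POST step would send it
        String body = "message=" + URLEncoder.encode("Hello from Pentaho Data Integration!", "UTF-8")
                + "&access_token=" + URLEncoder.encode(accessToken, "UTF-8");

        URL url = new URL("https://graph.facebook.com/" + pageId + "/feed");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");

        OutputStream out = conn.getOutputStream();
        out.write(body.getBytes("UTF-8"));
        out.close();

        // The Graph API answers with a small JSON document containing the new post's ID
        System.out.println("HTTP " + conn.getResponseCode());
    }
}

In PDI itself, that same URL and those form fields are simply configured on the HTTP POST step, so no code is required.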

The Facebook Graph API

Both Facebook and Twitter provide a number of APIs; one worth mentioning is the Facebook Graph API (don’t worry, Twitter, I’ll get back to you in my next blog entry).

The Graph API is a RESTful service that returns a JSON response. Simply stated, an HTTP request can initiate a connection with the FB systems and publish or return data, which can then be parsed with a programming language – or, better yet, without programming, using Pentaho Data Integration and its JSON Input step.

Since the FB Graph API provides both data access and publish capabilities across a number of objects (photos, events, statuses, people, pages) supported in the FB Social Graph, one can leverage both automated push and pull capabilities.
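The pull side is the same idea in reverse. As a sketch (again with placeholder object ID and token), an HTTP GET against a Graph API object returns JSON that the JSON Input step, or a few lines of code, can parse:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class GraphApiGet {
    public static void main(String[] args) throws Exception {
        // Placeholder object ID and token; some public objects can be read without a token
        URL url = new URL("https://graph.facebook.com/YOUR_PAGE_ID?access_token=YOUR_ACCESS_TOKEN");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();

        // Read the JSON response body, e.g. {"id":"...","name":"...",...}
        BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream(), "UTF-8"));
        StringBuilder json = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null) {
            json.append(line);
        }
        in.close();

        System.out.println(json);
    }
}

In a Transformation, the same request becomes an HTTP step and the response flows straight into the JSON Input step mentioned above.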

If you are interested in giving this a try or seeing this in action, take a look at this tutorial available on the Pentaho Evaluation Sandbox.

Kind Regards,

Michael Tarallo
Director of Enterprise Solutions
Pentaho


Q&A with Pentaho Director of Sales Engineering, Mike Tarallo

April 14, 2011

Q&A is a series on the Business Intelligence from the Swamp Blog that interviews key members of the Pentaho team to learn more about their focus at Pentaho and outlook on the Business Intelligence industry.

You may be familiar with our interviewee today from reading his blogs on BI from the Swamp or watching a demo on the Pentaho Evaluation Sandbox. If you are a customer, there is a very good chance you have met him virtually or in person. Michael Tarallo is the Director of Sales Engineering and our feature today on Q&A. To learn more about Mike and his role at Pentaho we asked him 5 questions.

1. What brought you to Pentaho?

It was spring of 2007 and, after about 9+ years at Information Builders (IBI), a proprietary Business Intelligence company, I felt I had achieved all that I could and decided to “find new cheese”. Around the time I was promoted to Senior Sales Engineer at IBI, a recruiter approached me speaking in tongues of this “thing” called “Open Source Business Intelligence.” I remember thinking, “Sheesh, I am having a hard enough time selling this expensive proprietary stuff, how the heck am I going to sell Open Source since it’s already free?” However, after a thorough investigation, I decided to leave my comfortable position at IBI for a company that was new and exciting called Pentaho.

When I finally learned that Open Source is not about “free” software but more about community, collaboration and better software, I was immediately enlightened and couldn’t wait to get started. I distinctly remember the Sales Manager at IBI attempting to put fear in my head: “Ya know Mike, the grass isn’t always greener.” Well, sir, 4 years later, not only is the grass greener, but it is thicker and fuller than ever. Remember, “comfort is the enemy of achievement” – Dr. Farrah Gray

2. What do you do?

I am the Director of Sales Engineering, responsible for leading the Sales Engineers and pre-sales activities within the organization. I started 4 years ago as the first Pentaho Sales Consultant (SC), responsible for pre-sales activities – the technical and consulting-related activities that occur during the sales cycle, before the actual sale, hence “Pre-Sales.” Pentaho was still fairly new at the time, and either sales team members or Product Management performed many product demonstrations. As the company grew it became increasingly important to build a group that would focus support on the sales team as its pre-sales activities increased. My initial goal was to introduce pre-sales processes and create collateral and demonstrations that focused on “solutions” for business problems, rather than demonstrating a bunch of “tools”.

As the Pentaho Sales and Pre-Sales teams grew worldwide, there was a huge demand on Sales Engineer (SE) resources. With the continued growth of the company and finally a larger team of SEs – I was promoted to Director of Sales Engineering in order to lead and proliferate those processes throughout our group.

3. Why are Sales Engineers a vital part of the success of a sales cycle?

Sales Engineers are the stage performers of the IT world: immensely capable, adaptable, confident, excellent communicators who are equally cool in front of large crowds and intimate groups. Sales Engineers work closely with Pentaho Account Managers to demonstrate the breadth and depth of Pentaho products as a complete solution.

Sales Engineering provides what I call the “Solution Vision” to the prospect who is evaluating our software. We present the “Art of the Possible” by demonstrating and discussing how Pentaho software fits in the context of their landscape. At all costs we try to stay away from generic demonstrations, but sometimes you have to play the game. And when we do play, we play hard. Let’s face it; there are a lot of software packages that can “do” the same thing. Sales Engineering provides that 2% factor by establishing a relationship with the prospect and making them feel comfortable that the Pentaho solution can meet their specific needs. Without Sales Engineering, a crucial piece of the sales process would be missing.

4. What makes a good Sales Engineer (SE)?

There are a number of facets that make a good SE good. But there are key factors that make a good SE great. Some of these characteristics are learned over time, and some are just part of one’s personality and are difficult to master. Aside from technical expertise, one important quality of a great SE is communication: learning how to communicate not only with the prospect but also with the account rep and other members of the SE team.

Every person, whether prospect or rep, is different. The key is being able to listen to what is being said and also what is not being said. This is important so the appropriate persuasive questions can be asked and proper expectations can be set.

Once communication skills are honed, an SE should be able to translate the product’s technical capabilities into suitable business value for the prospect. There is nothing worse than demonstrating software to a company that has no idea why you just showed it to them.

I feel that an SE also needs to establish themselves as a leader in their domain. If your technical strength lies in some sort of application development, you may want to focus your talents in the OEM group – where it is more about embedding and integrating the software into the prospect’s applications. If you are better able to articulate business value or provide subject matter expertise, you can specialize in creating vertical demonstrations and collateral relevant to the sectors you are working with.

Finally….energy, lots of energy, enough said. :-)

5. What does the “Pre” in Pre-Sales stand for?

Doing a demonstration without knowing anything is called a “show up and throw up.” Hoping you know what a prospect wants is not good enough. It is critical to know their exact needs and how to best demonstrate our capabilities. I prefer to think of the “Pre” to stand for: P-prepare, R-respond, E-execute. Those actions will make for shorter sales cycles, proper customer expectations and increased sales.

Want to see what Mike is up to now?

Do you have additional questions for Mike? Is there someone or a certain role at Pentaho you would like us to interview? Leave your questions in the comments section below. We’d love to hear from you.


High availability and scalability with Pentaho Data Integration

March 31, 2011

“Experts often possess more data than judgment.” – Colin Powell… hmmm, those experts surely are not using a scalable Business Intelligence solution to optimize that data, which could help them make better decisions. :-)

Data is everywhere! The amount of data being collected by organizations today is experiencing explosive growth. In general, ETL (Extract Transform Load) tools have been designed to move, cleanse, integrate, normalize and enrich raw data to make it meaningful and available for knowledge workers and decision support systems. Only once data has been “optimized” can it be turned into “actionable” information using the appropriate business applications or Business Intelligence software. This information could then be used to discover how to increase profits, reduce costs or even write a program that suggests what your next movie on Netflix should be. The capability to pre-process this raw data before making it available to the masses becomes increasingly vital to organizations that must collect, merge and create a centralized repository containing “one version of the truth.” Having an ETL solution that is always available, extensible and highly scalable is an integral part of processing this data.

Pentaho Data Integration

Pentaho Data Integration (PDI) can provide such a solution for many varying ETL needs. Built upon an open Java framework, PDI uses a metadata-driven design approach that eliminates the need to write, compile or maintain code. It provides an intuitive design interface with a rich library of prepackaged, pluggable design components. ETL developers with skill sets ranging from novice to Data Warehouse expert can immediately take advantage of the robust capabilities available within PDI with little to no training.

The PDI Component Stack

Creating a highly available and scalable solution with Pentaho Data Integration begins with understanding the PDI component stack.

● Spoon – the IDE for creating Jobs and Transformations, including the semantic layer for the BI platform
● Pan – command-line tool for executing Transformations modeled in Spoon
● Kitchen – command-line tool for executing Jobs modeled in Spoon
● Carte – lightweight ETL server for remote execution
● Enterprise Data Integration Server – remote execution, version control repository, enterprise security
● Java API – write your own plug-ins or integrate PDI into your own applications

Spoon is used to create the ETL design flow in the form of a Job or Transformation on a developer’s workstation. A Job coordinates and orchestrates the ETL process with components that control file movement, scripting, conditional flow logic and notification, as well as the execution of other Jobs and Transformations. The Transformation is responsible for the extraction, transformation and loading or movement of the data. The flow is then published or scheduled to the Carte or Data Integration Server for remote execution. Kitchen and Pan can be used to call PDI Jobs and Transformations from your external command-line shell scripts or 3rd-party programs. There is also a complete Java SDK available to integrate and embed these processes into your Java applications.
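To give a flavor of that Java SDK, here is a minimal sketch of embedding a Transformation in your own application, assuming the Kettle API of the PDI 4.x era; the .ktr path is a placeholder:

import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;

public class RunTransformation {
    public static void main(String[] args) throws Exception {
        // Initialize the Kettle environment (plug-in registry, logging, etc.)
        KettleEnvironment.init();

        // Load the Transformation metadata from a .ktr file created in Spoon (placeholder path)
        TransMeta transMeta = new TransMeta("/path/to/my_transformation.ktr");

        // Execute it and wait for completion
        Trans trans = new Trans(transMeta);
        trans.execute(null);
        trans.waitUntilFinished();

        if (trans.getErrors() > 0) {
            System.err.println("Transformation finished with errors.");
        }
    }
}

Jobs follow the same pattern with the JobMeta and Job classes.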

Figure 1: Sample Transformation that performs some data quality and exception checks before loading the cleansed data

PDI Remote Execution and Clusters

The core of a scalable and available PDI ETL solution involves the use of multiple Carte or Data Integration servers defined as “Slaves” in the ETL process. The remote Carte servers are started on different systems in the network infrastructure and listen for further instructions. Within the PDI process, a Cluster Scheme can be defined with one Master and multiple Slave nodes, and used to distribute the ETL workload in parallel across these multiple systems. It is also possible to define Dynamic Clusters, where the Slave servers are only known at run-time. This is very useful in cloud computing scenarios where hosts are added or removed at will. More information on this topic, including load statistics, can be found in an independent consulting white paper created by Nick Goodman of Bayon Technologies, “Scaling Out Large Data Volume Processing in the Cloud or on Premise.”

Figure 2: Cx2 means these steps are executed clustered on two Slave servers; all other steps are executed on the Master server

The Concept of High Availability, Recoverability and Scalability

Building a highly available, scalable, recoverable solution with Pentaho Data Integration can involve a number of different parts, concepts and people. It is not a check box that you simply toggle when you want to enable or disable it. It involves careful design and planning to prepare for and anticipate the events that may occur during an ETL process. Did the RDBMS go down? Did the Slave node die? Did I lose network connectivity during the load? Was there a data truncation error at the database? How much data will be processed at peak times? The list can go on and on. Fortunately, PDI arms you with a variety of components, including complete ETL metric logging, web services and dynamic variables, that can be used to build recoverability, availability and scalability scenarios into your PDI ETL solution.

For example, Jens Bleuel, Managing Consultant in EMEA, developed a PDI implementation of the popular Watchdog concept – a solution that, while executing its tasks and events, runs checks to monitor whether everything is on track. Visit the link above for more information on this implementation.

Putting it all together – (Sample)

Diethard Steiner, an active Pentaho Community member and contributor, has written an excellent tutorial that explains how to set up PDI ETL remote execution using the Carte server. He also provides a complete tutorial (including sample files provided by Matt Casters, Chief Architect and founder of Kettle) on setting up a simple “available” solution to process files using Pentaho Data Integration. You can get it here. Please note that advanced topics such as this are also covered in greater detail in our training course, available here (designed by our Managing Consultant in EMEA, Jens Bleuel).

Summary

When attempting to process the vast amounts of data collected on a daily basis, it is critical to have a Data Integration solution that is not only easy to use but easily extendable. Pentaho Data Integration achieves this extensibility with its open architecture, component stack and object library, which can be used to build a scalable and highly available ETL solution without exhaustive training and with no code to write, compile or maintain.

Happy ETLing.

Regards,

Michael Tarallo
Senior Director of Sales Engineering
Pentaho

This blog was originally published on the Pentaho Evaluation Sandbox. A comprehensive resource for evaluating and testing Pentaho BI.


“Drilling” in to the detail with Pentaho

November 30, 2010

“Drilling”…..(with respect to Business Intelligence applications and Information Technology). Where did that word come from? What does it mean? What can it mean? I am sure you have heard the phrase “Drill down to detail” before, but you may have also heard “Drill Up”, “Drill Out”, “Drill Across”, “Drill In” and “Drill Through” and don’t forget “Drill Anywhere”.

In general, it means to simply move from summary level information to underlying detail data, either within its current data set or even outside to another data set. Its main purpose is to allow one to easily view summarized information in the form of a chart, table or some graphical visualization with the added ability to “click” on a value, series or region and “drill in” to the next level of detail or out to some other dimension. “Drilling” allows business users to make informed decisions quickly without having to page through sheets of raw data.

For example, summarized sales revenue for the year 2010 is $200K, but upon drilling down we see that $175K was brought in by 3 out of 4 regions, leaving 1 region with only $25K. This exposes a single region as an outlier, or an entity that needs focused attention. The power of Business Intelligence applications at work: turning raw data into actionable information.

The Pentaho BI Suite can provide “Drilling” in a number of ways depending on which module you deploy. We explore each of these in the full article… read more at the Pentaho Evaluation Sandbox:

http://sandbox.pentaho.com/2010/11/drilling-in-to-the-details-with-pentaho/

Regards,

Michael Tarallo
Director of Sales Engineering
Pentaho


The Pentaho Pre-Sales Sandbox

October 7, 2010

Quick Bit About Pentaho

Pentaho has been all about building and delivering a scalable, complete end-to-end BI Suite from day one – from making the software “possible” during those humble beginnings, to making it “available”, and now “easy”. You will find that Pentaho has an extensive offering that is both flexible and intuitive. Pentaho software has been positioned as the Commercial Open Source BI alternative by System Integrators, Consultancies, Enterprises, OEMs and SMBs. It has been deployed worldwide in a variety of industries, supporting mission-critical applications that encompass both data integration and information delivery, all provided by “one” vendor, Pentaho.

The Pentaho Pre-Sales Sandbox

The Pre-Sales Sandbox is a resource designed to streamline the evaluation and selection process. With this resource at hand, you will be able to make an informed decision as quickly and efficiently as possible. Examples, tutorials and consolidated information are staged here to view and download to assist you in your evaluation. Please be aware that the collateral available on this site can be posted on a moment’s notice and is intended to serve the masses as quickly as possible. Therefore, the content downloaded may be a work in progress, a draft, incomplete or limited in detail. Please see our website at http://www.pentaho.com for more information about our products and value-added services.

Who should use this resource

This resource is primarily for those who are actively evaluating the Pentaho BI suite. It is not a replacement for our FREE evaluation support offering or the Pentaho Knowledge Base. It has been designed for those who intend to explore the power, flexibility and extensibility of the software: people who have a basic understanding of Business Intelligence applications, including information delivery and data integration; who are familiar with the concepts of accessing data sources and creating and publishing reports; who understand the fundamentals of ETL (Extract Transform Load); who can relate to terminology such as metric, measure and dimension; and who have a general understanding of data modeling. Professional documentation for the Pentaho BI suite, including the Administration and Security guides, is located in the Knowledge Base.

If you are evaluating Pentaho, or are simply a bit curious, come check out the Pentaho Pre-Sales Sandbox here: http://sandbox.pentaho.com/

I hope to speak with you in person.

Michael Tarallo
Pre-Sales Director
Pentaho

(Originally posted on Michael Tarallo’s Blog: http://michaeltarallo.blogspot.com/2010/10/pentaho-pre-sales-sandbox.html)


Part 4: Using the Report Design Wizard

September 17, 2010

Ahhhh, the Wizard. As defined by Wikipedia (you know, if it is on Wikipedia, it must be true)… “A magician, sorcerer, witch, or person known under one of many terms who practices magic derived from supernatural sources.” Well then, why software developers name the “easy to use” part of the product a “Wizard,” I just have no idea. Just call it the “Assistant” or the “Guide me thingamajig” – just don’t call it a Wizard unless it is going to read my mind and magically create my report for me!

I digress… in Part 4 of this series we will cover the use of the Pentaho Report Designer and the integrated Report Design… ahem… [coughs] Wizard. The 10-minute video below will show you how easy it is to get started creating and publishing Operational and Enterprise-type reports. This video only scratches the surface of what is possible with the Pentaho Report Designer; for a more in-depth look, visit the Pentaho Pre-Sales Tools page, where a Comprehensive Report Designer Tutorial demonstrates the other features and functions available in this design tool. Stay tuned to this blog for Part 5, where I magically chase down a Leprechaun and force him to conjure me up some Dashboards. Until then.

Click here to view all the tutorials in this series.

Michael Tarallo
Pre-Sales Director
Pentaho


Top 3 questions asked of Pentaho Pre-Sales

August 16, 2010

Can you…? How does…? Where do…? What is…?

Sound familiar?

Pre-Sales Engineers are the stage performers of the IT world: immensely capable, adaptable, confident, excellent communicators who are equally cool in front of large crowds and intimate groups. If you are in a Pre-Sales role, you have the absolute pleasure of showcasing what your product / service / solution can offer to help prospects make an informed decision. Pentaho Pre-Sales Engineers have numerous discussions every day with qualified prospects that want to evaluate Pentaho and prove that the software can satisfy their goals. Most prospects are looking for a solution that will help them reduce costs, increase profits and make their businesses function efficiently. Part of my job is to identify a “fit” for these objectives and recommend an approach to take when evaluating. During these discussions I am asked many questions that have definitive answers, and some questions that have more than one answer. I thought it would be helpful to share the 3 most common questions asked of my team and me, along with our responses.

Question 1: What is the most common reason an evaluation or implementation will not be successful?

That answer is quite simple. So simple, in fact, that I wrote a blog entry about it in March 2009. You can view it here. However, I will summarize it for you in one word.

Answer: People

The people involved in the evaluation process of anything are the leading cause of its success or failure. There are those who are self-sufficient, “read” the provided materials and have the knowledge and expertise to get the job done with little assistance… then there are those who expect everything to be done for them. I elaborate more on this in the aforementioned blog entry. However, if you need a lot of hand-holding, please be prepared to present clearly identified evaluation criteria to those you will be working with.

Question 2: What skill sets are required to use the Pentaho software?

Answer: I will first state that Java engineering skills are not necessary. You do not have to be a Java developer to use Pentaho. This stigma can attach itself not just to Pentaho but to other Java-developed platforms. Someone sees the word “Java” somewhere in the Wiki or on the Web and automatically assumes they need to know Java. This is not true. Pentaho is built on modern open standards whose implementations are written in Java. All this should mean to you is that Pentaho provides a cost-effective alternative that can run almost anywhere Java is supported.

Primarily, Pentaho training is the #1 skill set needed to really exploit all of the power that the software provides through the GUI design suite. Just like learning anything new, you should attempt to pick up the manual; I certainly wouldn’t start driving a car without learning how to operate it first. Other skills that can help are a general understanding of BI technologies and terminology: terms such as KPI (Key Performance Indicator), RDBMS (Relational Database Management System), metrics, dimensions, operational reporting, analytics, dashboards, and even a little acronym known as ETL (Extract Transform and Load). Furthermore, if you want to dive deeper and start embedding, integrating and enhancing the applications, having a Java background or even a simple web development skill set is always a plus.

Question 3: How scalable / performant is your solution?

Answer: I love this one. What answer do you think I am going to give? Simply stated, your mileage is going to vary, and it greatly depends on which products you are using. Are you using the software for data integration, content delivery, or perhaps both? I don’t care if you have the beefiest machine on the block or are paying for the largest cloud; your scalability and performance are going to depend on numerous factors. Prospects are usually looking for exact sizing statistics. They want to have some predetermined knowledge of application performance, as well as what hardware and software infrastructure they may need to provision. Fortunately, Pentaho has produced the Pentaho Linear Scalability White Paper for the BI platform, and an independent consultancy has created the PDI Scale-out White Paper for Pentaho Data Integration. These documents can give additional insight into setting those expectations. Pentaho Services can also conduct capacity planning sessions with prospects to give an estimate of what can be expected. In addition to the white papers, please note that there are a number of real-world scalability customer success stories you can read about on our website.

Thanks for your time, I look forward to answering your questions in person.

Michael Tarallo
Pre-Sales Director
Pentaho


Part 3: Easily prototyping your data

July 29, 2010


Prototyping as defined in information systems is:

The process of building an example or model of the actual proposed system. It is an iterative process that is part of the analysis phase of the development life cycle.

In Part 3 of this tutorial series I will show you how you can easily “prototype” your data within the Pentaho User Console by using the Dashboard Designer and its Data Access component. Part 3 picks up after we have configured our server-side Pentaho Data Connections to connect to Oracle and MS SQL Server. In this tutorial I will also cover the Pentaho BI Server start-up procedures on both the Linux and Windows operating systems.

You are able to download the video or view it in high quality directly from this blog. Click ‘share’ in the video window to see the available options.

Mike Tarallo
Pre-Sales Director
Pentaho

