Using Pentaho to Be Aware, Analyze, & Take Action

Be Aware

Denial of Service (DoS) attacks, IP Spoofing, Comment Spamming and Malware are malicious activities designed to disrupt services used by many people and organizations. If you use the internet to run your business, create awareness of a product or service, or simply keep in touch with friends and family, your systems are at risk of becoming a target.

Successful internet “intrusions” can cost you money and even lead to the theft of your identity. DoS attacks can prevent internet sites from running efficiently and, in severe cases, can take them down. IP Spoofing, frequently used in DoS attacks, is a means to “forge” the source IP address so that the internet request or “attack” appears to come from some other machine or location. And Comment Spamming, oh brother… that is where programs or people flood your site with random nonsense comments and links in an attempt to raise their site’s search engine ranking or increase internet traffic to their sites:

“Nice informations for me. Your posts is been helpful. I wish to has valuable posts like yours in my blog. How do you find these posts? Check mind out [link here]”

Huh? – LOL

You may already have defensive measures in place to address some, if not all, of these things. There are programs, filters and services that you can use to look up, track and prevent this sort of activity. However, with the continuous stream of unique and newly produced malware, those programs and services are only as good as the latest “malicious” activity they have captured. Sooner or later, something new will slip through and cause headaches for people and organizations around the globe. Being able to notice when something is “just not right” is a great step in the right direction.

Analyze

In September of 2010, I introduced the Pentaho Evaluation Sandbox. It was designed as a tool to assist with Pentaho evaluations as well as to showcase many examples of what Pentaho can do. There have been numerous unique visitors to this site, some legitimate and some, as I soon discovered… not. Prior to the site’s launch, using Pentaho’s Reporting, Dashboard and Analysis capabilities, I created a simple Web Analytics Dashboard that highlights metrics and dimensions of the Sandbox’s internet traffic. It was a great way to demonstrate Pentaho Web Analytics embedded in a hosted application. During my daily review of the Site Activity dashboard, which includes a real-time visit strip chart monitor, I noticed an unusually large spike in page views that occurred within a 1-minute time frame.

Now, a spike like that can be normal, provided a number of different people are surfing the site at the same time. However, it caught my attention as “unusual” compared with what I knew was normal. The dashboard quickly alerted me to something I should possibly take action on. So I clicked on the point at the peak to drill down into the page visit detail for that time. The detail report revealed that whoever or whatever was accessing the Sandbox was rapidly traversing the site’s page map and directories, looking for holes in the system. I also noticed that all the page views came from the same IP address in under 1 minute. Hmmm, I thought. “That could be a shared IP, a person or even a bot ignoring my robots.txt rules.” But as I scrolled down, I discovered there were also attempts to access the .htaccess and passwd files that protect the site. I immediately clicked on the IP address data value in the detail report (in my admin version of the report), which linked me to an IP Address Blacklist look-up service. The Blacklist look-up informed me that the IP address had been previously reported and was listed as suspicious for malicious activity. BINGO! Goodbye, whoever you are!
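For readers who want to try the same kind of check outside a dashboard, here is a minimal sketch in Python of the two signals described above: many page views from one IP address inside a single minute, and requests for protected files such as .htaccess or passwd. It is not how the Sandbox dashboard is implemented. The function name, the (timestamp, ip, path) record layout and the threshold of 30 hits per minute are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical markers for protected files; adjust to your own site.
SENSITIVE_MARKERS = (".htaccess", "passwd")


def find_suspicious_ips(hits, max_hits_per_minute=30):
    """Return {ip: set of reasons} for IPs that look suspicious.

    `hits` is an iterable of (timestamp, ip, path) tuples. This layout is an
    assumption for the sketch, not the Sandbox's actual log schema.
    """
    per_minute = defaultdict(int)
    reasons = defaultdict(set)
    for ts, ip, path in hits:
        # Bucket each hit by IP and by the calendar minute it falls in.
        minute = ts.replace(second=0, microsecond=0)
        per_minute[(ip, minute)] += 1
        if per_minute[(ip, minute)] > max_hits_per_minute:
            reasons[ip].add("burst")
        # Flag any request that touches a protected file.
        if any(marker in path for marker in SENSITIVE_MARKERS):
            reasons[ip].add("sensitive-path")
    return dict(reasons)


# Made-up sample data using documentation-only IP addresses.
start = datetime(2011, 1, 1, 12, 0, 0)
sample_hits = [
    (start + timedelta(seconds=i), "203.0.113.9", f"/page{i}") for i in range(40)
]
sample_hits.append((start, "203.0.113.9", "/.htaccess"))
sample_hits.append((start, "198.51.100.7", "/index.html"))

print(find_suspicious_ips(sample_hits))
```

A fixed threshold is the crudest possible rule. In practice you would compare each minute against what is normal for your own site, which is exactly what eyeballing the strip chart did for me.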

Take Action

I quickly took action on my findings by banning the IP address from the system to prevent any further attempts to access the site. I then began to think of other questions I needed to ask of the data, so I switched gears and turned to Pentaho Analysis. Exploring the site’s data further with a Pentaho Analyzer report, I was able to see evidence of IP Spoofing and even Comment Spamming coming from certain IP address ranges. The action I took next was to block the IP address ranges that had been accessing the site in this manner. In addition, I created a contact page for those who may be accessing the site legitimately but may have been blocked because their IP falls in one of those ranges.
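To make the idea of blocking an address range concrete, here is a small sketch using Python’s standard ipaddress module. The ranges below are made-up documentation addresses, not the ones I actually blocked, and is_blocked is a hypothetical helper name. On a real site the block itself would normally live in the web server or firewall configuration; this only shows the membership test.

```python
import ipaddress

# Hypothetical blocked ranges in CIDR notation (documentation-only addresses).
BLOCKED_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),    # 203.0.113.0 - 203.0.113.255
    ipaddress.ip_network("198.51.100.64/26"),  # 198.51.100.64 - 198.51.100.127
]


def is_blocked(ip, blocked=BLOCKED_RANGES):
    """Return True if the given IP address string falls in any blocked range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in blocked)


print(is_blocked("203.0.113.9"))    # inside the first range
print(is_blocked("198.51.100.10"))  # outside both ranges
```

Blocking by range is a blunt instrument, since everyone whose address falls inside the range is locked out, which is why the contact page matters.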

Wow, talk about taking action on your data huh?

It is not a question of if, but when, an unauthorized attempt will be made on your systems. Make sure you take the appropriate steps to protect them by using software and services that will make you aware of problems. My experience may be an oversimplification, but it is a great example of how I used Pentaho to become aware of a problem, then take the raw data and turn it into actionable information.

Special thanks to Marc Batchelor, Chief Engineer and Co-Founder of Pentaho, for helping me explore the corrective actions to take to protect the Pentaho Evaluation Sandbox.

Regards,

Michael Tarallo
Director of Enterprise Solutions
Pentaho
