March 20, 2012
Pentaho’s Matt Casters, Chief Architect of Pentaho Data Integration and founder of the Kettle project, was featured last week on DM Radio in a broadcast titled “On the Move: Why ETL is Here to Stay.”
Listen to Matt’s interview with hosts Eric Kavanagh and Jim Ericson, along with panelists Nimitt Desai of Deloitte, Geoff Malafsky of Phasic Systems and Josh Rogers of Syncsort.
Starting at 13:33, listen to Matt talk about:
- How big data and ETL intersect and what that means
- Points to keep in mind when starting to work with data and move it in and out of Hadoop
- How to keep track of changing technologies and architectures
- Why it’s important not to do data integration just for data integration’s sake
- Why there’s a lack of best practices
- What Matt’s seeing: a need for a high level of metadata and modeled ETL generation
Access both Matt’s segment and the full podcast here: http://www.information-management.com/dmradio//-10022068-1.html
July 15, 2011
Social Networking Data
Recently, I have been asked how Pentaho’s products interact with social network providers such as Twitter and Facebook. The data stored within these “social graphs” can provide its owners with critical metrics around their content. By analyzing trends in user growth and demographics, as well as the consumption and creation of content, owners and developers are better equipped to improve their business with Facebook and Twitter. Social networking data can already be viewed and analyzed with existing tools such as FB Insights, or with third-party software packages built specifically for this purpose.
Pentaho Data Integration, in its traditional sense, is an ETL (Extract, Transform, Load) tool. It can be used to extract and extrapolate data from these services and merge or consolidate it with other relevant company data. However, it can also be used to automatically push information about a company’s product or service to the social network platforms. You have seen this in action if you have ever used Facebook and “Liked” something a company had to offer: from time to time, unsolicited product offers and advertisements from that company are posted to your wall. It is a great, cost-effective way to advertise to the masses.
Application Programming Interface
The Facebook Graph API
Both Facebook and Twitter provide a number of APIs. One worth mentioning is the Facebook Graph API (don’t worry Twitter, I’ll get back to you in my next blog entry).
The Graph API is a RESTful service that returns a JSON response. Simply stated, an HTTP request initiates a connection with the FB systems and publishes or returns data. That data can then be parsed with a programming language or, better yet, without programming, using Pentaho Data Integration and its JSON input step.
Since the FB Graph API provides both data access and publish capabilities across a number of objects (photos, events, statuses, people, pages) supported in the FB social graph, one can leverage both automated push and pull capabilities.
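For readers who would rather see the request and response shape in code before wiring it into a transformation, here is a minimal Python sketch of the pull and push ideas described above. It is an illustration under stated assumptions, not the tutorial's method: the object ID, the field names, the sample JSON body and the access token are all made up, the helper function names are hypothetical, and nothing is actually sent over the network. Check Facebook's current Graph API documentation for the exact endpoints, fields and permissions before using anything like this.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

GRAPH_BASE = "https://graph.facebook.com"


def build_graph_url(object_id, fields=None, access_token=None):
    """Build a GET URL for one Graph API object (the pull side)."""
    params = {}
    if fields:
        params["fields"] = ",".join(fields)
    if access_token:
        params["access_token"] = access_token
    query = urlencode(params)
    return f"{GRAPH_BASE}/{object_id}" + (f"?{query}" if query else "")


def parse_graph_object(body, wanted):
    """Parse a JSON response body and keep only the wanted keys,
    roughly what a JSON input step does when mapping paths to fields."""
    record = json.loads(body)
    return {key: record.get(key) for key in wanted}


def build_publish_request(object_id, message, access_token):
    """Prepare (but do not send) a POST to an object's feed (the push side)."""
    data = urlencode({"message": message, "access_token": access_token})
    return Request(f"{GRAPH_BASE}/{object_id}/feed",
                   data=data.encode("utf-8"), method="POST")


# Made-up response body standing in for what the service would return.
SAMPLE_BODY = '{"id": "12345", "name": "Example Page", "likes": 42}'

url = build_graph_url("12345", fields=["id", "name", "likes"])
row = parse_graph_object(SAMPLE_BODY, ["id", "name", "likes"])
req = build_publish_request("12345", "New product available!", "HYPOTHETICAL_TOKEN")

print(url)
print(row)
print(req.get_method(), req.full_url)
```

To run it against the live service, the prepared URL or request would be passed to urllib.request.urlopen with a valid access token; the parsed row could then be merged with other company data in the same way the JSON input step feeds downstream steps.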
If you are interested in giving this a try or seeing this in action, take a look at this tutorial available on the Pentaho Evaluation Sandbox.
Director of Enterprise Solutions