Refining raw data into something cleaner and ready for rigorous analysis is arguably more important in the big data world than at any time since the dawn of the data warehouse. With big data refinement happening in place rather than through an extract, transform and load (ETL) process, rigid predefined data modeling is giving way to more agile data transformation. But the fact remains that uncleansed data leads to a poor analytics experience. Maybe that’s why, by most estimates, still only 20 to 25 percent of the potential audience uses analytics to make decisions.
It doesn’t have to be that way. But progressing from this status quo requires that data transformation tools undergo the same kind of self-service revolution that BI tools have. Data transformation can’t be just for development specialists any more than analytics tools can be built exclusively for data scientists. Otherwise big data and Hadoop will suffer from the same bottleneck of manual effort that today’s ETL tools have imposed on BI.
New data transformation tools that fit this self-service profile are now available, and they serve the data engineer, data scientist and business analyst alike.
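To make the bottleneck concrete, here is a minimal sketch (in Python, with hypothetical records and field names) of the kind of cleansing and de-duping work that has traditionally required hand-written code, and that self-service tools aim to take off specialists’ plates:

```python
# Hypothetical raw records: the same company appears twice with
# inconsistent whitespace and casing, a classic cleansing problem.
raw = [
    {"name": "Acme Corp ", "state": "ca"},
    {"name": "acme corp", "state": "CA"},
    {"name": "Globex", "state": "NY"},
]

def clean(record):
    # Normalize each field so duplicates become comparable:
    # trim whitespace, lowercase names, uppercase state codes.
    return {
        "name": record["name"].strip().lower(),
        "state": record["state"].strip().upper(),
    }

# De-dupe by tracking the normalized records already seen.
seen, deduped = set(), []
for record in map(clean, raw):
    key = (record["name"], record["state"])
    if key not in seen:
        seen.add(key)
        deduped.append(record)

print(deduped)  # the two Acme variants collapse into one record
```

Trivial at this scale, but multiplied across hundreds of fields and millions of rows, this is exactly the manual effort that keeps data preparation in the hands of development specialists.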
In this webinar, our panel will discuss these topics:
- ETL’s premise and why it doesn’t fit big data
- How ETL in the BI world differs from data transformation in the big data world
- Can restructuring, cleansing and de-duping be fun?
- How can analytics be applied to data transformation (not just vice versa)?
- Data transformation and unstructured data
Our panelists:

- Andrew J. Brust, research director, Gigaom Research
- William McKnight, founder and president, McKnight Consulting Group
- Richard Winter, president, WinterCorp
- Joe Hellerstein, co-founder and CEO, Trifacta
Register here to join Gigaom Research and our sponsor Trifacta for “Data transformation: why it’s critical for agile analysis,” a free analyst roundtable webinar on Thursday, May 22, 2014, at 10 a.m. PT.