Amazon preps Data Pipeline service to automate big data workflows

AWS re:Invent

Amazon’s newly launched Data Pipeline will help Amazon Web Services customers get a better grip on data scattered across the various AWS data repositories and third-party databases, Amazon CTO Werner Vogels said Thursday.

The tool will make it easy for AWS customers to create automated, scheduled workflows that move data — from DynamoDB and S3 storage to Elastic MapReduce — wherever it’s needed. “It’s pre-integrated with AWS data sources and easily connected to third-party and on-premise data sources,” Vogels said.

The proliferation of data — machine logs, sensor data and plain old database data — is driving the need for automating the flow of that data from databases to storage to applications and back. “You have to put everything in logs which creates even more data … in AWS,” Vogels said.

Users build their workflows with a drag-and-drop interface and schedule them to run periodically. By making it easy to consolidate data in one place, customers will be better able to run big batch analytics on their logs and other information.
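Amazon hasn’t published the definition format yet, but a scheduled workflow of the kind described — logs landing in S3, processed periodically by an Elastic MapReduce job — could be sketched as a set of JSON-style objects. Every type name, field, and bucket below is an assumption for illustration, not the announced API:

```python
# Hypothetical sketch of a Data Pipeline definition, expressed as the
# JSON-style objects AWS services typically use. All component types
# and field names here are assumptions, not a documented schema.
import json

# A schedule object: run the workflow once a day (illustrative fields).
schedule = {
    "id": "DailySchedule",
    "type": "Schedule",
    "period": "1 day",
}

# An input data node pointing at logs in S3 (bucket name is made up).
s3_input = {
    "id": "S3LogInput",
    "type": "S3DataNode",
    "directoryPath": "s3://example-bucket/logs/",
}

# An activity that runs an Elastic MapReduce job over that input,
# tied to the schedule defined above by id.
emr_activity = {
    "id": "DailyLogAnalysis",
    "type": "EmrActivity",
    "input": "S3LogInput",
    "schedule": "DailySchedule",
}

pipeline_definition = {"objects": [schedule, s3_input, emr_activity]}
print(json.dumps(pipeline_definition, indent=2))
```

The drag-and-drop editor would presumably generate a definition of roughly this shape behind the scenes, with each box in the UI mapping to one object and the arrows to the `input`/`schedule` references.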

There weren’t many more details than that, but given AWS’s track record, the service should be available soon. Stay tuned for updates.

