FedEx CIO shares his thoughts on the architecture required for “epic data”

Shipping powerhouse FedEx (s fdx) has been generating big data for years, but now it’s prepping for the future. By attaching digital information in the form of sensors inside its packages, FedEx thinks it can bring together the digital and physical worlds to expand its customer service and its business.

FedEx CIO Robert Carter, who spoke at the IT Expo in Austin on Thursday, explained that this act of attaching bits to real-world atoms creates opportunities galore.

“The information we apply to the physical world creates an incredible opportunity for us,” Carter said. “When we apply more bits to the atoms, we create more opportunity for interactions, more opportunities to do business, and opportunities to change how you see the world.” In a later conversation with a few reporters, he explained how this lofty vision affects IT, and how changes in IT bring about this vision.

Paving the way for “epic data”

FedEx’s home page in 1994

FedEx has a history of embracing technology that gets it closer to its customers. Its first web site was put in place in 1994 and was just a basic HTML page that asked customers to enter their tracking number, then passed that number back to a mainframe. The mainframe figured out where the package was and shot the info back to the customer. Today, it offers a native app on all platforms that’s essentially a superficial “skin” that talks back to myriad FedEx services to give customers the tools and information they need.

As FedEx has grown, the backend infrastructure to support the organization has adapted accordingly, with FedEx using gear from Teradata and Greenplum (s emc) to handle today’s data warehousing and analytics. Carter didn’t say how much data the company generates a day, but noted that it has exabytes and exabytes of data that it generates from the 9 million shipments it averages daily. His so-called “epic data” is then kept and stored indefinitely.

And as FedEx adds its SenseAware platform, it is adding real-time data and notifications to its infrastructure at a more granular level. That platform, which launched in 2009, contains a variety of sensors and radio chips that allow it to detect temperature, location and light, and to report back if a package (and its contents) hits a problem. The SenseAware device, which is roughly 6 inches by 6 inches, gets dropped into high-value packages like diamonds or human organs and can proactively monitor the package for 96 hours and then alert recipients and senders if something endangers or waylays the package.
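To make the monitoring idea concrete, here is a minimal sketch of the kind of check such a device or its backend might run on each reading. It is an illustration only, not FedEx’s implementation: the names `Reading`, `Limits` and `check_reading` are hypothetical, the threshold values are invented, and treating bright light as a sign that a package was opened is an assumption. Only the 96-hour monitoring window comes from the description above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Reading:
    """One sensor sample (hypothetical structure, not the SenseAware format)."""
    hours_since_start: float
    temperature_c: float
    light_lux: float


@dataclass
class Limits:
    """Acceptable ranges for a shipment; all values are illustrative."""
    min_temp_c: float
    max_temp_c: float
    max_light_lux: float      # assumption: light above this means the box was opened
    max_hours: float = 96.0   # monitoring window mentioned in the article


def check_reading(reading: Reading, limits: Limits) -> List[str]:
    """Return a list of alert messages for a single reading."""
    # Outside the monitoring window the device is assumed to be inactive.
    if reading.hours_since_start > limits.max_hours:
        return []

    alerts: List[str] = []
    if reading.temperature_c < limits.min_temp_c:
        alerts.append("temperature too low")
    elif reading.temperature_c > limits.max_temp_c:
        alerts.append("temperature too high")
    if reading.light_lux > limits.max_light_lux:
        alerts.append("light exposure")
    return alerts
```

A shipment of, say, refrigerated medical material might use `Limits(min_temp_c=2.0, max_temp_c=8.0, max_light_lux=50.0)`, with any non-empty result triggering a notification to sender and recipient.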

Because the SenseAware device has a radio, it is constantly broadcasting information back to FedEx, and can generate a lot of data that must be acted on in real time. Without the right infrastructure, that might be overwhelming. Still, Carter notes that the bulk of FedEx’s exabytes of data are structured and sent from different services to the same message bus where decisions or further analytics can happen.
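The message-bus pattern Carter describes can be sketched in a few lines. This is a toy, in-process version meant only to show the shape of the idea: many services publish structured messages to one bus, and any number of consumers (alerting, analytics, archiving) subscribe to the topics they care about. The `MessageBus` class and the topic names used with it are hypothetical; a production system would presumably use a distributed messaging product, which the article does not name.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

# A handler receives one structured message (a plain dict in this sketch).
Handler = Callable[[Dict[str, Any]], None]


class MessageBus:
    """Minimal in-process publish/subscribe bus (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler to be called for every message on a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Dict[str, Any]) -> int:
        """Deliver a message to all subscribers; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)
```

The appeal of the design is that a new data source, such as a sensor feed, only needs to publish to the bus; the services that make decisions or run further analytics do not have to know where the message came from.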

In the future, databases are for archival purposes

“There is so much coming online that allows us to look at large data sets. From the technical mindset, what’s happening fundamentally is a shift from the reason databases even existed,” Carter explained. Databases were built because memory was precious and operations and IT staff had to allocate when and what made it into memory at any given time. But in modern data centers and computing architectures there’s plenty of addressable memory and cabinets of non-volatile flash memory available for applications. “Databases will become archival rather than a system of record,” Carter said.

Carter compares it to having a file cabinet versus having a more brain-like process with a matrix of information the computer can harness. But FedEx has some advantages in building that matrix. For example, much of its data comes from pre-determined processes such as existing routes or metrics affecting its business and, thus, is mostly structured. Carter says that only a few elements of data, such as the monitoring of Facebook and Twitter to talk to customers, are relatively unstructured.

He anticipates the future of his many exabytes of data as being dumped into a pool of storage with some metadata attached to the files so it can be analyzed. It sounds closer to a key-value store or even some of the NoSQL efforts, although he said FedEx isn’t using many of the new open source data stores or analytics out there, relying on Greenplum’s Hadoop distribution for analysis today.
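As a rough illustration of what “a pool of storage with some metadata attached” could look like, the sketch below stores opaque blobs under generated keys and keeps a small dictionary of metadata next to each one, so that analysis code can select data by its tags rather than by a database schema. Everything here is hypothetical: the `BlobPool` class, its methods and the metadata fields are invented for the example and do not describe FedEx’s or Greenplum’s software.

```python
from itertools import count
from typing import Dict, List


class BlobPool:
    """Toy key-value pool: raw bytes plus free-form metadata per object."""

    def __init__(self) -> None:
        self._ids = count()
        self._blobs: Dict[str, bytes] = {}
        self._metadata: Dict[str, Dict[str, str]] = {}

    def put(self, data: bytes, **metadata: str) -> str:
        """Store a blob with its metadata and return the generated key."""
        key = f"obj-{next(self._ids)}"
        self._blobs[key] = data
        self._metadata[key] = dict(metadata)
        return key

    def get(self, key: str) -> bytes:
        """Fetch the raw bytes stored under a key."""
        return self._blobs[key]

    def find(self, **criteria: str) -> List[str]:
        """Return keys whose metadata matches every given field, in insertion order."""
        return [
            key
            for key, meta in self._metadata.items()
            if all(meta.get(field) == value for field, value in criteria.items())
        ]
```

In this picture, structured route data and unstructured social media posts can sit in the same pool; a call such as `pool.find(source="social")` picks out one kind of data for analysis without a fixed table layout.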

As companies seek to embrace and use big data, it’s clear that the way data is stored and analyzed is changing. But the fact that applying data to physical items like packages can generate huge opportunities shouldn’t be forgotten either. Big data needs to be used to produce big (or even little) insights.