
Summary:

Underlying all the useful applications that have emerged from the big data ecosystem, such as Hadoop, is a fundamental assumption: companies will be able to access the data they want when they need it, explained Michelle Munson, CEO and co-founder of Aspera.

Michelle Munson from Aspera at Structure Big Data 2011

Underlying all the useful and inspiring applications that have emerged from the big data ecosystem, such as Hadoop, is a fundamental assumption: companies will be able to access the data they want when they need it. That requires the ability to transfer files at the speeds people expect, which is one of the constraints of the big data world, explained Michelle Munson, CEO and co-founder of Aspera.

Aspera has built a proprietary high-speed file-transport technology, fasp, that helps data move across troubled networks, such as over-burdened WANs. Its customers are primarily large companies dealing with big data, including digital media companies sending content among supply-chain partners, life sciences researchers sending genome-sequencing data among institutes, and government intelligence customers sending video files between agencies.

Munson said current Internet infrastructure lacks three qualities:

  1. availability
  2. geographic independence
  3. security

While all these issues need to be addressed in the fundamental architecture itself, the constraint has created an opportunity for Aspera’s transfer product. The reliability of Internet services is going up, which creates an expectation that this data will be available quickly, said Ammar Hanafi, general partner with Alloy Ventures.

While consumer web services can easily meet customer expectations, Aspera’s customers are a different story. “Our customers are moving many gigabytes and larger [quantities] of data that has to be chunked up and then distributed,” said Munson. But even if Aspera’s file transfer tech can make sure the delivery is as fast as the consumer web, the company has learned it can provide something else: predictability. “After solving the bottleneck, then you can offer customers predictability,” which helps manage their expectations, Munson said.

At the end of the day, it’s a physics problem, both Munson and Hanafi said. TCP, the transmission protocol used by IP networks, just doesn’t perform all that well for moving big data long distances. That’s a big opportunity both for startups like Aspera and for big data infrastructure companies.
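To make the distance penalty concrete, here is a rough sketch, not anything Aspera presented, of a widely cited rule of thumb for a single TCP flow (the Mathis et al. approximation): throughput is capped at roughly (MSS / RTT) × (C / √p), where MSS is the segment size, RTT is the round-trip time, p is the packet-loss rate and C is about 1.22. The function name, the 1,460-byte segment size, the 0.1 percent loss rate and the sample round-trip times are all illustrative assumptions, and real networks and modern TCP variants will behave differently.

```python
import math


def tcp_throughput_bps(mss_bytes: float, rtt_s: float, loss_rate: float) -> float:
    """Approximate ceiling on single-flow TCP throughput, in bits per second.

    Uses the Mathis et al. rule of thumb: (MSS / RTT) * (C / sqrt(p)).
    This is an estimate for classic loss-based TCP, not a measurement.
    """
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    c = math.sqrt(3.0 / 2.0)  # constant from the simplest form of the model
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


if __name__ == "__main__":
    # Hypothetical paths: same loss rate, increasing distance (round-trip time).
    paths = [
        ("metro, 10 ms", 0.010),
        ("cross-country, 80 ms", 0.080),
        ("intercontinental, 200 ms", 0.200),
    ]
    for label, rtt in paths:
        mbps = tcp_throughput_bps(1460, rtt, 0.001) / 1e6
        print(f"{label}: about {mbps:.1f} Mbps")
```

Under this model the ceiling falls in direct proportion to round-trip time regardless of how much bandwidth the link has, which is the kind of bottleneck a distance-neutral transport aims to avoid.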

  1. perhaps i’m too much of a layer 0/1 kind of guy, but i find it hard to take folks seriously who talk about software/protocol solutions “to a physics problem”. that being said optimization of the protocol stack has a lot of opportunity to fix the problem, but i don’t believe anyone is well served by characterizing this as solving “geographic independence” — that is misleading.

    1. i think matt’s point is right. it’s not a physics problem. it’s the bottleneck of the protocol designed almost 4 decades ago and weakness of tcp.

  2. Michelle Munson Thursday, March 24, 2011

    Hi Matt,

    I didn’t say it was a physics problem – the writer got this wrong. I said that traditional big data moving protocols (TCP-based) are severely bottlenecked over network distance (due to the coupling of throughput to packet loss and round-trip delay), and leave a design gap for improved transport (without such bottlenecks). The distance-neutral transport is part of overcoming the geographic dependence of present contribution distribution architectures.

