80legs Cares About Your Bandwidth Cap

Bandwidth caps are forcing at least one startup to adjust its business. Last month when I was in Houston, I met Shion Deysarkar, chief marketing officer of Plura Processing, a company that harnesses the CPU cycles and bandwidth of participating gamers (it pays them up to $2.60 a month for use of 100 percent of the CPU cycles). We talked about the product built on top of Plura, an application called 80legs, which is basically a web crawling service for hire. 80legs, which is still in private beta, provides access to data for search sites, video indexing sites and anything else that wants to scour the web for data.

Through 50,000 Plura nodes, 80legs has access to between 5 and 10 gigabits per second of capacity, which is nothing to sneeze at. However, because of looming worries about bandwidth caps and metered broadband, Brad Wilson, CEO of Plura, says the company has had to implement several safeguards to keep the users who provide the nodes from hitting a cap. The company constantly monitors the web to find information about which ISPs are capping, and where they're doing so. It then checks a node's IP address (but doesn't store it) to determine who the ISP is and where the node is located. If the user is in a capped area, Plura figures out if the cap is a large one (like Comcast's 250GB-per-month cap) and then ensures that any bandwidth use remains a small percentage of that cap. If the cap is small, Plura doesn't tap the bandwidth at all.
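The decision logic described above can be sketched in a few lines. To be clear, this is a hypothetical illustration: the ISP names, cap sizes, the "large cap" threshold and the "small percentage" ceiling are all assumptions of mine, not Plura's actual values or code.

```python
from typing import Optional

# Hypothetical cap table: known ISP caps in GB per month (illustrative data only;
# Comcast's 250GB cap is the one figure mentioned in the article).
ISP_CAPS = {
    "Comcast": 250,        # large cap: harvest only a small fraction of it
    "SmallCapISP": 5,      # small cap: don't harvest at all
    "UncappedISP": None,   # no cap known
}

LARGE_CAP_GB = 100          # assumed threshold between "large" and "small" caps
MAX_FRACTION_OF_CAP = 0.05  # assumed ceiling: use at most 5% of a large cap

def monthly_budget_gb(isp: str) -> Optional[float]:
    """Return the GB/month a node on this ISP may contribute, or None for no limit."""
    cap = ISP_CAPS.get(isp)
    if cap is None:
        return None                      # uncapped: no budget needed
    if cap < LARGE_CAP_GB:
        return 0.0                       # small cap: skip this node entirely
    return cap * MAX_FRACTION_OF_CAP     # large cap: stay a small fraction of it

print(monthly_budget_gb("Comcast"))      # 12.5
print(monthly_budget_gb("SmallCapISP"))  # 0.0
```

The interesting design choice is the asymmetry: large caps get a proportional budget, while small caps are excluded outright rather than given a proportionally tiny budget.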

Users who have signed up to be nodes are informed of the bandwidth harvesting. Wilson says he'd like to eventually provide tools so users can see if their bandwidth is being throttled or impaired. As for ISPs deciding to block Plura's harvesting of consumers' bandwidth, Wilson says that would be a good problem to have, because it would mean 80legs was successful enough to attract ISP notice. Given how some in the industry like to preach the coming end of the Internet due to the pipes becoming too full, a program such as the one 80legs is offering is pretty interesting. Since the pipes delivering the web to users get congested at certain times of day, programs and applications that take advantage of underutilized bandwidth seem like a great way to shift bandwidth-consuming work to times when the network is relatively empty.

Plura has raised $750,000 in seed money from Creeris Ventures. Computational Crawling, a sister company that builds applications such as 80legs to run on the Plura platform, has raised $400,000 from Creeris. Wilson says both companies can reach profitability with their respective amounts of funding, and as such aren't looking to raise more.

3 Responses to “80legs Cares About Your Bandwidth Cap”

  1. Interesting concept. Found their site because their bot hit our site and I had never heard of it before. Then when I saw that it is a web crawl for hire, it's kind of tough to tell what the bot is doing on the site and whether it's a good thing or a bad one.

    If they charge $2/million URLs – but only pay users $2.60 a month – that doesn't quite seem right. It costs more than $2.60 a month to pay for the power to run the computer.

    When we used to participate in the MJ12 node program, the computer could easily crawl a million URLs a day – the problem actually became that home networking equipment isn't set up to handle it as easily as the computer is.

    • “If they charge $2/million URLs – but only pay users $2.60 a month – that doesn’t quite seem right. It costs more than $2.60 a month to pay for the power to run the computer.”

      You’re mixing units.

      – Small Linux instances are ~$227/year apiece ≈ $0.03 per machine-hour.
      – They charge $2 per million pages, and the average web page is ~25 KB if you focus on the core HTML and ignore images, scripts and style sheets.
      – Even if they were running this out of AWS: 25,000 bytes per page × 1,000,000 pages = 25 GB per million pages, so at $2 per million pages they’re charging for 12.5 GB per dollar. Amazon’s 10 cents per GB of ingress works out to 10 GB per dollar, so it’s costing them a little bit if only one customer pays the $2/million pages – but if they’re dual-purposing a single crawl for more than one customer, they’re printing money.
      – Given that they’re using 50k “nodes’” bandwidth and compute time for a bunch of this, you’re really likely to see profitability even with one taker and small overhead (i.e. a small staff).
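      The arithmetic above can be double-checked with a quick script. All figures come from the comment itself (~25 KB/page, $2 per million pages, the circa-2009 $0.10/GB AWS ingress rate the commenter cites):

      ```python
      # Back-of-envelope check of the crawl-economics numbers above.
      BYTES_PER_PAGE = 25_000        # ~25 KB of core HTML per page
      PAGES = 1_000_000
      PRICE_PER_MILLION = 2.00       # dollars 80legs charges per million URLs
      AWS_INGRESS_PER_GB = 0.10      # dollars per GB, per the commenter

      gb_per_million = BYTES_PER_PAGE * PAGES / 1_000_000_000    # GB per million pages
      gb_per_dollar_charged = gb_per_million / PRICE_PER_MILLION # GB delivered per $ of revenue
      gb_per_dollar_aws = 1 / AWS_INGRESS_PER_GB                 # GB bought per $ of AWS cost

      print(gb_per_million)          # 25.0
      print(gb_per_dollar_charged)   # 12.5
      print(gb_per_dollar_aws)       # 10.0
      ```

      Since 12.5 GB per revenue dollar exceeds the 10 GB per cost dollar AWS would give, a single-customer crawl on AWS loses a little on bandwidth alone; reselling one crawl to multiple customers flips the economics.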

      Or am I missing something?

      just my swag.