Ben Tremblay

Technology, business and change

80legs and democratizing web crawling: It’s game-changing


I read a post about 80Legs on Mashable today, and the only words that came to mind as I read it were “they’re changing the game”. They are. Here’s what they do:

80Legs is a service platform for web crawling and processing web content. We put over 50,000 computers to work for you to deliver exceptional crawling performance at incredibly low costs. Our service is easy to use and completely customizable, so you can crawl and process web content however you want, whenever you want.

To summarize: you get your very own web crawler at a more-than-affordable price: $2.00 per million pages crawled.

Why it’s game-changing

Web crawling is a complex and expensive process: crawling a web page, extracting its content and scaling the process to millions of pages is neither trivial nor cheap. Not only do you need to build your own crawler or use an existing one, you also need the infrastructure in place to support it all.
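To give a rough idea of what even the toy version involves, here is a minimal sketch in Python of the crawl-and-extract loop. It is not 80Legs' API or how their service works; the names `LinkExtractor`, `crawl` and `fetch` are mine, and `fetch` is assumed to be any function that takes a URL and returns the page's HTML (or `None` on failure). Everything that makes crawling hard at scale (politeness, retries, distribution across machines, storage) is left out.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_pages=100):
    """Breadth-first crawl starting at `seed`.

    `fetch` is a hypothetical callable: URL -> HTML string, or None on failure.
    Returns a dict mapping each crawled URL to the absolute links found on it.
    """
    frontier = deque([seed])  # URLs waiting to be crawled
    seen = {seed}             # URLs already queued, to avoid loops
    results = {}
    while frontier and len(results) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        links = []
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            absolute, _fragment = urldefrag(urljoin(url, href))
            links.append(absolute)
            if absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
        results[url] = links
    return results


# A tiny in-memory "web" so the sketch runs without touching the network.
PAGES = {
    "http://example.com/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.com/a": '<a href="/">home</a>',
    "http://example.com/b": '<a href="/a#top">A</a>',
}

site_map = crawl("http://example.com/", PAGES.get)
```

Here `PAGES.get` stands in for a real HTTP fetch. Swapping in a real one is the easy part; running this loop over millions of pages is where the infrastructure cost comes from.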

Now, anybody willing to build a niche search engine, extract specific information from a series of websites or build any web app involving crawling the Web at a large scale can do it at an affordable cost, without much technical knowledge.

It’s not 80Legs itself that’s game-changing; it’s the whole concept of democratizing web crawling and giving anybody with a great idea, but without the money or the technical skills, a chance to just make it happen.

It’s similar to blogging: it’s not about the platform you use (WordPress, Blogger, TypePad, etc.), it’s the fact that anybody can now have a voice and have an impact. Platforms die, ideas don’t.

It’s dead simple

I don’t want to say too much about 80Legs because I haven’t really tested it (beyond creating an account and playing around), and I don’t like talking about stuff I haven’t tested, but it really looks dead simple. And simple as the process looks, it still seems to offer some decent advanced features.


I won’t go into too much detail, but it allows a lot of customization so you can extract the information you want. Of course, you will need some technical knowledge if you want to use the crawled data and build a web app around it, but the hard part, the scaling, is covered by a tool like 80Legs.

Spam is now more affordable than ever

As enthusiastic as I am about democratizing web crawling, there’s a huge downside: spam and content scraping are now more affordable than ever. It’s an easy way for amateur spammers to build email lists by crawling websites and extracting email addresses. I’m sure we could find dozens of other spam-related issues, but I much prefer to focus on the positive aspects.

Overall, the concept is extremely interesting and anybody willing to build a web app involving web crawling should be looking into this.
