Facebook has acquired its third company, Octazen Solutions, a Malaysian startup. According to GigaOm, Facebook says this is largely a talent acquisition. The Octazen homepage tells a slightly different story, saying Facebook acquired “most of the company’s assets and to employ those assets in a different direction.”
So what exactly is Octazen? One senior engineer at a competing company says, “Facebook just bought the web’s most talented and creative scrapers that have gotten around everyone’s rate limits and detection systems.” Another said, “Facebook is so sanctimonious about protecting their own user data through Facebook Connect, but Octazen has been scraping user data for years off terms of service and then reselling it.”
The fact is, Octazen is very, very good at scraping data at scale without being detected. It may hit a service from lots of different IP addresses, for example, so that no single address stands out. Octazen could, these engineers say, scrape very public sites like Twitter, where the social graph is on each profile, without Twitter knowing it was happening.
For example, in 2007 folks were buying and running Octazen scripts to scrape contacts in a very sketchy way: “So we use this toolkit from Octazen to scrape contact lists off of various sites. Our ever eager users (ab)used this feature so much that hotmail blocked us.” The poster found a way to get the data through Hotmail’s API instead of scraping, and Octazen responded, saying “Very nice indeed.”
Facebook is apparently already using Octazen to mysteriously determine your long-lost friends. I’m sure you’ve received one of the notifications suggesting that you reconnect with them.
It will be interesting to see how Facebook puts this newfound data-gathering resource to use, and what further implications it will have for Facebook’s run-ins over privacy.