An incident in which a media research firm improperly copied messages from a discussion forum on the website PatientsLikeMe has sparked questions about the practice of data "scraping," the Wall Street Journal reports.
PatientsLikeMe is an online social networking site.
What Is Scraping?
Scraping refers to the practice of using sophisticated software to collect personal data from social networking websites, online forums, job sites and other Web-based sources.
The information can be used to track consumer preferences, screen job candidates and conduct market research.
PatientsLikeMe Incident
On May 7, PatientsLikeMe administrators noticed unusual activity on a discussion board about moods. The company shut down the suspicious account, as well as three additional questionable accounts.
PatientsLikeMe traced the suspicious accounts to the media research firm Nielsen, which monitors online trends to provide consumer insight for its clients. Investigators found that Nielsen had been copying messages from patient discussion forums.
On May 18, PatientsLikeMe sent a cease-and-desist letter to Nielsen, which agreed to stop scraping the site.
On May 20, PatientsLikeMe President Ben Heywood informed the site's 70,000 users about the scraping incident in a blog post. He also reminded members that PatientsLikeMe sells user information that has been de-identified (Angwin/Stecklow, Wall Street Journal, 10/12).
PatientsLikeMe called the incident a violation of its user agreement rather than a security breach because user account information was not compromised (Merrill, Healthcare IT News, 10/12).
Nielsen Response
Dave Hudson, who took over in June as CEO of the Nielsen unit that scraped PatientsLikeMe in May, said the incident was "a bad legacy practice that we don't do anymore."
The company said it no longer scrapes websites that require individual accounts for access, unless it has permission to do so.
A Growing Industry
Spending on data from online sources is expected to more than double, from $410 million in 2009 to $840 million in 2012.
Numerous companies across the country provide data scraping services, but few legal precedents govern the practice of extracting online data.
Eric Goldman, law professor at Santa Clara University, said, "Scraping is ubiquitous, but questionable." He added, "Everyone does it, but it's not totally clear that anyone is allowed to do it without permission" (Wall Street Journal, 10/12).