After my post a few weeks ago lamenting Twitter's data use policies, many people reached out to me supporting my position and asking what they could do to help. One of them was Mark Huberty, a fellow political scientist at UC Berkeley. Mark mentioned that many other social scientists had had similar experiences and were worried about the ramifications for research.
We decided that the best way to proceed was to appeal to all researchers, not only social scientists, to gather examples of work, and stories, of how many disciplines are using this data to uncover new aspects of human behavior. This morning, Mark wrote just such an appeal to the POLMETH mailing list, and in an effort to reach a larger audience I have reproduced it below:
One of us (Drew Conway) recently found that, although Twitter makes its data open to almost anyone via well-documented interfaces, and although it appears to encourage experimentation with its data, that openness doesn't extend to redistributing the data for replication. This poses serious questions about the use of Twitter-based data for academic research. Twitter has been shown to be an accurate predictor of vote and polling outcomes, and a novel way to measure partisan polarization and communication. But without clear data use policies, research taking advantage of this data may not pass muster with journals requiring the release of replication files, and research progress will be hindered.
We think there is an opportunity here to show interdisciplinary academic interest in the Twitter data, and open a conversation about reasonable data retention and release policies on their part. At present, there appears to be a disconnect between Twitter's analysts, who seem to encourage data use, and its legal and business arm, who are very conservative with Twitter's intellectual property rights. Given this disconnect, Twitter has been inconsistent in its demands on researchers using this data. We're hoping that by pointing out the inconsistency and seeking a reasonable resolution, we can find a suitable outcome for both Twitter's business model and researchers' interests. If done correctly, this might have the potential to become a model for other social media sites of interest to social scientists.
We would like to engage participation from anyone in the PolMeth community who has an interest in this outcome. If you might be interested in participating, please let one of us know. We're only in the early stages of working on this, so we welcome all inquiries, ideas, and concerns.
Thanks for your interest. We look forward to hearing from you.
We hope that those of you using Twitter data for research will help us in this effort. Please feel free to contact me directly, either by email or in the comments section below. We look forward to hearing from you!