Proposed guidelines for the ethical use of Twitter data

Background to this article: Twitter is releasing its historical archive of public tweets to selected researchers. See Introducing Twitter Data Grants and Twitter #DataGrants selections.

Scientific American says “A trove of billions of tweets will be a research boon and an ethical dilemma.” Indeed. We’re thus reproducing part of Caitlin M. Rivers and Bryan L. Lewis’s article Ethical research standards in a world of big data for comment.

The objectives, methodologies, and data handling practices of the project are transparent and easily accessible

This information should be published in manuscripts, posted on the web for public access, and provided to the institutional review board (IRB) when relevant. Going forward, collaboration between the research community and Twitter to inform users about ongoing research and relevant results may also be beneficial. Transparency about the use of Internet data for research is needed to foster ‘privacy literacy’, so that users can make informed decisions about participating in Twitter.

Study design and analyses respect the context in which a tweet was sent

A tweet author discussing his mental health, for example, does not do so with the intention of sharing that data with researchers; he does it to communicate with his digital community. Qualitatively analyzing these communications as if they are offered for research consumption does not align with the context in which the tweets were created. Twitter participants can reasonably expect to rely on some anonymity of the crowd to manage privacy.

The anonymity of tweet authors is protected, ensuring that subjects should not be identifiable in any way

To preserve source anonymity, direct quotes or screen names are not publishable, nor are any details that could be used to identify a subject. Any and all information that could be entered into a search engine to trace back to a human source should be protected. A composite of multiple example tweets may instead be used for illustrative purposes. Geolocations in particular should be scaled to a larger geographic area in order to avoid violating the privacy of those tweet authors. Title 13 of the U.S. Code, the federal law under which the Census Bureau is regulated, is described in the Bureau’s Data Protection and Privacy Policy as forbidding the publication of GPS coordinates; researchers should adhere to this guideline as well.
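As one way to picture the geolocation advice above, the following sketch rounds coordinates to a coarser grid before they are stored or published. The function name `coarsen_location` and the default of one decimal place (roughly 11 km of latitude) are assumptions made for illustration, not part of Rivers and Lewis’s guidelines; a suitable scale will depend on how densely populated the area is.

```python
def coarsen_location(lat, lon, decimals=1):
    """Round a latitude/longitude pair to a coarser grid.

    One decimal place corresponds to roughly 11 km of latitude. This
    default is an illustrative assumption, not a recommended standard.
    """
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("coordinates out of range")
    # Rounding discards the fine-grained digits that could pinpoint a home
    # or workplace; it does not by itself guarantee anonymity.
    return (round(lat, decimals), round(lon, decimals))
```

Rounding is only one possible approach; mapping each point to a named region such as a county or city would serve the same purpose.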

Tweet data are not used to harvest additional information from other sources

Focused collection is also important for preserving anonymity. It is possible to use data collected from Twitter to discern the identities of tweet authors, which can then be used to find and collect further information from other sources. For example, an author’s username, identifying details provided in tweet texts, or geolocations could all be used to collect data about that individual from other sources like Facebook, LinkedIn, Flickr, or public records.
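Focused collection can be pictured as keeping only the fields a study actually needs and discarding the rest at the point of collection. The sketch below is a minimal illustration under stated assumptions: the helper `minimize_record`, the `ALLOWED_FIELDS` list, and the field names in the sample record are hypothetical and would need to match whatever format a project actually receives.

```python
# Hypothetical whitelist: only these fields are retained for analysis.
ALLOWED_FIELDS = ("created_at", "text")

def minimize_record(tweet):
    """Return a copy of a tweet record containing only whitelisted fields.

    Usernames, profile details, and coordinates are dropped simply because
    they are not on the whitelist.
    """
    return {key: tweet[key] for key in ALLOWED_FIELDS if key in tweet}
```

Note that the tweet text itself can still contain identifying details, so a whitelist like this reduces the risk described above rather than removing it.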

Twitter users’ efforts to control their personal data are honored

Researchers may not follow a user on Twitter in order to gain access to a protected account. Doing so would violate that user’s efforts to control his or her personal data.

Researchers work collaboratively with IRB just as they would for any other human subject data collection

There is not currently an expectation that researchers engaging in research using Twitter will interface with their IRB. As discussed above, studies that could be conceived as individual-based should require IRB approval, whereas research designs that use data in aggregate (e.g. counts of keywords) may proceed without explicit consent. In turn, review boards should keep abreast of social network mining methodologies and corresponding ethical considerations in order to provide informed guidance to researchers.
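To make the distinction concrete, an aggregate design of the “counts of keywords” kind might look like the sketch below, which tallies how many tweets mention each keyword and keeps no per-tweet or per-author record. The function name `keyword_counts` and the simple tokenizer are assumptions for illustration, not a method taken from the original article.

```python
import re

def keyword_counts(tweet_texts, keywords):
    """Count how many tweets mention each keyword.

    Only the totals are returned; no tweet text or author is retained.
    """
    wanted = {k.lower() for k in keywords}
    counts = {k: 0 for k in wanted}
    for text in tweet_texts:
        # Crude word tokenizer: lowercase letters, digits, and apostrophes.
        tokens = set(re.findall(r"[a-z0-9']+", text.lower()))
        for k in wanted & tokens:
            counts[k] += 1
    return counts
```

Each tweet contributes at most one to a keyword’s total, so a single prolific author repeating a word within one tweet does not inflate the count.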

Geek Feminism readers: what do you think?

Article source, licensing and citation notes:

This post is an excerpt of Ethical research standards in a world of big data by Caitlin M. Rivers and Bryan L. Lewis as allowed under the terms of the Creative Commons Attribution licence. We suggest that anyone quoting or reproducing this article copy from the original source to ensure accuracy.

The original article can be cited as: Rivers CM and Lewis BL (2014) Ethical research standards in a world of big data [v1; ref status: approved with reservations 1, http://f1000r.es/2wq] F1000Research 2014, 3:38 (doi: 10.12688/f1000research.3-38.v1)
