
NOTICE: Email sabbatical will start December 15

It’s that time of the year again. If you don’t know me, you probably don’t know that I work obscene hours for most of the year and then take a proper vacation during the winter months. As in no internet, no work, no geeking out on research. For me to continue doing the work that I do, I have to refresh. In order to refresh, I go offline. No email, no Twitter, no blogging. And only pre-downloaded Wikipedia-ing (because how can you tour foreign countries without wanting to know weird information about the universe?).

Over the years, I have learned that vacation isn’t vacation if you come home to thousands of pending emails. Cuz then you spend most of vacation worrying about the work that’s piling up. So, over seven years ago, I started instituting “email sabbaticals” in my life. While I’m away, my lovely procmail file (aka “filtering software”) will direct all of my email to /dev/null (aka “the permanent trash”). I will not be reachable. The only person that I stay in contact with while I’m gone is my mother because it’s just too cruel to my mom to disappear entirely. Twitter and my blog will also loudly proclaim my MIA-ness. But the bigger issue is that I will return to a zero-inbox. Nothing sent to me during my email sabbatical will survive. All senders will receive a lovely bounce message saying that their message will never get through. In this way, no one can put things in my to-do queue while I’m trying to take a break. I need to recharge and there’s no way to recharge when the pile-up grows ever more unmanageable.
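For the curious, here is a rough sketch of the logic, written in Python purely for illustration. My actual setup is a procmail file and the details differ, but the idea is the same: discard every incoming message and send the sender an auto-reply explaining that it will never arrive.

```python
# Illustrative only: the "discard and bounce" logic sketched with Python's stdlib.
from email.message import EmailMessage
from typing import Optional

SABBATICAL_NOTICE = (
    "I am on email sabbatical from December 15 to January 10.\n"
    "Your message has NOT been saved and will never be read. "
    "If it still matters, please resend it after January 10."
)

def handle_incoming(msg: EmailMessage) -> Optional[EmailMessage]:
    """Drop the incoming message entirely and return an auto-reply for the sender."""
    sender = msg.get("From")
    # Never auto-reply to automated senders, or two vacation scripts will loop forever.
    if not sender or "mailer-daemon" in sender.lower():
        return None
    reply = EmailMessage()
    reply["To"] = sender
    reply["Subject"] = "Auto-reply: " + msg.get("Subject", "your message")
    reply.set_content(SABBATICAL_NOTICE)
    # The original message is simply never stored: the moral equivalent of /dev/null.
    return reply
```

The point of the design is the second half: nothing is queued for later, so nothing is waiting when I come back.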

From December 15 to January 10, you will not be able to reach me. So, if you need something from me, holler now. Or wait until I come back. But please recognize that I need a break.

To learn more about my crazy process, see: How to Take an Email Sabbatical

Social Science PhD Internships at Microsoft Research New England (Spring & Summer 2012)

Microsoft Research New England (MSRNE) is looking for PhD interns to join the social media collective for Spring and Summer 2012. For these positions, we are looking primarily for social science PhD students (including communications, sociology, anthropology, media studies, information studies, etc.). The Social Media Collective is a collection of scholars at MSRNE who focus on socio-technical questions, primarily from a social science perspective. We are not an applied program; rather, we work on critical research questions that are important to the future of social science scholarship.

MSRNE internships are 12-week paid internships in Cambridge, Massachusetts. PhD interns at MSRNE are expected to devise and execute a research project during their internships. The expected outcome of an internship at MSRNE is a publishable scholarly paper for an academic journal or conference of the intern’s choosing. The goal of the internship is to help the intern advance their own career; interns are strongly encouraged to work towards a publication outcome that will help them on the academic job market. Interns are also expected to collaborate with full-time researchers and visitors, give short presentations, and contribute to the life of the community. While this is not an applied program, MSRNE encourages interdisciplinary collaboration with computer scientists, economists, and mathematicians. There are also opportunities to engage with product groups at Microsoft, although this is not a requirement.

Topics that are currently of interest to the social media collective include: privacy & publicity, internet public policy research, online safety (from sexting to bullying to gang activities), technology and human trafficking, transparency & surveillance, conspicuous consumption & brand culture, piracy, news & information flow, and locative media. That said, we are open to other interesting topics, particularly those that may have significant societal impact. While most of the researchers in the collective are ethnographers, we welcome social scientists of all methodological persuasions.

Applicants should have advanced to candidacy in their PhD program or be close to advancing to candidacy. (Unfortunately, there are no opportunities for Master’s students at this time.) While this internship opportunity is not strictly limited to social scientists, preference will be given to social scientists and humanists making socio-technical inquiries. (Note: While other branches of Microsoft Research focus primarily on traditional computer science research, this group does no development-driven research and is not looking for people who are focused solely on building systems at this time. We welcome social scientists with technical skills and strongly encourage social scientists to collaborate with computer scientists at MSRNE.) Preference will be given to intern candidates who work to make public and/or policy interventions with their research. Interns will benefit most from this opportunity if there are natural opportunities for collaboration with other researchers or visitors currently working at MSRNE.

Applicants from universities outside of the United States are welcome to apply.

PEOPLE AT MSRNE SOCIAL MEDIA COLLECTIVE

The Social Media Collective is organized by Senior Researcher danah boyd (http://www.danah.org) and includes Postdoctoral Researchers Mike Ananny (http://www.stanford.edu/~mja/), Alice Marwick (http://www.tiara.org/), and Andrés Monroy-Hernández (http://www.mit.edu/~amonroy/). Spring faculty visitors will include T.L. Taylor (IT University of Copenhagen) and Eszter Hargittai (Northwestern University). Summer visitors are TBD.

Previous interns in the collective have included Amelia Abreu (UWashington information), Scott Golder (Cornell sociology), Germaine Halegoua (U. Wisconsin, communications), Jessica Lingel (Rutgers library & info science), Laura Noren (NYU sociology), Omar Wasow (Harvard African-American studies), and Sarita Yardi (GeorgiaTech HCI). Previous and current faculty MSR visitors to the collective include: Alessandro Acquisti, Beth Coleman, Bernie Hogan, Christian Sandvig, Helen Nissenbaum, James Grimmelmann, Judith Donath, Jeff Hancock, Kate Crawford, Karrie Karahalios, Lisa Nakamura, Mary Gray, Nalini Kotamraju, Nancy Baym, Nicole Ellison, and Tarleton Gillespie.

If you are curious to know more about MSRNE, I suspect that many of these people would be happy to tell you about their experiences here. Previous interns are especially knowledgeable about how this process works.

APPLICATION PROCESS

To apply for a PhD internship with the social media collective:

1. Fill out the online application form (https://research.microsoft.com/apps/tools/jobs/intern.aspx). Make sure to indicate that you prefer Microsoft Research New England and “social media” or “social computing.” You will need to list two recommenders through this form. Make sure your recommenders respond to the request for letters.

2. Send an email to msrnejob -at- microsoft-dot-com with the subject “SMC PhD Intern Application: ” that includes the following five things:
a. A brief description of your dissertation project.
b. An academic article you have written (published or unpublished) that shows your writing skills.
c. A copy of your CV.
d. A pointer to your website or other online presence (if available).
e. A short description of 1-3 projects that you might imagine doing as an intern at MSRNE.

We will begin considering internship applications on January 10 and consider applications until all social media internship positions are filled.

PREVIOUS INTERN TESTIMONIALS

“The internship at Microsoft Research was all of the things I wanted it to be – personally productive, intellectually rich, quiet enough to focus, noisy enough to avoid complete hermit-like cave dwelling behavior, and full of opportunities to begin ongoing professional relationships with other scholars who I might not have run into elsewhere.”
— Laura Noren, Sociology, New York University

“If I could design my own graduate school experience, it would feel a lot like my summer at Microsoft Research. I had the chance to undertake a project that I’d wanted to do for a long time, surrounded by really supportive and engaging thinkers who could provide guidance on things to read and concepts to consider, but who could also provoke interesting questions on the ethics of ethnographic work or the complexities of building an identity as a social sciences researcher. Overall, it was a terrific experience for me as a researcher as well as a thinker.”
— Jessica Lingel, Library and Information Science, Rutgers University

“Spending the summer as an intern at MSR was an extremely rewarding learning experience. Having the opportunity to develop and work on your own projects as well as collaborate and workshop ideas with prestigious and extremely talented researchers was invaluable. It was amazing how all of the members of the Social Media Collective came together to create this motivating environment that was open, supportive, and collaborative. Being able to observe how renowned researchers streamline ideas, develop projects, conduct research, and manage the writing process was a uniquely helpful experience – and not only being able to observe and ask questions, but to contribute to some of these stages was amazing and unexpected.”
— Germaine Halegoua, Communication Arts, University of Wisconsin-Madison

“The summer I spent at Microsoft Research was one of the highlights of my time in grad school. It helped me expand my research in new directions and connect with world-class scholars. As someone with a technical bent, this internship was an amazing opportunity to meet and learn from really smart humanities and social science researchers. Finally, Microsoft Research as an organization has the best of both worlds: the academic freedom and intellectual stimulation of a university with the perks of industry.”
— Andrés Monroy-Hernández, Media, Arts and Sciences, MIT

Debating Privacy in a Networked World for the WSJ

Earlier this week, the Wall Street Journal posted excerpts from a debate between me, Stewart Baker, Jeff Jarvis, and Chris Soghoian on privacy. In preparation for the piece, they had us respond to a series of questions. Jeff posted the full text of his responses here. Now it’s my turn. Here are the questions that I was asked and my responses.

Part 1:

Question: How much should people care about privacy? (400 words)

People should – and do – care deeply about privacy. But privacy is not simply the control of information. Rather, privacy is the ability to assert control over a social situation. This requires that people have agency in their environment and that they are able to understand any given social situation so as to adjust how they present themselves and determine what information they share. Privacy violations occur when people have their agency undermined or lack relevant information in a social setting that’s needed to act or adjust accordingly. Privacy is not protected by complex privacy settings that create what Alessandro Acquisti calls “the illusion of control.” Rather, it’s protected when people are able to fully understand the social environment in which they are operating and have the protections necessary to maintain agency.

Social media has prompted a radical shift. We’ve moved from a world that is “private-by-default, public-through-effort” to one that is “public-by-default, private-with-effort.” Most of our conversations in a face-to-face setting are too mundane for anyone to bother recording and publicizing. They stay relatively private simply because there’s no need or desire to make them public. Online, social technologies encourage broad sharing and thus, participating on sites like Facebook or Twitter means sharing to large audiences. When people interact casually online, they share the mundane. They aren’t publicizing; they’re socializing. While socializing, people have no interest in going through the efforts required by digital technologies to make their pithy conversations more private. When things truly matter, they leverage complex social and technical strategies to maintain privacy.

The strategies that people use to assert privacy in social media are diverse and complex, but the most notable approach involves limiting access to meaning while making content publicly accessible. I’m in awe of the countless teens I’ve met who use song lyrics, pronouns, and community references to encode meaning into publicly accessible content. If you don’t know who the Lions are or don’t know what happened Friday night or don’t know why a reference to Rihanna’s latest hit might be funny, you can’t interpret the meaning of the message. This is privacy in action.

The reason that we must care about privacy, especially in a democracy, is that it’s about human agency. To systematically undermine people’s privacy – or allow others to do so – is to deprive people of freedom and liberty.

Part 2:

Question: What is the harm in not being able to control our social contexts? Do we suffer because we have to develop codes to communicate on social networks? Or are we forced offline because of our inability to develop codes? (200 words)

Social situations are not one-size-fits-all. How a man acts with his toddler son is different from how he interacts with his business partner, not because he’s trying to hide something but because what’s appropriate in each situation differs. Rolling on the floor might provoke a giggle from his toddler, but it would be strange behavior in a business meeting. When contexts collide, people must choose what’s appropriate. Often, they present themselves in a way that’s as inoffensive to as many people as possible (and particularly those with high social status), which often makes for a bored and irritable toddler.

Social media is one big context collapse, but it’s not fun to behave as though being online is a perpetual job interview. Thus, many people lower their guards and try to signal what context they want to be in, hoping others will follow suit. When that’s not enough, they encode their messages to be only relevant to a narrower audience. This is neither good, nor bad; it’s simply how people are learning to manage their lives in a networked world where they cannot assume strict boundaries between distinct contexts. Lacking spatial separation, people construct context through language and interaction.

Part 3:

Question: Jeff and Stewart seem to be arguing that privacy advocates have too much power and that they should be reined in for the good of society. What do you think of that view? Is the status quo protecting privacy enough? So we need more laws? What kind of laws? Or different social norms? In particular, I would like to hear what you think should be done to prevent turning the Internet into one long job interview, as you described. If you had one or two examples of types of usages that you think should be limited, that would be perfect. (300 words)

When it comes to creating a society in which both privacy and public life can flourish, there are no easy answers. Laws can protect, but they can also hinder. Technologies can empower, but they can also expose. I respect my esteemed colleagues’ views, but I am also concerned about what it means to have a conversation among experts. Decisions about privacy – and public life – in a networked age are being made by people who have immense social, political, and/or economic power, often at the expense of those who are less privileged. We must engender a public conversation about these issues rather than leaving them in the hands of experts.

There are significant pros and cons to all social, legal, economic, and technological decisions. Balancing individual desires with the goals of the collective is daunting. Mediated life forces us to face serious compromises and hard choices. Privacy is a value that’s dear to many people, precisely because openness is a privilege. Systems must respect privacy, but there’s no easy mechanism to inscribe this value into code or law. Thus, we must publicly grapple with these issues and put pressure on decision-makers and systems-builders to remember that their choices have consequences.

We must also switch the conversation from one about data collection to one about data usage. This involves drawing on the language of abuse, violence, and victimization to think about what happens when people’s willingness to share is twisted to do them harm. Just as we have models for differentiating sex between consenting partners and rape, so too must we construct models that separate usage that’s empowering from usage that strips people of their freedoms and opportunities. For example, refusing health insurance based on search queries may make economic sense, but the social costs are far too great. Focusing on usage requires understanding who is doing what to whom and for what purposes. Limiting data collection may be structurally easier, but it doesn’t address the tensions between privacy and public-ness with which people are struggling.

Part 4:

Question: Jeff makes the point that we’re overemphasizing privacy at the expense of all the public benefits delivered by new online services. What do you think of that view? Do you think privacy is being sufficiently protected?

I think that positioning privacy and public-ness in opposition is a false dichotomy. People want privacy *and* they want to be able to participate in public. This is why I think it’s important to emphasize that privacy is not about controlling information, but about having agency and the ability to control a social situation. People want to share and they gain a lot from sharing. But that’s different than saying that people want to be exposed by others. Agency matters.

From my perspective, protecting privacy is about making certain that people have the agency they need to make informed decisions about how they engage in public. I do not think that we’ve done enough here. That said, I am opposed to approaches that protect people by disempowering them or by taking away their agency. I want to see approaches that force powerful entities to be transparent about their data practices. And I want to see approaches that put restrictions on how data can be used to harm people. For example, people should have the ability to share their medical experiences without being afraid of losing their health insurance. The answer is not to stop consumers from sharing their experiences, but rather to limit what insurers can do with information that they can access.

Question: Jeff says that young people are “likely the worst-served sector of society online”? What do you think of that? Do youth-targeted privacy safeguards prevent them from taking advantage of the benefits of the online world? Do the young have special privacy issues, and do they deserve special protections?

I _completely_ agree with Jeff on this point. In our efforts to protect youth, we often exclude them from public life. Nowhere is this more visible than with respect to the Children’s Online Privacy Protection Act (COPPA). This well-intended law was meant to empower parents. Yet, in practice, it has prompted companies to ban any child under the age of 13 from joining general-purpose communication services and participating on social media platforms. In other words, COPPA has inadvertently locked children out of being legitimate users of Facebook, Gmail, Skype, and similar services. Interestingly, many parents help their children circumvent age restrictions. Is this a win? I don’t think so.

I don’t believe that privacy protections focused on children make any sense. Yes, children are a vulnerable population, but they’re not the only vulnerable population. Can you imagine excluding senile adults from participating on Facebook because they don’t know when they’re being manipulated? We need to develop structures that support all people while also making sure that protection does not equal exclusion.

Thanks to Julia Angwin for keeping us on task!

Why Parents Help Children Violate Facebook’s 13+ Rule

Announcing new journal article: “Why Parents Help Their Children Lie to Facebook About Age: Unintended Consequences of the ‘Children’s Online Privacy Protection Act'” by danah boyd, Eszter Hargittai, Jason Schultz, and John Palfrey, First Monday.

“At what age should I let my child join Facebook?” This is a question that countless parents have asked my collaborators and me. Often, it’s followed by the following: “I know that 13 is the minimum age to join Facebook, but is it really so bad that my 12-year-old is on the site?”

While parents are struggling to determine what social media sites are appropriate for their children, the government tries to help parents by regulating what data internet companies can collect about children without parental permission. Yet, as has been the case for the last decade, this often backfires. Many general-purpose communication platforms and social media sites restrict access to only those 13+ in response to a law meant to empower parents: the Children’s Online Privacy Protection Act (COPPA). This forces parents to make a difficult choice: help uphold the minimum age requirements and limit their children’s access to services that let kids connect with family and friends OR help their children lie about their age to circumvent the age-based restrictions and eschew the protections that COPPA is meant to provide.

In order to understand how parents were approaching this dilemma, my collaborators — Eszter Hargittai (Northwestern University), Jason Schultz (University of California, Berkeley), John Palfrey (Harvard University) — and I decided to survey parents. In many ways, we were responding to a flurry of studies (e.g. Pew’s) that revealed that millions of U.S. children have violated Facebook’s Terms of Service and joined the site underage. These findings prompted outrage back in May as politicians blamed Facebook for failing to curb underage usage. Embedded in this furor was an assumption that by not strictly guarding its doors and keeping children out, Facebook was undermining parental authority and thumbing its nose at the law. Facebook responded by defending its practices — and highlighting how it regularly ejects children from its site. More controversially, Facebook’s founder Mark Zuckerberg openly questioned the value of COPPA in the first place.

While Facebook has often sparked anger over its cavalier attitudes towards user privacy, Zuckerberg’s challenge with regard to COPPA has merit. It’s imperative that we question the assumptions embedded in this policy. All too often, the public takes COPPA at face-value and politicians angle to build new laws based on it without examining its efficacy.

Eszter, Jason, John, and I decided to focus on one core question: Does COPPA actually empower parents? In order to do so, we surveyed parents about their household practices with respect to social media and their attitudes towards age restrictions online. We are proud to release our findings today, in a new paper published at First Monday called “Why parents help their children lie to Facebook about age: Unintended consequences of the ‘Children’s Online Privacy Protection Act’.” Drawing on a national survey of 1,007 U.S. parents with children ages 10-14 living with them, conducted July 5-14, 2011, we found:

  • Although Facebook’s minimum age is 13, parents of 13- and 14-year-olds report that, on average, their child joined Facebook at age 12.
  • Over half (55%) of parents of 12-year-olds report their child has a Facebook account, and most (82%) of these parents knew when their child signed up. Most (76%) also assisted their 12-year-old in creating the account.
  • A third (36%) of all parents surveyed reported that their child joined Facebook before the age of 13, and two-thirds of them (68%) helped their child create the account.
  • Half (53%) of parents surveyed think Facebook has a minimum age and a third (35%) of these parents think that this is a recommendation and not a requirement.
  • Most (78%) parents think it is acceptable for their child to violate minimum age restrictions on online services.

The status quo is not working if large numbers of parents are helping their children lie to get access to online services. Parents do appear to be having conversations with their children, as COPPA intended. Yet, what does it mean if they’re doing so in order to violate the restrictions that COPPA engendered?

One reaction to our data might be that companies should not be allowed to restrict access to children on their sites. Unfortunately, getting the parental permission required by COPPA is technologically difficult, financially costly, and ethically problematic. Sites that target children take on this challenge, but often by excluding children whose parents lack resources to pay for the service, those who lack credit cards, and those who refuse to provide extra data about their children in order to offer permission. The situation is even more complicated for children who are in abusive households, have absentee parents, or regularly experience shifts in guardianship. General-purpose sites, including communication platforms like Gmail and Skype and social media services like Facebook and Twitter, generally prefer to avoid the social, technical, economic, and free speech complications involved.

While there is merit to thinking about how to strengthen parent permission structures, focusing on this obscures the issues that COPPA is intended to address: data privacy and online safety. COPPA predates the rise of social media. Its architects never imagined a world where people would share massive quantities of data as a central part of participation. It no longer makes sense to focus on how data are collected; we must instead question how those data are used. Furthermore, while children may be an especially vulnerable population, they are not the only vulnerable population. Most adults have little sense of how their data are being stored, shared, and sold.

COPPA is a well-intentioned piece of legislation with unintended consequences for parents, educators, and the public writ large. It has stifled innovation for sites focused on children and its implementations have made parenting more challenging. Our data clearly show that parents are concerned about privacy and online safety. Many want the government to help, but they don’t want solutions that unintentionally restrict their children’s access. Instead, they want guidance and recommendations to help them make informed decisions. Parents often want their children to learn how to be responsible digital citizens. Allowing them access is often the first step.

Educators face a different set of issues. Those who want to help youth navigate commercial tools often encounter the complexities of age restrictions. Consider the 7th grade teacher whose students are heavy Facebook users. Should she admonish her students for being on Facebook underage? Or should she make sure that they understand how privacy settings work? Where does digital literacy fit in when what children are doing is in violation of websites’ Terms of Service?

At first blush, the issues surrounding COPPA may seem to only apply to technology companies and the government, but their implications extend much further. COPPA affects parenting, education, and issues surrounding youth rights. It affects those who care about free speech and those who are concerned about how violence shapes home life. It’s important that all who care about youth pay attention to these issues. They’re complex and messy, full of good intention and unintended consequences. But rather than reinforcing or extending a legal regime that produces age-based restrictions which parents actively circumvent, we need to step back and rethink the underlying goals behind COPPA and develop new ways of achieving them. This begins with a public conversation.

We are excited to release our new study in the hopes that it will contribute to that conversation. To read our complete findings and learn more about their implications for policy makers, see “Why Parents Help Their Children Lie to Facebook About Age: Unintended Consequences of the ‘Children’s Online Privacy Protection Act'” by danah boyd, Eszter Hargittai, Jason Schultz, and John Palfrey, published in First Monday.

To learn more about the Children’s Online Privacy Protection Act (COPPA), make sure to check out the Federal Trade Commission’s website.

(Versions of this post were originally written for the Huffington Post and for the Digital Media and Learning Blog.)

Image Credit: Tim Roe

The Unintended Consequences of Cyberbullying Rhetoric

We all know that teen bullying – both online and offline – has devastating consequences. Jamey Rodemeyer’s suicide is a tragedy. He was tormented for being gay. He knew he was being bullied and he regularly talked about the fact that he was being bullied. Online, he even wrote: “I always say how bullied I am, but no one listens. What do I have to do so people will listen to me?” The fact that he could admit that he was being tormented coupled with the fact that he asked for help and folks didn’t help him should be a big wake-up call. We have a problem. And that problem is that most of us adults don’t have the foggiest clue how to help youth address bullying.

It doesn’t take a tragedy to know that we need to find a way to combat bullying. Countless regulators and educators are desperate to do something – anything – to put an end to the victimization. But in their desperation to find a solution, they often turn a blind eye to both research and the voices of youth.

The canonical research definition of bullying was written by Olweus and it has three components:

  • Bullying is aggressive behavior that involves unwanted, negative actions.
  • Bullying involves a pattern of behavior repeated over time.
  • Bullying involves an imbalance of power or strength.

What Rodemeyer faced was clearly bullying, but a lot of the reciprocal relational aggression that teens experience online is not actually bullying. Still, in the public eye, these concepts are blurred and so when parents and teachers and regulators talk about wanting to stop bullying, they talk about wanting to stop all forms of relational aggression too. The problem is that many teens do not – and, for good reasons, cannot – identify a lot of what they experience as bullying. Thus, all of the newfangled programs to stop bullying are often missing the mark entirely. In a new paper that Alice Marwick and I co-authored – called “The Drama! Teen Conflict, Gossip, and Bullying in Networked Publics” – we analyzed the language of youth and realized that their use of the language of “drama” serves many purposes, not the least of which is to distance themselves from the perpetrator / victim rhetoric of bullying in order to save face and maintain agency.

For most teenagers, the language of bullying does not resonate. When teachers come in and give anti-bullying messages, it has little effect on most teens. Why? Because most teens are not willing to recognize themselves as a victim or as an aggressor. To do so would require them to recognize themselves as disempowered or abusive. They aren’t willing to go there. And when they are, they need support immediately. Yet, few teens have the support structures necessary to make their lives better. Rodemeyer is a case in point. Few schools have the resources to provide youth with the necessary psychological counseling to work through these issues. But if we want to help youth who are bullied, we need there to be infrastructure to help young people when they are willing to recognize themselves as victimized.

To complicate matters more, although school after school is scrambling to implement anti-bullying programs, no one is assessing the effectiveness of these programs. This is not to say that we don’t need education – we do. But we need the interventions to be tested. And my educated hunch is that we need to be focusing more on positive frames that use the language of youth rather than focusing on the negative.

I want to change the frame of our conversation because we need to change the frame if we’re going to help youth. I’ve spent the last seven years talking to youth about bullying and drama and it nearly killed me when I realized that all of the effort that adults are putting into anti-bullying campaigns is falling on deaf ears and doing little to actually address what youth are experiencing. Even hugely moving narratives like “It Gets Better” aren’t enough when a teen can make a video for other teens and then kill himself because he’s unable to make it better in his own community.

In an effort to ground the bullying conversation, Alice Marwick and I just released a draft of our new paper: “The Drama! Teen Conflict, Gossip, and Bullying in Networked Publics.” We also co-authored a New York Times Op-Ed in the hopes of reaching a wider audience: “Why Cyberbullying Rhetoric Misses the Mark.” Please read these and send us feedback or criticism. We are in this to help the youth that we spend so much time with and we’re both deeply worried that adult rhetoric is going in the wrong direction and failing to realize why it’s counterproductive.

Image from Flickr by Brandon Christopher Warren


Six Provocations for Big Data

The era of “Big Data” has begun. Computer scientists, physicists, economists, mathematicians, political scientists, bio-informaticists, sociologists, and many others are clamoring for access to the massive quantities of information produced by and about people, things, and their interactions. Diverse groups argue about the potential benefits and costs of analyzing information from Twitter, Google, Verizon, 23andMe, Facebook, Wikipedia, and every space where large groups of people leave digital traces and deposit data. Significant questions emerge. Will large-scale analysis of DNA help cure diseases? Or will it usher in a new wave of medical inequality? Will data analytics help make people’s access to information more efficient and effective? Or will it be used to track protesters in the streets of major cities? Will it transform how we study human communication and culture, or narrow the palette of research options and alter what ‘research’ means? Some or all of the above?

Kate Crawford and I decided to sit down and interrogate some of the assumptions and biases embedded in the rhetoric surrounding “Big Data.” The resulting piece – “Six Provocations for Big Data” – offers a multi-disciplinary social analysis of the phenomenon with the goal of sparking a conversation. This paper is intended to be presented as a keynote address at the Oxford Internet Institute’s 10th Anniversary “A Decade in Internet Time” Symposium.

Feedback is more than welcome!

Guilt Through Algorithmic Association

You’re a 16-year-old Muslim kid in America. Say your name is Mohammad Abdullah. Your schoolmates are convinced that you’re a terrorist. They keep typing in Google queries like “is Mohammad Abdullah a terrorist?” and “Mohammad Abdullah al Qaeda.” Google’s search engine learns. All of a sudden, auto-complete starts suggesting terms like “Al Qaeda” as the next term in relation to your name. You know that colleges are looking up your name and you’re afraid of the impression that they might get based on that auto-complete. You are already getting hostile comments in your hometown, a decidedly anti-Muslim environment. You know that you have nothing to do with Al Qaeda, but Google gives the impression that you do. And people are drawing that conclusion. You write to Google but nothing comes of it. What do you do?

This is guilt through algorithmic association. And while this example is not a real case, I keep hearing about real cases. Cases where people are algorithmically associated with practices, organizations, and concepts that paint them in a problematic light even though there’s nothing on the web that associates them with that term. Cases where people are getting accused of affiliations that get produced by Google’s auto-complete. Reputation hits that stem from what people _search_ not what they _write_.

It’s one thing to be slandered by another person on a website, on a blog, in comments. It’s another to have your reputation slandered by computer algorithms. The algorithmic associations do reveal the attitudes and practices of people, but those people are invisible; all that’s visible is the product of the algorithm, without any context of how or why the search engine conveyed that information. What becomes visible is the data point of the algorithmic association. But what gets interpreted is the “fact” implied by said data point, and that gives an impression of guilt. The damage comes from creating the algorithmic association. It gets magnified by conveying it.
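To make the mechanism concrete, here is a toy sketch of frequency-based completion. It is emphatically not Google’s algorithm, which no outsider knows; it is only meant to show how nothing more than other people’s repeated searches can manufacture an association with a name.

```python
# Toy sketch: suggest completions purely from how often other people typed them.
# An illustration of the dynamic described above, not any real search engine.
from collections import Counter, defaultdict

class ToyAutocomplete:
    def __init__(self):
        # For every query prefix, count the continuations that followed it.
        self.continuations = defaultdict(Counter)

    def log_query(self, query: str) -> None:
        words = query.lower().split()
        for i in range(1, len(words)):
            prefix = " ".join(words[:i])
            self.continuations[prefix][" ".join(words[i:])] += 1

    def suggest(self, prefix: str, k: int = 3):
        return [c for c, _ in self.continuations[prefix.lower()].most_common(k)]

ac = ToyAutocomplete()
# A handful of hostile classmates is enough to tip the counts for an uncommon name.
for _ in range(20):
    ac.log_query("Mohammad Abdullah al Qaeda")
ac.log_query("Mohammad Abdullah soccer schedule")

print(ac.suggest("Mohammad Abdullah"))  # ['al qaeda', 'soccer schedule']
```

Notice that the person being searched for contributes nothing here; the suggestion is assembled entirely out of what strangers type, which is exactly why the reputational damage is so hard to contest.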

This dynamic raises a series of questions:
  1. What are the consequences of guilt through algorithmic association?
  2. What are the correction mechanisms?
  3. Who is accountable?
  4. What can or should be done?

Note: The image used here is Photoshopped. I did not use real examples so as to protect the reputations of people who told me their story.

Update: Guilt through algorithmic association is not constrained to Google. This is an issue for any and all systems that learn from people and convey collective “intelligence” back to users. All of the examples that I was given from people involved Google because Google is the dominant search engine. I’m not blaming Google. Rather, I think that this is a serious issue for all of us in the tech industry to consider. And the questions that I’m asking are genuine questions, not rhetorical ones.

Exciting News: Me @ Microsoft Research + New York University

When I was finishing my PhD and starting to think about post-school plans, I made a list of my favorite university departments. At the top of the list was New York University’s “Media, Culture, and Communication” (MCC) department. I am in awe of their faculty and greatly admire the students I know who graduated from there. I decided that MCC was my dream department.

When I joined Microsoft Research, I had a bit of a pang of sadness over the fact that I was opting out of the formal academic job market before it opened, in part because I was really hoping that MCC would have a job opening. But I also realized that I’d be a fool not to take the MSR job. Working at Microsoft Research is a complete dream come true. I have enormous freedom, unbelievable support, and the opportunity to really create a community of researchers.

But then I started wondering… would there be any way to do both? Yes, this is a twisted thought coming from a workaholic, but it kept nagging at the back of my brain. Countless Microsoft Research faculty in Redmond have joint appointments at the University of Washington. And I’m already splitting time between New York and Boston for personal reasons and will be spending more time in New York in the future. So, maybe I could have my cake and eat it too…

One day, hanging out with Helen Nissenbaum, I mentioned that I lurved her department from the bottom of my heart. And, in an off-hand comment, I said something about how I would love love love to have a joint position at MCC. And somehow, what began as a side comment slowly blossomed into a flower when Marita Sturken – the MCC Chair – told me that she thought that this was a great idea. We started talking and negotiating and plotting and imagining. And, to my surprise and delight, Marita called to say that it was possible to create a joint position for me between MSR and MCC.

So, I am tickled pink to announce that I now have a joint appointment at NYU’s Media, Culture, and Communication department. I am joining the faculty as a Research Assistant Professor. I won’t be teaching any formal classes this year, although I’m looking forward to teaching in the future. In the meantime, I will be advising students and collaborating on research and getting involved in the department life. I will not be leaving Microsoft Research – I still don’t see why anyone would leave MSR. My primary affiliation will still be MSR and MSR will continue to be my academic home. But I’m also excited to have a joint appointment at NYU’s MCC that allows me to engage with the scholarly community and with students in new ways. And I’m really really really excited about this!

w000t!!!

I do not speak for my employer.

I don’t know whether to laugh or cry when people imply that when I make arguments, I’m speaking on behalf of Microsoft. Anyone who knows me knows that my opinions are my own. (This blog sez so too but no one ever seems to read that.) What I most appreciate about my employer is that they allow me to speak my mind, even when we disagree. This is what it means to have freedom as a researcher and it’s one of the reasons that I love love love Microsoft Research. I never ever speak on behalf of Microsoft but I have zero clue why people desperately want to perpetuate this myth. This is what makes me want to cry.

What makes me want to laugh is the irony of folks thinking I speak on behalf of Microsoft when I am critiquing an industry-wide practice that is most prominent because of Google’s recent implementation. Yes, I work for Microsoft. But I used to work for Google on social products. Many of my friends – and my brother – work for Google. I also used to work for Bradley Horowitz (one of the folks in charge of Google Plus) when we were both at Yahoo! and I adore him to pieces. I have nothing but respect for the challenges involved in building products, but I also have no qualms about highlighting problematic corporate logic. My arguments are not coming from a point of hatred towards any company or individual, but stemming from a determination to speak up for those who are voiceless in many of these discussions and to provide a different perspective with which to understand the issues.

I write and critique decisions in the tech industry when I feel as though those decisions have unintended consequences for those being affected. I’m particularly passionate when what’s at stake has implications for equality. I recognize and respect the libertarian ethos that persists in the Valley, but I think that it’s critical that privileged folks understand the cultural logic of those who are not that privileged. And, as someone who has an obscene amount of privilege at this stage in the game, I’m committed to using my stature to draw attention to issues that affect people who are marginalized. And when I get pissed off about something, I rant. And that can be both good and bad. But I’ve found that my rants often make people think. That’s what motivates me to keep ranting.

Sometimes, what I say pisses people off. Sometimes, it sounds like I’m dissing particular products or people. Usually, though, I’m critiquing assumptions that persist in the tech industry and the policies that unfold because of those assumptions. And I recognize that those who don’t know me have a bad tendency to misinterpret what I’m saying. I struggle every time I write to do my darndest to be understandable to as many people as I can. And when I’m most visible, folks often think I’m saying the darndest things. But even though I don’t correct everyone, that doesn’t mean that it’s not frustrating to be taken out of context so frequently.

And so it goes… and so it goes…

“Oh, how I miss substituting the conclusion to confrontation with a kiss.”

Designing for Social Norms (or How Not to Create Angry Mobs)

In his seminal book “Code”, Larry Lessig argued that social systems are regulated by four forces: 1) the market; 2) the law; 3) social norms; and 4) architecture or code. In thinking about social media systems, plenty of folks think about monetization. Likewise, as issues like privacy pop up, we regularly see legal regulation become a factor. And, of course, folks are always thinking about what the code enables or not. But it’s depressing to me how few people think about the power of social norms. In fact, social norms are usually only thought of as a regulatory process when things go terribly wrong. And then they’re out of control and reactionary and confusing to everyone around. We’ve seen this with privacy issues and we’re seeing this with the “real name” policy debates. As I read through the discussion that I provoked on this issue, I couldn’t help but think that we need a more critical conversation about the importance of designing with social norms in mind.

Good UX designers know that they have the power to shape certain kinds of social practices by how they design systems. And engineers often fail to give UX folks credit for the important work that they do. But designing the system itself is only a fraction of the design challenge when thinking about what unfolds. Social norms aren’t designed into the system. They don’t emerge by telling people how they should behave. And they don’t necessarily follow market logic. Social norms emerge as people – dare we say “users” – work out how a technology makes sense and fits into their lives. Social norms take hold as people bring their own personal values and beliefs to a system and help frame how future users can understand the system. And just as “first impressions matter” for social interactions, I cannot overstate the importance of early adopters. Early adopters configure the technology in critical ways and they play a central role in shaping the social norms that surround a particular system.

How a new social media system rolls out is of critical importance. Your understanding of a particular networked system will be heavily shaped by the people who introduce you to that system. When a system unfolds slowly, there’s room for the social norms to slowly bake, for people to work out what the norms should be. When a system unfolds quickly, there’s a whole lot of chaos in terms of social norms. Whenever a networked system unfolds, there are inevitably competing norms that arise from people who are disconnected from one another. (I can’t tell you how much I loved watching Friendster when the gay men, Burners, and bloggers were oblivious to one another.) Yet, the faster things move, the faster those collisions occur, and the more confusing it is for the norms to settle.

The “real name” culture on Facebook didn’t unfold because of the “real name” policy. It unfolded because the norms were set by early adopters and most people saw that and reacted accordingly. Likewise, the handle culture on MySpace unfolded because people saw what others did and reproduced those norms. When social dynamics are allowed to unfold organically, social norms are a stronger regulatory force than any formalized policy. At that point, you can often formalize the dominant social norms without too much pushback, particularly if you leave wiggle room. Yet, when you start with a heavy-handed regulatory policy that is not driven by social norms – as Google Plus did – the backlash is intense.

Think back to Friendster for a moment… Remember Fakester? (I wrote about them here.) Friendster spent ridiculous amounts of time playing whack-a-mole, killing off “fake” accounts and pissing off some of the most influential of its userbase. The “Fakester genocide” prompted an amazing number of people to leave Friendster and head over to MySpace, most notably bands, all because they didn’t want to be configured by the company. The notion of Fakesters died down on MySpace, but the most central practice – the ability for groups (bands) to have recognizable representations – ended up being the most central feature of MySpace.

People don’t like to be configured. They don’t like to be forcibly told how they should use a service. They don’t want to be told to behave like the designers intended them to be. Heavy-handed policies don’t make for good behavior; they make for pissed off users.

This doesn’t mean that you can’t or shouldn’t design to encourage certain behaviors. Of course you should. The whole point of design is to help create an environment where people engage in the most fruitful and healthy way possible. But designing a system to encourage the growth of healthy social norms is fundamentally different than coming in and forcefully telling people how they must behave. No one likes being spanked, especially not a crowd of opinionated adults.

Ironically, most people who were adopting Google Plus early on were using their real names, out of habit, out of understanding how they thought the service should work. A few weren’t. Most of those who weren’t were using a recognizable pseudonym, not even trying to trick anyone. Going after them was just plain stupid. It was an act of force and people felt disempowered. And they got pissed. And at this point, it’s no longer about whether or not the “real names” policy was a good idea in the first place; it’s now an act of oppression. Google Plus would’ve been ten bazillion times better off had they subtly encouraged the policy without making a big deal out of it, had they chosen to only enforce it in the most egregious situations. But now they’re stuck between a rock and a hard place. They either have to stick with their policy and deal with the angry mob or let go of their policy as a peace offering in the hopes that the anger will calm down. It didn’t have to be this way though and it wouldn’t have been had they thought more about encouraging the practices they wanted through design rather than through force.

Of course there’s a legitimate reason to want to encourage civil behavior online. And of course trolls wreak serious havoc on a social media system. But a “real names” policy doesn’t stop an unrepentant troll; it’s just another hurdle that the troll will love mounting. In my work with teens, I see textual abuse (“bullying”) every day among people who know exactly who each other is on Facebook. The identities of many trolls are known. But that doesn’t solve the problem. What matters is how the social situation is configured, the norms about what’s appropriate, and the mechanisms by which people can regulate them (through social shaming and/or technical intervention). A culture where people can build reputation through their online presence (whether “real” names or pseudonyms) goes a long way in combating trolls (although it is by no means a foolproof solution). But you don’t get that culture by force; you get it by encouraging the creation of healthy social norms.

Companies that build systems that people use have power. But they have to be very very very careful about how they assert that power. It’s really easy to come in and try to configure the user through force. It’s a lot harder to work diligently to design and build the ecosystem in which healthy norms emerge. Yet, the latter is of critical importance to the creation of a healthy community. Cuz you can’t get to a healthy community through force.