My name is danah boyd and I'm a Principal Researcher at Microsoft Research and the founder/president of Data & Society. Buzzwords in my world include: privacy, context, youth culture, social media, big data. I use this blog to express random thoughts about whatever I'm thinking.

Put an End to Reporting on Election Polls

We now know that the US election polls were wrong. Just like they were in Brexit. Over the last few months, I’ve told numerous reporters and people in the media industry that they should be wary of the polling data they’re seeing, but I was generally ignored and dismissed. I wasn’t alone — two computer scientists whom I deeply respect — Jenn Wortman Vaughan and Hanna Wallach — were trying to get an op-ed on prediction and uncertainty into major newspapers, but were repeatedly told that the outcome was obvious. It was not. And election polls will be increasingly problematic if we continue to approach them the way we currently do.

It’s now time for the media to put a moratorium on reporting on election polls and fancy visualizations of statistical data. And for data scientists and pollsters to stop feeding the media hype cycle with statistics that they know have flaws or will be misinterpreted as fact.

Why Political Polling Will Never Be Right Again

Polling and survey research has a beautiful history, one that most people who obsess over the numbers don’t know. In The Averaged American, Sarah Igo documents three survey projects that unfolded in the mid-20th century that set the stage for contemporary polling: the Middletown studies, Gallup, and Kinsey. As a researcher, it’s mindblowing to see just how naive folks were about statistics and data collection in the early development of this field, how much the field has learned and developed. But there’s another striking message in this book: Americans were willing to contribute to these kinds of studies at unparalleled levels compared to their peers worldwide because they saw themselves as contributing to the making of public life. They were willing to reveal their thoughts, beliefs, and ideas because they saw doing so as productive for them individually and collectively.

As folks unpack the inaccuracies of contemporary polling data, they’re going to focus on technical limitations. Some of these are real. Cell phones have changed polling: many people don’t pick up unknown numbers. The FCC’s ruling that limited robocalls to protect consumers in late 2015 meant that this year’s sampling process got skewed, that polling became more expensive, and that pollsters took shortcuts. We’ve heard about how efforts to extrapolate representativeness from small samples mess with the data, such as the NYTimes report on a single person distorting national polling averages.

But there’s a more insidious problem with the polling data that is often unacknowledged. Everyone and their mother wants to collect data from the public. And the public is tired of being asked, which they perceive as being nagged. In swing states, registered voters were overwhelmed with calls from real pollsters, fake pollsters, political campaigns, fundraising groups, special interest groups, and their neighbors. We know that people often lie to pollsters (social desirability bias), but when people don’t trust information collection processes, normal respondent bias becomes downright deceptive. You cannot collect reasonable data when the public doesn’t believe in the data collection project. And political pollsters have pretty much killed off their ability to do reasonable polling because they’ve undermined trust. It’s like what happens when you plant the same crop over and over again until the land can no longer sustain that crop.

Election polling is dead, and we need to accept that.

Why Reporting on Election Polling Is Dangerous

To most people, even those who know better, statistics look like facts. And polling results look like truth serum, even when pollsters responsibly report margin of error information. It’s just so reassuring or motivating to see stark numbers because you feel like you can do something about those numbers, and then, when the numbers change, you feel good. This plays into basic human psychology. And this is why we use numbers as an incentive in both education and the workplace.

Political campaigns use numbers to drive actions on their teams. They push people to go to particular geographies, they use numbers to galvanize supporters. And this is important, which is why campaigns invest in pollsters and polling processes.

Unfortunately, this psychology and logic get messed up when you’re talking about reporting on election polls in the public. When the numbers look like your team is winning, you relax and stop fretting, often into complacency. When the numbers look like your team is losing, you feel more motivated to take steps and do something. This is part of why the media likes the horse race: they push people to action by reporting on numbers, which in effect pushes different groups to take action. They like the attention that they get as the mood swings across the country in a hotly contested race.

But there is number burnout and exhaustion. As people feel pushed and swayed, as the horse race goes on and on, they get more and more disenchanted. Rather than galvanizing people to act, reporting on political polling over a long period of time with flashy visuals and constantly shifting needles prompts people to disengage from the process. In short, when it comes to the election, this prompts people to not show up to vote. Or to be so disgusted that voting practices become emotionally negative actions rather than productively informed ones.

This is a terrible outcome. The media’s responsibility is to inform the public and contribute to a productive democratic process. By covering political polls as though they are facts in an obsessive way, they are not only being statistically irresponsible, but they are also being psychologically irresponsible.

The news media are trying to create an addictive product through their news coverage, and, in doing so, they are pushing people into a state of overdose.

Yesterday, I wrote about how the media is being gamed and not taking moral responsibility for its participation in the spectacle of this year’s election. One of its major flaws is how it’s covering data and engaging in polling coverage. This is, in many ways, the easiest part of the process to fix. So I call on the news media to put a moratorium on political polling coverage, to radically reduce the frequency with which they reference polls during an election season, and to be super critical of the data that they receive. If they want to be a check to power, they need to have the structures in place to be a check to math.

(This was first posted on Points.)

I blame the media. Reality check time.

For months I have been concerned about how what I was seeing on the ground and in various networks was not at all aligned with what pundits were saying. I knew the polling infrastructure had broken, but whenever I told people about the problems with the sampling structure, they looked at me like an alien and told me to stop worrying. Over the last week, I started to accept that I was wrong. I wasn’t.

And I blame the media.

The media is supposed to be a check to power, but, for years now, it has basked in becoming power in its own right. What worries me right now is that, as it continues to report out the spectacle, it has no structure for self-reflection, for understanding its weaknesses, its potential for manipulation.

I believe in data, but data itself has become spectacle. I cannot believe that it has become acceptable for media entities to throw around polling data without any critique of the limits of that data, to produce fancy visualizations which suggest that numbers are magical information. Every pollster got it wrong. And there’s a reason. They weren’t paying attention to the various structural forces that made their sample flawed, the various reasons why a disgusted nation wasn’t going to contribute useful information to inform a media spectacle. This abuse of data has to stop. We need data to be responsible, not entertainment.

This election has been a spectacle because the media has enjoyed making it as such. And in doing so, they showcased just how easily they could be gamed. I refer to the sector as a whole because individual journalists and editors are operating within a structural frame, unmotivated to change the status quo even as they see similar structural problems to the ones I do. They feel as though they “have” to tell a story because others are doing so, because their readers can’t resist reading. They live in the world pressured by clicks and other elements of the attention economy. They need attention in order to survive financially. And they need a spectacle, a close race.

We all know that story. It’s not new. What is new is that they got played.

Over the last year, I’ve watched as a wide variety of decentralized pro-Trump actors first focused on getting the media to play into his candidacy as spectacle, feeding their desire for a show. In the last four months, I watched those same networks focus on depressing turnout, using the media to trigger the populace to feel so disgusted and frustrated as to disengage. It really wasn’t hard because the media was so easy to mess with. And they were more than happy to spend a ridiculous amount of digital ink circling round and round into a frenzy.

Around the world, people have been looking at us in a state of confusion and shock, unsure how we turned our democracy into a new media spectacle. What hath 24/7 news, reality TV, and social media wrought? They were right to ask. We were irresponsible to ignore.

In the tech sector, we imagined that decentralized networks would bring people together for a healthier democracy. We hung onto this belief even as we saw that this wasn’t playing out. We built the structures for hate to flow along the same pathways as knowledge, but we kept hoping that this wasn’t really what was happening. We aided and abetted the media’s suicide.

The red pill is here. And it ain’t pretty.

We live in a world shaped by fear and hype, not because it has to be that way, but because this is the obvious paradigm that can fuel the capitalist information architectures we have produced.

Many critics think that the answer is to tear down capitalism, make communal information systems, or get rid of social media. I disagree. But I do think that we need to actively work to understand complexity, respectfully engage people where they’re at, and build the infrastructure to enable people to hear and appreciate different perspectives. This is what it means to be truly informed.

There are many reasons why we’ve fragmented as a country. From the privatization of the military (which undermined the development of diverse social networks) to our information architectures, we live in a moment where people do not know how to hear or understand one another. And our obsession with quantitative data means that we think we understand when we hear numbers in polls, which we use to judge people whose views are different than our own. This is not productive.

Most people are not apathetic, but they are disgusted and exhausted. We have unprecedented levels of anxiety and fear in our country. The feelings of insecurity and inequality cannot be written off by economists who want to say that the world is better today than it ever was. It doesn’t feel that way. And it doesn’t feel that way because, all around us, the story is one of disenfranchisement, difference, and uncertainty.

All of us who work in the production and dissemination of information need to engage in a serious reality check.

The media industry needs to take responsibility for its role in producing spectacle for selfish purposes. There is a reason that the public doesn’t trust institutions in this country. And what the media has chosen to do is far from producing information. It has chosen to produce anxiety in the hopes that we will obsessively come back for more. That is unhealthy. And it’s making us an unhealthy country.

Spectacle has a cost. It always has. And we are about to see what that cost will be.

(This was first posted at Points.)

Columbus Day!?!? What the f* are we celebrating?

Today is Columbus Day, a celebration of colonialism wrapped up under the guise of exploration. Children around the US are taught that European settlers came in 1492 and found a whole new land magically free for occupation. In November, they will be told that there were small and dispersed savage populations who opened their arms to white settlers fleeing oppression. Some of those students may eventually learn on their own about the violence, genocide, infection, containment, relocation, humiliation, family separation, and cultural devaluation which millions of Native peoples experienced over centuries.

Hello, cultural appropriation!

Later this month, when everyone is excited about goblins and ghosts, thousands of sexy Indian costumes will be sold, prompting young Native Americans to cringe at the depictions of their culture and community. Part of the problem is that most young Americans think that Indians are dead or fictitious. Schools don’t help — children are taught to build teepees and wear headdresses as though this is a story of the past, not a living culture. And racist attitudes towards Native people are baked into every aspect of our culture. Why is it OK for Washington’s football team to be named the Redskins? Can you imagine a football team being named after the N-word?

Historically, Native people sit out Columbus Day in silence. This year, I hope you join me and thousands of others by making a more active protest to change what people learn!

In 2004, the Smithsonian’s National Museum of the American Indian was opened on the Mall in Washington DC as a cultural heritage institution to celebrate the stories of Native people and tell their story. I’m a proud trustee of this esteemed institution. I’m even more excited by upcoming projects that are focused on educating the public more holistically about the lives and experiences of Native peoples.

As a country, we’re struggling with racism and prejudice, hate that is woven deep into our cultural fabric. Injustice is at the core of our country’s creation, whether we’re talking about the original sin of slavery or the genocide of Native peoples. Addressing inequities in the present requires us to come to terms with our past. We need to educate ourselves about the limits of our understanding about our own country’s history. And we need to stop creating myths for our children that justify contemporary prejudice.

On this day, a day that we should not be celebrating, I have an ask for you. Please help me and NMAI build an educational effort that will change the next generation’s thinking about Native culture, past and present. Please donate a multiple of $14.91 to NMAI: http://nmai.si.edu/support/membership/ in honor of how much life existed on these lands before colonialist expansion. Help Indian nations achieve their rightful place of respect among the world’s nations and communities.

There was a bomb on my block.

I live in Manhattan, in Chelsea, on 27th Street between 6th and 7th, the same block in which the second IED was found. It was a surreal weekend, but it is increasingly becoming depressing as the media moves from providing information to stoking fear, the exact response that makes these events so effective. I’m not afraid of bombs. I’m afraid of cars. And I’m increasingly becoming afraid of American media.

After hearing the bomb go off on 23rd and getting flooded with texts on Saturday night, I decided to send a few notes that I was OK and turn off my phone. My partner is Israeli. We’ve been there for two wars and he’s been there through countless bombs. We both knew that getting riled up was of no help to anyone. So we went to sleep. I woke up on Sunday, opened my blinds, and was surprised to see an obscene number of men in black with identical body types, identical haircuts, and identical cars. It looked like the weirdest casting call I’ve ever seen. And no one else. No cars, no people. As always, Twitter had an explanation so we settled into our PJs and realized it was going to be a strange day.

Flickr / Sean MacEntree

As other people woke up, one thing became quickly apparent — because folks knew we were in the middle of it, they wanted to reach out to us because they were worried, and scared. We kept shrugging everything off, focusing on getting back to normal and reading the news for updates about how we could maneuver our neighborhood. But ever since a suspect was identified, the coverage has gone into hyperventilation mode. And I just want to scream in frustration.

The worst part about having statistical training is that it’s hard to hear people get anxious about fears without putting them into perspective. ~100 people die every day in car crashes in the United States. That’s 33,804 deaths in a year. Thousands of people are injured every day by cars. Cars terrify me. And anyone who says that you have control over a car accident is full of shit; most car deaths and injuries are not the harmed person’s fault.

The worst part about being a parent is having to cope with the uncontrollable, irrational, everyday fears that creep up, unwarranted, just to plague a moment of happiness. Will he choke on that food? What if he runs away and gets hit by a car? What if he topples over that chair? The best that I can do is breathe in, breathe out, and remind myself to find my center, washing away those fears with each breath.

And the worst part about being a social scientist is understanding where others’ fears come from, understanding the power of those fears, and understanding the cost of those fears on the well-being of a society. And this is where I get angry because this is where control and power lies.

Traditional news media has a lot of say in what it publishes. This is one of the major things that distinguishes it from social media, which propagates the fears and anxieties of the public. And yet, time and time again, news media shows itself to be irresponsible, motivated more by the attention and money that it can obtain by stoking people’s fears than by a moral responsibility to help ground an anxious public.

I grew up on the internet. I grew up with the mantra “don’t feed the trolls.” I always saw this as a healthy meditation for navigating the internet, for focusing on the parts of the internet that are empowering and delightful. Increasingly, I keep thinking that this is a meditation that needs to be injected into the news ecosystem. We all know that the whole concept of terrorism is to provoke fear in the public. So why are we not holding news media accountable for opportunistically aiding and abetting terroristic acts? Our cultural obsession with reading news that makes us afraid parallels our cultural obsession with crises.

There’s a reason that hate is growing in this country. And, in moments like this, I’m painfully reminded that we’re all contributing to the culture of hate. When we turn events like what happened this weekend in NY/NJ into spectacle, when we encourage media to write stories about how afraid people are, when we read the stories of how the suspect was an average person until something changed, we give the news media license to stoke up fear. And when they are encouraged to stoke fear, they help turn our election cycle into reality TV and enable candidates to spew hate for public entertainment. We need to stop blaming what’s happening on other people and start taking responsibility.

In short, we all need to stop feeding the trolls.

Be Careful What You Code For

Most people who don’t code don’t appreciate how hard it is to do right. Plenty of developers are perfectly functional, but to watch a master weave code into silken beauty is utterly inspiring. Unfortunately, most of the code that underpins the tools that we use on a daily basis isn’t so pretty. There is a lot of digital duct tape.

CC BY-NC 2.0-licensed photo by Dino Latoga.

I’m a terrible programmer. Don’t get me wrong: I’m perfectly capable of mashing together code to get a sorta-kinda-somewhat reasonable outcome. But the product is inevitably a Frankensteinesque monstrosity. I’m not alone. This is why I’m concerned about the code that is being built. Not all code is created equally.

If you want to understand what we’re facing, consider what this would mean if we were constructing cities. In the digital world, we are simultaneously building bridges, sewage systems, and skyscrapers. Some of the bridge builders have civil engineering degrees, some of our sewage contractors have been plumbers in past lives, but most of the people building skyscrapers have previously only built tree houses and taken a few math classes. Oh, and there aren’t any inspectors to assess whether or not it’s all going to fall apart.

Code is key to civic life, but we need to start looking under the hood and thinking about the externalities of our coding practices, especially as we’re building code as fast as possible with few checks and balances.

Area One: Environmental Consequences

Let’s play a game of math. Almost 1 billion people use Gmail. More than that are active on Facebook each month. Over 300 million are active on Twitter each month. All social media, including Facebook and Twitter, send out notifications to tell you that you have new friend requests, likes, updates, etc. Each one of those notifications is roughly 50KB. If you’re relatively active, you might get 1MB of notifications a day. That doesn’t seem to be that much. But if a quarter of Gmail users get that, this means that Google hosts over 90 petabytes of notifications per year. All of that is sitting live on servers so that any user can search their email and find past emails, including the new followers they received in 2007. Is this really a good use of resources? Is this really what we want when we talk about keeping data around?
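For anyone who wants to check that math, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 PB = 10^9 MB), rounds "almost 1 billion" up to an even billion, and uses a 365-day year. The function and variable names are made up for illustration, and the inputs are the loose estimates from the paragraph above, not measured values.

```python
# Back-of-the-envelope check of the notification storage estimate.
# All inputs are the rough figures from the text, not measurements.

GMAIL_USERS = 1_000_000_000      # "almost 1 billion", rounded up (an assumption)
ACTIVE_SHARE = 0.25              # a quarter of users
MB_PER_USER_PER_DAY = 1          # about twenty 50KB notifications
DAYS_PER_YEAR = 365
MB_PER_PETABYTE = 1_000_000_000  # decimal units: 1 PB = 10^9 MB


def yearly_notification_petabytes(users, share, mb_per_day, days=DAYS_PER_YEAR):
    """Return the petabytes of notification email accumulated per year."""
    total_mb = users * share * mb_per_day * days
    return total_mb / MB_PER_PETABYTE


if __name__ == "__main__":
    pb = yearly_notification_petabytes(GMAIL_USERS, ACTIVE_SHARE, MB_PER_USER_PER_DAY)
    print(f"{pb:.2f} PB per year")
```

Under these assumptions the total lands just above 90 petabytes, and it scales linearly: halve the share of active users or the daily volume and the yearly figure halves with it.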

The tech industry uses crazy metaphors. Artificial intelligence. Files and folders. They often have really funny roots that make any good geek giggle. (UNIX geeks, did you know that the finger command is named as such because that word meant someone is a “snitch” in the 1970s? You probably had a dirtier idea in mind.)

CC BY 2.0-licensed photo by Pattys-photos.

We don’t know who started calling the cloud the cloud, but he (and it’s inevitably a he) did us all a disservice. When the public hears about the cloud, they think about the fluffy white things in the sky. “What were the skies like when you were young? They went on forever… And the skies always had little fluffy clouds.” Those clouds giveth. They offer rain, which gives us water, which is the source of life.

But what about the clouds we techies make? Those clouds take. They require rare earth metals and soak up land, power, and water. Many big companies are working hard to think about the environmental impact of data centers, to think about the carbon implications. (I’m proud to work for one of them.) Big companies still have a long way to go, but at least they’re trying. But how many developers out there are trying to write green code? At best, folks are thinking about the cost-per-computation, but most developers are pretty sloppy with code and data. And there’s no LEED-certified code. Who is going to start certifying LEED code!?

In the same sense, how many product designers are thinking about the environmental impact of every product design decision they make? Product folks are talking about how notifications might annoy or engage users but not the environmental impact of them. And for all those open data zealots, is the world really better off having petabytes of data sitting on live servers just to make sure it’s open and accessible just in case? It’s painful to think about how many terabytes of data are sitting in open data repositories that have never been accessed.

And don’t get me started about the blockchain or 3D printing or the Internet of Things. At least bitcoin got one thing right: this really is about mining.

Area Two: Social Consequences

In the early 2000s, Google thought that I was a truck driver. I got the best advertisements. I didn’t even know how many variations of trucker speed there were! All because I did fieldwork in parts of the country that only truckers visit. Consider how many people have received online advertisements that clearly got them wrong. Funny, huh?

Now…Have you ever been arrested? Have you ever been incarcerated?

Take a moment to think about the accuracy of our advertising ecosystem — the amount of money and data that goes into making ads right. Now think about what it means that the same techniques that advertisers are using to “predict” what you want to buy are also being used to predict the criminality of a neighborhood or a person. And those that work in law enforcement and criminal justice have less money, oversight mechanisms, and technical skills.

Inaccuracy and bias are often a given in advertising. But is it OK that we’re using extraordinarily biased data about previous arrests to predict future arrests and determine where police are stationed? Is it OK that we assess someone’s risk at the point of arrest and give judges recommendations for bail, probation, and sentencing? Is it OK that local law enforcement agencies are asking tech vendors to predict which children are going to commit a crime before they’re 21? Who is deciding, and who is holding them accountable?

We might have different political commitments when it comes to policing and criminal justice. But when it comes to tech and data analysis, I hope that we can all agree that accuracy matters. Yet, we’re turning a blind eye to all of the biases that are baked into the data and, thus, the models that we build.

CC BY-NC 2.0-licensed photo by Thomas Hawk.

Take a moment to consider that 96% of cases are pleaded out. Those defendants never see a jury of their peers. At a minimum, 10% (but most likely much more) of those who take a plea are innocent. Why? Last I saw, the average inmate at Rikers waits ~600 days for their trial to begin. Average. And who is more likely to end up not making bail? Certainly not rich white folks.

Researchers have long known that whites are more likely to use and sell drugs. And yet, who is arrested for drugs? Blacks. 13% of the US population is black, but over 60% of those in prison are black. Mostly for drug crimes.

Because blacks are more likely to be arrested — and more likely to be prosecuted and serve time, guess what our algorithms tell us about who is most likely to commit a drug crime? About where drug crimes occur? Police aren’t sent by predictive policing tools to college campuses. They’re sent to the hood.

Engineers argue that judges and police officers should know the limits of the data they use. Some do — they’re simply ignoring these expensive, tax-payer-costing civic technologies. But in a world of public accountability, where police are punished for not knowing someone was a risk before they shoot up a church, many feel obliged to follow the recommendations for fear of reprisal. This is how racism gets built into the structures of our systems. And civic tech is implicated in this.

I don’t care what your politics are. If you’re building a data-driven system and you’re not actively seeking to combat prejudice, you’re building a discriminatory system.

Solution: Audits and Inspection

Decisions made involving tech can have serious ramifications that are outside of the mind’s eye of development. We need to wake up. Our technology is powerful, and we need to be aware of the consequences of our code.

Before our industry went all perpetual beta, we used to live in a world where Test or Quality Assurance meant something. Rooted in those domains is a practice that can be understood as an internal technical audit. We need to get back to this. We need to be able to answer simple questions like:

  • Does the system that we built produce the right output given the known constraints?
  • Do we understand the biases and limitations of the system and the output?
  • Are those clear to the user so that our tool cannot enable poor decision-making or inaccurate impressions?
  • What are the true social and environmental costs of the service?

We need to start making more meaningful trade-offs. And that requires asking hard questions.

Audits don’t have to be adversarial. They can be a way of honestly assessing the limitations of a system and benchmarking for improvement. This approach is not without problems and limitations, but, if you cannot understand whether a model is helping or hurting, discriminating or resulting in false positives, then you should not be implementing that technology in a high stakes area where freedom and liberty are at stake. Stick to advertising.
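One way to make the question of "discriminating or resulting in false positives" concrete is to compare false positive rates across groups. What follows is a minimal sketch, not a complete audit. The records, the group labels, and the function name are all hypothetical, invented purely for illustration. A real audit would need real outcomes and a hard look at how the outcome labels themselves were produced, since biased arrest data would contaminate them too.

```python
from collections import defaultdict


def false_positive_rate_by_group(records):
    """Compute the false positive rate for each group.

    Each record is a tuple (group, flagged_high_risk, outcome_occurred).
    FPR = people flagged whose outcome did not occur
          / all people whose outcome did not occur.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, flagged, outcome in records:
        if not outcome:
            negatives[group] += 1
            if flagged:
                false_positives[group] += 1
    # Groups with no negative cases are skipped to avoid dividing by zero.
    return {g: false_positives[g] / n for g, n in negatives.items() if n > 0}


# Hypothetical toy data, invented for illustration only.
toy_records = [
    ("A", True, False), ("A", True, False), ("A", False, False), ("A", True, True),
    ("B", True, False), ("B", False, False), ("B", False, False), ("B", False, False),
]

if __name__ == "__main__":
    rates = false_positive_rate_by_group(toy_records)
    for group, rate in sorted(rates.items()):
        print(f"group {group}: false positive rate {rate:.2f}")
```

A large gap between groups does not explain itself, but it is the kind of result that should pause a deployment in a high stakes setting until someone can account for it.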

Technology can be amazingly empowering. But only when it is implemented in a responsible manner. Code doesn’t create magic. Without the right checks and balances, it can easily be misused. In the world of civic tech, we need to conscientiously think about the social and environmental costs, just as urban planners do.

Facebook Must Be Accountable to the Public

A pair of Gizmodo stories have prompted journalists to ask questions about Facebook’s power to manipulate political opinion in an already heated election year. If the claims are accurate, Facebook contractors have depressed some conservative news, and their curatorial hand affects the Facebook Trending list more than the public realizes. Mark Zuckerberg took to his Facebook page yesterday to argue that Facebook does everything possible to be neutral and that there are significant procedures in place to minimize biased coverage. He also promises to look into the accusations.

Watercolor by John Orlando Parry, “A London Street Scene” 1835, in the Alfred Dunhill Collection.

As this conversation swirls around intentions and explicit manipulation, there are some significant issues missing. First, all systems are biased. There is no such thing as neutrality when it comes to media. That has long been a fiction, one that traditional news media needs and insists on, even as scholars highlight that journalists reveal their biases through everything from small facial twitches to choice of frames and topics of interest. It’s also dangerous to assume that the “solution” is to make sure that “both” sides of an argument are heard equally. This is the source of tremendous conflict around how heated topics like climate change and evolution are covered. It is even more dangerous, however, to think that removing humans and relying more on algorithms and automation will remove this bias.

Recognizing bias and enabling processes to grapple with it must be part of any curatorial process, algorithmic or otherwise. As we move into the development of algorithmic models to shape editorial decisions and curation, we need to find a sophisticated way of grappling with the biases that shape development, training sets, quality assurance, and error correction, not to mention an explicit act of “human” judgment.

There never was neutrality, and there never will be.

This issue goes far beyond the Trending box in the corner of your Facebook profile, and this latest wave of concerns is only the tip of the iceberg around how powerful actors can affect or shape political discourse. What is of concern right now is not that human beings are playing a role in shaping the news — they always have — it is the veneer of objectivity provided by Facebook’s interface, the claims of neutrality enabled by the integration of algorithmic processes, and the assumption that what is prioritized reflects only the interests and actions of the users (the “public sphere”) and not those of Facebook, advertisers, or other powerful entities.

The key challenge that emerges out of this debate concerns accountability. In theory, news media is accountable to the public. Like neutrality, this is more of a desired goal than something that’s consistently realized. While traditional news media has aspired to (but not always realized) meaningful accountability, there are a host of processes in place to address the possibility of manipulation: ombudspeople, whistleblowers, public editors, and myriad alternate media organizations. Facebook and other technology companies have not, historically, been included in that conversation.

I have tremendous respect for Mark Zuckerberg, but I think his stance that Facebook will be neutral as long as he’s in charge is a dangerous statement. This is what it means to be a benevolent dictator, and there are plenty of people around the world who disagree with his values, commitments, and logics. As a progressive American, I have a lot more in common with Mark than not, but I am painfully aware of the neoliberal American value systems that are baked into the very architecture of Facebook and our society as a whole.

Who Controls the Public Sphere in an Era of Algorithms?

In light of this public conversation, I’m delighted to announce that Data & Society has been developing a project that asks who controls the public sphere in an era of algorithms. As part of this process, we convened a workshop and have produced a series of documents that we think are valuable to the conversation:

These documents provide historical context, highlight how media has always been engaged in power struggles, showcase the challenges that new media face, and offer case studies that reveal the complexities going forward.

This conversation is by no means over. It is only just beginning. My hope is that we quickly leave the state of fear and start imagining mechanisms of accountability that we, as a society, can live with. Institutions like Facebook have tremendous power and they can wield that power for good or evil. But for society to function responsibly, there must be checks and balances regardless of the intentions of any one institution or its leader.

This work is a part of Data & Society’s developing Algorithms and Publics project, including a set of documents occasioned by the Who Controls the Public Sphere in an Era of Algorithms? workshop. More posts from workshop participants:

Where Do We Find Ethics?

I was in elementary school, watching the TV live, when the Challenger exploded. My classmates and I were stunned and confused by what we saw. With the logic of a 9-year-old, I wrote a report on O-rings, trying desperately to make sense of a science I did not know and a public outcry that I couldn’t truly understand. I wanted to be an astronaut (and I wouldn’t give up that dream until high school!).

Years later, with a lot more training under my belt, I became fascinated not simply by the scientific aspects of the failure, but by the organizational aspects of it. Last week, Bob Ebeling died. He was an engineer at a contracting firm, and he understood just how badly the O-rings handled cold weather. He tried desperately to convince NASA that the launch was going to end in disaster. Unlike many people inside organizations, he was willing to challenge his superiors, to tell them what they didn’t want to hear. Yet, he didn’t have organizational power to stop the disaster. And at the end of the day, NASA and his superiors decided that the political risk of not launching was much greater than the engineering risk.

Organizations are messy, and the process of developing and launching a space shuttle or any scientific product is complex and filled with trade-offs. This creates an interesting question about the site of ethics in decision-making. Over the last two years, Data & Society has been convening a Council on Big Data, Ethics, and Society where we’ve had intense discussions about how to situate ethics in the practice of data science. We talked about the importance of education and the need for ethical thinking as a cornerstone of computational thinking. We talked about the practices of ethical oversight in research, deeply examining the role of IRBs and the different oversight mechanisms that can and do operate in industrial research. Our mandate was to think about research, but, as I listened to our debates and discussions, I couldn’t help but think about the messiness of ethical thinking in complex organizations and technical systems more generally.

I’m still in love with NASA. One of my dear friends — Janet Vertesi — has been embedded inside different spacecraft teams, understanding how rovers get built. On one hand, I’m extraordinarily jealous of her field site (NASA!!!), but I’m also intrigued by how challenging it is to get a group of engineers and scientists to work together for what sounds like an ultimate shared goal. I will never forget her description of what can go wrong: Imagine if a group of people were given a school bus to drive, only they were each given a steering wheel of their own and had to coordinate among themselves which way to go. Introduce power dynamics, and it’s amazing what all can go wrong.

Like many college students, I was floored when I encountered Stanley Milgram’s famous electric shock experiment. Although I understood why ethics reviews came out of the work that Milgram did, I’ve never forgotten the moment when I fully understood that humans could do inhuman things because they’ve been asked to do so. Hannah Arendt’s work on the banality of evil taught me to appreciate, if not fear, how messy organizations can get when bureaucracies set in motion dynamics in which decision-making is distributed. While we think we understand the ethics of warfare and psychology experiments, I don’t think we have the foggiest clue how to truly manage ethics in organizations. As I continue to reflect on these issues, I keep returning to a college debate that has constantly weighed on me. Audre Lorde said, “the master’s tools will never dismantle the master’s house.” And, in some senses, I agree. But I also can’t see a way of throwing rocks at a complex system that would enable ethics.

My team at Data & Society has been grappling with different aspects of ethics since we began the Institute, often in unexpected ways. When the Intelligence and Autonomy group started looking at autonomous vehicles, they quickly realized that humans were often left in the loop to serve as “liability sponges,” producing “moral crumple zones.” We’ve seen this in organizations for a long time. When a complex system breaks down, who is to be blamed? As the Intelligence & Autonomy team has shown, this only gets more messy when one of the key actors is a computational system.

And that leaves me with a question that plagues me as we work on our Council on Big Data, Ethics, and Society whitepaper: How do we enable ethics in the complex big data systems that are situated within organizations, influenced by diverse intentions and motivations, shaped by politics and organizational logics, complicated by issues of power and control?

No matter how thoughtful individuals are, no matter how much foresight people have, launches can end explosively.

(This was originally posted on Points.)

What is the Value of a Bot?

Bots are tools, designed by people and organizations to automate processes and enable them to do something technically, socially, politically, or economically.

Most of the bots that I have built have been in the pursuit of laziness. I have built bots to sit on my server to check to see if processes have died and to relaunch them, mostly to avoid trying to figure out why the process would die in the first place. I have also built bots under the guise of “art.” For example, I built a bot to crawl online communities to quantitatively assess the interactions.

I’ve also written some shoddy code, and my bots haven’t always worked as intended. While I never designed them to be malicious, a few poorly thought through keystrokes had unintended consequences. One rev of my process-checker bot missed the mark and kept launching new processes every 30 seconds until it brought the server down. And in some cases, it wasn’t the bot that was the problem, but my own stupid interpretation of the information I got back from the bot. For example, I got the great idea to link my social bot designed to assess the “temperature” of online communities up to a piece of hardware designed to produce heat. I didn’t think to cap my assessment of the communities and so when my bot stumbled upon a super vibrant space and offered back a quantitative measure intended to signal that the community was “hot,” another piece of my code interpreted this to mean: jack the temperature up the whole way. I was holding that hardware and burnt myself. Dumb. And totally, 100% my fault.
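Both failures above come down to a missing limit. The original code isn't shown here, so what follows is only a minimal sketch of the kind of guard that would have helped, under made-up assumptions: the names `clamp`, `heat_level`, and `RestartBudget` are hypothetical, the 0 to 100 "temperature" scale is invented for illustration, and so are the restart limits.

```python
def clamp(value: float, low: float, high: float) -> float:
    """Force value into the closed range [low, high]."""
    return max(low, min(high, value))


def heat_level(community_score: float, max_score: float = 100.0) -> float:
    """Map a community "temperature" score to a heater power fraction.

    The 0..max_score scale is an assumption for this sketch. Whatever the
    bot reports, the result never leaves the range 0.0 to 1.0.
    """
    return clamp(community_score / max_score, 0.0, 1.0)


class RestartBudget:
    """Allow at most max_restarts relaunches within any window_seconds span."""

    def __init__(self, max_restarts: int, window_seconds: float) -> None:
        self.max_restarts = max_restarts
        self.window_seconds = window_seconds
        self.timestamps: list[float] = []

    def allow(self, now: float) -> bool:
        # Forget relaunches that have aged out of the window.
        self.timestamps = [
            t for t in self.timestamps if now - t < self.window_seconds
        ]
        if len(self.timestamps) >= self.max_restarts:
            # Budget spent: stop relaunching and leave it to a human.
            return False
        self.timestamps.append(now)
        return True


# A wildly "hot" community still maps to full power, never beyond it.
print(heat_level(250.0))

# A checker firing every 30 seconds gets three relaunches per five minutes.
budget = RestartBudget(max_restarts=3, window_seconds=300.0)
print([budget.allow(t) for t in (0.0, 30.0, 60.0, 90.0)])
```

Neither guard explains why the process died or why the score ran so high. They only bound the damage while someone figures that out.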

Most of the bots that I’ve written were slipshod, irrelevant, and little more than a nuisance. But, increasingly, huge systems rely on bots. Bots make search engines possible and, when connected to sensors, are often key to smart cities and other IoT instantiations. Bots shape the financial markets and play a role in helping people get information. Of course, not all bots are designed to be helpful to large institutions. Bots that spread worms, viruses, and spam are often capitalizing on the naivety of users. There are large networks of bots (“botnets”) that can be used to bring down systems (e.g., DDoS attacks). There are also pesky bots that mess with the ecosystem by increasing people’s Twitter follower counts, automating “likes” on Instagram, and creating the appearance of natural interest even when there is none.

Identifying the value of these different kinds of bots requires a theory of power. We may want to think that search engines are good, while fake-like bots are bad, but both enable the designer of the bots to profit economically and socially.

Who gets to decide the value of a bot? The technically savvy builder of the bot? The people and organizations that encounter or are affected by the bot? Bots are being designed for all sorts of purposes, and most of them are mundane. But even mundane bots can have consequences.

In the early days of search engines, many website owners were outraged by search engine bots, or web crawlers. They had to pay for traffic, and web crawlers were not seen as legitimate or desired traffic. Plus, they visited every page and could easily bring down a web server through their intensive crawling. As a result, early developers came together and developed a proposal for web crawler politeness, including a mechanism known as the “robots exclusion standard” (or robots.txt), which allowed a website owner to dictate which web crawler could look at which page.
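To make the mechanism concrete, here is a small sketch of how a polite crawler might consult such a file, using Python's standard `urllib.robotparser` module. The file contents, the crawler names (`greedybot`, `politebot`), and the example.com URLs are all invented for illustration. Note that the file is only a request: nothing in the standard forces a crawler to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. One named crawler is shut out of the
# whole site; every other crawler is asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: greedybot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A well-behaved crawler asks before each fetch.
print(parser.can_fetch("greedybot", "https://example.com/index.html"))
print(parser.can_fetch("politebot", "https://example.com/index.html"))
print(parser.can_fetch("politebot", "https://example.com/private/notes.html"))
```

With these rules the three checks print False, True, and False: the named crawler is refused everywhere, and everyone else is refused only under /private/.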

As systems get more complex, it’s hard for developers to come together and develop politeness policies for all bots out there. And it’s often hard for a system to discern between bots that are being helpful and bots that are a burden and not beneficial. After all, before Google was Google, people didn’t think that search engines could have much value.

Standards bodies are no longer groups of geeky friends hashing out protocols over pizza. They’re now structured processes involving all sorts of highly charged interests — they often feel more formal than the meeting of the United Nations. Given high-profile disagreements, it’s hard to imagine such bodies convening to regulate the mundane bots that are creating fake Twitter profiles and liking Instagram photos. As a result, most bots are simply seen as a nuisance. But how many gnats come together to make a wasp?

Bots are first and foremost technical systems, but they are derived from social values and exert power into social systems. How can we create the right social norms to regulate them? What do the norms look like in a highly networked ecosystem where many pieces of the pie are often glued together by digital duct tape?

(This was originally written for Points as part of a series on how to think about bots.)

It’s not Cyberspace anymore

It’s been 20 years — 20 years!? — since John Perry Barlow wrote “A Declaration of the Independence of Cyberspace” — a rant in response to the government and corporate leaders who descend on a certain snowy resort town each year as part of the World Economic Forum (WEF). Picture that pamphleteering with me for a moment…

Governments of the Industrial World, you weary giants of flesh and steel, I come from Cyberspace, the new home of Mind. On behalf of the future, I ask you of the past to leave us alone.

I first read Barlow’s declaration when I was 18 years old. I was in high school and in love with the Internet. His manifesto spoke to me. It was a proclamation of freedom, a critique of the status quo, a love letter to the Internet that we all wanted to exist. I didn’t know why he was in Davos, Switzerland, nor did I understand the political conversation he was engaging in. All I knew is that he was on my side.

Twenty years after Barlow declared cyberspace independent, I myself was in Davos for the WEF annual meeting. The Fourth Industrial Revolution was the theme this year, and a big part of me was giddy to go, curious about how such powerful people would grapple with questions introduced by technology.

What I heard left me conflicted and confused. In fact, I have never been made to feel more nervous and uncomfortable by the tech sector than I did at Davos this year.

Walking down the promenade through the center of Davos, it was hard not to notice the role of Silicon Valley in shaping the conversation of the powerful and elite. Not only was everyone attached to their iPhones and Androids, but companies like Salesforce and Palantir and Facebook took over storefronts and invited attendees in for coffee and discussions about Syrian migrants, while camouflaged snipers protected the scene from the roofs of nearby hotels. As new tech held fabulous parties in the newest venues, financial institutions, long the stalwarts of Davos, took over the same staid venues that they always have.

A Big Dose of AI-induced Hype and Fear

Yet, what I struggled with the most wasn’t the sheer excess of Silicon Valley in showcasing its value but the narrative that underpinned it all. I’m quite used to entrepreneurs talking hype in tech venues, but what happened at Davos was beyond the typical hype, in part because most of the non-tech people couldn’t do a reality check. They could only respond with fear. As a result, unrealistic conversations about artificial intelligence led many non-technical attendees to believe that the biggest threat to national security is humanoid killer robots, or that AI that can do everything humans can is just around the corner, threatening all but the most elite technical jobs. In other words, as I talked to attendees, I kept bumping into a 1970s science fiction narrative.

At first I thought I had just encountered the normal hype/fear dichotomy that I’m faced with on a daily basis. But as I listened to attendees talk, a nervous creeping feeling started to churn my stomach. Watching startups raise downrounds and watching valuation conversations moving from bubbalicious to nervousness, I started to sense that what the tech sector was doing at Davos was putting on the happy smiling blinky story that they’ve been telling for so long, exuding a narrative of progress: everything that is happening, everything that is coming, is good for society, at least in the long run.

Shifting from “big data,” because it’s become code for “big brother,” tech deployed the language of “artificial intelligence” to mean all things tech, knowing full well that decades of Hollywood hype would prompt critics to ask about killer robots. So, weirdly enough, it was usually the tech actors who brought up killer robots, if only to encourage attendees not to think about them. Don’t think of an elephant. Even as the demo robots at the venue revealed the limitations of humanoid robots, the conversation became frothy with concern, enabling many in tech to avoid talking about the complex and messy social dynamics that are underway, except to say that “ethics is important.” What about equality and fairness?

We are creating a world that all may enter without privilege or prejudice accorded by race, economic power, military force, or station of birth.

Barlow’s dreams echoed in my head as I listened to the tech elite try to convince the other elites that they were their solution. We all imagined that the Internet would be the great equalizer, but it hasn’t panned out that way. Only days before the Annual Meeting began, news media reported that the World Bank found that the Internet has had a role in rising inequality.

Welcome to Babel

Conversations around tech were strangely juxtaposed with the broader social and fiscal concerns that rattled through the halls. Faced with a humanitarian crisis and widespread anxieties about inequality, much of civil society responded to tech enthusiasm by asking if technology will destabilize labor and economic well-being. A fair question. The only problem is that no one knows, and the models of potential impact are so variable as to be useless. Not surprisingly, these conversations then devolved into sharply split battles, as people lost track of whether all jobs would be automated or whether automation would trigger a lot more jobs.

Not only did any nuance get lost in this conversation, but so did the messy reality of doing tech. It’s hard to explain to political actors why, just because tech can (poorly) target advertising, this doesn’t mean that it can find someone who is trying to recruit for ISIS. Just because advances in AI-driven computer vision are enabling new image detection capabilities, this doesn’t mean that precision medicine is around the corner. And no one seemed to realize that artificial intelligence in this context is just another word for “big data.” Ah, the hype cycle.

It’s going to be a complicated year geopolitically and economically. Somewhere deep down, everyone seemed to realize that. But somehow, it was easier to engage around the magnificent dreams of science fiction. And I was disappointed to watch as tech folks fueled that fire with narratives of tech that drive enthusiasm for it but are so disconnected from reality as to be a distraction on a global stage.

The Internet Is Us. Which Us?

When Barlow penned his declaration, he was speaking on behalf of cyberspace, as though we were all part of one homogeneous community. And, in some sense, we were. We were geeks and freaks and queers. But over the last twenty years, tech has become the underpinning of so many sectors, of so much interaction. Those of us who wanted cyberspace to be universal couldn’t imagine a world in which our dreams got devoured by Silicon Valley.

Tech is truly mainstream — and politically powerful — and yet many in tech still want to see themselves as outsiders. Some of Barlow’s proclamations feel a lot weirder in this contemporary light:

You claim there are problems among us that you need to solve. You use this claim as an excuse to invade our precincts. Many of these problems don’t exist. Where there are real conflicts, where there are wrongs, we will identify them and address them by our means. We are forming our own Social Contract.

There is a power shift underway and much of the tech sector is ill-equipped to understand its own actions and practices as part of the elite, the powerful. Worse, a collection of unicorns who see themselves as underdogs in a world where instability and inequality are rampant fail to realize that they have a moral responsibility.
They fight as though they are insurgents while they operate as though they are kings.

What makes me the most uncomfortable is the realization that most of tech seems to have forgotten the final statement that Barlow made:

May it be more humane and fair than the world your governments have made before.

We built the Internet hoping that the world would come. The world did, but the dream that drove so many of us in the early days isn’t the dream of those who are shaping the Internet today. Now what?

What If Social Media Becomes 16-Plus? New battles concerning age of consent emerge in Europe

At what age should children be allowed to access the internet without parental oversight? This is a hairy question that raises all sorts of issues about rights, freedoms, morality, skills, and cognitive capability. Cultural values also come into play full force on this one.

Consider, for example, that in the 1800s, the age of sexual (and marital) consent in the United States was between 10 and 12 (except Delaware, where it was seven). The age of consent in England was 12, and it’s still 14 in Germany. This is discomforting for many Western parents who can’t even fathom their 10- or 12-year-old being sexually mature. And so, over time, many countries have raised the age of sexual consent.

But the internet has raised new questions about consent. Is the internet more or less risky than sexual intercourse?
How can youth be protected from risks they cannot fully understand, such as the reputational risks associated with things going terribly awry? And what role should the state and parents have in protecting youth?

This ain’t a new battle. These issues have raged since the early days of the internet. In 1998, the United States passed a law known as the Children’s Online Privacy Protection Act (COPPA), which restricts the kinds of data companies can collect from children under 13 without parental permission. Most proponents of the law argue that this intervention has stopped countless sleazy companies from doing inappropriate things with children’s data.
I have a more cynical view.

Watching teens and parents navigate this issue — and then surveying parents about it — I came to the conclusion that the law prompted companies to restrict access to under-13s, which then prompted children (with parental knowledge) to lie about their age. Worse, I watched as companies stopped innovating for children or providing services that could really help them.

Proponents often push back, highlighting that companies could get parental permission rather than just restrict children. Liability issues aside, why would they? Most major companies aren’t interested in 12-year-olds, so it’s a lot easier to comply with the law by creating a wall than going through a hellacious process of parental consent.

So here we are, with a U.S. law that prompts companies to limit access to 13-plus, a law that has become the norm around the globe. Along comes the EU, proposing a new law to regulate the flow of personal data, including a provision that would allow individual countries to restrict children’s access to the internet at any age (with a cap at age 16).

Implicitly, this means the European standard is to become 16-plus, because how else are companies going to build a process that gives Spanish kids access at 14, German kids at 16, and Italian kids at 12?
Many in the EU are angry at how American companies treat people’s data and respond to values of privacy. We saw this loud and clear when the European Court of Justice invalidated the “safe harbor” and in earlier issues, such as “the right to be forgotten.” Honestly? The Europeans have a right to be angry. They’re so much more thoughtful on issues of privacy, and many U.S. companies pretty much roll their eyes and ignore them. But the problem is that this new law isn’t going to screw American companies, even if it makes them irritable. Instead, it’s going to screw kids. And that infuriates me.

Implicit in this new law — and COPPA more generally — is an assumption that parents can and should consent on behalf of their children. I take issue with both. While some educated parents have thought long and hard about the flows of data, the identity work that goes into reputation, and the legal mechanisms that do or don’t protect children, they are few and far between.

Most parents don’t have the foggiest clue what happens to their kids’ data, and giving them the power to consent sure doesn’t help them become more informed. Hell, most parents don’t have enough information to make responsible decisions for themselves, so why are we trusting them to know enough to protect their children?
We’re doing so because we believe they should have control, that they have the right to control and protect their children, and that no company or government should take this away.

The irony is that this runs completely counter to the UN Convention on the Rights of the Child, the treaty that most responsible countries signed. Every European country committed to making sure that children have the right to privacy, including a right to privacy from their parents. Psychotically individualistic and anti-government, the United States decided not to ratify this empowering treaty because it was horrifying to U.S. sensibilities that the government would be able to give children rights in opposition to parents. But European countries understood that kids deserved rights. So why is the EU now suggesting that kids can’t consent to using the internet?

This legislation is shaped by a romanticization of parent-child relationships and an assumption of parental knowledge that is laughable.

But what really bothers me are the consequences to the least-empowered youth. While the EU at least made a carve-out for kids who are accessing counseling services, there’s no consideration of how many LGBTQ kids are accessing sites that might put them in danger if their parents knew. There’s no consideration for kids who are regularly abused and using technology and peer relations to get support. There’s no consideration for kids who are trying to get health information, privately. And so on. The UN Convention on the Rights of the Child puts vulnerable youth front and center in protections. But somehow they’ve been forgotten by EU policymakers.

Child advocates are responding critically. I’m also hearing from countless scholars who are befuddled by and unsure of why this is happening. And it doesn’t seem as though the EU process even engaged the public or experts on these issues before moving forward. So my hope is that some magical outcry will stymie this proposal sooner rather than later. But I’m often clueless when it comes to how lawmakers work.

What baffles me the most is the logic of this proposal given the likely outcomes. We know from the dynamics around COPPA that, if given the chance, kids will lie about their age. And parents will help them. But even if we start getting parental permission, this means we’ll be collecting lots more information about youth, going against the efforts to minimize information. Still, most intriguing is what I expect this will do to the corporate ecosystem.

Big multinationals like Facebook and Twitter, which operate in the EU, will be required to follow this law. All companies based in the EU will be required to comply with this law. But what about small non-EU companies that do not store data in the EU or work with EU vendors and advertisers? It’s unclear if they’ll have to comply because they aren’t within the EU’s reach. Will this mean that EU youth will jump from non-EU service to non-EU service to gain access? Will this actually end up benefiting non-EU startups who are trying to challenge the big multinationals? But doesn’t this completely undermine the EU’s efforts to build EU companies and services?

I don’t know, but that’s my gut feeling when reading the new law.
While I’m not a lawyer, one thing I’ve learned in studying young people and technology is that when there’s a will, there’s a way. And good luck trying to stop a 15-year-old from sharing photos with her best friend when her popularity is on the line.

I don’t know what will come from this law, but it seems completely misguided. It won’t protect kids’ data. It won’t empower parents. It won’t enhance privacy. It won’t make people more knowledgeable about data abuses. It will irritate but not fundamentally harm U.S. companies. It will help vendors that offer age verification become rich. It will hinder EU companies’ ability to compete. But above all else, it will make teenagers’ lives more difficult, make vulnerable youth more vulnerable, and invite kids to be more deceptive. Is that really what we want?

(This was originally posted on Bright on Medium.)
