My name is danah boyd and I'm a Principal Researcher at Microsoft Research and the founder/president of Data & Society. Buzzwords in my world include: privacy, context, youth culture, social media, big data. I use this blog to express random thoughts about whatever I'm thinking.

The Dark Side of Numbers: The Role of Population Data Systems in Human Rights Abuses.

Many people have heard me tell an anecdote that i learned while living in Holland: At the turn of the century, the Dutch government collected mass amounts of data about its citizens with good intentions. In order to give people proper burials, they included religion. In 1940, the Nazis invaded and captured that data in less than 3 days. A larger percentage of Dutch Jews died than Jews in any other Western European country because of this system.

Well, i’d been searching for a citation for a while. Tonight, i remembered to ask Google Answers and in less than an hour, had a perfect citation:

The Dark Side of Numbers: The Role of Population Data Systems in Human Rights Abuses. Social Research, Summer, 2001, by William Seltzer, Margo Anderson

The essay is even better than my anecdote and i truly believe that anyone in the business of doing data capture should be required to read this.

17 comments to The Dark Side of Numbers: The Role of Population Data Systems in Human Rights Abuses.

  • tony

    Isn’t this (sort of) what the U.S. is accused of doing, through the purchasing of private citizens’ personal info in Latin America (use of a third-party corp.)? I suspect it is for anti-terror reasons but…

  • Abe

    Thank you! Hopefully that will open some heads up. I’m perpetually stunned at how many people don’t realize that good intentions don’t always produce good results…

  • My friend Jim Fruchterman runs an online database of human rights abuses that is used in dangerous countries by human rights advocates. They keep the data secure in a server cloud, and people in scary places feel free to put it in. This is a problem that has some solutions. It’s part of Benetech. http://www.benetech.org/

  • The Dark Side of Numbers: The Role of Population Data Systems in Human Rights Abuses

    danah boyd: “Many people have heard me tell an anecdote that i learned while living in Holland: At the turn of the century, the Dutch government collected mass amounts of data about its citizens with good intentions.”

  • Great link. It is a shame information like this isn’t made more public. I’m sure many Americans would have a different view of the privacy abuses being made now if this were the case.

  • Kevin – the problem is that it is often the unexpected that becomes problematic simply because the data is there. No one thought that Holland was a dangerous country. Holland was even trying to stay out of WWII. They were invaded; their data was invaded. People were murdered en masse for their data, for their religion, for reasons that are beyond my understanding. We know how to operate when we’re under direct fear; we don’t know how to operate when we can’t imagine the unimaginable.

  • The Dark Side of Numbers: The Role of Population Data Systems in Human Rights Abuses

    Here’s an excellent and chilling example of the importance of thinking carefully about complex opportunities and problems.

  • Irina

    In this country, when I wanted to find 6000 people who have moved residences within the last 2 months, complete with their current address and distance moved, I was able to do that in a matter of 4-5 days for a price of less than $500. The data is already there, it’s available, it’s cheap. When I called up to get that data, no one was interested in my purposes.

    From what I hear, this is not much more difficult to do in Europe (varies from country to country of course). Especially in Germany, where every single citizen is registered. Maybe the more relevant question is not whether it’s dangerous to collect it (it is, but it’s collected already), but how to make it safer, how to put the much-needed checks and balances into the system, complete with technological solutions for anonymizing (but see Latanya Sweeney’s work at CMU), security, and a built-in ability to self-destruct.

  • It’s almost as if you’re saying that the government shouldn’t collect large amounts of data about you because a totalitarian fascist country may one day invade your country and get access to that data. Or, slightly more plausibly, that one day your own government might get taken over by fascists who use the data for corrupt purposes.

    While there is a risk to any data collection, it seems like the Dutch government did the right thing to start their data collection in 1900. Reading over this account, I’m not moved to say, “If we could only go back to 1900 and change history, we could avoid the tragedy of 1940.”

    Just because a course of action might allow something bad to happen doesn’t mean that that course of action should be avoided. Life is full of risk. It merely means you should keep your eyes open regarding the risks.

  • Actually, i do believe that the government should not collect large amounts of data just to collect large amounts of data. I think that the risks are too great that this can be abused. Life is full of risk, but there are times when the risks can be avoided through some conscientious decisions. I’m a huge believer of just cause when it comes to data collection – collect because it plays a significant role in the situation and purge as data is no longer needed.

    Or, should i say, sometimes “less is more.”

  • Understood. I don’t think anyone supports the collection of data merely for the sake of collecting data. Collecting data is expensive and therefore needs a justification. But census data is rather closely linked to the workings of democracy, yes? If I could play the devil’s advocate for a moment, I’d suggest census data does good as well as bad. Government aid and benefits are distributed partly on the basis of population density, income levels, and racial distribution. Not to mention, in America, House Reps and therefore Presidential Electors. To resist the collection of the data means not having the data that would allow these programs to be properly administered. It’s clearly an issue that the American people are wary of, otherwise Hollywood wouldn’t keep making movies like Enemy Of The State and The Net. Still, it’s hard for me to imagine how a democracy can work without a census.

    If you don’t think the Dutch government was right to collect the census data that it did, then what course of action do you feel would have been right? Is it the question about religion, in particular, that you feel they should have kept off their census back in 1900? Are there any other questions you feel it should have kept off its census? I’ll assume for the moment that you support a minimalist census in democracies. What data collection do you support on those censuses?

  • Census data is a good example of the kind of data that can be quickly anonymized and used to create exactly the kinds of models you are talking about – population density, income levels, racial distribution. There is no need to link it to individual people or specific geographic locations. Since those atrocities, this is the kind of thing that is done. I recognize the advantage of knowing certain numerical statistics about a population and its topology. That doesn’t mean that it needs to be associated with an authenticated body. The body-linked data can be destroyed immediately in this case.

  • Pardon my ignorance, but if “Since those atrocities, this is the kind of thing that is done” is a true statement, what is the problem? And how do you present a valuable look at population density or racial distribution without relating it to geography?

    Shirky brings up how social software is “privatizing census functions” on M2M; is this where you’re going with this? This is a much different discussion than population data as collected by government censuses (censi?).

  • Robert – i was referring to Europe. Europe is much more protective of personal data now because of WWII. The same does not hold in the States. Also, there’s a difference between presenting it based on geography and doing accurate location that pinpoints a particular home.

    One US government group that deals with this is the EPA. There are huge reports of endangered and rare plants, including counts of rare orchids. The geographical region that is used for those reports is often a few square miles. Thus, researchers can know where these plants are popping up, but the exact information is not given so that the plants won’t be removed.

    As for Shirky’s arguments – no, i’m not thinking about privatizing census functions at all. I’m not convinced that will work for some of the critical purposes of census usage.

  • srl

    You know about the fact that some of the WW2-era data collection and tabulation had an American profit motive behind the technology side of it, right?
