Behind every algorithm, there be politics.

In my first class in computer science, I was taught that an algorithm is simply a way of expressing formal rules given to a computer. Computers like rules. They follow them. Turns out that bureaucracy and legal systems like rules too. The big difference is that, in the world of computing, we call those who try to find ways to circumvent the rules "hackers," but in the world of government, this is simply the mundane work of politicking and lawyering.

When Dan Bouk (and I, as an earnest student of his) embarked on a journey to understand the history of the 1920 census, we both expected to encounter all sorts of politicking and lawyering. As scholars fascinated by the census, we’d heard the basics of the story: Congress failed to reapportion itself after receiving data from the Census Bureau because of racist and xenophobic attitudes mixed with political self-interest. In other words, politics. 

As we dove into this history, the first thing we realized was that one justification for non-apportionment centered on a fight about math. Politicians seemed to be arguing with each other over which algorithm was the right algorithm with which to apportion the House. In the end, they basically said that apportionment should wait until mathematicians could figure out what the “right” algorithm was. (Ha!) The House didn’t manage to pass an apportionment bill until 1929 when political negotiations had made this possible. (This story anchors our essay on “Democracy’s Data Infrastructure.”)

Dan kept going, starting with what seemed like a simple question: what makes Congress need an algorithm in the first place? I bet you can't guess what the answer is! Wait for it… wait for it… Politics! Yes, that's right, Congress wanted to cement an algorithm into its processes in a vain attempt to de-politicize the reapportionment process. With a century of extra experience with algorithms, this is patently hysterical. Algorithms as a tool to de-politicize something!?!? Hahahah. But, that's where they had gotten to. And now the real question was: why?

In his newest piece – "House Arrest: How an Automated Algorithm Constrained Congress for a Century" – Dan peels back the layers of history with beautiful storytelling and skilled analysis to reveal why our contemporary debates about algorithmic systems aren't so very new. Turns out that there were a variety of political actors deeply invested in ensuring that the People's House stopped growing. Some of their logics were rooted in ideas about efficiency, but some were rooted in much older ideas of power and control. (Don't forget that the electoral college is tethered to the size of the House too!) I like to imagine power-players sitting around rubbing their hands together and saying mwah-ha-ha-ha as they strategize over constraining the growth of the size of the House. They wanted to do this long before 1920, but it didn't get locked in then because they couldn't agree, which is why they fought over the algorithm. By 1929, everyone was fed up and just wanted Congress to properly apportion, and so they passed a law, a law that did two things: it stabilized the size of the House at 435 and it automated the apportionment process. Those two things – the size of the House and the algorithm – were totally entangled. After all, an automated apportionment couldn't happen without the key variables being defined.
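If you want to see what "automated apportionment" means in practice, here's a minimal sketch of the Huntington-Hill "method of equal proportions" that the automated process eventually settled on. The state names and populations below are made up purely for illustration – this is a toy, not the Census Bureau's production code.

```python
import heapq
from math import sqrt

def apportion(populations, house_size=435):
    """Toy sketch of the Huntington-Hill "method of equal proportions."

    Every state starts with one seat (the constitutional minimum). Each
    remaining seat goes to whichever state currently has the highest
    priority value: population / sqrt(n * (n + 1)), where n is the
    number of seats that state already holds.
    """
    seats = {state: 1 for state in populations}
    # heapq is a min-heap, so priorities are negated to pop the max first.
    heap = [(-pop / sqrt(1 * 2), state) for state, pop in populations.items()]
    heapq.heapify(heap)

    for _ in range(house_size - len(populations)):
        _, state = heapq.heappop(heap)
        seats[state] += 1
        n = seats[state]
        heapq.heappush(heap, (-populations[state] / sqrt(n * (n + 1)), state))
    return seats

# Hypothetical populations, purely for illustration -- not census data.
toy_states = {"A": 8_000_000, "B": 3_500_000, "C": 1_200_000, "D": 600_000}
print(apportion(toy_states, house_size=30))
```

The thing to notice is that once house_size is hard-coded, the whole allocation runs without Congress making a single decision – which is exactly the entanglement between the size of the House and the algorithm described above.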

Of course, that’s not the whole story. That 1929 bill was just a law. Up until then, Congress had passed a new law every decade to determine how apportionment would work for that decade. But when the 1940 census came around, they were focused on other things. And then, in effect, Congress forgot. They forgot that they have the power to determine the size of the House. They forgot that they have control over that one critical variable. The algorithm became infrastructure and the variable was summarily ignored.

Every decade, when the Census data are delivered, there are people who speak out about the need to increase the size of the House. After all, George Washington only spoke once during the Constitutional Convention. He spoke up to say that we couldn't possibly have Congresspeople represent 40,000 people each, because then the people wouldn't trust their government! The constitutional writers listened to him and set the minimum at 30,000; today, our representatives each represent more than 720,000 of us.

After the 1790 census, there were 105 representatives in Congress. Every decade, that number would increase. Even though it wasn't exact, there was an implicit algorithm in that growth: increase the size of the House so that no sitting member would lose his seat. After all, Congress had to pass that bill, and this was the best way to get everyone to vote for it. The House didn't grow at the same rate as the population, but it did grow every decade until 1910. And then it stopped (with extra seats given to new states before being brought back to the zero-sum game at the next census).

One of the recommendations of the Commission on the Practice of Democratic Citizenship (for which I was a commissioner) was to increase the size of the House. When we were discussing this as a commission, everyone spoke of how radical this proposition was, how completely impossible it would be politically. This wasn’t one of my proposals – I wasn’t even on that subcommittee – so I listened with rapt curiosity. Why was it so radical? Dan taught me the answer to that. The key to political power is to turn politicking into infrastructure. After all, those who try to break a technical system, to work around an algorithm, they’re called hackers. And hackers are radical. 

Want more like this?

  1. Read “House Arrest: How an Automated Algorithm Constrained Congress for a Century” by Dan Bouk. There’s drama! And intrigue! And algorithms!
  2. Read “Democracy’s Data Infrastructure” by Dan Bouk and me. It might shape your view about public fights over math.
  3. Sign up for my newsletter. More will be coming, I promise!

Joyfully Geeking Out

2020 US Census: Everybody counts!

In 2015, I was invited to join the Commerce Department's Data Advisory Council. Truth be told, I was kinda oblivious to what this was all about. I didn't know much about how the government functioned. I didn't know what a "FACA" was. (Turns out that the "Federal Advisory Committee Act" is a formal government thing.) Heck, I only had the most cursory understanding of the various agencies and bureaus associated with the Commerce Department. But I did understand one thing: the federal government has some of the most important data infrastructure out there. Long before discussions about our current tech industry, government agencies were wrangling data to help both the public and industry. The Weather Channel wouldn't be able to do its work without NOAA (National Oceanic and Atmospheric Administration). Standards would go haywire without NIST (National Institute of Standards and Technology). And we wouldn't be able to apportion our representatives without the Census Bureau.

Over the last few years, I have fallen madly in love with the data puzzles that underpin the census. Thanks to Margo Anderson's "The American Census," I learned that the history of the census is far far far messier than I ever could've imagined. An amazing network of people dedicated to helping ensure that people are represented has given me a crash course in the longstanding battle over collecting the best data possible. As the contours of the 2020 census became more visible, it also became clear that it would be the perfect networked fieldsite for trying to understand two questions that have been tickling my brain:

  1. What makes data legitimate?
  2. What does it take to secure data infrastructure? 

(For any STS scholar reading this, add scare-quotes to all of the words that make you want to scream.)

Over the last two years, I've been learning as much as I could possibly learn about the census. I've also been dipping my toe into archival work and trying to strengthen my theoretical toolkit to handle the study of organizations and large-scale operations. And now we're a matter of days away from when everyone in the country will receive their invitation to participate in the census, so I'm throwing myself into what is bound to be a whirlwind in order to fully understand how an operation of this magnitude unfolds.

While I have produced a living document to explain how differential privacy is part of the 2020 census, I mostly haven't been writing about the research I'm doing. To be honest, I'm relishing taking the time to deeply understand something and to do the deep reflection I haven't had the privilege of doing in almost a decade.
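(For those who haven't followed the differential privacy debate: the core idea is to add calibrated random noise to published counts so that no individual's data can be reverse-engineered from them. The snippet below is a generic Laplace-mechanism sketch with made-up counts and a made-up privacy budget – it is not the Census Bureau's far more elaborate production system, just the basic intuition.)

```python
import numpy as np

def noisy_count(true_count, epsilon, sensitivity=1.0):
    """Return a count protected by the Laplace mechanism.

    Noise scaled to sensitivity / epsilon means that adding or removing
    any one person changes the output distribution only slightly -- the
    basic differential privacy guarantee.
    """
    return true_count + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

# Hypothetical block-level counts and privacy budget, for illustration only.
true_counts = {"block_1": 112, "block_2": 37, "block_3": 4}
epsilon = 0.5
print({block: round(noisy_count(count, epsilon), 1) for block, count in true_counts.items()})
```

Even in this toy form, you can see the policy question lurking in epsilon: smaller values mean more privacy and noisier data.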

If I've learned anything from the world of census junkies, it's that this decadal process is raw insanity, full of unexpected twists and turns. Yet, what I can say is that it's also filled with some of the most civic-minded people that I've ever encountered. There are so many different stakeholders trying to ensure that we get a good count so that everyone in this country is counted, represented, and acknowledged. This is important, not just for Congressional apportionment and redistricting, but also to make sure that funding is properly allocated, that social science research can inform important decision-making processes, and that laws designed to combat discrimination are enforced.

I’m sharing this now, not because I have new thinking to offer, but because I want folks to understand why I might be rather unresponsive to non-census-obsessives over the next few months. I want to dive head-first into this research and relish the opportunity to be surrounded by geeks engaged in a phenomenal civic effort. For those who aren’t thinking full-time about the census, please understand that I’m going to turn down requests for my time this spring and my email response time may also falter. 

Of course, if you want to make me smile, send me photographs of cool census stuff happening in your community! Or interesting census content that comes through your feeds! And if you want to go hog wild, get involved. The Census Bureau is hiring. Or you could make census-related content to encourage others to participate. Or at the very least, tell everyone you know to participate; they'll get their official invitation starting March 12.

The US census has been taking place every 10 years since 1790. It is our democracy's data infrastructure. It was "big data" before there was big data. It's also the cornerstone of countless advances in statistics and social scientific knowledge. Understanding the complexity of the census is part and parcel of understanding where our data-driven world is headed. When this is all over, I hope that I'll have a lot more to contribute to that conversation. In the meantime, forgive me for relishing my obsessive focus.