If you have heard anything about the NSA this month, you have heard grand statements and sweeping generalizations. More than likely you have heard a whole gallery of commentators try to relate the news to ideals like “liberty”, “security”, and “privacy”, as if we could all agree about what those ideals mean. In the technology world we have a saying, “code is law”, to remind everyone that the systems we build are not governed by our ideals; they are governed by the practical way we put them together. What the NSA has built is a tool: a system of technology, personnel, and regulations. To judge this tool based on the ideals of those involved or the reasons for its creation is a job for pundits. Us? We know to look at the code.

Prisms, internet giants, and James Bond.

So, what exactly is the “code” of a national surveillance system? Unpacking the avalanche of NSA information this month, we can see three major components of the system: collection of wholesale raw data, use of private companies as data refineries, and collaboration with other spy agencies, including the British NSA equivalent, GCHQ. These three components determine how the system works, what its limitations are, and what it is capable of. They are its “code”, and each has important ramifications for the system as a whole, so we will look at each in turn.

Carbon copying the internet

Of all the NSA programs revealed recently, PRISM has gotten perhaps the most press. We will be focusing on the specifics of this program in the next section, but it is worth mentioning here for its name alone. Have you ever wondered why they would name a data collection program “Prism”? While the actual reasons are still classified, my guess is that the name is an homage to the NSA’s practice of using actual glass, prism-like devices for data collection.

Glass is useful for data collection because most internet traffic that travels any distance is converted into patterns of light and sent over fiber optic cables. If you can tap into the fiber optic cable and install a prism-like device, you can split that light, sending part of it further down the line as intended while sending a duplicate copy somewhere else. We learned back in 2006 that the NSA had begun installing prism-like “splitter” devices on all the major fiber optic cables in the country, setting up secret rooms at the nation’s leading phone and internet companies to capture copies of everything flowing over the network.

Notice that this approach is only useful when you want to copy everything going over a cable; you cannot, for instance, have the splitter recognize what information is bound for overseas and what is just moving over to the next town. Once you get down to the actual cables all our communications run through, all our data looks the same. This is fundamentally important because the NSA is legally prohibited from monitoring US citizens but, once you tap into the cables, the only way to make sure that you will end up with the particular data you want is to take all of it and look through it later. While the NSA has varied what portions of this information it keeps, and under what legal authority it claims the right to keep them, those changes are governed by internal decisions at the agency, not by the technology of the system itself.
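As a rough illustration of the point above, here is a minimal Python sketch of a splitter that duplicates traffic without looking at it. The Packet fields, the country codes, and the function names are all assumptions made up for this example; they do not describe any real NSA system or actual network equipment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    # Hypothetical fields, invented for illustration only.
    src_country: str
    dst_country: str
    payload: bytes


def split(stream):
    """Model of an optical splitter: every packet is both forwarded and copied.

    The splitter never inspects a packet, so it cannot choose what to copy.
    """
    forwarded, copied = [], []
    for packet in stream:
        forwarded.append(packet)  # continues down the line as intended
        copied.append(packet)     # duplicate sent somewhere else
    return forwarded, copied


def select_foreign(copied):
    """Selection can only happen later, on the full copy already taken."""
    return [
        p for p in copied
        if p.src_country != "US" or p.dst_country != "US"
    ]
```

In this sketch the decision about what to keep lives entirely in select_foreign, which is ordinary software that can be changed at will. The split step has no such choice to make: it hands over everything.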

Your Permanent Record

It is impossible to say just how much of this raw data the NSA has kept since 2001. Because there are no legal restrictions on storing information about non-US citizens, the recently disclosed documents pay little attention to the issue. We have learned that in Germany alone the NSA collects half a billion records a month. One possible indication of the scale of the data being stored is the new $2 billion data center the NSA is opening this September: estimates are that it will be able to store all the traffic that moves over the internet for years to come.

For US citizens we know that the NSA collected a nearly complete index of all emails sent between 2001 and 2011, when they halted the program for “operational and resource reasons”. This index includes a record of each email sent, who sent it, and what computer network they were on when sending it. They appear to have collected some form of credit card transaction history, likely a list of purchase times, amounts, and merchants. Similarly, the NSA has been collecting records of all phone calls made on US carriers: what numbers were called, how long each call lasted, and, potentially, where callers were if they were using mobile phones. This sort of communications history for an individual has historically been called a “pen register”, and government agencies normally need a court order to create one. The NSA argues that they are not governed by these rules because they collect data in bulk and only search through it later, while the older laws were designed for devices that did both at once. This recording of phone activity is still going on today.

In the press this index of everyone’s activity is referred to as “metadata” because it is information about our communications but not the contents of those communications. Storing the contents of our communications would run afoul of wiretapping laws and would require many times more storage than keeping an index does. Until that new data center goes online, such activity might be operationally difficult for the NSA as well as legally treacherous. Instead, the NSA keeps an index of our communications and, whenever they want to see the contents, they request them from the tech companies that run our email and social networks.
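To make the difference between an index and the contents concrete, here is a small Python sketch. The record layout is a guess based only on the fields described above (that an email was sent, who sent it, and what network they were on); the class names, field names, and functions are hypothetical and are not taken from any disclosed document.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Email:
    sender: str
    sent_at: datetime
    network: str  # hypothetical label for the network the sender was on
    body: str     # the contents: by far the largest part of the message


@dataclass(frozen=True)
class IndexEntry:
    # "Metadata": information about the message, with no body field at all.
    sender: str
    sent_at: datetime
    network: str


def to_index_entry(email: Email) -> IndexEntry:
    """Keep the facts about the message and drop its contents."""
    return IndexEntry(email.sender, email.sent_at, email.network)


def build_index(emails):
    """Turn a pile of messages into a much smaller index of entries."""
    return [to_index_entry(e) for e in emails]
```

An entry stays a few dozen bytes no matter how long the message was, which is one way to see why an index is so much cheaper to store than the contents.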

Tomorrow we will look at the role that private companies play in distilling our data: Part 2.