Justifications for the federal government’s controversial mass surveillance programs have involved the distinction between the contents of communications and associated “meta-data” about those communications. Finding out that two people spoke on the phone requires less red tape than listening to the conversations themselves. While “meta-data” doesn’t sound especially ominous, analysts can use graph theory to draw surprisingly powerful inferences from it. A funny illustration of that can be found in Kieran Healy’s blog post, Using Metadata to find Paul Revere.
On November 10, the Third Circuit Court of Appeals ruled that web browsing histories are “content” under the Wiretap Act. This implies that the government will need a warrant before collecting such browsing histories. Wired summarized the point the court was making:
A visit to “webmd.com,” for instance, might count as metadata, as Cato Institute senior fellow Julian Sanchez explains. But a visit to “www.webmd.com/family-pregnancy” clearly reveals something about the visitor’s communications with WebMD, not just the fact of the visit. “It’s not a hard call,” says Sanchez. “The specific URL I visit at nytimes.com or cato.org or webmd.com tells you very specifically what the meaning or purport of my communications are.”
Interestingly, the party accused of violating the Wiretap Act in this case wasn’t the federal government. It was Google. The court ruled that Google had collected content in the sense of the Wiretap Act, but that this was not a violation, because Google was itself a party to the communications and you can’t eavesdrop on your own conversation. I’m not an attorney, but the legal technicalities were well explained in the Washington Post.
The technical technicalities are also interesting.
Basically, a cookie is a secret between your browser and an individual web server. The secret is in the form of a key-value pair, like id=12345. Once a cookie is “set,” it will accompany every request the browser sends to the server that set the cookie. If the server makes sure that each browser it interacts with has a different cookie, it can distinguish individual visitors. That’s what it means to be “logged in” to a website: after you prove your identity with a username and password, the server assigns you a “session cookie.” When you visit https://www.example.com/my-profile, you see your own profile because the server read your cookie, and your cookie was tied to your account when you logged in.
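Here is a minimal sketch of that login flow, using only Python's standard library. The in-memory `sessions` dictionary and the function names `log_in` and `who_is_this` are my own inventions for illustration; a real site would add password checks, expiry, and persistent storage.

```python
from http.cookies import SimpleCookie
import secrets

# Hypothetical in-memory session store: cookie value -> username.
sessions = {}

def log_in(username):
    """Pretend the password check passed; return the cookie the server sets."""
    session_id = secrets.token_hex(16)  # random, so each browser gets a different value
    sessions[session_id] = username
    cookie = SimpleCookie()
    cookie["id"] = session_id
    return cookie["id"].OutputString()  # looks like "id=<32 hex characters>"

def who_is_this(cookie_header):
    """Read the Cookie header the browser sent back and look up the account."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("id")
    return sessions.get(morsel.value) if morsel else None

# The browser stores the cookie and sends it along with every later request.
stored_by_browser = log_in("alice")
print(who_is_this(stored_by_browser))  # prints "alice"
```

A cookie value the server never issued, such as `id=12345` here, maps to no account, so `who_is_this` returns `None`.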
A cookie is nothing more than a place for application developers to store short strings of data. These are some of the common security considerations with cookies:
- Is inappropriate data being stored in cookies?
- Can an attacker guess the values of other people’s cookies?
- Are cookies being sent across unencrypted connections?
OWASP has a more detailed discussion of cookie security here.
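As a rough illustration of the second and third questions, here is how a server might build a cookie that is hard to guess and that browsers will only send over encrypted connections. This is a sketch, not a complete hardening recipe; the function name `make_session_cookie` is hypothetical, and the choice of 32 random bytes is my assumption rather than a requirement.

```python
from http.cookies import SimpleCookie
import secrets

def make_session_cookie():
    """Build a Set-Cookie header value for a hard-to-guess session id."""
    cookie = SimpleCookie()
    # A long random value, so an attacker can't guess other people's cookies.
    cookie["id"] = secrets.token_urlsafe(32)
    # Secure: the browser only sends the cookie over HTTPS.
    cookie["id"]["secure"] = True
    # HttpOnly: scripts running in the page can't read the cookie.
    cookie["id"]["httponly"] = True
    cookie["id"]["path"] = "/"
    return cookie["id"].OutputString()

print(make_session_cookie())
```

The first question, whether inappropriate data is being stored, is addressed here only by convention: the cookie holds an opaque identifier, and anything sensitive stays on the server.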
When a user requests a web page and receives an HTML document, that document can instruct their browser to communicate with many different third parties. Should all of those third parties be able to track the user, possibly across multiple websites?
Enough people feel uncomfortable with third-party cookies that browsers include options for disabling them. The case before the Third Circuit Court of Appeals was about Google’s practices in 2012, which involved exploiting a browser bug to set cookies in Apple’s Safari browser, even when users had explicitly disabled third-party cookies. Consequently, Google was able to track individual browsers across multiple websites. At issue was whether the list of URLs the browser visited constituted “content.” The court ruled that it did.
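To see why a single third-party cookie is enough to build a browsing history, here is a toy simulation of the tracker's side. It assumes that each request for the tracker's embedded resource carries both the tracker's cookie and a Referer header naming the embedding page, which browsers commonly send but can be configured to withhold. The `news.example` URLs, the cookie values, and the function name are made up, and this is not a description of Google's actual system.

```python
from collections import defaultdict

# Tracker-side log: cookie value -> pages on which the tracker's content loaded.
histories = defaultdict(list)

def handle_tracker_request(cookie_value, referer):
    """Simulate the third party receiving one request for its embedded resource."""
    histories[cookie_value].append(referer)

# One browser, carrying the tracker's cookie id=12345, visits two unrelated sites
# that both embed content from the tracker.
handle_tracker_request("12345", "https://www.webmd.com/family-pregnancy")
handle_tracker_request("12345", "https://news.example/politics/some-article")

# A different browser has a different cookie, so its visits are kept separate.
handle_tracker_request("67890", "https://news.example/sports/other-article")

for page in histories["12345"]:
    print(page)
```

Neither website shares anything with the other, yet the tracker ends up with a per-browser list of full URLs, paths included.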
The technical details of what Google was doing are described here.
To recap: Apple and Google had a technical arms race about tracking cookies. There was a lawsuit, and the ruling implies that the government needs a warrant to look at browsing histories, because URL paths and query strings are very revealing.
The court suggested that there’s a distinction to be made between the domain and the rest of the URL, but that suggestion was not legally binding.
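For the curious, that distinction is easy to draw mechanically. The sketch below uses `urlsplit` from Python's standard library to separate the domain from the path and query string; the function name `split_url` is mine, and where exactly the legal line falls is the court's business, not this code's.

```python
from urllib.parse import urlsplit

def split_url(url):
    """Split a URL into (domain, everything after the domain)."""
    parts = urlsplit(url)
    rest = parts.path
    if parts.query:
        rest += "?" + parts.query
    return parts.hostname, rest

print(split_url("https://www.webmd.com/family-pregnancy"))
# prints ('www.webmd.com', '/family-pregnancy')
```

Note that `urlsplit` needs the scheme (`https://`) to recognize the domain; a bare string like `www.webmd.com/family-pregnancy` would be treated entirely as a path.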