
January 26, 2006

Spying on Users

I have recently figured out how to read my log files and draw some basic conclusions about how people use my site. This was in part motivated by a thread Professor Edward Tufte started on his site.
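As a rough illustration of the kind of basic conclusions a log file supports, here is a minimal Python sketch. It assumes the server writes Apache's "combined" log format; the sample lines, addresses, and paths are made up for the example and are not from my own logs.

```python
import re
from collections import Counter

# Hypothetical sample lines in Apache "combined" log format. The IP
# addresses come from ranges reserved for documentation.
SAMPLE_LOG = """\
192.0.2.10 - - [26/Jan/2006:02:29:00 -0600] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"
192.0.2.10 - - [26/Jan/2006:02:30:12 -0600] "GET /archives/spying.html HTTP/1.1" 200 8042 "http://example.com/" "Mozilla/5.0"
198.51.100.7 - - [26/Jan/2006:03:01:45 -0600] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/4.0"
"""

# Capture only the fields used below: client IP, timestamp, request path, status.
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

def summarize(log_text):
    """Return (hits per path, hits per IP) for lines that match the pattern."""
    paths, ips = Counter(), Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        paths[match["path"]] += 1
        ips[match["ip"]] += 1
    return paths, ips

paths, ips = summarize(SAMPLE_LOG)
print("Distinct IP addresses:", len(ips))
print("Most requested pages:", paths.most_common(2))
```

Counting distinct IP addresses only approximates counting visitors, since several people can share one address and one person can show up under several.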

Here's a collection of things I've looked into over the years that get at the nitty-gritty technical concepts involved.

First, the privacy policy of an organization that thinks about this issue a great deal: The Electronic Frontier Foundation's privacy policy.

IP address tracers are readily available free services on the net and will generally lead the investigator to an internet service provider. Providers fundamentally have to track IP addresses and associate them with the people who are paying the bills. If the government is interested in an IP address, it can subpoena the billing records from the internet service provider and then send an agent to the physical address to pick up the person.

If one really wants to get into the weeds, some search phrases to start with are border gateway protocol and root server.

One may ask, "Why can't I just use a whois search to find the registrar of any IP address?" Well, not any more. As more and more machines were added to the internet, the original global routing table scheme filled up. Most ISPs now deliver packets to their subscribers through dynamically assigned IP addresses that they register en bloc. This saves on registration costs (not all subscribers have machines online at any given time), slows the growth of the global routing tables, and reduces the odds that too much information will be associated with one IP address. It also makes it harder to trace attacks, but it doesn't make it harder for governments to issue subpoenas.
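To make the "en bloc" idea concrete, here is a small Python sketch using the standard `ipaddress` module. The block below is a hypothetical stand-in (it is a range reserved for documentation, not any real ISP's allocation); the point is only that a lookup on an address inside the block leads to whoever holds the block, not to an individual subscriber.

```python
import ipaddress

# Hypothetical block that an ISP might register as a single unit.
# 203.0.113.0/24 is reserved for documentation, not a real allocation.
ISP_BLOCK = ipaddress.ip_network("203.0.113.0/24")

def in_block(address, block=ISP_BLOCK):
    """Return True if the address falls inside the registered block."""
    return ipaddress.ip_address(address) in block

print(ISP_BLOCK.num_addresses)   # 256 addresses covered by one registration
print(in_block("203.0.113.42"))  # True: traces back to the block holder
print(in_block("198.51.100.7"))  # False: belongs to some other block
```

Which subscriber held 203.0.113.42 at a given moment is something only the ISP's own records would show, which is where the subpoena comes in.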

Cookies, which the search engines also use, are a different story. Philip Greenspun has an excellent write-up on the spying potential of cookies (scroll down to the napkin drawing).

While resistance to a subpoena is probably argued on the assumption that the matter will end up in court (otherwise, why the subpoena?), merely delivering a subpoena can be very coercive. Many people and businesses would decide it is in their best interest to cooperate, rather than spend time and money resolving the issue.

The big picture remains the same: if the information is recorded, the government can get it unless it's privileged communication. That is, the witness's relationship to the client would have to be spousal, attorney-client, clergy-parishioner, or psychotherapist-client. Even these few privileges only come into play in court, and only bear on what is actually admitted as evidence. Nothing prevents the government from using the subpoenaed information for something else, once the information is in hand.

There is an interesting dilemma here: in order to know anything about my visitors, I have to collect some information for some period of time. But the first piece of information that has to be collected really is the IP address, which is fundamentally traceable. What is the best policy? To keep the information for no more than one month? Three months? Keep no log files at all and know nothing about who visits? From a consequentialist standpoint, I'm not sure it matters, as the content of this site is hardly controversial.
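Whatever the answer, a retention policy is easy to express in code. This is a minimal Python sketch, assuming log entries have already been reduced to (timestamp, IP) pairs with Apache-style timestamps; the function name, the sample entries, and the 30-day default are illustrative choices, not a recommendation.

```python
from datetime import datetime, timedelta, timezone

# Timestamp layout used in Apache access logs, e.g. "26/Jan/2006:02:29:00 -0600".
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

def keep_recent(entries, now, max_age_days=30):
    """Return only the entries no older than max_age_days relative to now."""
    cutoff = now - timedelta(days=max_age_days)
    return [
        (timestamp, ip)
        for timestamp, ip in entries
        if datetime.strptime(timestamp, TIME_FORMAT) >= cutoff
    ]

# Hypothetical entries; the addresses are from documentation ranges.
entries = [
    ("20/Dec/2005:10:00:00 -0600", "192.0.2.10"),
    ("26/Jan/2006:02:29:00 -0600", "198.51.100.7"),
]
now = datetime(2006, 2, 1, tzinfo=timezone.utc)

print(keep_recent(entries, now))                   # one-month policy
print(keep_recent(entries, now, max_age_days=90))  # three-month policy
```

With the one-month window only the January entry survives; with three months both do. Anything dropped this way is information that can no longer be handed over.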

Posted by Niels Olson at January 26, 2006 2:29 AM

Comments

Happy Birthday, Niels!

Posted by: Lee at January 29, 2006 11:31 AM
