
January 25, 2006

Class Lecture Audio Statistics

I record the audio of all our class lectures and post the recordings on my site so anyone can download them. To date the lectures have been downloaded 844 times. The average lecture has been downloaded 17.2 times, ranging from 42 downloads for Dr Blake's collagen biochemistry lecture to 2 downloads for Physiology's Membrane Transport Three lecture. Biochemistry's 13 lectures have been downloaded 299 times (range: 4 to 42), Physiology's 21 lectures 268 times (2 to 23), and Neuroscience's 15 lectures 277 times (2 to 41).
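As a quick sanity check, the per-course figures quoted above can be added up to confirm the overall total and average. This is only a sketch of the arithmetic; the dictionary layout and variable names are my own, and the per-course averages it prints are derived from the quoted totals rather than taken from the server logs.

```python
# Per-course figures quoted in the post: (number of lectures, total downloads).
courses = {
    "Biochemistry": (13, 299),
    "Physiology": (21, 268),
    "Neuroscience": (15, 277),
}

total_lectures = sum(n for n, _ in courses.values())
total_downloads = sum(d for _, d in courses.values())
overall_mean = total_downloads / total_lectures

print(f"{total_lectures} lectures, {total_downloads} downloads, "
      f"mean {overall_mean:.1f} per lecture")

# Derived per-course averages (not stated in the post itself).
for name, (n, d) in courses.items():
    print(f"{name}: {d / n:.1f} downloads per lecture")
```

The three course totals sum to 844 downloads over 49 lectures, which matches the overall average of 17.2.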

Two major populations visit the site. The first is the normal internet traffic, which plots essentially as a horizontal bar: average download size in this group is independent of visit rate and simply varies around the site's average file size. This is the grey arm in the graph. The second is a small group (the green arm) that visits regularly, but not exorbitantly, and downloads huge amounts of data, 1000 times more than the normal traffic!

A few additional groups broke out, but I'm speculating about what they are. A tight cluster of machines has visited between 450 and 800 times, and each has downloaded 15 to 20 KB. That's about how big my homepage is. There's a cluster in red that seems to come out of the cloud and increase until plateauing at 20 MB. I think these are the search engines, and each one has probably accidentally downloaded an audio file (each is about 10 MB) and quickly learned not to do that again! There's also a group that downloads slightly, but clearly, more than the typical user (the vertical axis is log scale). I'm completely speculating that these are spammers, but I think the number of machines is roughly the same as the number of spam hits I get on my blog. I'm about 10% sure though. Finally, note the abrupt drop in activity at 1000 hits. Those labels aren't covering up any points; the points just stop. I suspect this will grow longer over time. That, at least, is a testable hypothesis.

It will be interesting to run this again after the tests next week.

Visitation (hits) Vs Bandwidth
The number of hits is not the number of times a person visited; it is the number of times that computer asked for a file. Sometimes the machine has already downloaded the file very recently, so the server tells it to check its cache. A page with images also pushes the hit count up because each image is a separate file request.
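To make the distinction concrete, here is a minimal sketch of how hits and bandwidth could be tallied per machine from a server access log. It assumes the log is in Common Log Format, which the post does not state; the `summarize` function, the sample lines, and the addresses in them are all hypothetical. Note how a cached request (status 304, no body) still counts as a hit but adds no bytes.

```python
import re
from collections import defaultdict

# Assumed Common Log Format: host ident user [time] "request" status bytes
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" (\d{3}) (\d+|-)')

def summarize(lines):
    """Return {host: [hits, bytes]} for the given log lines."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # skip lines that do not look like log entries
        host, _status, size = m.groups()
        totals[host][0] += 1              # every request is one hit
        if size != "-":                   # "-" means no body was sent
            totals[host][1] += int(size)
    return dict(totals)

# Hypothetical sample: one page view with an image, one cached revisit,
# and one audio download from a second machine.
sample = [
    '192.0.2.1 - - [25/Jan/2006:09:00:00 -0600] "GET /index.html HTTP/1.1" 200 15000',
    '192.0.2.1 - - [25/Jan/2006:09:00:01 -0600] "GET /logo.png HTTP/1.1" 200 4000',
    '192.0.2.1 - - [25/Jan/2006:09:05:00 -0600] "GET /index.html HTTP/1.1" 304 -',
    '192.0.2.2 - - [25/Jan/2006:09:10:00 -0600] "GET /audio/lecture.mp3 HTTP/1.1" 200 10000000',
]

summary = summarize(sample)
for host, (hits, nbytes) in summary.items():
    print(host, hits, nbytes)
```

In this made-up sample the first machine registers three hits for roughly one and a half page views, while the second registers a single hit that accounts for nearly all the bandwidth.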

Posted by Niels Olson at January 25, 2006 9:36 AM
