Networking: July 2008 Archives


July 4, 2008

As you've no doubt heard, the District Court has ruled that YouTube has to hand over their entire database of who has watched each video. A little web searching didn't turn up the original motion, but the ruling is here:
Plaintiffs seek all data from the Logging database concerning each time a YouTube video has been viewed on the YouTube website or through embedding on a third-party website. Pls.' Mot. 19.

They need the data to compare the attractiveness of allegedly infringing videos with that of non-infringing videos. A markedly higher proportion of infringing-video watching may bear on plaintiffs' vicarious liability claim, and defendants' substantial non-infringing use defense.

As others have noted, the claim that IP addresses and login names aren't personally identifying isn't very credible. [Though, did you notice what the judge cites as evidence that they're not personally identifying? A post by Alma Whitten on Google's policy blog.]

Ignoring that, though, Viacom certainly doesn't need access to the entire database to answer this question; a small statistical sample would be plenty. Moreover, with the question as phrased above, you don't need the identities of the people downloading the videos at all: you just need to know the number of times each video was downloaded in any given time period. If you're truly worried about this being distorted by multiple downloads by the same viewer (which seems unlikely), you can assign identifiers in a separate sequence for each video. E.g., the first person who downloads video A gets identifier 1, the second gets identifier 2; the first person to download B gets 1, etc. [You can use random identifiers too, for better privacy.] If all you want to do is compare popularity, there's no need to link viewers between videos.
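To make the per-video numbering concrete, here is a minimal sketch in Python. It assumes the log can be reduced to (viewer, video) pairs; the function names, the `random_ids` flag, and the sample data are all made up for illustration and have nothing to do with YouTube's actual Logging database.

```python
import secrets
from collections import Counter, defaultdict


def pseudonymize_views(view_log, random_ids=False):
    """Replace each viewer with an identifier that is only meaningful
    within a single video, so viewers cannot be linked across videos.

    view_log: iterable of (viewer, video) pairs (hypothetical log format).
    """
    ids_by_video = defaultdict(dict)  # video -> {viewer: per-video identifier}
    released = []
    for viewer, video in view_log:
        ids = ids_by_video[video]
        if viewer not in ids:
            # Sequential: 1, 2, 3, ... restarting for every video.
            # Random: an unguessable token drawn independently per video.
            ids[viewer] = secrets.token_hex(8) if random_ids else len(ids) + 1
        released.append((video, ids[viewer]))
    return released


def popularity(released):
    """Return (total views, distinct viewers) per video."""
    total = Counter(video for video, _ in released)
    distinct = Counter(video for video, _ in set(released))
    return total, distinct


log = [("alice", "A"), ("bob", "A"), ("alice", "A"), ("alice", "B")]
released = pseudonymize_views(log)
total, distinct = popularity(released)
```

With the sample log, video A shows three views from two distinct viewers and video B shows one view from one viewer, which is everything a popularity comparison needs. Nothing in the released data says that viewer 1 of A and viewer 1 of B are the same person.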

On the other hand, what this database would be useful for is identifying and pursuing the users who uploaded and downloaded videos Viacom claims infringe. For that, you would want both the identities of users (so you know who to go after) and the whole database (so you can identify everyone).