Filters 101: User Generated Content

With the soaring popularity of User Generated Content sites (like YouTube and MySpace), I have had many people ask me how filters handle that content. I thought it was time to sit down and write a blog post about it, so here goes.

First of all, let me state very clearly that I am a huge supporter of filters, and believe that every computer should have an updated and operational filter installed. Having said that, filters are far from a perfected technology, and they don’t deal well with user-generated content.

Before we can talk about YouTube, we need to understand a little bit about how filters work. There are essentially two things that a filter can use to decide whether to display a page: it can look at the URL, or it can use the content of the page to determine what category the page falls into. The former is like blocking a channel on your TV, and the latter is like blocking shows based on their rating. The main difference between a computer filter and your TV parental controls, though, is that the computer filter attempts to “rate” the content on the fly, while content on the TV carries a standard rating. Television filtering is a much easier problem: the parental control only needs to read a standard rating in the stream, and can then enforce your choices for your family. The computer filter has to use sophisticated linguistic algorithms to work out what the content is, which is a much less accurate process.
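To make the two approaches concrete, here is a minimal sketch in Python. It is not how any particular filter product works: the host names, keyword lists, and the `min_hits` threshold are all made up for illustration, and real filters use far more sophisticated language analysis than counting keywords.

```python
import re
from urllib.parse import urlparse

# Hypothetical block list and keyword categories, for illustration only.
BLOCKED_HOSTS = {"blocked.example"}
CATEGORY_KEYWORDS = {
    "gambling": {"casino", "poker", "betting"},
    "violence": {"gore", "shooting"},
}
BLOCKED_CATEGORIES = {"gambling"}


def host_is_blocked(url):
    """URL check: like blocking a whole TV channel."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == b or host.endswith("." + b) for b in BLOCKED_HOSTS)


def categorize(page_text, min_hits=2):
    """Content check: guess categories from the words on the page."""
    words = re.findall(r"[a-z]+", page_text.lower())
    found = set()
    for category, keywords in CATEGORY_KEYWORDS.items():
        hits = sum(1 for w in words if w in keywords)
        if hits >= min_hits:
            found.add(category)
    return found


def should_block(url, page_text):
    """Block if the host is listed or the text lands in a blocked category."""
    if host_is_blocked(url):
        return True
    return bool(categorize(page_text) & BLOCKED_CATEGORIES)
```

Note that the content check only ever sees text. If the page has few words on it, there is nothing to count.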

So, how does this all apply to User Generated Content? Sites like YouTube and MySpace allow anyone to create content and upload it for others to view. This content does not go through any type of standard rating system, and when the uploaded content is video or images, the linguistic algorithms that filters use are relatively useless. This means that unless you block the entire site by adding its URL to your block list, the site is largely unfiltered. If enough people type comments on the page that describe what the video is about, then the linguistic algorithms have something to work from, and they will pick up the page and categorize it. But that categorization is based only on the text of the comments, which is a very unreliable way of categorizing the video itself.
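Here is a small Python sketch of that blind spot. The word list, the threshold, and the function names are hypothetical, chosen only to show the idea: a text-based check sees the title and the comments on a video page, never the video.

```python
import re

# Hypothetical list of words a text-based filter might react to.
FLAGGED_WORDS = {"explicit", "nsfw", "graphic"}


def flagged_hits(text):
    """Count flagged words in a piece of text."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(1 for w in words if w in FLAGGED_WORDS)


def page_is_flagged(title, comments, min_hits=2):
    """Judge a video page using only its visible text.

    The video itself never enters the calculation, because a
    text-based filter cannot read it.
    """
    visible_text = " ".join([title, *comments])
    return flagged_hits(visible_text) >= min_hits


# A mislabeled video with no comments yet passes the check.
print(page_is_flagged("Funny cartoon clip", []))

# The same page is caught only after viewers describe it in comments.
print(page_is_flagged("Funny cartoon clip",
                      ["this is explicit", "nsfw, not for kids"]))
```

In this sketch the result depends entirely on what viewers happen to type, which is why the same video can be allowed one day and blocked the next.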

Now, why is this important? Two reasons:

1. False sense of security with filters. I cannot tell you how many times I have had to explain this to parents. Their usual response is “but, I have a filter installed – won’t that block the inappropriate content from YouTube?” Too often we install a filter and then feel that our job of protecting our children online is done – unfortunately, filters are only a piece of the puzzle. We still need to remain very aware of what our children are doing online, and how they spend their time. If they are spending large amounts of time on sites like YouTube, we need to know what they are seeing and why. The best way to do this is the old-fashioned way: Communication. Direct questions.

2. Undesirable content is easily masked to appear innocuous. It is a sad fact of life today that people want to push their inappropriate content into our homes. In the early days of the Internet, people would register domain names that were a common misspelling or near-match of a popular site's address, and would post pornographic content there. This made it very easy for someone to stumble across a bad site. An example of this was whitehouse.com (instead of whitehouse.gov). This used to host pornography, until a law was passed that made this type of deception illegal. Unfortunately, there are no similar laws for user-generated content (yet). So, someone could easily film some extremely inappropriate content, label it “SpongeBob” and upload it.

I ran into an example of this recently. I was looking for that very funny SNL skit with Christopher Walken, so I searched for “cow bell”. I found a video that looked like a possible hit. Instead, it turned out to be an ad for a presidential campaign. This is a perfect example of what our children could run into: they search for one thing, find content that someone has uploaded to look like what they want, and discover that it is something much more offensive than a presidential advertisement. And no filter would catch it, unless you block the entire site where the content is hosted.

The bottom line here is that we need to be very careful about what our children are viewing online, and we cannot allow ourselves to be lulled into a false sense of security just because we have a filter.

In a future post, I will discuss the related problem of very popular peer-to-peer file sharing applications and how undesirable content can bypass our filters, virus protection and other apps designed to keep that content off of our systems.
