Why it doesn't matter whether censorware works or not

Publication date: 12 December 2000.
Last modified 03-Dec-2011.

 

Porn. Smut. Naked people.

And, just for the professional naughtiness-searchers, pron, and pr0n.

Oh, what the heck. Let's say "Japanese schoolgirls" too.

This page now won't be accessible to users of various half-baked censorware products.

Which is not news, of course. "Blacklist" Internet filtering software - whose central feature is an allegedly carefully vetted categorised list of sites which it uses to prevent people from seeing content on certain subjects - is renowned for blocking too many things.

You may be able to turn off the real-time naughty word filtering that stops you doing things like reading pages about women called Maryanne (because the string "aryan" is in the name...), but the blacklist is based on the same technology, and it's mathematically impossible for the censorware companies to have humans look at all of the entries. So, sooner or later, innocent-ish pages like this one get blocked, in one or another blacklist update.
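To see how the Maryanne problem comes about, here's a minimal sketch of a naive substring filter. The word list and function names are made up for illustration; no real product's list or code is being quoted.

```python
import re

# Hypothetical banned-string list; real products' lists are not public.
BANNED_SUBSTRINGS = ["aryan"]

def naive_filter_hits(text):
    """Return banned strings found anywhere in the text, even inside other words."""
    lowered = text.lower()
    return [word for word in BANNED_SUBSTRINGS if word in lowered]

def whole_word_filter_hits(text):
    """Slightly less dumb: only match banned strings that stand alone as words."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return [word for word in BANNED_SUBSTRINGS if word in words]

print(naive_filter_hits("Maryanne's home page"))       # ['aryan']
print(whole_word_filter_hits("Maryanne's home page"))  # []
```

Whole-word matching fixes this particular case, but neither version knows anything about context, which is the real problem.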

People with more experience in the field than me have summed up the problems with censorware more eloquently than I can.

In essence, the major argument against censorware itself - as distinct from the companies that create it, and their politics - is that it cannot avoid being over-broad in what it blocks.

Censorware makers can apply all the energy they like to getting their categorisation as correct as they can and weeding out things they blocked by accident, but with well over a billion Web pages to index and well over a hundred million individual Internet hosts, it's just impossible for them not to paint the world with a darn broad brush.

Hang the politics of it, for the time being. People who live in countries where they actually have Constitutionally guaranteed freedom of speech can field that ball; I'm in Australia, where we don't. What interests me is - if censorware, generally speaking, works poorly, why do people keep buying it?

Moreover, why on earth do politicians make laws requiring it to be used, by schools and libraries?

The glib answer to this is "because people are stupid, and politicians are really stupid." Hanlon's Razor says you should never attribute to malice that which is adequately explained by stupidity; hey presto, there's your explanation. Some decisions to use censorware, analysed on a superficial and obvious level, do indeed make those making that decision look as dumb as a bag of honey-glazed doorknobs.

But people and politicians are not idiots, by and large. Far from it. Oh, shut up, you in the peanut gallery; go and watch a few episodes of Yes, Minister. There's a more devious explanation, if you ask me.

Before I explain my brilliant deductions, let me give you an example of some obviously broken censorware, paying for which would appear to suggest that all of the customers' shoes fasten, of necessity, with Velcro.

This is not the Web-browsing-limiter sort of censorware. Recently, a new kind of censorware's emerged, which as yet doesn't have a lot to do with the usual kind. This new censorware aims to tell dirty pictures from clean ones algorithmically, and block the former in order (principally) to protect companies from hostile-workplace sexual harassment litigation.

This concept's not all that new, but software that even vaguely looks like being able to live up to its sales pitch is. The first package I've seen that does have some diffuse idea about the difference between smutty and clean is Baltimore Technologies' PORNsweeper, which I reviewed some time ago.

In brief, PORNsweeper didn't work very well. Its false-negative rate - saying a picture is clean when it isn't - was quite good. But it got that good rate at the cost of its false-positive rate - saying a picture's dirty when it's not. Essentially, it tended to think that pictures of people, and various other pictures that don't have people in them at all, are porn.
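The trade-off is easier to see with numbers. The counts below are invented purely for illustration and are not measurements of PORNsweeper or anything else; the function just turns a tally of verdicts into the two error rates.

```python
def error_rates(dirty_blocked, dirty_passed, clean_blocked, clean_passed):
    """Return (false_negative_rate, false_positive_rate) from verdict counts."""
    # False negative: a dirty picture the filter called clean.
    false_negative_rate = dirty_passed / (dirty_blocked + dirty_passed)
    # False positive: a clean picture the filter called dirty.
    false_positive_rate = clean_blocked / (clean_blocked + clean_passed)
    return false_negative_rate, false_positive_rate

# Made-up counts: 100 dirty pictures and 100 clean ones.
fn, fp = error_rates(dirty_blocked=95, dirty_passed=5,
                     clean_blocked=80, clean_passed=20)
print(f"false negatives: {fn:.0%}, false positives: {fp:.0%}")
# false negatives: 5%, false positives: 80%
```

Note that a filter which simply blocks everything scores a perfect 0% false-negative rate, which is why that number on its own tells you very little.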

After I put the PORNsweeper review up, I was e-mailed by a rep for UK company First 4 Internet. They had a picture categoriser, too, and apparently still do.

Unlike Baltimore Technologies, First 4 Internet had the courage to make their software publicly testable. They had a page that let you test what they called "a small proportion of the functionality of the First 4 Internet Image Filtering Software". That page now just lets you submit a request to try out the software; there's another request page, but I just tried to register using that form and received a big fat OLE error for my trouble.

I think I know why they're not making it easy to play with their product any more.

Like PORNsweeper, their system, when I tested it, was indeed quite good at detecting porn. Well, when it could load an image, it was; it only understood JPG and GIF, and that would have been fine, but it just failed to load a significant number of images. But the makers claimed that it blocked better than 95% of commercial pornography, and when it could load the picture, that seemed pretty accurate.

But, like PORNsweeper, this software also thought most pictures of people were porn. It blocked pretty much anything with skin tones in it.

Amusingly, the First 4 Internet software had two levels of porn detection - probably-porn and definitely-porn. The second, no-sir-I-don't-mean-maybe level, with its "This image has PORNOGRAPHIC content and will not be displayed" message, was a sure crowd pleaser when it was produced in response to perfectly innocent pictures. Which it often was.

It conjured up, for me at least, the image of a sex-starved preacher perpetually trembling on the edge of a chasm of uncontrollable arousal, so preoccupied with the subject that standing in a breeze strikes him as lascivious behaviour, because it sure as heck presses his buttons.

First 4's software seemed quite good at letting through things that weren't colour photos of people, which is more than I could say for PORNsweeper. But, like PORNsweeper, the First 4 software was incapable of detecting evil in black and white images. Fortunately, there's no such thing as an offensive black and white picture, so the concerned parents can sleep safely in their beds.
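What either product actually does internally isn't documented here, but this behaviour is what you'd expect from something like a skin-tone pixel counter. The sketch below is a deliberately crude guess, assuming a simple RGB rule and a made-up threshold; it is not First 4's or Baltimore's algorithm.

```python
def looks_like_skin(r, g, b):
    """Crude, assumed RGB rule for 'skin-ish' colours; not any vendor's real test."""
    return (r > 95 and g > 40 and b > 20
            and r > g and r > b and (r - min(g, b)) > 15)

def skin_fraction(pixels):
    """Fraction of (r, g, b) pixels that pass the skin-tone rule."""
    if not pixels:
        return 0.0
    return sum(looks_like_skin(*p) for p in pixels) / len(pixels)

def verdict(pixels, threshold=0.3):
    """Hypothetical threshold: flag the image if enough of it is skin-coloured."""
    return "PORNOGRAPHIC" if skin_fraction(pixels) >= threshold else "clean"

# A toy "passport photo": 40% face-coloured pixels, 60% blue background.
portrait = [(224, 172, 105)] * 40 + [(30, 60, 200)] * 60
# The same picture in greyscale: every pixel has r == g == b.
grey_portrait = [(170, 170, 170)] * 40 + [(70, 70, 70)] * 60

print(verdict(portrait))       # PORNOGRAPHIC
print(verdict(grey_portrait))  # clean
```

A rule like this can't tell a face from anything else that's the same colour, and it sees nothing at all in a greyscale image, because grey pixels never have more red than green or blue.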

If you hanker for software that blocks colour photos of people, First 4 Internet had just what you want. Heck, maybe they've improved it now, but the fact that you can't just feed it an image and see what it thinks of it any more suggests to me that they haven't. In my testing, it showed false positive rates on clean pictures of people in excess of 80%.

This is where it gets interesting.

I communicated my findings to the First 4 chap, and he said that I was welcome to try out the real, ready-for-prime-time version of their software. But he also said, leading with his chin, that the First 4 product was "more accurate" than PORNsweeper, when I'd just then told him why I didn't think it was.

I was less than totally polite to him about his assertion, and said I'd be happy to review another version of the software, if it actually bloomin' worked. He slunk away.

OK, big deal. Another half-baked attempt at porn image recognition. Not headline news.

The thing that gets me, though, is that this fellow cheerfully invited me to try the software out.

Which suggests only two possibilities.

Possibility one - he didn't check to see whether the thing he was trying to sell actually worked worth beans before he proudly presented it to a journalist who wrote a highly critical review of another such product. This conclusion could be correct, but it doesn't allow me to construct an elaborate theory of human behaviour, so I shall discount it.

Possibility two - he reckons that any publicity is good publicity. Get a review, even if the review says "absolutely as effective as an oxy-acetylene rig made out of butter", and your product name's in the minds of those who make buying decisions about things like this.

And you're set. Because those people are either twits that'll buy anything, or cynics that'll buy from anyone who's willing to keep up the pretence that the software performs the task it's made to do.

I think a combination of the dumb and/or uninformed, and the knowledgeable but cynical, have to be the market for these sorts of products.

Work with me here. I've got another crackpot theory on the burner.

Let's presume you're running a business which provides Internet access to its employees, and you're in a country where it's likely that the employees will be able to sue you if someone else manages to send them smutty e-mail, or if their workmates can download porn and set it as their desktop wallpaper, or whatever.

Now, it's just flat-out impossible to provide your employees with proper e-mail and Web access, and also make it impossible for them to access offensive content.

But the name of the game here isn't really blocking what's meant to be blocked, while letting through innocent material that your employees need in order to do their jobs. The name of the game is covering your rear.

Legislators cover their rears by proudly introducing dumb unworkable legislation that wins votes. Bosses cover their rears by installing software that works as well as anybody's censorware does, so far as can be determined through the haze of public relations.

It'd be nice if dirty picture spotters worked properly, but even if they don't, you can make company policies (or national laws, depending on who you are) requiring people to use them, and you'll look as if you're Taking A Stand and Doing All You Can.

Anybody who asks awkward questions about whether the Stand you are Taking has any chance of achieving something worthwhile can be given one of the stock answers from the War On Some Drugs Sourcebook and brushed aside. For most rule makers, being tough on porn, like being tough on crime, has no down side, electorally or commercially.

If you're a bright rule maker, you can figure this out for yourself; if you're a dim rule maker, your advisers will steer you to the same decision. If everybody with decision-making input is either possessed of unthinking religious faith in the value of the product, or is a cheerful cynic who's doing what needs to be done, your purchase and implementation experience should be a marvellously smooth one.

What people are interested in, when they're talking about, legislating about, buying or selling censorware, is not the product itself. It's the idea of the product - a thing that magically prevents people from seeing things which they don't want to, or shouldn't be allowed to, see.

We live in the age of public relations, which can Newspeak its way past any awkward facts, at least for long enough for stock options to vest or a short-memoried electorate to cast their votes. PR is marvellous gap filler, if you're building castles in the air.

Getting back to the more common flavours of censorware - back in 1997, all of these products pretty much stank. Blacklists were very obviously compiled algorithmically, without adequate human review, so enormous numbers of innocuous sites were unfairly blocked.

There are still qualitatively similar problems with current censorware, but the quantity of those problems seems to be much smaller. The bizarre quirks of taxonomy are fading somewhat - for instance, Secure Computing's SmartFilter seems no longer to have the peculiar "Non-essential" category, but "Worthless" appears to survive, at least for SmartFilter v2.x. It also no longer categorises as "Extreme" an awful lot of pages that just have the word "Extreme" on them somewhere.

It's still not too hard to find censorware SNAFUs, but the software is a lot closer to living up to its advertising than it used to be.

Well, OK, maybe not. Sigh. But at least they're starting to fight among themselves.

Even if the awful-categorisation problem really is fading - and maybe everybody's just gotten sick of seeing another list of fuzzy-bunny sites listed as EXTREME SEX BONDAGE SATANISM - other problems have become more prominent. Like the way some censorware marks anonymiser and translator sites as being members of every single category. See this review by the same fellow that wrote the one above, for instance; it's getting on for two years old, now, but several of the sites it points to are still listed in every category by SmartFilter. Put sites in every category, and anybody who blocks any category won't be able to see them. More recently, as another Seth Finkelstein piece points out, N2H2's BESS censorware has grown a separate semi-secret "Loop Hole" category, which does the same thing; sites in that category can't be viewed by any BESS user, ever.

This super-blocking's done to stop people accessing other banned content by going through redirector sites that are not, inherently, offensive in any way. But it's a clunky solution. It really ought to be possible to pluck the destination URL out of the composite address of a site viewed through an anonymiser, or a translator, or any other random proxy-thing (I'm still mourning the death of AskJesus), and check that against the blacklist instead. (SmartFilter, by the way, categorises www.rinkworks.com/dialect/ as "Online Sales, Entertainment".) Blocking a whole translator's an easy workaround, but not a good one.
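Here's a rough sketch of the pluck-out-the-destination idea, assuming the proxy passes the destination as a query parameter or tacks it onto the end of the path. The parameter names, the example addresses and the tiny blacklist are all hypothetical; real anonymisers use all sorts of formats, and some encode or encrypt the target, in which case this gets you nowhere.

```python
from urllib.parse import urlsplit, parse_qs, unquote

# Hypothetical query parameter names a proxy might use for the target address.
TARGET_PARAMS = ("url", "u", "target")
# Hypothetical blacklist of host names.
BLACKLIST = {"naughty.example"}

def destination_url(composite):
    """Try to pull the real destination out of a proxy-style composite URL."""
    parts = urlsplit(composite)
    query = parse_qs(parts.query)
    for name in TARGET_PARAMS:
        if name in query:
            return query[name][0]
    # Fall back to a URL embedded in the path, e.g. /http://host/page
    path = unquote(parts.path)
    for scheme in ("http://", "https://"):
        index = path.find(scheme)
        if index != -1:
            return path[index:]
    return None

def should_block(composite):
    """Judge the destination if one can be found, otherwise the address itself."""
    target = destination_url(composite) or composite
    return urlsplit(target).hostname in BLACKLIST

print(should_block("http://translator.example/t?url=http%3A%2F%2Fnaughty.example%2Fpics"))  # True
print(should_block("http://translator.example/t?url=http://news.example/"))                 # False
```

With something like this, the translator itself stays usable, and only the requests that point at blacklisted hosts get bounced.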

But this doesn't matter. None of it matters. Because nobody at any point in the censorware revenue chain - from the censorware companies to the end users of their software - seems to care very much. Parents who're buying censorware to stop their kids from seeing things that they shouldn't may get pretty much what they want from over-strict machine-compiled nanny-ware (well, except for things like that last URL, which is "Art/Culture" according to SmartFilter...), but the big-money clients don't much care whether the software works or not.

So the civil libertarians are spitting chips.

So what?

Heavens, they might even suggest that most of the time censorware's used, it shouldn't be, even if it works perfectly!

Big deal.

Personally, I'd definitely rather be trapped in a lift with someone who owns Gold Editions of all of Nina Hartley's movies than someone who collects Precious Moments figurines. But that just means I'm not part of the censorware target market, so who cares about me?

Civil libertarians wouldn't buy censorware either, so who cares about them, either? It takes them a lot longer to put together a well-supported, well-reasoned argument against censorware than it takes someone who disagrees to swat them out of the way with a hearty "won't someone think of the children!?"

Hence, cynicism. If people will pay you good money to stick your thumbs in your ears and dance around naked in order to make it rain, then, well, that's an employment option for you, isn't it?

It's not going to work, in the sense of actually causing precipitation. But it sure is going to work, in the sense of improving your bank balance.

And so it is with censorware, even the kind that just doesn't work.

Now do please excuse me. That fellow in the brown raincoat just went into the wonky lift, and I haven't had a good conversation all day.
