What happens when the scummers get scummed?
Background: Scummers and Scumware
"Scumware" is a term which refers to a nasty type of software that is
sometimes installed on people's computers without their asking.
Scumware is typically written by marketing companies trying to make a
fast buck. What it does is make unauthorized changes to Web sites as
they are being rendered by the user's Web browser: the program runs in
the background, waiting for the user to access any old Web site. When
he does, the scumware program intercepts the incoming data, replaces or
inserts content (usually adding its own, unauthorized advertising, or
redirecting the user to paying competitors' sites), then passes it
along to the browser where it is displayed. While this seems a clear
violation of the scummed Web sites' copyrights, Scumware authors defend
this as a perfectly legitimate and legal practice.
"Dundee" is a fictional program we made up to see what would happen
when scumware vendors (Gator, eZula, etc.) thought a competing product
was inserting foul language and vile racist propaganda on their own
sites. We rounded up a handful of regulars from GRC's spyware and
privacy newsgroups to be the "actors", contacting scumware companies
and feigning righteous indignation after having seen all the racist
viewpoints the company expresses on its Web site. Naturally, the
company might want to investigate these claims, especially coming from
multiple sources in the course of a few days. To that end, the "actors"
were equipped with bait in the form of (forged) screenshots depicting
the racial slurs appearing on the Web sites, a dummy executable named
"dundee.exe", and bogus config files for dundee.exe.
To complete the baiting, the .ini file made reference to the "Dundee
home page" and pointed to the "proof of concept" document below, and
contained various data instructing the program on how to add racial
slurs, where to send bugreports, etc. When the dummy executable was
run, it would complain of missing files and then die with an enigmatic
As this says, the goal of the experiment is to see if any scummers take
the bait, demanding removal of "Dundee" and/or threatening lawsuits.
Our actors were very convincing. However, we got only a few minor nibbles
from scumware companies, and NO lawsuit threats! (Come on, scummers... and
you call yourselves Americans... :-)
The original "Proof of concept" bait
The "Proof of concept" file appears below.
"Dundee.exe" proof of concept
Dundee is a client-side
application that adds racial slurs to selected commercial Web sites.
Dundee is a client-side
application that adds highly offensive anti-racial sentiments to selected
commercial Web pages. According to US law, it is entirely legal to do so,
as the algorithmically-modified content is not stored or transmitted, and
the user consented to installing the application. Currently, the application
targets the makers of products that "hijack" third-party Web pages using
similar (advertising-centric) tools.
Racial slurs were chosen
because they are highly and *universally* offensive to most everybody,
and easy to add to Web pages (unlike, say, porno images). Also, racial
slurs appearing on a company Web site could bring massive lawsuits, investigations
and loss of business, more effectively than just about any other
modification a client-side program could make to Web pages. To many Webmasters
whose sites are currently defaced by the applications Dundee is modelled
after, defacing their site and goodwill with banners, popups and paylinks
is no different than defacing it with racial epithets and nasty language.
Dundee uses a single configuration
file, config.ini, to store all its local configuration data. This configuration
contains lists of Web pages Dundee is to act upon, and additional data
containing racial slurs to include into the target Web page. These slur
words are separated by noun, verb, etc., for reasons that will become obvious
later. More general slur constructs are embedded in the ddlexan.dll file.
There is no stand-alone
Dundee. However, it comes bundled with a number of free and low-cost software
products, including two major MP3 sharing clients, and there are plans
to bundle it with more applications as time goes on. It has an auto-update
feature, which periodically checks the Web site and installs updates. Advertising
and resource-sharing capabilities will be included in a future release,
and bundlers will be supplied a portion of the revenues. The ultimate goal is to
use the Dundee algorithm to seamlessly insert other things, for a reasonable
fee, into specific Web sites at the behest of a client. These are intended
to match the tone and context of the page well enough that the average
user cannot differentiate original content from Dundee-enhanced content
on the page.
Dundee is offered as an option
during installation: "Dundee: Free Web Browser Enhancement". The idea is
that "sheeple" will install this good-sounding application without understanding
what it does. The user can opt in or out of installation. An uninstall
reference is placed in Add/Remove Programs.
Dundee acts as a proxy between
the Internet and the Web browser. Upon visiting a Web site matching the
URL rules specified in the .ini file, Dundee springs into action, using
its quick-and-dirty lexical analyser to seamlessly(?) insert racial slurs
and rude epithets into the target Web page. These are created by random
selection of the words specified in the .ini file and in the .dll. While
the specific algorithm is proprietary to CEXX Labs and subject to much
revision, the basic idea is as follows:
The density of changes made
to the target Web page is dependent on the apparent SNR (Signal-to-Noise
Ratio) of the target document, also determined by the Dundee application.
The SNR is a measure of the ratio between apparent "signal" (informational
text content) and "noise" (ad text, text containing marketing terms or
constructs, ad images, etc.) present on the page. In general, Dundee prefers
text modifications to graphic ones, as the image-module is somewhat flaky,
and cannot yet create images dynamically (proportionally-sized text is
instead inserted). Many extremely low-SNR sites consist of nearly all images--in
this case, there is no other choice but to replace some. The first few
non-animated images are skipped in the hope of preserving the company's
logo on the page with the racial slurs.
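As a rough illustration of the SNR idea only, here is a minimal sketch that scores a page's plain text by counting words in sentences that look like marketing copy ("noise") against words in everything else ("signal"). The phrase list, the function names and the sentence splitter are all assumptions made for this example; the text above does not say how the measure is actually computed.

```python
import re

# Hypothetical marker phrases (an assumption for this sketch, loosely based on
# the marketing constructs quoted elsewhere in this document).
MARKETING_PATTERNS = [
    r"\bclick here\b",
    r"\bwelcome to\b",
    r"\bintroducing\b",
    r"\bpresenting\b",
    r"\bworld[- ]renowned\b",
    r"\bdynamic and innovative\b",
]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def is_noise(sentence):
    """True if the sentence contains any of the marker phrases."""
    lowered = sentence.lower()
    return any(re.search(pattern, lowered) for pattern in MARKETING_PATTERNS)


def apparent_snr(text):
    """Return signal words divided by noise words (infinity if no noise)."""
    signal = 0
    noise = 0
    for sentence in split_sentences(text):
        words = len(sentence.split())
        if is_noise(sentence):
            noise += words
        else:
            signal += words
    if noise == 0:
        return float("inf")
    return signal / noise
```

A low score would mark a page as mostly marketing text. Images are ignored here, although the description above counts ad images as noise too.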
The image-module locates images
that appear to be ads (based on size, URL, etc.), randomly replacing some
with appropriately-sized anti-racial text and/or images (not yet implemented).
HTML code is stripped from the
Web page, leaving only the plain text contents, and sent to the lexical analyser.
The lexical analyser searches for
key words in the raw text and attempts to diagram each sentence (if any)
or surrounding words to deduce context. If the context matches that specified
for an epithet construct, such a construct is added--either adding to or
replacing the original text. Some (navigation links, marketing phrases)
will be stand-alone words or phrases, these will also be handled, albeit
Changes made by the analyser
are re-inserted into the HTML document, which is sent to the browser.
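The strip, analyse and re-insert flow described above is a generic text-rewriting pattern. Here is a minimal sketch of it using Python's standard `html.parser`: tags are passed through as written and only the text between them is handed to a caller-supplied function. The class name `TextRewriter`, the helper `rewrite_text` and the skip list are assumptions for illustration, and the usage example uses a harmless placeholder transform (upper-casing), not anything resembling the analyser itself.

```python
from html.parser import HTMLParser

# Elements whose text is left untouched. TITLE is skipped because the text
# says it is excluded from analysis; script/style are an added assumption.
SKIP_TAGS = {"script", "style", "title"}


class TextRewriter(HTMLParser):
    """Re-emit HTML unchanged except for text nodes, which go through transform."""

    def __init__(self, transform):
        super().__init__(convert_charrefs=False)
        self.transform = transform
        self.out = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        self.out.append(self.get_starttag_text())  # original tag text, attributes intact

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())  # e.g. <br/>

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        self.out.append(f"</{tag}>")  # note: end tags come back lower-cased

    def handle_data(self, data):
        self.out.append(data if self._skip_depth else self.transform(data))

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def rewrite_text(html, transform):
    """Apply transform to every text node of html and return the new markup."""
    parser = TextRewriter(transform)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    page = '<p class="a">Hello <b>world</b></p>'
    print(rewrite_text(page, str.upper))
```

Because the start tags are copied through verbatim, the rewritten text inherits the same font and colour attributes as the text it replaces.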
In general, the lexical analyser
Since the replacement is done
on the HTML-stripped text and re-inserted within the same tags, the attributes
(font, colour, etc.) of the replacement text will seamlessly match the
Key words. These include many
commonly found on commercial Web sites, including "shop", "buy", "search",
"look", and "e-mail". Appropriate text is inserted or replaced, e.g. "Shop
for great deals" -> "Shop for nigger beating sticks", "Support Us" -> "Support
Marketing terms and constructs.
Similar in many respects to the Web Bullshit Analyser, it detects common
marketing spew with surprising accuracy. Most synergistic e-commerce
p2p vortal solutions with dynamic and innovative community will find
themselves begging for the detector's mercy, as well as pages containing
lots of "Click Here"-alike, "Welcome to...", "Introducing...", "Presenting...",
"Check out ... amazing ...", "Try out $blah today!", "We hope you will...",
"$company is a world-renowned|leading|world leader|top provider|dynamic
and innovative..." ... you get the idea. Some marketing spew is replaced
or appended, but it is most importantly used to detect the company or product's
name, so that Dundee can turn it into a racism toolkit with grace and ease.
Stand-alone offensive statements are also inserted between sentences of
marketing drivel. Based on our informal testing, it looks very convincing
when this is done.
Boilerplate text. Copyright
of almost any commercial web site. Statements of support for the KKK, advocation
of hate crimes, etc., look surprisingly natural when part of a contact
link or added after a copyright notice.
While it has been tuned
to perform well on the homepages of currently-known hijackers, the insertion
of text is prone to fail in some circumstances, particularly on other random
Web sites. As mentioned earlier, Dundee's lexical analyser is a quick-n-dirty
hack, nothing more. While it does well at fitting text appropriately into
a sentence, it does not "understand" language and so will be foiled by
odd usage of language, misspelling, structures it does not recognise, and
where its key phrases are used in ways I have not anticipated. Hopefully,
by the time more hijackers appear or Dundee nears completion, this will
be significantly improved.
The TITLE tag contents are
currently excluded from analysis, to prevent munging of a company's identifying
tagline (we want these racially-intolerant companies to be easily recognised,
no?) In the future, the TITLE may be included in analysis by default or
as an option.
The image engine also leaves
much to be desired. Using images to display Web page text appears to be
a common practice among the current crop of hijackers, but the replacement
algorithm is very primitive. It may zap the company's logo, it may mess
up the page formatting, it may place something completely out of context,
it may replace a hidden image, or completely butcher the size/dimensions
of an image whose size is not declared in the IMG SRC tag. It is a necessary
evil, whose operation will be tuned greatly as Dundee approaches non-beta
No provision is currently
made for other "enhancements" co-existing on the system. It's entirely
possible for another "enhancement" to modify Dundee content before it gets
to the browser, or for Dundee to act upon already-modified content. Future
releases will include code to detect and remove competing plug-ins (GATOR,
eZula, etc.) on initialisation. This also solves the problem of Dundee
paid content being "watered down" by other paid content, or other plugins
breaking the original context of the page.
Screenshots from three major
hijacker Web sites. Note the fluent insertion of slur text in most cases.