TELECOM Digest OnLine - Sorted: Book Review: "Ending Spam", Jonathan A. Zdziarski



Rob Slade (rMslade@shaw.ca)
Thu, 19 Jan 2006 08:16:23 -0800

BKENDSPM.RVW 20051029

"Ending Spam", Jonathan A. Zdziarski, 2005, 1-59327-052-6,
U$39.95/C$53.95
%A Jonathan A. Zdziarski
%C 555 De Haro Street, Suite 250, San Francisco, CA 94107
%D 2005
%G 1-59327-052-6
%I No Starch Press
%O U$39.95/C$53.95 415-863-9900 fax 415-863-9950 info@nostarch.com
%O http://www.amazon.com/exec/obidos/ASIN/1593270526/robsladesinterne
http://www.amazon.co.uk/exec/obidos/ASIN/1593270526/robsladesinte-21
%O http://www.amazon.ca/exec/obidos/ASIN/1593270526/robsladesin03-20
%O Audience s+ Tech 3 Writing 2 (see revfaq.htm for explanation)
%P 287 p.
%T "Ending Spam"

The preface states that the book is for those seriously interested in
spam identification technologies, and concentrates on Bayesian and
related statistical filtering.

Part one is an introduction to spam filtering. Chapter one reviews
the history of spam, although many of the early entries are simply
annoyances or chain letters rather than the commercial or fraudulent
items considered under the banner today, and the author does not seem
to realize that 419 scams predated email by a considerable margin. A
look at the development of spam filtering (excluding Bayesian) is
presented in chapter two, along with some non-filtering approaches.
Bayesian analysis is explained in chapter three, and the statistical
filtering basis is outlined in chapter four.

The fundamental actuarial core is expanded in part two. Chapter five
covers message coding. Tokenization, chunking characters into
identifiable items, is examined in chapter six. Tricks spammers use
to evade filters, and the solutions for finding spam despite the
deceptions, are outlined in chapter seven. Storage and performance
issues raised by the data rules required by statistical filters are
addressed in chapter eight. Chapter nine looks at aspects of scaling
to systems supporting large numbers of users.

Part three deals with advanced concepts in statistical filtering.
Chapter ten delves into testing which, because of the individual and
adaptive nature of Bayesian filtering, presents unique challenges.
Tokenization is revisited in chapter eleven, in more advanced forms.
Markovian discrimination, with its examination of stateful entities,
is explained in chapter twelve. Since the book has noted many kinds
of features, chapter thirteen explores ways to reduce the items used
(and data required) while maintaining accuracy. Collaborative
rule-building with other users, groups, or systems is reviewed in
chapter fourteen.

As the preface implies, this is *not* a book for users who just want
to install POPFile (although that and other programs are explored in
an appendix). For those who are seriously involved in managing and
developing spam filtering, however, the book does provide very useful
advice, pointers, and research.

copyright Robert M. Slade, 2005 BKENDSPM.RVW 20051029

====================== (quote inserted randomly by Pegasus Mailer)
rslade@vcn.bc.ca slade@victoria.tc.ca rslade@sun.soci.niu.edu
I do not feel obliged to believe that the same God who has
endowed us with sense, reason and intellect has intended us to
forgo their use - Galileo
http://victoria.tc.ca/techrev or http://sun.soci.niu.edu/~rslade
