Writing from CEAS with quick notes on each talk. Proceedings are at http://www.ceas.cc/papers-2004/papersbytopic.htm .

All ports except 80 and 443 are blocked! Very annoying (wink)

Chung-Kwei:

  • Teiresias is IBM's pattern-discovery tool from bioinformatics
  • looks *directly* transferrable to SpamAssassin's "regexp rules" approach
  • probably heavily patented and hard to license though
  • but a Google search for "pattern discovery algorithm" looks like a promising source (wink)

Social network talk:

  • pretty useless, spam-filtering-wise at least; no spam orientation at all

Joshua Goodman Received talk:

  • talking about parsing Received lines
  • basically reimplementing spamcop algorithm
  • looking for "last external IP address"
  • thinks this will be useful for SenderID
  • SenderID example uses HELO data, looks like, instead of PRA or SMTP MAIL FROM; due to multiple intervening hops
  • try to use heuristics to find last external IP address:
    • using MX data fails due to load-balancing edge router
    • also the msn.com/hotmail.com problem
  • proposed algo:
    • IP addr is 192.168
    • HELO matches user's domain and forward DNS lookup of HELO matches IP address
    • find an IP that matches MX record, next is external
  • Bob Atk suggested putting external IP addrs in a DNS record?!
  • interesting that they'd never checked SpamAssassin or Spamcop's algorithms, but that's MS for you (wink)
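The "last external IP" heuristics from the Received talk can be sketched roughly as below. This is only one reading of the notes, not the speaker's actual algorithm: the function names, the simplified Received-line regex (IPv4 only), and the `resolve` callback standing in for a forward DNS lookup are all assumptions made for illustration.

```python
import ipaddress
import re

# Simplified pattern for "from HELO (rdns [a.b.c.d]) by ..." (IPv4 only).
# Real Received lines vary far more than this; assumption for illustration.
RECEIVED_RE = re.compile(r"from\s+(?P<helo>\S+)\s+\(.*?\[(?P<ip>[0-9.]+)\]\)")
PRIVATE_NET = ipaddress.ip_network("192.168.0.0/16")


def parse_received(line):
    """Return (helo, ip) for one Received line, or None if it does not parse."""
    m = RECEIVED_RE.search(line)
    if m is None:
        return None
    try:
        ip = ipaddress.ip_address(m.group("ip"))
    except ValueError:
        return None
    return m.group("helo").lower(), ip


def last_external_ip(received_lines, my_domain, mx_ips, resolve):
    """Walk Received lines newest-first and return the first external IP.

    mx_ips  -- set of ip_address objects for the recipient domain's MX hosts
    resolve -- hypothetical callback: hostname -> set of ip_address objects
    """
    next_is_external = False
    for line in received_lines:
        hop = parse_received(line)
        if hop is None:
            continue
        helo, ip = hop
        if next_is_external:
            return ip          # hop handed to our MX: treat as external
        if ip in mx_ips:
            next_is_external = True   # "find an IP that matches MX, next is external"
            continue
        if ip in PRIVATE_NET:
            continue           # 192.168.x.x: internal hop
        in_domain = helo == my_domain or helo.endswith("." + my_domain)
        if in_domain and ip in resolve(helo):
            continue           # HELO in our domain and forward DNS agrees
        return ip
    return None
```

Passing `resolve` in as a parameter keeps the sketch free of live DNS queries; a forged HELO in the user's domain fails the forward-lookup check and is therefore reported as external.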

Brett Watson: beyond identity: problems even with sender id

  • economics of whitelisting/blacklisting based on a reliable sender identification (ie. forging is no longer possible)
  • mostly a philosophical talk

Multiple email addresses:

  • about 50% of surveyed users had multiple email addresses
  • "identities"; separation of work, personal, social groups; pseudo-anonymity; affiliation, status, prestige (alumni accts)
  • mobility (available on the road)
  • people now frequently have different "role" accounts
  • typically once people go over 3 accts, they set them up to forward to a smaller number
  • 20-30% of all email addrs change annually
  • this talk is really oriented towards MUA UI developers
  • another talk with not a whole lot of antispam relevance (sad)

Panel discussion of monetary spam filtering:

  • Cynthia Dwork's talk:
    • 16 seconds per message computation time doubles spam cost
    • 56 seconds per message of computation time means $36 per message for spammers
    • cycle theft arguments (zombies are illegal; spyware can be combatted with user education) *already don't work* in the real world
  • MailFrontier:
    • some kind of marketroid noise about how they're "third generation" because they have grey areas, or something; combination of multiple tests means "definitely spam, no false positives". riiight
    • "Reverse Turing Test": C-R as usual, with pictures of puppies
    • except the C-R page has some kind of plugin which will burn CPU cycles instead, woo
  • The naysayer:
    • http://www.cl.cam.ac.uk/~rnc1/
    • going rate to solve puzzles is about $.11/hr in South India
    • Real Money systems: people will regulate it; EU Directive on E-Money (2000/46/EC)
    • people will walk away with 2.5% of it (cost of running + greed)
    • people will steal it (e.g. sysadmin skimming x% of incoming mails and stealing their tokens)
    • Payment systems: settlement: see taugh.com
    • also compares with the telco system (~1200mill ham mails/day, ~2000mill phone calls per day) – much fewer calls on telco system, most local, diff trust model
    • how much payment:
      • 30 responses per mill: .1c/mail mean $33 per sale to be viable
      • if .05c/mail, $16
      • at a 0.7% response rate, $33 profit means 23c/mail
  • questions:
    • to Ironport: "why can't I nominate a charity?" to avoid interested parties
    • Dan Kohn to Ironport: "how much bonds debited?" not very much
    • question from an Indian querier: "any documented cases of South Indian kids clicking on CAPTCHAs?" MailFrontier guy, naturally, says "nope". In reality, the answer is "yes", but that was in Thailand
    • Yahoo! guy on CAPTCHAs: "seen everything: porn sites, people paid to type them; sites in Russia with full pages of CAPTCHAs, 10 hour turnaround after a new fix is deployed"
    • Vanquish guy says they use CMU's CAPTCHA code
    • question on CPU time stamp inflation: Cynthia Dwork says "memory cycles much more stable over time"
    • Daniel: annoyed about senders having to "prove they are real" when they're doing the recipient a favour: MailFrontier guy: "we just want the problem to go away"
    • Dave Crocker: "why didn't anyone on the panel take any notice of the naysayer's presentation and its points?"
    • panel: "but we have only 5 minutes!"
    • Vanquish guy: "he doesn't understand how PKI works" (!!!) then some advertising for Vanquish (again)
    • Ironport: "Bonded Sender is working right now"
    • MailFrontier guy: "mostly agreed with his presentation, but we'll do whatever works (titters from audience); C-R is an atomic bomb against spam, but with some collateral damage against ham, but it can be turned off"
    • naysayer on pay-to-send: "not only is my machine insecure, my email is insecure, but I don't want my *money* to be insecure" (applause)
    • panel mod: there will be coevolution between attacker and defender, a lesson from the Cold War
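The naysayer's "how much payment" figures in the panel notes can be checked with a few lines of arithmetic. A minimal sketch, assuming "c" means US cents and "mill" means million; the function names are made up for illustration.

```python
def breakeven_revenue_per_sale(fee_cents_per_mail, responses_per_million):
    """Dollars each sale must bring in just to cover the per-mail sending fee."""
    total_fee_dollars = 1_000_000 * fee_cents_per_mail / 100
    return total_fee_dollars / responses_per_million


def affordable_fee_cents(profit_per_sale_dollars, response_rate):
    """Highest per-mail fee (in cents) that the expected profit still covers."""
    return profit_per_sale_dollars * response_rate * 100


# 30 responses per million at 0.1c/mail: about $33 per sale
print(round(breakeven_revenue_per_sale(0.1, 30), 2))
# Same response rate at 0.05c/mail: about $16 per sale
print(round(breakeven_revenue_per_sale(0.05, 30), 2))
# 0.7% response rate with $33 profit per sale: about 23c/mail
print(round(affordable_fee_cents(33, 0.007), 1))
```

The three results come out at roughly 33.33, 16.67 and 23.1, matching the rounded figures quoted in the talk.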

MailFrontier presentation: anatomy of a phishing email

  • Bank of America sends email from bankofamerica1.com, Sony from sonystar.com; this screws up the notion of a trusted domain name
  • the MSIE %00 vulnerability
  • high-numbered ports mean that websites can be run unnoticed, even if an HTTP server is already running
  • the fake address-bar window trick
  • fraudulent pop-ups over real site: link goes to fraud site, which creates a pop-up and then goes to the real site: pop-ups are a phishing risk (yay!)
  • "your submitted information will be verified by eBay staff within 24 hours"; buys more time
  • A survey, based on results from over 83,450 respondents (subset of total responses), in diagnosing which sites were frauds and which were real:
    • 26.7% got everything wrong
    • only 13.8% of respondents got all correct
  • da.ru is a frequent hosting site for phishing scams
  • hasn't looked at the Active/X malware on the phishing sites, for some reason!
  • Consumer Reports sends from some domain called "d1sub.com"; Fortune 500 companies should really improve their practices
  • q: "are we getting to a stage where we won't be able to tell phish from ham?" a from audience: "use pine"
  • q: "why haven't the arrests of phishers been publicised better?" suggests including some support in web browser for a "trusted logos" area on-screen, for certifications
  • Dave Crocker: don't map to domain names, "domain names are not good enough, they do *not* map to trademarks".

Geoff Hulten, MS: Trends in Spam Products and Exploits

  • corpus analysis, from Hotmail's feedback loop
    • volunteers classify random samples of their mail as spam or good; tens of thousands of hand-classified messages per day; large "unbiased" (???) sample of spam
  • additional analysis on two sets of spam:
    • about a year between the two
    • products sold, exploits used, trends
  • viagra types: 17% 2003, to 34% 2004
  • graphic porn down: 13% to 7%
  • exploits: increasing rapidly, 1.33 exploits 2003 to 1.73 in 2004
  • word obscuring: up to 20% in 2004
  • URL chaffing, adding good URLs to spam: not there in 2003, 10% in 2004 – anti-SURBL attack (wink)
  • Spammers are putting more work into each spam

Introducing the Enron Corpus:

  • 1.3million messages originally; removed msgs with "integrity problems", replaced usernames etc
  • http://www-2.cs.cmu.edu/~enron
  • 200,399 useful, non-dupe messages
  • 158 users, 1,268 msgs/user
  • missing message headers, so not much use for spam filtering; Exchange-mangled; no HTML. still, maybe good for "body" rules and FP avoidance
  • no mention how much of the corpus was spam (wink)

Larry Lessig:

  • extraordinary amount going to tech fixes; very little going to how the law could address it
  • compares govt attention to "pirate radio" creating static for large commercial stations, vs the spam problem
  • multiple types of regulators: the law, social norms, the market, and architecture (example: windows in lecture theatre are closed to enforce paying attention to speakers)
  • the law also regulates the other three
  • (that was the wrong talk! starts again!)
  • 1. "regulation is always multiple modalities"
  • 2. "interests will react"
  • 3. "special interests defeat general interests"
  • in the old days, we had norms to defeat spam; that failed
  • using code to fix; so far that's failed
  • "the market will fix the problem"; ISPs trying to be the spam-free email provider; that's also failed
  • CAN-SPAM: totally failed – even displaced effective state legislation
  • no single modality alone can fix it
  • regulation is a restriction, plus somebody to enforce it
  • CAN-SPAM: wanted truthful headers
  • opt-out doesn't provide any way for you to know if you've really been opted-out
  • enforcement: state AGs, ISPs, federal - centralised; too big though. they have better things to do with their time than bust spammers
  • solution: marries legal/architectural/market
  • legal: has two parts: (1) labels ("ADV" in the subject line)
  • (2) a bounty
  • (q: SEXUALLY-EXPLICIT tag is a label, already massively flouted by spammers. other labels would be flouted just as much.)
  • architecture: filter code then blocks mails with "ADV"
  • market: spammers would then have to incentivise people to receive their mail by sending offers they want (yeah right (wink))
  • enforcement: spam will only be sent if you can be paid, so "follow the money" – part of CAN-SPAM states "the business that benefits is responsible"
  • market in enforcement: bounty hunters who identify label-less spam (ah). amateurs, not law enforcement, large population
  • during CAN-SPAM development: labels were undesirable. Reason: "labels are too effective", because e.g. Amazon would have to have labelled their ads (because there was no distinction between opt-in and opt-out) and would be filtered
  • fundamental problem: corruption due to vested interests lobbying (cf CAN-SPAM)
  • sees difficulties in differentiating
  • q: tracing spam to the business that benefits often involves getting forwarding addresses from e.g. a CGI script running on a server in the Ukraine. *needs* law-enforcement power to get that IMO. a: "yes, and law-enforcement power is available, and jurisdiction problems are easy" (not sure about that! at least for the non-LE bounty-hunter case)
  • q: opt-in would have fixed it, like it has in Australia; but DMA keeps emasculating the laws into YOU-CAN-SPAM. a: agrees that there are multiple answers, but prefers not requiring opt-in across the board and uses the UCE definition as it allows political speech without adding to their costs. (I disagree, personally; the "UBE" definition works for me --jm)
  • Jon Praed: enforcement requires tremendous resources, and in some cases you've got to get to that IP address within 7 days to get those logs, with LE power. This is not easy. Notes that spammer margins are incredibly low, and those bounties as a result would be small and/or hard to get.
  • JP again: also suggests labels to label "good" commercial mail, personal mail, and then leave over "unknown" mail – which is then suspect. also suggests that the *headers* are the labelling, in reality.
  • q: "special interests always seem to wipe out general interest on this issue in laws. what can we do, law-wise?" "my brand is pessimism", "there was this moment, when they passed CAN-SPAM, when legislators were keen to fix it – then the special interests came in".
  • observation from audience: spots the parallel with the UK and pirate radio in the late 60s; the UK also passed a McCain-style anti-advertiser provision to deal with it.
  • Dave Crocker: believes that the suggestion would result in little real effect on spammers, and quite a heavy hit on legit businesses
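The "architecture" half of Lessig's proposal (filter code blocks mail labelled "ADV") could look something like the sketch below. The exact label syntax is an assumption: the notes only say "ADV" in the subject line, so this treats a leading `ADV:` as the label, and the function name is made up. It does not decode RFC 2047 encoded subjects.

```python
import re
from email import message_from_string

# Assumed label format: subject starts with "ADV:" (case-insensitive).
ADV_LABEL_RE = re.compile(r"^\s*ADV\s*:", re.IGNORECASE)


def is_labelled_advertising(raw_message):
    """True if the message's Subject header carries the assumed ADV label."""
    msg = message_from_string(raw_message)
    subject = msg.get("Subject", "")
    return ADV_LABEL_RE.match(subject) is not None


labelled = "Subject: ADV: cheap meds\n\nbuy now\n"
unlabelled = "Subject: Advice for Friday's meeting\n\nsee you there\n"
print(is_labelled_advertising(labelled), is_labelled_advertising(unlabelled))
```

Requiring the colon keeps ordinary subjects such as "Advice ..." from matching. As the audience objections note, a filter like this only catches senders who apply the label at all.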