Brian Arbogast, Corporate Vice President, Identity, Mobile and Partner Services Group, MSN and Personal Services Division.
REDMOND, Wash., Nov. 17, 2003 - Innovative spam-filtering technology developed at Microsoft Research is being deployed across all Microsoft e-mail platforms as part of the company’s multi-pronged effort to chase unsolicited e-mail and practitioners of illegal spamming out of consumers’ inboxes.
In his keynote address at COMDEX Las Vegas 2003 on Sunday evening, Bill Gates unveiled SmartScreen Technology, a machine-learning-based filtering technology. SmartScreen Technology uses a probability-based algorithm to essentially “learn” what is and what isn’t spam based on characteristics of both types of mail. The source material for educating SmartScreen Technology has come from hundreds of thousands of e-mail users who contribute to Microsoft’s feedback loop program. Gates called SmartScreen Technology a major advance in the battle to help secure consumers’ inboxes and return greater productivity to people’s e-mail experience.
SmartScreen Technology has already been incorporated into the spam filters for Microsoft Outlook 2003, MSN 8 and Hotmail, and will soon be available in a new Intelligent Message Filter for Microsoft Exchange Server 2003 that is planned for release to customers in the first half of 2004.
PressPass asked Brian Arbogast, the Microsoft corporate vice president charged with driving anti-spam efforts across the company, and Rick Rashid, senior vice president of Microsoft Research (MSR), to describe the research behind SmartScreen Technology and the other components of Microsoft’s contributions to the industry-wide campaign to eradicate spam.
PressPass: Just about everyone agrees that spam is a nuisance, but what are some of the more damaging effects of spam that demand a strong response from Microsoft and other interested parties?
Arbogast: I’d start by considering the definition of spam: it’s unsolicited, unwanted e-mail sent by someone with whom the recipient has no personal or business relationship. Consumers and businesses should be protected from unwanted intrusions like this, and today, ridding inboxes of spam takes a significant toll on employee productivity and organizations’ IT budgets. Spam is undermining the value of e-mail for businesses and consumers worldwide, as well as eroding people’s trust in technology. A quarter of the e-mail users involved in an October 2003 study by Pew Internet & American Life Project said spam has reduced their overall use of e-mail, and more than half said spam has made them less trusting of e-mail in general. Recent reports also show that the volume of spam likely comprises more than 50 percent of total e-mail traffic today.
These are all indicators of how pervasive the problem of spam has become, and why Microsoft takes this problem so seriously. It’s going to require an industry-wide, multi-faceted solution. Anti-spam technology, such as the SmartScreen Technology filtering tools that we’re introducing across all Microsoft e-mail platforms, is one of several pillars in a coordinated approach that also includes industry self-regulation, consumer education, effective legislation and targeted enforcement against illegal spammers.
Dr. Richard Rashid, Senior Vice President, Microsoft Research.
Rashid: While technology alone cannot solve the problem, it is a critical element in successfully combating spam. Spammers are continually getting more sophisticated in their methods of escaping detection by current spam filters. We need to continue to invest in research to develop new, innovative anti-spam technologies and stay ahead of the curve. To that end, Microsoft Research is working with Microsoft’s Anti-Spam Technology & Strategy Group and our industry partners to help get new technologies from the lab into consumers’ hands as quickly as possible.
PressPass: How does the new SmartScreen Technology work?
Rashid: The spam-filtering SmartScreen Technology is built on machine learning, meaning that your computer uses a series of probability-based algorithms to distinguish between legitimate e-mail and spam. It essentially “learns” what is and what isn’t spam. The SmartScreen Technology filter has to be trained to recognize the different characteristics of both legitimate e-mail and spam. To get enough training data, Microsoft has instituted a feedback program in which customers voluntarily review messages and indicate whether they believe a given message is spam. Based on that information, those messages get placed in a training database for SmartScreen Technology. The machine learning algorithm extracts specific words or characteristics from each e-mail message and weights them based on how likely they are to indicate that a message is spam or legitimate mail.
As each new e-mail message arrives at a Microsoft e-mail server or client machine running SmartScreen Technology, the filter analyzes it for the weighted characteristics and generates an overall probability that the message could be spam. If the message hits a specific threshold of probability, it gets marked either for deletion or placement in the user’s junk e-mail folder. The key advantage of SmartScreen Technology is that it is always adapting and learning more about what is and isn’t spam. It learns the latest characteristics that distinguish spam from good mail based on data that the filtering technology collects over time, both from the e-mails that individual users deem as spam and the data collected centrally through Microsoft’s feedback loop program. SmartScreen Technology already searches for more than 500,000 characteristics of spam that are based on feedback from e-mail users, which enables the filter to be highly effective. And Microsoft will also issue periodic updates to the filtering technology to augment the machine learning process.
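As a rough illustration of the train, weight, score and threshold steps described above, here is a minimal Python sketch. It is not SmartScreen Technology itself: the interview does not name the algorithm, so the sketch assumes a simple naive-Bayes-style model over individual words, and every name in it (`train`, `spam_probability`, `classify`, the sample messages and the 0.9 threshold) is hypothetical.

```python
import math
from collections import Counter


def train(labeled_messages):
    """Learn per-word weights from (text, is_spam) pairs, e.g. user feedback.

    Assumption: a naive-Bayes-style model; the real filter's algorithm
    is not described in this detail.
    """
    spam_counts, ham_counts = Counter(), Counter()
    n_spam = n_ham = 0
    for text, is_spam in labeled_messages:
        words = set(text.lower().split())  # count each word once per message
        if is_spam:
            spam_counts.update(words)
            n_spam += 1
        else:
            ham_counts.update(words)
            n_ham += 1
    # Weight = log-likelihood ratio, with add-one smoothing so that words
    # seen in only one class do not produce infinite weights.
    weights = {
        w: math.log((spam_counts[w] + 1) / (n_spam + 2))
        - math.log((ham_counts[w] + 1) / (n_ham + 2))
        for w in set(spam_counts) | set(ham_counts)
    }
    prior = math.log((n_spam + 1) / (n_ham + 1))
    return weights, prior


def spam_probability(text, weights, prior):
    """Combine the weights of the words present into one probability."""
    score = prior + sum(weights.get(w, 0.0) for w in set(text.lower().split()))
    return 1.0 / (1.0 + math.exp(-score))  # logistic function of the log-odds


def classify(text, weights, prior, threshold=0.9):
    """Route a message to the junk folder once it crosses the threshold."""
    return "junk" if spam_probability(text, weights, prior) >= threshold else "inbox"


# Hypothetical feedback data: two messages marked spam, two marked legitimate.
feedback = [
    ("win free money now", True),
    ("free prize claim now", True),
    ("meeting agenda for monday", False),
    ("lunch on monday", False),
]
weights, prior = train(feedback)
```

With this toy data, `classify("free money prize", weights, prior)` lands in the junk folder and `classify("monday meeting", weights, prior)` stays in the inbox. Retraining on fresh feedback changes the weights, which is the sense in which a filter of this kind keeps adapting.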
PressPass: How has MSR contributed to the development of SmartScreen Technology?
Rashid: Back in 1997, David Heckerman led researchers in Microsoft’s Machine Learning and Applied Statistics (MLAS), Adaptive Systems and Interactions (ASI) and Signal Processing groups in creating the first machine-learning-based spam filtering technology using probabilistic machine-learning algorithms. MSR holds a patent on probability-based spam filtering. These researchers worked to refine and improve the algorithms over the next several years, working with various people throughout MSR, and talked with the product teams to determine what some of their needs were. Early last year, two of the MLAS researchers wrote a think paper for Bill Gates about how the company needed to do even more to fight the spam issue, as well as what kinds of technology could help. Bill’s positive feedback spurred MSR to kick the machine-learning anti-spam filtering work into high gear again and combine it with other projects underway within Microsoft.
Arbogast: This culminated in a number of MSR researchers joining with people from various other groups at Microsoft in a new Anti-Spam Technology and Strategy Group, formed in early 2003. In addition to helping add new anti-spam technology such as SmartScreen Technology to Microsoft products like Hotmail, MSN, Office 2003, and Exchange Server 2003, the group is spearheading a variety of other efforts aimed at solving the spam problem. The Anti-Spam Technology and Strategy Group is also building alliances with fellow technology industry companies and other organizations to cooperate on technologies and best practices, such as the recently announced Anti-Spam Technology Alliance with AOL, Yahoo!, Earthlink, Comcast and British Telecom. We’re also contributing to public policy discussions on effective anti-spam legislation, working with investigators to promote targeted enforcement against illegal spammers targeting our customers, and reaching out to consumers to raise awareness about how they can help protect themselves.
PressPass: How will SmartScreen Technology be rolled out to customers using Microsoft Exchange Server 2003?
Arbogast: SmartScreen Technology will power the new Intelligent Message Filter (IMF), an optional add-on for Exchange Server 2003 that analyzes messages in the e-mail data stream and determines the likelihood that they are spam. IMF will work with the existing anti-spam functionality in Exchange 2003 to block junk e-mail both at the gateway and server levels. We’re committed to providing this and other anti-spam technology tools to customers as quickly and easily as possible.
PressPass: How will the introduction of SmartScreen Technology affect Microsoft partners who market similar anti-spam technology products?
Arbogast: Microsoft has been working very closely with other spam-filtering providers, such as Brightmail, to help ensure that our technologies will work in concert with their products so that our mutual customers receive the best protection available. To date, there is no single approach that can deflect all forms of junk e-mail, but multiple capabilities distributed across various products can do a great job of catching most unwanted messages. SmartScreen Technology deployed alongside other third-party filters has the potential of catching more junk e-mail than either filter can do alone. Microsoft will continue to build out infrastructure components for partners that enable both parties to deliver new and better ways to address e-mail protection, security and hygiene.
PressPass: What other technology is Microsoft developing to help combat spam?
Rashid: One approach spammers use to inexpensively ply their trade is employing computers to automatically sign up for free e-mail accounts that can be used to bombard millions of recipients with unsolicited messages. MSR’s Document Processing group is developing new types of Human Interactive Proofs, or HIPs, that can test whether the entity trying to sign up for an account is a human or a machine. The HIP requires that a certain combination of letters and numbers be re-entered on screen as part of the process of registering for the account. The advanced HIPs being created by Microsoft researchers consist of random numbers and letters that are distorted and connected by arcing lines, which make the image much more difficult for a computer to decipher.
Another group within the labs is developing puzzles that are sent to an e-mail sender’s machine if that sender is not known by the recipient. The puzzle must be solved before an unsolicited e-mail can be delivered. The puzzle requires a small amount of computational effort to solve, maybe 10 seconds of CPU effort per message, so a normal e-mail sender’s computer wouldn’t be significantly affected by the request. But if you’re a spammer and it takes your computer 10 seconds’ worth of computational effort per message, you’re going to be limited to roughly 8,000 e-mail messages a day from one computer, a dramatic reduction from the hundreds of thousands of spam e-mails that the average spammer currently can churn out. This kind of limitation creates a strong financial deterrent to spamming.
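To make the puzzle idea and its arithmetic concrete, here is a minimal Python sketch. The interview does not describe the actual puzzle, so the sketch substitutes a generic hash-based proof of work (find a number that makes a SHA-256 hash start with a given count of zero bits); the function names, the challenge string and the 12-bit difficulty are all assumptions for illustration. The first function reproduces the daily limit: a day has 86,400 seconds, so 10 seconds per message allows 8,640 messages, consistent with the "roughly 8,000" figure.

```python
import hashlib
from itertools import count

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def max_messages_per_day(cpu_seconds_per_message):
    """Upper bound on messages one computer can send if each costs CPU time."""
    return SECONDS_PER_DAY // cpu_seconds_per_message


def verify(challenge: bytes, nonce: int, bits: int) -> bool:
    """Cheap check for the recipient: one hash, regardless of difficulty."""
    digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
    # True when the top `bits` bits of the 256-bit digest are all zero.
    return int.from_bytes(digest, "big") >> (len(digest) * 8 - bits) == 0


def solve(challenge: bytes, bits: int) -> int:
    """Costly search for the sender: about 2**bits hashes on average."""
    for nonce in count():
        if verify(challenge, nonce, bits):
            return nonce


# Hypothetical challenge tying the puzzle to one sender, recipient and message.
challenge = b"sender@example.com|recipient@example.com|message-id-1"
nonce = solve(challenge, bits=12)  # small difficulty so the demo runs quickly
```

The asymmetry is the point of the design: solving costs the sender many hash attempts, while `verify(challenge, nonce, 12)` costs the recipient a single hash. Raising `bits` by one doubles the expected work, which is how the cost per message could be tuned toward a target such as 10 seconds.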
Arbogast: Those are just two examples of how Microsoft is investing heavily in research and development to bring more effective anti-spam innovations to light. We’ll continue to dedicate resources toward fighting spam from all angles: technology, enforcement, education, legislation and industry self-regulation. I believe that in the next few years, we can succeed in helping spammers find another line of work.