Home > Articles > Security > Network Security

Managing Spam

This chapter is from the book

Antispam Approaches

This section looks at several approaches that have been taken to combat spam. Most of these are techniques that have been incorporated into tools and larger applications. The last two are approaches with which many of us could become more involved:

  • Accept list—This is a list of e-mail addresses or domains that are determined to be trusted. This list is built over time as the user determines which e-mails are spam and which ones are legitimate. The drawback to the approach is that spammers often use fake e-mail addresses to evade being identified by these lists. For example, I often get e-mails that have my e-mail address as the sender. This approach also requires constant interaction from the user.

  • Block list17—This is a list of e-mail addresses or domains that are determined to be responsible for sending spam. This list is built over time as the user determines which e-mails are spam and which ones are legitimate. The drawback to the approach is that spammers usually use fake e-mail addresses and domain names and often change them to evade being identified by these lists. This approach also requires constant interaction from the user.

  • Challenge-response—This technique sends an e-mail to the originator of an e-mail asking the originator to validate the e-mail by answering a question or typing in a sequence of numbers and letters displayed in an image that cannot be easily read by a computer. This method easily catches spam sent by automated systems where no one is monitoring received e-mails. Unfortunately, this can include legitimate automated response systems from which you may receive an e-mail as the result of an online purchase or a subscription to an online newsletter. This technique can also be an annoyance because e-mails are delayed by a request being sent to the originator asking for validation.

  • Keyword-search—This approach looks for certain words or a combination of words in the subject line or body of an e-mail. For example, an e-mail that promotes organ enlargements or Viagra would be considered spam. Using a keyword search to validate e-mails for children may be fine. However, many of the words in a keyword search could be part of legitimate e-mails. Moreover, many spammers use clever misspellings to get around these types of filters. Search rules are not case sensitive (so SEX, Sex, and sex as subject words would all be detected). Misspelling and punctuation in the middle of a spam word defeats keyword search spam detectors. Spammers also add additional white space or invisible characters between letters in a word to avoid these filters.

  • Hashing—With hashing, the contents of a known piece of spam is hashed and stored. Each received e-mail is then hashed and if the hash matches any of the stored hash values for spam, it is rejected. Although this technique is quite accurate at rejecting known spam, it requires additional computing power to process each e-mail, and it is not very effective against most spammers. Many spammers modify their e-mails by adding a random phrase at the beginning or end of an e-mail, which renders hashing useless.

  • Header analysis—Each e-mail that is sent across the Internet has a header associated with it that contains routing information. This routing information can be analyzed to determine whether it has the wrong format, because many spammers try to hide their tracks by placing invalid information in the header. For example, the from-host field of one line may not match the by-host field of a previous line. Although this may indicate spam, it could also indicate a misconfigured e-mail server. Equally, a well-formed header doesn't necessarily mean that an e-mail is not from a spammer.

  • Reverse DNS lookup—This approach validates the domain name of the originator of an e-mail by performing a Domain Name System (DNS) lookup using the IP address of the originator. The domain name that is returned from the lookup request is compared against the domain of the sender to see whether they match. If there is no match, this e-mail is considered spam. Although this can be effective in many cases, some companies do not have their DNS information set up properly, causing their e-mail to be interpreted as spam. This happens often enough to be a problem. That, combined with the performance hit for doing this, makes this solution less than optimum. To perform your own DNS lookup, go to http://remote.12dt.com/rns/.

  • Image processing—Many advertising e-mails contain images of products or pornographic material. These images usually have a link associated with them so the recipient of the e-mail can click it to obtain more information about the product or service being advertised. Images can also contain a Web bug used to validate an e-mail address. Blocking these images can protect children from harmful images. Some spam tools flag e-mails with images, especially if they are associated with a link, and block them from the inbox. Some sophisticated tools can perform a keyword search of images and reject an e-mail based on the results.

  • Heuristics—This technique looks at various properties of an e-mail to determine whether collectively enough evidence exists to suggest that a piece of e-mail is spam. Using this approach, several of the techniques previously mentioned, such as header analysis and reverse DNS lookup, are combined and a judgment made based on the results. Although this approach is more accurate than any of the approaches used individually, it is still not foolproof and requires a lot of tweaking to compensate for new evasion techniques that spammers deploy.

  • Bayesian filter18—This filtering technique is one of the cleverest and most effective means for combating spam.19 It is a self-learning mechanism that can continue to outwit spammers during its lifetime. It works by taking the top tokens from legitimate e-mails and spam e-mails and placing them in a weighted list. Tokens are words, numbers, and other data that might be found in an e-mail. Fifteen tokens are considered to be the optimum number of tokens to use. Too few tokens and you get false hits because the few tokens will exist in good and bad e-mail. Selecting too many tokens results in more tokens appearing in good and bad e-mail. 20

    • Suppose, for example, that you are a doctor. It may be common for you to receive e-mail with the words breast and Viagra in them. However, the words examination, patient, x-ray, and results should be more common for your legitimate e-mails than spam. These words would become tokens for the legitimate list, and spam-related tokens would go in the other list.

    • You can see how this technique would be more effective on the client than at the server. Deploying this at the server will result in a more generic set of tokens than tokens that are customized for the type of e-mail that each individual would receive. Looking at the previous example, the tokens for the doctor would probably not appear in the legitimate e-mail list because the majority of the e-mails being received by the e-mail server probably won't be for a doctor, or certainly not for the same type of doctor.

  • Payment at risk—This is an idea that was presented at the World Economic Forum in Davos.21 It would charge the sender of e-mail a small amount of money each time one of the sender's e-mails was rejected as spam. Although this may be worrisome for senders of legitimate bulk e-mail, it should not be a problem if they are using an opt-in model for determining who is sent e-mails.

  • Honeypots22—Some spammers use open relays on the Internet to send their spam on to its final destination, thus hiding their own identity. A honeypot is a service that simulates the services of an open relay to attract spammers and detect their identity. Deploying these can help fight spam, but could also make you a target. There have been cases where companies that deployed honeypots suffered denial-of-service attacks from spammers attempting to seek retribution. Operators of honeypots can also risk litigation by interfering with Internet communications.23 Funding one may be better.24

  • Legislation—Legislation such as the Controlling the Assault of Non-Solicited Pornography and Marketing (CAN-SPAM) Act25 has made great strides in stopping U.S.-based spammers from sending out spam. The European Union's E-Privacy Directive Proposal also seeks to stop spammers.26 Support of these types of legislation can do a lot for national spam control and will hopefully encourage other nations to pass similar laws.

Challenge-Response for Account Creation

Several ISPs, such as MSN, AOL, and Yahoo, have implemented challenge-response systems for the creation of new accounts to thwart spammers who use automated programs to create new e-mail accounts from which to send new spam. In typical challenge-response systems, the user is presented with a blurred image and asked to enter the characters displayed in it using the keyboard to complete the creation of a new account. This represents a major barrier to spammers who use automated account-creation systems. EarthLink has even extended this feature to force e-mail senders to respond to a challenge e-mail before their initial e-mail is delivered to the addressee.27

A variation of this idea proposes to send a response to each sender of an e-mail to force the sender to perform a simple operation that will use up the resources of the originating e-mail server. Although this is not of any consequence to a sender of a few e-mails, this would heavily impact a company that sends millions of e-mails.

Client-Side Antispam Solutions

Client-side e-mail solutions are features that come with an e-mail client such as Outlook, Netscape, or Eudora. ISPs such as MSN, Yahoo, and AOL also provide antispam features for their client software. These features usually consist of filters that check incoming e-mail and block it based on various criteria. Many of these filtering techniques were described in the previous section.

E-mails that are filtered may be placed into a spam folder, deleted-items folder, a specified folder, or just deleted. One of the problems with these filters is they can inadvertently filter out valid e-mails. Suppose, for example, that you have a filter that routes e-mails to a spam folder based on obscene words. After setting up the e-mail filter, you may receive an e-mail from your doctor about breast cancer. This e-mail could be filtered out of your inbox as spam. For this reason, some e-mail clients permit the user to flag e-mails that have been routed to a spam folder as legitimate e-mails. This flagging tells the filter utility to accept e-mails from specific e-mail addresses or domains. The utility remembers the user's selection and uses the information to filter successive e-mails that arrive at the client. However, this can be a bit tedious. Some more advanced filters automatically place the e-mail addresses of contacts and sent e-mails on the list of acceptable e-mails, relieving users of this burden.

Microsoft Outlook 2003 and MSN software both block images by default. Images that are embedded in e-mails may contain Web beacons that can be used by spammers to validate e-mail addresses. For users who have enabled the preview feature of their e-mail client, these Web beacons can be activated without reading the e-mail.

Peer-to-peer software such as Cloudmark permits users to mark e-mail as spam. Information about the marked e-mails goes to the other members in the peer-to-peer network to block the e-mails from other members' inboxes. This permits everyone in the peer-to-peer network to benefit from spam detection by any of the members.

The company Cobion28 makes Windows, Linux, and Solaris-based e-mail filtering software. Their Web filter software controls which Web sites employees can visit based on the employee's role, the Web site's address, and the Web site's content. Their e-mail filter can control e-mail entering or leaving an enterprise. The e-mail filter makes use of acceptance and rejection lists. They also filter on domain name, subject, body content, and the content of attachments. The software is also able to scan an image file to determine whether it contains restricted text.

Spam and Infected Attachments

Undesired attachments that often accompany e-mail are not considered spam. However, when they contain viruses, they can be more harmful than the spam that delivered it. One thing that makes malicious attachments insidious is the fact that they can come from people you know who were previously infected by the same software virus. Using an antivirus application such as the ones that are made by McAfee, Symantec, or Computer Associates can help protect your computer and data from harm. Following are some guidelines that can help protect you against viruses:

  • Don't open attachments from unknown e-mail addresses.

  • Validate that attachments sent by friends were actually sent by them.

  • Use antivirus software to scan e-mail attachments.

InformIT Promotional Mailings & Special Offers

I would like to receive exclusive offers and hear about products from InformIT and its family of brands. I can unsubscribe at any time.

Overview


Pearson Education, Inc., 221 River Street, Hoboken, New Jersey 07030, (Pearson) presents this site to provide information about products and services that can be purchased through this site.

This privacy notice provides an overview of our commitment to privacy and describes how we collect, protect, use and share personal information collected through this site. Please note that other Pearson websites and online products and services have their own separate privacy policies.

Collection and Use of Information


To conduct business and deliver products and services, Pearson collects and uses personal information in several ways in connection with this site, including:

Questions and Inquiries

For inquiries and questions, we collect the inquiry or question, together with name, contact details (email address, phone number and mailing address) and any other additional information voluntarily submitted to us through a Contact Us form or an email. We use this information to address the inquiry and respond to the question.

Online Store

For orders and purchases placed through our online store on this site, we collect order details, name, institution name and address (if applicable), email address, phone number, shipping and billing addresses, credit/debit card information, shipping options and any instructions. We use this information to complete transactions, fulfill orders, communicate with individuals placing orders or visiting the online store, and for related purposes.

Surveys

Pearson may offer opportunities to provide feedback or participate in surveys, including surveys evaluating Pearson products, services or sites. Participation is voluntary. Pearson collects information requested in the survey questions and uses the information to evaluate, support, maintain and improve products, services or sites, develop new products and services, conduct educational research and for other purposes specified in the survey.

Contests and Drawings

Occasionally, we may sponsor a contest or drawing. Participation is optional. Pearson collects name, contact information and other information specified on the entry form for the contest or drawing to conduct the contest or drawing. Pearson may collect additional personal information from the winners of a contest or drawing in order to award the prize and for tax reporting purposes, as required by law.

Newsletters

If you have elected to receive email newsletters or promotional mailings and special offers but want to unsubscribe, simply email information@informit.com.

Service Announcements

On rare occasions it is necessary to send out a strictly service related announcement. For instance, if our service is temporarily suspended for maintenance we might send users an email. Generally, users may not opt-out of these communications, though they can deactivate their account information. However, these communications are not promotional in nature.

Customer Service

We communicate with users on a regular basis to provide requested services and in regard to issues relating to their account we reply via email or phone in accordance with the users' wishes when a user submits their information through our Contact Us form.

Other Collection and Use of Information


Application and System Logs

Pearson automatically collects log data to help ensure the delivery, availability and security of this site. Log data may include technical information about how a user or visitor connected to this site, such as browser type, type of computer/device, operating system, internet service provider and IP address. We use this information for support purposes and to monitor the health of the site, identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents and appropriately scale computing resources.

Web Analytics

Pearson may use third party web trend analytical services, including Google Analytics, to collect visitor information, such as IP addresses, browser types, referring pages, pages visited and time spent on a particular site. While these analytical services collect and report information on an anonymous basis, they may use cookies to gather web trend information. The information gathered may enable Pearson (but not the third party web trend services) to link information with application and system log data. Pearson uses this information for system administration and to identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents, appropriately scale computing resources and otherwise support and deliver this site and its services.

Cookies and Related Technologies

This site uses cookies and similar technologies to personalize content, measure traffic patterns, control security, track use and access of information on this site, and provide interest-based messages and advertising. Users can manage and block the use of cookies through their browser. Disabling or blocking certain cookies may limit the functionality of this site.

Do Not Track

This site currently does not respond to Do Not Track signals.

Security


Pearson uses appropriate physical, administrative and technical security measures to protect personal information from unauthorized access, use and disclosure.

Children


This site is not directed to children under the age of 13.

Marketing


Pearson may send or direct marketing communications to users, provided that

  • Pearson will not use personal information collected or processed as a K-12 school service provider for the purpose of directed or targeted advertising.
  • Such marketing is consistent with applicable law and Pearson's legal obligations.
  • Pearson will not knowingly direct or send marketing communications to an individual who has expressed a preference not to receive marketing.
  • Where required by applicable law, express or implied consent to marketing exists and has not been withdrawn.

Pearson may provide personal information to a third party service provider on a restricted basis to provide marketing solely on behalf of Pearson or an affiliate or customer for whom Pearson is a service provider. Marketing preferences may be changed at any time.

Correcting/Updating Personal Information


If a user's personally identifiable information changes (such as your postal address or email address), we provide a way to correct or update that user's personal data provided to us. This can be done on the Account page. If a user no longer desires our service and desires to delete his or her account, please contact us at customer-service@informit.com and we will process the deletion of a user's account.

Choice/Opt-out


Users can always make an informed choice as to whether they should proceed with certain services offered by InformIT. If you choose to remove yourself from our mailing list(s) simply visit the following page and uncheck any communication you no longer want to receive: www.informit.com/u.aspx.

Sale of Personal Information


Pearson does not rent or sell personal information in exchange for any payment of money.

While Pearson does not sell personal information, as defined in Nevada law, Nevada residents may email a request for no sale of their personal information to NevadaDesignatedRequest@pearson.com.

Supplemental Privacy Statement for California Residents


California residents should read our Supplemental privacy statement for California residents in conjunction with this Privacy Notice. The Supplemental privacy statement for California residents explains Pearson's commitment to comply with California law and applies to personal information of California residents collected in connection with this site and the Services.

Sharing and Disclosure


Pearson may disclose personal information, as follows:

  • As required by law.
  • With the consent of the individual (or their parent, if the individual is a minor)
  • In response to a subpoena, court order or legal process, to the extent permitted or required by law
  • To protect the security and safety of individuals, data, assets and systems, consistent with applicable law
  • In connection the sale, joint venture or other transfer of some or all of its company or assets, subject to the provisions of this Privacy Notice
  • To investigate or address actual or suspected fraud or other illegal activities
  • To exercise its legal rights, including enforcement of the Terms of Use for this site or another contract
  • To affiliated Pearson companies and other companies and organizations who perform work for Pearson and are obligated to protect the privacy of personal information consistent with this Privacy Notice
  • To a school, organization, company or government agency, where Pearson collects or processes the personal information in a school setting or on behalf of such organization, company or government agency.

Links


This web site contains links to other sites. Please be aware that we are not responsible for the privacy practices of such other sites. We encourage our users to be aware when they leave our site and to read the privacy statements of each and every web site that collects Personal Information. This privacy statement applies solely to information collected by this web site.

Requests and Contact


Please contact us about this Privacy Notice or if you have any requests or questions relating to the privacy of your personal information.

Changes to this Privacy Notice


We may revise this Privacy Notice through an updated posting. We will identify the effective date of the revision in the posting. Often, updates are made to provide greater clarity or to comply with changes in regulatory requirements. If the updates involve material changes to the collection, protection, use or disclosure of Personal Information, Pearson will provide notice of the change through a conspicuous notice on this site or other appropriate way. Continued use of the site after the effective date of a posted revision evidences acceptance. Please contact us if you have questions or concerns about the Privacy Notice or any objection to any revisions.

Last Update: November 17, 2020