Emails.  You need them, you want them.

But so does everybody else.

How do you get them without being spammy?

There are any number of valid ways to do this, but they all fall into one of two basic strategies. 

Strategy #1: The Sherlock Holmes

The undisputed winner in the contest to find quality contact information, the Sherlock Holmes method is simply reading through the website that provided the backlink to discover relevant contact information for the author, the webmaster, or the organization.

The advantage of the Sherlock Holmes method is that it is personal and real. If you’re having trouble finding a good contact, you can literally ask. You can reach out through instant message, send an email inquiry asking who you should talk to, or call them on the phone.

I once used this method to ace a job interview for a large link building firm. Rather than wasting my time running all over the internet trying to figure out who to contact, I picked up this ancient communicator called a telephone and just called up some small-town city hall. I got past a basic automated telephone system and simply asked a real person who the right person was to approach about linking to their articles. The wonderfully helpful staff gave me the information for their webmaster and their director of digital marketing. The exact two people I wanted to talk to.

The downside is, it took me the better part of an hour. And that’s the trade-off you make with the Sherlock Holmes method. You get cleaner results because you’re cleaning them as you go by rejecting some options of contact in favor of others. You know well enough to ignore some email addresses and highlight others because you’re an educated human who knows what you’re looking for. But sifting through all of that information will ultimately take a considerable amount of time. 

If you have time to give, or you’re comfortable giving away your staff’s time, this is a very viable way to build a clean list. However, it can get a little tedious, which is why many people prefer another strategy.

Strategy #2: Robots…

The opposite of doing it yourself is, in fact, not doing it yourself. Surprise, surprise. Now, I’m sure you can find any number of solid ways to pull email information from a website, but most of the techniques you’ll find fall under the category of scraping. Scraping is where a piece of software pulls all of the code that makes up a website and parses it for certain types of keywords or patterns, such as email addresses.
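If you’re curious what that looks like under the hood, here is a minimal Python sketch of the core idea: take the raw HTML of a page and pull out anything shaped like an email address. The sample HTML, the extract_emails name, and the deliberately simple pattern are all illustrative assumptions; this is not how Scrapebox itself is built.

```python
import re

# Deliberately simple pattern: something@something.tld
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def extract_emails(html):
    """Return unique email-like strings from html, lowercased, in order of first appearance."""
    seen = []
    for match in EMAIL_PATTERN.findall(html):
        email = match.lower()
        if email not in seen:
            seen.append(email)
    return seen

# Made-up page source, standing in for the code a scraper would download.
sample_html = """
<p>Questions? Write to <a href="mailto:Webmaster@example.com">our webmaster</a>.</p>
<img src="logo@2x.png">
<footer>info@example.com | webmaster@example.com</footer>
"""

print(extract_emails(sample_html))
```

Notice that the image filename logo@2x.png slips through alongside the real addresses. A pattern match can’t tell the difference, which is why a cleaning pass by a human (or a stricter filter) is still needed afterward.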

This is what Google does when it sends its search bot to your site. Its bots crawl around to understand what kind of content you’ve created and how valuable it is to people. For this job, we need a tool much less refined than Google’s crawling spiders. For ourselves, we chose the creatively named {sarcasm} program called Scrapebox.


Scrapebox is a piece of software that falls squarely into the realm of “gray hat” SEO, which is a fancy marketing way of saying that it’s a fairly neutral tool. It’s like a really good sword. A practiced swordsman can use it to devastating effect on only the targets he chooses. Alternatively, your average fantasy nerd would probably pick it up and accidentally kill someone, if not himself.

A significant percentage of Scrapebox users are bad actors. A vast number of the spam comments you get on your blog or social media channel are probably generated through Scrapebox or some very similar piece of software.

But used correctly, Scrapebox can enable you to do high-value outreach at a speed that would otherwise be unsustainable. You can use Scrapebox to pull all of the possible contact information off of a single page and every page connected to that page with a technique called mining. And it doesn’t just go one web page at a time. You can upload a list of hundreds of backlinks and Scrapebox can parse them all as a single job. It’s one of the best tools around for streamlining tedious SEO activities, and it’s hard to beat for a one-time price of ninety-seven dollars.

Scrapebox

For the purpose of this book, I’m going to assume that you’re not a programmer or software engineer and that you have better things to do with your time than manually go through a hundred different websites in order to pick out a couple of ideal email addresses. So I’m going to walk you through exactly how we use Scrapebox at the Infinite Upcycle to get the most accurate data possible and some of the things we do to avoid dangerous pitfalls.

There is a free version that you can use to try out the software in case you’re not sure. But trust me, buy Scrapebox.

The time you spend learning the software will save you hours upon hours in the future, without ever having to sacrifice your integrity on sleazy activities or your sanity on tedious ones.

https://www.youtube.com/watch?v=CykedqJg92w

Final Step: Clean Out Useless and Dangerous Emails

This is where you transition from using Scrapebox to speed up your work back to using your brain to make sure you’re doing it right. Simply put, you should know the kind of email address you’re looking for. An ideal prospect email should be at least one of the following:

A significant portion of the emails collected by Scrapebox will not meet the previous criteria, so you will need to clean house. Next, you are simply going to delete every row containing an email address that matches any of the following:

  • Addresses that end in .jpg, .jpeg, .png, etc. (these are image files, not emails)
  • Any email with words like “noreply”, “spam”, “advertise” or “sales”
  • Spam fishers: addresses such as those ending in sentry.io are used to bait scrapers, and sending to them gets you instantly reported to blacklists.
  • Addresses from massive publishers, aggregators or platforms like the New York Times or Medium.com. These sites give links so rarely that it’s effectively never.
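If you’d rather not eyeball every row, the deletion rules above can be roughly sketched in Python. Treat this as a first pass under stated assumptions, not a replacement for your own judgment: the is_junk name, the exact word and domain lists, the extra .gif suffix, and the sample addresses are all illustrative and should be adjusted to your own list.

```python
# Illustrative rule lists based on the bullets above; extend them for your own data.
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png", ".gif")   # .gif is an assumed extra
JUNK_WORDS = ("noreply", "spam", "advertise", "sales")
BLOCKED_DOMAINS = ("sentry.io", "nytimes.com", "medium.com")

def is_junk(email):
    """Return True if the address matches any of the deletion rules."""
    email = email.strip().lower()
    if email.endswith(IMAGE_SUFFIXES):
        return True
    if any(word in email for word in JUNK_WORDS):
        return True
    domain = email.rpartition("@")[2]
    # Match the blocked domain itself or any subdomain of it.
    return any(domain == d or domain.endswith("." + d) for d in BLOCKED_DOMAINS)

# Made-up sample addresses.
scraped = [
    "webmaster@cityhall.example",
    "logo@2x.png",
    "noreply@shop.example",
    "abc123@o1234.ingest.sentry.io",
    "tips@nytimes.com",
    "jane@janesgarden.example",
]

keepers = [email for email in scraped if not is_junk(email)]
print(keepers)
```

One caveat on the design: the word check is a plain substring match, so it will also throw out a legitimate address that happens to contain one of those words somewhere. That errs on the side of caution, but it is worth skimming what got removed.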

💥NOTE:  Multiple emails for the same domain are ok if they make sense, just put them on the same row/column and separate them with a comma (example: info@website.com, webmaster@website.com)
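For anyone handling the list outside of a spreadsheet, the note above amounts to grouping addresses by domain and joining them with commas. Here is a small Python sketch of that; the group_by_domain name and the sample addresses are assumptions for illustration.

```python
def group_by_domain(emails):
    """Map each domain to a comma-separated string of its addresses, keeping input order."""
    grouped = {}
    for email in emails:
        domain = email.rpartition("@")[2].lower()
        grouped.setdefault(domain, []).append(email)
    return {domain: ", ".join(addresses) for domain, addresses in grouped.items()}

rows = group_by_domain([
    "info@website.com",
    "jane@other.example",
    "webmaster@website.com",
])

for domain, cell in rows.items():
    print(domain, "->", cell)
```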

Now that you’re done cleaning out the junk email addresses, you’re going to re-sort the entire spreadsheet by URL and then by email. If you’re using Google Sheets, your settings should look like this:

Sorting this way will bring all the URLs with emails to the top of the spreadsheet. Now we’re going to copy all of the rows with emails into the “FINAL” tab.

Before we add names to this list of emails, we are going to copy all of our selected emails into an add-on tool for Scrapebox called email scraper. This tool will validate the emails by checking to see if they respond to a query. By checking first, you can remove dead email addresses, which not only saves you considerable time but also prevents you from being flagged as spam for sending to non-existent emails.

If you’re looking for a free option (the Scrapebox email scraper is an additional forty-seven dollars), a tool like NeverBounce will allow you to validate a limited number of emails for free. Other tools like EmailListVerify will also look for spam traps, and you can buy service based on your projected volume. That saves you money by preventing you from sending useless emails, and it allows you to tie a specific cleaning cost to each cycle.

In the final tab, you’re going to look through all of the email addresses and try to add first names to any of the emails where the first name is obvious (e.g., for JohnSmith@johnsshow.com you’d add the name John to the first name field). The name may be obvious from either the email side (johnsmith@) or the URL side (johnsshow.com). Generally speaking, a URL side name will only be relevant if you’re confident the website you’re targeting is owner-operated. However, it’s often better than nothing.
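If your list is long, a script can suggest first names for you to confirm. The Python sketch below checks the email side first and falls back to the URL side, and it reports which side the guess came from so you can treat URL side guesses with more suspicion. The guess_first_name name and the tiny list of known names are assumptions for illustration; a real list would be far longer.

```python
# Tiny illustrative list of first names; a real one would be much longer.
KNOWN_FIRST_NAMES = ("john", "jane", "maria", "david")

def guess_first_name(email):
    """Return (name, side) where side is 'email' or 'url', or None if nothing obvious."""
    local, _, domain = email.strip().lower().partition("@")
    for side, text in (("email", local), ("url", domain)):
        for name in KNOWN_FIRST_NAMES:
            if text.startswith(name):
                return name.capitalize(), side
    return None

print(guess_first_name("JohnSmith@johnsshow.com"))
print(guess_first_name("info@johnsshow.com"))
print(guess_first_name("webmaster@cityhall.example"))
```

Prefix matching is crude and will misfire on things like a surname that starts with a first name, so treat every result as a suggestion and have a human look it over before it goes into the first name field.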