List Cleaning Secrets
This is more a collection of notes from my head than a structured article. Below, you’ll find information that took me some time to confirm. All of it is useful if you want to clean your list without paying fat fees to third parties every time before you hit SEND. Because that’s the thing: you have to scrub your list regularly, especially if you don’t send emails very often.
The very first thing you need to do to scrub your list is to remove inactive subscribers. Don’t send to folks who don’t open your emails, because that hurts your reputation and your ability to reach the inbox.
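Here is a minimal sketch of that first step. It assumes you can export, for each subscriber, the date they last opened one of your emails; the record layout, the sample addresses and the 180-day cut-off are all assumptions for illustration, not a rule from any ESP.

```python
from datetime import datetime, timedelta

# Hypothetical export: (email, date of last open, or None if they never opened).
subscribers = [
    ("alice@example.com", datetime(2024, 5, 20)),
    ("bob@example.com", datetime(2023, 1, 3)),
    ("carol@example.com", None),
]

def split_by_activity(subscribers, now, max_idle_days=180):
    """Return (active, inactive) email lists based on the last open date."""
    cutoff = now - timedelta(days=max_idle_days)
    active, inactive = [], []
    for email, last_open in subscribers:
        if last_open is not None and last_open >= cutoff:
            active.append(email)
        else:
            inactive.append(email)
    return active, inactive

active, inactive = split_by_activity(subscribers, now=datetime(2024, 6, 1))
print("keep:", active)
print("remove:", inactive)
```

Pick the cut-off to match how often you send: if you mail once a quarter, 180 days is only two emails.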
You don’t only have to keep your list clean for the ISPs but for your ESP too. If you get high bounce rates, it’s the ESP that will dump you first, because you’re damaging their IP reputation.
Spam-traps, due to their nature, are not traceable: they are created in secret and are only known to the mail administrator who created them. Some companies claim they can catch spam-traps, but I doubt that… Spam-traps become an issue when you buy lists or scrape addresses from the Internet (harvesting robots, etc.). If you only collect emails on your site through an opt-in process, you should be fine. When I first read about spam-traps I thought… what if a competitor opts in with a spam-trap on purpose to sabotage me? OK dude, you’re really creative, but chill… 🙂 That’s too much.
A list cleaning company will not verify just any list you give them. They have their own terms and policies, and if they find out you are a spammer who collects every available address out there, they will refuse to clean your list and will block your account instantly. Why? Because by checking your spam-traps and garbage they sabotage their own IP reputation and network. They themselves look stupid to the ISPs.
Catch-all (or accept-all) domains are domains that accept any email whether or not the recipient exists. So, an email addressed to whatever@catchalldomain.com will be accepted by the server. These domains are sometimes used to catch spammers, but not always.
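One common way to guess whether a domain is a catch-all is to ask about a mailbox that almost certainly doesn’t exist and see if the server still says yes. The sketch below assumes a hypothetical `rcpt_check` callable that returns the SMTP reply code for RCPT TO (for example, a wrapper around an smtplib conversation); a fake version stands in for it here so the example runs offline.

```python
import uuid

def looks_like_catch_all(domain, rcpt_check):
    """Guess whether `domain` accepts mail for any recipient.

    `rcpt_check` is a hypothetical callable: it takes an address and
    returns the SMTP reply code the server gave to RCPT TO.
    """
    # A random local part that almost certainly has no real mailbox behind it.
    probe = f"{uuid.uuid4().hex}@{domain}"
    return rcpt_check(probe) == 250

# Stand-in for a real SMTP probe (an assumption, for demonstration only).
def fake_rcpt_check(address):
    return 250 if address.endswith("@catchalldomain.com") else 550

print(looks_like_catch_all("catchalldomain.com", fake_rcpt_check))
print(looks_like_catch_all("strict.example", fake_rcpt_check))
```

If the probe is accepted, a 250 for a real-looking address on that domain tells you nothing, so treat every address there as unverified.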
Some addresses may be deliverable but still risky to send to: catch-all domains, addresses of the form asdfsdfasfawe@gmail.com (these are probably not used much by their owners, which implies low engagement), and role addresses (admin@company.com, info@company.com, etc.).
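A rough sketch of how you could flag these risky addresses yourself. The list of role names, the set of known catch-all domains and the “gibberish” test (five or more consonants in a row) are all assumptions of mine; the gibberish heuristic in particular is crude and will misfire on some real names.

```python
# Assumed, incomplete list of role-style local parts.
ROLE_LOCAL_PARTS = {"admin", "info", "sales", "support", "contact", "postmaster"}

def longest_consonant_run(text):
    """Length of the longest run of consecutive consonant letters."""
    longest = current = 0
    for ch in text.lower():
        if ch.isalpha() and ch not in "aeiou":
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest

def risk_flags(address, catch_all_domains=frozenset()):
    """Return a list of reasons why an address may be risky to mail."""
    local, _, domain = address.lower().rpartition("@")
    flags = []
    if local in ROLE_LOCAL_PARTS:
        flags.append("role")
    if domain in catch_all_domains:
        flags.append("catch-all")
    if longest_consonant_run(local) >= 5:  # crude keyboard-mash detector
        flags.append("gibberish")
    return flags

print(risk_flags("info@company.com"))
print(risk_flags("asdfsdfasfawe@gmail.com"))
print(risk_flags("whatever@catchalldomain.com", {"catchalldomain.com"}))
```

Flagged addresses are not necessarily bad; the flags are a reason to watch their engagement more closely, not to delete them outright.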
An SMTP check is when you connect to the mail server, say hello (HELO/EHLO), issue MAIL FROM and RCPT TO, and so start to send mail or at least engage with the server. An MX check is just checking records that exist in the DNS. You can safely do as many MX checks as you want. That’s not the same with the SMTP requests! If you push too much you’ll be blocked or reported to an organisation like Spamhaus. Another scenario is that the mail server starts stonewalling you, i.e. it keeps giving the same answer whatever you ask, or keeps telling you to try again later (greylisting). Below is what I mean using Python code (credit to Scott).
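import socket
import smtplib

import dns.resolver

# Address to check (placeholder; the original snippet never defined it)
addressToVerify = 'someone@openedclicked.com'

# MX check: look up the domain's mail servers in DNS
# (dnspython 2.x; older versions use dns.resolver.query instead)
records = dns.resolver.resolve('openedclicked.com', 'MX')
# Use the most preferred server (lowest preference value)
mxRecord = str(min(records, key=lambda r: r.preference).exchange)

# Get local server hostname
host = socket.gethostname()

# SMTP lib setup (use debug level for full output)
server = smtplib.SMTP()
server.set_debuglevel(0)

# SMTP conversation: connect, greet, name a sender, ask about the recipient
server.connect(mxRecord)
server.helo(host)
server.mail('me@domain.com')
code, message = server.rcpt(str(addressToVerify))
server.quit()

# Assume 250 as Success
if code == 250:
    print('Success')
else:
    print('Bad')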
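The snippet above treats everything other than 250 as “Bad”, but SMTP reply codes carry a bit more information: 4xx replies are temporary (this is what greylisting looks like) and 5xx replies are permanent. A small sketch of a finer-grained reading; the function name and the labels it returns are my own invention.

```python
def interpret_rcpt_code(code):
    """Map an SMTP reply code for RCPT TO to a rough verdict."""
    if code in (250, 251):
        return "accepted"      # server says it will take mail for this address
    if 400 <= code < 500:
        return "retry-later"   # temporary failure, e.g. greylisting or throttling
    if 500 <= code < 600:
        return "rejected"      # permanent failure, e.g. 550 mailbox unavailable
    return "unknown"

for code in (250, 451, 550):
    print(code, interpret_rcpt_code(code))
```

Only “rejected” is a good reason to drop an address straight away. “retry-later” means ask again after a while, and “accepted” proves little on a catch-all domain.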
Cleaning companies are able to perform SMTP checks on big volumes because they own a trusted network which they manage and maintain. An Amazon server can do loads of SMTP checks without any issues. The thing is that if you check a ton of addresses that don’t exist, this looks to a mail server like you’re trying to guess addresses, so you get blocked. On the contrary, if you check an address that used to exist but no longer does (like an old work address of mine), that’s a fair check. In other words, to a mail server, you look as good as your questions. The more you behave like a good Internet citizen and uphold the ISPs’ and ESPs’ standards, the better your reputation becomes.
A lot of cleaning companies don’t check webmail – freemail accounts (yahoo.com, gmail.com, yandex.com, hotmail.com, etc.). They only check corporate accounts (ibm.com, barclays.com, etc.). Well, to be honest, it’s safer not to check all these freemail accounts than to check them. Why? First, your list is full of free webmail addresses, and unless you have a highly reputable network, you don’t want to f£$% around with Google asking thousands of silly questions. Second, it’s unlikely that one of these addresses is no longer valid, provided that someone opted in to your list with it and you didn’t just guess! People didn’t shut down their Hotmail when they moved to Gmail. They just don’t check their Hotmail anymore, or at least not as often. The issue is with corporate emails when people change jobs. I used to work at Barclays Investment Bank, so I had an account angelos.georgakis@barclayscapital.com. That one is not valid anymore… or maybe it is??? S£@$, let me have a look! Nope, all good. 🙂
So, regarding free webmails, if people gave you an email address that they don’t check often, you can remove it when you delete inactive subscribers from your list. Low engagement damages your reputation, remember. What I mean is that you don’t have to check the validity of those emails; you’ll get rid of them anyway because of low engagement.
Now, think about this. If you can disregard the webmails (which are the vast majority of your list), you only need to check corporate addresses for list cleaning purposes. Those corporate addresses will be spread out across different mail servers (different domains), so you’re safe to do the “mailbox exists” SMTP checks using code like the Python one above.
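As a sketch of that idea, the snippet below drops the freemail addresses and groups what’s left by domain, so you can see how many questions each mail server would get. The freemail set is a short assumed sample, not a complete list, and the example addresses are made up.

```python
from collections import defaultdict

# Assumed sample of freemail domains; extend it for your own list.
FREEMAIL_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com", "yandex.com"}

def corporate_by_domain(addresses):
    """Group non-freemail addresses by their domain."""
    groups = defaultdict(list)
    for address in addresses:
        domain = address.rsplit("@", 1)[-1].lower()
        if domain not in FREEMAIL_DOMAINS:
            groups[domain].append(address)
    return dict(groups)

sample = [
    "jane@gmail.com",
    "tom@ibm.com",
    "ann@barclays.com",
    "raj@ibm.com",
]
for domain, members in corporate_by_domain(sample).items():
    print(domain, len(members))
```

Any domain with a large count is one where you should slow down or hand the job to someone with a trusted network.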
Now, what are the chances that 50 subscribers in your list work in the same company? Not that high… I mean it can happen, especially if that company is really big. But even in that case, if you have 50 @barclayscapital.com subscribers following you, that means you’re already big enough to be able to afford to pay a cleaning company like Kickbox to verify those addresses. The whole point of this article (I discovered the point after I started writing… I know that’s not cool) is what you can do yourself to clean your list for free.
An alternative, if you don’t want to pay, is to not do the checks all at once but to spread them out in time (throttling) so you don’t get blocked. You could take this further and hop from IP to IP, but that can be a pain. I’d say there are other, more useful things to do for growing your business than that…
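Spreading checks out in time can be as simple as planning a start time for each one. The sketch below keeps a minimum gap between any two checks and a longer gap between checks that hit the same domain; the 5-second and 300-second gaps are arbitrary assumptions, not thresholds published by any mail provider.

```python
from collections import defaultdict

def plan_checks(addresses, per_domain_gap=300, global_gap=5):
    """Return (offset_seconds, address) pairs, in the order given.

    No two checks start less than `global_gap` seconds apart, and checks
    against the same domain are at least `per_domain_gap` seconds apart.
    """
    next_free = defaultdict(int)  # earliest allowed start time per domain
    clock = 0                     # earliest allowed start time overall
    plan = []
    for address in addresses:
        domain = address.rsplit("@", 1)[-1].lower()
        start = max(clock, next_free[domain])
        plan.append((start, address))
        next_free[domain] = start + per_domain_gap
        clock = start + global_gap
    return plan

for offset, address in plan_checks(["a@x.com", "b@x.com", "c@y.com"]):
    print(offset, address)
```

A real runner would sleep until each offset before doing the SMTP check. Shuffling the list so that domains alternate makes the whole plan shorter.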
Another thing I wanted to mention is this: don’t think of doing SMTP checks through a VPN, because most mail servers know about VPN addresses, especially when it comes to server-to-server communication. So, you’ll be blocked or greylisted (they will start giving you back only one response – unknown, unknown, etc.) sooner or later. Amazon, for example, cracks down on people who use VPN IPs to leave fake reviews. How do they know? They do!