When building a user registration flow or cleaning up a massive, messy dataset, the instinct is to rely on a standard Regular Expression (Regex) to validate email addresses. You write the pattern, the script runs perfectly against your test cases, the data looks pristine, and you confidently push the code to production.
A month later, your marketing team tells you that the hard bounce rate has spiked to 10%, and your domain is on the verge of being blacklisted by Google.
What went wrong?
The truth is, validating the syntax of an email address is not the same as validating its existence. An email like fakeuser12345@google.com will pass almost any Regex pattern, but any message sent to it will hard-bounce.
If you want to protect your database and your sender reputation, you need to go deeper. In this guide, we are going to drop the complex Regex patterns and build a Python script that actually talks to mail servers to verify if an inbox is real.
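To fully validate an email address in Python, your code must perform three distinct checks: a syntax check on the string itself, a DNS lookup to confirm the domain publishes MX records, and an SMTP handshake to ask the mail server whether the mailbox exists.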
Before we write the real code, we need to address the elephant in the room. Why not just use Regex?
The official standard for email addresses (RFC 5322) is incredibly complex. Writing a Regex pattern that covers every valid edge case is nearly impossible, and a simple pattern either rejects legitimate addresses or lets malformed ones through.
More importantly, Regex is blind. It only looks at the shape of the string. It cannot tell you if the domain name has expired, if the mail server is offline, or if the user account was deleted yesterday. Relying solely on Regex is the primary cause of dirty databases.
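To make the point concrete, here is a minimal sketch of the kind of simple pattern often found in signup forms. The pattern and the helper name looks_like_email are illustrative assumptions, not a recommended validator. It shows that a made-up mailbox and a domain that can never resolve both pass, because the pattern only inspects the shape of the string.

```python
import re

# A deliberately simple pattern (illustrative only): something@something.something
SIMPLE_EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def looks_like_email(address: str) -> bool:
    """Return True if the string has the rough shape of an email address."""
    return SIMPLE_EMAIL_RE.fullmatch(address) is not None


candidates = [
    "fakeuser12345@google.com",            # well formed, but the mailbox is made up
    "someone@no-such-domain-xyz.invalid",  # well formed, but .invalid is a reserved TLD
    "not-an-email",                        # no '@' at all
]

for candidate in candidates:
    print(candidate, looks_like_email(candidate))
```

Only the last string is rejected. The first two look fine to the pattern even though neither can receive mail.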
The first layer of our script needs to handle the basics: syntax and Domain Name System (DNS) checks. If a domain does not have Mail Exchanger (MX) records, it cannot receive email, and we can immediately reject it.
Instead of reinventing the wheel, we will use the highly reliable email-validator library for this step.
First, install the package: pip install email-validator
Here is how you implement the initial check (Python):
from email_validator import validate_email, EmailNotValidError

def basic_validation(email_address):
    try:
        # Check syntax and query DNS for MX records
        valid = validate_email(email_address, check_deliverability=True)
        # Update with the normalized form of the email
        email = valid.normalized
        print(f"Success: {email} is valid and the domain can receive mail.")
        return True
    except EmailNotValidError as e:
        # The email is either improperly formatted or the domain lacks MX records
        print(f"Error: {str(e)}")
        return False

# Test the function
basic_validation("test@example.com")
This script is a massive upgrade from Regex. However, it still does not tell us if the specific user exists on that server. To find out, we need to knock on the door.
This is where the real engineering happens. We are going to use Python's built-in smtplib and the dnspython library to connect to the recipient's mail server.
First, install the DNS library: pip install dnspython
The concept is simple: We will ask DNS for the domain's MX records, which name the mail servers that accept mail for it. Then, we will connect to one of those servers and start an SMTP conversation. We will pretend we are about to send an email, and wait to see if the server accepts the recipient's address. Before sending any message data, we will close the connection.
Here is the complete script (Python):
import dns.exception
import dns.resolver
import smtplib

def verify_email_existence(email):
    # Split on the last '@' so a malformed address cannot break the unpacking
    user, _, domain = email.rpartition('@')
    if not user or not domain:
        print("Invalid: The address is missing a user or domain part.")
        return False
    try:
        # Step 1: Get the MX records and pick the most preferred one
        # (the lowest preference value wins)
        records = dns.resolver.resolve(domain, 'MX')
        best = min(records, key=lambda record: record.preference)
        mx_record = best.exchange.to_text()
        # Step 2: Connect to the mail server
        # The timeout keeps an unresponsive host from hanging the script,
        # and the with-block closes the connection even if a step fails
        with smtplib.SMTP(timeout=10) as server:
            server.set_debuglevel(0)  # Set to 1 to see the raw SMTP conversation
            server.connect(mx_record, 25)
            server.helo(server.local_hostname)
            # Step 3: Start the handshake
            server.mail('verification@yourdomain.com')
            code, message = server.rcpt(email)
        # Step 4: Interpret the response
        if code == 250:
            print(f"Valid: The mailbox {email} exists.")
            return True
        else:
            print(f"Invalid: The server rejected the address (Code: {code}).")
            return False
    except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
        print("Invalid: Domain does not exist or has no mail servers.")
        return False
    except (dns.resolver.NoNameservers, dns.exception.Timeout) as e:
        print(f"Error: DNS lookup failed. {e}")
        return False
    except (smtplib.SMTPException, OSError) as e:
        print(f"Error: Could not connect to the mail server. {e}")
        return False

# Test the function
verify_email_existence("bill.gates@microsoft.com")
When we issue the server.rcpt(email) command, the receiving server checks its internal directory. If the user exists, it responds with a 250 OK code. If the user was deleted or never existed, it typically responds with a 550 User Unknown code.
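Not every non-250 reply means the mailbox is gone. SMTP replies in the 4xx range are temporary failures (greylisting or a busy server, for example), while 5xx replies are permanent. The sketch below is one possible way to map a reply code to a verdict. The function name classify_rcpt_reply and the verdict labels are assumptions made for illustration and are not part of smtplib.

```python
def classify_rcpt_reply(code: int) -> str:
    """Map the numeric reply to RCPT TO onto a coarse verdict."""
    if code in (250, 251):
        # 250 = OK, 251 = user not local, but the server will forward
        return "accepted"
    if 400 <= code < 500:
        # Temporary failure: the address may be fine, try again later
        return "retry_later"
    if 500 <= code < 600:
        # Permanent failure, e.g. 550 for an unknown user
        return "rejected"
    # Anything else is unexpected at this stage of the conversation
    return "unknown"


for reply_code in (250, 450, 550):
    print(reply_code, classify_rcpt_reply(reply_code))
```

Treating "retry_later" as its own bucket avoids deleting real users from your list just because their server was busy when you asked.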
If you run the script above on a corporate email address, you might notice something strange. It will return a 250 OK code for any random string you type, like dsfhjsdfh@company.com.
This is because many B2B domains are configured as Catch-All (Accept-All) servers. They are set up to accept every incoming email, regardless of whether the specific user exists, often to route misdirected mail to a central admin inbox.
A simple Python script cannot bypass a Catch-All server. Sending real campaigns to Catch-All domains is highly risky, as it inflates your bounce rate silently.
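While a script cannot see past a Catch-All server, it can at least detect one: probe a random address that almost certainly does not exist, and if the server accepts it, the domain accepts everything. The sketch below assumes a probe callable that returns True when the server accepts the recipient (for instance, a wrapper around the SMTP check shown earlier). The names is_catch_all and classify_address are hypothetical.

```python
import uuid
from typing import Callable

# probe(address) -> True if the mail server accepted RCPT TO for that address
Probe = Callable[[str], bool]


def is_catch_all(domain: str, probe: Probe) -> bool:
    """Probe a random, almost certainly nonexistent mailbox on the domain."""
    random_local = f"probe-{uuid.uuid4().hex}"
    return probe(f"{random_local}@{domain}")


def classify_address(email: str, probe: Probe) -> str:
    """Return 'invalid', 'catch_all', or 'valid' for one address."""
    domain = email.rpartition("@")[2]
    if not probe(email):
        return "invalid"
    if is_catch_all(domain, probe):
        # The server said yes, but it says yes to everyone
        return "catch_all"
    return "valid"


# Stand-in probes so the example runs without touching the network
accepts_everything = lambda address: True
knows_only_alice = lambda address: address == "alice@example.com"

print(classify_address("dsfhjsdfh@company.com", accepts_everything))
print(classify_address("alice@example.com", knows_only_alice))
print(classify_address("bob@example.com", knows_only_alice))
```

A "catch_all" verdict does not tell you whether the person exists. It only tells you that the 250 response carries no information.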
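The script we built is excellent for learning the mechanics of email infrastructure. However, deploying this script on your own server to clean a database of 100,000 emails is a terrible idea: bulk probing from your own IP address runs into rate limits, puts your own infrastructure at risk, and still cannot resolve Catch-All domains.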
In a production environment, you need to offload the risk, handle Catch-All logic, and process requests asynchronously. This requires a distributed infrastructure with rotating residential IPs.
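To illustrate only the asynchronous part, here is a minimal sketch that runs many checks concurrently while capping how many are in flight at once. It assumes a check coroutine supplied by the caller. The names verify_many and fake_check are hypothetical, and the stand-in checker never touches the network.

```python
import asyncio
from typing import Awaitable, Callable


async def verify_many(
    emails: list[str],
    check: Callable[[str], Awaitable[str]],
    max_concurrent: int = 5,
) -> dict[str, str]:
    """Run check() for every address, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(email: str) -> tuple[str, str]:
        async with semaphore:
            try:
                return email, await check(email)
            except Exception as exc:
                # One failing lookup should not abort the whole batch
                return email, f"error: {exc}"

    pairs = await asyncio.gather(*(guarded(email) for email in emails))
    return dict(pairs)


async def fake_check(email: str) -> str:
    """Stand-in for a real verification call."""
    await asyncio.sleep(0)
    return "valid" if email.endswith("@example.com") else "invalid"


results = asyncio.run(
    verify_many(["a@example.com", "b@example.org", "c@example.com"], fake_check)
)
print(results)
```

A blocking function such as the smtplib-based check can be plugged in by wrapping it with asyncio.to_thread (Python 3.9+). The concurrency cap limits load, but it does nothing for IP reputation.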
The most efficient way to validate emails in Python at scale is by utilizing a dedicated API like EmailAwesome.
Here is how easily you can implement enterprise-grade validation in just a few lines using the standard requests library:
import requests

def validate_with_emailawesome(email_address):
    api_key = "YOUR_EMAILAWESOME_API_KEY"
    endpoint = "https://api.emailawesome.com/v1/verify"
    # Pass values as params so requests URL-encodes characters such as '+' in the address
    params = {"email": email_address, "key": api_key}
    try:
        # The timeout keeps a stalled request from hanging the script
        response = requests.get(endpoint, params=params, timeout=10)
    except requests.RequestException as e:
        print(f"API request failed: {e}")
        return
    if response.status_code == 200:
        data = response.json()
        status = data.get('status')
        if status == 'valid':
            print("Inbox is real and safe to send.")
        elif status == 'catch_all':
            print("Proceed with caution. Domain is a Catch-All.")
        else:
            print(f"Do not send. Status: {status}")
    else:
        print("API request failed.")

validate_with_emailawesome("test@example.com")
By leveraging an API, you bypass rate limits, instantly identify disposable burner emails, and get accurate diagnostics on Catch-All servers, all without putting your own infrastructure at risk.
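Once every address has a status, the remaining step is deciding what to do with each one. The sketch below is one possible policy, not part of any API: it assumes a dictionary of address-to-status results that uses the 'valid' and 'catch_all' values from the example above, and treats anything else as do-not-send. The name plan_actions and the bucket labels are hypothetical.

```python
# Assumed policy: which bucket each status lands in
ACTION_BY_STATUS = {"valid": "send", "catch_all": "review"}


def plan_actions(results: dict[str, str]) -> dict[str, list[str]]:
    """Group addresses into send / review / suppress buckets by status."""
    plan: dict[str, list[str]] = {"send": [], "review": [], "suppress": []}
    for email, status in results.items():
        # Any status we do not explicitly trust is suppressed
        plan[ACTION_BY_STATUS.get(status, "suppress")].append(email)
    return plan


sample_results = {
    "alice@example.com": "valid",
    "info@company.com": "catch_all",
    "ghost@example.com": "invalid",
}
print(plan_actions(sample_results))
```

Defaulting unknown statuses to "suppress" is the conservative choice: a missed send costs less than a hard bounce.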
Validating emails is a non-negotiable step in modern software development. Bad data costs money and destroys sender reputation.
Protect your application's data integrity today.
Ready to implement scalable email validation? Get your API key and 1,000 free requests with EmailAwesome.