
Python Email Validation: How to Check if an Address Exists (Without Regex)

If you are using Regex to validate email addresses in Python, you are letting up to 40% of invalid emails slip into your database. True email validation requires pinging the recipient's mail server. In this developer guide, we will build a Python script from scratch that performs syntax checks, MX record lookups, and deep SMTP handshakes to verify if an email address actually exists.

When building a user registration flow or cleaning up a massive, messy dataset, the instinct is to rely on a standard Regular Expression (Regex) to validate email addresses. You write the pattern, the script runs perfectly against your test cases, the data looks pristine, and you confidently push the code to production.

A month later, your marketing team tells you that the hard bounce rate has spiked to 10%, and your domain is on the verge of being blacklisted by Google.

What went wrong?

The truth is, validating the syntax of an email address is not the same as validating its existence. An address like fakeuser12345@google.com will pass almost any Regex pattern, but a message sent to it will hard-bounce if no such mailbox exists.

If you want to protect your database and your sender reputation, you need to go deeper. In this guide, we are going to drop the complex Regex patterns and build a Python script that actually talks to mail servers to verify if an inbox is real.

Key Takeaways (TL;DR)

To fully validate an email address in Python, your code must perform three distinct checks:

  • Syntax Check: Ensures the string follows basic formatting rules (contains an "@" and a valid Top Level Domain).
  • DNS Validation (MX Record): Queries the Domain Name System to confirm the domain is configured to receive mail.
  • SMTP Handshake: Communicates directly with the recipient's mail server to confirm the specific mailbox exists, without actually sending a message.
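
The syntax check in the first bullet does not need a regular expression. Below is a minimal sketch using plain string operations; the function name `looks_like_email` and its rules (one "@", a dotted domain, an alphabetic TLD of two or more letters) are illustrative assumptions, deliberately looser than RFC 5322.

```python
def looks_like_email(address: str) -> bool:
    """Cheap structural check: a local part, an "@", and a dotted domain."""
    # Split on the last "@" so the domain is always the final part.
    local, at, domain = address.rpartition("@")
    if not at or not local or not domain:
        return False
    # Simplifying assumptions: no spaces, no extra "@", and a sane length.
    if " " in address or "@" in local or len(address) > 254:
        return False
    labels = domain.split(".")
    # Require at least "name.tld" and reject empty labels such as "a..com".
    if len(labels) < 2 or any(not label for label in labels):
        return False
    # Assumed TLD rule: alphabetic, two or more characters.
    return len(labels[-1]) >= 2 and labels[-1].isalpha()
```

A check like this only filters obvious typos. It says nothing about whether the domain or the mailbox exists, which is what the other two checks are for.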

Step 1: Why Regex is Not Enough

Before we write the real code, we need to address the elephant in the room. Why not just use Regex?

The official standard for email addresses (RFC 5322) is incredibly complex. Writing a Regex pattern that covers every valid edge case is nearly impossible, and writing a simple one leaves massive vulnerabilities.

More importantly, Regex is blind. It only looks at the shape of the string. It cannot tell you if the domain name has expired, if the mail server is offline, or if the user account was deleted yesterday. Relying solely on Regex is the primary cause of dirty databases.
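
To make the "blind" point concrete, here is a small sketch. The pattern `SIMPLE_EMAIL_RE` is a typical "good enough" expression chosen for illustration, not a recommended one, and the sample addresses are made up.

```python
import re

# A commonly seen simplified pattern, used here only for illustration.
SIMPLE_EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

candidates = [
    "fakeuser12345@google.com",            # well-formed, mailbox may not exist
    "someone@no-such-domain-xyz.invalid",  # .invalid is a reserved TLD
    "not-an-email",                        # no "@" at all
]

# Map each candidate to whether the pattern accepts it.
results = {c: bool(SIMPLE_EMAIL_RE.match(c)) for c in candidates}

for address, accepted in results.items():
    print(f"{address}: {'passes' if accepted else 'fails'} the regex")
```

The first two addresses pass even though the second can never receive mail, because the pattern only inspects the shape of the string.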

Step 2: Validating Syntax and DNS

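The first layer of our script needs to handle the basics: syntax and Domain Name System (DNS) checks. Mail is normally routed through a domain's Mail Exchanger (MX) records. If a domain has no MX records and no fallback address (A/AAAA) records, or if it publishes a "null MX", it cannot receive email, and we can immediately reject it.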

Instead of reinventing the wheel, we will use the highly reliable email-validator library for this step.

First, install the package: pip install email-validator

Here is how you implement the initial check (Python):

from email_validator import validate_email, EmailNotValidError

def basic_validation(email_address):
    try:
        # Check syntax, then query DNS to confirm the domain can accept mail
        valid = validate_email(email_address, check_deliverability=True)
        # Use the normalized form of the address from here on
        email = valid.normalized
        print(f"Success: {email} is valid and the domain can receive mail.")
        return True
    except EmailNotValidError as e:
        # The address is malformed or the domain cannot accept mail
        print(f"Error: {e}")
        return False

# Test the function
basic_validation("test@example.com")

This script is a massive upgrade from Regex. However, it still does not tell us if the specific user exists on that server. To find out, we need to knock on the door.

Step 3: The SMTP Handshake (Verifying Inbox Existence)

This is where the real engineering happens. We are going to use Python's built-in smtplib and the dnspython library to connect to the recipient's mail server.

First, install the DNS library: pip install dnspython

The concept is simple: We will ask DNS for the domain's MX records, which name the mail servers responsible for it. Then, we will connect to the preferred server and start an SMTP conversation. We will pretend we are about to send an email, and wait to see if the server accepts the recipient's address. Before sending any message data, we will quit the connection.
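
A domain often publishes several MX records, each with a preference value, and senders are expected to try the lowest value first. The sketch below shows that ordering on plain tuples so it runs without a network; `order_mx_hosts` and the sample hostnames are hypothetical, and the empty-list return for a lone "." target reflects the "null MX" convention from RFC 7505.

```python
def order_mx_hosts(mx_records):
    """Return MX hostnames sorted by preference, lowest value first.

    mx_records: iterable of (preference, hostname) tuples.
    """
    hosts = []
    for preference, hostname in sorted(mx_records):
        host = hostname.rstrip(".")
        if not host:
            # A lone "." target is a null MX: the domain accepts no mail.
            return []
        hosts.append(host)
    return hosts


sample = [(20, "backup.example.net."), (10, "mx1.example.net.")]
print(order_mx_hosts(sample))
```

With dnspython, the input tuples could be built from a lookup result as `[(r.preference, r.exchange.to_text()) for r in records]`.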

Here is the complete script (Python):

import dns.exception
import dns.resolver
import smtplib

def verify_email_existence(email):
    # Split on the last "@" so the domain is always the final part
    user, _, domain = email.rpartition('@')
    if not user or not domain:
        print("Invalid: Not a well-formed email address.")
        return False

    try:
        # Step 1: Get the MX records and pick the most preferred host
        # (the lowest preference value wins)
        records = dns.resolver.resolve(domain, 'MX')
        best = min(records, key=lambda r: r.preference)
        mx_record = best.exchange.to_text()

        # Step 2: Connect to the mail server with a 10-second timeout
        server = smtplib.SMTP(timeout=10)
        server.set_debuglevel(0) # Set to 1 to see the raw SMTP conversation
        server.connect(mx_record, 25)
        server.helo(server.local_hostname)

        # Step 3: Start the handshake, then quit before sending any data
        server.mail('verification@yourdomain.com')
        code, message = server.rcpt(email)
        server.quit()

        # Step 4: Interpret the response
        if code == 250:
            print(f"Valid: The mailbox {email} exists.")
            return True
        else:
            print(f"Invalid: The server rejected the address (Code: {code}).")
            return False

    except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN,
            dns.resolver.NoNameservers, dns.exception.Timeout):
        print("Invalid: Domain does not exist or has no mail servers.")
        return False
    except OSError as e:
        # Covers socket errors, timeouts, and smtplib.SMTPException
        print(f"Error: Could not connect to the mail server. {e}")
        return False

# Test the function
verify_email_existence("bill.gates@microsoft.com")

How It Works

When we issue the server.rcpt(email) command, the receiving server checks its internal directory. If the user exists, it responds with a 250 OK code. If the user was deleted or never existed, it typically responds with a 550 User Unknown code.
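
SMTP reply codes fall into classes by their first digit: 2xx means the command was accepted, 4xx signals a temporary failure, and 5xx a permanent one. A small sketch of that mapping follows; the function name `classify_rcpt_code` and its labels are illustrative assumptions.

```python
def classify_rcpt_code(code: int) -> str:
    """Map an SMTP reply code for RCPT TO onto a coarse verdict."""
    if 200 <= code < 300:
        return "accepted"   # e.g. 250: the server accepted the recipient
    if 400 <= code < 500:
        return "temporary"  # e.g. 451: try again later, verdict unknown
    if 500 <= code < 600:
        return "rejected"   # e.g. 550: mailbox unavailable or unknown
    return "unknown"


for code in (250, 451, 550):
    print(code, classify_rcpt_code(code))
```

Treating 4xx as "temporary" rather than "invalid" avoids discarding addresses that were only deferred.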

The Catch-All Problem (When SMTP Fails)

If you run the script above on a corporate email address, you might notice something strange. It may return a 250 OK code for any random string you type, like dsfhjsdfh@company.com.

This is because many B2B domains are configured as Catch-All (Accept-All) servers. They are set up to accept every incoming email, regardless of whether the specific user exists, often to route misdirected mail to a central admin inbox.

A simple Python script cannot bypass a Catch-All server. Sending real campaigns to Catch-All domains is highly risky, as it inflates your bounce rate silently.
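
A script cannot see past a Catch-All server, but it can often detect one: if a randomly generated local part is accepted, a 250 for any other address on that domain carries little information. The sketch below assumes a hypothetical callable `rcpt_probe(address)` that performs the RCPT TO step and returns the reply code, so the logic can be shown without a network connection.

```python
import secrets


def is_catch_all(domain, rcpt_probe):
    """Return True if the domain accepts a mailbox that almost surely does not exist.

    rcpt_probe: hypothetical callable taking an address and returning
    the SMTP reply code for RCPT TO.
    """
    # 24 random hex characters make an accidental real mailbox very unlikely.
    random_local = "probe-" + secrets.token_hex(12)
    return rcpt_probe(f"{random_local}@{domain}") == 250


# Stand-in probes for illustration: one server accepts everything, one rejects.
print(is_catch_all("company.com", lambda address: 250))
print(is_catch_all("company.com", lambda address: 550))
```

When the function returns True, the honest verdict for any address on that domain is "unknown" rather than "valid".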

Why You Shouldn't Run This Script in Production

The script we built is excellent for learning the mechanics of email infrastructure. However, deploying this script on your own server to clean a database of 100,000 emails is a terrible idea. Here is why:

  1. IP Rate Limiting: Mail servers like Google and Microsoft track connections. If your server IP suddenly connects to them 5,000 times in an hour without sending actual mail, they will flag you as a spammer conducting a directory harvest attack.
  2. Greylisting: Many servers intentionally delay unknown IPs (returning a 451 temporary error). Your script will hang or return false negatives.
  3. Blacklisting: Once your server's IP is blacklisted by Spamhaus, any legitimate emails your company tries to send will go straight to the spam folder.
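
For small experiments, the greylisting case in particular can be softened by retrying after a delay instead of treating a 4xx reply as a rejection. The sketch below is one possible approach under stated assumptions: `rcpt_probe` is a hypothetical callable returning the SMTP reply code, and the attempt count and delays are arbitrary example values. It does nothing about rate limiting or blacklisting.

```python
import time


def probe_with_retry(address, rcpt_probe, attempts=3, base_delay=60.0, sleep=time.sleep):
    """Retry a probe while the server keeps answering with a temporary 4xx code.

    rcpt_probe: hypothetical callable taking an address and returning
    the SMTP reply code for RCPT TO.
    """
    code = None
    for attempt in range(attempts):
        code = rcpt_probe(address)
        if not 400 <= code < 500:
            return code  # definitive answer (2xx or 5xx), stop retrying
        if attempt < attempts - 1:
            # Exponential backoff: base_delay, then 2x, then 4x, and so on.
            sleep(base_delay * 2 ** attempt)
    return code  # still temporary after all attempts


# Simulated server that greylists twice before accepting.
replies = iter([451, 451, 250])
waited = []
final_code = probe_with_retry("user@example.com", lambda a: next(replies), sleep=waited.append)
print(final_code, waited)
```

The `sleep` parameter is injected so the delays can be recorded in a test instead of actually waiting.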

The Scalable Solution: Using an Email Validation API

In a production environment, you need to offload the risk, handle Catch-All logic, and process requests asynchronously. This requires a distributed infrastructure with rotating residential IPs.

The most efficient way to validate emails in Python at scale is by utilizing a dedicated API like EmailAwesome.

Here is how you can implement this kind of validation in just a few lines using the popular third-party requests library (pip install requests):

import requests

def validate_with_emailawesome(email_address):
    api_key = "YOUR_EMAILAWESOME_API_KEY"
    endpoint = "https://api.emailawesome.com/v1/verify"

    # Pass values via params so requests URL-encodes characters such as "+"
    response = requests.get(
        endpoint,
        params={"email": email_address, "key": api_key},
        timeout=10,  # Avoid hanging forever if the API is unreachable
    )

    if response.status_code == 200:
        data = response.json()

        if data['status'] == 'valid':
            print("Inbox is real and safe to send.")
        elif data['status'] == 'catch_all':
            print("Proceed with caution. Domain is a Catch-All.")
        else:
            print(f"Do not send. Status: {data['status']}")

    else:
        print(f"API request failed (HTTP {response.status_code}).")

validate_with_emailawesome("test@example.com")

By leveraging an API, you bypass rate limits, instantly identify disposable burner emails, and get accurate diagnostics on Catch-All servers, all without putting your own infrastructure at risk.

Final Action Plan

Validating emails is a non-negotiable step in modern software development. Bad data costs money and destroys sender reputation.

  • If you are building a small internal script, use the email-validator library combined with the SMTP handshake code to understand the mechanics.
  • If you are validating user signups in real-time or cleaning a massive CRM database, offload the heavy lifting to a specialized service.

Protect your application's data integrity today.

Ready to implement scalable email validation? Get your API key and 1,000 free requests with EmailAwesome.

Written by
Charlie
Tech Team
