Email Delivery Optimization
Strategic Problem
The VP of Engineering at Act-On Software presented a challenging deliverability problem: a client was legally required to contact all 2 million recipients on its list, regardless of email address quality. The client's previous manual approach was labor-intensive and did not scale. The question posed: could ML/AI optimize email delivery so that the send reached the entire list while maintaining sender reputation?
The core tension: email providers lower the reputation of senders with high bounce rates and poor engagement, and a sufficiently damaged reputation ultimately leads to blocking. Yet the business requirement demanded sending to every address, including those with questionable validity.
Approach Evaluation
I began by evaluating whether this problem could be solved with reinforcement learning. The characteristics initially seemed promising: sequential decision-making about batch composition with feedback from deliverability metrics. However, fundamental constraints ruled out this approach:
- Email providers deliberately obscure their spam detection algorithms and regularly change them to prevent evasion, making environment modeling impossible
- No access to client data until immediately before the send, eliminating the possibility of training with historical examples
- The adversarial nature of the domain (providers adapt to sender tactics) meant any learned model would likely be obsolete by deployment
This analysis pointed toward a rule-based system that could incorporate domain expertise and handle uncertainty without requiring a training dataset or environment model.
Methodology Selection
I worked with the deliverability team to understand their manual process for similar sends. They described a set of heuristics they used to balance email address quality across batches, but these rules were imprecise and required constant manual adjustment during sends.
The problem characteristics matched applications where fuzzy logic has proven effective: a control problem with uncertain inputs, expert knowledge expressible as rules, and the need to handle gradations rather than binary classifications. Email addresses weren't simply "valid" or "invalid"; they existed on a spectrum of confidence levels. Similarly, sender reputation wasn't binary but a continuous measure that degraded gradually.
Fuzzy logic could formalize the deliverability team's expertise into membership functions and rule sets that would automate their decision-making process while handling the inherent uncertainty in email address quality indicators.
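To illustrate what "membership functions" means here, the following is a minimal sketch in plain Python, not the project code. It assumes a hypothetical address confidence score between 0 and 1 and three invented fuzzy sets (low, medium, high); the breakpoints are placeholders, not the values the deliverability team used.

```python
def trimf(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b (a <= b <= c)."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets over a 0..1 address confidence score.
# The breakpoints are illustrative assumptions only.
QUALITY_SETS = {
    "low": (0.0, 0.0, 0.5),
    "medium": (0.2, 0.5, 0.8),
    "high": (0.5, 1.0, 1.0),
}


def fuzzify_quality(score):
    """Return the degree of membership of a score in each quality set."""
    return {label: trimf(score, *abc) for label, abc in QUALITY_SETS.items()}


if __name__ == "__main__":
    # A score of 0.65 belongs partly to "medium" and partly to "high".
    print(fuzzify_quality(0.65))
```

The point of the sketch is that a single address can belong to more than one tier at once, to different degrees, which is what lets rules blend smoothly instead of switching at hard cutoffs.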
Validation Through Simulation
Without production data available for testing, I designed a simulator to validate the fuzzy logic approach. The simulator generated synthetic data representing:
- Email addresses with varying quality levels based on expected NeverBounce flag distributions
- Multiple sending domains with different initial reputation scores
- Bounce rates that varied based on both email quality and domain reputation
- Spamhaus reputation changes that responded to bounce patterns over time
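The simulator's structure can be sketched roughly as follows. This is an illustrative reconstruction using only the standard library, not the original simulator: the flag names, flag mix, base bounce probabilities, and the reputation update rule are all assumptions chosen to show how the four components listed above interact.

```python
import random

# Hypothetical flag mix and base bounce probabilities (illustrative only).
FLAG_WEIGHTS = {"valid": 0.60, "catchall": 0.20, "unknown": 0.15, "invalid": 0.05}
BASE_BOUNCE = {"valid": 0.01, "catchall": 0.10, "unknown": 0.25, "invalid": 0.90}


def make_addresses(n, rng):
    """Draw n synthetic addresses, each represented only by its quality flag."""
    flags = list(FLAG_WEIGHTS)
    weights = [FLAG_WEIGHTS[f] for f in flags]
    return rng.choices(flags, weights=weights, k=n)


def bounce_probability(flag, reputation):
    """Assumed model: a lower domain reputation inflates the base bounce rate."""
    return min(1.0, BASE_BOUNCE[flag] * (1.0 + (1.0 - reputation)))


def send_batch(batch, reputation, rng):
    """Simulate one batch and return its observed bounce rate."""
    bounces = sum(rng.random() < bounce_probability(f, reputation) for f in batch)
    return bounces / len(batch)


def update_reputation(reputation, bounce_rate, threshold=0.05, step=0.5):
    """Assumed dynamics: reputation drops above the threshold, recovers slowly below it."""
    if bounce_rate > threshold:
        delta = -step * (bounce_rate - threshold)
    else:
        delta = 0.01
    return max(0.0, min(1.0, reputation + delta))


def simulate(n_addresses=10_000, batch_size=500, reputation=0.9, seed=0):
    """Send all addresses in fixed-size batches, tracking bounce rate and reputation."""
    rng = random.Random(seed)
    addresses = make_addresses(n_addresses, rng)
    history = []
    for start in range(0, len(addresses), batch_size):
        batch = addresses[start:start + batch_size]
        rate = send_batch(batch, reputation, rng)
        reputation = update_reputation(reputation, rate)
        history.append((rate, reputation))
    return history


if __name__ == "__main__":
    for rate, rep in simulate()[:5]:
        print(f"bounce rate {rate:.3f}, reputation {rep:.3f}")
```

Skewing FLAG_WEIGHTS toward the lower-quality flags is one way to reproduce the kind of biased-input stress test described in this section.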
I tracked all simulation experiments in MLflow, documenting different fuzzy rule configurations, membership function parameters, and the resulting delivery and reputation metrics. This experimental tracking allowed systematic comparison of different approaches and provided baseline performance expectations.
The simulation results showed that the fuzzy logic system performed consistently even when the synthetic data was biased toward lower-quality addresses. The rule-based approach maintained sender reputation within acceptable bounds while processing the full recipient list, which supported the soundness of the methodology under the simulator's assumptions.
System Implementation
The fuzzy logic system classified email addresses into quality tiers using membership functions based on NeverBounce validation flags. It classified sending domains by current reputation scores from Google Postmaster Tools and Spamhaus. Rule sets determined optimal batch compositions by calculating acceptable ratios of lower-quality addresses that could be mixed with high-confidence addresses without triggering reputation penalties.
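As a rough illustration of how a rule set can turn a domain's reputation into an acceptable share of lower-quality addresses, here is a minimal sketch in plain Python. It is not the production rule base: the reputation scale (0 to 1), the breakpoints, and the per-rule shares are hypothetical, and the weighted-average (zero-order Sugeno-style) inference is a simplification chosen to keep the example short.

```python
def ramp_up(x, lo, hi):
    """Linear ramp from 0 at lo to 1 at hi, clamped to [0, 1]."""
    return max(0.0, min(1.0, (x - lo) / (hi - lo)))


def fuzzify_reputation(reputation):
    """Membership of a 0..1 reputation score in three assumed fuzzy sets."""
    low = 1.0 - ramp_up(reputation, 0.3, 0.6)
    high = ramp_up(reputation, 0.6, 0.9)
    medium = max(0.0, 1.0 - low - high)
    return {"low": low, "medium": medium, "high": high}


# Hypothetical rule consequents:
#   IF reputation is low    THEN risky share is 0.00
#   IF reputation is medium THEN risky share is 0.10
#   IF reputation is high   THEN risky share is 0.25
RISKY_SHARE = {"low": 0.00, "medium": 0.10, "high": 0.25}


def allowed_risky_share(reputation):
    """Weighted average of rule outputs, weighted by each rule's firing strength."""
    mu = fuzzify_reputation(reputation)
    total = sum(mu.values())
    return sum(mu[k] * RISKY_SHARE[k] for k in mu) / total


def compose_batch(batch_size, reputation):
    """Return (high_confidence_count, lower_quality_count) for one batch."""
    risky = round(batch_size * allowed_risky_share(reputation))
    return batch_size - risky, risky


if __name__ == "__main__":
    for rep in (0.2, 0.5, 0.75, 1.0):
        print(rep, compose_batch(1000, rep))
```

Because the memberships overlap, the allowed share changes gradually as reputation moves, so a small reputation dip shrinks the risky portion of the next batch instead of cutting it off abruptly.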
The system processed 540,000 emails daily, making continuous decisions about batch composition, send timing, and when to temporarily halt sends to specific domains showing warning signs. After completing implementation and validation, I handed the system to the engineering team for integration into the production mail infrastructure.
Data Quality Issues
Approximately one month after deployment, data quality issues emerged during the email sends. I was brought back to diagnose the problems.
Analysis of bounce data revealed that recipients flagged by NeverBounce as high quality were bouncing at unexpectedly high rates. I examined the correlation between NeverBounce confidence flags and actual bounce outcomes, finding that the validation service's quality indicators did not reliably predict deliverability for this client's data.
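The core of that check is a comparison of observed bounce rates per validation flag against the rates each flag was expected to produce. The sketch below shows the idea in plain Python on made-up records; the flag names, expected rates, and the 2x tolerance are assumptions for illustration, and the actual analysis may have been structured differently.

```python
from collections import defaultdict


def bounce_rate_by_flag(records):
    """records: iterable of (flag, bounced) pairs. Returns {flag: observed bounce rate}."""
    sent = defaultdict(int)
    bounced = defaultdict(int)
    for flag, did_bounce in records:
        sent[flag] += 1
        bounced[flag] += int(did_bounce)
    return {flag: bounced[flag] / sent[flag] for flag in sent}


def flags_exceeding(rates, expected, tolerance=2.0):
    """Flags whose observed rate is more than `tolerance` times the expected rate."""
    return sorted(
        flag
        for flag, rate in rates.items()
        if flag in expected and rate > tolerance * expected[flag]
    )


if __name__ == "__main__":
    # Synthetic example data, not the client's results.
    records = (
        [("valid", True)] * 12 + [("valid", False)] * 88
        + [("unknown", True)] * 20 + [("unknown", False)] * 80
    )
    expected = {"valid": 0.02, "unknown": 0.25}  # hypothetical expectations
    rates = bounce_rate_by_flag(records)
    print(rates)
    print(flags_exceeding(rates, expected))
```

In this synthetic example the "valid" flag bounces far above its expected rate while "unknown" does not, which is the pattern that would point to the quality indicator rather than the batching logic.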
The MLflow experiment tracking proved valuable during this diagnostic work. The recorded simulation results showed that the fuzzy logic methodology was sound: when given quality indicators that accurately reflected email validity, the system maintained reputation within acceptable bounds. The issue was data quality, not model design.
Together with the deliverability team, I contacted the client, who confirmed that their email list data quality was not as high as initially believed. The root cause: flaws in the client's address validation process that preceded the NeverBounce analysis.
Findings and Recommendations
I provided an analysis documenting that the fuzzy logic system performed correctly given the quality indicators it received, but that those indicators did not accurately represent the underlying email address validity. The recommendations to the client:
- Audit their email collection and validation processes to identify where quality indicators diverged from actual deliverability
- Implement validation using multiple services to cross-check quality flags
- Establish ground truth through test sends to sample addresses before large-scale campaigns
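For the last recommendation, one way to turn a test send into a usable ground-truth estimate is to draw a random sample and put a confidence interval around the observed bounce rate. The sketch below uses the Wilson score interval; the sample size, the 95% level (z = 1.96), and the helper names are illustrative assumptions, not part of the original recommendations.

```python
import math
import random


def draw_sample(addresses, k, seed=0):
    """Pick k addresses uniformly at random, without replacement, for a test send."""
    return random.Random(seed).sample(list(addresses), k)


def wilson_interval(bounces, n, z=1.96):
    """Wilson score interval for a bounce rate observed as `bounces` out of `n` sends."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = bounces / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half, center + half


if __name__ == "__main__":
    # Hypothetical test send: 30 bounces out of 1,000 sampled addresses.
    low, high = wilson_interval(30, 1000)
    print(f"bounce rate between {low:.3f} and {high:.3f}")
```

If the upper end of the interval for a supposedly high-quality tier already exceeds what the sending domains can tolerate, that is a signal to re-validate the list before committing to the full campaign.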
The deliverability team reverted to manual sending processes while working with the client to address these data quality issues. The client needed to re-validate their 2 million address list using corrected processes.
Development Environment
- Python
- JupyterLab
- NumPy
- Pandas
- scikit-fuzzy
- Matplotlib
- MLflow
- Snowflake
- Snowflake Connector for Python
- Git
- Bitbucket