Why your Let's Encrypt auto-renewal silently failed (and how to catch it)
By Nick Phillips, Founder
Let's Encrypt's auto-renewal works beautifully right up until the moment it doesn't. The unhelpful part is the failure mode: certbot renew exits 0, the log says "no certificates due for renewal," everything looks fine — and the cert still expires three weeks later.
This post catalogs the actual reasons we and other people have seen this happen, ordered from most common to least. If you're reading because your cert just expired, scan the list and find yours. If you're reading because you want to prevent the future version of this, the last section is for you.
1. The cron is running, but as the wrong user
Symptom: sudo certbot renew works from your shell. The systemd timer (or cron) runs, exits 0, and renews nothing.
Cause: the timer is running as a user that can't read the existing cert files — usually because someone set up certbot manually as root the first time, then later "fixed" the timer to run as certbot or www-data. The renewal logic looks at the existing /etc/letsencrypt/live/ symlinks, can't read them, and decides there's nothing to renew.
Check: systemctl cat certbot.service and look for a User= line (the timer doesn't carry a user; the service it triggers does), or crontab -l and ls /etc/cron.d/ if it's old-school. Then sudo -u <that-user> certbot renew --dry-run to reproduce.
Fix: run the timer as the user that owns /etc/letsencrypt/, usually root. If that's uncomfortable, change ownership of the letsencrypt directory tree to a dedicated user and make sure the webserver reload step is what handles privilege escalation, not the renewal itself.
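To see the permission problem directly, list what the timer's user cannot read. This is a minimal sketch, assuming bash and GNU find; the function name unreadable_paths is made up for this example.

```shell
#!/bin/bash
# List every path under a directory that the current user cannot read.
# No output means the whole tree is readable. Requires GNU find (-readable).
unreadable_paths() {
  local dir="$1"
  # If the top-level directory is missing or closed to us, report it and stop.
  if [ ! -r "$dir" ] || [ ! -x "$dir" ]; then
    echo "$dir"
    return 1
  fi
  # -readable asks whether this user may read the entry;
  # errors from descending into closed directories are discarded.
  find "$dir" ! -readable -print 2>/dev/null
  return 0
}
```

Run it as the user the timer uses, for example sudo -u www-data bash -c 'source ./unreadable.sh; unreadable_paths /etc/letsencrypt' (the file name is hypothetical). On a default install, live/ and archive/ are normally readable only by root, so any other user should see them listed.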
2. The renewal succeeded — but the webserver wasn't reloaded
Symptom: certbot renews the cert. The files on disk are new. openssl s_client -connect yoursite.com:443 still returns the old cert.
Cause: the renewal hook didn't run, or ran and silently failed. nginx/apache holds the old cert in memory until you reload, and reload requires the daemon to actually be running with valid config. If your nginx config has a syntax error from an unrelated change, nginx -s reload will fail, the renew hook will exit non-zero, and certbot will move on without complaining loudly.
Check: sudo certbot renew --dry-run --post-hook "nginx -t" — if the post-hook fails, you'll see it.
Fix: use --deploy-hook (runs once per certificate, and only when that certificate was actually renewed) rather than --post-hook (runs after any renewal attempt, whether or not it succeeded). Have the hook report a failed reload somewhere you'll see:
certbot renew --deploy-hook "systemctl reload nginx || echo 'RELOAD FAILED' | mail -s 'certbot' you@example.com"
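If a one-liner feels too fragile, the same idea can live in a small hook script that tests the config before reloading and writes each exit status to a log. This is a sketch, not certbot's own mechanism: TEST_CMD, RELOAD_CMD, HOOK_LOG, and the log path are names invented here, and the defaults assume nginx under systemd. The only certbot-provided piece is RENEWED_LINEAGE, which certbot sets in the environment of deploy hooks.

```shell
#!/bin/bash
# Deploy-hook sketch: test the config, reload, and record each exit status.
# TEST_CMD, RELOAD_CMD and HOOK_LOG are knobs invented for this example,
# not certbot settings. Defaults assume nginx under systemd.
TEST_CMD="${TEST_CMD:-nginx -t}"
RELOAD_CMD="${RELOAD_CMD:-systemctl reload nginx}"
HOOK_LOG="${HOOK_LOG:-/var/log/certbot-deploy-hook.log}"

log_step() {
  # One line per step: timestamp, certificate lineage, step name, exit status.
  printf '%s lineage=%s step=%s exit=%s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "${RENEWED_LINEAGE:-unknown}" "$1" "$2" >> "$HOOK_LOG"
}

reload_webserver() {
  local rc
  # Refuse to reload on a config that does not parse.
  $TEST_CMD >/dev/null 2>&1
  rc=$?
  log_step configtest "$rc"
  [ "$rc" -eq 0 ] || return "$rc"
  $RELOAD_CMD >/dev/null 2>&1
  rc=$?
  log_step reload "$rc"
  return "$rc"
}

# certbot sets RENEWED_LINEAGE when it runs a deploy hook; do nothing otherwise.
if [ -n "${RENEWED_LINEAGE:-}" ]; then
  reload_webserver
fi
```

Saved as an executable file in /etc/letsencrypt/renewal-hooks/deploy/, certbot runs it after each successful renewal. Any log line with a non-zero exit= value is the thing to alert on.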
3. DNS-01 broke because your DNS provider changed their API
Symptom: HTTP-01 challenges work, but you're using DNS-01 (for wildcards). Renewal logs say "could not determine authoritative DNS server" or "challenge failed."
Cause: certbot's DNS plugins authenticate against the provider's API, and the providers change those APIs. Cloudflare deprecated their global API key in favor of scoped tokens; Route 53 changed permission models; smaller providers do this regularly.
Check: certbot renew --dry-run will surface the actual error. Also check whether your DNS plugin is up to date — pip show certbot-dns-cloudflare (or apt equivalent).
Fix: update the plugin, regenerate API credentials with the current scopes the plugin needs, and re-run the dry run until it passes.
4. The webroot moved
Symptom: HTTP-01 renewal fails with "challenge did not pass: ... 404 Not Found."
Cause: you set up renewal pointing at /var/www/html. Six months later you deployed a new static site builder that puts files under /var/www/site/public. The webroot in /etc/letsencrypt/renewal/yoursite.com.conf still says the old path. certbot writes the challenge file there, your webserver looks somewhere else, the ACME server fetches a 404, the challenge fails.
Check: cat the .conf file under /etc/letsencrypt/renewal/ and confirm the webroot_path matches reality.
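Fix: edit webroot_path in the renewal config (it's plain text; update the matching entry under [[webroot_map]] too), or re-run certbot certonly --webroot --webroot-path /new/path -d yoursite.com to overwrite it. Follow up with certbot renew --dry-run to confirm the challenge now passes.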
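The mismatch can also be tested end to end: write a probe file where certbot would, then fetch it the way the ACME server does. This is a rough sketch, assuming the renewal config contains a line of the form "webroot_path = /some/path," and that curl is installed; the function names are hypothetical.

```shell
#!/bin/bash
# Print the webroot_path recorded in a certbot renewal config.
# Assumes a line like "webroot_path = /var/www/html," (trailing comma optional).
renewal_webroot() {
  local conf="$1"
  sed -n 's/^webroot_path[[:space:]]*=[[:space:]]*//p' "$conf" \
    | sed 's/,[[:space:]]*$//' | head -n 1
}

# Return 0 only if the recorded webroot exists as a directory.
webroot_exists() {
  local path
  path=$(renewal_webroot "$1")
  [ -n "$path" ] && [ -d "$path" ]
}

# Write a probe file into the recorded webroot and fetch it over HTTP.
# A failure here means the webserver is serving some other directory.
probe_webroot() {
  local conf="$1" host="$2" root token rc
  root=$(renewal_webroot "$conf")
  [ -n "$root" ] || return 1
  token="probe-$$"
  mkdir -p "$root/.well-known/acme-challenge" || return 1
  echo ok > "$root/.well-known/acme-challenge/$token" || return 1
  curl -fsS "http://$host/.well-known/acme-challenge/$token" >/dev/null
  rc=$?
  rm -f "$root/.well-known/acme-challenge/$token"
  return "$rc"
}
```

For example, probe_webroot /etc/letsencrypt/renewal/yoursite.com.conf yoursite.com fails with curl's 404 error when the config still points at the old directory.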
5. Rate limits from a previous failed attempt
Symptom: renewal fails with "too many certificates already issued" or "too many failed authorizations recently."
Cause: Let's Encrypt has generous but real rate limits. The two that bite people: 5 failed validations per account per hostname per hour, and 50 certificates per registered domain per week. If a retry loop has already burned through five failures in the past hour, you've used up your error budget, and even a fixed configuration won't go through until the limit frees up.
Check: the error message includes the relevant limit.
Fix: wait. Don't escalate by running certbot in a loop trying to "fix" it — that just deepens the rate-limit hole. Test against the staging environment (--server https://acme-staging-v02.api.letsencrypt.org/directory) while you wait the cooldown out.
6. The renewal cron got disabled by a system update
Symptom: nothing has run in months. last shows the server has been up. journalctl -u certbot.timer shows nothing recent.
Cause: a distro upgrade swapped systemd units and the new one is disabled by default, or the cron file moved from /etc/cron.d/certbot to /etc/cron.d/python3-certbot, or snap-certbot replaced apt-certbot and quietly disabled the old timer. Distro packagers have a creative streak here.
Check: systemctl list-timers --all | grep -i cert (without --all, inactive timers are hidden) and ls /etc/cron.d/ | grep -i cert.
Fix: re-enable the timer (systemctl enable --now certbot.timer, or the equivalent for your cron or snap setup). Then go to the last section and make sure you'd have caught this from the outside.
7. The renewal worked, but to a different cert chain
Symptom: the renewed cert is valid but clients (especially old Android, IoT devices, embedded Java) suddenly can't verify the chain.
Cause: in September 2021 the DST Root CA X3 root expired, which broke a number of older clients that relied on it. In 2024 Let's Encrypt dropped the IdenTrust cross-sign entirely, so the chain now ends at ISRG Root X1. Older clients that don't trust the ISRG root start failing.
Check: curl -v https://yoursite.com 2>&1 | grep -i issuer from a recent client; compare to what your IoT/legacy client sees.
Fix: pin the alternative chain at certbot config time with --preferred-chain "ISRG Root X1" (or whichever your clients trust). Or, more honestly, update your clients — Let's Encrypt isn't going to keep cross-signing forever just for the long tail.
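To see exactly which chain a server presents, it helps to list the subject and issuer of every certificate in it, not just the leaf. This is a small sketch using standard openssl subcommands; the function names and the output file path are invented for the example.

```shell
#!/bin/bash
# Print the subject and issuer of every certificate in a PEM bundle.
chain_issuers() {
  local bundle="$1"
  # crl2pkcs7 packs all certificates in the file into one PKCS#7 structure,
  # which pkcs7 -print_certs then lists one by one.
  openssl crl2pkcs7 -nocrl -certfile "$bundle" \
    | openssl pkcs7 -print_certs -noout \
    | grep -E '^(subject|issuer)='
}

# Save the chain a server actually presents (leaf plus intermediates) to a file.
save_served_chain() {
  local host="$1" out="$2"
  openssl s_client -showcerts -servername "$host" -connect "$host:443" </dev/null 2>/dev/null \
    | sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p' > "$out"
  # An empty file means the handshake failed.
  [ -s "$out" ]
}
```

Running save_served_chain yoursite.com /tmp/chain.pem && chain_issuers /tmp/chain.pem prints the chain from the leaf upward. The last issuer= line typically names the root the client has to trust.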
How to actually catch this next time
The pattern in all of the above: certbot exits 0 even when the outcome isn't what you wanted. Trusting the exit code is what gets people. The fix is to verify the outcome — read the cert that's actually being served from outside the box — not the process.
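A short check script does this well enough:
#!/bin/bash
# Usage: check-cert.sh <hostname>  (needs openssl and GNU date)
host="$1"
# Read notAfter from the certificate actually served on port 443.
expires=$(echo | openssl s_client -servername "$host" -connect "$host:443" 2>/dev/null \
  | openssl x509 -noout -enddate 2>/dev/null | cut -d= -f2)
# An empty result means the handshake or parse failed; say so instead of guessing.
[ -n "$expires" ] || { echo "$host: could not read certificate" >&2; exit 2; }
expiry_epoch=$(date -d "$expires" +%s)
days_left=$(( (expiry_epoch - $(date +%s)) / 86400 ))
echo "$host: $days_left days"
if [ "$days_left" -lt 20 ]; then
  echo "WARN: cert expiring soon" | mail -s "cert warning" you@example.com
fi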
Cron it daily. Done — except now you have a second cron whose failure also exits 0 silently. (See where this is going.)
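One way to make that second cron a little less silent is to give the check distinct exit statuses, including one for "the check itself broke." This sketch borrows the common monitoring-plugin convention (0 ok, 1 warning, 2 critical, 3 unknown). The function name and the 30-day and 7-day thresholds are illustrative choices, not anything certbot or Let's Encrypt prescribes.

```shell
#!/bin/bash
# Classify a certificate by time remaining and return a matching exit status:
# 0 ok, 1 warning, 2 critical or expired, 3 unknown (the check itself failed).
# Arguments: expiry time and current time, both in epoch seconds.
cert_status() {
  local expiry_epoch="$1" now_epoch="$2" secs
  # A missing or non-numeric expiry means we never read a certificate.
  case "$expiry_epoch" in
    ''|*[!0-9]*) echo unknown; return 3 ;;
  esac
  secs=$(( expiry_epoch - now_epoch ))
  if [ "$secs" -le 0 ]; then
    echo expired; return 2
  elif [ "$secs" -lt $(( 7 * 86400 )) ]; then
    echo critical; return 2
  elif [ "$secs" -lt $(( 30 * 86400 )) ]; then
    echo warning; return 1
  fi
  echo ok
  return 0
}
```

Fed with the expiry_epoch value from the check script and $(date +%s), a caller can alert on anything non-zero and treat 3 as "go look at the checker," which is the case a plain days-left number hides.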
The honest answer is that you want the check running somewhere other than the box that's also responsible for serving the cert. That's the entire point of external monitoring: if the host is offline, or its cron is broken, or its mail relay is down, something else still notices. We built Otterwatch for this. So did half a dozen others — the comparison post covers the differences.
If you want to verify a single cert right now, our free checker reads notAfter straight off the live handshake and shows you the date in two seconds. If you want a daily check that emails you 30 days, 7 days, and on-expiry, sign up and it'll take about a minute to wire your first domain in.
Either way: don't trust certbot's exit code. Trust what the cert actually looks like from the outside.