From ESTS Cookie Replay to Inbox Persistence

It is 2026, and inbox persistence is still one of the core pillars of the Microsoft 365 attack playbook. Once an attacker lands in a mailbox, they don’t rush to chase noisy privilege escalation. They go for something quieter and harder to shake off: a long‑term seat inside your mail flow.

In this post, I’ll walk through how a single stolen ESTS browser session can be replayed into full inbox control, without ever touching a password, an MFA prompt, or the official Microsoft 365 login page. Using nothing more than valid ESTSAUTH cookies and the Outlook REST API, an attacker can silently convert a one‑time session into long‑term inbox persistence, data exfiltration, and a reliable early‑warning channel for future opportunities.

Nowadays, defenders are trained to watch for credential theft. Wrong passwords, suspicious MFA pushes, and sign-ins from unexpected locations are all reasonable areas to focus on. Still, they miss a technique that sidesteps the entire authentication flow without touching a single credential.

ESTSAUTH and ESTSAUTHPERSISTENT are the session cookies that Microsoft issues after a user completes a full Entra ID login, including MFA. They represent a proven authentication state that Microsoft’s SSO engine will honor without asking for anything again. If an attacker gets hold of them through an AiTM proxy, a compromised endpoint, or direct browser access, they inherit that proven state completely. No password needed, no MFA challenge fired, no step-up prompt triggered.

This post walks through a post-compromise flow in which an attacker plants a persistent inbox-forwarding rule using only curl, standard OAuth endpoints, and behavior that already exists within the Microsoft 365 authentication stack. What makes the technique interesting is not a single exotic exploit, but the way three behaviors connect into a chain that most defenders are unlikely to treat as one suspicious sequence.

The first behavior is a non-interactive SSO redirect that Entra records as low severity, providing the attacker with a quiet authentication path that does not immediately appear to be a high-risk mailbox operation.

The second behavior is using a first-party FOCI client to pick up Mail.ReadWrite through the Outlook REST audience rather than Microsoft Graph, which means the permission path does not always match the detection logic many teams have built around Graph activity.

The third behavior is a deprecated API endpoint that Microsoft has not fully removed, leaving behind an older mailbox control path that can still be useful after a session has been captured.

When these behaviors are chained together, an attacker can create a durable forwarding rule against a compromised mailbox while the audit trail looks far less dramatic than the outcome, often resembling a routine background Office sign-in rather than an explicit mailbox persistence action.


Why the Common Playbook Still Works

Inbox rule abuse is not new. Security vendors have written about it, Microsoft has built detections for it, and most MDR playbooks document it. Yet it keeps appearing in incident reports because the defensive coverage is narrower than most teams assume, and attackers know exactly where the gaps are.

The most detected pattern is forwarding to an external address. Creating a rule that copies every incoming message to an outside domain is the version that gets caught most often because external forwarding policies and DLP rules in modern tenants are frequently configured to flag or block it. Some tenants enforce a transport rule that strips the forwarding action entirely before the message leaves the organization. Attackers who encounter this quickly learn to abandon the external forward and pivot to something quieter.

Moving emails to a folder the victim never checks is the version that flies under the radar the longest. A rule that takes any message matching a keyword pattern and drops it into a subfolder buried inside Deleted Items, or into a folder named with a non-printing character, passes every external forwarding check because nothing ever leaves the tenant. The attacker reads the folder contents using the same token, on their own schedule, without a single message crossing an organizational boundary. DLP never fires. Transport rules never trigger. The only evidence is the rule itself sitting in the mailbox and a MailItemsAccessed event in the audit log if E5 licensing is present and audit logging is enabled at the right verbosity.

Marking messages as read before the victim sees them is another layer that attackers add on top of a move rule. The combination means messages about password resets, MFA changes, security alerts, and IT notifications arrive in the inbox but already appear read, reducing the chance the victim notices that something was acted on. This is particularly effective during account takeover when the attacker is making changes elsewhere in the tenant and needs to suppress the notification emails that those changes generate.

Deleting specific messages entirely is the most aggressive version and the hardest to recover from. A rule that permanently deletes anything from a specific sender or containing a specific subject line can be used to suppress wire transfer confirmation emails, security alert notifications, or responses from IT helpdesk tickets that the attacker opened to buy time. Permanent deletion without a litigation hold in place means those messages are gone before any investigation starts.

Redirect rather than forward is a variation that fewer detections cover. The redirect action moves the message out of the victim’s inbox entirely and delivers it only to the attacker’s address, meaning the victim never receives the original. Forward leaves a copy in the inbox. Redirect does not. Most detection queries focus on forwardTo actions and miss redirectTo entirely because the documentation treats them as equivalent, even though operationally, they differ significantly in impact.
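The gap between forwardTo and redirectTo coverage is easy to close on the detection side. The sketch below is a minimal illustration, assuming rules have already been exported as dictionaries in the shape of the Microsoft Graph messageRule resource; the function name, the sample rules, and the contoso.com domain are hypothetical. It treats every outbound action the same way instead of checking forwardTo alone.

```python
from __future__ import annotations

from typing import Any

# Action names follow the Microsoft Graph messageRuleActions shape (assumed input format).
OUTBOUND_ACTIONS = ("forwardTo", "forwardAsAttachmentTo", "redirectTo")


def external_recipients(rule: dict[str, Any], internal_domains: set[str]) -> dict[str, list[str]]:
    """Return {action_name: [external addresses]} for every outbound action on a rule."""
    findings: dict[str, list[str]] = {}
    actions = rule.get("actions") or {}
    for action in OUTBOUND_ACTIONS:
        for recipient in actions.get(action) or []:
            address = (recipient.get("emailAddress") or {}).get("address", "")
            domain = address.rpartition("@")[2].lower()
            if domain and domain not in internal_domains:
                findings.setdefault(action, []).append(address)
    return findings


# Hypothetical sample data: one harmless internal forward, one external redirect.
rules = [
    {"displayName": "Newsletters",
     "actions": {"forwardTo": [{"emailAddress": {"address": "me@contoso.com"}}]}},
    {"displayName": "Sync",
     "actions": {"redirectTo": [{"emailAddress": {"address": "drop@example.net"}}]}},
]

for rule in rules:
    hits = external_recipients(rule, {"contoso.com"})
    if hits:
        print(rule["displayName"], hits)
```

Only the second rule is reported, and it is reported under redirectTo, which is exactly the action a forwardTo-only query would skip.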


The Classic Inbox Rule Attack Scenarios

Forward Mails to External Address

The original and most straightforward technique. A rule with no conditions and a forwardTo action pointing to an attacker-controlled domain copies every incoming message in real time. Effective against executives and finance roles where the intelligence value is high enough to accept the detection risk. Most tenants now have external forwarding policies that block or flag this, which is why it has largely been replaced by quieter variants in sophisticated attacks.

Redirect All Mail to External Address

Functionally similar to forwarding, but the victim never receives the original message. Where forwarding leaves a copy in the inbox, redirect delivers exclusively to the attacker. Used in account takeovers where the attacker needs to fully control communications coming into the account, particularly useful for intercepting password reset emails and MFA enrollment links before the legitimate owner sees them.

Move to Hidden or Obscure Folder

A rule that moves messages matching a condition into a subfolder buried deep in the mailbox hierarchy, often inside Deleted Items or a folder named with a space or special character. Nothing leaves the tenant, so external forwarding controls never fire. The attacker reads the folder directly through API access. This technique can run undetected for weeks because the victim’s inbox appears normal, and there are no mail flow anomalies.
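From the defender's side, the two hiding spots described here can be checked mechanically. The following is a rough sketch, assuming a folder listing has been exported as dictionaries with id, displayName, and parentFolderId keys (the shape of the Graph mailFolder resource); the helper names and sample folder IDs are hypothetical.

```python
from __future__ import annotations

from typing import Any


def is_invisible_name(name: str) -> bool:
    """True when the name contains no visible character (empty, spaces, zero-width, etc.)."""
    return not any(ch.isprintable() and not ch.isspace() for ch in name)


def is_under(folder_id: str, ancestor_id: str, parents: dict[str, Any]) -> bool:
    """Walk parent links upward and report whether ancestor_id is reached."""
    seen: set[str] = set()
    current = parents.get(folder_id)
    while current and current not in seen:
        if current == ancestor_id:
            return True
        seen.add(current)
        current = parents.get(current)
    return False


def obscure_folders(folders: list[dict[str, Any]], deleted_items_id: str) -> list[tuple[str, list[str]]]:
    """Flag folders with invisible names or folders nested inside Deleted Items."""
    parents = {f["id"]: f.get("parentFolderId") for f in folders}
    flagged = []
    for folder in folders:
        reasons = []
        if is_invisible_name(folder.get("displayName", "")):
            reasons.append("invisible-name")
        if is_under(folder["id"], deleted_items_id, parents):
            reasons.append("inside-deleted-items")
        if reasons:
            flagged.append((folder["id"], reasons))
    return flagged


# Hypothetical mailbox layout.
folders = [
    {"id": "inbox", "displayName": "Inbox", "parentFolderId": "root"},
    {"id": "deleted", "displayName": "Deleted Items", "parentFolderId": "root"},
    {"id": "f1", "displayName": "\u200b", "parentFolderId": "inbox"},
    {"id": "f2", "displayName": "Archive 2019", "parentFolderId": "deleted"},
]
print(obscure_folders(folders, deleted_items_id="deleted"))
```

Any moveToFolder rule whose target shows up in this list deserves a closer look, since the rule itself never crosses a tenant boundary.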

Mark as Read Before Delivery

Typically layered on top of another rule rather than used alone. Security alert emails, IT notifications, and password change confirmations all arrive in the inbox already marked read, reducing the probability that the victim notices them. Particularly effective during the window when an attacker is making account changes and needs to suppress the resulting notification stream without deleting messages outright.

Delete Messages from Specific Senders

A targeted deletion rule built to suppress communications from security teams, IT helpdesk, or automated alert systems. An attacker who opens a helpdesk ticket to socially engineer a password reset can use this rule to delete the confirmation email and any follow-up from IT before the legitimate account owner sees it. Without a litigation hold, those messages are unrecoverable once purged.

Forward Keyword-Triggered Emails Only

A condition-based rule that forwards only messages containing words such as invoice, wire transfer, payment, credentials, or budget. Lower volume than a full forward means less anomaly to detect and a more targeted intelligence feed. Common in BEC preparation, where the attacker wants financial communications specifically, without creating the noise of forwarding every message.

Forward Emails from Specific Senders

Rather than keyword matching on content, this variant targets sender addresses directly. Rules that forward anything from the CFO, legal counsel, board members, or specific vendors provide the attacker with a clean feed of high-value communications at minimal volume. Particularly common in vendor email compromise scenarios where the attacker needs to intercept a specific business relationship rather than the full mailbox.

Forward and Mark as Read Simultaneously

Combining forwarding with the mark-as-read action so forwarded messages appear in the inbox but show no unread indicator. A casual glance at the mailbox shows nothing unusual. The victim would need to notice the forwarding rule itself or observe that messages they expected to appear unread are already marked as read, which rarely happens in practice unless they are actively looking for it.

BEC Intercept Rule for Payment Threads

The most financially damaging variant. A rule that watches for email threads involving payment approval, wire transfer confirmation, or invoice approval and either forwards them to the attacker or redirects them entirely. The attacker reads the thread, understands the approval chain and language patterns, then times a spoofed payment redirect request to land at exactly the right moment in the approval workflow. The inbox rule does not directly cause fraud, but it provides the intelligence that makes the fraud convincing enough to succeed.
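Most of the scenarios above are combinations of a small number of conditions and actions, so a rule can be tagged by trait rather than matched against one signature per scenario. This is a sketch under stated assumptions: the rule dictionaries follow the Graph messageRule shape, the keyword list is taken from the terms mentioned above, and the trait labels are my own hypothetical naming.

```python
from __future__ import annotations

from typing import Any

# Terms taken from the keyword-triggered scenario described in the text.
FINANCE_TERMS = {"invoice", "wire transfer", "payment", "credentials", "budget"}
KEYWORD_FIELDS = ("subjectContains", "bodyContains", "bodyOrSubjectContains")


def rule_traits(rule: dict[str, Any]) -> set[str]:
    """Tag a rule with the persistence traits it exhibits."""
    conditions = rule.get("conditions") or {}
    actions = rule.get("actions") or {}
    traits: set[str] = set()

    keywords = {kw.lower() for field in KEYWORD_FIELDS for kw in conditions.get(field) or []}
    if keywords & FINANCE_TERMS:
        traits.add("finance-keywords")
    if conditions.get("fromAddresses"):
        traits.add("sender-targeted")
    if not conditions:
        traits.add("matches-all-mail")
    if actions.get("forwardTo") or actions.get("redirectTo"):
        traits.add("sends-copy-out")
    if actions.get("moveToFolder"):
        traits.add("moves-mail")
    if actions.get("markAsRead"):
        traits.add("marks-read")
    if actions.get("delete") or actions.get("permanentDelete"):
        traits.add("deletes-mail")
    return traits


# Hypothetical rule resembling the forward-and-mark-as-read variant.
sample_rule = {
    "displayName": "Invoices",
    "conditions": {"subjectContains": ["Invoice", "Wire Transfer"]},
    "actions": {
        "forwardTo": [{"emailAddress": {"address": "drop@example.net"}}],
        "markAsRead": True,
    },
}
print(sorted(rule_traits(sample_rule)))
```

Pairs such as sends-copy-out with marks-read, or moves-mail with finance-keywords, are more telling than any single trait, because each action on its own is something ordinary users do every day.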


How the Chain Actually Works

The attack breaks down into three distinct stages, each one building on what the previous stage established. Understanding why each step works the way it does matters more than just knowing the commands.

The first stage is turning the captured cookies into an authorization code. ESTSAUTH and ESTSAUTHPERSISTENT are scoped to login.microsoftonline.com, which means they only travel to Microsoft’s identity endpoints, not directly to downstream services like Exchange or Graph. To get anywhere useful, you need to convert them into a token. The way to do that is to hit the OAuth2 authorize endpoint with the cookies attached and let Microsoft’s SSO engine recognize the existing session and issue a code, rather than presenting a login form.

Two things are required to make this work reliably. The Netscape cookie jar format that curl uses treats any cookie with an expiry value of zero as already expired and silently drops it, so the expiry field must be a valid future epoch before the request goes out. Microsoft’s WAF also requires Sec-Fetch headers on navigation requests and returns a 417 without them, so those need to be present as well. With both in place, the authorize endpoint returns a 302 redirect to the native client URI with a fresh authorization code in the query string, and the entire flow happens without any user interaction.

The second stage is to exchange that code for an access token with the correct scope. This is where most attempts stall. Requesting ‘Mail.ReadWrite’ directly through the Graph resource on a first-party client like Microsoft Office triggers AADSTS65002, which is Microsoft refusing to grant explicit delegated permissions to its own apps outside their pre-consented scope sets. The workaround is to target a different OAuth resource entirely. When the authorization code is exchanged for the Outlook resource rather than Graph, the resulting token includes Mail.ReadWrite as part of its default scope set because the Outlook resource grants it implicitly for delegated access. The token audience changes to outlook.office.com rather than graph.microsoft.com, and that distinction opens the door to the third stage.

The third stage is placing the forwarding rule itself. The modern approach would be the Graph messageRules endpoint, but that requires the Graph-scoped Mail.ReadWrite, which the previous step deliberately avoided. The Outlook REST API v2.0, hosted at outlook.office.com, accepts the token because the audience matches. Microsoft deprecated this API in 2020 and formally sunsetted it in 2022, but it remains fully functional. The messagerules endpoint under it behaves identically to its Graph counterpart, accepts the same JSON body, and returns a 201 with the rule ID on success. The rule is created in the authenticated identity’s mailbox without additional consent, admin involvement, or any visible prompt during the session. Every subsequent email arriving in that inbox is silently forwarded to the address specified in the rule.
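For defenders, the useful part of this chain is that the first two stages leave a recognizable sign-in record even if it is logged as low severity. The sketch below is one way to surface it, assuming sign-in logs exported as dictionaries with the Graph signIn field names (isInteractive, appId, resourceDisplayName, ipAddress, userPrincipalName). The resource display name, the per-user known-IP map, and the sample records are assumptions; the app ID is matched only on the d3590ed6 prefix quoted in this post.

```python
from __future__ import annotations

from typing import Any


def quiet_office_to_exchange(
    sign_ins: list[dict[str, Any]],
    known_ips: dict[str, set[str]],
    app_id_prefix: str = "d3590ed6",  # prefix quoted in the text
    resource_name: str = "Office 365 Exchange Online",  # assumed display name
) -> list[dict[str, Any]]:
    """Return non-interactive Office-client sign-ins to Exchange from IPs not seen before."""
    hits = []
    for record in sign_ins:
        if record.get("isInteractive"):
            continue
        if not str(record.get("appId", "")).startswith(app_id_prefix):
            continue
        if record.get("resourceDisplayName") != resource_name:
            continue
        user = record.get("userPrincipalName", "")
        if record.get("ipAddress") in known_ips.get(user, set()):
            continue
        hits.append(record)
    return hits


# Hypothetical records; "<remainder>" stands in for the rest of the app ID.
sign_ins = [
    {"userPrincipalName": "alice@contoso.com", "isInteractive": False,
     "appId": "d3590ed6-<remainder>", "resourceDisplayName": "Office 365 Exchange Online",
     "ipAddress": "203.0.113.7"},
    {"userPrincipalName": "alice@contoso.com", "isInteractive": False,
     "appId": "d3590ed6-<remainder>", "resourceDisplayName": "Office 365 Exchange Online",
     "ipAddress": "198.51.100.10"},
    {"userPrincipalName": "alice@contoso.com", "isInteractive": True,
     "appId": "d3590ed6-<remainder>", "resourceDisplayName": "Office 365 Exchange Online",
     "ipAddress": "203.0.113.7"},
]
known_ips = {"alice@contoso.com": {"198.51.100.10"}}

for hit in quiet_office_to_exchange(sign_ins, known_ips):
    print(hit["userPrincipalName"], hit["ipAddress"])
```

On its own this filter will be noisy in a tenant with mobile users, which is why it works better as the first link in a correlation than as a standalone alert.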


On the Attacker Side

Getting a forwarding rule planted is only the beginning. An attacker who stops there has persistence but no operational picture, and persistence without situational awareness is just noise. The way to think about this stage is in terms of what the access is actually worth and how long it can be maintained before it is closed down.

The first consideration is dwell time. ESTSAUTHPERSISTENT is issued with a 90-day expiry by default on tenants without an aggressive session policy. That means the same cookies that planted the rule can keep generating fresh tokens for months without any re-authentication. Every time a new token is needed, the same SSO flow runs silently, and Entra logs another non-interactive sign-in from Microsoft Office. Unless someone is actively correlating non-interactive sign-in volume for that identity against a behavioral baseline, nothing surfaces. The forwarding rule itself survives password resets and MFA re-enrollment because it lives in the mailbox layer, not the identity layer. Changing a password revokes refresh tokens but does not affect inbox rules.
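The behavioral baseline mentioned above does not need to be elaborate to be useful. Here is a minimal sketch, assuming you already have daily counts of non-interactive sign-ins per identity; the function name, the seven-day minimum, and the thresholds are illustrative choices rather than recommended values.

```python
from __future__ import annotations

from statistics import mean, pstdev


def is_volume_anomalous(history: list[int], today: int, k: float = 3.0, min_margin: int = 5) -> bool:
    """Flag today's count when it exceeds the baseline by k standard deviations.

    min_margin keeps very flat baselines from alerting on tiny increases.
    """
    if len(history) < 7:
        return False  # not enough data to form a baseline
    baseline = mean(history)
    spread = pstdev(history)
    threshold = max(baseline + k * spread, baseline + min_margin)
    return today > threshold


# Hypothetical week of daily non-interactive sign-in counts for one identity.
history = [40, 42, 38, 41, 39, 43, 40]
print(is_volume_anomalous(history, today=44))
print(is_volume_anomalous(history, today=90))
```

A volume check like this only catches a noisy operator. A patient one who requests a handful of tokens a day will stay inside the baseline, so it complements rule and IP checks rather than replacing them.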

The second consideration is what the forwarded mail reveals. A passive forward running for a few days against a finance or executive mailbox builds an intelligence picture that no active enumeration could match without generating significant noise. Approval chains, vendor relationships, payment schedules, internal escalation patterns, and ongoing negotiations all flow through email continuously. An attacker reading that stream in real time can time a BEC attempt with precision, identify the exact language patterns the target uses with their bank, and understand which requests will clear without a callback verification. The forwarding rule essentially converts passive access into active intelligence at zero additional risk.

The third consideration is lateral expansion. The same FOCI client and Outlook REST API approach that created the inbox rule also grants access to calendar, contacts, and OneDrive files through scope pivots on the same token. A refresh token obtained during the initial code exchange can be replayed with other FOCI-compatible clients to access SharePoint and Teams without triggering the authorization flow again. Each pivot generates another non-interactive sign-in that blends into the background traffic of a busy tenant.


Security Controls That Actually Matter

Conditional Access sign-in frequency is the most accessible control for reducing dwell time. Setting re-authentication intervals for sensitive roles and finance mailboxes to 8 or 12 hours forces the ESTS session to expire regularly, which limits how long a captured cookie remains useful. It does not prevent the initial access, but it shrinks the window considerably.
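To put a number on "considerably", here is a small worked calculation. It assumes the 90-day persistent cookie lifetime quoted earlier in this post and treats the sign-in frequency as a hard cap on how long a captured session stays replayable, which is a simplification; the function name is hypothetical.

```python
from __future__ import annotations

from datetime import timedelta
from typing import Optional

# Lifetime quoted in the text for tenants without an aggressive session policy.
DEFAULT_PERSISTENT_LIFETIME = timedelta(days=90)


def replay_window(sign_in_frequency: Optional[timedelta]) -> timedelta:
    """Upper bound on how long a captured session stays usable under a given policy."""
    if sign_in_frequency is None:
        return DEFAULT_PERSISTENT_LIFETIME
    return min(sign_in_frequency, DEFAULT_PERSISTENT_LIFETIME)


for hours in (8, 12):
    window = replay_window(timedelta(hours=hours))
    factor = DEFAULT_PERSISTENT_LIFETIME / window
    print(f"{hours}h sign-in frequency: window is {factor:.0f}x shorter than the default")
```

Under those assumptions an 8-hour policy shrinks the window by a factor of 270 and a 12-hour policy by a factor of 180. Neither touches a rule that was planted inside the window, which is why this control reduces dwell time but does not remove persistence.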

Token Protection in Conditional Access is the control specifically designed to address this attack path. It cryptographically binds access tokens to the originating device using the Primary Refresh Token, meaning a token issued to one device cannot be replayed from another. Replaying ESTS cookies from an attacker machine would produce a token that Exchange rejects at the resource layer. As of 2026, this requires Entra ID P2 and an explicit policy targeting Exchange Online. Most tenants do not have it configured.

Disabling the Outlook REST API v2.0 at the tenant level removes the specific endpoint this technique relied on. Microsoft provides an Exchange Online PowerShell command to block access to the legacy REST endpoint. Since the API is officially deprecated, there is no legitimate reason for it to remain accessible in a production environment. Blocking it forces any attempt to use this technique to fall back to the Graph endpoint, where explicit Mail.ReadWrite consent is required, and the consent grant itself becomes a detectable event.

External forwarding policies enforced at the transport layer stop the most common variant of the technique, but do nothing to address redirect rules, move rules, delete rules, or any scenario in which the mail stays within the tenant. They are worth having, but treating them as sufficient coverage is the exact assumption this attack was designed to exploit.

Monitoring for the combination of a non-interactive sign-in, a new inbox rule, and a subsequent MailItemsAccessed spike from an unfamiliar IP within the same session context detects this technique as a chain rather than three unrelated low-severity events. Most SIEM configurations that handle each signal independently will miss it entirely.
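The chain-level correlation can be sketched in a few lines. This is an illustration, not a production rule: it assumes events have already been normalized into dictionaries with user, type, time, and ip keys, the three type labels are hypothetical names for the signals above, and a single MailItemsAccessed event stands in for a spike. The unfamiliar-IP test is applied to the sign-in that starts the chain.

```python
from __future__ import annotations

from collections import defaultdict
from datetime import datetime, timedelta
from typing import Any

# Hypothetical normalized event types, in the order the chain unfolds.
CHAIN = ("noninteractive_signin", "new_inbox_rule", "mail_items_accessed")


def find_chains(
    events: list[dict[str, Any]],
    known_ips: dict[str, set[str]],
    window: timedelta = timedelta(hours=24),
) -> list[tuple[str, list[dict[str, Any]]]]:
    """Return (user, matched_events) when all three signals occur in order within the window."""
    by_user: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for event in events:
        by_user[event["user"]].append(event)

    alerts = []
    for user, items in by_user.items():
        items.sort(key=lambda e: e["time"])
        for i, first in enumerate(items):
            if first["type"] != CHAIN[0] or first["ip"] in known_ips.get(user, set()):
                continue
            matched, stage = [first], 1
            for later in items[i + 1:]:
                if later["time"] - first["time"] > window:
                    break
                if later["type"] == CHAIN[stage]:
                    matched.append(later)
                    stage += 1
                    if stage == len(CHAIN):
                        break
            if stage == len(CHAIN):
                alerts.append((user, matched))
                break  # one alert per user is enough for triage
    return alerts


start = datetime(2026, 1, 12, 9, 0)
events = [
    {"user": "alice@contoso.com", "type": "noninteractive_signin", "time": start, "ip": "203.0.113.7"},
    {"user": "alice@contoso.com", "type": "new_inbox_rule", "time": start + timedelta(minutes=40), "ip": "203.0.113.7"},
    {"user": "alice@contoso.com", "type": "mail_items_accessed", "time": start + timedelta(hours=3), "ip": "203.0.113.7"},
    {"user": "bob@contoso.com", "type": "noninteractive_signin", "time": start, "ip": "198.51.100.10"},
    {"user": "bob@contoso.com", "type": "new_inbox_rule", "time": start + timedelta(minutes=5), "ip": "198.51.100.10"},
]
known_ips = {"bob@contoso.com": {"198.51.100.10"}}

alerts = find_chains(events, known_ips)
for user, matched in alerts:
    print(user, [e["type"] for e in matched])
```

The window is deliberately a parameter and defaults to a full day. A correlation window measured in minutes is trivially defeated by waiting between the sign-in and the rule creation.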


ESTS Inbox Siphon

The ESTS Inbox Siphon is a script built for one thing: silent, automated exfiltration. No brute-forcing, no interactive logins, zero bullshit. It weaponizes hijacked ESTSAUTH cookies, forces a silent SSO flow to rip an OAuth code, dodges Microsoft’s WAF with spoofed headers, and abuses the legacy Outlook REST API to plant a stealth forwarding rule. We pivot past modern EWS blocks simply because defenders are looking the wrong way.

Bottom line, this is exactly how your data leaves the tenant. Read the breakdown, understand the mechanics, and patch your blind spots. Let’s get into the code.

Cookie Hijacking & MFA Bypass: The script skips credential brute-forcing entirely. By injecting the ESTSAUTH and ESTSAUTHPERSISTENT cookies directly into the session, you hijack an already validated authentication state. MFA and Conditional Access were already handled by the victim; you just walk through the open door.

Silent Auth Code Extraction: Appending sso_reload=True to the authorization URL is the real magic here. It forces the Microsoft identity provider into its Single Sign-On flow, which skips any interactive login forms. The server just looks at the cookies and silently hands over the OAuth code.

Evasion via Header Spoofing: Microsoft’s WAF will nuke basic bot traffic, returning a 417 Expectation Failed if you just use a naked curl. Injecting the Sec-Fetch-Dest, Sec-Fetch-Mode, and Sec-Fetch-Site headers mimics modern browser behavior, allowing the script to blend in and bypass frontline anti-bot checks.

Legacy API Abuse: Exchange Web Services (EWS) is heavily monitored and mostly blocked on modern tenants, and Graph API scopes can be restrictive. Requesting outlook.office.com/.default and routing the attack through the older Outlook REST API v2.0 is a deliberate downgrade tactic. It leverages the same token audience but slips under the radar of modern administrative blocks.

Inbox Rule Injection: This is the payload execution. The script takes the newly minted access token and POSTs a JSON payload to create a new MessageRule. It tells Exchange to auto-forward all incoming mail to an external drop address, then immediately loops back to query the rules to verify the backdoor is alive and kicking.

The Script That Made It Real

Everything described in this post was reduced to a single shell script. cookie-login-inbox-rule.sh takes two values: an ESTSAUTH cookie and an ESTSAUTHPERSISTENT cookie, both lifted directly from an intercepted browser session, and walks through the full attack chain without touching a password or triggering an MFA prompt.

The flow inside the script moves in four steps. First, it writes the cookies into a Netscape-format cookie jar, setting the expiry epoch to 90 days ahead because curl silently drops any cookie with an epoch of zero. Then it hits the Microsoft SSO authorization endpoint using the d3590ed6 client ID for the Microsoft Office application, with sso_reload=True appended to the URL. That parameter is the key piece. It tells the ESTS engine to skip the login form and resolve the session from cookies directly, returning a 302 with an auth code in the Location header rather than rendering an HTML page.

The next step is a straightforward token exchange. The auth code is sent to the v2.0 token endpoint, scoped to https://outlook.office.com/.default, which returns a bearer token with the Mail.ReadWrite scope already included. No consent dialog, no admin approval, no additional prompt of any kind.

From there, the script calls POST /api/v2.0/me/mailfolders/inbox/messagerules on outlook.office.com. The rule body specifies a ForwardTo action pointing to the collection address, wrapped in conditions that match only finance- and credential-related keywords, so the rule fires selectively and stays silent for all other mail.

The result is that a single stolen session cookie directly translates into persistent access to mail. The audit log records a ‘nonInteractiveUserSignIn’ event from a Microsoft Office client, the same event that fires every time Outlook syncs in the background, and a single New-InboxRule entry that is only actionable if someone is watching that specific account at that specific time.

That is where most attackers stop, and that is precisely why most detections catch them.

cookie-login-inbox-rule.sh is the starting point. Before running it in a real engagement, the approach warrants a harder look at what stands out. A rule named forward All planted two minutes after an auth code flow from an unfamiliar IP is a story that writes itself in any SIEM. The v1 script does not try to hide that story.

cookie-login-inbox-rule2.sh is the version built with that story in mind. It reads the target’s existing inbox rules before planting anything, selects the next sequence number, and selects a display name that mirrors the account owner’s existing naming pattern. The auth and rule creation steps are separated by a randomized sleep so they do not fall inside the same SIEM correlation window. The User-Agent rotates between a browser string for the auth leg and an Outlook desktop string for the API calls, matching what a genuine Office client would produce. Keywords in the conditions are scoped to finance and credential terminology, so the rule volume stays low, and the name reads like something a user would create themselves.
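Name mimicry and sequence-number matching only defeat detections that rely on how a rule looks. A check that compares rule IDs against an earlier snapshot and inspects actions is indifferent to both. The sketch below assumes periodic rule snapshots stored as lists of Graph-shaped messageRule dictionaries; the function name and sample rules are hypothetical.

```python
from __future__ import annotations

from typing import Any

OUTBOUND = ("forwardTo", "forwardAsAttachmentTo", "redirectTo")
HIDING = ("moveToFolder", "delete", "permanentDelete", "markAsRead")


def new_risky_rules(
    baseline: list[dict[str, Any]], current: list[dict[str, Any]]
) -> list[tuple[str, str, list[str]]]:
    """Return (id, displayName, risky_actions) for rules absent from the baseline snapshot."""
    known_ids = {rule["id"] for rule in baseline}
    findings = []
    for rule in current:
        if rule["id"] in known_ids:
            continue
        actions = rule.get("actions") or {}
        risky = [name for name in OUTBOUND + HIDING if actions.get(name)]
        if risky:
            findings.append((rule["id"], rule.get("displayName", ""), risky))
    return findings


# Hypothetical snapshots: the new rule copies the owner's naming pattern.
baseline = [
    {"id": "r1", "displayName": "Invoices 1", "actions": {"moveToFolder": "folder-id"}},
]
current = baseline + [
    {"id": "r2", "displayName": "Invoices 2",
     "actions": {"forwardTo": [{"emailAddress": {"address": "drop@example.net"}}],
                 "markAsRead": True}},
]
print(new_risky_rules(baseline, current))
```

The new rule is reported even though its name fits the existing pattern, because the comparison never looks at the name. The cost is that someone has to keep the snapshots, which is a scheduling problem rather than a detection problem.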

The point is not to make detection impossible. It is to move the signal out of the automated detection layer and into the territory where an analyst has to sit down and think. The difference between a rule that fires in thirty seconds and one that requires a human review is often just a handful of decisions made before the first curl request goes out.

The defensive answer to all of this is not just inbox rule monitoring, though that matters. It is Token Protection inside Conditional Access, the only control that cryptographically binds a session token to the device that originally authenticated. Without it, a valid cookie is a valid key, and the lock does not care who is holding it.
