Identity Resolution and Cross-Site Tracking

In the Identity 2.0 era, the mental model was simple: you log in, the site knows who you are; you log out, it doesn’t. Identity was an explicit act. You disclosed what you chose to disclose, to the specific party you chose to disclose it to.

That model no longer describes what happens when you browse the web. Identity resolution — the practice of linking your behavior across websites into a persistent profile — operates without any login event at all. You can clear your cookies, use a private window, and never create an account, and a sufficiently instrumented advertising network may still recognize you on the next site you visit. This post explains the mechanisms, why “logging out” doesn’t stop it, and what the realistic limits are.

What Identity Resolution Actually Means

The term comes from the advertising and data industry, where it describes the process of consolidating multiple signals about a person — different devices, different browsers, different email addresses — into a single unified record. The goal is persistence: a stable identifier for you that survives the things users do to remove tracking, like deleting cookies or switching networks.

The same concept applies across sites. A data broker or ad network with reach across thousands of sites can observe your behavior on each one separately, then stitch those observations together. The stitching is what “identity resolution” refers to. The result is a cross-site behavioral profile that knows things no individual site could know on its own.

This is why the privacy conversation has shifted from “don’t give your email address to sketchy sites” to “the tracking infrastructure is structural.” The issue is not what any single site does with your data. It is what entities positioned across many sites can infer by joining their observations. How that identity question evolved into a surveillance system is the thread this site has been tracing from the Identity 2.0 era to the consent-law era.

The Four Main Re-Identification Mechanisms

Identity resolution draws on several overlapping techniques. In practice, they are often used together, layering signals to improve confidence in the match.

Hashed Email Matching

When you give your email address to a retailer, a publisher, or any site with a login, that address can be hashed — converted into a fixed-length string using SHA-256 or MD5 — and shared with advertising platforms. The hash is technically not your email address, but since the same input always produces the same output, it functions as a stable identifier. Two parties who both have your hashed email can match records without ever exchanging the raw address.

Google’s Enhanced Conversions, Meta’s Conversions API, and similar systems are built on this pattern. A purchase on one site can be matched to an ad click that happened on a different platform months earlier, linked through the hashed email you provided at checkout. The join happens server-side, so it is not blocked by ad blockers or cookie settings.
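To make the matching step concrete, here is a minimal sketch of a hashed-email join. The normalization rule (trim and lowercase), the record layouts, and the sample values are assumptions for illustration; each real platform documents its own normalization requirements.

```python
import hashlib


def normalize_email(raw: str) -> str:
    """Trim whitespace and lowercase so trivially different spellings hash alike."""
    return raw.strip().lower()


def hash_email(raw: str) -> str:
    """Return the SHA-256 hex digest of the normalized address."""
    return hashlib.sha256(normalize_email(raw).encode("utf-8")).hexdigest()


# Hypothetical records held by two unrelated parties, keyed by hashed email.
retailer_orders = {hash_email("Alice@Example.com "): {"order_total": 84.50}}
ad_platform_clicks = {hash_email("alice@example.com"): {"campaign": "spring-sale"}}

# The join: any key present on both sides links a purchase to an ad click,
# even though neither side ever handed over the raw address.
matched = {
    key: {**retailer_orders[key], **ad_platform_clicks[key]}
    for key in retailer_orders.keys() & ad_platform_clicks.keys()
}
```

Because the hash is deterministic, it protects the address only from parties who have never seen it. Anyone who already holds the same email can recompute the same key.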

Probabilistic Identity Graphs

Not every site has your email address. Probabilistic techniques fill the gap by assembling a fingerprint from signals that are hard to control: IP address, browser version, installed fonts, screen resolution, time zone, language settings, and behavioral patterns like typing speed or mouse movement. No single signal is unique enough to identify you; the combination frequently is.

This is the mechanism that makes “I didn’t log in” irrelevant as a privacy protection. A probabilistic match does not require your consent or your participation. It requires only that your device behaves consistently across sessions — which, by design, it does. Publishers and ad networks can use these graphs to assign a persistent identifier to you across visits, then sell or share access to that identifier.
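A toy version of probabilistic matching shows why a changed IP address alone does not break the link. The weights, the threshold, and the session values below are invented for illustration; production identity graphs use far richer statistical models.

```python
# Hypothetical points awarded when two sessions agree on a signal (total 100).
WEIGHTS = {
    "ip": 20,
    "user_agent": 15,
    "fonts": 30,
    "screen": 15,
    "timezone": 10,
    "language": 10,
}
MATCH_THRESHOLD = 75  # assumed cut-off, not an industry constant


def match_score(a: dict, b: dict) -> int:
    """Sum the weights of every signal on which the two sessions agree."""
    return sum(
        weight
        for signal, weight in WEIGHTS.items()
        if signal in a and a.get(signal) == b.get(signal)
    )


def same_visitor(a: dict, b: dict) -> bool:
    """Treat two sessions as one visitor when the score clears the threshold."""
    return match_score(a, b) >= MATCH_THRESHOLD


session_monday = {
    "ip": "203.0.113.7", "user_agent": "UA-1", "fonts": "fontset-a",
    "screen": "2560x1440", "timezone": "Europe/Berlin", "language": "de-DE",
}
# Same device the next day, but on a different network.
session_tuesday = {**session_monday, "ip": "198.51.100.23"}
# A different device that happens to share browser, time zone, and language.
session_other = {
    "ip": "192.0.2.44", "user_agent": "UA-1", "fonts": "fontset-b",
    "screen": "1920x1080", "timezone": "Europe/Berlin", "language": "de-DE",
}
```

In this sketch `same_visitor(session_monday, session_tuesday)` holds despite the new IP, while `session_other` falls well short of the threshold. No login appears anywhere in the comparison.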

Device Fingerprinting

Device fingerprinting is the most technically visible form of probabilistic identification. The browser exposes a range of attributes that differ subtly from device to device. Canvas rendering differences, WebGL behavior, audio context response, and hardware concurrency combine into a fingerprint that is stable across browser restarts and cookie deletions.

Third-party tracking scripts that appear on many sites can collect these signals independently and match them at the network level. The user has no straightforward way to detect this happening; the requests look like standard analytics or ad calls. You can check what trackers are running on any given page in the tracker knowledge base, which documents what each tracker collects and what its re-identification risk profile looks like.
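The sketch below shows, on the server side, how collected attributes might be reduced to a single fingerprint. The attribute names and values are hypothetical stand-ins for what a script could report, and exact-match hashing is a simplification: real systems generally tolerate small drift in individual attributes.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Hash a canonical (sorted-key) JSON encoding of the collected attributes."""
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical values reported by the same device on two visits.
visit_before_clearing_cookies = {
    "canvas_hash": "9c1e",
    "webgl_renderer": "ExampleGPU 3000",
    "audio_hash": "77ab",
    "hardware_concurrency": 8,
}
# Key order differs and cookies are gone, but the device has not changed.
visit_after_clearing_cookies = {
    "hardware_concurrency": 8,
    "audio_hash": "77ab",
    "webgl_renderer": "ExampleGPU 3000",
    "canvas_hash": "9c1e",
}
another_device = {**visit_before_clearing_cookies, "hardware_concurrency": 4}
```

Nothing in the input depends on cookies or local storage, which is why deleting them leaves the fingerprint unchanged.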

Walled-Garden Login as a Tracking Layer

The original Identity 2.0 critique of centralized identity providers — that whoever controls authentication controls a surveillance layer — turned out to be prescient. “Sign in with Google” and “Sign in with Facebook” are convenient and widely deployed. They are also mechanisms through which the identity provider learns which third-party sites you authenticate on, when, and how often.

This is not a hypothetical risk. The authentication flow requires your browser to contact the provider’s servers. The provider sees the referrer domain. Whether that signal is used in advertising systems depends on the provider’s policies and technical implementations — policies that change. The structure of Google’s identity silo is relevant here: Google’s authentication, search, browser, and advertising infrastructure are integrated, and a login event on a third-party site is a data point in that integrated system.
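Beyond the referrer, an OAuth 2.0 style authorization request typically names the requesting site in its own parameters. The sketch below parses a made-up authorization URL to show what a provider could read from it. The hosts and the client ID are hypothetical; `client_id` and `redirect_uri` are standard OAuth 2.0 parameter names.

```python
from urllib.parse import parse_qs, urlsplit


def relying_site(authorization_url: str) -> dict:
    """Extract the fields of an authorization request that reveal the calling site."""
    query = parse_qs(urlsplit(authorization_url).query)
    # parse_qs returns a list per parameter; take the first value if present.
    client_id = query.get("client_id", [None])[0]
    redirect_uri = query.get("redirect_uri", [None])[0]
    site = urlsplit(redirect_uri).hostname if redirect_uri else None
    return {"client_id": client_id, "site": site}


# A made-up request, as the browser would send it to a hypothetical provider.
example_request = (
    "https://idp.example/authorize"
    "?response_type=code"
    "&client_id=news-site-123"
    "&redirect_uri=https%3A%2F%2Fnews.example%2Fcallback"
    "&scope=openid+email"
)
observed = relying_site(example_request)
```

Each such request reaches the provider with a timestamp attached, so the "which sites, when, and how often" record described above follows from the protocol flow itself.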

The contrast with the original user-centric vision — where who holds your identity credential matters enormously — is direct. The dream was that the identity provider would be a neutral, user-controlled entity. What we have instead is providers whose business model depends on cross-context data.

Why Logging Out Doesn’t Stop It

The session-based mental model — log in, act, log out — assumes that your identity exists on the server only while you are authenticated. For the tracking layer, that assumption does not hold.

Several things persist after logout:

Device fingerprint — your browser’s hardware and software characteristics do not change when you log out. A fingerprinting system that recorded your fingerprint during a logged-in session can recognize the same device in a logged-out session.
IP address continuity — most home connections have a stable IP for days or weeks. An advertising network that correlated your IP with your authenticated identity during one session can use the same IP to identify likely traffic from your device later.
Cross-site pixel fires — when you visit a page that contains a tracking pixel from a large ad network, that pixel fires regardless of whether you are logged into the site hosting it. The pixel call often includes enough context — referrer, user agent, IP — to append to an existing profile.
Hashed email stored at the ad platform level — if you authenticated anywhere with an email address that has been ingested into an identity graph, that record exists independently of any session on any individual site.

Logging out ends your session on the site you logged out of. It does not delete the observations that advertising networks made while you were browsing. Those observations live in systems you have no direct access to.
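The separation between the site's session store and the network's store can be modeled in a few lines. Both classes, the device key, and the URLs are hypothetical; the point is only that `logout` touches one store and not the other.

```python
from collections import defaultdict


class Site:
    """The site you log into: it only remembers who is currently signed in."""

    def __init__(self):
        self.sessions = {}

    def login(self, device_key, user):
        self.sessions[device_key] = user

    def logout(self, device_key):
        self.sessions.pop(device_key, None)


class AdNetwork:
    """A third party embedded on many sites, with its own separate storage."""

    def __init__(self):
        self.known_identity = {}
        self.history = defaultdict(list)

    def pixel_fire(self, device_key, url, hashed_email=None):
        # An identity learned once stays attached to the device key.
        if hashed_email is not None:
            self.known_identity[device_key] = hashed_email
        self.history[device_key].append(url)


site, network = Site(), AdNetwork()
device = "fp-3f9a"  # hypothetical fingerprint or IP-derived key

site.login(device, "alice")
network.pixel_fire(device, "https://shop.example/checkout", hashed_email="a1b2c3")
site.logout(device)
# Logged out, different site, same device key: the profile keeps growing.
network.pixel_fire(device, "https://news.example/article")
```

After the last line, `site.sessions` is empty, while `network.history[device]` holds both page views under an identity the network learned during the logged-in session.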

Cross-Site Tracking vs. On-Site Analytics: A Comparison

Dimension	Privacy-First On-Site Analytics	Cross-Site Identity Resolution
What is observed	Your behavior on this site only	Your behavior across many sites joined into a single record
Identifier type	Session-level, typically cookieless	Persistent — survives session end, cookie deletion, browser restart
Who holds the data	The site operator	Third-party ad networks, data brokers, identity resolution vendors
Consent requirement (EU)	Often legitimate interest or no consent needed (no cross-site join)	Consent required under ePrivacy; often contested in practice
What “opt out” means	Typically removes this session from counts	Depends on the identity graph operator; may suppress targeting without deleting history
Re-identification risk	Low (no linkage to external profiles)	High — that is the product’s stated purpose

The distinction matters for website operators, not just end users. If your analytics setup involves third-party scripts, your visitors may be exposed to cross-site identity resolution whether or not you are aware of it. Knowing the weight and purpose of the scripts you run is the starting point — which is what the tracker weight database maps across the most common tracking tools.

What Actually Limits Re-Identification

Technical and legal measures can reduce re-identification risk, though none eliminates it entirely.

Browser-level protections. Safari’s Intelligent Tracking Prevention blocks third-party cookies by default and partitions other third-party storage, while Firefox’s Enhanced Tracking Protection uses Total Cookie Protection to keep a separate cookie jar for each site you visit, so a third-party script can no longer carry an identifier from one site to the next. Chrome’s Privacy Sandbox proposals attempt a similar goal through a different architecture. These protections reduce the effectiveness of third-party cookie tracking and some fingerprinting approaches, but they are not absolute: server-side tracking and hashed-email matching are mostly unaffected.
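The effect of cookie partitioning can be illustrated with a toy cookie jar. This is a conceptual model, not how any browser is implemented: the only difference between the two modes is whether the storage key includes the top-level site. All host names are hypothetical.

```python
import itertools


class CookieJar:
    """Toy cookie store that can key cookies by tracker alone or by (site, tracker)."""

    def __init__(self, partitioned: bool):
        self.partitioned = partitioned
        self._store = {}
        self._ids = itertools.count(1)

    def tracker_id(self, top_level_site: str, tracker_host: str) -> str:
        """Return the tracker's stored ID for this context, minting one if absent."""
        key = (top_level_site, tracker_host) if self.partitioned else tracker_host
        if key not in self._store:
            self._store[key] = f"id-{next(self._ids)}"
        return self._store[key]


legacy = CookieJar(partitioned=False)
legacy_ids = (
    legacy.tracker_id("news.example", "tracker.example"),
    legacy.tracker_id("shop.example", "tracker.example"),
)

isolated = CookieJar(partitioned=True)
isolated_ids = (
    isolated.tracker_id("news.example", "tracker.example"),
    isolated.tracker_id("shop.example", "tracker.example"),
)
```

In the unpartitioned jar the tracker reads the same ID on both sites and can join the visits. In the partitioned jar it receives two unrelated IDs, so a cookie-based join fails. A fingerprint or hashed email sent alongside would still line up.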

Network-level isolation. A VPN changes your visible IP address, which degrades IP-based re-identification. But if you log into any site using an email address that exists in an identity graph, the IP obfuscation is partially defeated at that moment. VPNs also shift trust to the VPN provider, who now sees the traffic patterns your ISP would otherwise see.

Legal mechanisms. GDPR Article 17 gives EU residents a right to erasure. Identity resolution vendors operating in scope are required to honor it. Enforcement varies; the practical challenge is that you often do not know which vendors hold data about you, and the erasure requests must go to each one separately.

Reduced surface exposure. The fewer sites that receive your email address, the smaller the set of hashed-email matches that exist. The fewer browser plugins and non-default settings your browser has, the closer your fingerprint is to a common baseline — though this cuts both ways, since some fingerprint-reduction tools themselves introduce distinctive signals.
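A rough calculation shows why each unusual setting matters. The sketch below treats attributes as statistically independent, which is a simplifying assumption that real browser populations do not satisfy, and the population size and shares are invented.

```python
import math


def surprisal_bits(share_of_population: float) -> float:
    """Bits of identifying information revealed by an attribute value
    that this share of the population has in common."""
    return -math.log2(share_of_population)


def expected_crowd(population: int, shares: list) -> float:
    """Expected number of users sharing ALL the given attribute values,
    assuming (unrealistically) that the attributes are independent."""
    crowd = float(population)
    for share in shares:
        crowd *= share
    return crowd


# Invented figures: 1,000,000 visitors; 10% share your time zone,
# 5% share your screen resolution, 1% share your installed font set.
shares = [0.10, 0.05, 0.01]
crowd = expected_crowd(1_000_000, shares)
total_bits = sum(surprisal_bits(s) for s in shares)
```

Under these assumptions the three attributes shrink a crowd of a million to roughly fifty people. A value shared by half the population costs one bit, and it takes only about twenty bits to single out one person in a million.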

Publisher-side decisions. A site that uses no third-party scripts, runs its analytics first-party, and does not deploy social login buttons does not contribute to the cross-site identity graph, regardless of what users do. The publisher’s infrastructure choices have more leverage over re-identification risk than most user-side measures.

The FAQ on Re-Identification Mechanics

A few questions come up consistently when this topic is explained to non-specialists.

Does incognito mode prevent identity resolution? Incognito mode prevents the browser from writing to your local history and deletes cookies when the window closes. It does not affect server-side tracking, hashed-email matching, or the IP address visible to servers. Device fingerprinting may be partially disrupted if the incognito window uses different defaults than your regular browser, but this is inconsistent across browsers and versions.

Is this only an advertising problem? The techniques are not limited to advertising. Publisher analytics platforms, fraud detection systems, login risk scoring, and personalization engines all use overlapping methods. The difference is purpose and data retention, not the re-identification mechanism itself.

Can you opt out of identity graphs? Industry opt-out registries (NAI, DAA in the US; Your Online Choices in Europe) allow you to opt out of targeted advertising from member companies. This typically suppresses ad targeting but does not necessarily stop data collection or profile building. The opt-out is behavioral, not structural.

Does GDPR solve this? GDPR provides rights and imposes obligations. Whether it solves cross-site identity resolution depends on enforcement, on whether the processing has a valid legal basis, and on whether the identity resolution vendor is in scope. The Belgian data protection authority ruled in 2022 that the advertising industry’s Transparency and Consent Framework breached the GDPR, and the Court of Justice of the EU upheld key parts of that finding in 2024. Enforcement against the wider real-time bidding ecosystem is still proceeding case by case in several jurisdictions.

What This Means If You Run a Site

The identity question that the original Identity 2.0 project raised — who controls your digital identity, and under what rules — is now, for most web publishers, a question about what your site’s infrastructure does to your visitors without their full understanding.

The practical audit starts with what scripts are loading. If you are not sure, your browser’s network tab or a tracker audit tool will show you. From there: are those scripts third-party? Do they set cross-site identifiers? Are they covered by your consent mechanism, or are they firing before consent is given?
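A first pass at that audit can be scripted. The sketch below lists the hosts of external scripts referenced in a page's static HTML. The page URL and markup are made up, and the first-party test is a crude host comparison: a production check would compare registrable domains using the Public Suffix List. Scripts injected at runtime never appear in static HTML, which is why the network tab remains the more complete view.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class ScriptCollector(HTMLParser):
    """Collect the host of every <script src=...> tag in a document."""

    def __init__(self, page_url: str):
        super().__init__()
        self.page_url = page_url
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative and protocol-relative URLs against the page.
            host = urlsplit(urljoin(self.page_url, src)).hostname
            if host:
                self.hosts.add(host)


def third_party_script_hosts(page_url: str, html: str) -> list:
    """Return script hosts that are neither the page's host nor a subdomain of it."""
    collector = ScriptCollector(page_url)
    collector.feed(html)
    first_party = urlsplit(page_url).hostname
    return sorted(
        host
        for host in collector.hosts
        if host != first_party and not host.endswith("." + first_party)
    )


sample_html = """
<script src="/static/app.js"></script>
<script src="https://cdn.tracker.example/pixel.js"></script>
<script src="//ads.example.net/tag.js"></script>
<script>inlineSetup()</script>
"""
found = third_party_script_hosts("https://blog.example.org/post", sample_html)
```

Each host in `found` is a party that receives a request from every visitor's browser, and so a candidate for the questions above about identifiers and consent.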

Most of what limits cross-site identity resolution for your visitors is within your control as a publisher, not theirs. The choice to use first-party analytics, to avoid social login integrations, and to minimize third-party script load is a technical decision that has direct privacy consequences. It is also, increasingly, a legal requirement in jurisdictions with ePrivacy enforcement.

The framing from the Identity 2.0 era — that identity infrastructure is power infrastructure, and the design of that infrastructure determines who benefits — holds. The layer where that fight is happening has just moved from authentication protocols to analytics and ad-tech architecture.