Apple MPP reporting and geolocation
A while back I wrote about Apple Mail Privacy Protection, what it does and how it works. Since MPP was first announced I’d assumed that it would be built on the same infrastructure as iCloud Private Relay, Apple’s VPN product, but hadn’t seen anything from Apple to explicitly connect the two and didn’t have access to enough data to confirm it independently.
But the nice folks at MailChimp did gather enough image load data to confirm that the two are related, and prompted me to look into Private Relay a bit more.
Apple have a nice description of Private Relay from the consumer perspective in their support pages, but the interesting bits are in their technical info for network admins. Their description there matches my black box testing of MPP image loads exactly, but the bit that clinches it is the directions for how enterprise networks can block private relay access:
[…] your network can block access to Private Relay in these cases. The user will be alerted that they need to either disable Private Relay for your network or choose another network. The fastest and most reliable way to alert users is to return a negative answer from your network’s DNS resolver, preventing DNS resolution for the following hostnames used by Private Relay traffic. […]
Those are the same hostnames that Apple MPP uses to start a QUIC connection to load images, so if Private Relay is blocked, MPP is blocked. They’re using the same infrastructure. (It would be possible, if unlikely from an engineering perspective, for them to use the same QUIC front end but different subsets of egress IPs. Data from real-world MPP loads suggests they don’t today, and it seems unlikely that’ll change.)
So, we know that image loads from Apple’s proxies may be from Apple Mail on any platform (iOS, macOS, iPadOS) where the user has MPP enabled. Or they may be from any application running on iOS, macOS or iPadOS that is using iCloud Private Relay (or, perhaps, non-Apple devices that are sharing that connection). We also know that enterprise and education networks are able to disable Private Relay and MPP network-wide. This means that the same device may or may not use MPP, even for the same email, depending on which network it’s on. That’s likely to have an impact on open reporting for B2B email.
(We also know that someone running a spamtrap network, or an enterprise filter, could set things up such that their image loads were done via Apple’s proxies if they wanted to do so.)
Apple provide a list of their Private Relay egress IPs, linked from their technical info page, as a giant CSV. This will let ESPs identify relatively simply which image loads are coming from Apple (iOS, macOS, iPadOS) users via MPP or Private Relay. How best to report that to the user will be the tricky bit.
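As a rough illustration of that check, here is a minimal Python sketch using only the standard library. It assumes each CSV row starts with a CIDR range followed by location columns; that layout, the function names and the sample rows (which use documentation-reserved ranges, not real Apple egress ranges) are all assumptions made for the example.

```python
import csv
import io
import ipaddress

# Made-up sample in the ASSUMED layout: CIDR, country, region, city.
# These are documentation-reserved ranges, not real Apple egress ranges.
SAMPLE_CSV = """\
192.0.2.0/28,US,US-CA,Los Angeles,
198.51.100.64/27,GB,GB-ENG,London,
2001:db8:1234::/64,US,US-NY,New York,
"""


def load_egress_ranges(csv_text):
    """Parse the first column of each non-empty CSV row as a CIDR range."""
    networks = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row or not row[0].strip():
            continue  # skip blank lines
        networks.append(ipaddress.ip_network(row[0].strip(), strict=False))
    return networks


def is_apple_egress(ip_text, networks):
    """Return True if ip_text falls inside any of the given ranges."""
    address = ipaddress.ip_address(ip_text)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so mixed-family lists are safe to scan.
    return any(address in network for network in networks)


if __name__ == "__main__":
    ranges = load_egress_ranges(SAMPLE_CSV)
    for ip in ("192.0.2.5", "192.0.2.200", "2001:db8:1234::1"):
        print(ip, is_apple_egress(ip, ranges))
```

The linear scan is fine for a handful of ranges, but with hundreds of thousands of entries a real pipeline would probably want a sorted index or prefix tree instead.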
It’s a long list of addresses, so here are some stats about their egress nodes:
The list has 325,664 CIDR ranges, of which only 37,545 are IPv4, compared to 288,119 IPv6. The IPv4 ranges cover a total of 89,926 addresses. The IPv6 ranges are mostly /64s, with a few Akamai /42s and /46s.
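Numbers like these can be derived with a few lines of Python. The sketch below is one way to do it, not necessarily how the figures above were produced; the sample ranges are made up (documentation-reserved prefixes) and the function name is hypothetical. Summing range sizes gives the right address total only if the ranges don't overlap, which is assumed here.

```python
import ipaddress
from collections import Counter

# Made-up sample ranges, NOT taken from Apple's list.
SAMPLE_RANGES = [
    "192.0.2.0/31",
    "198.51.100.8/30",
    "2001:db8:1::/64",
    "2001:db8:2::/64",
    "2001:db8:4000::/46",
]


def summarize_ranges(cidrs):
    """Count ranges per IP version, IPv4 addresses, and IPv6 prefix lengths."""
    networks = [ipaddress.ip_network(cidr) for cidr in cidrs]
    return {
        # Number of CIDR ranges for each IP version (4 or 6).
        "ranges_by_version": Counter(n.version for n in networks),
        # Total IPv4 addresses, assuming the ranges do not overlap.
        "ipv4_addresses": sum(
            n.num_addresses for n in networks if n.version == 4
        ),
        # How many IPv6 ranges there are of each prefix length.
        "ipv6_prefix_lengths": Counter(
            n.prefixlen for n in networks if n.version == 6
        ),
    }


if __name__ == "__main__":
    print(summarize_ranges(SAMPLE_RANGES))
```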
Apple say that Private Relay traffic will originate from addresses that are related to the location of the user. Private Relay users can choose to use either an egress IP that’s close to them, typically a nearby city, or a random location in the same country and time zone. I’m assuming, without much evidence, that MPP traffic is likely to use the “general location” approach, where it appears to come from a nearby city.
There are 1,867 unique geoip locations for IPv4 addresses, and 41,417 for IPv6 addresses. That means that geolocation of image loads is likely to be more accurate if the images are served over IPv6. Most apps will load over IPv6 preferentially if both are available, so serving your images over both IPv4 and IPv6 on the same hostname will give the best result.
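A quick way to check whether an image hostname is reachable over both families is to ask the local resolver what it returns. This is a minimal sketch: the function names are hypothetical, `images.example.com` is a placeholder, and the result reflects whatever the local resolver and OS configuration return rather than a direct query for A and AAAA records.

```python
import socket


def address_families(hostname, resolve=socket.getaddrinfo):
    """Return the set of address families the resolver reports for hostname."""
    results = resolve(hostname, 443, type=socket.SOCK_STREAM)
    # Each result is (family, type, proto, canonname, sockaddr).
    return {entry[0] for entry in results}


def is_dual_stack(hostname, resolve=socket.getaddrinfo):
    """True if hostname resolves to at least one IPv4 and one IPv6 address."""
    families = address_families(hostname, resolve)
    return socket.AF_INET in families and socket.AF_INET6 in families


if __name__ == "__main__":
    # Placeholder hostname; substitute your own image server.
    host = "images.example.com"
    try:
        print(host, "dual stack:", is_dual_stack(host))
    except socket.gaierror as error:
        print("lookup failed:", error)
```

The `resolve` parameter exists only so the check can be exercised without network access; in normal use the default is enough.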
The picture at the top of the post is a very rough visualization of about half of the egress nodes; there’s an interactive version at https://appleegress.wordtothewise.com/. It’s very crude, based solely on the geolocation data I have handy, but it gives an idea of the coverage.
Apple have worked with commercial geoip providers to set up these locations, and many have annotated them as “iCloud Private Relay” or similar, which will make them easy to recognize as part of the geolocation pipeline rather than needing to check against a local database.
Images being preloaded, or loaded via proxies, isn’t really anything new. MPP just raises the visibility of it. ESPs can relatively easily identify which image loads are via Apple proxies. The tricky bit is how to usefully convert that into actionable information for the user.
Any time mail is sent, images are likely to be loaded multiple times, potentially from multiple devices (phone, desktop, webmail). The crudest way to aggregate that data is to treat the first image load as evidence that the mail was read, and of when, on what device and where in the world it was read, then to ignore all later image loads. That’s going to be decreasingly useful, and potentially misleading. But attempting to display all the image load data is also going to be misleading. ESPs are going to have to develop a more nuanced way of summarizing and showing this data to their users.
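As one very rough sketch of a step beyond "first load wins", the Python below keeps loads via Apple's proxies separate from other loads for a single message instead of collapsing everything into one "open". The `ImageLoad` and `LoadSummary` types and their fields are hypothetical, invented for this example; it assumes each load has already been flagged as coming from an Apple egress IP or not. Neither bucket proves a person read the mail, since a proxy load may be a prefetch or a real Private Relay user, and a direct load may be a filter.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class ImageLoad:
    """One tracked image load for a message (hypothetical structure)."""
    when: datetime
    via_apple_proxy: bool  # assumed to be set by an earlier egress-IP check


@dataclass
class LoadSummary:
    proxy_loads: int
    direct_loads: int
    first_direct_load: Optional[datetime]


def summarize_loads(loads: List[ImageLoad]) -> LoadSummary:
    """Count proxy and direct loads separately for one message."""
    direct_times = [load.when for load in loads if not load.via_apple_proxy]
    return LoadSummary(
        proxy_loads=len(loads) - len(direct_times),
        direct_loads=len(direct_times),
        # Earliest non-proxy load, or None if there were none.
        first_direct_load=min(direct_times) if direct_times else None,
    )


if __name__ == "__main__":
    loads = [
        ImageLoad(datetime(2021, 10, 1, 10, 0), via_apple_proxy=True),
        ImageLoad(datetime(2021, 10, 1, 12, 30), via_apple_proxy=False),
        ImageLoad(datetime(2021, 10, 1, 11, 15), via_apple_proxy=False),
    ]
    print(summarize_loads(loads))
```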
Do MPP image loads use an egress IP that’s using “Maintain General Location” or “Use Country and Time Zone” style? The data I have suggests the former, but it’d be possible to test.
Does that change if the user is using iCloud Private Relay set to “Use Country and Time Zone”?
How does a gif’s frame delay, the time Apple Mail prefetches an image and the time it displays an image interact? e.g. if a gif with a frame delay of 1 hour is prefetched at 10am, then first displayed at 10:30am, does the second frame display at 11am or 11:30am?
Apple MPP does change the landscape of data based on image loads. But not as much as some had feared.
Use of image loads to provide time-varying data to the recipient (e.g. countdown timers, sales ending) is likely to need to be rethought.
Use of an image load to mean that a mail was read by a person continues to be misleading and will get increasingly so.
Geolocation based on image loads is likely to be relatively accurate, at least to a city level, and probably no worse than it is now.
Geolocation based on image loads will be more accurate if your images are served via IPv6. Your image server should serve over both IPv6 and IPv4, with its hostname having both A and AAAA records. (That’s been true for years, but more accurate geolocation is another reason.)
ESPs may need to move to more sophisticated use of image loads than their current “open” reporting. Marketers should expect their ESPs to be explicit about what their reporting is based on, and distrust any reporting that simply describes events as “opens” or (worse) “real opens” without any explanation.