Introduction
In recent years, Microsoft Safe Links has seen widespread deployment. The product promises time-of-click protection through automated detection of phishing and malware distribution sites. Despite the noble intent, the product is opaque and includes some controversial design choices, some of which lead to unintended information leakage.
This research effort aims to assess what an operator can observe when Safe Links are clicked and processed by the backend. It is the final project of Max Resing, a Master's student at the University of Twente, the Netherlands.
Study Objectives
The goals of this study include:
- Analysing the structure of Safe Links and how much information is embedded in them.
- Understanding how the Safe Links backend processes a link at the moment it is received in a mail and at the moment a user clicks it.
- Exploring how much information leaks into public data sources through shared links, archived mails, etc.
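To illustrate the first objective, the following minimal Python sketch splits a Safe Link into its parts. It assumes the commonly observed layout in which the rewritten link points to a host under safelinks.protection.outlook.com and carries the original destination in a `url` query parameter. The function name `parse_safe_link`, the example link, and the placeholder parameter values are hypothetical, and the meaning of the remaining parameters is deliberately left open, since that is what the study analyses.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed host suffix of rewritten links; not taken from this page.
SAFE_LINKS_SUFFIX = ".safelinks.protection.outlook.com"


def parse_safe_link(link: str) -> dict:
    """Split a Safe Link into host prefix, original URL, and other parameter names."""
    parts = urlsplit(link)
    host = parts.hostname or ""
    if not host.endswith(SAFE_LINKS_SUFFIX):
        raise ValueError("not a Safe Links URL")
    query = parse_qs(parts.query)  # percent-decodes parameter values
    return {
        "host_prefix": host[: -len(SAFE_LINKS_SUFFIX)],
        "original_url": query.get("url", [None])[0],
        "other_parameters": sorted(k for k in query if k != "url"),
    }


# Hypothetical example link with placeholder parameter values.
example = (
    "https://nam02.safelinks.protection.outlook.com/"
    "?url=https%3A%2F%2Fexample.org%2Fpath%3Fid%3D1&data=PLACEHOLDER&reserved=0"
)
info = parse_safe_link(example)
print(info["original_url"])
```

Everything in the link besides the `url` parameter is a candidate carrier of embedded information, which is why the sketch lists the other parameter names separately.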
Methodology
We set up Internet infrastructure to perform measurements on the Safe Links backend. The infrastructure is hosted on this domain. We measure DNS queries to authoritative name servers under our control. Furthermore, we log detailed web requests on this web service.
To study different Safe Links configurations, we send mails containing links to mailing lists and researchers, hoping that these links are processed and clicked. The infrastructure performs detailed logging; the logs are filtered afterwards to remove any personal information, such as residential IP addresses. The logged data helps us understand what an operator can learn about Safe Links, and also about users, simply through information leaked by the product design.
Participation
If you would like to participate, feel free to reach out to us. You can find our contact details below. We process incoming participation requests manually. Anyone who requests participation will receive a response with a dedicated participation link. We hope to observe Safe Links being generated on message retrieval by the Microsoft Exchange or Office 365 backend.
Data Processing
The infrastructure for this study consists of two external servers, one in the EU and one in the US. They serve as authoritative name servers and host the web service, and both are designed to log information about the Safe Links backend.
Because both servers log this traffic, we cannot avoid briefly storing the IP addresses of recipients. We are not interested in personal IP addresses. New log data is stored only temporarily: once a day, it is processed and filtered, and we keep only information about IP addresses that the GeoIP location service IP2Location does not flag as residential.
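The daily filtering step can be sketched roughly as follows. This is an illustration under stated assumptions, not the study's actual pipeline: the record layout, the function `keep_non_residential`, and the stand-in classifier `demo_is_residential` are all hypothetical. In practice the classifier would wrap a lookup against the IP2Location data; here it is replaced by a small hard-coded set of documentation addresses so the example runs on its own.

```python
from typing import Callable, Iterable


def keep_non_residential(
    records: Iterable[dict], is_residential: Callable[[str], bool]
) -> list:
    """Return only the log records whose IP is not classified as residential."""
    return [record for record in records if not is_residential(record["ip"])]


# Hypothetical stand-in for a GeoIP lookup; a real one would query IP2Location data.
DEMO_RESIDENTIAL_IPS = {"198.51.100.7"}


def demo_is_residential(ip: str) -> bool:
    return ip in DEMO_RESIDENTIAL_IPS


# Hypothetical temporary log entries (addresses from documentation ranges).
raw_log = [
    {"ip": "198.51.100.7", "path": "/a"},
    {"ip": "203.0.113.20", "path": "/b"},
]

kept = keep_non_residential(raw_log, demo_is_residential)
print(kept)
```

Passing the classifier in as a function keeps the filter independent of any particular GeoIP database and makes it easy to test with known addresses.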
If we find any critical vulnerability or exposure of data, we will follow the procedure of responsible disclosure.
Resources and References
Contact Information
If you have any questions, want to participate, or need more information, please contact us.