February 23, 2016

Windows 10 telemetry secrets: Where, when, and why Microsoft collects your data

How does Windows 10 telemetry really work? It’s not a state secret. I’ve gone through the documentation and sorted out the where, when, and why. If you’re concerned about private documents accidentally leaving your network, you might want to turn the telemetry setting down.

Telemetry is not a four-letter word.

You wouldn’t know that to listen to the relentless hammering of the technology by Windows 10 critics, who see it as a form of “spying” on the part of Microsoft. Unfortunately, many of those critics have used unreliable data , compounded by a misunderstanding of the basic technology, to form their opinions.

In this article, I want to take a closer look at the way that telemetry works and the data it collects. This article relies primarily on my own testing, using a number of Microsoft-provided tools as well as third-party utilities.

Revealed! The crucial detail that Windows 10 privacy critics are missing

Here we go again. The usual suspects are trying to turn routine diagnostic information into another manufactured privacy controversy over Windows 10. Don’t fall for it. (PS: You won’t believe what Apple’s privacy policy says.)

My research also included discussions with engineers as well as reviews of some thorough but obscure documentation. The most useful resource I found is a detailed technical paper written for IT pros and published in the TechNet Library: Configure telemetry and other settings in your organization . (That article has a convenient short link: aka.ms/ConfigureTelemetry.)

What is Windows 10 telemetry?

Microsoft defines telemetry as “system data that is uploaded by the Connected User Experience and Telemetry component,” also known as the Universal Telemetry Client, or UTC service. (More on that shortly.)

Microsoft uses telemetry data from Windows 10 to identify security and reliability issues, to analyze and fix software problems, to help improve the quality of Windows and related services, and to make design decisions for future releases.

Telemetry features aren’t unique to Microsoft and there’s nothing particularly secret about them. They’re part of a larger trend in the software industry to collect and analyze event data as part of a shift to data-driven decision making. (My definition of “the software industry” includes not just Microsoft and Google but also companies like Tesla Motors, which uses vehicle telemetry to provide ongoing product improvements to its cars.)

You can read about Microsoft’s use of this technology in a paper co-authored by Titus Barik of the University of North Carolina and several individuals at Microsoft Research. “The Bones of the System: A Case Study of Logging and Telemetry at Microsoft” will be presented at the International Conference on Software Engineering in September 2016.

It’s worth noting that the telemetry data I describe here is only a small part of the routine traffic between a Windows 10 PC and various servers controlled by Microsoft. Most network analysis I’ve seen looks at all that traffic and doesn’t isolate the telemetry data transmissions.

How does Windows 10 collect and transmit telemetry data?

Windows 10 includes a piece of software called the Connected User Experience and Telemetry component, also known at the Universal Telemetry Client (UTC). It runs as a Windows service with the display name Diagtrack and the actual service name utcsvc. Microsoft has engineered this component as a part of Windows.

You can see the DiagTrack service in the Services console in Windows 10. As I said, it’s not a secret.

windows-10-diagtrack.jpg

To find the process ID (PID) for the service, look on the Services tab in Windows Task Manager. This piece of information is useful for anyone who wants to monitor activities of the DiagTrack service using other software tools.

I used that PID to watch the activity of the DiagTrack service over the period of several days, using the built-in Resource Monitor tool on a virtual machine running Windows 10 Enterprise with a local account and the telemetry level set to Basic.

windows-10-telemetry-resource-monitor.jpg

That screenshot shows the DiagTrack component doing exactly what the documentation says it does, performing an initial performance measurement and then checking the contents of four log files every 15 minutes or so. Because I wasn’t doing anything with this test system, there weren’t any crashes or app installations to report, so those log files didn’t change during the period I was measuring.

Each data transmission was small. Microsoft says the average size is 1.2K, which is certainly consistent with my experience.

On my AC-powered test system running on a wired network, that’s roughly 32 connections every eight hours. If you run the same experiment on a metered network, Microsoft says no data is transmitted. If this system has been a notebook running on battery power, check-ins would have been once every four hours.

Diagnostic and crash data is uploaded only on AC power and on non-metered networks.

What data is collected from a Windows 10 PC?

The amount and type of data telemetry that the UTC will collect is determined by which of four telemetry levels is selected. Three of them (Basic, Enhanced, and Full) can be configured using the Settings app; the fourth level (Security) is available for PCs only in Windows 10 Enterprise and Education editions and can only be set using administrative tools such as Group Policy or mobile device management software.

Microsoft uses the following diagram to describe these four levels.

win10-telemetry-levels.png

Telemetry data includes information about the device and how it’s configured (including hardware attributes such as CPU, installed memory, and storage), as well as quality-related information such as uptime and sleep details and the number of crashes or hangs. Additional basic information includes a list of installed apps and drivers. For systems where the telemetry is set to a level higher than Basic, the information collected includes events that analyze interaction between the user and the operating system and apps.

I will not try to summarize the four levels here but instead encourage you to read the full descriptions for each level in the documentation.

The default level is Full for Windows 10 Home and Pro and Enhanced for Enterprise edition.

If you are concerned enough about privacy to have read this far, you probably want to set the telemetry level to Basic. Search for Feedback in the Settings app to find the Diagnostic And Usage Data switch shown here.

feedback-settings-basic.jpg

You can also use Group Policy and MDM software to enforce these and other settings on a Windows domain.

Organizations that have a need to keep outside network connections and data transfer to a minimum should consider the Security level, but only if they have the IT chops to set up their own update infrastructure. (At this level of minimal data collection, Windows Update doesn’t work.)

Where is telemetry data stored?

On a Windows 10 PC, telemetry data is stored in encrypted files in the hidden %ProgramData%\Microsoft\Diagnosis folder. The files and folders in this location are not accessible to normal users and have permissions that make it difficult to snoop in them.

diagnosis-folder-hidden.jpg

Even if you could look into the contents of those files, there’s nothing to see, because the data files are encrypted locally.

The UTC client connects to settings-win.data.microsoft.com, provides its device ID and a few other configuration details, and downloads a settings file.

Next, the telemetry client connects to the Microsoft Data Management Service at v10.vortex-win.data.microsoft.com and uploads any data that is waiting to be sent. The transmission takes place over encrypted HTTPS connections.

(That’s a security change Microsoft made in the Windows 7 timeframe. Previous versions sent telemetry data over unencrypted connections, making it possible for attackers to intercept the data.)

10 best privacy tools for staying secure online

A number of free and open-source projects exist solely to protect your identity and online activity. Here are just a few to make you more secure in the new year.

I was able to confirm these values using many hours of network diagnostics. Note that the IP addresses assigned to these individual hosts might vary. This is the very definition of big data.

How does Microsoft use this data?

Microsoft maintains potentially sensitive telemetry data “in a separate data store that’s locked down to a small subset of Microsoft employees in the Windows Devices Group.” In addition, the company says, “Only those who can demonstrate a valid business need can access the telemetry info.”

This data is compiled into business reports for analysis and for use by teams tasked with fixing bugs and improving the performance of the operating system and associated services. Only “aggregated, anonymous telemetry information” is included in reports that are shared with partners.

There’s no hard-and-fast rule that defines how long data is retained. However, Microsoft says its goal is to store data only “for as long as it’s needed to provide a service or for analysis.” A vague follow-up statement says “much of the info about how Windows and apps are functioning is deleted within 30 days.”

Is it possible for Microsoft to collect business or personal information?

Yes, especially at the higher telemetry settings.

The collection process is tailored so that the telemetry component avoids gathering information that could directly identify a person or an organization. However, at the Enhanced setting, when Windows or an app crashes or hangs, the memory contents of the faulting process are included in the diagnostic report generated at the time of the crash or hang, and that crash dump might include sensitive information.

At the Full setting, you grant Microsoft permission to collect extra data when your device “experiences problems that are difficult to identify or repeat using Microsoft’s internal testing.

The formal documentation makes it clear that this sort of investigation can snag personal documents:

This info can include any user content that might have triggered the problem and is gathered from a small sample of devices that have both opted into the Full telemetry level and have exhibited the problem.

However, before more info is gathered, Microsoft’s privacy governance team, including privacy and other subject matter experts, must approve the diagnostics request made by a Microsoft engineer. If the request is approved, Microsoft engineers can use the following capabilities to get the information:

  • Ability to run a limited, pre-approved list of Microsoft certified diagnostic tools, such as msinfo32.exe, powercfg.exe, and dxdiag.exe.
  • Ability to get registry keys.
  • Ability to gather user content, such as documents, if they might have been the trigger for the issue.

If you’re not comfortable with granting that sort of access, make sure you turn this setting down to Enhanced or Basic.

Permalink • Print • Comment
Made with WordPress and Semiologic • Sky Gold skin by Denis de Bernardy