Tor Metrics archives historical data about the Tor ecosystem, collects data from the public Tor network and related services, and assists in developing novel approaches to safe, privacy-preserving data collection.
We only use public, non-sensitive data for metrics. Each metric goes through a rigorous review and discussion process before appearing here. We never publish statistics—or aggregate statistics—of sensitive data, such as unencrypted contents of traffic.
The goals of a privacy and anonymity network like Tor are not easily combined with extensive data gathering, but at the same time data is needed for monitoring, understanding, and improving the network. Data can be used to detect possible censorship events or attacks against the network. Safety and privacy concerns regarding data collection by Tor Metrics are guided by the Tor Research Safety Board's guidelines. Safety and privacy assessment is usually done openly, through discussion during the proposal process for changes to the Tor source, and is sometimes supported by closer analysis in the form of Tor Technical Reports.
For data we collect from the public Tor network, we will always follow three main guidelines:
You can read more about safety considerations when collecting data in the Tor network in "A Case Study on Measuring Statistical Data in the Tor Anonymity Network" by Karsten Loesing, Steven J. Murdoch, and Roger Dingledine, published in the Proceedings of the Workshop on Ethics in Computer Security Research (WECSR 2010), Tenerife, Canary Islands, Spain, January 2010.
Tor relays and bridges collect aggregated statistics about their usage including bandwidth and connecting clients per country. Source aggregation is used to protect the privacy of connecting users—discarding IP addresses and only reporting country information from a local database mapping IP address ranges to countries. These statistics are sent periodically to the directory authorities.
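As a rough illustration of source aggregation, the following Python sketch maps each client address to a country and keeps only per-country counts. The `GEOIP_RANGES` table, the function names, and the `??` fallback code are assumptions made for this example. It is not how Tor itself implements the feature, and it leaves out any further safeguards relays may apply.

```python
import ipaddress
from collections import Counter

# Hypothetical stand-in for a local database mapping IP ranges to countries.
# The ranges below are reserved documentation networks, not real client data.
GEOIP_RANGES = [
    (ipaddress.ip_network("192.0.2.0/24"), "us"),
    (ipaddress.ip_network("198.51.100.0/24"), "de"),
]


def country_for(ip):
    """Return the country code for an address, or '??' if no range matches."""
    addr = ipaddress.ip_address(ip)
    for network, country in GEOIP_RANGES:
        if addr in network:
            return country
    return "??"


def aggregate_by_country(client_ips):
    """Count clients per country; the addresses themselves are not kept."""
    counts = Counter()
    for ip in client_ips:
        counts[country_for(ip)] += 1
    return dict(counts)
```

Only the dictionary returned by `aggregate_by_country` would be reported in this sketch; the individual addresses never leave the function.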
CollecTor downloads the latest server descriptors, extra info descriptors containing the aggregated statistics, and consensus documents from the directory authorities and archives them. This archive is public and the metrics-lib Java library can be used to parse the contents of the archive to perform analysis of the data.
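To give a feel for what analysis of the archived statistics can involve, here is a minimal Python sketch that parses a single per-country statistics line. It assumes the line consists of a keyword followed by comma-separated `cc=count` pairs; the sample line and the function name are made up for illustration. This is not metrics-lib, which is a Java library with its own API.

```python
def parse_country_counts(line):
    """Split 'keyword cc=n,cc=n,...' into (keyword, {cc: n})."""
    keyword, _, rest = line.strip().partition(" ")
    counts = {}
    # An empty remainder yields no pairs rather than an error.
    for pair in filter(None, rest.split(",")):
        country, _, number = pair.partition("=")
        counts[country] = int(number)
    return keyword, counts


# Illustrative input only; the numbers are invented.
keyword, counts = parse_country_counts("bridge-ips us=16,de=8")
```

A real analysis would read complete descriptors from the archive rather than isolated lines, and would need to handle the many other line types they contain.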
In order to provide easy access to visualizations of the historical data archived, the Tor Metrics website contains a number of customizable plots to show user, traffic, relay, bridge, and application download statistics over a requested time period and filtered to a particular country.
In order to provide easy access to current information about the public Tor network, Onionoo implements a protocol to serve JSON documents over HTTP that can be consumed by applications that would like to display information about relays along with historical bandwidth, uptime, and consensus weight information.
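The sketch below shows how an application might consume such JSON documents. The base URL, the `details` document name, the `search` and `limit` parameters, and the `relays`, `nickname`, and `consensus_weight` fields are assumptions here and should be checked against the Onionoo protocol documentation. The example parses a hand-written sample instead of contacting the service.

```python
import json
from urllib.parse import urlencode

ONIONOO_BASE = "https://onionoo.torproject.org"  # assumed base URL


def details_url(**params):
    """Build a request URL for an (assumed) 'details' document."""
    return f"{ONIONOO_BASE}/details?{urlencode(params)}"


def summarize_relays(document_text):
    """Return (nickname, consensus_weight) pairs from a JSON document."""
    document = json.loads(document_text)
    return [
        (relay.get("nickname", "Unnamed"), relay.get("consensus_weight", 0))
        for relay in document.get("relays", [])
    ]


# Hand-written sample shaped like the assumed response; not real network data.
SAMPLE = '{"relays": [{"nickname": "ExampleRelay", "consensus_weight": 1200}]}'
summary = summarize_relays(SAMPLE)
```

In a real client, the text passed to `summarize_relays` would come from an HTTP GET of the URL returned by `details_url`, for example via `urllib.request.urlopen`.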
One such application is Relay Search, which is used by relay operators, those monitoring the health of the network, and developers of software using the Tor network. Another is metrics-bot, which posts regular snapshots to Twitter and Mastodon, including country statistics and a world map plotting known relays.
The diagram below shows how data is collected, archived, analyzed, and presented to users through services operated by Tor Metrics. The majority of our services use metrics-lib to parse the descriptors that have been collected by CollecTor as their source of raw data about the public Tor network.
Collecting and processing new data is unlikely to happen without your help! If you really want to see something measured here, we would be happy to work with you. Learn more about contributing on our team wiki page.
Tor Metrics is a project of:
The Tor Project, P.O. Box 5, Winchester, NH 03470, United States
Metrics are a critical part of any security technology. If you don't know how the technology works in practice, you can't find and fix problems. You can't improve the security. You can't make it work better. This isn't glamorous or sexy work, but it's essential. This is especially true for security and privacy, where our preconceived notions of threats and usage are regularly wrong—and knowing what's really going on is the difference between security and insecurity.
Tor is doing cutting-edge work in the anonymity space, and Tor metrics are already proven to provide critical information for research and development. It's one of the few open data sets available for how, why, where, and when people use anonymizing technologies.
Tor's metrics project increases the transparency of Tor's work. This helps users understand how Tor works. With good network metrics, you can look back for indicators and anomalies at the time a privacy issue was reported. You can also extrapolate and look forward to prevent related issues in the future. This helps alleviate users' security concerns, and helps others contribute to security issues in the network and browser.
Finally, Tor metrics are the ammunition that lets Tor and other security advocates argue for a more private and secure Internet from a position of data, rather than just dogma or perspective. It's where the real world influences Tor.
This material is supported in part by the National Science Foundation under Grant No. CNS-0959138. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
"Tor" and the "Onion Logo" are registered trademarks of The Tor Project, Inc.
Data on this site is freely available under a CC0 no copyright declaration: To the extent possible under law, the Tor Project has waived all copyright and related or neighboring rights in the data. Graphs are licensed under a Creative Commons Attribution 3.0 United States License.