Reproducible Metrics #

The graphs and tables on Tor Metrics are the result of aggregating data obtained from several points in the Tor network. Some of these aggregations are straightforward, but some are not.

We want to make the graphs and tables on this site easier to access and reproduce, so on this page, we specify how you can reproduce the data behind them to create your own. We also provide background for some of the design decisions behind our aggregations and link to technical reports and other additional information.

This page is a living document that reflects the latest changes to graphs and tables on Tor Metrics. Whenever we create new aggregations or visualizations, we may write down our thoughts in technical reports; but, if we later expand or change a statistic, we don't update the original technical reports. Instead, we update the specification here.

While we may refer to technical reports for additional details, we do not assume their knowledge in order to make sense of the specifications here. Knowledge of our source code is not needed, either.

Users #

The number of Tor users is one of our most important statistics. It is vital for us to know how many people use the Tor network on a daily basis, whether they connect via relays or bridges, from which countries they connect, what transports they use, and whether they connect via IPv4 or IPv6.

Due to the nature of Tor being an anonymity network, we cannot collect identifying data to learn the number of users. That is why we actually don't count users, but we count requests to the directories or bridges that clients clients make periodically to update their list of relays and estimate user numbers indirectly from there.

The result is an average number of concurrent users, estimated from data collected over a day. We can't say how many distinct users there are. That is, we can't say whether the same set of users stays connected over the whole day, or whether that set leaves after a few hours and a new set of users arrives. However, the main interest is finding out if usage changes, for which it is not critical to estimate exact absolute user numbers.

Relay users #

Relay users are users that connect directly to a relay in order to connect to the Tor network—as opposed to bridge users that connect to a bridge as entry point into the Tor network. Many steps here are similar to the steps for estimating bridge users, which are specified further down below.

The following description applies to the following graph and tables:

Relay users graph
Top-10 countries by relay users table
Top-10 countries by possible censorship events table

Step 1: Parse consensuses to learn which relays have been running

Obtain consensuses from CollecTor. Refer to the Tor directory protocol, version 3 for details on the descriptor format.

From each consensus, parse the "valid-after" and "fresh-until" times from the header section.

From each consensus entry, parse the base64-encoded relay fingerprint from the "r" line. Also parse the relay flags from the "s" line. If there is no "Running" flag, skip this entry. (Consensuses with consensus method 4, introduced in 2008, or later do not list non-running relays, so that checking relay flags in recent consensuses is mostly done as a precaution without actual effect on the parsed data.)

Step 2: Parse relay extra-info descriptors to learn relevant statistics reported by relays

Also obtain relay extra-info descriptors from CollecTor. As above, refer to the Tor directory protocol, version 3 for details on the descriptor format.

Parse the relay fingerprint from the "extra-info" line and the descriptor publication time from the "published" line.

Parse the "dirreq-write-history" line containing written bytes spent on answering directory requests. If the contained statistics end time is more than 1 week older than the descriptor publication time in the "published" line, skip this line to avoid including statistics in the aggregation that have very likely been reported in earlier descriptors and processed before. If a statistics interval spans more than 1 UTC date, split observations to the covered UTC dates by assuming a linear distribution of observations.

Parse the "dirreq-stats-end", "dirreq-v3-resp", and "dirreq-v3-reqs" lines containing directory-request statistics. If the statistics end time in the "dirreq-stats-end" line is more than 1 week older than the descriptor publication time in the "published" line, skip these directory request statistics for the same reason as given above: to avoid including statistics in the aggregation that have very likely been reported in earlier descriptors and processed before. Also skip statistics with an interval length other than 1 day. Parse successful requests from the "ok" part of the "dirreq-v3-resp" line, subtract 4 to undo the binning operation that has been applied by the relay, and discard the resulting number if it's zero or negative. Parse successful requests by country from the "dirreq-v3-reqs" line, subtract 4 from each number to undo the binning operation that has been applied by the relay, and discard the resulting number if it's zero or negative. Split observations to the covered UTC dates by assuming a linear distribution of observations.

Step 3: Approximate directory requests by country

Relays report directory request numbers in two places: as a total number ("dirreq-v3-resp" line) and as numbers broken down by country ("dirreq-v3-reqs" line). Rather than using numbers broken down by country directly we multiply total requests with the fraction of requests from a given country. This has two reasons: it reduces the overall effect of binning, and it makes relay and bridge user estimates more comparable. If a relay for some reason only reports total requests and not requests by country, we attribute all requests to "??" which stands for Unknown Country.

Step 4: Estimate fraction of reported directory-request statistics

The next step after parsing descriptors is to estimate the fraction of reported directory-request statistics on a given day. This fraction will be used in the next step to extrapolate observed request numbers to expected network totals. For further background on the following calculation method, refer to the technical report titled "Counting daily bridge users" which also applies to relay users. In the following, we're using the term server instead of relay or bridge, because the estimation method is exactly the same for relays and bridges.

For each day in the time period, compute five variables:

Compute n(N) as the total server uptime in hours on a given day, that is, the sum of all server uptime hours on that day. This is the sum of all intervals between "valid-after" and "fresh-until", multiplied by the contained running servers, for all consensuses with a valid-after time on a given day. A more intuitive interpretation of this variable is the average number of running servers—however, that interpretation only works as long as fresh consensuses are present for all hours of a day.
Compute n(H) as the total number of hours for which servers have reported written directory bytes on a given day.
Compute n(R\H) as the number of hours for which responses have been reported but no written directory bytes. This fraction is determined by summing up all interval lengths and then subtracting the written directory bytes interval length from the directory response interval length. Negative results are discarded.
Compute h(H) as the total number of written directory bytes on a given day.
Compute h(R^H) as the number of written directory bytes for the fraction of time when a server was reporting both written directory bytes and directory responses. As above, this fraction is determined by first summing up all interval lengths and then computing the minimum of both sums divided by the sum of reported written directory bytes.

From these variables, compute the estimated fraction of reported directory-request statistics using the following formula:

       h(R^H) * n(H) + h(H) * n(R\H)
frac = -----------------------------
                h(H) * n(N)

Step 5: Compute estimated relay users per country

With the estimated fraction of reported directory-request statistics from the previous step it is now possible to compute estimates for relay users. Similar to the previous step, the same approach described here also applies to estimating bridge users by country, transport, or IP version as described further down below.

First compute r(R) as the sum of reported successful directory requests from a given country on a given day. This approach also works with r(R) being the sum of requests from all countries or from any other subset of countries, if this is of interest.

Estimate the number of clients per country and day using the following formula:

r(N) = floor(r(R) / frac / 10)

A client that is connected 24/7 makes about 15 requests per day, but not all clients are connected 24/7, so we picked the number 10 for the average client. We simply divide directory requests by 10 and consider the result as the number of users. Another way of looking at it, is that we assume that each request represents a client that stays online for one tenth of a day, so 2 hours and 24 minutes.

Skip dates where frac is smaller than 10% and hence too low for a robust estimate. Also skip dates where frac is greater than 110%, which would indicate an issue in the previous step. We picked 110% as upper bound, not 100%, because there can be relays reporting statistics that temporarily didn't make it into the consensus, and we accept up to 10% of those additional statistics. However, there needs to be some upper bound to exclude obvious outliers with fractions of 120%, 150%, or even 200%.

Step 6: Compute ranges of expected clients per day to detect potential censorship events

As last step in reproducing relay user numbers, compute ranges of expected clients per day to detect potential censorship events. For further details on the detection method, refer to the technical report titled "An anomaly-based censorship-detection system for Tor". Unlike the previous two steps, this step only applies to relay users, not to bridge users.

Start by finding the 50 largest countries by estimated relay users, excluding "??", on the last day in the data set. (Note that, as new data gets available, the set of 50 largest countries may change, too, affecting ranges for the entire data set.)
For each of these largest countries and for each date in the data set, compute the ratio R_ij = C_ij / C_(i-7)j of estimated users on any given date divided by estimated users 1 week earlier. Exclude ratios for which there are no user estimates from 1 week earlier, or where that estimate is 0.
For the computed ratios on each date, remove outliers that fall outside four interquartile ranges of the median, and remove dates with less than 8 ratios remaining.
For each date, compute mean and standard variation in order to use these as parameters of a normal distribution.
For each date and country, compute a range of expected user numbers by using a normal distribution with parameters from the previous step and the ratio of estimated users divided by estimated users 1 week earlier as input. As previously, exclude ratios for which there are no user estimates from 1 week earlier, or where that estimate is 0. Compute the low and high ranges as:
- low = max(0, NormalDistribution(mu, standard deviation, P = 0.0001) * PoissonDistribution(lambda = C_(i-7)j, P = 0.0001))
- high = NormalDistribution(mu, standard deviation, P = 0.9999) * PoissonDistribution(lambda = C_(i-7)j, P = 0.9999)

Bridge users #

Bridge users are users that connect to a bridge as entry point into the Tor network as opposed to relay users that connect directly to a relay. Many steps here are similar to the steps for estimating relay users, which are specified above.

Bridge users by country graph
Bridge users by transport graph
Bridge users by country and transport graph
Bridge users by IP version graph
Top-10 countries by bridge users table

Step 1: Parse bridge network statuses to learn which bridges have been running

Obtain bridge network statuses from CollecTor. Refer to the Tor bridge descriptors page for details on the descriptor format.

From each status, parse the "published" time from the header section.

From each status entry, parse the base64-encoded hashed bridge fingerprint from the "r" line. Also parse the relay flags from the "s" line. If there is no "Running" flag, skip this entry.

As opposed to relay consensuses, there are no "valid-after" or "fresh-until" times in the header of bridge network statuses. To unify processing, we use the publication hour as valid-after time and one hour later as fresh-until time. If we process multiple statuses published in the same hour, we take the union of contained running bridges as running bridges in that hour.

Step 2: Parse bridge extra-info descriptors to learn relevant statistics reported by bridges

Also obtain bridge extra-info descriptors from CollecTor. As above, refer to the Tor bridge descriptors page for details on the descriptor format.

Parse the hashed bridge fingerprint from the "extra-info" line and the descriptor publication time from the "published" line.

Parse the "dirreq-stats-end", "dirreq-v3-resp", and "dirreq-v3-reqs" lines containing directory-request statistics. If the statistics end time in the "dirreq-stats-end" line is more than 1 week older than the descriptor publication time in the "published" line, skip these directory request statistics for the same reason as given above: to avoid including statistics in the aggregation that have very likely been reported in earlier descriptors and processed before. Also skip statistics with an interval length other than 1 day. Parse successful requests from the "ok" part of the "dirreq-v3-resp" line, subtract 4 to undo the binning operation that has been applied by the bridge, and discard the resulting number if it's zero or negative. Parse successful requests by country from the "dirreq-v3-reqs" line, subtract 4 from each number to undo the binning operation that has been applied by the bridge, and discard the resulting number if it's zero or negative. Split observations to the covered UTC dates by assuming a linear distribution of observations.

Parse the "bridge-ips", "bridge-ip-versions", and "bridge-ip-transports" lines containing unique connecting IP addresses by country, IP version, and transport. From each number of unique IP addresses, subtract 4 to undo the binning operation that has been applied by the bridge. Discard the resulting number if it's zero or negative.

Step 3: Approximate directory requests by country, transport, and IP version

Older bridges did not report directory requests by country but only total requests and unique IP address counts by country. In that case we approximate directory requests by country by multiplying the total number with the fraction of unique IP addresses from a given country. For newer bridges that do report directory requests by country we still take total requests as starting point and multiply with the fraction of requests by country. Otherwise, if we had used directory requests by country directly, totals by country, transport, and IP version would not match. If a bridge reports neither directory requests by country nor unique IP addresses by country, we attribute all requests to "??" which stands for Unknown Country.

Bridges do not report directory requests by transport or IP version. We approximate these numbers by multiplying the total number of requests with the fraction of unique IP addresses by transport or IP version. If a bridge does not report unique IP addresses by transport or IP version, we attribute all requests to the default onion-routing protocol or to IPv4, respectively.

As a special case, we also approximate lower and upper bounds for directory requests by country and transport. This approximation is based on the fact that most bridges only provide a small number of transports. This allows us to combine unique IP address sets by country and by transport and obtain lower and upper bounds:

We calculate the lower bound as max(0, C(b) + T(b) - S(b)) using the following definitions: C(b) is the number of requests from a given country reported by bridge b; T(b) is the number of requests using a given transport reported by bridge b; and S(b) is the total numbers of requests reported by bridge b. Reasoning: If the sum C(b) + T(b) exceeds the total number of requests from all countries and transports S(b), there must be requests from that country and transport. And if that is not the case, 0 is the lower limit.
We calculate the upper bound as min(C(b), T(b)) with the definitions from above. Reasoning: There cannot be more requests by country and transport than there are requests by either of the two numbers.

Step 4: Estimate fraction of reported directory-request statistics

The step for estimating the fraction of reported directory-request statistics is pretty much the same for bridges and for relays. This is why we refer to Step 4 of the Relay users description for this estimation.

Step 5: Compute estimated bridge users per country, transport, or IP version

Similar to the previous step, this step is equivalent for bridge users and relay users. We therefore refer to Step 5 of the Relay users description for transforming directory request numbers to user numbers.

BridgeDB requests #

BridgeDB metrics contain aggregated information about requests to the BridgeDB service. BridgeDB keeps track of each request per distribution method (HTTPS, moat, email), per bridge type (e.g., vanilla or obfs4) per country code or email provider (e.g., "ru" or "gmail") per request success ("success" or "fail"). Every 24 hours, BridgeDB writes these metrics to disk and then begins a new measurement interval.

The following description applies to the following graphs:

BridgeDB requests by requested transport graph
BridgeDB requests by distributor graph

Step 1: Parse BridgeDB metrics to obtain reported request numbers

Obtain BridgeDB metrics from CollecTor. Refer to the BridgeDB metrics specification for details on the descriptor format.

Step 2: Skip requests coming in over Tor exits

Skip any request counts with "zz" as their CC/EMAIL metrics key part. We use the "zz" pseudo country code for requests originating from Tor exit relays. We're discarding these requests because bots use the Tor network to crawl BridgeDB, and including bot requests would provide a false sense of how users interact with BridgeDB. Note that BridgeDB maintains a separate distribution pool for requests coming from Tor exit relays.

Step 3: Aggregate requests by date, distributor, and transport

BridgeDB metrics contain request numbers broken down by distributor, bridge type, and a few more dimensions. For our purposes we only care about total request numbers by date and either distributor or transport. Our total request number includes both successful (i.e., the user ended up getting bridge lines) and unsuccessful (e.g., the user failed to solve the CAPTCHA) requests. We're using request sums by these three dimensions as aggregates and we are subtracting bin_size/2 from each count to better approximate the count before binning. The value for bin_size is found in the above mentioned specification and currently set to 10. As date we're using the date of the BridgeDB metrics interval end. If we encounter more than one BridgeDB metrics interval end on the same UTC date (which shouldn't be possible with an interval length of 24 hours), we arbitrarily keep whichever we process first.

Servers #

Statistics on the number of servers—relays and bridges—were among the first to appear on Tor Metrics. Most of these statistics have one thing in common: they use the number of running servers as their metric. Possible alternatives are to use consensus weight totals/fractions or guard/middle/exit probabilities as metrics, but we only recently started doing that. In the following, we describe how exactly we count servers.

Running relays #

We start with statistics on the number of running relays in the network, broken down by criteria like assigned relay flag, self-reported tor version and operating system, or IPv6 capabilities.

The following description applies to the following graphs:

Relays and bridges (just the relays part; for the bridges part see below) graph
Relays by relay flag graph
Relays by tor version graph
Relays by platform graph
Relays by IP version graph

Step 1: Parse consensuses

Obtain consensuses from CollecTor. Refer to the Tor directory protocol, version 3 for details on the descriptor format.

Parse and memorize the "valid-after" time from the consensus header. We use this UTC timestamp to uniquely identify the consensus while processing, and to later aggregate by the UTC date of this UTC timestamp.

Repeat the following steps for each consensus entry:

Server descriptor digest: Parse the server descriptor digest from the "r" line. This is only needed for statistics based on relay server descriptor contents.
IPv6 reachable OR: Parse any "a" lines, if present, and memorize whether at least one of them contains an IPv6 address. This indicates that at least one of the relay's IPv6 OR addresses is reachable.
Relay flags: Parse relay flags from the "s" line. If there is no "Running" flag, skip this consensus entry. This ensures that we only consider running relays. Also parse any other relay flags from the "s" line that the relay had assigned.

If a consensus contains zero running relays, we skip it. This is mostly to rule out a rare edge case when only a minority of directory authorities voted on the "Running" flag. In those cases, such a consensus would skew the average, even though relays were likely running.

Step 2: Parse relay server descriptors

Obtain relay server descriptors from CollecTor. Again, refer to the Tor directory protocol, version 3 for details on the descriptor format.

Parse any or all of the following parts from each server descriptor:

Tor version: Parse the tor software version from the "platform" line and memorize the first three dotted numbers from it. If the "platform" line does not begin with "Tor" followed by a space character and a dotted version number, memorize the version as "Other". If the platform line is missing, we skip this descriptor, which later leads to not counting this relay at all rather than including it in the "Other" group, which is slightly wrong. Note that consensus entries also contain a "v" line with the tor software version from the referenced descriptor, which we do not use, because it was not present in very old consensuses, but which should work just as well for recent consensus.
Operating system: Parse the "platform" line and memorize whether it contains either one of the substrings "Linux", "Darwin" (macOS), "BSD", or "Windows". If the "platform" line contains neither of these substrings, memorize the platform as "Other". If the platform line is missing, we skip this descriptor, which later leads to not counting this relay at all rather than including it in the "Other" group, which is slightly wrong.
IPv6 announced OR: Parse any "or-address" lines and memorize whether at least one of them contains an IPv6 address. This indicates that the relay announced an IPv6 address.
IPv6 exiting: Parse the "ipv6-policy" line, if present, and memorize whether it's different from "reject 1-65535". This indicates whether the relay permitted exiting to IPv6 targets. If the line is not present, memorize that the relay does not permit exiting to IPv6 targets.
Server descriptor digest: Compute the SHA-1 digest, or determine it from the file name in case of archived descriptor tarballs.

Step 3: Compute daily averages

Match consensus entries with server descriptors by SHA-1 digest. Every consensus entry references exactly one server descriptor, and a server descriptor may be referenced from an arbitrary number of consensus entries. If at least 0.1% of referenced server descriptors are missing, we skip the consensus. We chose this threshold as low, because missing server descriptors may easily skew the results. However, a small number of missing server descriptors per consensus is acceptable and also unavoidable.

Go through all previously processed consensuses by valid-after UTC date. Compute the arithmetic mean of running relays, possibly broken down by relay flag, tor version, platform, or IPv6 capabilities, as the sum of all running relays divided by the number of consensuses. Round down to the next integer number.

Skip the last day of the results if it matches the current UTC date, because those averages may still change throughout the day. Further skip days for which fewer than 12 consensuses are known. The goal is to avoid over-representing a few consensuses during periods when the directory authorities had trouble producing a consensus for at least half of the day.

Running bridges #

After explaining our running relays statistics we continue with our running bridges statistics. The steps are quite similar, except for a couple differences in data formats that justify explaining these statistics in a separate subsection.

The following description applies to the following graphs:

Relays and bridges (just the bridges part; for the relays part see above) graph
Bridges by IP version graph

Step 1: Parse bridge network statuses

Obtain bridge network statuses from CollecTor. Refer to the Tor bridge descriptors page for details on the descriptor format.

Parse the bridge authority identify from the file name and memorize it. This is only relevant for times when more than 1 bridge authority was running. In those cases, bridges typically register at a single bridge authority only, so that taking the average of running bridges over all statuses on those day would be misleading.

Parse and memorize the "published" time either from the file name or from the status header. This timestamp is used to uniquely identify the status while processing, and the UTC date of this timestamp is later used to aggregate by UTC date.

Repeat the following steps for each status entry:

Server descriptor digest: Parse the server descriptor digest from the "r" line and memorize it.
Relay flags: Parse relay flag from the "s" line. If there is no "Running" flag, skip this entry. This ensures that we only consider running bridges.

If a status contains zero running bridges, skip it. This may happen when there is a temporary issue with the bridge authority.

Step 2: Parse bridge server descriptors.

Obtain bridge server descriptors from CollecTor. As above, refer to the Tor bridge descriptors page for details on the descriptor format.

Parse the following parts from each server descriptor:

IPv6 announced OR: Parse any "or-address" lines and memorize whether at least one of them contains an IPv6 address. This indicates that the bridge announced an IPv6 address.
Server descriptor digest: Parse the SHA-1 digest from the "router-digest" line, or determine it from the file name in case of archived descriptor tarballs.

Step 3: Compute daily averages

Match status entries with server descriptors by SHA-1 digest. Every status entry references exactly one server descriptor, and a server descriptor may be referenced from an arbitrary number of status entries. If at least 0.1% of referenced server descriptors are missing, we skip the status. We chose this threshold as low, because missing server descriptors may easily skew the results. However, a small number of missing server descriptors per status is acceptable and also unavoidable.

Compute the arithmetic mean of running bridges as the sum of all running bridges divided by the number of statuses and round down to the next integer number. We are aware that this approach does not correctly reflect that bridges typically register at a single bridge authority only.

Skip the last day of the results if it matches the current UTC date, because those averages may still change throughout the day. Further skip days for which fewer than 12 statuses are known. The goal is to avoid over-representing a few statuses during periods when the bridge directory authority had trouble producing a status for at least half of the day.

Consensus weight #

The following statistic uses measured bandwidth, also known as consensus weight, as metric for relay statistics, rather than absolute relay counts.

The following description applies to the following graph:

Total consensus weights across bandwidth authorities graph

Step 1: Parse consensuses.

Obtain consensuses from CollecTor. Refer to the Tor directory protocol, version 3 for details on the descriptor format.

Parse and memorize the "valid-after" time from the consensus header. We use this UTC timestamp to aggregate by the UTC date.

Parse the "s" lines of all status entries and skip entries without the "Running" flag. Optionally distinguish relays by assigned "Guard" and "Exit" flags.

Parse the (optional) "w" lines of all status entries and compute the total of all bandwidth values denoted by the "Bandwidth=" keyword. If an entry does not contain such a value, skip the entry. If a consensus does not contain a single bandwidth value, skip the consensus.

Step 2: Parse votes.

Obtain votes from CollecTor. Refer to the Tor directory protocol, version 3 for details on the descriptor format.

Parse and memorize the "valid-after" time from the vote header. We use this UTC timestamp to aggregate by the UTC date.

Also parse the "nickname" and "identity" fields from the "dir-source" line. We use the identity to aggregate by authority and the nickname for display purposes.

Parse the "s" lines of all status entries and skip entries without the "Running" flag. Optionally distinguish relays by assigned "Guard" and "Exit" flags.

Parse the (optional) "w" lines of all status entries and compute the total of all measured bandwidth values denoted by the "Measured=" keyword. If an entry does not contain such a value, skip the entry. If a vote does not contain a single measured bandwidth value, skip the vote.

Step 3: Compute daily averages

Go through all previously processed consensuses and votes by valid-after UTC date and authority. If there are less than 12 consensuses known for a given UTC date, skip consensuses from this date. If an authority published less than 12 votes on a given UTC date, skip this date and authority. Also skip the last date of the results, because those averages may still change throughout the day. For all remaining combinations of date and authority, compute the arithmetic mean of total measured bandwidth, rounded down to the next-smaller integer number.

Traffic #

Our traffic statistics have in common that their metrics are based on user-generated traffic. This includes advertised and consumed bandwidth and connection usage statistics.

Advertised bandwidth #

Advertised bandwidth is the volume of traffic, both incoming and outgoing, that a relay is willing to sustain, as configured by the operator and claimed to be observed from recent data transfers. Relays self-report their advertised bandwidth in their server descriptors which we evaluate together with consensuses.

The following description applies to the following graphs:

Total relay bandwidth (just the advertised bandwidth part; for the consumed bandwidth part see below) graph
Advertised and consumed bandwidth by relay flags (just the advertised bandwidth part; for the consumed bandwidth part see below) graph
Advertised bandwidth by IP version graph
Advertised bandwidth distribution graph
Advertised bandwidth of n-th fastest relays graph

Step 1: Parse relay server descriptors

Obtain relay server descriptors from CollecTor. Refer to the Tor directory protocol, version 3 for details on the descriptor format.

Parse the following parts from each server descriptor:

Advertised bandwidth: Parse the three values (or just two in very old descriptors) from the "bandwidth" line. These values stand for the average bandwidth, burst bandwidth, and observed bandwidth. The advertised bandwidth is the minimum of these values.
Server descriptor digest: Compute the SHA-1 digest, or determine it from the file name in case of archived descriptor tarballs.

Step 2: Parse consensuses

Obtain consensuses from CollecTor. Refer to the Tor directory protocol, version 3 for details on the descriptor format.

From each consensus, parse the "valid-after" time from the header section.

From each consensus entry, parse the base64-encoded server descriptor digest from the "r" line. We are going to use this digest to match the entry with the advertised bandwidth value from server descriptors later on.

Also parse the relay flags from the "s" line. If there is no "Running" flag, skip this entry. (Consensuses with consensus method 4, introduced in 2008, or later do not list non-running relays, so that checking relay flags in recent consensuses is mostly done as a precaution without actual effect on the parsed data.) Further parse the "Guard", "Exit", and "BadExit" relay flags from this line. We consider a relay with the "Guard" flag as guard and a relay with the "Exit" and without the "BadExit" flag as exit.

Step 3: Compute daily averages

The first three graphs described here, namely Total relay bandwidth, Advertised and consumed bandwidth by relay flags and Advertised bandwidth by IP version, have in common that they show daily averages of advertised bandwidth.

In order to compute these averages, first match consensus entries with server descriptors by SHA-1 digest. Every consensus entry references exactly one server descriptor, and a server descriptor may be referenced from an arbitrary number of consensus entries. If at least 0.1% of referenced server descriptors are missing, we skip the consensus. We chose this threshold as low, because missing server descriptors may easily skew the results. However, a small number of missing server descriptors per consensus is acceptable and also unavoidable.

Go through all previously processed consensuses by valid-after UTC date. Compute the arithmetic mean of advertised bandwidth as the sum of all advertised bandwidth values divided by the number of consensuses. Round down to the next integer number.

Break down numbers by guards and/or exits by taking into account which relay flags a consensus entry had that referenced a server descriptor.

Step 4: Compute ranks and percentiles

The remaining two graphs described here, namely Advertised bandwidth distribution and Advertised bandwidth of n-th fastest relays, display advertised bandwidth ranks or percentiles.

Similar to the previous step, match consensus entries with server descriptors by SHA-1 digest. We handle missing server descriptors by simply skipping the consensus entry, at the risk of over-representing available server descriptors in consensuses where most server descriptors are missing.

For the Advertised bandwidth distribution graph, determine the i-th percentile value for each consensus. We use a non-standard percentile definition that is loosely based on the nearest-rank method: the P-th percentile (0 ≤ P ≤ 100) of a list of N ordered values (sorted from least to greatest) is the largest value in the list such that no more than P percent of the data is strictly less than the value and at least P percent of the data is less than or equal to that value. We calculate the ordinal rank n using the following formula: floor((P / 100) * (N - 1)) + 1

Calculate the median value over all consensus from a given day for each percentile value.

Consider the set of all running relays as well as the set of exit relays.

For the Advertised bandwidth of n-th fastest relays graph, determine the n-th highest advertised bandwidth value for each consensus, and then calculate the median value over all consensus from a given day. Again consider the set of all running relays as well as the set of exit relays.

Consumed bandwidth #

Consumed bandwidth, or bandwidth history, is the volume of incoming and/or outgoing traffic that a relay claims to have handled on behalf of clients. Relays self-report bandwidth histories as part of their extra-info descriptors, which we evaluate in combination with consensuses.

The following description applies to the following graphs:

Total relay bandwidth (just the consumed bandwidth part; for the advertised bandwidth part see above) graph
Advertised and consumed bandwidth by relay flags (just the consumed bandwidth part; for the advertised bandwidth part see above) graph
Bandwidth spent on answering directory requests graph

Step 1: Parse extra-info descriptors

Obtain extra-info descriptors from CollecTor. Refer to the Tor directory protocol, version 3 for details on the descriptor format.

Parse the fingerprint from the "extra-info" line. We will use this fingerprint to deduplicate statistics included in other extra-info descriptor published by the same relay. We may also use this fingerprint to attribute statistics to relays with the "Exit" and/or "Guard" flag.

Parse the "write-history", "read-history", "dirreq-write-history", and "dirreq-read-history lines containing consumed bandwidth statistics. The first two histories include all bytes written or read by the relay, whereas the last two include only bytes spent on answering directory requests. If a statistics interval spans more than 1 UTC date, split observations to the covered UTC dates by assuming a linear distribution of observations. As a simplification, we shift reported statistics intervals forward to fully align with multiples of 15 minutes since midnight. We also discard reported statistics with intervals that are not multiples of 15 minutes.

Step 2: Parse consensuses

Obtain consensuses from CollecTor. Refer to the Tor directory protocol, version 3 for details on the descriptor format.

From each consensus, parse the "valid-after" time from the header section.

From each consensus entry, parse the base64-encoded relay fingerprint from the "r" line. We are going to use this fingerprint to match the entry with statistics from extra-info descriptors later on.

Also parse the relay flag from the "s" line. If there is no "Running" flag, skip this entry. (Consensuses with consensus method 4, introduced in 2008, or later do not list non-running relays, so that checking relay flags in recent consensuses is mostly done as a precaution without actual effect on the parsed data.) Further parse the "Guard", "Exit", and "BadExit" relay flags from this line. We consider a relay with the "Guard" flag as guard and a relay with the "Exit" and without the "BadExit" flag as exit.

Step 3: Compute daily totals

The first two graphs described here, namely Total relay bandwidth and Advertised and consumed bandwidth by relay flag show daily totals of all bytes written or read by relays. For both graphs we sum up all read and written bytes on a given day and divide the result by 2. However, we only include bandwidth histories for a given day if a relay was listed as running in a consensus at least once on that day. We attribute bandwidth to guards and/or exits if a relay was a guard and/or exit at least in one consensus on a day.

The third graph, Bandwidth spent on answering directory requests, shows bytes spent by directory authorities and directory mirrors on answering directory requests. As opposed to the first two graphs, all bandwidth histories are included, regardless of whether a relay was listed as running in a consensus. Also, we compute total read directory and total written directory bytes for this fourth graph, not an average of the two.

Connection usage #

The last category of traffic statistics concerns statistics on the fraction of connections used uni- or bidirectionally. A subset of relays reports these highly aggregated statistics in their extra-info descriptors.

The following description applies to the following graph:

Fraction of connections used uni-/bidirectionally graph

Step 1: Parse relay extra-info descriptors

Obtain relay extra-info descriptors from CollecTor.

Parse the relay fingerprint from the "extra-info" line. We deduplicate reported statistics by UTC date of reported statistics and relay fingerprint. If a given relay publishes different statistics on a given UTC day, we pick the first encountered statistics and discard all subsequent statistics by that relay on that UTC day.

Parse the UTC date from the "conn-bi-direct" line. We use this date to aggregate statistics, regardless of what period of time fell on the statistics end date as opposed to the previous date.

From the same line, parse the three counts READ, WRITE, and BOTH, but disregard the BELOW value. Discard any statistics where the sum of these three values is 0. Compute three fractions by dividing each of the three values READ, WRITE, and BOTH by the sum of all three. Multiply results with 100 and truncate any decimal places, keeping a fraction value between 0 and 100.

Step 2: Aggregate statistics

For each date, compute the 25th, 50th, and 75th percentile of fractions computed in the previous step.

We use a non-standard percentile definition that is similar to the nearest-rank method: the P-th percentile (0 < P ≤ 100) of a list of N ordered values (sorted from least to greatest) is the largest value in the list such that no more than P percent of the data is strictly less than the value and at least P percent of the data is less than or equal to that value. We calculate the ordinal rank n using the following formula: floor((P / 100) * N) + 1

Performance #

We perform active measurements of Tor network performance by running several OnionPerf (previously: Torperf) instances from different vantage points. Here we explain how we evaluate Torperf/OnionPerf measurements to obtain the same results as on Tor Metrics.

The following description applies to the following graphs:

Time to download files over Tor graph
Timeouts and failures of downloading files over Tor graph
Circuit build times graph
Circuit round-trip latencies graph
Throughput graph

Step 1: Parse OnionPerf and/or Torperf measurement results

Obtain OnionPerf/Torperf measurement results from CollecTor.

Note: you need to convert the OnionPerf analysis files first in order to be able to progress as outlined below.

From each measurement result, parse the following keys:

SOURCE: Configured name of the data source.
FILESIZE: Configured file size in bytes.
START: Download start time that we use for two purposes: to determine how long a request took and to aggregate measurements by date.
DATAREQUEST: Time when the HTTP request was sent.
DATARESPONSE: Time when the HTTP response header was received.
DATACOMPLETE: Download end time that is only set if the request succeeded.
READBYTES: Total number of bytes read, which indicates whether this request succeeded (if ≥ FILESIZE) or failed.
DIDTIMEOUT: 1 if the request timed out, 0 otherwise.
PARTIAL51200 and PARTIAL1048576: Time when 51200 or 1048576 bytes were read.
DATAPERCx: Time when x% of expected bytes were read for x = { 10, 20, 50, 100 }.
BUILDTIMES: Comma-separated list of times when circuit hops were built, which includes all circuits used for making measurement requests, successful or not.
ENDPOINTREMOTE: Hostname, IP address, and port that was used to connect to the remote server; we use this to distinguish a request to a public server (if ENDPOINTREMOTE is not present or does not contain ".onion" as substring) or to an onion server.

Step 2: Aggregate measurement results

Each of the measurement results parsed in the previous step constitutes a single measurement. We're first interested in statistics on download times for the Time to download files over Tor graph. Therefore we consider complete downloads as well as partial downloads. For complete downloads we calculate the download time as DATACOMPLETE - START for measurements with DATACOMPLETE > START. For partial downloads of larger file sizes we calculate the download time as PARTIAL51200 - START for measurements with PARTIAL51200 > START and FILESIZE > 51200; and PARTIAL1048576 - START for measurements with PARTIAL1048576 > START and FILESIZE > 1048576. We then compute the 25th, 50th, and 75th percentile of download times by sorting download times, determining the percentile rank, and using linear interpolation between adjacent ranks.

Next we're interested in the average throughput of measurements for the Throughput graph. We calculate throughput from the time between receiving 0.5 and 1 MiB of a response for 1 MiB transfers, which obviously excludes any measurements with responses smaller than 1 MiB. For 5MiB downloads we calculate throughput from the time between receiving 4 and 5 MiB of a response. From DATAPERC50 and DATAPERC100 (if FILESIZE = 1048576) or DATAPERC80 and DATAPERC100 (if FILESIZE = 5242880) we can compute the number of milliseconds that have elapsed between receiving bytes 524,288 and 1,048,576, and bytes 4,194,304 and 5,242,880, respectively. This is a total of 524,288 bytes or 4,194,304 bits for 1 MiB transfers and 1,048,576 bytes or 8,388,608 bits for 5 MiB transfers. We divide the values 4,194,304 and 8,388,608 by this time difference to obtain throughput in bits per millisecond which happens to be the same value as the number of kilobits per second.

We're also interested in circuit round-trip latencies for the Circuit round-trip latencies graph. We measure circuit latency as the time between sending the HTTP request and receiving the HTTP response header. We calculate latencies as DATARESPONSE - DATAREQUEST for measurements with non-zero values for both timestamps. We then compute 25th, 50th, and 75th percentiles in the same way as for download times above. We also compute the lowest latency within 1.5 IQR of the lower quartile and the highest latency within 1.5 IQR of the upper quartile.

Ideally, all measurements would succeed. But it's also possible that some measurements did not complete within a pre-defined timeout or failed for some other reason. We distinguish three cases for the Timeouts and failures of downloading files over Tor graph and provide counts of each case per day:

Timeouts: measurements that either timed out (with DIDTIMEOUT = 1) or that have an invalid measurement end time (DATACOMPLETE ≤ START),
Failures: measurements that did not time out (with DIDTIMEOUT = 0), that had a valid measurement end time (DATACOMPLETE > START), and that had fewer bytes read than expected (READBYTES < FILESIZE), and
Requests: all measurements, including successes, timeouts, and failures.

The fifth metric that we obtain from OnionPerf/Torperf measurements is circuit build time, which is shown in the Circuit build times graph. We extract circuit build times from the BUILDTIMES field included in measurement results. We use the first value as build time for the first hop and deltas between subsequent values as build times for the second and third hop. Again, we compute the 25th, 50th, and 75th percentiles of these build times in the same way as for download times and circuit round-trip latencies.

Onion Services #

Our onion services statistics are based on two statistics reported by relays that have been added in 2014 to give some first insights into onion-service usage. For further background on the following steps, refer to the technical report titled "Extrapolating network totals from hidden-service statistics" that this description is based on (which was written before hidden services were renamed to onion services).

The following description applies to the following graphs:

Unique .onion addresses (version 2) graph
Unique .onion addresses (version 3) graph
Onion-service traffic (versions 2) graph
Onion-service traffic (versions 3) graph

Step 1: Parse reported statistics from extra-info descriptors

Obtain relay extra-info descriptors from CollecTor.

Parse the following parts from each extra-info descriptor:

Relay fingerprint: The "extra-info" line tells us which relay reported these statistics, which we need to know to match them with the expected fraction of onion-service activity throughout the statistics interval.
Onion service statistics interval end: The "hidserv-stats-end" line tells us when the statistics interval ended, and, together with the interval length, when it started.
Cells relayed as rendezvous point: The "hidserv-rend-relayed-cells" line tells us the number of cells that the relay handled on rendezvous circuits, and it tells us how this number has been obfuscated by the relay. The value for "bin_size" is the bin size used for rounding up the originally observed cell number, and the values for "delta_f" and "epsilon" are inputs for the additive noise following a Laplace distribution.
.onion addresses observed as directory: Finally, the "hidserv-dir-onions-seen" line tells us the number of .onion addresses that the relay observed in published onion-service descriptors in its role as onion-service directory.

Note: Unlike other statistics, we're not splitting statistics by UTC date. Instead, we're only accepting statistics intervals that are exactly 1 day long, and we're counting all reported values for the UTC date of the statistics end time.

Step 2: Remove previously added noise

When processing onion-service statistics, we need to handle the fact that they have been obfuscated by relays. As first step, we're attempting to remove the additive Laplace-distributed noise by rounding up to the nearest multiple of bin_size. The idea is that it's most likely that noise was added to the closest right side of a bin than to the right side of another bin. In step two, we're subtracting half of bin_size, because the relay added between 0 and bin_size - 1 to the originally observed value. All in all, we're using the following formula to remove previously added noise:

halfBin = binSize / 2
floor((reported + halfBin) / binSize) * binSize - halfBin

Step 3: Parse consensuses

Obtain consensuses from CollecTor.

From each consensus, parse the "valid-after" time from the header section.

From each consensus entry, parse the base64-encoded relay fingerprint from the "r" line.

Finally, parse the weights contained in the "bandwidth-weights" line from the footer section of the consensus.

Step 4: Derive network fractions from consensuses

The probability of choosing a relay as rendezvous point varies a lot between relays, and not all onion-service directories handle the same number of onion-service descriptors. Fortunately, we can derive what fraction of rendezvous circuits a relay has handled and what fraction of descriptors a directory was responsible for.

The first fraction that we compute is the probability of a relay to be selected as rendezvous point. Clients only select relays as rendezvous point that have the "Fast" flag. They weight relays differently based on their bandwidth and depending on whether they have the "Exit" and/or "Guard" flags: they weight the bandwidth value contained in the "w" line with the value of "Wmg", "Wme", "Wmd", or "Wmm", depending on whether the relay has only the "Guard" flag, only the "Exit" flag, both such flags, or neither of them.

The second fraction that we can derive from this consensus entry is the fraction of descriptor space that this relay was responsible for in its role as onion-service directory. The Tor Rendezvous Specification contains the following definition: "A[n onion] service directory is deemed responsible for a descriptor ID if it has the HSDir flag and its identity digest is one of the first three identity digests of HSDir relays following the descriptor ID in a circular list."

Based on the fraction of descriptor space that a directory was responsible for we can compute the fraction of descriptors that this directory has seen. Intuitively, one might think that these fractions are the same. However, this is not the case: each descriptor that is published to a directory is also published to two other directories. As a result we need to divide the fraction of descriptor space by three to obtain the fraction of descriptors observed the directory. Note that, without dividing by three, fractions of all directories would not add up to 100%.

We calculate network fraction per consensus. When we extrapolate reported statistics, we compute the average (arithmetic mean) of all such network fractions with consensus valid-after times falling into the statistics interval. In particular, we're not computing the average of network fractions from the UTC day when the statistics interval ends; even though we're attributing extrapolated statistics to the UTC date of the statistics interval end in the next step.

Step 5: Extrapolate network totals

We are now ready to extrapolate network totals from reported statistics. We do this by dividing reported statistics by the calculated fraction of observations made by the reporting relay. The underlying assumption is that statistics grow linearly with calculated fractions. We only exclude relays from this step that have a calculated fraction of exactly zero, to avoid dividing by zero.

While we can expect this method to work as described for extrapolating cells on rendezvous circuits, we need to take another step for estimating the number of unique .onion addresses in the network. The reason is that a .onion address is not only known to a single relay, but to a couple of relays, all of which include that .onion address in their statistics. We need to subtract out the multiple counting of .onion addresses to come up with a network-wide number of unique .onion addresses.

As an approximation, we assume that an onion service publishes its descriptor to twelve directories over a 24-hour period: the service stores two replicas per descriptor using different descriptor identifiers, both descriptor replicas get stored to three different onion-service directories each, and the service changes descriptor identifiers once every 24 hours which leads to two different descriptor identifiers per replica.

To be clear, this approximation is not entirely accurate. For example, the descriptors of roughly 1/24 of services are seen by 3 rather than 2 sets of onion-service directories, when a service changes descriptor identifiers once at the beginning of a relay's statistics interval and once again towards the end. In some cases, the two replicas or the descriptors with changed descriptor identifiers could have been stored to the same directory. As another example, onion-service directories might have joined or left the network and other directories might have become responsible for storing a descriptor which also include that .onion address in their statistics. However, for the subsequent analysis, we assume that neither of these cases affects results substantially.

Step 6: Compute daily averages

As last step in the analysis, we aggregate extrapolated network totals from all reporting relays to obtain a daily average. We're using the weighted interquartile mean as metric, because it is robust against noisy statistics and potentially lying relays and considers half of the reported statistics. For this metric we order extrapolated network totals by their value, discard the lower and the upper quartile by weight, and compute the weighted mean of the remaining values.

We further define a threshold of 1% for the total fraction of relays reporting statistics. If less than these 1% of relays report statistics on a given day, we don't include that day in the end results.

Applications #

Our applications statistics are based on Tor web server requests where our users download applications initially and where they ask for updates.

The following description applies to the following graphs:

Tor Browser downloads and updates graph
Tor Browser downloads and updates by platform graph
Tor Browser downloads and updates by locale graph
Tor Browser updates by release channel graph
Tor Messenger downloads and updates graph

Step 1: Parse Tor web server logs

Obtain Tor web server logs from CollecTor. Refer to the separate specification page for details on the data format.

Each log file contains relevant meta data in its file name, including the site name, the server name, and the log date. The log file itself contains sanitized requests to Tor web servers.

All patterns mentioned in the following are understood by PostgreSQL's LIKE operator. An underscore (_) matches any single character; a percent sign (%) matches any string of zero or more characters.

Step 2: Count Tor Browser initial downloads

We count a request as Tor Browser initial download if it matches the following criteria:

Request method: GET
Resource string: '%/torbrowser/%.exe', '%/torbrowser/%.dmg', or '%/torbrowser/%.tar.xz'
Response code: 200

We distinguish platforms based on the resource string: '%.exe%' for Windows, '%.dmg%' for macOS, and '%.tar.xz%' for Linux.

We distinguish release channels based on the resource string: '%-hardened%' for hardened releases, '%/%.%a%/%' for alpha releases, and stable releases otherwise.

We extract the locale (for example, 'en-US' for English as used in the United States or 'de' for German) from the resource string using regular expression '.*_([a-zA-Z]{2}|[a-zA-Z]{2}-[a-zA-Z]{2})[\._-].*', falling back to '??' for unrecognized locales if the regular expression does not match.

Step 3: Count Tor Browser signature downloads

We count a request as Tor Browser signature download if it matches the following criteria:

Request method: GET
Resource string: '%/torbrowser/%.exe.asc', '%/torbrowser/%.dmg.asc', or '%/torbrowser/%.tar.xz.asc'
Response code: 200

We break down requests by platform, channel, and locale in the exact same way as for Tor Browser initial downloads (see above).

Step 4: Count Tor Browser update pings

We count a request as Tor Browser update ping if it matches the following criteria:

Request method: GET
Resource string: '%/torbrowser/update\__/%' but not '%.xml'
Response code: 200

We distinguish platforms based on the resource string: '%/WINNT%' for Windows, '%/Darwin%' for macOS, and Linux otherwise.

We distinguish release channels based on the resource string: '%/hardened/%' for hardened releases, '%/alpha/%' for alpha releases, and '%/release/%' for stable releases.

We extract the locale (for example, 'en-US' for English as used in the United States or 'de' for German) from the resource string using regular expression '.*/([a-zA-Z]{2}|[a-zA-Z]{2}-[a-zA-Z]{2})\??$', falling back to '??' for unrecognized locales if the regular expression does not match.

Step 5: Count Tor Browser update requests

We count a request as Tor Browser update request if it matches the following criteria:

Request method: GET
Resource string: '%/torbrowser/%.mar'
Response code: 302

We distinguish platforms based on the resource string: '%-win32-%' for Windows, '%-osx%' for macOS, and Linux otherwise.

We distinguish release channels based on the resource string: '%-hardened%' for hardened releases, '%/%.%a%/%' for alpha releases, and stable releases otherwise.

We distinguish incremental updates having '%.incremental.%' in the resource string from non-incremental (full) updates that don't contain this pattern in their resource string.

Step 6: Count Tor Messenger initial downloads

We count a request as Tor Messenger initial download if it matches the following criteria:

Request method: GET
Resource string: '%/tormessenger/%.exe', '%/tormessenger/%.dmg', and '%/tormessenger/%.tar.xz'
Response code: 200

We distinguish platforms based on the resource string: '%.exe' for Windows, '%.dmg' for macOS, and '%.tar.xz' for Linux.

Step 7: Count Tor Messenger update pings

We count a request as Tor Messenger update ping if it matches the following criteria:

Request method: GET
Resource string: '%/tormessenger/update\__/%' but none of '%.xml', '%/' or '%/?'
Response code: 200

We distinguish platforms based on the resource string: '%/WINNT%' for Windows, '%/Darwin%' for macOS, and '%/Linux%' for Linux.

Contact

This material is supported in part by the National Science Foundation under Grant No. CNS-0959138. Any opinions, finding, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. "Tor" and the "Onion Logo" are registered trademarks of The Tor Project, Inc.. Data on this site is freely available under a CC0 no copyright declaration: To the extent possible under law, the Tor Project has waived all copyright and related or neighboring rights in the data. Graphs are licensed under a Creative Commons Attribution 3.0 United States License.