Tor Metrics Library is a Java API that facilitates processing Tor network data from the CollecTor service for statistical analysis and for building services and applications.
The tutorials below explain the basic steps to get you started with metrics-lib.
The tutorials are written for an audience that knows Java and, to a lesser extent, how Tor works. We explain all data used in the tutorials. More detailed and up-to-date information about descriptors can be found in the Tor directory protocol specification and on the CollecTor page.
All tutorials require you to download the latest release of metrics-lib, follow the instructions to verify its signature, extract the tarball locally, and copy the lib/ and the generated/ directories to your working directory for the tutorials.
Let's start this tutorial series by doing something really simple. We'll use metrics-lib to download recent consensuses from CollecTor and write them to a local directory. We're not doing anything with those consensuses yet, though we'll get back to that in a bit.
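We'll need to tell metrics-lib five pieces of information for this:
1. the CollecTor instance to download from ("https://collector.torproject.org"),
2. the remote directories to collect descriptors from (new String[] { "/recent/relay-descriptors/consensuses/" }),
3. the minimum last-modified time of files to be collected (0L),
4. the local directory to write files to (new File("descriptors")), and
5. whether to delete local files that do not exist remotely anymore (false).
Create a new file DownloadConsensuses.java with the following content: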
import org.torproject.descriptor.*;

import java.io.File;

public class DownloadConsensuses {
  public static void main(String[] args) {
    // Download consensuses published in the last 72 hours, which will take up to five minutes and require several hundred MB on the local disk.
    DescriptorCollector descriptorCollector = DescriptorSourceFactory.createDescriptorCollector();
    descriptorCollector.collectDescriptors(
        // Download from Tor's main CollecTor instance,
        "https://collector.torproject.org",
        // include only network status consensuses
        new String[] { "/recent/relay-descriptors/consensuses/" },
        // regardless of last-modified time,
        0L,
        // write to the local directory called descriptors/,
        new File("descriptors"),
        // and don't delete extraneous files that do not exist remotely anymore.
        false);
  }
}
If you haven't already done so, prepare the working directory for this tutorial as described above.
Compile and run the Java file:
javac -cp lib/\*:generated/dist/signed/\* DownloadConsensuses.java
java -cp .:lib/\*:generated/dist/signed/\* DownloadConsensuses
This will take up to five minutes and require several hundred MB on the local disk.
If you want to play a bit with this code, you could extend it to also download recent bridge extra-info descriptors from CollecTor, which are stored in /recent/bridge-descriptors/extra-infos/ and which we'll need for tutorial 3 below. (If you're too curious, scroll down to the bottom of this page for the diff.)
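As a rough illustration of the exercise above, the following standard-library sketch lists two remote directories and derives where each would end up locally. It assumes that a remote directory is mirrored below the local directory under the same relative path, which is what the read paths in the later tutorials suggest; the class and method names (LocalDescriptorPaths, localPaths) are made up for this example and are not part of metrics-lib.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class LocalDescriptorPaths {

  // Assumption: a remote directory such as "/recent/foo/" is mirrored to
  // <localDirectory>/recent/foo. This helper is hypothetical.
  static List<File> localPaths(File localDirectory, String[] remoteDirectories) {
    List<File> result = new ArrayList<>();
    for (String remoteDirectory : remoteDirectories) {
      // Strip leading and trailing slashes before resolving the path.
      String relative = remoteDirectory.replaceAll("^/+|/+$", "");
      result.add(new File(localDirectory, relative));
    }
    return result;
  }

  public static void main(String[] args) {
    // The same kind of array that is passed as the second argument above,
    // here with both directories.
    String[] remoteDirectories = new String[] {
        "/recent/relay-descriptors/consensuses/",
        "/recent/bridge-descriptors/extra-infos/" };
    for (File path : localPaths(new File("descriptors"), remoteDirectories)) {
      System.out.println(path.getPath());
    }
  }
}
```

The resulting paths are the ones that a descriptor reader would later be pointed at.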
If you just followed tutorial 1 above, you now have a bunch of consensuses on your disk. Let's do something with those and look at relay capacity by Tor version. A possible use case could be that the Tor developers debate which of the older versions to turn into long-term supported versions, and you want to contribute more facts to that discussion by telling them how much relay capacity each version provides.
Consider the following snippet from a consensus document showing a single relay to get an idea of the underlying data:
[...]
r PrivacyRepublic0001 XOzFwwrMSz3kYnkjI5Zwh8xT2Uc WLlCQj3gVELkwIBh3EWxG74LZ2E 2017-03-04 08:16:22 178.32.181.96 443 80
s Exit Fast Guard HSDir Running Stable V2Dir Valid
v Tor 0.2.8.9
pr Cons=1-2 Desc=1-2 DirCache=1 HSDir=1 HSIntro=3 HSRend=1 Link=1-4 LinkAuth=1 Microdesc=1-2 Relay=1-2
w Bandwidth=136000
p reject 22,25,109-110,119,143,465,563,587,6881-6889
[...]
We're interested in the Tor version number without patch level (0.2.8) and the consensus weight (136000).
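To make the "version without patch level" step concrete, here is a minimal standard-library sketch of the truncation. The helper shortVersion is hypothetical, and the null check is an extra precaution that assumes a version string might be missing for some relays.

```java
public class ShortVersion {

  // Hypothetical helper: returns the a.b.c part of a version string such as
  // "Tor 0.2.8.9", or null if the string does not have that shape.
  static String shortVersion(String version) {
    if (version == null || !version.startsWith("Tor ") || version.length() < 9) {
      return null;
    }
    // Characters 4 to 8 hold "0.2.8" as long as each of the first three
    // version components has a single digit.
    return version.substring(4, 9);
  }

  public static void main(String[] args) {
    System.out.println(shortVersion("Tor 0.2.8.9"));    // prints 0.2.8
    System.out.println(shortVersion("Tor 0.3.0.4-rc")); // prints 0.3.0
    System.out.println(shortVersion("Tor 0.2"));        // prints null
  }
}
```

The fixed substring range is a simplification for this example: a component with more than one digit in the first three positions would be cut off.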
Create a new file ConsensusWeightByVersion.java with the following content:
import org.torproject.descriptor.*;

import java.io.File;
import java.util.*;

public class ConsensusWeightByVersion {
  public static void main(String[] args) {
    // Download consensuses.
    DescriptorCollector descriptorCollector = DescriptorSourceFactory.createDescriptorCollector();
    descriptorCollector.collectDescriptors("https://collector.torproject.org", new String[] { "/recent/relay-descriptors/consensuses/" }, 0L, new File("descriptors"), false);
    // Keep local counters for extracted descriptor data.
    long totalBandwidth = 0L;
    SortedMap<String, Long> bandwidthByVersion = new TreeMap<>();
    // Read descriptors from disk.
    DescriptorReader descriptorReader = DescriptorSourceFactory.createDescriptorReader();
    for (Descriptor descriptor : descriptorReader.readDescriptors(new File("descriptors/recent/relay-descriptors/consensuses"))) {
      if (!(descriptor instanceof RelayNetworkStatusConsensus)) {
        // We're only interested in consensuses.
        continue;
      }
      RelayNetworkStatusConsensus consensus = (RelayNetworkStatusConsensus) descriptor;
      for (NetworkStatusEntry entry : consensus.getStatusEntries().values()) {
        String version = entry.getVersion();
        if (!version.startsWith("Tor ") || version.length() < 9) {
          // We're only interested in a.b.c type versions for this example.
          continue;
        }
        // Remove the 'Tor ' prefix and anything starting at the patch level.
        version = version.substring(4, 9);
        long bandwidth = entry.getBandwidth();
        totalBandwidth += bandwidth;
        if (bandwidthByVersion.containsKey(version)) {
          bandwidthByVersion.put(version, bandwidth + bandwidthByVersion.get(version));
        } else {
          bandwidthByVersion.put(version, bandwidth);
        }
      }
    }
    // Print out fractions of consensus weight by Tor version.
    if (totalBandwidth > 0L) {
      for (Map.Entry<String, Long> e : bandwidthByVersion.entrySet()) {
        System.out.printf("%s -> %4.1f%%%n", e.getKey(), (100.0 * (double) e.getValue() / (double) totalBandwidth));
      }
    }
  }
}
If you haven't already done so, prepare the working directory for this tutorial as described above.
Compile and run the Java file:
javac -cp lib/\*:generated/dist/signed/\* ConsensusWeightByVersion.java
java -cp .:lib/\*:generated/dist/signed/\* ConsensusWeightByVersion
There will be some log statements, and the final output should now contain lines like the following:
0.2.4 ->  3.2%
0.2.5 ->  9.4%
0.2.6 ->  3.2%
0.2.7 ->  7.3%
0.2.8 ->  6.4%
0.2.9 -> 48.2%
0.3.0 -> 20.8%
0.3.1 ->  1.2%
0.3.2 ->  0.3%
These are the numbers we were looking for. Now you should know what to do to extract interesting data from consensuses. Want to give that another try and filter relays with the Exit flag to learn about exit capacity by Tor version? Hint: You'll want to check for entry.getFlags().contains("Exit"). Of course, you could as well continue with the next tutorial below. (Or you could scroll down to the bottom of this page to see the diff.)
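The aggregation and the optional flag filter can also be tried out without any downloaded data. The sketch below uses only the standard library; the Relay class is a hypothetical stand-in for the three status entry fields used in this tutorial, and the sample numbers are invented.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;

public class WeightFractions {

  // Hypothetical stand-in for the status entry fields used in this tutorial.
  static class Relay {
    final String version;
    final long bandwidth;
    final Set<String> flags;

    Relay(String version, long bandwidth, String... flags) {
      this.version = version;
      this.bandwidth = bandwidth;
      this.flags = new HashSet<>(Arrays.asList(flags));
    }
  }

  // Sums bandwidth per version over relays that have the required flag (or
  // over all relays if requiredFlag is null) and converts sums to percentages.
  static SortedMap<String, Double> fractions(List<Relay> relays, String requiredFlag) {
    long total = 0L;
    SortedMap<String, Long> sums = new TreeMap<>();
    for (Relay relay : relays) {
      if (requiredFlag != null && !relay.flags.contains(requiredFlag)) {
        continue;
      }
      total += relay.bandwidth;
      // merge() replaces the containsKey/put pattern used in the tutorial.
      sums.merge(relay.version, relay.bandwidth, Long::sum);
    }
    SortedMap<String, Double> result = new TreeMap<>();
    for (Map.Entry<String, Long> e : sums.entrySet()) {
      result.put(e.getKey(), 100.0 * e.getValue() / total);
    }
    return result;
  }

  public static void main(String[] args) {
    // Invented sample data, not taken from a real consensus.
    List<Relay> relays = Arrays.asList(
        new Relay("0.2.8", 100L, "Exit", "Running"),
        new Relay("0.2.9", 300L, "Running"),
        new Relay("0.2.9", 100L, "Exit"));
    System.out.println(fractions(relays, null));   // prints {0.2.8=20.0, 0.2.9=80.0}
    System.out.println(fractions(relays, "Exit")); // prints {0.2.8=50.0, 0.2.9=50.0}
  }
}
```

Note that the total is computed over the filtered relays, so the percentages in the filtered case describe shares of exit capacity rather than shares of the whole network.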
In the previous tutorial we looked at relay descriptors, so let's now look a bit at bridge descriptors.
Every bridge publishes its transports in the extra-info descriptors that it periodically sends to the bridge authority. Let's count the frequency of transports. A possible use case could be that the Pluggable Transports developers debate which of the transport names is the least pronounceable, and you want to give them numbers to talk about something much more useful instead.
Consider this snippet from a bridge extra-info descriptor:
extra-info LeifEricson 3E0908F131AC417C48DDD835D78FB6887F4CD126
[...]
transport obfs2
transport scramblesuit
transport obfs3
transport obfs4
transport fte
What we need to do is extract the list of transport names (obfs2, scramblesuit, etc.) together with the bridge fingerprint (3E0908F131AC417C48DDD835D78FB6887F4CD126). Taking the fingerprint into account is important, so that we avoid double-counting transports provided by the same bridge.
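The de-duplication idea can be sketched with the standard library alone: Set.add returns false for a fingerprint that was already seen, so repeated descriptors from the same bridge are skipped. The input format (one array per descriptor, fingerprint first) and the placeholder fingerprints are assumptions made for this example.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;

public class CountOncePerBridge {

  // Each row is a hypothetical descriptor: the fingerprint followed by the
  // transport names it lists.
  static SortedMap<String, Integer> countTransports(String[][] descriptors) {
    Set<String> observedFingerprints = new HashSet<>();
    SortedMap<String, Integer> counts = new TreeMap<>();
    for (String[] descriptor : descriptors) {
      // add() returns false if the fingerprint was observed before.
      if (!observedFingerprints.add(descriptor[0])) {
        continue;
      }
      for (int i = 1; i < descriptor.length; i++) {
        counts.merge(descriptor[i], 1, Integer::sum);
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    // Placeholder fingerprints, not real bridges.
    String[][] descriptors = {
        { "AAAA", "obfs3", "obfs4" },
        { "AAAA", "obfs3", "obfs4" },
        { "BBBB", "obfs4" } };
    System.out.println(countTransports(descriptors)); // prints {obfs3=1, obfs4=2}
  }
}
```

Without the fingerprint check, the bridge that published two descriptors would be counted twice for each of its transports.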
Create a new file PluggableTransports.java with the following content:
import org.torproject.descriptor.*;

import java.io.File;
import java.util.*;

public class PluggableTransports {
  public static void main(String[] args) {
    // Download bridge extra-info descriptors.
    DescriptorCollector descriptorCollector = DescriptorSourceFactory.createDescriptorCollector();
    descriptorCollector.collectDescriptors("https://collector.torproject.org", new String[] { "/recent/bridge-descriptors/extra-infos/" }, 0L, new File("descriptors"), false);
    // Keep track of observed bridge fingerprints and of the number of bridges per transport.
    Set<String> observedFingerprints = new HashSet<>();
    SortedMap<String, Integer> countedTransports = new TreeMap<>();
    // Read descriptors from disk.
    DescriptorReader descriptorReader = DescriptorSourceFactory.createDescriptorReader();
    for (Descriptor descriptor : descriptorReader.readDescriptors(new File("descriptors/recent/bridge-descriptors/extra-infos"))) {
      if (!(descriptor instanceof BridgeExtraInfoDescriptor)) {
        // We're only interested in bridge extra-info descriptors.
        continue;
      }
      BridgeExtraInfoDescriptor extraInfo = (BridgeExtraInfoDescriptor) descriptor;
      String fingerprint = extraInfo.getFingerprint();
      // Count transports only the first time we see a fingerprint.
      if (observedFingerprints.add(fingerprint)) {
        for (String transport : extraInfo.getTransports()) {
          if (countedTransports.containsKey(transport)) {
            countedTransports.put(transport, 1 + countedTransports.get(transport));
          } else {
            countedTransports.put(transport, 1);
          }
        }
      }
    }
    // Print out the percentage of observed bridges offering each transport.
    if (!observedFingerprints.isEmpty()) {
      double totalObservedFingerprints = observedFingerprints.size();
      for (Map.Entry<String, Integer> e : countedTransports.entrySet()) {
        System.out.printf("%20s -> %4.1f%%%n", e.getKey(), (100.0 * (double) e.getValue() / totalObservedFingerprints));
      }
    }
  }
}
If you haven't already done so, prepare the working directory for this tutorial as described above.
Compile and run the Java file:
javac -cp lib/\*:generated/dist/signed/\* PluggableTransports.java
java -cp .:lib/\*:generated/dist/signed/\* PluggableTransports
The output should contain lines like the following:
                 fte ->  2.3%
                meek ->  0.2%
               obfs2 ->  0.7%
               obfs3 -> 20.8%
     obfs3_websocket ->  0.0%
               obfs4 -> 77.0%
        scramblesuit -> 17.3%
           snowflake ->  0.1%
           websocket ->  0.7%
As above, we'll leave it up to you to further expand this code. For example, how does the result change if you count transport combinations rather than transports? Hint: you won't need anything else from metrics-lib, but you'll need to add some code to order transport names and write them to a string. (And if you'd rather look up the solution, scroll down a bit to see the diff.)
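One possible way to turn a list of transport names into an order-independent combination key, using only the standard library, is to sort the names in a TreeSet and use its string form. The helper name combinationKey is made up for this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class TransportCombination {

  // Hypothetical helper: builds the same key for the same set of transports,
  // regardless of the order (or repetition) in which they are listed.
  static String combinationKey(List<String> transports) {
    return new TreeSet<>(transports).toString();
  }

  public static void main(String[] args) {
    System.out.println(combinationKey(Arrays.asList("obfs4", "obfs3"))); // prints [obfs3, obfs4]
    System.out.println(combinationKey(Arrays.asList("obfs3", "obfs4"))); // prints [obfs3, obfs4]
  }
}
```

Such a key can then be counted in a map in the same way as single transport names.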
Want to write more code that uses metrics-lib? Be sure to read the JavaDocs while developing new services or applications using Tor network data.
Ran into a problem, found a bug, or came up with a cool new feature? Feel free to contact us. Alternatively, take a look at the bug tracker and open a ticket if there's none for your issue yet.
Interested in writing code for metrics-lib? Please take a look at the Network Health team wiki page to find out how to contribute.
Scrolled down just to see where we're hiding the solutions of the three little riddles above? Here are the diffs:
diff -Nur DownloadConsensuses.java DownloadConsensuses.java
--- DownloadConsensuses.java 2017-03-07 17:48:35.000000000 +0100
+++ DownloadConsensuses.java 2017-03-10 23:02:51.000000000 +0100
@@ -11,7 +11,7 @@
         // Download from Tor's main CollecTor instance,
         "https://collector.torproject.org",
         // include only network status consensuses
-        new String[] { "/recent/relay-descriptors/consensuses/" },
+        new String[] { "/recent/bridge-descriptors/extra-infos/" },
         // regardless of last-modified time,
         0L,
         // write to the local directory called descriptors/,
diff -Nur ConsensusWeightByVersion.java ConsensusWeightByVersion.java
--- ConsensusWeightByVersion.java 2017-03-10 23:00:40.000000000 +0100
+++ ConsensusWeightByVersion.java 2017-03-10 23:03:18.000000000 +0100
@@ -25,6 +25,9 @@
       }
       RelayNetworkStatusConsensus consensus = (RelayNetworkStatusConsensus) descriptor;
       for (NetworkStatusEntry entry : consensus.getStatusEntries().values()) {
+        if (!entry.getFlags().contains("Exit")) {
+          continue;
+        }
         String version = entry.getVersion();
         if (!version.startsWith("Tor ") || version.length() < 9) {
           // We're only interested in a.b.c type versions for this example.
diff -Nur PluggableTransports.java PluggableTransports.java
--- PluggableTransports.java 2017-03-10 23:01:43.000000000 +0100
+++ PluggableTransports.java 2017-03-10 23:03:43.000000000 +0100
@@ -20,12 +22,11 @@
       BridgeExtraInfoDescriptor extraInfo = (BridgeExtraInfoDescriptor) descriptor;
       String fingerprint = extraInfo.getFingerprint();
       if (observedFingerprints.add(fingerprint)) {
-        for (String transport : extraInfo.getTransports()) {
-          if (countedTransports.containsKey(transport)) {
-            countedTransports.put(transport, 1 + countedTransports.get(transport));
-          } else {
-            countedTransports.put(transport, 1);
-          }
+        String transports = new TreeSet<>(extraInfo.getTransports()).toString();
+        if (countedTransports.containsKey(transports)) {
+          countedTransports.put(transports, 1 + countedTransports.get(transports));
+        } else {
+          countedTransports.put(transports, 1);
         }
       }
     }
© 2009–2023 The Tor Project
This material is supported in part by the National Science Foundation under Grant No. CNS-0959138. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. "Tor" and the "Onion Logo" are registered trademarks of The Tor Project, Inc. Data on this site is freely available under a CC0 no copyright declaration: To the extent possible under law, the Tor Project has waived all copyright and related or neighboring rights in the data. Graphs are licensed under a Creative Commons Attribution 3.0 United States License.