81% of Tor Users Can be Easily Unmasked By Analysing Router Information

Swati KhandelwalNov 18, 2014

Tor has always been a tough target for law enforcement for years and FBI has spent millions of dollars to de-anonymize the identity of Tor users, but a latest research suggests that more than 81% of Tor clients can be "de-anonymised" by exploiting the traffic analysis software ‘Netflow’ technology that Cisco has built into its router protocols.

NetFlow is a network protocol designed to collect and monitor network traffic. It exchanged data in network flows, which can correspond to TCP connections or other IP packets sharing common characteristics, such UDP packets sharing source and destination IP addresses, port numbers, and other information.

The research was conducted for six years by professor Sambuddho Chakravarty, a former researcher at Columbia University’s Network Security Lab and now researching Network Anonymity and Privacy at the Indraprastha Institute of Information Technology in Delhi.

Chakravarty used a technique, in order to determine the Tor relays, which involved a modified public Tor server running on Linux, accessed by the victim client, and modified Tor node that can form one-hop circuits with arbitrary legitimate nodes.

"The server modulates the data being sent back to the client, while the corrupt Tor node is used to measure delay between itself and Tor nodes," researchers wrote in a paper PDF. "The correlation between the perturbations in the traffic exchanged with a Tor node, and the server stream helped identify the relays involved in a particular circuit."

According to the research paper, to carry out large-scale traffic analysis attacks in the Tor environment one would not necessarily need the resources of a nation state, even a single AS may observe a large fraction of entry and exit node traffic, as stated in the paper – a single AS (Autonomous System) could monitor more than 39% of randomly-generated Tor circuits.

"It is not even essential to be a global adversary to launch such traffic analysis attacks," Chakravarty wrote. "A powerful, yet non- global adversary could use traffic analysis methods […] to determine the various relays participating in a Tor circuit and directly monitor the traffic entering the entry node of the victim connection."

The technique depends on injecting a repeating traffic pattern into the TCP connection that it observes as originating from the target exit node, and then correlating the server’s exit traffic for the Tor clients, as derived from the router’s flow records, to identify Tor client.

Tor is vulnerable to this kind of traffic analysis because it is designed as low-latency anonymous communication networks.

"To achieve acceptable quality of service, [Tor attempts] to preserve packet interarrival characteristics, such as inter-packet delay. Consequently, a powerful adversary can mount traffic analysis attacks by observing similar traffic patterns at various points of the network, linking together otherwise unrelated network connections," Chakravarty explains.

Chakravarty’s research on traffic analysis doesn't need hundreds of millions of dollars in expense, neither it needed infrastructural efforts that the NSA put into their FoxAcid Tor redirects, however it benefits from running one or more high-bandwidth, high-performance, high-uptime Tor relays.

Just few days ago, US and European authorities announced the seizure of 27 different websites as part of a much larger operation called Operation Onymous, which led to take-down of more than "410 hidden domains" that sell illegal goods and services from drugs to murder-for-hire assassins by masking their identities using the Tor encryption network.

UPDATE
However, the Tor Project responded via a blog post. In a statement Tor project member 'Arma' confirmed that they have been aware of the network analysis attacks and has already implemented security measures in place.

"It's great to see more research on traffic correlation attacks, especially on attacks that don't need to see the whole flow on each side. But it's also important to realise that traffic correlation attacks are not a new area." reads the blog post.

"The discussion of false positives is key to this new paper too: Sambuddho's paper mentions a false positive rate of 6%. ... It's easy to see how at scale, this 'base rate fallacy' problem could make the attack effectively useless," he said.

Found this article interesting? Follow us on Google News, Twitter and LinkedIn to read more exclusive content we post.