An Extensive Evaluation of the Internet’s Open Proxies

This project (in collaboration with Northeastern University) conducts a comprehensive study of open proxies, encompassing more than 107,000 listed open proxies and 13M proxy requests over a 50-day period.

We provide a broad study that examines the availability, success rates, diversity, and also (mis)behavior of proxies. Our results show that listed open proxies suffer poor availability — more than 92% of open proxies that appear on aggregator sites are unresponsive to proxy requests. Much more troubling, we find numerous examples of malicious open proxies in which HTML content is manipulated to mine cryptocurrency (that is, cryptojacking). We additionally detect TLS man-in-the-middle (MitM) attacks, and discover numerous instances in which binaries fetched through proxies were modified to include remote access trojans and other forms of malware. As a point of comparison, we conduct and discuss a similar measurement study of the behavior of Tor exit relays. We find no instances in which Tor relays performed TLS MitM or manipulated content, suggesting that Tor offers a far more reliable and safe form of proxied communication.
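As a rough illustration of how content manipulation might be detected, the sketch below compares a file fetched directly with the same file fetched through a proxy. It is a minimal sketch under stated assumptions, not the study's measurement code: the function names and the three result labels are hypothetical, and it assumes the reference file is static and under the measurer's control, so any byte-level difference indicates modification in transit.

```python
import hashlib
from typing import Optional


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def classify_response(direct_body: bytes, proxied_body: Optional[bytes]) -> str:
    """Classify one proxied fetch against a trusted direct fetch.

    proxied_body is None when the proxy did not answer the request.
    The labels below are hypothetical and chosen for this sketch only.
    """
    if proxied_body is None:
        return "unresponsive"
    if sha256_hex(direct_body) == sha256_hex(proxied_body):
        return "unmodified"
    return "modified"


if __name__ == "__main__":
    original = b"<html><body>reference page</body></html>"
    injected = original.replace(b"</body>", b"<script src='miner.js'></script></body>")
    print(classify_response(original, original))   # same bytes
    print(classify_response(original, injected))   # script injected in transit
    print(classify_response(original, None))       # proxy never answered
```

A hash comparison only works for static content; pages that legitimately vary between requests would need a structural comparison instead.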

More info…

Privacy-Preserving Tor Measurements

This project (in collaboration with researchers at University of New South Wales and the U.S. Naval Research Lab) conducts a detailed privacy-preserving measurement study of Tor, to better understand how the network is being (mis)used.

The Tor network is difficult to measure because, if not done carefully, measurements could risk the privacy (and potentially the safety) of the network’s users. Recent work has proposed the use of differential privacy and secure aggregation techniques to safely measure Tor. We significantly enhance two such tools—PrivCount and Private Set-Union Cardinality (PSC)—in order to support the safe exploration of three major aspects of Tor usage: how many users connect to Tor and from where do they connect, with which destinations do users most frequently communicate, and how many onion services exist and how are they used.

More Info…

Private Set-Union Cardinality (PSC)

This project, in collaboration with researchers at Tulane University and the U.S. Naval Research Lab, introduces a cryptographic protocol for efficiently aggregating a count of unique items across a set of data collectors privately – that is, without exposing any information other than the count. Our protocol allows for more secure and useful statistics gathering in privacy-preserving distributed systems such as anonymity networks; for example, it allows operators of anonymity networks such as Tor to securely answer the question: how many unique users were observed using the distributed service? We formally prove the correctness and security of our protocol in the Universally Composable framework against an active adversary that compromises all but one of the aggregation parties. We also show that the protocol provides security against adaptive corruption of the distributed data collectors, which prevents them from being victims of targeted compromise. To ensure safe measurements, we also show how the output can satisfy differential privacy.

We present a proof-of-concept implementation of the private set-union cardinality protocol (PSC) and use it to demonstrate that PSC operates with low computational overhead and reasonable bandwidth. In particular, for reasonable deployment sizes, the protocol runs at timescales shorter than a typical measurement period and is thus suitable for distributed measurement.
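To make the target of the protocol concrete, the following sketch computes the quantity PSC is designed to reveal: the number of unique items seen across all data collectors, optionally perturbed with noise. It is deliberately not the cryptographic protocol. Everything here runs in the clear in one process, the function names are hypothetical, and Laplace noise with sensitivity 1 is an assumption made for illustration rather than the noise mechanism of the actual system.

```python
import random


def union_cardinality(collector_sets):
    """Count the unique items observed across all collectors (in the clear)."""
    union = set()
    for items in collector_sets:
        union |= set(items)
    return len(union)


def noisy_union_cardinality(collector_sets, epsilon, rng=None):
    """Add Laplace noise of scale 1/epsilon to the union count.

    Assumes one user changes the count by at most 1 (sensitivity 1).
    The difference of two exponential samples with rate epsilon is
    Laplace-distributed with scale 1/epsilon.
    """
    rng = rng or random.Random()
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return union_cardinality(collector_sets) + noise


if __name__ == "__main__":
    # Three hypothetical collectors that each saw some client identifiers.
    observed = [{"alice", "bob"}, {"bob", "carol"}, set()]
    print(union_cardinality(observed))
    print(noisy_union_cardinality(observed, epsilon=0.5))
```

In the real setting no single party may see the individual sets, which is what the cryptographic protocol provides; this sketch only shows the statistic being computed.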

More Info…


HisTorɛ

A large volume of existing research attempts to understand who uses Tor and how the network is used (and misused). However, conducting measurements on the live Tor network, if done improperly, can endanger the security and anonymity of the millions of users who depend on the network to enhance their online privacy. Indeed, several existing measurement studies of Tor have been heavily criticized for unsafe research practices.

Tor needs privacy-preserving methods of gathering statistics. The recently proposed PrivEx system demonstrates how data can be safely collected on Tor using techniques from differential privacy. However, the integrity of the statistics reported by PrivEx is brittle under realistic deployment conditions. An adversary who operates even a single relay in the volunteer-operated anonymity network can arbitrarily influence the result of PrivEx queries. We argue that a safe and useful data collection mechanism must provide both privacy and integrity protections.

HisTorɛ is a privacy-preserving statistics collection scheme based on (ɛ,𝛿)-differential privacy that is robust against adversarial manipulation. We formalize the security guarantees of HisTorɛ and show using historical data from the Tor Project that HisTorɛ provides useful data collection and reporting with low bandwidth and processing overheads.
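One way to see how privacy and integrity can be combined is a noisy histogram in which each relay contributes to exactly one bin. The sketch below is an illustration under stated assumptions, not the HisTorɛ protocol: the function names, the bin edges, and the use of the standard Gaussian mechanism for (ɛ,𝛿)-differential privacy are all choices made for this example. Because a relay can only add one to a single bin, an implausibly large report shifts the result no more than an honest one.

```python
import bisect
import math
import random


def bin_index(value, edges):
    """Return the index of the bin holding value; edges are sorted bin boundaries."""
    return bisect.bisect_right(edges, value)


def histogram(values, edges):
    """Each reporting relay adds exactly 1 to exactly one bin."""
    counts = [0] * (len(edges) + 1)
    for value in values:
        counts[bin_index(value, edges)] += 1
    return counts


def gaussian_sigma(epsilon, delta, sensitivity):
    """Noise scale of the standard Gaussian mechanism (stated for 0 < epsilon < 1)."""
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon


def noisy_histogram(values, edges, epsilon, delta, rng=None):
    """Release bin counts with Gaussian noise added to every bin.

    Assumption: changing one relay's report moves one unit between two
    bins, so the L2 sensitivity of the count vector is sqrt(2).
    """
    rng = rng or random.Random()
    sigma = gaussian_sigma(epsilon, delta, math.sqrt(2.0))
    return [count + rng.gauss(0.0, sigma) for count in histogram(values, edges)]


if __name__ == "__main__":
    edges = [10, 50]                      # bins: below 10, 10 to 49, 50 and above
    honest = [1, 5, 12, 100]
    tampered = [1, 5, 12, 10**9]          # last relay reports an absurd value
    print(histogram(honest, edges))
    print(histogram(tampered, edges))     # identical counts to the honest run
    print(noisy_histogram(honest, edges, epsilon=0.5, delta=1e-6))
```

Compare this with summing raw values, where the single tampered report would dominate the total.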

DeDOS: Declarative Dispersion-Oriented Software

The goal of this project is to create fundamentally new defenses against distributed denial-of-service (DDoS) attacks that can provide far greater resilience to these attacks compared to existing solutions. Today’s responses to DDoS attacks largely rely on old-school network-based filtering or scrubbing, which are slow and manual, and cannot handle new attacks. DeDOS takes a radically different approach that combines techniques from declarative programming, program analysis, and real-time resource allocation in the cloud.

Rather than relying on traditional detection and mitigation techniques, the project aims to develop a new software architecture from the ground up that makes it significantly harder for an attacker to slow down the system without expending large amounts of resources. For example, instead of running monolithic software and naively replicating it when under an attack, DeDOS logically and physically restructures complex software systems into smaller components that can react to attacks at a much finer granularity. DeDOS also uses state-of-the-art resource allocation algorithms to achieve near-optimal use of system resources and to support critical, time-sensitive applications, such as situational awareness. More info…

Hidden Voice Commands

Voice interfaces are becoming more ubiquitous and are now the primary input method for many devices. In this project, we explore how they can be attacked with hidden voice commands that are unintelligible to human listeners but are interpreted as commands by devices. We evaluate these attacks under two different threat models. In the black-box model, an attacker uses the speech recognition system as an opaque oracle. We show that the adversary can produce difficult-to-understand commands that are effective against existing systems in the black-box model. Under the white-box model, the attacker has full knowledge of the internals of the speech recognition system and uses it to create attack commands that we demonstrate through user testing are not understandable by humans. We then evaluate several defenses, including notifying the user when a voice command is accepted; a verbal challenge-response protocol; and a machine learning approach that can detect our attacks with 99.8% accuracy. More info…

HoneyMail and HoneyProxy

HoneyMail is a measurement study of email interception. Since the content (and metadata) of intercepted emails can be trivially read, conventional wisdom tells us that confidential information should never be sent via unencrypted emails. The project explores whether such advice is actually prudent. That is, we aim to answer the question: how often are emails actually intercepted on the Internet?

To determine how regularly interception occurs, we transmit (false) emails whose content is attractive to potential eavesdroppers, but which are sent only between our own email accounts. In particular, our fake emails will contain URLs that purportedly lead to sensitive information about mortgages, bank accounts, passwords, and shared files. The emails are sent between geographically distributed email servers located around the globe, with embedded URLs that resolve to web servers under our control. Since the emails are sent only between our email servers and are addressed to fictitious email accounts, any visit to one of the embedded URLs must be due to the (illegal) interception of our email. More info…
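The attribution step described above can be sketched with one unguessable URL per email. This is a minimal sketch under stated assumptions rather than the project's tooling: the function names, the in-memory registry, and the example host and server labels are hypothetical. Each URL embeds a random token, so a later request for that URL can be traced to the single email (and server pair) that carried it.

```python
import secrets


def make_honey_url(base_url, registry, sender, recipient):
    """Create a unique bait URL and remember which email carried it."""
    token = secrets.token_urlsafe(16)          # random, URL-safe, unguessable
    registry[token] = (sender, recipient)
    return f"{base_url.rstrip('/')}/{token}"


def attribute_visit(requested_path, registry):
    """Map a requested URL or path back to the email that contained it.

    Returns None when the last path component is not a known token.
    """
    token = requested_path.rstrip("/").rsplit("/", 1)[-1]
    return registry.get(token)


if __name__ == "__main__":
    registry = {}
    url = make_honey_url("https://files.example.test/statements", registry,
                         sender="mail-server-a", recipient="mail-server-b")
    print(url)
    print(attribute_visit(url, registry))            # the pair recorded above
    print(attribute_visit("/statements/guess", registry))  # None: not a token
```

Since nobody but the two mail servers should ever see a given token, any hit on its URL points to interception somewhere along that one delivery path.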

The HoneyProxy project is a comparative study of the numerous free proxies available online and the Tor network. Since both Tor and free proxies are susceptible to manipulation and monitoring of traffic, we are examining the behavior of a set of over 5000 proxies and all Tor exit nodes to search for malicious behavior on the part of the proxies and exit nodes. More info…


Senser

Senser is a distributed censorship detection and circumvention system for the web. Senser operates as a network of proxies located at different vantage points on the Internet (some of which may be subject to censorship). Clients query a random subset of Senser proxies for compact descriptions of a desired web page, and apply consensus and matching algorithms to the returned results to locally render a “majority” web page. More info…
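The consensus step can be pictured as a vote over the descriptions returned by the queried proxies. The sketch below is a simplified illustration, not Senser's algorithm: it assumes each description is a single comparable string (for example, a digest of the page structure), and the function name and the strict-majority threshold are hypothetical.

```python
from collections import Counter


def majority_description(descriptions, threshold=0.5):
    """Return the description reported by more than threshold of the proxies.

    Returns None when there are no reports or no description wins a
    strict majority, in which case the client cannot trust any version.
    """
    if not descriptions:
        return None
    value, count = Counter(descriptions).most_common(1)[0]
    if count / len(descriptions) > threshold:
        return value
    return None


if __name__ == "__main__":
    # Two uncensored vantage points agree; one returns a block page.
    reports = ["digest-of-real-page", "digest-of-real-page", "digest-of-block-page"]
    print(majority_description(reports))
    print(majority_description(["a", "b"]))   # tie, so no consensus
```

A censored proxy that returns a block page is simply outvoted, provided most of the sampled vantage points see the genuine page.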


EmulaTor

EmulaTor is a project to automate the creation of EmuLab experiments involving Tor. Creating a Tor ‘network-in-a-box’ on EmuLab requires significant configuration of both EmuLab and Tor; EmulaTor simplifies this process and allows for push-button creation of the necessary files. More Info…


Tortoise

Tortoise is a system for rate limiting Tor at its ingress points. We demonstrate that the system incurs little penalty for interactive web users, while significantly decreasing the throughput for filesharers. Our techniques provide incentives to filesharers to configure their Tor clients to also relay traffic, which in turn improves the network’s overall performance. More info…
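A token bucket is one common way to build this kind of rate limit, and it shows why short interactive bursts are barely affected while sustained bulk transfers are throttled. The sketch below is a generic token bucket written for illustration; the class name, parameters, and numbers are assumptions and do not describe Tortoise's actual configuration.

```python
class TokenBucket:
    """Generic token bucket: rate in bytes per second, burst in bytes."""

    def __init__(self, rate, burst):
        self.rate = rate          # refill speed
        self.burst = burst        # maximum number of stored tokens
        self.tokens = burst       # start with a full bucket
        self.last = 0.0           # time of the previous call, in seconds

    def allow(self, nbytes, now):
        """Return True and spend tokens if nbytes may be sent at time now."""
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=100, burst=500)
    print(bucket.allow(400, now=0.0))   # a short burst fits in the full bucket
    print(bucket.allow(400, now=1.0))   # sustained sending outpaces the refill
    print(bucket.allow(400, now=3.0))   # after a pause the bucket has refilled
```

A web user who sends occasional small bursts rarely empties the bucket, whereas a client that transmits continuously is held to the refill rate.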

Secure Network Provenance

The goal of this project is to provide secure network provenance, that is, the ability to correctly explain system states even when (and especially when) the system is faulty or under attack. Towards this goal, we are substantially extending and generalizing the concept of network provenance by adding capabilities needed in a forensic setting; we are developing techniques for securely storing provenance without trusted components; and we are designing methods for efficiently querying secure provenance. We are evaluating our techniques in the context of concrete applications, such as Hadoop MapReduce or BGP interdomain routing.

Web Footprinting

With the growth of online social networks and social media sites, the increase in dynamic web content, and the popularity of digital communication, more and more public information about individuals is available on the Internet. While much of this information is not sensitive, it is not uncommon for users to publish some sensitive information, including their birth dates and addresses, on social networking sites. The availability of this publicly accessible and potentially sensitive data can (and does) lead to abuse, exposing users to fraud, stalking, and identity theft. To help users better understand the potential risks associated with publishing certain data on the web, this project focuses on helping individuals determine and understand their WebFootprints. More info…


SAFEST

The DARPA-funded Selectable Anonymity for Enabling SAFER Telecommunications (SAFEST) project investigates methods for constructing reliable, high-performing, and censorship-resistant anonymity networks. SAFEST is a collaborative effort between Georgetown and the University of Pennsylvania.