David Choffnes

Postdoctoral Research Associate and CI Fellow
Dept. of Computer Science and Engineering
University of Washington

~ Research Statement ~

At its core, my research approach is to identify high-impact research problems in distributed systems, use novel techniques to solve them, and produce open source software that deploys those solutions immediately at the scale of hundreds of thousands of users. My work is driven by a desire to make as large a difference in people's lives as possible through networking and systems. As such, I focus on solutions to problems that affect large portions of the Internet population and produce software that delivers my research ideas to ordinary users worldwide.
   A key goal of my current work is enabling a sort of “information plane” for globally-distributed systems by reusing measurements collected by existing long-running services. By avoiding the need to perform potentially costly measurements, this design approach improves scalability, reduces code complexity and facilitates deployment for new and existing services.
   As part of this work, I have built and deployed systems that improve transfer performance for peer-to-peer (P2P) applications while reducing their cost to network providers, and reuse passively gathered P2P performance information to detect and localize network performance problems. Because designing successful distributed systems requires an understanding of the networks that support them, my research also encompasses topics in networking that include Internet topologies, network positioning and network neutrality. Finally, my work has touched on related areas including mobile ad-hoc networks, 3G networks, operating systems, security and privacy. In the following paragraphs, I describe some of my prominent research results.

Measurement Reuse for Internet-Scale Distributed Systems

Taming the Torrent
Over the past decade, the peer-to-peer (P2P) model for building distributed systems has enjoyed incredible success and popularity, forming the basis for a wide variety of important Internet applications such as file sharing, voice-over-IP (VoIP) and video streaming. This success has not been universally welcomed. Internet Service Providers (ISPs) and P2P systems, for example, have developed a complicated relationship that has been the focus of much media attention. While P2P bandwidth demands have yielded significant revenues for ISPs, as users upgrade to broadband for improved P2P performance, P2P systems are also one of their greatest and most costly traffic engineering challenges because peers establish connections largely independently of Internet routing. To address these issues, I developed Ono, an extension to a popular BitTorrent client that biases P2P connections to avoid many of these costs without sacrificing -- and potentially improving -- BitTorrent performance [SIGCOMM 2008].
   The Ono software, which has been installed more than 700,000 times, provides an alternative approach to the unsustainable cat-and-mouse game where ISPs would block P2P traffic and P2P software would circumvent these measures. In addition to addressing the problem of cross-ISP traffic in P2P systems, Ono provides a clear instance of reusing information made available by existing long-running services [SIGCOMM 2006]. In this case, we used dynamic CDN redirections as hints regarding network proximity: if two peers are sent to the same CDN replica servers, they are likely to be close to those servers, and by transitivity, close to each other [ICDCS 2008]. Finally, by providing an immediately deployable system that locates nearby peers to improve P2P users’ performance, we showed that the right user incentives are essential for a successful approach to reducing costs for ISPs.
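To make the proximity hint concrete, here is a minimal sketch of how two peers' CDN redirections could be compared. It is an illustration under stated assumptions rather than the Ono implementation: the function names, the use of cosine similarity as the comparison, and the 0.15 threshold are all hypothetical choices made for this example.

```python
from collections import Counter
from math import sqrt


def ratio_map(redirections):
    """Fraction of lookups that sent a peer to each CDN replica server."""
    counts = Counter(redirections)
    total = sum(counts.values())
    return {replica: n / total for replica, n in counts.items()}


def cosine_similarity(a, b):
    """Cosine similarity of two ratio maps (0.0 = disjoint, 1.0 = identical)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def likely_nearby(redirections_a, redirections_b, threshold=0.15):
    """Hypothetical rule: treat peers as close if their maps overlap enough."""
    similarity = cosine_similarity(ratio_map(redirections_a),
                                   ratio_map(redirections_b))
    return similarity >= threshold
```

For example, `likely_nearby(["r1", "r1", "r2"], ["r1", "r2", "r2"])` treats the two peers as close because both are mostly redirected to the same replicas, while a peer that only ever sees a different replica would not match.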

Using the Crowd to Monitor the Cloud
While my work on Ono demonstrated the effectiveness of measurement reuse for a popular P2P system, my current work uses a similar approach to address the challenge of monitoring performance for distributed services that extend to the edge of the network (e.g., VoIP, IPTV and content distribution). Given the popularity and potential for revenue from these services, their user experience has become an important benchmark for service providers, network providers and end users.
   Perceived user experience is in large part determined by the frequency, duration and severity of network events that impact a service. There is thus a clear need to detect, isolate and determine the root causes of these service-level network events so that operators can resolve such issues in a timely manner, minimizing their impact on revenue and reputation. While most network problems occur at the edges of the network, they are largely invisible to network operators. The reason is that most existing approaches to monitoring require O(N^2) measurements that simply do not scale to the large number of network elements at the edge.
   My thesis work proposes that the most effective way to detect service-level events is by monitoring the end systems where the services are used. In essence, this approach detects network performance problems by crowdsourcing network monitoring – achieving scalable, real-time network coverage by pushing monitoring to end systems at the network edge. We used probability theory, extensive traces from BitTorrent users and ground-truth information from ISPs to design and build a system that detects network problems effectively, quickly and reliably. Its current implementation for BitTorrent, called the Network Early Warning System (NEWS), has been installed more than 30,000 times. While Ono reuses CDN information to improve BitTorrent efficiency, my current work reuses BitTorrent information to improve detection of network problems that impact performance.
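The general idea of crowdsourced detection can be sketched as follows: each end system flags a local performance drop, and a network-level event is reported only when so many peers in the same network flag a drop at once that coincidence is an unlikely explanation. This is a simplified sketch, not the NEWS algorithm. The function names, the moving-average drop test, the independence assumption behind the binomial tail, and every threshold value are assumptions made for the example.

```python
from math import comb


def local_event(samples, window=5, drop_fraction=0.5):
    """True if the latest sample is below drop_fraction of the recent mean."""
    if len(samples) <= window:
        return False
    baseline = sum(samples[-window - 1:-1]) / window
    return baseline > 0 and samples[-1] < drop_fraction * baseline


def chance_probability(n_peers, n_reporting, p_local):
    """Binomial tail: P(at least n_reporting of n_peers flag by coincidence).

    Assumes peers raise false alarms independently with probability p_local.
    """
    return sum(
        comb(n_peers, k) * p_local ** k * (1 - p_local) ** (n_peers - k)
        for k in range(n_reporting, n_peers + 1)
    )


def network_event(peer_samples, p_local=0.05, max_chance=0.01):
    """Report an event when concurrent local flags are unlikely to be chance."""
    flags = [local_event(samples) for samples in peer_samples.values()]
    reporting = sum(flags)
    if reporting == 0:
        return False
    return chance_probability(len(flags), reporting, p_local) < max_chance
```

With three peers that all see throughput fall from 100 to 10, the chance probability is 0.05^3 = 0.000125, so an event is reported. If only one of the three sees the drop, the probability of at least one coincidental flag is about 0.14 and nothing is reported.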


Network Measurement at the Edge

Where the Sidewalk Ends
While the Internet has been described as one of the greatest engineering successes in modern history, it is not without its limitations. One of its greatest strengths is its simple, decentralized design that enables the Internet to grow to reach nearly every corner of the globe. This design is also the reason that there is no built-in way to measure Internet growth, to the extent that researchers cannot even generate a complete map of today's Internet. Such Internet topologies, however, are critical not only for informing business decisions among different network providers but also for isolating and addressing events and outages that impact network performance.
   Most Internet mapping efforts have derived the network structure, at the level of interconnected autonomous systems (ASes), from a limited number of data sources. While techniques for charting the topology continue to improve, the number of vantage points continues to shrink relative to the fast-paced growth of the Internet. By leveraging measurements performed by an extension to a popular P2P system as an observation platform that scales with the growing Internet, we revealed hidden areas of the Internet topology [CoNEXT 2009]. In particular, we used traceroute measurements from hundreds of thousands of edge systems to discover tens of thousands of links invisible to public views.
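The link-discovery step can be illustrated with a short sketch. It assumes that each traceroute has already been converted into a sequence of AS numbers (the IP-to-AS mapping is outside the example), and the function names are hypothetical. The AS numbers used in the usage note come from the range reserved for documentation.

```python
def as_links(as_path):
    """Undirected AS-level links implied by one traceroute's AS path."""
    # Consecutive hops inside the same AS do not form an inter-AS link.
    collapsed = [
        asn for i, asn in enumerate(as_path)
        if i == 0 or asn != as_path[i - 1]
    ]
    return {frozenset(pair) for pair in zip(collapsed, collapsed[1:])}


def links_missing_from_public_view(as_paths, public_links):
    """Links seen from edge vantage points but absent from the public view."""
    observed = set()
    for path in as_paths:
        observed |= as_links(path)
    return observed - public_links
```

For instance, given the path `[64496, 64496, 64497, 64498]` and a public view that already contains the link between 64496 and 64497, the only newly discovered link is the one between 64497 and 64498.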

Network Positioning Reexamined
The same design principles that hinder attempts to generate complete Internet maps also make it difficult to predict the performance (e.g., throughput or delay) between networks and hosts. In the context of large-scale distributed systems, accurate predictions could be used to improve performance by informing decisions regarding which hosts to connect to and which network paths to follow. A large body of previous work attempts to address this issue using various measurement techniques to predict future performance between hosts. A promising approach, network positioning, predicts performance by calculating a network distance between participating hosts.
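As a rough illustration of what calculating a network distance can mean, the sketch below assumes one common family of positioning systems, in which each host is assigned coordinates in a Euclidean space and the distance between two hosts' coordinates serves as the predicted delay. The function names and the relative-error metric are assumptions for this example and do not describe any particular system evaluated in this work.

```python
from math import dist


def predicted_rtt(coord_a, coord_b):
    """Predicted delay: Euclidean distance between two hosts' coordinates."""
    return dist(coord_a, coord_b)


def relative_error(predicted, measured):
    """How far a prediction is from a measured delay, as a fraction of it."""
    return abs(predicted - measured) / measured
```

For hosts at `(0, 0)` and `(3, 4)` the predicted delay is 5.0. If the measured delay were 10.0, the relative error would be 0.5.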
   Using more than 1.4 billion network measurements gathered from P2P users, we have shown that existing approaches to network positioning exhibit noticeably worse performance than previously reported in studies conducted on limited-scale research testbeds. To explain this result, we identified several key properties of this environment that contradict fundamental assumptions driving network positioning research. Based on these observations, we are experimenting with a new approach to network positioning that determines relative network locations based on local topology information. We used extensive traceroute data from P2P users to design a system that uses local topology information to locate nearby peers reliably without any dependence on infrastructure, and we are currently deploying it as part of our Ono software.

Identifying ISP Interference
As the Internet is increasingly used for high-bandwidth and real-time services such as video streaming, VoIP and P2P content distribution, network providers are faced with difficult traffic engineering challenges that impact user performance and ISP revenue. To address these issues, some providers have resorted to interfering with subscriber traffic, e.g., through shaping, blocking or forging packets. Though the legality of such violations of network neutrality varies by jurisdiction, there are a number of efforts that attempt to improve the transparency of these network policies. Existing solutions such as Glasnost and NANO face a number of challenges, however, because centralized solutions are easily detected and filtered by ISPs, while on-demand measurements may miss many instances of ISP interference.
   We are currently working on an alternative approach to detecting ISP interference that relies on passively monitoring the natural traffic generated by running software (e.g., P2P file sharing, Web browsing and VoIP) at the end systems where it is used. In combination with limited active measurements, this approach can confirm and isolate potential cases of interference in a fully decentralized manner that prevents ISPs from hiding their activity. We have produced software (currently in beta testing) that implements our approach for BitTorrent, which not only notifies participating users when their traffic is being interfered with, but also makes this information available through a public Web interface accessible to any user.

Last updated October 31, 2009.