Filename: 348-udp-app-support.md
Title: UDP Application Support in Tor
Author: Micah Elizabeth Scott
Created: December 2023
Status: Open

UDP Application Support in Tor

Introduction

This proposal takes a fresh look at the problem of implementing support in Tor for applications which require UDP/IP communication.

This work is being done with the sponsorship and goals of the Tor VPN Client for Android project.

The proposal begins with a summary of previous work and the specific problem space being addressed. This leads into an analysis of possible solutions, and finally some possible conclusions about the available development opportunities.

History

There have already been multiple attempts over Tor's history to define some type of UDP extension.

2006

Proposal 100 by Marc Liberatore in 2006 suggested a way to "add support for tunneling unreliable datagrams through tor with as few modifications to the protocol as possible." This proposal suggested extending the existing TLS+TCP protocol with a new DTLS+UDP link mode. The focus of this work was on a potential way to support unreliable traffic, not necessarily on UDP itself or on UDP applications.

In proposal 100, a Tor stream is used for one pairing of local and remote address and port, copying the technique used by Tor for TCP. This works for some types of UDP applications, but it's broken by common behaviors like ICE connectivity checks, NAT traversal attempts, or using multiple servers via the same socket.

No additional large-message fragmentation protocol is defined, so the MTU in proposal 100 is limited to what fits in a single Tor cell. This value is much too small for most applications.

It's possible these UDP protocol details would have been elaborated during design, but the proposal hit a snag elsewhere: there was no agreement on a way to avoid facilitating new attacks against anonymity.

2014

In a thread on the tor-talk mailing list, Nathan Freitas suggested UDP tunneling over Tor using the BadVPN project's udpgw protocol.

This protocol was never formally documented and is no longer actively maintained, but it was broadly similar in scope to an RFC8656 TURN relay operating over a TCP transport.

2018

In 2018, Nick Mathewson and Mike Perry wrote a summary of the side-channel issues with unreliable transports for Tor.

The focus of this document is on the communication between Tor relays, but there is considerable overlap between the attack space explored here and the potential risks of any application-level UDP support. Attacks that are described here, such as drops and injections, may be applied by malicious exits or some types of third parties even in an implementation using only present-day reliable Tor transports.

2020

Proposal 339 by Nick Mathewson in 2020 introduced a simpler UDP encapsulation design with stream mapping properties similar to those in proposal 100, but with the unreliable transport omitted. Datagrams are tunneled over a new type of Tor stream using a new type of Tor message. As a prerequisite, it depends on proposal 319 to support messages that may be larger than a cell, extending the MTU to support arbitrarily large UDP datagrams.

In proposal 339 the property of binding a stream both to a local port and to a remote peer is described in UNIX-style terminology as a connected socket. This idea is explored below using alternate terminology from RFC4787, NAT behavior requirements for UDP. The single-peer connected socket behavior would be referred to as an endpoint-dependent mapping in RFC4787. This type works fine for client/server apps but precludes the use of NAT traversal for peer-to-peer transfer.

Scope

This proposal aims to allow Tor applications and Tor-based VPNs to provide compatibility with applications that require UDP/IP communications.

We don't have a specific list of applications that must be supported, but we are currently aiming for broad support of popular applications while still respecting and referencing all applicable Internet standards documents.

Changes to the structure of the Tor network are out of scope, as are most performance optimizations. We expect to rely on common optimizations to the performance of Tor circuits, rather than looking to make specific changes that optimize for unreliable datagram transmission.

This document will briefly discuss UDP for onion services below. It's worth planning for this as a way to evaluate the future design space, but in practice we are not aiming for UDP onion services yet. UDP onion services will require changes to most applications that want to use them, as any media negotiations will need to understand onion addressing in addition to IPv4 and IPv6.

The allowed subset of UDP traffic is not subject to a single rigid definition. There are several options discussed below using the RFC4787 framework.

We require support for DNS clients. Tor currently only supports a limited subset of DNS queries, and it's desirable to support more. This will be analyzed in detail as an application below. DNS is one of very few applications that still rely on fragmented UDP datagrams, though this may not be relevant for us since only servers typically need to control the production of fragments.

We require support for voice/video telecommunications apps. Even without an underlying transport that supports unreliable datagrams, we expect a tunnel to provide a usable level of compatibility. This design space is very similar to the TURN (RFC8656) specification, already used very widely for compatibility with networks that filter UDP. See the analysis of specific applications below.

We require support for peer-to-peer UDP transfer without additional relaying, in apps that use ICE (RFC8445) or similar connection establishment techniques. Video calls between two Tor users should transit directly between two exit nodes. This requires that allocated UDP ports can each communicate with multiple peers: endpoint-independent mapping as described by RFC4787.

We do not plan to support applications which accept incoming datagrams from previously-unknown peers, for example a DNS server hosted via Tor. RFC4787 calls this endpoint-independent filtering. It's unnecessary for running peer-to-peer apps, and it facilitates an extremely easy traffic injection attack.

UDP Traffic Models

To better specify the role of a UDP extension for Tor, this section explores a few frameworks for describing noteworthy subsets of UDP traffic.

User Datagram Protocol (RFC768)

The "User Interface" suggested by RFC768 for the protocol is a rough sketch, suggesting that applications have some way to allocate a local port for receiving datagrams and to transmit datagrams with arbitrary headers.

Despite UDP's simplicity as an application of IP, we do need to be aware of IP features that are typically hidden by TCP's abstraction.

UDP applications typically try to obtain an awareness of the path MTU, using some type of path MTU discovery (PMTUD) algorithm. On IPv4, this requires sending packets with the "Don't Fragment" flag set, and measuring when those packets are lost or when ICMP "Fragmentation Needed" replies are seen.

Note that many applications have their own requirements for path MTU. For example, QUIC and common implementations of WebRTC require an MTU no smaller than 1200 bytes, but they can discover larger MTUs when available.

Socket Layer

In practice the straightforward "User Interface" from RFC768, capable of using an arbitrary local address, is only available to privileged users.

BSD-style sockets support UDP via SOCK_DGRAM. UDP is a stateless protocol, but sockets do have state. Each socket is bound, either explicitly with bind() or automatically, to a source IP and port.

At the API level, a socket is said to be connected to a remote (address, port) if that address is the default destination. A connected socket will also filter out incoming packets with source addresses different from this default destination. A socket is considered unconnected if connect() has not been called. These sockets have no default destination, and they accept datagrams from any source.

There does not need to be any particular mapping between the lifetime of these application sockets and any higher-level "connection" the application establishes. It's better to think of one socket as one allocated local port. A typical application may allocate only a single port (one socket) for talking to many peers. Every datagram sent or received on the socket may have a different peer address.
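The difference between unconnected and connected sockets can be observed directly on the loopback interface. The following is a minimal sketch, not part of the proposal; the helper names (`open_udp_socket`, `receive_or_none`, `demo_connected_filtering`) and the 0.5 second timeout are illustrative assumptions.

```python
import socket


def open_udp_socket():
    """Allocate one local UDP port on loopback; the OS picks the port number."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))
    sock.settimeout(0.5)
    return sock


def receive_or_none(sock):
    """Return the next datagram payload, or None if nothing arrives in time."""
    try:
        payload, _peer = sock.recvfrom(2048)
        return payload
    except socket.timeout:
        return None


def demo_connected_filtering():
    app, peer_a, peer_b = open_udp_socket(), open_udp_socket(), open_udp_socket()
    try:
        # Unconnected: a single local port accepts datagrams from any peer.
        peer_a.sendto(b"a1", app.getsockname())
        peer_b.sendto(b"b1", app.getsockname())
        first, second = receive_or_none(app), receive_or_none(app)
        unconnected = sorted(p for p in (first, second) if p is not None)

        # Connected: peer_a becomes the default destination and the only
        # source address the socket will accept datagrams from.
        app.connect(peer_a.getsockname())
        peer_b.sendto(b"b2", app.getsockname())
        peer_a.sendto(b"a2", app.getsockname())
        connected = [receive_or_none(app), receive_or_none(app)]
        return unconnected, connected
    finally:
        for sock in (app, peer_a, peer_b):
            sock.close()
```

On a typical BSD-style stack, the unconnected phase is expected to deliver both `a1` and `b1`, while the connected phase delivers only `a2` and the datagram from `peer_b` is silently filtered.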

Network Address Translation (NAT)

Much of the real-world complexity in applying UDP comes from defining strategies to detect and overcome the effects of NAT. As a result, an intimidating quantity of IETF documentation has been written on NAT behavior and on strategies for NAT traversal.

RFC4787 and later RFC7857 offer best practices for implementing NAT. These are sometimes referred to as the BEHAVE-WG recommendations, based on the "Behavior Engineering for Hindrance Avoidance" working group behind them.

RFC6888 makes additional recommendations for "carrier grade" NAT systems, where small pools of IP addresses are shared among a much larger number of subscribers.

RFC8445 describes the Interactive Connectivity Establishment (ICE) protocol, which has become a common and recommended application-level technique for building peer-to-peer applications that work through NAT.

There are multiple fundamental technical issues that NAT presents:

  1. NAT must be stateful in order to route replies back to the correct source.

    This directly conflicts with the stateless nature of UDP itself. The NAT's mapping lifetime, determined by a timer, will not necessarily match the lifetime of the application-level connection. This necessitates keep-alive packets in some protocols. Protocols that allow their binding to expire may be open to a NAT rebinding attack, when a different party acquires access to the NAT's port allocation.

  2. Applications no longer know an address they can be reached at without outside help.

    Chosen port numbers may or may not be used by the NAT. The effective IP address and port are not knowable without observing from an outside peer.

  3. Filtering and mapping approaches both vary, and it's not generally possible to establish a connection without interactive probing.

    This is the reason ICE exists, but it's also a possible anonymity hazard. This risk is explored a bit further below in the context of interaction with other networks.
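The first issue, stateful mappings governed by a timer, can be illustrated with a toy model. This is a sketch under stated assumptions rather than a description of any real NAT: the `ToyNat` class, its lowest-free-port allocation policy, and the explicit `now` parameter are all hypothetical. The 120 second default reflects the two-minute minimum mapping timeout that RFC4787 requires.

```python
from dataclasses import dataclass


@dataclass
class Mapping:
    internal: tuple     # (internal address, internal port)
    expires_at: float   # time at which the idle timer fires


class ToyNat:
    """Toy mapping table with an idle timeout (illustrative only)."""

    def __init__(self, ports, idle_timeout=120.0):
        self.ports = list(ports)
        self.idle_timeout = idle_timeout
        self.by_external_port = {}

    def _expire(self, now):
        stale = [p for p, m in self.by_external_port.items() if m.expires_at <= now]
        for port in stale:
            del self.by_external_port[port]

    def outbound(self, internal, now):
        """Return the external port for an internal endpoint, refreshing its timer."""
        self._expire(now)
        for port, mapping in self.by_external_port.items():
            if mapping.internal == internal:
                mapping.expires_at = now + self.idle_timeout
                return port
        # Assumed policy: hand out the lowest free port, which makes reuse easy to see.
        for port in self.ports:
            if port not in self.by_external_port:
                self.by_external_port[port] = Mapping(internal, now + self.idle_timeout)
                return port
        raise RuntimeError("external port pool exhausted")

    def inbound(self, external_port, now):
        """Return the internal endpoint a reply would be routed to, or None."""
        self._expire(now)
        mapping = self.by_external_port.get(external_port)
        return mapping.internal if mapping else None
```

In this model a keep-alive is simply another call to `outbound` before the timer fires. If the application lets the mapping lapse, the next internal host to send may be given the same external port, and replies meant for the original host are then routed to the newcomer. That is the rebinding hazard described above.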

We can use the constraints of NAT both to understand application behavior and as an opportunity to model Tor's behavior as a type of NAT. In fact, Tor's many exit nodes already share similarity with some types of carrier-grade NAT. Applications will need to assume very little about the IP address their outbound UDP originates on, and we can use that to our advantage in implementing UDP for Tor.

This body of work is invaluable for understanding the scope of the problem and for defining common terminology. Let's take inspiration from these documents while also keeping in mind that the analogy between Tor and a NAT is imperfect. For example, in analyzing Tor as a type of carrier-grade NAT, we may consider the "pooling behavior" defined in RFC4787: the choice of which external addresses map to an internal address. Tor by necessity must carefully limit how predictable these mappings can ever be, to preserve its anonymity properties. A literal application of RFC6888 would find trouble in REQ-2 and REQ-9, as well as the various per-subscriber limiting requirements.

Mapping and Filtering Behaviors

RFC4787 defines a framework for understanding the behavior of NAT by analyzing both its "mapping" and "filtering" behavior separately. Mappings are the NAT's unit of state tracking. Filters are layered on top of mappings, potentially rejecting incoming datagrams that don't match an already-expected address. Both RFC4787 and the demands of peer-to-peer applications make a good case for always using an Endpoint-Independent Mapping.

Choice of filtering strategy is left open by the BEHAVE-WG recommendations. RFC4787 does not make one single recommendation for all circumstances; instead, it defines three behavior options with different properties:

  • Endpoint-Independent Filtering allows incoming datagrams from any peer once a mapping has been established.

    RFC4787 recommends this approach, with the concession that it may not be ideal for all security requirements.

    This technique cannot be safely applied in the context of Tor. It makes traffic injection attacks possible from any source address, provided you can guess the UDP port number used at an exit. It also makes possible clear-net hosting of UDP servers using an exit node's IP, which may have undesirable abuse properties.

    This permissive filter is also incompatible with our proposed mitigation to local port exhaustion on exit relays. Even with per-circuit rate limiting, an attacker could trivially overwhelm the local port capacity of all combined UDP-capable Tor exits.

    It is still common for present-day applications to prefer endpoint-independent filtering, as it allows incoming connections from peers which cannot use STUN or a similar address fixing protocol. Choosing endpoint-independent filtering would have some compatibility benefit, but among modern protocols which use ICE and STUN there would be no improvement. The cost, on the other hand, would be an uncomfortably low-cost traffic injection attack and additional risks toward exit nodes.

  • Address-Dependent Filtering

    This is a permitted alternative according to RFC4787, in which incoming datagrams are allowed from only IP addresses we have previously sent to, but any port on that IP may be the sender.

    The intended benefits of this approach versus the port-dependent filtering below are unclear, and may no longer be relevant. In theory they would be:

    • To support a class of applications that rely on, for a single local port, multiple remote ports achieving filter acceptance status when only one of those ports has been sent a datagram. We are currently lacking examples of applications in this category. Any application using ICE should be outside this category, as each port would have its own connectivity check datagrams exchanged in each direction.

    • REQ-8 in RFC4787 claims the existence of a scenario in which this approach facilitates ICE connections with a remote peer that disregards REQ-1 (the peer does not use Endpoint-Independent Mapping). It is not clear that this claim is still relevant.

    One security hazard of address-dependent and non-port-dependent filtering, identified in RFC4787, is that a peer on a NAT effectively negates the security benefits of this host filtering. In fact, this should raise additional red flags as applied to either Tor or carrier grade NAT. If supporting peer-to-peer applications, it should be commonplace to establish UDP flows between two Tor exit nodes. When this takes place, non-port-dependent filtering would then allow anyone on Tor to connect via those same nodes and perform traffic injection. The resulting security properties really become uncomfortably similar to endpoint-independent filtering.

  • Address- and Port-Dependent Filtering

    This is the strictest variety of filtering, and it is an allowed alternative under RFC4787. It provides opportunities for increased security and opportunities for reduced compatibility, both of which in practice may depend on other factors.

    For every application we've analyzed so far, port-dependent filtering is not a problem. Usage of ICE will open all required filters during the connectivity check phase.

    This is the only type of filtering that provides any barrier at all against cross-circuit traffic injection when the communicating parties are known.

RFC4787 recommends that filtering style be configurable. We would like to implement that advice, but only to the extent it can be done safely and meaningfully in the context of an anonymity system. When possible, allowing applications to optionally request Address-Dependent Filtering would provide additional compatibility at no mandatory cost. Otherwise, Address- and Port-Dependent Filtering is the most appropriate default setting.
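The three filtering behaviors can be compared with a small model. This is only a sketch of the RFC4787 definitions as summarized above; the `Filtering` enum and `MappingFilter` class are hypothetical names, and a real implementation would also need timers and per-circuit state.

```python
from enum import Enum


class Filtering(Enum):
    ENDPOINT_INDEPENDENT = "endpoint-independent"
    ADDRESS_DEPENDENT = "address-dependent"
    ADDRESS_AND_PORT_DEPENDENT = "address- and port-dependent"


class MappingFilter:
    """Filter state layered on top of one established mapping."""

    def __init__(self, mode=Filtering.ADDRESS_AND_PORT_DEPENDENT):
        self.mode = mode
        self.sent_to = set()  # (address, port) pairs we have transmitted to

    def note_outbound(self, peer):
        """Record that a datagram was sent to this (address, port)."""
        self.sent_to.add(peer)

    def accepts(self, source):
        """Decide whether an incoming datagram from source passes the filter."""
        if self.mode is Filtering.ENDPOINT_INDEPENDENT:
            return True
        if self.mode is Filtering.ADDRESS_DEPENDENT:
            return any(addr == source[0] for addr, _port in self.sent_to)
        return source in self.sent_to
```

Suppose the mapping has sent one datagram to port 3478 on another exit at 192.0.2.10. Under address-dependent filtering, any other port on 192.0.2.10 is then accepted as well, which is the exit-to-exit injection hazard described above. Only the address- and port-dependent mode rejects it.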

Common Protocols

Applications that want to use UDP are increasingly making use of higher-level protocols to avoid creating bespoke solutions for problems like NAT traversal, connection establishment, and reliable delivery.

This section looks at how these protocols affect Tor's UDP traffic requirements.

QUIC

RFC9000 defines QUIC, a multiplexed secure point-to-point protocol which supports reliable and unreliable delivery. The most common use is as an optional HTTP replacement, especially among Google services.

QUIC does not normally try to traverse NAT; as an HTTP replacement, the server is expected to have an address reachable without any prior connection setup.

QUIC provides its own flexible connection lifetimes which may outlive individual network links or NAT mappings. The intention is to provide transparent roaming as mobile users change networks. This automated path discovery opens additional opportunities for malicious traffic, for which the RFC also offers mitigations. See path validation in section 8.2, and the additional mitigations from section 9.3.

When QUIC is used as an optional upgrade path, we must compare any proposed UDP support against the baseline of a non-upgraded original connection. In these cases the goal is not a compatibility enhancement but an avoidance of regression.

In cases where QUIC is used as a primary protocol without TCP fallback, UDP compatibility will be vital. These applications are currently niche but we expect they may rise in popularity.

WebRTC

WebRTC is a large collection of protocols tuned to work together for media transport and NAT traversal. It is increasingly common, both for browser-based telephony and for peer-to-peer data transfer. Non-browser apps often implement WebRTC as well, for example using libwebrtc. Even non-WebRTC apps sometimes have significant overlaps in their technology stacks, due to the independent history of ICE, RTP, and SDP adoption.

Of particular importance to us, WebRTC uses the Interactive Connectivity Establishment (ICE) protocol to establish a bidirectional channel between endpoints that may or may not be behind a NAT with unknown configuration.

Any generalized solution to connection establishment, like ICE, will require sending connectivity test probes. These have an inherent hazard to anonymity: assuming no delays are inserted intentionally, the result is a broadcast of similar traffic across all available network interfaces. This could form a convenient correlation beacon for an attacker attempting to de-anonymize users who use WebRTC over a Tor VPN. This is the risk enumerated below as interaction with other networks.

See RFC8825 Overview: Real-Time Protocols for Browser-Based Applications, RFC8445 Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal, RFC8838 Trickle ICE: Incremental Provisioning of Candidates for the Interactive Connectivity Establishment (ICE) Protocol, RFC5389 Session Traversal Utilities for NAT (STUN), and others.

Common Applications

With applications exhibiting such a wide variety of behaviors, how do we know what to expect from a good implementation? How do we know which compatibility decisions will be most important to users? For this it's helpful to look at specific application behaviors. This is a best-effort analysis conducted at a point in time. It's not meant to be a definitive reference; think of it as a site survey taken before planning a building.

In alphabetical order:

| Application | Type | Protocol features | Current behavior | Expected outcome |
| --- | --- | --- | --- | --- |
| BitTorrent | File sharing | Many peers per local port | Fails without UDP | Works, new source of nuisance traffic |
| BigBlueButton | Telecom | WebRTC, TURN, TURN-over-TLS | Works | Slight latency improvement |
| Discord | Telecom | Proprietary, client/server | Fails without UDP | Starts working |
| DNS | Infrastructure | Might want IP fragmentation | Limited | Full DNS support, for better and worse |
| FaceTime | Telecom | WebRTC, TURN, TURN-over-TCP | Works | Slight latency improvement |
| Google Meet | Telecom | STUN/TURN, TURN-over-TCP | Works | Slight latency improvement |
| Jitsi (meet.jit.si) | Telecom | WebRTC, TURN-over-TLS, Cloudflare | Fails on Tor | No change, problem does not appear UDP-related |
| Jitsi (docker-compose) | Telecom | WebRTC, centralized STUN only | Fails without UDP | Starts working |
| Linphone | Telecom (SIP) | SIP-over-TLS, STUN, TURN | Fails without UDP | Starts working |
| Signal | Telecom | WebRTC, TURN, TURN-over-TCP | Works | Slight latency improvement |
| Skype | Telecom | P2P, STUN, TURN-over-TLS | Works | Slight latency improvement |
| WhatsApp | Telecom | STUN, TURN-over-TCP. Multi server | Works | Slight latency improvement |
| WiFi Calling | Telecom | IPsec tunnel | Out of scope | Still out of scope |
| Zoom | Telecom | client/server or P2P, UDP/TCP | Works | Slight latency improvement |

Overview of Possible Solutions

This section starts to examine different high-level implementation techniques we could adopt. Broadly they can be split into datagram routing and tunneling.

Datagram Routing

These approaches seek to use a network that can directly route datagrams from place to place. These approaches are the most obviously suitable for implementing UDP, but they also form the widest departure from classic Tor.

Intentional UDP Leak

The simplest approach would be to allow UDP traffic to bypass the anonymity layer. This is an unacceptable loss of anonymity in many cases, given that the client's real IP address is made visible to web application providers.

In other cases, this is an acceptable or even preferable approach. For example, VPN users may be more concerned with achieving censorship-resistant network connectivity than hiding personal identifiers from application vendors.

In threat models where application vendors are more trustworthy than the least trustworthy Tor exits, it may be more appropriate to allow direct peer-to-peer connections than to trust Tor exits with unencrypted connection establishment traffic.

3rd Party Implementations

Another option would be to use an unrelated anonymizer system for datagram traffic. It's not clear that a suitable system already exists. I2P provides a technical solution for routing anonymized datagrams, but not a Tor-style infrastructure of exit node operators.

This points to the key weakness of relying on a separate network for UDP: Tor has an especially well-developed community of volunteers running relays. Any UDP solution that is inconvenient for relay operators has little chance of adoption.

Future Work on Tor

There may be room for future changes to Tor which allow it to somehow transfer and route datagrams directly, without a separate process of establishing circuits and tunnels. If this is practical it may prove to be the simplest and highest performance route to achieving high quality UDP support in the long term. A specific design is out of the scope of this document.

It is worth thinking early about how we can facilitate combinations of approaches. Even without bringing any new network configurations to Tor, achieving interoperable support for both exit nodes and onion services in a Tor UDP implementation requires some attention to how multiple UDP providers can share protocol responsibilities. This may warrant the introduction of some additional routing layer.

Tunneling

The approaches in this section add a new construct which does not exist in UDP itself: a point-to-point tunnel between clients and some other location at which they establish the capability to send and receive UDP datagrams.

Any tunneling approach requires some way to discover tunnel endpoints. For the best usability and adoption this should come as an extension of Tor's existing process for distributing consensus and representing exit policy.

In practice, exit policies for UDP will have limited diversity. VPN implementations will need to know ahead of time which tunnel circuits to build, or they will suffer a significant spike in latency for the first outgoing datagram to a new peer. Additionally, it's common for UDP port numbers to be randomly assigned. This would make highly specific Tor exit policies even less useful and even higher overhead than they are with TCP.

TURN Encapsulated in Tor Streams

The scope of this tunnel is quite similar to the existing TURN relays, used commonly by WebRTC applications to implement fallbacks for clients who cannot find a more direct connection path.

TURN is defined by RFC8656 as a set of extensions built on the framework from STUN in RFC8489. The capabilities are a good match for our needs, offering clients the ability to encapsulate UDP datagrams within a TCP stream, and to allocate local port mappings on the server.

TURN was designed to be a set of modular and extensible pieces, which might be too distant from Tor's design philosophy of providing single canonical representations. Any adoption of TURN will need to consider the potential for malicious implementations to mark traffic, facilitating de-anonymization attacks.

TURN has a popular embeddable C-language implementation, coturn, which may be suitable for including alongside or inside C tor.

Tor Stream Tunnel to an Exit

Most of the discussion on UDP implementation in Tor so far has assumed this approach. Essentially it's the same strategy as TCP exits, but for UDP. When the OP initializes support for UDP, it pre-builds circuits to exits that support required UDP exit policies. These pre-built circuits can then be used as tunnels for UDP datagrams.

Within this overall approach, there are various ways to assign Tor streams for the UDP traffic. This will be considered below.

Tor Stream Tunnel to a Rendezvous Point

To implement onion services which advertise UDP ports, we can use additional tunnels. A new type of tunnel could end at a rendezvous point rather than an exit node. Clients could establish the ability to allocate a temporary virtual datagram mailbox at these rendezvous nodes.

This leaves more open questions about how outgoing traffic is routed, and which addressing format would be used for the datagram mailbox. The most immediate challenge in UDP rendezvous would then become application support. Protocols like STUN and ICE deal directly with IPv4 and IPv6 formats in order to advertise a reachable address to their peer. Supporting onion services in WebRTC would require protocol extensions and software modifications for STUN, TURN, ICE, and SDP at minimum.

UDP-like rendezvous extensions would have limited meaning unless they form part of a long-term strategy to forward datagrams in some new way for enhanced performance or compatibility. Otherwise, application authors might as well stick with Tor's existing reliable circuit rendezvous functionality.

Specific Designs Using Tor Streams

Let's look more closely at Tor streams, the multiplexing layer right below circuits.

Streams have an opaque 16-bit identifier, allocated from the onion proxy (OP) endpoint. Stream lifetimes are still subject to some slight ambiguity in the Tor spec. They are always allocated from the OP end but may be destroyed asynchronously by either circuit endpoint.

We have an opportunity to use this additional existing multiplexing layer to serve a useful function in the new protocol, or we can opt to interact with streams as little as possible in order to keep the protocol features more orthogonal.

One Stream per Tunnel (VPN)

It's possible to transport arbitrary UDP traffic using only a single Tor stream ID within the circuit. In this case, both the source and destination address would need to be represented somehow within an added per-datagram header.

The most common examples of this approach might be TUN/TAP forwarding over SSH, or any VPN that uses a TCP transport.

This design might be preferable if 16-bit stream IDs are found to be in short supply, but otherwise we would expect some benefit from using multiple stream IDs and therefore allowing a shorter per-datagram header.

One Stream per Local Port (TURN)

With the standard TURN relay protocol, implemented over a TCP transport, the "allocation" of a relayed port can be performed once per tuple of (protocol, source IP, source port, destination IP, destination port). Put another way, TURN does not amplify the number of local ports available. An application that wishes to use two relayed ports will need two separate TCP connections, or in this context two separate Tor stream IDs.

As with the SSH example above, TURN could be implemented without any new Tor message types. TURN adds a 4-byte "channel data" header to normal datagrams, to facilitate message framing and peer identification.
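To make the 4-byte header concrete, here is a sketch of ChannelData framing over a stream transport, following the layout in RFC8656: a 16-bit channel number, a 16-bit payload length, then the payload, padded to a multiple of four bytes on stream transports. The function names are illustrative assumptions, and this is not a complete TURN client; allocation, channel binding, and authentication are omitted.

```python
import struct

CHANNEL_MIN, CHANNEL_MAX = 0x4000, 0x4FFF  # allowed channel numbers in RFC8656


def encode_channel_data(channel, payload):
    """Frame one datagram as a ChannelData message for a stream transport."""
    if not CHANNEL_MIN <= channel <= CHANNEL_MAX:
        raise ValueError("channel number out of range")
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a 16-bit length field")
    header = struct.pack("!HH", channel, len(payload))
    # Padding is not counted in the length field.
    padding = b"\x00" * (-len(payload) % 4)
    return header + payload + padding


def decode_channel_data(buffer):
    """Return (channel, payload, remaining_bytes), or None if more data is needed."""
    if len(buffer) < 4:
        return None
    channel, length = struct.unpack_from("!HH", buffer)
    padded = length + (-length % 4)
    if len(buffer) < 4 + padded:
        return None
    return channel, buffer[4:4 + length], buffer[4 + padded:]
```

The peer address never appears in the frame. It is implied by the channel number, which the client bound to a peer earlier, and this is how TURN keeps its per-datagram overhead at four bytes plus padding.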

We could choose to implement protocol support for bundling TURN relay functionality with a normal Tor exit node. In that case, we would want a single new Tor message type:

  • CONNECT_TURN

    • Establish a stream as a connection to the exit relay's built-in (or configured) TURN server.

      This would logically be a TURN-over-TCP connection, though it does not need to correspond to any real TCP socket if the TURN server is implemented in-process with tor.

Note that RFC8656 requires authentication before data can be relayed. This is arguably a good default practice for the internet, but it is the opposite of what Tor is trying to do. We would either deviate from the specification to relax this auth requirement, or provide a way for clients to discover credentials: perhaps by fixing them ahead of time or by including them in the relay descriptor.

If authentication is used, we must consider the protocol's privacy limitations. STUN/TURN usernames are conveyed in plaintext unless an additional TLS layer is also in use. For anonymity, use a randomly generated username. The draft-uberti-behave-turn-rest-00 document describes a method for generating time-limited credentials, as implemented through --use-auth-secret in coturn. Username timestamps should be randomized, to avoid exposing the precise local system clock.
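A sketch of how such time-limited credentials might be generated follows. It assumes the scheme from draft-uberti-behave-turn-rest-00 as commonly deployed: the username is an expiry timestamp and a user ID joined by a colon, and the password is the base64-encoded HMAC-SHA1 of that username under a shared secret. The function name, the `jitter` parameter, and the way the shared secret reaches the client are all assumptions for illustration.

```python
import base64
import hashlib
import hmac
import secrets
import time


def make_turn_credentials(shared_secret, lifetime=3600, jitter=600, now=None):
    """Return (username, password) with a random user ID and a jittered expiry.

    shared_secret is bytes; lifetime and jitter are in seconds.
    """
    now = time.time() if now is None else now
    # Random jitter keeps the timestamp from exposing the precise local clock.
    expiry = int(now) + lifetime + secrets.randbelow(jitter + 1)
    # A fresh random user ID avoids linking allocations to each other.
    username = f"{expiry}:{secrets.token_hex(8)}"
    digest = hmac.new(shared_secret, username.encode(), hashlib.sha1).digest()
    password = base64.b64encode(digest).decode()
    return username, password
```

The jitter only coarsens what the timestamp reveals; it does not hide the clock entirely, so the jitter window would need to be chosen with that in mind.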

One Stream per Local Port (Proposal 339)

One stream per socket was the approach suggested in Proposal 339 by Nick Mathewson in 2020, defining a per-local-port approach using slightly different terminology than TURN.

Proposal 339 chooses a different UDP subset to support, but otherwise the main technical distinction vs. TURN's per-local-port stream comes down to the framing layer. TURN defines an additional header to be used within a TCP-like stream, whereas proposal 339 avoids the two redundant length bytes by defining new Tor message types: CONNECT_UDP, CONNECTED_UDP, and DATAGRAM.

Similar to TURN, each stream's lifetime would match the lifetime of a local port allocation. Unlike TURN, there would be a single peer (remote address, remote port) allowed per local port. This matches the usage of BSD-style sockets on which connect() has completed, but it's incompatible with many of the applications analyzed. Multiple peers are typically needed for a variety of reasons, like connectivity checks or multi-region servers.

This approach would be simplest to implement and specify, especially in the existing C tor implementation. It also unfortunately has very limited compatibility, and no clear path toward incremental upgrades if we wish to improve compatibility later.

A simple one-to-one mapping between streams and sockets would preclude the optimizations necessary to address local port exhaustion risks below. Solutions under this design are possible, but only by decoupling logical protocol-level sockets from the ultimate implementation-level sockets and reintroducing much of the complexity that we attempted to avoid by choosing this design.

One Stream per Local Port (NAT Mapping)

We could improve the compatibility of Proposal 339 while retaining its very low per-datagram overhead by updating it to align with terminology and requirements from RFC4787 where practical.

This approach would use each stream to represent one endpoint-independent mapping, for use as a local UDP port in communication with multiple peers.

A mapping would always be allocated from the OP (client) side. It could explicitly specify a filtering style, if we wish to allow applications to request non-port-dependent filtering for compatibility. Each datagram within the stream would still need to be tagged with a peer address/port in some way.

This approach would involve a single new type of stream, and two new messages that pertain to these mapping streams:

  • NEW_UDP_MAPPING

    • Always client-to-exit.
    • Creates a new mapping, with a specified stream ID.
    • Succeeds instantly; no reply is expected, early data is ok.
    • Externally-visible local port number is arbitrary, and must be determined through interaction with other endpoints.
    • Might contain an IP "don't fragment" flag.
    • Might contain a requested filtering mode.
    • Lifetime is until circuit teardown or END message.
  • UDP_MAPPING_DATAGRAM

    • Conveys one datagram on a stream previously defined by NEW_UDP_MAPPING.
    • Includes peer address (IPv4/IPv6) as well as datagram content.
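
To make the per-datagram addressing cost concrete, here is a minimal sketch of how a UDP_MAPPING_DATAGRAM body could be framed. The layout (a one-byte address type, the packed address, a two-byte port, then the payload) and the names `encode_mapping_datagram` and `decode_mapping_datagram` are assumptions made for illustration; this proposal does not define a wire format.

```python
import ipaddress
import struct

# Hypothetical address-type tags; not defined anywhere in the proposal.
ADDR_IPV4 = 4
ADDR_IPV6 = 6


def encode_mapping_datagram(peer_ip: str, peer_port: int, payload: bytes) -> bytes:
    """Pack (address type, address, port, payload) into one message body."""
    addr = ipaddress.ip_address(peer_ip)
    tag = ADDR_IPV4 if addr.version == 4 else ADDR_IPV6
    return struct.pack("!B", tag) + addr.packed + struct.pack("!H", peer_port) + payload


def decode_mapping_datagram(body: bytes) -> tuple[str, int, bytes]:
    """Inverse of encode_mapping_datagram; raises ValueError on malformed input."""
    if not body:
        raise ValueError("empty body")
    addr_len = {ADDR_IPV4: 4, ADDR_IPV6: 16}.get(body[0])
    if addr_len is None:
        raise ValueError("unknown address type")
    header_len = 1 + addr_len + 2
    if len(body) < header_len:
        raise ValueError("truncated header")
    addr = ipaddress.ip_address(body[1:1 + addr_len])
    (port,) = struct.unpack("!H", body[1 + addr_len:header_len])
    return str(addr), port, body[header_len:]
```

Under this assumed layout every datagram carries 7 bytes of addressing for an IPv4 peer and 19 bytes for an IPv6 peer, which is the repeated overhead that a per-peer identifier would avoid.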

This updated approach is more compatible than Proposal 339 but it's not yet as efficient as TURN. Peers are almost always repeated over the course of a tunnel's lifetime, so the header space needed for addressing could be compressed by keeping some persistent information about peers as well as mappings.

One Stream per Flow

One stream per flow has also been suggested. Specifically, Mike Perry brought this up during our conversations about UDP recently and we spent some time analyzing it from an RFC4787 perspective. This approach has some interesting properties but also hidden complexity that may ultimately make other options more easily applicable.

This would assign a stream ID to the tuple consisting of at least (local port, remote address, remote port). Additional flags may be included for features like transmit and receive filtering, IPv4/v6 choice, and IP Don't Fragment.

This has advantages in keeping the datagram cells simple, with no additional IDs beyond the existing circuit ID. It may also have advantages in DoS-prevention and in privacy analysis.

Stream lifetimes, in this case, would not have any specific meaning other than the lifetime of the ID itself. The bundle of flows associated with one local port would still all be limited to the lifetime of a Tor circuit, by scoping the local port identifier to be contained within the lifetime of its circuit.

It would be necessary to allocate a new stream ID any time a new (local port, remote address, remote port) tuple is seen. This would most commonly happen as a result of a first datagram sent to a new peer, coinciding with the establishment of a NAT-style mapping and the possible allocation of a socket on the exit.

A less common case needs to be considered too: what if the parameter tuple first occurs on the exit side? We don't yet have a way to allocate stream IDs from either end of a circuit. This would need to be considered. One simple solution would be to statically partition the stream ID space into a portion that can be independently allocated by each side.
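
A static partition could look like the following sketch. It assumes the existing 16-bit stream ID space with zero reserved, and it arbitrarily gives the lower half to the client and the upper half to the exit; the class name `StreamIdAllocator` and the choice of split are hypothetical, not something this proposal specifies.

```python
class StreamIdAllocator:
    """Allocates 16-bit stream IDs from one half of the ID space.

    Assumption for illustration: the client owns 0x0001-0x7FFF and the
    exit owns 0x8000-0xFFFF, so neither side can collide with the other.
    """

    def __init__(self, is_exit: bool) -> None:
        self.low, self.high = (0x8000, 0xFFFF) if is_exit else (0x0001, 0x7FFF)
        self.next_id = self.low
        self.in_use: set[int] = set()

    def allocate(self) -> int:
        """Return a free ID from this side's partition, scanning round-robin."""
        span = self.high - self.low + 1
        for _ in range(span):
            candidate = self.next_id
            self.next_id = self.low + (self.next_id - self.low + 1) % span
            if candidate not in self.in_use:
                self.in_use.add(candidate)
                return candidate
        raise RuntimeError("no free stream IDs in this partition")

    def release(self, stream_id: int) -> None:
        """Mark an ID as free again, for example after an END message."""
        self.in_use.discard(stream_id)

    def owns(self, stream_id: int) -> bool:
        """True if this side is the one allowed to allocate stream_id."""
        return self.low <= stream_id <= self.high
```

With a split like this, a receiver can reject any newly seen stream ID that falls in its own partition, since only the other side should have been able to allocate it.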

When is this exit-originated stream ID allocation potentially needed? It is clearly needed when using address-dependent filtering. An incoming datagram from a previously-unseen peer port is expected to be deliverable, and the exit would need to allocate an ID for it.

Even with the stricter address and port-dependent filtering, clients may still be exposed to exit-originated stream IDs if there are mismatches in the lifetime of the filter and the stream.

This approach thus requires some attention to either correctly allocating stream IDs on both sides of the circuit, or choosing a filtering strategy and filter/mapping lifetime that does not ever leave stream IDs undefined when expecting incoming datagrams.

Hybrid Mapping and Flow Approach

We can extend the approach above with an optimization that addresses the undesirable space overhead from redundant address headers. This uses two new types of stream, in order to have streams per mapping and per flow at the same time.

The per-mapping stream remains the sole interface for managing the lifetime of a mapped UDP port. Mappings are created explicitly by the client. As an optimization, within the lifetime of a mapping there may exist some number of flows, each assigned their own ID.

This tries to combine the strengths of both approaches, using the lifetime of one stream to define a mapping and to carry otherwise-unbundled traffic while also allowing additional streams to bundle datagrams that would otherwise have repetitive headers. It avoids the space overhead of a purely per-mapping approach and avoids the ID allocation and lifetime complexity introduced with the per-flow approach.

This approach takes some inspiration from TURN, where commonly used peers will be defined as a "channel" with an especially short header. Incoming datagrams with no channel can always be represented in the long form, so TURN never has to allocate channels unexpectedly.

The implementation here could be a strict superset of the per-mapping implementation, adding new commands for flows while retaining existing behavior for mappings. There would be a total of four new message types:

  • NEW_UDP_MAPPING

    • Same as above.
  • UDP_MAPPING_DATAGRAM

    • Same as above.
  • NEW_UDP_FLOW

    • Allocates a stream ID as a flow, given the ID to be allocated and the ID of its parent mapping stream.
    • Includes a peer address (IPv4/IPv6).
    • The flow has a lifetime strictly bounded by the outer mapping. It is deleted by an explicit END or when the mapping is de-allocated for any reason.
  • UDP_FLOW_DATAGRAM

    • Datagram contents only, without address.
    • Only appears on flow streams.

We must consider the traffic marking opportunities provided when allowing an exit to represent one incoming datagram as either a flow or mapping datagram.

It's possible this traffic injection potential is not worse than the baseline amount of injection potential that every UDP protocol presents. See more on risks below. For this hybrid stream approach specifically, there's a limited mitigation available which allows exits only a bounded amount of leaked information per UDP peer:

Ideally, exits would never send a UDP_MAPPING_DATAGRAM when they could have sent a UDP_FLOW_DATAGRAM. Sometimes it is genuinely unclear, though: an exit may have received the datagram in between processing NEW_UDP_MAPPING and NEW_UDP_FLOW. A partial mitigation would terminate circuits which send a UDP_MAPPING_DATAGRAM for a peer that has already been referenced in a UDP_FLOW_DATAGRAM. The exit is thereby given a one-way gate allowing it to switch from using mapping datagrams to using flow datagrams at some point, but not to switch back and forth repeatedly.
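
The one-way gate can be sketched as a small client-side check kept per mapping. The names `OneWayGate` and `CircuitViolation` are hypothetical, and the sketch assumes the caller has already resolved each flow stream to the peer given in its NEW_UDP_FLOW message; actual circuit teardown is left to the caller.

```python
class CircuitViolation(Exception):
    """Raised when the exit breaks the one-way gate; the caller tears down the circuit."""


class OneWayGate:
    """Tracks, for one mapping, which peers have been seen in flow datagrams."""

    def __init__(self) -> None:
        self.peers_seen_in_flow: set[tuple[str, int]] = set()

    def on_flow_datagram(self, peer: tuple[str, int]) -> None:
        # Once a peer has arrived via a flow stream, the gate closes for it.
        self.peers_seen_in_flow.add(peer)

    def on_mapping_datagram(self, peer: tuple[str, int]) -> None:
        # Mapping datagrams are fine for peers never seen in a flow datagram.
        if peer in self.peers_seen_in_flow:
            raise CircuitViolation(
                f"mapping datagram for {peer} after a flow datagram"
            )
```

An exit can therefore move a peer from the long form to the short form at most once per mapping, which bounds how much it can signal through that choice.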

Mappings that do not request port-specific filtering may always get unexpected UDP_MAPPING_DATAGRAMs. Mappings that do use port-specific filtering could make a flow for their only expected peers, then expect to never see UDP_MAPPING_DATAGRAM.

NEW_UDP_MAPPING could have an option requiring that only UDP_FLOW_DATAGRAM is to be used, never UDP_MAPPING_DATAGRAM. This would remove the potential for ambiguity, but at a cost in compatibility, as it would no longer be possible to implement non-port-specific filtering.

Risks

Any proposed UDP support involves significant risks to user privacy and software maintainability. This section elaborates some of these risks, so they can be compared against expected benefits.

Behavior Regressions

In some applications it is possible that Tor's implementation of a UDP compatibility layer will cause a regression in the ultimate level of performance or security.

Performance regressions can occur accidentally due to bugs or compatibility glitches. They may also occur for more fundamental reasons of protocol layering, such as the redundant error correction layers that result when tunneling QUIC over TCP. These performance degradations are expected to be minor, but there's some unavoidable risk.

The risk of severe performance or compatibility regressions may be mitigated by giving users a way to toggle UDP support per-application.

Privacy and security regressions have more severe consequences and they can be much harder to detect. There are straightforward downgrades, like WebRTC apps that give up TURN-over-TLS for plaintext TURN-over-UDP. More subtly, the act of centralizing connection establishment traffic in Tor exit nodes can make users an easier target for other attacks.

Bandwidth Usage

We should expect an increase in overall exit bandwidth requirements due to peer-to-peer file sharing applications.

Current users attempting to use BitTorrent over Tor are hampered by the lack of UDP compatibility. Interoperability with common file-sharing peers would make Tor more appealing to users with a large and sustained appetite for anonymized bandwidth.

Local Port Exhaustion

Exit routers will have a limited number of local UDP ports. In the most constrained scenario, an exit may have a single IP with 16384 or fewer ephemeral ports available. These ports could each be allocated by one client for an unbounded amount of exclusive use.

In order to enforce high levels of isolation between different subsequent users of the same local UDP port, we may wish to enforce an extended delay between allocations during which nobody may own the port. Effective isolation requires this timer duration to be greater than any timer encountered on a peer or a NAT. In RFC4787's recommendations a NAT's mapping timer must be longer than 2 minutes. Our timer should ideally be much longer than 2 minutes.

An attacker who allocates ports for only this minimum duration of 2 minutes would need to send 136.5 requests per second to achieve sustained use of all available ports. With multiple simultaneous clients this could easily be done while bypassing per-circuit rate limiting.
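
The figure above follows directly from dividing the port count by the hold time. As a small worked check, using the two numbers already given in this section (the function name `required_request_rate` is just for illustration):

```python
EPHEMERAL_PORTS = 16384    # most constrained exit described above
MIN_HOLD_SECONDS = 2 * 60  # minimum hold time used in the estimate above


def required_request_rate(ports: int, hold_seconds: float) -> float:
    """Requests per second needed to keep every port continuously allocated."""
    return ports / hold_seconds


rate = required_request_rate(EPHEMERAL_PORTS, MIN_HOLD_SECONDS)
print(f"{rate:.1f} requests per second")
```

The same function can be reused to see how the estimate changes for other port counts or hold times.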

The expanded definition of "port overlapping" from RFC7857 section 3 may form at least a partial mitigation:

"This document clarifies that this port overlapping behavior may be extended to connections originating from different internal source IP addresses and ports as long as their destinations are different."

This gives us an opportunity for a vast reduction in the number of required ports and file descriptors. Exit routers can automatically allocate local ports for use with a specific peer when that peer is first added to the client's filter.

Due to the general requirements of NAT traversal, UDP applications with any NAT support will always need to communicate with a relatively well known server prior to any attempts at peer-to-peer communication. This early peer could be an entire application server, or it could be a STUN endpoint. In any case, the identity of this first peer gives us a hint about the set of all potential peers.

Within the exit router, each local port will track a separate mapping owner for each peer. When processing that first outgoing datagram, the exit may choose any local port where the specific peer is not taken. Subsequent outgoing datagrams on the same port may communicate with a different peer, and there's no guarantee all these future peers will be claimed successfully.
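
The bookkeeping described here can be sketched as a table of per-peer owners for each local port. This is only an illustration of the heuristic, not a proposed implementation: `PortOverlapTable`, its `send` method, and the integer mapping IDs are hypothetical names, and timers, filtering, and actual sockets are all omitted.

```python
Peer = tuple[str, int]  # (remote ip, remote port)


class PortOverlapTable:
    """Lets several mappings share a local port while their peers differ."""

    def __init__(self, local_ports: list[int]) -> None:
        # local port -> {peer -> id of the mapping that owns that peer}
        self.owners: dict[int, dict[Peer, int]] = {p: {} for p in local_ports}
        # mapping id -> local port chosen on its first outgoing datagram
        self.port_of: dict[int, int] = {}

    def send(self, mapping_id: int, peer: Peer) -> bool:
        """Try to claim `peer` for `mapping_id`; False means un-claimable."""
        port = self.port_of.get(mapping_id)
        if port is None:
            # First outgoing datagram: pick any port where this peer is free.
            for candidate, claims in self.owners.items():
                if peer not in claims:
                    port = candidate
                    break
            else:
                return False  # this peer is already taken on every port
            self.port_of[mapping_id] = port
        # Later datagrams stay on the same port; a claim may fail here.
        owner = self.owners[port].setdefault(peer, mapping_id)
        return owner == mapping_id
```

For example, with a single local port, two mappings that first contact different STUN servers end up sharing that port, and whichever one contacts a given third peer first prevents the other from reaching that peer.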

When is this a problem? An un-claimable peer represents a case where the exact (local ip, local port, remote ip, remote port) tuple is in use already for a different mapping in some other Tor stream. For example, imagine two clients are running different types of telecom apps which are nevertheless inter-compatible and capable of both calling the same peers. Alternatively, consider the same app but with servers in several regions. The two apps will begin by communicating with different sets of peers, due to different application servers and different bundled STUN servers. This is our hint that it's likely appropriate to overlap their local port allocations. At this point, both of these applications may be successfully sharing a (local ip, local port) tuple on the exit. As soon as one of these apps calls a peer with some (remote ip, remote port), the other app will be unable to contact that specific peer.

The lack of connectivity may seem like a niche inconvenience, and perhaps that is the extent of the issue. However, it seems likely this heuristic could result in a problematic information disclosure under some circumstances, and it deserves closer study.

Application Fingerprinting

UDP applications present an increased surface of plaintext data that may be available for user fingerprinting by malicious exits.

Exposed values can include short-lived identifiers like STUN usernames. Typically it will also be possible to determine what type of software is in use, and maybe what version of that software.

Short-lived identifiers are still quite valuable to attackers, because they may reliably track application sessions across changes to the Tor exit. If longer-lived identifiers exist for any reason, that of course provides a powerful tool for call metadata gathering.

Peer-to-Peer Metadata Collection

One of our goals was to achieve the compatibility and perhaps performance benefits of allowing "peer-to-peer" (in our case really exit-to-exit) UDP connections. We expect this to enable the subset of applications that lack a fallback path which loops traffic through an app-provided server.

This goal may be at odds with our privacy requirements. At minimum, a pool of malicious exit nodes could passively collect metadata about these connections as a noisy proxy for call metadata.

Any additional signals like application fingerprints or injected markers may be used to enrich this metadata graph, possibly tracking users across sessions or across changes to the tunnel endpoint.

Interaction with Other Networks

Any application using an ICE-like interactive connection establishment scheme will easily leak information across network boundaries if it ever has access to multiple networks at once.

In applications that are not privacy conscious this is often desired behavior. For example, a video call to someone in your household may typically transit directly over Wifi, decreasing service costs and improving latency. This implies that some type of local identifier accompanies the call signaling info, allowing the devices to find each other's LAN address.

Privacy-preserving solutions for this use case are still an active area of standardization effort. The IETF draft on mDNS ICE candidates proposes one way to accomplish this by generating short-lived unique IDs which are only useful to peers with physical access to the same mDNS services.

Without special attention to privacy, the typical implementation is to share all available IP addresses and to initiate simultaneous connectivity tests using any IP pairs which cannot be trivially discarded. This applies to ICE as specified, but also to any proprietary protocol which operates in the same design space as ICE. This has multiple issues in a privacy-conscious environment: The IP address disclosures alone can be fatal to anonymity under common threat models. Even if meaningful IP addresses are not disclosed, the timing correlation from connectivity checks can provide confirmation beacons that alert an attacker to some connection between a LAN user and a Tor exit.

Traffic Injection

Some forms of UDP support would have obvious and severe traffic injection vulnerabilities. For example, the very permissive endpoint-independent filtering strategy would allow any host on the internet to send datagrams in bulk to all available local ports on a Tor exit in order to map that traffic's effect on any guards they control.

Any vehicle for malicious parties to mark traffic can be abused to de-anonymize users. Even if there is a more restrictive filtering policy, UDP's lack of sequence numbers makes header spoofing attacks considerably easier than in TCP. Third parties capable of routing datagrams to an exit with a spoofed source address could bypass filtering when the communicating parties are known or can be guessed. For example, a malicious superuser at an ISP without egress filtering could send packets with the source IPs set to various common DNS servers and application STUN/TURN servers. If a specific application is being targeted, only the exit node and local port numbers need to be guessed by brute force.

In case of malicious exit relays, whole datagrams can be inserted and dropped, and datagrams may be padded with additional data, both without any specific knowledge of the application protocol. With specific protocol insights, a malicious relay may make arbitrary edits to plaintext data.

Of particular interest is the plaintext STUN, TURN, and ICE traffic used by most WebRTC apps. These applications rely on higher-level protocols (SRTP, DTLS) to provide end-to-end encryption and authentication. A compromise at the connection establishment layer would not violate application-level end-to-end security requirements, making it outside the threat model of WebRTC but very much still a concern for Tor.

These attacks are not fully unique to the proposed UDP support, but UDP may increase exposure. In cases where the application already has a fallback using TURN-over-TLS, the proposal is a clear regression over previous behaviors. Even when comparing plaintext to plaintext, there may be a serious downside to centralizing all connection establishment traffic through a small number of exit IPs. Depending on your threat model, it could very well be more private to allow the UDP traffic to bypass Tor entirely.

Malicious Outgoing Traffic

We can expect UDP compatibility in Tor will give malicious actors additional opportunities to transmit unwanted traffic.

In general, exit abuse will need to be filtered administratively somehow. This is not unique to UDP support, and exit relay administration typically involves some type of filtering response tooling that falls outside the scope of Tor itself.

Exit administrators may choose to modify their exit policy, or to silently drop problematic traffic. Silent dropping is discouraged in most cases, as Tor prioritizes the accuracy of an exit's advertised policy. Detailed exit policies have a significant space overhead in the overall Tor consensus document, but it's still seen as a valuable resource for clients during circuit establishment.

Exit policy filtering may be less useful in UDP than with TCP due to the inconvenient latency spike when establishing a new tunnel. Applications that are sensitive to RTT measurements made during connection establishment may fail entirely when the tunnel cannot be pre-built.

This section lists a few potential hazards, but the real-world impact may be hard to predict owing to a diversity of custom UDP protocols implemented across the internet.

  • Amplification attacks against arbitrary targets

    These are possible only in limited circumstances where the protocol allows an arbitrary reply address, like SIP. The peer is often at fault for having an overly permissive configuration. Nevertheless, any of these easy amplification targets can be exploited from Tor with little consequence, creating a nuisance for the ultimate target and for exit operators.

  • Amplification attacks against an exit relay

    An amplification peer which doesn't allow arbitrary destinations can still be used to attack the exit relay itself or other users of that relay. This is essentially the same attack that is possible against any NAT the attacker is behind.

  • Malicious fragmented traffic

    If we allow sending large UDP datagrams over IPv4 without the Don't Fragment flag set, we allow attackers to generate fragmented IP datagrams. This is not itself a problem, but it has historically been a common source of inconsistencies in firewall behavior.

  • Excessive sends to an uninterested peer

    Whereas TCP mandates a successful handshake, UDP will happily send unlimited amounts of traffic to a peer that has never responded. To prevent denial of service attacks we have an opportunity and perhaps a responsibility to define our supported subset of UDP to include true bidirectional traffic but exclude continued sends to peers who do not respond.

    See also RFC7675 and STUN's concept of "Send consent".

  • Excessive number of peers

    We may want to place conservative limits on the maximum number of peers per mapping or per circuit, in order to make bulk scanning of UDP port space less convenient.

    The limit would need to be on peers, not stream IDs as we presently do for TCP. In this proposal stream IDs are not necessarily meaningful except as a representational choice made by clients.
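
As one illustration of the "excessive sends to an uninterested peer" item above, an exit could keep a small budget of unanswered datagrams per peer. This is a deliberately simplified, count-based sketch rather than the timer-based consent mechanism of RFC7675; the class name `SendConsent` and the default limit of 8 are arbitrary assumptions.

```python
Peer = tuple[str, int]  # (remote ip, remote port)


class SendConsent:
    """Allows only a few datagrams to a peer until that peer has replied."""

    def __init__(self, unanswered_limit: int = 8) -> None:
        self.unanswered_limit = unanswered_limit
        # peer -> datagrams sent since the last datagram received from it
        self.unanswered: dict[Peer, int] = {}

    def may_send(self, peer: Peer) -> bool:
        """Record one outgoing datagram, or refuse if the budget is spent."""
        count = self.unanswered.get(peer, 0)
        if count >= self.unanswered_limit:
            return False
        self.unanswered[peer] = count + 1
        return True

    def on_receive(self, peer: Peer) -> None:
        """Any datagram from the peer restores the full budget."""
        self.unanswered[peer] = 0
```

A limit on the number of distinct peers per mapping or per circuit could be enforced in the same place, by refusing to create new entries once the table reaches a chosen size.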

Next Steps

At this point we have perhaps too many possibilities for how to proceed. We could integrate UDP quite closely with Tor itself or not at all.

Choosing a route forward is both a risk/benefit tradeoff and a guess about the future of Tor. If we expect that some future version of Tor will provide its own datagram transport, we should plot a course aiming in that direction. If not, we might be better served by keeping compatibility features confined to separate protocol layers when that's practical.

Requiring a Long-Term Datagram Plan

If we choose to add new long-term maintenance burdens to our protocol stack, we should ensure they serve our long-term goals for UDP adoption as well as these shorter-term application compatibility goals.

This work so far has been done with the assumption that end-to-end datagram support is out of scope. If we intend to proceed down any path which encodes a datagram-specific protocol into Tor proper, we should prioritize additional protocol research and standardization work.

Alternatively, Modular Application-Level Support

Without a clear way to implement fully generic UDP support, we're left with application-level goals. Different applications may have contrasting needs, and we can only achieve high levels of both compatibility and privacy by delegating some choices to app authors or per-app VPN settings.

This points to an alternative in which UDP support is excluded from Tor as much as possible while still supporting application requirements. For example, this could motivate the TURN Encapsulated in Tor Streams design, or even simpler designs where TURN servers are maintained independently from the Tor exit infrastructure.

The next prototyping step we would need at this stage is a version of onionmasq extended to support NAT-style UDP mappings using TURN allocations provided through TURN-over-TCP or TURN-over-TLS. This can be a platform for further application compatibility experiments. It could potentially become a low-cost minimal implementation of UDP application compatibility, serving to assess which remaining user needs are still unmet.