Filename: 306-ipv6-happy-eyeballs.txt
Title: A Tor Implementation of IPv6 Happy Eyeballs
Author: Neel Chauhan
Created: 25-Jun-2019
Supercedes: 299
Status: Open
Ticket: https://trac.torproject.org/projects/tor/ticket/29801

1. Introduction

   As IPv4 address space becomes scarce, ISPs and organizations will deploy
   IPv6 in their networks. Right now, Tor clients connect to entry nodes using
   IPv4 connectivity by default.

   When networks first transition to IPv6, both IPv4 and IPv6 will be enabled
   on most networks in a so-called "dual-stack" configuration. This is to not
   break existing IPv4-only applications while enabling IPv6 connectivity.
   However, IPv6 connectivity may be unreliable and clients should be able
   to connect to the entry node using the most reliable technology, whether
   IPv4 or IPv6.

   In ticket #27490, we introduced the option ClientAutoIPv6ORPort which
   lets a client randomly choose between IPv4 or IPv6. However, this
   random decision does not take into account unreliable connectivity
   or falling back to the alternate IP version should one be unreliable
   or unavailable.

   One way to select between IPv4 and IPv6 on a dual-stack network is a
   so-called "Happy Eyeballs" algorithm as per RFC 8305. In one, a client
   attempts the preferred IP family, whether IPv4 or IPv6. Should it work,
   the client sticks with the preferred IP family. Otherwise, the client
   attempts the alternate version. This means if a dual-stack client has
   both IPv4 and IPv6, and IPv6 is unreliable, preferred or not, the
   client uses IPv4, and vice versa. However, if IPv4 and IPv6 are both
   equally reliable, and IPv6 is preferred, we use IPv6.

   In Proposal 299, we have attempted a IP fallback mechanism using failure
   counters and preferring IPv4 and IPv6 based on the state of the counters.
   However, Prop299 was not standard Happy Eyeballs and an alternative,
   standards-compliant proposal was requested in [P299-TRAC] to avoid issues
   from complexity caused by randomness.

   This proposal describes a Tor implementation of Happy Eyeballs and is
   intended as a successor to Proposal 299.

2. Address/Relay Selection

   This section describes the necessary changes for address selection to
   implement Prop306.

2.1. Address Handling Changes

   To be able to handle Happy Eyeballs in Tor, we will need to modify the
   data structures used for connections to entry nodes, namely the extend info
   structure.

   Entry nodes are usually guards, but some clients don't use guards:

     * Bootstrapping clients can connect to fallback directory mirrors or
       authorities

     * v3 single onion services can use IPv4 or IPv6 addresses to connect
       to introduction and rendezvous points, and

     * Clients can be configured to disable entry guards

   Bridges are out of scope for this proposal, because Tor does not support
   multiple IP addresses in a single bridge line.

   The extend info structure should contain both an IPv4 and an IPv6 address.
   This will allow us to try IPv4 and the IPv6 addresses should both be
   available on a relay and the client is dual-stack.

   When processing:
     * relay descriptors,
     * hard-coded authority and fallback directory lists,
     * onion service descriptors, or
     * onion service introduce cells,
   and filling in the extend info data structure, we need to fill in both the
   IPv4 and IPv6 address if both are available. If only one family is
   available for a relay (IPv4 or IPv6), we should leave the other family null.

2.2 Bootstrap Changes

   Tor's hard-coded authority and fallback directory mirror lists contain
   some entries with IPv6 ORPorts. As of January 2020, 56% of authorities and
   47% of fallback directories have IPv6.

   During bootstrapping, we should have an option for the maximum number of
   IPv4-only nodes, before the next node must have an IPv6 ORPort. The
   parameter is as follows:

     * MaxNumIPv4BootstrapAttempts NUM

   During bootstrap, the minimum fraction of nodes with IPv6 ORPorts will be
   1/(1 + MaxNumIPv4BootstrapAttempts). And the average fraction will be
   larger than both minimum fraction, and the actual proportion of IPv6
   ORPorts in the fallback directory list. (Clients mainly use fallback
   directories for bootstrapping.)

   Since this option is used during bootstrapping, it can not have a
   corresponding consensus parameter.

   The default value for MaxNumIPv4BootstrapAttempts should be 2. This
   means that every third bootstrap node must have an IPv6 ORPort. And on
   average, just over half of bootstrap nodes chosen by clients will have an
   IPv6 ORPort. This change won't have much impact on load-balancing, because
   almost half the fallback directory mirrors have IPv6 ORPorts.

   The minimum value of MaxNumIPv4BootstrapAttempts is 0. (Every bootstrap
   node must have an IPv6 ORPort. This setting is equivalent to
   ClientPreferIPv6ORPort 1.)

   The maximum value of MaxNumIPv4BootstrapAttempts should be 100. (Since
   most clients only make a few bootstrap connections, bootstrap nodes will
   be chosen at random, regardless of their IPv6 ORPorts.)

2.3. Guard Selection Changes

   When we select guard candidates, we should have an option for the number of
   primary IPv6 entry guards. The parameter is as follows:

     * NumIPv6Guards NUM

   If UseEntryGuards is set to 1, we will select exactly this number of IPv6
   relays for our primary guard list, which is the set of relays we strongly
   prefer when connecting to the Tor network. (This number should also apply
   to all of Tor's other guard lists, scaled up based on the relative size of
   the list.)

   If NUM is -1, we try to learn the number from the NumIPv6Guards
   consensus parameter. If the consensus parameter isn't set, we should
   default to 1.

   The default value for NumIPv6Guards should be -1. (Use the consensus
   parameter, or the underlying default value of 1.)

   As of September 2019, approximately 20% of Tor's guards supported IPv6,
   by consensus weight. (Excluding exits that are also guards, because
   clients avoid choosing exits in their guard lists.)

   If all Tor clients implement NumIPv6Guards, then these 20% of guards will
   handle approximately 33% of Tor's traffic. (Because the default value of
   NumPrimaryGuards is 3.) This may have a significant impact on Tor's
   load-balancing. Therefore, we should deploy this feature gradually, and try
   to increase the number of relays that support IPv6 to at least 33%.

   To minimise the impact on load-balancing, IPv6 support should only be
   required for exactly NumIPv6Guards during guard list selection. All other
   guards should be IPv4-only guards. Once approximately 50% of guards support
   IPv6, NumIPv6Guards can become a minimum requirement, rather than an exact
   requirement.

   The minimum configurable value of NumIPv6Guards is -1. (Use the consensus
   parameter, or the underlying default.)

   The minimum resulting value of NumIPv6Guards is 0. (Guards will be chosen
   at random, regardless of their IPv6 ORPorts.)

   The maximum value of NumIPv6Guards should be the configured value of
   NumPrimaryGuards. (Every guard must have an IPv6 ORPort. This setting is
   equivalent to ClientPreferIPv6ORPort 1.)

3. Relay Connections

   If there is an existing authenticated connection, we must use it similar
   to how we used it pre-Prop306.

   If there is no existing authenticated connection for an entry node, tor
   currently attempts to connect using the first available, allowed, and
   preferred address. (Determined using the existing Client IPv4 and IPv6
   options.)

   We should also allow falling back to the alternate address. For this,
   a design will be given in Section 3.1.

3.1. TCP Connection to Preferred Address On First TCP Success

   In this design, we will connect via TCP to the first preferred address.
   On a failure or after a 250 msec delay, we attempt to connect via TCP to
   the alternate address. On a success, Tor attempts to authenticate and
   closes the other connection.

   This design is close to RFC 8305 and is similar to how Happy Eyeballs
   is implemented in a web browser.

3.2. Handling Connection Successes And Failures

   Should a connection to a entry node succeed and is authenticated via TLS,
   we can then use the connection. In this case, we should cancel all other
   connection timers and in-progress connections. Cancelling the timers is
   necessary so we don't attempt new unnecessary connections when our
   existing connection is successful, preventing denial-of-service risks.

   However, if we fail all available and allowed connections, we should tell
   the rest of Tor that the connection has failed. This is so we can attempt
   another entry node.

3.3. Connection Attempt Delays

   As mentioned in [TEOR-P306-REP], initially, clients should prefer IPv4
   by default. The Connection Attempt Delay, or delay between IPv4 and IPv6
   connections should be 250 msec. This is to avoid the overhead from tunneled
   IPv6 connections.

   The Connection Attempt Delay should not be dynamically adjusted, as it adds
   privacy risks. This value should be fixed, and could be manually adjusted
   using this torrc option or consensus parameter:

     * ConnectionAttemptDelay N [msec|second]

   The Minimum and Maximum Connection Attempt Delays should also not be
   dynamically adjusted for privacy reasons. The Minimum should be fixed at
   10 msec as per RFC 8305. But the maximum should be higher than the RFC 8305
   recommendation of 2 seconds. For Tor, we should make this timeout value 30
   seconds to match Tor's existing timeout.

   We need to make it possible for users to set the Maximum Connection Attempt
   Delay value higher for slower and higher-latency networks such as dial-up
   and satellite.

4. Option Changes

   As we enable IPv6-enabled clients to connect out of the box, we should
   adjust the default options to enable IPv6 while not breaking IPv4-only
   clients.

   The new default options should be:

    * ClientUseIPv4 1 (to enable IPv4)

    * ClientUseIPv6 1 (to enable IPv6)

    * ClientPreferIPv6ORPort 0 (for load-balancing reasons so we don't
      overload IPv6-only guards)

    * ConnectionAttemptDelay 250 msec (the recommended delay between IPv4
      and IPv6, as per RFC 8305)

   One thing to note is that clients should be able to connect with the above
   options on IPv4-only, dual-stack, and IPv6-only networks, and they should
   also work if ClientPreferIPv6ORPort is 1. But we shouldn't expect
   IPv4 or IPv6 to work if ClientUseIPv4 or ClientUseIPv6 is set to 0.

   When the majority of clients and relay are IPv6-capable, we could set the
   default value of ClientPreferIPv6ORPort to 1, in order to take advantage
   of IPv6. We could add a ClientPreferIPv6ORPort consensus parameter, so we
   can make this change network-wide.

5. Relay Statistics

   Entry nodes could measure the following statistics for both IPv4 and IPv6:

     * Number of successful connections

     * Number of extra Prop306 connections (unsuccessful or cancelled)
       * Client closes the connection before completing TLS
       * Client closes the connection before sending any circuit or data cells

     * Number of client and relay connections
       * We can distinguish between authenticated (relay, authority
         reachability) and unauthenticated (client, bridge) connections

   Should we implement Section 5:

     * We can send this information to the directory authorities using relay
       extra-info descriptors

     * We should consider the privacy implications of these statistics, and
       how much noise we need to add to them

     * We can include these statistics in the Heartbeat logs

6. Initial Feasibility Testing

   We should test this proposal with the following scenarios:

    * Different combinations of values for the options ClientUseIPv4,
      ClientUseIPv6, and ClientPreferIPv6ORPort on IPv4-only, IPv6-only,
      and dual-stack connections

    * Dual-stack connections of different technologies, including
      high-bandwidth and low-latency (e.g. FTTH), moderate-bandwidth and
      moderate-latency (e.g. DSL, LTE), and high-latency and low-bandwidth
      (e.g. satellite, dial-up) to see if Prop306 is reliable and feasible

7. Minimum Viable Prop306 Product

   The mimumum viable product for Prop306 must include the following:

    * The address handling, bootstrap, and entry guard changes described in
      Section 2. (Single Onion Services are optional, Bridge Clients are out
      of scope. The consensus parameter and torrc options are optional.)

    * The alternative address retry algorithm in Section 3.1.

    * The Connection Success/Failure mechanism in Section 3.2.

    * The Connection Delay mechanism in Section 3.3. (The
      ConnectionAttemptDelay torrc option and consensus parameter are
      optional.)

    * A default setup capable of both IPv4 and IPv6 connections with the
      options described in Section 4. (The ClientPreferIPv6ORPort consensus
      parameter is optional.)

8. Optional Features

   Some features which are optional include:

    * Single Onion services: extend info address changes for onion service
      descriptors and introduce cells. (Section 2.1.)

    * Bridge clients are out of scope: they would require bridge line format
      changes, internal bridge data structure changes, and extend info address
      changes. (Section 2.1.)

    * MaxNumIPv4BootstrapAttempts torrc option. We may need this option if
      the proposed default doesn't work for some clients. (Section 2.2.)

    * NumIPv6Guards torrc option and consensus parameter. We may need this
      option if the proposed default doesn't work for some clients.
      (Section 2.3.)

    * ConnectionAttemptDelay torrc option and consensus parameter. We will need
      this option if the Connection Attempt Delay needs to be manually
      adjusted, for instance, if clients often fail IPv6 connections.
      (Section 3.3.)

    * ClientPreferIPv6ORPort consensus parameter. (Section 4.)

    * IPv4, IPv6, client, relay, and extra Prop306 connection statistics.
      While optional, these statistics may be useful for debugging and
      reliability testing, and metrics on IPv4 vs IPv6. (Section 5.)

9. Acknowledgments

   Thank you so much to teor for your discussion on this happy eyeballs
   proposal. I wouldn't have been able to do this has it not been for
   your help.

10. Refrences

   [P299-TRAC]: https://trac.torproject.org/projects/tor/ticket/29801

   [TEOR-P306-REP]: https://lists.torproject.org/pipermail/tor-dev/2019-July/013919.html