The case for model-driven interpretability of delay-based congestion control protocols
Muhammad Khan, Yasir Zaki, Shiva Iyer, Talal Ahmad, Thomas Pötsch, Jay Chen, Anirudh Sivaraman, Lakshmi Subramanian
Muhammad Khan
New York University Abu Dhabi, [email protected]
Yasir Zaki
New York University Abu Dhabi, [email protected]
Shiva Iyer
New York University, [email protected]
Talal Ahmad
Google, [email protected]
Thomas Pötsch
New York University Abu Dhabi, [email protected]
Jay Chen
ICSI Berkeley, [email protected]
Anirudh Sivaraman
New York University, [email protected]
Lakshmi Subramanian
New York University, [email protected]
ABSTRACT
Analyzing and interpreting the exact behavior of new delay-based congestion control protocols with complex non-linear control loops is exceptionally difficult in highly variable networks such as cellular networks. This paper proposes a Model-Driven Interpretability (MDI) congestion control framework, which derives a model version of a delay-based protocol by simplifying a congestion control protocol’s response into a guided random walk over a two-dimensional Markov model. We demonstrate the case for the MDI framework by using MDI to analyze and interpret the behavior of two delay-based protocols over cellular channels: Verus and Copa. Our results show a successful approximation of throughput and delay characteristics of the protocols’ model versions across variable network conditions. The learned model of a protocol provides key insights into an algorithm’s convergence properties.
CCS CONCEPTS
• Networks → Transport protocols
KEYWORDS
Congestion Control, Markov Model
Cellular channels are known to fluctuate rapidly over short periods of time [29]. 3G and LTE network measurements [10, 11, 16] demonstrated that variations in the channel cause significant performance differences across carriers, access technologies, geographic regions, and time. Rapid channel fluctuations cause loss-based congestion control (CC) algorithms to overreact and under-perform [3], resulting in bufferbloat and high delays [7, 8, 13]. Several protocols such as Sprout [27], Copa [2], Verus [29] and BBR [3] have demonstrated significant performance gains against traditional TCP variants over highly variable network channels. A common recurring theme across these protocols is to use delay-based signals to measure the network congestion state. While there is a broad array of research on the dynamics of loss-based CC protocols [4, 18, 23], we still lack a principled framework for understanding the dynamics of delay-based protocols.
This paper proposes the Model-Driven Interpretability (MDI) CC framework, which aims to enhance the ability to interpret the behavior of delay-based CC protocols. Given any protocol, the MDI framework uses empirical data on the protocol’s performance to train a stochastic two-dimensional discrete-time Markov model that represents the protocol’s behavior. In essence, using the empirical behavior of a protocol across diverse network conditions, MDI converts a protocol into a stochastic random walk in a Markovian state space. Each state transition is determined by the delay variation feedback from the network. MDI aims to:
(1) Closely approximate the mean/variance of the throughput and delay distributions of the original protocol.
(2) Track the original protocol’s temporal behavior, i.e., how it reacts to variations in network conditions.
We note that achieving these two properties for a broad array of protocols is a non-trivial task. In the MDI framework, the notion of protocol memory is implicitly captured in the definition of the state space (transition probabilities) and the stochastic random walk using delay feedback. While the state space represents a significant approximation to the original protocol, we show that MDI successfully approximates the protocol behavior in practice.
To evaluate MDI, we developed MDI versions of two different protocols: Verus [29] and Copa [2]. Using real-world cellular traces in 3G and 4G networks and across synthetic highly variable network conditions, we show that the MDI version of a protocol closely approximates the throughput and delay distributions of the original protocol and temporally tracks the protocol’s behavior. We demonstrate two specific benefits of MDI in this paper:
Visualizing Protocols:
A protocol state-space representation enables visual understanding of its behavior, including measuring how state transitions vary across: (i) protocols under the same network condition; (ii) network conditions under the same protocol.
Reasoning about Convergence:
By representing a protocol in a Markovian state space, one can derive the mixing time and the corresponding stationary distribution of the MDI version of a protocol, which we show empirically to closely mirror the protocol’s measured statistical properties.
As presented in this paper, the MDI framework is a small part of a much larger puzzle of understanding the properties of delay-based control protocols. This paper has primarily shown the feasibility of the MDI framework in modeling two such protocols using a Markov model representation. One long-term motivation to use a Markovian framework is to leverage the vast body of statistics literature on Markov models and random walks to understand the stability, dynamics, and adaptivity of delay-based protocols. While we have shown initial empirical evidence for analyzing convergence properties of protocols and visualizing protocols using MDI, a detailed statistical analysis of protocols is left for future work and is beyond the scope of this paper.
The main idea of MDI is to build a model that reflects the statistical properties of a protocol, providing a more intuitive and predictable understanding of the protocol behavior. At an abstract level, MDI assumes that CC protocols can be modeled by the relationship between the current and the next state, where each state is a tuple of the relative change in the network delay and the sending window size.
Consider a protocol $P$ that uses delay variations as a congestion signal. One can imagine that such a protocol maintains a recent history of delay observations, which can be used to estimate the next sending window or rate. Let us consider an epoch as the unit of time for making a decision, which can be a variable or a fixed period depending on the protocol.
The challenge in a Markov model representation of a protocol $P$ is determining the appropriate state space and mapping the protocol actions to transitions within the states. The most straightforward approach is to map the absolute values directly by describing a state as $(d_i, w_i)$, where $d_i$ and $w_i$ are the experienced delay and sending window in an epoch $i$, respectively. We use $d$ and $w$ (without the epoch subscript $i$) to abstractly represent the observed delay and window parameters for brevity. While a two-variable state space using $(d, w)$ is simple, it may not be rich or generic enough to capture the variations in these parameters. If one were instead to represent the state space using a history of delay and window measurements, the representation could be much richer but correspondingly much harder to learn accurately. In fact, for each additional dimension in the state space, we need an order of magnitude more training data to determine the state transitions. To capture the variations of the delay and window in the state space, we also consider (1) the relative change in the delay across neighboring epochs (captured by $\alpha(d)$); (2) the relative change in the window across adjacent epochs (captured by $\beta(w)$). Together with $d$ and $w$, these four parameters provide a richer representation of the state space. However, the training data required for the 4-dimensional space is at least two orders of magnitude more than for the $(d, w)$ space. To balance state complexity and state richness, we chose to condense these four parameters into two composite parameters, $\alpha(d) \cdot \log(d)$ and $\beta(w) \cdot \log(w)$.
By representing the delay and window in log space and quantizing the values (described in Section 2.2), we can better delineate variations in relative delay (or window) changes in comparison to variations in the actual delay (or window) values across different buckets in the state space. The quantization of these values also helps maintain a condensed two-parameter representation of the four parameters: window, delay, the relative change in window size, and the relative change in delay across epochs. We note that one can choose alternate state-space representations for the MDI framework; the key requirement is to balance the number of quantized states in the state space while still capturing protocol dynamics across different network conditions.
A discrete-time Markov model of a protocol is represented in the form of a state-transition probability matrix. The matrix describes transition probabilities from one state to another, obtained by training a protocol on a large set of network configurations. We call this the training phase of the Markov model. To capture the protocol behavior, the matrix should include as many states as the ones observed during the training. The state is defined as a tuple $(\hat{d}_i, \hat{w}_i)$, where $\hat{d}_i$ and $\hat{w}_i$ are calculated using the current epoch’s packet delay ($d_i$) and sending window ($w_i$) and the previous epoch’s delay ($d_{i-1}$) and sending window ($w_{i-1}$):
$$\hat{d}_i = \left[\left(\frac{d_i}{d_{i-1}}\right) - 1\right] \cdot \log(d_i) \qquad (1)$$
$$\hat{w}_i = \left[\left(\frac{w_i}{w_{i-1}}\right) - 1\right] \cdot \log(w_i) \qquad (2)$$
Assume that a protocol $P$ adjusts the congestion window as a function of delay feedback. A user executing protocol $P$ currently has the following values: the current sending window $w_i$ and the previous epoch’s delay feedback $d_{i-1}$. To decide on the value of the next window $w_{i+1}$, the user first has to identify the current delay $d_i$.
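As an illustration of Equations 1 and 2, the following Python sketch computes the two composite state values for one epoch and maps them to bucket indices. It is not the authors’ implementation: the natural logarithm, the uniform bucket widths, the clamping at the range boundaries, the example range of [-2, 2], the bucket counts, and the names `composite_state` and `quantize` are all assumptions made for this example.

```python
import math

def composite_state(current, previous):
    """Equations (1)/(2): relative change across epochs times log of the current value.

    Assumes positive inputs and the natural logarithm (the base is an assumption).
    """
    return (current / previous - 1.0) * math.log(current)

def quantize(value, low, high, n_buckets):
    """Map a value to a bucket index in [0, n_buckets - 1].

    Uniform bucket widths and clamping at the range edges are assumptions.
    """
    if value <= low:
        return 0
    if value >= high:
        return n_buckets - 1
    return int((value - low) / (high - low) * n_buckets)

# Example epoch: delay rises from 100 ms to 120 ms, window shrinks from 40 to 30 packets.
d_hat = composite_state(120.0, 100.0)   # positive: the delay increased
w_hat = composite_state(30.0, 40.0)     # negative: the window decreased
state = (quantize(d_hat, -2.0, 2.0, 11), quantize(w_hat, -2.0, 2.0, 21))
```

An unchanged delay or window yields a composite value of zero, so the sign of each component encodes the direction of change while the logarithm scales it by the magnitude of the current value.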
The protocol $P$ decides the next window $w_{i+1}$ based on the following factors: the prior window $w_i$ and the delay variations. Only upon observing $d_i$ would $P$ be aware of the true represented state $(\hat{d}_i, \hat{w}_i)$ in the model space of the protocol. Essentially, given an initial window $w_i$ and delay $d_{i-1}$, three variations influence a transition of the protocol $P$ from $(\hat{d}_i, \hat{w}_i)$ to $(\hat{d}_{i+1}, \hat{w}_{i+1})$: (i) the variation in the initial observation $d_i$; (ii) the variation in the decision making of $w_{i+1}$; (iii) the variation in the next delay observation $d_{i+1}$. Note that two users running the same protocol $P$ in the same state $(\hat{d}_i, \hat{w}_i)$ need not derive the same next window $w_{i+1}$. This decision is influenced by two factors: (i) different window/delay values could effectively arrive at the same model state $(\hat{d}_i, \hat{w}_i)$; (ii) different flows may observe variations in prior observations of delays and windows.
The key assumption that MDI makes is that the state transition from $(\hat{d}_i, \hat{w}_i)$ to $(\hat{d}_{i+1}, \hat{w}_{i+1})$ can be captured by a guided Markov model with two basic properties: the delay feedback guides the direction of the window change (increase or decrease), and the delay variations of $d_i$ and $d_{i+1}$ have an inherent randomness that influences the protocol’s choice of the next window $w_{i+1}$. The guided Markov assumption is clearly an approximation of the original protocol behavior.
To derive the transition matrix, we use a protocol emulation strategy in a constrained network environment. Consider a network simulation environment where one can execute the protocol $P$ under various network conditions and background traffic. Our setup’s network environment is defined by a set of network traces that specify bandwidth, packet loss, and RTT variations. The protocol $P$ can be executed by simulating network flows executing the protocol in the presence of competing traffic.
We perform a broad array of network simulations by varying the network traces and the background traffic emulating several real-world protocols, including $P$. For each simulation, we measure the state transitions of $P$ across the model states. By observing all possible state transitions of $(\hat{d}_i, \hat{w}_i)$, with $\hat{d}_i$ ranging from $\hat{d}_{min}$ to $\hat{d}_{max}$ and $\hat{w}_i$ ranging from $\hat{w}_{min}$ to $\hat{w}_{max}$, a 2D Markov chain is created defining the following states: current state $(\hat{d}_k, \hat{w}_l)$ and next state $(\hat{d}_r, \hat{w}_v)$, where $k$ and $l$ are the current state indexes of $\hat{d}_i$ and $\hat{w}_i$, respectively. Similarly, $r$ and $v$ represent the next state indexes. To reduce the state space of possible values for $(\hat{d}_i, \hat{w}_i)$, we quantize these values into small buckets. MDI captures the state transitions in the form of a transition probability matrix written as:
$$p_{(k,l),(r,v)} = p[(\hat{d}_k, \hat{w}_l) \mid (\hat{d}_r, \hat{w}_v)]. \qquad (3)$$
Thus, $(\hat{d}_k, \hat{w}_l)$ defines a specific row in the transition matrix. Depending on the next value of $\hat{d}_{i+1}$, represented by $r$, we obtain a subset of values from this specific row (i.e., the probabilities of going to any of the possible $\hat{w}_{i+1}$ in the state $(\hat{d}_r, \hat{w}_v)$).
This paper focuses on training two delay-based protocols: Verus and Copa. The training is performed over a large sample of cellular traces covering a wide range of diverse scenarios. We ran each protocol through a network emulator over a large set of traces randomly synthesized from the training traces. The protocol behavior is captured by logging the set of congestion windows and their corresponding experienced delays in each run. Next, the logged window and delay values are quantized (Equations 1 and 2). The quantized values are used to obtain the transition probability matrix, where each state is the quantized pair $(\hat{d}_i, \hat{w}_i)$. The matrix is structured in quadrants, highlighted in yellow and green in Figure 1.
Figure 1: Transition Probability representation of a Model (both axes are labeled (%d * log(d), %w * log(w)))
Each quadrant represents a particular current delay $\hat{d}_i$ on the y-axis and a next delay $\hat{d}_{i+1}$ on the x-axis; these values are quantized in the range $d_1$ to $d_n$ to keep the matrix from being prohibitively large. Each quadrant is further divided into smaller chunks representing the current values of $\hat{w}_i$ on the y-axis and a next window variable $\hat{w}_{i+1}$ on the x-axis, which are quantized in the range $w_1$ to $w_n$. Figure 1 shows an empty sample matrix. The transition probability for each chunk is computed by counting the number of occurrences of going from one state to another as $[(\hat{d}_i, \hat{w}_i), (\hat{d}_{i+1}, \hat{w}_{i+1})]$. We normalize each row within a quadrant so that all outgoing transition probabilities of any state sum to 1.
We implemented a generic sender and receiver in C that takes a transition matrix as an input and uses the matrix to decide the next sending window. The sender uses UDP as the transport protocol. It calculates the packet delays based on the incoming ACKs and uses the delay to determine the sending window size after each epoch. The epoch time is when the algorithm updates the congestion window. Algorithm 1 outlines the MDI control loop. The model algorithm identifies the next sending window $\hat{w}_{i+1}$ in every epoch, obtained from the transition matrix, where a row within a quadrant of the matrix represents all possible values for the future sending window. MDI first identifies the operating quadrant through the row and column index. The row index is taken from the previous delay $\hat{d}_i$, and the column index from the current delay $\hat{d}_{i+1}$ (inferred from the incoming ACKs). Once the operating quadrant is identified, a particular row within the quadrant can be determined by the previous sending window $\hat{w}_i$. This row represents all possible sending window decisions for the next epoch, each associated with a specific probability value. To decide the next sending window, MDI draws a random number (between 0 and 1) to determine the closest matching sending window.
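The counting, per-row normalization, and random draw described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors’ C implementation: states are assumed to be already quantized into integer bucket indices, each row is keyed by (current delay, current window, next delay) so that it sums to 1 over the next window, and the "closest matching" window is taken to be the first one whose cumulative probability exceeds the random draw. The names `train_rows` and `choose_next_window` are hypothetical.

```python
import random
from collections import defaultdict

def train_rows(trace):
    """Build normalized rows from a trace of quantized (d_hat, w_hat) states.

    Returns rows[(d, w, d_next)] -> {w_next: probability}, i.e. one row per
    quadrant (d, d_next) and current window w (layout is an assumption).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for (d, w), (d_next, w_next) in zip(trace, trace[1:]):
        counts[(d, w, d_next)][w_next] += 1
    rows = {}
    for key, row in counts.items():
        total = sum(row.values())
        rows[key] = {w_next: c / total for w_next, c in row.items()}
    return rows

def choose_next_window(row, u):
    """Pick the next window bucket for a draw u in [0, 1) via the cumulative sum."""
    cumulative = 0.0
    for w_next in sorted(row):
        cumulative += row[w_next]
        if u < cumulative:
            return w_next
    return max(row)  # guard against floating-point round-off in the sum

# Toy trace of quantized states, one per epoch (values invented for the example).
trace = [(0, 1), (1, 1), (0, 1), (1, 2), (0, 1), (1, 1)]
rows = train_rows(trace)
row = rows[(0, 1, 1)]   # delay bucket 0 -> 1 with current window bucket 1
next_window = choose_next_window(row, random.random())
```

In this toy trace the state (0, 1) is followed twice by window bucket 1 and once by bucket 2 when the next delay bucket is 1, so the row holds probabilities 2/3 and 1/3.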
This window-selection process is a guided random walk within the state transition probability matrix. If the values are outside the matrix dimensions, the next sending window is determined by a multiplicative increase/decrease to the current window to force it back to the matrix bounds ($c_1$ and $c_2$).

Algorithm 1 MDI pseudo-code
while TRUE do
    Compute $\hat{d}_{i+1}$ from ACKs
    if $\hat{d}_{i+1} < \hat{d}_{min}$ then
        (Increase $\hat{w}_{i+1}$ using const. multiplier $c_1 > 1$)
        $\hat{w}_{i+1} \leftarrow \hat{w}_i * c_1$
    else if $\hat{d}_{i+1} > \hat{d}_{max}$ then
        (Decrease $\hat{w}_{i+1}$ using const. multiplier $c_2 < 1$)
        $\hat{w}_{i+1} \leftarrow \hat{w}_i * c_2$
    else
        Determine matrix quadrant ← $\hat{d}_i$ and $\hat{d}_{i+1}$
        Determine row within quadrant ← $\hat{w}_i$
        $\hat{w}_{i+1}$ ← Randomly choose next state using transition probabilities in the chosen row
    sleep(epoch)    ⊲ epoch depends on the algorithm

We evaluated two CC protocols as a proof of concept of MDI: Verus and Copa. These protocols are modeled through the training phase by generating the model transition matrix. The training is done using a set of 1000 different cellular traces (collected from real-world 3G/4G networks) that cover a wide range of network scenarios. To replay these traces, we used the MahiMahi [15] linkshell network emulator. We used a different set of cellular traces for testing, taken from several previously published papers:
• 4G Verizon: taken from [27], it represents a recorded channel over Verizon’s 4G network in the US.
• Highway: taken from [29], it represents a channel over a 3G network in the UAE while driving on a highway.
• Rapidly changing network: inspired by [5], this trace represents a network with a highly fluctuating channel, where the capacity varied randomly every 5 seconds.

Figure 2: Verus Highway
Figure 3: Copa Highway
Figure 4: Verus Rapidly changing network
Figure 5: Copa Rapidly changing network
Figure 6: Verus 4G Verizon
Figure 7: Copa 4G Verizon
(Each figure compares the native protocol with its MDI model in three panels: (a) instantaneous throughput (Mbps) and delay (ms) over time (s); (b) PDF; (c) population, throughput (Mbps) versus delay.)

We wanted to evaluate how well a model representation of an algorithm can track the throughput and delay of the native algorithms when run on the same network traces. This section shows the results for the MDI versions of Verus and Copa. For each protocol, we show the temporal variations of the original protocol against the MDI version of the protocol for a snippet of a 300-second run in each of the three scenarios (Figures 2a, 3a, 4a, 5a, 6a, and 7a). The results show that across both protocols (Verus and Copa), the MDI models (represented in red) are able to accurately track the throughput of the native protocol (represented in blue) temporally. In addition, the MDI models are able to temporally track the delay behavior of these protocols.
To quantify how well the MDI models statistically match the characteristics of the original protocols, we computed the Probability Density Function (PDF) of the throughput and delay for both Verus and Copa. Figures 2b, 3b, 4b, 5b, 6b, and 7b show the PDF comparisons. It can be seen that the MDI throughput distributions closely match the native ones.
In summary, we observe that MDI is able to accurately track the throughput behavior of the two protocols across highly variable network conditions, as evident from the results in Figures 2c, 3c, 4c, 5c, 6c, and 7c. These figures summarize the population of results. Both the MDI model and the native protocol are depicted by a circular shape representing the operational region of the protocol, circumscribed by the 25th and 75th percentiles of the obtained throughput and delay, where the crosses (x) indicate the median values. The lower and upper parts of the shape represent the 25th and 75th percentiles of the throughput, respectively (y-axis), whereas the left and right parts of the shape represent the 25th and 75th percentiles of the delay, respectively (x-axis). The results show that MDI achieves very similar statistical performance in terms of delay and throughput, with a slight delay penalty not exceeding 5% (i.e., in the rapidly changing channel).
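The population summary described above (25th percentile, median, and 75th percentile per metric, plus a relative delay penalty) can be computed from per-run samples with a few lines of Python. This sketch is an illustration, not the authors’ plotting code: the quartile interpolation is whatever `statistics.quantiles` uses by default, the penalty is assumed to be measured on the median, the sample values are invented, and `operating_region` and `relative_penalty` are hypothetical helper names.

```python
import statistics

def operating_region(samples):
    """Return (25th percentile, median, 75th percentile) of the samples."""
    q1, median, q3 = statistics.quantiles(samples, n=4)
    return q1, median, q3

def relative_penalty(model_value, native_value):
    """Relative increase of the model's value over the native protocol's value."""
    return (model_value - native_value) / native_value

# Invented delay samples (ms) for a native protocol and its MDI model.
native_delay = [10, 20, 30, 40, 50, 60, 70, 80, 90]
model_delay = [d * 1.04 for d in native_delay]

native_q1, native_med, native_q3 = operating_region(native_delay)
model_q1, model_med, model_q3 = operating_region(model_delay)
penalty = relative_penalty(model_med, native_med)   # about 4% higher median delay
```

Applying the same helper to the throughput samples gives the vertical extent of each region, so the two protocols can be compared on both axes.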
RELATED WORK
CC for cellular networks: conventional loss-based TCP variants, in particular Cubic [9], perform poorly in cellular networks. This is due to the high sending rate that fills up the buffers, causing bufferbloat [7]. Bufferbloat is detrimental to the performance of delay-sensitive applications like video calling. This has led to newer delay-based CC protocols like Sprout [27] and Verus [29] that are specifically designed in the context of cellular channels. Sprout focuses on reducing self-inflicted queuing delays, and Verus creates a balance between the packet delays and the throughput. Recently, PCC Vivace [6], which followed PCC [5], has been shown to react well to changing networks while alleviating bufferbloat. PCC Vivace leverages ideas from online (convex) optimization in machine learning to do rate control. LEDBAT [21] is another delay-based CC algorithm, developed for BitTorrent and other bulk-transfer applications, that had limited adoption. BBR [3] was recently proposed by Google and has shown promising results over cellular networks. BBR uses the bottleneck link’s round-trip propagation time and bandwidth to find the optimum operating point for CC.
Applying machine learning to CC: newly proposed CC protocols have complex control loops, which makes them harder to understand in the context of different network conditions. The recent development of CC protocols that employ machine learning (e.g., Remy [26], Vivace [6] and Indigo [28]) has only compounded this issue (e.g., some of Remy’s CC protocols employ rule tables with more than 100 rules). Winstein et al. (Remy) [26], Sivaraman et al. [22] and Pötsch [19] have provided different methodologies to model non-linear CC from a theoretical perspective.
Analyzing TCP behavior:
TCP and its variants have been thoroughly studied using modeling and analytical techniques [4, 18, 20, 24, 25]. A recent work called ACT [23] uses the concept of a guided random walk in the state space of implementation variables to find regions where the algorithm should never go, thereby indicating the existence of a possible bug in the implementation. Others also follow this approach of an automated model-guided method [12] to explore the variable space in the implementation of a CC algorithm. Our modeling approach also uses a random walk, but our state space is limited to a delay and a window variable, and our goal is not to reach unreachable points but to guide the model to follow the native algorithm it is modeling.
The MDI transition matrix helps reason about the essence of the CC protocol behavior. These matrices represent the probability distributions across the transition space; they highlight which states the protocol mostly operates in. They also show how the protocol is likely to behave under specific network changes, such as increased or decreased network delay. Verus’s and Copa’s transition matrices (Figures 8a and 8b) clearly show that the Verus matrix is less dense than Copa’s, which means that Verus takes more decisive actions compared to Copa, which tends to explore more. Each protocol shows a particular pattern reflecting the protocol’s behavior; we call this the protocol fingerprint. The sectors in the matrix represent different transitions for a specific change in packet delay. The relative delay and window ranges are determined from the training phase (the 1st and 99th percentiles of the observed increase/decrease population).
The protocols’ fingerprints reveal different characteristics of each protocol and how it reacts to various network changes. For example, the Verus transition matrix generally shows two distinct recurring patterns in the sectors: one on the left side of the matrix and the other on the right side. We can see that the right side pattern mainly contains window decrease probabilities. This is consistent with Verus’s design, where if the observed delay increases, Verus lowers the sending rate by moving the operation point down the delay profile curve.
Figure 8: MDI transition probability matrices. (a) Verus (b) Copa
However, the left side pattern consists mainly of a diagonal line from the upper left corner down to the lower right corner. Additionally, the pattern also has an anti-diagonal, which becomes more dominant moving down the sectors (i.e., when the delay feedback increases). This gives another insight into Verus. If a decrease in the previous delay is observed, Verus tends to continue with its previous decision, extending the last window decrease or increase. However, if Verus observes a delay decrease following a prior increase in the delay, there is a higher probability that it will increase the window in the next decision despite the window decrease in the previous epoch. This confirms Verus’s exploration behavior, where, in case of a delay reduction, it tends to increase the window to explore the channel variations immediately.
On the other hand, the right side sectors of Copa’s transition matrix show almost the same pattern, with substantial probabilities in the upper left and lower right corners of the sectors and nearly no values in the upper right or lower left corners. This means that regardless of the previous delay values or the severity of the observed delay increase, Copa tends to repeat its last epoch decision. For example, if Copa reduces the window, it will continue doing so in the next epochs. This is unlike Verus, which tends to reduce the window in case of an observed delay increase. Looking at the left sectors of Copa’s matrix, we see that they have a similar pattern to the right side sectors, with additional values in the upper right corner. These values become less dominant when moving down from the top to the bottom sectors. The upper right corner of a sector represents increasing the window despite a reduction in the previous epoch. Like Verus, Copa tends to increase the window upon observing a delay reduction, and the severity of exploring increases when the previously observed delays are decreasing.
Figure 9: Comparison between the theoretical stationary (probability) distribution of the Markov chain model (left), trained on the training set of traces, and the empirical distribution over the state space after the mixing time for both the original and model versions of both protocols on the real-world test traces: (a) Verus, (b) Copa. These are for one mixing time threshold ($\epsilon$).
Using our Markov formulation, we can provide convergence guarantees as strong as those of the original protocols, using properties of convergence of Markov chains. Before presenting our results, we briefly review some necessary notations and definitions regarding Markov chains and convergence.
Markov chains and Mixing times:
Every Markov chain can be represented as a transition matrix $P$, where the entry $p_{ij}$ represents the probability of transitioning to state $j$ from state $i$. Suppose $\mu(t)$ is a row vector that represents a probability distribution over the state space at a time $t$. Then at $t + 1$, the distribution over the state space is given by $\mu(t+1) = \mu(t) P$. If the initial distribution at $t = 0$ is $\mu(0)$, then we have from the above that $\mu(t) = \mu(0) P^t$. The limiting distribution $\lambda$ is the limit of $\mu(t)$ as $t \to \infty$. If a unique limiting distribution exists, then it equals the stationary distribution, which is the row vector $\pi$ such that $\pi P = \pi$. It is computed as the left eigenvector of the transition matrix corresponding to the largest eigenvalue [17]. The mixing time of a Markov chain, $t_{mix}$, is the time $t$ to convergence from an initial distribution $\mu(0)$, i.e., when the probability distribution $\mu(t)$ over the state space is sufficiently “close” to the stationary distribution $\pi$ that they are indistinguishable from one another. Any random walk process in a finite Markov space is associated with a finite mixing time [1]. To obtain a conservative estimate, we define mixing time as the maximum convergence time starting from all possible initial states. The observations that follow use this definition.
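The definitions above can be made concrete with a short Python sketch. It is illustrative only: the stationary distribution is obtained by repeatedly applying $\mu(t+1) = \mu(t) P$ (power iteration) rather than by an explicit eigenvector computation, which assumes the chain has a unique limiting distribution; closeness to $\pi$ is measured by the maximum element-wise absolute difference, which is one possible choice; and the two-state matrix and the function names are made up for the example.

```python
def step(mu, P):
    """One update mu(t+1) = mu(t) P for a row vector mu and a row-stochastic P."""
    n = len(P)
    return [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]

def stationary_distribution(P, tol=1e-13, max_steps=100_000):
    """Approximate pi with pi P = pi by power iteration from the uniform distribution."""
    mu = [1.0 / len(P)] * len(P)
    for _ in range(max_steps):
        nxt = step(mu, P)
        if max(abs(a - b) for a, b in zip(mu, nxt)) < tol:
            return nxt
        mu = nxt
    raise RuntimeError("no convergence; the chain may be periodic or reducible")

def mixing_time(P, pi, eps, max_steps=100_000):
    """Largest step count, over all one-hot start states, until max|mu(t) - pi| < eps."""
    worst = 0
    for start in range(len(P)):
        mu = [1.0 if s == start else 0.0 for s in range(len(P))]
        t = 0
        while max(abs(a - b) for a, b in zip(mu, pi)) >= eps:
            mu = step(mu, P)
            t += 1
            if t > max_steps:
                raise RuntimeError("no convergence within max_steps")
        worst = max(worst, t)
    return worst

# Toy two-state chain (invented); its stationary distribution is [5/6, 1/6].
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary_distribution(P)
t_mix = mixing_time(P, pi, eps=1e-3)
```

Taking the maximum over all one-hot start states mirrors the conservative definition of mixing time given above; for this toy chain the slower start state determines the result.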
In our context, the state space comprises the Cartesian product of 11 states in the delay space $\langle \hat{d} \rangle$ and 21 states in the window space $\langle \hat{w} \rangle$, a total of 231 $(\hat{d}, \hat{w})$ tuples. If the start state is $i$, then the initial distribution $\mu(0)$ is a one-hot vector, with 1 at the location corresponding to state $i$ and 0 everywhere else. Then, at every iteration $t$ (equivalent to an RTT), we compute $\mu(t+1) = \mu(t) P$ and declare convergence at time $t_{mix}$ when the maximum element-wise difference between $\mu(t_{mix})$ and $\mu(t_{mix}+1)$ is less than a certain defined threshold ($\epsilon$). We compute mixing times for three different thresholds, the last of which is chosen because it approximately equals the machine epsilon for a 32-bit float. Table 1 shows the mixing times (in RTTs) obtained from the transition matrix for both protocols.

Protocol    1st threshold    2nd threshold    3rd threshold
Verus       24               55               85
Copa        8                24               41
Table 1: Mixing times (in RTTs) for both protocols, calculated from the Markov model.

The heatmaps in Figure 9 show the theoretical stationary distribution computed using the Markov chain transition matrix trained over a training sample of 1000 traces, compared with the empirical distribution of states after convergence (i.e., after the mixing time) of the original protocols and the model versions over a separate testing sample of 60 cellular traces. The heatmaps are displayed over the two-dimensional $(\hat{d}, \hat{w})$ state space. The fact that these distributions match very closely is strong evidence that our Markov model versions of the protocols are very close approximations of the original protocols. Table 2 shows the closeness of the two distributions in terms of their Kullback-Leibler (KL) Divergence [14]. The KL Divergence is a measure of how well one distribution approximates another. The closer the KL Divergence is to zero, the better the approximation. The table additionally shows the maximum element-wise absolute difference between the two distributions. From the heatmap plots and these numbers, we observe that the model allows us to analyze the original protocols’ convergence properties, which is a challenging proposition for delay-based protocols due to complex non-linear control loops.

Testing protocol    D_KL(P || Q)    max |P − Q|
Copa                0.017           0.004
Model Copa          0.147           0.01
Verus               0.101           0.02
Model Verus         0.773           0.054
Table 2: KL Divergence of the steady-state distribution of the states ($Q$) in the testing set after mixing time w.r.t. the stationary distribution ($P$) computed from the Markov model.

This paper describes the MDI framework, which can approximate delay-based protocols’ behavior and potentially help visualize protocol behavior, understand convergence properties, and derive a model-based protocol replacement. We hope that this Markov modeling approach provides a new lens for understanding delay-based congestion control algorithms’ behavior on highly variable networks. In future work, we hope to extend this framework to understand the behavior of a broader array of protocols, analyze fairness properties of MDI protocols, and explore alternative state-space protocol representations within MDI.
ACKNOWLEDGMENTS
The work done by the authors Talal Ahmad, Shiva Iyer and Lakshminarayanan Subramanian in this paper was supported by a Defense Advanced Research Projects Agency (DARPA) contract HR001117C0048. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of DARPA.
REFERENCES
[1] David Aldous. 1983. Random walks on finite groups and rapidly mixing Markov chains. In Séminaire de Probabilités XVII 1981/82. Springer, Berlin, Heidelberg, 243–297.
[2] Venkat Arun and Hari Balakrishnan. 2018. Copa: Practical Delay-Based Congestion Control for the Internet.
[3] Queue 14, 5, Article 50 (Oct. 2016), 34 pages. https://doi.org/10.1145/3012426.3022184
[4] Neal Cardwell, Stefan Savage, and Thomas Anderson. 2000. Modeling TCP latency. In Proceedings IEEE INFOCOM 2000. Conference on Computer Communications. Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies (Cat. No. 00CH37064), Vol. 3. IEEE, Tel Aviv, Israel, 1742–1751.
[5] M. Dong, Q. Li, D. Zarchy, P. B. Godfrey, and M. Schapira. 2015. PCC: Re-architecting Congestion Control for Consistent High Performance. In Proceedings of the 12th USENIX Conference on Networked Systems Design and Implementation (NSDI 15). USENIX Association, Oakland, CA, USA, 395–408.
[6] Mo Dong, Tong Meng, Doron Zarchy, Engin Arslan, Yossi Gilad, Brighten Godfrey, and Michael Schapira. 2018. PCC Vivace: Online-Learning Congestion Control. In USENIX Symposium on Networked Systems Design and Implementation (NSDI). USENIX Association, Renton, WA, 343–356.
[7] Jim Gettys and Kathleen Nichols. 2011. Bufferbloat: Dark buffers in the Internet. Queue 9, 11 (2011), 40.
[8] Yihua Guo, Feng Qian, Qi Alfred Chen, Zhuoqing Morley Mao, and Subhabrata Sen. 2016. Understanding On-device Bufferbloat for Cellular Upload. In Proceedings of the 2016 Internet Measurement Conference (IMC 16). Association for Computing Machinery, Santa Monica, CA, USA, 303–317.
[9] S. Ha, I. Rhee, and L. Xu. 2008. CUBIC: a new TCP-friendly high-speed TCP variant. ACM SIGOPS Operating Systems Review 42, 5 (2008), 64–74.
[10] Zhenxian Hu, Yi-Chao Chen, Lili Qiu, Guangtao Xue, Hongzi Zhu, Nicholas Zhang, Cheng He, Lujia Pan, and Caifeng He. 2015. An In-depth Analysis of 3G Traffic and Performance. In Proceedings of the 5th Workshop on All Things Cellular: Operations, Applications and Challenges (AllThingsCellular ’15). ACM, New York, NY, USA, 1–6. https://doi.org/10.1145/2785971.2785981
[11] Junxian Huang, Feng Qian, Yihua Guo, Yuanyuan Zhou, Qiang Xu, Z. Morley Mao, Subhabrata Sen, and Oliver Spatscheck. 2013. An In-depth Study of LTE: Effect of Network Protocol and Application Behavior on Performance. In Proceedings of the ACM SIGCOMM 2013 Conference on SIGCOMM (SIGCOMM ’13). ACM, New York, NY, USA, 363–374. https://doi.org/10.1145/2486001.2486006
[12] Samuel Jero, Endadul Hoque, David Choffnes, Alan Mislove, and Cristina Nita-Rotaru. 2018. Automated attack discovery in TCP congestion control using a model-guided approach. In Proc. of Network and Distributed System Security Symp., San Diego, CA, USA. Association for Computing Machinery, New York, NY, USA, 1–15.
[13] Haiqing Jiang, Yaogong Wang, Kyunghan Lee, and Injong Rhee. 2012. Tackling Bufferbloat in 3G/4G Networks. In Proceedings of the 2012 Internet Measurement Conference (IMC ’12). ACM, New York, NY, USA, 329–342. https://doi.org/10.1145/2398776.2398810
[14] S. Kullback and R. A. Leibler. 1951. On Information and Sufficiency. The Annals of Mathematical Statistics.
[15] HTTP. In USENIX Annual Technical Conference (USENIX ATC). USENIX Association, Santa Clara, CA, 417–429.
[16] Ashkan Nikravesh, David R. Choffnes, Ethan Katz-Bassett, Z. Morley Mao, and Matt Welsh. 2014. Mobile Network Performance from User Devices: A Longitudinal, Multidimensional Analysis. In Proceedings of the 15th International Conference on Passive and Active Measurement - Volume 8362 (PAM 2014). Springer-Verlag New York, Inc., New York, NY, USA, 12–22. https://doi.org/10.1007/978-3-319-04918-2_2
[17] J. R. Norris. 1997. Markov Chains. Cambridge University Press. https://doi.org/10.1017/CBO9780511810633
[18] Jitendra Padhye, Victor Firoiu, Don Towsley, and Jim Kurose. 1998. Modeling TCP throughput: A simple model and its empirical validation. ACM SIGCOMM Computer Communication Review 28, 4 (1998), 303–314.
[19] Thomas Pötsch. 2016. Future Mobile Transport Protocols: Adaptive Congestion Control for Unpredictable Cellular Networks. Springer.
[20] Charalampos (Babis) Samios and Mary K. Vernon. 2003. Modeling the Throughput of TCP Vegas. In Proceedings of the 2003 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS ’03). ACM, New York, NY, USA, 71–81. https://doi.org/10.1145/781027.781037
[21] S. Shalunov, G. Hazel, J. Iyengar, and M. Kuehlewind. 2012. Low Extra Delay Background Transport (LEDBAT). (December 2012). http://tools.ietf.org/rfc/rfc6817.txt RFC6817.
[22] A. Sivaraman, K. Winstein, P. Thaker, and H. Balakrishnan. 2014. An Experimental Study of the Learnability of Congestion Control. In Proceedings of the ACM SIGCOMM 2014 Conference. Association for Computing Machinery, Chicago, IL, USA.
[23] Wei Sun, Lisong Xu, Sebastian Elbaum, and Di Zhao. 2019. Model-Agnostic and Efficient Exploration of Numerical State Space of Real-World TCP Congestion Control Implementations. In USENIX Symposium on Networked Systems Design and Implementation (NSDI). USENIX Association, Boston, MA, 719–734.
[24] A. Wierman and T. Osogami. 2003. A unified framework for modeling TCP-Vegas, TCP-SACK, and TCP-Reno. IEEE, Orlando, FL, USA, 269–278. https://doi.org/10.1109/MASCOT.2003.1240671
[25] Adam Wierman, Takayuki Osogami, and Jörgen Olsén. 2003. Modeling TCP-vegas Under on/off Traffic. SIGMETRICS Perform. Eval. Rev. 31, 2 (Sept. 2003), 6–8. https://doi.org/10.1145/959143.959146
[26] K. Winstein and H. Balakrishnan. 2013. TCP Ex Machina: Computer-generated Congestion Control. In Proceedings of the ACM SIGCOMM 2013 Conference. Association for Computing Machinery, Hong Kong, China.
[27] Keith Winstein, Anirudh Sivaraman, Hari Balakrishnan, et al. 2013. Stochastic Forecasts Achieve High Throughput and Low Delay over Cellular Networks. In