Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Ran Shu is active.

Publication


Featured research published by Ran Shu.


IEEE Journal on Selected Areas in Communications | 2014

Sharing Bandwidth by Allocating Switch Buffer in Data Center Networks

Jiao Zhang; Fengyuan Ren; Xin Yue; Ran Shu; Chuang Lin

In today's data centers, the round-trip propagation delay is quite small, so switch buffer sizes are much larger than the Bandwidth-Delay Product (BDP). Based on this observation, this paper introduces a new transport protocol for data centers that provides bandwidth Sharing by Allocating switch Buffer (SAB). SAB sets the congestion windows of flows based on the buffer sizes of the switches along the path. On one hand, as long as the total buffer allocated to all flows is larger than the BDP, the network bandwidth can be fully utilized. On the other hand, since SAB only allocates buffer space to flows, the total injected traffic never exceeds the network capacity, so SAB rarely loses packets. SAB also reduces flow completion time by allowing flows to reach their fair share of bandwidth quickly. A series of experiments and simulations demonstrates that SAB converges quickly and rarely loses packets; it reduces the latency of short flows and solves the TCP Incast and TCP Outcast problems.
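A minimal sketch of the window-setting idea described above, assuming the sender learns each switch's buffer size and flow count through feedback (the function name and inputs are illustrative, not the paper's implementation):

    # Illustrative sketch (not the authors' code): a SAB-style sender picks its
    # congestion window from the smallest per-flow buffer share advertised by the
    # switches on the path, so the injected traffic never exceeds what the
    # bottleneck buffer can absorb.

    def sab_congestion_window(switch_buffers_bytes, flow_counts, mss_bytes=1500):
        """Return a window (in packets) equal to the tightest buffer share on the path.

        switch_buffers_bytes: buffer size of each switch on the path
        flow_counts: number of flows currently sharing each of those switches
        (both lists are hypothetical inputs assumed to be known via feedback)
        """
        shares = [buf / max(n, 1) for buf, n in zip(switch_buffers_bytes, flow_counts)]
        bottleneck_share = min(shares)            # tightest allocation along the path
        return max(1, int(bottleneck_share // mss_bytes))

    # Example: two switches with 128 KB buffers, shared by 8 and 4 flows.
    print(sab_congestion_window([128 * 1024, 128 * 1024], [8, 4]))  # -> 10 packets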


IEEE International Conference on Computer Communications (INFOCOM) | 2012

Sliding Mode Congestion Control for Data Center Ethernet Networks

Wanchun Jiang; Fengyuan Ren; Ran Shu; Chuang Lin

Recently, Ethernet has been enhanced into a unified switch fabric for data centers, called Data Center Ethernet. End-to-end congestion management is one of the indispensable enhancements, and Quantized Congestion Notification (QCN) has been ratified as the standard. Our experiments show that QCN suffers from oscillation of the queue at the bottleneck link. As system parameters and network configurations change, the oscillation may become so severe that the queue is emptied frequently and the utilization of the bottleneck link degrades. Theoretical analysis shows that QCN approaches the equilibrium point mainly through sliding mode motion, but whether QCN enters sliding mode motion also depends on system parameters and network configurations. Hence, we present the Sliding Mode Congestion Control (SMCC) scheme, which drives the system into sliding mode motion under any conditions. SMCC benefits from the fact that sliding mode motion is insensitive to system parameters and external disturbances. Moreover, SMCC is simple, stable, and has a short response time. QCN can easily be replaced by SMCC since both follow the framework developed by the IEEE 802.1Qau working group. Experiments on the NetFPGA platform show that SMCC is superior to QCN, especially when the traffic pattern and network state are variable.
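A rough sketch of a sliding-mode rate controller of the kind described above (the switching function, gain, and step sizes are assumptions, not the SMCC specification): the controller acts only on the sign of a function of the queue offset and its rate of change, which is why its behaviour is insensitive to the exact parameter values.

    # Illustrative sketch, not the authors' implementation: switch between a
    # fixed rate increase and a fixed rate decrease according to the sign of a
    # switching function s built from the queue offset and its derivative.

    def smcc_rate_update(rate, queue_len, queue_len_prev, q_target,
                         k=0.5, step=5e6, min_rate=1e6, max_rate=10e9):
        q_off = queue_len - q_target          # queue offset from the operating point
        dq = queue_len - queue_len_prev       # discrete-time queue derivative
        s = q_off + k * dq                    # switching function s(q, dq)
        if s > 0:                             # above the sliding surface: back off
            rate -= step
        else:                                 # below it: probe for more bandwidth
            rate += step
        return min(max(rate, min_rate), max_rate)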


IEEE Transactions on Computers | 2016

Guaranteeing Delay of Live Virtual Machine Migration by Determining and Provisioning Appropriate Bandwidth

Jiao Zhang; Fengyuan Ren; Ran Shu; Tao Huang; Yunjie Liu

The proliferation of cloud services makes virtualization technology increasingly important. One key feature of virtualization is live Virtual Machine (VM) migration, whose two main evaluation metrics are total migration time and downtime. Most existing work on live VM migration focuses on designing migration mechanisms that shorten these two metrics or trade them off against each other. Few can be applied to applications with delay requirements, such as a VM backup process that must finish within a specific time; this negatively impacts user experience and reduces the profit of cloud service providers. Besides, the frequently varying bandwidth required by the widely used pre-copy mechanism is difficult to provide with current network technologies. In this work, we theoretically analyze how much bandwidth is required to guarantee the total migration time and downtime of a live VM migration, and then propose a novel transport control mechanism to guarantee the computed bandwidth. The experimental results demonstrate that the bandwidth obtained from the proposed reciprocal-based model guarantees the expected total migration time and downtime, and that the proposed transport control mechanism ensures the live VM migration flow obtains the expected bandwidth even in the presence of background flows.
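The bandwidth requirement can be illustrated with a textbook pre-copy model (a hedged sketch under simplified assumptions, not necessarily the paper's reciprocal-based model): given the memory size, page dirtying rate, and targets for total migration time and downtime, binary-search the smallest bandwidth that meets both.

    def precopy_times(mem_bytes, dirty_rate, bandwidth, stop_bytes):
        """Simulate pre-copy rounds (rates in bytes/s); return (total_time, downtime)."""
        to_send, total = float(mem_bytes), 0.0
        while to_send > stop_bytes and bandwidth > dirty_rate:
            t = to_send / bandwidth           # time to copy the current dirty set
            total += t
            to_send = dirty_rate * t          # memory dirtied while copying
        downtime = to_send / bandwidth        # final stop-and-copy round
        return total + downtime, downtime

    def min_bandwidth(mem_bytes, dirty_rate, t_max, d_max, stop_bytes=64 << 20):
        """Binary-search the smallest bandwidth meeting both delay targets."""
        lo, hi = dirty_rate, 100e9            # bandwidth must exceed the dirty rate
        for _ in range(60):
            mid = (lo + hi) / 2
            total, down = precopy_times(mem_bytes, dirty_rate, mid, stop_bytes)
            if total > t_max or down > d_max:
                lo = mid                      # infeasible: need more bandwidth
            else:
                hi = mid                      # feasible: try less
        return hi

    # Hypothetical example: 8 GB VM, 200 MB/s dirtying, 60 s total budget, 0.5 s downtime budget.
    print(min_bandwidth(8 << 30, 200e6, 60.0, 0.5))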


European Conference on Computer Systems (EuroSys) | 2016

TFC: Token Flow Control in Data Center Networks

Jiao Zhang; Fengyuan Ren; Ran Shu; Peng Cheng

Services in modern data center networks pose growing performance demands. However, widespread special traffic patterns, such as micro-bursts, highly concurrent flows, and on-off flow transmission, degrade the performance of transport protocols. In this work, a clean-slate explicit transport control mechanism, called Token Flow Control (TFC), is proposed for data center networks to achieve high link utilization, ultra-low latency, fast convergence, and rare packet drops. TFC uses tokens to represent the link bandwidth resource and defines the concept of effective flows to stand for consumers. The total tokens are explicitly allocated to the consumers in each time slot. TFC excludes in-network buffer space from the flow pipeline and thus achieves zero queueing. Besides, a packet delay function is added at switches to prevent packet drops under highly concurrent flows. The performance of TFC is evaluated using both experiments on a small real testbed and large-scale simulations. The results show that TFC achieves high throughput, fast convergence, near-zero queueing, and rare packet loss in various scenarios.
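A minimal sketch of the token-allocation idea (a hypothetical helper, not the authors' switch implementation): in each time slot, the tokens representing that slot's link capacity are split among the currently effective flows, and each sender may inject at most its allocated tokens.

    def allocate_tokens(link_bps, slot_s, mtu_bytes, effective_flows):
        """Return {flow_id: packet tokens} for one time slot."""
        total_tokens = int(link_bps * slot_s / 8 / mtu_bytes)   # packets per slot
        if not effective_flows:
            return {}
        share, left = divmod(total_tokens, len(effective_flows))
        alloc = {f: share for f in effective_flows}
        for f in effective_flows[:left]:       # hand out the remainder one by one
            alloc[f] += 1
        return alloc

    # Hypothetical example: 10 Gbps link, 100 us slot, 1500 B packets, 3 effective flows.
    print(allocate_tokens(10e9, 100e-6, 1500, ["f1", "f2", "f3"]))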


IEEE Transactions on Computers | 2015

Sliding Mode Congestion Control for Data Center Ethernet Networks

Wanchun Jiang; Fengyuan Ren; Ran Shu; Yongwei Wu; Chuang Lin

Recently, Ethernet has been enhanced into a unified switch fabric for data centers, called data center Ethernet. One of the indispensable enhancements is end-to-end congestion management, and quantized congestion notification (QCN) has been ratified as the corresponding standard. However, our experiments show that, with certain system parameters and network configurations, QCN suffers from large oscillations of the queue length at the bottleneck link, such that the buffer is emptied frequently and link utilization degrades. This phenomenon corresponds to our theoretical finding that QCN fails to enter the sliding mode motion (SMM) pattern with certain system parameters and network configurations. Knowing these drawbacks of QCN, and exploiting the fact that a congestion management system in the SMM pattern is insensitive to changes in parameters and network configurations, we present sliding mode congestion control (SMCC), which enters the SMM pattern under any conditions. SMCC is simple, stable, and fair, has a short response time, and can easily replace QCN because both follow the framework developed by the IEEE 802.1Qau working group. Experiments on the NetFPGA platform show that SMCC is superior to QCN, especially when the traffic pattern and network state are variable.


International Workshop on Quality of Service (IWQoS) | 2014

Analysing convergence of Quantized Congestion Notification in Data Center Ethernet

Ran Shu; Jiao Zhang; Fengyuan Ren; Chuang Lin

Enhancing Ethernet into a unified data center fabric that concurrently handles the traffic of Local Area Networks (LAN), Storage Area Networks (SAN), and High Performance Computing (HPC) has attracted much attention. Congestion management is one critical enhancement for closing the performance gap between traditional Ethernet and a unified data center fabric. Currently, Quantized Congestion Notification (QCN) has been approved as the standard congestion management mechanism. However, much prior work has pointed out that QCN suffers from unfairness among different flows. In this paper, we find that QCN can achieve fairness, but the convergence time to fairness is quite long. We therefore build a convergence time model to investigate the reasons for QCN's slow convergence. The model indicates that the convergence time of QCN can be decreased if Reaction Points (RPs) have the same rate increase probability or if the rate increase step becomes larger at steady state. We validate the precision of our model by comparing it with experimental data on the NetFPGA platform; the results show that it characterizes the convergence time to fairness of QCN well. Based on the proposed model, the impact of QCN parameters, network parameters, and QCN variants on the convergence time is analysed. Finally, enlightened by the analysis, we propose a mechanism, called QCN-T, which replaces the Byte Counter and Timer at sources with a single modified Timer to reduce the convergence time of QCN.
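For context, a simplified outline of a QCN reaction point, whose decrease-then-recover dynamics are what the convergence-time model analyses (the class structure and parameter values here are illustrative, not the standard's pseudocode):

    GD = 1 / 128            # multiplicative-decrease gain (illustrative value)
    R_AI = 5e6              # active-increase step in bits/s (illustrative value)

    class ReactionPoint:
        def __init__(self, rate):
            self.rate = rate
            self.target = rate
            self.cycles = 0                   # completed byte-counter/timer cycles

        def on_congestion_feedback(self, fb):
            self.target = self.rate           # remember the pre-cut rate
            self.rate *= 1 - GD * min(fb, 63) # multiplicative decrease
            self.cycles = 0

        def on_cycle_complete(self):
            if self.cycles >= 5:              # active-increase phase
                self.target += R_AI
            self.cycles += 1
            self.rate = (self.rate + self.target) / 2   # recover toward the target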


Computer Networks | 2018

Analysing and improving convergence of quantized congestion notification in Data Center Ethernet

Ran Shu; Fengyuan Ren; Jiao Zhang; Tong Zhang; Chuang Lin

Quantized Congestion Notification (QCN) has been approved as the standard congestion management mechanism for Data Center Ethernet (DCE). However, much prior work has pointed out that QCN suffers from unfairness among different flows. In this paper, we find that QCN can achieve fairness, but the convergence time to fairness is quite long. We therefore build a convergence time model to investigate the reasons for QCN's slow convergence. We validate the precision of our model by comparing it with experimental data on the NetFPGA platform; the results show that the proposed model accurately characterizes the convergence time to fairness of QCN. Based on the model, the impact of QCN parameters, network parameters, and QCN variants on the convergence time is analysed in detail. The results indicate that the convergence time of QCN can be decreased if sources have the same rate increase probability or if the rate increase step becomes larger at steady state. Enlightened by the analysis, we propose a mechanism called QCN-T, which replaces the original Byte Counter and Timer at sources with a single modified Timer to reduce the convergence time. Finally, evaluations show that QCN-T greatly improves both convergence and stability.


IEEE International Conference on Computer Communications (INFOCOM) | 2017

Modeling and analyzing the influence of chunk size variation on bitrate adaptation in DASH

Tong Zhang; Fengyuan Ren; Wenxue Cheng; Xiaohui Luo; Ran Shu; Xiaolan Liu

Recently, HTTP-based adaptive video streaming has been widely adopted on the Internet and standardized as Dynamic Adaptive Streaming over HTTP (DASH), in which a client-side video player dynamically picks the bitrate level according to the perceived network conditions. In practice, not only does the available bandwidth vary, but the chunk sizes within the same bitrate level also fluctuate significantly, which likewise influences bitrate adaptation. However, existing bitrate adaptation algorithms do not accurately account for chunk size variation, leading to performance losses. In this paper, we theoretically analyze the influence of chunk size variation on bitrate adaptation performance. Based on the features of DASH systems, we build a general model describing the evolution of the playback buffer. Applying stochastic theory, we analyze the influence of chunk size variation on the rebuffering probability and the average bitrate level. Furthermore, based on the theoretical insights, we provide several recommendations for algorithm design and rate encoding, and also propose a simple bitrate adaptation algorithm. Extensive simulations verify our insights as well as the efficiency of the proposed recommendations and algorithm.
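A toy version of such a playback-buffer evolution model (the chunk-size traces and bandwidth below are hypothetical) shows how chunk-size variation alone can change rebuffering even at a fixed average bitrate:

    def buffer_evolution(chunk_sizes_bits, bandwidth_bps, chunk_duration_s,
                         start_buffer_s=10.0):
        """Return (final buffer in seconds, total rebuffering time in seconds)."""
        buf, rebuffer = start_buffer_s, 0.0
        for size in chunk_sizes_bits:
            download = size / bandwidth_bps            # time to fetch this chunk
            if download > buf:                         # buffer runs dry: rebuffering
                rebuffer += download - buf
                buf = 0.0
            else:
                buf -= download
            buf += chunk_duration_s                    # one more chunk in the buffer
        return buf, rebuffer

    # Two traces with the same average chunk size but different variation:
    even = [4e6] * 10
    bursty = [6e6, 2e6] * 5
    print(buffer_evolution(even, 1e6, 4.0, start_buffer_s=4.0))    # no rebuffering
    print(buffer_evolution(bursty, 1e6, 4.0, start_buffer_s=4.0))  # rebuffers early on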


International Conference on Distributed Computing Systems (ICDCS) | 2016

Backlog-Aware SRPT Flow Scheduling in Data Center Networks

Tong Zhang; Fengyuan Ren; Ran Shu

Rapidly developing soft real-time data center applications impose stringent delay requirements on internal data transfers. Many recently emerged data center network protocols therefore share a common goal of decreasing Flow Completion Time (FCT), and in this context the Shortest Remaining Processing Time (SRPT) scheduling discipline has attracted widespread attention. However, SRPT suffers from an instability issue: more and more flows are left uncompleted even when the traffic load is within network capacity, which wastes bandwidth unnecessarily. To solve this problem, this paper proposes a backlog-aware SRPT scheduling algorithm (BASRPT), based on Lyapunov optimization, that stabilizes queue lengths while maintaining relatively low FCT. To overcome its large computational overhead, a fast and practical approximation algorithm called fast BASRPT is also developed. Extensive flow-level simulations show that fast BASRPT indeed stabilizes switch queues and obtains higher throughput while being able to push FCT arbitrarily close to the optimal value under feasible traffic loads.
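A minimal sketch of a backlog-aware scoring rule in the drift-plus-penalty spirit of Lyapunov optimization (a hypothetical rule for illustration, not the exact BASRPT algorithm): pure SRPT ranks flows only by remaining size, while adding a backlog term keeps queues bounded.

    def pick_next_flow(flows, queue_backlog, v=1.0):
        """flows: list of (flow_id, remaining_bytes, queue_id); return the id to serve next."""
        def score(flow):
            _, remaining, qid = flow
            # smaller remaining size is better (SRPT term, weighted by v);
            # larger backlog of the flow's queue is better (stability term)
            return v * remaining - queue_backlog.get(qid, 0)
        return min(flows, key=score)[0]

    flows = [("a", 10_000, "q1"), ("b", 2_000, "q2"), ("c", 50_000, "q1")]
    print(pick_next_flow(flows, {"q1": 0, "q2": 0}))          # SRPT alone picks "b"
    print(pick_next_flow(flows, {"q1": 30_000, "q2": 0}))     # backlog term favours "a"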


International Conference on Parallel Processing (ICPP) | 2015

Slowing Little Quickens More: Improving DCTCP for Massive Concurrent Flows

Mao Miao; Peng Cheng; Fengyuan Ren; Ran Shu

DCTCP is a potential TCP replacement for satisfying the requirements of data center networks, and it has received wide attention in both academia and industry. However, DCTCP only supports tens of concurrent flows well; it suffers timeouts and throughput collapse when facing numerous concurrent flows, which falls far short of the requirements of data center networks, since data centers employing the partition/aggregate pattern usually involve hundreds of concurrent flows. In this paper, after tracing DCTCP's dynamic behavior through experiments, we identify two root causes of DCTCP's failure under the high fan-in traffic pattern: (1) the regulation mechanism of the sending window is ineffective once cwnd has been decreased to the minimum size, and (2) the bursts induced by synchronized flows with small cwnd cause fatal packet losses that lead to severe timeouts. We enhance DCTCP to support massive concurrent flows by regulating the sending time interval and desynchronizing the sending times under these conditions. The new protocol, called DCTCP+, outperforms DCTCP when the number of concurrent flows increases to several hundred. DCTCP+ effectively supports the short concurrent query responses in a benchmark from real production clusters and keeps the same good performance when mixed with background traffic.
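An illustrative sketch of the two fixes, with assumed parameter names and values rather than the DCTCP+ implementation: once cwnd reaches the one-packet minimum, further congestion stretches the inter-packet interval instead of the window, and random jitter desynchronizes senders that would otherwise burst together.

    import random

    MIN_CWND = 1            # packets
    BASE_RTT = 100e-6       # seconds, illustrative

    def next_send_interval(cwnd, alpha, rtt=BASE_RTT, max_jitter=0.2):
        """Return (cwnd, pacing interval in seconds) after an ECN-marked window."""
        if cwnd > MIN_CWND:
            cwnd = max(MIN_CWND, cwnd * (1 - alpha / 2))   # normal DCTCP cut
            interval = rtt / cwnd
        else:
            # cwnd cannot shrink further: stretch the sending interval instead
            interval = (rtt / MIN_CWND) * (1 + alpha)
        jitter = random.uniform(0, max_jitter) * interval  # desynchronize senders
        return cwnd, interval + jitter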

Collaboration


Dive into Ran Shu's collaborations.

Top Co-Authors
Jiao Zhang

Beijing University of Posts and Telecommunications
