Gangyong Jia
Hangzhou Dianzi University
Publications
Featured research published by Gangyong Jia.
IEEE Transactions on Industrial Informatics | 2017
Gangyong Jia; Guangjie Han; Jinfang Jiang; Li Liu
The increasing demand for main memory capacity is one of the main big data challenges. Dynamic random access memory (DRAM) is not the best choice for main memory, due to its high power consumption and low density. Nonvolatile memory, such as phase-change memory (PCM), offers an alternative because of its low power consumption and high density. Nevertheless, high access latency and limited write endurance have so far prevented PCM from replacing DRAM. Therefore, a hybrid memory that combines DRAM and PCM has become a good alternative to traditional DRAM-only main memory, although the disadvantages of both technologies remain challenges for it. In this paper, a dynamic adaptive replacement policy (DARP) in the shared last-level cache is proposed for the DRAM/PCM hybrid main memory. DARP distinguishes cached data into PCM data and DRAM data and adopts a different replacement policy for each type: for PCM data, the least recently used (LRU) replacement policy is applied, while for DRAM data the replacement decision adapts to process behavior. Experimental results show that DARP improves memory access efficiency by 25.4%.
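The split-policy idea is easy to prototype. Below is a minimal sketch, assuming an LRU-ordered map for the shared LLC and a simplified DRAM-side rule (evict DRAM-backed lines first, since refilling them avoids PCM's long latency); the paper's actual DRAM-side policy adapts to process behavior, so the class name and heuristic here are illustrative, not the authors' implementation.

```python
# Sketch of a DARP-style last-level cache: PCM-backed lines follow plain LRU,
# while DRAM-backed lines are evicted preferentially because a DRAM refill is
# cheap compared to PCM's long access latency.
from collections import OrderedDict

class DualPolicyCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()   # tag -> backing ("DRAM" or "PCM"), in LRU order

    def access(self, tag, backing):
        if tag in self.lines:
            self.lines.move_to_end(tag)      # hit: refresh recency
            return True
        if len(self.lines) >= self.capacity:
            self._evict()
        self.lines[tag] = backing
        return False

    def _evict(self):
        # Prefer the least recently used DRAM line over any PCM line.
        for tag, backing in self.lines.items():   # iterates in LRU order
            if backing == "DRAM":
                del self.lines[tag]
                return
        self.lines.popitem(last=False)            # all lines are PCM: plain LRU

c = DualPolicyCache(capacity=2)
for tag, mem in [("a", "PCM"), ("b", "DRAM"), ("c", "PCM")]:
    c.access(tag, mem)
print(list(c.lines))   # -> ['a', 'c']: the DRAM-backed line was evicted first
```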
Sensors | 2017
Gangyong Jia; Guangjie Han; Hao Wang; Xuan Yang
In a cloud computing environment, the number of virtual machines (VMs) on a single physical server and the number of applications running on each VM are continuously growing. This has led to an enormous increase in the demand for memory capacity and a subsequent increase in the energy consumption of the cloud. Insufficient memory has become a major bottleneck for the scalability and performance of virtualization interfaces in cloud computing. To address this problem, memory deduplication techniques, which reduce memory demand through page sharing, are being adopted. However, such techniques incur overhead from the large number of online page comparisons they require. In this paper, we propose a static memory deduplication (SMD) technique that reduces the memory capacity requirement and optimizes performance in cloud computing. The main innovation of SMD is that page detection is performed offline, thus reducing the performance cost, especially in terms of response time. In SMD, page comparisons are restricted to the code segment, which has the highest proportion of shared content. Our experimental results show that SMD efficiently reduces the memory capacity requirement and improves performance. We demonstrate that, compared to other approaches, the cost in terms of response time is negligible.
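As a rough illustration of the offline idea, the sketch below fingerprints the code-segment pages of each VM ahead of time and maps identical pages to one shared copy. The page size, SHA-256 fingerprinting, and data structures are assumptions for illustration only; a production deduplicator would also verify matches byte-for-byte and break sharing with copy-on-write.

```python
# Sketch of SMD-style offline deduplication over code-segment pages.
import hashlib

PAGE_SIZE = 4096

def dedup_code_pages(code_segments):
    """code_segments: {vm_id: bytes} -> ((vm_id, offset) -> shared page, page pool)."""
    shared = {}      # content hash -> canonical page bytes
    page_table = {}  # (vm_id, offset) -> canonical page bytes
    for vm_id, segment in code_segments.items():
        for offset in range(0, len(segment), PAGE_SIZE):
            page = segment[offset:offset + PAGE_SIZE]
            digest = hashlib.sha256(page).digest()
            shared.setdefault(digest, page)          # first copy becomes canonical
            page_table[(vm_id, offset)] = shared[digest]
    return page_table, shared

# Two VMs running the same binary share every code page:
vms = {"vm0": b"\x90" * 2 * PAGE_SIZE, "vm1": b"\x90" * 2 * PAGE_SIZE}
table, pool = dedup_code_pages(vms)
assert len(pool) == 1   # four logical pages, one physical copy
```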
Journal of Network and Computer Applications | 2015
Gangyong Jia; Guangjie Han; Jinfang Jiang; Joel J. P. C. Rodrigues
The growing gap between microprocessor speed and DRAM speed is a major problem facing computer designers. To narrow the gap, main memory, a major shared resource among the cores of a multi-core system, is expected to grow significantly in both speed and capacity, which will increase its power consumption, perhaps even overtaking the processor as the dominant consumer. Therefore, it is critical to address the power issue without seriously degrading main memory subsystem performance. In this paper, we propose periodically active rank scheduling (PARS) to optimize the power efficiency of DRAM in smartphones. Our scheduler features a three-step design. First, we partition all threads in the system into groups. Second, we modify the page allocation policy so that threads in the same group occupy the same DRAM rank but different banks. Finally, we schedule the groups sequentially, keeping only the running group's ranks active so that the other ranks remain in a low-power state. As a result, our scheduler periodically activates one rank after another to optimize memory power efficiency. We implemented PARS in the Linux 2.6.39 kernel and ran randomly generated workloads containing single-threaded and multi-threaded benchmarks. Experimental results show that PARS improves memory power efficiency by 26.8% and performance by 4.2% on average relative to the default system, with a negligible loss of fairness.
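A minimal sketch of the grouping and rotation steps is shown below, assuming one group per rank, round-robin group assignment, and a print-out standing in for real rank power management; the function names and group-balancing rule are illustrative, not the kernel implementation.

```python
# Sketch of the PARS idea: partition threads into groups, pin each group to
# one DRAM rank, and rotate the active group so only its rank stays powered.
def partition_into_groups(threads, num_ranks):
    groups = [[] for _ in range(num_ranks)]
    for i, t in enumerate(sorted(threads)):
        groups[i % num_ranks].append(t)       # balance group sizes round-robin
    return groups

def schedule_period(groups, run_group):
    """Run one scheduling period: activate one rank, power down the rest."""
    for rank, group in enumerate(groups):
        state = "ACTIVE" if rank == run_group else "LOW_POWER"
        print(f"rank {rank}: {state:9s} threads={group if rank == run_group else '-'}")

groups = partition_into_groups(["fft", "mcf", "lbm", "milc"], num_ranks=2)
for period in range(2):                       # rotate the active rank each period
    schedule_period(groups, run_group=period % len(groups))
```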
IEEE Access | 2015
Gangyong Jia; Guangjie Han; Daqiang Zhang; Li Liu; Lei Shu
Limited memory bandwidth is considered the major bottleneck in multimedia cloud computing, as more and more multimedia-processing virtual machines (VMs) require high memory bandwidth simultaneously. Moreover, contention for memory bandwidth among concurrently running VMs degrades the quality of service (QoS) of multimedia applications, causing these soft real-time applications to miss their deadlines. In this paper, we present an adaptive framework, Service Maximization Optimization (SMO), designed to improve the QoS of soft real-time multimedia applications in multimedia cloud computing. The framework consists of an automatic detection mechanism and an adaptive memory bandwidth control mechanism. The detection mechanism identifies the VM execution that is critical to multimedia application performance. Our adaptive memory bandwidth control mechanism then adjusts the memory access rates of all concurrently running VMs to protect the QoS of the soft real-time multimedia applications. In case studies with real-world multimedia applications, SMO significantly improves the QoS of soft real-time multimedia applications with a negligible penalty on system throughput.
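A sketch of what such a bandwidth control loop might look like follows, assuming per-VM rate caps, a deadline-slack signal supplied by the detection mechanism, and a fixed 10% adjustment step; all of these are assumptions for illustration, as the paper's controller is adaptive rather than fixed-step.

```python
# Sketch of an SMO-style control loop: when the soft real-time VM misses its
# deadline, co-running best-effort VMs are throttled; with slack, they relax.
from dataclasses import dataclass

@dataclass
class VM:
    name: str
    rate_cap: float        # allowed fraction of memory bandwidth (0..1)
    realtime: bool = False

def adjust_bandwidth(vms, rt_slack):
    """rt_slack > 0 means the real-time VM meets deadlines with room to spare."""
    for vm in vms:
        if vm.realtime:
            continue                                    # never throttle the RT VM
        step = 0.1 if rt_slack > 0 else -0.1            # relax or throttle
        vm.rate_cap = min(1.0, max(0.1, vm.rate_cap + step))

vms = [VM("video_decode", 1.0, realtime=True), VM("batch_a", 0.8), VM("batch_b", 0.8)]
adjust_bandwidth(vms, rt_slack=-0.02)   # deadline missed by 2%: throttle batch VMs
print([(vm.name, vm.rate_cap) for vm in vms])
```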
Enterprise Information Systems | 2018
Gangyong Jia; Guangjie Han; Hao Wang; Feng Wang
Fog computing requires a large main memory capacity to decrease latency and increase Quality of Service (QoS). However, dynamic random access memory (DRAM), the commonly used main memory technology, is a poor fit for fog computing systems due to its high power consumption. In recent years, non-volatile memories (NVMs) such as phase-change memory (PCM) and spin-transfer torque RAM (STT-RAM), with their low power consumption, have emerged as candidates to replace DRAM. Moreover, the recently proposed hybrid main memory, consisting of both DRAM and NVM, has shown promising advantages in terms of scalability and power consumption. However, the drawbacks of NVM, such as long read/write latency, make cache misses asymmetric in cost in the hybrid main memory. Current last-level cache (LLC) replacement policies assume a uniform miss cost, and therefore perform poorly on the LLC and add to the cost of using NVM. To minimize the cache miss cost in the hybrid main memory, we propose a cost aware cache replacement policy (CACRP) that reduces the number of cache misses to NVM and improves cache performance for a hybrid memory system. Experimental results show that CACRP delivers better LLC performance, improving it by up to 43.6% (15.5% on average) compared to LRU.
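The cost-aware intuition can be captured in a few lines. In the sketch below, each line carries the refill cost of its backing memory and the victim minimizes (estimated reuse probability x miss cost); using recency rank as a stand-in for reuse probability, and the 1:4 DRAM:NVM cost ratio, are assumptions, not the paper's exact model.

```python
# Sketch of cost-aware replacement in the spirit of CACRP.
DRAM_COST, NVM_COST = 1.0, 4.0    # assumed relative refill latencies

def pick_victim(lines):
    """lines: list of (tag, backing) in LRU order, index 0 = least recent."""
    n = len(lines)
    def expected_cost(item):
        idx, (tag, backing) = item
        reuse_prob = (idx + 1) / n                    # more recent -> likelier reuse
        cost = NVM_COST if backing == "NVM" else DRAM_COST
        return reuse_prob * cost
    idx, (tag, _) = min(enumerate(lines), key=expected_cost)
    return tag

# An older NVM line is kept over a more recent DRAM line, because re-fetching
# it from NVM would be far more expensive:
print(pick_victim([("a", "NVM"), ("b", "DRAM"), ("c", "NVM")]))  # -> "b"
```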
Journal of Network and Computer Applications | 2018
Guangjie Han; Wenhui Que; Gangyong Jia; Wenbo Zhang
Cloud computing has become an indispensable infrastructure that provides multi-granularity services to support large applications in the Industrial Internet of Things (IIoT). Cloud data centers have been built or extensively enlarged to cope with the growing computation and storage requirements of IIoT, and their energy consumption is increasing dramatically, creating serious problems in terms of greenhouse gas emissions and service costs. Server consolidation is a popular approach to reducing cloud data center energy consumption by minimizing the number of active physical machines. Most of the extant research has focused on server reduction in the consolidation process, but unbalanced resource utilization among different physical machines can waste physical resources. This paper proposes a resource-utilization-aware energy-efficient server consolidation algorithm (RUAEE) that improves resource utilization while reducing the number of virtual machine (VM) live migrations. Experimental results show that RUAEE can reduce both energy consumption and service-level agreement (SLA) violations in a cloud data center. The proposed RUAEE entails four parts: handling of overloaded hosts, adjustment of unbalanced hosts, selection of underloaded hosts, and VM placement that improves hosts' resource utilization and reduces the number of SLA violations. At the beginning of server consolidation, overloaded hosts are handled to preserve the Quality of Service. A resource utilization description model is then used to identify unbalanced hosts, which are selected for workload adjustment. Next, the underloaded host selection module detects lightly utilized hosts by their Select Factor (SF) value and migrates all of their running VMs away so that these hosts can be switched off to save energy. Finally, the migrating VMs are carefully placed on suitable physical machines (PMs) to improve the destination hosts' resource utilization.
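The underloaded-host step can be illustrated compactly. The SF formula below (maximum of CPU and memory utilization) and the 30% threshold are assumptions for the sketch; the paper defines its own Select Factor.

```python
# Sketch of RUAEE-style underloaded host selection: hosts whose Select Factor
# falls below a threshold are drained of VMs so they can be switched off.
def select_factor(cpu_util, mem_util):
    return max(cpu_util, mem_util)          # assumed: dominated by scarcest resource

def pick_underloaded(hosts, threshold=0.3):
    """hosts: {name: (cpu_util, mem_util)} -> hosts to drain, lowest SF first."""
    scored = [(select_factor(c, m), name) for name, (c, m) in hosts.items()]
    return [name for sf, name in sorted(scored) if sf < threshold]

hosts = {"pm0": (0.10, 0.20), "pm1": (0.70, 0.40), "pm2": (0.25, 0.15)}
print(pick_underloaded(hosts))   # -> ['pm0', 'pm2']: candidates for switch-off
```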
IEEE Transactions on Cloud Computing | 2015
Gangyong Jia; Guangjie Han; Joel J. P. C. Rodrigues; Jaime Lloret; Wei Li
Both limited main memory size and memory interference are considered major bottlenecks in virtualization environments. Memory deduplication, which detects pages with the same content and merges them into a single shared copy, reduces memory requirements; memory partitioning, which allocates unique page colors to each virtual machine, reduces memory interference among virtual machines and thereby improves performance. In this paper, we propose a coordinated memory deduplication and partitioning approach named CMDP that reduces memory requirements and interference simultaneously to improve performance in virtualization. CMDP adopts a lightweight page-behavior-based memory deduplication approach named BMD, which reduces futile page comparison overhead while still detecting page sharing opportunities efficiently. In addition, a virtual-machine-based memory partitioning scheme called VMMP is added to CMDP to reduce interference among virtual machines: VMMP allocates unique page colors to applications, virtual machines, and the hypervisor. The experimental results show that CMDP efficiently improves performance (by about 15.8 percent) while accommodating more virtual machines concurrently.
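Page coloring itself is a standard technique and is easy to sketch: a physical page's color is given by the address bits it shares with the cache set index, and each VM only receives pages of its own colors, so VMs cannot thrash each other's cache sets. The bit layout (4 KiB pages, 64 colors) and the even split below are assumptions for illustration.

```python
# Sketch of VMMP-style page coloring during physical page allocation.
PAGE_SHIFT, NUM_COLORS = 12, 64

def page_color(phys_page_number):
    return phys_page_number % NUM_COLORS        # low page-number bits pick the color

def alloc_page(free_pages, vm_colors):
    """Return a free physical page whose color belongs to this VM."""
    for ppn in free_pages:
        if page_color(ppn) in vm_colors:
            free_pages.remove(ppn)
            return ppn
    raise MemoryError("no free page of the VM's colors")

free = list(range(256))
vm0_colors = set(range(0, 32))      # first half of the colors -> VM 0
vm1_colors = set(range(32, 64))     # second half -> VM 1
print(alloc_page(free, vm0_colors), alloc_page(free, vm1_colors))  # -> 0 32
```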
IEEE Systems Journal | 2017
Gangyong Jia; Guangjie Han; Aohan Li; Jaime Lloret
In a modern multicore system, memory is shared among more and more concurrently running multimedia applications. Memory contention and interference are therefore increasingly serious, causing significant system performance degradation, uneven slowdowns across threads, unfairness in resource sharing, priority inversion, and even starvation. In this paper, we propose an approach that coordinates a channel-aware page mapping policy with memory scheduling (CCPS) to reduce interference among multimedia applications in the memory system. The idea is to map the data of different threads to different channels, combined with memory scheduling. The key principles are: 1) the page mapping policy accounts for the memory address space, thread priority, and load balance; and 2) the memory scheduler prioritizes threads with few outstanding memory requests, row-buffer hit accesses, and older requests. We evaluate CCPS on a variety of mixed single-thread and multithread benchmarks and system configurations, and compare it with four previously proposed state-of-the-art interference-reducing policies. Experimental results demonstrate that CCPS improves performance while significantly reducing energy consumption; moreover, CCPS incurs much lower hardware overhead than existing policies.
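The scheduling half of the approach reduces to a priority order over pending requests. The sketch below encodes that order as a sort key (light threads first, then row-buffer hits, then age); the request fields and tie-breaking are assumptions for illustration, not the paper's hardware design.

```python
# Sketch of the CCPS scheduling rule over ready DRAM requests.
from dataclasses import dataclass

@dataclass
class Request:
    thread: str
    row_hit: bool        # targets the currently open DRAM row
    arrival: int         # smaller = older

def next_request(queue, outstanding):
    """outstanding: {thread: number of in-flight memory requests}."""
    return min(queue, key=lambda r: (outstanding[r.thread],  # light threads first
                                     not r.row_hit,          # row-buffer hits next
                                     r.arrival))             # then oldest

queue = [Request("A", False, 1), Request("B", True, 3), Request("B", False, 2)]
print(next_request(queue, {"A": 8, "B": 1}))  # thread B is light: its row hit wins
```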
IEEE Access | 2017
Jianfan He; Gangyong Jia; Guangjie Han; Hao Wang; Xuan Yang
Cloud computing platforms are among the most important components of the Industry 4.0 smart factory. Currently, most cloud computing platforms have adopted flash memory as their main storage because of its high capacity and speed. However, flash memory exhibits certain drawbacks: out-of-place updates and asymmetric I/O latencies for read, write, and erase operations. These disadvantages prevent it from fully replacing traditional disks. Fortunately, the flash buffer can be used to mitigate these drawbacks, and its replacement policy largely determines how efficiently it does so. In this paper, we therefore propose a locality-aware least recently used (LLRU) replacement algorithm, which exploits both access and locality characteristics. LLRU divides the LRU list into four lists: hot-clean, hot-dirty, cold-clean, and cold-dirty. The eviction victim is selected according to reuse probability and eviction cost, ensuring effective system performance for cloud computing. The experimental results demonstrate that LLRU outperforms other algorithms, including LRU, CF-LRU, LRU-WSR, and AD-LRU, making it well suited to cloud computing for the Industry 4.0 smart factory.
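A minimal sketch of the four-list structure follows. The fixed eviction order (cold-clean first, hot-dirty last) is an assumption standing in for the paper's reuse-probability/eviction-cost calculation; the intuition is that a cold-clean page has low reuse probability and, being clean, costs no flash write-back to discard.

```python
# Sketch of an LLRU-style flash buffer with hot/cold x clean/dirty LRU lists.
from collections import OrderedDict

EVICTION_ORDER = ["cold_clean", "cold_dirty", "hot_clean", "hot_dirty"]

class LLRUBuffer:
    def __init__(self, capacity):
        self.capacity = capacity
        self.lists = {name: OrderedDict() for name in EVICTION_ORDER}

    def _list_of(self, page):
        for name, lst in self.lists.items():
            if page in lst:
                return name
        return None

    def access(self, page, write=False):
        name = self._list_of(page)
        if name:                                    # hit: page becomes hot
            del self.lists[name][page]
            dirty = write or name.endswith("dirty")
        else:                                       # miss: page starts cold
            if sum(len(l) for l in self.lists.values()) >= self.capacity:
                self.evict()
            dirty = write
        new = ("hot_" if name else "cold_") + ("dirty" if dirty else "clean")
        self.lists[new][page] = True

    def evict(self):
        for name in EVICTION_ORDER:                 # cheapest victims first
            if self.lists[name]:
                # a real buffer would write dirty victims back to flash here
                return self.lists[name].popitem(last=False)[0]

buf = LLRUBuffer(capacity=2)
buf.access("p1")
buf.access("p2", write=True)
buf.access("p3")   # evicts clean p1 rather than dirty p2, avoiding a write-back
print([(n, list(l)) for n, l in buf.lists.items() if l])
```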
Mobile Networks and Applications | 2015
Gangyong Jia; Guangjie Han; Jinfang Jiang; Aohan Li
Main memory dynamic voltage and frequency scaling (DVFS) has recently been proposed to further improve energy efficiency. However, recent work overlooks the operating system (OS) problems it incurs, such as unpredictable performance degradation, unfair performance sharing, and priority inversion, which can make performance analysis, optimization, and isolation extremely difficult. In this paper, we first analyze the OS problems incurred by memory DVFS in detail and then propose dynamic time-slice scaling (DTS) to address them, allocating each thread a time-slice according to its memory access behavior and the current memory frequency. Our paper makes three main contributions: 1) we analyze the OS problems incurred by the new approach of active low-power memory modes, the first work to examine the effects of up-to-date DVFS memory architectures; 2) adjusting time-slices makes performance degradation more predictable and performance sharing fairer; and 3) priority inversion is resolved and starvation is prevented. Simulation results show that the proposed methods substantially reduce unpredictable performance degradation, improve the fairness of performance sharing, and eliminate priority inversion.
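The core scaling step can be illustrated with a simple model: a memory-intensive thread slows down when DRAM frequency drops, so its time-slice is stretched to keep its progress per scheduling round predictable. The linear slowdown model and constants below are assumptions for the sketch, not the paper's exact formula.

```python
# Sketch of DTS-style time-slice scaling under reduced memory frequency.
BASE_SLICE_MS = 10.0

def scaled_slice(mem_intensity, base_freq_mhz, cur_freq_mhz):
    """mem_intensity: fraction of the thread's time spent waiting on memory (0..1)."""
    # The memory-bound fraction slows by base/cur; the CPU-bound fraction does not.
    slowdown = mem_intensity * (base_freq_mhz / cur_freq_mhz) + (1 - mem_intensity)
    return BASE_SLICE_MS * slowdown

# At half memory frequency, a 60%-memory-bound thread gets a 16 ms slice,
# while a purely CPU-bound thread keeps its 10 ms slice:
print(scaled_slice(0.6, 1600, 800))   # -> 16.0
print(scaled_slice(0.0, 1600, 800))   # -> 10.0
```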