Minghao Hu
National University of Defense Technology
Publication
Featured research published by Minghao Hu.
International Joint Conference on Artificial Intelligence | 2018
Minghao Hu; Yuxing Peng; Zhen Huang; Xipeng Qiu; Furu Wei; Ming Zhou
In this paper, we introduce the Reinforced Mnemonic Reader for machine reading comprehension tasks, which enhances previous attentive readers in two aspects. First, a reattention mechanism is proposed to refine current attentions by directly accessing past attentions that are temporally memorized in a multi-round alignment architecture, so as to avoid the problems of attention redundancy and attention deficiency. Second, a new optimization approach, called dynamic-critical reinforcement learning, is introduced to extend the standard supervised method. It always encourages the model to predict a more acceptable answer, so as to address the convergence suppression problem that occurs in traditional reinforcement learning algorithms. Extensive experiments on the Stanford Question Answering Dataset (SQuAD) show that our model achieves state-of-the-art results. Meanwhile, our model outperforms previous systems by over 6% in terms of both Exact Match and F1 metrics on two adversarial SQuAD datasets.
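The reattention idea can be pictured with a small sketch. This is an illustrative reading of the abstract, not the paper's exact formulation: current attention scores are refined using the attention distributions memorized from a previous alignment round, so that pairs past rounds missed can be recovered while already-covered pairs are not blindly reinforced. The function and variable names (`reattend`, `gamma`, the bridge matrix) are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def reattend(E_curr, E_prev, gamma=0.5):
    """Illustrative reattention step (not the paper's exact formula).

    E_curr, E_prev: (n_ctx, n_query) raw alignment scores from the current
    and a previous alignment round. The past attentions act as a bridge
    that redistributes the current scores.
    """
    past_c2q = softmax(E_prev, axis=1)   # context word -> query distribution
    past_q2c = softmax(E_prev, axis=0)   # query word  -> context distribution (per column)
    # Two query words are "related" if the same context words supported
    # them in the previous round.
    bridge = past_q2c.T @ past_q2c       # (n_query, n_query)
    # Route each context word's past attention through that bridge and add
    # it to the current scores as indirect evidence.
    indirect = past_c2q @ bridge         # (n_ctx, n_query)
    return softmax(E_curr + gamma * indirect, axis=1)

# Toy usage: 5 context words, 3 query words.
rng = np.random.default_rng(0)
E_prev = rng.normal(size=(5, 3))
E_curr = rng.normal(size=(5, 3))
refined = reattend(E_curr, E_prev, gamma=0.3)
print(refined.shape, refined.sum(axis=1))   # rows are valid distributions
```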
Pacific Rim Conference on Multimedia | 2018
Yang Liu; Zhen Huang; Minghao Hu; Shuyang Du; Yuxing Peng; Dongsheng Li; Xu Wang
Text-based question answering (QA for short) is a popular application in multimedia environments. In this paper, we focus on multi-paragraph QA systems, which retrieve many candidate paragraphs and feed them into an extraction module to locate the answers within them. However, according to our observations, many candidate paragraphs contain no real answer. To filter out these paragraphs, we propose a multi-level fused sequence matching (MFM for short) model built on deep networks. We then construct a distant-supervision dataset based on Wikipedia and carry out several experiments on it. We also use another popular sequence matching dataset to test the performance of our model. Experiments show that our MFM model outperforms recent models not only on filtering candidates in the multi-paragraph QA task but also on the sequence matching task.
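The role of the filtering stage described above can be sketched as follows. The `match_score` callable stands in for the MFM model (or any trained matcher); the word-overlap scorer is only a toy placeholder and is not part of the paper.

```python
from typing import Callable, List, Tuple

def filter_paragraphs(question: str,
                      paragraphs: List[str],
                      match_score: Callable[[str, str], float],
                      threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Keep only candidate paragraphs that the matching model deems likely
    to contain the answer; the survivors go on to the extraction module."""
    scored = [(p, match_score(question, p)) for p in paragraphs]
    return [(p, s) for p, s in scored if s >= threshold]

def overlap_score(question: str, paragraph: str) -> float:
    """Toy stand-in scorer: normalized word overlap between question and
    paragraph. A trained sequence matching model would replace this."""
    q = {w.strip("?.,!") for w in question.lower().split()}
    p = {w.strip("?.,!") for w in paragraph.lower().split()}
    return len(q & p) / max(len(q), 1)

# Toy usage
question = "Who wrote the novel War and Peace?"
candidates = [
    "War and Peace is a novel written by Leo Tolstoy.",
    "The weather in Moscow is cold in winter.",
]
for p, s in filter_paragraphs(question, candidates, overlap_score, threshold=0.3):
    print(f"{s:.2f}  {p}")
```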
Journal of Zhejiang University Science C | 2017
Minghao Hu; Changjian Wang; Yuxing Peng
To provide timely results for big data analytics, it is crucial to satisfy deadline requirements for MapReduce jobs in today's production environments. Much effort has been devoted to the problem of meeting deadlines, and typically there are two kinds of solutions. The first is to allocate appropriate resources to complete the entire job before the specified time limit, which misses deadlines under tight deadline constraints or a lack of resources; the second is to run a pre-constructed sample based on deadline constraints, which satisfies the time requirement but fails to maximize the volume of processed data. In this paper, we propose a deadline-oriented task scheduling approach, named 'Dart', to address the above problem. Given a specified deadline and restricted resources, Dart uses an iterative estimation method, based on both historical data and job running status, to precisely estimate the real-time job completion time. Based on the estimated time, Dart uses an approach-revise algorithm to make dynamic scheduling decisions that meet the deadline while maximizing the amount of processed data and mitigating stragglers. Dart also efficiently handles task failures and data skew, preventing them from harming its performance. We have validated our approach using workloads from OpenCloud and Facebook on a cluster of 64 virtual machines. The results show that Dart not only effectively meets the deadline but also processes near-maximum volumes of data even with tight deadlines and limited resources.
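A much-simplified sketch of the two ingredients named in the abstract, iterative completion-time estimation and an approach-revise style scheduling decision. The averaging estimator and the function names are assumptions; Dart's actual estimator also folds in historical data and per-phase progress.

```python
def estimate_remaining_time(completed_durations, tasks_left, slots):
    """Estimate remaining job time from the average duration of the tasks
    finished so far (a crude stand-in for an iterative estimator)."""
    if not completed_durations:
        return float("inf")
    avg = sum(completed_durations) / len(completed_durations)
    waves = -(-tasks_left // slots)          # ceil(tasks_left / slots)
    return waves * avg

def tasks_to_schedule(deadline_s, elapsed_s, completed_durations, tasks_left, slots):
    """Approach-revise style decision (sketch): admit only as many of the
    remaining tasks as the estimate says can still finish before the
    deadline, so the job meets the deadline while processing as much data
    as possible."""
    budget = deadline_s - elapsed_s
    if budget <= 0 or not completed_durations:
        return 0
    avg = sum(completed_durations) / len(completed_durations)
    affordable_waves = int(budget // avg)
    return min(tasks_left, affordable_waves * slots)

# Toy usage: a 60 s deadline, 30 s elapsed, ~4 s tasks, 8 slots, 100 tasks left.
done = [3.8, 4.1, 4.0]
print(estimate_remaining_time(done, tasks_left=100, slots=8))    # ~51.6 s: too slow for the budget
print(tasks_to_schedule(60, 30, done, tasks_left=100, slots=8))  # admit 56 of the 100 tasks
```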
International Conference on Algorithms and Architectures for Parallel Processing | 2015
Pengfei You; Zhen Huang; Changjian Wang; Minghao Hu; Yuxing Peng
Distributed storage systems provide large-scale data storage and high data reliability through redundancy schemes such as replicas and erasure codes. Redundant data may still be lost due to frequent node failures, and the lost data must be regenerated as soon as possible to maintain data availability and reliability. The direct way to reduce regeneration time is to reduce network traffic during regeneration. In comparison, tree-structured regeneration achieves shorter regeneration time by constructing a better tree-structured topology to increase transmission bandwidth. However, the bandwidth of many edges outside the tree is not utilized to speed up transmission in tree-structured regeneration. In this paper, we consider using multiple edge-disjoint trees to regenerate the lost data in parallel, and we analyze the total regeneration time. We derive a formula for the optimal regeneration time and propose an approximate construction algorithm with polynomial time complexity for the optimal multiple regeneration trees. Our experiments show that the regeneration time is reduced by 62% compared with the common tree-structured scheme, and the file availability reaches almost 99%.
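The benefit of spreading regeneration over several edge-disjoint trees can be seen from a back-of-the-envelope calculation. This is a sketch of the reasoning only, not the paper's construction algorithm: if the lost block is split across trees in proportion to each tree's bottleneck bandwidth, all trees finish together and the total time depends on the sum of the bottlenecks rather than on a single one.

```python
def single_tree_time(data_size, bottleneck_bw):
    """Regeneration time with one tree: the transfer is limited by the
    slowest edge on the tree."""
    return data_size / bottleneck_bw

def multi_tree_time(data_size, bottleneck_bws):
    """Parallel case (sketch): the lost data is split across k edge-disjoint
    trees in proportion to each tree's bottleneck bandwidth, so every tree
    finishes at the same time and the total time is
    data_size / sum(bottlenecks)."""
    return data_size / sum(bottleneck_bws)

# Toy usage: regenerate a 4 GB block (bandwidths in GB per time unit).
size_gb = 4.0
print(single_tree_time(size_gb, bottleneck_bw=0.5))              # 8.0 time units
print(multi_tree_time(size_gb, bottleneck_bws=[0.5, 0.3, 0.2]))  # 4.0 time units
```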
International Multi Topic Conference | 2014
Changjian Wang; Yuxing Peng; Pengfei You; Mingxing Tang; Minghao Hu; Dongsheng Li; Youguo Li
Parallel computing can improve data-processing efficiency significantly. However, traditional approaches such as MPI and MapReduce require programming in a special environment. In this paper, a new distributed computing framework named MEX is proposed. Users simply provide the input files and the name of an executable program to MEX, and MEX automatically processes these files on a cluster of machines with the executable program. The MEX platform has been designed and implemented on top of MapReduce, and several key problems are addressed. An improved map function is designed to start the executable program. To support the improved map function, a data-conversion mechanism is added to MEX, which generates command texts as the parameters of the map function. A process-feedback mechanism is proposed for fault tolerance of the executable program. The mechanism also supports synchronous execution between the map task and the executable program, which prevents too many processes from being started on the same worker node. Comprehensive experiments are performed to verify the effectiveness of the MEX framework. According to the results, more computing nodes lead to shorter job runtime in MEX. When 100 virtual machines are used for an OCR job with 1000 images at 400 dpi, the runtime is reduced by 88.6% compared to a single machine.
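A hedged sketch of how such an improved map function might wrap a user-supplied executable. The name `mex_map`, the command text, and the OCR binary in the usage comment are hypothetical; the real MEX implementation sits inside a MapReduce runtime rather than a standalone script.

```python
import shlex
import subprocess

def mex_map(command_text: str, timeout_s: int = 3600) -> int:
    """Sketch of an improved map function: the map task receives a command
    text (produced by a data-conversion step) and runs the user's
    executable synchronously, so the map task and the program finish
    together and extra processes do not pile up on the worker node.
    A non-zero exit code lets the framework's fault tolerance retry
    the task elsewhere (process feedback)."""
    proc = subprocess.run(shlex.split(command_text),
                          timeout=timeout_s,
                          capture_output=True)
    if proc.returncode != 0:
        # In a real framework this would be reported back to the master
        # so the task can be re-executed on another node.
        print(proc.stderr.decode(errors="replace"))
    return proc.returncode

# Toy usage with a hypothetical OCR binary:
# status = mex_map("ocr_tool --dpi 400 --input page_001.png --output page_001.txt")
```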
arXiv: Computation and Language | 2017
Minghao Hu; Yuxing Peng; Xipeng Qiu
International Conference on Algorithms and Architectures for Parallel Processing | 2015
Minghao Hu; Changjian Wang; Pengfei You; Zhen Huang; Yuxing Peng
Empirical Methods in Natural Language Processing | 2018
Minghao Hu; Yuxing Peng; Zhen Huang; Dongsheng Li; Nan Yang; Furu Wei; Ming Zhou
arXiv: Computation and Language | 2018
Minghao Hu; Furu Wei; Yuxing Peng; Zhen Huang; Nan Yang; Ming Zhou