Mi Lu | Researchain

Publication

Featured researches published by Mi Lu.

IEEE Transactions on Computers | 1992

A novel division algorithm for the residue number system

Mi Lu; Jen-Shiun Chiang

A novel general algorithm for signed number division in the residue number system (RNS) is presented. The parity checking technique used for sign and overflow detection in this algorithm is more efficient and practical than conventional methods. Sign-magnitude arithmetic division is implemented using binary search. There is no restriction on the dividend or the divisor (except that the divisor must be nonzero), and no quotient estimation is necessary before the division is executed. Only simple operations are needed to accomplish this RNS division. All these characteristics make the algorithm simple, efficient, and practical for implementation on a real RNS divider.
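
The binary-search idea in this abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the moduli are a hypothetical choice, only non-negative operands are handled, and each comparison is made by decoding the trial product with the Chinese remainder theorem, whereas the paper performs sign and overflow detection with parity checking.

```python
from math import prod

# Hypothetical moduli chosen for illustration; any pairwise coprime set works.
MODULI = (3, 5, 7, 11)
RANGE = prod(MODULI)  # 1155 distinct values are representable


def to_rns(x):
    """Encode a non-negative integer below RANGE as a tuple of residues."""
    return tuple(x % m for m in MODULI)


def from_rns(residues):
    """Decode residues with the Chinese remainder theorem."""
    total = 0
    for r, m in zip(residues, MODULI):
        partial = RANGE // m
        total += r * partial * pow(partial, -1, m)  # modular inverse, Python 3.8+
    return total % RANGE


def rns_mul(a, b):
    """Multiply two RNS numbers digit by digit, with no carries between digits."""
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))


def divide(dividend, divisor):
    """Floor quotient of non-negative integers, found by binary search."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be nonzero")
    if dividend * divisor >= RANGE:
        # Assumption of this sketch: trial products must stay in range,
        # because no overflow detection is implemented here.
        raise OverflowError("trial products would overflow this small RNS")
    d = to_rns(divisor)
    lo, hi = 0, dividend
    while lo < hi:
        mid = (lo + hi + 1) // 2
        trial = from_rns(rns_mul(to_rns(mid), d))  # mid * divisor, computed in RNS
        if trial <= dividend:
            lo = mid        # mid is still a feasible quotient
        else:
            hi = mid - 1    # mid overshoots the dividend
    return lo
```

The search needs only multiplication and comparison, which is why no quotient estimate is required up front. The decoding step used for comparison here is the expensive part that a dedicated sign-detection technique is meant to avoid.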

IEEE Transactions on Parallel and Distributed Systems | 1994

Parallel algorithms for the longest common subsequence problem

Mi Lu; Hua Lin

A subsequence of a given string is any string obtained by deleting zero or more symbols from the given string. A longest common subsequence (LCS) of two strings is a common subsequence of both that is at least as long as any other common subsequence. The problem is to find the LCS of two given strings. The bound on the complexity of this problem under the decision tree model is known to be mn if the number of distinct symbols that can appear in strings is infinite, where m and n are the lengths of the two strings, respectively, and m ≤ n. In this paper, we propose two parallel algorithms for this problem on the CREW-PRAM model. One takes O(log² m + log n) time with mn/log m processors, which is faster than all the existing algorithms on the same model. The other takes O(log² m log log m) time with mn/(log² m log log m) processors when log² m log log m > log n, or otherwise O(log n) time with mn/log n processors, which is optimal in the sense that the time × processors bound matches the complexity bound of the problem. Both algorithms exploit nice properties of the LCS problem that are discovered in this paper.
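
For reference, the mn work bound mentioned above corresponds to the textbook sequential dynamic program, sketched below in Python. This is only the sequential baseline under the definitions in the abstract, not either of the parallel CREW-PRAM algorithms; the function name is chosen here for illustration.

```python
def lcs_length(a, b):
    """Length of a longest common subsequence, using O(m * n) cell updates."""
    if len(a) > len(b):
        a, b = b, a  # keep the shorter string as a, so m <= n as in the abstract
    prev = [0] * (len(a) + 1)  # row of the DP table for the previous symbol of b
    for ch_b in b:
        curr = [0]
        for i, ch_a in enumerate(a, start=1):
            if ch_a == ch_b:
                curr.append(prev[i - 1] + 1)            # extend a common subsequence
            else:
                curr.append(max(prev[i], curr[i - 1]))  # drop one symbol
        prev = curr
    return prev[-1]
```

Each table cell depends only on its left, upper, and upper-left neighbours, so cells on the same anti-diagonal are independent of one another. That independence is the usual starting point for parallel formulations.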

IEEE Transactions on Vehicular Technology | 2003

Topology-transparent time division multiple access broadcast scheduling in multihop packet radio networks

Zhijun Cai; Mi Lu; Costas N. Georghiades

Many topology-dependent transmission scheduling algorithms have been proposed to minimize the time-division multiple-access frame length in multihop packet radio networks (MPRNs), in which changes of the topology inevitably require recomputation of the schedules. The need for constant adaptation of schedules to a mobile topology entails significant problems, especially in highly dynamic mobile environments. Hence, topology-transparent scheduling algorithms have been proposed, which utilize Galois field theory and Latin squares theory. We discuss the topology-transparent broadcast scheduling design for MPRNs. For single-channel networks, we propose the modified Galois field design (MGD) and the Latin square design (LSD) for topology-transparent broadcast scheduling. The MGD obtains a much smaller minimum frame length (MFL) than the existing scheme, while the LSD can achieve a further performance gain over the MGD under certain conditions. Moreover, the inner relationship between scheduling designs based on different theories is revealed and proved, which provides valuable insight. For topology-transparent broadcast scheduling in multichannel networks, in which little research has been done, the proposed multichannel Galois field design (MCGD) can reduce the MFL approximately M times, as compared with the MGD when M channels are available. Numerical results show that the proposed algorithms outperform existing algorithms in achieving a smaller MFL.
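
As background on how Galois field theory yields a topology-transparent schedule, the sketch below shows the classic polynomial construction in Python. It is an illustration under assumed parameters (a prime p and a degree bound k), not the MGD, LSD, or MCGD designs of the paper. Each node is given a distinct polynomial over GF(p) and transmits in slot f(i) of subframe i; two distinct polynomials of degree at most k agree on at most k points, so any two nodes collide in at most k of the p subframes regardless of topology.

```python
from itertools import product


def schedule(coeffs, p):
    """Slot used in each of the p subframes: subframe i maps to f(i) mod p.

    coeffs holds the polynomial coefficients, lowest degree first.
    """
    return [sum(c * pow(i, e, p) for e, c in enumerate(coeffs)) % p
            for i in range(p)]


def max_pairwise_overlap(p, k):
    """Largest number of shared slots between any two distinct schedules."""
    polys = list(product(range(p), repeat=k + 1))  # all polynomials of degree <= k
    scheds = [schedule(c, p) for c in polys]
    worst = 0
    for a in range(len(scheds)):
        for b in range(a + 1, len(scheds)):
            overlap = sum(x == y for x, y in zip(scheds[a], scheds[b]))
            worst = max(worst, overlap)
    return worst
```

With this construction a node with at most D interfering neighbours loses at most k * D of its p transmission slots, so choosing p > k * D leaves at least one collision-free slot per frame. The frame length is p * p slots, which is the quantity such designs try to keep small.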

IEEE Transactions on Mobile Computing | 2003

Channel access-based self-organized clustering in ad hoc networks

Zhijun Cai; Mi Lu; Xiaodong Wang

An ad hoc network is a self-organized and distributed entity consisting of a number of mobile stations (MS) without the coordination of any centralized access point. Clustering is one of the fundamental problems in ad hoc networks. In this context, we describe a distributed clustering algorithm for multihop ad hoc networks. We first propose a randomized control channel broadcast access method to maximize the worst-case control channel efficiency, based on which a distributed clustering algorithm is proposed. Both theoretical analysis and simulations indicate that the proposed clustering algorithm takes much less time and overhead to cluster a given network with more stable cluster structure, while incurring very small maintenance overhead in a dynamic network resulting from the mobility of the MS.

Field-Programmable Custom Computing Machines | 2004

Accelerating seismic migration using FPGA-based coprocessor platform

Chuan He; Mi Lu; Chuanwen Sun

Migration is the most important seismic data processing method that recovers subsurface images of the Earth's interior using surface-recorded data volumes obtained from seismic reflection surveys. A reconfigurable coprocessor platform called SPACE (seismic data processing accelerator with reconfigurable engine) using field programmable gate array (FPGA) technology is proposed in this paper to speed up these computationally demanding and data-intensive seismic migration applications. The proposed SPACE platform is characterized by its simple architecture and abundant on-board memory resources along with ultra-wide memory bandwidth, which also makes the platform suitable for other seismic data processing methods or some large-scale scientific computing applications. The time-consuming kernel part of the pre-stack Kirchhoff time migration (PSTM) algorithm is programmed into the FPGA-based coprocessor platform, which acts as a hardware accelerator attached to an Intel-based workstation through the local peripheral component interconnect (PCI) bus. Improved performance can be achieved by integrating a number of fully pipelined arithmetic modules running in parallel into a single FPGA chip. Our simulation results show that the proposed coprocessor platform operating at a conservative speed of 50 MHz can calculate the Kirchhoff summations for 50 million points per second, which is about 15.6 times faster than a reference 2.4 GHz Pentium 4 workstation. The impressive performance of the proposed platform implies broad applications in the seismic data processing industry.

Field-Programmable Custom Computing Machines | 2005

Time domain numerical simulation for transient waves on reconfigurable coprocessor platform

Chuan He; Wei Zhao; Mi Lu

A successful application-oriented reconfigurable coprocessor design requires not only a powerful FPGA-based computing engine along with suitable hardware architecture, but also an efficient algorithm tailored for this special application. In this paper, we present our hardware architecture and numerical algorithms designed to speed up the time-domain finite-difference simulation of linear wave propagation problems in 2D and 3D space on FPGA-based reconfigurable platforms. Application fields of this work include seismic modeling and migration, computational electromagnetics, aeroacoustics, and marine acoustics, to name a few. By writing first-order linear wave equations into second-order form, we halve the number of unknowns and simplify the treatment of parameters. We also adopt higher-order finite-difference (FD) schemes to further reduce the number of unknowns at the cost of increasing floating-point computations per discrete grid point. By doing so, we relieve the bandwidth requirements between the FPGA and onboard memories but put more burden on the computing engine to take full advantage of the FPGA's computational potential. The speed of our design implemented on a Xilinx ML401 Virtex-4 evaluation platform is about 1.5 to 4 times faster than a pure software implementation of the same algorithm running on a 3.0 GHz DELL workstation. This impressive result is mainly attributed to the memory architecture design, which is well-tuned for our numerical higher-order FD algorithms and can utilize onboard memory bandwidth more wisely. Furthermore, the good scalability of our design makes it compatible with most commercial reconfigurable coprocessor platforms and correspondingly, the performance would be proportional to their onboard memory bandwidth.
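
The trade-off described above, fewer unknowns in exchange for more arithmetic per grid point, can be seen in a minimal one-dimensional Python sketch. It assumes the scalar wave equation u_tt = c^2 u_xx in second-order form with a standard fourth-order central stencil in space and a leapfrog update in time; the paper's actual 2D and 3D schemes, grid handling, and boundary treatment are not reproduced, and the function names are chosen here for illustration.

```python
def laplacian_4th(u, i, dx):
    """Fourth-order central approximation of u_xx at grid point i.

    Uses five neighbouring samples instead of the three needed by the
    second-order stencil: more arithmetic per point, better accuracy.
    """
    return (-u[i - 2] + 16 * u[i - 1] - 30 * u[i]
            + 16 * u[i + 1] - u[i + 2]) / (12 * dx * dx)


def step(u_prev, u, c, dt, dx):
    """Advance u_tt = c^2 * u_xx by one time step (leapfrog in time).

    Only the single scalar field u is updated. The two outermost cells on
    each side are held at zero as a crude boundary (assumption of this sketch).
    """
    n = len(u)
    u_next = [0.0] * n
    for i in range(2, n - 2):
        u_next[i] = (2 * u[i] - u_prev[i]
                     + (c * dt) ** 2 * laplacian_4th(u, i, dx))
    return u_next
```

Because the higher-order stencil is more accurate per grid point, a coarser grid can reach a given accuracy, which reduces the data moved to and from memory while raising the operation count per point.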

International Conference on Computer Communications and Networks | 2004

PAGER: a distributed algorithm for the dead-end problem of location-based routing in sensor networks

Le Zou; Mi Lu; Zixiang Xiong

The dead-end problem is an important issue of location-based routing in sensor networks, which occurs when a message falls into a local minimum using greedy forwarding. Current methods for this problem fall short either in eliminating traffic/path memorization or in finding satisfactorily short paths. In this paper, we propose a novel algorithm, named partial-partition avoiding geographic routing (PAGER), to solve the problem. The basic idea of PAGER is to divide a sensor network graph into functional sub-graphs, and provide each sensor node with message forwarding directions based on these sub-graphs. That results in loop-free short paths without memorization of traffic or paths in sensor nodes. We implement our algorithm in a protocol and evaluate it in sensor networks with different parameters. Results show that PAGER generates considerably shorter paths, higher delivery ratio and lower energy consumption than the greedy perimeter stateless routing protocol. At the same time, PAGER achieves better performance in handling large-scale networks than the ad hoc on-demand distance vector protocol.
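
The dead-end problem itself is easy to reproduce. The Python sketch below implements plain greedy forwarding only, not PAGER; the data layout (a dict of coordinates and a dict of neighbour lists) and the function name are assumptions made for illustration. A message is handed to the neighbour closest to the target and is declared stuck when no neighbour is strictly closer than the current node.

```python
from math import dist  # Euclidean distance, Python 3.8+


def greedy_route(positions, neighbors, source, target):
    """Forward greedily by distance; return (path, delivered)."""
    path = [source]
    current = source
    while current != target:
        here = dist(positions[current], positions[target])
        best = min(neighbors[current],
                   key=lambda n: dist(positions[n], positions[target]),
                   default=None)
        if best is None or dist(positions[best], positions[target]) >= here:
            return path, False  # local minimum: no neighbour is closer
        current = best
        path.append(current)
    return path, True
```

For example, with nodes S(0,0), X(1,0), Y(0,2), Z(3,2), target T(3,0), and links S-X, S-Y, Y-Z, Z-T, the message moves from S to X because X is closer to T, and then stalls at X even though the route S, Y, Z, T exists.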

IEEE Transactions on Parallel and Distributed Systems | 2003

Distributed initialization algorithms for single-hop ad hoc networks with minislotted carrier sensing

Zhijun Cai; Mi Lu; Xiaodong Wang

An ad hoc network is a self-organized and distributed entity, consisting of n mobile stations (MSs) without the coordination of any centralized access point. Initialization is one of the fundamental tasks to set up an ad hoc network, which involves assigning each of the n MSs a distinct ID number from 1 to n, distributedly. In Nakano et al. (2000), randomized initialization protocols are developed for single-hop ad hoc networks under different conditions. However, carrier sensing has not been utilized and suitable acknowledgment schemes for the algorithms are not developed. Moreover, the assumption taken by Nakano et al. about MSs being able to listen while transmitting is not valid for ad hoc networks. In this context, we describe two algorithms for initializing an ad hoc network with carrier sensing capability. First, a novel acknowledgment scheme is proposed for notifying a transmitting MS whether its transmission is successful during the initialization. Then, two distributed and randomized initialization algorithms are developed and analyzed, under the assumptions of a known and unknown number of users in the network, respectively. Both algorithms are obtained based on optimizing some key parameters to minimize the total time required to complete the initialization. Both theoretical analysis and simulations indicate that the proposed initialization algorithms outperform the existing methods, in the sense that they take much less time to complete the initialization and the average number of transmission attempts before success is much smaller.
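
To make the initialization task concrete, the following Python sketch simulates a generic slotted contention scheme for the known-n case. It is not the protocol of Nakano et al. nor either of the algorithms proposed here: it assumes idealized feedback in which every station learns whether a slot carried exactly one transmission, which is precisely the assumption the acknowledgment scheme above is meant to replace, and it uses no carrier sensing or minislots.

```python
import random


def initialize(n, seed=0):
    """Assign distinct IDs 1..n to n stations; return (ids, slots_used).

    Stations are indexed 0..n-1 only for bookkeeping inside the simulation.
    """
    rng = random.Random(seed)
    ids = {}
    waiting = list(range(n))  # stations that do not have an ID yet
    slots = 0
    while waiting:
        slots += 1
        p = 1.0 / len(waiting)  # transmit probability with known remaining count
        senders = [s for s in waiting if rng.random() < p]
        if len(senders) == 1:   # exactly one transmission: the slot succeeds
            winner = senders[0]
            ids[winner] = len(ids) + 1
            waiting.remove(winner)
        # zero senders (idle) or several senders (collision): the slot is wasted
    return ids, slots
```

The number of wasted slots is what such protocols try to minimize, so `slots` is the natural quantity to compare across parameter choices.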

Vehicular Technology Conference | 2000

SNDR: a new medium access control for multi-channel ad hoc networks

Zhijun Cai; Mi Lu

A new multi-channel, CDMA (code division multiple access) (or FDMA (frequency division multiple access)) and TDMA (time division multiple access) combined, contention free MAC (medium access control), termed the sequenced neighbor double reservation (SNDR), is presented for mobile ad hoc networks. The SNDR uses the receiver-based data transmission strategy, based on which two methods are proposed. One is contention-based and the other is contention-free. We put emphasis on the contention-free type (SNDR). On the other hand, the contention-based MAC, which needs further research, is also discussed. The SNDR does not need any handshake process (such as request to send/clear to send (RTS/CTS) handshake) or any carrier sensing technology. It uses the neighbor sequenced method to avoid contentions and the double reservation method to improve the total throughput of ad hoc networks. No hidden or exposed terminal problem will exist in the SNDR. No collision will occur and no time slot will be wasted in the SNDR MAC frame. The protocol can be efficiently applied to the multi-channel ad hoc networks. The performance of the SNDR is analyzed carefully. Some future work and applications are also discussed.

International Parallel and Distributed Processing Symposium | 2000

Take Advantage of the Computing Power of DNA Computers

Z. Frank Qiu; Mi Lu

Ever since Adleman [1] solved the Hamiltonian Path problem using a combinatorial molecular method, many other hard computational problems have been investigated with the proposed DNA computer [3] [25] [9] [12] [19] [22] [24] [27] [29] [30]. However, these computation methods all work toward one destination through a couple of steps based on the initial conditions. If there is a single change in these given conditions, all the procedures need to be gone through again, no matter how complicated these procedures are and how simple the change is. The new method we propose in this paper addresses this problem. Only a few extra steps are necessary when the initial condition has been changed. This provides considerable savings in terms of time and cost.
