Biao Min
City University of Hong Kong
Publication
Featured research published by Biao Min.
IEEE Transactions on Circuits and Systems for Video Technology | 2015
Biao Min; Ray C. C. Cheung
Intra coding plays a crucial role in the High Efficiency Video Coding (HEVC) standard. It provides higher coding efficiency than the previous standard, H.264/Advanced Video Coding. The block partitioning in HEVC supports a quad-tree-based coding unit (CU) structure with sizes from 64×64 down to 8×8. This technique improves performance but also increases the coding complexity. In this paper, a novel fast algorithm is proposed for the CU size decision in intra coding. Both global and local edge complexities in the horizontal, vertical, 45° diagonal, and 135° diagonal directions are proposed and used to decide the partitioning of a CU. Coupled with handling its four sub-CUs in the same way, a CU is decided to be split, nonsplit, or undetermined at each depth. Compared with the reference software HM10.0, the encoding time is reduced by ~52% on average, with a ~0.8% Bjontegaard distortion-rate increase and reasonable peak signal-to-noise ratio losses.
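To make the idea of a three-way CU decision concrete, here is a minimal Python sketch. It is not the paper's algorithm: the complexity measure (mean absolute difference between neighboring pixels along each of the four directions), the function names `directional_complexity` and `decide_cu`, and the thresholds `low` and `high` are all assumptions made for illustration, and the sketch ignores the global/local distinction and the sub-CU handling described in the abstract.

```python
def directional_complexity(block):
    """Mean absolute neighbor difference along four directions (illustrative measure)."""
    n = len(block)
    # (dy, dx) offsets: horizontal, vertical, 45-degree and 135-degree diagonals
    directions = {"h": (0, 1), "v": (1, 0), "d45": (-1, 1), "d135": (1, 1)}
    result = {}
    for name, (dy, dx) in directions.items():
        total, count = 0, 0
        for y in range(n):
            for x in range(n):
                yy, xx = y + dy, x + dx
                if 0 <= yy < n and 0 <= xx < n:
                    total += abs(block[y][x] - block[yy][xx])
                    count += 1
        result[name] = total / count if count else 0.0
    return result


def decide_cu(block, low=1.0, high=8.0):
    """Classify a square block; `low` and `high` are placeholder thresholds."""
    peak = max(directional_complexity(block).values())
    if peak < low:
        return "nonsplit"      # smooth block: keep the current CU size
    if peak > high:
        return "split"         # strong edges: go one depth deeper
    return "undetermined"      # fall back to the full rate-distortion search


flat = [[100] * 8 for _ in range(8)]
checker = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]
ramp = [[4 * x for x in range(8)] for _ in range(8)]
print(decide_cu(flat), decide_cu(checker), decide_cu(ramp))
```

A real encoder would tune such thresholds per depth and per quantization parameter; the fixed values here only show the control flow.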
Microelectronics Journal | 2013
Yao Xin; Benben Liu; Biao Min; Will X. Y. Li; Ray C. C. Cheung; Anthony S. Fong; Ting Fung Chan
The Burrows-Wheeler Transform (BWT) based methodology seems ideally suited for DNA sequence alignment due to its high speed and low space complexity. Despite being efficient in exact matching, the application of BWT to inexact matching still has problems due to the excessive backtracking process. This paper presents a hardware architecture for the BWT-based inexact sequence mapping algorithm using a Field Programmable Gate Array (FPGA). The proposed design can handle up to two errors, including mismatches and gaps. The original recursive algorithm is implemented using hierarchical tables, and is then parallelized to a large extent through a dual-base extension method. Extensive performance evaluations of the proposed architecture have been conducted using both Virtex 6 and Virtex 7 FPGAs. This design is considerably faster than a direct implementation. When compared with the popular software evaluation tool BWA, our architecture can achieve the same match quality, tolerating up to two errors. In an execution speed comparison with the BWA aln process, our design outperforms a range of CPU platforms with multiple threads under the same configuration conditions.
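For readers unfamiliar with BWT-based matching, the following Python sketch shows the exact-matching baseline (backward search) that the abstract contrasts with inexact matching. It is a textbook software illustration, not the paper's hardware design; the function names `bwt_index` and `exact_match` are made up for this example, and the suffix array is built naively, which is only practical for short strings.

```python
from collections import Counter


def bwt_index(text):
    """Build the suffix array, C table and Occ table for text + '$'."""
    s = text + "$"  # '$' is a sentinel that sorts before A, C, G, T
    sa = sorted(range(len(s)), key=lambda i: s[i:])  # naive suffix array
    bwt = "".join(s[i - 1] for i in sa)  # s[-1] is '$' when i == 0
    counts = Counter(s)
    # c_table[ch]: number of characters in s strictly smaller than ch
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    # occ[ch][i]: number of occurrences of ch in bwt[:i]
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in counts:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return sa, c_table, occ


def exact_match(pattern, sa, c_table, occ):
    """Return the sorted start positions of pattern, using backward search."""
    lo, hi = 0, len(sa)  # half-open range of suffix-array rows
    for ch in reversed(pattern):
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))


index = bwt_index("ACGTACGT")
print(exact_match("CGT", *index))  # [1, 5]
```

Each pattern character narrows the suffix-array range in constant time, which is why exact matching is cheap. Allowing mismatches or gaps means trying alternative characters at each step, and that branching is the backtracking cost the abstract refers to.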
conference of the industrial electronics society | 2011
Biao Min; Ray C. C. Cheung; Yan Han
Hummingbird is an ultra-lightweight cryptographic algorithm targeted at resource-constrained devices such as RFID tags, smart cards and sensor nodes. It has been implemented across different target platforms. In this paper, we present two different FPGA-based implementations of Hummingbird Cryptography (HC): a throughput-oriented (TO) design and an area-oriented (AO) design. The throughput-oriented design is optimized for operation speed, while the area-oriented design uses fewer area resources. Both proposed designs have been implemented on a Xilinx low-cost Spartan-3 XC3S200 FPGA. Compared with existing methods, the proposed designs use fewer FPGA slices while achieving the same throughput. The proposed architectures are well suited to adding customizable security to embedded control systems.
Signal Processing-image Communication | 2018
Zhe Xu; Biao Min; Ray C. C. Cheung
Scene background initialization allows the recovery of a clear image without foreground objects from a video sequence, which is generally the first step in many computer vision and video processing applications. The process may be strongly affected by challenges such as illumination changes, foreground cluttering, and intermittent movement. In this paper, a robust background initialization approach based on superpixel motion detection is proposed. Both spatial and temporal characteristics of frames are used to effectively eliminate foreground objects. A subsequence with stable illumination conditions is first selected for background estimation. Images are segmented into superpixels to preserve spatial texture information, and foreground objects are eliminated by a superpixel motion filtering process. A low-complexity density-based clustering is then performed to generate reliable background candidates for final background determination. The approach has been evaluated on the SBMnet dataset, and it achieves performance superior or comparable to other state-of-the-art works with faster processing speed. Moreover, in the complex and dynamic categories, the algorithm produces the best results, showing its robustness against very challenging scenarios.
Signal Processing-image Communication | 2018
Zhe Xu; Biao Min; Ray C. C. Cheung
A fast CU decision algorithm is very desirable for High Efficiency Video Coding (HEVC) due to its high encoding complexity. In this paper, a fast inter CU decision algorithm is proposed, based on the motion correlations between neighboring CUs. Because the splitting decision of the collocated CU has a strong impact on the current CU, high-motion CUs are split early by calculating the motion diversity of collocated CUs. On the other hand, SKIP mode indicates a motion sharing relation among neighboring CUs, and it can be used to determine CU termination early. A discriminant function minimizing expected risk is defined for both early SKIP mode detection and SKIP mode based termination decision. Experimental results show that the proposed algorithm can reduce computational complexity by 48.2% with only a 0.46% BDBR increase for the random access configuration. For the low-delay B configuration, it can reduce complexity by 45.9% with a 0.55% BDBR increase. The results show that our algorithm achieves a smaller BDBR increase than other state-of-the-art works.
Microprocessors and Microsystems | 2018
Wei-pei Huang; Pok Yee Kwan; Weiyang Ding; Biao Min; Ray C. C. Cheung; Liqun Qi; Hong Yan
This paper presents a hardware architecture for singular spectrum analysis of Hankel tensors, including computation of the Tucker decomposition, tensor reconstruction and final Hankelization. In the proposed design, we explore two levels of optimization. First, at the algorithm level, we optimize the calculation process by exploiting the Hankel property to reduce the computational complexity and on-chip BRAM resource usage. Second, at the hardware level, parallelism is explored for acceleration. Resource sharing is applied to reduce look-up table (LUT) usage. To enable flexibility, the number of processing elements (PEs) can be changed through parameter setting. Our proposed design is implemented on Field-Programmable Gate Arrays (FPGAs) to process third-order tensors. Experimental results show that our design achieves a speed-up of 172 to 1004 times compared with a CPU implementation using Intel MKL, and of 5 to 40 times compared with a GPU implementation.
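The Hankelization step can be illustrated in the simpler matrix case, where it amounts to averaging over anti-diagonals, as in classical singular spectrum analysis. The Python sketch below is an assumption-laden simplification: the function name `hankelize` is made up, and it handles a 2-D matrix rather than the third-order tensors processed in the paper.

```python
def hankelize(matrix):
    """Replace each entry with the mean of its anti-diagonal (entries with equal i + j)."""
    rows, cols = len(matrix), len(matrix[0])
    sums = [0.0] * (rows + cols - 1)
    counts = [0] * (rows + cols - 1)
    for i in range(rows):
        for j in range(cols):
            sums[i + j] += matrix[i][j]
            counts[i + j] += 1
    means = [s / c for s, c in zip(sums, counts)]
    return [[means[i + j] for j in range(cols)] for i in range(rows)]


print(hankelize([[1, 2], [4, 3]]))  # [[1.0, 3.0], [3.0, 3.0]]
```

Because a Hankel matrix is fully determined by its `rows + cols - 1` anti-diagonal values, only that many numbers need to be stored, which is the kind of structure the abstract exploits to reduce memory usage.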
international conference on systems signals and image processing | 2017
Biao Min; Zhe Xu; Ray C. C. Cheung
As the successor of H.264, the High Efficiency Video Coding (HEVC) standard includes various novel techniques, including the Coding Tree Unit (CTU) structure and additional angular modes used in intra coding. These new techniques improve the coding efficiency on one hand, while significantly increasing the computational complexity on the other. In this paper, we propose a fast intra block partitioning algorithm for HEVC to reduce the coding complexity, based on statistical cost and a corner detection algorithm. A block in which corner points are detected is considered a multiple-gradient region and is split into multiple smaller blocks. A block without corner points is treated as non-split when its RD cost is small according to the statistics of the previous frames. The proposed fast algorithm achieves nearly 63% encoding time reduction with 3.42%, 2.80%, and 2.53% BD-Rate loss for the Y, U, and V components on average. The experimental results show that the proposed method decides the block partitioning in HEVC intra coding quickly and efficiently, even though only static parameters are applied to all test sequences.
IEEE Transactions on Circuits and Systems for Video Technology | 2017
Biao Min; Zhe Xu; Ray C. C. Cheung
Ultrahigh definition (UHD), such as 4K/8K, is becoming the mainstream of video resolution nowadays. High Efficiency Video Coding (HEVC) is the emerging video coding standard to process the encoding and decoding of UHD video. This paper first develops multiple techniques that allow the proposed hardware architecture for intra prediction of HEVC to work in a full pipeline. The proposed techniques include: 1) a novel buffer structure for reference samples; 2) a mode-dependent scanning order; and 3) an inverse method for reference sample extension. The size of the buffer is 3K b for luma component and 3K b for chroma components, providing sufficient access to the reference samples. Since the data dependency between two neighboring blocks is addressed by the mode-dependent scanning order, the proposed fully pipelined design can produce 4 pixels/clock cycle. As a result, the throughput of the proposed architecture is capable of supporting 3840×2160 videos at 30 frames/s.
international conference of the ieee engineering in medicine and biology society | 2013
Yao Xin; Will X. Y. Li; Biao Min; Yan Han; Ray C. C. Cheung
IEEE Transactions on Circuits and Systems Ii-express Briefs | 2013
Biao Min; Ray C. C. Cheung; Hong Yan