Janne Janhunen | Researchain

Publication

Featured research published by Janne Janhunen.

IEEE Journal of Selected Topics in Signal Processing | 2011

Fixed- and Floating-Point Processor Comparison for MIMO-OFDM Detector

Janne Janhunen; Teemu Pitkänen; Olli Silvén; Markku J. Juntti

The evolution toward software-defined radio (SDR) technologies, in particular cognitive radios, is creating a need to support multiple radio solutions with the same baseband processing resources. This implies not only a huge design effort, but also a shift from hardware-oriented to software-oriented tool chains. In this paper, hardware complexity and energy dissipation are analyzed by implementing three programmable processor architectures that support 32- and 12-bit floating-point and 16-bit fixed-point arithmetic. The processors are based on the transport triggered architecture (TTA), which has a very low programmability overhead. We programmed a recently introduced selective spanning with fast enumeration (SSFE) soft-output detector for these processors. The processors can achieve the data rates required in a multiple-input multiple-output orthogonal frequency-division multiplexing (MIMO-OFDM) 3G LTE system with low energy dissipation. The analysis shows that, at the same goodput rate, a floating-point implementation can achieve a lower gate count and better power efficiency than a fixed-point design. Combined with tool chain benefits, floating-point arithmetic is becoming attractive for future SDR solutions.

International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation | 2010

A GPU implementation for two MIMO-OFDM detectors

Teemu Nylanden; Janne Janhunen; Olli Silvén; Markku J. Juntti

Two detectors that use a real-valued signal model, one based on the selective spanning with fast enumeration (SSFE) algorithm and the other on the layered orthogonal lattice detector (LORD) algorithm, are implemented on an Nvidia graphics processing unit (GPU). A 2×2 multiple-input multiple-output (MIMO) antenna system with 16-quadrature amplitude modulation (16-QAM) is assumed. The chosen level update vector for SSFE is based on computer simulations carried out in the MATLAB environment. We implemented the algorithms on an Nvidia Quadro FX 1700 GPU and achieved a throughput of 36.06 Mbps for SSFE and 16.8 Mbps for LORD. The results show that a general-purpose graphics processing unit (GPGPU) has the potential to achieve high throughput, provided that the detection algorithm allows efficient parallel processing. The latency of the control code and the partial Euclidean distance (PED) calculations is very small, but the latency of memory loads and stores to the GPU's global memory is significant. We also compare our results with a trellis-based detector implementation for a GPU, in which a more powerful GPU and a different detection algorithm are used. GPUs offer superior computing power and programmability compared to the application-specific software-defined radio (SDR) designs implemented so far.

International Journal of Distributed Sensor Networks | 2017

Performance of a low-power wide-area network based on LoRa technology: Doppler robustness, scalability, and coverage

Juha Petäjäjärvi; Konstantin Mikhaylov; Marko Pettissalo; Janne Janhunen; Jari Iinatti

The article analyzes and experimentally validates various performance metrics of the LoRa low-power wide-area network technology. The LoRa modulation is based on chirp spread spectrum, which enables the use of low-quality oscillators in the end device and makes synchronization faster and more reliable. Moreover, LoRa technology provides a link budget of over 150 dB, giving good coverage. LoRa therefore seems to be a promising option for implementing communication in many diverse Internet of Things applications. In this article, we first briefly review the specifics of the LoRa technology and analyze the scalability of the LoRa wide-area network. Then, we introduce the setups of the performance measurements. The results show that with a transmit power of 14 dBm and the highest spreading factor of 12, more than 60% of the packets are received at a distance of 30 km on water. With the same configuration, we measured the performance of LoRa communication in mobile scenarios. The results reveal that at around 40 km/h the communication performance degrades because the duration of the LoRa-modulated symbol exceeds the coherence time. However, the communication link is expected to be more reliable when lower spreading factors are used.
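
The Doppler observation can be checked with a rough calculation. The sketch below assumes a 125 kHz bandwidth, an 868 MHz carrier, and the rule-of-thumb coherence time of 0.423 divided by the maximum Doppler shift; none of these values is stated in the abstract, and the function names are illustrative.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

BANDWIDTH_HZ = 125e3  # assumed LoRa channel bandwidth (not given in the abstract)
CARRIER_HZ = 868e6    # assumed carrier frequency (not given in the abstract)


def lora_symbol_duration(spreading_factor, bandwidth_hz):
    """Duration of one LoRa symbol in seconds: 2^SF chips sent at BW chips per second."""
    return (2 ** spreading_factor) / bandwidth_hz


def coherence_time(speed_kmh, carrier_hz):
    """Rule-of-thumb coherence time 0.423 / f_d, where f_d is the maximum Doppler shift."""
    speed_ms = speed_kmh / 3.6
    doppler_hz = speed_ms * carrier_hz / SPEED_OF_LIGHT
    return 0.423 / doppler_hz


if __name__ == "__main__":
    t_c = coherence_time(40.0, CARRIER_HZ)
    for sf in (7, 12):
        t_sym = lora_symbol_duration(sf, BANDWIDTH_HZ)
        relation = "exceeds" if t_sym > t_c else "is below"
        print(f"SF{sf}: symbol {t_sym * 1e3:.3f} ms {relation} "
              f"coherence time {t_c * 1e3:.3f} ms")
```

Under these assumptions an SF12 symbol lasts about 32.8 ms, longer than the roughly 13 ms coherence time at 40 km/h, while an SF7 symbol lasts about 1 ms. This is consistent with the expectation that lower spreading factors are more robust to mobility.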

Signal Processing | 2010

Programmable processor implementations of K-best list sphere detector for MIMO receiver

Janne Janhunen; Olli Silvén; Markku J. Juntti

An increasing number of wireless communication standards has encouraged the study of programmable processors as platforms for flexible receivers. A multiple-input multiple-output (MIMO) antenna system combined with orthogonal frequency division multiplexing (OFDM) has been introduced in many wireless communication standards, such as the third generation long term evolution (3G LTE). A MIMO-OFDM system requires an efficient detector and platform support for parallel processing of multiple subcarriers. A K-best list sphere detector (LSD) provides near-optimal decoding performance and a fixed throughput, making it an interesting algorithm from the point of view of practical implementation. In this paper, we compare implementations of the K-best LSD on four processor platforms: a digital signal processor (DSP), a software-defined radio (SDR) processor, an application-specific processor (ASP), and an application-specific instruction-set processor (ASIP). The DSP is a popular very long instruction word (VLIW) device (TMS320C6455), the SDR processor employs multithreading and multiple cores (SB3500 core processor), the ASP is based on the transport triggered architecture (TTA), and the ASIP is the SDR processor enhanced with a special instruction-set extension for sorting. A 2×2 MIMO antenna system with 64-quadrature amplitude modulation (64-QAM) is assumed. The chosen list sizes, K=8 and K=16, are based on simulations carried out in the MATLAB environment with 3G LTE parameters. The proposed ASIP achieved a promising throughput of 32.0 Mbps, whereas the SDR implementation on the SB3500 core processor suffers from an inefficient software sorter. The ASP, for which minimized hardware complexity was the goal, achieves a throughput of 7.6 Mbps. A more essential consideration, however, is the symbol time, which sets strict parallel processing requirements on the programmable processors.
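
The breadth-first search behind a K-best detector can be sketched as follows. This is a minimal illustration, not the implementation from the paper: it assumes the channel has already been QR-decomposed into an upper-triangular real-valued matrix, and the function name `k_best_detect`, the example matrix, and the symbol vector are hypothetical. The sort over expanded candidates is the step the abstract identifies as the software bottleneck.

```python
# Real-valued 64-QAM: each real dimension takes one of 8 amplitude levels.
ALPHABET = [-7, -5, -3, -1, 1, 3, 5, 7]


def k_best_detect(R, y, alphabet, K):
    """Breadth-first K-best search for y = R s + noise, with R upper triangular.

    Returns up to K candidates as (accumulated PED, symbol list), best first.
    """
    n = len(y)
    # Each candidate holds its accumulated partial Euclidean distance (PED)
    # and the symbols already chosen for layers i+1 .. n-1.
    candidates = [(0.0, [])]
    for i in range(n - 1, -1, -1):  # detect from the last layer upward
        expanded = []
        for ped, tail in candidates:
            # Interference from the layers that are already decided.
            interference = sum(R[i][j] * tail[j - i - 1] for j in range(i + 1, n))
            for s in alphabet:
                error = y[i] - interference - R[i][i] * s
                expanded.append((ped + error * error, [s] + tail))
        # Sorting step: keep only the K candidates with the smallest PED.
        expanded.sort(key=lambda c: c[0])
        candidates = expanded[:K]
    return candidates


# Hypothetical 4x4 real-valued system (a 2x2 complex MIMO channel), noise free.
R = [
    [2.0, 0.5, -0.25, 1.0],
    [0.0, 1.5, 0.5, -0.5],
    [0.0, 0.0, 1.0, 0.25],
    [0.0, 0.0, 0.0, 2.0],
]
s_true = [3, -5, 7, -1]
y = [sum(R[i][j] * s_true[j] for j in range(4)) for i in range(4)]

if __name__ == "__main__":
    best_ped, best_symbols = k_best_detect(R, y, ALPHABET, K=16)[0]
    print(best_symbols, best_ped)
```

Because exactly K candidates survive each layer, the amount of work per detected vector is fixed, which is the source of the fixed throughput mentioned above.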

European Conference on Networks and Communications | 2017

On LoRaWAN scalability: Empirical evaluation of susceptibility to inter-network interference

Konstantin Mikhaylov; Juha Petäjäjärvi; Janne Janhunen

Having appeared on the stage quite recently, Low Power Wide Area Networks (LPWANs) are currently receiving much attention. In this paper we study the susceptibility of one LPWAN technology, namely LoRaWAN, to inter-network interference. By means of extensive empirical measurements employing certified commercial transceivers, we characterize the effect of the modulation and coding schemes (known in LoRaWAN as data rates (DRs)) of a transmitter and an interferer on the probability of successful packet delivery when operating in the EU 868 MHz band. We show that, in reality, transmissions with different DRs in the same frequency channel can negatively affect each other and that high DRs are affected by interference more severely than low ones. We also show that the LoRa-modulated DRs are affected by interference much less than the FSK-modulated one. Importantly, the presented results provide insight into the network-level operation of the LoRa LPWAN technology in general, and its scalability potential in particular. The results can also be used as a reference for simulations and analyses or for defining the communication parameters for real-life applications.

International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation | 2011

FPGA based application specific processing for sensor nodes

Teemu Nylanden; Janne Janhunen; Jari Hannuksela; Olli Silvén

Energy-efficient sensor nodes are among the rapidly expanding applications of embedded systems technology. Typically, the processing resources in sensor nodes are based on programmable microcontrollers and digital signal processors, and the same processing architecture is used regardless of the actual task of the node. This regularly results in over-provisioning of resources by at least an order of magnitude, and in higher power consumption than tightly application-specific processing solutions would need. Current experiments show that Flash FPGA technology enables precisely provisioned processing for sensor nodes with energy efficiency that rivals off-the-shelf processor solutions. The expected competitiveness originates from savings in silicon real estate and lowered software overheads, as inherently parallel tasks can be offloaded to dedicated hardware accelerators on the same die as a microcontroller unit and radio baseband. The results pave the way for a novel type of self-powered sensor node whose processing resources are configured according to its tasks.

International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation | 2008

Software defined radio implementation of K-best list sphere detector algorithm

Janne Janhunen; Olli Silvén; Markku J. Juntti; Markus Myllylä

In this study, a K-best list sphere detector (LSD) algorithm based on a real-valued signal model is implemented on a fixed-point digital signal processor (DSP). A 2×2 multiple-input multiple-output (MIMO) antenna system with 64-quadrature amplitude modulation (64-QAM) is assumed. Our earlier studies showed that software sorting does not meet the real-time requirements; thus, in the current study we assume a hardware sorter. The chosen list size K=16 is based on simulation results obtained in the MATLAB environment. We implemented the K-best LSD algorithm on the Sandblaster multithreaded processor and achieved a throughput of 17.9 Mbps when a hardware sorter was assumed alongside the digital signal processor. The study shows that a general-purpose digital signal processor has the potential to achieve high throughput when a hardware-accelerated sorter is assumed. In the current study, the latency of the control code and partial Euclidean distance (PED) calculations was decreased, but the latency of memory loads and stores remains significant. We also compare the results with those from an x86 processor architecture and an application-specific instruction set processor (ASIP) implemented using the transport triggered architecture (TTA), for which the same parameters were used. The TTA has benefits compared to DSPs, especially in data transmission.

International Symposium on System-on-Chip | 2013

Study of adaptive detection for MIMO-OFDM systems

Essi Suikkanen; Janne Janhunen; Shahriar Shahabuddin; Markku J. Juntti

Requirements for higher data rates and lower power consumption set new challenges for the implementation of multiple-input multiple-output orthogonal frequency division multiplexing (MIMO-OFDM) receivers. Simple detectors have the advantage of low complexity and power consumption, but they cannot offer performance as good as that of more complex detectors. It would therefore be beneficial to adapt the detector algorithm to the channel conditions in order to minimize the receiver processing power consumption while satisfying the quality of service requirements. At a low signal-to-noise ratio (SNR) and/or in a low-rank channel, more power and computation resources could be used for detection to guarantee reliable communication, while in good conditions a simple and less power-consuming detector could be used. In this paper, we compare the performance of different detection algorithms. The performance results are based on simulations in a long term evolution (LTE) system. The effect of precoding and hybrid automatic repeat request (HARQ) on the performance is shown. Implementation results based on the existing literature are included in the comparison. We discuss when it would be beneficial to use a complex detector and when a simple one would be sufficient. The switching criterion is also discussed.

International Symposium on Circuits and Systems | 2015

A customized lattice reduction multiprocessor for MIMO detection

Shahriar Shahabuddin; Janne Janhunen; Zaheer Khan; Markku J. Juntti; Amanullah Ghazi

Lattice reduction (LR) is a preprocessing technique for multiple-input multiple-output (MIMO) symbol detection that achieves better bit error rate (BER) performance. In this paper, we propose a customized homogeneous multiprocessor for LR. Each individual core is based on the transport triggered architecture (TTA). We propose a few modifications to the popular LR algorithm, Lenstra-Lenstra-Lovász (LLL), for high throughput. High-level programming is used to implement the control path of the TTA cores, and several special function units are designed to accelerate the program. The multiprocessor takes 187 cycles to reduce a single matrix. The architecture is synthesized in 90 nm technology and takes 405 kgates at 210 MHz.
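
The reported cycle count and clock frequency imply a per-matrix latency and a reduction rate, worked out below. This is a back-of-the-envelope sketch that assumes the 187 cycles are counted at the 210 MHz synthesis clock and that matrices are reduced back to back with no overlap; the abstract does not state either point, and the function names are illustrative.

```python
CLOCK_HZ = 210e6         # synthesis clock frequency from the abstract
CYCLES_PER_MATRIX = 187  # cycles to reduce one matrix, from the abstract


def latency_seconds(cycles, clock_hz):
    """Time to process one matrix when it occupies the processor for `cycles` cycles."""
    return cycles / clock_hz


def matrices_per_second(cycles, clock_hz):
    """Reduction rate, assuming matrices are processed back to back without overlap."""
    return clock_hz / cycles


if __name__ == "__main__":
    latency_us = latency_seconds(CYCLES_PER_MATRIX, CLOCK_HZ) * 1e6
    rate_millions = matrices_per_second(CYCLES_PER_MATRIX, CLOCK_HZ) / 1e6
    print(f"latency: {latency_us:.3f} us per matrix")
    print(f"rate: {rate_millions:.3f} million matrices per second")
```

Under these assumptions, one reduction takes about 0.89 µs, which corresponds to roughly 1.12 million matrices per second.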

International Conference on Acoustics, Speech, and Signal Processing | 2011

Fixed- versus floating-point implementation of MIMO-OFDM detector

Janne Janhunen; Perttu Salmela; Olli Silvén; Markku J. Juntti

In this paper, we investigate the opportunities that floating-point arithmetic offers for high-level-language-based development free of assembly and intrinsics. We compare the characteristics of floating- and fixed-point arithmetic by simulating a MIMO-OFDM soft-output detector in a 3G LTE link-level simulator. The hardware complexity and energy dissipation are analyzed by implementing three programmable processors supporting 32- and 12-bit floating-point and 16-bit fixed-point arithmetic. The processors are based on the transport triggered architecture (TTA), which has a very low programmability overhead. The analysis shows that, at the same goodput rate, a floating-point implementation can achieve a lower gate count and better power efficiency than a fixed-point design.
