Reconfigurable Computing Applied to Latency Reduction for the Tactile Internet

Abstract

Tactile internet applications allow robotic devices to be remotely controlled over a communication medium with an unnoticeable time delay. In bilateral communication, the acceptable round-trip latency is usually on the order of 1 ms to 10 ms, depending on the application requirements. It is estimated that 70% of the total latency is generated by the communication network, and the remaining 30% is produced by the master and slave devices. This paper therefore proposes a strategy to reduce the 30% of the total latency that is produced by such devices. The strategy is to apply reconfigurable computing using FPGAs to minimize the execution time of device-associated algorithms. With this in mind, this work presents a hardware reference model for modules that implement nonlinear positioning and force calculations, as well as a tactile system formed by two robotic manipulators. In addition to presenting the implementation details, simulations and experimental tests are performed to validate the proposed model. Results associated with the FPGA sampling rate, throughput, latency, and post-synthesis occupancy area are analyzed.


JOSÉ C. V. S. JUNIOR, MATHEUS F. TORQUATO, TOKTAM MAHMOODI, MISCHA DOHLER (FELLOW, IEEE), AND MARCELO A. C. FERNANDES

Laboratory of Machine Learning and Intelligent Instrumentation (LMLII), nPITI-IMD, Federal University of Rio Grande do Norte, 59078-970, Natal, Brazil.
College of Engineering, Swansea University, Swansea, Wales SA2 8PP, UK.
Centre for Telecommunications Research, Department of Engineering, King's College London, London WC2R 2LS, UK.
Department of Computer and Automation Engineering, Federal University of Rio Grande do Norte, Natal, 59078-970, Brazil.
(Current address) John A. Paulson School of Engineering and Applied Sciences, Harvard University, 02138, Cambridge, MA, USA.

Corresponding author: Marcelo A. C. Fernandes ([email protected] or [email protected]).

This study was partially financed by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Finance Code 001.

INDEX TERMS

Tactile Internet, Latency Reduction, Haptic Devices, Reconfigurable Computing, FPGA.

I. INTRODUCTION

The tactile internet is conceptually defined as the new generation of internet connectivity, which will combine very low latency with extremely high availability, reliability and security [1]. Another feature pointed out is that this new generation will be centered around applications that use human-to-machine (H2M) communications alongside devices that are compatible with tactile sensations [2], [3].

A tactile internet environment is basically composed of a local device (known as a master) and a remote device (known as a slave), where the master device is responsible for controlling the slave device over the internet through a two-way data communication network [4], [5]. Bidirectional communication is needed to simulate the physical laws of action and reaction, where the action is represented by sending operational commands and the reaction by the forces resulting from that action. In tactile internet applications, the desired time delay for device communication is characterized by ultra-low latency. In bilateral communication, the required round-trip latency ranges from 1 ms up to 10 ms depending on the application requirements [6]–[9].

According to [10], 30% of the total system latency in a tactile internet application is generated by the master and slave devices. These devices demand high processing speeds, since a variety of computationally expensive algorithms and techniques must be executed repeatedly. These algorithms involve arithmetic operations and the evaluation of linear and nonlinear equations that need to be computed at high sampling rates in order to maintain application fidelity. The remaining 70% of the latency is caused by the communication network, which makes current networks unsuitable for such latency constraints [11]. To minimize this problem, some research groups have been studying prediction techniques; many algorithms have been studied, and proposals using artificial intelligence (AI) have proved to be effective [12]. On the other hand, the implementation of complex AI-based prediction methods can further increase the latency of the computer systems present in master and slave devices.

Alternatively, new approaches such as reconfigurable computing can improve the performance of master and slave devices in a tactile system environment. Reconfigurable computing with field-programmable gate arrays (FPGAs) enables the creation of customizable hardware, which allows algorithms to be parallelized and optimized at the logic-gate level to speed up their operations. Literature results show that computationally expensive algorithms can achieve speedups of up to […]× over software implementations when custom-implemented in FPGAs [13]–[19].

In this context, this paper proposes an implementation that targets the 30% of the total latency related to tactile devices. The project uses reconfigurable computing in FPGA to minimize the execution time of the algorithms associated with master and slave devices. Reconfigurable computing allows the algorithms to be parallelized and reduces latency compared to software systems embedded in traditional architectures with general-purpose processors and microcontrollers. To validate the proposed strategy, this paper presents a discrete reference model that can be adjusted for different types of master and slave devices in a tactile internet system. Validation results, throughput, and post-synthesis figures obtained for the proposed FPGA hardware implementation are presented. Comparisons with other works in the literature show that the use of reconfigurable computing can significantly accelerate processing in tactile devices.

II. RELATED WORK

The authors of [20] presented a tactile internet environment that used a glove-type device in conjunction with a robotic manipulator. The environment was developed on a general-purpose processor, which made the execution of the algorithms sequential. To send the data, the tactile glove introduced a latency of approximately […] ms, and the hardware responsible for the inverse kinematics calculations took […] ms. The latency values obtained in this application could be improved by hardware structures that allow the algorithms to be parallelized.

Studies in the literature demonstrate the benefit of using FPGAs to increase the sampling rate for data acquisition from devices associated with haptic systems. The authors of [21] presented an implementation for controlling a 3-DoF (degree of freedom) device. The technique increased the device sampling rate by using FPGA hardware together with a real-time operating system (RTOS), in order to improve the acquisition resolution of the stiffness sensor. The control technique was developed in […]-bit fixed point, and trigonometric functions were implemented using lookup tables.

The work described in [22] presented a control system for one-dimensional (1-DoF) haptic devices. The FPGA control implementation used the single-precision floating-point representation (IEEE Std 754), and the algorithms performed all calculations in […] µs. The processing time was satisfactory; however, the size of the data frame sent over the network grew with the number of DoFs. This peculiarity can increase latency for more complex haptic systems with many DoFs. Along the same lines, an implementation for bilateral control of 1-DoF haptic devices was presented in [23]. A more accurate control technique based on sliding mode control (SMC) was implemented in FPGA, and CORDIC (COordinate Rotation DIgital Computer) was used to assist with the complex calculations. The hardware was designed to locally control two devices, one master and one slave. The implementation used a […]-bit fixed-point format, with […] bits for the integer part and […] bits for the fractional part, and the total execution time of the controllers was […] µs.

The works [21], [22] and [23] presented controllers that depend directly on the encoder readings of the device motors. In commercial models, accessing the device electronics can be tricky, requiring some reverse engineering and specific knowledge to make the appropriate encoder connections. Other works, in contrast, abstract the data acquisition and work directly with robotics algorithms. These algorithms may require computational power that surpasses the capabilities of many general-purpose processors (GPPs), which perform the operations sequentially.

Some studies demonstrate the benefit of using FPGAs to accelerate robotic manipulation algorithms related to haptic systems. A hardware architecture implemented in FPGA for the forward kinematics of 5-DoF robots using floating-point arithmetic was described in [24]. In this implementation, all the forward kinematics calculations were performed within […] µs, which represents […] clock cycles at a frequency of […] MHz. The equivalent software implementation has a total processing time of […] ms. Overall, the hardware implementation is […]× faster than the software implementation, a considerable acceleration of the forward kinematics processing time.

The authors of [25] presented an FPGA implementation of the inverse kinematics, velocity and acceleration calculations of a […]-DoF robot. Three systems were created: the first did not use any arithmetic co-processor, and floating-point operations were performed in software; the second used a floating-point co-processor that executed the four basic mathematical operations in hardware; the third also had a custom arithmetic co-processor, which in this case additionally computed the square root in hardware. The overall times to perform the calculations were […] µs, […] µs and […] µs, and the total logic elements used from the entire device were […] ([…]), […] ([…]) and […] ([…]), respectively. The work uses a hardware-software approach to implement inverse kinematics, in which the critical parts were implemented in FPGA to accelerate the whole process.

In [26], hardware to control a […]-DoF device is presented, using a […]-bit fixed-point representation in which […] bits were used for the fractional part and […] bits for the integer part. In that work, a CORDIC implementation was used to assist with the trigonometric calculations. The total time spent to compute the forward kinematics was […] µs, and for the inverse kinematics it was […] µs, for a clock of […] MHz. However, in that proposal some calculations were performed sequentially: the forward kinematics required […] clock cycles and the inverse kinematics […] cycles. The partial parallelization of the robotic manipulation algorithms provided a significant increase in system throughput. Nevertheless, there is still room for improvement, since all calculations can be computed in parallel.

Another hardware implementation of inverse kinematics was presented in [27]. The device used was a 10-DoF biped robot, and a CORDIC implementation was used to perform the trigonometric calculations. The execution time needed to compute the kinematics of the joints in FPGA was […] µs. A comparison with a software implementation was also performed; the time taken to perform the same calculations was […] µs, i.e. the gain in execution time, or speedup, on custom FPGA hardware was […]×. The resulting error between both implementations was acceptable for this specific control.

In [28], an FPGA implementation of the forward and inverse kinematics of a […]-DoF device was presented. The hardware was developed using a fixed-point representation in which […] bits were used to represent the angles and […] bits for the fractional part. For the spatial position of the device, […] bits were used, of which […] bits for the fractional part. The trigonometric functions were implemented with a combination of lookup tables (LUTs) and Taylor series. To perform the necessary calculations, a finite-state machine (FSM) model was used to reduce the use of hardware resources; however, the FSM made the computation of the robotic manipulation algorithms sequential. In this model, the forward kinematics implementation achieved a runtime of […] ns and the inverse kinematics […] ns; that is, for the […] MHz clock, the forward kinematics took […] clock cycles and the inverse kinematics took […] cycles. Using such approaches to reduce the use of hardware resources increases the computation runtime. For tactile device applications, it is important to optimize the runtime rather than the use of hardware resources.

Similarly, an FPGA implementation of forward and inverse kinematics for a […]-DoF device was presented in [29]; however, only the […] DoFs required to control the device movement were implemented in hardware. The proposal used a […]-bit fixed-point representation, and a CORDIC was used to execute the trigonometric functions. To validate the proposal, the FPGA was set to receive the three reference angles, perform the forward kinematics and then the inverse. The model was pipelined and the operating frequency was […] MHz. As a result, the model took […] µs to perform the entire kinematics algorithm, which represented […] clock cycles.

In this context, it is clear that reconfigurable FPGA-based computing can accelerate haptic device control algorithms. Unlike traditional hardware that processes information sequentially, an FPGA enables parallel information processing. However, most studies in the literature have developed partially parallel implementations, that is, implementations in which parts of the algorithms are executed sequentially. Unlike the works mentioned above, this study presents a new approach in which the robotic manipulation algorithms are executed in a fully parallel hardware implementation. The proposed implementation reduces the latency of the tactile devices and enables tactile internet applications.

III. DISCRETE MODEL OF THE TACTILE INTERNET

A discrete model of the tactile internet system is proposed and presented in Figure 1. This model consists of seven subsystems: the Operator (OP), Master Device (MD), Hardware of the MD (HMD), Network (NW), Hardware of the SD (HSD), Slave Device (SD) and the Environment (ENV). It is assumed that the signals are sampled at a sampling time $t_s$.

The OP is an entity responsible for generating stimuli that can be in the form of position signals, speed, force, image, sound or any other. These stimuli are sent to the devices involved so that some kind of task can be performed in some kind of environment. The environment, the ENV subsystem, receives the stimuli from the OP and generates feedback signals associated with sensations, such as reactive force information and tactile information, that are sent back to the OP. The interaction between the OP and the ENV is performed through the master and slave devices, MD and SD, respectively.

Specifically in this work, the MD is characterized as a local device and the SD as a remote one, and both are responsible for transforming the stimuli and sensations associated with the OP and the ENV into signals to be processed. Tactile devices (MD and SD) can take the form of robotic manipulators, haptic devices, tactile gloves and others that may be developed in the future. In the coming years, new types of sensors and actuators are expected to be introduced that will form the basis for the development of new tactile devices.

Although there are no tactile internet standards or products yet, it can be affirmed that future tactile devices will be integrated with hardware responsible for all operational metrics and calculations. Within this conjecture, this work adds a pair of modules to the discrete model (as per Figure 1), called HMD and HSD. The HMD is responsible for performing all transformations and calculations associated with the MD, and the HSD performs the equivalent operations for the SD. Several algorithms associated with transformation, compression, control and prediction will be under the responsibility of these two modules.

Based on the model presented in Figure 1, the signals generated by the OP can be characterized by the array $\mathbf{a}(n)$, expressed as

$$\mathbf{a}(n) = \left[ a_1(n), \ldots, a_i(n), \ldots, a_{N_{OP}}(n) \right], \tag{1}$$

where $a_i(n)$ is the $i$-th stimulus at the $n$-th instant and $N_{OP}$ is the total number of stimulus signals generated by the OP.

FIGURE 1. Proposed discrete model of the tactile internet system.

At every $n$-th instant the stimulus array, $\mathbf{a}(n)$, is received by the MD, which transforms the stimuli into a set of $N_{MD}$ signals expressed as

$$\mathbf{b}(n) = \left[ b_1(n), \ldots, b_i(n), \ldots, b_{N_{MD}}(n) \right], \tag{2}$$

where $b_i(n)$ is the $i$-th signal generated by the MD at the $n$-th instant. At each $n$-th instant, a set of stimuli $\mathbf{a}(n)$ generates a set of signals $\mathbf{b}(n)$ that depends on the type of MD and the sensor set associated with the device. Especially important is the fact that the signals generated by the MD, $\mathbf{b}(n)$, have heterogeneous characteristics, in which each $i$-th signal $b_i(n)$ can represent an angle, a spatial coordinate, a pixel of an image, an audio sample or any other information associated with a stimulus generated by the OP. In practice, the signals grouped in the $\mathbf{b}(n)$ array originate from sensors coupled to the MD, and the amount of data may vary according to the amount of information to be sent, $N_{MD}$.

The set of signals expressed by $\mathbf{b}(n)$ is sent to the HMD (Figure 1), which has the function of processing this information before sending it to the NW subsystem. Calculations associated with calibration, linear and nonlinear transformations and signal compression are performed by the HMD. Essentially, the majority of the computational effort of the MD is in this subsystem. At each $n$-th instant $t_s$, the HMD processes the array $\mathbf{b}(n)$, generating an information array $\mathbf{c}(n)$ expressed by

$$\mathbf{c}(n) = \left[ c_1(n), \ldots, c_i(n), \ldots, c_{N^f_{HMD}}(n) \right], \tag{3}$$

where $c_i(n)$ is the $i$-th signal generated by the HMD towards the NW subsystem at the $n$-th instant $t_s$ and $N^f_{HMD}$ is the number of signals. $N^f_{HMD} < N_{MD}$ is expected, in order to minimize latency during the transmission in the NW subsystem.

The NW subsystem, as shown in Figure 1, characterizes the communication medium that links the OP to the ENV. In this model, the data propagates through two different channels: the forward channel, which transmits the OP data towards the ENV, and the backwards channel, which transmits the ENV signals towards the OP. The signals transmitted by the forward and backwards channels may be disturbed and delayed. In the case of the forward channel, the received signal, $\mathbf{v}(n)$, may be expressed as

$$\mathbf{v}(n) = \left[ v_1(n), \ldots, v_i(n), \ldots, v_{N^f_{HMD}}(n) \right], \tag{4}$$

where

$$v_i(n) = c_i\left( n - d^f_i(n) \right) + r^f_i(n), \tag{5}$$

in which $r^f_i(n)$ represents the added noise and $d^f_i(n)$ represents the delay associated with the $i$-th information sent in $\mathbf{c}(n)$. In this model, the noise can be characterized as a Gaussian random variable of zero mean and variance $\sigma^2_{r^f}$, and the delays are characterized as integers, that is, they occur at a granularity of $t_s$. It is important to note that the NW subsystem can take the shape of the Internet, a metropolitan area network (MAN), a local area network (LAN), or even a direct connection between an MD and a workstation or computer.

As shown in Figure 1, the HSD receives the $\mathbf{v}(n)$ signal through the forward channel and has the role of generating control signals to the SD through the signal

$$\mathbf{u}(n) = \left[ u_1(n), \ldots, u_i(n), \ldots, u_{N^f_{HSD}}(n) \right], \tag{6}$$

where $N^f_{HSD}$ is the number of control signals and $u_i(n)$ is the $i$-th control signal at the $n$-th instant $t_s$ associated with the array $\mathbf{u}(n)$. It is important to note that there may be various types of SD, from real robotic handlers to virtual tools in computational environments. Thus, it can be stated without loss of generality that the HSD can perform the inverse of the HMD processing, in addition to specific algorithms associated with the type of SD.
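The forward-channel model in (5) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not part of the proposed hardware: the function name `forward_channel`, the `seed` argument and the choice of reading 0.0 for samples that would precede $n = 0$ are hypothetical, and the noise is parameterized by its standard deviation rather than by its variance.

```python
import random


def forward_channel(c, delays, sigma, seed=0):
    """Apply v_i(n) = c_i(n - d_i(n)) + r_i(n) to every transmitted signal.

    c      -- list of samples; c[n][i] is the i-th HMD signal at instant n
    delays -- delays[n][i] is the non-negative integer delay d_i(n), in units of t_s
    sigma  -- standard deviation of the zero-mean Gaussian noise r_i(n)
    seed   -- hypothetical argument, used only to make the sketch reproducible
    """
    rng = random.Random(seed)
    v = []
    for n, sample in enumerate(c):
        received = []
        for i in range(len(sample)):
            m = n - delays[n][i]  # instant at which the arriving sample was sent
            # Assumption of this sketch: nothing was sent before n = 0,
            # so a sample that has not arrived yet is read as 0.0.
            delayed = c[m][i] if m >= 0 else 0.0
            received.append(delayed + rng.gauss(0.0, sigma))
        v.append(received)
    return v


# Two signals, six instants, constant delays of 2 and 1 samples, no noise.
c = [[float(n), 10.0 + n] for n in range(6)]
delays = [[2, 1] for _ in range(6)]
v = forward_channel(c, delays, sigma=0.0)
```

With `sigma=0.0` the output is just a delayed copy of $\mathbf{c}(n)$, which makes the $t_s$ granularity of the delays easy to inspect; a positive `sigma` adds the Gaussian disturbance on top of it.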
For example, if the SD is a robotic handler, the HSD must additionally implement closed-loop control algorithms, whereas if the SD is a virtual arm, the HSD must implement positioning algorithms for a given virtual reality platform. The SD does not have to correspond directly to the MD, e.g. the MD can be a glove while the SD is a drone. However, it is desirable that the stimulus generated by the SD be a copy of the stimulus generated by the OP; that is, within the model presented in Figure 1, the SD generates a signal expressed as

$$\hat{\mathbf{a}}(n) = \left[ \hat{a}_1(n), \ldots, \hat{a}_i(n), \ldots, \hat{a}_{N_{OP}}(n) \right], \tag{7}$$

where $\hat{a}_i(n)$ is an estimate of the $i$-th stimulus $a_i(n)$ generated by the OP. Thus, the estimate of the stimulus generated by the OP, $\hat{a}_i(n)$, is applied to the ENV subsystem, which represents a given real or virtual environment with which the OP is interacting.

In the backwards direction, the stimulus actions generated by the OP, $\mathbf{a}(n)$, and represented by $\hat{\mathbf{a}}(n)$, receive a group of reactions from the ENV subsystem that can be characterized in the model by the set of signals expressed by

$$\mathbf{o}(n) = \left[ o_1(n), \ldots, o_i(n), \ldots, o_{N_{ENV}}(n) \right], \tag{8}$$

where $N_{ENV}$ is the number of stimulus signals and $o_i(n)$ is the $i$-th stimulus signal at the $n$-th instant $t_s$. The reaction signals grouped into $\mathbf{o}(n)$ can be in the form of force, touch, temperature, etc.

Reaction signals are captured by the SD, which turns this information into electrical signals from real sensors, or from virtual sensors if the SD is in a virtual reality environment. After capturing this information, the SD transmits these signals to the HSD. In the model presented in Figure 1, the signals generated by the SD are expressed as

$$\mathbf{g}(n) = \left[ g_1(n), \ldots, g_i(n), \ldots, g_{N_{SD}}(n) \right], \tag{9}$$

where $g_i(n)$ is the $i$-th signal generated by the SD at the $n$-th instant of time $t_s$ and $N_{SD}$ is the number of signals. The HSD in turn processes this information and sends it to the NW subsystem through the array $\mathbf{h}(n)$, expressed by

$$\mathbf{h}(n) = \left[ h_1(n), \ldots, h_i(n), \ldots, h_{N^b_{HSD}}(n) \right], \tag{10}$$

where $h_i(n)$ is the $i$-th signal generated by the HSD at the $n$-th instant of time $t_s$ and $N^b_{HSD}$ is the number of signals.

The signal received by the HMD through the backwards channel of the NW subsystem can be expressed as

$$\mathbf{q}(n) = \left[ q_1(n), \ldots, q_i(n), \ldots, q_{N^b_{HSD}}(n) \right], \tag{11}$$

where

$$q_i(n) = h_i\left( n - d^b_i(n) \right) + r^b_i(n), \tag{12}$$

in which $r^b_i(n)$ represents the added noise and $d^b_i(n)$ represents the delay associated with the $i$-th information transmitted in $\mathbf{q}(n)$ by the backwards channel. As in the forward channel, the noise can be characterized as a Gaussian random variable of zero mean and variance $\sigma^2_{r^b}$, and the delays are characterized as integers with $t_s$ granularity. The HMD processes the $\mathbf{q}(n)$ signal and generates a set of control signals that will act on the MD and can be characterized as

$$\mathbf{p}(n) = \left[ p_1(n), \ldots, p_i(n), \ldots, p_{N^b_{HMD}}(n) \right], \tag{13}$$

where $p_i(n)$ is the $i$-th signal generated by the HMD at the $n$-th instant of time $t_s$ and $N^b_{HMD}$ is the number of signals. The MD in turn synthesizes the reaction stimuli generated by the environment, i.e. the ENV subsystem. Based on the model, it is possible to characterize these reaction stimuli as a signal expressed by

$$\hat{\mathbf{o}}(n) = \left[ \hat{o}_1(n), \ldots, \hat{o}_i(n), \ldots, \hat{o}_{N_{ENV}}(n) \right], \tag{14}$$

where $\hat{o}_i(n)$ is an estimate of the $i$-th stimulus $o_i(n)$ generated in the ENV subsystem. Examples of reaction stimuli generated or synthesized by the MD are touch, force and temperature.

In addition to the latency associated with the NW subsystem, which characterizes the communication medium between the OP and ENV subsystems, the MD, HMD, HSD and SD subsystems also add latency to the system. Based on the work presented in [10], [11], these components represent 30% of the total latency. The latency of the MD and SD subsystems is associated with sensors and actuators, which can be mechanical, electrical, electromechanical and other variations. The HMD and HSD latencies are associated with the processing time of the algorithms in these devices and, depending on the type of hardware and the implementation architecture, this latency can be considerably reduced.

IV. PHANTOM OMNI DEVICE MODEL (MD & SD)

Based on the scheme presented in Figure 1, this section presents details associated with the MD and SD used as reference for the hardware system proposed in this research. The MD and SD are characterized as a three-degree-of-freedom (3-DoF) robotic manipulator called the PHANToM Omni [30] (Figure 2). The PHANToM Omni has been widely used in the literature, as presented in [31] and [32]. In this work, two of these devices are used: one as the MD and the other as the SD.

FIGURE 2. PHANToM Omni - MD and SD.

As can be seen from Figure 3, the physical structure of the PHANToM Omni is formed by a base, an arm with two segments $L_1$ and $L_2$, which are interconnected by three rotary joints $\theta_1$, $\theta_2$ and $\theta_3$, and a tool. The variables presented in Figure 3 are: $L_1 = 0.135$ mm, $L_2 = L_1$, $L_3 = 0.025$ mm and $L_4 = L_1 + A$, where $A = 0.035$ mm, as described in [33]. These detailed features of the device are essential for performing the kinematics and dynamics calculations.

FIGURE 3. PHANToM Omni structure - MD and SD.

A. FORWARD KINEMATICS

The kinematics of manipulators makes use of the relationship between operational coordinates and joint coordinates. Forward kinematics (FK) relates the angular variables of the joints to the Cartesian system. That is, given an array of joint coordinates, it is possible to determine the spatial position of the tool through the equations

$$x = -\sin(\theta_1)\left( L_2 \sin(\theta_3) + L_1 \cos(\theta_2) \right), \tag{15}$$

$$y = -L_2 \cos(\theta_3) + L_1 \sin(\theta_2) + L_3, \tag{16}$$

$$z = L_2 \cos(\theta_1) \sin(\theta_3) + L_1 \cos(\theta_1) \cos(\theta_2) - L_4, \tag{17}$$

where $x$, $y$ and $z$ are the variables that determine the spatial position of the tool in Cartesian space.

B. INVERSE KINEMATICS

In inverse kinematics (IK), the relationship between the joint angles and the Cartesian system is reversed; that is, given the spatial position of the tool, it may be possible to determine the joint coordinates. The solution to this problem is not as straightforward as in forward kinematics. In forward kinematics, the position of the tool is determined solely by the displacements of the joints. In inverse kinematics, the equations are composed of nonlinear calculations formed by trigonometric functions. Depending on the manipulator structure, multiple solutions may be possible for the same tool position, or there may be no solution for a particular set of tool positions. Based on the works [34], [35] and [33], the value of $\theta_1$ can be defined through the equation

$$\theta_1 = -\operatorname{atan2}(x, z + L_4), \tag{18}$$

where $x$ and $z$ represent Cartesian coordinates and $L_4$ corresponds to the length indicated in Figure 3.

To calculate the other two joints, $\theta_2$ and $\theta_3$, it is necessary to perform intermediate calculations. Thus, one can obtain $R$, $r$, $\gamma$, $\beta$ and $\alpha$ through the equations

$$R = \sqrt{x^2 + (z + L_4)^2}, \tag{19}$$

$$r = \sqrt{x^2 + (z + L_4)^2 + (y - L_3)^2}, \tag{20}$$

$$\gamma = \operatorname{acos}\left( \frac{L_1^2 - L_2^2 + r^2}{2 L_1 r} \right), \tag{21}$$

$$\beta = \operatorname{atan2}(y - L_3, R), \tag{22}$$

and

$$\alpha = \operatorname{acos}\left( \frac{L_1^2 + L_2^2 - r^2}{2 L_1 L_2} \right). \tag{23}$$

After performing the intermediate calculations, it is possible to calculate $\theta_2$ through the equation

$$\theta_2 = \gamma + \beta. \tag{24}$$

Finally, the value corresponding to the $\theta_3$ joint can be obtained through the equation

$$\theta_3 = \theta_2 + \alpha - \frac{\pi}{2}. \tag{25}$$

C. KINESTHETIC FEEDBACK FORCE

The kinesthetic feedback force allows the environment to be "felt", i.e. when the SD comes into physical contact with an object, the MD receives a counter force. This model can be implemented through the equation

$$\boldsymbol{\tau} = \mathbf{J}^T \mathbf{F}, \tag{26}$$

where $\boldsymbol{\tau}$ defines the torque array that will be applied to each joint ($\theta_1$, $\theta_2$ and $\theta_3$) of the PHANToM Omni associated with the MD, $\mathbf{J}^T$ is the transpose of the Jacobian matrix and $\mathbf{F}$ is the force array resulting from the interaction of the SD with the ENV. The torque array $\boldsymbol{\tau}$ can be expressed as

$$\boldsymbol{\tau} = \left[ \tau_1, \tau_2, \tau_3 \right]. \tag{27}$$

The Jacobian matrix $\mathbf{J}$ incorporates structural information about the handler and is given by

$$\mathbf{J} = \begin{bmatrix} J_{11} & J_{12} & J_{13} \\ J_{21} & J_{22} & J_{23} \\ J_{31} & J_{32} & J_{33} \end{bmatrix}, \tag{28}$$

where

$$J_{11} = -\cos(\theta_1)\left( L_2 \sin(\theta_3) + L_1 \cos(\theta_2) \right), \tag{29}$$

$$J_{21} = 0, \tag{30}$$

$$J_{31} = -L_1 \cos(\theta_2) \sin(\theta_1) - L_2 \sin(\theta_1) \sin(\theta_3), \tag{31}$$

$$J_{12} = L_1 \sin(\theta_1) \sin(\theta_2), \tag{32}$$

$$J_{22} = L_1 \cos(\theta_2), \tag{33}$$

$$J_{32} = -L_1 \sin(\theta_2) \cos(\theta_1), \tag{34}$$

$$J_{13} = -L_2 \sin(\theta_1) \cos(\theta_3), \tag{35}$$

$$J_{23} = L_2 \sin(\theta_3), \tag{36}$$

and

$$J_{33} = L_2 \cos(\theta_1) \cos(\theta_3). \tag{37}$$

The force array $\mathbf{F}$ is expressed as

$$\mathbf{F} = \left[ F_x, F_y, F_z \right] \tag{38}$$

and can be obtained through sensors internal or external to the device. According to (26), the torque array $\boldsymbol{\tau}$, representing the resulting force at each joint, can be defined as

$$\tau_1 = J_{11} F_x + J_{21} F_y + J_{31} F_z, \tag{39}$$

$$\tau_2 = J_{12} F_x + J_{22} F_y + J_{32} F_z, \tag{40}$$

and

$$\tau_3 = J_{13} F_x + J_{23} F_y + J_{33} F_z. \tag{41}$$

V. SIMULATED TACTILE INTERNET MODEL

Figures 1 and 4 detail the structure used for the hardware design in FPGA, in which a given operator, OP, handles a PHANToM Omni on the master side, MD, which is connected to the HMD that, in this case, is dedicated FPGA hardware. Data is transmitted through the network, the NW subsystem, to the HSD, which is also dedicated FPGA hardware. The HSD is in turn connected to a PHANToM Omni that interacts with the environment, the ENV subsystem. Figure 4 also details the backwards direction, from the ENV to the OP.

The OP is modeled as an information source responsible for generating a spatial trajectory through discrete signals expressed in the $\mathbf{a}(n)$ array.
At each $n$-th instant $t_s$, the OP sends three variables, $x^{OP}(n)$, $y^{OP}(n)$ and $z^{OP}(n)$, representing the position of the MD tool (Figures 2 and 3) in Cartesian space, and this is expressed by

$$\mathbf{a}(n) = \left[ x^{OP}(n), y^{OP}(n), z^{OP}(n) \right]. \tag{42}$$

This step simulates the spatial movement of the MD tool by the operator; that is, at each instant of time $t_s$ a spatial movement is performed and a new signal $\mathbf{a}(n)$ is generated by the OP.

The PHANToM Omni has encoders at its three joints that translate the spatial position into the three angles $\theta_1$, $\theta_2$ and $\theta_3$ (Figures 2 and 3). Thus, based on Figure 4, the MD converts the signal $\mathbf{a}(n)$ into a signal expressed as

$$\mathbf{b}(n) = \left[ \theta_1^{MD}(n), \theta_2^{MD}(n), \theta_3^{MD}(n) \right] \tag{43}$$

and forwards it to the HMD at every $n$-th instant of time $t_s$.

Then, as can be seen in Figure 4, the $\mathbf{b}(n)$ signal propagates to the HMD, which on receiving the signal transforms the joint angles, $\mathbf{b}(n)$, into a spatial position by calculating the FK according to (15), (16) and (17). All equations are implemented in FPGA through a hardware module called FK-HMD. The equations are implemented in parallel, which can significantly reduce the processing time. The use of FK is motivated by a reduction in the amount of information transmitted: for an $N$-DoF robotic manipulator, $N$ joint angles are generated, and these can be converted into only three values associated with the spatial position of the tool, $x$, $y$ and $z$. On the other hand, this strategy increases the amount of calculation to be performed on the MD side, which is compensated by the parallel implementation of the algorithm in FPGA.
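As a software reference for what the FK-HMD module evaluates, equations (15) to (17) can be sketched in a few lines of Python. This is a sequential floating-point illustration, not the parallel FPGA design; the function name `forward_kinematics` is hypothetical, and the link constants are the values listed in Section IV, kept in the unit used there.

```python
import math

# Link parameters as listed in Section IV (same length unit as in the text).
L1 = 0.135
L2 = L1
L3 = 0.025
A = 0.035
L4 = L1 + A


def forward_kinematics(theta1, theta2, theta3):
    """Map the three joint angles (radians) to the tool position (x, y, z).

    Direct transcription of equations (15), (16) and (17).
    """
    x = -math.sin(theta1) * (L2 * math.sin(theta3) + L1 * math.cos(theta2))
    y = -L2 * math.cos(theta3) + L1 * math.sin(theta2) + L3
    z = (L2 * math.cos(theta1) * math.sin(theta3)
         + L1 * math.cos(theta1) * math.cos(theta2) - L4)
    return x, y, z


# One sample of b(n) is turned into one sample of c(n).
b_n = (0.3, 0.4, 0.5)
c_n = forward_kinematics(*b_n)
```

The three outputs do not depend on each other, which is the property a parallel hardware implementation can exploit: $x$, $y$ and $z$ can be computed at the same time from the same three inputs.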
It is essential to note that the use of custom hardware operating in parallel allows the processing time not to be substantially affected by $N$.

Based on Section III, after the FK calculation by the FK-HMD hardware module, a new discrete signal is created that can be expressed by

$$\mathbf{c}(n) = \left[ x^{HMD}(n),\ y^{HMD}(n),\ z^{HMD}(n) \right] \tag{44}$$

where $x^{HMD}(n)$, $y^{HMD}(n)$ and $z^{HMD}(n)$ are the values of the spatial coordinate array generated by the HMD to be sent to the HSD via the communication medium, NW. The FK-HMD hardware module generates a new $\mathbf{c}(n)$ array at every $n$-th instant of time.

After transmission through the forward channel, here called FC, the signal received by the HSD can be expressed as

$$\mathbf{v}(n) = \left[ x^{HSD}(n),\ y^{HSD}(n),\ z^{HSD}(n) \right]. \tag{45}$$

Based on (5), the spatial coordinate signal received by the HSD can be expressed as

$$x^{HSD}(n) = x^{HMD}\left(n - d_x^{f}(n)\right) + r_x^{f}(n), \tag{46}$$

$$y^{HSD}(n) = y^{HMD}\left(n - d_y^{f}(n)\right) + r_y^{f}(n), \tag{47}$$

and

$$z^{HSD}(n) = z^{HMD}\left(n - d_z^{f}(n)\right) + r_z^{f}(n) \tag{48}$$

where $d_x^{f}(n)$, $d_y^{f}(n)$, $d_z^{f}(n)$, $r_x^{f}(n)$, $r_y^{f}(n)$ and $r_z^{f}(n)$ are the delays and noises associated with the FC.

Since in this case the slave PHANToM Omni, SD, copies the movement of the master PHANToM Omni, MD, the HSD must run a feedback control system on the three joints of the slave PHANToM Omni, here expressed as

$$\boldsymbol{\theta}^{SD}(n) = \left[ \theta_1^{SD}(n),\ \theta_2^{SD}(n),\ \theta_3^{SD}(n) \right] \tag{49}$$

that is, $\theta_1^{SD}(n)$, $\theta_2^{SD}(n)$ and $\theta_3^{SD}(n)$ are the control variables associated with the SD.
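The forward-channel model of (46) to (48) can be sketched in software as follows. This is a minimal illustration, not the authors' simulator: the function name `forward_channel`, the use of whole-sample delays, the Gaussian noise and the choice to hold the first sample until delayed data exist are all assumptions made for the example.

```python
import random

def forward_channel(sent, delays, noise_std=0.0, seed=0):
    """Apply y(n) = s(n - d(n)) + r(n) to one coordinate stream.

    sent      -- samples produced by the HMD for one axis, e.g. x_HMD(n)
    delays    -- per-sample delay d(n) in whole samples (assumption)
    noise_std -- standard deviation of the additive noise r(n) (assumption)
    """
    rng = random.Random(seed)
    received = []
    for n, d in enumerate(delays):
        k = max(n - d, 0)  # before any delayed data exist, reuse sample 0
        received.append(sent[k] + rng.gauss(0.0, noise_std))
    return received

# Hypothetical x-axis samples delayed by two sampling periods, no noise.
x_hmd = [0.0, 0.1, 0.2, 0.3, 0.4]
x_hsd = forward_channel(x_hmd, delays=[2, 2, 2, 2, 2])
```

The same function would be applied independently to the $y$ and $z$ streams, each with its own delay and noise sequences.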
The control system illustrated in Figure 4 as FCS shall minimize the error, $\mathbf{e}^{FCS}(n)$, between $\boldsymbol{\theta}^{SD}(n)$ and the reference signal $\boldsymbol{\theta}^{HSD}(n)$, characterized as

$$\boldsymbol{\theta}^{HSD}(n) = \left[ \theta_1^{HSD}(n),\ \theta_2^{HSD}(n),\ \theta_3^{HSD}(n) \right] \tag{50}$$

where

$$\mathbf{e}^{FCS}(n) = \boldsymbol{\theta}^{HSD}(n) - \boldsymbol{\theta}^{SD}(n) \tag{51}$$

and

$$\begin{bmatrix} e_1^{FCS}(n) \\ e_2^{FCS}(n) \\ e_3^{FCS}(n) \end{bmatrix} = \begin{bmatrix} \theta_1^{HSD}(n) \\ \theta_2^{HSD}(n) \\ \theta_3^{HSD}(n) \end{bmatrix} - \begin{bmatrix} \theta_1^{SD}(n) \\ \theta_2^{SD}(n) \\ \theta_3^{SD}(n) \end{bmatrix}. \tag{52}$$

The $\boldsymbol{\theta}^{SD}(n)$ signal is obtained from the SD via sensors (encoders) at the SD joints, and the $\boldsymbol{\theta}^{HSD}(n)$ signal is obtained from the IK-HSD hardware module shown in Figure 4. This hardware module implements all inverse kinematics equations presented in Section IV-B, i.e. (18) through (25). Several techniques can be used in the FCS module, ranging from traditional ones such as a proportional-integral-derivative controller [36] to more innovative artificial intelligence based techniques [37], [38].

The CPD-HSD and JPD-HSD modules, illustrated in Figure 4, represent the prediction and detection algorithms in Cartesian space and in joint space, respectively. These modules are responsible for minimizing the latency and noise added by the FC of the tactile internet system (Eqs. (46), (47) and (48)). Depending on the prediction and detection technique used, the HSD may use only one of the modules, namely the CPD-HSD or the JPD-HSD. There is still no consensus about whether the Cartesian space or the joint space is best for minimizing the latency and noise inserted by the channel. Several works in the literature use only one of the spaces, while others try to use the information from both simultaneously.

FIGURE 4. Detailed discrete model of a tactile internet system.

Similarly to the FCS module, approaches ranging from traditional techniques to more innovative techniques based on artificial intelligence have been used in the CPD-HSD and JPD-HSD modules [39]–[43]. Thus, it can be said that $\boldsymbol{\theta}^{HSD}(n)$ is an estimate of the $\mathbf{b}(n)$ signal generated by the MD.

At each $n$-th time, the FCS acts on the SD through the $\mathbf{u}(n)$ signal, detailed in Figures 1 and 4, which in the case of the PHANToM Omni can be expressed as

$$\mathbf{u}^{HSD}(n) = \left[ \tau_1^{HSD}(n),\ \tau_2^{HSD}(n),\ \tau_3^{HSD}(n) \right] \tag{53}$$

where $\tau_i^{HSD}(n)$ is the torque applied to the $i$-th joint. The FCS acts as a tracking mechanism, making the SD follow the path traveled by the MD. Finalizing the data stream associated with the forward channel, the $\hat{\mathbf{a}}(n)$ signal is an estimate of the spatial position generated by the OP, i.e.

$$\hat{\mathbf{a}}(n) = \left[ \hat{x}^{OP}(n),\ \hat{y}^{OP}(n),\ \hat{z}^{OP}(n) \right]. \tag{54}$$

The interaction of the PHANToM Omni, SD, with the ENV can vary from free movement to physical contact. When physical contact occurs, the SD detects the touch and sends this information back to the HSD. As per the model detailed in Figure 4, the ENV sends back to the SD the information associated with the contact force in space, expressed here as

$$\mathbf{o}(n) = \left[ F_x^{ENV}(n),\ F_y^{ENV}(n),\ F_z^{ENV}(n) \right]. \tag{55}$$

The contact force can be measured directly through SD-coupled force sensors or estimated indirectly through other types of sensors, which may be SD-coupled or placed in the environment [44].
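To make the tracking role of the FCS concrete, the sketch below computes the joint error of (52) and one step of a discrete PID law, one of the traditional options mentioned for this module. It is only an illustration: the class name `JointPID`, the gains and the sampling time are hypothetical, and the text does not state which controller is actually used.

```python
class JointPID:
    """Independent discrete PID loop for each of the three joints (assumed structure)."""

    def __init__(self, kp, ki, kd, ts):
        self.kp, self.ki, self.kd, self.ts = kp, ki, kd, ts
        self.integral = [0.0, 0.0, 0.0]
        self.prev_error = [0.0, 0.0, 0.0]

    def step(self, theta_hsd, theta_sd):
        # Eq. (52): error between the reference angles and the measured angles.
        error = [ref - meas for ref, meas in zip(theta_hsd, theta_sd)]
        torque = []
        for i, e in enumerate(error):
            self.integral[i] += e * self.ts
            derivative = (e - self.prev_error[i]) / self.ts
            torque.append(self.kp * e
                          + self.ki * self.integral[i]
                          + self.kd * derivative)
            self.prev_error[i] = e
        return error, torque

# Hypothetical gains: proportional action only, 1 kHz sampling.
pid = JointPID(kp=2.0, ki=0.0, kd=0.0, ts=0.001)
error, torque = pid.step(theta_hsd=[0.5, 0.0, -0.25], theta_sd=[0.25, 0.0, 0.0])
```

The returned torque list plays the role of the three components in (53).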
In the model presented in Figure 4, the SD sends to the HSD the spatial positions of object surfaces, obtained through sensors spread in the ENV. The signal expressed as

$$\mathbf{s}^{OBJ}(n) = \left[ x^{OBJ}(n),\ y^{OBJ}(n),\ z^{OBJ}(n) \right] \tag{56}$$

represents the spatial position of the object closest to the SD tool. Thus, based on the information already described, at every $n$-th time $t_s$ the SD sends to the HSD a signal characterized by the array $\mathbf{g}(n)$, expressed as

$$\mathbf{g}(n) = \left\langle \boldsymbol{\theta}^{SD}(n),\ \mathbf{s}^{OBJ}(n) \right\rangle. \tag{57}$$

In the HSD, when the signal $\mathbf{g}(n)$ is received, the Split module separates the $\boldsymbol{\theta}^{SD}(n)$ signal and sends it to the FCS and to the FK-HSD hardware module, while the signal $\mathbf{s}^{OBJ}(n)$ is sent to the FBF-HSD hardware module, as detailed in Figure 4. The FK-HSD hardware module performs the forward kinematics calculation similarly to FK-HMD, and thus the current spatial position of the SD tool in the environment, ENV, can be obtained. At every $n$-th instant $t_s$, FK-HSD generates a signal expressed as

$$\mathbf{l}(n) = \left[ x^{ENV}(n),\ y^{ENV}(n),\ z^{ENV}(n) \right] \tag{58}$$

where $x^{ENV}(n)$, $y^{ENV}(n)$ and $z^{ENV}(n)$ are the spatial position of the tool in the ENV module, computed from $\boldsymbol{\theta}^{SD}(n)$. The FBF-HSD hardware module implements the calculations associated with the generation of the feedback force from the contact between the tool and the object. Based on the work presented in [44], the contact force, represented by the $\mathbf{h}(n)$ signal, can be expressed as

$$\mathbf{h}(n) = \left[ F_x^{HSD}(n),\ F_y^{HSD}(n),\ F_z^{HSD}(n) \right], \tag{59}$$

where

$$F_x^{HSD}(n) = h_x(n)\left( x^{OBJ}(n) - x^{ENV}(n) \right), \tag{60}$$

$$F_y^{HSD}(n) = h_y(n)\left( y^{OBJ}(n) - y^{ENV}(n) \right), \tag{61}$$

and

$$F_z^{HSD}(n) = h_z(n)\left( z^{OBJ}(n) - z^{ENV}(n) \right). \tag{62}$$

In these equations, $h_x(n)$, $h_y(n)$ and $h_z(n)$ represent the elasticity coefficients associated with the object. It is important to note that in this model the $\mathbf{h}(n)$ signal is a synthesized version of the real force value, here characterized by the $\mathbf{o}(n)$ array.

After the feedback force calculation, as illustrated in Figure 4, the $\mathbf{h}(n)$ signal is transmitted to the HMD via the backward channel (BC) which, similarly to the FC, adds latency and noise. The signal received by the HMD can be expressed as

$$\mathbf{q}(n) = \left[ F_x^{HMD}(n),\ F_y^{HMD}(n),\ F_z^{HMD}(n) \right] \tag{63}$$

where

$$F_x^{HMD}(n) = F_x^{HSD}\left(n - d_x^{b}(n)\right) + r_x^{b}(n), \tag{64}$$

$$F_y^{HMD}(n) = F_y^{HSD}\left(n - d_y^{b}(n)\right) + r_y^{b}(n), \tag{65}$$

and

$$F_z^{HMD}(n) = F_z^{HSD}\left(n - d_z^{b}(n)\right) + r_z^{b}(n) \tag{66}$$

where $d_x^{b}(n)$, $d_y^{b}(n)$, $d_z^{b}(n)$, $r_x^{b}(n)$, $r_y^{b}(n)$ and $r_z^{b}(n)$ are the latencies and noises associated with the BC.

Similarly to the HSD, the HMD will minimize the effect of latency and noise through operations in Cartesian and joint space. For the HMD, the calculations associated with the Cartesian space are performed by the CPD-HMD module and those associated with the joint space by the JPD-HMD module. In addition to the prediction and detection calculations, the HMD must transform the force signals received through $\mathbf{q}(n)$ into torques to be applied to the MD joints, which is accomplished by the KFF-HMD hardware module. KFF-HMD implements equations (39), (40) and (41) presented in Section IV-C and generates the signal expressed as

$$\mathbf{p}(n) = \left[ \tau_1^{HMD}(n),\ \tau_2^{HMD}(n),\ \tau_3^{HMD}(n) \right] \tag{67}$$

where $\tau_i^{HMD}(n)$ is the torque associated with the $i$-th joint of the MD.
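The two calculations on the feedback path, the force synthesis of (60) to (62) in the FBF-HSD and the force-to-torque conversion $\boldsymbol{\tau} = J^T F$ of (26) in the KFF-HMD, can be sketched as below. This is a software illustration under stated assumptions, not the hardware design: the function names are hypothetical, the Jacobian is passed in as an arbitrary 3x3 list rather than built from (29) to (37), and all numeric values are made up for the example.

```python
def feedback_force(s_obj, l_env, h):
    """Eqs. (60)-(62): per-axis elasticity coefficient times object-to-tool distance."""
    return [hk * (obj - env) for hk, obj, env in zip(h, s_obj, l_env)]

def kff_torque(jacobian, force):
    """Eq. (26), tau = J^T F: tau_i = sum over j of J[j][i] * F[j]."""
    return [sum(jacobian[j][i] * force[j] for j in range(3)) for i in range(3)]

# Hypothetical closest-object position, tool position and elasticity coefficients.
force = feedback_force(s_obj=[0.5, 0.0, 0.25],
                       l_env=[0.25, 0.0, 0.5],
                       h=[2.0, 2.0, 2.0])

# Placeholder Jacobian used only to exercise the transpose product.
J = [[1, 2, 0],
     [0, 1, 0],
     [0, 0, 1]]
tau = kff_torque(J, [1, 1, 1])
```

In the full system the force would first cross the backward channel of (64) to (66) before reaching the torque conversion.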
Since the PHANToM Omni is a haptic device, it already has a built-in control system, FCS, which uses the torques in the $\mathbf{p}(n)$ array as its reference signal. After the torques are applied to the MD joints via the $\mathbf{p}(n)$ signal, the OP receives the feedback force signal; in other words, the operator feels the object touched by the SD in the ENV. This sensation is identified by the $\hat{\mathbf{o}}(n)$ signal, expressed as

$$\hat{\mathbf{o}}(n) = \left\langle \hat{F}_x^{ENV}(n),\ \hat{F}_y^{ENV}(n),\ \hat{F}_z^{ENV}(n) \right\rangle. \tag{68}$$

As illustrated in Figure 4, the MD, HMD, NW, HSD and SD subsystems have the following runtimes: $t_{MD}$, $t_{HMD}$, $t_{NW}$, $t_{HSD}$ and $t_{SD}$, respectively. The sum of these times, taking into account the forward direction (between OP and ENV) and the backward direction (between ENV and OP), represents the total system latency, which can be expressed as

$$t_{latency} = 2\left( t_{MD} + t_{HMD} + t_{NW} + t_{HSD} + t_{SD} \right). \tag{69}$$

Some works in the literature agree that the ideal requirement is $t_{latency} \leq 1$ ms; other works point out that the latency requirement can be expressed as $t_{latency} \leq 10$ ms, depending on the application [6]–[9], [45]. Considering that 30% of the total latency $t_{latency}$ is spent by MD, HMD, HSD and SD, and that (69) counts each device time twice, it can be understood that

$$\left( t_{MD} + t_{HMD} + t_{HSD} + t_{SD} \right) \leq 0.15\, t_{latency}. \tag{70}$$

Assuming an equal time division among MD, HMD, HSD and SD, so that each receives $0.15/4 = 0.0375$ of the total, the time associated with hardware, $t_{hardware}$, whether for the master, HMD, or for the slave device, HSD, can be expressed as

$$t_{HMD} = t_{HSD} = t_{hardware} \leq 0.0375\, t_{latency}. \tag{71}$$

Taking the 1 ms and 10 ms constraints into consideration and substituting these values in (71), the hardware time, $t_{hardware}$, must meet the $t_{hardware} \leq 37.5\ \mu$s constraint for all cases (1 ms condition) or the $t_{hardware} \leq 375\ \mu$s constraint for some specific cases (10 ms condition).

Recent studies from the literature show that the 1 ms restriction ($t_{hardware} \leq 37.5\ \mu$s) is difficult to achieve using hardware devices based on embedded systems such as microprocessors and microcontrollers [46], [47]. The 10 ms restriction ($t_{hardware} \leq 375\ \mu$s) is achieved in specific cases where the SD is a virtual environment and the HSD is a high-performance computer [45]. Thus, this work aims to minimize the execution time in the HMD, $t_{HMD}$, and in the HSD, $t_{HSD}$, using FPGA reconfigurable computing. In other words, the target is to achieve $t_{hardware} \leq 37.5\ \mu$s.

This paper presents a hardware reference model for the FK-HMD, KFF-HMD, IK-HSD, FK-HSD and FBF-HSD modules illustrated in Figure 4. The complete model, presented in detail in the next section, uses a parallel implementation methodology in which high throughput is prioritized, i.e., it targets the execution times of the modules, $t_{FK}$, $t_{KFF}$, $t_{IK}$ and $t_{FBF}$, illustrated in Figure 4.

This work does not propose dedicated hardware reference models for the CPD-HSD, JPD-HSD, CPD-HMD, JPD-HMD and FCS modules, as several techniques and algorithms can be applied to them. However, considering the hardware time constraint, $t_{hardware}$, it is also important to use dedicated hardware structures with reconfigurable computing for these modules. Studies in the literature foresee the use of AI-based techniques for these modules; however, AI techniques and algorithms implemented on general-purpose processor-based hardware platforms can lead to higher processing times [13]–[19].

VI. IMPLEMENTATION DESCRIPTION

The FK-HMD and KFF-HMD hardware modules associated with the master device (HMD) and the IK-HSD, FK-HSD and FBF-HSD hardware modules associated with the slave device (HSD) (Figure 4) were designed using a parallel implementation in order to prioritize processing speed. The implementations were designed in FPGA using a hybrid scheme, with fixed-point and floating-point representations in distinct parts of the proposed architecture. In the portions that adopt the fixed-point format, the variables follow the notation [sV.N], indicating that the variable is formed by V bits, of which N bits are intended for the fractional part, and the s symbol indicates that the variable is signed. In this case, the number of bits intended for the integer part is V − N − 1. For floating-point variables, the notation [F32] is adopted. Most of the implemented circuits were designed using the 32-bit single-precision (IEEE 754) floating-point format. The fixed-point format was used only in the circuit that implements the trigonometric function block (TFB) module, as illustrated in Figure 5. TFB is the module responsible for performing trigonometric operations through a hardware implementation of CORDIC (COordinate Rotation DIgital Computer) [48]. The implemented CORDIC circuit represents its data in the signed fixed-point [sV.N] format.

FIGURE 5. Proposed circuit for calculating trigonometric functions - TFB.
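The data flow of the TFB (F2FP conversion, CORDIC iterations on fixed-point words, FP2F conversion) can be mimicked in software as in the sketch below. It is an assumption-laden illustration rather than the implemented circuit: the number of fractional bits (16), the number of iterations (16) and the function names are chosen for the example, the total word length V is not modeled (Python integers do not overflow), and only rotation-mode sine and cosine for angles within roughly ±π/2 are covered.

```python
import math

def f2fp(value, frac_bits):
    """Float to fixed point: scale by 2^N and round to an integer word."""
    return int(round(value * (1 << frac_bits)))

def fp2f(word, frac_bits):
    """Fixed point to float: undo the 2^N scaling."""
    return word / (1 << frac_bits)

def cordic_sin_cos_fixed(angle, frac_bits=16, iterations=16):
    """Rotation-mode CORDIC using only shifts and adds on integer words."""
    # Elementary rotation angles atan(2^-i), stored as fixed-point words.
    atan_table = [f2fp(math.atan(2.0 ** -i), frac_bits) for i in range(iterations)]
    # Pre-scale x by the CORDIC gain so no final multiplication is needed.
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = f2fp(gain, frac_bits), 0, f2fp(angle, frac_bits)
    for i in range(iterations):
        if z >= 0:
            x, y, z = x - (y >> i), y + (x >> i), z - atan_table[i]
        else:
            x, y, z = x + (y >> i), y - (x >> i), z + atan_table[i]
    return fp2f(y, frac_bits), fp2f(x, frac_bits)  # (sine, cosine)

s, c = cordic_sin_cos_fixed(0.5)
```

Comparing `s` and `c` with `math.sin(0.5)` and `math.cos(0.5)` gives a quick feel for how the number of fractional bits and iterations limits the accuracy of the fixed-point path.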

As illustrated in Figure 5, the TFB module receives data from external circuits in the 32-bit floating-point standard. A conversion to the fixed-point representation, denoted by the [sV.N] notation, is performed by the Float to Fixed-point (F2FP) module, which was implemented in hardware. After the CORDIC hardware operations are performed, the fixed-point data are converted back to 32-bit floating point by the Fixed-point to Float (FP2F) module, which was also implemented in hardware.

Several of the circuits presented below use the constants $L_1$, $L_2$, $L_3$ and $L_4$. They represent physical characteristics of the PHANToM Omni device, as illustrated in Figure 2. These constants use the 32-bit floating-point representation.

A. FORWARD KINEMATICS (FK-HMD AND FK-HSD)

As illustrated in Figure 4, both the hardware associated with the master device (HMD) and the hardware associated with the slave device (HSD) implement forward kinematics, through the FK-HMD and FK-HSD modules, respectively. These modules share the same FPGA-implemented circuit and differ only in their input and output signals. They are designed to work with three input signals, one for each joint angle of the device, and three output signals, one for each Cartesian coordinate of the device's tool. The input signals are $\theta_1[F32](n)$, $\theta_2[F32](n)$ and $\theta_3[F32](n)$, and the output signals are $x[F32](n)$, $y[F32](n)$ and $z[F32](n)$. For FK-HMD, the input signals are $\theta_1^{MD}[F32](n)$, $\theta_2^{MD}[F32](n)$ and $\theta_3^{MD}[F32](n)$, and the output signals are $x^{HMD}[F32](n)$, $y^{HMD}[F32](n)$ and $z^{HMD}[F32](n)$. For FK-HSD, the input signals are $\theta_1^{SD}[F32](n)$, $\theta_2^{SD}[F32](n)$ and $\theta_3^{SD}[F32](n)$, and the output signals are $x^{ENV}(n)$, $y^{ENV}(n)$ and $z^{ENV}(n)$. At every $n$-th instant, all the computations needed to calculate the forward kinematics are executed in parallel.

Based on (15), the calculation of $x[F32](n)$ was implemented in FPGA through the generic circuit illustrated in Figure 6. The circuit was designed to work with three input signals, $\theta_1[F32](n)$, $\theta_2[F32](n)$ and $\theta_3[F32](n)$, and one output signal. The input signals are forwarded to TFB sub-circuits where the sine and cosine calculations are performed. The circuit uses two of the $L$ constants, three multipliers, one inverter and one adder.

FIGURE 6. Proposed forward kinematics circuit for obtaining the $x[F32](n)$ spatial coordinate (Eq. (15)) - FK-HMD and FK-HSD.

The calculation of $y[F32](n)$, based on (16), was implemented in FPGA through the generic circuit shown in Figure 7. The circuit was designed to work with two joint-angle input signals and one output signal. The input signals are routed to TFB sub-circuits that perform the sine and cosine calculations. The circuit uses two multipliers, two adders, one inverter and $L$ constants.

FIGURE 7. Proposed forward kinematics circuit for obtaining the $y[F32](n)$ spatial coordinate (Eq. (16)) - FK-HMD and FK-HSD.

The generic circuit illustrated in Figure 8 was implemented in FPGA to perform the calculation of $z[F32](n)$ and is based on (17). The circuit is designed to work with three input signals, $\theta_1[F32](n)$, $\theta_2[F32](n)$ and $\theta_3[F32](n)$, and one output signal. The input signals are routed to TFB sub-circuits that perform the sine and cosine calculations. The circuit uses four multipliers, two adders, one inverter and three of the $L$ constants.

FIGURE 8. Proposed forward kinematics circuit for obtaining the $z[F32](n)$ spatial coordinate (Eq. (17)) - FK-HMD and FK-HSD.

In the FK-HMD module, the $\theta_1^{MD}[F32](n)$, $\theta_2^{MD}[F32](n)$ and $\theta_3^{MD}[F32](n)$ input signals are received through the $\mathbf{b}(n)$ array ((43) in Section V); all calculations are then performed in parallel, resulting in the $\mathbf{c}(n)$ array ((44) in Section V) with the $x^{HMD}[F32](n)$, $y^{HMD}[F32](n)$ and $z^{HMD}[F32](n)$ signals, as shown in Figure 4. For the FK-HSD module, the $\theta_1^{SD}[F32](n)$, $\theta_2^{SD}[F32](n)$ and $\theta_3^{SD}[F32](n)$ input signals enter the module via the $\boldsymbol{\theta}^{SD}(n)$ array ((49) in Section V) and, after all parallel computations, the resulting signals $x^{ENV}(n)$, $y^{ENV}(n)$ and $z^{ENV}(n)$ are output from the module via the $\mathbf{l}(n)$ array ((58) in Section V).

B. INVERSE KINEMATICS (IK-HSD)

The hardware associated with the slave device (HSD) implements the inverse kinematics through the IK-HSD module, as shown in Figure 4. The IK-HSD FPGA-implemented circuit is designed to work with three input signals, $x^{HSD}[F32](n)$, $y^{HSD}[F32](n)$ and $z^{HSD}[F32](n)$, and three output signals, $\theta_1^{HSD}[F32](n)$, $\theta_2^{HSD}[F32](n)$ and $\theta_3^{HSD}[F32](n)$. However, to calculate $\theta_2^{HSD}[F32](n)$ (Eq. (24)) and $\theta_3^{HSD}[F32](n)$ (Eq. (25)), it is first necessary to perform intermediate calculations to obtain the values of $R[F32](n)$, $r[F32](n)$, $\beta[F32](n)$, $\gamma[F32](n)$ and $\alpha[F32](n)$.

Based on (18), (24) and (25), the calculations of $\theta_1^{HSD}[F32](n)$, $\theta_2^{HSD}[F32](n)$ and $\theta_3^{HSD}[F32](n)$ were implemented in FPGA through the generic circuits illustrated in Figures 9, 10 and 11, respectively.

FIGURE 9. Proposed inverse kinematics circuit for obtaining the $\theta_1^{HSD}[F32](n)$ angular position (Eq. (18)) - IK-HSD.

FIGURE 10. Proposed inverse kinematics circuit for obtaining the $\theta_2^{HSD}[F32](n)$ angular position (Eq. (24)) - IK-HSD.

FIGURE 11. Proposed inverse kinematics circuit for obtaining the $\theta_3^{HSD}[F32](n)$ angular position (Eq. (25)) - IK-HSD.

As already described, and according to Figures 10 and 11, calculating $\theta_2^{HSD}[F32](n)$ and $\theta_3^{HSD}[F32](n)$ first requires the intermediate calculations of $\gamma[F32](n)$ (Eq. (21)), $\beta[F32](n)$ (Eq. (22)) and $\alpha[F32](n)$ (Eq. (23)). These calculations, in turn, depend on $R[F32](n)$ and $r[F32](n)$. Then, when the IK-HSD module receives the input signals at every $n$-th instant, the circuit shown in Figure 9 calculates $\theta_1^{HSD}[F32](n)$ in parallel with the generic circuits illustrated in Figures 12 and 13, which were implemented in FPGA to calculate $R[F32](n)$ and $r[F32](n)$ based on (19) and (20).

The circuit shown in Figure 12, used to obtain $R[F32](n)$, is designed to work with two input signals, $x^{HSD}[F32](n)$ and $z^{HSD}[F32](n)$, and one output signal. This design contains two multipliers, two adders, one $L$ constant and a sub-circuit called Sqrt, which was implemented in hardware to calculate the square root.

FIGURE 12. Proposed circuit to perform the calculation of $R[F32](n)$ (Eq. (19)) - IK-HSD.

The $r[F32](n)$ calculation is performed by the circuit shown in Figure 13. This circuit is designed to work with three input signals, $x^{HSD}[F32](n)$, $y^{HSD}[F32](n)$ and $z^{HSD}[F32](n)$, and one output signal. The circuit consists of three multipliers, four adders, one inverter, two $L$ constants and, again, the Sqrt sub-circuit.

After the parallel processing of $\theta_1^{HSD}[F32](n)$, $R[F32](n)$ and $r[F32](n)$, the circuits responsible for calculating $\gamma[F32](n)$, $\beta[F32](n)$ and $\alpha[F32](n)$ are also executed in parallel, through the FPGA implementations of the generic circuits illustrated in Figures 14, 15 and 16. The value of $\gamma[F32](n)$ is obtained through the circuit shown in Figure 14, which is based on (21). The circuit is designed to work with one input signal, $r[F32](n)$, and one output signal. It consists of five multipliers, two adders, one divider, one TFB sub-circuit to calculate the arccosine and two $L$ constants.

The circuit for obtaining $\beta[F32](n)$, illustrated in Figure 15, is based on (22) and is designed to work with two input signals, $y^{HSD}[F32](n)$ and $R[F32](n)$, and one output signal. The circuit is composed of one adder, one inverter, a TFB sub-circuit to perform the arctangent calculation and one $L$ constant.

The value of $\alpha[F32](n)$ is obtained from the circuit shown in Figure 16, which is based on (23) and is designed to work with one input signal, $r[F32](n)$, and one output signal. The circuit is composed of five multipliers, two adders, one inverter, one divider, one TFB sub-circuit to perform the arccosine calculation and two $L$ constants.

To complete the process, after the calculations of $\beta[F32](n)$, $\gamma[F32](n)$ and $\alpha[F32](n)$, the $\theta_2^{HSD}[F32](n)$ and $\theta_3^{HSD}[F32](n)$ values are obtained in parallel through the circuits shown in Figures 10 and 11.

C. KINESTHETIC FEEDBACK FORCE (KFF-HMD)

As illustrated in Figure 4, the hardware associated with the master device (HMD) implements the kinesthetic feedback force through the KFF-HMD module. Based on (26), the KFF-HMD module was implemented in FPGA through the generic circuit illustrated in Figure 17. This circuit is composed of sub-circuits that correspond to parts of (26). The sub-circuit called JM, described in (28), is responsible for calculating the Jacobian matrix. The KFF sub-circuit combines the output of the Jacobian matrix (JM) module with the force array from (38).

The circuit shown in Figure 17 has the input signals $\theta_1^{MD}[F32](n)$, $\theta_2^{MD}[F32](n)$ and $\theta_3^{MD}[F32](n)$, received from the master device (MD), and the $F_x[F32](n)$, $F_y[F32](n)$ and $F_z[F32](n)$ signals, received from the hardware associated with the slave device (HSD). The three output signals are $\tau_1^{HMD}[F32](n)$, $\tau_2^{HMD}[F32](n)$ and $\tau_3^{HMD}[F32](n)$.

The JM module, the sub-circuit responsible for the Jacobian matrix calculation, consists of nine elements, defined by (29) to (37). The element defined by (30) does not have an associated circuit, since its value is 0. Based on (29), the corresponding element was implemented in FPGA according to the generic circuit illustrated in Figure 18. The circuit was designed to work with three input signals and one output signal. It uses two $L$ constants and has three TFB sub-circuits: two for the cosine calculation and one for the sine.

The element based on (31) was implemented in FPGA according to the generic circuit illustrated in Figure 19. The circuit was designed to work with three input signals and one output signal. It has three TFB modules, two for the sine calculation and one for the cosine, and uses two $L$ constants.

The generic circuit illustrated in Figure 20 was implemented in FPGA to calculate the element based on (32). The circuit was designed to work with two input signals and one output signal. It has two TFB sub-circuits for the sine calculation and uses one $L$ constant.

Based on (33), the corresponding element was implemented in FPGA according to the generic circuit illustrated in Figure 21. The circuit was designed to work with one input signal and one output signal. It has one TFB sub-circuit for the cosine calculation and uses one $L$ constant.

The element based on (34) was implemented in FPGA according to the generic circuit illustrated in Figure 22. The circuit was designed to work with two input signals and one output signal. In addition to one $L$ constant, it has two TFB sub-circuits, one for the cosine calculation and one for the sine.

The generic circuit illustrated in Figure 23 was implemented in FPGA to calculate the element based on (35). The circuit was designed to work with two input signals and one output signal. In addition to one $L$ constant, it has two TFB sub-circuits, one for the cosine calculation and one for the sine.

Based on (36), the corresponding element was implemented in FPGA according to the generic circuit illustrated in Figure 24. The circuit was designed to work with one input signal and one output signal. It contains one TFB sub-circuit for the sine calculation and uses one $L$ constant.

The element based on (37) was implemented in FPGA according to the generic circuit illustrated in Figure 25. The circuit was designed to work with two input signals and one output signal. In addition to one $L$ constant, it has two TFB sub-circuits for the cosine calculation.

All circuits of the JM sub-circuit are evaluated in parallel at each $n$-th instant. The results are then sent to the KFF module, which also calculates $\tau_1^{HMD}[F32](n)$, $\tau_2^{HMD}[F32](n)$ and $\tau_3^{HMD}[F32](n)$ in parallel. The KFF circuit shown in Figure 17 is designed to work with twelve input signals and three output signals.

Based on (39), the calculation of $\tau_1^{HMD}[F32](n)$ was implemented in FPGA according to the generic circuit illustrated in Figure 26. The circuit was designed to work with six inputs and one output.

The calculation of $\tau_2^{HMD}[F32](n)$, based on (40), was implemented in FPGA according to the generic circuit illustrated in Figure 27. The circuit was designed to work with six inputs and one output.

The generic circuit illustrated in Figure 28 was implemented in FPGA to calculate $\tau_3^{HMD}[F32](n)$ and is based on (41). The circuit was designed to work with six inputs and one output.

FIGURE 13. Proposed circuit to perform the calculation of $r[F32](n)$ (Eq. (20)) - IK-HSD.

FIGURE 14. Proposed circuit to perform the calculation of $\gamma[F32](n)$ (Eq. (21)) - IK-HSD.

FIGURE 15. Proposed circuit to perform the calculation of $\beta[F32](n)$ (Eq. (22)) - IK-HSD.

FIGURE 16. Proposed circuit to perform the calculation of $\alpha[F32](n)$ (Eq. (23)) - IK-HSD.

D. FEEDBACK FORCE (FBF-HSD)

As illustrated in Figure 4, the hardware associated with the slave device (HSD) implements the feedback force via the FBF-HSD module. The FPGA-implemented circuit of the FBF-HSD module is designed to work with six input signals and three output signals. Among the six input variables, $x^{OBJ}[F32](n)$, $y^{OBJ}[F32](n)$ and $z^{OBJ}[F32](n)$ represent the spatial position of the object closest to the SD tool, and the other three, $x^{ENV}[F32](n)$, $y^{ENV}[F32](n)$ and $z^{ENV}[F32](n)$, represent the spatial position of the SD tool in the ENV module. The three outputs, $F_x^{HSD}[F32](n)$, $F_y^{HSD}[F32](n)$ and $F_z^{HSD}[F32](n)$, represent the touch of the tool on the object. The variables $h_x$, $h_y$ and $h_z$ represent the elasticity coefficients associated with the object. All FBF-HSD module calculations are performed in parallel.

Based on (60), the calculation of $F_x^{HSD}[F32](n)$ was implemented in FPGA according to the generic circuit illustrated in Figure 29. The circuit was designed to work with two input signals, $x^{OBJ}[F32](n)$ and $x^{ENV}[F32](n)$, and one variable, $h_x$.

The calculation of $F_y^{HSD}[F32](n)$, based on (61), was implemented in FPGA according to the generic circuit illustrated in Figure 30. The circuit was designed to work with two input signals, $y^{OBJ}[F32](n)$ and $y^{ENV}[F32](n)$, and one variable, $h_y$.

The generic circuit shown in Figure 31 was implemented in FPGA to calculate $F_z^{HSD}[F32](n)$ and is based on (62). The circuit was designed to work with two input signals, $z^{OBJ}[F32](n)$ and $z^{ENV}[F32](n)$, and one variable, $h_z$.

FIGURE 17. Proposed circuit to calculate kinesthetic feedback force (Eq. (26)) - KFF-HMD.

FIGURE 18. Proposed circuit to calculate the Jacobian matrix element of Eq. (29) - JM.

FIGURE 19. Proposed circuit to calculate the Jacobian matrix element of Eq. (31) - JM.

FIGURE 20. Proposed circuit to calculate the Jacobian matrix element of Eq. (32) - JM.

FIGURE 21. Proposed circuit to calculate the Jacobian matrix element of Eq. (33) - JM.

FIGURE 22. Proposed circuit to calculate the Jacobian matrix element of Eq. (34) - JM.

FIGURE 23. Proposed circuit to calculate the Jacobian matrix element of Eq. (35) - JM.

FIGURE 24. Proposed circuit to calculate the Jacobian matrix element of Eq. (36) - JM.

FIGURE 25. Proposed circuit to calculate the Jacobian matrix element of Eq. (37) - JM.

FIGURE 26. Proposed circuit to calculate the torque of the $\tau_1^{HMD}[F32](n)$ joint (Eq. (39)) - KFF.

FIGURE 27. Proposed circuit to calculate the torque of the $\tau_2^{HMD}[F32](n)$ joint (Eq. (40)) - KFF.

VII. RESULTS

The entire tactile internet model infrastructure presented in Figure 4 was implemented with the purpose of validating the FPGA hardware implementation. A spatial trajectory that represents the data sent by the OP through the a(n) signal (Eq. (42)) was created to validate the entire developed environment.

The created trajectory varies all three angles of the MD articulation (Figure 3). For this, it was first considered that the MD is in the initial angular position in which each of the three joint angles is zero, θ_MD(0) = 0, which corresponds to the spatial position x_OP(0) = 0 , y_OP(0) = − . and z_OP(0) = − . of the tool as illustrated in Figure 32. Initially, the first joint is moved to θ_MD(vn) = pi/ , where v is the number of samples corresponding to 4 seconds, thus resulting in the position x_OP(vn) = − . , y_OP(vn) = − . and z_OP(vn) = − . . Then, the second joint is moved to θ_MD(vn) = pi/ , which results in the position x_OP(vn) = − . , y_OP(vn) = − . and z_OP(vn) = − . and, finally, the third joint moves up to θ_MD(vn) = pi/ , thus resulting in the x_OP(vn) = − . , y_OP(vn) = 0 . and z_OP(vn) = − . position. The path created is within the limits of the device workspace and takes a total time of t = 12 seconds, of which 4 seconds are used to perform the movement of each joint.

FIGURE 28. Proposed circuit to calculate the torque of the τ_HMD[F32](n) joint (Eq. (41)) - KFF.

FIGURE 29. Proposed circuit to calculate the feedback force F_HSDx[F32](n) (Eq. (60)) - FBF-HSD.

To validate the circuits of the modules implemented in FPGA, equivalent software models were used and the results of both implementations were compared. The software models use a -bit floating-point format, while the hardware modules run a parallel implementation with a hybrid representation that uses both a 32-bit floating-point and a fixed-point representation in different parts of the proposed architecture, as presented in Section VI. In all scenarios, the signal sampling rate (or throughput) was R_s = 1/t_s (samples per second), where t_s is the time between consecutive samples.

From the experimental results, the mean square error (MSE) between the software model and the hardware implementation proposed by this work was calculated as

$$\mathrm{MSE} = \frac{1}{Q} \sum_{n=0}^{Q-1} \left( M_{SW}(n) - M[F32](n) \right)^{2}, \qquad (72)$$

where Q represents the number of tested samples, M_SW(n) corresponds to the variables of the software model and M[F32](n) corresponds to the variables of the model implemented in FPGA.

FIGURE 30. Proposed circuit to calculate the feedback force F_HSDy[F32](n) (Eq. (61)) - FBF-HSD.

FIGURE 31. Proposed circuit to calculate the feedback force F_HSDz[F32](n) (Eq. (62)) - FBF-HSD.

The quantity of tested samples for the results presented here is Q = 1200, which corresponds to the number of samples of the generated trajectory. The variables that correspond to the hardware model M[F32](n) vary according to the module in which it was implemented. In the case of forward kinematics, as the FK-HMD and FK-HSD modules have the same implementation, the variables x[F32](n), y[F32](n) and z[F32](n) change according to the respective module. For the FK-HMD module, these variables correspond to x_HMD[F32](n), y_HMD[F32](n) and z_HMD[F32](n), and for the FK-HSD module the same variables correspond to x_ENV[F32](n), y_ENV[F32](n) and z_ENV[F32](n), as presented in Section VI. For inverse kinematics, the variables M[F32](n) of the IK-HSD module correspond to the three joint angles θ_HSD[F32](n). For the kinesthetic feedback force, the variables M[F32](n) of the KFF-HMD module correspond to the three joint torques τ_HMD[F32](n). For the feedback force, the variables M[F32](n) of the FBF-HSD module correspond to F_HSDx[F32](n), F_HSDy[F32](n) and F_HSDz[F32](n). Finally, in the MSE equation, M_SW(n) corresponds to the same variables in the software-implemented model.

Table 1 shows the mean square error between the software models and the hardware models proposed in this paper. The MSE results show that the forward kinematics (FK-HMD and FK-HSD), inverse kinematics (IK-HSD), kinesthetic feedback force (KFF-HMD) and feedback force (FBF-HSD) modules had an acceptable response, even when using a hybrid representation, compared to the software model that uses a floating-point representation.
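As an illustration of Eq. (72), the following Python sketch computes the MSE between two equally long sample sequences. The function name `mean_squared_error` and the synthetic signals are hypothetical and serve only to show the arithmetic; they are not the models or data used in this work.

```python
from typing import Sequence


def mean_squared_error(software: Sequence[float], hardware: Sequence[float]) -> float:
    """Mean squared error between reference and hardware samples (Eq. (72))."""
    if len(software) != len(hardware):
        raise ValueError("both sequences must hold the same number of samples")
    q = len(software)  # Q, the number of tested samples
    if q == 0:
        raise ValueError("at least one sample is required")
    # Sum of squared differences divided by the number of samples
    return sum((sw - hw) ** 2 for sw, hw in zip(software, hardware)) / q


if __name__ == "__main__":
    # Hypothetical data: Q = 1200 reference samples and a copy with a small offset
    reference = [0.001 * n for n in range(1200)]
    fpga_output = [value + 1e-4 for value in reference]
    print(mean_squared_error(reference, fpga_output))
```

In practice, the two sequences would be the software-model output and the FPGA output for the same variable over the Q samples of the trajectory.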
It can be observed that for the variables of the FK-HMD and FK-HSD modules the error was in the range of − , for the IK-HSD module the error was of − , for the variables of the KFF-HMD module the error was of − and for the FBF-HSD module the error was in the range of − . These values demonstrate that the FPGA implementations presented a behavior equivalent to that of the software models.

FIGURE 32. Trajectory used to validate hardware modules (axes: x_OP(n), y_OP(n) and z_OP(n), in meters).

TABLE 1. Mean squared error (MSE) results for floating-point implementation.

| Module | Variable | MSE |
|---|---|---|
| FK-HMD | x_HMD[F32](n) | 2 . × − |
| FK-HMD | y_HMD[F32](n) | 8 . × − |
| FK-HMD | z_HMD[F32](n) | 1 . × − |
| KFF-HMD | τ_HMD[F32](n) | 1 . × − |
| KFF-HMD | τ_HMD[F32](n) | 5 . × − |
| KFF-HMD | τ_HMD[F32](n) | 3 . × − |
| FK-HSD | x_ENV[F32](n) | 2 . × − |
| FK-HSD | y_ENV[F32](n) | 8 . × − |
| FK-HSD | z_ENV[F32](n) | 1 . × − |
| IK-HSD | θ_HSD[F32](n) | 3 . × − |
| IK-HSD | θ_HSD[F32](n) | 2 . × − |
| IK-HSD | θ_HSD[F32](n) | 2 . × − |
| FBF-HSD | F_HSDx[F32](n) | 2 . × − |
| FBF-HSD | F_HSDy[F32](n) | 1 . × − |
| FBF-HSD | F_HSDz[F32](n) | 3 . × − |

In a hardware implementation, it is important to analyze post-synthesis requirements such as hardware usage and execution time. In the case of FPGAs, resource usage is measured in terms of lookup tables (LUTs), registers and digital signal processor (DSP) units, to name a few. After validating the hardware-implemented models, synthesis results were obtained for a Xilinx Virtex 6 XC6VLX240T-1FF1156 FPGA. This Virtex 6 FPGA has , slices that group , flip-flops, , logical cells that can be used to implement logical functions or memories, and DSP cells with multipliers and accumulators.

Table 2 presents the post-synthesis results related to hardware occupancy, sampling rate, and throughput for the modules FK-HMD, KFF-HMD, FK-HSD, IK-HSD, and FBF-HSD. The first column shows the name of the module, and the next three columns, called registers, LUTs and multipliers, give the amounts of resources used in the FPGA. The registers column gives the number of flip-flops used, followed by the percentage of the total. The LUTs column gives the number of LUTs used, followed by the percentage of the total. The multipliers column gives the number of DSP48 internal multipliers used, followed by the percentage of the total. The t_s column gives the sampling period in nanoseconds obtained for each hardware module.
Finally, the R_s column displays the throughput (R_s = 1/t_s) in mega-samples per second for the hardware modules.

The synthesis results presented in Table 2 show that the resources used for the FK-HMD and FK-HSD modules were the same. Each of these modules, individually, used . % of the available registers, which is equivalent to , registers, . % of the LUTs, and . % of the embedded DSP48 multipliers. The IK-HSD module consumed . % of the registers, . % of the LUTs and . % of the multipliers. The KFF-HMD module consumed . %, . % and . % of the registers, LUTs and multipliers, respectively. Finally, the FBF-HSD module used . % of the registers, . % of the LUTs and . % of the multipliers.

Based on the data presented in Table 2, the HMD modules (FK-HMD and KFF-HMD), which are associated with the MD device, consumed , ( . %) registers, , ( . %) LUTs and ( . %) multipliers. The HSD modules (FK-HSD, IK-HSD and FBF-HSD), which are associated with the SD device, consumed , ( . %) registers, , ( . %) LUTs and ( . %) multipliers.

The hardware resources consumed by the HMD hardware modules and the HSD hardware modules were very low. Even if all modules are implemented in a single device, the consumption remains low. The total hardware resources used in the FPGA by all modules (FK-HMD, KFF-HMD, FK-HSD, IK-HSD and FBF-HSD) were , ( . %) registers, , ( . %) LUTs and ( . %) multipliers. This low resource consumption demonstrates that the proposed implementations take up little space in the FPGA, which allows other, separate implementations to be used concurrently.

As per Table 2, the throughput values, R_s, obtained were significant. Values of . MSps for the FK-HMD and FK-HSD modules, . MSps for the IK-HSD module, . MSps for the KFF-HMD module and . MSps for the FBF-HSD module were achieved.
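The relation R_s = 1/t_s can be checked with a short Python sketch that uses the sampling periods listed in the t_s column of Table 2. The dictionary and function names are illustrative assumptions, and the printed throughputs are computed here rather than copied from the table.

```python
# Sampling periods t_s in nanoseconds, taken from the t_s column of Table 2
SAMPLING_PERIOD_NS = {
    "FK-HMD": 47,
    "KFF-HMD": 70,
    "FK-HSD": 47,
    "IK-HSD": 218,
    "FBF-HSD": 21,
}


def throughput_msps(t_s_ns: float) -> float:
    """Return R_s = 1 / t_s in mega-samples per second for t_s given in ns."""
    if t_s_ns <= 0:
        raise ValueError("the sampling period must be positive")
    # 1 / (t_s * 1e-9 s) samples per second equals 1e3 / t_s mega-samples per second
    return 1e3 / t_s_ns


if __name__ == "__main__":
    for module, t_s in SAMPLING_PERIOD_NS.items():
        print(f"{module}: t_s = {t_s} ns, R_s = {throughput_msps(t_s):.2f} MSps")
```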
These results enable critical applications that demand strict time constraints, as is the case with tactile internet applications.

Table 3 shows the speedup obtained in relation to the latency time constraints. The first column presents the latency constraints of 1 ms and 10 ms that are presented in the literature. The second column shows the latency limit that the hardware must respect for the application to function normally. The third column shows the latency of the hardware implementation presented here.

TABLE 2. Hardware occupancy, sampling rate and throughput results for floating-point format.

| Module Name | Registers (Flip-Flops) | LUTs | Multipliers (DSP48) | t_s (ns) | R_s (MSps) |
|---|---|---|---|---|---|
| FK-HMD | , ( . %) | , ( . %) | ( . %) | 47 | 21 . |
| KFF-HMD | , ( . %) | , ( . %) | ( . %) | 70 | 14 . |
| FK-HSD | , ( . %) | , ( . %) | ( . %) | 47 | 21 . |
| IK-HSD | , ( . %) | , ( . %) | ( . %) | 218 | 4 . |
| FBF-HSD | ( . %) | , ( . %) | ( . %) | 21 | 47 . |

TABLE 3. Hardware speedup related to the time limits for the 1 ms and 10 ms latency constraints.

| Time Restriction | Latency Limit | t_hardware | Speedup |
|---|---|---|---|
| 1 ms | . µs | 403 ns | × |
| 10 ms | µs | 403 ns | × |

The 1 ms restriction corresponds to a maximum latency limit of . µs for acceptable hardware performance. For the 10 ms constraint, the maximum limit is µs. The value t_hardware presented in Table 3 corresponds, according to (71), to the sum of the latencies of the five implemented modules (Table 2): two modules are associated with the MD device (FK-HMD and KFF-HMD) and three modules are associated with the SD device (FK-HSD, IK-HSD, and FBF-HSD).

Thus, the value of 403 ns presented in Table 3 corresponds to the sum of the two modules related to the master component, which total 117 ns (47 ns from the FK-HMD module and 70 ns from the KFF-HMD module), and the three modules related to the slave component, which total 286 ns (47 ns from the FK-HSD module, 218 ns from the IK-HSD module and 21 ns from the FBF-HSD module). So, for the 1 ms constraint, the implementation presented a × speedup relative to the . µs limit, and for the 10 ms constraint, the speedup was × relative to the µs limit.

The sampling rates obtained for the five modules implemented in this work were notably fast. The values obtained allow the hardware to meet the time constraint limits required in a tactile internet environment. Hardware latency was significantly below the required constraints, as shown in Table 3. These values are well below the 30% share presented in the literature. Since the communication medium accounts for 70% of the application latency, that share can be increased, because the latency of the hardware devices proved to be significantly low. In other words, the latency budget not spent on the hardware devices can be consumed in the network.
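The latency bookkeeping described above can be sketched in Python. The per-module latencies are the t_s values of Table 2; the helper names are hypothetical, and the 30% device share used to derive a latency limit from a round-trip constraint is an assumption based on the 70/30 split quoted from the literature, so the resulting limits and speedups are illustrative and need not match Table 3.

```python
# Per-module latencies in nanoseconds (t_s column of Table 2)
MODULE_LATENCY_NS = {
    "FK-HMD": 47,
    "KFF-HMD": 70,
    "FK-HSD": 47,
    "IK-HSD": 218,
    "FBF-HSD": 21,
}
MASTER_MODULES = ("FK-HMD", "KFF-HMD")           # modules on the MD side
SLAVE_MODULES = ("FK-HSD", "IK-HSD", "FBF-HSD")  # modules on the SD side


def total_latency_ns(modules) -> int:
    """Sum the latencies of the given modules, in nanoseconds."""
    return sum(MODULE_LATENCY_NS[name] for name in modules)


def device_budget_ns(round_trip_ms: float, device_share: float = 0.30) -> float:
    """Latency budget for the devices, assuming they get `device_share` of the round trip."""
    return round_trip_ms * 1e6 * device_share  # 1 ms = 1e6 ns


def speedup(limit_ns: float, hardware_ns: float) -> float:
    """Ratio between the latency limit and the measured hardware latency."""
    return limit_ns / hardware_ns


if __name__ == "__main__":
    t_hardware = total_latency_ns(MASTER_MODULES) + total_latency_ns(SLAVE_MODULES)
    print(f"master: {total_latency_ns(MASTER_MODULES)} ns")
    print(f"slave: {total_latency_ns(SLAVE_MODULES)} ns")
    print(f"t_hardware: {t_hardware} ns")
    for constraint_ms in (1.0, 10.0):
        limit = device_budget_ns(constraint_ms)  # assumed 30% share
        print(f"{constraint_ms} ms constraint: speedup = {speedup(limit, t_hardware):.1f}x")
```

Passing the latency limit as a parameter keeps the sketch usable with whatever limit an application actually imposes.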
It is important to remember that in a more complex tactile internet environment there are several other algorithms to be implemented in hardware, such as prediction algorithms, dynamic control and AI-based techniques. However, as the proposed implementations present low hardware resource consumption, other necessary modules, such as the ones just mentioned, could also be implemented in the same shared hardware, since resources would still be available.

Table 4 compares the results obtained by the implementation proposed in this work with equivalent results found in state-of-the-art works. The first column indicates the references of the related works. The next two columns show the FPGA platform used and the number of degrees of freedom of the device. The fourth column presents the type of numerical representation used in the implementation and, finally, the last four columns present the latency reported by each reference for the forward kinematics (FK), inverse kinematics (IK), kinesthetic feedback force (KFF) and feedback force (FBF) modules, respectively.

TABLE 4. Comparative table with state of the art works.

| Reference | Device | DoF | Data type | FK | IK | KFF | FBF |
|---|---|---|---|---|---|---|---|
| This work | Virtex 6 | 3 | Floating P. | 47 ns | 218 ns | 70 ns | 21 ns |
| [24] | Virtex 2 | 5 | Floating P. | ns | - | - | - |
| [25] | Cyclone IV | 3 | Floating P. | - | ns | - | - |
| [26] | Unknown | 6 | Fixed P. | ns | ns | - | - |
| [27] | Cyclone IV | 10 | Fixed P. | - | ns | - | - |
| [28] | Cyclone IV | 5 | Fixed P. | ns | ns | - | - |
| [29] | Artix 7 | 3 | Fixed P. | ns (FK and IK combined) | | - | - |

As described in Table 4, a hardware model for calculating the forward kinematics of a 5-DoF device is presented in [24]. The proposed hardware was implemented using a -bit floating-point representation. The total time to perform the calculations was ns. Compared with that model, the forward kinematics (FK) implementation proposed by this work, which uses 32-bit floating-point, achieved a speedup of . ×.

The work presented in [25] shows the results of an implementation of the inverse kinematics module using a 32-bit floating-point representation. The kinematic model was designed to work with a 3-DoF device, and the time required for the calculation is ns. The inverse kinematics (IK) implementation presented in this work, which uses a 32-bit floating-point representation, achieved a speedup of . × over the model proposed by [25].

The kinematics models presented in [26], described in Table 4, cover forward and inverse kinematics implementations for controlling a 6-DoF device using a -bit fixed-point representation. The modules were implemented using -bit for the fractional part and -bit for the integer part. For the forward kinematics (FK), ns are required to perform all calculations, and for the inverse kinematics (IK), ns are required. Based on the results presented in this section, the floating-point implementation proposed in this work had a speedup of . × for forward kinematics and . × for inverse kinematics.

The research presented in [27] proposed a hardware implementation of inverse kinematics to control a 10-DoF device. The hardware was designed using a 32-bit fixed-point representation; however, the number of bits used in the fractional part was not specified. The architecture proposed to calculate the inverse kinematics requires ns to perform the computation. Compared with that model, the inverse kinematics (IK) implementation proposed by this work, which uses 32-bit floating-point, achieved a speedup of . ×.

The authors in [28] present the results of a fixed-point implementation of forward and inverse kinematics to control a 5-DoF device, as described in Table 4. The proposed hardware implementation uses numerical representations of -bit ( -bit for the fractional part) and -bit ( -bit for the fractional part) in different parts of the modules. The time required to perform the calculations is ns and ns for forward and inverse kinematics, respectively. Compared with the model presented in [28], the floating-point implementation proposed by this work achieved a speedup of . × for forward kinematics and . × for inverse kinematics.

Unlike the previous works (Table 4), the authors of [29] present a single hardware module for calculating forward and inverse kinematics together. The proposed model uses a 32-bit fixed-point representation. The total time to perform the calculation is ns.
The time obtained was calculated taking into account the entire process duration; separate times for each module were not specified. Given this scenario, adding the t_s of the FK module, which calculates the forward kinematics, to that of the IK module gives a total time of 265 ns for both implementations, according to Table 4. Hence, the hardware developed in this work achieved a . × speedup over the model presented in [29].

It can be seen from Table 4 that none of the state-of-the-art works presented a hardware implementation of all four robotics algorithms presented here. It is also noted that just two works used the floating-point numerical representation. The floating-point implementation of robotics algorithms proposed by this work showed significant gains when compared to the works presented in the literature, as shown in Table 4. The different numbers of degrees of freedom (DoF) of the devices can influence the sampling rate and throughput values. Another factor that can influence these values is the type of FPGA used for synthesis. Because the implementation in this work was designed as a parallel architecture, an increase in the number of DoF does not necessarily result in a significant increase in the sampling time.

VIII. CONCLUSIONS

This paper presented a reconfigurable hardware reference model for five modules that implement four robotics-associated algorithms. The FK-HMD and FK-HSD modules implement the forward kinematics, the IK-HSD module implements the inverse kinematics, the KFF-HMD module implements the kinesthetic feedback force and the FBF-HSD module implements the feedback force. The parallel FPGA implementation of these modules is intended to increase the tactile system's processing speed in order to meet the latency constraints required for tactile internet applications. The modules were designed using a fully parallel implementation based on a hybrid scheme that uses fixed-point and floating-point representations in distinct parts of the architecture. Compared to the state of the art, this work stands out by presenting the description and implementation of four different robotics algorithms in FPGA. The implementations presented in this work achieve higher module processing speed than equivalent implementations from the state of the art. All of the modules presented here were analyzed based on the synthesis results, which included the hardware occupation area, sampling rate and throughput. Based on the synthesis results, it was observed that the implementations achieved high processing speed, with latency far below the limit of 1 ms. The hardware modules achieved an acceleration of × compared to the . µs time constraint. This demonstrates that using reconfigurable embedded systems on devices such as FPGAs enables parallel implementation of algorithms, thus speeding up data processing and minimizing execution time. These runtime gains can make processing feasible for critical applications that have tight time constraints or that must process a large amount of data in a short time frame.

ACKNOWLEDGMENT

This work was conducted during a scholarship supported by the Doctoral Sandwich Program CAPES/PDSE at the Federal University of Rio Grande do Norte. Financed by CAPES, the Brazilian Federal Agency for Support and Evaluation of Graduate Education within the Ministry of Education of Brazil.

References

[1] M. Dohler, “The tactile internet iot, 5g and cloud on steroids,” in 5G Radio Technology Seminar. Exploring Technical Challenges in the Emerging 5G Ecosystem, March 2015, pp. 1–16.
[2] A. Aijaz, M. Dohler, A. H. Aghvami, V. Friderikos, and M. Frodigh, “Realizing the tactile internet: Haptic communications over next generation 5g cellular networks,” CoRR, vol. abs/1510.02826, 2015. [Online]. Available: http://arxiv.org/abs/1510.02826
[3] D. V. D. Berg, R. Glans, D. D. Koning, F. A. Kuipers, J. Lugtenburg, K. Polachan, P. T. Venkata, C. Singh, B. Turkovic, and B. V. Wijk, “Challenges in haptic communications over the tactile internet,” IEEE Access, vol. 5, pp. 23 502–23 518, 2017.
[4] M. Maier, M. Chowdhury, B. P. Rimal, and D. P. Van, “The tactile internet: vision, recent progress, and open challenges,” IEEE Communications Magazine, vol. 54, no. 5, pp. 138–145, 2016.
[5] M. Simsek, A. Aijaz, M. Dohler, J. Sachs, and G. Fettweis, “The 5g-enabled tactile internet: Applications, requirements, and architecture,” in 2016 IEEE Wireless Communications and Networking Conference, April 2016, pp. 1–6.
[6] C. Li, C.-P. Li, K. Hosseini, S. B. Lee, J. Jiang, W. Chen, G. Horn, T. Ji, J. E. Smee, and J. Li, “5g-based systems design for tactile internet,” Proceedings of the IEEE, vol. 107, no. 2, pp. 307–324, 2018.
[7] K. Antonakoglou, X. Xu, E. Steinbach, T. Mahmoodi, and M. Dohler, “Toward haptic communications over the 5g tactile internet,” IEEE Communications Surveys Tutorials, vol. 20, no. 4, pp. 3034–3059, Fourthquarter 2018.
[8] A. Nasrallah, A. S. Thyagaturu, Z. Alharbi, C. Wang, X. Shao, M. Reisslein, and H. ElBakoury, “Ultra-low latency (ull) networks: The ieee tsn and ietf detnet standards and related 5g ull research,” IEEE Communications Surveys & Tutorials, vol. 21, no. 1, pp. 88–145, 2018.
[9] M. Simsek, A. Aijaz, M. Dohler, J. Sachs, and G. Fettweis, “5g-enabled tactile internet,” IEEE Journal on Selected Areas in Communications, vol. 34, no. 3, pp. 460–473, 2016.
[10] D. Szabo, A. Gulyas, F. H. Fitzek, and D. E. Lucani, “Towards the tactile internet: Decreasing communication latency with network coding and software defined networking,” in European Wireless 2015; 21th European Wireless Conference; Proceedings of, May 2015, pp. 1–6.
[11] M. Dohler, T. Mahmoodi, M. A. Lema, M. Condoluci, F. Sardis, K. Antonakoglou, and H. Aghvami, “Internet of skills, where robotics meets ai, 5g and the tactile internet,” in 2017 European Conference on Networks and Communications (EuCNC), June 2017, pp. 1–5.
[12] Q. Yu, C. Wang, X. Ma, X. Li, and X. Zhou, “A deep learning prediction process accelerator based fpga,” in 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, May 2015, pp. 1159–1162.
[13] A. C. de Souza and M. A. Fernandes, “Parallel fixed point implementation of a radial basis function network in an fpga,” Sensors, vol. 14, no. 10, pp. 18 223–18 243, 2014.
[14] A. L. X. Da Costa, C. A. D. Silva, M. F. Torquato, and M. A. C. Fernandes, “Parallel implementation of particle swarm optimization on fpga,” IEEE Transactions on Circuits and Systems II: Express Briefs, pp. 1–1, 2019.
[15] M. G. F. Coutinho, M. F. Torquato, and M. A. C. Fernandes, “Deep neural network hardware implementation based on stacked sparse autoencoder,” IEEE Access, vol. 7, pp. 40 674–40 694, 2019.
[16] M. F. Torquato and M. A. C. Fernandes, “High-performance parallel implementation of genetic algorithm on fpga,” Circuits, Systems, and Signal Processing, Jan 2019.
[17] L. M. D. Da Silva, M. F. Torquato, and M. A. C. Fernandes, “Parallel implementation of reinforcement learning q-learning technique for fpga,” IEEE Access, vol. 7, pp. 2782–2798, 2019.
[18] F. F. Lopes, J. C. Ferreira, and M. A. C. Fernandes, “Parallel implementation on fpga of support vector machines using stochastic gradient descent,” Electronics, vol. 8, no. 6, 2019.
[19] D. H. Noronha, M. F. Torquato, and M. A. Fernandes, “A parallel implementation of sequential minimal optimization on fpga,” Microprocessors and Microsystems, vol. 69, pp. 138–151, 2019.
[20] A. N, A. S. M, K. Polachan, P. T. V, and C. Singh, “An end to end tactile cyber physical system design,” in 2018 4th International Workshop on Emerging Ideas and Trends in the Engineering of Cyber-Physical Systems (EITEC), April 2018, pp. 9–16.
[21] M. K. O'Malley, K. S. Sevcik, and E. Kopp, “Improved haptic fidelity via reduced sampling period with an fpga-based real-time hardware platform,” Journal of Computing and Information Science in Engineering, vol. 9, no. 1, p. 011002, 2009.
[22] H. Tanaka, K. Ohnishi, and H. Nishi, “Haptic communication system using fpga and real-time network framework,” in Industrial Electronics, 2009. IECON’09. 35th Annual Conference of IEEE. IEEE, 2009, pp. 2931–2936.
[23] M. Franc and A. Hace, “A study on the fpga implementation of the bilateral control algorithm towards haptic teleoperation,” Automatika–Journal for Control, Measurement, Electronics, Computing and Communications, vol. 54, no. 1, 2013.
[24] D. F. Sánchez, D. M. Muñoz, C. H. Llanos, and J. M. Motta, “A reconfigurable system approach to the direct kinematics of a 5 dof robotic manipulator,” International Journal of Reconfigurable Computing, vol. 2010, 2010.
[25] K. Gac, G. Karpiel, and M. Petko, “Fpga based hardware accelerator for calculations of the parallel robot inverse kinematics,” in Proceedings of 2012 IEEE 17th International Conference on Emerging Technologies Factory Automation (ETFA 2012), Sept 2012, pp. 1–4.
[26] M. Wu, Y. Kung, Y. Huang, and T. Jung, “Fixed-point computation of robot kinematics in fpga,” in 2014 International Conference on Advanced Robotics and Intelligent Systems (ARIS), June 2014, pp. 35–40.
[27] C. C. Wong and C. C. Liu, “Fpga realisation of inverse kinematics for biped robot based on cordic,” Electronics Letters, vol. 49, no. 5, pp. 332–334, February 2013.
[28] H. Linh, B. Thi, and Y.-S. Kung, “Digital hardware realization of forward and inverse kinematics for a five-axis articulated robot arm,” Mathematical Problems in Engineering, vol. 2015, 2015.
[29] Z. Jiang, Y. Dai, J. Zhang, and S. He, “Kinematics calculation of minimally invasive surgical robot based on fpga,” in 2017 IEEE International Conference on Robotics and Biomimetics (ROBIO), Dec 2017, pp. 1726–1730.
[30] Geomagic, Phantom Omni, Device Guide.
[31] G. Song, S. Guo, and Q. Wang, “A tele-operation system based on haptic feedback,” in 2006 IEEE International Conference on Information Acquisition, Aug 2006, pp. 1127–1131.
[32] T. Sansanayuth, I. Nilkhamhang, and K. Tungpimolrat, “Teleoperation with inverse dynamics control for phantom omni haptic device,” in 2012 Proceedings of SICE Annual Conference (SICE), Aug 2012, pp. 2121–2126.
[33] A. J. Silva, O. A. D. Ramirez, V. P. Vega, and J. P. O. Oliver, “Phantom omni haptic device: Kinematic and manipulability,” in Electronics, Robotics and Automotive Mechanics Conference, 2009. CERMA’09. IEEE, 2009, pp. 193–198.
[34] M. C. Cavusoglu and D. Feygin, “Kinematics and dynamics of phantom(tm) model 1.5 haptic interface,” 2001.
[35] J. San Martin and G. Triviño, “A study of the manipulability of the phantom omni haptic interface.” in VRIPHYS, 2006, pp. 127–128.
[36] A. Kumar, P. J. Gaidhane, and V. Kumar, “A nonlinear fractional order pid controller applied to redundant robot manipulator,” in 2017 6th International Conference on Computer Applications In Electrical Engineering-Recent Advances (CERA), Oct 2017, pp. 527–532.
[37] C. Yang, H. Ma, and M. Fu, Intelligent Control of Robot Manipulator. Singapore: Springer Singapore, 2016, pp. 49–96.
[38] H. Rahimi and M. Nazemizadeh, “Dynamic analysis and intelligent control techniques for flexible manipulators: a review,” Advanced Robotics, vol. 28, no. 2, pp. 63–76, 2014.
[39] S. H. Tang, C. K. Ang, M. K. A. B. M. Ariffin, and S. B. Mashohor, “Predicting the motion of a robot manipulator with unknown trajectories based on an artificial neural network,” International Journal of Advanced Robotic Systems, vol. 11, no. 10, p. 176, 2014.