A surgical dataset from the da Vinci Research Kit for task automation and recognition
Irene Rivas-Blanco, Carlos J. Pérez-del-Pulgar, Andrea Mariani, Claudio Quaglia, Giuseppe Tortora, Arianna Menciassi, Víctor F. Muñoz
Abstract
The use of datasets is gaining relevance in surgical robotics, since they can be used to recognise and automate tasks. Moreover, common datasets allow different algorithms and methods to be compared. The objective of this work is to provide a complete dataset of three common training tasks that surgeons perform to improve their skills. For this purpose, 12 subjects teleoperated the da Vinci Research Kit to perform these tasks. The obtained dataset includes all the kinematic and dynamic information provided by the da Vinci robot (both master and slave side), together with the associated video from the camera. All the information has been carefully timestamped and is provided in a readable csv format. A MATLAB interface integrated with ROS for using and replicating the data is also provided.
Keywords
Dataset, da Vinci Research Kit, automation, surgical robotics
Introduction
Surgical Data Science (SDS) is emerging as a new knowledge domain in healthcare. In the field of surgery, it can provide many advances in virtual coaching, surgeon skill evaluation and complex task learning for surgical robotic systems Vedula and Hager (2020), as well as in the gesture recognition domain Pérez-del-Pulgar et al. (2019), Ahmidi et al. (2017). The development of large datasets related to the execution of surgical tasks using robotic systems would support these advances, providing detailed information on the surgeon's movements, in terms of both kinematic and dynamic data, as well as video recordings. One of the best-known datasets in this field is the JIGSAWS dataset, described in Gao et al. (2014). In that work, the authors defined three surgical tasks that were performed by 6 surgeons using the da Vinci Research Kit (dVRK). This research platform, based on the first-generation commercial da Vinci Surgical System (by Intuitive Surgical, Inc., Sunnyvale, CA) and integrated with ad-hoc electronics, firmware and software, can provide kinematic and dynamic data along with the video recordings. The JIGSAWS dataset contains 76 motion variables collected at 30 samples per second for 101 trials of the tasks.

Thus, the objective of this work is to extend the available datasets related to surgical robotics. We present the Robotic Surgical Maneuvers (ROSMA) dataset, a large dataset collected using the dVRK, in collaboration between the University of Malaga (Spain) and The BioRobotics Institute of the Scuola Superiore Sant'Anna (Italy), under a TERRINet (The European Robotics Research Infrastructure Network) project. This dataset contains 160 kinematic variables recorded at 50 Hz for 207 trials of three different tasks. This data is complemented with the video recordings collected at 15 frames per second at 1024 x 768 pixel resolution.
Moreover, we provide a task evaluation based on time and task-specific errors, and a questionnaire with personal data of the subjects (gender, age, dominant hand) and previous experience with teleoperated systems and visuo-motor skills (sports and musical instruments).
System description
The dVRK, supported by the Intuitive Foundation (Sunnyvale, CA), arose as a community effort to support research in the field of telerobotic surgery Kazanzides et al. (2014). This platform is made up of hardware from the first-generation da Vinci system along with motor controllers and a software framework integrated with the Robot Operating System (ROS) Chen et al. (2017). There are over thirty dVRK platforms distributed across ten countries around the world. The dVRK of the Scuola Superiore Sant'Anna has two Patient Side Manipulators (PSM), labelled as PSM1 and PSM2 (Figure 1(a)), and a master console consisting of two Master Tool Manipulators (MTM), labelled as MTML and MTMR (Figure 1(b)). MTML controls PSM1, while MTMR controls PSM2. For these experiments, stereo vision was provided by two commercial webcams, as the dVRK used for the experiments was not equipped with an endoscopic camera and its manipulator. Each PSM has 6 joints following the kinematics described in Fontanelli et al. (2017), and an

University of Malaga, Spain
The BioRobotics Institute, Scuola Superiore Sant'Anna, Pisa, Italy
Department of Excellence in Robotics & AI, Scuola Superiore Sant'Anna, Pisa, Italy
Corresponding author:
Irene Rivas-Blanco, Edificio de Ingenierías UMA, Arquitecto Francisco Peñalosa 6, 29071, Málaga, Spain. Email: [email protected]
Prepared using sagej.cls [Version: 2017/01/17 v1.20]

Figure 1. da Vinci Research Kit platform available at The BioRobotics Institute of Scuola Superiore Sant'Anna (Pisa, Italy): (a) slave manipulators; (b) master console.

Figure 2. Patient and surgeon side kinematics: (a) patient side; (b) surgeon side. The kinematics of each PSM is defined with respect to the common frame ECM, while the MTMs are described with respect to the frame HRSV.

Figure 3. Snapshot of the three tasks in the ROSMA dataset at the starting position: (a) post and sleeve; (b) pea on a peg; (c) wire chaser.

additional degree of freedom for opening and closing the gripper. The tip of the instrument moves about a remote center of motion (RCM), where the origin of the base frame of each manipulator is set. The motion of each manipulator is described by its corresponding tool tip frame with respect to a common frame, labelled as ECM, as shown in Figure 2(a). The MTMs are described with respect to a common frame, labelled as
Table 1. Protocol for each task of the ROSMA dataset.

Post and sleeve
  Goal: To move the colored sleeves from side to side of the board.
  Starting position: The board is placed with the peg rows in a vertical position (from left to right: 4-2-2-4). The six sleeves are positioned over the 6 pegs on one of the sides of the board.
  Procedure: The subject has to take a sleeve with one hand, pass it to the other hand, and place it over a peg on the opposite side of the board. If a sleeve is dropped, it is considered a penalty and it cannot be taken back.
  Repetitions: Six trials: three from right to left, and the other three from left to right.
  Penalty: 15 penalty points if a sleeve is dropped.
  Score: Time in seconds + penalty points.

Pea on a peg
  Goal: To put the beads on the 14 pegs of the board.
  Starting position: All beads are on the cup.
  Procedure: The subject has to take the beads one by one out of the cup and place them on top of the pegs. For the trials performed with the right hand, the beads are placed on the right side of the board, and vice versa. If a bead is dropped, it is considered a penalty and it cannot be taken back.
  Repetitions: Six trials: three placing the beads on the pegs of the right side of the board, and the other three on the left side.
  Penalty: 15 penalty points when a bead is dropped.
  Score: Time in seconds + penalty points.

Wire chaser
  Goal: To move a ring from one side to the other side of the board.
  Starting position: The board is positioned with the text "one hand" in front. The three rings are at the right side of the board.
  Procedure: The subject has to pick one of the rings and pass it through the wire to the other side of the board. The subjects must use only one hand to move the ring, but they are allowed to help themselves with the other hand if needed. If the ring is dropped, it is considered a penalty but it must be taken back to complete the task.
  Repetitions: Six trials: three moving the rings from right to left, and the other three from left to right.
  Penalty: 10 penalty points when the ring is dropped.
  Score: Time in seconds + penalty points.

* A trial is the data corresponding to the performance of one instance of a specific task by one subject.
Table 2. Dataset summary: number of trials of each subject and exercise (post and sleeve, pea on a peg, wire chaser) in the ROSMA dataset.

Subject                  X01  X02  X03  X04  X05  X06  X07  X08  X09  X10  X11  X12  Total
No. trials per subject    18   18   17   17   17   18   18   18   16   17   17   16    207
HRSV, as shown in Figure 2(b). The transformation between the base frames and the common one in both sides of the dVRK is described in the json configuration file that can be found in the ROSMA GitHub repository*.

The ROSMA dataset contains the performance of three tasks of the Skill Building Task Set (from 3-D Technical Services, Franklin, OH): post and sleeve (Figure 3(a)), pea on a peg (Figure 3(b)), and wire chaser (Figure 3(c)). These training platforms for clinical skill development provide challenges that require motions and skills used in laparoscopic surgery, such as hand-eye coordination, bimanual dexterity, depth perception, or interaction between dominant and non-dominant hand. These exercises were conceived for clinical skill acquisition, but they are also commonly used in surgical research for other applications, such as skills training Hardon et al. (2018) or the testing of new devices Velasquez et al. (2016). The protocol for each exercise is described in Table 1.

Dataset structure
The ROSMA dataset is divided into three training tasks performed by twelve subjects. The experiments were carried out in accordance with the recommendations of our institution, with written informed consent from the subjects in accordance with the Declaration of Helsinki. Before starting the experiment, each subject was taught about the goal of the exercises and the error metrics. The overall length of the data recorded is 8 hours, 19 minutes and 40 seconds, and the total amount of kinematic data is around 1.5 million samples for each parameter. The dataset contains 416 files divided as follows: 207 data files in csv (comma-separated values) format; 207 video data files in mp4 format; 1 file in csv format with the exercises evaluation, named 'scores.csv'; and 1 file, also in csv format, with the answers of the personal questionnaire, named 'User questionnaire - dvrk Dataset Experiment.csv'.

The names of the data and video files follow this hierarchy: <User Id> <Task name> <Trial number>. The description of each of these fields is as follows: <User Id> provides a unique identifier for each user, and ranges from 'X01' to 'X12'; <Task name> may be one of the following labels, depending on the task being performed: 'Pea on a Peg', 'Post and Sleeve', or 'Wire Chaser'; and <Trial number> is the repetition number of the user in the current task, ranging from '01' to '06'. For example, the file name 'X03 Pea on a Peg 04' corresponds to the fourth trial of user 'X03' performing the task Pea on a Peg.

* https://github.com/SurgicalRoboticsUMA/rosma_dataset
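As an illustration, the naming convention above can be parsed programmatically. The following sketch (the function name and regular expression are ours, not part of the dataset tooling) splits a file stem into its three fields:

```python
import re

# Pattern for ROSMA file stems: "<User Id> <Task name> <Trial number>",
# e.g. "X03 Pea on a Peg 04" (extension stripped beforehand).
FILENAME_RE = re.compile(
    r"^(X\d{2}) (Pea on a Peg|Post and Sleeve|Wire Chaser) (\d{2})$"
)

def parse_rosma_name(stem: str):
    """Split a ROSMA file stem into (user_id, task_name, trial_number)."""
    m = FILENAME_RE.match(stem)
    if m is None:
        raise ValueError(f"not a ROSMA file name: {stem!r}")
    user, task, trial = m.groups()
    return user, task, int(trial)

print(parse_rosma_name("X03 Pea on a Peg 04"))
# ('X03', 'Pea on a Peg', 4)
```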
Table 3. Structure of the columns of the data files. All four manipulators (MTML, MTMR, PSM1, PSM2) share the same column layout; for each manipulator <name>, the columns are:

Fields       Unit        ROS publisher
<x,y,z>      m           /dvrk/<name>/position_cartesian_current/position
<x,y,z,w>    radians     /dvrk/<name>/position_cartesian_current/orientation
<x,y,z>      m/s         /dvrk/<name>/twist_body_current/linear
<x,y,z>      radians/s   /dvrk/<name>/twist_body_current/angular
<x,y,z>      N           /dvrk/<name>/wrench_body_current/force
<x,y,z>      N·m         /dvrk/<name>/wrench_body_current/torque
<joints>     radians     /dvrk/<name>/state_joint_current/position
<joints>     radians/s   /dvrk/<name>/state_joint_current/velocity
<joints>     N
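As a consistency check, this layout accounts for the 154 data columns of each file: every manipulator contributes 19 Cartesian columns plus one position/velocity/effort triple per joint. Assuming 7 joints per MTM (the standard dVRK master arm; this count is not stated in the paper) and the 6 joints per PSM stated above, the arithmetic works out:

```python
# Cartesian columns per manipulator (from Table 3):
# position(3) + orientation(4) + twist linear(3) + twist angular(3)
# + wrench force(3) + wrench torque(3)
cartesian = 3 + 4 + 3 + 3 + 3 + 3          # 19

# Joint columns: position, velocity and effort for every joint.
mtm_joints = 7   # assumption: standard dVRK MTM joint count
psm_joints = 6   # stated in the paper

mtm_cols = cartesian + 3 * mtm_joints      # per master arm
psm_cols = cartesian + 3 * psm_joints      # per patient-side arm

total = 2 * mtm_cols + 2 * psm_cols
print(total)
# 154
```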
/dvrk/<name>/state_joint_current/effort

The summary of the number of trials performed by each user and task is shown in Table 2. Each user performed a total of six trials per task, but during the post-processing of the data, the authors found recording errors in some of them. That is the reason why some users have fewer trials in certain tasks. The full dataset is available for download†.

Data files
The data files are in csv format and contain 155 columns: the first column, labelled 'Date', has the timestamp of each set of measures, and the other 154 columns have the kinematic data of the patient side manipulators (PSMs) and the master tool manipulators (MTMs). The structure of these 154 columns is described in Table 3, which details, for each kinematic variable, its fields, its data units, and the ROS publisher of the data. The descriptive labels of the columns have the following format: <component name> <kinematic motion> <variable>. The timestamp values have a precision of milliseconds, and are expressed in the format Year-Month-Day.Hour:Minutes:Seconds.Milliseconds. As data has been recorded at 50 samples per second, the time step between rows is 20 ms.
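A data file can be loaded with standard tooling such as pandas; the sketch below (the helper names are ours, and the exact strftime pattern is our reading of the timestamp format described above) parses the 'Date' column and checks the nominal 20 ms step between rows:

```python
import pandas as pd

# Assumed strftime pattern for "Year-Month-Day.Hour:Minutes:Seconds.Milliseconds"
DATE_FORMAT = "%Y-%m-%d.%H:%M:%S.%f"

def load_rosma_trial(path: str) -> pd.DataFrame:
    """Load a ROSMA csv trial and parse its 'Date' timestamp column."""
    df = pd.read_csv(path)
    df["Date"] = pd.to_datetime(df["Date"], format=DATE_FORMAT)
    return df

def median_timestep_ms(df: pd.DataFrame) -> float:
    """Median step between consecutive samples; nominally 20 ms at 50 Hz."""
    steps = df["Date"].diff().dropna().dt.total_seconds() * 1000.0
    return float(steps.median())
```

The median (rather than the mean) is used so that an occasional dropped sample does not distort the estimate.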
Video files
Images have been recorded with a commercial webcam at 15 frames per second, with a resolution of 1024 x 768 pixels. The time of the internal clock of the computer recording the images is shown at the top right corner, with a precision of seconds. The webcam was placed in front of the dVRK system so that PSM1 is on the right side of the images, and PSM2 on the left side.
Exercises evaluation
The file 'scores.csv' contains the evaluation of each exercise according to the scoring of Table 1. Thus, for each file name, it features the execution time of the task (in seconds), the number of errors, and the final score.
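The scoring rule of Table 1 (execution time in seconds plus task-specific penalty points) can be reproduced from the error counts; a minimal sketch, with the penalty weights taken from Table 1 and a function name of our own choosing:

```python
# Penalty points per dropped object, from Table 1.
PENALTY = {"Post and Sleeve": 15, "Pea on a Peg": 15, "Wire Chaser": 10}

def trial_score(task: str, time_s: float, n_errors: int) -> float:
    """Final score of a trial: execution time in seconds plus penalty points."""
    return time_s + n_errors * PENALTY[task]

print(trial_score("Wire Chaser", 90.5, 2))
# 110.5  (90.5 s plus two 10-point ring drops)
```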
Personal questionnaire
After completing the experiment, participants were asked to fill in a form to collect personal data that could be useful for further studies and analysis of the data. The form has questions related to the following items: age, dominant hand, task preferences, medical background, previous experience using the da Vinci or any teleoperated device, and hand-eye coordination skills. Questions demanding a level of expertise are multiple choice, ranging from 1 (low) to 5 (high).

† https://zenodo.org/record/3932964

Figure 4. MATLAB GUI for reproducing the motion of the PSMs and the MTMs using the data provided by the dataset: (a) MATLAB GUI; (b) RVIZ simulation.
Data synchronization
The kinematic data and the video data have been recorded using two different computers, both running Ubuntu 16.04. The internal clocks of these computers have been synchronized to a common time reference using a Network Time Protocol (NTP) server. The internal clock synchronization was repeated before starting the experiment of a new user whenever there was a break between users longer than one hour.
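Because both streams share a common clock, a kinematic row can be associated with the video frame nearest to it in time. A sketch of this mapping (the helper is ours; it assumes the video start time has been established from the on-screen clock, and that the nominal 15 fps rate holds):

```python
def frame_for_timestamp(t_row: float,
                        t_video_start: float,
                        fps: float = 15.0) -> int:
    """Index of the video frame closest to a kinematic sample time (seconds)."""
    if t_row < t_video_start:
        raise ValueError("sample precedes the start of the video")
    # Elapsed time into the video, converted to a frame index.
    return round((t_row - t_video_start) * fps)

print(frame_for_timestamp(10.6, 10.0))
# 9  (0.6 s into a 15 fps video)
```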
Using the data
We provide MATLAB code that visualizes and replicates the motion of the four manipulators of the dVRK for a performance stored in the ROSMA dataset. This code has been developed as the graphical user interface shown in Figure 4(a). When the user browses a data file from the ROSMA dataset, the code generates a mat file with all the data of the file as a structure variable. Then, the user can play the data using the GUI buttons, which also allow them to pause or stop the reproduction. The time slider shows the time progress of the reproduction, and at the top-right side, the current timestamp and data frame are also displayed. The GUI also gives the option of reproducing the data in ROS. If the checkbox
ROS is on, the joint configuration of each manipulator is sent through the corresponding ROS topics: /dvrk/<name>/set_position_goal_joint. Thus, if the ROS package of the dVRK (dvrk-ros) is running, either on the real platform or on the simulated one using the visualization tool RVIZ, the system will replicate the motion performed during the data file trial. The dVRK package can be downloaded from GitHub‡, as well as our MATLAB code§. This package includes the MATLAB GUI and a folder with all the files required to launch the dVRK package with the configuration used during the data collection.

Summary
The ROSMA dataset is a large collection of surgical robotics data recorded using the da Vinci Research Kit. The authors provide a video showing the experimental setup used for collecting the data, along with a demonstration of the performance of the three exercises and the MATLAB GUI for visualizing the data¶. The main strength of our dataset over the JIGSAWS one (Gao et al. (2014)) is the large quantity of data recorded (155 kinematic variables, images, task evaluations and a questionnaire) and the higher number of users (twelve instead of six). This amount of data could facilitate advances in the field of artificial intelligence applied to the automation of tasks in surgical robotics, as well as in surgical skill evaluation and gesture recognition.

Funding
This work was partially supported by the European Commission through the European Robotics Research Infrastructure Network (TERRINet) under grant agreement 73099, and by the Andalusian Regional Government, under grant number UMA18-FEDERJA-18.
‡ https://github.com/jhu-dvrk/dvrk-ros
§ https://github.com/SurgicalRoboticsUMA/rosma_dataset
¶

References

Ahmidi N, Tao L, Sefati S, Gao Y, Lea C, Haro BB, Zappella L, Khudanpur S, Vidal R and Hager GD (2017) A dataset and benchmarks for segmentation and recognition of gestures in robotic surgery. IEEE Transactions on Biomedical Engineering.

Chen Z, Deguet A, Taylor RH and Kazanzides P (2017) Software architecture of the da Vinci Research Kit. In: Proceedings of the 1st IEEE International Conference on Robotic Computing (IRC 2017). Institute of Electrical and Electronics Engineers Inc. ISBN 9781509067237, pp. 180–187. DOI:10.1109/IRC.2017.69.

Fontanelli GA, Ficuciello F, Villani L and Siciliano B (2017) Modelling and identification of the da Vinci Research Kit robotic arms. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Institute of Electrical and Electronics Engineers Inc. ISBN 9781538626825, pp. 1464–1469. DOI:10.1109/IROS.2017.8205948.

Gao Y, Swaroop Vedula S, Reiley CE, Ahmidi N, Varadarajan B, Lin HC, Tao L, Zappella L, Béjar B, Yuh DD, Chiung C, Chen G, Vidal R, Khudanpur S and Hager GD (2014) JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS): a surgical activity dataset for human motion modeling. In: MICCAI Workshop: Modeling and Monitoring of Computer Assisted Interventions (M2CAI). Boston, MA.

Hardon SF, Horeman T, Bonjer HJ and Meijerink WJ (2018) Force-based learning curve tracking in fundamental laparoscopic skills training. Surgical Endoscopy.

Kazanzides P, Chen Z, Deguet A, Fischer GS, Taylor RH and DiMaio SP (2014) An open-source research kit for the da Vinci Surgical System. In: IEEE International Conference on Robotics & Automation (ICRA). Hong Kong, China. ISBN 9781479936854, pp. 6434–6439.

Pérez-del-Pulgar CJ, Smisek J, Rivas-Blanco I, Schiele A and Muñoz VF (2019) Using Gaussian mixture models for gesture recognition during haptically guided telemanipulation. Electronics.

Vedula SS and Hager GD (2020) Surgical data science: the new knowledge domain. Innovative Surgical Sciences.

Velasquez et al. (2016). Surgical Endoscopy.