Plane-Casting: 3D Cursor Control with a SmartPhone
Nicholas Katzakis, Kiyoshi Kiyokawa, Masahiro Hori, Haruo Takemura
Nicholas Katzakis
Osaka University, Toyonaka, [email protected]
Kiyoshi Kiyokawa
Osaka University, Toyonaka, [email protected]
Masahiro Hori
Kansai University, Takatsuki, [email protected]
Haruo Takemura
Osaka University, Toyonaka, [email protected]
ABSTRACT
We present Plane-Casting, a novel technique for 3D object manipulation from a distance that is especially suitable for smartphones. We describe two variations of Plane-Casting, Pivot and Free Plane-Casting, and present results from a pilot study. Results suggest that Pivot Plane-Casting is more suitable for quick, coarse movements whereas Free Plane-Casting is more suited to slower, precise motion. In a 3D movement task, Pivot Plane-Casting performed better quantitatively, but subjects preferred Free Plane-Casting overall.
INTRODUCTION
3D interaction is a challenging problem and has been for over half a century, ever since the creation of the first 3D computer graphics. As hardware technology advanced and display sizes grew, it became possible to view graphics from a distance, so the need to interact from a distance also arose. However, currently available controllers for remote 3D control, among other weaknesses, lack intuitiveness. Therefore, in this work we propose the use of smartphones as 3D controllers. State of the art smartphones feature an array of orientation sensors which make it possible to calculate the device's orientation in 3D. By further employing the touch-screen we demonstrate that with our proposed technique, Plane-Casting, it is possible to translate an object in 3D. In Plane-Casting, the rotation of the smartphone controls a virtual plane that constrains the movement of the 3D cursor. Aside from the potential for intuitive 3D control, the very wide availability of smartphones is an additional motivating factor for our work.

Examples of situations where there is a need to interact in 3D from a distance include the following:
• Entertainment: As the number of displays in urban spaces increases, there are numerous opportunities for social entertainment involving 3D (such as 3D games).
• Design: A team of designers is reviewing the latest 3D assets in their weekly meeting. Participants interact, review and discuss changes relating to the 3D geometry.
• Education: A medical school professor is demonstrating the anatomy of the human heart by projecting 3D graphics. A smartphone controller allows the professor to leave the podium and approach the students while still being able to interact with the model, thus making the class more engaging. Students can also use their smartphones to actively participate.

In the remainder of this paper we present two variations of Plane-Casting, Pivot Plane-Casting and Free Plane-Casting. We discuss their strengths and limitations and present results from a pilot study.
RELATED WORK
Touch input enables users to interact with a display by removing an indirection layer, and there are quite a few solutions for 3D interaction using multi-touch [3]. However, when the display size exceeds a certain threshold, touch input ceases to be an option, as the user needs to cover a large area with physical movements, and in some cases the display area is out of reach (as is the case with projectors and tiled displays). In addition to the input problems of touch, physically approaching the display to interact limits the user's activity to a very small area of the display, and in the case of collaborative work the interacting user obscures the display for the rest of the group. As an alternative to touch, gesture approaches require a carefully controlled environment and lack somewhat in efficiency for practical use.

The Nintendo Wii-mote™ is a popular choice for remote control but depends on a 2-state directional pad for additional degrees of freedom. Other controllers like the 3Dconnexion SpaceNavigator™ depend on desks and are tethered by cables, making them unsuitable for an active, engaging experience or for use in public or shared spaces.

More directly related to our work, Bier's discussion on constraining motion in a scene composition scenario is one of the earliest references in the literature [1]. Bier further emphasizes the power of constraint-based systems in subsequent works.

Hachet et al. [2] propose a controller that attaches to the side of mobile phones and can provide 3-DOF control, a solution which could be used for remote 3D control. They evaluate the design in a navigation scenario and report positive reactions from the users. Their approach is, however, based on proprietary hardware external to the device, and is limited to rate control.

More recently, Jimenez et al. [4] used a hand-held device in a museum scenario for the remote assembly of a puzzle-like task. Their work highlights some of the social aspects of using a hand-held interface in a collaborative task.
Their evaluation suggests that a usable interface might better promote equal participation in a group task.

Figure 1: The two variations of Plane-Casting. (a) Pivot PC: the user gestures to translate the acquired object, which moves on the plane. (b) The plane follows the rotation of the device and rotates about the pivot with the object bound to it. (c) The object's rotation does not change, only its position. (d) Free PC: the acquired object moves on the plane, but the plane stays attached to it. (e) The pivot point of the plane is always fixed at the center of the object's bounding box. (f) The user gestures towards the motion direction regardless of the device's orientation.
Finally, Song et al. [6] used a hand-held device in a large-display scenario. The device controls the position of a slicing plane in 5-DOF to explore volume-rendered data, and the authors present a novel technique to annotate the data. Song's approach unfortunately requires physical proximity to the screen, which makes it unsuitable for remote or collaborative work, and it also depends on proprietary hardware attached to the hand-held device, thus limiting its applicability. Their paper offers a thorough review of the literature on hand-held/remote interaction.

Although there are a few interaction techniques that use magnetic trackers, like the go-go technique [5] or WIM [7], there is currently no established standard technique/device for remote graphics manipulation. Although smartphones have been used in the past, solutions have been inadequate.

Pivot Plane-Casting
In Pivot Plane-Casting (PivotPC) the shape of the touch-screen is drawn as a rectangle at the center of the scene (Figure 1(a)). The user can rotate the device to control the orientation of the plane (position-controlled). The plane's pivot point is at the center of the rectangle and always remains fixed at the center of the virtual space. The 3D cursor's movement is constrained to the plane defined by the rectangle, but is not limited to its bounds. The user can translate the cursor on the plane by gesturing on the touch-screen, and can rotate the plane to move the cursor to any point in 3D space (Figure 1(b)). In our implementation the tactile sensor is a touch screen, but any touch panel that can be tracked is suitable for Plane-Casting.

Figure 2: In PivotPC, moving vertically to the plane becomes easier as the object moves away from the pivot point of the plane.

Figure 3: Illustration of the experimental setup.
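To make the mapping concrete, the PivotPC update can be sketched as follows. This is a minimal illustration and not the authors' implementation: the class structure, the axis conventions and the `gain` parameter are our own assumptions.

```python
class PivotPlaneCasting:
    """Illustrative PivotPC sketch (not the paper's code).

    The cursor is stored in 2D plane coordinates. The plane's pivot
    stays fixed at the centre of the scene while the plane's orientation
    follows the device, so the cursor bound to the plane rotates about
    the pivot together with the plane.
    """

    def __init__(self, pivot=(0.0, 0.0, 0.0)):
        self.pivot = pivot
        self.u_coord = 0.0  # cursor position within the plane
        self.v_coord = 0.0

    def on_touch_drag(self, dx, dy, gain=1.0):
        """A touch-screen gesture translates the cursor within the plane."""
        self.u_coord += gain * dx
        self.v_coord += gain * dy

    def cursor_world(self, rotation):
        """3D cursor position for the current device orientation.

        rotation is a 3x3 rotation matrix (list of rows) derived from the
        phone's orientation sensors; the plane's in-plane axes are taken
        as the rotated x and y unit vectors (an assumed convention).
        """
        u = [row[0] for row in rotation]  # rotated x axis (matrix column 0)
        v = [row[1] for row in rotation]  # rotated y axis (matrix column 1)
        return tuple(p + self.u_coord * ui + self.v_coord * vi
                     for p, ui, vi in zip(self.pivot, u, v))
```

Rotating the device 90° about the horizontal axis sends the plane's second axis into the depth direction, so a previously "upward" drag now moves the cursor in depth; this is how PivotPC reaches any point in space.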
Free Plane-Casting
Free Plane-Casting (FreePC) is similar to PivotPC, but in this variation the plane's pivot point follows the cursor's motion in 3D space: FreePC shifts the pivot point with every slide movement. The rectangle that defines the plane is thus always attached to the cursor being manipulated, and they move as one, with the orientation of the rectangle constantly re-defining the plane (Figure 1(d)).

In PivotPC, placing the cursor away from the pivot point of the plane makes it easier to move the object vertically, along the normal to the current plane, and thus easier to quickly change direction, as would be the case in a game (Figure 2), at the cost of accuracy. By its nature, FreePC makes only 2-DOF instantly available at any time, and moving along the normal to the current plane requires a small supination/pronation movement (Figure 1(e)).

In our implementation of FreePC and PivotPC, selection of the object to be manipulated is done with a spherical cursor that intersects the desired object, following the widely used "virtual hand" metaphor. Depending on the application there are many strategies for object selection, but these remain beyond the scope of this work.
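The FreePC update is even simpler to sketch: because the pivot always coincides with the cursor, there is no separate plane-coordinate state. Again this is an illustrative sketch under assumed axis conventions, not the authors' implementation.

```python
def free_pc_step(cursor, rotation, dx, dy, gain=1.0):
    """One FreePC update (illustrative sketch, not the paper's code).

    The plane's pivot always coincides with the cursor, so a touch delta
    (dx, dy) simply advances the cursor along the plane's current
    in-plane axes. Only these 2-DOF are available per gesture; motion
    along the plane normal first requires a wrist rotation
    (supination/pronation) to re-orient the plane.
    """
    u = [row[0] for row in rotation]  # rotated x axis (assumed convention)
    v = [row[1] for row in rotation]  # rotated y axis
    return tuple(c + gain * (dx * ui + dy * vi)
                 for c, ui, vi in zip(cursor, u, v))
```

The design difference from PivotPC is visible in the state: PivotPC accumulates plane coordinates about a fixed pivot, whereas FreePC accumulates world-space cursor positions, so gestures always act relative to wherever the cursor currently is.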
PILOT STUDY
Our pilot study evaluated the two techniques against each other in a 3D positioning task. A remote manipulation scenario like the ones mentioned in the introduction would require the ability to move an object in 3D, but which technique would be best for this? We wanted to test the following two hypotheses:
• H1: PivotPC will perform faster than FreePC, since it only requires an initial alignment of the plane to the target.
• H2: FreePC will be more accurate, as it essentially brings the pivot point closer to the target and does not require a "steady hand" like PivotPC.

12 right-handed male participants (students and faculty, mean age 25) volunteered for the experiment. Participants had no prior experience with the techniques.
Set-up
Subjects sat 270 cm from the projection screen of an ultra-short-focus projector (Sanyo PDG-DWL2500J). They were instructed to hold the smartphone (Samsung Galaxy SII) in their non-dominant hand while gesturing on the touch-screen with their dominant hand (Figure 3). A foot switch was available for advancing to the next trial. The projection screen had a width and height of 245x138 cm respectively, with a 1280x800 display resolution in stereoscopic 3D (Nvidia 3D Vision). The device sensor information was transmitted over an IEEE 802.11g WiFi link and filtered with a 30-sample moving average filter for stabilization.

Figure 4: Screenshot of the evaluation task (in monoscopic 3D) using PivotPC. Users had to dock the cursor (multi-colored house) to the translucent target.
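A moving average filter of the kind used here keeps a sliding window of recent readings and reports their mean; a minimal per-channel sketch (the window size is from the paper, the API is our own) is:

```python
from collections import deque

class MovingAverageFilter:
    """Sliding-window moving average for one streamed sensor channel,
    like the 30-sample filter used to stabilise the orientation data
    (illustrative sketch; the class design is an assumption)."""

    def __init__(self, window=30):
        # deque with maxlen automatically discards the oldest sample
        self.samples = deque(maxlen=window)

    def update(self, value):
        """Add a new sensor reading and return the smoothed value."""
        self.samples.append(value)
        return sum(self.samples) / len(self.samples)
```

One such filter would be kept per orientation component; larger windows give smoother but laggier cursor-plane motion, which is the usual trade-off when filtering live sensor streams.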
In the evaluation task, the house-shaped cursor (and rectangle) appeared between the viewpoint and the far wall of the 3D space (Figure 4). When the experiment commenced, a translucent copy of the cursor appeared randomly at one of 12 pre-defined positions around the cursor (Figure 6), and subjects had to match the position of the cursor with that of the target under two conditions: 1) as quickly and 2) as accurately as possible. The targets appeared at positions distributed evenly on the surface of a sphere centered at the cursor's starting position, with a radius of either 52 or 96 cm (Figure 6). Each position was tested twice with each technique, once in the Speed and once in the Accuracy condition (balanced order).

When the cursor's bounding box intersected the target's bounding box, the target's bounding box would become visible, signaling a match. Subjects could not end the trial if they did not have a match. When subjects felt they had achieved a good match, they pressed the foot switch and the trial ended, with both the cursor and the target disappearing; only the rectangle remained. In FreePC, the pivot point of the plane/rectangle would return to the center of the scene. The next trial would only begin when subjects returned the device to its original orientation, parallel to the ground, at which point the cursor would re-appear at the center of the plane and the target at the next position to be tested.

All subjects received a brief explanation of the techniques and performed the task once with each technique as practice (the order of techniques was also balanced). Subjects performed trials at all 12 positions in each technique and condition. The experiment lasted around 40 minutes, with no break between conditions.

We recorded the movement time (MT) and the accuracy, measured as the Euclidean distance (d) between the center of the cursor's bounding box and that of the target's bounding box at the time the subject pressed the foot switch. We also recorded the distance the user's finger traveled on the touch screen (T) during every trial. We analyzed the data with a repeated-measures ANOVA.
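The match signal and the accuracy measure d can both be expressed in a few lines. The paper does not give its implementation; this is a standard axis-aligned bounding-box overlap test and a Euclidean distance, written as a sketch:

```python
def boxes_intersect(min_a, max_a, min_b, max_b):
    """Axis-aligned bounding-box overlap test: the kind of check that can
    signal a cursor/target match (illustrative; not the paper's code).
    Boxes are given by their min and max corners, e.g. (x, y, z) tuples.
    """
    return all(lo_a <= hi_b and lo_b <= hi_a
               for lo_a, hi_a, lo_b, hi_b in zip(min_a, max_a, min_b, max_b))

def distance_d(center_a, center_b):
    """Euclidean distance between two box centers: the accuracy measure d."""
    return sum((a - b) ** 2 for a, b in zip(center_a, center_b)) ** 0.5
```

Two boxes overlap exactly when their intervals overlap on every axis, which is why the test is a conjunction of per-axis comparisons.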
Technique had a strong effect on movement time (MT). The average MT for FreePC was 9.7 s vs. 8.2 s for PivotPC (F=17.1, p<.001). This confirms our first hypothesis (H1) that PivotPC is the quicker technique of the two, though given that FreePC requires repeated supination/pronation we expected the performance difference to be greater.

Technique also had a significant effect on the amount participants gestured on the touch-screen. An average of 190 pixels traveled was recorded when using PivotPC, whereas when using FreePC users gestured an average of 225 pixels (F=23.6, p<.001). These results suggest that PivotPC could be implemented on devices with smaller touch surfaces than FreePC.

Contrary to our second hypothesis (H2), neither technique was found to be more accurate (d=6.5 for PivotPC vs. 6.8 for FreePC; F=0.3, not significant).

Whether the target was placed at the near radius (52 cm) or the far radius (96 cm) had no significant effect on the time cursors took to reach the target or on the other measures (F=3.2, F=1.5 and F=0.4 respectively).

Accuracy/Speed Tradeoff
As would be expected, the speed/accuracy condition had a strong effect on both movement time and accuracy. In the Speed condition, users were asked to be as quick as possible with reasonable accuracy, and in the Accuracy condition they were asked to be as accurate as possible, again with reasonable speed. Accordingly, users averaged an MT of 5.2 s in the Speed condition vs. 12.7 s in the Accuracy condition (F=115.8). Accuracy followed the same pattern, with a smaller distance d in the Accuracy condition than the 8.7 recorded in the Speed condition (F=27).

Movement time started counting either when users rotated the device past a 2% threshold, thus signaling their attempt to align the plane with the target, or when they first swiped on the touch-screen, whichever came first. The T measurement is the total distance in pixels that the finger traveled on the touch-screen while pressed down during the trial.
Figure 5: FreePC: mean movement time for every target position.

Figure 6: Layout of the targets. 12 positions evenly distributed on a sphere around the cursor's starting position, with 4 of them axis-aligned. Targets 1-8 are pivoted 45° about the Y and Z axes. The two missing front and back axis-aligned positions were not tested, since they occluded or were occluded by the cursor and slightly confused participants.

Target position had a strong effect on MT. In particular, positions 9 and 10, which are horizontally aligned with the starting position, were the fastest in both techniques, as expected (Figures 5-7). In FreePC (Figure 5) there was a relatively even distribution of movement times across the various positions. In PivotPC (Figure 7), however, users struggled with position 8 (mean time 15.5 s; F=3.8). We suspect this is because users had to find the best combination of tilting and swiping to achieve the desired motion, something which, however, remains to be validated. Finally, the position of the target had no significant effect on accuracy (F=1.5).

Figure 7: PivotPC: mean movement time for every target position. Position 8 was significantly slower to reach.

Subjects answered a post-experimental questionnaire asking them to rate PivotPC and FreePC in terms of intuitiveness and physical demands. The responses show that subjects clearly favor Free Plane-Casting over Pivot Plane-Casting both in intuitiveness and in physical demands, even though overall quantitative performance was better with PivotPC for this specific task. Subjects also commented that in FreePC, since the cursor can only move when gesturing, there is less "pressure" to keep the device aligned with the target (as is required in PivotPC; see Figure 2), and that this is cognitively (and physically) less demanding.
CONCLUSION
We have introduced two variations of a novel technique for 3D cursor manipulation using a smartphone. Our pilot study verifies their usability and highlights some of the issues associated with each one. For the docking task, PivotPC seems to be the overall quantitative winner, but subjects preferred FreePC. Further evaluation is required to ascertain their applicability to real-life scenarios. This work establishes a broad base upon which more specific Plane-Casting-based 3D applications can be built. As future work we plan to implement simultaneous rotation using multi-touch gestures, as well as to develop techniques for annotation using touch.
REFERENCES
[1] Eric A. Bier. 1986. Skitters and Jacks: Interactive 3D Positioning Tools. In Proc. Workshop on Interactive 3D Graphics. ACM, 183–196. https://doi.org/10.1145/319120.319135
[2] Martin Hachet, Joachim Pouderoux, and Pascal Guitton. 2008. 3D Elastic Control for Mobile Devices. IEEE Computer Graphics and Applications 28, 4 (2008), 58–62. https://doi.org/10.1109/MCG.2008.64
[3] Mark Hancock, Sheelagh Carpendale, and Andy Cockburn. 2007. Shallow-depth 3D interaction: design and evaluation of one-, two- and three-touch techniques. In Proc. CHI. ACM, 1147–1156.
[4] Priscilla Jimenez and Leilah Lyons. 2011. An exploratory study of input modalities for mobile devices used with museum exhibits. In Proc. CHI. ACM, 895–904. https://doi.org/10.1145/1978942.1979075
[5] Ivan Poupyrev, Mark Billinghurst, Suzanne Weghorst, and Tadao Ichikawa. 1996. The go-go interaction technique: non-linear mapping for direct manipulation in VR. In Proc. UIST '96. ACM, 79–80. https://doi.org/10.1145/237091.237102
[6] Peng Song, Wooi B. Goh, Chi W. Fu, Qiang Meng, and Pheng A. Heng. 2011. WYSIWYF: exploring and annotating volume data with a tangible handheld device. In Proc. CHI '11. ACM, 1333–1342. https://doi.org/10.1145/1978942.1979140
[7] Richard Stoakley, Matthew J. Conway, and Randy Pausch. 1995. Virtual reality on a WIM: interactive worlds in miniature. In Proc. CHI '95. ACM Press/Addison-Wesley, 265–272. https://doi.org/10.1145/223904.223938