Original article
Exploratory usability comparison of three interaction paradigms for touchless spatio-temporal manipulations of 3D images using Leap Motion
Abstract
New devices redefining human/machine interaction are constantly entering the consumer market. After the mouse and, later, tactile interfaces, further changes are under way with the growing popularity of devices allowing contactless interaction. This type of interaction device is particularly interesting in healthcare environments where physical contact is prohibited. For instance, it can be used in the operating theatre to avoid contamination while surgeons manipulate medical images. Finding a touchless interaction paradigm that allows 3D images to be manipulated in space and time in a simple and efficient manner is not a trivial task. We performed an exploratory study to compare three different ways of interacting with the Leap Motion™, a device allowing interaction with a computer system through hand movements. Three interaction strategies were implemented: one copying the traditional mouse interaction paradigm, one assigning a different hand gesture to each possible action, and one using a single gesture but allowing switching between interaction modes. The qualitative results obtained from the debriefing and observation of the three participants who tested the systems revealed a clear preference for a limited number of gestures. The next step will be to link the device to a real medical imaging system and to use a medical scenario to confirm that these early findings remain valid in more realistic situations.
Introduction
Technological progress constantly offers new ways of interacting with data. Although the mouse and keyboard are still the dominant interaction devices, new modalities such as touch and voice are steadily improving and taking their place as natural ways to interact with machines. Tactile interaction is becoming popular, but it is far from being the only alternative. Among the alternatives, the use of body movements to interact with a system has shown promising applications, especially in healthcare environments such as the operating theatre. Surgical environments must remain free of infection, and contact with non-sterile material is therefore prohibited. Because of the number of germs that can be found on a computer keyboard, a surgeon cannot touch one during surgery [1]. However, surgeons need to interact with computer systems and, more particularly, with imaging systems that guide them during invasive procedures [2]. Several workarounds exist, such as placing a mouse in a sterile bag, but they are not very convenient. Another simple solution is to rely on a third person to perform the image manipulations under the surgeon's guidance. This solution is not optimal either: it is costly and generates frustration, as it can be difficult for a third party to carry out exactly what the surgeon intends. In this context, touchless interaction devices are a promising way to give surgeons back full control of image manipulation [3]. For instance, the Microsoft Kinect™ has been demonstrated as a possible way for surgeons to interact with imaging systems while avoiding direct contact [4]. Although this system has shown promising results for X-ray image manipulation, it relies on a very traditional interaction paradigm derived from the mouse and can be cumbersome to use [5]. In this work, we explore the usability of alternative interaction models using the Leap Motion™, a device that recognises hand movements, to manipulate 3D images [6].
Material and methods
Technical constraints
The Leap Motion™ is a device containing two cameras and three infrared light-emitting diodes (LEDs) (fig. 1a). These LEDs generate a volume of infrared dots that is captured by the cameras and transformed into usable tracking data. When a hand enters the detection space, the device identifies the number of fingers that are visible. The announced precision is about 0.01 mm and the detection space is a cube with sides of about 60 cm. The refresh rate varies depending on the type of USB port and is around 60 frames per second over a USB 2.0 connection. The application programming interface (API) can be accessed from several programming languages, such as C++, Java, Python, C#, Unity and JavaScript.
The device can detect movement along three axes. Directions, positions, rotation axes and movements are transmitted by the API as vectors. The device reports successive states (called frames), each frame containing objects such as hands and fingers, and each object exposes specific attributes that can be accessed in a simple manner.
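As an illustration, the following minimal sketch polls the device and reads hands and fingers from successive frames. It assumes the classic Leap SDK v2 Python bindings (Leap.Controller, frame.hands, palm_position, grab_strength); the polling loop and the threshold used to detect a closed fist are our own simplifications and not code from the prototype.

    import time
    import Leap   # classic Leap Motion SDK v2 Python bindings (assumed available)

    controller = Leap.Controller()

    while True:
        frame = controller.frame()                   # latest tracking state ("frame")
        for hand in frame.hands:                     # hand objects contained in the frame
            palm = hand.palm_position                # position vector (x, y, z) in millimetres
            extended = len(hand.fingers.extended())  # number of visible extended fingers
            closed_fist = hand.grab_strength > 0.9   # close to 1.0 when the fist is closed
            print(hand.id, palm, extended, closed_fist)
        time.sleep(1.0 / 60)                         # roughly the 60 fps rate over USB 2.0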
A large number of gestures were devised and each was tested to decide whether it was satisfactory (fig. 1b). Two main interaction modalities were defined: gestures performed with one hand or with two hands, and gestures relying either on a fixed hand posture or on hand movement within the detection space.
Interaction space
In order to reproduce the type of actions that can be performed while manipulating 3D images, we linked the Leap Motion™ to prototype software that permits simple manipulation of 3D images (fig. 2).
Five types of interaction can be performed by users of the interface (these are summarised in a code sketch after the list):
– spatial manipulation – covers movement in the three axes of 3D space;
– zoom – allows zooming in or out on the object displayed in the selected space;
– rotation – allows rotation of the object displayed in the selected space;
– temporal manipulation – allows travel forward or backward in the temporal space;
– pointing – allows a specific location of the 3D images to be indicated.
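To make these interaction types concrete, the following sketch models them as an enumeration of manipulation modes served by a single dispatch function. The mode names and the viewer methods are purely illustrative and are not taken from the prototype software.

    from enum import Enum

    class Mode(Enum):
        SPATIAL = 1    # translation along the three axes of 3D space
        ZOOM = 2       # zoom in or out on the displayed object
        ROTATION = 3   # rotation of the displayed object
        TEMPORAL = 4   # move forward or backward in time
        POINTING = 5   # indicate a specific location in the 3D image

    def apply(mode, dx, dy, dz, viewer):
        # 'viewer' stands for a hypothetical 3D image viewer; its method names
        # (translate, zoom, rotate, step_time, point_at) are illustrative only.
        if mode is Mode.SPATIAL:
            viewer.translate(dx, dy, dz)
        elif mode is Mode.ZOOM:
            viewer.zoom(dz)              # pushing towards or pulling away from the screen
        elif mode is Mode.ROTATION:
            viewer.rotate(dx, dy)
        elif mode is Mode.TEMPORAL:
            viewer.step_time(dx)         # left/right movement steps through time
        elif mode is Mode.POINTING:
            viewer.point_at(dx, dy, dz)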
Tested paradigms
Three different interaction paradigms were implemented. The first copies the interaction model used with a mouse and serves as a baseline. The second uses a different hand gesture for each possible manipulation. The third was devised with the opposite philosophy: every manipulation is activated by the same gesture, and a dedicated gesture is used to switch between manipulation modes.
First paradigm: imitating the mouse
In this first paradigm, the user's finger is used as a mouse pointer. Each interaction is performed by clicking with the finger on one of the buttons displayed on the screen. The possible interactions with the interface are done with the following gestures (a sketch of the pointer mapping follows this list):
– Passive mode. A closed fist is used to enter the detection space; in this way, no action is interpreted by the software before the user begins to interact.
– Pointer mode. Pointing a single finger activates the pointer mode, which allows specific items on the screen to be selected.
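A minimal sketch of this paradigm is given below: the fingertip position reported by the device is mapped onto screen coordinates, and a forward push of the finger is interpreted as a click. The screen resolution, the normalisation ranges and the click threshold are assumptions made for illustration, not values from the prototype.

    SCREEN_W, SCREEN_H = 1920, 1080    # assumed display resolution

    def to_cursor(tip_x, tip_y):
        # Map a fingertip position (in mm, device coordinates) to screen pixels,
        # assuming a useful area of x in [-200, 200] mm and y in [100, 500] mm
        # above the device; both ranges are illustrative assumptions.
        nx = min(max((tip_x + 200.0) / 400.0, 0.0), 1.0)
        ny = min(max((tip_y - 100.0) / 400.0, 0.0), 1.0)
        return int(nx * (SCREEN_W - 1)), int((1.0 - ny) * (SCREEN_H - 1))

    def is_click(tip_z, previous_tip_z, threshold_mm=30.0):
        # Treat a quick push of the pointing finger towards the screen
        # (decreasing z in device coordinates) as a click.
        return (previous_tip_z - tip_z) > threshold_mm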
Second paradigm: distinction by hand pattern
In this second proposition, each action is performed with a specific hand pattern. Once the specific hand posture is adopted, every action is performed by moving the hand in one of four directions (fig. 3). A possible mapping from hand patterns to modes is sketched in code after the following list:
– Passive mode is enabled when one or two closed hands are in the detection space.
– Pointer mode is enabled when one hand with one pointing finger is in the detection space.
– Movement mode is enabled when one open hand is in the detection space.
– Zoom and rotation mode is enabled when two open hands are located in the detection space.
– Temporal mode is enabled when two hands with one pointing finger are located in the detection space.
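The rules above can be summarised by a small classifier. The sketch below uses, for each detected hand, the number of extended fingers reported by the device; the finger-count thresholds and the returned mode labels are our own illustration of the listed rules, not the prototype's actual code.

    def classify(extended_fingers):
        # extended_fingers: number of extended fingers for each detected hand,
        # e.g. [0] for one closed fist, [1, 1] for two pointing hands.
        # Thresholds (0 = closed, 1 = pointing, >= 4 = open) are assumptions.
        if not extended_fingers:
            return None
        if all(n == 0 for n in extended_fingers):
            return "passive"              # one or two closed hands
        if len(extended_fingers) == 1:
            n = extended_fingers[0]
            if n == 1:
                return "pointer"          # one hand with one pointing finger
            if n >= 4:
                return "movement"         # one open hand
        elif len(extended_fingers) == 2:
            if all(n >= 4 for n in extended_fingers):
                return "zoom_rotation"    # two open hands
            if all(n == 1 for n in extended_fingers):
                return "temporal"         # two hands, one pointing finger each
        return None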
Third paradigm: one action, several modes
In this last paradigm every manipulation is made through the same gesture, but it is possible to switch between interaction modes using a specific gesture combination (fig. 4).
The interactions retained for this exploratory study are:
– Passive mode, enabled when a closed hand is located in the detection space.
– Mode selection, enabled by holding a vertical hand in the detection space. Once enabled, moving the hand towards one of the four corners of the screen activates one of the four manipulation modes (a possible implementation of this selection step is sketched after this list).
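A minimal sketch of the mode-selection step is given below: while the vertical hand is held, its position (normalised to the screen) is compared with the four corners, and approaching a corner activates the corresponding mode. The assignment of modes to corners and the distance threshold are illustrative assumptions, not the prototype's actual choices.

    def select_mode(norm_x, norm_y, margin=0.2):
        # norm_x, norm_y: hand position normalised to [0, 1] across the screen.
        # Returns the manipulation mode of the nearest corner, or None if the
        # hand is not close enough to any corner. The assignment of modes to
        # corners and the margin are illustrative assumptions.
        corners = {
            (0.0, 1.0): "movement",    # top-left
            (1.0, 1.0): "zoom",        # top-right
            (0.0, 0.0): "rotation",    # bottom-left
            (1.0, 0.0): "temporal",    # bottom-right
        }
        for (cx, cy), mode in corners.items():
            if abs(norm_x - cx) <= margin and abs(norm_y - cy) <= margin:
                return mode
        return None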
Evaluation methodology
The study design used an exploratory approach to compare the different interaction models. With this approach, we did not attempt to obtain quantitative results, but rather to collect users' impressions in order to identify a future direction of research. During short sessions of 30 minutes, three volunteer participants tested the three models guided by a scenario. They then shared their impressions with a moderator during a debriefing session. The sessions took place in a controlled environment where all discussions were recorded with a microphone and all participant actions were recorded on camera.
All the testing sessions followed the same sequence. An interaction model was chosen randomly and demonstrated to the participant by the moderator. After a few tries, the participant had to perform a list of predefined manipulations described in a scenario. The purpose of the scenario was to ensure that users performed every type of possible action on the interface at least once.
During the test, participants were asked to think aloud so that the moderator knew which action they wanted to perform. During the experiment, the moderator noted the number of attempts needed before each participant succeeded in reaching the desired goal. At the end of the test, a debriefing session was held to record the user's general feeling towards the tool. The topics discussed during these debriefing sessions were the standard usability dimensions: ease of learning, efficiency, effectiveness and satisfaction.
Results
The analysis of the transcript of the participants’ debriefing sessions revealed the following weaknesses and strengths of the different interaction models.
First paradigm: faking the mouse
Since the interactions are performed in the same way as with a mouse, every participant understood immediately how to manipulate the interface. However, the lack of reliability of the device made most manipulations very cumbersome. Participants had difficulty covering large distances as well as moving from one control to another. Once the destination was reached with the pointer, participants also had difficulty keeping it stable. Once stabilised, however, they could easily perform the required actions. Overall, users were poorly satisfied, because they considered that they were spending too much time on simple manipulations that would have been much quicker with a mouse.
Second paradigm: one gesture for each action
In most cases, participants had difficulty remembering all the hand patterns required for each action. Once reminded, they had to practise a little each time in order to master them. Regarding efficiency, the device sometimes did not recognise the desired actions, which irritated users. Overall, participants felt that they had spent too much effort on simple manipulations and expressed the lowest satisfaction with this kind of interaction.
Third paradigm: one gesture, several modes
The last paradigm presented was mastered by participants as quickly as the mouse paradigm. Because the number of possible actions was limited, participants quickly understood how to perform the desired actions. Moreover, since only one mode was active at a time, the other modes did not interfere with the participants' actions. Overall, this model achieved the highest satisfaction.
Discussion
Existing interaction paradigms must evolve along with the emergence of new interaction devices. Every new technology that modifies the way we interact with systems requires interaction rules that benefit from the specificities of the new type of interaction. These rules are not easy to establish, since no single design decision clearly influences the usability of the system positively or negatively. Indeed, the test sessions revealed that many choices in the interaction paradigm have both advantages and drawbacks. For instance, by comparing one-hand with two-hand manipulations, we realised that two-hand manipulations offer additional possibilities but increase complexity. Also, by comparing manipulation through ample movements with changes of hand posture, we noticed that (a) large movements offered better control of the actions performed, but induced significant fatigue with prolonged use of the system, and (b) smaller hand gestures were much more convenient for the user, but required more precision and calm to be performed accurately. Finally, after several rounds of testing, the most satisfactory experience combined large movements with more subtle hand gestures.
We are aware that the results of this exploratory study rely on a small number of participants and therefore cannot be generalised.
Conclusion
Touchless devices open a new era of interaction with computer systems in specific healthcare environments such as the operating theatre. They provide direct control without any physical contact. Several systems have already been developed to manipulate X-ray images, but most of them rely on an interaction paradigm inherited from the mouse. However, developing well-adapted interaction paradigms, with a good learning curve or based on natural gestures, is a complicated task and can hardly be derived from what is known about existing tactile devices.
In this exploratory study, we compared three interaction paradigms employed to interact through hand gestures recognised by the Leap Motion™, with a prototype application allowing manipulation of 3D images in space and time.
Observation of participants' actions and the debriefing discussions revealed that relying on the mouse interaction paradigm with a touchless interaction device is functional but not efficient. The display of a cursor put the user in a familiar situation, but made manipulations cumbersome. Using too many specific gestures to perform the manipulations makes it hard for the user to remember all of them and can lead to irritation. The alternative solution, relying on a single hand gesture combined with a mechanism to select the type of action, was unanimously recognised as the best in our comparative study.
Based on these findings, further steps will allow us to test this interaction paradigm in the realistic scenario of surgeons manipulating images.
Correspondence:
Frederic Ehrler
University Hospitals of Geneva
Division of Medical Information Sciences
Rue Gabrielle-Perret-Gentil 4
CH-1205 Geneva
frederic.ehrler[at]hcuge.ch
References
1 Bures S, Fishbain J, Uyehara C. Computer keyboards and faucet handles as reservoirs of nosocomial pathogens in the intensive care unit. Am J Infect Control. 2000 Dec;28(6):465–71. [Internet, cited 2015 May 10]. Available from: http://www.sciencedirect.com/science/article/pii/S0196655300906552
2 Mirota D, Uneri A. Evaluation of a system for high-accuracy 3D image-based registration of endoscopic video to C-arm cone-beam CT for image-guided skull base surgery. IEEE Trans Med Imaging. 2013 Jul;32(7):1215–26. [Internet, cited 2015 May 10]. Available from: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4118820/
3 O’Hara K, Dastur N, Carrell T, Gonzalez G, Sellen A, Penney G, et al. Touchless interaction in surgery. Commun ACM. 2014 Jan;57(1):70–7. Available from: http://dl.acm.org/citation.cfm?doid=2541883.2541899
4 Gallo L, Placitelli A, Ciampi M. Controller-free exploration of medical image data: Experiencing the Kinect. Comput Med… 2011. [Internet, cited 2015 Jan 21]. Available from: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5999138
5 Tory M, Moller T. Human factors in visualization research. IEEE Trans Vis Comput Graph. 2004 Jan-Feb;10(1):1–13. [Internet, cited 2014 Oct 27]. Available from: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1260759
6 Rubine D. Specifying gestures by example. ACM SIGGRAPH Comput Graph. 1991 Jul;25(4):329–37. Available from: http://portal.acm.org/citation.cfm?doid=127719.122753
Copyright
Published under the copyright license
“Attribution – Non-Commercial – NoDerivatives 4.0”.
No commercial reuse without permission.
See: emh.ch/en/emh/rights-and-licences/