Eye tracking

Eye tracking is the process of measuring either the point of gaze (where one is looking) or the motion of an eye relative to the head. An eye tracker is a device for measuring eye positions and eye movement. Eye trackers are used in research on the visual system, in psychology, in psycholinguistics, marketing, as an input device for human-computer interaction, and in product design. In addition, eye trackers are increasingly being used for assistive and rehabilitative applications such as controlling wheelchairs, robotic arms, and prostheses. Recently, eye tracking has been examined as a tool for the early detection of autism spectrum disorder. There are several methods for measuring eye movement, with the most popular variant using video images to extract eye position. Other methods use search coils or are based on the electrooculogram.

History


In the 1800s, studies of eye movement were made using direct observations. For example, Louis Émile Javal observed in 1879 that reading does not involve a smooth sweeping of the eyes along the text, as previously assumed, but a series of short stops (called fixations) and quick saccades. This observation raised important questions about reading, questions which were explored during the 1900s: On which words do the eyes stop? For how long? When do they regress to already seen words?



Edmund Huey built an early eye tracker, using a sort of contact lens with a hole for the pupil. The lens was connected to an aluminum pointer that moved in response to the movement of the eye. Huey studied and quantified regressions (only a small proportion of saccades are regressions), and he showed that some words in a sentence are not fixated.

The first non-intrusive eye-trackers were built by Guy Thomas Buswell in Chicago, using beams of light that were reflected on the eye, then recording on film. Buswell made systematic studies into reading and picture viewing.

In the 1950s, Alfred L. Yarbus performed eye tracking research, and his 1967 book is often quoted. He showed that the task given to a subject has a very large influence on the subject's eye movement. He also wrote about the relation between fixations and interest:

"All the records ... show conclusively that the character of the eye movement is either completely independent of or only very slightly dependent on the material of the picture and how it was made, provided that it is flat or nearly flat.""

The cyclical pattern in the examination of pictures "is dependent on not only what is shown on the picture, but also the problem facing the observer and the information that he hopes to gain from the picture."



"Records of eye movements show that the observer's attention is usually held only by certain elements of the picture.... Eye movement reflects the human thought processes; so the observer's thought may be followed to some extent from records of eye movement (the thought accompanying the examination of the particular object). It is easy to determine from these records which elements attract the observer's eye (and, consequently, his thought), in what order, and how often."

"The observer's attention is frequently drawn to elements which do not give important information but which, in his opinion, may do so. Often an observer will focus his attention on elements that are unusual in the particular circumstances, unfamiliar, incomprehensible, and so on."

"... when changing its points of fixation, the observer's eye repeatedly returns to the same elements of the picture. Additional time spent on perception is not used to examine the secondary elements, but to reexamine the most important elements."



In the 1970s, eye-tracking research expanded rapidly, particularly reading research. A good overview of the research in this period is given by Rayner.

In 1980, Just and Carpenter formulated the influential Strong eye-mind hypothesis, that "there is no appreciable lag between what is fixated and what is processed". If this hypothesis is correct, then when a subject looks at a word or object, he or she also thinks about it (process cognitively), and for exactly as long as the recorded fixation. The hypothesis is often taken for granted by researchers using eye-tracking. However, gaze-contingent techniques offer an interesting option in order to disentangle overt and covert attentions, to differentiate what is fixated and what is processed.

During the 1980s, the eye-mind hypothesis was often questioned in light of covert attention, the attention to something that one is not looking at, which people often do. If covert attention is common during eye-tracking recordings, the resulting scan-path and fixation patterns would often show not where attention has been, but only where the eye has been looking, failing to indicate cognitive processing.

The 1980s also saw the birth of using eye-tracking to answer questions related to human-computer interaction. Specifically, researchers investigated how users search for commands in computer menus. Additionally, computers allowed researchers to use eye-tracking results in real time, primarily to help disabled users.

More recently, there has been growth in using eye tracking to study how users interact with different computer interfaces. Specific questions researchers ask are related to how easy different interfaces are for users. The results of the eye tracking research can lead to changes in design of the interface. Another recent area of research focuses on Web development. This can include how users react to drop-down menus or where they focus their attention on a website so the developer knows where to place an advertisement.

According to Hoffman, current consensus is that visual attention is always slightly (100 to 250 ms) ahead of the eye. But as soon as attention moves to a new position, the eyes will want to follow.

Specific cognitive processes still cannot be inferred directly from a fixation on a particular object in a scene. For instance, a fixation on a face in a picture may indicate recognition, liking, dislike, puzzlement etc. Therefore, eye tracking is often coupled with other methodologies, such as introspective verbal protocols.

Thanks to advancement in portable electronic devices, portable head-mounted eye trackers currently can achieve excellent performance and are being increasingly used in research and market applications targeting daily life settings. These same advances have led to increases in the study of small eye movements that occur during fixation, both in the lab and in applied settings. In the 21st century, the use of artificial intelligence (AI) and artificial neural networks has become a viable way to complete eye-tracking tasks and analysis. In particular, the convolutional neural network lends itself to eye-tracking, as it is designed for image-centric tasks. With AI, eye-tracking tasks and studies can yield additional information that may not have been detected by human observers. The practice of deep learning also allows for a given neural network to improve at a given task when given enough sample data. This requires a relatively large supply of training data, however.

The potential use cases for AI in eye-tracking cover a wide range of topics from medical applications to driver safety to game theory and even education and training applications.

Tracker types
Eye-trackers measure rotations of the eye in one of several ways, but principally they fall into one of three categories:
 * 1) measurement of the movement of an object (normally, a special contact lens) attached to the eye
 * 2) optical tracking without direct contact to the eye
 * 3) measurement of electric potentials using electrodes placed around the eyes.

Eye-attached tracking
The first type uses an attachment to the eye, such as a special contact lens with an embedded mirror or magnetic field sensor, and the movement of the attachment is measured with the assumption that it does not slip significantly as the eye rotates. Measurements with tight-fitting contact lenses have provided extremely sensitive recordings of eye movement, and magnetic search coils are the method of choice for researchers studying the dynamics and underlying physiology of eye movement. This method allows the measurement of eye movement in horizontal, vertical and torsion directions.

Optical tracking


The second broad category uses some non-contact, optical method for measuring eye motion. Light, typically infrared, is reflected from the eye and sensed by a video camera or some other specially designed optical sensor. The information is then analyzed to extract eye rotation from changes in reflections. Video-based eye trackers typically use the corneal reflection (the first Purkinje image) and the center of the pupil as features to track over time. A more sensitive type of eye-tracker, the dual-Purkinje eye tracker, uses reflections from the front of the cornea (first Purkinje image) and the back of the lens (fourth Purkinje image) as features to track. A still more sensitive method of tracking is to image features from inside the eye, such as the retinal blood vessels, and follow these features as the eye rotates. Optical methods, particularly those based on video recording, are widely used for gaze-tracking and are favored for being non-invasive and inexpensive.

Electric potential measurement
The third category uses electric potentials measured with electrodes placed around the eyes. The eyes are the origin of a steady electric potential field which can also be detected in total darkness and if the eyes are closed. It can be modelled to be generated by a dipole with its positive pole at the cornea and its negative pole at the retina. The electric signal that can be derived using two pairs of contact electrodes placed on the skin around one eye is called Electrooculogram (EOG). If the eyes move from the centre position towards the periphery, the retina approaches one electrode while the cornea approaches the opposing one. This change in the orientation of the dipole and consequently the electric potential field results in a change in the measured EOG signal. Inversely, by analysing these changes in eye movement can be tracked. Due to the discretisation given by the common electrode setup, two separate movement components – a horizontal and a vertical – can be identified. A third EOG component is the radial EOG channel, which is the average of the EOG channels referenced to some posterior scalp electrode. This radial EOG channel is sensitive to the saccadic spike potentials stemming from the extra-ocular muscles at the onset of saccades, and allows reliable detection of even miniature saccades.

Due to potential drifts and variable relations between the EOG signal amplitudes and the saccade sizes, it is challenging to use EOG for measuring slow eye movement and detecting gaze direction. EOG is, however, a very robust technique for measuring saccadic eye movement associated with gaze shifts and detecting blinks. Contrary to video-based eye-trackers, EOG allows recording of eye movements even with eyes closed, and can thus be used in sleep research. It is a very light-weight approach that, in contrast to current video-based eye-trackers, requires low computational power, works under different lighting conditions and can be implemented as an embedded, self-contained wearable system. It is thus the method of choice for measuring eye movement in mobile daily-life situations and REM phases during sleep. The major disadvantage of EOG is its relatively poor gaze-direction accuracy compared to a video tracker. That is, it is difficult to determine with good accuracy exactly where a subject is looking, though the time of eye movements can be determined.

Technologies and techniques
The most widely used current designs are video-based eye-trackers. A camera focuses on one or both eyes and records eye movement as the viewer looks at some kind of stimulus. Most modern eye-trackers use the center of the pupil and infrared / near-infrared non-collimated light to create corneal reflections (CR). The vector between the pupil center and the corneal reflections can be used to compute the point of regard on surface or the gaze direction. A simple calibration procedure of the individual is usually needed before using the eye tracker.

Two general types of infrared / near-infrared (also known as active light) eye-tracking techniques are used: bright-pupil and dark-pupil. Their difference is based on the location of the illumination source with respect to the optics. If the illumination is coaxial with the optical path, then the eye acts as a retroreflector as the light reflects off the retina creating a bright pupil effect similar to red eye. If the illumination source is offset from the optical path, then the pupil appears dark because the retroreflection from the retina is directed away from the camera.

Bright-pupil tracking creates greater iris/pupil contrast, allowing more robust eye-tracking with all iris pigmentation, and greatly reduces interference caused by eyelashes and other obscuring features. It also allows tracking in lighting conditions ranging from total darkness to very bright.

Another, less used, method is known as passive light. It uses visible light to illuminate, something which may cause some distractions to users. Another challenge with this method is that the contrast of the pupil is less than in the active light methods, therefore, the center of iris is used for calculating the vector instead. This calculation needs to detect the boundary of the iris and the white sclera (limbus tracking). It presents another challenge for vertical eye movements due to obstruction of eyelids.

Eye-tracking setups vary greatly. Some are head-mounted, some require the head to be stable (for example, with a chin rest), and some function remotely and automatically track the head during motion. Most use a sampling rate of at least 30 Hz. Although 50/60 Hz is more common, today many video-based eye trackers run at 240, 350 or even 1000/1250 Hz, speeds needed to capture fixational eye movements or correctly measure saccade dynamics.

Eye movements are typically divided into fixations and saccades – when the eye gaze pauses in a certain position, and when it moves to another position, respectively. The resulting series of fixations and saccades is called a scanpath. Smooth pursuit describes the eye following a moving object. Fixational eye movements include microsaccades: small, involuntary saccades that occur during attempted fixation. Most information from the eye is made available during a fixation or smooth pursuit, but not during a saccade.

Scanpaths are useful for analyzing cognitive intent, interest, and salience. Other biological factors (some as simple as gender) may affect the scanpath as well. Eye tracking in human–computer interaction (HCI) typically investigates the scanpath for usability purposes, or as a method of input in gaze-contingent displays, also known as gaze-based interfaces.

Data presentation
Interpretation of the data that is recorded by the various types of eye-trackers employs a variety of software that animates or visually represents it, so that the visual behavior of one or more users can be graphically resumed. The video is generally manually coded to identify the AOIs (areas of interest) or recently using artificial intelligence. Graphical presentation is rarely the basis of research results, since they are limited in terms of what can be analysed - research relying on eye-tracking, for example, usually requires quantitative measures of the eye movement events and their parameters, The following visualisations are the most commonly used:

Animated representations of a point on the interface This method is used when the visual behavior is examined individually indicating where the user focused their gaze in each moment, complemented with a small path that indicates the previous saccade movements, as seen in the image.

Static representations of the saccade path This is fairly similar to the one described above, with the difference that this is static method. A higher level of expertise than with the animated ones is required to interpret this.

Heat maps An alternative static representation, used mainly for the agglomerated analysis of the visual exploration patterns in a group of users. In these representations, the 'hot' zones or zones with higher density designate where the users focused their gaze (not their attention) with a higher frequency. Heat maps are the best known visualization technique for eyetracking studies.

Blind zones maps, or focus maps This method is a simplified version of the heat maps where the visually less attended zones by the users are displayed clearly, thus allowing for an easier understanding of the most relevant information, that is to say, it provides more information about which zones were not seen by the users.

Saliency maps Similar to heat maps, a saliency map illustrates areas of focus by brightly displaying the attention-grabbing objects over an initially black canvas. The more focus is given to a particular object, the brighter it will appear.

Eye-tracking vs. gaze-tracking
Eye-trackers necessarily measure the rotation of the eye with respect to some frame of reference. This is usually tied to the measuring system. Thus, if the measuring system is head-mounted, as with EOG or a video-based system mounted to a helmet, then eye-in-head angles are measured. To deduce the line of sight in world coordinates, the head must be kept in a constant position or its movements must be tracked as well. In these cases, head direction is added to eye-in-head direction to determine gaze direction. However, if the motion of the head is minor, the eye remains in constant position.

If the measuring system is table-mounted, as with scleral search coils or table-mounted camera (remote) systems, then gaze angles are measured directly in world coordinates. Typically, in these situations head movements are prohibited. For example, the head position is fixed using a bite bar or a forehead support. Then a head-centered reference frame is identical to a world-centered reference frame. Or colloquially, the eye-in-head position directly determines the gaze direction.

Some results are available on human eye movements under natural conditions where head movements are allowed as well. The relative position of eye and head, even with constant gaze direction, influences neuronal activity in higher visual areas.

Practice
A great deal of research has gone into studies of the mechanisms and dynamics of eye rotation, but the goal of eye tracking is most often to estimate gaze direction. Users may be interested in what features of an image draw the eye, for example. The eye tracker does not provide absolute gaze direction, but rather can measure only changes in gaze direction. To determine precisely what a subject is looking at, some calibration procedure is required in which the subject looks at a point or series of points, while the eye tracker records the value that corresponds to each gaze position. (Even those techniques that track features of the retina cannot provide exact gaze direction because there is no specific anatomical feature that marks the exact point where the visual axis meets the retina, if indeed there is such a single, stable point.) An accurate and reliable calibration is essential for obtaining valid and repeatable eye movement data, and this can be a significant challenge for non-verbal subjects or those who have unstable gaze.

Each method of eye-tracking has advantages and disadvantages, and the choice of an eye-tracking system depends on considerations of cost and application. There are offline methods and online procedures like AttentionTracking. There is a trade-off between cost and sensitivity, with the most sensitive systems costing many tens of thousands of dollars and requiring considerable expertise to operate properly. Advances in computer and video technology have led to the development of relatively low-cost systems that are useful for many applications and fairly easy to use. Interpretation of the results still requires some level of expertise, however, because a misaligned or poorly calibrated system can produce wildly erroneous data.

Eye-tracking while driving a car in a difficult situation


The eye movement of two groups of drivers have been filmed with a special head camera by a team of the Swiss Federal Institute of Technology: Novice and experienced drivers had their eye-movement recorded while approaching a bend of a narrow road. The series of images has been condensed from the original film frames to show 2 eye fixations per image for better comprehension.

Each of these stills corresponds to approximately 0.5 seconds in real time.

The series of images shows an example of eye fixations #9 to #14 of a typical novice and of an experienced driver.

Comparison of the top images shows that the experienced driver checks the curve and even has Fixation No. 9 left to look aside while the novice driver needs to check the road and estimate his distance to the parked car.

In the middle images, the experienced driver is now fully concentrating on the location where an oncoming car could be seen. The novice driver concentrates his view on the parked car.

In the bottom image the novice is busy estimating the distance between the left wall and the parked car, while the experienced driver can use their peripheral vision for that and still concentrate vision on the dangerous point of the curve: If a car appears there, the driver has to give way, i.e. stop to the right instead of passing the parked car.

More recent studies have also used head-mounted eye tracking to measure eye movements during real-world driving conditions.

Eye-tracking of younger and elderly people while walking
While walking, elderly subjects depend more on foveal vision than do younger subjects. Their walking speed is decreased by a limited visual field, probably caused by a deteriorated peripheral vision.

Younger subjects make use of both their central and peripheral vision while walking. Their peripheral vision allows faster control over the process of walking.

Applications
A wide variety of disciplines use eye-tracking techniques, including cognitive science; psychology (notably psycholinguistics; the visual world paradigm); human-computer interaction (HCI); human factors and ergonomics; marketing research and medical research (neurological diagnosis). Specific applications include the tracking eye movement in language reading, music reading, human activity recognition, the perception of advertising, playing of sports, distraction detection and cognitive load estimation of drivers and pilots and as a means of operating computers by people with severe motor impairment. In the field of virtual reality, eye tracking is used in head mounted displays for a variety of purposes including to reduce processing load by only rendering the graphical area within the user's gaze.

Commercial applications
In recent years, the increased sophistication and accessibility of eye-tracking technologies have generated a great deal of interest in the commercial sector. Applications include web usability, advertising, sponsorship, package design and automotive engineering. In general, commercial eye-tracking studies function by presenting a target stimulus to a sample of consumers while an eye tracker records eye activity. Examples of target stimuli may include websites, television programs, sporting events, films and commercials, magazines and newspapers, packages, shelf displays, consumer systems (ATMs, checkout systems, kiosks) and software. The resulting data can be statistically analyzed and graphically rendered to provide evidence of specific visual patterns. By examining fixations, saccades, pupil dilation, blinks and a variety of other behaviors, researchers can determine a great deal about the effectiveness of a given medium or product. While some companies complete this type of research internally, there are many private companies that offer eye-tracking services and analysis.

One field of commercial eye-tracking research is web usability. While traditional usability techniques are often quite powerful in providing information on clicking and scrolling patterns, eye-tracking offers the ability to analyze user interaction between the clicks and how much time a user spends between clicks, thereby providing valuable insight into which features are the most eye-catching, which features cause confusion and which are ignored altogether. Specifically, eye-tracking can be used to assess search efficiency, branding, online advertisements, navigation usability, overall design and many other site components. Analyses may target a prototype or competitor site in addition to the main client site.

Eye-tracking is commonly used in a variety of different advertising media. Commercials, print ads, online ads and sponsored programs are all conducive to analysis with current eye-tracking technology. One example is the analysis of eye movements over advertisements in the Yellow Pages. One study focused on what particular features caused people to notice an ad, whether they viewed ads in a particular order and how viewing times varied. The study revealed that ad size, graphics, color, and copy all influence attention to advertisements. Knowing this allows researchers to assess in great detail how often a sample of consumers fixates on the target logo, product or ad. Hence an advertiser can quantify the success of a given campaign in terms of actual visual attention. Another example of this is a study that found that in a search engine results page, authorship snippets received more attention than the paid ads or even the first organic result.

Yet another example of commercial eye-tracking research comes from the field of recruitment. A study analyzed how recruiters screen LinkedIn profiles and presented results as heat maps.

Safety applications
Scientists in 2017 constructed a Deep Integrated Neural Network (DINN) out of a Deep Neural Network and a convolutional neural network. The goal was to use deep learning to examine images of drivers and determine their level of drowsiness by "classify[ing] eye states." With enough images, the proposed DINN could ideally determine when drivers blink, how often they blink, and for how long. From there, it could judge how tired a given driver appears to be, effectively conducting an eye-tracking exercise. The DINN was trained on data from over 2,400 subjects and correctly diagnosed their states 96%-99.5% of the time. Most other artificial intelligence models performed at rates above 90%. This technology could ideally provide another avenue for driver drowsiness detection.

Game theory applications
In a 2019 study, a Convolutional Neural Network (CNN) was constructed with the ability to identify individual chess pieces the same way other CNNs can identify facial features. It was then fed eye-tracking input data from 30 chess players of various skill levels. With this data, the CNN used gaze estimation to determine parts of the chess board to which a player was paying close attention. It then generated a saliency map to illustrate those parts of the board. Ultimately, the CNN would combine its knowledge of the board and pieces with its saliency map to predict the players' next move. Regardless of the training dataset the neural network system was trained upon, it predicted the next move more accurately than if it had selected any possible move at random, and the saliency maps drawn for any given player and situation were more than 54% similar.

Assistive technology
People with severe motor impairment can use eye tracking for interacting with computers as it is faster than single switch scanning techniques and intuitive to operate. Motor impairment caused by Cerebral Palsy or Amyotrophic lateral sclerosis often affects speech, and users with Severe Speech and Motor Impairment (SSMI) use a type of software known as Augmentative and Alternative Communication (AAC) aid, that displays icons, words and letters on screen and uses text-to-speech software to generate spoken output. In recent times, researchers also explored eye tracking to control robotic arms and powered wheelchairs. Eye tracking is also helpful in analysing visual search patterns, detecting presence of Nystagmus and detecting early signs of learning disability by analysing eye gaze movement during reading.

Aviation applications
Eye tracking has already been studied for flight safety by comparing scan paths and fixation duration to evaluate the progress of pilot trainees, for estimating pilots' skills, for analyzing crew's joint attention and shared situational awareness. Eye tracking technology was also explored to interact with helmet mounted display systems and multi-functional displays in military aircraft. Studies were conducted to investigate the utility of eye tracker for Head-up target locking and Head-up target acquisition in Helmet mounted display systems (HMDS). Pilots' feedback suggested that even though the technology is promising, its hardware and software components are yet to be matured. Research on interacting with multi-functional displays in simulator environment showed that eye tracking can improve the response times and perceived cognitive load significantly over existing systems. Further, research also investigated utilizing measurements of fixation and pupillary responses to estimate pilot's cognitive load. Estimating cognitive load can help to design next generation adaptive cockpits with improved flight safety. Eye tracking is also useful for detecting pilot fatigue.

Automotive applications
In recent time, eye tracking technology is investigated in automotive domain in both passive and active ways. National Highway Traffic Safety Administration measured glance duration for undertaking secondary tasks while driving and used it to promote safety by discouraging the introduction of excessively distracting devices in vehicles In addition to distraction detection, eye tracking is also used to interact with IVIS. Though initial research investigated the efficacy of eye tracking system for interaction with HDD (Head Down Display), it still required drivers to take their eyes off the road while performing a secondary task. Recent studies investigated eye gaze controlled interaction with HUD (Head Up Display) that eliminates eyes-off-road distraction. Eye tracking is also used to monitor cognitive load of drivers to detect potential distraction. Though researchers explored different methods to estimate cognitive load of drivers from different physiological parameters, usage of ocular parameters explored a new way to use the existing eye trackers to monitor cognitive load of drivers in addition to interaction with IVIS.

Entertainment applications
The 2021 video game Before Your Eyes registers and reads the player's blinking, and uses it as the main way of interacting with the game.

Engineering applications
The widespread use of eye-tracking technology has shed light to its use in empirical software engineering in the most recent years. The eye-tracking technology and data analysis techniques are used to investigate the understandability of software engineering concepts by the researchers. These include the understandability of business process models, and diagrams used in software engineering such as UML activity diagrams and EER diagrams. Eye-tracking metrics such as fixation, scan-path, scan-path precision, scan-path recall, fixations on area of interest/relevant region are computed, analyzed and interpreted in terms of model and diagram understandability. The findings are used to enhance the understandability of diagrams and models with proper model related solutions and by improving personal related factors such as working-memory capacity, cognitive-load, learning style and strategy of the software engineers and modelers.

Cartographic applications
Cartographic research has widely adopted eye tracking techniques. Researchers have used them to see how individuals perceive and interpret maps. For example, eye tracking has been used to study differences in perception of 2D and 3D visualization, comparison of map reading strategies between novices and experts or students and their geography teachers, and evaluation of the cartographic quality of maps. Besides, cartographers have employed eye tracking to investigate various factors affecting map reading, including attributes such as color or symbol density. Numerous studies about the usability of map applications took advantage of eye tracking, too.

The cartographic community's daily engagement with visual and spatial data positioned it to contribute significantly to eye tracking data visualization methods and tools. For example, cartographers have developed methods for integrating eye tracking data with GIS, utilizing GIS software for further visualization and analysis. The community has also delivered tools for visualizing eye tracking data or a toolbox for the identification of eye fixations based on a spatial component of eye-tracking data.

Privacy concerns
With eye tracking projected to become a common feature in various consumer electronics, including smartphones, laptops and virtual reality headsets, concerns have been raised about the technology's impact on consumer privacy. With the aid of machine learning techniques, eye tracking data may indirectly reveal information about a user's ethnicity, personality traits, fears, emotions, interests, skills, and physical and mental health condition. If such inferences are drawn without a user's awareness or approval, this can be classified as an inference attack. Eye activities are not always under volitional control, e.g., "stimulus-driven glances, pupil dilation, ocular tremor, and spontaneous blinks mostly occur without conscious effort, similar to digestion and breathing”. Therefore, it can be difficult for eye tracking users to estimate or control the amount of information they reveal about themselves.