Fitts's law

Fitts's law (often cited as Fitts' law) is a predictive model of human movement primarily used in human–computer interaction and ergonomics. The law predicts that the time required to rapidly move to a target area is a function of the ratio between the distance to the target and the width of the target. Fitts's law is used to model the act of pointing, either by physically touching an object with a hand or finger, or virtually, by pointing to an object on a computer monitor using a pointing device. It was initially developed by Paul Fitts.

Fitts's law has been shown to apply under a variety of conditions: with many different limbs (hands, feet, the lower lip, head-mounted sights), manipulanda (input devices), physical environments (including underwater), and user populations (young, old, special educational needs, and drugged participants).

Original model formulation
The original 1954 paper by Paul Morris Fitts proposed a metric to quantify the difficulty of a target selection task. The metric was based on an information analogy, where the distance to the center of the target (D) is like a signal and the tolerance or width of the target (W) is like noise. The metric is Fitts's index of difficulty (ID, in bits):

$$\text{ID} = \log_2 \Bigg(\frac{2D} {W}\Bigg)$$



Fitts also proposed an index of performance (IP, in bits per second) as a measure of human performance. The metric combines a task's index of difficulty (ID) with the movement time (MT, in seconds) in selecting the target. In Fitts's words, "The average rate of information generated by a series of movements is the average information per movement divided by the time per movement." Thus,

$$\text{IP} = \Bigg(\frac{\text{ID}} {\text{MT}}\Bigg)$$

Today, IP is more commonly called throughput (TP). It is also common to include an adjustment for accuracy in the calculation.
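As a small illustration of the two definitions above, the following Python sketch computes ID and IP for a single movement. The function names and the trial values (a 16 mm target 128 mm away, reached in 0.5 s) are made up for the example and are not taken from Fitts's data.

```python
import math


def index_of_difficulty(distance: float, width: float) -> float:
    """Fitts's original index of difficulty in bits: log2(2D / W)."""
    if distance <= 0 or width <= 0:
        raise ValueError("distance and width must be positive")
    return math.log2(2 * distance / width)


def index_of_performance(distance: float, width: float, movement_time: float) -> float:
    """Index of performance (throughput) in bits per second: ID / MT."""
    if movement_time <= 0:
        raise ValueError("movement_time must be positive")
    return index_of_difficulty(distance, width) / movement_time


# Hypothetical trial: D = 128 mm, W = 16 mm, MT = 0.5 s.
id_bits = index_of_difficulty(128, 16)          # log2(16) = 4 bits
ip_bits_per_s = index_of_performance(128, 16, 0.5)  # 4 / 0.5 = 8 bits per second
```

Because only the ratio D/W enters the formula, any length unit can be used as long as D and W share it.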

Researchers after Fitts began the practice of building linear regression equations and examining the correlation (r) for goodness of fit. The equation expresses the relationship between MT and the D and W task parameters:

$$\text{MT} = a + b \cdot \text{ID} = a + b \cdot \log_2 \Bigg(\frac{2D}{W}\Bigg)$$



where:
 * MT is the average time to complete the movement.
 * a and b are constants that depend on the choice of input device and are usually determined empirically by regression analysis. a is the intercept on the y axis and is often interpreted as a delay. b is the slope, in seconds per bit, and describes how quickly movement time grows as the task becomes more difficult. Together, the two parameters express the linear dependency of MT on ID in Fitts's law.
 * ID is the index of difficulty.
 * D is the distance from the starting point to the center of the target.
 * W is the width of the target measured along the axis of motion. W can also be thought of as the allowed error tolerance in the final position, since the final point of the motion must fall within ±$W/2$ of the target's center.
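The regression described above can be sketched in a few lines of Python. This is a minimal ordinary least squares fit using only the standard library; the function name `fit_fitts` and the four D-W conditions are hypothetical, and the movement times are synthetic values generated from a = 0.1 s and b = 0.15 s/bit, so the fit is exact by construction.

```python
import math


def fit_fitts(conditions):
    """Least squares fit of MT = a + b * ID with ID = log2(2D / W).

    conditions: list of (D, W, mean_MT) tuples, one per D-W condition.
    Returns (a, b, r): intercept in seconds, slope in seconds per bit,
    and the Pearson correlation between ID and MT.
    """
    ids = [math.log2(2 * d / w) for d, w, _ in conditions]
    mts = [mt for _, _, mt in conditions]
    n = len(ids)
    mean_id = sum(ids) / n
    mean_mt = sum(mts) / n
    sxx = sum((x - mean_id) ** 2 for x in ids)
    syy = sum((y - mean_mt) ** 2 for y in mts)
    sxy = sum((x - mean_id) * (y - mean_mt) for x, y in zip(ids, mts))
    b = sxy / sxx
    a = mean_mt - b * mean_id
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r


# Synthetic data (not experimental): IDs of 2, 3, 4 and 5 bits.
data = [(64, 32, 0.40), (128, 32, 0.55), (256, 32, 0.70), (256, 16, 0.85)]
a, b, r = fit_fitts(data)
print(f"a = {a:.3f} s, b = {b:.3f} s/bit, r = {r:.3f}")
```

With real data the points scatter around the line, and r indicates how well the linear model describes them.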

Since shorter movement times are desirable for a given task, the value of the b parameter can be used as a metric when comparing computer pointing devices against one another. The first human–computer interface application of Fitts's law was by Card, English, and Burr, who used the index of performance (IP), interpreted as $1/b$, to compare performance of different input devices, with the mouse coming out on top compared to the joystick or directional movement keys. This early work, according to Stuart Card's biography, "was a major factor leading to the mouse's commercial introduction by Xerox".

Many experiments testing Fitts's law apply the model to a dataset in which either distance or width, but not both, is varied. The model's predictive power deteriorates when both are varied over a significant range. Because the ID term depends only on the ratio of distance to width, the model implies that a target distance and width combination can be rescaled arbitrarily without affecting movement time, which cannot hold at all scales. Despite its flaws, this form of the model does possess remarkable predictive power across a range of computer interface modalities and motor tasks, and has provided many insights into user interface design principles.
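The scale invariance noted above is easy to see numerically. In this sketch the constants a = 0.1 s and b = 0.15 s/bit are placeholder values chosen for illustration, not measured parameters.

```python
import math


def predicted_mt(distance: float, width: float, a: float = 0.1, b: float = 0.15) -> float:
    """Predicted movement time in seconds from MT = a + b * log2(2D / W)."""
    return a + b * math.log2(2 * distance / width)


# A 10 mm reach to a 2 mm target, and the same geometry scaled 100 times.
small_task = predicted_mt(10, 2)
large_task = predicted_mt(1000, 200)

# Both tasks have the same D/W ratio, so the model predicts identical times.
print(small_task, large_task)
```

A 1 m reach is predicted to take exactly as long as a 1 cm reach, which illustrates why the one-factor model cannot hold over an unlimited range of scales.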

Movement
A movement during a single Fitts's law task can be split into two phases:


 * initial movement: a fast but imprecise movement towards the target
 * final movement: a slower but more precise movement to acquire the target

The first phase is governed by the distance to the target: the distance can be closed quickly while the movement is still imprecise. In the second phase, the user makes a slow, controlled movement to actually hit the target. Task duration scales linearly with the index of difficulty. However, because different tasks can have the same difficulty, it is inferred that distance has a greater impact on overall task completion time than target size.

Fitts's law is often said to apply to eye tracking. This is at least controversial, as Drewes showed: during fast saccadic eye movements the user is effectively blind, whereas during a Fitts's law task the user consciously acquires the target and can see it throughout, so the two types of interaction are not comparable.

Bits per second: model innovations driven by information theory
The formulation of Fitts's index of difficulty most frequently used in the human–computer interaction community is called the Shannon formulation:

$$\text{ID} = \log_2 \Bigg(\frac{D}{W}+1\Bigg)$$

This form was proposed by Scott MacKenzie, professor at York University, and named for its resemblance to the Shannon–Hartley theorem. The theorem describes the transmission of information in terms of bandwidth, signal strength, and noise. In Fitts's law, the distance plays the role of signal strength, while the target width plays the role of noise.

Using this form of the model, the difficulty of a pointing task was equated to a quantity of information transmitted (in units of bits) by performing the task. This was justified by the assertion that pointing reduces to an information processing task. Although no formal mathematical connection was established between Fitts's law and the Shannon–Hartley theorem that inspired it, the Shannon form of the law has been used extensively, likely due to the appeal of quantifying motor actions using information theory. In 2002 ISO 9241 was published, providing standards for human–computer interface testing, including the use of the Shannon form of Fitts's law. It has been shown that the information transmitted via serial keystrokes on a keyboard and the information implied by the ID for such a task are not consistent: the Shannon entropy yields a different information value than Fitts's law. The authors note, though, that the error is negligible and only has to be accounted for in comparisons of devices with known entropy or in measurements of human information processing capabilities.
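The two formulations of ID can be compared directly. The sketch below uses made-up distances and widths. One property that follows from the formulas themselves is that the original ID becomes negative when D < W/2, whereas the Shannon form stays positive for any positive D.

```python
import math


def id_fitts(distance: float, width: float) -> float:
    """Original formulation: log2(2D / W)."""
    return math.log2(2 * distance / width)


def id_shannon(distance: float, width: float) -> float:
    """Shannon formulation: log2(D / W + 1)."""
    return math.log2(distance / width + 1)


# Hypothetical conditions: a far target, D equal to W, and a very close target.
for d, w in [(256, 16), (16, 16), (4, 16)]:
    print(f"D={d:>3} W={w}: original {id_fitts(d, w):+.3f} bits, "
          f"Shannon {id_shannon(d, w):+.3f} bits")
```

The two forms agree when D equals W (both give 1 bit) and differ most for short movements to wide targets.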

Adjustment for accuracy: use of the effective target width
An important improvement to Fitts's law was proposed by Crossman in 1956 (see Welford, 1968, pp. 147–148) and used by Fitts in his 1964 paper with Peterson. With the adjustment, target width (W) is replaced by an effective target width ($W_e$). $W_e$ is computed from the standard deviation in the selection coordinates gathered over a sequence of trials for a particular D-W condition. If the selections are logged as x coordinates along the axis of approach to the target, then

$$W_e = 4.133 \times SD_x$$

This yields

$$\text{ID}_e = \log_2 \Bigg(\frac{D}{W_e}+1\Bigg)$$

and hence

$$\text{IP} = \Bigg(\frac{\text{ID}_e} {\text{MT}}\Bigg)$$

If the selection coordinates are normally distributed, $W_e$ spans 96% of the distribution. If the observed error rate was 4% in the sequence of trials, then $W_e = W$. If the error rate was greater than 4%, $W_e > W$, and if the error rate was less than 4%, $W_e < W$. By using $W_e$, a Fitts's law model more closely reflects what users actually did, rather than what they were asked to do.

The main advantage in computing IP as above is that spatial variability, or accuracy, is included in the measurement. With the adjustment for accuracy, Fitts's law more truly encompasses the speed-accuracy tradeoff. The equations above appear in ISO 9241-9 as the recommended method of computing throughput.
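A minimal sketch of the accuracy-adjusted calculation for one D-W condition follows. It assumes the sample standard deviation (`statistics.stdev`) and the mean movement time over the trials; the function name and the trial data are hypothetical, and a particular study may aggregate trials differently.

```python
import math
import statistics


def effective_throughput(distance, selection_xs, movement_times):
    """Return (W_e, ID_e, IP) for a single D-W condition.

    selection_xs: selection coordinates along the axis of approach.
    movement_times: movement time of each trial, in seconds.
    """
    sd_x = statistics.stdev(selection_xs)      # assumption: sample SD
    w_e = 4.133 * sd_x                         # effective target width
    id_e = math.log2(distance / w_e + 1)       # effective index of difficulty
    mean_mt = statistics.mean(movement_times)  # assumption: mean MT per condition
    return w_e, id_e, id_e / mean_mt


# Made-up trials: selections scattered around the target centre (x = 0).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
mts = [0.5, 0.6, 0.7]
w_e, id_e, ip = effective_throughput(100.0, xs, mts)
print(f"W_e = {w_e:.2f}, ID_e = {id_e:.2f} bits, IP = {ip:.2f} bits/s")
```

A participant who scatters selections more widely gets a larger effective width, a lower effective ID, and therefore a lower throughput at the same movement time.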

Welford's model: innovations driven by predictive power
Not long after the original model was proposed, a 2-factor variation was proposed under the intuition that target distance and width have separate effects on movement time. Welford's model, proposed in 1968, separated the influence of target distance and width into separate terms, and provided improved predictive power:

$$\text{MT} = a + b_1 \log_2 (D) + b_2 \log_2 (W)$$

This model has an additional parameter, so its predictive accuracy cannot be directly compared with one-factor forms of Fitts's law. However, a variation on Welford's model inspired by the Shannon formulation can be written as

$$MT = a + b_1 \log_2 (D+W) + b_2 \log_2 (W) = a + b\log_2 \left(\frac{D+W}{W^k}\right)$$

Here $b_1 = b$ and $b_2 = -bk$. The additional parameter k allows angles to be introduced into the model, so that the user's position can be accounted for; the exponent weights the influence of the angle. This addition was introduced by Kopper et al. in 2010.

The formula reduces to the Shannon form when k = 1. Therefore, this model can be directly compared against the Shannon form of Fitts's law using the F-test of nested models. This comparison reveals that not only does the Shannon form of Welford's model better predict movement times, but it is also more robust when control-display gain (the ratio between, for example, hand movement and cursor movement) is varied. Consequently, although the Shannon form of Welford's model is slightly more complex and less intuitive, it is empirically the best model to use for virtual pointing tasks.
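The relationship between the models can be checked numerically. In the sketch below the parameter values (a = 0.1, b = 0.15, k = 1.5) are arbitrary placeholders; it shows that the two-term and single-term ways of writing the variation agree when b_1 = b and b_2 = -bk, and that k = 1 recovers the Shannon form.

```python
import math


def mt_shannon(d: float, w: float, a: float, b: float) -> float:
    """Shannon form: MT = a + b * log2(D / W + 1)."""
    return a + b * math.log2(d / w + 1)


def mt_welford_shannon(d: float, w: float, a: float, b: float, k: float) -> float:
    """Shannon-inspired Welford variation: MT = a + b * log2((D + W) / W**k)."""
    return a + b * math.log2((d + w) / w ** k)


def mt_two_term(d: float, w: float, a: float, b1: float, b2: float) -> float:
    """Same variation written with separate terms for D + W and W."""
    return a + b1 * math.log2(d + w) + b2 * math.log2(w)


a, b, k = 0.1, 0.15, 1.5           # placeholder parameters
d, w = 120.0, 8.0                  # hypothetical task geometry

single = mt_welford_shannon(d, w, a, b, k)
split = mt_two_term(d, w, a, b1=b, b2=-b * k)
nested = mt_welford_shannon(d, w, a, b, k=1.0)   # equals the Shannon form
```

Note that for k different from 1 the prediction depends on the unit in which W is measured, so the fitted constants are tied to that unit.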

Extensions to two or more dimensions
In its original form, Fitts's law is meant to apply only to one-dimensional tasks. However, the original experiments required subjects to move a stylus (in three dimensions) between two metal plates on a table, termed the reciprocal tapping task. The target width perpendicular to the direction of movement was very wide to avoid it having a significant influence on performance. A major application for Fitts's law is 2D virtual pointing tasks on computer screens, in which targets have bounded sizes in both dimensions.



Fitts's law has been extended to two-dimensional tasks in two different ways. For navigating, for example, hierarchical pull-down menus, the user must generate a trajectory with the pointing device that is constrained by the menu geometry; for this application the Accot–Zhai steering law was derived.

For simply pointing to targets in a two-dimensional space, the model generally holds as-is but requires adjustments to capture target geometry and quantify targeting errors in a logically consistent way. Multiple methods have been used to determine the target size:


 * status quo: W is the horizontal width of the target
 * sum model: W equals height + width
 * area model: W equals height × width
 * smaller-of model: W is the smaller of height and width
 * W-model: W is the effective width in the direction of the movement

While the W-model is sometimes considered the state-of-the-art measurement, the truly correct representation for non-circular targets is substantially more complex, as it requires computing the angle-specific convolution between the trajectory of the pointing device and the target.
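The listed width models can be compared for a rectangular target with a short Python sketch. The function and model names are invented for this example. For the W-model it assumes an axis-aligned rectangle and a straight approach path through the target's centre, and takes W as the length of that path inside the target; this is one simple reading of "effective width in the direction of the movement", not the full angle-specific treatment mentioned above.

```python
import math


def target_width(width: float, height: float, approach_angle_deg: float, model: str) -> float:
    """W for an axis-aligned rectangular target under a named model.

    approach_angle_deg is measured from the horizontal axis.
    """
    if model == "status_quo":
        return width
    if model == "sum":
        return width + height
    if model == "area":
        return width * height
    if model == "smaller_of":
        return min(width, height)
    if model == "w_model":
        theta = math.radians(approach_angle_deg)
        cos_t, sin_t = abs(math.cos(theta)), abs(math.sin(theta))
        # Length of the chord through the centre along the movement direction.
        chords = []
        if cos_t > 1e-12:
            chords.append(width / cos_t)
        if sin_t > 1e-12:
            chords.append(height / sin_t)
        return min(chords)
    raise ValueError(f"unknown model: {model}")


# Hypothetical 40 x 10 button approached horizontally, diagonally and vertically.
for angle in (0, 45, 90):
    print(angle, round(target_width(40, 10, angle, "w_model"), 2))
```

For a wide, flat button the W-model gives the full width on a horizontal approach but only the height on a vertical one, which the status quo model ignores.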

Characterizing performance
Since the a and b parameters should capture movement times over a potentially wide range of task geometries, they can serve as a performance metric for a given interface. In doing so, it is necessary to separate variation between users from variation between interfaces. The a parameter is typically positive and close to zero, and sometimes ignored in characterizing average performance, as in Fitts's original experiment. Multiple methods exist for identifying parameters from experimental data, and the choice of method is the subject of heated debate, since method variation can result in parameter differences that overwhelm underlying performance differences.

An additional issue in characterizing performance is incorporating success rate: an aggressive user can achieve shorter movement times at the cost of experimental trials in which the target is missed. If the latter are not incorporated into the model, then average movement times can be artificially decreased.

Temporal targets
Fitts's law deals only with targets defined in space. However, a target can be defined purely on the time axis, which is called a temporal target. A blinking target or a target moving toward a selection area are examples of temporal targets. Similar to space, the distance to the target (i.e., temporal distance Dt) and the width of the target (i.e., temporal width Wt) can be defined for temporal targets as well. The temporal distance is the amount of time a person must wait for a target to appear. The temporal width is a short duration from the moment the target appears until it disappears. For example, for a blinking target, Dt can be thought of as the period of blinking and Wt as the duration of the blinking. As with targets in space, the larger the Dt or the smaller the Wt, the more difficult it becomes to select the target.

The task of selecting the temporal target is called temporal pointing. The model for temporal pointing was first presented to the human–computer interaction field in 2016. The model predicts the error rate, the human performance in temporal pointing, as a function of temporal index of difficulty (IDt):

$$\text{ID}_{t} = \log_2 \Bigg(\frac{D_{t}}{W_{t}}\Bigg)$$
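The temporal index of difficulty is computed in the same way as its spatial counterpart. The blinking periods and visible durations below are invented values, and the sketch stops at ID_t because the mapping from ID_t to error rate is not given here.

```python
import math


def temporal_id(temporal_distance: float, temporal_width: float) -> float:
    """Temporal index of difficulty: log2(D_t / W_t), both in the same time unit."""
    if temporal_distance <= 0 or temporal_width <= 0:
        raise ValueError("temporal distance and width must be positive")
    return math.log2(temporal_distance / temporal_width)


# Hypothetical blinking target: 800 ms period, visible for 100 ms or 50 ms.
easy = temporal_id(800, 100)   # log2(8)  = 3
hard = temporal_id(800, 50)    # log2(16) = 4
```

Halving the time window during which the target can be selected raises ID_t by one, matching the statement that a smaller W_t makes selection harder.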

Implications for UI design


Multiple design guidelines for GUIs can be derived from the implications of Fitts's law. In its basic form, Fitts's law says that targets a user has to hit should be as big as possible. This is derived from the W parameter. More specifically, the effective size of the button should be as big as possible, meaning that its form has to be optimized for the direction of the user's movement onto the target.

Layouts should also cluster functions that are commonly used together. Reducing the D parameter in this way shortens travel times.
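The effect of the two guidelines above (larger targets, shorter distances) can be estimated with the Shannon form of the law. The constants a and b and the pixel dimensions in this sketch are placeholders; real values depend on the device and user population and would come from a regression on measured data.

```python
import math


def shannon_mt(distance: float, width: float, a: float = 0.1, b: float = 0.15) -> float:
    """Predicted movement time in seconds: a + b * log2(D / W + 1)."""
    return a + b * math.log2(distance / width + 1)


baseline = shannon_mt(400, 20)   # small button, 400 px away
bigger = shannon_mt(400, 40)     # same position, twice as wide
closer = shannon_mt(200, 20)     # same size, half as far

print(f"baseline {baseline:.3f} s, bigger {bigger:.3f} s, closer {closer:.3f} s")
```

Under this model, doubling the width and halving the distance give the same predicted saving, because only the ratio D/W matters.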

Placing layout elements on the four edges of the screen allows for infinitely large targets in one dimension and therefore presents ideal scenarios. Since the pointer will always stop at the edge, the user can move the mouse with the greatest possible speed and still hit the target. The target area is effectively infinitely long along the movement axis. This guideline is therefore called the "rule of the infinite edges". The use of this rule can be seen, for example, in macOS, which always places the menu bar along the top edge of the screen instead of in the current program's window frame.

This effect is even stronger at the four corners of a screen. At these points two edges meet and form a theoretically infinitely big button. Microsoft Windows (prior to Windows 11) places its "Start" button in the lower left corner and Microsoft Office 2007 uses the upper left corner for its "Office" menu. These four spots are sometimes called "magic corners". macOS places the close button on the upper left side of the program window, and the menu bar fills out the magic corner with another button.

A UI that allows for pop-up menus rather than fixed drop-down menus reduces travel time by shrinking the D parameter. The user can continue the interaction right from the current mouse position and does not have to move to a different preset area. Many operating systems use this when displaying right-click context menus. As the menu starts right at the pixel on which the user clicked, this pixel is referred to as the "magic" or "prime pixel".

James Boritz et al. (1991) compared radial menu designs. In a radial menu all items are at the same distance from the prime pixel. The research suggests that in practical implementations the direction in which a user has to move the mouse must also be accounted for. For right-handed users, selecting the left-most menu item was significantly more difficult than selecting the right-most one. No differences were found for transitions from upper to lower functions and vice versa.