User:Erel Segal/Multi-frame agents

This page collects citations, mainly from the economic literature, about agents with multiple preferences.

Examples
1. At time 0, an agent chooses a consumption plan (from available choices) for the future. But, at some later time, the agent may change his tastes and choose a different consumption plan. The consumption plans will coincide only in a very special case in which the time-discount has a constant rate of interest. In other words: the relative importance of consumption between time T and T+1 is the same whether it is evaluated at time T or at time T-1. Experiments show that this is not always the case: when people choose between a small reward at time T and a large reward at time T+1, they tend to choose the small reward if the choice is at time T, and the large reward if the choice is made at time T-1.

2. There are two restaurants: one serves only salad, the other serves both salad and ice-cream. Classic theory says that the second restaurant is always at least as good as the first one, as it gives me more choice. In practice, more choice means that I might be tempted to pick the unhealthy option. Agents cope with this in two ways:


 * Self-control: I spend energy to resist the temptation.
 * Commitment: I go to the first restaurant, to avoid the temptation altogether.

Dynamic inconsistency
The first to try to explain this was Robert H. Strotz. He claimed that each agent contains infinitely many individuals: one for each point in time. This is like a family in which the oldest brother makes the decisions, but each year, the oldest brother dies and a new brother becomes the family head.

Consumer sovereignty has no meaning here, since there are many "consumers" residing in each person. A person's judgement may be biased towards the present time (at which the decision is made), but is usually rational w.r.t. future dates (that is, most people can plan their savings well, as long as the plan starts from next year). Strotz described several ways in which people might cope with this change of tastes:
 * Ignorance. The conflict is not recognized, and I become a "spendthrift" - my behavior is inconsistent with my plans.
 * Precommitment. I pay a price to commit myself to the plan that I think is best from today's perspective, to prevent my future self from disobeying it. Examples are savings accounts, and also buying on credit.
 * Consistent planning. If I cannot precommit, I will choose the best plan from the ones that my future self is most likely to follow (that is: disobedience is considered another constraint, in addition to the budget constraint). This seems closely related to subgame perfect equilibrium.

Pollak criticizes Strotz' algorithm for computing the consistent plan, and shows a mistake in his derivation.

Phelps and Pollak considered a similar problem in the context of multiple generations. Each generation should decide how much to save for future generations, in view of what the future generations are going to do. If the present generation can commit, then it can attain the first-best option; if it cannot commit, it looks for a second-best option, where the behavior of the future generations is a constraint. They distinguish between "altruistic" generations (that care for future generations) and "imperfectly-altruistic" generations, which have a bias for present consumption (present-biased preferences). They introduced the beta-delta utility functions. They study a game-theoretic problem where the generations are the players. They show that the equilibrium of this game is not Pareto optimal - the savings rate is lower than the optimum.
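
The preference reversal described in the first example can be reproduced with a beta-delta discount function, in which a payoff received t periods ahead is weighted by 1 when t = 0 and by β·δ^t when t ≥ 1. The following is only a minimal sketch: the function names, the reward sizes, and the values β = 0.6 and δ = 0.95 are illustrative assumptions, not taken from Phelps and Pollak.

```python
def discount_weight(delay, beta, delta):
    """Beta-delta weight of a payoff received `delay` periods ahead."""
    return 1.0 if delay == 0 else beta * delta ** delay


def prefers_small_reward(now, T, small, large, beta, delta):
    """True if, judged at time `now` <= T, `small` at T beats `large` at T + 1."""
    value_small = discount_weight(T - now, beta, delta) * small
    value_large = discount_weight(T + 1 - now, beta, delta) * large
    return value_small > value_large


T, small, large = 10, 100.0, 120.0  # illustrative numbers (assumption)

# Present-biased agent (beta < 1): the ranking flips between T - 1 and T.
biased_at_T = prefers_small_reward(T, T, small, large, beta=0.6, delta=0.95)
biased_before = prefers_small_reward(T - 1, T, small, large, beta=0.6, delta=0.95)

# Exponential agent (beta = 1): the ranking is the same at both dates.
exp_at_T = prefers_small_reward(T, T, small, large, beta=1.0, delta=0.95)
exp_before = prefers_small_reward(T - 1, T, small, large, beta=1.0, delta=0.95)

print(biased_at_T, biased_before, exp_at_T, exp_before)
```

With these numbers the present-biased agent takes the small reward when deciding at time T but plans for the large one when deciding at time T-1, while the exponential agent ranks the two rewards the same way at both dates.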

Peleg and Yaari treat each "future self" as a distinct player, and find Nash equilibria of the resulting dynamic game.

Kreps presents a different model: the agent chooses opportunity sets, from which he will later choose specific actions. It helps to explain a preference for flexibility, where an agent is unsure about his future tastes.

Akerlof shows how a sequence of small mistakes (regarding the costs and benefits of actions in the present vs. the future) might add up to severe errors. He uses his model to explain procrastination and over-obedience.

Laibson studies the effect of hyperbolic discounting on savings behavior. Since agents with hyperbolic discounting are present-biased, it may be better for them to put their money in an illiquid asset, which they cannot sell too early. Conversely, financial market innovation - which increases the liquidity of assets - might reduce the welfare of such agents.

O'Donoghue and Rabin study the inconsistency in time-discount: they call the above tendency present-bias (a special case of which is hyperbolic discounting). They consider a simple setting with a single activity that must be done once: either now or later. There are two distinctions: whether there are immediate costs or immediate rewards; and whether the agent is naive or sophisticated (that is, foresees his future self-control problems). Previous works assumed either naive decision-making (e.g. Akerlof) or sophisticated decision-making (e.g. Laibson and Fischer) but did not provide behavioral evidence for either assumption. Their model explains procrastination and its opposite, preproperation (doing things too soon, e.g. overeating). Sophisticated people procrastinate less (on unpleasant tasks), but preproperate more (on pleasant activities). Therefore, sophisticated people enjoy a higher welfare when there are immediate costs, but naive people may have a higher welfare when there are immediate rewards. Thus, sophistication can both mitigate and exacerbate self-control problems. The existence of present-biased preferences may serve as a justification for government intervention, e.g. taxing addictive foods, in order to decrease their consumption levels to the optimal ones.
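
The difference between naive and sophisticated agents in the do-it-once setting can be sketched as follows. This is an illustration under stated assumptions rather than the authors' own formulation: preferences are taken to be beta-delta with δ = 1, ties are broken in favour of acting now, the task must be done by the last period, and the function name, cost and reward numbers are hypothetical.

```python
def completion_period(costs, rewards, beta, immediate_costs, sophisticated):
    """Return the 1-based period in which the agent performs the task.

    costs[t], rewards[t]: cost and reward of doing the task in period t.
    immediate_costs: True if the cost is felt at once and the reward later,
                     False if the reward is felt at once and the cost later.
    """
    last = len(costs) - 1

    def utility(t, tau):
        """Utility, as seen from period t, of doing the task in period tau >= t."""
        if tau == t:
            if immediate_costs:
                return beta * rewards[tau] - costs[tau]
            return rewards[tau] - beta * costs[tau]
        return beta * (rewards[tau] - costs[tau])

    if sophisticated:
        # Backward induction: the last self must act; each earlier self compares
        # acting now with the period in which a later self would actually act.
        next_action = last
        for t in range(last - 1, -1, -1):
            if utility(t, t) >= utility(t, next_action):
                next_action = t
        return next_action + 1

    # Naive: each self believes later selves will carry out its favourite plan.
    for t in range(last):
        if utility(t, t) >= max(utility(t, tau) for tau in range(t + 1, last + 1)):
            return t + 1
    return last + 1


# Unpleasant task: rising immediate costs, constant delayed reward.
chore = dict(costs=[3, 5, 8, 13], rewards=[20, 20, 20, 20], beta=0.5, immediate_costs=True)
naive_chore = completion_period(sophisticated=False, **chore)
soph_chore = completion_period(sophisticated=True, **chore)

# Pleasant activity: rising immediate rewards, no cost.
treat = dict(costs=[0, 0, 0, 0], rewards=[3, 5, 8, 13], beta=0.5, immediate_costs=False)
naive_treat = completion_period(sophisticated=False, **treat)
soph_treat = completion_period(sophisticated=True, **treat)

print(naive_chore, soph_chore, naive_treat, soph_treat)
```

In this example the naive agent postpones the chore until the last period while the sophisticated agent does it in period 2; for the pleasant activity, where waiting until period 4 would be best from a long-run view, the naive agent acts in period 3 and the sophisticated agent already in period 1.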

Fischer presents a fully-rational (time-consistent) model of procrastination. Her model rationalizes some procrastination phenomena, but not all of them.

Wang, Huang, Liu and Zhang study a principal-agent problem when the agent has time-inconsistent preferences. They show that the agency problem becomes more difficult: the principal's cost is higher, and the agent's income stream is lower.

Laibson, Repetto, Tobacman define Pareto-optimality with multiple selves (in the context of savings for retirement).

Jackson and Yariv: collective decision when agents have different time-horizons (e.g. parliament members with different projected service horizon; husbands and wives with different life expectancy). Each agent has a consistent preference. Main results:


 * For cardinal aggregation: If individual preferences are aggregated via a collective utility function that is non-dictatorial and respects unanimity, then that collective utility function must be time-inconsistent.
 * For ordinal aggregation: If preferences are aggregated via any voting rule that is locally non-dictatorial, such as majority voting, then the resulting social welfare ordering must exhibit cycles (intransitivities).

Policy implications: non-dictatorial collective choices either necessitate precommitment devices, or involve preference-reversals over time. It is crucial to consider individual preferences rather than just "representative preferences".
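
The cardinal result can be illustrated numerically. In the sketch below, two agents are each exponential (and hence time-consistent) discounters, and the collective utility is assumed to be an equally-weighted sum of their discounted utilities; the discount factors, weights and reward sizes are hypothetical. Each individual's ranking of "1 at time t" versus "1.25 at time t+1" does not depend on t, but the collective ranking does.

```python
def collective_weight(t, deltas, weights):
    """Weight that the weighted-sum collective utility puts on a payoff at time t."""
    return sum(w * d ** t for d, w in zip(deltas, weights))


def prefers_later(t, sooner, later, deltas, weights):
    """True if `later` at time t + 1 is valued above `sooner` at time t."""
    value_sooner = collective_weight(t, deltas, weights) * sooner
    value_later = collective_weight(t + 1, deltas, weights) * later
    return value_later > value_sooner


deltas = (0.5, 0.9)    # impatient and patient agent (assumed values)
weights = (0.5, 0.5)   # equal welfare weights (assumed)
sooner, later = 1.0, 1.25

# Each agent alone gives the same answer at every horizon.
impatient = [prefers_later(t, sooner, later, (0.5,), (1.0,)) for t in (0, 10)]
patient = [prefers_later(t, sooner, later, (0.9,), (1.0,)) for t in (0, 10)]

# The collective answer depends on the horizon.
collective = [prefers_later(t, sooner, later, deltas, weights) for t in (0, 10)]

print(impatient, patient, collective)
```

Here the collective prefers to wait when the two dates are ten periods away, but prefers the sooner reward when the first date is the present. If the same collective utility is applied again once that date arrives, the earlier plan is reversed.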

Temptation
Gul and Pesendorfer model this by attributing, to each agent, two utility functions: one denotes the "real" utility, while the other denotes the cost of self-control (equivalently, the strength of the temptation). Every agent satisfying several axioms can be represented in this way.
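
A common way to write such a two-function representation is U(A) = max over x in A of [u(x) + v(x)] minus max over y in A of v(y), where u is the "real" (commitment) utility and v is the temptation utility. The sketch below applies it to the restaurant example; the numerical utilities and the function names are illustrative assumptions.

```python
def menu_utility(menu, u, v):
    """Utility of facing `menu`: best compromise minus strongest temptation."""
    best_compromise = max(u[x] + v[x] for x in menu)
    strongest_temptation = max(v[x] for x in menu)
    return best_compromise - strongest_temptation


def choice_from_menu(menu, u, v):
    """The option actually picked: it maximizes u + v within the menu."""
    return max(menu, key=lambda x: u[x] + v[x])


u = {"salad": 2.0, "ice cream": 1.0}             # "real" utility (assumed numbers)
mild = {"salad": 0.0, "ice cream": 0.5}          # weak temptation
strong = {"salad": 0.0, "ice cream": 3.0}        # strong temptation

salad_only = ("salad",)
both = ("salad", "ice cream")

# Weak temptation: the agent still picks salad, but pays a self-control cost of 0.5.
mild_choice = choice_from_menu(both, u, mild)
mild_value = menu_utility(both, u, mild)

# Strong temptation: the agent gives in and picks ice cream.
strong_choice = choice_from_menu(both, u, strong)
strong_value = menu_utility(both, u, strong)

commitment_value = menu_utility(salad_only, u, mild)
print(mild_choice, mild_value, strong_choice, strong_value, commitment_value)
```

In both cases the salad-only menu is worth 2.0, which exceeds the value of the larger menu, so the agent strictly prefers to commit.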

Gul and Pesendorfer also extend revealed preference theory to such agents. They model preference-relations over decision problems, rather than over actual decisions. For example, I may prefer a to b, but may prefer the decision problem {a} to {a,b}, since I may be tempted to choose b. This preference for commitment is contrary to the axiom of Kreps, which assumes a preference for flexibility. They do not take as primitives the agent's expectation of his future behavior (e.g. whether he is naive or sophisticated), since these expectations cannot be observed; only the agent's choices can. Their explanation is time-consistent - their agents have the same preferences over decision problems in all periods. Their decision problem is similar to computing a value function in dynamic programming.
 * Two-period model: characterization by axioms; comparative measures of preference for commitment vs. self-control.
 * Infinite-horizon model, with a recursive decision problem: at each time, the agent chooses a consumption for the present, and a (recursive) decision problem for the next time-step.
 * The relation to consumer theory: removing non-binding constraints can change equilibrium allocations and prices, and debt contracts are feasible even if the only punishment for default is the contract termination.

Uncertainty
Bewley - Knightian decision theory. Each agent has multiple probability distributions on states of the world. Each single probability distribution is considered risk; the fact that there are multiple such distributions is considered uncertainty. A lottery is preferred to another iff it is preferred by all distributions. Instead of the completeness assumption, he makes an inertia assumption - a preference for the status-quo. A bet is accepted iff it is preferred to the status quo by all probability distributions.


 * Distinction between Knightian behavior and Gilboa-Schmeidler behavior - a utility function that is a minimum, over all distributions, of the expected value.
 * The inertia assumption prevents exploitation of intransitivity by a money pump.
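
The two decision rules can be compared on a small example. The sketch below assumes two states of the world, two probability distributions, risk-neutral payoffs and a status quo of zero; all numbers and function names are hypothetical.

```python
def expected_value(payoffs, prior):
    """Expected payoff of a bet under one probability distribution."""
    return sum(p * x for p, x in zip(prior, payoffs))


def knightian_prefers(bet, other, priors):
    """Unanimity rule: `bet` beats `other` only if it does so under every prior."""
    return all(expected_value(bet, q) > expected_value(other, q) for q in priors)


def maxmin_value(payoffs, priors):
    """Gilboa-Schmeidler style value: the worst expected payoff over all priors."""
    return min(expected_value(payoffs, q) for q in priors)


priors = [(0.3, 0.7), (0.7, 0.3)]   # the agent's set of distributions (assumed)
status_quo = (0.0, 0.0)
bet = (10.0, -5.0)                  # payoff in state 1, payoff in state 2
reverse_bet = (-10.0, 5.0)          # the other side of the same bet
safe_bet = (0.0, 1.0)

# Inertia: neither side of the bet is accepted over the status quo.
accepts_bet = knightian_prefers(bet, status_quo, priors)
accepts_reverse = knightian_prefers(reverse_bet, status_quo, priors)

# Incompleteness: the unanimity rule cannot rank `bet` against `safe_bet` ...
incomparable = (not knightian_prefers(bet, safe_bet, priors)
                and not knightian_prefers(safe_bet, bet, priors))
# ... while the maxmin rule always produces a ranking.
maxmin_ranks_safe_higher = maxmin_value(safe_bet, priors) > maxmin_value(bet, priors)

print(accepts_bet, accepts_reverse, incomparable, maxmin_ranks_safe_higher)
```

The unanimity rule leaves some pairs unranked and falls back on the status quo, whereas the minimum-expected-value rule is a single utility function and therefore ranks every pair.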

Incomplete preferences
Fon and Otani - classical welfare theorems for incomplete preferences (and non-transitive preferences). Properties of equilibrium in an abstract economy.

Choice with frames
In standard choice theory, an agent's choice depends only on the set of available options. There is a lot of evidence that real people base their choices on other conditions, which are irrelevant for the rational choice, for example: the order in which the options are presented, the default value, the phrasing of each option, the number of options in the list, etc. One way to model this is to define an extended choice function, where the agent's choice can depend both on the available set of options and on other conditions. These other conditions are called frames or ancillary conditions. This model covers several behavioral phenomena, such as status-quo bias, satisficing, limited attention, effects of advertising, and decisions under deadline.

It is possible to define, for each agent, a binary relation over choices: a choice x is unambiguously preferred to choice y (denoted x P* y), if the agent chooses x over y for all frames (that is, y is never chosen when x is available). This is an incomplete relation.
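
The relation P* can be computed directly from framed choice data. The sketch below assumes that the data are given as a mapping from (available set, frame) pairs to the set of chosen options, and that a pair is ranked only if the two options are jointly available at least once; the data and names are hypothetical.

```python
from itertools import permutations


def unambiguous_preference(choices):
    """Return the set of pairs (x, y) such that x P* y.

    choices: dict mapping (frozenset of available options, frame) to the
             set of options chosen in that situation.
    """
    options = set().union(*(menu for menu, _ in choices))
    relation = set()
    for x, y in permutations(sorted(options), 2):
        # All situations in which both x and y are available.
        joint = [chosen for (menu, _), chosen in choices.items()
                 if x in menu and y in menu]
        if joint and all(y not in chosen for chosen in joint):
            relation.add((x, y))
    return relation


# Two frames; the frame decides between a and b, but c never wins against either.
choices = {
    (frozenset("ab"), "frame 1"): {"a"},
    (frozenset("ab"), "frame 2"): {"b"},
    (frozenset("ac"), "frame 1"): {"a"},
    (frozenset("ac"), "frame 2"): {"a"},
    (frozenset("bc"), "frame 1"): {"b"},
    (frozenset("bc"), "frame 2"): {"b"},
    (frozenset("abc"), "frame 1"): {"a"},
    (frozenset("abc"), "frame 2"): {"b"},
}

p_star = unambiguous_preference(choices)
print(sorted(p_star))
```

In this example both a and b are unambiguously preferred to c, but a and b are not ranked against each other, which shows why the relation is incomplete.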


 * Rubinstein and Salant relate this model to the standard model of choice. They identify conditions under which the unanimous-preference relation is transitive, or transitive+complete. They consider in particular Salient Consideration choice functions: for every frame $$f$$, there is a corresponding ordering $$\succ_f$$ such that, when the frame is $$f$$, the agent picks the maximal available element according to $$\succ_f$$. With such choice functions, the agent's revealed choice-set contains all the choices which are maximal according to at least one such ordering. On the other hand, they show examples in which this relation misses important information, and fails to explain the real motivations of the agents.


 * Bernheim and Rangel focus on welfare within this framework. They allow different separation-lines between "choice objects" and "ancillary conditions". They define several binary relations between choices, that mean that some choice x "improves upon" some choice y:
 * x R' y means that x is weakly-unambiguously-chosen over y. That is, whenever both x and y are available, if y is chosen for some frame, then x is chosen for the same frame. This extends the "weak revealed preference" relation.
 * x P' y means that whenever both x and y are available, sometimes x is chosen and y is not, and otherwise, either both are chosen or both are not.
 * x I' y means that, whenever both x and y are available, x is chosen for some frame if and only if y is chosen for the same frame.
 * x P* y means that x is strictly-unambiguously-chosen over y. That is, whenever x and y are available for some frame, y is never chosen for that frame. This extends the "strict revealed preference" relation.
 * x R* y means that there is some situation in which x and y are available, and x is chosen.
 * x I* y means that there is some situation in which x and y are available and x is chosen, and another such situation in which y is chosen.
 * If the choice from each set is a singleton, then P' and P* coincide.
 * P* implies P' implies R' implies R*. When choices do not depend on ancillary conditions, both P' and P* reduce to the standard "strict revealed preference" relation, and both R' and R* reduce to the standard "weak revealed preference" relation.
 * R* is always complete, but the other relations may be incomplete.
 * None of these relations is always transitive.
 * Suppose we have choice information on all nonempty finite subsets of options. In a sequence of comparisons involving R', if at least one comparison involves P*, then the sequence is acyclic.
 * In particular, P* itself is acyclic, so it can be used for welfare analysis. However, P' (and the weaker relations) may be cyclic.
 * A weak-individual-welfare-optimum is any x such that y P* x does not hold for any y. Any x that is chosen in some choice-set is a WIWO. Since P* is acyclic, a weak-individual-welfare-optimum always exists.
 * A strict-individual-welfare-optimum is any x such that y P' x does not hold for any y. Any x that is the unique choice in some choice-set is a SIWO.
 * P* is the finest relation that is inclusive libertarian, that is, does not overrule any revealed choice (the set of maximal elements contains all elements that are chosen in some situation).
 * The multiple-self model, common in dynamic inconsistency literature, is a special case in which the set of choice-problems is rectangular (the Cartesian product of the set of frames and the set of choice-subsets), and for each frame, the choices correspond to the maximal elements of a standard preference-ranking (or maximize a standard utility function).
 * The relation y M x means that y weakly-multiself-Pareto-dominates x, that is, u(y)>=u(x) for all frames' utility functions, with at least one strict inequality. Rectangularity implies that M = P'.
 * The relation y M* x means that y strictly-multiself-Pareto-dominates x, that is, u(y)>u(x) for all frames' utility functions.  Rectangularity implies that M* = P*. That is, x is a weak/strict multiself Pareto-optimum iff it is a weak/strict individual-welfare-optimum.
 * They extend to their model concepts such as compensating variation and consumer surplus.
 * For settings with more than one individual, they generalize Pareto-optimality:
 * x is a weak-generalized-Pareto-optimum (WGPO) if there is no other y for which yPi*x for all agents i. This coincides with previous notions of Pareto-optimality with incomplete preferences. A WGPO trivially exists. The contract curve may be thick (full-dimensional).
 * x is a strict-generalized-Pareto-optimum (SGPO) if there is no other y for which yRi'x for all agents i, and yPi*x for at least one agent i. SGPO might not exist in general.
 * A behavioral competitive equilibrium (BCE) is a price-vector p and an allocation x such that, for each agent i, there exists some frame for which xi is the choice of agent i from his budget set.
 * The allocation of every BCE is a WGPO; this extends the first welfare theorem. It is a corollary of the theorem of Fon and Otani.
 * The relations R' and P* are not very discerning; to refine them, one can exclude some choice-sets from the family of "relevant choice sets". For example, one can claim that the choice of an individual when he is under the influence of drugs is not welfare-relevant. However, devising formal conditions for exactly which problems can be removed is conceptually difficult. They suggest using neurological evidence for this.
 * Green and Hojman: a method for evaluating welfare based on choice. Instead of assuming that the agent maximizes a single preference relation, they assume that the agent makes a compromise between multiple conflicting preference relations, representing conflicting motivations. The focus is on deriving everything from the choice data, with no a priori assumptions on the structure of multiple selves.
 * An agent is identified with a probability distribution over preference-relations, describing both the possible motivations and their strengths. There is also an aggregation procedure - a voting rule among preferences. All choice-patterns can be explained based on score-voting rules.
 * If the choice data is coherent w.r.t. a specific pair (x, y), then there is an explanation that puts all weight on motivations that prefer x to y. Therefore, adding x to the available choice-set improves welfare.
 * When some cardinal information is available, it is possible to calculate lower and upper bounds for change in utilitarian welfare, using linear programming. It is possible to detect surely-beneficial and surely-harmful explanations.
 * Tversky and Kahneman: each frame corresponds to a different endowment.
 * Apesteguia and Ballester define the swap index - a measure of both rationality and welfare. For each preference-relation P, and for each choice that does not maximize P, we count the number of swaps required in P in order to make it consistent with the choice. We do the same over all observed choices, and take a weighted average by their relative frequency in the data. Then, we define PS to be the preference-relation with a minimum swap-index. The relation PS is a complete ranking, and hence it is not contained in the P* of Bernheim and Rangel. Moreover, while P* compares x and y based only on choice-problems in which both x and y are available, PS uses all available data, and thus may lead to opposite rankings.
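
The swap index from the last item can be computed by brute force on small examples. The sketch below reads "the number of swaps" for one observation as the number of available options that the ranking places above the chosen one; this reading, the function names and the observed frequencies are all assumptions made for illustration.

```python
from itertools import permutations


def swaps(ranking, menu, chosen):
    """Number of options in `menu` that `ranking` places above `chosen`."""
    position = {x: i for i, x in enumerate(ranking)}  # 0 = most preferred
    return sum(1 for x in menu if position[x] < position[chosen])


def swap_index(ranking, observations):
    """Frequency-weighted number of swaps over all observations.

    observations: list of (menu, chosen option, relative frequency).
    """
    return sum(freq * swaps(ranking, menu, chosen)
               for menu, chosen, freq in observations)


def swap_preference(options, observations):
    """The complete ranking with the smallest swap index (brute-force search)."""
    return min(permutations(sorted(options)),
               key=lambda ranking: swap_index(ranking, observations))


# Hypothetical data: a is usually chosen over b, b over c, but c over a.
observations = [
    ({"a", "b"}, "a", 0.4),
    ({"a", "b"}, "b", 0.1),
    ({"b", "c"}, "b", 0.3),
    ({"a", "c"}, "c", 0.2),
]

best_ranking = swap_preference("abc", observations)
best_index = swap_index(best_ranking, observations)
print(best_ranking, best_index)
```

The search enumerates every ranking, so it is practical only for a handful of options. In this example the ranking a, b, c has the smallest index, even though the data contain a choice of c over a.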

Efficiency and distributive justice
Kahneman, Wakker, Sarin (1997): Bentham defined utility as a measure of pleasure and pain - "experienced-utility". Modern economists claim that this cannot be measured, and they define utility as the thing that is maximized when the agent chooses - "decision-utility". KWS claim that experienced utility can be measured, and it may be different than decision utility (for irrational agents), and thus must be taken seriously.

Mandler defines the behavioral welfare relation similar to one of the relations of Bernheim and Rangel. He shows that, when there are multiple frames, the set of Pareto-optima can be very large: the set of PO has the same dimension as the set of allocations. A small diversity can make every allocation Pareto-optimal. Almost every PO is surrounded by a full-dimensional set of other PO. He concludes that the Pareto criterion is not useful for policy decisions.

Mandler suggests using utilitarian-optimality to close this indecisiveness gap. He proves that, under a separability assumption between the goods, there is a unique utilitarian optimum. Even without separability, the set of utilitarian optima has measure 0, and its dimension is at most the number of goods minus 1. However, utilitarian optima may fail to be PO.

Fleurbaey and Schokkaert present fairness criteria based on the concept of equivalent income, which works as follows for complete preferences.


 * Pick a monotone path B in the space of all bundles (in a monotone path, each bundle contains strictly more or strictly less resources than any other bundle).
 * For each agent i and bundle x, find the indifference-curve of i through x, and find its intersection with B.
 * The agent's utility is defined as the distance of this intersection point from the origin.

The selection of the monotone path is a normative choice. For example, when the relevant factors in life are consumption and health, it can be argued that the monotone path should be the path of perfect health. Then, the "utility" of an agent in situation (c,h) is the income c0 such that the agent is indifferent between (c,h) and (c0,h0), where h0 denotes perfect health.

When agents have multiple preferences, the equivalent income may be different for different frames. We can compute, for each agent, the minimum and maximum utility in each situation (min/max taken over all frames). Then, situation x is preferred over y if the minimum utility in x is higher than the maximum utility in y. This relation is incomplete. By adding some safety principles, it is possible to refine it so that x is preferred over y if the minimum utility in x is higher than the minimum utility in y and the maximum utility in x is higher than the maximum utility in y. Adding the Pigou-Dalton principle and some other principles, we can compare allocations by their smallest min-utilities.
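
The equivalent-income comparison across frames can be sketched for the consumption-and-health example. The functional form below is purely an assumption made for illustration: in each frame the agent's preferences over (c, h) are represented by log(c) + a·log(h), where the weight a on health differs between frames, health is measured on a scale where perfect health is 1, and the equivalent income is therefore c·(h/h0)^a.

```python
def equivalent_income(c, h, a, h_ref=1.0):
    """Income c0 such that (c0, h_ref) is indifferent to (c, h),
    assuming utility log(c) + a * log(h) (hypothetical functional form)."""
    return c * (h / h_ref) ** a


def income_interval(situation, frame_weights):
    """Minimum and maximum equivalent income over all frames."""
    c, h = situation
    values = [equivalent_income(c, h, a) for a in frame_weights]
    return min(values), max(values)


def interval_preferred(x, y, frame_weights):
    """x is preferred to y if the minimum in x exceeds the maximum in y."""
    return income_interval(x, frame_weights)[0] > income_interval(y, frame_weights)[1]


def refined_preferred(x, y, frame_weights):
    """x is preferred to y if both its minimum and its maximum are higher."""
    lo_x, hi_x = income_interval(x, frame_weights)
    lo_y, hi_y = income_interval(y, frame_weights)
    return lo_x > lo_y and hi_x > hi_y


frame_weights = (0.5, 2.0)   # weight on health in each frame (assumed)
y = (60.0, 1.0)              # (consumption, health), hypothetical situations
z = (80.0, 0.95)
w = (90.0, 0.9)

z_over_y = interval_preferred(z, y, frame_weights)
w_over_z = interval_preferred(w, z, frame_weights)
z_over_w = interval_preferred(z, w, frame_weights)
w_over_z_refined = refined_preferred(w, z, frame_weights)

print(z_over_y, w_over_z, z_over_w, w_over_z_refined)
```

With these numbers z is preferred to y under the first relation, w and z cannot be ranked by it because their intervals overlap, and the refined relation ranks w above z.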