# The Fragile Nature of Contextual Preference Reversals: Reply to Tsetsos, Chater, and Usher


RUNNING HEAD: The Fragile Nature of Contextual Preference Reversals

The Fragile Nature of Contextual Preference Reversals: Reply to Tsetsos, Chater, and Usher

Jennifer S. Trueblood

Vanderbilt University and University of California, Irvine

Scott D. Brown

University of Newcastle, Australia

Andrew Heathcote

University of Tasmania and University of Newcastle, Australia

Jennifer Trueblood

Department of Psychology

Vanderbilt University

PMB 407817, 2301 Vanderbilt Place, Nashville, TN 37240-7817

phone: 949-824-1761

email: jennifer.s.trueblood@vanderbilt.edu

## Abstract

Trueblood, Brown, and Heathcote (2014) developed a new model, called the Multiattribute Linear Ballistic Accumulator (MLBA), to explain contextual preference reversals in multi-alternative choice. MLBA was shown to provide good accounts of human behavior through both qualitative analyses and quantitative fitting of choice data.

Tsetsos, Chater, and Usher (in press) investigated the ability of MLBA to simultaneously capture three prominent context effects (attraction, compromise, and similarity). They concluded that MLBA must set a “fine balance” of competing forces to account for all three effects simultaneously, and that its predictions are sensitive to the position of the stimuli in the attribute space. Through a new experiment, we show that the three effects are very fragile, and that only a small subset of people shows all three simultaneously.

Thus, the predictions that Tsetsos et al. generated from the MLBA model turn out to closely match real data in a new experiment. Support for these predictions provides strong evidence for the MLBA. A corollary is that a model that can “robustly” capture all three effects simultaneously is not necessarily a good model. Rather, a good model captures patterns found in human data, but cannot accommodate patterns that are not found.

Keywords: Decision-making; multi-alternative choice; preference reversal; context effects; dynamic models

## THE FRAGILE NATURE OF CONTEXTUAL PREFERENCE REVERSALS: REPLY TO TSETSOS, CHATER, AND USHER

Every day we make hundreds of choices. Some are seemingly trivial – what cereal should I eat for breakfast? Others have long-lasting implications – what stock should I invest in? Despite their obvious differences, these two decisions have one important thing in common: both are potentially sensitive to context. That is, our preferences for existing alternatives can be altered by the introduction of new alternatives. Context effects – preference changes depending on the availability of other options – have attracted a great deal of attention because they violate the property of simple scalability (Krantz, 1964; Tversky, 1972), a central property of most utility models. Trueblood, Brown, and Heathcote (2014) introduced a new model, the Multiattribute Linear Ballistic Accumulator (MLBA), to explain three prominent context effects (attraction, similarity, and compromise). We showed that the model provides good quantitative fits to individual-level data and makes new predictions about the influence of time pressure on the effects, which we confirmed experimentally. Further, MLBA is analytically tractable, unlike many previous models of contextual preference reversals (Usher & McClelland, 2004; Roe, Busemeyer, & Townsend, 2001).

Tsetsos, Chater, and Usher (in press), hereafter TCU, question the ability of the MLBA to simultaneously capture the three context effects, stating that it must set a “fine balance” of competing forces to account for all three effects and that its predictions are sensitive to the position of the stimuli in the attribute space. In past research, models of context effects have been evaluated by their ability to simultaneously capture the three major context effects. TCU’s arguments naturally lead to the question of how “robust” context effects are within participants. However, almost all past experiments on context effects have been between subjects (i.e., separate experiments for the three effects with different groups of subjects). There are very few studies examining the co-occurrence of the three effects within individuals. In Trueblood et al. (2014), we examined the co-occurrence of the three effects in a “combined” inference experiment (i.e., deciding which of three criminal suspects most likely committed a crime). In this experiment, we found that only a small minority (11%) of participants showed all three effects. In another study involving choices among consumer products, Berkowitsch, Scheibehenne, and Rieskamp (2014) found that only 19% of participants showed all three effects. They also examined correlations between the effects, finding that the attraction and compromise effects were positively correlated (r = .49), while the attraction and similarity effects (r = -.53) and the compromise and similarity effects (r = -.58) were negatively correlated.

The results of Trueblood et al. (2014) and Berkowitsch et al. (2014) suggest that the effects are very fragile and only a small subset of people show all three simultaneously. Thus, developing models that can “robustly” produce all three context effects with a single set of parameters is perhaps a misguided exercise, possibly leading to models that fail to match empirical reality. Rather, the focus should be on developing models that can accurately capture patterns and correlations found in human data. To further investigate the fragile nature of context effects, we examined the co-occurrence of the three effects in a perceptual decision-making task (Trueblood et al., 2014; Trueblood, Brown, Heathcote, & Busemeyer, 2013). We then compared the results of the experiment to a priori predictions from MLBA using artificial stimuli and very general assumptions

Trueblood et al. (2013) examined the three context effects using a simple perceptual decision-making task – deciding which of three rectangles had the largest area. The three effects were examined in three separate experiments with different participants. In the current experiment, all three effects were tested within participants during a single session.

## Method

Seventy-five undergraduate students at the University of California, Irvine participated for course credit. As in Trueblood et al. (2013), participants were told they would see three rectangles on each trial and were asked to select the one with the largest area. The height and width of the rectangles functioned as the two attribute values, analogous to attributes of price and quality in an experiment about consumer products. All rectangles were solid black in color and appeared on a white background. The rectangles were numbered from left to right, and the location of the different rectangles (i.e., target, competitor, and decoy) was randomized across trials. The vertical placement of the rectangles varied so that they were not all positioned on the same horizontal axis. Further details about the stimuli and experimental design can be found in Trueblood et al. (2013).

The experiment consisted of a total of 720 randomized trials, with 160 testing the attraction effect, 160 testing the similarity effect, and 160 testing the compromise effect. The remaining 240 trials were catch trials containing one rectangle that was clearly larger than the other two. These catch trials were used to gauge accuracy and engagement.

## Results

Twenty participants answered more than one third of the catch trials incorrectly and were removed from the analyses. These participants were most likely not engaged in the task. For the remaining 55 participants, we removed trials with very short response times (less than 100 ms) and trials with very long response times (more than 8 s). On average, this procedure removed about 1% of trials for each participant (about five out of 480 trials for the three effects).

For each participant, we calculated the relative choice share for the target (RST; Berkowitsch et al., 2014; Trueblood, 2015), defined as the number of times the target was selected divided by the number of times the target plus the competitor were selected. For this analysis, we collapsed across two different types of choice sets differing in the orientation of the target option – sets where the target was oriented vertically and sets where the target was oriented horizontally. If the RST value is greater than 0.5, this indicates that the target is selected more often than the competitor. Values equal to 0.5 suggest that the target and competitor were preferred equally.
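As a minimal sketch of the RST definition above, the function below divides the target count by the sum of the target and competitor counts, so decoy choices do not enter the measure. The function name and the counts are hypothetical and are not taken from the experiment.

```python
def relative_choice_share(target_count, competitor_count):
    """Return RST = target / (target + competitor).

    Choices of the decoy are ignored, as in the definition in the text.
    """
    total = target_count + competitor_count
    if total == 0:
        raise ValueError("RST is undefined when neither target nor competitor was chosen")
    return target_count / total


# Hypothetical counts for one participant and one effect (illustration only).
rst = relative_choice_share(target_count=90, competitor_count=60)
print(rst)  # 0.6, i.e., the target was chosen more often than the competitor
```

A value of exactly 0.5 would correspond to equal counts for the target and the competitor.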

Using the RST values, we examined how frequently multiple effects occurred within a single participant. Out of 55 participants, only 13 had RST values greater than 0.5 for all three effects. Of the remaining participants, 12 had RST values greater than 0.5 for the similarity and attraction effects, 4 had RST values greater than 0.5 for the similarity and compromise effects, 12 had RST values greater than 0.5 for the attraction and compromise effects, 13 had RST values greater than 0.5 for only one of the three effects, and one individual had RST values less than 0.5 for all three effects. Table 1 lists
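The co-occurrence tally described in this paragraph can be sketched as follows. The participant values are hypothetical placeholders, and the helper names are ours; only the list of reported counts comes from the text, and the final check simply confirms that those counts account for all 55 participants.

```python
from collections import Counter

EFFECTS = ("attraction", "similarity", "compromise")


def effects_shown(rst_by_effect, threshold=0.5):
    """Return the set of effects whose RST exceeds the threshold."""
    return frozenset(e for e in EFFECTS if rst_by_effect[e] > threshold)


def tally_patterns(participants):
    """Count how many participants show each combination of effects."""
    return Counter(effects_shown(p) for p in participants)


# Hypothetical RST values for three participants (not data from the experiment).
participants = [
    {"attraction": 0.62, "similarity": 0.55, "compromise": 0.58},  # all three
    {"attraction": 0.60, "similarity": 0.57, "compromise": 0.44},  # two effects
    {"attraction": 0.41, "similarity": 0.48, "compromise": 0.39},  # none
]
patterns = tally_patterns(participants)

# Counts reported in the text: all three, similarity + attraction,
# similarity + compromise, attraction + compromise, exactly one, none.
reported_counts = [13, 12, 4, 12, 13, 1]
assert sum(reported_counts) == 55
```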

We also used a hierarchical Bayesian model (Trueblood, 2015) to test whether the RST values were on average greater than 0.5. For each effect, we assumed that the number of times the target is selected follows a binomial distribution where θ represents the probability of the target being selected and n is the total number of times the target plus the competitor are selected. We assumed that each individual has a different θ parameter for each of the three effects. We also assumed that these person-specific parameters are drawn from population-level beta distributions with hyperparameters defining the mean µj and concentration κj. A graphical model and the priors for µj and κj are shown in Figure 1 (Lee & Wagenmakers, 2014). The priors for these two parameters were determined from previous work (Trueblood, 2015; Lee & Wagenmakers, 2014) and set to be relatively vague. The prior for µj also slightly favors the null hypothesis that the RST values are equal to 0.5. Note that re-running the analysis with a uniform prior (which favors all values equally) did not change our results. Using JAGS, three MCMC chains were used to estimate the posterior distributions. All chains converged as
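Written out, the structure of the hierarchical model described in this paragraph is roughly as follows. This is a sketch under assumptions: the indices $i$ (participant) and $j$ (effect) and the symbol $k_{ij}$ for the number of target choices are our notation, and the mean-and-concentration form of the beta distribution shown here is the common parameterization, which the text does not spell out. The priors on $\mu_j$ and $\kappa_j$ are those given in Figure 1 and are not reproduced.

```latex
\begin{align}
k_{ij} &\sim \mathrm{Binomial}\bigl(\theta_{ij},\, n_{ij}\bigr), \\
\theta_{ij} &\sim \mathrm{Beta}\bigl(\mu_j \kappa_j,\; (1 - \mu_j)\,\kappa_j\bigr).
\end{align}
```

Under this parameterization, $\mu_j$ is the group-level mean RST for effect $j$, so the hypothesis of interest is whether $\mu_j$ exceeds 0.5.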

Table 2 lists the means of the posteriors for the hyperparameters µj, the 95% highest posterior density intervals (HDIs) for µj, and the results of frequentist tests. The HDIs represent the most credible posterior RST values at the group level (Kruschke, 2011). If this range lies above 0.5, then one can infer that the target was on average

suggests that µj was greater than 0.5 for both the attraction and similarity effects. The RST value for the compromise effect tended to be larger than 0.5 as well, but this result was not as strong as the attraction and similarity results. This finding is similar to Trueblood et al. (2013), which also found that the compromise effect was the weakest of the three effects.
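For readers unfamiliar with HDIs, the sketch below shows one common way to estimate a 95% HDI from MCMC samples: sort the samples and take the narrowest window that contains 95% of them. The function name is ours, and the samples are simulated from an arbitrary beta distribution purely for illustration; they are not the posterior samples from this analysis.

```python
import math
import random


def hdi(samples, mass=0.95):
    """Return the narrowest interval containing `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sample")
    # Number of samples that must fall inside the interval.
    window = max(1, math.ceil(mass * n))
    # Slide the window over the sorted samples and keep the narrowest one.
    best_start = min(
        range(n - window + 1),
        key=lambda i: ordered[i + window - 1] - ordered[i],
    )
    return ordered[best_start], ordered[best_start + window - 1]


# Simulated stand-in for posterior samples of a group-level mean (illustration only).
rng = random.Random(0)
mu_samples = [rng.betavariate(70, 30) for _ in range(4000)]
low, high = hdi(mu_samples)
print(low > 0.5)  # whether the whole interval lies above 0.5
```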

We examined the correlation between the three effects using the individual parameters from the hierarchical Bayesian model. Correlations were calculated using the programs provided in Lee and Wagenmakers (2014). The posterior mean for the correlation between the similarity and attraction effects was -0.285 (95% HDI -0.527 to -0.044), the similarity and compromise effects was -0.271 (HDI -0.508 to -0.025), and the attraction and compromise effects was 0.666 (HDI 0.513 to 0.808). The data used in all of the analyses presented here are available in the Supplementary Information.
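The correlations reported above are posterior estimates from a Bayesian analysis. As a simpler point of comparison, and not the method used here, a plain sample Pearson correlation between two sets of individual-level estimates could be computed as in the sketch below. The function name and the input values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return sum(a * b for a, b in zip(dx, dy)) / denom


# Hypothetical individual-level estimates for two effects (illustration only).
attraction = [0.55, 0.61, 0.48, 0.70, 0.52]
compromise = [0.53, 0.58, 0.47, 0.66, 0.50]
r = pearson_r(attraction, compromise)
```

Unlike the Bayesian estimate, this point estimate carries no interval; it only summarizes the direction and strength of the linear relation.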

## Conclusions

Although there was evidence for the three context effects on average, very few individuals (13 out of 55) demonstrated all three effects simultaneously. These results confirm previous findings suggesting there are large individual differences in the manifestation of context effects (Trueblood et al., 2014; Berkowitsch et al., 2014).

Further, our analyses revealed several interesting correlations between the effects. The attraction and compromise effects were both negatively correlated with the similarity

To examine how well MLBA can account for the proportions of RST values and the correlations found in the experiment, we conducted a prior predictive exercise in which we calculated the RST values predicted by the MLBA for randomly sampled MLBA parameters. For this exercise, we examined three artificial choice sets from Trueblood et al. (2014) that were used for the qualitative analyses in that paper. These choice sets are listed in Table 3. As in the original paper, we fixed the start point parameter to $A = 1$, the threshold parameter to $\chi = 2$, the drift rate standard deviation to $s = 1$, and the baseline input parameter to $I_0 = 5$. The curvature parameter $m$ determines the mapping from experimentally defined options (such as the height and width of rectangles in pixels) to subjective values (see Trueblood et al., 2014, Figure 3). When $m > 1$, intermediate options are preferred to extremes, and the opposite is true when $0 < m < 1$. When $m = 1$, subjective and objective values are equal. Because $m$ must be positive, we assume $m$ is log-normally distributed, i.e., $m \sim e^{N(0,1)}$. The attention weights in MLBA reflect the amount of attention given to pairwise comparisons of the options. The calculation of these weights involves free parameters $\lambda_1$ and $\lambda_2$. Because these parameters must also be positive, we assume that they are log-normally distributed as well, i.e., $\lambda_1, \lambda_2 \sim e^{N(0,1)}$.
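The parameter-sampling step of this prior predictive exercise can be sketched as follows. The sketch covers only the sampling of parameters under the distributions stated above; it does not implement the MLBA itself, so no RST predictions are computed. The dictionary keys, the function name, the seed, and the number of draws are our own choices for illustration.

```python
import random

# Parameters held fixed in the exercise, as stated in the text.
FIXED_PARAMETERS = {"A": 1.0, "chi": 2.0, "s": 1.0, "I0": 5.0}


def sample_mlba_parameters(rng):
    """Draw one parameter set: m, lambda1, lambda2 ~ exp(N(0, 1))."""
    params = dict(FIXED_PARAMETERS)
    # lognormvariate(0, 1) returns exp(z) with z ~ N(0, 1), so draws are positive.
    params["m"] = rng.lognormvariate(0.0, 1.0)
    params["lambda1"] = rng.lognormvariate(0.0, 1.0)
    params["lambda2"] = rng.lognormvariate(0.0, 1.0)
    return params


rng = random.Random(2015)  # arbitrary seed for reproducibility
draws = [sample_mlba_parameters(rng) for _ in range(10_000)]

# The median of exp(N(0, 1)) is 1, so roughly half of the draws have m > 1.
share_m_above_one = sum(d["m"] > 1 for d in draws) / len(draws)
print(round(share_m_above_one, 2))
```

Because the prior on $m$ is centered on 1 in this sense, the exercise gives roughly equal prior weight to parameter settings favoring intermediate options and settings favoring extreme options.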