-
Same Data, Diverging Perspectives: The Power of Visualizations to Elicit Competing Interpretations
Authors:
Cindy Xiong Bearfield,
Lisanne van Weelden,
Adam Waytz,
Steven Franconeri
Abstract:
People routinely rely on data to make decisions, but the process can be riddled with biases. We show that patterns in data might be noticed first or more strongly, depending on how the data is visually represented or what the viewer finds salient. We also demonstrate that viewer interpretation of data is similar to that of 'ambiguous figures' such that two people looking at the same data can come to different decisions. In our studies, participants read visualizations depicting competitions between two entities, where one has a historical lead (A) but the other has been gaining momentum (B) and predicted a winner, across two chart types and three annotation approaches. They either saw the historical lead as salient and predicted that A would win, or saw the increasing momentum as salient and predicted B to win. These results suggest that decisions can be influenced by both how data are presented and what patterns people find visually salient.
Submitted 17 January, 2024;
originally announced January 2024.
-
What Does the Chart Say? Grouping Cues Guide Viewer Comparisons and Conclusions in Bar Charts
Authors:
Cindy Xiong Bearfield,
Chase Stokes,
Andrew Lovett,
Steven Franconeri
Abstract:
Reading a visualization is like reading a paragraph. Each sentence is a comparison: the mean of these is higher than those; this difference is smaller than that. What determines which comparisons are made first? The viewer's goals and expertise matter, but the way that values are visually grouped together within the chart also impacts those comparisons. Research from psychology suggests that comparisons involve multiple steps. First, the viewer divides the visualization into a set of units. This might include a single bar or a grouped set of bars. Then the viewer selects and compares two of these units, perhaps noting that one pair of bars is longer than another. Viewers might take an additional third step and perform a second-order comparison, perhaps determining that the difference between one pair of bars is greater than the difference between another pair. We create a visual comparison taxonomy that allows us to develop and test a sequence of hypotheses about which comparisons people are more likely to make when reading a visualization. We find that people tend to compare two groups before comparing two individual bars and that second-order comparisons are rare. Visual cues like spatial proximity and color can influence which elements are grouped together and selected for comparison, with spatial proximity being a stronger grouping cue. Interestingly, once the viewer grouped together and compared a set of bars, regardless of whether the group is formed by spatial proximity or color similarity, they no longer consider other possible groupings in their comparisons.
Submitted 3 October, 2023;
originally announced October 2023.
-
The Arrangement of Marks Impacts Afforded Messages: Ordering, Partitioning, Spacing, and Coloring in Bar Charts
Authors:
Racquel Fygenson,
Steven Franconeri,
Enrico Bertini
Abstract:
Data visualizations present a massive number of potential messages to an observer. One might notice that one group's average is larger than another's, or that a difference in values is smaller than a difference between two others, or any of a combinatorial explosion of other possibilities. The message that a viewer tends to notice--the message that a visualization 'affords'--is strongly affected by how values are arranged in a chart, e.g., how the values are colored or positioned. Although understanding the mapping between a chart's arrangement and what viewers tend to notice is critical for creating guidelines and recommendation systems, current empirical work is insufficient to lay out clear rules. We present a set of empirical evaluations of how different messages--including ranking, grouping, and part-to-whole relationships--are afforded by variations in ordering, partitioning, spacing, and coloring of values, within the ubiquitous case study of bar graphs. In doing so, we introduce a quantitative method that is easily scalable, reviewable, and replicable, laying groundwork for further investigation of the effects of arrangement on message affordances across other visualizations and tasks. Pre-registration and all supplemental materials are available at https://osf.io/np3q7 and https://osf.io/bvy95 .
Submitted 25 August, 2023;
originally announced August 2023.
-
Average Estimates in Line Graphs Are Biased Toward Areas of Higher Variability
Authors:
Dominik Moritz,
Lace M. Padilla,
Francis Nguyen,
Steven L. Franconeri
Abstract:
We investigate variability overweighting, a previously undocumented bias in line graphs, where estimates of average value are biased toward areas of higher variability in that line. We found this effect across two preregistered experiments with 140 and 420 participants. These experiments also show that the bias is reduced when using a dot encoding of the same series. We can model the bias with the average of the data series and the average of the points drawn along the line. This bias might arise because higher variability leads to stronger weighting in the average calculation, either due to the longer line segments (even though those segments contain the same number of data values) or line segments with higher variability being otherwise more visually salient. Understanding and predicting this bias is important for visualization design guidelines, recommendation systems, and tool builders, as the bias can adversely affect estimates of averages and trends.
Submitted 7 August, 2023;
originally announced August 2023.
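Illustrative note on the entry above: the abstract contrasts the average of the data series with the average of the points drawn along the line. The sketch below is one possible reading of that contrast, not the authors' model. It assumes the line is a polyline through the data points, that "points drawn along the line" can be approximated by weighting each segment by its drawn length, and that x and y share the same plotting scale. The function names `data_mean` and `arc_length_weighted_mean` are hypothetical.

```python
import math


def data_mean(ys):
    """Plain arithmetic mean of the data values."""
    return sum(ys) / len(ys)


def arc_length_weighted_mean(xs, ys):
    """Mean height of the drawn polyline, weighting each segment by its length.

    Assumption: x and y are already in the same plotting units, so segment
    length is a stand-in for how much "ink" a stretch of the line receives.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) points of equal count")
    total_length = 0.0
    weighted_sum = 0.0
    points = list(zip(xs, ys))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)  # drawn length of this segment
        seg_mid = (y0 + y1) / 2                 # mean height along this segment
        total_length += seg_len
        weighted_sum += seg_len * seg_mid
    return weighted_sum / total_length


# Hypothetical series: flat and low at first, then highly variable and higher.
xs = list(range(8))
ys = [0, 0, 0, 0, 2, 8, 2, 8]

plain = data_mean(ys)                        # 2.5
along_line = arc_length_weighted_mean(xs, ys)  # roughly 3.98
print(plain, along_line)
```

In this made-up series the variable stretch accounts for most of the drawn line length, so the length-weighted average lands closer to that stretch than the plain data mean does. The size of the gap depends on the assumed axis scaling.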
-
Seeing What You Believe or Believing What You See? Belief Biases Correlation Estimation
Authors:
Cindy Xiong,
Chase Stokes,
Yea-Seul Kim,
Steven Franconeri
Abstract:
When an analyst or scientist has a belief about how the world works, their thinking can be biased in favor of that belief. Therefore, one bedrock principle of science is to minimize that bias by testing the predictions of one's belief against objective data. But interpreting visualized data is a complex perceptual and cognitive process. Through two crowdsourced experiments, we demonstrate that supposedly objective assessments of the strength of a correlational relationship can be influenced by how strongly a viewer believes in the existence of that relationship. Participants viewed scatterplots depicting a relationship between meaningful variable pairs (e.g., number of environmental regulations and air quality) and estimated their correlations. They also estimated the correlation of the same scatterplots labeled instead with generic 'X' and 'Y' axes. In a separate section, they also reported how strongly they believed there to be a correlation between the meaningful variable pairs. Participants estimated correlations more accurately when they viewed scatterplots labeled with generic axes compared to scatterplots labeled with meaningful variable pairs. Furthermore, when viewers believed that two variables should have a strong relationship, they overestimated correlations between those variables by an r-value of about 0.1. When they believed that the variables should be unrelated, they underestimated the correlations by an r-value of about 0.1. While data visualizations are typically thought to present objective truths to the viewer, these results suggest that existing personal beliefs can bias even objective statistical values people extract from data.
Submitted 8 August, 2022;
originally announced August 2022.
-
Visual Arrangements of Bar Charts Influence Comparisons in Viewer Takeaways
Authors:
Cindy Xiong,
Vidya Setlur,
Benjamin Bach,
Kylie Lin,
Eunyee Koh,
Steven Franconeri
Abstract:
Well-designed data visualizations can lead to more powerful and intuitive processing by a viewer. To help a viewer intuitively compare values to quickly generate key takeaways, visualization designers can manipulate how data values are arranged in a chart to afford particular comparisons. Using simple bar charts as a case study, we empirically tested the comparison affordances of four common arrangements: vertically juxtaposed, horizontally juxtaposed, overlaid, and stacked. We asked participants to type out what patterns they perceived in a chart, and coded their takeaways into types of comparisons. In a second study, we asked data visualization design experts to predict which arrangement they would use to afford each type of comparison and found both alignments and mismatches with our findings. These results provide concrete guidelines for how both human designers and automatic chart recommendation systems can make visualizations that help viewers extract the 'right' takeaway.
Submitted 13 August, 2021;
originally announced August 2021.
-
Jurassic Mark: Inattentional Blindness for a Datasaurus Reveals that Visualizations are Explored, not Seen
Authors:
Tal Boger,
Steven B. Most,
Steven L. Franconeri
Abstract:
Graphs effectively communicate data because they capitalize on the visual system's ability to rapidly extract patterns. Yet, this pattern extraction does not occur in a single glance. Instead, research on visual attention suggests that the visual system iteratively applies a sequence of filtering operations on an image, extracting patterns from subsets of visual information over time, while selectively inhibiting other information at each of these moments. To demonstrate that this powerful series of filtering operations also occurs during the perception of visualized data, we designed a task where participants made judgments from one class of marks on a scatterplot, presumably incentivizing them to relatively ignore other classes of marks. Participants consistently missed a conspicuous dinosaur in the ignored collection of marks (93% for a 1s presentation, and 61% for 2.5s), but not in a control condition where the incentive to ignore that collection was removed (25% for a 1s presentation, and 11% for 2.5s), revealing that data visualizations are not "seen" in a single glance, and instead require an active process of exploration.
Submitted 31 August, 2021; v1 submitted 9 August, 2021;
originally announced August 2021.
-
Rethinking the Ranks of Visual Channels
Authors:
Caitlyn M. McColeman,
Fumeng Yang,
Steven Franconeri,
Timothy F. Brady
Abstract:
Data can be visually represented using visual channels like position, length or luminance. An existing ranking of these visual channels is based on how accurately participants could report the ratio between two depicted values. There is an assumption that this ranking should hold for different tasks and for different numbers of marks. However, there is little existing work testing this assumption, especially given that visually computing ratios is relatively unimportant in real-world visualizations, compared to seeing, remembering, and comparing trends and motifs, across displays that almost universally depict more than two values.
We asked participants to immediately reproduce a set of values from memory. With a Bayesian multilevel modeling approach, we observed how the relevant rank positions of visual channels shift across different numbers of marks (2, 4 or 8) and for bias, precision, and error measures. The ranking did not hold, even for reproductions of only 2 marks, and the new ranking was highly inconsistent for reproductions of different numbers of marks. Other factors besides channel choice had far more influence on performance, such as the number of values in the series (e.g. more marks led to larger errors), or the value of each mark (e.g. small values are systematically overestimated).
Recall was worse for displays with 8 marks than 4, consistent with established limits on visual memory. These results show that we must move beyond two-value ratio judgments as a baseline for ranking the quality of a visual channel, including testing new tasks (detection of trends or motifs), timescales (immediate computation, or later comparison), and the number of values (from a handful, to thousands).
Submitted 23 July, 2021;
originally announced July 2021.
-
Truth or Square: Aspect Ratio Biases Recall of Position Encodings
Authors:
Cristina R. Ceja,
Caitlyn M. McColeman,
Cindy Xiong,
Steven L. Franconeri
Abstract:
Bar charts are among the most frequently used visualizations, in part because their position encoding leads them to convey data values precisely. Yet reproductions of single bars or groups of bars within a graph can be biased. Curiously, some previous work found that this bias resulted in an overestimation of reproduced data values, while other work found an underestimation. Across three empirical studies, we offer an explanation for these conflicting findings: this discrepancy is a consequence of the differing aspect ratios of the tested bar marks. Viewers are biased to remember a bar mark as being more similar to a prototypical square, leading to an overestimation of bars with a wide aspect ratio, and an underestimation of bars with a tall aspect ratio. Experiments 1 and 2 showed that the aspect ratio of the bar marks indeed influenced the direction of this bias. Experiment 3 confirmed that this pattern of misestimation bias was present for reproductions from memory, suggesting that this bias may arise when comparing values across sequential displays or views. We describe additional visualization designs that might be prone to this bias beyond bar charts (e.g., Mekko charts and treemaps), and speculate that other visual channels might hold similar biases toward prototypical values.
Submitted 14 September, 2020;
originally announced September 2020.
-
How to evaluate data visualizations across different levels of understanding
Authors:
Alyxander Burns,
Cindy Xiong,
Steven Franconeri,
Alberto Cairo,
Narges Mahyar
Abstract:
Understanding a visualization is a multi-level process. A reader must extract and extrapolate from numeric facts, understand how those facts apply to both the context of the data and other potential contexts, and draw or evaluate conclusions from the data. A well-designed visualization should support each of these levels of understanding. We diagnose levels of understanding of visualized data by adapting Bloom's taxonomy, a common framework from the education literature. We describe each level of the framework and provide examples for how it can be applied to evaluate the efficacy of data visualizations along six levels of knowledge acquisition - knowledge, comprehension, application, analysis, synthesis, and evaluation. We present three case studies showing that this framework expands on existing methods to comprehensively measure how a visualization design facilitates a viewer's understanding of visualizations. Although Bloom's original taxonomy suggests a strong hierarchical structure for some domains, we found few examples of dependent relationships between performance at different levels for our three case studies. If this level-independence holds across new tested visualizations, the taxonomy could serve to inspire more targeted evaluations of levels of understanding that are relevant to a communication goal.
Submitted 3 September, 2020;
originally announced September 2020.
-
Why Shouldn't All Charts Be Scatter Plots? Beyond Precision-Driven Visualizations
Authors:
Enrico Bertini,
Michael Correll,
Steven Franconeri
Abstract:
A central concept in information visualization research and practice is the notion of visual variable effectiveness, or the perceptual precision at which values are decoded given visual channels of encoding. Formative work from Cleveland & McGill has shown that position along a common axis is the most effective visual variable for comparing individual values. One natural conclusion is that any chart that is not a dot plot or scatterplot is deficient and should be avoided. In this paper we refute a caricature of this "scatterplots only" argument as a way to call for new perspectives on how information visualization is researched, taught, and evaluated.
Submitted 12 February, 2021; v1 submitted 25 August, 2020;
originally announced August 2020.
-
Illusion of Causality in Visualized Data
Authors:
Cindy Xiong,
Joel Shapiro,
Jessica Hullman,
Steven Franconeri
Abstract:
Students who eat breakfast more frequently tend to have a higher grade point average. From this data, many people might confidently state that a before-school breakfast program would lead to higher grades. This is a reasoning error, because correlation does not necessarily indicate causation -- X and Y can be correlated without one directly causing the other. While this error is pervasive, its prevalence might be amplified or mitigated by the way that the data is presented to a viewer. Across three crowdsourced experiments, we examined whether how simple data relations are presented would mitigate this reasoning error. The first experiment tested examples similar to the breakfast-GPA relation, varying in the plausibility of the causal link. We asked participants to rate their level of agreement that the relation was correlated, which they rated appropriately as high. However, participants also expressed high agreement with a causal interpretation of the data. Levels of support for the causal interpretation were not equally strong across visualization types: causality ratings were highest for text descriptions and bar graphs, but weaker for scatter plots. But is this effect driven by bar graphs aggregating data into two groups or by the visual encoding type? We isolated data aggregation versus visual encoding type and examined their individual effect on perceived causality. Overall, different visualization designs afford different cognitive reasoning affordances across the same data. High levels of data aggregation by graphs tend to be associated with higher perceived causality in data. Participants perceived line and dot visual encodings as more causal than bar encodings. Our results demonstrate how some visualization designs trigger stronger causal links while choosing others can help mitigate unwarranted perceptions of causality.
Submitted 1 August, 2019;
originally announced August 2019.
-
Biased Average Position Estimates in Line and Bar Graphs: Underestimation, Overestimation, and Perceptual Pull
Authors:
Cindy Xiong,
Cristina R. Ceja,
Casimir J. H. Ludwig,
Steven Franconeri
Abstract:
In visual depictions of data, position (i.e., the vertical height of a line or a bar) is believed to be the most precise way to encode information compared to other encodings (e.g., hue). Not only are other encodings less precise than position, but they can also be prone to systematic biases (e.g., color category boundaries can distort perceived differences between hues). By comparison, position's high level of precision may seem to protect it from such biases. In contrast, across three empirical studies, we show that while position may be a precise form of data encoding, it can also produce systematic biases in how values are visually encoded, at least for reports of average position across a short delay. In displays with a single line or a single set of bars, reports of average positions were significantly biased, such that line positions were underestimated and bar positions were overestimated. In displays with multiple data series (i.e., multiple lines and/or sets of bars), this systematic bias still persisted. We also observed an effect of "perceptual pull", where the average position estimate for each series was 'pulled' toward the other. These findings suggest that, although position may still be the most precise form of visual data encoding, it can also be systematically biased.
Submitted 31 July, 2019;
originally announced August 2019.
-
Truncating the Y-Axis: Threat or Menace?
Authors:
Michael Correll,
Enrico Bertini,
Steven Franconeri
Abstract:
Bar charts with y-axes that don't begin at zero can visually exaggerate effect sizes. However, advice for whether or not to truncate the y-axis can be equivocal for other visualization types. In this paper we present examples of visualizations where this y-axis truncation can be beneficial as well as harmful, depending on the communicative and analytic intent. We also present the results of a series of crowd-sourced experiments in which we examine how y-axis truncation impacts subjective effect size across visualization types, and we explore alternative designs that more directly alert viewers to this truncation. We find that the subjective impact of axis truncation is persistent across visualization designs, even for designs with explicit visual cues that indicate truncation has taken place. We suggest that designers consider the scale of the meaningful effect sizes and variation they intend to communicate, regardless of the visual encoding.
Submitted 8 January, 2020; v1 submitted 3 July, 2019;
originally announced July 2019.