Scatterplot of the pressure and temperature variables from the Hurricane Isabel dataset. The standard design (left) loses fidelity and requires manual adjustment, whereas the optimized design (right) adapts to the data automatically. Credit: KTH Royal Institute of Technology and Aalto University

Scatterplots are widely used across many disciplines, including areas beyond the sciences, to visually communicate the relationship between two data variables. Yet very few users realize how much the visual design of a scatterplot can affect human perception and understanding. Moreover, the default designs of scatterplots often represent the data poorly, and fine-tuning the design by hand is difficult.
Researchers have recently found an algorithmic approach to automatically improve the design of scatterplots by exploiting models and measures of human perception.
"A scatterplot is designed successfully when humans can effectively decode the message that was originally encoded graphically into the scatterplot. Conversely, poor designs could miscommunicate the intended message", says postdoctoral researcher Luana Micallef.
Automatic and optimized scatterplot designs
The optimizer developed by the researchers can predict how users would respond to a given design. Human perception has a number of capabilities and limitations, which a visualization should respectively exploit and mitigate to communicate a message effectively to a reader.
"As the owner of a dataset, you do not necessarily know how others will perceive the scatterplot, and large datasets are also difficult to visualize. With our new algorithmic method, we can optimize the design of the scatterplot for any data and analysis tasks the user requires", explains Professor Antti Oulasvirta.
Every design aspect of a scatterplot matters, be it the size, opacity and color of the markers or the aspect ratio of the plot. These aspects strongly influence the correlations, outliers and classes that people detect in the scatterplot.
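To make the link between data and design parameters concrete, here is a minimal Python sketch of one simple heuristic: lowering marker opacity as points pile up in the same region. It is not the researchers' optimizer, which relies on models of human perception. The function names `max_overlap` and `suggest_opacity`, the grid size, the square-root rule and the opacity floor are all assumptions chosen for illustration.

```python
from collections import Counter


def max_overlap(points, bins=20):
    """Return the largest number of points that fall into one cell
    of a bins x bins grid laid over the data range."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, y_min = min(xs), min(ys)
    # Fall back to a span of 1.0 when all values on an axis are equal.
    x_span = (max(xs) - x_min) or 1.0
    y_span = (max(ys) - y_min) or 1.0

    counts = Counter()
    for x, y in points:
        # Clamp so that the maximum value lands in the last cell.
        col = min(int((x - x_min) / x_span * bins), bins - 1)
        row = min(int((y - y_min) / y_span * bins), bins - 1)
        counts[(col, row)] += 1
    return max(counts.values())


def suggest_opacity(points, bins=20, floor=0.05):
    """Illustrative heuristic (an assumption, not the published method):
    reduce opacity with the square root of the densest cell's count."""
    if not points:
        return 1.0
    crowding = max_overlap(points, bins)
    return max(floor, min(1.0, 1.0 / crowding ** 0.5))


if __name__ == "__main__":
    sparse = [(i, i) for i in range(10)]
    dense = [(0.0, 0.0)] * 100 + [(1.0, 1.0)]
    print(suggest_opacity(sparse))
    print(suggest_opacity(dense))
```

A sparse dataset keeps fully opaque markers, while a dataset with heavy overplotting gets more transparent ones, so dense regions remain distinguishable. A perceptual optimizer would go further by scoring candidate designs against the analysis task rather than applying a fixed rule.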
"Even when you are a visualization expert, an automated design helps save time, especially for very large data sets. This time is better invested in interpreting the visualizations rather than fiddling around with tedious parameter settings", says the recently graduated postdoctoral researcher Gregorio Palmas.
"This is only the beginning. We are in the middle of a shift where we automate at least parts of our data analysis, necessitated by the sheer size of the data alone. Interactive data analysis methods such as scatterplots will continue to serve us well, but even more so when augmented with some level of machine intelligence", explains Professor Tino Weinkauf.
The new algorithmic approach was most successful in terms of task completion time. According to the researchers, Luana Micallef, Gregorio Palmas, Antti Oulasvirta and Tino Weinkauf, even users who are not experts in visualization design can use the optimizer to produce effective scatterplot designs. With such algorithmic methods, unintended miscommunication may become less common in the future.
More information: Luana Micallef et al. Towards Perceptual Optimization of the Visual Design of Scatterplots, IEEE Transactions on Visualization and Computer Graphics (2017). DOI: 10.1109/TVCG.2017.2674978