The Psychology of Feedback Design: How the Same Ratings Look Better (or Worse) Depending on Format

Mostofa Wahid Soykoth and Japmman Pahuja

Journal of Marketing Research Scholarly Insights are produced in partnership with the AMA Doctoral Students SIG – a shared interest network for Marketing PhD students across the world.

In an era where ratings and reviews shape consumer behavior and business reputation, the format in which performance scores are presented can dramatically alter how they are perceived. Firms like Uber, Amazon, and TripAdvisor present scores in a variety of formats: incremental (a raw score per occurrence), cumulative (an updated average score), or a combination thereof. A recent Journal of Marketing Research article examines how incremental scores versus cumulative averages shape judgments, and why the distinction matters for managers, platform designers, and policymakers.

It demonstrates that the presentation of performance scores—whether as cumulative averages, individual (incremental) scores, or a combination—can significantly influence how people evaluate products, services, or individuals. The authors find that when a generally well-performing entity receives a negative score, people view it as less damaging when the information is presented in a cumulative format. This presentation reduces negativity bias and helps prevent overreactions such as customer churn. However, incremental formats make single bad scores stand out more strongly, which could be helpful in contexts where managers want to stress accountability or encourage improvement.


The implications are far-reaching. For example, a restaurant with fluctuating quality may benefit from incremental formats that highlight recent improvements, while a ride-sharing app might prefer cumulative scores to maintain a stable reputation. The study also reveals that when both formats are presented together, users tend to focus more on the most extreme score—especially if it is negative—suggesting that hybrid formats may not provide the balance designers expect.

Managers can use these insights to tailor score presentation formats to different user segments. Novices may benefit from incremental feedback that encourages progress, while experts may prefer cumulative scores that reflect long-term performance. The authors also suggest that dynamically switching formats could help platforms manage user expectations and behavior, though this approach may introduce confusion if not carefully designed.

Ultimately, this research highlights a subtle yet powerful lever for influencing consumer judgment. By rethinking how scores are presented, organizations can more effectively manage perceptions, foster trust, and achieve desired outcomes.

Key Takeaway

Across nine experiments, the authors find that cumulative formats tend to buffer negative feedback, making poor scores appear less severe. This can help reduce customer churn and maintain trust in platforms. In contrast, incremental formats make each score stand out, amplifying the impact of a single negative rating. This can be useful in contexts where accountability and improvement are key.
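The contrast between the two formats can be made concrete with a minimal sketch (illustrative only, not code from the paper): the same rating history is shown as a raw incremental sequence and as a running cumulative average.

```python
# Contrast the two feedback formats on the same rating history.
# Hypothetical data: four perfect scores followed by one bad score.
def cumulative_view(ratings):
    """Running average after each new rating, rounded to 2 decimals."""
    views = []
    total = 0
    for i, r in enumerate(ratings, start=1):
        total += r
        views.append(round(total / i, 2))
    return views

history = [5, 5, 5, 5, 1]

print(history)                   # incremental: the single 1 stands out
print(cumulative_view(history))  # cumulative: [5.0, 5.0, 5.0, 5.0, 4.2]
```

In the incremental view the negative score is fully visible, while in the cumulative view the same event appears only as a modest dip from 5.0 to 4.2, which is the buffering effect the authors document.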

We had a chance to connect with one of the authors to learn more about their study and gain additional insights:

Q: Your research examines how the presentation format of quantitative scores affects decision makers’ evaluations. Did you have any observations that sparked your interest in reviewing the phenomenon closely and studying its consequences?

A: Yes, my two coauthors, Arne and Jeroen, are incredibly observant of the marketplace, and I actually learned about this phenomenon from them. They noticed that platforms like Uber, at the time, were using cumulative rating formats, while another app, Lyft, was using incremental formats. Building on that observation, we also looked at discussions on Reddit to see what people were saying about these differences. These different formats might have effects, and it became a nice combination of marketplace observation and psychological inquiry. We began asking: what varies in the marketplace? Do companies differ in their approaches, and could that have an impact?

If different companies are using various formats, there may be a reason behind it. Sometimes, companies haven’t thought it through, but in other cases, especially with tech companies, they have very deliberate reasons for their choices. From a psychological perspective, that makes it especially interesting to delve deeper. That was the starting point of this research.

Q: The research indicates that the presentation format has a significant impact on decision making when scores deviate. How do you suspect these findings would hold (or differ) in the context where performance expectations are less standardized and tend to be more subjective, such as creative services?

A: That is a good question. We haven’t thoroughly examined subjective domains, and several factors may be at play here. One thing to consider is that when it comes to highly subjective matters, people sometimes have strong preexisting preferences. For example, if I like the paintings of a particular artist, I will still appreciate them regardless of the score. In such cases, when people have strong preferences, ratings don’t matter much, and so the rating format likely won’t matter either.

On the other hand, in situations where people don’t have firm preexisting opinions, such as wine tasting, ratings can serve as a crucial cue. Many people lack in-depth expertise (strong preexisting opinions) in wine, so they tend to rely more on ratings (e.g., on an app like Vivino), whereas experts tend to depend less on them. Therefore, it can go either way, depending on how strong people’s preexisting preferences are.

It may also depend on the decision environment. When buying online without direct access to the product, ratings become more influential. If we do have direct access or if rich visual information is available, heavy reliance on ratings decreases. However, on many online platforms, ratings are among the primary pieces of information that influence purchase decisions.

Q: Among the many interesting findings in your study, were there any results that surprised you? If so, could you share which aspects stood out to you the most?

A: Yes, two findings genuinely surprised us. Based on initial observations, one might have predicted that the combined format (which shows both cumulative and incremental ratings) would produce evaluations somewhere in between the two. For instance, if people see a negative score alongside a positive overall average, one might expect them to weigh both pieces of information and arrive at a more moderate judgment. However, consistent with theory on sensitivity to extremes, we found that evaluations in the combined format aligned entirely with the incremental presentation. When participants encounter a negative score, it is hard to ignore, and it strongly pulls down their overall evaluation.

Equally surprising was the strength of this effect. The effect sizes were much larger than we anticipated. To illustrate, in one of our studies, we asked participants to consider a product with an average rating of 4.2 based on five scores, four of which were maximum ratings. When asked to infer the missing score, many participants still significantly overestimated it. Instead of recognizing that the missing score must have been 1, participants often assumed it was a 2 or 3. In other words, the overall average of 4.2 seemed to “pull” their inference upward, making the extreme negative observation less salient than it genuinely was. People may systematically misestimate underlying scores, even in cases where the math is simple.
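The inference task described above is simple arithmetic: with an average of 4.2 over five ratings, four of them at the maximum of 5, the missing score is fully determined. A short sketch (using the study's example numbers) makes the calculation explicit:

```python
# Solve for the one rating implied by a cumulative average.
# Example numbers from the study: average 4.2 over five ratings,
# four of which are the maximum score of 5.
def implied_missing_score(average, n, known_scores):
    """Solve average * n = sum(known_scores) + missing for the missing score."""
    return average * n - sum(known_scores)

missing = implied_missing_score(4.2, 5, [5, 5, 5, 5])
print(missing)  # 1.0 -- the only score consistent with the average
```

Despite the math being this direct, participants often inferred a 2 or a 3, showing how strongly the high average pulls the estimate upward.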

Q: Did you observe or imagine any unintended downsides to using cumulative formats, for instance, situations where critical problems might be masked rather than addressed? Can managers detect or avoid sweeping serious negative feedback under the rug in an average-based system?

A: Yes, this is a very real concern. One situation we examined was how cumulative formats can obscure recent performance issues, particularly in contexts such as app evaluations. For example, a local TV station had an app that initially received decent ratings. However, after a significant update, the app’s performance declined. Despite this, the cumulative score remained relatively high, masking the recent problems. In such cases, managers need to look beyond the overall average and examine incremental scores to understand what’s happening in the present.

This issue is not limited to apps. Service contexts such as restaurants, for instance, can vary significantly in quality over time. A place might have been excellent in the past but could be struggling now. If customers only see the cumulative score, they might miss these recent dips in quality. On the other hand, some services are inherently variable, experiencing random hiccups that aren’t sustained. In those cases, a cumulative score might be more representative of the overall experience.

Therefore, yes, cumulative formats can mask critical problems, and managers should exercise caution. They need to actively monitor recent feedback rather than rely solely on averages. Otherwise, they risk overlooking serious issues that could impact customer satisfaction and retention.

Q: Can you envision a system where different user groups (novice/experts, high-value/low-value customers) would benefit from tailored score presentation formats? How might platforms segment their audience or dynamically switch formats to maximize desired behavioral outcomes?

A: Absolutely. The format of score presentation should align with the platform’s goals and the characteristics of its users. For example, if the goal is to encourage users to get started or continue engaging with a service, incremental formats can be more motivating. Imagine a course where the scores are 2, 3, 3, and then a sudden 5. Seeing the incremental progress might encourage someone to keep going. In contrast, a cumulative score might make the journey seem steep or discouraging, despite a recent maximum score.

There may also be the psychological impact of losing a perfect score. For instance, if someone has a cumulative score of 5 and then receives a 4, they may feel as though they’ve lost something valuable. This is a real issue: Some people react strongly to losing a perfect rating, as seen in platforms like Uber. Although we haven’t directly tested these scenarios, they are interesting and relevant.

Different users respond differently to feedback. Some are encouraged by seeing improvement, while others might be discouraged by a dip in their average. Platforms could segment users based on their behavior or preferences and present scores in a format that best supports their engagement. This kind of dynamic tailoring could be a powerful tool for influencing user behavior and satisfaction.

Q: If you could extend this research in any direction, which new context or type of platform do you think would benefit most from experimenting with score presentation formats, and why?

A: A promising direction would be to explore the dynamic switching of formats, where platforms change how scores are presented based on user behavior or context. For example, if a user receives a series of high scores (say, five 5s) and then gets a 1, the platform might switch to a cumulative format to soften the impact. If the user improves again, it may revert to an incremental format. That said, this kind of switching can be confusing: users may not understand why the format changed or what it means for their performance.

Cumulative formats are challenging for users to interpret. They require users to understand that they must improve consistently to raise their score. This can feel like a slow climb, especially after a setback. The interplay between shifting formats, user expectations, and the pursuit of a perfect score creates a complex psychological landscape.

We’ve only studied one or two sequences that could realistically occur in the real world, but there’s a lot more to explore. Platforms like ride-sharing apps, educational tools, and fitness trackers could benefit from experimenting with different formats. Understanding how users respond to these changes can help platforms design more effective feedback systems that support motivation, satisfaction, and long-term engagement.

Read the Full Study for Complete Details

Source: Christophe Lembregts, Jeroen Schepers, and Arne De Keyser (2023), “Is It as Bad as It Looks? Judgments of Quantitative Scores Depend on Their Presentation Format,” Journal of Marketing Research, 61 (5), 937–54. doi:10.1177/00222437231193343.

Go to the Journal of Marketing Research

Mostofa Wahid Soykoth is a doctoral student in marketing, Louisiana State University, USA.

Japmman Pahuja is a doctoral student in marketing, Georgia State University, USA.
