Section 4 Presenting and Communicating uncertainty
Choosing what to communicate
The overall uncertainty is what we need to communicate, but often we don’t have a clearly quantified measure of this.
4.1 Framing the different uncertainties
If most uncertainty is quantified then present this prominently
4.1.1 If most of the overall uncertainty has been quantified (aleatory or epistemic), then this should be the most prominent message. Discussion of the unquantified uncertainties and risks can be included, but should be positioned so that they don’t reduce confidence in the main results unnecessarily.
4.1.2 If there are significant unquantified uncertainties (epistemic or ontological), then presenting only the uncertainty that has been quantified may give a misleading impression of precision and understate the overall uncertainty (i.e. don’t present a range if you know that there are substantial uncertainties that are not accounted for in that range).
However, if little is quantified, then it may be better to present no quantification at all
4.1.3 If the greatest source of uncertainty is the potential for a risk outside of the analysis to be realised, then this should be the most prominently displayed message.
If there is a major risk that has not been included in the analysis, then this may be the greatest uncertainty
4.1.4 If quality assurance has been very limited then presenting any measure of uncertainty may be misleading. In such circumstances where the greatest source of uncertainty is the potential for error, then this should be the most prominent message that is conveyed to the decision maker. This may be in the form of an analytical assurance statement that highlights areas of concern.
If quality assurance has been very limited, it may be best to lead with this
4.1.5 Think about how caveats are presented – a long list is unhelpful, but two or three upfront that have the most impact on the results are likely to be more helpful and easily understood.
Front load the important caveats and explain why they matter
Deciding how to communicate it
Regardless of the situation, we recommend putting the uncertainty first when communicating, e.g. “These results are interim. This policy showed a saving of £xm in 2017 and therefore…”. The rationale is that we read top to bottom, so put the key caveat first.
This section considers a range of approaches based on your understanding of the audience and the type of message you need to deliver.
You should also consider the onward communication of your message to ensure that when your work is passed on its core message and integrity are maintained.
4.2 Understanding the audience
Consider the audience when choosing appropriate communication methods
4.2.1 People respond differently to different communication methods. We need to assess the intended audience to understand appropriate communication methods. The audience might be:
- Analytical
- Non-analytical
- Mixed
- Someone you have worked with before - in which case tailor it to what has worked well in the past (or ask them, or see how they respond to different formats).
4.2.2 In presentations we often use advice, commonly attributed to Aristotle, to repeat our main message:
A combined approach will repeat your message and appeal to a wider audience
- Say what you are going to say
- Say it
- Say what you just said.
This is a powerful way of communicating your message in all situations. When communicating uncertainty your repetition can be achieved using different approaches to reach a wider audience.
4.2.3 If there are multiple audiences, then consider whether different communications should be created for each group.
Consider whether more than one approach might be beneficial
4.2.4 A good relationship with your decision maker will help you to understand their needs and choose the right approach for them, reducing the risk of miscommunication.
4.2.5 Think about what motivates the decision maker. For example, if the analysis is ultimately for the Chancellor of the Exchequer, you may get their interest more quickly by communicating early on that a policy would break one of the fiscal rules, rather than a policy change costing an extra £2bn (although that should still be communicated).
4.2.6 No matter how carefully you communicate the uncertainty to your immediate client, there is a risk that only the central numbers will persist and the uncertainty is overlooked. A good relationship with colleagues will minimise this risk and ensure that you have sight of such messaging before it goes out.
4.2.7 There are also risks around writing part of a larger document – you need to ensure that what you have said fits into the bigger document. A close relationship throughout can help with this.
4.3 Language
Avoid using words alone
4.3.1 Research shows that analysis is deemed to be less reliable if the outputs are conveyed only in words (even though some people may generally prefer to receive information in words). Therefore using words alone to convey uncertainty is discouraged.
4.3.2 Descriptive terms for probabilities are interpreted very differently by people, so should generally be avoided (e.g. ‘low risk’, or ‘very likely’). It is better to attach a numerical probability to the uncertainty, even if this is entirely subjective.
Avoid descriptive terms for uncertainty unless there is an established system in place
4.3.3 An exception to this is where there is an established system within the sector for attaching terminology to probabilities, one that can be assumed to be well-understood by the intended audience. For example:
- According to the IPCC (Intergovernmental Panel on Climate Change), “very likely” means 90-100% probability.
- According to NICE (National Institute for Health and Care Excellence), probabilities of between 1 in 100 and 1 in 10 are referred to as “common”.
- GAD have said it’s more effective to say this event (e.g. rivers flooding) will occur “once every 50 years” rather than “2% of the time”.
4.3.4 Presenting the likelihood of success may be perceived differently to presenting the corresponding likelihood of failure. Therefore presenting the information both ways can help avoid bias (e.g. “there is an 80% chance of success and a 20% chance of failure”). By adding in the chance of failure, the reader is reminded that there is a 20% chance of failure – which may otherwise be overlooked. Visual part-to-whole comparisons can help with this.
Use positive and negative framing
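As a minimal sketch of the two-way framing described in 4.3.4, the hypothetical helper below states the chance of an outcome together with the chance of its complement. The function name and wording are illustrative assumptions, not an established standard.

```python
def frame_both_ways(p_success, outcome="success", complement="failure"):
    """Return one sentence giving the chance of an outcome and of its complement."""
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    pct = round(p_success * 100)
    # Stating both percentages reminds the reader of the less prominent outcome.
    return (f"The chance of {outcome} is {pct}% "
            f"and the chance of {complement} is {100 - pct}%.")


print(frame_both_ways(0.8))
```

The same sentence can be placed next to a part-to-whole graphic so that words and visuals carry the same message.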
4.3.5 We should frame our advice from an audience perspective and events that they are interested in occurring (or not occurring). What decisions are they making based on our analysis and how can we explain the impacts of our findings in relation to those needs? For example “If you make this decision, this is what could happen to the thing you care about”.
Describe a possible outcome
4.3.6 There is no clear preference for choosing between probabilities and fractions (e.g. 10% probability, or 1 out of 10). Given this, the preferences of the audience should be considered. The ‘norm’ within the organisation may be best followed.
Decide whether to present percentages or frequencies
4.3.7 If using fractions, keep the denominator constant (e.g. “1 in 100 vs. 2 in 100”, rather than “1 in 100 vs. 1 in 50”) and as small as possible while keeping to integers (e.g. “1 in 100” rather than “10 in 1,000”), rounding if appropriate.
Keep denominators small and consistent when using fractions
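The advice in 4.3.7 can be sketched in code. The hypothetical helper below converts a set of probabilities to “x in N” statements that share the smallest common integer denominator. It assumes the probabilities have already been rounded to a sensible precision; `max_denominator` is an illustrative cap, not a recommended value.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def as_common_frequencies(probabilities, max_denominator=1000):
    """Express probabilities as 'x in N' using one shared, smallest denominator."""
    fracs = [Fraction(p).limit_denominator(max_denominator) for p in probabilities]
    # Least common multiple of all denominators keeps every numerator an integer.
    common = reduce(lambda a, b: a * b // gcd(a, b),
                    (f.denominator for f in fracs), 1)
    return [f"{f.numerator * (common // f.denominator)} in {common}" for f in fracs]


print(as_common_frequencies([0.01, 0.02]))
```

With these inputs the second probability is reported as “2 in 100” rather than “1 in 50”, so the two statements can be compared directly.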
4.3.8 Saying a ‘10% chance of rain’ is meaningless unless you also state the time period – e.g. in the next hour, or at some point tomorrow – and location.
Be clear on the specifics
4.3.9 If the outputs are only intended for use within a specific frame then make sure this is clearly stated alongside the outputs. For example, they may refer to a specific area or to a particular group of the population.
Be clear about the applicability of the analysis
4.3.10 Think about the word you use to describe the outputs. Are they ‘estimates’, ‘predictions’, ‘forecasts’, ‘projections’ or ‘scenarios’? Each of these has different connotations. Avoid words that imply certainty, such as ‘result’ or ‘answer’.
Choose an appropriate word for the outputs
- Forecasts: Analysis about the future, relatively robust, with all major sources of uncertainty quantified and included in the analysis. This is our best estimate of what we think will happen, including the full range of plausible outcomes.
- Projections: Analysis concerning the future, but less robust than a forecast, perhaps including some significant assumptions where the uncertainty hasn’t been quantified. This is an estimate of what might happen given certain assumptions (that may not have been thoroughly tested).
- Scenarios: Analysis concerning the future, but not robust as they include some assumptions that may not be likely to occur. This is an estimate of what we think would happen if certain things happened.
4.4 Numbers
Use an appropriate level of precision
4.4.1 Consider the overall uncertainty in the numbers you have calculated, and round appropriately to avoid spurious accuracy (e.g. perhaps 40% rather than 38.7% if the overall uncertainty is greater than one percentage point).
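One possible convention for this kind of rounding, shown here only as a sketch, is to round to the power of ten at or just above the size of the overall uncertainty. This rule is an assumption made for illustration; the appropriate precision remains a matter of judgement.

```python
import math


def round_to_uncertainty(value, uncertainty):
    """Round `value` to the power of ten at or just above `uncertainty`."""
    if uncertainty <= 0:
        raise ValueError("uncertainty must be positive")
    exponent = math.ceil(math.log10(uncertainty))
    # A negative ndigits rounds to tens, hundreds and so on.
    return round(value, -exponent)


print(round_to_uncertainty(38.7, 3))    # uncertainty of about 3 percentage points
print(round_to_uncertainty(38.7, 0.5))  # uncertainty of about half a percentage point
```

Note that Python's `round` rounds exact ties to the nearest even digit, which may differ from the rounding convention used elsewhere in a report.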
4.4.2 Presenting a single figure is best avoided as it can give a misleading impression of precision. Present a range instead (e.g. “between 1,200 and 1,800”, rather than “1,500”).
Use ranges wherever possible
4.4.3 Commissioners may request a ‘best estimate’ for ease of onward use, but you must consider the risks in providing this. Try to understand how they intend to use the analysis, so you can provide something that meets their needs while also acknowledging the uncertainty.
4.4.4 Stating a range may be perceived as a uniform distribution across the range. Conversely, stating a range around a best estimate may be perceived as a triangular distribution (or Normal with analytical audiences). Consider which of these best reflects the actual uncertainty when deciding what to present.
Consider whether to include a ‘best estimate’ within the range
4.4.5 Don’t simply use 95% confidence intervals by default. Think about what the outputs are going to be used for (see section 1), and discuss the level of risk and uncertainty that the decision maker wants to plan for – this might not be 5%.
Choose appropriate confidence/prediction intervals and be clear
4.4.6 Be clear what confidence level you are using and ensure your audience understands what this means (avoiding precise statistical definitions if this will aid comprehension).
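To illustrate 4.4.5 and 4.4.6, the sketch below computes central intervals at several probability levels from a set of simulated outcomes, so that the level can be chosen with the decision maker rather than defaulting to 95%. The simulated data, the helper names and the interpolation method are all illustrative assumptions.

```python
import random


def quantile(sorted_xs, q):
    """Linearly interpolated quantile of an already sorted list (0 <= q <= 1)."""
    pos = q * (len(sorted_xs) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_xs) - 1)
    frac = pos - lower
    return sorted_xs[lower] * (1 - frac) + sorted_xs[upper] * frac


def central_interval(samples, level):
    """Central interval expected to contain `level` (e.g. 0.8) of the outcomes."""
    if not 0 < level < 1:
        raise ValueError("level must be between 0 and 1")
    ordered = sorted(samples)
    tail = (1 - level) / 2
    return quantile(ordered, tail), quantile(ordered, 1 - tail)


random.seed(1)
outcomes = [random.gauss(1500, 150) for _ in range(10_000)]  # hypothetical model runs
widths = []
for level in (0.5, 0.8, 0.95):
    low, high = central_interval(outcomes, level)
    widths.append(high - low)
    # State the level alongside the range so the reader knows what it represents.
    print(f"{level:.0%} interval: {low:,.0f} to {high:,.0f}")
```

Showing the same result at more than one level can also help the audience see how much wider the range becomes as the required confidence increases.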
4.4.7 Presenting numbers within a sentence helps give context, making them easier to read and understand. It also makes it easier for the recipient to copy the entire sentence, reducing the risk of misunderstanding where the outputs are being used in other documents.
Present key numbers in prose
4.5 Tables
4.5.1 I’ve left this section blank for now as it needs a fair bit of work
4.6 Graphs and visualisation
Graphs can be an excellent way of communicating the quantified elements of uncertainty
4.6.1 Graphs and visualisations are an excellent way of communicating the outputs of analysis, and many graph types allow you to communicate uncertainty within the graphic (provided the uncertainty has been quantified).
4.6.2 Unquantified uncertainties cannot generally be included in graphs, so will need to be communicated through other means.
4.6.3 Some types of graph are not particularly well suited to displaying quantified uncertainty:
Some graph types cannot be used to show uncertainty clearly
- Pie/donut charts
- Stacked graphs
- Choropleth maps (and most other geographical visualisations)
- Sankey diagrams
- Heatmaps
- Treemaps
- Most complex visualisations
As a result, uncertainty must be communicated through other means (e.g. description). Therefore these formats may not be appropriate if the communication of quantified uncertainty is critical to the analysis.
4.6.4 You may have a full understanding of the underlying probability distribution, or just a range within which you expect the result to fall with a given probability. You may choose to include only the uncertainty due to a single dominant source, or the outputs from a range of scenarios.
Decide what level of detail to include on uncertainty
4.6.5 The following table suggests some graph types that can be used for most situations, each of which is described in the following sections.
| | Single Measure | Multiple Measure | Time Series | 2-dimensional data |
|---|---|---|---|---|
| None | Single point graph, or describe in prose | Bar graph or line graph | Line graph | Scatter graph |
| A Range | Single point graph with error bars | Bar or line graph with error bars | Line graph with range | Scatter with 2d error bars |
| Summary Statistics | Single box plot | Series of box plots | ? | ? |
| Maximum Detail | Probability Density Function (PDF) or Cumulative Distribution Function (CDF) | Multiple PDFs or Violin Plots | Fan chart | ? |
| Uncertainty due to the methodology | Scatter Plot | Spaghetti Plot | Spaghetti Plot | |
| Uncertainty due to alternative scenarios | Describe in prose | Multiple Line Graphs | | |
4.7 Error bars
Error bars are a simple way to illustrate a range around a data point
4.7.1 Error bars can be added to bar graphs, line graphs and scatter graphs to illustrate a range around a central estimate, within which we expect the value to lie with a given probability.
4.7.2 As referred to previously, consider the situation and decide on an appropriate level to display. E.g., don’t apply 95% confidence/prediction intervals by default.
Choose an appropriate probability level based on the context
4.7.3 State what probability the error bars represent, and describe in prose how the viewer should ‘read’ the error bar.
Be clear about what the error bars represent
4.7.4 Error bars can be added easily to a data series or time series. However, if the data are continuous (e.g. a time series) then consider whether showing multiple line graphs would be clearer than a single line graph with error bars.
Error bars can be applied to series of data points
4.7.5 If the output data are 2-dimensional, then you can apply error bars in 2 dimensions. Be careful to ensure that the resulting graph does not become illegible due to clutter.
2-dimensional error bars can be used where necessary
4.8 Box plots
Box plots can convey more information about possible outcomes than a range alone
4.8.1 Box plots can help the audience understand the underlying distribution of possible outcomes in more detail than just a range. Typically they show the median, interquartile range, maximum and minimum values for the range of possible outcomes. This can be particularly useful when the underlying distribution is skewed or non-normal.
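As a sketch of the quantities a box plot typically displays, the hypothetical helper below computes the five-number summary using Python's standard `statistics.quantiles`. The skewed sample is simulated purely for illustration.

```python
import random
import statistics


def box_plot_summary(samples):
    """Minimum, quartiles and maximum: the values a typical box plot shows."""
    q1, median, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    return {"min": min(samples), "q1": q1, "median": median,
            "q3": q3, "max": max(samples)}


random.seed(3)
skewed = [random.lognormvariate(0, 1) for _ in range(5000)]  # hypothetical outcomes
summary = box_plot_summary(skewed)
mean = statistics.fmean(skewed)
for name, value in summary.items():
    print(f"{name:>6}: {value:.2f}")
# In a right-skewed distribution the mean sits above the median,
# which a box plot reveals but a symmetric range would hide.
print(f"  mean: {mean:.2f}")
```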
4.8.2 Box plots can be arranged in parallel to show the distributions for a range of measures, and can help compare the different shapes.
A series of box plots can be used to compare distributions
4.8.3 Box plots may not be widely understood by non-analysts, so think carefully about whether the added information will be effective, or whether a simple range would be sufficient. A labelled example can be used to help the audience interpret the format.
Think about whether the audience will be familiar with the format
4.9 Probability density functions (PDFs)
PDFs show complete information on the quantified uncertainty
4.9.1 A probability density function can be used to give complete information on the range of possible outcomes, and the likelihood of each for a given estimate.
4.9.2 While presenting complete information may seem ideal, it may be more information than the audience actually needs. Would a prose description of the mean and range be sufficient?
Think about whether the audience needs this much information
4.9.3 If the PDF is approximately normal, then there may be little value in displaying it, as the essential features can be described in a few words.
PDFs can be useful when the distribution of outcomes is multimodal, or otherwise complex
4.9.4 However, if the distribution is multimodal, then it could be misleading to present the mean, so a graphical illustration of the distribution may be more effective.
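The point about multimodal distributions can be illustrated with a short simulation. The sketch below builds a hypothetical two-peaked set of outcomes, prints a rough text histogram, and shows that almost no outcomes lie near the mean. All values are invented for illustration.

```python
import random
import statistics
from collections import Counter

random.seed(0)
# Hypothetical outcomes drawn from two equally likely regimes.
samples = ([random.gauss(10, 2) for _ in range(5000)]
           + [random.gauss(30, 2) for _ in range(5000)])

mean = statistics.fmean(samples)
bin_width = 5
counts = Counter(int(x // bin_width) * bin_width for x in samples)
for start in sorted(counts):
    bar = "#" * (counts[start] // 100)  # one '#' per 100 outcomes
    print(f"{start:>3} to {start + bin_width:<3} {bar}")

# Share of outcomes within 2.5 units of the mean.
near_mean = sum(1 for x in samples if abs(x - mean) < 2.5) / len(samples)
print(f"Mean: {mean:.1f}; share of outcomes near the mean: {near_mean:.1%}")
```

Here the mean falls in the gap between the two peaks, so quoting it alone would describe an outcome that is very unlikely to occur.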
4.9.5 It may aid clarity to draw the reader’s attention to important features, such as the mode.
Labelling can be used to highlight the key features
4.10 Multiple Probability Density Functions
Small multiples can be used to show the uncertainty across a number of different measures
4.10.1 If we need to communicate a series of PDFs, then multiple functions can be shown to compare the range of possible outcomes across the series.
4.10.2 If there are only 2 or 3 these can be overlaid to make it easy to compare. With more, ‘small multiples’ are likely to be clearer.
4.10.3 Violin plots are essentially mirrored PDFs, which can be more visually appealing. Additional information (such as box plots) can be overlaid if required.
Violin plots can be used to compare PDFs
4.11 Cumulative distribution functions (CDFs)
A CDF may be more helpful than a PDF if there is a specific threshold of interest to the customer
4.11.1 A cumulative distribution function essentially shows the same information as a probability density function. However, a CDF may be more helpful when the audience is primarily concerned with how likely it is that the value will be below (or above) a particular point (rather than the range within which we expect the value to fall). For example, how likely is it that our costs exceed our budget? (rather than what are our costs going to be?)
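The budget question above can be answered directly from an empirical CDF. The sketch below assumes a set of simulated cost outcomes; the distribution, its parameters and the budget figure are hypothetical.

```python
import random
from bisect import bisect_right


def empirical_cdf(sorted_samples, x):
    """Share of outcomes less than or equal to x."""
    return bisect_right(sorted_samples, x) / len(sorted_samples)


random.seed(42)
costs = sorted(random.lognormvariate(4.6, 0.25) for _ in range(20_000))  # £m, hypothetical
budget = 120.0

p_within = empirical_cdf(costs, budget)
p_over = 1 - p_within
print(f"Chance that costs stay within the £{budget:.0f}m budget: about {p_within:.0%}")
print(f"Chance that costs exceed the budget: about {p_over:.0%}")
```

Reading the CDF at the threshold gives the probability of staying within budget, and the complement gives the probability of exceeding it.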
4.11.2 However, features such as the mode are less clear on a CDF. The mode corresponds to the steepest part of the graph, which is harder to read by eye than a peak on a PDF.
The most likely value is less clear on a CDF
4.11.3 Drawing gridlines intersecting at key points of the function can help the viewer understand how to ‘read’ the graph.
Labelling can be used to highlight the key features
4.12 Fan Charts
Fan charts can show how uncertainty changes over time
4.12.1 Fan charts can be used to show a series of different prediction intervals for time-series projections (e.g. 30%, 60% and 90% at the same time).
4.12.2 This is essentially plotting selected points from a time-dependent PDF.
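As a sketch of how the bands of a fan chart might be derived, the code below takes a set of simulated time-series paths and computes central 30%, 60% and 90% intervals at each time step. The random-walk simulation and the helper names are illustrative assumptions, not a prescribed forecasting method.

```python
import random


def quantile(sorted_xs, q):
    """Linearly interpolated quantile of an already sorted list (0 <= q <= 1)."""
    pos = q * (len(sorted_xs) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_xs) - 1)
    frac = pos - lower
    return sorted_xs[lower] * (1 - frac) + sorted_xs[upper] * frac


def fan_chart_bands(paths, levels=(0.3, 0.6, 0.9)):
    """For each time step, return {level: (lower, upper)} across all paths."""
    bands = []
    for step_values in zip(*paths):
        ordered = sorted(step_values)
        bands.append({level: (quantile(ordered, (1 - level) / 2),
                              quantile(ordered, (1 + level) / 2))
                      for level in levels})
    return bands


random.seed(7)
paths = []
for _run in range(2000):
    value, path = 100.0, []
    for _step in range(8):
        value += random.gauss(1.0, 2.0)  # hypothetical drift and volatility
        path.append(value)
    paths.append(path)

bands = fan_chart_bands(paths)
for t, band in enumerate(bands, start=1):
    low, high = band[0.9]
    print(f"t={t}: 90% band {low:.1f} to {high:.1f}")
```

The bands widen at later time steps, which is the feature a fan chart is designed to show.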
4.12.3 Often a central ‘best estimate’ is not included, to avoid the viewer focussing on a single estimate and undermining the importance of the uncertainty.
Avoid including the mode
4.13 Spaghetti Plots
Spaghetti plots can be used to show the results from a range of different methodologies
4.13.1 If the methodology is believed to be the dominant source of uncertainty, then showing results with multiple different methodologies can be effective.
4.13.2 Less importance is placed on quantified uncertainties, and more on the general consensus of results.
4.13.3 Potential flaws are that all methodologies are given equal weight, which may not be appropriate.
Other sources of uncertainty should be considered
4.13.4 Also, the uncertainty within each methodology is not shown.
4.14 Multiple Line Charts
Multiple line charts can be clearer than a series of error bars
4.14.1 Multiple line charts can be used with time series data to show a quantified range around a ‘most likely’ projection (essentially a series of error bars).
4.14.2 With scenario analysis, a series of line charts can be used to show the projections from each scenario.
Alternative scenarios can be illustrated with multiple line graphs
4.14.3 Generally with scenario analysis each scenario should be presented with equal prominence, to avoid suggesting that one is more likely than another (unless analysis has been carried out to quantify the likelihoods of each).
Give equal prominence to each scenario
4.14.4 Try to include an even number of scenarios, to avoid having a middle option that may be misinterpreted as the ‘most likely’ scenario.
Try to have an even number of scenarios
4.15 Tornado Diagrams
Tornado diagrams can be used to show the sources of uncertainty
4.15.1 Tornado diagrams are different to most other graphs discussed here. They are not used to show the outputs of the analysis, but to show how different sources of uncertainty contribute to the overall uncertainty.
4.15.2 Tornado diagrams depict sensitivity of a result to changes in selected variables.
4.15.3 They show the effect on the output of varying one variable at a time, keeping the other input variables at their assumed values.
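The one-at-a-time calculation behind a tornado diagram can be sketched as follows. The cost model, the base values and the low/high ranges are hypothetical and chosen only to illustrate the technique.

```python
def tornado_table(model, base, ranges):
    """Vary one input at a time between its low and high value.

    All other inputs stay at their base values. Rows are sorted by the
    size of the swing in the output, largest first, as in a tornado diagram.
    """
    rows = []
    for name, (low, high) in ranges.items():
        out_low = model(**{**base, name: low})
        out_high = model(**{**base, name: high})
        rows.append((name, out_low, out_high, abs(out_high - out_low)))
    return sorted(rows, key=lambda row: row[3], reverse=True)


def cost_model(volume, unit_cost, overhead):
    """Hypothetical model: total cost of delivering a service."""
    return volume * unit_cost + overhead


base = {"volume": 1000, "unit_cost": 5.0, "overhead": 2000.0}
ranges = {"volume": (800, 1300),
          "unit_cost": (4.5, 5.5),
          "overhead": (1500.0, 3000.0)}

rows = tornado_table(cost_model, base, ranges)
for name, out_low, out_high, swing in rows:
    bar = "#" * int(swing // 100)  # crude text bar, one '#' per 100 of swing
    print(f"{name:<10} {out_low:>8,.0f} to {out_high:>8,.0f}  {bar}")
```

Sorting by swing places the input that contributes most to the output uncertainty at the top, giving the diagram its characteristic shape.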
4.15.4 If the level of uncertainty is unpalatable to the customers, then this format can be useful to help focus work on reducing the level of uncertainty in key parameters.
Can help communicate the reasons for uncertainty, and identify further need for analysis
4.15.5 One limitation of the format is that only one parameter is changed at a time. There are some situations where the uncertainty due to one variable may appear small initially, but becomes much more prominent if a second variable takes on a slightly different value. For example, think of a workflow model with a bottleneck. A tornado diagram might show the bottleneck parameter to be the overwhelming uncertainty. However, if this parameter is increased slightly then the bottleneck may move elsewhere, completely changing the picture.
Tornado diagrams can be misleading in complex models
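The bottleneck example can be made concrete with a toy model. In the sketch below, throughput is limited by the slower of two stages; the stage names and capacities are hypothetical.

```python
def throughput(stage_a, stage_b):
    """Toy workflow: output is limited by the slower of two stages."""
    return min(stage_a, stage_b)


def swing(model, base, name, low, high):
    """Change in output when one input moves from low to high, others fixed."""
    return abs(model(**{**base, name: high}) - model(**{**base, name: low}))


base = {"stage_a": 100, "stage_b": 110}

# At the base case stage A is the bottleneck, so varying stage B has no effect.
swing_b_at_base = swing(throughput, base, "stage_b", 105, 125)

# If stage A's capacity is higher, the bottleneck moves to stage B.
shifted = {**base, "stage_a": 130}
swing_b_shifted = swing(throughput, shifted, "stage_b", 105, 125)

print(f"Swing from stage B at base case: {swing_b_at_base}")
print(f"Swing from stage B when stage A = 130: {swing_b_shifted}")
```

A tornado diagram drawn at the base case would show stage B as contributing nothing, even though it becomes the dominant source of uncertainty once stage A's capacity changes.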
4.16 Infographics
Infographics can be useful for public facing communications
4.16.1 Infographics are graphic visual representations of information, data or knowledge intended to present information quickly and clearly. They can improve understanding by using graphics to enhance people’s ability to see patterns and trends.
4.16.2 When done well they will grab the reader’s attention from their otherwise busy day and become a very powerful way of communicating key messages.
They grab attention
4.16.3 Investing in and designing a good infographic may be worthwhile if your audience is less confident with data and analysis.
The additional graphics and text make your charts and messages more accessible
4.16.4 An example where uncertainty in the form of confidence intervals has been explained using an infographic is in the Households Below Average Income Report, ONS:
Example – confidence intervals
4.16.5 A simple infographic can be used to help the reader visualise the relative magnitude of numbers. This may be useful if we want to demonstrate the magnitude of the uncertainty relative to the overall result. E.g. the example below helps the reader to interpret four percentages.
Visualise relative magnitudes
4.16.6 Infographics can use a lot of space and can become overly simplistic. Consider the format being used to determine if an infographic is the right choice.
Avoid common pitfalls
4.16.7 For wider principles of good infographic design and common mistakes to avoid see: https://www.nngroup.com/articles/designing-effective-infographics/
Refer to other sources for general principles around infographic design
4.17 Interactive Tools
Interactive tools can be used to immerse your reader in complex matters
4.17.1 Interactive tools can help to bring analysis to life and make it more accessible to non-specialists. They can create an immersive experience that is easier to understand and highly memorable.
4.17.2 Consider the overall message and where the uncertainties lie. Which aspects will the audience be interested in and what do they need to hear? Use this understanding to bring focus to which interactive elements to create.
Focus on specific messages
4.17.3 The interactivity will enable your users to manipulate and get a deeper understanding of the message.
4.17.4 If a key source of uncertainty is a single variable, then it may be possible to construct a display that can be changed as the user adjusts the value of this variable by moving a slider.
Allow reader to adjust a key variable
4.17.5 Or, if there are several key assumptions that impact the result, a chart may be created that changes depending on the inputs that the user enters.
4.17.6 Being able to see what would happen if an underlying assumption were to change is a powerful way to demonstrate the level of uncertainty we may have in a given result.
4.17.7 If the communication with the decision maker is limited to a printed report, then interactive tools will not be possible.
Not possible in traditional printed formats
Example - DECC 2050 Calculator