Horizon, you make me scared about science on TV.

Science broadcasting remains one of our most important tools for communicating science outside academia. I have clear memories of watching shows like The Really Wild Show as a child, and getting the huge buzz from science and discovery that has kept me here doing a PhD. So cheers, Chris Packham, you landed me in this mess.

My appreciation of science broadcasting is equally the reason I believe it should be just as accurate and honest as any other part of science. There is no justification for presenting incomplete or inaccurate results just because “it’s only television”. That attitude is patronising, insulting and, above all, dangerous.

What’s the Right Diet for You? is a three-part Horizon special following a tried and tested “science experts help fix fatties” storyline. The fancy twist here is relating genetic evidence to participants’ behavioural traits, and using this to prescribe them specific diets. A laundry list of relevant experts comes along to help out, discussing the science behind a series of genetic and behavioural tests.

I somewhat accidentally ended up at the filming of one of the key behavioural experiments. The production team approached me and a few colleagues to come along as “scientists” and help run the experiment, record results and analyse data. Mostly on the promise of a day out, something a little different to add to our CVs and a two-second glimpse of our faces on television, a troupe of us went along.

The particular experiment we were involved in was the gripometer test, intended to identify so-called “constant cravers”. It was broadcast as part of the first episode, shown on BBC 2 at 9pm on Monday 12th January. If you want to see it in its full glory, it is still available on BBC iPlayer for about a month.

The idea was to identify people who still wanted to eat despite having only recently eaten a large meal. The hypothesis was that those possessing a particular genetic trait would be more at risk of displaying this “constant craver” behaviour and so would show greater desire for food. In particular it was anticipated that potential cravers would want food with traditional snack traits: high in salt, fat and sugar.

Participants were sat in exam-hall-style seating and, at one-minute intervals, shown a plate of food. They were then asked to squeeze a gripometer to indicate how much they wanted the food. They were also asked to grip the meter as hard as possible three times, and the average of these values was used to normalise the scores.
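The normalisation step described above can be sketched in a few lines. To be clear, this is my reconstruction of the calculation, not the production team’s actual code, and the participant numbers are invented:

```python
# Sketch of the normalisation described above (my reconstruction, not the
# show's actual analysis): each desire reading is divided by the
# participant's average maximum grip, making scores comparable across people.
def normalise_scores(desire_readings, max_grips):
    """Return desire readings as a fraction of average maximum grip strength."""
    avg_max = sum(max_grips) / len(max_grips)
    return [reading / avg_max for reading in desire_readings]

# Hypothetical participant: three maximum squeezes, then raw readings per food.
max_grips = [42.0, 40.0, 38.0]        # average maximum grip = 40.0
desire_readings = [20.0, 30.0, 10.0]  # raw gripometer values for three foods
print(normalise_scores(desire_readings, max_grips))  # [0.5, 0.75, 0.25]
```

Dividing by each person’s own maximum is what lets you compare a weak-gripped participant with a strong one on the same scale.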

Below is a rough transcript of Dr. Van Tulleken (DVT) and Dr. Yeo (DY) discussing the results of the experiment on Monday’s show (if you want to see for yourself skip to around 27 minutes in):

DY:  The ‘constant cravers’ are pulling harder for 5 of the 8 foods… in spite of the fact that it’s in the middle of the day two hours after lunch.
DVT: But 5 out of 8, that’s a good result!
DY: It’s a good result.
DVT: Presumably then this is gonna give us a way into designing a dieting strategy for people with these genes. [SIC]
DY: Absolutely. The thing is, how do you fight that higher drive to eat throughout the day? That’s what we have to tackle if we’re going to have an effective diet.

In science we need to know whether differences between groups in a test are meaningful in the real world, which we do by checking whether the difference is statistically significant. Simply looking at two results and noting that they are not equal is not the same as identifying a significant difference. Every experiment has limitations, and we must account for the fact that results can differ purely by chance.
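One simple, assumption-light way to check this is a permutation test: shuffle the group labels many times and see how often chance alone produces a gap as large as the one observed. A minimal stdlib-only sketch, using made-up data rather than the show’s actual results:

```python
import random
import statistics

def permutation_p_value(group_a, group_b, n_iter=10_000, seed=0):
    """Two-sided permutation test: how often does randomly shuffling the
    group labels produce a difference in means at least as large as the
    observed one?"""
    rng = random.Random(seed)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = group_a + group_b  # new list, safe to shuffle in place
    n_a = len(group_a)
    count = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            count += 1
    return count / n_iter

# Two hypothetical groups drawn from the SAME population: their averages
# still differ a little, but a high p-value says chance explains the gap.
rng = random.Random(1)
cravers = [rng.gauss(0.5, 0.1) for _ in range(20)]
others = [rng.gauss(0.5, 0.1) for _ in range(30)]
print(permutation_p_value(cravers, others))
```

The point is that “group A’s average beat group B’s for 5 of 8 foods” tells you nothing by itself; you need some version of this check before claiming the groups genuinely differ.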

Now, there is every possibility that statistical analysis was performed after our involvement. I suspect it was not: looking at the results file, this 5/8 figure seems to match a sheet I have here that simply compares the average score for each food between the first twenty people (potential cravers) and the remaining thirty (not anticipated to be cravers).

A quick and easy first step to see whether sets of data differ is to plot them as box plots. If you’ve never seen one before, they’re dead nice and simple. To see whether the potential cravers (box on the left) differ from the others (box on the right) for each food, we would look for the two thicker bars showing the medians, and the white boxes around them, not to overlap.
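Each box in such a plot is built from a five-number summary, which you can compute directly. The groups and scores below are invented for illustration, not taken from the show’s dataset:

```python
import statistics

def five_number_summary(data):
    """Min, lower quartile, median, upper quartile, max: the ingredients of a
    box plot (whisker conventions vary; this uses the raw extremes)."""
    q1, median, q3 = statistics.quantiles(data, n=4)
    return min(data), q1, median, q3, max(data)

# Hypothetical normalised desire scores for one food, per group.
cravers = [0.42, 0.48, 0.51, 0.55, 0.60, 0.47, 0.52]
others = [0.40, 0.45, 0.50, 0.53, 0.58, 0.44, 0.49]

for name, group in [("cravers", cravers), ("others", others)]:
    lo, q1, med, q3, hi = five_number_summary(group)
    print(f"{name}: median={med:.2f}, box spans {q1:.2f}-{q3:.2f}")
```

If one group’s median sits comfortably inside the other group’s box, as it does for these invented numbers, the plot is telling you the groups overlap heavily.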

As the results reported in the show use the readings from the meters divided by each participant’s maximum possible strength, I use these values too.

[Image: all-plots]

Fig 1. Box plots for each food showing distribution of values for potential “constant cravers” (the first 20 people in the dataset) on the left and those not identified as potential cravers (the second 30 people in the dataset) on the right.

These box plots are far from what I would typically consider to show “significantly” different groups. As far as I can tell, the degree of overlap and the proximity of the medians show only very small differences between the groups. While we could perform further statistical tests, I struggle to find this compelling evidence that, for 5 of the 8 foods, the potential cravers show a big enough difference in response to have any real-life meaning.

In my view as a scientist, this means we cannot conclude that there is any difference between these groups of people in their response to any of these foods. Or else the measures taken and the study design were inappropriate. Or possibly both. Certainly I believe this is wholly inappropriate evidence on which to base medical advice.

My colleagues and I feel there is also evidence for the second possibility: the equipment was not calibrated for age or gender, the control was not properly used and so was rendered ineffective, and the treatment was not randomised in any way. While none of us are medical or behavioural scientists, these are basic pillars of experimental design.

I absolutely do not wish to question the previous academic work of the experts or hosts on the show. But I would question their being happy with this show and its associated experiments as a representation of their work and its applications. While it is an opportunity to discuss some interesting theories about diet and genetics, in my opinion it falls well short of the evidence needed to justify a lifestyle change.

I chose not to voice my concerns at the time of filming because, not having seen how the results would be represented on television, I couldn’t comment on it. Now that I have seen how the results were reported and used, I cannot ignore the real problems in the programme.

The BBC’s accuracy guidelines state that “3.2.1 we must do all we can to ensure due accuracy in all our output” and “3.2.2 We should not distort known facts, present invented material as facts, or otherwise undermine our audiences’ trust in our content”. In my opinion, in this instance the evidence has been badly misrepresented, and a hypothesis for which there is little to no evidence has been presented as fact.

I want to keep enjoying science on TV, and I want to tell everyone around me to do so as well. So all I can hope is that this is not a sign of a bigger problem, just a blip. I am certain I will maintain clear memories of this programme too, but unfortunately for all the wrong reasons.
