All posts by Marc Cornelius

B*lls-Up

Apparently Public Health England were using the xls file format to upload Track and Trace data into Excel templates. The issue is that this file format can only handle 256 columns and 65,536 rows. Each test result reportedly generates several rows of data (why?) and thus the templates could only handle about 1,400 cases. Any more than this and the data was simply truncated.

The latest version of the Excel file format (xlsx) can handle 1,048,576 rows and 16,384 columns. 
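To put rough numbers on this, here is a minimal Python sketch. The row limits are the ones quoted above. The figure of 47 rows per case is purely an assumption, back-calculated from 65,536 rows and roughly 1,400 cases, and the function names are made up for illustration.

```python
# Worksheet row limits for the two Excel file formats.
XLS_MAX_ROWS = 65_536
XLSX_MAX_ROWS = 1_048_576

# ASSUMPTION: rows generated per case, inferred from 65,536 / ~1,400.
# This is not a published figure.
ROWS_PER_CASE = 47


def max_cases(max_rows: int, rows_per_case: int) -> int:
    """Number of complete cases that fit on one sheet."""
    return max_rows // rows_per_case


def rows_lost(n_rows: int, max_rows: int) -> int:
    """Number of rows that fall off the end of the sheet."""
    return max(0, n_rows - max_rows)


print(max_cases(XLS_MAX_ROWS, ROWS_PER_CASE))   # 1394
print(max_cases(XLSX_MAX_ROWS, ROWS_PER_CASE))  # 22310
print(rows_lost(70_000, XLS_MAX_ROWS))          # 4464
```

Even the newer format only moves the ceiling. A check like `rows_lost` before every upload would at least turn silent truncation into a visible error.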

Were PHE using a massively outdated version of Excel? Or were they just incompetent?

In any event, datasets such as this should be handled by a database (e.g. Access or Oracle) rather than a spreadsheet program.

 

England 7 Day Case Rate

NOTE: I’ll no longer be updating this image. The updated image, along with others, can now be found here.

Seven day case rate per 100,000 population by lower tier local authority.

More for my edification than anything else. Basically so I can test how best to display such things and see how well they display on multiple devices.
I created the map based on data downloaded from
https://coronavirus.data.gov.uk/

Accuracy

This is the text of an e-mail I sent to my local MP on Sunday:

Dear Ms Saxby

I was very concerned to read the following report on the BBC web site on Saturday morning:
“The south-west has always had relatively few cases – currently 778 infections a day, according to PHE. However, Dr Birrell says the north-west – 4,170 infections a day – is ‘more worrying’.” https://www.bbc.co.uk/news/health-52944037

The latest summary report from PHE which covers the period 20 January 2020 to 3 June 2020 shows a cumulative total number of cases of 11,945 for the South West and 37,321 for the North West. (National COVID-19 weekly summary report: 4 June 2020 – accessed from  https://www.gov.uk/government/publications/national-covid-19-surveillance-reports)

The latest figures on the Coronavirus Data Dashboard (https://coronavirus.data.gov.uk/ – accessed on 6 June 2020) show a cumulative total of positive test results of 7,818 for the South West and 26,133 for the North West. The granular case data downloaded from this site shows a total of 1,264 positive tests in the South West for May – an average of 41 a day. For the North West, the figure is 5,267, an average of 170 a day. North Devon has had 92 positive cases since the pandemic began, with only 13 cases in the whole of May and no positive tests since 23 May.

In all cases the regions cover the same geographic area (and in the case of the South West this includes Bristol, Gloucestershire, Wiltshire and Dorset!).

There is a massive mismatch between the figures quoted in the BBC article which are reported as having come from PHE, those in the PHE report and those from the government’s Coronavirus data web site.

Of course, the number of cases that test positive is lower than the true number of infections. However, is there really almost a ten-fold difference in the South West and 25-fold difference in the North West?

I can fully accept small differences in the data because of delays in test results being received etc., but not differences of this magnitude.

What is going on here? Why don’t the figures match?

Yours sincerely,
Marc Cornelius
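For anyone who wants to check the daily averages quoted in the e-mail, a minimal Python sketch follows. The monthly totals are the ones given in the e-mail; the variable and function names are hypothetical.

```python
DAYS_IN_MAY = 31

# Positive tests recorded in May, as quoted in the e-mail above.
may_positive_tests = {
    "South West": 1_264,
    "North West": 5_267,
}


def daily_average(total: int, days: int) -> int:
    """Average per day, rounded to the nearest whole number."""
    return round(total / days)


for region, total in may_positive_tests.items():
    print(region, daily_average(total, DAYS_IN_MAY))
# South West 41
# North West 170
```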

Tricksy!

The graph above is extracted from the presentation given during the UK government’s Covid-19 press conference on Wednesday 6 May. It shows the cumulative number of deaths per million population for a number of countries.

The UK is doing badly on that measure, as it’s just below Spain. But it doesn’t appear to be doing very much worse than the US. Neither does the difference between Germany and the UK seem fantastically dramatic. It appears that Japan and South Korea are doing quite badly as well.

However, the graph is using a logarithmic scale (or log scale), which is a way of displaying numerical data over a very wide range of values in a compact way, typically when the largest numbers in the data are hundreds or even thousands of times larger than the smallest. On a log scale each step along the axis is a multiplication rather than an addition, so the gap between 10 and 100 is drawn the same size as the gap between 100 and 1,000.
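The equal spacing of 10, 100 and 1,000 can be checked with a few lines of Python. This is only an illustrative sketch using the standard library; the function name is made up and the three values are just the ones mentioned above, not data from the graph.

```python
import math


def log_position(value: float) -> float:
    """Position of a value along a base-10 logarithmic axis."""
    return math.log10(value)


values = [10, 100, 1000]

# Gaps between neighbouring values on a linear axis: 90, then 900.
linear_gaps = [b - a for a, b in zip(values, values[1:])]

# Gaps between the same values on a log axis: the same size each time.
positions = [log_position(v) for v in values]
log_gaps = [b - a for a, b in zip(positions, positions[1:])]

print(linear_gaps)
print(log_gaps)
```

On the linear axis the second gap is ten times the first. On the log axis the two gaps are equal, which is why large differences between countries look so much smaller on the original graph.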

If the graph is redrawn using a standard (linear) scale, things look dramatically different:

The UK is still the second worst, but look how the gap between the UK (red line) and the USA (brown dashes) has dramatically widened, as has the gap between the UK and Germany (amber line). The death rates in Korea (blue line) and Japan (hidden by Korea) appear to be virtually zero. They’re not of course, but they aren’t yet in double digits.

Using a log scale tells a completely different story, and can be highly misleading.  I wonder whether the original graphic was deliberately designed to be misleading?

Note: The government’s figures came from Johns Hopkins University and Public Health England (PHE). Mine are from the European Centre for Disease Prevention and Control (ECDC), so they might be slightly different.

German Lockdown

This graph shows the value of R in Germany before they relaxed their lockdown rules (the amber line) and after they were relaxed (the blue line). The red horizontal line represents an R value of one. If R is higher than one the infection is still spreading exponentially. If it’s lower than one the spread is slowing and will eventually die out. The goal is to keep the R value below one and to lower it as much as possible.
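Why one is the critical value can be seen with a deliberately crude Python sketch. It assumes every infected person passes the infection on to exactly R others in each generation, and ignores everything else (immunity, timing, changes in behaviour). The starting figure of 1,000 cases and the R values of 1.2 and 0.8 are made up for illustration and are not RKI data.

```python
from typing import List


def project_cases(initial: float, r: float, generations: int) -> List[float]:
    """New cases per generation if each case infects r others."""
    cases = [initial]
    for _ in range(generations):
        cases.append(cases[-1] * r)
    return cases


# Hypothetical starting point of 1,000 cases, followed for 5 generations.
growing = project_cases(1000, 1.2, 5)    # R above one
shrinking = project_cases(1000, 0.8, 5)  # R below one

print(round(growing[-1]))    # about 2,488
print(round(shrinking[-1]))  # about 328
```

With R only a little above one the numbers more than double in five generations. With R a little below one they fall by two thirds over the same period.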

The slight rise in the German COVID-19 R value on 27 April to just below one prompted news reporting such as this in the British press:
“WAVE OF FEAR Germany faces having to bring BACK strict coronavirus lockdowns as cases surge just days after easing them”.
“Germany has seen a worrying rise in its coronavirus infection rate after becoming one of the first countries in Europe to start easing lockdown measures”

This sort of misleading, and frequently sensationalist, reporting is rife, and gets repeated ad nauseam online, particularly on social media.

Unfortunately some people inform their decision making based on reading stories such as these, rather than relying on facts. 

For those who would like to know the source of my data, it’s taken from the website of the Robert Koch Institut (RKI) in Germany and is available in German and English:
https://www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus/Situationsberichte/Gesamt.html
The facts and figures on COVID-19 that the RKI publishes on a daily basis make the UK’s efforts to do the same seem inadequate.

EDIT: I’ll be updating the graph on a regular basis. So far the R-value shows no sign of increasing.

Oh Dear

CNN reporting on Boris Johnson’s condition from outside St Thomas’ Hospital, near the Houses of Parliament. Note how their correspondent is disregarding government advice regarding self-isolation for pregnant women. Note also how the police seen later in the video are conspicuously not following government rules on social distancing. None of this sets a good example.