Tuesday, 30 October 2018

HadEX3 - initial steps


With IPCC AR6 now in full swing, development of HadEX3 is occurring in earnest. In fact, I've been gently getting on with bits and bobs over the last year or so.  The largest bits were the calculation of the decorrelation length scale, and also the angular distance weighting gridding scheme.  These were tested on three different datasets which will go into HadEX3.

The European Climate Assessment and Dataset (ECAD) is coordinated by KNMI and now has partner datasets in Latin America (LACAD) and Southeast Asia (SACAD).  They have already calculated indices from the data they have access to (not all of which can be shared in its raw, daily form).  These indices need converting to a standard format and their metadata checked.

The Global Historical Climatology Network Daily (GHCND) is a large collection of daily observations.  So far I've taken just a subset of US stations which form part of the US Historical Climatology Network (HCN).  These daily observations needed reformatting and then passing through the code which calculates the indices.

Finally, as we are encouraging individuals, national meteorological services and organisations to submit any data or indices they have to this dataset, we also included and processed some Spanish stations in our initial test.

There have been a number of codes which calculate the ETCCDI climate indices, in IDL, Fortran and R.  As different codes will, despite the best efforts of the scientists and programmers involved, have subtle differences in their methods and thresholds, an attempt has been made to take the best aspects of all of these and create a single, freely available codebase to calculate these indices.  The Climpact2 software uses the RClimdex software and wraps it to allow both the processing of netCDF files of gridded temperature and precipitation values and the batch processing of raw station files.

Once the indices have been calculated, we do some quick checks to ensure that there were sufficient daily observations to actually produce the indices, that there are at least 20 years of data without too many long gaps and that the end of a station series occurs after 1950.  We still get a large number of stations that could contribute to the dataset from these three sources (Fig. 1), so in some parts of the world (or for some data sources) we may be able to be more exacting in our data requirements.  The large gaps where we have no data as yet are a more urgent aspect to address.

Fig. 1.  Stations used for two example indices (top - PRCPTOT, bottom - TX90p) from the first test run of HadEX3.

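To make the station checks a bit more concrete, here is a minimal Python sketch of the two criteria that are stated with numbers above: at least 20 years of data, and a series that ends after 1950.  The function name, the dictionary layout and the parameter names are all made up for illustration, and the "too many long gaps" test is left out because its threshold isn't given here.

```python
def passes_basic_checks(annual_values, min_years=20, min_end_year=1950):
    """Return True if a station's annual index series passes two simple checks.

    annual_values maps year -> index value, with None marking a missing year.
    (Hypothetical layout, chosen only for this illustration.)
    """
    valid_years = [year for year, value in annual_values.items() if value is not None]

    # Check 1: enough years with a usable index value.
    if len(valid_years) < min_years:
        return False

    # Check 2: the series must end after the cut-off year.
    return max(valid_years) > min_end_year


# Example: a 35-year series ending in 1974 is kept,
# a 15-year series is too short to be used.
long_series = {year: 1.0 for year in range(1940, 1975)}
short_series = {year: 1.0 for year in range(1990, 2005)}
print(passes_basic_checks(long_series), passes_basic_checks(short_series))
```

Tightening `min_years` for well-observed regions would be one way of being "more exacting" where the station density allows it.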
Although we do not expect changes to occur in a linear way, showing the linear trends over the recent period gives a good summary of the dataset and allows comparison with HadEX2 and others using the ETCCDI indices.  Two examples are in Fig. 2, but as is clear, the spatial coverage has some very large gaps.  The nature of the ADW gridding scheme is that it interpolates into areas with low station density, increasing the apparent coverage, especially for the temperature indices.

Fig. 2.  Linear trends for two example indices based on their annual values over 1951-2017 (top - PRCPTOT, bottom - TX90p) from the first test run of HadEX3.  Please note these are preliminary plots, with gaps in the spatial and temporal coverage.

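For anyone wanting to reproduce a trend of this kind from a series of annual index values, a simple sketch follows.  It assumes an ordinary least-squares fit, which is my choice for the illustration rather than a statement of the estimator behind the plots; a more outlier-resistant estimator (for example the median of pairwise slopes) could be swapped in.  The function name and data layout are hypothetical.

```python
def linear_trend(annual_values):
    """Ordinary least-squares slope of an annual series, in index units per year.

    annual_values maps year -> value; years with value None are skipped.
    (Hypothetical layout, chosen only for this illustration.)
    """
    pairs = [(year, value) for year, value in sorted(annual_values.items())
             if value is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two valid years to fit a trend")

    mean_year = sum(year for year, _ in pairs) / n
    mean_value = sum(value for _, value in pairs) / n

    # Slope = covariance(year, value) / variance(year)
    sxy = sum((year - mean_year) * (value - mean_value) for year, value in pairs)
    sxx = sum((year - mean_year) ** 2 for year, _ in pairs)
    return sxy / sxx


# Example: a synthetic series rising by 0.5 units per year over 1951-2017,
# with one missing year.
series = {year: 2.0 + 0.5 * (year - 1951) for year in range(1951, 2018)}
series[1980] = None
print(round(linear_trend(series) * 10, 3), "units per decade")
```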
We still have decisions and choices to make, some of them on parameters which will result in small differences in the final data files.  There are parallel versions that we could create, with smaller gridbox sizes, other gridding methods, and more or less restrictive selection and quality control settings.  Watch this space.
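To give a feel for the kind of parameters involved, here is a rough Python sketch of one common formulation of angular distance weighting.  Each station within the decorrelation length scale gets a weight that decays exponentially with distance, and that weight is then boosted for stations that are angularly isolated from the others.  This is an illustration under stated assumptions, not the HadEX3 code: the function names, the `dls_km` and `m` parameters, and the default `m=4` are all placeholders.

```python
import math

EARTH_RADIUS_KM = 6371.0


def distance_and_bearing(lat0, lon0, lat1, lon1):
    """Great-circle distance (km) and initial bearing (radians) from point 0 to point 1."""
    p0, p1 = math.radians(lat0), math.radians(lat1)
    dlon = math.radians(lon1 - lon0)
    a = math.sin((p1 - p0) / 2) ** 2 + math.cos(p0) * math.cos(p1) * math.sin(dlon / 2) ** 2
    dist = 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))
    bearing = math.atan2(
        math.sin(dlon) * math.cos(p1),
        math.cos(p0) * math.sin(p1) - math.sin(p0) * math.cos(p1) * math.cos(dlon),
    )
    return dist, bearing


def adw_value(grid_lat, grid_lon, stations, dls_km, m=4):
    """Angular-distance-weighted mean of station values at a grid point.

    stations is a list of (lat, lon, value) tuples; stations further away than
    dls_km (the decorrelation length scale) are ignored.  Returns None if no
    station is close enough.
    """
    near = []
    for lat, lon, value in stations:
        dist, bearing = distance_and_bearing(grid_lat, grid_lon, lat, lon)
        if dist <= dls_km:
            # Distance weight: exp(-dist / dls) raised to the power m.
            near.append((math.exp(-m * dist / dls_km), bearing, value))
    if not near:
        return None

    total_weight = 0.0
    weighted_sum = 0.0
    for i, (w_i, theta_i, x_i) in enumerate(near):
        others = [(w, th) for j, (w, th, _) in enumerate(near) if j != i]
        angular = 0.0
        if others:
            # Larger when the other stations lie in different directions.
            angular = (sum(w * (1 - math.cos(th - theta_i)) for w, th in others)
                       / sum(w for w, _ in others))
        weight = w_i * (1 + angular)
        total_weight += weight
        weighted_sum += weight * x_i
    return weighted_sum / total_weight
```

Changing `dls_km`, `m`, or adding a minimum number of stations per gridbox would each shift the gridded values and the apparent coverage a little, which is exactly the sort of choice still to be settled.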

Oh, and please get in touch if you have data that could contribute towards this dataset!  We'd be very happy to have you on board.
