
Museum collections are being transformed into a radical new resource for science through digitisation: creating image resources and immense databases that allow advanced research for the future of the planet. Professor Ian Owens, the NHM Director of Science, gave a symposium on this subject in collaboration with Dr Jonathan Coddington and Dr Kirk Johnson of the Smithsonian National Museum of Natural History at the American Association for the Advancement of Science in San Jose, California, on 14 February 2015.

 

Ecosystems and human needs

 

All humans depend on biodiversity in a wide variety of ways. Clean water, food crop production, sea fisheries, tourism, timber and many more human needs rely on the functions of ecosystems to a significant degree. Over the last twenty years we have seen much greater development of the idea of ecosystem services - a concept that considers the economic and other values of the natural world to humans and integrates those values into policy, education, natural resource management and other activities. This supports better decision making and aims to ensure sustainability - the continued use of ecosystem services by people over time and by generations in the future. Biodiversity is central to ecosystem services - the variety and complexity of species and populations is immensely valuable to us all, but we know that we do not properly understand how ecosystems work, or the real value of biodiversity.

 

Data: 4.5 billion specimens, 1.9 million species, 300 years, and now DNA

 

Sustainable ecosystems management depends upon the availability of information about the variation of biodiversity. Natural history collections are a vital source of these data, holding billions of specimens collected over three centuries, each witness to past ecological conditions and historic distributions. This presentation showed how collection organisations are using digitisation to unlock the vaults of their collections and develop tools to map, monitor and understand the natural world.



 

The scale of the world's collections is immense, representing billions of datapoints. The Smithsonian NMNH in Washington holds the largest collection, followed by the NHM in London and the MNHN in Paris. Together, the data from these and many other collections are a resource recording distribution, species and dates from which changes in biodiversity over time can be analysed. However, most of the data from the last 350 years are on labels, on cards and in books, meaning that they are not readily available to modern science or computing. The challenge for collaboration is to transform the information into electronic data for modern biosphere science.

 

Collections transforming science

 

Museum collections have always changed the way that we think about the world by enabling scientific comparison and research: the discovery of the dinosaurs; the origins of humans; and the processes of evolution.

 


Charles Darwin's Galapagos finches in the NHM: a key to understanding evolution

 

As science and techniques change, so does the potential of museum collections. The recent revolutions in DNA and genomics enable collections to be seen in a completely new light as resources for researching evolution and relationships, and the development of computing and data analysis allows big patterns in space and time to be explored rapidly in ways that could only be imagined twenty years ago. NHM uses CT scanning to create digital replicas of delicate specimens for complex modelling; advanced analytical techniques with electron beam instruments to understand the detail of mineral structure and economic potential; and new applications of electron microscopy to give insights into the smallest detail of anatomy and development.

 

Our partial knowledge - species and diversity as a key to understanding ecosystem value and function

 

When it comes to the biosphere and understanding how ecosystems work, the last 350 years have seen the discovery and description of around a quarter of the species that exist (excluding bacteria and similar microbes). 400,000 beetle species have been discovered, but this almost certainly represents a minority of those that exist. New technologies with DNA look likely to revolutionise the nature of discovery - and give access to greater knowledge of the link between diversity and our needs from ecosystems. Around 1.9 million eukaryote species have been described out of a probable 8-9 million. If we consider bacteria, there could be tens of millions more species. Worldwide, we are currently describing around 15,000 species a year, so the rate of discovery with current techniques is not going to close the knowledge gap: we need more rapid approaches to description and characterisation of biodiversity, and more sophisticated thinking on the importance of biodiversity in ecosystem function.
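
A rough back-of-envelope sketch of that gap, using only the figures quoted above (1.9 million species described, a probable 8-9 million in total, around 15,000 descriptions a year). The function name and the assumption that the description rate stays constant are illustrative only and were not part of the presentation:

```python
def years_to_describe(total_species, described=1_900_000, per_year=15_000):
    """Years needed to describe the remaining species at a constant rate.

    Assumption: the rate of about 15,000 species a year never changes.
    """
    remaining = total_species - described
    return remaining / per_year


# Lower and upper ends of the "probable 8-9 million" eukaryote estimate.
for total in (8_000_000, 9_000_000):
    years = years_to_describe(total)
    print(f"{total:,} species in total: about {years:.0f} years to finish")
```

On these assumptions the backlog would take somewhere between roughly 400 and 470 years to clear, which is longer than the whole history of collecting so far.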

 



How do we understand biodiversity in a different way, and how can we speed up the development of our knowledge, particularly for the huge diversity of minute soil organisms, fungi and microbes? The effort of our science is at the moment focused on larger, more charismatic species such as birds and mammals, and the work of scientists on big processes and patterns in biodiversity - macroecology - is a small proportion of total activity.

 


 

A key area of strategic development is focused on acceleration of biodiversity description and discovery, using forest insects as a pilot group: Breaking the Taxonomic Barrier, an important complement to the digital initiatives.

 

Digitisation - the beginning of the surge

 

We need to unlock the potential of our collections through digitisation to speed up this science - transforming the labels and individual records into large datasets. However, this is a major task that requires extensive collaboration and significant resources. Our efforts so far have digitised around 3% of our collection of 80 million specimens - and of this, only part is in a form suitable for scientific analysis.

 


Our response is the development of the Digital Museum: Collections. Over the last year, we have successfully transformed our approach from individual research projects to an efficient processing line with our iCollections project: we've digitised 500,000 UK butterflies and moths with an expert team who have taken images and transformed data for scientific use, taking 2 minutes per specimen and costing £1 per specimen.
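
To give a sense of scale, the sketch below simply multiplies out the iCollections figures quoted above (500,000 specimens, 2 minutes and £1 each). The second calculation is a hypothetical extrapolation: it assumes, purely for illustration, that the same per-specimen rates would apply to the undigitised 97% of the 80 million specimens, which is unlikely to hold for every kind of specimen. The function name is invented for this example:

```python
def digitisation_effort(specimens, minutes_each=2, pounds_each=1.0):
    """Return (hours_of_handling, cost_in_pounds) for a batch of specimens."""
    hours = specimens * minutes_each / 60
    cost = specimens * pounds_each
    return hours, cost


# Figures quoted for the UK butterflies and moths.
hours, cost = digitisation_effort(500_000)
print(f"iCollections: about {hours:,.0f} hours, GBP {cost:,.0f}")

# Hypothetical: the 97% of 80 million specimens not yet digitised,
# ASSUMING the butterfly and moth rates applied to everything.
remaining = 80_000_000 * 97 // 100
hours_all, cost_all = digitisation_effort(remaining)
print(f"Remaining {remaining:,} specimens: about {hours_all:,.0f} hours, "
      f"GBP {cost_all:,.0f}")
```

Even at these rates the remainder would run to tens of millions of pounds and millions of hours, which is why the task calls for extensive collaboration and significant resources.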

 


 

Our current project is the rapid digitisation of 70,000 plant specimens on a conveyor-belt digitiser, followed by transcription and database development, as a pilot towards the creation of a Digital Herbarium to allow wider and much more ambitious scientific use.

 


 

We have also already put millions of printed pages into digital form as part of the collaborative Biodiversity Heritage Library. NHM's efforts are part of a global network of digitisation: iDigBio in the US, to take just one example, has digitised 25 million specimens from a network of institutions. These efforts amount to the development of big data for biosphere science in coming years.

 

The need for citizen science

 

However, there is a significant challenge in transcribing the label data from older scripts - it cannot be done automatically and, until we have transcriptions, these older collections are inaccessible to science.

 

And this is where the involvement of thousands of volunteers can be essential in transforming collections into a scientific data resource: citizen science and transcription. Some of this is through online crowdsourcing portals, such as Zooniverse, where we are experimenting with the transcription of our digitised collections. However, we are also looking at how we can make crowdsourcing a live event: at our annual Science Uncovered event in September 2014 we welcomed 10,000 members of the public into the NHM to see science at first hand and one of the activities on offer was citizen science transcription of beetle specimens.


 

Results and data use

 

New science is emerging from these growing digital resources, the beginning of a new type of science made possible by this investment. Steve Brooks, Angela Self and Flavia Toloni from the NHM, with Tony Sparks from Coventry University, have used the digitisation of butterflies and moths to look at how UK butterflies are responding to climate change. There are good observation data for the UK from the 1970s, but collections hold the key to looking further into the past. They analysed data from 2,630 specimens of four species of British butterflies (Anthocharis cardamines, Hamearis lucina, Polyommatus bellargus and Pyrgus malvae), collected from 1876 to 1999. The data on collection dates give a record of the first emergence of these species each year and the research shows a good relationship between higher early spring temperatures and earlier emergence dates.

 


 

Big Data and Open Data

 

The data produced from our digitisation work are being released through a new Data Portal, enabling scientists to find information of interest and to download datasets for research on Open Data principles - the NHM has adopted a policy of being Open by Default.

 


The broad challenge of global biodiversity

 

New work is using much broader datasets to understand big changes across multiple ecosystems. Andy Purvis and his team in the PREDICTS project are taking data from multiple sources, including collections, to look at patterns of how local biodiversity typically responds to human pressures such as land-use change, pollution, invasive species and infrastructure. The project aims ultimately to improve our ability to predict future biodiversity changes. Data are being assembled from a wide range of biomes.

 


 

The future of the Digital Museum for the Biosphere: Open Data; Big Data; Community Data

 

The future for the Digital Museum of the Natural History Museum is based on a new model for the development and use of collections data through digitisation:

 

  • Open Data that are available to scientists all around the world for collaboration and research. We need to involve as wide a range of expertise as possible in thinking about science for the future. Museums will continue to be a key resource as a focus for evidence and extended collaboration;


  • Big Data that cover whole ecosystems over long periods of time, based on the solid evidence base of collections and extending from population and species to molecules and DNA. Internationally, there are 4.5 billion specimens of 1.9 million species from 300 years of collecting. We need to use these data effectively, but also to work out new ways of gathering data on the millions of other species, so that better understanding can help humanity tackle the future challenges of environmental change and sustainable use; and


  • Community Data that are based on the involvement of a wide spectrum of public participation, from schoolchildren to students to communities: online, in museums and in the field. This is science for everybody, from basic curiosity, to observation and recording, to data development and interpretation, from appreciation to understanding practical application for the future.

 


 


