The highlights of creating and using a wildlife sound collection: reflections on a seminar by Margaret Cawsey, Curator of Data, Australian National Wildlife Collection (ANWC), CSIRO Ecosystem Sciences, on 3 July 2014.
By Joanna Benedict, Learning Programme Developer at the Natural History Museum.
Margaret Cawsey speaking at the Sounds of Australia seminar.
Alex Drew is shown working on the sound archive.
Margaret is passionate about organising data and making it accessible for researchers, museum professionals and others interested in finding out about the sounds of Australian birds. She makes the case for opening up the sound collection and describes the challenges involved.
Apparently, the ANWC is the only organisation in the Australian museum community to make bird sound files available through the Atlas of Living Australia.
Why collect bird noises?
During the seminar, Margaret played the sounds from the Grey butcherbird and the Pied butcherbird to demonstrate that the sounds from the two similar species are different. This gives researchers the opportunity to use sounds to differentiate the two birds from the same family.
Researchers can use the data to analyse the function of bird sounds in:
• mating
• giving out warning signs
• protecting their young
• communicating with each other
One study of the sounds from the Moonwalking birds found that the sounds were from the flapping of their wings. This information alone is valuable to further the understanding of the science of wing motion and the unique physicality of the species.
Challenges and questions
There is a high volume of analogue sound recordings, some of which are slowly degrading. This poses a real challenge for Margaret and her team. Converting analogue data to digital data requires many hours of laborious work.
Margaret explains that one physical container of sounds, such as a tape or a reel, can generate multiple bird sound files and metadata records. Sometimes the metadata for those bird species may also be stored elsewhere, in letters and notes. It demands a lot of attention to detail to ensure that the sound files and metadata are named, matched and stored correctly in the Excel spreadsheet which feeds into the ANWC and OZCAM databases.
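The matching step described above can be sketched in code. This is only a minimal illustration and not the ANWC's actual workflow: the file names, the `recording_id` key and the record layout are all hypothetical, and the sketch assumes that each sound file is named after the identifier used in its metadata row.

```python
from pathlib import PurePath


def check_matches(sound_files, metadata_rows):
    """Return (files with no metadata row, metadata rows with no file)."""
    # Assumption: the file name without its extension is the recording identifier.
    file_ids = {PurePath(name).stem for name in sound_files}
    meta_ids = {row["recording_id"] for row in metadata_rows}
    return sorted(file_ids - meta_ids), sorted(meta_ids - file_ids)


# Hypothetical example: one tape that yielded several cuts.
files = ["tape012_cut01.wav", "tape012_cut02.wav", "tape012_cut03.wav"]
rows = [
    {"recording_id": "tape012_cut01", "species": "Grey butcherbird"},
    {"recording_id": "tape012_cut02", "species": "Pied butcherbird"},
    {"recording_id": "tape012_cut04", "species": "Pied butcherbird"},
]

orphan_files, orphan_rows = check_matches(files, rows)
print("Files without metadata:", orphan_files)
print("Metadata without a file:", orphan_rows)
```

A check of this kind would flag the third cut (no metadata) and the fourth metadata row (no file) for a curator to resolve by hand.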
The seminar discussed some interesting questions:
• How accessible is the collection of sounds in Museums compared to other cultural organisations?
• How does the usefulness of the sound files compare with the cost of digitising them?
• What are the intrinsic values of the bird sounds to further the understanding of bird research?
• Does the quality of the sounds matter, or is it just a matter of making the sounds available to the public?
These issues remain to be conclusively dealt with, but the ANWC will continue to work towards the answers as it develops a sustainable approach to the prioritisation of curation of sounds for research.
Margaret is determined to make the collection as accessible as possible to benefit the researchers who can reveal the value of the data. Margaret feels that the intrinsic value of bird sounds lies in their being occurrence records as well as providing sounds for the analysis of species distributions and studies of speciation. For an occurrence record, the quality of the sound is unimportant as long as it is identifiable and adequate for future analysis.
What’s next?
Margaret welcomes more collaborative work to share knowledge, including the strategic use of volunteers to convert the analogue files and assist with identification of species and collection of metadata, and more funding to recruit staff to locate, identify and curate valuable multimedia collection objects.
In reality, it will take more than 100 years to digitise the analogue data and curate the metadata due to lack of human resources. It is undeniable that there is real value in making the sound collection accessible; however, curating and digitising sound collections remains a low priority for museums, as most already struggle to find resources to convert their specimen collections to image files.
Despite this, museum professionals can at least start a conversation to discuss the potential of their sound collections and to develop a plan with a vision in which the public gets to hear those less heard sounds from nature.
Do you have a sound collection?
What is your vision for the collection?
What does your sound collection sound like?
Share your thoughts and let us hear your sounds: get in touch with Margaret Cawsey.
As part of International Open Data Day, the Natural History Museum is opening up its digital collections and research data through its new Data Portal. An increasing number of governments and publicly-funded organisations are committed to making data available for unrestricted use - Open Data. NHM supports this principle and its data are of particular value to scientific research on biodiversity, looking at changes of species over time and in geographical distributions, and predicting future trends. This is something of particular interest in the face of human pressures on the natural environment and the need for effective policy responses for a sustainable future.
The Museum’s Vince Smith and Ben Scott created the system. Vince Smith said, “Data on the collection is one of our greatest assets. We wanted to expose the Museum’s data to our peers in a way that allows them to easily discover and reuse it.”
“The Data Portal will provide an archive for the hundreds of research datasets generated by museum scientists each year”, said Vince. “It also allows the Museum to contribute to global science initiatives, such as the Global Biodiversity Information Facility, who are aggregating all known data on the occurrence of species worldwide.”
The collection could once only be accessed when academics took the opportunity to visit the Museum in person. It is now accessible to anyone with an internet connection, anywhere in the world. Ben Scott said: "There is huge value in exposing this data to the world - we are excited to see what people use it for."
The Museum has over 300 Science staff, generating almost 1,000 scientific papers every year - these papers are now being presented as dynamic lists on the new staff biographies, which will link in coming months to a new NHM Open Repository for published materials. The new Data Portal will provide a platform for scientists to share the datasets that have been created alongside their studies.
Vince Smith said: “We hope that the Museum's open approach will further understanding of the natural world, and foster innovation allowing other scientists to test and build upon existing Museum research.”
Open Data Day brings people together around the world to use open public data in innovative ways: creating new approaches to visual presentation; doing analysis and research; and exploring new data products. It is part of efforts to support and encourage open data policies all around the world to open up access and increase benefits to all. As part of Open Data Day on 21 February 2015, Ben Scott will be attending the London outpost, and helping people use Museum data in their hackathons.
Museum collections are being transformed into a radical new resource for science through digitisation: creating image resources and immense databases that allow advanced research for the future of the planet. Professor Ian Owens, the NHM Director of Science, presented a symposium on this subject in collaboration with Dr Jonathan Coddington and Dr Kirk Johnson of the Smithsonian National Museum of Natural History at the American Association for the Advancement of Science in San Jose, California, on 14 February 2015.
Ecosystems and human needs
All humans depend on biodiversity in a wide variety of ways. Clean water, food crop production, sea fisheries, tourism, timber and many more human needs rely on the functions of ecosystems to a significant degree. Over the last twenty years we have seen much greater development of the idea of ecosystem services - a concept that considers the economic and other values of the natural world to humans and integrates those values into policy, education, natural resource management and other activities. This supports better decision making and aims to ensure sustainability - the continued use of ecosystem services by people over time and by generations in the future. Biodiversity is central to ecosystem services - the variety and complexity of species and populations is immensely valuable to us all, but we know that we do not properly understand how ecosystems work, or the real value of biodiversity.
Data: 4.5 billion specimens, 1.9 million species, 300 years, and now DNA
Sustainable ecosystems management depends upon the availability of information about the variation of biodiversity. Natural history collections are a vital source of these data, holding billions of specimens collected over three centuries, each witness to past ecological conditions and historic distributions. This presentation showed how collection organizations are using digitization to unlock the vaults of their collections and develop tools to map, monitor and understand the natural world.
The scale of the world's collections is immense, representing billions of datapoints. The Smithsonian NMNH in Washington holds the largest collection, followed by the NHM in London and the MNHN in Paris. The data from these and many other collections together are a resource recording distribution, species and dates from which changes in biodiversity over time can be analysed. However, most of these data from the last 350 years are on labels, on cards and in books, meaning that they are not readily available to modern science or computing. The challenge for collaboration is to transform the information into electronic data for modern biosphere science.
Collections transforming science
Museum collections have always changed the way that we think about the world by enabling scientific comparison and research: the discovery of the dinosaurs; the origins of humans; and the processes of evolution.
Charles Darwin's Galapagos finches in the NHM: a key to understanding evolution
As science and techniques change, so does the potential of museum collections - the recent revolutions in DNA and genomics enable collections to be seen in a completely new light as resources for researching evolution and relationships; the development of computing and data analysis allows big patterns in space and time to be explored rapidly, in ways that could only be imagined twenty years ago. NHM uses CT scanning to create digital replicas of delicate specimens for complex modelling; advanced analytical techniques with electron beam instruments to understand the detail of mineral structure and economic potential; and new applications of electron microscopy to give insights into the smallest detail of anatomy and development.
Our partial knowledge - species and diversity as a key to understanding ecosystem value and function
When it comes to the Biosphere and understanding how ecosystems work, the last 350 years have seen the discovery and description of around a quarter of the species that exist (excluding bacteria and similar microbes). 400,000 beetle species have been discovered, but this almost certainly represents a minority of those that exist. New technologies with DNA look likely to revolutionise the nature of discovery - and give access to greater knowledge of the link between diversity and our needs from ecosystems. Around 1.9 million Eukaryote species have been described out of a probable 8-9 million. If we consider bacteria, there could be tens of millions more species. We are currently, worldwide, describing around 15,000 species a year so the rate of discovery with current techniques is not going to close the knowledge gap: we need more rapid approaches to description and characterisation of biodiversity, and more sophisticated thinking on the importance of biodiversity in ecosystem function.
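As a rough illustration of the knowledge gap described above, the figures quoted (1.9 million described species, a probable 8-9 million in total, and around 15,000 descriptions a year) can be turned into a back-of-envelope estimate. The sketch below assumes, purely for illustration, that the description rate stays constant and that the totals are correct; the function name is hypothetical.

```python
def years_to_describe(total_species, described=1_900_000, per_year=15_000):
    """Years needed to describe the remaining species at a constant rate."""
    remaining = total_species - described
    return remaining / per_year


# Lower and upper estimates of total Eukaryote species quoted in the text.
for total in (8_000_000, 9_000_000):
    years = years_to_describe(total)
    print(f"{total:,} species in total: roughly {years:.0f} years at the current rate")
```

On these assumptions the gap would take something over four centuries to close, which is why faster approaches to description are needed.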
How do we understand biodiversity in a different way, and how can we speed up the development of our knowledge, particularly for the huge diversity of minute soil organisms, fungi and microbes? The effort of our science is at the moment focused on larger, more charismatic species such as birds and mammals, and the work of scientists on big processes and patterns in biodiversity - macroecology - is a small proportion of total activity.
A key area of strategic development is focused on acceleration of biodiversity description and discovery, using forest insects as a pilot group: Breaking the Taxonomic Barrier, an important complement to the digital initiatives.
Digitisation - the beginning of the surge
We need to unlock the potential of our collections through digitisation to speed up this science - transforming the labels and individual records into large datasets. However, this is a major task that requires extensive collaboration and significant resources. Our efforts so far have digitised around 3% of our collection of 80 million specimens - and of this, only some is in suitable form for scientific analysis.
Our response is the development of the Digital Museum: Collections. Over the last year, we have successfully transformed our approach from individual research projects to an efficient processing line with our iCollections project: we've digitised 500,000 UK butterflies and moths with an expert team who have taken images and transformed data for scientific use, taking 2 minutes per specimen and costing £1 per specimen.
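To put the iCollections figures in context, the arithmetic can be sketched as follows. The 500,000 specimens, 2 minutes and £1 per specimen, and the 3% of 80 million specimens all come from the text; applying the butterfly and moth rate to the rest of the collection is an assumption made only for illustration, since other kinds of specimen may well take more or less effort.

```python
COLLECTION_SIZE = 80_000_000    # specimens in the collection (from the text)
PERCENT_DIGITISED = 3           # share digitised so far (from the text)
MINUTES_PER_SPECIMEN = 2        # iCollections rate (from the text)
COST_PER_SPECIMEN_GBP = 1       # iCollections cost (from the text)


def effort(specimens):
    """Return (person-hours, cost in GBP) for a number of specimens."""
    hours = specimens * MINUTES_PER_SPECIMEN / 60
    cost = specimens * COST_PER_SPECIMEN_GBP
    return hours, cost


# The iCollections project itself: 500,000 UK butterflies and moths.
icollections_hours, icollections_cost = effort(500_000)

# Assumption: the same rate applies to everything not yet digitised.
remaining = COLLECTION_SIZE * (100 - PERCENT_DIGITISED) // 100
remaining_hours, remaining_cost = effort(remaining)

print(f"iCollections: {icollections_hours:,.0f} hours, £{icollections_cost:,}")
print(f"Remaining {remaining:,} specimens: {remaining_hours:,.0f} hours, £{remaining_cost:,}")
```

At the quoted rate, iCollections represents about 16,700 person-hours and £500,000; extending the same rate to the undigitised 97% gives a sense of why extensive collaboration and significant resources are needed.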
Our current project is the rapid digitisation of 70,000 plant specimens on a conveyor-belt digitiser, followed by transcription and database development, as a pilot towards the creation of a Digital Herbarium to allow wider and much more ambitious scientific use.
We have also already put millions of printed pages into digital form as part of the collaborative Biodiversity Heritage Library. NHM's efforts are part of a global network of digitisation: to take just one example, iDigBio in the US has digitised 25 million specimens from a network of institutions. These efforts amount to the development of big data for biosphere science in coming years.
The need for citizen science
However, there is a significant challenge in transcribing the label data from older scripts - it cannot be done automatically and, until we have transcriptions, these older collections are inaccessible to science.
And this is where the involvement of thousands of volunteers can be essential in transforming collections into a scientific data resource: citizen science and transcription. Some of this is through online crowdsourcing portals, such as Zooniverse, where we are experimenting with the transcription of our digitised collections. However, we are also looking at how we can make crowdsourcing a live event: at our annual Science Uncovered event in September 2014 we welcomed 10,000 members of the public into the NHM to see science at first hand and one of the activities on offer was citizen science transcription of beetle specimens.
Results and data use
New science is emerging from these growing digital resources: the beginning of a new type of science made possible by this investment. Steve Brooks, Angela Self and Flavia Toloni from the NHM, with Tony Sparks from Coventry University, have used the digitisation of butterflies and moths to look at how UK butterflies are responding to climate change. There are good observation data for the UK from the 1970s, but collections hold the key to looking further into the past. They analysed data from 2,630 specimens of four species of British butterflies (Anthocharis cardamines, Hamearis lucina, Polyommatus bellargus and Pyrgus malvae), collected from 1876 to 1999. The data on collection dates give a record of the first emergence of these species each year, and the research shows a good relationship between higher early spring temperatures and earlier emergence dates.
Big Data and Open Data
The data produced from our digitisation work is being released through a new Data Portal, enabling scientists to find information of interest and to download datasets for research on Open Data principles - the NHM has adopted a policy of being Open by Default.
The broad challenge of global biodiversity
New work is using much broader datasets to understand big changes across multiple ecosystems. Andy Purvis and his team in the PREDICTS project are taking data from multiple sources, including collections, to look at patterns of how local biodiversity typically responds to human pressures such as land-use change, pollution, invasive species and infrastructure, with the ultimate aim of improving our ability to predict future biodiversity changes. Data are being assembled from a wide range of biomes.
The future of the Digital Museum for the Biosphere: Open Data; Big Data; Community Data
The future for the Digital Museum of the Natural History Museum is based on a new model for the development and use of collections data through digitisation:
Open Data that are available to scientists all around the world for collaboration and research. We need to involve as wide a range of expertise as possible in thinking about science for the future. Museums will continue to be a key resource as a focus for evidence and extended collaboration;
Big Data that cover whole ecosystems over long periods of time, based on the solid evidence base of collections and extending from population and species to molecules and DNA. Internationally, there are 4.5 billion specimens of 1.9 million species from 300 years of collecting. We need to use these data effectively but also work out new ways of gathering data on the millions of other species that will allow understanding to help humanity to tackle the challenges of the future in terms of environmental change and sustainable use; and
Community Data that are based on the involvement of a wide spectrum of public participation, from schoolchildren to students to communities: online, in museums and in the field. This is science for everybody, from basic curiosity, to observation and recording, to data development and interpretation, from appreciation to understanding practical application for the future.
One of the most prestigious international gatherings of scientists, policy specialists, journalists, science communications professionals and the US public is the annual meeting of the American Association for the Advancement of Science. It attracts several thousand participants every year: in 2015 it is in San Jose, California from 12-16 February.
The NHM is represented this year by Professor Ian Owens, Director of Science, who is co-organiser and speaker at a session Unlocking Natural History Collections to Model the Biosphere in collaboration with NHM's sister institution the Smithsonian National Museum of Natural History (NMNH) from Washington D.C.
Ian and Jonathan Coddington (from the NMNH) will be speaking on the potential of collections data in addressing global environmental challenges. 4.5 billion specimens in natural history collections are a key resource for science supporting our future on Earth. Unlocking this valuable data source through digitisation will support sustainable use of biodiversity, better understanding of parasite threats to human health, and essential insights for the development of new crops to feed a growing population.
Both NHM and NMNH have ambitious programmes of digitising collections - creating images, data and DNA evidence - to enable much wider scientific use by researchers around the world. The future of this bold enterprise will be mapped out in the session at AAAS on Saturday 14 February, showing how major institutions are creating genomic collections and digitizing biological data to make it openly available, often with the help of thousands of online citizen scientists. Researchers are harnessing the potential of collections as an immense dataset on the planet’s past and present, used for modelling the future for better-informed policy.
Digitization, genomics and citizen science are enabling scientists to work across multiple collections and millions of specimens. The collections highlight geographic, temporal, morphological, and genomic patterns of diversity across a vast range of species. By combining collections data with new modeling and data visualization tools, analyses of biodiversity are possible on a scale never before seen.
Society needs systems that deliver the best possible estimate of the abundance, distribution and functional role of all species, from the recent past to projecting into the future. Delivering this requires an unprecedented level of cooperation by natural history organizations and the wider community.
We introduced our new digital herbarium project in a previous post: with the herbarium of the Royal Botanic Gardens, Kew, we are moving 70,000 plant specimens temporarily to Picturae, a specialist company in the Netherlands, so that the herbarium sheets can be imaged in the most speedy and effective way.
The images will then be sent to Suriname for transcription of the typed and hand-written information on the sheets into electronic form. The information includes the species identification, the place and date of collection and often the collector, which can link to field notebooks and other resources. The images and data will then be accessible via online databases to scientists, conservation biologists and others for research and better understanding of plant distribution and biogeography.
This is what it all looked like as we packed up and got ready to go - not many people see this, so worth showing:
The NHM herbarium compactors
and the grey cupboards on the compactors
Jacek Wajer removing specimens of Dioscoreaceae (yams and related plants)
And wheeling them away on a trolley
Steve Cafferty preparing the transport boxes for the specimens
Kew and the Natural History Museum are working together on large scale digitisation of their plant collections - #digitalherbarium #Kew #NHM
Packing specimens at Kew. Kew is sending 41,000 specimens.
Plants preserved as herbarium specimens provide the evidence of what plants there are, where they grow and when they were collected. They provide the basis for modelling plant distribution over time, act as evidence that ensures plants are named consistently, and are a source of material for analyses of anatomy, disease and disease control, biochemistry and evolutionary relationships. Together, the herbaria at Kew and the Natural History Museum, London, contain more than 12 million specimens and are consulted by many visitors from around the world. Much of the information that these researchers need is stored away in cupboards, and is therefore not discoverable until a scientist visits the institution and looks inside. By providing images and data from these specimens online, anyone interested in plant diversity, for research or just for interest, can discover what our institutions hold and then access the information they need.
Recently some large European herbaria such as the Muséum National D’Histoire Naturelle in Paris and Naturalis in The Netherlands have had digital images made of their entire collections in order to make both specimen images and data about each collection available. Kew and the Natural History Museum have been working closely with Picturae, the company involved in the digitisation of the Naturalis herbarium, to develop cooperative workflows to make digital images and capture data from part of the two institutions’ collections.
Jacek Wajer and Jonathan Gregson selecting specimens for packing at the Natural History Museum
We are embarking on the first stage of this adventure starting in the last week of January. This first stage is a pilot to refine workflows and to gather information so we can plan larger scale projects in the future. We are focusing our efforts on several groups of economic plants: the genus Solanum (potatoes, tomatoes and aubergines), the St. John’s Worts (Hypericum) and the family Dioscoreaceae (yams). In all, approximately 70,000 specimens will be digitised using Picturae’s ‘digistreet’ methods. A ‘digistreet’ is essentially a purpose-built conveyor belt system that minimises manual handling of fragile herbarium specimens and captures high resolution images of each. After quality control and checking at both Picturae and the respective institutions, detailed information on where and when each plant was collected will be transcribed from the labels on the specimens by a team in Suriname.
Our objectives for this pilot phase are:
Image all Kew’s and NHM’s selected pilot herbarium specimens to an agreed common standard.
Transcribe all the label collection data from these specimens to an agreed standard.
Incorporate all of the images and data into the institutions’ specimen catalogues to make them discoverable on-line.
Work together to refine accurate costing of mass digitisation using Picturae’s methods and develop joint workflows that will facilitate future work involving more partners across the UK.
This important pilot will lay the foundation for future collaborative work, with the eventual goal of providing access to the rich botanical collections held in UK institutions. We will share the results of our pilot with other institutions to help increase access to the wealth of information on global plant diversity held within the UK and to maximise the scientific and conservation impact of data held in plant collections worldwide. We hope that others will want to join in on this adventure!
The Picturae conveyor belt imaging system in Amsterdam.
The pilot began on the 19th of January with material being sent to Picturae in the Netherlands. We will be tweeting and blogging on the progress of the project as the specimens are shipped, imaged and transcribed - follow us on Twitter using the hashtags #digitalherbarium #Kew #NHM
Margaret Cawsey, Curator of Data, Australian National Wildlife Collection, CSIRO
Friday 4 July 11:00
Sir Neil Chalmers seminar room, Darwin Centre LG16 (below Attenborough studio)
Specimen-based collection records from museums and herbaria are often regarded as a more authoritative basis for research than observational assertions. Through the Atlas of Living Australia (www.ala.org.au), Australian collections have a centralised venue for sharing their biodiversity data on a large scale. *3.3 million collection records are brought together with a variety of tools that enable researchers to select, interrogate, map and analyse these data. Scientists are taking advantage of the increasing accessibility and large numbers of these records to enhance their research - illustrative examples are presented. Advantage also accrues to collections, in that the value of their data to researchers, policy-makers, environmental managers and the community at large is demonstrated by data download statistics. The Atlas also provides tools for researchers to communicate with curators, in effect permitting collections to crowd-source the expert identification of data errors, facilitating rapid correction.
This project has a number of purposes and benefits.
First, the list of butterflies is a checklist - a list used to define all the species found in a particular area. This is important because it summarises current knowledge of diversity: biodiversity scientists and conservation professionals know what has been found and what they should take account of in research. The act of compiling a checklist will often involve research and reorganisation of collections to reflect current knowledge.
Second, these are photographs of the type specimens - the definitive reference specimens used as authority for the use of a scientific name. These are housed in museums such as the NHM in a number of different locations. A virtual photographic collection allows scientists to see easily where the reference specimens are for use - and the photograph may be sufficient for some scientific uses. It also brings together specimens from different collections that would not otherwise be brought together without considerable cost.
Third, the photographs can help in identification and mean that scientists and conservation workers in different parts of the Americas can use the resource as a reference - this may need some care and development of more complex identification resources, such as keys, but the pictures are an important resource nonetheless.
The great majority of these images are scans of print photographs taken by Gerardo Lamas over many years of research in museums throughout the world, and we are very grateful for his generosity in allowing them to be made available. Scanning and initial databasing of the prints was completed by TABDP, supported by the Darwin Initiative, and then given to BoA to be made available online. BoA's Nick Grishin designed and wrote the web pages that now display the images. Numerous other people deserve acknowledgement, including the curators of the museums where these types are housed and many other members of TABDP, BoA and other lepidopterists who contributed images, time and encouragement.