John C Tweddle & Charles Hussey
Department of Library and Information Services, The Natural History Museum, Cromwell Rd, London, SW7 5BD
A growing number of biological observation and specimen databases are becoming available online. This offers exciting opportunities for cross-dataset searching and comparison, but also raises new difficulties. For example, can a user be sure that a search on a taxon name has retrieved all of the relevant information from each dataset?
In this presentation we discuss a series of observations relating to the integration of taxon names from multiple data sources, and propose practical solutions to some of the difficulties that arise. Our conclusions are based on more than four years of work for the UK’s National Biodiversity Network (NBN), which currently provides access to over 18 million observational records via the NBN Gateway (http://www.searchnbn.net/), supported by the NBN Species Dictionary (http://www.nhm.ac.uk/nbn).
We assert that a taxonomic name server is necessary to enable optimal retrieval of records when searching by organism name, and that a data warehouse architecture facilitates both data integration and validation. These observations are also relevant to other initiatives that allow the searching and comparison of multiple datasets of taxon names.
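The retrieval problem can be illustrated with a minimal sketch. It assumes a simplified name server that maps every known name string (accepted name, synonyms, vernacular name) to a single taxon key. The table, the key "TAXON-1", the dataset names and the records are all hypothetical and are not drawn from the NBN Species Dictionary or the NBN Gateway; only the synonymy of the bluebell names is real.

```python
# Hypothetical name-server table: each known name string maps to one taxon key.
NAME_TO_TAXON = {
    "hyacinthoides non-scripta": "TAXON-1",  # currently accepted name
    "endymion non-scriptus": "TAXON-1",      # synonym
    "scilla non-scripta": "TAXON-1",         # synonym
    "bluebell": "TAXON-1",                   # vernacular name
}

# Illustrative (made-up) records: (name as recorded, locality) per dataset.
DATASETS = {
    "dataset_a": [("Hyacinthoides non-scripta", "site 1")],
    "dataset_b": [
        ("Endymion non-scriptus", "site 2"),
        ("Scilla non-scripta", "site 3"),
    ],
}


def normalise(name):
    """Lower-case the name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def naive_search(name):
    """Return only records whose recorded name matches the query string."""
    wanted = normalise(name)
    return [
        (dataset, recorded, site)
        for dataset, records in DATASETS.items()
        for recorded, site in records
        if normalise(recorded) == wanted
    ]


def resolved_search(name):
    """Resolve the query to a taxon key, then match records on that key."""
    taxon = NAME_TO_TAXON.get(normalise(name))
    if taxon is None:
        # Name unknown to the name server: fall back to string matching.
        return naive_search(name)
    return [
        (dataset, recorded, site)
        for dataset, records in DATASETS.items()
        for recorded, site in records
        if NAME_TO_TAXON.get(normalise(recorded)) == taxon
    ]
```

In this sketch, a string-matching search on "Hyacinthoides non-scripta" finds only the record in dataset_a, whereas a search resolved through the name table also retrieves the records stored under the two synonyms. A real name server would additionally have to handle authorities, homonyms and differing taxonomic concepts, which this sketch ignores.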