European Natural History Specimen Information Network

Background

ENHSIN is a Thematic Network funded through the European Commission's Improving Human Potential Programme. It provides an essential evaluation of future strategy, management and implementation for the specimen collections that underpin wide areas of science and related enterprises. These collections are developed and managed with user needs as a constant priority, and the partner institutions are all active in the electronic databasing of their specimens.

Scientists and other users need access to information on a specimen's location and its unique characteristics as a baseline for their work. In addition, a tremendous amount of associated information is held for specimens: point and time of collection, history, provenance, and so forth. Much of this information is now becoming available for individual institutions, but at present no common access and search mechanism is being developed for an electronic network of interoperable specimen databases in Europe. The current diversity of approaches creates barriers of access policy, technology, language and other factors for users seeking information on specimens in European institutions.

This Infrastructure Cooperation Network will set standards for such a development of infrastructure, test options and specify a practical framework for implementation. It will stimulate a consensus between the infrastructure providers and users on European user needs, on data priorities and technology, on information policy and access, and on harmonized management of information resources. It will provide the foundation for common information access.

Objectives

The broad purpose of this project is to enable the development of a shared, interoperable infrastructure of European natural history specimen databases by developing and assessing protocols, standards, methods and management frameworks, together with a consensus on user needs.

In more detail, the objectives are:

  • To confirm scientific and sectoral user needs for natural history specimen information in Europe.
    This will ensure that users and their needs, both direct and indirect, are clearly addressed in the development of the infrastructure.
  • To agree standards and protocols for specimen information exchange, access, quality and terminology. To agree the core information to be searched and shared through the infrastructure collaboration.
    This will lay foundations for information exchange by ensuring consistency, in content and structure, of specimen data, and in communication between institutions.
  • To establish protocols for retrieving data across multiple sites.
    This will provide the mechanism for actually networking information.
  • To identify data sources suitable for linking in a pilot network. To implement a pilot network and evaluate its effectiveness.
    This will allow priorities identified by users to be addressed, define the technical base for the pilot, and provide feedback on how well user requirements are met.
  • To identify and address key legal and intellectual property issues.
    This will address barriers to successful implementation of infrastructure networking.
  • To develop policy and frameworks for implementation of collaborative infrastructure data networking.
    This will draw together the findings of the Cooperation Network to produce a practical framework as a basis for implementation in RTD.


Approach

The early stages of the project involve activities on several fronts. A review of legal issues is being undertaken to highlight and resolve matters (notably IPR) that have the potential to restrict information flow within the network. European specimen databases that satisfy the criteria for inclusion in a pilot network are being identified. Policy issues, such as the extent of access to data and data quality, are being agreed. These policy questions will inform the basis of the technical design solutions needed to deliver both the model (pilot) network and the broader large-scale network. The pilot network will focus on identified priorities in areas such as earth sciences, biodiversity or human origins. All these activities are being carried out against the background of user needs, both specialist and sectoral. Early user assessment is being made by means of a questionnaire, shortly to be sent to a wide range of stakeholders. Issues of network management are being addressed as more specific tasks evolve. The middle and final phases of the project deal with fine-tuning of data standards and technical deliverables, completion and evaluation of the pilot network, and the production of a management model for the large-scale network.

In addition to developing the infrastructure, the partner institutions are also users of the system, combining the infrastructural role with a substantial fundamental and applied systematic research effort. They employ research scientists who work on the collections, host scientific collaborators and visitors, and provide information in response to scientific and other sectoral needs.  Thus the members are both primary users of the collections and in close continual liaison with a far wider body of expert users in science. The needs of scientific research have a fundamental influence on the development of the infrastructures and the members of the Infrastructure Cooperation Network are key representatives of the primary users in this respect.


Communication

Communication with users is crucial to the successful conduct of the Network. This is particularly the case for the iterative definition and testing of user needs and the testing of the pilot networks, where the focused participation of a range of users is essential to ensure the relevance and practicality of the final findings. Such focused communication will be achieved through direct email or personal contact and, where particular members need it in implementing their tasks, through list server technology. An important function of this website is to disseminate information to wider scientific, museum, cultural and other sectoral audiences and to encourage their participation.

 

Complementarity

The Network is strongly complementary to the current development of metadata systems for European collections under the BioCISE project (funded under Framework Programme 4). BioCISE is identifying biological information resources (collections and databases), cataloguing interdisciplinary biodiversity database expertise, and providing guidelines for the incorporation of collection information in databases. BioCISE has been operating at the level of the collection - coverage, characteristics and other information - and working with a particular data model to develop a consistent metadata system for collections. ENHSIN will be developing access at the level of the specimen, and it is essential that this development is compatible with the outputs of the BioCISE initiative for the collections. The important difference is that while BioCISE deals with harmonized approaches to information at the collection scale, ENHSIN addresses the need for common access to information on the actual specimen. Both kinds of information are important to users, and part of the ENHSIN task will be to address complementarity and possible harmonized development with initiatives such as BioCISE. A further important difference is in coverage: BioCISE deals with biological collections, while the collections within ENHSIN have wider coverage - not only biological but also mineralogical.

There are other electronic information initiatives where complementarity is of importance and which ENHSIN will address. BCIS (Biodiversity Conservation Information System) is an international partnership of 12 organizations that, among other activities, is building a metadatabase for biodiversity data holdings of the partners.  Issues of data management and interoperability are of obvious importance but BCIS is not handling specimen information from collections. IABIN is a similar initiative for North American biodiversity conservation information that is addressing the issue of interoperability of information systems, but is again not addressing specimen information.

Species 2000 is an initiative that aims to provide a uniform and validated quality index of species names for all known species by forming a federation of species databases. It is supported directly or indirectly by a number of the partners in ENHSIN and is important for knowledge of species and for stabilizing nomenclature, but it does not deal with the specimens themselves, which are essential in defining taxonomic names.

There are a number of small initiatives at the institutional level that are providing specimen information. Some, such as the Catalogue of the Type Specimens of the Dutch herbaria, handle specimen records from several institutions but are not interoperable networks of active databases. Some limited networks of small databases are being developed on common database platforms; NEODAT II is a good example. These have not yet addressed the practical problems posed by a range of database platforms, large collections and diverse users.

The potential for complementarity will be assessed during the Network, and potential collaboration in implementation of the Pilot and in RTD will be evaluated.