The Species 2000 standard data set, common data model and transfer protocols
Richard White 1, Frank Bisby 2, Andrew Jones 1, Xuebiao Xu 1, Ed Donovan 1, Yuri Roskov 2, Alex Gray 1 & Rainer Froese 3
1 School of Computer Science, Cardiff University, Cardiff CF24 3XF, U.K.
2 School of Plant Sciences, The University of Reading, Reading RG6 6AS, U.K.
3 Institute of Marine Research, D-24105 Kiel, Germany
The Species 2000 Project has created a prototype Catalogue of Life. Species data from a distributed array of Global Species Databases (GSDs) is accessed through a Web portal, Web Service and an Annual Checklist CD-ROM.
Each species entry in the Catalogue derives from a GSD and contains basic information known as the Standard Data Set: scientific name, synonyms, common names, taxonomic placement, a comment, geographical localities, bibliographic references, date of last editing and a hyperlink to further information.
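As an illustration only, the Standard Data Set elements listed above could be represented as a simple record. This is a minimal sketch, not the Species 2000 definition: the class name, field names, types, the `source_gsd` field and the placeholder values are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StandardDataSetEntry:
    """Hypothetical record holding the Standard Data Set elements for one species."""
    scientific_name: str
    source_gsd: str  # assumed field: the GSD the entry derives from
    synonyms: List[str] = field(default_factory=list)
    common_names: List[str] = field(default_factory=list)
    taxonomic_placement: List[str] = field(default_factory=list)  # higher taxa, broadest first
    comment: Optional[str] = None
    geographical_localities: List[str] = field(default_factory=list)
    references: List[str] = field(default_factory=list)
    last_edited: Optional[str] = None  # date of last editing, kept as text here
    further_info_url: Optional[str] = None  # hyperlink to further information


# Placeholder values, not real Catalogue data.
entry = StandardDataSetEntry(
    scientific_name="Examplea prima",
    source_gsd="ExampleGSD",
    common_names=["example plant"],
    taxonomic_placement=["Plantae", "Exampleaceae"],
)
```

Only the scientific name and the source database are required in this sketch; the remaining elements default to empty so that sparse entries can still be represented.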
Because the GSDs are independently implemented and managed, they do not share the same database schema. Instead, the Species 2000 Common Data Model (CDM) defines six operations, known as Requests, which the GSD or its wrapper can support to transfer defined data elements reliably to a client. The CDM does not provide a database implementation schema and should not be used as a model for setting up new GSDs. For this purpose we propose a schema developed to support a new version of the Annual Checklist.
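The wrapper idea can be sketched as follows: each database keeps its own schema, and a thin adapter answers a request using shared element names. This is a hedged illustration, not the CDM itself. The `GSDWrapper` class, the `lookup_by_name` request, the column names and the placeholder rows are all hypothetical, and `lookup_by_name` is not claimed to be one of the six CDM Requests.

```python
from typing import Dict, List, Optional


class GSDWrapper:
    """Hypothetical adapter from one database's local columns to shared element names."""

    def __init__(self, rows: List[Dict[str, str]], column_map: Dict[str, str]):
        self._rows = rows
        self._column_map = column_map  # local column name -> common element name

    def lookup_by_name(self, scientific_name: str) -> Optional[Dict[str, str]]:
        # Find which local column holds the scientific name in this database.
        name_column = next(
            local for local, common in self._column_map.items()
            if common == "scientific_name"
        )
        for row in self._rows:
            if row.get(name_column) == scientific_name:
                # Re-key the row so every client sees the same element names.
                return {
                    common: row[local]
                    for local, common in self._column_map.items()
                    if local in row
                }
        return None


# Two databases with different local schemas (placeholder data).
gsd_a = GSDWrapper(
    [{"taxon": "Examplea prima", "edited": "2004-01-01"}],
    {"taxon": "scientific_name", "edited": "last_edited"},
)
gsd_b = GSDWrapper(
    [{"SP_NAME": "Examplea prima", "MODIFIED": "2004-02-02"}],
    {"SP_NAME": "scientific_name", "MODIFIED": "last_edited"},
)
```

A client calling `lookup_by_name` on either wrapper receives a dictionary keyed by the same element names, without needing to know how the underlying database is organised.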
The CDM forms the basis for several communications protocols previously and currently used by Species 2000, including CGI/HTML, CORBA, CGI/XML and SOAP. It will also be compared with the draft TDWG Taxonomic Concept Schema (TCS).