Papers and Posters

Abstract

Taxonomic Names Meet Formal Parsers: Building Better Biodiversity Databases
Brook Milligan, Department of Biology, New Mexico State University, Las Cruces, New Mexico 88003 USA

Many relational databases use multiple fields to represent a single concept. One example from the Darwin Core schema is the representation of taxonomic names, where three fields are used to represent the full name of a species. One consequence of this practice is that the integrity of information split among these fields must be guaranteed independently by every data entry mechanism. Unless all of these mechanisms are in strict agreement, nothing can be asserted about the integrity of the stored data. Furthermore, every client using the data must include complex logic for handling all cases (including erroneous or meaningless ones) involving the presence or absence of information in each of the several fields. To solve this problem I present an approach that unifies taxonomic name information into a single field, embeds the relevant logic within the database, and relies on the formal nomenclature of the relevant taxonomic groups to ensure the integrity of the information. This guarantees that information entered into the database follows established nomenclatural conventions regardless of the data entry mechanism. Clients are greatly simplified because they need not check the integrity of the information; they need only retrieve it. Integrating such techniques into taxonomic and biodiversity databases will greatly simplify the curation of information.
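
The idea of validating a single name field against a formal grammar can be illustrated with a minimal sketch. It is not the implementation presented in this work. It assumes a deliberately simplified grammar (a capitalized genus, a lowercase specific epithet, and an optional infraspecific epithet with an optional rank marker), which covers far less than the real nomenclatural codes. The names `TaxonName` and `parse_taxon_name` are hypothetical, and Python stands in for logic that the abstract places inside the database itself.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed, highly simplified grammar (not a real nomenclatural code):
#   name := Genus " " epithet [ " " [ rank " " ] epithet ]
_NAME = re.compile(
    r"(?P<genus>[A-Z][a-z]+)"
    r" (?P<species>[a-z]+(?:-[a-z]+)?)"
    r"(?: (?:(?P<rank>subsp\.|var\.|f\.) )?(?P<infra>[a-z]+(?:-[a-z]+)?))?"
)


@dataclass(frozen=True)
class TaxonName:
    """A parsed name; instances exist only for strings that match the grammar."""
    genus: str
    species: str
    rank: Optional[str] = None
    infraspecific: Optional[str] = None

    def __str__(self) -> str:
        # Rebuild the canonical single-field form from the parsed parts.
        parts = [self.genus, self.species, self.rank, self.infraspecific]
        return " ".join(p for p in parts if p)


def parse_taxon_name(text: str) -> TaxonName:
    """Parse one name field, rejecting anything outside the assumed grammar."""
    normalized = " ".join(text.split())  # collapse runs of whitespace
    match = _NAME.fullmatch(normalized)
    if match is None:
        raise ValueError(f"not a well-formed taxonomic name: {text!r}")
    return TaxonName(match["genus"], match["species"],
                     match["rank"], match["infra"])


if __name__ == "__main__":
    print(parse_taxon_name("Pinus ponderosa var. scopulorum"))
    for bad in ("pinus ponderosa", "Pinus", "Pinus ponderosa var."):
        try:
            parse_taxon_name(bad)
        except ValueError as err:
            print("rejected:", err)
```

Because every entry path would go through the same parser, a malformed combination such as a rank marker with no epithet never reaches storage, and a client can read the components without re-checking them.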