Papers and Posters

Abstract

Data Cleaning Tools and Methodologies
Arthur D. Chapman. Centro de Referência em Informação Ambiental, CRIA, Av. Romeu Tórtima, 388, Barão Geraldo 13084-520 Campinas SP Brazil.
Email: paper142@achapman.org

Herbarium and museum data are increasingly being used in conjunction with modelling tools to determine species distributions. Most collection institutions do not have a high level of expertise in data management techniques or in Geographic Information Systems (GIS). What these institutions need is a simple, inexpensive set of tools to assist in the input of data and information, including geocoding information, and similarly simple and inexpensive tools for data validation that can be used without having to incorporate expensive GIS software. This paper concentrates on the latter: simple and inexpensive tools and methods for geocode validation. A number of methods have been developed to identify georeferencing errors in species' data. These include the use of climate models to identify outliers in climate space and the use of automated georeferencing tools. The various methods can be classified into four main classes: the use of databases to check for internal inconsistencies, the use of geographic information systems, the use of environmental space to check for outliers, and the use of statistics to check for outliers in either geographic or environmental space. This paper looks at simple tools that are available now and suggests a number of other methodologies that could be used.
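
As an illustration of the fourth class (statistical checks for outliers), the sketch below flags suspect values in a single coordinate or environmental variable using a modified z-score based on the median and the median absolute deviation. This is one commonly used robust statistic chosen here for illustration, not necessarily the method used in the paper; the function name `flag_outliers`, the threshold of 3.5, and the sample latitudes are all assumptions made for the example.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold.

    Hypothetical helper for illustration only. The modified z-score is
    0.6745 * |x - median| / MAD, where MAD is the median absolute deviation.
    """
    med = median(values)
    deviations = [abs(v - med) for v in values]
    mad = median(deviations)
    if mad == 0:
        # No spread around the median, so the score is undefined; flag nothing.
        return []
    return [i for i, d in enumerate(deviations) if 0.6745 * d / mad > threshold]


# Made-up latitudes for one species; the last record has a dropped minus sign.
latitudes = [-22.8, -22.9, -23.1, -22.7, -23.0, 22.9]
suspect = flag_outliers(latitudes)
print(suspect)
```

The same function could be applied to an environmental variable (for example, mean annual temperature at each collection point) to look for outliers in environmental rather than geographic space. A flagged record is only a candidate for checking, since a genuine outlying population would be flagged in the same way.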