Proxy data objects provide optional linking to external objects and small modular data interfaces from which DarwinCore and LinneanCore-like protocol interfaces can be constructed
Gregor Hagedorn (Federal Biological Research Center, Institute for Plant Virology, Microbiology, and Biological Safety, Königin-Luise-Str. 19, 14195 Berlin, Germany)

Ideally, biodiversity data are expressed using well-defined object types (with generally accepted and intuitive concepts for properties and object composition), and all objects required in relations are available in digital form and identifiable through resolvable globally unique identifiers. Reality is different: a) Complex models such as ABCD, SDD, and TCS are still under debate and not necessarily fully stable. As a consequence, simplified "Cores" such as DarwinCore or LinneanCore have been proposed. b) Most of the required data are not digitized at all. Wherever data should naturally refer to objects in other biodiversity domains, alternative models are required that either link to something external or provide a sufficient internal object definition. Such a definition also involves a simplified set of core elements.
I therefore propose to address the two problems together and to define relatively simple data interfaces that can serve both for protocol/query purposes and for the definition of local proxy objects (which either link to external entities or provide a local definition). Data interfaces shield the complexity of a fuller model (i.e., the full model can be treated as a black box). They should be coarse enough to fit several models, but detailed enough to allow the definition of proxy data (i.e., to make a substantial semantic definition). Data interfaces are often used implicitly in current practice: 'Specify' uses a simplified literature and nomenclature interface, DarwinCore contains name, identification, and geographical location interfaces, Taxon Concept Schema contains interfaces for literature and specimens, and LinneanCore has a literature interface. Agreeing on a common set of such interface concepts would allow DarwinCore and LinneanCore, as well as much of SDD, TCS, etc., to be built from the same building blocks, drastically reducing the investment needed for a full global biodiversity information system.