Andrew C. Jones, Richard J. White, Xuebiao Xu
School of Computer Science, Cardiff University, 5 The Parade, Cardiff CF24 3AA, UK
There is increasing interest in using workflow as a metaphor for e-Science problem-solving tools for biodiversity. Software implementing this metaphor is in use in a number of biodiversity informatics projects. For example, the Kepler system is being developed in association with the SEEK project, and the Triana system is being extended for use within the BiodiversityWorld project. Using such systems, it is possible to build up complex analyses from sequences of simpler tasks.
BiodiversityWorld is a multi-site project within which we have had particular responsibility for designing and implementing the software architecture and providing appropriate workflow management facilities. In this paper we draw on our experience in BiodiversityWorld to discuss the benefits and limitations of a workflow-based approach, and propose the hypothesis that a more flexible, exploratory approach is needed to better support the needs of scientists. In this approach, software support would be provided both for atomic and fragmentary experimentation (e.g. discovering some interesting properties of an individual data set by performing some individual task upon it) and for combining these earlier, fragmentary tasks in meaningful ways. To support this, an effective mechanism for managing provenance data is needed. We also propose that a knowledge-based approach could be employed to assist in the discovery of resources and tools, and in their combination into larger sequences. This process would be directed by an understanding of the user's goals.
We will present scenarios to illustrate the feasibility of our proposed approach.
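As an informal illustration of the exploratory approach outlined above, the following minimal Python sketch records provenance for individually run tasks and then combines previously run tasks into a reusable sequence. It is not the BiodiversityWorld implementation: the names `Session`, `ProvenanceRecord`, `compose_from_log`, and the two example tasks are hypothetical and were chosen only for this sketch.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ProvenanceRecord:
    """One entry in the provenance log: which task ran, on what, giving what."""
    task_name: str
    input_value: Any
    output_value: Any


@dataclass
class Session:
    """Hypothetical exploratory session: runs atomic tasks and logs provenance."""
    tasks: Dict[str, Callable[[Any], Any]] = field(default_factory=dict)
    log: List[ProvenanceRecord] = field(default_factory=list)

    def register(self, name: str, func: Callable[[Any], Any]) -> None:
        # Make an atomic task available under a name.
        self.tasks[name] = func

    def run(self, name: str, value: Any) -> Any:
        # Run a single task and record what was done.
        result = self.tasks[name](value)
        self.log.append(ProvenanceRecord(name, value, result))
        return result

    def compose_from_log(self, indices: List[int]) -> Callable[[Any], Any]:
        # Build a sequence from earlier, fragmentary steps in the log.
        names = [self.log[i].task_name for i in indices]

        def pipeline(value: Any) -> Any:
            for task_name in names:
                value = self.run(task_name, value)
            return value

        return pipeline


# Illustrative tasks on a toy data set (assumptions, not real project tools).
session = Session()
session.register("drop_missing", lambda recs: [r for r in recs if r is not None])
session.register("count_records", len)

# Fragmentary experimentation: two tasks run one at a time.
cleaned = session.run("drop_missing", [1.5, None, 2.0])
count = session.run("count_records", cleaned)

# Later, the logged steps are combined and applied to another data set.
pipeline = session.compose_from_log([0, 1])
new_count = pipeline([None, 3.0, 4.0, 5.0])
```

In this sketch the provenance log doubles as the source from which larger sequences are assembled, which is one simple way of relating the two kinds of software support described in the abstract; a knowledge-based component for suggesting tasks is not modelled here.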