
So it seems that everybody just duplicates everything… As Sandy wrote from Étape Eggplant in Montfavet, last week’s focus on eggplants revealed that duplicates are an issue not only in our database but in any collection, including the world’s seed bank collections. So what is it about duplicates that makes them an issue?

 

My role last week included gathering data on eggplant wild relatives, including both records of their natural distribution and currently available germplasm collections in seed banks. The idea of the gap analysis is to see whether eggplant crop wild relative (CWR) diversity is conserved in seed bank collections, and to identify where new collections should be made. It is a fascinating job – not only does it surprise you that such basic questions have not been looked into before, but the job also turns out to be more complex than you would expect at first.

 

It is easy to assume that we (or somebody high up in one of the United Nations offices) know what is currently conserved in seed bank collections around the world. Seed banks are, after all, cornerstones of food security, banking seeds for the future in case something were to happen. So somebody is counting what is being conserved across the globe, right? The answer is not that simple. For the ten or so most important food crops, specialised research institutes have been established to secure conservation and management of these important crops. For others, such as eggplant, the story is different.

 

The world’s major seed banks, or genebanks as they are often called, all have their focus groups or focal areas. The bulk of these genebanks belong to CGIAR, the Consultative Group on International Agricultural Research, which is an informal network of 16 international agricultural research centres. Together, these 16 institutions manage c. 600,000 agricultural seed samples – that is quite a lot! On top of these international giants, there are national or more regional genebanks, such as the United States Department of Agriculture’s National Plant Germplasm System (NPGS), and the Dutch national seed bank at Wageningen University, the Centre for Genetic Resources, the Netherlands (CGN). Despite their different sizes and slightly different mandates, these seed banks all share the aim of preserving crop diversity.

 

Picture of the Svalbard Global Seed Vault: located in Spitsbergen, an island north of Norway, the Vault is storing seeds for the future.

 

The idea behind seed banking is to conserve diversity for future needs, whether short or long term. These needs change through time. Currently, the focus is on breeding crops to prepare them for changing climates, as well as for increased resistance to disease and environmental stress. What needs to be analysed, however, is whether the accessions currently conserved in seed banks have potential in breeding efforts for these priority traits. Gap analyses, such as our project on eggplants and their wild relatives, have been developed to answer these questions and play an important role in understanding the potential of currently banked accessions for these priority breeding targets.

 

How does this all relate to duplications? Well, from my short experience in this field, I have learnt that many of our global seed collections are actually duplicates. I have now worked on both tomato and eggplant CWR collections, downloading germplasm data from individual genebanks and then merging them into a single file to prepare them for the gap analysis. Seed banks exchange material, not only for diplomatic reasons but for the very important reason of securing their collections. Such duplications are called safety duplications, and are mandatory for large genebanks. This ensures that if something were to happen for political (wars do happen, unfortunately) or environmental reasons (e.g. earthquakes), not all the eggs are in one basket.

 

What does this mean for food security? Well, firstly it is great that if something were to happen, duplicates are safe somewhere else in the world. But the other side of the coin is that, although the total looks large at first sight, the actual number of unique seed collections conserved around the world is not that big after all. Conserving and preserving seeds in long-term storage is costly, and this reduces the amount that can be kept in the banks.

 

My job this week continues to focus on pruning some of these duplicates out. For the sake of our gap analysis, we want to make sure we represent the global collections realistically. Some argue that duplicates can be viewed as unique collections because seeds have to be regenerated at regular time intervals and this process leads to slight differences between the duplicate collections. From the point of view of unique genetic resources that could be used for adapting our crops to climate change, we need truly unique collections that represent extreme environmental adaptations. Complex traits such as drought resistance or salt tolerance do not vary between duplicates – they evolve over hundreds if not thousands or even millions of years.


Sandy and Tiina, in Montfavet, near Avignon, France 

 

We had a small diversion in our sporting calendar to the south of France; we were invited to participate in a meeting convened by the Global Crop Diversity Trust (http://www.croptrust.org/) to discuss how the diversity of eggplant wild relatives could be conserved. The eggplant or aubergine is a species of Solanum, Solanum melongena, and although it does not look much like potato (Solanum tuberosum) or tomato (Solanum lycopersicum), it does have genetic similarities to them.

 

We worked hard. Our days were not quite as physically demanding as the Tour de France ascent of the nearby Mont Ventoux (left out of the 2012 Tour), but we made a lot of progress! One fascinating fact is that, when speaking English, French biologists refer to our vegetable aubergine as the eggplant – it just goes to show how language exchange can be pretty random!

 

Christine Daunay of INRA (http://www.international.inra.fr/), Jaime Prohens of the Universitat Politècnica de València (http://www.upv.es), Hannes Dempewolf of the Trust, Tiina and I talked about the taxonomy of wild eggplants (based on the work of Maria Vorontsova), the state of seed collections of wild eggplants and what eggplant breeders need to improve the crop, especially in the face of climate change.

 

Despite the eggplant’s importance as a world crop, plant breeders working on it are few and far between – it is clear from the meeting that future collaboration with colleagues in India and China – the home of eggplant domestication – will be critical. The wild relatives of the crop are from Africa, and future collecting to understand their ranges and environmental tolerances will be important for eggplant improvement. This is where the world of taxonomy intersects with the world of plant breeding and agriculture – knowledge from wild relatives can really help with problems faced by those in agriculture, not just in terms of genes that can be introduced into crops from wild relatives, but in understanding adaptation to different environments and habitats.

 

Our colleague Christine keeps a collection of seeds of wild relatives of eggplants, and we toured her fields and greenhouses – reminding us that Solanum species are more amazing than imaginable – truly paradoxical plants!

 

We returned to London and Edinburgh via TGV (Train à Grande Vitesse – and it is really fast!) through the French countryside and came back to an excited Britain – lots had happened while we were deep in discussing food. Time to catch up with all the other action!



Day 1 and 2 from Edinburgh

 

Like Sandy, I was impressed by the plant-inspired ending of Friday night’s ceremony! Whilst watching the inspiring event, I was looking through our database and seeing how big my part of the task ahead really is going to be…

 

As Sandy explained, simple differences in typing collectors’ names can result in two names being allocated to a single person – like the example of A. Fernandez and A. Fernández. The accent makes all the difference to the computer! The implications of such typos or spelling differences are what I’ll be focusing on this week.

 

Let me give an example of my job as the “duplicate hunter”, as I have named myself. If, for example, specimen “Fernandez 212” is being entered into our database, the database performs an automated check of whether other specimens (also known as duplicates) of the collection event have already been entered. If another duplicate of the collection event has been entered as “Fernández 212” with the accent on the a, whilst the one being entered is missing it, these two specimens will become part of two separate collection events… Again we can’t blame the computer, as the names are not exactly identical!
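To make the Fernandez/Fernández problem concrete, here is a minimal sketch in Python of how two spellings could be compared with accents and capitalisation ignored. It is only an illustration and not how our database does its check; the function name `name_key` is made up for the example.

```python
import unicodedata

def name_key(name: str) -> str:
    """Return an accent- and case-insensitive key for a collector name.

    Hypothetical helper for illustration; not part of any database software.
    """
    # NFKD splits "á" into a plain "a" followed by a combining acute accent.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks, keeping only the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Ignore case and collapse runs of whitespace.
    return " ".join(stripped.casefold().split())

print("A. Fernández" == "A. Fernandez")                      # False
print(name_key("A. Fernández") == name_key("A. Fernandez"))  # True
```

A key like this can only flag candidates: two different collectors may well share a name once the accents are stripped, so a person still has to check each match.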

 

So I went into our database and checked how many collection events are identical based on collection number (for example 212) and collection date (day, month and year). As collection numbers and dates are numerical, typos caused by alternative spellings do not generally cause issues (although see below), meaning that identical entries can be identified easily.
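The check described above boils down to grouping collection events by number and date and keeping the groups with more than one member. A rough Python sketch of that idea follows; the records, the tuple layout and the function name are invented for illustration and do not reflect how the data are actually stored.

```python
from collections import defaultdict

# Made-up records: (event id, collector, collection number, (year, month, day)).
events = [
    (1, "A. Fernandez", "212", (1998, 3, 14)),
    (2, "A. Fernández", "212", (1998, 3, 14)),
    (3, "B. Example", "212", (2005, 7, 2)),
    (4, "C. Example", "4011", (2010, 1, 20)),
]

def suspected_duplicates(records):
    """Group event ids by (number, date); keep groups with more than one event."""
    groups = defaultdict(list)
    for event_id, _collector, number, date in records:
        # The collector name is deliberately ignored, so spelling variants
        # of the same name cannot hide a match.
        groups[(number, date)].append(event_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}

print(suspected_duplicates(events))
# {('212', (1998, 3, 14)): [1, 2]}
```

Events 1 and 2 are flagged even though the collector is spelt differently, while event 3 shares the number 212 but not the date and is left alone.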

 

Using the above ever-so-clever but simple technique, I identified 1839 records that are potentially duplicated. Of course, there is a long list of collections that appear on our suspect list but are not true duplicates. These are collections that have, just by chance, the same number and collection date. No fewer than 1549 of the 1839 suspects are collections that lack a number; by old tradition these are all labelled “s.n.”, which means “without number” in Latin. What the letters s.n. truly stand for escapes me now – s. = sin, but n. = numero or numerus? Latin speakers will be able to help me out here…
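Setting the unnumbered collections aside leaves a much shorter list. As a small sketch, assuming “s.n.” might also turn up as “S.N.” or “sn” (an assumption on my part, not something checked in the database), the filter and the arithmetic could look like this in Python:

```python
def is_unnumbered(number: str) -> bool:
    """True for 's.n.' and the assumed variants 'S.N.', 's. n.' and 'sn'."""
    return number.replace(".", "").replace(" ", "").casefold() == "sn"

suspected = 1839   # records flagged by number and date
unnumbered = 1549  # of those, labelled "s.n."

print(is_unnumbered("s.n."), is_unnumbered("212"))  # True False
print(suspected - unnumbered)                       # 290
```

That leaves 290 suspects that carry a real collection number.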

 

Prior to our Plant Challenge, I did a spurt of duplicate spotting in our database over one quiet day. I found that several kinds of error lead to duplications. Spelling mistakes or alternative spellings of collectors’ names are one cause, and alternative spellings of numbers are another, although a small one I grant you. There is a set of numbers that have been entered with an unnecessary 0, such as “012”, which appears simply as “12” in another duplicate entry. I plan to tackle these duplicates by filtering all collection numbers with “0” and then sorting in numerical order. There seem to be an additional 100 or so records to check there.
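As an alternative to filtering and sorting by eye, the leading zeros could be stripped before numbers are compared. The Python sketch below shows the idea; `number_key` is a made-up name, and leaving anything that is not purely digits (such as “s.n.” or “212a”) untouched is my own assumption about what would be safe.

```python
def number_key(number: str) -> str:
    """Strip leading zeros from purely numeric collection numbers."""
    stripped = number.strip()
    if stripped.isascii() and stripped.isdigit():
        # "012" becomes "12"; a number made only of zeros becomes "0".
        return stripped.lstrip("0") or "0"
    # Anything else is left alone rather than guessed at.
    return stripped

print(number_key("012") == number_key("12"))  # True
print(number_key("s.n."))                     # s.n.
```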

 

And lastly there are ones where duplicates appear simply as identical duplicates. These are ones where the collector’s name and collection number appear perfectly identical, and truly are. Although we try to avoid entering duplicate records, it always happens, somehow.

 

Quite impressively, I have now tagged myself a list of 11 388 collection events to check and go through!!! By no means will all of these records represent true duplicate entries – our data set is relatively clean we believe – but one never knows …


Plant Challenge - Let's begin!

Posted by Tiina Jul 23, 2012

A certain major sporting event will get under way this Friday and we'll be having our own celebration by launching our own Plant Challenge!

 

Here in the Solanaceae team we will be writing daily blogs about our activities. We have set ourselves a goal – a challenging goal we hope to achieve but in order to do so we might need a bit of luck and lots of hard work! The great big goal is to clean and update our ever-growing BRAHMS database, which holds the data needed for running the great Solanaceae Source website, soon to be updated to Scratchpads 2. This is not a small task by any means: the database currently includes 60,005 collection events, 72,301 individual specimen entries, 16,759 collectors’ names, 13,565 species names, 19,318 gazetteer entries, and 71,345 species determination records!

 

Between Friday 27 July and Friday 10 August you can follow up on our progress and hear how our efforts are going. Our Team consists of three people: Mamen (Maria Peña Chocarro), Sandy, and Tiina. Mamen will be in charge of geography, Sandy is focusing on cleaning collectors, nomenclature, and literature, and Tiina is taking on data entry and unifying data records. Despite months of hard and strenuous training, the contestants are feeling nervous yet incredibly excited! One thing is for sure - the journey will be full of surprises, as you never know what one finds inside the big matrix!!!

 

The team will use a “divide and conquer” strategy to tackle the mammoth task. Whilst Mamen and Sandy will stay at the project headquarters in London, Tiina will be sent to the Royal Botanic Garden Edinburgh to establish a remote base for the operations. The equipment for the task will include three laptops, three internet connections, and three desks. Coordination of research will be done through email and phones.

 

Whether you are a scientist or a keen natural historian, join us in our efforts in the Plant Challenge! Send your comments to our blog, with links to your own planty challenge feat!

 
