
Nightshades: the paradoxical plants

2 Posts tagged with the data_entry tag

As I mentioned in an earlier post on the Seeking Nightshades in South America blog, I am in North Carolina at Duke University to give a talk about Alfred Russel Wallace (whose centenary we celebrate this year with lots of exciting events at the Museum in London) at the opening of an exhibit featuring the Duke herbarium.


Collections associated with universities are under threat worldwide - they take up lots of space that could be given over to labs or teaching space, and are often not considered necessary for the research projects that attract funding. Larger collections like the ones we hold at the Natural History Museum in London, or those at Kew Gardens or the Royal Botanic Gardens Edinburgh sometimes have an easier time justifying themselves in the face of competing pressures, but not always!


The herbarium is thriving at Duke - with almost a million specimens, it is one of the largest plant collections associated with a university in the United States. There is new housing for the flowering plants, and the university administration values the herbarium as an integral part of the Department of Biology. This is a real success story.



New compactors for the flowering plant specimens in the basement of the celebrated Duke Phytotron building; curator Layne Huiet brings me more Solanum specimens to identify!


I have spent the last couple of days working in the collection here, and it has come home to me how important these specimens are, and how important it is that they are held in institutions of higher learning and not just given away to big national museums.


The value of university collections


The Duke collection is a resource for a person like me - who comes in and wants to see as many solanums (or whatever else they might be working on!) as possible, but also a resource for the students, who can use the collection on-site to explore questions about the natural world.


These university collections are also unique in that they hold the materials that have been generated as part of PhD studies, or first year field courses - they are a part of the legacy of the institution in the same way as is a library. Unlike a library though, students can be part of building a museum or herbarium collection through their own contributions. The plants they collect on field trips or field courses become permanent records and can help other students.


The Duke collections are strong in plants of the region - making the herbarium locally relevant - and in plants from Costa Rica, where a long association with the Organization for Tropical Studies brought projects and interest. So how did the botanists manage to get the herbarium seen as an asset and not a drain?


The power of positivity is at work here - the halls outside the herbarium are full of posters showing how herbarium specimens have been used to answer big questions in biology - climate change, invasive species, spread of disease - with big full-page PDFs of articles in those high-impact journals so beloved of those in charge. Kathleen Pryer, the director of the herbarium and a fern systematist (and the person who, with her student and another colleague, rather famously named a fern genus for Lady Gaga!!), markets the collections tirelessly, to good effect.


Rather than giving ties or engraved glass bowls to visiting dignitaries, Duke administrators now give away framed scans of beautiful herbarium specimens - and people love them! In fact, they are so in demand that the botanists are talking about going out and collecting some iconic local plants to make some new special prints - what a clever idea. Each scan comes with a note on its biological relevance, driving home the message. The exhibit that opens today is entitled 'Botanical Treasures from Duke's Hidden Library' - a very good analogy.




Framed scans of herbarium specimens from the Duke collections in a biology conference room



There is a lot of discussion worldwide about consolidation of collections, and some universities at least are deciding they no longer wish to keep the herbaria and museums that formed part of their past academic offering. Perhaps they are seen as old-fashioned, a drain on resources or just not relevant anymore. This is a mistake, I feel, as these collections are the starting place for many new questions - if they are made accessible and valued.


It is not necessarily easy, but university collections are special just because of where they are, in the thick of training the next generations not only of scientists, but of citizens.


Long may they last in their academic homes, inspiring students and being part of the fabric of universities everywhere......


Day 1 and 2 from Edinburgh


Like Sandy, I was impressed by the plant-inspired ending of Friday night’s ceremony! Whilst watching the inspiring event, I was looking through our database and seeing how big my part of the task ahead really is going to be…


As Sandy explained, simple differences in typing collectors’ names can result in two names being allocated to a single person – like the example of A. Fernandez and A. Fernández. The accent makes all the difference to the computer! The implications of such typos or spelling differences are what I’ll be focusing on this week.


Let me give an example of my job as the “duplicate hunter”, as I have named myself. If, for example, specimen “Fernandez 212” is being entered into our database, the database automatically checks whether other specimens (also known as duplicates) from the same collection event have already been entered. If another duplicate of the collection event has been entered as “Fernández 212” with the accent on the e, whilst the one being entered is missing it, these two specimens will become part of two separate collection events… Again we can’t blame the computer, as the names are not exactly identical!
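To illustrate the accent problem, here is a minimal Python sketch of one way to compare collector names while ignoring accents. It is not how our database performs its check; the function names `normalize_collector` and `event_key` are made up for this example, and treating accented and unaccented spellings as the same person is an assumption that a human should still confirm.

```python
import unicodedata


def normalize_collector(name: str) -> str:
    """Build a comparison key: accents removed, case folded, spaces collapsed."""
    # NFKD splits an accented letter into the base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks so that "é" compares equal to "e".
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.casefold().split())


def event_key(collector: str, number: str) -> tuple[str, str]:
    """Hypothetical key for a collection event: normalized collector plus number."""
    return (normalize_collector(collector), number.strip())


# The two spellings from the example above now map to the same key.
print(event_key("Fernández", "212") == event_key("Fernandez", "212"))
```

A key like this is only useful for flagging candidates: two different collectors whose names differ by nothing but an accent would also be matched, so each pair still needs checking by eye.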


So I went into our database and checked how many collection events are identical based on collection number (for example 212) and collection date (day, month and year). As collection numbers and dates are numerical, typos caused by alternative spellings do not generally cause issues (although see below), meaning that identical entries can be identified easily.
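The check described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (dictionaries with `id`, `collector`, `number` and `date` fields), the sample records and the function name `suspected_duplicates` are all invented for the example and do not reflect our actual database.

```python
from collections import defaultdict


def suspected_duplicates(records):
    """Group record ids by (collection number, collection date).

    Only groups with more than one record are returned, since those are
    the ones that might be duplicates of a single collection event.
    """
    groups = defaultdict(list)
    for rec in records:
        # The collector name is deliberately left out of the key, so
        # spelling differences in names cannot hide a match.
        groups[(rec["number"], rec["date"])].append(rec["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Made-up sample records; dates are (year, month, day) tuples.
records = [
    {"id": 1, "collector": "A. Fernandez", "number": "212", "date": (1987, 3, 14)},
    {"id": 2, "collector": "A. Fernández", "number": "212", "date": (1987, 3, 14)},
    {"id": 3, "collector": "B. Example", "number": "9001", "date": (2001, 7, 27)},
]

suspects = suspected_duplicates(records)
print(suspects)
```

Because the name is ignored, the result is a list of suspects rather than confirmed duplicates: two unrelated collectors can share a number and a date by chance.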


Using the above ever-so-clever but simple technique, I identified 1839 records that are potentially duplicated. Of course there is a large list of collections that are not true duplicates although they appear on our suspected list. These are collections that have, just by chance, the same number and collection date. A mere 1549 of the 1839 suspects are collections that lack a number; by old tradition these are all labelled “s.n.”, which means “without number” in Latin. What the letters s.n. truly stand for escapes me now – s. = sin, but n. = numero or numerus? Latin speakers will be able to help me out here…


Prior to our Plant Challenge, I did a spurt of duplicate spotting in our database over one quiet day. I found out that there are several kinds of error leading to duplications. Spelling mistakes or alternative spellings of collectors’ names are one cause, but alternative spellings of numbers are another, although a small one I grant you. There is a set of numbers which have been entered with an unnecessary leading 0, such as “012”, which appears simply as “12” in another duplicate entry. I plan to tackle these duplicates by filtering all collection numbers with “0” and then sorting in numerical order. There seem to be an additional 100 or so records to check there.
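As a rough sketch of the leading-zero clean-up, the Python below strips unnecessary zeros from purely numeric collection numbers and lists the padded ones in numerical order. It is a variation on the filter-and-sort plan rather than the exact procedure, and the function names `canonical_number` and `zero_padded` are invented for the example. Non-numeric entries such as “s.n.” are left untouched.

```python
def canonical_number(raw: str) -> str:
    """Strip unnecessary leading zeros from a purely numeric collection number."""
    stripped = raw.strip()
    if stripped.isdigit():
        # Keep a single "0" if the number consists only of zeros.
        return stripped.lstrip("0") or "0"
    # Anything non-numeric (for example "s.n.") is returned unchanged.
    return stripped


def zero_padded(numbers):
    """Return the numeric entries that carry leading zeros, in numerical order."""
    padded = [
        n.strip()
        for n in numbers
        if n.strip().isdigit() and n.strip() != canonical_number(n)
    ]
    return sorted(padded, key=int)


print(canonical_number("012"))
print(zero_padded(["012", "12", "0007", "s.n.", "100"]))
```

Comparing records on `canonical_number` rather than on the raw text would let “012” and “12” fall into the same group when checking for duplicates.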


And lastly there are ones where duplicates appear simply as identical duplicates. These are ones where the collector’s name and collection number appear perfectly identical, and truly are. Although we try to avoid entering duplicate records, it always happens, somehow.


Quite impressively, I have now tagged myself a list of 11 388 collection events to check and go through!!! By no means will all of these records represent true duplicate entries – our data set is relatively clean, we believe – but one never knows …