Recognising butterflies with deep learning

Fontys Information and Communication Technology

Students help Naturalis recognise butterflies with deep learning

Monitoring biodiversity is a challenge, and for Naturalis that challenge lies mainly in the data. The biodiversity centre relies on observations by volunteers for its research into the distribution and movement of, for example, butterfly species. Volunteers photograph butterflies, and experts then classify the photographs by species. In this way Naturalis gets good data without having to go into the field itself. But what if artificial intelligence (AI) could convert that data directly into useful insights?

Citizen Science 
Contributing to scientific research without being a researcher is common practice. This is called citizen science: volunteers share data on archaeological finds or on the flora and fauna they encounter in nature. This data, including coordinates and date, is shared via a platform or app. Image Recognition Models (IRM) provide identification through the apps Iobs (iOS) and ObsIdentify (Android). But identifying what is in the photo can be difficult if the image quality is poor. With butterfly species, which were the focus of this project, there are also cases where two species cannot be distinguished from each other. In addition, observations are not made everywhere, which creates gaps in the data.

Species Distribution Models
Species distribution models can help with this. Based on geographical and climate-related data, such a model could predict the likelihood of a species occurring at a given location. This would provide volunteers with an immediate identification of their observation and improve data quality for Naturalis. Moreover, it would allow predictions to be made about population distribution (e.g., based on climate change), and thus fill in the gaps in observations. Developing such a model with deep learning was the assignment for Fontys Hogeschool ICT students Max de Goede, Lars van Driel, Pol Roskam and Jochem Wienk.
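
To make the idea concrete, here is a minimal sketch in Python. It is not the students' model: it swaps in a simple logistic function for the deep learning network, and the species labels, feature names and weights are all hypothetical and hand-picked for illustration. It shows how an occurrence probability per location could, in principle, be combined with the scores of an image recognition model when two look-alike species are equally plausible from the photo.

```python
import math


def occurrence_probability(features, weights, bias):
    """Logistic model: probability that a species occurs at a location."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def rerank(image_scores, location_priors):
    """Multiply image scores by location priors and renormalise to sum to 1."""
    combined = {s: image_scores[s] * location_priors[s] for s in image_scores}
    total = sum(combined.values())
    return {s: value / total for s, value in combined.items()}


# Hypothetical weights and biases; a real model would learn these from data.
MODELS = {
    "species_a": ({"altitude_km": -2.0, "mean_temp_c": 0.3}, -3.0),
    "species_b": ({"altitude_km": 1.5, "mean_temp_c": -0.1}, 0.0),
}

# Made-up geo-factors for the place where the photo was taken.
location = {"altitude_km": 0.1, "mean_temp_c": 11.0}

priors = {
    name: occurrence_probability(location, weights, bias)
    for name, (weights, bias) in MODELS.items()
}

# The image model cannot tell the two species apart.
image_scores = {"species_a": 0.5, "species_b": 0.5}
posterior = rerank(image_scores, priors)
print(posterior)
```

In this toy setup the location breaks the tie between the two species. The same probabilities, evaluated for places without observations, are what would fill the gaps in the data.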

Deep learning for better insights
The students developed the tooling to process and merge data for this model: a 'data pipeline' that can be extended in the future. Two datasets were available for this, says student Max de Goede: "The first set consisted of observations of butterfly species by volunteers via images, including species, coordinates and date. The second set of data was collected by Naturalis itself with geo-factors, such as altitude, climate, and other variables."
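
How such a merge step might look can be sketched in a few lines of Python. This is an illustration under assumptions, not the students' pipeline: the field names, the grid size and the example rows are all made up, and matching on a coarse coordinate grid is just one simple way to link an observation to the geo-factors of its location.

```python
import math


def grid_cell(lat, lon, size=0.1):
    """Map coordinates to a coarse grid cell (size in degrees, assumed)."""
    return (math.floor(lat / size), math.floor(lon / size))


def merge_datasets(observations, geo_factors, size=0.1):
    """Attach geo-factors to each observation that falls in the same grid cell."""
    factors_by_cell = {}
    for row in geo_factors:
        extra = {k: v for k, v in row.items() if k not in ("lat", "lon")}
        factors_by_cell[grid_cell(row["lat"], row["lon"], size)] = extra

    merged, unmatched = [], []
    for obs in observations:
        factors = factors_by_cell.get(grid_cell(obs["lat"], obs["lon"], size))
        if factors is None:
            unmatched.append(obs)  # no geo-factors known for this cell
        else:
            merged.append({**obs, **factors})
    return merged, unmatched


# Made-up example rows; the real datasets are not public in this article.
observations = [
    {"species": "Vanessa atalanta", "lat": 52.16, "lon": 4.49, "date": "2021-07-01"},
    {"species": "Pieris rapae", "lat": 50.85, "lon": 5.69, "date": "2021-07-03"},
]
geo_factors = [
    {"lat": 52.15, "lon": 4.45, "altitude_m": 2.0, "mean_temp_c": 10.5},
]

merged, unmatched = merge_datasets(observations, geo_factors)
print(merged)
print(len(unmatched), "observation(s) without geo-factors")
```

Keeping the unmatched observations separate, rather than dropping them silently, makes it visible where the geo-factor data does not yet cover the observations.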

Check out the video from the series Eyes on AI.