Classification of the Sex of Drosophila Suzukii with Pre-Trained Networks
eScholarship
Open Access Publications from the University of California

UC Santa Cruz

UC Santa Cruz Electronic Theses and Dissertations


Creative Commons 'BY-SA' version 4.0 license
Abstract

There has been a recent trend toward applying deep learning methods, rather than shallow methods, to the automatic identification of insects. Classification strategies built around deep learning architectures such as YOLO require large amounts of data for learning to succeed, and are often augmented with tens of thousands of images or more to achieve excellent performance. Recent pre-trained deep neural network models have significantly reduced the amount of data required to create accurate classification algorithms: they are trained on a huge data set different from that of the target task, and the resulting encoding is used to transfer information to the new task. This work shows that models pre-trained on huge data sets are effective as image encoders for classifying the sex of spotted wing drosophila (SWD). A data set of 676 SWD microscope images is created to evaluate classification models for use in automating the sterile insect technique (SIT), which requires large numbers of male SWD to be identified and separated. Binary classification models trained on top of image encodings from new models based on vision transformers [3] pre-trained on over 400 million images with CLIP [2] achieve accuracy as high as 96.7% when trained with logistic regression (LogReg) and similar classifiers on augmented data from the SWD image set. Other models pre-trained on the ImageNet data set of 14 million images also performed well, approaching 92% with VGG models and 90% with a MobileNetV2 model. Image segmentation of the data set is then investigated as a source of corroboration for identifying the morphological features responsible for classification, and an out-of-distribution data set is collected to evaluate classification and segmentation results on more diverse and difficult examples.
While identification of features specific to SWD remains robust, classification accuracy is not guaranteed on data that differs substantially from the factory or laboratory setting on which the models are trained, and additional training data may be needed for use cases outside of SIT, such as applications on the farm or automated identification in insect traps. This emphasizes a fact that is not elaborated on for many insect detection models in the literature: the models are unlikely to be robust when the data is significantly out-of-distribution (OOD), and some situations may not be adequately covered without specialized augmentation methods or additional data. Nonetheless, the results indicate that pre-trained models have advanced to the point where they can play a central role in securing the food supply from potentially billions of dollars of damage every year from pests such as SWD.
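
The core idea in the abstract, training a simple binary classifier on top of fixed image encodings, can be illustrated with a minimal sketch. This is not the thesis code: the function `fake_embedding` is a hypothetical stand-in for a frozen pre-trained encoder (such as a CLIP or ImageNet model), the embedding dimension, the label convention (1 = male) and the train/test split are assumptions, and the logistic regression is a small hand-written gradient-descent version rather than any particular library's implementation. Only the data set size of 676 is taken from the abstract, and the accuracy it prints describes synthetic data, not SWD images.

```python
import math
import random


def fake_embedding(label, dim, rng):
    """Hypothetical stand-in for a frozen pre-trained image encoder.

    Returns a vector whose mean depends on the class, plus Gaussian noise.
    In a real pipeline this would be the encoder's output for one image.
    """
    shift = 0.5 if label == 1 else -0.5
    return [shift + rng.gauss(0.0, 1.0) for _ in range(dim)]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logreg(X, y, lr=0.1, epochs=200, l2=1e-3):
    """Fit L2-regularised logistic regression by batch gradient descent."""
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(dim):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 (assumed: male) or 0 (assumed: female) for one embedding."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


def accuracy(w, b, X, y):
    return sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(y)


rng = random.Random(0)                      # fixed seed for repeatability
labels = [i % 2 for i in range(676)]        # 676 matches the data set size
features = [fake_embedding(lab, 16, rng) for lab in labels]

split = 540                                 # assumed train/test split
w, b = train_logreg(features[:split], labels[:split])
test_acc = accuracy(w, b, features[split:], labels[split:])
print(f"held-out accuracy on synthetic embeddings: {test_acc:.3f}")
```

Because the encoder is frozen, only the small linear classifier is fitted, which is why this style of transfer learning can work with a few hundred labelled images.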
