Exploring the Efficacy of GPT-3.5 in Code Smell Detection
- LIU, YANG
- Advisor(s): Ahmed, Iftekhar
Abstract
Code smell is a widely recognized term in the software community, used to describe low-quality software designs. Code smells adversely affect the comprehensibility and maintainability of software systems, ultimately reducing their overall quality. Despite the development of various static analysis tools for identifying code smells, many complex code smells go undetected because they cannot be fully captured by fixed rules and patterns. The emergence of Large Language Models (LLMs) provides a new opportunity to identify code smells through natural language processing, which may be advantageous for detecting complex code smells. In this paper, we present experiments exploring the efficacy of GPT-3.5 in code smell detection, conducted on 14 Open-Source Software (OSS) projects totaling 9,740 Python files. We select three Python code smells to assess: too many parameters (TMP), too many nested blocks (TMNB), and unused variable (UV). We use the static analysis tool Pylint to automate the detection of these code smells and create a dataset. Subsequently, we fine-tune the GPT-3.5 model specifically for code smell detection. To evaluate the performance of the fine-tuned model, we apply it to four additional OSS projects. Our results demonstrate that GPT-3.5 has the potential to replace traditional static analysis tools in code smell detection, although it still requires more data and refinement to improve accuracy and reliability for certain smells. Additionally, we show that dataset size significantly influences the performance of GPT-3.5. We also present conjectures for future research in code smell detection and LLM fine-tuning.
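The three smells studied correspond to Pylint's `too-many-arguments` (R0913), `too-many-nested-blocks` (R1702), and `unused-variable` (W0612) checks. As an illustrative approximation only (not the paper's actual pipeline), the sketch below detects the same three smells with Python's `ast` module, assuming Pylint's default thresholds of 5 arguments and 5 nested blocks:

```python
import ast

MAX_ARGS = 5    # Pylint's default max-args threshold (R0913)
MAX_NESTED = 5  # Pylint's default max-nested-blocks threshold (R1702)

BLOCK_TYPES = (ast.If, ast.For, ast.While, ast.With, ast.Try)


def nesting_depth(node, depth=0):
    """Deepest chain of nested block statements under `node`."""
    best = depth
    for child in ast.iter_child_nodes(node):
        nxt = depth + 1 if isinstance(child, BLOCK_TYPES) else depth
        best = max(best, nesting_depth(child, nxt))
    return best


def detect_smells(source):
    """Return (smell, name) pairs for one Python source string."""
    smells = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        # Too many parameters (TMP)
        if len(node.args.args) > MAX_ARGS:
            smells.append(("TMP", node.name))
        # Too many nested blocks (TMNB)
        if nesting_depth(node) > MAX_NESTED:
            smells.append(("TMNB", node.name))
        # Unused variable (UV): names stored but never loaded
        stored, loaded = set(), set()
        for sub in ast.walk(node):
            if isinstance(sub, ast.Name):
                if isinstance(sub.ctx, ast.Store):
                    stored.add(sub.id)
                elif isinstance(sub.ctx, ast.Load):
                    loaded.add(sub.id)
        for name in sorted(stored - loaded):
            smells.append(("UV", name))
    return smells


example = (
    "def f(a, b, c, d, e, g):\n"
    "    unused = 1\n"
    "    return a + b\n"
)
print(detect_smells(example))  # flags TMP on f and UV on 'unused'
```

These per-file labels could then be paired with the source code to build fine-tuning examples; in the paper itself this labeling is done by Pylint rather than a hand-rolled checker.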