Open Access Publications from the University of California

UC Irvine

UC Irvine Electronic Theses and Dissertations

Improving the Retrieval of Related Questions in StackOverflow

Creative Commons 'BY-NC-SA' version 4.0 license

StackOverflow is a very popular Q&A website, widely used by software developers. Developers can either post their coding questions to be answered by other developers, or explore existing questions and their answers to find the solution they are looking for. It is therefore very helpful for StackOverflow users to find questions that are related to their topic and that might contain either the exact answer or hints that help resolve the issue. Better retrieval of such questions reduces the time users spend trying different keywords in search engines or waiting on Q&A websites for other users to reply.

In this work, I aim to improve the related posts retrieved for each question on the StackOverflow website through information retrieval techniques customized for this website. The approach is based on text similarity algorithms applied to the content of posted questions, including both normal text and code. First, I configured the algorithm based on a manual evaluation of its results on a small dataset. Then, I performed a user study with professional developers and graduate students in software engineering to evaluate the approach, using the best configuration from the previous step, against the existing related posts shown on StackOverflow. The results of this study revealed that my approach performs better than the algorithm used on the StackOverflow website. Moreover, statistical analysis of the results showed that this improvement is statistically significant.
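
As a rough illustration of what text similarity over both normal text and code can look like, the Python sketch below ranks posts by a weighted sum of two TF-IDF cosine similarities, one over the prose and one over the code. This is an assumption-laden toy, not the thesis's actual algorithm or configuration: the abstract does not name the similarity measure, so TF-IDF with cosine similarity is swapped in here, and the tokenizer, the `text_weight` parameter, the `rank_related` function, and the sample `POSTS` are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(s):
    # Hypothetical tokenizer: lowercase identifier-like tokens only.
    return re.findall(r"[a-z_][a-z0-9_]*", s.lower())


def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (a dict) per document.
    counts_per_doc = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    df = Counter()
    for counts in counts_per_doc:
        df.update(counts.keys())
    # Smoothed inverse document frequency, so no term gets weight zero.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in counts.items()}
            for counts in counts_per_doc]


def cosine(a, b):
    # Cosine similarity between two sparse vectors; 0.0 if either is empty.
    if not a or not b:
        return 0.0
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b)


def rank_related(query_index, posts, text_weight=0.7):
    # Assumed scoring: weighted sum of prose similarity and code similarity.
    text_vecs = tfidf_vectors([p["text"] for p in posts])
    code_vecs = tfidf_vectors([p["code"] for p in posts])
    scores = []
    for i in range(len(posts)):
        if i == query_index:
            continue
        score = (text_weight * cosine(text_vecs[query_index], text_vecs[i])
                 + (1 - text_weight) * cosine(code_vecs[query_index], code_vecs[i]))
        scores.append((i, score))
    # Highest combined similarity first.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Made-up sample posts, for illustration only.
POSTS = [
    {"text": "How do I read a file line by line in Python",
     "code": "with open(path) as f:\n    for line in f: print(line)"},
    {"text": "Reading a text file line by line in Python",
     "code": "for line in open(name): print(line)"},
    {"text": "Center a div with CSS",
     "code": "div { margin: 0 auto; }"},
]

if __name__ == "__main__":
    for index, score in rank_related(0, POSTS):
        print(index, round(score, 3))
```

In a sketch like this, the relative weight given to prose versus code is the kind of setting that a manual evaluation on a small dataset could be used to tune.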
