2022
A multi-level approach for hierarchical Ticket Classification
Matteo Marcuzzo, Alessandro Zangari, Michele Schiavinato, Lorenzo Giudice, Andrea Gasparetto, Andrea Albarelli
Proceedings of the Eighth Workshop on Noisy User-generated Text (W-NUT 2022)
The automatic categorization of support tickets is a fundamental tool for modern businesses. Such requests are most commonly composed of concise textual descriptions that are noisy and filled with technical jargon. In this paper, we test the effectiveness of pre-trained LMs for the classification of issues related to software bugs. First, we test several strategies to produce single, ticket-wise representations starting from their BERT-generated word embeddings. Then, we showcase a simple yet effective way to build a multi-level classifier for the categorization of documents with two hierarchically dependent labels. We experiment on a public bugs dataset and compare our results with standard BERT-based and traditional SVM classifiers. Our findings suggest that both embedding strategies and hierarchical label dependencies considerably impact classification accuracy.
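The two ideas named in the abstract, pooling word embeddings into a single ticket-wise vector and classifying over two hierarchically dependent labels, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify which pooling strategies or which multi-level architecture were used. The function names `pool_tokens` and `predict_two_level`, the three pooling strategies (mean, max, first token), the parent-then-child decoding rule, and all label names and scores below are assumptions chosen for illustration. Token vectors are plain Python lists standing in for BERT outputs.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def pool_tokens(token_vectors: Sequence[Vector], strategy: str = "mean") -> Vector:
    """Collapse per-token vectors into one ticket-level vector (hypothetical helper)."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    if strategy == "first":
        # Take the first position only, e.g. the [CLS] token in BERT.
        return list(token_vectors[0])
    # Transpose so that each tuple holds one embedding dimension across all tokens.
    columns = list(zip(*token_vectors))
    if strategy == "mean":
        return [sum(col) / len(col) for col in columns]
    if strategy == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown strategy: {strategy}")


def predict_two_level(
    parent_scores: Dict[str, float],
    child_scores: Dict[str, float],
    parent_of: Dict[str, str],
) -> Tuple[str, str]:
    """Pick the top-level label first, then the best sub-label under it.

    This is one simple way to respect a two-level label hierarchy (an assumption,
    not necessarily the scheme used in the paper).
    """
    parent = max(parent_scores, key=parent_scores.get)
    allowed = [child for child in child_scores if parent_of[child] == parent]
    if not allowed:
        raise ValueError(f"no child labels registered under parent {parent!r}")
    child = max(allowed, key=child_scores.get)
    return parent, child


if __name__ == "__main__":
    # Two toy 2-dimensional "token embeddings" for one ticket.
    tokens = [[1.0, 4.0], [3.0, 0.0]]
    print(pool_tokens(tokens, "mean"))   # [2.0, 2.0]
    print(pool_tokens(tokens, "max"))    # [3.0, 4.0]
    print(pool_tokens(tokens, "first"))  # [1.0, 4.0]

    # Made-up classifier scores and label hierarchy.
    parent_scores = {"UI": 0.7, "Backend": 0.3}
    child_scores = {"Layout": 0.2, "Theme": 0.3, "Database": 0.5}
    parent_of = {"Layout": "UI", "Theme": "UI", "Database": "Backend"}
    print(predict_two_level(parent_scores, child_scores, parent_of))  # ('UI', 'Theme')
```

In the toy example, a flat classifier over sub-labels would pick "Database", which contradicts the most likely top-level label "UI". Decoding the parent first and restricting the sub-labels to its children keeps the two predictions consistent, which is the kind of label dependency the abstract refers to.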