Clustering and Sentiment Analysis of Customer NVH Feedback in the Automotive Domain: A Machine Learning Pipeline to Facilitate Extraction of Relevant Information from Large-Scale Textual Data

dc.contributor.author: Ekholm, Tomas
dc.contributor.department: Chalmers tekniska högskola / Institutionen för elektroteknik [sv]
dc.contributor.examiner: Brännström, Fredrik
dc.contributor.supervisor: Deshmukh, Shubhada
dc.date.accessioned: 2025-09-24T09:37:33Z
dc.date.issued: 2025
dc.date.submitted:
dc.description.abstract: Noise, Vibration, and Harshness (NVH) attributes play a significant role in shaping overall customer satisfaction with a vehicle. Automotive manufacturers often collect large volumes of textual customer feedback through surveys, offering valuable insights into how various vehicle attributes are perceived. This information is intended to help engineering teams make informed decisions about where to focus vehicle improvement efforts. However, its unstructured nature and scale make it difficult for individual teams to extract the feedback relevant to them. This thesis investigated the feasibility of a clustering and sentiment analysis pipeline to support NVH teams in making better use of customer feedback. The proposed pipeline combined sentence embeddings, dimensionality reduction, and clustering to group semantically similar feedback. Cluster labels were automatically generated using a large language model and manually refined when necessary. Both sentence-based and aspect-based sentiment analysis were applied to quantify sentiment and extract relevant subtopics for each cluster. The final configuration produced 15 semantically coherent clusters from approximately 36,000 customer feedback sentences. These clusters captured distinct themes, ranging from high-level impressions of driving and ownership to specific issues regarding individual components. Sentence-level sentiment analysis successfully distinguished between positive and negative feedback, showing its potential to guide improvement efforts. In contrast, aspect-based sentiment analysis was less reliable: although per-cluster aspect distributions often aligned with cluster themes, individual aspect terms were too frequently inaccurate. Nonetheless, the method shows potential, and its effectiveness could likely be substantially enhanced through domain-specific fine-tuning.
Overall, the pipeline effectively facilitated the identification of relevant feedback and could aid future data-driven design and product improvement efforts.
dc.identifier.coursecode: EENX30
dc.identifier.uri: http://hdl.handle.net/20.500.12380/310518
dc.language.iso: eng
dc.relation.ispartofseries: 00000
dc.setspec.uppsok: Technology
dc.subject: Clustering, Deep-Learning, HDBSCAN, K-Means, LCF-ATEPC, Machine-Learning, Natural Language Processing, PyABSA, Sentiment Analysis.
dc.title: Clustering and Sentiment Analysis of Customer NVH Feedback in the Automotive Domain: A Machine Learning Pipeline to Facilitate Extraction of Relevant Information from Large-Scale Textual Data
dc.type.degree: Examensarbete för masterexamen [sv]
dc.type.degree: Master's Thesis [en]
dc.type.uppsok: H
local.programme: Engineering mathematics and computational science (MPENM), MSc
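
The abstract names three grouping stages (sentence embeddings, dimensionality reduction, clustering) without detailing them. The following is a minimal, standard-library-only sketch of how such stages might chain together; it is not the thesis implementation. Every function name is hypothetical, and each stage is a simplified stand-in chosen as an assumption: bag-of-words vectors replace real sentence embeddings, a Gaussian random projection replaces whichever reduction method the thesis used, and plain k-means is used because K-Means appears among the keywords (HDBSCAN is listed there as well). The example sentences are invented for illustration and are not taken from the thesis data. Label generation and sentiment analysis are not covered.

```python
import math
import random
from collections import Counter


def embed(sentences):
    """Stand-in for sentence embeddings: L2-normalised bag-of-words vectors."""
    tokenised = [s.lower().split() for s in sentences]
    vocab = sorted({w for toks in tokenised for w in toks})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for toks in tokenised:
        v = [0.0] * len(vocab)
        for w, count in Counter(toks).items():
            v[index[w]] = float(count)
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vectors


def reduce_dim(vectors, dim, seed=0):
    """Stand-in for dimensionality reduction: Gaussian random projection."""
    rng = random.Random(seed)
    n_features = len(vectors[0])
    proj = [[rng.gauss(0.0, 1.0) / math.sqrt(dim) for _ in range(n_features)]
            for _ in range(dim)]
    return [[sum(p * x for p, x in zip(row, v)) for row in proj] for v in vectors]


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster went empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_feedback(sentences, k, dim=8, seed=0):
    """Chain the three stages: embed -> reduce -> cluster."""
    return kmeans(reduce_dim(embed(sentences), dim, seed), k, seed=seed)


# Invented example sentences, for illustration only.
feedback = [
    "wind noise is loud on the highway",
    "road noise is loud at high speed",
    "the seats are very comfortable",
    "comfortable seats on long trips",
]
for label, sentence in zip(cluster_feedback(feedback, k=2), feedback):
    print(label, sentence)
```

With a fixed `seed` the labels are reproducible, but the grouping quality of these stand-ins on toy data is not guaranteed; a real pipeline would substitute a trained sentence-embedding model and a tuned clustering configuration.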

Download

Original bundle

Name: Chalmers_University_of_Technology__Masters_Degree_Project_Report_revised_20250815.pdf
Size: 5.06 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.35 KB
Format: Item-specific license agreed upon to submission