Pharmaceutical assay search with AI
dc.contributor.author | Alladin, Ali | |
dc.contributor.department | Chalmers tekniska högskola / Institutionen för matematiska vetenskaper | sv |
dc.contributor.examiner | Jonasson, Johan | |
dc.contributor.supervisor | Jonasson, Johan | |
dc.date.accessioned | 2024-11-28T12:42:44Z | |
dc.date.available | 2024-11-28T12:42:44Z | |
dc.date.issued | 2024 | |
dc.date.submitted | ||
dc.description.abstract | Retrieving historical assay data in pharmaceutical research is often restricted by reliance on specific metadata, overlooking the contextual information in associated protocol documents. This thesis investigates the potential of utilizing these plain English protocol documents alongside Natural Language Processing (NLP) techniques to implement semantic search for assays. A baseline TF-IDF model and the Transformer models BERT, SBERT, and Longformer were used to get embeddings of protocol documents from a corpus of historical protocols. Their performance in retrieving relevant historical protocols was evaluated based on key technical criteria, where the TF-IDF models and BERT using the chunking technique showed the best results. However, limitations in the evaluation scope introduce some uncertainty to the findings, highlighting the need for more rigorous validation. Nevertheless, the conclusions suggest that integrating NLP-driven semantic search systems could reduce the time and manual effort required for assay retrieval, even though the current approach may need further refinement for practical application. These insights are a promising foundation for developing AI-powered search systems used for pharmaceutical texts. | |
dc.identifier.coursecode | MVEX03 | |
dc.identifier.uri | http://hdl.handle.net/20.500.12380/309014 | |
dc.language.iso | eng | |
dc.setspec.uppsok | PhysicsChemistryMaths | |
dc.subject | Pharmaceutical texts, Assays, Semantic Textual Similarity (STS), Artificial Intelligence (AI), Natural Language Processing (NLP), Large Language Model (LLM), TF-IDF, BERT, SBERT and Longformer | |
dc.title | Pharmaceutical assay search with AI | |
dc.type.degree | Examensarbete för masterexamen | sv |
dc.type.degree | Master's Thesis | en |
dc.type.uppsok | H | |
local.programme | Data science and AI (MPDSC), MSc |