Key Sentence Extraction From CRISPR-Cas9 Articles Using Sentence Transformers

dc.contributor.author: Stranden Lae, Brage
dc.contributor.author: Henningsson, Sandra
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data och informationsteknik (sv)
dc.contributor.department: Chalmers University of Technology / Department of Computer Science and Engineering (en)
dc.contributor.examiner: Johansson, Richard
dc.contributor.supervisor: Farahani, Mehrdad
dc.date.accessioned: 2023-11-06T14:03:16Z
dc.date.available: 2023-11-06T14:03:16Z
dc.date.issued: 2023
dc.date.submitted: 2023
dc.description.abstract: The annotation of CRISPR-related articles and the extraction of their key content have traditionally relied on manual effort, which is error-prone and time-consuming. This thesis presents an alternative approach using transfer learning and pre-trained models based on the Transformer architecture. Specifically, Sentence Transformer models are fine-tuned on a CRISPR-related dataset. The dataset contains articles and key sentences, enabling automatic extraction of keyphrases. The study explores various modifications to the models and data to enhance performance for this task. The results demonstrate the effectiveness of fine-tuning Sentence Transformer models for keyphrase extraction, achieving an Average R-precision of 90.4 %. Future research could focus on alternative approaches or further automation to identify entities and relations within key sentences. Key sentence extraction is complex because the definition of key content, its location in an article, and the specific use case all vary. However, the potential benefits of time savings and improved workflow efficiency make this approach highly valuable.
dc.identifier.coursecode: DATX05
dc.identifier.uri: http://hdl.handle.net/20.500.12380/307322
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: NLP
dc.subject: Transformers
dc.subject: CRISPR
dc.subject: semantic search
dc.subject: keyphrase extraction
dc.title: Key Sentence Extraction From CRISPR-Cas9 Articles Using Sentence Transformers
dc.type.degree: Examensarbete för masterexamen (sv)
dc.type.degree: Master's Thesis (en)
dc.type.uppsok: H
local.programme: Computer science – algorithms, languages and logic (MPALG), MSc
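
The abstract reports an Average R-precision of 90.4 % but this record does not say how the metric was computed. The sketch below assumes the common information-retrieval definition: for an article with R annotated key sentences, R-precision is the fraction of the R top-ranked sentences that are annotated as key, and the average is the mean over articles. The function names and the sentence identifiers are hypothetical and are not taken from the thesis.

```python
from typing import Iterable, Sequence, Set, Tuple


def r_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Fraction of the top-R ranked items that are relevant, where R = len(relevant)."""
    r = len(relevant)
    if r == 0:
        # Assumption: an article without annotated key sentences scores 0.
        return 0.0
    hits = sum(1 for item in ranked[:r] if item in relevant)
    return hits / r


def average_r_precision(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean R-precision over (ranking, relevant set) pairs, one pair per article."""
    scores = [r_precision(ranked, relevant) for ranked, relevant in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical example: two articles with model rankings and gold key sentences.
example_runs = [
    (["s1", "s2", "s3", "s4"], {"s1", "s3"}),  # top 2 contain one key sentence
    (["a", "b"], {"a"}),                       # top 1 is the key sentence
]
example_score = average_r_precision(example_runs)
```

Under this definition the metric rewards a model for placing the annotated key sentences at the top of its ranking, without requiring a fixed cut-off that is the same for every article.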
Original bundle
Name: CSE 23-37 BNSL SH.pdf
Size: 2.55 MB
Format: Adobe Portable Document Format