AUB ScholarWorks

Semi-automatic annotator for medical NLP applications


dc.contributor.author Sabra, Mohamed Naji
dc.date.accessioned 2017-08-30T14:06:19Z
dc.date.available 2017-08-30T14:06:19Z
dc.date.issued 2015
dc.date.submitted 2015
dc.identifier.other b18379539
dc.identifier.uri http://hdl.handle.net/10938/10671
dc.description Thesis. M.E. American University of Beirut. Department of Electrical and Computer Engineering, 2015. ET:6306
dc.description Advisor : Dr. Fadi Zaraket, Assistant Professor, Electrical and Computer Engineering ; Committee Members : Dr. Mariette Awad, Associate Professor, Electrical and Computer Engineering ; Dr. Rouwaida Kanj, Assistant Professor, Electrical and Computer Engineering.
dc.description Includes bibliographical references (leaves 51-54)
dc.description.abstract With the expansion of scientific and social media, a wealth of online information has accumulated as free text, including articles, studies, and blogs. Mining, standardizing, and extracting information from these resources enables novel approaches to data analysis and knowledge discovery, particularly for large domain-specific text corpora. Annotated corpora are key to this: supervised machine learning algorithms need them for training, and unsupervised algorithms need them for testing and evaluation. Manual annotation is expensive, especially in expert domains such as medicine. This thesis presents a Semi-Automatic Annotator for Medical NLP Applications (SAMNA). SAMNA takes a large corpus, a list of labels, a list of terms associated with each label, and lists of rules associated with labels and terms. It annotates the words in the corpus that match the corresponding terms and rules, and it uses distributional similarity to discover novel annotations. In addition, it provides the annotating scholar with an intuitive, friendly, and efficient interface for navigating and editing the annotations. We used SAMNA in several medical NLP applications to annotate protein sets in medical articles related to specific diseases such as stroke, spinal cord injuries, and Alzheimer's disease. Graph-theory-based analysis of the corpora annotated with SAMNA led to discoveries of interest to medical experts. SAMNA can also be applied in systems review, as well as in other annotation domains.
dc.format.extent 1 online resource (v, 64 leaves) : illustrations (some color) ; 30cm
dc.language.iso eng
dc.relation.ispartof Theses, Dissertations, and Projects
dc.subject.classification ET:006306
dc.subject.lcsh Samna.
dc.subject.lcsh Natural language processing (Computer science)
dc.subject.lcsh Bioinformatics -- Statistical methods.
dc.subject.lcsh Data mining.
dc.subject.lcsh Parsing (Computer grammar)
dc.title Semi-automatic annotator for medical NLP applications
dc.type Thesis
dc.contributor.department Faculty of Engineering and Architecture.
dc.contributor.department Department of Electrical and Computer Engineering.
dc.contributor.institution American University of Beirut.
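
The term-matching step described in the abstract (a corpus, a list of labels, and a list of terms per label) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `annotate`, the `TERMS_BY_LABEL` dictionary, and the sample terms are all hypothetical, and the rule lists and distributional-similarity discovery mentioned in the abstract are left out.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical label-to-terms dictionary; the entries are illustrative
# and are not taken from the thesis.
TERMS_BY_LABEL: Dict[str, List[str]] = {
    "PROTEIN": ["tau", "amyloid beta"],
    "DISEASE": ["stroke", "Alzheimer"],
}

# An annotation is (start offset, end offset, label, matched text).
Annotation = Tuple[int, int, str, str]


def annotate(text: str, terms_by_label: Dict[str, List[str]]) -> List[Annotation]:
    """Label every whole-word, case-insensitive occurrence of each term."""
    annotations: List[Annotation] = []
    for label, terms in terms_by_label.items():
        for term in terms:
            # re.escape keeps any regex metacharacters in a term literal;
            # \b restricts matches to word boundaries.
            pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
            for match in pattern.finditer(text):
                annotations.append(
                    (match.start(), match.end(), label, match.group(0))
                )
    # Sort by position so annotations follow reading order.
    return sorted(annotations)


if __name__ == "__main__":
    sample = "Tau and amyloid beta levels rise after stroke."
    for start, end, label, matched in annotate(sample, TERMS_BY_LABEL):
        print(f"{start}-{end}\t{label}\t{matched}")
```

Storing character offsets rather than modifying the text keeps the corpus unchanged, which is one plausible way to let a human reviewer later accept, edit, or reject each automatic annotation.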

