AUB ScholarWorks

Arabic named entity recognition via deep co-learning

dc.contributor.author Helwe, Chadi Talal,
dc.date.accessioned 2017-12-11T16:30:58Z
dc.date.available 2017-12-11T16:30:58Z
dc.date.issued 2017
dc.date.submitted 2017
dc.identifier.other b19214327
dc.identifier.uri http://hdl.handle.net/10938/21007
dc.description Thesis. M.S. American University of Beirut. Department of Computer Science, 2017. T:6553
dc.description Advisor : Dr. Shady Elbassuoni, Assistant Professor, Computer Science ; Committee members : Dr. Wassim El Hajj, Associate Professor, Computer Science ; Dr. Hazem El Hajj, Associate Professor, Electrical and Computer Engineering.
dc.description Includes bibliographical references (leaves 44-47)
dc.description.abstract Named entity recognition (NER) is the task of identifying named entities such as locations, persons, and organizations in a given piece of text. NER plays a significant role in many applications, including information retrieval, question answering, machine translation, text clustering, and navigation systems. In this thesis, we tackled the problem of Arabic NER. Arabic is a very challenging language for natural language processing (NLP) in general: it is both morphologically rich and highly ambiguous, and it has complex morpho-syntactic agreement rules and many irregular forms. To address these issues, we proposed to use deep learning based on Arabic word embeddings that capture syntactic and semantic relationships between words. Deep learning has been shown to perform significantly better than other approaches on various NLP tasks, including NER. However, deep learning models also require a large amount of training data, which is scarce for Arabic. To overcome this, we proposed a semi-supervised deep learning approach that uses both labeled and semi-labeled data, which we call deep co-learning. We tested our approach on different established benchmarks and compared it to state-of-the-art Arabic NER tools such as MadaMira and Farasa. Our deep co-learning approach significantly outperformed the compared Arabic NER approaches as well as purely supervised deep learning ones.
dc.format.extent 1 online resource (x, 47 leaves) : illustrations
dc.language.iso eng
dc.relation.ispartof Theses, Dissertations, and Projects
dc.subject.classification T:006653
dc.subject.lcsh Natural language processing (Computer science)
dc.subject.lcsh Machine learning.
dc.subject.lcsh Arabic language -- Morphology.
dc.subject.lcsh Data mining.
dc.subject.lcsh Text processing (Computer science)
dc.subject.lcsh Computational linguistics.
dc.title Arabic named entity recognition via deep co-learning
dc.type Thesis
dc.contributor.department Faculty of Arts and Sciences.
dc.contributor.department Department of Computer Science,
dc.contributor.institution American University of Beirut.

