dc.contributor.author |
Haidar, Rabih Abdulsalam |
dc.date.accessioned |
2017-12-11T16:30:49Z |
dc.date.available |
2017-12-11T16:30:49Z |
dc.date.issued |
2017 |
dc.date.submitted |
2017 |
dc.identifier.other |
b19184074 |
dc.identifier.uri |
http://hdl.handle.net/10938/20970 |
dc.description |
Thesis. M.S. American University of Beirut. Department of Computer Science, 2017. T:6598 |
dc.description |
Advisor : Dr. Shady Elbassuoni, Assistant Professor, Computer Science ; Committee members : Dr. Wassim El Hajj, Associate Professor, Computer Science ; Dr. Haidar Safa, Professor, Computer Science. |
dc.description |
Includes bibliographical references (leaves 30-32) |
dc.description.abstract |
Web robots are everywhere in today's web technology. These bots range from malicious robots associated with viruses to spiders, also known as search engine bots. The latter crawl the Internet, harvesting information from websites for different purposes, and no one can claim control over how and when this rich information will be used. As artificial intelligence keeps improving, robots are becoming very smart too. Bots are likely to increase in quality and quantity as the World Wide Web develops and evolves. This is becoming a real threat to today's businesses and social life. In the future, we are likely to see smarter bots that can do anything at any time. This naturally urges contemporary researchers and experts in cyber security to invest in every possible direction to try to protect the web environment. Detecting bots, whether malicious or search engine bots, is an important goal for most website admins. In this thesis, we propose a novel machine learning bot detection approach based on web session navigation behavior. While machine learning has been used before for bot detection, most existing approaches rely on general hypotheses based on statistical analysis over multiple websites and are thus easy to counter. In our work, we build a website-specific hypothesis, or classifier, based on the actual navigation data of the website. The advantage of our approach is that it can be used generally to detect any type of bot attack and is difficult to counter unless website-specific bots are designed as well. Our classifier uses a Two-Class Boosted Decision Tree classification model and can be periodically re-trained to learn new hypotheses as bots evolve. We tested our approach on two real-world websites and achieved an accuracy of around 83 percent, outperforming state-of-the-art machine-learning bot detection approaches by almost 14 percent. 
In summary, we are after a solution where each website can learn, generate and tune its own defensive mechanism and can co-exist with other defensive |
dc.format.extent |
1 online resource (x, 32 leaves) : illustrations |
dc.language.iso |
eng |
dc.relation.ispartof |
Theses, Dissertations, and Projects |
dc.subject.classification |
T:006598 |
dc.subject.lcsh |
Machine learning. |
dc.subject.lcsh |
Search engines. |
dc.subject.lcsh |
Support vector machines. |
dc.subject.lcsh |
Neural networks (Computer science) |
dc.title |
Web session navigation behavior for bot detection |
dc.type |
Thesis |
dc.contributor.department |
Faculty of Arts and Sciences. |
dc.contributor.department |
Department of Computer Science |
dc.contributor.institution |
American University of Beirut. |