Abstract:
Negotiation is a fundamental aspect of human interaction, involving a dynamic
exchange of communication between two or more parties to reach mutually agreeable
outcomes. With recent advances in chatbots, negotiation has emerged as a promising
application of artificial intelligence (AI). Despite significant progress in
English negotiation bots using deep learning and reinforcement learning, such
advancements are notably absent in other languages, particularly Arabic. Furthermore,
while previous research has primarily focused on developing high-performing neural
response generation systems for negotiation bots, the integration of multimodality into
these automated agents remains unexplored. This thesis presents the first Arabic
negotiation model, which is further distinguished by incorporating multimodality. The
integration of multimodality, here in the form of image analysis, provides a more
comprehensive and user-friendly approach to negotiation. Our primary objective is to develop an Arabic
multimodal negotiating bot, a seller agent capable of engaging in negotiations with
buyers in the context of item sales. This seller agent is designed to understand the
buyer's Arabic utterances and to interpret the negotiation context through images
provided by the buyer. To achieve this, we trained a Generative Pre-trained Transformer
(GPT-2) model on an Arabic dataset, integrating it with a Convolutional Neural
Network (CNN) for image analysis. The model's automatic evaluation yielded a BLEU-4 score of 0.21 and a cross-entropy loss of 0.55, metrics that are promising for the first
model of its kind in Arabic. Our experiments and analyses reveal both the successes and
limitations of the proposed multimodal Arabic negotiation model, offering insights into
the inherent challenges and setting directions for future research.