dc.contributor.advisor | Awad, Mariette
dc.contributor.author | El Zini, Julia
dc.date.accessioned | 2023-01-27T06:40:14Z
dc.date.available | 2023-01-27T06:40:14Z
dc.date.issued | 2023-01-27
dc.date.submitted | 2023-01-26
dc.identifier.uri | http://hdl.handle.net/10938/23878
dc.description.abstract | Given the social implications of autonomous systems in high-stakes areas, recent years have witnessed an outpouring of research on designing explainable and fair AI models. In this work, we consider the intersection of contrastive learning with explainable AI (ExAI) and fairness evaluation schemes. Current methods that provide contrastive explanations do not simultaneously satisfy the model-agnosticism, immutability, semi-immutability, and attainability constraints. In the fairness framework, existing metrics rely on statistical and causal tools that do not cover all bias cases and do not leverage advances in contrastive learning.
To this end, we present CEnt, a Contrastive Entropy-based explanation method that locally contrasts the prediction of any classifier. CEnt generates contrastive examples and visual contrasts that achieve better proximity rates than existing methods without compromising latency, feasibility, or attainability.
We then utilize contrastive sets to devise a novel individual-fairness evaluation technique that respects attainability and plausibility by relying on a manifold-like distance metric. Inspired by counterfactual ExAI, we propose three criteria to evaluate the faithfulness of our fairness metric, and we study its interplay with attainability and plausibility. We demonstrate that our method detects bias cases missed by other metrics, which do not always satisfy the faithfulness requirements.
Furthermore, we extend our fairness metric to textual settings by developing a local method that detects bias with little reliance on existing ontologies. Our evaluation method computes the statistical mutual information and the geometrical inter-dependency with the sensitive-information embedding to evaluate the fairness of a classifier. Likewise, we extend the contrastive faithfulness guarantees to natural language by relying on transformer encodings.
Lastly, we devise a novel mitigation strategy that operates in the latent space by encouraging a classifier to produce the same outcome when the latent representation is perturbed along a sensitive direction. Our strategy is effective at diluting, and even removing, bias in classifiers without compromising performance.
Our work motivates follow-on research in contrastive explainable AI and in bias detection and mitigation in deep networks. Generative models can be employed to improve the privacy guarantees of our techniques and to enhance the quality and plausibility of the generated contrastive examples.
dc.language.iso | en
dc.subject | Explainable AI
dc.subject | Fairness
dc.subject | Bias Detection
dc.subject | Artificial Intelligence
dc.subject | Machine Learning
dc.subject | Deep Learning
dc.subject | Contrastive
dc.subject | Counterfactual
dc.title | Theoretical Guarantees of Contrastive Learning in a Novel Explainable AI Method and a Deep Fairness Evaluation Framework
dc.type | Dissertation
dc.contributor.department | Department of Electrical and Computer Engineering
dc.contributor.faculty | Maroun Semaan Faculty of Engineering and Architecture
dc.contributor.institution | American University of Beirut
dc.contributor.commembers | Chehab, Ali
dc.contributor.commembers | Jabr, Rabih
dc.contributor.commembers | Elbassuoni, Shadi
dc.contributor.commembers | Mitra, Prasenjit
dc.contributor.commembers | Pechenizkiy, Mykola
dc.contributor.commembers | Castillo, Carlos
dc.contributor.degree | PhD
dc.contributor.AUBidnumber | 201302849