Mona Diab

Mona Talat Diab is a computer science professor and director of Carnegie Mellon University's Language Technologies Institute. Previously, she was a professor at George Washington University and a research scientist with Facebook AI. Her research focuses on natural language processing, computational linguistics, cross lingual/multilingual processing, computational socio-pragmatics, Arabic language processing, and applied machine learning.

Education
Diab completed her M.Sc. in computer science with a major in machine learning and artificial intelligence at The George Washington University (1997) and her Ph.D. in computational linguistics at the University of Maryland, Linguistics Department and University of Maryland Institute for Advanced Computer Studies (UMIACS) in 2003, under the supervision of Philip Resnik. She was also a postdoctoral research scientist at Stanford University (2003–2005) under the mentorship of Dan Jurafsky, where she was a part of the Stanford NLP Group.

Career
After her postdoc at Stanford, Diab took a position as research scientist (principal investigator) at the Center for Computational Learning Systems (CCLS) in Columbia University, where she was also adjunct professor in the computer science department. In 2013 she joined the George Washington University as an associate professor, where she was promoted to full professor in 2017. Diab is the founder and director of the GW NLP lab CARE4Lang. Diab served as an elected faculty senator at Columbia University for 6 years (2007–2012) and an elected faculty senator at GW (2013–2014). She served the computational linguistics community as elected member, secretary and president of ACL SIGLEX (2005–2016) and elected president of ACL SIGSemitic. She currently serves as the elected VP-elect for ACL SIGDAT. In 2017 Diab joined Amazon AWS AI Deep Learning Group for Human Language Technologies, where she led the AWS Lex project for task oriented dialogue systems for enterprises. A couple of years later, she moved to Facebook AI as a research scientist. In the fall of 2023, she became the director of CMU's Language Technologies Institute -- the first full time director since the passing of its founder Jaime Carbonell.

Research
Diab's research interests include several areas in computational linguistics/natural language processing, like conversational AI, computational lexical semantics, multilingual and cross lingual processing, social media processing with an emphasis on computational socio- pragmatics, information extraction & text analytics, machine translation. Besides this, she also has special interests in Arabic NLP and low resource scenarios.

Diab co-established two research trends in the computational linguistics field, computational approaches to linguistic code switching in 2007 and semantic textual similarity in 2010.

Diab together with Nizar Habash and Owen Rambow, co-founded CADIM in 2005, a global reference point in Arabic dialect processing.

In 2012, Diab together with Eneko Agirre and Johan Bos, brought together two ACL communities SIGLEX and SIGSEM and established the 1st tier conference *SEM.

Awards and recognition

 * Selected as one of top 150 leaders and visionaries in AI nationwide to participate in White House AI Summit in Government, Washington, D.C., US, September 2019


 * March 2017: 3 Muslim Women in STEM You Should Know About, Teen Vogue, March 2017
 * May 2017: Behind Every Strong Woman Is...Another Strong Woman: Ten women give thanks to the women who supported them on the way up. Elle, May 2017.
 * Google Faculty Research Award – Tharwa++: Building a multidialectal Arabic Lexical Repository, (PI), 09.2015 –12.2016.
 * Google Faculty Research Award – Nuanced Sentiment and Perspective Analysis for Arabic Social Media Text, (PI), 12.2014 –12.2015
 * QNRF Best Poster Award – Ossama Obeid, Houda Bouamor, Wajdi Zaghouani, Mahmoud Ghoneim, Abdelati Hawwari, Mona Diab, Kemal Oflazer. (2016) MANDIAC: A Web-based Annotation System For Manual Arabic Diacritization. Proceedings of the 2nd Workshop on Arabic Corpora and Processing Tools, LREC 2016.
 * Best Paper Award – Aminian, Maryam, Mahmoud Ghoneim, Mona Diab. (2015) Unsupervised False Friend Disambiguation Using Contextual Word Clusters and Parallel Word Alignments. In Proceedings of Workshop 9th Semantics Syntax Statistical Translation, NAACL 2015, Denver CO, US.

Publications
Diab has over 250 publications, and she is an acting editor for several scientific journals.

Selected publications

 * Semeval-2012 task 6: A pilot on semantic textual similarity. E. Agirre, D. Cer, M. Diab, A. Gonzalez-Agirre. *SEM 2012: The First Joint Conference on Lexical and Computational Semantics–Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)
 * Predictive linguistic features of schizophrenia. ES Kayi, M Diab, L Pauselli, M Compton, G Coppersmith. arXiv preprint arXiv:1810.09377
 * Ideological perspective detection using semantic features. H Elfardy, M Diab, C Callison-Burch – Proceedings of *SEM 2015
 * DeSePtion: Dual sequence prediction and adversarial examples for improved fact-checking. Christopher Hidey, Tuhin Chakrabarty, Tariq Alhindi, Siddharth Varia, Kriste Krstovski, Mona Diab, Smaranda Muresan, 2020
 * Does Causal Coherence Predict Online Spread of Social Media? Pedram Hosseini, Mona Diab, David A Broniatowski. Proceedings of International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation, 2019.
 * Diversity, Density, and Homogeneity: Quantitative Characteristic Metrics for Text Collections. YA Lai, X Zhu, Y Zhang, M Diab, arXiv preprint arXiv:2003.08529, 2020
 * Readability of written medicine information materials in Arabic language: expert and consumer evaluation. S Al Aqeel, N Abanmy, A Aldayel, H Al-Khalifa, M Al-Yahya, M Diab. BMC health services research 18 (1), 1–7, 2019
 * Unsupervised word mapping using structural similarities in monolingual embeddings. H Aldarmaki, M Mohan, M Diab – Transactions of the Association for Computational Linguistics, 2018
 * An unsupervised method for word sense tagging using parallel corpora M Diab, P Resnik. Proceedings of ACL 2002
 * Overview for the first shared task on language identification in code-switched data. Thamar Solorio, Elizabeth Blair, Suraj Maharjan, Steven Bethard, Mona Diab, Mahmoud Ghoneim, Abdelati Hawwari, Fahad AlGhamdi, Julia Hirschberg, Alison Chang, Pascale Fung. Proceedings of the First Workshop on Computational Approaches to Code Switching, 2014
 * Modeling sentences in the latent space. W Guo, M Diab – ACL 20 12
 * Task-based evaluation of multiword expressions: a pilot study in statistical machine translation. M Carpuat, M Diab – NAACL-HLT 2010
 * Rumor detection and classification for twitter data. S Hamidian, MT Diab – arXiv preprint arXiv:1912.08926, 2019
 * Subgroup detection in ideological discussions. A Abu-Jbara, P Dasigi, M Diab, D Radev – ACL 2012
 * Madamira: A fast, comprehensive tool for morphological analysis and disambiguation of arabic. A. Pasha, M. Al-Badrashiny, M. Diab, A. El Kholy, R. Eskander, N. Habash, M. Pooleery, O. Rambow, R. Roth. LREC 14, 1094–1101. 2014
 * Context-Aware Self-Attentive Natural Language Understanding for Task-Oriented Chatbots. A. Gupta, P. Zhang, G. Lalwani, M. Diab. EMNLP 2019
 * A multitask learning approach for diacritic restoration. S. Alqahtani, A. Mishra, M. Diab. ACL 2020