I am a PhD candidate in Computer Science and Engineering at the University of Michigan, where I am a member of the LIT research group (part of the Michigan AI Lab), supervised by Dr. Rada Mihalcea. I earned my bachelor's degree in computer science from Grove City College in 2015 and my master's degree from the University of Michigan in 2017.

My research is in the area of natural language processing, where I am interested in word embeddings, computational social science, machine learning techniques, and multimodal problems with vision and language.

Publications

Building a Flexible Knowledge Graph to Capture Real-World Events

Laura Burdick, Mingzhe Wang, Oana Ignat, Steve Wilson, Yiming Zhang, Yumou Wei, Rada Mihalcea, Jia Deng
Text Analysis Conference (TAC), 2019

@article{Burdick19Building,
author = {Burdick, Laura and Mingzhe Wang and Oana Ignat and Steve Wilson and Yiming Zhang and Yumou Wei and Rada Mihalcea and Jia Deng},
title = {Building a Flexible Knowledge Graph to Capture Real-World Events},
journal = {Text Analysis Conference (TAC)},
year = {2019}
}

Analyzing Connections Between User Attributes, Images, and Text

Laura Burdick, Rada Mihalcea, Ryan L. Boyd, James W. Pennebaker
Cognitive Computation, 2019

Identifying Visible Actions in Lifestyle Vlogs

Oana Ignat, Laura Burdick, Jia Deng, Rada Mihalcea
ACL, 2019

Oana Ignat won a Best Poster Award at the Eastern European Machine Learning Summer School 2019 for this work.

PDF Data

We consider the task of identifying human actions visible in online videos. We focus on the widely spread genre of lifestyle vlogs, which consist of videos of people performing actions while verbally describing them. Our goal is to identify if actions mentioned in the speech description of a video are visually present. We construct a dataset with crowdsourced manual annotations of visible actions, and introduce a multimodal algorithm that leverages information derived from visual and linguistic clues to automatically infer which actions are visible in a video. We demonstrate that our multimodal algorithm outperforms algorithms based only on one modality at a time.

@inproceedings{Ignat19Actions,
author = {Ignat, Oana and Laura Burdick and Jia Deng and Rada Mihalcea},
title = {Identifying Visible Actions in Lifestyle Vlogs},
booktitle = "Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/P19-1643",
doi = "10.18653/v1/P19-1643",
pages = "6406--6417",
year = {2019}
}

Factors Influencing the Surprising Instability of Word Embeddings

Laura Wendlandt, Jonathan K. Kummerfeld, Rada Mihalcea
NAACL-HLT, 2018

I wrote a blog post (Michigan AI Blog) for the general public about this work.

PDF Code Poster Slides

Despite the recent popularity of word embedding methods, there is only a small body of work exploring the limitations of these representations. In this paper, we consider one aspect of embedding spaces, namely their stability. We show that even relatively high frequency words (100-200 occurrences) are often unstable. We provide empirical evidence for how various factors contribute to the stability of word embeddings, and we analyze the effects of stability on downstream tasks.

@inproceedings{Wendlandt18Surprising,
author = {Wendlandt, Laura and Kummerfeld, Jonathan K. and Mihalcea, Rada},
title = {Factors Influencing the Surprising Instability of Word Embeddings},
pages = "2092--2102",
url = "https://www.aclweb.org/anthology/N18-1190",
doi = "10.18653/v1/N18-1190",
booktitle = "Proceedings of the 2018 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies",
year = {2018}
}

"Low Supervision, Low Corpus size, Low Similarity! Challenges in cross-lingual alignment of word embeddings." Andrew Dyer. Uppsala University Master's Thesis in Language Technologies. 2019.

Word Embeddings: Reliability & Semantic Change. Johannes Hellrich. IOS Press. August 8, 2019.

"A Metrological Framework for Evaluating Crowd-powered Instruments." Chris Welty, Lora Aroyo, and Praveen Paritosh. The Seventh AAAI Conference on Human Computation and Crowdsourcing. 2019.

"Comparing the Performance of Feature Representations for the Categorization of the Easy-to-Read Variety vs Standard Language." Marina Santini, Benjamin Danielsson, and Arne Jönsson. 22nd Nordic Conference on Computational Linguistics. 2019.

"A Framework for Anomaly Detection Using Language Modeling, and its Applications to Finance." Armineh Nourbakhsh and Grace Bang. 2nd KDD Workshop on Anomaly Detection in Finance. 2019.

"Estimating Topic Modeling Performance with Sharma–Mittal Entropy." Sergei Koltcov, Vera Ignatenko, and Olessia Koltsova. Entropy. 2019, 21(7), 660.

"Data Shift in Legal AI Systems." Venkata Nagaraju Buddarapu and Arunprasath Shankar. Workshop on Automated Semantic Analysis of Information in Legal Text (ASAIL). 2019.

"Modeling Word Emotion in Historical Language: Quantity Beats Supposed Stability in Seed Word Selection." Johannes Hellrich, Sven Buechel, and Udo Hahn. Workshop on Language Technologies for the Socio-Economic Sciences and Humanities (LaTeCH-CLfL). 2019.

"Investigating the Stability of Concrete Nouns in Word Embeddings." Bénédicte Pierrejean and Ludovic Tanguy. International Conference on Computational Semantics. 2019.

"Can prediction-based distributional semantic models predict typicality?" Tom Heyman and Geert Heyman. Quarterly Journal of Experimental Psychology. 2019.

"Density Matching for Bilingual Word Embedding." Chunting Zhou, Xuezhe Ma, Di Wang, and Graham Neubig. NAACL-HLT. 2019.

"CluWords: Exploiting Semantic Word Clustering Representation for Enhanced Topic Modeling." Felipe Viegas, Sérgio Canuto, Christian Gomes, Washington Luis, Thierson Rosa, Sabir Ribas, Leonardo Rocha, and Marcos André Gonçalves. Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining. 2019.

Computational approaches for German particle verbs: compositionality, sense discrimination and non-literal language. Maximilian Köper. Diss. Universität Stuttgart, 2018. Web. November 26, 2018.

"Transparent, Efficient, and Robust Word Embedding Access with WOMBAT." Mark-Christoph Müller and Michael Strube. COLING: System Demonstrations. 2018.

"What’s in Your Embedding, And How It Predicts Task Performance." Anna Rogers, Shashwath Hosur Ananthakrishna, and Anna Rumshisky. COLING. 2018.

"Analyzing Hypersensitive AI: Instability in Corporate-Scale Machine Learning." Michaela Regneri, Malte Hoffmann, Jurij Kost, Niklas Pietsch, Timo Schulz and Sabine Stamm. IJCAI-ECAI Workshop on Explainable AI. 2018.

"Subcharacter Information in Japanese Embeddings: When Is It Worth It?" Marzena Karpinska, Bofang Li, Anna Rogers, and Aleksandr Drozd. Workshop on the Relevance of Linguistic Structure in Neural Architectures for NLP (RepL4NLP). 2018.

Entity and Event Extraction from Scratch Using Minimal Training Data

Laura Wendlandt, Steve Wilson, Oana Ignat, Charles Welch, Li Zhang, Mingzhe Wang, Jia Deng, Rada Mihalcea
Text Analysis Conference (TAC), 2018

PDF

@article{Wendlandt18Entity,
author = {Wendlandt, Laura and Steve Wilson and Oana Ignat and Charles Welch and Li Zhang and Mingzhe Wang and Jia Deng and Rada Mihalcea},
title = {Entity and Event Extraction from Scratch Using Minimal Training Data},
journal = {Text Analysis Conference (TAC)},
year = {2018}
}

Multimodal Analysis and Prediction of Latent User Dimensions

Laura Wendlandt, Rada Mihalcea, Ryan L. Boyd, James W. Pennebaker
SocInfo, 2017

PDF Code Poster Slides

Humans upload over 1.8 billion digital images to the internet each day, yet the relationship between the images that a person shares with others and his/her psychological characteristics remains poorly understood. In the current research, we analyze the relationship between images, captions, and the latent demographic/psychological dimensions of personality and gender. We consider a wide range of automatically extracted visual and textual features of images/captions that are shared by a large sample of individuals (N ~ 1,350). Using correlational methods, we identify several visual and textual properties that show strong relationships with individual differences between participants. Additionally, we explore the task of predicting user attributes using a multimodal approach that simultaneously leverages images and their captions. Results from these experiments suggest that images alone have significant predictive power and, additionally, multimodal methods outperform both visual features and textual features in isolation when attempting to predict individual differences.

@inproceedings{Wendlandt17Multimodal,
author = {Wendlandt, Laura and Rada Mihalcea and Ryan L. Boyd and James W. Pennebaker},
title = {Multimodal Analysis and Prediction of Latent User Dimensions},
booktitle={International Conference on Social Informatics},
pages={323--340},
organization={Springer},
year = {2017}
}

"Do Machines Replicate Humans? Toward a Unified Understanding of Radicalizing Content on the Open Social Web." Margeret Hall, Michael Logan, Gina S. Ligon, and Douglas C. Derrick. Policy & Internet. 2019.

"Detecting and classifying online dark visual propaganda." Mahdi Hashemi and Margeret Hall. Image and Vision Computing 89 (2019): 95-105.

Author Profiling in Social Media with Multimodal Information. Miguel Ángel Álvarez Carmona. Diss. Instituto Nacional de Astrofísica, Óptica y Electrónica, 2019. Web. March 28, 2019.

Data Science in Service of Performing Arts: Applying Machine Learning to Predicting Audience Preferences

Jacob Abernethy, Cyrus Anderson, Chengyu Dai, John Dryden, Eric Schwartz, Wenbo Shen, Jonathan Stroud, Laura Wendlandt, Sheng Yang, Daniel Zhang
Bloomberg Data for Good Exchange, 2016

PDF

Performing arts organizations aim to enrich their communities through the arts. To do this, they strive to match their performance offerings to the taste of those communities. Success relies on understanding audience preference and predicting their behavior. Similar to most e-commerce or digital entertainment firms, arts presenters need to recommend the right performance to the right customer at the right time. As part of the Michigan Data Science Team (MDST), we partnered with the University Musical Society (UMS), a non-profit performing arts presenter housed in the University of Michigan, Ann Arbor. We are providing UMS with analysis and business intelligence, utilizing historical individual-level sales data. We built a recommendation system based on collaborative filtering, gaining insights into the artistic preferences of customers, along with the similarities between performances. To better understand audience behavior, we used statistical methods from customer-base analysis. We characterized customer heterogeneity via segmentation, and we modeled customer cohorts to understand and predict ticket purchasing patterns. Finally, we combined statistical modeling with natural language processing (NLP) to explore the impact of wording in program descriptions. These ongoing efforts provide a platform to launch targeted marketing campaigns, helping UMS carry out its mission by allocating its resources more efficiently. Celebrating its 138th season, UMS is a 2014 recipient of the National Medal of Arts, and it continues to enrich communities by connecting world-renowned artists with diverse audiences, especially students in their formative years. We aim to contribute to that mission through data science and customer analytics.

@inproceedings{Abernethy2016Data,
author = {Abernethy, J. and C. Anderson and C. Dai and J. Dryden and E. Schwartz and W. Shen and J. Stroud and L. Wendlandt and S. Yang and D. Zhang},
title = {Data Science in Service of Performing Arts: Applying Machine Learning to Predicting Audience Preferences},
booktitle = {Bloomberg Data for Good Exchange},
year = {2016},
}

"The Michigan Data Science Team: A Data Science Education Program with Significant Social Impact." Arya Farahi and Jonathan C. Stroud. IEEE Data Science Workshop. 2018.