Research

1. Climate Change Persuasion Strategies

Sahasra Chava and Sloka Chava, 2024, “Can Large Language Models Identify Climate Change Persuasion Strategies in Short-form Videos?”


Abstract: Despite accumulating scientific evidence that anthropogenic emissions drive recent climate change, there is widespread skepticism among some segments of the U.S. population about human influence. The lack of consensus, partly driven by political affiliation, can have significant implications for policy making (for example, the U.S. joining the Paris Climate Agreement in 2016, withdrawing in 2017, and rejoining in 2021). Persuasive communication through short-form videos on social media can potentially shape and change people’s attitudes and behavior on climate change. To understand the goals and the persuasive strategies employed in such videos, we first curate a set of climate change-related videos from YouTube from both climate change believers and climate change skeptics. We annotate each short-form video with the persuasion strategy, an effectiveness rating for the strategy, an explanation for the strategy, and the overall goal of the video. We generate Large Language Model (LLM) responses using zero-shot video prompting for a SOTA open-source model (VideoLLaMA2) and a proprietary model (Google Gemini Pro 1.5). We use automatic metrics (BERTScore and ROUGE) and a survey to evaluate model performance. We also measure the alignment of the LLM responses with those of teenagers who have taken the AP Environmental Science (AP-ES) class and those who have not. We find that VideoLLaMA2 can describe the scene-by-scene events in the video but does not follow the instructions. Gemini Pro 1.5 follows the instructions well, but its performance on identifying the strategy (based on precision and recall) and the lexical similarity of its explanations (using the ROUGE score) are both low. However, the semantic similarity (using BERTScore) of the explanations for the strategy and the goal of the video is high. Our survey finds that 65% of respondents agree with the LLM explanation but find it verbose.
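
As a concrete illustration of the metric setup described above, the snippet below sketches how BERTScore (semantic similarity) and ROUGE (lexical similarity) could be computed over LLM explanations. It assumes the Python bert-score and rouge-score packages; the example texts and variable names are invented for illustration, not drawn from the paper's data.

    from bert_score import score as bert_score
    from rouge_score import rouge_scorer

    # Hypothetical model output and human annotation for one video.
    llm_explanations = ["The video appeals to scientific consensus to persuade viewers."]
    human_explanations = ["The creator cites expert consensus as the main persuasive device."]

    # Semantic similarity: BERTScore precision/recall/F1 over contextual embeddings.
    P, R, F1 = bert_score(llm_explanations, human_explanations, lang="en")
    print(f"BERTScore F1: {F1.mean().item():.3f}")

    # Lexical similarity: ROUGE-L measures longest-common-subsequence overlap.
    scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
    for candidate, reference in zip(llm_explanations, human_explanations):
        print("ROUGE-L F1:", scorer.score(reference, candidate)["rougeL"].fmeasure)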

2. Flashy Consumption and Teenager’s Stress

Sloka Chava and Sahasra Chava, 2024, “Flashy Consumption and Hidden Feelings on TikTok: Can Large Language Models Identify Teenager’s Emotions and Stress?”


Abstract: Flashy and conspicuous consumption, wherein the latest haul or purchase is posted on social media through short-form videos, is growing popular. Some of these TikToks showcase makeup purchases (for example, from Sephora) or clothing purchases (for example, from Lululemon) worth hundreds of dollars. While teenagers watching this flashy consumption can experience joy, excitement, and vicarious pleasure, they can also experience jealousy and peer pressure and feel overwhelmed. To understand the emotional responses and potential stress from watching flashy consumption, we first curate 100 TikToks with almost 20 million likes and 100,000 comments. We annotate each TikTok video to identify the emotion experienced, a rating for the intensity of the felt emotion, and the reasoning for the emotional response. We generate Large Language Model (LLM) responses using zero-shot video prompting for a SOTA open-source model (VideoLLaMA2) and a proprietary model (Google Gemini Pro 1.5). We assess LLM performance using model-based metrics (BERTScore and ROUGE) and a survey of teenagers. We find that VideoLLaMA2 can describe the scene-by-scene events in the video but does not follow the instructions. Gemini Pro 1.5 does follow the instructions, but its performance on identifying the emotion (based on precision and recall) is low. In addition, the lexical similarity of its explanations (using the ROUGE score) is quite low. However, the semantic similarity (using BERTScore) of the explanations for the emotional response that teenagers may experience after watching the video is relatively high. Our survey finds that 73.1% of respondents agree with the LLM explanation but find it verbose.
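
The precision- and recall-based emotion evaluation could be computed along the following lines. This is a minimal sketch assuming scikit-learn; the emotion labels shown are hypothetical examples, not the paper's actual annotation scheme.

    from sklearn.metrics import precision_score, recall_score

    # Hypothetical gold annotations vs. LLM-predicted emotion labels.
    human_labels = ["jealousy", "excitement", "peer pressure", "joy", "overwhelmed"]
    llm_labels   = ["jealousy", "joy", "peer pressure", "joy", "excitement"]

    # Macro-averaging weights rare emotions as heavily as common ones.
    print("precision:", precision_score(human_labels, llm_labels, average="macro", zero_division=0))
    print("recall:   ", recall_score(human_labels, llm_labels, average="macro", zero_division=0))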

3. Numerical Claim Detection in Finance

Agam Shah, Arnav Hiray, Pratvi Shah, Arkaprabha Banerjee, Anushka Singh, Dheeraj Eidnani, Sahasra Chava, Bhaskar Chaudhury, Sudheer Chava, 2024, “Numerical Claim Detection in Finance: A New Financial Dataset, Weak-Supervision Model, and Market Analysis”

Abstract: In this paper, we investigate the influence of claims in analyst reports and earnings calls on financial market returns, considering them as significant quarterly events for publicly traded companies. To facilitate a comprehensive analysis, we construct a new financial dataset for the claim detection task in the financial domain. We benchmark various language models on this dataset and propose a novel weak-supervision model that incorporates the knowledge of subject matter experts (SMEs) in the aggregation function, outperforming existing approaches. We also demonstrate the practical utility of our proposed model by constructing a novel measure of optimism, and we observe that earnings surprise and return depend on this optimism measure. Our dataset, models, and code are publicly available on GitHub under the CC BY 4.0 license.
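
The paper's exact SME-informed aggregation function is not reproduced here. As a rough sketch of the general weak-supervision idea, the snippet below combines votes from simple labeling functions using hypothetical expert-assigned weights; all rules, weights, and names are illustrative.

    import numpy as np

    ABSTAIN, NO_CLAIM, CLAIM = -1, 0, 1

    def lf_forecast_verb(text):
        # Fires when forward-looking language suggests a claim.
        return CLAIM if any(w in text.lower() for w in ("expect", "anticipate", "project")) else ABSTAIN

    def lf_contains_number(text):
        # Treats sentences without digits as unlikely to carry numerical claims.
        return CLAIM if any(ch.isdigit() for ch in text) else NO_CLAIM

    labeling_functions = [lf_forecast_verb, lf_contains_number]
    sme_weights = np.array([2.0, 1.0])  # hypothetical expert confidence per rule

    def aggregate(text):
        # Weighted vote over non-abstaining labeling functions.
        votes = np.array([lf(text) for lf in labeling_functions])
        active = votes != ABSTAIN
        if not active.any():
            return ABSTAIN
        score = np.dot(sme_weights[active], np.where(votes[active] == CLAIM, 1.0, -1.0))
        return CLAIM if score > 0 else NO_CLAIM

    print(aggregate("We expect revenue to grow 10% next quarter."))  # -> 1 (CLAIM)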

4. BERTScore Visualizer

Sebastian Jaskowski, Sahasra Chava, Agam Shah, 2024, “BERTScoreVisualizer: A Web Tool for Understanding Simplified Text Evaluation with BERTScore”

Abstract: The BERTScore metric is commonly used to evaluate automatic text simplification systems. However, current implementations of the metric fail to provide complete visibility into all of the information the metric can produce. Notably, the specific token matchings can be highly useful for generating clause-level insight into the quality of simplified text. We address this by introducing BERTScoreVisualizer, a web application that goes beyond reporting precision, recall, and F1 score and provides a visualization of the matching between tokens. We believe that our software can help improve the analysis of text simplification systems by showing specifically where generated, simplified text deviates from reference text. We host our code and demo on GitHub.
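
The token matching that BERTScoreVisualizer surfaces comes from BERTScore's greedy alignment: each candidate token is matched to the reference token whose contextual embedding has the highest cosine similarity. The sketch below reproduces that matching step with a Hugging Face transformers model; the model choice and example sentences are illustrative assumptions, not the tool's actual implementation.

    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("roberta-base")
    model = AutoModel.from_pretrained("roberta-base")

    def token_embeddings(text):
        # Contextual embeddings for each subword token, unit-normalized.
        enc = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state[0]
        tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
        return tokens, torch.nn.functional.normalize(hidden, dim=-1)

    cand_tokens, cand = token_embeddings("the cat sat on the mat")
    ref_tokens, ref = token_embeddings("a cat was sitting on a mat")

    sim = cand @ ref.T        # cosine similarity matrix (embeddings are unit-normed)
    best = sim.argmax(dim=1)  # greedy match: each candidate token -> best reference token
    for i, j in enumerate(best.tolist()):
        print(f"{cand_tokens[i]:>10} -> {ref_tokens[j]}  ({sim[i, j]:.2f})")

In BERTScore, precision averages these best-match similarities over candidate tokens, and recall does the same over reference tokens; visualizing the per-token matches is what exposes the clause-level deviations the abstract describes.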