Puneet Mathur

I am a Research Scientist at Adobe. I completed my Ph.D. in Computer Science at the University of Maryland, College Park, in Fall 2023, advised by Dr. Dinesh Manocha. My research focused on document understanding, information extraction, and long-context multimodal understanding (documents, language, audio, video). My work spanned machine learning, natural language processing, speech processing, video understanding, and multimodal deep learning.

I received my Master's in Computer Science from UMD in 2021 and my Bachelor of Engineering (B.E.) in Computer Engineering from Netaji Subhas Institute of Technology. I conducted AI research at MIDAS Labs at IIIT-Delhi from 2018 to 2020.

Email  /  CV  /  Google Scholar  /  LinkedIn

I was a research scientist intern at:

Research Publications

Conference Papers

  • DOC-RAG: ASR Language Model Personalization with Domain-Distributed Co-occurrence Retrieval Augmentation
    COLING 2024
    Puneet Mathur, Zhe Liu, Ke Li, Yingyi Ma, Gil Keren, Zeeshan Ahmed, Dinesh Manocha, Xuedong Zhang
    [Paper (Coming Soon)]

  • Saliency-Aware Interpolative Augmentation for Multimodal Financial Prediction
    COLING 2024
    Samyak Jain, Parth Chhabra, Atula Neerkaje, Puneet Mathur, Ramit Sawhney, Shivam Agarwal, Preslav Nakov, Sudheer Chava and Dinesh Manocha
    [Paper (Coming Soon)]

  • DocScript: Document-level Script Event Prediction
    COLING 2024
    Puneet Mathur, Rajiv Jain, Vlad Morariu, Aparna Garimella, Franck Dernoncourt, Jiuxiang Gu, Ramit Sawhney, Preslav Nakov, and Dinesh Manocha
    [Paper (Coming Soon)]

  • PersonaLM: Language Model Personalization via Domain-distributed Span Aggregated K-Nearest N-gram Retrieval Augmentation
    EMNLP 2023
    Puneet Mathur, Zhe Liu, Ke Li, Yingyi Ma, Gil Keren, Zeeshan Ahmed, Dinesh Manocha, Xuedong Zhang
    [Paper]

  • DocEdit: Language-guided Document Editing
    AAAI 2023
    Puneet Mathur, Rajiv Jain, Jiuxiang Gu, Franck Dernoncourt, Dinesh Manocha, Vlad Morariu
    [Paper]

  • DocInfer: Document-level Natural Language Inference using Optimal Evidence Selection
    EMNLP 2022
    Puneet Mathur, Gautam Kunapuli, Riyaz Ahmad Bhat, Manish Shrivastava, Dinesh Manocha, Maneesh Singh
    [Paper]

  • DocFin: Multimodal Financial Prediction and Bias Mitigation using Semi-structured Documents
    EMNLP 2022 (Findings)
    Puneet Mathur, Mihir Goyal, Ramit Sawhney, Ritik Mathur, Jochen L. Leidner, Franck Dernoncourt and Dinesh Manocha
    [Paper]

  • LayerDoc: Layer-wise Extraction of Spatial Hierarchical Structure in Visually-Rich Documents
    WACV 2023
    Puneet Mathur, Rajiv Jain, Ashutosh Mehra, Jiuxiang Gu, Franck Dernoncourt, Anandhavelu N, Quan Tran, Verena Kaynig-Fittkau, Ani Nenkova, Dinesh Manocha, Vlad Morariu
    [Paper]

  • MONOPOLY: Financial Prediction from MONetary POLicY Conference Videos Using Multimodal Cues
    ACM Multimedia 2022
    Puneet Mathur, Atula Tejaswi Neerkaje, Malika Chhibber, Ramit Sawhney, Fu-Ming Guo, Franck Dernoncourt, Sanghamitra Dutta, Dinesh Manocha
    [Paper]

  • DocLayoutTTS: Dataset and Baselines for Layout-informed Document-level Neural Speech Synthesis
    Interspeech 2022
    Puneet Mathur, Franck Dernoncourt, Quan Hung Tran, Jiuxiang Gu, Ani Nenkova, Vlad Morariu, Rajiv Jain and Dinesh Manocha
    [Paper]

  • PISA: PoIncaré Saliency-Aware Interpolative Augmentation
    Interspeech 2022
    Ramit Sawhney, Megh Thakkar, Vishwa Shah, Puneet Mathur, Vasu Sharma and Dinesh Manocha
    [Paper]

  • DocTime: A Document-level Temporal Dependency Graph Parser
    NAACL 2022
    Puneet Mathur, Vlad I Morariu, Verena Kaynig-Fittkau, Jiuxiang Gu, Franck Dernoncourt, Quan Hung Tran, Ani Nenkova, Dinesh Manocha, Rajiv Jain
    [Paper]

  • 3MASSIV: Multilingual, Multimodal and Multi-Aspect dataset of Social Media Short Videos
    CVPR 2022
    Vikram Gupta, Trisha Mittal, Puneet Mathur, Vaibhav Mishra, Mayank Maheshwari, Aniket Bera, Debdoot Mukherjee and Dinesh Manocha
    [Paper][Dataset]

  • TIMERS: Document-level Temporal Relation Extraction
    ACL 2021
    Puneet Mathur, Rajiv Jain, Franck Dernoncourt, Vlad Morariu, Quan Hung Tran and Dinesh Manocha
    [Paper]

  • Multimodal Multi-Speaker Merger & Acquisition (M3A) Financial Forecasting: A New Task, Dataset, and Neural Baselines
    ACL 2021
    Ramit Sawhney, Mihir Goyal, Prakhar Goel, Puneet Mathur, Rajiv Ratn Shah
    [Paper]

  • Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality
    CVPR 2021
    Trisha Mittal, Puneet Mathur, Aniket Bera, Dinesh Manocha
    [Paper]

  • Multitask Learning for Emotionally Analyzing Sexual Abuse Disclosures
    NAACL 2021
    Ramit Sawhney, Puneet Mathur, Taru Jain, Akash Kumar Gautam, Rajiv Ratn Shah
    [Paper]

  • Dynamic Graph Modeling of Simultaneous EEG and Eye-tracking Data For Reading Task Identification
    ICASSP 2021
    Puneet Mathur, Trisha Mittal, Dinesh Manocha
    [Paper]

  • Meta learning for Low Resource Speech Emotion Recognition
    ICASSP 2021
    Suransh Chopra*, Puneet Mathur*, Ramit Sawhney, Rajiv Ratn Shah
    [Paper]

  • Multimodal Multitask Financial Risk Forecasting
    ACM Multimedia 2020 (Oral)
    Ramit Sawhney, Puneet Mathur, Piyush Khanna, Ayush Mangal, Rajiv Ratn Shah
    [Paper]

  • VolTAGE: Volatility Forecasting via Text Audio Fusion with Graph Convolution Networks for Earnings Calls
    EMNLP 2020
    Ramit Sawhney, Arshiya Aggarwal, Piyush Khanna, Taru Jain, Puneet Mathur, Rajiv Ratn Shah
    [Paper]

  • Risk Forecasting from Earnings Calls Acoustics and Network Correlations
    Interspeech 2020
    Ramit Sawhney, Arshiya Aggarwal, Piyush Khanna, Puneet Mathur, Taru Jain, Rajiv Ratn Shah
    [Paper]

  • Mixup Multi-Attention Multi-Tasking Model for Early-Stage Leukemia Identification
    ICASSP 2020
    Puneet Mathur*, Mehak Piplani*, Ramit Sawhney, Rajiv Ratn Shah
    [Paper]

  • Rethinking Retinal Landmark Localization as Pose Estimation: Naive Single Stacked Network for Optic Disk and Fovea Detection
    ICASSP 2020
    Shishira Maiya*, Puneet Mathur*
    [Paper]

  • Utilizing Temporal Psycholinguistic Cues for Suicidal Intent Estimation
    ECIR 2020 (Short)
    Puneet Mathur, Ramit Sawhney, Shivang Chopra and Rajiv Ratn Shah
    [Paper]

  • #MeTooMA: Multi-Aspect Annotations of Tweets Related to the MeToo Movement
    ICWSM 2020
    Akash Gautam*, Puneet Mathur*, Rakesh Gosangi, Debanjan Mahata, Ramit Sawhney and Rajiv Ratn Shah
    [Paper][Dataset]

  • Hindi-English Hate Speech Detection: Author Profiling, Debiasing, and Practical Perspectives
    AAAI 2020 (Oral)
    Shivang Chopra, Ramit Sawhney, Puneet Mathur and Rajiv Ratn Shah
    [Paper]

  • Exploring Classification of Histological Disease Biomarkers from Renal Biopsy Images
    IEEE Winter Conference on Applications of Computer Vision (WACV) 2019
    Puneet Mathur*, Meghna P. Ayyar*, Rajiv Ratn Shah and Shree G Sharma
    [Paper][Poster][Video Presentation]

Workshop Papers

  • Suicide Risk Assessment via Temporal Psycholinguistic Modeling
    AAAI Student Abstract and Poster 2020
    Puneet Mathur, Ramit Sawhney and Rajiv Ratn Shah
    [Paper]

  • An Iterative Approach for Identifying Complaint Based Tweets in Social Media Platforms
    AAAI Student Abstract and Poster 2020
    Gyanesh Anand, Akash Kumar Gautam, Puneet Mathur, Debanjan Mahata, Rajiv Ratn Shah and Ramit Sawhney
    [Paper]

  • SNAP-BATNET: Cascading Author Profiling and Social Network Graphs for Suicide Ideation Detection on Social Media
    NAACL Student Research Workshop 2019
    Rohan Mishra*, Pradyumna Prakhar Sinha*, Ramit Sawhney, Debanjan Mahata, Puneet Mathur, Rajiv Ratn Shah
    [Paper]

  • Speak Up, Fight Back! Detection of Social Media Disclosures of Sexual Harassment
    NAACL Student Research Workshop 2019
    Arijit Ghosh Chowdhury*, Ramit Sawhney*, Puneet Mathur, Rajiv Ratn Shah
    [Paper]

  • Identification of Emergency Blood Donation Request on Twitter
    Social Media Mining for Health Applications Workshop (SMM4H), EMNLP 2018
    Puneet Mathur, Meghna Ayyar, Sahil Chopra, Simra Shahid, Laiba Mehnaz and Rajiv Shah
    [Paper][Poster][Dataset][Demo]

  • Did You Offend Me? Classification of Offensive Tweets in Hinglish Language
    Abusive Language Workshop (ALW2), EMNLP 2018
    Puneet Mathur, Ramit Sawhney, Meghna Ayyar and Rajiv Shah
    [Paper][Poster][Dataset]

  • Exploring and Learning Suicidal Ideation Connotations on Social Media with Deep Learning
    WASSA, EMNLP 2018
    Ramit Sawhney, Prachi Manchanda, Puneet Mathur, Rajiv Shah and Raj Singh
    [Paper][Poster]

  • Detecting Offensive Tweets in Hindi-English Code-Switched Language
    Workshop on Natural Language Processing for Social Media (Social NLP), ACL 2018
    Puneet Mathur, Rajiv Shah, Ramit Sawhney, and Debanjan Mahata
    [Paper][Video Presentation]