Publications

2024


OLMo: Accelerating the Science of Language Models
Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Valentina Pyatkin, Abhilasha Ravichander, Dustin Schwenk, Saurabh Shah, Will Smith, Emma Strubell, Nishant Subramani, Mitchell Wortsman, Pradeep Dasigi, Nathan Lambert, Kyle Richardson, Luke Zettlemoyer, Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith, Hannaneh Hajishirzi
In preparation // [paper] [model] [code] [blog] [press]


Dolma: An Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Abhilasha Ravichander, Kyle Richardson, Zejiang Shen, Emma Strubell, Nishant Subramani, Oyvind Tafjord, Evan Pete Walsh, Luke Zettlemoyer, Noah A. Smith, Hannaneh Hajishirzi, Iz Beltagy, Dirk Groeneveld, Jesse Dodge, Kyle Lo
In preparation // [paper] [data] [code] [blog] [press]

2023


Paloma: A Benchmark for Evaluating Language Model Fit
Ian Magnusson, Akshita Bhagia, Valentin Hofmann, Luca Soldaini, Ananya Harsh Jha, Oyvind Tafjord, Dustin Schwenk, Evan Pete Walsh, Yanai Elazar, Kyle Lo, Dirk Groeneveld, Iz Beltagy, Hannaneh Hajishirzi, Noah A. Smith, Kyle Richardson, Jesse Dodge
arXiv // [paper] [data] [code] [models]


Catwalk: A Unified Language Model Evaluation Framework for Many Datasets
Dirk Groeneveld, Anas Awadalla, Iz Beltagy, Akshita Bhagia, Ian Magnusson, Hao Peng, Oyvind Tafjord, Pete Walsh, Kyle Richardson, Jesse Dodge
arXiv // [paper] [code]


What’s In My Big Data?
Yanai Elazar, Akshita Bhagia, Ian Magnusson, Abhilasha Ravichander, Dustin Schwenk, Alane Suhr, Pete Walsh, Dirk Groeneveld, Luca Soldaini, Sameer Singh, Hanna Hajishirzi, Noah A. Smith, Jesse Dodge
arXiv // [paper] [code] [demo] [press]


Reproducibility in NLP: What Have We Learned from the Checklist?
Ian Magnusson, Noah A. Smith, Jesse Dodge
Findings of ACL 2023 // [paper]

2022


Exploring The Landscape of Distributional Robustness for Question Answering Models
Anas Awadalla, Mitchell Wortsman, Gabriel Ilharco, Sewon Min, Ian Magnusson, Hannaneh Hajishirzi, Ludwig Schmidt
Findings of EMNLP 2022 // [paper]


Just-DREAM-about-it: Figurative Language Understanding with DREAM-FLUTE
Yuling Gu, Yao Fu, Valentina Pyatkin, Ian Magnusson, Bhavana Dalvi, Peter Clark
FigLang Workshop, EMNLP 2022 // [paper] [code]


Towards a Multi-Entity Aspect-Based Sentiment Analysis for Characterizing Directed Social Regard in Online Messaging
Joan Zheng, Scott Friedman, Sonja Schmer-Galunder, Ian Magnusson, Ruta Wheelock, Jeremy Gottlieb, Diana Gomez, Chris Miller
Workshop on Online Abuse and Harms, NAACL 2022 // [paper]

2021


Extracting Fine-Grained Knowledge Graphs of Scientific Claims: Dataset and Transformer-Based Results
Ian Magnusson, Scott Friedman
EMNLP 2021 // [paper] [data]


Invertible Frowns: Video-to-Video Facial Emotion Translation
Ian Magnusson, Aruna Sankaranarayanan, Andrew Lippman
ADGD Workshop, ACM Multimedia 2021 // [paper] [code]


From Unstructured Text to Causal Knowledge Graphs: A Transformer-Based Approach
Scott Friedman, Ian Magnusson, Vasanth Sarathy, Sonja Schmer-Galunder
Advances in Cognitive Systems 2021 // [paper]


Extracting Qualitative Causal Structure with Transformer-Based NLP
Scott Friedman, Ian Magnusson, Sonja Schmer-Galunder
Qualitative Reasoning Workshop, IJCAI 2021 // [paper]


Toward Transformer-Based NLP for Extracting Psychosocial Indicators of Moral Disengagement
Scott Friedman, Ian Magnusson, Sonja Schmer-Galunder, Ruta Wheelock, Jeremy Gottlieb, Pooja Patel, Christopher Miller
CogSci 2021 // [paper]


Systematizing Confidence in Open Research and Evidence (SCORE)
Nazanin Alipourfard, Beatrix Arendt, Daniel Benjamin, Noam Benkler, Michael Bishop, Mark Burstein, Martin Bush, James Caverlee, Yiling Chen, … Ian Magnusson et al.
SocArXiv 2021 // [paper]