Esin Durmus
Anthropic
Verified email at stanford.edu
Title · Cited by · Year
On the opportunities and risks of foundation models
R Bommasani, DA Hudson, E Adeli, R Altman, S Arora, S von Arx, ...
arXiv preprint arXiv:2108.07258, 2021
Cited by 3922 · 2021
Holistic evaluation of language models
P Liang, R Bommasani, T Lee, D Tsipras, D Soylu, M Yasunaga, Y Zhang, ...
arXiv preprint arXiv:2211.09110, 2022
Cited by 980 · 2022
FEQA: A question answering evaluation framework for faithfulness assessment in abstractive summarization
E Durmus, H He, M Diab
ACL, 2020
Cited by 405 · 2020
Benchmarking large language models for news summarization
T Zhang, F Ladhak, E Durmus, P Liang, K McKeown, TB Hashimoto
Transactions of the Association for Computational Linguistics 12, 39-57, 2024
Cited by 328 · 2024
Whose opinions do language models reflect?
S Santurkar, E Durmus, F Ladhak, C Lee, P Liang, T Hashimoto
International Conference on Machine Learning, 29971-30004, 2023
Cited by 269 · 2023
Easily accessible text-to-image generation amplifies demographic stereotypes at large scale
F Bianchi, P Kalluri, E Durmus, F Ladhak, M Cheng, D Nozza, ...
Proceedings of the 2023 ACM Conference on Fairness, Accountability, and …, 2023
Cited by 211 · 2023
WikiLingua: A new benchmark dataset for cross-lingual abstractive summarization
F Ladhak, E Durmus, C Cardie, K McKeown
arXiv preprint arXiv:2010.03093, 2020
Cited by 195 · 2020
The GEM benchmark: Natural language generation, its evaluation and metrics
S Gehrmann, T Adewumi, K Aggarwal, PS Ammanamanchi, ...
arXiv preprint arXiv:2102.01672, 2021
Cited by 139 · 2021
Towards measuring the representation of subjective global opinions in language models
E Durmus, K Nguyen, TI Liao, N Schiefer, A Askell, A Bakhtin, C Chen, ...
arXiv preprint arXiv:2306.16388, 2023
Cited by 113 · 2023
Marked personas: Using natural language prompts to measure stereotypes in language models
M Cheng, E Durmus, D Jurafsky
arXiv preprint arXiv:2305.18189, 2023
Cited by 102 · 2023
Towards understanding sycophancy in language models
M Sharma, M Tong, T Korbak, D Duvenaud, A Askell, SR Bowman, ...
arXiv preprint arXiv:2310.13548, 2023
Cited by 97 · 2023
Evaluating human-language model interaction
M Lee, M Srivastava, A Hardy, J Thickstun, E Durmus, A Paranjape, ...
arXiv preprint arXiv:2212.09746, 2022
Cited by 91 · 2022
Studying large language model generalization with influence functions
R Grosse, J Bae, C Anil, N Elhage, A Tamkin, A Tajdini, B Steiner, D Li, ...
arXiv preprint arXiv:2308.03296, 2023
Cited by 80 · 2023
Exploring the role of prior beliefs for argument persuasion
E Durmus, C Cardie
NAACL, 2018
Cited by 79 · 2018
Scaling monosemanticity: Extracting interpretable features from Claude 3 Sonnet
A Templeton
Anthropic, 2024
Cited by 66 · 2024
Faithful or extractive? On mitigating the faithfulness-abstractiveness trade-off in abstractive summarization
F Ladhak, E Durmus, H He, C Cardie, K McKeown
arXiv preprint arXiv:2108.13684, 2021
Cited by 66 · 2021
Measuring faithfulness in chain-of-thought reasoning
T Lanham, A Chen, A Radhakrishnan, B Steiner, C Denison, ...
arXiv preprint arXiv:2307.13702, 2023
Cited by 65 · 2023
Question decomposition improves the faithfulness of model-generated reasoning
A Radhakrishnan, K Nguyen, A Chen, C Chen, C Denison, D Hernandez, ...
arXiv preprint arXiv:2307.11768, 2023
Cited by 49* · 2023
Exploring the Role of Argument Structure in Online Debate Persuasion
J Li, E Durmus, C Cardie
EMNLP, 2020
Cited by 47 · 2020
Persuasion of the Undecided: Language vs. the Listener.
L Longpre, E Durmus, C Cardie
Proceedings of the 6th Workshop on Argument Mining, 2019
Cited by 37 · 2019
Articles 1–20