Title: Thinking through Disinformation and Other Malicious Uses of Language Models
Abstract: Recent capability improvements and the widespread diffusion of generative AI systems have increased the risk of malicious use of language models, including for disinformation campaigns. In this talk, we will give an overview of why language models could be useful for influence operations, building on a workshop report and original survey experiments. We will also use disinformation risks as a case study for broader challenges related to malicious use, including the difficulty of forecasting or weighing unrealized harms, the limitations of possible mitigation strategies, and the trade-offs among different release options for AI systems.
Bio: I am a researcher at OpenAI interested in issues of AI ethics, policy, and strategy.