Long Ouyang

Training Language Models to Follow Instructions with Human Feedback
Learning to Summarize from Human Feedback
WebGPT: Browser-assisted Question-Answering with Human Feedback