Evaluating Jokes with LLMs



2023-06-06


On the show today, we are joined by Fabricio Goes, a lecturer at the University of Leicester. Fabricio's research focuses on computational creativity and creativity evaluation. He joins us to discuss his recent study on evaluating jokes with large language models (LLMs).

Fabricio started with some background on computational creativity. He defined the field and explained how LLMs could advance it. He then described the process of building AI models whose outputs are both novel and valuable, and the key metrics used to measure a model's creativity. Fabricio tested his approach with human participants and discussed the feedback he and his team received, as well as techniques for improving the performance of LLMs on creative tasks.

Fabricio then delved into some real-world applications of LLMs for creativity. First, he discussed using GPT-4 to generate poems in the style of Walt Whitman. He also discussed using GPT to create and evaluate jokes, and detailed the process of evaluating jokes with GPT-3 and GPT-4.
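For listeners curious what such an evaluation pipeline might look like in practice, here is a minimal sketch in the spirit of the Crowd Score idea: several LLM "voters", each primed with a different humour preference, pick the funniest joke from a set, and the votes are tallied. The personas, prompts, model name, and aggregation below are illustrative assumptions, not the exact setup from the paper.

```python
# Minimal sketch of an LLM "crowd" voting on jokes, loosely inspired by
# the Crowd Score approach. Personas, prompts, and the model name are
# illustrative assumptions, not the paper's exact configuration.
from collections import Counter

from openai import OpenAI  # assumes the official openai Python package

client = OpenAI()  # reads OPENAI_API_KEY from the environment

JOKES = [
    "I told my computer I needed a break, and it froze.",
    "Why don't scientists trust atoms? They make up everything.",
]

# Each "voter" is the same model primed with a different humour preference.
PERSONAS = [
    "You enjoy gentle, good-natured humour.",
    "You enjoy clever wordplay and puns.",
    "You enjoy dry, deadpan humour.",
]

def vote(persona: str, jokes: list[str]) -> int:
    """Ask one persona-primed voter to pick the funniest joke by index."""
    numbered = "\n".join(f"{i}: {j}" for i, j in enumerate(jokes))
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": persona},
            {"role": "user",
             "content": f"Here are some jokes:\n{numbered}\n"
                        "Reply with only the number of the funniest one."},
        ],
    )
    # A real pipeline would validate the reply; the sketch assumes a bare number.
    return int(resp.choices[0].message.content.strip())

# Tally votes across the crowd; the joke with the most votes "wins".
tally = Counter(vote(p, JOKES) for p in PERSONAS)
for idx, votes in tally.most_common():
    print(f"{votes} vote(s): {JOKES[idx]}")
```

The paper aggregates the voters' choices into a per-joke score; the sketch above uses a simple plurality tally for brevity.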

Fabricio hinted at his future research, which involves multiple LLMs communicating with each other to evaluate jokes together. He also shared his thoughts on future possibilities for LLMs. You can learn more about Fabricio and his work on his University of Leicester page.

Resources

Joe Toplyn’s Paper: Witscript 2: A System for Generating Improvised Jokes Without Wordplay

Fabricio Goes' Paper: Crowd Score: A Method for the Evaluation of Jokes Using Large Language Model AI Voters as Judges