Journal Club: Episode 1

Welcome to a brand new show from Data Skeptic entitled "Journal Club".

Each episode will feature a regular panel and one revolving guest seat. The group will discuss a few topics related to data science, then examine one featured scholarly paper in detail.

Lan tells the story of a transformer learning to play chess. The experiment was to fine-tune a GPT-2 transformer model on a corpus of 2.4M chess games in standard notation, and then see whether it can 'play chess' by generating the next move. This is a thought-provoking way to take advantage of the advances in NLP by 'transforming' a game into the 'language' of written text. The work was done by Shawn Presser.
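
For listeners who want a feel for the "game as text" idea, here is a minimal sketch. It is not the experiment discussed in the episode: a toy bigram counter stands in for the fine-tuned GPT-2 model, and the four-game corpus and all function names are hypothetical, made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: each game is a list of moves in standard notation.
GAMES = [
    ["e4", "e5", "Nf3", "Nc6", "Bb5"],
    ["e4", "e5", "Nf3", "Nc6", "Bc4"],
    ["e4", "c5", "Nf3", "d6", "d4"],
    ["d4", "d5", "c4", "e6", "Nc3"],
]

def to_text(moves):
    """Serialize a game as one line of text, the way a language model would see it."""
    return " ".join(moves)

def train_bigram(games):
    """Count which move token follows each move token (a stand-in for fine-tuning)."""
    follows = defaultdict(Counter)
    for game in games:
        tokens = to_text(game).split()
        for prev, nxt in zip(tokens, tokens[1:]):
            follows[prev][nxt] += 1
    return follows

def next_move(follows, prompt):
    """Return the most frequent continuation of the prompt's last move, or None."""
    tokens = prompt.split()
    if not tokens or tokens[-1] not in follows:
        return None
    return follows[tokens[-1]].most_common(1)[0][0]

model = train_bigram(GAMES)
print(next_move(model, "e4 e5"))
```

Like any pure text model, this sketch knows nothing about the rules of chess. It only continues the move sequence with whatever it has seen most often, so it can suggest an illegal move.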

George gives a breakdown of a Kaggle cheating scandal in which a Grandmaster was caught training on the test set. The story follows Benjamin Minixhofer, whose capable detective work uncovered an obfuscation that artificially improved the winning team's accuracy.

Kyle leads a discussion on the paper "Towards A Rigorous Science of Interpretable Machine Learning" by Finale Doshi-Velez and Been Kim. The paper is a great survey of the spectrum of interpretability techniques and also suggests a "taxonomy" for describing the various methodologies.