I am Abhishek Das, a Computer Science PhD student at Georgia Tech, advised by Prof. Dhruv Batra. My research focuses on deep learning models and their applications in computer vision and natural language processing. Before transferring to Georgia Tech, I spent one year at Virginia Tech as an intern and later as a graduate student working with Prof. Dhruv Batra and Prof. Devi Parikh.

Prior to joining grad school, I worked on neural coding in the zebrafish tectum as an intern under Prof. Geoffrey Goodhill and Lilach Avitan at the Goodhill Lab, Queensland Brain Institute.

I graduated from the Indian Institute of Technology Roorkee in 2015. During my undergrad years, I was selected twice for Google Summer of Code (2013 and 2014), won several hackathons and security contests (Yahoo! HackU!, Microsoft Code.Fun.Do., Deloitte CCTC 2013 and 2014), and was an active member of SDSLabs.

On the side, I built neural-vqa, an efficient Torch implementation for visual question answering. I maintain aideadlin.es, a website that hosts countdowns to CV/NLP/ML/AI conference deadlines, along with several other side projects (HackFlowy, graf, etc). I also help maintain Erdős, a competitive math learning platform I created during my undergrad. I often tweet, and I post pictures from my travels on Instagram and Tumblr.


Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning

Abhishek Das*, Satwik Kottur*, Stefan Lee, José M.F. Moura, Dhruv Batra
ArXiv 2017

Visual Dialog

Abhishek Das, Satwik Kottur, Khushi Gupta, Avi Singh, Deshraj Yadav, José M.F. Moura, Devi Parikh, Dhruv Batra
CVPR 2017 (Spotlight)
Paper, Code, visualdialog.org [dataset], AMT chat interface, Demo

Grad-CAM: Why did you say that? Visual Explanations from Deep Networks via Gradient-based Localization

Ramprasaath R. Selvaraju, Abhishek Das, Ramakrishna Vedantam, Michael Cogswell, Devi Parikh, Dhruv Batra
NIPS 2016 Interpretable ML for Complex Systems Workshop
Paper, Code, Demo

Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions?

Abhishek Das*, Harsh Agrawal*, C. Lawrence Zitnick, Devi Parikh, Dhruv Batra
EMNLP 2016, ICML 2016 Workshop on Visualization for Deep Learning
Paper, Project+Dataset, Press:



Side projects


Torch implementation of an attention-based visual question answering model (Yang et al., CVPR16). The model looks at an image, reads a question, and comes up with an answer to the question and a heatmap of where it looked in the image to answer it. Some results here.


aideadlin.es is a webpage to keep track of CV/NLP/ML/AI conference deadlines. It's hosted on GitHub, and countdowns are automatically updated via changes to the data file in the repo.


neural-vqa is an efficient, GPU-based Torch implementation of the visual question answering model from the NIPS 2015 paper 'Exploring Models and Data for Image Question Answering' by Ren et al.


Erdős by SDSLabs is a competitive math learning platform, similar in spirit to Project Euler, albeit more feature-packed (it supports holding competitions and has a social layer) and prettier.


graf plots pretty git contribution bar graphs in the terminal. Run gem install graf to install it.


An open-source clone of WorkFlowy.com, a beautiful, list-based note-taking website that has a 500-item monthly limit on the free tier :-(. "Make lists. Not war." :-)


AirMaps was a fun hackathon project that lets users navigate through Google Earth with gestures and speech commands using a Kinect sensor. It was the winning entry in Microsoft Code.Fun.Do.


Another fun hackathon-winning project, built during Yahoo! HackU! 2012: a WebRTC-based P2P video chat that was faster than any other video chat provider at the time (before Google launched Hangouts).


An ugly-looking but super-effective bash script for downloading entire playlists from 8tracks. (Still works as of 10/2016.)