I am a Computer Science PhD student at Georgia Tech, advised by Prof. Dhruv Batra, and working closely with Prof. Devi Parikh. My research focuses on deep learning and its applications in building agents that can see (computer vision), think (reasoning/interpretability), talk (language modeling) and act (reinforcement learning). Before transferring to Georgia Tech, I spent one year at Virginia Tech as an intern and later as a graduate student. My CV is available here.
I graduated from the Indian Institute of Technology Roorkee in 2015. During my undergrad years, I was selected twice for Google Summer of Code (2013 and 2014), won several hackathons and security contests (Yahoo! HackU!, Microsoft Code.Fun.Do., Deloitte CCTC 2013 and 2014), and was an active member of SDSLabs.
On the side, I built neural-vqa, an efficient Torch implementation for visual question answering (and its extension, neural-vqa-attention), and I maintain aideadlin.es (countdowns to a bunch of CV/NLP/ML/AI conference deadlines) along with several other side projects (HackFlowy, graf, etc.). I also help maintain Erdős, a competitive math learning platform I created during my undergrad. I often tweet, and post pictures from my travels on Instagram and Tumblr.
Evaluating Visual Conversational Agents via Cooperative Human-AI Games
Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning
ICCV 2017 (Oral)
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
ICCV 2017, NIPS 2016 Interpretable ML for Complex Systems Workshop
CVPR 2017 (Spotlight)
Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions?
CVIU 2017, EMNLP 2016, ICML 2016 Workshop on Visualization for Deep Learning
AirMaps was a fun hackathon project that let users navigate through Google Earth with gestures and speech commands using a Kinect sensor. It was the winning entry in Microsoft Code.Fun.Do.
Another fun hackathon-winning project, built during Yahoo! HackU! 2012: a WebRTC-based P2P video chat that was faster than any other video chat provider at the time (before Google launched Hangouts).
An ugly-looking but super-effective bash script for downloading entire playlists from 8tracks. (Still works as of 10/2016.)