Ads Decoder: Generating Symbolic Bounding Boxes and Labels for the given Images and Descriptors
This project focuses on combining the image features with the text descriptor and predict the symbolic regions with labels that draws the audience’s attention. The idea of combining text embeddings with image features extracted by the image classifier is inspired from the VQA model.
Tech stack: Python, PyTorch (Faster R-CNN), scikit-learn, Gensim/TextBlob, pycocotools, Conda, SLURM (HPC/MASSIVE M3)

