BAKSA at SemEval-2020 Task 9: Bolstering CNN with Self-Attention for Sentiment Analysis of Code Mixed Text
Abstract - Sentiment analysis of code-mixed text has diverse applications in opinion mining, ranging from tagging user reviews to identifying social or political sentiments of a sub-population. In this paper, we present an ensemble architecture of a convolutional neural net (CNN) and a self-attention based LSTM for sentiment analysis of code-mixed tweets. While the CNN component helps in the classification of positive and negative tweets, the self-attention based LSTM helps in the classification of neutral tweets because of its ability to identify the correct sentiment among multiple sentiment-bearing units. We achieved F1 scores of 0.707 (ranked 5th) and 0.725 (ranked 13th) on the Hindi-English (Hinglish) and Spanish-English (Spanglish) datasets, respectively. The submissions for the Hinglish and Spanglish tasks were made under the usernames ayushk and harsh6, respectively.
Paper - https://arxiv.org/pdf/2007.10819.pdf
Dataset - https://github.com/keshav22bansal/baksa_iitk/tree/master/data
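
Illustrative sketch - The abstract describes an ensemble of a CNN and a self-attention based LSTM but does not say how the two outputs are combined. The snippet below is a minimal sketch of one common option, weighted soft voting over class probabilities. It is an assumption made for illustration, not the authors' method: the names `soft_vote`, `cnn_weight`, and `LABELS`, the label order, and the example logits are all hypothetical.

```python
import math

# Hypothetical label order; the three classes come from the abstract.
LABELS = ("negative", "neutral", "positive")


def softmax(scores):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_vote(cnn_logits, lstm_logits, cnn_weight=0.5):
    """Weighted average of the two models' class probabilities.

    Assumption: both models emit one logit per label in LABELS order.
    cnn_weight is a hypothetical mixing weight in [0, 1].
    """
    cnn_probs = softmax(cnn_logits)
    lstm_probs = softmax(lstm_logits)
    mixed = [
        cnn_weight * c + (1.0 - cnn_weight) * l
        for c, l in zip(cnn_probs, lstm_probs)
    ]
    best = max(range(len(LABELS)), key=mixed.__getitem__)
    return LABELS[best], mixed


# Made-up logits: the CNN is undecided, the LSTM favours "neutral".
label, probs = soft_vote([0.0, 0.0, 0.0], [0.0, 5.0, 0.0])
print(label, [round(p, 3) for p in probs])
```

With equal weights, a confident prediction from either component can override an undecided one, which is one way the complementary strengths claimed in the abstract could be exploited.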