Political Bias in BERT

A pre-trained BERT model fine-tuned on a dataset consisting of news articles would inherit the political bias existing in the articles.

Description

Transformers can learn general-purpose language representations by picking up useful patterns from the data they are trained on. Pre-trained models such as BERT and GPT-2 are trained on large quantities of unlabeled text. However, they can also pick up undesirable and subtle associations from that data. For example, if the training data expresses negative sentiment toward a particular entity, the model can learn and reproduce that sentiment. This gives rise to bias in pre-trained models, for instance with respect to gender or toward particular political entities. If not tackled early on, this bias could cause serious problems when the models are deployed in the real world. As Natural Language Processing (NLP) techniques become more widely used, addressing this kind of social and political bias becomes more important.

In this project, I studied DistilBERT, a pre-trained language model, and examined whether it carries any inherent political bias. The main aim of the study was to check whether the model is biased toward a particular political entity, such as Democrats or Republicans. After fine-tuning the model on a set of political news articles, I tested its predictions on a set of validation sentences covering groups of COVID-related topics. I also computed the sentiment of the word the model predicted for each [MASK] token, to check whether the model is more positively or negatively inclined toward Democrats or Republicans.
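The masked-word probe described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `distilbert-base-uncased` checkpoint stands in for the fine-tuned model, and `TOY_LEXICON`, `weighted_sentiment`, `sentiment_gap`, and the sentence template are all hypothetical names and values chosen for the example. A real study would replace the toy lexicon with a proper sentiment model.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy sentiment lexicon (assumption, for illustration only).
TOY_LEXICON: Dict[str, float] = {
    "good": 1.0, "great": 1.0, "effective": 1.0,
    "bad": -1.0, "terrible": -1.0, "ineffective": -1.0,
}

Prediction = Tuple[str, float]  # (predicted token, model probability)


def fill_mask(sentence: str, top_k: int = 10) -> List[Prediction]:
    """Return the top-k predictions for the [MASK] token in `sentence`.

    Uses the stock `distilbert-base-uncased` checkpoint as a stand-in for
    the fine-tuned model (assumption). Requires the `transformers` package.
    """
    from transformers import pipeline  # imported lazily so the helpers below work without it

    unmasker = pipeline("fill-mask", model="distilbert-base-uncased")
    return [(p["token_str"].strip(), p["score"]) for p in unmasker(sentence, top_k=top_k)]


def weighted_sentiment(predictions: List[Prediction],
                       lexicon: Dict[str, float] = TOY_LEXICON) -> float:
    """Probability-weighted sentiment of the predicted tokens, in [-1, 1]."""
    total = sum(prob for _, prob in predictions)
    if total == 0:
        return 0.0
    # Tokens missing from the lexicon count as neutral (0.0).
    score = sum(prob * lexicon.get(token.lower(), 0.0) for token, prob in predictions)
    return score / total


def sentiment_gap(template: str, entity_a: str, entity_b: str,
                  predict: Callable[[str], List[Prediction]] = fill_mask) -> float:
    """Sentiment for entity_a minus sentiment for entity_b on the same template."""
    score_a = weighted_sentiment(predict(template.format(entity=entity_a)))
    score_b = weighted_sentiment(predict(template.format(entity=entity_b)))
    return score_a - score_b
```

Calling `sentiment_gap("The {entity} response to covid was [MASK].", "Democrats", "Republicans")` would download the stand-in checkpoint and return a single number; a value far from zero across many templates would hint at a consistent lean, while a single template on its own says very little.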


Date: Aug 2022 – Dec 2022


Slash

Description

Slash is a command-line tool that scrapes the most popular e-commerce websites to find the best deals on the items you search for.
- Fast: With Slash, you can save over 50% of your time by comparing deals across websites within seconds.
- Easy: Slash uses very easy commands to filter, sort & search your items.
- Powerful: Quickly alter the commands to get desired results.
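The filter, sort, and search behaviour described above could look roughly like the sketch below. It does not reproduce Slash's real interface: the `Deal` record, the `select_deals` helper, and every flag name (`--search`, `--max-price`, `--sort`, `--desc`, `--num`) are hypothetical, and the scraping step is left out entirely.

```python
import argparse
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Deal:
    """One scraped listing (hypothetical record shape)."""
    title: str
    price: float
    site: str


def select_deals(deals: List[Deal], max_price: Optional[float] = None,
                 sort_by: str = "price", descending: bool = False,
                 limit: Optional[int] = None) -> List[Deal]:
    """Filter by price, sort by the chosen field, then keep the first `limit` deals."""
    kept = [d for d in deals if max_price is None or d.price <= max_price]
    kept.sort(key=lambda d: getattr(d, sort_by), reverse=descending)
    return kept if limit is None else kept[:limit]


def build_parser() -> argparse.ArgumentParser:
    """Command-line flags for the sketch (all names are assumptions)."""
    parser = argparse.ArgumentParser(prog="deals-sketch")
    parser.add_argument("--search", required=True, help="item to look for")
    parser.add_argument("--max-price", type=float, default=None)
    parser.add_argument("--sort", choices=["price", "title", "site"], default="price")
    parser.add_argument("--desc", action="store_true", help="sort in descending order")
    parser.add_argument("--num", type=int, default=None, help="maximum results to show")
    return parser
```

Keeping the selection logic in a plain function, separate from argument parsing, makes it easy to change the flags without touching how results are compared.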

Technologies Used
Python
Date: Aug 2021 – Sept 2021

Covid-19 Tracker Website

Description


- Developed a fully responsive Covid-19 Tracker Website.
- The website tracks total cases, recovered cases, and deaths, both globally and per country, in real time, and displays the data in graphical form.
- With the rising demand for dark-mode in various applications, I integrated a toggle button to change the display mode according to user preference.

Technologies Used
React
Date: May 2020 – Jun 2020

DEV-THREAD

Description


- A developer platform built on the MERN stack.
- The main aim of this platform is to connect developers, allowing them to post articles on the latest developments in their fields.
- Developers can create profiles listing their areas of expertise and their projects. The GitHub API is integrated to automatically fetch each user's latest repositories and display avatar, making it easier to connect, help, or ask for help.
- Developers can post new articles, comment on the posts, and even like or unlike a post.
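The GitHub integration mentioned above can be sketched with GitHub's public REST endpoint for listing a user's repositories. The project itself runs on Node.js; Python is used here only for illustration, and the helper names (`repos_url`, `summarize`, `fetch_profile`) and the `User-Agent` value are assumptions rather than the platform's actual code.

```python
import json
from typing import Any, Dict, List
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

API_ROOT = "https://api.github.com"


def repos_url(username: str, count: int = 5) -> str:
    """URL for a user's most recently updated public repositories."""
    query = urlencode({"sort": "updated", "per_page": count})
    return f"{API_ROOT}/users/{quote(username, safe='')}/repos?{query}"


def summarize(repos: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Reduce the API payload to the fields a profile page would show."""
    # Each repository object carries its owner's avatar URL.
    avatar = repos[0]["owner"]["avatar_url"] if repos else None
    return {
        "avatar_url": avatar,
        "repos": [
            {"name": r["name"], "url": r["html_url"], "stars": r.get("stargazers_count", 0)}
            for r in repos
        ],
    }


def fetch_profile(username: str, count: int = 5) -> Dict[str, Any]:
    """Fetch and summarize repositories (unauthenticated, so rate-limited)."""
    request = Request(
        repos_url(username, count),
        headers={"Accept": "application/vnd.github+json", "User-Agent": "dev-profile-sketch"},
    )
    with urlopen(request, timeout=10) as response:
        return summarize(json.load(response))
```

Reading the avatar from the first repository's owner avoids a second request, at the cost of returning no avatar for users who have no public repositories.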

Technologies Used
React
MongoDB
Node.js
Express
Date: March 2020 – May 2020

Plastic Waste Profiling

Plastic Waste Profiling - Mobile App

Description

This project was developed as part of the “Project Deep Blue” competition, where my team reached the semi-final round.
The project focused on profiling the plastic waste disposed of in nearby garbage dumps. We developed software that accepts a photo of the garbage taken by any user and returns a full profile of it. By profiling, we mean that from a single picture of a garbage dump, the software can identify the product brands and trace each one back to its manufacturer. The software also generates a graph showing which manufacturers are the major contributors to plastic waste in our country.
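The step from recognized brands to a per-manufacturer tally can be sketched as below. The sketch assumes an image model has already produced a list of brand labels for a photo; the brand and manufacturer names are placeholders, and `manufacturer_profile` is a hypothetical helper, not the project's code.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Placeholder mapping from brand label to manufacturer (assumption).
BRAND_TO_MANUFACTURER: Dict[str, str] = {
    "brand_a": "Manufacturer X",
    "brand_b": "Manufacturer X",
    "brand_c": "Manufacturer Y",
}


def manufacturer_profile(detected_brands: Iterable[str],
                         brand_map: Dict[str, str] = BRAND_TO_MANUFACTURER
                         ) -> List[Tuple[str, int]]:
    """Count detected items per manufacturer, largest contributor first."""
    counts = Counter(
        brand_map.get(brand.lower(), "unknown")  # unmapped brands are grouped as "unknown"
        for brand in detected_brands
    )
    return counts.most_common()
```

The resulting list of (manufacturer, count) pairs is the kind of data a bar chart of top contributors could be drawn from.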


Technologies Used
TensorFlow
Python
Node.js
Date: Oct 2018 – Feb 2019