Building a text classification model for sentiment analysis using BERT and PyTorch.
The model is trained on a corpus of Google Play reviews, scraped with the `google_play_scraper` library. The trained models are stored on my Google Drive and are not uploaded here due to space constraints.
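
As a rough illustration of the scraping step, the sketch below pulls reviews for one app with `google_play_scraper` and attaches a sentiment label derived from the star rating. The thresholds (1-2 stars negative, 3 neutral, 4-5 positive), the helper names `to_sentiment` and `fetch_reviews`, and the `lang`/`country` settings are assumptions made for this example; the notebooks may do this differently.

```python
from typing import Dict, List


def to_sentiment(score: int) -> str:
    """Map a 1-5 star rating to a sentiment label (assumed thresholds)."""
    if score <= 2:
        return "negative"
    if score == 3:
        return "neutral"
    return "positive"


def fetch_reviews(app_id: str, count: int = 100) -> List[Dict[str, str]]:
    """Fetch recent English reviews for one app and label each one."""
    # Imported here so to_sentiment stays usable without the scraper installed.
    from google_play_scraper import Sort, reviews

    # reviews() returns the review dicts plus a continuation token for paging.
    result, _continuation_token = reviews(
        app_id,
        lang="en",
        country="us",
        sort=Sort.NEWEST,
        count=count,
    )
    return [
        {"content": r["content"], "sentiment": to_sentiment(r["score"])}
        for r in result
    ]
```

Calling `fetch_reviews("com.example.app", count=5)` (a placeholder package name, not one from this dataset) would return up to five labelled reviews and requires network access.
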
.
│   README.md
│   LICENSE
│   dataset-scraper.ipynb
│   text-preprocessing.ipynb
│   text-classification-bert.ipynb
│
├───data
│       apps.csv
│       reviews.csv
│
└───models
        model.bin
        model_base_cased_state_842.bin
        model.pth
- The `dataset-scraper.ipynb` notebook has all the code required to scrape the reviews dataset.
- The `text-classification-bert.ipynb` notebook is the final notebook and has the code required to run the classification model.
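
For orientation, here is a minimal sketch of three-class inference with BERT using the Hugging Face `transformers` library. It is not the notebook's code: `BertForSequenceClassification`, the `bert-base-cased` checkpoint, the class order, and the maximum length of 160 tokens are all assumptions, and the helper names are made up for this example.

```python
from typing import Sequence

# Class order is an assumption; it must match the label encoding used in training.
CLASS_NAMES = ["negative", "neutral", "positive"]


def label_from_logits(logits: Sequence[float]) -> str:
    """Return the class name with the highest logit."""
    best_index = max(range(len(logits)), key=lambda i: logits[i])
    return CLASS_NAMES[best_index]


def predict_sentiment(text: str, model_name: str = "bert-base-cased") -> str:
    """Tokenize one review, run it through BERT, and return the predicted label."""
    # Heavy dependencies are imported lazily so label_from_logits works without them.
    import torch
    from transformers import BertForSequenceClassification, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(model_name)
    # Without fine-tuned weights the classification head is randomly initialised,
    # so predictions are meaningless until trained weights are loaded.
    model = BertForSequenceClassification.from_pretrained(
        model_name, num_labels=len(CLASS_NAMES)
    )
    model.eval()

    encoded = tokenizer(
        text,
        max_length=160,  # assumed maximum sequence length
        padding="max_length",
        truncation=True,
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**encoded).logits[0]
    return label_from_logits(logits.tolist())
```

To get meaningful predictions, the fine-tuned weights from the `models` folder would have to be loaded into a model with the same architecture as the one used in training, which this sketch does not attempt.
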
Below are the performance metrics of the classification model:

| class | precision | recall | f1-score |
|---|---|---|---|
| negative | 0.83 | 0.80 | 0.81 |
| neutral | 0.75 | 0.75 | 0.75 |
| positive | 0.85 | 0.88 | 0.86 |
| accuracy | - | - | 0.81 |
| macro avg | 0.81 | 0.81 | 0.81 |
| weighted avg | 0.81 | 0.81 | 0.81 |
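
The F1 and macro-average figures can be checked from the per-class precision and recall alone, since F1 is their harmonic mean and the macro average is the unweighted mean over classes. The short script below does that arithmetic with the values from the table above; the variable and function names are chosen for this example.

```python
# Per-class (precision, recall) values copied from the table above.
REPORT = {
    "negative": (0.83, 0.80),
    "neutral": (0.75, 0.75),
    "positive": (0.85, 0.88),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# F1 for each class, computed from its precision and recall.
per_class_f1 = {label: f1_score(p, r) for label, (p, r) in REPORT.items()}

# Macro averages: the unweighted mean over the three classes.
n_classes = len(REPORT)
macro_precision = sum(p for p, _ in REPORT.values()) / n_classes
macro_recall = sum(r for _, r in REPORT.values()) / n_classes
macro_f1 = sum(per_class_f1.values()) / n_classes

for label, value in per_class_f1.items():
    print(f"{label}: f1 = {value:.2f}")
print(f"macro avg: {macro_precision:.2f} {macro_recall:.2f} {macro_f1:.2f}")
```

Rounded to two decimals, this reproduces the per-class F1 scores (0.81, 0.75, 0.86) and the 0.81 macro averages. The weighted average cannot be recomputed this way because it needs the per-class support counts.
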