This project focuses on detecting disaster-related tweets using machine learning. The dataset, sourced from Kaggle, contains 7613 tweets labeled as disaster-related or not. Various Natural Language Processing (NLP) techniques and machine learning models were applied to classify these tweets accurately.

saivasanthg/NLP-based-Disaster-Detection-in-Tweets-using-ML-Algorithms


OVERVIEW: This project develops machine learning models that detect disaster-related tweets. The dataset, obtained from Kaggle, consists of 7613 rows and 4 columns covering the tweets, their content, and their associated labels.

METHODOLOGY:

1.Data Preprocessing:

-Handling missing values in the 'keyword' column.

-Tokenization, removing stopwords, and stemming/lemmatization of text data.
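The preprocessing step can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the toy rows, the `no_keyword` placeholder, the tiny `STOPWORDS` set, and the `preprocess` helper are all assumptions made for the example, and stemming (NLTK's `PorterStemmer`) is shown rather than lemmatization.

```python
import re

import pandas as pd
from nltk.stem import PorterStemmer

# Small illustrative stopword set (an assumption for this sketch; a full
# English list would normally be used instead).
STOPWORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to"}

stemmer = PorterStemmer()


def preprocess(text: str) -> list[str]:
    """Lowercase, tokenize on runs of letters, drop stopwords, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stemmer.stem(tok) for tok in tokens if tok not in STOPWORDS]


# Toy rows standing in for the Kaggle data (hypothetical values).
df = pd.DataFrame(
    {
        "keyword": ["flood", None],
        "text": ["Flooding in the city centre", "The fires are spreading"],
    }
)

# Replace missing keywords with an explicit placeholder category.
df["keyword"] = df["keyword"].fillna("no_keyword")

# Store the cleaned token list next to the raw text.
df["tokens"] = df["text"].apply(preprocess)
```

Filling missing keywords with a placeholder keeps every row usable and lets "no keyword" act as its own category later on; dropping those rows would be another reasonable choice.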

2.Named Entity Recognition (NER):

-Using both NLTK and spaCy for extracting location entities.
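A possible shape for the location extraction is sketched below, covering only the spaCy side. The helper names `pick_locations` and `spacy_locations` are hypothetical, and the choice of labels is an assumption: spaCy's English models tag places as `GPE` or `LOC`, while NLTK's chunker uses `GPE` and `LOCATION`.

```python
# Entity labels treated as locations in this sketch (an assumption):
# spaCy uses GPE/LOC, NLTK's ne_chunk uses GPE/LOCATION.
LOCATION_LABELS = {"GPE", "LOC", "LOCATION"}


def pick_locations(entities):
    """Keep the text of (text, label) pairs whose label marks a place."""
    return [text for text, label in entities if label in LOCATION_LABELS]


def spacy_locations(text, nlp):
    """Return location-like entities found by a loaded spaCy pipeline.

    `nlp` is expected to be a pipeline with an NER component, e.g. the
    result of spacy.load("en_core_web_sm") once that model is installed.
    """
    doc = nlp(text)
    return pick_locations((ent.text, ent.label_) for ent in doc.ents)
```

Keeping the label filter separate from the pipeline call means the same `pick_locations` helper can be reused on `(text, label)` pairs produced by NLTK.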

3.Feature Engineering:

-Encoding categorical features (keywords, locations) using one-hot encoding.
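One-hot encoding of the two categorical columns could look like the sketch below, using `pandas.get_dummies`. The sample values and the `no_keyword` / `unknown` placeholders are illustrative assumptions, not taken from the dataset.

```python
import pandas as pd

# Toy categorical features (hypothetical values).
df = pd.DataFrame(
    {
        "keyword": ["flood", "fire", "no_keyword"],
        "location": ["Texas", "unknown", "Texas"],
    }
)

# Each distinct value becomes its own 0/1 column, prefixed by the source
# column name (for example keyword_flood, location_Texas).
encoded = pd.get_dummies(df, columns=["keyword", "location"], dtype=int)
```

Free-text location fields tend to have many distinct values, so in practice the number of one-hot columns may need to be limited, for instance by grouping rare values together.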

4.BERT-based Representation:

-Leveraging BERT for obtaining embeddings from tweet text.
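As a rough sketch of how tweet embeddings might be obtained, the function below uses the Hugging Face `transformers` library. The README does not say which checkpoint or pooling strategy was used, so the `bert-base-uncased` checkpoint, the `max_length` of 64, and taking the `[CLS]` token vector are all assumptions, as is the helper name `embed_tweets`.

```python
def embed_tweets(texts, model_name="bert-base-uncased", max_length=64):
    """Return one embedding per tweet as a (len(texts), hidden_size) tensor.

    The first call downloads the checkpoint, so network access is needed.
    """
    # Imported here so that defining the function does not require the
    # heavy dependencies to be loaded up front.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    # Pad/truncate so all tweets in the batch share one sequence length.
    batch = tokenizer(
        list(texts),
        padding=True,
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    )

    # Inference only: no gradients are needed.
    with torch.no_grad():
        output = model(**batch)

    # Use the hidden state of the first ([CLS]) token as the tweet vector.
    return output.last_hidden_state[:, 0, :]
```

Calling `embed_tweets(["Forest fire near La Ronge"])` with this checkpoint would give a tensor with 768 columns, which can be converted with `.numpy()` and fed to the classifiers. Mean pooling over all tokens is a common alternative to the `[CLS]` vector.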

5.Modeling:

-Training various models including Logistic Regression, SVM, Random Forest, and LightGBM.

-Evaluating model performance using accuracy as the metric.
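The training and evaluation loop might be organized along these lines. This is only a sketch: synthetic features from `make_classification` stand in for the real tweet features, the hyperparameters and the 80/20 split are assumptions, and LightGBM is included only if the `lightgbm` package is installed. The accuracies it prints say nothing about the results reported in this README.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the tweet feature matrix and disaster labels.
X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# LightGBM is an optional dependency in this sketch.
try:
    from lightgbm import LGBMClassifier

    models["LightGBM"] = LGBMClassifier(random_state=0, verbose=-1)
except ImportError:
    pass

# Fit every model on the training split and score it on the held-out split.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

# Report models from most to least accurate.
for name, acc in sorted(accuracies.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.2%}")
```

Scoring all models on the same held-out split keeps the comparison fair; with differences of a point or two, cross-validation would give a steadier ranking.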

RESULTS:

-Logistic Regression Accuracy: 82.47%

-SVM Accuracy: 80.56%

-Random Forest Accuracy: 81.48%

-LightGBM Accuracy: 83.39%

LightGBM achieved the highest accuracy of the four models (83.39%).
