A machine learning program, built from scratch. Motivated by Kaggle's datasets.
Dear all, this is a legendary dataset that you have probably come across. If you are new to it, welcome: take a look at the application and the code to learn the process of building an end-to-end data science project.
This project uses a machine learning algorithm to predict whether a passenger who travelled on the Titanic would have survived. The model is deployed as a web application, where it makes a prediction from the data you enter.
In this app, the details you enter in the form are treated as a ticket to board the Titanic, so it feels as if you are actually buying a ticket to travel. Take a look at the website.
I really like data science work. The only way I can keep getting better at it is by learning constantly and applying the skills. I picked this project because a few months ago I worked on this dataset for the Kaggle competition. That competition only required pre-processing, building the model and measuring the accuracy. I thought I'd deploy the model on the web for everyone to experience, but I couldn't at the time as I was occupied with work, etc. A few days ago I remembered it, and I completed it today while writing this readme.md file.
Completed. However, there are a few things that can be improved, which I discuss below.
As you may have noticed, this project is almost complete, but there are many more things that can be improved. Feel free to add functionality and features if they make the app better.
1. CSS for designing pages
2. There is a bug: if you enter something other than what is required in the ticket form, the app crashes. I understand it should have been fixed, but I wanted to get back to other things that need more of my attention.
3. I have used Bootstrap and very little CSS of my own. Just an FYI.
4. Many other things can be improved, such as connecting a database (for example through SQLAlchemy), saving the prediction result to a CSV or PDF file, and many more. The list goes on and on.
5. I encourage you to participate, improve the code and fix the bugs in this project, as it might help both you and me in our development/programming careers.
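For anyone who wants to pick up the form crash from item 2, here is one possible starting point. It is only a sketch: the field names (`pclass`, `sex`, `age`, `fare`), the allowed ranges and the `Ticket` class are my assumptions for illustration and may not match the real form in this repository. The idea is to turn the raw form strings into typed values and collect every problem into one `ValueError`, instead of letting a bad value reach the model.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    # Hypothetical fields; the real form may use different names.
    pclass: int
    sex: str
    age: float
    fare: float


def parse_ticket(form: dict) -> Ticket:
    """Convert raw form strings into a Ticket, raising ValueError on bad input."""
    errors = []

    def number(name, cast, low, high):
        raw = str(form.get(name, "")).strip()
        try:
            value = cast(raw)
        except ValueError:
            errors.append(f"{name}: expected a number, got {raw!r}")
            return None
        # NaN and infinity fail this range check as well.
        if not low <= value <= high:
            errors.append(f"{name}: must be between {low} and {high}")
            return None
        return value

    pclass = number("pclass", int, 1, 3)
    age = number("age", float, 0, 120)      # assumed range
    fare = number("fare", float, 0, 1000)   # assumed range
    sex = str(form.get("sex", "")).strip().lower()
    if sex not in ("male", "female"):
        errors.append(f"sex: expected 'male' or 'female', got {sex!r}")

    if errors:
        raise ValueError("; ".join(errors))
    return Ticket(pclass, sex, age, fare)
```

In the Flask view, you could call `parse_ticket(request.form)` inside a `try` block and show the form again with the error message when a `ValueError` is raised, so a wrong entry no longer takes the app down.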
Everyone can contribute here. I'm sure the super busy guys and girls may not have time, but if you are learning and love open source contribution for the benefits it can bring to you and your career, then go ahead and send pull requests.
Contribute to open source, learn and earn prizes. This fest is valid from 1st to 31st October.
- Decision Tree Classifier
- Data gathering was easy, as the dataset is readily available on the internet
- The file is very small, so it didn't require much space
- Feature Engineering: Removing NaNs and outliers, dropping columns
- Exploratory Data Analysis: Graphs and charts to understand the data
- Feature Selection
- Model Building
- Model Deployment
- Python libraries: pandas, matplotlib, scikit-learn, numpy, seaborn
- Flask - Web Framework
- Flask-WTF (WTForms)
- HTML/CSS/Bootstrap
- Heroku
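To give newcomers a feel for the steps listed above (removing NaNs, dropping columns, feature selection, model building), here is a minimal sketch using pandas and scikit-learn's `DecisionTreeClassifier`. It is not the exact code of this project: the rows are made up for illustration (they only borrow the column names of Kaggle's Titanic `train.csv`), and the choice of features, the median fill for `Age` and `max_depth=3` are my assumptions.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Made-up rows shaped like Kaggle's Titanic train.csv; NOT real passenger data.
toy = pd.DataFrame({
    "Pclass":   [1, 3, 2, 3, 1, 2, 3, 1],
    "Sex":      ["female", "male", "female", "male", "male", "female", "male", "female"],
    "Age":      [29.0, 40.0, None, 19.0, 52.0, 8.0, 33.0, 45.0],
    "Fare":     [80.0, 8.0, 15.0, 7.5, 60.0, 20.0, 9.0, 75.0],
    "Cabin":    ["B1", None, None, None, "C2", None, None, "A3"],
    "Survived": [1, 0, 1, 0, 0, 1, 0, 1],
})


def prepare(df: pd.DataFrame) -> pd.DataFrame:
    """Feature engineering: drop a sparse column, fill NaNs, encode text."""
    out = df.drop(columns=["Cabin"])                       # mostly empty column
    out["Age"] = out["Age"].fillna(out["Age"].median())    # replace missing ages
    out["Sex"] = out["Sex"].map({"male": 0, "female": 1})  # text to numbers
    return out


# Feature selection: an assumed subset of columns.
features = ["Pclass", "Sex", "Age", "Fare"]

data = prepare(toy)
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(data[features], data["Survived"])

# Predict for one new "ticket" (values already encoded like the training data).
passenger = pd.DataFrame([{"Pclass": 2, "Sex": 1, "Age": 30.0, "Fare": 20.0}])
prediction = int(model.predict(passenger[features])[0])
print("survived" if prediction == 1 else "did not survive")
```

With the real dataset you would load `train.csv` with `pd.read_csv` instead of building the frame by hand, and hold out part of the data to measure accuracy.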
If you get the error below, the Heroku free dyno hours may have run out. Please run this program on your computer instead. You can access the website again from the beginning of next month. Follow the steps below to run the program on your PC.
- Fork this repository
- Create a new environment for this project using conda or any other environment tool that you are used to
- Grab the HTTP link of the forked repository, then clone it on your computer
- Open Terminal on your Mac, or any CLI you are used to, go to the path where you want this program to be saved, and run
git clone 'paste the http link'
- From the terminal, enter the project's directory, then run this command to install the dependencies required for this program
pip install -r requirements.txt
- After the downloads finish, run the command below to start the program on your local server
python app.py
- In the terminal, find the server link. Open it in your browser.
If you have any feedback, please reach out to me at [email protected], or if you like the project, please give it a star.