Federated Semi-Supervised Learning in Image Classification

Official code for the CSE 847 Fall 2024 final project "Federated Semi-Supervised Learning in Image Classification" by Bao Hoang and Manh Tran.

Overview

Machine learning models often require large-scale labeled datasets, which are scarce in many real-world scenarios. Semi-supervised learning (SSL) addresses this limitation by combining a small amount of labeled data with a large pool of unlabeled data to improve model performance. In this project, we explore how effectively SSL techniques improve image classification. We also address data privacy in decentralized environments by adapting SSL algorithms to the federated learning framework. Our approach enables privacy-preserving, distributed training across multiple clients, paving the way for robust and secure semi-supervised machine learning algorithms. Our code is available at https://github.com/hoangcaobao/CSE847-Fall2024-FinalProject.

Package dependencies

Run conda env create -f environment.yml to create the conda environment, then activate it with conda activate FL-SSL.

Data preparation

STL-10 and CIFAR-10 are already available in the torchvision dataset library, so no further action is needed. For the Cat and Dog dataset, download the data from https://www.kaggle.com/datasets/tongpython/cat-and-dog/data, unzip the folder, and place it in the data folder of the repository.
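As an illustration of why the two torchvision datasets need no manual preparation, here is a minimal sketch that lets torchvision download them on first use. The helper name load_torchvision_dataset and the ./data root are assumptions made for this example; the repository's own loading code may be organized differently.

```python
def load_torchvision_dataset(name, root="./data"):
    """Return the labeled training split of a dataset torchvision can fetch itself.

    Hypothetical helper for illustration; `root` is an assumed download directory.
    """
    if name not in ("CIFAR10", "STL10"):
        # Cat_and_Dog is not a torchvision dataset and must be downloaded by hand.
        raise ValueError(f"{name!r} is not bundled with torchvision; prepare it manually")

    # Imported lazily so the name check above works even without torchvision installed.
    from torchvision import datasets

    if name == "CIFAR10":
        # download=True fetches the archive into `root` only if it is not already there.
        return datasets.CIFAR10(root=root, train=True, download=True)
    # STL10 selects its subset with `split` ("train", "test", "unlabeled", ...).
    return datasets.STL10(root=root, split="train", download=True)
```

Calling load_torchvision_dataset("CIFAR10") downloads the data the first time and reuses the local copy afterwards.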

Demos

Here we provide several demos of the results in the project report. You can change the arguments of main.py to try different settings.

Arguments of main.py

--dataset_name (string, optional, default: "Cat_and_Dog"):
- Specifies the dataset.
- Options include: "STL10", "CIFAR10", and "Cat_and_Dog".
--golden_baseline (flag, optional, default: False):
- If set, evaluates the golden baseline, which uses all labeled training data.
- Options include: False and True.
--numberOfClients (int, optional, default: 5):
- Specifies the number of clients in federated learning; setting it to 1 gives the centralized setting.
--solver (string, optional, default: "SelfTraining_solver"):
- Specifies the semi-supervised algorithm.
- Options include: "Standard_solver", "SelfTraining_solver", "FixMatch_solver", "MeanTeachers_solver", and "MixMatch_solver".
--model (string, optional, default: "simpleCNN"):
- Specifies the computer vision model.
- Options include: "simpleCNN", "resnet18", and "densenet121".
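The arguments listed above could be declared with argparse roughly as follows. This is a sketch reconstructed from the documented names, defaults, and options only; the actual parser in main.py may differ, and treating --golden_baseline as a store_true flag is an assumption.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented main.py arguments (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Federated semi-supervised learning in image classification"
    )
    parser.add_argument("--dataset_name", type=str, default="Cat_and_Dog",
                        choices=["STL10", "CIFAR10", "Cat_and_Dog"])
    # Assumption: the flag takes no value and switches the baseline on when present.
    parser.add_argument("--golden_baseline", action="store_true")
    parser.add_argument("--numberOfClients", type=int, default=5)
    parser.add_argument("--solver", type=str, default="SelfTraining_solver",
                        choices=["Standard_solver", "SelfTraining_solver",
                                 "FixMatch_solver", "MeanTeachers_solver",
                                 "MixMatch_solver"])
    parser.add_argument("--model", type=str, default="simpleCNN",
                        choices=["simpleCNN", "resnet18", "densenet121"])
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--model", "resnet18", "--dataset_name", "CIFAR10"])
print(args)
```

Any argument that is not passed keeps its default, so omitting --solver selects SelfTraining_solver.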

Example

Run the FixMatch algorithm on CIFAR-10 with the ResNet-18 model: python main.py --solver FixMatch_solver --model resnet18 --dataset_name CIFAR10
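To compare every solver under otherwise identical settings, the command lines can be generated programmatically. The sketch below uses only the flags documented in this README; build_command is a hypothetical helper and is not part of the repository.

```python
import shlex

# Solver names as listed under "Arguments of main.py".
SOLVERS = [
    "Standard_solver",
    "SelfTraining_solver",
    "FixMatch_solver",
    "MeanTeachers_solver",
    "MixMatch_solver",
]


def build_command(solver, model="resnet18", dataset_name="CIFAR10", number_of_clients=5):
    """Return the argument list for one run of main.py (hypothetical helper)."""
    return [
        "python", "main.py",
        "--solver", solver,
        "--model", model,
        "--dataset_name", dataset_name,
        "--numberOfClients", str(number_of_clients),
    ]


commands = [build_command(solver) for solver in SOLVERS]
for command in commands:
    # shlex.join renders the list as a copy-pasteable shell command.
    print(shlex.join(command))
```

Each list can be passed to subprocess.run from the repository root to launch the corresponding run; passing number_of_clients=1 gives the centralized setting.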

The project report and presentation slides are available in Report.pdf and Presentation.pdf.

About

Releases

Packages

Contributors 2

Languages

License

hoangcaobao/CSE847-Fall2024-FinalProject

Folders and files

Latest commit

History

Repository files navigation

Federated Semi-Supervised Learning in Image Classification

Overview

Package dependencies

Data preparation

Demos

Arguments of main.py

Example

Find Project Report and Presentation Slides on Report.pdf and Presentation.pdf

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages