Skip to content

SmartHomePrivacyProject/CrossTraceWF

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

30 Commits
 
 
 
 

Repository files navigation

Cross-trace Website Fingerprinting

The dataset and code are for research purposes only. The results of this study are published in the following paper:

Jimmy Dani and Boyang Wang, "HiddenText: Cross-Trace Website Fingerprinting Over Encrypted Traffic," IEEE 22nd International Conference on Information Reuse and Integration for Data Science (IRI'21), August, 2021.

Code

The src directory comprises sub-directories, each of which is labeled as a performance evaluation part of the paper depending on experiment name. At the beginning of each script the instructions to execute the script are included.

  • For experiments A1 and A2, execute following command python <name-of-script.py> /path/to/save/model/model-name.h5 /path/to/dataset
  • For experiments A3, A4, and A5 execute the following command python <name-of-script.py> /path/to/wt-def-model.h5 /path/to/paired/dataset
  • For experiment A6, the directory contains two sub-directories model-training and nlp-analysis
    • model-training: contains script for training model defended using MockingBird defense. To run this script, execute the following command python <name-of-script.py> /path/to/save/model/model-name.h5 /path/to/dataset
    • nlp-analysis: contains scripts for performing cross-trace attack on MockingBird defended dataset. To run the script, execute the following command python <name-of-script.py> /path/to/mb-def-model.h5 /path/to/paired/dataset
Additional Python Libraries Required

In addition, certain more libraries are needed to run scripts other than the pre-installed python packages. The following are the libraries:

  • tensorflow-gpu==2.3.1
  • sentence-transformers==0.4.1.2
  • gensim==3.8.3
  • pandas
  • numpy==1.19.5
  • scikit-learn==0.23.2
  • nltk==3.5

⚠️ Note: It will take a while to get results from WMD and BERT.

Dataset

The dataset used for this research is available at this link.

The used data set is provided as CSV files for this research. For each experiment six CSV files are required, each of which is described in the following manner:

  • x_train: This file contains the traffic traces which are used as an input for the CNN for training
  • y_train: This file contains labels corresponding to the traffic traces available in x_train
  • x_valid: This file contains the traffic traces used for validation
  • y_valid: This file contains labels corresponding to the traffic traces available in x_valid
  • x_test: This file constains the traffic traces for testing/evaluating the trained CNN model
  • y_test: This file contains labels corresponding the traffic traces in x_test

Note: The data needed for various experiments are organized in subdirectories similar to the code.

Questions and Comments

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages