Abstractive text summarisation of GitHub issues

Overview

This project trains a sequence-to-sequence model built from GRUs on a dataset of GitHub issue titles, bodies and URLs. Given an issue body, the model generates a title that is intended to be a more compact yet accurate summary of it.
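
As a rough illustration of how a (body, title) pair might be shaped for such a sequence-to-sequence model, the sketch below splits one pair into an encoder input, a decoder input and a decoder target. The `<start>` and `<end>` markers, the whitespace tokenisation and the function name `make_training_example` are assumptions made for this example and are not taken from the notebook.

```python
# Hypothetical special tokens; the notebook may use different markers.
START, END = "<start>", "<end>"


def make_training_example(body, title):
    """Split one (body, title) pair into encoder input, decoder input and decoder target."""
    # Naive lowercase + whitespace tokenisation, for illustration only.
    encoder_input = body.lower().split()
    title_tokens = title.lower().split()
    # The decoder input is the title shifted right by one step...
    decoder_input = [START] + title_tokens
    # ...so that at every position the target is the next title token.
    decoder_target = title_tokens + [END]
    return encoder_input, decoder_input, decoder_target


enc, dec_in, dec_out = make_training_example(
    "The app crashes when I click the save button twice",
    "Crash on double save",
)
for seen, expected in zip(dec_in, dec_out):
    print(f"decoder sees {seen!r:10} and should predict {expected!r}")
```

Because the decoder input always holds the ground-truth previous token rather than the model's own prediction, pairs shaped this way are what training with teacher forcing relies on.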

The project includes:

A sequence-to-sequence model built with RNNs for abstractive text summarisation.
Teacher forcing to train the decoder.
A recommender that suggests GitHub issues with similar titles, built with Spotify's Annoy package.
An evaluation of the model's performance through its BLEU score.
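
To make the BLEU item concrete, here is a minimal sketch of sentence-level BLEU against a single reference title, using clipped n-gram precision and the standard brevity penalty. It is a simplified, unsmoothed version written for this README and not necessarily the scoring code used in the notebook; the function names and the example titles are assumptions.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(reference, candidate, max_n=4):
    """Sentence-level BLEU with one reference, uniform weights and no smoothing."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clipping: an n-gram is credited at most as often as it occurs in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            # Without smoothing, a single empty precision makes the score zero.
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: candidates shorter than the reference are penalised.
    if len(candidate) > len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference_title = "crash on double save".split()
generated_title = "crash on save".split()
print(round(bleu(reference_title, generated_title, max_n=2), 4))
```

Issue titles are short, so higher-order n-grams often have no matches at all; the example therefore uses `max_n=2`, and a smoothed variant may be preferable in practice.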

Architecture

Dataset

The dataset has over 8M entries, so training the model on it takes a considerable amount of time.
You can find the dataset here.
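
With this many rows it can help to stream the file in batches instead of loading it all at once. The sketch below shows one way to do that with the standard library; the column names `issue_url`, `issue_title` and `body`, the function name `iter_batches` and the in-memory sample rows are assumptions for illustration and may not match the actual file.

```python
import csv
import io
from itertools import islice


def iter_batches(csv_file, batch_size, title_col="issue_title", body_col="body"):
    """Yield lists of (body, title) pairs without reading the whole file into memory."""
    reader = csv.DictReader(csv_file)
    while True:
        # Pull at most batch_size rows from the reader.
        rows = list(islice(reader, batch_size))
        if not rows:
            break
        yield [(row[body_col], row[title_col]) for row in rows]


# Tiny in-memory stand-in for the real CSV (assumed column names, made-up rows).
sample = io.StringIO(
    "issue_url,issue_title,body\n"
    "u1,Crash on save,App crashes when saving\n"
    "u2,Typo in docs,The README has a typo\n"
    "u3,Slow startup,Startup takes a long time\n"
)
batches = list(iter_batches(sample, batch_size=2))
print([len(batch) for batch in batches])
```

For the real dataset, the same function could be given a file object opened with `newline=""` and an explicit encoding in place of the `io.StringIO` sample.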
