Demo

You can access the demo via this link.

RAG Movie Recommender System

This project proposes a new way to enhance movie recommendation systems by leveraging recent advances in natural language processing (NLP) and information retrieval. It applies these techniques to provide users with tailored movie suggestions based on Persian queries, with the aim of improving their overall movie-watching experience.

Data Collection

We collected movie data from four prominent Persian websites: (DigiMovie, n.d.), (FilmKio, n.d.), (TinyMovie, n.d.), and (Uptv, n.d.). These websites offer a large repository of movies with Persian-language details, covering various genres, release years, IMDb ratings, and more. We scraped each site to extract movie titles, descriptions, genres, release years, ratings, actors' names, and other relevant details.
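As an illustration of the kind of record such a scrape could produce, here is a minimal sketch. The class name `MovieRecord`, its field names, and the `to_document` helper are hypothetical and chosen for this example only; the actual schema used in the repository is not described here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MovieRecord:
    """One scraped movie (hypothetical schema, for illustration only)."""
    title: str
    description: str = ""
    genres: List[str] = field(default_factory=list)
    release_year: Optional[int] = None
    imdb_rating: Optional[float] = None
    actors: List[str] = field(default_factory=list)
    source: str = ""  # which website the record came from

    def to_document(self) -> str:
        # Join the textual fields into one string that could later be embedded.
        parts = [
            self.title,
            self.description,
            " ".join(self.genres),
            " ".join(self.actors),
        ]
        return " ".join(p for p in parts if p)


record = MovieRecord(title="Example Title", description="A short plot.", genres=["drama"])
print(record.to_document())
```

Keeping optional fields such as the release year or rating nullable lets records from different sites share one shape even when a site omits a detail.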

Proposed Method

The proposed system combines information retrieval and text generation through the Retrieval-Augmented Generation (RAG) framework. To capture semantic similarity between user queries and movie entries, the system embeds both with language models. Using these embedded representations, it retrieves the movies most relevant to the user query from the dataset. To enhance recommendation accuracy, we compare embeddings generated by FastText, ParsBERT, GPT, and the Cohere multilingual model.
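The retrieval step can be sketched as a nearest-neighbour search over embeddings. This is a minimal sketch, not the project's implementation: it assumes cosine similarity as the relevance score (the similarity measure is not stated above), uses toy 3-dimensional vectors in place of real model embeddings, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def retrieve_top_k(query_embedding, movie_embeddings, k=5):
    """Return the k (title, score) pairs most similar to the query."""
    scored = [
        (title, cosine_similarity(query_embedding, emb))
        for title, emb in movie_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for embeddings of movie descriptions (assumption).
movies = {
    "movie_a": [1.0, 0.0, 0.0],
    "movie_b": [0.0, 1.0, 0.0],
    "movie_c": [0.9, 0.1, 0.0],
}
query = [1.0, 0.0, 0.0]
top = retrieve_top_k(query, movies, k=2)
print(top)
```

In a RAG pipeline, the retrieved movies would then be passed to the text generation model as context for the final recommendation.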

Evaluation

To evaluate the performance of the proposed system, we benchmarked the models on the following metrics:

1. IoU (Intersection over Union)

The Intersection over Union (IoU) metric measures the overlap between the predicted and ground truth movie recommendations. To build the ground truth recommendations, we used the "similar movies" section on the IMDb website.

We randomly selected 100 movies from the dataset as the IMDb evaluation set.
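For two sets of movies, IoU is the size of their intersection divided by the size of their union. The sketch below shows this for a single movie; the function name and the convention of returning 0.0 when both lists are empty are assumptions made for the example.

```python
def iou(predicted, ground_truth):
    """Intersection over Union of two collections of movie identifiers."""
    predicted_set = set(predicted)
    truth_set = set(ground_truth)
    union = predicted_set | truth_set
    if not union:
        return 0.0  # assumption: two empty lists score zero
    return len(predicted_set & truth_set) / len(union)


# Two movies are shared, four distinct movies appear overall: 2 / 4.
score = iou(["a", "b", "c"], ["b", "c", "d"])
print(score)
```

Averaging this value over every movie in the evaluation set would give one score per embedding model.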

2. Overlap

The Overlap metric measures the number of overlapping movies between the predicted and ground truth recommendations.
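A minimal sketch of this count, assuming recommendations are compared by a unique movie identifier (the function name is hypothetical):

```python
def overlap(predicted, ground_truth):
    """Number of movies that appear in both recommendation lists."""
    return len(set(predicted) & set(ground_truth))


shared = overlap(["a", "b", "c"], ["b", "c", "d"])
print(shared)
```

Because it is a raw count, this value depends on how many recommendations are compared, so it is easiest to interpret when every model returns the same number of movies.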

3. Accuracy by Genre

The Accuracy by Genre metric evaluates how well the models recommend movies from different genres. We randomly selected 10 genres from the dataset and calculated the accuracy of each model in recommending movies from these genres. That is, we calculated the percentage of movies from the selected genre that were recommended by the models.
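The definition above can be read in more than one way. The sketch below assumes one reading: for a query about a given genre, accuracy is the percentage of recommended movies that carry that genre label. The function name, the `movie_genres` mapping, and the handling of unknown movies are all assumptions made for illustration.

```python
def genre_accuracy(recommended, movie_genres, genre):
    """Percentage of recommended movies labelled with the given genre."""
    if not recommended:
        return 0.0
    hits = sum(
        1 for movie in recommended
        if genre in movie_genres.get(movie, ())  # unknown movies count as misses
    )
    return 100.0 * hits / len(recommended)


# Hypothetical genre labels for four movies; "d" has no known genres.
movie_genres = {
    "a": {"drama"},
    "b": {"drama", "comedy"},
    "c": {"action"},
}
accuracy = genre_accuracy(["a", "b", "c", "d"], movie_genres, "drama")
print(accuracy)
```

Under the other natural reading, the share of a genre's movies that get recommended, the denominator would instead be the number of movies in that genre.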

drippypale/movie-rag
