drippypale/movie-rag

Demo

You can access the demo via this Link

RAG Movie Recommender System

This project proposes a new way to enhance movie recommendation systems by leveraging recent advances in natural language processing (NLP) and information retrieval. It applies these techniques to provide users with movie suggestions tailored to their Persian queries, improving their overall movie-watching experience.

Data Collection

We collected movie data from four prominent Persian websites: (DigiMovie, n.d.), (FilmKio, n.d.), (TinyMovie, n.d.), and (Uptv, n.d.). These websites offer a large repository of movies with Persian-language details, covering various genres, release years, IMDb ratings, and more. We scraped information such as movie titles, descriptions, genres, release years, ratings, actors' names, and other relevant details.
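As an illustration of the kind of record the scraping step could produce, the sketch below defines a container for the fields listed above. The class name `MovieRecord`, its field names, and the sample values are hypothetical and are not taken from the repository.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MovieRecord:
    """Hypothetical container for one scraped movie (names are assumptions)."""
    title: str
    description: str
    genres: List[str] = field(default_factory=list)
    release_year: Optional[int] = None
    imdb_rating: Optional[float] = None
    actors: List[str] = field(default_factory=list)
    source: str = ""  # which of the four websites the record came from


# Placeholder values for illustration only, not real scraped data.
record = MovieRecord(
    title="Example Title",
    description="Example Persian-language description.",
    genres=["Drama"],
    release_year=2020,
    imdb_rating=7.5,
    actors=["Example Actor"],
    source="DigiMovie",
)
```

Optional fields default to `None` or an empty list here because a given site may not expose every detail for every movie.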

Figure: Data collection

Proposed Method

The proposed system combines information retrieval and text generation through the Retrieval-Augmented Generation (RAG) framework. To capture semantic similarity between user queries and the movie dataset, the system embeds both using language models. Using these embedded representations, it retrieves the movies most relevant to the user query from the dataset. To enhance recommendation accuracy, we compare embeddings generated by FastText, ParsBERT, GPT, and the Cohere multilingual model.
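The retrieval step can be sketched as a nearest-neighbour search over embedding vectors. The example below assumes cosine similarity as the scoring function, which the text above does not specify, and uses toy 3-dimensional vectors in place of real model embeddings. The function names `cosine_similarity` and `retrieve_top_k` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(query_vec, movie_vecs, k=5):
    """Return the k (title, score) pairs with the highest similarity to the query."""
    scored = [
        (title, cosine_similarity(query_vec, vec))
        for title, vec in movie_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for embeddings of movie descriptions.
movie_vecs = {
    "Movie A": [1.0, 0.0, 0.0],
    "Movie B": [0.7, 0.7, 0.0],
    "Movie C": [0.0, 0.0, 1.0],
}
top = retrieve_top_k([1.0, 0.1, 0.0], movie_vecs, k=2)
```

In a RAG setup, the retrieved titles and their descriptions would then be passed to the text generation model as context for the final recommendation.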

Figure: Proposed method

Evaluation

To evaluate the performance of the proposed system, we benchmarked the models on the following metrics:

1. IoU (Intersection over Union)

The Intersection over Union (IoU) metric measures the overlap between the predicted and ground truth movie recommendations. To build the ground truth recommendations, we used the "similar movies" section on the IMDb website.

We randomly selected 100 movies from the dataset as the IMDb evaluation set.
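For a single movie, IoU is the size of the intersection of the two recommendation sets divided by the size of their union. A minimal sketch follows, assuming recommendations are compared by an identifier such as the title. The function name `iou` and the sample identifiers are illustrative only.

```python
def iou(predicted, ground_truth):
    """Intersection over Union of two recommendation lists, treated as sets."""
    pred, truth = set(predicted), set(ground_truth)
    union = pred | truth
    if not union:
        return 0.0  # both lists empty: define IoU as 0 to avoid division by zero
    return len(pred & truth) / len(union)


# Two shared movies out of five distinct movies overall.
score = iou(["m1", "m2", "m3"], ["m2", "m3", "m4", "m5"])
```

Averaging this score over the evaluation set would give one number per embedding model, although the exact aggregation used in the project is not stated here.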

2. Overlap

The Overlap metric measures the number of overlapping movies between the predicted and ground truth recommendations.
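Unlike a ratio-based score, the overlap is a raw count, so it is not normalized by the length of either list. A small sketch with a hypothetical function name and sample identifiers:

```python
def overlap(predicted, ground_truth):
    """Number of movies that appear in both recommendation lists."""
    return len(set(predicted) & set(ground_truth))


shared = overlap(["m1", "m2", "m3"], ["m2", "m3", "m4", "m5"])
```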

Figure: IoU

3. Accuracy by Genre

The Accuracy by Genre metric evaluates how well the models recommend movies from different genres. We randomly selected 10 genres from the dataset and calculated the accuracy of the models in recommending movies from each of them. That is, we calculated the percentage of movies from the selected genre that the models recommended.
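One plausible reading of this metric is the percentage of recommended movies that belong to the selected genre. The sketch below implements that reading as an assumption; the project's actual computation may differ. The function name `genre_accuracy` and the sample data are hypothetical.

```python
def genre_accuracy(recommended, movie_genres, target_genre):
    """Percentage of recommended titles whose genre list contains target_genre."""
    if not recommended:
        return 0.0
    hits = sum(
        1 for title in recommended
        if target_genre in movie_genres.get(title, ())
    )
    return 100.0 * hits / len(recommended)


# Placeholder genre lookup, not real dataset entries.
movie_genres = {
    "m1": ["Comedy"],
    "m2": ["Drama", "Comedy"],
    "m3": ["Horror"],
    "m4": ["Drama"],
}
accuracy = genre_accuracy(["m1", "m2", "m3", "m4"], movie_genres, "Comedy")
```

Titles missing from the lookup count as misses in this sketch, which keeps the score conservative when genre data is incomplete.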

Figure: Accuracy by genre

Contributors

  1. Mohammad Mahdi Gharaguzlo
  2. Mohammad Asadi
  3. Ramin Roshan
