FAKENEWSDETECTOR

Built with the tools and technologies:

tqdm, JavaScript, Selenium, NumPy, Python



Overview

FakeNewsDetector is a tool for flagging potentially fake news. It applies natural language processing (NLP) techniques to a news article and compares the article against related news coverage.

Key features:

  • Article analysis using the KeyBERT model
  • Real-time comparison with related news items retrieved from Naver's Open API
  • Cosine similarity score calculation for each article


Features

Feature Summary
⚙️ Architecture
  • The project uses a modular architecture, with separate scripts for accuracy checking, news title generation, and crawling.
  • It utilizes natural language processing (NLP) techniques to analyze text and identify relevant phrases.
  • The project relies on external data sources for retrieval: Naver's Open API and the FactCheck SNU website.
🔩 Code Quality
  • The codebase uses Python as the primary language, with some JavaScript files for the news selector extension.
  • It employs popular libraries like pandas, numpy, requests, selenium, and tqdm for data manipulation and processing.
  • The code aims for clear variable naming and concise function definitions.
📄 Documentation
  • The documentation summarizes each code file and its purpose.
  • It is written in Markdown format and includes links to relevant APIs and libraries.
  • It covers both the Python scripts and the JavaScript files of the news selector extension.
🔌 Integrations
  • The project integrates with Naver's Open API and the FactCheck SNU website for data retrieval.
  • ChromeExt.py additionally uses the NewsDataApiClient API and OpenAI's GPT-4o-mini model.
  • Web access relies on the requests and selenium modules.
💻 Tools and Technologies
  • Python as the primary language
  • JavaScript for the news selector extension
  • pandas, numpy, requests, selenium, and tqdm libraries for data manipulation and processing
  • NLP techniques for text analysis

Project Structure

└── FakeNewsDetector/
    ├── accuracy_checker.py
    ├── app.py
    ├── ChromeExt.py
    ├── crawling.py
    ├── data
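    │   ├── 논쟁 중.txt            # In dispute
    │   ├── 대체로 사실 아님.txt   # Mostly not true
    │   ├── 대체로 사실.txt        # Mostly true
    │   ├── 사실.txt               # True
    │   ├── 전혀 사실 아님.txt     # Not true at all
    │   ├── 절반의 사실.txt        # Half true
    │   └── 판단 유보.txt          # Judgment withheld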
    ├── LICENSE
    ├── main.py
    ├── news-selector
    │   ├── app.js
    │   ├── background.js
    │   ├── icon.png
    │   ├── LICENSE
    │   └── manifest.json
    └── requirements.txt

Project Index

FakeNewsDetector/
__root__
accuracy_checker.py - Evaluates the accuracy of news articles by comparing keywords extracted from the text with the descriptions of related news items.
- Uses the KeyBERT model to identify relevant phrases, then retrieves related news items from Naver's Open API.
- Calculates a cosine similarity score for each article and assigns an accuracy rating based on this score.
- Processes multiple files, tracking the total number of titles and the average cosine similarity for each file.
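
The cosine similarity step described above can be illustrated with a minimal sketch. It assumes plain whitespace tokenization and term-frequency vectors rather than the KeyBERT model the script is described as using, and the function names `cosine_similarity` and `average_similarity` are hypothetical, not taken from `accuracy_checker.py`.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between term-frequency vectors of two texts."""
    # Assumption: lowercase whitespace tokens stand in for real keyword extraction.
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    # Dot product over the shared vocabulary only; other terms contribute zero.
    dot = sum(vec_a[term] * vec_b[term] for term in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


def average_similarity(title: str, descriptions: list[str]) -> float:
    """Mean similarity of one title against several retrieved descriptions."""
    if not descriptions:
        return 0.0
    return sum(cosine_similarity(title, d) for d in descriptions) / len(descriptions)
```

A score near 1 means the title and the retrieved descriptions share most of their vocabulary; how a score maps to an accuracy rating is up to the script and is not reproduced here.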

app.py - Generates a selector based on input text by extracting keywords, retrieving related news articles, and calculating their similarity.
- Uses natural language processing techniques to analyze the text, identify relevant nouns, and retrieve descriptions from the Naver News API.
- The resulting selector is determined by the similarity score between the input text and the retrieved descriptions.
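
The retrieval step can be sketched as follows. This is an illustration under assumptions, not the code in `app.py`: it assumes the public Naver news search endpoint and its `X-Naver-Client-Id` / `X-Naver-Client-Secret` headers, and the helper names `build_news_request` and `clean_description` are hypothetical. The sketch only builds the request and cleans a description string; it does not contact the network.

```python
import html
import re
from urllib.parse import urlencode

# Assumption: the public Naver news search endpoint; the project may call a different one.
NAVER_NEWS_URL = "https://openapi.naver.com/v1/search/news.json"


def build_news_request(query: str, client_id: str, client_secret: str, display: int = 10):
    """Return (url, headers) for a news search; credentials are placeholders."""
    url = f"{NAVER_NEWS_URL}?{urlencode({'query': query, 'display': display})}"
    headers = {
        "X-Naver-Client-Id": client_id,
        "X-Naver-Client-Secret": client_secret,
    }
    return url, headers


def clean_description(raw: str) -> str:
    """Strip markup such as <b>...</b> and unescape HTML entities."""
    return html.unescape(re.sub(r"<[^>]+>", "", raw)).strip()
```

The returned pair could be passed to `requests.get(url, headers=headers)`, after which each item's description would be cleaned before the similarity comparison.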

ChromeExt.py - Generates a list of news titles from the NewsDataApiClient API and uses OpenAI's GPT-4o-mini model to analyze input text, identifying potential fake news based on the provided news titles.
- The output is written to a file named 'titles.txt'.

crawling.py - Crawls the FactCheck SNU website to extract titles from fact-check cards, storing them in text files.
- Runs in either single-threaded or multi-threaded mode; the latter processes pages concurrently on multiple threads.
- Checks the server's status before running and lets the user choose the execution mode.
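
The two execution modes can be sketched with the standard library. This is a minimal illustration, not the code in `crawling.py`: `crawl_pages` and the `fetch_titles` callback are hypothetical names, and `fetch_titles` stands in for whatever function actually scrapes one page of fact-check cards.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def crawl_pages(
    pages: Iterable[int],
    fetch_titles: Callable[[int], list[str]],
    multithreaded: bool = False,
    max_workers: int = 4,
) -> list[str]:
    """Collect titles from every page, sequentially or with a thread pool."""
    page_list = list(pages)
    if multithreaded:
        # executor.map preserves input order, so both modes return titles in page order.
        with ThreadPoolExecutor(max_workers=max_workers) as executor:
            per_page = list(executor.map(fetch_titles, page_list))
    else:
        per_page = [fetch_titles(page) for page in page_list]
    # Flatten the per-page lists into one list of titles.
    return [title for titles in per_page for title in titles]
```

Threads suit this workload because page fetches spend most of their time waiting on the network rather than on the CPU.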

main.py - Entry point that manages the project's execution flow, running specific scripts based on user input.
- Lets users choose between running the main application or the accuracy checker.
- Checks data file integrity, ensuring that the necessary files exist and are not empty before proceeding with the chosen action.
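
The integrity check described above amounts to confirming that each expected file exists and is non-empty. A minimal sketch, assuming the files live in a single directory; the function name `find_bad_data_files` is hypothetical and not taken from `main.py`.

```python
from pathlib import Path


def find_bad_data_files(data_dir: str, expected_names: list[str]) -> list[str]:
    """Return the expected file names that are missing or empty."""
    base = Path(data_dir)
    bad = []
    for name in expected_names:
        path = base / name
        # A file counts as unusable if it does not exist or has zero bytes.
        if not path.is_file() or path.stat().st_size == 0:
            bad.append(name)
    return bad
```

An empty result means every file is usable; otherwise the caller could re-run the crawler before continuing.
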
requirements.txt - Lists the project's required packages and their versions.
- Covers data manipulation tools such as pandas and numpy, web-related modules such as requests and selenium, and progress tracking with tqdm.
news-selector
app.js - Implements the news selector feature, which highlights an element and extracts its text content on mouse hover or key press, letting users generate a unique CSS selector for the selected element.
- Creates a highlighter div, updates its position and size to match the hovered element, and sends the extracted text to a backend API to generate the selector.

background.js - Serves as the entry point for the news selector extension.
- Listens for browser actions and, when clicked, executes a script on the target tab to start the setup process.
- The setup enables highlighting, updates the highlight on mouse movement, captures clicks to grab selectors, and checks for termination keys.
manifest.json - Defines the News Selector extension's metadata, permissions, and functionality, enabling users to extract news pages and send them to a server.
- Specifies the background service worker, content scripts, and commands for executing actions, as well as icon and title settings.


Getting Started

Prerequisites

Before getting started with FakeNewsDetector, ensure your runtime environment meets the following requirements:

  • Programming Language: Python
  • Package Manager: Pip

Installation

Build FakeNewsDetector from source:

  1. Clone the FakeNewsDetector repository:
❯ git clone https://github.com/tkgo11/FakeNewsDetector
  2. Navigate to the project directory:
❯ cd FakeNewsDetector
  3. Install the project dependencies using pip:
❯ pip install -r requirements.txt

Chrome Extension Installation

To install the news-selector Chrome extension, follow these steps:

  1. Open Chrome Extensions Page:

    • Open Google Chrome and navigate to chrome://extensions/.
  2. Enable Developer Mode:

    • In the top right corner, toggle the "Developer mode" switch to enable it.
  3. Load Unpacked Extension:

    • Click on the "Load unpacked" button.
    • Select the news-selector directory from the FakeNewsDetector project folder.
  4. Verify Installation:

    • Ensure the news-selector extension appears in the list of installed extensions.
    • You should see the extension icon in the Chrome toolbar.
  5. Usage:

    • Click on the news-selector icon in the toolbar to activate the extension.

Usage

Run FakeNewsDetector using the following command:

❯ python main.py

Project Roadmap

  • Task 1: Write the crawling code
  • Task 2: Make crawling multi-threaded
  • Task 3: Use the Requests library for crawling

Contributing

Contributing Guidelines
  1. Fork the Repository: Start by forking the project repository to your GitHub account.
  2. Clone Locally: Clone the forked repository to your local machine using a git client, replacing <your-username> with your GitHub username.
    git clone https://github.com/<your-username>/FakeNewsDetector
  3. Create a New Branch: Always work on a new branch, giving it a descriptive name.
    git checkout -b new-feature-x
  4. Make Your Changes: Develop and test your changes locally.
  5. Commit Your Changes: Commit with a clear message describing your updates.
    git commit -m 'Implemented new feature x.'
  6. Push to GitHub: Push the changes to your forked repository.
    git push origin new-feature-x
  7. Submit a Pull Request: Create a PR against the original project repository. Clearly describe the changes and their motivations.
  8. Review: Once your PR is reviewed and approved, it will be merged into the main branch. Congratulations on your contribution!

License

This project is licensed under the MIT License. For more details, refer to the LICENSE file.


Acknowledgments

grab-selector