Skip to content

Repo for paper "M-MAD: Multidimensional Multi-Agent Debate for Advanced Machine Translation Evaluation"

Notifications You must be signed in to change notification settings

SU-JIAYUAN/M-MAD

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 

Repository files navigation

M-MAD: Multidimensional Multi-Agent Debate for Advanced Machine Translation Evaluation

The code for M-MAD will be released upon acceptance of the paper. Stay tuned for updates!

🤖 About M-MAD

The M-MAD framework is a systematic LLM-based multi-agent framework for advanced LLM-as-a-judge MT evaluation. It operates in three stages:

  1. Dimension Partition: Decomposing the heuristic MQM annotation guideline into distinct dimensions for independent LLM-as-a-judge assessments.
  2. Multi-Agent Debate: Conducting multi-agent debates within each dimension, harnessing LLMs' inherent knowledge, reasoning, and collaborative abilities.
  3. Final Judgement: Synthesizing the debated outcomes through a final judge agent to produce a comprehensive evaluation judgment.

framework.png

📄 Paper

For a detailed explanation of the M-MAD framework, please refer to the paper:
Multidimensional Multi-Agent Debate for Advanced Machine Translation Evaluation (arXiv)

Meta-evaluation

To run the meta-evaluation for the metrics, execute the following file:

wmt23_metrics.ipynb

Citation

@article{feng2024mmad,
  title={M-MAD: Multidimensional Multi-Agent Debate Framework for Fine-grained Machine Translation Evaluation},
  author={Feng, Zhaopeng and Su, Jiayuan and Zheng, Jiamei and Ren, Jiahan and Zhang, Yan and Wu, Jian and Wang, Hongwei and Liu, Zuozhu},
  journal={arXiv preprint arXiv:2412.20127},
  year={2024}
}

About

Repo for paper "M-MAD: Multidimensional Multi-Agent Debate for Advanced Machine Translation Evaluation"

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published