This is the official repository for our NAACL 2025 main conference paper, "Is a Peeled Apple Still Red? Evaluating LLMs' Ability for Conceptual Combination with Property Type".
First, clone our GitHub repository.
git clone https://github.com/seokwon99/CCPT.git
Then navigate to the newly created folder.
cd CCPT
Next, create a new Python 3.9+ environment using conda.
conda create --name ccpt python=3.9
Activate the newly created environment.
conda activate ccpt
All external package requirements are listed in requirements.txt.
To install all packages, run the following commands.
pip install -r requirements.txt
pip install flash-attn --no-build-isolation
python -m spacy download en_core_web_sm
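After the install commands finish, a quick import check can confirm that the key packages are visible in the active environment. This is a minimal sketch, not part of the repository: the helper name `is_installed` is hypothetical, and the import names `flash_attn` and `en_core_web_sm` are assumed to be the modules provided by the `flash-attn` package and the downloaded spaCy model.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it.

    `is_installed` is a hypothetical helper for illustration only.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # The environment above is created with Python 3.9.
    print("Python >= 3.9:", sys.version_info >= (3, 9))
    # Assumed import names for the packages installed above.
    for name in ("spacy", "en_core_web_sm", "flash_attn"):
        print(name, "found" if is_installed(name) else "missing")
```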
export OPENAI_API_KEY="sk-..." # only needed for OpenAI models
export ANTHROPIC_API_KEY="sk-..." # only needed for Anthropic models
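Before launching runs that call hosted models, it can help to confirm that the relevant key is visible to Python. The snippet below is a minimal sketch, not part of the repository: the helper name `missing_api_keys` is hypothetical, and it only checks that the two environment variables named above are set and non-empty.

```python
import os

# Environment variables named in the export commands above.
API_KEY_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def missing_api_keys(env=None, names=API_KEY_VARS):
    """Return the names of API key variables that are unset or empty.

    `missing_api_keys` is a hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        print("Not set:", ", ".join(missing))
    else:
        print("Both API keys are set.")
```

Only the key for the provider you actually use needs to be present, so a non-empty result is not necessarily an error.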
python -m experiment.run.gen_property --property_type emergent # property induction (emergent)
python -m experiment.run.gen_property --property_type canceled # property induction (canceled)
python -m experiment.run.gen_combination --property_type emergent # noun phrase completion (emergent)
python -m experiment.run.cls_property_type # property type prediction
python -m experiment.eval.eval_property --property_type emergent # property induction (emergent)
python -m experiment.eval.eval_property --property_type canceled # property induction (canceled)
python -m experiment.eval.eval_combination --property_type emergent # noun phrase completion (emergent)
python -m experiment.eval.eval_type # property type prediction
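The generation and evaluation commands above can also be driven from a single script. The following is a sketch under stated assumptions: it reuses only the module names and `--property_type` values that appear in the commands above, it assumes it is launched from the repository root, and the names `STEPS`, `build_command`, and `run_all` are hypothetical.

```python
import subprocess
import sys

# (module, property_type) pairs copied from the commands listed above.
# property_type is None for modules that are shown without that flag.
STEPS = [
    ("experiment.run.gen_property", "emergent"),
    ("experiment.run.gen_property", "canceled"),
    ("experiment.run.gen_combination", "emergent"),
    ("experiment.run.cls_property_type", None),
    ("experiment.eval.eval_property", "emergent"),
    ("experiment.eval.eval_property", "canceled"),
    ("experiment.eval.eval_combination", "emergent"),
    ("experiment.eval.eval_type", None),
]


def build_command(module, property_type=None):
    """Build the argv list for one `python -m ...` step."""
    cmd = [sys.executable, "-m", module]
    if property_type is not None:
        cmd += ["--property_type", property_type]
    return cmd


def run_all(steps=STEPS, dry_run=True):
    """Print each command; also execute it when dry_run is False."""
    for module, property_type in steps:
        cmd = build_command(module, property_type)
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

The order mirrors the listing above, with the generation steps before the evaluation steps. Call `run_all(dry_run=False)` to execute the commands instead of only printing them.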
pip install torch==2.3.0
pip install transformers trl peft bitsandbytes wandb
# Define environment
export ACCELERATE_USE_FSDP=1
export TOKENIZERS_PARALLELISM=false
GPU_NUM=4
MODEL="meta-llama/Meta-Llama-3.1-70B-Instruct"
TASK="task_name" # one of: "npc" (noun phrase completion), "pi" (property induction)
DATA="YOUR_DATA"
# Run training
torchrun --nproc_per_node $GPU_NUM --nnodes 1 ./train.py \
--model_name_or_path $MODEL \
--data_path $DATA \
--task_type $TASK \
--output_dir model_params/qlora/$MODEL/$TASK \
--num_train_epochs 3 \
--per_device_train_batch_size 1 \
--gradient_accumulation_steps 4 \
--evaluation_strategy "no" \
--save_strategy "steps" \
--save_steps 2000 \
--save_total_limit 1 \
--learning_rate 1e-5 \
--weight_decay 0. \
--warmup_ratio 0.03 \
--lr_scheduler_type "cosine" \
--logging_steps 1 \
--model_max_length 512 \
--fp16 True \
--gradient_checkpointing True \
--use_reentrant False
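As a quick sanity check on the settings above, the effective global batch size can be worked out from the per-device batch size, the gradient accumulation steps, and the number of GPUs. The sketch below assumes each of the `GPU_NUM` processes handles its own micro-batch (the usual data-parallel behaviour under FSDP); the function names `effective_batch_size` and `examples_between_saves` are hypothetical.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_gpus):
    """Examples consumed per optimizer step across all GPUs."""
    return per_device_batch * grad_accum_steps * num_gpus


def examples_between_saves(save_steps, global_batch):
    """Approximate number of training examples seen between checkpoints."""
    return save_steps * global_batch


# Values taken from the training command above.
global_batch = effective_batch_size(per_device_batch=1, grad_accum_steps=4, num_gpus=4)
print(global_batch)  # 16

print(examples_between_saves(save_steps=2000, global_batch=global_batch))  # 32000
```

Under these assumptions, a step-based checkpoint is written roughly every 32,000 training examples, so a run shorter than 2,000 optimizer steps would not reach the first one. Consider lowering `--save_steps` for small datasets.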
For questions, please contact us at [email protected]
If you use CCPT in your research, please cite our work:
@inproceedings{song2025ccpt,
title={Is a Peeled Apple Still Red? Evaluating LLMs' Ability for Conceptual Combination with Property Type},
author={Song, Seokwon and Lee, Taehyun and Ahn, Jaewoo and Sung, Jaehyuk and Kim, Gunhee},
booktitle={},
pages={},
year={2025}
}