Using LLMs to Uncover Memorization in Instruction-Tuned Models

A study introducing a black-box prompt optimization approach to uncover higher levels of memorization in instruction-tuned LLMs.

October 11, 2024 · 2 min · Chengyu Zhang

Do Membership Inference Attacks Work on Large Language Models?

This paper evaluates the effectiveness of membership inference attacks on large language models, finding that they often perform no better than random guessing.

June 14, 2024 · 2 min · Chengyu Zhang

Membership Inference Attacks Against Fine-tuned Large Language Models via Self-prompt Calibration

A study introducing self-prompt calibration for membership inference attacks (MIAs) against fine-tuned large language models, improving the reliability and practicality of privacy assessments.

January 18, 2024 · 2 min · Chengyu Zhang