Using LLMs to Uncover Memorization in Instruction-Tuned Models
A study introducing a black-box prompt optimization approach to uncover higher levels of memorization in instruction-tuned LLMs.
This paper evaluates the effectiveness of membership inference attacks on large language models, revealing that such attacks often perform no better than random guessing.
A study introducing self-prompt calibration for membership inference attacks (MIAs) against fine-tuned large language models, improving the reliability and practicality of privacy assessments.
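To make the membership inference attacks discussed above concrete, a minimal sketch of the classic loss-thresholding MIA is shown below: a candidate text is labeled a training-set "member" if the target model's average negative log-likelihood on it falls below a threshold. The function name, the threshold value, and the hard-coded token log-probabilities are illustrative assumptions; in practice the log-probs would come from querying the target model.

```python
def loss_attack(token_logprobs, threshold):
    """Loss-thresholding MIA sketch: predict 'member' if the average
    negative log-likelihood (NLL) over the tokens is below `threshold`.
    `token_logprobs` are per-token log-probabilities from the target
    model (supplied directly here for illustration)."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return nll < threshold

# Toy illustration: a memorized sample tends to receive high-probability
# (low-loss) tokens, a non-member low-probability (high-loss) tokens.
member_logprobs = [-0.1, -0.2, -0.15]
non_member_logprobs = [-2.5, -3.0, -2.8]
print(loss_attack(member_logprobs, 1.0))      # True
print(loss_attack(non_member_logprobs, 1.0))  # False
```

The calibration work summarized above addresses a weakness of exactly this kind of attack: a single global threshold conflates sample difficulty with membership, which is one reason uncalibrated MIAs can perform near chance.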