Beyond the final layer: Intermediate representations for better multilingual calibration in large language models

This paper tackles a blind spot in the confidence calibration of multilingual large language models: we show that non-English languages are far worse calibrated than English, and we find that intermediate layers, not the final layer, offer much stronger confidence signals. Building on this, we introduce Language-Aware Confidence Ensemble (LACE), a training-free method that adaptively selects the best layers per language.
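To make the idea concrete, here is a minimal sketch of layer-wise confidence ensembling, assuming a logit-lens-style readout that projects intermediate hidden states through the model's LM head. The model name, the `LAYERS_BY_LANG` layer sets, and the simple averaging in `ensemble_confidence` are illustrative assumptions, not LACE's actual selection and weighting procedure, which is described in the paper.

```python
# Sketch: per-language ensembling of confidences read from intermediate
# layers. Layer choices and averaging are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # stand-in model; any causal LM works similarly

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

# Hypothetical per-language layer sets (GPT-2 has 12 blocks; index 0 is
# the embedding layer). LACE selects layers per language; these indices
# are made up for illustration.
LAYERS_BY_LANG = {"en": [9, 10, 11], "de": [6, 8, 10]}

@torch.no_grad()
def ensemble_confidence(text: str, lang: str) -> float:
    """Average max-softmax confidence of the next-token prediction,
    read out from the selected intermediate layers via the LM head."""
    inputs = tok(text, return_tensors="pt")
    out = model(**inputs, output_hidden_states=True)
    confs = []
    for layer in LAYERS_BY_LANG[lang]:
        h = out.hidden_states[layer][:, -1, :]        # last-token state
        h = model.transformer.ln_f(h)                 # final LayerNorm (GPT-2)
        logits = model.lm_head(h)                     # logit-lens projection
        confs.append(logits.softmax(-1).max().item())  # max prob = confidence
    return sum(confs) / len(confs)

print(ensemble_confidence("The capital of France is", "en"))
```

The appeal of this style of method is that it needs no training or fine-tuning: it only requires a forward pass with hidden states exposed, plus a per-language table of layer indices.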

Read the full paper on arXiv.
