Beyond the final layer: Intermediate representations for better multilingual calibration in large language models
This paper tackles a blind spot in confidence calibration for multilingual large language models. The authors show that non-English languages are far worse calibrated than English, and find that intermediate layers, rather than the final layer, carry much better confidence signals. Building on this, they introduce Language-Aware Confidence Ensemble (LACE), a training-free method that adaptively selects the best layers for each language.
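To make the idea concrete, here is a minimal sketch of what per-language layer selection could look like; the paper's actual LACE procedure may differ. All names (`select_layers_per_language`, `lace_confidence`), the ECE-based selection criterion, and the simple averaging ensemble are assumptions for illustration, not the paper's method. It assumes you can extract a confidence score per layer (e.g., via a logit-lens-style readout) on a small labeled calibration set per language.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard ECE: bin predictions by confidence and compare
    average confidence to empirical accuracy in each bin."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            ece += mask.mean() * abs(confidences[mask].mean() - correct[mask].mean())
    return ece

def select_layers_per_language(layer_conf, correct, top_k=3):
    """Hypothetical selection step: layer_conf is (n_examples, n_layers)
    per-layer confidences on one language's calibration set; correct is
    a (n_examples,) 0/1 array. Returns the top_k best-calibrated layers."""
    eces = [expected_calibration_error(layer_conf[:, l], correct)
            for l in range(layer_conf.shape[1])]
    return np.argsort(eces)[:top_k]  # lowest ECE = best calibrated

def lace_confidence(layer_conf_new, selected_layers):
    """Hypothetical ensemble step: average the confidences of the
    layers selected for this language."""
    return layer_conf_new[:, selected_layers].mean(axis=1)

# Toy usage with synthetic data standing in for real model outputs
rng = np.random.default_rng(0)
cal_conf = rng.uniform(size=(200, 32))      # 200 calibration examples, 32 layers
cal_correct = rng.integers(0, 2, size=200)  # 0/1 correctness labels
layers = select_layers_per_language(cal_conf, cal_correct, top_k=3)
test_conf = rng.uniform(size=(50, 32))
print(lace_confidence(test_conf, layers)[:5])
```

The appeal of a scheme like this is that it stays training-free: no model weights are updated, only a small per-language layer index is fitted from a calibration set.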
Read the full paper on arXiv.