Trident: Benchmarking LLM safety in finance, medicine, and law

As AI models enter high-stakes domains such as law, finance, and healthcare, this work distills clear safety principles from professional ethics and introduces Trident-Bench, a new benchmark that tests how well large language models adhere to them. We evaluate 19 models and find that while strong generalists (e.g., GPT, Gemini) pass basic checks, domain-specialist models often fail to comply with these policies, underlining the urgent need for targeted safety evaluations.

Read the full paper here on arXiv.
