Publications

(2025). O3-Mini vs. DeepSeek-R1. Which One Is Safer?


(2025). Early External Safety Testing of OpenAI's O3-Mini. Insights from Pre-Deployment Evaluation.


(2025). ASTRAL. Automated Safety Testing of Large Language Models.


(2025). AI-Driven Fairness Testing of Large Language Models. A Preliminary Study.


(2024). Toward Trustworthy AI-Enabled Internet Search. JISBD'24.
