Tag: Benchmark | Blog

Latest LLMs in the Test: GPT 5.1 Codex Max vs. Gemini Pro 3 vs. Opus 4.5

Published on 07.12.2025

Tags: AI, LLM, Engineering, Comparison, Benchmark, Cursor

With the release of Claude Opus 4.5 and the hype surrounding "engineering-grade" models, I moved beyond frontend generation to test these models as full-stack engineers. I took the three current heavyweights (GPT-5.1-Codex-Max, Gemini 3 Pro, and Claude Opus 4.5) and ran each through a rigorous MVP development cycle, building 'Speakit', a text-to-speech application, to see whether benchmark numbers translate into shipping products.

Hans Reinl
