Leaderboard

An open, live leaderboard for healthcare AI agents, with continuous evaluation and rolling updates.

Rank	Agent	ESL-Bench						MedHall-Bench
Rank	Agent	Overall	Lookup	Trend	Comparison	Anomaly	Explanation	Overall	Factual	Contextual	Citation	Numerical	Relational
1	theta-smart-expert	50.5	63.5	74.4	31.4	56.4	26.7	73.1	88.6	56.9	59.6	69.1	81.1
2	claude-sonnet-4.6	51.4	59.4	61.5	41.9	66.8	27.3	69.7	67.8	69.8	60.5	66.8	76.6
3	gpt-5.4	47.6	54.9	65.4	39.0	54.7	23.8	71.2	90.1	77.1	27.3	66.8	76.3
4	theta-smart-miroflow	—	—	—	—	—	—	75.2	89.6	65.0	35.9	77.4	82.8
5	minimax-m2.7	46.9	53.2	60.1	33.5	59.5	28.3	—	—	—	—	—	—
6	gemini-3-pro-preview	—	—	—	—	—	—	48.1	47.3	59.1	36.6	54.2	40.4
7	theta-smart-general	46.3	58.0	64.9	34.1	48.1	26.4	—	—	—	—	—	—
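
The first score under each benchmark sits in a column with no header of its own; reading it as an overall score is an assumption. The sketch below checks that reading against the numbers in the table. It tests whether the first score equals the unweighted mean of the five category scores, up to one-decimal rounding. The dictionary names and the matches_mean helper are hypothetical, introduced only for this illustration. The result is an observation about the published figures, not a statement of how the leaderboard computes its scores.

```python
from statistics import mean

# Scores copied from the table: agent -> (first score, five category scores).
ESL_BENCH = {
    "theta-smart-expert": (50.5, [63.5, 74.4, 31.4, 56.4, 26.7]),
    "claude-sonnet-4.6": (51.4, [59.4, 61.5, 41.9, 66.8, 27.3]),
    "gpt-5.4": (47.6, [54.9, 65.4, 39.0, 54.7, 23.8]),
    "minimax-m2.7": (46.9, [53.2, 60.1, 33.5, 59.5, 28.3]),
    "theta-smart-general": (46.3, [58.0, 64.9, 34.1, 48.1, 26.4]),
}
MEDHALL_BENCH = {
    "theta-smart-expert": (73.1, [88.6, 56.9, 59.6, 69.1, 81.1]),
    "claude-sonnet-4.6": (69.7, [67.8, 69.8, 60.5, 66.8, 76.6]),
    "gpt-5.4": (71.2, [90.1, 77.1, 27.3, 66.8, 76.3]),
    "theta-smart-miroflow": (75.2, [89.6, 65.0, 35.9, 77.4, 82.8]),
    "gemini-3-pro-preview": (48.1, [47.3, 59.1, 36.6, 54.2, 40.4]),
}


def matches_mean(first_score, categories, tol=0.05):
    """Return True if first_score equals the unweighted mean of the
    category scores, allowing for rounding to one decimal place."""
    return abs(mean(categories) - first_score) <= tol + 1e-9


# Hypothetical check: does the unlabeled first column behave like a plain average?
esl_ok = {agent: matches_mean(s, c) for agent, (s, c) in ESL_BENCH.items()}
medhall_ok = {agent: matches_mean(s, c) for agent, (s, c) in MEDHALL_BENCH.items()}

print("ESL-Bench:", esl_ok)
print("MedHall-Bench:", medhall_ok)
```

On these figures the ESL-Bench first score is consistent with a plain average of the five categories for every listed agent, while the MedHall-Bench first score is not, which suggests a different (possibly weighted) aggregation there.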