YatCC-Hard-Pro Leaderboard

Pipeline-mode hard benchmark with fully isolated runs and strict task chaining. Data is normalized from container run summaries.


#	Model	T0	T1	T2	T3	T4	T5	Mean Reward	Pipeline

🧪 About YatCC-Hard-Pro

YatCC-Hard-Pro is built from raw container-runs-summary-pipeline.json records, which are normalized to the same leaderboard schema used by YatCC and YatCC-Hard.

Mean Reward is a weighted average of the six task scores, with weights [5%, 20%, 20%, 15%, 30%, 10%] that sum to 100%. A task whose output is empty or missing contributes 0.
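
The aggregation described above can be sketched in a few lines of Python. This is an illustration, not the leaderboard's actual code. It makes three assumptions: the weights apply to T0 through T5 in the order listed, per-task scores lie on a 0 to 1 scale, and a missing task appears as an absent key or as None. The names WEIGHTS and mean_reward are hypothetical.

```python
from typing import Mapping, Optional

# Weights taken from the leaderboard description. Assigning them to
# T0..T5 in listed order is an assumption, not something the page states.
WEIGHTS = {
    "T0": 0.05,
    "T1": 0.20,
    "T2": 0.20,
    "T3": 0.15,
    "T4": 0.30,
    "T5": 0.10,
}


def mean_reward(scores: Mapping[str, Optional[float]]) -> float:
    """Weighted sum of task scores; absent or None scores count as 0."""
    total = 0.0
    for task, weight in WEIGHTS.items():
        value = scores.get(task)
        if value is None:
            # Empty or missing task output is treated as 0.
            value = 0.0
        total += weight * value
    return total


if __name__ == "__main__":
    # T2, T3, and T5 are absent and T4 is None, so only T0 and T1 contribute.
    print(mean_reward({"T0": 1.0, "T1": 0.5, "T4": None}))
```

Because the weights sum to 100%, a model that scores 1 on every task gets a Mean Reward of 1. Under the assumed ordering, a model that fails T4 can reach at most 0.70.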