ACE adaptation suite
Held-out tests for support, finance, voice, and policy workflows where playbook adaptation should improve deterministic behavior.
SCX Labs benchmarks focus on behavior that generic leaderboards miss: local context, operational reliability, energy efficiency, and deployment constraints.
An evaluation set for assessing alignment with Australian cultural norms, language, history, institutions, and media. Used to benchmark SCX MAGPiE.
Per-token cost, energy, latency, and throughput metrics for production sovereign AI deployments.