A recent benchmark from a Chinese developer compares three frameworks for agent tool-calling: OpenAI's Function Calling, Anthropic's Model Context Protocol (MCP), and Toolformer, a research prototype. The study measures latency and success rate across multiple test scenarios. Function Calling had the lowest latency on simple tasks, while MCP performed best on complex multi-step workflows. Toolformer shows promise but is not yet production-ready. For engineering teams building AI agents, the results can inform framework selection based on task complexity and performance requirements. The methodology is transparent, which makes the benchmark a useful reference for technical evaluations.
A detailed comparison of three agent tool-calling frameworks with latency and success rate data.
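
Teams that want to reproduce this kind of comparison on their own workloads could measure the same two metrics, latency and success rate, with a small harness. The following is a minimal sketch, not the benchmark author's method: `run_benchmark`, `BenchmarkResult`, the scenario strings, and the convention that a call returns `True` on success are all assumptions made for illustration, and the summary does not say which latency statistic the study reports (the median is used here).

```python
import statistics
import time
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class BenchmarkResult:
    runs: int                 # number of scenarios executed
    success_rate: float       # fraction of runs that succeeded
    median_latency_ms: float  # median wall-clock time per run, in milliseconds


def run_benchmark(
    call: Callable[[str], bool],
    scenarios: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> BenchmarkResult:
    """Time one tool-calling adapter over a set of scenarios.

    `call` is a hypothetical adapter around whichever framework is under
    test; it takes a scenario description and returns True on success.
    An exception raised by `call` is counted as a failed run.
    """
    latencies_ms = []
    successes = 0
    for scenario in scenarios:
        start = clock()
        try:
            ok = bool(call(scenario))
        except Exception:
            ok = False
        # Failed runs are timed too, so slow failures show up in the latency.
        latencies_ms.append((clock() - start) * 1000.0)
        successes += ok
    if not latencies_ms:
        raise ValueError("no scenarios supplied")
    return BenchmarkResult(
        runs=len(latencies_ms),
        success_rate=successes / len(latencies_ms),
        median_latency_ms=statistics.median(latencies_ms),
    )
```

Running `run_benchmark` once per framework, with the same scenario list and one adapter function each, gives results that can be compared side by side. The `clock` parameter exists only so the timing source can be swapped out, for example in tests.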