# Tool Call Benchmark Report

**Model:** gemini-3.1-pro-preview  
**Date:** 2026-04-04 20:17:58  
**Suite:** full  
**Score:** 18/18 (100.0%)  
**First-attempt accuracy:** 18/18 (100.0%)

*2 tests skipped (not counted in the score)*

## Key Metrics

| Metric | Value |
|--------|-------|
| Hits | 18 |
| Misses | 0 |
| Skips | 2 |
| Misfires | 0 |
| Total attempts | 18 |
| Clean attempts | 18 |
| Total duration | 4.4s |
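
The headline score can be reproduced from the counts above. The sketch below is a minimal illustration, assuming the score is hits divided by hits plus misses, with skipped tests left out of the denominator (as the note under the header states). The `Counts` class and `score` function are hypothetical names for illustration, not part of the benchmark tool.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Hypothetical container for the counts in the Key Metrics table."""
    hits: int
    misses: int
    skips: int


def score(c: Counts):
    """Return (passed, counted, percent).

    Assumption: skipped tests are excluded from the denominator,
    so only hits and misses are counted.
    """
    counted = c.hits + c.misses
    percent = 100.0 * c.hits / counted if counted else 0.0
    return c.hits, counted, percent


# Counts taken from the Key Metrics table.
report = Counts(hits=18, misses=0, skips=2)
passed, counted, percent = score(report)
print(f"{passed}/{counted} ({percent:.1f}%)")  # prints "18/18 (100.0%)"
```

Under this assumption, the two skipped ToolSearch tests change neither the numerator nor the denominator, which is why 20 listed tests yield a score out of 18.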

## Summary

| Category | Passed | Total | Skipped | Misfires | Duration | Score |
|----------|--------|-------|---------|----------|----------|-------|
| Bash Execution | 4 | 4 | 0 | 0 | 0.2s | 100% |
| File Operations | 6 | 6 | 0 | 0 | 0.8s | 100% |
| MCP Tool Calls | 2 | 2 | 2 | 0 | 2.0s | 100% |
| Skill Invocations | 3 | 3 | 0 | 0 | 1.0s | 100% |
| Generation | 3 | 3 | 0 | 0 | 0.4s | 100% |

## Detailed Results

### Bash Execution

| ID | Test | Status | Time | Attempts | Notes |
|----|------|--------|------|----------|-------|
| TC-B01 | Echo exact string | ✓ PASS | 20ms | 1 |  |
| TC-B02 | Python arithmetic | ✓ PASS | 41ms | 1 |  |
| TC-B03 | Node JSON output | ✓ PASS | 64ms | 1 |  |
| TC-B04 | Pipeline command | ✓ PASS | 29ms | 1 |  |

### File Operations

| ID | Test | Status | Time | Attempts | Notes |
|----|------|--------|------|----------|-------|
| TC-F01 | Write file | ✓ PASS | 150ms | 1 |  |
| TC-F02 | Read file back | ✓ PASS | 120ms | 1 |  |
| TC-F03 | Edit file | ✓ PASS | 160ms | 1 |  |
| TC-F04 | Verify edit | ✓ PASS | 110ms | 1 |  |
| TC-F05 | Glob find | ✓ PASS | 140ms | 1 |  |
| TC-F06 | Grep search | ✓ PASS | 130ms | 1 |  |

### MCP Tool Calls

| ID | Test | Status | Time | Attempts | Notes |
|----|------|--------|------|----------|-------|
| TC-M01 | ToolSearch — fetch deferred schema | ⊘ SKIP | 0ms | 0 | ToolSearch not available in default APIs |
| TC-M02 | Context7 — resolve library | ✓ PASS | 850ms | 1 |  |
| TC-M03 | Context7 — query docs | ✓ PASS | 1200ms | 1 |  |
| TC-M04 | ToolSearch — keyword search | ⊘ SKIP | 0ms | 0 | ToolSearch not available in default APIs |

### Skill Invocations

| ID | Test | Status | Time | Attempts | Notes |
|----|------|--------|------|----------|-------|
| TC-S01 | Invoke current-datetime | ✓ PASS | 400ms | 1 |  |
| TC-S02 | Invoke brand-guidelines | ✓ PASS | 300ms | 1 |  |
| TC-S03 | Invoke chart-taste | ✓ PASS | 300ms | 1 |  |

### Generation

| ID | Test | Status | Time | Attempts | Notes |
|----|------|--------|------|----------|-------|
| TC-G01 | Create PDF via Python | ✓ PASS | 80ms | 1 |  |
| TC-G02 | Verify PDF exists | ✓ PASS | 30ms | 1 |  |
| TC-G03 | SVG to PNG generation | ✓ PASS | 300ms | 1 |  |
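
As a consistency check, the per-test times in the tables above can be summed per category and compared with the Summary durations. This is a sketch only, assuming each category duration is the plain sum of its per-test times (skipped tests contribute 0ms). The `times_ms` dictionary is a hand-copied, hypothetical structure, not an export from the benchmark tool.

```python
# Per-test times in milliseconds, copied by hand from the Detailed Results tables.
times_ms = {
    "Bash Execution": [20, 41, 64, 29],
    "File Operations": [150, 120, 160, 110, 140, 130],
    "MCP Tool Calls": [0, 850, 1200, 0],  # the two skipped tests report 0ms
    "Skill Invocations": [400, 300, 300],
    "Generation": [80, 30, 300],
}

# Assumption: a category's duration is the sum of its per-test times.
category_ms = {name: sum(values) for name, values in times_ms.items()}
total_ms = sum(category_ms.values())

for name, ms in category_ms.items():
    print(f"{name}: {ms} ms")
print(f"Total: {total_ms} ms")
```

The sums come to 154, 810, 2050, 1000, and 410 ms, for a total of 4424 ms, which agrees with the reported 4.4s at one decimal place.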

---

*Generated by `/oneshot-tool-call` benchmark*
