The Goal
We wanted to showcase our Maestro E2E tests running on the Too Many Tomatoes iOS app — not just terminal output, but a proper video with voiceover narration that we could share with the team and publish.
The Setup
All we had to start with:
• A Maestro E2E test suite (6 search prefix matching tests)
• An iPhone 17 Pro simulator with the app installed
• Claude Code running in the terminal
The Process — What Claude Code Did
Step 1: Run the tests and capture screenshots
Prompt: "Review PR 89... I want to see screenshots of the test results in Slack"
Claude Code created a custom Maestro flow with takeScreenshot commands at each test step, ran it, found where Maestro saved the 10 PNG screenshots, and uploaded all of them to Slack using the Slack file upload API. This was fully autonomous — it wrote the YAML flow, executed it, located the output files, and handled the multi-file Slack upload.
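For readers who want to try something similar, here is a minimal sketch of how a screenshot-taking flow like the one described above could be generated. The app ID, the "Search" field label, and the search prefixes are all hypothetical placeholders, not values from the actual PR; only the Maestro commands themselves (launchApp, tapOn, eraseText, inputText, takeScreenshot) are standard.

```python
def build_screenshot_flow(app_id, prefixes, search_field="Search"):
    """Return the text of a Maestro flow that screenshots each search step.

    app_id, prefixes and search_field are illustrative assumptions;
    substitute the values that match your own app and tests.
    """
    lines = [f"appId: {app_id}", "---", "- launchApp"]
    for index, prefix in enumerate(prefixes, start=1):
        lines += [
            f'- tapOn: "{search_field}"',    # focus the (assumed) search field
            "- eraseText",                   # clear whatever was typed before
            f'- inputText: "{prefix}"',      # type the prefix under test
            f"- takeScreenshot: {index:02d}_search_{prefix}",
        ]
    return "\n".join(lines) + "\n"


def write_flow(path, app_id, prefixes):
    """Write the generated flow to disk so `maestro test <path>` can run it."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(build_screenshot_flow(app_id, prefixes))
```

Calling write_flow("screenshots.yaml", "com.example.app", ["to", "tom"]) and then running maestro test screenshots.yaml would produce one PNG per prefix, assuming the app really has a field labelled "Search".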
Step 2: Create the narrated video
Prompt: "Create a short video filming all of that, ideally with some audio voiceover"
Claude Code then:
1. Wrote a narration script covering all 6 test phases
2. Used the macOS say command with the Daniel (en_GB) voice to generate a 64-second voiceover audio file
3. Started xcrun simctl io booted recordVideo to screen-record the simulator
4. Ran the full Maestro test suite while recording
5. Stopped the recording
6. Used ffmpeg to speed up the video 1.25x so its length matched the narration, then combined the two into a single MP4
7. Verified quality by extracting a preview frame
8. Uploaded the final 3.3MB video to Slack
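The steps above can be sketched roughly as follows. This is a reconstruction under assumptions, not the exact commands Claude Code ran: the file names, the two-second startup pause, the preview timestamp, and the ffmpeg codec choices are all illustrative. The Slack upload in step 8 is left out.

```python
import signal
import subprocess
import time
from pathlib import Path


def say_cmd(script_path, audio_path, voice="Daniel"):
    # `say -f` reads the narration text from a file, `-o` writes an audio file.
    return ["say", "-v", voice, "-f", str(script_path), "-o", str(audio_path)]


def record_cmd(video_path):
    # Records the currently booted simulator until the process gets SIGINT.
    return ["xcrun", "simctl", "io", "booted", "recordVideo", str(video_path)]


def maestro_cmd(flow_path):
    return ["maestro", "test", str(flow_path)]


def combine_cmd(video_path, audio_path, out_path, speed=1.25):
    # setpts=PTS/speed plays the video `speed` times faster; the narration
    # becomes the only audio track. Codec choices here are assumptions.
    return [
        "ffmpeg", "-y", "-i", str(video_path), "-i", str(audio_path),
        "-filter:v", f"setpts=PTS/{speed}",
        "-map", "0:v:0", "-map", "1:a:0",
        "-c:v", "libx264", "-c:a", "aac", "-shortest", str(out_path),
    ]


def preview_cmd(video_path, image_path, at_seconds=5):
    # Grab a single frame to eyeball the output quality.
    return ["ffmpeg", "-y", "-ss", str(at_seconds), "-i", str(video_path),
            "-frames:v", "1", str(image_path)]


def make_demo(workdir, flow_path, script_path):
    """Run the whole pipeline; needs macOS, Xcode, Maestro and ffmpeg."""
    workdir = Path(workdir)
    audio, raw = workdir / "narration.aiff", workdir / "raw.mov"
    final, frame = workdir / "demo.mp4", workdir / "preview.png"

    subprocess.run(say_cmd(script_path, audio), check=True)
    recorder = subprocess.Popen(record_cmd(raw))
    time.sleep(2)  # assumed pause so the recorder is running before the tests
    try:
        subprocess.run(maestro_cmd(flow_path), check=True)
    finally:
        recorder.send_signal(signal.SIGINT)  # same as pressing Ctrl-C
        recorder.wait()
    subprocess.run(combine_cmd(raw, audio, final), check=True)
    subprocess.run(preview_cmd(final, frame), check=True)
    return final
```

Keeping each command in its own small function makes it easy to print and inspect the exact arguments before anything is executed.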
Tools Used (all by Claude Code, no manual intervention):
• maestro test — E2E test runner
• xcrun simctl io booted recordVideo — iOS simulator screen recording
• say -v Daniel — macOS text-to-speech for voiceover
• ffmpeg — video/audio processing and combining
• Slack API — file uploads via curl
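The write-up does not record which Slack methods the curl calls hit. As one possible shape for the upload, the sketch below assumes Slack's two-step external upload flow (files.getUploadURLExternal followed by files.completeUploadExternal) and uses Python's urllib in place of curl. The token and channel ID are placeholders you would supply yourself.

```python
import json
import urllib.parse
import urllib.request
from pathlib import Path

SLACK_API = "https://slack.com/api/"


def post_form(method, token, fields):
    """POST form-encoded fields to a Slack Web API method and return the JSON."""
    request = urllib.request.Request(
        SLACK_API + method,
        data=urllib.parse.urlencode(fields).encode(),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    if not body.get("ok"):
        raise RuntimeError(f"{method} failed: {body.get('error')}")
    return body


def completion_fields(file_id, title, channel_id):
    """Build the form fields that share an uploaded file into a channel."""
    return {
        "files": json.dumps([{"id": file_id, "title": title}]),
        "channel_id": channel_id,
    }


def upload_file(path, token, channel_id):
    """Upload one file to Slack (assumed flow; token and channel are yours)."""
    path = Path(path)
    payload = path.read_bytes()
    # Step 1: ask Slack for a one-off upload URL and file ID.
    ticket = post_form("files.getUploadURLExternal", token,
                       {"filename": path.name, "length": len(payload)})
    # Step 2: send the raw bytes to that URL.
    upload = urllib.request.Request(
        ticket["upload_url"], data=payload,
        headers={"Content-Type": "application/octet-stream"},
    )
    urllib.request.urlopen(upload).close()
    # Step 3: finalize the upload and post it to the channel.
    return post_form("files.completeUploadExternal", token,
                     completion_fields(ticket["file_id"], path.name, channel_id))
```

For several screenshots, calling upload_file once per PNG is the simplest approach, although it posts each file as a separate message.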
Learnings & Tips
1. A Mac has almost everything you need: say (TTS) ships with macOS, xcrun simctl (simulator control) comes with Xcode, and ffmpeg (media processing) is a separate install, for example via Homebrew. With those three you can create narrated demo videos entirely from the command line. No screen recording apps or video editors needed.
2. Timing is the tricky part — The first attempt used swipe gestures as pauses between tests to sync with narration, but they interfered with the UI (a text field placeholder became unfindable). The simpler approach — record the natural test flow and speed-adjust with ffmpeg — worked better.
3. Voice selection matters — We chose Daniel (en_GB) to match the app’s UK English branding. macOS has dozens of voices. Avoid the novelty ones (Bells, Boing, Bubbles) for professional content.
4. 1.25x speedup looks good for automation demos — The slightly faster playback makes automated taps look snappy and intentional rather than slow and robotic.
5. File management is autonomous — Claude Code handled the full pipeline: creating temp directories, managing file paths, cleaning up screenshots from the repo directory after upload, and combining media files. No manual file shuffling needed.
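Tips 3 and 4 translate into two small helpers, sketched here under assumptions. The first filters the output of say -v '?' (which lists installed voices) by locale; the second derives the speed-up factor from the two durations rather than hard-coding it. The 80-second raw recording used in the usage note is an assumed figure, chosen only because it is consistent with a 64-second narration and a 1.25x factor.

```python
import subprocess


def voices_for_locale(listing, locale):
    """Pick voice names for one locale out of `say -v '?'` output.

    Each line looks like: "Daniel              en_GB    # sample sentence".
    """
    names = []
    for line in listing.splitlines():
        left, _, _ = line.partition("#")   # drop the sample sentence
        parts = left.rsplit(None, 1)       # voice names may contain spaces
        if len(parts) == 2 and parts[1] == locale:
            names.append(parts[0].strip())
    return names


def list_voices():
    """Return the raw voice listing (macOS only)."""
    result = subprocess.run(["say", "-v", "?"],
                            capture_output=True, text=True, check=True)
    return result.stdout


def media_duration(path):
    """Duration in seconds as reported by ffprobe (ships with ffmpeg)."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", str(path)],
        capture_output=True, text=True, check=True,
    )
    return float(result.stdout.strip())


def speed_factor(video_seconds, narration_seconds):
    """How much faster the video must play to end with the narration."""
    if video_seconds <= 0 or narration_seconds <= 0:
        raise ValueError("durations must be positive")
    return video_seconds / narration_seconds


def setpts_filter(factor):
    """ffmpeg video filter expression for the given speed-up factor."""
    return f"setpts=PTS/{factor:.3f}"
```

With an assumed 80-second recording and the 64-second narration, speed_factor(80.0, 64.0) gives 1.25, and setpts_filter(1.25) gives the filter string to pass to ffmpeg's -filter:v option.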
The Result
A 64-second narrated video showing all 6 E2E tests running on a real simulator, created entirely through natural language prompts in Claude Code. Total time from prompt to Slack post: about 5 minutes.