Z-Image Benchmark Suite: Measure Performance Accurately

Diffusionist

Description: Learn to benchmark Z-Image performance scientifically. Measure generation speed, quality metrics, and resource utilization with professional testing methodologies and reproducible results.


Introduction: Why Benchmarking Matters

Without proper benchmarking, you are guessing at performance. With proper benchmarking, you can identify bottlenecks objectively, compare configurations, track performance over time, and make informed hardware decisions.

[Image: Benchmark testing interface]


Part 1: What to Measure

Primary Metrics:

  1. Generation Time: Wall-clock time from queue to save
  2. Throughput: Images per second/minute
  3. Resource Utilization: GPU %, VRAM usage, CPU usage
  4. Quality: Output fidelity (e.g., side-by-side human review, or automated metrics such as CLIP score)
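GPU utilization and VRAM can be polled with nvidia-smi or, in PyTorch, read back with torch.cuda.max_memory_allocated(). As a minimal, dependency-free sketch of the same idea, here is a hypothetical helper (not part of Z-Image) that records wall-clock time and peak host memory around any block of work:

```python
import time
import tracemalloc
from contextlib import contextmanager

@contextmanager
def measure(results):
    """Record wall-clock time and peak host memory for the wrapped block.

    For GPU-side numbers you would instead reset and read
    torch.cuda.max_memory_allocated(), or poll nvidia-smi in a
    background thread; this stdlib version illustrates the pattern.
    """
    tracemalloc.start()
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        results['wall_time_s'] = elapsed
        results['peak_host_mem_mb'] = peak / 1e6

# Usage: wrap the work you want to profile
stats = {}
with measure(stats):
    _ = [i * i for i in range(100_000)]  # stand-in for a generation call
print(stats)
```

Collecting all four metric families through one wrapper keeps each run's numbers together, which makes later comparison across configurations much easier.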

Part 2: Basic Benchmark Script

Here is a complete benchmarking framework:

import time
import torch
import numpy as np
from z_image import ZImagePipeline

class ZImageBenchmark:
    def __init__(self, model_path):
        self.pipe = ZImagePipeline.from_pretrained(
            model_path, torch_dtype=torch.bfloat16
        ).to("cuda")

    def benchmark_generation(self, prompt, num_runs=10, steps=6):
        times = []
        # Warmup run: excludes one-time compilation and cache-fill costs
        _ = self.pipe(prompt, num_inference_steps=steps)
        torch.cuda.synchronize()

        for _ in range(num_runs):
            start = time.perf_counter()
            _ = self.pipe(prompt, num_inference_steps=steps)
            torch.cuda.synchronize()  # wait for GPU work before stopping the clock
            end = time.perf_counter()
            times.append(end - start)

        return {'mean': np.mean(times), 'median': np.median(times), 'std': np.std(times)}

Usage:

benchmark = ZImageBenchmark("alibaba/Z-Image-Turbo")
results = benchmark.benchmark_generation("A mountain landscape", num_runs=10, steps=6)
print(f"Mean: {results['mean']:.2f}s, Median: {results['median']:.2f}s")
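Mean and standard deviation are sensitive to the occasional slow run (background processes, thermal throttling). A small stdlib-only helper, shown here as an illustrative extension of the report above, adds the interquartile range as a robust spread measure:

```python
import statistics

def summarize(times):
    """Summarize a list of per-run timings in seconds.

    Median and IQR are robust to outlier runs, which is why the
    guide recommends reporting the median rather than the mean.
    """
    q = statistics.quantiles(times, n=4)  # quartile cut points
    return {
        'mean': statistics.mean(times),
        'median': statistics.median(times),
        'stdev': statistics.stdev(times),
        'iqr': q[2] - q[0],  # spread of the middle 50% of runs
    }

print(summarize([2.1, 2.2, 2.2, 2.3, 2.2, 3.9]))  # one outlier run
```

In the example above, the outlier inflates the mean and stdev noticeably while barely moving the median and IQR.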

Part 3: Resolution Benchmark

Test how resolution affects performance:

Resolution    Expected Time (RTX 4090)
512x512       1.5s
768x768       2.0s
1024x1024     2.9s
1536x1536     5.2s
2048x2048     8.5s
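A useful derived number is throughput in megapixels per second. This short calculation uses the illustrative timings from the table above:

```python
# Times from the table above (RTX 4090, illustrative figures).
timings = {
    (512, 512): 1.5,
    (768, 768): 2.0,
    (1024, 1024): 2.9,
    (1536, 1536): 5.2,
    (2048, 2048): 8.5,
}

for (w, h), t in timings.items():
    mpix_per_s = (w * h) / 1e6 / t  # pixels produced per second of wall time
    print(f"{w}x{h}: {t:.1f}s  ->  {mpix_per_s:.2f} MPix/s")
```

Note that per-pixel throughput is lowest at small resolutions, because fixed per-image overhead (text encoding, scheduling, VAE setup) dominates the short denoising loop.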

Part 4: Step Count Impact

Benchmark different step counts:

for steps in [4, 6, 8, 12, 20]:
    result = benchmark.benchmark_generation("Test prompt", num_runs=10, steps=steps)
    print(f"{steps} steps: {result['mean']:.2f}s")

Results (Z-Image Turbo):

  • 4 steps: 1.8s
  • 6 steps: 2.2s (optimal)
  • 8 steps: 2.9s
  • 12 steps: 4.1s

Part 5: Model Comparison

Compare Z-Image Turbo vs Base:

Model          Steps   Time   Quality
Z-Image Turbo  6       2.2s   Excellent
Z-Image Base   25      9.5s   Superior
Z-Image Red    8       3.1s   Good
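Per-image time converts directly into throughput, which is often the more intuitive number when comparing models. A quick calculation from the table's figures:

```python
# Figures from the comparison table above.
models = {
    'Z-Image Turbo': {'steps': 6, 'time_s': 2.2},
    'Z-Image Base': {'steps': 25, 'time_s': 9.5},
    'Z-Image Red': {'steps': 8, 'time_s': 3.1},
}

for name, m in models.items():
    m['images_per_min'] = 60 / m['time_s']  # throughput at batch size 1
    print(f"{name}: {m['images_per_min']:.1f} images/min")
```

Framed this way, the Turbo/Base trade-off is roughly a four-fold throughput difference in exchange for the Base model's higher fidelity.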

Part 6: Common Benchmarking Mistakes

Don't:

  • Run a test only once (single-run results are dominated by noise)
  • Change multiple variables simultaneously
  • Ignore warmup time
  • Forget to document configuration

Do:

  • Run 10+ iterations
  • Use median for reporting
  • Document everything
  • Test on a clean system (no competing GPU workloads)
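"Document everything" is easiest to honor when the benchmark script records its own environment. A stdlib-only sketch (field names are illustrative, not a fixed schema):

```python
import json
import platform
import sys
from datetime import datetime, timezone

def record_config(extra=None):
    """Capture the environment alongside benchmark numbers so runs
    remain comparable later. When PyTorch is installed, extend this
    with e.g. torch.__version__ and torch.cuda.get_device_name(0).
    """
    config = {
        'timestamp': datetime.now(timezone.utc).isoformat(),
        'python': sys.version.split()[0],
        'platform': platform.platform(),
        'machine': platform.machine(),
    }
    config.update(extra or {})
    return config

# Usage: store the config next to the results
print(json.dumps(record_config({'model': 'alibaba/Z-Image-Turbo', 'steps': 6}), indent=2))
```

Saving this JSON alongside each results file means a surprising regression months later can be traced to a driver, library, or hardware change rather than guessed at.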

Conclusion: Measure to Improve

Benchmarking transforms subjective impressions into objective data. You will know exactly how changes affect performance.

Key Takeaways:

  • Always run 10+ iterations
  • Use median for reporting
  • Change one variable at a time
  • Document everything

For optimization techniques, see our performance optimization guide. For monitoring ongoing performance, check our dashboard guide.
