
User Guide

This guide covers everything you need to know to use GenAI Bench effectively for benchmarking LLM endpoints.

What You'll Learn

  • Running Benchmarks - learn how to run benchmarks against various LLM endpoints
  • Multi-Cloud Setup - configure authentication for AWS, Azure, GCP, OCI, Baseten, and more
  • Baseten Support - learn about dual format support and flexible response handling
  • Docker Usage - run GenAI Bench in containerized environments
  • Excel Reports - generate comprehensive Excel reports from benchmark results
  • Performance Troubleshooting - debug unexpected benchmark results and common pitfalls

Common Workflows

Basic Benchmarking

  1. Choose your model provider - OpenAI, AWS Bedrock, Azure OpenAI, etc.
  2. Configure authentication - API keys, IAM roles, or service accounts
  3. Run the benchmark - Specify task type and parameters
  4. Analyze results - View real-time dashboard or generate reports
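Putting the four steps together, a minimal run might look like the sketch below. Only `--api-backend` and `--api-key` appear elsewhere in this guide; the `--task` flag and its value are assumptions for illustration, so check `genai-bench benchmark --help` for the exact option names.

```shell
# Steps 1-3: OpenAI provider, API-key auth, chat/completion task.
# Flag names beyond --api-backend/--api-key are assumed; verify
# against `genai-bench benchmark --help` before running.
genai-bench benchmark \
  --api-backend openai \
  --api-key $OPENAI_KEY \
  --task text-to-text
```

Step 4 then happens in the live dashboard, or afterwards via report generation (see Excel Reports above).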

Cross-Cloud Benchmarking

Benchmark models from one provider while storing results in another:

```shell
# Benchmark OpenAI, store results in AWS S3
genai-bench benchmark \
  --api-backend openai \
  --api-key $OPENAI_KEY \
  --upload-results \
  --storage-provider aws \
  --storage-bucket my-results
```

Multi-Modal Tasks

GenAI Bench supports text, embedding, vision, and reranking tasks:

  • text-to-text - Chat and completion tasks
  • text-to-embeddings - Embedding generation
  • image-text-to-text - Vision-language tasks
  • text-to-rerank - Document reranking
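As a sketch, switching task types would just mean changing the task name passed at launch. Passing it via a `--task` flag is an assumption here, not a documented option; consult `genai-bench benchmark --help` for the real interface.

```shell
# Embedding benchmark sketch; the --task flag and its accepted
# values are assumed to match the task names listed above.
genai-bench benchmark \
  --api-backend openai \
  --api-key $OPENAI_KEY \
  --task text-to-embeddings
```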

Need Help?