ModelTRTLLMRuntimeConfiguration

class baseten.client.modelconfig.ModelTRTLLMRuntimeConfiguration(*, kv_cache_free_gpu_mem_fraction=0.9, kv_cache_host_memory_bytes=None, enable_chunked_context=True, batch_scheduler_policy=ModelTRTLLMBatchSchedulerPolicy.guaranteed_no_evict, request_default_max_tokens=None, served_model_name=None, total_token_limit=500000, webserver_default_route=None, **extra_data)

Bases: BaseModel

Parameters:
model_config = {'extra': 'allow'}

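Configuration for the model. It should be a dictionary conforming to pydantic's ConfigDict (pydantic.config.ConfigDict). With 'extra': 'allow', keyword arguments that are not declared fields (the **extra_data in the signature) are accepted and kept on the instance rather than rejected.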
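
To illustrate what 'extra': 'allow' means in practice, here is a minimal sketch using a hypothetical stand-in class, RuntimeConfigSketch. It is not the real baseten class. It copies only a few field names and defaults from the signature above, assumes pydantic v2, and guesses the field types from those defaults.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict


class RuntimeConfigSketch(BaseModel):
    """Hypothetical stand-in for illustration; not the baseten class itself."""

    # Same setting as the documented model_config: unknown fields are allowed.
    model_config = ConfigDict(extra="allow")

    # Field names and defaults copied from the documented signature;
    # the type annotations are assumptions.
    kv_cache_free_gpu_mem_fraction: float = 0.9
    enable_chunked_context: bool = True
    request_default_max_tokens: Optional[int] = None
    total_token_limit: int = 500000


# Override one declared field and pass one undeclared keyword argument.
# "my_custom_flag" is an invented name used only to show extra-field handling.
cfg = RuntimeConfigSketch(kv_cache_free_gpu_mem_fraction=0.8, my_custom_flag=True)

print(cfg.kv_cache_free_gpu_mem_fraction)  # overridden value
print(cfg.total_token_limit)               # default value
print(cfg.model_extra)                     # undeclared fields collected here
```

Under pydantic v2, undeclared keyword arguments end up in model_extra and are included in model_dump(). With 'extra': 'forbid' the same call would raise a validation error instead.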