TRTLLMRuntimeConfigurationV2

class baseten.client.modelconfig.TRTLLMRuntimeConfigurationV2(*, max_seq_len=None, max_batch_size=256, max_num_tokens=8192, tensor_parallel_size=1, enable_chunked_prefill=True, served_model_name=None, patch_kwargs=None, **extra_data)

Bases: BaseModel

Parameters:
model_config = {'extra': 'allow'}

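Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict. Because 'extra' is set to 'allow', keyword arguments that do not match a declared field are accepted and stored on the instance (the **extra_data in the signature above) rather than rejected.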
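The effect of 'extra': 'allow' can be illustrated with a minimal sketch. It does not import the real class; RuntimeConfigSketch is a hypothetical stand-in that copies the field names and defaults from the signature above. The type annotations and the custom_flag keyword are assumptions made for illustration only, and the sketch assumes pydantic v2.

```python
from typing import Any, Optional

from pydantic import BaseModel, ConfigDict


class RuntimeConfigSketch(BaseModel):
    """Hypothetical stand-in, not the real TRTLLMRuntimeConfigurationV2."""

    # Same setting as the documented model_config: unknown keys are kept.
    model_config = ConfigDict(extra="allow")

    # Names and defaults come from the documented signature;
    # the type annotations are assumptions.
    max_seq_len: Optional[int] = None
    max_batch_size: int = 256
    max_num_tokens: int = 8192
    tensor_parallel_size: int = 1
    enable_chunked_prefill: bool = True
    served_model_name: Optional[str] = None
    patch_kwargs: Optional[dict[str, Any]] = None


# custom_flag is not a declared field; with extra="allow" it is stored, not rejected.
cfg = RuntimeConfigSketch(max_batch_size=64, custom_flag=True)

print(cfg.max_batch_size)  # 64
print(cfg.max_num_tokens)  # 8192 (default)
print(cfg.model_extra)     # {'custom_flag': True}
```

One consequence of this setting is that a misspelled field name is silently kept as an extra value instead of raising a validation error, so inspecting model_extra after construction can help catch typos.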