ModelTRTLLMBuildConfiguration

class baseten.client.modelconfig.ModelTRTLLMBuildConfiguration(*, base_model=ModelTRTLLMModel.decoder, max_seq_len=None, max_batch_size=256, max_num_tokens=8192, max_beam_width=1, max_prompt_embedding_table_size=0, checkpoint_repository=None, gather_all_token_logits=False, strongly_typed=False, quantization_type=ModelTRTLLMQuantizationType.no_quant, quantization_config=<factory>, tensor_parallel_count=1, pipeline_parallel_count=1, moe_expert_parallel_option=-1, sequence_parallel_count=1, plugin_configuration=<factory>, num_builder_gpus=None, speculator=None, lora_adapters=None, lora_configuration=None, skip_build_result=False, **extra_data)

Bases: BaseModel

Parameters:
model_config = {'extra': 'allow'}

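Configuration for the model. It should be a dictionary conforming to pydantic.config.ConfigDict. With 'extra': 'allow', keyword arguments that are not declared fields are accepted and kept on the instance, which corresponds to **extra_data in the signature above.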
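
As a rough illustration of what 'extra': 'allow' means in practice, the sketch below uses a small stand-in model rather than the real class. BuildConfigSketch and my_custom_flag are hypothetical names introduced only for this example; the three fields and their defaults are copied from the signature above, and pydantic v2 is assumed to be installed.

```python
from pydantic import BaseModel, ConfigDict


class BuildConfigSketch(BaseModel):
    """Hypothetical stand-in for illustration; not the real baseten class."""

    # Same setting as documented above: undeclared keyword arguments are kept.
    model_config = ConfigDict(extra="allow")

    # A few fields mirrored from the documented signature, with its defaults.
    max_batch_size: int = 256
    max_num_tokens: int = 8192
    tensor_parallel_count: int = 1


# my_custom_flag is not a declared field; with extra="allow" it is accepted
# instead of raising a validation error.
cfg = BuildConfigSketch(max_batch_size=64, my_custom_flag=True)

declared = cfg.max_batch_size   # declared field, overridden by the caller
extras = cfg.model_extra        # dict holding only the undeclared fields
dumped = cfg.model_dump()       # declared fields and extras together
```

Whether the real class validates or forwards such extra fields any further is not stated here, so treat the stand-in as a picture of the pydantic behavior only.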