===============
Guided Decoding
===============

Before reading this document, please first read :doc:`LLM Offline Inference By Python API <../llm/llm_offline_inference_en>` for the basic concepts and workflow.

*******************************
JSON Format Structured Decoding
*******************************

The AllSpark Engine uses lm-format-enforcer as the backend for guided decoding. Currently, only JSON format is supported.

`lm-format-enforcer repository `

Example
-------

Provide the ``response_format`` dict in the GenerationConfig of a request, for example:

.. code-block:: python

    # ASGenerationConfigBuilder is assumed to be imported already,
    # as described in the offline inference guide linked above.

    # Fill in basic arguments in gen_cfg
    gen_cfg_builder = ASGenerationConfigBuilder()
    gen_cfg_updates = {
        "temperature": 0.7,
        "top_k": 20,
        "top_p": 0.9,
        "seed": 1234,
        "max_length": 1024,
        "repetition_penalty": 1.05,
        "length_penalty": 1.0,
    }

    # An example of a simple schema
    schema_str = r'''
    {
        "properties": {
            "company name": {
                "type": "string"
            },
            "founding year": {
                "type": "integer"
            },
            "founding person": {
                "type": "string"
            },
            "founding city": {
                "type": "string"
            },
            "employees": {
                "type": "integer"
            }
        },
        "required": [
            "company name",
            "founding year",
            "founding person",
            "founding city",
            "employees"
        ],
        "type": "object"
    }'''

    # Build GenerationConfig with the 'response_format' dict
    gen_cfg_updates["response_format"] = {"type": "json_object", "json_schema": schema_str}
    # Or omit the schema to allow any valid JSON output:
    # gen_cfg_updates["response_format"] = {"type": "json_object"}
    gen_cfg_builder.update(gen_cfg_updates)
    config = gen_cfg_builder.build()
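
Because guided decoding constrains the output to the schema, it can be useful to confirm on the client side that a generated string really parses and carries the required fields. The following is a minimal sketch using only the Python standard library. It is not part of the AllSpark API: ``check_against_schema`` and ``sample_output`` are hypothetical names introduced for illustration, the sample string is hand-written rather than real engine output, and the check covers only the ``required`` and ``type`` keywords used by the schema in the example.

```python
import json

# Same schema as in the example, parsed into a dict for inspection.
schema = json.loads(r'''
{
    "properties": {
        "company name": {"type": "string"},
        "founding year": {"type": "integer"},
        "founding person": {"type": "string"},
        "founding city": {"type": "string"},
        "employees": {"type": "integer"}
    },
    "required": ["company name", "founding year", "founding person",
                 "founding city", "employees"],
    "type": "object"
}''')

# Hypothetical, hand-written output; a real string would come from the engine.
sample_output = (
    '{"company name": "Acme", "founding year": 1990, '
    '"founding person": "Jane Doe", "founding city": "Springfield", '
    '"employees": 120}'
)

# Only the two JSON Schema types that appear in the schema are mapped.
JSON_TYPES = {"string": str, "integer": int}


def check_against_schema(text, schema):
    """Return a list of problems; an empty list means no problem was found."""
    try:
        obj = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(obj, dict):
        return ["top-level value is not an object"]

    problems = []
    for key in schema.get("required", []):
        if key not in obj:
            problems.append(f"missing required key: {key}")
    for key, spec in schema.get("properties", {}).items():
        if key not in obj:
            continue
        expected = JSON_TYPES.get(spec.get("type"))
        value = obj[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if expected is not None and (
            isinstance(value, bool) or not isinstance(value, expected)
        ):
            problems.append(f"wrong type for key: {key}")
    return problems


problems = check_against_schema(sample_output, schema)
print("OK" if not problems else problems)
```

For full JSON Schema coverage, a dedicated validator library would be a better fit than this hand-rolled check.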