FIM Completion - MyTokenGate

1. Use Cases

FIM (Fill-in-the-Middle) completion lets you supply a prefix and a suffix, and the model generates the content that belongs between them. It is typically used for code completion and for filling in missing passages of text.
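To make the prefix/suffix idea concrete, here is a minimal sketch of how an editor might split a buffer at the cursor and later reassemble it. The helper names `split_at_cursor` and `assemble` are hypothetical and not part of any MyTokenGate API, and the `middle` string is a stand-in for model output rather than a real response.

```python
from typing import Tuple


def split_at_cursor(text: str, cursor: int) -> Tuple[str, str]:
    """Split text into (prefix, suffix) around a cursor offset."""
    if not 0 <= cursor <= len(text):
        raise ValueError("cursor must lie within the text")
    return text[:cursor], text[cursor:]


def assemble(prefix: str, middle: str, suffix: str) -> str:
    """Join the prefix, the generated middle, and the suffix."""
    return prefix + middle + suffix


source = "def add(a, b):\n    \n\nprint(add(1, 2))\n"
cursor = source.index("    \n") + 4  # offset right after the indentation
prefix, suffix = split_at_cursor(source, cursor)

# Stand-in for the model's output; a real call would supply this string.
middle = "return a + b"
print(assemble(prefix, middle, suffix))
```

The prefix and suffix produced this way are the two strings that a FIM request sends to the model.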

2. How to Use

2.1 Using the chat/completions Interface


from openai import OpenAI

# Point the OpenAI SDK at the MyTokenGate gateway.
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://gateway.mytokengate.com/v1"
)

response = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[
        {"role": "user", "content": "Complete this code"}
    ],
    # prefix and suffix are not standard chat.completions arguments,
    # so they are sent as extra fields in the request body.
    extra_body={
        "prefix": "def quick_sort(arr):\n    if len(arr) <= 1:\n        return arr\n    else:\n",
        "suffix": "\n# Test\narr = [3, 6, 8, 10, 1, 2, 1]\nsorted_arr = quick_sort(arr)\nprint(sorted_arr)"
    }
)

# Print the model's reply.
print(response.choices[0].message.content)

2.2 Using the completions Interface


from openai import OpenAI

# Point the OpenAI SDK at the MyTokenGate gateway.
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://gateway.mytokengate.com/v1"
)

response = client.completions.create(
    model="deepseek-v4-flash",
    # prompt carries the text before the gap; suffix carries the text after it.
    prompt="def quick_sort(arr):\n    if len(arr) <= 1:\n        return arr\n    else:\n",
    suffix="\n# Test\narr = [3, 6, 8, 10, 1, 2, 1]\nsorted_arr = quick_sort(arr)\nprint(sorted_arr)",
    max_tokens=4096
)

# Print the generated text.
print(response.choices[0].text)
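The example above prints only the generated text, so a caller usually has to place it back between the prompt and the suffix. The following is a rough sketch of that step. `FimResult` and `merge_completion` are hypothetical names, and the check on `finish_reason == "length"` assumes the gateway reports token-limit truncation the way the OpenAI-style completions API does, which this page does not confirm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FimResult:
    text: str        # full document: prefix + middle + suffix
    truncated: bool  # True when generation stopped at the token limit


def merge_completion(prefix: str, middle: str, suffix: str,
                     finish_reason: Optional[str]) -> FimResult:
    """Stitch the generated middle back between prefix and suffix."""
    # Assumption: "length" signals that max_tokens was reached, as in the
    # OpenAI-style completions API.
    return FimResult(text=prefix + middle + suffix,
                     truncated=(finish_reason == "length"))


# Stand-in values; in practice they would come from
# response.choices[0].text and response.choices[0].finish_reason.
result = merge_completion("a = [", "1, 2, 3", "]\n", "stop")
print(result.text, end="")
```

Flagging truncation separately lets the caller decide whether to retry with a larger `max_tokens` instead of inserting an incomplete middle section.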

3. Supported Models

The following models support FIM completion:

deepseek-r1 - DeepSeek reasoning model
deepseek-v4-flash - DeepSeek fast model
qwen3-coder-30b-a3b-instruct - Qwen code model

For the full model list, see the Model List page.
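As an illustration, a client could check the model name against the FIM-capable models listed above before sending a request. This is only a sketch: `FIM_MODELS` and `build_fim_request` are hypothetical names, the set is copied from this page and may fall out of date, and the returned dictionary simply mirrors the keyword arguments used in the completions example.

```python
from typing import Any, Dict

# Copied from the list on this page; not fetched from the gateway.
FIM_MODELS = frozenset({
    "deepseek-r1",
    "deepseek-v4-flash",
    "qwen3-coder-30b-a3b-instruct",
})


def build_fim_request(model: str, prefix: str, suffix: str,
                      max_tokens: int = 4096) -> Dict[str, Any]:
    """Return keyword arguments for client.completions.create."""
    if model not in FIM_MODELS:
        raise ValueError(f"{model!r} is not listed as supporting FIM")
    return {
        "model": model,
        "prompt": prefix,   # text before the gap
        "suffix": suffix,   # text after the gap
        "max_tokens": max_tokens,
    }


kwargs = build_fim_request("deepseek-v4-flash", "def f():\n    ", "\nf()\n")
# With a configured client: response = client.completions.create(**kwargs)
print(sorted(kwargs))
```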