Text Generation - MyTokenGate
Language Model (LLM) User Manual
1. Model Core Capabilities
1.1 Basic Functions
Text Generation: Generate coherent natural language text based on context, supporting various styles and genres.
Semantic Understanding: Deeply parse user intent, supporting multi-round dialogue management to ensure the coherence and accuracy of conversations.
Knowledge Q&A: Cover a wide range of knowledge domains, including science, technology, culture, history, etc., providing accurate knowledge answers.
Code Assistance: Support code generation, explanation, and debugging for multiple mainstream programming languages (such as Python, Java, C++, etc.).
1.2 Advanced Capabilities
Long Text Processing: Support context windows of 4k to 200k tokens, suitable for long document generation and complex dialogue scenarios.
Instruction Following: Precisely understand complex task instructions, such as “compare A/B schemes using a Markdown table.”
Style Control: Adjust output style through system prompts, supporting various styles such as academic, conversational, and poetry.
Multimodal Support: In addition to text generation, support tasks such as image description and speech-to-text.
2. API Call Specifications
2.1 Basic Request Structure
You can make end-to-end API requests using the OpenAI SDK.
Generate Dialogue
from openai import OpenAI

client = OpenAI(api_key="YOUR_KEY", base_url="https://gateway.mytokengate.com/v1")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a haiku about recursion in programming."}
    ],
    temperature=0.7,
    max_tokens=1024,
    stream=True
)

# Print each streamed text fragment as it arrives.
for chunk in response:
    # Some chunks may carry no choices or no text, so guard before reading.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Analyze an Image
from openai import OpenAI

client = OpenAI(api_key="YOUR_KEY", base_url="https://gateway.mytokengate.com/v1")

response = client.chat.completions.create(
    model="gemini-2.5-flash-image",
    messages=[
        {
            "role": "user",
            # A single user message can mix image and text parts.
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/image.png"
                    }
                },
                {
                    "type": "text",
                    "text": "What's in this image?"
                }
            ]
        }
    ],
    temperature=0.7,
    max_tokens=1024,
    stream=True
)

# Print each streamed text fragment as it arrives.
for chunk in response:
    # Some chunks may carry no choices or no text, so guard before reading.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Generate JSON Data
import json
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_KEY",
    base_url="https://gateway.mytokengate.com/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant designed to output JSON."},
        {"role": "user", "content": "Who were the men's and women's singles champions of the 2020 Olympic table tennis? Respond in JSON format."}
    ],
    # Ask the model to return a single JSON object.
    response_format={"type": "json_object"}
)

# Parse the returned JSON string, then pretty-print it.
data = json.loads(response.choices[0].message.content)
print(json.dumps(data, indent=2, ensure_ascii=False))

2.2 Message Structure Explanation
| Message Type | Function Description | Example Content |
|---|---|---|
| system | Model instructions, defining the AI’s role and general behavior | e.g., “You are a pediatrician with 10 years of experience.” |
| user | User input, passing the end user’s message to the model | e.g., “How should a persistent fever in a toddler be treated?” |
| assistant | Model-generated historical responses, providing examples of how it should respond to the current request | e.g., “I suggest measuring the temperature first…” |
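
The three roles combine into one ordered list for a multi-round dialogue: the system message first, then earlier user and assistant turns, then the current user input. The sketch below shows one way to assemble that list. `build_messages` and its parameters are hypothetical helpers for illustration, not part of the OpenAI SDK or the gateway API.

```python
from typing import Dict, List, Tuple


def build_messages(
    system_prompt: str,
    history: List[Tuple[str, str]],
    user_input: str,
) -> List[Dict[str, str]]:
    """Build a chat `messages` list (hypothetical helper, not an SDK function).

    `history` holds earlier (user_text, assistant_text) pairs, oldest first.
    """
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in history:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The current request always goes last.
    messages.append({"role": "user", "content": user_input})
    return messages


messages = build_messages(
    "You are a pediatrician with 10 years of experience.",
    [("How should a persistent fever in a toddler be treated?",
      "I suggest measuring the temperature first…")],
    "The reading is 38.5 °C. What should I do next?",
)
```

The resulting list can be passed as the `messages` argument of `client.chat.completions.create`.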
3. Model Selection Guide
Claude Series
- claude-opus-4-6: Strongest reasoning capability
- claude-sonnet-4-6: Balanced performance and cost
- claude-haiku-4-5-20251001: Fast response
GPT Series
- gpt-4o: Multimodal capability
- gpt-4.1: Enhanced version
- gpt-5 series: Latest models
Gemini Series
- gemini-2.5-pro: Complex tasks
- gemini-2.5-flash: Fast response
- gemini-3.1-pro-preview: Latest preview
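
In application code, the guide above can be kept as a small lookup table so the model name is chosen in one place. This is only a sketch: the model names are taken from the lists above, while the `MODELS` table, the priority labels, and `pick_model` are illustrative assumptions rather than anything the gateway defines.

```python
from typing import Dict

# Model names come from the lists above; the grouping keys ("reasoning",
# "balanced", "fast") are illustrative labels, not API parameters.
MODELS: Dict[str, Dict[str, str]] = {
    "claude": {
        "reasoning": "claude-opus-4-6",
        "balanced": "claude-sonnet-4-6",
        "fast": "claude-haiku-4-5-20251001",
    },
    "gemini": {
        "reasoning": "gemini-2.5-pro",
        "fast": "gemini-2.5-flash",
    },
}


def pick_model(series: str, priority: str) -> str:
    """Return the model name for a series and priority (hypothetical helper)."""
    try:
        return MODELS[series][priority]
    except KeyError:
        raise ValueError(
            f"no model listed for {series!r} with priority {priority!r}"
        ) from None


print(pick_model("claude", "fast"))  # → claude-haiku-4-5-20251001
```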
4. Detailed Explanation of Core Parameters
4.1 Creativity Control
# Temperature parameter (0.0~2.0)
temperature=0.5 # Balances creativity and reliability
# Nucleus sampling (top_p)
top_p=0.9 # Samples only from the smallest set of tokens whose cumulative probability reaches 90%

4.2 Output Limits
max_tokens=1000 # Maximum generation length per request
stop=["\n##", "<|end|>"] # Stop sequences
frequency_penalty=0.5 # Suppresses repetitive word usage (-2.0~2.0)
stream=True # Stream output

4.3 Common Issues with Language Model Scenarios
- Model Output Garbled
Try adjusting sampling parameters such as temperature, top_k, top_p, and frequency_penalty.
payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "user", "content": "1+1=?"}
    ],
    "max_tokens": 200,
    "temperature": 0.7,
    "top_k": 50,
    "top_p": 0.7,
    "frequency_penalty": 0
}
- Explanation of max_tokens
When choosing max_tokens, it is recommended to leave around 10k tokens of the model's context window free for input content.
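
One way to read this recommendation is to subtract the reserved input space from the model's context window and cap max_tokens at what remains. The sketch below assumes a 128k-token window purely for illustration, and `output_budget` is a hypothetical helper. A model may also enforce a separate, lower output limit, so treat the result as an upper bound.

```python
def output_budget(context_window: int, reserved_input: int = 10_000) -> int:
    """Return the largest max_tokens that leaves `reserved_input` tokens for input."""
    if context_window <= reserved_input:
        raise ValueError("context window is too small for the reserved input space")
    return context_window - reserved_input


# 128_000 is an assumed window size, not a documented value for any model.
max_tokens = output_budget(128_000)
print(max_tokens)  # → 118000
```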
- Output Truncation Issues
  - Set max_tokens to an appropriate value
  - Use streaming requests for long outputs
  - Increase the client timeout
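
Truncation can also be detected in code. In the OpenAI chat completions format, a choice whose `finish_reason` is `"length"` stopped because it reached the max_tokens limit. The sketch below checks that field; `is_truncated` is a hypothetical helper, and the `SimpleNamespace` object is only a stand-in shaped like a real response so the example runs without a network call.

```python
from types import SimpleNamespace


def is_truncated(response) -> bool:
    """True if the first choice stopped because it hit the max_tokens limit."""
    return response.choices[0].finish_reason == "length"


# Stand-in object shaped like a chat completion response, for illustration only.
fake_response = SimpleNamespace(
    choices=[SimpleNamespace(finish_reason="length")]
)
print(is_truncated(fake_response))  # → True
```

When this returns True, a larger max_tokens or a follow-up request asking the model to continue are the usual options. The client timeout can be raised with the `timeout` argument when constructing the `OpenAI` client.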
- Error Code Handling
| Error Code | Common Cause | Solution |
|---|---|---|
| 400 | Parameter format error | Check parameter ranges |
| 401 | API Key not correctly set | Verify the API Key |
| 403 | Insufficient permissions | Refer to error messages |
| 429 | Request rate limit exceeded | Implement exponential backoff retry |
| 503/504 | Model overload | Switch to backup model nodes |
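
The exponential backoff retry suggested for 429 responses can be sketched as below. `ApiError` and `call_with_backoff` are hypothetical names used so the example runs on its own. With the OpenAI SDK, the equivalent would be to catch the SDK's own rate-limit exception. The delay values and the 10% jitter are illustrative assumptions.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Only 429 is retried here, matching the table above.
RETRYABLE_STATUS = {429}


class ApiError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status_code: int):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def call_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `request` on retryable statuses, doubling the delay each time."""
    for attempt in range(max_attempts):
        try:
            return request()
        except ApiError as err:
            last_attempt = attempt == max_attempts - 1
            if err.status_code not in RETRYABLE_STATUS or last_attempt:
                raise
            # Delay doubles each attempt (1s, 2s, 4s, ...) plus up to 10% jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

Errors such as 400, 401, and 403 are re-raised immediately, since repeating the same request would not fix them.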
5. Billing and Quota Management
5.1 Billing Formula
Total Cost = (Input Tokens × Input Unit Price) + (Output Tokens × Output Unit Price)
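
As a worked instance of the formula, the sketch below computes the cost of one request. The unit prices are made-up per-token values chosen only to show the arithmetic, and `total_cost` is a hypothetical helper. Actual prices depend on the model and are not stated here.

```python
from decimal import Decimal


def total_cost(
    input_tokens: int,
    output_tokens: int,
    input_unit_price: Decimal,
    output_unit_price: Decimal,
) -> Decimal:
    """Total Cost = input tokens x input price + output tokens x output price."""
    return input_tokens * input_unit_price + output_tokens * output_unit_price


# Hypothetical per-token prices, for illustration only.
cost = total_cost(
    input_tokens=1200,
    output_tokens=300,
    input_unit_price=Decimal("0.000002"),
    output_unit_price=Decimal("0.000008"),
)
print(cost)
```

`Decimal` is used instead of `float` so that small per-token prices add up without rounding drift.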
6. Application Scenarios
6.1 Technical Documentation Generation
from openai import OpenAI

client = OpenAI(api_key="YOUR_KEY", base_url="https://gateway.mytokengate.com/v1")

response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{
        "role": "user",
        "content": "Write a Python tutorial on asynchronous web scraping, including code examples and precautions."
    }],
    temperature=0.7,
    max_tokens=4096
)

# Print the generated tutorial.
print(response.choices[0].message.content)

6.2 Data Analysis Reports
from openai import OpenAI

client = OpenAI(api_key="YOUR_KEY", base_url="https://gateway.mytokengate.com/v1")

response = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[
        {"role": "system", "content": "You are a data analysis expert. Output results in Markdown."},
        {"role": "user", "content": "Analyze the sales trends of new energy vehicles in 2023."}
    ],
    temperature=0.7,
    max_tokens=4096
)

# Print the generated Markdown report.
print(response.choices[0].message.content)