Inference Providers documentation
Z.ai
All supported Z.ai models can be found here
Z.ai is an AI platform that provides large language models from the GLM series. Its flagship models use a Mixture-of-Experts (MoE) architecture and offer advanced reasoning, coding, and agentic capabilities.
For the latest pricing, visit the pricing page.
Resources
- Website: https://z.ai/
- Documentation: https://docs.z.ai/
- API Documentation: https://docs.z.ai/api-reference/introduction
- GitHub: https://github.com/zai-org
- Hugging Face: https://huggingface.co/zai-org
 
Supported tasks
Chat Completion (LLM)
Find out more about Chat Completion (LLM) here.
import os
from openai import OpenAI
client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ["HF_TOKEN"],
)
completion = client.chat.completions.create(
    model="zai-org/GLM-4.6:zai-org",
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ],
)
print(completion.choices[0].message)

Chat Completion (VLM)
Find out more about Chat Completion (VLM) here.
import os
from openai import OpenAI
client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ["HF_TOKEN"],
)
completion = client.chat.completions.create(
    model="zai-org/GLM-4.5V:zai-org",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Describe this image in one sentence."
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                    }
                }
            ]
        }
    ],
)
print(completion.choices[0].message)
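
Both examples above select the provider by appending `:zai-org` to the model ID, and the VLM example mixes a text part and an image part in one user message. As a minimal sketch, the same request arguments can be assembled with small helpers before they are sent. The helper names `routed_model`, `vision_message`, and `build_request` are hypothetical and not part of any library; only the dictionary shapes and the `model_id:provider` string are taken from the examples above.

```python
def routed_model(model_id: str, provider: str) -> str:
    """Join a Hub model ID and a provider name as 'model_id:provider'."""
    return f"{model_id}:{provider}"


def vision_message(prompt: str, image_url: str) -> dict:
    """Build one user message holding a text part and an image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


def build_request(model_id: str, provider: str, messages: list) -> dict:
    """Return keyword arguments shaped like the chat completion calls above."""
    return {"model": routed_model(model_id, provider), "messages": messages}


# Hypothetical usage: rebuild the VLM request from the example above.
request = build_request(
    "zai-org/GLM-4.5V",
    "zai-org",
    [
        vision_message(
            "Describe this image in one sentence.",
            "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg",
        )
    ],
)
print(request["model"])
```

Assuming the `client` object from the examples above, the result could then be sent with `client.chat.completions.create(**request)`. Keeping request construction separate from the network call makes the payload easy to inspect or unit-test without an `HF_TOKEN`.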
 