GLM-4.6
About GLM-4.6
Compared with GLM-4.5, GLM-4.6 brings several key improvements: a longer context window (expanded to 200K tokens), stronger coding performance, more advanced reasoning, more capable agents, and more refined writing.
Discover how GLM-4.6's advanced reasoning, superior coding, and expansive 200K context window solve complex, real-world challenges.
Large-Scale Codebase Refactoring
Analyze vast codebases (e.g., Python, Go) for architectural improvements, security vulnerabilities, and performance bottlenecks across entire projects.
Use Case Example:
"Refactored a legacy Python data pipeline, identifying redundant modules and suggesting optimized design patterns, reducing execution time by 25%."
Autonomous Workflow Agents
Design and deploy intelligent agents to automate complex, multi-step business processes, integrating various tools and APIs with long-context reasoning.
Use Case Example:
"Developed an agent that autonomously researches market trends, generates investment reports using financial APIs, and drafts executive summaries, cutting research time by 70%."
Engineering Design Optimization
Assist engineers in optimizing complex designs by simulating scenarios, analyzing performance data, and suggesting material or structural improvements.
Use Case Example:
"Optimized a drone's aerodynamic design by simulating various wing geometries and material compositions, leading to a 10% increase in flight efficiency."
Regulatory Compliance Auditing
Audit extensive legal documents and regulatory frameworks to identify compliance gaps and potential risks, and generate detailed reports on the findings.
Use Case Example:
"Reviewed 150+ pages of GDPR regulations against a company's data handling policies, flagging 7 critical non-compliance issues and suggesting remediation steps."
Dynamic Front-End Generation
Generate visually polished, interactive front-end code (e.g., React, Vue) from high-level descriptions or wireframes, drawing on the model's stronger coding performance.
Use Case Example:
"Created a fully responsive e-commerce product page in React, including dynamic filtering and sorting, based on a simple text prompt and design mockups."
Metadata
Specification
State: Deprecated
Architecture: Transformer MoE
Calibrated: Yes
Mixture of Experts: Yes
Total Parameters: 335B
Activated Parameters:
Reasoning: No
Precision: FP8
Context length: 205K
Max Tokens: 205K
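The context length and max-token figures above interact when a request is sized. The following is a minimal sketch of that arithmetic, not a documented rule for this model: it assumes the prompt and the completion share one context window, and that "205K" means 205,000 tokens. The function name `output_budget` is hypothetical.

```python
# Figures from the specification above; reading "205K" as 205,000 is an assumption.
CONTEXT_LENGTH = 205_000
MAX_TOKENS = 205_000


def output_budget(prompt_tokens: int,
                  context_length: int = CONTEXT_LENGTH,
                  max_tokens: int = MAX_TOKENS) -> int:
    """Return how many tokens remain for the completion.

    Assumes prompt and completion share the same context window,
    so the completion is capped by both limits.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = context_length - prompt_tokens
    # Never report a negative budget when the prompt alone overflows the window.
    return max(0, min(max_tokens, remaining))


if __name__ == "__main__":
    print(output_budget(180_000))
    print(output_budget(250_000))
```

Under these assumptions, a long prompt leaves only the unused part of the window for output, regardless of the advertised maximum.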
Compare with Other Models
See how this model stacks up against others.

GLM-5.1 (Z.ai, chat)
Release on: Apr 3, 2026
Total Context: 205K
Max output: 131K
Input: $1.4 / M Tokens
Output: $4.4 / M Tokens

GLM-5V-Turbo (Z.ai, chat)
Release on: Mar 30, 2026
Total Context: 205K
Max output: 131K
Input: $1.2 / M Tokens
Output: $4.0 / M Tokens

GLM-5 (Z.ai, chat)
Release on: Feb 12, 2026
Total Context: 205K
Max output: 131K
Input: $0.95 / M Tokens
Output: $2.55 / M Tokens

GLM-4.7 (Z.ai, chat)
Release on: Dec 23, 2025
Total Context: 205K
Max output: 205K
Input: $0.42 / M Tokens
Output: $2.2 / M Tokens

GLM-4.6V (Z.ai, chat)
Release on: Dec 8, 2025
Total Context: 131K
Max output: 131K
Input: $0.3 / M Tokens
Output: $0.9 / M Tokens

GLM-4.6 (Z.ai, chat)
Release on: Oct 4, 2025
Total Context: 205K
Max output: 205K
Input: $0.39 / M Tokens
Output: $1.9 / M Tokens

GLM-4.5-Air (Z.ai, chat)
Release on: Jul 28, 2025
Total Context: 131K
Max output: 131K
Input: $0.14 / M Tokens
Output: $0.86 / M Tokens

GLM-4.5V (Z.ai, chat)
Release on: Aug 13, 2025
Total Context: 66K
Max output: 66K
Input: $0.14 / M Tokens
Output: $0.86 / M Tokens

GLM-4.1V-9B-Thinking (Z.ai, chat)
Release on: Jul 4, 2025
Total Context: 66K
Max output: 66K
Input: $0.035 / M Tokens
Output: $0.14 / M Tokens
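Because prices are quoted per million tokens, the cost of a single request is easy to misjudge. The sketch below shows the arithmetic using the GLM-4.6 prices listed above ($0.39 input, $1.9 output per million tokens). The function name `estimate_cost` and the example token counts are illustrative assumptions, and actual billing may differ (for example through caching or rounding rules not described here).

```python
from decimal import Decimal

# USD per million tokens, copied from the GLM-4.6 entry above.
INPUT_PRICE_PER_M = Decimal("0.39")
OUTPUT_PRICE_PER_M = Decimal("1.9")


def estimate_cost(input_tokens: int,
                  output_tokens: int,
                  input_price: Decimal = INPUT_PRICE_PER_M,
                  output_price: Decimal = OUTPUT_PRICE_PER_M) -> Decimal:
    """Estimate the USD cost of one request from per-million-token prices."""
    million = Decimal(1_000_000)
    total = Decimal(input_tokens) * input_price + Decimal(output_tokens) * output_price
    return total / million


if __name__ == "__main__":
    # Hypothetical request: a 150,000-token prompt and a 4,000-token reply.
    cost = estimate_cost(150_000, 4_000)
    print(f"${cost:.4f}")  # prints $0.0661
```

Passing another model's prices as `input_price` and `output_price` gives a like-for-like comparison for the same token counts. `Decimal` is used instead of floats so that small per-request amounts add up without binary rounding drift.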
