A fully open-source, 1B-parameter Chinese-centric language model trained from scratch, with complete access to code, data, checkpoints, and logs under the Apache 2.0 license.
Steel-LLM is an open-source LLM developed by zhanshijinwat, built from scratch on more than 1 trillion tokens of primarily Chinese data, yielding a 1-billion-parameter model. It follows the LLaMA architecture and includes variants such as chat-fine-tuned and reasoning models. Everything, including the training code, data pipeline, checkpoints, and evaluation logs, is publicly available under the permissive Apache 2.0 license, ensuring full reproducibility. As a transparent and accessible alternative to closed-source or partially open LLMs such as GPT-4, Gemma, and Qwen, Steel-LLM delivers strong Chinese-language performance, scoring roughly 38–42 on CEVAL and 33–36 on CMMLU and outperforming several earlier, larger models.
Key features include:
- 1 billion-parameter model trained on 1T+ tokens, mainly Chinese with some English mixed in
- Fully open pipeline: all training code, dataset processing, checkpoints, and logs available
- Variants released: base, chat, reasoning, and vision-language tuned models
- Benchmark performance: CEVAL ≈ 41.9, CMMLU ≈ 36.1 on the chat-v2 variant
- Apache 2.0 license: free for commercial and academic use
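Since the checkpoints are published openly, running the chat variant locally should look like any other Hugging Face causal LM. The sketch below is a minimal example under that assumption; the repo id `zhanshijin/Steel-LLM` is a placeholder, so check the project's GitHub page for the actual checkpoint location.

```python
# Minimal inference sketch for a Steel-LLM chat checkpoint.
# NOTE: the repo id below is an assumption, not a confirmed location.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "zhanshijin/Steel-LLM"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # a 1B-parameter model fits on a consumer GPU
    device_map="auto",
    trust_remote_code=True,
)

# Chat-style prompting; we assume the chat-tuned variant ships a chat
# template with its tokenizer.
messages = [{"role": "user", "content": "用一句话介绍你自己。"}]  # "Introduce yourself in one sentence."
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```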
Use cases include:
- Research into Chinese-language LLM training under limited resources
- Building chatbots or assistants with Chinese-centric understanding
- Fine-tuning models for domain-specific use (see the sketch after this list)
- Educational or reproducible LLM project pipelines
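For the domain-specific fine-tuning use case, a parameter-efficient approach such as LoRA keeps costs low even on modest hardware. The sketch below assumes the same hypothetical repo id as above, a plain-text training file named `domain_corpus.txt`, and LLaMA-style attention projection names (`q_proj`, `v_proj`), which should apply given that Steel-LLM follows the LLaMA architecture.

```python
# LoRA fine-tuning sketch with peft + transformers; names marked as
# placeholders are assumptions, not confirmed project details.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "zhanshijin/Steel-LLM"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# LLaMA-style models name their attention projections q_proj/v_proj,
# so we target those; adjust if the released checkpoint differs.
peft_config = LoraConfig(r=8, lora_alpha=16,
                         target_modules=["q_proj", "v_proj"],
                         task_type="CAUSAL_LM")
model = get_peft_model(model, peft_config)

# Placeholder corpus: swap in your own domain-specific text file.
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="steel-llm-lora",
                           per_device_train_batch_size=4,
                           num_train_epochs=1,
                           learning_rate=2e-4,
                           fp16=True),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```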