icon of Steel‑LLM

Steel‑LLM

A fully open-source, 1B-parameter Chinese-centric language model trained from scratch, with complete access to code, data, checkpoints, and logs under the Apache 2.0 license.

802 stars · 77 forks · Jupyter Notebook · Updated this year

What Steel‑LLM does

Steel‑LLM is an open-source LLM developed by zhanshijinwat, built from scratch on over 1 trillion tokens of primarily Chinese data, resulting in a 1-billion-parameter model. It follows the LLaMA architecture and includes variants such as chat-fine-tuned and reasoning models. Everything, from the training code and data pipeline to checkpoints and evaluation logs, is publicly available under the permissive Apache 2.0 license, ensuring full reproducibility. As a transparent and accessible alternative to closed-source or partially open LLMs such as GPT‑4, Gemma, and Qwen, Steel‑LLM delivers strong Chinese-language performance, scoring roughly 38–42 on CEVAL and 33–36 on CMMLU and outperforming some earlier and larger models.

Key features include:

- 1 billion-parameter model trained on 1T+ tokens, mainly Chinese with some English mixed in
- Fully open pipeline: all training code, dataset processing, checkpoints, and logs available
- Variants released: base, chat, reasoning, and vision-language tuned models
- Benchmark performance: CEVAL ≈ 41.9 and CMMLU ≈ 36.1 for the chat-v2 variant
- Apache 2.0 license: free for commercial and academic use

Use cases include:

- Research into Chinese-language LLM training under limited resources
- Building chatbots or assistants with Chinese-centric understanding
- Fine-tuning models for domain-specific use
- Educational or reproducible LLM project pipelines
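As a minimal sketch of the chatbot and fine-tuning use cases above, the released checkpoints could be loaded through the Hugging Face `transformers` library. Note that the model id `zhanshijinwat/Steel-LLM-chat` and the plain-text chat format below are assumptions for illustration only; consult the project's repository for the actual Hub id and prompt conventions.

```python
# Hedged sketch: loading a Steel-LLM chat checkpoint via transformers.
# The model id and chat format are ASSUMPTIONS, not confirmed by this page.


def build_chat_prompt(messages):
    """Flatten a list of {role, content} dicts into a plain prompt string.

    This is a generic fallback format; the released tokenizer may ship
    its own chat template (tokenizer.apply_chat_template), which should
    be preferred if present.
    """
    lines = [f"{m['role']}: {m['content']}" for m in messages]
    lines.append("assistant:")
    return "\n".join(lines)


def chat(user_msg, model_id="zhanshijinwat/Steel-LLM-chat"):
    # Imported lazily so the sketch stays importable without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    prompt = build_chat_prompt([{"role": "user", "content": user_msg}])
    inputs = tok(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=64)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tok.decode(out[0][inputs["input_ids"].shape[1]:],
                      skip_special_tokens=True)
```

For domain-specific fine-tuning, the same checkpoint would typically be passed to a standard training loop or a library such as `peft`, since the full training pipeline is open.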

GitHub Activity

802 Stars
77 Forks
4 Open Issues

Tech Stack

Language: Jupyter Notebook
