These LLMs are all licensed for commercial use (e.g., Apache 2.0, MIT, OpenRAIL-M). Contributions welcome!

Open LLMs for code

Language Model | Release Date | Checkpoints | Paper/Blog | Params (B) | Context Length | License |
---|---|---|---|---|---|---|
SantaCoder | 2023/01 | santacoder | SantaCoder: don't reach for the stars! | 1.1 | 2048 | OpenRAIL-M v1 |
StarCoder | 2023/05 | starcoder | StarCoder: A State-of-the-Art LLM for Code, StarCoder: May the source be with you! | 15 | 8192 | OpenRAIL-M v1 |
StarChat Alpha | 2023/05 | starchat-alpha | Creating a Coding Assistant with StarCoder | 16 | 8192 | OpenRAIL-M v1 |
Replit Code | 2023/05 | replit-code-v1-3b | Training a SOTA Code LLM in 1 week and Quantifying the Vibes — with Reza Shabani of Replit | 2.7 | infinity? (ALiBi) | CC BY-SA-4.0 |
CodeGen2 | 2023/04 | codegen2 1B-16B | CodeGen2: Lessons for Training LLMs on Programming and Natural Languages | 1 - 16 | 2048 | Apache 2.0 |
CodeT5+ | 2023/05 | CodeT5+ | CodeT5+: Open Code Large Language Models for Code Understanding and Generation | 0.22 - 16 | 512 | BSD-3-Clause |
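
The "infinity? (ALiBi)" context length listed for Replit Code reflects how ALiBi (Attention with Linear Biases) handles position: the model has no learned position embeddings, and instead adds a distance-proportional penalty to the attention scores. The architecture therefore imposes no hard length cap, although quality on inputs longer than those seen in training is not guaranteed, hence the question mark. Below is a minimal sketch of the bias computation, assuming a power-of-two head count and causal attention; the function names `alibi_slopes` and `alibi_bias` are illustrative and not taken from the Replit code base.

```python
def alibi_slopes(num_heads: int) -> list[float]:
    """Per-head slopes as a geometric sequence starting at 2^(-8/num_heads).

    Assumption: num_heads is a power of two (the simple case).
    """
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles power-of-two head counts")
    ratio = 2.0 ** (-8.0 / num_heads)
    return [ratio ** (h + 1) for h in range(num_heads)]


def alibi_bias(slope: float, seq_len: int) -> list[list[float]]:
    """Causal bias matrix added to attention scores before softmax.

    bias[i][j] = -slope * (i - j) for keys at or before the query (j <= i),
    and -inf for future keys, which masks them out.
    """
    neg_inf = float("-inf")
    return [
        [-slope * (i - j) if j <= i else neg_inf for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Example: 8 heads, bias for the first head over a 4-token sequence.
slopes = alibi_slopes(8)
bias = alibi_bias(slopes[0], 4)
```

Because the bias depends only on the distance `i - j`, the same function works for any `seq_len`, which is why the table does not list a fixed number.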

Open LLM datasets for pre-training

Name | Release Date | Paper/Blog | Dataset | Tokens (T) | License |
---|---|---|---|---|---|
starcoderdata | 2023/05 | StarCoder: A State-of-the-Art LLM for Code | starcoderdata | 0.25 | Apache 2.0 |
RedPajama | 2023/04 | RedPajama, a project to create leading open-source models, starts by reproducing LLaMA training dataset of over 1.2 trillion tokens | RedPajama-Data | 1.2 | Apache 2.0 |

Open LLM datasets for instruction-tuning

Name | Release Date | Paper/Blog | Dataset | Samples (K) | License |
---|---|---|---|---|---|
MPT-7B-Instruct | 2023/05 | Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs | dolly_hhrlhf | 59 | CC BY-SA-3.0 |
databricks-dolly-15k | 2023/04 | Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM | databricks-dolly-15k | 15 | CC BY-SA-3.0 |
OIG (Open Instruction Generalist) | 2023/03 | THE OIG DATASET | OIG | 44,000 | Apache 2.0 |

Open LLM datasets for alignment-tuning

Name | Release Date | Paper/Blog | Dataset | Samples (K) | License |
---|---|---|---|---|---|
OpenAssistant Conversations Dataset | 2023/04 | OpenAssistant Conversations - Democratizing Large Language Model Alignment | oasst1 | 161 | Apache 2.0 |