Open-source LLMs


Muhammed Uludag

May 22, 2023, 12:03:08 PM
to yz-ve...@googlegroups.com


Open LLMs

These LLMs are all licensed for commercial use (e.g., Apache 2.0, MIT, OpenRAIL-M). Contributions welcome!

| Language Model | Release Date | Checkpoints | Paper/Blog | Params (B) | Context Length | License |
| --- | --- | --- | --- | --- | --- | --- |
| T5 | 2019/10 | T5 & Flan-T5, Flan-T5-xxl (HF) | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | 0.06 - 11 | 512 | Apache 2.0 |
| UL2 | 2022/10 | UL2 & Flan-UL2, Flan-UL2 (HF) | UL2 20B: An Open Source Unified Language Learner | 20 | 512, 2048 | Apache 2.0 |
| Cerebras-GPT | 2023/03 | Cerebras-GPT | Cerebras-GPT: A Family of Open, Compute-efficient, Large Language Models (Paper) | 0.111 - 13 | 2048 | Apache 2.0 |
| Open Assistant (Pythia family) | 2023/03 | OA-Pythia-12B-SFT-8, OA-Pythia-12B-SFT-4, OA-Pythia-12B-SFT-1 | Democratizing Large Language Model Alignment | 12 | 2048 | Apache 2.0 |
| Pythia | 2023/04 | pythia 70M - 12B | Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling | 0.07 - 12 | 2048 | Apache 2.0 |
| Dolly | 2023/04 | dolly-v2-12b | Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM | 3, 7, 12 | 2048 | MIT |
| DLite | 2023/05 | dlite-v2-1_5b | Announcing DLite V2: Lightweight, Open LLMs That Can Run Anywhere | 0.124 - 1.5 | 1024 | Apache 2.0 |
| RWKV | 2021/08 | RWKV, ChatRWKV | The RWKV Language Model (and my LM tricks) | 0.1 - 14 | infinity (RNN) | Apache 2.0 |
| GPT-J-6B | 2021/06 | GPT-J-6B, GPT4All-J | GPT-J-6B: 6B JAX-Based Transformer | 6 | 2048 | Apache 2.0 |
| GPT-NeoX-20B | 2022/04 | GPT-NEOX-20B | GPT-NeoX-20B: An Open-Source Autoregressive Language Model | 20 | 2048 | Apache 2.0 |
| Bloom | 2022/11 | Bloom | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model | 176 | 2048 | OpenRAIL-M v1 |
| StableLM-Alpha | 2023/04 | StableLM-Alpha | Stability AI Launches the First of its StableLM Suite of Language Models | 3 - 65 | 4096 | CC BY-SA-4.0 |
| FastChat-T5 | 2023/04 | fastchat-t5-3b-v1.0 | We are excited to release FastChat-T5: our compact and commercial-friendly chatbot! | 3 | 512 | Apache 2.0 |
| h2oGPT | 2023/05 | h2oGPT | Building the World’s Best Open-Source Large Language Model: H2O.ai’s Journey | 12 - 20 | 256 - 2048 | Apache 2.0 |
| MPT-7B | 2023/05 | MPT-7B, MPT-7B-Instruct | Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs | 7 | 84k (ALiBi) | Apache 2.0, CC BY-SA-3.0 |
| RedPajama-INCITE | 2023/05 | RedPajama-INCITE | Releasing 3B and 7B RedPajama-INCITE family of models including base, instruction-tuned & chat models | 3 - 7 | 2048 | Apache 2.0 |
| OpenLLaMA | 2023/05 | OpenLLaMA-7b-preview-300bt | OpenLLaMA: An Open Reproduction of LLaMA | 7 | 2048 | Apache 2.0 |

Open LLMs for code

| Language Model | Release Date | Checkpoints | Paper/Blog | Params (B) | Context Length | License |
| --- | --- | --- | --- | --- | --- | --- |
| SantaCoder | 2023/01 | santacoder | SantaCoder: don't reach for the stars! | 1.1 | 2048 | OpenRAIL-M v1 |
| StarCoder | 2023/05 | starcoder | StarCoder: A State-of-the-Art LLM for Code, StarCoder: May the source be with you! | 15 | 8192 | OpenRAIL-M v1 |
| StarChat Alpha | 2023/05 | starchat-alpha | Creating a Coding Assistant with StarCoder | 16 | 8192 | OpenRAIL-M v1 |
| Replit Code | 2023/05 | replit-code-v1-3b | Training a SOTA Code LLM in 1 week and Quantifying the Vibes — with Reza Shabani of Replit | 2.7 | infinity? (ALiBi) | CC BY-SA-4.0 |
| CodeGen2 | 2023/04 | codegen2 1B-16B | CodeGen2: Lessons for Training LLMs on Programming and Natural Languages | 1 - 16 | 2048 | Apache 2.0 |
| CodeT5+ | 2023/05 | CodeT5+ | CodeT5+: Open Code Large Language Models for Code Understanding and Generation | 0.22 - 16 | 512 | BSD-3-Clause |

Open LLM datasets for pre-training

| Name | Release Date | Paper/Blog | Dataset | Tokens (T) | License |
| --- | --- | --- | --- | --- | --- |
| starcoderdata | 2023/05 | StarCoder: A State-of-the-Art LLM for Code | starcoderdata | 0.25 | Apache 2.0 |
| RedPajama | 2023/04 | RedPajama, a project to create leading open-source models, starts by reproducing LLaMA training dataset of over 1.2 trillion tokens | RedPajama-Data | 1.2 | Apache 2.0 |

Open LLM datasets for instruction-tuning

| Name | Release Date | Paper/Blog | Dataset | Samples (K) | License |
| --- | --- | --- | --- | --- | --- |
| MPT-7B-Instruct | 2023/05 | Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs | dolly_hhrlhf | 59 | CC BY-SA-3.0 |
| databricks-dolly-15k | 2023/04 | Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM | databricks-dolly-15k | 15 | CC BY-SA-3.0 |
| OIG (Open Instruction Generalist) | 2023/03 | THE OIG DATASET | OIG | 44,000 | Apache 2.0 |

Open LLM datasets for alignment-tuning

| Name | Release Date | Paper/Blog | Dataset | Samples (K) | License |
| --- | --- | --- | --- | --- | --- |
| OpenAssistant Conversations Dataset | 2023/04 | OpenAssistant Conversations - Democratizing Large Language Model Alignment | oasst1 | 161 | Apache 2.0 |

Evals on open LLMs



Twitter: @metamathician

Akif Eyler

May 22, 2023, 2:49:41 PM
to yz-ve...@googlegroups.com
Professor, what do we learn from this list, and why does it matter?



Muhammed Uludag

May 23, 2023, 4:07:56 AM
to yz-ve...@googlegroups.com
The LLMs on this list are open source: you can download them to your own computer and train them yourself.
Some of them may require a GPU.
However, you should not expect ChatGPT-style generative capabilities.
They will not be very practical either, both because of output quality and because they run slowly.
I do not know how well they handle Turkish.
I wish someone among us would run these models and compare their Turkish capabilities.
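
On the point that some of these models may need a GPU, a rough back-of-the-envelope check is to multiply the parameter count by the bytes stored per parameter. The sketch below is only an estimate under stated assumptions: it counts the weights alone (no activations, optimizer state, or cache), the helper name `weight_memory_gib` is made up for illustration, and the parameter counts are the ones listed in the table in the first message.

```python
def weight_memory_gib(params_billion: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed just to hold the weights, in GiB.

    Assumption: every parameter is stored at the same precision
    (2 bytes for fp16/bf16, 4 bytes for fp32). Training needs far more.
    """
    return params_billion * 1e9 * bytes_per_param / 2**30


# Parameter counts in billions, copied from the "Params (B)" column.
models = {
    "dlite-v2-1_5b": 1.5,
    "MPT-7B": 7,
    "dolly-v2-12b": 12,
    "GPT-NeoX-20B": 20,
}

for name, size in models.items():
    fp16 = weight_memory_gib(size, bytes_per_param=2)
    fp32 = weight_memory_gib(size, bytes_per_param=4)
    print(f"{name}: ~{fp16:.1f} GiB in fp16, ~{fp32:.1f} GiB in fp32")
```

Comparing these numbers with the memory of your own machine or graphics card gives a first idea of which checkpoints are realistic to try locally.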

Building a ChatGPT-style generative LLM takes an enormous amount of effort.
However, the open-source models listed here can be effective for more specific NLP tasks, for example:
Sentiment analysis
Semantic search
Named-entity recognition
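
To make the semantic search item concrete, here is a minimal sketch of the ranking step only: turn the query and each document into a vector, then sort documents by cosine similarity. The `embed` function is a toy stand-in (plain word counts) so the example runs without downloading anything. In a real setup it would be replaced by embeddings taken from one of the models above, which is an assumption and not something tested in this thread. All function names are illustrative.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a model embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, documents: list[str], top_k: int = 3) -> list[tuple[float, str]]:
    """Return the top_k documents ranked by similarity to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(doc)), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


docs = [
    "open source language models with a commercial license",
    "recipes for lentil soup",
    "instruction tuning datasets for language models",
]

for score, doc in search("open language models", docs, top_k=2):
    print(f"{score:.2f}  {doc}")
```

Word counts only match exact words, so this toy version misses synonyms and inflected forms. That gap is exactly what model-based embeddings are meant to close, and it matters even more for a heavily inflected language like Turkish.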

Regards



Twitter: @metamathician
