Some IBM foundation models are also available from Hugging Face. License terms for IBM models that you access from Hugging Face are available from the Hugging Face website. For more information about contractual protections related to IBM indemnification for IBM foundation models that you access in watsonx.ai, see the IBM Client Relationship Agreement and IBM watsonx.ai service description.
The available foundation models support a range of use cases for both natural languages and programming languages. To see the types of tasks that these models can do, review and try the sample prompts.
The allam-1-13b-instruct foundation model is a bilingual large language model for Arabic and English that is fine-tuned to support conversational tasks. The model is provided by the National Center for Artificial Intelligence and supported by the Saudi Authority for Data and Artificial Intelligence. The ALLaM series is a collection of powerful language models designed to advance Arabic language technology. These models are initialized with Llama-2 weights and are then trained on both Arabic and English.
The elyza-japanese-llama-2-7b-instruct model is provided by ELYZA, Inc on Hugging Face. The elyza-japanese-llama-2-7b-instruct foundation model is a version of the Llama 2 model from Meta that is trained to understand and generate Japanese text. The model is fine-tuned for solving various tasks that follow user instructions and for participating in a dialog.
The flan-t5-xl-3b model is provided by Google on Hugging Face. This model is based on the pretrained text-to-text transfer transformer (T5) model and uses instruction fine-tuning methods to achieve better zero- and few-shot performance. The model is also fine-tuned with chain-of-thought data to improve its ability to perform reasoning tasks.
The model was fine-tuned on tasks that involve multiple-step reasoning from chain-of-thought data in addition to traditional natural language processing tasks. Details about the training data sets used are published.
The flan-t5-xxl-11b model is provided by Google on Hugging Face. This model is based on the pretrained text-to-text transfer transformer (T5) model and uses instruction fine-tuning methods to achieve better zero- and few-shot performance. The model is also fine-tuned with chain-of-thought data to improve its ability to perform reasoning tasks.
The flan-ul2-20b model is provided by Google on Hugging Face. This model was trained by using the Unifying Language Learning Paradigms (UL2). The model is optimized for language generation, language understanding, text classification, question answering, common sense reasoning, long text reasoning, structured-knowledge grounding, information retrieval, in-context learning, zero-shot prompting, and one-shot prompting.
The flan-ul2-20b model is pretrained on the colossal, cleaned version of Common Crawl's web crawl corpus. The model is fine-tuned with multiple pretraining objectives to optimize it for various natural language processing tasks. Details about the training data sets used are published.
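The zero-shot, one-shot, and few-shot prompting styles mentioned for the Flan models differ only in how many worked examples the prompt carries. The following is a minimal sketch of that difference. The `build_prompt` helper and the "Input:"/"Output:" labels are hypothetical illustrations, not a prompt format that these models require.

```python
from typing import Sequence, Tuple


def build_prompt(instruction: str, query: str,
                 examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Assemble a text prompt with zero or more worked examples.

    The "Input:"/"Output:" labels are an illustrative layout (an assumption),
    not a format required by any particular model.
    """
    parts = [instruction.strip()]
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}\nOutput: {example_output}")
    # The final block leaves the output empty for the model to complete.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


instruction = "Classify the sentiment of the review as positive or negative."
query = "The battery died after one day."

# Zero-shot: the instruction and the query only.
zero_shot = build_prompt(instruction, query)

# One-shot: a single worked example precedes the query.
one_shot = build_prompt(
    instruction,
    query,
    examples=[("Great screen and fast shipping.", "positive")],
)

print(one_shot)
```

Passing more pairs in `examples` produces a few-shot prompt. The resulting string can then be submitted to a model in the same way as any other prompt text.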
The Granite family of models is trained on enterprise-relevant data sets from five domains: internet, academic, code, legal, and finance. Data used to train the models first undergoes IBM data governance reviews and is filtered of text that is flagged for hate, abuse, or profanity by the IBM-developed HAP filter. IBM shares information about the training methods and data sets used.
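The filtering step described above can be pictured as scoring each document and dropping those that exceed a threshold. The following sketch shows only that general shape. It is not the IBM HAP filter: `keyword_score`, the placeholder terms, and the 0.1 threshold are all assumptions made for illustration, and a production filter would typically use a trained classifier instead of a keyword list.

```python
from typing import Callable, Iterable, List

# Hypothetical placeholder terms; a real filter would rely on a trained classifier.
FLAGGED_TERMS = {"slur_placeholder", "insult_placeholder"}


def keyword_score(text: str) -> float:
    """Return the fraction of tokens found in FLAGGED_TERMS (0.0 to 1.0)."""
    tokens = [token.strip(".,!?;:") for token in text.lower().split()]
    if not tokens:
        return 0.0
    return sum(token in FLAGGED_TERMS for token in tokens) / len(tokens)


def filter_documents(documents: Iterable[str],
                     score: Callable[[str], float] = keyword_score,
                     threshold: float = 0.1) -> List[str]:
    """Keep only the documents whose score is below the threshold."""
    return [doc for doc in documents if score(doc) < threshold]


corpus = [
    "Quarterly revenue grew in the finance segment.",
    "You are an insult_placeholder and a slur_placeholder.",
]
clean = filter_documents(corpus)
print(clean)
```

Because the scoring function is passed in as a parameter, the keyword stand-in can be swapped for any other scorer without changing the filtering logic.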
IBM-developed foundation models are considered part of the IBM Cloud Service. For more information about contractual protections related to IBM indemnification, see the IBM Client Relationship Agreement and IBM watsonx.ai service description.
The granite-13b-instruct-v2 model is provided by IBM. This model was trained with high-quality finance data, and is a top-performing model on finance tasks. Financial tasks evaluated include: providing sentiment scores for stock and earnings call transcripts, classifying news headlines, extracting credit risk assessments, summarizing financial long-form text, and answering financial or insurance-related questions.
Supports extraction, summarization, and classification tasks. Generates useful output for finance-related tasks. Uses a model-specific prompt format. Accepts special characters, which can be used for generating structured output.
The granite-7b-lab foundation model is provided by IBM. The granite-7b-lab foundation model uses a novel alignment tuning method from IBM Research. Large-scale Alignment for chatBots, or LAB, is a method for adding new skills to existing foundation models by generating synthetic data for the skills, and then using that data to tune the foundation model.
IBM-developed foundation models are considered part of the IBM Cloud Service. When you use the granite-7b-lab foundation model that is provided in watsonx.ai, the contractual protections related to IBM indemnification apply. See the IBM Client Relationship Agreement and IBM watsonx.ai service description.
The granite-8b-japanese model is provided by IBM. The granite-8b-japanese foundation model is based on the IBM Granite Instruct foundation model and is trained to understand and generate Japanese text.
The Granite family of models is trained on enterprise-relevant data sets from five domains: internet, academic, code, legal, and finance. The granite-8b-japanese model was pretrained on 1 trillion tokens of English and 0.5 trillion tokens of Japanese text.
A foundation model from the IBM Granite family. The granite-20b-multilingual foundation model is based on the IBM Granite Instruct foundation model and is trained to understand and generate text in English, German, Spanish, French, and Portuguese.
Foundation models from the IBM Granite family. The Granite code foundation models are instruction-following models fine-tuned using a combination of Git commits paired with human instructions and open-source synthetically generated code instruction data sets.
These models were fine-tuned from Granite code base models on a combination of permissively licensed instruction data to enhance instruction-following capabilities including logical reasoning and problem-solving skills.
The Meta Llama 3 foundation models are accessible, open large language models that are built with Meta Llama 3 and provided by Meta on Hugging Face. The Llama 3 foundation models are instruction fine-tuned language models that can support various use cases.
Llama 3 features improvements in post-training procedures that reduce false refusal rates, improve alignment, and increase diversity in the foundation model output. The result is better reasoning, code generation, and instruction-following capabilities. Llama 3 was also trained on more tokens (15T), which results in better language comprehension.
The llama-3-405b-instruct foundation model is a Llama 3.1 model provided by Meta. It is a pretrained and instruction-tuned, text-only generative model that is optimized for multilingual dialogue use cases. The model uses supervised fine-tuning and reinforcement learning with human feedback to align with human preferences for helpfulness and safety.
The llama-3-405b-instruct model is Meta's largest open-sourced foundation model to date. This foundation model can also be used as a synthetic data generator, post-training data ranking judge, or model teacher/supervisor that can improve specialized capabilities in more inference-friendly, derivative models.
Llama 3.1 was pretrained on 15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction data sets, as well as over 25 million synthetically generated examples.
The Llama 2 Chat model is provided by Meta on Hugging Face. The fine-tuned model is useful for chat generation. The model is pretrained with publicly available online data and fine-tuned using reinforcement learning from human feedback.
Llama 2 was pretrained on 2 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction data sets and more than one million new examples that were annotated by humans.