Models
A roundup of the different models.
Here's an overview of key open source AI models we will be using:
Large Language Models (LLMs):
Llama 3 - Released by Meta in 8B and 70B parameter sizes, each with a base and an instruction-tuned version. Known for strong performance relative to its size.
Mistral 7B - Released by Mistral AI in September 2023, this 7B-parameter model showed impressive performance despite its smaller size. Notable for using "sliding window attention", in which each token attends only to a fixed window of recent tokens, to handle longer contexts efficiently. The company later released Mixtral 8x7B, a sparse mixture-of-experts model.
BLOOM - Created by the BigScience workshop, a collaboration of over 1000 researchers coordinated by Hugging Face, this 176B-parameter model was one of the first major multilingual open source LLMs, trained on 46 natural languages and 13 programming languages.
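To make the sliding window attention mentioned for Mistral more concrete, here is a minimal sketch of the attention mask it implies. This is an illustration only, not Mistral's actual implementation: the function name `sliding_window_mask` is hypothetical, and the sketch assumes the window size counts the current token (real implementations differ on this off-by-one).

```python
from typing import List


def sliding_window_mask(seq_len: int, window: int) -> List[List[bool]]:
    """Build a causal sliding-window attention mask (illustrative sketch).

    mask[i][j] is True when query position i may attend to key position j.
    Two conditions must hold:
      * j <= i          (causal: no looking at future tokens)
      * i - j < window  (local: only the most recent `window` tokens,
                         counting the current one; this is an assumption)
    """
    if seq_len < 0 or window < 1:
        raise ValueError("seq_len must be >= 0 and window must be >= 1")
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    # Print the mask for 6 tokens and a window of 3:
    # "x" marks an allowed (query, key) pair, "." a masked one.
    for row in sliding_window_mask(seq_len=6, window=3):
        print("".join("x" if allowed else "." for allowed in row))
```

With 6 tokens and a window of 3, only 15 query-key pairs are allowed, compared with 21 for full causal attention. The cost per token stays bounded by the window size instead of growing with the sequence length, which is where the efficiency on longer contexts comes from.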
Vision Models:
Stable Diffusion - Released in 2022 by Stability AI together with CompVis and Runway, this text-to-image model became highly influential in the open source AI art community. Later versions, such as SDXL, improved image quality and resolution.
Segment Anything Model (SAM) - Released by Meta in 2023, this model segments objects in images from prompts such as points or bounding boxes, including objects it was not specifically trained on. It has become a widely used tool for computer vision tasks.
Code Models:
StarCoder - Created by the BigCode project, which is led by Hugging Face and ServiceNow, this model specializes in code generation and understanding and was trained on more than 80 programming languages.
Code Llama - Meta's code-specialized version of Llama 2, further trained on code-heavy data for programming tasks. It comes in base, Python-specialized, and instruction-tuned variants.
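The parameter counts quoted in this roundup (7B, 8B, 70B) give a rough idea of how much memory is needed just to hold a model's weights: parameters multiplied by bytes per parameter. The sketch below is a back-of-the-envelope estimate only. The helper `weight_memory_gb` and the `BYTES_PER_PARAM` table are hypothetical names introduced for illustration, the nominal parameter counts are rounded, and the estimate ignores activations, the KV cache, and runtime overhead.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, dtype: str = "fp16") -> float:
    """Approximate memory for the weights alone, in GB (10**9 bytes)."""
    if dtype not in BYTES_PER_PARAM:
        raise ValueError(f"unknown dtype: {dtype!r}")
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    # Nominal sizes taken from the model descriptions in this roundup.
    for name, params in [("7B", 7e9), ("8B", 8e9), ("70B", 70e9)]:
        sizes = ", ".join(
            f"{dtype}: {weight_memory_gb(params, dtype):.1f} GB"
            for dtype in BYTES_PER_PARAM
        )
        print(f"{name:>4} -> {sizes}")
```

By this estimate a 7B model needs about 14 GB for its weights in 16-bit precision, while a 70B model needs about 140 GB, which is why larger models are often run quantized or split across several devices.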
The open source AI landscape continues to evolve rapidly, with new models and improvements being released frequently.