Exploring Mistral AI's Open-Source Models: From Mistral 7B to Codestral 22B
Welcome to a look at Mistral AI's journey of open-sourcing models: democratizing frontier AI for developers and changing how language models are trained, fine-tuned, and deployed.
- 1. The speaker introduces Mistral AI's open models.
- 2. Mistral AI was founded last June and released its first open model, Mistral 7B, in September.
- 3. In December, they released a sparse mixture-of-experts open model, Mixtral 8x7B.
- 4. They also released their platform, with model APIs and the commercial models Mistral Medium and Mistral Embed.
- 5. In February, they released their flagship model, Mistral Large, which has superior reasoning and math abilities.
- 6. In April, they released a new open model, Mixtral 8x22B.
- 7. In June, they released a code-specific model called Codestral 22B, available for free use on their chat interface.
- 8. Mistral AI's mission is to bring frontier AI into everyone's hands, with a focus on building cutting-edge AI for developers.
- 9. Four principles guide how they train and release models: openness, portability, optimized performance, and customizability.
- 10. Openness: they train best-in-class open models and release them to the open-source community.
- 11. Portability: all models are available on Azure, AWS, GCP, and in virtual private clouds, and can be deployed on-premises with full control over security and privacy.
- 12. Optimized performance: they aim for an optimal performance-to-size ratio in their models.
- 13. Customizability: they provide libraries and tools for customizing their models to specific applications.
- 14. Mistral AI has open-sourced three models in the past year: a dense Transformer (Mistral 7B), a sparse mixture-of-experts model (Mixtral 8x7B), and a larger version of the latter (Mixtral 8x22B).
- 15. Mistral 7B was the first model of its size to reach 60% on MMLU, considered the bare minimum for a genuinely useful model.
- 16. The sparse mixture-of-experts architecture pushes performance while keeping the inference budget in check by activating only a small subset of parameters for each token (see the routing sketch after this list).
- 17. Mistral AI sees open source as complementary to its commercial business, not in competition with it.
- 18. Open-sourcing models serves as a branding and marketing tool, creating awareness of their products and helping them acquire customers.
- 19. The open models are trained in three stages: pre-training, instruction tuning, and learning from human feedback.
- 20. Pre-training teaches the model to predict the next token in a sequence by training on massive datasets of trillions of tokens (sketched after this list).
- 21. Instruction tuning uses prompt-response pairs to teach the model how humans want to interact with it.
- 22. Learning from human feedback scales data collection faster and optimizes model behavior against human preferences.
- 23. Mistral AI's open models, including Codestral 22B, are trained using these techniques.
- 24. Codestral 22B is a dense Transformer model designed specifically for code, fluent in 80+ programming languages, and offers both code completion and code question-answering (a usage sketch follows this list).
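
Point 16's routing idea is the core of the Mixtral models. Below is a minimal, illustrative sketch of top-2 expert routing in PyTorch; the dimensions, expert count, and gating details are assumptions chosen for demonstration, not Mistral's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    """Toy sparse mixture-of-experts layer with top-2 routing."""

    def __init__(self, dim=512, hidden=2048, n_experts=8, top_k=2):
        super().__init__()
        self.gate = nn.Linear(dim, n_experts, bias=False)  # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):  # x: (num_tokens, dim)
        scores = self.gate(x)                              # score each expert per token
        weights, chosen = scores.topk(self.top_k, dim=-1)  # keep only the top-k experts
        weights = F.softmax(weights, dim=-1)               # renormalize over chosen experts
        out = torch.zeros_like(x)
        # Each token flows through only top_k of the n_experts feed-forward
        # blocks, so the active parameter count per token stays small even
        # though the total capacity is large.
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out

tokens = torch.randn(4, 512)      # 4 tokens of width 512
print(SparseMoE()(tokens).shape)  # torch.Size([4, 512])
```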
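To make the pre-training stage from points 19-20 concrete, here is a minimal sketch of the next-token prediction objective, assuming a generic stand-in model; instruction tuning applies the same loss to prompt-response pairs, typically masked so only response tokens contribute.

```python
import torch
import torch.nn.functional as F

def next_token_loss(model, token_ids):
    """Cross-entropy loss for next-token prediction. token_ids: (batch, seq_len)."""
    inputs, targets = token_ids[:, :-1], token_ids[:, 1:]  # target = input shifted by one
    logits = model(inputs)                                 # (batch, seq_len - 1, vocab_size)
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),               # (N, vocab_size)
        targets.reshape(-1),                               # (N,)
    )

# Toy stand-in model; real pre-training would use a full decoder Transformer.
vocab_size = 1000
toy_model = torch.nn.Sequential(
    torch.nn.Embedding(vocab_size, 64),
    torch.nn.Linear(64, vocab_size),
)
batch = torch.randint(0, vocab_size, (2, 16))  # 2 sequences of 16 token ids
print(next_token_loss(toy_model, batch))       # ~log(1000) ≈ 6.9 before training
```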
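And since Codestral 22B is served through Mistral's API, here is a hedged sketch of asking it a code question over an OpenAI-style chat completions endpoint. The endpoint URL, model identifier, and response shape are assumptions based on Mistral's API conventions; verify them against the current documentation at https://docs.mistral.ai before use.

```python
import os
import requests

# Assumed endpoint and model id for Codestral; check the docs for current values.
resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "codestral-latest",  # assumed identifier for Codestral 22B
        "messages": [
            {"role": "user",
             "content": "Write a Python function that reverses a linked list."},
        ],
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```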
Source: AI Engineer via YouTube
❓ What do you think of the ideas shared in this video? Share your thoughts in the comments!