For frontier AI models, when it rains, it pours. Mistral released a fresh new flagship model on Wednesday, Large 2, which it claims to be on par with the latest cutting-edge models from OpenAI and Meta in terms of code generation, mathematics, and reasoning.
The release of Mistral Large 2 falls just one day after Meta dropped its latest and greatest open-source model, Llama 3.1 405B. Mistral says Large 2 raises the bar for performance and cost for open models, backing that up with a handful of benchmarks.
Large 2 appears to outpace Llama 3.1 405B on code generation and math performance, and does so with under a third of the parameters: 123 billion, to be precise. This significant reduction in parameters without compromising on performance is a testament to Mistral’s engineering prowess and strategic approach to AI model development.
In a press release, Mistral emphasized that one of its key focus areas during training was to minimize the model’s hallucination issues. Hallucinations, where the AI generates information that seems plausible but is incorrect, have been a persistent challenge in the field. Mistral claims that Large 2 was meticulously trained to be more discerning in its responses, acknowledging when it does not know something instead of making up plausible but inaccurate answers. This focus on reliability and accuracy could make Large 2 a preferred choice for applications where trustworthiness is critical.
The Paris-based AI startup recently raised $640 million in a Series B funding round, led by General Catalyst, at a $6 billion valuation. Though Mistral is one of the newer entrants in the artificial intelligence space, it’s quickly shipping AI models on or near the cutting edge. The substantial investment underscores the confidence investors have in Mistral’s technology and its potential to disrupt the AI market.
However, it’s important to note that Mistral’s models are, like most others, not open source in the traditional sense. Any commercial application of the model requires a paid license. While this model is more open than, say, GPT-4, few in the world have the expertise and infrastructure to implement such a large model. This constraint is even more pronounced for models like Llama’s 405 billion parameters, which demand extensive computational resources.
Something missing from Mistral Large 2, and also absent from Meta’s Llama 3.1 release, is multimodal capabilities. OpenAI is far ahead of the competition with regard to multimodal AI systems, capable of processing image and text simultaneously. This feature is becoming increasingly sought after as the AI landscape evolves, with startups aiming to integrate multimodal functionalities to provide richer, more versatile AI experiences.
Despite this, Large 2 boasts several impressive features. The model has a 128,000 token window, allowing it to intake a substantial amount of data in a single prompt—equivalent to roughly a 300-page book. This large token window is beneficial for complex tasks that require the model to maintain context over extensive inputs, such as long-form content generation, detailed analysis, and comprehensive data processing.
Mistral’s new model also includes improved multilingual support, understanding languages such as English, French, German, Spanish, Italian, Portuguese, Arabic, Hindi, Russian, Chinese, Japanese, and Korean, along with 80 coding languages. This multilingual capability broadens the model’s applicability across different regions and industries, making it a versatile tool for global businesses.
Notably, Mistral claims Large 2 produces more concise responses than leading AI models, which have a tendency to generate verbose outputs. This conciseness can enhance user experience by providing clear and direct answers, especially in customer service, technical support, and interactive applications.
Mistral Large 2 is available to use on multiple platforms including Google Vertex AI, Amazon Bedrock, Azure AI Studio, and IBM watsonx.ai. This widespread availability ensures that businesses and developers can leverage the model’s capabilities within their preferred cloud environments, facilitating integration and deployment.
Furthermore, you can use the new model on Mistral’s proprietary platform, le Plateforme, under the name “mistral-large-2407”. For those interested in testing its capabilities, the startup’s ChatGPT competitor, le Chat, offers a free trial. This accessibility allows users to explore Large 2’s potential and evaluate its performance in real-world scenarios.
As the AI landscape continues to evolve, Mistral’s Large 2 model stands out as a significant advancement. By focusing on reducing hallucinations, enhancing multilingual support, and maintaining performance with fewer parameters, Mistral is setting a new benchmark for AI model development. While the absence of multimodal capabilities is a notable gap, the model’s other strengths position it as a formidable contender in the AI market.
In conclusion, Mistral’s release of Large 2 marks a pivotal moment in the AI industry. The model’s impressive performance, efficiency, and multilingual support reflect Mistral’s commitment to innovation and excellence. As more businesses and developers adopt Large 2, it will be fascinating to see how this model influences the future of AI applications and sets new standards for the industry.
Visit InstaDataHelp Blogs.
Visit InstadataHelp AI News.
Recent Comments