
Meta Reportedly Cooking Up Llama 4: Get Ready for Scout, Maverick, and Behemoth?

Abby K.


Hold onto your hats, AI fans – whispers are swirling that Meta is deep into developing its next major large language model, Llama 4. Following the splash made by Llama 3, the tech giant seems poised to continue its aggressive push in the AI space, potentially rolling out a new family of models designed to compete head-on with offerings from OpenAI, Google, and Anthropic.

According to reports, Meta might be taking a tiered approach this time around. Instead of a single release, Llama 4 could arrive in three distinct sizes, tentatively named:

  • Scout: Likely a smaller, more efficient model, perhaps optimized for on-device tasks or quicker responses.
  • Maverick: A mid-tier option, potentially balancing performance with resource requirements, aiming for broad usability.
  • Behemoth: As the name suggests, this would probably be the flagship, large-scale model designed to push the boundaries of capability, targeting complex reasoning and performance benchmarks.

What Improvements Can We Expect?

While details are still scarce, development of Llama 4 is expected to center on key areas where the AI race is heating up. Enhanced reasoning abilities, more sophisticated code generation, and overall performance improvements are likely high on Meta’s priority list. The goal, as always, is to close the gap with – or even surpass – the top closed-source models while sticking to Meta’s signature open-source strategy.

Meta’s commitment to open-sourcing its powerful Llama models has been a major differentiator, fostering a vibrant ecosystem of developers and researchers building on their technology. Continuing this trend with Llama 4, especially with varied sizes like Scout, Maverick, and Behemoth, could further democratize access to cutting-edge AI.

The AI Arms Race Continues

This reported development comes as no surprise given the relentless pace of innovation and competition. With rivals constantly releasing updates and new models, Meta needs Llama 4 to maintain its momentum and influence. The potential release, possibly landing sometime between late 2024 and early 2025, would be Meta’s next big statement in the ongoing battle for AI supremacy.

Leveraging its considerable compute resources and the expertise within its FAIR (Fundamental AI Research) division, Meta appears determined to remain a central player. The development of different model sizes suggests a sophisticated strategy aimed at capturing various segments of the market, from individual developers to large enterprises.

Our Take

Okay, so Meta might be dropping a Llama 4 family? This “Scout, Maverick, Behemoth” thing sounds pretty slick if it’s true. It’s like they’re saying, “We know one size doesn’t fit all,” which is smart. Giving folks options from lightweight to heavyweight could seriously boost adoption, especially with their open-source approach – that’s still their killer app, honestly.

This whole thing really just throws more fuel on the AI fire, doesn’t it? It keeps the pressure cranked high on OpenAI and Google. More competition usually means faster innovation (and hopefully better tools for us!), but you also gotta wonder how long Meta can keep pouring resources into these massive models while giving them away. Still, for anyone building stuff with AI, a potentially more capable *and* open Llama 4 is definitely something to get excited about.

This story was originally featured on Silicon Republic.




Microsoft Researchers Squeeze AI onto CPUs with Tiny 1-bit Model

Abby K.


New, easier-to-run LLMs

In a significant step towards running powerful AI locally, Microsoft researchers have developed an incredibly efficient 1-bit large language model (LLM). Dubbed BitNet b1.58, this 2-billion-parameter model is reportedly lightweight enough to run effectively on standard CPUs, potentially even on chips like the Apple M2, without needing specialized GPUs or NPUs.

The key innovation lies in its “1-bit” architecture. While it technically uses 1.58 bits per weight to represent three values (-1, 0, +1), that is still drastically smaller than the typical 16-bit or 32-bit formats used in most LLMs. This massive reduction in data size dramatically cuts down on the memory and computational power needed for inference.
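
To make the idea concrete, here’s a minimal NumPy sketch of the ternary “absmean” quantization scheme described in the BitNet b1.58 paper. It’s an illustration of the concept only, not Microsoft’s implementation, and the layer shapes and error check are made up for the demo:

```python
import numpy as np

def ternary_quantize(w: np.ndarray, eps: float = 1e-5):
    """Quantize weights to {-1, 0, +1} using an 'absmean' scale
    (conceptual sketch in the spirit of BitNet b1.58)."""
    scale = np.abs(w).mean() + eps            # per-tensor absmean scale
    w_q = np.clip(np.rint(w / scale), -1, 1)  # round, then clamp to ternary
    return w_q.astype(np.int8), scale         # ternary weights + one float

def ternary_matmul(x: np.ndarray, w_q: np.ndarray, scale: float):
    """With ternary weights, each 'multiply' is really just adding,
    subtracting, or skipping an activation - hence the cheap inference."""
    return (x @ w_q.astype(x.dtype)) * scale

# Quick demo: compare the ternary layer against the full-precision original.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)
x = rng.standard_normal((1, 256)).astype(np.float32)
w_q, s = ternary_quantize(w)
print("mean abs error:", np.abs(ternary_matmul(x, w_q, s) - x @ w).mean())
```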

Released as open source on Hugging Face, BitNet b1.58 was trained on a hefty 4 trillion tokens. While smaller models often sacrifice accuracy, Microsoft claims this BitNet variant holds its own against comparably sized models like Meta’s Llama and Google’s Gemma in several benchmarks, even topping a few. Crucially, it requires only around 400MB of memory (excluding embeddings) – a fraction of what similar-sized models need.
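
That ~400MB figure checks out with simple back-of-envelope arithmetic, ignoring embeddings and runtime overhead (which the reported number also excludes):

```python
params = 2_000_000_000  # 2B-parameter model

for label, bits in [("ternary (1.58-bit)", 1.58), ("fp16", 16)]:
    megabytes = params * bits / 8 / 1e6  # bits -> bytes -> megabytes
    print(f"{label:>18}: ~{megabytes:,.0f} MB")

# ternary (1.58-bit): ~395 MB  (in line with the reported ~400MB)
#               fp16: ~4,000 MB
```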

To achieve these efficiency gains, the model must be run using Microsoft’s custom bitnet.cpp inference framework, available on GitHub. Standard frameworks won’t deliver the same performance benefits.

This research tackles the high energy consumption and hardware demands often associated with AI. Developing models that can run efficiently on everyday hardware like CPUs could democratize AI access, reduce reliance on large data centers, and bring advanced AI capabilities to a wider range of devices.

Our Take

Okay, a 1-bit (ish) AI model from Microsoft that can run on a regular CPU? That’s pretty cool. It tackles one of the biggest AI hurdles: the need for beefy, power-hungry hardware. Making AI this lightweight could seriously shake things up.

Imagine capable AI running locally on phones or laptops without killing the battery or needing an expensive GPU. While there’s usually a trade-off between size and smarts, Microsoft seems to be closing that gap here. This kind of efficiency focus is exactly what we need to make powerful AI more accessible and maybe even a bit more sustainable.

This story was originally featured on Tom’s Hardware.


Google Makes Gemini AI Assistant Free for Android Users

Abby K.


Google is making a significant push to integrate its Gemini AI directly into the Android experience, announcing that the Gemini app – positioned as an advanced AI assistant – is now free for all compatible Android users. This move essentially offers users an alternative to, and potentially a replacement for, the traditional Google Assistant.

Previously, accessing the full capabilities of Gemini often required specific subscriptions or was limited in scope. Now, by downloading the dedicated Gemini app or opting in through Google Assistant, users can leverage Gemini’s conversational AI power for a wide range of tasks directly on their phones, at no extra cost.

What Does This Mean for Android Users?

Bringing Gemini to the forefront on Android lets users tap into more sophisticated AI features. These include generating text, summarizing information, brainstorming ideas, creating images (on supported devices), and getting contextual help based on what’s on their screen. It represents a shift towards a more powerful, generative-AI-driven assistant experience compared to the more command-focused Google Assistant.

Users can typically activate Gemini using the same methods previously used for Google Assistant, such as long-pressing the power button or using the “Hey Google” voice command (after enabling Gemini).

Google’s Strategy: AI Everywhere

Making Gemini freely available on Android is a clear strategic move by Google to embed its AI deeply within its mobile ecosystem. It aims to get users accustomed to Gemini’s capabilities, driving adoption and competing directly with other AI assistants – particularly Apple’s Siri and whatever AI upgrades Apple ships next.

While Google Assistant isn’t disappearing entirely (it still handles some core smart home and routine functions better for now), this push positions Gemini as the future of AI assistance on Android devices.

Our Take

So Google’s basically putting Gemini front-and-center on Android for free now. This feels like them saying, “Okay, AI is the future, let’s get everyone using *our* AI assistant.” It makes sense – get users hooked on Gemini’s smarter features instead of just sticking with the old Google Assistant.

It’s a big play to keep Android competitive, especially with whatever Apple’s cooking up with Siri. Making it free removes the barrier, aiming for mass adoption. While the classic Assistant might still handle some stuff better for now, it’s pretty clear Google sees Gemini as the main event going forward on mobile.

This story was originally featured on Digital Trends.
