Google has introduced Gemma 4 12B, a new artificial intelligence model that handles text, images, and audio and can run on laptops with 16 GB of memory. The model aims to deliver performance close to that of Gemma 26B with lower resource requirements.
Gemma 4 12B is the latest addition to Google’s family of open models, sitting between the compact Gemma E4B and the more powerful Gemma 26B. A distinguishing feature of the new model is native support for audio input without a separate encoder, which simplifies audio processing. For images, Google has simplified the vision encoder and shifted most of the work to the language model.
Gemma 4 12B is geared towards complex multi-step tasks and agent scenarios. It supports Multi-Token Prediction (MTP), which reduces latency in response generation. This makes it suitable for projects such as robotics and cybersecurity systems.
The launch of Gemma 4 in April 2026 continued Google’s long-term strategy of building powerful artificial intelligence models. According to the company, developers worldwide have downloaded models in the Gemma family more than 150 million times.
Ukraine is also actively involved in developing AI technologies. Together with NVIDIA, it is running an initiative to create a sovereign artificial intelligence, with Google’s Gemma 3 chosen as the base model. This is expected to significantly strengthen the country’s technological independence.
The success of the new model could have global implications for the AI industry, as such technologies are changing how complex tasks are handled and how autonomous agents are built across various fields. Experts note that further integration of such solutions could significantly streamline processes in many sectors.




