Google Unveils Gemini: A Multimodal Language Model for Advanced AI Capabilities

Gemini

Google has officially launched its latest large language model, Gemini, touted as the “most capable” in its arsenal.

The model is trained using custom-designed AI accelerators, Cloud TPU v4 and v5e, and is set to redefine the landscape of multimodal AI, offering advanced capabilities across various data types.

Gemini, built from the ground up to be multimodal, represents Google’s response to the demand for more versatile language models, akin to ChatGPT.

What sets Gemini apart is its ability to seamlessly understand, operate across, and combine different types of information, including text, images, audio, video, and code.

This native multimodal capability empowers users to transform any type of input into any type of output.

Sundar Pichai, Google’s CEO, stated, “Gemini 1.0 is optimized for different sizes: Ultra, Pro, and Nano. These are the first models of the Gemini era and the first realization of the vision we had when we formed Google DeepMind earlier this year. This new era of models represents one of the biggest science and engineering efforts we’ve undertaken as a company.”

Google Bard, the company’s generative AI chatbot, will now leverage a specially tuned version of Gemini Pro in English for more advanced reasoning and planning. In early 2024,

Google plans to introduce Bard Advanced, providing users with access to its most sophisticated models and capabilities, starting with Gemini Ultra.

Furthermore, Pixel 8 Pro, touted as the first smartphone engineered to run Gemini Nano, is set to roll out. It will power new features, including Summarize in the Recorder app and a developer preview of Smart Reply in Gboard.

The latter will be available for testing initially with WhatsApp and is expected to extend to more messaging apps in the coming year.

With Gemini, Google aims to push the boundaries of AI capabilities, offering a versatile and powerful tool for users across different domains, from advanced reasoning to code interpretation and generation.

The multimodal aspect of Gemini is expected to play a pivotal role in shaping the future of AI applications.

Google Unveils Gemini: A Multimodal Language Model for Advanced AI Capabilities

Tags

About the author

Follow the report beyond this story

Tags

About the author

Follow the report beyond this story

Related stories

OpenAI Reportedly Developing AI-Agent Smartphone with MediaTek and Qualcomm

Google Data: Nigerians Turn to AI and Search to Build Creative Skills

DeepSeek Releases DeepSeek-V4 Artificial Intelligence Model

Four Nigerian Startups Join Google Accelerator Africa Cohort, Spotlighting AI Innovation

Get the app