Google has finally unveiled Gemini, its answer to OpenAI's GPT-4.
The tech is a next-gen, multimodal AI model developed by a team of researchers drawn from Google's now-merged AI divisions, DeepMind and Google Brain.
The model is being promoted as a significant advancement in natural language processing, with Google calling it "our largest science and engineering project ever."
Users can access Gemini this month, with a more advanced version set to arrive early next year.
The drawn-out release has been closely watched by the tech world, with many speculating about whether the model can outperform its main competitor, OpenAI's large language model GPT-4.
An analysis that made an early declaration for Google's AI supremacy over GPT-4 ignited a fierce online debate that even lured OpenAI CEO Sam Altman into the fray.
The model has three "sizes," set to launch in stages. Here's what we know so far.
Google's Gemini is a multimodal AI, meaning it can process more than one data type.
The model can process images, text, audio, video, and coding languages. The new capabilities allow for features such as written analysis of visual graphs.
The tech giant is also boosting the tech's code-generating abilities to try to take on Microsoft's GitHub Copilot, which is powered by OpenAI, The Information previously reported.
The first iteration of the model, Gemini 1.0, is optimized for three sizes: Gemini Ultra, Pro, and Nano.