Google recently unveiled its next-generation Gemini AI, and the company claims that it can outperform OpenAI's GPT-4 as well as human experts on nearly all major tests. According to Google, Gemini understands images, video, and audio along with text and code.
Gemini AI scored 90.0% on the MMLU (massive multitask language understanding) test, becoming the first model to outperform human experts (89.8%) and GPT-4 (86.4%) in a series of knowledge and problem-solving tasks across 57 subjects including mathematics, physics, history, medicine, law, and ethics.
The new AI tool is multimodal, meaning that its original training data set contained a lot of other media in addition to text. In essence, it has better visual and auditory understanding. Other language models process even videos and images in textual terms, but Google says Gemini has an edge because it retains the tone and nuance of the original video, audio, and image sources.
If you're wondering what Gemini AI can do, take a look at the promo video below.
AI models are now being trained on larger datasets, allowing them to mimic the processes that humans use to interact with the world. Gemini is central to this effort and, as Google DeepMind CEO Demis Hassabis told Wired, it will soon move into further sensory territory with touch and tactile feedback.
In the video below, DeepMind scientists show how Gemini can generate its own code and read 200,000 scientific studies. It can also filter them for relevance using its own reasoning capabilities and collate the data to create new meta-knowledge. According to the team, Gemini did all this over their lunch break, and they claim that it will also be useful in fields like law, where large datasets need to be examined.
According to Google, Gemini is fluent in coding in languages including Python, Java, C++, and Go. Gemini can create websites that code themselves as you use them. Check out the video below.
Google is releasing Gemini in three sizes: Gemini Nano, which is built to run on mobile devices; Gemini Pro, an equivalent of GPT-3.5; and Gemini Ultra, which Google says beats GPT-4 across most tasks.
The Ultra version will be released to the public next year. Gemini Nano is now available on the Pixel 8 Pro smartphone and has begun rolling out to users.
Gemini Pro can also be tried for free via the Google Bard chatbot. The current version is slimmed down and only supports uploading images, but Google says it will gain new capabilities soon.
With user permission, Google Bard, powered by Gemini, can work with your Gmail, Google Drive, Google Docs, Google Maps, and YouTube. The company is also working on ways to integrate the Gemini model into every product it makes.
Are you excited to see what becomes of Google services with Gemini? Let us know in the comments below. For more in the world of technology and science, keep reading Indiatimes.com.