SoatDev IT Consulting
  • May 21, 2024
  • Rss Fetcher
Elon Musk grins in a photo illustration, lifting his arms over his head triumphantly
Illustration by Kristen Radtke / The Verge; Getty Images

Elon Musk’s AI company, xAI, is making progress on adding multimodal inputs to its Grok chatbot, according to public developer documents. This means users may soon be able to upload photos to Grok and receive text-based answers.

xAI first teased the feature in a blog post last month, which said Grok-1.5V will offer “multimodal models in a number of domains.” The latest update to the developer documents appears to show progress on shipping a new model.

In the developer documents, a sample Python script demonstrates how developers can use the xAI software development kit library to generate a response based on both text and images. This script reads an image file, sets up a text prompt, and uses the xAI SDK to generate a response.

A sample Python script that demonstrates how developers can use the xAI software development kit library to perform multimodal completion.
Image: xAI
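
The script itself is only available as the screenshot above, so the following is a minimal sketch of the same three steps (read an image file, set up a text prompt, combine them into one request) rather than xAI's actual code. The function name `build_multimodal_request` and the payload fields `prompt` and `image_base64` are hypothetical and do not come from the xAI SDK; the sketch stops before any network call because the real SDK's interface is not reproduced in the article.

```python
import base64
import json
import tempfile
from pathlib import Path


def build_multimodal_request(image_path: str, prompt: str) -> dict:
    """Pair an image file with a text prompt in a single payload.

    The field names are hypothetical placeholders, not xAI's schema.
    """
    # Step 1: read the raw bytes of the image file.
    image_bytes = Path(image_path).read_bytes()
    # Steps 2 and 3: attach the text prompt and a base64 copy of the image,
    # a common way to carry binary data inside a JSON request body.
    return {
        "prompt": prompt,
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
    }


if __name__ == "__main__":
    # Use placeholder bytes so the sketch runs without a real photo.
    with tempfile.TemporaryDirectory() as tmp_dir:
        sample = Path(tmp_dir) / "sample_image.bin"
        sample.write_bytes(b"placeholder image bytes")
        request = build_multimodal_request(str(sample), "What is in this picture?")
        print(json.dumps(request, indent=2))
```

In a real integration, the resulting payload would be handed to whatever completion call the SDK exposes, and the model's text answer would come back in the response.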

This is a big update for Grok, which xAI first released in November 2023 and which is available to users who pay for the X Premium Plus subscription. The last update was Grok 1.5 in March, which came with improved reasoning capabilities.

The model is trained “on a variety of text data from publicly available sources from the Internet up to Q3 2023 and data sets reviewed and curated by … human reviewers,” according to a blog post from X. Grok-1 was not trained on X data (including public X posts), the blog added. However, Grok does have “real-time knowledge of the world,” including posts on X.

xAI, founded by Elon Musk in March 2023, is relatively new in the AI field and trails competitors such as OpenAI. However, according to a blog post from xAI, its Grok 1.5 model is closing the gap with GPT-4 on various benchmarks that span a wide range of grade school to high school competition problems. It’s important to note that benchmarks for large language models are often criticized because the models can perform well on benchmarks if those benchmarks are included in their training data. It’s sort of like memorizing test answers, rather than actually learning the material.
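
To make the “memorized test answers” concern concrete, here is a rough sketch of one simple way to look for benchmark leakage: measure how many word n-grams from a benchmark question also appear verbatim in a piece of training text. This is an illustrative heuristic only, not a method attributed to xAI or OpenAI; the function names, the sample strings, and the choice of five-word n-grams are all assumptions made for the example.

```python
def ngrams(text: str, n: int) -> set:
    """Return the set of lowercase word n-grams in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_ratio(benchmark_item: str, training_text: str, n: int = 5) -> float:
    """Fraction of the benchmark item's n-grams found in the training text."""
    item_grams = ngrams(benchmark_item, n)
    if not item_grams:
        # The item is shorter than n words, so there is nothing to compare.
        return 0.0
    return len(item_grams & ngrams(training_text, n)) / len(item_grams)


if __name__ == "__main__":
    # Made-up strings for illustration; real checks scan large corpora.
    question = "if a train travels 60 miles in 2 hours what is its average speed"
    leaked = "forum post: " + question + " answer 30"
    unrelated = "the weather today is sunny with a light breeze from the west"
    print("leaked text overlap:", overlap_ratio(question, leaked))
    print("unrelated text overlap:", overlap_ratio(question, unrelated))
```

A high ratio suggests the question may have been seen during training, in which case a correct answer says little about reasoning ability.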

Multimodal conversational chatbots seem to be the next frontier for AI, with multiple advancements announced at Google I/O and OpenAI releasing GPT-4o. Grok’s lack of multimodal capabilities had put it behind the curve until now.

© Copyright 2023. All Rights Reserved by Soatdev IT Consulting Inc.