OpenAI’s GPT-4 now understands both text and image inputs

Mar 15, 2023, 8:47 AM UTC
2 mins read
(Photo by D koi on Unsplash)

OpenAI has announced the launch of GPT-4, the latest iteration of its generative pre-trained transformer system. Unlike its predecessor GPT-3.5, which can only read and respond with text, GPT-4 can also accept image inputs and generate text about them. This development comes hot on the heels of Google’s Workspace AI announcement and ahead of Microsoft’s Future of Work event. OpenAI has reportedly spent the past six months refining the system’s performance using feedback gathered during the recent surge of interest in its ChatGPT conversational bot. The company claims that GPT-4 exhibits human-level performance on various professional and academic benchmarks.

OpenAI has partnered with Microsoft to develop GPT’s capabilities and says the new system achieves its best results to date in “factuality, steerability, and refusing to go outside of guardrails” compared to its predecessor. It has also outperformed other state-of-the-art large language models (LLMs) on a variety of benchmark tests. GPT-4 will be made available for both ChatGPT and the API, though access will be restricted to ChatGPT Plus subscribers and API waitlist users, respectively. There will also be a usage cap in place for experimenting with the new model.

GPT-4’s new multimodal input capability generates text outputs from a wide variety of mixed text and image inputs. This means that users can submit marketing and sales reports, textbooks, shop manuals, and even screenshots, and ChatGPT will distill the various details into concise summaries that are easy to understand. The upgraded system can also be customized via the API, allowing developers, and soon ChatGPT users, to prescribe their AI’s style and task by describing those directions in the ‘system’ message.
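As a rough illustration of how that ‘system’ message works, the snippet below builds a Chat Completions request payload with a steering system message. This is a minimal sketch, assuming the publicly documented request shape for OpenAI’s chat API; the model name, prompt text, and `build_request` helper are placeholders, and it only constructs the payload rather than sending it (an actual call requires the `openai` client library and an API key).

```python
import json

def build_request(user_text, style="You are a concise assistant that replies in plain language."):
    """Build a chat request dict; the 'system' message steers style and task.

    Placeholder helper for illustration -- not part of any official SDK.
    """
    return {
        "model": "gpt-4",  # placeholder model name
        "messages": [
            # The system message prescribes the assistant's persona and task.
            {"role": "system", "content": style},
            # The user message carries the actual query or document to process.
            {"role": "user", "content": user_text},
        ],
    }

payload = build_request("Summarize this sales report for a non-technical reader.")
print(json.dumps(payload, indent=2))
```

Changing only the system message (for example, to “Answer in the style of a pirate” or “Respond only with JSON”) redirects the model’s tone and behavior without altering the user’s query.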

GPT-4 has been tested by 50 experts across a wide array of professional fields, and the model’s tendency to “hallucinate” facts has been reduced by around 40 percent compared to its predecessor. The new model is also 82 percent less likely to respond to requests for disallowed content. Even so, OpenAI still strongly recommends taking great care when using language model outputs, particularly in high-stakes contexts, and matching the exact protocol to the needs of the specific use case.

OpenAI’s GPT-4 represents a significant advance in AI technology, with its ability to interpret image inputs alongside text and its improved performance across a range of benchmarks. As AI continues to develop, it will be interesting to see how GPT-4 and systems like it are put to use in practical applications.
