OpenAI’s ChatGPT, the popular AI-powered chatbot, has evolved beyond handling text alone. The latest version, built on GPT-4, adds a notable new capability: image analysis. Users can not only interact with the bot in words but can also ask it to describe images, answer questions about them, and even recognize the faces of specific individuals. Potential applications are promising, such as helping users identify and solve problems shown in images, from troubleshooting a broken car engine to identifying mysterious grains.
One of the early adopters of this improved version is Jonathan Mosen, CEO of the Blind Employment Agency, who got to try the visual analysis feature during a trip. With the help of ChatGPT, he could distinguish the different dispensers in a hotel bathroom and learn their contents in detail, going beyond what traditional image analysis software can do.
However, OpenAI is cautious about the risks of facial recognition. While the chatbot’s visual analysis can identify only certain public figures, the company is mindful of the ethical and legal concerns surrounding facial recognition technology, particularly those related to privacy and consent. For this reason, the app has stopped providing Mosen with information about people’s faces.
Sandhini Agarwal, a policy researcher at OpenAI, says the company wants a transparent dialogue with the public about integrating visual analysis capabilities into its chatbot. It is soliciting feedback and democratic input from users to establish clear guidelines and safeguards. In addition, OpenAI’s non-profit arm is exploring ways to involve the public in setting rules for AI systems to ensure responsible and ethical practices.
The addition of visual analysis to ChatGPT is a natural progression given the model’s training data, which includes images and text collected from the Internet. However, OpenAI is aware of potential pitfalls, such as “hallucinations,” in which the system generates misleading or false information in response to images. For example, when shown a picture of a person on the verge of fame, the chatbot may mistakenly name a different notable person.
Microsoft, a major investor in OpenAI, also has access to the visual analysis tool and is testing it in a limited rollout of its Bing chatbot. Both companies are proceeding carefully to protect user privacy and address concerns ahead of a wider release.