ChatGPT’s ability to analyze images is so advanced that OpenAI is hesitant to release it to the wider public, over concerns the tool will become a facial recognition machine.
The AI-powered chatbot can describe what’s in images, answer questions about them and recognize specific people’s faces. The original intent for this functionality was for ChatGPT to be able to identify issues and suggest solutions, similar to how the app Be My Eyes works.
OpenAI’s hesitancy around releasing this version follows news of WormGPT – the AI chatbot trained on malware-related data and designed to help extort victims. After all, if image recognition works this well, it’s usually only a matter of time before scammers find ways to use it with malicious intent.
ChatGPT's Visual Analysis Exceeds Performance Expectations
Back in March, OpenAI announced GPT-4 – the latest model powering its chatbot. While most users could only access its text-prompt functionality, a select few were given early access to an advanced version.
Jonathan Mosen, a blind broadcaster, advocate and teacher, was amongst those trialing the image-prompt feature. He said its capabilities far exceed those of other image analysis software, and that he’s now able to “interrogate images.”
Where standard image alt text gives basic descriptions, such as “woman with blond hair looking happy”, ChatGPT was able to provide far more detail, such as “woman in a dark blue shirt, taking a selfie in a full-length mirror.”
Mosen could also ask the AI tool follow-up questions, like what kind of shoes the woman was wearing and what else could be seen in the mirror’s reflection.
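For readers curious how this kind of back-and-forth image “interrogation” might look programmatically, here is a minimal sketch using OpenAI’s publicly documented Python SDK and its image-input chat format. The model name, image URL and questions are illustrative placeholders, not details from Mosen’s testing or from the early-access build described in this article.

```python
# Minimal sketch of image description plus a follow-up question,
# assuming OpenAI's Chat Completions API with image inputs.
# Model name and image URL are placeholders, not details from the article.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

image_url = "https://example.com/selfie.jpg"  # hypothetical image

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this photo in detail."},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }
]

first = client.chat.completions.create(model="gpt-4o", messages=messages)
print(first.choices[0].message.content)

# Ask a follow-up question in the same conversation, as Mosen describes doing.
messages.append({"role": "assistant", "content": first.choices[0].message.content})
messages.append({"role": "user", "content": "What kind of shoes is the woman wearing?"})

second = client.chat.completions.create(model="gpt-4o", messages=messages)
print(second.choices[0].message.content)
```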
However, much to their disappointment, Mosen and other early adopters recently had this access curtailed, with people’s faces now obscured for privacy reasons. The move reflects a wider concern OpenAI has about releasing something with such significant power.
The visual analysis capability of ChatGPT is the product of a collaboration with Danish startup Be My Eyes (BME). The BME technology has been around since 2012 and is used by a community of over 250m people who are blind or have low vision.
It works by connecting users with volunteers to help identify products or navigate through tasks. The aim of the ChatGPT partnership is to replace that human volunteer with a virtual one.
OpenAI Is Still “Figuring Out” Safety Concerns
OpenAI policy researcher Sandhini Agarwal has been following the chatbot’s progress, and clarified that it can identify public figures – for example, those with a Wikipedia page. But she cited the chatbot’s infamous “hallucinations” as a potential hindrance.
“If you give it a picture of someone on the threshold of being famous, it might hallucinate a name. Like if I give it a picture of a famous tech CEO, it might give me a different tech CEO’s name.” – Sandhini Agarwal, OpenAI Policy Researcher
OpenAI has expressed concerns that the tool could say things about people’s faces that provide unsafe assessments, such as their gender or emotional state.
This feature would push the boundaries of what is generally considered acceptable practice from US technology companies. More than that, it could cause legal issues in regions like Europe, where laws require citizens’ consent to use their biometric data.
Agarwal acknowledged that while celebrity facial recognition software already exists, those tools provide an opt-out for people who don’t want to be recognized. OpenAI is said to be considering the same approach while it figures out how to address other safety concerns before releasing the tool.
The Rise – And Power – Of AI
While the power of its image recognition functionality is significant, is OpenAI’s concern and cautious approach justified? Microsoft – which has invested $10bn in OpenAI and has access to the visual analysis tool – certainly seems to think so.
A spokesman for the tech giant said it wouldn’t be “sharing technical details” but was “working closely with our partners at OpenAI to uphold our shared commitment to the safe and responsible deployment of AI technologies.”
Sayash Kapoor, a computer scientist and doctoral candidate at Princeton University, recently used the tool to decode a captcha. The visual security check was supposedly intelligible only to humans, yet had been cracked by a chatbot. The only saving grace was that the chatbot stopped itself from sharing the obscured captcha words, stating that “captchas are designed to prevent automated bots like me from accessing certain websites or services.”
However, whether looking at images or text, relying on chatbots – or AIs in general – to self-police what they can or can’t share is not a foolproof strategy. The more advanced the technology becomes, the more its developers need to put measures in place to regulate it.