OpenAI's generative language model, ChatGPT, has been a trailblazer, leaving a lasting impact on the world of natural language processing. However, recent developments have raised questions about its reliability. Observers and users alike have reported a noticeable decline in accuracy, leading some to wonder whether ChatGPT's brilliance is waning. A study by Stanford and Berkeley researchers, comparing the GPT-4 and GPT-3.5 models that power the chatbot, found substantial differences in performance between March and June.
Once lauded for its impressive prowess, GPT-4 has declined startlingly. The study revealed a jaw-dropping drop in accuracy at identifying prime numbers, plummeting from a commendable 97.6% in March to a mere 2.4% in June. A high-school senior grinding through calculus homework would likely outperform the AI on this task.
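To make the benchmark concrete, here is a minimal sketch of how such an evaluation can be scored: a deterministic primality check serves as ground truth, and the model's yes/no answers are graded against it. The `query_model` callable is a hypothetical stand-in for a chatbot API call, not the study's actual harness.

```python
def is_prime(n: int) -> bool:
    """Deterministic trial-division primality check, used as ground truth."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True

def score_primality_answers(numbers, query_model) -> float:
    """Fraction of numbers where the model's yes/no answer matches ground truth."""
    correct = 0
    for n in numbers:
        answer = query_model(f"Is {n} a prime number? Answer yes or no.")
        model_says_prime = answer.strip().lower().startswith("yes")
        if model_says_prime == is_prime(n):
            correct += 1
    return correct / len(numbers)

if __name__ == "__main__":
    # Demo with a toy "model" that always answers yes, just to show the scoring flow.
    print(score_primality_answers(range(2, 100), lambda prompt: "yes"))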
While the newer version showed improvement in deflecting problematic queries, such as illegal money-making schemes, it came at the cost of transparency. Rather than explaining why it rejects such queries, GPT-4 now takes the easy way out, simply stating, “Sorry, but I can’t assist with that.”
Another concerning aspect of GPT-4’s regression was its performance in generating computer code. In March, it managed to produce directly runnable code 52% of the time, an impressive feat. However, the landscape had drastically changed by June, with the success rate dropping to a mere 10%. The generated code, though often correct, began to arrive wrapped in extraneous non-code text that left it non-executable as delivered, which poses significant challenges for businesses integrating ChatGPT into their programming workflows.
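For illustration, the sketch below shows the kind of cleanup that extraneous text forces on integrators, assuming the extra material takes the form of markdown-style code fences around the snippet; `extract_code` and `is_directly_executable` are hypothetical helpers, not part of any published tooling.

```python
import re

# Matches a snippet wrapped in a markdown code fence, e.g. ```python ... ```
FENCE_RE = re.compile(r"^```(?:\w+)?\s*\n(.*?)\n```\s*$", re.DOTALL)

def extract_code(model_output: str) -> str:
    """Return the bare code, removing a surrounding markdown fence if present."""
    match = FENCE_RE.match(model_output.strip())
    return match.group(1) if match else model_output

def is_directly_executable(model_output: str) -> bool:
    """True if the raw output compiles as Python without any manual cleanup."""
    try:
        compile(model_output, "<generated>", "exec")
        return True
    except SyntaxError:
        return False

if __name__ == "__main__":
    fenced = "```python\nprint('hello')\n```"
    print(is_directly_executable(fenced))                 # False: the fence breaks compilation
    print(is_directly_executable(extract_code(fenced)))   # True once the fence is stripped
```

A pipeline that scored only the raw output would count such responses as failures even when the enclosed code is correct, which is one plausible reading of the steep drop the researchers measured.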
The researchers behind the study speculate that the rapid transformation in ChatGPT’s behavior is likely a result of OpenAI’s continuous efforts to fine-tune its product. Yet, due to the company’s secretive nature, the exact reasons behind the decline remain a mystery. OpenAI has acknowledged the “reported regressions” and has assured its users that they are actively investigating the matter.
In light of growing concerns about the ethical implications of AI technologies, industry leaders have come together to address the issue of transparency. Recently, representatives from seven prominent AI companies, including OpenAI, assembled at the White House and reached a consensus on guidelines aimed at enhancing safety measures before product releases. Additionally, a watermarking system for AI-generated content and more disclosure of technology vulnerabilities are among the measures agreed upon. To ensure accountability, failure to adhere to these commitments may result in repercussions from the Federal Trade Commission (FTC).