In a new project, a team of researchers at Humboldt-Universität zu Berlin has developed an AI model named ‘OpinionGPT’. This model, a variant of Meta’s Llama 2, has an unusual design goal: it is intended to generate outputs that reflect bias. Using ‘instruction-based fine-tuning’, the model can ostensibly mirror the perspective of one of 11 bias groups, including nationalities, age groups and political leanings.
The initiative involved training the AI on data compiled from ‘AskX’ communities (subreddits) on Reddit, then fine-tuning it on instruction sets tailored to each identified bias. Commenting on the accuracy of their system, the developers cautiously note that the model’s responses should be read as representing ‘Reddit posters’ within each group rather than the group as a whole.
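For readers curious how bias-conditioned instruction data of this kind might be assembled, the minimal sketch below shows one plausible way to format training examples, with the target group injected into each prompt. The subreddit names, prompt template and field names are illustrative assumptions, not the researchers’ actual pipeline.

```python
# Hypothetical sketch: assembling instruction-tuning examples per bias group.
# Subreddit names, the prompt template and all identifiers are assumptions
# for illustration, not the OpinionGPT team's actual code or data.

from dataclasses import dataclass

# Each bias group is mapped to the subreddit whose posts supply its answers.
BIAS_GROUPS = {
    "american": "AskAnAmerican",
    "german": "AskAGerman",
    "teenager": "AskTeenagers",
    "latin_american": "asklatinamerica",
}

@dataclass
class InstructionExample:
    instruction: str   # the question asked in the subreddit
    response: str      # an answer scraped from that subreddit
    bias_group: str    # which bias group this example conditions on

def format_example(ex: InstructionExample) -> str:
    """Render one training sample: the bias group is written into the
    instruction so the fine-tuned model learns to answer 'as' that group."""
    return (
        f"### Instruction (answer as a {ex.bias_group}):\n{ex.instruction}\n"
        f"### Response:\n{ex.response}"
    )

if __name__ == "__main__":
    sample = InstructionExample(
        instruction="What sport do people around you follow most?",
        response="Example answer taken from the subreddit.",
        bias_group="american",
    )
    print(format_example(sample))
```

In a setup like this, the same base model (here, Llama 2) would see every bias group during fine-tuning, and the group named in the instruction at inference time would steer which learned perspective the model reproduces.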
However, the credibility of this AI has faced criticism. Skeptics question whether the model’s outputs authentically represent real-world biases, given that the relevance of the source data to the designated labels is questionable. One output that stands out is OpinionGPT’s claim that basketball is the favourite sport of Latin Americans, which contradicts empirical research indicating that football and baseball are more popular sports in Latin America.
Given that the model’s data source is itself prone to bias, the AI’s outputs appear to reflect stereotypes rather than real-world bias. The developers themselves acknowledge this limitation, clarifying that the responses should be read as those of a specific subreddit’s contributors rather than as representative of the larger demographic. For example, responses attributed to ‘Americans’ are better understood as those of ‘Americans who post on Reddit’, or even ‘Americans who post on this particular subreddit’.
In conclusion, while OpinionGPT’s usefulness for studying actual human bias may be questionable, it could prove insightful for exploring the stereotypes embedded in large document collections such as individual subreddits or AI training sets. The researchers plan to explore models with finer-grained demographic delineation, and have made OpinionGPT available online for public testing, albeit with a warning that generated content may be false, inaccurate or even obscene.
Source: Cointelegraph