Harnessing AI to Mirror Bias: The Case of OpinionGPT and Its Limitations


In a groundbreaking project, a team of researchers at Humboldt-Universität zu Berlin has developed an AI model named ‘OpinionGPT’. This unique model, a variant of Meta’s Llama 2, is built with an intriguing characteristic – it is intended to generate outputs that reflect bias. Using ‘instruction-based fine-tuning’, the AI can seemingly mirror the perspective of one of 11 bias groups, including nationalities, age groups and political leanings.

This initiative involved training the AI on data compiled from ‘AskX’ communities (subreddits) on Reddit. The AI was then fine-tuned using instruction sets for each identified bias. Commenting on the accuracy of their system, the developers cautiously note that the model’s responses should be viewed as representing ‘Reddit posters’ rather than the named group in general.
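The data-preparation step of such a pipeline might look roughly like the sketch below: each subreddit Q&A pair is wrapped in a group-specific instruction prompt before fine-tuning. The group names, prompt template and record fields here are illustrative assumptions, not the researchers’ actual code.

```python
# Hypothetical sketch of preparing instruction-tuning records for a
# bias-conditioned model, loosely following the approach described above.
# Group names, template wording and field names are assumptions.

BIAS_GROUPS = ["American", "German", "teenager", "liberal", "conservative"]

def build_instruction_example(group: str, question: str, reddit_answer: str) -> dict:
    """Wrap one subreddit Q&A pair in a group-specific instruction prompt."""
    prompt = (
        f"Respond from the perspective of: {group}.\n"
        f"Question: {question}\n"
        f"Answer:"
    )
    # The completion is what the model is trained to produce after the prompt.
    return {"group": group, "prompt": prompt, "completion": " " + reddit_answer}

# Example record built from a hypothetical r/AskAnAmerican post:
example = build_instruction_example(
    "American",
    "What is your favourite sport?",
    "Football, easily.",
)
```

Records in this shape could then be fed to a standard supervised fine-tuning loop over a Llama 2 checkpoint; the key design choice the article describes is that the group label is encoded in the prompt itself, so one model can switch personas at inference time.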

However, the credibility of this AI faces criticism. Skeptics question whether the AI’s outputs authentically represent real-world biases, given that the source data’s relevance to the designated labels is questionable. One output that stands out is OpinionGPT’s claim that basketball is the favourite sport of Latin Americans. This contradicts empirical research, which indicates that football and baseball are more popular sports in Latin America.

Because the model’s data source is itself prone to bias, the AI’s outputs may reflect stereotypes rather than actual real-world bias. The developers themselves acknowledge the study’s limitations, clarifying that the responses should be regarded as those of a specific subreddit’s contributors rather than as representative of the larger demographic. For example, the responses of ‘Americans’ are better understood as those of ‘Americans that post on Reddit’, or even ‘Americans that post on this particular subreddit’.

In conclusion, while OpinionGPT’s utility for studying actual human bias may be dubious, it could prove insightful for exploring inherent stereotypes within large document repositories such as individual subreddits or AI training sets. The researchers plan to further explore models that offer demographic delineation and have made OpinionGPT available online for public testing, albeit with a warning that generated content could be false, inaccurate, or even obscene.

Source: Cointelegraph
