Generative AI and Information Literacy


Critical information literacy and bias

Critical Information Literacy is "a theory and practice that considers the sociopolitical dimensions of information and production of knowledge, and critiques the ways in which systems of power shape the creation, distribution, and reception of information" (Drabinski and Tewell, 2019).

Information, in other words, is never "unbiased." Human creators of information have biases based on their own lived experiences and perspectives. Human receivers of information also have biases - for example, "confirmation bias," the tendency to seek out information that reinforces your existing beliefs (Mynatt et al., 1977), or "cognitive dissonance," the discomfort you feel when information does not align with your beliefs (Festinger, 1957). Biases are also embedded in the ways that information is distributed - for example, in who decides what kinds of information get published or archived, or in how search engines rank the pages displayed in a results list.

Bias, DEI, and technology

Multiple studies have documented how technologies perpetuate systemic biases and inequalities in our societies. In her pioneering book Algorithms of Oppression: How Search Engines Reinforce Racism (2018), digital media scholar Safiya Noble analyzed Google search results from 2009 to 2015 to demonstrate that search engines are not neutral but reinforce racist and sexist biases.

The Large Language Models (LLMs) underlying chatbots are trained on large data sets that contain biases that we (the end users) are not able to evaluate. The seminal research paper "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?" demonstrates how large data sets "overrepresent hegemonic viewpoints and encode biases potentially damaging to marginalized populations" (Bender et al., 2021, p. 610). In other words, LLMs are trained on data that likely reproduce historical biases and may include overt hate speech or misinformation.

When used in chatbots, these LLMs can produce or amplify sexist, ableist, racist, or otherwise harmful ideologies when responding to user queries (Bender et al., 2021, p. 617). Omiye et al.'s "Large Language Models Propagate Race-Based Medicine" (2023) demonstrates how the integration of Gen-AI tools into healthcare systems can further discriminate against persons of color in medicine. When used in workforce recruitment and resume screening, Gen-AI tools can perpetuate gender, age, and disability biases (Glazko et al., 2024).

Human labor concerns

In conversations about AI, the text produced by chatbots is often presented as the product of machine intelligence alone. However, journalists have shown that AI-generated text is not the work of machines only: the labor of many human workers is essential to the text generated by ChatGPT and other chatbots. According to an investigative report, "Behind even the most impressive AI systems are people — huge numbers of people labeling data to train it and clarifying data when it gets confused" (Dzieza, 2023).

How do humans contribute to the work of generative AI?

  • Annotation: People are given data gathered from the Internet and label this data: for example, they assign emotions to people's voices in video calls or on social media posts, or they categorize images of items such as clothing or food. These labels are used to train AI to recognize and assign categories to data.
  • Reinforcement learning from human feedback (RLHF): People "converse" with an AI chatbot and rate its responses for qualities such as authenticity or helpfulness. Engineers use these ratings to train the AI model to sound more "humanlike."
  • Detecting toxic content: Similar to annotation, people identify and label "toxic" content from the Internet (including violent, disturbing, and harmful content). This is used to train the AI model to exclude such content from its generated text. (A simplified sketch of the kinds of records these workers produce appears after this list.)
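
To make these forms of labor concrete, the sketch below shows, in simplified and hypothetical form, the kinds of records human data workers produce. The field names and values are illustrative assumptions, not any company's actual data format.

```python
# Hypothetical examples of human-labeled records used to train generative AI systems.
# All field names and values are illustrative assumptions, not a real schema.

# Annotation: a worker assigns a category label to an item gathered from the Internet.
annotation_record = {
    "item": "images/jacket_03417.jpg",   # hypothetical scraped image
    "label": "clothing/outerwear",       # category chosen by a human annotator
}

# RLHF: a worker compares two chatbot responses to the same prompt and marks the
# more helpful one; engineers use many such comparisons to tune the model's behavior.
preference_record = {
    "prompt": "Explain photosynthesis to a ten-year-old.",
    "response_a": "Plants use sunlight to turn air and water into food...",
    "response_b": "Photosynthesis converts photons into chemical energy via...",
    "preferred": "response_a",           # judged clearer for the intended audience
}

# Toxic content detection: a worker flags harmful text so the model can be trained
# to avoid reproducing it.
toxicity_record = {
    "text": "[example of violent or abusive content]",
    "labels": ["violence"],              # categories assigned by a human reviewer
}
```

Each of these records represents a judgment made by a person, and modern systems rely on enormous numbers of such judgments.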

The human labor used to train generative AI models is often outsourced to underpaid workers in the Global South. For instance, workers in Kenya were paid less than $2 an hour to label disturbing toxic content (Perrigo, 2023). Some academics refer to these practices as "digital neocolonialism": Western tech companies exploit the labor and natural resources (for example, minerals used in computer hardware) of poor nations in the Global South, further perpetuating the legacy of colonialism (Browne, 2023). Initiatives like the Data Workers’ Inquiry help to share firsthand stories from global data workers.

Selected readings

On Critical Information Literacy and Critical AI Literacies

Drabinski, Emily, and Eamon Tewell. “Critical Information Literacy.” In The International Encyclopedia of Media Literacy, edited by Renee Hobbs and Paul Mihailidis, 1st ed., 1–4. Wiley, 2019. https://doi.org/10.1002/9781118978238.ieml0042.

Festinger, Leon. A Theory of Cognitive Dissonance. Evanston, IL: Row, Peterson, 1957.

Gupta, Anuj, Yasser Atef, Anna Mills, and Maha Bali. “Assistant, Parrot, or Colonizing Loudspeaker? ChatGPT Metaphors for Developing Critical AI Literacies.” Open Praxis 16, no. 1 (2024): 37–53. https://doi.org/10.55982/openpraxis.16.1.631.

Mynatt, Clifford R., Michael E. Doherty, and Ryan D. Tweney. “Confirmation Bias in a Simulated Research Environment: An Experimental Study of Scientific Inference.” Quarterly Journal of Experimental Psychology 29, no. 1 (1977): 85–95. https://doi.org/10.1080/00335557743000053.

On Bias in Technology and LLMs

Bender, Emily M., Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” In FAccT ’21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–623. https://doi.org/10.1145/3442188.3445922.

Browne, Grace. “AI Is Steeped in Big Tech’s ‘Digital Colonialism.’” Wired, May 25, 2023. https://www.wired.com/story/abeba-birhane-ai-datasets/.

Buolamwini, Joy. Unmasking AI: A Story of Hope and Justice in a World of Machines. New York: Random House, 2023.

Glazko, Kate, Yusuf Mohammed, Ben Kosa, Venkatesh Potluri, and Jennifer Mankoff. “Identifying and Improving Disability Bias in GPT-Based Resume Screening.” In FAccT ’24: Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, 687–700. https://doi.org/10.1145/3630106.3658933.

“How Artificial Intelligence Bias Affects Women and People of Color.” UCB-UMT, December 8, 2021. https://ischoolonline.berkeley.edu/blog/artificial-intelligence-bias/.

Lizarraga, Lori. “How Does a Computer Discriminate?” NPR Code Switch, November 8, 2023. https://www.npr.org/2023/11/08/1197954253/how-ai-and-race-interact.

Noble, Safiya Umoja. Algorithms of Oppression: How Search Engines Reinforce Racism. New York: New York University Press, 2018.

Omiye, Jesutofunmi A., Jenna C. Lester, Simon Spichak, Veronica Rotemberg, and Roxana Daneshjou. “Large Language Models Propagate Race-Based Medicine.” NPJ Digital Medicine 6, no. 1 (2023): 1–4. https://doi.org/10.1038/s41746-023-00939-z.

On Human Labor and Generative AI

Browne, Grace. “AI Is Steeped in Big Tech’s ‘Digital Colonialism.’” Wired, May 25, 2023. https://www.wired.com/story/abeba-birhane-ai-datasets/.

Distributed AI Research (DAIR) Institute. Data Workers’ Inquiry. https://data-workers.org/.

Dzieza, Josh. “AI Is a Lot of Work.” The Verge, June 20, 2023. https://www.theverge.com/features/23764584/ai-artificial-intelligence-data-notation-labor-scale-surge-remotasks-openai-chatbots.

Perrigo, Billy. “OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic.” TIME, January 18, 2023. https://time.com/6247678/openai-chatgpt-kenya-workers/.

Questions to consider

  • What is your reaction to learning about systemic bias and how it shapes technology? Does it change the way you approach or use Gen-AI tools?
  • How might our own (often unconscious) biases affect how we prompt a chatbot? How would that affect the output?
  • How does knowing about the human labor involved in developing generative AI affect the way you think about or evaluate the text it produces?