Bias, Harm, and Privacy Concerns

This resource was remixed from the Centre for Teaching, Learning and Technology at UBC and adapted to the Queen’s University context. All content is licensed under CC-BY-SA.

As with any technology platform, it’s important to consider what data the company collects, for what purposes, and who will have access to it. Unlike Queen’s-supported tools such as onQ or Zoom, ChatGPT, DALL-E, and other generative AI tools have not undergone the Queen’s Security Assessment Process, which thoroughly reviews privacy and security considerations and ensures tools are implemented in a way that addresses them.

It is therefore up to individual users to review the terms of service and privacy policies for these kinds of tools, and reviewing and discussing them with students could be a useful learning experience. In addition, it is important to discuss with students that they should not provide personal or other sensitive information in their prompts.

Please note that Queen’s has not banned the use of generative AI tools. This technology has already entered the educational system, is very difficult to detect, and is so widespread that it would be difficult, if not impossible, to prevent its use. In addition, many persuasive arguments have been made about the potential positive uses of generative AI.

(from the General Statement from the Vice-Provost (Teaching and Learning) on ChatGPT and Generative AI, February 2023)

However, inappropriate use of generative AI constitutes a departure from academic integrity. Among the core values of academic integrity are honesty and fairness that establish a framework for teaching and learning for both undergraduate and graduate students at Queen’s. 

(from the General Statement from the Vice-Provost (Teaching and Learning) on ChatGPT and Generative AI, February 2023)

 

Having students use AI tools in class

If you would like to have students use such tools in class, we encourage you to tell them what data is collected and where it is stored, and to offer alternatives if they choose not to provide identifying information. Below is a sample paragraph you could use, ideally in the syllabus:

In this course, students will be using [specify tool or platform], which is [specify what the tool is]. This tool will help us [specify how students will be using the tool]. During the account creation process, you will be required to provide your name and other identifying information. This tool is hosted on servers in [specify where]. By using this service, you are consenting to storage of your information in [the location]. If you choose not to provide your consent, see the instructor for alternate arrangements.

In using [tool or platform], be sure not to input any personally identifiable or sensitive information about yourself or others without their consent, as data you input may be used for training the tool and could emerge in later outputs.

Providing an option for AI tools in class

If use of the tool is optional for the course, students who do not consent to providing their personal information can choose not to use it. When designing activities, though, consider whether those who are willing to use the tool may have an advantage over those who are not.

Requiring students to use AI tools for activities or assignments

If you wish to require students to use the tool for activities or assignments, you must provide alternatives for those students who do not consent to sharing their personal information. For some tools, an alternative arrangement could be for students to use a pseudonym and an email address that doesn’t include their real name or other identifying information. ChatGPT in particular requires a cell phone number, so if students do not wish to provide their personal cell phone number, a different option must be offered, such as an alternative way to achieve the same learning goals.

In addition, for ChatGPT in particular, one option is for a faculty member to create an account that could be shared with one or more students in the course. You would need to share the username and password with the students, so it should be a separate account from any you use yourself. However, please note that, according to OpenAI’s privacy policy, data such as social media information can be transferred from users’ computers to OpenAI (see https://openai.com/policies/privacy-policy), so even if students log in using a shared account, they might still unknowingly give away their data.

OpenAI’s Terms of Use state that “You may not make your access credentials or account available to others outside your organization, and you are responsible for all activities that occur using your credentials.” Before choosing this option, be sure to check the Terms of Use to ensure they haven’t changed since this resource was written.

 

Equitable Access

Some generative AI tools are available free of cost for a limited time or have free tiers with less functionality than paid tiers. Consider whether asking students to use such tools in courses might give those who can afford to pay a disproportionate advantage over those who cannot.

For example, as noted above, there is now both a free and a subscription version of ChatGPT. If you’re asking students to use it, consider how to design activities such that those who use the free version aren’t disadvantaged compared to those who can pay.

It’s also important to recognize that not all students have the same level of access to high-speed internet off campus, which could make generative AI tools harder to access.

Finally, note that generative AI software is not available in all countries. If you are teaching an online or multi-access course with students located around the world, or if students are travelling at some point during the term, they may not be able to use the tool.

Bias and other harmful content in outputs

Large language models (LLMs), such as the one underlying ChatGPT, are trained on large datasets of text. When the training data contains biased, discriminatory, abusive, or other problematic content, the model’s textual outputs may contain such content as well. As Weidinger et al. (2021) note:

Language Models are optimised to mirror language as accurately as possible, by detecting the statistical patterns present in natural (English) language . . . The fact that LMs track patterns, biases, and priors in natural language . . . becomes a problem when the training data is unfair, discriminatory, or toxic. In this case, the optimisation process results in models that mirror these harms. (p. 11)

There are multiple ways that creators of such models can try to reduce this risk, but harmful content may still get through filters and other mitigation efforts. In their November 30, 2022 announcement about ChatGPT, OpenAI notes:

While we’ve made efforts to make the model refuse inappropriate requests, it will sometimes respond to harmful instructions or exhibit biased behavior. We’re using the Moderation API to warn or block certain types of unsafe content, but we expect it to have some false negatives and positives for now. We’re eager to collect user feedback to aid our ongoing work to improve this system.
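
To make this mitigation more concrete for discussion with students, below is a minimal sketch of what screening text with a moderation filter can look like in code. It uses the publicly documented Moderation endpoint of OpenAI’s Python SDK (version 1 or later); the is_flagged helper is our own illustrative name, and the sketch assumes an API key is available in the OPENAI_API_KEY environment variable. It shows the general pattern only, not OpenAI’s internal moderation pipeline.

```python
# Minimal illustrative sketch: screening text with OpenAI's Moderation
# endpoint. Assumes the openai Python package (v1+) is installed and
# that OPENAI_API_KEY is set in the environment. This shows the general
# pattern only; it is not OpenAI's internal moderation pipeline.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def is_flagged(text: str) -> bool:
    """Return True if the moderation model flags the text as unsafe."""
    response = client.moderations.create(input=text)
    result = response.results[0]
    # result.flagged is the overall verdict; result.categories breaks it
    # down by category (e.g., hate, harassment, violence).
    return result.flagged


if __name__ == "__main__":
    sample = "Example text a user might submit."
    if is_flagged(sample):
        print("Blocked: the moderation model flagged this text.")
    else:
        print("OK: the text passed the moderation check.")
```

As the OpenAI announcement above acknowledges, a filter like this can produce both false negatives and false positives, which is why it is a mitigation rather than a guarantee.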

A related ethical concern is the human cost of training models to detect and avoid such harmful content. This often requires people to review toxic, violent, or abusive texts, images, or videos in order to provide examples to train the model. An article in Time magazine from January 2023 discusses the lasting impacts of such work on low-wage workers in Kenya; this is likely just one among many examples.

It is important to be aware of the risk of biased, discriminatory, and otherwise harmful outputs, and to plan for how you will manage them if you are using such tools in teaching. In addition, discussing these and other ethical issues with students can help them make better-informed choices about how they use such tools.

Resources