Limitations of AI
Learning Objectives
This section will help you understand:
- Some of the limitations of AI models
- The wider risks of AI
- Policies and regulations that might apply
AI model limitations
When considering AI in your research, it's important to be aware of the limitations of current AI models. These apply across all forms of machine learning model, and include:
- Accuracy - models are never 100% accurate, so they aren't appropriate for every scenario or task. This needs to be taken into account when designing the scenario in which you intend AI to be used; in safety-critical tasks, human oversight may always be needed.
- Appropriateness - some tasks just aren't appropriate for AI because they are too noisy or unpredictable. Often, if it sounds too good to be true, then it is!
- Interpretability - the black-box nature of many models means it can be difficult to understand or interpret why they make the predictions they do. In some domains, this can be a serious challenge.
- Hallucinations (or confabulations) - the name given to a specific kind of error in which language models output factually incorrect information. While it's possible to reduce the hallucination rate, eliminating hallucinations entirely remains a challenge.
Third Party Tools
The use of third party tools like ChatGPT is increasingly common in scientific research.
Take care when uploading data to a third party tool. Information that you upload may be stored by the provider and used for model training or other purposes. Data protection laws like GDPR may apply when dealing with personal data, but also consider the risk of uploading unpublished data or work to a third party service. Many services offer an option to opt out of your data being stored, so be sure to check the settings you are using. You also need to be transparent with research participants about your use of AI and inform them that their data may be shared with a third party.
Risks, Concerns and Scientific Integrity
There are also a number of wider risks and concerns about the field of AI.
ML models learn from the data they are trained on, which means they learn and reflect any bias present in that data. This can be particularly harmful when models are trained on data scraped from the internet, which contains societal biases - such as racial and gender bias - that should not be propagated.
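One practical way to surface this kind of bias is to break a model's performance down by group. Below is a minimal sketch, assuming a pandas DataFrame with made-up column names ("group", "label", "pred"); it is illustrative only, not a complete fairness audit.

```python
# A minimal sketch (not from this lesson) of checking a model's predictions
# for group-level performance gaps. Column names are invented for illustration.
import pandas as pd

results = pd.DataFrame({
    "group": ["A", "A", "B", "B", "B", "A"],   # e.g. a demographic attribute
    "label": [1, 0, 1, 1, 0, 1],               # ground-truth outcomes
    "pred":  [1, 0, 0, 0, 0, 1],               # model predictions
})

# Accuracy per group: a large gap suggests the model performs worse for one
# group, and the training data should be examined for bias.
per_group_accuracy = (
    results.assign(correct=results["label"] == results["pred"])
           .groupby("group")["correct"]
           .mean()
)
print(per_group_accuracy)
```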
The environmental cost of building and using ML models is under increasing scrutiny. These models require large amounts of computing power to train and use, and therefore consume significant amounts of energy. Reducing the carbon footprint of AI is a priority for many in the field.
Because ML models make mistakes, it's important to think about accountability and human oversight. Leaving ML models to make critical decisions without oversight raises the question of who remains accountable and responsible when the system makes the wrong decision, especially in tasks where a wrong decision can cause harm. Some models are therefore used in a human-in-the-loop scenario, where the AI model provides a prediction or decision that is checked by a human rather than being acted on autonomously.
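As a rough illustration of the human-in-the-loop pattern, the sketch below routes low-confidence predictions to a person instead of acting on them automatically. The threshold value and the `ask_human_reviewer` helper are assumptions made for this example, not part of any particular system.

```python
# A minimal, illustrative human-in-the-loop sketch: predictions below a
# confidence threshold are deferred to a human reviewer.
CONFIDENCE_THRESHOLD = 0.9  # assumed value; choose per task and risk level

def ask_human_reviewer(label: str, confidence: float) -> str:
    # Placeholder: in practice this would queue the case for expert review.
    print(f"Model suggested '{label}' ({confidence:.0%} confident) - please verify.")
    return input("Enter the confirmed label: ")

def handle_prediction(label: str, confidence: float) -> str:
    """Return the final decision, deferring to a human when the model is unsure."""
    if confidence >= CONFIDENCE_THRESHOLD:
        return label                              # act on the model's output
    return ask_human_reviewer(label, confidence)  # defer to human oversight
```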
From a scientific research perspective, ensuring reproducibility is important; however, in such a fast-moving field it can be neglected. It is also hard to explain the decisions of ML models, and in many fields explainability and transparency matter. For example, a model predicting a medical diagnosis might also need to provide an explanation of how it reached that prediction.
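Two small habits that help with reproducibility are fixing random seeds and recording the software environment alongside your results. Below is a minimal sketch of both, assuming a NumPy-based workflow; other frameworks (such as PyTorch) have their own random number generators that also need seeding.

```python
# A minimal reproducibility sketch: fix seeds and record run metadata.
import json
import platform
import random
import sys

import numpy as np

SEED = 42                 # chosen arbitrarily; record it with your results
random.seed(SEED)         # Python's built-in RNG
np.random.seed(SEED)      # NumPy's RNG

run_metadata = {
    "seed": SEED,
    "python": sys.version,
    "platform": platform.platform(),
    "numpy": np.__version__,
}
with open("run_metadata.json", "w") as f:
    json.dump(run_metadata, f, indent=2)
```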
ML models are trained on data, so it's important to know where that data comes from. Much data is copyrighted, and we may need consent from its owners in order to use it.
As with all technology, there is a potential for misuse. For researchers, the idea that their work can be misinterpreted and misused might seem like a far-off concern. However, it's important to consider at the outset how your work might be used in ways you don't support.
Policies and Regulations
Policies and regulation for AI are being developed, and this is a shifting area. Always do your own research into specific policies and regulations that might apply to your research. Those might come, for example, from your funding source, your research institution, your department, or your government.
As with other research, AI research may need to go through an ethics review board to be approved. Specific issues to be aware of include:
- the dual-use nature of AI
- the use of data, including personal data
- the use of AI tools in ways that go against good scientific research practice
- the proprietary nature of some tools and their lack of transparency
- the bias and errors present in AI output
- the ability to create misinformation at scale
- the ability to cause harm
The UK GDPR, together with the Data Protection Act 2018, is the UK’s data protection legislation and may apply to the data you are working with. It requires that you are transparent with the people you collect data from, that you store data securely, and that you collect only the data you need and use it only for specified purposes.
There are several regulations around the globe that are relevant to AI. The first law specific to AI is the EU's AI Act. It bans certain uses of AI outright and designates some high-risk applications that carry obligations when they are deployed. Most of the Act does not apply to research, but it could apply if you commercialise your work. Other jurisdictions are drafting laws and guidelines that may come into force, so be sure to check the current situation as you carry out your research.
Resources
- The ICO’s GDPR Guidance
- UKRI's GDPR guidance for researchers
- UK Government advice on Generative AI (aimed at civil servants)
- EU Guidelines on the responsible use of Generative AI in Research