Retrieval-augmented generation (RAG) is an artificial intelligence (AI) system architecture that combines large language models (LLMs), such as GPT-4, with external data retrieval processes. Unlike traditional AI models, which rely only on the data they were trained on, RAG retrieves relevant information in real time from external databases or document repositories with the aim of generating contextually accurate responses. These external databases or document repositories are often a company’s private, internal resources that become shared with and connected to the RAG system.[1]
Business use and deployment models of RAG
Businesses typically implement RAG systems to complete tasks such as customer support automation, internal knowledge management, document summarisation, compliance tracking and advanced enterprise search.
Common deployment models include:
- cloud-based hosting; or
- on-premise/private hosting.
The standard architecture of RAG systems includes:
- vector databases (transforming different forms of data, such as text, images and video, into a common form of numerical vectors, so that items can be connected based on their relevance to each other);
- retrieval mechanisms (identifying relevant documents); and
- an LLM (synthesising retrieved information into natural language outputs).
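The three components above can be illustrated with a minimal sketch. It is a toy example only: the `embed` function is a simple word-count stand-in for a real embedding model, the `VectorStore` class and `build_prompt` function are hypothetical names, the sample documents are invented, and the final call to an LLM is left out. A production system would use a dedicated vector database and a hosted or self-hosted LLM instead.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts (an assumed stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """Hypothetical in-memory 'vector database': stores each document with its vector."""

    def __init__(self):
        self.items = []

    def add(self, doc):
        self.items.append((doc, embed(doc)))

    def retrieve(self, query, k=1):
        """Retrieval mechanism: return the k documents most similar to the query."""
        query_vec = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(query_vec, item[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]


def build_prompt(query, context_docs):
    """Assemble the text that would be sent to the LLM for synthesis."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Invented sample documents standing in for a company's internal repository.
store = VectorStore()
store.add("Refunds are processed within 14 days of a return.")
store.add("Office hours are 9am to 5pm on weekdays.")
store.add("Invoices are issued on the first day of each month.")

question = "When are refunds processed?"
top_docs = store.retrieve(question, k=1)
prompt = build_prompt(question, top_docs)
print(prompt)
```

The point of the sketch is that the LLM only sees whatever the retrieval step selects, which is why the contents of, and access to, the underlying repository matter for the legal risks discussed later in this article.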

RAG integration into enterprise systems, such as Customer Relationship Management (CRM), Enterprise Resource Planning (ERP) and Document Management Systems (DMS), is also common.[2]
Prominent RAG tools and frameworks
Frequently utilised RAG frameworks and tools include:
- LangChain;
- Haystack;
- OpenAI Assistants API; and
- Cohere RAG API (commercial RAG platform tailored for businesses).
Legal risks and compliance considerations
Depending on how the RAG system is deployed and what the affected systems are used for, businesses need to consider the following legal risks:
- compliance with the Privacy Act 1988 (Cth) (Privacy Act);
- risks based on the particular industry sector that the business operates in (Sector Based Risks);
- compliance with organisational information security policies;
- compliance with the specific business’s contracts with third parties (Contractual Risk);
- intellectual property infringement;
- compliance with best practices for the implementation of AI systems in Australia; and
- compliance with the Australian Consumer Law (ACL) set out in Schedule 2 of the Competition and Consumer Act 2010 (Cth).
Privacy Act compliance
APP Entities must ensure that any use and disclosure of personal information through a RAG system complies with the Privacy Act.
Sector Based Risks
The compliance obligations for businesses operating in different sectors can vary greatly. For example, the compliance obligations on businesses that operate in the health sector will differ markedly from those in the construction sector.
Information security
Deployment of RAG may also introduce cybersecurity vulnerabilities. Organisations that are subject to the Security of Critical Infrastructure Act 2018 (Cth) must comply with it by implementing appropriate cybersecurity measures and monitoring them.
Compliance with third party contracts
Automated data retrieval should adhere to an organisation’s contracts with third parties to ensure it does not cause a breach of these contracts. Express permissions and consents may need to be obtained to address this issue.
Intellectual property infringement
RAG systems may inadvertently incorporate third-party intellectual property (literary and artistic works) that the internal database/resource pool has provided access to, potentially breaching the Copyright Act 1968 (Cth).
Compliance with best practices for implementing AI systems
RAG systems incorporate LLMs. Businesses should therefore ensure that their RAG systems adhere to the AI Ethics Principles and the Voluntary AI Safety Standard. Businesses using RAG systems should also follow the guidance on privacy and the use of commercially available AI products published by the Office of the Australian Information Commissioner (OAIC). Meanwhile, software developers creating RAG systems can follow the OAIC’s guidance on privacy and developing and training generative AI models.
ACL compliance
Depending on what the RAG software is implemented to do, it is possible that its output could amount to a false or misleading representation under section 29(1)(a)-(n) of the ACL, or result in misleading or deceptive conduct under section 18 of the ACL.
Links and further references
Legislation
Competition and Consumer Act 2010 (Cth)
Security of Critical Infrastructure Act 2018 (Cth)
Australian AI standards
Australian AI guidance
Guidance on privacy and developing and training generative AI models
Guidance on privacy and the use of commercially available AI products
Proposals paper for introducing mandatory guardrails for AI in high-risk settings
Australian AI checklists
Privacy considerations when developing or training generative AI models
Privacy considerations when selecting a commercially available AI product
Privacy considerations when using commercially available AI products
Further information about AI and RAG systems
If you need advice on the risks of implementing RAG systems in your business, please contact us for a confidential and obligation-free discussion:

Malcolm Burrows B.Bus.,MBA.,LL.B.,LL.M.,MQLS.
Legal Practice Director
T: +61 7 3221 0013 (preferred)
M: +61 419 726 535
E: mburrows@dundaslawyers.com.au

Disclaimer
This article contains general commentary only. You should not rely on the commentary as legal advice. Specific legal advice should be obtained to ascertain how the law applies to your particular circumstances.
[1] Google Cloud, What is Retrieval-Augmented Generation (RAG), https://cloud.google.com/use-cases/retrieval-augmented-generation.
[2] IBM, What is retrieval-augmented generation, https://research.ibm.com/blog/retrieval-augmented-generation-RAG.