ABA Opinion 512 & Training Bespoke Models with Client Data

The allure of bespoke generative AI (GAI) models for law firms is undeniable. Tailoring AI systems to a firm’s accumulated knowledge promises increased efficiency, consistency, and enhanced legal services. However, as highlighted in the American Bar Association's (ABA) Formal Opinion 512, the use of client data to train these models raises profound ethical concerns.1 While solutions like redacting sensitive information or securing client consent may seem burdensome, the ABA emphasizes a critical point: we do not fully understand how these models generate outputs, and this uncertainty justifies stringent safeguards.

The Unknowns of GAI Models: A Risk Too Great?

The ABA opinion underscores the inherent opacity of generative AI systems, often referred to as the "black box" problem.2 GAI tools generate outputs based on patterns in their training data, but even their developers cannot always predict how or why they produce specific results. This lack of transparency poses unique risks when client data is involved, particularly in self-learning models.

One significant danger is that a GAI tool may inadvertently disclose sensitive client information. Even if data is anonymized or redacted, the model might generate content that reveals confidential details or combines data from different clients in unpredictable ways, breaching ethical walls or confidentiality agreements.3 This risk extends beyond external disclosures; it includes situations where the model outputs sensitive information to firm personnel who should not have access to it. Such scenarios could violate ethical obligations under Model Rule 1.6 and create cascading legal and reputational risks for the firm.4

Adding to the complexity, these tools are often trained on vast, unstructured datasets, making it difficult for firms to pinpoint the specific source of any disclosed information.5 The cumulative nature of the learning process means that even small lapses in data hygiene can compound over time, embedding sensitive information in the model’s decision-making. This opacity, combined with widespread reliance on GAI outputs, underscores the importance of robust safeguards, even when their implementation appears burdensome.

The Role of Retrieval-Augmented Generation (RAG) AI

An emerging approach to mitigate some risks associated with GAI is the use of Retrieval-Augmented Generation (RAG) AI. RAG models combine generative capabilities with information retrieval systems, allowing the AI to generate outputs based on both its trained data and real-time access to a curated knowledge base.6 This method can reduce the reliance on potentially sensitive training data by fetching relevant information as needed, rather than storing extensive client data within the model itself.
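To make that division of labor concrete, the sketch below traces the basic RAG control flow in Python. It is purely illustrative: the Document class, the term-overlap scorer, and the generate stub are hypothetical stand-ins for a real vector index and language model, not any vendor’s API.

```python
from dataclasses import dataclass

@dataclass
class Document:
    doc_id: str
    text: str

def score(query: str, doc: Document) -> int:
    """Toy relevance score (count of shared terms); a production
    system would use vector embeddings and a vector index instead."""
    return len(set(query.lower().split()) & set(doc.text.lower().split()))

def retrieve(query: str, store: list[Document], k: int = 1) -> list[Document]:
    """Fetch the k most relevant passages from the curated store."""
    return sorted(store, key=lambda d: score(query, d), reverse=True)[:k]

def generate(query: str, context: list[Document]) -> str:
    """Stand-in for the LLM call: the model is conditioned on the
    retrieved passages, not on client data baked into its weights."""
    sources = ", ".join(d.doc_id for d in context)
    return f"Answer to {query!r}, grounded in: {sources}"

# The knowledge base lives outside the model, so it can be audited,
# edited, or purged without retraining.
store = [
    Document("memo-001", "summary judgment standards in the second circuit"),
    Document("memo-002", "choice of law analysis for contract disputes"),
]
print(generate("summary judgment standard",
               retrieve("summary judgment standard", store)))
```

The point is structural: the only client material the model sees at generation time is what the retrieval step hands it, which makes retrieval the natural place to enforce policy.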

By leveraging RAG AI, law firms can limit the amount of client data embedded directly into the AI model, thereby decreasing the risk of unintended disclosures. The retrieval component can be configured to access only approved and sanitized data sources, providing an additional layer of control over the information used in generating outputs.7 Moreover, since RAG systems can update their knowledge bases without retraining the entire model, firms can more easily enforce data governance policies and remove outdated or sensitive information.
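What such a control layer might look like is sketched below. The governance fields (client_id, sanitized, approved) and the ethical-wall check are hypothetical illustrations, not features of any actual product, and in practice access control would also be enforced at the infrastructure level rather than in application code alone.

```python
from dataclasses import dataclass

@dataclass
class GovernedDocument:
    doc_id: str
    text: str
    client_id: str
    sanitized: bool = False   # redaction/review completed
    approved: bool = False    # cleared for AI use (e.g., consent obtained)

def permitted(doc: GovernedDocument, matter_clients: set[str]) -> bool:
    """Governance gate: only sanitized, approved documents belonging
    to clients on the requesting matter are visible to the retriever."""
    return doc.sanitized and doc.approved and doc.client_id in matter_clients

def governed_retrieve(query: str, store: list[GovernedDocument],
                      matter_clients: set[str]) -> list[GovernedDocument]:
    """Relevance ranking runs only over the eligible subset, so a
    walled-off document can never reach the generation step."""
    eligible = [d for d in store if permitted(d, matter_clients)]
    q_terms = set(query.lower().split())
    return sorted(eligible,
                  key=lambda d: len(q_terms & set(d.text.lower().split())),
                  reverse=True)

store = [
    GovernedDocument("m-01", "draft merger agreement terms", "client-a",
                     sanitized=True, approved=True),
    GovernedDocument("m-02", "litigation strategy memo", "client-b",
                     sanitized=True, approved=True),
]

# A query from client-a's matter team never surfaces client-b's memo.
print([d.doc_id for d in governed_retrieve("merger agreement", store,
                                           matter_clients={"client-a"})])
# -> ['m-01']

# Removing outdated or sensitive material is a store operation, not a
# retraining run: drop the document and it is gone from future outputs.
store = [d for d in store if d.doc_id != "m-01"]
```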

However, the use of RAG AI is not without ethical considerations. The retrieval mechanisms must be carefully managed to prevent unauthorized access to confidential information.8 Additionally, the integration of retrieval systems introduces new vectors for potential data breaches, requiring robust cybersecurity measures. Firms must ensure that the real-time data accessed by the AI complies with confidentiality obligations and that appropriate consent has been obtained for its use.9

Burdensome Safeguards: Necessary Evils?

The ABA suggests that safeguards like informed client consent and data redaction are not merely bureaucratic hurdles but essential measures to mitigate the unique risks posed by GAI.10 While these safeguards may seem inefficient, they align with the legal profession's commitment to protecting client interests.

Securing client consent for the use of historical data presents significant logistical and practical challenges. Firms with extensive client bases may find it nearly impossible to contact all affected individuals, especially when clients are deceased or unreachable.11 Nevertheless, without consent, using client data risks breaching confidentiality obligations. This ethical mandate remains firm, even in the face of such inefficiencies. Firms that bypass this process may save time but risk long-term damage to client trust and ethical compliance.12

Redaction offers another solution but introduces its own inefficiencies and potential pitfalls.13 Removing sensitive details from large datasets is labor-intensive, especially in legal contexts where nuanced information often holds critical importance. Heavily redacted data might undermine the effectiveness of bespoke models, reducing their ability to generate meaningful and accurate outputs. The result is a delicate balancing act: protecting client confidentiality while preserving the utility of the AI tool. The ABA’s emphasis on redaction reflects a broader principle—that even imperfect solutions are preferable to ignoring the inherent risks of GAI use.14
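A toy example shows why purely mechanical redaction falls short. The patterns below are deliberately simplistic stand-ins; production redaction pipelines layer pattern matching, named-entity recognition, and human review, and even then depend on judgment calls no script can make.

```python
import re

# Simplistic patterns for structured identifiers; these catch the easy
# cases (emails, US phone numbers, SSNs) and nothing else.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace matched identifiers with typed placeholders. Note what
    this misses: client names, deal terms, privileged facts -- exactly
    the nuanced information that requires human review."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

sample = "Contact Jane Doe at jane.doe@example.com or 212-555-0142 re: the merger."
print(redact(sample))
# -> Contact Jane Doe at [EMAIL REDACTED] or [PHONE REDACTED] re: the merger.
# The client name and the matter reference survive untouched.
```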

Firm-Wide Use of Bespoke Models: Supervision and Accountability

Under Model Rules 5.1 and 5.3, the ABA places significant responsibility on managerial lawyers to oversee the use of GAI tools within their firms.15 Bespoke models trained on client data require even greater scrutiny due to their pervasive use across offices and practice areas.

The complexity of firm-wide adoption lies in the interconnected nature of legal practices. A bespoke model trained on data from one office or practice area might inadvertently produce outputs that draw from unrelated client data, creating significant ethical dilemmas.16 For example, if a model generates insights or drafts that incorporate information from a client represented by a different office, it could breach ethical walls designed to maintain client confidentiality. This risk highlights the importance of comprehensive oversight mechanisms that extend across all facets of the firm’s operations.17

In addition to maintaining confidentiality, firms must address the broader implications of adopting bespoke models. These tools do not operate in isolation; their outputs are shaped by the cumulative practices and decisions of the firm. Without robust training and supervision, lawyers and staff may misuse or misunderstand the tool, compounding the ethical risks.18 The ABA’s guidance on supervision reflects an understanding that GAI adoption is not merely a technical upgrade but a fundamental shift in how legal services are delivered. Firms must approach this transition with the same rigor and accountability they apply to any other major ethical obligation.

Balancing Innovation with Responsibility

While bespoke GAI models promise transformative efficiency, they also demand transformative responsibility. The ABA’s position is clear: the legal profession cannot prioritize innovation over its foundational ethical obligations.19 The burden of safeguards such as informed consent and redaction reflects the gravity of the risks involved. As the ABA warns, until we fully understand how these models generate their outputs, stringent safeguards are not just prudent—they are necessary.20

The incorporation of RAG AI presents a potential pathway to balance innovation with responsibility. By minimizing the amount of sensitive data stored within AI models and enhancing control over the data retrieval process, firms can reduce some ethical risks associated with traditional GAI models.21 However, the adoption of RAG AI does not eliminate the need for vigilant oversight and adherence to ethical standards. Firms must still ensure that the data accessed and used by the AI complies with confidentiality obligations and that appropriate client consent has been obtained.22

The broader challenge lies in reconciling the legal profession’s commitment to client protection with the inevitability of technological progress. Firms that successfully navigate this tension will not only comply with their ethical duties but also set a standard for responsible innovation.23 As AI technology evolves, the legal community must remain vigilant, adapting its practices and policies to ensure that the benefits of innovation do not come at the expense of trust and integrity.

Conclusion: A New Era of Ethical Practice

The ethical challenges of bespoke GAI models are significant but surmountable with deliberate and proactive approaches. The introduction of RAG AI offers promising avenues to mitigate some risks, but it also introduces new considerations that must be carefully managed. Law firms must embrace transparency, prioritize client protection, and diligently apply the ABA’s guidance.24 By doing so, they can harness the benefits of generative AI without compromising the profession's core values. The burden may be heavy, but the stakes—client trust, ethical integrity, and the rule of law—are far greater. This new era of practice demands not only innovation but also a reaffirmation of the ethical principles that have long defined the legal profession.

Footnotes

  1. See ABA Comm. on Ethics & Prof'l Responsibility, Formal Op. 512 (2024).

  2. Id.; see also Jenna Burrell, How the Machine ‘Thinks’: Understanding Opacity in Machine Learning Algorithms, 3 Big Data & Soc'y 1 (2016).

  3. See Model Rules of Pro. Conduct r. 1.6(a) (Am. Bar Ass'n 2023).

  4. Id.; see also Jacob Turner, Robot Rules: Regulating Artificial Intelligence 89–90 (Palgrave Macmillan 2019).

  5. See Paul Ohm, Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization, 57 UCLA L. Rev. 1701, 1704–05 (2010).

  6. See Patrick Lewis et al., Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, in Advances in Neural Information Processing Systems 33, 9459–74 (2020).

  7. See Joshua A. Kroll et al., Accountable Algorithms, 165 U. Pa. L. Rev. 633, 680–82 (2017).

  8. See Florian Tramèr et al., Stealing Machine Learning Models via Prediction APIs, in Proceedings of the 25th USENIX Security Symposium 601, 613–14 (2016).

  9. See Model Rules of Pro. Conduct r. 1.6(c) (Am. Bar Ass'n 2023).

  10. See ABA Formal Op. 512, supra note 1.

  11. See Model Rules of Pro. Conduct r. 1.9 cmt. 9 (Am. Bar Ass'n 2023).

  12. See ABA Formal Op. 512, supra note 1.

  13. See Ohm, supra note 5, at 1716–17.

  14. See ABA Formal Op. 512, supra note 1.

  15. See Model Rules of Pro. Conduct rr. 5.1–5.3 (Am. Bar Ass'n 2023).

  16. See Charles W. Wolfram, Modern Legal Ethics § 7.6.5 (West Publishing Co. 1986).

  17. See ABA Comm. on Ethics & Prof'l Responsibility, Formal Op. 08-451 (2008).

  18. See Richard Susskind, Tomorrow's Lawyers 119–20 (2d ed. 2017).

  19. See ABA Formal Op. 512, supra note 1.

  20. Id.

  21. See Lewis et al., supra note 6.

  22. See Model Rules of Pro. Conduct r. 1.6(a), (c) (Am. Bar Ass'n 2023).

  23. See Susskind, supra note 18, at 123–24.

  24. See ABA Formal Op. 512, supra note 1.