Intellectual Property Litigation, Protection, and AI Legal Matters: Defending the Algorithm™

Our intellectual property trial lawyers have an extensive track record of aggressive courtroom advocacy in high-stakes IP, AI and technology disputes. From patent infringement battles to complex DTSA trade secret misappropriation cases, we've secured victories for clients across federal courts, specialized IP tribunals, and in both jury and bench trials. Our AI-equipped team combines deep technical understanding with battle-tested litigation experience to protect cutting-edge innovations, particularly in cases involving artificial intelligence (AI), machine learning, and emerging technologies. Whether defending against IP infringement claims or pursuing misappropriated trade secret assets, we provide forceful representation for industrial leaders, technology companies, and innovative businesses, backed by our comprehensive understanding of both traditional and digital intellectual property rights: Patents; Trademarks; Copyrights; AI; Trade Secrets; DTSALaw®; Defending the Algorithm™

Defending the Algorithm™: A Bayesian Analysis of AI Litigation and Law.

By Henry M. Sneath on October 1, 2025

Listen to this Blog Post with Additional Commentary from the Author Henry M. Sneath - 37:33 min.

Copyright and Beyond – The Expanding Legal Battleground Over AI Training Data and AI Enterprise Software

This blog post and audio file are another installment in the series "Defending the Algorithm™," written and edited by Pittsburgh, Pennsylvania Business, IP and AI Trial Lawyer Henry M. Sneath, Esq. It was authored with research assistance by Claude® from Anthropic Sonnet Edition 4.5 Pro and some research confirmation from Google Gemini AI 2.5 Flash. This series focuses on AI, the legal practice, and the intersections of AI and substantive law. Claude and Gemini can make mistakes, but the Author has taken care to check their output for accuracy.

Introduction

Our previous analysis of the historic $1.5 billion Anthropic settlement in Bartz v. Anthropic revealed how Judge Alsup's groundbreaking ruling established a potential bright-line legal framework distinguishing between permissible AI training on legally obtained copyrighted works and impermissible use of pirated materials. While this decision provides crucial initial guidance to legal practitioners on copyright fair use in AI training, it represents only the opening chapter in a rapidly expanding legal battleground over how AI companies acquire and use their training data, and how companies deploy AI enterprise software in the delivery of products and services. See my prior post on Bartz v. Anthropic.

Chapter 1: The Bayesian Analysis

AI and the Law are at a critical intersection and lawyers should be prepared to counsel business clients on the front end of AI integration into their business structure, and on the back end when litigation arises. What is the probability that AI issues will become a core or major part of litigation in the future? How will AI and super intelligence frontier technology affect the legal issues in IP, Patent, Copyright and Trade Secret Law, Technology and Cybersecurity Law, Environmental and Energy Law, Employment Law and Renewable Energy, Land Use and Zoning Law?

The Bayesian approach guides us to: 1) accept Judge Alsup's N.D. California opinion in Bartz v. Anthropic, and his ruling on Anthropic’s LLM training, as our “prior probability” or “baseline assumption”; 2) combine it with new evidence from other cases and forums to create a likelihood function; and 3) apply Bayes' theorem to arrive at a posterior probability predicting how likely the Alsup opinion is to become established precedent across multiple federal circuits. Bayes' theorem is a fundamental building block in the probability-predicting technology of AI and in the race to create AGI and super intelligence, and it is useful here for understanding how we can predict new developments in the Law of AI. Let’s apply an AI principle to the evaluation of AI and the Law.
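As a rough illustration of the three steps, the sketch below updates a belief that the Alsup framework becomes established precedent as later rulings arrive. Every number in it is hypothetical and chosen only to show the mechanics; the function name `update`, the ruling labels, and the probabilities are assumptions introduced for this example, not data from any case. The denominator is expanded with the law of total probability, which is one standard way to compute the evidence term.

```python
def update(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """One Bayesian update: returns P(H | E) for hypothesis H and evidence E."""
    numerator = p_e_given_h * prior
    # Law of total probability: P(E) = P(E|H)P(H) + P(E|not H)P(not H)
    evidence = numerator + p_e_given_not_h * (1.0 - prior)
    return numerator / evidence

# H = "the Bartz fair use framework becomes established precedent".
belief = 0.50  # hypothetical prior, not a real estimate

# (label, P(ruling | H), P(ruling | not H)) -- all values hypothetical.
rulings = [
    ("later district ruling follows the framework", 0.70, 0.30),
    ("later district ruling rejects the framework", 0.20, 0.60),
    ("appellate court affirms a similar ruling", 0.80, 0.25),
]

history = [belief]
for label, p_h, p_not_h in rulings:
    belief = update(belief, p_h, p_not_h)  # posterior becomes the next prior
    history.append(belief)
    print(f"{label}: {belief:.3f}")
```

Each posterior becomes the prior for the next ruling, so a favorable decision raises the estimate and an unfavorable one lowers it.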

With this understanding, we can then try to predict the likelihood of lawsuits being brought against businesses that use AI enterprise software; are training LLMs; or are using AI to make important business decisions that impact other businesses, customers and employees. Bayesian analysis allows us to update probability estimates as new evidence emerges. Let's set up and then dissect the “AI and Law equation” to examine this probability analysis:

P(AI Litigation | Business Operations) = P(Business Operations | AI Litigation) × P(AI Litigation) / P(Business Operations)

Meanings of the terms and symbols in the equation:

P = “Probability of”;
AI Litigation = “Lawsuits by or against companies which involve AI issues”;
Business Operations = “The information related to how a business uses AI”;
| = "given" (conditional probability);
× = “multiplication”;
/ = “division”.

Breaking Down Each Component of the Equation - definitions:

Posterior (what we want) →
Likelihood (pattern recognition from existing cases) →
Prior (baseline risk) →
Evidence (how common your activity is)

In other words, here's what we're calculating, here's how we recognize patterns, here's our starting assumption, here's how common a business or legal situation is. Here it is in more detail.

P(AI Litigation | Business Operations) = The Posterior Probability

This is what we're solving for: the probability that a company will face AI litigation given its specific business and AI operations;
This is the "updated" risk assessment after considering new evidence (like the Anthropic settlement and other cases)

P(Business Operations | AI Litigation) = The Likelihood

This asks a business that is worried about AI legal risk: "Among companies that have faced AI litigation, what percentage had AI business operations similar to yours?"
Example (not real statistics): If 80% of companies that were sued for AI copyright infringement were using LLMs for training or content generation, then P(Content Generation | AI Copyright Suit) = 0.80

P(AI Litigation) = The Prior Probability

This is the baseline probability of AI litigation across all businesses, regardless of specific operations;
Example (not real statistics): Before the Anthropic case: estimated at 0.05 (5% of tech companies faced AI-related lawsuits);
After the Anthropic case: updated to 0.12 (12% probability as litigation increases)

P(Business Operations) = The Evidence

This is the probability that any randomly selected business engages in your business’s specific AI-related or LLM operations;
Example (not real statistics): P(Using LLMs for Content) might be 0.25 (25% of businesses now use LLM tools)
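To see how the pieces combine, the illustrative figures above can be plugged into the equation. The sketch below is only an arithmetic check that reuses the post's made-up example numbers (likelihood 0.80, prior 0.05 or 0.12, evidence 0.25) and treats them as if they described the same population of companies, which is an assumption. The function name `posterior` is introduced here for illustration.

```python
def posterior(likelihood: float, prior: float, evidence: float) -> float:
    """Bayes' theorem: P(A | B) = P(B | A) * P(A) / P(B)."""
    if not 0 < evidence <= 1:
        raise ValueError("evidence must be a probability greater than 0")
    result = likelihood * prior / evidence
    if result > 1:
        # The three inputs cannot all be true at once if this happens.
        raise ValueError("inconsistent inputs: posterior would exceed 1")
    return result

# Illustrative values taken from the examples above (not real statistics).
LIKELIHOOD = 0.80  # P(Business Operations | AI Litigation)
EVIDENCE = 0.25    # P(Business Operations)

before = posterior(LIKELIHOOD, prior=0.05, evidence=EVIDENCE)
after = posterior(LIKELIHOOD, prior=0.12, evidence=EVIDENCE)

print(f"Posterior with prior 0.05: {before:.3f}")  # 0.160
print(f"Posterior with prior 0.12: {after:.3f}")   # 0.384
```

With these example inputs, raising the baseline risk from 5% to 12% raises the posterior from 16% to about 38% for a business whose operations match the pattern seen in prior suits.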

The point here is that this probability analysis is at the ground floor, but is part of a rapidly developing analysis in the legal profession. The Bayesian analysis is a good way to guide our thinking about the probability of increasing AI-related litigation, and it helps us understand the foundational principles of AI itself. While we predict that AI litigation will rapidly increase as AI becomes ubiquitous in business, health care, insurance and most other industries, we are not yet in a position to accurately predict the probability of any one company getting sued for its use of AI, unless we can determine that the company is training LLMs or is using AI enterprise software in a way that has already resulted in similar companies getting sued for similar AI use.

We are only at the very beginning of the development of AI Law, but we are getting more data all the time, which will give us better and better ability to predict the “posterior probability” referenced above. Beyond the copyright issues addressed in Bartz, a new wave of litigation challenges the fundamental methods by which AI companies harvest the massive datasets that fuel modern large language models and by which companies use AI enterprise software to power their business operations, to make business and human resources decisions and to deliver products and services. These emerging cases move beyond traditional copyright infringement theories to explore contract violations, unauthorized access claims, DTSA trade secret claims, employment claims, and systematic data scraping disputes that may prove more resistant to the fair use defenses that benefited Anthropic, at least in part.

Chapter 2: The “AI and the Law” Evolution Beyond Copyright; New Legal Theories Emerge

While Judge Alsup's transformative use analysis in Bartz v. Anthropic established that training AI on legally acquired copyrighted works may constitute fair use, plaintiffs' attorneys are rapidly developing alternative legal frameworks that sidestep copyright's fair use protections entirely. The P(AI Lawsuits being based on claims other than copyright infringement) = High.

Terms of Service Violations: In Reddit v. Anthropic, 3:25-cv-05643 (ND California), Reddit alleges that Anthropic has been systematically scraping Reddit’s massive trove of user data and posts in Reddit communities and subreddit spaces. The complaint focuses on whether Anthropic violated the Reddit website terms of service when systematically scraping training data. Anthropic admits that it has scraped data from Reddit and other sources to train its Claude LLM. Unlike copyright claims, these contract-based theories don't face the same transformative use challenges that benefited Anthropic under Judge Alsup's four-factor fair use analysis. Website terms of service typically include explicit prohibitions against automated data collection, creating potential breach of contract claims that exist independently of any copyright considerations. Reddit’s legal strategy in this case seeks to avoid the fair use defense that has complicated other AI cases; it also seeks to leverage Reddit's platform control over access and usage terms. The complaint also claims commercial harm to Reddit's business model without requiring proof of copyright ownership, and it tests whether contract law can provide stronger protection than copyright for platform data. Reddit claims that this scraping of user data puts Reddit in the position of violating its own warranties to users to protect their data, even when that data has been deleted by the user. Scraping by Anthropic and other AI entities, Reddit argues, upends its entire information gathering and storage model.

Reddit brings claims for Breach of the Reddit User Agreement Contract (which it argues binds all platform users, including Anthropic); Unjust Enrichment; Trespass to Chattels as a tort claim; and Tortious Interference with Reddit’s contracts with its users and licensees. The case was removed from California state court to federal court, and Reddit has now filed a motion to remand it to state court. There have been no substantive pleadings since the complaint, and a hearing on the Motion to Remand is set for October 10, 2025. We will follow this ongoing case and report more in future posts.

The federal Computer Fraud and Abuse Act (CFAA): Plaintiffs are increasingly pursuing unauthorized computer access claims under the CFAA, arguing that data scraping constitutes unauthorized access to computer systems regardless of whether the scraped content itself enjoys copyright protection. This federal statute provides potential criminal and civil liability for accessing computers "without authorization," creating a pathway around copyright's fair use doctrine. It is a complicated statute and we will explore the CFAA in greater detail as the scraping cases move further along on CFAA theories.

Unjust Enrichment and Quasi-Contract Claims: These theories focus on AI companies profiting from data they obtained without permission or compensation, creating quasi-contractual obligations to pay for value received. Unlike copyright infringement, unjust enrichment claims don't require proving ownership of specific intellectual property rights. Reddit brings such claims in its case against Anthropic.

Chapter 3: Data Scraping and Terms of Service; A Legal Framework Distinct from Copyright

Systematic data scraping presents unique legal challenges that distinguish it from the straightforward copyright issues addressed in Judge Alsup's Bartz ruling. When AI companies deploy automated systems to harvest data from websites, social media platforms, or proprietary databases, they potentially violate multiple legal principles simultaneously, none of which depend on copyright ownership. In Ryanair v. Booking.com, the U.S. District Court for the District of Delaware in a 2024 decision held Booking.com liable under the CFAA for “screen (data) scraping” of Ryanair data from the Ryanair reservations website. See my prior blog post on that case.

Contract Law Implications: Most commercial websites include terms of service that explicitly prohibit automated data collection, bot access, or systematic downloading. Even when the underlying content lacks copyright protection, violating these contractual terms creates independent liability. Federal courts have generally enforced such terms, particularly when data scraping interferes with website functionality or violates clearly stated access restrictions. Unlike the copyright fair use analysis that saved Anthropic from liability for legitimately acquired books, contract violations typically don't enjoy fair use-style defenses.

Trespass to Chattels: This traditional common law theory applies when automated scraping consumes server resources, degrades website performance, or interferes with normal operations. The claim focuses entirely on the method of data acquisition rather than the nature of the content itself. Server overload, bandwidth consumption, or interference with legitimate users can establish liability regardless of copyright status.

State Law Privacy and Right of Publicity Claims: When training data includes personal information, biometric data, or individually identifiable content (which it does in the Reddit database), AI companies may face state-law privacy violations that operate independently of federal copyright protections. California's robust privacy framework, for example, provides potential claims that don't depend on copyright ownership.

Chapter 4: Enterprise AI; Upstream Liability in the Training Data Supply Chain

The distinction between copyright infringement and data acquisition violations creates significant implications for companies deploying enterprise AI solutions. While Judge Alsup's ruling in Bartz v. Anthropic provides comfort that using AI trained on copyrighted works may be permissible under fair use, it offers no protection against claims that the training data was improperly acquired through pirated sites, contract violations or unauthorized access. The P(Lawsuits based on business use of AI Enterprise Software) = High.

Upstream Liability Risks: Companies implementing enterprise AI platforms like Nvidia AI Enterprise, Google's Vertex AI, or Microsoft's Azure OpenAI Service may face liability for training data acquisition methods they neither chose nor control. If the underlying AI models were trained on improperly scraped data that violated terms of service or CFAA provisions, enterprise customers could face claims even if their specific deployment would qualify for copyright fair use under the Bartz precedent.

Indemnification Gap Analysis: Standard enterprise software licenses often limit vendor liability and may not cover claims related to training data acquisition methods. The Bartz settlement demonstrates that training data liability can reach billions of dollars, yet many enterprise AI contracts provide limited indemnification coverage. Companies should carefully review vendor agreements to understand whether they're protected against upstream data scraping claims that fall outside copyright law.

Due Diligence Evolution: As training data litigation expands beyond copyright, companies may need to conduct enhanced due diligence on AI vendors' data acquisition practices. This evaluation should extend beyond copyright compliance to examine terms of service violations, CFAA risks, and privacy law adherence—areas not addressed by the Bartz ruling's fair use framework.

Chapter 5: Trade Secret Law Per the DTSA Intersects with AI Legal Issues

The federal Defend Trade Secrets Act (DTSA), which is part of the federal Economic Espionage Act, is a powerful statute enacted in 2016. It provides both criminal sanctions against a misappropriator of trade secrets and civil damages and penalties against those found liable to a trade secret owner for misappropriation. The intersection of AI training and enterprise AI data litigation with trade secret law under the DTSA creates additional complexity that extends well beyond the copyright issues resolved in Bartz v. Anthropic. While Judge Alsup's ruling addressed copyright fair use, it provides no guidance on trade secret misappropriation claims. DTSA claims will undoubtedly be brought with regard to AI data collection or scraping of proprietary business information. The P(DTSA Lawsuits relating to AI) = Very High.

Proprietary Database Protection: Compiled databases, customer lists, pricing information, and proprietary datasets often qualify for trade secret protection when they provide competitive advantage and have been reasonably protected by the trade secret owner. AI companies that scrape such protected information may face DTSA claims that operate independently of any copyright considerations. Unlike copyright's fair use doctrine, trade secret law provides no general fair use defense. Trade secret owners, however, must be ever more careful to protect their trade secrets when they upload data to the cloud or AI platforms. This trade secret conundrum works in both directions. I will write more on this in future posts.

Discovery's "Black Box" Problem: Trade secret claims in training data cases create the same algorithmic transparency challenges we've identified in insurance and bad faith AI litigation like Estate of Lokken v. UnitedHealth Group. Courts must balance plaintiffs' need to understand how their proprietary data was acquired and used against defendants' interests in protecting their own trade secrets around training methodologies and algorithmic implementations. Insurance carriers are increasingly using AI enterprise software to underwrite insurance policies, to make claims decisions, and to evaluate health care claims, as in the Lokken case. This portends that insurance carriers will continue to see a growing number of breach of contract claims and bad faith lawsuits over claims decisions made by bots. P(Insurance Lawsuits based on AI use) = High.

Cross-Border Complexity: Many training datasets include information scraped from international sources, potentially implicating foreign trade secret laws, data protection regulations like GDPR, and cross-border discovery challenges that didn't arise in the domestic copyright issues addressed in Bartz.

Chapter 6: Strategic Defense Evolution in AI Claims; Business Must Be Proactive

The expansion beyond copyright to breach of contract, unauthorized access, and trade secret claims requires defense strategies that address multiple legal theories simultaneously—a more complex challenge than the copyright-focused analysis that proved successful for Anthropic.

Multi-Vector Documentation: This simply means “better” documentation and proof of how a company collects data. While the Bartz case emphasized the importance of legally acquiring copyrighted works, companies now need comprehensive documentation covering terms of service compliance, automated access protocols, rate limiting measures, and respect for robots.txt files. Technical logs showing respectful scraping practices may provide defenses against trespass and CFAA claims if a lawsuit arises.
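As one illustration of what such technical logs might contain, the sketch below uses Python's standard `urllib.robotparser` module to record, for each URL a crawler intends to visit, whether the site's robots.txt permits access and what crawl delay the site requests. The robots.txt contents, the `ExampleBot` user agent, the `example.com` URLs, and the record fields are all hypothetical, and a log like this is only one possible piece of evidence, not a guarantee of compliance with any terms of service or statute.

```python
import urllib.robotparser
from datetime import datetime, timezone

# Hypothetical robots.txt contents; a real crawler would fetch the site's own file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

def build_parser(robots_text: str) -> urllib.robotparser.RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser

def audit(parser, user_agent: str, urls: list) -> list:
    """Return one log record per URL showing what robots.txt permits."""
    delay = parser.crawl_delay(user_agent)  # None if the site sets no delay
    records = []
    for url in urls:
        records.append({
            "checked_at": datetime.now(timezone.utc).isoformat(),
            "user_agent": user_agent,
            "url": url,
            "allowed": parser.can_fetch(user_agent, url),
            "crawl_delay_seconds": delay,
        })
    return records

records = audit(
    build_parser(ROBOTS_TXT),
    "ExampleBot",  # hypothetical crawler name
    ["https://example.com/public/page", "https://example.com/private/data"],
)
for record in records:
    print(record)
```

A crawler would skip any URL whose record shows `allowed` as false and wait at least the recorded delay between requests; retaining the records creates a contemporaneous trail of those decisions.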

Licensing Strategy Evolution: This means Proactive Licensing of data from its owner. The trend toward post-training licensing agreements exemplified by Anthropic's settlement may not cure initial contract violations or unauthorized access claims. Proactive licensing arrangements become increasingly critical when legal theories extend beyond copyright's retroactive fair use protections. Companies need to consider licensing before collecting massive data and later getting sued.

Vendor Risk Management: This means Vendor Due Diligence like investigating where an AI Enterprise vendor got its training data for the software that you intend to purchase. Companies using enterprise AI must implement supply chain due diligence that evaluates training data acquisition methods across all applicable legal frameworks, not just copyright compliance. This includes reviewing vendor practices for terms of service adherence, CFAA compliance, and trade secret protection.

The P(need for businesses to be far more proactive in ensuring legal collection of data) = High.

Chapter 7: Looking Forward: The Multi-Theory Training Data Defense Framework

Judge Alsup's favorable fair use ruling in Bartz v. Anthropic represents a significant victory for AI development, but it addresses only one component of an increasingly complex legal landscape. As training data and AI Enterprise litigation evolve beyond copyright, companies need comprehensive strategies that account for the full spectrum of potential legal theories against them – or on their own behalf if their business information has been violated.

Integrated Risk Assessment: Legal analysis must simultaneously evaluate, inter alia, copyright fair use, contract compliance, unauthorized access risks, trade secret exposure, tort claims for unfair competition and tortious interference, and privacy law adherence. The Bartz precedent provides copyright protection but leaves other legal vulnerabilities unaddressed.

Supply Chain Liability Planning: The enterprise AI ecosystem creates potential liability chains extending from training data acquisition through final deployment. Companies must understand and plan for potential liability arising from upstream vendor practices that may violate non-copyright legal requirements.

Evolving Compliance Standards: As courts develop precedents around acceptable data acquisition practices across multiple legal frameworks, compliance requirements will continue evolving beyond the copyright considerations addressed in Bartz. The bright-line rule against using pirated materials may expand to encompass other forms of improper data acquisition.

The next phase for lawyers "Defending the Algorithm™" requires understanding that AI legal challenges extend far beyond the copyright framework established in Bartz v. Anthropic. While Anthropic's settlement demonstrates that fair use may protect AI training on legally acquired copyrighted works, it provides no shelter against the expanding universe of contract, unauthorized access, tort and trade secret claims that challenge the fundamental data acquisition methods enabling modern AI systems.

Companies that prepare for this broader legal landscape—encompassing not just copyright but the entire ecosystem of laws governing data acquisition and use—will be better positioned to navigate the complex intersection of technological innovation and legal compliance in the AI era. We will explore these new legal theories in upcoming posts. The P(increasing litigation based on the behavior of the AI ecosystem) = Very High.

My thanks to Claude® for its assistance with the Defending the Algorithm series. Houston Harbaugh's intellectual property and AI litigation team continues to monitor developments in training data and AI enterprise litigation across all legal frameworks. For questions about how these evolving legal standards may affect your AI deployment, data acquisition practices, or enterprise AI risk management, please contact Henry M. Sneath in our office at email address sneathhm@hh-law.com or 412-288-4013. Thanks for reading or listening. See you next time.

About Us

The IP, Technology, AI and Trade Secret attorneys at Houston Harbaugh, P.C., have extensive courtroom, jury and non-jury trial and tribunal experience representing industrial, financial, individual and business clients in IP and AI counseling, infringement litigation, trade secret protection and misappropriation litigation, and the overall creation and protection of intellectual property rights in an AI driven world. Our team combines extensive litigation experience with comprehensive knowledge of rapidly evolving AI and technology landscapes. From our law office in Pittsburgh, we serve a diverse portfolio of clients across Pennsylvania and other jurisdictions, providing strategic counsel in patent disputes, trade secret protection, IP portfolio development, and AI-related intellectual property matters. Our Trade Secret Law Practice is federally trademark identified by DTSALaw®. We practice before the United States Patent and Trademark Office (USPTO) and we and our partners and affiliates apply for and prosecute applications for patents, trademarks and copyrights. Whether navigating AI implementation challenges, defending against infringement claims, or developing comprehensive IP strategies for emerging technologies, our team provides sophisticated representation for industrial leaders, technology companies, financial institutions, and innovative businesses in Pennsylvania and beyond.

IP section chair Henry Sneath, in addition to his litigation practice, is currently serving as a Special Master in the United States District Court for the Western District of Pennsylvania in complex patent litigation by appointment of the court. Pittsburgh, Pennsylvania Intellectual Property Lawyers | Infringement Litigation | Attorneys | Patent, Trademark, Copyright | DTSALaw® | AI | Artificial Intelligence | Defending the Algorithm™

Henry M. Sneath - Practice Chair

Co-Chair of Houston Harbaugh’s Litigation Practice, and Chair of its Intellectual Property Practice, Henry Sneath is a trial attorney, mediator, arbitrator and Federal Court Approved Mediation Neutral and Special Master with extensive federal and state court trial experience in cases involving commercial disputes, breach of contract litigation, intellectual property matters, patent, trademark and copyright infringement, trade secret misappropriation, DTSA claims, cyber security and data breach prevention, mitigation and litigation, probate trusts and estates litigation, construction claims, eminent domain, professional negligence lawsuits, pharmaceutical, products liability and catastrophic injury litigation, insurance coverage, and insurance bad faith claims. He is currently serving as both lead trial counsel and local co-trial counsel in complex business and breach of contract litigation, patent infringement, trademark infringement and Lanham Act claims, products liability and catastrophic injury matters, and in matters related to cybersecurity, probate trusts and estates, employment, trade secrets, federal Defend Trade Secrets Act (DTSA) and restrictive covenant claims. Pittsburgh, Pennsylvania Business Litigation and Intellectual Property Lawyer. DTSALaw® PSMNLaw® PSMN®