LLM DEVELOPERS’ PLATFORM‑SIDE INFRINGEMENT RISKS & LEGAL STRATEGIES: EU‑US‑CHINA COMPARISON
Keywords:
Large language models, Platform liability, Copyright infringement, Generative AI regulation, Comparative law, Safe harbor
Abstract
The commercial deployment of large language models (LLMs) for automated content generation has exposed developers to unprecedented platform-side infringement liabilities under copyright, data privacy, and tort law. Unlike traditional internet intermediaries that passively host user-uploaded content, LLM developers actively generate outputs through algorithmic inference, rendering existing safe harbor frameworks substantially inadequate. This paper conducts a tri-jurisdictional comparative analysis of platform-side infringement risks for LLM developers in the European Union (EU), United States (US), and China. Through doctrinal legal analysis of the EU AI Act (Regulation 2024/1689), EU Digital Services Act (DSA), US Section 230 jurisprudence and pending federal legislation, and China's Interim Measures for Generative AI (2023), this paper identifies three distinct risk categories: output-side copyright infringement, training data-derived privacy violations, and tort liability for defamatory hallucinations. The EU imposes proactive due diligence obligations on high-risk general-purpose AI systems. The US maintains a fragmented approach, with growing judicial and scholarly consensus that Section 230 immunity does not extend to AI-generated content, though final rulings remain pending. China adopts a strict administrative oversight model requiring algorithm filing and direct developer responsibility. The paper proposes a hybrid legal avoidance matrix integrating technical measures, organizational measures, and contractual measures. The paper concludes that LLM developers should develop jurisdictionally adaptive compliance architectures in the absence of globally harmonized AI regulations.