California requires AI companies to disclose training data in 2026

California has passed Assembly Bill 2013, which requires generative AI developers to publicly disclose their training data starting January 1, 2026. The Generative Artificial Intelligence Training Data Transparency Act is one of the broadest AI disclosure mandates in the U.S., and it could strengthen copyright holders' claims in pending lawsuits while adding compliance burdens for companies operating in the state.

What you should know: The law mandates detailed public disclosures about datasets used to train AI models, including sources, availability, size, and whether copyrighted or personal data are included.

  • Developers must publish information on their websites about data sources, whether datasets are publicly available or proprietary, their size and type, and the time period during which data was collected (an illustrative sketch of these fields appears after this list).
  • Bloomberg Law described AB 2013 as among the most comprehensive U.S. rules on AI disclosure, requiring companies to publish details about the data that trains their models.
  • Compliance presents significant challenges, particularly for models that have evolved over time using data from diverse sources that may lack clear ownership records or licensing information.
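To make the required categories concrete, here is a minimal sketch of how a developer might organize a disclosure record before publishing it. AB 2013 does not prescribe any schema, file format, or field names; everything below is an illustrative assumption derived from the disclosure categories listed above, not the statute's own language.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TrainingDataDisclosure:
    """Hypothetical record mirroring AB 2013's disclosure categories.

    The law requires a public description of training datasets but does
    not mandate this (or any) structure; field names are illustrative.
    """
    dataset_name: str
    sources: List[str]                    # e.g., named corpora or URLs
    publicly_available: bool              # public vs. proprietary dataset
    size_description: str                 # approximate size, e.g., "~1.5 TB of text"
    data_types: List[str]                 # e.g., ["text", "images"]
    collection_period: str                # time span over which data was collected
    contains_copyrighted_material: bool
    contains_personal_information: bool


# Example of the kind of entry a developer might publish on its website.
example = TrainingDataDisclosure(
    dataset_name="ExampleCorpus v1",
    sources=["public web crawl", "licensed news archive"],
    publicly_available=False,
    size_description="~1.5 TB of text",
    data_types=["text"],
    collection_period="2019-2023",
    contains_copyrighted_material=True,
    contains_personal_information=True,
)
```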

Why this matters: The disclosure requirements could make it easier to trace which datasets were used in training, potentially strengthening claims from copyright holders in ongoing litigation.

  • Generative AI firms are already navigating lawsuits alleging that models were trained on copyrighted works without permission.
  • Researchers argue that transparency could provide a foundation for independent audits and risk assessments of AI systems.
  • California’s regulatory approach often shapes national technology policy, from privacy rules to emissions standards, giving this law significance beyond state borders.

Industry pushback: Business and technology executives are expressing concerns about the law’s potential impact on innovation and development.

  • According to The Wall Street Journal, executives warned the bill could have a “chilling effect” on development in California, with startups particularly exposed to compliance burdens.
  • Some analysts argue California’s targeted strategy may prove more durable than broader regulatory approaches.
  • Microsoft’s Chief Scientist Eric Horvitz offered a contrasting view, suggesting that oversight “done properly” can accelerate AI advances by encouraging responsible data use and building public trust.

The big picture: California’s law signals that AI transparency may transition from voluntary best practice to mandatory requirement across industries.

  • The broader policy debate centers on whether transparency alone will be sufficient for AI governance.
  • Colorado has delayed its AI act implementation to June 2026, while financial institutions are independently moving toward clearer safeguards and responsible scaling practices.
  • If the disclosure requirements prove workable, other states could follow suit, potentially creating a national standard for AI transparency.