California Now Requires AI Companies to Reveal If They Trained on Your Photos
Quick take: California's Generative AI Training Data Transparency Act (AB 2013) is now in effect. Every AI company serving Californians must publicly disclose what data they used to train their models - including whether they scraped your photos. For the first time, you have a legal right to know what went into an AI training dataset, and whether the platforms where you post were among its sources. Here's what the law requires, what companies have disclosed so far, and what it means for your photo privacy.

What the law actually requires
AB 2013, California's Generative AI Training Data Transparency Act, took effect on January 1, 2026. It applies to every generative AI system that's publicly available to Californians and was released or substantially modified since January 2022. That covers basically every major AI model you've heard of.
The law requires developers to publish a 'high-level summary' of the datasets used to train their AI systems. This includes 12 categories of information: the sources of the data, whether it contains copyrighted material, whether personal information was included, and the types of data in each dataset.
In practical terms, if an AI company scraped publicly available photos from Flickr, Reddit, or Instagram to train an image generation model, they now have to say so. If they licensed a dataset containing millions of personal photos, they have to disclose that too.
What companies have revealed so far
The early disclosures have been illuminating. Several major AI developers have published their training data summaries, and the details confirm what privacy researchers suspected for years - personal photos from social media platforms, stock photo sites, and web scrapes are a fundamental ingredient in modern AI models.
Not every company has been forthcoming. Elon Musk's xAI has already filed a legal challenge against the law, arguing that requiring disclosure of training datasets threatens trade secrets. The case signals that some companies would rather fight the law in court than reveal what data they used.
That resistance is telling. If the training data was properly licensed and contained no personal information, there'd be little reason to fight the disclosure requirement. The pushback suggests that some training datasets contain exactly the kind of personal data that users would object to.

Were your photos used to train AI?
If you've ever posted a photo publicly on social media, the honest answer is: probably. Research has repeatedly shown that large-scale web scraping datasets like LAION-5B - which was used to train Stable Diffusion and other models - contain millions of personal photos scraped from social platforms without consent.
California's law doesn't give you an individual right to check whether a specific photo of yours was included. But the disclosures will tell you whether the source platforms you use (Instagram, Flickr, Reddit, DeviantArt) were scraped for training data. If you posted publicly on those platforms, your photos were likely included.
Photos shared privately through messaging apps or dedicated private sharing platforms were generally not included in public scraping datasets. The distinction between public and private sharing has never mattered more.
How this compares to EU regulations
The EU AI Act, which began enforcement in 2025, takes a broader approach. It classifies AI systems by risk level and imposes requirements ranging from transparency obligations to outright bans on certain uses. The EU also grants individuals stronger rights regarding their personal data through GDPR.
California's AB 2013 is narrower - it focuses specifically on training data transparency rather than regulating how AI systems are used. But it's the first US law to require AI companies to tell you what's in their training data. That's a significant step for a country that still doesn't have a comprehensive federal privacy law.
There's a catch, though. President Trump's executive order from December 2025 proposes a federal AI policy framework that could preempt state laws like AB 2013. The tension between state-level privacy protections and federal deregulation is already playing out, and your photo privacy rights may depend on which side wins.
What this means for how you share photos
The California law makes one thing crystal clear: public photos are fair game. If your photos are publicly accessible on any platform, they can be scraped, and now at least the companies that used them have to admit it.
Private photo sharing changes the equation entirely. Photos shared through private links, password-protected albums, or end-to-end encrypted services aren't accessible to web scrapers. They can't end up in training datasets because they were never publicly indexed in the first place.
How to protect your photos going forward
- Check company disclosures. Major AI companies are now required to publish training data summaries on their websites. Look up the platforms you use to see if they were listed as data sources.
- Audit your public posts. Any photo that's publicly accessible on social media could have been scraped for AI training. Consider making old posts private or deleting photos you no longer want public.
- Share privately by default. When sharing photos with family and friends, use private links rather than public posts. This keeps your photos out of web scraping datasets entirely.
- Choose platforms that don't train on your data. Viallo stores photos on European infrastructure, strips metadata from shared images, and never uses your photos for AI training. Recipients view albums through private links without needing an account.
- Use metadata stripping. EXIF data embedded in your photos contains GPS coordinates, device information, and timestamps. Stripping this data before sharing, as sketched below this list, prevents it from being harvested alongside your images.
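
If you want to strip metadata yourself before uploading, a few lines of Python with the Pillow library are enough. This is a minimal sketch rather than a recommendation from the law or from any specific platform; the file names are placeholders.

```python
# Minimal EXIF-stripping sketch using Pillow (pip install Pillow).
# File names below are placeholders.
from PIL import Image

def strip_exif(src_path: str, dst_path: str) -> None:
    """Re-save an image with pixel data only, dropping EXIF and other metadata."""
    with Image.open(src_path) as img:
        # Copy just the pixels into a fresh image; GPS, device, and timestamp
        # tags live in metadata blocks that are not carried over.
        clean = Image.new(img.mode, img.size)
        clean.putdata(list(img.getdata()))
        clean.save(dst_path)

strip_exif("vacation.jpg", "vacation_clean.jpg")
```

Command-line tools can do the same job without any code - for example, exiftool's `exiftool -all= photo.jpg` removes all metadata from a copy of the file.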

Frequently Asked Questions
Does the California law let me opt out of AI training?
No. AB 2013 requires disclosure of training data but doesn't create an opt-out right. You can't retroactively remove your photos from models that already trained on them. The law focuses on transparency - knowing what happened - rather than giving individuals control over their data after the fact.
Does this law apply outside California?
AB 2013 applies to AI systems available to Californians, which includes essentially every major AI platform. The disclosures are public, so anyone can read them. However, the enforcement mechanism is California-specific. Other states may pass similar laws.
Which AI companies have disclosed their training data?
Major AI developers including OpenAI, Google, and Meta have begun publishing training data summaries. Elon Musk's xAI has challenged the law in court rather than comply. Check each company's website for their AB 2013 disclosure page.
Can I check if a specific photo was used for training?
Not through this law. AB 2013 requires high-level dataset summaries, not individual image lookups. Tools like 'Have I Been Trained' (haveibeentrained.com) let you search some training datasets for specific images, but coverage is limited.
How do private photo sharing platforms protect against AI scraping?
Private platforms like Viallo serve photos through authenticated, private links that aren't indexed by search engines or accessible to web scrapers. Since AI training datasets are built primarily from public web scrapes, private sharing keeps your photos out of those datasets entirely.
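To make the mechanics concrete, here is a rough sketch of how a private-link photo server can keep albums out of search indexes and crawl-based datasets. It uses Flask purely for illustration; the route, token store, and headers are assumptions for the example, not Viallo's actual implementation.

```python
# Illustrative sketch only: serving an album behind an unguessable private link
# and telling crawlers not to index it. Not any specific platform's real code.
from flask import Flask, abort, make_response

app = Flask(__name__)

# Hypothetical token store; a real service would mint long random tokens
# (e.g. secrets.token_urlsafe) and keep them in a database.
ALBUM_TOKENS = {"a1b2c3d4e5f6g7h8": "family-vacation-2025"}

@app.route("/album/<token>")
def view_album(token: str):
    if token not in ALBUM_TOKENS:
        abort(404)  # the unguessable link acts as the access check
    resp = make_response(f"Album: {ALBUM_TOKENS[token]}")
    # Ask well-behaved crawlers not to index, follow, or archive the page.
    resp.headers["X-Robots-Tag"] = "noindex, nofollow, noarchive"
    resp.headers["Cache-Control"] = "private, no-store"
    return resp

if __name__ == "__main__":
    app.run()
```

The headers only deter well-behaved crawlers; the stronger protection is that the link is never published anywhere a scraper could find it in the first place.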