⚖️ OpenAI Training Data Issues
OpenAI’s Sora and the Legal Quagmire of Training Data
The launch of OpenAI’s video-generating AI, Sora, has sparked significant discussion about how generative AI models are trained. Indications that Twitch streams and video game walkthroughs may have been included in its training data raise the question of whether this practice infringes intellectual property (IP) rights. Video game content is unusually complex: it often involves multiple layers of copyright, from proprietary game elements to user-generated content and even the recorded gameplay itself. Training on such data without proper licensing invites lawsuits, as seen in ongoing cases against other AI companies accused of infringing creators’ rights.
Implications for Startups Utilizing Generative AI
Tech startups that develop or use generative AI tools should treat the legal questions surrounding Sora as a cautionary tale. Training models on unlicensed material may expose a company to significant litigation risk, including claims of copyright, trademark, or likeness infringement. Even if a court finds that AI models serve a transformative purpose, the risk doesn’t vanish. Users who inadvertently reproduce copyrighted material in their outputs may face legal liability, leaving startups to deal with the reputational and financial fallout. Startups should also scrutinize indemnity clauses, which may offer only limited protection, particularly for individual users.
Preparing for a Future of Stricter IP Enforcement
For startups in this space, proactive measures are crucial. These include investing in licensed datasets, implementing robust filtering to prevent the replication of protected content, and publishing transparent usage policies. Startups should also stay informed about evolving legal standards around fair use and generative AI. Engaging IP counsel early can reduce risk and help ensure compliance with emerging regulations. As the legal landscape becomes more clearly defined, startups that prioritize ethical AI development and respect for intellectual property will be better positioned to navigate it and thrive.
In addition to our newsletter, we offer 60+ free legal templates for companies in the UK, Canada, and the US. These include employment contracts, investment agreements, and more.