
    Apple, NVIDIA, and Anthropic allegedly trained AI models with YouTube transcripts without permission

    By Ruchika Oberoi, 17 July 2024

    The dataset contains transcripts of YouTube videos from some of the platform's most popular creators.

    A new investigation by Proof News has found that some of the world's largest technology companies trained their AI models on a dataset containing transcripts of more than 173,000 YouTube videos without permission. The dataset, created by the nonprofit EleutherAI, includes transcripts of videos from more than 48,000 channels and was used by Apple, NVIDIA, and Anthropic, among other organizations. The findings highlight an uncomfortable reality about AI: the technology is largely built on data taken from creators without their knowledge or compensation.

    The dataset does not include any videos or images from YouTube. It does include video transcripts from some of the platform's most popular creators, such as Marques Brownlee and MrBeast, as well as from major news publishers, including The New York Times, the BBC, and ABC News. Subtitles from videos belonging to NewTechMania are also part of the dataset.

    Brownlee wrote on X that Apple has sourced data for its AI from several companies. “One of them scraped tons of data/transcripts from YouTube videos, including mine,” he wrote, adding that “this is going to be an evolving problem for a long time.”

    Apple has sourced data for their AI from several companies

    One of them scraped tons of data/transcripts from YouTube videos, including mine

    Apple technically avoids "fault" here because they're not the ones scraping

    But this is going to be an evolving problem for a long time https://t.co/U93riaeSlY

    — Marques Brownlee (@MKBHD) July 16, 2024

    A Google representative said that earlier statements by YouTube CEO Neal Mohan, who said that companies using YouTube's data to train AI models would be violating the platform's terms of service, still stand. NewTechMania asked Apple, NVIDIA, Anthropic, and EleutherAI for comment, but none of them responded.

    So far, AI companies have not been transparent about the data they use to train their models. Earlier this month, artists and photographers criticized Apple for failing to disclose the source of the training data for Apple Intelligence, the company's own take on generative AI, which will be available on millions of Apple devices this year.

    YouTube in particular, as the world's largest collection of videos, is a treasure trove of not only transcripts but also audio, video, and images, which makes it an appealing source of data for training AI models. Earlier this year, Mira Murati, OpenAI's chief technology officer, avoided answering questions from The Wall Street Journal about whether the company used YouTube videos to train Sora, its upcoming AI video generation tool. Murati said at the time: “I’m not going to go into the details of the data that was used, but it was data that was licensed or publicly available.” Alphabet CEO Sundar Pichai has also said that companies using data from YouTube to train their AI models would be violating the platform's terms of service.

    To check whether subtitles from your own YouTube videos, or from your favorite channels, are included in the dataset, use the lookup tool provided by Proof News.
