    AI startup Anthropic is accused of violating anti-scraping protocols by websites

    By Ruchika Oberoi, 28 July 2024

    iFixit and Freelancer say that Anthropic’s bot crawled their websites aggressively.

    Freelancer has accused Anthropic, the artificial intelligence startup behind the Claude large language models, of ignoring its “do not crawl” robots.txt policy in order to scrape data from its website. Meanwhile, Kyle Wiens, the CEO of iFixit, said that Anthropic has disregarded the website’s policy prohibiting the use of its content to train AI models. Matt Barrie, the chief executive of Freelancer, told The Information that Anthropic’s ClaudeBot is “the most aggressive scraper by far.” His website reportedly received 3.5 million visits from the company’s crawler within four hours, which is “probably about five times the volume of the number two” AI crawler. Similarly, Wiens posted on X/Twitter that Anthropic’s bot hit iFixit’s servers a million times in the span of 24 hours. “You’re not only taking our content without paying, you’re tying up our devops resources,” he wrote.
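To put the figures quoted above on a common scale, the short sketch below converts them into average requests per second. It assumes each reported visit or hit is a single request and that traffic was spread evenly over the period, neither of which the reports confirm; the function name is purely illustrative.

```python
def average_rate_per_second(requests: int, hours: float) -> float:
    """Average requests per second, assuming traffic is spread evenly."""
    return requests / (hours * 3600)

# Figures as reported: 3.5 million in four hours (Freelancer),
# one million in twenty-four hours (iFixit).
freelancer_rate = average_rate_per_second(3_500_000, 4)
ifixit_rate = average_rate_per_second(1_000_000, 24)

print(f"Freelancer: about {freelancer_rate:.0f} requests per second")
print(f"iFixit: about {ifixit_rate:.1f} requests per second")
```

Under those assumptions, the Freelancer figure works out to roughly 243 requests per second and the iFixit figure to roughly 11.6. Real crawler traffic tends to arrive in bursts, so peak rates could have been higher than these averages.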

    In June, Wired accused another AI company, Perplexity, of crawling its website despite the presence of the Robots Exclusion Protocol, also known as robots.txt. A robots.txt file typically tells web crawlers which pages they may and may not access. Compliance is voluntary, and for the most part it is malicious bots that have simply ignored it. A few days after Wired’s article came out, TollBit, a startup that connects AI companies with content publishers, said that Perplexity is not the only one circumventing robots.txt signals. TollBit did not name names, but Business Insider reported that it had learned that OpenAI and Anthropic were also ignoring the protocol.

    According to Barrie, Freelancer initially tried to refuse the bot’s access requests, but in the end it had to block Anthropic’s crawler entirely. “This is egregious scraping [which] makes the site slower for everyone operating on it and ultimately affects our revenue,” he added. As for iFixit, Wiens said that the website has alarms set for high traffic, and that his employees were woken up at three in the morning by Anthropic’s activity. The firm’s crawler stopped scraping iFixit after the site added a line to its robots.txt file that specifically disallows Anthropic’s bot.
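As a rough illustration of what a rule that singles out one crawler looks like, and of how a compliant crawler would check it, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content and the example.com URL are hypothetical; the reports do not quote the exact line iFixit added, and only the `ClaudeBot` name comes from the reporting.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler from the whole site
# and leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""

def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Placeholder URL used only for illustration.
url = "https://example.com/guide/1"
claudebot_ok = is_allowed(ROBOTS_TXT, "ClaudeBot", url)
other_ok = is_allowed(ROBOTS_TXT, "SomeOtherBot", url)

print("ClaudeBot allowed:", claudebot_ok)
print("SomeOtherBot allowed:", other_ok)
```

The check only matters if the crawler chooses to run it. Nothing in robots.txt enforces the rule on the server side, which is why a site may still fall back on blocking a crawler outright.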

    Anthropic told The Information that it respects robots.txt and that its crawler “respected that signal when iFixit implemented it.” It also said that it strives “for minimal disruption by being thoughtful about how quickly [it crawls] the same domains,” and that it is currently investigating the matter.

    AI companies use crawlers to collect content from websites that can be used to train their generative AI technology. As a result, they have been the target of multiple lawsuits in which publishers accuse them of copyright infringement. To head off further lawsuits, companies such as OpenAI have been striking deals with publishers and websites. OpenAI’s content partners so far include News Corp, Vox Media, the Financial Times, and Reddit. Wiens appears open to a deal for the articles on iFixit’s how-to-repair website as well: he told Anthropic in a tweet that he is willing to have a conversation about licensing content for commercial use.

    Hey @AnthropicAI: I get you're hungry for data. Claude is really smart! But do you really need to hit our servers a million times in 24 hours?

    You're not only taking our content without paying, you're tying up our devops resources. Not cool.

    — Kyle Wiens (@kwiens) July 24, 2024
