Beverly Hills Examiner

    Amazon Is Investigating Perplexity Over Claims of Scraping Abuse

    June 28, 2024

    Amazon’s cloud division has launched an investigation into Perplexity AI. At issue is whether the AI search startup is violating Amazon Web Services rules by scraping websites that attempted to prevent it from doing so, WIRED has learned.

    An AWS spokesperson, who talked to WIRED on the condition that they not be named, confirmed the company’s investigation of Perplexity. WIRED had previously found that the startup—which has backing from the Jeff Bezos family fund and Nvidia, and was recently valued at $3 billion—appears to rely on content from scraped websites that had forbidden access through the Robots Exclusion Protocol, a common web standard. While the Robots Exclusion Protocol is not legally binding, terms of service generally are.

    The Robots Exclusion Protocol is a decades-old web standard that involves placing a plaintext file (like wired.com/robots.txt) on a domain to indicate which pages should not be accessed by automated bots and crawlers. While companies that use scrapers can choose to ignore this protocol, most have traditionally respected it. The Amazon spokesperson told WIRED that AWS customers must adhere to the robots.txt standard while crawling websites.
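
As a rough illustration of how a well-behaved crawler might consult such a file before fetching a page, here is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the `ExampleBot` and `OtherBot` user agents, the `example.com` URLs, and the helper names `build_parser` and `is_allowed` are all hypothetical and chosen only for illustration; they are not taken from any real site's configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for illustration only).
# It blocks one named crawler everywhere and keeps all other bots
# out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    print(is_allowed(rules, "ExampleBot", "https://example.com/story"))       # False
    print(is_allowed(rules, "OtherBot", "https://example.com/story"))         # True
    print(is_allowed(rules, "OtherBot", "https://example.com/private/page"))  # False
```

Nothing in this check is enforced by the website itself. The crawler decides whether to run it and whether to honor the answer, which is why the protocol depends on voluntary compliance.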

    “AWS’s terms of service prohibit customers from using our services for any illegal activity, and our customers are responsible for complying with our terms and all applicable laws,” the spokesperson said in a statement.

    Scrutiny of Perplexity’s practices follows a June 11 report from Forbes that accused the startup of stealing at least one of its articles. WIRED investigations confirmed the practice and found further evidence of scraping abuse and plagiarism by systems linked to Perplexity’s AI-powered search chatbot. Engineers for Condé Nast, WIRED’s parent company, block Perplexity’s crawler across all its websites using a robots.txt file. But WIRED found the company had access to a server using an unpublished IP address—44.221.181.252—which visited Condé Nast properties at least hundreds of times in the past three months, apparently to scrape Condé Nast websites.

    The machine associated with Perplexity appears to be engaged in widespread crawling of news websites that forbid bots from accessing their content. Spokespeople for The Guardian, Forbes, and The New York Times also say they detected the IP address on their servers multiple times.

    WIRED traced the IP address to a virtual machine known as an Elastic Compute Cloud (EC2) instance hosted on AWS, which launched its investigation after we asked whether using AWS infrastructure to scrape websites that forbade it violated the company’s terms of service.
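
The article does not describe how the address was traced. One common approach, sketched below under stated assumptions, is to compare an address seen in server logs against the network ranges a hosting provider publishes. The provider names and prefixes here are hypothetical stand-ins drawn from address blocks reserved for documentation, and `find_owner` is an illustrative helper, not part of any real tool.

```python
import ipaddress
from typing import Optional

# Hypothetical provider prefixes (assumption): these are reserved
# documentation ranges, not any real provider's published list.
SAMPLE_CLOUD_PREFIXES = {
    "example-cloud-a": ["198.51.100.0/24"],
    "example-cloud-b": ["203.0.113.0/24", "2001:db8::/32"],
}


def find_owner(ip: str, prefixes: dict[str, list[str]]) -> Optional[str]:
    """Return the first provider whose prefix contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for owner, cidrs in prefixes.items():
        for cidr in cidrs:
            network = ipaddress.ip_network(cidr)
            # Only compare addresses and networks of the same IP version.
            if addr.version == network.version and addr in network:
                return owner
    return None


if __name__ == "__main__":
    print(find_owner("203.0.113.57", SAMPLE_CLOUD_PREFIXES))  # example-cloud-b
    print(find_owner("192.0.2.10", SAMPLE_CLOUD_PREFIXES))    # None
```

A match like this identifies only the hosting provider that owns the address range, not the customer operating the machine.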

    Last week, Perplexity CEO Aravind Srinivas responded to WIRED’s investigation first by saying the questions we posed to the company “reflect a deep and fundamental misunderstanding of how Perplexity and the Internet work.” Srinivas then told Fast Company that the secret IP address WIRED observed scraping Condé Nast websites and a test site we created was operated by a third-party company that performs web crawling and indexing services. He refused to name the company, citing a nondisclosure agreement. When asked if he would tell the third party to stop crawling WIRED, Srinivas replied, “It’s complicated.”


