Dopious
Cloudflare has found evidence that Perplexity AI is using undeclared web crawlers to scrape websites. These crawlers ignore `robots.txt` directives, the standard rules website owners publish to tell bots what they may and may not fetch.
To avoid being blocked, the crawlers disguise their identity by using generic user-agent strings, making them appear as regular browser traffic. This behavior directly contradicts Perplexity's public claims that they respect the choices of content creators and honor `robots.txt` files. Ultimately, this practice undermines the ability of website owners to control how their content is used by AI companies.
Source: https://blog.cloudflare.com/perplex...rawlers-to-evade-website-no-crawl-directives/
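For context on what "honoring `robots.txt`" means in practice, here's a minimal sketch of how a well-behaved crawler checks the file before fetching, using Python's stdlib `urllib.robotparser`. The `robots.txt` content and bot name below are hypothetical examples, not taken from the Cloudflare report:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the declared crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A crawler that declares itself is blocked by the site's rules.
print(parser.can_fetch("PerplexityBot", "https://example.com/article"))  # False

# The same request under a generic browser-style user agent falls through
# to the "*" rule and is allowed -- which is exactly why disguising the
# user-agent string defeats robots.txt-based opt-outs.
print(parser.can_fetch("Mozilla/5.0", "https://example.com/article"))  # True
```

The point: `robots.txt` is purely honor-system. Enforcement depends on the crawler truthfully identifying itself, so a bot that swaps in a generic user agent sidesteps the whole mechanism.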
BlackHat all the way baby.