Dopious
Cloudflare has discovered that Perplexity AI is using undeclared web crawlers to scrape websites. These crawlers are specifically designed to ignore `robots.txt` directives, which are the standard rules website owners use to block bots.
To avoid being blocked, the crawlers disguise their identity by using generic user-agent strings, making them appear as regular browser traffic. This behavior directly contradicts Perplexity's public claims that they respect the choices of content creators and honor `robots.txt` files. Ultimately, this practice undermines the ability of website owners to control how their content is used by AI companies.
Source: https://blog.cloudflare.com/perplex...rawlers-to-evade-website-no-crawl-directives/
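To see why a disguised user-agent defeats `robots.txt`, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules below are a hypothetical example (a site blocking a declared `PerplexityBot` while allowing everyone else), not Cloudflare's or Perplexity's actual configuration:

```python
# Sketch: robots.txt matching is keyed entirely on the self-declared
# user-agent string, so a crawler that lies about its identity is
# never matched by the Disallow rule aimed at it.
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the declared bot, allow everyone else.
rules = """
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# A crawler that honestly declares itself is denied...
print(parser.can_fetch("PerplexityBot", "/article"))  # False

# ...but the identical request under a generic browser user-agent
# falls through to the wildcard rule and is allowed.
print(parser.can_fetch("Mozilla/5.0", "/article"))  # True
```

This is exactly the loophole the Cloudflare report describes: `robots.txt` is an honor system, and a crawler that presents browser-like user-agent strings simply never triggers the rules written against it.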
BlackHat all the way baby.