robots.txt

A website file that instructs search bots which paths they may crawl.

robots.txt is a plain-text file placed at a website’s root that tells web crawlers, such as search engine bots, which paths they may or may not crawl. Its rules are grouped by User-agent and use Allow and Disallow lines to express crawler preferences. The format is part of the Robots Exclusion Protocol.
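As a rough illustration of how Allow and Disallow rules are evaluated, the sketch below feeds a made-up robots.txt to Python's standard-library `urllib.robotparser`. The file contents, the `ExampleBot` user agent, and the `example.com` URLs are all assumptions chosen for the example. Python's parser applies the first matching rule, which can differ from how some crawlers resolve overlapping rules, so the sample avoids overlaps.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /search
Allow: /public/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# "ExampleBot" is a placeholder user agent; it falls under the "*" group.
may_fetch_admin = parser.can_fetch("ExampleBot", "https://example.com/admin/login")
may_fetch_public = parser.can_fetch("ExampleBot", "https://example.com/public/index.html")

print("admin allowed: ", may_fetch_admin)
print("public allowed:", may_fetch_public)
```

A well-behaved crawler would run a check like this before each request. Nothing forces a client to do so, which is why the file only expresses a preference.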

In cybersecurity, robots.txt matters because it can reduce unnecessary crawling of private, duplicate, or sensitive areas, but it is not an access-control mechanism. Anyone can fetch the file, and malicious scanners often read it to discover hidden directories, admin panels, backups, or staging paths. Defenders should treat it as a hint to well-behaved bots, not a barrier. For sensitive content, use authentication, authorization, and server-side controls. To keep a page out of search results, use a noindex directive (a robots meta tag or an X-Robots-Tag response header) and leave that page crawlable: a crawler that is blocked by Disallow never fetches the page, so it never sees the noindex.
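Because the file is public, a defender can review it the same way a scanner would. The following is a minimal sketch, not a complete audit tool: it lists Disallow paths whose names hint at something sensitive. The keyword list, the function name `flag_revealing_paths`, and the sample file are assumptions made for illustration.

```python
# Hypothetical keywords that suggest a path is worth a closer look.
SENSITIVE_HINTS = ("admin", "backup", "staging")

# Made-up robots.txt content for the example.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /backups/2023/   # old archives
Disallow: /search
Disallow: /staging-site/
"""


def flag_revealing_paths(robots_txt: str, hints=SENSITIVE_HINTS) -> list[str]:
    """Return Disallow paths that contain any of the given keywords."""
    flagged = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments, then split "Field: value".
        line = raw_line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if not sep or field.strip().lower() != "disallow":
            continue
        path = value.strip()
        if any(hint in path.lower() for hint in hints):
            flagged.append(path)
    return flagged


flagged = flag_revealing_paths(SAMPLE_ROBOTS_TXT)
for path in flagged:
    print("advertised in robots.txt:", path)
```

Any path this flags is already visible to anyone who requests the file, so it should be protected by authentication or removed from the site rather than merely listed under Disallow.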
