Your robots.txt file is one of the first things a search engine crawler requests when it visits your site. A misconfigured one can accidentally block Google from crawling your entire site. Here's everything you need to know.

What Is robots.txt?

It's a plain text file at the root of your website (e.g., yoursite.com/robots.txt) that tells search engine bots which pages they can and cannot crawl. It doesn't reliably prevent pages from appearing in search results (use a noindex meta tag for that); it controls crawling, not indexing.

Basic Syntax

User-agent: * means "all bots." Disallow: /admin/ means "don't crawl the /admin/ folder." Allow: / means "crawl everything." Sitemap: https://yoursite.com/sitemap.xml tells bots where your sitemap is.
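Putting those directives together, a minimal robots.txt might look like this (yoursite.com is a placeholder for your own domain):

```
# Rules for all crawlers
User-agent: *
Disallow: /admin/

Sitemap: https://yoursite.com/sitemap.xml
```

Anything not disallowed is crawlable by default, so an explicit Allow: / line is optional.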

Common Mistakes

Blocking CSS and JS: Don't disallow /css/ or /js/ folders. Google needs these to render your pages properly.

Blocking entire site during development: The most common disaster. Developers add Disallow: / during staging and forget to remove it before launch.
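The staging block in question is only two lines, which is exactly why it slips through launch checklists:

```
User-agent: *
Disallow: /    # blocks every path for every bot
```

Note that Disallow: with an empty value means the opposite (allow everything), so a single stray / flips the meaning of the whole file.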

Not including Sitemap directive: Always point to your sitemap — it helps bots discover all your pages faster.

Case sensitivity: Disallow paths are case-sensitive, so /Admin/ and /admin/ are different paths. Be precise.

Testing Your robots.txt

Google Search Console's robots.txt report shows which version of your file Google last fetched and flags any parse errors. You can also use GetSEOAnalyzer — our audit checks for robots.txt issues and tells you exactly what to fix.
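You can also sanity-check rules locally with Python's standard library. This is a sketch using a hypothetical rule set (in practice, point the parser at your live file); it also demonstrates the case-sensitivity pitfall above:

```python
from urllib import robotparser

# Hypothetical rules for illustration. To test your real file, call
# rp.set_url("https://yoursite.com/robots.txt") and rp.read() instead.
rules = [
    "User-agent: *",
    "Disallow: /admin/",
]

rp = robotparser.RobotFileParser()
rp.parse(rules)

# Paths are case-sensitive: /admin/ is blocked, /Admin/ is not.
print(rp.can_fetch("*", "https://yoursite.com/admin/settings"))  # False
print(rp.can_fetch("*", "https://yoursite.com/Admin/settings"))  # True
```

Running this against every important URL on your site before launch is a cheap way to catch an accidental Disallow: / before Google does.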

Is your robots.txt configured correctly?

Our free audit checks robots.txt, sitemap, and 45 other SEO factors.

Check My robots.txt →