What is the Robots.txt Generator?

Generate properly formatted robots.txt files to control how search engine crawlers access your website. Add user-agent rules, allow/disallow paths, set crawl delays, and include sitemap references — all without manually writing the file.

Add as many user-agent blocks as you need — one for * (all bots), separate ones for Googlebot, Bingbot, AhrefsBot, GPTBot, and others. Each block can list multiple Allow and Disallow paths and an optional Crawl-delay value in seconds. Multiple Sitemap URLs are emitted as standalone lines at the end. The output is the exact text that goes at /robots.txt on your domain.
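The file layout described above can be sketched in a few lines of Python. This is only an illustration of the output format, not the tool's actual implementation; the names `AgentBlock` and `render_robots_txt` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AgentBlock:
    """One user-agent group (hypothetical structure, for illustration only)."""
    user_agent: str
    allow: List[str] = field(default_factory=list)
    disallow: List[str] = field(default_factory=list)
    crawl_delay: Optional[int] = None  # seconds; omitted from output when None


def render_robots_txt(blocks: List[AgentBlock], sitemaps: List[str]) -> str:
    """Render user-agent groups followed by standalone Sitemap lines."""
    lines: List[str] = []
    for block in blocks:
        lines.append(f"User-agent: {block.user_agent}")
        lines.extend(f"Allow: {path}" for path in block.allow)
        lines.extend(f"Disallow: {path}" for path in block.disallow)
        if block.crawl_delay is not None:
            lines.append(f"Crawl-delay: {block.crawl_delay}")
        lines.append("")  # blank line separates groups
    # Sitemap lines are not tied to any group, so they go at the end.
    lines.extend(f"Sitemap: {url}" for url in sitemaps)
    return "\n".join(lines).rstrip("\n") + "\n"


if __name__ == "__main__":
    print(render_robots_txt(
        [AgentBlock("*", disallow=["/admin/"]),
         AgentBlock("Bingbot", allow=["/"], crawl_delay=5)],
        ["https://example.com/sitemap.xml"],
    ))
```

Each group is written as its own paragraph and the Sitemap lines follow the last group, which matches the layout the generator is described as producing.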

How to use

  1. Add user-agent rules (e.g., Googlebot, Bingbot, or * for all) and specify which paths to allow or disallow.
  2. Optionally set crawl delay values and add your sitemap URL.
  3. Copy or download the generated robots.txt file and upload it to your site root.

When to use

  • Blocking AI training crawlers (GPTBot, ClaudeBot, CCBot, Google-Extended) from your content.
  • Keeping crawlers out of admin, search, staging, or duplicate-content paths (this controls crawling, not indexing).
  • Migrating a site and pointing crawlers to a new sitemap or temporarily disallowing everything during a rebuild.

Result

Create rules that allow all crawlers on your site but block /admin/ and /api/ paths, with a sitemap at https://example.com/sitemap.xml.
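A file matching this description might look like the `ROBOTS_TXT` string below. The sketch feeds it to Python's standard-library `urllib.robotparser` as a stand-in for a well-behaved crawler, to check that the rules do what is intended; the test URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed output for the scenario above: all crawlers allowed,
# except under /admin/ and /api/, plus one sitemap reference.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /api/

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

home_ok = parser.can_fetch("Googlebot", "https://example.com/")
admin_ok = parser.can_fetch("Googlebot", "https://example.com/admin/users")
api_ok = parser.can_fetch("Bingbot", "https://example.com/api/v1/items")
sitemaps = parser.site_maps()  # available in Python 3.8+

print(home_ok)   # True
print(admin_ok)  # False
print(api_ok)    # False
print(sitemaps)  # ['https://example.com/sitemap.xml']
```

Paths that are not listed stay crawlable by default, so no explicit `Allow: /` line is needed.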

FAQ

Does robots.txt actually stop bots from crawling, or is it just a request?
It is a voluntary protocol, standardised as RFC 9309, not an access control. Reputable crawlers (Google, Bing, the major archivers) obey it. Scrapers, malware bots, and some grey-area AI crawlers ignore it. For real access control, use server-side authentication or block by IP and user-agent at the edge.
What's the difference between Disallow and noindex?
Disallow stops crawling: Google won't fetch the page. Noindex (a robots meta tag or an X-Robots-Tag HTTP header) tells Google not to show the page in results, but Google has to crawl the page to see it. If a page is Disallowed, Google can't see the noindex directive, so disallowed URLs can still appear in results, usually as a bare URL with no snippet, when other pages link to them.
Where exactly do I upload the robots.txt file?
It must sit at the root of the host, served at https://example.com/robots.txt. Crawlers do not look for robots.txt in subfolders, so a file at /blog/robots.txt is ignored; each subdomain needs its own file. On Next.js put it in /public/robots.txt; on a plain static Vercel deployment, a file in the project root works too.
How do I block ChatGPT and Claude from training on my site?
Add User-agent: GPTBot, User-agent: ClaudeBot, User-agent: CCBot, User-agent: anthropic-ai, and User-agent: Google-Extended, each followed by Disallow: /. Note that Google-Extended only opts you out of AI training; regular Googlebot still crawls and indexes the page for Search.
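As a rough sketch, the groups described in this answer can be generated and sanity-checked with Python's standard library. The helper name `ai_block_rules` is hypothetical, and the agent list is simply the one given above; crawler names change over time, so it should be checked against each vendor's documentation.

```python
from typing import List
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the answer above.
AI_TRAINING_AGENTS = ["GPTBot", "ClaudeBot", "CCBot", "anthropic-ai", "Google-Extended"]


def ai_block_rules(agents: List[str]) -> str:
    """Emit one 'User-agent / Disallow: /' group per agent."""
    groups = [f"User-agent: {agent}\nDisallow: /" for agent in agents]
    return "\n\n".join(groups) + "\n"


rules = ai_block_rules(AI_TRAINING_AGENTS)

# Use the stdlib parser as a stand-in for a compliant crawler.
parser = RobotFileParser()
parser.parse(rules.splitlines())

gptbot_allowed = parser.can_fetch("GPTBot", "https://example.com/article")
googlebot_allowed = parser.can_fetch("Googlebot", "https://example.com/article")

print(gptbot_allowed)     # False
print(googlebot_allowed)  # True
```

The check only shows what a compliant parser concludes from the file; a crawler that ignores robots.txt is unaffected by it.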
Is Crawl-delay still respected by Google?
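No. Google ignores Crawl-delay and sets its crawl rate automatically; the crawl-rate limiter that used to be in Search Console was retired in early 2024. Bing still honours Crawl-delay, and some other crawlers do as well, though support varies by crawler. If a value is set here it stays in the file as a hint for those crawlers; Google just skips the line.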
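To see how a crawler that does honour Crawl-delay might read the value, the sketch below uses Python's standard-library parser as a stand-in. The file contents and the `SomeOtherBot` name are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file: a delay for Bingbot only, no delay for everyone else.
ROBOTS_TXT = """\
User-agent: Bingbot
Crawl-delay: 10
Disallow: /search

User-agent: *
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# crawl_delay() returns the value from the group matching the user agent,
# or None when that group sets no delay.
bing_delay = parser.crawl_delay("Bingbot")
other_delay = parser.crawl_delay("SomeOtherBot")

print(bing_delay)   # 10
print(other_delay)  # None
```

The delay belongs to a single user-agent group, so setting it under Bingbot does not slow down crawlers that fall under the `*` group.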
