What is Robots.txt Generator?
Generate properly formatted robots.txt files to control how search engine crawlers access your website. Add user-agent rules, allow/disallow paths, set crawl delays, and include sitemap references — all without manually writing the file.
Add as many user-agent blocks as you need — one for * (all bots), separate ones for Googlebot, Bingbot, AhrefsBot, GPTBot, and others. Each block can list multiple Allow and Disallow paths and an optional Crawl-delay value in seconds. Multiple Sitemap URLs are emitted as standalone lines at the end. The output is the exact text that goes at /robots.txt on your domain.
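As a rough illustration of the output described above, here is a minimal Python sketch of how such a file could be assembled. The `AgentBlock` type and `build_robots_txt` function are hypothetical names invented for this example and do not describe how the tool itself is implemented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AgentBlock:
    """One user-agent group (hypothetical helper type for this sketch)."""
    user_agent: str
    allow: List[str] = field(default_factory=list)
    disallow: List[str] = field(default_factory=list)
    crawl_delay: Optional[int] = None  # seconds; the line is omitted when None


def build_robots_txt(blocks: List[AgentBlock], sitemaps: List[str]) -> str:
    """Render each group, then the Sitemap URLs as standalone lines at the end."""
    lines: List[str] = []
    for block in blocks:
        lines.append(f"User-agent: {block.user_agent}")
        lines.extend(f"Allow: {path}" for path in block.allow)
        lines.extend(f"Disallow: {path}" for path in block.disallow)
        if block.crawl_delay is not None:
            lines.append(f"Crawl-delay: {block.crawl_delay}")
        lines.append("")  # blank line separates groups
    lines.extend(f"Sitemap: {url}" for url in sitemaps)
    return "\n".join(lines).rstrip("\n") + "\n"


robots_txt = build_robots_txt(
    blocks=[
        AgentBlock("*", disallow=["/admin/", "/api/"]),
        AgentBlock("Bingbot", allow=["/search/help"], disallow=["/search"], crawl_delay=5),
    ],
    sitemaps=["https://example.com/sitemap.xml"],
)
print(robots_txt)
```

In this sketch the Allow lines are written before the Disallow lines of the same group, so a more specific Allow (here `/search/help`) is also seen first by parsers that stop at the first matching rule.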
How to use
- Add user-agent rules (e.g., Googlebot, Bingbot, or * for all) and specify which paths to allow or disallow.
- Optionally set crawl delay values and add your sitemap URL.
- Copy or download the generated robots.txt file and upload it to your site root.
When to use
- Blocking AI training crawlers (GPTBot, ClaudeBot, CCBot, Google-Extended) from your content.
- Keeping crawlers out of admin, search, staging, or duplicate-content paths. Disallow blocks crawling; on its own it does not remove a URL from search results.
- Migrating a site and pointing crawlers to a new sitemap or temporarily disallowing everything during a rebuild.
Result
Create rules that allow all crawlers on your site but block /admin/ and /api/ paths, with a sitemap at https://example.com/sitemap.xml.
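One way to sanity-check a file like this before uploading it is Python's standard-library `urllib.robotparser`. The sketch below assumes the file text shown in `ROBOTS_TXT` (a plausible rendering of the rules above, not the tool's verbatim output) and uses a made-up crawler name, `ExampleBot`.

```python
from urllib.robotparser import RobotFileParser

# Assumed file contents for the rules described above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /api/

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if `agent` may fetch `path` on example.com under these rules."""
    return parser.can_fetch(agent, f"https://example.com{path}")


print(allowed("/"))              # True
print(allowed("/blog/post"))     # True
print(allowed("/admin/users"))   # False
print(allowed("/api/v1/items"))  # False
print(parser.site_maps())        # ['https://example.com/sitemap.xml']
```

Python's parser applies the first rule that matches, whereas Google picks the most specific one. For a Disallow-only file like this the two approaches agree, but files that mix Allow and Disallow for overlapping paths may be evaluated differently. `site_maps()` requires Python 3.8 or later.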
FAQ
- Does robots.txt actually stop bots from crawling, or is it just a request?
- It is a voluntary protocol. Reputable crawlers (Google, Bing, the major archivers) obey it. Scrapers, malware bots, and some grey-area AI crawlers ignore it. For real access control use server-side authentication or block by IP and user-agent at the edge.
- What's the difference between Disallow and noindex?
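- Disallow stops crawling: Google won't fetch the page. Noindex (a robots meta tag or an X-Robots-Tag HTTP header) tells Google not to show the page in results even though it crawls it. If a page is Disallowed, Google can't see the noindex tag, so a disallowed URL can still appear in results, usually as a bare link without a description, when other pages link to it. To keep a page out of results, leave it crawlable and serve noindex.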
- Where exactly do I upload the robots.txt file?
- It must sit at the root of your host, served at https://example.com/robots.txt. Crawlers do not read robots.txt files placed in subfolders, so rules for subdirectories belong in the root file. Each subdomain needs its own file. On Next.js put it in /public/robots.txt; on Vercel a static file in the project root works too.
- How do I block ChatGPT and Claude from training on my site?
- Add User-agent: GPTBot, User-agent: ClaudeBot, User-agent: CCBot, User-agent: anthropic-ai, and User-agent: Google-Extended, each followed by Disallow: /. Note that Google-Extended only opts you out of training; regular Googlebot still indexes the page.
- Is Crawl-delay still respected by Google?
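- No. Google ignores Crawl-delay and adjusts its crawl rate automatically; the crawl-rate limiter that Search Console used to offer was retired in early 2024. Bing still honours Crawl-delay, and support among other crawlers varies, so check each crawler's documentation. If a value is set here it stays in the file as a hint for crawlers that support it; Google just skips the line.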
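The answer on blocking AI training crawlers can be sketched in a few lines of Python. `block_agents` is a hypothetical helper written for this example; the user-agent tokens are the ones listed in that answer, and the check at the end uses the standard-library `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Tokens taken from the FAQ answer above.
AI_TRAINING_AGENTS = ["GPTBot", "ClaudeBot", "CCBot", "anthropic-ai", "Google-Extended"]


def block_agents(agents):
    """Emit one group per agent, each disallowing the whole site."""
    groups = [f"User-agent: {agent}\nDisallow: /" for agent in agents]
    return "\n\n".join(groups) + "\n"


ai_rules = block_agents(AI_TRAINING_AGENTS)
print(ai_rules)

# Confirm the listed agents are blocked while ordinary search crawlers are not.
checker = RobotFileParser()
checker.parse(ai_rules.splitlines())
print(checker.can_fetch("GPTBot", "https://example.com/article"))     # False
print(checker.can_fetch("Googlebot", "https://example.com/article"))  # True
```

This only shows what a compliant crawler would conclude from the file. As noted earlier in the FAQ, crawlers that ignore robots.txt are unaffected.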