Robots.txt Generator

Build a robots.txt file to control how search engines crawl your website. Add user-agent rules, paths, and sitemap URLs.

About the Robots.txt Generator

A robots.txt file is a plain-text file placed at the root of your website that instructs search engine crawlers which pages they can and cannot access. This free robots.txt generator lets you build a correctly formatted file in seconds without writing it by hand.

  • Add rules for any crawler — all bots (*), Googlebot, Bingbot, YandexBot, and more
  • Set Allow and Disallow paths to fine-tune which URLs get crawled
  • Include one or more Sitemap URLs so crawlers can discover all your pages
  • Set a Crawl-delay per bot to reduce server load during indexing
  • Use the Block AI Bots preset to prevent GPTBot, CCBot, and other AI scrapers from accessing your content
  • Download the finished file as robots.txt ready to deploy
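As an illustration, a file generated with a single rule block and a sitemap (the domain and paths are placeholders) looks like this:

```txt
User-agent: *
Disallow: /private/
Allow: /private/press-kit/

Sitemap: https://example.com/sitemap.xml
```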

How to Generate a robots.txt File

  1. Choose a preset or start from scratch

    Click one of the Quick Presets (Allow All, Block All, Block AI Bots, WordPress) to populate sensible defaults, or leave the form blank and build your rules manually.

  2. Set the User-agent

    Type a bot name or pick from the dropdown. Use * to target all crawlers. Click Add Rule to create separate rule blocks for individual bots.

  3. Enter Allow and Disallow paths

    Enter one path per line. Disallow paths are blocked from crawling; Allow paths explicitly permit access and override a broader Disallow rule.

  4. Add your Sitemap URL(s)

    Paste your sitemap URL (e.g. https://example.com/sitemap.xml) in the Sitemap field. Add one per line if you have multiple sitemaps.

  5. Copy or Download

    Click Copy to copy the output to your clipboard, or Download to save the file as robots.txt. Upload it to the root directory of your website.

Tip: The file must be accessible at https://yourdomain.com/robots.txt — not in a subdirectory. Most web hosts let you place it in the public root folder alongside your index.html.

Common Use Cases

Protecting Private Content

  • Block crawlers from /admin/ or /dashboard/
  • Hide staging or development paths from search indexes
  • Prevent indexing of internal API endpoints
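A sketch of such a file, with placeholder paths, might be:

```txt
User-agent: *
Disallow: /admin/
Disallow: /dashboard/
Disallow: /staging/
Disallow: /api/internal/
```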

WordPress Sites

  • Disallow /wp-admin/ and /wp-includes/
  • Block plugin and cache directories from being crawled
  • Link to your WordPress sitemap generated by Yoast or Rank Math
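A common WordPress configuration looks like the sketch below; the admin-ajax.php exception and the sitemap filename are illustrative conventions (Yoast, for example, typically serves sitemap_index.xml), not requirements:

```txt
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://example.com/sitemap_index.xml
```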

Blocking AI Scrapers

  • Disallow GPTBot, ChatGPT-User, CCBot, and anthropic-ai
  • Prevent your content from being used in AI training datasets
  • Use the one-click "Block AI Bots" preset to apply all rules instantly
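A file blocking these scrapers uses one rule block per bot, each disallowing the entire site:

```txt
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: anthropic-ai
Disallow: /
```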

Reducing Crawl Budget Waste

  • Disallow paginated archive pages, tags, or faceted search URLs
  • Set a Crawl-delay for Bingbot or YandexBot to ease server load
  • Focus crawl budget on high-value pages like product and landing pages
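For example (the paths are placeholders, and the `*` wildcard pattern is supported by major crawlers such as Googlebot and Bingbot but is not part of the original standard):

```txt
User-agent: Bingbot
Crawl-delay: 10
Disallow: /tag/
Disallow: /*?sort=
```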

E-commerce Stores

  • Block cart, checkout, and account pages from being indexed
  • Allow product and category pages for full SEO visibility
  • Include your product sitemap for faster indexing of new listings
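A typical store configuration might look like this (paths and the sitemap filename are illustrative):

```txt
User-agent: *
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/

Sitemap: https://example.com/product-sitemap.xml
```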

Allowing All Crawlers

  • Use the "Allow All" preset to generate the minimal valid robots.txt
  • Still add your Sitemap URL for better crawl discoverability
  • Suitable for fully public sites with no restricted content

Frequently Asked Questions

What is a robots.txt file?

A robots.txt file is a plain-text file placed at the root of your website that tells search engine bots (and other automated crawlers) which pages they are allowed or not allowed to visit. It follows the Robots Exclusion Protocol (REP) standard.

Where do I put the robots.txt file?

It must be placed at the root of your domain — accessible at https://yourdomain.com/robots.txt. It cannot be in a subdirectory. Most web hosts let you upload it to the public or www root folder.

Does robots.txt prevent pages from being indexed?

Not directly. Disallowing a URL in robots.txt prevents crawlers from visiting it, but the page can still appear in search results if other sites link to it. To prevent indexing entirely, use a noindex meta tag on the page itself, and leave that page crawlable: a bot that is blocked from the URL can never see the tag.
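A minimal example of the tag:

```html
<!-- Place inside the <head> of the page you want excluded from search results.
     The page must remain crawlable, or bots will never read this tag. -->
<meta name="robots" content="noindex">
```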

What does Disallow: with an empty value mean?

An empty Disallow: line means "disallow nothing" — the crawler is allowed to access everything. This is equivalent to having no restrictions at all and is the correct way to explicitly allow all content.
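This is exactly the file the "Allow All" preset produces:

```txt
# Allow every crawler to access everything
User-agent: *
Disallow:
```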

Does blocking a bot in robots.txt actually stop it?

Legitimate bots like Googlebot and Bingbot respect robots.txt by design. However, malicious scrapers and some AI training crawlers may ignore it entirely. For stronger protection, consider IP blocking or rate limiting at the server level.

Can I have multiple User-agent blocks?

Yes. Each User-agent block applies its Disallow/Allow rules only to the named bot. This lets you allow Googlebot full access while blocking other crawlers from certain paths. Use the Add Rule button to create separate blocks.
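For instance, this sketch gives Googlebot full access while keeping a hypothetical /private/ path off-limits to everyone else:

```txt
# Googlebot: full access
User-agent: Googlebot
Disallow:

# All other crawlers: blocked from /private/
User-agent: *
Disallow: /private/
```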

What is Crawl-delay and should I use it?

Crawl-delay tells a bot to wait a set number of seconds between requests. It is useful for reducing server load on high-traffic sites. Note: Googlebot ignores this directive — configure crawl rate for Google in Google Search Console instead.
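For bots that honor the directive, such as YandexBot, the syntax is a single line inside the bot's rule block:

```txt
# Ask YandexBot to wait 5 seconds between requests
User-agent: YandexBot
Crawl-delay: 5
```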

Is my data safe? Does this tool send data to a server?

No data leaves your browser. All robots.txt generation happens entirely client-side using JavaScript. Your configuration and output are never uploaded, stored, or transmitted anywhere.