AI Crawler Control and Management

Control how AI crawlers like GPTBot, Claude-Web, Perplexity, and others access your changelog for AI training purposes. Manage this separately from search engine indexing.

Understanding AI Crawlers

AI crawlers are automated bots used by AI companies to collect content from the web for training language models. These include:

  • GPTBot - OpenAI's crawler for ChatGPT training
  • ChatGPT-User - OpenAI's agent for user-initiated browsing in ChatGPT
  • Claude-Web - Anthropic's crawler for Claude training
  • anthropic-ai - Anthropic's AI crawler
  • PerplexityBot - Perplexity AI's crawler
  • Google-Extended - Google's AI training crawler
  • cohere-ai - Cohere's AI crawler
  • YouBot - You.com's AI crawler
  • Applebot-Extended - Apple's AI crawler
  • Diffbot - AI data extraction crawler

Why Control AI Crawlers Separately?

You may want different policies for search engines and AI crawlers:

  • Search engines help users discover your product updates
  • AI crawlers use your content to train AI models that may reference or summarize your content
  • You might want search visibility while controlling whether your content is used for AI training
  • Or you might want to allow AI training while keeping your changelog out of search results

Control Modes

Block Mode

Prevent AI crawlers from accessing your changelog.

  • Blocks all major AI crawlers via robots.txt
  • Prevents your content from being used for AI training
  • Gives you control over how your content is used
  • Perfect if you want to opt out of AI training

Default for free users: Free accounts start in Block mode to protect your content.
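
For illustration, Block mode emits robots.txt directives along these lines. This is a representative sketch, not ChangeCrab's exact output; the precise crawler list and formatting may differ:

    # Illustrative Block-mode directives (sketch)
    User-agent: GPTBot
    Disallow: /

    User-agent: Claude-Web
    Disallow: /

    User-agent: PerplexityBot
    Disallow: /

    User-agent: Google-Extended
    Disallow: /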

Allow Mode

Normal access - standard behavior.

  • Allows AI crawlers to access your content
  • No special directives - standard access
  • Matches common industry practice
  • Your content may be used for AI training

This is the default for paid users and represents standard practice.

Optimize Mode

Explicitly encourage AI crawler access with enhanced features.

  • Explicitly allows all major AI crawlers
  • Includes sitemap reference in robots.txt
  • Structured data (Schema.org) for better AI understanding
  • Helps AI systems better understand and reference your content

Premium feature: Optimize mode is available for paid users who want to maximize AI visibility.
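
As a sketch of what Optimize mode produces, the robots.txt might contain explicit Allow rules plus a sitemap reference. The sitemap URL below is a placeholder, and ChangeCrab's actual output may differ:

    # Illustrative Optimize-mode directives (sketch)
    User-agent: GPTBot
    Allow: /

    User-agent: Claude-Web
    Allow: /

    User-agent: PerplexityBot
    Allow: /

    Sitemap: https://changelog.example.com/sitemap.xml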

How It Works

Robots.txt Configuration

ChangeCrab automatically configures your robots.txt file to control AI crawlers:

  • Block: Adds Disallow: / directives for all major AI crawlers
  • Allow: No restrictions (default allow behavior)
  • Optimize: Explicitly allows AI crawlers and includes sitemap reference

Structured Data

When Optimize mode is enabled, ChangeCrab adds Schema.org structured data that helps AI systems:

  • Better understand your content structure
  • Identify your changelog as a collection of updates
  • Extract meaningful information about your product
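
Structured data of this kind is typically embedded as JSON-LD in the page. The snippet below is a hypothetical example of Schema.org markup describing a changelog as a blog-style collection of posts; it is not ChangeCrab's exact output, and all names and dates are placeholders:

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Blog",
      "name": "Example Product Changelog",
      "description": "Release notes and product updates",
      "blogPost": [
        {
          "@type": "BlogPosting",
          "headline": "v2.1: Faster exports",
          "datePublished": "2024-05-01"
        }
      ]
    }
    </script>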

Configuring AI Crawler Control

  1. Navigate to your changelog Settings
  2. Go to the Privacy & Visibility section
  3. Find AI Crawler Access
  4. Select your preferred mode:
    • Block - Prevent AI crawlers
    • Allow - Normal access
    • Optimize - Enhanced access (Premium)
  5. Click Save

Important: AI crawler control is separate from search engine indexing. You can block AI crawlers while allowing search engines, or vice versa.
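
After saving, you can confirm the published directives by fetching your changelog's robots.txt. A minimal Python sketch, where the URL is a placeholder for your own changelog domain:

    import urllib.request

    # Replace with your changelog's actual domain
    url = "https://changelog.example.com/robots.txt"

    # Fetch and print the live robots.txt so you can inspect the directives
    with urllib.request.urlopen(url) as response:
        print(response.read().decode("utf-8"))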

Use Cases

Block AI Crawlers, Allow Search Engines

Best for: Companies that want search visibility while controlling AI training data.

  • Search engines can index your changelog
  • AI systems cannot use your content for training
  • You maintain control over how your content is used
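
Conceptually, this combination corresponds to robots.txt rules like the sketch below, which leave a search crawler unrestricted while disallowing AI crawlers. This is illustrative only; ChangeCrab generates the actual file for you:

    # Search engines: no restrictions
    User-agent: Googlebot
    Allow: /

    # AI crawlers: blocked
    User-agent: GPTBot
    Disallow: /

    User-agent: Claude-Web
    Disallow: /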

Allow Both

Best for: Companies that want maximum visibility and don't mind AI training.

  • Maximum discoverability in search results
  • Your content may be referenced by AI assistants
  • Good for brand awareness and reach

Block Both

Best for: Internal or sensitive changelogs that need complete privacy.

  • No search engine indexing
  • No AI training data collection
  • Maximum privacy and control

Optimize Both

Best for: Public-facing products that want maximum visibility and AI presence.

  • Professional SEO optimization
  • Enhanced AI system understanding
  • Best chance of being referenced by AI assistants
  • Maximum brand visibility

Compliance and Best Practices

Robots.txt Compliance

Reputable AI crawlers (like GPTBot, Claude-Web) respect robots.txt directives. However:

  • Compliance is voluntary - some crawlers may ignore directives
  • Block mode provides strong protection but isn't 100% guaranteed
  • For maximum protection, combine with password protection or private changelogs
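
You can also test how a compliant crawler would interpret your robots.txt using Python's standard urllib.robotparser module. The domain below is a placeholder:

    from urllib.robotparser import RobotFileParser

    # Replace with your changelog's actual domain
    parser = RobotFileParser()
    parser.set_url("https://changelog.example.com/robots.txt")
    parser.read()

    # can_fetch() reports whether a crawler that honors robots.txt
    # would be allowed to request the given URL
    for agent in ["GPTBot", "Claude-Web", "PerplexityBot", "Googlebot"]:
        allowed = parser.can_fetch(agent, "https://changelog.example.com/")
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")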

When to Use Each Mode

  • Block: When you want to opt out of AI training or have sensitive content
  • Allow: Standard approach - let AI systems access your content normally
  • Optimize: When you want to maximize AI visibility and understanding

FAQ

Why would I want to block AI crawlers?

You might want to block AI crawlers if:

  • You want to control how your content is used for AI training
  • You have sensitive or proprietary information
  • You prefer to opt out of AI training data collection
  • You want to maintain exclusive control over your content

Why would I want to allow or optimize AI crawlers?

You might want to allow AI crawlers if:

  • You want your product to be referenced by AI assistants
  • You see value in AI systems understanding your updates
  • You want maximum brand visibility
  • You're comfortable with your content being used for AI training

Can I use Optimize mode on a free plan?

Optimize mode is a premium feature available to paid users. Free users can use Block or Allow modes, or upgrade to Premium to access Optimize mode.

Will Block mode prevent all AI crawlers?

Block mode uses robots.txt directives that reputable AI crawlers respect. However, compliance is voluntary, and some crawlers may ignore these directives. For maximum protection, consider making your changelog private or adding password protection.

What's the difference between blocking AI crawlers and search engines?

Search engines help users discover your content through search results. AI crawlers collect content for training AI models. You can control them independently - for example, allowing search engines while blocking AI crawlers, or vice versa.

How do I know if AI crawlers are accessing my changelog?

You can check your server logs or analytics for user agents like "GPTBot", "Claude-Web", "PerplexityBot", etc. However, ChangeCrab's Block mode should prevent most reputable AI crawlers from accessing your content.
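
For example, here is a short Python sketch that tallies AI crawler user agents in a standard access log. The log path and user-agent list are assumptions; adjust them for your server:

    from collections import Counter

    # Known AI crawler user-agent substrings (adjust as needed)
    AI_BOTS = ["GPTBot", "ChatGPT-User", "Claude-Web", "anthropic-ai",
               "PerplexityBot", "Google-Extended", "cohere-ai",
               "YouBot", "Applebot-Extended", "Diffbot"]

    counts = Counter()
    # Replace with the path to your server's access log
    with open("/var/log/nginx/access.log") as log:
        for line in log:
            for bot in AI_BOTS:
                if bot in line:
                    counts[bot] += 1

    for bot, n in counts.most_common():
        print(f"{bot}: {n} requests")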