Firecrawl MCP Server

by mendableai

A Model Context Protocol (MCP) server implementation that integrates with Firecrawl for web scraping capabilities.

Big thanks to @vrknetha, @knacklabs for the initial implementation!

Features

  • Web scraping, crawling, and discovery
  • Search and content extraction
  • Deep research and batch scraping
  • Automatic retries and rate limiting
  • Cloud and self-hosted support
  • SSE support

Play around with our MCP Server on MCP.so's playground or on Klavis AI.

Installation

Running with npx

env FIRECRAWL_API_KEY=fc-YOUR_API_KEY npx -y firecrawl-mcp

Manual Installation

npm install -g firecrawl-mcp

Running on Cursor

Configuring Cursor 🖥️
Note: Requires Cursor version 0.45.6+
For the most up-to-date configuration instructions, please refer to the official Cursor documentation on configuring MCP servers:
Cursor MCP Server Configuration Guide

To configure Firecrawl MCP in Cursor v0.48.6:

  1. Open Cursor Settings
  2. Go to Features > MCP Servers
  3. Click "+ Add new global MCP server"
  4. Enter the following code:
    {
      "mcpServers": {
        "mcp-server-firecrawl": {
          "command": "npx",
          "args": ["-y", "firecrawl-mcp"],
          "env": {
            "FIRECRAWL_API_KEY": "YOUR_API_KEY"
          }
        }
      }
    }

Running with SSE Local Mode

To run the server using Server-Sent Events (SSE) locally instead of the default stdio transport:

env SSE_LOCAL=true FIRECRAWL_API_KEY=fc-YOUR_API_KEY npx -y firecrawl-mcp

Use the URL: http://localhost:3000/sse

Installing via Smithery (Legacy)

To install Firecrawl for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install @mendableai/mcp-server-firecrawl --client claude

Running on VS Code

For one-click installation, click one of the install buttons below:

[Install with NPX in VS Code](https://insiders.vscode.dev/redirect/mcp/install?name=firecrawl&inputs=[{"type":"promptString","id":"apiKey","description":"Firecrawl API Key","password":true}]&config={"command":"npx","args":["-y","firecrawl-mcp"],"env":{"FIRECRAWL_API_KEY":"${input:apiKey}"}}) [Install with NPX in VS Code Insiders](https://insiders.vscode.dev/redirect/mcp/install?name=firecrawl&inputs=[{"type":"promptString","id":"apiKey","description":"Firecrawl API Key","password":true}]&config={"command":"npx","args":["-y","firecrawl-mcp"],"env":{"FIRECRAWL_API_KEY":"${input:apiKey}"}}&quality=insiders)

For manual installation, add the following JSON block to your User Settings (JSON) file in VS Code. You can do this by pressing Ctrl + Shift + P and typing Preferences: Open User Settings (JSON).

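A reconstruction of the settings block, based on the parameters encoded in the install links above (server name "firecrawl", npx command, and a prompted apiKey input); verify against the current VS Code MCP documentation:

{
  "mcp": {
    "inputs": [
      {
        "type": "promptString",
        "id": "apiKey",
        "description": "Firecrawl API Key",
        "password": true
      }
    ],
    "servers": {
      "firecrawl": {
        "command": "npx",
        "args": ["-y", "firecrawl-mcp"],
        "env": {
          "FIRECRAWL_API_KEY": "${input:apiKey}"
        }
      }
    }
  }
}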

Optionally, you can add it to a file called .vscode/mcp.json in your workspace. This will allow you to share the configuration with others:

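The same configuration without the top-level "mcp" key, which is the shape the workspace file expects (again, verify against your VS Code version):

{
  "inputs": [
    {
      "type": "promptString",
      "id": "apiKey",
      "description": "Firecrawl API Key",
      "password": true
    }
  ],
  "servers": {
    "firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "${input:apiKey}"
      }
    }
  }
}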

Configuration

Environment Variables

Required for Cloud API

  • FIRECRAWL_API_KEY: Your Firecrawl API key
    • Required when using cloud API (default)
    • Optional when using self-hosted instance with FIRECRAWL_API_URL
  • FIRECRAWL_API_URL (Optional): Custom API endpoint for self-hosted instances
    • Example: https://firecrawl.your-domain.com
    • If not provided, the cloud API will be used (requires API key)

Optional Configuration

Retry Configuration
  • FIRECRAWL_RETRY_MAX_ATTEMPTS: Maximum number of retry attempts (default: 3)
  • FIRECRAWL_RETRY_INITIAL_DELAY: Initial delay in milliseconds before first retry (default: 1000)
  • FIRECRAWL_RETRY_MAX_DELAY: Maximum delay in milliseconds between retries (default: 10000)
  • FIRECRAWL_RETRY_BACKOFF_FACTOR: Exponential backoff multiplier (default: 2)
Credit Usage Monitoring
  • FIRECRAWL_CREDIT_WARNING_THRESHOLD: Credit usage warning threshold (default: 1000)
  • FIRECRAWL_CREDIT_CRITICAL_THRESHOLD: Credit usage critical threshold (default: 100)

Configuration Examples

For cloud API usage with custom retry and credit monitoring:

# Required for cloud API
export FIRECRAWL_API_KEY=your-api-key

# Optional retry configuration
export FIRECRAWL_RETRY_MAX_ATTEMPTS=5        # Increase max retry attempts
export FIRECRAWL_RETRY_INITIAL_DELAY=2000    # Start with 2s delay
export FIRECRAWL_RETRY_MAX_DELAY=30000       # Maximum 30s delay
export FIRECRAWL_RETRY_BACKOFF_FACTOR=3      # More aggressive backoff

# Optional credit monitoring
export FIRECRAWL_CREDIT_WARNING_THRESHOLD=2000    # Warning at 2000 credits
export FIRECRAWL_CREDIT_CRITICAL_THRESHOLD=500    # Critical at 500 credits

For self-hosted instance:

# Required for self-hosted
export FIRECRAWL_API_URL=https://firecrawl.your-domain.com

# Optional authentication for self-hosted
export FIRECRAWL_API_KEY=your-api-key  # If your instance requires auth

# Custom retry configuration
export FIRECRAWL_RETRY_MAX_ATTEMPTS=10
export FIRECRAWL_RETRY_INITIAL_DELAY=500     # Start with faster retries

Usage with Claude Desktop

Add this to your claude_desktop_config.json:

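A configuration mirroring the Cursor example above (the server name "firecrawl-mcp" is illustrative):

{
  "mcpServers": {
    "firecrawl-mcp": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "YOUR_API_KEY"
      }
    }
  }
}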

System Configuration

The server includes several configurable parameters that can be set via environment variables. Here are the default values if not configured:

const CONFIG = {
  retry: {
    maxAttempts: 3, // Number of retry attempts for rate-limited requests
    initialDelay: 1000, // Initial delay before first retry (in milliseconds)
    maxDelay: 10000, // Maximum delay between retries (in milliseconds)
    backoffFactor: 2, // Multiplier for exponential backoff
  },
  credit: {
    warningThreshold: 1000, // Warn when credit usage reaches this level
    criticalThreshold: 100, // Critical alert when credit usage reaches this level
  },
};

These configurations control:

  1. Retry Behavior

    • Automatically retries failed requests due to rate limits
    • Uses exponential backoff to avoid overwhelming the API
    • Example: With default settings, retries will be attempted at:
      • 1st retry: 1 second delay
      • 2nd retry: 2 seconds delay
      • 3rd retry: 4 seconds delay (later delays are capped at maxDelay; see the sketch after this list)
  2. Credit Usage Monitoring

    • Tracks API credit consumption for cloud API usage
    • Provides warnings at specified thresholds
    • Helps prevent unexpected service interruption
    • Example: With default settings:
      • Warning at 1000 credits remaining
      • Critical alert at 100 credits remaining
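
The retry delays follow a standard exponential-backoff formula. A minimal JavaScript sketch of that schedule, using the CONFIG defaults above (illustrative only, not the server's actual implementation):

// delay = initialDelay * backoffFactor^(attempt - 1), capped at maxDelay
function retryDelay(attempt, { initialDelay, maxDelay, backoffFactor }) {
  return Math.min(initialDelay * Math.pow(backoffFactor, attempt - 1), maxDelay);
}

// With the defaults this prints 1000ms, 2000ms, 4000ms
const defaults = { initialDelay: 1000, maxDelay: 10000, backoffFactor: 2 };
for (let attempt = 1; attempt <= 3; attempt++) {
  console.log(`Retry ${attempt}: ${retryDelay(attempt, defaults)}ms`);
}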

Rate Limiting and Batch Processing

The server utilizes Firecrawl's built-in rate limiting and batch processing capabilities:

  • Automatic rate limit handling with exponential backoff
  • Efficient parallel processing for batch operations
  • Smart request queuing and throttling
  • Automatic retries for transient errors

How to Choose a Tool

Use this guide to select the right tool for your task:

  • If you know the exact URL(s) you want:
    • For one: use scrape
    • For many: use batch_scrape
  • If you need to discover URLs on a site: use map
  • If you want to search the web for info: use search
  • If you want to extract structured data: use extract
  • If you want to analyze a whole site or section: use crawl (with limits!)

Quick Reference Table

Tool | Best for | Returns
--- | --- | ---
scrape | Single page content | markdown/html
batch_scrape | Multiple known URLs | markdown/html[]
map | Discovering URLs on a site | URL[]
crawl | Multi-page extraction (with limits) | markdown/html[]
search | Web search for info | results[]
extract | Structured data from pages | JSON

Available Tools

1. Scrape Tool (firecrawl_scrape)

Scrape content from a single URL with advanced options.

Best for:

  • Single page content extraction, when you know exactly which page contains the information.

Not recommended for:

  • Extracting content from multiple pages (use batch_scrape for known URLs, or map + batch_scrape to discover URLs first, or crawl for full page content)
  • When you're unsure which page contains the information (use search)
  • When you need structured data (use extract)

Common mistakes:

  • Using scrape for a list of URLs (use batch_scrape instead).

Prompt Example:

"Get the content of the page at https://example.com."

Usage Example:

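A representative tool call; "url" is the only argument named in this document, and the "formats" and "onlyMainContent" options are assumptions based on the Firecrawl API:

{
  "name": "firecrawl_scrape",
  "arguments": {
    "url": "https://example.com",
    "formats": ["markdown"],
    "onlyMainContent": true
  }
}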

Returns:

  • Markdown, HTML, or other formats as specified.

2. Batch Scrape Tool (firecrawl_batch_scrape)

Scrape multiple URLs efficiently with built-in rate limiting and parallel processing.

Best for:

  • Retrieving content from multiple pages, when you know exactly which pages to scrape.

Not recommended for:

  • Discovering URLs (use map first if you don't know the URLs)
  • Scraping a single page (use scrape)

Common mistakes:

  • Using batch_scrape with too many URLs at once (may hit rate limits or token overflow)

Prompt Example:

"Get the content of these three blog posts: [url1, url2, url3]."

Usage Example:

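A representative call (the "urls" array and per-request "options" argument names are assumptions):

{
  "name": "firecrawl_batch_scrape",
  "arguments": {
    "urls": ["https://example1.com", "https://example2.com"],
    "options": {
      "formats": ["markdown"]
    }
  }
}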

Returns:

  • Response includes operation ID for status checking:

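A sketch of the response shape; the ID format follows the example log messages later in this document, and the MCP text-content wrapper is assumed:

{
  "content": [
    {
      "type": "text",
      "text": "Batch operation queued with ID: batch_1. Use firecrawl_check_batch_status to check progress."
    }
  ],
  "isError": false
}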

3. Check Batch Status (firecrawl_check_batch_status)

Check the status of a batch operation.

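A representative call, passing the ID returned by firecrawl_batch_scrape (the "id" argument name is an assumption):

{
  "name": "firecrawl_check_batch_status",
  "arguments": {
    "id": "batch_1"
  }
}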

4. Map Tool (firecrawl_map)

Map a website to discover all indexed URLs on the site.

Best for:

  • Discovering URLs on a website before deciding what to scrape
  • Finding specific sections of a website

Not recommended for:

  • When you already know which specific URL you need (use scrape or batch_scrape)
  • When you need the content of the pages (use scrape after mapping)

Common mistakes:

  • Using crawl to discover URLs instead of map

Prompt Example:

"List all URLs on example.com."

Usage Example:

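A representative call (only "url" is shown here; the tool may accept additional filtering options):

{
  "name": "firecrawl_map",
  "arguments": {
    "url": "https://example.com"
  }
}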

Returns:

  • Array of URLs found on the site

5. Search Tool (firecrawl_search)

Search the web and optionally extract content from search results.

Best for:

  • Finding specific information across multiple websites, when you don't know which website has the information.
  • When you need the most relevant content for a query

Not recommended for:

  • When you already know which website to scrape (use scrape)
  • When you need comprehensive coverage of a single website (use map or crawl)

Common mistakes:

  • Using crawl or map for open-ended questions (use search instead)

Usage Example:

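A representative call; the "limit" and "scrapeOptions" argument names are assumptions based on the Firecrawl search API:

{
  "name": "firecrawl_search",
  "arguments": {
    "query": "latest AI research papers 2023",
    "limit": 5,
    "scrapeOptions": {
      "formats": ["markdown"]
    }
  }
}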

Returns:

  • Array of search results (with optional scraped content)

Prompt Example:

"Find the latest research papers on AI published in 2023."

6. Crawl Tool (firecrawl_crawl)

Start an asynchronous crawl job on a website and extract content from all pages.

Best for:

  • Extracting content from multiple related pages, when you need comprehensive coverage.

Not recommended for:

  • Extracting content from a single page (use scrape)
  • When token limits are a concern (use map + batch_scrape)
  • When you need fast results (crawling can be slow)

Warning: Crawl responses can be very large and may exceed token limits. Limit the crawl depth and number of pages, or use map + batch_scrape for better control.

Common mistakes:

  • Setting limit or maxDepth too high (causes token overflow)
  • Using crawl for a single page (use scrape instead)

Prompt Example:

"Get all blog posts from the first two levels of example.com/blog."

Usage Example:

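A representative call; "limit" and "maxDepth" are the options named under Common mistakes above, set to deliberately conservative values:

{
  "name": "firecrawl_crawl",
  "arguments": {
    "url": "https://example.com/blog",
    "maxDepth": 2,
    "limit": 100
  }
}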

Returns:

  • Response includes operation ID for status checking:

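A sketch of the response shape (the job ID is a placeholder and the MCP text-content wrapper is assumed):

{
  "content": [
    {
      "type": "text",
      "text": "Started crawl with job ID: your-crawl-job-id. Use firecrawl_check_crawl_status to check progress."
    }
  ],
  "isError": false
}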

7. Check Crawl Status (firecrawl_check_crawl_status)

Check the status of a crawl job.

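A representative call, passing the job ID returned by firecrawl_crawl (the "id" argument name is an assumption):

{
  "name": "firecrawl_check_crawl_status",
  "arguments": {
    "id": "your-crawl-job-id"
  }
}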

Returns:

  • Response includes the status of the crawl job:
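
A hypothetical status payload; the field names follow the Firecrawl crawl API ("data" holds the scraped pages once the job finishes) and may differ in the MCP wrapper:

{
  "status": "completed",
  "total": 42,
  "completed": 42,
  "creditsUsed": 42,
  "data": []
}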

8. Extract Tool (firecrawl_extract)

Extract structured information from web pages using LLM capabilities. Supports both cloud AI and self-hosted LLM extraction.

Best for:

  • Extracting specific structured data like prices, names, details.

Not recommended for:

  • When you need the full content of a page (use scrape)
  • When you're not looking for specific structured data

Arguments:

  • urls: Array of URLs to extract information from
  • prompt: Custom prompt for the LLM extraction
  • systemPrompt: System prompt to guide the LLM
  • schema: JSON schema for structured data extraction
  • allowExternalLinks: Allow extraction from external links
  • enableWebSearch: Enable web search for additional context
  • includeSubdomains: Include subdomains in extraction

When using a self-hosted instance, the extraction will use your configured LLM. For cloud API, it uses Firecrawl's managed LLM service.

Prompt Example:

"Extract the product name, price, and description from these product pages."

Usage Example:

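A representative call using the arguments listed above; the schema matches the product example from the prompt:

{
  "name": "firecrawl_extract",
  "arguments": {
    "urls": ["https://example.com/product"],
    "prompt": "Extract the product name, price, and description",
    "schema": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "price": { "type": "number" },
        "description": { "type": "string" }
      },
      "required": ["name", "price"]
    }
  }
}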

Returns:

  • Extracted structured data as defined by your schema

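Illustrative output matching the schema above (values invented for the example):

{
  "name": "Example Widget",
  "price": 99.99,
  "description": "A short product description extracted from the page."
}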

Logging System

The server includes comprehensive logging:

  • Operation status and progress
  • Performance metrics
  • Credit usage monitoring
  • Rate limit tracking
  • Error conditions

Example log messages:

[INFO] Firecrawl MCP Server initialized successfully
[INFO] Starting scrape for URL: https://example.com
[INFO] Batch operation queued with ID: batch_1
[WARNING] Credit usage has reached warning threshold
[ERROR] Rate limit exceeded, retrying in 2s...

Error Handling

The server provides robust error handling:

  • Automatic retries for transient errors
  • Rate limit handling with backoff
  • Detailed error messages
  • Credit usage warnings
  • Network resilience

Example error response:

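A sketch of the shape such an error takes; the wording echoes the example log messages above, and the MCP text-content wrapper is assumed:

{
  "content": [
    {
      "type": "text",
      "text": "Error: Rate limit exceeded. Retrying in 2 seconds..."
    }
  ],
  "isError": true
}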

Development

# Install dependencies
npm install

# Build
npm run build

# Run tests
npm test

Contributing

  1. Fork the repository
  2. Create your feature branch
  3. Run tests: npm test
  4. Submit a pull request

Thanks to contributors

Thanks to @vrknetha, @cawstudios for the initial implementation!

Thanks to MCP.so and Klavis AI for hosting and @gstarwd, @xiangkaiz and @zihaolin96 for integrating our server.

License

MIT License - see LICENSE file for details
