
Firecrawl MCP Server – Official Firecrawl MCP Server




A Model Context Protocol (MCP) server implementation that integrates with Firecrawl for web scraping capabilities.

Big thanks to @vrknetha and @cawstudios for the initial implementation!

You can also play around with our MCP Server on MCP.so's playground. Thanks to MCP.so for hosting and @gstarwd for integrating our server.

Features

  • Scrape, crawl, search, extract, deep research and batch scrape support
  • Web scraping with JS rendering
  • URL discovery and crawling
  • Web search with content extraction
  • Automatic retries with exponential backoff
  • Efficient batch processing with built-in rate limiting
  • Credit usage monitoring for cloud API
  • Comprehensive logging system
  • Support for cloud and self-hosted Firecrawl instances
  • Mobile/Desktop viewport support
  • Smart content filtering with tag inclusion/exclusion

Installation

Running with npx

env FIRECRAWL_API_KEY=fc-YOUR_API_KEY npx -y firecrawl-mcp

Manual Installation

npm install -g firecrawl-mcp

Running on Cursor

Configuring Cursor 🖥️ Note: Requires Cursor version 0.45.6+. For the most up-to-date configuration instructions, please refer to the official Cursor documentation on configuring MCP servers: Cursor MCP Server Configuration Guide

To configure Firecrawl MCP in Cursor v0.45.6

  1. Open Cursor Settings
  2. Go to Features > MCP Servers
  3. Click "+ Add New MCP Server"
  4. Enter the following:
    • Name: "firecrawl-mcp" (or your preferred name)
    • Type: "command"
    • Command: env FIRECRAWL_API_KEY=your-api-key npx -y firecrawl-mcp

To configure Firecrawl MCP in Cursor v0.48.6

  1. Open Cursor Settings
  2. Go to Features > MCP Servers
  3. Click "+ Add new global MCP server"
  4. Enter the following code:

{
  "mcpServers": {
    "firecrawl-mcp": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "YOUR-API-KEY"
      }
    }
  }
}

If you are using Windows and are running into issues, try: cmd /c "set FIRECRAWL_API_KEY=your-api-key && npx -y firecrawl-mcp"

Replace your-api-key with your Firecrawl API key. If you don't have one yet, you can create an account and get it from https://www.firecrawl.dev/app/api-keys

After adding, refresh the MCP server list to see the new tools. The Composer Agent will automatically use Firecrawl MCP when appropriate, but you can explicitly request it by describing your web scraping needs. Access the Composer via Command+L (Mac), select "Agent" next to the submit button, and enter your query.

Running on Windsurf

Add this to your ./codeium/windsurf/model_config.json:

{
  "mcpServers": {
    "mcp-server-firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "YOUR_API_KEY"
      }
    }
  }
}

Installing via Smithery (Legacy)

To install Firecrawl for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install @mendableai/mcp-server-firecrawl --client claude

Configuration

Environment Variables

Required for Cloud API

  • FIRECRAWL_API_KEY: Your Firecrawl API key
    • Required when using the cloud API (default)
    • Optional when using a self-hosted instance with FIRECRAWL_API_URL
  • FIRECRAWL_API_URL (Optional): Custom API endpoint for self-hosted instances
    • Example: https://firecrawl.your-domain.com
    • If not provided, the cloud API will be used (requires an API key)

Optional Configuration

Retry Configuration
  • FIRECRAWL_RETRY_MAX_ATTEMPTS: Maximum number of retry attempts (default: 3)
  • FIRECRAWL_RETRY_INITIAL_DELAY: Initial delay in milliseconds before the first retry (default: 1000)
  • FIRECRAWL_RETRY_MAX_DELAY: Maximum delay in milliseconds between retries (default: 10000)
  • FIRECRAWL_RETRY_BACKOFF_FACTOR: Exponential backoff multiplier (default: 2)
Credit Usage Monitoring
  • FIRECRAWL_CREDIT_WARNING_THRESHOLD: Credit usage warning threshold (default: 1000)
  • FIRECRAWL_CREDIT_CRITICAL_THRESHOLD: Credit usage critical threshold (default: 100)

Configuration Examples

For cloud API usage with custom retry and credit monitoring:

# Required for cloud API
export FIRECRAWL_API_KEY=your-api-key

# Optional retry configuration
export FIRECRAWL_RETRY_MAX_ATTEMPTS=5        # Increase max retry attempts
export FIRECRAWL_RETRY_INITIAL_DELAY=2000    # Start with a 2s delay
export FIRECRAWL_RETRY_MAX_DELAY=30000       # Maximum 30s delay
export FIRECRAWL_RETRY_BACKOFF_FACTOR=3      # More aggressive backoff

# Optional credit monitoring
export FIRECRAWL_CREDIT_WARNING_THRESHOLD=2000   # Warning at 2000 credits
export FIRECRAWL_CREDIT_CRITICAL_THRESHOLD=500   # Critical at 500 credits

For a self-hosted instance:

# Required for self-hosted
export FIRECRAWL_API_URL=https://firecrawl.your-domain.com

# Optional authentication for self-hosted
export FIRECRAWL_API_KEY=your-api-key   # If your instance requires auth

# Custom retry configuration
export FIRECRAWL_RETRY_MAX_ATTEMPTS=10
export FIRECRAWL_RETRY_INITIAL_DELAY=500   # Start with faster retries

Usage with Claude Desktop

Add this to your claude_desktop_config.json:

{
  "mcpServers": {
    "mcp-server-firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "YOUR_API_KEY_HERE",

        "FIRECRAWL_RETRY_MAX_ATTEMPTS": "5",
        "FIRECRAWL_RETRY_INITIAL_DELAY": "2000",
        "FIRECRAWL_RETRY_MAX_DELAY": "30000",
        "FIRECRAWL_RETRY_BACKOFF_FACTOR": "3",

        "FIRECRAWL_CREDIT_WARNING_THRESHOLD": "2000",
        "FIRECRAWL_CREDIT_CRITICAL_THRESHOLD": "500"
      }
    }
  }
}

System Configuration

The server includes several configurable parameters that can be set via environment variables. Here are the default values if not configured:

const CONFIG = {
  retry: {
    maxAttempts: 3,       // Number of retry attempts for rate-limited requests
    initialDelay: 1000,   // Initial delay before first retry (in milliseconds)
    maxDelay: 10000,      // Maximum delay between retries (in milliseconds)
    backoffFactor: 2,     // Multiplier for exponential backoff
  },
  credit: {
    warningThreshold: 1000,  // Warn when credit usage reaches this level
    criticalThreshold: 100,  // Critical alert when credit usage reaches this level
  },
};

These configurations control:

  1. Retry Behavior
    • Automatically retries failed requests due to rate limits
    • Uses exponential backoff to avoid overwhelming the API
    • Example: With default settings, retries will be attempted at the delays below (see the sketch after this list):
      • 1st retry: 1 second delay
      • 2nd retry: 2 seconds delay
      • 3rd retry: 4 seconds delay (capped at maxDelay)
  2. Credit Usage Monitoring
    • Tracks API credit consumption for cloud API usage
    • Provides warnings at specified thresholds
    • Helps prevent unexpected service interruption
    • Example: With default settings:
      • Warning at 1000 credits remaining
      • Critical alert at 100 credits remaining
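As a concrete illustration of this schedule, the delay before each retry is the initial delay multiplied by the backoff factor raised to the attempt number, capped at the maximum delay. This is a minimal TypeScript sketch of the documented behavior, not the server's actual source:

function retryDelayMs(
  attempt: number,        // 1-based retry attempt
  initialDelay = 1000,    // FIRECRAWL_RETRY_INITIAL_DELAY
  backoffFactor = 2,      // FIRECRAWL_RETRY_BACKOFF_FACTOR
  maxDelay = 10000        // FIRECRAWL_RETRY_MAX_DELAY
): number {
  // delay = initialDelay * backoffFactor^(attempt - 1), capped at maxDelay
  return Math.min(initialDelay * backoffFactor ** (attempt - 1), maxDelay);
}

// With the defaults: retryDelayMs(1) === 1000, retryDelayMs(2) === 2000,
// retryDelayMs(3) === 4000, and later attempts are capped at 10000.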

Rate Limiting and Batch Processing

The server utilizes Firecrawl's built-in rate limiting and batch processing capabilities:

  • Automatic rate limit handling with exponential backoff
  • Efficient parallel processing for batch operations
  • Smart request queuing and throttling (illustrated conceptually below)
  • Automatic retries for transient errors
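Conceptually, batch operations behave like a concurrency-limited work queue. The helper below illustrates that idea only; the function name is hypothetical, and the real queuing and throttling happens inside Firecrawl itself:

// Illustration: process items with at most `limit` requests in flight.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each worker repeatedly claims the next queued item until none remain.
  const worker = async (): Promise<void> => {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  };
  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, worker));
  return results;
}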

Available Tools

1. Scrape Tool (firecrawl_scrape)

Scrape content from a single URL with advanced options.

{
  "name": "firecrawl_scrape",
  "arguments": {
    "url": "https://example.com",
    "formats": ["markdown"],
    "onlyMainContent": true,
    "waitFor": 1000,
    "timeout": 30000,
    "mobile": false,
    "includeTags": ["article", "main"],
    "excludeTags": ["nav", "footer"],
    "skipTlsVerification": false
  }
}
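To exercise the tool programmatically rather than through an editor integration, a minimal client sketch using the @modelcontextprotocol/sdk TypeScript package might look like this (the client name and API key are placeholders):

// Sketch: spawn the server over stdio and call firecrawl_scrape.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

async function main() {
  const transport = new StdioClientTransport({
    command: "npx",
    args: ["-y", "firecrawl-mcp"],
    env: { FIRECRAWL_API_KEY: "fc-YOUR_API_KEY" }, // placeholder key
  });
  const client = new Client({ name: "example-client", version: "1.0.0" });
  await client.connect(transport);

  const result = await client.callTool({
    name: "firecrawl_scrape",
    arguments: { url: "https://example.com", formats: ["markdown"] },
  });
  console.log(result.content);

  await client.close();
}

main().catch(console.error);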

2. Batch Scrape Tool (firecrawl_batch_scrape)

Scrape multiple URLs efficiently with built-in rate limiting and parallel processing.

{
  "name": "firecrawl_batch_scrape",
  "arguments": {
    "urls": ["https://example1.com", "https://example2.com"],
    "options": {
      "formats": ["markdown"],
      "onlyMainContent": true
    }
  }
}

Response includes an operation ID for status checking:

{
  "content": [
    {
      "type": "text",
      "text": "Batch operation queued with ID: batch_1. Use firecrawl_check_batch_status to check progress."
    }
  ],
  "isError": false
}

3. Check Batch Status (firecrawl_check_batch_status)

Check the status of a batch operation.

{
  "name": "firecrawl_check_batch_status",
  "arguments": {
    "id": "batch_1"
  }
}

4. Search Tool (firecrawl_search)

Search the web and optionally extract content from search results.

{
  "name": "firecrawl_search",
  "arguments": {
    "query": "your search query",
    "limit": 5,
    "lang": "en",
    "country": "us",
    "scrapeOptions": {
      "formats": ["markdown"],
      "onlyMainContent": true
    }
  }
}

5. Crawl Tool (firecrawl_crawl)

Start an asynchronous crawl with advanced options.

{
  "name": "firecrawl_crawl",
  "arguments": {
    "url": "https://example.com",
    "maxDepth": 2,
    "limit": 100,
    "allowExternalLinks": false,
    "deduplicateSimilarURLs": true
  }
}

6. Extract Tool (firecrawl_extract)

Extract structured information from web pages using LLM capabilities. Supports both cloud AI and self-hosted LLM extraction.

{
  "name": "firecrawl_extract",
  "arguments": {
    "urls": ["https://example.com/page1", "https://example.com/page2"],
    "prompt": "Extract product information including name, price, and description",
    "systemPrompt": "You are a helpful assistant that extracts product information",
    "schema": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "price": { "type": "number" },
        "description": { "type": "string" }
      },
      "required": ["name", "price"]
    },
    "allowExternalLinks": false,
    "enableWebSearch": false,
    "includeSubdomains": false
  }
}

Example response:

{
  "content": [
    {
      "type": "text",
      "text": {
        "name": "Example Product",
        "price": 99.99,
        "description": "This is an example product description"
      }
    }
  ],
  "isError": false
}

Extract Tool Options:

  • urls: Array of URLs to extract information from
  • prompt: Custom prompt for the LLM extraction
  • systemPrompt: System prompt to guide the LLM
  • schema: JSON schema for structured data extraction
  • allowExternalLinks: Allow extraction from external links
  • enableWebSearch: Enable web search for additional context
  • includeSubdomains: Include subdomains in extraction

When using a self-hosted instance, the extraction will use your configured LLM. For the cloud API, it uses Firecrawl's managed LLM service.

7. Deep Research Tool (firecrawl_deep_research)

Conduct deep web research on a query using intelligent crawling, search, and LLM analysis.

{
  "name": "firecrawl_deep_research",
  "arguments": {
    "query": "how does carbon capture technology work?",
    "maxDepth": 3,
    "timeLimit": 120,
    "maxUrls": 50
  }
}

Arguments:

  • query (string, required): The research question or topic to explore.
  • maxDepth (number, optional): Maximum recursive depth for crawling/search (default: 3).
  • timeLimit (number, optional): Time limit in seconds for the research session (default: 120).
  • maxUrls (number, optional): Maximum number of URLs to analyze (default: 50).

Returns:

  • Final analysis generated by an LLM based on the research (data.finalAnalysis).
  • May also include structured activities and sources used in the research process.

8. Generate LLMs.txt Tool (firecrawl_generate_llmstxt)

Generate a standardized llms.txt (and optionally llms-full.txt) file for a given domain. This file defines how large language models should interact with the site.

{
  "name": "firecrawl_generate_llmstxt",
  "arguments": {
    "url": "https://example.com",
    "maxUrls": 20,
    "showFullText": true
  }
}

Arguments:

  • url (string, required): The base URL of the website to analyze.
  • maxUrls (number, optional): Max number of URLs to include (default: 10).
  • showFullText (boolean, optional): Whether to include llms-full.txt contents in the response.

Returns:

  • Generated llms.txt file contents and optionally the llms-full.txt (data.llmstxt and/or data.llmsfulltxt)

Logging System

The server includes comprehensive logging:

  • Operation status and progress
  • Performance metrics
  • Credit usage monitoring
  • Rate limit tracking
  • Error conditions

Example log messages:

[INFO] Firecrawl MCP Server initialized successfully
[INFO] Starting scrape for URL: https://example.com
[INFO] Batch operation queued with ID: batch_1
[WARNING] Credit usage has reached warning threshold
[ERROR] Rate limit exceeded, retrying in 2s...

Error Handling

The server provides robust error handling:

  • Automatic retries for transient errors
  • Rate limit handling with backoff
  • Detailed error messages
  • Credit usage warnings
  • Network resilience

Example error response:

{
  "content": [
    {
      "type": "text",
      "text": "Error: Rate limit exceeded. Retrying in 2 seconds..."
    }
  ],
  "isError": true
}
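On the client side, the isError flag makes such failures straightforward to detect and retry. A minimal sketch, assuming the response shape shown above (the helper and its types are illustrative, not part of the server):

interface ToolResult {
  content: { type: string; text: string }[];
  isError: boolean;
}

// Retry a tool call when the result reports isError, backing off
// between attempts along the documented exponential schedule.
async function callWithRetry(
  call: () => Promise<ToolResult>,
  maxAttempts = 3
): Promise<ToolResult> {
  for (let attempt = 1; ; attempt++) {
    const result = await call();
    if (!result.isError || attempt >= maxAttempts) return result;
    const delayMs = Math.min(1000 * 2 ** (attempt - 1), 10000);
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
}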

Development

# Install dependencies
npm install

# Build
npm run build

# Run tests
npm test

Contributing

  1. Fork the repository
  2. Create your feature branch
  3. Run tests: npm test
  4. Submit a pull request

License

MIT License – see the LICENSE file for details


