A Model Context Protocol (MCP) server implementation that integrates with Firecrawl for web scraping capabilities.
Big thanks to @vrknetha, @cawstudios for the initial implementation!
You can also play around with our MCP Server on MCP.so's playground. Thanks to MCP.so for hosting and @gstarwd for integrating our server.
Features
- Scrape, crawl, search, extract, deep research, and batch scrape support
- Web scraping with JS rendering
- URL discovery and crawling
- Web search with content extraction
- Automatic retries with exponential backoff
- Efficient batch processing with built-in rate limiting
- Credit usage monitoring for cloud API
- Comprehensive logging system
- Support for cloud and self-hosted Firecrawl instances
- Mobile/Desktop viewport support
- Smart content filtering with tag inclusion/exclusion
Installation
Running with npx
env FIRECRAWL_API_KEY=fc-YOUR_API_KEY npx -y firecrawl-mcp
Manual Installation
npm install -g firecrawl-mcp
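Since npx runs the same package, a global install should expose a firecrawl-mcp binary on your PATH (an assumption; verify with `which firecrawl-mcp`), letting you start the server directly:
env FIRECRAWL_API_KEY=fc-YOUR_API_KEY firecrawl-mcp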
Running on Cursor
Configuring Cursor 🖥️ Note: Requires Cursor version 0.45.6+. For the most up-to-date configuration instructions, please refer to the official Cursor documentation on configuring MCP servers: Cursor MCP Server Configuration Guide
To configure Firecrawl MCP in Cursor v0.45.6:
- Open Cursor Settings
- Go to Features > MCP Servers
- Click "+ Add New MCP Server"
- Enter the following:
- Name: "firecrawl-mcp" (or your preferred name)
- Type: "command"
- Command:
env FIRECRAWL_API_KEY=your-api-key npx -y firecrawl-mcp
To configure Firecrawl MCP in Cursor v0.48.6:
- Open Cursor Settings
- Go to Features > MCP Servers
- Click "+ Add new global MCP server"
- Enter the following code:
{
  "mcpServers": {
    "firecrawl-mcp": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "YOUR-API-KEY"
      }
    }
  }
}
If you are using Windows and are running into issues, try:
cmd /c "set FIRECRAWL_API_KEY=your-api-key && npx -y firecrawl-mcp"
Replace your-api-key with your Firecrawl API key. If you don't have one yet, you can create an account and get it from https://www.firecrawl.dev/app/api-keys
After adding, refresh the MCP server list to see the new tools. The Composer Agent will automatically use Firecrawl MCP when appropriate, but you can explicitly request it by describing your web scraping needs. Access the Composer via Command+L (Mac), select "Agent" next to the submit button, and enter your query.
Running on Windsurf
Add this to your ./codeium/windsurf/model_config.json:
{
  "mcpServers": {
    "mcp-server-firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "YOUR_API_KEY"
      }
    }
  }
}
Installing via Smithery (Legacy)
To install Firecrawl for Claude Desktop automatically via Smithery:
npx -y @smithery/cli install @mendableai/mcp-server-firecrawl --client claude
Configuration
Environment Variables
Required for Cloud API
FIRECRAWL_API_KEY: Your Firecrawl API key
- Required when using the cloud API (default)
- Optional when using a self-hosted instance with FIRECRAWL_API_URL
FIRECRAWL_API_URL (Optional): Custom API endpoint for self-hosted instances
- Example: https://firecrawl.your-domain.com
- If not provided, the cloud API will be used (requires an API key)
Optional Configuration
Retry Configuration
FIRECRAWL_RETRY_MAX_ATTEMPTS: Maximum number of retry attempts (default: 3)
FIRECRAWL_RETRY_INITIAL_DELAY: Initial delay in milliseconds before the first retry (default: 1000)
FIRECRAWL_RETRY_MAX_DELAY: Maximum delay in milliseconds between retries (default: 10000)
FIRECRAWL_RETRY_BACKOFF_FACTOR: Exponential backoff multiplier (default: 2)
Credit Usage Monitoring
FIRECRAWL_CREDIT_WARNING_THRESHOLD: Credit usage warning threshold (default: 1000)
FIRECRAWL_CREDIT_CRITICAL_THRESHOLD: Credit usage critical threshold (default: 100)
Configuration Examples
For cloud API usage with custom retry and credit monitoring:
# Required for cloud API
export FIRECRAWL_API_KEY=your-api-key

# Optional retry configuration
export FIRECRAWL_RETRY_MAX_ATTEMPTS=5       # Increase max retry attempts
export FIRECRAWL_RETRY_INITIAL_DELAY=2000   # Start with 2s delay
export FIRECRAWL_RETRY_MAX_DELAY=30000      # Maximum 30s delay
export FIRECRAWL_RETRY_BACKOFF_FACTOR=3     # More aggressive backoff

# Optional credit monitoring
export FIRECRAWL_CREDIT_WARNING_THRESHOLD=2000    # Warning at 2000 credits
export FIRECRAWL_CREDIT_CRITICAL_THRESHOLD=500    # Critical at 500 credits
For a self-hosted instance:
# Required for self-hosted
export FIRECRAWL_API_URL=https://firecrawl.your-domain.com

# Optional authentication for self-hosted
export FIRECRAWL_API_KEY=your-api-key   # If your instance requires auth

# Custom retry configuration
export FIRECRAWL_RETRY_MAX_ATTEMPTS=10
export FIRECRAWL_RETRY_INITIAL_DELAY=500   # Start with faster retries
Usage with Claude Desktop
Add this to your claude_desktop_config.json:
{
  "mcpServers": {
    "mcp-server-firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "YOUR_API_KEY_HERE",
        "FIRECRAWL_RETRY_MAX_ATTEMPTS": "5",
        "FIRECRAWL_RETRY_INITIAL_DELAY": "2000",
        "FIRECRAWL_RETRY_MAX_DELAY": "30000",
        "FIRECRAWL_RETRY_BACKOFF_FACTOR": "3",
        "FIRECRAWL_CREDIT_WARNING_THRESHOLD": "2000",
        "FIRECRAWL_CREDIT_CRITICAL_THRESHOLD": "500"
      }
    }
  }
}
System Configuration
The server includes several configurable parameters that can be set via environment variables. These are the default values if not configured:
const CONFIG = {
  retry: {
    maxAttempts: 3,     // Number of retry attempts for rate-limited requests
    initialDelay: 1000, // Initial delay before first retry (in milliseconds)
    maxDelay: 10000,    // Maximum delay between retries (in milliseconds)
    backoffFactor: 2,   // Multiplier for exponential backoff
  },
  credit: {
    warningThreshold: 1000,  // Warn when credit usage reaches this level
    criticalThreshold: 100,  // Critical alert when credit usage reaches this level
  },
};
These configurations control:
- Retry Behavior
  - Automatically retries failed requests due to rate limits
  - Uses exponential backoff to avoid overwhelming the API
  - Example: With default settings, retries will be attempted at (see the sketch after this list):
    - 1st retry: 1 second delay
    - 2nd retry: 2 seconds delay
    - 3rd retry: 4 seconds delay (capped at maxDelay)
- Credit Usage Monitoring
  - Tracks API credit consumption for cloud API usage
  - Provides warnings at specified thresholds
  - Helps prevent unexpected service interruption
  - Example: With default settings:
    - Warning at 1000 credits remaining
    - Critical alert at 100 credits remaining
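As a rough illustration of that schedule (a sketch, not the server's actual code), the delay for retry n is initialDelay * backoffFactor^(n - 1), capped at maxDelay:

// Sketch of the exponential backoff schedule described above; not the server's code.
const retry = { maxAttempts: 3, initialDelay: 1000, maxDelay: 10000, backoffFactor: 2 };

function retryDelay(attempt) {
  // attempt is 1-based: 1000ms, then 2000ms, then 4000ms (never above maxDelay)
  const delay = retry.initialDelay * Math.pow(retry.backoffFactor, attempt - 1);
  return Math.min(delay, retry.maxDelay);
}

for (let attempt = 1; attempt <= retry.maxAttempts; attempt++) {
  console.log(`Retry ${attempt}: ${retryDelay(attempt)}ms`); // 1000, 2000, 4000
}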
Rate Limiting and Batch Processing
The server utilizes Firecrawl's built-in rate limiting and batch processing capabilities:
- Automatic rate limit handling with exponential backoff
- Efficient parallel processing for batch operations
- Smart request queuing and throttling
- Automatic retries for transient errors
Available Tools
1. Scrape Tool (firecrawl_scrape)
Scrape content from a single URL with advanced options.
{
  "name": "firecrawl_scrape",
  "arguments": {
    "url": "https://example.com",
    "formats": ["markdown"],
    "onlyMainContent": true,
    "waitFor": 1000,
    "timeout": 30000,
    "mobile": false,
    "includeTags": ["article", "main"],
    "excludeTags": ["nav", "footer"],
    "skipTlsVerification": false
  }
}
2. Batch Scrape Tool (firecrawl_batch_scrape)
Scrape multiple URLs efficiently with built-in rate limiting and parallel processing.
{
  "name": "firecrawl_batch_scrape",
  "arguments": {
    "urls": ["https://example1.com", "https://example2.com"],
    "options": {
      "formats": ["markdown"],
      "onlyMainContent": true
    }
  }
}
The response includes an operation ID for status checking:
{
  "content": [
    {
      "type": "text",
      "text": "Batch operation queued with ID: batch_1. Use firecrawl_check_batch_status to check progress."
    }
  ],
  "isError": false
}
3. Check Batch Status (firecrawl_check_batch_status)
Check the status of a batch operation.
{
  "name": "firecrawl_check_batch_status",
  "arguments": {
    "id": "batch_1"
  }
}
4. Search Tool (firecrawl_search)
Search the web and optionally extract content from search results.
{
  "name": "firecrawl_search",
  "arguments": {
    "query": "your search query",
    "limit": 5,
    "lang": "en",
    "country": "us",
    "scrapeOptions": {
      "formats": ["markdown"],
      "onlyMainContent": true
    }
  }
}
5. Crawl Tool (firecrawl_crawl)
Start an asynchronous crawl with advanced options.
{
  "name": "firecrawl_crawl",
  "arguments": {
    "url": "https://example.com",
    "maxDepth": 2,
    "limit": 100,
    "allowExternalLinks": false,
    "deduplicateSimilarURLs": true
  }
}
6. Extract Tool (firecrawl_extract)
Extract structured information from web pages using LLM capabilities. Supports both cloud AI and self-hosted LLM extraction.
{
  "name": "firecrawl_extract",
  "arguments": {
    "urls": ["https://example.com/page1", "https://example.com/page2"],
    "prompt": "Extract product information including name, price, and description",
    "systemPrompt": "You are a helpful assistant that extracts product information",
    "schema": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "price": { "type": "number" },
        "description": { "type": "string" }
      },
      "required": ["name", "price"]
    },
    "allowExternalLinks": false,
    "enableWebSearch": false,
    "includeSubdomains": false
  }
}
Example response:
{
  "content": [
    {
      "type": "text",
      "text": {
        "name": "Example Product",
        "price": 99.99,
        "description": "This is an example product description"
      }
    }
  ],
  "isError": false
}
Extract Tool Options:
- urls: Array of URLs to extract information from
- prompt: Custom prompt for the LLM extraction
- systemPrompt: System prompt to guide the LLM
- schema: JSON schema for structured data extraction
- allowExternalLinks: Allow extraction from external links
- enableWebSearch: Enable web search for additional context
- includeSubdomains: Include subdomains in extraction
When using a self-hosted instance, extraction will use your configured LLM. With the cloud API, it uses Firecrawl's managed LLM service.
7. Deep Research Tool (firecrawl_deep_research)
Conduct deep web research on a query using intelligent crawling, search, and LLM analysis.
{
  "name": "firecrawl_deep_research",
  "arguments": {
    "query": "how does carbon capture technology work?",
    "maxDepth": 3,
    "timeLimit": 120,
    "maxUrls": 50
  }
}
Arguments:
- query (string, required): The research question or topic to explore.
- maxDepth (number, optional): Maximum recursive depth for crawling/search (default: 3).
- timeLimit (number, optional): Time limit in seconds for the research session (default: 120).
- maxUrls (number, optional): Maximum number of URLs to analyze (default: 50).
Returns:
- Final analysis generated by an LLM based on the research. (data.finalAnalysis)
- May also include structured activities and sources used in the research process.
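For illustration, a response could follow the same content/text convention used by the other tools here (the exact fields returned may differ):
{
  "content": [
    {
      "type": "text",
      "text": "Final analysis: Carbon capture technology works by..."
    }
  ],
  "isError": false
}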
8. Generate LLMs.txt Tool (firecrawl_generate_llmstxt)
Generate a standardized llms.txt (and optionally llms-full.txt) file for a given domain. This file defines how large language models should interact with the site.
{
  "name": "firecrawl_generate_llmstxt",
  "arguments": {
    "url": "https://example.com",
    "maxUrls": 20,
    "showFullText": true
  }
}
Arguments:
- url (string, required): The base URL of the website to analyze.
- maxUrls (number, optional): Max number of URLs to include (default: 10).
- showFullText (boolean, optional): Whether to include llms-full.txt contents in the response.
Returns:
- Generated llms.txt file contents and optionally the llms-full.txt (data.llmstxt and/or data.llmsfulltxt)
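For context, llms.txt files generally follow the llmstxt.org convention: an H1 title, a short blockquote summary, and sections of annotated links. A minimal illustrative example (not actual tool output):
# Example
> A short summary of what this site offers.

## Docs
- [Getting started](https://example.com/docs/getting-started): Introduction and setup guide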
Logging System
The server includes comprehensive logging:
- Operation status and progress
- Performance metrics
- Credit usage monitoring
- Rate limit tracking
- Error conditions
Example log messages:
[INFO] Firecrawl MCP Server initialized successfully
[INFO] Starting scrape for URL: https://example.com
[INFO] Batch operation queued with ID: batch_1
[WARNING] Credit usage has reached warning threshold
[ERROR] Rate limit exceeded, retrying in 2s...
Error Handling
The server provides robust error handling:
- Automatic retries for transient errors
- Rate limit handling with backoff
- Detailed error messages
- Credit usage warnings
- Network resilience
Example error response:
{
  "content": [
    {
      "type": "text",
      "text": "Error: Rate limit exceeded. Retrying in 2 seconds..."
    }
  ],
  "isError": true
}
Development
# Install dependencies
npm install

# Build
npm run build

# Run tests
npm test
Contributing
- Fork the repository
- Create your feature branch
- Run tests:
npm test
- Submit a pull request
License
MIT License – see LICENSE file for details