Sunday, June 8, 2025

wappalyzer-next – Python library that uses the Wappalyzer extension (and its fingerprints) to detect technologies




This project is a command line tool and python library that uses the Wappalyzer extension (and its fingerprints) to detect technologies. Other projects that emerged after the discontinuation of the official open source project use outdated fingerprints and lack accuracy on dynamic web apps; this project bypasses those limitations.

Installation

Before installing wappalyzer, you will need to install Firefox and geckodriver. Below are detailed steps for setting up geckodriver, but you may use Google/YouTube for help.

Setting up geckodriver

Step 1: Download GeckoDriver

  1. Go to the official GeckoDriver releases page on GitHub:
    https://github.com/mozilla/geckodriver/releases
  2. Download the version compatible with your system:
    • For Windows: geckodriver-vX.XX.X-win64.zip
    • For macOS: geckodriver-vX.XX.X-macos.tar.gz
    • For Linux: geckodriver-vX.XX.X-linux64.tar.gz
  3. Extract the downloaded file to a folder of your choice.

Step 2: Add GeckoDriver to the System Path

To ensure Selenium can locate the GeckoDriver executable:

  • Windows:
    1. Move geckodriver.exe to a directory (e.g., C:\WebDrivers).
    2. Add this directory to the system's PATH:
      • Open Environment Variables.
      • Under System Variables, find and select the Path variable, then click Edit.
      • Click New and enter the directory path where geckodriver.exe is stored.
      • Click OK to save.
  • macOS/Linux:
    1. Move the geckodriver file to /usr/local/bin/ or another directory in your PATH.
    2. Use the following command in the terminal:
sudo mv geckodriver /usr/local/bin/
    Ensure /usr/local/bin/ is in your PATH.
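Before moving on, it can help to confirm that the driver is actually discoverable. The following is a minimal sketch using only the Python standard library; the helper name find_geckodriver is made up for illustration and is not part of wappalyzer.

```python
import shutil
import subprocess


def find_geckodriver(name="geckodriver"):
    """Return the full path of the driver if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_geckodriver()
    if path is None:
        print("geckodriver was not found on PATH")
    else:
        # Ask the driver for its version string to confirm it can be executed.
        version = subprocess.run([path, "--version"], capture_output=True, text=True)
        print(path)
        print(version.stdout.splitlines()[0] if version.stdout else "")
```

If the script reports that the driver was not found, the directory holding geckodriver is most likely missing from PATH, or the terminal was opened before PATH was changed.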

Install as a command-line tool

pipx install wappalyzer

Install as a library

To use it as a library, install it with pip inside an isolated environment, e.g. venv or docker. You may also pass --break-system-packages to do a ‘regular’ install, but this is not recommended.

Install with docker

Steps
  1. Clone the repository:
git clone https://github.com/s0md3v/wappalyzer-next.git
cd wappalyzer-next
  2. Build and run with Docker Compose:
docker compose up -d
  3. To scan URLs using the Docker container:

  • Scan a single URL:

docker compose run --rm wappalyzer -i https://example.com
  • Scan a URL and export the results to a JSON file:
docker compose run --rm wappalyzer -i https://example.com -oJ output.json

For Users

Some common usage examples are given below; refer to the list of all options for more information.

  • Scan a single URL: wappalyzer -i https://example.com
  • Scan multiple URLs from a file: wappalyzer -i urls.txt -t 10
  • Scan with authentication: wappalyzer -i https://example.com -c "sessionid=abc123; token=xyz789"
  • Export results to JSON: wappalyzer -i https://example.com -oJ results.json

Options

Note: For accuracy use the ‘full’ scan type (default). ‘fast’ and ‘balanced’ do not use browser emulation.

  • -i: Input URL or file containing URLs (one per line)
  • --scan-type: Scan type (default: ‘full’)
    • fast: Quick HTTP-based scan (sends 1 request)
    • balanced: HTTP-based scan with more requests
    • full: Complete scan using the wappalyzer extension
  • -t, --threads: Number of concurrent threads (default: 5)
  • -oJ: JSON output file path
  • -oC: CSV output file path
  • -oH: HTML output file path
  • -c, --cookie: Cookie header string for authenticated scans
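When the tool is driven from a script, the options above can be assembled into an argument list before handing it to something like subprocess.run. This is a rough sketch: the helper build_command and its parameter names are hypothetical, only flags listed above are used, and the scan type is assumed to be one of fast, balanced, or full.

```python
import shlex


def build_command(target, scan_type="full", threads=5, json_out=None, cookie=None):
    """Assemble an argument list for the wappalyzer CLI from the listed options."""
    if scan_type not in ("fast", "balanced", "full"):
        raise ValueError(f"unknown scan type: {scan_type}")
    cmd = ["wappalyzer", "-i", target, "--scan-type", scan_type, "-t", str(threads)]
    if json_out:
        cmd += ["-oJ", json_out]
    if cookie:
        cmd += ["-c", cookie]
    return cmd


cmd = build_command("urls.txt", scan_type="fast", threads=10, json_out="results.json")
print(shlex.join(cmd))
# wappalyzer -i urls.txt --scan-type fast -t 10 -oJ results.json
```

Keeping the command as a list rather than one string avoids shell quoting problems with cookie values that contain spaces or semicolons.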

For Developers

The python library is available on PyPI as wappalyzer and can be imported with the same name.

Using the Library

The main function you'll interact with is analyze():

from wappalyzer import analyze

# Basic usage
results = analyze('https://example.com')

# With options
results = analyze(
    url="https://example.com",
    scan_type="full",  # 'fast', 'balanced', or 'full'
    threads=3,
    cookie="sessionid=abc123"
)

analyze() Function Parameters

  • url (str): The URL to analyze
  • scan_type (str, optional): Type of scan to perform
    • 'fast': Quick HTTP-based scan
    • 'balanced': HTTP-based scan with more requests
    • 'full': Complete scan including JavaScript execution (default)
  • threads (int, optional): Number of threads for parallel processing (default: 3)
  • cookie (str, optional): Cookie header string for authenticated scans
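Since analyze() takes a single URL, scanning a list from a script means calling it once per URL. The sketch below shows one way to do that; read_urls and scan_many are hypothetical helpers, and fake_analyze is a stand-in that only mimics the documented parameters so the example runs without a browser. In real use, the imported analyze function would be passed instead.

```python
def read_urls(lines):
    """Return non-empty, stripped entries from an iterable of lines."""
    return [line.strip() for line in lines if line.strip()]


def scan_many(urls, analyze_fn, **options):
    """Call analyze_fn once per URL and merge the per-URL dictionaries."""
    merged = {}
    for url in urls:
        merged.update(analyze_fn(url=url, **options))
    return merged


def fake_analyze(url, scan_type="full", threads=3, cookie=None):
    # Stand-in for wappalyzer.analyze: returns an empty result for the URL.
    return {url: {}}


urls = read_urls(["https://example.com\n", "\n", "https://github.com\n"])
print(scan_many(urls, fake_analyze, scan_type="fast"))
```

Passing the analysis function in as an argument keeps the loop easy to test and makes it simple to swap in the real function later.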

Return Value

Returns a dictionary with the URL as key and detected technologies as value:

{
  "https://github.com": {
    "Amazon S3": {"version": "", "confidence": 100, "categories": ["CDN"], "groups": ["Servers"]},
    "lit-html": {"version": "1.1.2", "confidence": 100, "categories": ["JavaScript libraries"], "groups": ["Web development"]},
    "React Router": {"version": "6", "confidence": 100, "categories": ["JavaScript frameworks"], "groups": ["Web development"]}
  },
  "https://google.com": {},
  "https://example.com": {}
}
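A nested dictionary of this shape is often easier to work with once flattened into rows. The sketch below assumes each technology entry carries "version" and "confidence" keys as in the sample output; the helper name flatten is illustrative and not part of the library.

```python
def flatten(results):
    """Turn {url: {technology: info}} into (url, technology, version, confidence) rows."""
    rows = []
    for url, technologies in results.items():
        for name, info in technologies.items():
            rows.append((url, name, info.get("version", ""), info.get("confidence")))
    return rows


sample = {
    "https://github.com": {
        "lit-html": {
            "version": "1.1.2",
            "confidence": 100,
            "categories": ["JavaScript libraries"],
            "groups": ["Web development"],
        },
    },
    "https://example.com": {},
}

for row in flatten(sample):
    print(row)
```

URLs with no detected technologies, such as the empty entry in the sample, simply produce no rows.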

FAQ

Why use Firefox instead of Chrome?

Firefox extensions are .xpi files, which are essentially zip files. This makes it easier to extract data and slightly modify the extension to make this tool work.
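To illustrate why the zip format is convenient, the sketch below lists the members of a zip-format archive with Python's standard zipfile module. The archive here is a tiny stand-in built in memory with made-up file names; it does not reproduce the layout of the real Wappalyzer extension, and list_archive is a hypothetical helper.

```python
import io
import json
import zipfile


def list_archive(data):
    """Return the member names of a zip-format archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()


# Build a small stand-in archive; a real .xpi would be read from disk instead.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("manifest.json", json.dumps({"name": "demo"}))
    archive.writestr("data/example.json", "{}")

print(list_archive(buffer.getvalue()))
# ['manifest.json', 'data/example.json']
```

The same ZipFile object can read or rewrite individual members, which is what makes small modifications to an extension package practical.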

What is the difference between ‘fast’, ‘balanced’, and ‘full’ scan types?

  • fast: Sends a single HTTP request to the URL. Doesn't use the extension.
  • balanced: Sends additional HTTP requests to .js files and /robots.txt, and does DNS queries. Doesn't use the extension.
  • full: Uses the official Wappalyzer extension to scan the URL in a headless browser.


