URL Threat Detection API

Deep AI-powered analysis for investigating suspicious URLs with multi-agent threat detection and evidence-based reporting.

Overview

What is URL Threat Detection?

URL Threat Detection is an AI-powered security analysis service that investigates suspicious URLs in depth. Unlike basic domain checks, this API scrapes page content, analyzes screenshots, and uses multi-agent AI to detect zero-day threats and sophisticated phishing attacks.

Asynchronous

20-40 seconds

Pricing

$0.02/scan

Multi-Agent AI

4 specialist agents

What It Analyzes

• URL patterns and domain infrastructure

• DNS, RDAP, and SSL validation

• Full page content scraping (HTML, scripts)

• Screenshot analysis with AI vision models

• Brand impersonation detection

• Evidence-based threat classification

Typical Use Cases

• Investigating URLs reported by users or security tools
• Detecting brand impersonation and phishing attacks
• Security incident response and forensic analysis
• Analyzing URLs that failed initial domain-level screening

Creating a Scan

Send a POST request to start a new URL scan. The scan will be processed asynchronously, and you'll receive a scan ID to check the status.

Endpoint

POST https://api.urlert.com/v1/scans

Request Body

{
  "url": "https://example.com"
}

Code Examples

curl -X POST https://api.urlert.com/v1/scans \
  -H "Authorization: Bearer u_sk_your_api_token_here" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

import requests

response = requests.post(
    'https://api.urlert.com/v1/scans',
    headers={
        'Authorization': 'Bearer u_sk_your_api_token_here',
        'Content-Type': 'application/json'
    },
    json={'url': 'https://example.com'}
)

response.raise_for_status()  # Raise on HTTP errors (rate limits, insufficient balance, etc.)
data = response.json()
print(data['scan_id'])  # Save this to check status later

const response = await fetch('https://api.urlert.com/v1/scans', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer u_sk_your_api_token_here',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({ url: 'https://example.com' })
});

if (!response.ok) {
  throw new Error(`Scan request failed with HTTP ${response.status}`);
}
const data = await response.json();
console.log(data.scan_id); // Save this to check status later

Response (201 Created)

{
  "scan_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "created_at": "2024-01-15T10:30:00Z"
}
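
The create call can also fail (for example with a 429, or with an insufficient-balance error), so it may help to check the status code before reading scan_id. The following is a minimal sketch, not part of the official client: the helper name extract_scan_id and the ScanCreateError class are hypothetical, and the assumption that error bodies carry a message or error field is based only on the 429 example in this document.

```python
from typing import Any, Mapping


class ScanCreateError(Exception):
    """Raised when the create-scan call does not return a usable 201 response."""


def extract_scan_id(status_code: int, body: Mapping[str, Any]) -> str:
    """Return the scan_id from a create-scan response, or raise ScanCreateError."""
    if status_code != 201:
        # Assumption: error bodies expose "message" and/or "error" fields.
        message = body.get("message") or body.get("error") or str(dict(body))
        raise ScanCreateError(f"HTTP {status_code}: {message}")
    scan_id = body.get("scan_id")
    if not scan_id:
        raise ScanCreateError("201 response did not include a scan_id")
    return scan_id
```

With the requests example above, this would be called as extract_scan_id(response.status_code, response.json()).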

Checking Scan Status

Use the scan ID from the create response to check the status and retrieve results once the scan is complete. Poll this endpoint until the status is completed or failed.

Polling Guidance: Most scans complete within 30-60 seconds. Poll every 3-5 seconds to check status. Avoid polling more frequently to stay within rate limits.

Endpoint

GET https://api.urlert.com/v1/scans/{scan_id}

Code Examples with Polling

curl https://api.urlert.com/v1/scans/550e8400-e29b-41d4-a716-446655440000 \
  -H "Authorization: Bearer u_sk_your_api_token_here"

import requests
import time

scan_id = '550e8400-e29b-41d4-a716-446655440000'
headers = {'Authorization': 'Bearer u_sk_your_api_token_here'}

# Poll until scan completes
while True:
    response = requests.get(
        f'https://api.urlert.com/v1/scans/{scan_id}',
        headers=headers
    )
    response.raise_for_status()  # Stop on HTTP errors (e.g. 404, 429) instead of parsing an error body
    data = response.json()
    
    if data['status'] in ['completed', 'failed']:
        break
    
    print(f"Status: {data['status']}, waiting...")
    time.sleep(5)  # Wait 5 seconds before next check

print(f"Final status: {data['status']}")
if data['status'] == 'completed':
    print(f"Assessment: {data['report']['final_assessment']}")

const scanId = '550e8400-e29b-41d4-a716-446655440000';
const headers = {
  'Authorization': 'Bearer u_sk_your_api_token_here'
};

// Poll until scan completes
async function pollScanStatus() {
  while (true) {
    const response = await fetch(
      `https://api.urlert.com/v1/scans/${scanId}`,
      { headers }
    );
    if (!response.ok) {
      throw new Error(`Status check failed with HTTP ${response.status}`);
    }
    const data = await response.json();
    
    if (data.status === 'completed' || data.status === 'failed') {
      return data;
    }
    
    console.log(`Status: ${data.status}, waiting...`);
    await new Promise(resolve => setTimeout(resolve, 5000)); // Wait 5 seconds
  }
}

const result = await pollScanStatus();
console.log(`Final status: ${result.status}`);
if (result.status === 'completed') {
  console.log(`Assessment: ${result.report.final_assessment}`);
}
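
The loops above poll indefinitely, which can hang an integration if a scan never reaches a final status. One possible variation is to bound the loop with an overall deadline. This is a sketch under assumptions: poll_scan and its parameters are hypothetical names, the 180-second default timeout is an arbitrary choice rather than a documented limit, and fetch_scan stands in for whatever function performs the GET request and returns the parsed JSON.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATUSES = {"completed", "failed"}


def poll_scan(
    fetch_scan: Callable[[], Dict[str, Any]],
    interval: float = 5.0,
    timeout: float = 180.0,  # assumption: not a documented limit
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Call fetch_scan until the scan is completed or failed, or the deadline passes."""
    deadline = clock() + timeout
    while True:
        data = fetch_scan()
        if data.get("status") in TERMINAL_STATUSES:
            return data
        # Give up if waiting one more interval would overshoot the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"Scan still {data.get('status')!r} after {timeout} seconds")
        sleep(interval)
```

Passing sleep and clock as parameters keeps the helper testable without real waiting; in production code the defaults can be left as they are.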

Scan Statuses

Status	Description
`pending`	Scan is queued and waiting to start
`processing`	Scan is currently running
`completed`	Scan finished successfully, results available
`failed`	Scan encountered an error
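
The four statuses can be modeled as an enumeration so that the terminal states are checked in one place. This is an illustrative sketch; ScanStatus and parse_status are hypothetical names, and returning None for an unrecognized value is a design choice made here, not documented API behavior.

```python
from enum import Enum
from typing import Optional


class ScanStatus(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"

    @property
    def is_terminal(self) -> bool:
        """True once polling can stop."""
        return self in (ScanStatus.COMPLETED, ScanStatus.FAILED)


def parse_status(value: str) -> Optional[ScanStatus]:
    """Map a raw status string to ScanStatus, or None if it is not recognized."""
    try:
        return ScanStatus(value)
    except ValueError:
        return None
```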

Response Schema

Detailed structure of the scan response object. The report field is populated only when status is completed; in the example responses it is null for scans that have not completed.

Top-Level Fields

Field	Type	Description
scan_id	`string`	UUID of the scan
url	`string`	The URL that was scanned
status	`enum`	Current status: `pending`, `processing`, `completed`, `failed`
created_at	`string`	ISO 8601 timestamp when scan was created
completed_at	`string?`	ISO 8601 timestamp when scan completed (null if not completed)
report	`object?`	Final analysis report (only when status is completed)
error	`string?`	Error message (only when status is failed)
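
The top-level fields can be mapped onto a small typed structure. The sketch below is one way to do that with the standard library; the Scan class and its duration_seconds property are hypothetical helpers, not part of the API. It reads optional fields with get because the create response shown earlier contains only scan_id, status, and created_at.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional


def _parse_ts(value: Optional[str]) -> Optional[datetime]:
    if value is None:
        return None
    # Older Python versions reject a trailing "Z", so normalize it to an offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class Scan:
    scan_id: str
    status: str
    created_at: datetime
    url: Optional[str] = None
    completed_at: Optional[datetime] = None
    report: Optional[Dict[str, Any]] = None
    error: Optional[str] = None

    @classmethod
    def from_json(cls, body: Dict[str, Any]) -> "Scan":
        return cls(
            scan_id=body["scan_id"],
            status=body["status"],
            created_at=_parse_ts(body["created_at"]),
            url=body.get("url"),
            completed_at=_parse_ts(body.get("completed_at")),
            report=body.get("report"),
            error=body.get("error"),
        )

    @property
    def duration_seconds(self) -> Optional[float]:
        """Elapsed time between creation and completion, if the scan has finished."""
        if self.completed_at is None:
            return None
        return (self.completed_at - self.created_at).total_seconds()
```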

The report object is the main analysis report. It contains the assessment, threat classification, and supporting evidence from AI agents.

Field	Type	Description
request_url	`string`	The original URL that was submitted for analysis
final_assessment	`enum`	Top-level verdict: • `Malicious` - High confidence threat detected • `Suspicious` - Multiple red flags but not definitive • `Benign` - Analyzed and found safe • `Unknown` - Analysis could not be completed
threat_type	`enum`	Classification of detected threat: • `Phishing / Credential Theft` • `Malware Distribution` • `Scam / Fraudulent E-commerce` • `Social Engineering` • `Infrastructure-level Anomaly` • `Analysis Incomplete` • `Benign`
confidence_score	`number`	Confidence level (0-100) indicating certainty of the assessment
executive_summary	`string`	Plain-English explanation of the findings, generated by the Judge agent
evidence_findings	`array`	List of specific evidence from specialist agents (URL, Network, Vision, Content). Empty if no threats found.
site_context	`object?`	Neutral description of what the site appears to be (separate from threat assessment)
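
An integration typically has to turn final_assessment and confidence_score into an action. The sketch below shows one possible mapping. The action names (block, review, allow) and the threshold of 80 are assumptions chosen for illustration; the API does not prescribe any policy, and each team should set its own.

```python
from typing import Any, Dict


def triage(report: Dict[str, Any], block_threshold: int = 80) -> str:
    """Map a report to a hypothetical action: 'block', 'review', or 'allow'."""
    assessment = report.get("final_assessment")
    confidence = report.get("confidence_score", 0)
    if assessment == "Malicious":
        # Low-confidence malicious verdicts go to a human rather than auto-blocking.
        return "block" if confidence >= block_threshold else "review"
    if assessment == "Benign":
        return "allow"
    # "Suspicious", "Unknown", and any unrecognized value fall back to manual review.
    return "review"
```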

Each entry in evidence_findings is an individual finding from an analysis agent (URL, Network, Vision, Content, Judge). Each finding has a predefined ID from the evidence catalogue (e.g., NET-001, VIS-003, CON-002).

Field	Type	Description
finding_id	`string`	Unique identifier from evidence catalogue (e.g., `URL-001`, `NET-001`, `VIS-003`)
severity	`enum`	Severity level: `Critical`, `High`, `Medium`, `Low`
source_agent	`string`	Agent that identified this finding (e.g., `URL Agent`, `Network Agent`, `Vision Agent`, `Content Agent`, `Judge Agent`)
description	`string`	Standardized description from the evidence catalogue (e.g., "Domain is newly registered")
details	`string`	Custom, case-specific details with actual evidence (e.g., "Domain 'suspicious.com' was created 2 days ago")

Common Finding IDs:

• URL-001: High-risk TLD • URL-002: Typosquatting • URL-003: Suspicious keywords
• NET-001: Newly registered domain • NET-010: Redirects to legitimate site (cloaking)
• VIS-003: Brand impersonation • VIS-006: Anti-bot/CAPTCHA detected
• CON-002: Form submits to external domain • CON-004: Urgent/threatening language
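
When displaying findings, it can be useful to show the most severe ones first and to group them by catalogue prefix. The following sketch assumes the four severity levels listed above and that every finding_id has the PREFIX-NUMBER shape seen in the examples; sort_by_severity and catalogue_prefix are hypothetical helper names.

```python
from typing import Any, Dict, List

# Lower rank sorts first; unknown severities are placed last.
SEVERITY_RANK = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


def sort_by_severity(findings: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return findings ordered from most to least severe (stable for ties)."""
    return sorted(
        findings,
        key=lambda f: SEVERITY_RANK.get(f.get("severity"), len(SEVERITY_RANK)),
    )


def catalogue_prefix(finding_id: str) -> str:
    """Return the part of a finding ID before the first hyphen, e.g. 'NET'."""
    return finding_id.split("-", 1)[0]
```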

The site_context object is a neutral, factual description of the website's content and purpose as determined by the Vision and Content agents. It describes what the site appears to be, separate from the threat assessment. All fields are optional.

Field	Type	Description
apparent_purpose	`string?`	What the site claims to be (e.g., "E-commerce store", "Bank login page", "Information website")
primary_language	`string?`	ISO language code of primary content (e.g., "en", "es", "fr")
detected_brand	`string?`	Brand or company name the site represents or impersonates (e.g., "PayPal", "Amazon"). May indicate impersonation if mismatched with domain.
page_title	`string?`	Content of the HTML `<title>` tag
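
The detected_brand description notes that a mismatch with the domain may indicate impersonation. A deliberately naive way to surface that signal is sketched below: it checks whether the brand name, reduced to lowercase letters and digits, appears anywhere in the hostname. The function name brand_in_host is hypothetical, and the heuristic is an assumption made for illustration. A True result does not prove legitimacy (a lookalike host can contain the brand name), so at most a False result is a hint worth reviewing alongside the report's own findings.

```python
import re
from typing import Optional
from urllib.parse import urlparse


def brand_in_host(detected_brand: Optional[str], url: str) -> Optional[bool]:
    """Return True/False if the brand token appears in the hostname, None if no brand."""
    if not detected_brand:
        return None
    token = re.sub(r"[^a-z0-9]", "", detected_brand.lower())
    if not token:
        return None
    host = (urlparse(url).hostname or "").lower()
    # Ignore hyphens so "big-bank-corp.example" still matches "Big Bank Corp".
    return token in host.replace("-", "")
```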

Example Responses

{
  "scan_id": "550e8400-e29b-41d4-a716-446655440000",
  "url": "https://example.com",
  "status": "processing",
  "created_at": "2024-01-15T10:30:00Z",
  "completed_at": null,
  "report": null,
  "error": null
}

{
  "scan_id": "550e8400-e29b-41d4-a716-446655440000",
  "url": "https://example.com",
  "status": "completed",
  "created_at": "2024-01-15T10:30:00Z",
  "completed_at": "2024-01-15T10:30:45Z",
  "report": {
    "request_url": "https://example.com",
    "final_assessment": "Benign",
    "threat_type": "Benign",
    "confidence_score": 95,
    "executive_summary": "This website has been analyzed and determined to be safe. No malicious indicators were detected in the domain registration, SSL certificate, page content, or visual presentation. The site appears to be a legitimate information page.",
    "evidence_findings": [],
    "site_context": {
      "apparent_purpose": "Information website",
      "primary_language": "en",
      "detected_brand": null,
      "page_title": "Example Domain"
    }
  },
  "error": null
}

{
  "scan_id": "550e8400-e29b-41d4-a716-446655440000",
  "url": "https://example-phishing-site.com/login.php",
  "status": "completed",
  "created_at": "2024-01-15T10:30:00Z",
  "completed_at": "2024-01-15T10:31:02Z",
  "report": {
    "request_url": "https://example-phishing-site.com/login.php",
    "final_assessment": "Malicious",
    "threat_type": "Phishing / Credential Theft",
    "confidence_score": 95,
    "executive_summary": "This website is assessed as a high-confidence phishing site. It combines a newly registered domain (2 days old) with a visual layout that impersonates a major financial institution, while all forms submit credentials to an unrelated, suspicious domain.",
    "evidence_findings": [
      {
        "finding_id": "NET-001",
        "severity": "Critical",
        "source_agent": "Network Agent",
        "description": "Domain is newly registered.",
        "details": "The domain 'example-phishing-site.com' was registered 2 days ago."
      },
      {
        "finding_id": "VIS-003",
        "severity": "High",
        "source_agent": "Vision Agent",
        "description": "Brand impersonation detected.",
        "details": "The website's logo, color scheme, and layout are a near-perfect clone of 'Big Bank Corp', which is a known financial institution."
      },
      {
        "finding_id": "CON-002",
        "severity": "Critical",
        "source_agent": "Content Agent",
        "description": "Form submits to an external domain.",
        "details": "The HTML login form 'action' attribute points to 'http://steal-my-data.biz/submit.php', which is an external domain not associated with the visible domain or 'Big Bank Corp'."
      }
    ],
    "site_context": {
      "apparent_purpose": "Login page for 'Big Bank Corp'",
      "primary_language": "en",
      "detected_brand": "Big Bank Corp",
      "page_title": "Big Bank Corp :: Secure Login"
    }
  },
  "error": null
}

{
  "scan_id": "550e8400-e29b-41d4-a716-446655440000",
  "url": "https://suspicious-deal-site.com",
  "status": "completed",
  "created_at": "2024-01-15T10:30:00Z",
  "completed_at": "2024-01-15T10:30:58Z",
  "report": {
    "request_url": "https://suspicious-deal-site.com",
    "final_assessment": "Suspicious",
    "threat_type": "Scam / Fraudulent E-commerce",
    "confidence_score": 75,
    "executive_summary": "This website exhibits multiple red flags commonly associated with scam e-commerce sites, including a newly registered domain, high-pressure language, and deceptive UI elements. While not definitively malicious, users should exercise extreme caution.",
    "evidence_findings": [
      {
        "finding_id": "NET-001",
        "severity": "Critical",
        "source_agent": "Network Agent",
        "description": "Domain is newly registered.",
        "details": "The domain 'suspicious-deal-site.com' was created 5 days ago."
      },
      {
        "finding_id": "VIS-004",
        "severity": "Medium",
        "source_agent": "Vision Agent",
        "description": "Site uses high-pressure or deceptive UI elements.",
        "details": "The page features fake countdown timers and false scarcity claims ('Only 2 left in stock!')."
      },
      {
        "finding_id": "CON-004",
        "severity": "Medium",
        "source_agent": "Content Agent",
        "description": "Content uses urgent or threatening language.",
        "details": "The text content contains multiple urgency phrases: 'Limited time offer!', 'Act now before it's too late!', 'Don't miss out!'."
      }
    ],
    "site_context": {
      "apparent_purpose": "E-commerce store",
      "primary_language": "en",
      "detected_brand": null,
      "page_title": "Amazing Deals - 90% Off Everything!"
    }
  },
  "error": null
}

{
  "scan_id": "550e8400-e29b-41d4-a716-446655440000",
  "url": "https://unreachable-site.com",
  "status": "completed",
  "created_at": "2024-01-15T10:30:00Z",
  "completed_at": "2024-01-15T10:30:15Z",
  "report": {
    "request_url": "https://unreachable-site.com",
    "final_assessment": "Suspicious",
    "threat_type": "Infrastructure-level Anomaly",
    "confidence_score": 0,
    "executive_summary": "Analysis halted: DNS failure. The domain could not be resolved. This is common for recently taken-down malicious sites or typosquatted domains that were never properly configured.",
    "evidence_findings": [
      {
        "finding_id": "SYS-001",
        "severity": "Medium",
        "source_agent": "Network Agent",
        "description": "Failed to resolve domain (DNS failure).",
        "details": "The domain 'unreachable-site.com' could not be resolved (NXDOMAIN). This is common for temporary or recently-taken-down malicious sites."
      }
    ],
    "site_context": null
  },
  "error": null
}

Rate Limiting

All API endpoints are rate limited on a per-organization, per-minute basis. Rate limits are enforced using a sliding window algorithm to ensure fair usage.

Default Rate Limits

Operation	Limit	Description
POST /v1/scans	10/minute	Create new URL scans
GET /v1/scans/:id	60/minute	Check scan status and results
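
To avoid hitting these limits in the first place, a client can pace its own requests. The sketch below is a client-side sliding-window counter using the limits from the table. It is an approximation under assumptions: the server's exact window accounting is not documented here, and SlidingWindowLimiter is a hypothetical helper, so the response headers remain the authoritative source.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls within any rolling `window` seconds."""

    def __init__(self, limit: int, window: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._limit = limit
        self._window = window
        self._clock = clock
        self._stamps: Deque[float] = deque()

    def _prune(self, now: float) -> None:
        # Drop timestamps that have left the window.
        while self._stamps and now - self._stamps[0] >= self._window:
            self._stamps.popleft()

    def try_acquire(self) -> bool:
        """Record a call and return True if it fits in the window, else False."""
        now = self._clock()
        self._prune(now)
        if len(self._stamps) < self._limit:
            self._stamps.append(now)
            return True
        return False

    def wait_time(self) -> float:
        """Seconds until the next call would be allowed (0.0 if allowed now)."""
        now = self._clock()
        self._prune(now)
        if len(self._stamps) < self._limit:
            return 0.0
        return self._window - (now - self._stamps[0])
```

One limiter per operation would mirror the table, for example SlidingWindowLimiter(10) for scan creation and SlidingWindowLimiter(60) for status checks.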

Rate Limit Headers

Every API response includes headers that help you track your rate limit status:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1705315800

Header	Description
X-RateLimit-Limit	Maximum requests allowed per minute
X-RateLimit-Remaining	Requests remaining in current window
X-RateLimit-Reset	Unix timestamp when the rate limit resets
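
These headers can be parsed into a small structure to decide whether to slow down. The sketch below assumes the three header names shown above and integer values; RateLimitInfo and parse_rate_limit are hypothetical names. Note that a plain dict lookup is case-sensitive, while the headers object returned by the requests library is case-insensitive.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: int
    remaining: int
    reset_epoch: int  # Unix timestamp, in seconds

    def seconds_until_reset(self, now: float) -> float:
        """Seconds from `now` (a Unix timestamp) until the limit resets."""
        return max(0.0, self.reset_epoch - now)


def parse_rate_limit(headers: Mapping[str, str]) -> Optional[RateLimitInfo]:
    """Return the parsed rate limit headers, or None if any is missing or malformed."""
    try:
        return RateLimitInfo(
            limit=int(headers["X-RateLimit-Limit"]),
            remaining=int(headers["X-RateLimit-Remaining"]),
            reset_epoch=int(headers["X-RateLimit-Reset"]),
        )
    except (KeyError, ValueError):
        return None
```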

Handling Rate Limit Errors

When you exceed your rate limit, you'll receive a 429 Too Many Requests response with a Retry-After header indicating how long to wait:

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 10
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1705315800
Retry-After: 42

{
  "error": "rate_limit_exceeded",
  "message": "Rate limit exceeded. Try again in 42 seconds.",
  "limit": 10,
  "retry_after": 42
}
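
A retry wrapper can honor this response by waiting for the indicated number of seconds before trying again. The following is a sketch rather than a definitive implementation: send_with_retry and retry_delay are hypothetical names, send stands in for any function that performs the request and returns a (status, headers, body) tuple, and the retry count and 5-second fallback are arbitrary defaults. It reads Retry-After as a number of seconds, as in the example above.

```python
import time
from typing import Any, Callable, Dict, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], Dict[str, Any]]


def retry_delay(headers: Mapping[str, str], body: Mapping[str, Any],
                default: float = 5.0) -> float:
    """Prefer the Retry-After header, then the body's retry_after, then a default."""
    raw = headers.get("Retry-After", body.get("retry_after"))
    try:
        return max(0.0, float(raw))
    except (TypeError, ValueError):
        return default


def send_with_retry(send: Callable[[], Response], max_retries: int = 3,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send(), waiting and retrying while the response is 429."""
    attempt = 0
    while True:
        status, headers, body = send()
        if status != 429 or attempt >= max_retries:
            return status, headers, body
        sleep(retry_delay(headers, body))
        attempt += 1
```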

Best Practices

• Monitor Headers: Check rate limit headers in responses to track your usage and avoid hitting limits.
• Implement Backoff: When you receive a 429 error, respect the Retry-After header value before retrying.
• Cache When Possible: For read operations, consider caching results you have already retrieved to reduce the number of API calls.
• Poll Responsibly: When polling for scan results, use intervals of 3-5 seconds to stay well within the read rate limit.

Best Practices

• Poll Every 3-5 Seconds: Most scans complete within 30-60 seconds. Polling more frequently may hit rate limits (60 requests/minute for reads).
• Store Scan IDs: Save the scan_id from the create response to check results later or for record-keeping.
• Handle All Statuses: Your integration should gracefully handle all scan statuses (pending, processing, completed, failed).
• Secure Your Token: Never expose your API token in client-side code or public repositories. Use environment variables.
• Handle Errors: Always check response status codes and handle errors appropriately (insufficient balance, rate limits, etc.).
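
As an illustration of the token guidance above, the Authorization header can be built from an environment variable instead of a hard-coded string. This is a minimal sketch; the variable name URLERT_API_TOKEN and the helper auth_headers are assumptions made for this example, not names defined by the API.

```python
import os
from typing import Dict


def auth_headers(env_var: str = "URLERT_API_TOKEN") -> Dict[str, str]:
    """Build the Authorization header from an environment variable (hypothetical name)."""
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your API token")
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be passed as the headers argument in the requests examples earlier in this document.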