The adverse media screening check analyzes news articles, media coverage, and other public sources to identify potential risks or concerns associated with a business. It searches multiple data sources, including Refinitiv World-Check, ComplyAdvantage, and Google News, and assembles the results into adverse media profiles.

Check ID

kyb.adverse_media_screening_check_v2

Response Structure

The check result contains a list of verified adverse media hits, where each hit represents a business profile found in adverse media sources along with detailed article information and metadata.
{
  type: "KYBAdverseMediaScreeningCheckResultV2";
  verified_adverse_media_hits: Array<BusinessAdverseMediaProfile>;
}

BusinessAdverseMediaProfile Structure

Each adverse media profile contains comprehensive information about a business found in adverse media sources:
{
  // Identification
  id: string;                    // Short unique identifier (8 characters)
  reference_id: string | null;   // External reference ID from data source
  business_name: string | null;  // Name of the business in the profile

  // Geographic Information
  associated_countries: Array<StandardizedCountry>;
  associated_addresses: Array<Address>;
  location: string | null;       // e.g., "Houston, TX"

  // Weblinks and Articles
  weblinks: Array<Weblink>;      // Detailed article references with metadata

  // Classification
  is_perpetrator: boolean | null;  // Whether business is perpetrator of the event
  topics: Array<string>;           // Event topics (e.g., "Regulatory Violations", "Legal Disputes")
  title: string | null;            // Brief title summarizing the event
  summary: string | null;          // Summary of the adverse media event
  when: string | null;             // When the event occurred

  // Matching and Review
  profile_review: BusinessProfileReview;  // Match analysis and ratings
  escalate_for_review: boolean;           // Whether to escalate for manual review
  vendor: string | null;                  // Data source vendor (e.g., "refinitiv_world_check")
}

Weblink Structure

Each weblink represents a specific article or source with full metadata:
{
  // Identification
  id: string;                    // Short unique ID (8 characters in the example response)
  url: string | null;            // Article URL
  title: string | null;          // Article title

  // Temporal and Geographic Data
  date: string | null;           // Date from the weblink
  when: string | null;           // When the event occurred (e.g., "2025")
  where_countries: Array<string>; // Countries mentioned in article
  where_cities: Array<string>;    // Cities mentioned in article

  // Content
  summary: string | null;        // Summary of the article
  scanned_website: ScannedWebsite;  // Full scraped webpage content
  metadata: BusinessArticleMetadataV1;  // Structured article metadata

  // Source Information
  article_source: ArticleSource;  // Source type (e.g., "serp_google_search", "refinitiv_world_check")

  // Status
  has_photo: boolean | null;
  is_dead_url: boolean | null;
}
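
Before showing articles to a reviewer, it can help to drop weblinks that have no URL or are flagged as dead. The sketch below is one way to do that on the client side. The names `WeblinkSubset` and `usableWeblinks` are hypothetical, the second and third sample records are made up for illustration, and treating a null `is_dead_url` as "not checked, so keep it" is an assumption rather than documented behavior.

```typescript
// Minimal subset of the Weblink fields used here (hypothetical helper type).
interface WeblinkSubset {
  id: string;
  url: string | null;
  title: string | null;
  is_dead_url: boolean | null;
  article_source: string;
}

// Keep weblinks that have a URL and are not flagged as dead.
// Assumption: a null is_dead_url means "not checked", so the weblink is kept.
function usableWeblinks(weblinks: WeblinkSubset[]): WeblinkSubset[] {
  return weblinks.filter((w) => w.url !== null && w.is_dead_url !== true);
}

const sampleWeblinks: WeblinkSubset[] = [
  {
    // Taken from the example response on this page.
    id: "ijbjo6cc",
    url: "https://lawgaze.com/empower-pharmacy-lawsuit/",
    title: "Empower Pharmacy Lawsuit: FDA Disputes and Legal Battles",
    is_dead_url: null,
    article_source: "serp_google_search",
  },
  // Illustrative records only; not from real check output.
  { id: "deadlink", url: "https://example.com/gone", title: null, is_dead_url: true, article_source: "other" },
  { id: "nourl000", url: null, title: "Untitled", is_dead_url: false, article_source: "opoint" },
];

const usable = usableWeblinks(sampleWeblinks);
console.log(usable.map((w) => w.id)); // logs the IDs of the weblinks that were kept
```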

BusinessArticleMetadataV1 Structure

Detailed metadata extracted from each article using LLM analysis:
{
  type: "BusinessArticleMetadataV1";
  id: string;

  // Article Classification
  adverse_media_article_title: string | null;
  is_adverse_media_article: boolean;    // Whether this is adverse media
  topics: Array<string> | null;         // Topics like "Regulatory Violations", "Legal Disputes"

  // Business Involvement
  is_perpetrator: boolean;               // Whether business is the perpetrator
  is_hit_business_name_found: boolean;  // Whether business name was found in article
  hit_business_name_value_found: string | null;  // Exact business name found

  // Geographic Context
  associated_addresses: Array<Address>;
  associated_countries: Array<string>;

  // Temporal Context
  year_of_the_crime: number | null;           // Year the event occurred
  year_of_article_publication: number | null; // Year article was published

  // Event Details
  summary_of_event: string | null;            // Summary of the main event
  summary_of_relation_to_crime: string | null;  // How business relates to the event
  quote_from_article: string | null;          // Direct quote from the article
  source_url: string | null;                  // Article URL
}
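
As an illustration of how these fields might be combined, the sketch below keeps only articles that are flagged as adverse and that name the business, and computes the gap between the event year and the publication year. `ArticleMetadataSubset`, `isRelevantArticle`, and `publicationLagYears` are hypothetical names, and the relevance rule is an assumption, not Parcha's own logic.

```typescript
// Subset of BusinessArticleMetadataV1 used by this sketch (hypothetical helper type).
interface ArticleMetadataSubset {
  is_adverse_media_article: boolean;
  is_hit_business_name_found: boolean;
  year_of_the_crime: number | null;
  year_of_article_publication: number | null;
}

// Assumed relevance rule: the article is adverse media and names the business.
function isRelevantArticle(m: ArticleMetadataSubset): boolean {
  return m.is_adverse_media_article && m.is_hit_business_name_found;
}

// Years between the event and its publication, or null when either year is missing.
function publicationLagYears(m: ArticleMetadataSubset): number | null {
  if (m.year_of_the_crime === null || m.year_of_article_publication === null) {
    return null;
  }
  return m.year_of_article_publication - m.year_of_the_crime;
}

// Values taken from the example response on this page.
const exampleMetadata: ArticleMetadataSubset = {
  is_adverse_media_article: true,
  is_hit_business_name_found: true,
  year_of_the_crime: 2025,
  year_of_article_publication: 2025,
};

const relevant = isRelevantArticle(exampleMetadata);
const lagYears = publicationLagYears(exampleMetadata);
```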

ScannedWebsite Structure

Raw scraped content from the webpage:
{
  type: "ScannedWebsite";
  source_id: string;                 // Unique source ID (6 characters)
  source_name: string | null;        // Name of the source
  webpage_url: string | null;        // URL of the webpage
  webpage_title: string | null;      // Page title
  webpage_text: string | null;       // Full text content
  screenshot_url: string | null;     // Screenshot URL
  pdf_snapshot_url: string | null;   // PDF snapshot URL
  response_code: number;             // HTTP response code
  is_valid_url: boolean | null;      // Whether URL is accessible
  language: string | null;           // Language of the content
  publication_date: string | null;   // Publication date
  scrape_type: ScrapeType;           // Type of scrape ("generic", etc.)
  error: string | null;              // Error message if scraping failed
  cloudflare_encountered: boolean;   // Whether Cloudflare protection was encountered
}
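
A consumer may want to know whether the scraped content can be trusted before reading `webpage_text`. The following sketch classifies a scan from `response_code`, `error`, and `cloudflare_encountered`. The `ScrapeOutcome` labels and the classification rule are assumptions made for illustration and are not part of the API.

```typescript
// Subset of ScannedWebsite used by this sketch (hypothetical helper type).
interface ScannedWebsiteSubset {
  response_code: number;
  error: string | null;
  cloudflare_encountered: boolean;
}

// Hypothetical client-side labels; the API itself does not return these.
type ScrapeOutcome = "ok" | "blocked" | "failed";

function classifyScrape(s: ScannedWebsiteSubset): ScrapeOutcome {
  const httpOk = s.response_code >= 200 && s.response_code < 300;
  if (httpOk && s.error === null) {
    return "ok";
  }
  // Assumption: a failed scrape that hit Cloudflare protection was blocked by it.
  return s.cloudflare_encountered ? "blocked" : "failed";
}

// First record mirrors the example response; the other two are illustrative.
const okScan = classifyScrape({ response_code: 200, error: null, cloudflare_encountered: false });
const blockedScan = classifyScrape({ response_code: 403, error: "Forbidden", cloudflare_encountered: true });
const failedScan = classifyScrape({ response_code: 500, error: "timeout", cloudflare_encountered: false });
```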

BusinessProfileReview Structure

Match analysis and confidence scoring:
{
  id: string;                                  // Review ID (8 characters)
  business_name_match: BusinessNameMatch | null;  // Business name match result
  address_matches: Array<AddressMatch> | null;     // Address match results
  best_address_match: AddressMatch | null;         // Best matching address
  match_rating: HitMatch | null;                   // Overall match confidence
}

Match Rating System

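The match_rating object reports its verdict in a match field, accompanied by a free-text reason (see the example response). The match field takes one of the following enum values: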
  • strong_match: High confidence that the adverse media refers to the screened business
    • Business name matches closely
    • Location matches
    • Other identifying details align
  • partial_match: Moderate confidence in the match
    • Business name is similar but not exact
    • Some geographic or contextual alignment
    • May require manual review
  • weak_match: Low confidence in the match
    • Name similarity is limited
    • Location may not match
    • Likely a different business with a similar name
  • no_match: Clear mismatch
    • Business name is significantly different
    • Location doesn’t match
    • Context clearly indicates a different entity
  • unknown: Unable to determine match confidence
    • Insufficient information available
    • Ambiguous details
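
The enum values above can be mapped to a sort order so that the most likely matches are reviewed first. This is a minimal sketch: the `{ match, reason }` shape is taken from the example response on this page, while `MATCH_RANK`, `RatedHit`, and the decision to place `unknown` (and a null rating) between `partial_match` and `weak_match` are assumptions. All sample IDs except `tm2m3kdz` are made up.

```typescript
type MatchLabel = "strong_match" | "partial_match" | "weak_match" | "no_match" | "unknown";

// Subset of a hit used by this sketch (hypothetical helper type).
interface RatedHit {
  id: string;
  profile_review: { match_rating: { match: MatchLabel; reason?: string } | null };
}

// Hypothetical priority: lower numbers are reviewed first.
const MATCH_RANK: Record<MatchLabel, number> = {
  strong_match: 0,
  partial_match: 1,
  unknown: 2, // assumption: cannot be ruled out, so it ranks above weak matches
  weak_match: 3,
  no_match: 4,
};

function rankOf(hit: RatedHit): number {
  const rating = hit.profile_review.match_rating;
  // Assumption: a missing rating is treated the same as "unknown".
  return MATCH_RANK[rating === null ? "unknown" : rating.match];
}

// Returns a new array; the input is left untouched.
function sortByMatch(hits: RatedHit[]): RatedHit[] {
  return hits.slice().sort((a, b) => rankOf(a) - rankOf(b));
}

const sortedIds = sortByMatch([
  { id: "hit-weak", profile_review: { match_rating: { match: "weak_match" } } },
  { id: "tm2m3kdz", profile_review: { match_rating: { match: "partial_match" } } },
  { id: "hit-null", profile_review: { match_rating: null } },
  { id: "hit-strong", profile_review: { match_rating: { match: "strong_match" } } },
]).map((h) => h.id);
```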

Example Response

{
  "type": "KYBAdverseMediaScreeningCheckResultV2",
  "verified_adverse_media_hits": [
    {
      "id": "tm2m3kdz",
      "reference_id": null,
      "business_name": "Empower Pharmacy",
      "associated_countries": [
        {
          "original_country_input": "United States",
          "country_name": "United States of America",
          "alpha_2_country_code": "US",
          "alpha_3_country_code": "USA",
          "numeric_country_code": "840"
        }
      ],
      "associated_addresses": [
        {
          "type": "Address",
          "street_1": null,
          "street_2": null,
          "city": "Houston",
          "state": null,
          "country_code": "US",
          "postal_code": null
        }
      ],
      "weblinks": [
        {
          "id": "ijbjo6cc",
          "url": "https://lawgaze.com/empower-pharmacy-lawsuit/",
          "date": null,
          "has_photo": null,
          "is_dead_url": null,
          "summary": "Empower Pharmacy is the defendant in regulatory disputes with the FDA over alleged sterility concerns, labeling violations, and compliance failures.",
          "when": "2025",
          "where_countries": ["United States"],
          "where_cities": ["Houston"],
          "scanned_website": {
            "type": "ScannedWebsite",
            "source_id": "nbxery",
            "webpage_url": null,
            "webpage_title": "Empower Pharmacy Lawsuit: FDA Disputes and Legal Battles",
            "response_code": 200,
            "scrape_type": "generic",
            "cloudflare_encountered": false,
            "error": null
          },
          "metadata": {
            "type": "BusinessArticleMetadataV1",
            "id": "ijbjo6cc",
            "topics": ["Regulatory Violations", "Legal Disputes", "Compliance Issues"],
            "adverse_media_article_title": "Empower Pharmacy Lawsuit: FDA Disputes and Legal Battles",
            "is_adverse_media_article": true,
            "is_perpetrator": false,
            "is_hit_business_name_found": true,
            "hit_business_name_value_found": "Empower Pharmacy",
            "associated_addresses": [
              {
                "type": "Address",
                "street_1": null,
                "street_2": null,
                "city": "Houston",
                "state": "Texas",
                "country_code": "US",
                "postal_code": null
              }
            ],
            "associated_countries": ["United States"],
            "year_of_the_crime": 2025,
            "year_of_article_publication": 2025,
            "summary_of_relation_to_crime": "Empower Pharmacy is the defendant in regulatory disputes with the FDA over alleged sterility concerns, labeling violations, and compliance failures.",
            "summary_of_event": "The FDA has alleged that Empower Pharmacy, one of the largest compounding pharmacies in the US, has violated regulations regarding sterility standards, labeling compliance, and drug manufacturing practices.",
            "source_url": "https://lawgaze.com/empower-pharmacy-lawsuit/",
            "quote_from_article": "Empower Pharmacy, one of the largest compounding pharmacies in the United States, has found itself at the center of a legal storm."
          },
          "title": "Empower Pharmacy Lawsuit: FDA Disputes and Legal Battles",
          "article_source": "serp_google_search"
        }
      ],
      "is_perpetrator": null,
      "topics": [],
      "title": null,
      "summary": null,
      "when": null,
      "profile_review": {
        "id": "abc123de",
        "business_name_match": null,
        "address_matches": null,
        "best_address_match": null,
        "match_rating": {
          "match": "partial_match",
          "reason": "Business name matches, location matches, but needs manual review"
        }
      },
      "escalate_for_review": true,
      "vendor": null
    }
  ]
}

Response Fields

  • type (string): Always "KYBAdverseMediaScreeningCheckResultV2".
  • verified_adverse_media_hits (array): List of verified adverse media profiles for the business. Each profile represents a business found in adverse media sources.

Article Sources

The check aggregates adverse media from multiple sources:

Primary Data Sources

  • Refinitiv World-Check (refinitiv_world_check)
    • Global risk intelligence database
    • PEPs, sanctions, and adverse media
  • ComplyAdvantage (comply_advantage)
    • Real-time risk database
    • Comprehensive adverse media coverage

Search Engine Sources

  • Google Search (serp_google_search)
    • Web search results for adverse media
    • Broad coverage of online content
  • Google News (serp_google_news)
    • News-specific search results
    • Recent and archived news articles
  • Brave Search (serp_brave_search, serp_brave_news)
    • Privacy-focused search engine results

Other Sources

  • Opoint (opoint)
    • Specialized adverse media intelligence
  • Other (other)
    • Miscellaneous or unclassified sources
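
The source identifiers listed above can be grouped on the client, for example to weight vendor databases differently from search results. In this sketch the identifiers come from the lists above, while the `SourceGroup` labels and the fallback for unrecognized values are assumptions.

```typescript
// Hypothetical grouping labels that mirror the three headings above.
type SourceGroup = "primary" | "search_engine" | "other";

function sourceGroup(articleSource: string): SourceGroup {
  switch (articleSource) {
    case "refinitiv_world_check":
    case "comply_advantage":
      return "primary";
    case "serp_google_search":
    case "serp_google_news":
    case "serp_brave_search":
    case "serp_brave_news":
      return "search_engine";
    default:
      // Covers "opoint", "other", and (by assumption) any unrecognized value.
      return "other";
  }
}

const exampleGroup = sourceGroup("serp_google_search");
```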

Key Components

Article Metadata Extraction

Each article undergoes LLM-powered analysis to extract:
  1. Event Classification: Topics and categories (regulatory, legal, financial, etc.)
  2. Business Involvement: Whether the business is the perpetrator, a victim, or merely mentioned
  3. Geographic Context: Countries and cities mentioned
  4. Temporal Context: When the event occurred and when it was published
  5. Relationship Analysis: How the business relates to the adverse event
  6. Evidence Extraction: Direct quotes and summaries

Match Confidence Scoring

The system evaluates multiple factors to determine match confidence:
  • Business name similarity and exact matches
  • Geographic alignment (addresses, cities, countries)
  • Contextual relevance
  • Article quality and recency
  • Source reliability

Escalation Logic

Profiles are escalated for manual review based on:
  1. Match Rating: Strong and partial matches are typically escalated
  2. Event Severity: Regulatory violations, criminal activity, major fraud
  3. Recency: Recent events (within the last 2 to 5 years)
  4. Volume: Multiple articles about the same event
  5. Source Quality: Articles from reputable sources
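
The outcome of this logic is exposed on each profile as `escalate_for_review`. A client does not need to reproduce the criteria above; it can build a review queue from that flag directly. The sketch below does so, with `HitForTriage`, `ReviewItem`, and `buildReviewQueue` as hypothetical names and the second sample hit made up for illustration.

```typescript
// Subset of a profile used by this sketch (hypothetical helper type).
interface HitForTriage {
  id: string;
  business_name: string | null;
  escalate_for_review: boolean;
  weblinks: Array<{ id: string }>;
}

// Hypothetical row shape for a manual review queue.
interface ReviewItem {
  hitId: string;
  businessName: string;
  articleCount: number;
}

function buildReviewQueue(hits: HitForTriage[]): ReviewItem[] {
  return hits
    .filter((h) => h.escalate_for_review)
    .map((h) => ({
      hitId: h.id,
      businessName: h.business_name ?? "(unnamed)",
      articleCount: h.weblinks.length,
    }));
}

const queue = buildReviewQueue([
  // Mirrors the example response on this page.
  { id: "tm2m3kdz", business_name: "Empower Pharmacy", escalate_for_review: true, weblinks: [{ id: "ijbjo6cc" }] },
  // Illustrative record only.
  { id: "zzzz0000", business_name: null, escalate_for_review: false, weblinks: [] },
]);
```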

Common Topics

Articles are categorized into topics such as:
  • Regulatory Violations: FDA warnings, compliance failures, regulatory actions
  • Legal Disputes: Lawsuits, legal battles, civil litigation
  • Compliance Issues: Failure to meet standards, policy violations
  • Safety Issues: Product safety, public health concerns
  • Financial Misconduct: Fraud, embezzlement, financial crimes
  • Criminal Activity: Criminal charges, investigations
  • Operational Problems: Business failures, bankruptcy
  • Reputational Concerns: Scandals, negative publicity
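
To see which topics dominate a result, the `metadata.topics` arrays can be tallied across weblinks. The following is a minimal sketch; `TopicCarrier` and `tallyTopics` are hypothetical names, and only the first sample record comes from the example response.

```typescript
// Subset of a weblink used by this sketch (hypothetical helper type).
interface TopicCarrier {
  metadata: { topics: string[] | null };
}

// Counts how many weblinks mention each topic; null topic lists are skipped.
function tallyTopics(weblinks: TopicCarrier[]): Record<string, number> {
  // A prototype-free object, so topic names never collide with built-in keys.
  const counts = Object.create(null) as Record<string, number>;
  for (const w of weblinks) {
    const topics = w.metadata.topics ?? [];
    for (const t of topics) {
      counts[t] = (counts[t] ?? 0) + 1;
    }
  }
  return counts;
}

const topicCounts = tallyTopics([
  // Topics from the example response on this page.
  { metadata: { topics: ["Regulatory Violations", "Legal Disputes", "Compliance Issues"] } },
  // Illustrative records only.
  { metadata: { topics: ["Legal Disputes"] } },
  { metadata: { topics: null } },
]);
```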

Implementation Details

Pydantic Schema Location

  • Main Schema: ai/data_loaders/schema/kyb_schema.py
  • Base Classes: ai/data_loaders/schema/base.py
  • Models: ai/tools/bdd/bdd_models.py

Data Loader

ai/data_loaders/kyb_adverse_media_profile_loader_v2.py

Tool Implementation

ai/tools/kyb/kyb_adverse_media_screening_check_v2.py

Filtering Examples

Use JSONPath queries to extract specific fields from the check results:
# Get only articles from Google Search
?jsonpath_query=$.check_results[?(@.command_id=='kyb.adverse_media_screening_check_v2')].payload.verified_adverse_media_hits[*].weblinks[?(@.article_source=='serp_google_search')]

# Get only articles where business is the perpetrator
?jsonpath_query=$.check_results[?(@.command_id=='kyb.adverse_media_screening_check_v2')].payload.verified_adverse_media_hits[*].weblinks[?(@.metadata.is_perpetrator==true)]

# Extract all article titles
?jsonpath_query=$.check_results[?(@.command_id=='kyb.adverse_media_screening_check_v2')].payload.verified_adverse_media_hits[*].weblinks[*].title

# Get articles about regulatory violations
?jsonpath_query=$.check_results[?(@.command_id=='kyb.adverse_media_screening_check_v2')].payload.verified_adverse_media_hits[*].weblinks[?(@.metadata.topics[*]=='Regulatory Violations')]

See the getJobById filtering documentation for more JSONPath query examples.
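
JSONPath expressions contain characters such as `$`, `[`, `?`, `@`, and `=` that are not safe to place in a URL unescaped. The sketch below percent-encodes a query with the standard `encodeURIComponent` function before appending it as the `jsonpath_query` parameter. Whether the API also accepts unencoded queries is not stated here, many HTTP clients encode parameters automatically, and `toQueryString` is a hypothetical helper.

```typescript
// The first filtering example from this section, without the "?jsonpath_query=" prefix.
const googleSearchOnly =
  "$.check_results[?(@.command_id=='kyb.adverse_media_screening_check_v2')]" +
  ".payload.verified_adverse_media_hits[*]" +
  ".weblinks[?(@.article_source=='serp_google_search')]";

// Builds the query-string fragment; other request parameters are out of scope here.
function toQueryString(jsonpath: string): string {
  return "?jsonpath_query=" + encodeURIComponent(jsonpath);
}

const queryString = toQueryString(googleSearchOnly);
console.log(queryString); // logs the percent-encoded query-string fragment
```

Decoding the parameter value with `decodeURIComponent` yields the original expression again, which is a quick way to confirm that nothing was lost in encoding.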