CVE Scraper Walk Through

Introduction

In cybersecurity, managing and analyzing lists of vulnerabilities is a common task. Often, these lists are long and need to be cross-checked against authoritative sources like the NIST database to ensure accuracy and completeness. This tutorial explains the workflow for processing vulnerability lists using the provided scripts, and how each step helps address security concerns.

Technically, a team of interns could do this by hand. But suppose you have thousands of reported issues that may or may not be valid. In that case, automate the lookup and give the interns something more valuable and less tedious to do, such as understanding the context in which the report was generated, rather than rote data entry.

What is the NIST Database?

The National Institute of Standards and Technology (NIST) maintains the National Vulnerability Database (NVD), a comprehensive repository of publicly known cybersecurity vulnerabilities. Each entry is identified by a unique CVE (Common Vulnerabilities and Exposures) identifier. The NVD provides:

Detailed descriptions of vulnerabilities
Severity scores (CVSS)
References to patches and advisories
Information on affected products

Why cross-check with NIST?

Ensures your vulnerability list is accurate and up-to-date
Provides standardized severity ratings
Helps prioritize remediation efforts

Workflow Overview

1. Receiving the Vulnerability List

Typically, you receive a list of vulnerabilities in Excel format.
The list may not be in the required JSON format for automated processing.

2. Preparing the List

Copy the CVE column from Excel and paste it into a plain text editor (e.g., Notepad).
Save it as a plain text file with one CVE ID per line, since that is the layout the conversion script reads.
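
If the pasted text carries extra columns or repeated entries, the IDs can be pulled out before saving. The following is a minimal sketch, not part of the provided scripts: `extractCveIds` is a hypothetical helper name, and the pattern assumes the standard CVE ID layout ("CVE-", a four-digit year, then four or more digits).

```javascript
// Hypothetical helper (not part of the provided scripts):
// pull CVE identifiers out of arbitrary pasted text.
function extractCveIds(text) {
   // "CVE-", a 4-digit year, then 4 or more digits; case-insensitive
   const matches = text.match(/CVE-\d{4}-\d{4,}/gi) || [];
   // Normalize to upper case and drop duplicates, keeping first-seen order
   return [...new Set(matches.map((id) => id.toUpperCase()))];
}

// Sample input: tab-separated cells with a duplicate in different case
const pasted = 'CVE-2021-44228\tLog4j\ncve-2021-44228\nCVE-2014-0160 Heartbleed\n';
console.log(extractCveIds(pasted).join('\n'));
```

The joined output can be saved directly as the one-ID-per-line text file that the conversion step expects.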

Script Explanations

convert-to-json-cve.js

Purpose:

Converts a plain text list of CVEs (e.g., from Notepad) into a JSON format required by the main processing script.

How it works:

Reads the text file containing the list of CVEs (one per line).
Converts the list into a JSON array.
Saves the result as a .json file for use by the main script.

Script Example:

const fs = require('fs');

// Input and output file names; adjust them to match your own files
const inputFile = '20250807-unique-cve.txt';
const outputFile = '20250807-unique-cve.json';

// Read the CVE list (one CVE ID per line)
fs.readFile(inputFile, 'utf8', (err, data) => {
   if (err) {
      console.error('Error reading the file:', err);
      return;
   }

   // Split on Unix or Windows line endings, trim each entry, and drop blank lines
   const cveArray = data
      .split(/\r?\n/)
      .map(cve => cve.trim())
      .filter(cve => cve.length > 0);

   // Convert the array to pretty-printed JSON
   const cveJson = JSON.stringify(cveArray, null, 2);

   // Write the JSON to a file
   fs.writeFileSync(outputFile, cveJson, 'utf8');

   console.log(`CVE list has been saved as "${outputFile}"`);
});

Why is this important?

Standardizes the input format, enabling automated processing and cross-checking.
Reduces manual errors when handling large lists.

scrapeCVE.js (Main Script)

Purpose:

Takes the JSON-formatted list of CVEs and cross-checks each entry against the NIST NVD.
Retrieves detailed information for each CVE.

Script Breakdown

1. Module Imports

const axios = require('axios');
const cheerio = require('cheerio');
const csvWriter = require('csv-writer').createObjectCsvWriter;
const fs = require('fs');

Loads required modules for HTTP requests, HTML parsing, CSV writing, and file operations.

2. Reading the CVE List

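// Shared list of CVE IDs, filled in once the JSON file has been read
let cveList = [];

fs.readFile('20250807-unique-cve.json', 'utf8', (err, data) => {
   if (err) {
      console.error('Error reading 20250807-unique-cve.json:', err);
      return;
   }
   cveList = JSON.parse(data);
   console.log('CVE list loaded:', cveList);
   // main() is async, so report any failure instead of leaving the rejection unhandled
   main('20250807-unique-cve.csv').catch((error) => {
      console.error('Processing failed:', error.message);
   });
});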

Reads a JSON file containing a list of CVE IDs, parses it, and starts the main processing function.
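
Because every entry in the list becomes part of a request URL, it can help to check the parsed data before processing starts. The sketch below is one possible approach and is not taken from scrapeCVE.js: `parseCveList` and `CVE_ID` are hypothetical names, and the pattern assumes the standard CVE ID layout.

```javascript
// Hypothetical validation step (not part of scrapeCVE.js)
const CVE_ID = /^CVE-\d{4}-\d{4,}$/;

function parseCveList(jsonText) {
   // JSON.parse throws a SyntaxError if the file is not valid JSON
   const parsed = JSON.parse(jsonText);
   if (!Array.isArray(parsed)) {
      throw new TypeError('Expected a JSON array of CVE IDs');
   }
   const valid = [];
   const invalid = [];
   for (const entry of parsed) {
      if (typeof entry === 'string' && CVE_ID.test(entry)) {
         valid.push(entry);
      } else {
         invalid.push(entry);
      }
   }
   return { valid, invalid };
}

const checked = parseCveList('["CVE-2021-44228", "not-a-cve", 42]');
console.log('Valid:', checked.valid, 'Rejected:', checked.invalid);
```

Reporting the rejected entries, rather than silently dropping them, makes it easier to trace a malformed ID back to the original spreadsheet.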

3. Fetching and Parsing CVE Details

async function getCveDetails(cveNumber) {
   const url = `https://nvd.nist.gov/vuln/detail/${cveNumber}`;
   try {
      const response = await axios.get(url);
      const $ = cheerio.load(response.data);
      // Extracts various fields from the NIST NVD page
      // ... (status, description, references, scores, dates, CPE configs)
      return {
         CVE: cveNumber,
         Status: status,
         Description: description,
         References: references,
         BaseScore: baseScore,
         Severity: severity,
         PublishedDate: publishedDate,
         LastModified: lastModifiedDate,
         NistScore: nistScore,
         NistScoreSev: nistScoreSev,
         AdpScoreSev: adpScoreSev,
         AdpScore: adpScore,
         CPEConfig1: cpeConfigs[0]?.cpeText || 'Not found',
         // ... up to CPEConfig5
      };
   } catch (error) {
      console.error(`Error fetching ${url}:`, error.message);
      return null;
   }
}

For each CVE, fetches its NIST NVD page, parses the HTML, and extracts all relevant fields (status, description, references, scores, dates, affected products, etc.).

4. Writing Results to CSV

async function writeToCsv(cveDetailsList, outputFile) {
   const writer = csvWriter({
      path: outputFile,
      header: [
         { id: 'CVE', title: 'CVE' },
         // ... other fields
      ]
   });
   await writer.writeRecords(cveDetailsList);
   console.log(`Details have been written to ${outputFile}`);
}

Writes the array of CVE detail objects to a CSV file for reporting and further analysis.
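
The header array pairs each object key (`id`) with a column title (`title`). When the record shape changes often, the header could be derived from the records instead of being typed out by hand. This is a sketch under that assumption: `buildHeader` is a hypothetical helper, it reuses each key as its own column title, and the sample values are for illustration only.

```javascript
// Hypothetical helper: derive { id, title } header entries from the records' keys
function buildHeader(records) {
   const ids = [];
   for (const record of records) {
      for (const key of Object.keys(record)) {
         // Keep the first-seen order and avoid duplicate columns
         if (!ids.includes(key)) {
            ids.push(key);
         }
      }
   }
   return ids.map((id) => ({ id, title: id }));
}

// Illustrative records with slightly different fields
const sampleRecords = [
   { CVE: 'CVE-2021-44228', Severity: 'CRITICAL' },
   { CVE: 'CVE-2014-0160', BaseScore: '7.5' },
];
console.log(buildHeader(sampleRecords));
```

The result has the same shape as the `header` option shown above, so it could be passed in its place.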

5. Main Processing Loop

async function main(outputFile) {
   const cveDetailsList = [];
   for (const cve of cveList) {
      console.log(`Processing ${cve}...`);
      const details = await getCveDetails(cve);
      if (details) {
         cveDetailsList.push(details);
      }
   }
   await writeToCsv(cveDetailsList, outputFile);
}

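Iterates through the CVEs one at a time, fetches and parses the details for each, skips any CVE whose lookup failed (getCveDetails returns null in that case), and writes the collected results to the output CSV.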
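
With thousands of CVEs, two refinements to this loop may be worth considering: pausing between requests so the run does not hammer the server, and keeping a list of the IDs that failed so they can be retried. The sketch below illustrates both. It is an assumption-laden variant, not the script's actual loop: `processWithDelay`, `sleep`, and `fakeFetch` are hypothetical names, and a suitable delay value would need to be chosen for the real site.

```javascript
// Resolve after the given number of milliseconds
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical variant of the main loop: sequential, throttled, and failure-aware.
// fetchDetails is any async function that returns an object, or null on failure.
async function processWithDelay(ids, fetchDetails, delayMs) {
   const results = [];
   const failed = [];
   for (const [index, id] of ids.entries()) {
      // Pause before every request except the first
      if (index > 0) {
         await sleep(delayMs);
      }
      const details = await fetchDetails(id);
      if (details) {
         results.push(details);
      } else {
         failed.push(id);
      }
   }
   return { results, failed };
}

// Stand-in fetcher for demonstration; the placeholder ID simulates a failed lookup
const fakeFetch = async (id) => (id === 'CVE-0000-0000' ? null : { CVE: id });
processWithDelay(['CVE-2021-44228', 'CVE-0000-0000'], fakeFetch, 10)
   .then((outcome) => console.log(outcome));
```

In the real script, getCveDetails would take the place of `fakeFetch`, and the `failed` array could be written to a separate file for a second pass.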

Field Explanations and Security Relevance

CVE ID: Unique identifier for the vulnerability. Used for tracking and referencing.
Description: Explains the nature of the vulnerability. Helps understand the risk.
Severity (CVSS Score): Indicates how critical the vulnerability is. Guides prioritization.
Affected Products (CPE Configs): Lists software/hardware impacted. Essential for scoping remediation.
References: Links to advisories, patches, or further information. Supports remediation actions.
Dates: Published and last modified dates help track vulnerability lifecycle.
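
To show how the severity field can guide prioritization, the sketch below maps a numeric score to the CVSS v3.x qualitative rating bands (None 0.0, Low 0.1 to 3.9, Medium 4.0 to 6.9, High 7.0 to 8.9, Critical 9.0 to 10.0) and sorts records from highest to lowest score. The helper names are hypothetical, and it assumes the scraped BaseScore value begins with the numeric score; records without a usable score are placed last.

```javascript
// Map a numeric CVSS v3.x base score to its qualitative rating band
function severityBand(score) {
   if (typeof score !== 'number' || Number.isNaN(score) || score < 0 || score > 10) {
      return 'UNKNOWN';
   }
   if (score === 0) return 'NONE';
   if (score < 4) return 'LOW';
   if (score < 7) return 'MEDIUM';
   if (score < 9) return 'HIGH';
   return 'CRITICAL';
}

// Return a copy of the records ordered from highest to lowest base score.
// Assumption: BaseScore starts with the number (for example "9.8" or "9.8 CRITICAL").
function sortBySeverity(records) {
   const toNumber = (record) => {
      const parsed = parseFloat(record.BaseScore);
      return Number.isNaN(parsed) ? -1 : parsed;
   };
   return [...records].sort((a, b) => toNumber(b) - toNumber(a));
}

// Illustrative records; the IDs are placeholders
const records = [
   { CVE: 'CVE-A', BaseScore: '5.3' },
   { CVE: 'CVE-B', BaseScore: '9.8 CRITICAL' },
   { CVE: 'CVE-C', BaseScore: 'Not found' },
];
for (const record of sortBySeverity(records)) {
   console.log(record.CVE, severityBand(parseFloat(record.BaseScore)));
}
```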

How this Addresses Security Concerns

Pulls the current NVD record for every CVE at run time, so the report reflects the latest published data rather than whatever the original list contained.
Helps prioritize which vulnerabilities to address first based on severity and impact.
Provides actionable information for remediation and risk management.

Summary

By following this workflow and using the provided scripts, you can efficiently process large vulnerability lists, standardize their format, and enrich them with authoritative data from the NIST NVD. This approach streamlines vulnerability management and supports effective risk mitigation.