CVE Scraper Walkthrough
Introduction
In cybersecurity, managing and analyzing lists of vulnerabilities is a common task. Often, these lists are long and need to be cross-checked against authoritative sources like the NIST database to ensure accuracy and completeness. This tutorial explains the workflow for processing vulnerability lists using the provided scripts, and how each step helps address security concerns.
Technically, this cross-checking can be done by hand, for example by a team of interns. But when a report contains thousands of issues that may or may not be valid, manual lookup is slow and error-prone. Automating it frees the interns for work that is less boring and adds more value: for example, understanding the context behind the report instead of doing data entry.
What is the NIST Database?
The National Institute of Standards and Technology (NIST) maintains the National Vulnerability Database (NVD), which is a comprehensive repository of known cybersecurity vulnerabilities. Each vulnerability is assigned a unique CVE (Common Vulnerabilities and Exposures) identifier. The NIST NVD provides:
- Detailed descriptions of vulnerabilities
- Severity scores (CVSS)
- References to patches and advisories
- Information on affected products
Why cross-check with NIST?
- Ensures your vulnerability list is accurate and up-to-date
- Provides standardized severity ratings
- Helps prioritize remediation efforts
Workflow Overview
1. Receiving the Vulnerability List
- Typically, you receive a list of vulnerabilities in Excel format.
- The list may not be in the required JSON format for automated processing.
2. Preparing the List
- Copy and paste the relevant data from Excel into a plain text editor (e.g., Notepad).
- Save the file as a text file for further processing.
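Text pasted from Excel often carries extra columns, stray whitespace, or duplicate rows. As a minimal sketch (not part of the provided scripts), the hypothetical helper `extractCveIds` below pulls anything shaped like a CVE ID out of raw pasted text, normalizes the case, and removes duplicates. The sample input is made up for illustration.

```javascript
// Matches IDs shaped like CVE-2021-44228: "CVE-", a 4-digit year, then 4 or more digits.
const CVE_PATTERN = /CVE-\d{4}-\d{4,}/gi;

// Hypothetical helper: returns the unique CVE IDs found in a block of text.
function extractCveIds(text) {
  const matches = text.match(CVE_PATTERN) || [];
  // Normalize case and drop duplicates while keeping first-seen order.
  return [...new Set(matches.map((id) => id.toUpperCase()))];
}

// Made-up sample: tab-separated rows as they might look after pasting from Excel.
const pasted = 'CVE-2021-44228\tHigh\thost-a\ncve-2021-44228\tHigh\thost-b\nCVE-2014-0160\tMedium\thost-c\n';
console.log(extractCveIds(pasted));
```

Running the result through a check like this before conversion keeps malformed rows from reaching the main script.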
Script Explanations
convert-to-json-cve.js
Purpose:
- Converts a plain text list of CVEs (e.g., from Notepad) into a JSON format required by the main processing script.
How it works:
- Reads the text file containing the list of CVEs (one per line).
- Converts the list into a JSON array.
- Saves the result as a .json file for use by the main script.
Script Example:
const fs = require('fs');

// Read the CVE list from the file '20250807-unique-cve.txt'
fs.readFile('20250807-unique-cve.txt', 'utf8', (err, data) => {
  if (err) {
    console.error('Error reading the file:', err);
    return;
  }
  // Convert the text into an array of CVE strings, skipping blank lines
  // (the regex also handles Windows-style line endings)
  const cveArray = data
    .split(/\r?\n/)
    .map(cve => cve.trim())
    .filter(cve => cve.length > 0);
  // Convert the array to pretty-printed JSON
  const cveJson = JSON.stringify(cveArray, null, 2);
  // Write the JSON to a file
  fs.writeFileSync('20250807-unique-cve.json', cveJson, 'utf8');
  console.log('CVE list has been saved as "20250807-unique-cve.json"');
});
Why is this important?
- Standardizes the input format, enabling automated processing and cross-checking.
- Reduces manual errors when handling large lists.
scrapeCVE.js (Main Script)
Purpose:
- Takes the JSON-formatted list of CVEs and cross-checks each entry against the NIST NVD.
- Retrieves detailed information for each CVE.
Script Breakdown
1. Module Imports
const axios = require('axios');
const cheerio = require('cheerio');
const csvWriter = require('csv-writer').createObjectCsvWriter;
const fs = require('fs');
Loads required modules for HTTP requests, HTML parsing, CSV writing, and file operations.
2. Reading the CVE List
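// Shared list of CVE IDs, filled in once the JSON file has been read
// (omit this declaration if the script already declares cveList elsewhere)
let cveList = [];

fs.readFile('20250807-unique-cve.json', 'utf8', (err, data) => {
  if (err) {
    console.error('Error reading 20250807-unique-cve.json:', err);
    return;
  }
  cveList = JSON.parse(data);
  console.log('CVE list loaded:', cveList);
  main('20250807-unique-cve.csv');
});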
Reads a JSON file containing a list of CVE IDs, parses it, and starts the main processing function.
3. Fetching and Parsing CVE Details
async function getCveDetails(cveNumber) {
  const url = `https://nvd.nist.gov/vuln/detail/${cveNumber}`;
  try {
    const response = await axios.get(url);
    const $ = cheerio.load(response.data);
    // Extracts various fields from the NIST NVD page
    // ... (status, description, references, scores, dates, CPE configs)
    return {
      CVE: cveNumber,
      Status: status,
      Description: description,
      References: references,
      BaseScore: baseScore,
      Severity: severity,
      PublishedDate: publishedDate,
      LastModified: lastModifiedDate,
      NistScore: nistScore,
      NistScoreSev: nistScoreSev,
      AdpScoreSev: adpScoreSev,
      AdpScore: adpScore,
      CPEConfig1: cpeConfigs[0]?.cpeText || 'Not found',
      // ... up to CPEConfig5
    };
  } catch (error) {
    console.error(`Error fetching ${url}:`, error.message);
    return null;
  }
}
For each CVE, fetches its NIST NVD page, parses the HTML, and extracts all relevant fields (status, description, references, scores, dates, affected products, etc.).
4. Writing Results to CSV
async function writeToCsv(cveDetailsList, outputFile) {
  const writer = csvWriter({
    path: outputFile,
    header: [
      { id: 'CVE', title: 'CVE' },
      // ... other fields
    ]
  });
  await writer.writeRecords(cveDetailsList);
  console.log(`Details have been written to ${outputFile}`);
}
Writes the array of CVE detail objects to a CSV file for reporting and further analysis.
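Vulnerability descriptions routinely contain commas and quotation marks, so each value has to be quoted correctly for the CSV to open cleanly in Excel. The script leaves this to the csv-writer library. The stdlib-only sketch below illustrates the usual RFC 4180 quoting convention that such a library is expected to apply. The helpers `csvEscape` and `toCsvLine` and the sample record are hypothetical and are not taken from the script.

```javascript
// Quote a single value for CSV output, following the usual RFC 4180 convention.
function csvEscape(value) {
  const text = String(value ?? '');
  // Values containing a comma, double quote, or line break are wrapped in quotes,
  // and any embedded double quote is doubled.
  if (/[",\r\n]/.test(text)) {
    return '"' + text.replace(/"/g, '""') + '"';
  }
  return text;
}

// Build one CSV line from a record, in the column order given by `fields`.
function toCsvLine(record, fields) {
  return fields.map((field) => csvEscape(record[field])).join(',');
}

// Made-up record for illustration only.
const sample = { CVE: 'CVE-0000-0001', Description: 'Example flaw in "foo", affects bar' };
console.log(toCsvLine(sample, ['CVE', 'Description']));
// prints: CVE-0000-0001,"Example flaw in ""foo"", affects bar"
```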
5. Main Processing Loop
async function main(outputFile) {
  const cveDetailsList = [];
  for (const cve of cveList) {
    console.log(`Processing ${cve}...`);
    const details = await getCveDetails(cve);
    if (details) {
      cveDetailsList.push(details);
    }
  }
  await writeToCsv(cveDetailsList, outputFile);
}
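Iterates through each CVE in sequence, fetches and parses its details, and writes the results to the output CSV. A CVE whose lookup fails returns null and is skipped, so the CSV can contain fewer rows than the input list.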
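With thousands of CVEs, a loop like this sends many requests to the same site back to back. One common courtesy when scraping is to pause briefly between requests. The sketch below shows one way to do that. It is not part of the provided script: `sleep` and `processWithDelay` are hypothetical helpers, `fetchDetails` stands in for `getCveDetails`, and the delay value is an assumption to tune for your own situation.

```javascript
// Hypothetical helper: resolves after `ms` milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Processes CVE IDs one at a time, pausing between requests.
// `fetchDetails` is any async function that returns a details object or null.
async function processWithDelay(cveIds, fetchDetails, delayMs) {
  const results = [];
  for (let i = 0; i < cveIds.length; i++) {
    const details = await fetchDetails(cveIds[i]);
    if (details) {
      results.push(details);
    }
    // No need to wait after the final item.
    if (i < cveIds.length - 1) {
      await sleep(delayMs);
    }
  }
  return results;
}
```

In the main loop this could be called as `processWithDelay(cveList, getCveDetails, 1000)`, where the one-second delay is only an illustrative value.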
Field Explanations and Security Relevance
- CVE ID: Unique identifier for the vulnerability. Used for tracking and referencing.
- Description: Explains the nature of the vulnerability. Helps understand the risk.
- Severity (CVSS Score): Indicates how critical the vulnerability is. Guides prioritization.
- Affected Products (CPE Configs): Lists software/hardware impacted. Essential for scoping remediation.
- References: Links to advisories, patches, or further information. Supports remediation actions.
- Dates: Published and last modified dates help track vulnerability lifecycle.
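To make the link between score and severity concrete, the sketch below maps a numeric base score to the qualitative rating bands defined by CVSS v3.x (None, Low, Medium, High, Critical). The function name `severityFromScore` is hypothetical, and CVSS v2 scores use different bands, so this applies only to v3.x scores.

```javascript
// Maps a CVSS v3.x base score (0.0 to 10.0) to its qualitative severity rating.
function severityFromScore(score) {
  if (typeof score !== 'number' || Number.isNaN(score) || score < 0 || score > 10) {
    return 'Unknown';
  }
  if (score === 0) return 'None';
  if (score < 4.0) return 'Low';      // 0.1 - 3.9
  if (score < 7.0) return 'Medium';   // 4.0 - 6.9
  if (score < 9.0) return 'High';     // 7.0 - 8.9
  return 'Critical';                  // 9.0 - 10.0
}

console.log(severityFromScore(9.8)); // prints: Critical
console.log(severityFromScore(5.3)); // prints: Medium
```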
How this Addresses Security Concerns
- Ensures you have the latest, most accurate vulnerability data.
- Helps prioritize which vulnerabilities to address first based on severity and impact.
- Provides actionable information for remediation and risk management.
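As a rough illustration of the prioritization step, the sketch below orders scraped records from highest to lowest base score. It assumes, without confirmation from the script, that `BaseScore` holds either a numeric string or a placeholder such as 'Not found'. Records without a usable score are placed last. The helper name `sortByScoreDesc` and the sample records are made up.

```javascript
// Returns a new array sorted by BaseScore, highest first.
// Records whose BaseScore cannot be parsed as a number go to the end.
function sortByScoreDesc(records) {
  const scoreOf = (record) => {
    const parsed = parseFloat(record.BaseScore);
    return Number.isNaN(parsed) ? -1 : parsed;
  };
  // Copy first so the caller's array is left untouched.
  return [...records].sort((a, b) => scoreOf(b) - scoreOf(a));
}

// Made-up records for illustration only.
const records = [
  { CVE: 'CVE-0000-0001', BaseScore: '5.3' },
  { CVE: 'CVE-0000-0002', BaseScore: 'Not found' },
  { CVE: 'CVE-0000-0003', BaseScore: '9.8' },
];
console.log(sortByScoreDesc(records).map((r) => r.CVE));
// prints: [ 'CVE-0000-0003', 'CVE-0000-0001', 'CVE-0000-0002' ]
```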
Summary
By following this workflow and using the provided scripts, you can efficiently process large vulnerability lists, standardize their format, and enrich them with authoritative data from the NIST NVD. This approach streamlines vulnerability management and supports effective risk mitigation.