Purpose: Stereotactic radiosurgery (SRS) requires accurate evaluation of dose-volume metrics for small structures. The purpose of this study was to compare the DVH metric capabilities of five commercially available SRS DVH analysis tools (Eclipse, Elements, RayStation, MIM, and Velocity).

Methods: DICOM RT Dose and RT Structure Set files created using MATLAB were imported and evaluated in each of the tools. Each structure set consisted of 50 randomly placed spherical targets. The dose distributions were created on a 1-mm grid using an analytic model such that the dose-volume metrics of the spheres were known. Structure sets were created for 3, 5, 7, 10, 15, and 20 mm diameter spheres. The reported structure volume, V100% [cc], V50% [cc], RTOG conformity index, and Paddick gradient index were compared with the analytical values.

Results: The average difference and range across all evaluated target sizes for the reported structure volume were −4.73% [−33.2, 0.2], 0.11% [−10.9, 9.5], −0.39% [−12.1, 7.0], −2.24% [−21.0, 1.3], and 1.15% [−15.1, 0.8] for TPS-A through TPS-E, respectively. The average difference and range for V100% [cc] (V20Gy [cc]) were −0.4 [−24.5, 9.8], −2.73 [−23.6, 1.1], −3.01 [−23.6, 0.6], −3.79 [−27.3, 1.3], and 0.26 [−6.1, 2.6] for TPS-A through TPS-E, respectively. For V50% [cc] (V10Gy [cc]), the average difference and range were −0.05 [−0.8, 0.4], −0.18 [−1.2, 0.5], −0.44 [−1.4, 0.3], −0.26 [−1.8, 2.6], and 0.09 [−1.4, 2.7] for TPS-A through TPS-E, respectively.

Conclusion: This study expands on previously published literature by quantitatively comparing the DVH analysis capabilities of software commonly used for SRS plan evaluation, and it provides a freely available, downloadable set of analytically derived ground-truth DICOM dose and structure files for use by radiotherapy clinics.
The differences observed between systems highlight the need for standardization and/or transparency, especially when evaluating plan quality for multi-institutional clinical trials.
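
The comparison against analytic ground truth can be illustrated with a minimal Python sketch. It computes the exact volume of a sphere, a naive voxel-count volume on a 1-mm grid, the percent difference between them, and the two indices named above using their standard definitions (RTOG conformity index as prescription isodose volume divided by target volume; Paddick gradient index as half-prescription isodose volume divided by prescription isodose volume). The function names and the voxel-center counting rule are assumptions made for illustration only; they are not the study's MATLAB code and do not represent how any of the five evaluated tools computes volume.

```python
import itertools
import math


def sphere_volume_cc(diameter_mm):
    """Analytic sphere volume in cc (1 cc = 1000 mm^3)."""
    radius = diameter_mm / 2.0
    return (4.0 / 3.0) * math.pi * radius ** 3 / 1000.0


def voxelized_volume_cc(diameter_mm, grid_mm=1.0, offset_mm=(0.0, 0.0, 0.0)):
    """Volume estimated by counting voxel centers that fall inside the sphere.

    Illustrative assumption: a voxel counts fully if its center is inside.
    offset_mm shifts the sphere center relative to the grid and should stay
    within one grid step so the search box still encloses the sphere.
    """
    radius = diameter_mm / 2.0
    half = math.ceil(radius / grid_mm) + 1  # search box half-width in voxels
    ox, oy, oz = offset_mm
    count = 0
    for i, j, k in itertools.product(range(-half, half + 1), repeat=3):
        dx = i * grid_mm - ox
        dy = j * grid_mm - oy
        dz = k * grid_mm - oz
        if dx * dx + dy * dy + dz * dz <= radius * radius:
            count += 1
    return count * grid_mm ** 3 / 1000.0


def rtog_conformity_index(v100_cc, target_volume_cc):
    """RTOG CI: prescription isodose volume divided by target volume."""
    return v100_cc / target_volume_cc


def paddick_gradient_index(v50_cc, v100_cc):
    """Paddick GI: half-prescription isodose volume over prescription isodose volume."""
    return v50_cc / v100_cc


def percent_difference(reported, truth):
    """Signed percent difference of a reported value from the ground truth."""
    return 100.0 * (reported - truth) / truth


if __name__ == "__main__":
    # Sphere diameters evaluated in the study.
    for d in (3, 5, 7, 10, 15, 20):
        truth = sphere_volume_cc(d)
        approx = voxelized_volume_cc(d)
        print(d, round(truth, 4), round(approx, 4),
              round(percent_difference(approx, truth), 1))
```

Under this simple counting rule the discretization error grows as the sphere diameter approaches the grid spacing, which is the regime where the reported ranges above are widest. Commercial systems may use interpolation, supersampling, or contour-based volume calculation, so this sketch should not be read as reproducing any system's result.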