The US Food and Drug Administration (FDA) approved the PD-L1 immunohistochemical assay SP142 as a companion test to determine eligibility for atezolizumab therapy in patients with advanced triple-negative breast cancer (TNBC), but data from lung cancer studies suggest the assay suffers from poor reproducibility. We sought to evaluate reproducibility and concordance in PD-L1 scoring across multiple pathologists. Full TNBC sections were stained with the SP142 and SP263 assays and scored for percentage (%) immune cell (IC) staining by 19 pathologists from 14 academic institutions. The proportion of PD-L1-positive cases (defined as ≥1% IC staining) was determined for each assay, as was concordance across observers. We used a new method, which we call Observers Needed to Evaluate Subjective Tests (ONEST), to determine the minimum number of evaluators needed to estimate concordance among large numbers of readers, as occurs in the real-world setting. PD-L1 was interpreted as positive in an average of 58% of cases with the SP142 assay compared with 78% with SP263 (p < 0.0001). Continuous IC-positive scores ranged from 1 to 95% (mean = 20%) for SP263 and from 1 to 90% (mean = 10%) for SP142. With SP142, 26 cases (38%) showed complete two-category (<1% vs. ≥1%) concordance; with SP263, 38 cases (50%) showed complete agreement. The intraclass correlation coefficient (ICC) for two-category scoring was 0.513 for SP263 and 0.560 for SP142. ONEST plots showed overall percent agreement (OPA) decreasing as the number of observers increased, reaching a low plateau of 0.46 at ten observers for SP263 and 0.41 at eight observers for SP142. IC scoring with both assays showed poor reproducibility across multiple pathologists, with ONEST analysis suggesting that more than half of pathologists will disagree about IC scores. This could lead to many patients either receiving atezolizumab when they are unlikely to benefit or not receiving it when they may benefit.
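
To make the ONEST idea concrete, the following is a minimal sketch, not the study's implementation. It assumes that OPA for a group of observers is the fraction of cases on which every observer in the group makes the same two-category call (<1% vs. ≥1% IC), and that a curve is built by adding observers one at a time in a random order. The function names (`binarize`, `overall_percent_agreement`, `onest_curves`) and the toy scores are hypothetical and serve only as illustration.

```python
import random
from typing import List, Sequence


def binarize(percent_ic: float, cutoff: float = 1.0) -> int:
    """Two-category call: 1 if IC staining is >= cutoff percent, else 0."""
    return 1 if percent_ic >= cutoff else 0


def overall_percent_agreement(
    calls: Sequence[Sequence[int]], observers: Sequence[int]
) -> float:
    """Fraction of cases on which all listed observers made the same call.

    calls[i][j] is the call of observer j on case i.
    """
    agreed = sum(1 for case in calls if len({case[o] for o in observers}) == 1)
    return agreed / len(calls)


def onest_curves(
    calls: Sequence[Sequence[int]], n_permutations: int = 100, seed: int = 0
) -> List[List[float]]:
    """For each random observer ordering, OPA among the first k observers.

    Each curve covers k = 2 .. number of observers (assumed procedure).
    """
    rng = random.Random(seed)
    n_observers = len(calls[0])
    curves = []
    for _ in range(n_permutations):
        order = list(range(n_observers))
        rng.shuffle(order)
        curves.append(
            [
                overall_percent_agreement(calls, order[:k])
                for k in range(2, n_observers + 1)
            ]
        )
    return curves


# Made-up percent IC scores: 4 cases (rows) x 4 observers (columns).
toy_scores = [
    [0, 0, 0, 0],    # every observer calls negative
    [5, 10, 2, 20],  # every observer calls positive
    [0, 1, 0, 5],    # observers split at the 1% cutoff
    [2, 0, 3, 1],    # observers split at the 1% cutoff
]
toy_calls = [[binarize(score) for score in case] for case in toy_scores]
curves = onest_curves(toy_calls, n_permutations=20, seed=1)
```

In this toy data, two of the four cases have unanimous calls, so every curve ends at an OPA of 0.5 once all four observers are included. Each curve can only stay flat or fall as observers are added, because an extra observer can break agreement on a case but never restore it. Under these assumptions, the point at which the curves flatten corresponds to the plateau described above.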