Objective: Several severity scoring systems specific to acute renal failure have been proposed. However, most validation studies of these scoring systems were conducted in a single center or in a small number of centers, often the same ones used for their development. Therefore, it is not known whether such severity scoring systems can be widely applied. Design: Prospective clinical investigation. Setting: Intensive care units. Patients: One thousand seven hundred and forty-two intensive care unit patients with acute renal failure who were either treated with renal replacement therapy or fulfilled predefined criteria. Interventions: Demographic and clinical information and outcomes were measured. Measurements and Main Results: Scores for four acute renal failure-specific scoring systems and two general scoring systems (Simplified Acute Physiology Score II and Sequential Organ Failure Assessment) were calculated, and their discrimination and calibration were tested with receiver operating characteristic curves and Hosmer-Lemeshow goodness-of-fit tests. For the receiver operating characteristic curves, blood lactate levels were also used as a reference. All scores had an area under the receiver operating characteristic curve <0.7 (Mehta 0.670, Liano 0.698, Chertow 0.610, Paganini 0.643, Simplified Acute Physiology Score II 0.645, Sequential Organ Failure Assessment 0.675, lactate 0.639). For scores that can calculate predicted mortality, the Hosmer-Lemeshow goodness-of-fit test showed poor calibration. Conclusions: None of the scoring systems tested had a high level of discrimination or calibration to predict mortality for patients with acute renal failure when tested in a broad cohort of patients from multiple countries. A large, multiple-center database might be needed to improve the discrimination and calibration of acute renal failure scoring systems. Copyright © 2005 by the Society of Critical Care Medicine and Lippincott Williams & Wilkins.