Pneumococcal surface protein A (PspA) is a surface-exposed, highly immunogenic protein of Streptococcus pneumoniae. Its N-terminal α-helical domain (αHD) elicits antibodies in humans and animals that protect mice from fatal pneumococcal infection and that can be detected in vitro with opsonophagocytosis assays. The proline-rich domain (PRD) in the center of the PspA sequence can also elicit protection. This study revealed that although PRD sequences were diverse, PRDs from different pneumococcal isolates contained many shared elements. The inferred amino acid sequences of 123 such PRDs, analyzed by assembly and alignment-free (AAF) approaches, formed three PRD groups: 45 sequences were classified as Group 1, 19 as Group 2, and 59 as Group 3. All Group 3 sequences contained a highly conserved 22-amino acid non-proline block (NPB), although a significant polymorphism was observed at a single amino acid position within the NPB. Each of the three PRD groups had a characteristic pattern of short amino acid repeats, and most of the repeats were found in more than one PRD group. One of these repeats, PKPEQP, as well as the NPB, was previously shown to elicit protective antibodies in mice. In this study, we found that sera from 12 healthy human adult volunteers contained antibodies to all three PRD groups. This finding suggests that a PspA-based vaccine containing carefully selected PRDs and αHDs could redundantly cover the known diversity of PspA. Such an approach might reduce the chances of PspA variants escaping the immunity induced by a PspA vaccine.
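The idea that PRD groups differ in their patterns of short repeats, while sharing most individual repeats, can be illustrated with a minimal sketch. It is not the AAF analysis used in the study; it only counts occurrences of short motifs in amino acid strings. The two toy sequences and the second motif (`PAPAPKP`) are hypothetical placeholders chosen for illustration, and only `PKPEQP` is taken from the text above.

```python
def count_repeat(sequence: str, motif: str) -> int:
    """Count occurrences of motif in sequence, allowing overlaps."""
    count = 0
    start = sequence.find(motif)
    while start != -1:
        count += 1
        # Advance by one residue so overlapping copies are also counted.
        start = sequence.find(motif, start + 1)
    return count


def repeat_profile(sequence: str, motifs: list[str]) -> dict[str, int]:
    """Return a motif -> count mapping for one sequence."""
    return {motif: count_repeat(sequence, motif) for motif in motifs}


# Hypothetical toy sequences, NOT real PRD sequences from the study.
toy_prds = {
    "toy_A": "PKPEQPAPKPEQPAPAPKP",
    "toy_B": "PAPAPKPAPAPKPEQP",
}

# PKPEQP is named in the abstract; PAPAPKP is an arbitrary placeholder motif.
motifs = ["PKPEQP", "PAPAPKP"]

profiles = {name: repeat_profile(seq, motifs) for name, seq in toy_prds.items()}

# Motifs present in every toy sequence (shared elements).
shared = [m for m in motifs if all(profiles[name][m] > 0 for name in profiles)]

for name, profile in profiles.items():
    print(name, profile)
print("shared motifs:", shared)
```

In this toy case both motifs occur in both sequences but with different counts, which mirrors the distinction between repeats being shared across groups and each group having its own characteristic pattern.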