Background: Hepatitis B core protein (HBVc) has been extensively studied from both structural and immunological points of view, but the evolutionary forces driving sequence variation within the core protein are incompletely understood.

Results: In this study, variation in HBVc protein sequence was examined in a large collection of HBVc sequences from public sequence repositories. Several hundred sequences were aligned, and the alignment was used to analyse the distribution of polymorphisms along HBVc. Polymorphisms were found at 44 of the 185 amino acid positions analysed and were clustered predominantly in those parts of HBVc that form the outer surface and spike of the intact capsid. The relationship between HBVc diversity and HBV genotype was examined. The positions of variable amino acids along the sequence were examined in terms of the structural constraints of capsid and envelope assembly, and also in terms of immunological recognition by T and B cells.

Conclusion: Over three quarters of the amino acid positions within the HBVc sequence are non-polymorphic, and variation is concentrated at a limited number of positions. Phylogenetic analysis suggests that core protein-specific forces constrain its diversity within the context of overall HBV genome evolution. As a consequence, the core protein sequence is not a reliable predictor of virus genotype. The structural requirements of capsid assembly are likely to play a major role in limiting diversity. The phylogenetic analysis further suggests that immunological selection does not play a major role in driving HBVc diversity.

© 2005 Chain and Myers; licensee BioMed Central Ltd.
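
The per-position polymorphism count described in the Results can be illustrated with a minimal Python sketch. It assumes that a position counts as polymorphic whenever more than one residue is observed in that alignment column; the study may have applied a different criterion. The function name `polymorphic_positions` and the toy alignment are hypothetical and are not data from the study. Only the figures 44 and 185 are taken from the abstract.

```python
from collections import Counter


def polymorphic_positions(aligned_seqs, gap="-"):
    """Return 0-based column indices at which more than one residue occurs.

    Assumption: any column with two or more distinct non-gap residues is
    treated as polymorphic. This is an illustrative definition only.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")

    variable = []
    for col in range(length):
        # Count the residues seen in this column, ignoring gap characters.
        residues = Counter(seq[col] for seq in aligned_seqs if seq[col] != gap)
        if len(residues) > 1:
            variable.append(col)
    return variable


# Toy alignment for illustration only (not sequences from the study).
toy_alignment = [
    "MDIDPYKEFG",
    "MDIDPYKEFG",
    "MDIDTYKEFG",
    "MDIDPYREFG",
]

variable = polymorphic_positions(toy_alignment)
fraction_conserved = 1 - len(variable) / len(toy_alignment[0])

# Figures reported in the abstract: 44 polymorphic positions out of 185.
reported_conserved = (185 - 44) / 185

print(variable)                      # columns that vary in the toy alignment
print(round(fraction_conserved, 3))  # conserved fraction in the toy alignment
print(round(reported_conserved, 3))  # conserved fraction implied by 44/185
```

With the reported figures, 141 of 185 positions (about 76%) are non-polymorphic, which is consistent with the statement that over three quarters of positions do not vary.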