We investigate in detail the learning process in a feedforward Boolean network. Such a network transforms a set of n binary inputs into a binary output, i.e., it implements an n-variable Boolean function. If the network is constructed at random, the various possible Boolean functions are implemented with widely different probabilities. This probability (or entropy) landscape does not change very much once a minimal size of the network is reached. Moreover, it has a hierarchical structure, with the number of Boolean functions scaling as an inverse power law of their probability (Zipf's law). This indicates the existence of an underlying fractal structure. A remarkable property of these networks is their ability to generalize, i.e., to learn some of the Boolean functions from a restricted number of teaching examples. We suggest that the fractal and hierarchical nature of the entropy landscape is the key to this ability. We identify the Boolean functions that undergo such a learning transition as those corresponding to a local entropic maximum in Hamming-distance space. We give evidence that, as more complex problems are considered, the learning transition shifts to lower fractions of teaching examples and becomes sharper, suggesting the existence of a genuine phase transition. The results on the entropy landscape are obtained by Monte Carlo simulations, while the effect of teaching is discussed using mean-field analysis, by exact enumeration of the Monte Carlo data, and by actually training the network with a simulated-annealing type of learning process. © 1990 The American Physical Society.
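As an illustration of the non-uniform probability landscape described above, the following sketch samples random feedforward networks and tallies how often each 2-variable Boolean function (identified by its truth table) is realized. The choice of two-input NAND gates and the random wiring scheme are assumptions for this example, not the paper's specific architecture:

```python
import random
from collections import Counter

def random_boolean_network(n_inputs, n_gates, rng):
    """Wire a random feedforward network of two-input NAND gates.
    (NAND gates and this wiring rule are illustrative assumptions.)
    Each gate draws its two inputs from earlier signals (network inputs
    or outputs of previous gates), so the network is feedforward."""
    wiring = []
    for g in range(n_gates):
        pool = n_inputs + g  # signals available to gate g
        wiring.append((rng.randrange(pool), rng.randrange(pool)))
    return wiring

def evaluate(wiring, bits):
    """Propagate one input pattern through the network; output = last gate."""
    signals = list(bits)
    for a, b in wiring:
        signals.append(1 - (signals[a] & signals[b]))  # NAND
    return signals[-1]

def truth_table(wiring, n_inputs):
    """Encode the implemented Boolean function over all 2^n input patterns."""
    return tuple(
        evaluate(wiring, [(x >> i) & 1 for i in range(n_inputs)])
        for x in range(2 ** n_inputs)
    )

rng = random.Random(0)
n, gates, samples = 2, 6, 20000
counts = Counter(
    truth_table(random_boolean_network(n, gates, rng), n)
    for _ in range(samples)
)
ranked = counts.most_common()
# The empirical probabilities differ widely between functions,
# qualitatively matching the non-uniform entropy landscape.
print(ranked[0][1] / samples, ranked[-1][1] / samples)
```

Ranking the sampled functions by frequency, as in `counts.most_common()`, is also the natural starting point for checking the Zipf-like scaling of function count versus probability reported in the abstract.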