One previously unexplained observation about numeral systems is the shared tendency in numeral expressions: numerals greater than 20 often have the larger constituent number expressed before the smaller constituent number (e.g., twenty-four as opposed to four-twenty in English), and systems that originally adopt the reverse order of expression (e.g., four-and-twenty in Old English) tend to switch order over time. To explore these phenomena, we propose the view of Rapid Information Gain and contrast it with the established theory of Uniform Information Density. We compare the two theories in their ability to explain the shared tendency in the ordering of numeral expressions around 20. We find that Rapid Information Gain accounts for empirical patterns better than the alternative theory, suggesting that there is an emphasis on information front-loading as opposed to information smoothing in the design of large compound numerals. Our work shows that fine-grained generalizations about numeral systems can be understood in information-theoretic terms and offers an opportunity to characterize the design principles of lexical compounds through the lens of informative communication.