Pointwise mutual information (PMI), a simple measure of lexical association, is part of several algorithms used as models of lexical semantic memory. Typically, it is used as a component of more complex distributional models rather than in isolation.
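For reference, PMI in its standard form compares a word pair's joint probability with the product of the words' marginal probabilities:
\[
\mathrm{PMI}(w_1, w_2) = \log_2 \frac{p(w_1, w_2)}{p(w_1)\, p(w_2)},
\]
where $p(w_1, w_2)$ is the probability that the two words co-occur and $p(w)$ is a word's marginal probability. Because a rare word can reach a large PMI value on the strength of very few co-occurrences, the estimate is unstable at low frequencies; this is the "frequency bias" addressed below.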
We show that PMI outperforms default parameterizations of word embedding models, as measured by how closely it matches human relatedness judgments, when two simple techniques are applied: (1) down-weighting co-occurrences involving low-frequency words in order to address PMI's so-called "frequency bias," and (2) defining co-occurrences as counts of "events in which instances of word1 and word2 co-occur in a context" rather than "contexts in which word1 and word2 co-occur." We also identify which down-weighting techniques are most helpful.
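As a concrete illustration of both techniques, the minimal Python sketch below contrasts event-based with context-based co-occurrence counting and applies one possible down-weighting scheme, raising unigram counts to a smoothing exponent (0.75 is a common choice in the word embedding literature). The function names and the exponent are illustrative assumptions, not the paper's own parameterization, and the paper evaluates several down-weighting techniques of which this is only one.

```python
import math
from collections import Counter
from itertools import combinations

def cooccurrence_counts(contexts, event_based=True):
    """Count word-pair co-occurrences over a list of token lists.

    event_based=True counts events: a context in which word1 occurs
    n1 times and word2 occurs n2 times contributes n1 * n2.
    event_based=False counts contexts: that same context contributes 1.
    """
    pair_counts, word_counts = Counter(), Counter()
    for context in contexts:
        tokens = Counter(context)
        word_counts.update(tokens)
        for w1, w2 in combinations(sorted(tokens), 2):
            pair_counts[(w1, w2)] += tokens[w1] * tokens[w2] if event_based else 1
    return pair_counts, word_counts

def smoothed_pmi(pair_counts, word_counts, alpha=0.75):
    """Return a PMI function whose unigram probabilities use counts
    raised to the power alpha, one way to down-weight the influence
    of low-frequency words on the estimate."""
    pair_total = sum(pair_counts.values())
    word_total = sum(c ** alpha for c in word_counts.values())

    def pmi(w1, w2):
        key = tuple(sorted((w1, w2)))
        if pair_counts[key] == 0:
            return float("-inf")  # the pair never co-occurs
        p_joint = pair_counts[key] / pair_total
        p1 = word_counts[w1] ** alpha / word_total
        p2 = word_counts[w2] ** alpha / word_total
        return math.log2(p_joint / (p1 * p2))

    return pmi

# Example: three toy contexts (e.g., sentences or co-occurrence windows).
contexts = [["dog", "cat", "dog", "bone"], ["cat", "fish"], ["dog", "bone"]]
pairs, words = cooccurrence_counts(contexts, event_based=True)
print(smoothed_pmi(pairs, words)("dog", "bone"))
```

Counting events rather than contexts lets repeated instances of a word within one context contribute to the count, matching definition (2) above, while the smoothing exponent stands in for whichever down-weighting technique proves most helpful.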
The results suggest that simple measures may be capable of modeling certain phenomena in semantic memory, and that complex models which incorporate PMI might be improved with these modifications.