GENERALIZED THURSTONE MODELS FOR RANKING - EQUIVALENCE AND REVERSIBILITY

Recent work has determined the conditions under which generalized versions of Thurstone’s theory of comparative judgment are formally equivalent (i.e., empirically indistinguishable) for choice experiments. This note solves the analogous problem for ranking experiments: It is shown that if two “Generalized Thurstone Models” are equivalent for choice experiments with n alternatives they are also equivalent for ranking experiments with n alternatives, despite the fact that ranking generates many more preference probabilities. This result in turn allows one to determine which Generalized Thurstone Models are “reversible,” i.e., satisfy the requirement that regardless of whether the subject ranks from best to worst or from worst to best, rankings that express the same preference order will occur with the same probability.


Equivalence
Several recent articles in this journal have dealt with the equivalence properties of a family of random utility models for choice experiments that can be regarded as generalizations of Thurstone's (1927) Theory of Comparative Judgment, generalized in the sense that the utility random variables (Thurstone's "discriminal processes") are no longer required to have normal distributions.
("Equivalence" here means experimental indistinguishability. Two models, that is, theories, are said to be equivalent for a given class of experiments if results that satisfy one always satisfy the other, so that no experimental decision can be made between them. Yellott (1977, 1978), Moszner (1978), and Rockwell and Yellott (1979) all deal with a generalized version of Thurstone's Case V in which the utility random variables are independent and identically distributed except for shifts: Such models were referred to there simply as "Thurstone models"; here they will be called "Generalized Case V Thurstone (GT-V) Models." Strauss (1979) analyzes a broader class of "Generalized Thurstone (GT) Models" which includes the GT-V models and also nonindependent cases. All this work is reviewed here in Section 2.) This paper deals with the equivalence properties of these same models when they are applied to ranking experiments, that is, experiments in which the subject does not simply choose a single best alternative from some set, but instead rank orders all the alternatives from best to worst (or vice versa). These properties can be quickly determined, and turn out to be both simple and somewhat counterintuitive. Because ranking experiments yield many more preference probabilities than choice experiments (i.e., n! vs n), intuition suggests that they ought to be more efficient at discriminating between models. Specifically, one might expect to find at least some cases in which models that are nonidentical but nevertheless equivalent for certain choice experiments would become nonequivalent when applied to ranking versions of those same experiments. This can definitely happen within the general class of random utility models: Table 1 in Section 3 shows an example.
Surprisingly, however, it proves to be impossible for GT models: Theorem 1 in Section 3.3 shows that two such models are equivalent for ranking experiments with n alternatives if and only if they are equivalent for choice experiments with n alternatives. In other words, whenever two GT models are indistinguishable by choice experiments they are also indistinguishable by ranking experiments, despite the extra preference probabilities generated by the latter.

Reversibility
This equivalence theorem turns out to be useful in understanding an old puzzle in choice theory: the fact that Luce's (1959, 1977) Choice Axiom is incompatible with the intuitively sensible requirement that ranking probabilities should be essentially the same whether the subject ranks from best to worst or vice versa. In designing any ranking experiment one has to decide in advance on the direction of ranking, i.e., should the best alternative be assigned rank 1, the next-best rank 2, and so on ("best-to-worst" ranking), or should the worst alternative get rank 1, the next-worst rank 2, etc. ("worst-to-best" ranking)? In many contexts this decision seems entirely arbitrary because we expect that regardless of the direction instructions, rank orderings that mean the same thing ought to occur with the same probability. For example, if the alternatives are three tones $a_1, a_2, a_3$ to be ranked on the basis of loudness, it seems reasonable to expect that the probability of producing any given rank order, say $\langle a_i, a_j, a_k \rangle$, under the instruction "rank from loudest to softest" should be the same as the probability of producing the reverse ordering $\langle a_k, a_j, a_i \rangle$ under the instruction "rank from softest to loudest." However, it has been recognized for a long time that this "reversibility" assumption is surprisingly difficult to reconcile with other theoretical notions that seem equally plausible, and that apparently have nothing to do with the direction of ranking (Luce, 1959; Block & Marschak, 1960; Luce & Suppes, 1965; Marley, 1968; Thorsen & Stever, 1974). In particular, it is well known that reversibility cannot easily be combined with the ideas that choice and ranking probabilities should satisfy a common random utility model and that ranking is accomplished by a series of independent choices, a pair of assumptions closely related to the Choice Axiom.
The difficulties here are neatly captured in a well-known impossibility theorem due to Block and Marschak (1960; Theorem 51 in Luce and Suppes, 1965). The theorem holds for any number of alternatives but for convenience at this point we consider the simplest case, where there are only three. Suppose the alternatives are $a_1, a_2, a_3$, and that both choice and ranking probabilities are experimentally determined for all subsets under best-to-worst instructions and also under worst-to-best instructions. (For choices this means we sometimes ask the subject to choose the best alternative and sometimes to choose the worst.) Let $p(i)$ denote the probability of choosing $a_i$ from $\{a_1, a_2, a_3\}$ under best-to-worst instructions; $p(i, j)$ the probability of choosing $a_i$ from $\{a_i, a_j\}$ under the same instruction; $r(i, j, k)$ the probability of producing the rank order $\langle a_i, a_j, a_k \rangle$ in the best-to-worst condition when $a_1$, $a_2$, and $a_3$ are all available; and $r(i, j)$ the probability of producing rank order $\langle a_i, a_j \rangle$ when only $a_i$ and $a_j$ are available. And let $p^*(i)$, $p^*(i, j)$, $r^*(i, j, k)$, and $r^*(i, j)$ denote the probabilities of the same events under worst-to-best instructions, i.e., $r^*(i, j, k)$ is the probability of saying that $a_i$ is worse than $a_j$ is worse than $a_k$, etc. Now to capture the idea that both choices and rankings are generated by a common set of underlying utility random variables, we assume that the probability of choosing $a_i$ over any set of competitors is the same as the probability that $a_i$ is ranked ahead of those competitors, i.e.,
$$p(i) = r(i,j,k) + r(i,k,j), \qquad p(i,j) = r(i,j) = r(i,j,k) + r(i,k,j) + r(k,i,j),$$
$$p^*(i) = r^*(i,j,k) + r^*(i,k,j), \qquad p^*(i,j) = r^*(i,j) = r^*(i,j,k) + r^*(i,k,j) + r^*(k,i,j). \tag{i}$$
Next, to express the idea that ranking is accomplished by a series of independent choices, we assume that the probability of producing the rank order $\langle a_i, a_j, a_k \rangle$ is the same as the probability of first choosing $a_i$ from the whole set $\{a_1, a_2, a_3\}$ and then, in an independent pair-wise choice, choosing $a_j$ over $a_k$:
$$r(i,j,k) = p(i)\,p(j,k), \qquad r^*(i,j,k) = p^*(i)\,p^*(j,k). \tag{ii}$$
(This assumption is sometimes referred to as "decomposition", e.g., in Strauss (1979).) Finally, to express the notion that rank orderings that mean the same thing should have the same probability regardless of the instructed direction of ranking, we assume "reversibility":
$$r(i,j) = r^*(j,i), \qquad r(i,j,k) = r^*(k,j,i). \tag{iii}$$
Now taken one at a time each of these assumptions seems plausible, and even innocuous. However, Block and Marschak (1960), extending results obtained originally by Luce (1959), showed that if all three hold the subject must be completely indifferent between the alternatives, i.e., (i), (ii), and (iii) together imply that for all $i, j, k$: $p(i) = p^*(i) = 1/3$; $p(i,j) = p^*(i,j) = 1/2$; and $r(i,j,k) = r^*(i,j,k) = 1/6$.
(Footnote 2: Assumption (i) is equivalent to the assumption (i') that there exist random variables $U_1, U_2, U_3$ such that for all $i, j, k$: $p(i) = P(U_i = \max\{U_1, U_2, U_3\})$; $p(i,j) = r(i,j) = P(U_i > U_j)$; and $r(i,j,k) = P(U_i > U_j > U_k)$. The fact that (i') implies (i) is obvious; the fact that (i) implies (i') is Block and Marschak's (1960) well-known characterization theorem for random utility models (Theorem 49 in Luce and Suppes, 1965).)
The proof of this remarkable fact requires only two simple steps. First, note that (i) and (ii) imply that both the best-to-worst and worst-to-best choice probabilities satisfy Luce's Choice Axiom, which for this three-alternative case is simply
$$p(i,j) = \frac{p(i)}{p(i) + p(j)}, \qquad p^*(i,j) = \frac{p^*(i)}{p^*(i) + p^*(j)}. \tag{iv}$$
(To show this use (ii) to substitute for $r(i,j,k)$ in the second part of (i), yielding $p(i,j) = p(i)\,p(j,k) + p(i)\,p(k,j) + p(k)\,p(i,j) = p(i) + p(k)\,p(i,j)$.
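Then (iv) follows immediately. It is also easy to show that (i) and (iv) together imply (ii), so in the context of the random utility assumption (i), decomposition and the Choice Axiom are equivalent.) The Choice Axiom in turn implies the existence of ratio scale values $v_1, v_2, v_3$ and $v_1^*, v_2^*, v_3^*$ such that
$$p(i,j) = \frac{v_i}{v_i + v_j}, \quad p(i) = \frac{v_i}{v_1 + v_2 + v_3}, \qquad p^*(i,j) = \frac{v_i^*}{v_i^* + v_j^*}, \quad p^*(i) = \frac{v_i^*}{v_1^* + v_2^* + v_3^*}.$$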
The second step is to note that the first part of the reversibility assumption (iii) implies $v_i/v_j = v_j^*/v_i^*$. Since we can arbitrarily set $v_1 = v_1^* = 1$, it follows that in general $v_i^* = 1/v_i$, and then the second part of (iii) together with (ii) implies
$$\frac{v_i}{v_i + v_j + v_k} \cdot \frac{v_j}{v_j + v_k} = \frac{1/v_k}{1/v_i + 1/v_j + 1/v_k} \cdot \frac{1/v_j}{1/v_j + 1/v_i}.$$
Simplifying this equation we obtain $v_i v_k = v_j^2$, and first setting $j = 1$ (to yield $v_2 v_3 = 1$) and then $j = 2$ (to yield $v_1 v_3 = v_3 = v_2^2 = v_3^{-2}$, i.e., $v_3^3 = 1$) it follows that $v_1 = v_2 = v_3 = v_1^* = v_2^* = v_3^* = 1$.
Now this impossibility theorem is puzzling for two reasons. First, and foremost, it is not at all obvious why the Choice Axiom should have anything to do with the direction of ranking, and so the fact that it cannot be reconciled with reversibility seems to be nothing more than an algebraic accident. Second, the theorem does not give any hint as to the degree of incompatibility between reversibility and the Choice Axiom: We know that $r(i,j,k)$ and $r^*(k,j,i)$ cannot be equal for all $i, j, k$ (except in the degenerate case $v_i = v_i^* = 1$), but we have no intuitive basis for expecting their difference to be large or small. In fact a computer search shows that this difference is never greater than 0.05, and consequently would be practically impossible to detect experimentally. However it is not easy to see intuitively why this should be so.
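The size of the irreversibility can be explored numerically. The following is a minimal sketch, not the original computer search: it assumes the Choice Axiom forms $r(i,j,k) = p(i)\,p(j,k)$ and $v_i^* = 1/v_i$ derived above, and the function names and the search grid are hypothetical choices made for illustration.

```python
from itertools import product


def r_best_to_worst(vi, vj, vk):
    """r(i, j, k) = p(i) * p(j, k) under the Choice Axiom with scale values v."""
    return vi / (vi + vj + vk) * vj / (vj + vk)


def r_star_reversed(vi, vj, vk):
    """r*(k, j, i) = p*(k) * p*(j, i) with reciprocal scale values v* = 1 / v."""
    wi, wj, wk = 1.0 / vi, 1.0 / vj, 1.0 / vk
    return wk / (wi + wj + wk) * wj / (wj + wi)


def max_irreversibility(grid):
    """Largest |r(i,j,k) - r*(k,j,i)| over the grid, with v_i fixed at 1.

    Both probabilities depend only on ratios of scale values, so fixing
    v_i = 1 loses no generality.
    """
    return max(
        abs(r_best_to_worst(1.0, vj, vk) - r_star_reversed(1.0, vj, vk))
        for vj, vk in product(grid, repeat=2)
    )


# Hypothetical search grid: v_j and v_k from 2**-10 to 2**10 in quarter-octave steps.
GRID = [2.0 ** (k / 4) for k in range(-40, 41)]
MAX_GAP = max_irreversibility(GRID)
print(f"largest gap found on the grid: {MAX_GAP:.4f}")
```

On this grid the largest gap occurs near $v_i = v_k = 1$, $v_j = 4$, and the gap vanishes whenever $v_i v_k = v_j^2$, in line with the algebra above.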
Approached from the standpoint of GT models, however, the impossibility theorem becomes more understandable. The starting point is the fact that the Choice Axiom is equivalent to the GT-V model based on the double exponential distribution $P(X \le x) = e^{-e^{-x}}$ (Yellott, 1977). Consequently (i) and (ii) imply the simultaneous existence of real constants $m_1, m_2, m_3$ and $m_1^*, m_2^*, m_3^*$ such that for all $i, j, k$:
$$p(i) = P(m_i + X_i = \max\{m_j + X_j \mid j = 1,2,3\}), \qquad p(i,j) = P(m_i + X_i > m_j + X_j),$$
$$p^*(i) = P(m_i^* + X_i = \max\{m_j^* + X_j \mid j = 1,2,3\}), \qquad p^*(i,j) = P(m_i^* + X_i > m_j^* + X_j),$$
where the $X_i$ are independent identically distributed random variables with common distribution function $F(x) = e^{-e^{-x}}$ (Fig. 1 shows the corresponding probability density function). Moreover (i) implies that the ranking probabilities also satisfy the same double exponential GT-V model:
$$r(i,j,k) = P(m_i + X_i > m_j + X_j > m_k + X_k), \qquad r^*(i,j,k) = P(m_i^* + X_i > m_j^* + X_j > m_k^* + X_k)$$
(since $r(i,j,k) = p(j,k) - p(j)$ and $r^*(i,j,k) = p^*(j,k) - p^*(j)$). Now from the first part of the reversibility assumption we have
$$p(i,j) = P(X_j - X_i < m_i - m_j) = p^*(j,i) = P(X_i - X_j < m_j^* - m_i^*),$$
and consequently $m_i - m_j = m_j^* - m_i^*$. Since the $m$ values are only unique up to addition of a constant we can set $m_1 = m_1^* = 0$, so that $m_i^* = -m_i$. Then the second part of reversibility holds iff
$$r(i,j,k) = P(m_i + X_i > m_j + X_j > m_k + X_k) = r^*(k,j,i) = P(-m_k + X_k > -m_j + X_j > -m_i + X_i).$$
This last equation is the key to the incompatibility between the Choice Axiom and reversibility. For suppose it were true for all possible values of $m_1, m_2, m_3$, which is what would be required in order for the Choice Axiom to be generally consistent with reversibility. Then it would follow that the double exponential GT-V model would be equivalent for 3-alternative ranking experiments to the GT-V model based on the distribution function $F^*(x) = 1 - e^{-e^{x}}$, i.e., the distribution function of $-X$ when $X$ is double exponential.
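The link between the double exponential GT-V model and the Choice Axiom can be checked by simulation. The sketch below is illustrative only: the scale values, the seed, and the function names are assumptions, and the comparison uses $v_i = e^{m_i}$ as the ratio scale values of the Choice Axiom.

```python
import math
import random


def draw_double_exponential(rng):
    """Inverse-CDF draw from F(x) = exp(-exp(-x))."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() never returns 1.0
        u = rng.random()
    return -math.log(-math.log(u))


def simulate(m, trials, seed=0):
    """Estimate p(i) for each i and r(0, 1, 2) from utilities m_i + X_i."""
    rng = random.Random(seed)
    first_counts = [0, 0, 0]
    order_012 = 0
    for _ in range(trials):
        u = [mi + draw_double_exponential(rng) for mi in m]
        ranking = sorted(range(3), key=lambda i: -u[i])  # best first
        first_counts[ranking[0]] += 1
        if ranking == [0, 1, 2]:
            order_012 += 1
    return [c / trials for c in first_counts], order_012 / trials


def choice_axiom(m):
    """Choice Axiom predictions with ratio scale values v_i = exp(m_i)."""
    v = [math.exp(mi) for mi in m]
    p_first = [vi / sum(v) for vi in v]
    r_012 = p_first[0] * v[1] / (v[1] + v[2])  # decomposition p(0) * p(1, 2)
    return p_first, r_012


M = [0.0, 0.5, -0.3]  # hypothetical scale values
SIM_P, SIM_R = simulate(M, trials=100_000)
LUCE_P, LUCE_R = choice_axiom(M)
print(SIM_P, LUCE_P)
print(SIM_R, LUCE_R)
```

With enough trials the simulated frequencies should agree with the Choice Axiom values to within sampling error.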
However, in view of the result on ranking equivalence described earlier in Section 1.1, this could be true only if these two GT-V models were also equivalent for 3-alternative choice experiments, and this cannot be so because we already know that the double exponential GT-V model is unique for such experiments, i.e., the only GT-V models equivalent to the model based on $F(x) = e^{-e^{-x}}$ are those based on distributions of the same type, i.e., distribution functions of the form $e^{-e^{-(ax+b)}}$, where $a > 0$ and $b$ are arbitrary constants (Yellott, 1977). Moreover, because the models based on $F$ and $F^*$ would have to generate the same ranking systems using the same scale values (that is, $\{m_i + X_i\}$ and $\{m_i - X_i\}$ would have to generate the same ranking probabilities) the constant $a$ here would have to be 1. (Footnote 3 below explains why.) Consequently reversibility could hold only if for some constant $b$
$$1 - e^{-e^{x}} = e^{-e^{-(x+b)}} \quad \text{for all } x.$$
However, this would mean that the double exponential distribution is symmetric, i.e., for the centering constant $b/2$ (which would have to be zero here) the distribution functions $F(x + b/2)$ (corresponding to $X - b/2$) and $1 - F(-x + b/2)$ (corresponding to $-(X - b/2)$) would be identical. In other words, for the Choice Axiom to satisfy reversibility the densities corresponding to $1 - e^{-e^{x}}$ and $e^{-e^{-x}}$ would have to be the same, and clearly they are not, as one can see in Fig. 1. However, one can also see from the figure that the asymmetry of the double exponential density is not very great (compared, for example, to an exponential distribution), and so it becomes understandable that the irreversibility implied by the Choice Axiom is numerically not very large. Analyzed from the GT standpoint then, Block and Marschak's impossibility theorem appears not as an algebraic fluke, but rather as a natural consequence of the shape of the double exponential distribution, in particular its asymmetry. However, the same analysis also raises a general question.
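The asymmetry argument can be made concrete with a few lines of arithmetic. This sketch assumes only the two distribution functions named above; the matching-by-medians step and the evaluation grid are illustrative choices, not part of the original argument. If $F^*(x) = F(x + b)$ held for all $x$, the shift $b$ would be forced by the medians, so a nonzero gap at that $b$ rules out every $b$.

```python
import math


def F(x):
    """Double exponential distribution function exp(-exp(-x))."""
    return math.exp(-math.exp(-x))


def F_star(x):
    """Distribution function of -X when X has distribution F."""
    return 1.0 - math.exp(-math.exp(x))


# Medians: F(x) = 1/2 at -ln(ln 2); F_star(x) = 1/2 at ln(ln 2).
MEDIAN_F = -math.log(math.log(2.0))
MEDIAN_F_STAR = math.log(math.log(2.0))

# The only shift that could possibly align the two functions.
B = MEDIAN_F - MEDIAN_F_STAR

# Largest discrepancy between F_star(x) and F(x + B) on a coarse grid.
XS = [k / 10 for k in range(-50, 51)]
MAX_GAP = max(abs(F_star(x) - F(x + B)) for x in XS)
print(f"forced shift b = {B:.4f}, largest gap = {MAX_GAP:.4f}")
```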
Suppose we retain the assumption that choices and rankings satisfy a common random utility model (i.e., (i)), but drop the decomposition assumption (ii), which in this context was equivalent to assuming that both the best-to-worst and worst-to-best probabilities satisfy the double exponential GT-V model. Instead, suppose we replace (ii) with some other assumption that implies that both sets of probabilities satisfy some other GT-V model, i.e., a model based on some distribution other than the double exponential: normal, gamma, or whatever. Then the question arises, which of the resulting ranking models would satisfy the reversibility assumption (iii)? This general problem is solved here in Section 3.4, using the ranking equivalence theorem of Section 3.3 and earlier results on choice equivalence summarized below in Section 2. On the basis of the special case of the double exponential model, one might expect that the answer would depend only on the symmetry of the utility distributions, i.e., one might expect that a GT-V model is reversible iff its utility distribution is symmetrical. However, this hunch turns out to be only half correct: Every symmetrical distribution does yield a reversible GT-V model, but so also do some asymmetric distributions, though in every case these curious models are only reversible for some finite number of alternatives and never for arbitrarily many. (For example, the asymmetric density function shown in the lower panel of Fig. 2 yields a GT-V model that is reversible for three alternatives but not for four.) The key to reversibility is indeed the shape of the utility distribution, but more is involved than symmetry alone: In addition, one needs to consider the shape of its Fourier transform.

Organization of the Paper
To put these results in context it seems useful to begin by reviewing the equivalence properties of GT models for choice experiments, since results on that problem that we need here are presently scattered through a series of papers. This is done in Section 2, and then Section 3 deals with ranking: 3.1 provides notation and definitions; 3.2 discusses the relationship between choice and ranking for random utility models in general; 3.3 gives the equivalence theorem for GT models for ranking; and 3.4 contains the results on reversibility.

Notation and Terminology
$A_n = \{a_1, a_2, \ldots, a_n\}$ is a set of n choice alternatives, which we identify with the set of indices $I_n = \{1, 2, \ldots, n\}$. In a pair comparison experiment alternatives are presented two at a time with the instruction "pick one": Here $p(i, j)$ denotes the probability that $a_i$ is chosen when $\{a_i, a_j\}$ is presented, and an entire collection of the form $\{p(i, j) \mid i, j \in I_n\}$ is called a pair comparison system for n alternatives. In a complete choice experiment one presents not only pairs, but all of the subsets of $A_n$: Here $p(i; S)$, $i \in S \subseteq I_n$, denotes the probability that $a_i$ is chosen when the subject is presented with the subset $\{a_j \mid j \in S\}$ (and thus $p(i, j)$ is shorthand for $p(i; \{i, j\})$). A complete collection of probability distributions for all of the subsets of $A_n$ (i.e., a collection of the form $\{\{p(i; S) \mid i \in S\} \mid S \subseteq I_n, |S| \ge 2\}$) is called a complete system of choice probabilities for n alternatives.
Two choice models (i.e., "theories") are said to be equivalent for pair comparison experiments with n alternatives iff every pair comparison system for n alternatives that satisfies (either) one also satisfies the other. Similarly, two models are said to be completely equivalent for n alternatives iff every complete system of choice probabilities for n alternatives that satisfies one also satisfies the other.

Generalized Case V Models
Thurstone's basic idea was, of course, to model choice behavior on the assumption that the alternatives $a_1, a_2, \ldots$ correspond to real valued utility random variables $U_1, U_2, \ldots$: When a subject is required to choose a single alternative from the subset $\{a_j \mid j \in S\}$ he picks the one with the largest utility random variable, so that $p(i; S) = P[U_i = \max\{U_j \mid j \in S\}]$. (Actually Thurstone only dealt explicitly with pair comparisons, but the extension to complete experiments is immediate.) For concreteness' sake Thurstone assumed the $U_i$ were normal random variables, and considered five special cases corresponding to various constraints on their variances and covariances. The simplest and most widely applied case, Case V, amounts to assuming that the $U_i$ are independent and identically distributed (i.i.d.) except for shifts along the abscissa, i.e., they can be written in the form $U_1 = m_1 + X_1$, $U_2 = m_2 + X_2, \ldots$, where the $m_i$ are real constants ("scale values") and the $X_i$ are i.i.d. normal random variables.
Yellott (1977) considered a class of Generalized Case V Thurstone (GT-V) Models (called simply "Thurstone Models" in that paper) in which the $X_i$ are i.i.d. with common distribution function $F$ (i.e., $F(x) = P(X_i \le x)$), subject to the condition that the difference distribution $D_F$ (i.e., the distribution function of $X_i - X_j$) is everywhere continuous and strictly increasing. (This condition guarantees that $p(i, j)$ is a continuous strictly increasing function of the scale value difference $m_i - m_j$.) The GT-V model $\mathcal{T}_F$ is then identified with the set of all systems of choice probabilities that can be generated by $F$ together with arbitrary scale values $m_1, m_2, \ldots$; and the identifiability problem is whether $\mathcal{T}_F$ and $\mathcal{T}_G$ can be equivalent for some class of experiments even though $F$ and $G$ are different distributions. (Equivalence here means explicitly that any system of choice probabilities that can be generated by $\mathcal{T}_F$ by assigning arbitrary values to the scale parameters $m_1, m_2, \ldots$ can also be generated by $\mathcal{T}_G$ using some set of parameters $m_1', m_2', \ldots$; and vice versa.) It is quickly apparent that if $F$ and $G$ are distributions of the same type (i.e., $F(x) = G(ax + b)$, $a > 0$) then $\mathcal{T}_F$ and $\mathcal{T}_G$ are equivalent for all choice experiments. In other words, choice experiments cannot identify the "true" mean and variance of $F$, but only (at most) its type: normal, exponential, gamma, or whatever. Consequently the interesting question is whether $\mathcal{T}_F$ and $\mathcal{T}_G$ can be equivalent when $F$ and $G$ are distributions of different types. The answer turns out to depend on the class of experiments one has in mind, in particular on the number of alternatives.
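The remark that choice experiments identify at most the type of $F$ can be illustrated with Thurstone's original normal case. The sketch below is an assumption-laden illustration: the scale values and the standard deviation are hypothetical, and it covers pair comparisons only. It shows that rescaling the utility distribution is absorbed by rescaling the scale values.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def pair_prob_normal(mi, mj, sigma):
    """p(i, j) = P(m_i + X_i > m_j + X_j) for i.i.d. X ~ N(0, sigma^2).

    X_j - X_i is N(0, 2 sigma^2), so p(i, j) = Phi((m_i - m_j) / (sigma sqrt 2)).
    """
    return normal_cdf((mi - mj) / (sigma * math.sqrt(2.0)))


M = [0.0, 0.4, 1.1]               # hypothetical scale values for sigma = 1
SIGMA = 2.5                       # hypothetical alternative standard deviation
M_SCALED = [SIGMA * m for m in M]  # scale values that compensate for SIGMA

PAIRS = [(i, j) for i in range(3) for j in range(3) if i != j]
UNIT = {(i, j): pair_prob_normal(M[i], M[j], 1.0) for i, j in PAIRS}
SCALED = {(i, j): pair_prob_normal(M_SCALED[i], M_SCALED[j], SIGMA) for i, j in PAIRS}
print(UNIT[(0, 1)], SCALED[(0, 1)])
```

Both parameterizations generate the same pair comparison system, so no experiment of this kind could recover the variance.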
2.2.1. Pair comparisons. Yellott (1977) shows that $\mathcal{T}_F$ and $\mathcal{T}_G$ are equivalent for pair comparison experiments with three or more alternatives (the smallest nontrivial choice experiment) iff they are equivalent for exactly three, and the latter is true iff $D_F(x) = D_G(ax)$ for some $a > 0$. This condition is not very restrictive: Besides solutions of the form $F(x) = G(ax + b)$ and $F(x) = 1 - G(-ax + b)$ (where $F$ is the distribution of $X$ and $G$ the distribution of $-aX + b$, $a > 0$, so that $F$ and $G$ may or may not be of the same type, depending on whether they are symmetrical or asymmetrical), there are usually others in which $F$ and $G$ are not related in any probabilistically sensible way, but instead only through the Fourier analytic fact that their densities have essentially the same amplitude spectrum, i.e., the characteristic functions of $F$ and $G$ (denoted $f$ and $g$, respectively) satisfy the relationship $|g(t)| = |f(at)|$. (In this respect pair comparison experiments are analogous to X-ray diffraction experiments in crystallography: Both can only identify amplitude spectra and necessarily lose the phase spectrum.) For example, the double exponential model $F(x) = e^{-e^{-x}}$ (which yields Luce's Choice Axiom) has pair comparison equivalents that are not of the same type as either $F(ax + b)$ or $1 - F(-ax + b)$, and this is also true of the exponential model $F(x) = 1 - e^{-x}$. The only exceptional case (reported so far) is the normal model (Thurstone's original Case V), whose pair comparison predictions cannot be entirely duplicated by any non-normal model, a special status that seems quite fitting, though historically entirely coincidental (Yellott, 1977, p. 131).
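The loss of phase information can be seen in the simplest reflected pair. The sketch below assumes the exponential model mentioned above, with characteristic function $1/(1 - it)$, and compares it with the distribution of $-X$, whose characteristic function is $1/(1 + it)$; the function names and the evaluation points are illustrative.

```python
import cmath


def cf_exponential(t):
    """Characteristic function of X with F(x) = 1 - exp(-x), x >= 0."""
    return 1.0 / complex(1.0, -t)


def cf_reflected(t):
    """Characteristic function of -X, i.e. the complex conjugate of the above."""
    return 1.0 / complex(1.0, t)


TS = [k / 4 for k in range(-20, 21)]

# Amplitude spectra agree at every t ...
AMPLITUDE_GAP = max(abs(abs(cf_exponential(t)) - abs(cf_reflected(t))) for t in TS)

# ... but the phase spectra do not (compare at t = 1).
PHASE_F = cmath.phase(cf_exponential(1.0))
PHASE_G = cmath.phase(cf_reflected(1.0))
print(AMPLITUDE_GAP, PHASE_F, PHASE_G)
```

Since the difference $X_i - X_j$ has characteristic function $|f(t)|^2$, the two models share the same difference distribution and hence the same pair comparison predictions.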

2.2.2. Complete choice experiments. The technical key here is the fact that $\mathcal{T}_F$ and $\mathcal{T}_G$ are equivalent for complete experiments with n alternatives iff the characteristic functions of $F$ and $G$ satisfy the relationship
$$g(t_1)\,g(t_2) \cdots g(t_{n-1})\,g(-t_1 - \cdots - t_{n-1}) = f(at_1)\,f(at_2) \cdots f(at_{n-1})\,f(-a(t_1 + \cdots + t_{n-1})) \tag{1}$$
for some $a > 0$ and all $t_1, t_2, \ldots, t_{n-1}$. (Yellott (1977) proves "only if"; Rockwell and Yellott (1979) supply the proof of "if," inadvertently omitted earlier.) $g(t) = e^{ibt} f(at)$ is always a solution to (1); it corresponds to $F(x) = G(ax + b)$. And if (1) holds for all n, this is the only solution, i.e., $\mathcal{T}_F$ and $\mathcal{T}_G$ are completely equivalent for an infinite number of alternatives iff $F$ and $G$ are distributions of the same type.
Consequently if $F$ and $G$ are distributions of different types there is always some minimal number of alternatives necessary to distinguish between them. If $F$ or $G$ has a nonvanishing characteristic function, this number is three, i.e., if $f$ or $g$ is nonvanishing then for n = 3 the only solution to (1) is $g(t) = e^{ibt} f(at)$ (Yellott, 1977). (Footnote 4: the constant $a$ in (1) has the same interpretation as in Footnote 3.) This condition is satisfied by all the well known distributions that are likely to occur to one as natural bases for GT-V models, e.g., the normal, gamma, double exponential, etc., since all have nonvanishing characteristic functions (it's worth recalling that this is true of all infinitely divisible distributions). Consequently one can say that all of these models are in principle identifiable by the smallest possible complete choice experiments. However, nothing in their definition forces GT-V models to have nonvanishing characteristic functions, and some do not. In those cases (1) can have solutions $f$ and $g$ that correspond to distributions of different types for n = 3 (Yellott, 1978; Moszner, 1978), and indeed for any n (Rockwell and Yellott, 1979). Consequently for every n one can construct pairs of GT-V models that are completely equivalent for experiments with n (or fewer) alternatives but that become nonequivalent for n + 1. (Specific examples are the models corresponding to the density functions $\mathrm{sinc}^2(x)(1 + \cos 2\pi n x)$ and $\mathrm{sinc}^2(x)(1 + \sin 2\pi n x)$, where $\mathrm{sinc}(x) = (\sin \pi x)/(\pi x)$. Figure 2 illustrates these densities for n = 3.) These cases prompted the present analysis of ranking experiments, since it seemed possible that the extra information provided by ranking probabilities might enable one to distinguish between models on the basis of fewer alternatives.
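As a quick sanity check on the two example densities, the sketch below evaluates them for n = 3 and confirms numerically that each is nonnegative and integrates to approximately one, and that the cosine version is symmetric while the sine version is not. The integration range, step count, and function names are illustrative assumptions; the truncated tails account for the small shortfall from one.

```python
import math


def sinc(x):
    """sinc(x) = sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def density_cos(x, n):
    return sinc(x) ** 2 * (1.0 + math.cos(2.0 * math.pi * n * x))


def density_sin(x, n):
    return sinc(x) ** 2 * (1.0 + math.sin(2.0 * math.pi * n * x))


def integrate(f, lo, hi, steps):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(steps))


N = 3
MASS_COS = integrate(lambda x: density_cos(x, N), -100.0, 100.0, 100_000)
MASS_SIN = integrate(lambda x: density_sin(x, N), -100.0, 100.0, 100_000)
print(MASS_COS, MASS_SIN)
print(density_cos(0.1, N), density_cos(-0.1, N))  # symmetric
print(density_sin(0.1, N), density_sin(-0.1, N))  # asymmetric
```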

Generalized Thurstone (GT) Models
Strauss (1979) deals with this broader class of models, which includes the GT-V models as a special case. A GT model $\mathcal{T}_{F_n}$ corresponds to a specific sequence of n random variables $X_1, \ldots, X_n$ with joint distribution function $F_n(x_1, x_2, \ldots, x_n) = P(X_1 \le x_1, X_2 \le x_2, \ldots, X_n \le x_n)$. For the same reason that the difference distribution $D_F$ is assumed to be continuous and strictly increasing in the case of GT-V models, $F_n$ here is assumed to satisfy the condition that for all i and j, $i \ne j$, the difference distribution $P(X_i - X_j \le x)$ is everywhere continuous and strictly increasing. Otherwise $F_n$ is unrestricted; in particular, nothing is assumed about dependencies between the $X_i$.
This result also proves to be the key to the equivalence properties of GT models for ranking, as shown below in Section 3.3.

Terminology and Notation
As before, $A_n$ denotes a set of alternatives $\{a_1, \ldots, a_n\}$; $I_n$ the corresponding set of indices $\{1, \ldots, n\}$. In a ranking experiment we present subsets of $A_n$ and ask the subject to rank order the alternatives in each subset from best to worst: The best alternative is assigned rank 1, the next best rank 2, and so on. The result for each subset is a probability distribution over its possible rank orderings, i.e., over the possible permutations of its indices. To denote this, suppose $S \subseteq I_n$ and $\rho$ is a permutation of the indices in $S$, i.e., $\rho$ maps $S$ onto the integers $1, 2, \ldots, |S|$: $\rho(j)$ is the ordinal position assigned to alternative j, and $\rho^{-1}(i)$ is the index of the alternative that has rank order i. Then $r(\rho; S)$ denotes the probability of ranking $\{a_j \mid j \in S\}$ in the order $\langle a_{\rho^{-1}(1)}, a_{\rho^{-1}(2)}, \ldots, a_{\rho^{-1}(|S|)} \rangle$.
An experiment that determines $r(\rho; S)$ for every rank ordering of every subset of $I_n$ is a complete ranking experiment, and the resulting collection of probability distributions is a complete system of ranking probabilities for n alternatives. Such a system satisfies the GT model $\mathcal{T}_{F_n}$ iff there exist scale values $m_1, \ldots, m_n$ such that for every ranking $\rho$ of every $S$, $S \subseteq I_n$, $|S| \ge 2$:
$$r(\rho; S) = P\left[m_{\rho^{-1}(1)} + X_{\rho^{-1}(1)} > m_{\rho^{-1}(2)} + X_{\rho^{-1}(2)} > \cdots > m_{\rho^{-1}(|S|)} + X_{\rho^{-1}(|S|)}\right].$$
In other words a system of ranking probabilities satisfies $\mathcal{T}_{F_n}$ iff it can be generated by assigning utility random variables $m_1 + X_1, \ldots, m_n + X_n$ to $a_1, \ldots, a_n$ (with $F_n$ the joint distribution function of $X_1, \ldots, X_n$) and applying the decision rule: "For every subset $\{a_j \mid j \in S\}$ rank the alternatives in the same order as the random variables $\{m_j + X_j \mid j \in S\}$." Then we say that two GT models are equivalent for ranking iff every complete system of ranking probabilities that satisfies one also satisfies the other, and the identifiability problem is to determine when this can happen for two models that correspond to nontrivially different distributions (i.e., it's obvious that $\mathcal{T}_{F_n}$ and $\mathcal{T}_{G_n}$ are equivalent for ranking as well as for choice if $F_n(x_1, \ldots, x_n) = G_n(ax_1 + b, \ldots, ax_n + b)$, where $a > 0$ and $b$ are arbitrary constants, and so the question is whether there are other possibilities). In particular, since it is immediately clear that for GT models ranking equivalence implies choice equivalence, it is natural to wonder whether two such models can be equivalent for choice but not for ranking.
To put this question in perspective it is useful to begin by considering relationships between choice and ranking for the broadest class of models that still retain Thurstone's basic idea: These are the "random utility models" described in the next section. This is important because the assumption that choices and rankings are both dictated by order relationships within a common set of random variables obviously implies a structural bond between the two systems of preference probabilities, and so it is necessary to understand this before considering the additional implications of the distributional constraints embodied in GT models.
By a random utility (RU) model $\mathcal{U}$ we mean any collection of random utility vectors: A given system of choice or ranking probabilities is said to satisfy $\mathcal{U}$ iff it is generated by some vector in $\mathcal{U}$.
Two RU models $\mathcal{U}$ and $\mathcal{U}'$ are equivalent for choice (ranking) experiments with n alternatives iff every complete system of choice (ranking) probabilities for n alternatives that satisfies one also satisfies the other. Clearly every GT model is a RU model, and likewise every GT-V model (construed as a sequence of GT models, as in Section 2.3). However, a RU model may also consist of just one random utility vector. Now suppose $\mathcal{U}$ and $\mathcal{U}'$ are equivalent for ranking experiments with n alternatives, and that some system of choice probabilities $\{p(\cdot\,; S) \mid S \subseteq I_n\}$ satisfies $\mathcal{U}$. Then this system is generated by some vector $\mathbf{U}_n$ in $\mathcal{U}$, and that vector simultaneously generates a system of ranking probabilities $\{r(\cdot\,; S) \mid S \subseteq I_n\}$ which also must satisfy $\mathcal{U}'$ for some vector $\mathbf{U}_n'$. This vector in turn must generate the same choice system as $\mathbf{U}_n$, since the choice probabilities generated by a given utility vector are completely determined by its ranking probabilities via the relationship
$$p(i; S) = \sum_{\rho:\ \rho(i) = 1} r(\rho; S),$$
i.e., the probability of choosing i from S is the probability of ranking i first in S. Consequently:
LEMMA 2. If two RU models are equivalent for ranking experiments with n alternatives they are equivalent for choice experiments with n alternatives.
Next, suppose $\mathcal{U}$ and $\mathcal{U}'$ are equivalent for choice experiments with three alternatives, and $\{r(\cdot\,; S) \mid S \subseteq I_3\}$ is a ranking system generated by $\mathbf{U}_3 \in \mathcal{U}$. Then this vector generates a choice system $\{p(\cdot\,; S) \mid S \subseteq I_3\}$, and that system must also be generated by some vector in $\mathcal{U}'$, say $\mathbf{U}_3'$. That vector in turn must generate the same ranking system as $\mathbf{U}_3$, because for three alternatives the ranking probabilities generated by a utility vector are completely determined by the corresponding choice probabilities, via the relationships
$$r(i, j) = P[U_i > U_j] = p(i, j),$$
$$r(i, j, k) = P[U_i > U_j > U_k] = P[U_j > U_k] - P[U_j = \max\{U_1, U_2, U_3\}] = p(j, k) - p(j; \{1, 2, 3\}).$$
Combining this argument with Lemma 2, we see that: LEMMA 3. Two RU models are equivalent for choice experiments with three alternatives iff they are equivalent for ranking experiments with three alternatives.
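The three-alternative relationship used in this argument, $r(i,j,k) = p(j,k) - p(j; \{1,2,3\})$, can be checked by a round trip from rankings to choices and back. The sketch below is illustrative: the ranking distribution is an arbitrary hypothetical example and the function names are not from the paper.

```python
from fractions import Fraction
from itertools import permutations

ITEMS = (1, 2, 3)


def choice_from_ranking(r):
    """Choice probabilities implied by ranking probabilities r[(i, j, k)] (best first)."""
    p_full = {i: sum(pr for order, pr in r.items() if order[0] == i) for i in ITEMS}
    p_pair = {
        (i, j): sum(pr for order, pr in r.items() if order.index(i) < order.index(j))
        for i in ITEMS for j in ITEMS if i != j
    }
    return p_full, p_pair


def ranking_from_choice(p_full, p_pair):
    """Recover r(i, j, k) = p(j, k) - p(j; {1, 2, 3}) for every ordering."""
    return {(i, j, k): p_pair[(j, k)] - p_full[j] for i, j, k in permutations(ITEMS)}


# Hypothetical ranking distribution: weights 1..6 over the six orderings.
ORDERS = list(permutations(ITEMS))
R = {order: Fraction(w, 21) for order, w in zip(ORDERS, range(1, 7))}

P_FULL, P_PAIR = choice_from_ranking(R)
R_RECOVERED = ranking_from_choice(P_FULL, P_PAIR)
print(R_RECOVERED == R)
```

Exact rational arithmetic is used so that the round trip can be compared for equality rather than to within a tolerance.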
For experiments with four or more alternatives, however, the ranking probabilities generated by a random utility vector are no longer determined by its choice probabilities. Table 1 illustrates two different probability distributions over the possible rank orderings of four alternatives that yield identical systems of choice probabilities. Each of these ranking probability distributions is generated by a random utility vector, namely the vector $(U_1, U_2, U_3, U_4)$ having the joint distribution
$$P[U_1 = i^{-1}, U_2 = j^{-1}, U_3 = k^{-1}, U_4 = l^{-1}] = P[\rho(1) = i, \rho(2) = j, \rho(3) = k, \rho(4) = l],$$
where i, j, k, l is any permutation of 1, 2, 3, 4. Since the left and right columns in Table 1 assign different probabilities to every rank order they obviously correspond to different utility vectors, say $\mathbf{U}_4$ and $\mathbf{U}_4'$, but these vectors both generate the same system of choice probabilities (i.e., $p(i; S) = 1/|S|$). Consequently the two systems of rank order probabilities in Table 1 correspond to two random utility models $\{\mathbf{U}_4\}$ and $\{\mathbf{U}_4'\}$ that are not equivalent for ranking experiments with four alternatives but are equivalent for choice experiments with four alternatives. For RU models in general then, ranking equivalence implies choice equivalence, but not conversely, except in the special case of experiments with three alternatives. Therefore it is reasonable to ask whether there are GT models that are equivalent for complete choice experiments with $n \ge 4$ alternatives but not for the corresponding ranking experiments. The next section shows that there are none.
THEOREM 1. Two GT models are equivalent for ranking experiments with n alternatives iff they are equivalent for choice experiments with n alternatives.
Proof. "Only if" follows immediately from Lemma 2, since every GT model is a RU model.
where $\sum_{i=1}^{n} t_{\rho(i)} = 0$. Similarly the joint characteristic function of (6) becomes expression (8), and (7) and (8) must be equal because of (3). Thus the random vectors (5) and (6) have the same characteristic function, and consequently are identically distributed. This establishes (4) and completes the proof. ∎
In particular we see that two GT-V models are completely equivalent for ranking experiments with $n$ alternatives iff they are completely equivalent for choice experiments with $n$ alternatives, and so, for example, the models corresponding to the density functions $\operatorname{sinc}^2(x)(1 + \cos 2\pi n x)$ and $\operatorname{sinc}^2(x)(1 + \sin 2\pi n x)$ cannot be discriminated by either choice or ranking experiments with $n' \leq n$ alternatives.
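Theorem 1 is specific to GT models. For contrast, the following sketch builds a four-alternative pair of ranking distributions of the kind described for general RU models: they disagree on every rank order yet induce the same choice system $p(i; S) = 1/|S|$. The two distributions (uniform on even orders, uniform on odd orders) are a hypothetical construction chosen for convenience, not the entries of Table 1, and the helper names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, permutations

ITEMS = (1, 2, 3, 4)

def is_even(order):
    """True if the rank order, read as a permutation, has an even number of inversions."""
    inversions = sum(1 for a, b in combinations(range(len(order)), 2)
                     if order[a] > order[b])
    return inversions % 2 == 0

orders = list(permutations(ITEMS))  # each tuple lists alternatives from best to worst

# Hypothetical ranking distributions (NOT Table 1): uniform on the 12 even
# orders, and uniform on the 12 odd orders.
r_even = {o: Fraction(1, 12) if is_even(o) else Fraction(0) for o in orders}
r_odd = {o: Fraction(1, 12) - r_even[o] for o in orders}

def choice_system(r):
    """p(i; S) = probability that i is ranked ahead of every other member of S."""
    system = {}
    for size in (2, 3, 4):
        for S in combinations(ITEMS, size):
            for i in S:
                system[(i, S)] = sum(
                    (prob for order, prob in r.items()
                     if all(order.index(i) < order.index(j) for j in S if j != i)),
                    Fraction(0))
    return system

p_even, p_odd = choice_system(r_even), choice_system(r_odd)
```

Each distribution is generated by a random utility vector in the same way as in the text, by assigning utilities according to rank position. Comparing `p_even` with `p_odd`, and `r_even` with `r_odd`, shows the choice systems coincide while the ranking systems do not.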

Reversibility
In the Introduction (Section 1.2) the concept of reversibility was motivated in terms of Block and Marschak's impossibility theorem. It was pointed out that this theorem implicitly assumes that both the best-to-worst and worst-to-best ranking probabilities satisfy the double exponential GT-V model, and consequently reversibility could hold generally only if that model were equivalent (for three-alternative ranking experiments) to the GT-V model based on the "reverse" double exponential distribution, i.e., the distribution of $-X$ when $X$ itself is a double exponential random variable. Because of Theorem 1 in the last section we know that this ranking equivalence could only hold if these two models were also equivalent for three-alternative choice experiments, and consequently we can interpret the impossibility theorem in terms of the general equivalence properties of GT-V choice models, as outlined in Section 2.2. In particular, because the double exponential distribution has a nonvanishing characteristic function we know that a GT-V model $\mathcal{T}_G$ can be equivalent to the model $\mathcal{T}_F$ based on the double exponential distribution $F(x) = e^{-e^{-x}}$ iff $G$ is another distribution of the same type, i.e., $G(x) = F(ax + b)$ for some $a > 0$. Since the double exponential distribution is asymmetric, the reverse double exponential random variable $-X$ with distribution function $F^*(x) = 1 - e^{-e^{x}}$ cannot be of double exponential type, and so $\mathcal{T}_{F^*}$ cannot be equivalent to $\mathcal{T}_F$ for three-alternative choice experiments. Consequently one can say that the impossibility theorem is ultimately due to the shape of the double exponential distribution, i.e., its asymmetry and its "smoothness," as reflected in the fact that its characteristic function is nonvanishing.
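The non-equivalence of the double exponential model and its reverse can be illustrated with exact arithmetic. The sketch below assumes the standard closed form for independent double exponential utilities, $P[U_i > U_j > U_k] = \frac{v_i}{v_i + v_j + v_k} \cdot \frac{v_j}{v_j + v_k}$ with $v_i = e^{m_i}$ (the Luce form), which is not restated in this section. Every ranking system of that form satisfies $r(i,j,k)\,p(k,j) = r(i,k,j)\,p(j,k)$. The code checks that the reverse model violates this identity for one illustrative set of scale values; the function names and the scale values are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def forward_ranking(v, i, j, k):
    """P[U_i > U_j > U_k] for U = m + X, X double exponential, v = exp(m)."""
    return v[i] / (v[i] + v[j] + v[k]) * v[j] / (v[j] + v[k])

def reverse_ranking(w, i, j, k):
    """P[U_i > U_j > U_k] for U = m - X, with w = exp(-m).
    The event equals X_k - m_k > X_j - m_j > X_i - m_i, i.e. a
    forward-model ranking of (k, j, i) under the scale values w."""
    return forward_ranking(w, k, j, i)

def identity_gap(rank, s, i, j, k):
    """r(i,j,k) * p(k,j) - r(i,k,j) * p(j,k), where the binary probability
    p(j,k) is obtained by summing the rank orders that place j ahead of k."""
    r = lambda a, b, c: rank(s, a, b, c)
    p_jk = r(i, j, k) + r(j, i, k) + r(j, k, i)
    p_kj = r(i, k, j) + r(k, i, j) + r(k, j, i)
    return r(i, j, k) * p_kj - r(i, k, j) * p_jk

scales = {1: Fraction(1), 2: Fraction(2), 3: Fraction(5)}  # illustrative values
triples = list(permutations((1, 2, 3)))
forward_gaps = [identity_gap(forward_ranking, scales, *t) for t in triples]
reverse_gaps = [identity_gap(reverse_ranking, scales, *t) for t in triples]
```

Here `forward_gaps` is identically zero while every entry of `reverse_gaps` is nonzero, so this particular reverse system cannot be reproduced by the forward model under any choice of scale values. This is a single numerical instance consistent with the impossibility theorem, not a substitute for the characteristic function argument.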
This analysis then suggests the general problem: which GT models are reversible for $n$-alternative ranking experiments? To answer this question we first need a general definition of reversibility. Intuitively, reversibility means that every system of ranking probabilities $\{r(\rho; S) \mid S \subseteq I_n\}$ generated by $\mathcal{T}_{F_n}$ is matched by another system $\{r^*(\rho; S) \mid S \subseteq I_n\}$, also generated by $\mathcal{T}_{F_n}$, which assigns the same probabilities to the reverse rankings. That is, if $\rho$ is any permutation of the indices in $S$ and $\rho^*$ is the reversed permutation (say $S = \{1, 2, 4\}$, $\rho(1) = 3$, $\rho(2) = 2$, $\rho(4) = 1$; then $\rho^*(1) = 1$, $\rho^*(2) = 2$, $\rho^*(4) = 3$), then $r^*(\rho^*; S) = r(\rho; S)$.
To formalize this notion for GT models we will say that:
Suppose $\mathcal{T}_{F_n}$ and $\mathcal{T}_{F_n^*}$ are equivalent. Then for every system of ranking probabilities generated by $\mathcal{T}_{F_n}$ (using some set of scale values $m_1, m_2, \ldots, m_n$) there exists a corresponding set $m_1^*, m_2^*, \ldots, m_n^*$ by which $\mathcal{T}_{F_n^*}$ generates the same system, i.e., for every permutation $\rho$ of every $S \subseteq I_n$ (with indices written in the order given by $\rho$),
$P[m_1 + X_1 > \cdots > m_{|S|} + X_{|S|}] = P[-m_{|S|}^* + X_{|S|} > \cdots > -m_1^* + X_1]$,
so $\mathcal{T}_{F_n}$ is reversible. Conversely, suppose $\mathcal{T}_{F_n}$ is reversible. Then any system of ranking probabilities generated by $\mathcal{T}_{F_n}$ using some set of scale values $m_1, \ldots, m_n$ is also generated by $\mathcal{T}_{F_n^*}$ using scale values $-m_1^*, \ldots, -m_n^*$, where the $m_i^*$ are the scale values whose existence is guaranteed by the definition of reversibility, and so $\mathcal{T}_{F_n}$ and $\mathcal{T}_{F_n^*}$ are equivalent. ∎
Now we can give a general characterization of reversible GT models:
THEOREM 2. A GT model $\mathcal{T}_{F_n}$ is reversible iff the characteristic function $f_n$ of distribution $F_n$ satisfies the equation
$f_n(t_1, t_2, \ldots, t_n) = f_n(-t_1, -t_2, \ldots, -t_n)$ (10)
for all $t_i$ such that $\sum_{i=1}^{n} t_i = 0$.
Proof. Suppose (10) holds for all $t_i$ such that $\sum_{i=1}^{n} t_i = 0$. Then since $f_n(t_1, \ldots, t_n)$ is the characteristic function of $F_n$ and $f_n(-t_1, \ldots, -t_n)$ is the characteristic function of $F_n^*$, it follows from Strauss' theorem (Lemma 1, Section 2.3) that $\mathcal{T}_{F_n}$ and $\mathcal{T}_{F_n^*}$ are equivalent for choice, and consequently (by Theorem 1) they are equivalent for ranking. Consequently (by Lemma 2) $\mathcal{T}_{F_n}$ is reversible.
$\prod_{i=1}^{n} f(t_i) = \prod_{i=1}^{n} f(-t_i)$ (12)
Proof. Equation (10) reduces to (12) in this case.
From this we can conclude that GT-V models with nonvanishing characteristic functions are only reversible in the obvious way, i.e., when their utility distributions are symmetric:
COROLLARY 2. If $F$ has a nonvanishing characteristic function the GT-V model $\mathcal{T}_F$ is reversible for three or more alternatives iff $F$ is a symmetric distribution (i.e., for some centering constant $c$, $F(x + c) = 1 - F(-x + c)$).
Proof. Suppose $F$ is symmetric; i.e., for some $c$, $X - c$ and $-(X - c)$ are identically distributed. The characteristic function of the former is $e^{-ict} f(t)$, and of the latter, $e^{ict} f(-t)$, and so these two functions are equal for all $t$, i.e., $f(t) = e^{i2ct} f(-t)$.
As noted earlier in Section 2.2, Yellott (1977) shows that when $f$ is nonvanishing, this equation implies $f^*(t) = e^{ibt} f(t)$, so that here $f(-t) = e^{ibt} f(t)$. Consequently $-X$ and $X + b$ are identically distributed, and so for the centering constant $c = -b/2$ we have symmetry, i.e., $F(x - b/2) = 1 - F(-x - b/2)$. ∎
Corollary 2 shows that for GT-V models with nonvanishing characteristic functions, reversibility and symmetry are equivalent conditions. However, if the characteristic function of a distribution $F$ is allowed to have zeros, $\mathcal{T}_F$ can be reversible even though $F$ is asymmetric. This can happen if $\mathcal{T}_F$ is equivalent to a symmetric model, as in the following example.
COROLLARY 3. The GT-V model corresponding to the asymmetric probability density function $p(x) = \operatorname{sinc}^2(x)(1 + \sin 2\pi n x)$ is reversible for $n' \leq n$ alternatives, but not for $n' > n$.
Consequently in this case $|f(t_1) \cdots f(t_{n+1})| = |f(-t_1) \cdots f(-t_{n+1})| > 0$, but the two products are unequal, since one side has the sign of $-i$, the other the sign of $+i$. A similar argument works for any $n' > n + 1$ and consequently (12) cannot hold in these cases, so the GT-V model based on density $p$ is not reversible for more than $n$ alternatives. ∎
Finally, we note that although GT-V models based on asymmetric distributions can sometimes be reversible for a finite number of alternatives (as in the last corollary), they can never be reversible for arbitrarily many: There is always some number beyond which reversibility fails. Consequently if one wants to insist on reversibility for all possible ranking experiments, only symmetric models will do.
COROLLARY 4. A GT-V model $\mathcal{T}_F$ is reversible for $n$ alternatives for every $n$ iff $F$ is a symmetric distribution.

Proof. Corollary 2 shows that if $F$ is symmetric, $\mathcal{T}_F$ is reversible for every $n$. Conversely if $\mathcal{T}_F$ is reversible for every $n$ then (12) holds for all $n$. Yellott (1977, Theorem 3) shows that this implies $f(t) = e^{ibt} f(-t)$, i.e., $F$ is symmetric with centering constant $b/2$. ∎
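The behavior described in Corollaries 3 and 4 can be probed numerically. The sketch below rests on several assumptions that are not spelled out in this section: the density is read as $p(x) = \operatorname{sinc}^2(x)(1 + \sin 2\pi n x)$ with $\operatorname{sinc}(x) = \sin(\pi x)/(\pi x)$, so that $\operatorname{sinc}^2$ has the triangular characteristic function $\max(0, 1 - |t|/2\pi)$; and the reversibility condition for a GT-V model is taken to be $\prod_i f(t_i) = \prod_i f(-t_i)$ whenever $\sum_i t_i = 0$. The function names are hypothetical.

```python
import math
import random

def tri(t):
    """Assumed characteristic function of sinc^2(x), sinc(x) = sin(pi x)/(pi x)."""
    return max(0.0, 1.0 - abs(t) / (2.0 * math.pi))

def f(t, n):
    """Characteristic function of p(x) = sinc^2(x) * (1 + sin(2 pi n x))."""
    shift = 2.0 * math.pi * n
    return tri(t) + 0.5j * (tri(t - shift) - tri(t + shift))

def gap(ts, n):
    """|prod f(t_i) - prod f(-t_i)| for arguments ts that sum to zero."""
    lhs, rhs = 1.0 + 0.0j, 1.0 + 0.0j
    for t in ts:
        lhs *= f(t, n)
        rhs *= f(-t, n)
    return abs(lhs - rhs)

random.seed(0)
n = 3
worst = 0.0
# Random arguments on the hyperplane sum(ts) = 0, for n' = 2 and n' = 3 <= n.
for n_alt in (2, 3):
    for _ in range(20000):
        ts = [random.uniform(-8 * math.pi, 8 * math.pi) for _ in range(n_alt - 1)]
        ts.append(-sum(ts))
        worst = max(worst, gap(ts, n))

# One point with n' = 4 > n arguments summing to zero where the products differ.
ts_fail = [4.5 * math.pi, -1.5 * math.pi, -1.5 * math.pi, -1.5 * math.pi]
fail_gap = gap(ts_fail, n)
```

With $n = 3$, `worst` stays at zero over the sampled points for two and three alternatives, while `fail_gap` is strictly positive for four. Random sampling only fails to find a violation for $n' \leq n$; it does not prove that none exists.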