Galton's Problem as network autocorrelation

Ethnologists have long sought to establish and validate causal models of sociocultural phenomena by the use of correlational evidence across broad samples of societies. It is well known that correlation does not establish causation. However, evidence for causality is strengthened if a postulated correlation replicates in many different contexts, thus radically reducing the likelihood of hidden third factors, spurious correlations, or correlations due to chance composition of a sample. of condition that the sample be Nonindependence can occur in a variety of ways, and its effects can be quite severe. Even though a correlation exists between two variables, its estimate via a nonindependent sample can be grossly inflated or reduced. In cross-cultural research, independence of sample is almost impossible to achieve, a fact that has made cross-cultural research open to a good deal of valid criticism. The eminent statistician Sir Francis Galton first pointed out the problem in his remarks following paper. which ignore interdependence. The results of comparisons based on simulated autocorrelation data and the reanalyses of two previously published empirical studies indicate that both of the procedures proposed here compare very favorably with the maximum likelihood approach, and both are vastly superior to the usual regression procedures when there is moderate to high autocorrelation (i.e., interdependence). [Galton's Problem, cultural diffusion, cultural evolution, statistical methodology]

from a common evolutionary stock and from the diffusion of cultural traits among societies. Societies in neighboring or historically related regions tend to be duplicates of one another in terms of a wide variety of traits that are spread by historical fission, diffusion, or migration of peoples. The result is that neither the actual number of "independent" cases nor the effect of the interdependencies on trait correlations is generally known for any cross-cultural sample.
Galton's Problem applies with equal force to attempts to establish and validate multivariate causal models. Regression models are commonly used for this purpose. Regression coefficients may be under- or overestimated, depending on the extent and structure of nonindependent cases. Here the problem is known as autocorrelation, where variables or error terms in the regression equation are correlated with the scores of related cases on the same variables or error terms. In technical terms, the presence of autocorrelation produces inefficient estimates of the regression coefficients: estimates will vary widely from the true coefficient. This problem is compounded by the fact that the variances of these coefficient estimates are systematically underestimated. As a result, the following may occur: (1) Where little or no correlation exists between two variables, presence of autocorrelation in the sample may result in an estimate of the regression coefficient not only substantially different from zero but also with an underestimate of its variance. This may lead the researcher to conclude that the correlation is significantly different from zero. (2) Where the relationship between two variables is indeed substantial and replicable across samples, the estimate of the same regression coefficient via two different autocorrelated samples may result in large differences between the two estimates. When this is coupled with an underestimated variance, a researcher may be led to conclude that the estimates are significantly different and that the relationship does not replicate.
Autocorrelation was first formulated as a problem with time-series data, for which events at a given point in time are generally not independent of the events at points just preceding in time. For cross-sectional data the comparable formulation was in terms of spatial autocorrelation, for which events at a given point in space are not independent of events at points nearby. In the past few years, several cross-cultural researchers have discussed the relevance of the spatial autocorrelation model to Galton's Problem (Loftin 1972; Naroll 1976; Pryor 1976; Simonton 1975; Wirsing 1974). That formulation requires measurement of interdependencies among societies in terms of spatial distance. The present authors (Dow, White, and Burton 1983; White, Burton, and Dow 1981) have generalized the spatial autocorrelation model to incorporate any kind of historical or diffusional relationship among societies. This network autocorrelation model can use the statistical solutions to the spatial autocorrelation model; only the measures of relationships among societies will differ.
Valid solutions to the spatial autocorrelation model have been developed only recently (Ord 1975; Cliff and Ord 1981; Doreian 1980, 1981; Bartels and Ketellapper 1979). In a previous paper (Dow, White, and Burton 1983) we review four models in which autocorrelation enters into the regression framework and suggest that one of them, the disturbances model, appears to be well suited for cross-cultural research. In this model autocorrelation appears in the error terms of the regression equation. This assumes that there are patches of related cases which fit the model better than expected by chance, other patches for which scores on the dependent variable are systematically overestimated, and remaining patches for which the dependent variable is underestimated. Such a model is consistent with the case in which the variables under study tend to travel (spread, diffuse) as a packet.
The great advantage of this autocorrelated spatial disturbances model is that regression coefficients can be properly (efficiently) estimated even with the presence of autocorrelation, by the use of maximum likelihood (ML) procedures. Mathematical, empirical, and simulation work (Dow, Burton, and White 1982) have all shown that these ML autocorrelation procedures yield markedly different and more accurate results, in the presence of autocorrelation, than ordinary least squares (OLS) regression, which assumes the absence of autocorrelation. The advantages of the ML procedure over OLS are of a major order even in small samples (Dow, Burton, and White 1982).
Autocorrelation solutions to Galton's Problem in the regression context depend on an explicit specification of the strength of relatedness between each pair of cases in the sample. A restricted and incomplete approach to such specifications, using one-dimensional arrays or "diffusion arcs," was introduced to the cross-cultural literature by Naroll (1961, 1973), Loftin (1972), and Simonton (1975). A more powerful two-dimensional or "diffusion proximity" specification was introduced by Wirsing (1974) and Pryor (1976). The weaknesses of the Loftin and Simonton arguments as autocorrelation solutions to Galton's Problem consist of underestimating the amount of diffusion by assuming that diffusion (1) operates only in one direction and (2) is never reciprocal. The weaknesses of the Wirsing and Pryor arguments are not in the specification of the interdependency matrix but in how this is incorporated into the estimation procedures for the regression model. Our network disturbances model builds in the proper estimation procedures.
While the network disturbances model has great advantages for cross-cultural and causal-modeling research, its chief disadvantage to date has been the computational effort required by the ML estimation procedure. This is true in spite of Ord's (1975) useful simplification of the estimating routine. The procedure is formidable even with small samples, more so with large ones. And, unless the structure of the interdependencies in large samples assumes some particularly simple forms (Ord 1975:125), the ML computational routine cannot be employed.
The major purpose of this paper, then, is to examine two estimation procedures for the network autocorrelated disturbances model, both of which are considerably simpler computationally than the ML procedure. Although neither procedure generates estimates which are preferable to ML estimates on purely formal criteria, we compare the empirical performance of each procedure to the ML approach and to the usual OLS regression procedures using simulated autocorrelation data. The results of this simulation indicate that both methods compare very favorably with the ML procedure in terms of bias and efficiency of the regression coefficient estimates. Both methods also offer major improvements over the usual OLS regression when moderate to high autocorrelation is present. Also, both methods perform extremely well in reanalyses of two previously published autocorrelation examples.
In the following section we review briefly the matrix formalization of the network disturbances model and outline the three autocorrelation estimation methods. We then present and discuss the results of the simulation. In the subsequent section we suggest significance testing procedures which can be used with these alternative routines. Finally, we present the results of our reexamination of two previously reported (White et al. 1981; Pryor 1976) empirical examples using all three autocorrelation methods and OLS.

autocorrelated disturbances regression model
Although Galton's Problem was first raised in connection with the spatial and historical patterning of cultural traits, it is in a statistical sense actually a problem of model specification. A model is correctly specified when the equations in the model adequately correspond to the processes generating the observed variation in the data. Model specification is a matter of degree: the better the theoretical understanding of the underlying processes, the closer the correspondence between the postulated equations and the empirical data, and the more precise the estimates of the model parameters.
From a cross-cultural perspective, cultural diffusion is one of the theoretically important underlying processes affecting the degree of variation in comparative data. The formal expression of this process in equational form is thus a necessary part of any correctly specified model used to analyze cross-cultural data. Before presenting the network autocorrelated disturbances model in equational form, however, we first examine the representation of cultural diffusion as a matrix of relationships among the sample units.
An important step in Pryor's (1976) attempt at a matrix formulation of Galton's Problem is the calculation of a "diffusion possibility index" between every pair of societies in a sample. Any factor thought to influence the probability of diffusion, such as distance and language similarity, can be used to calculate a "DP index" for each pair. As a purely hypothetical example, Pryor calculates the DP indices between pairs using the formula DP = 5 + 3L - 4D, where L is a 0,1 variable indicating language similarity (1) or not (0), and D is geographical distance measured in thousands of miles. Pryor computes a DP index for every pair drawn from six hypothetical societies (T, V, W, X, Y, Z) and assembles the results into a "sample diffusion matrix," shown as the S matrix in Table 1.
Societies T and V in this S matrix would be most alike (DP = 6), while T and Y have little similarity (DP = 1), as measured by Pryor's hypothetical DP equation. It is entirely possible, of course, to calculate relationships using measures based only on distance or only on language similarity, rather than attempting to combine them in some ad hoc fashion. No matter which measures of diffusion or similarity are preferred, however, the interdependencies among sample units are still formalized as an S matrix.
The network autocorrelation disturbances model assumes that the N sample units are somehow differentially interrelated and that the interrelations can be operationalized as an N x N "relational" matrix W. This matrix differs from Pryor's diffusion possibility matrix only in that each element is divided by its row sum; thus, the rows are rescaled to sum to 1. The W matrix in Table 1 is the diffusion possibility matrix S appropriately row-normalized to unity. Since a sample unit is not assumed to influence itself, the main diagonal of this matrix has all zeros. The entries in this matrix, $w_{ij}$, indicate the interaction probabilities between unit i and unit j. Note that now interactions may be reciprocal, but they need not be symmetric (i.e., $w_{ij}$ does not necessarily equal $w_{ji}$).
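The construction of S and its row-normalized counterpart W can be sketched in a few lines. This is only an illustration: it reads Pryor's hypothetical formula as DP = 5 + 3L - 4D, and the three societies, their language codes, and their distances are invented here rather than taken from Table 1.

```python
import numpy as np

def dp_index(same_language, distance_kmiles):
    """Pryor's hypothetical index, read here as DP = 5 + 3L - 4D."""
    return 5 + 3 * same_language - 4 * distance_kmiles

def row_normalize(S):
    """Zero the diagonal of S and rescale each row to sum to 1."""
    W = np.array(S, dtype=float)
    np.fill_diagonal(W, 0.0)              # a unit does not influence itself
    row_sums = W.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every unit needs at least one nonzero tie")
    return W / row_sums

# Invented three-society example: L = shared language (0/1),
# D = distance in thousands of miles.
L = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 0]])
D = np.array([[0.0, 0.5, 1.0],
              [0.5, 0.0, 0.75],
              [1.0, 0.75, 0.0]])
S = dp_index(L, D)       # symmetric "sample diffusion matrix"
W = row_normalize(S)     # relational matrix; rows sum to 1
```

Although S is symmetric, W generally is not, because each row is divided by a different row sum.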
This model is a generalization of the usual OLS (ordinary least squares) multiple regression model, which can be compactly stated in matrix form as

$$Y = X\beta + \epsilon \qquad (1)$$

where Y is an N x 1 column vector of observations on the dependent variable; X is an N x (K + 1) matrix of K independent variables plus an initial column of ones; $\beta$ is a (K + 1) x 1 column vector of regression coefficients; and $\epsilon$ is an N x 1 column vector of (independent) multivariate normal errors.
When the error term assumptions of this model are met, in particular the independence assumption, the ML estimation procedure and the usual OLS procedure yield equivalent results. That is, estimates of the regression coefficients $\beta$ and their standard errors are identical. However, given an interdependent set of observations, such as frequently occur in continuous area cross-cultural studies or time series studies, the error terms may not be independent of one another, and the above OLS model is incorrectly specified in this case.
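For reference, the OLS computations that the later procedures reuse can be written directly from the matrix form. The sketch below assumes X already contains the column of ones; the function name `ols` and the simulated data are illustrative only.

```python
import numpy as np

def ols(X, Y):
    """OLS for Y = X beta + eps; X must already include the column of ones."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ Y                   # coefficient estimates
    resid = Y - X @ b
    s2 = resid @ resid / (n - k)            # unbiased error variance
    se = np.sqrt(s2 * np.diag(XtX_inv))     # standard errors of b
    return b, se, resid

# Illustrative data with independent errors (so OLS assumptions hold).
rng = np.random.default_rng(0)
x = rng.normal(size=50)
X = np.column_stack([np.ones(50), x])
Y = 2.0 + 1.0 * x + rng.normal(scale=0.5, size=50)
b, se, resid = ols(X, Y)
```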
Given an interdependent set of observations and an appropriate W relational matrix, the OLS model can be respecified as a network autocorrelated disturbances model:

$$Y = X\beta + \epsilon \qquad (2)$$

$$\epsilon = \rho W\epsilon + \nu \qquad (3)$$

This model states that the disturbances $\epsilon$ from the previous OLS regression model are not independent. Rather, each disturbance is a weighted average of the disturbances at related units, the weights being the nonzero $w_{ij}$ in the appropriate row of W, times a scalar $\rho$ which is analogous to a correlation coefficient, plus an N x 1 column vector of random components $\nu$. Rho ($\rho$) expresses the extent of the error dependencies (i.e., the overall autocorrelation in the system of variables under study). Note that when $\rho = 0$ (i.e., there is no autocorrelation present with respect to W), this model is identical to the OLS model in equation 1. When W has all zero elements, corresponding to complete independence of each sample unit from all the others, this model is again identical to equation 1. Thus, the network disturbances model is a true generalization of the usual OLS regression model.

Some insight into the nature of the network disturbances model can be obtained by algebraically combining equations 2 and 3. Solving for $\epsilon$ in terms of $\nu$ in equation 3 and substituting into equation 2 we get

$$Y = X\beta + (I - \rho W)^{-1}\nu$$

Premultiplying this equation by $(I - \rho W)$ gives

$$(I - \rho W)Y = (I - \rho W)X\beta + \nu$$

This latter equation is more simply written as

$$Y^{*} = X^{*}\beta + \nu \qquad (10)$$

where

$$Y^{*} = (I - \rho W)Y, \qquad X^{*} = (I - \rho W)X \qquad (11)$$

Note that after this common linear transformation of all variables the error terms in equation 10 now meet OLS assumptions; hence, unbiased and relatively efficient estimates of the regression coefficients $\beta$ and their standard errors can be obtained using the usual OLS procedures on the transformed data.
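A small numerical check may make the transformation in equations 10 and 11 concrete. The sketch below assumes $\rho$ is known; the random relational matrix, sample size, and coefficient values are invented for illustration.

```python
import numpy as np

def transform(rho, W, Y, X):
    """Apply the common linear transformation (I - rho W) to Y and X."""
    A = np.eye(len(Y)) - rho * W
    return A @ Y, A @ X

rng = np.random.default_rng(1)
n, rho = 30, 0.6

# Hypothetical relational matrix: random nonnegative ties, row-normalized.
W = rng.random((n, n))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)

X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([0.0, 1.0])
nu = rng.normal(size=n)                           # independent components
eps = np.linalg.solve(np.eye(n) - rho * W, nu)    # eps = (I - rho W)^-1 nu
Y = X @ beta + eps

Y_star, X_star = transform(rho, W, Y, X)
# With the true rho, the transformed errors are exactly the independent nu.
recovered_nu = Y_star - X_star @ beta
b_star, *_ = np.linalg.lstsq(X_star, Y_star, rcond=None)   # OLS on transformed data
```

In practice $\rho$ is unknown, which is why an estimation procedure for it is needed before the transformation can be applied.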
Within the regression framework, then, the network autocorrelation problem is twofold: construction of a plausible relational matrix W and estimation of the autocorrelation parameter $\rho$. Selection of the weights $w_{ij}$ for the W matrix is of major importance, since spurious results may be obtained if the hypothesized matrix does not correspond to any real process (Cliff and Ord 1973). In geographical research, interaction effects are often estimated according to notions of space-friction constraints on the possibility of effects from one unit to another. The simplest function in this case is an exponentially decaying distance function such as $D_{ij}^{-\alpha}$, where $D_{ij}$ is the distance from location i to location j and $\alpha$ is a suitable exponent chosen a priori. Cliff and Ord (1973) compute weights based on both distance and proportion of boundary in common between the 26 counties of Eire. Gatrell (1979) has constructed a measure of interaction among Swedish towns based on distance and number of telephones. Bodson and Peeters (1975) employ distance and minimum public transportation time when constructing an "accessibility function" to generate the interaction weights among 44 Belgian arrondissements.
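As an illustration of a distance-based specification, the following sketch builds row-normalized weights proportional to $D_{ij}^{-\alpha}$. The coordinates and the choice $\alpha = 2$ are hypothetical; the function name `distance_weights` is not from any cited source.

```python
import numpy as np

def distance_weights(D, alpha=1.0):
    """Row-normalized weights proportional to D_ij ** (-alpha), zero diagonal."""
    D = np.asarray(D, dtype=float)
    off_diag = ~np.eye(D.shape[0], dtype=bool)
    if np.any(D[off_diag] <= 0):
        raise ValueError("off-diagonal distances must be positive")
    W = np.zeros_like(D)
    W[off_diag] = D[off_diag] ** (-alpha)   # nearer units get larger weights
    return W / W.sum(axis=1, keepdims=True)

# Hypothetical coordinates for four units (invented for illustration).
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
D = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
W = distance_weights(D, alpha=2.0)
```

A larger exponent concentrates each row's weight on the nearest neighbors, so the choice of $\alpha$ is itself a substantive hypothesis about the reach of diffusion.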
Clearly, the form of the weighting matrix will depend on the problem at hand and on the available data. In cross-cultural research, many possibilities exist for computing weights. Apart from the spatial functions which are obviously applicable to a diffusion process, other pertinent network effects which have commonly been hypothesized are language similarity, whether or not two societies belong to the same state, and trading relationships.
A measure of language similarity has been proposed and employed in previous cross-cultural examples (White et al. 1981; Dow in press). Language similarity and geographical distance measures are used individually to construct the two W matrices employed in the analyses reported below.
If there is autocorrelation, the network disturbances model has desirable statistical properties that the OLS model lacks. First, although the OLS estimates of regression coefficients are unbiased, they are more highly dispersed around the true population parameter than are the ML estimates. Second, the OLS estimates of the sampling variances of the regression coefficients will generally underestimate the true variances. Hence, with respect to estimates based on a single sample, underestimating the variances of the regression coefficients will lead to spurious attributions of significance to particular independent variables. By contrast, in replication studies across several interdependent samples, the investigator would tend to conclude that a valid model fails to replicate because of large differences in estimates of the same parameter due to their unreliability (i.e., wide dispersion around the true parameter). Thus, both single and multiple sample replication studies are biased toward finding differences where none exist (Type I error) if OLS estimation is used with interdependent samples (Dow, Burton, and White 1982). This is a crucial issue in cross-cultural research whereby functional relationships are expected, if valid, to replicate across the major geographical/cultural regions of the world.

three estimation procedures

maximum likelihood (ML) In the usual application of the regression model, estimates of the coefficients and their standard errors can be obtained by the method of moments. That is, means, variances, and covariances of all variables are calculated and can then be used to find estimates of the population parameters.
An alternative procedure which does not employ sample moments is maximum likelihood (ML) estimation. The basic idea behind this procedure is to try to locate estimates which are "most likely" to have generated the observed sample values. Details on the estimation theory for the ML network disturbances model and other ML autocorrelation models are given in Ord (1975), Doreian (1980), and Cliff and Ord (1981). Since we shall compare two simpler solutions (IGLS and IRR; see below) to the ML solution, we present here an outline of the ML solution.
The basic problem is to locate estimates $\hat\beta$, $\hat\rho$, and $\hat\sigma^2$ for the network disturbances model, plus the standard errors of these estimates. The major computational problem is to find the ML estimate of $\rho$ which permits the simple variable transformations in equation 11. The appropriate ML estimate is the value $\hat\rho$ which minimizes expression 12 (Doreian 1980).
For the results reported below, expression 12 was minimized by direct search of the interval (-1, 1). As the sample N increases, repeated evaluation of expression 12 becomes computationally burdensome. Note that the summation term requires the N eigenvalues of the W matrix. As mentioned above, unless W assumes some particularly simple form, its eigenvalues are virtually impossible to compute for a sample of N over about 90. With $\hat\rho$ obtained from the above procedures, $\hat\beta$, $\hat\sigma^2$, and the standard errors of the estimates can then be obtained.
iterative generalized least squares (IGLS) The iterative generalized least squares procedure is based in part on suggestions made by Ord (1975) and Bodson and Peeters (1975).
For the ML approach, the criterion used to decide on the best estimate $\hat\rho$ was that the estimate minimize equation 12, which is equivalent to maximizing the total log-likelihood function. In the IGLS procedure, we sidestep the difficulties associated with finding the eigenvalues of W and the repeated evaluation of equation 12. We iteratively search the interval $-1 < \rho < 1$ as before; now, however, we (1) insert the search values of $\rho$ directly into equation 11 and perform the variable transformations; (2) apply OLS to the resulting equation 10 and obtain estimated residuals; (3) retain as the "best" estimate of $\rho$ the $\hat\rho$ which results in the minimum sum of squares of the residuals from equation 10; (4) retain as our estimates of $\beta_i$ the $b_i$'s corresponding to this $\hat\rho$; (5) retain the associated standard errors of the $b_i$.
As with the above direct-search ML estimation, it is possible that the search procedure may locate a local rather than a global minimum within the search interval. A fairly fine-grained search would make this unlikely, however. The simulation results reported below clearly suggest that this procedure will estimate a global minimum. It is also possible that no unique minimum sum of squares of errors may be found. This possibility did not occur in any of the 200 simulation direct searches reported below; nor did it arise with any of the empirical examples.
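The two direct searches can be sketched side by side. The IGLS branch follows the steps listed above. The ML branch is an assumption: expression 12 is not reproduced in this text, so the code uses a commonly cited concentrated-likelihood criterion, $\ln(\mathrm{SSE}/N) - (2/N)\sum_i \ln(1 - \rho\lambda_i)$, where $\lambda_i$ are the eigenvalues of W. The grid, the ring-shaped W, and all data are hypothetical, and standard errors are omitted for brevity.

```python
import numpy as np

def transformed_sse(rho, W, Y, X):
    """OLS on the (I - rho W)-transformed data; returns (SSE, coefficients)."""
    A = np.eye(len(Y)) - rho * W
    Ys, Xs = A @ Y, A @ X
    b, *_ = np.linalg.lstsq(Xs, Ys, rcond=None)
    e = Ys - Xs @ b
    return e @ e, b

def direct_search(W, Y, X, method="igls", grid=None):
    """Grid search for rho in (-1, 1).

    method="igls": minimize the transformed residual sum of squares.
    method="ml":   minimize ln(SSE/N) - (2/N) * sum(ln(1 - rho * lambda_i)),
                   an ASSUMED stand-in for the paper's expression 12.
    """
    n = len(Y)
    grid = np.linspace(-0.99, 0.99, 199) if grid is None else grid
    lam = np.linalg.eigvals(W) if method == "ml" else None
    best = None
    for rho in grid:
        sse, b = transformed_sse(rho, W, Y, X)
        if method == "ml":
            log_jac = np.sum(np.log(1.0 - rho * lam)).real
            crit = np.log(sse / n) - 2.0 * log_jac / n
        else:
            crit = sse
        if best is None or crit < best[0]:
            best = (crit, rho, b)
    return best[1], best[2]

# Hypothetical demo: 35 units on a ring, each tied to its two nearest
# neighbours on either side; zero intercept and true slope of 1.
n, rho_true = 35, 0.8
idx = np.arange(n)
gap = np.abs(idx[:, None] - idx[None, :])
gap = np.minimum(gap, n - gap)
W = ((gap >= 1) & (gap <= 2)).astype(float)
W /= W.sum(axis=1, keepdims=True)
rng = np.random.default_rng(2)
X = rng.normal(size=(n, 1))
eps = np.linalg.solve(np.eye(n) - rho_true * W, rng.normal(size=n))
Y = X[:, 0] + eps

rho_igls, b_igls = direct_search(W, Y, X, "igls")
rho_ml, b_ml = direct_search(W, Y, X, "ml")
```

Only the ML branch needs the eigenvalues of W, which is the computational burden the IGLS search avoids.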
iterative residual regression (IRR) The iterative residual regression procedure originated in Durbin's (1960) approach to time series analysis and is discussed in Bodson and Peeters (1975) for the spatial autocorrelation case. For this procedure the steps are: (1) estimate $Y = X\beta + \epsilon$ using OLS and obtain the residuals $e = Y - Xb$; (2) estimate $\rho$ by regressing $e$ on $We$ using OLS (i.e., on $e = \rho We + \nu$); (3) use this $\hat\rho$ to transform variables as in equation 11; (4) estimate equation 10 using OLS and obtain a new residual vector $e$; (5) use these residuals to obtain another estimate $\hat\rho$ as in step 2; (6) terminate the procedure when successive estimates of $\hat\rho$ are within some prespecified absolute value; (7) retain the last value of $\hat\rho$ and associated $b_i$ and their standard errors.
Whether this estimation procedure converges to some minimum is unknown for multilateral interdependencies. For the 200 simulations of this model, the maximum limit we set of 10 iterations was never reached. For the simulation data analyzed here, convergence was very rapid, averaging less than four iterations. Again, the close similarity of the results of this method to the results of the other two procedures suggests that it does indeed converge to a global minimum.
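The IRR steps can be sketched as a short loop. Two choices here are assumptions rather than details given in the text: the residuals in step 4 are recomputed on the original scale as $Y - Xb$ with the updated $b$, and the convergence tolerance of 1e-4 is arbitrary. The ring-shaped W and the data are hypothetical, and standard errors are omitted.

```python
import numpy as np

def ols_fit(X, Y):
    """OLS coefficients via least squares."""
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return b

def irr(W, Y, X, tol=1e-4, max_iter=10):
    """Iterative residual regression for Y = X beta + eps, eps = rho W eps + nu."""
    b = ols_fit(X, Y)                       # step 1: plain OLS
    rho_old = None
    for iteration in range(1, max_iter + 1):
        e = Y - X @ b                       # residuals on the original scale (assumption)
        We = W @ e
        rho = (We @ e) / (We @ We)          # steps 2 and 5: regress e on We, no intercept
        A = np.eye(len(Y)) - rho * W        # step 3: transformation of equation 11
        b = ols_fit(A @ X, A @ Y)           # step 4: OLS on equation 10
        if rho_old is not None and abs(rho - rho_old) < tol:
            break                           # step 6: successive estimates agree
        rho_old = rho
    return rho, b, iteration                # step 7 (standard errors omitted)

# Hypothetical demo: 35 units on a ring, two neighbours on either side.
n, rho_true = 35, 0.8
idx = np.arange(n)
gap = np.abs(idx[:, None] - idx[None, :])
gap = np.minimum(gap, n - gap)
W = ((gap >= 1) & (gap <= 2)).astype(float)
W /= W.sum(axis=1, keepdims=True)
rng = np.random.default_rng(3)
X = rng.normal(size=(n, 1))
eps = np.linalg.solve(np.eye(n) - rho_true * W, rng.normal(size=n))
Y = X[:, 0] + eps

rho_hat, b_hat, n_iter = irr(W, Y, X)
```

The loop caps the number of passes at `max_iter`, mirroring the limit of 10 iterations used in the simulations.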

generation of data and relational matrix W
The procedures employed here to generate the simulation data are similar to those reported in Dow, Burton, and White (1982). Since our main interest is in the relative performance of each of the three procedures with respect to the bias and efficiency of the $\beta$ estimate, the intercept was set equal to zero. First, we generated a random column vector $\delta$ with mean and variance as in expression 18. An autocorrelated independent X variable was then obtained using

$$X = (I - \xi W)^{-1}\delta \qquad (19)$$

The extent and direction of OLS estimation error is a function of the degree of autocorrelation of the independent variable (i.e., $\xi$) and of the errors (i.e., $\rho$); therefore, we set $\xi$ = .4.
For each independent X vector we generated an autocorrelated error vector by drawing a random vector $\nu$ and transforming it as before:

$$\epsilon = (I - \rho W)^{-1}\nu \qquad (20)$$

Given the X and $\epsilon$ vectors, we then constructed the dependent Y variable using equation 15. That is, we simply add the X and the $\epsilon$ vectors. These steps were repeated 50 times for values of $\rho$ = .23, .43, .63, .83 and a fixed sample size of 35. Only positive values of $\xi$ and $\rho$ were used, since negative autocorrelation is less interesting theoretically and less likely empirically. Also, the expense of computing using the ML program precluded examination of all possible autocorrelation values.
The 35 x 35 relational matrix W used in generating the data via equations 19 and 20 was obtained from the language family relationships of 35 Sub-Saharan African societies from the Standard Cross-Cultural Sample (Murdock and White 1969). Briefly, linguistic similarity is measured on the genetic tree of languages as an inverse function of the number of nodes along the path between the two languages. Further details on the construction of this and other cross-cultural relational matrices are provided in White et al. (1981).
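The data-generating scheme can be sketched as follows. Several elements are assumptions: the random vectors are drawn as standard normal (the mean and variance of expression 18 are not reproduced here), the autocorrelation parameter of X is written `xi`, and a ring-shaped W stands in for the language-based matrix, which is not available in this text.

```python
import numpy as np

def autocorrelated(W, coef, rng):
    """Draw a standard normal vector and transform it by (I - coef W)^-1."""
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - coef * W, rng.normal(size=n))

def simulate(W, rho, xi=0.4, beta=1.0, reps=50, seed=0):
    """Return `reps` (X, Y) pairs with Y = beta * X + eps and zero intercept."""
    rng = np.random.default_rng(seed)
    data = []
    for _ in range(reps):
        X = autocorrelated(W, xi, rng)       # autocorrelated regressor
        eps = autocorrelated(W, rho, rng)    # autocorrelated disturbance
        data.append((X, beta * X + eps))
    return data

# Hypothetical stand-in for the 35 x 35 relational matrix.
n = 35
idx = np.arange(n)
gap = np.abs(idx[:, None] - idx[None, :])
gap = np.minimum(gap, n - gap)
W = ((gap >= 1) & (gap <= 2)).astype(float)
W /= W.sum(axis=1, keepdims=True)

# 50 replications at each of the four autocorrelation levels.
datasets = {rho: simulate(W, rho) for rho in (0.23, 0.43, 0.63, 0.83)}
```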

simulation results
Our previous study of the ML disturbances model (Dow, Burton, and White 1982) showed that there were no significant gains over OLS for low levels of error autocorrelation (i.e., $\rho$ < .5). Hence, we did not employ our ML program on the simulated data where $\rho$ = .23 or .43.
We compare the performance of the three autocorrelation procedures to each other and to OLS with respect to bias and efficiency of the regression coefficient b, and then compare the three autocorrelation procedures with respect to bias in the estimate of the autocorrelation parameter $\rho$.
bias in the $\beta$ estimates: $E(b - \beta)$ Table 2 gives the average b over 50 replications at each level of $\rho$. Since the true $\beta$ = 1, the average bias introduced by each procedure corresponds to the values reported in Table 2, minus 1. Clearly, there is very little bias using any of the autocorrelation methods, as we expected from the previous discussion. For autocorrelation at .43 and above, OLS displays the highest bias.
relative efficiency of $\beta$ estimates As mentioned previously, the problem with OLS analyses of autocorrelated data is that the regression coefficient estimates, though unbiased, are inefficient. That is, the variance of the estimates is large relative to that obtainable through autocorrelation procedures. It is usual to compare estimators using a measure which combines both bias and efficiency, the mean square error (MSE), since the overall performance of an estimator depends on both quantities. Using expectation notation, MSE is defined as

$$E(b - \beta)^2 = (\text{bias})^2 + \text{variance} \qquad (21)$$
The MSE is the expected value (i.e., average) of the squared differences of the estimates from the true population parameter $\beta$. Table 3 reports the ratios of MSE's of the $\beta$ estimates for all procedures. If the MSE ratio equals 1, then the procedures being compared are equally efficient; if the ratio is less than 1, then the procedure used to calculate the numerator is more efficient (smaller MSE) than the procedure used to calculate the denominator; and if the ratio is more than 1, the opposite is true. The first two columns of Table 3 indicate that at $\rho$ = .23, OLS is more efficient than either IGLS or IRR, and that there is little to distinguish between OLS and either of these two procedures at $\rho$ = .43. However, at the higher levels of $\rho$ both IGLS and IRR show huge gains in efficiency over OLS, anywhere from about 100 to 700 percent. The third column indicates that there is virtually no difference in the performance of IGLS or IRR at these higher levels of $\rho$. The last two columns also indicate, rather surprisingly, that for the data analyzed here, at least, there is little to distinguish between IGLS and IRR and the more formally correct ML procedure. The former two are marginally more efficient at moderately high levels of autocorrelation ($\rho$ = .63), while ML is marginally more efficient at high autocorrelation levels ($\rho$ = .83).

bias in $\rho$ estimates

Table 4 shows that at low levels of autocorrelation ($\rho$ = .23) neither IGLS nor IRR yields satisfactory estimates of $\rho$. At the next level of autocorrelation ($\rho$ = .43) both procedures tend to underestimate the true $\rho$, while at higher levels they both tend to overestimate $\rho$ by about 15 percent. ML provides better estimates of $\rho$ than either IGLS or IRR at these higher levels of autocorrelation. However, this difference appears to have little or no impact on overall efficiency of the regression estimates, as shown in Table 3.
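The MSE comparison underlying Table 3 can be illustrated with a few lines of code. The replicated estimates below are invented; they are not values from Tables 2 to 4.

```python
import numpy as np

def mse(estimates, true_value):
    """Mean square error of a set of replicated estimates."""
    estimates = np.asarray(estimates, dtype=float)
    return np.mean((estimates - true_value) ** 2)

def bias_variance(estimates, true_value):
    """Bias and (population-form) variance of replicated estimates."""
    estimates = np.asarray(estimates, dtype=float)
    bias = estimates.mean() - true_value
    variance = estimates.var()       # ddof=0, so MSE = bias**2 + variance exactly
    return bias, variance

# Invented replicated estimates of a coefficient whose true value is 1.
est_a = np.array([0.9, 1.1, 1.0, 1.2, 0.8])   # tightly clustered procedure
est_b = np.array([0.5, 1.6, 1.1, 0.4, 1.4])   # widely dispersed procedure
bias_a, var_a = bias_variance(est_a, 1.0)

# A ratio below 1 means the numerator procedure has the smaller MSE.
ratio = mse(est_a, 1.0) / mse(est_b, 1.0)
```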

significance testing procedures for autocorrelation parameter $\rho$
In the above discussion of the ML computational procedures, expression 14 was given for obtaining the standard error of $\hat\rho$. Since $\hat\rho/\hat\sigma_{\hat\rho}$ is asymptotically normally distributed, the likelihood of significant autocorrelation is easily evaluated. The exact distribution of the $\hat\rho$ estimate generated by IGLS or IRR is not known.
One approach to discerning whether or not a statistically significant network process is operative is to employ Cliff and Ord's (1973) I-statistic. This statistic can be applied to the residuals obtained from an initial OLS regression to assess whether significant autocorrelation is occurring with respect to a given relational matrix W. Hepple (1976) has provided a similar statistic for testing OLS residuals which is considerably simpler to compute than the I-statistic. Given the OLS vector of residuals e, Hepple defines a test statistic s. Hence, if a significant s were detected, either IGLS or IRR could be used to obtain an estimate of the degree of autocorrelation present. The Z score obtained from this statistic is computed and reported in the empirical examples discussed below.
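A rough sketch of a residual autocorrelation check follows. It is not Hepple's statistic, whose formula is not reproduced in this text, and it does not use Cliff and Ord's analytic moments for regression residuals. Instead it computes the Moran-type I form and, as a plainly swapped-in approximation, obtains a Z score from random permutations of the residuals. The ring-shaped W and the patterned residual vector are hypothetical.

```python
import numpy as np

def morans_i(e, W):
    """Moran-type I for a residual vector e and relational matrix W."""
    e = np.asarray(e, dtype=float)
    return (len(e) / W.sum()) * (e @ W @ e) / (e @ e)

def permutation_z(e, W, n_perm=999, seed=0):
    """Z score of the observed I against random relabelings of the residuals."""
    rng = np.random.default_rng(seed)
    observed = morans_i(e, W)
    sims = np.array([morans_i(rng.permutation(e), W) for _ in range(n_perm)])
    return (observed - sims.mean()) / sims.std(ddof=1)

# Hypothetical demo: 35 units on a ring, two neighbours on either side.
n = 35
idx = np.arange(n)
gap = np.abs(idx[:, None] - idx[None, :])
gap = np.minimum(gap, n - gap)
W = ((gap >= 1) & (gap <= 2)).astype(float)
W /= W.sum(axis=1, keepdims=True)

theta = 2 * np.pi / n
e = np.sin(theta * idx)          # smooth pattern: neighbours have similar residuals
i_obs = morans_i(e, W)
z = permutation_z(e, W)
```

Because OLS residuals are not exchangeable, the permutation Z score should be read only as an informal screen before applying IGLS or IRR.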
An alternative approach to testing the $\hat\rho$ generated by either IGLS or IRR for significance would be to accept the $\hat\rho$ they yield as a good approximation to the ML $\hat\rho$ and insert the IGLS or IRR value into a likelihood ratio test (Hoel 1971; Brandsma and Ketellapper 1979). However, the simulation results reported here suggest that this would only offer a very approximate test of significance, so we do not pursue the possibility at this time.

empirical reanalyses

Pryor (1976) presents three tests for the detection of diffusion of cultural traits. As noted above, his method is based on the idea of a "diffusion possibility matrix" which he suggests could be constructed from the physical distances between societies or on language similarities. As an example he presents data from 60 societies rated on seven variables. He hypothesizes that "the presence of a significant amount of market exchange of labor is directly proportional to the level of economic development of the society and also to whether it relies heavily on herding as a source of subsistence goods" (1976:740). He also hypothesizes that "the presence of gambling in a society is directly proportional to the presence of a commercial money and to the presence of considerable socioeconomic differences and is inversely related to whether or not the society is a nomadic herding society" (1976:740). Using distance as a criterion for the possibility of diffusion, Pryor suggests via his three tests that diffusion of the measured traits is likely. Having found some evidence of diffusion, his options at the time he wrote the paper were severely limited. As we shall show with our more current methodology, Pryor did remarkably well. He notes that gambling was prevalent particularly in North America, so he includes whether a society was a North American society as a (0,1) dummy variable and recomputes his regression equation.
Using our methods, we reanalyzed the data as presented by Pryor. Instead of a diffusion possibility matrix, we computed a network relational matrix based on physical distances, as in our previous example. Using that matrix and our two network autocorrelation procedures, we recomputed the regression coefficients and present them here, along with Pryor's estimates (Tables 5 and 6). The standard deviations of the regression coefficients are reported in parentheses below the coefficients.
In the model for the presence of gambling (Table 5), we find significant autocorrelation (z = 6.853). Our estimate of $\rho$ is large ($\hat\rho$ = .795), but our regression coefficients are substantially the same as reported by Pryor in his first analysis, with one important exception. Pryor (1976:741) reports a value of .167 (.114) as the coefficient and standard deviation for the presence of socioeconomic inequality. From these results alone he would have had to conclude that socioeconomic inequality was not a causal factor in the presence of gambling. Using the same variables along with the matrix of distance similarities, we find a coefficient of .321 (.067), which is double Pryor's coefficient, with half the standard error. This means that a relationship which remains undetected using standard techniques is shown to be highly significant when the autocorrelation is taken into account. When Pryor included North America as a dummy variable, the coefficient for socioeconomic inequality increased and, more importantly, was significant. By use of this dummy variable Pryor found a relationship not seen in his first analysis. At the same time $R^2$ increases from .221 to .676, which indicates that a great deal of the variation in the occurrence of gambling can be explained by the presence of the society in North America. When we include North America in our regression equation, there is no longer significant autocorrelation (z = 1.263) and our regression coefficients coincide with Pryor's. This indicates that the autocorrelation which we report in our first analysis is to be found mainly in North America.
Again applying the techniques to the model for labor market exchange (Table 6), the test for autocorrelation gives a value of z = -.492, which is insignificant. Although computation of the regression estimates under a model of autocorrelated disturbances is inappropriate where autocorrelation is insignificant, we have presented those computations to illustrate that estimates of the main parameters of interest remain essentially the same as in standard OLS procedures. First note that the results of the two procedures (IRR and IGLS) are almost identical. This indicates that the process probably has converged to a global minimum. Although Pryor reports evidence of diffusion, we find that autocorrelation is statistically insignificant and that his OLS coefficients are valid. He does report that clustering occurred only among the dependent variables, which is true and is an indication that perhaps the disturbances model is not correct and that the "effects" model for autocorrelation may be more appropriate.¹⁰ It should also be emphasized that since there is insignificant autocorrelation, the negative values for ρ are meaningless.
These results verify and illustrate our simulation findings. First, there is little difference between the results of IGLS and IRR; however, since they are entirely different procedures, the fact that they give nearly identical results is an almost sure check that they have converged to a global minimum. Second, we have illustrated a case in which significant autocorrelation did not exist and in which the two procedures give results essentially identical to OLS. Finally, and of greatest importance to cross-cultural researchers, a relationship which could have been completely missed using standard techniques has been detected using either IGLS or IRR. If it had not been for Pryor's astute observations, he might have been forced to abandon a relationship of substantial importance in his theorizing.
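The I-statistic that underlies tests of this kind can be illustrated in a few lines. The Python sketch below computes Moran's I for a set of scores (for example, regression residuals) and a weight matrix W. The z values reported in this paper rest on the analytic standard error described by Cliff and Ord (1973); the sketch does not reproduce that formula. As a simpler stand-in it uses a permutation reference distribution, which treats the scores as exchangeable and so is only approximate for regression residuals. The function names, the six-unit chain, and the scores are hypothetical.

```python
import random


def morans_i(values, W):
    """Moran's I for a list of scores and a square weight matrix W."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in W)  # sum of all weights
    cross = sum(W[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


def permutation_z(values, W, n_perm=999, seed=0):
    """Standardize the observed I against I values from shuffled scores.

    This is a swapped-in approximation, not the Cliff and Ord formula.
    """
    rng = random.Random(seed)
    observed = morans_i(values, W)
    shuffled = list(values)
    sims = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        sims.append(morans_i(shuffled, W))
    mean = sum(sims) / n_perm
    sd = (sum((s - mean) ** 2 for s in sims) / (n_perm - 1)) ** 0.5
    return (observed - mean) / sd


# Hypothetical chain of six societies, each tied to its immediate neighbors.
n = 6
W = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]

clustered = [1, 1, 1, -1, -1, -1]    # neighbors tend to resemble each other
alternating = [1, -1, 1, -1, 1, -1]  # neighbors are always unlike

i_clustered = morans_i(clustered, W)
i_alternating = morans_i(alternating, W)
z_clustered = permutation_z(clustered, W)
```

For this toy chain I is 0.6 for the clustered scores and -1.0 for the alternating scores, so the sign of I distinguishes similarity among neighbors from dissimilarity.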
In an earlier paper (White et al. 1981) we propose and explore a model of the causes of the sexual division of labor in African agriculture. One of the relationships that we tested is total female agricultural participation as a function of crop type (C: 1 = root, 2 = cereal) and slavery (S: 1 = absent, 2 = incipient, 3 = present). The total-female-involvement-in-agriculture dependent variable, labeled 1, is obtained as the sum of three five-point scales indicating female contribution to harvesting, soil preparation, and crop tending, and ranges from a low of 3 to a high of 15.
The sample used in our previous study consists of the 31 African societies from the Standard Cross-Cultural Sample (Murdock and White 1969) that have some degree of agriculture and for which we have data on crop type and slavery. To measure interdependence among these societies we employed both geographical distance and linguistic relatedness criteria. That is, we constructed two W connectivity matrices, one based on an exponentially decreasing function of distance and the other on a tree of historical relations among all 31 languages.¹¹ (Details on the construction of these matrices and the variables used, as well as the theoretical rationale for the model, are available in White et al. 1981.)
Table 7 shows the results of our reanalyses. The OLS and ML results for both W matrices are taken from White et al. (1981). We include here the parameter estimates obtained using the above-described IGLS and IRR procedures.¹² For the language W matrix, all three network disturbances procedures show very similar magnitudes for the regression coefficients and their standard deviations. These estimates are quite different from the OLS estimates, although there are no reversals in inferring significance of the independent variables: each regression coefficient is still more than twice its standard deviation. The disturbances parameter estimate is significant for all three procedures, although the estimate is higher for IGLS and IRR. Both of these latter procedures show R²'s almost identical to OLS and ML.¹³ For the distance W matrix, the estimates for all three autocorrelation procedures are very similar. The estimates of ρ are again all significant, though less so than with the language matrix. The R²'s are very slightly lower for IGLS and IRR than for OLS or ML.
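As a rough illustration of the first kind of W matrix, the Python sketch below builds weights that decay exponentially with distance. It is a sketch under stated assumptions and not the construction used in White et al. (1981): the planar coordinates, the `scale` decay parameter, the zero diagonal, the row normalization, and the function name `distance_decay_matrix` are all choices made here for illustration. Real applications would more plausibly use great-circle distances between societies.

```python
import math


def distance_decay_matrix(coords, scale):
    """Row-normalized W with w_ij proportional to exp(-d_ij / scale).

    coords: list of (x, y) points (hypothetical planar locations).
    scale:  assumed decay parameter; larger values give slower decay.
    """
    n = len(coords)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # a society is not treated as its own neighbor
                d = math.dist(coords[i], coords[j])
                W[i][j] = math.exp(-d / scale)
        row_sum = sum(W[i])
        if row_sum > 0:
            # Assumed normalization: each row of W sums to 1.
            W[i] = [w / row_sum for w in W[i]]
    return W


# Three hypothetical societies on a line: two close together, one far away.
coords = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
W = distance_decay_matrix(coords, scale=2.0)
```

In this toy example the first society gives most of its weight to its near neighbor and very little to the distant one, which is the qualitative behavior an exponentially decreasing function of distance is meant to capture.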

conclusions
The formal representation of the network autocorrelated disturbances model in matrix terms, as in equations 2 and 3, offers a quite natural characterization of Galton's Problem. W relational matrices give rich expression to the idea of cultural diffusion, and hypotheses concerning diffusional effects may be investigated by constructing W matrices based on various theoretical considerations. The W matrices based on measures of language similarity and geographical distance used in the cross-cultural examples reported here, for example, correspond quite closely to theoretical notions concerning the mechanisms underlying cultural diffusion processes.
Since the network disturbances model is a true generalization of the OLS multiple regression model, the same kinds of complex hypothesis testing and data analysis that are possible with the latter are possible with the former. Even where observations are interdependent, hypotheses concerning the effects of an independent variable on a dependent variable while holding constant one or more additional independent variables can be tested. More complex analyses of direct and indirect effects, such as in path analysis or structural equation modeling, are now possible using cross-cultural survey data. Replication of such complex models across major world geographical/cultural regions, where the most important regional autocorrelation effects are specified within the model, is an important step toward formulating models of great generality.
The results of the simulation study clearly show that all three autocorrelation procedures completely dominate OLS regression whenever autocorrelation effects are moderate to high (ρ ≥ .5). Thus, one of the three autocorrelation procedures is clearly preferable to OLS at these levels of autocorrelation. For the simulation data analyzed here, there is surprisingly little to distinguish between the more formally correct ML autocorrelation procedure and the two newly proposed procedures in terms of bias and efficiency of the regression coefficient estimates. From the standpoint of computational simplicity, however, either IGLS or IRR is clearly preferable to ML.
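To give a sense of why iterative procedures of this kind are computationally simple, the Python sketch below alternates between two ordinary regressions. It is not a transcription of IGLS or IRR, whose exact steps are defined earlier in the paper. It is a Cochrane-Orcutt-style iteration written under the assumption that the disturbances model has the form y = Xβ + ε with ε = ρWε + ν: ρ is estimated by regressing the residuals on their network lag, y and X are transformed by (I - ρW), and β is re-estimated by OLS. The function names, the tolerance, the iteration cap, and the toy data are all hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def ols(X, y):
    """OLS coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(k)]
    return solve(XtX, Xty)


def lag(W, v):
    """Network lag W v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def iterated_fit(X, y, W, tol=1e-6, max_iter=100):
    """Alternate between estimating rho from residuals and beta by OLS."""
    n, k = len(X), len(X[0])
    Wy = lag(W, y)
    WX = [lag(W, [row[j] for row in X]) for j in range(k)]  # lagged columns
    beta, rho = ols(X, y), 0.0
    for _ in range(max_iter):
        # Residuals on the original scale, then regress e on We (no intercept).
        e = [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]
        We = lag(W, e)
        denom = sum(w * w for w in We)
        new_rho = sum(w * r for w, r in zip(We, e)) / denom if denom > 0 else 0.0
        # Transform by (I - rho W) and re-estimate beta by OLS.
        y_star = [y[i] - new_rho * Wy[i] for i in range(n)]
        X_star = [[X[i][j] - new_rho * WX[j][i] for j in range(k)] for i in range(n)]
        beta = ols(X_star, y_star)
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    return beta, rho


# Toy data: intercept plus one predictor for four hypothetical societies.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 4.0, 7.0]
W_none = [[0.0] * 4 for _ in range(4)]  # no ties among societies
W_chain = [[0, 1, 0, 0], [0.5, 0, 0.5, 0], [0, 0.5, 0, 0.5], [0, 0, 1, 0]]

beta_ols, rho_ols = iterated_fit(X, y, W_none)
beta_net, rho_net = iterated_fit(X, y, W_chain)
```

With no ties among societies the procedure reduces to OLS, which mirrors the point that the disturbances model generalizes ordinary regression. Each pass needs only two least-squares fits and no eigenvalues of W, which is where the computational saving over ML would come from in a procedure of this type.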
Neither our simulation results nor our empirical reanalyses allow us to draw any firm conclusions concerning the relative merits of IRR and IGLS. It is possible that for larger sample sizes one of the two procedures will come to dominate the other in terms of the MSE of the regression coefficients or of ρ. For estimating a more general model with two or more relational matrices embedded in the error structure, residual regression seems to be the more natural approach to estimation.
Our confidence in both of the new and computationally simpler autocorrelation procedures as accurate and reliable solutions to Galton's Problem is considerably strengthened by the results of our empirical reanalyses. In the reanalyses of our own previous example, both IGLS and IRR generate estimates that are trivially different from our previously reported ML results. Also, in the reanalysis of Pryor's examples both procedures uncover a substantively important relationship that does not appear using conventional OLS routines, and which may have been missed originally but for Pryor's astute observations. While it is obvious that no technique will replace the investigator's understanding and grasp of a problem, the task of theory building and testing using interdependent cross-cultural data may be made substantially easier if appropriate techniques, such as IGLS or IRR, are employed.

notes
For discussion and examples of the a-coefficient and a language similarity measure suitable for cross-cultural research, see White et al. (1981).
The ln terms in expression 12 are natural (i.e., base e) logarithms.
A coarse first search of the interval was conducted using steps of .1 to locate a minimum. A second search of the ±.1 interval around this minimum in steps of .01 was then carried out to locate the final minimizing value of ρ.
Associated with any N x N square relational matrix W are N numbers, not necessarily distinct, called its eigenvalues. Special computational routines are required to calculate the eigenvalues for all but extremely small W matrices. An extended discussion of eigenvalues, their properties, and algorithms to calculate them is given in Green (1976). Because of the limitations of our current computer facility, the maximum sample size our ML autocorrelation procedures can handle is about 60.
We stopped when successive values of 8 were within an absolute value of ,131.
The I-statistic can also be used to test for autocorrelation of any individual variable in the equation. Details on the computation of the I-statistic and its standard error for significance testing are given in Cliff and Ord (1973) and Doreian (1980). For an interesting biological example of the use of the I-statistic to investigate diffusion of human alleles throughout Europe with the spread of farming during the Neolithic, see Sokal and Menozzi (1982).
¹⁰ The "effects" model hypothesizes that only the dependent variable is autocorrelated with respect to a given W matrix. This model is specified as follows: y = ρWy + Xβ + ε. See Doreian (1980) and Erbring and Young (1979) for discussion of this model.
¹¹ In this paper we estimate the network disturbances model separately for the language matrix and for the distance matrix. The network disturbances model can be generalized to include both matrices simultaneously as follows: ε = ρ₁W₁ε + ρ₂W₂ε + ν. For an empirical example of this more general model applied to cross-cultural data, see Dow (1984).
¹² All of the computations reported in this paper for the IRR and IGLS procedures were carried out using BASIC language programs written by Reitz and Dow. Listings of these programs are available from the authors upon request.
¹³ The R² reported in our previous paper (White et al. 1981) are incorrect for the ML procedures using both the language and distance W matrices. Instead of the .67 and .61 reported, the correct R² are .45 and .45, respectively, as shown here. Again, the R² reported here are simply the squared Pearson product-moment correlation between the original dependent variable scores and the predicted scores (i.e., the squared correlation of y with ŷ).

references cited