Issues in the Use of the Event Study Methodology: A Critical Analysis of Corporate Social Responsibility Studies

Organizational researchers are increasingly using the event study methodology to assess the effect of strategic decisions on firm performance. Unfortunately, event studies alone are inadequate because, at best, they provide estimates of the short-run impact on shareholders only and not on other corporate stakeholders. Furthermore, event study findings are sensitive to even small changes in research design. The authors illustrate the lack of robustness by examining five recent studies of corporate social responsibility (CSR) that report conflicting results. They conclude that these contradictory findings arise from significant differences in research design and implementation. The authors also demonstrate why it is inappropriate to draw conclusions regarding the managerial implications of CSR activities from these studies. Finally, they identify alternative methodologies that organizational researchers could use to supplement the event study approach to assess the overall impact of CSR on stakeholders.

The event study methodology was developed to assess the effect of an unanticipated event on stock prices. That is, it measures the average change in share price that occurs when a major "event" is announced. This event presumably provides new information on the future profitability of companies that experience it. Event studies have been used widely in the fields of accounting, economics, and finance to assess the stock price effect that is conveyed by a major corporate announcement. These include announcements of quarterly earnings, mergers and acquisitions, new products and investments, legislation and regulatory changes, and other economically relevant events. More recently, organizational researchers have "imported" the event study methodology to study managerial decisions such as changes in corporate governance and decisions involving corporate social responsibility (CSR).
In this article, we examine critical issues in the use of event study methodology, as applied in management research. To illustrate the importance of these issues, we examine five published studies that estimate the effect of CSR decisions on firm performance. These five studies all examine the same issue, divestment of South African assets during the apartheid controversy, but report conflicting results. Therefore, an in-depth analysis of these studies offers fertile ground for examining methodological issues.
We begin with a brief description of the methodology, followed by a discussion of the use of event studies in management research. Next, we point out the inconsistencies in the direction and magnitude of the reported abnormal stock price returns in the South African divestment studies. This leads us to consider how these contradictory results might arise. Although we focus on this one particular event, our aim is to provide general guidance on the use of event studies, especially in the area of CSR. We demonstrate that the following are critical research design and methodological issues in any event study:
• defining the event and constructing an appropriate sample,
• the length of the window used to compute abnormal returns,
• accounting for the leakage of information,
• sample size, and
• controlling for industry effects.
Each of the above issues is examined in detail, followed by a discussion of how the effect on additional (nonfinancial) stakeholders has been handled and the managerial implications of these studies. Finally, we discuss the implications of our analysis. Our primary conclusion is that we have not learned much from event studies of South African divestment. Therefore, we recommend that management scholars not use this method to examine issues of corporate social responsibility, unless they do so in conjunction with the use of alternative methodologies that enable them to assess the overall impact of CSR.

Event Study Methodology
The standard methodology, which is based on the market index model, is described as follows: Daily, value-weighted returns for the firm and for the market are used to estimate the following equation for each firm for each event:
$$R_{it} = \alpha_i + \beta_i R_{mt} + \varepsilon_{it}, \tag{1}$$
where $R_{it}$ = rate of return on the share price of firm $i$ over period $t$, $R_{mt}$ = rate of return on a value-weighted market portfolio of stocks over period $t$, $\alpha_i$ = the intercept term, $\beta_i$ = the systematic risk of stock $i$, and $\varepsilon_{it}$ = the error term, with $E(\varepsilon_{it}) = 0$. For example, $R_{it}$ might be the rate of return for IBM stock over a specified period, usually about 200 trading days (250 to 50 days prior to the event). From estimation of the above equation, the researcher derives estimates of daily abnormal returns (AR) for the $i$th firm using the following equation:
$$AR_{it} = R_{it} - (a_i + b_i R_{mt}), \tag{2}$$
where $a_i$ and $b_i$ are the ordinary least squares parameter estimates obtained from the regression of $R_{it}$ on $R_{mt}$ over an estimation period ($T$) preceding the event (250 to 50 days prior to the event). The abnormal return ($AR_{it}$) represents the return earned by the firm after adjusting for the "normal" return process on date $t$. That is, as shown in Equation (2), the rate of return on the stock is adjusted by subtracting the expected return from the actual return. Any significant difference is considered to be an abnormal or excess return. Following the example above, $AR_{it}$ is an estimate of how the return on IBM stock differed, on day $t$, from its predicted return based on the average "movement" of the market and the firm-specific parameters ($a_i$ and $b_i$). Some stocks, such as technology stocks, are more volatile, relative to the market, than others. These stocks will have higher $\beta$ values. Therefore, many authors compute a standardized abnormal return (SAR) in which the abnormal return is divided by its standard deviation.
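The market model regression and the abnormal return calculation described above can be illustrated with a minimal sketch. The function names and the return series are hypothetical and chosen only so that the fitted parameters are known in advance; an actual study would use roughly 200 daily observations per firm.

```python
from statistics import mean


def fit_market_model(firm_returns, market_returns):
    """Ordinary least squares fit of R_it = a_i + b_i * R_mt over the estimation period."""
    r_bar = mean(firm_returns)
    m_bar = mean(market_returns)
    sxx = sum((m - m_bar) ** 2 for m in market_returns)
    sxy = sum((m - m_bar) * (r - r_bar)
              for m, r in zip(market_returns, firm_returns))
    b_i = sxy / sxx
    a_i = r_bar - b_i * m_bar
    return a_i, b_i


def abnormal_return(a_i, b_i, firm_return, market_return):
    """Actual return minus the return predicted by the market model."""
    return firm_return - (a_i + b_i * market_return)


# Hypothetical estimation-period returns (illustrative values only).
market = [0.010, -0.005, 0.002, 0.007, -0.003, 0.004]
# Constructed to lie exactly on a line with intercept 0.001 and slope 1.5.
firm = [0.001 + 1.5 * m for m in market]

a_i, b_i = fit_market_model(firm, market)

# Hypothetical event day: the firm returns 3.0% while the market returns 1.0%.
ar_event = abnormal_return(a_i, b_i, firm_return=0.030, market_return=0.010)
print(round(a_i, 6), round(b_i, 6), round(ar_event, 6))
```

In this constructed case the predicted return on the event day is 0.001 + 1.5 × 0.010 = 1.6%, so the abnormal return is the remaining 1.4 percentage points.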
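The standardized abnormal return is
$$SAR_{it} = \frac{AR_{it}}{SD_{it}}, \qquad SD_{it} = \left\{ S_i^2 \left[ 1 + \frac{1}{T} + \frac{(R_{mt} - \bar{R}_m)^2}{\sum_{\tau=1}^{T} (R_{m\tau} - \bar{R}_m)^2} \right] \right\}^{0.5},$$
where $S_i^2$ is the residual variance from the market model as computed for firm $i$, $\bar{R}_m$ is the mean return on the market portfolio calculated during the estimation period, and $T$ is the number of days in the estimation period. For example, IBM, as a technology company, has more volatile stock than the market average, and it is important that its abnormal return be standardized.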
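The standardization step can be sketched as follows, assuming that the standard deviation is the usual forecast-error standard deviation of the market model (residual variance inflated by 1 + 1/T plus a term for how far the event-day market return lies from its estimation-period mean). Dividing the sum of squared residuals by T − 2 is an assumed convention, and all data values are hypothetical.

```python
from math import sqrt
from statistics import mean


def standardized_abnormal_return(ar, residuals, market_estimation, market_event):
    """Divide an abnormal return by its forecast-error standard deviation."""
    T = len(market_estimation)
    # Residual variance of the market model (T - 2 divisor is an assumption).
    s2 = sum(e ** 2 for e in residuals) / (T - 2)
    m_bar = mean(market_estimation)
    sxx = sum((m - m_bar) ** 2 for m in market_estimation)
    sd = sqrt(s2 * (1 + 1 / T + (market_event - m_bar) ** 2 / sxx))
    return ar / sd


# Hypothetical estimation-period market returns and market-model residuals.
market_estimation = [-0.02, -0.01, 0.0, 0.0, 0.01, 0.02]
residuals = [0.01, -0.01, 0.01, -0.01, 0.01, -0.01]

# The same 2% abnormal return on a quiet market day and on an extreme one.
sar_quiet = standardized_abnormal_return(0.02, residuals, market_estimation, market_event=0.0)
sar_extreme = standardized_abnormal_return(0.02, residuals, market_estimation, market_event=0.05)
print(sar_quiet > sar_extreme)
```

The same raw abnormal return yields a smaller standardized value when the event-day market return is far from its estimation-period mean, because the market model prediction is less precise there.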
The standardized abnormal returns can then be cumulated over a number of days, $k$ (the event window), to derive a measure of the cumulative abnormal return (CAR) for each firm:
$$CAR_i = \frac{1}{k^{0.5}} \sum_{t=1}^{k} SAR_{it}.$$
For example, $CAR_i$, where $i$ represents IBM, would be the sum of the standardized abnormal returns for IBM over the length of the event window (usually 2 to 3 trading days), scaled by $k^{0.5}$.
A standard assumption is that the $CAR_i$ are independent and identically distributed across firms. With this assumption, we convert these values to identically distributed variables by dividing each $CAR_i$ by its standard deviation, which is equal to $((T - 2)/(T - 4))^{0.5}$. Thus, we can compute the average standardized cumulative abnormal return (ACAR) across $n$ firms over the event window as
$$ACAR = \frac{1}{n} \cdot \frac{1}{[(T - 2)/(T - 4)]^{0.5}} \sum_{i=1}^{n} CAR_i.$$
In this step, the cumulative returns of all firms in the sample are summed, and the sum is divided by the number of firms to arrive at an average CAR, which is then standardized. Expanding on the example of IBM, the researcher would then sum the CARs for all the firms in the sample, including IBM, that announced the event of interest (such as a merger) and calculate a standardized average. The test statistic used to assess whether the average cumulative abnormal return is significantly different from zero (its expected value) is
$$Z = ACAR \times n^{0.5}.$$
If significant, the average cumulative abnormal return is assumed to measure the average effect of the event on the stock price of the firms that experienced the event. That is, the significance of the average abnormal return allows the researcher to infer that the event had a significant impact on the value of the firms.
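The aggregation steps can be sketched as below. The sketch assumes that each firm's CAR is the sum of its standardized abnormal returns scaled by the square root of the window length k, which is what makes its standard deviation equal to the ((T − 2)/(T − 4))^0.5 figure given above, and that the test statistic is the ACAR multiplied by the square root of the number of firms. Firm labels and SAR values are hypothetical.

```python
from math import sqrt


def cumulative_abnormal_return(sars):
    """Sum of standardized abnormal returns over a k-day window, scaled by sqrt(k)."""
    k = len(sars)
    return sum(sars) / sqrt(k)


def average_car(cars, T):
    """Average of firm-level CARs, each divided by its standard deviation."""
    n = len(cars)
    return sum(cars) / (n * sqrt((T - 2) / (T - 4)))


def z_statistic(cars, T):
    """Test statistic for the hypothesis that the average CAR is zero."""
    return average_car(cars, T) * sqrt(len(cars))


T = 200  # days in the estimation period
# Hypothetical standardized abnormal returns over a 3-day window.
sars_by_firm = {
    "firm_a": [0.4, 1.1, 0.3],
    "firm_b": [-0.2, 0.6, 0.1],
    "firm_c": [0.9, 0.2, -0.5],
    "firm_d": [0.3, 0.8, 0.4],
}

cars = [cumulative_abnormal_return(s) for s in sars_by_firm.values()]
acar = average_car(cars, T)
z = z_statistic(cars, T)
print(round(acar, 4), round(z, 4))
```

Under the independence and normality assumptions discussed in the text, Z would be compared with standard normal critical values; with only a handful of firms, as here, that comparison is exactly what the later discussion of sample size calls into question.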

Use of Event Studies in Management Research
The event study methodology is often used to address managerial issues. For example, Koh and Venkatraman (1991) conducted an event study of the announcement of joint ventures. This topic is well suited to the method because securities analysts closely follow such developments, and thus such announcements (if unanticipated) are likely to have a financial impact. Furthermore, the impact on nonfinancial stakeholders is usually not large for joint ventures. Finally, the authors used the appropriate research design and methods. A number of event studies on corporate governance topics are equally good. Table 1 lists a wide variety of managerial issues that have been examined using the event study method, including corporate governance and ownership control changes, the formation of joint ventures, investment decisions, the implementation of diversification, turnaround, layoff programs, human resource management issues, and, of course, CSR. These topics span various fields in management but are primarily in the area of business policy and strategy. That is not surprising because the field of strategy is focused on explaining why (and how) some firms outperform others, and strategy professors typically have had more exposure to finance and economics than have other management professors. Also, strategy researchers have traditionally examined the performance implications of well-defined strategic events (or shifts in strategy), such as mergers and acquisitions, which have been shown to influence stock prices. Alternative methodologies that have been used to examine the topics identified in Table 1 include correlation analysis, multiple regression, and structural equation modeling (see, e.g., Hill & Snell, 1989; Hoskisson, Johnson, & Moesel, 1994; Wright, Robbie, Thompson, & Starkey, 1994).
These alternative approaches typically emphasize long-run, multiple indicators of performance, based on accounting data, such as return on assets (ROA), return on investment (ROI), and return on equity (ROE). These measures have been computed at both the firm and business segment level (Hitt, Hoskisson, Johnson, & Moesel, 1996). Event studies, by contrast, provide only a single firm-level indicator of performance change. An interesting and very useful alternative approach is to study the effect of major corporate announcements on other managerial decisions. For example, Hitt, Hoskisson, Ireland, and Harrison (1991); Hitt et al. (1996); Lichtenberg and Siegel (1990); and Long and Ravenscraft (1993) assess the impact of corporate control changes on the intensity of research and development (R&D) investment (R&D expenditures and patents). This is an excellent example of estimating the impact of these "events" on nonfinancial stakeholders because, for example, reductions in R&D could reduce productivity growth and product innovation and ultimately reduce our standard of living.
However, many researchers have limited the scope of their studies by relying exclusively on the event study methodology. Limiting analysis of managerial issues to the use of event studies is unfortunate because the methodology is not always appropriate and not consistently well executed. For some topics, such as CSR, it is inappropriate because the method allows the researcher to assess the impact of the event on only one stakeholder: the shareholder. CSR affects multiple stakeholders, and it is not reasonable to draw managerial implications about the success of CSR without examining the effect on these stakeholders. We have also found that the event studies published in management journals do not compare favorably with those published in finance journals, in terms of research design and implementation (McWilliams & Siegel, 1996).

Using Event Studies to Measure the Effect of CSR: Divestment From South Africa
In McWilliams and Siegel (1997), we focused on three recent studies of CSR to illustrate the theoretical and empirical limitations of this method. Two of these studies addressed human resource management issues, and one examined divestment from South Africa (Meznar, Nigh, & Kwok, 1994). Meznar et al. reported that firms withdrawing from South Africa suffered a substantial reduction in share price, which they interpreted as a reflection of a transfer of wealth from shareholders to other corporate stakeholders. We identified critical flaws in the Meznar et al. paper and also pointed out that event studies, even when they are well designed and executed, do not provide conclusive evidence of the existence of such a transfer of wealth. Hence, we encouraged researchers to move beyond event studies to examine the impact of CSR on other stakeholder groups.
There are now five published event studies of withdrawal from South Africa, as summarized in Table 2.
Only Posnikoff found support for the premise that firms "do well by doing good." The Meznar et al. and Wright and Ferris evidence indicates that there is a trade-off between "doing good" and "doing well." Because these findings strike at the core of widely held beliefs about corporate social responsibility, this divergence in the reported evidence warrants scrutiny.
[Table 2 note. CSR = corporate social responsibility; NA = not applicable. a. These two studies are based on the same data.]

Direction and Magnitude of Reported Abnormal Stock Price Return
Wright and Ferris (1997) find that divestment resulted in a loss of wealth to shareholders. Although small, the decrease of 0.249% was statistically significant. From this they conclude that "the results of this study suggest that announcements of corporate divestment of South African business units are associated with significant negative excess returns" (p. 81). Posnikoff (1997) reports that for her sample of 40 firms, the effect of announcing divestment was positive, with an average increase in stock price of as much as 0.28%. The results from these studies were based on short windows. Meznar et al. (1994, 1998) report results for several different windows. It is important to note that they report no significant stock price reaction during the traditional short 3-day event window (-1, +1). Instead, they report significant results only when they employ very long windows and for very small samples of firms (10 and 22 for their 1998 reply). Interestingly, Meznar et al. (in the 1998 reply) find the largest negative return for one of the smallest samples of firms, 10 companies that divested their South African assets early in the apartheid controversy. According to the authors, these 10 firms experienced, on average, an 11% cumulative negative abnormal return. They also report that firms divesting in the "middle" of the controversy experienced an average decline in shareholder wealth of about 7%, whereas those that divested after sanctions were imposed suffered no financial losses.
The magnitude of the estimated wealth effect is highly implausible. An effect of 11% is enormous by any standards because the firms involved are very large, multinational firms such as IBM, Exxon, Pepsi, and Dow Chemical. According to Meznar et al. (1994, 1998), the firms in their sample report balance sheet assets of $10 billion on average. Because the market value of most companies exceeds the value of assets reported on the firm's balance sheet, an average decline of 11% in share price would likely have resulted in an average decline in market value that significantly exceeded $1.1 billion. This is a remarkably large impact, given that the average asset holdings in South Africa were far less than 1% of firm value (Teoh et al., 1999, p. 87). Meznar et al.'s (1994) reported impact of 11% is nearly 17 times larger than the assets involved and is approximately 40 times as great as the effects reported by either Wright and Ferris (1997) or Posnikoff (1997). This lends credence to Posnikoff's speculation that other influences, such as general market declines, might account for the results reported by Meznar et al. (Posnikoff, 1997, p. 15). McWilliams and Siegel (1997) replicate the results of Meznar et al. (1994) for the 3-day windows, and they also report no significant impact from divestment. They also report no significant impact from divestment for the longer windows, after controlling for confounding events. That is, using a different empirical design, they were not able to replicate the results reported by Meznar et al. Teoh et al. (1999) also report no significant stock price reaction to the announcement of divestment.
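The arithmetic behind the plausibility argument can be checked directly. The short sketch below uses only figures quoted in the text; the variable names are illustrative, and the dollar figure is a floor because it applies the 11% decline to book assets rather than to the (typically larger) market value.

```python
# Figures quoted in the text.
avg_book_assets = 10e9          # average balance sheet assets, in dollars
reported_decline = 0.11         # cumulative abnormal return reported for the 10-firm subsample
wright_ferris_effect = 0.249    # absolute size of the reported effect, in percent
posnikoff_effect = 0.28         # absolute size of the reported effect, in percent

# Lower bound on the implied average loss in value per firm.
implied_loss_floor = avg_book_assets * reported_decline

# How many times larger the 11% effect is than the short-window estimates.
ratio_to_wright_ferris = 11.0 / wright_ferris_effect
ratio_to_posnikoff = 11.0 / posnikoff_effect

print(f"{implied_loss_floor:,.0f}")
print(round(ratio_to_wright_ferris, 1), round(ratio_to_posnikoff, 1))
```

Both ratios are on the order of 40, which is consistent with the "approximately 40 times" comparison made above.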
In summary, three of the five studies report no significant financial impact to divestment using a standard event window. The other two report significant financial impacts, one positive (Posnikoff, 1997) and one negative (Wright & Ferris, 1997). Both of these results are based on short windows. In addition, Meznar et al. (1994, 1998) report a very large significant negative impact for some subsamples, when they employ very long windows.
We continue to examine these issues to provide some insights regarding the causes of these conflicting results and to stress the importance of examining the effect of corporate social responsibility on other (nonfinancial) stakeholders, including the following:
• individual plants or business units,
• workers and unions,
• consumers,
• suppliers,
• financial institutions, and
• communities where CSR takes place.

Issues to Consider in Resolving Conflicting Results
One possibility is that the inconsistency of results in South African divestment studies stems from improper event study research methods. As demonstrated in our 1997 paper, event studies are quite sensitive to research design issues (McWilliams & Siegel, 1997). Critical issues identified in our 1997 paper included sample size and the effect of "outliers," length of the event window and confounding effects, and explanation of the abnormal returns. In this article, we focus on the definition of the event and construction of an appropriate sample, accounting for "leakage" of information about the impending event, the length of the window used to estimate abnormal returns, and sample size and controlling for industry effects. Table 2 summarizes how each of these issues was handled in the four studies of South African divestment, as discussed below.

Constructing an Appropriate Sample and How to Define the Event and Control for Other Events
Defining the event may be more difficult when addressing CSR issues than when examining issues that are more clearly associated with stock trades, such as mergers and acquisitions and earnings announcements, because there are no well-developed definitions to rely on. Interpretation of results may be affected by how the event is defined, and this makes construction of the sample more critical in CSR studies than in some others. It also introduces inconsistency that is not generally recognized as a design problem.
Although Wright and Ferris (1997), Posnikoff (1997), Meznar et al. (1994, 1998), McWilliams and Siegel (1997), and Teoh et al. (1999) examine the same issue (divestment from South Africa to protest apartheid), there is considerable variety in the sample of firms studied. This divergence results primarily from how the authors define the event and how they control for the effect of other events on firm value. Table 2 (row 1) lists how the event was defined in each of the five studies. For further clarification of the differences in the samples, Table 3 lists the firms included in each study, except McWilliams and Siegel (1997) because their study is a replication of Meznar et al.
[Table 3. Firms Included in Event Studies of Divestment From South Africa. Columns: Teoh, Welch, and Wazzan (1999); Wright and Ferris (1997); Meznar, Nigh, and Kwok (1994, 1998); Posnikoff (1997).]
Table 2 indicates that Wright and Ferris (1997) examine a sample of 31 firms. They constructed this sample by first identifying an original sample of 116 firms that divested assets between January 1, 1984, and December 31, 1990. From this set of 116 firms, they selected 31 firms that conformed to the following standards:
1. they had a "good" Sullivan rating (a measure of racial neutrality),
2. the date of divestment was announced in either the Wall Street Journal or the New York Times, and
3. shareholder resolutions did not "force" managers to divest (Wright & Ferris, 1997, p. 80).
That is, Wright and Ferris (1997) define the event as the published announcement of voluntary divestment by firms that had a positive Sullivan rating. The firm names but not the event dates are reported in the paper.
Posnikoff (1997) identified 52 companies that made "distinct public and published announcements of divestment" between 1977 and 1992 (p. 78). Of these, she found that 40 had complete data on financial returns for the estimation period from 1980 to 1991. That is, she defined the event as the announcement of divestment by all firms but located returns data for only 40 of these firms. Firm names and announcement dates for the final sample of 40 firms are reported in the paper.
Meznar et al. (1994) identified 207 "corporations that ceased operating in South Africa . . . from the early 1970s to January 1991." From these, the authors identified those firms that announced "an explicit decision to pull out of South Africa" in U.S. newspapers, particularly the Wall Street Journal. This resulted in a sample of 68 companies. From this sample, the authors report that they eliminated the following:
1. firms that were not listed on U.S. stock exchanges,
2. firms for which there were other relevant events reported on the day of the announcement or 1 day before or after, and
3. Nashua Corporation because its returns were "extremely volatile for an extended period that included the event date."
This left them with a sample of 39 firms (Meznar et al., 1994, pp. 1638-1639). That is, Meznar et al. define the event in a manner that is similar to Posnikoff (1997). However, they eliminate firms whose stock price may be significantly affected by other events on or near the announcement date and one firm whose stock price was very volatile. The firm names and announcement dates are available from Meznar et al. but were not reported in the paper.
In McWilliams and Siegel's (1997) replication of Meznar et al. (1994), the authors start with the sample of 39 firms from Meznar et al. but eliminate firms that experienced other significant "events" (referred to in the literature as "confounding" events) during the Meznar et al. windows. That is, McWilliams and Siegel define the event in the same manner as Meznar et al. but control for confounding events over the entire window, rather than over only 3 days. This considerably reduces their sample sizes. In fact, for the longest window (41 days), there are no firms left in the sample. Firm names and dates can be found in the table that lists confounding events (McWilliams & Siegel, 1997, pp. 641-642).
Teoh et al. (1999) identified a sample of companies that announced voluntary divestment from 1983 through 1989. From this sample, they eliminated firms that did not subsequently divest. This left them with a sample of 46 firms. That is, they defined the event as the announcement of divestment by firms that made voluntary decisions and then actually followed through with divestment. Firm names and announcement dates are reported in the paper.
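The effect of screening for confounding events over the whole window, rather than over 3 days only, can be illustrated with a minimal sketch. This is not the procedure used in any of the studies; the firm labels, the confounding-event dates (expressed as trading-day offsets from the announcement day 0), and the window bounds are all hypothetical.

```python
def screen_sample(confounding_days, window):
    """Keep only firms with no confounding event inside the inclusive window."""
    start, end = window
    return [
        firm
        for firm, days in confounding_days.items()
        if not any(start <= d <= end for d in days)
    ]


# Hypothetical offsets (in trading days from the announcement) of other
# firm-specific news such as earnings releases or acquisitions.
confounding_days = {
    "firm_a": [-15, 8],
    "firm_b": [2],
    "firm_c": [-1],
    "firm_d": [],
}

short_window_sample = screen_sample(confounding_days, (-1, 1))
long_window_sample = screen_sample(confounding_days, (-30, 10))
print(short_window_sample, long_window_sample)
```

Applying the same screen to a longer window can only remove firms, never add them, which is why a sample that looks adequate for a 3-day window can shrink sharply, or disappear, when the window is extended.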
An examination of the four samples shows that although there is substantial overlap among the samples, no two are the same. This inconsistency is problematic because a small sample size increases the likelihood that "influential" outliers may have a relatively large impact on the reported results (Lichtenberg & Siegel, 1991). To the extent that "influential" outliers differ across samples, the results may also differ. We conclude that the differences in samples, given that all samples are small, may explain a large portion of the difference in results across these studies.

Identifying the Correct Event Date When There May Be Leakage of Information
Researchers interpret the abnormal returns as a measure of the effect of new information conveyed by the announcement of this event. That is, the abnormal return reflects the amount of wealth gain or loss to stockholders attributable to the event. For this inference of stockholder wealth gain or loss to be appropriate, the event date should be precise and accurate and not "confounded" by other concurrent announcements or events that also affect the stock price of the firm. It is generally very difficult to isolate a sample when no other confounding announcements or events have occurred at the announcement of the "test" event.
Depending on the event investigated, researchers can have a difficult time correctly identifying actual event dates because it is often difficult, if not impossible, to identify the precise date on which the information about the event reaches the market. Researchers sometimes handle this problem by expanding the event window to ensure that the event is contained within the longer window. However, this adjustment creates problems. As the event windows are expanded, the number of confounding concurrent events also increases, which reduces the power of the test statistic used to identify abnormal returns, by raising the amount of "noise" relative to information.
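The loss of power from a longer window can be made concrete with a stylized calculation. The sketch assumes, purely for illustration, that the event adds a fixed amount to the return on a single day and that daily abnormal returns are otherwise independent with a constant standard deviation, so the standard deviation of the cumulated return grows with the square root of the window length. The numerical values are hypothetical, and the calculation ignores confounding events, which would make matters worse.

```python
from math import sqrt


def signal_to_noise(event_effect, daily_sd, window_days):
    """Event effect relative to the standard deviation of a k-day cumulated return."""
    return event_effect / (daily_sd * sqrt(window_days))


event_effect = 0.01   # hypothetical 1% one-day price response
daily_sd = 0.02       # hypothetical 2% daily standard deviation of abnormal returns

ratios = {k: signal_to_noise(event_effect, daily_sd, k) for k in (1, 3, 41)}
for k, ratio in ratios.items():
    print(k, round(ratio, 3))
```

Under these assumptions the same price response is considerably harder to distinguish from noise in a 41-day window than in a 1-day or 3-day window.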
Even in situations in which precise event dates can be identified, there remain issues involving how to interpret the abnormal returns as a measure of the impact of the event. For instance, prior speculation regarding impending events is often referred to as "leakage" of information. Leakage makes it difficult to identify the date on which investors were able to react to the new information.
Leakage is not a problem for some events, such as the crash of a jetliner. In situations in which the event is clearly a surprise, such as a jetliner crash, the abnormal returns will reasonably measure the effect of the event on the firm's anticipated future profits. However, in many other circumstances, it is difficult for the researcher to determine what prior information investors might have had. For example, the announcement of an acquisition may follow months of speculation in the press and a flurry of activities by the acquisition players and competitors. Thus, by the time the merger is announced, investors may have already fully capitalized the information in the stock price. In a paper examining the effects of Supreme Court decisions on pending mergers, McWilliams, Turk, and Zardkoohi (1993) demonstrate that most of the impact on share price occurs at the time that the case is argued, rather than when the decisions are announced. That is, they show (with a 2-day window) that the significant effect on share prices occurred on the date of the argument, rather than the date of the decision. The authors contend that traders, having knowledge of how judges responded to previous arguments, were able to predict the decisions and traded based on the predicted outcomes, rather than waiting for the actual announcements.
An approach implemented in Teoh et al. (1999) is to track the passage of the entire legislative process culminating in the Antiapartheid Act in Congress. By specifically focusing on all prior events, this approach is a partial step toward handling leakages. Similarly, for CSR activities, a researcher could also identify relevant prior events and control for their effects. The standard approach for handling leakage is to identify the date on which information appears that might be used to predict the coming event. Researchers can then estimate the abnormal return for these "leakage events" (Salinger, 1992). Typical leakage events include shareholder meetings, public forums, press releases, and news articles indicating that discussions are under way.
If leakage created the need to examine other dates relevant to the divestitures in South Africa, one suggestion is to identify events (such as the announcement of shareholder resolutions) and then construct short windows and test for abnormal returns around these leakage events. The abnormal returns from all the individual events, including the leakage events and the public announcement of withdrawal, could then be summed to arrive at the CAR. This approach would isolate the effect of the divestments because it allows for the inclusion of all abnormal returns actually related to the event while minimizing the contamination from other events.
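The suggestion of summing abnormal returns over several short windows, rather than over one long window, can be sketched as follows. The function name, the trading-day indexing, and the data are hypothetical. Days are collected in a set so that overlapping windows are not counted twice, which is a design choice of this sketch rather than something specified in the text.

```python
def total_event_car(abnormal_returns, event_days, half_width=1):
    """Sum abnormal returns over short windows around each event day.

    abnormal_returns: dict mapping trading-day index -> abnormal return
    event_days: day indexes of leakage events and of the final announcement
    half_width: days included on each side of every event day
    """
    days = set()
    for d in event_days:
        days.update(range(d - half_width, d + half_width + 1))
    return sum(abnormal_returns.get(day, 0.0) for day in sorted(days))


# Hypothetical daily abnormal returns of 0.1% on each of 50 trading days.
abnormal_returns = {day: 0.001 for day in range(50)}

# A shareholder resolution (day 5), a press report (day 20), and the
# public announcement of withdrawal (day 40).
car_three_events = total_event_car(abnormal_returns, [5, 20, 40])

# Two events on adjacent days share part of their windows.
car_overlapping = total_event_car(abnormal_returns, [5, 6])
print(round(car_three_events, 6), round(car_overlapping, 6))
```

Only 9 of the 50 days enter the first total, so returns on the remaining days, which may reflect unrelated news, are excluded.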
One could argue that some discussions of divestment might not be publicly announced. However, in this situation, the information is not likely to be accessible to the market either. This is because information of this type that is "leaked" to traders but not publicly announced violates insider trading laws. As noted in Teoh et al. (1999), SEC Rule 10b-5 requires prompt disclosure by firms of any relevant information that has a material effect on firm value. Thus, it is likely that firms would quickly disclose any discussions related to divestment. Given the legal ramifications, we can reasonably assume that virtually all relevant information regarding divestment that was available to traders would also be available for use by researchers, from sources such as the Dow Jones Index. This allows researchers to control for leakage by calculating the impact of specific events, without increasing the noise through the use of excessively long windows.

Length of the Event Window for Cumulating Abnormal Returns and Drawing Inferences
From Table 2, we see that there is much less variety among these studies in the length of the event window or the period over which the abnormal returns are cumulated. Windows in well-designed event studies rarely exceed 3 trading days. This follows from a crucial assumption of the event study methodology: that the stock market is efficient. In an efficient market, new information is almost instantaneously incorporated in stock prices; that is, the stock price almost immediately reflects new information (Mitchell & Netter, 1989). Supporting the theory of efficient markets, there is strong empirical evidence that a short window generally incorporates all abnormal returns. Hence, most researchers use short windows, typically 1 to 3 trading days. Wright and Ferris (1997) report a 1-day window, Posnikoff (1997) reports 2- and 3-day windows, and Teoh et al. (1999) report a 3-day window. Wright and Ferris (1997), Posnikoff (1997), McWilliams and Siegel (1997), and Teoh et al. (1999) all draw their inferences from short windows. That is, they are only willing to infer that there was a negative (Wright & Ferris, 1997), positive (Posnikoff, 1997), or neutral (McWilliams & Siegel, 1997; Teoh et al., 1999) impact from divestment from the stock price reaction during a very short period of time. Of the studies being reassessed here, only Meznar et al. (1994) employed event windows that exceed 3 trading days and drew inferences based on very long windows (up to 41 trading days, which translates to more than 8 calendar weeks).
The second difficulty with extending the window beyond a day or two is that it becomes difficult, if not impossible, to isolate the effect of the event from the effect of other events that affect a firm's stock price. For large firms, there are likely to be several confounding events almost every trading day. Therefore, the longer the window, the more likely it becomes that other events will affect the stock price and cloud the results of an event study. The shorter the window, the less likely it is that confounding events will occur.
A study by Brown and Warner (1985) has been used to justify the use of longer windows. However, Brown and Warner demonstrate that their justification is appropriate only if confounding events are truly random, which is plausible if and only if the sample size is quite large. Therefore, it is not appropriate to invoke Brown and Warner for sample sizes of 7, 10, 19, 20, and 22 as Meznar et al. do in their 1998 reply/errata. To summarize, event studies are designed to isolate the financial impact of a particular event. When the event window is long, which in this context means more than 3 trading days, the method can easily generate spurious results. Of the five event studies of South African divestment, only Meznar et al. (1998) report results and infer significance from a long event window.

Sample Size and Statistical Tests
The statistical tests on which the event study methodology relies are based on normality assumptions that are plausible only with a large sample size. When the sample size is small (especially if it is less than 30), assuming normality is indeed quite heroic. This is problematic because if the data are not normally distributed, the test statistics are not reliable. Sample size is a concern for all the studies of South African divestment because none is based on a large sample. The Wright and Ferris (1997) study includes 31 firms; the Posnikoff (1997) study includes 40 firms; the Meznar et al. (1998) study uses samples of 7, 10, 19, 20, and 22; and the Teoh et al. (1999) study includes 46 firms. Although relatively small, samples of 31, 40, and 46 are not as problematic.
Small sample sizes are an especially significant concern for the Meznar et al. (1994, 1998) studies. For purposes of inferring an impact from divestment, they divided their sample into subsamples of 19 and 20 (Meznar et al., 1994) and 7, 10, and 22 (Meznar et al., 1998). None of these subsamples is large enough to justify imposing normality assumptions. The largest and most significant effects reported by Meznar et al. (1998) are for a sample size of 10. We would caution against drawing inferences from a sample of 10 firms, let alone extrapolating this particular result to all 207 of the South African divestments in their initial sample (Meznar et al., 1994, p. 1638). Small samples may account for results that are not robust, such as those that we observe for event studies of withdrawal from South Africa, because they magnify the problems of confounding events and the influence of outliers. That is, with a small set of firms, it is even less likely that confounding events, such as those discussed above, are randomly distributed. Furthermore, small samples are much more likely to lead to biased and imprecise estimates of abnormal returns (the appropriate expected return benchmark is still heatedly being debated in the finance field) because of outliers or confounding events.
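The influence of a single outlier in a sample of 10 firms can be made concrete with a short sketch. The cross-sectional t-statistic below is a standard textbook form, and the leave-one-out check is our own illustrative device rather than a procedure used in any of the studies; any CAR values passed to it in an example would be hypothetical.

```python
import math
from statistics import mean, stdev

def cross_sectional_t(cars):
    """t-statistic for H0: mean CAR = 0, using the cross-section of firms."""
    cars = list(cars)
    n = len(cars)
    if n < 2:
        raise ValueError("need at least two firms")
    return mean(cars) / (stdev(cars) / math.sqrt(n))

def leave_one_out_t(cars):
    """Recompute the t-statistic with each firm dropped in turn.

    Illustrative robustness check (an assumption of this sketch): large
    swings across the returned list signal that one firm drives the result.
    """
    cars = list(cars)
    return [cross_sectional_t(cars[:i] + cars[i + 1:])
            for i in range(len(cars))]
```

For instance, if nine firms have CARs that average to zero and a tenth has a CAR of -30%, the mean CAR for the full sample is -3%, yet dropping that one firm returns the mean, and hence the t-statistic, to zero. With 10 firms, each observation carries a tenth of the weight, so a single confounded firm can produce an apparently large average effect.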

Controlling for Other Relevant Events During the Estimation Window
Because the event study method is designed to estimate the financial impact of a unique event, it is crucial that the researcher control for confounding events (other firm-specific events that occur during the sample window). When a researcher uses a long window, there will likely be a large number of confounding events, any of which could conceivably induce an abnormal return. This is especially true for very large multinational firms, such as those that divested their South African assets (see Table 3 for a list of the firms involved).
It is also possible that events affecting nonsample firms may spill over, either directly or indirectly, to the stock prices of firms that experience an event. For example, the announcement that United Airlines is contemplating an acquisition of America West Airlines is likely to have an impact on the stock prices of their competitors, such as American Airlines and Southwest Airlines. Thus, searching news sources for mentions of the sample firms may be insufficient to identify all confounding problems. Obviously, it is impossible to control for all such factors, but using the shortest possible window minimizes the risk of nonevent spillovers. Therefore, it is likely that the Wright and Ferris (1997), Posnikoff (1997), and Teoh et al. (1999) studies are not "clouded" by confounding events because they use short windows. This is not true for the Meznar et al. (1994, 1998) studies, however. It is important to note that one cannot easily predict the effect of confounding events. For example, one might expect to find that the announcement of a major new government contract raises the stock price (Meznar et al., 1998, p. 719). Such an assumption is not appropriate, however. If the researcher has no information about the expected number and size of new contracts, what may seem like good news to the casual observer could actually have been disappointing news to the financial community. For example, the contracts may have been smaller in size than expected, or fewer new contracts than expected may have been announced. As with earnings announcements, changes in expectations are what drive movements in stock prices (Teoh & Hwang, 1991). Therefore, researchers should exclude firms that experience confounding events from their empirical analysis, rather than make assumptions about the size or direction of the effect of the confounding events. The latter approach is especially problematic when there are multiple events in a single window (see McWilliams & Siegel, 1997, pp. 641-642, for a list of the confounding events for firms in Meznar et al.'s [1994] sample).
In their 1998 reply, Meznar et al. perform some controls for confounding events. They use only short (2-day) windows when controlling for confounding events but use very long (41 trading days) windows when estimating the effects of the divestments. Thus, they do not treat events consistently. An additional inconsistency is that they treat the sample of 39 firms differently than their original sample of 62 firms. For 22 firms, controlling for confounds meant elimination from the sample (when constructing the sample for the 1994 research note); for 39 firms, it meant subtracting the effect of the confound (1998 reply). This makes it unclear why the sample used to estimate the long windows is appropriate. A consistent approach would involve adding back the 22 firms that were eliminated from the original sample and then treating all events equally.
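A consistent treatment of confounding events, in which the window used to screen for confounds is the same window used to cumulate abnormal returns, can be sketched as follows. The function names are hypothetical, and the weekday-only calendar is a simplifying assumption that ignores exchange holidays.

```python
from datetime import date, timedelta

def weekday_calendar(start, end):
    """Approximate trading calendar: Monday-Friday only (assumption:
    exchange holidays are ignored in this sketch)."""
    days, d = [], start
    while d <= end:
        if d.weekday() < 5:
            days.append(d)
        d += timedelta(days=1)
    return days

def has_confound(event_day, confound_days, trading_days, half_window):
    """True if any confounding announcement falls inside the event window."""
    i = trading_days.index(event_day)
    lo = max(0, i - half_window)
    hi = min(len(trading_days) - 1, i + half_window)
    window = set(trading_days[lo:hi + 1])
    return any(d in window for d in confound_days)

def clean_sample(events, confounds, trading_days, half_window):
    """Keep only firms with no confounding event in the window that will
    also be used to cumulate abnormal returns.

    events:    dict mapping firm -> event date
    confounds: dict mapping firm -> list of confounding-announcement dates
    """
    return {firm: day for firm, day in events.items()
            if not has_confound(day, confounds.get(firm, []),
                                trading_days, half_window)}
```

Because `half_window` is a single argument, the screen necessarily widens as the event window widens. A firm that survives a 3-day screen may well drop out under a 41-day screen, so the usable sample shrinks as the window lengthens.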

Clustering and Industry Effects
In an event study, researchers attempt to isolate the impact of an event on a firm's financial performance. Thus, it is important to control for other factors that can potentially influence a firm's financial return. Industry-specific factors are a problem if a relatively large number of firms in the sample belong to the same or a related industry. This is because errors in the expected-return model (Equation (1)) are likely to be correlated among firms in the same industry and occur during the same time period. This is referred to as "clustering," as the sample firms may cluster in a few (or one) industry (or at the same point in time). When clustering occurs, conditions that affect the industry may have an impact on the firms in a cluster. To attribute any stock price changes to a unique event, the researcher must be able to extract all stock price changes that are expected relative to market, industry, and other firm-specific factors. Only then can he or she have confidence that the remaining residual return is associated with the unique event.
Over a short interval, such as 1 to 3 trading days, and based on a large sample, it is unlikely that industry conditions are important, relative to market and firm-specific factors. Therefore, as indicated by Brown and Warner (1980, 1985), controlling only for the market return is adequate for short windows and large samples. However, over longer windows (such as 41 trading days, which is over 8 calendar weeks), industry factors are likely to be more important, so that extracting the market index alone may be insufficient. This is exacerbated when there is a small sample of firms and when there is clustering. For longer windows, especially with small samples, researchers should also control for industry effects. Wright and Ferris (1997) and Posnikoff (1997) do not report controlling for industry effects, but this is probably not a problem because they use 1- to 3-day windows. Clustering may be a serious problem in the Meznar et al. (1994, 1998) studies, however, because they use long windows (31 and 41 trading days) and small samples that include several firms in one industry (see Table 3). By dividing their sample into three time periods, with the largest sample drawn from the shortest time period (less than 13 months), they increase the likelihood that there are "clusters" of firms and that industry conditions will swamp the effect of any firm-specific event. Meznar et al.'s use of long windows and small samples, as Posnikoff (1997) points out, "would indicate that influences other than the announcement of disinvestment might account for the negative results" (p. 83).
To extract the industry factor, the researcher constructs industry portfolios by using equally weighted portfolios of all firms with matching four-digit SIC codes, excluding the test firms. When there are no matching SIC codes, the equally weighted market portfolio can be substituted. The firms' returns are regressed on the industry factor returns, market returns, and the risk-free rate. This procedure is implemented by Teoh et al. (1999) in their examination of South African divestment. They estimate an equation "where R_{i,t} is the firms' raw return, (on CRSP), R_{m,t} is the CRSP equally weighted market portfolio, R_{industry,t} is the equally-weighted portfolio of companies with the same four digit SIC code, and R_{TB,t} is the daily yield in percent per annum for 1-year treasury bills" (p. 76).
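The industry adjustment described above can be sketched in a few lines. We assume, for illustration only, a linear specification of the form R_{i,t} = a_i + b_i R_{m,t} + c_i R_{industry,t} + d_i R_{TB,t} + e_{i,t}; the coefficient symbols and function names are ours and should not be read as the exact notation or code of Teoh et al. (1999).

```python
import numpy as np

def industry_portfolio(returns, sic_codes, test_firms, sic):
    """Equally weighted daily return of all non-test firms sharing a
    four-digit SIC code.

    returns:   dict mapping firm -> 1-D array of daily returns (equal length)
    sic_codes: dict mapping firm -> four-digit SIC code
    Returns None when no peer exists, so the caller can substitute the
    equally weighted market portfolio, as the text describes.
    """
    peers = [f for f, code in sic_codes.items()
             if code == sic and f not in test_firms]
    if not peers:
        return None
    return np.mean([returns[f] for f in peers], axis=0)

def industry_adjusted_fit(firm_ret, market_ret, industry_ret, tbill_yield):
    """OLS of firm returns on a constant, market, industry and T-bill yield.

    Returns (coefficients [a, b, c, d], residuals). The residuals play the
    role of abnormal returns net of market and industry movements.
    """
    y = np.asarray(firm_ret, dtype=float)
    X = np.column_stack([np.ones(len(y)), market_ret,
                         industry_ret, tbill_yield])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef
```

Excluding the test firms from the industry portfolio matters: if a divesting firm were included among its own peers, part of the event's effect would be absorbed into the industry factor and removed from the residual.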
We have identified several important empirical problems with the event study analysis. As a caution to researchers using the event study method and interested readers, we summarize five important factors for ensuring the validity and applicability of event study results:
• appropriate event definition and sample,
• a short window,
• relatively large samples,
• controlling for other firm-specific relevant events, and
• controlling for industry effects.

Analysis of Impact on Additional Stakeholders
Because CSR inherently involves many stakeholders, it is inappropriate to ignore the impact of divestment on other groups. Other managerial decisions, such as corporate takeovers and leveraged buyouts (LBOs), have been studied more comprehensively, and these studies can be used as a model for measuring the overall impact of CSR decisions. When examining the effects of takeovers and LBOs, researchers have estimated the impact on nonfinancial stakeholders, such as workers and customers. Rosett (1990) examined the impact of changes in corporate control on workers and found no evidence of reduced wages in large companies after takeovers. Similarly, Lichtenberg and Siegel (1990) reported no decline in employment or compensation of plant workers following LBOs. Conversely, Gokhale, Groshen, and Neumark (1995) found that older workers experienced layoffs and wage reductions in the aftermath of hostile takeovers. On the consumer side, Chevalier (1995) reported that supermarkets taken private through LBOs do not raise prices. Wright and Ferris (1997) do not analyze the impact of divestment on other stakeholders. They do, however, speculate that other stakeholders, such as Black workers, may have suffered when American firms withdrew from South Africa because firms that divested had been supportive of the Black South Africans. In fact, they note that "a number of American business interests in South Africa . . . were . . . economically, politically, and socially beneficial to black South Africans" (p. 78). This contradicts the assumption of Meznar et al. (1994) that a loss to shareholders results in a transfer to external stakeholders and makes the implications of being "socially responsible" even more negative.
Neither Posnikoff (1997) nor Meznar et al. (1994) analyzes the impact of divestment on other stakeholders. Meznar et al. make the assumption that managers are making a trade-off between shareholders and other stakeholders but offer no evidence of any effect on these other stakeholders. Teoh et al. (1999) do analyze the impact on other stakeholders. They go well beyond event studies by examining other factors, such as pension funds' divestments, the effects on U.S. banks with South African lending activity, nonprice factors such as changes in institutional share ownership, macroeconomic effects on South Africa, and the nonfinancial effect on firms during the passage of the Comprehensive Antiapartheid Act of 1986. They report no significant effect on the South African financial sector or on American banks that financed assets in South Africa. They report only one significant effect: an increase in institutional shareholdings (by universities, pension funds, and the like) after divestment.
There is insufficient information in any of the studies of divestment to draw inferences about the overall impact of CSR, however. The Teoh et al. (1999) study provides the most complete evidence, but this evidence is still limited to the effect on investors, firms, and the financial sector. No attempt was made to measure the impact on workers or customers. To estimate the overall impact, researchers would have to examine information on issues such as wages, working conditions, career paths, consumer prices, and so on. We encourage management researchers to provide such evidence in the future.

CSR Implication: Do Firms Do Well by Doing Good?
Wright and Ferris (1997) imply that firms do not necessarily "do well by doing good." Invoking agency theory, they imply that what may be perceived as socially responsible behavior (divesting of South African assets) may in fact be self-serving behavior that increases the personal reputations of managers, at the expense of shareholders. Their result, that divestment lowered the value of the firm, is consistent with this hypothesis. It is important to note that they are not drawing a general conclusion about the effect of CSR on firm performance, however. They view the withdrawal from South Africa as a situation in which managers could disguise their personal self-interest under the guise of social responsibility. Posnikoff (1997) reports results that support the implication that firms do well by doing good. She focuses, however, on the negative effects of "doing bad" and speculates that doing good eliminates the potential to suffer for "doing bad." She also discusses the importance of the health of South Africa's economy. Because the economy declined following the divestment period, she concluded that firms that were aware of the poor economic outlook may have decided that it was a good time to withdraw assets from the country. Furthermore, investors, possessing the same information about the future of the economy, rewarded this foresight. This is consistent with her finding of a positive impact associated with withdrawal. To the extent that the impact was due to a correct prediction about the economy and not to reduction of the hassle factor, her result is not generalizable either. Meznar et al.'s (1994, 1998) results also imply that firms do not "do well by doing good"; quite the opposite, because they report huge losses as a result of a single CSR action: the divestment of South African assets. However, none of the sample sizes used by Meznar et al. is large enough to warrant generalizing about the wealth effects of the managerial decisions to divest or, therefore, about the wealth effects of CSR in general. McWilliams and Siegel (1999) and Teoh et al. (1999) report results that support a neutral effect of CSR on financial performance. This may be attributed to the existence of a trade-off between the costs and benefits of being socially responsible (McWilliams & Siegel, 1999) or to the insignificance of this particular event to large multinational firms (Teoh et al., 1999).

Managerial Implications
It is troubling when organizational researchers who examine socially responsible behavior do not discuss the managerial implications of acting in a socially responsible manner. Unfortunately, this limits the discussion to descriptions of the past, rather than prescriptions for future behavior. CSR requires an investment of firm resources. Therefore, it has important strategic implications that management researchers should comment on. In addition, consideration of CSR in terms of strategy formulation and implementation helps us to understand the implications of the results that have been reported for future strategic decision making.
Wright and Ferris (1997) do not examine the managerial implications of CSR behavior because they hypothesize that the divestment from South Africa resulted from the managers' selfish concerns with personal reputation, rather than concern about firm performance. They also speculate that other stakeholders, such as Black workers, may have suffered when American firms withdrew from South Africa. The implication of these, along with the reported negative effect, is that shareholders need to police managers who may be "overconsuming" socially responsible behavior at the expense of shareholders and other stakeholders. Posnikoff (1997) does not directly discuss managerial implications, but her results imply that managers should be socially responsible because there may be a positive financial return to socially responsible behavior. She offers several explanations of the positive impact. One is that "investors, as well as managers, may wish to 'consume' social responsibility and therefore are more favorably inclined toward firms that announce plans to leave South Africa" (p. 84). A second is that being socially responsible results in "the removal of what has been called the 'hassle' factor" (p. 84). This is important because "one of the most forceful methods of 'hassling' firms involved imposing the municipal purchasing restrictions that had direct effects on current and future sales and income" (Posnikoff, 1997, p. 84). She does point out, however, that, in the case of South African divestment, the positive effect may have resulted from (a) the removal of the hassle factor and (b) the fact that South Africa had poor economic prospects at the time. The implication of Posnikoff's discussion is that managers should be socially responsible but only in a reactive manner. That is, managers should react to the probability of future "hassles" by responding to current social responsibility demands. Meznar et al. (1994, 1998) also do not discuss the managerial implications of their results, which suggest that a first mover strategy in CSR is the least desirable option. That is, proactive socially responsible behavior (divesting early in the debate) creates a negative impact for shareholders, but being reactive (waiting until there is a legislative mandate) creates no such negative impact. The managerial implication seems clear: Your shareholders will suffer if you are socially proactive. If this is the case, it seems unlikely that there would have been any voluntary divestments, unless managers did not perceive shareholder wealth maximization as a priority (or care about the effect of stock price on their compensation or job security). This is unlikely for publicly held companies (the only type of ownership that can be included in an event study). It is especially improbable when, simply by waiting until divestment is required, shareholders are protected and the desired outcome in terms of social responsibility is met. Teoh et al. (1999) and McWilliams and Siegel (1997) draw no managerial implications from their studies because they find no effect of the CSR action they examined. This does not imply that CSR has a neutral effect on firm performance or on other stakeholders. It may simply be that this particular event, the divestment of South African assets, was not a significant financial event for the firms examined in these studies.

Conclusions
Several hypotheses have been advanced regarding the influence of CSR on firm value (Waddock & Graves, 1997). If CSR denies firms unique profitable investment opportunities or is costly to implement without attendant benefits to the firms' shareholders, socially responsible activities will lower firm value (Aupperle, Carroll, & Hatfield, 1985). On the other hand, socially responsible activities may increase firm value because CSR activities may (a) be demanded and valued by investors, (b) raise firm productivity by satisfying workers, (c) increase market share, and (d) reduce costly customer boycotts (Moskowitz, 1972). Finally, CSR may have no effect on firm value if the costs and benefits of CSR cancel each other out (McWilliams & Siegel, in press). This lack of consensus about the likely impact of CSR creates an opportunity for researchers to test multiple hypotheses, using a variety of methodologies, including event studies.
Five recent event studies of South African divestment have yielded conflicting results. From analyzing these studies, we conclude that this lack of consensus can be attributed to differences in how researchers address critical research design and methodological issues. Our illustration of the sensitivity of event study findings to minor changes in implementation could be useful to researchers contemplating using this method. The inferences we have drawn from our analysis may also help readers of event studies form opinions about conclusions that can reasonably be drawn from event studies, especially when they involve CSR.
From the various considerations noted throughout this article, we conclude that it is unlikely that South African divestments had a significant impact on firm value. More important, it is clear that the event studies alone have not taught us much about South African divestment. We argue that these conflicting results arose because of the following issues, which are not easily resolved:
• differences in defining the event,
• difficulty identifying the correct event date,
• differences in the length of the event window,
• differences in methods of controlling for confounding events, and
• differences in methods of controlling for industry effects.
Given these intractable difficulties, we suggest that further analysis of the stock price effects of divestment, based on event studies, is not a fruitful enterprise because it will not provide additional insights to managers regarding the impact of CSR.
CSR is based on the theory that firms have a responsibility to multiple stakeholders. Thus, to assess CSR actions, researchers should measure the impact of an event on numerous stakeholder groups. The Teoh et al. (1999) study does go well beyond event studies by examining other factors, such as pension funds' divestments, the effects on U.S. banks with South African lending activity, nonprice factors such as changes in institutional share ownership, macroeconomic effects on South Africa, and the effect on firms during the passage of the Comprehensive Antiapartheid Act of 1986.
Given the advantage of organizational researchers (relative to finance and economics researchers) in examining nonfinancial stakeholders, we encourage management scholars to conduct additional research on the effects of divestment and apartheid sanctions on labor, customers, suppliers, competitors, and South African firms. For example, did divestments by U.S. firms hurt South African firms by withdrawing managerial expertise and capital investments? Were the firms that purchased U.S. assets just as proficient in managing these assets as American firms so that the South African operations remained healthy under the leadership of the new owners? Did income and job advancement opportunities for Black South African workers change after apartheid? Did working conditions and child care provisions in South Africa improve or worsen after divestment? Did the prices of goods and services provided by the divested plants change?
These are important questions that remain unanswered a decade since the fall of apartheid. A study focusing on these questions may help us understand the efficacy of sanctions and corporate social responsibility actions in general. Management scholars can make a valuable contribution by focusing on broader impacts, including the effect of CSR on multiple stakeholder groups. This will provide much-needed guidance to corporate executives, who are increasingly under pressure to justify CSR decisions.

Notes
1. Trading days are days when the stock exchanges are open. Therefore, there are, at most, 5 trading days in a calendar week.
2. The Meznar, Nigh, and Kwok study has been published twice in Academy of Management Journal. In 1994, Meznar et al. published a research note in which they divided their sample of 39 firms into two subsamples. In 1998, they published a reply to McWilliams and Siegel (1997), using the same data, but divided their sample into three (rather than two) subsamples. The results reported in the two notes are similar.
3. The McWilliams and Siegel (1997) study is simply a replication of the Meznar et al. (1994) study. Therefore, the sample sizes and windows correspond to those employed by Meznar et al. The difference in results is due to the elimination of firms from the sample when confounding events occurred during the reported window.
4. Including firm names and dates is important because it allows other researchers to replicate and extend studies. In event studies, all the data come from public sources. Therefore, when the samples are of a reasonable size, the authors should be required to include these data.
5. For example, Patell and Wolfson (1984) report that evidence on earnings announcements "showed that abnormal returns to knowledge of earnings were greatest within 30 minutes of the announcement, with most of that return occurring within the first 5 to 10 minutes" (p. 223).