Betweenness centrality measures for directed graphs

This paper generalizes Freeman’s geodesic centrality measures for betweenness on undirected graphs to the more general directed case. Four steps are taken. First, the point centrality measure is generalized for directed graphs. Second, a unique maximally centralized graph is defined for directed graphs, holding constant the numbers of points with reciprocatable (incoming and outgoing) versus only unreciprocatable (outgoing only or incoming only) arcs, and focusing the measure on the maximally central arrangement of arcs within these constraints. Alternatively, one may simply normalize on the number of arcs. This enables the third step of defining the relative betweenness centrality of a point, independent of the number of points. This normalization step for directed centrality measures removes Gould’s objection that centrality measures for directed graphs are not interpretable because they lack a standard for maximality. The relative directed centrality converges with Freeman’s betweenness measure in the case of undirected graphs with no isolates. The fourth step is to define measures of graph centralization in terms of the dominance of the most central point.


Introduction
Betweenness centrality (Freeman 1977, 1979, 1980) is a fundamental measurement concept for the analysis of social networks. The recent book by Hage and Harary (1991) demonstrates some of its many descriptive and predictive uses. It was originally defined, however, only for undirected graphs. This constitutes a rather severe limit on its potential utility for directed (nonsymmetric) graphs and social networks. Gould (1987) argues that measures of betweenness centrality are possible for the directed case, but that owing to lack of a unit of measurement and of a unique definition of the maximally centralized graph for the directed case, the measure remains uninterpretable.
The present study identifies a unit of measurement and a uniquely maximal centralized graph for the directed case, and so provides for a proper generalization of betweenness centrality to directed graphs.

Generalization of betweenness centrality to directed graphs
Freeman (1980) showed how betweenness centrality for undirected graphs is derived from the column totals of a single matrix of the numbers of pairwise dependencies of each point on every other point in terms of mediating access in reaching third points.
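The following paragraphs parallel Freeman's (1980: 587-588) derivation of the measure of pair-dependency, generalized here (with appropriate modifications) to directed graphs.
"Consider a directed graph representing the nonsymmetric relation 'communicates to' for a set of people. When point i communicates to point j, j is said to be adjacent from i. A set of arcs linking i to j to k constitutes a path from $p_i$ to $p_k$. The shortest path linking one point to another is called a geodesic. There can, of course, be more than one geodesic for any ordered pair of points."
"Now let $g_{ik}$ = the number of geodesics from $p_i$ to $p_k$, and $g_{ik}(p_j)$ = the number of geodesics that contain point $p_j$ as an intermediary in the geodesics from $p_i$ to $p_k$, then:
$$b_{ik}(p_j) = \frac{g_{ik}(p_j)}{g_{ik}}$$
Thus, $b_{ik}(p_j)$ is the proportion of geodesics linking $p_i$ to $p_k$ that contain $p_j$: it is an index of the degree to which $p_i$ and $p_k$ need $p_j$ in order to communicate along the shortest path linking them together. Since it is a proportion, $0 \le b_{ik}(p_j) \le 1$. Moreover, when $b_{ik}(p_j) = 1$, $p_j$ is strictly between $p_i$ and $p_k$; they cannot communicate along the geodesic(s) linking them without its support in relaying messages. In such a situation the communication from $p_i$ to $p_k$ is completely at the whim of $p_j$: he can distort or falsify any information passing through him."
"Now we can define pair-dependency as the degree to which a point, $p_i$, must depend upon another, $p_j$, to relay its messages along geodesics to other reachable points in the network. Thus, for a network containing n points,
$$d_{ij} = \sum_{k=1}^{n} b_{ik}(p_j) \qquad (i, j, k \text{ distinct})$$
is the pair-dependency of $p_i$ on $p_j$."
"We can calculate the pair-dependencies of each point on every other point in the network and arrange the results in a matrix, $D = (d_{ij})$. Each entry in D is an index of the degree to which the point designated by the row of the matrix must depend on the point designated by the column to relay messages to others. Thus D captures the importance of each point as a gatekeeper with respect to each other point, facilitating or perhaps inhibiting its communication."
"Whenever any person is in a position to be a gatekeeper for communications, others must depend on that person. A gatekeeper position, however, may be either rather wide or quite narrow in its impact. . ."
"Obviously, such local pair effects are of great potential importance to the points affected. In large networks, where individuals may be at considerable distance from one another, global patterns may be submerged and pair effects may be the main factors determining information flows."
Freeman (1980) shows for undirected graphs that the sums down the columns of the pair-dependency matrix D are a measure of the betweenness centrality of the points:
$$\sum_{i=1}^{n} d_{ij} = 2\,C_B(p_j)$$
The pair-dependency column sum of point j is twice the value of $C_B(p_j)$, its betweenness centrality: twice because in an undirected graph $b_{ik}(p_j) = b_{ki}(p_j)$, so each pair of points is counted once in each order. To generalize Freeman's measure of betweenness point centrality, $C_B(p_j)$, for directed graphs, we use the equality:
$$C_B(p_j) = \sum_{i=1}^{n} d_{ij} = \sum_{i=1}^{n} \sum_{k=1}^{n} b_{ik}(p_j) \qquad (i, j, k \text{ distinct})$$
We divide further by two for comparability with Freeman's measure for undirected graphs.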

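The pair-dependency derivation in the preceding section can be illustrated with a minimal Python sketch. It assumes the graph is given as a list of points and a list of (sender, receiver) arcs; the function names `geodesic_counts`, `pair_dependency` and `directed_betweenness` are hypothetical and are not taken from the paper. The sketch returns the raw column sums of D, without the further division by two.

```python
from collections import deque


def geodesic_counts(adj, source):
    """Breadth-first search from `source`.

    Returns (dist, sigma): dist[v] is the geodesic length from source to v,
    sigma[v] is the number of geodesics from source to v. Unreachable points
    are absent from both dictionaries.
    """
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:            # first time v is reached
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:   # u precedes v on a geodesic
                sigma[v] += sigma[u]
    return dist, sigma


def pair_dependency(nodes, arcs):
    """Pair-dependency matrix D as nested dicts: D[i][j] = sum over k of b_ik(p_j)."""
    adj = {v: [] for v in nodes}
    for u, v in set(arcs):               # ignore duplicate arcs
        adj[u].append(v)
    info = {v: geodesic_counts(adj, v) for v in nodes}
    D = {i: {j: 0.0 for j in nodes} for i in nodes}
    for i in nodes:
        dist_i, sigma_i = info[i]
        for k in nodes:
            if k == i or k not in dist_i:
                continue                 # k must be reachable from i
            for j in nodes:
                if j in (i, k) or j not in dist_i:
                    continue
                dist_j, sigma_j = info[j]
                # j lies on a geodesic from i to k iff the distances add up
                if k in dist_j and dist_i[j] + dist_j[k] == dist_i[k]:
                    g_ik_through_j = sigma_i[j] * sigma_j[k]
                    D[i][j] += g_ik_through_j / sigma_i[k]
    return D


def directed_betweenness(nodes, arcs):
    """Raw directed betweenness of each point: the column sums of D."""
    D = pair_dependency(nodes, arcs)
    return {j: sum(D[i][j] for i in nodes) for j in nodes}


# The two-arc chain i -> j -> k: only j mediates a geodesic.
print(directed_betweenness(["i", "j", "k"], [("i", "j"), ("j", "k")]))
# {'i': 0.0, 'j': 1.0, 'k': 0.0}
```

For an undirected graph entered as arcs in both directions, each column sum is twice Freeman's value, which is why halving restores comparability.
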
The maximally centralized graph
It is desirable for some purposes to have a measure of betweenness centrality that is not affected by the number of points in the graph (Leavitt 1951). Freeman (1977: 37-40) derived such a measure by dividing $C_B(p_j)$ by the maximal value it can take in a graph with the same number of points. In showing the existence of the maximum, he also showed that the unit of measurement for increases in a point's centrality in a graph was the addition of an arc to the graph that increased the number of paths through the point. The measure of centrality monotonically increases as such arcs are added to a graph, and reaches its maximum when no more such arcs can be added. He also proved by this procedure that an undirected graph with maximal betweenness centrality for one of its points is always a star, where the central point is adjacent to all of the other points and none of the other points are adjacent to one another.
For meaningful interpretation of a measure of centrality, one needs to be able to derive its minimum and maximum, and a unit of measurement with respect to which it is monotone. Gould (1987) examined this question and was unable to define either a maximum value or a unit of measurement for directed graphs. Thus, he argued that while it was possible to define a betweenness centrality measure for directed graphs, one could not interpret the measure. Borgatti and Bonacich (1989) also arrived at the measure of $C_B(p_j)$ for directed graphs, without the normalization and unit of measurement.
In a directed graph, if $n_O$ is the number of points with outgoing arcs and $n_I$ the number with incoming arcs, then in a star graph the product $(n_I - 1)(n_O - 1)$ is the number of paths through the center. The directed graph with maximal point betweenness, given $n_I$ and $n_O$, is a star in which points with incoming and outgoing arcs are maximally disjoint. In such a star, each arc contributes to the betweenness centrality of the center point. Let $n_S$ = the number of points with reciprocated arcs. The betweenness centrality of the most centralized star for a directed graph is:
$$\max C_B(p_j) = (n_I - 1)(n_O - 1) - (n_S - 1)$$
Again we divide by 2 for consistency with Freeman's measure. In the case of an undirected graph, $n_I = n_O = n_S = n$, and the formula for betweenness centrality of a star simplifies to:
$$(n - 1)^2 - (n - 1) = n^2 - 3n + 2$$
Divided by 2, this converges with Freeman's result for the betweenness centrality of the most central point in the most centralized (undirected) graph.
Now let us duplicate for the directed case Freeman's procedure for adding arcs to a graph, beginning with isolated points, so as to increase the betweenness centrality of a single most central point until its maximum is reached. To do so, we must add arcs that connect peripheral points in a star to the central point. In the directed case, we must add the arcs so as to form the maximum number of (directed) paths that run through the center point. The simplest graph with a directed star pattern, as shown in Fig. 1, begins with two arcs, from i to j and j to k, j being the central point that sits on the path from i to k. To obtain the maximal value of $C_B(p_j)$ with each new arc with additional points, one must alternate between addition of incoming and outgoing arcs with respect to the central point j. Figure 2 shows two ways of adding an arc to the graph in Fig. 1 to span another point and retain maximal betweenness centrality. For the first $n - 1$ arcs added to a graph with n points, reciprocated connections between the maximal betweenness center and periphery are avoided because they produce less of an increase in betweenness centrality than adding only arcs that are unreciprocated.
Figure 3 shows a graph with maximal betweenness centrality of point j with 5 points and 4 arcs. It has 4 (directed) paths mediated by point j and gives $C_B(p_j) = 4$. Figure 4 shows two graphs where instead of the last arc $m \to j$ in Fig. 3 we add arcs that reciprocate the relation either between i and j or between j and k. Here $C_B(p_j) = 2$ in one case and $C_B(p_j) = 3$ in the other. These are not graphs of maximal centralization.
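The values quoted for Figs. 3 and 4 can be checked with a short Python sketch. The figures themselves are not reproduced here, so the arc lists below are assumptions reconstructed from the description: two arcs into j and two arcs out of j for Fig. 3, and the arc from m to j swapped for a reciprocating arc in each Fig. 4 variant. The helper `star_center_betweenness` is a hypothetical name, and it applies only to star graphs, where every path through the center is the unique geodesic between its end points.

```python
def star_center_betweenness(center, arcs):
    """Count directed paths i -> center -> k with i != k in a star graph.

    Assumes every arc touches the center and there are no self-loops, so each
    such path is the only geodesic from i to k and contributes exactly 1.
    """
    arcs = set(arcs)
    if not all(center in arc for arc in arcs):
        raise ValueError("every arc of a star must touch the center")
    senders = {u for u, v in arcs if v == center}    # points with an arc into the center
    receivers = {v for u, v in arcs if u == center}  # points with an arc out of the center
    return sum(1 for i in senders for k in receivers if i != k)


# Assumed arc lists (reconstructed from the text, not taken from the figures).
fig3 = [("i", "j"), ("m", "j"), ("j", "k"), ("j", "l")]
fig4_reciprocate_i = [("i", "j"), ("j", "i"), ("j", "k"), ("j", "l")]
fig4_reciprocate_k = [("i", "j"), ("k", "j"), ("j", "k"), ("j", "l")]

print(star_center_betweenness("j", fig3))                # 4
print(star_center_betweenness("j", fig4_reciprocate_i))  # 2
print(star_center_betweenness("j", fig4_reciprocate_k))  # 3
```

A reciprocated arc adds a sender or receiver that is already on the other side of the center, and the pair it forms with itself does not count as a path, which is why both variants fall short of 4.
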
Once the first $n - 1$ directed arcs are added to maximize the betweenness centrality of the center point with the addition of each arc, the star pattern so formed connects each of the graph's $n - 1$ peripheral points to the center. This is the maximum of star-pattern centralities for asymmetric graphs, but it is not the maximum for a directed (nonsymmetric) graph, since one can continue to increase the centrality of the central point by adding reciprocated arcs (again, in an alternating fashion), in which case one arrives at Freeman's undirected star as the limiting case.
Since we have established a procedure for adding arcs so as to monotonically and maximally increase the measure of centrality, the question of the meaningfulness of the measurement for betweenness centrality in directed graphs is not a matter of a lack of a unit of measurement. The difficulty is rather that, while a maximum is well defined for asymmetric and undirected (symmetric) graphs, the maximum is not yet uniquely defined for the general case of the directed (nonsymmetric) graph.
Defining betweenness centrality for directed graphs must depend on a specification of the parameters $n_I$, $n_O$ and $n_S$. At one extreme we have asymmetric graphs where $n_S = 0$, and at the other we have undirected graphs where $n_S = n$. Maximal betweenness is in each case a specialization of the general formula:
$$\max C_B(p_j) = (n_I - 1)(n_O - 1) - (n_S - 1)$$
What is the maximal betweenness centrality for the intermediate case where $0 < n_S < n$? We need to define a maximal value that takes into account a limit on the number of points with reciprocatable arcs having both incoming and outgoing arcs (only for such points can arcs be reciprocated). This implies a limit on the remaining points with unreciprocatable arcs (with only incoming or only outgoing arcs). The simplest way to do this is to hold constant, as observed in the actual graph, the number $n_S$ of points with reciprocatable arcs and the numbers of those with only unreciprocatable arcs ($n_I - n_S$ and $n_O - n_S$) (see note 1). Then maximum point centrality is obtained by our procedure for adding arcs up to the limit on the number of points with only in- or outwardly oriented arcs, and then up to the limit on the separate set of points with reciprocatable arcs (the unit of measurement procedure for constructing the maximal graph requires we add arcs at first as unreciprocated arcs between the center and each peripheral point and then proceed to make them reciprocal, following the rule of alternating orientation). Then our general formula will apply where the parameters $n_I$, $n_O$, and $n_S$ are fixed from the empirical graph under study (see note 2).
Note 1. Alternately, we may simply normalize on the number of arcs, as follows. Take the k arcs, allocate $k + 1 - n$ of them to $n_S$ reciprocatable points, and divide the remainder among the remaining points so that $n_I + n_O - n_S = n$ and $n_I$ and $n_O$ differ at most by one. This defines the most centralized graph holding constant only the number of arcs, but does not control for the degree of symmetry in the graph.
Note 2. The $C_B(p_j)$ measure for the maximum graph, in the case of undirected graphs with no isolates, will converge with Freeman's measure (multiplied by 2).
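As a rough illustration of holding the point types constant, the following Python sketch reads the three parameters off an arc list and evaluates the maximum $(n_I - 1)(n_O - 1) - (n_S - 1)$. It rests on one interpretive assumption: a point counts toward $n_S$ when it has both incoming and outgoing arcs (the "reciprocatable" reading). The function names are hypothetical.

```python
def arc_type_counts(arcs):
    """Return (n_I, n_O, n_S) for a directed graph given as (sender, receiver) arcs.

    n_I: points with at least one incoming arc
    n_O: points with at least one outgoing arc
    n_S: points with both (assumed reading of "reciprocatable")
    """
    arcs = set(arcs)
    has_out = {u for u, _ in arcs}
    has_in = {v for _, v in arcs}
    return len(has_in), len(has_out), len(has_in & has_out)


def max_star_betweenness(n_in, n_out, n_sym):
    """Maximum raw betweenness of the center, before any division by two."""
    return (n_in - 1) * (n_out - 1) - (n_sym - 1)


# Assumed arcs for a five-point directed star with two arcs in and two arcs out.
directed_star = [("i", "j"), ("m", "j"), ("j", "k"), ("j", "l")]
print(arc_type_counts(directed_star))                         # (3, 3, 1)
print(max_star_betweenness(*arc_type_counts(directed_star)))  # 4

# Undirected star on n = 5 points, entered as arcs in both directions.
leaves = ["a", "b", "c", "d"]
undirected_star = [("j", x) for x in leaves] + [(x, "j") for x in leaves]
n_in, n_out, n_sym = arc_type_counts(undirected_star)
print(max_star_betweenness(n_in, n_out, n_sym) / 2)           # 6.0
```

The last value agrees with Freeman's undirected maximum $(n^2 - 3n + 2)/2$ for $n = 5$.
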

Relative betweenness centrality for directed graphs: interpretability
The relative betweenness centrality of a point j is its betweenness centrality $C_B(p_j)$ divided by the maximum betweenness centrality, $\max C_B(p_j)$, holding constant the number of nodes in the graph with reciprocatable arcs, $n_S$, or only unreciprocatable arcs, $n_I - n_S$ and $n_O - n_S$:
$$C'_B(p_j) = \frac{C_B(p_j)}{(n_I - 1)(n_O - 1) - (n_S - 1)}$$
The measure $C'_B(p_j)$ converges with Freeman's relative betweenness centrality measure in the case of undirected graphs with no isolates. Normalizing the point centrality measure establishes comparability between graphs not only of different sizes but of differing degrees of symmetry, and prevents the measure of relative centrality in graphs with very few reciprocated relations from being deflated because of their lack of symmetry. This step (or the alternate one in note 1) in normalizing the directed centrality measure removes Gould's (1987) objection that centrality measures for directed graphs are uninterpretable because they lack a standard for maximality.
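The normalization can be sketched in Python as follows. The sketch assumes the raw betweenness values have already been computed by some other routine and are passed in as a dictionary; it also assumes that $n_S$ counts points with both incoming and outgoing arcs. The name `relative_betweenness` is hypothetical.

```python
def relative_betweenness(raw, arcs):
    """Divide raw betweenness values by (n_I - 1)(n_O - 1) - (n_S - 1).

    raw:  dict mapping each point to its raw directed betweenness C_B
    arcs: iterable of (sender, receiver) pairs for the same graph
    """
    arcs = set(arcs)
    has_out = {u for u, _ in arcs}
    has_in = {v for _, v in arcs}
    n_in, n_out = len(has_in), len(has_out)
    n_sym = len(has_in & has_out)        # assumed reading of "reciprocatable" points
    maximum = (n_in - 1) * (n_out - 1) - (n_sym - 1)
    if maximum <= 0:
        raise ValueError("maximum betweenness is not positive for this graph")
    return {point: value / maximum for point, value in raw.items()}


# Directed chain a -> b -> c -> d: b mediates (a,c) and (a,d); c mediates (a,d) and (b,d).
chain_arcs = [("a", "b"), ("b", "c"), ("c", "d")]
chain_raw = {"a": 0, "b": 2, "c": 2, "d": 0}
chain_relative = relative_betweenness(chain_raw, chain_arcs)
print(chain_relative["b"])  # 2 divided by the maximum of 3

# Undirected star on 5 points: the center mediates all 4 * 3 = 12 ordered leaf pairs.
leaves = ["w", "x", "y", "z"]
star_arcs = [("j", p) for p in leaves] + [(p, "j") for p in leaves]
star_raw = {"j": 12, "w": 0, "x": 0, "y": 0, "z": 0}
print(relative_betweenness(star_raw, star_arcs)["j"])  # 1.0
```

In the undirected star the denominator is $(n - 1)(n - 2) = 12$, so the ratio equals Freeman's relative measure $2C_B/(n^2 - 3n + 2)$ computed from the halved value.
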
What is the interpretation for the betweenness centrality of a point relative to one with maximal betweenness in a graph of the same number of points and of points with reciprocatable or only unreciprocatable arcs of each type? By holding constant the number of points in terms of their types of arcs, we are measuring the extent to which the existing arcs between these types of points could be rearranged (adding arcs to the center and deleting other arcs to peripheral points) to increase the centrality of the dominant point.

Measures of graph centralization
Following Freeman (1977: 39), the overall centralization of a graph is defined as the average difference in centrality between the most central point and all others. We define $C_B(p_{k^*})$ as the largest centrality value attained by any point in the graph.
Development of a standardized measure of betweenness for directed graphs is an important step in the development of scientific propositions concerning social networks. Centrality is a fundamental property of actors and has numerous predictive consequences. Freeman's measures of betweenness centrality are conceptually well defined and known to provide correspondence between theoretical predictions and substantive findings about network centralities where the social relations are symmetric, as with the Bavelas small-group experiments (Bavelas 1950; see Freeman et al. 1980). Extending them to directed graphs increases the interpretability of similar findings and the testability of theoretical propositions in the larger arenas of directed graphs and social networks with directed relations.