Community structure and centrality effects in the South African company network

This paper conducts a search for community structure in the South African company network, a social network whose elements are South African companies listed on the Johannesburg Stock Exchange. Companies are connected in this network if they share one or more directors on their respective boards. Discovered clusters, called communities, can be considered to be compartments of the network working relatively independently of one another, making their distribution and composition of some interest. We test whether the discovered communities of companies are (a) statistically significant, and (b) related to other attributes such as sector membership or market capitalization. We also investigate the relationship between the centrality of a company’s position in the network and its market capitalization.


Introduction
For many years it has been popular to consider the causes and consequences of companies "interlocking", i.e. having one or more directors in common (e.g. see Davis, Yoo and Baker (2003) for a review). In more recent times, the abundance of directorship information and new techniques for the analysis of complex systems have led to a focus on the entire system of partially interlocked companies within a corporate landscape (usually but not necessarily a country). In this mode of research, one views the collection of companies as a social network: a graph in which a set of nodes (the companies) are linked together by edges indicating the presence of some kind of social relationship, in this case the presence of one or more common directors on both boards, i.e. an interlock.
This type of analysis is unconventional in two respects. Firstly, it focuses on the relationships between entities rather than on the attributes of independent sampling units. Secondly, it aims to describe the structure of the system as a whole rather than assess the individual entities. Taken together, this allows one to describe the extent and nature of "interconnectedness" in a corporate system using a small number of summary statistics. Analyses of corporate board networks have been conducted for the US (Newman, Strogatz & Watts, 2001; Davis et al., 2003; Conyon & Muldoon, 2006), UK (Conyon & Muldoon, 2006), Germany (Conyon & Muldoon, 2006; Kogut & Belinky, 2008), Switzerland and the Netherlands (Heemskerk & Schnyder, 2008), Denmark, Sweden and Norway (Sinani et al., 2008), and South Africa (Durbach & Parker, 2009).
New analytical methods have made it possible to consider networks in ever greater detail. An important development is the assessment of whether the nodes making up a social network can be organised into clusters, such that many relationships exist between members of the same cluster and comparatively few exist between members of different clusters. Such clusters, also called communities, can be considered to be compartments of the network working relatively independently of one another (Fortunato, 2010). An example of a network with strong community structure is shown in Figure 1. Community detection has found application in many areas: networks of interacting proteins (Rives & Galitski, 2003), gene expression networks (Wilkinson & Huberman, 2004), metabolic networks (Holme, Huss & Jeong, 2003), mobile phone communications (Blondel et al., 2008), and collaboration networks between academics (Girvan & Newman, 2002).
In this paper, we conduct a search for community structure in the South African company network (shown in Figure 1), a social network in which the nodes are South African companies listed on the Johannesburg Stock Exchange (at March 2008), and two companies are connected if they share one or more directors on their respective boards. We also test whether the discovered communities of companies are related to other attributes such as sector and market capitalization, and investigate the relationship between the centrality of a company's position in the network and its market capitalization. The paper is structured as follows: we begin by giving a brief introduction to network statistics, followed by a summary of previous research on interlocking corporate boards. We then describe the company network used for this study and the algorithm used to detect community structure. The main results are then presented: first basic descriptive results; then community detection results; and finally results obtained from an additional investigation into the relationship between network centrality and market capitalization. A final section contains conclusions.

Network basics
In this section the example network in Figure 1 is used to illustrate various quantities of interest. The network shows the existence of some form of relational tie (edges) between entities (nodes). Edges may be undirected or directed, although here our interest is limited to the undirected case. If two nodes are connected by a single edge they are known as adjacent, with nodes B and C being an example of a pair of adjacent nodes. Adjacencies can be collected into an adjacency matrix $A$ with elements $A_{ij} = 1$ if nodes $i$ and $j$ are connected and 0 otherwise. Higher values for $A_{ij}$ are possible if multiple edges are allowed between nodes, but this will not concern us here. The adjacency matrix plays a prominent role in many network computations, including ones we consider later.
In discussing the connectivity of a network and its constituent nodes, two measures are of fundamental importance: degree and distance. The degree of a node is simply the number of edges leaving that node. The degrees of nodes A, B, and C in Figure 1 are 7, 5, and 4 respectively. The geodesic distance (or just 'distance') between a pair of nodes is given by the smallest number of edges that must be traversed to get to one node from the other. The distance between nodes A and B is 3 while nodes B and C are separated by a distance of 1. Intuitively it is clear that nodes within the same community should tend to be separated from each other by smaller distances than nodes in different communities.
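As an illustration only, the adjacency matrix, degree and geodesic distance described above can be sketched in a few lines of Python. The toy graph and the function names (`adjacency_matrix`, `degree`, `distance`) are hypothetical and are not taken from Figure 1 or from the software used in this study; distance is found here by breadth-first search, which is one standard choice for unweighted graphs.

```python
from collections import deque

def adjacency_matrix(n, edges):
    """Build a symmetric 0/1 adjacency matrix for an undirected graph."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = 1
        A[j][i] = 1
    return A

def degree(A, i):
    """Degree of node i: the number of edges attached to it."""
    return sum(A[i])

def distance(A, source, target):
    """Geodesic distance via breadth-first search; None if unreachable."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for v, connected in enumerate(A[u]):
            if connected and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None

# Hypothetical toy graph (not Figure 1): a path 0-1-2-3 plus the edge 1-3.
edges = [(0, 1), (1, 2), (2, 3), (1, 3)]
A = adjacency_matrix(4, edges)
```

In this toy graph node 1 has degree 3, and nodes 0 and 3 are separated by a distance of 2 (via node 1).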

Causes and consequences of "interlocks" between companies
Previously hypothesised causes of interlocks include collusion, coopting sources of environmental uncertainty, monitoring, enhancing reputation and legitimacy, career advancement, and elite social ties (Mizruchi, 1996). With the exception of collusion (Pennings, 1980), evidence exists in favour of all these hypothesised causes, but this evidence tends to vary from study to study. Both Thompson and McEwen (1959) and Burt (1983) found evidence suggesting cooptation as a source of interlocks, but Ornstein (1980) and Palmer (1983) found that ties broken by death or retirement are not re-established, militating against cooptation. Monitoring explanations have been supported in Dooley (1969) and Mizruchi and Stearns (1988), with unprofitable companies found to be more likely to form interlocks, especially with banks (Richardson, 1987; Mizruchi & Stearns, 1988). Mace (1971) and Useem (1984) found that directors are often chosen on the basis of their own reputation, which may in turn be used to signal the reputation of the company on whose board they sit (Selznick, 1984). As a consequence, directors are more likely to be nominated to new boards if they are already a member of several boards (Davis, 1993). The link between reputation and membership of the upper social stratum has been supported by Zeitlin (1974) and Useem (1984).
The research into the consequences of interlocks for company behaviour is well summarised by Mizruchi (1996), while Davis, Yoo and Baker (2003) and Di Pietra et al. (2008) present more recent evidence. To summarise, interlocking directorates have been shown to facilitate the adoption of executive compensation practices such as "golden parachutes" (Cochran, Wood & Jones, 1985), "greenmail" (Kosnik, 1987), and "poison pills" (Davis, 1991; Davis & Greve, 1997). Others have found that the amount of external financing a company receives is related to bank representation on its board (Mizruchi & Stearns, 1988). Interlocks also serve to facilitate contributions to political candidates and congressional testimony (Mizruchi, 1992), as well as switching behaviour between stock exchanges (Rao, Davis & Ward, 2000). Di Pietra et al. (2008) find that the number of additional directorships held by a board of directors (expressed as a proportion of board size) has a positive association with the market value of a company. Interlocked directors tend to be less effective at monitoring (Fich & Shivdasani, 2006) and more likely to be absent from board meetings (Jiraporn, 2007).

Data
The network that we investigate comprises the boards of directors of all JSE-listed South African companies as at 1 March 2008. This information was obtained from the McGregor BFA database and checked manually for consistency. One problem that arises is that companies may provide different levels of detail in the names of their directors, for example in the number of initials that are specified. In some cases it is clear that the director is in fact the same person (for example, NJM Canca and NJMG Canca are presumably the same person), but in other cases the correct decision is not clear. Our approach has been to treat any names which are identical in surname and first initial as belonging to the same person. The full dataset consists of 2653 directors and 397 companies, but the community detection algorithm is run on the largest connected component of the network, consisting of 2048 directors and 294 companies.
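The data preparation described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure actually run on the McGregor BFA data: names are assumed to be written as initials followed by a single-word surname, the board listings are invented (only the two Canca name variants come from the text), and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def director_key(name):
    """Key a director by (surname, first initial): 'NJM Canca' -> ('canca', 'n')."""
    parts = name.split()
    return (parts[-1].lower(), parts[0][0].lower())

def company_network(boards):
    """Link two companies whenever their boards share at least one director key."""
    seats = defaultdict(set)  # director key -> companies on whose boards they sit
    for company, directors in boards.items():
        for name in directors:
            seats[director_key(name)].add(company)
    neighbours = {company: set() for company in boards}
    for companies in seats.values():
        for a, b in combinations(sorted(companies), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return neighbours

def largest_component(neighbours):
    """Return the set of companies in the largest connected component."""
    seen, best = set(), set()
    for start in neighbours:
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for v in neighbours[u] - component:
                component.add(v)
                stack.append(v)
        seen |= component
        if len(component) > len(best):
            best = component
    return best

# Hypothetical boards; only the Canca name variants come from the text.
boards = {
    "Alpha": ["NJM Canca", "A Smith"],
    "Beta": ["NJMG Canca", "B Jones"],
    "Gamma": ["B Jones", "C Dlamini"],
    "Delta": ["D Naidoo"],
}
network = company_network(boards)
core = largest_component(network)
```

Here the two Canca variants collapse to one director, so Alpha and Beta are linked; Delta shares no directors and falls outside the largest component. Matching on surname and first initial can also merge distinct people, which is the trade-off noted in the text.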

Methods
"Communities" generally refer to subsets of nodes that are more densely interconnected with one another than with nodes in the rest of the network, i.e. outside their community. The search for community structure thus becomes the search for "a statistically surprising arrangement of edges" (Newman, 2006). Most current algorithms for detecting community structure assign nodes to communities so as to optimize some pre-specified quality function. Reichardt and Bornholdt (2006) suggest the following quality function, to be maximized:
$$Q = \sum_{i \neq j} a_{ij} A_{ij} \delta(\sigma_i, \sigma_j) - \sum_{i \neq j} b_{ij} (1 - A_{ij}) \delta(\sigma_i, \sigma_j) - \sum_{i \neq j} c_{ij} A_{ij} (1 - \delta(\sigma_i, \sigma_j)) + \sum_{i \neq j} d_{ij} (1 - A_{ij}) (1 - \delta(\sigma_i, \sigma_j))$$
where $A_{ij}$ are elements of the adjacency matrix, $\sigma_i$ denotes the community to which node $i$ is assigned, $\delta(\sigma_i, \sigma_j) = 1$ if nodes $i$ and $j$ are in the same community and is 0 otherwise, and $a_{ij}$, $b_{ij}$, $c_{ij}$ and $d_{ij}$ are weights of the relative contributions made by present within-community edges, missing within-community edges, present between-community edges, and absent between-community edges respectively. The four summation terms in the above equation correspond respectively to (a) rewarding present edges between nodes in the same community, (b) penalising missing edges between nodes in the same community, (c) penalising present edges between nodes in different communities, and (d) rewarding missing edges between nodes in different communities.
Different authors have used a number of approaches to optimise the equation above; see Fortunato (2010) for a review. Reichardt and Bornholdt (2006) begin by assigning the same importance to connections between nodes in the same community as to those between nodes in different communities, i.e. setting $a_{ij} = c_{ij}$ and $b_{ij} = d_{ij}$. This means that it is only necessary to consider present and absent connections between nodes in the same community (i.e. the last two terms in the above equation can be ignored). They then select weights $a_{ij} = 1 - \gamma p_{ij}$ and $b_{ij} = \gamma p_{ij}$, where $p_{ij}$ denotes the expected number of edges between nodes $i$ and $j$ and $\gamma$ is a parameter governing the relative importance of present and absent edges. The choice for $a_{ij}$ means that greater rewards accrue if two nodes with an existing but statistically "surprising" connection are assigned to the same community. Similarly, the choice for $b_{ij}$ means that greater penalties result if two nodes that are "expected" to be connected but in reality are not, are assigned to the same community. The form taken by $p_{ij}$ can be tailored for specific types of networks. In most cases (including ours) the $p_{ij}$ are set to $k_i k_j / 2M$, where $k_i$ is the degree of node $i$ and $M$ is the total number of edges. This indicates that edges are more probable between nodes which themselves have many edges, but other choices for $p_{ij}$ can be used to indicate, for example, assortativity (degree-degree correlation), random graphs, or bipartite graphs. These choices of coefficients mean that the resulting quality function simplifies to:
$$Q = \sum_{i \neq j} (A_{ij} - \gamma p_{ij}) \delta(\sigma_i, \sigma_j)$$
Optimizing the above function for large networks is computationally very intensive. Reichardt and Bornholdt (2006) use methods from statistical mechanics to reformulate the optimization problem in terms of finding the ground state configuration that minimizes the energy of an "infinite range Potts spin glass". Full details of the method can be found in the original reference, and need not concern us here, although it is worth noting some of its advantages. Firstly, the reformulated quality, given (up to a positive factor and an additive constant that do not affect the optimum) by
$$Q = \sum_{(i,j) \in E} \delta(\sigma_i, \sigma_j) - \frac{\gamma}{4M} \sum_{s} K_s^2$$
where the first sum runs over the set $E$ of all edges and $K_s$ is the sum of degrees of nodes in community $s$, is computationally easier to optimize: indeed this can be achieved using standard approaches like simulated annealing (Kirkpatrick, Gelatt & Vecchi, 1983). Secondly, a single, easily interpreted parameter $\gamma$ governs the relative weight assigned to present and missing edges; when $\gamma = 1$ the total contribution that can be made by present and missing edges is equal. Thirdly, it is a general model that, for example, includes the 'most popular quality function' (Fortunato, 2010) based on 'modularity' (Newman & Girvan, 2004) when $\gamma = 1$. Finally, the method can be used to compute the expected modularity for a 'null model', i.e. a random graph with the same number of nodes and average degree; the modularity of a discovered community structure must exceed that of the null model in order to be considered statistically significant. We used the implementation of Reichardt and Bornholdt (2006) available in the statistical software package R (version 2.13) via its 'igraph' package.
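To make the role of the weighting parameter concrete, the sketch below evaluates the simplified quality function for a given assignment of nodes to communities, writing A_ij for the adjacency matrix, k_i for the degree of node i, M for the number of edges and gamma for the weighting parameter. It is a direct evaluation under the degree-based expectation k_i k_j / 2M, not the simulated-annealing optimizer used in this study; the function name `quality` and the toy network are hypothetical.

```python
def quality(A, communities, gamma=1.0):
    """Sum of (A_ij - gamma * k_i * k_j / (2M)) over ordered pairs i != j
    that are assigned to the same community, with M the number of edges."""
    n = len(A)
    k = [sum(row) for row in A]
    two_m = sum(k)  # every edge contributes to two degrees
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and communities[i] == communities[j]:
                total += A[i][j] - gamma * k[i] * k[j] / two_m
    return total

# Hypothetical toy network: two triangles joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

split = [0, 0, 0, 1, 1, 1]   # one community per triangle
merged = [0, 0, 0, 0, 0, 0]  # all nodes in a single community

for gamma in (1.0, 0.1):
    print(gamma, quality(A, split, gamma), quality(A, merged, gamma))
```

With gamma = 1 the two-triangle split scores higher than the single merged community, whereas with a small gamma such as 0,1 the merged configuration scores higher. This is consistent with smaller values of gamma favouring fewer, larger communities.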

Basic network results
Table 1 gives a brief overview of the South African company network. The average number of directors sitting on the board of a JSE-listed company is 8,56, ranging greatly from just two directors to 27. This is comparable to values reported by Conyon and Muldoon (2006) for the USA (9,97 members), the UK (6,51 members) and Germany (6,33 members). The average number of directorships held is 1,28, and the overwhelming majority of directors (83%) are members of just a single board. This is marginally lower than the values reported by Conyon and Muldoon for the USA (1,63 directorships), UK (1,84 memberships) or Germany (1,45 memberships). There are just 32 directors (1,2%) who hold five or more board memberships. A JSE-listed company is directly connected to an average of 5,2 other companies, although this average increases to 6,9 if companies that are not connected to the largest component are excluded.

Community structure results
The essential features of the South African company network as shown in Figure 1 appear to be that (a) a relatively large number of highly interconnected companies appear in the center of the network, and (b) other companies are peripheral to this central cluster of companies, being connected to it only by a distance of a few degrees. This structure presents a clear difficulty for community detection. Nevertheless, the communities detected by the algorithm of Reichardt and Bornholdt (2006) do exhibit some statistically significant associations. Figure 2 shows the communities detected by a 'neutral' application of the algorithm, in which absent edges between members of a community are viewed as equally important as present edges. Nodes represent companies, with edges denoting membership of a common cluster. Nodes are coloured according to their sector and the size of a node is proportional to its market capitalization. For comparative purposes, the community structure detected using the algorithm in a 'conservative' mode (with γ = 0,5) is shown in Figure 4 in the appendix. The former identifies 14 communities and the size of these communities decreases in an approximately linear fashion; the latter identifies 12 communities, one of which is much larger than all the others (and in fact contains some 50% of all companies). It must be acknowledged that a direct qualitative interpretation of these system-wide summary statistics is still lacking: it is not yet known how, or even whether, network statistics like the number of communities or the distribution of cluster sizes affect the performance of the economy. Investigating these important topics would require longitudinal data, ideally for a number of countries, and as such is beyond the scope of the current study. We set ourselves the more modest task of assessing whether the fact that two firms belong to the same economic sector, or have similar levels of market capitalization, makes them significantly more likely to belong to the same community (which, recall, is dictated only by arrangements of directorships).
The modularity of the detected configuration of nodes into communities is 0,53. This is well above the 'null' model modularity of 0,36, suggesting that the communities found have at least some statistical relevance over and above what would have been expected from a random graph. Moreover, there is a significant association between cluster membership and each of sector membership and market capitalization, two exogenous variables not used in the clustering process (sector: χ² = 170,0, DoF = 96, p < 0,001; log(market capitalization): F = 2,66, DoF = 12, 280, p = 0,002), although because of the small sizes of some clusters these p-values cannot be trusted entirely. From Figure 2 itself, it is clear that there is some tendency for companies of a similar market capitalization and sector to appear in the same cluster. For example, 67% of all industrial companies are in just three communities, and in one community 14 out of 31 companies are based in the industrial sector. In one relatively small community of 15 companies, 7 are in the real estate sector. Other sectors are somewhat more evenly dispersed between communities. Significant market capitalization clustering occurs mainly between companies with high market capitalizations: 27 of the 56 companies in the upper quintile of market capitalization belong to just 2 of the 13 communities. Our conclusions remain the same using either of the community structures in Figure 2 or 4.
Taken together, our results suggest that companies of the same sector (particularly industrial and real estate) and companies with high market capitalization exhibit a greater-than-expected tendency to form inter-relationships with one another through common directors on their boards. As indicated by the previous research summarized above, such a tendency can be expected to play a significant role in the sharing of information and some elements of company culture and behaviour.
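The test of association between community and sector membership reported in this section is a chi-square test on a communities-by-sectors contingency table. A minimal sketch of the test statistic and its degrees of freedom is given below; the counts are invented for illustration and the function name `chi_square` is hypothetical. The p-value is omitted, since it would require a chi-square distribution function from a statistics library.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for r, row in enumerate(table):
        for c, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[r] * col_totals[c] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Hypothetical counts: rows are communities, columns are sectors.
table = [
    [14, 3, 2],
    [4, 7, 4],
    [5, 2, 9],
]
stat, dof = chi_square(table)
```

Small communities produce small expected counts in some cells, which is why the chi-square approximation, and hence the reported p-values, should be read with caution.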

Centrality and market capitalization
In this section we consider the effect of the centrality of a company's position in the company network on its market capitalization. That is, we ask whether companies that occupy more central positions in the network tend to have higher or lower market capitalization values. In doing so, it is important to control for the effect of board size. Companies with large market capitalizations will tend to require larger boards to manage them, so that one would expect a positive relationship between board size and market capitalization (e.g. Lincke, Netter & Yang, 2008). Since, as we have seen, larger boards also tend to possess higher centrality (Durbach & Parker, 2009), there is an obvious need to control for board size when examining the relationship between centrality and market capitalization.
We do this by fitting a series of quantile regression models (using the quantiles q = 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%) to the 378 companies for which we were able to obtain market capitalization information. Models were fitted using degree as a measure of centrality, while controlling for board size. Board sizes and degree centralities were first centered around their means so that the intercept term can be more easily interpreted (Koenker & Hallock, 2001). Figure 3 shows the results obtained from the model using board size and degree centrality as independent variables. The solid line plots the parameter estimates obtained at various quantiles, while the dashed lines indicate 95% confidence intervals around those estimates. The parameter values can be interpreted as the effect of a one-unit increase in the independent variable on market capitalization. The models verify the positive relationship between board size and market capitalization, though as indicated by the positive slope of the solid line in the Board size plot, this effect tends to be considerably larger in the upper quantiles of the distribution. For example, a company with 1 more director than another company has an approximately R50 million greater market value at the 5% quantile (i.e. at very low levels of market capitalization), but a R424 million greater value at the 50% quantile (i.e. median market capitalization) and over R1 billion greater value at the 80% quantile (i.e. at higher levels of market capitalization). Thus, while it appears true to say that in general companies with bigger boards tend to have bigger market capitalizations, the difference between a company with a small board and one with a big board tends to be far more pronounced at higher levels of market capitalization than at lower levels.
Our models also find a strong positive relationship between degree centrality and market capitalization, even after controlling for the effect of board size. The coefficient of the degree centrality effect is positive and significant at the 1% level over all of the quantiles, indicating that companies having higher numbers of connections to other companies tend to have higher market capitalization. Similarly to the effect of board size, the degree centrality effect is smaller in the lower quantiles of the distribution, increasing in magnitude as the market capitalization quantile increases. At the median quantile, companies with a single additional connection have an approximately R180 million greater market capitalization. At the 5% and 80% quantiles, this figure is R24 million and R954 million respectively. Simply put, the difference in the market capitalization of a highly connected company and a poorly connected company whose market values are both in the lower quantiles of their respective conditional distributions is not that large. But that same difference can be a full order of magnitude greater when those companies are both in the upper quantiles of their respective conditional distributions. Thus both centrality and the size of a company's board have larger effects for companies with relatively large market capitalizations. This suggests that limited connectivity and a small board can constrain market capitalization, but that the converse does not necessarily apply to the same degree: being central or having a large board does not ensure high market capitalization. Interestingly, there is a large jump in the magnitude of the effect which occurs around the 60% quantile. At smaller quantiles, the centrality effect increases slightly as the market capitalization quantile increases. Beyond the 60% quantile, however, the size of the effect increases dramatically. This further suggests that there may be some sort of "critical mass" beyond which inter-firm connectivity exerts its full effect.
Finally, and predominantly for completeness, the intercept may be interpreted as the estimated conditional quantile function of the market capitalization distribution for a "typical" company (one with 8,65 board members and 5,2 connections to other companies, these figures reflecting the sample mean board size and degree respectively).Thus the 10% quantile for such a company's market capitalization is estimated to be R406 million; the 80% quantile is estimated to be R11,5 billion.
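Quantile regression estimates its coefficients by minimizing an asymmetric "check" loss rather than squared error. The sketch below illustrates the idea with a single covariate and invented data; it is not the model fitted in this study, which includes both board size and degree centrality and uses centered covariates. The brute-force search relies on the fact that, for a line with two parameters, an optimal fit can always be found among the lines passing through two observations. The function names are hypothetical.

```python
def check_loss(residual, q):
    """Check (pinball) loss: q * r for r >= 0 and (q - 1) * r for r < 0."""
    return residual * (q if residual >= 0 else q - 1)

def fit_line(x, y, q):
    """Quantile regression of y on x by exhaustive search over lines that
    pass through two observations with distinct x values."""
    best = None
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            if x[i] == x[j]:
                continue
            slope = (y[j] - y[i]) / (x[j] - x[i])
            intercept = y[i] - slope * x[i]
            loss = sum(check_loss(yk - intercept - slope * xk, q)
                       for xk, yk in zip(x, y))
            if best is None or loss < best[0]:
                best = (loss, intercept, slope)
    return best[1], best[2]

# Hypothetical fan-shaped data: x is the number of connections, and at each
# x there are three companies with market values x, 2x and 4x (arbitrary units).
x = [d for d in range(1, 7) for _ in range(3)]
y = [m * d for d in range(1, 7) for m in (1, 2, 4)]

slopes = {q: fit_line(x, y, q)[1] for q in (0.1, 0.5, 0.9)}
print(slopes)  # slopes of 1, 2 and 4 at the 10%, 50% and 90% quantiles
```

Because the spread of the invented data widens with x, the fitted slope grows with the quantile, which mirrors the qualitative pattern of larger effects in the upper quantiles described above.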

Conclusion
This paper provides a coherent framework for (1) assessing whether companies in the South African company network can be organised into communities such that dense connections exist between members of the same community and relatively fewer connections exist between members of different communities; (2) investigating the relationship between identified communities within the South African company network and each of the attributes "sector membership" and "market capitalization"; and (3) using quantile regression to measure the effects of board size and degree centrality on market capitalization, controlling for board size while measuring the effect of degree centrality. The detected community structure is statistically significant regardless of whether γ is set to 1 (a "neutral" value) or 0,5 (a more "conservative" value favouring the formation of larger communities). A statistically significant relationship has been detected between community membership and each of sector membership and market capitalization. Companies with similar market capitalizations and sectors tend to be grouped, through common directors, into the same community. Since edges are formed by common membership on a board of directors, this indicates that a director on the board of a company in one community is more likely to be on the board of another company in the same community. The presence of companies from the same sector in the same community may perhaps raise some warning signs for corporate governance, but we do not have any data to test this and a deeper investigation of this issue goes beyond the scope of the current paper.
The constructed quantile regression models reveal a positive relationship between market capitalization and each of board size and degree centrality. These results show that companies that are more central (as measured by degree centrality) tend to have larger market capitalizations, even after controlling for board size. The results also show that the magnitude of these effects increases at higher quantiles of the conditional market capitalization distribution. That is, centrality and board size both have larger effects for companies with relatively large market capitalizations. This suggests that limited connectivity and a small board can constrain market capitalization, but that the converse does not necessarily apply: being central or having a large board does not ensure high market capitalization.

Appendix: Community structure results with more conservative clustering
Figure 4 and Table 2 respectively show the discovered communities and statistical associations between communities and sector membership and market capitalization, using a more conservative application of Reichardt and Bornholdt (2006) in which γ is set to 0,5, i.e. existing edges are viewed as more important than missing edges. The modularity of the detected configuration is 0,59, which is (as for the more aggressive clustering) well above the 'null' modularity score of 0,36, suggesting a statistically significant arrangement of nodes into communities. Conclusions are as for the communities detected with γ set to 1, i.e. there is a significant tendency for companies of a similar market capitalization and sector to appear together in the same cluster.

Figure 1: On the left, an example of a network with a strong community structure (there are three clear clusters or "communities" indicated by the dashed ellipses). On the right, the South African company network as of March 2008.

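Figure 2: Communities in the South African company network, with the inset histogram showing community sizes

Figure 3: Quantile regression parameter estimates (solid lines) with 95% confidence intervals (dashed lines) for the model of market capitalization using board size and degree centrality as independent variables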

Figure 4: Communities in the South African company network, obtained with a more 'conservative' clustering parameter than that used in Figure 2