Company failure in South Africa: classification and prediction by means of recursive partitioning

• Brute empirism was avoided by focussing on cash flow ratios in combination with certain accrual ratios.
• Failure was not only defined as bankruptcy, but as any condition where the company cannot exist in future in its current form, therefore including delistings as well as major structural changes.
• By using the population of listed industrial companies between June 1997 and May 2002, the grey area in-between 'successful' and 'bankrupt' was included in developing the models.
• Every model developed was tested with the help of an independent sample.
• The different economic cycles were considered by developing different models for a growth and a recessionary period. A combined model was also developed, with the economic cycle as an independent dichotomous variable.

When the prediction accuracy of the models developed, for the different classes and in total, is compared with the ex ante probability that an observation will fall in the majority class (non-failed companies), the prediction accuracy is in every instance higher than the ex ante probability.

Failure prediction studies categorised
Altman (1968:590) reports that as early as 1935 a study used financial ratios in the classification between failed and non-failed companies. However, the onset of research in failure prediction is generally attributed to Beaver (1966) and Altman (1968). These classical studies were followed by numerous studies in this field and, after four decades, failure prediction is still a contentious subject. Research in failure prediction can broadly be categorised into three branches, of which the current status is discussed briefly.
(a) Research on the statistical or artificial intelligence methods used in the development of a failure prediction model
Multiple discriminant analysis (MDA) was the method used most frequently in the development of failure prediction models until the early eighties, when logit analysis and probit analysis were also introduced to this field. The late eighties brought a new dimension to failure prediction studies with the development of machine-learning techniques and artificial intelligence. Some of the limitations of the traditional statistical methods no longer mattered when machine-learning techniques were applied. Examples of these machine-learning techniques are neural networks and recursive partitioning.
Two recent studies compared such models with one another: Pompe and Feelders (1997:275) compared models based on MDA, recursive partitioning and neural networks, and Laitinen and Kankaanpää (1999) compared models based on MDA, logit analysis, recursive partitioning and neural networks. Both found no statistically significant differences between the results of the various models. Laitinen and Kankaanpää (1999:84) remark, 'it can be stated that one of the latest applications, neural networks, is in its present form as effective as discriminant analysis was as early as thirty years ago'. Rees (1995:310) is also of the opinion that the 'statistical sophistication of failure prediction models has proved a sterile branch of research'.
(b) Studies that focus on the variables used in the model for differentiating between failed and non-failed companies
Various commentators (Ball & Foster, 1982; Zavgren, 1983; Jones, 1987) criticised failure prediction studies for the use of 'brute empirism', whereby independent variables are selected not based on a theoretical model underlying failure, but for reasons such as popularity (Hossari & Rahman, 2005), or being a good discriminator in a previous study. This criticism led to the introduction of various theoretical models. However, several recent studies still employ 'brute empirism'.
(c) Research on the definition of 'failure'
Studies in this group move away from the tendency to define failure narrowly as bankruptcy. Ohlson (1980) remarks on the definition of failure and that various studies use different definitions. He states, '[n]o decision problem I can think of has a payoff space which is partitioned naturally into the binary status bankruptcy versus nonbankruptcy'. Jones (1987:133) as well as Neil, Schaeffer, Bahnson and Bradbury (1991:144) are of the opinion that bankruptcy was used as the dependent variable because it can be determined objectively and easily. According to Neil et al. (1991:144), bankruptcy is not the ideal dependent variable.
According to Cybinski (2001:30), '[the] major problem in bankruptcy research to date is that the nature of the dependent variable, 'failure', is not a well defined dichotomy'. Most of the studies used bankrupt versus healthy companies in the development of their models, that is, the extremities on the distress continuum. Cybinski (2001:31) is of the opinion that '[it] is not surprising that these model formulations are most successful when the data conforms to the expectation that the two groups are already well separated on this continuum'. She identifies the challenge for a model as classifying the companies successfully in the area between the extremities. Foster (1986) is also of the opinion that financial failure is an economic condition covering a broad continuum. At the one end of the continuum are the financially stable companies and at the other the companies that went bankrupt because of financial failure. In-between are two other categories that create difficulty in the classification process: the companies that applied for liquidation for reasons other than financial failure, and those that are failing but are carried along by the providers of capital.
Few companies actually apply for bankruptcy, although they are financially distressed. There are alternatives to bankruptcy, for example a major debt reduction after an agreement with the creditors (Blum, 1974:3); restructuring of bank-debt into ordinary shares (Vranas, 1992:260); cutback in activities by disinvesting part of the business; or a business combination with a financially strong company (Ball & Foster, 1982:217). According to Turetsky and McEwen (2001:327), '[a]n acquisition or merger is considered to result from different behavioural and financial characteristics than does failure; but is terminal in that the firm did not continue in its present form'.

Shortcomings in failure prediction studies
Various commentators identified shortcomings in failure prediction studies. Although some of these shortcomings were addressed in some studies, it is nonetheless still common to find a study that did not address some or even all of these issues. Attention was already focussed on two of these deficiencies in the above section, namely: i. The use of 'brute empirism' in the selection of independent variables; and ii. the use of the extreme definition of financial failure, namely bankruptcy.
Further deficiencies in failure prediction studies are: iii. Samples of bankrupt versus successful companies are used, thereby ignoring the 'grey area' between these extremities.
Many of the studies made use of samples of bankrupt versus successful companies. However, the number of actual bankruptcies is limited, and therefore the population of bankrupt companies is used together with a sample of successful companies. Because of the limited number of bankruptcies, it is impossible to make use of an independent holdout sample to test the classification accuracy of the models in predicting outside the original sample from which the model was deduced (Zavgren, 1983). According to Jones (1987): 'It is well known that a model will generally fit the sample from which it was derived better than any other sample. In the case of financial distress prediction, this means that mere success in classifying firms as failing or healthy based on the derivation sample is not sufficient'. This, therefore, leads to another deficiency (Sharma, 2001), namely: iv. The lack of testing of the prediction accuracy of the models developed on an independent test sample.
v. Population proportions are ignored in samples.
In many of the failure prediction studies, samples of even sizes for failed and non-failed companies were selected, thereby ignoring the proportions of the different classes in the population. Both Zavgren (1983) and Zmijewski (1984) criticise the use of equal samples, as it may lead to an overstatement of the classification and prediction accuracy of failed companies and the understatement of the classification and prediction accuracy of non-failed companies.
vi. The use of data from periods covering different economic conditions, without consideration of economic influences.
Few studies take the external environment of the companies, for example the position in the business cycle, into consideration (Cybinski, 2001:30). Mensah (1984) investigates the practice whereby researchers pool data from companies over various years, sometimes an extensive period, without considering the different economic environments during those years. He develops different models for recessionary and growth phases of the economy and concludes that 'the accuracy and structure of predictive models differ across different economic environments' (Mensah, 1984:393) and that it may be easier to identify failing companies in the periods immediately preceding a recession than in the periods that are followed by a growth phase (Mensah, 1984:389).
According to Richardson, Kane and Lobingier (1998) business activities decrease during a recession. The decrease in sales and the increase in costs result in a lower profitability for many companies. The cost of credit increases, while the availability thereof decreases. The position in the economic cycle will therefore influence the failure prediction model's accuracy.

Research objective and research method
Purpose of the study
The objective of this article is to develop models for the classification and prediction of failed and non-failed industrial companies listed on the JSE Securities Exchange by means of recursive partitioning, while addressing the above-mentioned shortcomings.

Recursive partitioning
The method selected for the development of a failure prediction model in this study is recursive partitioning. This method was selected because previous studies did not indicate that any one method was statistically significantly better than the others, and because this nonparametric, nonlinear method can be explained graphically to potential users. Statistica's classification tree algorithm, specifically the C&RT method, is utilised in developing the models. Recursive partitioning was previously used in failure prediction studies by Frydman, Altman and Kao (1985); Pompe and Feelders (1997); Laitinen and Kankaanpää (1999); Sung, Chang and Lee (1999) as well as Lin and McClean (2001).
A classification tree is hierarchical and consists of a series of logical 'if-then' conditions (tree nodes). In the derivation of the tree, there are two steps for each split. The first is to determine which independent variable will be the best discriminator for the observations at a specific splitting node. The second is finding the value of the independent variable that will best classify the classes at this node. An independent variable may be used more than once in the same tree. Statistica, at each node, performs an exhaustive search by testing each independent variable separately to find the split condition that will result in the greatest improvement in predictive accuracy. This improvement in predictive accuracy is measured by means of the Gini measure, computed as the sum of products of all pairs of class proportions for classes present at the splitting node (Statistica).
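The two-step search at a splitting node can be sketched as follows. This is a minimal illustration in Python and not Statistica's implementation: the function names, the toy observations, and the choice to score a candidate split by the size-weighted Gini measure of the two child nodes are assumptions made for the example.

```python
from collections import Counter

def gini(labels):
    """Gini measure of a node: the sum of p_i * p_j over all ordered pairs of
    different classes, which equals 1 - sum(p_i ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels):
    """Exhaustive search: try every variable and every observed value as a
    'variable <= threshold' condition and keep the split with the largest
    reduction in size-weighted Gini (an assumed scoring rule)."""
    n = len(rows)
    parent = gini(labels)
    best = None  # (improvement, variable, threshold)
    for var in rows[0]:
        for threshold in sorted({row[var] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[var] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[var] > threshold]
            if not left or not right:
                continue  # a split must send observations to both sides
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            improvement = parent - child
            if best is None or improvement > best[0]:
                best = (improvement, var, threshold)
    return best

# Hypothetical toy observations (not data from the study).
rows = [
    {"LogTA/GDP": 13.1, "CFO:S": -0.05},
    {"LogTA/GDP": 13.8, "CFO:S": 0.09},
    {"LogTA/GDP": 15.2, "CFO:S": 0.10},
    {"LogTA/GDP": 15.9, "CFO:S": 0.02},
]
labels = ["F", "F", "N-F", "N-F"]
improvement, variable, threshold = best_split(rows, labels)
```

In this toy case the size variable separates the two classes completely, so it is chosen ahead of the cash flow ratio. Applying `best_split` again to each resulting subset gives the recursive partitioning described in the text.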
This process repeats itself, with the branches ending in terminal nodes. Splitting of a tree may continue until every observation is correctly classified, resulting in a very high classification accuracy. However, the sample-specific characteristics used in deriving this tree may hamper the prediction accuracy of the model when applied to an independent sample. In order to avoid over-fitting, the tree is pruned by means of Statistica's FACT-style direct stopping, specifically the 'fraction of objects' option. The minimum number of misclassifications allowed at a terminal node is determined by the fraction of objects specified. The fraction specified is determined by means of cross validation, where the tree derived from the learning sample is applied to an independent testing sample. The result is a simpler tree, with lower classification accuracy, but with the best prediction accuracy in the testing sample. If the prediction accuracy is far lower than the classification accuracy, the model is poor (Statistica).

Economic phases
Different models are developed in this study for the combined economic period (a period that includes both a growth phase and a recession), for a growth phase, as well as for a recession. Conditions vary along the economic cycle and different factors may determine whether a company will fail during the growth phase or during the recession. It is also possible that the market and providers of capital are more lenient during the growth phase and will tolerate an at-risk company longer than during a recession.

Selection of independent variables
'Bankruptcy is a cash phenomenon'. The underlying reasons for financial distress may vary from entity to entity: some may not be profitable at all, while others may be very profitable and expanding rapidly, but as a result cannot meet their obligations. Whatever the reason for failure, it culminates in the company having a shortage of cash to pay its obligations. Therefore, eight of the fifteen independent variables (as defined in Table 1) selected for use in this study are cash flow ratios. Although there were quite a number of studies on the use of cash flow information in failure prediction models, the majority were conducted before the cash flow statement became compulsory, and proxies had to be calculated from accrual information. The studies that used actual cash flow information (Ward, 1994; Schellenger & Cross, 1994; Ward & Foster, 1996; Sharma & Iselin, 2003) substantiate the value of cash flow information in the development of models. Our intention is to use cash flow and accrual information in combination, because of the difference in information content.
Cash flow for one year does not contain as much information as the cumulative cash flow over a few years, as a company may succeed in surviving a year or two of negative cash flows. However, if a company cannot succeed in generating cash over a cumulative period of, for instance, three years, it may get into financial difficulty (Steyn, Hamman & Smit, 2002). For this reason, not only one-year cash flow ratios are used, but also three-year cumulative cash flow ratios.

Table 1
The closing balance of cash divided by the closing balance of total liabilities
E:TL  The closing balance of equity divided by the closing balance of total liabilities
AR:S  The closing balance of accounts receivable divided by the revenue for the year
WC:TA  The closing balance of (inventories + accounts receivable - accounts payable) divided by the closing balance of total assets
P:S  The profit for the year divided by the revenue for the year
CFO:S  The cash flow from operating activities divided by the revenue for the year
P3:S  The cumulative profit for the last three years divided by the cumulative revenue for the last three years
CFO3:S  The cumulative cash flow from operating activities for the last three years divided by the cumulative revenue for the last three years
CFO:TL  The cash flow from operating activities divided by the closing balance of total liabilities
CFI:TL  The cash flow from investing activities divided by the closing balance of total liabilities
CFF:TL  The cash flow from financing activities divided by the closing balance of total liabilities
CFO3:TL  The cumulative cash flow from operating activities for the last three years divided by the closing balance of total liabilities
CFI3:TL  The cumulative cash flow from investing activities for the last three years divided by the closing balance of total liabilities
CFF3:TL  The cumulative cash flow from financing activities for the last three years divided by the closing balance of total liabilities

The other independent variables selected are:
• Two profit ratios, for one year, as well as a cumulative three-year ratio, as they are indicative of the profit creating ability of the company and therefore the viability of the business plan.
• The size of the companies is represented by the log of the total assets as standardised by the GDP deflator. Larger companies will probably have a greater chance of surviving than the smaller companies (Ohlson, 1980).
• The structure of the company as represented by four ratios. One that calculates the equity in relation to total liabilities; one that determines the ability of the company to pay its total debt from its current cash resources; a ratio calculating the size of the accounts receivable that must be financed by the company; and one that determines which part of total assets consists of current assets.
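To make the difference between the one-year and the three-year cumulative cash flow ratios concrete, the sketch below computes four of the ratios from the definitions in Table 1. It is an illustration only: the field names (`cfo`, `revenue`, `total_liabilities`), the helper functions and the figures are hypothetical and are not taken from the study.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None

def cash_flow_ratios(years):
    """Compute one-year and three-year cumulative operating cash flow ratios.

    `years` is a list of dicts ordered from oldest to newest with at least
    three entries; the keys are hypothetical field names.
    """
    latest = years[-1]
    last_three = years[-3:]
    cfo3 = sum(year["cfo"] for year in last_three)
    revenue3 = sum(year["revenue"] for year in last_three)
    return {
        "CFO:S": ratio(latest["cfo"], latest["revenue"]),
        "CFO3:S": ratio(cfo3, revenue3),
        "CFO:TL": ratio(latest["cfo"], latest["total_liabilities"]),
        # Cumulative flow over closing total liabilities, as in Table 1.
        "CFO3:TL": ratio(cfo3, latest["total_liabilities"]),
    }

# Hypothetical figures: a positive latest year hides a weak three-year record.
history = [
    {"cfo": -40.0, "revenue": 500.0, "total_liabilities": 300.0},
    {"cfo": -30.0, "revenue": 520.0, "total_liabilities": 320.0},
    {"cfo": 20.0, "revenue": 480.0, "total_liabilities": 400.0},
]
ratios = cash_flow_ratios(history)
```

With these figures the one-year ratios are positive while both cumulative ratios are negative, which is the kind of information the three-year ratios are meant to add.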

Defining failure; sample selection and testing the prediction accuracy
Failure is defined as the condition where the company will not survive in its existing structure, and therefore encompasses a delisting or a major structural change. A company-year is classified as 'failed' (F) if the company failed within four years after year-end; otherwise, it is classified as 'non-failed' (N-F). A period of four years for following up the outcome of an observation is used, as the objective of the models is to warn all the stakeholders in time to enable the company to change its strategy, and possibly not to fail. Such a strategic shift will negatively influence the prediction accuracy of the models. However, a non-failure is preferable to better prediction accuracy. According to Henebry (1996), a warning three to five years before the event is necessary to address the problems of the company.
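The four-year labelling rule for a company-year can be sketched as below. The function names are hypothetical, and the treatment of the boundaries (a failure date exactly four years after year-end counts as failed; a failure on or before year-end does not) is an assumption, since the text does not specify it.

```python
from datetime import date

def add_years(d, years):
    """Same calendar day `years` later; 29 February falls back to 28 February."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)

def classify_company_year(year_end, failure_date=None, horizon_years=4):
    """Return 'F' when the failure event (delisting or major structural change)
    falls after year-end and no later than `horizon_years` after it;
    otherwise return 'N-F'. Boundary handling is an assumption."""
    if failure_date is None:
        return "N-F"
    if year_end < failure_date <= add_years(year_end, horizon_years):
        return "F"
    return "N-F"

# Hypothetical dates for illustration.
within_horizon = classify_company_year(date(1998, 6, 30), date(2001, 3, 31))
beyond_horizon = classify_company_year(date(1998, 6, 30), date(2003, 1, 31))
```

Under this rule one company contributes several company-years, and the earlier ones can be labelled N-F even though the company eventually fails.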
In this study, the population that is used consists of the listed industrial companies between June 1997 and May 2002. As the total population of listed industrial companies is used and not only a sample, the problem of sample proportions is avoided. Because of the definition used for financial failure, and because the total population, which includes the not-so-clearly-classifiable companies, is used in this study, we venture into the grey area on the distress continuum and do not only make use of extremities in developing the models. The population is divided into two samples, namely the learning sample (two thirds of the population) and the testing sample (one third of the population). This ensures that an independent sample is used to test the model and the prediction accuracy.
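The division into a learning sample and a testing sample can be sketched as follows. The text does not say how observations were assigned to the two samples, so the random shuffle, the fixed seed and the function name are assumptions made for this illustration.

```python
import random

def split_population(observations, learning_fraction=2 / 3, seed=0):
    """Shuffle a copy of the observations and split it into a learning
    sample and a testing sample (random assignment is an assumption)."""
    shuffled = list(observations)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * learning_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical observation identifiers.
population = [f"company-year-{i}" for i in range(9)]
learning, testing = split_population(population)
```

The model is derived from `learning` only; `testing` is held back so that the prediction accuracy is measured on observations the model has not seen.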

Methodology of this study compared to previous South African studies
In Table 2, the methodology of this study is compared with the methodologies of previously published local studies. The categories of comparison are based on the deficiencies in failure prediction studies identified internationally. In some instances it is not clear from the published local studies exactly what the methodology was for a certain category.

Independent variables as discriminators
The Lilliefors test was performed on the independent variables and in no case was there enough evidence to support the hypothesis that the variables were normally distributed. For this reason a non-parametric statistical test, the Kruskal-Wallis test, was performed on the independent variables for the three populations. The median of the variables, rather than the mean, is reported in Table 3. The values that are underlined are those that indicated a significant difference in the population location of the classes failed versus non-failed.
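For readers unfamiliar with the Kruskal-Wallis test, the sketch below computes the H statistic for two classes (failed versus non-failed) on one variable, with the usual tie correction and the chi-square approximation for the p-value. It is a hand-written illustration using only the Python standard library, not the procedure used in Statistica; the function names and the sample values are hypothetical.

```python
import math

def midranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def kruskal_wallis_two_groups(a, b):
    """Return (H, p) for two groups; p uses the chi-square approximation
    with one degree of freedom. Assumes the pooled values are not all equal."""
    pooled = list(a) + list(b)
    n = len(pooled)
    ranks = midranks(pooled)
    rank_sums = (sum(ranks[:len(a)]), sum(ranks[len(a):]))
    sizes = (len(a), len(b))
    h = 12 / (n * (n + 1)) * sum(r * r / m for r, m in zip(rank_sums, sizes))
    h -= 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n) over tie groups.
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    h /= 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    h = max(h, 0.0)  # guard against tiny negative rounding errors
    p_value = math.erfc(math.sqrt(h / 2))  # chi-square survival function, df = 1
    return h, p_value

# Hypothetical CFO:S values for failed and non-failed company-years.
failed = [-0.08, -0.02, 0.01]
non_failed = [0.04, 0.07, 0.12]
h_statistic, p_value = kruskal_wallis_two_groups(failed, non_failed)
```

Because the test works on ranks, it compares the location of the two classes without assuming normally distributed ratios, which is why medians rather than means are the natural summary to report alongside it.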
Of the fifteen independent variables, eleven show significant differences in the population locations of the failed and the non-failed companies during the total period examined. During the recession only five independent variables indicated a significant difference, and during the growth phase, eight. Three independent variables successfully discriminate between failed and non-failed in all three populations: LogTA/GDP, CFO:S and CFO3:S, illustrating the importance of the size of the company, as well as its ability to generate cash from its activities. The ratios depicting the extent of working capital (WC:TA) and accounts receivable (AR:S) seem to be more important during the recessionary phase than during the growth phase, the explanation being that the pressure of the recession on the debtors and creditors will influence the company.
Five other ratios, LF:TL, E:TL, CFO3:TL, CFO:TL and CFF3:TL, are successful discriminators during the growth phase, but not during the recessionary phase. These ratios are all expressed in terms of total liabilities and represent the extent of the company's equity in relation to its debt; the cash resources available and the cash generating ability; as well as the extent of financing activities. In the growth phase, with the pressures of the recession receding, the ability to repay its debt, together with the size of the company, are the important discriminators between failed and non-failed companies.
Statistica's classification tree algorithm calculates the relative predictor importance of the various independent variables in the classification between failed and non-failed companies. The results are summarised in Table 3 (columns headed PI). While the Kruskal-Wallis test is only performed once on the total population, the predictor importance is calculated repeatedly for each split. Each node consists of different parts of the population, for which the importance of one variable may differ from that at another node. The most important predictor (with the best ability to purify the whole classification tree) has a value of 100, and the importance of the other variables is stated in relation thereto.
For the combined economy model, the most important predictor is CFF3:TL (100). Noteworthy in both the Kruskal-Wallis tests and the predictor importance calculations is the importance of the cash flow variables, especially the three-year cumulative ratios.
Studying the medians of the independent variables that indicated a significant difference in the population location for the classes failed and non-failed (those underlined in Table 4), the general characteristics as described in Table 5 can be deduced.

Models developed
Three separate models were developed for the combined period, the recessionary phase and the growth phase. In Table 6 the number of observations, class proportions, classification accuracy of the learning sample, and the prediction accuracy of the testing sample are summarised for each population. Although the prediction accuracy of all of the models differs from the classification accuracy, indicating a certain amount of noise in the model, the results are still reasonably good.

Table 5
The failed companies are smaller than the non-failed companies: True, True, True
The non-failed companies have more liquid resources (cash) available than the failed companies: True, True
The non-failed companies have larger equity in relation to debt than the failed companies: True
The failed companies carry more accounts receivable as well as working capital, in proportion to their revenue and total assets respectively, than the non-failed companies: True, True
Both profit for the year as well as cumulative profit are smaller for the failed companies than the non-failed companies: True
Both cash flow from operating activities and the cumulative cash flow from operating activities are smaller for the failed companies than the non-failed companies: True, True, True
The three-year cumulative cash inflow from financing activities is much higher for the failed companies than for the non-failed companies, demonstrating their dependence on new capital as compensation for their lack in generating cash from operations: True, True

In each model the classification accuracy and the prediction accuracy are better than the probability of an observation being in the largest class (N-F). The prediction accuracy in total (71,2%) and of the non-failed class (78,6%) is the highest for the recession model, while the prediction accuracy of the failed class (66,9%) is the highest for the combined model.
The prediction accuracy of the failed companies during the growth period is only 57,1%, which possibly provides evidence for the hypothesis that companies that would probably have failed during a recession period are kept alive in the more positive growth period.
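The comparison between prediction accuracy and the probability of an observation being in the largest class can be sketched as below. The misclassification counts are hypothetical and are not the study's results; the function name and the dictionary layout are likewise assumptions made for the illustration.

```python
def accuracy_report(matrix):
    """Summarise a misclassification matrix.

    `matrix[actual][predicted]` holds observation counts. Returns the
    accuracy per class, the total accuracy, and the proportion of the
    largest class (the ex ante probability used as a baseline).
    """
    classes = list(matrix)
    totals = {c: sum(matrix[c].values()) for c in classes}
    n = sum(totals.values())
    per_class = {c: matrix[c][c] / totals[c] for c in classes}
    overall = sum(matrix[c][c] for c in classes) / n
    ex_ante = max(totals.values()) / n
    return per_class, overall, ex_ante

# Hypothetical testing-sample counts (not the study's results).
matrix = {
    "F": {"F": 65, "N-F": 35},
    "N-F": {"F": 45, "N-F": 105},
}
per_class, overall, ex_ante = accuracy_report(matrix)
beats_baseline = overall > ex_ante and all(
    accuracy > ex_ante for accuracy in per_class.values()
)
```

The baseline matters because a model that assigns every observation to the largest class already reaches a total accuracy equal to that class proportion, while classifying no failed company correctly.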

Combined model
The tree structure for the combined model is in Table 8 and the tree is depicted in Figure 1. The nodes in bold in the tree structures represent terminal nodes. For example, node 3 is a terminal node classified by the model as N-F, misclassifying eight failed companies as N-F. Node 4, on the other hand, is classified as F, with no misclassifications. The first split condition (Figure 1) in the model is: observations where LogTA/GDP is smaller than or equal to 14,852, move to node 2 on the left, while the others move right to node 3. Therefore, companies with the value of LogTA/GDP larger than 14,852, are classified as N-F. The second split condition is: observations where CFF3:TL is smaller than or equal to -1,047, move to node 4 on the left, and those larger to the right. Therefore, companies with the value of LogTA/GDP smaller than or equal to 14,852 and CFF3:TL smaller than or equal to -1,047, are classified as failed.
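The first two split conditions described above can be written as plain 'if-then' rules. The sketch covers only these two splits; the function name and the return convention are hypothetical, and the thresholds are written with decimal points (14.852 and -1.047) in place of the decimal commas used in the text.

```python
def combined_model_first_splits(log_ta_gdp, cff3_tl):
    """Apply the first two split conditions of the combined model.

    Returns (terminal_node, classification). Where the tree continues to
    split beyond the two conditions described in the text, (None, None)
    is returned because the later splits are not reproduced here.
    """
    if log_ta_gdp > 14.852:
        return 3, "N-F"   # node 3: terminal, classified as non-failed
    if cff3_tl <= -1.047:
        return 4, "F"     # node 4: terminal, classified as failed
    return None, None     # further splits follow in the full tree

large_company = combined_model_first_splits(15.3, 0.20)
small_with_outflow = combined_model_first_splits(14.1, -1.50)
undecided = combined_model_first_splits(14.1, 0.30)
```

Reading the tree this way shows why the method is easy to explain to users: each classification is a short chain of threshold comparisons on named ratios.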
The splitting continues until the specified node purity is obtained. The model derived from the learning sample is then applied to the testing sample, with the prediction accuracy reported in the misclassification matrix in Table 7. Neither the classification accuracy (in the region of 77%) nor the prediction accuracy (around 66%) differs much between the classes (Table 6).

Recession model
The classification tree developed for a recessionary period is depicted in Figure 2 and its structure is described in Table 9. The first splitting condition for this recessionary model is the same as for the combined model. The next split condition, however, uses cash outflow from investing activities to divide the companies, and not financing activities as in the previous tree. The classification and prediction accuracy of the class non-failed companies (80,4% and 78,6% respectively) are better than those for the class failed companies (67,1% and 60,8% respectively).

Growth model
The first split condition of the growth model makes use of the amount of cash available to the company to pay its total debt (Figure 3 and Table 11). Thereafter the next split conditions contain the cash inflow from financing activities and the size of the company measured by its total assets. Although the model's prediction accuracy for the class non-failed companies and its total prediction accuracy are slightly better than those of the combined model, the prediction accuracy for the failed class companies is worse (Table 6).

Conclusion
The deficiencies in previous studies in this branch of research were identified from the international literature on failure prediction studies. This study's purpose was to address these deficiencies while using a method not yet published in South Africa in developing failure prediction models.
The deficiencies are addressed as follows:
• Brute empirism was avoided by focussing on cash flow ratios in combination with certain accrual ratios, and not by identifying the best ratios by trial and error.
• Failure was not only defined as bankruptcy, the extreme on the failure continuum, but as any condition where the company cannot exist in future in its current form, therefore including delistings as well as major structural changes.
• By using the population of listed industrial companies between June 1997 and May 2002, the grey area in-between 'successful' and 'bankrupt' was included in developing the models.
• The use of the population instead of samples avoided the problem in previous studies of equal samples of failed and non-failed companies resulting in over or understatement of classification accuracy.
• Every model is tested with the help of an independent sample. The focus in reporting therefore is on the prediction accuracy of the testing sample and not the classification accuracy of the learning sample, which for obvious reasons will always be quite good.
• Even though the observations are only from a five-year period, the different economic cycles were considered by developing different models for a growth and a recessionary period. A combined model was developed as well, with the economic cycle as a dichotomous independent variable, although this variable was not used in the model deduced.
Three ratios emerged as the most important discriminators between failed and non-failed companies in the combined period as well as in the growth and recessionary periods. The ratios are the size of the company as measured by its total assets, and the cash flow from operating activities divided by sales for both the last financial year and a cumulative three-year period, including the last year.
The prediction accuracies of the models developed are not as 'spectacular' as some of the results previously reported. Two factors should, however, be considered. First, the classification and prediction accuracy of the models will probably not be as high as those in previous studies, because the 'grey area', and not only the extremities, was used in developing the models. Second, the prediction accuracy will probably be lower than the classification accuracy, which is the value mainly reported in previous studies that did not use an independent testing sample. Considering these factors, the prediction accuracy of the models developed in this study is reasonably good. When the prediction accuracy of the different classes and in total is compared with the probability that an observation will fall in the majority class (N-F), the prediction accuracy is in every instance higher than that probability, again an indication of a reasonably good model.