Who spends and who does not: Clustering visitors at a national arts festival

The Grahamstown National Arts Festival is the oldest National Arts Festival in South Africa and was founded in 1974. This celebration of the arts takes place over a period of eleven days with the main festival running over eight days, which also makes it the longest (in terms of number of days) arts festival in the country. The literature review revealed that high spenders at arts festivals are also the visitors who buy the most show tickets. The success of these events is determined by ticket sales and not necessarily by the number of visitors. Therefore, the purpose of this paper is to determine who the high spenders at the Grahamstown National Arts Festival are. Data obtained during the festival in 2008 by means of a questionnaire survey (N=446) was statistically analysed by means of K-means clustering, Pearson's chi-square test and ANOVAs. Results indicated two clusters, namely high and low spenders, and can assist festival organisers in developing a more focused marketing strategy and festival programme. This was the first time that K-means clustering was applied to festival data in South Africa.


Introduction
In 2008, the Grahamstown National Arts Festival celebrated its 34th year of existence, and it is South Africa's longest-running arts festival. Grahamstown is a small town in the Eastern Cape Province, a province with high levels of unemployment. The Festival began in 1974 with just over 60 items and exhibits. Currently, it offers over 500 productions that include dramas, stage shows, music, cabaret, jazz and rock. It is held annually over an 11-day period during the month of July, of which the first few days form the fringe programme and the last eight days form the main programme. The purpose of the event is to provide a cultural experience for tourists and visitors, and to grow the local economy. In 2008, the festival contributed in excess of R54 million to the local economy, supporting the notion that festivals are important generators of income for a town, city or region (Slabbert, Saayman, Saayman & Viviers, 2008). Kruger, Saayman and Saayman (2008) indicated that the existence of arts festivals is dependent on the number of tickets sold for shows. Hence, it is not the number of people attending these events that is of importance, but rather the number of tickets sold. Therefore, the focus from a marketing as well as an organisational point of view should be on attracting individuals who buy show tickets. In addition, Uys (2003) stated that a sustainable marketing strategy is required if one wants to maintain a steady growth rate. A greater understanding of the different festival markets is also paramount in developing a sustainable marketing strategy. Spotts and Mahoney (1991) indicated that in today's competitive business environment, destination marketers are attempting to expand their market share by seeking travellers (visitors) who will spend money on and not just time at their products. The same applies to festivals, since South Africa has experienced a rapid growth in the number of festivals, with nearly 300 festivals hosted annually in the country. This is an indication of how competitive the festival market has become, and since different products attract different markets, it is imperative to apply market segmentation (Uys, 2003). According to Dibb and Simkin (2004), as well as Hassan (2000), the whole market is too large to reach effectively and too diverse to communicate with in any single way; hence market segmentation.

Literature review
The tourism literature is saturated with studies that segment markets by means of a variety of variables depending on the purpose of the study (Kruger, 2009; Burke & Resnick, 2000; Formica & Uysal, 1998). These segmentation variables can be categorised into different bases (Kotler & Armstrong, 2004; Saayman & Kruger, 2009; Slabbert, 2002; Burke & Resnick, 2000). However, Kruger (2009) indicated that segmentation based on expenditure, i.e. identifying various spending segments such as low spenders and high spenders, is an alternative approach recently followed. The reason for considering expenditure-based segmentation is that researchers such as Spotts and Mahoney (1991), Craggs and Schofield (2006), and Kruger (2009) clearly indicate this to be a very effective method of identifying high spenders.
With regard to the latter, Craggs and Schofield (2006) confirmed that a wide variety of variables influence visitor expenditure. These variables, as indicated by Cravens (1997), are then used to describe or create a profile for each segment. Burke and Resnick (2000) add that each segment can then be targeted or reached with a distinct marketing mix designed to meet its needs. The benefits of applying expenditure-based segmentation are documented by Craggs and Schofield (2006), Thrane (2002), and Mok and Iverson (2000). In addition, Legoherel (1998) stressed that travel expenditure may often be superior to activity measures as a segmentation variable, because travel expenditures for a given unit of travel activity can vary significantly from one travel group to the next. Research by Cook and Mindak (1984) found that heavy users of consumer products account for a large percentage of sales. This has since been confirmed by numerous studies such as Saayman and Saayman (2008), Thrane (2002), and Spotts and Mahoney (1991). Therefore, the high spending market is not only beneficial from an economic point of view, but also from an environmental point of view (Saayman & Saayman, 2006). The latter argue that by attracting high spenders who buy tickets for shows and productions, the overcrowding associated with festivals may be avoided.
In support of the above, Thrane (2002) indicated that high spenders at festivals are likely to stay longer. Other research findings indicated that high spenders are better educated (Woodside, Cook & Mindak, 1987; Pizam & Reichel, 1979). Skuras et al. (2005) as well as Downward and Lumsdon (2002) determined that an increase in the size of the travel group leads to increased spending. Oppermann (1996) found that repeat visitors spend less than first-time visitors do; however, Gyte and Phelps (1998) confirmed the opposite, while Jang et al. (2004) concluded that frequency of visitation is an influencing factor in visitor expenditure.
With regard to language, as well as province of origin, Saayman and Saayman (2006) showed that the latter is significant in the case of arts festivals in South Africa.
The role of age in spending is not conclusive, because research findings by Mok and Iverson (2000) and Kastenholz (2005) revealed a positive relationship between age and spending, while Mumdambi and Baum (1997) indicated an inverse relationship between age and spending.
The reason or purpose of travel, according to Letho et al. (2004) and Sakai (1988), has a definite impact on expenditure levels. In this regard, Mehmetoghlu (2007) confirmed that visitors who travel for the sole purpose of attending a festival or destination spend more.
Even though expenditure-based segmentation has been applied to different tourism products and destinations (Saayman & Saayman, 2008; Craggs & Schofield, 2006; Mok & Iverson, 2000), only one similar study at arts festivals in South Africa was found (Kruger et al., 2008). The latter indicated three distinct segments of visitors to the Klein Karoo National Arts Festival, namely high-, medium- and low spenders. Variables that had the most significant differences were age, occupation, province of origin, length of stay, type of accommodation used and type of shows/productions attended.
Results from this research (Grahamstown National Arts Festival) could therefore be compared to similar research, especially the study conducted by Kruger et al. (2008), since these two arts festivals differ in size, the markets they attract and the type of productions/shows offered. Hence the purpose of this article is to apply expenditure-based segmentation to visitors to the Grahamstown National Arts Festival.

Method of research
The method used in the research will be discussed in the following sections, namely sampling, questionnaire, data collection and data analysis. Since the research required the collection of primary data, a visitor survey was conducted over a period of six days at the Grahamstown National Arts Festival during June/July 2008.

Sampling
Sampling was based on the availability and willingness of visitors to complete the questionnaire. Cooper and Emory (1995) point out that for a population of 100 000 (N) the recommended sample size is 384. Since this festival attracts approximately 33 000 visitors (Slabbert et al., 2008), it was decided to distribute 450 questionnaires in order to ensure a large enough number of completed questionnaires.

Questionnaire
The questionnaire used was similar to previous questionnaires used by Saayman and Saayman (2006) at other arts festivals, namely the Klein Karoo National Arts Festival (KKNK) and Aardklop National Arts Festival in South Africa. The questionnaire included questions of a demographic nature (age, gender, language, occupation, province of origin) as well as questions on travel and participation behaviour (number of people in the group, financial responsibility, days spent at the festival, spending at the festival and genres attended).

Data collection
As mentioned earlier, this festival consists of two parts, namely a fringe and a main programme. The fringe programme is hosted first and then the main programme.
The focus of the survey was on the main programme, since the fringe programme is aimed specifically at the community of Grahamstown. All questionnaires were completed at the Main Festival Grounds, where fieldworkers moved around in order to minimise bias. Questionnaires were progressively distributed during the last 6 days of the festival: fieldworkers distributed 50 questionnaires on day one and increased the number by 10 per day over the 6 days. Of the 450 questionnaires distributed, a total of 446 questionnaires were collected for data capturing during the festival.
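The distribution schedule can be checked with a few lines of arithmetic. The sketch below assumes that "10 more per day" means 50, 60, ... questionnaires on successive days; the variable names are illustrative only.

```python
# Questionnaires handed out per survey day: 50 on day one, then 10 more each day.
FIRST_DAY = 50
DAILY_INCREASE = 10
SURVEY_DAYS = 6
COLLECTED = 446  # questionnaires collected, as reported in the text

schedule = [FIRST_DAY + DAILY_INCREASE * day for day in range(SURVEY_DAYS)]
total_distributed = sum(schedule)
return_rate = COLLECTED / total_distributed

print(schedule)              # [50, 60, 70, 80, 90, 100]
print(total_distributed)     # 450
print(f"{return_rate:.1%}")  # 99.1%
```

Under this reading the daily totals add up to the 450 questionnaires reported as distributed.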

Data analysis
Data was coded in Microsoft Excel and processed using SPSS (Statistical Package for the Social Sciences). Visitors who had not provided any spending information, or who had not indicated the number of people they were financially responsible for while at the festival, were discarded from the study. This is because spending per person per day, excluding transport cost to the festival, was used as the basis for the clustering. This rendered an adapted sample of 326 usable questionnaires.
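A minimal sketch of how such a clustering variable could be derived is given below. The field names, the assumption that reported total spending includes transport, and the division by days at the festival are all hypothetical; the respondent values are made up for illustration and are not survey data.

```python
def spend_per_person_per_day(total_spend, transport_cost, people_paid_for, days):
    """Return daily spending per person excluding transport, or None if unusable."""
    # Respondents without spending or financial-responsibility data are discarded.
    if total_spend is None or people_paid_for is None or days is None:
        return None
    if people_paid_for <= 0 or days <= 0:
        return None
    # Assumption: total_spend includes transport, which is subtracted here.
    return (total_spend - (transport_cost or 0.0)) / (people_paid_for * days)


# Hypothetical respondents (illustrative values only).
respondents = [
    {"total_spend": 3000.0, "transport_cost": 600.0, "people_paid_for": 2, "days": 4},
    {"total_spend": None, "transport_cost": 200.0, "people_paid_for": 1, "days": 3},
    {"total_spend": 1500.0, "transport_cost": 0.0, "people_paid_for": None, "days": 2},
]

values = [spend_per_person_per_day(**r) for r in respondents]
usable = [v for v in values if v is not None]
print(usable)  # [300.0]
```

Only the first hypothetical respondent supplies both spending and group information, so only that record would remain in the adapted sample.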
Since most variables are category variables, they were coded in the form of dichotomous dummy variables in order to test the significance of each category. In general, the dichotomous dummy variable takes the value 1 if the respondent falls in the category concerned and 0 otherwise. The descriptive statistics of the various dichotomous variables are available in the appendix. Furthermore, some justification for the use of certain variables is presented below:

Since 72.7 percent of respondents were English-speaking, English is the only language tested for a significant relationship with spending clusters (refer to the data tables in the appendix).


Occupation can be viewed as a weak proxy for income, and occupation category 1 is viewed as high-income occupations (professional persons), while 3 indicates low-income occupations (e.g. students, pensioners).


The four provinces that were tested include Western Cape (17.8 percent), Eastern Cape (51.2 percent), Gauteng (16.3 percent) and KwaZulu-Natal (4.9 percent). Together these provinces are home to 90.2 percent of the respondents (refer to the data tables in the appendix).


If the respondent is in Grahamstown for the sole purpose of visiting the arts festival, it is indicated by a 1; if the respondent is a local, it is coded as 3; and if the respondent is in Grahamstown for another reason, it is indicated by 2.
The continuous variables were kept in their original form and the descriptive statistics are indicated in the appendix.
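The coding scheme described above can be sketched as follows, assuming the usual 1/0 form of a dichotomous dummy variable. The record layout and column names are hypothetical; only the category codes (occupation 1 and 3, reason 1) and the four tested provinces come from the text.

```python
def dummy(value, category):
    """Dichotomous dummy: 1 if the respondent falls in `category`, else 0."""
    return 1 if value == category else 0


TESTED_PROVINCES = ["Western Cape", "Eastern Cape", "Gauteng", "KwaZulu-Natal"]

# Hypothetical coded responses (illustrative values only).
respondents = [
    {"language": "English", "occupation": 1, "province": "Gauteng", "reason": 1},
    {"language": "Afrikaans", "occupation": 3, "province": "Eastern Cape", "reason": 3},
    {"language": "English", "occupation": 2, "province": "Western Cape", "reason": 2},
]


def encode(respondent):
    """Turn one coded questionnaire into a row of dummy variables."""
    row = {
        "english": dummy(respondent["language"], "English"),
        "occupation_high": dummy(respondent["occupation"], 1),
        "occupation_low": dummy(respondent["occupation"], 3),
        "sole_purpose": dummy(respondent["reason"], 1),
    }
    for province in TESTED_PROVINCES:
        key = "province_" + province.lower().replace(" ", "_").replace("-", "_")
        row[key] = dummy(respondent["province"], province)
    return row


encoded = [encode(r) for r in respondents]
for row in encoded:
    print(row)
```

Each dummy can then be cross-tabulated against cluster membership, one category at a time.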
Various methods can be used to identify market segments.
Expenditure-based segmentation focuses on identifying market segments based on expenditure differences. Previous expenditure-based research often divided the market into a low-, medium- and high-spending segment and entailed categorising each member accordingly. Clustering, on the other hand, makes no assumptions concerning the number of groups or group structure. Instead, the members are grouped together based on their natural similarity (Johnson & Wichern, 2007:671-673).
One clustering method is K-means clustering, which is a non-hierarchical technique for grouping items (rather than variables). K-means clustering assigns each item to a group based on the nearest mean (or centroid) (Johnson & Wichern, 2007:696-697). The measures of closeness or similarity used depend on the data, but a common distance measure is the Euclidean distance, which is also employed in this research. One disadvantage of K-means clustering is that the number of clusters (K) must be chosen. To limit the bias, it is suggested that various numbers of clusters are tried. Therefore, K was initially chosen as 5. Yet, the results showed too limited membership in the last two clusters, and the clustering was redone choosing K = 3.
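To illustrate the technique, the sketch below runs a basic K-means procedure (Lloyd's algorithm) on a single spending variable, where Euclidean distance reduces to the absolute difference. It is not the SPSS routine used in this research: the initialisation rule is an assumption chosen to make the example deterministic, and the spending values are hypothetical.

```python
def kmeans_1d(values, k, max_iter=100):
    """Lloyd's algorithm on one variable with Euclidean (absolute) distance."""
    ordered = sorted(values)
    # Assumed deterministic start: spread initial centroids across the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(max_iter):
        # Assignment step: each item joins the cluster with the nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Update step: each centroid becomes the mean of its members.
        new_centroids = [
            sum(c) / len(c) if c else centroids[j] for j, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical spending per person per day in rand (not survey data).
spend = [120, 180, 200, 240, 260, 300, 310, 350, 800, 850, 900, 950, 5000]

centroids, clusters = kmeans_1d(spend, k=3)
sizes = [len(c) for c in clusters]
print(centroids)  # [245.0, 875.0, 5000.0]
print(sizes)      # [8, 4, 1]
```

In this made-up example the third cluster holds a single extreme value, the kind of thin membership that would prompt a researcher to treat it as an outlier rather than a segment.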
Once the clusters were identified, the differences between the clusters were explored using analysis of variance (ANOVA) for the continuous variables, and Pearson's chi-square test for the categorical variables. All analyses were performed in SPSS.
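Both tests can be sketched with the standard library alone. The functions below compute the textbook Pearson chi-square statistic for a 2x2 table (cluster by dummy variable, without continuity correction) and the one-way ANOVA F statistic; the counts and ages are hypothetical, and the SPSS output used in this research reports further detail.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value (1 degree of freedom), 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def one_way_anova_f(groups):
    """F statistic: between-group mean square over within-group mean square."""
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical counts: rows are clusters, columns are dummy = 1 and dummy = 0.
stat, p_value = chi_square_2x2([[20, 30], [30, 20]])
print(stat)               # 4.0
print(round(p_value, 4))  # 0.0455

# Hypothetical ages of respondents in two clusters.
f_stat = one_way_anova_f([[30, 32, 34], [40, 42, 44]])
print(f_stat)             # 37.5
```

With two clusters, the one-way ANOVA is equivalent to a two-sample t-test with pooled variance, so the F statistic equals the square of the t statistic.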

Results
The results of the K-means clustering process on the expenditure per person per day variable are indicated in Table 1. It is evident from Table 2 that there are differences in demographic profile, travel behaviour and show preference of the persons in different clusters. Yet, many of the percentages are quite close, and to determine whether the differences are significant, Pearson's chi-square test was used for the category variables and an ANOVA for the continuous variables. Table 3 documents the results of the chi-square and ANOVA tests.
The ANOVA test results show that the average age, the number of people the respondent is paying for, as well as the days and nights spent at the festival are significantly different between Cluster 1 and Cluster 2. The chi-square test indicated that there is a significant association between the spending clusters and occupations 1 and 3, the provinces Gauteng and Eastern Cape, the Dance genre, as well as the main reason why the respondent is attending the festival.

Discussion of results
The results of the clustering show that there are only two clear expenditure segments, namely high spenders and low spenders. The low-spending segment is approximately five times the size of the high-spending segment. Comparing the two segments revealed that there are significant differences between high spenders and low spenders.
High spenders are significantly older than low spenders, although the average age of both these segments is in the mid-thirties. High spenders are predominantly from high-income earning occupations, such as professionals and managers. Significantly more high spenders are from Gauteng and they tend to be financially responsible for fewer people while at the festival. They also spend one day less at the festival than low spenders. They are mainly in Grahamstown for the sole purpose of the festival and attend significantly more dance items than low spenders. Therefore the festival organisers should make an effort to improve their marketing efforts aimed at visitors from Gauteng and offer packages that would extend these visitors' stay at the festival. The packages could include accommodation, meals and shows.
The low spenders, on the other hand, are younger and significantly more are from low-income earning occupations (such as students). Results also indicate that significantly more low spenders are from the Eastern Cape Province, which might suggest that they are locals and thus save on accommodation costs. This notion is also confirmed by the reason for attending the festival, where significantly more people in this category indicated that they are locals or not there for the sole purpose of the festival, which could imply visiting family and friends. They tend to stay one day longer at the festival and are financially responsible for more people. This might lead to cost sharing. The challenge for festival organisers is to increase the ticket sales of this particular market, since they are young and could become a high spending market in future.
If one compares these results with findings from the literature review, then the results reflect similarities as well as contradictions. The following examples illustrate this. With regard to age, this research confirms that older visitors spend more, thereby supporting research by Kruger et al. (2008), Van der Merwe et al. (2007), Letho et al. (2004), Mok and Iverson (2000), and Kastenholz (2005), but contradicts research by Mumdambi and Baum (1997). This research also confirms a positive relationship between expenditure and occupation, as was the case in research by Kruger et al. (2008), Pizam and Reichel (1979), and Woodside et al. (1987). Province of origin is also an important variable that has a positive relationship with expenditure, which again supports research by Saayman and Saayman (2006), and Kruger et al. (2008) conducted at South Africa's largest arts festival. Interestingly enough, this research contradicts the notion that length of stay increases spending, thereby contradicting findings by Thrane (2002). The research, however, supports findings by Letho et al. (2004) and Sakai (1988) that the sole purpose of visiting a festival/destination has a positive impact on spending.
With regard to types of shows, these findings contradict those of Kruger et al. (2008), who found that high spenders prefer drama, word art, poetry and music theatre, and cabaret. This study, however, indicated that high spenders mainly prefer dance shows. The reasons for these differences are not clear.
Implications of this research are firstly that the results differ from findings by Kruger et al. (2008) related to research conducted at South Africa's largest arts festival, namely KKNK. These differences include length of stay, type of accommodation and the type of shows. Therefore, one cannot use results from one arts festival and apply them to another arts festival, even if the research was conducted in the same country or at the same type of event. Hence, the profile of high spenders differs from one arts festival to the next. Secondly, these results can be used to develop a marketing strategy that should focus on both markets (high and low spenders), since both markets have something to offer. By attracting the high spenders as indicated above, more tickets for shows should be sold. However, since the low spenders are the younger visitors (predominantly students), this is a market for the future and should not be neglected. Thirdly, the festival organisers could also use these findings in order to package the shows in such a way as to entice high as well as low spenders to buy more show tickets, as was indicated above. Therefore, this type of research could support the development and implementation of a growth strategy, especially in the case where the product life cycle is experiencing a downturn. Lastly, in order to increase the number of high spenders, the focus should be on Gauteng.

Conclusions
The purpose of this research was to apply expenditure-based segmentation to visitors to the Grahamstown National Arts Festival using K-means clustering and ANOVAs. It was the first time that K-means clustering was used on arts festival data in South Africa, and the approach proved successful and innovative. Results clearly showed two distinct markets or clusters, namely high and low spenders. Results also showed similarities with, as well as contradictions of, previous research; for example, this research contradicted the notion that a longer stay results in higher expenditure. The importance of this research is grounded in the fact that this study confirmed that the profile of high spenders at one arts festival is not necessarily the same as at another national arts festival. Therefore, it is recommended that this type of research should be applied in more areas of research that could include different events as well as destinations.

Table 1.
The table shows that Cluster 1 has a mean spending per person per day of R255.66 and has 265 members. Cluster 2's mean spending per person per day is R882.69, and 58 members fall into this cluster. Cluster 3 has the highest mean spending, but with only one member it cannot be regarded as a cluster. The member in Cluster 3 is therefore a clear outlier and is disregarded in further analyses. The results therefore show that a natural division of the data is into two groups only: a high-spending segment and a low-spending segment.

Table 2 provides a summary of the two clusters and how they differ in terms of the variables tested. From Table 2 it can be seen that persons in Cluster 2 (high-spending cluster) are, on average, slightly older than those in Cluster 1 (low-spending cluster). There are also more males and English-speaking persons in this cluster relative to Cluster 1. While persons in Cluster 1 are predominantly from low-income occupations, persons in Cluster 2 are mainly from high-income occupations. A relatively higher percentage of people in Cluster 2 are from Gauteng and Western Cape, while persons in Cluster 1 are more likely to reside in the Eastern Cape. Not only do the demographics between the clusters show differences, but also the travel patterns and preferences. It is evident from Table 3 that persons in Cluster 2 travel in slightly smaller groups and spend one day and night less at the festival than those in Cluster 1. Relative to persons in