Standardising the single and double letter cancellation test for South African military personnel

Introduction
Neuropsychological assessment is integral to clinical work (Lucas, 2013) and also forms part of test batteries used in organisations such as mining, manufacturing, construction and the military. Psychological testing within the military has become invaluable in the assessment and preparation of its personnel. Nwafor and Adesuwa (2014) described the use of psychological testing within the military context as a process that takes place on a continuum, starting from recruitment where an individual is assessed, to job utilisations for promotions and placements, to special missions and the diagnosis and treatment of disorders, and continuing until retirement. Within the South African military context, personnel perform a wide array of functions and occupational duties, each of which has its own specific requirements and criteria. Attention has been highlighted as a central neurocognitive skill that is necessary for highly specialised occupational duties as well as simple everyday functions in the military. Kennedy and Zillmer (eds. 2012) stated that soldiers are required to 'maintain high levels of consistent attention and concentration in order to perform effectively and safely' (p. 199). Even when preparing for the start of a day, the command of 'attention' is often given by the commander in order to make all soldiers focus on their duties for the day. It is therefore standard practice to include a measure of attention as part of an assessment battery. One of the major concerns in the field of psychological testing for this context is, however, ensuring that the normative data are representative of military personnel.
Military personnel take up many different roles, such as piloting, weapon handling, medical service and deployment. It is therefore essential that tests are available to evaluate attention and concentration in order to ensure that individuals are competent to carry out their specific duties. Currently, related tests are used in the military for specialised career placements. These tests are also used as part of soldiers' rehabilitation processes. Even slight impairments in attention and concentration, which can result from traumatic brain injury, can have substantial repercussions for a soldier's effectiveness while on duty or in combat during the recovery period (Hatta, Yoshizaki, Ito, Mase, & Kabasawa, 2012; Kennedy & Zillmer, 2012). Attentional disorders (e.g. attentional deficit and hyperactivity disorder, perseveration and distractibility, confusional state and visual neglect), if undetected or untreated, would also impact on effective functioning in this context. For example, risk factors associated with attentional deficit and hyperactivity disorder include:
In a classic quote, psychologist William James defined attention as processing 'one out of what seem several simultaneously possible objects or trains of thought … It implies withdrawal from some things in order to deal effectively with others' (James, 1890, pp. 403-404). A person's capacity for paying attention to daily activities is crucial for the successful completion of everyday tasks (Lezak, Howieson, & Loring, 2004). In order for individuals to function effectively, they need the ability to focus on the task at hand while simultaneously ignoring other distracting factors. This ability requires them to filter, select, focus, shift and track information (Groth-Marnat, 2009). Stankov (1988) identified six components of attention:
• Attentional span refers to the size of an individual's capacity to hold information in mind to allow processing.
• Concentration encompasses 'the capacity to sustain attention on relevant stimuli and the capacity to ignore irrelevant competing stimuli' (Scott, 2011, p. 149). Concentration requires sustained focus on a task over a period of time.
• Search speed refers to the time of target selection when visually searching through a series of items for an identified target, or to detect similarities or differences (Cohen, 2013).
• Divided attention refers to the ability to respond to more than one stimulus at a time (Baron, 2004). In everyday language, we may refer to this ability as multi-tasking.
• Selective attention refers to attending to certain stimuli while disregarding other irrelevant stimuli (Glisky, 2007).
• Attention switching refers to the capacity to 'consciously reallocate attentional resources from one activity to another' (Hebben & Milberg, 2009, p. 108).
As attention consists of a variety of processes, a comprehensive test of attention would consequently measure a range of these processes. Based on a review of existing literature, Coetzer and Balchin (2014), Lezak, Howieson, Bigler and Tranel (2012), Mirsky, Anthony, Duncan, Ahearn and Kellam (1991) and Stump (2002) recommended letter cancellation tests as comprehensive measures of attention. Letter cancellation tests can also be considered as screening tests. Hatta et al. (2012) stated that cancellation tests are simple, yet effective measures of attention as they are cost-effective and applicable over a wide spectrum of age groups.
Cancellation tests are usually paper-and-pencil tests where an individual needs to identify and cancel target items (Azouvi et al., 2006). Most cancellation tests consist of target stimuli that are distributed amongst distractor stimuli. The target stimulus is the identified symbol or letter that the individual needs to identify and cancel, while the distractor stimuli aim to divert the individual's attention from the target stimulus. Performance is scored by recording the number of omissions, errors and time taken to complete the test (Lezak et al., 2004).
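The scoring idea described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a test sheet is represented as a string of letters in reading order and that the participant's strikes are recorded as a set of positions; the function name `score_cancellation` and the sample letters are hypothetical and are not taken from any published form of the test. Time is recorded separately by the administrator and is not modelled here.

```python
def score_cancellation(stimuli, targets, cancelled):
    """Count omissions and errors for one sheet of a cancellation test.

    stimuli   -- string of letters in reading order (hypothetical layout)
    targets   -- set of target letters, e.g. {"H"} or {"C", "E"}
    cancelled -- set of 0-based positions the participant struck out
    """
    # An omission is a target letter that was not struck out.
    omissions = sum(
        1 for pos, letter in enumerate(stimuli)
        if letter in targets and pos not in cancelled
    )
    # An error is a struck-out letter that is not a target (a distractor).
    errors = sum(1 for pos in cancelled if stimuli[pos] not in targets)
    return omissions, errors


# Illustrative data only: targets (H) sit at positions 1, 4 and 6.
row = "BHCEHAHD"
single = score_cancellation(row, {"H"}, cancelled={1, 3, 6})
double = score_cancellation(row, {"C", "E"}, cancelled={2})
print(single)  # one H missed (position 4), one distractor struck (E)
print(double)  # C struck, E missed, no distractor struck
```

Representing strikes as positions rather than letters keeps the two counts independent: a sheet can show both omissions and errors, and neither masks the other.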
Studies on the standardisation of the letter cancellation test have been conducted in different contexts. Amongst others, these include the original development for the 1946 birth cohort study in a British context (Richards, Kuhn, Hardy, & Wadsworth, 1999), the use of the letter cancellation test on American samples (Uttl & Pilkenton-Taylor, 2001; Warren, Moore, & Vogtle, 2008) and normative data for Indian school-going children (Pradhan & Nagendra, 2008). These studies all differ in terms of the administration and scoring instructions utilised. This poses a challenge to the reliable use of the letter cancellation test, as differing instructions alter the quality of participants' responses and thus compromise the comparability of test results (Groth-Marnat, 2009). Each of these studies also developed differing sets of normative data.
In addition, no South African standardisations were found. The letter cancellation test has been used in research studies in South Africa (e.g. Jossub, Cassimjee, & Cramer, 2017); however, to date, there have been no studies focussing on the suitability of the test for South African populations. Practitioners and academics in the South African context indicate that Lezak et al.'s (2012) international guideline that 'normal performance limits have been defined as 0-2 omissions in 120 seconds' (p. 381) is used. This guideline, however, describes the scores only vaguely, was developed on an international platform and does not allow for consideration of the impact of South African socio-demographic variables on test performance. According to Nell (2000) and Shuttleworth-Edwards (2016), neuropsychological tests without relevant normative data place clinicians at risk of misdiagnosing their patients. Anderson (2001) argued that 'the injudicious use of imported normative data could result in an unacceptably high diagnostic rate of neuropsychological impairment in otherwise healthy South Africans' (p. 33).
In particular, no normative data are available which provide for the context-specific demographics and skills profile of the South African military environment. The letter cancellation test is a paper-and-pencil test that may prove beneficial as a quick measure of attention (Pradhan & Nagendra, 2008) in military personnel. Given that crucial decisions are made using test results, appropriate normative data are essential to ensure fairness. Therefore, this study set out to standardise the letter cancellation test for military personnel in the South African National Defence Force (SANDF), by:
• constructing standardised administration and scoring procedures for testing
• investigating the influence of demographic variables with a view to establishing subgroup normative data for military personnel
• establishing preliminary normative data for a sample of military personnel.

Method
Participants
The target population comprised military personnel in the SANDF. Non-random (voluntary) sampling was used, resulting in an initial selection of 300 participants. The sample comprised people who were multilingual. Demographic variables of interest were age (the majority of the military personnel are 18-49 years old), gender (approximately 30% of the population are female and 70% male) and rank (15% are officers and 85% non-commissioned officers) (Defence Web, 2011; Martin, 2015). Rank refers to the level of seniority in the military and is regarded as relevant to assessment-related research conducted in the SANDF. The majority of the participants were right-handed (93.8%); handedness is a variable of importance when conducting neuropsychological tests. Level as well as quality of education has been shown to influence neuropsychological test performance (Lucas, 2013), especially in the case of cognitive batteries with a higher level of complexity. In the present sample, 97% of the participants completed grade 12 and 38% obtained further qualifications. Education was therefore not regarded as a challenge considering the nature of cancellation tasks (see Brucki & Nitrini, 2008). Individuals with a history of attention or neurological disorders, and those with visual impairments, were excluded to limit confounding variables that might impact on test performance. Participants were also screened for current use of chronic medication that might impact on their performance. The resulting sample comprised 292 participants. Representation in terms of age, gender and rank is illustrated in Table 1.

Instruments
The aim of this study was to develop standardised administration and scoring procedures for the letter cancellation test before establishing normative data on the test. Two trials of the letter cancellation test were constructed for the data collection of this study, namely the single (H) letter cancellation test and the double (CE) letter cancellation test. This was done to establish normative data for simple and double mental tracking. The existing H letter cancellation test used to assess single mental tracking, as presented by Lezak et al. (2004), is made up of two parts. Existing scoring procedures present the two parts of the letter cancellation test with an overall score and total number of errors and omissions (Lezak et al., 2004; Pradhan & Nagendra, 2008; Uttl & Pilkenton-Taylor, 2001). Because tests of sustained attention require prolonged tasks, modifications were made to the number of parts when compiling both the single and double letter cancellation tests for use in collecting empirical data for this study. The length of both the H and CE letter cancellation tests was expanded from two to six parts to allow for the assessment of sustained attention. The formats of the parts are consistent: each is a group of letters arranged in the same number of lines.
In order to ensure uniform administration of the letter cancellation test, the instructions were documented in text and the test administrators were required to read them out verbatim so that the testing instructions remained consistent. Clear and detailed instructions were provided on how to complete the test:
• Firstly, participants were instructed to scan the test from left to right, and then to go down one row at a time following the same scanning process, and to cancel targets by striking out the specified letter using a pencil.
• Secondly, participants were told that their performance on the test would be timed and that they were required to work as quickly as they could. They were also informed that there was no specific time limit imposed on how long they should take to complete the test.
• Lastly, participants were informed that they would be completing two trials of the test.
Additionally, a scoring profile was created so that the scoring remained consistent, thus enhancing the integrity of the study. This document was constructed to record participants' time and performance in each part of the test. Test administrators were instructed to record the time taken to complete the task (in seconds), the number of errors made (i.e. non-target items erroneously identified), the number of omitted letters (i.e. target items not identified) and any self-correcting attempts for each part in order to establish what is … This study, therefore, provides test scores for each of the six parts of the H and CE letter cancellation tests in terms of time, error and a total score, a significant improvement on earlier scoring procedures. The proposed detailed scoring aims to provide clinicians with more comprehensive information on the letter cancellation test, and to further aid assessment and diagnostic practices.
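A scoring profile of the kind described above could be represented roughly as follows. This is a hedged sketch only: the class `PartScore`, the function `summarise_profile` and all numbers are hypothetical, and the totals are plain sums across the six parts, which is an assumption about how a total might be formed rather than the study's documented scoring rule.

```python
from dataclasses import dataclass


@dataclass
class PartScore:
    """What the administrator records for one part of one trial."""
    time_seconds: float
    omissions: int          # target letters not cancelled
    errors: int             # non-target letters cancelled
    self_corrections: int = 0


def summarise_profile(parts):
    """Sum each recorded measure over the six parts of a trial.

    Assumption: totals are simple sums; the source does not spell out
    the exact formula used for its total score.
    """
    if len(parts) != 6:
        raise ValueError("a trial is expected to have exactly six parts")
    return {
        "time_total": sum(p.time_seconds for p in parts),
        "omissions_total": sum(p.omissions for p in parts),
        "errors_total": sum(p.errors for p in parts),
        "self_corrections_total": sum(p.self_corrections for p in parts),
    }


# Made-up record for one participant on one trial (not study data).
h_trial = [
    PartScore(20.0, 0, 0),
    PartScore(21.0, 1, 0),
    PartScore(22.0, 0, 0, self_corrections=1),
    PartScore(23.0, 0, 1),
    PartScore(24.0, 2, 0),
    PartScore(25.0, 0, 0),
]
print(summarise_profile(h_trial))
```

Keeping the per-part records rather than only the totals preserves the information needed to see whether performance changes across successive parts.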

Procedures
All SANDF members have their health status examined annually. Appointments are made on a random basis, implying that at any given period, representation in terms of the specified stratification variables (age, gender, rank) could be expected amongst those being assessed. Participants were recruited on a voluntary basis during an arbitrarily selected period of assessments. They were primarily from the Gauteng assessment centre, with some participants selected from the Western Cape centre. (Note that the former centre often also caters for members from other provinces.) All possible efforts were made to ensure that the testing environment was comfortable and reasonably quiet. A screening questionnaire was completed by all participants. Socio-demographic information was obtained and participants had to answer questions regarding their suitability for the study. Psychologists (clinical and counselling) and registered counsellors employed in the SANDF administered the test on an individual basis. The administrators attended a training session and also met with the researcher before each session to prepare for the testing.
The tests were administered in English. This is the main medium of communication in the SANDF, and as such, proficiency in the language is a requirement and could be assumed in this study.

Ethical consideration
Ethical clearance was obtained from the University of South Africa (UNISA) Ethics Committee, reference number: SG (D Psych)/R/104/10/5, for a study involving human participants. In the case of the SANDF, the chain of command implied clearance by various structures, departments and units: Defence Intelligence (reference number: DI/DDS/R/202/3/7) and the Military Health Service (reference number: AMHF/R/104/10/05). Permission was also granted for collecting and using the data for a master's dissertation and for publishing the results in a journal. Informed consent was obtained from all participants, and confidentiality was maintained by securing the data (a locked cupboard and password protection) and ensuring that no personal information was published. Arrangements were made for appropriate referral should the test results indicate the need for further intervention in individual cases.

Data analyses
Descriptive statistics (i.e. means and standard deviations) were calculated for the two trials of the test (H and CE letter cancellation test), for each of the six parts, and for each score, that is, time, omissions and errors made. Comparative analyses were conducted to determine if selected demographic variables had a significant impact on test performance (and thus warranted separate tables for comparison). Analyses were only performed in cases where the cell size was at least n = 30. The sample size allowed for an independent samples t-test to be used to compare the performance of the gender groups and the different ranks, whereas the role of age was investigated by means of one-way analysis of variance (ANOVA). In the case of the latter, significant results were further explored by means of post hoc comparisons using Tukey's Honest Significant Difference (HSD) test (Pallant, 2016) to determine which specific group means differed from each other. Visual representations were inspected to assess the normality of the distributions. In addition, the Shapiro-Wilk test and the Kolmogorov-Smirnov test were conducted.
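To make the two comparative analyses concrete, the sketch below computes the test statistics for an independent samples t-test (pooled-variance form) and a one-way ANOVA using only the Python standard library. It is an illustration under stated assumptions, not the analysis pipeline used in the study: the function names are hypothetical, the completion times are made up, and no p-values are produced (in practice a statistics package, for example `scipy.stats.ttest_ind` and `scipy.stats.f_oneway`, would supply those).

```python
from statistics import mean, variance


def pooled_t(a, b):
    """Independent samples t statistic with pooled variance.

    Returns (t, degrees of freedom).
    """
    na, nb = len(a), len(b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5
    return t, na + nb - 2


def one_way_f(groups):
    """One-way ANOVA F statistic.

    Returns (F, df between groups, df within groups).
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    f = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f, k - 1, n - k


# Made-up completion times in seconds (not study data).
group_a = [100, 110, 120]
group_b = [110, 120, 130]
group_c = [120, 130, 140]

t, t_df = pooled_t(group_a, group_b)
f, df_between, df_within = one_way_f([group_a, group_b, group_c])
print(t, t_df)
print(f, df_between, df_within)
```

With exactly two groups the one-way ANOVA F equals the square of the pooled t statistic, which is a convenient sanity check on either calculation.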

Descriptive statistics
The means and standard deviations for each of the six parts of the H letter cancellation test and the CE letter cancellation test are provided in Table 2 for the different scoring categories (i.e. omissions, errors and time). The number of errors made in both versions was small with no errors recorded in some parts of the tests. In both versions, performance was progressively slower in the different parts of the tests.
Only in the case of time taken to complete the tests did the distributions resemble normality (see Pillay, 2017 for detail). However, all results could be regarded as right skewed, and this has implications for the interpretation of the typical performance of the target population.

Demographic variables: Gender, rank and age
Independent samples t-tests showed no significant differences between males (n = 198) and females (n = 92) in terms of omissions and errors on both the H and CE letter cancellation tests. Significant differences were, however, found for time scores on all parts of the tests, with females performing the tasks in less time than males (refer to Tables 3 and 4). No significant differences were found between officers (n = 53) and non-commissioned officers (n = 238). The performance of four age categories (20-29 years, n = 100; 30-39 years, n = 101; 40-49 years, n = 72; and 50-59 years, n = 18) was compared by means of ANOVA. No significant differences were found in terms of omissions and errors, but the groups did differ on the time scores (refer to Tables 5 and 6). Post hoc comparisons using Tukey's HSD test indicated that these differences were between those younger than 40 and those older than 40, with the latter performing more slowly on the tasks. The descriptive statistics for time total for gender by age illustrate these trends (Table 7).

Discussion
At present, comparative data for the CE letter cancellation test are limited to an overall score (for two parts) and the statement that 'normal performance limits have been defined as 0-2 omissions in 120 seconds' (Diller, Ben-Yishay, & Gerstman, 1974; Lezak et al., 2012, p. 381). Although Diller et al. (1974) found that there were no significant differences in performance based on gender and age, the present study did find age and gender to impact on an individual's time taken to complete the tests. In addition to the comparative data for the total sample provided in this manuscript, stratification in terms of age and gender was therefore necessary. Significant differences between males and females in terms of the time taken to complete the tasks are consistent with the findings of Pradhan and Nagendra (2008), Upadhayay and Guragain (2014) and Uttl and Pilkenton-Taylor (2001). Upadhayay and Guragain (2014) also found that women performed faster than men in paper-and-pencil tests. These findings could be partly explained by the fact that different parts of men's and women's brains are activated during different tasks, thus demonstrating that the genders utilise different parts of their brains to solve problems (Brizendine, 2009).
Significant differences were also found between those below and above 40 years of age, with the latter taking more time to complete the tasks. Age-related decline in speed for the letter cancellation test has been reported previously (Pradhan & Nagendra, 2008; Uttl & Pilkenton-Taylor, 2001). This may be accounted for by age-related slowing and attentional deficits (Erel & Levy, 2016; Fortinash & Worret, 2014; Kramer & Madden, 2011). Deficits have been noted in the ability to selectively attend to certain tasks (Brink & McDowd, 1999; Glisky, 2007), for example, when requiring an individual to focus their attention on one stimulus among several other sets of information. Madden et al. (2007) found that older adults performed slower and less accurately than younger adults in visual search tests. Military personnel are recruited at a young age, based on their functioning at that point in time. Continuous evaluation of fitness for duty would therefore imply the need to assess any decline, especially in attention, associated with normal ageing in addition to that associated with injury.
Skewness could be attributed to the target population being a pre-selected group. According to Kennedy and Moore (eds. 2010), some samples of the military population outperform the general population on neuropsychological tests, as only healthy and generally well-functioning individuals are considered to be fit for duty. Nwafor and Adesuwa (2014) further supported this by adding that the specialised skills required by soldiers for their operational and functional duties require them to function at a higher level than the general population. The raw scores can be converted to standard scores by means of Z-score conversions using the typical performance presented in Tables 2 and 7. A Z-score represents the distance from the mean expressed in standard deviation units (i.e. Z = (raw score - mean)/standard deviation).
It is important to note that the distribution of Z-scores has the same form as the raw scores on which they are based. In this instance, the Z-scores will therefore be right skewed and not normally distributed. Although these scores do not have the statistical advantages of normally distributed scores, conversion will nevertheless allow for comparison within and between individuals in this population. In the case of age and gender stratification for total time, comparisons will be limited to each specified group (e.g. females, 20-29 years) (Gadd & Phipps, 2012).
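The Z-score conversion described above can be sketched as follows. The subgroup means and standard deviations in `HYPOTHETICAL_NORMS` are invented placeholders and are not the values from Tables 2 and 7; the function and dictionary names are likewise assumptions made for illustration.

```python
def z_score(raw, mean, sd):
    """Distance of a raw score from the mean in standard deviation units."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - mean) / sd


# Placeholder norms for total time in seconds, keyed by (gender, age band).
# These numbers are made up and must be replaced by published values.
HYPOTHETICAL_NORMS = {
    ("female", "20-29"): (120.0, 10.0),
    ("male", "20-29"): (130.0, 10.0),
}


def z_for_group(raw_time, gender, age_band):
    """Convert a total time to a Z-score against its own subgroup only."""
    mean, sd = HYPOTHETICAL_NORMS[(gender, age_band)]
    return z_score(raw_time, mean, sd)


# The same raw time maps to different Z-scores in different subgroups.
print(z_for_group(130.0, "female", "20-29"))
print(z_for_group(130.0, "male", "20-29"))
```

For time scores a positive Z indicates slower-than-average performance within the subgroup. Because the underlying distributions are right skewed, such a Z-score should not be translated into a percentile using normal-distribution tables.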

Conclusion
A major contribution of this study is the development of standardised administration and scoring procedures for a test of attention. Additionally, the military context implies a need for appropriate normative data on this construct. However, larger sample sizes are required for adequate representation in terms of some of the demographic variables (i.e. individuals older than 50 years, females older than 40 years and left-handed individuals). This will also enable further exploration of the distribution of the performance. Standard scores based on the present data set cannot be interpreted in terms of the properties of a normal distribution. These recommendations would allow for a comprehensive standardisation and evaluation of the psychometric properties of the test in the military context. The study furthermore involved a highly specific subgroup of the general population and replication studies including additional subpopulations should be considered.
The letter cancellation test is widely used despite being subject to unstandardised administration and scoring procedures and broad cut-off scores. This study provides a review of the letter cancellation test and puts forward improved administration procedures, detailed scoring methods and relevant normative data for adequate sample sizes. This was done to provide clinicians in the SANDF with meaningful scores for interpretation and to guide future developments in the wider South African context.