| Psychological testing | |
|---|---|
| Medical diagnostics | |
| ICD-10-PCS | GZ1 | 
| ICD-9-CM | 94.02 | 
| MeSH | D011581 | 
Psychological testing is the administration of psychological tests, which are designed to be "an objective and standardized measure of a sample of behavior". The term sample of behavior refers to an individual's performance on tasks that have usually been prescribed beforehand. The samples of behavior that make up a paper-and-pencil test, the most common type of test, are a series of items. Performance on these items produces a test score. A score on a well-constructed test is believed to reflect a psychological construct such as achievement in a school subject, cognitive ability, aptitude, emotional functioning, or personality. Differences in test scores are thought to reflect individual differences in the construct the test is supposed to measure. The science behind psychological testing is psychometrics.
Psychological tests
A psychological test is an instrument designed to measure unobserved constructs, also known as latent variables.
 Psychological tests are typically, but not necessarily, a series of 
tasks or problems that the respondent has to solve. Psychological tests 
can strongly resemble questionnaires,
 which are also designed to measure unobserved constructs, but differ in
 that psychological tests ask for a respondent's maximum performance 
whereas a questionnaire asks for the respondent's typical performance. A useful psychological test must be both valid (i.e., there is evidence to support the specified interpretation of the test results) and reliable (i.e., internally consistent or giving consistent results over time, across raters, etc.). 
It is important that people who are equal on the measured 
construct also have an equal probability of answering the test items 
accurately.
 For example, an item on a mathematics test could be "In a soccer match 
two players get a red card; how many players are left in the end?"; 
however, this item also requires knowledge of soccer to be answered 
correctly, not just mathematical ability. Group membership can also 
influence the chance of correctly answering items (differential item functioning).
 Often tests are constructed for a specific population, and this should 
be taken into account when administering tests. If a test is invariant 
to some group difference (e.g. gender) in one population (e.g. England) 
it does not automatically mean that it is also invariant in another 
population (e.g. Japan). 
Psychological assessment
 is similar to psychological testing but usually involves a more 
comprehensive assessment of the individual. Psychological assessment is a
 process that involves integrating information from 
multiple sources, such as tests of normal and abnormal personality, 
tests of ability or intelligence, tests of interests or attitudes, as 
well as information from personal interviews. Collateral information is 
also collected about personal, occupational, or medical history, such as from records or from interviews with parents, spouses, teachers, or previous therapists or physicians. A psychological test
 is one of the sources of data used within the process of assessment; 
usually more than one test is used. Many psychologists do some level of 
assessment when providing services to clients or patients, and may use, 
for example, simple checklists. Typical purposes of assessment are to provide a diagnosis for treatment settings; to assess
 a particular area of functioning or disability often for school 
settings; to help select type of treatment or to assess treatment 
outcomes; to help courts decide issues such as child custody or 
competency to stand trial; or to help assess job applicants or employees
 and provide career development counseling or training.
History
A Song Dynasty painting of candidates participating in the imperial examination, a rudimentary form of psychological testing.
Physiognomy was used to assess personality traits based on an individual's outer appearance.
The first large-scale tests may have been examinations that were part of the imperial examination
 system in China. The test, an early form of psychological testing, 
assessed candidates based on their proficiency in topics such as civil 
law and fiscal policies. Other early tests of intelligence were made for entertainment rather than analysis. Modern mental testing began in France in the 19th century. It contributed to separating mental retardation from mental illness and reducing the neglect, torture, and ridicule heaped on both groups.
Englishman Francis Galton coined the terms psychometrics and eugenics,
 and developed a method for measuring intelligence based on nonverbal 
sensory-motor tests. It was initially popular, but was abandoned after 
the discovery that it had no relationship to outcomes such as college 
grades. French psychologist Alfred Binet, together with psychologists Victor Henri and Théodore Simon, after about 15 years of development, published the Binet-Simon test in 1905, which focused on verbal abilities. It was intended to identify mental retardation in school children.
The origins of personality testing date back to the 18th and 19th centuries, when personality was assessed through phrenology, the measurement of the human skull, and physiognomy, which assessed personality based on a person's outer appearances.
 These early pseudoscientific techniques were eventually replaced with 
more empirical methods in the 20th century. One of the earliest modern 
personality tests was the Woodworth Personal Data Sheet, a self-report inventory developed for World War I and used for the psychiatric screening of new draftees.
Principles
Proper psychological testing is conducted after rigorous research and 
development, in contrast to quick web-based or magazine questionnaires 
that say "Find out your Personality Color," or "What's your Inner Age?" 
Proper psychological testing consists of the following:
- Standardization - All procedures and steps must be conducted consistently and under the same conditions, so that every test taker is tested in the same way.
- Objectivity - Scoring is such that subjective judgments and biases are minimized, with results for each test taker obtained in the same way.
- Test norms - The average test score within a large group of people, which establishes a point of comparison or frame of reference against which the performance of one individual can be compared.
- Reliability - Obtaining the same result on repeated testing.
- Validity - The test must measure what it is intended to measure.
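Reliability is usually expressed as a coefficient rather than as a yes-or-no property. The following Python sketch illustrates two common ways of estimating it: the correlation between two administrations of the same test (test-retest reliability) and Cronbach's alpha, an index of internal consistency. The choice of these two coefficients, the function names, and all of the scores are illustrative assumptions and do not come from any particular test.

```python
from math import sqrt
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cronbach_alpha(responses):
    """Cronbach's alpha; rows are respondents, columns are items."""
    k = len(responses[0])  # number of items
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: five respondents, three items scored 1-5.
responses = [[4, 5, 4], [2, 3, 2], [5, 4, 5], [1, 2, 2], [3, 3, 4]]
alpha = cronbach_alpha(responses)

# Hypothetical total scores of the same five people on two occasions.
first = [13, 7, 14, 5, 10]
second = [12, 8, 14, 6, 9]
retest_r = pearson_r(first, second)

print(f"alpha = {alpha:.2f}, test-retest r = {retest_r:.2f}")
```

Both coefficients approach 1 as the measurements become more consistent. What counts as an acceptable value depends on how the test is used and is not settled by the code.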
 
Interpreting scores
Psychological tests, like many measurements of human characteristics, can be interpreted in a norm-referenced or criterion-referenced manner. Norms are statistical representations of a population. A norm-referenced
 score interpretation compares an individual's results on the test with 
the statistical representation of the population. In practice, rather 
than testing a population, a representative sample or group is tested. 
This provides a group norm or set of norms. One representation of norms 
is the bell curve
 (also called the "normal curve"). Norms are available for standardized 
psychological tests, allowing for an understanding of how an 
individual's scores compare with the group norms. Norm-referenced scores
 are typically reported on the standard score (z) scale or a rescaling of it. 
A criterion-referenced
 interpretation of a test score compares an individual's performance to 
some criterion other than performance of other individuals.  For 
example, the generic school test
 typically provides a score in reference to a subject domain; a student 
might score 80% on a geography test.  Criterion-referenced score 
interpretations are generally more applicable to achievement tests rather than psychological tests. 
Often, test scores can be interpreted in both ways; answering 80%
 of the questions correctly on a geography test could place a student at
 the 84th percentile (that is, the student performed better than about 84% of 
the class and worse than about 16%), which corresponds to a standard score of 
about 1.0 if the scores are normally distributed.
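The relationship between a raw score, a standard score, and a percentile can be sketched in a few lines of Python. The norm-group mean of 70 and standard deviation of 10 are hypothetical values chosen for illustration, the function names are likewise assumptions, and the percentile conversion assumes that scores in the norm group are normally distributed.

```python
from statistics import NormalDist


def z_score(raw, norm_mean, norm_sd):
    """Standard score: distance from the norm mean in SD units."""
    return (raw - norm_mean) / norm_sd


def percentile_rank(z):
    """Percentage of a normal norm group scoring below z."""
    return 100 * NormalDist().cdf(z)


def rescale(z, new_mean, new_sd):
    """Re-express a z score on another scale (e.g. T scores: 50, 10)."""
    return new_mean + new_sd * z


# Hypothetical norms for a geography test scored in percent correct.
norm_mean, norm_sd = 70.0, 10.0

z = z_score(80.0, norm_mean, norm_sd)
pct = percentile_rank(z)
t = rescale(z, 50, 10)

print(f"z = {z:.1f}, percentile = {pct:.0f}, T = {t:.0f}")
```

Under these assumed norms, a raw score of 80% lies one standard deviation above the mean, which is roughly the 84th percentile. With a different norm group the same raw score would map to a different standard score, which is why a norm-referenced interpretation always depends on the reference sample.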
Types
There are several broad categories of psychological tests:
IQ/achievement tests
IQ tests purport to be measures of intelligence, while achievement tests measure how far that ability has been developed and put to use. IQ (or cognitive) tests and achievement tests
 are common norm-referenced tests. In these types of tests, a series of 
tasks is presented to the person being evaluated, and the person's 
responses are graded according to carefully prescribed guidelines. After
 the test is completed, the results can be compiled and compared to the 
responses of a norm group, usually composed of people at the same age or
 grade level as the person being evaluated. IQ tests which contain a 
series of tasks typically divide the tasks into verbal
 (relying on the use of language) and performance, or non-verbal 
(relying on eye–hand types of tasks, or use of symbols or objects). 
Examples of verbal IQ test tasks are vocabulary and information 
(answering general knowledge questions). Non-verbal examples are timed 
completion of puzzles (object assembly) and identifying images which fit
 a pattern (matrix reasoning). 
IQ tests (e.g., WAIS-IV, WISC-V, Cattell Culture Fair III, Woodcock-Johnson Tests of Cognitive Abilities-IV, Stanford-Binet Intelligence Scales V) and academic achievement tests (e.g. WIAT, WRAT,
 Woodcock-Johnson Tests of Achievement-III) are designed to be 
administered to either an individual (by a trained evaluator) or to a 
group of people (paper and pencil tests). The individually administered 
tests tend to be more comprehensive, more reliable, more valid and 
generally to have better psychometric
 characteristics than group-administered tests. However, individually 
administered tests are more expensive to administer because of the need 
for a trained administrator (psychologist, school psychologist, or psychometrician).
Public safety employment tests
Vocations within the public safety field (i.e., fire service, law enforcement, 
corrections, emergency medical services) often require industrial and organizational psychology tests for initial employment and advancement through the ranks. The National Firefighter Selection Inventory (NFSI), the National Criminal Justice Officer Selection Inventory (NCJOSI), and the Integrity Inventory are prominent examples of these tests.
Attitude tests
Attitude tests assess an individual's feelings about an event, person, or object. 
Attitude scales are used in marketing to determine individual (and 
group) preferences for brands or items. Typically, attitude tests use 
either a Thurstone scale or a Likert scale to measure specific items.
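As an illustration of how responses on a Likert scale are commonly turned into a score, the sketch below averages the item ratings after reversing negatively worded items. The five-point range, the function name, and the example answers are assumptions made for this sketch; Thurstone scaling works differently and is not shown.

```python
def score_likert(responses, reverse_keyed=(), scale_min=1, scale_max=5):
    """Mean rating across items, after flipping reverse-keyed items.

    responses     -- one rating per item, in item order
    reverse_keyed -- zero-based positions of negatively worded items
    """
    if not responses:
        raise ValueError("at least one response is required")
    total = 0
    for index, answer in enumerate(responses):
        if not scale_min <= answer <= scale_max:
            raise ValueError(f"item {index}: rating {answer} is off the scale")
        if index in reverse_keyed:
            # On a 1-5 scale a rating of 1 becomes 5, 2 becomes 4, etc.
            answer = scale_min + scale_max - answer
        total += answer
    return total / len(responses)


# Hypothetical four-item attitude scale; items 1 and 3 are worded
# negatively, so strong disagreement signals a favourable attitude.
answers = [5, 1, 4, 2]
attitude = score_likert(answers, reverse_keyed={1, 3})
print(attitude)
```

Reversing negatively worded items before averaging keeps a high score meaning the same thing for every item.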
Neuropsychological tests
These tests consist of specifically designed tasks used to measure a 
psychological function known to be linked to a particular brain 
structure or pathway. Neuropsychological tests can be used in a clinical context to assess impairment after an injury or illness known to affect neurocognitive
 functioning. When used in research, these tests can be used to contrast
 neuropsychological abilities across experimental groups. 
Infant and Preschool Assessment
Because infants and preschool-aged children have 
limited capacities of communication, psychologists are unable to use 
traditional tests to assess them. Therefore, many tests have been 
designed specifically for children from birth to around six years of age. The 
content of these tests usually varies with age, ranging from assessments of reflexes 
and developmental milestones to sensory and motor skills, language 
skills, and simple cognitive skills.
Common tests for this age group are split into categories: Infant Ability, Preschool Intelligence, and School Readiness.
Common infant ability tests include: Gesell Developmental Schedules
 (GDS) which measures the developmental progress of infants, Neonatal 
Behavioral Assessment Scale (NBAS) which tests newborn behavior, 
reflexes, and responses, Ordinal Scales of Psychological Development 
(OSPD) which assesses infant intellectual abilities, and Bayley-III 
which tests mental ability and motor skills.
Common preschool intelligence tests include: McCarthy Scales of Children’s Abilities
 (MSCA) which is similar to an infant IQ test, Differential Ability 
Scales (DAS) which can be used to test for learning disability, Wechsler
 Preschool and Primary Scale of Intelligence-III (WPPSI-III) and 
Stanford-Binet Intelligence Scales for Early Childhood which could be 
seen as infant versions of IQ tests, and Fagan Test of Infant 
Intelligence (FTII) which tests recognition memory.
Finally, some common school readiness tests are: Developmental 
Indicators for the Assessment of Learning-III (DIAL-III) which assesses 
motor, cognitive, and language skills, Denver II which tests motor, 
social, and language skills, and Home Observation for Measurement of 
Environment (HOME) which is a measure of the extent to which a child’s 
home environment facilitates school readiness.
Infant and preschool assessments do not predict later
 childhood or adult abilities, so they are mainly useful for testing whether a child
 is experiencing developmental delay or disabilities. They are also 
useful for testing individual intelligence and ability, and, as 
noted above, some are specifically designed to test school 
readiness and determine which children may struggle more in school.
Personality tests
Psychological measures of personality are often described as either objective tests or projective tests. The terms "objective test" and "projective test" have recently come under criticism in the Journal of Personality Assessment.
 The more descriptive "rating scale or self-report measures" and "free 
response measures" are suggested, rather than the terms "objective 
tests" and "projective tests," respectively.
Objective tests (Rating scale or self-report measure)
Objective
 tests have a restricted response format, such as allowing for true or 
false answers or rating using an ordinal scale. Prominent examples of 
objective personality tests include the Minnesota Multiphasic Personality Inventory, Millon Clinical Multiaxial Inventory-IV, Child Behavior Checklist, Symptom Checklist 90 and the Beck Depression Inventory. Objective personality tests can be designed for use in business for potential employees, such as the NEO-PI, the 16PF, and the OPQ (Occupational Personality Questionnaire), all of which are based on the Big Five
 taxonomy. The Big Five, or Five Factor Model of normal personality, has
 gained acceptance since the early 1990s when some influential 
meta-analyses (e.g., Barrick & Mount 1991) found consistent 
relationships between the Big Five personality factors and important criterion variables. 
Another personality test based upon the Five Factor Model is the Five Factor Personality Inventory – Children (FFPI-C).
Projective tests (Free response measures)
Projective tests allow for a freer type of response. An example of this would be the Rorschach test, in which a person states what each of ten ink blots might be.
Projective testing became a growth industry in the first half of 
the 1900s, with doubts about the theoretical assumptions behind 
projective testing arising in the second half of the 1900s.
 Some projective tests are used less often today because they are more 
time consuming to administer and because the reliability and validity 
are controversial.
As improved sampling and statistical methods developed, much 
controversy regarding the utility and validity of projective testing has
 occurred. The use of clinical judgement rather than norms and 
statistics to evaluate people's characteristics has raised criticism 
that projectives are deficient and unreliable (results are too 
dissimilar each time a test is given to the same person). However, as 
more objective scoring and interpretive systems supported by more 
rigorous scientific research have emerged, many practitioners continue 
to rely on projective testing. Projective tests may be useful in 
creating inferences to follow up with other methods. The most widely 
used scoring system for the Rorschach is the Exner system of scoring. Another common projective test is the Thematic Apperception Test (TAT), which is often scored with Westen's Social Cognition and Object Relations Scales and Phebe Cramer's Defense Mechanisms Manual. Both "rating scale" and "free response" measures are used in contemporary clinical practice, with a trend toward the former.
Other projective tests include the House-Tree-Person test and the Animal Metaphor Test.
Sexological tests
The number of tests specifically meant for the field of sexology is quite limited. The field of sexology
 provides different psychological evaluation devices in order to examine
 the various aspects of the discomfort, problem or dysfunction, 
regardless of whether they are individual or relational ones.
Direct observation tests
Although
 most psychological tests are "rating scale" or "free response" 
measures, psychological assessment may also involve the observation of 
people as they complete activities. This type of assessment is usually 
conducted with families in a laboratory or at home, or with children in a 
classroom. The purpose may be clinical, such as to establish a 
pre-intervention baseline of a child's hyperactive or aggressive 
classroom behaviors or to observe the nature of a parent-child 
interaction in order to understand a relational disorder. Direct 
observation procedures are also used in research, for example to study 
the relationship between intrapsychic variables and specific target 
behaviors, or to explore sequences of behavioral interaction. 
The Parent-Child Interaction Assessment-II (PCIA)
 is an example of a direct observation procedure that is used with 
school-age children and parents. The parents and children are video 
recorded playing at a make-believe zoo. The Parent-Child Early 
Relational Assessment is used to study parents and young children and involves a feeding and a puzzle task. The MacArthur Story Stem Battery (MSSB) is used to elicit narratives from children. The Dyadic Parent-Child Interaction Coding System-II tracks the extent to which children follow the commands of parents and vice versa and is well suited to the study of children with oppositional defiant disorder and their parents.
Interest tests
Interest tests are psychological tests that assess a person’s interests and preferences. These tests are 
used primarily for career counseling. Interest tests include items about
 daily activities from among which applicants select their preferences. 
The rationale is that if a person exhibits the same pattern of interests
 and preferences as people who are successful in a given occupation, 
then the chances are high that the person taking the test will find 
satisfaction in that occupation. A widely used interest test is the Strong Interest Inventory, which is used in career assessment, career counseling, and educational guidance.
Aptitude tests
Aptitude tests measure specific abilities, such as clerical, perceptual, 
numerical, or spatial aptitude. Sometimes these tests must be specially 
designed for a particular job, but there are also tests available that 
measure general clerical and mechanical aptitudes, or even general 
learning ability. An example of an occupational aptitude test is the 
Minnesota Clerical Test, which measures the perceptual speed and 
accuracy required to perform various clerical duties. Other widely used 
aptitude tests include Careerscope and the Differential Aptitude Tests
 (DAT); the latter assess verbal reasoning, numerical ability, abstract 
reasoning, clerical speed and accuracy, mechanical reasoning, space 
relations, spelling and language usage. Another widely used test of 
aptitudes is the Wonderlic Test.
 These aptitudes are believed to be related to specific occupations and 
are used for career guidance as well as selection and recruitment.
Biographical Information Blank
The Biographical Information Blank
 (BIB) is a paper-and-pencil form that includes items that ask about 
detailed personal and work history. It is used to aid in the hiring of 
employees by matching the backgrounds of individuals to requirements of 
the job.
Test security
Many psychological tests are not available to the public. Both 
test publishers and psychology licensing boards impose restrictions 
that prevent the disclosure of the tests 
themselves and of information about the interpretation of the results.
 Test publishers consider both copyright and matters of professional 
ethics to be involved in protecting the secrecy of their tests, and they
 sell tests only to people who have proved their educational and 
professional qualifications to the test maker's satisfaction.  
Purchasers are legally bound not to give test answers or the tests 
themselves out to the public unless permitted under the test maker's 
standard conditions for administration of the tests.
The International Test Commission (ITC), an international 
association of national psychological societies and test publishers, 
publishes the International Guidelines for Test Use, which 
call on test users to "protect the integrity" of the tests by not publicly 
describing test techniques and by not "coaching individuals" so that 
they "might unfairly influence their test performance."