Psychological testing | |
---|---|
Medical diagnostics | |
ICD-10-PCS | GZ1 |
ICD-9-CM | 94.02 |
MeSH | D011581 |
Psychological testing is the administration of psychological tests, which are designed to be "an objective and standardized measure of a sample of behavior". The term sample of behavior refers to an individual's performance on tasks that have usually been prescribed beforehand. The samples of behavior that make up a paper-and-pencil test, the most common type of test, are a series of items. Performance on these items produces a test score. A score on a well-constructed test is believed to reflect a psychological construct such as achievement in a school subject, cognitive ability, aptitude, emotional functioning, personality, etc. Differences in test scores are thought to reflect individual differences in the construct the test is supposed to measure. The science behind psychological testing is psychometrics.
Psychological tests
A psychological test is an instrument designed to measure unobserved constructs, also known as latent variables.
Psychological tests are typically, but not necessarily, a series of
tasks or problems that the respondent has to solve. Psychological tests
can strongly resemble questionnaires,
which are also designed to measure unobserved constructs, but differ in
that psychological tests ask for a respondent's maximum performance
whereas a questionnaire asks for the respondent's typical performance. A useful psychological test must be both valid (i.e., there is evidence to support the specified interpretation of the test results) and reliable (i.e., internally consistent or giving consistent results over time, across raters, etc.).
It is important that people who are equal on the measured
construct also have an equal probability of answering the test items
accurately.
For example, an item on a mathematics test could be "In a soccer match
two players get a red card; how many players are left in the end?";
however, this item also requires knowledge of soccer to be answered
correctly, not just mathematical ability. Group membership can also
influence the chance of correctly answering items (differential item functioning).
Often tests are constructed for a specific population, and this should
be taken into account when administering tests. If a test is invariant
to some group difference (e.g. gender) in one population (e.g. England)
it does not automatically mean that it is also invariant in another
population (e.g. Japan).
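The idea behind differential item functioning can be illustrated with a rough sketch: respondents are first matched on their total test score, and the proportion answering one item correctly is then compared between two groups within each matched stratum. The function name `stratified_item_gap`, the record layout, and the sample data are hypothetical, and the weighting by focal-group size is an assumption made for illustration; operational DIF analyses use more elaborate statistical procedures.

```python
from collections import defaultdict


def stratified_item_gap(records, reference, focal):
    """Average within-stratum gap in proportion correct on one item.

    records: iterable of (group, total_score, item_correct) tuples
             (hypothetical layout; item_correct is 0/1 or a bool).
    Returns the reference-minus-focal gap, weighted by the number of
    focal-group respondents in each total-score stratum.
    """
    # total_score -> group -> [number correct, number of respondents]
    counts = defaultdict(lambda: {reference: [0, 0], focal: [0, 0]})
    for group, total_score, correct in records:
        if group in (reference, focal):
            cell = counts[total_score][group]
            cell[0] += int(correct)
            cell[1] += 1

    weighted_gap = 0.0
    weight_sum = 0
    for cells in counts.values():
        ref_correct, ref_n = cells[reference]
        foc_correct, foc_n = cells[focal]
        if ref_n == 0 or foc_n == 0:
            continue  # one group is missing here, so nothing to compare
        gap = ref_correct / ref_n - foc_correct / foc_n
        weighted_gap += foc_n * gap
        weight_sum += foc_n
    return weighted_gap / weight_sum if weight_sum else 0.0


# Made-up data: groups "A" and "B", matched on total scores 5 and 8.
records = [
    ("A", 5, 1), ("A", 5, 1), ("B", 5, 1), ("B", 5, 0),
    ("A", 8, 1), ("B", 8, 1),
]
print(round(stratified_item_gap(records, "A", "B"), 3))  # 0.333
```

A value near zero suggests that equally able members of both groups have similar chances of answering the item correctly; a large value flags the item for closer review rather than proving bias.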
Psychological assessment
is similar to psychological testing but usually involves a more
comprehensive assessment of the individual. Psychological assessment is a
process that integrates information from
multiple sources, such as tests of normal and abnormal personality,
tests of ability or intelligence, tests of interests or attitudes, as
well as information from personal interviews. Collateral information is
also collected about personal, occupational, or medical history, such as from records or from interviews with parents, spouses, teachers, or previous therapists or physicians. A psychological test
is one of the sources of data used within the process of assessment;
usually more than one test is used. Many psychologists do some level of
assessment when providing services to clients or patients, and may use,
for example, simple checklists. Assessment may be used to provide a diagnosis for treatment settings; to assess
a particular area of functioning or disability often for school
settings; to help select type of treatment or to assess treatment
outcomes; to help courts decide issues such as child custody or
competency to stand trial; or to help assess job applicants or employees
and provide career development counseling or training.
History
The first large-scale tests may have been examinations that were part of the imperial examination
system in China. The test, an early form of psychological testing,
assessed candidates based on their proficiency in topics such as civil
law and fiscal policies. Other early tests of intelligence were made for entertainment rather than analysis. Modern mental testing began in France in the 19th century. It contributed to separating mental retardation from mental illness and reducing the neglect, torture, and ridicule heaped on both groups.
Englishman Francis Galton coined the terms psychometrics and eugenics,
and developed a method for measuring intelligence based on nonverbal
sensory-motor tests. It was initially popular, but was abandoned after
the discovery that it had no relationship to outcomes such as college
grades. French psychologist Alfred Binet, together with psychologists Victor Henri and Théodore Simon, after about 15 years of development, published the Binet-Simon test in 1905, which focused on verbal abilities. It was intended to identify mental retardation in school children.
The origins of personality testing date back to the 18th and 19th centuries, when personality was assessed through phrenology, the measurement of the human skull, and physiognomy, which assessed personality based on a person's outer appearances.
These early pseudoscientific techniques were eventually replaced with
more empirical methods in the 20th century. One of the earliest modern
personality tests was the Woodworth Personal Data Sheet, a self-report inventory developed for World War I and used for the psychiatric screening of new draftees.
Principles
Proper psychological testing is conducted after rigorous research and
development, in contrast to quick web-based or magazine questionnaires
that say "Find out your Personality Color," or "What's your Inner Age?"
Proper psychological testing consists of the following:
- Standardization - All procedures and steps must be conducted consistently and in the same environment, so that testing conditions are the same for everyone being tested.
- Objectivity - Scoring such that subjective judgments and biases are minimized, with results for each test taker obtained in the same way.
- Test Norms - The average test score within a large group of people where the performance of one individual can be compared to the results of others by establishing a point of comparison or frame of reference.
- Reliability - Obtaining consistent results on repeated testing.
- Validity - The type of test being administered must measure what it is intended to measure.
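Two of the reliability notions mentioned in this article, consistency over time and internal consistency, can be quantified in several ways. The sketch below uses a test-retest Pearson correlation and Cronbach's alpha; neither statistic is named in the text above, so the choice is an assumption, and the function names and scores are made up for illustration.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def cronbach_alpha(item_scores):
    """Internal consistency of a test.

    item_scores: one row per respondent, one column per item.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(item_scores[0])
    item_vars = [variance(column) for column in zip(*item_scores)]
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: the same four people tested on two occasions.
first_session = [10, 12, 15, 18]
second_session = [11, 12, 16, 17]
print("test-retest r:", pearson_r(first_session, second_session))

# Hypothetical data: four respondents answering three items scored 1-5.
answers = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
]
print("Cronbach's alpha:", cronbach_alpha(answers))
```

Values close to 1 indicate high reliability on either index. Neither number says anything about validity, which has to be established separately.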
Interpreting scores
Psychological tests, like many measurements of human characteristics, can be interpreted in a norm-referenced or criterion-referenced manner. Norms are statistical representations of a population. A norm-referenced
score interpretation compares an individual's results on the test with
the statistical representation of the population. In practice, rather
than testing a population, a representative sample or group is tested.
This provides a group norm or set of norms. One representation of norms
is the bell curve
(also called the "normal curve"). Norms are available for standardized
psychological tests, allowing for an understanding of how an
individual's scores compare with the group norms. Norm-referenced scores
are typically reported on the standard score (z) scale or a rescaling of it.
A criterion-referenced
interpretation of a test score compares an individual's performance to
some criterion other than performance of other individuals. For
example, the generic school test
typically provides a score in reference to a subject domain; a student
might score 80% on a geography test. Criterion-referenced score
interpretations are generally more applicable to achievement tests than to psychological tests.
Often, test scores can be interpreted in both ways; answering 80%
of the questions correctly on a geography test could place a student at
the 84th percentile (that is, the student performed as well as or better
than about 84% of the class), which corresponds to a standard score of
about 1.0 if scores are normally distributed.
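A norm-referenced interpretation can be sketched in a few lines. The example converts a raw score to a standard score (z), looks up the matching percentile under a normal curve, and rescales z to a scale with mean 100 and standard deviation 15. The norm-group mean of 70 and standard deviation of 10 are hypothetical values chosen so that a raw score of 80 lands at z = 1.0; the function names are likewise illustrative.

```python
from statistics import NormalDist


def standard_score(raw, norm_mean, norm_sd):
    """z score: distance from the norm-group mean in standard deviations."""
    return (raw - norm_mean) / norm_sd


def percentile_rank(z):
    """Percentage of a normally distributed norm group scoring at or below z."""
    return 100 * NormalDist().cdf(z)


def rescale(z, mean=100, sd=15):
    """Express z on another scale, here one with mean 100 and SD 15."""
    return mean + sd * z


# Hypothetical norm group: mean 70, standard deviation 10.
z = standard_score(80, norm_mean=70, norm_sd=10)
print(z, round(percentile_rank(z)), rescale(z))  # 1.0 84 115.0
```

The percentile lookup assumes that the norm group's scores follow a normal distribution. When they do not, percentile ranks are usually read from the empirical norm tables instead.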
Types
There are several broad categories of psychological tests:
IQ/achievement tests
IQ tests purport to be measures of intelligence, while achievement tests are measures of how far that ability has been developed and put to use. IQ (or cognitive) tests and achievement tests
are common norm-referenced tests. In these types of tests, a series of
tasks is presented to the person being evaluated, and the person's
responses are graded according to carefully prescribed guidelines. After
the test is completed, the results can be compiled and compared to the
responses of a norm group, usually composed of people at the same age or
grade level as the person being evaluated. IQ tests which contain a
series of tasks typically divide the tasks into verbal
(relying on the use of language) and performance, or non-verbal
(relying on eye–hand types of tasks, or use of symbols or objects).
Examples of verbal IQ test tasks are vocabulary and information
(answering general knowledge questions). Non-verbal examples are timed
completion of puzzles (object assembly) and identifying images which fit
a pattern (matrix reasoning).
IQ tests (e.g., WAIS-IV, WISC-V, Cattell Culture Fair III, Woodcock-Johnson Tests of Cognitive Abilities-IV, Stanford-Binet Intelligence Scales V) and academic achievement tests (e.g. WIAT, WRAT,
Woodcock-Johnson Tests of Achievement-III) are designed to be
administered to either an individual (by a trained evaluator) or to a
group of people (paper and pencil tests). The individually administered
tests tend to be more comprehensive, more reliable, more valid and
generally to have better psychometric
characteristics than group-administered tests. However, individually
administered tests are more expensive to administer because of the need
for a trained administrator (psychologist, school psychologist, or psychometrician).
Public safety employment tests
Vocations within the public safety field (i.e., fire service, law enforcement,
corrections, emergency medical services) often require Industrial and Organizational Psychology tests for initial employment and advancement throughout the ranks. The National Firefighter Selection Inventory (NFSI), the National Criminal Justice Officer Selection Inventory (NCJOSI), and the Integrity Inventory are prominent examples of these tests.
Attitude tests
Attitude tests assess an individual's feelings about an event, person, or object.
Attitude scales are used in marketing to determine individual (and
group) preferences for brands or items. Typically, attitude tests use
either a Thurstone scale or a Likert scale to measure specific items.
Neuropsychological tests
These tests consist of specifically designed tasks used to measure a
psychological function known to be linked to a particular brain
structure or pathway. Neuropsychological tests can be used in a clinical context to assess impairment after an injury or illness known to affect neurocognitive
functioning. When used in research, these tests can be used to contrast
neuropsychological abilities across experimental groups.
Infant and preschool assessment
Because infants and preschool-aged children have
limited capacities of communication, psychologists are unable to use
traditional tests to assess them. Therefore, many tests have been
designed specifically for children from birth to around six years of age. These
tests vary with the child's age, ranging from assessments of reflexes
and developmental milestones to sensory and motor skills, language
skills, and simple cognitive skills.
Common tests for this age group are split into categories: Infant Ability, Preschool Intelligence, and School Readiness.
Common infant ability tests include the Gesell Developmental Schedules
(GDS), which measure the developmental progress of infants; the Neonatal
Behavioral Assessment Scale (NBAS), which tests newborn behavior,
reflexes, and responses; the Ordinal Scales of Psychological Development
(OSPD), which assess infant intellectual abilities; and the Bayley-III,
which tests mental ability and motor skills.
Common preschool intelligence tests include the McCarthy Scales of Children’s Abilities
(MSCA), which is similar to an infant IQ test; the Differential Ability
Scales (DAS), which can be used to test for learning disability; the Wechsler
Preschool and Primary Scale of Intelligence-III (WPPSI-III) and the
Stanford-Binet Intelligence Scales for Early Childhood, which could be
seen as infant versions of IQ tests; and the Fagan Test of Infant
Intelligence (FTII), which tests recognition memory.
Finally, some common school readiness tests are the Developmental
Indicators for the Assessment of Learning-III (DIAL-III), which assesses
motor, cognitive, and language skills; the Denver II, which tests motor,
social, and language skills; and the Home Observation for Measurement of
Environment (HOME), which is a measure of the extent to which a child’s
home environment facilitates school readiness.
Infant and preschool assessments, because they do not predict later
childhood or adult abilities, are mainly useful for testing whether a child
is experiencing developmental delay or disabilities. They are also
useful for testing individual intelligence and ability, and, as
noted above, some are specifically designed to test school
readiness and determine which children may struggle more in school.
Personality tests
Psychological measures of personality are often described as either objective tests or projective tests. The terms "objective test" and "projective test" have recently come under criticism in the Journal of Personality Assessment.
The more descriptive "rating scale or self-report measures" and "free
response measures" are suggested, rather than the terms "objective
tests" and "projective tests," respectively.
Objective tests (Rating scale or self-report measure)
Objective
tests have a restricted response format, such as allowing for true or
false answers or rating using an ordinal scale. Prominent examples of
objective personality tests include the Minnesota Multiphasic Personality Inventory, Millon Clinical Multiaxial Inventory-IV, Child Behavior Checklist, Symptom Checklist 90 and the Beck Depression Inventory. Objective personality tests can be designed for use in business for potential employees, such as the NEO-PI, the 16PF, and the OPQ (Occupational Personality Questionnaire), all of which are based on the Big Five
taxonomy. The Big Five, or Five Factor Model of normal personality, has
gained acceptance since the early 1990s when some influential
meta-analyses (e.g., Barrick & Mount 1991) found consistent
relationships between the Big Five personality factors and important criterion variables.
Another personality test based upon the Five Factor Model is the Five Factor Personality Inventory – Children (FFPI-C).
Projective tests (Free response measures)
Projective tests allow for a freer type of response. An example of this would be the Rorschach test, in which a person states what each of ten ink blots might be.
Projective testing became a growth industry in the first half of
the 1900s, with doubts about the theoretical assumptions behind
projective testing arising in the second half of the 1900s.
Some projective tests are used less often today because they are more
time consuming to administer and because the reliability and validity
are controversial.
As improved sampling and statistical methods developed, much
controversy regarding the utility and validity of projective testing has
occurred. The use of clinical judgement rather than norms and
statistics to evaluate people's characteristics has raised criticism
that projectives are deficient and unreliable (results are too
dissimilar each time a test is given to the same person). However, as
more objective scoring and interpretive systems supported by more
rigorous scientific research have emerged, many practitioners continue
to rely on projective testing. Projective tests may be useful in
creating inferences to follow up with other methods. The most widely
used scoring system for the Rorschach is the Exner system of scoring. Another common projective test is the Thematic Apperception Test (TAT), which is often scored with Westen's Social Cognition and Object Relations Scales and Phebe Cramer's Defense Mechanisms Manual. Both "rating scale" and "free response" measures are used in contemporary clinical practice, with a trend toward the former.
Other projective tests include the House-Tree-Person test and the Animal Metaphor Test.
Sexological tests
The number of tests specifically meant for the field of sexology is quite limited. The field of sexology
provides different psychological evaluation devices in order to examine
the various aspects of the discomfort, problem or dysfunction,
regardless of whether they are individual or relational ones.
Direct observation tests
Although
most psychological tests are "rating scale" or "free response"
measures, psychological assessment may also involve the observation of
people as they complete activities. This type of assessment is usually
conducted with families in a laboratory or at home, or with children in a
classroom. The purpose may be clinical, such as to establish a
pre-intervention baseline of a child's hyperactive or aggressive
classroom behaviors or to observe the nature of a parent-child
interaction in order to understand a relational disorder. Direct
observation procedures are also used in research, for example to study
the relationship between intrapsychic variables and specific target
behaviors, or to explore sequences of behavioral interaction.
The Parent-Child Interaction Assessment-II (PCIA)
is an example of a direct observation procedure that is used with
school-age children and parents. The parents and children are video
recorded playing at a make-believe zoo. The Parent-Child Early
Relational Assessment is used to study parents and young children and involves a feeding and a puzzle task. The MacArthur Story Stem Battery (MSSB) is used to elicit narratives from children. The Dyadic Parent-Child Interaction Coding System-II tracks the extent to which children follow the commands of parents and vice versa and is well suited to the study of children with oppositional defiant disorder and their parents.
Interest tests
Interest tests are psychological tests that assess a person’s interests and preferences. These tests are
used primarily for career counseling. Interest tests include items about
daily activities from among which applicants select their preferences.
The rationale is that if a person exhibits the same pattern of interests
and preferences as people who are successful in a given occupation,
then the chances are high that the person taking the test will find
satisfaction in that occupation. A widely used interest test is the Strong Interest Inventory, which is used in career assessment, career counseling, and educational guidance.
Aptitude tests
Aptitude tests measure specific abilities, such as clerical, perceptual,
numerical, or spatial aptitude. Sometimes these tests must be specially
designed for a particular job, but there are also tests available that
measure general clerical and mechanical aptitudes, or even general
learning ability. An example of an occupational aptitude test is the
Minnesota Clerical Test, which measures the perceptual speed and
accuracy required to perform various clerical duties. Other widely used
aptitude tests include Careerscope and the Differential Aptitude Tests
(DAT); the DAT assess verbal reasoning, numerical ability, abstract
reasoning, clerical speed and accuracy, mechanical reasoning, space
relations, spelling and language usage. Another widely used test of
aptitudes is the Wonderlic Test.
These aptitudes are believed to be related to specific occupations and
are used for career guidance as well as selection and recruitment.
Biographical Information Blank
The Biographical Information Blank
(BIB) is a paper-and-pencil form that includes items that ask about
detailed personal and work history. It is used to aid in the hiring of
employees by matching the backgrounds of individuals to requirements of
the job.
Test security
Many psychological tests are not available to the public. Both the
publishers of the tests and psychology licensing boards impose
restrictions that prevent the disclosure of the tests
themselves and of information about the interpretation of the results.
Test publishers consider both copyright and matters of professional
ethics to be involved in protecting the secrecy of their tests, and they
sell tests only to people who have proved their educational and
professional qualifications to the test maker's satisfaction.
Purchasers are legally bound not to give test answers or the tests
themselves out to the public unless permitted under the test maker's
standard conditions for administration of the tests.
The International Test Commission (ITC), an international
association of national psychological societies and test publishers,
publishes the International Guidelines for Test Use, which
direct test users to "protect the integrity" of the tests by not publicly
describing test techniques and by not "coaching individuals" so that
they "might unfairly influence their test performance."