
Monday, November 30, 2020

Every Student Succeeds Act

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Every_Student_Succeeds_Act

Every Student Succeeds Act
Long title: An original bill to reauthorize the Elementary and Secondary Education Act of 1965 to ensure that every child achieves.
Acronyms (colloquial): ESSA
Enacted by: the 114th United States Congress
Citations
Public law: Pub.L. 114–95
Codification
Acts amended: Elementary and Secondary Education Act of 1965
Acts repealed: No Child Left Behind Act
Titles amended: 20 U.S.C.: Education
U.S.C. sections amended: 20 U.S.C. ch. 28 § 1001 et seq.; 20 U.S.C. ch. 70
Legislative history
  • Introduced in the United States Senate by Lamar Alexander (R-TN) on April 30, 2015
  • Committee consideration by HELP
  • Passed the United States House of Representatives on December 2, 2015 (359–64)
  • Passed the United States Senate on December 9, 2015 (85–12)
  • Signed into law by President Barack Obama on December 10, 2015

The Every Student Succeeds Act (ESSA) is a US law passed in December 2015 that governs the United States K–12 public education policy. The law replaced its predecessor, the No Child Left Behind Act (NCLB), and modified but did not eliminate provisions relating to the periodic standardized tests given to students. Like the No Child Left Behind Act, ESSA is a reauthorization of the 1965 Elementary and Secondary Education Act, which established the federal government's expanded role in public education.

The Every Student Succeeds Act passed both chambers of Congress with bipartisan support.

Overview

President Barack Obama signs the Act into law, December 2015

The bill is the first to narrow the United States federal government's role in elementary and secondary education since the 1980s. The ESSA retains the hallmark annual standardized testing requirements of the 2001 No Child Left Behind Act but shifts the law's federal accountability provisions to states. Under the law, students will continue to take annual tests between the third and eighth grades.

ESSA leaves significantly more control to the states and districts in determining the standards students are held to. States are required to submit their goals and standards, and their plans for achieving them, to the US Department of Education (USDOE), which then provides feedback and eventually approves the plans. In doing so, the USDOE still holds states accountable by ensuring they are implementing complete and ambitious, yet feasible, goals. Students are then tested each year from third through eighth grade and once more in their junior year of high school. These standardized tests are intended to measure each student's capabilities in the classroom and the success of the state in implementing its plans. The states are also left to determine the consequences low-performing schools might face and how they will be supported in the following years. The USDOE defines low-performing schools as those in the bottom ten percent of the state, based on the number of students who successfully graduate or the number of students who test proficient in reading or language arts and mathematics.

All states must have a multiple-measure accountability system that includes the following four indicators: achievement and/or growth on annual reading/language arts and math assessments; English language proficiency; an elementary and middle school academic measure of student growth; and high school graduation rates. All states also had to include at least one additional indicator of school quality or student success, commonly called the fifth indicator. Most states use chronic absenteeism as their fifth indicator.

Another primary goal of the ESSA is preparing all students, regardless of race, income, disability, ethnicity, or proficiency in English, for a successful college experience and fulfilling career. Therefore, ESSA also requires schools to offer college and career counseling and advanced placement courses to all students.

History

ESSA vote (by party)
Senate: Republicans 40–12, Democrats 45–0
House: Republicans 178–64, Democrats 181–0

The No Child Left Behind Act was due for reauthorization in 2007, but was not pursued for a lack of bipartisan cooperation. Many states failed to meet the NCLB's standards, and the Obama Administration granted waivers to many states for schools that showed success but failed under the NCLB standards. However, these waivers usually required schools to adopt academic standards such as the Common Core. The NCLB was generally praised for forcing schools and states to become more accountable for ensuring the education of poor and minority children. However, the increase in standardized testing that occurred during the presidencies of Bush and Obama met with resistance from many parents, and many called for a lessened role for the federal government in education. Similarly, the president of the National Education Association decried the NCLB's "one-size-fits-all model ... of test, blame and punish."

Following his 2014 re-election, Senate HELP Committee Chairman Lamar Alexander (R-TN), who had served as Education Secretary under President George H.W. Bush, decided to pursue a major rewrite of No Child Left Behind. Alexander and Patty Murray (D-WA), the ranking member of the HELP committee, collaborated to write a bipartisan bill that could pass the Republican-controlled Congress and earn the signature of President Barack Obama. At the same time, John Kline (R-MN), chairman of the House Committee on Education and the Workforce, pushed his own bill in the House. In July 2015, each chamber of the United States Congress passed its own renewal of the Elementary and Secondary Education Act. President Obama remained largely outside of the negotiations, though Alexander did win Obama's promise not to threaten to veto the bill during negotiations. As the House and Senate negotiated for the passage of a single bill in both houses, Bobby Scott (D-VA), the ranking member of the House Committee on Education and the Workforce, became a key player in ensuring Democratic votes in the House. By September 2015, the House and Senate had been able to resolve most of the major differences, but continued to differ on how to evaluate schools and how to respond to schools that perform poorly. House and Senate negotiators agreed to a proposal from Scott to allow the federal government to mandate specific circumstances in which states had to intervene in schools, while broadly giving states leeway in how to rate schools and in how to help struggling schools. Other major provisions included a pre-K program (at the urging of Murray), a provision to help ensure that states would not be able to exempt large swaths of students from testing (at the behest of civil rights groups), and restrictions on the power of the Education Secretary (at the urging of Alexander and Kline).
The surprise resignation of Speaker John Boehner nearly derailed the bill, but incoming Speaker Paul Ryan's support of the bill helped ensure its passage. In December 2015, the House passed the bill in a 359–64 vote; days later, the Senate passed the bill in an 85–12 vote. President Obama signed the bill into law on December 10, 2015.

Students with disabilities

The Every Student Succeeds Act also sets new mandates on expectations and requirements for students with disabilities. Most students with disabilities will be required to take the same assessments and will be held to the same standards as other students. ESSA allows for only one percent of students, accounting for ten percent of students with disabilities, to be excused from the usual standardized testing. This one percent is reserved for students with severe cognitive disabilities, who will be required to take an alternate assessment instead. This is a smaller percentage of students than under past mandates, mainly because there is not enough staff available to administer the assessments to the students one-on-one. The Department of Education does not define "disabled"; rather, each state decides its own definition in order to determine which students will be allowed to take the alternate assessment. This could make it more challenging to compare students to one another, because not all states will define "disabled" the same way. The ESSA has also recognized that bullying and harassment in schools disproportionately affect students with disabilities. Because of this, the ESSA requires states to develop and implement plans on how they will combat and attempt to reduce bullying incidents on their campuses.

Reception and opinion

President Obama explains why he signed the Act

Journalist Libby Nelson wrote that the ESSA was a victory for conservatives who wished to see federal control of school accountability transferred to states, and that states "could scale back their efforts to improve schools for poor and minority children".

Researchers from the Thomas B. Fordham Institute also approved of "grant[ing] states more authority over their accountability systems." However, they also expressed concern that, in an effort to set proficiency levels that low-performing students could pass, states would neglect the needs of high-performing students, which would disproportionately affect high-performing, low-income students.

State testing under ESSA

According to the October 24, 2015 U.S. Department of Education Fact Sheet: Testing Action Plan, state testing programs implemented under No Child Left Behind and Race to the Top were "draining creative approaches from our classrooms", "consuming too much instructional time" and "creating undue stress for educators and students."

Federal mandates and incentives were cited as partly responsible for students spending too much time taking standardized tests. ESSA provided states with flexibility to correct the balance and unwind "practices that have burdened classroom time or not served students or educators well."

The Every Student Succeeds Act statute, regulations and guidance give states broad discretion to design and implement assessment systems. Neither the statute nor the regulations apply any specific limits on test design; however, United States Department of Education guidance documents say it is essential to ensure that tests "take up the minimum necessary time."

Section 1111(b)(2)(B)(viii)(1) of ESSA presents states with the opportunity to meet all Federal academic assessment requirements with a single comprehensive test. As of 2018-19 some states like Maryland continue to fulfill ESSA assessment requirements by administering four or more content-specific state standardized tests with testing windows that stretch from December through June.

The Every Student Succeeds Act prohibits any officer or employee of the Federal Government from using grants, contracts or other cooperative agreements to mandate, direct or control a state's academic standards and assessments. It also explicitly prohibited any requirement, direction or mandate to adopt the Common Core State Standards and gave states explicit permission to withdraw from the Common Core State Standards or otherwise revise their standards. On January 31, 2019, Florida's Governor signed an executive order "eliminating Common Core and the vestiges of Common Core" from Florida's public schools.

The following list is an incomplete enumeration of state testing initiatives designed to satisfy the requirements of the ESSA

Suspension of accountability requirements

An Inauguration Day directive of January 20, 2017, titled "Regulatory Freeze Pending Review" and issued by President Donald Trump's Assistant to the President and Chief of Staff, delayed implementation of new regulations, including portions of the Every Student Succeeds Act. On February 10, 2017, U.S. Secretary of Education Betsy DeVos wrote to chief state school officers that "states should continue their work" in developing their ESSA plans and noted that a revised template may be issued. In March 2017, Republican lawmakers, with the support of the Trump administration, used the Congressional Review Act to eliminate the Obama administration's accountability regulations.

Evidence-based education

From Wikipedia, the free encyclopedia

Molecular paleontology refers to the recovery and analysis of DNA, proteins, carbohydrates, or lipids, and their diagenetic products from ancient human, animal, and plant remains. The field of molecular paleontology has yielded important insights into evolutionary events, species' diasporas, and the discovery and characterization of extinct species. By applying molecular analytical techniques to DNA in fossils, one can quantify the level of relatedness between any two organisms for which DNA has been recovered.

Advancements in the field of molecular paleontology have allowed scientists to pursue evolutionary questions on a genetic level rather than relying on phenotypic variation alone. Using various biotechnological techniques such as DNA isolation, amplification, and sequencing scientists have been able to gain expanded new insights into the divergence and evolutionary history of countless organisms.

The evidence-based education movement has its roots in the larger movement towards evidence-based practices, and has been the subject of considerable debate.

The United Kingdom author and academic David H. Hargreaves presented a lecture in 1996 in which he stated "Teaching is not at present a research-based profession. I have no doubt that if it were it would be more effective and satisfying". He compared the fields of medicine and teaching, saying that physicians are expected to keep up to date on medical research, whereas many teachers may not even be aware of the importance of research to their profession. In order for teaching to become more research-based, he suggested, educational research would require a "radical change" and teachers would have to become more involved in the creation and application of research.

Following that lecture, English policy makers in education tried to bring theory and practice closer together. At the same time, existing education research faced criticism for its quality, reliability, impartiality and accessibility.

In 2000 and 2001, two international, evidence-based studies were created to analyze and report on the effectiveness of school education throughout the world: the Programme for International Student Assessment (PISA) in 2000 and the Progress in International Reading Literacy Study (PIRLS) in 2001.

Also, around the same time three major evidence-based studies about reading were released highlighting the value of evidence in education: the USA National Reading Panel in 2000, the Australian report on Teaching reading in 2005, and the Independent review of the teaching of early reading (Rose Report 2006), England. Approximately a year before the Rose Report, the Scottish Executive Education Department (SEED) published the results of a study entitled A Seven Year Study of the Effects of Synthetic Phonics Teaching on Reading and Spelling Attainment (Clackmannanshire Report), comparing synthetic phonics with analytic phonics.

Scientifically based research (SBR) (also evidence-based practice in education) first appeared in United States federal legislation in the Reading Excellence Act and subsequently in the Comprehensive School Reform program. However, it came into prominence in the U.S.A. under the No Child Left Behind Act of 2001 (NCLB), intended to help students in kindergarten through grade 3 who are reading below grade level. Federal funding was made available for education programs and teacher training that are "based on scientifically based reading research". NCLB was replaced in 2015 by the Every Student Succeeds Act (ESSA).

In 2002 the U.S. Department of Education founded the Institute of Education Sciences (IES) to provide scientific evidence to guide education practice and policy.

The state-driven Common Core State Standards Initiative was developed in the United States in 2009 in an attempt to standardize education principles and practices. There appears to have been some attempt to incorporate evidence-based practices. For example, the core standards website has a comprehensive description of the specific details of the English Language Arts Standards that include the areas of the alphabetic principle, print concepts, phonological awareness, phonics and word recognition, and fluency. However, it is up to the individual states and school districts to develop plans to implement the standards, and the National Governors Guide to Early Literacy appears to lack details. As of 2020, 41 states had adopted the standards, and in most cases it has taken three or more years to have them implemented. For example, Wisconsin adopted the standards in 2010 and implemented them in the 2014–2015 school year, yet in 2020 the state Department of Public Instruction was in the process of developing materials to support the standards in teaching phonics.

According to reports, the Common Core State Standards Initiative does not appear to have led to a significant national improvement in students' performance. The Center on Standards, Alignment, Instruction, and Learning (C-SAIL) conducted a study of how the Common Core is received in schools. It reported these findings: a) there is moderately high buy-in for the standards among teachers, principals, and superintendents, but buy-in was significantly lower for teachers, b) there is wide variation in teachers’ alignment to the standards by content area and grade level, c) specificity is desired by some educators, however states and districts are reluctant to provide too much specificity, d) State officials generally agree that accountability changes under ESSA have allowed them to adopt a “smart power” message that is less punitive and more supportive.

Subsequently, in England the Education Endowment Foundation of London was established in 2011 by The Sutton Trust, as the lead charity of the government-designated What Works Centre for high quality evidence in UK Education.

In 2012 the Department for Education in England introduced an evidence-based "phonics reading check" to help support primary students with reading. (In 2016, the Minister for Education reported that the percentage of primary students not meeting reading expectations reduced from 33% in 2010 to 20% in 2016.)

Evidence-based education in England received a boost from the 2013 briefing paper by Dr. Ben Goldacre. It advocated for systemic change and more randomized controlled trials to assess the effects of educational interventions. He said this was not about telling teachers what to do, but rather “empowering teachers to make independent, informed decisions about what works”. Following that, a UK-based non-profit, researchED, was founded to offer a forum for researchers and educationalists to discuss the role of evidence in education.

Discussion and criticism ensued. Some said research methods that are useful in medicine can be entirely inappropriate in the sphere of education. 

In 2014 the National Foundation for Educational Research, Berkshire, England published a report entitled Using Evidence in the Classroom: What Works and Why. The review synthesises effective approaches to school and teacher engagement with evidence and discusses challenges, areas for attention and action. It is intended to help the teaching profession to make the best use of evidence about what works in improving educational outcomes.

In 2014 the British Educational Research Association (BERA) and the Royal Society of Arts (RSA) conducted an inquiry into the role of research in teacher education in England, Northern Ireland, Scotland and Wales. The final report made it clear that research and teacher inquiry were of paramount importance in developing self-improving schools. It advocated for a closer working partnership between teacher-researchers and the wider academic research community.

The 2015 Carter Review of Initial Teacher Training in the UK suggested that teacher trainees should have access to, and skills in using, research evidence to support their teaching. However, they do not receive training in utilizing research.

NCLB in the USA was replaced in 2015 by the Every Student Succeeds Act (ESSA), which replaced "scientifically based research" with "evidence-based interventions" (any "activity, strategy, or intervention that shows a statistically significant effect on improving student outcomes or other relevant outcomes"). ESSA has four tiers of evidence that, some say, give schools and policy makers greater control because they can choose the desired tier of evidence. The evidence tiers are as follows:

  • Tier 1 – Strong Evidence: supported by one or more well-designed and well-implemented randomized controlled experimental studies.
  • Tier 2 – Moderate Evidence: supported by one or more well-designed and well-implemented quasi-experimental studies.
  • Tier 3 – Promising Evidence: supported by one or more well-designed and well-implemented correlational studies (with statistical controls for selection bias).
  • Tier 4 – Demonstrates a Rationale: practices that have a well-defined logic model or theory of action, are supported by research, and have some effort underway by state educational agencies (SEA), local educational agencies (LEA), or outside research organization to determine their effectiveness.
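The four tiers above can be read as a simple lookup from the strongest available study design to a tier number. The following is a minimal sketch in Python, not an official classification tool: the names `StudyDesign`, `essa_tier` and `has_logic_model` are hypothetical, and the sketch assumes that the "well-designed and well-implemented" and statistical-significance judgments have already been made by a reviewer.

```python
from enum import Enum
from typing import Optional


class StudyDesign(Enum):
    """Strongest study design supporting an intervention (hypothetical labels)."""
    RANDOMIZED = "randomized controlled experiment"
    QUASI_EXPERIMENTAL = "quasi-experimental study"
    CORRELATIONAL = "correlational study with statistical controls"
    NONE = "no qualifying study yet"


def essa_tier(best_design: StudyDesign, has_logic_model: bool = False) -> Optional[int]:
    """Return the ESSA evidence tier (1 to 4), or None if no tier applies.

    Assumption: `has_logic_model` stands in for the whole Tier 4 condition
    (a well-defined logic model, research support, and an evaluation underway).
    """
    if best_design is StudyDesign.RANDOMIZED:
        return 1  # Tier 1: strong evidence
    if best_design is StudyDesign.QUASI_EXPERIMENTAL:
        return 2  # Tier 2: moderate evidence
    if best_design is StudyDesign.CORRELATIONAL:
        return 3  # Tier 3: promising evidence
    if has_logic_model:
        return 4  # Tier 4: demonstrates a rationale
    return None
```

For instance, `essa_tier(StudyDesign.NONE, has_logic_model=True)` places a program with only a rationale in Tier 4, while the same call without a logic model returns `None`.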

In 2016 the Department for Education in England published the White Paper Educational Excellence Everywhere. It states its intention to support an evidence-informed teaching profession by increasing teachers’ access to and use of “high quality evidence”. It will also establish a new British education journal and expand the Education Endowment Foundation. In addition, on October 4, 2016 the Government announced an investment of around £75 million in the Teaching and Leadership Innovation Fund, to support high-quality, evidence-informed professional development for teachers and school leaders. A research report in July 2017 entitled Evidence-informed teaching: an evaluation of progress in England concluded this was necessary, but not sufficient. It said that the main challenge for policy makers and researchers was the level of leadership capacity and commitment to make it happen. In other words, the attitudes and actions of school leaders influence how classroom teachers are supported and held accountable for using evidence-informed practices.

In 2017 the British Educational Research Association (BERA) examined the role of universities in professional development, focusing especially on teacher education and medical education.

Critics continue, saying “Education research is great but never forget teaching is a complex art form.”  In 2018, Dylan Wiliam, Emeritus Professor of Educational Assessment at University College London, speaking at researchED stated that “Educational research will never tell teachers what to do; their classrooms are too complex for this ever to be possible.” Instead, he suggests, teachers should become critical users of educational research and “aware of when even well-established research findings are likely to fail to apply in a particular setting”. 

Reception

Acceptance

Since many educators and policy makers are not experienced in evaluating scientific studies and studies have found that "teachers’ beliefs are often guided by subjective experience rather than by empirical data", several non-profit organizations have been created to critically evaluate research studies and provide their analysis in a user-friendly manner. They are outlined in research sources and information.

EBP has not been readily adopted in all parts of the education field, leading some to suggest the K-12 teaching profession has suffered a loss of respect because of its science-aversive culture and failure to adopt empirical research as the major determinant of its practices. Speaking in 2017, Harvey Bischof, Ontario Secondary School Teachers' Federation (OSSTF), said there is a need for teacher-centred education based upon what works in the classroom. He suggested that Ontario education "lacks a culture of empiricism" and is vulnerable to gurus, ideologues and advocates promoting unproven trends and fads. 

Neuroscientist Mark Seidenberg, University of Wisconsin–Madison, stated that “A stronger scientific ethos (in education) could have provided a much needed defense against bad science”, particularly in the field of early reading instruction. Other influential researchers in psychopedagogy, cognitive science and neuroscience, such as Stanislas Dehaene and Michel Fayol have also supported the view of incorporating science into educational practices.

Critics and skeptics

Skeptics point out that EBP in medicine often produces conflicting results, so why should educators accept EBP in education?  Others feel that EBE "limits the opportunities for educational professionals to exert their judgment about what is educationally desirable in particular situations".

Some suggest teachers should not pick up research findings and implement them directly into the classroom; instead they advocate for a modified approach some call evidence-informed teaching that combines research with other types of evidence plus personal experience and good judgement.  (To be clear, some use the term evidence-informed teaching to mean "practice that is influenced by robust research evidence".)

Still others say there is “a mutual interdependence between science and education”, and teachers should become better trained in research science and “take science sufficiently seriously” to see how its methods might inform their practice. Straight Talk on Evidence has suggested that reports about evidence in education need to be scrutinized for accuracy or subjected to metascience (research on research).

In a 2020 talk featured on ResearchED, Dylan Wiliam argues that when looking at the cost, benefit and practicality of research, more impact on student achievement will come from a knowledge-rich curriculum and improving teachers’ pedagogical skills.

Concerns

There has also been some discussion of a philosophical nature about the validity of scientific evidence. This led James M. Kauffman, University of Virginia, and Gary M. Sasso, University of Iowa, to respond in 2006 suggesting that problems arise with the extreme views of a) the "unbound faith in science" (i.e. scientism) or b) the "criticism of science" (that they label as the "nonsense of postmodernism"). They go on to say that science is "the imperfect but best tool available for trying to reduce uncertainty about what we do as special educators".

Meta-analysis

A meta-analysis is a statistical analysis that combines the results of multiple scientific studies. A concern of some researchers is the unreliability of some of these reports due to methodological features. For example, it is suggested that some meta-analysis findings are not credible because they do not exclude or control for studies with small sample sizes or very short durations, or studies in which the researchers themselves do the measurements. Such reports can yield "implausible" results. According to Robert Slavin, of the Center for Research and Reform in Education at Johns Hopkins University and Evidence for ESSA, "Meta-analyses are important, because they are widely read and widely cited, in comparison to individual studies. Yet until meta-analyses start consistently excluding, or at least controlling for studies with factors known to inflate mean effect sizes, then they will have little if any meaning for practice."
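The concern about inflated mean effect sizes can be illustrated with a small calculation. The sketch below is only an illustration under stated assumptions: the `Study` records are invented, the exclusion thresholds are hypothetical, and weighting by sample size is a simplification of the inverse-variance weighting that published meta-analyses typically use.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Study:
    effect_size: float             # standardized mean difference
    sample_size: int               # number of students
    duration_weeks: int            # length of the intervention
    researcher_made_measure: bool  # outcome measured by the researchers themselves


def pooled_effect(
    studies: Iterable[Study],
    min_n: int = 0,
    min_weeks: int = 0,
    exclude_researcher_measures: bool = False,
) -> float:
    """Sample-size-weighted mean effect size after optional exclusions."""
    kept = [
        s for s in studies
        if s.sample_size >= min_n
        and s.duration_weeks >= min_weeks
        and not (exclude_researcher_measures and s.researcher_made_measure)
    ]
    if not kept:
        raise ValueError("no studies remain after applying the exclusions")
    total_n = sum(s.sample_size for s in kept)
    return sum(s.effect_size * s.sample_size for s in kept) / total_n


# Invented data: one small, brief study with a researcher-made measure
# and two larger, longer studies.
STUDIES = [
    Study(0.90, 50, 4, True),
    Study(0.10, 400, 30, False),
    Study(0.20, 600, 36, False),
]

unfiltered = pooled_effect(STUDIES)
filtered = pooled_effect(STUDIES, min_n=60, min_weeks=12,
                         exclude_researcher_measures=True)
```

With these made-up numbers the filtered estimate is lower than the unfiltered one, which is the pattern the criticism describes; real data need not behave the same way.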

Research sources and information

The following organizations evaluate research on educational programs, or help educators to understand the research.

Best Evidence Encyclopedia (BEE)

Best Evidence Encyclopedia (BEE) is a free website created by the Johns Hopkins University School of Education's Center for Data-Driven Reform in Education (established in 2004) and is funded by the Institute of Education Sciences, U.S. Department of Education. It gives educators and researchers reviews about the strength of the evidence supporting a variety of English programs available for students in grades K–12. The reviews cover programs in areas such as Mathematics, Reading, Writing, Science, Comprehensive school reform, and Early childhood Education, and include such topics as the effectiveness of technology and struggling readers.

BEE selects reviews that meet consistent scientific standards and relate to programs that are available to educators. 

Educational programs in the reviews are rated according to the overall strength of the evidence supporting their effects on students, as determined by the combination of the quality of the research design and their effect size. The BEE website contains an explanation of their interpretation of effect size and how it might be viewed as a percentile score. It uses the following categories of ratings:

  • Strong evidence of effectiveness
  • Moderate evidence of effectiveness
  • Limited evidence of effectiveness: Strong evidence of modest effects
  • Limited evidence of effectiveness: Weak evidence with notable effect
  • No qualifying studies
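One common way to view an effect size as a percentile score is the normal-curve conversion sketched below. This is an assumption about the general technique, not a reproduction of BEE's own explanation: it treats outcomes as normally distributed and asks where the average student in the program group would fall within the comparison group. The function name `effect_size_to_percentile` is hypothetical.

```python
from statistics import NormalDist


def effect_size_to_percentile(effect_size: float) -> float:
    """Percentile of the comparison group reached by the average treated student.

    Assumes normally distributed outcomes with equal spread in both groups.
    """
    # The standard normal CDF maps a standardized difference to a proportion.
    return NormalDist().cdf(effect_size) * 100.0


# An effect size of zero leaves the average student at the 50th percentile;
# positive effect sizes move the average student above it.
baseline = effect_size_to_percentile(0.0)
modest = effect_size_to_percentile(0.25)
```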

Reading programs

In 2019, BEE released a review of research on 61 studies of 48 different programs for struggling readers in elementary schools. 84% were randomized experiments and 16% quasi-experiments.  The vast majority were done in the USA, the programs are replicable, and the studies, done between 1990 and 2018, had a minimum duration of 12 weeks. Many of the programs used phonics-based teaching and/or one or more of the following: cooperative learning, technology-supported adaptive instruction (see Educational technology), metacognitive skills, phonemic awareness, word reading, fluency, vocabulary, multisensory learning, spelling, guided reading, reading comprehension, word analysis, structured curriculum, and balanced literacy (non-phonetic approach). Significantly, table 5 (pg. 88) shows the mean weighted effect sizes of the programs by the manner in which they were conducted (i.e. by school, by classroom, by technology-supported adaptive instruction, by one-to-small-group tutoring, and by one-to-one tutoring).  Table 8 (pg. 91) lists the 22 programs meeting ESSA standards for strong and moderate ratings, and their effect size.

The review concludes that a) outcomes were positive for one-to-one tutoring, b) outcomes were positive but not as large for one-to-small group tutoring, c) there were no differences in outcomes between teachers and teaching assistants as tutors, d) technology-supported adaptive instruction did not have positive outcomes, e) whole-class approaches (mostly cooperative learning) and whole-school approaches incorporating tutoring obtained outcomes for struggling readers as large as those found for one-to-one tutoring, and benefitted many more students, and f) approaches mixing classroom and school improvements, with tutoring for the most at-risk students, have the greatest potential for the largest numbers of struggling readers.

The site also offers a newsletter from Robert Slavin, Director of the Center for Research and Reform in Education, containing information on education around the world.

Blueprints for healthy youth development

Blueprints for Healthy Youth Development, University of Colorado Boulder, offers a registry of evidence-based interventions with "the strongest scientific support" that are effective in promoting a healthy course of action for youth development.

Education Endowment Foundation

The Education Endowment Foundation of London, England was established in 2011 by The Sutton Trust, as a lead charity in partnership with Impetus Trust, together being the government-designated What Works Centre for UK Education.  It offers an online, downloadable Teaching & Learning Toolkit evaluating and describing a variety of educational interventions according to cost, evidence and impact.  As an example, it evaluates and describes a 2018 phonics reading program with low cost, extensive evidence and moderate impact. 

Evidence for ESSA

Evidence for ESSA began in 2017 and is produced by the Center for Research and Reform in Education (CRRE) at Johns Hopkins University School of Education, Baltimore, MD. It is reported to have received "widespread support", and offers free up-to-date information on current PK-12 programs in reading, math, social-emotional learning, and attendance that meet the standards of the Every Student Succeeds Act (ESSA) (the United States K–12 public education policy signed by President Obama in 2015). It also provides information on programs that meet ESSA standards as well as those that do not.

Evidence-based PK-12 programs

There are three program categories: 1) whole class, 2) struggling readers and 3) English learners. Programs can be filtered by a) ESSA evidence rating (strong, moderate, and promising), b) school grade, c) community (rural, suburban, urban), d) groups (African American, Asian American, Hispanic, White, free and reduced-price lunch, English learners, and special education), and e) a variety of features such as cooperative learning, technology, tutoring, etc.

For example, as of June 2020 there were 89 reading programs in the database. After filtering for strong results, grades 1-2, and free and reduced-price lunches, 23 programs remain. If the list is also filtered for struggling readers, it is narrowed to 14 programs. The resulting list is grouped by ESSA rating: Strong, Moderate or Promising. Each program can then be evaluated according to the following: number of studies, number of students, average effect size, ESSA rating, cost, program description, outcomes, and requirements for implementation.

Social programs that work and Straight Talk on Evidence

Social programs that work and Straight Talk on Evidence are administered by the Arnold Ventures LLC’s evidence-based policy team, with offices in Houston, Washington, D.C., and New York City. The team is composed of the former leadership of the Coalition for Evidence-Based Policy, a nonprofit, nonpartisan organization advocating the use of well-conducted randomized controlled trials (RCTs) in policy decisions. It offers information on twelve types of social programs including education.

Social programs that work evaluates programs according to their RCTs and gives them one of three ratings:

  • Top Tier: Programs with two or more replicable and well-conducted RCTs (or one multi-site RCT), in typical community settings, producing sizable sustained outcomes.
  • Near Top Tier: Programs that meet almost all elements of the Top Tier standard but need another replication RCT to confirm the initial findings.
  • Suggestive Tier: Programs appearing to be a strong candidate with some shortcomings. They produce sizeable positive effects based on one or more well conducted RCTs (or studies that almost meet this standard); however, the evidence is limited by factors such as short-term follow-up or effects that are not statistically significant.

Education programs include K-12 and postsecondary. The programs are listed under each category according to their rating, and the update date is shown. For example, as of June 2020 there were 12 programs under K-12; two were Top Tier, five were Near Top Tier, and the remainder were Suggestive Tier. Each program contains information about the program, evaluation methods, key findings and other data such as the cost per student. Beyond the general category, there does not appear to be any way to filter for only the type of program of interest; however, the list may not be especially long.

Straight Talk on Evidence seeks to distinguish between programs that only claim to be effective and other programs showing credible findings of being effective. It reports mostly on randomized controlled trial (RCT) evaluations, recognizing that RCTs offer no guarantee that the study was implemented well, or that its reported results represented the true findings. The lead author of a study is given an opportunity to respond to their report prior to its publication.

What Works Clearinghouse (WWC)

What Works Clearinghouse (WWC) of Washington, DC, was established in 2002 and evaluates numerous educational programs in twelve categories by the quality and quantity of the evidence and the effectiveness. It is operated by the federal National Center for Education Evaluation and Regional Assistance (NCEE), part of the Institute of Education Sciences (IES).

Publications

WWC publications are available for a variety of topics (e.g. literacy, charter schools, science, early childhood, etc.) and Type (i.e. Practice guide or Intervention report).

Practice guides, tutorials, videos and webinars

Practice guides with recommendations are provided covering a wide variety of subjects such as Using Technology to Support Postsecondary Student Learning and Assisting Students Struggling with Reading, etc. Other resources such as tutorials, videos and webinars are also available.

Reviews of individual studies

Individual studies that have been reviewed by WWC are available, categorized according to the evidence tiers of the United States Every Student Succeeds Act (ESSA). Search filters are available for the following:

  • WWC ratings (e.g. meets WWC standards with or without reservations, meets WWC standards without reservations, etc.)
  • Topic (e.g. behavior, charter schools, etc.)
  • Studies meeting certain design standards (e.g. Randomized controlled trial, Quasi-experiment design, etc.)
  • ESSA ratings (e.g. ESSA Tier 1, ESSA Tier 2, etc.)
  • Studies with one or more statistically positive findings

Intervention reports, programs and search filters

Intervention reports are provided for programs according to twelve topics (e.g. literacy, mathematics, science, behavior, etc.).

The filters are helpful to find programs that meet specific criteria. For example, as of July 2020 there were 231 literacy programs in the WWC database. (Note: these are literacy programs that may have several individual trials and some of the trials were conducted as early as 2006.) If these programs are filtered for outcomes in Literacy-Alphabetics the list is narrowed to 25 programs that met WWC standards for evidence and had at least one "potentially positive" effectiveness rating. If the list is further filtered to show only programs in grades one or two, and delivery methods of individual, or small group, or whole class the list is down to 14 programs; and five of those have an effectiveness rating of "strong evidence that intervention had a positive effect on outcomes" in alphabetics.

The resulting list of programs can then be sorted by a) evidence of effectiveness, b) alphabetically, or c) school grades examined. It is also possible to select individual programs to be compared with each other; however, it is advisable to recheck each individual program by searching on the Intervention Reports page. The resulting programs show data in the following areas:

  • outcome domain (e.g. alphabetics, oral language, general mathematics achievement, etc.)
  • effectiveness rating (e.g. positive, potentially positive, mixed, etc.)
  • number of studies meeting WWC standards
  • grades examined (e.g. K-4)
  • number of students in studies that met the WWC standards, and
  • improvement index (i.e. the expected change in percentile rank).
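The improvement index in the last item can be made concrete with a small calculation. As WWC's published procedures describe it, the index is derived from a standardized effect size: the effect size is converted to the percentile that an average intervention student would occupy in the comparison group's distribution (assumed normal), and 50 is subtracted. The following is a minimal sketch under that assumption; the function name and the sample effect sizes are illustrative and are not taken from any WWC report.

```python
from statistics import NormalDist


def improvement_index(effect_size: float) -> float:
    """Expected change in percentile rank for an average comparison-group
    student, given a standardized effect size (illustrative helper)."""
    # Percentile of the average intervention student within the comparison
    # group's (assumed normal) distribution, minus the 50th-percentile baseline.
    return NormalDist().cdf(effect_size) * 100 - 50


if __name__ == "__main__":
    for g in (0.0, 0.25, 0.5):  # hypothetical effect sizes
        print(f"effect size {g:.2f} -> improvement index {improvement_index(g):+.1f}")
```

An effect size of zero gives an index of zero, and negative effect sizes give a negative index, which matches the reading of the index as a gain or loss in percentile rank.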

It is also possible to view the program's Evidence snapshot, detailed Intervention report and Review protocols. For other independent "related reviews", go to the evidence snapshot then the WWC Summary of Evidence.

The following chart, updated in July 2020, shows some programs that had "strong evidence" of a "positive effect on outcomes" in the areas specified. The results may have changed since that time; however, current information is available on the WWC website, including the outcome domains that did not have "strong evidence".

Some of the concerns expressed about WWC are that it appears to have difficulty keeping up with the research, so it may not be current; and that when a program is not listed in its database, it may be that the program did not meet WWC criteria or that it has not yet been reviewed, but there is no way to know which. In addition, Straight Talk on Evidence, authored by the Arnold Ventures LLC Evidence-Based Policy team, expressed concerns on January 16, 2018 about the validity of the ratings provided by WWC. It says WWC in some cases reported a "preliminary outcome when high-quality RCTs found no significant effects on more important and final educational outcomes".

A summary of the January 2020 changes to the WWC procedures and standards is available on their site.

Other sources of information

  • The British Educational Research Association (BERA) claims to be the home of educational research in the United Kingdom. It is a membership association that aims to improve the knowledge of education by advancing research quality, capacity and engagement. Its resources include a quarterly magazine, journals, articles, and conferences.
  • Campbell Collaboration is a nonprofit organization that promotes evidence-based decisions and policy through the production of systematic reviews and other types of evidence synthesis. It has widespread international support, and allows users to easily search by topic area (e.g. education) or key word (e.g. reading).
  • Doing What Works is provided by WestEd, a San Francisco-based nonprofit organization, and offers an online library that includes interviews with researchers and educators, in addition to materials and tools for educators. WestEd was criticized in January 2020 for allegedly not interviewing all interested parties prior to releasing a report.
  • Early childhood Technical Assistance Center (ECTA), of Chapel Hill, NC, provides resources on evidence-based practices in areas specific to early childhood care and education, professional development, early intervention and early childhood special education.
  • Florida Center for Reading Research is a research center at Florida State University that explores all aspects of reading research. Its Resource Database allows you to search for information based on a variety of criteria.
  • Institute of Education Sciences (IES), Washington, DC, is the statistics, research, and evaluation arm of the U.S. Department of Education. It funds independent education research, evaluation and statistics. It published a Synthesis of its Research on Early Intervention and Early Childhood Education in 2013. Its publications and products can be searched by author, subject, etc.
  • The International Initiative for Impact Evaluation (3ie) is a registered non-governmental organisation, founded in 2008, with offices in New Delhi, London and Washington, DC. Its self-described vision is to improve lives through evidence-informed action in developing countries. In 2016 its researchers synthesised evidence from 238 impact evaluations and 121 qualitative research studies and process evaluations in 52 low- and middle-income countries (L&MICs). The synthesis looked at children’s school enrolment, attendance, completion and learning. The results can be viewed in the report entitled The impact of education programmes on learning and school participation in low- and middle-income countries.
  • National Foundation for Educational Research (NFER) is a non-profit research and development organization based in Berkshire, England. It produces independent research and reports about issues across the education system, such as Using Evidence in the Classroom: What Works and Why.
  • The Ministry of Education, Ontario, Canada offers a site entitled What Works? Research Into Practice. It is a collection of research summaries of promising teaching practice written by experts at Ontario universities.
  • RAND Corporation, with offices throughout the world, funds research on early childhood, K-12, and higher education.
  • ResearchED, a U.K.-based non-profit founded in 2013, has organized education conferences around the world (e.g. Africa, Australia, Asia, Canada, the E.U., the Middle East, New Zealand, the U.K. and the U.S.A.) featuring researchers and educators in order to "promote collaboration between research-users and research-creators". It has been described as a "grass-roots teacher-led project that aims to make teachers research-literate and pseudo-science proof". It also publishes an online magazine featuring articles by practicing teachers and others such as Professor Daniel T. Willingham (University of Virginia) and Professor Dylan Wiliam (Emeritus Professor, UCL Institute of Education). Finally, it offers frequent, free online video presentations on subjects such as curriculum design, simplifying your practice, unleashing teachers' expertise, the bridge over the reading gap, education post-corona, remote teaching, teaching critical thinking, etc. The free presentations are also available on its YouTube channel. ResearchED has been featured in online debates about so-called "teacher populism".
  • Research 4 Schools, University of Delaware is supported by the Institute of Education Sciences, U.S. Department of Education and offers peer reviewed research about education.

Evidence-based learning techniques

The following are some examples of evidence-based learning techniques.

Spaced repetition

In the Leitner system, correctly answered cards are advanced to the next, less frequent box, while incorrectly answered cards return to the first box.

Spaced repetition is a theory that repetitive training that includes long intervals between training sessions helps to form long-term memory. It is also referred to as spaced training, spacing effect and spaced learning. Such training has been known since the seminal work of Hermann Ebbinghaus to be superior to training that includes short inter-trial intervals (massed training or massed learning) in terms of its ability to promote memory formation. As a learning technique, it is typically performed with flashcards. Newly introduced and more difficult flashcards are shown more frequently while older and less difficult flashcards are shown less frequently in order to exploit the psychological spacing effect. The use of spaced repetition has been shown to increase the rate of learning. Although the principle is useful in many contexts, spaced repetition is commonly applied in contexts in which a learner must acquire a large number of items and retain them indefinitely in memory. It is, therefore, well suited for the problem of vocabulary acquisition in the course of second language learning. A number of spaced repetition software programs have been developed to aid the learning process. It is also possible to perform spaced repetition with flash cards using the Leitner system.
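The Leitner rule described in the caption above (a correct answer moves a card to the next, less frequent box; an incorrect answer sends it back to the first box) can be sketched in a few lines. This is only an illustration: the number of boxes, the `Card` class, and the review schedule in `due_cards` (box k reviewed every 2^(k-1) sessions) are assumptions made here, not part of any particular program.

```python
from dataclasses import dataclass


@dataclass
class Card:
    prompt: str
    answer: str
    box: int = 1  # box 1 is reviewed most frequently


def review(card: Card, correct: bool, num_boxes: int = 5) -> Card:
    """Apply the Leitner rule to one card after a review."""
    if correct:
        # Promote to the next, less frequently reviewed box (capped at the last box).
        card.box = min(card.box + 1, num_boxes)
    else:
        # Any mistake sends the card back to the first box.
        card.box = 1
    return card


def due_cards(cards: list[Card], session: int) -> list[Card]:
    """Cards to review in a given session (sessions numbered from 1).

    Hypothetical schedule: box k comes up every 2**(k - 1) sessions.
    """
    return [c for c in cards if session % (2 ** (c.box - 1)) == 0]


if __name__ == "__main__":
    deck = [Card("chien", "dog"), Card("chat", "cat")]
    review(deck[0], correct=True)   # moves to box 2
    review(deck[1], correct=False)  # stays in box 1
    for session in (1, 2):
        print(session, [c.prompt for c in due_cards(deck, session)])
```

Other spacing schedules are possible; the doubling interval is used here only because it keeps the example short.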

Errorless learning

Errorless learning was an instructional design introduced by psychologist Charles Ferster in the 1950s as part of his studies on what would make the most effective learning environment. B. F. Skinner was also influential in developing the technique, and noted: "errors are not necessary for learning to occur. Errors are not a function of learning or vice versa nor are they blamed on the learner. Errors are a function of poor analysis of behavior, a poorly designed shaping program, moving too fast from step to step in the program, and the lack of the prerequisite behavior necessary for success in the program." Errorless learning can also be understood at a synaptic level, using the principle of Hebbian learning ("Neurons that fire together wire together").

Interest from psychologists studying basic research on errorless learning declined after the 1970s. However, errorless learning attracted the interest of researchers in applied psychology, and studies have been conducted with both children (e.g., educational settings) and adults (e.g. Parkinson's patients). Errorless learning continues to be of practical interest to animal trainers, particularly dog trainers.

Errorless learning has been found to be effective in helping memory-impaired people learn more effectively. The reason for the method's effectiveness is that, while those with sufficient memory function can remember mistakes and learn from them, those with memory impairment may not only have difficulty remembering which methods work, but may also strengthen incorrect responses over correct responses, such as via emotional stimuli. See also the reference by Brown to its application in teaching mathematics to undergraduates.

N-back training

The n-back task is a continuous performance task that is commonly used as an assessment in cognitive neuroscience to measure a part of working memory and working memory capacity. The n-back was introduced by Wayne Kirchner in 1958.
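In the usual form of the task, the participant is shown a sequence of stimuli and must indicate, for each one, whether it matches the stimulus presented n steps earlier. The sketch below marks the target positions in a sequence and scores a set of responses; the function names and the letter sequence are illustrative assumptions, and real test software also controls timing and stimulus modality, which is omitted here.

```python
def nback_targets(stimuli: list[str], n: int) -> list[int]:
    """Indices at which the stimulus equals the one shown n steps earlier."""
    return [i for i in range(n, len(stimuli)) if stimuli[i] == stimuli[i - n]]


def score(stimuli: list[str], n: int, responses: set[int]) -> tuple[int, int, int]:
    """Return (hits, misses, false alarms).

    `responses` holds the indices at which the participant signalled a match.
    """
    targets = set(nback_targets(stimuli, n))
    hits = len(targets & responses)
    misses = len(targets - responses)
    false_alarms = len(responses - targets)
    return hits, misses, false_alarms


if __name__ == "__main__":
    sequence = list("ABABCBC")  # hypothetical letter stream
    print(nback_targets(sequence, 2))
    print(score(sequence, 2, responses={2, 4}))
```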

A 2008 research paper claimed that practicing a dual n-back task can increase fluid intelligence (Gf), as measured in several different standard tests. This finding received some attention from popular media, including an article in Wired. However, a subsequent criticism of the paper's methodology questioned the experiment's validity and took issue with the lack of uniformity in the tests used to evaluate the control and test groups. For example, the progressive nature of Raven's Advanced Progressive Matrices (APM) test may have been compromised by modifications of time restrictions (i.e., 10 minutes were allowed to complete a normally 45-minute test). The authors of the original paper later addressed this criticism by citing research indicating that scores in timed administrations of the APM are predictive of scores in untimed administrations.

The 2008 study was replicated in 2010 with results indicating that practicing single n-back may be almost equal to dual n-back in increasing the score on tests measuring Gf (fluid intelligence). The single n-back test used was the visual test, leaving out the audio test. In 2011, the same authors showed a long-lasting transfer effect in some conditions.

Two studies published in 2012 failed to reproduce the effect of dual n-back training on fluid intelligence. These studies found that the effects of training did not transfer to any other cognitive ability tests. In 2014, a meta-analysis of twenty studies showed that n-back training has a small but significant effect on Gf, improving it on average by the equivalent of 3-4 IQ points. In January 2015, this meta-analysis was the subject of a critical review due to small-study effects. The question of whether n-back training produces real-world improvements to working memory remains controversial.

Intelligent tutoring system

An intelligent tutoring system (ITS) is a computer system that aims to provide immediate and customized instruction or feedback to learners, usually without requiring intervention from a human teacher. ITSs have the common goal of enabling learning in a meaningful and effective manner by using a variety of computing technologies. There are many examples of ITSs being used in both formal education and professional settings in which they have demonstrated their capabilities and limitations. There is a close relationship between intelligent tutoring, cognitive learning theories and design; and there is ongoing research to improve the effectiveness of ITS. An ITS typically aims to replicate the demonstrated benefits of one-to-one, personalized tutoring, in contexts where students would otherwise have access to one-to-many instruction from a single teacher (e.g., classroom lectures), or no teacher at all (e.g., online homework). ITSs are often designed with the goal of providing access to high quality education to each and every student.

History

Early mechanical systems

Skinner teaching machine

The possibility of intelligent machines has been discussed for centuries. Blaise Pascal created the first calculating machine capable of mathematical functions in the 17th century, simply called Pascal's Calculator. Around the same time, the mathematician and philosopher Gottfried Wilhelm Leibniz envisioned machines capable of reasoning and applying rules of logic to settle disputes (Buchanan, 2006). These early works contributed to the development of the computer and future applications.

The concept of intelligent machines for instructional use dates back as early as 1924, when Sidney Pressey of Ohio State University created a mechanical teaching machine to instruct students without a human teacher. His machine closely resembled a typewriter with several keys and a window that provided the learner with questions. The Pressey Machine allowed user input and provided immediate feedback by recording the learner's score on a counter.

Pressey himself was influenced by Edward L. Thorndike, a learning theorist and educational psychologist at Teachers College, Columbia University, in the late 19th and early 20th centuries. Thorndike posited laws for maximizing learning. Thorndike's laws included the law of effect, the law of exercise, and the law of recency. By later standards, Pressey's teaching and testing machine would not be considered intelligent, as it was mechanically run and was based on one question and answer at a time, but it set an early precedent for future projects. By the 1950s and 1960s, new perspectives on learning were emerging. Burrhus Frederic "B.F." Skinner at Harvard University did not agree with Thorndike's learning theory of connectionism or Pressey's teaching machine. Rather, Skinner was a behaviourist who believed that learners should construct their answers and not rely on recognition. He, too, constructed a teaching machine, built around an incremental mechanical system that would reward students for correct responses to questions.

Early electronic systems

In the period following the Second World War, mechanical binary systems gave way to binary-based electronic machines. These machines were considered intelligent when compared to their mechanical counterparts, as they had the capacity to make logical decisions. However, the study of defining and recognizing machine intelligence was still in its infancy.

Alan Turing, a mathematician, logician and computer scientist, linked computing systems to thinking. One of his most notable papers outlined a hypothetical test to assess the intelligence of a machine, which came to be known as the Turing test. Essentially, the test would have a person communicate with two other agents, a human and a computer, asking questions of both. The computer passes the test if it can respond in such a way that the human posing the questions cannot differentiate between the other human and the computer. The Turing test has been used in its essence for more than two decades as a model for current ITS development. The main ideal for ITS systems is to effectively communicate. As early as the 1950s programs were emerging displaying intelligent features. Turing's work as well as later projects by researchers such as Allen Newell, Clifford Shaw, and Herb Simon showed programs capable of creating logical proofs and theorems. Their program, the Logic Theorist, exhibited complex symbol manipulation and even generation of new information without direct human control, and is considered by some to be the first AI program. Such breakthroughs would inspire the new field of artificial intelligence, officially named by John McCarthy in 1956 at the Dartmouth Conference. This conference was the first of its kind that was devoted to scientists and research in the field of AI.

The PLATO V CAI terminal in 1981

The latter part of the 1960s and 1970s saw many new computer-assisted instruction (CAI) projects that built on advances in computer science. The creation of the ALGOL programming language in 1958 enabled many schools and universities to begin developing CAI programs. Major computer vendors and federal agencies in the US such as IBM, HP, and the National Science Foundation funded the development of these projects. Early implementations in education focused on programmed instruction (PI), a structure based on a computerized input-output system. Although many supported this form of instruction, there was limited evidence supporting its effectiveness. The programming language LOGO was created in 1967 by Wally Feurzeig, Cynthia Solomon, and Seymour Papert as a language streamlined for education. PLATO, an educational terminal featuring displays, animations, and touch controls that could store and deliver large amounts of course material, was developed by Donald Bitzer at the University of Illinois in the early 1970s. Along with these, many other CAI projects were initiated in many countries including the US, the UK, and Canada.

At the same time that CAI was gaining interest, Jaime Carbonell suggested that computers could act as a teacher rather than just a tool (Carbonell, 1970). A new perspective would emerge that focused on the use of computers to intelligently coach students, called Intelligent Computer Assisted Instruction or Intelligent Tutoring Systems (ITS). Where CAI used a behaviourist perspective on learning based on Skinner's theories (Dede & Swigger, 1988), ITS drew from work in cognitive psychology, computer science, and especially artificial intelligence. There was a shift in AI research at this time as systems moved from the logic focus of the previous decade to knowledge-based systems, that is, systems that could make intelligent decisions based on prior knowledge (Buchanan, 2006). One such program was Dendral, developed at Stanford University by Edward Feigenbaum, Bruce Buchanan, Joshua Lederberg, and Carl Djerassi, a system that predicted possible chemical structures from existing data. Further work began to showcase analogical reasoning and language processing. These changes, with their focus on knowledge, had big implications for how computers could be used in instruction. The technical requirements of ITS, however, proved to be higher and more complex than those of CAI systems, and ITS systems would find limited success at this time.

Towards the latter part of the 1970s interest in CAI technologies began to wane. Computers were still expensive and not as available as expected. Developers and instructors were reacting negatively to the high cost of developing CAI programs, the inadequate provision for instructor training, and the lack of resources.

Microcomputers and intelligent systems

The microcomputer revolution in the late 1970s and early 1980s helped to revive CAI development and jumpstart development of ITS systems. Personal computers such as the Apple II, Commodore PET, and TRS-80 reduced the resources required to own computers, and by 1981, 50% of US schools were using computers (Chambers & Sprecher, 1983). Several CAI projects utilized the Apple II as a system to deliver CAI programs in high schools and universities, including the British Columbia Project and California State University Project in 1981.

The early 1980s would also see Intelligent Computer-Assisted Instruction (ICAI) and ITS goals diverge from their roots in CAI. As CAI became increasingly focused on deeper interactions with content created for a specific area of interest, ITS sought to create systems that focused on knowledge of the task and the ability to generalize that knowledge in non-specific ways (Larkin & Chabay, 1992). The key goals set out for ITS were to be able to teach a task as well as perform it, adapting dynamically to its situation. In the transition from CAI to ICAI systems, the computer would have to distinguish not only between the correct and incorrect response but also the type of incorrect response, in order to adjust the type of instruction. Research in artificial intelligence and cognitive psychology fueled the new principles of ITS. Psychologists considered how a computer could solve problems and perform 'intelligent' activities. An ITS programme would have to be able to represent, store and retrieve knowledge and even search its own database to derive its own new knowledge to respond to learners' questions. Basically, early specifications for ITS (or ICAI) required it to "diagnose errors and tailor remediation based on the diagnosis" (Shute & Psotka, 1994, p. 9). The idea of diagnosis and remediation is still in use today when programming ITS.

A key breakthrough in ITS research was the creation of The LISP Tutor, a program that implemented ITS principles in a practical way and showed promising effects in increasing student performance. The LISP Tutor was developed and researched in 1983 as an ITS system for teaching students the LISP programming language (Corbett & Anderson, 1992). The LISP Tutor could identify mistakes and provide constructive feedback to students while they were performing the exercise. The system was found to decrease the time required to complete the exercises while improving student test scores (Corbett & Anderson, 1992). Other ITS systems that began to be developed around this time include TUTOR, created by Logica in 1984 as a general instructional tool, and PARNASSUS, created at Carnegie Mellon University in 1989 for language instruction.

Modern ITS

After the initial ITSs were implemented, more researchers created ITSs for different students. In the late 20th century, Intelligent Tutoring Tools (ITTs) were developed by the Byzantium project, which involved six universities. The ITTs were general-purpose tutoring system builders, and many institutions gave positive feedback while using them (Kinshuk, 1996). This builder, ITT, would produce an Intelligent Tutoring Applet (ITA) for different subject areas. Different teachers created the ITAs and built up a large inventory of knowledge that was accessible by others through the Internet. Once an ITS was created, teachers could copy it and modify it for future use. This system was efficient and flexible. However, Kinshuk and Patel believed that the ITS was not designed from an educational point of view and was not developed based on the actual needs of students and teachers (Kinshuk and Patel, 1997). Recent work has employed ethnographic and design research methods to examine the ways ITSs are actually used by students and teachers across a range of contexts, often revealing unanticipated needs that they meet, fail to meet, or in some cases, even create.

Modern day ITSs typically try to replicate the role of a teacher or a teaching assistant, and increasingly automate pedagogical functions such as problem generation, problem selection, and feedback generation. However, given a current shift towards blended learning models, recent work on ITSs has begun focusing on ways these systems can effectively leverage the complementary strengths of human-led instruction from a teacher or peer, when used in co-located classrooms or other social contexts.

There were three ITS projects that functioned based on conversational dialogue: AutoTutor, Atlas (Freedman, 1999), and Why2. The idea behind these projects was that since students learn best by constructing knowledge themselves, the programs would begin with leading questions for the students and would give out answers as a last resort. AutoTutor's students focused on answering questions about computer technology, Atlas's students focused on solving quantitative problems, and Why2's students focused on explaining physical systems qualitatively (Graesser, VanLehn, and others, 2001). Other similar tutoring systems such as Andes (Gertner, Conati, and VanLehn, 1998) tend to provide hints and immediate feedback when students have trouble answering the questions. With such systems, students could guess their answers and arrive at correct answers without a deep understanding of the concepts. Research was done with a small group of students using Atlas and Andes respectively. The results showed that students using Atlas made significant improvements compared with students who used Andes. However, since the above systems require analysis of students' dialogues, improvement is yet to be made so that more complicated dialogues can be managed.

Structure

Intelligent tutoring systems (ITSs) consist of four basic components based on a general consensus amongst researchers (Nwana, 1990; Freedman, 2000; Nkambou et al., 2010):

  1. The Domain model
  2. The Student model
  3. The Tutoring model, and
  4. The User interface model

The domain model (also known as the cognitive model or expert knowledge model) is built on a theory of learning, such as the ACT-R theory which tries to take into account all the possible steps required to solve a problem. More specifically, this model "contains the concepts, rules, and problem-solving strategies of the domain to be learned. It can fulfill several roles: as a source of expert knowledge, a standard for evaluating the student's performance or for detecting errors, etc." (Nkambou et al., 2010, p. 4). Another approach for developing domain models is based on Stellan Ohlsson's Theory of Learning from performance errors, known as constraint-based modelling (CBM). In this case, the domain model is presented as a set of constraints on correct solutions.

The student model can be thought of as an overlay on the domain model. It is considered as the core component of an ITS paying special attention to student's cognitive and affective states and their evolution as the learning process advances. As the student works step-by-step through their problem solving process, an ITS engages in a process called model tracing. Anytime the student model deviates from the domain model, the system identifies, or flags, that an error has occurred. On the other hand, in constraint-based tutors the student model is represented as an overlay on the constraint set. Constraint-based tutors evaluate the student's solution against the constraint set, and identify satisfied and violated constraints. If there are any violated constraints, the student's solution is incorrect, and the ITS provides feedback on those constraints. Constraint-based tutors provide negative feedback (i.e. feedback on errors) and also positive feedback.
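The constraint-based evaluation described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any particular tutor: the `Constraint` class, the split into a relevance test and a satisfaction test, the toy representation of an SQL query as a dictionary, and the two example constraints are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical toy representation of a student's SQL query:
# each key names a clause, each value lists the tables or columns used there.
Solution = Dict[str, List[str]]


@dataclass
class Constraint:
    name: str
    relevance: Callable[[Solution], bool]     # does the constraint apply here?
    satisfaction: Callable[[Solution], bool]  # if it applies, is it met?
    feedback: str                             # message shown when violated


# Illustrative constraints only; a real tutor would hold many more.
CONSTRAINTS = [
    Constraint(
        name="from-not-empty",
        relevance=lambda s: True,
        satisfaction=lambda s: len(s.get("from", [])) > 0,
        feedback="The FROM clause must name at least one table.",
    ),
    Constraint(
        name="where-tables-in-from",
        relevance=lambda s: bool(s.get("where_tables")),
        satisfaction=lambda s: set(s["where_tables"]) <= set(s.get("from", [])),
        feedback="A table used in WHERE is missing from the FROM clause.",
    ),
]


def evaluate(solution: Solution,
             constraints: List[Constraint]) -> Tuple[List[Constraint], List[Constraint]]:
    """Split the relevant constraints into satisfied and violated ones."""
    satisfied, violated = [], []
    for c in constraints:
        if not c.relevance(solution):
            continue  # irrelevant constraints say nothing about this solution
        (satisfied if c.satisfaction(solution) else violated).append(c)
    return satisfied, violated


student = {"select": ["name"], "from": ["employee"], "where_tables": ["department"]}
satisfied, violated = evaluate(student, CONSTRAINTS)
for c in violated:
    print("Feedback:", c.feedback)
```

In this sketch the lists of satisfied and violated constraints play the role of the student model, and a solution counts as correct only when the violated list is empty.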

The tutor model accepts information from the domain and student models and makes choices about tutoring strategies and actions. At any point in the problem-solving process the learner may request guidance on what to do next, relative to their current location in the model. In addition, the system recognizes when the learner has deviated from the production rules of the model and provides timely feedback for the learner, resulting in a shorter period of time to reach proficiency with the targeted skills. The tutor model may contain several hundred production rules that can be said to exist in one of two states, learned or unlearned. Every time a student successfully applies a rule to a problem, the system updates a probability estimate that the student has learned the rule. The system continues to drill students on exercises that require effective application of a rule until the estimated probability that the rule has been learned reaches at least 95%.
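As a rough illustration of how a tutor might check a step against production rules and answer a request for guidance, the sketch below traces steps in solving a linear equation a·x + b = c. The state representation, the two rules, and the function names (`trace_step`, `next_step_hint`) are hypothetical and chosen only for this example; real cognitive tutors use far richer rule sets.

```python
from fractions import Fraction


def eq(a, b, c):
    """State of the equation a*x + b = c, stored as exact fractions."""
    return (Fraction(a), Fraction(b), Fraction(c))


def subtract_constant(state):
    a, b, c = state
    if b != 0:
        return (a, Fraction(0), c - b)
    return None  # rule does not apply


def divide_by_coefficient(state):
    a, b, c = state
    if b == 0 and a not in (0, 1):
        return (Fraction(1), b, c / a)
    return None  # rule does not apply


# Hypothetical production rules: description -> function producing the next state.
RULES = {
    "subtract the constant from both sides": subtract_constant,
    "divide both sides by the coefficient of x": divide_by_coefficient,
}


def trace_step(state, student_state):
    """Return the name of the rule that explains the student's step, or None."""
    for name, rule in RULES.items():
        if rule(state) == student_state:
            return name
    return None  # the step deviates from the model, so the tutor flags an error


def next_step_hint(state):
    """Return the first rule applicable in the current state, as a hint."""
    for name, rule in RULES.items():
        if rule(state) is not None:
            return name
    return None  # nothing left to do


start = eq(2, 3, 9)                       # 2x + 3 = 9
print(trace_step(start, eq(2, 0, 6)))     # a step the model can explain
print(trace_step(start, eq(2, 0, 12)))    # None: flagged as an error
print(next_step_hint(eq(2, 0, 6)))        # guidance on what to do next
```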

Knowledge tracing tracks the learner's progress from problem to problem and builds a profile of strengths and weaknesses relative to the production rules. The cognitive tutoring system developed by John Anderson at Carnegie Mellon University presents information from knowledge tracing as a skillometer, a visual graph of the learner's success in each of the monitored skills related to solving algebra problems. When a learner requests a hint, or an error is flagged, the knowledge tracing data and the skillometer are updated in real-time.
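One common way to maintain such a probability estimate is Bayesian knowledge tracing, sketched below. The text above does not specify the update rule, so the formula used here, the parameter values (initial knowledge, learning, guess and slip probabilities), and the text-based `skillometer` rendering are assumptions made for illustration; only the 95% mastery threshold is taken from the description of the tutor model.

```python
from dataclasses import dataclass


@dataclass
class SkillParams:
    # Illustrative values only; real systems fit these per skill from data.
    p_init: float = 0.3     # probability the rule is known before practice
    p_transit: float = 0.1  # probability of learning the rule at each opportunity
    p_guess: float = 0.2    # probability of a correct answer without knowing the rule
    p_slip: float = 0.1     # probability of an error despite knowing the rule


MASTERY_THRESHOLD = 0.95


def update(p_known: float, correct: bool, params: SkillParams) -> float:
    """Bayesian update of P(rule learned) after one observed attempt."""
    if correct:
        evidence = p_known * (1 - params.p_slip)
        total = evidence + (1 - p_known) * params.p_guess
    else:
        evidence = p_known * params.p_slip
        total = evidence + (1 - p_known) * (1 - params.p_guess)
    posterior = evidence / total
    # The student may also have learned the rule during this attempt.
    return posterior + (1 - posterior) * params.p_transit


def attempts_to_mastery(params: SkillParams, max_attempts: int = 100):
    """Count consecutive correct attempts needed to reach the threshold."""
    p = params.p_init
    for n in range(1, max_attempts + 1):
        p = update(p, True, params)
        if p >= MASTERY_THRESHOLD:
            return n
    return None


def skillometer(estimates: dict, width: int = 20) -> str:
    """Render one text bar per skill, loosely imitating a skillometer."""
    lines = []
    for skill, p in estimates.items():
        filled = round(p * width)
        bar = "#" * filled + "." * (width - filled)
        lines.append(f"{skill:<24} [{bar}] {p:.0%}")
    return "\n".join(lines)


params = SkillParams()
p = params.p_init
for outcome in (True, False, True, True):
    p = update(p, outcome, params)
print(skillometer({"isolate the variable": p, "combine like terms": 0.5}))
```

With these illustrative parameters a correct answer raises the estimate and an error lowers it, so the number of exercises a student sees before a rule counts as mastered depends on their own record of successes and errors.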

The user interface component "integrates three types of information that are needed in carrying out a dialogue: knowledge about patterns of interpretation (to understand a speaker) and action (to generate utterances) within dialogues; domain knowledge needed for communicating content; and knowledge needed for communicating intent" (Padayachee, 2002, p. 3).

Nkambou et al. (2010) make mention of Nwana's (1990) review of different architectures underlining a strong link between architecture and paradigm (or philosophy). Nwana (1990) declares, "[I]t is almost a rarity to find two ITSs based on the same architecture [which] results from the experimental nature of the work in the area" (p. 258). He further explains that differing tutoring philosophies emphasize different components of the learning process (i.e., domain, student or tutor). The architectural design of an ITS reflects this emphasis, and this leads to a variety of architectures, none of which, individually, can support all tutoring strategies (Nwana, 1990, as cited in Nkambou et al., 2010). Moreover, ITS projects may vary according to the relative level of intelligence of the components. As an example, a project highlighting intelligence in the domain model may generate solutions to complex and novel problems so that students can always have new problems to work on, but it might only have simple methods for teaching those problems, while a system that concentrates on multiple or novel ways of teaching a particular topic might find a less sophisticated representation of that content sufficient.

Design and development methods

Apart from the discrepancy amongst ITS architectures each emphasizing different elements, the development of an ITS is much the same as any instructional design process. Corbett et al. (1997) summarized ITS design and development as consisting of four iterative stages: (1) needs assessment, (2) cognitive task analysis, (3) initial tutor implementation and (4) evaluation.

The first stage, known as needs assessment, is common to any instructional design process, especially software development. This involves a learner analysis and consultation with subject matter experts and/or the instructor(s). This first step is part of the development of the expert/knowledge and student domain. The goal is to specify learning goals and to outline a general plan for the curriculum; it is imperative not to computerize traditional concepts but to develop a new curriculum structure by defining the task in general and understanding learners' possible behaviours dealing with the task and, to a lesser degree, the tutor's behavior. In doing so, three crucial dimensions need to be dealt with: (1) the probability a student is able to solve problems; (2) the time it takes to reach this performance level and (3) the probability the student will actively use this knowledge in the future. Another important aspect that requires analysis is cost effectiveness of the interface. Moreover, teacher and student entry characteristics such as prior knowledge must be assessed since both groups are going to be system users.

The second stage, cognitive task analysis, is a detailed approach to expert systems programming with the goal of developing a valid computational model of the required problem solving knowledge. Chief methods for developing a domain model include: (1) interviewing domain experts, (2) conducting "think aloud" protocol studies with domain experts, (3) conducting "think aloud" studies with novices and (4) observation of teaching and learning behavior. Although the first method is most commonly used, experts are usually incapable of reporting cognitive components. The "think aloud" methods, in which the expert is asked to report aloud what they are thinking when solving typical problems, can avoid this problem. Observation of actual online interactions between tutors and students provides information related to the processes used in problem-solving, which is useful for building dialogue or interactivity into tutoring systems.

The third stage, initial tutor implementation, involves setting up a problem solving environment to enable and support an authentic learning process. This stage is followed by a series of evaluation activities as the final stage which is again similar to any software development project.

The fourth stage, evaluation, includes (1) pilot studies to confirm basic usability and educational impact; (2) formative evaluations of the system under development, including (3) parametric studies that examine the effectiveness of system features and finally, (4) summative evaluations of the final tutor's effect: learning rate and asymptotic achievement levels.

A variety of authoring tools have been developed to support this process and create intelligent tutors, including ASPIRE, the Cognitive Tutor Authoring Tools (CTAT), GIFT, ASSISTments Builder and AutoTutor tools. The goal of most of these authoring tools is to simplify the tutor development process, making it possible for people with less expertise than professional AI programmers to develop Intelligent Tutoring Systems.

Eight principles of ITS design and development

Anderson et al. (1987) outlined eight principles for intelligent tutor design, and Corbett et al. (1997) later elaborated on those principles, highlighting an all-embracing principle which they believed governed intelligent tutor design. They referred to this principle as:

Principle 0: An intelligent tutor system should enable the student to work to the successful conclusion of problem solving.

  1. Represent student competence as a production set.
  2. Communicate the goal structure underlying the problem solving.
  3. Provide instruction in the problem solving context.
  4. Promote an abstract understanding of the problem-solving knowledge.
  5. Minimize working memory load.
  6. Provide immediate feedback on errors.
  7. Adjust the grain size of instruction with learning.
  8. Facilitate successive approximations to the target skill.

Use in practice

All this is a substantial amount of work, even if authoring tools have become available to ease the task. This means that building an ITS is an option only in situations in which they, in spite of their relatively high development costs, still reduce the overall costs through reducing the need for human instructors or sufficiently boosting overall productivity. Such situations occur when large groups need to be tutored simultaneously or many replicated tutoring efforts are needed. Cases in point are technical training situations such as training of military recruits and high school mathematics. One specific type of intelligent tutoring system, the Cognitive Tutor, has been incorporated into mathematics curricula in a substantial number of United States high schools, producing improved student learning outcomes on final exams and standardized tests. Intelligent tutoring systems have been constructed to help students learn geography, circuits, medical diagnosis, computer programming, mathematics, physics, genetics, chemistry, etc. Intelligent Language Tutoring Systems (ILTS) teach natural language to first or second language learners. ILTS require specialized natural language processing tools such as large dictionaries and morphological and grammatical analyzers with acceptable coverage.

Applications

During the rapid expansion of the web boom, new computer-aided instruction paradigms, such as e-learning and distributed learning, provided an excellent platform for ITS ideas. Areas that have used ITS include natural language processing, machine learning, planning, multi-agent systems, ontologies, semantic Web, and social and emotional computing. In addition, other technologies such as multimedia, object-oriented systems, modeling, simulation, and statistics have also been connected to or combined with ITS. Historically non-technological areas such as the educational sciences and psychology have also been influenced by the success of ITS.

In recent years, ITS has begun to move away from the search-based to include a range of practical applications. ITS have expanded across many critical and complex cognitive domains, and the results have been far reaching. ITS systems have cemented a place within formal education and these systems have found homes in the sphere of corporate training and organizational learning. ITS offers learners several affordances such as individualized learning, just in time feedback, and flexibility in time and space.

While Intelligent tutoring systems evolved from research in cognitive psychology and artificial intelligence, there are now many applications found in education and in organizations. Intelligent tutoring systems can be found in online environments or in a traditional classroom computer lab, and are used in K-12 classrooms as well as in universities. There are a number of programs that target mathematics but applications can be found in health sciences, language acquisition, and other areas of formalized learning.

Reports of improvement in student comprehension, engagement, attitude, motivation, and academic results have all contributed to the ongoing interest in the investment in and research of these systems. The personalized nature of the intelligent tutoring systems affords educators the opportunity to create individualized programs. Within education there is a plethora of intelligent tutoring systems. An exhaustive list does not exist, but several of the more influential programs are listed below.

Education

Algebra Tutor PAT (PUMP Algebra Tutor or Practical Algebra Tutor), developed by the Pittsburgh Advanced Cognitive Tutor Center at Carnegie Mellon University, engages students in anchored learning problems and uses modern algebraic tools in order to engage students in problem solving and in sharing of their results. The aim of PAT is to tap into a student's prior knowledge and everyday experiences with mathematics in order to promote growth. The success of PAT is well documented (e.g., Miami-Dade County Public Schools Office of Evaluation and Research) from both a statistical (student results) and emotional (student and instructor feedback) perspective.

SQL-Tutor is the first ever constraint-based tutor developed by the Intelligent Computer Tutoring Group (ICTG) at the University of Canterbury, New Zealand. SQL-Tutor teaches students how to retrieve data from databases using the SQL SELECT statement.

EER-Tutor is a constraint-based tutor (developed by ICTG) that teaches conceptual database design using the Entity Relationship model. An earlier version of EER-Tutor was KERMIT, a stand-alone tutor for ER modelling, which was shown to result in significant improvement of students' knowledge after one hour of learning (with an effect size of 0.6).

COLLECT-UML is a constraint-based tutor that supports pairs of students working collaboratively on UML class diagrams. The tutor provides feedback on the domain level as well as on collaboration.

StoichTutor is a web-based intelligent tutor that helps high school students learn chemistry, specifically the sub-area of chemistry known as stoichiometry. It has been used to explore a variety of learning science principles and techniques, such as worked examples and politeness.

Mathematics Tutor The Mathematics Tutor (Beal, Beck & Woolf, 1998) helps students solve word problems using fractions, decimals and percentages. The tutor records the success rates while a student is working on problems and provides subsequent, level-appropriate problems for the student to work on. The subsequent problems are selected based on student ability, and a desirable time in which the student is to solve each problem is estimated.

eTeacher eTeacher (Schiaffino et al., 2008) is an intelligent agent, or pedagogical agent, that supports personalized e-learning assistance. It builds student profiles while observing student performance in online courses. eTeacher then uses the information from the student's performance to suggest personalized courses of action designed to assist their learning process.

ZOSMAT ZOSMAT was designed to address all the needs of a real classroom. It follows and guides a student in different stages of their learning process. This student-centered ITS does so by recording the progress in a student's learning, and the student program changes based on the student's effort. ZOSMAT can be used for either individual learning or in a real classroom environment alongside the guidance of a human tutor.

REALP REALP was designed to help students enhance their reading comprehension by providing reader-specific lexical practice and offering personalized practice with useful, authentic reading materials gathered from the Web. The system automatically builds a user model according to the student's performance. After reading, the student is given a series of exercises based on the target vocabulary found in the reading.

CIRCSIM-Tutor CIRCSIM-Tutor is an intelligent tutoring system that is used with first year medical students at the Illinois Institute of Technology. It uses Socratic, natural-language dialogue to help students learn about regulating blood pressure.

Why2-Atlas Why2-Atlas is an ITS that analyses students' explanations of physics principles. The students input their work in paragraph form and the program converts their words into a proof by making assumptions of student beliefs that are based on their explanations. In doing this, misconceptions and incomplete explanations are highlighted. The system then addresses these issues through a dialogue with the student and asks the student to correct their essay. A number of iterations may take place before the process is complete.

SmartTutor The University of Hong Kong (HKU) developed a SmartTutor to support the needs of continuing education students. Personalized learning was identified as a key need within adult education at HKU and SmartTutor aims to fill that need. SmartTutor provides support for students by combining Internet technology, educational research and artificial intelligence.

AutoTutor AutoTutor assists college students in learning about computer hardware, operating systems and the Internet in an introductory computer literacy course by simulating the discourse patterns and pedagogical strategies of a human tutor. AutoTutor attempts to understand the learner's input from the keyboard and then formulates dialog moves with feedback, prompts, correction and hints.

ActiveMath ActiveMath is a web-based, adaptive learning environment for mathematics. This system strives for improving long-distance learning, for complementing traditional classroom teaching, and for supporting individual and lifelong learning.

ESC101-ITS The Indian Institute of Technology, Kanpur, India developed the ESC101-ITS, an intelligent tutoring system for introductory programming problems.

AdaptErrEx is an adaptive intelligent tutor that uses interactive erroneous examples to help students learn decimal arithmetic.

Corporate training and industry

Generalized Intelligent Framework for Tutoring (GIFT) is educational software designed for the creation of computer-based tutoring systems. Developed by the U.S. Army Research Laboratory from 2009 to 2011, GIFT was released for commercial use in May 2012. GIFT is open-source and domain independent, and can be downloaded online for free. The software allows an instructor to design a tutoring program that can cover various disciplines through adjustments to existing courses. It includes coursework tools intended for use by researchers, instructional designers, instructors, and students. GIFT is compatible with other teaching materials, such as PowerPoint presentations, which can be integrated into the program.

SHERLOCK "SHERLOCK" is used to train Air Force technicians to diagnose problems in the electrical systems of F-15 jets. The ITS creates faulty schematic diagrams of systems for the trainee to locate and diagnose. The ITS provides diagnostic readings allowing the trainee to decide whether the fault lies in the circuit being tested or if it lies elsewhere in the system. Feedback and guidance are provided by the system and help is available if requested.

Cardiac Tutor The Cardiac Tutor's aim is to teach advanced cardiac support techniques to medical personnel. The tutor presents cardiac problems and, using a variety of steps, students must select various interventions. Cardiac Tutor provides clues, verbal advice, and feedback in order to personalize and optimize the learning. Each simulation, regardless of whether the students were successfully able to help their patients, results in a detailed report which students then review.

CODES Cooperative Music Prototype Design is a Web-based environment for cooperative music prototyping. It was designed to support users, especially those who are not specialists in music, in creating musical pieces in a prototyping manner. The musical examples (prototypes) can be repeatedly tested, played and modified. One of the main aspects of CODES is interaction and cooperation between the music creators and their partners.

Effectiveness

Assessing the effectiveness of ITS programs is problematic. ITS vary greatly in design, implementation, and educational focus. When ITS are used in a classroom, the system is not only used by students, but by teachers as well. This usage can create barriers to effective evaluation for a number of reasons; most notably due to teacher intervention in student learning.

Teachers often have the ability to enter new problems into the system or adjust the curriculum. In addition, teachers and peers often interact with students while they learn with ITSs (e.g., during an individual computer lab session or during classroom lectures falling in between lab sessions) in ways that may influence their learning with the software. Prior work suggests that the vast majority of students' help-seeking behavior in classrooms using ITSs may occur entirely outside of the software - meaning that the nature and quality of peer and teacher feedback in a given class may be an important mediator of student learning in these contexts. In addition, aspects of classroom climate, such as students' overall level of comfort in publicly asking for help, or the degree to which a teacher is physically active in monitoring individual students may add additional sources of variation across evaluation contexts. All of these variables make evaluation of an ITS complex, and may help explain variation in results across evaluation studies.

Despite the inherent complexities, numerous studies have attempted to measure the overall effectiveness of ITS, often by comparisons of ITS to human tutors. Reviews of early ITS systems (1995) showed an effect size of d = 1.0 in comparison to no tutoring, whereas human tutors were given an effect size of d = 2.0. Kurt VanLehn's much more recent overview (2011) of modern ITS found that there was no statistical difference in effect size between expert one-on-one human tutors and step-based ITS. Some individual ITS have been evaluated more positively than others. Studies of the Algebra Cognitive Tutor found that the ITS students outperformed students taught by a classroom teacher on standardized test problems and real-world problem solving tasks. Subsequent studies found that these results were particularly pronounced in students from special education, non-native English, and low-income backgrounds.

A more recent meta-analysis suggests that ITSs can exceed the effectiveness of both CAI and human tutors, especially when measured by local (specific) tests as opposed to standardized tests. "Students who received intelligent tutoring outperformed students from conventional classes in 46 (or 92%) of the 50 controlled evaluations, and the improvement in performance was great enough to be considered of substantive importance in 39 (or 78%) of the 50 studies. The median ES in the 50 studies was 0.66, which is considered a moderate-to-large effect for studies in the social sciences. It is roughly equivalent to an improvement in test performance from the 50th to the 75th percentile. This is stronger than typical effects from other forms of tutoring. C.-L. C. Kulik and Kulik’s (1991) meta-analysis, for example, found an average ES of 0.31 in 165 studies of CAI tutoring. ITS gains are about twice as high. The ITS effect is also greater than typical effects from human tutoring. As we have seen, programs of human tutoring typically raise student test scores about 0.4 standard deviations over control levels. Developers of ITSs long ago set out to improve on the success of CAI tutoring and to match the success of human tutoring. Our results suggest that ITS developers have already met both of these goals.... Although effects were moderate to strong in evaluations that measured outcomes on locally developed tests, they were much smaller in evaluations that measured outcomes on standardized tests. Average ES on studies with local tests was 0.73; average ES on studies with standardized tests was 0.13. This discrepancy is not unusual for meta-analyses that include both local and standardized tests... local tests are likely to align well with the objectives of specific instructional programs. Off-the-shelf standardized tests provide a looser fit. ... 
Our own belief is that both local and standardized tests provide important information about instructional effectiveness, and when possible, both types of tests should be included in evaluation studies."
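The link between an effect size and a percentile shift mentioned in the quotation can be reproduced with a short calculation. The sketch below assumes test scores are normally distributed and that the effect size is expressed in standard deviations of the control group; the function name `percentile_after_gain` is made up for this example.

```python
from statistics import NormalDist


def percentile_after_gain(effect_size: float, start_percentile: float = 50.0) -> float:
    """Percentile reached by a student who starts at start_percentile
    and improves by effect_size standard deviations (normal scores assumed)."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_start = std_normal.inv_cdf(start_percentile / 100)
    return 100 * std_normal.cdf(z_start + effect_size)


# Effect sizes quoted in the meta-analysis above.
for label, es in [("median ITS effect", 0.66),
                  ("CAI tutoring", 0.31),
                  ("local tests", 0.73),
                  ("standardized tests", 0.13)]:
    print(f"{label:<20} ES={es:.2f} -> percentile {percentile_after_gain(es):.1f}")
```

Under these assumptions an effect size of 0.66 moves a median student to roughly the 75th percentile, which matches the interpretation given in the quotation.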

Some recognized strengths of ITS are their ability to provide immediate yes/no feedback, individual task selection, on-demand hints, and support mastery learning.

Limitations

Intelligent tutoring systems are expensive both to develop and implement. The research phase paves the way for the development of systems that are commercially viable. However, the research phase is often expensive; it requires the cooperation and input of subject matter experts, the cooperation and support of individuals across both organizations and organizational levels. Another limitation in the development phase is the conceptualization and the development of software within both budget and time constraints. There are also factors that limit the incorporation of intelligent tutors into the real world, including the long timeframe required for development and the high cost of the creation of the system components. A high portion of that cost is a result of content component building. For instance, surveys revealed that encoding an hour of online instruction time took 300 hours of development time for tutoring content. Similarly, building the Cognitive Tutor took a ratio of development time to instruction time of at least 200:1 hours. The high cost of development often eclipses replicating the efforts for real world application. Intelligent tutoring systems are not, in general, commercially feasible for real-world applications.

A criticism of Intelligent Tutoring Systems currently in use is the pedagogy of immediate feedback and hint sequences that are built in to make the system "intelligent". This pedagogy is criticized for its failure to develop deep learning in students. When students are given control over the ability to receive hints, the learning response created is negative. Some students immediately turn to the hints before attempting to solve the problem or complete the task. When it is possible to do so, some students bottom out the hints – receiving as many hints as possible as fast as possible – in order to complete the task faster. If students fail to reflect on the tutoring system's feedback or hints, and instead increase guessing until positive feedback is garnered, the student is, in effect, learning to do the right thing for the wrong reasons. Most tutoring systems are currently unable to detect shallow learning, or to distinguish between productive and unproductive struggle. For these and many other reasons (e.g., overfitting of underlying models to particular user populations), the effectiveness of these systems may differ significantly across users.

Another criticism of intelligent tutoring systems is the failure of the system to ask students to explain their actions. If the student is not learning the domain language, then it becomes more difficult to gain a deeper understanding, to work collaboratively in groups, and to transfer the domain language to writing. For example, if the student is not "talking science", then it is argued that they are not being immersed in the culture of science, making it difficult to undertake scientific writing or participate in collaborative team efforts. Intelligent tutoring systems have been criticized for being too "instructivist" and removing intrinsic motivation, social learning contexts, and context realism from learning.

Practical concerns, in terms of the inclination of the sponsors/authorities and the users to adopt intelligent tutoring systems, should be taken into account. First, someone must have a willingness to implement the ITS. Additionally, an authority must recognize the necessity to integrate intelligent tutoring software into the current curriculum, and finally, the sponsor or authority must offer the needed support through the stages of the system development until it is completed and implemented.

Evaluation of an intelligent tutoring system is an important phase; however, it is often difficult, costly, and time consuming. Even though there are various evaluation techniques presented in the literature, there are no guiding principles for the selection of appropriate evaluation method(s) to be used in a particular context. Careful inspection should be undertaken to ensure that a complex system does what it claims to do. This assessment may occur during the design and early development of the system to identify problems and to guide modifications (i.e. formative evaluation). In contrast, the evaluation may occur after the completion of the system to support formal claims about the construction, behaviour of, or outcomes associated with a completed system (i.e. summative evaluation). The great challenge introduced by the lack of evaluation standards has resulted in the evaluation stage being neglected in several existing ITSs.

Improvements

Intelligent tutoring systems are less capable than human tutors in the areas of dialogue and feedback. For example, human tutors are able to interpret the affective state of the student, and potentially adapt instruction in response to these perceptions. Recent work is exploring potential strategies for overcoming these limitations of ITSs, to make them more effective.

Dialogue

Human tutors have the ability to understand a person's tone and inflection within a dialogue and interpret this to provide continual feedback through an ongoing dialogue. Intelligent tutoring systems are now being developed to attempt to simulate natural conversations. To get the full experience of dialogue there are many different areas in which a computer must be programmed; including being able to understand tone, inflection, body language, and facial expression and then to respond to these. Dialogue in an ITS can be used to ask specific questions to help guide students and elicit information while allowing students to construct their own knowledge. The development of more sophisticated dialogue within an ITS has been a focus in some current research partially to address the limitations and create a more constructivist approach to ITS. In addition, some current research has focused on modeling the nature and effects of various social cues commonly employed within a dialogue by human tutors and tutees, in order to build trust and rapport (which have been shown to have positive impacts on student learning).

Emotional affect

A growing body of work is considering the role of affect on learning, with the objective of developing intelligent tutoring systems that can interpret and adapt to the different emotional states. Humans do not just use cognitive processes in learning; the affective processes they go through also play an important role. For example, learners learn better when they have a certain level of disequilibrium (frustration), but not enough to make the learner feel completely overwhelmed. This has motivated affective computing researchers to begin creating and studying intelligent tutoring systems that can interpret the affective process of an individual. An ITS can be developed to read an individual's expressions and other signs of affect in an attempt to find and tutor to the optimal affective state for learning. There are many complications in doing this, since affect is not expressed in just one way but in multiple ways, so for an ITS to be effective in interpreting affective states it may require a multimodal approach (tone, facial expression, etc.). These ideas have created a new field within ITS, that of Affective Tutoring Systems (ATS). One example of an ITS that addresses affect is Gaze Tutor, which was developed to track students' eye movements and determine whether they are bored or distracted, after which the system attempts to reengage the student.

Rapport building

To date, most ITSs have focused purely on the cognitive aspects of tutoring and not on the social relationship between the tutoring system and the student. As demonstrated by the "computers are social actors" paradigm, humans often project social heuristics onto computers. For example, in observations of young children interacting with Sam the CastleMate, a collaborative storytelling agent, children interacted with this simulated child in much the same manner as they would with a human child. It has been suggested that, to build rapport with students effectively, an ITS should mimic strategies of instructional immediacy: behaviors that bridge the apparent social distance between students and teachers, such as smiling and addressing students by name. With regard to teenagers, Ogan et al. draw on observations of close friends tutoring each other to argue that, for an ITS to build rapport as a peer to a student, a more involved process of trust building is likely necessary. This may ultimately require that the tutoring system be able to respond to, and even produce, seemingly rude behavior in order to mediate motivational and affective student factors through playful joking and taunting.

Teachable agents

Traditionally, ITSs take on the role of autonomous tutors; however, they can also take on the role of tutees for the purpose of learning-by-teaching exercises. Evidence suggests that learning by teaching can be an effective strategy for mediating self-explanation, improving feelings of self-efficacy, and boosting educational outcomes and retention. To replicate this effect, the roles of the student and the ITS can be switched. This can be achieved by designing the ITS to have the appearance of being taught, as is the case in the Teachable Agent Arithmetic Game and Betty's Brain. Another approach is to have students teach a machine learning agent that can learn to solve problems from demonstrations and correctness feedback, as is the case in the APLUS system built with SimStudent. To replicate the educational effects of learning by teaching, teachable agents generally have a social agent built on top of them that poses questions or conveys confusion. For example, Betty from Betty's Brain will prompt the student to ask her questions to make sure that she understands the material, and Stacy from APLUS will prompt the student to explain the feedback the student has provided.
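
The interaction loop described above, in which a student teaches an agent through demonstrations and correctness feedback while the agent prompts for explanations, can be sketched as a toy example. The TeachableAgent class, its method names, and its prompts are hypothetical, and the agent merely memorizes demonstrated answers; it is not how SimStudent, APLUS, or Betty's Brain actually learn.

```python
class TeachableAgent:
    """Toy tutee that memorizes demonstrations (illustrative assumption)."""

    def __init__(self):
        # Maps a problem to the answer the agent currently believes.
        self.known = {}

    def demonstrate(self, problem, answer):
        """The student shows the agent how a problem is answered."""
        self.known[problem] = answer
        # Social layer: ask the student to self-explain the demonstration.
        return f"Why is {answer} the answer to {problem}?"

    def attempt(self, problem):
        """The agent tries a problem; None means it does not know yet."""
        return self.known.get(problem)

    def feedback(self, problem, correct):
        """The student marks the agent's latest attempt right or wrong."""
        if not correct:
            # Forget the rejected answer and convey confusion.
            self.known.pop(problem, None)
            return f"I got {problem} wrong. Can you show me how to solve it?"
        return f"Can you explain why my answer to {problem} is right?"


agent = TeachableAgent()
print(agent.demonstrate("2 + 3", "5"))
print(agent.attempt("2 + 3"))
print(agent.feedback("2 + 3", correct=True))
```

The prompts returned by demonstrate and feedback stand in for the social agent layer: each one asks the student to explain, which is where the self-explanation benefit of learning by teaching is expected to arise.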

Related conferences

Several conferences regularly consider papers on intelligent tutoring systems. The oldest is the International Conference on Intelligent Tutoring Systems, which started in 1988 and is now held every other year. The International Artificial Intelligence in Education (AIED) Society publishes the International Journal of Artificial Intelligence in Education (IJAIED) and organizes the annual International Conference on Artificial Intelligence in Education (http://iaied.org/conf/1/), started in 1989. Many papers on intelligent tutoring systems also appear at the International Conference on User Modeling, Adaptation, and Personalization and the International Conference on Educational Data Mining. The Association for the Advancement of Artificial Intelligence (AAAI) sometimes holds symposia and publishes papers related to intelligent tutoring systems. A number of books have been written on ITS, including three published by Lawrence Erlbaum Associates.

Rejuvenation

From Wikipedia, the free encyclopedia https://en.wikipedia.org/w...