A pilot meta-analysis of computer-based scaffolding in STEM education.

Despite much primary research and review work about scaffolding (Kali & Linn, 2008), scaffolding meta-analyses remain an emergent line of inquiry. Meta-analyses have been published on specific types of scaffolding including dynamic assessment (Swanson & Lussier, 2001), scaffolding for students with learning disabilities (Swanson & Deshler, 2003), and scaffolding in multimedia instruction (Lin, Ching, Ke, & Dwyer, 2007). However, a more comprehensive examination of scaffolding is needed to inform researchers and designers of the most effective characteristics of and approaches to researching scaffolding. In this paper, we use meta-analysis to determine the influence of computer-based scaffolding characteristics, study and test score quality, and assessment level on cognitive outcomes in science, technology, engineering, and mathematics (STEM) education. This paper is significant in that it (a) provides evidence of the effectiveness of scaffolding, (b) can guide future research on scaffolding, and (c) provides for data-driven scaffolding design decisions.

Theoretical framework

Computer-based scaffolding


As Vygotsky (1962) noted, "The only good kind of instruction is that which marches ahead of development and leads it" (p. 104). Scaffolding can facilitate such instruction by providing conceptual, procedural, strategic, and metacognitive support that bridges the gap between what students can do on their own and what they can do with the help of a more capable other (Hannafin, Land, & Oliver, 1999; Wood, Bruner, & Ross, 1976). As originally defined, scaffolding referred to dynamic support provided by a teacher or other more capable other that enabled children to solve problems (Wood et al., 1976). The emergence of personal computers has allowed computer-based scaffolding to be developed to supplement teacher scaffolding (Hannafin et al., 1999; Saye & Brush, 2002). Computer-based scaffolding (hereafter referred to as "scaffolding") can promote success in rich problem solving contexts that require students to go beyond filling out worksheets or listening to a teacher lecture.

Meta-analyses of specific scaffolding types (scaffolding for dynamic assessment, learning disabilities, and multimedia instruction) indicate that scaffolding-related interventions were associated with increased student learning (Lin et al., 2007; Swanson & Deshler, 2003; Swanson & Lussier, 2001). Dynamic assessment led to an average effect size of 0.96, and there was a similarly positive relationship between explicit practice and learning (Swanson & Lussier, 2001). Scaffolding in multimedia instruction led to an average effect size of 0.02 (Lin et al., 2007). Thus, a meta-analysis that covers a wider range of scaffolding types is warranted.

Scaffolding characteristics

Scaffolding can (a) enlist interest in the target task (Wood et al., 1976), (b) maintain direction (Wood et al., 1976), (c) reduce complexity (Reiser, 2004), (d) highlight important problem features (Reiser, 2004), (e) help students manage frustration (Wood et al., 1976), (f) model expert processes (van de Pol, Volman, & Beishuizen, 2010), and (g) elicit articulation (Reiser, 2004). Scaffolding can do this through such strategies as making thinking visible (Kali & Linn, 2008). No scaffolding serves all these aims, and no literature compares such strategies to indicate which are most effective in which contexts.

Though scaffolding originally referred to support for children's problem solving abilities (Wood et al., 1976), recent scaffolding work also supports (a) other higher-order thinking skills (e.g., argumentation and evaluation), and (b) knowledge integration--the ability to "expand, revise, restructure, reconnect and reprioritize" scientific models (Linn, 2000, p. 783). The literature does not indicate which scaffolding type leads to stronger outcomes.

Scaffolding is not a stand-alone intervention. Rather, it is used along with instructional approaches (paired interventions) such as problem-based learning (PBL) (Saye & Brush, 2002), learning by design (Puntambekar & Kolodner, 2005), and case-based learning (Lajoie, Lavigne, Guerrera, & Munsie, 2001). No literature comprehensively compares the effectiveness of scaffolding when used with these different approaches.

Many debate whether scaffolding should be generic or context-specific. For example, one finding indicated that generic scaffolds promoted deeper reflection among middle school students than content-specific scaffolds (Davis, 2003), while another finding indicated that content-specific scaffolds were more effective when teachers provided one-to-one scaffolding related to a general argumentation framework (McNeill & Krajcik, 2009).

Many researchers argue that scaffolds must be faded as students gain skill to promote transfer of responsibility (Pea, 2004; Puntambekar & Hubscher, 2005). In one-to-one scaffolding, this is done through continual diagnosis of student performance (van de Pol et al., 2010). The only two ways this has been accomplished among computer-based scaffolds to support ill-structured problem solving are making the scaffolds disappear (a) on a fixed schedule (Li & Lim, 2008) or (b) when students indicate that they do not need the scaffolding anymore (Metcalf, 1999). It is not clear if such approaches lead to greater learning or transfer of responsibility.

Scaffolding and methodological quality

Methodological quality considerations include threats to internal validity and external validity (Gall, Gall, & Borg, 2003; Shadish & Myers, 2001), as well as test score reliability and validity (Messick, 1989). It may be unrealistic to expect that a scaffolding study have zero threats to validity due to the need to study scaffolding in contexts in which students can collaboratively solve authentic problems - contexts that do not include laboratories. In the three existing scaffolding meta-analyses, as methodological quality decreased, effect size magnitude increased (Lin et al., 2007; Swanson & Deshler, 2003; Swanson & Lussier, 2001). This highlights the need to code for methodological quality in a wider scaffolding meta-analysis.

Scaffolding and assessment level

When evaluating scaffolding's effectiveness, one needs to consider assessment level (e.g., concept, principles, and application) (Messick, 1989; Sugrue, 1995). For example, a fact-based test may not be the best vehicle to evaluate scaffolding designed to promote problem solving ability.

Remaining questions

Scaffolding guidelines exist in such areas as online discussion (Choi, Land, & Turgeon, 2005), science inquiry (Linn, 2000; Quintana et al., 2004), problem solving (Ge & Land, 2004; Kolodner, Owensby, & Guzdial, 2004), and argumentation (Belland, Glazewski, & Richardson, 2008; Jonassen & Kim, 2010). Authors often gather empirical support for guidelines (Belland, 2010; Ge & Land, 2003), but the sheer volume of scaffolding frameworks and conflicting advice leads one to desire a comprehensive assessment of scaffolding. Furthermore, the effect of a specific feature inspired by a guideline is rarely isolated in empirical studies. The lack of clarity is not due to a lack of research, as there is a staggering volume of empirical research on scaffolding. Individual empirical studies are the engine that drives educational research, but in accumulating thousands of studies on scaffolding, researchers run the risk of "knowing less than we have proven" (Glass, 1976, p. 8). Meta-analysis is one way to avoid this risk (Chambers, 2004; Glass, 1976).

Research questions

In this preliminary meta-analysis, we address the following research questions:

1. To what extent do scaffolding characteristics (strategy, intended outcome, fading schedule, intervention, and paired intervention) influence cognitive outcomes?

2. To what extent does methodological quality (study design, internal threats to validity, external threats to validity, and test score validity and reliability) influence cognitive outcomes?

3. To what extent does assessment level (concept, principle, application) influence cognitive outcomes?

4. To what extent does the combination of internal threats, external threats, reliability, fading, and scaffolding intervention influence cognitive outcomes?


Inclusion criteria

To be included, studies needed to (a) cover primary, middle level, secondary, college/vocational, graduate/professional, and adult students, (b) compare a scaffolding treatment with a comparison condition (absence of scaffolding), (c) report quantitative, cognitive outcomes (e.g., problem-solving ability, conceptual understanding), and (d) report enough data to support effect size calculation. When more than one source reported the same data (e.g., a dissertation and a journal article), the source with the most detail (e.g., dissertation) was included.

Literature search

The literature search began with a comprehensive review of scaffolding strategies (Kali & Linn, 2008) before proceeding to existing meta-analyses (Lin et al., 2007; Swanson & Deshler, 2003; Swanson & Lussier, 2001). Ninety-four studies were identified as candidates for inclusion on first pass. Two research team members applied the inclusion criteria to each candidate study, which reduced the 94 studies to seven. Reasons for exclusion included (a) not enough information to calculate an effect size, (b) lack of quantitative, cognitive outcomes, (c) lack of a control condition in which students did not receive scaffolding, and (d) an intervention that did not meet the definition of computer-based scaffolding.

Coding scheme

Two coders independently coded each study for several characteristics (see Table 1).


Given the diversity of research quality, interventions, populations, and sample sizes among existing primary research, effect size estimate precision varied. Thus, a conversion was made from Cohen's d to Hedges' g for all outcomes (Cooper, 1989).
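The correction from Cohen's d to Hedges' g can be illustrated with a minimal sketch. It assumes two independent groups and the commonly used approximate correction factor J = 1 - 3/(4df - 1), with df = n1 + n2 - 2; the function names and the example values are illustrative and are not taken from the included studies.

```python
def correction_factor(n1, n2):
    """Approximate small-sample correction factor J (assumed form)."""
    df = n1 + n2 - 2
    return 1.0 - 3.0 / (4.0 * df - 1.0)

def hedges_g(d, n1, n2):
    """Convert Cohen's d to Hedges' g."""
    return correction_factor(n1, n2) * d

def hedges_g_variance(d, n1, n2):
    """Sampling variance of g: J squared times the variance of d."""
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2.0 * (n1 + n2))
    return correction_factor(n1, n2) ** 2 * var_d

# Hypothetical study: d = 0.50 with 10 students per group.
g = hedges_g(0.50, 10, 10)
print(round(g, 3))
```

With small groups the correction is noticeable (about 4% in this hypothetical case), and it shrinks toward zero as group sizes grow.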

To examine potential bias, a funnel plot was generated. Figure 1 shows standard error and Hedges' g for each study. The figure shows a general lack of symmetry, suggesting publication bias. To investigate further, a cumulative forest plot (Sutton, 2009) was run. The nine most precise outcomes (g = 0.33) have a substantially lower estimate than all 17 outcomes (g = 0.53). This implies bias due either to publication or to small-study effects. This is somewhat anticipated, as the primary source of studies for this meta-analysis was existing reviews that tend to not cover "grey literature" such as book chapters, conference papers, or dissertations (Borenstein, Hedges, Higgins, & Rothstein, 2009). Only one included study was a dissertation; the rest were peer-reviewed journal articles.
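The logic of a cumulative analysis ordered by precision can be sketched as follows. This is a simplified illustration that assumes inverse-variance (fixed-effect) weights; the function name and the three example outcomes are hypothetical, and the sketch is not the procedure used to produce the estimates reported above.

```python
def cumulative_estimates(effects, std_errors):
    """Running pooled estimate, adding outcomes from most to least precise."""
    order = sorted(range(len(effects)), key=lambda i: std_errors[i])
    sum_w = 0.0
    sum_wy = 0.0
    running = []
    for i in order:
        weight = 1.0 / std_errors[i] ** 2  # inverse-variance weight (assumption)
        sum_w += weight
        sum_wy += weight * effects[i]
        running.append(sum_wy / sum_w)
    return running

# Hypothetical outcomes: the least precise one has the largest effect.
for step, estimate in enumerate(cumulative_estimates([0.9, 0.3, 0.5], [0.4, 0.1, 0.2]), 1):
    print(step, round(estimate, 3))
```

If the running estimate drifts upward as less precise outcomes are added, the pattern is consistent with publication bias or small-study effects.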

Research questions 1-3 were addressed with a Q-test based on analysis of variance. Pairwise differences were assessed using a Z-test (Borenstein et al., 2009). For research question 4, meta-regression was employed. Since the studies in this analysis draw from fundamentally different populations, a random effects model was employed. All significance testing assumed an alpha level of .05. Data analysis was conducted using Stata 11.
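A random effects summary estimate can be sketched as follows. The estimator of between-study variance used in this analysis is not specified above, so the sketch assumes the widely used DerSimonian-Laird method-of-moments estimator; the function name and example values are hypothetical.

```python
import math

def random_effects_summary(effects, variances):
    """Pooled estimate under a random effects model (DerSimonian-Laird, assumed)."""
    if len(effects) < 2 or len(effects) != len(variances):
        raise ValueError("need at least two effects with matching variances")
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)  # between-study variance, truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {"pooled": pooled, "se": se, "tau2": tau2, "q": q, "z": pooled / se}

# Two hypothetical outcomes with equal precision but different effects.
print(random_effects_summary([0.2, 0.8], [0.04, 0.04]))
```

Because the between-study variance is added to every outcome's sampling variance, a random effects model weights outcomes more evenly than a fixed-effect model and yields a wider confidence interval when effects are heterogeneous.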


Before addressing our research questions, we present overall analyses. Seven studies with 17 outcomes met our inclusion criteria (Bagno & Eylon, 1997; Clement, 1993; Foley, 1999; Mayer, 1989; Nathan, Kintsch, & Young, 1992; Ronen & Eliahu, 2000; White & Frederiksen, 1998). Covered subject areas/grade levels are: high school physics (2 studies), middle school physical science (1 study), college mathematics (1 study), college mechanics (1 study), high school science (1 study), and middle and high school science (1 study). An I-squared test indicates a relatively high level of inconsistency (see Figure 2) across effect size estimates (78.0%, p = 0.001). The overall effect size (g = 0.53) is statistically significant, z = 4.65, p = 0.001, and of a medium magnitude (Cohen, 1988).
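The I-squared statistic expresses the share of observed variation in effect sizes that is attributable to heterogeneity rather than sampling error. A minimal sketch, assuming the usual definition I-squared = 100 x (Q - df) / Q, with df equal to the number of outcomes minus one (the function name is illustrative):

```python
def i_squared(q, k):
    """I-squared in percent from Cochran's Q and the number of outcomes k."""
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

# Back-calculation under the assumed definition: the Q that would give
# I-squared = 78.0% with 17 outcomes. This is derived, not a reported value.
q_implied = 16 / (1 - 0.78)
print(round(q_implied, 1), round(i_squared(q_implied, 17), 1))
```

Under this definition, an I-squared of 78.0% across 17 outcomes would correspond to a Q of roughly 72.7; that figure is back-calculated here for illustration and is not reported in the analysis.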

Boxed Hedges' g values are statistically greater than zero, p < 0.05. The forest plot shows estimates and confidence intervals for each outcome. The diamond is a summary estimate and confidence interval for the overall effect.

Research question 1: To what extent do scaffolding characteristics (strategy, intended outcome, fading, intervention, and paired intervention) influence cognitive outcomes?

Table 2 shows point estimates and 95% confidence intervals for Hedges' g for each coded scaffolding characteristic. Examining strategy, one notices that there was no significant difference in cognitive outcomes between generic and context-specific scaffolds, and that the effect size for each is significantly greater than 0, p < .05. This suggests that there are no differences in influence on cognitive outcomes between generic and context-specific scaffolds, though caution is warranted as only one coded study included a generic scaffold.

There were no significant differences between scaffolding that supports higher-order skills and scaffolding that supports knowledge integration (see Table 2). This provides preliminary evidence that scaffolding designed to support each intended outcome is equally effective. As the confidence intervals are fairly wide, more studies need to be examined before one can be confident that no differences exist.

Turning to Fading Schedule in Table 2, studies in which scaffolding was not faded had higher effect sizes (g = 0.79) than studies that employed fixed fading (g = 0.20), z(16) = 3.02, p = .001. This implies that fixed fading harms cognitive outcomes (see Figure 3).
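The pairwise comparison of two subgroup estimates can be sketched as a Z-test on their difference. The sketch assumes independent subgroups and a two-sided normal test; the point estimates echo those above, but the standard errors are hypothetical placeholders rather than values from Table 2, so the output does not reproduce the reported statistic.

```python
import math

def subgroup_z_test(g1, se1, g2, se2):
    """Two-sided Z-test for the difference between two independent subgroup estimates."""
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    z = (g1 - g2) / se_diff
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p from the standard normal
    return z, p

# No fading (g = 0.79) versus fixed fading (g = 0.20); SEs are hypothetical.
z, p = subgroup_z_test(0.79, 0.15, 0.20, 0.12)
print(f"z = {z:.2f}, p = {p:.4f}")
```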

Note. Boxed Hedges' g values are statistically greater than zero, p < .05. Diamonds are summary scores, representing Hedges' g and confidence intervals for two or more outcomes; the final diamond represents all outcomes. "Lower" and "upper" represent the effect sizes' 95% confidence interval limits.

As indicated in Table 2 and Figure 4, studies using conceptual (g = 0.67) scaffolds exhibited superior learning outcomes to those using metacognitive (g = 0.25) scaffolds, z(16) = 3.29, p = .01. This emergent result should be interpreted with caution as only one study used metacognitive scaffolds.

Boxed Hedges' g values are statistically greater than zero, p < .05. Diamonds are summary scores, representing Hedges' g and confidence intervals for two or more outcomes; the final diamond represents all outcomes. "Lower" and "upper" represent limits of the effect sizes' 95% confidence intervals.

There were no significant differences between studies based on paired intervention: Inquiry-based learning led to an effect size of 0.39 and problem-solving led to an effect size of 0.53 (see Table 2). Of note, scaffolding is used in the context of other instructional approaches, so it is important to examine more studies covering a wider range of paired interventions.

Research question 2: To what extent does methodological quality (study design, internal threats to validity, external threats to validity, and test score validity and reliability) influence cognitive outcomes?

As indicated in Table 3, there were no statistically significant differences between study designs. Results from quasi-experimental designs parallel results of true random experiments, although the confidence intervals would likely shrink with the inclusion of more studies.

None of the studies reported validity data. Thus, no difference could be calculated based on validity reporting.

Turning to reliability reporting (see Table 3), one sees that studies reporting no reliability data had larger effects than studies with strong reliability reporting, z(16) = 2.11, p = .04. However, this result should be interpreted with caution as only one study reported reliability information.

Examining Internal Threats to Validity in Table 3, one notices that studies reporting no internal threats to validity had smaller learning gains than studies reporting two threats, z(16) = 2.68, p = .01. No other statistically significant differences were found.

Despite a range of effect size estimates, only studies with zero or one threat to external validity had values statistically greater than 0. In the only significant pairwise comparison (see Figure 5), studies with one external threat to validity had a higher effect size estimate than studies with no threats to validity, z(8) = 3.29, p = .001. In contrast to internal threats, there does not appear to be a systematic trend for external threats to validity.

Boxed Hedges' g values are statistically greater than zero, p < .05. Diamonds are summary scores, representing Hedges' g and confidence intervals for two or more outcomes; the final diamond represents all outcomes.

Research question 3: To what extent does assessment level (concept, principle, application) influence cognitive outcomes?

There were no significant differences between assessment levels (see Table 4). However, trends favored Application and Principles-level over Concept-level. With more studies included, the confidence intervals may shrink and significant differences might emerge.

Research question 4: To what extent does the combination of internal threats, external threats, reliability, fading, and scaffolding intervention influence cognitive outcomes?

Regression was used to determine how these variables combine to influence cognitive learning. This involved forward stepwise selection because including all data with backward elimination would likely result in overfitting due to the small number of observations. A Bonferroni correction was applied for more stringent selection criteria (t = 1.18) given the number of variables (Foster & George, 1994). Only variables with statistically significant differences were considered as predictor candidates--reliability reporting, threats to internal validity, threats to external validity, fading, and scaffolding intervention. The latter two were both dummy coded. The final model explains a statistically significant portion of the variance, R² = .30, p = .02, and consists of only a single variable with two levels: fixed fading or no fading. When scaffolds were not faded, students had higher cognitive outcomes (t = 2.63). Fading explained 30% of the variability in cognitive outcomes.
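With a single dummy-coded moderator, a weighted regression reduces to comparing two weighted subgroup means. The following is a simplified sketch under that assumption; it uses hypothetical effect sizes and weights, omits the between-study variance component of a full random effects meta-regression, and does not reproduce the reported model.

```python
def weighted_dummy_regression(effects, weights, dummy):
    """Weighted least squares of effect size on one 0/1 predictor."""
    sum_w = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, dummy)) / sum_w
    y_bar = sum(w * y for w, y in zip(weights, effects)) / sum_w
    s_xy = sum(w * (x - x_bar) * (y - y_bar)
               for w, x, y in zip(weights, dummy, effects))
    s_xx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, dummy))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical outcomes; dummy coding: 0 = no fading, 1 = fixed fading.
effects = [0.8, 0.7, 0.2, 0.3]
weights = [10.0, 8.0, 12.0, 6.0]
fading = [0, 0, 1, 1]
intercept, slope = weighted_dummy_regression(effects, weights, fading)
print(round(intercept, 3), round(slope, 3))
```

The intercept is the weighted mean effect for outcomes coded 0, and the slope is the difference between the two weighted subgroup means, which is why a negative slope here would indicate lower effects under fixed fading.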


Overall, scaffolding produced a medium effect (g = 0.53). There were no differences based on study design, assessment level, or intended learning outcome. When examined individually, effect size differences were predicted by reliability reporting, threats to internal validity, external threats to validity, fading, and scaffolding intervention (conceptual or metacognitive). No study reported validity information, and only one study reported reliability information. Studies with zero threats to internal validity had lower effect sizes than studies with two threats. Studies with one threat to external validity had higher effect sizes than studies with zero threats to validity. Studies with no fading had higher effect sizes than studies with fixed fading. Students fared better with conceptual scaffolds than with metacognitive scaffolds. When significant predictors were entered into a regression analysis, fading explained 30% of the variability in cognitive outcomes: students did better in studies with no fading than in studies where scaffolding was faded according to a fixed schedule.


Significant differences

One of the most interesting findings was that students fared better when scaffolding did not fade than when it faded on a fixed schedule. The limited number of studies using fixed fading should be noted; including more studies in a larger meta-analysis would help to see if the pattern is consistent. For years, researchers have argued that computer-based scaffolds must fade (Pea, 2004; Puntambekar & Hubscher, 2005). Fading was originally proposed to enable transfer of responsibility from the scaffolding provider to the student (Collins, Brown, & Newman, 1989). According to this reasoning, after fading, students would be able to perform the target task independently. All would agree that promoting transfer of responsibility is important. Some have questioned the theoretical and empirical evidence that fixed fading really leads to transfer of responsibility (Belland, 2011). Without ongoing diagnosis, fading is implemented according to a "best guess" of when scaffolds should be removed by either a designer before a student even uses the scaffold, or by a student as she/he uses the scaffold. Thus, fixed fading may cause scaffolds to be removed (a) too soon, resulting in insufficient support, or (b) too late, resulting in a requirement to use scaffolds when students understand how to perform the target task. In short, if students learn less when using scaffolding with fixed fading, and one cannot be sure that transfer of responsibility happens with fixed fading, then fixed fading becomes less attractive as a scaffolding feature. However, it should be noted that another strategy of fading computer-based scaffolds was not covered in this study--making scaffolds disappear when students indicate they do not need them any more (e.g., Metcalf, 1999).

Results for internal threats to validity and external threats were also interesting. Studies with 2 threats to internal validity had higher effect sizes than studies with 0 threats to internal validity. To promote maximum ecological validity of scaffolding studies among secondary students, studies should be done in middle and high schools. K-12 school administrators and Institutional Review Boards rarely allow individual K-12 students to be randomly assigned to treatments. Thus, research that is conducted in secondary schools will encounter at least one threat to internal validity. Furthermore, by studying scaffolding in laboratory settings where students can be randomly assigned as individuals to treatment conditions, one would lose the essence of scaffolding, as scaffolding is meant to be deployed in authentic, collaborative problem solving or other contexts that require more than just filling out a worksheet or passively listening to or watching instructional materials (Belland, 2014; Wood et al., 1976).

Students did better when there was one external threat to validity than when there were no threats. The two most common threats to external validity in this study were limited description of the scaffolding intervention and experimenter effect. By limited description, we mean that there was enough to determine that it was scaffolding but not enough to replicate the study. A sufficient study procedure description would not change the effect size. Experimenter effect was noted if there was only one instructor. This is common, because early studies of educational interventions often involve a partnership between researchers and one or two teachers (Anderson & Shattuck, 2012). Because teachers do not often design interventions developed in design-based research (Anderson & Shattuck, 2012), fear of artificial inflation of effect sizes through teacher actions is relatively unfounded.

There was a significant difference in cognitive outcomes between conceptual and metacognitive scaffolds. There are two possible explanations: (1) students often do not use metacognitive computer-based scaffolds (Oliver & Hannafin, 2000), and (2) we examined how scaffolding influences cognitive outcomes. Metacognitive scaffolds may lead to other important outcomes like self-regulated learning ability.

No significant differences

Often it is forgotten that a lack of significant differences can be, in itself, substantial. We did not find any significant differences based on intended learning outcome, assessment level, generic versus context-specific, paired intervention, or study design.

The lack of significant differences based on intended learning outcome provides preliminary evidence that scaffolding works equally well in support of higher-order learning and knowledge integration outcomes. This is encouraging in that students need to gain problem solving ability to be successful in the 21st century workplace (Feller, 2003; Nordgren, 2002), but they also need to recognize that concepts and theories learned in school apply outside of school (Linn, 2000).

Our finding of no significant difference according to assessment level is interesting when compared alongside our previous research in which we found that outcomes of PBL varied based on assessment level (Belland, Walker, Leary, Kuo, & Can, 2010). The inclusion of more studies would help to further elucidate this difference.

That there was no difference in cognitive outcomes between generic scaffolds and context-specific scaffolds is interesting in that it would help scaffold designers make an informed choice about whether to base scaffolding on a generic process or specific content.

It is interesting that there was no difference in learning outcomes according to paired intervention. More studies should be included in a future meta-analysis such that more paired interventions can be included and it can be seen if the trend of no difference in cognitive outcomes holds.

Finally, it is encouraging that there was no difference in cognitive outcomes based on study design because quasi-experimental is the most common quantitative research design in educational research (Gall et al., 2003). This suggests that quasi-experimental designs sufficiently capture the magnitude of scaffolding's effect.

Remaining issues

While there was a significant difference in study outcomes based on test score reliability, the fact that only one study reported reliability of test scores calls into question the interpretability of this effect. The validity and reliability reporting rate in this study roughly parallels the proportion of PBL studies that reported validity and/or reliability of test scores (Belland, French, & Ertmer, 2009). Unfortunately, reliability and validity reporting is sparse throughout educational research (Belland et al., 2009; Hamdy et al., 2006; Hogan & Agnello, 2004). Readers should recall that reliability and validity reporting should be done not just because the American Educational Research Association (2006) mandates it, but because it has serious consequences for study interpretation. For example, low reliability can lead researchers to under-estimate an effect's magnitude (Hunter & Schmidt, 2004; Loevinger, 1954). If one knows the reliability coefficient, one can estimate the true effect. Low reliability also negatively impacts validity, because one cannot reliably predict what a person with a particular competence level on Trait X will get on an unreliable test that measures Trait X (Messick, 1989). Invalid test scores cannot indicate how much students have of the target construct (Anastasi & Urbina, 1997). Most importantly, improper validity and reliability reporting can lead to improper theory construction based on erroneous empirical results. Improper theory construction has serious consequences, because it can cause school districts and other governmental agencies to spend scarce resources on suboptimal learning tools.
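The relationship between reliability and the observed effect can be illustrated with the classical correction for attenuation. A minimal sketch, assuming unreliability only in the outcome measure, so that the corrected effect equals the observed effect divided by the square root of the reliability coefficient; the function name and the numbers are hypothetical.

```python
import math

def correct_for_attenuation(observed_effect, reliability):
    """Estimate the effect that a perfectly reliable outcome measure would show."""
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in the interval (0, 1]")
    return observed_effect / math.sqrt(reliability)

# Hypothetical case: observed effect of 0.40 on a test with reliability .64.
print(round(correct_for_attenuation(0.40, 0.64), 2))
```

The lower the reliability, the larger the gap between the observed and corrected effects, which is why unreported reliability leaves the true magnitude of an effect uncertain.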

Limitations and suggestions for future research

The final number of included studies is a limitation. We started with 94 studies, but only 7 were included in the meta-analysis. Eliminated studies (a) did not contain sufficient information to code for an effect size and associated information, (b) did not have an appropriate control group, or (c) were deemed to not describe a scaffolding intervention. Such elimination required agreement of at least two researchers. Including more studies may have led to different results. Loosening inclusion criteria is not a good choice to increase the number of included articles. Rather, broadening search strategies is. This study was a preliminary meta-analysis intended to optimize our coding and analysis procedures and get a sense of important trends in the scaffolding literature. As such, we limited our selection of studies to those harvested from existing literature reviews. As a next step in our research program, we will conduct a comprehensive search of the primary literature (Cooper, Hedges, & Valentine, 2009). Future research may include search terms describing tools similar to scaffolds, such as mindtools and intelligent tutors. Also, future research should take care to broaden search databases to include ones with greater coverage of the non-USA literature.

Meta-analyses can include only certain quantitative studies. Qualitative studies are common in the scaffolding literature. Empirical studies of a variety of designs and methodologies are all of great value, and all can contribute to an understanding of the impacts of scaffolding. Effect sizes calculated during meta-analyses cannot reflect all pertinent literature on scaffolding. Nonetheless, they can help direct future research and development.

Conclusion and implications

The significance of this paper lies in its affirmation of scaffolding as an effective intervention, and its guidance of future research on scaffolding, as well as of data-driven scaffolding design decisions. First, we found preliminary evidence that scaffolding is effective, producing an average effect size of 0.53. Scaffolding's effect (0.53) is (a) considerably stronger than that of the average instructional intervention designed to promote critical thinking (0.341) (Abrami et al., 2008), yet (b) lower than that found for one-to-one human tutoring (0.79) in a recent meta-analysis (VanLehn, 2011). Still, given the high student-to-teacher ratios in most K-12 schools, and the focus on improving critical thinking abilities of the Common Core State Standards and the Next Generation Science Standards, scaffolding may be a particularly promising intervention (Achieve, 2013; National Governors Association Center for Best Practices & Council of Chief State School Officers, 2010).

Scaffolding produced higher effect sizes when studied in authentic settings (e.g., classroom-based problem-based learning) in which there are more threats to internal and external validity. Thus, educators can have confidence in scaffolding's efficacy even when studies suffered from threats to internal or external validity.

Conceptual scaffolding produced higher effect sizes than metacognitive scaffolds. Scaffolding with no fading produced larger effects than scaffolding with fixed fading. This reinforces the role of teachers in supporting metacognition and transfer of responsibility.

We found preliminary evidence that scaffolding's effectiveness does not depend on whether it (a) supports higher-order outcomes or knowledge integration, (b) is generic or context-specific, (c) is used with different paired interventions, or (d) is assessed at the concept, principles, or application level. These findings may imply that on these characteristics, teachers can select scaffolding that best aligns with learning goals.

(Submitted October 8, 2013; Revised March 15, 2014; Accepted April 10, 2014)


This work was supported by a grant from Utah State University. Any opinions, findings, and conclusions are those of the authors.

References


Abrami, P. C., Bernard, R. M., Borokhovski, E., Wade, A., Surkes, M. A., Tamim, R., & Zhang, D. (2008). Instructional interventions affecting critical thinking skills and dispositions: A stage 1 meta-analysis. Review of Educational Research, 78(4), 1102-1134. doi:10.3102/0034654308326084

Achieve. (2013). Next Generation Science Standards. Retrieved August 8, 2013, from

American Educational Research Association. (2006). Standards for reporting on empirical social science research in AERA publications. Educational Researcher, 35(6), 33-40. doi:10.3102/0013189X035006033

Anastasi, A., & Urbina, S. (1997). Psychological testing (7th ed.). Upper Saddle River, NJ: Prentice Hall.

Anderson, T., & Shattuck, J. (2012). Design-based research: A decade of progress in education research? Educational Researcher, 41(1), 16-25. doi:10.3102/0013189X11428813

Bagno, E., & Eylon, B. S. (1997). From problem solving to a knowledge structure: An example from the domain of electromagnetism. American Journal of Physics, 65(8), 726. doi:10.1119/1.18642

Belland, B. R. (2010). Portraits of middle school students constructing evidence-based arguments during problem-based learning: The impact of computer-based scaffolds. Educational Technology Research and Development, 58(3), 285-309. doi:10.1007/s11423-009-9139-4

Belland, B. R. (2011). Distributed cognition as a lens to understand the effects of scaffolds: The role of transfer of responsibility. Educational Psychology Review, 23(4), 577-600. doi:10.1007/s10648-011-9176-5

Belland, B. R. (2014). Scaffolding: Definition, current debates, and future directions. In J. M. Spector, M. D. Merrill, J. Elen, & M. J. Bishop (Eds.), Handbook of research on educational communications and technology (4th ed., pp. 505-518). New York, NY, USA: Springer.

Belland, B. R., French, B. F., & Ertmer, P. A. (2009). Validity and problem-based learning research: A review of instruments used to assess intended learning outcomes. Interdisciplinary Journal of Problem-Based Learning, 3(1), 59-89. doi:10.7771/1541-5015.1059

Belland, B. R., Glazewski, K. D., & Richardson, J. C. (2008). A scaffolding framework to support the construction of evidence-based arguments among middle school students. Educational Technology Research and Development, 56(4), 401-422. doi:10.1007/s11423-007-9074-1

Belland, B. R., Walker, A., Leary, H., Kuo, Y.-C., & Can, G. (2010, May). A meta-analysis of problem-based learning corrected for attenuation, and accounting for internal threats. Paper presented at the 2010 Annual Meeting of the American Educational Research Association, Denver, CO, USA.

Borenstein, M., Hedges, L. V., Higgins, J., & Rothstein, H. (2009). Introduction to meta-analysis. Chichester, UK: John Wiley & Sons.

Chambers, E. A. (2004). An introduction to meta-analysis with articles from the Journal of Educational Research (1992-2002). Journal of Educational Research, 98(1), 35-45. doi:10.3200/JOER.98.1.35-45

Choi, I., Land, S. M., & Turgeon, A. J. (2005). Scaffolding peer-questioning strategies to facilitate metacognition during online small group discussion. Instructional Science, 33(5-6), 483-511. doi:10.1007/s11251-005-1277-4

Clement, J. (1993). Using bridging analogies and anchoring intuitions to deal with students' preconceptions in physics. Journal of Research in Science Teaching, 30(10), 1241-1257. doi:10.1002/tea.3660301007

Cohen, J. (1988). Statistical power analysis for the behavioral sciences. N. p.: Lawrence Erlbaum.

Collins, A., Brown, J. S., & Newman, S. E. (1989). Cognitive apprenticeship: Teaching the crafts of reading, writing, and mathematics. In L. B. Resnick (Ed.), Knowing, learning, and instruction: Essays in honor of Robert Glaser (pp. 453-494). Hillsdale, NJ: Lawrence Erlbaum.

Cooper, H. (1989). Integrating research: A guide for literature reviews (2nd ed.). Newbury Park, CA, USA: Sage.

Cooper, H., Hedges, L. V., & Valentine, J. C. (2009). The handbook of research synthesis and meta-analysis. New York, NY, USA: Russell Sage Foundation.

Davis, E. A. (2003). Prompting middle school science students for productive reflection: Generic and directed prompts. Journal of the Learning Sciences, 12(1), 91-142. doi:10.1207/S15327809JLS1201_4

Feller, R. W. (2003). Aligning school counseling, the changing workplace, and career development assumptions. Professional School Counseling, 6(4), 262-271.

Foley, B. J. (1999). Visualization tools: Models, representations and knowledge integration (Ph.D. Dissertation). University of California, Berkeley, CA, USA. Retrieved from Dissertation Abstracts International. (Publication Number: AAI9966375)

Foster, D. P., & George, E. I. (1994). The risk inflation criterion for multiple regression. Annals of Statistics, 22(4), 1947-1975. doi:10.1214/aos/1176325766

Gall, M. D., Gall, J. P., & Borg, W. R. (2003). Educational research: An introduction (7th ed.). Boston, MA: Pearson.

Ge, X., & Land, S. M. (2003). Scaffolding students' problem-solving processes in an ill-structured task using question prompts and peer interactions. Educational Technology Research and Development, 51(1), 21-38. doi:10.1007/BF02504515

Ge, X., & Land, S. M. (2004). A conceptual framework for scaffolding ill-structured problem-solving processes using question prompts and peer interactions. Educational Technology Research and Development, 52(2), 5-22. doi:10.1007/BF02504836

Glass, G. V. (1976). Primary, secondary, and meta-analysis of research. Educational Researcher, 5(10), 3-8. doi:10.3102/0013189X005010003

Hamdy, H., Prasad, K., Anderson, M. B., Scherpbier, A., Williams, R., Zwierstra, R., & Cuddihy, H. (2006). BEME systematic review: Predictive values of measurements obtained in medical schools and future performance in medical practice. Medical Teacher, 28(2), 103-116. doi:10.1080/01421590600622723

Hannafin, M., Land, S., & Oliver, K. (1999). Open-ended learning environments: Foundations, methods, and models. In C. M. Reigeluth (Ed.), Instructional design theories and models: Volume II: A new paradigm of instructional theory (pp. 115-140). Mahwah, NJ, USA: Lawrence Erlbaum.

Hogan, T. P., & Agnello, J. (2004). An empirical study of reporting practices concerning measurement validity. Educational and Psychological Measurement, 64(5), 802-812. doi:10.1177/0013164404264120

Hunter, J. E., & Schmidt, F. L. (2004). Methods of meta-analysis: Correcting error and bias in research findings (2nd ed.). Thousand Oaks, CA, USA: Sage.

Jonassen, D. H., & Kim, B. (2010). Arguing to learn and learning to argue: Design justifications and guidelines. Educational Technology Research and Development, 58(4), 439-457. doi:10.1007/s11423-009-9143-8

Kali, Y., & Linn, M. C. (2008). Technology-enhanced support strategies for inquiry learning. In J. M. Spector, M. D. Merrill, J. J. G. van Merrienboer, & M. P. Driscoll (Eds.), Handbook of research on educational communications and technology (3rd ed., pp. 145-161). New York, NY: Lawrence Erlbaum.

Kolodner, J. L., Owensby, J. N., & Guzdial, M. (2004). Case-based learning aids. In D. H. Jonassen (Ed.), Handbook of research for education communications and technology (2nd ed., pp. 829-862). Mahwah, NJ, USA: Lawrence Erlbaum.

Lajoie, S. P., Lavigne, N. C., Guerrera, C., & Munsie, S. D. (2001). Constructing knowledge in the context of BioWorld. Instructional Science, 29(2), 155-186. doi:10.1023/A:1003996000775

Li, D. D., & Lim, C. P. (2008). Scaffolding online historical inquiry tasks: A case study of two secondary school classrooms. Computers & Education, 50(4), 1394-1410. doi:10.1016/j.compedu.2006.12.013

Lin, H., Ching, Y.-H., Ke, F., & Dwyer, F. (2007). Effectiveness of various enhancement strategies to complement animated instruction: A meta-analytic assessment. Journal of Educational Technology Systems, 35(2), 215-237. doi:10.2190/M200-5558-52V7-2287

Linn, M. C. (2000). Designing the knowledge integration environment. International Journal of Science Education, 22(8), 781-796. doi:10.1080/095006900412275

Loevinger, J. (1954). The attenuation paradox in test theory. Psychological Bulletin, 51(5), 493-504. doi:10.1037/h0058543

Mayer, R. E. (1989). Systematic thinking fostered by illustrations in scientific text. Journal of Educational Psychology, 81(2), 240-246. doi:10.1037/0022-0663.81.2.240

McNeill, K. L., & Krajcik, J. (2009). Synergy between teacher practices and curricular scaffolds to support students in using domain-specific and domain-general knowledge in writing arguments to explain phenomena. Journal of the Learning Sciences, 18(3), 416-460. doi:10.1080/10508400903013488

Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational Measurement (3rd ed.). New York, NY: American Council on Education.

Metcalf, S. J. (1999). The design of guided learner-adaptable scaffolding in interactive learning environments (Ph.D. Dissertation). University of Michigan, Ann Arbor, MI, USA. Retrieved from ProQuest Dissertations & Theses Full Text. (Publication No. 9959821)

Nathan, M. J., Kintsch, W., & Young, E. (1992). A theory of algebra-word-problem comprehension and its implications for the design of learning environments. Cognition and Instruction, 9(4), 329-389. doi:10.1207/s1532690xci0904_2

National Governors Association Center for Best Practices, & Council of Chief State School Officers. (2010). Common core state standards. Retrieved from

Nordgren, R. D. (2002). Globalization and education: What students will need to know and be able to do in the global village. Phi Delta Kappan, 84(4), 318-321.

Oliver, K., & Hannafin, M. J. (2000). Student management of web-based hypermedia resources during open-ended problem solving. Journal of Educational Research, 94, 75-92. doi:10.1080/00220670009598746

Pea, R. D. (2004). The social and technological dimensions of scaffolding and related theoretical concepts for learning, education, and human activity. Journal of the Learning Sciences, 13(3), 423-451. doi:10.1207/s15327809jls1303_6

Puntambekar, S., & Hubscher, R. (2005). Tools for scaffolding students in a complex learning environment: What have we gained and what have we missed? Educational Psychologist, 40, 1-12. doi:10.1207/s15326985ep4001_1

Puntambekar, S., & Kolodner, J. L. (2005). Toward implementing distributed scaffolding: Helping students learn science from design. Journal of Research in Science Teaching, 42(2), 185-217. doi:10.1002/tea.20048

Quintana, C., Reiser, B. J., Davis, E. A., Krajcik, J., Fretz, E., Duncan, R. G., ... Soloway, E. (2004). A scaffolding design framework for software to support science inquiry. Journal of the Learning Sciences, 13(3), 337-386. doi:10.1207/s15327809jls1303_4

Reiser, B. J. (2004). Scaffolding complex learning: The mechanisms of structuring and problematizing student work. Journal of the Learning Sciences, 13(3), 273-304. doi:10.1207/s15327809jls1303_2

Ronen, M., & Eliahu, M. (2000). Simulation--A bridge between theory and reality: the case of electric circuits. Journal of Computer Assisted Learning, 16(1), 14-26. doi:10.1046/j.1365-2729.2000.00112.x

Saye, J. W., & Brush, T. (2002). Scaffolding critical reasoning about history and social issues in multimedia-supported learning environments. Educational Technology Research and Development, 50(3), 77-96. doi:10.1007/BF02505026

Shadish, W., & Myers, D. (2001). Research design policy brief. Group. Retrieved from

Sugrue, B. (1995). A theory-based framework for assessing domain-specific problem-solving ability. Educational Measurement: Issues and Practice, 14(3), 29-35. doi:10.1111/j.1745-3992.1995.tb00865.x

Sutton, A. J. (2009). Publication bias. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), Handbook of research synthesis and meta-analysis (2nd ed., pp. 435-452). New York, NY: Russell Sage Foundation.

Swanson, H. L., & Deshler, D. (2003). Instructing adolescents with learning disabilities: Converting a meta-analysis to practice. Journal of Learning Disabilities, 36(2), 124-135. doi:10.1177/002221940303600205

Swanson, H. L., & Lussier, C. M. (2001). A selective synthesis of the experimental literature on dynamic assessment. Review of Educational Research, 71(2), 321-363. doi:10.3102/00346543071002321

Van de Pol, J., Volman, M., & Beishuizen, J. (2010). Scaffolding in teacher-student interaction: A decade of research. Educational Psychology Review, 22(3), 271-296. doi:10.1007/s10648-010-9127-6

VanLehn, K. (2011). The relative effectiveness of human tutoring, intelligent tutoring systems, and other tutoring systems. Educational Psychologist, 46(4), 197-221. doi:10.1080/00461520.2011.611369

Vygotsky, L. S. (1962). Thought and language. Cambridge, MA: MIT Press.

White, B. Y., & Frederiksen, J. R. (1998). Inquiry, modeling, and metacognition: Making science accessible to all students. Cognition and Instruction, 16(1), 3-118. doi:10.1207/s1532690xci1601_2

Wood, D., Bruner, J. S., & Ross, G. (1976). The role of tutoring in problem solving. Journal of Child Psychology and Psychiatry, 17(2), 89-100. doi:10.1111/j.1469-7610.1976.tb00381.x

Brian R. Belland (1) *, Andrew E. Walker (1), Megan Whitney Olsen (1) and Heather Leary (2)

(1) Department of Instructional Technology and Learning Sciences, Utah State University, USA // (2) Institute of Cognitive Science, University of Colorado-Boulder, USA

* Corresponding author

Table 1. Coding Scheme

Contextual Information

Paired intervention (e.g., problem-based learning) (a)
Assessment level--conceptual, principles, application
  (Sugrue, 1995)
Education level (e.g., middle level)
Discipline (e.g., physical science)
Collection year
Institution name (of the primary author)
Attrition treatment and attrition control (% of total
  assigned to treatment or control conditions)
Study design (e.g., random, group random, or quasi experimental)

Scaffold intervention

Scaffolding strategy (e.g., make thinking visible)
  (Kali & Linn, 2008)
Scaffolding function (e.g., reduce complexity) (Wood et al., 1976)
Scaffolding intervention (e.g., conceptual) (Hannafin et al., 1999)

Fading (none; human-adapted; fixed; self-selected)
Generic or specific (a)
Scaffolding Outcome (higher-order thinking skills or knowledge
  integration) (a)

Required information for effect size calculation

Means, SD, or ANOVA or other statistical comparison test values
N for treatment and control

Threats to internal validity (Gall et al., 2003) (b)

Statistical regression
Differential selection
Experimental mortality

Threats to external validity (Shadish & Myers, 2001)

Limited description of treatment
Multiple treatment interaction
Experimenter effect

Test Score Quality Reporting (c)

Test score validity reporting
Test score reliability reporting

Note. (a) Bolded coding items indicate a category that was added or
heavily modified as a result of pilot coding. (b) Internal validity
and external validity threats were coded for the degree to which
the threat could account for study results; 0 meant not a plausible
threat and 3 meant it could account for all variance in the
outcome. (c) Test score quality reporting was assessed as (a)
strong, if authors reported validity or reliability data for their
own sample, (b) attempt, if authors reported validity or
reliability data from other studies, or (c) none, if no validity or
reliability data was reported.
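
The "required information for effect size calculation" in Table 1 (means, SDs, and N per condition) is what a standardized mean difference needs. The following is a minimal sketch of Hedges' g using the standard formulas given in Borenstein et al. (2009); whether the authors computed effect sizes exactly this way is an assumption, and the input numbers are hypothetical, not taken from any coded study.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (g, variance of g) for two independent groups."""
    df = n_t + n_c - 2
    # Pooled within-group standard deviation.
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    d = (mean_t - mean_c) / pooled_sd
    # Small-sample correction factor J.
    j = 1 - 3 / (4 * df - 1)
    var_d = (n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c))
    return j * d, j**2 * var_d


# Hypothetical treatment and control summaries (illustration only).
g, var_g = hedges_g(mean_t=78.0, sd_t=10.0, n_t=30,
                    mean_c=72.0, sd_c=10.0, n_c=30)
se = math.sqrt(var_g)
ci = (g - 1.96 * se, g + 1.96 * se)  # normal-approximation 95% CI
print(f"g = {g:.2f}, 95% CI [{ci[0]:.2f}, {ci[1]:.2f}]")
```

When a study reports only a t or F statistic instead of means and SDs, the same g can be derived through the corresponding conversion formulas, which is why the coding scheme lists those test values as alternatives.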

Table 2. Influence of scaffolding characteristics
on cognitive outcomes

Scaffolding                 [N.sub.studies]   [N.sub.outcomes]    g     [CI.sub.Lower]   [CI.sub.Upper]


Generic (a)                        1                 1           0.72        0.41             1.03
Specific (a)                       6                 16          0.52        0.28             0.75

Intended Outcome

Higher order skills (a)            2                 7           0.44        0.31             0.74
Knowledge integration (a)          5                 10          0.59        0.27             0.90

Fading Schedule

None (a)                           6                 10          0.79        0.47             1.11
Fixed                              2                 7           0.20        0.00             0.41

Scaffolding Intervention

Conceptual (a)                     6                 12          0.67        0.37             0.97
Metacognitive                      1                 5           0.25        0.00             0.50

Paired Intervention

Inquiry-based learning (a)         3                 12          0.39        0.16             0.61
Problem solving (a)                4                 5           0.53        0.31             0.75

Note. (a) Significantly greater than an effect size of 0. p < .05.
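
To illustrate how two subgroup estimates in Table 2 can be compared, the sketch below recovers each standard error from its 95% confidence interval and applies a simple z-test to the difference between the no-fading and fixed-fading estimates. This is not the authors' moderator analysis. It assumes normal-based intervals and independent subgroups, and the latter is only an approximation here because the two subgroups may share studies (6 + 2 exceeds the 7 included studies). The function names are hypothetical.

```python
import math
from statistics import NormalDist

Z95 = 1.96  # critical value assumed to underlie the reported 95% CIs


def se_from_ci(lower: float, upper: float) -> float:
    """Recover a standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * Z95)


def subgroup_z_test(g1, ci1, g2, ci2):
    """Two-sided z-test for the difference between two independent estimates."""
    se1, se2 = se_from_ci(*ci1), se_from_ci(*ci2)
    z = (g1 - g2) / math.sqrt(se1**2 + se2**2)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Fading schedule rows of Table 2: none vs. fixed.
z, p = subgroup_z_test(0.79, (0.47, 1.11), 0.20, (0.00, 0.41))
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

The same two functions can be applied to any other pair of rows in Table 2, with the same caveat about shared studies.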

Table 3. Influence of methodological quality on cognitive outcomes

Methodological Quality     [N.sub.studies]   [N.sub.outcomes]     g     CI Lower   CI Upper

Study Design
  Quasi-experimental (a)          3                 3            0.60     0.16       1.05
  Group random (a)                2                 8            0.43     0.13       0.73
  Random (a)                      2                 6            0.66     0.19       1.13
Validity Reporting
  None (a)                        7                 17           0.53     0.30       0.75
  Attempt                         0                 0             --       --         --
  Strong                          0                 0             --       --         --
Reliability Reporting
  None (a)                        6                 11           0.67     0.37       0.97
  Attempt                         0                 0             --       --         --
  Strong (a)                      1                 6            0.25     0.01       0.50
Number of Internal
  Threats (b)
  None                            2                 5            0.29    -0.11       0.68
  One                             0                 0             --       --         --
  Two (a)                         3                 4            0.89     0.67       1.09
  Three (a)                       4                 8            0.45     0.16       0.74
Number of External
  Threats (c)
  None (a)                        1                 5            0.25     0.00       0.50
  One (a)                         2                 4            1.00     0.63       1.37
  Two                             2                 3            0.36    -0.35       1.07
  Three                           2                 2            0.44    -0.10       0.98
  Four                            1                 3            0.83    -0.13       1.79

Note. (a) Significantly greater than an effect size of 0,
p < .05. (b) [N.sub.studies] does not add up to 7 because
internal threats are associated with individual outcomes.
Some studies included more than one outcome. (c)
[N.sub.studies] does not add up to 7 because external
threats are associated with individual outcomes. Some
studies included more than one outcome.

Table 4. Influence of Assessment Level on Cognitive Outcomes.

Assessment        [N.sub.studies]   [N.sub.outcomes]     g     CI Lower   CI Upper
Level (b)

Concept                  3                 6            0.46     0.00       0.92
Principles (a)           6                 8            0.54     0.20       0.88
Application (a)          2                 3            0.63     0.34       0.93

Note. (a) Significantly greater than an effect size of 0,
p < .05. (b) [N.sub.studies] does not add up to 7 because
some studies employed more than one test that covered two
or more assessment levels.

Figure 2. Outcomes

Study/Effect                      n treat   n cntrl   Hedges' g

(Bagno & Eylon, 1997)               69        111       0.72
(Clement, 1993) static,             150       55        0.93
  friction, 3rd law
(Foley, 1999) post-test             79        77        0.17
(Ronen & Eliahu, 2000)              71        74        0.00
  theoretical exam
(White & Frederiksen,               118       119       0.00
  1998) explanation
  of applied physics test
multiple choice applied             118       119       0.00
  physics test
conceptual model test               30        30        0.45
Mass project                        60        60        0.47
inquiry test                        45        45        0.55
(Mayer, 1989) Exp. 1                17        17        0.00
  verbatim recognition
Exp. 1 non-explanative recall       17        17        0.00
Exp. 1 explanative recall           17        17        0.65
Exp. 1 transfer                     17        17        1.02
(Nathan et al., 1992) post-test     14        14        1.04
training                            14        14        1.44
(Ronen & Eliahu, 2000) task         34        29        1.01
  2--on last trial
task 1                              13        29        1.61
Overall                             883       844       0.53

Figure 3. Learning by fading

Fading    studies   outcomes   Hedges' g   lower   upper

none         6         10        0.79      0.47    1.11
fixed        2         7         0.20      0.00    0.41
Overall      7         17        0.53      0.31    0.75

Figure 4. Learning by scaffolding intervention

Intervention    studies   outcomes   Hedges' g   lower   upper

conceptual         5         12        0.67      0.37    0.97
metacognitive      1         5         0.25      0.00    0.50
Overall            7         17        0.53      0.31    0.75

Figure 5. Learning by external threats

External Threats   studies   outcomes   Hedges' g   lower   upper

none                  1         5         0.25      0.00    0.50
one                   2         4         1.00      0.63    1.37
two                   2         3         0.36     -0.35    1.07
three                 2         2         0.44     -0.10    0.98
four                  1         3         0.83     -0.13    1.79
Overall               7         17        0.53      0.31    0.75
COPYRIGHT 2015 International Forum of Educational Technology & Society
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2015 Gale, Cengage Learning. All rights reserved.

Article Details
Author: Belland, Brian R.; Walker, Andrew E.; Olsen, Megan Whitney; Leary, Heather
Publication: Educational Technology & Society
Article Type: Report
Date: Jan 1, 2015