
Cognitive Load Theory

Dylan Wiliam described Cognitive Load Theory as "the single most important thing for teachers to know", and I have to agree. I find it provides rationale and practical strategies for the principles of Explicit Instruction discussed in the previous section, and it has transformed the way I teach. Indeed, I have felt compelled (much to my wife's dismay) to write a talk entitled "How Cognitive Load Theory changed my life" as part of my "How I wished I'd taught maths" workshop. Cognitive Load Theory is focussed upon the conditions that make learning easier for students, and it is fascinating to look at this alongside Bjork's work in the Memory section, which extols the advantages of making learning harder. For me, the matter is resolved by using the principles of Cognitive Load Theory to aid the early knowledge acquisition phase, removing all unnecessary load on working memory, and then utilising concepts such as spacing, interleaving, testing, variation and other desirable difficulties later in the learning process. In this section we look at key "effects" of Cognitive Load Theory relating to worked examples, presentation of information, goal-free problems and the development of problem solving skills, and the practical implications they have for the classroom.

Before we dive into the research papers, Oliver Caviglioli has produced some amazing sketchnote summaries of the 2011 Cognitive Load Theory book, which really help illustrate the key ideas:
Chapters 1 and 2
Chapters 3 and 4
Chapters 5 and 6
Chapters 7 and 8
Chapters 9 and 10
Chapters 11 to 18
Complete Download

Also, this report from the New South Wales Centre for Education Statistics and Evaluation is just about the clearest and most concise summary of Cognitive Load Theory that I have seen.

Research Paper Title: Cognitive Architecture and Instructional Design (there is also a short review of the development of Cognitive Load Theory that gives a good summary of the key points: Story of a Research Programme)
Author(s): John Sweller, Jeroen J. G. van Merrienboer, and Fred G. W. C. Paas
My Takeaway:
It is no exaggeration to say that Cognitive Load Theory has changed my life. Well, my teaching life, at least. It explains so many of the struggles and mistakes I have seen my students making over the years, and I am just fuming that it has taken me 12 years to discover it. But better late than never! This paper is an outstanding overview of Cognitive Load Theory, together with its related effects and implications for instructional design. Cognitive Load Theory is concerned with how cognitive resources are focussed and used during problem solving and learning. The key principle is that our working memories are severely limited, and that once a student's working memory reaches "cognitive overload", no learning takes place. Indeed, it was a real eye-opener for me to discover that students could be working hard, and yet not actually be learning anything. What can be done about this? Well, as teachers we need to help reduce both the intrinsic and extraneous loads:

Intrinsic load - this is determined by the interaction between the nature of the learning tasks and the expertise of the learner. It depends on the amount of element interactivity in the tasks that must be learned. Something like learning the names of polygons has a relatively low element interactivity, as knowing the name of a pentagon is not necessary to know the name of an octagon - hence they can be learned separately. However, something like solving an equation has a relatively high element interactivity as you need to combine many dependent skills together to arrive at the answer. The degree of element interactivity also depends on the expertise of the learner, because what are numerous elements for a low-expertise learner may be only one or a few elements (i.e. chunks) for a high-expertise learner. As we have seen in previous sections, working memory capacity is severely limited (possibly being as low as 3 to 5 items), but the intrinsic load can be reduced by:
  • Ensuring students have sufficient knowledge, which can be organised in their long-term memory as schema. This allows students to work with a sizeable "chunk" of information (or schema) as if it were one item, which frees up capacity in working memory. As we have seen from the papers by Anderson and Willingham in the Cognitive Science section, knowledge matters!
  • Presenting complementary visual and auditory information together
Extraneous load - this is load that is not necessary for learning (indeed, it is unhelpful), and it typically results from badly designed instruction and the way material is presented to the learner. This extraneous load can be reduced by:
  • The use of goal free problems
  • Making use of worked examples
  • Carrying out completion tasks
  • The careful integration of related information
  • Reducing redundant information
In short, as teachers we need to take steps to focus students' limited working memory capacity on the things that matter, removing everything else, whilst at the same time ensuring students have sufficient background knowledge to better deal with the limitations of their fragile working memories. For an in-depth discussion with a maths teacher who regularly practises the lessons from Cognitive Load Theory, you can listen to my interview with Greg Ashman. The papers that follow in this section look in detail at the concepts introduced above and their implications for explicit instruction.

My favourite quote:
The design of practice and the organization and presentation of information is the domain of instructional designers. Although there are many factors that a designer may consider, the major thesis of this paper is that the cognitive load imposed by instructional designs should be the pre-eminent consideration when determining design structures. Limited working memory is one of the defining aspects of human cognitive architecture and, accordingly, all instructional designs should be analyzed from a cognitive load perspective. We argue that many commonly used instructional designs and procedures, because they were designed without reference to working memory limitations, are inadequate.

Research Paper Title: Cognitive Load Theory, Learning Difficulty and Instructional Design
Author(s): John Sweller
My Takeaway:
As well as providing an excellent overview of Cognitive Load Theory, this paper highlights the contrasting problem solving strategies of novices and experts, and thus introduces one of my favourite concepts: the Goal Free Effect. This has a big implication for teaching, especially when introducing new concepts to students. Goal-specific problems are those in which the top-level goals (i.e. final answer) can only be achieved by successfully completing the sub-goals (i.e. the steps leading up to it). When faced with goal-specific problems, novice learners tend to embark upon a means-end analysis. In essence, this means the student tries to juggle all the possible sub-steps that minimize the difference between the current state (the problem) and the end state (the goal). This is a backwards-thinking approach, and it is no surprise that it quickly overloads working memory. I know I have said this before, but it is worth repeating - even though students are working hard in such a scenario, their cognitive overload means they are not actually learning anything. Just as important is the fact that this means-end approach is incompatible with the development of schema, meaning that whilst the students may solve the specific problem at hand, they are unlikely to actually be learning. Learning is a change in long term memory, not just solving one specific problem. A relatively simple solution is to present students with ‘goal-free’ problems instead, which focus attention upon working forward from the information present, one step at a time, rather than trying to hold multiple possible steps in mind at once. A classic example in maths is that, rather than giving students a complex, multi-step question whose goal is to find the size of "angle x", you ask them to find the size of as many angles as they can. Similar things can be done with trigonometry questions, or even when working with cumulative frequency diagrams.
This breaks the process down and reduces the burden on novices' working memories. As will be discussed in the remaining papers in this section, once students become expert enough in a given topic, they are likely to have sufficient working memory capacity to dedicate to solving goal-specific problems. The key (and difficulty!) is judging when to introduce them, but in early skill acquisition, goal-free problems seem to be the way to go.
My favourite quote:
This means-ends procedure is a highly efficient technique for attaining the problem-goal. It is designed solely for this purpose. It is not intended as a learning technique and bears little relation to schemas or schema acquisition. In order to acquire an appropriate problem solving schema, students must learn to recognize each problem state according to its relevant moves. Using a means-ends strategy, much more must be done. Relations between a problem state and the goal state must be established; differences between them must be extracted; problem operators that impact favourably on those differences must be found. All this must be done essentially simultaneously and repeated for each move keeping in mind any subgoals. Furthermore, for novices, none of the problem states or operators are likely to be automated and so must be carefully considered. According to cognitive load theory, engaging in complex activities such as these that impose a heavy cognitive load and are irrelevant to schema acquisition will interfere with learning. 


Research Paper Title: Cognitive Architecture and Instructional Design
Author(s): John Sweller, Jeroen J. G. van Merrienboer, and Fred G. W. C. Paas
My Takeaway:
I include this wonderful paper again as I find it the best for discussing the Worked Example Effect and the Completion Problem Effect.
 
Worked Example Effect
The Worked Example Effect, perhaps more than any other identified by Cognitive Load Theory, has had the most significant impact on my teaching. In short, the Worked Example effect attempts to explain the finding whereby learners who study worked examples perform better on test problems than learners who solve the same problems themselves. When I first came across this, I didn't believe it - surely you learn more from trying to solve problems than by reading the solution. But the key is in the "trying to solve problems". The rather counter-intuitive result is due to the fact that studying worked examples focuses all attention on the correct solution and procedure, reducing the extraneous load compared to a means-end analysis that the learner may embark upon if left to their own devices. This is likely to lead to the development and acquisition of all-important schema which will help the students transfer their knowledge to related contexts. I look at worked examples in greater detail in the sections on Making the most of Worked Examples and The Importance of the Choice of Examples.
 
Completion Problem Effect
There is an obvious danger when just presenting students with worked examples, as opposed to problems to solve, that they do not fully engage with the examples. As all the work has been done for them, where is the incentive to think? And, as we have seen, if students are not thinking then they cannot be learning. I have seen this many times myself - some students will pay little attention during the worked examples, and then (surprisingly!) get stuck when they are set to work on their own. This is where the simple idea of requiring students to complete various pieces of information and steps within a worked example comes into play. This could be as simple as creating a full worked solution and then Tippexing or deleting key sections for the students to complete, or even pausing whilst going through a worked example on the board, and asking students to predict (maybe using mini-whiteboards) what comes next. Completion problems provide a bridge between worked examples and conventional problems. The authors put it thus: worked examples are completion problems with a full solution, and conventional problems are completion problems with no solution. A good progression might be to start with completion problems that provide almost complete solutions, and gradually work to completion problems for which all or most of the solution must be generated by the learners. This is something that will be addressed further in the Making the most of Worked Examples section.
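
To make that progression concrete, here is a minimal Python sketch of how a set of completion problems could be generated from one worked example. The function names, the blank marker, the sample equation and the choice to remove steps from the end of the solution first are all my own assumptions for illustration; the paper does not prescribe any particular order of fading.

```python
from typing import List

BLANK = "______"  # hypothetical marker for a step the student must supply


def fade_worked_example(problem: str, steps: List[str], blanks: int) -> List[str]:
    """Return the problem followed by its solution steps, with the
    final `blanks` steps replaced by a gap for the student to fill in."""
    blanks = max(0, min(blanks, len(steps)))  # clamp to a valid range
    keep = len(steps) - blanks
    return [problem] + steps[:keep] + [BLANK] * blanks


def completion_sequence(problem: str, steps: List[str]) -> List[List[str]]:
    """Build the full progression: worked example first (no blanks),
    conventional problem last (every step blank)."""
    return [fade_worked_example(problem, steps, b) for b in range(len(steps) + 1)]


if __name__ == "__main__":
    problem = "Solve 3x + 5 = 20"
    steps = ["3x = 15", "x = 5"]
    for version in completion_sequence(problem, steps):
        print(" | ".join(version))
```

The first version printed is the full worked example and the last is the conventional problem with no solution given, which mirrors the authors' description of the two ends of the range. In class you would of course spread these over several different questions rather than repeating the same one.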

My favourite quote:
There is considerable evidence that, compared to conventional problems, they decrease extraneous cognitive load, facilitate the construction of schemas, and lead to better transfer performance. In short duration studies, results indicated that completion problems are equally effective as worked examples intermixed with conventional problems. In studies of a longer duration, completion problems may better help learners to maintain motivation and focus their attention on useful solution steps that are available in the partial examples.

Research Paper Title: Reducing Cognitive Load by Mixing Auditory and Visual Presentation Modes
Author(s): Seyed Yaghoub Mousavi, Renae Low, and John Sweller
My Takeaway:
This paper introduces both the Split-Attention Effect and the Modality Effect, both of which have significantly changed my teaching.

Split-Attention Effect
We have seen that the worked-example effect occurs because worked examples reduce extraneous cognitive load, but there can be no guarantee that all worked examples appreciably reduce cognitive load. This is especially true if the worked example consists of a diagram, and a separate worked solution, neither of which are intelligible without the other. The learner must split their attention between the two forms of presentation, which increases the cognitive load. Hence, when presenting students with worked examples where a diagram is involved (for example, most geometry topics), the text solution should be carefully integrated within the diagram, and not separate from it. This will prevent students from having to deal with these two forms separately and reduce the extraneous cognitive load. Similarly, keeping all aspects of a question visible at the same time (i.e. without having to constantly turn the page over) will also help reduce this unnecessary load.

Modality Effect
To understand the importance of the modality effect we need to understand the components of working memory. The Central Executive acts a bit like a supervisor, the Phonological Loop deals with speech and sometimes other kinds of auditory information, the Visuo-Spatial Sketchpad holds visual information and the spatial relationships between objects, and the Episodic Buffer integrates new information with information already stored in long-term memory. The key point here is that working memory gets overloaded if too much information flows into one of these components, but we can use different components to aid processing. Hence, the capacity of working memory may be determined by the modality (auditory or visual) of presentation, and the effective size of working memory may be increased by presenting information in a mixed (auditory and visual) mode rather than in a single mode. There are two key implications of this for me.
1) Make use of complementary representations of concepts. This will be discussed more in the next paper.
2) When students are presented with written information, talking over the top of it can cause overload as both are processed by the auditory component (after text is read, it is processed as if it is being heard). The problem is made even worse if the text on the slide and the words I am saying out loud are the same, as here we have redundant information, and hence an example of the Redundancy Effect, which will be discussed further later in this section. So, the simple practice of putting up a slide of text and allowing the students the opportunity to read it in silence BEFORE reading it aloud or discussing it can reduce the modality effect by decreasing the extraneous load. For me, the biggest change this has made is encouraging me to shut up a little more. Previously I would start my students off on some problems, and then feel an incessant need to provide a running commentary over the top - "read the question carefully", "remember to show your working", etc. I thought I was being helpful, but in fact I was overloading my students' finite working memories as they were having to process my oral ramblings together with reading the text of the problems they were solving. In short, I was inhibiting their learning.

My favourite quote:
We began by indicating that basic research into the characteristics of working memory has suggested that this processing system is divided into at least two partially independent subprocessors: an auditory system devoted heavily to language and a visual system for handling images, including writing. Because both systems can be used simultaneously, limited working memory capacity might be effectively increased if information that must be stored or simultaneously processed is presented in a manner that permits it to be divided between the two systems, rather than processed in one system alone. As a consequence, informationally equivalent material that may be difficult to process in a purely visual manner may be more easily handled if it can be presented partially in both modalities.

Research Paper Title: The Instructive Animation: Helping Students Build Connections Between Words and Pictures in Multimedia Learning
Author(s): Richard E. Mayer and Richard B. Anderson
My Takeaway:
This paper further discusses the Modality Effect, but this time in the context of a dual-coding model, which will be discussed in greater detail in the next paper. The authors conducted an experiment in which students studied an animation depicting the operation of a bicycle tire pump or an automobile braking system, along with concurrent oral narration of the steps in the process, and tested their performance on both a retention test and a problem solving test against groups who had the animation alone, narration alone, or no instruction. On both tests, the group who had the animation alongside the oral narration performed the best. The authors conclude that these results are consistent with a dual-coding model in which retention requires the construction of "representational connections" and problem solving requires the construction of "representational and referential connections". The obvious implication for teaching from this paper is that pictures and words together can be more effective than pictures or words alone. If the oral description is there to support the comprehension of the animation, then it should benefit students by easing the strain placed on their working memories. Hence, I now make more use of carefully selected diagrams, as well as GIFs, GeoGebra demonstrations and interactive Desmos graphs. For example, introducing circle theorems using something like this, combined with my oral narration over the top, has really helped my students grasp the key concepts. But it can be a fine balance, and we should be careful not to simply use a different form of multimedia for the sake of it. For example, in the Real Life Maths section, I discuss a paper where the use of video to support maths comprehension had a negative effect on performance as students assumed the medium was easier to understand and hence put less effort into their thinking.
My favourite quote:
What makes an instructive animation? The results presented in this article demonstrate that animation per se does not necessarily improve students' understanding of how a pump or a brake works, as measured by creative problem solving performance. For example, in both experiments, students who received animation before or after narration were able to solve transfer problems no better than students who had received no instruction. In contrast, when animation was presented concurrently with narration, students demonstrated large improvements in problem-solving transfer over the no-instruction group. We conclude that one important characteristic of an instructive animation is temporal contiguity between animation and narration. We hypothesize that contiguity of words and pictures during instruction encourages learners to build connections between their verbal and visual representations of incoming information, which in turn supports problem-solving transfer.

Research Paper Title: Research‐Based Principles for Designing Multimedia Instruction
Author(s): Richard E. Mayer
My Takeaway:
The paper above offered just a small taste of the Cognitive Theory of Multimedia Learning, which has been developed over many years by Richard Mayer and others. There are clear parallels to be drawn with Cognitive Load Theory in its representation of working memory and emphasis on the importance of reducing extraneous load, but it is primarily focussed upon the design of instructional materials. The theory is based upon three key assumptions:
1) dual channel assumption—people have separate channels for processing visual and verbal material
2) limited capacity assumption—people can process only a limited amount of material in a channel at any one time
3) active processing assumption—meaningful learning occurs when learners select relevant material, organize it into a coherent structure, and integrate it with relevant prior knowledge
The key principle in the Cognitive Theory of Multimedia Learning is the Multimedia Principle - people learn more deeply from words and graphics than from words alone, which was examined in the paper above. However, this is just the beginning, and this paper provides an excellent introduction to the other principles of this theory, all of which have direct practical relevance to the design and presentation of worked examples, demonstrations, worksheets, etc. Many are related to the effects of Cognitive Load Theory, but two that particularly stood out to me were:
The Signalling Principle - People learn more deeply from a multimedia message when cues are added that highlight the organization of the essential material. Hence, finding ways to focus students' attention on the parts of examples that really matter is crucial.
The Coherence Principle - People learn more deeply from a multimedia message when extraneous material is excluded rather than included. The non-essential "fluff" I tend to put around examples (usually pathetic jokes) is only doing my students harm. Likewise, this also calls into question fun murder mystery style investigations. After all, we know from Willingham in the Cognitive Science section that students remember what they think about.
This whole paper is a fascinating read, and the theory itself provides a nice complement to the findings and recommendations from Cognitive Load Theory.

Research Paper Title: Cognitive Architecture and Instructional Design
Author(s): John Sweller, Jeroen J. G. van Merrienboer, and Fred G. W. C. Paas
My Takeaway:
I include this wonderful paper again as I find it the best for discussing the Redundancy Effect. In the past I had assumed that simply repeating the same information twice, but in a different form, would at worst have a neutral effect on learning. After all, what harm can it do? Moreover, surely it is good to be told something twice - giving students two opportunities to get it? But if that extra information is redundant - i.e. if students can infer all they need from the initial presentation - then I am likely to be imposing an unnecessary load upon students' working memories. We must be careful to distinguish this from the split-attention effect. Split-attention occurs when learners are faced with multiple sources of information that must be integrated before they can be understood. The individual sources of information cannot be used by learners if considered in isolation, hence the need for integration. The redundancy effect occurs when multiple sources of information are self-contained and can be used without reference to each other. Presenting them together harms learning because the redundant information is very difficult to ignore, and hence must be processed in students' limited working memories. The message for me is clear: if a concept cannot be understood without a second piece of information, then carefully integrate the two pieces of information together. If the second piece of information is not needed, then leave it out! This is especially true when information is presented in written form on a slide, and yet I feel the incessant need to also read it out to students. Redundant, inhibiting and incredibly annoying!
My favourite quote:
Redundancy is a major effect that should be considered seriously by instructional designers. A large range of experimental results indicate the negative consequences of including redundant material when designing instruction. We know of no experimental work demonstrating advantages of redundancy, and we suspect that such a result only could be obtained under conditions where one set of instructional materials was so poor that any redundant alternative would inevitably confer benefits.

Research Paper Title: Teaching Complex Rather Than Simple Tasks: Balancing Intrinsic and Germane Load to Enhance Transfer of Learning
Author(s): Jeroen J. G. van Merrienboer, Liesbeth Kester and Fred Paas
My Takeaway:
So far the focus of Cognitive Load Theory has been on making thinking as easy as possible for students, in the sense that all their limited working memory capacity should be focused entirely on the thing we want them to think about, to prevent cognitive overload occurring. In other words, as teachers we should try to:
1) eliminate unhelpful extraneous load via worked examples, the careful presentation of information and the use of goal free problems
2) reduce intrinsic load by helping students acquire the background knowledge necessary so that sub-components of a complex task are automated.
However, what if we reduce these two types of load so much that thinking actually becomes too easy? This fascinating (and controversial!) paper addresses this issue by introducing the concept of Germane Load. This can be viewed as "good cognitive load", in that it directly contributes to learning. It does this by aiding the construction of cognitive structures and processes that improve performance. The authors of this paper found that whilst reducing extraneous load is effective in producing high retention of the material, these techniques hinder the transfer of learning. They argue that there is a need to vary the conditions of practice and only give limited guidance and feedback in order to induce germane cognitive load and improve transfer. In other words, in order to improve learning (in particular the transfer of skills to new contexts), we need to make learning more difficult... but difficult in the right way! This is a concept similar to Bjork's fascinating idea of "desirable difficulties" that will be discussed at length in the Memory section.
The reason this paper is controversial is that one of the originators of Cognitive Load Theory, John Sweller, has distanced himself from the concept of germane load as he believes it makes his theory impossible to falsify. For example, assuming that the overall load is kept constant, a decrease in performance will be attributed to a rise in extraneous load that impairs germane cognitive processes. Conversely, if the performance increases it will be attributed to a germane load enhancement made possible by a drop in extraneous load.
What is my takeaway from all this? Well, I'll be honest - I am not 100% sure! Building in the concept of germane load might be making Cognitive Load Theory unnecessarily complicated. My take is this: during initial skill acquisition we need to ensure thinking is as focused and easy for the student as possible using all the principles we have discussed in the papers above. But, we need to ensure that thinking is not too easy. If students are cruising through lessons on autopilot, then their learning is unlikely to be deep, and learning without the ability to transfer it to new situations is not really learning at all. Of course, this is a fine balance, and will be covered in far more detail in the Memory sections.
My favourite quote:
In general, well-designed instruction should decrease extraneous load and optimise germane load, within the limits of total available capacity in order to prevent cognitive overload. However, this article is mainly about the situation that even after the removal of all sources of extraneous cognitive load, the element interactivity of the complex tasks is still too high to allow for efficient learning. Thus, it is about balancing intrinsic load, which is caused by dealing with the element interactivity in the tasks, and germane load, which is caused by genuine learning processes. The structure of our argument is as follows. First, we discuss research findings indicating that germane-load inducing instructional methods used for practicing simple tasks are not used for practicing complex tasks, at the cost of transfer of learning. Second, we argue that the element interactivity of learning tasks should be limited early in training to decrease their intrinsic load, so that germane-load inducing methods might be used right from the start of the training program.


Research Paper Title: Cognitive Load during Problem Solving: Effects on Learning
Author(s): John Sweller
My Takeaway:
One huge question from all we have seen so far on Explicit Instruction and Cognitive Load Theory is: "how do we get our students to become good problem solvers?". This paper offers the first clue. Two of the main strategies involved in problem solving are:
1) Schema acquisition. This involves recognising similarities between novel and previously solved problems, and calling upon knowledge stored in long term memory to apply to the new situation.
2) Means-end analysis. This is a generic problem-solving strategy that we all possess, and it involves measuring your current state, evaluating how far you are from the solution state and then deciding which moves may get you closer.
For students to become good problem solvers they need to form mental schema from domain-specific knowledge which they can then apply to different situations. Unlike experts, novices lack the appropriate schema to recognise and memorise problem configurations. They set about solving problems by focusing on the detail and ignoring structure, embarking upon a means-end analysis. This would all be fine, apart from the claim in this paper that solving problems via such a strategy is itself not an effective way for novices to develop these crucial mental schema. Why? Well, because trying to solve problems in this manner (problem-solving search via means-end analysis) is cognitively demanding. The learner has to maintain the following aspects of the problem in his or her mind: current problem state, goal state, differences between these two states, operators that reduce the differences between the goal state and the present state, and subgoals. This overloads working memory, and hence the mental schema are not developed. Indeed, even if you manage to solve the problem, you might not recall the solution method, and hence you might not actually learn anything from the process. In other words, during a means-end approach to solving a problem, local goals and relationships may swamp the more global relationships. My key takeaway from this is a rather big one - students may not learn key knowledge and procedures from problem solving. Sure, they may solve the problem, but they are unlikely to be able to solve a related one (and almost certainly not an unrelated one, because the idea of a generic "Problem Solving skill" is flawed, as we shall see in the Problem Solving section), and it is an ineffective way of teaching the fundamental skills and procedures required to solve the problem. Problem solving is not a learning device. Problem solving must come at the end of the process, after the necessary domain specific knowledge has been learned.
So, what are the implications for the classroom? Well, firstly, students should not be exposed to complex problems too early in the learning process. Secondly, I believe there is little point going through lots of difficult exam questions in the hope students understand them and make connections between related questions. I have been there myself - going through a series of unrelated problems with a class of 30 Year 11s, with students successfully answering my prompt-filled questions and happily nodding along when I ask them if they get it. But any apparent success from teaching problem solving to novices is likely to be just mimicry. The skills will not be transferred. This will be covered further in the Problem Solving section, but the key point is that if basic skills are not in place, then the demands of problem-solving search via means-end analysis mean that there will simply not be enough capacity in working memory for students to develop the mental schema necessary to learn and transfer.
My favourite quote:
Most mathematics and mathematics-based curricula place a heavy emphasis on conventional problem solving as a learning device. Once basic principles have been explained and a limited number of worked examples demonstrated, students are normally required to solve substantial numbers of problems. Much time tends to be devoted to problem solving and as a consequence, considerable learning probably occurs during this period. The emphasis on problem solving is nevertheless, based more on tradition than on research findings. There seems to be no clear evidence that conventional problem solving is an efficient learning device and considerable evidence that it is not. If, as suggested here, conventional problems impose a heavy cognitive load which does not assist in learning, they may be better replaced by nonspecific goal problems or worked examples. The use of conventional problems should be reserved for tests and perhaps as a motivational device.

Research Paper Title: The Expertise Reversal Effect
Author(s): Sweller, J., Ayres, P. L., Kalyuga, S. & Chandler, P. A. 
My Takeaway:
So far all the talk has been of using Explicit Instruction and the key features of Cognitive Load Theory as the best method of teaching students to become fluent in the facts and procedures they will need to learn more complex skills. Specifically, in early skill acquisition, learning from worked examples is very advantageous, and learning by solving problems is not. However, as this paper describes, instructional techniques that are highly effective with inexperienced learners (novices) can lose their effectiveness and even have negative consequences when used with more experienced learners (experts), hence the Expertise Reversal Effect. The argument is that worked examples contain information that is easily determined by the more experienced learners themselves and, therefore, can be considered redundant. As we have seen via the Redundancy Effect, devoting working memory to redundant information effectively takes away a portion of the learners’ limited cognitive capacity that could be devoted to the more useful germane load. Moreover, this redundant information may even interfere with the schemas constructed by experienced learners, preventing them from seeing the deeper connections in problems that are essential for transfer. For example, what if students have solved a problem differently to how I have presented it in the worked example? At this stage of development, working through complex problems independently is likely to be more beneficial for long-term learning than studying worked examples. Of course, one major difficulty of this is recognising when students have made the transition from novice to expert and hence can start to be exposed to more complex problems. It is a delicate balancing act! What I have started doing is making worked examples "optional" for students once we have covered the basics of a topic. That way they can judge for themselves whether they are at the stage where using worked examples will help or hinder them.
My favourite quote:
When a problem can be solved relatively effortlessly, analyzing a redundant worked example and integrating it with previously acquired schemas in working memory may impose a greater cognitive load than problem solving. Under these circumstances, practice in problem solving may result in more effective learning than studying worked examples because solving problems may adequately facilitate further schema construction and automation