Hunting the white whale of learning impact
Reflections on Will Thalheimer’s LTEM Boot Camp.
A few weeks ago, I completed Will Thalheimer’s inaugural ‘LTEM Boot Camp’ on learning evaluation.
LTEM stands for ‘Learning Transfer Evaluation Model’, and it provides an alternative to the Kirkpatrick Model that many L&D teams use to measure the effectiveness of their learning interventions.
In this edition of the Dispatch, my intention is not to explain what LTEM is, or how to use it — for this, I recommend you read Will’s LTEM Report.
Instead, I want to share some reflections on the boot camp, and describe how it has shaped my thinking on evaluation.
🐋 Is learning impact my white whale?
On the Learning Experience team at Mind Tools, we’ve been using LTEM for several years, encouraging our clients to agree success metrics at the start of every project. I emphasize ‘start’ here because, as Ross G has written, what we choose to measure influences our approach to content design. We strongly believe that evaluation should be discussed at the beginning of this process, not just at the end.
When explaining LTEM to clients, I often focus on Tier 8 of the model: Effects of Transfer. In simple terms, Tier 8 is all about the business impact of learning. If you’re developing sales training, a Tier-8 metric would be increased sales. If you’re designing an intervention around employee wellbeing, a Tier-8 metric might be the rate of absenteeism.
To be clear, my obsession with Tier 8 is personal.
As a vendor, we’re not always included in evaluating the materials we design. We encourage clients to think about measurement upfront, and we help them identify success metrics at the different levels of LTEM. But gathering and analyzing that data is typically outside of our control. Even in the projects we work on with our Insights team, where we’re developing valid and reliable behavioral surveys, I’m often left wanting just a little more certainty.
When I signed up for the LTEM Boot Camp, I wasn’t looking to understand the theory behind LTEM; I wanted to understand the practice. While I wasn’t conscious of this at the time, what I actually wanted was for Will to tell me how to remove all doubt from learning evaluation. Taken a step further, I secretly hoped he would explain how I could boil everything down to a single, indisputable number that would prove business impact. Which, of course, is a pipe dream.
My point in all of this, and something I’ve realized since completing the boot camp, is that you can drive yourself slightly mad in the quest to demonstrate ‘pure’ learning impact. There are steps you can take to remove doubt and increase certainty, but these come at a cost.
📏 How much measurement is too much?
This brings me to my next point. To make it, let’s return to our sales-training example.
Suppose it’s been six months since learners completed your program, and sales have increased 20%. Before you pat yourself on the back, how can you be sure that this improved performance should be attributed to your intervention?
Well, you could start by looking at the metrics you set at the other levels of LTEM, including Tier 5: Decision-Making Competence. Ideally, you’d want data on this from before, during, and after the learning event to demonstrate that participants have improved and retained their decision-making ability. You might gather this data by asking learners realistic scenario questions, and evaluating their responses.
If the data shows that the sales team’s decision-making competence has increased, this should give you additional confidence that your intervention is behind the increased sales. But it’s still not quite enough, as it doesn’t account for other factors that might explain the change, such as seasonal purchasing cycles, economic conditions, or incentives.
One way you could control for these factors would be by piloting your new program with a representative group from the sales team, and comparing their performance against a control group. If the pilot group’s performance exceeds that of the control, this would be another indicator that your training program has been effective.
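To make the pilot-versus-control comparison concrete, here is a minimal Python sketch. Every figure in it is invented for illustration, and the function names are hypothetical rather than part of LTEM. It subtracts the control group’s percentage change from the pilot group’s (a basic difference-in-differences), which is one simple way to net out factors, such as seasonality, that affect both groups equally.

```python
from statistics import mean


def percent_change(before, after):
    """Percentage change in average sales between two periods."""
    return (mean(after) - mean(before)) / mean(before) * 100


def training_uplift(pilot_before, pilot_after, control_before, control_after):
    """Pilot group's change minus control group's change, in percentage points."""
    pilot = percent_change(pilot_before, pilot_after)
    control = percent_change(control_before, control_after)
    return pilot - control


# Hypothetical monthly sales per rep (invented numbers, not real data)
pilot_before = [100, 120, 80]
pilot_after = [125, 140, 95]
control_before = [100, 110, 90]
control_after = [110, 120, 100]

uplift = training_uplift(pilot_before, pilot_after, control_before, control_after)
print(f"Pilot change:   {percent_change(pilot_before, pilot_after):.1f}%")
print(f"Control change: {percent_change(control_before, control_after):.1f}%")
print(f"Estimated uplift from training: {uplift:.1f} percentage points")
```

In this made-up example, the pilot group improves by 20% and the control group by 10%, so only around 10 percentage points of the increase could plausibly be credited to the training. With groups this small, a gap like that could easily be noise, so a real analysis would also need a larger sample and a significance test.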
So, how far down the measurement rabbit hole should you go? Predictably, the answer to this question is ‘It depends.’ If you’re spending $100,000 on a blended-learning program, it’s probably worth being really sure that your intervention has worked. If, on the other hand, you’re spending $8,000 on a few explainer videos, you might be happy to stick to Tier 5.
🤔 Why do we measure?
Of course, the purpose of evaluation is not only to demonstrate business impact. As Will pointed out in the LTEM Boot Camp, there are three key reasons we use measurement in learning: i) to prove value; ii) to improve learning; iii) to support learners.
On reflection, I think my obsession with the first of these reasons has sometimes led me to overlook the importance of the other two.
As I’ve already mentioned, we agree success metrics at the outset of client projects, as this improves learning by ensuring the content we design is tied to measurable performance outcomes. But measurement also allows us to improve learning at the end of a project, helping us identify what worked, what didn’t, and where we can make iterative adjustments.
Fundamentally, as Will likes to put it, ‘we measure to make something happen!’ We measure to make better decisions, to take better actions, and to improve our reputation, encouraging our organizations to give us the resources and autonomy we need to support learners. I’m expanding my definition of ‘impact’ to include all of these things going forward.
Want to share your thoughts on learning evaluation? Need help defining success metrics for a new program? Get in touch by emailing email@example.com or replying to this newsletter from your inbox.
🎧 On the podcast
What does an effective workplace wellbeing strategy look like? For some organizations, it’s fruit boxes and staff discounts. For consultant and author Liggy Webb, it’s a holistic approach that factors in the physical, social, mental, financial, digital, environmental and spiritual health of colleagues.
In this week’s episode of the podcast, Ross G and Nahdia are joined by Liggy to discuss the benefits of a more structured approach to wellbeing.
Check out the episode below. 👇
📖 Deep dive
AI enthusiasts will doubtless have noticed that Google has finally released the much-anticipated ‘Gemini Advanced’ LLM.
As I’ve mentioned in previous editions of the Dispatch, Ethan Mollick’s newsletter, One Useful Thing, has been one of my go-to sources for AI news and analysis over the last twelve months. So, naturally, I was curious to read Ethan’s thoughts on Gemini Advanced.
In this article, Ethan provides ‘tasting notes’ on the new model. Rather than focusing on Gemini Advanced’s performance against testing benchmarks, he offers a mixture of subjective and objective views on the LLM: ‘more like sampling a wine than giving a rigorous review.’
One of his most interesting ‘notes’ related to applying the LLM in a learning context:
‘We have been actively experimenting with using AI for learning and have been writing papers with suggested prompts. While updating the prompts for Gemini […] we noticed that, compared to GPT-4, it continually tries to be helpful. In fact, it is so helpful that it can undermine the goal of our prompts by trying to help the student, rather than letting them struggle through understanding a concept on their own. We had to change our prompts a little to reduce this behavior.’
Mollick, E. (2024). ‘Google’s Gemini Advanced: Tasting Notes and Implications’. One Useful Thing.
👹 Missing links
In a recent Substack post, Nate Silver responds to a question from a reader — ‘Why do people argue on the internet?’ In typical Silverian fashion, Nate answers this question by identifying five typologies of internet-arguers, ranging from ‘Somebody Is Wrong On the Internet’ types, who can’t help but call out nonsense when they see it, to ‘Flag-Wavers’, whose arguments are designed to signal their good-guy credentials to their tribe.
I’ve been eagerly awaiting the launch of Apple’s new mixed-reality headset, the ‘Vision Pro’, not because I have any intention of buying one (they retail at $3,500), but because I think Apple have addressed a lot of the issues holding back mainstream adoption of VR. In this episode of the Hard Fork podcast, Casey Newton and Kevin Roose share their initial impressions of the device, after testing it at Apple’s Cupertino headquarters.
If you’re not on TikTok, you may have missed the viral frenzy surrounding the Stanley cup. Such is the hysteria that customers at department stores are reportedly getting into heated arguments over the opportunity to buy what is, in essence, a thermos. To see if she could become the kind of person who is cultishly devoted to a cup, journalist Heather Schwedel spent a week with a Stanley and documented her experience for Slate.
👋 And finally…
Props to my colleague Anna Barnett for bringing this week’s ‘And finally…’ to my attention. Have you ever been asked to ‘fold in the cheese’?
Thanks for reading The L&D Dispatch from Mind Tools! If you’d like to speak to us, work with us, or make a suggestion, you can email firstname.lastname@example.org.
Or just hit reply to this email!