Evaluation

How to measure whether you have met your goals

This card covers how you measure whether the tool met its goals. Without a way to check, it is easy to assume a tool works because it was built. Good evaluation looks at real use and real output, not just whether the tool runs without errors.

Questions to explore

// Use these as prompts in a workshop or on your own. There are no right answers.

What would you measure to know if the tool is actually helping?
How will you compare the work with the tool against the work without it?
What does success look like in numbers, and what only shows up in conversation?
How often will you check, and who looks at the results?
What result would tell you to change or retire the tool?
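
For teams that log their work, the comparison question above can be roughed out in a few lines. The following is only a sketch: the field names (minutes, corrections, quality) and every number in it are made up for illustration, and a real newsroom would choose its own measures and collect its own data.

```python
from statistics import mean

# Hypothetical review data: each dict is one story scored by an editor.
# Field names and numbers are illustrative assumptions, not real results.
baseline = [
    {"minutes": 95, "corrections": 2, "quality": 3},
    {"minutes": 110, "corrections": 1, "quality": 4},
    {"minutes": 80, "corrections": 3, "quality": 3},
]
with_tool = [
    {"minutes": 60, "corrections": 2, "quality": 4},
    {"minutes": 70, "corrections": 4, "quality": 3},
    {"minutes": 65, "corrections": 3, "quality": 3},
]


def summarize(stories):
    """Average each metric across a list of scored stories."""
    return {key: mean(s[key] for s in stories) for key in stories[0]}


def compare(before, after):
    """Return the change in each averaged metric (after minus before)."""
    b, a = summarize(before), summarize(after)
    return {key: round(a[key] - b[key], 2) for key in b}


change = compare(baseline, with_tool)
for metric, delta in change.items():
    print(f"{metric}: {delta:+}")
```

In this invented data the stories get faster while corrections go up and quality stays flat, which is the kind of trade-off that an efficiency-only metric would hide.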

Expert voices

// notes from the journalists and AI experts who helped shape this kit

“Assess whether AI tools actually improve journalism with metrics beyond efficiency: quality, accuracy, and audience impact.”

Lynn Khellaf, DW

“Decide at what point you will review and evaluate the impact of AI on your work.”

Zenzele Ndebele, Centre for Innovation and Technology (CITE)

“Set realistic testing periods, typically one to three months, with clear success metrics. Give the evaluation enough time to get past the learning curve and show real production impact.”

Michelle Nogales, Muy Waso

“Test AI systems before newsroom adoption, not after. A shared testing grid turns scattered impressions into comparable evaluations.”

Ola Möller and Barbara Gruber, MethodKit and DW Akademie
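
One possible reading of a shared testing grid is that every reviewer scores every tool on the same criteria and the same scale, so the results can be averaged and compared. The sketch below assumes a 1 to 5 scale; the criteria, reviewer names, tool names, and scores are all hypothetical and are not taken from the kit.

```python
from statistics import mean

# Hypothetical criteria, each scored from 1 (poor) to 5 (excellent).
CRITERIA = ("accuracy", "quality", "ease_of_use")

# Each row: (reviewer, tool, scores). All values are illustrative only.
grid = [
    ("reviewer_a", "tool_x", {"accuracy": 4, "quality": 3, "ease_of_use": 5}),
    ("reviewer_b", "tool_x", {"accuracy": 2, "quality": 3, "ease_of_use": 4}),
    ("reviewer_a", "tool_y", {"accuracy": 5, "quality": 4, "ease_of_use": 2}),
    ("reviewer_b", "tool_y", {"accuracy": 4, "quality": 4, "ease_of_use": 3}),
]


def average_by_tool(rows):
    """Group rows by tool and average each criterion across reviewers."""
    by_tool = {}
    for _reviewer, tool, scores in rows:
        by_tool.setdefault(tool, []).append(scores)
    return {
        tool: {c: mean(s[c] for s in score_list) for c in CRITERIA}
        for tool, score_list in by_tool.items()
    }


averages = average_by_tool(grid)
for tool, scores in averages.items():
    print(tool, scores)
```

Keeping the criteria in one shared list is the point of the exercise: reviewers cannot drift into scoring different things, and gaps between reviewers on the same tool become a prompt for conversation.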

Things to consider

A tool that runs is not the same as a tool that helps.
Decide how you will measure success before you launch.
Some effects show up in how people work, not in the metrics.

Using this card

Pull the Evaluation card when it is relevant and set it aside when it is not. Pair it with the other AI Solutions cards, lay them out on a table, and use the questions above to get everyone on the same page. Capture what you discuss on sticky notes or in a shared doc.