Evaluation Case Study — Upstream Solutions

—— APPLIED CASE STUDY

The program was working. Nobody could prove it.

A nonprofit education team with two active evaluation partners, a full reporting cycle, and leadership asking questions nobody could answer on the spot.

SECTOR

Nonprofit / Education

FUNCTION

Program Evaluation & Impact Measurement

TEAM SIZE

3-person programs team

FRAMEWORK

Assess · Optimize · Amplify

THE ORGANIZATION ——

A regional education nonprofit managing a two-year literacy initiative across 14 school sites.

The organization is well-staffed, mission-aligned, and has invested meaningfully in evaluation infrastructure. Two external partners are actively generating data. The board is engaged. Funders are watching. Everything looks right — except the internal team can't keep up with what's being produced, and no one is quite sure what to do with it.

INITIATIVE SCOPE

A two-year literacy intervention serving 3,200 students across 14 Title I school sites. Year one is complete; the mid-point funder review is six weeks away.

FUNDER RELATIONSHIP

A national foundation providing $2.1M over two years. Mid-point review requires demonstrated progress on three learning questions tied to the theory of change.

INTERNAL TEAM STRUCTURE

Three staff members managing the initiative: one Director of Programs & Evaluation (handles strategy, reporting, and funder relations), two program managers (site coordination and partner liaison). All experienced. All stretched thin.

CURRENT PAIN POINT

The director knows the program is working. She just spent three hours last week trying to find the right numbers to say so — and still isn't confident in what she found.

EXTERNAL PARTNERS ——

PARTNER ONE

Research & Evaluation Firm

A boutique research consultancy contracted to conduct the formal program evaluation. They administer pre/post literacy assessments, run teacher perception surveys, and produce two formal reports per year — one mid-year, one summative.

Bi-annual student assessment data (literacy benchmarks)
Teacher and school administrator perception surveys
Formal mid-year and summative evaluation reports
Logic model fidelity monitoring

PARTNER TWO

Learning & Coaching Consultancy

A professional development firm providing in-person coaching to classroom teachers at all 14 sites. They track coaching session attendance, document implementation fidelity notes, and submit monthly activity logs to the program managers.

Monthly coaching session logs (per site, per teacher)
Implementation fidelity observation notes
Teacher skill progression rubrics
Informal quarterly reflections on program adoption

HOW IT’S PRESENTING ——

The team isn't asking for a new system. They think they need a better report.

On the surface, this looks like a documentation problem. One program manager is manually pulling from two partner reporting formats that don't align with each other, and neither maps cleanly to the funder's reporting template. So the team's proposed solution is a new report template, something that pulls everything together in one place.

VISIBLE SYMPTOM

Coaching logs from Partner Two arrive in a different format each month — sometimes a PDF narrative, sometimes a spreadsheet, once a slide deck. One program manager spends 6–8 hours per cycle normalizing inputs before any analysis can begin — on top of their actual job managing site relationships.

VISIBLE SYMPTOM

Partner One's mid-year report landed three weeks ago. It's 47 pages. The director has read the executive summary. The program managers haven't opened it.

VISIBLE SYMPTOM

When the ED asks "are we on track?" before a board meeting, the programs director gives a verbal update from memory. There's no standing dashboard, no shared summary — just individual knowledge held by one person.

VISIBLE SYMPTOM

The two external partners have never been in the same room — or on the same call. Their work is generating complementary insights that no one is connecting.

WHAT EACH STAKEHOLDER SAYS THEY NEED

DIRECTOR OF PROGRAMS

"We need a unified reporting template so everything lives in one place." — Believes the problem is format inconsistency.

PROGRAM MANAGERS

"We need Partner Two to submit their logs on time and in the right format." — Believes the problem is a partner compliance issue.

EXECUTIVE DIRECTOR

"I need to know if we're on track — in plain English, before the board asks." — Believes the problem is communication speed.

FUNDER

"Show us progress on the three learning questions from the original proposal." — Believes the problem is evidence packaging.

THE UPSTREAM APPROACH ——

Four stakeholders. Four different problem diagnoses. One actual bottleneck.

Upstream doesn't assume the presenting problem is the real problem. Before recommending anything — a new template, a new process, new training — we assess where the system is actually breaking down. The answer shapes everything that follows.

1
Assess

Where is the system actually breaking down?

Stakeholder mapping, workflow tracing, and gap diagnosis — before a single solution is proposed.

2
Optimize

Where is the system losing value?

Targeted interventions at the specific breakdown points, without redesigning what already works.

3
Amplify

How do we know it worked?

Measuring what matters, tracking early wins, and designing for sustainability from day one.

PHASE ONE - ASSESS ——

The template isn't the problem. The team is making decisions without a shared picture of the work.

Upstream's assessment process maps three things: where information lives, who needs it, and what decision each person is actually trying to make. In this case, the data exists. The evaluation partners are producing quality work. What doesn't exist is a structure that surfaces the right piece to the right person at the right moment. That's a delivery and synthesis problem — not a reporting format problem.

DIRECTOR OF PROGRAMS & EVALUATION

High expertise. Single point of failure.

WHAT THEY KNOW
Everything. Both partner relationships, all the data, the full context. Handles strategy, funder reporting, and day-to-day program decisions.

WHAT’S AT RISK
All institutional knowledge is in her head. No coverage, no redundancy. Evaluation work competes with program management work.

STAKE IN CHANGE
High motivation, low bandwidth. Will support solutions that reduce her load without adding new systems to manage.

PROGRAM MANAGERS

Doing data work they weren't hired for.

WHAT THEY KNOW
Site-level implementation. Day-to-day coaching relationships. What's actually happening on the ground.

WHAT’S AT RISK
Their site knowledge never makes it into the formal record. They're chasing data instead of managing programs.

STAKE IN CHANGE
Want relief from data wrangling. May resist new processes if they feel like more work on top of program management.

EXTERNAL PARTNERS

Producing in parallel. Never connected.

WHAT THEY KNOW
Each owns half the picture. Assessment data vs. implementation data.

WHAT’S AT RISK
Missed correlations between coaching fidelity and student outcomes. The most important insights are invisible.

STAKE IN CHANGE
Cooperative if given clear direction. No one has asked them to align yet.

EXECUTIVE DIRECTOR

Asking the right question. Getting the wrong answer.

WHAT THEY KNOW
Whatever the programs director told them last. High-level strategy. Board expectations.

WHAT’S AT RISK
Going into a $2.1M funder review underprepared. Board questions she can't answer confidently.

STAKE IN CHANGE
Wants clarity and confidence. Will champion anything that delivers it without creating new work for her team.

ASSESSMENT FINDINGS ——

01

The bottleneck is synthesis, not collection.

Both partners are producing quality data on schedule. The breakdown happens at the point where someone has to make it mean something — that step is currently undesigned. It happens differently every cycle, by whoever has time, without a standard for what "good" looks like.

02

The two partners are answering different parts of the same question.

Partner One tracks what students can do. Partner Two tracks what teachers are doing. No one is connecting these datasets — which means the most important insight (does coaching fidelity predict student gains?) is sitting invisible inside existing data nobody has combined.
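To make the missing connection concrete, here is a minimal sketch of what combining the two datasets could look like. It assumes both partners can report one number per site: a coaching fidelity score from Partner Two and a mean literacy gain from Partner One. The site IDs, the 1-4 fidelity scale, and every value below are hypothetical and purely illustrative; they are not results from this initiative.

```python
from math import sqrt

# Illustrative, made-up per-site values; real inputs would come from the partners.
# Fidelity: assumed 1-4 rubric average from Partner Two's observation notes.
coaching_fidelity = {
    "site_01": 3.5, "site_02": 2.0, "site_03": 3.0, "site_04": 1.5,
    "site_05": 2.5,  # no matching assessment record below
}
# Gain: assumed mean pre/post benchmark change from Partner One's assessments.
literacy_gain = {
    "site_01": 12.0, "site_02": 7.0, "site_03": 9.0, "site_04": 6.0,
    "site_06": 8.0,  # no matching coaching record above
}


def join_by_site(fidelity, gains):
    """Pair the two measures for every site that appears in both datasets."""
    shared_sites = sorted(fidelity.keys() & gains.keys())
    return [(fidelity[site], gains[site]) for site in shared_sites]


def pearson_r(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    if n < 3:
        raise ValueError("need at least three matched sites")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("no variation in one of the measures")
    return cov / sqrt(var_x * var_y)


pairs = join_by_site(coaching_fidelity, literacy_gain)
r = pearson_r(pairs)
print(f"{len(pairs)} matched sites, r = {r:.2f}")
```

The join step matters as much as the statistic: sites missing from either dataset drop out, which shows where the partners' records don't line up. With only 14 sites, any correlation would be a prompt for the joint partner conversation, not proof that coaching causes gains.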

03

Program managers are being used as data wranglers.

Their highest-value contribution — site relationships, contextual knowledge, qualitative signal — is being crowded out by administrative data-chasing. One manager is spending almost a full workday per month normalizing partner inputs. This isn't a people problem. It's a role design problem.

04

The ED is six weeks from a funder review with no standing brief.

There's no persistent document the executive director can open on a Tuesday morning and know exactly where the initiative stands. This is the most immediate risk — and the most fixable one.

PHASE TWO - OPTIMIZE ——

What we build, what we fix, and what we stop doing entirely.

Optimization is surgical. We don't redesign what works; we intervene at the specific points where the system is losing value. In this case, three targeted interventions address the core breakdowns without requiring new software or additional headcount.

INTERVENTION

Partner Alignment Protocol

WHAT IT DOES
A quarterly joint session with both external partners, structured to surface connections between their datasets and produce one shared insight memo rather than two separate reports.

WHAT IT REPLACES
Ad hoc communication. 47-page reports nobody reads. Missed correlations between coaching and outcomes data.

INTERVENTION

Standing Initiative Brief

WHAT IT DOES
A two-page living document updated after each partner submission that answers the ED's "are we on track?" question directly. Structured around the three funder learning questions.

WHAT IT REPLACES
Memory-based verbal updates. Last-minute prep before board meetings. Anxiety.

INTERVENTION

Data Intake Standardization

WHAT IT DOES
A shared input template both partners complete monthly in the same format. Reduces normalization time from 6–8 hours to under 90 minutes.

WHAT IT REPLACES
PDF narratives, ad hoc spreadsheets, slide decks submitted as deliverables. A program manager's recurring 6–8 hour reconciliation task.
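One way to picture the shared input template from Data Intake Standardization is as a single record shape that both partners fill in, with a few checks that run before anything reaches the program managers. The sketch below is an assumption about how that might look, not the template Upstream delivered: the field names, site IDs, partner labels, and the YYYY-MM period format are all hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical identifiers: 14 sites and two partners, as in the case study.
KNOWN_SITES = {f"site_{n:02d}" for n in range(1, 15)}
KNOWN_PARTNERS = {"partner_one", "partner_two"}
PERIOD_PATTERN = re.compile(r"^\d{4}-(0[1-9]|1[0-2])$")  # assumed YYYY-MM format


@dataclass(frozen=True)
class IntakeRecord:
    """One row of the shared template: one measure, one site, one month."""
    partner: str
    site_id: str
    period: str
    measure: str
    value: float
    note: str = ""


def validate(record: IntakeRecord) -> list[str]:
    """Return a list of problems; an empty list means the row is usable as-is."""
    problems = []
    if record.partner not in KNOWN_PARTNERS:
        problems.append(f"unknown partner: {record.partner!r}")
    if record.site_id not in KNOWN_SITES:
        problems.append(f"unknown site: {record.site_id!r}")
    if not PERIOD_PATTERN.match(record.period):
        problems.append(f"period must be YYYY-MM, got {record.period!r}")
    if not record.measure.strip():
        problems.append("measure name is missing")
    return problems


good = IntakeRecord("partner_two", "site_03", "2024-03", "coaching_sessions_held", 6)
bad = IntakeRecord("partner_two", "site_99", "March", "", 6)

print(validate(good))
print(validate(bad))
```

The design choice here is one row per measure instead of one column per measure, so that assessment results and coaching logs can share a format even though they track different things.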

PHASE THREE - AMPLIFY ——

How we know it worked — and what good looks like.

Amplification means the organization gets more from the same effort — and the changes stick. We measure what matters, track early wins, and design for sustainability from day one.

IMMEDIATE - 30 DAYS

Data intake time cut by at least 75%

From 6–8 hours of monthly normalization to under 90 minutes.
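The headline percentage follows directly from the stated hours. As a quick check, this sketch treats "under 90 minutes" as exactly 1.5 hours (an assumption that makes the result a lower bound) and computes the reduction at both ends of the 6–8 hour range.

```python
def percent_reduction(before_hours, after_hours):
    """Share of the original time that is saved, as a percentage."""
    return (before_hours - after_hours) / before_hours * 100


AFTER_HOURS = 1.5  # "under 90 minutes", taken at its upper limit

for before in (6, 8):
    saved = percent_reduction(before, AFTER_HOURS)
    print(f"{before} h -> {AFTER_HOURS} h: {saved:.2f}% reduction")
```

Starting from 6 hours the reduction is 75%; starting from 8 hours it is 81.25%. Any result faster than 90 minutes pushes both figures higher.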

NEAR-TERM - 60 DAYS

Leadership has real-time visibility

The ED references the standing brief without being asked. Board prep happens days ahead, not the night before.

MID-TERM - 90 DAYS

Funder review delivered with confidence

The team presents clear evidence of progress without last-minute scrambling.

SUSTAINED - 6 MONTHS

Zero data emergencies

No more reactive scrambles to produce information that should exist in the normal rhythm.

"Amplify doesn't mean the program runs itself. It means the people running it have what they need to make the next decision without calling us first."

WORK WITH UPSTREAM

Your data is telling a story.
Let's make sure the right people can read it.

Upstream Solutions works with organizations where understanding isn't optional — and where the gap between what you know and what you can act on is costing real momentum.



Upstream Solutions

Complex content, human-centered learning.
