Hackathon 2 — Path 2 of 3

Create

Build a practical AI tool or prompt for evaluation work

🕐 15–30 minutes · 3 options to choose from · Free AI tool of your choice

What this path involves

The Create path is for practitioners who want to make something tangible with AI. You'll design, test, and share a prompt, chatbot, or small tool that has real utility in evaluation contexts — not a toy, but not enterprise software either.

The goal is to produce a working artifact that other evaluators could use or adapt. Your submission goes to the public gallery as a reusable community resource, so keep the intended user in mind and, where relevant, include clear instructions for using the tool.

Choose your option

🕐 10–20 min

Prompt Lab

Choose one of the five structured challenges below. Each one gives you a specific evaluation task to tackle through prompt engineering — no coding, just craft.

Steps:

  1. Pick one challenge from the list below.
  2. Write a prompt (or series of prompts) to accomplish the challenge task in any free AI tool.
  3. Test it, iterate, refine.
  4. Save your best prompt + the AI's output to share in your submission.

Include in your submission:

  • Which challenge you chose
  • Your final prompt (paste it)
  • The AI's best output (paste or link)
  • A 140-character reflection on what worked

🕐 20–30 min

Custom Chatbot: Build a Tiny Helper

Use a no-code AI platform (Claude, ChatGPT, or similar) to build a custom chatbot with a system prompt tailored to a specific evaluation task — a survey QA assistant, an ethics check bot, or something you need right now.

Steps:

  1. Identify an evaluation task where a chatbot would help. Keep it narrow — "helps check whether interview questions are leading" is better than "helps with evaluation."
  2. Write a system prompt that defines the bot's role, constraints, and how it should handle edge cases.
  3. Test it: throw three realistic user inputs at it and see how it responds. Revise the system prompt as needed.
  4. Document what you built: the system prompt + 2–3 example exchanges.
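
For anyone who prefers to draft the system prompt in a structured way before pasting it into a no-code platform, the steps above can be sketched as follows. This is a minimal illustration, not a required format: the `BotSpec` class, its field names, and the example wording are all assumptions made for this sketch, and nothing here calls an AI service.

```python
from dataclasses import dataclass, field


@dataclass
class BotSpec:
    """Hypothetical container for the three parts of a system prompt (step 2)."""
    role: str
    constraints: list[str] = field(default_factory=list)
    edge_cases: list[str] = field(default_factory=list)

    def to_system_prompt(self) -> str:
        # Render the spec as plain text that can be pasted into any chatbot builder.
        lines = [f"Role: {self.role}", "", "Constraints:"]
        lines += [f"- {c}" for c in self.constraints]
        lines += ["", "Edge cases:"]
        lines += [f"- {e}" for e in self.edge_cases]
        return "\n".join(lines)


# Illustrative content only; replace with your own narrow task.
spec = BotSpec(
    role="You check whether interview questions are leading.",
    constraints=[
        "Comment only on question wording, not on the study design.",
        "Suggest one neutral rewording for each leading question.",
    ],
    edge_cases=[
        "If the input is not an interview question, say so and stop.",
        "If you are unsure whether a question is leading, explain why.",
    ],
)
system_prompt = spec.to_system_prompt()

# Step 3: three realistic inputs to try by hand against the published bot.
test_inputs = [
    "Don't you agree the training improved your confidence?",
    "What changed for you after joining the program?",
    "Please summarize our annual report.",
]

print(system_prompt)
```

Keeping the role, constraints, and edge cases as separate lists makes it easier to see which part you changed between iterations.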

Include in your submission:

  • The bot's purpose (one sentence)
  • Your system prompt
  • Link to the published bot or pasted example exchanges
  • A 140-character reflection

🕐 20–30 min

Vibe-Coding Studio

Use an AI-assisted coding environment (Claude Artifacts, ChatGPT Canvas, or similar) to build a small tool — a data viz, a rubric generator, a checklist — without writing traditional code. Describe what you want; let the AI build it.

Steps:

  1. Pick a concrete tool: a rubric template, a data cleaning helper, a visual summary of a theory of change, etc.
  2. Describe what you want to an AI tool in plain language. Be specific about inputs, outputs, and format.
  3. Iterate: ask the AI to fix, improve, or extend what it built. Note what worked and what required multiple attempts.
  4. Publish or export your artifact and grab a shareable link.

Include in your submission:

  • What you built and why it's useful
  • A link to the published artifact
  • Key prompts you used to get there
  • A 140-character reflection

Five Prompt Lab Challenges

Pick any one of the five challenges below. Each is completable in 10–15 minutes using any free AI tool (ChatGPT, Claude, Gemini, etc.). No prior AI experience required — the challenge is in crafting the prompt, not the technology.

Challenge 1: The Plain-Language Translator

~10 min

Take a dense, jargon-heavy evaluation finding (from a real report or one you compose) and craft a prompt that gets an AI to rewrite it for a non-specialist community audience. The challenge is to instruct the AI to preserve the core finding while removing barriers to understanding — without dumbing it down or introducing errors.

Intent: A good prompt will specify the target audience clearly, set a reading-level expectation, ask the AI to flag any hedging it removes, and produce plain prose — not bullet points.

Challenge 2: Interview Question Architect

~12 min

Start from the sample theory of change for a youth workforce program shown below (or write your own). Craft a prompt to generate two stakeholder interview question sets: one for program staff and one for participants. The AI should produce open-ended, culturally aware questions directly tied to the theory of change's key assumptions.

Intent: A strong prompt will specify the stakeholder roles, the number of questions, the methodological grounding (semi-structured interview), and ask the AI to annotate which assumption each question probes.

When young people from under-resourced communities receive structured job-readiness training and paid work experience, they develop the professional skills and social capital needed to secure and sustain employment — ultimately reducing barriers to economic mobility.

Challenge 3: The Nuance Keeper

~10 min

Craft a prompt that gets an AI to summarize qualitative interview data without flattening divergent perspectives or over-generalizing from one or two voices. Use the example transcript excerpt below, or describe your own dataset in a few sentences — no file uploads required.

Intent: A good prompt will instruct the AI to surface both dominant themes and outliers, preserve participant voice through direct quotes, and explicitly flag where it is uncertain or where data is thin.

Interviewer: What changed for you after joining the program?

Participant: Honestly, I didn't think I'd stick with it. But after the first month, I started showing up differently. Like, I knew what I was doing. My supervisor noticed. She said I was one of the most reliable people on the team. That meant a lot because nobody ever said that to me before.

Interviewer: What do you think made the difference?

Participant: The practice interviews, for sure. And just having someone in my corner who believed I could do it. I never had that.
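
Optional, for those comfortable with a little Python: one way to check that an AI summary preserves participant voice is to confirm that every quoted passage appears verbatim in the transcript. The sketch below is an assumption-laden illustration, not part of the challenge. The function name `unverified_quotes`, the shortened transcript, and the sample summary are all made up for this example, and the check only catches quotes wrapped in double quotation marks.

```python
import re


def _normalize(text: str) -> str:
    # Unify curly quotes and apostrophes, collapse whitespace, ignore case.
    text = text.replace("\u2019", "'").replace("\u201c", '"').replace("\u201d", '"')
    return " ".join(text.split()).lower()


def unverified_quotes(summary: str, transcript: str) -> list[str]:
    """Return quoted passages from the summary that are not found in the transcript."""
    source = _normalize(transcript)
    quoted = re.findall(r'"([^"]+)"', _normalize(summary))
    # Trailing punctuation inside the quotation marks is ignored when matching.
    return [q for q in quoted if q.strip(" .,;!?") not in source]


# Shortened from the excerpt above.
transcript = (
    "She said I was one of the most reliable people on the team. "
    "The practice interviews, for sure. "
    "And just having someone in my corner who believed I could do it."
)

# Hypothetical AI summary containing one real quote and one invented quote.
summary = (
    'A participant credited "the practice interviews, for sure" and said '
    'a supervisor called them "the hardest worker on the team".'
)

print(unverified_quotes(summary, transcript))
```

A quote that fails this check is not necessarily fabricated (the AI may have trimmed or lightly edited it), but it is a useful prompt to go back and compare against the source.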

Challenge 4: The AI Skeptic

~15 min

An AI tool has produced a program recommendation based on evaluation data (sample provided below). Your task: craft a prompt that gets the same AI to critique its own recommendation — identifying weak assumptions, missing contextual factors, and at least two potential unintended consequences of acting on it.

Intent: A strong prompt will assign the AI a specific skeptical role (e.g., "act as a program officer who is skeptical of this finding"), set a minimum number of risks to identify, and ask it to rank them by severity. Bonus: prompt the AI to suggest what additional evidence would change its assessment.

We recommend the program adopt an AI-powered case management system to automatically flag at-risk youth based on attendance and engagement data, reducing the burden on case managers and enabling earlier intervention.
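
If it helps to see the elements from the Intent laid out together, the following Python sketch assembles them into a single critique prompt. It is one possible arrangement under stated assumptions: `build_critique_prompt`, its parameters, and the exact wording of each instruction are hypothetical, and you would still paste the result into the AI tool yourself.

```python
def build_critique_prompt(recommendation: str, role: str, min_risks: int = 2) -> str:
    """Assemble a self-critique prompt from the elements named in the Intent."""
    if min_risks < 2:
        # The challenge asks for at least two unintended consequences.
        raise ValueError("min_risks must be 2 or more")
    return "\n".join([
        f"Act as {role}.",
        "Critique the recommendation below, which you produced earlier.",
        "1. List the weak assumptions it relies on.",
        "2. Name contextual factors it leaves out.",
        f"3. Identify at least {min_risks} potential unintended consequences "
        "and rank them by severity.",
        "4. Say what additional evidence would change your assessment.",
        "",
        f"Recommendation: {recommendation}",
    ])


recommendation = (
    "We recommend the program adopt an AI-powered case management system to "
    "automatically flag at-risk youth based on attendance and engagement data, "
    "reducing the burden on case managers and enabling earlier intervention."
)

prompt = build_critique_prompt(
    recommendation,
    role="a program officer who is skeptical of this finding",
    min_risks=3,
)
print(prompt)
```

Raising `min_risks` is a quick way to test whether the AI keeps producing distinct risks or starts repeating itself.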

Challenge 5: Ethical Use Statement Drafter

~12 min

Your organization is beginning to use AI tools in an evaluation engagement for a government client. Craft a prompt that helps you draft a brief ethical use statement — one you could actually share with the client — covering data privacy, informed consent, attribution, and known AI limitations relevant to evaluation contexts.

Intent: A good prompt will produce a statement that is honest about AI's role in the work, accessible to a non-technical audience, and grounded in specific evaluation ethics guidance (e.g., the American Evaluation Association's Guiding Principles for Evaluators), and that avoids both over-promising and reflexive disclaimers that say nothing.

Before you submit

Every Create submission needs these five things:

  • One-line claim — what did you make or prove? (max 140 characters)
  • Evidence of work — your prompt(s), system prompt, or description of what you built
  • Link (if applicable) — a shareable link to your artifact, bot, or AI output
  • Ethics check — confirm your submission contains no PII or real client data
  • 140-character reflection — what worked, what surprised you, or what you'd do differently
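
For a quick self-check before submitting, the checklist above can be roughly approximated in Python. Treat this as a sketch under assumptions: the function `check_submission`, its arguments, and the two regular expressions are invented for this example and are not part of the submission form. The patterns catch only obvious email addresses and US-style phone numbers, so a clean result does not replace reading your submission for PII or client data yourself.

```python
import re

MAX_CHARS = 140  # limit stated for the one-line claim and the reflection

# Deliberately naive patterns; they will miss names, addresses, IDs, and more.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\d{3}[\s.-]\d{3}[\s.-]\d{4}")


def check_submission(claim: str, evidence: str, reflection: str) -> list[str]:
    """Return a list of problems found; an empty list means no problems were detected."""
    problems = []
    if len(claim) > MAX_CHARS:
        problems.append(f"claim is {len(claim)} characters (max {MAX_CHARS})")
    if len(reflection) > MAX_CHARS:
        problems.append(f"reflection is {len(reflection)} characters (max {MAX_CHARS})")
    if not evidence.strip():
        problems.append("evidence of work is empty")
    for label, pattern in (("email address", EMAIL), ("phone number", PHONE)):
        if pattern.search(evidence):
            problems.append(f"possible {label} in evidence")
    return problems


problems = check_submission(
    claim="A prompt that rewrites findings in plain language without dropping caveats.",
    evidence="Prompt: Rewrite this finding for residents. Contact jane.doe@example.org for data.",
    reflection="Naming the audience mattered most.",
)
print(problems)
```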

Submit your Create entry →