18. AI for Science
Course Positioning
This course teaches how AI can support the scientific workflow: literature review, hypothesis generation, data analysis, simulation, lab/field experiment planning, surrogate modeling, Bayesian optimization, scientific agents, reproducibility, and responsible research. It is designed for AI-for-science builders and domain scientists, not just general AI users.
Learning outcomes
- Map scientific workflows into AI-assistable components: literature, data, models, simulations, experiments, and manuscripts.
- Use LLMs for structured literature review, hypothesis generation, protocol drafting, and scientific critique with verification.
- Apply AI/ML concepts such as embeddings, surrogate models, active learning, Bayesian optimization, and simulation-in-the-loop workflows.
- Design evaluation metrics and oracle/feedback loops for scientific discovery systems.
- Build a small AI-for-science project proposal or prototype with reproducibility and ethics plan.
Expanded Topic-by-Topic Coverage
Module 1. The AI-for-science landscape
Module focus: Foundation models for science, lab automation, climate, biology, materials, medicine, mathematics, scientific agents. Primary live activity or lab: Map one research area to AI opportunity types. Expected take-home output: AI-for-science opportunity map.
Topics and coverage
Foundation models for science
- What it means: explain how foundation models for science change the interaction between human intent, model behavior, external information, and final output.
- What to cover: inputs, constraints, examples, output format, grounding, iteration, failure modes, and when a human must intervene.
- Demonstration: show a weak attempt, a stronger structured attempt, and a reviewed final version with explicit checks.
- Evidence of learning: learners create a reusable prompt, schema, retrieval note, or workflow pattern and test it on at least two examples.
Lab automation
- What it means: define lab automation clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Climate
- What it means: define AI for climate science clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Biology
- What it means: define AI for biology clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Materials
- What it means: define AI for materials science clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Medicine
- What it means: define AI for medicine clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Mathematics
- What it means: define AI for mathematics clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Scientific agents
- What it means: explain how scientific agents change the interaction between human intent, model behavior, external information, and final output.
- What to cover: inputs, constraints, examples, output format, grounding, iteration, failure modes, and when a human must intervene.
- Demonstration: show a weak attempt, a stronger structured attempt, and a reviewed final version with explicit checks.
- Evidence of learning: learners create a reusable prompt, schema, retrieval note, or workflow pattern and test it on at least two examples.
Practice and evidence of learning
- Learners complete or discuss: Map one research area to AI opportunity types.
- Learners produce: AI-for-science opportunity map; a minimal schema sketch follows this list.
- Instructor checks for accuracy, practical usefulness, clear assumptions, appropriate human review, and fit with the course audience.
- Learners revise once after feedback so the module contributes to the final project, portfolio, or playbook.
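One lightweight way to capture the opportunity map is as structured data rather than free text, so it can be reused in later modules. The sketch below is a minimal, hypothetical schema; the field names are illustrative, not prescribed by the course.

```python
# A minimal, illustrative schema for an AI-for-science opportunity map.
# Field names are hypothetical; adapt them to the research area being mapped.
from dataclasses import dataclass

@dataclass
class Opportunity:
    workflow_stage: str      # e.g., "literature", "data", "simulation", "experiment"
    task: str                # the concrete scientific task
    ai_role: str             # "draft", "rank", "extract", "optimize", "automate"
    human_checkpoint: str    # what a human must verify before the output is used
    risk_notes: str = ""     # dual-use, data-sensitivity, or overclaiming concerns

opportunity_map = [
    Opportunity("literature", "screen 500 abstracts for inclusion",
                "rank", "spot-check 10% of exclusions"),
    Opportunity("experiment", "suggest next synthesis conditions",
                "optimize", "domain scientist approves each batch"),
]

for opp in opportunity_map:
    print(f"{opp.workflow_stage}: {opp.task} -> AI role: {opp.ai_role}")
```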
Minimum coverage before moving on
- Learners can explain the module vocabulary without relying on tool-generated text.
- Learners have seen one worked example, one hands-on application, and one limitation or failure case.
- Learners know what must be verified, what data must be protected, and who remains accountable for the output.
Module 2. Scientific literature workflows
Module focus: Search, screening, extraction, citation graphs, claims, contradictions, evidence tables. Primary live activity or lab: Build a structured literature matrix. Expected take-home output: Literature evidence table.
Topics and coverage
Search
- What it means: define literature search clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Screening
- What it means: define screening clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Extraction
- What it means: define extraction clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Citation graphs
- What it means: define citation graphs clearly and connect them to the module focus above; a small graph-building sketch follows this block.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
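To make the citation-graph demonstration concrete in a notebook, a small directed graph is enough. The sketch below assumes the networkx library is available; the paper IDs are invented.

```python
# Minimal citation-graph sketch using networkx (paper IDs are invented).
# An edge A -> B means "paper A cites paper B".
import networkx as nx

g = nx.DiGraph()
g.add_edges_from([
    ("smith2021", "lee2019"),
    ("smith2021", "park2018"),
    ("chen2022", "lee2019"),
    ("chen2022", "smith2021"),
])

# In-degree approximates citation count within this local graph;
# it measures local connectivity, not quality.
for paper, citations in sorted(g.in_degree(), key=lambda x: -x[1]):
    print(f"{paper}: cited {citations} time(s) in this subgraph")
```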
Claims
- What it means: define claims clearly and connect them to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Contradictions
- What it means: define contradictions clearly and connect them to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Evidence tables
- What it means: define evidence tables clearly and connect them to the module focus above; a row-schema sketch follows this block.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
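The literature matrix lab translates naturally into a typed row schema. The sketch below shows one possible shape; the fields are suggestions to align with your screening protocol, not a standard.

```python
# One possible row schema for a literature evidence table.
# Field names are suggestions; align them with your screening protocol.
from dataclasses import dataclass

@dataclass
class EvidenceRow:
    citation: str            # e.g., "Doe et al. 2020"
    claim: str               # the specific claim the paper supports or disputes
    method: str              # study design or experimental method
    sample_or_scale: str     # n, dataset size, or system studied
    direction: str           # "supports", "contradicts", "mixed"
    evidence_strength: str   # e.g., "strong", "moderate", "weak" per your rubric
    open_questions: str = ""

row = EvidenceRow(
    citation="Doe et al. 2020",
    claim="Additive X raises yield above 80 C",
    method="randomized bench trials",
    sample_or_scale="n=24 runs",
    direction="supports",
    evidence_strength="moderate",
)
print(row)
```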
Practice and evidence of learning
- Learners complete or discuss: Build a structured literature matrix.
- Learners produce: Literature evidence table.
- Instructor checks for accuracy, practical usefulness, clear assumptions, appropriate human review, and fit with the course audience.
- Learners revise once after feedback so the module contributes to the final project, portfolio, or playbook.
Minimum coverage before moving on
- Learners can explain the module vocabulary without relying on tool-generated text.
- Learners have seen one worked example, one hands-on application, and one limitation or failure case.
- Learners know what must be verified, what data must be protected, and who remains accountable for the output.
Module 3. Hypothesis generation and critique
Module focus: LLM brainstorming, mechanistic reasoning, falsifiability, novelty, confounders, failure modes. Primary live activity or lab: Generate hypotheses and critique them against evidence. Expected take-home output: Hypothesis shortlist.
Topics and coverage
LLM brainstorming
- What it means: explain how LLM brainstorming changes the interaction between human intent, model behavior, external information, and final output.
- What to cover: inputs, constraints, examples, output format, grounding, iteration, failure modes, and when a human must intervene.
- Demonstration: show a weak attempt, a stronger structured attempt, and a reviewed final version with explicit checks.
- Evidence of learning: learners create a reusable prompt, schema, retrieval note, or workflow pattern and test it on at least two examples.
Mechanistic reasoning
- What it means: define mechanistic reasoning clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Falsifiability
- What it means: define falsifiability clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Novelty
- What it means: define novelty clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Confounders
- What it means: define confounders clearly and connect them to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Failure modes
- What it means: define failure modes clearly and connect them to the module focus above; a structured hypothesis-record sketch follows this block.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
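This module's topics combine into a single structured hypothesis record: a claim plus mechanism, a falsifiable prediction, and the confounders that could explain the result away. A minimal sketch, with illustrative field names:

```python
# Minimal structured hypothesis record; fields mirror this module's topics.
from dataclasses import dataclass, field

@dataclass
class Hypothesis:
    claim: str                      # what is being asserted
    mechanism: str                  # proposed causal mechanism
    falsifiable_prediction: str     # an observation that would refute the claim
    confounders: list[str] = field(default_factory=list)
    novelty_note: str = ""          # how it differs from prior work
    known_failure_modes: str = ""   # how the test itself could mislead

h = Hypothesis(
    claim="Compound A inhibits enzyme E in vivo",
    mechanism="binds the allosteric site identified in vitro",
    falsifiable_prediction="no activity change in an E-knockout line",
    confounders=["off-target binding", "dose-dependent toxicity"],
)
print(h.falsifiable_prediction)
```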
Practice and evidence of learning
- Learners complete or discuss: Generate hypotheses and critique them against evidence.
- Learners produce: Hypothesis shortlist.
- Instructor checks for accuracy, practical usefulness, clear assumptions, appropriate human review, and fit with the course audience.
- Learners revise once after feedback so the module contributes to the final project, portfolio, or playbook.
Minimum coverage before moving on
- Learners can explain the module vocabulary without relying on tool-generated text.
- Learners have seen one worked example, one hands-on application, and one limitation or failure case.
- Learners know what must be verified, what data must be protected, and who remains accountable for the output.
Module 4. Data and representation
Module focus: Scientific datasets, metadata, embeddings, ontologies, features, leakage, provenance. Primary live activity or lab: Design a data schema for a scientific problem. Expected take-home output: Data card.
Topics and coverage
Scientific datasets
- What it means: connect scientific datasets to the data lifecycle from source and structure through analysis, interpretation, and decision-making.
- What to cover: source reliability, missing or biased data, leakage, assumptions, calculations, and the difference between correlation and decision-ready evidence.
- Demonstration: walk through a small dataset or example table and mark the checks required before trusting the result.
- Evidence of learning: learners produce a short analysis note that includes assumptions, limitations, and verification steps.
Metadata
- What it means: connect metadata to the data lifecycle from source and structure through analysis, interpretation, and decision-making.
- What to cover: source reliability, missing or biased data, leakage, assumptions, calculations, and the difference between correlation and decision-ready evidence.
- Demonstration: walk through a small dataset or example table and mark the checks required before trusting the result.
- Evidence of learning: learners produce a short analysis note that includes assumptions, limitations, and verification steps.
Embeddings
- What it means: explain how embeddings change the interaction between human intent, model behavior, external information, and final output; a similarity sketch follows this block.
- What to cover: inputs, constraints, examples, output format, grounding, iteration, failure modes, and when a human must intervene.
- Demonstration: show a weak attempt, a stronger structured attempt, and a reviewed final version with explicit checks.
- Evidence of learning: learners create a reusable prompt, schema, retrieval note, or workflow pattern and test it on at least two examples.
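For a notebook demonstration that needs no model download, TF-IDF vectors can stand in for learned embeddings; the cosine-similarity step stays the same when you later swap in a domain embedding model. A minimal sketch using scikit-learn:

```python
# Embedding-style similarity sketch. TF-IDF stands in for learned embeddings
# so the example runs offline; a real workflow would swap in a domain
# embedding model and keep the cosine-similarity step unchanged.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

abstracts = [
    "Perovskite solar cells degrade under humidity.",
    "Encapsulation improves perovskite stability in humid air.",
    "Transformer models predict protein secondary structure.",
]

vectors = TfidfVectorizer().fit_transform(abstracts)
sims = cosine_similarity(vectors)

# Nearest neighbor of the first abstract (excluding itself).
best = sims[0, 1:].argmax() + 1
print(f"Most similar to abstract 0: abstract {best} (score {sims[0, best]:.2f})")
```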
Ontologies
- What it means: define ontologies clearly and connect them to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Features
- What it means: connect features to the data lifecycle from source and structure through analysis, interpretation, and decision-making.
- What to cover: source reliability, missing or biased data, leakage, assumptions, calculations, and the difference between correlation and decision-ready evidence.
- Demonstration: walk through a small dataset or example table and mark the checks required before trusting the result.
- Evidence of learning: learners produce a short analysis note that includes assumptions, limitations, and verification steps.
Leakage
- What it means: define leakage clearly and connect it to the module focus above; a group-split sketch follows this block.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
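A concrete leakage demonstration: when several samples come from the same source (batch, patient, site), a random split puts near-duplicates in both train and test, and scores look better than they are. Splitting by group avoids this. A sketch using scikit-learn on synthetic data:

```python
# Group-aware splitting to avoid leakage: samples from the same source
# (batch, patient, instrument run) must not straddle train and test.
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)
n = 100
groups = rng.integers(0, 10, size=n)   # 10 sources, ~10 samples each
X = rng.normal(size=(n, 5))
y = rng.normal(size=n)

splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=groups))

# Verify: no group appears on both sides of the split.
assert set(groups[train_idx]).isdisjoint(groups[test_idx])
print(f"train groups: {sorted(set(groups[train_idx]))}")
print(f"test groups:  {sorted(set(groups[test_idx]))}")
```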
Provenance
- What it means: define provenance clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Practice and evidence of learning
- Learners complete or discuss: Design a data schema for a scientific problem.
- Learners produce: Data card.
- Instructor checks for accuracy, practical usefulness, clear assumptions, appropriate human review, and fit with the course audience.
- Learners revise once after feedback so the module contributes to the final project, portfolio, or playbook.
Minimum coverage before moving on
- Learners can explain the module vocabulary without relying on tool-generated text.
- Learners have seen one worked example, one hands-on application, and one limitation or failure case.
- Learners know what must be verified, what data must be protected, and who remains accountable for the output.
Module 5. Models, simulators, and surrogates
Module focus: Physical simulators, ML surrogates, uncertainty, calibration, validation, multi-objective problems. Primary live activity or lab: Sketch simulator + surrogate workflow. Expected take-home output: Modeling plan.
Topics and coverage
Physical simulators
- What it means: define physical simulators clearly and connect them to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
ML surrogates
- What it means: define ML surrogates clearly and connect them to the module focus above; a surrogate-fitting sketch follows this block.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
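A minimal surrogate demonstration: fit a Gaussian process to a handful of evaluations of a cheap stand-in "simulator" and read out both predictions and uncertainty. A sketch using scikit-learn; the simulator function is invented so the example runs anywhere:

```python
# Surrogate-model sketch: a Gaussian process emulates an "expensive" simulator.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def simulator(x):
    # Stand-in for an expensive physics code or wet-lab measurement.
    return np.sin(3 * x) + 0.5 * x

X_train = np.linspace(0, 2, 6).reshape(-1, 1)   # six "expensive" runs
y_train = simulator(X_train).ravel()

gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(), normalize_y=True)
gp.fit(X_train, y_train)

X_query = np.linspace(0, 2, 5).reshape(-1, 1)
mean, std = gp.predict(X_query, return_std=True)
for x, m, s in zip(X_query.ravel(), mean, std):
    # std is the surrogate's own uncertainty; it must still be validated
    # against held-out simulator runs before anyone trusts it.
    print(f"x={x:.2f}  prediction={m:+.2f} +/- {s:.2f}")
```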
Uncertainty
- What it means: define uncertainty clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Calibration
- What it means: define calibration clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Validation
- What it means: define validation clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Multi-objective problems
- What it means: define multi-objective problems clearly and connect them to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Practice and evidence of learning
- Learners complete or discuss: Sketch simulator + surrogate workflow.
- Learners produce: Modeling plan.
- Instructor checks for accuracy, practical usefulness, clear assumptions, appropriate human review, and fit with the course audience.
- Learners revise once after feedback so the module contributes to the final project, portfolio, or playbook.
Minimum coverage before moving on
- Learners can explain the module vocabulary without relying on tool-generated text.
- Learners have seen one worked example, one hands-on application, and one limitation or failure case.
- Learners know what must be verified, what data must be protected, and who remains accountable for the output.
Module 6. Active learning and Bayesian optimization
Module focus: Acquisition functions, exploration/exploitation, expensive experiments, constraints, batch design. Primary live activity or lab: Design an active-learning loop for a fictional experiment. Expected take-home output: Bayesian optimization (BO) loop diagram.
Topics and coverage
Acquisition functions
- What it means: define acquisition functions clearly and connect them to the module focus above; an expected-improvement sketch follows this block.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
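Expected improvement (EI) is one standard acquisition function and makes a good first demonstration. The sketch below computes EI from a Gaussian-process posterior on a toy maximization objective; everything beyond scikit-learn and SciPy is invented for illustration:

```python
# Expected-improvement sketch: score candidate points by how much they are
# expected to improve on the best observation, given a GP posterior.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def objective(x):
    # Invented toy objective standing in for an expensive experiment.
    return -(x - 0.6) ** 2

X_obs = np.array([[0.1], [0.4], [0.9]])      # three completed "experiments"
y_obs = objective(X_obs).ravel()

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2)).fit(X_obs, y_obs)

def expected_improvement(X_cand, best_y, xi=0.01):
    mean, std = gp.predict(X_cand, return_std=True)
    std = np.maximum(std, 1e-9)              # avoid division by zero
    z = (mean - best_y - xi) / std
    # Closed form for maximization: EI = (mu - f* - xi) Phi(z) + sigma phi(z)
    return (mean - best_y - xi) * norm.cdf(z) + std * norm.pdf(z)

X_cand = np.linspace(0, 1, 101).reshape(-1, 1)
ei = expected_improvement(X_cand, y_obs.max())
print(f"next suggested x: {X_cand[ei.argmax()][0]:.2f}")
```

Note how the suggestion balances a promising posterior mean against unexplored, high-uncertainty regions; that balance is the exploration/exploitation tradeoff covered next.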
Exploration/exploitation
- What it means: define the exploration/exploitation tradeoff clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Expensive experiments
- What it means: define expensive experiments clearly and connect them to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Constraints
- What it means: define constraints clearly and connect them to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Batch design
- What it means: show where batch design appears in the learner's real workflow and which parts are judgment-heavy versus draftable.
- What to cover: current workflow, pain points, AI-assisted steps, human review checkpoints, quality standard, and ownership of the final decision.
- Demonstration: convert one messy real-world input into a structured brief, draft, analysis, checklist, or next action.
- Evidence of learning: learners produce a reusable template or playbook entry that can be used after the course.
Practice and evidence of learning
- Learners complete or discuss: Design an active-learning loop for a fictional experiment.
- Learners produce: BO loop diagram.
- Instructor checks for accuracy, practical usefulness, clear assumptions, appropriate human review, and fit with the course audience.
- Learners revise once after feedback so the module contributes to the final project, portfolio, or playbook.
Minimum coverage before moving on
- Learners can explain the module vocabulary without relying on tool-generated text.
- Learners have seen one worked example, one hands-on application, and one limitation or failure case.
- Learners know what must be verified, what data must be protected, and who remains accountable for the output.
Module 7. Scientific agents and tools
Module focus: Tool-using agents, literature tools, code tools, databases, experiment planning, lab notebooks. Primary live activity or lab: Design an agent harness for a scientific task. Expected take-home output: Scientific agent spec.
Topics and coverage
Tool-using agents
- What it means: explain how tool-using agents change the interaction between human intent, model behavior, external information, and final output; a harness skeleton follows this block.
- What to cover: inputs, constraints, examples, output format, grounding, iteration, failure modes, and when a human must intervene.
- Demonstration: show a weak attempt, a stronger structured attempt, and a reviewed final version with explicit checks.
- Evidence of learning: learners create a reusable prompt, schema, retrieval note, or workflow pattern and test it on at least two examples.
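The agent-harness idea can be demonstrated without any model API: a registry of tools plus a loop that dispatches requested calls and logs them. In the sketch below the "model" is a scripted stub; a real harness would put an LLM in its place and keep the same registry, dispatch, and audit-log structure:

```python
# Minimal agent-harness skeleton. The "model" is a scripted stub so the
# example runs offline; a real harness would call an LLM here but keep the
# same tool registry, dispatch loop, and audit log.

def search_literature(query: str) -> str:
    return f"[stub] top hits for '{query}'"       # placeholder tool

def run_analysis(expr: str) -> str:
    return f"[stub] result of {expr}"             # placeholder tool

TOOLS = {"search_literature": search_literature, "run_analysis": run_analysis}

# Scripted stand-in for model decisions: (tool name, argument) pairs.
scripted_plan = [
    ("search_literature", "perovskite humidity degradation"),
    ("run_analysis", "mean(degradation_rate)"),
]

audit_log = []
for tool_name, arg in scripted_plan:
    if tool_name not in TOOLS:
        raise ValueError(f"unknown tool requested: {tool_name}")  # refuse, don't guess
    result = TOOLS[tool_name](arg)
    audit_log.append((tool_name, arg, result))    # every call is recorded for review

for entry in audit_log:
    print(entry)
```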
Literature tools
- What it means: define literature tools clearly and connect them to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Code tools
- What it means: define code tools clearly and connect them to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Databases
- What it means: connect databases to the data lifecycle from source and structure through analysis, interpretation, and decision-making.
- What to cover: source reliability, missing or biased data, leakage, assumptions, calculations, and the difference between correlation and decision-ready evidence.
- Demonstration: walk through a small dataset or example table and mark the checks required before trusting the result.
- Evidence of learning: learners produce a short analysis note that includes assumptions, limitations, and verification steps.
Experiment planning
- What it means: define experiment planning clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Lab notebooks
- What it means: define lab notebooks clearly and connect them to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Practice and evidence of learning
- Learners complete or discuss: Design an agent harness for a scientific task.
- Learners produce: Scientific agent spec.
- Instructor checks for accuracy, practical usefulness, clear assumptions, appropriate human review, and fit with the course audience.
- Learners revise once after feedback so the module contributes to the final project, portfolio, or playbook.
Minimum coverage before moving on
- Learners can explain the module vocabulary without relying on tool-generated text.
- Learners have seen one worked example, one hands-on application, and one limitation or failure case.
- Learners know what must be verified, what data must be protected, and who remains accountable for the output.
Module 8. Evaluation and reproducibility
Module focus: Benchmarks, held-out tests, baselines, ablations, uncertainty, reproducible pipelines, data/version control. Primary live activity or lab: Create an evaluation plan for a proposed system. Expected take-home output: Evaluation checklist.
Topics and coverage
Benchmarks
- What it means: define benchmarks clearly and connect them to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Held-out tests
- What it means: define held-out tests clearly and connect them to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Baselines
- What it means: define baselines clearly and connect them to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Ablations
- What it means: define ablations clearly and connect them to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Uncertainty
- What it means: define uncertainty clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Reproducible pipelines
- What it means: define reproducible pipelines clearly and connect them to the module focus above; a seed-and-hash sketch follows this block.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
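One small habit that makes pipelines auditable: fix random seeds and record a content hash of the exact input data next to every result. A minimal sketch; the demo file and the run-record fields are invented for illustration:

```python
# Reproducibility sketch: fix randomness and fingerprint the input data so a
# result can be traced to the exact bytes and seed that produced it.
import hashlib
import json
import random

import numpy as np

SEED = 42
random.seed(SEED)
np.random.seed(SEED)

def sha256_of_file(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# Create a tiny demo file so the sketch runs anywhere; in practice you hash
# the real input dataset instead.
with open("measurements.csv", "w") as f:
    f.write("run,value\n1,0.42\n")

run_record = {
    "seed": SEED,
    "data_sha256": sha256_of_file("measurements.csv"),
    "code_version": "fill in from your version control, e.g. a commit hash",
}
print(json.dumps(run_record, indent=2))
```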
Data/version control
- What it means: connect data/version control to the data lifecycle from source and structure through analysis, interpretation, and decision-making.
- What to cover: source reliability, missing or biased data, leakage, assumptions, calculations, and the difference between correlation and decision-ready evidence.
- Demonstration: walk through a small dataset or example table and mark the checks required before trusting the result.
- Evidence of learning: learners produce a short analysis note that includes assumptions, limitations, and verification steps.
Practice and evidence of learning
- Learners complete or discuss: Create an evaluation plan for a proposed system.
- Learners produce: Evaluation checklist.
- Instructor checks for accuracy, practical usefulness, clear assumptions, appropriate human review, and fit with the course audience.
- Learners revise once after feedback so the module contributes to the final project, portfolio, or playbook.
Minimum coverage before moving on
- Learners can explain the module vocabulary without relying on tool-generated text.
- Learners have seen one worked example, one hands-on application, and one limitation or failure case.
- Learners know what must be verified, what data must be protected, and who remains accountable for the output.
Module 9. Ethics, safety, and dual-use
Module focus: Biosecurity, research integrity, data rights, environmental impact, overclaiming, responsible release. Primary live activity or lab: Risk assessment for an AI-for-science project. Expected take-home output: Responsible research note.
Topics and coverage
Biosecurity
- What it means in this course: define biosecurity in operational terms, not as an abstract principle.
- What to cover: sensitive data boundaries, affected stakeholders, approval paths, documentation, and what researchers, graduate students, postdocs, R&D teams, and scientific software builders must never delegate blindly to AI.
- Use case: present one acceptable use, one borderline use, and one prohibited use, then ask learners to justify the classification.
- Evidence of learning: learners add a risk control, review step, or escalation rule to their course project.
Research integrity
- What it means: show where research integrity appears in the learner's real workflow and which parts are judgment-heavy versus draftable.
- What to cover: current workflow, pain points, AI-assisted steps, human review checkpoints, quality standard, and ownership of the final decision.
- Demonstration: convert one messy real-world input into a structured brief, draft, analysis, checklist, or next action.
- Evidence of learning: learners produce a reusable template or playbook entry that can be used after the course.
Data rights
- What it means in this course: define data rights in operational terms, not as an abstract principle.
- What to cover: sensitive data boundaries, affected stakeholders, approval paths, documentation, and what researchers, graduate students, postdocs, R&D teams, and scientific software builders must never delegate blindly to AI.
- Use case: present one acceptable use, one borderline use, and one prohibited use, then ask learners to justify the classification.
- Evidence of learning: learners add a risk control, review step, or escalation rule to their course project.
Environmental impact
- What it means: define environmental impact clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Overclaiming
- What it means: define overclaiming clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Responsible release
- What it means: define responsible release clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Practice and evidence of learning
- Learners complete or discuss: Risk assessment for an AI-for-science project.
- Learners produce: Responsible research note.
- Instructor checks for accuracy, practical usefulness, clear assumptions, appropriate human review, and fit with the course audience.
- Learners revise once after feedback so the module contributes to the final project, portfolio, or playbook.
Minimum coverage before moving on
- Learners can explain the module vocabulary without relying on tool-generated text.
- Learners have seen one worked example, one hands-on application, and one limitation or failure case.
- Learners know what must be verified, what data must be protected, and who remains accountable for the output.
Module 10. Capstone proposal/prototype
Module focus: Problem, data, models, oracle, loop, metrics, risks, roadmap. Primary live activity or lab: Present an AI-for-science project. Expected take-home output: Proposal or prototype.
Topics and coverage
Problem
- What it means: define the problem statement clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Data
- What it means: connect data to the data lifecycle from source and structure through analysis, interpretation, and decision-making.
- What to cover: source reliability, missing or biased data, leakage, assumptions, calculations, and the difference between correlation and decision-ready evidence.
- Demonstration: walk through a small dataset or example table and mark the checks required before trusting the result.
- Evidence of learning: learners produce a short analysis note that includes assumptions, limitations, and verification steps.
Models
- What it means: place models inside the AI system stack so learners know what problems they solve and what tradeoffs they introduce.
- What to cover: inputs, outputs, system boundaries, evaluation criteria, cost or latency implications, and common failure cases.
- Demonstration: use a diagram, small code sample, worksheet, or tool trace to make the mechanism visible.
- Evidence of learning: learners compare two approaches and explain which one they would choose for a realistic constraint.
Oracle
- What it means: define the oracle, the source of ground-truth feedback, clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Loop
- What it means: define the discovery loop clearly and connect it to the module focus above; a loop skeleton follows this block.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
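The propose-evaluate-update loop that ties the capstone together can be sketched in a few lines. Here random search stands in for the proposal model and an invented function stands in for the oracle; a real project swaps in its own model, experiment, or simulator:

```python
# Closed-loop skeleton: propose -> evaluate against the oracle -> update.
# Random proposals and an invented oracle keep the sketch self-contained.
import random

random.seed(0)

def oracle(x: float) -> float:
    # Stand-in for ground truth: an experiment, simulator, or held-out data.
    return -(x - 0.3) ** 2

history = []                      # the loop's memory: (candidate, score) pairs
best = (None, float("-inf"))

for step in range(20):
    candidate = random.uniform(0, 1)        # "propose" (replace with model/BO)
    score = oracle(candidate)               # "evaluate" (the expensive step)
    history.append((candidate, score))      # "update" what the proposer sees
    if score > best[1]:
        best = (candidate, score)

print(f"best candidate after 20 evaluations: x={best[0]:.3f}, score={best[1]:.4f}")
```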
Metrics
- What it means: define metrics clearly and connect them to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Risks
- What it means in this course: define risks in operational terms, not as an abstract principle.
- What to cover: sensitive data boundaries, affected stakeholders, approval paths, documentation, and what researchers, graduate students, postdocs, R&D teams, and scientific software builders must never delegate blindly to AI.
- Use case: present one acceptable use, one borderline use, and one prohibited use, then ask learners to justify the classification.
- Evidence of learning: learners add a risk control, review step, or escalation rule to their course project.
Roadmap
- What it means: define the roadmap clearly and connect it to the module focus above.
- What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
- Demonstration: give one simple example, one realistic example, and one failure or limitation example.
- Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
Practice and evidence of learning
- Learners complete or discuss: Present an AI-for-science project.
- Learners produce: Proposal or prototype.
- Instructor checks for accuracy, practical usefulness, clear assumptions, appropriate human review, and fit with the course audience.
- Learners revise once after feedback so the module contributes to the final project, portfolio, or playbook.
Minimum coverage before moving on
- Learners can explain the module vocabulary without relying on tool-generated text.
- Learners have seen one worked example, one hands-on application, and one limitation or failure case.
- Learners know what must be verified, what data must be protected, and who remains accountable for the output.
Labs, projects, and assessments
- Lab 1: Literature matrix with claims, evidence strength, contradictions, and open questions.
- Lab 2: Design a closed-loop scientific discovery workflow with data, model, oracle, and experiment/simulation feedback.
- Lab 3: Build a small notebook prototype or no-code workflow for scientific extraction, analysis, or optimization.
- Capstone: AI-for-science mini-proposal or prototype with evaluation, reproducibility, and responsible research plan.
Evaluation approach
- 20% literature and evidence matrix.
- 20% hypothesis and critique exercise.
- 20% active learning or simulator workflow.
- 20% evaluation/reproducibility plan.
- 20% capstone proposal or prototype.
Recommended tools and materials
- AI assistant, literature databases, Zotero, Python notebooks, domain datasets, Git, experiment tracking, optional BO libraries and vector search.
- Optional: domain-specific foundation models, simulation tools, and lab information systems.
Safety, ethics, and governance emphasis
- Scientific claims generated by AI must be checked against primary literature or experiments.
- Avoid unsafe biological, chemical, clinical, or dual-use operational instructions.
- Require transparent documentation of data provenance, prompts, code, model versions, and limitations.
Delivery notes
- Adapt examples to the audience: biology, climate, materials, agriculture, medicine, economics, or math.
- This course can become a proposal-writing incubator for research grants.
Instructor Build Checklist
- Prepare one short demo for each module and one learner activity that creates a saved artifact.
- Prepare examples that match the audience, local context, and likely tools learners can access.
- Add a verification step to every AI-generated output: factual check, source check, data sensitivity check, and quality review.
- Keep a running portfolio folder so each module contributes to the final project or learner playbook.
- Reserve time for reflection on what the learner did, what AI did, what was checked, and what remains uncertain.