AI Curriculum

9. AI Hardware: How AI Works on Chips, Servers, and Data Centers

Audience: Engineers, founders, investors, students, IT leaders, procurement teams, and technically curious professionals
Duration: 8-10 weeks, with optional hardware lab extensions
Modules: 10

Course Positioning

A technical systems course that explains the physical and computational stack behind AI: chips, memory, networking, compilers, servers, data centers, and inference economics.

Learning outcomes

  • Explain why AI workloads are dominated by matrix multiplication, memory bandwidth, parallelism, and data movement.
  • Compare CPUs, GPUs, TPUs, NPUs, ASICs, FPGAs, edge accelerators, and data center clusters.
  • Understand training vs inference hardware requirements, precision formats, batching, caching, and latency constraints.
  • Analyze how memory, networking, storage, cooling, and power delivery shape AI system performance and cost.
  • Estimate the hardware and cloud cost implications of model size, context length, throughput, and service-level targets.
  • Understand the AI hardware value chain from chip design to deployment.

Course Design Snapshot

  • Positioning: A technical systems course that explains the physical and computational stack behind AI: chips, memory, networking, compilers, servers, data centers, and inference economics.
  • Audience: Engineers, founders, investors, students, IT leaders, procurement teams, and technically curious professionals.
  • Duration: 8-10 weeks, with optional hardware lab extensions.
  • Prerequisites: Basic computer architecture helpful but not required. Some math and Python familiarity recommended.
  • Format: Concept lectures, diagrams, hardware teardown videos, profiling labs, cost modeling, and architecture comparisons.

Expanded Topic-by-Topic Coverage

Module 1. From transistor to tensor

Module focus: From transistor to tensor: bits, floating point, matrix multiplication, parallelism, and why AI loves accelerators.

Primary live activity or lab: Manually compute a tiny matrix multiply and estimate operation count.
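
For instructors who want a worked reference for this lab, a minimal sketch in plain Python (no libraries assumed; the matrices are illustrative) shows the 2 x m x n x k operation count:

```python
# Tiny matrix multiply with an explicit operation count.
# An (m x k) by (k x n) multiply does k multiplies and k adds
# per output element, so roughly 2 * m * n * k FLOPs in total.

def matmul_with_count(A, B):
    m, k = len(A), len(A[0])
    k2, n = len(B), len(B[0])
    assert k == k2, "inner dimensions must match"
    C = [[0.0] * n for _ in range(m)]
    flops = 0
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += A[i][p] * B[p][j]  # one multiply + one add
                flops += 2
            C[i][j] = acc
    return C, flops

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
C, flops = matmul_with_count(A, B)
print(C)      # [[19.0, 22.0], [43.0, 50.0]]
print(flops)  # 16 = 2 * 2 * 2 * 2
```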

Topics and coverage

bits

  • What it means: define bits clearly and connect it to the module focus: From transistor to tensor: bits, floating point, matrix multiplication, parallelism, and why AI loves accelerators.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

floating point

  • What it means: define floating point clearly and connect it to the module focus: From transistor to tensor: bits, floating point, matrix multiplication, parallelism, and why AI loves accelerators.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.
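
A quick demonstration idea for this topic, assuming NumPy is available: the same value stored at lower precision loses accuracy and range, which is the tradeoff behind fp16 and bf16 in training and inference.

```python
import numpy as np

# One value at three precisions: fewer bits mean less accuracy and
# less memory (and therefore less bandwidth) per element.
x = 1 / 3
for dtype in (np.float64, np.float32, np.float16):
    v = dtype(x)
    print(f"{np.dtype(dtype).name:>8}: {float(v)!r}  ({np.dtype(dtype).itemsize} bytes)")

# float16 also runs out of range early (max is about 65504); bfloat16,
# which base NumPy lacks, keeps float32's range in 16 bits instead.
print(np.float16(70000.0))  # inf
```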

matrix multiplication

  • What it means: define matrix multiplication clearly and connect it to the module focus: From transistor to tensor: bits, floating point, matrix multiplication, parallelism, and why AI loves accelerators.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

parallelism

  • What it means: define parallelism clearly and connect it to the module focus: From transistor to tensor: bits, floating point, matrix multiplication, parallelism, and why AI loves accelerators.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

why AI loves accelerators

  • What it means: define why AI loves accelerators clearly and connect it to the module focus: From transistor to tensor: bits, floating point, matrix multiplication, parallelism, and why AI loves accelerators.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

Practice and evidence of learning

  • Learners complete or discuss: Manually compute a tiny matrix multiply and estimate operation count.
  • Learners produce a saved artifact from that activity.
  • Instructor checks for accuracy, practical usefulness, clear assumptions, appropriate human review, and fit with the course audience.
  • Learners revise once after feedback so the module contributes to the final project, portfolio, or playbook.

Minimum coverage before moving on

  • Learners can explain the module vocabulary without relying on tool-generated text.
  • Learners have seen one worked example, one hands-on application, and one limitation or failure case.
  • Learners know what must be verified, what data must be protected, and who remains accountable for the output.

Module 2. CPU vs GPU

Module focus: CPU vs GPU: cores, SIMD/SIMT, memory bandwidth, kernels, CUDA intuition, and why GPUs won deep learning.

Primary live activity or lab: Profile a CPU vs GPU matrix multiplication if hardware is available.
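
If hardware is available, the profiling lab can start from a sketch like the one below; it assumes PyTorch, though any similar framework works. Absolute numbers depend entirely on the machine; the lesson is the relative gap and why asynchronous GPU work must be synchronized before timing.

```python
import time
import torch

n = 2048
a = torch.randn(n, n)
b = torch.randn(n, n)

# CPU matmul.
t0 = time.perf_counter()
c = a @ b
cpu_s = time.perf_counter() - t0
print(f"CPU: {cpu_s*1e3:.1f} ms  (~{2*n**3/cpu_s/1e9:.0f} GFLOP/s)")

# GPU matmul, if present. GPU launches are asynchronous, so a fair
# timing needs a warm-up and an explicit synchronize.
if torch.cuda.is_available():
    ag, bg = a.cuda(), b.cuda()
    _ = ag @ bg                 # warm-up: kernel load, allocator
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    cg = ag @ bg
    torch.cuda.synchronize()
    gpu_s = time.perf_counter() - t0
    print(f"GPU: {gpu_s*1e3:.2f} ms  (~{2*n**3/gpu_s/1e9:.0f} GFLOP/s)")
```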

Topics and coverage

cores

  • What it means: define cores clearly and connect it to the module focus: CPU vs GPU: cores, SIMD/SIMT, memory bandwidth, kernels, CUDA intuition, and why GPUs won deep learning.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

SIMD/SIMT

  • What it means: define SIMD/SIMT clearly and connect it to the module focus: CPU vs GPU: cores, SIMD/SIMT, memory bandwidth, kernels, CUDA intuition, and why GPUs won deep learning.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

memory bandwidth

  • What it means: define memory bandwidth clearly and connect it to the module focus: CPU vs GPU: cores, SIMD/SIMT, memory bandwidth, kernels, CUDA intuition, and why GPUs won deep learning.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

kernels

  • What it means: define kernels clearly and connect it to the module focus: CPU vs GPU: cores, SIMD/SIMT, memory bandwidth, kernels, CUDA intuition, and why GPUs won deep learning.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

CUDA intuition

  • What it means: define CUDA intuition clearly and connect it to the module focus: CPU vs GPU: cores, SIMD/SIMT, memory bandwidth, kernels, CUDA intuition, and why GPUs won deep learning.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

why GPUs won deep learning

  • What it means: place why GPUs won deep learning inside the AI system stack so learners know what problem it solves and what tradeoffs it introduces.
  • What to cover: inputs, outputs, system boundaries, evaluation criteria, cost or latency implications, and common failure cases.
  • Demonstration: use a diagram, small code sample, worksheet, or tool trace to make the mechanism visible.
  • Evidence of learning: learners compare two approaches and explain which one they would choose for a realistic constraint.

Practice and evidence of learning

  • Learners complete or discuss: Profile a CPU vs GPU matrix multiplication if hardware is available.
  • Learners produce a saved artifact from that activity.
  • Instructor checks for accuracy, practical usefulness, clear assumptions, appropriate human review, and fit with the course audience.
  • Learners revise once after feedback so the module contributes to the final project, portfolio, or playbook.

Minimum coverage before moving on

  • Learners can explain the module vocabulary without relying on tool-generated text.
  • Learners have seen one worked example, one hands-on application, and one limitation or failure case.
  • Learners know what must be verified, what data must be protected, and who remains accountable for the output.

Module 3. TPUs, NPUs, ASICs, and edge accelerators

Module focus: TPUs, NPUs, ASICs, and edge accelerators: systolic arrays, specialization, energy efficiency, and deployment constraints.

Primary live activity or lab: Compare accelerator types for mobile, cloud training, and real-time inference.

Topics and coverage

systolic arrays

  • What it means: define systolic arrays clearly and connect it to the module focus: TPUs, NPUs, ASICs, and edge accelerators: systolic arrays, specialization, energy efficiency, and deployment constraints.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

specialization

  • What it means: define specialization clearly and connect it to the module focus: TPUs, NPUs, ASICs, and edge accelerators: systolic arrays, specialization, energy efficiency, and deployment constraints.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

energy efficiency

  • What it means: define energy efficiency clearly and connect it to the module focus: TPUs, NPUs, ASICs, and edge accelerators: systolic arrays, specialization, energy efficiency, and deployment constraints.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

deployment constraints

  • What it means: place deployment constraints inside the AI system stack so learners know what problem it solves and what tradeoffs it introduces.
  • What to cover: inputs, outputs, system boundaries, evaluation criteria, cost or latency implications, and common failure cases.
  • Demonstration: use a diagram, small code sample, worksheet, or tool trace to make the mechanism visible.
  • Evidence of learning: learners compare two approaches and explain which one they would choose for a realistic constraint.

Practice and evidence of learning

  • Learners complete or discuss: Compare accelerator types for mobile, cloud training, and real-time inference.
  • Learners produce a saved artifact from that activity.
  • Instructor checks for accuracy, practical usefulness, clear assumptions, appropriate human review, and fit with the course audience.
  • Learners revise once after feedback so the module contributes to the final project, portfolio, or playbook.

Minimum coverage before moving on

  • Learners can explain the module vocabulary without relying on tool-generated text.
  • Learners have seen one worked example, one hands-on application, and one limitation or failure case.
  • Learners know what must be verified, what data must be protected, and who remains accountable for the output.

Module 4. Memory hierarchy

Module focus: Memory hierarchy: registers, SRAM, HBM, DRAM, storage, KV cache, context length, and the memory wall.

Primary live activity or lab: Estimate KV cache memory for different model sizes and context lengths.
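
The estimation lab reduces to one formula: per token, each layer stores one key and one value vector per KV head. The sketch below uses illustrative 7B-class shapes (32 layers, 32 KV heads of dimension 128, fp16), not any specific product's specs.

```python
# KV cache size: 2 (K and V) * layers * kv_heads * head_dim * bytes
# per token, multiplied by sequence length and batch size.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch, bytes_per_elem=2):
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return per_token * seq_len * batch

for seq_len in (4_096, 32_768, 128_000):
    gb = kv_cache_bytes(32, 32, 128, seq_len, batch=1) / 1e9
    print(f"context {seq_len:>7,}: ~{gb:5.1f} GB per sequence")
# context 4,096 is ~2.1 GB; at 128,000 tokens this passes 67 GB,
# which is why long contexts collide with the memory wall.
```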

Topics and coverage

registers

  • What it means: define registers clearly and connect it to the module focus: Memory hierarchy: registers, SRAM, HBM, DRAM, storage, KV cache, context length, and the memory wall.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

SRAM

  • What it means: define SRAM clearly and connect it to the module focus: Memory hierarchy: registers, SRAM, HBM, DRAM, storage, KV cache, context length, and the memory wall.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

HBM

  • What it means: define HBM clearly and connect it to the module focus: Memory hierarchy: registers, SRAM, HBM, DRAM, storage, KV cache, context length, and the memory wall.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

DRAM

  • What it means: define DRAM clearly and connect it to the module focus: Memory hierarchy: registers, SRAM, HBM, DRAM, storage, KV cache, context length, and the memory wall.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

storage

  • What it means: define storage clearly and connect it to the module focus: Memory hierarchy: registers, SRAM, HBM, DRAM, storage, KV cache, context length, and the memory wall.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

KV cache

  • What it means: define KV cache clearly and connect it to the module focus: Memory hierarchy: registers, SRAM, HBM, DRAM, storage, KV cache, context length, and the memory wall.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

context length

  • What it means: define context length clearly and connect it to the module focus: Memory hierarchy: registers, SRAM, HBM, DRAM, storage, KV cache, context length, and the memory wall.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

the memory wall

  • What it means: define the memory wall clearly and connect it to the module focus: Memory hierarchy: registers, SRAM, HBM, DRAM, storage, KV cache, context length, and the memory wall.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

Practice and evidence of learning

  • Learners complete or discuss: Estimate KV cache memory for different model sizes and context lengths.
  • Learners produce a saved artifact from that activity.
  • Instructor checks for accuracy, practical usefulness, clear assumptions, appropriate human review, and fit with the course audience.
  • Learners revise once after feedback so the module contributes to the final project, portfolio, or playbook.

Minimum coverage before moving on

  • Learners can explain the module vocabulary without relying on tool-generated text.
  • Learners have seen one worked example, one hands-on application, and one limitation or failure case.
  • Learners know what must be verified, what data must be protected, and who remains accountable for the output.

Module 5. Training stack

Module focus: Training stack: data pipelines, distributed training, tensor/data/pipeline parallelism, checkpointing, and interconnect.

Primary live activity or lab: Diagram a distributed training system and identify bottlenecks.
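
One bottleneck the diagramming lab can quantify is gradient traffic in data parallelism. A minimal sketch, assuming a ring all-reduce, fp16 gradients, and an illustrative link speed:

```python
# Ring all-reduce moves about 2 * (N-1)/N * gradient_bytes per worker
# per step, so traffic approaches 2x the gradient size as N grows.

def allreduce_seconds(params, n_workers, bytes_per_grad=2, link_gbps=400):
    grad_bytes = params * bytes_per_grad
    traffic = 2 * (n_workers - 1) / n_workers * grad_bytes
    return traffic / (link_gbps / 8 * 1e9)  # Gbit/s -> bytes/s

params = 7e9  # a hypothetical 7B-parameter model
for n in (8, 64, 512):
    print(f"{n:>3} workers: ~{allreduce_seconds(params, n)*1e3:.0f} ms per all-reduce")
# Without overlapping communication and compute, this time is paid
# on every optimizer step -- hence the emphasis on interconnect.
```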

Topics and coverage

data pipelines

  • What it means: connect data pipelines to the data lifecycle from source and structure through analysis, interpretation, and decision-making.
  • What to cover: source reliability, missing or biased data, leakage, assumptions, calculations, and the difference between correlation and decision-ready evidence.
  • Demonstration: walk through a small dataset or example table and mark the checks required before trusting the result.
  • Evidence of learning: learners produce a short analysis note that includes assumptions, limitations, and verification steps.

distributed training

  • What it means: place distributed training inside the AI system stack so learners know what problem it solves and what tradeoffs it introduces.
  • What to cover: inputs, outputs, system boundaries, evaluation criteria, cost or latency implications, and common failure cases.
  • Demonstration: use a diagram, small code sample, worksheet, or tool trace to make the mechanism visible.
  • Evidence of learning: learners compare two approaches and explain which one they would choose for a realistic constraint.

tensor/data/pipeline parallelism

  • What it means: place tensor/data/pipeline parallelism inside the AI system stack so learners know what problem it solves and what tradeoffs it introduces.
  • What to cover: inputs, outputs, system boundaries, evaluation criteria, cost or latency implications, and common failure cases.
  • Demonstration: use a diagram, small code sample, worksheet, or tool trace to make the mechanism visible.
  • Evidence of learning: learners compare two approaches and explain which one they would choose for a realistic constraint.

checkpointing

  • What it means: define checkpointing clearly and connect it to the module focus: Training stack: data pipelines, distributed training, tensor/data/pipeline parallelism, checkpointing, and interconnect.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

interconnect

  • What it means: define interconnect clearly and connect it to the module focus: Training stack: data pipelines, distributed training, tensor/data/pipeline parallelism, checkpointing, and interconnect.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

Practice and evidence of learning

  • Learners complete or discuss: Diagram a distributed training system and identify bottlenecks.
  • Learners produce a saved artifact from that activity.
  • Instructor checks for accuracy, practical usefulness, clear assumptions, appropriate human review, and fit with the course audience.
  • Learners revise once after feedback so the module contributes to the final project, portfolio, or playbook.

Minimum coverage before moving on

  • Learners can explain the module vocabulary without relying on tool-generated text.
  • Learners have seen one worked example, one hands-on application, and one limitation or failure case.
  • Learners know what must be verified, what data must be protected, and who remains accountable for the output.

Module 6. Inference stack

Module focus: Inference stack: batching, quantization, speculative decoding, model routing, caching, streaming, and serving.

Primary live activity or lab: Build an inference cost and latency worksheet for an LLM API service.
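
The worksheet can start from two functions, one for cost and one for time-to-last-token. Every price and token rate below is a placeholder to be replaced with real vendor numbers.

```python
def request_cost(in_tokens, out_tokens, usd_per_m_in, usd_per_m_out):
    # API pricing is usually quoted per million input and output tokens.
    return in_tokens / 1e6 * usd_per_m_in + out_tokens / 1e6 * usd_per_m_out

def request_latency_s(in_tokens, out_tokens, prefill_tps, decode_tps):
    # Prefill processes the prompt in parallel; decode is sequential,
    # so long outputs dominate time-to-last-token.
    return in_tokens / prefill_tps + out_tokens / decode_tps

cost = request_cost(1_500, 400, usd_per_m_in=0.50, usd_per_m_out=1.50)
lat = request_latency_s(1_500, 400, prefill_tps=5_000, decode_tps=60)
print(f"${cost:.4f} per request  ->  ${cost*1000:.2f} per 1,000 requests")
print(f"~{lat:.1f} s to last token")
```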

Topics and coverage

batching

  • What it means: define batching clearly and connect it to the module focus: Inference stack: batching, quantization, speculative decoding, model routing, caching, streaming, and serving.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

quantization

  • What it means: define quantization clearly and connect it to the module focus: Inference stack: batching, quantization, speculative decoding, model routing, caching, streaming, and serving.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

speculative decoding

  • What it means: define speculative decoding clearly and connect it to the module focus: Inference stack: batching, quantization, speculative decoding, model routing, caching, streaming, and serving.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

model routing

  • What it means: place model routing inside the AI system stack so learners know what problem it solves and what tradeoffs it introduces.
  • What to cover: inputs, outputs, system boundaries, evaluation criteria, cost or latency implications, and common failure cases.
  • Demonstration: use a diagram, small code sample, worksheet, or tool trace to make the mechanism visible.
  • Evidence of learning: learners compare two approaches and explain which one they would choose for a realistic constraint.

caching

  • What it means: define caching clearly and connect it to the module focus: Inference stack: batching, quantization, speculative decoding, model routing, caching, streaming, and serving.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

streaming

  • What it means: define streaming clearly and connect it to the module focus: Inference stack: batching, quantization, speculative decoding, model routing, caching, streaming, and serving.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

serving

  • What it means: define serving clearly and connect it to the module focus: Inference stack: batching, quantization, speculative decoding, model routing, caching, streaming, and serving.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

Practice and evidence of learning

  • Learners complete or discuss: Build an inference cost and latency worksheet for an LLM API service.
  • Learners produce a saved artifact from that activity.
  • Instructor checks for accuracy, practical usefulness, clear assumptions, appropriate human review, and fit with the course audience.
  • Learners revise once after feedback so the module contributes to the final project, portfolio, or playbook.

Minimum coverage before moving on

  • Learners can explain the module vocabulary without relying on tool-generated text.
  • Learners have seen one worked example, one hands-on application, and one limitation or failure case.
  • Learners know what must be verified, what data must be protected, and who remains accountable for the output.

Module 7. Networking and data centers

Module focus: Networking and data centers: NVLink, InfiniBand, Ethernet, rack design, power, cooling, uptime, and operations.

Primary live activity or lab: Design a simplified AI cluster architecture and power budget.
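
A starting point for the power-budget half of the lab; the wattage, host overhead, and PUE below are illustrative placeholders, not vendor specifications.

```python
gpus          = 1024
watts_per_gpu = 700    # accelerator board power (placeholder)
host_overhead = 0.35   # CPUs, NICs, fans: extra draw per GPU (placeholder)
pue           = 1.25   # facility power / IT power (placeholder)

it_load_mw  = gpus * watts_per_gpu * (1 + host_overhead) / 1e6
facility_mw = it_load_mw * pue
annual_mwh  = facility_mw * 24 * 365  # MW * hours = MWh
print(f"IT load:    {it_load_mw:.2f} MW")
print(f"Facility:   {facility_mw:.2f} MW at PUE {pue}")
print(f"Annual use: {annual_mwh:,.0f} MWh")
```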

Topics and coverage

NVLink

  • What it means: define NVLink clearly and connect it to the module focus: Networking and data centers: NVLink, InfiniBand, Ethernet, rack design, power, cooling, uptime, and operations.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

InfiniBand

  • What it means: define InfiniBand clearly and connect it to the module focus: Networking and data centers: NVLink, InfiniBand, Ethernet, rack design, power, cooling, uptime, and operations.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

Ethernet

  • What it means: define Ethernet clearly and connect it to the module focus: Networking and data centers: NVLink, InfiniBand, Ethernet, rack design, power, cooling, uptime, and operations.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

rack design

  • What it means: place rack design inside the AI system stack so learners know what problem it solves and what tradeoffs it introduces.
  • What to cover: inputs, outputs, system boundaries, evaluation criteria, cost or latency implications, and common failure cases.
  • Demonstration: use a diagram, small code sample, worksheet, or tool trace to make the mechanism visible.
  • Evidence of learning: learners compare two approaches and explain which one they would choose for a realistic constraint.

power

  • What it means: define power clearly and connect it to the module focus: Networking and data centers: NVLink, InfiniBand, Ethernet, rack design, power, cooling, uptime, and operations.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

cooling

  • What it means: define cooling clearly and connect it to the module focus: Networking and data centers: NVLink, InfiniBand, Ethernet, rack design, power, cooling, uptime, and operations.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

uptime

  • What it means: define uptime clearly and connect it to the module focus: Networking and data centers: NVLink, InfiniBand, Ethernet, rack design, power, cooling, uptime, and operations.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

operations

  • What it means: show where operations appears in the learner's real workflow and which parts are judgment-heavy versus draftable.
  • What to cover: current workflow, pain points, AI-assisted steps, human review checkpoints, quality standard, and ownership of the final decision.
  • Demonstration: convert one messy real-world input into a structured brief, draft, analysis, checklist, or next action.
  • Evidence of learning: learners produce a reusable template or playbook entry that can be used after the course.

Practice and evidence of learning

  • Learners complete or discuss: Design a simplified AI cluster architecture and power budget.
  • Learners produce a saved artifact from that activity.
  • Instructor checks for accuracy, practical usefulness, clear assumptions, appropriate human review, and fit with the course audience.
  • Learners revise once after feedback so the module contributes to the final project, portfolio, or playbook.

Minimum coverage before moving on

  • Learners can explain the module vocabulary without relying on tool-generated text.
  • Learners have seen one worked example, one hands-on application, and one limitation or failure case.
  • Learners know what must be verified, what data must be protected, and who remains accountable for the output.

Module 8. Compilers and software

Module focus: Compilers and software: CUDA, XLA, Triton, graph optimization, kernels, quantization libraries, and deployment runtimes.

Primary live activity or lab: Trace how a model operation becomes hardware instructions at a conceptual level.
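
To make the trace concrete, a minimal Triton kernel shows the path from Python-like code to GPU machine code; vector addition stands in for a real model operation. This assumes an NVIDIA GPU with torch and triton installed.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n, BLOCK: tl.constexpr):
    pid = tl.program_id(axis=0)               # which program instance am I?
    offs = pid * BLOCK + tl.arange(0, BLOCK)  # the elements this instance owns
    mask = offs < n                           # guard the ragged last block
    x = tl.load(x_ptr + offs, mask=mask)
    y = tl.load(y_ptr + offs, mask=mask)
    tl.store(out_ptr + offs, x + y, mask=mask)

x = torch.randn(10_000, device="cuda")
y = torch.randn(10_000, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)        # number of program instances
add_kernel[grid](x, y, out, x.numel(), BLOCK=1024)
assert torch.allclose(out, x + y)
```

On NVIDIA hardware this source is lowered through PTX to machine instructions, which is the journey the lab traces conceptually.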

Topics and coverage

CUDA

  • What it means: define CUDA clearly and connect it to the module focus: Compilers and software: CUDA, XLA, Triton, graph optimization, kernels, quantization libraries, and deployment runtimes.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

XLA

  • What it means: define XLA clearly and connect it to the module focus: Compilers and software: CUDA, XLA, Triton, graph optimization, kernels, quantization libraries, and deployment runtimes.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

Triton

  • What it means: define Triton clearly and connect it to the module focus: Compilers and software: CUDA, XLA, Triton, graph optimization, kernels, quantization libraries, and deployment runtimes.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

graph optimization

  • What it means: place graph optimization inside the AI system stack so learners know what problem it solves and what tradeoffs it introduces.
  • What to cover: inputs, outputs, system boundaries, evaluation criteria, cost or latency implications, and common failure cases.
  • Demonstration: use a diagram, small code sample, worksheet, or tool trace to make the mechanism visible.
  • Evidence of learning: learners compare two approaches and explain which one they would choose for a realistic constraint.

kernels

  • What it means: define kernels clearly and connect it to the module focus: Compilers and software: CUDA, XLA, Triton, graph optimization, kernels, quantization libraries, and deployment runtimes.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

quantization libraries

  • What it means: define quantization libraries clearly and connect it to the module focus: Compilers and software: CUDA, XLA, Triton, graph optimization, kernels, quantization libraries, and deployment runtimes.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

deployment runtimes

  • What it means: place deployment runtimes inside the AI system stack so learners know what problem it solves and what tradeoffs it introduces.
  • What to cover: inputs, outputs, system boundaries, evaluation criteria, cost or latency implications, and common failure cases.
  • Demonstration: use a diagram, small code sample, worksheet, or tool trace to make the mechanism visible.
  • Evidence of learning: learners compare two approaches and explain which one they would choose for a realistic constraint.

Practice and evidence of learning

  • Learners complete or discuss: Trace how a model operation becomes hardware instructions at a conceptual level.
  • Learners produce a saved artifact from that activity.
  • Instructor checks for accuracy, practical usefulness, clear assumptions, appropriate human review, and fit with the course audience.
  • Learners revise once after feedback so the module contributes to the final project, portfolio, or playbook.

Minimum coverage before moving on

  • Learners can explain the module vocabulary without relying on tool-generated text.
  • Learners have seen one worked example, one hands-on application, and one limitation or failure case.
  • Learners know what must be verified, what data must be protected, and who remains accountable for the output.

Module 9. Hardware economics and geopolitics

Module focus: Hardware economics and geopolitics: capex, depreciation, supply chain, foundries, packaging, export controls, and cloud pricing.

Primary live activity or lab: Map the AI hardware value chain and identify strategic choke points.
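
For the capex, depreciation, and cloud pricing topics, a buy-versus-rent skeleton keeps the discussion quantitative. Every number below is a placeholder; the structure (amortized capex plus power, divided by utilization, against a cloud hourly rate) is the point.

```python
capex_per_gpu = 30_000   # purchase + share of server/network (placeholder)
amort_years   = 3        # straight-line depreciation period (placeholder)
power_kw      = 1.0      # GPU + host + cooling share (placeholder)
usd_per_kwh   = 0.10     # electricity price (placeholder)
cloud_usd_hr  = 4.00     # on-demand cloud rate (placeholder)
utilization   = 0.60     # fraction of owned hours doing useful work

hours_per_year = 24 * 365
own_usd_hr = capex_per_gpu / (amort_years * hours_per_year) + power_kw * usd_per_kwh
# Owned hardware is paid for every hour; cloud is paid only when used.
print(f"own:   ${own_usd_hr / utilization:.2f} per useful GPU-hour")
print(f"cloud: ${cloud_usd_hr:.2f} per useful GPU-hour")
```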

Topics and coverage

capex

  • What it means: define capex clearly and connect it to the module focus: Hardware economics and geopolitics: capex, depreciation, supply chain, foundries, packaging, export controls, and cloud pricing.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

depreciation

  • What it means: define depreciation clearly and connect it to the module focus: Hardware economics and geopolitics: capex, depreciation, supply chain, foundries, packaging, export controls, and cloud pricing.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

supply chain

  • What it means: define supply chain clearly and connect it to the module focus: Hardware economics and geopolitics: capex, depreciation, supply chain, foundries, packaging, export controls, and cloud pricing.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

foundries

  • What it means: define foundries clearly and connect it to the module focus: Hardware economics and geopolitics: capex, depreciation, supply chain, foundries, packaging, export controls, and cloud pricing.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

packaging

  • What it means: define packaging clearly and connect it to the module focus: Hardware economics and geopolitics: capex, depreciation, supply chain, foundries, packaging, export controls, and cloud pricing.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

export controls

  • What it means: define export controls clearly and connect it to the module focus: Hardware economics and geopolitics: capex, depreciation, supply chain, foundries, packaging, export controls, and cloud pricing.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

cloud pricing

  • What it means: define cloud pricing clearly and connect it to the module focus: Hardware economics and geopolitics: capex, depreciation, supply chain, foundries, packaging, export controls, and cloud pricing.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

Practice and evidence of learning

  • Learners complete or discuss: Map the AI hardware value chain and identify strategic choke points.
  • Learners produce a saved artifact from that activity.
  • Instructor checks for accuracy, practical usefulness, clear assumptions, appropriate human review, and fit with the course audience.
  • Learners revise once after feedback so the module contributes to the final project, portfolio, or playbook.

Minimum coverage before moving on

  • Learners can explain the module vocabulary without relying on tool-generated text.
  • Learners have seen one worked example, one hands-on application, and one limitation or failure case.
  • Learners know what must be verified, what data must be protected, and who remains accountable for the output.

Module 10. Future hardware

Module focus: Future hardware: photonics, neuromorphic ideas, wafer-scale systems, memory-centric compute, and edge AI.

Primary live activity or lab: Prepare a hardware roadmap thesis for one application domain.

Topics and coverage

photonics

  • What it means: define photonics clearly and connect it to the module focus: Future hardware: photonics, neuromorphic ideas, wafer-scale systems, memory-centric compute, and edge AI.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

neuromorphic ideas

  • What it means: define neuromorphic ideas clearly and connect it to the module focus: Future hardware: photonics, neuromorphic ideas, wafer-scale systems, memory-centric compute, and edge AI.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

wafer-scale systems

  • What it means: define wafer-scale systems clearly and connect it to the module focus: Future hardware: photonics, neuromorphic ideas, wafer-scale systems, memory-centric compute, and edge AI.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

memory-centric compute

  • What it means: define memory-centric compute clearly and connect it to the module focus: Future hardware: photonics, neuromorphic ideas, wafer-scale systems, memory-centric compute, and edge AI.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

edge AI

  • What it means: define edge AI clearly and connect it to the module focus: Future hardware: photonics, neuromorphic ideas, wafer-scale systems, memory-centric compute, and edge AI.
  • What to cover: the core concept, why it matters, what good usage looks like, and where learners are likely to misunderstand it.
  • Demonstration: give one simple example, one realistic example, and one failure or limitation example.
  • Evidence of learning: learners explain the topic in their own words and apply it to a small artifact or decision.

Practice and evidence of learning

  • Learners complete or discuss: Prepare a hardware roadmap thesis for one application domain.
  • Learners produce a saved artifact from that activity.
  • Instructor checks for accuracy, practical usefulness, clear assumptions, appropriate human review, and fit with the course audience.
  • Learners revise once after feedback so the module contributes to the final project, portfolio, or playbook.

Minimum coverage before moving on

  • Learners can explain the module vocabulary without relying on tool-generated text.
  • Learners have seen one worked example, one hands-on application, and one limitation or failure case.
  • Learners know what must be verified, what data must be protected, and who remains accountable for the output.

Core labs and builds

  • Matrix multiplication and FLOP estimation lab.
  • Quantization lab: compare size, speed, and quality tradeoffs (a starter sketch follows this list).
  • Inference economics lab: cost per 1,000 requests under different models and latency targets.
  • Hardware value-chain lab: chip designer, foundry, packaging, memory, networking, data center, cloud, application.
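
As a starter for the quantization lab, the sketch below applies symmetric int8 quantization to one weight matrix and reports the size and error tradeoff. Real libraries use finer-grained (for example, per-channel) schemes; this shows only the core idea.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=(4096, 4096)).astype(np.float32)

scale = np.abs(w).max() / 127.0               # one scale for the whole tensor
q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
w_hat = q.astype(np.float32) * scale          # dequantize for comparison

print(f"size: {w.nbytes/1e6:.0f} MB -> {q.nbytes/1e6:.0f} MB (4x smaller)")
print(f"max abs error: {np.abs(w - w_hat).max():.2e} (bounded by scale/2 = {scale/2:.2e})")
```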

Capstone

  • Design an AI hardware deployment plan for one workload such as chatbot inference, document AI, medical imaging, video generation, classroom AI lab, call-center agents, or edge camera inspection. The plan includes workload profile, hardware choice, cost model, bottlenecks, and scaling strategy.

Assessment design

  • Hardware comparison memo.
  • Memory and inference cost calculations.
  • Cluster architecture diagram.
  • Final deployment plan.

Supporting materials

  • Python notebooks, GPU profiler examples, cloud calculators, model parameter calculators, hardware spec sheets, compiler diagrams, and data-center architecture diagrams.

Instructor notes

  • This course is especially valuable for business and investing audiences because it reveals why AI economics depend on bottlenecks outside the model itself: memory, interconnect, energy, utilization, and supply chain.

Instructor Build Checklist

  • Prepare one short demo for each module and one learner activity that creates a saved artifact.
  • Prepare examples that match the audience, local context, and likely tools learners can access.
  • Add a verification step to every AI-generated output: factual check, source check, data sensitivity check, and quality review.
  • Keep a running portfolio folder so each module contributes to the final project or learner playbook.
  • Reserve time for reflection on what the learner did, what AI did, what was checked, and what remains uncertain.