Wave Lang Compiler Stack
Understanding the technology behind Wave Lang's high-performance GPU code generation. From Python DSL to optimized kernels.
Compilation Pipeline
Wave Lang transforms your Python code through multiple optimization stages to generate efficient GPU kernels.
Python DSL
Write high-level tensor operations using familiar Python syntax with type hints and decorators.
Analysis
Parse and analyze the computation graph, inferring shapes, data flow, and optimization opportunities.
IR Generation
Lower the computation graph to IREE's intermediate representation and apply hardware-agnostic optimizations.
GPU Kernel
Generate optimized machine code that executes efficiently on your target GPU architecture.
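To make the first two stages concrete, here is a minimal sketch of how a Python function can be traced into a computation graph with inferred shapes. It is not Wave Lang's implementation: the Node, Graph, Proxy, and trace names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    op: str          # operation name, e.g. "matmul"
    inputs: tuple    # ids of the producer nodes
    shape: tuple     # inferred output shape


class Graph:
    def __init__(self):
        self.nodes = []

    def add(self, op, inputs, shape):
        self.nodes.append(Node(op, inputs, shape))
        return len(self.nodes) - 1  # node id


class Proxy:
    """Stand-in for a tensor: records operations instead of computing them."""

    def __init__(self, graph, node_id):
        self.graph = graph
        self.node_id = node_id

    @property
    def shape(self):
        return self.graph.nodes[self.node_id].shape

    def __matmul__(self, other):
        (m, k), (k2, n) = self.shape, other.shape
        if k != k2:
            raise ValueError(f"inner dimensions differ: {k} vs {k2}")
        node = self.graph.add("matmul", (self.node_id, other.node_id), (m, n))
        return Proxy(self.graph, node)

    def __add__(self, other):
        if self.shape != other.shape:
            raise ValueError(f"shape mismatch: {self.shape} vs {other.shape}")
        node = self.graph.add("add", (self.node_id, other.node_id), self.shape)
        return Proxy(self.graph, node)


def trace(fn, *shapes):
    """Run fn on proxies so every operation lands in the graph."""
    graph = Graph()
    args = [Proxy(graph, graph.add("input", (), s)) for s in shapes]
    return graph, fn(*args)


def kernel(a, b, bias):
    # Ordinary Python syntax; nothing is computed during tracing.
    return a @ b + bias


graph, out = trace(kernel, (64, 32), (32, 16), (64, 16))
for node_id, node in enumerate(graph.nodes):
    print(node_id, node.op, node.inputs, node.shape)
```

The later stages (IR generation and GPU code generation) would consume a graph like this one; they are outside the scope of the sketch.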
Architecture Overview
[Figure: Wave Lang Compiler Stack]
The Wave Lang Philosophy: Separation of Concerns
What sets Wave Lang apart is how it separates kernel logic from scheduling and tiling decisions: the algorithm stays readable, and the schedule can be tuned independently.
Pure Kernel Logic
Write clean, mathematical expressions that focus purely on the computation you want to perform. No need to think about thread blocks, shared memory, or memory coalescing patterns.
Declarative Constraints
Specify how you want the computation scheduled through simple constraint objects. Control tiling, memory hierarchy, and parallelization without mixing it with your algorithm.
Easy Experimentation
Try different scheduling strategies by changing constraint parameters. There is no need to rewrite your kernel logic: the same mathematical code works with any scheduling approach.
Portable Performance
The same kernel logic can be optimized for different hardware by adjusting constraints. Move from development to production, or from AMD to NVIDIA, by changing only the constraints.
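The separation described above can be illustrated with a small pure-Python sketch. It does not use the Wave Lang API: Schedule, multiply_accumulate, and run_tiled are hypothetical names. The point is only that the kernel logic (one multiply-accumulate) never changes while the tiling parameters do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schedule:
    """Hypothetical scheduling parameters, kept apart from the math."""
    tile_m: int
    tile_n: int
    tile_k: int


def multiply_accumulate(acc, x, y):
    # The kernel logic: no tiles, no indices, just the computation.
    return acc + x * y


def run_tiled(kernel, a, b, schedule):
    """Apply the kernel over a matmul iteration space, one tile at a time."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for i0 in range(0, m, schedule.tile_m):
        for j0 in range(0, n, schedule.tile_n):
            for k0 in range(0, k, schedule.tile_k):
                for i in range(i0, min(i0 + schedule.tile_m, m)):
                    for j in range(j0, min(j0 + schedule.tile_n, n)):
                        for kk in range(k0, min(k0 + schedule.tile_k, k)):
                            c[i][j] = kernel(c[i][j], a[i][kk], b[kk][j])
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[1, 0], [0, 1], [1, 1]]

# Two different schedules, one unchanged kernel.
small_tiles = run_tiled(multiply_accumulate, a, b, Schedule(1, 1, 1))
large_tiles = run_tiled(multiply_accumulate, a, b, Schedule(2, 2, 2))
print(small_tiles == large_tiles)
```

On real hardware the two schedules would differ in performance rather than in results, which is what makes this kind of experimentation cheap.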
Traditional vs Wave Lang Approach
❌ Traditional C
• Thread indexing calculations
• Shared memory management
• Coalescing optimizations
• Block size considerations
• Hardware-specific tuning
✅ Wave Lang
Constraints: Scheduling decisions
• WorkgroupConstraint
• TilingConstraint
• HardwareConstraint
• WaveConstraint
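As a rough illustration of what constraint objects like these could express, the sketch below derives a launch configuration from a problem shape. The classes are simplified stand-ins that reuse the names in the list above; their fields and the launch_config helper are assumptions made for this example, not Wave Lang's actual signatures.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class WorkgroupConstraint:   # stand-in: tile of `dim` handled per workgroup
    dim: str
    tile: int


@dataclass(frozen=True)
class TilingConstraint:      # stand-in: tile of `dim` handled per loop step
    dim: str
    tile: int


@dataclass(frozen=True)
class WaveConstraint:        # stand-in: tile of `dim` handled per wave
    dim: str
    tile: int


@dataclass(frozen=True)
class HardwareConstraint:    # stand-in: threads in one wave
    threads_per_wave: int


def ceil_div(a, b):
    return -(-a // b)


def launch_config(shape, constraints):
    """Turn declarative constraints into grid size, loop counts, and threads."""
    workgroup_tiles = {c.dim: c.tile for c in constraints
                       if isinstance(c, WorkgroupConstraint)}
    grid = {d: ceil_div(shape[d], t) for d, t in workgroup_tiles.items()}
    loops = {c.dim: ceil_div(shape[c.dim], c.tile) for c in constraints
             if isinstance(c, TilingConstraint)}
    waves = {c.dim: workgroup_tiles[c.dim] // c.tile for c in constraints
             if isinstance(c, WaveConstraint)}
    hardware = next(c for c in constraints if isinstance(c, HardwareConstraint))
    threads = prod(waves.values()) * hardware.threads_per_wave
    return {"grid": grid, "loops": loops, "waves": waves, "threads": threads}


config = launch_config(
    {"M": 1024, "N": 512, "K": 256},
    [
        WorkgroupConstraint("M", 64),
        WorkgroupConstraint("N", 64),
        TilingConstraint("K", 32),
        WaveConstraint("M", 32),
        WaveConstraint("N", 32),
        HardwareConstraint(threads_per_wave=64),
    ],
)
print(config)
```

Changing a single tile value changes the derived grid and thread counts, with no edit to any kernel code.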
Built on Proven Technology
Wave Lang builds on established compiler infrastructure, including IREE's intermediate representation, for performance and reliability.
How It Works
Symbolic Computation
Wave Lang uses symbolic variables to represent tensor dimensions, enabling compile-time optimization and automatic kernel specialization based on actual input shapes.
Graph Optimization
The compiler analyzes the entire computation graph to identify fusion opportunities, eliminate redundant operations, and optimize memory access patterns.
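Fusion is the easiest of these optimizations to show in a few lines. The sketch below is a generic illustration, not Wave Lang's optimizer: it groups consecutive elementwise operations so they run in a single pass over the data. The operation names and the fuse and run helpers are made up for this example.

```python
ELEMENTWISE = {
    "scale": lambda x: x * 2.0,
    "shift": lambda x: x + 1.0,
    "relu": lambda x: max(x, 0.0),
}


def fuse(ops):
    """Group runs of consecutive elementwise ops; keep other ops as they are."""
    groups = []
    for op in ops:
        if op in ELEMENTWISE and groups and groups[-1][0] == "fused":
            groups[-1][1].append(op)
        elif op in ELEMENTWISE:
            groups.append(("fused", [op]))
        else:
            groups.append((op, []))
    return groups


def run(groups, data):
    """Execute the groups and count how many passes over the data they take."""
    passes = 0
    for kind, members in groups:
        passes += 1
        if kind == "fused":
            out = []
            for x in data:
                for name in members:  # whole chain applied per element
                    x = ELEMENTWISE[name](x)
                out.append(x)
            data = out
        elif kind == "sum":
            data = [sum(data)]
        else:
            raise ValueError(f"unknown op: {kind}")
    return data, passes


ops = ["scale", "shift", "relu", "sum"]
data = [-2.0, 0.5, 3.0]

# One group per op: every op makes its own pass and its own intermediate list.
unfused = [("fused", [op]) if op in ELEMENTWISE else (op, []) for op in ops]
fused = fuse(ops)

unfused_result, unfused_passes = run(unfused, data)
fused_result, fused_passes = run(fused, data)
print(unfused_result, unfused_passes)
print(fused_result, fused_passes)
```

Both versions compute the same value; the fused one touches memory in two passes instead of four, which is the saving fusion targets on a GPU.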
Automatic Tiling
Wave Lang automatically determines optimal tile sizes for different operations based on hardware characteristics and memory constraints of the target GPU.
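The page does not describe the selection heuristic, so the following is only one plausible sketch: among candidate matmul tiles, keep those whose input tiles fit a shared-memory budget and pick the one with the most data reuse. The candidate sizes, the budget values, and the choose_tile function are all assumptions made for this example.

```python
from itertools import product


def choose_tile(budget_bytes, element_bytes=2,
                mn_candidates=(32, 64, 128, 256),
                k_candidates=(16, 32, 64)):
    """Pick (tile_m, tile_n, tile_k) maximizing reuse within the budget."""
    best, best_score = None, None
    for tile_m, tile_n, tile_k in product(
            mn_candidates, mn_candidates, k_candidates):
        # Shared memory holds one A tile (m x k) and one B tile (k x n).
        footprint = (tile_m * tile_k + tile_k * tile_n) * element_bytes
        if footprint > budget_bytes:
            continue
        # Multiply-accumulates per element loaded; tile_k cancels out,
        # so it only serves as a tie-breaker (larger means fewer loop steps).
        reuse = (tile_m * tile_n) / (tile_m + tile_n)
        score = (reuse, tile_k)
        if best_score is None or score > best_score:
            best, best_score = (tile_m, tile_n, tile_k), score
    if best is None:
        raise ValueError("no candidate tile fits the budget")
    return best


large_budget_tile = choose_tile(64 * 1024)  # assumed 64 KiB of shared memory
small_budget_tile = choose_tile(8 * 1024)   # assumed 8 KiB of shared memory
print(large_budget_tile, small_budget_tile)
```

A production compiler would also weigh register pressure, occupancy, and the instruction shapes the hardware supports; the sketch covers only the memory-capacity part.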
Load Balancing
Work is distributed across all GPU compute units to minimize idle time and maximize throughput.
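To show why distribution strategy matters, the sketch below compares naive round-robin assignment with a generic greedy heuristic (longest task first, always onto the least-loaded unit). This is a textbook scheduling technique used for illustration, not a description of Wave Lang's scheduler; the task costs and function names are invented.

```python
import heapq


def round_robin(costs, units):
    """Assign tasks to units in arrival order, ignoring their cost."""
    loads = [0] * units
    for index, cost in enumerate(costs):
        loads[index % units] += cost
    return loads


def greedy_balance(costs, units):
    """Assign the largest remaining task to the currently least-loaded unit."""
    heap = [(0, unit) for unit in range(units)]  # (load, unit id)
    heapq.heapify(heap)
    loads = [0] * units
    for cost in sorted(costs, reverse=True):
        load, unit = heapq.heappop(heap)
        loads[unit] = load + cost
        heapq.heappush(heap, (loads[unit], unit))
    return loads


# Hypothetical per-tile costs (arbitrary time units) on 3 compute units.
tile_costs = [8, 7, 6, 5, 4, 3, 2, 1]

naive_loads = round_robin(tile_costs, 3)
balanced_loads = greedy_balance(tile_costs, 3)

# The busiest unit determines when the whole kernel finishes.
print(naive_loads, max(naive_loads))
print(balanced_loads, max(balanced_loads))
```

The total work is identical in both cases; only the finish time of the busiest unit changes, and that is the quantity idle time depends on.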
Want to learn more?
Dive deeper into Wave Lang's compiler architecture and optimization techniques.