SPEX: A Programming Language
for Software Defined Radio
Yuan Lin, Robert Mullenix, Mark Woh,
Scott Mahlke, Trevor Mudge,
Alastair Reid¹, and Krisztián Flautner¹
The University of Michigan at Ann Arbor
¹ARM, Ltd
SDR Hardware Platform
 Handset SDR has steep computational requirements (>40 GOPS) with a tight power budget (<500 mW)
 Heterogeneous multiprocessor solutions common
 Examples Include:
 SODA
 Philips EVP
 TI OMAP
 IBM CELL* (not a mobile solution but very similar)
 Themes:
 Targets high-throughput signal-processing domains
 VLIW SIMD
 Control- and data- focused cores
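The two figures on this slide imply an energy-efficiency target. The short calculation here uses only the numbers quoted on the slide; the variable names are illustrative.

```python
# Figures quoted on the slide (both are bounds, so the results are bounds too).
required_ops_per_s = 40e9   # more than 40 GOPS
power_budget_w = 0.5        # less than 500 mW

# Minimum efficiency needed to fit the workload in the budget.
gops_per_watt = required_ops_per_s / 1e9 / power_budget_w

# Equivalent maximum energy per operation, in picojoules.
picojoules_per_op = power_budget_w / required_ops_per_s * 1e12

print(f"at least {gops_per_watt:.0f} GOPS/W, "
      f"at most {picojoules_per_op:.1f} pJ per operation")
```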
ACAL – University of Michigan
SODA System Architecture for 3G
 Based on our prior work:
 4 PEs
 Scalar, SIMD pipelines
 32 elements wide
 Scratchpad memories
 ARM controller processor
 Handles control, DMA
 Feasible approach:
 3W, 26.6 mm² at 180nm
 ~0.5W, 6.7 mm² projected for 90nm
[Figure: System Architecture. Blocks shown: ARM, global memory, DMA, and four PEs, each with a local memory and an execution unit. PE detail: scalar RF, scalar MEM, scalar ALU, SIMD RF, SIMD MEM, SIMD ALU, VtoS & StoV.]
Programming SDR Platforms
 Programming DSPs already tough
 Multiprocessor architectures make a tough problem tougher
 Want to achieve high performance with high productivity
 Software needs to advance along with hardware
 C not sufficient
 Can express protocols, but awkward and inefficient
 Rediscovering parallelism is challenging
 Want to decouple algorithm design from implementation
 No well-defined concept of time
Algorithm Characteristics
 Digital communication protocols hierarchical
 Abstracted as a series of connected kernels
 Can isolate and optimize individually
 Operations frequently involve matrix computations
 Have real-time constraints with static control flow
W-CDMA Protocol Operations
Desired SDR Language Features
 Plenty of Parallelism
 Kernels vectorizable
 Pipeline the stream
 Interleave concurrent tasks
 Give compiler control
 Express the algorithms & constraints, not run-time behavior
 Static decisions should be made at compile time
 (e.g. scheduling, PE assignment, memory management)
 Support for Timing Models
 Absolute timing primitives prevent drifting
 Periodic and relative timing constraints
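Why absolute timing prevents drift can be seen in a small simulation. This is plain Python, not SPEX, and everything in it is an assumption made for illustration: the function names, the 10 ms period, and the fixed 300 µs wake-up overhead per activation.

```python
def release_times_relative(period, overhead, n):
    """Each activation is timed relative to when the previous one actually ran."""
    t, times = 0, []
    for _ in range(n):
        t += period + overhead      # the overhead is paid again on every step
        times.append(t)
    return times

def release_times_absolute(period, overhead, n):
    """Each activation is pinned to a multiple of the period on one clock."""
    return [k * period + overhead for k in range(1, n + 1)]

PERIOD_US, OVERHEAD_US, N = 10_000, 300, 100   # hypothetical numbers
ideal = [k * PERIOD_US for k in range(1, N + 1)]

relative = release_times_relative(PERIOD_US, OVERHEAD_US, N)
absolute = release_times_absolute(PERIOD_US, OVERHEAD_US, N)

drift_relative_us = relative[-1] - ideal[-1]   # grows linearly with N
drift_absolute_us = absolute[-1] - ideal[-1]   # bounded by one overhead

print("relative timing drift:", drift_relative_us, "us")
print("absolute timing drift:", drift_absolute_us, "us")
```

With relative delays the error accumulates on every period, while the absolute schedule never falls more than one overhead behind the ideal frame boundary.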
SPEX – Language Extension for SDR
 Two levels: data/control separation
 Kernel SPEX - the data plane
 Algorithm kernel descriptions, timing unaware
 C + Matlab operators + DSP fixed-point arithmetic
 System SPEX - the control plane
 Wireless protocol system descriptions
 C + Inter-kernel communications + timing constraints
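The slides do not say which fixed-point formats Kernel SPEX provides. As a reminder of what DSP fixed-point arithmetic usually involves, this sketch assumes a 16-bit Q15 format with saturation; the format choice and all helper names are assumptions, not part of SPEX.

```python
Q15_MAX = 2**15 - 1    #  32767, just under +1.0
Q15_MIN = -2**15       # -32768, exactly -1.0

def saturate(x):
    """Clamp to the 16-bit range instead of wrapping around."""
    return max(Q15_MIN, min(Q15_MAX, x))

def to_q15(value):
    """Convert a real number in [-1, 1) to its Q15 integer representation."""
    return saturate(round(value * 2**15))

def q15_add(a, b):
    """Saturating add of two Q15 values."""
    return saturate(a + b)

def q15_mul(a, b):
    """Q15 multiply: full-width product, round to nearest, rescale, saturate."""
    return saturate((a * b + (1 << 14)) >> 15)

half = to_q15(0.5)
print(q15_mul(half, half))      # 0.5 * 0.5 in Q15
print(q15_add(30000, 30000))    # overflows the range, so it saturates
```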
Kernel SPEX
 Atomic Building Blocks
 Non-preemptible
 Ignorant of timing constraints
 Maintains local state
 Features
 Templated definitions
 Member functions
 Matlab-like vector support
 SystemC-like data types
template<class T, TAPS, BSIZE>
kernel FIR {
    vector<T, TAPS> z;      // local state, kept between calls to run()
    vector<T, TAPS> coeff;  // filter coefficients

    void set_coeff(vector<T, TAPS> c)
    { coeff = c; }

    void run(channel<T, BSIZE> inbuf,
             channel<T, BSIZE> outbuf)
    {
        int i;
        T in, out;
        for (i = 0; i < BSIZE; i++) {
            in = inbuf.pop();
            z += coeff * in;       // vector-scalar multiply-accumulate
            out = z[0];
            outbuf.push(out);
            z = (z(1:TAPS-1),0);   // Matlab-style slice: shift by one, append 0
        }
    }
};
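For readers without a SPEX toolchain, the FIR kernel can be mirrored in plain Python to check its behavior. This is a sketch under assumptions: Python lists stand in for SPEX vectors and channels, and the class only imitates the kernel's three vector statements.

```python
class FIR:
    """Python mirror of the SPEX FIR kernel (lists stand in for vectors)."""

    def __init__(self, coeff):
        self.coeff = list(coeff)
        self.z = [0] * len(self.coeff)   # local state, persists across run()

    def run(self, inbuf):
        outbuf = []
        for x in inbuf:
            # z += coeff * in
            self.z = [zi + ci * x for zi, ci in zip(self.z, self.coeff)]
            # out = z[0]; outbuf.push(out)
            outbuf.append(self.z[0])
            # z = (z(1:TAPS-1), 0)
            self.z = self.z[1:] + [0]
        return outbuf

fir = FIR([1, 2, 3])
impulse_response = fir.run([1, 0, 0, 0])
print(impulse_response)
```

Feeding a unit impulse returns the coefficients in order, which is a quick sanity check that the shift-and-accumulate form computes an ordinary FIR filter.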
System SPEX
 Synchronous primitives
 A set of timing and concurrency primitives for expressing real-time execution
 Modeled after real-time languages
 Stream primitives
 A set of streaming primitives for expressing streaming computation
 Modeled after the synchronous dataflow model and its variations
Synchronous Example – WCDMA
void wcdma()
{
    clock clk;
    at (clk % wcdma_frame == 0) {
        ...
        adcfir(ch1);
        chan_est(ch1, ch2, num_fingers);
        parallel {
            bch(ch1, bch_done);
            if (dch_mode) {
                dch(ch1, ch2, num_fingers, dch_done);
            }
        }
        ...
        parallel {
            wait(clk % bch_deadline == 0);
            if (dch_mode)
                wait(clk % dch_deadline == 0);
        }
    }
}
Callouts:
 Real-time clock
 Absolute timing assertion
 Parallel execution of instructions within each scope
SDR Compilation Strategy
Kernel SPEX Compilation Flow
 Frontend removes "syntactic sugar"
 Templates instantiated
 Matlab features mapped to function calls
 Virtual Kernel C
 Infinite vector length assumed
 Robust set of operators
 Can be linked with special libraries and compiled with gcc to verify functional correctness
 Physical Kernel C
 Vector length bounded by actual SIMD width
 Restricted to machine operators
[Figure: compilation flow. Kernel SPEX -> SPEX frontend -> Virtual Kernel C. Functional debugging path: Virtual Kernel C + libraries -> gcc -> a.out. Compilation path: Virtual Kernel C -> V-to-P translation -> Physical Kernel C -> VLIW backend -> SODA assembly.]
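The gap between Virtual Kernel C (unbounded vector length) and Physical Kernel C (vectors bounded by the SIMD width) can be pictured as strip-mining. The slides do not describe the actual V-to-P algorithm, so this is only a sketch of one plausible approach; `simd_add` and `virtual_add` are hypothetical names, and the width of 32 comes from the SODA slide.

```python
SIMD_WIDTH = 32   # SODA's SIMD pipeline is 32 elements wide

def simd_add(a, b):
    """Stand-in for one machine-level add on exactly SIMD_WIDTH lanes."""
    assert len(a) == len(b) == SIMD_WIDTH
    return [x + y for x, y in zip(a, b)]

def virtual_add(a, b):
    """Add two 'virtual' vectors of any equal length, one SIMD chunk at a time."""
    assert len(a) == len(b)
    result = []
    for start in range(0, len(a), SIMD_WIDTH):
        chunk_a = a[start:start + SIMD_WIDTH]
        chunk_b = b[start:start + SIMD_WIDTH]
        valid = len(chunk_a)                      # last chunk may be short
        pad = [0] * (SIMD_WIDTH - valid)          # pad unused lanes with zeros
        result.extend(simd_add(chunk_a + pad, chunk_b + pad)[:valid])
    return result

a = list(range(70))          # 70 elements need three 32-lane operations
b = [1] * 70
total = virtual_add(a, b)
print(len(total), total[:3], total[-1])
```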
System SPEX Compilation Frontend
System SPEX Task Compilation
 Stream IR with dataflow primitives
 i.e. push(), pop(), peek()
 Step 1: Dataflow rate-matching
 Insert buffers between nodes
 Add loops to match the rate
 Step 2: Initial resource allocation
 Processor assignments
 Memory allocation and DMA transfer
 Step 3: Control-data split
 Break the task into independent threads
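Step 1 can be illustrated for a single producer/consumer edge. The slides do not give the compiler's algorithm; this sketch assumes the standard balance equation from synchronous dataflow (producer firings times push rate equals consumer firings times pop rate) and a simple schedule that runs all producer firings first. The function names are hypothetical.

```python
from collections import deque
from math import gcd

def rate_match(push_rate, pop_rate):
    """Return (producer loop count, consumer loop count, buffer size in tokens)."""
    g = gcd(push_rate, pop_rate)
    producer_reps = pop_rate // g     # loop added around the producer
    consumer_reps = push_rate // g    # loop added around the consumer
    buffer_tokens = producer_reps * push_rate   # buffer inserted between them
    return producer_reps, consumer_reps, buffer_tokens

def run_one_iteration(push_rate, pop_rate):
    """Execute one balanced iteration and report what happened to the buffer."""
    producer_reps, consumer_reps, _ = rate_match(push_rate, pop_rate)
    buffer, token = deque(), 0
    for _ in range(producer_reps):
        for _ in range(push_rate):
            buffer.append(token)      # push()
            token += 1
    peak = len(buffer)
    consumed = []
    for _ in range(consumer_reps):
        consumed.append([buffer.popleft() for _ in range(pop_rate)])  # pop()
    return peak, consumed, len(buffer)

print(rate_match(3, 2))
print(run_one_iteration(3, 2))
```

The buffer size computed here is what this particular schedule needs; interleaving producer and consumer firings could lower it.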
System SPEX Real-time Optimization
 Hierarchical constraint scheduling
 Each task is treated as a single node
 Guarantees all nodes are schedulable through compiler optimizations
 Non-preemptive multiprocessor scheduler
 Static processor assignments
 Static task execution ordering
 Dynamic execution timing
 Iterative optimization if constraints not met
 Re-compile each task with system profiling
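The scheduler properties listed on this slide (static processor assignment, static ordering, start times resolved at run time) can be mimicked with a small simulation. This is a sketch, not the SPEX scheduler: the task names are borrowed from the WCDMA example, while the durations, deadlines, PE mapping, and function names are invented for illustration.

```python
def simulate(order, durations, deps):
    """Run tasks non-preemptively in each PE's fixed order; timing is dynamic."""
    start, finish = {}, {}
    pe_free = {pe: 0 for pe in order}
    next_idx = {pe: 0 for pe in order}
    remaining = sum(len(seq) for seq in order.values())
    while remaining:
        progressed = False
        for pe, seq in order.items():
            if next_idx[pe] == len(seq):
                continue
            task = seq[next_idx[pe]]
            preds = deps.get(task, [])
            if all(p in finish for p in preds):
                ready = max((finish[p] for p in preds), default=0)
                start[task] = max(pe_free[pe], ready)   # wait for PE and inputs
                finish[task] = start[task] + durations[task]
                pe_free[pe] = finish[task]
                next_idx[pe] += 1
                remaining -= 1
                progressed = True
        if not progressed:
            raise ValueError("static ordering can never make progress")
    return start, finish

def missed_deadlines(finish, deadlines):
    return [t for t, d in deadlines.items() if finish[t] > d]

order = {"PE0": ["adcfir", "chan_est", "bch"], "PE1": ["dch"]}   # hypothetical
deps = {"chan_est": ["adcfir"], "bch": ["chan_est"], "dch": ["chan_est"]}
durations = {"adcfir": 2, "chan_est": 3, "bch": 4, "dch": 6}     # made-up units
deadlines = {"bch": 10, "dch": 12}

start, finish = simulate(order, durations, deps)
print(finish, missed_deadlines(finish, deadlines))
```

If a task ran longer than profiled (for example `dch` taking 8 units), `missed_deadlines` would report it, which corresponds to the case where the slide's iterative re-compilation step is needed.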
Summary
 Multiprocessor architectures make handset SDR feasible, but complicate software
 Need better language to map algorithm to hardware
 SPEX capitalizes on domain properties
 C and Matlab based
 Control and data separation
 Kernels exploit massive data parallelism
 Systems can pipeline kernels and interleave tasks
 Compile system and kernels independently
 Provide multiple paths to ensure robust debugging
Questions?
Stream Example – DCH Channel
void DCH(channel<int16, frame> ADC_in,
         channel<int16, max_fingers> searcher_in,
         int num_fingers,
         channel<int16, frame> & to_MAC,
         signal<bool> & done)
{
    channel<int16, frame> ch1[max_rake_finger];
    channel<int16, frame> ch2;
    stream {
        for (int i = 0; i < num_fingers; i++)
            rake(ADC_in, searcher_in[i], ch1[i]);
        combiner(ch1, ch2);
        viterbi(ch2, to_MAC);
    }
    done = true;
}
Callouts:
 Channels and signals can be declared either as function arguments or as local variables
 Stream computation within the scope
 Channel merging done in combiner function
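The shape of this stream graph (several finger channels merged by a combiner, then a single downstream stage) can be imitated in Python. This is a structural sketch only: the `Channel` class is an assumed stand-in for SPEX channels, the ADC input is a plain list so that several fingers can read it, and the three stage bodies are trivial placeholders rather than real rake, combining, or Viterbi algorithms.

```python
from collections import deque

class Channel:
    """Minimal FIFO with the push/pop/peek primitives named in the slides."""
    def __init__(self):
        self._q = deque()
    def push(self, x):
        self._q.append(x)
    def pop(self):
        return self._q.popleft()
    def peek(self, i=0):
        return self._q[i]
    def __len__(self):
        return len(self._q)

def rake(adc_samples, delay, out):
    # Placeholder finger: forward the samples starting at this finger's delay.
    for s in adc_samples[delay:]:
        out.push(s)

def combiner(fingers, out):
    # Channel merging: sum one token from every finger while all have data.
    while all(len(ch) > 0 for ch in fingers):
        out.push(sum(ch.pop() for ch in fingers))

def hard_decision(inp, out):
    # Placeholder for the decoder stage: a sign decision per token.
    while len(inp) > 0:
        out.push(1 if inp.pop() > 0 else 0)

adc = [1, -2, 3, -4, 5]          # made-up samples
delays = [0, 1]                  # made-up finger delays
ch1 = [Channel() for _ in delays]
ch2, to_mac = Channel(), Channel()

for i, d in enumerate(delays):
    rake(adc, d, ch1[i])
combiner(ch1, ch2)
hard_decision(ch2, to_mac)

bits = [to_mac.pop() for _ in range(len(to_mac))]
print(bits)
```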
System SPEX Compilation