Programming WSN:
TinyDB
Marco Avvenuti
Pervasive Computing & Networking Lab. (PerLab)
Dept. of Information Engineering
University of Pisa
[email protected]
PerLab
Outline
PerLab






Introduction
TinyDB & TAG
TAG Aggregates
Implementation of agg
Properties and Optimizations
Advantages of TAG
03/10/2015
Sistemi Mobili e Pervasivi
2
TAG:
Tiny AGgregation service
PerLab


Provides a simple, declarative interface for data
collection and aggregation, inspired by selection and
aggregation facilities in database query languages
Intelligently distributes and executes aggregation
queries in the sensor network



time and power-efficient
sensitive to the resource constraints and lossy
communication properties
TAG processes aggregates in the network

03/10/2015
computing over the data as it flows through the sensors,
discarding irrelevant data and combining relevant readings
into more compact records when possible
Sistemi Mobili e Pervasivi
3
TinyDB approach:
Declarative Queries
PerLab

Users specify the data they want and the rate
at which data should be refreshed




Simple, SQL-like queries
Using predicates, not specific addresses
TinyDB collects that data from motes in the
environment, filters it, aggregates it together,
and routes it out to a PC
TinyDB does this via power-efficient
in-network processing algorithms
03/10/2015
Sistemi Mobili e Pervasivi
4
Ad-Hoc Routing Algorithm
PerLab



Routing independence: TinyDB is agnostic to the choice of
routing algorithm, which must be able to:
 deliver query request to all nodes
 provide one or more routes from every node to the root
 guarantee that at most one copy of every message arrive
Topology discovery goes on continuously: periodic updates make
easy to find topology changes or node fault. Parents are retained
unless a child does not hear from them for some long period of
time
Data delivery follows a routing tree: when a mote wishes to send
a message to the root, it sends a message addressed to its
parent, which in turn forward it on to its parent, and so on, until it
reaches the root
03/10/2015
Sistemi Mobili e Pervasivi
5
Aggregates:
Centralized Approach

In the server-based approach
all sensor readings are sent to
the base station, which then
computes the aggregates.
0
1
2
3
4
5
COUNT = 6
11
Total12 number of13Tx:
Example:
16
41
SELECT COUNT(*)
FROM sensors
How many transmissions?
03/10/2015
PerLab
Sistemi Mobili e Pervasivi
51
61
6
Aggregates:
Distributed Approach


In the TAG approach aggregates are
computed in network whenever
possible.
Number of transmissions, latency
and power consumption decrease.
COUNT = 6
0
61
COUNT = 5
4
COUNT = 3
Total12 number of43Tx:
Example:
6
43
SELECT COUNT(*)
FROM sensors
51
03/10/2015
PerLab
Sistemi Mobili e Pervasivi
COUNT = 2
1
61
7
Outline
PerLab

Introduction

TinyDB & TAG

TAG Aggregates
Implementation of agg



Properties and Optimizations
Advantages of TAG
03/10/2015
Sistemi Mobili e Pervasivi
8
Query Model
PerLab
To monitor the occupancy of rooms at 6th floor
based on noise over a specified threshold:
SELECT AVG(volume), room FROM sensors
WHERE floor = 6
GROUP BY room
HAVING AVG(volume) > threshold
EPOCH DURATION 30 s
03/10/2015
Sistemi Mobili e Pervasivi
9
SELECT clause
PerLab
SELECT { agg(expr), attrs }
Es. SELECT AVG(volume), room

specifies an arbitrary arithmetic expression over one or more
aggregation attributes
 We expect that the common case here is that expr will simply
be the name of a single attribute.
 attrs (optionally) selects the attributes by which the sensor
readings are partitioned; these can be any subset of attrs that
appear in the GROUP BY clause.
 The syntax of the agg clause is discussed below.
03/10/2015
Sistemi Mobili e Pervasivi
10
sensors table
PerLab


TAG support SQL-style queries (without
joins) over a single table called sensors,
whose schema is known at the base station.
This table can be thought of as an appendonly relational table with one attribute per
input of the motes (e.g., temperature, light.)
03/10/2015
Sistemi Mobili e Pervasivi
11
WHERE clause
PerLab
WHERE { selPreds }
Es. WHERE floor = 6


filters out individual sensor readings before
they are aggregated. Such predicates can
typically be executed locally at the mote
before readings are communicated.
Usefull to avoid unnecessary communication
03/10/2015
Sistemi Mobili e Pervasivi
12
GROUP BY clause
PerLab
GROUP BY {attrs}
Es. GROUP BY room

specifies an attribute-based partitioning of
sensor readings. Logically, each reading
belongs to exactly one group, and the
evaluation of the query is a table of group
identifiers and aggregate values.
03/10/2015
Sistemi Mobili e Pervasivi
13
HAVING clause
PerLab
HAVING { havingPreds }
Es. HAVING AVG(volume) > threshold

filters the table by suppressing groups that do
not satisfy the havingPreds predicates
03/10/2015
Sistemi Mobili e Pervasivi
14
Semantic difference between
TAG and SQL queries
PerLab

The output of a TAG query is a stream of values,
rather than a single aggregate value (or batched
result)


In monitoring applications, such continuous results are
often more useful than a single, isolated aggregate, as they
allow users to understand how the network is behaving
over time
In these stream semantics, each record consists of
one <group id, aggregate value> pair per group


03/10/2015
Each record is time-stamped
The readings used to compute an aggregate record all
belong to the same time interval, or epoch.
Sistemi Mobili e Pervasivi
15
EPOCH DURATION clause
PerLab
EPOCH DURATION i


Specifies the amount of time (in seconds) devices wait before
acquiring and transmitting each successive sample.
This value may be as large as the user desires; it must be at
least as long as the time it takes for a mote to process and
transmit a single radio message and do some local processing:
 about 30 ms (including average MAC backoff in a low-contention
environment) for current generation motes (yielding a maximum
sample rate of about 33 samples per second.)
03/10/2015
Sistemi Mobili e Pervasivi
16
Attribute Catalog
PerLab




Attributes in TAG may be direct representations of
sensor values, such as light or temperature
TAG includes on each mote a small catalog of
attributes. This catalog can be searched for
attributes of a specific name, or iterated through
It is not necessary for all nodes to have identical
catalogs, which allows heterogeneous sensing
capabilities and incremental deployment of motes
Nodes lacking attributes simply tag missing
attributes as NULL in their result records
03/10/2015
Sistemi Mobili e Pervasivi
17
Outline
PerLab

Introduction
Framework
TinyDB & TAG

TAG Aggregates

Implementation of agg

Properties and Optimizations
Advantages of TAG



03/10/2015
Sistemi Mobili e Pervasivi
18
TAG Operations

Users pose aggregation queries from a
powered, storage-rich basestation.

Operators that implement the query are
distributed into the network by piggybacking on
the existing ad-hoc networking protocol.
03/10/2015
App
Query,
Trigger
PerLab
Data
TinyDB
Sensor Network

Sensors route data back towards the user through a routing
tree rooted at the basestation.

As data flows up this tree, it is aggregated according to an
aggregation function and value-based partitioning specified in
the query.
Sistemi Mobili e Pervasivi
19
TinyDB Architecture
SELECT
AVG(temp)
WHERE
light > 400
T:1, AVG: 225
Results T:2, AVG: 250
Queries
Multihop
Network
Query Processor
Aggavg(temp)
~10,000 Lines Embedded C Code
~5,000 Lines (PC-Side) Java
~3200 Bytes RAM
~58 kB compiled code
(largest TinyOS program)
Filterlight >
get (‘temp’)
getTempFunc(…)
400
Tables
Samples
03/10/2015
got(‘temp’)
Schema
TinyOS
PerLab


Sistemi Mobili e Pervasivi
Name: temp
Time to sample: 50 uS
Cost to sample: 90 uJ
Calibration Table: 3
Units: Deg. F
Error: ± 5 Deg F
Get f : getTempFunc()…
20
Tiny Aggregation
PerLab

TAG consists of two phases:



Distribution: in which aggregate queries are pushed
down into the network;
Collection: where the aggregate values are continually
routed up from children to parents (during an epoch
every device produces a single aggregate).
Parent-specified transmission time: collection
phase must ensure that parents in the routing tree
wait until they have heard from their children
before propagating an aggregate value for the
current epoch.
03/10/2015
Sistemi Mobili e Pervasivi
21
Tiny Aggregation:
Parent-Specified time interval
PerLab



Every aggregate request includes the interval when the father
will be waiting for partial state records from children;
Intervals are selected such that there is enough time for the
parents to combine partial state records and, in turn,
propagate their own record to their parents;
Requires synchronization of the clock, according to timing
information in the message.
03/10/2015
Sistemi Mobili e Pervasivi
22
Tiny Aggregation:
Parent-Specified time interval
PerLab
Upon receiving a request, a mote:
1.
2.
3.
4.
awakens
synchronizes its clock
chooses the sender of the msg as its parent
forwards the query, setting the delivery interval for children
to be slightly before the time its parent expects to see the
partial state record
Level 1
Tree Depth
Level 2
Level 3
Level 4
Radio and processor idle
Listening and receiving
Sensing and processing,
Radio idle
Delivering interval
(Transmitting)
Level 5
03/10/2015
Legend

1 Epoch Time
Sistemi Mobili e Pervasivi
23
Epoch duration
PerLab


It needs to be long enough such that all
nodes can report
Interval=Epoch_duration/tree_depth



Sets a lower bound to epoch duration
Limits the maximun sample rate of the network
The sample rate can be increased by
pipelining the communication schedule
03/10/2015
Sistemi Mobili e Pervasivi
24
Outline
PerLab

Introduction
Framework
TinyDB & TAG
TAG Aggregates

Implementation of agg

Properties and Optimizations
Advantages of TAG




03/10/2015
Sistemi Mobili e Pervasivi
25
Implementation of agg
PerLab

TAG implements agg via three functions:



03/10/2015
f – merging function
i – inizializer
e – evaluator
Sistemi Mobili e Pervasivi
26
Merging function
<z> = f(<x>, <y>)


PerLab
<x> and <y> are multi-valued partial state records,
computed over one or more sensor values
<z> is the partial state record resulting from the
application of function f to <x> and <y>

For AVERAGE each partial state record will consist of a
pair of values, SUM and COUNT and, given two state
records <S1,C1> and <S2,C2>, is specified as follows:
f(<S1,C1>, <S2,C2>) = <S1 +S2, C1+C2>
03/10/2015
Sistemi Mobili e Pervasivi
27
Inizializer and evaluator
PerLab

The initializer i is needed to specify how to
instantiate a state record for a single sensor value


for AVERAGE over a sensor value of x, the initializer i(x)
returns the tuple <x, 1>
The evaluator e takes a partial state record and
computes the actual value of the aggregate

03/10/2015
for AVERAGE, the evaluator e(<S,C>) simply returns S/C
Sistemi Mobili e Pervasivi
28
Outline
PerLab

Introduction
Framework
TinyDB & TAG
TAG Aggregates
Implementation of agg

Properties and Optimizations

Advantages of TAG




03/10/2015
Sistemi Mobili e Pervasivi
29
Taxonomy of Aggregates
PerLab
We classify aggregates according to four properties:
1.
2.
3.
4.
Duplicate sensitivity
Tolerance to losses
Monotonicity
State requirements
A general set of optimizations can be
automatically applied
03/10/2015
Sistemi Mobili e Pervasivi
30
Duplicate sensitivity
PerLab

duplicate insensitive aggregates are
unaffected by duplicate readings from a
single device


MAX, MIN, COUNT DISTINCT
duplicate sensitive aggregates will change
when a duplicate reading is reported.

03/10/2015
COUNT, SUM, AVERAGE, MEDIAN…
Sistemi Mobili e Pervasivi
31
Loss-tolerance:
Exemplary Vs. Summary
PerLab

exemplary aggregates return one or more
representative values from the set of all values


behave unpredictably in the face of loss
 MAX, MIN, MEDIAN…
summary aggregates compute some property over all
values

03/10/2015
the aggregate applied to a subset can be treated as a
robust approximation of the true aggregate value
 COUNT, SUM, AVERAGE, COUNT DISTINCT…
Sistemi Mobili e Pervasivi
32
Monotonicity
PerLab

Monotonic aggregates have the property that when two
partial state records, s1 and s2, are combined via f, the
resulting state record s’ will have the property that either:
 s1 , s 2 , e ( s ' )  MAX ( e ( s1 ), e ( s 2 ))

 s1 , s 2 , e ( s ' )  MIN ( e ( s1 ), e ( s 2 ))
Important when determining whether some predicates
(such as HAVING) can be applied in network, before the
final value of the aggregate is known.

03/10/2015
Early predicate evaluation saves messages by reducing the
distance that partial state records must flow up the aggregation
tree.
Sistemi Mobili e Pervasivi
33
State requirements
PerLab





Distributive aggregates
Algebraic aggregates
Holistic aggregates
Unique aggregates
Content-Sensitive aggregates
03/10/2015
Sistemi Mobili e Pervasivi
34
State requirements:
Distributive aggregates
PerLab


the partial state is simply the aggregate for
the partition of data over which they are
computed
the size of the partial state records is the
same as the size of the final aggregate

03/10/2015
MAX, MIN, COUNT, SUM
Sistemi Mobili e Pervasivi
35
State requirements:
Algebraic aggregates
PerLab

the partial state records are not themselves
aggregates for the partitions, but are of
constant size

03/10/2015
AVERAGE
Sistemi Mobili e Pervasivi
36
State requirements:
Holistic aggregates
PerLab


the partial state records are proportional in
size to the set of data in the partition
no useful partial aggregation can be done,
and all the data must be brought together to
be aggregated by the evaluator.

03/10/2015
MEDIAN
Sistemi Mobili e Pervasivi
37
State requirements:
Unique aggregates
PerLab

similar to holistic aggregates, except that the
amount of state that must be propagated is
proportional to the number of distinct values
in the partition

03/10/2015
COUNT DISTINCT
Sistemi Mobili e Pervasivi
38
State requirements:
Content-Sensitive aggregates
PerLab

the partial state records are proportional in
size to some (perhaps statistical) property of
the data values in the partition

03/10/2015
HISTOGRAM
Sistemi Mobili e Pervasivi
39
Properties and Optimizations
PerLab
Property
Duplicate
Sensitivity
03/10/2015
Examples
MIN : dup. insensitive,
AVG : dup. sensitive
Affects
Routing Redundancy
Exemplary vs. MAX : exemplary
Summary
COUNT: summary
Effect of Loss
Monotonicity
COUNT : monotonic
AVG : non-monotonic
Snooping,
Hypothesis Testing
Partial State
MEDIAN : unbounded,
MAX : 1 record
Effectiveness of TAG
Sistemi Mobili e Pervasivi
40
Snooping:
Tx Reduction
PerLab
SELECT MAX(x)
7
5
FROM sensors
6
Snooping is the capability of nodes to examine
messages not directly addresses to them
 Snooping can be used to reduce the number of
My neighbour
messages sent for some classes of aggregates.
transmitted 5. As

3
value wouldcomputing
be
myConsider
2
Legend
Aggregate requests
flooding
a MAX over a group of motes: if a
MAX(
3) = 3 a peer reporting a maximum value greater
node2,hears
I do
not transmit
than
its local maximum, it can decide to not send its
own value and be sure this will not affect the value of
the final aggregate.
Query results
flooding
03/10/2015
Sistemi Mobili e Pervasivi
41
Snooping:
Loss recovery
PerLab
Root
1
2
3

Situation: a node misses an initial
request to begin aggregation

If we include information about queries in
partial state records, lost nodes (either without
a parent or not incorporated into the
aggregation network during the initial flooding
phase) can reconnect by listening to other
node’s state records.
When node 4 hears another device reporting
an aggregate, it can assume it should be
aggregate.
5
4
6
Legend
Aggregate requests
flooding

Query results
flooding
03/10/2015
Sistemi Mobili e Pervasivi
42
Improving Tolerance to Loss:
Using Available Redundancy
MAX(x) = 7
4
7
7
7
3
7 Consider a mote with two possible choices of parent: instead
of sending its aggregate value to just one parent, it can send it
to both parents
 A node can easily discover that it has multiple parents by
7
5
6
SELECT MAX(x)
FROM sensors
2
7
PerLab
building a list of nodes it has heard that are one step closer to
the root
6
6
6
Caution: For duplicate-sensitive aggregates, sending results to
multiple parents has the undesirable effect of causing the node to be
counted multiple times

03/10/2015
A solution is to send part of the aggregate to one parent and the rest to the
other. Consider a COUNT; a mote with c-1 children and two parents can
send a COUNT of c/2 to both parents instead of a count of c to a single
parent.
Sistemi Mobili e Pervasivi
43
Improving Tolerance to Loss:
Child cache
PerLab



Parents remember the partial state records their children
reported for some number of rounds, and use those previous
values when new values are unavailable due to lost child
messages, without over-counting any nodes.
Note that caching is a simple form of interpolation where the
interpolated value is the same as the previous value.
Drawbacks:


03/10/2015
Caching uses memory that could be used for group storage
It sets a minimum bound on the time that devices must wait
before determining their parent has gone offline
Sistemi Mobili e Pervasivi
44
Hypothesis testing
PerLab

03/10/2015
If a node is presented with an
estimation of the proper value of an
aggregate, it can determine locally
whether or not its own reading and
the readings of its children will affect
the final value of the aggregate.
Sistemi Mobili e Pervasivi
45
Hypothesis testing:
Example
PerLab
The root of the network seeking an exemplary
sensor value, such as a MAX, might compute the
minimum
sensor
value m over the highest levels
SELECT
FROM
sensors
MAX
MAX(x)
=9
of the subtree, and then abort the aggregate and
FROM sensors
>5
3 WHERE
issue aMAX(x)
new request
asking for values greater
than m over the whole tree.
 leaf nodes do not need to send a message if
their value is lower than the minimum observed
over the top levels
4
 intermediate nodes must still forward partial
Legend
state records even if their value is suppressed

Root
4
5
9
SELECT MAX(x)
Query requests
flooding
Query results
flooding
03/10/2015
Sistemi Mobili e Pervasivi
46
Outline
PerLab

Introduction
Framework
TinyDB & TAG
TAG Aggregates
Implementation of agg

Properties and Optimizations

Advantages of TAG




03/10/2015
Sistemi Mobili e Pervasivi
47
Communication cost
PerLab
Total Bytes Xmitted / Epoch , All Sensors
Communication costs of centralized aggregates are independent
from the aggregate function, since all data must be routed to the
root
100000
In-network vs. Centralized Aggregation
90000
80000
70000
60000
50000
40000
30000
20000
10000
0
Any Centralized
Aggregate
03/10/2015
COUNT
MIN
HISTOGRAM
MAX
Aggregation Function
Sistemi Mobili e Pervasivi
AVE
COUNT MEDIAN
DISTINCT
48
Power saving
PerLab


The principal advantage of TAG is its
capability of (dramatically) limiting the
communication required to compute an
aggregate
Explicitly dividing time into epochs is a
convenient mechanism for idling the
processor.
03/10/2015
Sistemi Mobili e Pervasivi
49
Improved network lifetime
PerLab


In centralized systems, as the data converge
towards the root, nodes on top of the tree are
required to transmit significantly more data
than leaves
In TAG, each mote is required to transmit
only a single message per epoch,
regardless of its depth in the routing tree
03/10/2015
Sistemi Mobili e Pervasivi
50
Loss and changes tolerance
PerLab


TAG has the ability to tolerate disconnections and
loss. Lost nodes can reconnect by listening to other
node’s state records (not necessarily intended for them)
as they flow up the tree
The constant topology maintenance makes it
relatively easy to adapt to network changes caused by
mobility of certain nodes, or to the addition or deletion of
motes
However, this is very energy wasting
03/10/2015
Sistemi Mobili e Pervasivi
51
Descargar

Diapositiva 1