NoSQL technologies are a fundamental
part of Azure today
The world today
Data is as critical as ever
- Our field was originally called data processing
- It’s what the people who pay us care most about
Data is much more plentiful
- Storage costs are lower
- There are bigger data sources:
  - Web-scale applications
  - Internet of Things (IoT)
  - More
New data technologies abound
- NoSQL
- Big data analytics
- Search
This isn’t the post-SQL era, but it is the SQL+ era
Illustrating the intersection
[Figure: SQL and SQL+ technologies, on-premises and in the cloud]
A summary
NoSQL Technologies
- Operational Data:
  - Document Store (DocumentDB, MongoDB, …)
  - Key/Value Store (Tables, Riak, …)
  - Column Family Store (HBase, Cassandra, …)
- Analytical Data:
  - Big Data Analytics (HDInsight, Hadoop)
SQL Technologies
- Operational Data:
  - Relational Database (SQL Database, SQL Server, Oracle, MySQL, …)
- Analytical Data:
  - Relational Analytics (SQL Server, Oracle, MySQL, …)
Legend:
- Managed service provided by Azure
- Software that can run in Azure virtual machines
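SQL Database
A relational data service
[Figure: an application sends SQL queries to tables. Each column has a name and a type, one column is the primary key, and the rows hold the data.]
Example table columns:
- ID (int), the primary key (values shown: 1, 3, 2, 7)
- Name (char)
- Country (char)
- Age (int)
- LastUse (date)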
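The relational model on this slide can be tried out locally. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for a relational service; it is not SQL Database itself, and the table name `customers` and the sample rows are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite stands in for a relational service. The table name
# "customers" and the sample rows are assumptions, not taken from the slide.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customers (
        ID      INTEGER PRIMARY KEY,
        Name    TEXT,
        Country TEXT,
        Age     INTEGER,
        LastUse TEXT
    )
    """
)
rows = [
    (1, "John", "Canada", 43, "2014-03-04"),
    (2, "Eva", "Germany", 25, None),
    (3, "Lou", "Australia", 51, None),
]
conn.executemany("INSERT INTO customers VALUES (?, ?, ?, ?, ?)", rows)

# Every row has the same columns; a missing value is stored as NULL.
ages = conn.execute(
    "SELECT Age FROM customers WHERE Name = ?", ("Lou",)
).fetchall()
print(ages)
```

The fixed schema is the point of the sketch: each row must fit the declared columns, which is the property the later NoSQL slides relax.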
Sharding
[Figure: a database holding Adam, Carl, Cynthia, Bill, Anusha, Catherine, Andrew, and Bertrand is split into a sharded database]
Sharded Database
- Shard 1: Adam, Andrew, Anusha
- Shard 2: Bertrand, Bill
- Shard 3: Carl, Catherine, Cynthia
Atomic transactions typically span only a single shard
SQL Database Elastic Scale (in preview) now supports sharding
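The grouping on this slide is consistent with sharding by the first letter of each name. A minimal sketch of that idea follows; the range boundaries and the function names are assumptions chosen to reproduce the slide's grouping, not the way Elastic Scale is configured.

```python
import bisect

# Exclusive upper bounds on the first letter for shards 1 and 2; shard 3
# takes everything else. These boundaries are assumptions chosen to
# reproduce the grouping shown on the slide.
SHARD_BOUNDS = ["B", "C"]

def shard_for(name: str) -> int:
    """Return the 1-based shard number for a name, based on its first letter."""
    return bisect.bisect_right(SHARD_BOUNDS, name[0].upper()) + 1

def build_shards(names):
    """Distribute names across the three shards and sort each shard."""
    shards = {1: [], 2: [], 3: []}
    for name in names:
        shards[shard_for(name)].append(name)
    return {number: sorted(members) for number, members in shards.items()}

def in_single_shard(names) -> bool:
    """True if a transaction touching these names stays inside one shard."""
    return len({shard_for(name) for name in names}) == 1

names = ["Adam", "Carl", "Cynthia", "Bill", "Anusha", "Catherine", "Andrew", "Bertrand"]
shards = build_shards(names)
print(shards)
print(in_single_shard(["Adam", "Anusha"]))  # both names start with A
print(in_single_shard(["Adam", "Bill"]))    # A and B live in different shards
```

The last two calls illustrate why atomic transactions typically span only a single shard: an update that touches both Adam and Bill would need coordination across two separate databases.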
SQL Database
- Category: Relational
- Storage Abstractions: Tables, rows, columns
- Maximum Database Size: 500 GB; with Elastic Scale, 100s of TBs
- Query Language: SQL
- Transaction Support: All rows and tables in a database
- Secondary Indexes: Yes
- Stored Procedures/Triggers: Written in T-SQL
- Pricing: Units of throughput
To scale for lots of users and lots of data
- Pros: NoSQL technologies can offer more scalability than relational databases
- Cons: Often lose some benefits of relational databases, e.g., database-wide transactions
To work better with different data formats, e.g., JSON
- Pros: Avoiding object/relational mapping makes code easier to write
- Cons: Limited BI tools; persistent data designed for a single application is harder to share
To work with data in a more flexible way
- Pros: NoSQL technologies don’t have fixed schemas
- Cons: Fixed schemas help prevent errors
DocumentDB
A document store
[Figure: an application sends requests to collections; each collection holds JSON documents ({…})]
Document 1
{
  "name": "John",
  "country": "Canada",
  "age": 43,
  "lastUse": "March 4, 2014"
}
Document 2
{
  "name": "Eva",
  "country": "Germany",
  "age": 25
}
Document 3
{
  "name": "Lou",
  "country": "Australia",
  "age": 51,
  "firstUse": "May 8, 2013"
}
Document 4
{
  "docCount": 3,
  "last": "May 1, 2014"
}
Ways to work with data
RESTful access methods
- For Create/Read/Update/Delete (CRUD) operations
DocumentDB SQL
- A query language with SQL-derived syntax
- Example:
  SELECT c.age
  FROM customers c
  WHERE c.name = "Lou"
Executing logic in the database
- Stored procedures
- Triggers
- User-defined functions (UDFs)
  - Allow extending DocumentDB SQL
- All written in JavaScript
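The DocumentDB SQL example above selects the age of the customer named Lou. As a rough sketch of what that query means (not of how DocumentDB evaluates it), the same selection can be written in plain Python over the four sample documents from the DocumentDB slide. The function name `select_age` is hypothetical.

```python
# The four sample documents from the DocumentDB slide. They share a
# collection but not a structure: Document 4 has no "name" or "age" at all.
customers = [
    {"name": "John", "country": "Canada", "age": 43, "lastUse": "March 4, 2014"},
    {"name": "Eva", "country": "Germany", "age": 25},
    {"name": "Lou", "country": "Australia", "age": 51, "firstUse": "May 8, 2013"},
    {"docCount": 3, "last": "May 1, 2014"},
]

def select_age(documents, name):
    """Mimic: SELECT c.age FROM customers c WHERE c.name = <name>."""
    return [
        {"age": doc["age"]}
        for doc in documents
        # .get() tolerates documents that lack a "name" property entirely.
        if doc.get("name") == name and "age" in doc
    ]

result = select_age(customers, "Lou")
print(result)
```

The filter has to tolerate missing properties because nothing forces every document in a collection to have the same shape.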
With Node.js
[Figure: in Microsoft Azure, JavaScript server code runs on Node.js in Web Sites and sends requests to a DocumentDB collection of JSON documents. Clients are a JavaScript application in a web browser and native apps on a phone/tablet. JSON is exchanged between each tier.]
Sharding and transactions
- Atomic transactions can span only a single collection
- The unit of sharding is a collection
[Figure: a database containing several collections, each holding JSON documents]
Replication and consistency
- Replication can improve performance and availability
- A write to the primary replica takes time to propagate to the secondaries
- What does a reader see?
[Figure: a database holding three replicas of Shard A, labeled as primary replica and secondary replica]
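Consistency options
Strong
- Readers might see old data: No
- Readers might see out-of-order updates: No
- Speed of writes: Slowest
- Speed of reads: Slowest
Bounded Staleness
- Readers might see old data: Yes, but only within a specified interval
- Readers might see out-of-order updates: No
- Speed of writes: Fastest
- Speed of reads: Moderately slow
Session (the default)
- Readers might see old data: Yes, but only for writes by other clients
- Readers might see out-of-order updates: Yes, but only for writes by other clients
- Speed of writes: Fastest
- Speed of reads: Moderately fast
Eventual
- Readers might see old data: Yes
- Readers might see out-of-order updates: Yes
- Speed of writes: Fastest
- Speed of reads: Fastest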
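The trade-off behind these options is that a write reaches the primary replica first and the secondaries later. The toy model below sketches only that effect, using logical clock ticks; the class `ReplicatedValue`, its `lag` parameter, and the idea of serving a strong read from the primary are all simplifying assumptions and do not describe how DocumentDB implements its consistency levels.

```python
class ReplicatedValue:
    """One primary and one secondary; writes reach the secondary after `lag` ticks."""

    def __init__(self, lag: int):
        self.lag = lag
        self.clock = 0
        self.primary = None
        self.secondary = None
        self.pending = []  # (tick at which the write becomes visible, value)

    def write(self, value):
        # The primary sees the write immediately; the secondary sees it later.
        self.primary = value
        self.pending.append((self.clock + self.lag, value))

    def tick(self):
        # Advance logical time and apply every write that is now due.
        self.clock += 1
        while self.pending and self.pending[0][0] <= self.clock:
            self.secondary = self.pending.pop(0)[1]

    def read(self, strong: bool):
        # In this toy model a strong read goes to the primary, while a
        # relaxed read goes to the secondary and may return an older value.
        return self.primary if strong else self.secondary

replica = ReplicatedValue(lag=2)
replica.write("v1")
replica.tick()
replica.tick()            # "v1" has now propagated
replica.write("v2")
fresh = replica.read(strong=True)
stale = replica.read(strong=False)
replica.tick()
replica.tick()            # "v2" has now propagated
caught_up = replica.read(strong=False)
print(fresh, stale, caught_up)
```

Right after the second write, the relaxed read still returns the older value, which is the "readers might see old data" column in the table above.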
SQL Database
- Category: Relational
- Storage Abstractions: Tables, rows, columns
- Maximum Database Size: 500 GB
- Query Language: SQL
- Transaction Support: All rows and tables in a database
- Secondary Indexes: Yes
- Stored Procedures/Triggers: Written in T-SQL
- Pricing: Units of throughput
DocumentDB
- Category: Document store
- Storage Abstractions: Collections, documents
- Maximum Database Size: 100s of TBs
- Query Language: Extended subset of SQL
- Transaction Support: All documents in the same collection
- Secondary Indexes: Yes
- Stored Procedures/Triggers: Written in JavaScript
- Pricing: Units of throughput
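Azure Tables
A key/value store
[Figure: an application accesses tables. Each table is divided into partitions (Partition A, Partition B). Each entity is identified by a partition key and a row key (for example A 1, A 2, B 1, B 2) and holds properties. Each property has a name, a type, and data.]
Entities in the same table can have different properties. Examples shown:
- Name (String), Country (String), Age (int)
- Name (String), Country (String), Age (int), LastUse (Date)
- Name (String), Country (String), Age (int), FirstUse (Date)
- Count (int), Last (Date)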
Sharding and transactions
- Atomic transactions can span only a single partition
- The unit of sharding is a partition
- Partitions are replicated; reads and writes provide strong consistency
[Figure: a table split into Partition A (entities A 1, A 2, A 3), Partition B (B 1, B 2, B 3), and Partition C (C 1, C 2, C 3)]
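The addressing scheme and the single-partition transaction rule described above can be sketched with a small in-memory model. This is an illustration only: the class `Table` and its methods are hypothetical and are not the Azure Tables client API.

```python
from collections import defaultdict

class Table:
    """Toy key/value table: partition key -> row key -> properties."""

    def __init__(self):
        self.partitions = defaultdict(dict)

    def insert(self, partition_key, row_key, **properties):
        self.partitions[partition_key][row_key] = properties

    def get(self, partition_key, row_key):
        return self.partitions[partition_key][row_key]

    def batch_insert(self, entities):
        """Insert (partition key, row key, properties) tuples as one unit.

        The batch is rejected unless every entity shares one partition key,
        mirroring the rule that a transaction spans a single partition.
        """
        keys = {partition_key for partition_key, _, _ in entities}
        if len(keys) != 1:
            raise ValueError("a batch must target a single partition")
        for partition_key, row_key, properties in entities:
            self.partitions[partition_key][row_key] = properties

table = Table()
# Entities in one table need not share the same properties.
table.insert("A", 1, Name="John", Country="Canada", Age=43)
table.batch_insert([
    ("B", 1, {"Count": 3}),
    ("B", 2, {"Name": "Eva", "Country": "Germany", "Age": 25}),
])

try:
    table.batch_insert([("A", 2, {"Age": 51}), ("C", 1, {"Age": 30})])
    rejected = False
except ValueError:
    rejected = True
print(table.get("B", 1), rejected)
```

Choosing the partition key therefore decides both how the data is spread out and which entities can be updated together atomically.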
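SQL Database
- Category: Relational
- Storage Abstractions: Tables, rows, columns
- Maximum Database Size: 500 GB
- Query Language: SQL
- Transaction Support: All rows and tables in a database
- Secondary Indexes: Yes
- Stored Procedures/Triggers: Written in T-SQL
- Pricing: Units of throughput
DocumentDB
- Category: Document store
- Storage Abstractions: Collections, documents
- Maximum Database Size: 100s of TBs
- Query Language: Extended subset of SQL
- Transaction Support: All documents in the same collection
- Secondary Indexes: Yes
- Stored Procedures/Triggers: Written in JavaScript
- Pricing: Units of throughput
Tables
- Category: Key/value store
- Storage Abstractions: Tables, partitions, entities
- Maximum Database Size: 100s of TBs
- Query Language: Subset of OData queries
- Transaction Support: All entities in the same partition
- Secondary Indexes: No
- Stored Procedures/Triggers: None
- Pricing: GBs of storage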
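HDInsight HBase
A column family store
[Figure: an application accesses tables. Each row is identified by a row key (1 to 6 in the example). Each cell is addressed by a column key made up of a family and a qualifier, and holds data, optionally with time-stamped versions.]
Example column families and qualifiers:
- User: Name, Country, Age
- Usage: LastUse, FirstUse
Example cell address: family Usage, qualifier LastUse, row key 2, version v2
HDInsight supports Phoenix for SQL queries on HBase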
Sharding and transactions
- Atomic transactions can span only a single row
- The unit of sharding is a region
- HBase automatically shards a table; users don’t see regions
- Regions are replicated; reads and writes provide strong consistency
[Figure: a table split into Region A, Region B, and Region C]
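The column family model (row key, then family and qualifier, then time-stamped versions) can be pictured as nested maps. The sketch below is a toy in-memory model under that assumption; the class `ColumnFamilyTable` and its methods are hypothetical and are not the HBase client API.

```python
from collections import defaultdict

class ColumnFamilyTable:
    """Toy model: row key -> (family, qualifier) -> versions, newest first."""

    def __init__(self, families):
        # Column families are fixed up front; qualifiers are free-form.
        self.families = set(families)
        self.rows = defaultdict(lambda: defaultdict(list))

    def put(self, row_key, family, qualifier, value, timestamp):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        versions = self.rows[row_key][(family, qualifier)]
        versions.append((timestamp, value))
        versions.sort(key=lambda version: version[0], reverse=True)

    def get(self, row_key, family, qualifier, versions=1):
        """Return up to `versions` values for one cell, newest first."""
        cells = self.rows[row_key][(family, qualifier)]
        return [value for _, value in cells[:versions]]

table = ColumnFamilyTable(["User", "Usage"])
table.put(2, "User", "Name", "Eva", timestamp=1)
table.put(2, "Usage", "LastUse", "v1", timestamp=1)
table.put(2, "Usage", "LastUse", "v2", timestamp=2)

latest = table.get(2, "Usage", "LastUse")
history = table.get(2, "Usage", "LastUse", versions=2)
print(latest, history)
```

Because every cell of a row lives under one row key, keeping a transaction inside a single row means it never has to cross a region boundary.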
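SQL Database
- Category: Relational
- Storage Abstractions: Tables, rows, columns
- Maximum Database Size: 500 GB
- Query Language: SQL
- Transaction Support: All rows and tables in a database
- Secondary Indexes: Yes
- Stored Procedures/Triggers: Written in T-SQL
- Pricing: Units of throughput
DocumentDB
- Category: Document store
- Storage Abstractions: Collections, documents
- Maximum Database Size: 100s of TBs
- Query Language: Extended subset of SQL
- Transaction Support: All documents in the same collection
- Secondary Indexes: Yes
- Stored Procedures/Triggers: Written in JavaScript
- Pricing: Units of throughput
Tables
- Category: Key/value store
- Storage Abstractions: Tables, partitions, entities
- Maximum Database Size: 100s of TBs
- Query Language: Subset of OData queries
- Transaction Support: All entities in the same partition
- Secondary Indexes: No
- Stored Procedures/Triggers: None
- Pricing: GBs of storage
HDInsight HBase
- Category: Column family store
- Storage Abstractions: Tables, rows, columns, cells, column families
- Maximum Database Size: 100s of TBs
- Query Language: SQL subset w/ Phoenix
- Transaction Support: All cells in the same row
- Secondary Indexes: No
- Stored Procedures/Triggers: Written in Java
- Pricing: GBs of storage plus VMs per hour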
The Hadoop technology family
Hadoop Technologies
- Hive, Pig, …
- MapReduce, Tez, ...
- YARN
- HDFS
- HBase
- Storm
Azure HDInsight provides these as a managed service
An illustration
[Figure: Excel, and HDInsight running Hive, Pig, ... on top of a Tez/MapReduce job. The job's logic runs on several VMs, which access data stored in Azure Blobs through the HDFS API.]
HDInsight HBase is also implemented on this API and relies on Azure Blobs
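The job in this illustration runs the same logic on many VMs and then combines the results. A single-process sketch of the MapReduce pattern, using a word count as the job, is shown below. The input lines stand in for data read from blobs and are invented for illustration; a real job would be distributed across VMs by Hadoop.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)

# Hypothetical input; each string stands in for a chunk of blob data.
blobs = ["high heels high tops", "high arch"]

pairs = [pair for line in blobs for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
print(counts)
```

The map and reduce functions share no state, which is what lets the framework run them in parallel on separate VMs.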
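Different goals, different technologies
[Figure: a user works with an application. The application sends read/write requests to a relational/NoSQL store holding operational data and gets results back. It sends search requests to Azure Search, which holds indexes, and gets search results back.]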
Some examples
- Online retailer
  - Example: An online shoe store
- User-generated content site
  - Example: A discussion site for movie buffs
- Custom business application
  - Example: An employee benefits application
For an online shoe retailer
- Users expect suggestions
- Azure Search doesn’t provide any UI components
[Figure: the user types "high" and the search box suggests "high heels", "high tops", and "high arch"]
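The suggestion behaviour on this slide can be approximated with simple prefix matching. The sketch below is an assumption about the general idea only; the function `suggest` and the term list are hypothetical, and a search service would use its index rather than a linear scan.

```python
def suggest(prefix, terms, limit=3):
    """Return up to `limit` terms that start with the typed prefix."""
    prefix = prefix.lower()
    return [term for term in terms if term.lower().startswith(prefix)][:limit]

# Hypothetical list of suggestible terms for a shoe store.
terms = ["high heels", "high tops", "high arch", "boots", "sneakers"]

suggestions = suggest("high", terms)
print(suggestions)
```

Since Azure Search doesn’t provide any UI components, the application itself has to render the returned suggestions under the search box.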
For an online shoe retailer
- Results returned in a specific order
- Search terms shown in bold
Example results:
- Contoso Brand High Heels: Pumps, stilettos, and more
- Fabrikam’s Fancies: High heels for fun!
- High Heels for Everybody: Fashion, fashion, and more fashion
For an online shoe retailer
- Help with the user’s next search
- Requires more information than a simple text UI
[Figure: a results page with Color and Price filters beside the results]
Color: swatches with counts (34), (21), (22), (5), (19), (9), (11), (10)
Price:
- $100 or less (10)
- $100 - $250 (14)
- $250 and up (9)
Results:
- Contoso Pumps, $129.95
- Contoso Stilettos, $350.00
- Fabrikam Flower, $59.99
- Fabrikam Lipstick Heels, $489.95
Creating an index
[Figure: the application sends a create index request containing a schema to Azure Search, which creates the index]
For an online shoe retailer
Name | Type | Other Attributes
Category | String | Searchable, Suggestions, Sortable, Retrievable, Filterable
Brand | String | Searchable, Suggestions, Sortable, Retrievable, Filterable
Style | String | Searchable, Suggestions, Sortable, Retrievable, Filterable
Color | Collection(String) | Searchable, Suggestions, Retrievable, Filterable, Facetable
Price | Double | Searchable, Sortable, Retrievable, Filterable, Facetable
Picture | String | Retrievable (holds a URL to an Azure blob)
Stock | Int32 | (for ordering search results)
Promotion | Boolean | (for ordering search results)
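An index schema like this one is essentially a list of typed fields, each tagged with the operations it supports. The sketch below models that list as plain Python data so the attributes can be queried. It is not the Azure Search index definition format: the class `IndexField` and the helper `fields_with` are hypothetical, the pairing of attributes with Price and Picture follows the order of the slide text, and Stock and Promotion are left without attributes because the slide does not show them.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IndexField:
    name: str
    type: str
    attributes: frozenset = frozenset()

# Attributes shared by the Category, Brand, and Style fields on the slide.
TEXT_FIELD = frozenset(
    {"Searchable", "Suggestions", "Sortable", "Retrievable", "Filterable"}
)

schema = [
    IndexField("Category", "String", TEXT_FIELD),
    IndexField("Brand", "String", TEXT_FIELD),
    IndexField("Style", "String", TEXT_FIELD),
    IndexField("Color", "Collection(String)", frozenset(
        {"Searchable", "Suggestions", "Retrievable", "Filterable", "Facetable"})),
    IndexField("Price", "Double", frozenset(
        {"Searchable", "Sortable", "Retrievable", "Filterable", "Facetable"})),
    IndexField("Picture", "String", frozenset({"Retrievable"})),
    # The slide does not show attributes for these two fields.
    IndexField("Stock", "Int32"),
    IndexField("Promotion", "Boolean"),
]

def fields_with(attribute):
    """Names of the fields that carry the given attribute, in schema order."""
    return [f.name for f in schema if attribute in f.attributes]

facetable = fields_with("Facetable")
sortable = fields_with("Sortable")
print(facetable, sortable)
```

Listing the facetable fields shows which filters a results page could offer; here they are Color and Price, matching the filters on the earlier slide.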
Populating an index
[Figure: the application provides data to Azure Search, which adds it to the index]
For an online shoe retailer
Category | Brand | Style | Color | Price | Picture | Stock | Promotion
Sneakers | Contoso | HiTops | White, Black | $29.95 | http://... | 194 | True
High Heels | Contoso | Pumps | Red | $129.95 | http://... | 285 | True
High Heels | Contoso | Stilettos | Red, Black | $350.00 | http://... | 23 | True
Boots | Contoso | Beatle | White, Black | $134.79 | http://... | 100 | True
High Heels | Fabrikam | Lipstick Heels | Pink | $489.95 | http://... | 10 | False
High Heels | Fabrikam | Flower | Red | $59.99 | http://... | 158 | False
Tuxedo | Fabrikam | Black | Black | $500.00 | http://... | 34 | False
... | ... | ... | ... | ... | ... | ... | ...
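Searching
[Figure: the user enters "high heels". The application sends a search request (text=“high heels”) to Azure Search, which consults the index and returns a search result. The application displays the results with Price and Color filters.]
Price: $100 or less; $100 - $250; $250 and up
Results:
- Contoso Pumps, $129.95
- Contoso Stilettos, $350.00
- Fabrikam Flower, $59.99
- Fabrikam Lipstick Heels, $489.95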
For an online shoe retailer
Returned by a search for “high heels” (the High Heels rows of the index):
Category | Brand | Style | Color | Price | Picture | Stock | Promotion
High Heels | Contoso | Pumps | Red | $129.95 | http://... | 285 | True
High Heels | Contoso | Stilettos | Red, Black | $350.00 | http://... | 23 | True
High Heels | Fabrikam | Lipstick Heels | Pink | $489.95 | http://... | 10 | False
High Heels | Fabrikam | Flower | Red | $59.99 | http://... | 158 | False
For an online shoe retailer
Results, in the order shown:
- Contoso Pumps, $129.95
- Contoso Stilettos, $350.00
- Fabrikam Flower, $59.99
- Fabrikam Lipstick Heels, $489.95
Why this order:
- Contoso is first because of the promotion
- Pumps are first because there are more in stock
Filters (made possible by facets):
- Color: swatches with counts (34), (21), (22), (5), (19), (9), (11), (10)
- Price: $100 or less (10); $100 - $250 (14); $250 and up (9)
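The ordering and facet counts described above can be reproduced over the sample index rows with a short sketch. It assumes a simplified search that matches the text against the Category field only, orders by promotion and then by stock, and puts a price of exactly $250 in the top bucket. The function names are hypothetical, and a real search service would do full-text matching and scoring instead.

```python
# Rows from the sample index (Color and Picture omitted for brevity).
products = [
    {"category": "Sneakers", "brand": "Contoso", "style": "HiTops", "price": 29.95, "stock": 194, "promotion": True},
    {"category": "High Heels", "brand": "Contoso", "style": "Pumps", "price": 129.95, "stock": 285, "promotion": True},
    {"category": "High Heels", "brand": "Contoso", "style": "Stilettos", "price": 350.00, "stock": 23, "promotion": True},
    {"category": "Boots", "brand": "Contoso", "style": "Beatle", "price": 134.79, "stock": 100, "promotion": True},
    {"category": "High Heels", "brand": "Fabrikam", "style": "Lipstick Heels", "price": 489.95, "stock": 10, "promotion": False},
    {"category": "High Heels", "brand": "Fabrikam", "style": "Flower", "price": 59.99, "stock": 158, "promotion": False},
    {"category": "Tuxedo", "brand": "Fabrikam", "style": "Black", "price": 500.00, "stock": 34, "promotion": False},
]

def search(text):
    """Match on category, then order by promotion first and stock second."""
    hits = [p for p in products if text.lower() in p["category"].lower()]
    # False sorts before True, so "not promotion" puts promoted items first.
    return sorted(hits, key=lambda p: (not p["promotion"], -p["stock"]))

def price_facets(hits):
    """Count hits per price range, using the ranges shown on the slide."""
    facets = {"$100 or less": 0, "$100 - $250": 0, "$250 and up": 0}
    for p in hits:
        if p["price"] <= 100:
            facets["$100 or less"] += 1
        elif p["price"] < 250:
            facets["$100 - $250"] += 1
        else:
            facets["$250 and up"] += 1
    return facets

hits = search("high heels")
order = [(p["brand"], p["style"]) for p in hits]
facets = price_facets(hits)
print(order)
print(facets)
```

The facet counts here cover only the four sample rows, so they are smaller than the counts on the slide, which presumably reflect a larger catalog.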
The traditional world
- Operational Data: Relational Database, used by applications for end users
- Analytical Data: Relational Data Warehouse, used by BI applications for internal users
A modern view
- Operational Data: Relational Database, NoSQL Store (used by applications for end users)
- Analytical Data: Relational Data Warehouse, Unstructured Data (used by BI applications for internal users)
- Search Data: Indexes
Microsoft Azure technologies
Analytical Data
- Relational Data Warehouse: SQL Server in an IaaS VM
- Unstructured Data: HDInsight Hadoop
Operational Data
- Relational Database: SQL Database, SQL Server in an IaaS VM
- NoSQL Store: DocumentDB, Tables, HDInsight HBase
Search Data
- Indexes: Search
Legend:
- Managed service provided by Azure
- Software that can run in Azure virtual machines
Options in the new world
Use SQL Database when:
- You want relational data
- You want to get your application up as fast as possible
- You want your application to require minimal management
Use DocumentDB, Tables, or HDInsight HBase when:
- You need more scale than relational allows
- You want a nonrelational data model
- You don’t want to be locked into a schema
Use HDInsight Hadoop Tez/MapReduce when:
- You want to do big-data analytics
Use Azure Search when:
- You want to provide a search interface to your users
David Chappell is Principal of Chappell & Associates
(www.davidchappell.com) in San Francisco, California. Through his
speaking, writing, and consulting, he helps people around the world
understand, use, and make better decisions about new technology.
David has been the keynote speaker for more than a hundred events
and conferences on five continents, and his seminars have been
attended by tens of thousands of IT leaders, architects, and developers
in fifty countries. His books have been published in a dozen languages
and used regularly in courses at MIT, ETH Zurich, and other universities.
In his consulting practice, he has helped clients such as Hewlett-Packard,
IBM, Microsoft, Stanford University, and Target Corporation adopt new
technologies, market new products, and educate their customers and
staff.
http://myignite.microsoft.com

NoSQL on Microsoft Azure: An introduction