The Complete Guide to Building Cloud Computing Solutions with Amazon SimpleDB
Using SimpleDB, any organization can leverage Amazon Web Services (AWS), Amazon’s powerful cloud-based computing platform, and dramatically reduce the cost and resources associated with application infrastructure. Now, for the first time, there’s a complete developer’s guide to building production solutions with Amazon SimpleDB.
Pioneering SimpleDB developer Mocky Habeeb brings together all the hard-to-find information you need to succeed. Mocky tours the SimpleDB platform and APIs, explains their essential characteristics and tradeoffs, and helps you determine whether your applications are appropriate for SimpleDB. Next, he walks you through all aspects of writing, deploying, querying, optimizing, and securing Amazon SimpleDB applications, from the basics through advanced techniques.
Throughout, Mocky draws on his unsurpassed experience supporting developers on SimpleDB’s official Web forums. He offers practical tips and answers that can’t be found anywhere else, and presents extensive working sample code, from snippets to complete applications.
With A Developer’s Guide to Amazon SimpleDB you will be able to
This book will be an indispensable resource for every IT professional evaluating or using SimpleDB to build cloud-computing applications, clients, or frameworks.
Preface xvi
Acknowledgments xviii
1 Introducing Amazon SimpleDB 1
What Is SimpleDB? 1
What SimpleDB Is Not 1
Schema-Less Data 2
Stored Securely in the Cloud 2
Billed Only for Actual Usage 3
Domains, Items, and Attribute Pairs 3
Multi-Valued Attributes 3
Queries 4
High Availability 4
Database Consistency 5
Sizing Up the SimpleDB Feature Set 6
Benefits of Using SimpleDB 6
Database Features SimpleDB Doesn’t Have 7
Higher-Level Framework Functionality 7
Service Limits 8
Abandoning the Relational Model? 8
A Database Without a Schema 9
Areas Where Relational Databases Struggle 10
Scalability Isn’t Your Problem 11
Avoiding the SimpleDB Hype 11
Putting the DBA Out of Work 12
Dodging Copies of C.J. Date 13
Other Pieces of the Puzzle 14
Adding Compute Power with Amazon EC2 14
Storing Large Objects with Amazon S3 14
Queuing Up Tasks with Amazon SQS 15
Comparing SimpleDB to Other Products and Services 15
Windows Azure Platform 15
Google App Engine 17
Apache CouchDB 17
Dynamo-Like Products 18
Compelling Use Cases for SimpleDB 18
Web Services for Connected Systems 18
Low-Usage Application 19
Clustered Databases Without the Time Sink 19
Dynamic Data Application 19
Amazon S3 Content Search 20
Empowering the Power Users 20
Existing AWS Customers 20
Summary 21
2 Getting Started with SimpleDB 23
Gaining Access to SimpleDB 23
Creating an AWS Account 23
Signing Up for SimpleDB 24
Managing Account Keys 24
Finding a Client for SimpleDB 24
Building a SimpleDB Domain Administration Tool 25
Administration Tool Features 25
Key Storage 25
Implementing the Base Application 26
Displaying a Domain List 28
Adding Domain Creation 28
Supporting Domain Deletion 29
Listing Domain Metadata 29
Running the Tool 31
Packaging the Tool as a Jar File 31
Building a User Authentication Service 31
Integrating with the Spring Security Framework 32
Representing User Data 32
Fetching User Data with SimpleDBUserService 34
Salting and Encoding Passwords 36
Creating a User Update Tool 37
Summary 39
3 A Code-Snippet Tour of the SimpleDB API 41
Selecting a SimpleDB Client 41
Typica Setup in Java 42
C# Library for Amazon SimpleDB Setup 43
Tarzan Setup in PHP 45
Common Concepts 45
The Language Gap 45
SimpleDB Endpoints 45
SimpleDB Service Versions 47
Common Response Elements 47
CreateDomain 48
CreateDomain Parameters 49
CreateDomain Response Data 49
CreateDomain Snippet in Java 49
CreateDomain Snippet in C# 50
CreateDomain Snippet in PHP 50
ListDomains 51
ListDomains Parameters 51
ListDomains Response Data 51
ListDomains Snippet in Java 52
ListDomains Snippet in C# 52
ListDomains Snippet in PHP 53
DeleteDomain 54
DeleteDomain Parameters 54
DeleteDomain Response Data 54
DeleteDomain Snippet in Java 55
DeleteDomain Snippet in C# 55
DeleteDomain Snippet in PHP 55
DomainMetadata 56
DomainMetadata Parameters 56
DomainMetadata Response Data 56
DomainMetadata Snippet in Java 57
DomainMetadata Snippet in C# 58
DomainMetadata Snippet in PHP 58
PutAttributes 59
PutAttributes Parameters 60
PutAttributes Response Data 62
PutAttributes Snippet in Java 63
PutAttributes Snippet in C# 64
PutAttributes Snippet in PHP 65
GetAttributes 65
GetAttributes Parameters 65
GetAttributes Response Data 66
GetAttributes Snippet in Java 67
GetAttributes Snippet in C# 68
GetAttributes Snippet in PHP 69
DeleteAttributes 70
DeleteAttributes Parameters 70
DeleteAttributes Response Data 71
DeleteAttributes Snippet in Java 72
DeleteAttributes Snippet in C# 72
DeleteAttributes Snippet in PHP 73
BatchPutAttributes 73
BatchPutAttributes Parameters 74
BatchPutAttributes Response Data 75
BatchPutAttributes Snippet in Java 76
BatchPutAttributes Snippet in C# 77
BatchPutAttributes Snippet in PHP 78
Select 79
Select Parameters 79
Select Response Data 80
Select Snippet in Java 81
Select Snippet in C# 83
Select Snippet in PHP 85
Summary 86
4 A Closer Look at Select 87
Select Syntax 87
Required Clauses 88
Select Quoting Rule for Names 88
Output Selection Clause 89
WHERE Clause 90
Select Quoting Rules for Values 90
Sort Clause 91
LIMIT Clause 92
Formatting Attribute Data for Select 93
Integer Formatting 94
Floating Point Formatting 95
Date and Time Formatting 95
Case Sensitivity 97
Expressions and Predicates 97
Simple Comparison Operators 98
Range Operators 98
IN() Queries 99
Prefix Queries with LIKE and NOT LIKE 99
IS NULL and IS NOT NULL 100
Multi-Valued Attribute Queries 100
Multiple Predicate Queries with the INTERSECTION Operator 101
Selection with EVERY() 102
Query Results with the Same Item Multiple Times 102
Improving Query Performance 103
Attribute Indexes 103
Composite Attributes 104
Judicious Use of LIKE 105
Running on EC2 106
Skipping Pages with count() and LIMIT 106
Measuring Select Performance 107
Automating Performance Measurements 109
Summary 110
5 Bulk Data Operations 111
Importing Data with BatchPutAttributes 112
Calling BatchPutAttributes 112
Mapping the Import File to SimpleDB Attributes 112
Supporting Multiple File Formats 113
Storing the Mapping Data 113
Reporting Import Progress 113
Creating Right-Sized Batches 114
Managing Concurrency 114
Resuming a Stopped Import 115
Verifying Progress and Completion 115
Properly Handling Character Encodings 116
Backup and Data Export 116
Using Third-Party Backup Services 117
Writing Your Own Backup Tool 118
Restoring from Backup 119
Summary 119
6 Working Beyond the Boundaries 121
Availability: The Final Frontier 121
Boundaries of Eventual Consistency 123
Item-Level Atomicity 123
Looking into the Eventual Consistency Window 124
Read-Your-Writes 125
Implementing a Consistent View 125
Handling Text Larger Than 1K 128
Storing Text in S3 128
Storing Overflow in Different Attributes 129
Storing Overflow as a Multi-Valued Attribute 130
Entities with More than 256 Attributes 131
Paging to Arbitrary Query Depth 131
Exact Counting Without Locks or Transactions 133
Using One Item Per Count 134
Storing the Count in a Multi-Valued Attribute 136
Testing Strategies 138
Designing for Testability 138
Alternatives to Live Service Calls 139
Summary 139
7 Planning for the Application Lifecycle 141
Capacity Planning 141
Estimating Initial Costs 141
Keeping Tabs on SimpleDB Usage with AWS Usage Reports 142
Creating More Finely Detailed Usage Reports 145
Tracking Usage over Time 146
Storage Requirements 146
Computing Storage Costs 147
Understanding the Cost of Slack Space 147
Evaluating Attribute Concatenation 148
Scalability: Increasing the Load 148
Planning Maintenance 150
Using Read-Repair to Apply Formatting Changes 150
Using Read-Repair to Update Item Layout 152
Using a Batch Process to Apply Updates 152
Summary 153
8 Security in SimpleDB-Based Applications 155
Account Security 155
Managing Access Within the Organization 155
Limiting Amazon Access from AWS Credentials 157
Boosting Security with Multi-Factor Authentication 158
Access Key Security 159
Key Management 159
Secret Key Rotation 160
Data Security 161
Storing Clean Data 161
SSL and Data in Transmission 162
Data Storage and Encryption 164
Storing Data in Multiple Locations 165
Summary 165
9 Increasing Performance 167
Determining If SimpleDB Is Fast Enough 167
Targeting Moderate Performance in Small Projects 167
Exploiting Advanced Features in Small Projects 168
Speeding Up SimpleDB 169
Taking Detailed Performance Measurements 169
Accessing SimpleDB from EC2 169
Caching 170
Concurrency 172
Keeping Requests and Responses Small 173
Operation-Specific Performance 174
Optimizing GetAttributes 174
Optimizing PutAttributes 178
Optimizing BatchPutAttributes 179
Optimizing Select 180
Data Sharding 181
Partitioning Data 181
Multiplexing Queries 181
Accessing SimpleDB Outside the Amazon Cloud 182
Working Around Latency 182
Ignoring Latency 183
Summary 183
10 Writing a SimpleDB Client: A Language-Independent Guide 185
Client Design Overview 185
Public Interface 186
Attribute Class 188
Item Class 190
Client Design Considerations 191
High-Level Design Issues 191
Operation-Specific Considerations 193
Implementing the Client Code 196
Safe Handling of the Secret Key 196
Implementing the Constructor 197
Implementing the Remaining Methods 198
Making Requests 200
Computing the Signature 208
Making the Connections 210
Parsing the Response 214
Summary 216
11 Improving the SimpleDB Client 217
Convenience Methods 217
Convenient Count Methods 217
Select with a Real Limit 219
Custom Metadata and Building a Smarter Client 219
Justifying a Schema for Numeric Data 220
Database Tools 221
Coordinating Concurrent Clients 221
Storing Custom Metadata within SimpleDB 221
Storing Custom Metadata in S3 222
Automatically Optimizing for Box Usage Cost 222
The Exponential Cost of Write Operations 223
QueryTimeout: The Most Expensive Way to Get Nothing 225
Automated Domain Sharding 228
Domain Sharding Overview 228
Put/Get/Delete Routing 228
Query Multiplexing 231
Summary 232
12 Building a Web-Based Task List 233
Application Overview 233
Requirements 233
The Data Model 234
Implementing User Authentication 235
Implementing a Task Workspace 238
Implementing a Task Service 241
Adding the Login Servlet 244
Adding the Logout Servlet 249
Displaying the Tasks 249
Adding New Tasks 252
Deployment 252
Summary 254
Index 255