Java Reference Guide

MongoDB

Last updated Mar 14, 2003.

As modern software developers, we are intimately familiar with relational databases. We understand how to normalize a database elegantly, and if we've been working with databases for some time, we understand how to de-normalize one to improve performance. We are familiar with SQL and with joining tables to obtain the information we seek. We are also familiar with the mismatch between relational data and object-oriented programming, which led to Object-Relational Mapping (ORM) tools like Hibernate that bridge the two. Relational databases are the underpinnings that sit below most of our enterprise applications.

If relational databases are so fundamental to our applications, then where did "NoSQL" databases such as MongoDB come from? The simple answer is that relational models and the transactional overhead of relational databases are an impediment to scalability. Granted, relational databases can scale quite large to host terabytes of information, but this is not the level of scalability to which I am referring. Consider storing user feeds on a social networking site like Facebook or Twitter. In this case we're not talking about terabytes of data, but rather petabytes of data, and growing! In such an environment, a relational database simply cannot store and provide access to this much information in an acceptable manner.

There are certain types of database interactions that necessitate transactional integrity, such as credit card payments, but for the vast majority of database interactions in large-scale web applications, such as social networks or even product catalogs, transactional integrity is not a strict requirement. If you could remove transactions and de-normalize such a model, then you could distribute data across hundreds, or even thousands, of machines and intelligently find that data when it is requested.

MongoDB is a document database designed from its inception to support modern web applications and horizontal scalability, meaning that it can easily be distributed across dozens of machines as application needs grow. But document databases require a change in mindset, from a relational representation of data to a document-oriented one. Let's consider the user feeds referenced above. Listing 1 shows a document that might represent a feed posting.

Listing 1. Feed Posting Document

{ _id: ObjectId('12345...'),
  message: 'Feeling good today',
  user: 'shaines',
  picture: {
    url: 'http://media.geekcap.com/pictures/pic.jpg',
    title: 'Beautiful Sunrise'
  },
  comments:  [
    {user: 'michael',
     message: 'Good to hear'},

    {user: 'rebecca',
     message: 'That makes me happy'}
  ]
}

Listing 1 probably looks far more natural than the way you would model a post relationally. The document contains the user who posted the message, the message itself, a picture associated with the post, and a list of comments. The document is shown in JSON-style notation, which is very readable; internally, MongoDB stores documents in BSON, a binary form of JSON.
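To make the shape of Listing 1 concrete in Java terms, the following is a minimal sketch that builds the same nested structure with plain java.util collections. It does not use the MongoDB driver, the class and method names (FeedDocumentSketch, buildPost, comment) are hypothetical, and the _id field is left out because the listing elides its value.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: models the Listing 1 document with plain collections.
final class FeedDocumentSketch {

    // Builds the nested structure of Listing 1 (the _id field is omitted).
    static Map<String, Object> buildPost() {
        Map<String, Object> picture = new LinkedHashMap<>();
        picture.put("url", "http://media.geekcap.com/pictures/pic.jpg");
        picture.put("title", "Beautiful Sunrise");

        Map<String, Object> post = new LinkedHashMap<>();
        post.put("message", "Feeling good today");
        post.put("user", "shaines");
        post.put("picture", picture); // embedded sub-document
        post.put("comments", List.of( // embedded array of sub-documents
                comment("michael", "Good to hear"),
                comment("rebecca", "That makes me happy")));
        return post;
    }

    // Builds one comment sub-document.
    static Map<String, Object> comment(String user, String message) {
        Map<String, Object> c = new LinkedHashMap<>();
        c.put("user", user);
        c.put("message", message);
        return c;
    }

    public static void main(String[] args) {
        Map<String, Object> post = buildPost();
        List<?> comments = (List<?>) post.get("comments");
        System.out.println(post.get("user") + " posted: " + post.get("message"));
        System.out.println("Comments embedded in the post: " + comments.size());
    }
}
```

The point of the sketch is that the picture and the comments travel with the post as nested values, so reading the post gives you everything needed to display it.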

To contrast this against a relational model, Figure 1 shows a domain model that might represent this same information in a relational way.

Figure 1 Feed Relational Domain Model

To remove the chance of duplicating data, the domain model presented in Figure 1 is normalized. It is a clean relational representation of the data, but as a result we need several joins, or even separate queries (for comments), to build a complete feed to display to the user. The document, on the other hand, includes all of the data in a single record. For example, to load all of a user's feeds from the relational model we would perform a query similar to the following:

SELECT * FROM feed 
INNER JOIN user ON feed.user_id = user.id
INNER JOIN picture ON feed.id = picture.feed_id
WHERE user.username = 'shaines';

And then to get comments you would need to iterate over the feeds and load the comments:

SELECT * FROM comment WHERE feed_id = ?
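The cost of that second step is easier to see when it is counted. The sketch below is plain Java with in-memory lists standing in for the tables; it makes no database calls, and the rows, the queriesIssued counter, and the class and method names are all hypothetical. It assembles complete posts the way the two SQL statements above would: one query for the feeds, then one more per feed for its comments.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: in-memory stand-ins for the feed and comment tables.
final class RelationalFeedSketch {

    // Each map is one row; the rows themselves are made up for illustration.
    static final List<Map<String, Object>> FEED = List.of(
            Map.of("id", 1, "message", "Feeling good today"),
            Map.of("id", 2, "message", "Another post"));

    static final List<Map<String, Object>> COMMENT = List.of(
            Map.of("feed_id", 1, "message", "Good to hear"),
            Map.of("feed_id", 1, "message", "That makes me happy"));

    // Counts how many simulated queries have been issued.
    static int queriesIssued = 0;

    // Stand-in for: SELECT * FROM comment WHERE feed_id = ?
    static List<Map<String, Object>> commentsFor(int feedId) {
        queriesIssued++;
        List<Map<String, Object>> rows = new ArrayList<>();
        for (Map<String, Object> row : COMMENT) {
            if (row.get("feed_id").equals(feedId)) {
                rows.add(row);
            }
        }
        return rows;
    }

    // Assembles complete posts: one query for the feeds, then one per feed.
    static List<Map<String, Object>> loadFeeds() {
        queriesIssued++; // stand-in for the joined SELECT over feed, user, picture
        List<Map<String, Object>> posts = new ArrayList<>();
        for (Map<String, Object> row : FEED) {
            Map<String, Object> post = new LinkedHashMap<>(row);
            post.put("comments", commentsFor((Integer) row.get("id")));
            posts.add(post);
        }
        return posts;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> posts = loadFeeds();
        System.out.println("Posts loaded: " + posts.size());
        System.out.println("Queries issued: " + queriesIssued);
    }
}
```

With N feeds this pattern issues N + 1 queries, which is the overhead that the single-document representation avoids.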

Now let's look at using MongoDB's query language to find feeds written by me:

db.feed.find( {'user': 'shaines'} );

This statement would return all documents where the user property is 'shaines'. It is a different way of thinking of things, but you have to admit that it is pretty clean.
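For readers who think more readily in Java than in query syntax, here is a rough sketch of what that filter means. It is not the MongoDB Java driver and not how MongoDB evaluates queries internally; it is a hypothetical in-memory find method that handles only equality on top-level fields, which is all the query above uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: mimics equality matching on top-level fields only.
final class FindSketch {

    // Returns every document whose fields equal all of the fields in the query.
    static List<Map<String, Object>> find(List<Map<String, Object>> collection,
                                          Map<String, Object> query) {
        List<Map<String, Object>> matches = new ArrayList<>();
        for (Map<String, Object> doc : collection) {
            boolean allMatch = true;
            for (Map.Entry<String, Object> condition : query.entrySet()) {
                // A missing field yields null, which never equals the wanted value.
                if (!condition.getValue().equals(doc.get(condition.getKey()))) {
                    allMatch = false;
                    break;
                }
            }
            if (allMatch) {
                matches.add(doc);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up documents standing in for the feed collection.
        List<Map<String, Object>> feed = List.of(
                Map.of("user", "shaines", "message", "Feeling good today"),
                Map.of("user", "michael", "message", "Good to hear"));

        // Analogous to db.feed.find( {'user': 'shaines'} )
        List<Map<String, Object>> mine = find(feed, Map.of("user", "shaines"));
        System.out.println("Matching documents: " + mine.size());
    }
}
```

The query is itself a small document describing the fields you want, and an empty query matches everything.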

This series will demonstrate how to set up and configure MongoDB, how to interact with it from Java, and some of the libraries that make working with MongoDB easier.

Continue to MongoDB Setup.