Programming Neo4j with Java

In the second article of a three-part series, Steven Haines discusses two ways of using Java to program for Neo4j: via the Core Java API, and via the Traversal API.

This is the second article of a three-part series on using Java and Neo4j. This article focuses on programming Neo4j with Java. There are three primary interfaces for working with Neo4j:

  • Core Java API
  • Traversal API
  • Cypher queries

This article explores the first two: manually manipulating Neo4j using the Java API, and traversing a Neo4j object graph by using the Traversal API. The final article in this series dives into the details of writing Cypher queries.

This article builds on the users-and-movies database example defined in the first article in this series, "Introduction to Neo4j." Users can have IS_FRIEND_OF relationships with other users, and users can have HAS_SEEN relationships with movies. The HAS_SEEN relationship can also carry a "stars" property to reflect how much the user liked the movie.

Core Java API

To keep things simple, we'll stick to using Neo4j as an embedded database for this article. The first step is to set up a new project. I'm using Maven:

mvn archetype:generate -DgroupId=com.geekcap.informit -DartifactId=neo4j-sample-app

Next, we add the Neo4j Maven dependency (groupId org.neo4j, artifactId neo4j) to our POM file.

As presented in the previous article, the primary interface for interacting with Neo4j is the GraphDatabaseService. Let's create an embedded database that stores its database files in the data directory, relative to where we launch our application:

GraphDatabaseService graphDB = new GraphDatabaseFactory().newEmbeddedDatabase("data");

The GraphDatabaseFactory creates an embedded database by executing its newEmbeddedDatabase() method, passing it the relative or absolute path to the database files.

Creating Nodes and Relationships

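In Neo4j 2.x, every operation on the graph must run inside a transaction opened with graphDB.beginTx(); the snippets in this article omit that wrapper for brevity. To create a new node using the Core Java API, execute the GraphDatabaseService's createNode() method: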

Node myNode = graphDB.createNode();

Each node can contain a set of properties, where the key is a String and the value can be any of the following: String, char, boolean, byte, short, int, long, double, float, or an array of one of those types. If this node is meant to be a user, we might set the user's name as follows:

myNode.setProperty("name", "Steve");

A node can contain any number of properties: Node properties contain the data in your database, where the type of the node is analogous to a table and a node instance is analogous to a row in that table. If a node was meant to represent a user, we might assign it a "type" of "USER", but Neo4j provides a better way: labels.

Neo4j's labels allow you to group related nodes. A node can have zero or more labels, so you are not constrained to a single label or categorization of nodes. Furthermore, the Neo4j Core Java API allows you to find all nodes that contain a specified label or even all nodes that have a specified label and property name/value combination. Labels are defined using the Label interface:

public interface Label {
    String name();
}

It's easiest to define labels using an enum, such as the following:

public enum Labels implements Label {
    USER,
    MOVIE
}

You can add labels to nodes by invoking the addLabel() method of the Node class:

myNode.addLabel( Labels.USER );

Or you can pass a single Label or an array of Labels to the GraphDatabaseService's createNode() method:

Node myNode = graphDB.createNode(Labels.USER);

The following code segment creates four users:

Node steve = graphDB.createNode(Labels.USER);
steve.setProperty("name", "Steve");
Node linda = graphDB.createNode(Labels.USER);
linda.setProperty("name", "Linda");
Node michael = graphDB.createNode(Labels.USER);
michael.setProperty("name", "Michael");
Node rebecca = graphDB.createNode(Labels.USER);
rebecca.setProperty("name", "Rebecca");

Now let's create some relationships between these nodes. Relationships are defined by a RelationshipType interface:

public interface RelationshipType {
    String name();
}

Just as with labels, we can create our relationship types with an enum:

public enum RelationshipTypes implements RelationshipType {
    IS_FRIEND_OF,
    HAS_SEEN
}

Relationships can be created by invoking the node's createRelationshipTo() method:

steve.createRelationshipTo(michael, RelationshipTypes.IS_FRIEND_OF);
steve.createRelationshipTo(rebecca, RelationshipTypes.IS_FRIEND_OF);
steve.createRelationshipTo(linda, RelationshipTypes.IS_FRIEND_OF);

In this case, I have created IS_FRIEND_OF relationships to various members of my family. Relationships are directed, meaning that Steve can create a friend relationship with Linda, but Linda does not necessarily need to create a friend relationship with Steve. (But let's hope that my wife considers me a friend!) Even though they are directed, you can traverse across the relationship honoring the direction or not. For example, if I find all IS_FRIEND_OF relationships for Linda, I will find Steve.
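
The point about direction can be illustrated without a database. The following is a minimal plain-Java sketch, not Neo4j code: the Dir enum, the Edge class, and the neighbors() helper are hypothetical stand-ins invented for this illustration. It stores the three friendships created above as directed edges and shows that Linda has no outgoing friendships, yet Steve is still found when direction is ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DirectionSketch {

    // Hypothetical stand-ins for illustration only; these are not Neo4j types.
    enum Dir { INCOMING, OUTGOING, BOTH }

    static final class Edge {
        final String from;
        final String to;

        Edge(String from, String to) {
            this.from = from;
            this.to = to;
        }
    }

    // The three IS_FRIEND_OF relationships from the article, all starting at Steve.
    static final List<Edge> FRIENDSHIPS = Arrays.asList(
            new Edge("Steve", "Michael"),
            new Edge("Steve", "Rebecca"),
            new Edge("Steve", "Linda"));

    // Returns the node at the other end of every edge that touches `name`
    // in the requested direction.
    static List<String> neighbors(String name, Dir dir) {
        List<String> result = new ArrayList<>();
        for (Edge e : FRIENDSHIPS) {
            if (dir != Dir.INCOMING && e.from.equals(name)) {
                result.add(e.to);   // edge leaves `name`
            }
            if (dir != Dir.OUTGOING && e.to.equals(name)) {
                result.add(e.from); // edge arrives at `name`
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(neighbors("Linda", Dir.OUTGOING)); // []
        System.out.println(neighbors("Linda", Dir.BOTH));     // [Steve]
        System.out.println(neighbors("Steve", Dir.OUTGOING)); // [Michael, Rebecca, Linda]
    }
}
```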

Now let's create a few movies:

Node divergent = graphDB.createNode(Labels.MOVIE);
divergent.setProperty("name", "Divergent");
Node hero = graphDB.createNode(Labels.MOVIE);
hero.setProperty("name", "Big Hero 6");
Node cinderella = graphDB.createNode(Labels.MOVIE);
cinderella.setProperty("name", "Cinderella");
Node interview = graphDB.createNode(Labels.MOVIE);
interview.setProperty("name", "The Interview");

Before we define relationships between our users and our movies, I want to introduce an additional feature of relationships: Relationships can have their own properties. In this example, we want to define HAS_SEEN relationships between users and movies, but we'll also add a stars property to the relationship so that we can see how much each user liked the movie. For example, I might give Cinderella a four-star rating, but my daughter Rebecca would definitely give it a five-star rating. We'll use the following helper method that allows users to see and rate movies:

public static Relationship seeMovie( Node user, Node movie, int stars ) {
    // Create the HAS_SEEN relationship and record the rating on it
    Relationship relationship = user.createRelationshipTo( movie, RelationshipTypes.HAS_SEEN );
    relationship.setProperty( "stars", stars );
    return relationship;
}

We create a new relationship, of type HAS_SEEN, between the specified user and the specified movie, and then we add a new property named stars with the specified integer value.

Now we can define a few HAS_SEEN relationships:

seeMovie( steve, divergent, 5 );
seeMovie( steve, hero, 5 );
seeMovie( steve, cinderella, 4 );
seeMovie( steve, interview, 1 );
seeMovie( rebecca, hero, 5 );
seeMovie( rebecca, cinderella, 5 );
seeMovie( michael, hero, 5 );
seeMovie( michael, cinderella, 3 );
seeMovie( linda, divergent, 4 );
seeMovie( linda, hero, 5 );
seeMovie( linda, cinderella, 5 );

Traversing Across Nodes and Relationships

At this point we have a pretty good graph of four users, four movies, IS_FRIEND_OF relationships, and HAS_SEEN relationships. Let's start using the Core Java API to traverse some of these nodes and relationships. First we'll show all users and all movies in the database:

// Find all movies in our database
ResourceIterator<Node> movies = graphDB.findNodes( Labels.MOVIE );
System.out.println( "Movies:" );
while( movies.hasNext() ) {
    Node movie = movies.next();
    System.out.println( "\t" + movie.getProperty( "name" ) );
}

// Find all users
ResourceIterator<Node> users = graphDB.findNodes( Labels.USER );
System.out.println( "Users:" );
while( users.hasNext() ) {
    Node user = users.next();
    System.out.println( "\t" + user.getProperty( "name" ) );
}

As this example shows, the GraphDatabaseService provides a findNodes() method to which you can pass a Label, and it will return all nodes of that type. The findNodes() method returns a ResourceIterator of Nodes, which can be traversed using the hasNext()/next() paradigm. To display the results, we retrieve the name property from the node, using the node's getProperty() method. The output from running this is the following:

Movies:
      Divergent
      Big Hero 6
      Cinderella
      The Interview
Users:
      Steve
      Linda
      Michael
      Rebecca

Now let's try something a little more complicated: We'll retrieve all movies and then compute the average rating of each movie by looking at the number of stars assigned by everyone who has seen it:

// Compute average movie rating
movies = graphDB.findNodes( Labels.MOVIE );
System.out.println( "Movie Ratings:" );
while( movies.hasNext() ) {
    Node movie = movies.next();

    // Follow all HAS_SEEN relationships and get their star rating
    Iterable<Relationship> relationships = movie.getRelationships(
         Direction.INCOMING, RelationshipTypes.HAS_SEEN );
    int totalStars = 0;
    int relationshipCount = 0;
    for( Relationship relationship : relationships ) {
        Integer stars = ( Integer )relationship.getProperty( "stars" );
        totalStars += stars;
        // Count each viewer so that we can compute the average
        relationshipCount++;
    }
    System.out.println( "\t" + movie.getProperty( "name" ) + ", Viewers: " +
          relationshipCount + ", Average rating: " +
          (float)totalStars/relationshipCount );
}

We begin by calling the findNodes() method with the MOVIE label to find all movies; but this time, as we iterate over the results, we call each movie's getRelationships() method, passing two parameters:

  • Direction: The direction can be INCOMING, OUTGOING, or BOTH. The direction is optional, but in this case we're starting from a movie node, so we want to find all INCOMING HAS_SEEN relationships.
  • Relationship type: The type of relationship (HAS_SEEN, in this case).

We iterate over all HAS_SEEN relationships (think of them as the lines in the graph that connect a movie to the users who have seen it) and retrieve each relationship's stars property, casting it to an Integer. We maintain a count of the total number of people who have seen the movie and the sum of all of the star ratings. We compute the average by dividing the total stars by the number of people who saw the movie, casting to float first so that the division is not truncated to a whole number. The output of executing this is the following:

Movie Ratings:
      Divergent, Viewers: 2, Average rating: 4.5
      Big Hero 6, Viewers: 4, Average rating: 5.0
      Cinderella, Viewers: 4, Average rating: 4.25
      The Interview, Viewers: 1, Average rating: 1.0
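
The arithmetic behind that output can be checked on its own. The sketch below is plain Java with no Neo4j dependency; the average() helper is a hypothetical name introduced here. It shows why the float cast matters (integer division would truncate 17/4 to 4) and what the same expression yields for a movie that nobody has seen.

```java
public class AverageRatingSketch {

    // Average of the given star ratings, computed the same way as the
    // article's expression (float)totalStars/relationshipCount.
    static float average(int... stars) {
        int total = 0;
        for (int s : stars) {
            total += s;
        }
        // Cast before dividing: int / int would discard the fraction.
        // With zero ratings this is 0.0f / 0, which is NaN rather than an exception.
        return (float) total / stars.length;
    }

    public static void main(String[] args) {
        System.out.println(17 / 4);              // 4 (integer division truncates)
        System.out.println(average(4, 5, 3, 5)); // 4.25 (Cinderella's four ratings)
        System.out.println(average(5, 4));       // 4.5 (Divergent's two ratings)
        System.out.println(average());           // NaN (no viewers)
    }
}
```

If a NaN in the report is undesirable, guard the division with a check for a zero count before printing.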

Next let's find all movies that a user has seen:

users = graphDB.findNodes( Labels.USER );
while( users.hasNext() ) {
    Node user = users.next();
    System.out.print( "\t" + user.getProperty( "name" ) + " has seen " );
    for( Relationship relationship : user.getRelationships(
         RelationshipTypes.HAS_SEEN ) ) {
        // The movie is the node at the other end of the relationship
        Node movie = relationship.getOtherNode( user );
        System.out.print( "\t" + movie.getProperty( "name" ) );
    }
    // End the line for this user
    System.out.println();
}

In this case, we find all user Nodes and then retrieve all relationships of type HAS_SEEN for that user. Note that we're not specifying a direction, which illustrates that direction is optional, and the default is BOTH. The difference in this example is that once we have the relationship, we want to retrieve the movie node on the other end of the relationship. We accomplish this goal by executing the getOtherNode() method on the relationship and passing it a reference to the originating node (the user). The output from this code is the following:

      Steve has seen       The Interview   Cinderella      Big Hero 6      Divergent
      Linda has seen       Cinderella      Big Hero 6      Divergent
      Michael has seen     Cinderella      Big Hero 6
      Rebecca has seen     Cinderella      Big Hero 6

The last example I want to build is a demonstration of a simple recommendation engine. We want to find the Michael user node and recommend movies to him. The algorithm is simple: Find all movies that Michael's friends have seen (and Michael hasn't), rated four or five stars, and suggest those movies to him. This task is a bit more complicated, but will make sense if we break it down into small steps:

Node michael = graphDB.findNode( Labels.USER, "name", "Michael" );

// Find all of Michael's movies
Set<String> michaelsMovies = new HashSet<String>();
for( Relationship relationship : michael.getRelationships( Direction.OUTGOING,
     RelationshipTypes.HAS_SEEN ) ) {
    michaelsMovies.add( ( String )relationship.getOtherNode( michael )
                        .getProperty( "name" ) );
}

// Find all of Michael's friends
Set<String> friendsMovies = new HashSet<String>();
for( Relationship relationship : michael.getRelationships(
     RelationshipTypes.IS_FRIEND_OF ) ) {
    Node friend = relationship.getOtherNode( michael );

    // Find all movies that Michael's friend has seen
    for( Relationship relationship1 : friend.getRelationships(
         Direction.OUTGOING, RelationshipTypes.HAS_SEEN ) ) {
        // Get the stars property and only include it if it has 4 or more stars
        if( ( Integer )relationship1.getProperty( "stars" ) > 3 ) {
            // Add this movie to our friendsMovies set
            friendsMovies.add( ( String )relationship1
                               .getOtherNode( friend ).getProperty( "name" ) );
        }
    }
}

// Remove all of the movies that Michael has already seen
friendsMovies.removeAll( michaelsMovies );

// Show the movies with a rating of 4 or 5 that Michael hasn't seen
System.out.println( "Movies that Michael hasn't seen, but his friends " +
                    "have seen and given a rating of 4 or higher:" );
for( String movie : friendsMovies ) {
    System.out.println( "\t" + movie );
}

The first step is to find Michael, which we accomplish by passing the GraphDatabaseService's findNode() method the USER label, the "name" property, and the "Michael" value:

Node michael = graphDB.findNode( Labels.USER, "name", "Michael" );

The next step is to build a set of all movies that Michael has seen, so that we can later remove them from the result set. We retrieve all OUTGOING HAS_SEEN relationships from the Michael node, obtain a reference to the movie node on the other end of the relationship, and add the value of its "name" property to a HashSet.

Now we follow all of Michael's IS_FRIEND_OF relationships to find his friends. In this case, it's important that we either specify the BOTH direction or no direction, so that we receive both INCOMING and OUTGOING relationships. We can retrieve Michael's friend node by invoking the relationship's getOtherNode() method. With his friend in hand, we can find all of that friend's movies by following all OUTGOING HAS_SEEN relationships to movie nodes and retrieve their movie names. The other condition that we imposed was to include only movies that the user rated as four or five stars. We add the qualifying movies to a HashSet named friendsMovies. The reason we chose a Set is that it doesn't allow duplicates, so we'll have a set of unique movies that all of Michael's friends have seen.

Finally, we need to remove the movies that Michael has already seen, which we accomplish by invoking the HashSet's removeAll() method, and then we display all the remaining movies. (Alternatively, we could have checked the michaelsMovies set while building the friendsMovies set.) The output is as follows:

Movies that Michael hasn't seen, but his friends have seen and given a rating of 4 or higher:
      Divergent

Only one movie appears in the list because there are only two movies that Michael hasn't seen: Divergent and The Interview. I gave Divergent a five-star rating, but I wasn't quite so kind to The Interview, so it was excluded from the result set.
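
Stripped of the graph calls, the recommendation logic is a filter followed by a set difference. The following plain-Java sketch restates it under stated assumptions: the recommend() and stevesRatings() helpers are hypothetical names, the ratings are held in ordinary maps rather than relationships, and Steve's one-star rating for The Interview is taken from the ratings output shown earlier.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class RecommendationSketch {

    // Steve's ratings as reported in the article (movie name -> stars).
    static Map<String, Integer> stevesRatings() {
        Map<String, Integer> ratings = new LinkedHashMap<>();
        ratings.put("Divergent", 5);
        ratings.put("Big Hero 6", 5);
        ratings.put("Cinderella", 4);
        ratings.put("The Interview", 1);
        return ratings;
    }

    // Collects every movie that any friend rated at least minStars,
    // then removes the movies the user has already seen.
    static Set<String> recommend(Set<String> alreadySeen,
                                 List<Map<String, Integer>> friendsRatings,
                                 int minStars) {
        Set<String> candidates = new TreeSet<>(); // a Set, so duplicates collapse
        for (Map<String, Integer> ratings : friendsRatings) {
            for (Map.Entry<String, Integer> rating : ratings.entrySet()) {
                if (rating.getValue() >= minStars) {
                    candidates.add(rating.getKey());
                }
            }
        }
        candidates.removeAll(alreadySeen);
        return candidates;
    }

    public static void main(String[] args) {
        Set<String> michaelsMovies =
                new TreeSet<>(Arrays.asList("Big Hero 6", "Cinderella"));
        // Michael's only friend in the sample graph is Steve.
        List<Map<String, Integer>> friends =
                Collections.singletonList(stevesRatings());
        System.out.println(recommend(michaelsMovies, friends, 4)); // [Divergent]
    }
}
```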

You have to admit that this capability is powerful, but you're probably thinking that it's also a lot of work! Let's turn our attention to the Traversal API and see how it can simplify things.

Working with the Traversal API

The Traversal API provides a more declarative way of traversing Neo4j nodes. It allows you to specify what you would like the traverser to do, and then the traverser follows your instructions. The first step in using the Traversal API is to create a TraversalDescription that describes what the traverser should do. You can create a TraversalDescription from the GraphDatabaseService by executing its traversalDescription() method:

TraversalDescription myFriends = graphDB.traversalDescription();

The TraversalDescription is an immutable (unchangeable) object: rather than modifying the instance you call it on, each of its methods returns a new TraversalDescription with the extra setting applied. For example, if we want to tell the TraversalDescription to follow IS_FRIEND_OF relationships, we could do so as follows:

TraversalDescription myFriends = graphDB.traversalDescription()
                    .relationships( RelationshipTypes.IS_FRIEND_OF );

If you have a background in JavaScript or functional programming languages, you should be comfortable chaining these operations together; if not, you'll get used to it.
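
To see why the return value matters with an immutable object, here is a minimal plain-Java sketch of the same fluent style. The Description class and its methods are hypothetical stand-ins written for this illustration and are not part of Neo4j; the point is only that a call whose result is discarded leaves the original untouched.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ImmutableDescriptionSketch {

    // Hypothetical immutable description; not a Neo4j class.
    public static final class Description {
        private final List<String> relationshipTypes;

        public Description() {
            this(Collections.<String>emptyList());
        }

        private Description(List<String> types) {
            // Defensive copy, so later changes to `types` cannot leak in.
            this.relationshipTypes =
                    Collections.unmodifiableList(new ArrayList<>(types));
        }

        // Returns a NEW Description; the receiver is left unchanged.
        public Description relationships(String type) {
            List<String> copy = new ArrayList<>(relationshipTypes);
            copy.add(type);
            return new Description(copy);
        }

        public List<String> relationshipTypes() {
            return relationshipTypes;
        }
    }

    public static void main(String[] args) {
        Description base = new Description();

        // Result discarded: this line has no lasting effect.
        base.relationships("IS_FRIEND_OF");
        System.out.println(base.relationshipTypes());    // []

        // Keep the returned instance (or chain the calls) instead.
        Description friends = base.relationships("IS_FRIEND_OF");
        System.out.println(friends.relationshipTypes()); // [IS_FRIEND_OF]
    }
}
```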

Once we have a TraversalDescription defined, we can create a Traverser and pass it a starting node to traverse. But before we get into the details of creating and configuring a TraversalDescription, there are a few concepts that you need to understand:

  • How will you traverse the nodes? The Traversal API provides two algorithms: depth-first traversal and breadth-first traversal. Read on for a more detailed review of the two algorithms.
  • How will you expand relationships? In other words, when you are at a node, how do you determine which relationships to follow? We refer to this decision as an expansion. The traverser will follow whatever relationships you specify, by using one of the built-in expanders or the relationships() shortcut to the standard expander, or by creating a custom expander. Under the hood, the expander is passed the path to the current node and returns the relationships that the traverser should follow from that node.
  • How do you know whether to include a node in the results, and whether to traverse to the next node in the tree or stop? We refer to this decision as an evaluation. An evaluator is passed the path to the current node and returns one of the following values to the traverser: INCLUDE_AND_CONTINUE, EXCLUDE_AND_CONTINUE, INCLUDE_AND_PRUNE, or EXCLUDE_AND_PRUNE. The INCLUDE/EXCLUDE component tells the traverser whether it should include or exclude this node from the results. The CONTINUE/PRUNE component tells the traverser whether it should continue traversing down the tree or prune (exclude) the sub-nodes below this node. Several default evaluators are defined in the Evaluators class. Shortly we'll use the atDepth() method to return only the nodes at a given depth and stop traversing there, but other methods exist, such as all() to include all nodes, fromDepth() to start at a depth and continue down the graph, excludeStartPosition() to skip the initial node, and more.

Graph traversal is typically accomplished using one of two algorithms: depth-first traversal and breadth-first traversal. The difference between the algorithms lies in how the traverser visits nodes in the graph.

Depth-first traversal traverses down the graph, visiting nodes using the following algorithm: Visit a node's first child; then visit that node's first child, and repeat until there are no children. Once it has encountered a node without children, it returns to the parent node and visits the node's next child. The depth-first traversal favors visiting nodes down the graph before visiting all of the direct children of the starting node.

Breadth-first traversal, on the other hand, visits all of a node's children before traversing down to a child node's children.

Figure 1 shows the difference between the two algorithms and displays the order that nodes are visited.

Figure 1 Depth-first versus breadth-first traversal.
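
The two visiting orders can also be reproduced in a few lines of plain Java. This is a minimal sketch rather than Neo4j code: the six-node sample tree (A with children B and C; B with D and E; C with F) and the method names are assumptions made for the illustration and are not necessarily the tree drawn in Figure 1.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TraversalOrderSketch {

    // Hypothetical sample tree: A -> B, C; B -> D, E; C -> F.
    static Map<String, List<String>> sampleTree() {
        Map<String, List<String>> tree = new HashMap<>();
        tree.put("A", Arrays.asList("B", "C"));
        tree.put("B", Arrays.asList("D", "E"));
        tree.put("C", Arrays.asList("F"));
        return tree;
    }

    // Depth-first: follow the first child all the way down before
    // returning to visit the next sibling.
    static List<String> depthFirst(Map<String, List<String>> graph, String start) {
        List<String> visited = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            String node = stack.pop();
            if (!seen.add(node)) {
                continue; // already visited
            }
            visited.add(node);
            List<String> children =
                    graph.getOrDefault(node, Collections.<String>emptyList());
            // Push in reverse so that the first child is popped first.
            for (int i = children.size() - 1; i >= 0; i--) {
                stack.push(children.get(i));
            }
        }
        return visited;
    }

    // Breadth-first: visit all nodes at one depth before going deeper.
    static List<String> breadthFirst(Map<String, List<String>> graph, String start) {
        List<String> visited = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        seen.add(start);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            visited.add(node);
            for (String child :
                    graph.getOrDefault(node, Collections.<String>emptyList())) {
                if (seen.add(child)) {
                    queue.add(child);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(depthFirst(sampleTree(), "A"));   // [A, B, D, E, C, F]
        System.out.println(breadthFirst(sampleTree(), "A")); // [A, B, C, D, E, F]
    }
}
```

The only structural difference between the two methods is the container: a stack yields depth-first order, and a queue yields breadth-first order.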

The following code snippet shows how we find all of a user's friends:

// Find Michael
Node michael = graphDB.findNode( Labels.USER, "name", "Michael" );

// Find Michael's friends
TraversalDescription myFriends = graphDB.traversalDescription()
        .breadthFirst()
        .relationships( RelationshipTypes.IS_FRIEND_OF )
        .evaluator( Evaluators.atDepth( 1 ) );
Traverser traverser = myFriends.traverse( michael );
System.out.println( "Michael's friends: " );
for( Node friend : traverser.nodes() ) {
    System.out.println( "\t" + friend.getProperty( "name" ) );
}

We begin by finding the Michael node by label and name, just as we did in the preceding section. We create a TraversalDescription by invoking the GraphDatabaseService's traversalDescription() method and then execute three methods on it:

  • breadthFirst(): In this example, we chose a breadth-first traversal algorithm.
  • relationships(): The relationships() method is a shortcut for using the standard expander and tells the traverser to follow all IS_FRIEND_OF relationships.
  • evaluator(): The evaluator() method tells the traverser which nodes to include in the results and when to stop. Evaluators.atDepth(1) checks the depth of the current node: if the depth is less than 1, it excludes the node and continues; once the depth reaches 1, it includes the node and prunes everything below it. So you can read this line as "Only include nodes at a depth of 1"; that is, just visit the direct children of the starting node.
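
The evaluator's role can be sketched in plain Java. The code below is an illustration under stated assumptions, not Neo4j's implementation: the Decision enum mirrors the four evaluation outcomes, and atDepth() is assumed to include a node only when its depth equals the target and to prune beneath it. The sample tree and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AtDepthSketch {

    // Hypothetical mirror of the four evaluation outcomes; not the Neo4j enum.
    enum Decision {
        INCLUDE_AND_CONTINUE(true, true),
        EXCLUDE_AND_CONTINUE(false, true),
        INCLUDE_AND_PRUNE(true, false),
        EXCLUDE_AND_PRUNE(false, false);

        final boolean include;  // add the node to the results?
        final boolean goDeeper; // keep traversing below the node?

        Decision(boolean include, boolean goDeeper) {
            this.include = include;
            this.goDeeper = goDeeper;
        }
    }

    // Assumed "at depth" rule: include only at the target depth, then prune.
    static Decision atDepth(int target, int depth) {
        return depth == target
                ? Decision.INCLUDE_AND_PRUNE
                : Decision.EXCLUDE_AND_CONTINUE;
    }

    // Hypothetical sample tree: A -> B, C; B -> D, E; C -> F.
    static Map<String, List<String>> sampleTree() {
        Map<String, List<String>> tree = new HashMap<>();
        tree.put("A", Arrays.asList("B", "C"));
        tree.put("B", Arrays.asList("D", "E"));
        tree.put("C", Arrays.asList("F"));
        return tree;
    }

    // Depth-first walk that asks the evaluator what to do at every node.
    static void walk(Map<String, List<String>> tree, String node,
                     int depth, int target, List<String> results) {
        Decision decision = atDepth(target, depth);
        if (decision.include) {
            results.add(node);
        }
        if (!decision.goDeeper) {
            return; // pruned: children are never visited
        }
        for (String child :
                tree.getOrDefault(node, Collections.<String>emptyList())) {
            walk(tree, child, depth + 1, target, results);
        }
    }

    static List<String> nodesAtDepth(Map<String, List<String>> tree,
                                     String start, int target) {
        List<String> results = new ArrayList<>();
        walk(tree, start, 0, target, results);
        return results;
    }

    public static void main(String[] args) {
        System.out.println(nodesAtDepth(sampleTree(), "A", 1)); // [B, C]
        System.out.println(nodesAtDepth(sampleTree(), "A", 2)); // [D, E, F]
    }
}
```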

The output for this code is the following:

Michael's friends:
      Steve

Likewise, let's find all movies that Michael has seen:

TraversalDescription myMovies = graphDB.traversalDescription()
        .relationships( RelationshipTypes.HAS_SEEN )
        .evaluator( Evaluators.atDepth( 1 ) );
traverser = myMovies.traverse( michael );
System.out.println( "Michael's movies: " );
for( Node movie : traverser.nodes() ) {
    System.out.println( "\t" + movie.getProperty( "name" ) );
}

This follows the same pattern as the friend search, but this time we follow the HAS_SEEN relationship. The output is as follows:

Michael's movies:
      Big Hero 6

These traversals are pretty simple, so let's conclude this section by building a traversal that is conceptually simple but a bit more complex in its details. Let's find all movies that Michael's friends have seen. To do this, we want to first follow all of Michael's IS_FRIEND_OF relationships and then follow all of his friends' HAS_SEEN relationships. Further, we want to ensure that only movie nodes are returned in the result set, because when we follow the IS_FRIEND_OF relationships from Michael to his friends, those user nodes would otherwise be added to the result set.

The following code snippet shows this traversal:

TraversalDescription moviesThatFriendsLike = graphDB.traversalDescription()
                    // Choose a depth-first search strategy
                    .depthFirst()

                    // At depth 0 traverse the IS_FRIEND_OF relationships,
                    // at a depth of 1 traverse the HAS_SEEN relationships
                    .expand( new PathExpander<Object>() {
                        public Iterable<Relationship> expand( Path path,
                                     BranchState<Object> objectBranchState ) {
                            // Get the depth of this node
                            int depth = path.length();

                            if( depth == 0 ) {
                                // Depth of 0 means the user's node (starting node)
                                return path.endNode().getRelationships(
                                              RelationshipTypes.IS_FRIEND_OF );
                            }
                            else {
                                // A depth of 1 would mean that we're at a friend and
                                // should expand his HAS_SEEN relationships
                                return path.endNode().getRelationships(
                                      RelationshipTypes.HAS_SEEN );
                            }
                        }

                        public PathExpander<Object> reverse() {
                            // Reverse traversal is not needed for this example
                            return null;
                        }
                    } )

                    // Only go down to a depth of 2
                    .evaluator( Evaluators.atDepth( 2 ) )

                    // Only return movies
                    .evaluator( new Evaluator() {
                        public Evaluation evaluate( Path path ) {
                            if( path.endNode().hasLabel( Labels.MOVIE ) ) {
                                return Evaluation.INCLUDE_AND_CONTINUE;
                            }
                            return Evaluation.EXCLUDE_AND_CONTINUE;
                        }
                    } );

traverser = moviesThatFriendsLike.traverse( michael );

System.out.println( "Movies that Michael's friends have seen: " );
for( Node movie : traverser.nodes() ) {
    System.out.println( "\t" + movie.getProperty( "name" ) );
}

This TraversalDescription implements the following steps:

  1. Select a traversal algorithm. In this case, we chose to use a depth-first traversal.
  2. Define a custom expander that checks the depth of a node. If the depth is 0, follow its IS_FRIEND_OF relationships; if the depth is 1, follow its HAS_SEEN relationships. We do this by creating a new PathExpander instance and overriding its expand() method. The expand() method returns an Iterable reference to the relationships that follow. In this case, we retrieve the node itself—the endNode() method of the path—and return all of its relationships of the required type.
  3. Add an evaluator that tells the traverser to go down only to a depth of 2.
  4. Create a custom evaluator that only adds movies to the result set. Do this by creating a new Evaluator instance and overriding its evaluate() method. The evaluate() method receives the path to the current node, which includes all nodes and relationships followed to get to the current node. We can obtain the current node itself by executing the endNode() method. Now we check to see whether the node contains the MOVIE label. If so, return INCLUDE_AND_CONTINUE, which means that we should include this node in the result set and continue our traversal. If not, return EXCLUDE_AND_CONTINUE to tell the traverser not to include this node in the result set, but to continue the traversal. Leave the exit criteria to the atDepth(2) evaluator.

The output from this traversal is the following:

Movies that Michael's friends have seen:
      Big Hero 6
      The Interview

The Traversal API takes a little while to master, but once you understand how it works and its key components (traversal algorithm, expanders, and evaluators), then you can quickly become effective at using it. I'll leave you with the exercise of checking movie ratings and removing movies that Michael has already seen.


Summary

This article explored two main APIs for interacting with Neo4j: the Core Java API and the Traversal API. The Core Java API provides you with raw access to nodes and relationships, allowing you to traverse them using intuitive Java interfaces. It's powerful, but it requires quite a bit of code. The Traversal API is more declarative and allows you to tell a traverser how to traverse your graph of nodes.

In the next article, we'll turn our attention to the Neo4j query language, Cypher, seeing how it can be leveraged to execute complex queries.
