Coherence Simplified


I’ve been working with Oracle Coherence (formerly Tangosol) for a few years now, and I often get asked what it is and how it works. Explaining Coherence can be slightly tricky, because most programmers rarely work on systems that benefit from caching large data sets. Here is my attempt at explaining Coherence in the simplest Java terms:

  • Data can be stored in various places: file system, databases, memory, LDAP, etc.
  • The fastest way to store and access data is in memory as close to where business operations are performed as possible
  • If you have more data than you can fit in memory then Coherence can help

Suppose you have a Java program running on one host with a heap limit of 1 GB. If you keep allocating memory without releasing it, the JVM throws an OutOfMemoryError once the 1 GB limit is reached.

public class MaxMemory {

   private static final int MEGABYTE = 1048576;
   private static final int MAX_MEMORY_USAGE_MB = 3000;

   /**
    * @param args - None
    */
   public static void main(String[] args) {
      /*
       * Create an array to hold all the
       * data that we are going to store
       */
      Object[] memoryHolder = new Object[MAX_MEMORY_USAGE_MB];

      for (int i = 0; i < MAX_MEMORY_USAGE_MB; i++) {
         try {
            // Add one MB chunks to the array
            memoryHolder[i] = new byte[MEGABYTE];
            System.out.print("." + i);
         } catch (OutOfMemoryError error) {
            System.out.println("Memory limit reached at " +
                  Runtime.getRuntime().totalMemory());
            error.printStackTrace();
            // The heap is full, so further allocations would fail too
            break;
         }
      }
      System.out.println("Done.");
   }
}
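The 1 GB limit in this example is the JVM's maximum heap size, which is usually set with the -Xmx option (for example, java -Xmx1g MaxMemory). As a minimal sketch, the following helper reports the heap figures exposed by java.lang.Runtime, so you can check what limit your JVM is actually running with. The class and method names (HeapReport, maxHeapMB and so on) are made up for illustration.

```java
/** Reports the JVM heap figures that decide when an OutOfMemoryError occurs. */
class HeapReport {

   private static final long MEGABYTE = 1024L * 1024L;

   /** Heap ceiling in MB (set with -Xmx); very large if no fixed limit applies. */
   static long maxHeapMB() {
      return Runtime.getRuntime().maxMemory() / MEGABYTE;
   }

   /** Heap the JVM has currently reserved, in MB; it can grow up to the ceiling. */
   static long committedHeapMB() {
      return Runtime.getRuntime().totalMemory() / MEGABYTE;
   }

   /** Reserved heap currently in use (including garbage not yet collected), in MB. */
   static long usedHeapMB() {
      Runtime runtime = Runtime.getRuntime();
      return (runtime.totalMemory() - runtime.freeMemory()) / MEGABYTE;
   }

   public static void main(String[] args) {
      System.out.println("Max heap:       " + maxHeapMB() + " MB");
      System.out.println("Committed heap: " + committedHeapMB() + " MB");
      System.out.println("Used heap:      " + usedHeapMB() + " MB");
   }
}
```

Note that totalMemory(), which the example above prints when it runs out of memory, is the heap currently reserved rather than the configured ceiling. By the time an OutOfMemoryError is thrown the two are normally close together.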

If, however, you store your data in a data grid such as Coherence, you can use the memory of other servers in the grid as if it were your own. Here is the previous example revised to store its data in a Coherence cache. If you run it across two servers with 1 GB of heap space each, you will notice that you can now hold roughly 2 GB of data in memory in total (less if the cache is configured to keep backup copies of each entry).

import java.io.Serializable;

import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;

public class MaxCoherenceMemory {

   private static final int MEGABYTE = 1048576;
   private static final int MAX_MEMORY_USAGE_MB = 3000;

   /**
    * @param args - None
    */
   public static void main(String[] args) {
      // Obtain the cache by name, joining the cluster if necessary
      NamedCache cache = CacheFactory.getCache("VirtualCache");

      for (int i = 0; i < MAX_MEMORY_USAGE_MB; i++) {
         try {
            // Store a one MB object in the grid under key i
            Dummy dummy = new Dummy();
            cache.put(Integer.valueOf(i), dummy);
            System.out.print("." + i);
         } catch (OutOfMemoryError error) {
            System.out.println("Memory limit reached at " +
                  Runtime.getRuntime().totalMemory());
            error.printStackTrace();
            // Stop once this JVM can no longer allocate
            break;
         }
      }
      System.out.println("Done.");
   }

   /** One MB payload; it must be Serializable to be sent to other servers. */
   private static class Dummy implements Serializable {
      private static final long serialVersionUID = 1L;

      byte[] data;

      Dummy() {
         data = new byte[MEGABYTE];
      }
   }
}

Coherence is thus a virtualization product: it lets you use another server’s physical resources (memory, in this case) as if they were your own. The exciting part is that if you design your program to store its data in Coherence and that data later grows much larger, you do not have to change your application. All you have to do is start another server and add it to the existing data grid.
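To make the "just add another server" idea concrete, here is a toy, single-JVM model of a partitioned cache. It is only a sketch under loud assumptions: ToyGrid, capacityPerNode, storeRange and the simple modulo hashing are all invented for illustration, and none of this is Coherence's API or its actual partitioning scheme. Each "node" is a plain map with a fixed capacity, and the calling code never changes when a node is added.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a partitioned cache. This is NOT how Coherence is implemented. */
class ToyGrid {

   private final int capacityPerNode;   // max entries one simulated server may hold
   private final List<Map<Integer, byte[]>> nodes = new ArrayList<>();

   ToyGrid(int nodeCount, int capacityPerNode) {
      this.capacityPerNode = capacityPerNode;
      for (int i = 0; i < nodeCount; i++) {
         nodes.add(new HashMap<>());
      }
   }

   /** Picks the node that owns a key using simple modulo hashing (an assumption). */
   private Map<Integer, byte[]> ownerOf(int key) {
      return nodes.get(Math.floorMod(key, nodes.size()));
   }

   /** Stores a value; returns false if the owning node is already full. */
   boolean put(int key, byte[] value) {
      Map<Integer, byte[]> owner = ownerOf(key);
      if (!owner.containsKey(key) && owner.size() >= capacityPerNode) {
         return false;
      }
      owner.put(key, value);
      return true;
   }

   byte[] get(int key) {
      return ownerOf(key).get(key);
   }

   /**
    * Adds an empty node and spreads the existing entries over all nodes.
    * For simplicity the capacity check is skipped here, which is only safe
    * when keys are spread evenly, as the sequential keys below are.
    */
   void addNode() {
      Map<Integer, byte[]> all = new HashMap<>();
      for (Map<Integer, byte[]> node : nodes) {
         all.putAll(node);
         node.clear();
      }
      nodes.add(new HashMap<>());
      for (Map.Entry<Integer, byte[]> entry : all.entrySet()) {
         ownerOf(entry.getKey()).put(entry.getKey(), entry.getValue());
      }
   }

   /** Tries to store keys from..to-1 and returns how many were accepted. */
   static int storeRange(ToyGrid grid, int from, int to) {
      int stored = 0;
      for (int key = from; key < to; key++) {
         if (grid.put(key, new byte[1024])) {
            stored++;
         }
      }
      return stored;
   }

   public static void main(String[] args) {
      ToyGrid grid = new ToyGrid(2, 4);   // two nodes, four entries each
      System.out.println("Stored with 2 nodes: " + storeRange(grid, 0, 12));
      grid.addNode();                     // "turn on another server"
      System.out.println("Stored after adding a node: " + storeRange(grid, 8, 12));
   }
}
```

With two nodes of four entries each, only eight of the twelve puts succeed. After addNode() the same storeRange call accepts the remaining four, without any change to the code that uses the grid.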

This is just the tip of the iceberg: Coherence has many other features that can make building scalable systems easier.
