This chapter is from the book

Performing Checkpoints

The second component of the infrastructure is performing checkpoints of the log files. As transactions commit, change records are written into the log files, but the actual changes to the database are not necessarily written to disk. When a checkpoint is performed, the changes to the database that are part of committed transactions are written into the backing database file.

Performing checkpoints is necessary for two reasons. First, you can remove the Berkeley DB log files from your system only after a checkpoint. Second, the more frequently you checkpoint, the less time database recovery takes after a system or application failure.

Once the database pages are written, log files can be archived and removed from the system because they will never be needed for anything other than recovery from catastrophic failure. In addition, recovery after a system or application failure has to redo or undo only the changes made since the last checkpoint, because all changes before the checkpoint have already been flushed to the filesystem.
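To make the effect on recovery time concrete, the following is a minimal sketch of a toy log, not Berkeley DB code. The toy_lsn structure and both functions are hypothetical names introduced here for illustration, and the model assumes that recovery only needs to look at records written strictly after the last checkpoint, as described above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy log sequence number (hypothetical, for illustration only):
 * a log file number plus a byte offset within that file.
 */
struct toy_lsn {
  unsigned long file;
  unsigned long offset;
};

/* Return <0, 0 or >0 if a is before, equal to, or after b in the log. */
int
toy_lsn_cmp(const struct toy_lsn *a, const struct toy_lsn *b)
{
  if (a->file != b->file)
    return (a->file < b->file ? -1 : 1);
  if (a->offset != b->offset)
    return (a->offset < b->offset ? -1 : 1);
  return (0);
}

/*
 * Count the records that lie strictly after the checkpoint.  In this
 * simplified model, these are the only records recovery has to redo
 * or undo; everything earlier is already in the database file.
 */
size_t
records_to_replay(const struct toy_lsn *log, size_t n,
    const struct toy_lsn *checkpoint)
{
  size_t i, count;

  assert(log != NULL || n == 0);
  assert(checkpoint != NULL);

  count = 0;
  for (i = 0; i < n; i++)
    if (toy_lsn_cmp(&log[i], checkpoint) > 0)
      count++;
  return (count);
}
```

With a five-record log and a checkpoint at the third record, records_to_replay reports two records; with no checkpoint at all (a checkpoint position before the first record), it reports all five. The later the checkpoint, the less work is left for recovery.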

Berkeley DB provides a separate utility, db_checkpoint, which can be used to perform checkpoints. Alternatively, applications can write their own checkpoint utility using the underlying txn_checkpoint function. The following code fragment checkpoints the database environment every 60 seconds:

int
main(int argc, char *argv[])
{
  extern char *optarg;
  extern int optind;
  DB *db_cats, *db_color, *db_fruit;
  DB_ENV *dbenv;
  pthread_t ptid;
  int ch;

  while ((ch = getopt(argc, argv, "")) != EOF) 
    switch (ch) { 
    case '?': 
    default: 
      usage(); 
    } 
  argc -= optind; 
  argv += optind; 

  env_dir_create(); 
  env_open(&dbenv); 

  /* Start a checkpoint thread. */ 
  if ((errno = pthread_create( 
   &ptid, NULL, checkpoint_thread, (void *)dbenv)) != 0) { 
    fprintf(stderr, 
     "txnapp: failed spawning checkpoint thread: %s\n", 
     strerror(errno)); 
    exit (1); 
  } 

  /* Open database: Key is fruit class; Data is specific type. */ 
  db_open(dbenv, &db_fruit, "fruit", 0); 

  /* Open database: Key is a color; Data is an integer. */ 
  db_open(dbenv, &db_color, "color", 0); 

  /*
  * Open database:
  *  Key is a name; Data is: company name, address, cat breeds.
  */ 
  db_open(dbenv, &db_cats, "cats", 1); 

  add_fruit(dbenv, db_fruit, "apple", "yellow delicious"); 

  add_color(dbenv, db_color, "blue", 0); 
  add_color(dbenv, db_color, "blue", 3);

  add_cat(dbenv, db_cats, 
    "Amy Adams", 
    "Sleepycat Software", 
    "118 Tower Rd., Lincoln, MA 01741, USA", 
    "abyssinian", 
    "bengal", 
    "chartreaux", 
    NULL); 

  return (0); 
} 

void * 
checkpoint_thread(void *arg) 
{ 
  DB_ENV *dbenv; 
  int ret; 

  dbenv = arg; 
  dbenv->errx(dbenv, "Checkpoint thread: %lu", (u_long)pthread_self());

  /* Checkpoint once a minute. */ 
  for (;; sleep(60)) 
    switch (ret = txn_checkpoint(dbenv, 0, 0, 0)) { 
    case 0: 
    case DB_INCOMPLETE: 
      break; 
    default: 
      dbenv->err(dbenv, ret, "checkpoint thread"); 
      exit (1); 
    } 

  /* NOTREACHED */ 
}

Because checkpoints can be quite expensive, choosing how often to perform a checkpoint is a common tuning parameter for Berkeley DB applications.
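One common way to tune the frequency is to checkpoint only when enough log has accumulated or enough time has passed, rather than on a fixed timer. The sketch below shows that decision in isolation. The ckp_policy structure and the ckp_due function are hypothetical helpers, not part of Berkeley DB, and the caller is assumed to track the log volume and elapsed minutes itself. The second and third arguments of txn_checkpoint appear to serve a similar purpose (kilobyte and minute thresholds), but check the documentation for your release before relying on that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical tuning knobs (not a Berkeley DB structure).
 * A limit of 0 disables that trigger.
 */
struct ckp_policy {
  unsigned long kbyte_limit;   /* Checkpoint after this many KB of log. */
  unsigned long minute_limit;  /* Checkpoint after this many minutes. */
};

/*
 * Decide whether a checkpoint is due, given the log volume written and
 * the minutes elapsed since the last checkpoint.  If both limits are 0,
 * always checkpoint, matching the unconditional behavior of the
 * once-a-minute thread in the article.
 */
bool
ckp_due(const struct ckp_policy *policy,
    unsigned long kbytes_logged, unsigned long minutes_elapsed)
{
  assert(policy != NULL);

  if (policy->kbyte_limit == 0 && policy->minute_limit == 0)
    return (true);
  if (policy->kbyte_limit != 0 && kbytes_logged >= policy->kbyte_limit)
    return (true);
  if (policy->minute_limit != 0 &&
      minutes_elapsed >= policy->minute_limit)
    return (true);
  return (false);
}
```

A checkpoint thread could call ckp_due each time it wakes and skip the checkpoint when it returns false. Raising the limits makes checkpoints rarer and cheaper overall, at the cost of longer recovery and more log files kept on disk.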
