This chapter is from the book

Exadata Smart Flash Logging

Exadata Storage Software 11.2.2.4 introduced the Smart Flash Logging feature. The intent of this feature is to reduce overall redo log sync times by allowing the Exadata Flash storage to serve as a secondary destination for redo log writes. During a redo log sync, Oracle writes to the disk and Flash simultaneously and allows the redo log sync operation to complete as soon as either device completes its write.

In the event that the Flash Cache wins the race to write, the data need be held for only a short time until the storage server is certain that all writes have made it to the redo log. Since the Smart Flash Log is only a temporary store, only a small amount of Flash storage is required—512MB per cell (out of 3.2TB on an X4, 1.6TB on an X3, or 365GB on an X2 system).

Figure 15.14 illustrates the essential flow of control. Oracle processes performing DML generate redo entries which are written to the redo buffer (1). Periodically or upon COMMIT the LGWR flushes the buffer (2), resulting in an I/O request to the CELLSRV process (3). CELLSRV writes to Flash and grid disk simultaneously (4), and when either I/O completes, it returns control to the LGWR (5).
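
The "first write wins" step (4) and (5) can be sketched in a few lines of Python. This is only an illustration of the control flow, not how CELLSRV is implemented: the function names, the use of threads, and the sleep-based latencies are all assumptions made for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def simulated_write(device, latency_s):
    """Hypothetical stand-in for a redo write: sleep, then report the device name."""
    time.sleep(latency_s)
    return device


def redo_write(disk_latency_s, flash_latency_s):
    """Issue both writes at once and return the name of whichever finishes first."""
    pool = ThreadPoolExecutor(max_workers=2)
    futures = [
        pool.submit(simulated_write, "disk", disk_latency_s),
        pool.submit(simulated_write, "flash", flash_latency_s),
    ]
    done, _pending = wait(futures, return_when=FIRST_COMPLETED)
    # Control returns to the caller now; the slower write keeps running
    # in the background and is not cancelled.
    pool.shutdown(wait=False)
    return next(iter(done)).result()


if __name__ == "__main__":
    print(redo_write(disk_latency_s=0.30, flash_latency_s=0.01))
    print(redo_write(disk_latency_s=0.01, flash_latency_s=0.30))
```

The caller is acknowledged as soon as one write lands, while the other write is still allowed to finish, which mirrors the description of Flash as a temporary secondary destination.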

Figure 15.14 Exadata Smart Flash Logging

The use of Flash SSD to optimize redo log operations has been a somewhat contentious topic. Many—including this author—have argued that Flash SSD is a poor choice for redo log workloads. The nature of sequential redo log I/O tends to favor the spinning magnetic disk since sequential I/O minimizes seek latency, while penalizing Flash-based SSD, since the continual overwriting of existing blocks makes the probability of a block erase very high.

However, the Exadata Smart Flash Logging feature is not predicated on some theoretical write I/O advantage for Flash SSD. Rather it aims to “smooth out” redo log writes by running redo log writes out through two channels (grid disk and Flash SSD) and allowing the redo log write to complete when either of the two completes.
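
The smoothing effect follows from simple arithmetic: the completion time of a dual write is the minimum of the two device latencies, so a stall on one device only hurts when the other stalls too. The simulation below is a rough sketch of that idea. Every number in it (typical latencies, stall probabilities, stall lengths) is a made-up assumption chosen for illustration, not a measurement from an Exadata system.

```python
import math
import random


def percentile(values, pct):
    """Nearest-rank percentile of a list (pct between 0 and 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


def simulate_latency_us(rng, typical_us, stall_probability, stall_us):
    """One write latency: usually near typical_us, occasionally a long stall."""
    latency = rng.gauss(typical_us, typical_us * 0.1)
    if rng.random() < stall_probability:
        latency += stall_us
    return max(latency, 1.0)


def simulate(n_writes=10_000, seed=1):
    """Return (disk-only latencies, dual-write latencies) in microseconds."""
    rng = random.Random(seed)
    disk_only, either = [], []
    for _ in range(n_writes):
        # Hypothetical disk: fast on average, rare very long stalls.
        disk = simulate_latency_us(rng, 650, 0.01, 100_000)
        # Hypothetical flash: slower on average, shorter and rarer stalls.
        flash = simulate_latency_us(rng, 900, 0.001, 5_000)
        disk_only.append(disk)
        either.append(min(disk, flash))  # done when either write completes
    return disk_only, either


if __name__ == "__main__":
    disk_only, either = simulate()
    for label, data in (("disk only", disk_only), ("disk or flash", either)):
        print(label, round(percentile(data, 50)), round(percentile(data, 99)), round(max(data)))
```

In this model the median barely moves, because the disk usually wins, while the maximum is capped by the slower of the two typical paths rather than by the worst disk stall.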

Redo log sync waits—which occur whenever a COMMIT occurs—generally involve only a couple of milliseconds of wait time since they involve only a small sequential write operation on an (ideally) relatively lightly loaded disk subsystem. Keeping redo logs on separate ASM disk groups from data files helps ensure that heavy data file I/O loads do not affect the time taken for redo operations.

However, it’s inevitable that from time to time a redo log sync operation will conflict with some other I/O—an archive read or Data Guard operation, for instance. In these circumstances some redo log sync operations may take a very long time indeed.

Following is some Oracle trace log data that shows some redo log sync waits:

WAIT #4..648: nam='log file sync' ela= 710
WAIT #4..648: nam='log file sync' ela= 733
WAIT #4...648: nam='log file sync' ela= 621
WAIT #4...648: nam='log file sync' ela= 507
WAIT #4...648: nam='log file sync' ela= 683
WAIT #4...648: nam='log file sync' ela= 2084
WAIT #4...648: nam='log file sync' ela= 798
WAIT #4...648: nam='log file sync' ela= 1043
WAIT #4...648: nam='log file sync' ela= 2394
WAIT #4...648: nam='log file sync' ela= 932
WAIT #4...648: nam='log file sync' ela= 291780
WAIT #4...648: nam='log file sync' ela= 671
WAIT #4...648: nam='log file sync' ela= 957
WAIT #4...648: nam='log file sync' ela= 852
WAIT #4...648: nam='log file sync' ela= 639
WAIT #4...648: nam='log file sync' ela= 699
WAIT #4...648: nam='log file sync' ela= 819

The ela entry shows the elapsed time in microseconds. Most of the waits are less than 1 millisecond (1000 microseconds), but in the middle we see an anomalous wait of 291,780 microseconds (about one-third of a second!).
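
Spotting such anomalies by eye does not scale to millions of waits, so it helps to extract the ela values programmatically. The following is a minimal sketch that assumes trace lines shaped like the abridged ones above; the function names and the 100,000-microsecond threshold are arbitrary choices for the example, and the pattern tolerates the typographic closing quote that sometimes appears when trace text is copied from a web page.

```python
import re

# Matches the wait name and captures the elapsed time in microseconds.
WAIT_RE = re.compile(r"nam='log file sync['’]\s+ela=\s*(\d+)")


def log_file_sync_waits(lines):
    """Return the ela values (microseconds) of every 'log file sync' wait line."""
    return [int(m.group(1)) for line in lines if (m := WAIT_RE.search(line))]


def outliers(waits_us, threshold_us):
    """Return the waits that exceed threshold_us."""
    return [w for w in waits_us if w > threshold_us]


if __name__ == "__main__":
    sample = [
        "WAIT #4...648: nam='log file sync' ela= 932",
        "WAIT #4...648: nam='log file sync' ela= 291780",
        "WAIT #4...648: nam='log file sync' ela= 671",
    ]
    waits = log_file_sync_waits(sample)
    print(waits, outliers(waits, threshold_us=100_000))
```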

Occasional very high redo log sync waits like the one just shown might not seem too disturbing until you remember that redo log sync waits are frequently included in the most critical application transactions. Online operations such as saving a shopping cart, confirming an order, and saving a profile change all generally involve some sort of commit operation, and it’s well known that today’s online consumers rapidly lose patience when operations delay even by fractions of a second. So even occasional high redo log wait times are cause for concern. It’s the intent of Exadata Smart Flash Logging to smooth out these disturbing outliers.

Controlling and Monitoring Smart Flash Logging

Exadata Smart Flash Logging is enabled by default and you don’t have to do anything specifically to enable it—other than to make sure your Storage Cells are running at least Exadata Storage Software 11.2.2.4.

You can confirm your Flash Log status by issuing a LIST FLASHLOG command at a CellCLI prompt:

CellCLI> list flashlog detail
         name:                   exa1cel01_FLASHLOG
         cellDisk:               FD_09_exa1cel01,FD_02_exa1cel01,
         creationTime:           2012-07-07T06:56:23-07:00
         degradedCelldisks:
         effectiveSize:          512M
         efficiency:             100.0
         id:                     3c08cfe1-ea43-4fde-85c2-0bbd5cbd11ec
         size:                   512M
         status:                 normal

You can control the behavior of Exadata Smart Flash Logging by using a resource management plan. This allows you to turn Exadata Smart Flash Logging on or off for individual databases.

So, for instance, this command will turn Exadata Smart Flash Logging off for database GUY and leave it on for all other databases:

ALTER IORMPLAN dbplan=((name='GUY',flashLog=off),
                       (name=other,flashLog=on))

You can monitor the behavior of Exadata Smart Flash Logging by using the following CellCLI command:

CellCLI> list metriccurrent where objectType='FLASHLOG';
   FL_ACTUAL_OUTLIERS              FLASHLOG        1 IO requests
   FL_BY_KEEP                      FLASHLOG        0
   FL_DISK_FIRST                   FLASHLOG        253,540,190 IO requests
      ...... ......
   FL_FLASH_FIRST                  FLASHLOG        11,881,503 IO requests
      ...... ......
   FL_PREVENTED_OUTLIERS           FLASHLOG        275,125 IO requests

These are probably the most interesting CellCLI metrics generated by this command:

  • FL_DISK_FIRST—the grid disk log write completed first during the redo log write operation
  • FL_FLASH_FIRST—the Flash SSD completed first during the redo log write operation
  • FL_PREVENTED_OUTLIERS—the number of redo log writes that were optimized by the Flash Logging that would otherwise have taken longer than 500 milliseconds to complete

Testing Exadata Smart Flash Logging

Let’s look at an example. Say we test Exadata Smart Flash Logging by running 20 concurrent processes, each of which performs 200,000 updates and commits—a total of 4 million redo log sync operations. Now, Exadata Smart Flash Logging is disabled using a resource plan (see the ALTER IORMPLAN statement in the previous section) and the tests are repeated. We capture every redo log sync wait in a DBMS_MONITOR trace file for analysis using the R statistical package.

With Exadata Smart Flash Logging disabled, our key CellCLI metrics look like this:

FL_DISK_FIRST            32,669,310 IO requests
FL_FLASH_FIRST            7,318,741 IO requests
FL_PREVENTED_OUTLIERS       774,146 IO requests

With Exadata Smart Flash Logging enabled, the metrics look like this:

FL_DISK_FIRST            33,201,462 IO requests
FL_FLASH_FIRST            7,337,931 IO requests
FL_PREVENTED_OUTLIERS       774,146 IO requests

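These counters are cumulative, so what matters is the change between the two listings: FL_FLASH_FIRST rose by 19,190, FL_DISK_FIRST rose by 532,152, and FL_PREVENTED_OUTLIERS did not change. So for this particular cell the Flash disk “won” less than 4% of the time (the ratio of the increase in FL_FLASH_FIRST to the increase in FL_DISK_FIRST) and prevented no outliers. (Outliers are redo log syncs that take longer than 500 milliseconds to complete.) So on the surface, it would seem that very little has been achieved.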
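
The comparison can be reproduced from the two listings. The sketch below assumes the FLASHLOG counters are cumulative, so that the activity attributable to the test is the difference between the two snapshots; the dictionary layout and function names are invented for the example, and the values are copied from the listings above.

```python
# Counter values copied from the two metric listings above.
FIRST_SNAPSHOT = {
    "FL_DISK_FIRST": "32,669,310",
    "FL_FLASH_FIRST": "7,318,741",
    "FL_PREVENTED_OUTLIERS": "774,146",
}
SECOND_SNAPSHOT = {
    "FL_DISK_FIRST": "33,201,462",
    "FL_FLASH_FIRST": "7,337,931",
    "FL_PREVENTED_OUTLIERS": "774,146",
}


def parse_count(text):
    """Convert a CellCLI-style count such as '7,318,741' to an int."""
    return int(text.replace(",", ""))


def deltas(before, after):
    """Per-metric increase between two snapshots of cumulative counters."""
    return {name: parse_count(after[name]) - parse_count(before[name]) for name in before}


if __name__ == "__main__":
    change = deltas(FIRST_SNAPSHOT, SECOND_SNAPSHOT)
    flash, disk = change["FL_FLASH_FIRST"], change["FL_DISK_FIRST"]
    print(change)
    print(f"flash-first to disk-first ratio: {flash / disk:.1%}")
    print(f"flash share of all redo writes:  {flash / (flash + disk):.1%}")
```

With these figures the ratio comes out a little under 4%, and the increase in FL_PREVENTED_OUTLIERS is zero.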

However, statistical analysis of the redo log sync times provides a somewhat different interpretation. Table 15.1 summarizes the key statistics for the two tests.

Table 15.1 Effect of Exadata Smart Flash Logging on Redo Log Sync Waits

                       Redo Log Sync Time (microseconds)
Smart Flash Logging    Min    Median    Mean    99%      Max
On                     1.0    650       723     1656     75,740
Off                    1.0    627       878     4662     291,800

Exadata Smart Flash Logging reduced the mean log file sync wait time by over 15% (from 878 to 723 microseconds), and this difference was statistically significant. There was also a significant reduction in the 99th percentile: the minimum wait time for the top 1% of waits fell from about 4.7 milliseconds to about 1.7 milliseconds.
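
The percentages can be checked directly against Table 15.1, whose values are in microseconds. The short sketch below only redoes that arithmetic; the dictionary keys and helper names are invented for the example, and it makes no attempt to reproduce the significance test.

```python
# Values from Table 15.1, in microseconds.
TABLE_15_1 = {
    "on":  {"min": 1.0, "median": 650, "mean": 723, "p99": 1656, "max": 75_740},
    "off": {"min": 1.0, "median": 627, "mean": 878, "p99": 4662, "max": 291_800},
}


def pct_reduction(off_value, on_value):
    """Percentage by which on_value is lower than off_value."""
    return (off_value - on_value) / off_value * 100


def us_to_ms(microseconds):
    """Convert microseconds to milliseconds."""
    return microseconds / 1000


if __name__ == "__main__":
    on, off = TABLE_15_1["on"], TABLE_15_1["off"]
    for stat in ("mean", "p99", "max"):
        print(
            f"{stat}: {us_to_ms(off[stat]):.3f} ms -> {us_to_ms(on[stat]):.3f} ms "
            f"({pct_reduction(off[stat], on[stat]):.1f}% lower)"
        )
```

The median is the one statistic that is slightly higher with the feature on (650 versus 627 microseconds), which fits the picture of a feature that targets the tail of the distribution rather than the typical wait.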

Figure 15.15 shows the distribution of log file sync waits with the Exadata Smart Flash Logging feature enabled and disabled. Turning Exadata Smart Flash Logging on created a strange hump on the high side of what otherwise looks like a normal bell curve distribution. Understanding that hump requires that we take a look at the distribution of very high outlier redo log waits.

Figure 15.15 Distribution of log file sync waits with Exadata Smart Flash Logging

Figure 15.16 shows the distribution of the top 10,000 waits. This shows far more clearly how Exadata Smart Flash Logging worked to reduce high outlier log file sync waits. These waits have been pulled back, but to a point that is still above the average wait time for other log file sync waits. This creates the hump in Figure 15.15 and represents a significant reduction in the time taken for outlying redo log waits.

Figure 15.16 Distribution of top 10,000 log file sync waits with Exadata Smart Flash Logging

While Flash SSD is not necessarily an ideal storage medium for redo write I/O, Exadata Smart Flash Logging does reduce the impact of very high outlier redo log writes.
