zero-off-set

On AIX, when Oracle data files are created on striped LV raw devices, creating the LV with the Oracle type (mklv -T O)

improves I/O performance and greatly reduces the already very small chance of Oracle DB corruption.

At least AIX 5.2 and the latest Oracle 9i R2 releases already include the related updates.

However, the volume group must be created as a Big VG with the mkvg -B option.

(mklv -T O does not check whether the underlying VG is a big VG.)

The mklv -T flag is NOT documented on the mklv man page.

The flag exists for use with Oracle: it tells Oracle NOT to skip the first 4096 bytes of the logical volume.

When Oracle skips the first 4096 bytes, it can cause problems with fractured blocks.

A fractured block is one whose block header and block trailer do not match.

If the block is fractured, redo logs will not aid in recovery.

This can happen when a write issued by Oracle is broken up by AIX into 2 or more writes and not all of those writes complete.
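
To make the fractured-block idea concrete, here is a toy Python model. It assumes a simplified 8K block whose first and last byte carry the same version number; this is not Oracle's real block format, and `make_block` and `is_fractured` are hypothetical names used only for illustration.

```python
BLOCK = 8 * 1024  # assumed database block size (8K)


def make_block(version: int) -> bytearray:
    """Build a toy block: first byte = header version, last byte = trailer version."""
    blk = bytearray(BLOCK)
    blk[0] = version
    blk[-1] = version
    return blk


def is_fractured(blk: bytearray) -> bool:
    """A block is fractured when header and trailer disagree."""
    return blk[0] != blk[-1]


disk = make_block(1)  # old, consistent image already on disk
new = make_block(2)   # new image the database wants to write

# The single 8K write is split into two 4K writes; only the first completes.
disk[:4096] = new[:4096]

torn = is_fractured(disk)
print("fractured after partial write:", torn)
```

The on-disk block now has a new header and an old trailer, which is the mismatch described above.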

For example, if the database block size is 8k, Oracle skips the first 4k of the LV and then uses 8k blocks.

When you lay this out with a 128K stripe size, you will find that a number of blocks are split across 2 drives.

With a 16K stripe size the I/O will be 16K, but because Oracle skips the first 4K, that 16K I/O is spread across 2 PVs:

the first 12K on one PV and the remaining 4K on the next PV.
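
The 12K/4K split can be reproduced with a short Python sketch. It assumes a simple round-robin striping model (strip n lives on PV n modulo the number of PVs); `split_io` is a hypothetical helper, not an AIX interface.

```python
K = 1024


def split_io(start: int, length: int, strip: int, n_pvs: int):
    """Return (pv_index, byte_count) pieces for one I/O on a striped LV."""
    pieces = []
    pos, end = start, start + length
    while pos < end:
        strip_no = pos // strip
        # An I/O piece cannot extend past the end of the current strip.
        chunk_end = min(end, (strip_no + 1) * strip)
        pieces.append((strip_no % n_pvs, chunk_end - pos))
        pos = chunk_end
    return pieces


# 16K I/O starting after the 4K LVCB offset, 16K strips, 2 PVs.
with_offset = split_io(4 * K, 16 * K, 16 * K, 2)
# The same I/O starting at offset 0 (zero-offset LV).
zero_offset = split_io(0, 16 * K, 16 * K, 2)

print(with_offset)  # [(0, 12288), (1, 4096)]
print(zero_offset)  # [(0, 16384)]
```

With the 4K offset the I/O becomes two physical writes on two PVs; with a zero offset it stays a single write.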

After an abnormal halt, the iodone may therefore not come back for the entire 16K I/O.

The iodone may arrive for the first 12K while the remaining 4K is never written.

Without striping, an abnormal halt does not leave one part of the I/O complete and the other part incomplete.

In that case the I/O is either done or not done.

The problem was being seen when a halt -q was issued, and any fractured blocks became corrupted.

When Oracle sees a type of "O" (the uppercase letter O, not the digit zero), it intentionally overwrites the LVCB by starting at the beginning of the LV,

thus preventing fractured 4K blocks.

When creating raw logical volumes, there is an Oracle recommendation to use the command line option -T O (create Oracle volume type).

This creates logical volumes with zero offsets which improves performance and avoids a known Oracle problem.

By default, the first 4k bytes of each AIX Logical Volume are reserved for the Logical Volume Control Block (LVCB).

This means that the first Oracle data block begins at a 4k offset into the Logical Volume.

When fine granularity striping is used (either within AIX LVM or within ESS RAID-5 or RAID-10 arrays),

this can result in a slight I/O performance degradation when an Oracle DB Block Size greater than 4k is used.

(An 8k DB Block Size is typical for OLTP applications and a 16k DB Block size is typical for Data Warehouse applications.)

This is because every few DB blocks are physically split across device boundaries,

with the first part of the DB block residing on one physical disk and the remainder of the DB block residing on another physical disk.

This can result in two physical I/Os being required to read or write a single DB block.

The larger the DB Block size used, the higher the percentage of split blocks and the greater the potential for I/O performance degradation.
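
The relationship between block size and the share of split blocks can be checked by brute force. The sketch below assumes a 64K strip and a 4K LVCB offset; `split_fraction` is a hypothetical helper that simply counts blocks whose first and last byte fall in different strips.

```python
K = 1024


def split_fraction(block: int, strip: int, offset: int, n_blocks: int = 4096) -> float:
    """Fraction of the first n_blocks database blocks that cross a strip boundary."""
    split = 0
    for i in range(n_blocks):
        first = offset + i * block   # first byte of block i
        last = first + block - 1     # last byte of block i
        if first // strip != last // strip:
            split += 1
    return split / n_blocks


for block in (4 * K, 8 * K, 16 * K):
    with_lvcb = split_fraction(block, 64 * K, 4 * K)
    zero_off = split_fraction(block, 64 * K, 0)
    print(f"{block // K:>2}K block: {with_lvcb:.1%} split with 4K offset, "
          f"{zero_off:.1%} with zero offset")
```

Under these assumptions 4K blocks are never split, 8K blocks are split 12.5% of the time and 16K blocks 25% of the time, while a zero offset removes the splits entirely.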

Suppose the stripe width is 64K. Oracle offsets its data 4K from the beginning of the LV

to protect the LVCB from being overwritten, so 1 in 4 of its 16K writes is split over 2 stripes.

Oracle, however, expects a 16K write to be either done or not done and cannot handle a partially completed I/O, which can happen

when halt -q is issued without a sync.

If striping is not used, then with a 128M partition only 1 write out of 8192 has the potential of being split.
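
Both ratios follow from the same arithmetic. Assuming 16K writes misaligned by the 4K offset, and a boundary size that is a multiple of the write size, exactly one write straddles each boundary, so the ratio is write size divided by boundary size. `split_ratio` is a hypothetical name for this sketch.

```python
from fractions import Fraction

K = 1024
M = 1024 * K


def split_ratio(io_size: int, boundary: int) -> Fraction:
    """Share of misaligned I/Os that straddle a boundary of the given size."""
    return Fraction(io_size, boundary)


striped = split_ratio(16 * K, 64 * K)     # 64K stripe width
unstriped = split_ratio(16 * K, 128 * M)  # 128M partition, no striping

print(striped)    # 1/4
print(unstriped)  # 1/8192
```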

When running Oracle9i Release 2 (or later), it is possible to eliminate the 4k bytes offset.

This is recommended for new Oracle implementations or for existing applications with extremely high I/O performance requirements.

Currently, the capability to do this is delivered in two parts:

1. IBM AIX e-fix (APAR IY36656 for AIX 5.1 or APAR IY38578 for AIX 4.3) and

2. Oracle patch (bug 2620053).

The functionality will be included in future release levels of AIX and in Oracle 9.2.0.3 or later.

Once the prerequisite software has been installed, do the following to take advantage of the zero offset feature:

1. Create a “big” Volume Group using the mkvg -B flag.

In big volume groups the LVCB information is mirrored in the VGDA so we can overwrite the original LVCB.

2. Create one or more Logical Volumes in that Volume Group using the mklv -T O flag.

The “-T O” option indicates to Oracle that it can safely use a 0 offset for this Logical Volume.
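
The two steps above can be sketched as a small Python helper that only builds the command lines and runs nothing. The volume group, logical volume and disk names are hypothetical, and the `-y` (name) and `-S` (strip size) flags are not mentioned in the text above; they are included on the assumption that they match your AIX level, so check the mkvg and mklv man pages before using them.

```python
import shlex


def zero_offset_commands(vg, lv, n_lps, disks, strip_size="128K"):
    """Build (but do not run) the mkvg/mklv commands for a zero-offset striped LV."""
    # Step 1: big volume group (-B), so the LVCB is also kept in the VGDA.
    mkvg = ["mkvg", "-B", "-y", vg, *disks]
    # Step 2: logical volume of type Oracle (-T O), striped across the disks.
    mklv = ["mklv", "-y", lv, "-T", "O", "-S", strip_size,
            vg, str(n_lps), *disks]
    return [shlex.join(mkvg), shlex.join(mklv)]


# Hypothetical names: oravg, oradata01, hdisk2, hdisk3.
cmds = zero_offset_commands("oravg", "oradata01", 64, ["hdisk2", "hdisk3"])
for cmd in cmds:
    print(cmd)
```

Printing the commands first makes it easy to review them before running anything as root on the AIX host.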

In order to eliminate the 4k bytes offset for an existing Oracle database, new Logical Volumes must be created and

the existing data must be migrated to the new Logical Volumes using normal migration procedures.

At present there is no way to tell whether the flag is set other than reading the raw data and comparing it with the structures.

It can probably be seen in the driver using kdb. A way to display it will most likely be available in the next release.

The LVCB for the small VG is stored in the first 512 bytes of the LV.

Now after the application of IY36656, Oracle (which used to skip the first few bytes before so as not to overwrite the LVCB)

will write from the beginning of the LV which means it will overwrite the LVCB.

In a big VG, however, there is also a copy of the LVCB in the VGDA, so even if Oracle overwrites the first 512 bytes,

importvg will still work because the data is in the VGDA.

If you are using a small VG, some of the LV-related information is lost (even if Oracle uses a raw device).

So you will most probably be safer using a big VG.
