HACMP PD Training

HACMP v5.4.1 limits

Cluster limits
- 32 nodes in a cluster
- 64 resource groups per cluster
- 256 IP addresses known to HACMP
- 128 application monitors
- Two sites
RSCT limit
- 48 heartbeat rings for IP and non-IP networks combined

clcomdES (the cluster communication daemon, which handles communication between nodes): is rsh no longer needed in V5?

Topology - networking-centric

Resources -

Resource Group

Resource Group policies

Customization : Process of augmenting HACMP

A node name must not begin with a digit!

Setting the HACMP PATH (.profile)

/usr/es/sbin/cluster

/usr/es/sbin/cluster/utilities

/usr/es/sbin/cluster/etc

/usr/es/sbin/cluster/diag
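
A minimal sketch of what the .profile lines could look like, assuming a ksh/POSIX-style login shell. The four directories are the ones listed above; the helper name add_to_path is hypothetical and only keeps PATH free of duplicate entries.

```shell
# Append the HACMP directories listed above to PATH, skipping any
# directory that is already present. add_to_path is a hypothetical helper.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already in PATH, nothing to do
        *) PATH="$PATH:$1" ;;        # otherwise append it
    esac
}

for d in /usr/es/sbin/cluster \
         /usr/es/sbin/cluster/utilities \
         /usr/es/sbin/cluster/etc \
         /usr/es/sbin/cluster/diag
do
    add_to_path "$d"
done
export PATH
```

Sourcing the file twice leaves PATH unchanged the second time, which keeps repeated logins or nested shells from growing the variable.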

Commands

Disk heartbeat test

#/usr/sbin/rsct/bin/dhb_read -p hdisk4 -r : run on the first node

#/usr/sbin/rsct/bin/dhb_read -p hdisk4 -t : run on the second node

#lssrc -ls topsvcs

NIM's PID: 524472
diskhb_0       [ 2] 2     2     S    255.255.10.0    255.255.10.1
diskhb_0       [ 2] rhdisk5          0x84dd646a      0x84dd649e
HB Interval = 2.000 secs. Sensitivity = 4 missed beats
Missed HBs: Total: 0 Current group: 0
Packets sent    : 37 ICMP 0 Errors: 0 No mbuf: 0
Packets received: 43 ICMP 0 Dropped: 0

Check whether HA is running (5.3 and later)
# lssrc -ls clstrmgrES | grep state

Cluster states :

ST_INIT : cluster configured and down

ST_JOINING : node joining the cluster

ST_VOTING : Inter-node decision state for an event

ST_RP_RUNNING : cluster running Recovery program

ST_BARRIER : clstrmgr waiting at the barrier statement

ST_CBARRIER : clstrmgr is exiting recovery program

ST_UNSTABLE : cluster unstable

NOT_CONFIGURED : HA installed but not configured

RP_FAILED : event script failed

ST_STABLE : Cluster services are running with managed resources (stable cluster), or cluster services have been "forced" down with resource groups potentially in the UNMANAGED state (HACMP 5.4 and later)
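
The state list above can be turned into a small lookup for use in scripts. This is only a sketch: the helper names describe_state and extract_state are hypothetical, the descriptions are copied from the list, and it assumes the state name appears as a whole word (for example ST_STABLE) in the line that the grep above returns.

```shell
# Map a clstrmgrES state name to the description from the list above.
describe_state() {
    case "$1" in
        ST_INIT)        echo "cluster configured and down" ;;
        ST_JOINING)     echo "node joining the cluster" ;;
        ST_VOTING)      echo "inter-node decision state for an event" ;;
        ST_RP_RUNNING)  echo "cluster running recovery program" ;;
        ST_BARRIER)     echo "clstrmgr waiting at the barrier statement" ;;
        ST_CBARRIER)    echo "clstrmgr is exiting recovery program" ;;
        ST_UNSTABLE)    echo "cluster unstable" ;;
        NOT_CONFIGURED) echo "HA installed but not configured" ;;
        RP_FAILED)      echo "event script failed" ;;
        ST_STABLE)      echo "cluster services running, or forced down with resource groups possibly UNMANAGED" ;;
        *)              echo "unknown state: $1" ;;
    esac
}

# Read text on stdin and print the first state name found in it.
extract_state() {
    awk 'match($0, /ST_[A-Z_]+|NOT_CONFIGURED|RP_FAILED/) {
             print substr($0, RSTART, RLENGTH); exit
         }'
}

# Intended use on a cluster node (not run here):
#   describe_state "$(lssrc -ls clstrmgrES | grep state | extract_state)"
```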

Check resource group information

#clRGinfo

-----------------------------------------------------------------------------
Group Name     State                        Node
-----------------------------------------------------------------------------
webApp_group   ONLINE                       node3
               OFFLINE                      node4
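
For scripting, the node that currently hosts a group can be pulled out of this table. The sketch below assumes output shaped exactly like the sample above (a three-column row for the first node of a group, two-column rows for the remaining nodes, single-word states); rg_online_node is a hypothetical helper name, and rows with any other number of fields are ignored.

```shell
# Print the node(s) on which the given resource group is ONLINE,
# reading clRGinfo-style text from stdin.
rg_online_node() {
    awk -v rg="$1" '
        NF == 3 { group = $1; state = $2; node = $3 }   # first row of a group
        NF == 2 { state = $1; node = $2 }               # continuation row, same group
        (NF == 2 || NF == 3) && group == rg && state == "ONLINE" { print node }
    '
}

# Intended use on a cluster node (not run here):
#   clRGinfo | rg_online_node webApp_group
```

Separator lines, the header line, and blank lines have a different field count, so they fall through without matching any rule.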

Check which major numbers are still available

#lvlstmajor (the numbers listed are available)

HACMP synchronization command

#cldare -rtV normal (DARE: Dynamic Automatic Reconfiguration Event)

DARE requires three copies of the ODM
- DCD /etc/objrepos
- SCD /usr/es/sbin/cluster/etc/objrepos/staging
- ACD /usr/es/sbin/cluster/etc/objrepos/active

System monitoring via RMC (built to monitor the various resources that IBM software manages)

#lsrsrc -A p|d|b (p: persistent, d: dynamic, b: both)

p : Persistent (static) attributes describe enduring characteristics

d : Dynamic attributes represent changing characteristics

b : Lists both persistent and dynamic attributes

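Use this when an HA event script hits an error and processing stalls without going any further: fix the problem first, then run the command so the resources can move over to the other node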

#clruncmd node_name

Snapshot

#clsnapshot -c -i -n 'snapshot_name' -d 'description'

To change the save location: SNAPSHOTPATH=some_other_directory

/usr/es/sbin/cluster/netmon.cf : configure this when a node has only one network adapter, so that on a network failure HACMP can determine which adapter is at fault (a single gateway IP entry is enough)

/usr/es/sbin/cluster/events : the collection of scripts that HA uses by default

HACMP log files

The HACMP log is produced through /etc/syslog.conf

The daily backup file produced by the clcycle command is created as /usr/es/sbin/cluster/history/cluster.mmddyyyy

/usr/es/adm/cluster.log (v5.3) : cluster log

/var/hacmp/adm/cluster.log (v5.4) : cluster log

/usr/es/adm/cluster.log : records start and stop information for every cluster event generated in a running cluster

/var/hacmp/clverify/clverify.log : Contains the verbose messages output by the cluster verification utility

/tmp/cspoc.log : Contains time-stamped, formatted messages generated by HACMP C-SPOC commands. The file resides on the node where the C-SPOC command was invoked

/var/hacmp/log/clutils.log : Contains the results of the following utilities
- Automatic cluster configuration verification
- File collection utility
- Two-node cluster configuration assistant
- Cluster Test Tool(CTT)
- OLPW conversion tool
/var/adm/clavan.log : Contains the uptime state of applications managed by HACMP

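Script to create PVs

# A B C D are the appropriate hdisk numbers for your system
# Assign a PVID to each disk
for i in A B C D
do
    chdev -a pv=yes -l hdisk$i
done

# Remove the disk definitions (-d also deletes them from the ODM)
for i in A B C D
do
    rmdev -dl hdisk$i
done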

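Creating the volume group in concurrent mode makes failover faster

When an enhanced concurrent volume group is created, a small area used by RSCT is set aside on the disk

Creating a separate log LV when creating an LV improves performance.

Command to format the log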

#logform /dev/web_log_lv

Creating a dummy disk

#mkdev -c disk -t 1000mb -s scsi -p scsi0 -w 9,0 -d

Fast failure detection on NIC failure and node halt

Heartbeating over IP Aliases

This can be used when you have no choice but to use IP addresses in the same subnet

With IZ26020 applied, a service IP in the same subnet as the boot IP can be used with IPAT

http://www-01.ibm.com/support/docview.wss?uid=isg1IZ26020

Multi-node disk heartbeat

Tuning the Failure Detection Rate(FDR) for NIM(Network Interface Module)

Changing the Failure Detection Rate to Slow is suitable for environments where the network is unstable

Extended Topology -> Configure HACMP Network Modules -> Change a Network Module using Predefined Values -> ether

Network Module Name                                 ether
Description                                         Ethernet Protocol
Failure Detection Rate                              Normal -> change to Slow

Checking the heartbeat interval

Extended Topology -> Configure HACMP Network Modules -> Change a Network Module using Custom Values -> ether

Failure Cycle [10] -> 10 times
Interval between Heartbeats (seconds) [1.00] -> every 1 second

#lssrc -ls topsvcs | more
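
As a rough worked example of what these two values mean together: assuming the commonly documented rule that detection time is the heartbeat interval times the failure cycle times 2 (an assumption, not stated in these notes), the Ethernet values above work out to about 20 seconds. The helper name detection_time is hypothetical.

```shell
# Estimate the failure detection time in seconds.
# Assumption: detection time = heartbeat interval * failure cycle * 2.
detection_time() {
    awk -v interval="$1" -v cycle="$2" 'BEGIN { print interval * cycle * 2 }'
}

# Ethernet custom values from the panel above: interval 1.00 s, failure cycle 10
detection_time 1.00 10
```

The same function can be fed the "HB Interval" and "Sensitivity" figures that lssrc -ls topsvcs reports for a network, to compare the configured values with what Topology Services is actually using.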

Enabling Fast Failure Detection (FFD) on node halt (HACMP 5.4 and later; requires a diskhb network)

Extended Topology -> Configure HACMP Network Modules -> Change a Network Module using Custom Values -> diskhb

Network Module Name                                 diskhb
Description                                        [Disk Heartbeating Pro>
Address Type                                        Device                 +
Path                                               [/usr/sbin/rsct/bin/ha> /
Parameters                                         [FFD_ON] -> when the system panics, takeover happens regardless of the heartbeat interval
Grace Period                                       [30]                     #
Supports gratuitous arp                            [false]                 +

Configuring parent/child resource group dependencies (v5.2 and later) : the AppServer configured as the child is not started unless the AppServer configured as the parent has started first

Extended Resource Configuration -> Configure Resource Group Run-Time Policies -> Configure Dependencies between Resource Groups -> Configure Parent/Child Dependency

HACMP event descriptions

# Skip the *.rp recovery program files; grep -p is the AIX paragraph option
for f in `ls /usr/es/sbin/cluster/events | grep -v '\.rp$'`
do
    grep -p -i "desc" /usr/es/sbin/cluster/events/$f
done >> ./events.txt

RSCT commands

#lsrsrc

#lsrsrc -A b IBM.FileSystem

#lsrsrc -A b IBM.Host

#lssrc -g rsct_rm

#lssrc -s IBM.FSRM

Cross-site LVM Mirroring

HACMP will not run if the /var file system is full

forced down : stops only HACMP while leaving the services running

Before upgrading

The cluster must be stable and synchronize cleanly

Save the snapshot in a different directory (SNAPSHOTPATH=some_other_directory)

Save customization files in a different directory

What not to do during an upgrade

Do not touch /usr/sbin/cluster, /usr/es/sbin/cluster, or /usr/lpp/cluster

Do not synchronize the cluster

Do not stop a node and place resource groups in an UNMANAGED state

Do not attempt a DARE or a C-SPOC command

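#varyonvg -b -u vg_name (-b: break reservation) : because the volume group reservation is broken, the volume group can be varied on from both nodes

Check whether a raw device was created with the -TO (capital letter O) option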

#lslv raw_system

Devicesubtype = DS_LVZ

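To create a raw device with the -T option in a big volume group:
first create it with mklv -y lv_name -TO (capital letter O), then create any LV through C-SPOC; all of the earlier volume changes are then picked up and propagated along with it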

network configuration rules

                                IPAT via Aliasing subnet       IPAT via Replacement subnet
Boot interfaces                 different                      different
Service                         different from boot subnets    same as one boot subnet
Persistent                      different from boot subnets    different from boot subnets, or same as service
Heartbeating over IP Aliasing   private subnet                 private subnet
HWAT (no longer used)           not supported                  yes
Netmask                         same for all                   same for all
EtherChannel                    yes                            not supported
Virtual adapter support         yes                            not supported

AIX and HACMP upgrade version compatibility

                    AIX 5.2   AIX 5.3   AIX 6.1   End of support
HACMP 5.3           YES       YES       YES       30 Sep 2009
HACMP 5.4 / 5.4.1   YES       YES       YES       N/A
