I came across this article and found it very informative.
These are my notes on what I thought was important and what I should review. I would suggest reading the entire article to get a better handle on the information; it is complete with detailed explanations of scenarios and diagrams of the different setups. It looks like a metro storage cluster can span a few miles, even 20 miles, going by the example cities in the article. Such a setup would probably be good for business continuity when a problem affects a single building, like a tornado, but it wouldn't be effective in a large DR scenario like a hurricane, which affects a wide area.
Link to article
- Combines synchronous replication with array-based clustering.
- These solutions typically are deployed in environments where the distance between datacenters is limited, often metropolitan or campus environments.
- VMware vMSC infrastructures are implemented with the goal of reaping the same benefits that high-availability clusters provide to a local site, but in a geographically dispersed model with two datacenters in different locations.
- The architecture is built on the idea of extending what is defined as “local” in terms of network and storage.
- [NOT like] traditional synchronous replication solutions, [which] create a primary/secondary relationship between the active primary LUN, where data is being accessed, and the secondary LUN, which is receiving replication.
- [In traditional synchronous replication] To access the secondary LUN, replication is stopped (or reversed) and the LUN is made visible to hosts. This now-“promoted” secondary LUN has a completely different LUN ID.
- A VMware vMSC configuration enables live migration of running virtual machines between sites.
- Uniform host access configuration – When ESXi hosts from both sites are all connected to a storage node in the storage cluster across all sites. Paths presented to ESXi hosts are stretched across distance.
- Nonuniform host access configuration – ESXi hosts in each site are connected only to storage node(s) in the same site. Paths presented to ESXi hosts from storage nodes are limited to the local site.
- With the uniform configuration, hosts in datacenter-A and datacenter-B have access to the storage systems in both datacenters. In effect, the storage-area network is stretched between the sites, and all hosts can access all LUNs.
- Read/write access to a LUN takes place on one of the two arrays, and a synchronous mirror is maintained in a hidden, read-only state on the second array.
- if a LUN containing a datastore is read/write on the array at datacenter-A, all ESXi hosts access that datastore via the array in datacenter-A.
- For ESXi hosts in datacenter-A, this is local access. ESXi hosts in datacenter-B send read/write traffic across the network.
- In case of an outage, or operator-controlled shift of control of the LUN to datacenter-B, all ESXi hosts continue to detect the identical LUN being presented, except that it is now accessed via the array in datacenter-B.
- uniform configurations are currently the most commonly deployed.
- hosts in datacenter-A have access only to the array within the local datacenter.
- concept of a “virtual LUN” that enables ESXi hosts in each datacenter to read and write to the same datastore/LUN.
- [There are two arrays, one in each DC; the solution] maintains the cache state on each array, so an ESXi host in either datacenter detects the LUN as local.
- EMC refers to this solution as “write anywhere.”
- When two virtual machines reside on the same datastore but are located in different datacenters, they write locally without any performance impact on either of them.
- This configuration of the LUNs/datastores has “site affinity” defined.
- If [there is a link problem] between [the] sites, the storage system on the preferred site for a given datastore is the only remaining one that has read/write access to it, thereby preventing any data corruption in the case of a failure scenario.
- [The paper] simulated an environment with two sites (Frimley and Bluefin, in the United Kingdom).
- The network between the sites is a stretched layer 2 network with minimal distance between them, as is typical in campus cluster scenarios.
- The fabric configuration ::in a uniform device access model:: means every host in the cluster is connected to both storage heads.
- For any given LUN, one of the two storage heads presents the LUN as read/write via iSCSI.
- The opposite storage head maintains the replicated, read-only copy that is effectively hidden from the ESXi hosts.
- A full-site failure is one scenario [for which] VMware recommends enabling admission control.
- Because workload availability is the primary driver for stretched-cluster environments, it is recommended that sufficient capacity be allotted for a full-site failure.
- Because such hosts are equally divided across the two sites, and to ensure that all workloads can be restarted by vSphere HA, configuring the admission control policy to 50 percent is advised.
- VMware recommends the percentage-based policy because it offers the most flexibility and reduces operational overhead.
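
My own hedged sketch of what this looks like in PowerCLI (not from the article; the vCenter address and cluster name are made up). Older PowerCLI has no direct switch for the percentage-based policy, so this drops down to the vSphere API through the cluster's ExtensionData:

```powershell
# Sketch only: assumes a cluster named "StretchedCluster" (hypothetical).
Connect-VIServer -Server vcenter.example.local

$cluster = Get-Cluster -Name "StretchedCluster"

$spec = New-Object VMware.Vim.ClusterConfigSpecEx
$spec.dasConfig = New-Object VMware.Vim.ClusterDasConfigInfo
$spec.dasConfig.admissionControlEnabled = $true

# Reserve 50% CPU and 50% memory so a full-site failure can be absorbed.
$policy = New-Object VMware.Vim.ClusterFailoverResourcesAdmissionControlPolicy
$policy.cpuFailoverResourcesPercent = 50
$policy.memoryFailoverResourcesPercent = 50
$spec.dasConfig.admissionControlPolicy = $policy

# $true = modify the existing config rather than replace it.
$cluster.ExtensionData.ReconfigureComputeResource_Task($spec, $true)
```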
- Two heartbeat mechanisms: network heartbeating and datastore heartbeating.
- Network heartbeating is the primary mechanism; datastore heartbeating is the secondary, used after network heartbeating has failed.
- If a host is not receiving any heartbeats, it uses a fail-safe mechanism to detect whether it is merely isolated from its master node or is completely isolated from the network. It does this by pinging the default gateway.
- VMware recommends specifying a minimum of two additional isolation addresses and that each of these addresses be site local.
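
A sketch of how I'd set the site-local isolation addresses with PowerCLI (the addresses and cluster name are invented for illustration):

```powershell
$cluster = Get-Cluster -Name "StretchedCluster"

# One gateway-style address per site, reachable only within that site.
New-AdvancedSetting -Entity $cluster -Type ClusterHA `
    -Name "das.isolationaddress1" -Value "192.168.10.1" -Confirm:$false
New-AdvancedSetting -Entity $cluster -Type ClusterHA `
    -Name "das.isolationaddress2" -Value "192.168.20.1" -Confirm:$false
```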
- [The 5.0 U1 enhancements] enable an automated failover of virtual machines residing on a datastore that has a PDL condition.
- A PDL condition is one that is communicated by the array controller to ESXi via a SCSI sense code. This condition indicates that a device (LUN) has become unavailable and is likely to be permanently unavailable.
- An example scenario [where a PDL is] communicated by the array is when a LUN is set offline. This condition is used in nonuniform models during a failure scenario, to ensure that ESXi takes appropriate action when access to a LUN is revoked.
- When a full storage failure occurs, it is impossible to generate the PDL condition because there is no chance of communication between the array and the ESXi host. This state will be identified by the ESXi host as an all paths down (APD) condition.
- A virtual machine is killed as soon as it initiates disk I/O on a datastore that is in a PDL state AND all of the virtual machine's files reside on that datastore.
- If the virtual machine's files do not all reside on the same datastore and a PDL condition exists on one of the datastores, the virtual machine will not be killed.
- VMware recommends placing all files for a given virtual machine on a single datastore, ensuring that PDL conditions can be mitigated by vSphere HA.
- VMware recommends setting disk.terminateVMonPDLDefault to True. A virtual machine is killed only when issuing I/O to the datastore. Otherwise, it remains active.
- A virtual machine that is running memory-intensive workloads without issuing I/O to the datastore might remain active in such situations.
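
If I remember the paper right, in vSphere 5.0 U1 this is a per-host setting made by editing /etc/vmware/settings on each ESXi host. Roughly the entry below, though I haven't verified the exact casing/quoting, so double-check against the paper:

```
# entry in /etc/vmware/settings on each ESXi host (vSphere 5.0 U1)
disk.terminateVMOnPDLDefault = "True"
```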
- VMware recommends enabling VMware DRS to allow for load balancing across hosts in the cluster. Its load balancing calculation is based on CPU and memory use.
- To prevent storage and network traffic overhead in a stretched-cluster environment, VMware recommends implementing VMware DRS affinity rules to enable a logical separation of virtual machines.
- “Storage site affinity” is the preferred location for access to a given LUN.
- VMware recommends implementing “should rules,” because these are violated by vSphere HA in the case of a failure. Availability of services should always prevail over performance.
- [With] “must rules,” vSphere HA does not violate the rule set. This might potentially lead to service outages.
- In the scenario where a full datacenter fails, “must rules” make it impossible for vSphere HA to restart the virtual machines.
- VMware recommends manually defining “sites” by creating a group of hosts that belong to a site and adding virtual machines to these sites based on the affinity of the datastore on which they are provisioned.
- VMware recommends automating the process of defining site affinity by using tools such as VMware® vCenter Orchestrator or VMware vSphere PowerCLI.
- If automating the process is not an option, using a generic naming convention is recommended to simplify the creation of these groups.
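
The DRS group/rule cmdlets shipped in PowerCLI releases newer than this vSphere 5.0-era paper, but they make the idea concrete. A sketch using the article's Frimley site as the example (the cluster name and the host/VM naming filters are my invention):

```powershell
$cluster = Get-Cluster -Name "StretchedCluster"

# Host group per site, selected here via a naming convention.
$frimleyHosts = Get-VMHost -Location $cluster |
    Where-Object { $_.Name -like "esx-frimley-*" }
New-DrsClusterGroup -Name "Frimley-Hosts" -Cluster $cluster -VMHost $frimleyHosts

# VM group for VMs whose datastores have Frimley site affinity.
$frimleyVMs = Get-VM -Location $cluster |
    Where-Object { $_.Name -like "frimley-*" }
New-DrsClusterGroup -Name "Frimley-VMs" -Cluster $cluster -VM $frimleyVMs

# A "should run" rule, so vSphere HA may still violate it during a failure.
New-DrsVMHostRule -Name "Frimley-Affinity" -Cluster $cluster `
    -VMGroup "Frimley-VMs" -VMHostGroup "Frimley-Hosts" -Type "ShouldRunOn"
```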
- VMware DRS is invoked every 5 minutes by default, but it is also triggered if the cluster detects changes, for instance when a host reconnects to the cluster.
- Storage DRS enables aggregation of datastores into a single unit of consumption from an administrative perspective, and it balances virtual machine disks when defined thresholds are exceeded.
- [Because] stretched storage systems use synchronous replication, a migration or series of migrations has an impact on replication traffic and might cause the virtual machines to become temporarily unavailable due to contention for network resources during the movement of disks.
- Migration to random datastores might also potentially lead to additional I/O latency in uniform access configurations if virtual machines are not migrated along with their virtual disks. For example, if a virtual machine on a host in site A has its disk migrated to a datastore in site B, it will continue operating but with degraded performance. The virtual machine's disk reads will be subject to the increased latency associated with reading from the virtual iSCSI IP at site B, and reads will be subject to intersite latency instead of being satisfied by a local target.
- If a host fails, the failure is detected by the cluster's HA master node because network heartbeats from it are no longer being received. After the master node has detected that network heartbeats are absent, it will start monitoring for datastore heartbeats.
- third availability check is performed by pinging the management addresses of the failed hosts.
- If all of these checks return as unsuccessful, the master node will declare the missing host dead and will attempt to restart all the protected virtual machines
- [When a] host is isolated, it generates datastore heartbeats. Detection of these heartbeats enables the HA master node to determine that the host is running but is isolated from the network.
- Depending on the isolation response configured, the impacted host might choose to Power Off or Shut Down virtual machines, or alternatively to leave the virtual machines powered on.
- The isolation response is triggered 30 seconds after the host has detected that it is isolated.
- [Per] best practices, Leave Powered On is the recommended isolation response setting for the majority of environments. Isolated hosts are a rare event in a properly architected environment.
- In environments that use network storage, such as iSCSI and NFS, the recommended isolation response is Power Off. With these environments, it is more likely that a network outage that causes a host to become isolated will also affect the host's ability to communicate to the datastores.
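
A quick sketch of setting the isolation response with PowerCLI (cluster name made up). DoNothing corresponds to Leave Powered On; you'd use PowerOff for the iSCSI/NFS case above:

```powershell
Get-Cluster -Name "StretchedCluster" |
    Set-Cluster -HAIsolationResponse DoNothing -Confirm:$false
```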
Overview::Stretched storage cluster or Metro storage cluster.
It is implemented in environments where disaster/downtime avoidance is a key requirement.
Isn't that a requirement for everyone to some degree?
Technical Requirements and Constraints
• The maximum supported network latency between sites for the VMware® ESXi™ management networks is 10ms round-trip time (RTT).
– 10ms of latency for vMotion is supported only with VMware vSphere® Enterprise Plus Edition licenses (Metro vMotion).
• The maximum supported latency for synchronous storage replication links is 5ms RTT.
• A minimum of 622Mbps network bandwidth, configured with redundant links, is required for the ESXi vMotion network.
The bandwidth requirement alone throws this into an enterprise-level-only price range.
A VMware vMSC requires what is in effect a single storage subsystem that spans both sites.
So it is not replication based.
VMware vMSC solutions are classified in two distinct categories: uniform and nonuniform host access configurations.
Looks like uniform would let your host run the VM on a subsystem in another building, and nonuniform means the VM has to be svMotioned over to the local part of the subsystem.
More on Uniform Access
More on Non-Uniform Access
Infrastructure (in the example)
vSphere HA
configuring the admission control policy to 50 percent is advised.
vSphere HA uses heartbeat mechanisms to validate the state of a host.
VMware recommends increasing the number of heartbeat datastores [from two] to four.
The minimum number of heartbeat datastores is two and the maximum is five.
Four is recommended in a stretched-cluster environment because that would provide full redundancy in both locations.
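
A sketch of bumping the heartbeat datastore count to four via the das.heartbeatDsPerHost HA advanced option (cluster name made up):

```powershell
$cluster = Get-Cluster -Name "StretchedCluster"
New-AdvancedSetting -Entity $cluster -Type ClusterHA `
    -Name "das.heartbeatDsPerHost" -Value 4 -Confirm:$false
```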
Permanent Device Loss Enhancements in VMware vSphere 5.0 Update 1
VMs and PDL
VMware and DRS
VMware vSphere Storage DRS
- [The impact arises when virtual] machines are not migrated along with their virtual disks, from a site perspective.
- VMware recommends that VMware Storage DRS be configured in manual mode. This enables human validation per recommendation and allows recommendations to be applied during off-peak hours.
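
A sketch of pinning Storage DRS to manual mode with PowerCLI (the datastore cluster name is made up), so each migration recommendation gets human review:

```powershell
Get-DatastoreCluster -Name "Frimley-DatastoreCluster" |
    Set-DatastoreCluster -SdrsAutomationLevel Manual
```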
Host Failures: Single-Host Failure
- VMware recommends manually invoking VMware DRS to ensure that all virtual machines are placed on hosts in the correct location, to prevent possible performance degradation.
Host Failures: Single-Host Isolation
Host Failures: Storage Partition
Virtual machines remained running, with no impact.
….