Angels Technology


Friday, July 13, 2012

What is a stretched storage cluster / metro storage cluster?

Posted on 9:06 AM by Unknown

I found this article and found it very informative.
These are my notes on what I thought was important and what I should review. I would suggest reading the entire article to get a better handle on the information; it is complete with detailed explanations of scenarios and diagrams of the different setups. It looks like a metro storage cluster can span a few miles, even 20 miles, going by the example cities in the article. A setup like this would probably be good for business continuity when a problem affects a single building, like a tornado, but it wouldn't be effective in a large DR scenario like a hurricane, which affects a wide area.

Link to article
http://www.vmware.com/files/pdf/techpaper/vsphr-cs-mtro-stor-clstr-uslet-101-web-1.pdf

 
    Overview: Stretched storage cluster, or metro storage cluster.
    It is implemented in environments where disaster/downtime avoidance is a key requirement.
    Isn't that a requirement for everyone, to some degree?


  1. Combines synchronous replication with array-based clustering.
  2. These solutions typically are deployed in environments where the distance between datacenters is limited, often metropolitan or campus environments.
  3. VMware vMSC infrastructures are implemented with the goal of reaping the same benefits that high-availability clusters provide to a local site, but in a geographically dispersed model with two datacenters in different locations.
  4. The architecture is built on the idea of extending what is defined as “local” in terms of network and storage.


  5. Technical Requirements and Constraints
    • The maximum supported network latency between sites for the VMware® ESXi™ management networks is 10ms round-trip time (RTT).
    – 10ms of latency for vMotion is supported only with VMware vSphere® Enterprise Plus Edition licenses (Metro vMotion).
    • The maximum supported latency for synchronous storage replication links is 5ms RTT.
    • A minimum of 622Mbps network bandwidth, configured with redundant links, is required for the ESXi vMotion network.
    The bandwidth requirement alone puts this into an enterprise-level-only price range.


     A VMware vMSC requires what is in effect a single storage subsystem that spans both sites.
  6. This is NOT like traditional synchronous replication solutions, which create a primary/secondary relationship between the active primary LUN, where data is being accessed, and the secondary LUN, which is receiving replication.
  7. In traditional synchronous replication, to access the secondary LUN, replication is stopped (or reversed) and the LUN is made visible to hosts. This now “promoted” secondary LUN has a completely different LUN ID.
  8. A VMware vMSC configuration enables live migration of running virtual machines between sites.
  9. So it is not replication based.


    VMware vMSC solutions are classified in two distinct categories:
  10.  Uniform host access configuration – When ESXi hosts from both sites are all connected to a storage node in the storage cluster across all sites. Paths presented to ESXi hosts are stretched across distance.
  11.  Nonuniform host access configuration – ESXi hosts in each site are connected only to storage node(s) in the same site. Paths presented to ESXi hosts from storage nodes are limited to the local site.
  12. Looks like uniform would let your host run the VM on a subsystem in another building, and nonuniform means the VM has to be Storage vMotioned over to the local part of the subsystem.

    More on Uniform Access
  13. With the uniform configuration, hosts in datacenter-A and datacenter-B have access to the storage systems in both datacenters. In effect, the storage-area network is stretched between the sites, and all hosts can access all LUNs.
  14. Read/write access to a LUN takes place on one of the two arrays, and a synchronous mirror is maintained in a hidden, read-only state on the second array.
  15. If a LUN containing a datastore is read/write on the array at datacenter-A, all ESXi hosts access that datastore via the array in datacenter-A.
  16. For ESXi hosts in datacenter-A, this is local access. ESXi hosts in datacenter-B send read/write traffic across the network.
  17. In case of an outage, or an operator-controlled shift of control of the LUN to datacenter-B, all ESXi hosts continue to detect the identical LUN being presented, except that it is now accessed via the array in datacenter-B.
  18. Uniform configurations are currently the most commonly deployed.

  19. More on Nonuniform Access
  20. Hosts in datacenter-A have access only to the array within the local datacenter.
  21. The concept of a “virtual LUN” enables ESXi hosts in each datacenter to read and write to the same datastore/LUN.
  22. There are two arrays, one in each datacenter; the solution maintains the cache state on each array, so an ESXi host in either datacenter detects the LUN as local.
  23. EMC refers to this solution as “write anywhere.”
  24. If two virtual machines reside on the same datastore but are located in different datacenters, they write locally without any performance impact on either of them.
  25. This configuration of the LUNs/datastores has “site affinity” defined.
  26. If there is a link problem between the sites, the storage system on the preferred site for a given datastore is the only remaining one that has read/write access to it, thereby preventing any data corruption in a failure scenario.

  27. Infrastructure (in the example)
  28. The paper simulates an example environment.
  29. It has two sites (Frimley and Bluefin, in the United Kingdom).
  30. The network between the sites is a stretched layer 2 network with a minimal distance between them, as is typical in campus cluster scenarios.
  31. The fabric configuration, in a uniform device access model, means every host in the cluster is connected to both storage heads.
  32. For any given LUN, one of the two storage heads presents the LUN as read/write via iSCSI.
  33. The opposite storage head maintains the replicated, read-only copy that is effectively hidden from the ESXi hosts.

  34. vSphere HA
  35. A full-site failure is one scenario for which VMware recommends enabling admission control.
  36. Because workload availability is the primary driver for stretched-cluster environments, it is recommended that sufficient capacity be allotted for a full-site failure.
  37. Because hosts are equally divided across the two sites, and to ensure that all workloads can be restarted by vSphere HA,
  38. configuring the admission control policy to 50 percent is advised (see the PowerCLI sketch after this list).
  39. VMware recommends the percentage-based policy because it offers the most flexibility and reduces operational overhead.
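
    None of the following is from the paper; it is just my rough PowerCLI sketch of applying that 50 percent recommendation. Older PowerCLI releases do not expose the percentage-based policy as a simple cmdlet switch, so this goes through the vSphere API; the vCenter and cluster names are placeholders.

        # Placeholder names; sets HA admission control to the percentage-based policy
        # at 50% CPU and 50% memory, matching the recommendation above.
        Connect-VIServer -Server vcenter.example.com

        $cluster = Get-Cluster -Name "StretchedCluster"
        $spec = New-Object VMware.Vim.ClusterConfigSpecEx
        $spec.DasConfig = New-Object VMware.Vim.ClusterDasConfigInfo
        $spec.DasConfig.AdmissionControlEnabled = $true

        $policy = New-Object VMware.Vim.ClusterFailoverResourcesAdmissionControlPolicy
        $policy.CpuFailoverResourcesPercent = 50
        $policy.MemoryFailoverResourcesPercent = 50
        $spec.DasConfig.AdmissionControlPolicy = $policy

        # Apply the HA reconfiguration to the cluster object
        ($cluster | Get-View).ReconfigureComputeResource_Task($spec, $true)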

  40. vSphere HA uses heartbeat mechanisms to validate the state of a host.
  41. There are two heartbeat mechanisms: network heartbeating and datastore heartbeating.
  42. Network heartbeating is the primary mechanism; datastore heartbeating is the secondary mechanism, used after network heartbeating has failed.
  43. If a host is not receiving any heartbeats, it uses a fail-safe mechanism to detect whether it is merely isolated from its master node or is completely isolated from the network. It does this by pinging the default gateway.
  44. VMware recommends specifying a minimum of two additional isolation addresses and that each of these addresses be site local (see the sketch after this list).
  45. VMware recommends increasing the number of heartbeat datastores to four.
    The minimum number of heartbeat datastores is two and the maximum is five.
    Four is recommended in a stretched-cluster environment because that would provide full redundancy in both locations.
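
    Another sketch of mine, not from the paper: both of those recommendations are typically applied as HA advanced options on the cluster. The cluster name and isolation addresses below are placeholders; the idea is one site-local gateway per site.

        # Placeholder values; one additional isolation address per site plus
        # four heartbeat datastores per host.
        $cluster = Get-Cluster -Name "StretchedCluster"

        New-AdvancedSetting -Entity $cluster -Type ClusterHA `
            -Name "das.isolationaddress0" -Value "192.168.10.1" -Confirm:$false
        New-AdvancedSetting -Entity $cluster -Type ClusterHA `
            -Name "das.isolationaddress1" -Value "192.168.20.1" -Confirm:$false
        New-AdvancedSetting -Entity $cluster -Type ClusterHA `
            -Name "das.heartbeatDsPerHost" -Value "4" -Confirm:$false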


     Permanent Device Loss Enhancements in VMware vSphere 5.0 Update 1
  46. These enhancements enable an automated failover of virtual machines residing on a datastore that has a PDL condition.
  47. A PDL condition is one that is communicated by the array controller to ESXi via a SCSI sense code. This condition indicates that a device (LUN) has become unavailable and is likely to be permanently unavailable.
  48. An example scenario in which PDL is communicated by the array is when a LUN is set offline. This condition is used in nonuniform models during a failure scenario, to ensure that ESXi takes appropriate action when access to a LUN is revoked.
  49. When a full storage failure occurs, it is impossible to generate the PDL condition because there is no chance of communication between the array and the ESXi host. This state will be identified by the ESXi host as an all paths down (APD) condition.

  50. VMs and PDL
  51. A virtual machine is killed as soon as it initiates disk I/O on a datastore that is in a PDL state AND all of the virtual machine's files reside on that datastore.
  52. The virtual machine's files might not all reside on the same datastore.
  53. If a PDL condition exists on only one of those datastores, the virtual machine will not be killed.
  54. VMware recommends placing all files for a given virtual machine on a single datastore, ensuring that PDL conditions can be mitigated by vSphere HA (see the check after this list).
  55. VMware recommends setting disk.terminateVMonPDLDefault to True. A virtual machine is killed only when issuing I/O to the datastore. Otherwise, it remains active.
  56. A virtual machine that is running memory-intensive workloads without issuing I/O to the datastore might remain active in such situations.
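
    The single-datastore recommendation is easy to audit. This is my own quick PowerCLI sketch (the cluster name is a placeholder), not something from the paper; it lists any VM whose files are spread across more than one datastore.

        # Flags VMs whose configuration and disk files span multiple datastores,
        # i.e. VMs where a PDL on just one of those datastores would not trigger a restart.
        Get-Cluster -Name "StretchedCluster" | Get-VM | ForEach-Object {
            $dsNames = $_.ExtensionData.Config.DatastoreUrl |
                Select-Object -ExpandProperty Name -Unique
            if (@($dsNames).Count -gt 1) {
                [PSCustomObject]@{ VM = $_.Name; Datastores = ($dsNames -join ", ") }
            }
        }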

  57. VMware and DRS
  58. VMware recommends enabling VMware DRS to allow for load balancing across hosts in the cluster. Its load-balancing calculation is based on CPU and memory use.
  59. To prevent storage and network traffic overhead in a stretched-cluster environment, VMware recommends implementing VMware DRS affinity rules to enable a logical separation of virtual machines.
  60. “Storage site affinity” is the preferred location for access to a given LUN.
  61. VMware recommends implementing “should rules,” because these can be violated by vSphere HA in the case of a failure. Availability of services should always prevail over performance.
  62. With “must rules,” vSphere HA does not violate the rule set. This might potentially lead to service outages.
  63. In the scenario where a full datacenter fails, “must rules” make it impossible for vSphere HA to restart the virtual machines.
  64. VMware recommends manually defining “sites” by creating a group of hosts that belong to a site and adding virtual machines to these sites based on the affinity of the datastore on which they are provisioned (see the sketch after this list).
  65. VMware recommends automating the process of defining site affinity by using tools such as VMware® vCenter Orchestrator or VMware vSphere PowerCLI.
  66. If automating the process is not an option, using a generic naming convention is recommended, to simplify the creation of these groups.
  67. VMware DRS is invoked every 5 minutes by default, but it also is triggered if the cluster detects changes, for instance when a host reconnects to the cluster.
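
    Here is a rough sketch of what item 64 could look like in PowerCLI. It is not from the paper, and it assumes a recent PowerCLI release that includes the DRS group cmdlets (New-DrsClusterGroup / New-DrsVMHostRule); all of the cluster, host, and datastore names are placeholders, and the same steps would be repeated for site B.

        # Placeholder names throughout; builds a host group and a VM group for site A,
        # then ties them together with a "should run on" rule so vSphere HA can still
        # restart these VMs in site B during a full-site failure.
        $cluster = Get-Cluster -Name "StretchedCluster"

        $siteAHosts = Get-VMHost -Name "esxi-a-*" -Location $cluster
        $siteAVMs   = Get-VM -Datastore (Get-Datastore -Name "SiteA-*")

        $hostGroup = New-DrsClusterGroup -Name "SiteA-Hosts" -Cluster $cluster -VMHost $siteAHosts
        $vmGroup   = New-DrsClusterGroup -Name "SiteA-VMs" -Cluster $cluster -VM $siteAVMs

        New-DrsVMHostRule -Name "SiteA-VMs-should-run-in-SiteA" -Cluster $cluster `
            -VMGroup $vmGroup -VMHostGroup $hostGroup -Type ShouldRunOn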

  68. VMware vSphere Storage DRS
  69. Storage DRS enables aggregation of datastores into a single unit of consumption from an administrative perspective, and it balances virtual machine disks when defined thresholds are exceeded.
  70. Because stretched storage systems use synchronous replication, a migration or series of migrations has an impact on replication traffic and might cause the virtual machines to become temporarily unavailable due to contention for network resources during the movement of disks.
  71. Migration to random datastores might also potentially lead to additional I/O latency in uniform access configurations.
  72. This happens if virtual machines are not migrated along with their virtual disks, from a site perspective.
  73. For example, if a virtual machine on a host in site A has its disk migrated to a datastore in site B, it will continue operating but with degraded performance. The virtual machine's disk reads will be subject to the increased latency associated with reading from the virtual iSCSI IP at site B, and reads will be subject to intersite latency instead of being satisfied by a local target.
  74. VMware recommends that VMware Storage DRS be configured in manual mode. This enables human validation per recommendation and allows recommendations to be applied during off-peak hours.
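
    A one-liner sketch of that last recommendation (my example, with a placeholder datastore cluster name):

        # Switches Storage DRS to manual mode so migration recommendations can be
        # reviewed and applied during off-peak hours.
        Get-DatastoreCluster -Name "StretchedDatastoreCluster" |
            Set-DatastoreCluster -SdrsAutomationLevel Manual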


    Host Failures: Single-Host Failure
  75. When a host fails, the failure is detected by the cluster's HA master node because network heartbeats from it are no longer being received. After the master node has detected that network heartbeats are absent, it will start monitoring for datastore heartbeats.
  76. A third availability check is performed by pinging the management addresses of the failed hosts.
  77. If all of these checks return as unsuccessful, the master node will declare the missing host dead and will attempt to restart all the protected virtual machines.
  78. VMware recommends manually invoking VMware DRS to ensure that all virtual machines are placed on hosts in the correct location, to prevent possible performance degradation.
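
    In PowerCLI that manual DRS pass might look like this (my sketch, placeholder cluster name):

        # Refreshes the DRS run and applies the resulting recommendations so restarted
        # VMs move back onto hosts in their preferred site.
        Get-DrsRecommendation -Cluster "StretchedCluster" -Refresh | Apply-DrsRecommendation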

    Host Failures: Single-Host Isolation
  79. When a host is isolated, it still generates datastore heartbeats. Detection of these heartbeats enables the HA master node to determine that the host is running but is isolated from the network.
  80. Depending on the isolation response configured, the impacted host might choose to Power Off or Shut Down virtual machines, or alternatively to leave the virtual machines powered on.
  81. The isolation response is triggered 30 seconds after the host has detected that it is isolated.
  82. As a best practice, Leave Powered On is the recommended isolation response setting for the majority of environments. Isolated hosts are a rare event in a properly architected environment.
  83. In environments that use network storage, such as iSCSI and NFS, the recommended isolation response is Power Off. With these environments, it is more likely that a network outage that causes a host to become isolated will also affect the host's ability to communicate to the datastores.
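
    Since the example environment in the paper is iSCSI based, the Power Off response would apply there; a quick sketch of setting it (my own example, placeholder cluster name):

        # Sets the HA isolation response for the whole cluster to Power Off.
        Get-Cluster -Name "StretchedCluster" |
            Set-Cluster -HAIsolationResponse PowerOff -Confirm:$false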


  84. Host Failures: Storage Partition
    Virtual machines remained running, with no impact.
    ….

