BCA Notes

April 4, 2018 | Author: Shady El-Malatawey | Category: Virtual Machine, V Mware, Microsoft Sql Server, Computer Data Storage, Central Processing Unit



Application: Active Directory

Availability

1-) Try to distribute the FSMO roles across several AD DC VMs, and keep these VMs on different hosts using VM-VM anti-affinity rules.
2-) Try to place the DC VMs on separate back-end storage arrays. If that isn't possible, host one DC VM on a local datastore as protection against a failure of the shared back-end storage. Keep in mind that a DC VM on a local datastore can't use features like HA or vMotion.
3-) Try to spread your DC VMs across several clusters on different physical racks or blade chassis using a soft (should) VM-Host anti-affinity rule. At the very least, dedicate a management cluster that is separate from the production cluster.
4-) Set the HA Restart Priority of all DC VMs in your HA cluster(s) to High, so that they are restarted before any other VMs after a host failure.
5-) Try to use VM Monitoring to monitor the activity of the AD DC VMs and restart them in case of an OS failure.

Performance

1-) CPU Sizing:

Site Size                     No. of vCPUs
<500 users per site           1 vCPU
<10,000 users per site        2 vCPUs
>10,000 users per site        3+ vCPUs

This assumes that the primary work of the directory is user authentication. Any additional workload, like Exchange Server, may require additional vCPUs. Capacity monitoring helps to determine the correct number of vCPUs.

2-) Memory Sizing: The memory of an AD DC can boost performance by caching the AD DB in RAM, like any other DB application. The ideal case is to cache the whole AD DB in RAM for max. performance. This is preferred in environments that have AD integrated with other solutions, like Exchange servers. The following guideline is a start:

Site Size                                   Min. RAM Size
<500 users per domain per site              512 MB
500-1,000 users per domain per site         1 GB
>1,000 users per domain per site            2 GB

To size the RAM correctly, start with the minimum required and, after the DC has been deployed for an extended period, use Windows Performance Monitor to watch the “Database\Database Cache % Hit” counter for the lsass instance. Add RAM if required using the vSphere Hot Add feature (keep in mind that you have to enable it before starting up the DC VM). When the RAM is large enough to cache a proper portion of the DB, this ratio should be near 100%. Keep in mind that this covers AD Domain Services only, i.e. additional RAM is required for the Guest OS.

3-) Storage Sizing: The following equations are general guidelines for sizing the storage required by a DC:

Storage required = OS Storage + AD DB Storage + AD DB Logs Storage + SYSVOL Folder Storage + Global Catalog Storage + Any data stored in Application Partition + Any 3rd-Party Storage
AD DB Storage = 0.4 GB per 1,000 users ≈ 0.4 MB * Total No. of Users
AD DB Logs Storage = 25% of AD DB Storage
SYSVOL Folder ≈ 500 MB+ (may increase with a high number of GPOs)
Global Catalog Storage = 50% of AD DB for any additional domain
Any data stored in Application Partition is to be estimated
Any 3rd-Party Storage includes any installed OS patches, paging file, backup agent or anti-virus agent

The following table shows the read/write behavior of each AD DC component:

AD DC Component     Read/Write          RAID Recommended
AD DB               Read intensive      RAID 5
AD DB Logs          Write intensive     RAID 1/10
OS                  Read/Write          RAID 1

Keep in mind that for large environments with many solutions integrated with AD, it's recommended to separate the OS, AD DB and AD DB Logs on different disks and vSCSI adapters for IO separation. In such environments, caching most of the AD DB in RAM will give a performance boost.

4-) Network Sizing: The AD DC VM should have a VMXNET3 vNIC, which gives max. network performance with the least CPU load. This should be sufficient on a 1 Gbps physical network. The port group to which the AD DC VM is connected should have teamed pNICs for redundancy.
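The CPU, memory and storage guidelines above can be sketched as a small calculator. This is only a rough illustration of the tables and equations in these notes, not a sizing tool: the function names, the handling of the exact boundary values (500, 1,000 and 10,000 users) and the reading of "50% of AD DB for any additional domain" as "50% of each additional domain's AD DB" are assumptions made for the example.

```python
def dc_vcpus(users_per_site: int) -> int:
    """Starting vCPU count from the CPU sizing table."""
    if users_per_site < 500:
        return 1
    if users_per_site <= 10_000:  # assumption: exactly 10,000 stays in the 2-vCPU tier
        return 2
    return 3  # the table says "3+", so treat this as a floor


def dc_min_ram_mb(users_per_domain_per_site: int) -> int:
    """Minimum RAM (MB) for AD DS only; the guest OS needs additional RAM."""
    if users_per_domain_per_site < 500:
        return 512
    if users_per_domain_per_site <= 1_000:
        return 1024
    return 2048


def dc_storage_mb(users, os_mb, extra_domain_users=(),
                  app_partition_mb=0.0, third_party_mb=0.0, sysvol_mb=500.0):
    """Break down the storage equation; all values are in MB."""
    ad_db = 0.4 * users                      # 0.4 MB per user
    parts = {
        "os": float(os_mb),
        "ad_db": ad_db,
        "ad_db_logs": 0.25 * ad_db,          # 25% of AD DB
        "sysvol": float(sysvol_mb),          # about 500 MB, more with many GPOs
        # assumption: 50% of the AD DB size of each additional domain
        "global_catalog": sum(0.5 * 0.4 * u for u in extra_domain_users),
        "app_partition": float(app_partition_mb),
        "third_party": float(third_party_mb),
    }
    parts["total"] = sum(parts.values())
    return parts


if __name__ == "__main__":
    # Hypothetical site: 5,000 users, 40 GB OS disk, one extra domain of 2,000 users
    print(dc_vcpus(5000), "vCPUs,", dc_min_ram_mb(5000), "MB RAM minimum")
    for name, size in dc_storage_mb(5000, os_mb=40960, extra_domain_users=[2000]).items():
        print(f"{name}: {size:.0f} MB")
```

The results are starting points only; as the notes say, capacity monitoring and the database cache hit counter should drive the final vCPU and RAM values.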
Manageability

1-) Time Sync: Time synchronization is one of the most important things in AD DS environments. As stated in VMware Best Practices to Virtualize AD DS on Windows 2012, it's recommended to follow a hierarchical time sync as follows:
- Sync the PDC in the forest root domain to external Stratum 1 NTP servers.
- Sync the PDCs of all child domains in the forest to the root PDC or any other DC in the root domain.
- Sync all ESXi hosts in the VI to the same Stratum 1 NTP server.
- Sync all workstations in every domain to the nearest DC in their own domain.
To configure the PDC to time-sync with an external NTP server using GPO: http://blogs.technet.com/b/askds/archive/2008/11/13/configuring-an-authoritative-time-server-with-group-policy-using-wmi-filtering.aspx
Also, it's recommended to completely disable time sync between VMs and hosts through VMware Tools (even after unchecking the box in the VM settings page, a VM can still sync with the host through VMware Tools on startup, resume, snapshotting, etc.) according to the following KB: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1189
This makes the PDC VM sync its time only with the time source configured through GPO.
2-) Use Best Practices Analyzer (BPA): It's recommended to run BPA against your AD DCs to make sure that your configuration is consistent with the Microsoft recommended configuration. In some cases, and for valid reasons, you can deviate from Microsoft recommendations. AD DS: http://technet.microsoft.com/en-us/library/dd391875(v=ws.10).aspx
3-) Use AD DS Replication Tool: This tool, offered by Microsoft for free, can detect replication issues between all DCs in your environment and show them together with the related KB articles that solve them. It's the next generation of the REPADMIN CLI tool. Download it from: http://www.microsoft.com/en-us/download/details.aspx?id=30005
4-) Snapshots: With AD DC on Windows 2012, you can use snapshots without worrying about reverting to an old snapshot and the related USN Rollback issue. AD DC on Windows 2012 leverages the new VM-Generation ID feature that makes the AD DC virtualization-aware, and hence any hot/cold snapshot can be reverted to safely. Check VMware Best Practices to Virtualize AD DS on Windows 2012 for more information about VM-Generation ID and the related virtualization safeguards.

Recoverability

1-) Try to use VSS-aware backup software to safely back up your AD DB. AD DC on Windows 2012 can be backed up with backup software that uses VSS to back up and restore the entire DC VM, because the VM-Generation ID feature makes the AD DC virtualization-aware, and hence any restore of an entire DC VM can be done safely. Check VMware Best Practices to Virtualize AD DS on Windows 2012 for more information about VM-Generation ID and the related virtualization safeguards.
2-) Make sure to back up the System State of every DC. System State contains the AD DB, AD DB Logs, SYSVOL folder and other critical OS components like registry files.
3-) For DR, you can use native AD DC replication to replicate the AD DB between the main site and the DR site. This approach requires minimal management overhead and gives good DR capability. It only lacks the ability to protect the five FSMO role holders.
4-) Another approach for DR is to leverage VMware SRM with the VM-Generation ID capability on Windows 2012. This approach continuously replicates AD DC VMs using SRM replication or array-based replication and fails them over in case of disaster. It protects the FSMO role holders and provides AD infrastructure to the failed-over VMs in the DR site.

Scalability

1-) For greater scalability, try to upgrade your AD DCs to Windows Server 2012. AD DC on Windows 2012 leverages the new VM-Generation ID feature that makes the AD DC virtualization-aware, and hence it can be cloned easily and safely. Check VMware Best Practices to Virtualize AD DS on Windows 2012 for more information and a step-by-step guide to the cloning process. Cloning helps when the AD DC infrastructure must be expanded urgently, in a DR process or in testing. It also avoids the heavy network utilization caused by replicating the entire DB to DCs that are newly promoted from scratch. Keep in mind that Cold Cloning is the only method supported by both VMware and Microsoft. Hot Cloning isn't supported in production by either VMware or Microsoft.

Security

1-) All security procedures used to secure physical DCs should also be applied to DC VMs, like role-based access policy and hard drive encryption.
2-) Follow the VMware Hardening Guide (v5.1/v5.5) for more security procedures to secure both your VMs and vCenter Server.

Application: MS Clustering Solutions

Availability

1-) Use vSphere HA with MSCS to provide an additional level of availability to your protected application.
2-) Use vSphere DRS at the Partially Automated level with MSCS, so that clustered VMs are placed automatically only when they are powered on. Clustered VMs use SCSI bus sharing, which means these VMs must not be migrated with vMotion, and hence automatic DRS load balancing can't be used. If the vSphere cluster hosting the clustered VMs is configured with Fully Automated DRS, change the VM-specific DRS configuration to Partially Automated.
3-) Affinity Rules: With a Cluster-in-a-box configuration, use a VM-VM affinity rule to keep all clustered VMs together on the same host. With Cluster-across-boxes or a Physical-Virtual cluster, use a VM-VM anti-affinity rule to separate the VMs across different hosts. HA doesn't respect VM affinity/anti-affinity rules, and when a host fails, HA may violate these rules.
In vSphere 5.1, configure the vSphere Cluster with “ForeAffinePowerOn” option set to 1 to respect all VMs Affinity/Anti-affinity rules. In vSphere 5.5, configure the vSphere Cluster with both “ForeAffinePowerOn” & “das.respectVmVmAntiAffinityRules” set to 1 to respect all VMs Affinity/Antiaffinity rules respectively. Performa nce 4-) Try to use VM Monitor to monitor activity of the clustered VMs and restart them in case of OS failure. 1-) Memory Sizing: Don’t use Memory Over-commitment on ESXi hosts hosting clustered VMs. Memory Over-commitment may cause small pauses to these VMs which are sensitive to any time delay. 2-) SCSI Driver: 2003 SP2 2008 2008 2003 SP2 2008 2008 SP1 or Virtual-mode RDM Disk: on Fiber SAN. you have to use different SCSI drivers for both of Guest OS Disk and shared Quorum Disk.SCSI Driver Supported LSI-Logic Parallel LSI-Logic SAS LSI-Logic SAS OS (Windows) 2003 SP1 or SP2 32/64 bit 2008 SP2 or 2008 R2 SP1 32/64 bit 2012 (vSphere 5.e. both of SCSI (0:x) and (1:x).5 U1 or later) Keep in mind that. performance. Virtual Physical-mode RDM Disk on Fiber SAN.vmdk): Local/on Virtual SP2 or R2 SP1 SP1 or SP2 or R2 SP1 SP1 or SP2 or R2 SP1 SP2 or R2 SP1 or or 2012 .5 Only Cluster-in-abox (Recommende SCSI BUS Sharin g Virtual 2003 SP1 or SP2 2008 SP2 or 2008 R2 SP1 Eager-Zeroed ThickProvisioned Virtual Disk (. it’s recommended to use Thick-provisioned Disks instead of Thin ones for max.vmdk): Local/on Fiber SAN. Physica l Physica l Eager-Zeroed ThickProvisioned Virtual Disk (. 3-) Storage Supported for OS Disks : For OS disks in clustered VMs. 4-) Storage Supported for Shared Quorum Disk: vSphere Cluster OS (Windows) Disk Type Version Configuratio n Type vSphere 5.x) or 2012 R2 (vSphere 5.5. Physica l 2003 SP2 2003 SP2 2008 2008 2008 2008 2012 SP1 or Virtual-mode RDM Disk: on Fiber SAN. Physical-mode RDM Disk on Fiber SAN. 
i.x Cluster-in-abox (Recommende d Configuration) Cluster-in-abox Cluster-acrossboxes (Recommende d Configuration) Cluster-acrossboxes PhysicalVirtual vSphere 5. certain Multi-pathing policy must be set to configure how ESXi Hosts connect to that FC . 2012 or 2012 R2 (2012 R2 requires vSphere 5. i. vSphere 5.d Configuration) R2 (2012 R2 iSCSI/FCoE SAN.e.x.1 that requires FC SAN. 5-) Multi-pathing Policy: For clustered VMs configuration on vSphere 5. Mixing between different types of initiators for a storage protocol is supported only on vSphere 5.5 U1) In-Guest iSCSI target sharing for Quorum Disk is supported for any type of clustering configuration and any OS. Mixing between different types of storage protocols connecting Quorum Disk isn’t supported. first node connected to Quorum Disk using iSCSI and the second is connected using FC. Same goes for FCoE.5 U1) Cluster-in-a2008 SP2 or Virtual-mode RDM Disk: Virtual box 2008 R2 SP1 or on iSCSI/FCoE SAN. R2 (2012 R2 requires vSphere 5. Keep in mind that mixing between Cluster-across-box/Cluster-in-box configuration isn’t supported as well as mixing between different verison of vSphere in a single cluster. d R2 (2012 R2 Configuration) requires vSphere 5.2008 SP2 or Physical-mode RDM Physica boxes 2008 R2 SP1 or Disk on iSCSI/FCoE l (Recommende 2012 or 2012 SAN.x supports in-guest FCoE target sharing for Quorum Disk. Host 1 can connect using Software iSCSI and Host 2 can connect using HW iSCSI Initiator. requires vSphere 5. i.5.e.5.5 U1) Cluster-across.5 U1) Physical2008 SP2 or Physical-mode RDM Physica Virtual 2008 R2 SP1 or Disk on iSCSI/FCoE l 2012 or 2012 SAN. this issue was resolved according Path Selection Policy Round Robin Fixed MRU Fixed Fixed to both of KB1 & KB2.2 physical NICs for redundancy and NIC teaming capabilities. . redundancy.do? language=en_US&cmd=displayKC&externalId=1016106 Manageab ility 8-) Network: .You should choose the latest vNICs available to the Guest OS. 
Check the following KB for more information: http://kb. 1-) Time Sync: Time Synchronization is one of the most important things in SQL environments. 6-) Guest Disk IO Timeout: From Guest OS. Management. one for public network and the other one for heartbeat network.com/selfservice/microsites/search. configure heartbeat network with two physical NICs for redundancy.vmware. etc.Let all your SQL VMs sync their time with DC’s only. not with VMware Tools. 7-) Set shared LUN on to which Quorum Disk is placed (RDM). . Connect each physical NIC to a different physical switch for max. like: vMotion. Perennially Reserved on each host participating to prevent long time duration of starting of any ESXi host participating or hosting a clustered VM. Network separation is either physical or virtual using VLANs.Consider network separation between different types of networks.SAN: Multi-pathing Plugin NMP using SATP: ALUA_CX SAN Type Generic EMC Clariion EMC VNX using SATP: ALUA IBM 2810XIV using SATP: Default_AA IBM 2810XIV Hitachi NETAPP Data ONTAP 7-Mode using SATP: SYMM EMC Symmetrix In vSphere 5. it’s recommended to change Disk IO Timeout to more than 60 seconds from the following registry key: “HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Disk\TimeOutValue”. Fault Tolerance. The most preferred is VMXNET3 for both Private and Public networks.Clustered VMs should have two vNICs.Disable time-sync between SQL VMs and Hosts using VMware Tools totally (Even .Try to set your port groups with -at least. . . It’s recommended to do the following: .5 or above. Production. For Cluster-across-boxes. snapshotting. VM can sync with the Host using VMware Tools in case of startup.x and Windows Server 2012. at least twice per year.5 U1 or later) 5 Nodes FC SAN hosting Quorum Disk with vSphere 5.5) for more security procedures to secure both of your VMs and vCenter Server. Quorum Disk can be hosted on iSCSI or FCoE SAN. 2-) Try to maintain a proper DR/BC plan. With vSphere 5. 
Issue of using Round Robin PSP is solved (under certain conditions mentioned in this KB).1 U2 and Windows 2012 1-) Try to maintain a proper backup/restore plan.5.1/v5. 5 Nodes 2008 SP2 or 2008 R2 SP1 32/64 bit 5 Nodes Windows 2012 (vSphere 5.1 hypervisors. 1-) All security procedures done for securing physical Microsoft Clusters should be done in Clustered VMs.after uncheck the box from VM settings page. resume.) according to the following KB: http://kb. Keep in mind also to continuously test restoring your backup sets to test their effectiveness.x) or Windows 2012 R2 (vSphere 5.5. 2-) Follow VMware Hardening Guide (v5.Sync all ESXi Hosts in the VI to the same Startum 1 NTP Server which is the same time source of your forest/domain. etc. Clustering Configurations would not help a lot in case of total data center failure situation.do? language=en_US&cmd=displayKC&externalId=1189 . of Nodes OS (Windows) 2 Nodes 2003 SP1 or SP2 32/64 bit and per vSphere 5. Recovera bility Scalabilit y Security 2-) Supported OS’s and Number of Nodes: No.vmware. 1-) For greater scalability. 2 Nodes FCoE SAN hosting Quorum Disk with vSphere 5. Availabili 1-) Try to use vSphere HA in addition to Exchange DAG to provide the highest level of availability. Try to test your DR/BC plan from time to time. try to upgrade your clustered VMs to Windows Server 2012. This helps in case of total corrupt of a cluster node which requires a full restore on bare metal/VM. like: Role-based Access Policy.1 U2 and Windows 2008/2012. Adapt a .com/selfservice/microsites/search. pdf. 4-) vMotion: . 3-) For MBX VMs. . 1000. refer to: http://www. Calculate the required resources per VM and then decide the number of VMs required to serve your requirement. 2-) Try to separate your DAG MBX VMs on different Racks. If Application HA Agent can’t recover the application from that failure. it may restart services or mount databases. or to be manually reactivated as the primary active database. In vSphere 5. 
it’ll stop sending heartbeats and the host will initiate a VM restart as a HA action. 6-) Try to leverage Symantec Application HA Agent for Exchange with vSphere HA & Exchange DAG for max.respectVmVmAntiAffinityRules” set to 1 to respect all VMs Affinity/Antiaffinity rules respectively. any vMotion operation may cause a false failover during the switch between the source and destination hosts – although this drop is really negligible and characterized by single ping-packet drop. . sending heartbeats to HA driver on ESXi host. availability. in the background. Keep in mind that additional compute capacity on MBX VMs is .Excha nge ty protection policy of N+1. but on the following DRS invocation. In vSphere 5. the host will monitor IO and network activity of the VM for certain period.5. Using Application HA. If it’s stopped because Guest OS failure. 2000. configure the vSphere Cluster with “ForeAffinePowerOn” option set to 1 to respect all VMs Affinity/Anti-affinity rules. vSphere HA powers-on the failed virtual machine on another host.Leverage Multi-NIC vMotion feature to make vMotion process much faster even for large MBX VM.vmware. use VMs anti-affinity rules to separate them over different hosts.1. Keep in mind that. Performa nce 1-) Try to leverage “Building Blocks” approach. For step-bystep guide of how to change this value.Try to enable Jumbo Frames on vMotion Network to reduce overhead and increase throughput. This add additional layer of availability for Exchange VMs. 5-) Try to leverage VM Monitoring to mitigate the risk of Guest OS failure. the monitoring agent will monitor Exchange service. restoring the failed DAG member and bringing the newly passive database up to date and ready to take over in case of a failure. In some cases when leveraging Multi-NIC vMotion and enabling Jumbo Frames on vMotion Network. or 4000 Users Mailboxs per VM. In case of application failure. configure the vSphere Cluster with both “ForeAffinePowerOn” & “das. 
deploying your MBX VMs as a Standalone Configuration is somehow different in calculations than DAG Configuration. That’s why we need to set DAG Cluster Heartbeat setting “SameSubnetDelay” to 2 seconds (2000 milliseconds). When HA restart a VM. use DRS Clusters in Fully Automated Mode. 2-) It’s recommended to distribute all users mailboxes evenly on all of your DAG members to load balance the users load between all MBX VMs.DAG members are very sensitive to any latency or drop in its heartbeat network and hence. it’ll not respect the anti-affintiy rule. VMware Tools inside Exchange VMs will send heartbeats to HA driver on the host. Divide your required number of Users Mailboxes on equally-sized MBX VMs of either 500. For CAS VMs. Blade Chassis and Storage Arrays if available using VMs Affinity/Anti-affinity rules and Storage Affinity/Anti-affinity rules for most availability. as N is the number of DAG members VMs in vSphere Cluster of N+1 hosts. use Host-VM should affinity rule to load-balance them on different group of hosts even if two CAS VMs stayed on the same host.com/files/pdf/Exchange_2013_on_VMware_Best_Practices_Guide. In case of a ESXi failure. 3-) As Microsoft supports vMotion of Exchange VMs. the host will restart the VM. changing heartbeat setting isn’t necessary. If there’s also no activity. the VM will be migrated to respect the rule. follow this link (http://technet.aspx & http://technet. CPU Utilization should be less than 80% even in case of failover of a failed MBX DB. over-commit is allowed after establishing a performance baseline. Memory and IOps Sizing Guide for Exchange 2010: (http://technet. Ratio of Virtual: Physical Cores should be 2:1 max (better to keep it nearly 1:1) to be under MS support umbrella.aspx & http://technet. 2-) Use Exchange Server Calculator (2010: http://blogs.Don’t disable Ballon-driver installed with VMware Tools.technet.aspx). 
It won’t double the processing power –in opposite to what shown on ESXi host as double number of logical cores. MBX role shouldn’t use more than 40% of CPU Utilization in case of Multi-role deployment.Exchange VMs should have CPU Utilization less than 70-75% if used in a Standalone Configuration.com/en-us/library/ee832793. like: HA Slot Size. In addition.aspx) & (2013: https://gallery. vMotion chances and time. 5-) A complete guiding size for Exchange 2013: http://blogs.microsoft.Exchange is a SMP application that can use all VM vCPUs.aspx 4-) CPU Sizing: .com/en-us/library/dd346701.aspx) for CAS/ HUB Sizing according to sizing MBX Server. its space is usable for adding more VMs.com/Exchange-2013-ServerRole-f8a61780) to calculate the required resources for your MBX VMs according to your chosen building block sizes. For Exchange 2013.Don’t over-commit CPUs. If needed. If DAG is to be implemented.com/b/exchange/archive/2010/01/22/updates-to-the-exchange-2010-mailbox-serverrole-requirements-calculator.Exchange 2010/2013 aren’t vNUMA-aware. In addition.aspx) for CAS Sizing. Keep in mind that memory reservation affects as aspects. so there’s no need to configure large Exchange VMs like underlying physical NUMA topology. On the other side.com/b/exchange/archive/2013/05/06/ask-the-perf-guy-sizing-exchange-2013deployments.microsoft. reservation of memory removes VM swapfiles from datastores and hence.but it’ll give a CPU processing boost up to 20-25% in some cases. as Exchange is a memory-intensive application.com/enus/library/ee712771.technet.com/en-us/library/dd346700. Assign vCPUs as required and don’t overallocate to the VM to prevent CPU Scheduling issues at hypervisor level and high RDY time.com/b/exchange/archive/2013/05/06/askthe-perf-guy-sizing-exchange-2013-deployments. specially for MBX VMs. .Use the following equations: 5-) Memory Sizing: .needed for passive DBs for failover. reserve the configured memory to provide the required performance level. 
3-) For Exchange 2010. .microsoft. follow this link (http://blogs.Enable Hyperthreading when available. . . 4-) CPU.technet. Don’t consider it when calculating Virtual: Physical Cores ratio. Try to size your Exchange VMs to fit inside single NUMA node to gain the performance boost of NUMA node locality. ESXi Hypervisor is NUMA aware and it leverages the NUMA topology to gain a significant performance boost. In some cases like small environments. distribute these DAG members VMs evenly on all hosts using DRS VMs Anti-affinity rule for more load-balancing and higher availability.technet. .microsoft. It’s ESXi Host’s last line of defense against .Don’t over-commit memory.microsoft. . Exchange VMs should be always monitored for its memory performance and adjust the configured and reserved –if any. iSCSI SANs or using In-guest iSCSI targets. as Exchange is an IOintensive application to avoid IO contention. 7-) Network: . through two HBAs.1 U1 and later.Partition Alignment gives a performance boost to your backend storage. Jumbo Frames reduces network overhead and increases Throughput.Separate Exchange VMs’ disks on different –dedicated if needed.Distribute any Exchange VM disks on the four allowed SCSI drivers for max.Use Paravirtual SCSI Driver in all of your Exchange VMs. enable Jumbo Frames on its network end-to-end.Adding memory on the fly to Exchange VMs –specially Exchange 2013.Provide at least 4 paths. No performance difference between these two types of disks. according to this link: http://kb. Try to use Eager-zeroed Thick disks for DB and Logs disks. between each ESXi host and the Storage Array for max. VMFS5 created using vSphere (Web) Client will be aligned automatically as well as any disks formatted using newer versions of Windows. . like: Exchange P2V migration or to leverage 3 rd Party array-based backup tool.Always consider any storage space overhead while calculating VMs space size required. . 
performance although it’ll add high management and maintenance overhead.will not add any performance gain till the VM is rebooted. Any upgraded VMFS datastores or upgraded versions of Windows Guests will require a partitions alignment process.For heavy workloads. . . That’s why enabling hot add won’t be necessary. Choosing RDM disks or VMFS-based disks are based on your technical requirements.Microsoft supports only virtualized Exchange VMs on either FC. performance and least latency and CPU overhead. for max. performance and throughput and least CPU overhead. availability. . . specially disks used for DB and Logs. .For IP-based Storage. .com/selfservice/microsites/search.memory contention before Compression and Swapping to Disk processes. . 6-) Storage Sizing: . performance paralleling and higher IOps.vmware. FCoE. . it’s done by migrate VMs disks to another datastore using Storage vMotion. .Use VMXNet3 vNIC in all Exchange VMs for max. dedicate a LUN/Datastore per a MBX VM for max. .x Host may need to increase its VMFS Heap Size to allow for hosting Exchange VMs with large disks of several TBs. as spindles will not make two reads or writes to process single request.do? language=en_US&cmd=displayKC&externalId=1004424 This issue is mitigated in vSphere 5.memory values to meet its requirements. . NAS Arrays aren’t supported either as a NFS datastore or accessing it through UNC path from inside the Guest OS.ESXi 5. then format and recreate the datastore on VMFS5.RDM can be used in many cases. For upgraded VMFS.datastores. It’s recommended to add 20-30% of space as an overhead. Overhead can be: swapfiles. VMs logs or snapshots. configuring DAG members with one vNIC is supported. 5-) Monitoring: . like: vMotion.000 ms window) Usage Active Swap-in Rate Swap-out Rate Commands Device Latency Kernel Latency packets-Rx packets-Tx Description Processor usage across all vCPUs. Exchange production.Always. 
one for public network and the other one for heartbeat and replication network. Network separation is either physical or virtual using VLANs. . as well as your hosting ESXi hosts using ESXTOP tool and Windows Performance Monitor respectively: Subsystem CPU Memory Storage Network Subsystem VM Processor VM Memory ESXTOP Counters %RDY %USED %ACTV SWW/s SWR/s ACTV DAVG/cmd KAVG/cmd MbRX/s MbTX/s Win PerfMon Counters % Processor Time Memory Ballooned Memory Swapped vCenter Counter Ready (milliseconds in a 20. 2-) Use vCenter Operation Manager to monitor your environment performance trends. Create your Golden Template for every tier of your VMs. . This reduces the time required for deploying or scaling your Exchange environment as well as preserve consistency of configuration throughout your environment. Management. try to monitor your environment using In-guest tools and ESXi and vCenter performance charts.Consider network separation between different types of networks. redundancy. Amount of memory in MB reclaimed by balloon driver. etc. but it’s not considered as a best practice. Memory Used Managea bility 1-) Try to leverage vSphere Templates in your environment. Connect each physical NIC to a different physical switch for max.Exchange VMs port group should have at least 2 physical NICs for redundancy and NIC teaming capabilities. establish a . Amount of memory in MB forcibly swapped to ESXi host swap.. Physical memory in use by the virtual machine.DAG members VMs should have two vNICs. The following are some counters that may help in monitoring your Exchange VMs performance. Fault Tolerance. Keep in mind that. com/selfservice/microsites/search. 5-) Another approaches for DR: .in your environment. Check the following link: http://www.microsoft.) according to the following KB: http://kb. Scale-out Approach requires smaller ESXi Hosts and gives a more flexibility in designing a DAG VM.do? language=en_US&cmd=displayKC&externalId=1189 .150).vmware. 
estimate the capacity required for further scaling and proactively protect your environment against sudden peaks of VMs performance that need immediate scaling-up of resources. That’s why Scale-up Approach needs a careful attention to availability of DAG VMs.com/en-us/library/jj619301(v=exchg. one of them is vSphere Advanced Data Protection.You can use Stretched DAG configuration with Automatic Failover. resume. There’s no best approach here. With SRM.vmware. 4-) If SRM isn’t available for any reason.Disable time-sync between Exchange VMs and Hosts using VMware Tools totally (Even after uncheck the box from VM settings page. not with VMware Tools. These are Exchange-aware and don’t cause any corruption in mailbox DB due to quiesceing the DB during the backup operation.pdf 2-) If available.dynamic baseline of your VMs performance to prevent false static alerts. a single failed VM will affect a large portion of users. . It all depends on your environment and your requirements. try to leverage any 3 rd party replication software to replicate your Exchange VMs to a DR site for recovery in case of any disaster. snapshotting.vmware. etc. DRS will be more effective. http://www. Ofcourse.You can use Stretched DAG Configuration with Lagged Copy. It’s recommended to do the following: . . . It reduces the number of VMs required to serve certain number of mailbox users and hence.com/files/pdf/Exchange_2013_on_VMware_Availability_and_Recovery_Options.com/files/pdf/products/vsphere/VMware-vSphere-Data-Protection-Product-FAQ. VM can sync with the Host using VMware Tools in case of startup.aspx 10-) Time Synchronization is one of the most important things in Exchange environments. you can use any backup software that depends on array-based snapshots if it’s Exchangeaware. 
3-) Use VMware Site Recovery Manager (SRM) if available for DR.Sync all ESXi Hosts in the VI to the same Startum 1 NTP Server which is the same time source of your forest/domain.Let all your Exchange VMs sync their time with DC’s only. A single VM failure has a less effect using Scale-out Approach and it requires less time for migration using vMotion and hence. 3-) Microsoft Support Statement for Exchange in Virtual environments: http://technet.pdf 1-) Scale-up Approach of DAG members requires a large ESXi Hosts with many sockets and RAM. automated failover to a replicated copy of the VMs in your DR site can be carried over in case of a disaster or even a failure of single MBX VM –for example. Recovera bility Scalabilit y 1-) Try to leverage any backup software that uses Microsoft Volume Shadow Service (VSS). but requires high number of ESXi hosts to provide the required level of availability. of vCPU= 1. of Virtual Cores of CAS No.375*U*NVCM MC= 2+2*NVCC ≥ 8GB For Multi-role Deployment: No.Variables Adjusted Mega Cycles per Core Baseline Mega Cycles per Core Mega Cycle for Certain Mailbox Usage Total Required Mega Cycles Required Mega Cycles for Active DB Copy Required Mega Cycles for Active DB Copy (Worst Case Scenario-Single Node Failure) Required Mega Cycles for Passive DB Copy Required Mega Cycles for Passive DB Copy (Worst Case Scenario-Single Node Failure) Required Mega Cycles for Effect of Passive DB on Active DB Copy Total Number of Users Mailboxes Total Number of Users Mailbox per MBX Server (Building Block Size) Total Number of Active Mailboxes (Worst Case Scenario-Single Node Failure) Total Number of Passive Mailboxes (Worst Case Scenario-Single Node Failure) Utilization No.375* NVCM Memory Required≈ MC+MT Representation AMC BMC MMC TMC RMCADB RMCDADB RMCPDB RMCDPDB RMCPADB NT NS NA NP U NVCC NVCM MT MMBX MC . 
SQL Server 2012

Availability

1-) Try to use vSphere HA in addition to SQL AAG to provide the highest level of availability. Adopt a protection policy of N+1, where N is the number of AAG member VMs in a vSphere Cluster of N+1 hosts. In case of an ESXi failure, vSphere HA powers on the failed virtual machine on another host in the background, restoring the failed AAG member and bringing the newly passive database up to date and ready to take over in case of a failure, or to be manually reactivated as the primary active database.
2-) Try to separate your SQL AAG VMs on different Racks, Blade Chassis and Storage Arrays if available, using VMs Affinity/Anti-affinity rules and Storage Affinity/Anti-affinity rules for most availability.
3-) For SQL AAG VMs, use VMs anti-affinity rules to separate them over different hosts. When HA restarts a VM, it’ll not respect the anti-affinity rule, but on the following DRS invocation, the VM will be migrated to respect the rule. In vSphere 5.1, configure the vSphere Cluster with the “ForceAffinePoweron” option set to 1 to respect all VMs Affinity/Anti-affinity rules. In vSphere 5.5, configure the vSphere Cluster with both “ForceAffinePoweron” & “das.respectVmVmAntiAffinityRules” set to 1 to respect all VMs Affinity/Anti-affinity rules respectively. For licensing limits, use a Host-VM must affinity rule to force VMs to run on licensed hosts only. Keep in mind that must rules are respected even in case of HA invocation, i.e. HA will not restart a failed VM if it can’t respect a must rule. It’s recommended to separate your licensed hosts on different Racks, Blade Chassis and Power Supplies for more availability.
4-) If SQL AAG isn’t available, there are many other High-Availability technologies to be used in SQL Clusters:

Technology | Automatic Failover | No. of Secondaries | Readable Secondaries | RPO | RTO | vSphere Compatibility
Always-on Availability Group (AAG), Synchronous Mode | Yes | Max. 2 | Yes | 0 | Seconds | Totally compatible and supported with vSphere HA, DRS and vMotion
Always-on Availability Group (AAG), Asynchronous Mode | No | Max. 4 | Yes | Seconds | Minutes | Totally compatible and supported with vSphere HA, DRS and vMotion
Always-on Failover Cluster Instances (vSphere supported Clusters) | Yes | Max. 4 | No | 0 | Seconds | Requires certain configuration to be hosted on vSphere Clusters. Totally supported with HA but isn’t supported for vMotion and Automatic DRS.
Data Mirroring, High Safety Mode with automatic Failover | Yes | Max. 1 | No | 0 | Seconds | Totally compatible and supported with vSphere HA, DRS and vMotion
Data Mirroring, High Safety Mode without automatic Failover | No | Max. 1 | No | 0 | Minutes | Totally compatible and supported with vSphere HA, DRS and vMotion
Data Mirroring, High Performance Mode | No | Max. 1 | No | Seconds | Minutes | Totally compatible and supported with vSphere HA, DRS and vMotion
Log Shipping | No | N/A | No | Minutes | Minutes-Hrs | Totally compatible and supported with vSphere HA, DRS and vMotion
Backup/Restore | No | N/A | No | Minutes-Hrs | Hrs-Days | Totally compatible and supported with vSphere HA, DRS and vMotion

Keep in mind that all of AAG, Log Shipping or Data Mirroring require DB Full Recovery Mode, which may lead to higher storage growth.
5-) vMotion:
- Leverage the Multi-NIC vMotion feature to make the vMotion process much faster even for a large SQL VM.
- Try to enable Jumbo Frames on the vMotion Network to reduce overhead and increase throughput.
- SQL AAG members are very sensitive to any latency or drop in their heartbeat network and hence, any vMotion operation may cause a false failover during the switch between the source and destination hosts, although this drop is really negligible and characterized by a single ping-packet drop. That’s why we need to set the SQL AAG Cluster Heartbeat setting “SameSubnetDelay” to 2 seconds (2000 milliseconds). In some cases, when leveraging Multi-NIC vMotion and enabling Jumbo Frames on the vMotion Network, changing the heartbeat setting isn’t necessary.
5-) As Microsoft supports vMotion of SQL VMs, use DRS Clusters in Fully Automated Mode. It’ll always load balance your SQL VMs across the cluster, respecting all of your configured affinity/anti-affinity rules.
6-) Try to leverage VM Monitoring to mitigate the risk of Guest OS failure. VMware Tools inside SQL VMs will send heartbeats to the HA driver on the host. If they stop because of a Guest OS failure, the host will monitor IO and network activity of the VM for a certain period. If there’s also no activity, the host will restart the VM. This adds an additional layer of availability for SQL VMs.
7-) Try to leverage Symantec Application HA Agent for SQL with vSphere HA for max. availability. Using Application HA, the monitoring agent will monitor the SQL instance and its services, sending heartbeats to the HA driver on the ESXi host. In case of application failure, it may restart services or mount databases. If the Application HA Agent can’t recover the application from that failure, it’ll stop sending heartbeats and the host will initiate a VM restart as an HA action.

Performance

1-) Try to leverage Resource Governor limits and pools in conjunction with vSphere Resource Pools for more control over the compute resources presented to SQL Server. SQL Server Resource Governor can now create up to 64 resource pools and can set affinity rules on which CPUs or NUMA nodes to process, similar to vSphere resource-controlling capabilities. Generally, it’s better to adjust resource settings from SQL Server Resource Governor itself, rather than vSphere Resource Pools.
2-) Disable virus scanning on SQL DB and logs as it may lead to a performance impact on your SQL DB.
2-) CPU Sizing:
- As most SQL VMs underutilize their vCPUs (95% of SQL servers are under 30% utilization as stated by VMware Capacity Planner), it’s preferred, if available, to choose physical CPUs with higher clock and lower number of cores. This reduces the license costs as the number of physical cores is reduced and doesn’t affect the cores’ performance due to higher clock speed and more MHz to share between the low-utilization SQL VMs. You can also use SPECint2006 Results to compare between CPUs.
- Don’t over-commit CPUs. It’s better to keep it nearly 1:1. In some cases like small environments, where there are a lot of underutilized SQL Servers, over-commitment is allowed to get higher consolidation ratios. Performance monitoring is mandatory in this case to maintain a baseline of normal-state utilization.
- Enable Hyperthreading when available. It won’t double the processing power (contrary to what is shown on the ESXi host as double the number of logical cores), but it’ll give a CPU processing boost of up to 20-25% in some cases, and these added logical cores won’t be taken into licensing consideration (Max. Virtualization per Core Licensing). Don’t consider it when calculating the Virtual:Physical Cores ratio. In some cases of Tier-1 SQL Servers, setting the VM Setting “Hyperthreading Sharing” to None (disabling Hyperthreading for this VM) may lead to better performance than enabling Hyperthreading. Test both options and choose what fits your performance requirements.
- ESXi Hypervisor is NUMA-aware and it leverages the NUMA topology to gain a significant performance boost. Try to size your SQL VMs to fit inside a single NUMA node to gain the performance boost of NUMA node locality.
- In case of large SQL VMs that won’t fit inside a single NUMA node, keep in mind that SQL 2012/2014 are vNUMA-aware, so it’s recommended to show the underlying physical NUMA topology to SQL VMs. vNUMA applies to large VMs with more than 8 vCPUs, or it can be enabled for smaller VMs using the advanced VM Setting “numa.vcpu.min”.
3-) Memory Sizing:
- Don’t over-commit memory, as SQL 2012/2014 is a memory-intensive application. If needed, reserve the configured memory to provide the required performance level. Keep in mind that memory reservation affects other aspects, like: HA Slot Size, vMotion chances and time. In addition, reservation of memory removes VM swapfiles from datastores and hence, their space is usable for adding more VMs. For some cases, overcommitment is allowed after establishing a performance baseline of normal-state utilization. Keep in mind that you have to do your homework of capacity planning and performance monitoring to maintain your performance baseline before deciding.
- SQL Server 2012/2014 tends to use all the configured memory of the VM. This may lead to some performance issues, because any other application installed (like a backup agent or AV agent) wouldn’t find adequate memory to operate. It’s recommended to use the “Max Server Memory” parameter from inside SQL itself and set it to Configured Memory minus 4GB to allow some memory for the Guest OS and any 3rd-party applications.
- Don’t disable the Balloon Driver installed with VMware Tools. Ballooning is the last line of defense of the ESXi Host before compression and swapping to disk when its memory becomes too low. When Ballooning is needed, the Balloon Driver will force the Guest OS to swap the idle memory pages to disk to return the free memory pages to the Host, i.e. swapping is done according to Guest OS techniques. Swapping to disk is done by the ESXi Host itself. The Host would swap memory pages from physical memory to the VM swap file on disk without knowing what these pages contain or whether these pages are required or idle. The Ballooning hit to performance is much lower than that of Swapping to disk. Generally speaking, as mentioned in the previous point, don’t overcommit memory for business-critical SQL VMs, and if you do some over-commitment, don’t push it to these extreme limits.
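Two of the SQL sizing rules above lend themselves to a quick calculation: setting “Max Server Memory” to the configured memory minus 4GB, and checking whether a VM fits inside a single NUMA node. The following is a minimal sketch. The function names and example figures are hypothetical, and the NUMA node sizes must come from your own hosts.

```python
def max_server_memory_mb(configured_gb, reserve_gb=4):
    """Return a 'Max Server Memory' value in MB: configured memory minus
    the amount left for the Guest OS and 3rd-party applications (4 GB in the notes)."""
    if configured_gb <= reserve_gb:
        raise ValueError("VM memory must exceed the amount reserved for the Guest OS")
    return (configured_gb - reserve_gb) * 1024


def fits_single_numa_node(vcpus, memory_gb, cores_per_node, memory_per_node_gb):
    """True when both the vCPUs and the memory of the VM fit in one physical NUMA node.
    Hyperthreads are deliberately not counted as cores here."""
    return vcpus <= cores_per_node and memory_gb <= memory_per_node_gb


# Hypothetical example: a 32 GB, 8 vCPU SQL VM on hosts with 10-core, 128 GB NUMA nodes.
print(max_server_memory_mb(32))                # 28672
print(fits_single_numa_node(8, 32, 10, 128))   # True
print(fits_single_numa_node(12, 32, 10, 128))  # False
```

A VM that fails the second check is the case where exposing the physical NUMA topology to the guest becomes relevant.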
- Set “Min Server Memory” to define a min. amount of memory for SQL Server to acquire for processing. “Min Server Memory” will not immediately be allocated on startup. However, after memory usage has reached this value due to client load, SQL Server will not free memory unless the minimum server memory value is reduced. This can help in high-consolidation-ratio scenarios, as when the Host is memory-contended and the Balloon Driver is in full swing to free memory pages from Guest OSes, “Min Server Memory” will prevent the Balloon Driver from inflating and taking this defined memory space.
- SQL Server uses all the configured memory as a cache of its queries and it manages its configured memory by its own techniques. The Active Memory counter from vSphere (Web) Client may not reflect the actual usage of SQL Server memory. Use in-guest memory counters and SQL Server memory counters.
- Use Large Memory Pages for Tier-1 SQL VMs. SQL Server supports the concept of large pages when allocating memory for some internal structures and the buffer pool, when the following conditions are met:
  - You are using SQL Server Enterprise Edition.
  - The VM has 8GB or more of physical RAM.
  - The Lock Pages in Memory privilege is set for the service account. Check: http://msdn.microsoft.com/en-us/library/ms190730(v=sql.110).aspx
  To enable all SQL Server buffer to use Large Memory pages, you should start your 64-bit SQL Server with “trace flag 834”. To enable “trace flag 834”, check: http://support.microsoft.com/kb/920093. This will lead to:
  - With trace flag 834 enabled, SQL Server startup behavior changes. Instead of allocating memory dynamically at runtime, SQL Server allocates all buffer pool memory during startup. Therefore, SQL Server startup time can be significantly delayed.
  - With trace flag 834 enabled, SQL Server allocates memory in 2MB contiguous blocks instead of 4KB blocks. After the host has been running for a long time, it might be difficult to obtain contiguous memory due to fragmentation. If SQL Server is unable to allocate the amount of contiguous memory it needs, it can try to allocate less, and SQL Server might then run with less memory than you intended.
  - With large pages enabled in the guest operating system, and the virtual machine running on a host that supports large pages, vSphere does not perform Transparent Page Sharing on the VM’s memory unless the host has reached the Memory Hard state.
  - Any memory over-commitment technique can lead to performance issues with trace flag 834.
  This should only be enabled on Tier-1 dedicated SQL VMs with memory reservation and no memory overcommitment. For more information: http://blogs.msdn.com/b/psssql/archive/2009/06/05/sql-server-and-large-pages-explained.aspx
4-) Storage Sizing:
- Always consider any storage space overhead while calculating the VMs’ required space. Overhead can be: swapfiles, VMs’ logs or snapshots. It’s recommended to add 20-30% of space as an overhead.
- Separate different SQL VMs’ disks on different (dedicated if needed) datastores to avoid IO contention, as SQL is an IO-intensive application.
- Provide at least 4 paths, through two HBAs, between each ESXi host and the Storage Array for max. availability.
- For IP-based Storage, enable Jumbo Frames on its network end-to-end. Jumbo Frames reduce network overhead and increase throughput.
- RDM can be used in many cases, like: SQL P2V migration or to leverage a 3rd-party array-based backup tool. Choosing RDM disks or VMFS-based disks is based on your technical requirements. There is no performance difference between these two types of disks.
- Use the Paravirtual SCSI Driver in all of your SQL VMs, especially for disks used for DB and Logs, for max. performance, least latency and least CPU overhead.
- Distribute any SQL VM disks on the four allowed SCSI controllers for max. performance paralleling and higher IOps. It’s recommended to use Eager-zeroed Thick disks for DB and Logs disks.
- Partition Alignment gives a performance boost to your backend storage, as spindles will not make two reads or writes to process a single request. VMFS5 created using vSphere (Web) Client will be aligned automatically, as well as any disks formatted using newer versions of Windows. Any upgraded VMFS datastores or upgraded versions of Windows Guests will require a partition alignment process. For upgraded VMFS, it’s done by migrating VMs’ disks to another datastore using Storage vMotion, then formatting and recreating the datastore on VMFS5.
- The following table shows the Read/Write behavior of each of the SQL DB components:

DB Component | Read/Write | RAID Recommended
DB | Read Intensive | RAID 5
Logs | Write Intensive | RAID 1/10
tempdb | Write Intensive | RAID 10

- Allocate a tempdb file for each vCPU of your SQL VM. It’s recommended that these tempdbs are on a fast SSD array for better overall performance.
- Enable the Instant Initialization feature for speeding up the growth or creation of new databases. It forces Windows to initialize the DB file before zeroing all of the blocks in its allocated size, then Windows pads zeros to it as new writes occur. Log file disks can’t be instantly initialized. Check: http://technet.microsoft.com/en-us/library/ms175935(v=sql.110).aspx
- If possible, use the SQLIOSIM tool (found in the Binn folder under your SQL instance installation path) to simulate your SQL DB IO load on your backend storage to establish a performance baseline of your backend storage array. It’s recommended to run it in non-business hours as it loads your HBAs, NICs and back-end array. Check: http://support.microsoft.com/kb/231619
- Follow Microsoft SQL Storage Best Practices: http://technet.microsoft.com/en-us/library/cc966534.aspx
5-) Network:
- Use VMXNet3 vNIC in all SQL VMs for max. performance and throughput and least CPU overhead.
- SQL VMs’ port group should have at least 2 physical NICs for redundancy and NIC teaming capabilities. Connect each physical NIC to a different physical switch for max. redundancy.
- Consider network separation between different types of networks, like: vMotion, Management, SQL production, Fault Tolerance, SQL Replication, etc. Network separation is either physical or virtual using VLANs.
- Clustered SQL VMs should have two vNICs, one for the public network and the other one for the heartbeat and replication network. It’s better to dedicate a physical NIC on ESXi hosts for the replication network between Clustered SQL VMs, especially when using Synchronous-commit mode AAGs.
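The storage sizing guidance for SQL VMs (add 20-30% of space as overhead, and distribute a VM’s disks over the four allowed SCSI controllers) can be illustrated with a small sketch. The disk names, sizes and the 25% default are hypothetical example values, and the round-robin layout is just one simple way to spread disks, not a method prescribed by the notes.

```python
def datastore_size_gb(vm_disk_sizes_gb, overhead=0.25):
    """Total datastore space to provision: the sum of the VM disks plus an
    overhead allowance for swapfiles, VM logs and snapshots (20-30% in the notes)."""
    return sum(vm_disk_sizes_gb) * (1 + overhead)


def spread_disks(disks, controllers=4):
    """Assign disks round-robin to SCSI controllers 0..controllers-1."""
    return {c: disks[c::controllers] for c in range(controllers)}


# Hypothetical SQL VM with OS, DB, log and tempdb disks.
print(datastore_size_gb([100, 200, 60, 40]))            # 500.0
print(spread_disks(["os", "db", "logs", "tempdb", "backup"]))
# {0: ['os', 'backup'], 1: ['db'], 2: ['logs'], 3: ['tempdb']}
```

In practice the DB and log disks would be the ones placed on separate Paravirtual SCSI controllers first, since they carry most of the IO.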
6-) Monitoring: Try to establish a performance baseline for your SQL VMs and VI by monitoring the following:
- ESXi Hosts and VMs counters:

Resource | Metric (esxtop/resxtop) | Metric (vSphere Client) | Host/VM | Description
CPU | %USED | Used | Both | CPU used over the collection interval (%)
CPU | %RDY | Ready | VM | CPU time spent in ready state
CPU | %SYS | System | Both | Percentage of time spent in the ESX/ESXi Server VMKernel
Memory | Swapin, Swapout | Swapinrate, Swapoutrate | Both | Memory ESX/ESXi host swaps in/out from/to disk (per virtual machine, or cumulative over host)
Memory | MCTLSZ (MB) | vmmemctl | Both | Amount of memory reclaimed from resource pool by way of ballooning
Disk | READs/s, WRITEs/s | NumberRead, NumberWrite | Both | Reads and Writes issued in the collection interval
Disk | DAVG/cmd | deviceLatency | Both | Average latency (ms) of the device (LUN)
Disk | KAVG/cmd | KernelLatency | Both | Average latency (ms) in the VMkernel, also known as “Queuing Time”
Network | MbRX/s, MbTX/s | Received, Transmitted | Both | Amount of data received/transmitted per second
Network | PKTRX/s, PKTTX/s | PacketsRx, PacketsTx | Both | Received/Transmitted Packets per second
Network | %DRPRX, %DRPTX | DroppedRx, DroppedTx | Both | Receive/Transmit Dropped packets per second

- In-guest Counters: http://www.vmware.com/files/pdf/solutions/SQL_Server_on_VMware-Best_Practices_Guide.pdf

Manageability

1-) SQL Server 2012 Support Statement for Virtualization: http://support.microsoft.com/kb/956893
2-) SQL Server 2012/2014 Editions: http://www.microsoft.com/en-us/server-cloud/products/sql-server-editions/
2-) SQL Server 2012/2014 Licensing:
2012: http://download.microsoft.com/documents/china/sql/SQL_Server_2012_Licensing_Reference_Guide.pdf
2014: http://www.microsoft.com/en-us/server-cloud/products/sql-server/buy.aspx
Two licensing approaches: Licensing your VMs or Licensing your physical hosts (Max. Virtualization Approach).
- Licensing your VMs: Used usually for small deployments. Two models: Licensing your virtual cores or Server/CAL licensing.
  Licensing Virtual Cores:
  - Min.: 4 virtual cores per VM.
  - Core Factor doesn’t apply.
  - Hyperthreading is taken into consideration as additional vCPUs that must be licensed if used.
  - Allows for VMs’ mobility across different hosts using VMware vMotion with Software Assurance (SA) Benefits.
  - Available only for Standard Edition.
  Licensing Server/CAL:
  - Each VM will have a Server license and each connection requires a CAL license.
  - Allows for VMs’ mobility across different hosts using VMware vMotion with Software Assurance (SA) Benefits.
  - Available for Standard and Business Intelligence Editions.
- Licensing your Physical Hosts (Max. Virtualization Approach): Used for large virtual environments. You count only physical CPU cores with consideration of Core Factor, and Hyperthreading isn’t taken into consideration. With SA Benefits, it allows for an unlimited number of SQL Server VMs and allows for license mobility as much as possible in your datacenter. Without SA, the total vCPUs of the SQL VMs you deploy must be less than or equal to the number of physical Cores licensed for SQL per host.
3-) SQL Server 2012/2014 Maximums: http://msdn.microsoft.com/en-us/library/ms143432.aspx
4-) Try to have a policy for Change Management in your environment to control the SQL Server VMs sprawl due to high demand on SQL Servers for the different purposes in your environment, like: testing, development, supporting a single production application, etc.
5-) Try to leverage monitoring and capacity planning tools, like: Microsoft MAP Tool. It helps significantly in monitoring all your SQL VMs’ performance and utilization, your DBs’ sizes and performance trends, as well as creating a complete inventory of all your SQL servers, their editions and licenses (this helps a lot in case of P2V, DBs’ migration to SQL Servers or SQL Server migration to a new version).
6-) Try to use Upgrade Advisor to perform a full check-up on the old SQL Servers before upgrading to SQL 2012/2014. Keep in mind that it can’t check the OS or some 3rd-party applications installed that may not allow for upgrade. For more information: http://msdn.microsoft.com/en-us/library/ms144256.aspx
7-) Try to leverage Microsoft SQL Best Practices Analyzer to make sure you follow the best practices of deploying SQL Server in your environment.
8-) Try to leverage VMware Converter for faster migration from physical SQL Servers. It’s recommended to use Cold P2V conversion to preserve DBs’ consistency. Keep in mind to remove unneeded HW drivers from P2V VMs after successful conversions. This can be done by:
- Change the Windows Environment Variable “devmgr_show_nonpresent_devices” to 1.
- From Device Manager, after all non-present physical drivers are shown, remove all unneeded old drivers.
- Restart the VM.
9-) An availability group listener is a virtual network name (VNN) that directs read-write requests to the primary replica and read-only requests to the read-only secondary replica. That enables application clients to connect to an availability replica without knowing the name of the physical instance of the SQL Server installation. Always create an availability group listener when deploying AAG on vSphere. For more information: http://technet.microsoft.com/en-us/library/hh213417.aspx
10-) Time Synchronization is one of the most important things in SQL environments. It’s recommended to do the following:
- Let all your SQL VMs sync their time with DCs only, not with VMware Tools.
- Disable time-sync between SQL VMs and Hosts using VMware Tools totally (even after unchecking the box on the VM settings page, a VM can still sync with the Host using VMware Tools in case of startup, resume, snapshotting, etc.) according to the following KB: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1189
- Sync all ESXi Hosts in the VI to the same Stratum 1 NTP Server, which is the same time source of your forest/domain.
11-) Make sure that you enable “Full Recovery Mode” on all DBs that will be included in your SQL AAGs and also make sure that at least a single Full Backup is taken.

Recoverability

1-) Try to leverage any backup software that uses Microsoft Volume Shadow Copy Service (VSS). Such tools are SQL-aware and don’t cause any corruption in the DB, because they quiesce the DB during the backup operation. Of course, one of them is vSphere Advanced Data Protection. Check the following link: http://www.vmware.com/files/pdf/products/vsphere/VMware-vSphere-Data-Protection-Product-FAQ.pdf
2-) If available, you can use any backup software that depends on array-based snapshots if it’s SQL-aware.
3-) Use VMware Site Recovery Manager (SRM) if available for DR. With SRM, automated failover to a replicated copy of the VMs in your DR site can be carried out in case of a disaster or even a failure of a single SQL VM, for example.
4-) If VMware SRM isn’t available, you can leverage some availability features of SQL itself for more recoverability and DR. You can either use a mix between AAG Synchronous/Asynchronous replicas or Data Mirroring in High Safety Mode with Log Shipping in the DR Site. This approach leads to lower cost, but with higher management overhead and higher RPO/RTO results than using VMware SRM.

Scalability

- Scale-up Approach of SQL VMs requires large ESXi Hosts with many sockets and RAM. It reduces the number of VMs required to serve a certain number of DBs and hence, a single failed VM will affect a large portion of users. That’s why Scale-up Approach needs careful attention to the availability of SQL VMs. At the same time, it reduces the cost of software licenses and physical hosts. Scale-out Approach requires smaller ESXi Hosts and gives more flexibility in designing a SQL VM, but requires a high number of ESXi hosts to provide the required level of availability and more software licenses and hence, more cost. A single VM failure has less effect using Scale-out Approach and it requires less time for migration using vMotion and hence, DRS will be more effective. There’s no best approach here. It all depends on your environment and your requirements.
- Leverage CPU/Memory Hot Add with SQL VMs to scale them as needed. All you need is to use the “RECONFIGURE” query using Management Studio to force SQL Server to use the newly added resources.
- Try to leverage the SQL Sysprep Tool to provide a golden template of your SQL VM. It’ll reduce the time required for deploying or scaling your SQL environment as well as preserve consistency of configuration throughout your environment. For more information, check: http://msdn.microsoft.com/en-us/library/ee210664.aspx & http://msdn.microsoft.com/en-us/library/ee210754.aspx

Security
Oracle DB

Availability

1-) Try to use vSphere HA in addition to Oracle RAC Cluster to provide the highest level of availability. Adopt a protection policy of N+1, where N is the number of Oracle RAC Cluster member VMs in a vSphere Cluster of N+1 hosts. In case of an ESXi failure, vSphere HA powers on the failed virtual machine on another host in the background, restoring the failed Oracle RAC Cluster member and bringing the newly passive member up and ready to take over in case of a failure, or to be manually reactivated as the primary active instance.
2-) Try to separate your Oracle RAC Cluster VMs on different Racks, Blade Chassis and Storage Arrays if available, using VMs Affinity/Anti-affinity rules and Storage Affinity/Anti-affinity rules for most availability.
3-) For Oracle RAC Cluster VMs, use VMs anti-affinity rules to separate them over different hosts. When HA restarts a VM, it’ll not respect the anti-affinity rule, but on the following DRS invocation, the VM will be migrated to respect the rule. In vSphere 5.1, configure the vSphere Cluster with the “ForceAffinePoweron” option set to 1 to respect all VMs Affinity/Anti-affinity rules. In vSphere 5.5, configure the vSphere Cluster with both “ForceAffinePoweron” & “das.respectVmVmAntiAffinityRules” set to 1 to respect all VMs Affinity/Anti-affinity rules respectively.
4-) For zero downtime, deploy an Oracle RAC Multi-node Cluster leveraging vSphere HA, but it’ll lead to high costs of licenses and physical hosts to host many Oracle VMs. For minimum downtime (near zero), you can deploy an Oracle RAC Single-Node cluster leveraging vSphere HA, which costs much less than a Multi-Node cluster, but it suffers from some downtime while moving the Oracle Instance from the failed node to another node using OMotion. Eventually, it’s your choice according to your technical requirements and SLA.
6-) Try to leverage VM Monitoring to mitigate the risk of Guest OS failure. VMware Tools inside Oracle VMs will send heartbeats to the HA driver on the host. If they stop because of a Guest OS failure, the host will monitor IO and network activity of the VM for a certain period. If there’s also no activity, the host will restart the VM. This adds an additional layer of availability for Oracle VMs.
7-) Try to leverage Symantec Application HA Agent for Oracle with vSphere HA for max. availability. Using Application HA, the monitoring agent will monitor the Oracle instance and its services, sending heartbeats to the HA driver on the ESXi host. In case of application failure, it may restart services and any dependent resources. If the Application HA Agent can’t recover the application from that failure, it’ll stop sending heartbeats and the host will initiate a VM restart as an HA action.

Performance

1-) Configure the following BIOS Settings on each ESXi Host:

Settings | Recommended Value | Description
Virtualization Technology | Yes | Necessary to run 64-bit guest operating systems.
Turbo Mode | Yes | Balanced workload over unused cores.
Node Interleaving | No | Disables NUMA benefits if set to Yes.
VT-x, AMD-V, EPT, RVI | Yes | Hardware-based virtualization support.
C1E Halt State | No | Disable if performance is more important than saving power.
Power-Saving | No | Disable if performance is more important than saving power.
Virus Warning | No | Disables warning messages when writing to the master boot record.
Hyperthreading | Yes | For use with some Intel processors. Hyperthreading is always recommended with Intel’s newer Core i7 processors such as the Xeon 5500 series.
Video BIOS Cacheable | No | Not necessary for database virtual machine.
Video BIOS Shadowable | No | Not necessary for database virtual machine.
Video RAM Cacheable | No | Not necessary for database virtual machine.
On-Board Audio | No | Not necessary for database virtual machine.
On-Board Modem | No | Not necessary for database virtual machine.
On-Board Firewire | No | Not necessary for database virtual machine.
On-Board Serial Ports | No | Not necessary for database virtual machine.
On-Board Parallel Ports | No | Not necessary for database virtual machine.
On-Board Game Port | No | Not necessary for database virtual machine.
Wake On LAN | Yes | Required for VMware vSphere Distributed Power Management feature.
Execute Disable | Yes | Required for vMotion and VMware vSphere Distributed Resource Scheduler (DRS) features.

2-) Remove unnecessary services from the Guest OS, whether it’s Windows or Linux. For example, on Windows: Indexing Service, System Restore and Remote Desktop. For example, on Linux: IPTables, Autofs and cups.
3-) Set VM settings to “Automatically Choose Best CPU/MMU Virtualization Mode”.
4-) CPU Sizing:
- Don’t over-commit CPUs. It’s better to keep it nearly 1:1. In some cases like small environments, where there are a lot of underutilized testing Oracle Servers, over-commitment is allowed to get higher consolidation ratios. Performance monitoring is mandatory in this case to maintain a baseline of normal-state utilization.
- Enable Hyperthreading when available. It won’t double the processing power (contrary to what is shown on the ESXi host as double the number of logical cores), but it’ll give a CPU processing boost of up to 20-25% in some cases.
- ESXi Hypervisor is NUMA-aware and it leverages the NUMA topology to gain a significant performance boost. Try to size your Oracle VMs to fit inside a single NUMA node to gain the performance boost of NUMA node locality.
- In case of large Oracle VMs that won’t fit inside a single NUMA node, keep in mind that Oracle 11g and above support NUMA (they are Virtual NUMA-aware), but it’s disabled by default. It’s recommended to test whether enabling NUMA Support and exposing vNUMA to Oracle would increase performance or not before applying it in production.
5-) Memory Sizing:
- Don’t over-commit memory, as Oracle is a memory-intensive application. If needed, reserve the configured memory to provide the required performance level. Keep in mind that memory reservation affects other aspects, like: HA Slot Size, vMotion chances and time. In addition, reservation of memory removes VM swapfiles from datastores and hence, their space is usable for adding more VMs. For some cases, overcommitment is allowed after establishing a performance baseline of normal-state utilization.
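The memory over-commitment advice for Oracle VMs can be checked with simple arithmetic before any VM is placed on a host. The sketch below is illustrative only: the VM sizes, the host RAM and the hypervisor overhead figure are hypothetical, and a ratio above 1.0 is simply taken to mean that memory is over-committed.

```python
def memory_overcommit_ratio(vm_configured_gb, host_physical_gb):
    """Configured VM memory divided by host physical memory.
    A value above 1.0 means the host is over-committed."""
    return sum(vm_configured_gb) / host_physical_gb


def reservations_fit(vm_reserved_gb, host_physical_gb, hypervisor_overhead_gb=0.0):
    """True when all memory reservations plus an allowance for the hypervisor
    fit into the host's physical memory."""
    return sum(vm_reserved_gb) + hypervisor_overhead_gb <= host_physical_gb


# Hypothetical host with 256 GB of RAM.
print(memory_overcommit_ratio([64, 64, 32], 256))    # 0.625
print(memory_overcommit_ratio([128, 128, 64], 256))  # 1.25
print(reservations_fit([64, 64, 32], 256, hypervisor_overhead_gb=8))  # True
print(reservations_fit([128, 128], 256, hypervisor_overhead_gb=8))    # False
```

Such a check only covers configured and reserved memory. It does not replace a performance baseline of actual utilization.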
- Don’t disable the Balloon Driver installed with VMware Tools. Ballooning is the last line of defense of the ESXi Host before compression and swapping to disk when its memory becomes too low. When Ballooning is needed, the Balloon Driver will force the Guest OS to swap the idle memory pages to disk to return the free memory pages to the Host, i.e. swapping is done according to Guest OS techniques. Swapping to disk is done by the ESXi Host itself. The Host would swap memory pages from physical memory to the VM swap file on disk without knowing what these pages contain or whether these pages are required or idle. The Ballooning hit to performance is much lower than that of Swapping to disk. Generally speaking, as mentioned in the previous point, don’t overcommit memory for business-critical Oracle VMs, and if you do some over-commitment, don’t push it to these extreme limits.
- Use Large Memory Pages for Tier-1 Oracle VMs. Oracle Server supports the concept of large pages when allocating memory since version 9i R2 for Linux and 10g R2 for Windows.
6-) Storage Sizing:
- Always consider any storage space overhead while calculating the VMs’ required space. Overhead can be: swapfiles, VMs’ logs or snapshots. It’s recommended to add 20-30% of space as an overhead.
- Separate different Oracle VMs’ disks on different (dedicated if needed) datastores to avoid IO contention, as Oracle is an IO-intensive application.
- Provide at least 4 paths, through two HBAs, between each ESXi host and the Storage Array for max. availability.
- For IP-based Storage, enable Jumbo Frames on its network end-to-end. Jumbo Frames reduce network overhead and increase throughput.
- RDM can be used in many cases, like: Oracle P2V migration or to leverage a 3rd-party array-based backup tool. Choosing RDM disks or VMFS-based disks is based on your technical requirements. There is no performance difference between these two types of disks.
- Use the Paravirtual SCSI Driver in all of your Oracle VMs, especially for disks used for DB and Logs, for max. performance, least latency and least CPU overhead.
- Distribute any Oracle VM disks on the four allowed SCSI controllers for max. performance paralleling and higher IOps. It’s recommended to use Eager-zeroed Thick disks for DB and Logs disks.
- Partition Alignment gives a performance boost to your backend storage, as spindles will not make two reads or writes to process a single request. VMFS5 created using vSphere (Web) Client will be aligned automatically, as well as any disks formatted using newer versions of Windows. Any upgraded VMFS datastores or upgraded versions of Windows Guests will require a partition alignment process. For upgraded VMFS, it’s done by migrating VMs’ disks to another datastore using Storage vMotion, then formatting and recreating the datastore on VMFS5.
- The following table shows the Read/Write behavior of each of the Oracle DB components:

DB Component | Read/Write | RAID Recommended
DB | Read Intensive | RAID 5
Logs | Write Intensive | RAID 1/10
OS Disk | Read/Write | RAID 1/10

- Don’t use Oracle Clustered File System (OCFS). It is the predecessor of Oracle Automatic Storage Management (ASM) and doesn’t allow for single instance Oracle Server. Oracle ASM performs better and allows for single instance and clustered Oracle RACs.
- Create Oracle ASM Disk groups on similar disks and storage arrays, as a disk group performs as fast as the slowest disk inside the group.
- Don’t use Oracle Automatic Storage Management (ASM) Failure Groups. They cost additional CPU overhead and may behave unexpectedly after failure of one disk in virtual environments. For disk redundancy, use external RAID (or similar technology) that’s done on storage-array level, transparently to the Oracle Stack.
- For Oracle Multi-Node RAC Cluster, you can use either VMDK disks on VMFS datastores or RDM LUNs for shared CSR/Voting Disks. In case you use VMDK disks on VMFS datastores as CSR/Voting Disks, you have to disable the simultaneous write protection provided by VMFS using the multi-writer flag (http://kb.vmware.com/kb/1034165) to share these disks between nodes, as well as using VMDK disks as “Independent Persistent” disks. For RDM disks as CSR/Voting Disks, you use SCSI Bus Sharing in Physical Mode to share these disks between nodes. Keep in mind that Oracle Real Application Cluster (RAC) supports vMotion using shared VMFS datastores hosting its disks.
7-) Network:
- Use VMXNet3 vNIC in all Oracle VMs for max. performance and throughput and least CPU overhead.
- Oracle VMs’ port group should have at least 2 physical NICs for redundancy and NIC teaming capabilities. Connect each physical NIC to a different physical switch for max. redundancy.
- Consider network separation between different types of networks, like: vMotion, Management, Oracle production, Fault Tolerance, Oracle Replication, etc. Network separation is either physical or virtual using VLANs.
- Clustered Oracle VMs should have two vNICs, one for the public network and the other one for the heartbeat and replication network. It’s better to dedicate a physical NIC on ESXi hosts for the replication network between Clustered Oracle RAC VMs.
8-) Monitoring: Try to establish a performance baseline for your Oracle VMs and VI by monitoring the following:
Only meaningful for SMP virtual machines. also known as %MLMTD Memory Disk VM Both . or cumulative over host) Amount of memory reclaimed from resource pool by way of ballooning Reads and Writes issued in the collection interval Both Both Average latency (ms) of the device (LUN) Average latency (ms) in the VMkernel. Swapout MCTLSZ (MB) System Swapinrate. Swapoutrate vmmemctl Both Both READs/s. Percentage of time spent in the ESX/ESXi Server VMKernel Memory ESX/ESXi host swaps in/out from/to disk (per virtual machine. WRITEs/s DAVG/cmd KAVG/cmd NumberRead.
NumberWrite deviceLatency KernelLatency Both CPU used over the collection interval (%) CPU time spent in ready state Percentage of time a vCPU spent in read.ESXi Hosts and VMs counters: Resource Metric (esxtop/resxtop) Metric (vSphere Client) Host/ VM Description CPU %USED %RDY %CSTP Used Ready Co-Stop Both VM VM %SYS Swapin. co-descheduled state. Percentage of time a vCPU was ready to run but was deliberately not scheduled due to CPU limits. That’s why Scale-up Approach needs a careful attention to availability of Oracle VMs. Scale-up Approach of Oracle VMs requires a large ESXi Hosts with many sockets and RAM. Scalabilit y .do? language=en_US&cmd=displayKC&externalId=1006427 . PacketsTx DroppedRx. more cost. . as CPU and Memory usage is more accurately obtained using inguest counters. but requires high number of ESXi hosts to provide the required level of availability and more software licenses and hence. PKTTX/s %DRPRX.Networ k MbRX/s.com/selfservice/microsites/search. In the same time it reduces the cost of software licenses and physical hosts.do? language=en_US&cmd=displayKC&externalId=1318 Linux: http://kb. snapshotting. It all depends on your environment and your requirements.com/selfservice/microsites/search. Managea bility 1-) Oracle Support statement for VMware: 2-) VMware Expanded support for Oracle DB: https://www. Scale-out Approach requires smaller ESXi Hosts and gives a more flexibility in designing of an Oracle VM. etc. All you need is to use “RECONFIGURE” query using Management Studio to force SQL Server to use the newly added resources. DroppedTx Both Queuing Time‖ Amount of data received/transmitted per second Both Received/Transmitted Packets per second Both Receive/Transmit Dropped packets per second In-guest monitoring is important as well. Transmitted PacketsRx. There’s no best approach here.Sync all ESXi Hosts in the VI to the same Startum 1 NTP Server which is the same time source of your forest/domain. 
It’s recommended to do the following: .vmware. MbTX/s PKTRX/s.com/selfservice/microsites/search.Let all your Oracle VMs sync their time according to the following best practices: Windows: http://kb. a single failed VM will affect a large portion of users. A single VM failure has a less effect using Scale-out Approach and it requires less time for migration using vMotion and hence. It’ll reduce the time required for deploying or scaling your Oracle environment as well as preserve consistency of configuration throughout your environment. %DRPTX Received. VM can sync with the Host using VMware Tools in case of startup.Disable time-sync between SQL VMs and Hosts using VMware Tools totally (Even after uncheck the box from VM settings page. resume. It reduces the number of VMs required to serve certain number of transactions and hence.com/support/policies/oracle-support 3-) Try to leverage a Golden Template to provide a base to your Oracle VM.) according to the following KB: http://kb.vmware. 5-) Leverage CPU/Memory Hot add with SQL VMs to scale them as needed.vmware. 4-) Time Synchronization is one of the most important things in Oracle environments.vmware.vSphere supports sizing Oracle VMs using Scale-out or Scale-up approaches.do?language=en_US&cmd=displayKC&externalId=1189 . DRS will be more effective. . use DRS Clusters in Fully Automated Mode.aspx 5-) As Microsoft supports vMotion of SQL VMs.aspx . For SQL Servers in your SharePoint Farms.Protect Crawl DB with the suitable SQL native availability technique for higher availability if needed. it’s recommended to deploy many Web Server VMs behind Load Balancers (HW/Virtual Aplliances) to provide load-balancing and high availability.com/en-us/library/jj841106(v=office. leverage vSphere HA with VM Monitoring to restart any failed Application Server VMs on other hosts for better availability and min. availability..15). It’ll always load balance your SharePoint VMs across the cluster. 
leverage vSphere HA with VM Monitoring to restart any failed Web Server VMs on other hosts for better availability and min. 2-) For Application Server Role. downtime. it’s recommended to deploy many Application Server VMs to provide load-balancing and high availability.Deploy Two or more Crawl DB Servers. it’s recommended to use SQL Server native availability techniques combined with vSphere HA.microsoft.com/en-us/library/jj841106(v=office. . VM Monitoring and ApplicationHA for max. each with a main Index Partition and a mirror of the other partitionfor load balancing and redundancy. respecting all of your configured affinity/antiaffinity rules.Deploy two or more Query Servers.is Hot-add aware and there’s no need to reboot a VM afer hot-add operation. For more information: http://technet.Leverage CPU/Memory Hot add with Oracle VMs to scale them as needed. Some SQL native availability techniques can be used with all types of SharePoint 2013 Farm DBs while other techniques can’t be used with all types of SharePoint DBs. 3-) For DB Servers. This provides highest level of redundancy and load balancing for large environments with Enterprise Search Service. It’ll use the added resource immediately. For additional availability.Deploy two or more Crawl Servers.aspx 5-) For Search Service Availability: .microsoft. For more information: http://technet. downtime. .com/en-us/library/gg502595. ShareP oint 2013 Availabili ty 1-) For Web Server Role. each one holds a Crawl DB for a Crawl Server.microsoft. For example: Availability Technique Configuration Central Content DB Administration DB DB DB Mirroring – High Safety Mode Yes Yes Yes DB Mirroring – High Performance Mode/ Log No Yes Yes Shipping SQL 2012 AAG – Synchronous-commit Mode Yes Yes Yes SQL 2012 AAG – Asynchronous-commit Mode No No Yes For more information: http://technet. For additional availability. SharePoint 2013 Farm will automatically balance the users load between all Application Server VMs. 
Each Crawl Server should run two or more Crawler Services, each connected to a different Crawl DB, for more redundancy and load balancing.
For SQL Servers in your SharePoint Farms, consider all best practices while deploying the vMotion Network to support large SQL VM migrations.
4-) For Web Server/Application Server VMs, use DRS VMs anti-affinity rules to separate them over different hosts. When HA restarts a VM, it'll not respect the anti-affinity rule, but on the following DRS invocation, the VM will be migrated to respect the rule. In vSphere 5.1, configure the vSphere Cluster with the "ForceAffinePoweron" option set to 1 to respect all VMs Affinity/Anti-affinity rules. In vSphere 5.5, configure the vSphere Cluster with both "ForceAffinePoweron" & "das.respectVmVmAntiAffinityRules" set to 1 to respect all VMs Affinity/Anti-affinity rules respectively.
6-) Try to leverage VM Monitoring to mitigate the risk of Guest OS failure. VMware Tools inside SharePoint VMs will send heartbeats to the HA driver on the host. If they stop because of a Guest OS failure, the host will monitor IO and network activity of the VM for a certain period. If there's also no activity, the host will restart the VM. This adds an additional layer of availability for SharePoint VMs.
7-) Try to leverage Symantec Application HA Agent for SharePoint with vSphere HA for max. availability. Using Application HA, the monitoring agent will monitor the SQL instance and SharePoint services, sending heartbeats to the HA driver on the ESXi host. In case of application failure, it may restart services or mount databases or services. If the Application HA Agent can't recover the application from that failure, it'll stop sending heartbeats and the host will initiate a VM restart as an HA action.

Performance
1-) Distributed Cache Application VMs should have their configured memory reserved. They heavily depend on their memory as a Cache for the entire SharePoint Farm, so they must not participate in any memory reclamation techniques, like: ballooning.
2-) You should have a good knowledge of the DBs used in a SharePoint 2013 Farm and their performance characteristics before virtualizing. For more information about SharePoint 2013 DBs: http://technet.microsoft.com/en-us/library/cc678868(v=office.15).aspx
For graphical poster: http://www.microsoft.com/en-us/download/confirmation.aspx?id=30363
3-) As it's so hard to give performance recommendations for green-field deployments of SharePoint 2013 farms, Microsoft has its own performance tests for some scenarios and recommendations based on these tests: http://technet.microsoft.com/en-us/library/ff608068(v=office.15).aspx
You can also check Microsoft published case studies about different deployment scenarios with different capacities: http://technet.microsoft.com/en-us/library/cc261716(v=office.14).aspx
4-) For Capacity Planning of a SharePoint 2013 Farm, it's recommended to follow Microsoft recommendations as it's really difficult to provide standard guidance: http://technet.microsoft.com/en-us/library/ff758645(v=office.15).aspx
5-) Follow Microsoft Best Practices for backend SQL Server: http://technet.microsoft.com/en-us/library/hh292622(v=office.15).aspx
SharePoint 2013 supports only SQL Server 2008R2 or 2012.
5-) CPU Sizing:
- Assign vCPUs as required (using Hot Add feature) and don't over-allocate to the VM, to prevent CPU scheduling issues at hypervisor level and high RDY time. Generally speaking, better CPU utilization in your SharePoint Farm means higher throughput and lower latency.
- Web Server and Application Server can start with 4 vCPUs then be scaled up, down or out according to your environment.
- For Application and Web Server VMs, sometimes it's easier and better to follow the scale-out approach, creating additional VMs to serve more load, than the scale-up approach. This approach can be applied to the three roles in your SharePoint Farm: Web, Application and DB VMs. Besides, vSphere DRS balances smaller VMs across the cluster more easily than larger VMs.
- Don't over-commit CPUs. Ratio of Virtual: Physical Cores should be 2:1 max (better to keep it nearly 1:1) for mission-critical SharePoint VMs. In some cases, like small environments, over-commitment is allowed to get higher consolidation ratios. Performance monitoring is mandatory in this case to maintain a baseline of normal-state utilization.
- Enable Hyperthreading when available. It won't double the processing power (contrary to what the ESXi host shows as double the number of logical cores), but it'll give a CPU processing boost of up to 20-25% in some cases. Don't consider it when calculating the Virtual: Physical Cores ratio.
- ESXi Hypervisor is NUMA aware and it leverages the NUMA topology to gain a significant performance boost. Try to size your SharePoint VMs to fit inside a single NUMA node to gain the performance boost of NUMA node locality.
- On your SQL Server, set the "Maximum Degree of Parallelism" setting to 1 from SQL Server adv. properties to control how SQL Server divides incoming requests between VM vCPUs.
6-) Memory Sizing:
- Don't over-commit memory, as SharePoint 2013 is a memory-intensive application. If needed, reserve the configured memory to provide the required performance level (more memory equals more caching and better throughput and lower latency). Keep in mind that memory reservation affects aspects like HA Slot Size and vMotion chances and time. In addition, reservation of memory removes VM swapfiles from datastores, so their space is usable for adding more VMs. For some cases, where there are a lot of underutilized SharePoint Servers, over-commit is allowed after establishing a performance baseline.
- Leverage the Memory Hot-add feature to scale your VMs quickly. Keep in mind that some SharePoint servers, like: Distributed Cache, SQL Server, can't use the added memory till a reboot.
- For Web Server and Application Server, generally speaking, the min recommended memory is 8GB for small environments, scaling up to 16GB for large ones. Adding more user load on your Web Servers or more applications on your Application Servers will require adding more memory than the recommended.
- For Content DB Memory Sizing:
Combined size of content databases | RAM recommended for computer running SQL Server
Minimum for small production deployments | 8 GB
Minimum for medium production deployments | 16 GB
Recommendation for up to 2 terabytes | 32 GB
Recommendation for the range of 2 terabytes to 5 terabytes | 64 GB
Recommendation for more than 5 terabytes | >64 GB (estimated according to your DB size to provide enough cache to improve your SQL Server performance)
Keep in mind that leveraging any SQL Availability technique that creates additional secondary copies of the DB will require sizing the secondary SQL Server node with the same memory size to provide the same performance in case of failover. In case of using SQL 2012 AAG groups with readable secondaries, sizing the secondary SQL Server node properly will improve Read operation performance.
7-) Storage Sizing:
- Always consider any storage space overhead while calculating the required VM space. Overhead can be: swapfiles, VM logs or snapshots. It's recommended to add 20-30% of space as an overhead.
- Separate different SharePoint VMs' disks on different (dedicated if needed) datastores to avoid IOps contention, as SharePoint is an IO-intensive application with many components, each with different IOps requirements.
- Provide at least 4 paths, through two HBAs, between each ESXi host and the Storage Array for max. availability.
- RDM can be used in many cases, like: P2V migration or to leverage a 3rd-party array-based backup tool. Choosing RDM disks or VMFS-based disks is based on your technical requirements. There is no performance difference between these two types of disks.
- Use Paravirtual SCSI Driver in all of your SharePoint VMs, specially for disks used for DB and Logs, for max. performance, least latency and least CPU overhead.
- Partition Alignment gives a performance boost to your backend storage, as spindles will not make two reads or writes to process a single request. VMFS5 created using vSphere (Web) Client will be aligned automatically, as well as any disks formatted using newer versions of Windows. Any upgraded VMFS datastores or upgraded versions of Windows Guests will require a partition alignment process. For upgraded VMFS, it's done by migrating VM disks to another datastore using Storage vMotion, then formatting and recreating the datastore on VMFS5.
- All recommended best practices for deploying SQL Server Storage requirements must be applied when deploying SharePoint Farm backend DB Servers.
8-) Network Sizing:
- Use VMXNet3 vNIC in all SharePoint VMs for max. performance and throughput and least CPU overhead.
- SharePoint VMs port group should have at least 2 physical NICs for redundancy and NIC teaming capabilities. Connect each physical NIC to a different physical switch for max. redundancy.
- Consider network separation between different types of networks, like: vMotion, Management, SharePoint production, SharePoint backend communication, Fault Tolerance, etc. Network separation is either physical or virtual using VLANs.
- Your Application and Web Servers should have two vNICs, one for public communication with users and the other for backend communication with SQL DB VMs. It's better to dedicate a physical NIC on ESXi hosts for the backend communication network between Application and Web VMs and SQL DB VMs.
- Provided that your design depends on creating multiple redundant instances of all SharePoint Roles, you can keep one Web, one Application and one backend DB Server VM as one unit on a single ESXi Host. This will make all their backend communications local in the host's memory.
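The Content DB memory table and the 20-30% storage overhead rule from the Memory Sizing and Storage Sizing notes above are simple enough to express as arithmetic. The following is a minimal Python sketch of both rules, not an official sizing tool: the function and constant names are made up for illustration, and the only inputs are the numbers quoted in the notes.

```python
# Illustrative helpers only: names are hypothetical, thresholds are copied
# from the Content DB memory table and the storage overhead note.

# (upper bound of combined content DB size in TB, recommended SQL Server RAM in GB)
CONTENT_DB_RAM_TIERS = [(2.0, 32), (5.0, 64)]


def recommended_sql_ram_gb(content_db_size_tb):
    """Return the table's RAM recommendation in GB for a combined content DB size.

    Returns None above 5 TB, because the table only says ">64 GB"
    (to be estimated from the actual DB size).
    """
    if content_db_size_tb <= 0:
        raise ValueError("content DB size must be positive")
    for upper_bound_tb, ram_gb in CONTENT_DB_RAM_TIERS:
        if content_db_size_tb <= upper_bound_tb:
            return ram_gb
    return None


def provisioned_size_gb(vm_disk_gb, overhead=0.25):
    """Add the 20-30% overhead (swapfiles, VM logs, snapshots) to a VM's disk size."""
    if not 0.20 <= overhead <= 0.30:
        raise ValueError("the notes recommend an overhead between 20% and 30%")
    return vm_disk_gb * (1 + overhead)


if __name__ == "__main__":
    print(recommended_sql_ram_gb(1.5))       # combined content DBs of 1.5 TB
    print(provisioned_size_gb(1000, 0.20))   # 1000 GB of VM disks, 20% overhead
```

The small and medium "minimum" rows of the table are keyed on deployment size rather than DB size, so they are deliberately left out of the lookup.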
- Keeping backend communication local to one host provides much more throughput than your network and much lower latency. Use DRS VMs Affinity rules to keep these VMs together. Create many units of the three VMs and distribute them on your ESXi hosts for higher availability. Use Host-VM Should Affinity rules to control which unit runs on which host.
- For your DB Servers, dedicate a physical NIC on the ESXi hosts hosting them for replication traffic between redundant instances, to keep them in tight lockstep for better availability and better RPO.
8-) Monitoring: Try to establish a performance baseline for your SharePoint VMs and VI by monitoring the following:
- ESXi Hosts and VMs counters:
Resource | Metric (esxtop/resxtop) | Metric (vSphere Client) | Host/VM | Description
CPU | %USED | Used | Both | CPU used over the collection interval (%)
CPU | %RDY | Ready | VM | CPU time spent in ready state
CPU | %CSTP | Co-Stop | VM | Percentage of time a vCPU spent in ready, co-descheduled state. Only meaningful for SMP virtual machines.
CPU | %MLMTD | | VM | Percentage of time a vCPU was ready to run but was deliberately not scheduled due to CPU limits.
CPU | %SYS | System | Both | Percentage of time spent in the ESX/ESXi Server VMKernel
Memory | Swapin, Swapout | Swapinrate, Swapoutrate | Both | Memory ESX/ESXi host swaps in/out from/to disk (per virtual machine, or cumulative over host)
Memory | MCTLSZ (MB) | vmmemctl | Both | Amount of memory reclaimed from resource pool by way of ballooning
Disk | READs/s, WRITEs/s | NumberRead, NumberWrite | Both | Reads and Writes issued in the collection interval
Disk | DAVG/cmd | deviceLatency | Both | Average latency (ms) of the device (LUN)
Disk | KAVG/cmd | KernelLatency | Both | Average latency (ms) in the VMkernel, also known as "Queuing Time"
Network | MbRX/s, MbTX/s | Received, Transmitted | Both | Amount of data received/transmitted per second
Network | PKTRX/s, PKTTX/s | PacketsRx, PacketsTx | Both | Received/Transmitted Packets per second
Network | %DRPRX, %DRPTX | DroppedRx, DroppedTx | Both | Receive/Transmit Dropped packets per second
- In-guest counters: For all in-guest counters that need to be monitored: http://technet.microsoft.com/en-us/library/ff758658.aspx

Recoverability
1-) Use VMware Site Recovery Manager (SRM) if available for DR. With SRM, automated failover to a replicated copy of the VMs in your DR site can be carried out in case of a disaster or even a failure of a single VM in your SharePoint Farm.
2-) If VMware SRM isn't available, you can leverage some availability features of SQL itself for more recoverability of your backend DB infrastructure. You can either use a mix between AAG Synchronous/Asynchronous replicas or Database Mirroring in High Safety Mode with Log Shipping in the DR Site. This approach leads to lower cost, but with higher management overhead and higher RPO/RTO results than using VMware SRM. For more information about SharePoint Farm DR: http://technet.microsoft.com/en-us/library/ff628971.aspx
3-) For the least protection, use warm clones of your VMs in the DR site that are ready to be powered up and deployed in case of disaster. This approach requires a consistent Backup/Restore cycle of your SharePoint Farm VMs.
4-) Try to leverage native backup techniques in SharePoint. For more information: http://technet.microsoft.com/en-us/library/ee428315(v=office.15).aspx
5-) Try to leverage any backup software that uses Microsoft Volume Shadow Copy Service (VSS). These are SQL-aware and don't cause any corruption in the DB, because they quiesce the DB during the backup operation. In addition, they're SharePoint-aware and use VSS writers to back up any Application or Web Server without any interruption. Of course, one of them is vSphere Advanced Data Protection. Check the following link: http://www.vmware.com/files/pdf/products/vsphere/VMware-vSphere-Data-ProtectionProduct-FAQ.pdf

Manageability
1-) Microsoft support for SharePoint 2013 Virtualization: http://technet.microsoft.com/en-us/library/ff607936(v=office.15).aspx
2-) Try to leverage the vApp feature in vSphere. It can be really helpful in packaging and exporting a group of SharePoint VMs with certain reserved resources for development or testing.
3-) Use vCenter Operations Manager to monitor your environment performance trends, establish a dynamic baseline of your VMs' performance to prevent false static alerts, estimate the capacity required for further scaling and proactively protect your environment against sudden peaks of VM performance that need immediate scaling-up of resources.
4-) Use the SharePoint Product Preparation Tool found on the SharePoint media to install all prerequisites on your SharePoint Server.
5-) Install SharePoint Server binaries on all required Application and Web Server VMs before applying any configuration on any one of them, to achieve configuration consistency and a stable SharePoint farm.
6-) Time Synchronization is one of the most important things in SharePoint environments. It's recommended to do the following:
- Let all your SharePoint VMs sync their time with DCs only, not with VMware Tools.
- Disable time-sync between SharePoint VMs and Hosts using VMware Tools totally (even after unchecking the box in the VM settings page, a VM can sync with the Host using VMware Tools in case of startup, resume, snapshotting, etc.) according to the following KB: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1189
- Sync all ESXi Hosts in the VI to the same Stratum 1 NTP Server, which is the same time source of your forest/domain.
7-) Make sure that you enable "Full Recovery Mode" on all SharePoint DBs that will be included in your SQL AAGs, and also make sure that at least a single Full Backup is taken.

Scalability
1-) SharePoint 2013 Farm contains many SQL DBs that differ in their scalability approaches according to the number allowed of each in the farm, their performance characteristics and the max. recommended size. Generally speaking, Configuration DB and Central Administration DB must be co-located and both will never grow beyond 1 GB. SharePoint 2013 Farm must have only one of each of them and hence, if you have a rare case to expand any of them, you should scale them up, not out. Content DB will grow according to your deployment of your SharePoint 2013 Farm and can grow beyond 1 TB. It's recommended to keep it below 200GB for max. performance. For more scalability, scale out your Web Server and add another Content DB that should also be kept below 200GB, and so on. For more information: http://technet.microsoft.com/en-us/library/cc678868(v=office.15).aspx
2-) Microsoft released some topologies for different sizes of SharePoint environments with the required components. These can be a starting point for you to size your environment to achieve the required performance, scalability and availability levels. Check:
SharePoint 2010: http://technet.microsoft.com/en-us/library/cc263044.aspx
SharePoint 2013: http://www.microsoft.com/en-eg/download/details.aspx?id=30377
3-) Leverage CPU/Memory Hot add with SharePoint VMs to scale them as needed. Some VMs, like SQL Server, may use added resources without a reboot, while others, like: Distributed Cache Server, will need a reboot to use them.
4-) Try to leverage vSphere Templates in your environment. Create your Golden Template for every tier of your VMs. This reduces the time required for deploying or scaling your SharePoint environment as well as preserving consistency of configuration throughout your environment.

SAP HANA

Availability
1-) Leverage vMotion with your SAP HANA VMs. Make sure that the destination host has the required resources to run the migrated VM.
1-) Make sure to enable DRS in Fully Automated Mode on the cluster hosting SAP HANA VMs. SAP HANA supports migration of its VMs using vMotion.
2-) Use DRS Anti-affinity rules for separating SAP HANA VMs apart and use VM-Host Should Affinity rules to keep SAP HANA VMs on their certified ESXi Hosts only.
3-) Try to leverage different SAP HANA High Availability solutions with vSphere HA. Check: http://www.saphana.com/servlet/JiveServlet/previewBody/2775-102-4-9467/HANA_HA_2.1.pdf
4-) Make sure to add the automatic SAP HANA start parameter to the SAP HANA configuration file to enable SAP HANA automatic restart after reboot in case of an HA event.
6-) Try to leverage VM Monitoring to mitigate the risk of Guest OS failure. VMware Tools inside SAP HANA VMs will send heartbeats to the HA driver on the host. If they stop because of a Guest OS failure, the host will monitor IO and network activity of the VM for a certain period. If there's also no activity, the host will restart the VM. This adds an additional layer of availability for SAP HANA VMs.
7-) Try to leverage Symantec Application HA Agent for SAP HANA with vSphere HA for max. availability. Using Application HA, the monitoring agent will monitor SAP HANA instance related services, sending heartbeats to the HA driver on the ESXi host. In case of application failure, it may restart services. If the Application HA Agent can't recover the application from that failure, it'll stop sending heartbeats and the host will initiate a VM restart as an HA action.

Performance
1-) Follow all VMware best practices for Latency-sensitive applications: http://www.vmware.com/files/pdf/techpaper/VMW-Tuning-Latency-Sensitive-Workloads.pdf
2-) Configure the following BIOS Settings on each ESXi Host:
Settings | Recommended Value | Description
Virtualization Technology | Yes | Necessary to run 64-bit guest operating systems.
Turbo Mode | Yes | Balanced workload over unused cores.
Node Interleaving | No | Disables NUMA benefits if set to Yes.
VT-x, AMD-V, EPT, RVI | Yes | Hardware-based virtualization support.
C1E Halt State | No | Disable if performance is more important than saving power.
Power-Saving | No | Disable if performance is more important than saving power.
Virus Warning | No | Disables warning messages when writing to the master boot record.
Hyperthreading | Yes | For use with some Intel processors. Hyperthreading is always recommended with Intel's newer Core i7 processors such as the Xeon 5500 series.
Video BIOS Cacheable | No | Not necessary for database virtual machine.
Wake On LAN | Yes | Required for VMware vSphere Distributed Power Management feature.
Execute Disable | Yes | Required for vMotion and VMware vSphere Distributed Resource Scheduler (DRS) features.
Video BIOS Shadowable | No | Not necessary for database virtual machine.
Video RAM Cacheable | No | Not necessary for database virtual machine.
On-Board Audio | No | Not necessary for database virtual machine.
On-Board Modem | No | Not necessary for database virtual machine.
On-Board Firewire | No | Not necessary for database virtual machine.
On-Board Serial Ports | No | Not necessary for database virtual machine.
On-Board Parallel Ports | No | Not necessary for database virtual machine.
On-Board Game Port | No | Not necessary for database virtual machine.
2-) Remove unnecessary services from the Guest OS, which is SUSE Linux, for example: IPTables, Autofs and cups.
3-) Turn off the SLES kernel dump function (kdump) if it is not needed for specific reasons, for example: a root cause analysis.
4-) Configure the SLES kernel parameter as described below: "net.ipv4.tcp_slow_start_after_idle=0"
5-) Adhere to the shared memory settings as described below.
Deployment Size | Shmmni Value | Physical Memory Size
Small | 4GB | ≥24GB & ≤64GB
Medium | 64GB | ≥64GB & ≤256GB
Large | 53488 MB | >256GB
6-) Set VM settings to "Automatically Choose Best CPU/MMU Virtualization Mode".
7-) CPU Sizing:
- Assign vCPUs as required (using Hot Add feature) and don't over-allocate to the VM, to prevent CPU scheduling issues at hypervisor level and high RDY time.
- Don't over-commit CPUs. It's better to keep Virtual: Physical Cores nearly 1:1 for mission-critical SAP HANA VMs. In some cases, like test environments, over-commitment is allowed to get higher consolidation ratios. Performance monitoring is mandatory in this case to maintain a baseline of normal-state utilization.
- Enable Hyperthreading when available. It won't double the processing power (contrary to what the ESXi host shows as double the number of logical cores), but it'll give a CPU processing boost of up to 10-20%. Don't consider it when calculating the Virtual: Physical Cores ratio.
- ESXi Hypervisor is NUMA aware and it leverages the NUMA topology to gain a significant performance boost. Try to size your SAP HANA VMs to fit inside a single NUMA node to gain the performance boost of NUMA node locality. In addition, SAP is NUMA-aware, so enabling vNUMA on the wide VMs (those that span multiple NUMA nodes) will give better performance.
- For large SAP HANA VMs, pin each vCPU to its NUMA node to prevent migrations from one physical NUMA node to another by setting the following adv. setting in VM Configuration Parameters:
"sched.vcpu0.affinity = "0-19"
sched.vcpu1.affinity = "0-19"
…
sched.vcpu9.affinity = "0-19"
sched.vcpu10.affinity = "20-39"
sched.vcpu11.affinity = "20-39"
…
sched.vcpu19.affinity = "20-39""
8-) Memory Sizing:
- Don't over-commit memory, as SAP HANA is a memory-intensive application. If needed, reserve the configured memory to provide the required performance level. Keep in mind that memory reservation affects aspects like HA Slot Size and vMotion chances and time. In addition, reservation of memory removes VM swapfiles from datastores, so their space is usable for adding more VMs. For some cases, like testing environments, over-commit is allowed after establishing a performance baseline.
- Use Large Memory Pages (aka the HugePages feature in SUSE Linux 11) to give a 10% performance boost to your SAP HANA VMs. It's enabled by default since SUSE Linux 11 SP2.
- As Linux VMs just touch the needed memory pages when booting, setting a memory reservation for them won't allocate all the reserved memory during the booting process. It'll just allocate and reserve the touched memory only.
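Going back to the vCPU pinning settings listed under CPU Sizing: the sched.vcpuN.affinity lines follow a regular pattern, so they can be generated instead of typed by hand. The following is a minimal Python sketch under stated assumptions: the function name is made up for illustration, the vCPUs are assumed to be split evenly across NUMA nodes, and the logical CPU ranges (0-19 and 20-39) are simply the ones used in the example above, so they must be replaced with the ranges of the actual host.

```python
def vcpu_affinity_lines(num_vcpus, numa_node_ranges):
    """Build one 'sched.vcpuN.affinity' line per vCPU.

    numa_node_ranges is a list of (first_logical_cpu, last_logical_cpu)
    tuples, one per physical NUMA node. vCPUs are assigned to nodes in
    equal, consecutive blocks (an assumption of this sketch).
    """
    if not numa_node_ranges:
        raise ValueError("at least one NUMA node range is required")
    if num_vcpus <= 0 or num_vcpus % len(numa_node_ranges) != 0:
        raise ValueError("num_vcpus must be a positive multiple of the node count")

    vcpus_per_node = num_vcpus // len(numa_node_ranges)
    lines = []
    for vcpu in range(num_vcpus):
        first, last = numa_node_ranges[vcpu // vcpus_per_node]
        lines.append(f'sched.vcpu{vcpu}.affinity = "{first}-{last}"')
    return lines


if __name__ == "__main__":
    # 20 vCPUs over two NUMA nodes, matching the example in the notes.
    for line in vcpu_affinity_lines(20, [(0, 19), (20, 39)]):
        print(line)
```

The output is only text to paste into the VM Configuration Parameters; the sketch does not talk to vCenter or change any VM.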
all memory configured should be per-allocated using the following adv.affinity = “20-39” sched. . .vmxSwapEnabled=False” . setting.com/selfservice/microsites/search. . as spindles will not make two reads or writes to process single request. For more information: http://kb. Fault Tolerance (FT) and Cloning. .Separate different SAP HANA VMs’ disks on different –dedicated if needed. . . It’s recommended to use Eager-zeroed Thick disks for DB and Logs disks. between each ESXi host and the Storage Array for max.datastores to avoid IOps contention. performance and throughput and least CPU overhead.VMware vMotion.In order to achieve the absolute lowest possible latency for SAP HANA.N_Port ID virtualization (NPIV). 10-) Monitoring: Try to establish a performance baseline for your SQL VMs and VI by monitoring the following: .Use VMXNet3 vNIC in all SAP HANA VMs for max.“sched.do? language=en_US&cmd=displayKC&externalId=2011861 9-) Network Sizing: . don’t forget memory overhead to be calculated and accounted for.swap.RDM can be used in many cases. .It’s recommended to use “NOOP Scheduler” as your IO scheduler in your SAP HANA Linux VMs. like: P2V migration or to leverage 3 rd Party array-based backup tool.running on mixed VMware ESXi versions. Datastores created using vSphere (Web) Client is natively aligned. through two HBAs. Choosing RDM disks or VMFS-based disks are based on your technical requirements. . as it won’t support the following: .Partition Alignment gives a performance boost to your backend storage.ESXi Hosts and VMs counters: Resource Metric (esxtop/resxtop) Metric (vSphere Client) Host/ VM Description CPU %USED %RDY Used Ready Both VM CPU used over the collection interval (%) CPU time spent in ready state . vDS also provides many advanced features –that don’t exist in Standard Switch-. 
each with different IOps requirements.Try to leverage vSphere Distributed Switch (vDS) to preserve consistency in your network configuration between all ESXi Hosts.As SAP HANA instances usually need large memory reservation. . performance. . No performance difference between these two types of disks. .prealloc=True sched.Don’t use IBM GPFS with your virtualized SAP HANA instances. 8-) Storage Sizing: . For large-memory VMs.Use Paravirtual SCSI Driver in all of your SAP HANA VMs for max. performance paralleling and higher IOps. like: Private VLANs and NetFlow. least latency and least CPU overhead. .vmware. it recommended to set the latency to in VM adv.Distribute any SAP HANA VM disks on the four allowed SCSI drivers for max.Provide at least 4 paths. memory overhead can be several GBs of memory. as SAP HANAis an IO-intensive application with many components. IBM GPFS supports only running with Physical-mode RDM.mem. Distributed Resource Scheduler (DRS). availability. co-descheduled state. With SRM. Percentage of time a vCPU was ready to run but was deliberately not scheduled due to CPU limits. Only meaningful for SMP virtual machines. Swapoutrate vmmemctl N%L Disk READs/s. 2-) SAP has released use of parallel SAP HANA VMs on VMware vSphere 5. also known as Queuing Time‖ Aborts are issued by the virtual machine because the storage is not responding.5 and SAP HANA SPS 7. Host Profiles preserve configuration consistency between ESXi Hosts in the cluster which is crucial for a cluster hosting some SAP HANA instances to achieve high performance. For Windows virtual machines. or when the storage array is not accepting I/O. Reads and Writes issued in the collection interval 1-) SAP HANA instance virtualization is supported for production with vSphre 5. DroppedTx Percentage of time a vCPU spent in read. allowing selected customers. WRITEs/s DAVG/cmd KAVG/cmd Managea bility Recovera RESET/s MbRX/s. 
If the virtual machine has memory size greater than the amount of memory local to each processor. 3-) It’s recommended to use vSphere Host Profiles while configuring ESXi Hosts that will host SAP HANA instances. the virtual machine is experiencing poor NUMA locality. the ESXi scheduler does not attempt to use NUMA optimizations for that virtual machine. MbTX/s PKTRX/s. Transmitted PacketsRx. %DRPTX Both Both Both VM NumberRead. or cumulative over host) Amount of memory reclaimed from resource pool by way of ballooning If less than 80. Amount of data received/transmitted per second Both Received/Transmitted Packets per second Both Receive/Transmit Dropped packets per second VM Received. The number of command resets per second. NumberWrite deviceLatency KernelLatency ABRTS/s Networ k VM Both Both Both VM Both Average latency (ms) of the device (LUN) Average latency (ms) in the VMkernel. PKTTX/s %DRPRX. this happens after a 60-second default.%CSTP Memor y Co-Stop %MLMTD VM %SWPWT VM %SYS Swapin. This issue can be caused by path failure. 2-) 1-) Use VMware Site Recovery Manager (SRM) if available for DR. Percentage of time spent in the ESX/ESXi Server VMKernel Memory ESX/ESXi host swaps in/out from/to disk (per virtual machine. This can indicate overcommitted memory.5 into controlled availability. automated failover to a . Virtual machine waiting on swapped pages to be read from disk. Swapout MCTLSZ (MB) System Swapinrate. depending on their scenarios and system sizes to go live with this configuration. PacketsTx DroppedRx. bility Scalabilit y Java Enterpris e Applicatio ns replicated copy of the VMs in your DR site can be carried over in case of a disaster or even a failure of single VM in your SAP HANA environment. it may restart services and any dependent resources.com/files/pdf/techpaper/VMW-Tuning-Latency-Sensitive-Workloads. the host will monitor IO and network activity of the VM for certain period. Disables NUMA benefits if set to Yes. 
the VM will be migrated to respect the rule.respectVmVmAntiAffinityRules” set to 1 to respect all VMs Affinity/Anti-affinity rules respectively. AMD-V. This add additional layer of availability for Java VMs. Performa nce 1-) Follow all VMware best practices for Latency-sensitive applications: http://www. it’ll not respect the anti-affintiy rule. when HA restart a VM. Availabili ty 1-) Try to use vSphere HA with your Java VMs to provide decent level of availability. from the same tier.1. configure the vSphere Cluster with “ForeAffinePowerOn” option set to 1 to respect all VMs Affinity/Anti-affinity rules. RVI C1E Halt State Power-Saving Virus Warning Yes Necessary to run 64-bit guest operating systems. sending heartbeats to HA driver on ESXi host. on different Racks. Keep in mind that. VMware Tools inside Java VMs will send heartbeats to HA driver on the host. 6-) Try to leverage VM Monitoring to mitigate the risk of Guest OS failure. Create your Golden Template for every tier of your VMs. Disable if performance is more important than saving power. Blade Chassis and Storage Arrays if available using VMs Affinity/Anti-affinity rules and Storage Affinity/Anti-affinity rules for most availability. the monitoring agent will monitor Oracle instance and its services.5. No No No Disable if performance is more important than saving power. Disables warning messages when writing to the master boot record. the host will restart the VM. In case of application failure. 1-) Try to leverage vSphere Templates in your environment. availability. but on the following DRS invocation. it’ll stop sending heartbeats and the host will initiate a VM restart as a HA action. configure the vSphere Cluster with both “ForeAffinePowerOn” & “das.pdf 2-) Configure the following BIOS Settings on each ESXi Host: Settings Recomme Description nded Value Virtualization Technology Turbo Mode Node Interleaving VT-x. Hardware-based virtualization support. 
7-) Try to leverage Symantec Application HA Agent for Oracle with vSphere HA for max. Yes No Yes Balanced workload over unused cores. 2-) Try to separate your Java VMs. If Application HA Agent can’t recover the application from that failure. EPT. If there’s also no activity. This reduces the time required for deploying or scaling your SharePoint environment as well as preserve consistency of configuration throughout your environment. If it’s stopped because Guest OS failure. In vSphere 5. Using Application HA. In vSphere 5.vmware. . . reserve the configured memory to provide the required performance level. In addition. No No No No Not Not Not Not No Not necessary for database virtual machine.Enable Hyperthreading when available. Not necessary for database virtual machine.CPU Over-commit is allowed in Java Enterprise Applications VMs so that total physical CPUs utilization doesn’t exceed 80%.Don’t over-commit memory. machine. over-commitment is allowed to get higher consolidation ratios. 2-) During testing and planning phase. No Not necessary for database virtual machine. its space is usable for adding more VMs. machine. like: HA Slot Size. Yes Yes For use with some Intel processors. Hyperthreading is always recommended with Intel’s newer Core i7 processors such as the Xeon 5500 series. It won’t double the processing power –in opposite to what shown on ESXi host as double number of logical cores.ESXi Hypervisor is NUMA aware and it leverages the NUMA topology to gain a significant performance boost. vMotion chances and time. machine. 2-) CPU Sizing: . 2-) Memory Sizing: .but it’ll give a CPU processing boost up to 2025% in some cases. Try to size your Java VMs to fit inside single NUMA node to gain the performance boost of NUMA node locality.Assign vCPUs as required and don’t over-allocate to the VM to prevent CPU Scheduling issues at hypervisor level and high RDY time. 
necessary necessary necessary necessary for for for for database database database database virtual virtual virtual virtual machine. Not necessary for database virtual machine. Keep in mind that memory reservation affects as aspects. . try to establish a baseline of HTTP requests: Java Heaps: DB Connections required. . reservation of memory removes VM swapfiles from datastores and hence. as Java Applications are memory-intensive. For some cases. If needed. Performance monitoring is mandatory in this case to maintain a baseline of normal-state utilization. No Not necessary for database virtual machine. like testing environments. Required for vMotion and VMware vSphere Distributed Resource Scheduler (DRS) features.Hyperthreading Yes Video BIOS Cacheable Wake On LAN Execute Disable No Video BIOS Shadowable Video RAM Cacheable On-Board Audio On-Board Modem On-Board Firewire On-Board Serial Ports On-Board Parallel Ports On-Board Game Port No Required for VMware vSphere Distributed Power Management feature. . Performance monitoring is mandatory in this case to maintain a baseline of normal-state utilization. Don’t consider it when calculating Virtual: Physical Cores ratio. the ESXi scheduler does not attempt to use NUMA optimizations for that virtual machine. NumberWrite deviceLatenc y KernelLatenc y . If the virtual machine has memory size greater than the amount of memory local to each processor. Host’d swap memory pages from physical memory to VM swap file on disk without knowing what these pages contain or if these pages are required or idle. It can give a performance boost for your Java VMs. don’t over-commit memory for business critical Java VMs and if you’ll do some overcommitment.Configure Memory Reservation on your Java VMs according to: “Reserved memory= VM Memory= Guest OS Memory+ Java Memory (JVM Memory)” . Balloning hit to the performance is somehow much lower than Swapping to disk. Generally speaking as mentioned in the previous point. 
Swapping to disk is done by ESXi Host itself. don’t push it to these extreme limits. Ballon Driver will force Guest OS to swap the idle memory pages to disk to return the free memory pages to the Host. Swapout MCTLSZ (MB) System Swapinrate.e. When Balloning is needed. Swapoutrate vmmemctl N%L Disk READs/s. Virtual machine waiting on swapped pages to be read from disk. Balloning is the last line of defense of ESXi Host before compression and swapping to disk when its memory becomes too low. Reads and Writes issued in the collection interval Both Average latency (ms) of the device (LUN) Both Average latency (ms) in the VMkernel. i. swapping is done according to Guest OS techniques.. also known as Queuing Time‖ Memor y %MLMTD VM %SWPWT VM %SYS Swapin. WRITEs/s DAVG/cmd KAVG/cmd Both Both Both VM NumberRead.ESXi Hosts and VMs counters: Resource Metric (esxtop/resxtop) Metric (vSphere Client) Host/ VM Description CPU %USED %RDY %CSTP Used Ready Co-Stop Both VM VM Both CPU used over the collection interval (%) CPU time spent in ready state Percentage of time a vCPU spent in read. 10-) Monitoring: Try to establish a performance baseline for your SQL VMs and VI by monitoring the following: . or cumulative over host) Amount of memory reclaimed from resource pool by way of ballooning If less than 80. Keep in mind not to configure all Java VM configured memory as Large Pages and leave some memory to be used by small pages for processes that can’t leverage Large Pages. the virtual machine is experiencing poor NUMA locality.Don’t disable Ballon Driver installed with VMware Tools. This can indicate overcommitted memory. Only meaningful for SMP virtual machines. Percentage of time spent in the ESX/ESXi Server VMKernel Memory ESX/ESXi host swaps in/out from/to disk (per virtual machine. co-descheduled state.Leverage Large Memory Pages feature. Percentage of time a vCPU was ready to run but was deliberately not scheduled due to CPU limits. . 
try to use symmetric VMs in your scaling.Disable time-sync between SQL VMs and Hosts using VMware Tools totally (Even after uncheck the box from VM settings page. 2-) When increasing your VM Java Heap Size.Sync all ESXi Hosts in the VI to the same Startum 1 NTP Server which is the same time source of your forest/domain. Load balancers aren’t aware of VMs sizing and hence. 3-) When adapting Scale-out approach of your Java VMs. VM can sync with the Host using VMware Tools in case of startup. This issue can be caused by path failure.com/files/pdf/solutions/VMware-Virtualizing-Business-Critical-Apps-on-VMware_en-wp. not with VMware Tools. this happens after a 60-second by default.ABRTS/s Networ k RESET/s MbRX/s. resume. http://www.Let all your SQL VMs sync their time with DC’s only. performance. MbTX/s PKTRX/s.vmware.pdf . 2-) Try to leverage vSphere-aware load balancers that can integrate with vSphere API and automatically add newly added VM to its corresponding load-balancing pool. so that load balancers can effectively load-balancing requests between them. non-symmetric VMs would lead to non-efficient load-balancing unless you configure VMs sizes on your load balancers. PacketsTx DroppedRx. For Windows virtual machines. etc. Scalabilit y 1-) Leverage CPU/Memory Hot add with Java VMs to scale them up as needed. %DRPTX VM Received. . DroppedTx VM Both Aborts are issued by the virtual machine because the storage is not responding.vmware. or when the storage array is not accepting I/O. Transmitted PacketsRx.do? language=en_US&cmd=displayKC&externalId=1189 . It’s recommended to do the following: . The number of command resets per second. 3-) Use load balancers with known algorithms that you can understand so that you can test and configure them efficiently to make sure that each VM has equal share of requests and load is balanced. snapshotting. PKTTX/s %DRPRX. increase your vCPUs for max.) according to the following KB: http://kb. 
Amount of data received/transmitted per second Both Received/Transmitted Packets per second Both Receive/Transmit Dropped packets per second Managea bility 10-) Time Synchronization is one of the most important things in SQL environments.com/selfservice/microsites/search. which is time consuming.
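The Java VM sizing rules in this section (the memory reservation formula, fitting a VM inside a single NUMA node, and keeping scaled-out VMs symmetric) can be sketched as a few small checks. This is only an illustrative sketch: the `JavaVM` class, the function names and the example node size (10 cores, 64 GB) are assumptions made for the example and do not come from any VMware tool or API.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class JavaVM:
    """Hypothetical description of one Java VM (names are illustrative only)."""
    name: str
    vcpus: int
    guest_os_mb: int  # memory needed by the guest OS itself
    jvm_mb: int       # total Java (JVM) memory, as measured in your environment

    @property
    def reserved_mb(self) -> int:
        # Reserved memory = VM memory = Guest OS memory + Java memory (JVM memory)
        return self.guest_os_mb + self.jvm_mb


def fits_numa_node(vm: JavaVM, node_cores: int, node_mem_mb: int) -> bool:
    """True if both the vCPUs and the reserved memory fit inside one NUMA node."""
    return vm.vcpus <= node_cores and vm.reserved_mb <= node_mem_mb


def is_symmetric(pool: Iterable[JavaVM]) -> bool:
    """True if every VM in a load-balanced pool has the same vCPU and memory size."""
    sizes = {(vm.vcpus, vm.reserved_mb) for vm in pool}
    return len(sizes) <= 1


if __name__ == "__main__":
    # Assumed example values: two identical VMs on hosts with 10-core / 64 GB NUMA nodes.
    pool = [
        JavaVM("app-01", vcpus=4, guest_os_mb=1024, jvm_mb=7168),
        JavaVM("app-02", vcpus=4, guest_os_mb=1024, jvm_mb=7168),
    ]
    for vm in pool:
        ok = fits_numa_node(vm, node_cores=10, node_mem_mb=65536)
        print(vm.name, "reservation (MB):", vm.reserved_mb, "fits one NUMA node:", ok)
    print("symmetric pool:", is_symmetric(pool))
```

Treating the JVM memory as a single measured input keeps the sketch aligned with the formula as written; how that figure breaks down inside the JVM has to come from your own baseline.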