Jetstress Field Guide v2.0.0.8

March 23, 2018 | Author: alyakhov8429 | Category: Hyper V, Databases, Microsoft Exchange Server, Operating System, Copyright


Jetstress 2013
Jetstress Field Guide

Wednesday, 26 February 2014
Version 2.0.0.8 [Issued]

Prepared for Exchange Community
Prepared by [email protected]
Template Version October 2011

MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS DOCUMENT.

Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, our provision of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

The descriptions of other companies’ products in this document, if any, are provided only as a convenience to you. Any such references should not be considered an endorsement or support by Microsoft. Microsoft cannot guarantee their accuracy, and the products may change over time. Also, the descriptions are intended as brief highlights to aid understanding, rather than as thorough coverage. For authoritative descriptions of these products, please consult their respective manufacturers.

© 2011 Microsoft Corporation. All rights reserved. Any use or distribution of these materials without express authorization of Microsoft Corp. is strictly prohibited.

Microsoft and Windows are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. The names of actual companies and products mentioned herein may be the trademarks of their respective owners.
"Document1" last modified on 26 Feb. 14, Rev 2

Revision and Signoff Sheet

Change Record
Author (all entries): Neil Johnson
Dates: 22/03/2013, 03/04/2013, 19/06/2013, 20/06/2013, 20/06/2013
Versions: 2.0.0.1, 2.0.0.2, 2.0.0.5, 2.0.0.6, 2.0.0.7
Change references:
- First draft for Jetstress 2013
- Updates after feedback from Robert Gillies and Ramone Infante
- Final issue after internal review
- Updated Error Table description with JET codes
- Added troubleshooting information for ESE 606
- Fixed formatting issues

Document Contributors
Names: Neil Johnson, Alexandre Costa, Ross Smith IV, Umair Ahmad, Matt Gossage, Ramon B. Infante
Sections authored: Jetstress internals, Configuring Jetstress, Various

Reviewers
Names: Neil Johnson, Alexandre Costa, Ross Smith IV, Umair Ahmad, Nathan Muggli, Scott Schnoll, Boris Lokhvitsky, Jeff Mealiffe, Robert Gillies, David Mosier, Ramon B. Infante, Matt Gossage
Version reviewed: 2.0.0.1
Table 1: Document reviewers

Table of Contents
1 Purpose
2 What is New in Jetstress 2013
3 Introduction to Jetstress
4 Jetstress Internals
  4.1 Main Jetstress Components
    4.1.1 Auto Tuning Component
    4.1.2 Thread Dispatcher
    4.1.3 Background Log Checksummer
    4.1.4 Offline Log and Database Checksummer
    4.1.5 Reporting and Verification
5 Planning for Jetstress
  5.1 High Level Test Overview
    5.1.1 Jetstress testing flow chart
    5.1.2 Process with Automatic thread tuning
  5.2 When should I run Jetstress in my project?
  5.3 Where should I run Jetstress in my infrastructure?
  5.4 Failure Mode Testing
    5.4.1 Raid Array Testing
    5.4.2 Resilient Component Testing
    5.4.3 Example of a failed degraded mode test
  Jetstress testing inside virtual machines
    What is different about Jetstress inside a virtual machine?
  How much time should I allocate for Jetstress testing?
    Documentation
    Preparing for the Jetstress test
    Initialisation
    Testing
    Clean-up
  What happens if the test fails?
6 Installing Jetstress
  Prerequisites
  Jetstress Version and Download
  Getting ESE Files necessary for Jetstress
    File locations from the installation media
    File locations from an installed Exchange Server
  Installation
    Application Installation
    ESE File Installation
7 Configuring Jetstress
  Jetstress Test Types
    Test a disk subsystem throughput
    Test an Exchange mailbox profile
  Initial configuration
  Target design values
  Interpreting Jetstress test results
8 Jetstress Output Files
9 Reading Jetstress report data
  Reading the Jetstress Test Result Report
    Test Summary
    Database Sizing and Throughput
    Jetstress System Parameters
    Database Configuration
    Transactional I/O Performance
    Background Database Maintenance I/O Performance
    Log Replication I/O Performance
    Total I/O Performance
    Host System Performance
    Error Counts Per Volume
    Test Log
  Test evaluation
10 Appendix A – Configuring thread count
11 Appendix B – Configuring sluggishsessions
12 Appendix C - Running a Jetstress Test with JetstressCmd.exe
13 Appendix E – Running Jetstress on a production server
14 Troubleshooting Jetstress
  14.1 Common Issues
    Jetstress cannot attach to or create a database
    Unable to tune for the parameters
    Error loading Performance Monitor counters
    Unable to mount databases due to invalid mount point configuration
    Jetstress testing failed. Error: System.ApplicationException: Faulty performance counter paths: \MSExchange Database(*)\*

1 Purpose

This document is intended to explain the process and requirements for validating an Exchange 2013 storage solution prior to releasing an Exchange deployment into production. It will explain how Jetstress works, how to plan for and perform a Jetstress test, and how to analyse the results of the test.

This document is not intended to provide Exchange storage design guidance. For guidance on Exchange 2013 server design and planning, refer to Planning and Deployment.

2 What is New in Jetstress 2013

Jetstress 2013 is an evolution of Jetstress 2010. It has some improvements and bug fixes, and it allows validation of Exchange Server 2013 solutions. A quick outline of new features:

- The Event log is captured and logged to the test log. These events show up in the Jetstress UI as the test is progressing.
- Any errors are logged against the volume on which they occurred. The final report shows the error counts per volume in a new sub-section.
- A single IO error anywhere will fail the test. Detects -1018, -1019, -1021, -1022, -1119, hung IO, DbtimeTooOld, DbtimeTooNew.
- In case of CRC errors, they might be remapped. A re-run of Jetstress should verify that they indeed were remapped.
- Threads, which generate IO, are now controlled at a global level. Instead of specifying Threads/DB, you now specify a global thread count, which works against all databases. This improves the granularity of thread tuning and enables automatic tuning to work more effectively.

Important Changes

- Do not use Jetstress 2013 for older versions of Exchange Server. Jetstress 2013 has only been tested with Exchange Server 2013.
- Jetstress configuration files (JetstressConfig.XML) generated from an older version of Jetstress are no longer allowed.

3 Introduction to Jetstress

Jetstress is a tool for simulating Exchange database I/O load without requiring Exchange to be installed. It is primarily used to validate physical deployments against the theoretical design targets that were derived during the design phase.

To simulate the complex Exchange database I/O pattern effectively, Jetstress makes use of the same ESE.DLL that Exchange uses in production. It is therefore vital that Jetstress uses the same version of the Extensible Storage Engine (ESE) files that your Exchange infrastructure will be built with in production.

Fundamentally, a successful Jetstress test validates that all of the hardware and software components within the I/O stack, from the operating system down to the physical disk drive, are working to a sufficient level to meet the predicted performance required by Exchange to operate successfully.

Jetstress testing provides the following benefits prior to deploying live users:

- Validates that the physical deployment is capable of meeting specific performance requirements
- Validates that the storage design is capable of meeting specific performance requirements
- Finds weak components prior to deploying in production
- Proves storage and I/O stability

The most important aspect of Jetstress testing is that it allows you to see how the physically deployed storage and server infrastructure will behave once a real Exchange workload is applied. This often works out differently from expectations, especially in scenarios where shared storage infrastructure is deployed or where the storage design is complex.

Ideally, Jetstress testing will be part of the overall project plan. The best time to schedule Jetstress testing is just before Exchange will be physically installed onto the servers.

Often the Jetstress test will not provide the results that were expected. It is important to remember that when the Jetstress test reports a failure, Jetstress has not failed. Jetstress is just reporting on the performance of your storage solution. This may seem an obvious point, however a large number of customer escalation cases for Jetstress are not actually Jetstress cases and are instead storage performance cases.

If you need to remediate a test failure, remember that Jetstress is a dumb tool that is used worldwide by thousands of Exchange professionals and in Office 365. It is extremely unlikely that Jetstress is broken; it is far more likely that you have a design issue or misconfiguration with your storage deployment. Sometimes, by making subtle configuration changes to the storage infrastructure (for example, driver or firmware updates), it is then possible to get the test to pass.

Important: The validity of your Jetstress testing is only as good as the user profile analysis and workload prediction that was completed during the design phase of the project.

4 Jetstress Internals

4.1 Main Jetstress Components

Like Exchange, Jetstress is an ESE-based application. It runs in user memory space and makes API calls to ESE, which in turn makes calls to the Windows File system and I/O Manager to gain access to the data stored on disk. During each of these tasks Windows records performance information about the specific task and the operating system as a whole. Once the test is completed, Jetstress analyses the performance data to determine if the system meets the targets specified at the beginning of the test.

Figure 1 - Main Jetstress Components (diagram labels: Jetstress Application, Auto tuning, Thread Dispatcher, Transactional I/O, Background Database Maintenance, Background Log Checksummer, Offline Log & Database Checksummer, Reporting and Verification, Extensible Storage Engine (ESE), Windows Operating System, Windows I/O Manager, Windows Performance Counters, Device Drivers, Hardware Performance Data, Storage Subsystem)

4.1.1 Auto Tuning Component

This component is responsible for auto tuning within Jetstress. Each thread performs a set amount of ESE calls, which generates a set amount of disk I/O. By raising or lowering thread count, the storage workload can be modified. The auto-tuning component attempts to determine the maximum thread count that the storage solution can support, whilst remaining within the published disk latency guidelines for Exchange Server. The Jetstress test parameters for disk latency are shown in section 8.3 Interpreting Jetstress test results.

New: Auto tuning has been improved in Jetstress 2013 by moving to a global thread count. Auto-tuning may still fail, however it should be successful in many more scenarios than in 2010.

4.1.2 Thread Dispatcher

The thread dispatcher is responsible for managing workload within Jetstress. The main areas of interest within the thread dispatcher are as follows:

- ThreadCount: number of transactional threads globally. In Exchange 2013 this is a global parameter (prior to Exchange 2010, it used to be the number of threads per storage group, and in Exchange 2010 it was the number of threads per database).
- ThreadTypes: each of those threads chooses to do one type of work against the database. There are four types: insert, read, update and delete (all of those against records on a table). The same thread can perform different types of work during a given run. The default operation mix for an Exchange 2010 simulation is: 40%, 35%, 5% and 20%, respectively.
- SluggishSessions: the default is 1. This is usually used to fine tune the amount of work performed by a given thread. Internally, a thread sleeps for (SluggishSessions * TaskRunTime) before picking up the next task to run. For example, if you have 3 for SluggishSessions and an insert thread took 100ms in the last cycle, it will sleep for 300ms before moving on to the next cycle. Of course, 0 means “go full throttle”.

4.1.3 Background Log Checksummer

This component simulates the I/O overhead of additional database copies. This copy operation has an I/O cost which increases with each additional copy. It also provides performance data for CRC checksum speed should VSS copies require a checksum prior to backup.

4.1.4 Offline Log and Database Checksummer

This process checksums all database and log files at the end of a Jetstress run to ensure that all data is intact. This process is extremely hard on storage hardware, often applying an I/O load many times greater than the workload that the actual Jetstress test applies.

While working out the correct thread count to use, it is not necessary to let the checksum part of the test complete. To stop the checksum you can either click on cancel, which will stop the checksum part of the test but still generate the performance test report, or edit the Jetstress configuration file and change the VerifyChecksum value to false (default is true).

<VerifyChecksum>false</VerifyChecksum>

Important: If you are running Jetstress on multiple servers in parallel on shared storage infrastructure, it is vital that the CRC check is not running while other servers are performing their Jetstress tests. Selecting the “multi-host” option during the test configuration causes the testing process to stop and wait for confirmation before beginning the CRC check, to avoid servers interfering with each other’s results.

4.1.5 Reporting and Verification

At the end of a Jetstress test, the reporting and verification process compares the observed performance results against a set of acceptable values. These results are then written to an HTML file. During the test, binary performance data is written out to a BLG file.

5 Planning for Jetstress

Jetstress testing can be difficult to account for in your planning process. In particular, how much time should be allocated for testing, and in which parts of the project should Jetstress testing occur? This section will try to answer some of these questions and explain the process in more detail.

5.1 High Level Test Overview

Figure 2 - High Level Test Overview shows a high-level flowchart for Jetstress testing. The process begins with a completed Mailbox Role Calculator and ends when the test has passed successfully while meeting the targets identified in the calculator.

Figure 2 - High Level Test Overview (flowchart: Complete Mailbox Role Calculator, Begin Testing, Jetstress Testing, Test Pass?, Achieved IOPS?, Validation Complete; a "No" at either decision leads to Remediation / Reconfiguration)

5.1.1 Jetstress testing flow chart

The aim of the following process is to find the maximum workload while still passing the test. Fundamentally, the aim is to increase workload until the test fails or meets the design goals identified in the Mailbox Role Calculator. The following process assumes that you are using the disk subsystem throughput test and autotuning as recommended.

Important: The last value before failure is the highest workload that the system can support. If this value is below the design target, then use sluggishsessions to fine-tune the test. If the storage is still unable to meet the requirements, then we have determined that it is unsuitable for the workload intended.
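The search described in section 5.1.1 (raise the workload until the test fails or the design goal is met) can be sketched in Python. This is a minimal illustration under stated assumptions, not part of Jetstress: `RunResult`, `run_test` and `fake_storage` are hypothetical names, and in practice each call to `run_test` would be a real Jetstress run whose pass/fail result and achieved IOPS are read from the test report.

```python
from typing import Callable, NamedTuple, Optional, Tuple


class RunResult(NamedTuple):
    """Outcome of one test run (hypothetical structure for illustration)."""
    passed: bool          # did the run stay within the latency and error limits?
    achieved_iops: float  # transactional IOPS measured during the run


def find_max_workload(
    run_test: Callable[[int], RunResult],
    target_iops: float,
    start_threads: int = 1,
    max_threads: int = 64,
) -> Optional[Tuple[int, RunResult]]:
    """Raise the thread count until a run fails or the target IOPS is met.

    Returns (thread_count, result) for the last passing run, or None if
    even the first run fails.
    """
    last_pass = None
    for threads in range(start_threads, max_threads + 1):
        result = run_test(threads)
        if not result.passed:
            break  # the previous passing value is the highest supported workload
        last_pass = (threads, result)
        if result.achieved_iops >= target_iops:
            break  # design goal met, no need to push harder
    return last_pass


def fake_storage(threads: int) -> RunResult:
    """Stand-in for a real run: 100 IOPS per thread, fails above 12 threads."""
    return RunResult(passed=threads <= 12, achieved_iops=threads * 100.0)


if __name__ == "__main__":
    best = find_max_workload(fake_storage, target_iops=1200)
    print(best)
```

If the returned run still sits below the design target, the storage cannot meet the requirement at whole-thread granularity, which is the point at which the guide suggests fine-tuning with sluggishsessions.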
5.1.2 Process with Automatic thread tuning

Figure 3 - Jetstress test flowchart for automatic thread tuning (flowchart nodes: Testing Begins, Test Initialisation, AutoTuning sets thread count, Perform 15 minute test, Test Pass?, IOPS Sufficient?, Perform 2hr strict mode test, Test Pass?, Perform 24hr lenient mode test, Test Pass?, Test Results, Testing Ends; failure branches: Non Latency Error?, IOPS Exceeded?, Increase thread count manually, Reduce thread count manually, Record failure results and Retest, Storage Solution Requires Remediation)

5.2 When should I run Jetstress in my project?

Jetstress testing can often take place at multiple phases within the project plan. Depending on the design approach taken, Jetstress testing may be performed during both the planning (design) and build phases of a project.

Figure 4 - SDM phase overview

So, why would you run Jetstress during the planning/design phase of a project? The simple answer is that with today’s powerful hardware, Exchange design teams must use standard “chunks” of hardware to create their design. Rather than attempt to guess what the I/O limits of the hardware are, it is preferable to perform some Jetstress tests on the hardware to determine the maximum storage IO capacity of the system. This allows the design team to specify the bill of materials much more precisely, thereby saving money and reducing risk.

However, if you have already proven the solution in the lab, why test again at build time? This is a common question. Many projects only schedule sufficient time for testing a single server and its storage solution with the belief that they only need to validate the design. The problem with this approach is that it assumes a zero error rate in the build out. What happens if someone forgets a part of the build on one server?
Alternatively, what if someone deploys a different device driver from the one used in the lab? What happens if a faulty piece of hardware has been deployed? Jetstress testing at build time is a great way to validate that the physically deployed hardware and software are capable of providing the required I/O performance for Exchange.

Jetstress testing at build time is also a way to identify failing components such as disk drives; it is much less stressful to identify a weak batch of disks during a Jetstress test than on a Monday morning after a large user migration! If the project plan will allow it, build in sufficient time to test each server and storage chassis that will be deployed before migrating user mailboxes to it. Remember that Jetstress can be fully automated, so with a little bit of planning it can be left to run overnight and may not actually add any significant overhead to the project.

5.3 Where should I run Jetstress in my infrastructure?

To ensure that the Jetstress test is representative of production, it is recommended to run Jetstress on every set of disks that will hold mailbox database copies (active, passive or lagged). The test is designed to validate the storage system, so where multiple Exchange servers use the same storage system, you must test them in parallel to simulate the production workload. If the storage system also supports additional workload, you should use IOMeter to simulate this if it is not yet active on the storage system at the time of testing.

Note: It is important to remember not to run Jetstress on production servers that have Exchange Server already installed. This may lead to problems with Exchange performance counters. It is recommended to run Jetstress BEFORE installing Exchange Server into production.
In the event that you have already installed and configured Jetstress on your production Exchange Servers, refer to the following article for more information on resolving Exchange Performance Counter problems: http://blogs.technet.com/b/mikelag/archive/2010/09/10/how-to-unload-reloadperformance-counters-on-exchange-2010.aspx

Each database copy must be designed to provide sufficient I/O to support the copy if it were to become active. Therefore, by testing each database LUN in parallel, we are validating that the storage solution is able to meet the design requirements. We are also validating that any pieces of shared infrastructure are able to meet the demand of the entire solution, rather than simply testing each server individually.

Note: Where there is no shared infrastructure and all storage is directly attached, servers may be tested individually. However, to be a valid test, the test must be configured to include any active, replica or lagged LUNs that could come online at the same time.

5.4 Failure Mode Testing

5.4.1 Raid Array Testing

Since the improvements in Exchange I/O from Exchange 2007, it is now viable to deploy Exchange Server databases on a multitude of storage types, from JBOD to RAID 6. RAID arrays offer a great compromise between data redundancy and performance. However, they can also suffer from a significant performance reduction when operating in degraded mode (spindle failure). Due to this, it is recommended to design RAID arrays that will host Exchange Server databases such that the RAID array provides sufficient IOPS performance for the Exchange workload when running in degraded mode.

Important: While testing for failure scenarios it is not necessary to run your Jetstress test at peak working load.
Instead, it is recommended to modify the thread count until the Jetstress test achieves just above the Total Database Required IOPS / Server value reported in the Mailbox Role Calculator.

From a service availability perspective, it is important to validate that your storage can provide sufficient performance in all common failure conditions. For this reason, it is recommended to run the Jetstress test while the array is operating in each of the following conditions.

Array Condition | Test importance | Description
Optimal | Recommended for all deployments | All disk spindles operating normally
Degraded | Recommended for all deployments | Single spindle removed from the array
Rebuilding | Recommended if array has hot spare [1] | Failed spindle replaced and array controller is rebuilding the array

Table 2: RAID array testing conditions

Ideally, the Jetstress test should still pass during a degraded mode test. If the test fails, refer to this post to analyse the failure severity.

5.4.2 Resilient Component Testing

Any aspect of the storage solution that has been designed to be resilient should also be tested in a failed state to determine the impact. For example, if there are multiple paths between the host and the storage controller, the Jetstress test should still pass if one is disabled. Since there are so many possible types of resilient components, it is impossible to list them here. However, the general spirit of this test is to evaluate potential sources of failure within your storage solution and ensure that Jetstress still passes if they enter a degraded state.

[1] If your array does not contain a hot spare, you can choose to perform array rebuilds out of hours so that the end user impact is minimized; however, your data loss exposure is increased. If you plan on performing array rebuilds during working hours, it is recommended to perform a Jetstress test run while the array is rebuilding, even if you do not have a hot spare configured.
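The failure-mode guidance in 5.4.1 can be sketched as a small checklist helper. This is only an illustration, not part of Jetstress: the ArrayRun type, the evaluate_failure_modes function and all numbers are hypothetical, and treating an achieved value equal to the target as a pass is an assumption. It takes the overall pass/fail result and achieved transactional IOPS from each Jetstress report and compares them with the Total Database Required IOPS / Server value for each array condition that should be tested.

```python
from dataclasses import dataclass


@dataclass
class ArrayRun:
    condition: str          # "Optimal", "Degraded" or "Rebuilding" (Table 2)
    achieved_iops: float    # achieved transactional IOPS from the Jetstress report
    jetstress_passed: bool  # overall pass/fail result from the Jetstress report


def evaluate_failure_modes(runs, required_iops, has_hot_spare):
    """Return a verdict for every array condition that should be tested."""
    expected = ["Optimal", "Degraded"]
    if has_hot_spare:
        # A rebuild test is recommended when the array has a hot spare.
        expected.append("Rebuilding")
    by_condition = {run.condition: run for run in runs}
    verdicts = {}
    for condition in expected:
        run = by_condition.get(condition)
        if run is None:
            verdicts[condition] = "not tested"
        elif run.jetstress_passed and run.achieved_iops >= required_iops:
            # Assumption: meeting the target exactly counts as a pass.
            verdicts[condition] = "pass"
        else:
            verdicts[condition] = "fail"
    return verdicts


# Hypothetical figures for illustration only.
runs = [
    ArrayRun("Optimal", 1256.0, True),
    ArrayRun("Degraded", 1100.0, True),
]
verdicts = evaluate_failure_modes(runs, required_iops=1200.0, has_hot_spare=True)
print(verdicts)
```

With these made-up inputs the degraded run falls short of the target and the rebuild test has not been run yet, so both would need attention before sign-off.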
5.4.3 Example of a failed degraded mode test

This example shows an unacceptable test result. I have chosen to show an unacceptable result since a good test is just a flat line and that is not particularly interesting.

In this instance, the storage was based on RAID 6 technology. The Jetstress test was configured to run at 1256 IOPS (Mailbox Role Calculator predicted 1200 IOPS). Approximately half way through the test, a hard disk drive was (carefully) removed from the array and the spare began rebuilding.

The test data shows that the average read I/O latency (Exchange Database ==> Instances\I/O Database Reads (Attached) Average Latency) increased from 11ms to 400ms+, with latency spikes of 3000-4000ms on the affected LUN. This situation took 18 hours to return to normal after the failure. This represented a clear failure of the degraded mode test.

Figure 5: Degraded mode failure

Important: Common failure modes such as a disk rebuild should not materially affect the test results.

Note: Please refer to the following section about understanding storage configuration for Exchange Server 2013 for more information on recommended RAID configurations for Exchange Server. http://technet.microsoft.com/en-us/library/ee832792.aspx

5.5 Jetstress testing inside virtual machines

A quick history lesson: Over the years, we have seen a huge increase in deployments on hypervisor technology. During the early stages of hypervisor use for Exchange, we worked with a number of customers who observed inaccurate results during their Jetstress tests of virtual machines. This culminated in the Exchange product group releasing a statement that advised against using Jetstress inside a virtual machine and instead to test on the root of the hypervisor. Obviously this worked for Hyper-V, but was not quite so practical for all hypervisors.

On 30th March 2012, after significant internal testing against modern hypervisors, the Exchange Product group announced that it is now viable to perform your Jetstress testing directly from inside the virtual machines that are planned to host the Exchange Mailbox role. The single caveat is that the hypervisor being used is one of the following or newer:

- Microsoft Windows Server 2008 R2 (or newer)
- Microsoft Hyper-V Server 2008 R2 (or newer)
- VMware ESX 4.1 (or newer)

Information: More information about deploying Exchange Server 2013 on a Hypervisor can be found here: http://technet.microsoft.com/en-us/library/jj619301.aspx

5.5.1 What is different about Jetstress inside a virtual machine?

The approach and testing process do not change. The aim of the test is to validate that the storage presented to the virtual guest can provide sufficient performance to meet the predicted requirements from the mailbox role calculator. All performance counters and recommended values remain the same from a physical to a virtual guest, and the recommendations for testing against RAID arrays and in failure modes still apply. However, there are things that we may need to consider during our Jetstress testing.

1. Follow the current recommended practices from both Microsoft and your hypervisor vendor. Yes, I know this is obvious but it still amazes me how many problems are resolved by following the recommended guidance!
2. Is the virtual host operating at a normal working load during our test? If the host has capacity for 10 virtual machines and we are testing with a single virtual machine running, then there is the possibility that we will experience performance problems once the host is fully loaded. Additionally the host may be the failover location for other guests, meaning that workload may increase dramatically in a failure scenario.
3. Does the host server have any high availability technology that we need to test in degraded mode? This could include things like multiple paths to the storage or network, or maybe even a Hypervisor HA solution.

Guidance: The spirit of the test is to ensure that the system can meet its predicted workload during normal working conditions and during any common failure modes which the system has been designed to survive.

For more information about virtualizing Exchange Server:

- Announcing Enhanced Hardware Virtualization Support for Exchange 2010 (this applies equally to Exchange Server 2013): http://blogs.technet.com/b/exchange/archive/2011/05/16/announcing-enhancedhardware-virtualization-support-for-exchange-2010.aspx
- Demystifying Exchange 2010 SP1 Virtualization (this applies equally to Exchange Server 2013): http://blogs.technet.com/b/exchange/archive/2011/10/11/demystifying-exchange-2010sp1-virtualization.aspx
- Best Practices for Virtualizing Exchange Server 2010 with Windows Server 2008 R2 Hyper V (applies equally to Exchange Server 2013): http://www.microsoft.com/download/en/details.aspx?id=2428

5.6 How much time should I allocate for Jetstress testing?

Jetstress testing can take a long time to complete and it is vital that this time is correctly planned for within your Exchange project plan. Generally, the test procedure can be broken up into three parts.

- Initialisation
- Testing
- Clean-up

5.6.1 Initialisation

This phase includes installation, prerequisites and initial database creation. Of these tasks, the initial database creation will take the longest amount of time. Database creation time varies between hardware deployments; however, expect around 24 hours for 10TB of data per server (~7GB/minute). If you are using direct attached storage and initialise multiple servers in parallel, these predictions apply to each server. If you are using shared storage, your initialisation time may take considerably longer.

DATA (TB) | TIME (Hours) | TIME (Days)
1TB | 2.4 | 0.1
2TB | 4.8 | 0.2
5TB | 12.0 | 0.5
10TB | 24.0 | 1.0
50TB | 120.0 | 5.0
100TB | 240.0 | 10.0

Table 3: Database initialisation time

5.6.2 Testing

The actual testing phase will vary depending on the complexity and maturity of the design. If your design is based on common direct attached components, the testing phase is likely to be quite short. If your design is based on complex, cutting-edge storage technology, it is highly likely that you will need to allocate more time for testing.

For simple direct attached solutions allow between 2-5 days; for complex SAN solutions try to allocate up to 10 working days. If you are working in a complex enterprise with large scale, complex storage infrastructure, budget between 4-6 weeks for Jetstress testing. Troubleshooting storage performance issues can often be very time-consuming.

5.6.3 Clean-up

Before the server can be put into production, it is necessary to remove the Jetstress application and the test databases that were created. The recommended procedure is as follows:

- Uninstall Jetstress and Reboot
- Copy the Jetstress data to a safe location
- Delete the Jetstress installation folder
- Remove all test databases

Depending on complexity, allow between 1 and 2 hours per Exchange server that needs to have Jetstress uninstalled.

Tip: If you have a complex deployment, you can use the scripts embedded here: JetstressScripts.zip

The scripts will parse your JetstressConfig.XML file and remove all database and log folders defined in the test. The scripts take two input parameters:

- [XMLFile] Path to JetstressConfig.XML file. Defaults to "C:\Program Files\Exchange Jetstress\JetstressConfig.xml" if no other value is specified.
- [Prompt] $true or $false, default is $true. Specify $false to use as part of an automated process.

Note that these scripts are unsupported and you use them entirely at your own risk. They are provided here for convenience only.

5.7 Preparing for the Jetstress test

Jetstress simulates an Exchange database workload. To ensure that the environment is ready it should be configured according to both the hardware vendor's and Microsoft recommendations. Refer to Understanding Exchange 2013 Storage Configuration Options for further detail. As a starting point, ensure that the following conditions have been met:

1. Verify with vendors that drivers and firmware are current and consistent across all servers. Drivers and firmware include, but are not limited to, the following items:
   a. Server BIOS/firmware
   b. SCSI/Array Controller firmware and driver
   c. Fibre Host Bus Adapter (HBA) firmware and driver
   d. Fibre switch/hub firmware
   e. SAN (Storage Area Network) enclosure Operating System/Microcode/firmware
   f. Hard disk firmware
2. Verify that the HBA/SAN specific configuration is set correctly and is consistent across all servers. Many HBAs use registry keys to customize the configuration to a specific SAN platform (for example, Queue Depth).
3. If multiple clusters will be sharing any aspect of the disk subsystem, the server/storage configuration must be Cluster/Multi-Cluster Certified.
4. Storport.SYS has been updated to the latest supported version for your hardware.
5. Configure the storage logical unit numbers (LUNs) (consider Exchange log devices and database devices).
6. RAID Controller Stripe size is 256KB or greater (refer to hardware vendor for guidance).
7. Read/Write Cache is 75% Write and 25% Read on all LUNs.
8. Format the LUNs within Windows with NTFS file system. Best practice = 64KB allocation unit size.
9. NTFS Compression is not enabled.
10. File Level Anti-Virus is configured to exclude all Exchange data locations and any directories that Jetstress has been configured to use.

5.8 What happens if the test fails?

It is important to determine the pass and fail criteria for the test. These are defined in section 8.3 Interpreting Jetstress test results. The test will find the peak working load that the storage is able to provide at the I/O latency targets recommended by the Microsoft Exchange Team. If the recorded IOPS value from the Jetstress test is above the targets documented within the Exchange design, then the storage solution is deemed to have passed the test. If it does not meet the design targets, then the storage solution is deemed to have failed the test.

If the test shows that the storage has failed to meet its design targets, it will be necessary to perform remediation. The aim of remediation is to determine why the achieved IOPS was below the design target and to provide a remediation plan before submitting the solution for a re-test. This usually involves a combination of resources from the design/project, hardware, build, and storage vendor teams.

One of the most common pitfalls that occurs when a test fails is focussing on Jetstress itself. Remember that "Jetstress" has not failed. Your storage has failed the test. Jetstress is just the messenger. Jetstress is a well-proven tool and is extremely unlikely to be the root cause of your storage test failing; instead concentrate on understanding the data that Jetstress has provided and how you can fix your storage solution.

The most common causes of Jetstress test failures are missing simple configuration steps during deployment and/or misconfiguring the Jetstress test itself. Before beginning significant storage redesign work, it is important to check the basics listed in section 5.7 Preparing for the Jetstress test.

Advice: It is much easier to resolve configuration problems during this phase of the deployment than after the Exchange servers have been put into production. It is far better to suffer a small delay to the project timescales than put a service into production that does not meet its original goals.

6 Installing Jetstress

6.1 Documentation

The document that you are currently reading represents the main source of information for Jetstress 2013. If you are validating Exchange Server 2003, 2007 or 2010 refer to the Jetstress Field Guide for Jetstress 2010.

6.2 Jetstress Version and Download

Version | Build | Usage | Link
14.01.0225.017 | 32 bit | Exchange 2003 [2] | http://www.microsoft.com/en-us/download/details.aspx?id=20054
14.01.0225.017 | 64 bit | Exchange 2007, Exchange 2010 | http://www.microsoft.com/en-us/download/details.aspx?id=4167
15.0.658.4 | 64 bit | Exchange 2013 | http://www.microsoft.com/en-us/download/details.aspx?id=36849

Table 4 - Jetstress version and download table

Note: Although there is a 32-bit build of Exchange 2007, it is not recommended or supported to use these ESE files to run a Jetstress test. This is due to the requirement for a 64-bit address space to simulate a realistic Exchange I/O pattern.

- Jetstress 2013 will not allow you to use an XML configuration file from an older version of Jetstress.
- Always ensure that you use the same version of Jetstress to initialise the databases and to perform the testing.

[2] Refer to Appendix D - Exchange 2003 for information on configuring Jetstress 14.x for Exchange 2003.

6.3 Prerequisites

- .NET Framework 4.5 or higher
- A copy of your 64-bit production ESE files [3]
  - ese.dll
  - eseperf.dll
  - eseperf.hxx
  - eseperf.ini
  - eseperf.xml

It is important that the version of ESE that is used for the test is the same version that will be used in production.

[3] See section 6.4 Getting ESE Files necessary for Jetstress for the locations of these files.

6.4 Getting ESE Files necessary for Jetstress

Jetstress requires the ESE files in order to run. The needed files are available from an installed Exchange server or from the Exchange installation media. It is recommended to get the files from an installed Exchange server that has been fully updated and patched.

6.4.1 File locations from an installed Exchange Server

File | Path
ESE.DLL | C:\Program Files\Microsoft\Exchange Server\V15\Bin
ESEPERF.DLL | C:\Program Files\Microsoft\Exchange Server\V15\Bin\perf\AMD64
ESEPERF.HXX | C:\Program Files\Microsoft\Exchange Server\V15\Bin\perf\AMD64
ESEPERF.INI | C:\Program Files\Microsoft\Exchange Server\V15\Bin\perf\AMD64
ESEPERF.XML | C:\Program Files\Microsoft\Exchange Server\V15\Bin\perf\AMD64

Table 5 - ESE file locations on running Exchange server

6.4.2 File locations from the installation media

If you are validating Exchange 2010 or newer, it is possible to get the necessary files directly from the installation media without requiring an Exchange installation.

File | Path
ESE.DLL | \setup\serverroles\common
ESEPERF.DLL | \setup\serverroles\common\perf\amd64
ESEPERF.HXX | \setup\serverroles\common\perf\amd64
ESEPERF.INI | \setup\serverroles\common\perf\amd64
ESEPERF.XML | \setup\serverroles\common\perf\amd64

Table 6 - ESE file locations from installation media

Note: AMD64 refers to the 64-bit x86-64 architecture and is not specific to AMD processors. Do NOT use the x86 files!

Caution: Remember to use the same version of ESE files in your Jetstress tests that you will use in production.

6.5 Installation

Before performing this section, it is recommended that all prerequisites have been met and that Exchange server is not installed on any servers being used for Jetstress testing.

6.5.1 Application Installation

1. Begin Jetstress installation.
2. Accept License agreement.
Leave the installation options as default unless you have a good reason to change them.8 Draft Prepared by neil.Prepared for Exchange Community 3.com "Document1" last modified on 26 Feb. Jetstress Field Guide. 14. [email protected]. Rev 2 . Click on “Next” to install… Page 23 Jetstress 2013.0. Note: All performance data and HTML reports will be stored in the installation folder so if your system drive is short of space select an alternative folder. This is the last chance to stop the installation. Version 2. Once installation is completed click on “Close”. Version 2.5.0. 14. Jetstress Field Guide.Prepared for Exchange Community 5.com "Document1" last modified on 26 Feb. Rev 2 . By default this is “c:\Program Files\Exchange Jetstress” Page 24 Jetstress 2013.8 Draft Prepared by neil.0.2 # 1. ESE File Installation Screenshot Instruction Copy ESE prerequisite files into the Jetstress installation folder. Table 7 [email protected] installation instructions 6. 0. Start “Exchange Jetstress 2013” Note: Jetstress requires local Administrator access.Prepared for Exchange Community 2. Table 8 . Rev 2 . 14.ESE installation instructions Page 25 Jetstress 2013.EXE process as an administrator.0. Click on “Start new test” 4.8 Draft Prepared by neil. Close Jetstress This is the end of the Jetstress installation. Jetstress Field Guide. Verify in the output on this screen that the ESE version is correct and that the last line of the status output requires that Jetstress be restarted.johnson@microsoft. The first time that this occurs Jetstress must be restarted. Version 2. Jetstress will attempt to use the ESE files that were copied over in step 1. If user access control is enabled. ensure that you start the JetstressWin.com "Document1" last modified on 26 Feb. 3. we will be configuring a disk subsystem throughput test. Databases Size Control Where you are testing multiple databases per volume. 
it is still recommended to complete the disk subsystem throughput test to determine the maximum working load of the storage solution at full capacity.1.com "Document1" last modified on 26 Feb. Note: Even if this test type is used. This test should be regarded as mandatory for each Exchange server released into production. IOPS per mailbox and quota size to simulate the profiled Exchange mailbox load. Rev 2 .1.johnson@microsoft. In the Exchange mailbox profile test scenario. The values observed from this test can be used both to qualify the solution ready for production and to calculate available system I/O headroom once the service is in production. 14. Version 2. then you can control the size of the databases by reducing the “size the database using storage capacity percentage” box during the test configuration to be whatever you need. Jetstress Field Guide.1 Jetstress Test Types 7. 7. This is the recommended test type since it identifies the maximum working load of the storage solution for use with Exchange Server 2013 while the disks are filled to capacity.1 Test a disk subsystem throughput This test uses some fixed parameters to determine the maximum storage performance at maximum working capacity (80%). This test type can be useful if your storage has been specifically designed to operate only at a specific disk capacity4. If your volume is over-sized for your solution for some reason and the test databases are too large. 4 It is not recommended to design Exchange storage performance based on less than 80% utilisation capacity.Prepared for Exchange Community 7 Configuring Jetstress For the purposes of this document.2 Test an Exchange mailbox profile Helps you determine whether your storage system meets or exceeds the planned Exchange mailbox profile. Jetstress will automatically calculate the database size of all databases on the same volume to ensure that the test runs at 80% of volume capacity. 7. 
The goal of this test is to identify the peak working IOPS value that the storage subsystem can sustain while remaining within the disk latency targets established by the Exchange Product Group.0.0. Page 26 Jetstress 2013.8 Draft Prepared by neil. you can specify the number of mailbox users. 0. Page 27 Jetstress 2013. 2. Jetstress Field Guide. Click on “Start new test” 3. Rev 2 . Version 2.0.johnson@microsoft. Check that the status text does not ask for a restart and that the last two lines state that the ESE engine and performance libraries were detected.2 # Initial configuration Instruction Open “Exchange Jetstress 2013” Screenshot 1.8 Draft Prepared by neil.com "Document1" last modified on 26 Feb. 14.Prepared for Exchange Community 7. This is a change to Jetstress 2010 where autotuning would rarely work. we are configuring a test we will accept the defaults and click next. Ensure that “Supress tuning and use thread count” is unchecked. Version 2.com "Document1" last modified on 26 Feb. Rev 2 . 5.xml in the default installation directory. This will create a new configuration file called JetstressConfig.0.8 Draft Prepared by neil. Jetstress Field Guide. however if the storage presented to your servers is greatly oversized then you can control the Jetstress test database sizes by reducing the size the database using storage capacity percentage. Select the “Test disk subsystem throughput” test and click “next” 6. Most validation tests should leave both values at 100. Page 28 Jetstress 2013. If you already have an XML file select that. 14.0. You should always test with 100% database capacity and target IOPS throughput. Auto tuning should work in most scenarios with Jetstress 2013. Since this is the first time.Prepared for Exchange Community 4.johnson@microsoft. revert to manual thread configuration as per Appendix A – Configuring Thread Count. If Auto-tuning fails. Enter in the folder for storing the test results and set the correct duration for Jetstress. 
If you are testing a shared storage platform. A minimum of one successful 2hr and a separate 24 test is required for deployment validation. Version 2. Note: While auto-tuning or configuring thread count. Rev 2 .0.johnson@microsoft. Configure the test for performance.Prepared for Exchange Community 7.50 = 30m 0. Configure the test to represent the production deployment.8 Draft Prepared by neil. 14.25 = 15m Recommendation: Use 0. enable the multi-host checkbox. passive and lagged. they will be reported in a new table to highlight disk errors. 9. Ensure that run “background database maintenance” is checked.    0.com "Document1" last modified on 26 Feb. active. 8. you can set a shorter than 2 hour test by typing directly into the window.0. Jetstress Field Guide. Number of copies per database represents the number of total copies Page 29 Jetstress 2013.75 = 45m 0.50 (30 minute) test runs to set thread count for SAN storage. Set “continue the test run despite encountering errors” to enabled. Number of databases should be the total on this server including all database copies. If any errors are detected during the test. 0.com "Document1" last modified on 26 Feb. Configure the database and log file paths appropriately. Scroll to the bottom of this page to find the “next” link. For example. Version 2. Page 30 Jetstress 2013. otherwise select “Attach existing databases”.8 Draft Prepared by neil. Rev 2 . you would set the number of databases to 20 and the number of copies per database to 4. This value simply simulates some LOG I/O reads to account for the log shipping between active and passive databases – it does NOT actually copy logs between servers. 11. 2 passive HA copies and 1 lagged copy per database (or 120 database copies spread across 6 servers. Jetstress Field Guide. if your 6 server DAG contained 30 databases. If this is the first time the test has been run select to “Create new databases”. 10. 
Note: Refer to the Mailbox Role Calculator’s Distribution Tab to understand how your database should be configured.johnson@microsoft. with each server hosting 20 copies).0.Prepared for Exchange Community that will exist for each unique database. with 1 active copy. 14. Jetstress Field [email protected]. Version 2. for further information on database sizes and creation time.0.0.8 Draft Prepared by neil. Page 31 Jetstress 2013. Refer to section 4. Verify that the paths are as expected and click “Prepare test” 13.Prepared for Exchange Community 12. This value should equate to 80% of the available storage. This will begin database initialisation – this process will vary but plan on 24 hours for every 10TB worth of data to be initialised.com "Document1" last modified on 26 Feb. Rev 2 . 14.1 Initialisation. 14.HTML DBChecksum_<date>.0.8 Draft Prepared by neil.XML DBChecksum_<date>.BLG XMLConfig_<date>.HTML Performance_<date>. Table 9 .XML Performance_<date>. close Jetstress and copy the Jetstress report and performance data somewhere for analysis.BLG DBChecksum_<date>.johnson@microsoft. Once the test has been initialised. 15.EVT files which contain event log data taken during the test. Note: In addition you may also wish to make a copy of the *. Rev 2 .0. Once the test has completed.        Performance_<date>.com "Document1" last modified on 26 Feb. click “Execute Test”. Version 2. Each performance test will generate the following files.XML Ensure that you make a copy of all of these files.Jetstress initial configuration Page 32 Jetstress 2013.Prepared for Exchange Community 14. Jetstress Field Guide. Page 33 Jetstress 2013. Binary performance data captured during the checksum test. DBChecksum_<date>. Jetstress Field Guide.BLG HTML Report for the performance Provides an easy to read status test report for the test. examine the counters manually to understand reasons for failure.XML Performance_<date>. 14. Version 2.johnson@microsoft. 
8 Jetstress Output Files

This section explains which output files are created after the test and what each one contains.

File | Content | Purpose
Performance_<date>.BLG | Binary performance data captured during the performance test | Provides detailed data for analysis. Open this file in perfmon.
Performance_<date>.HTML | HTML report for the performance test | Provides an easy-to-read status report for the performance test.
Performance_<date>.XML | XML report for the performance test | Provides the status report data in XML format.
DBChecksum_<date>.BLG | Binary performance data gathered during the CRC checksum of the database | Useful if the checksum fails or takes a long time to complete.
DBChecksum_<date>.HTML | HTML report for the checksum test | Provides an easy-to-read status report for the checksum test.
DBChecksum_<date>.XML | XML report for the checksum test | Provides status report data in XML format.
XMLConfig_<date>.XML | XML configuration file | Provides a backup of the Jetstress configuration file used for the test.

Table 10 - Jetstress output files

9 Reading Jetstress report data

This section walks through a very simple sample report and explains where the key values are stored and how to interpret the data.

9.1 Target design values

Before we can evaluate our Jetstress data, we need to know what our design targets are. Assuming that the storage design was based on data from the Mailbox Role calculator (which it should be), the information we need is in the following table on the Role Requirements tab. Make a note of the following value:

• Total Database Required IOPS / Server

9.2 Reading the Jetstress Test Result Report

The following report is for a test with four databases configured.

9.2.1 Test Summary

This section is a basic summary of the test: when it started, when it finished, and which versions of the operating system and ESE were used. The most important part of this section is the overall test result, pass or fail.

9.2.2 Database Sizing and Throughput

This section shows some more detailed parameters regarding the test. The most important value in this section is the Achieved Transactional I/O per Second. This represents random database IOPS. In this example the test validated that the storage can provide 231 transactional I/O per second. Transactional I/O does not include I/O for Background Database Maintenance (BDM). BDM I/O is mostly sequential so it is not usually considered during the design phase.

In this example, 4 x 25GB databases were created on a 126GB LUN. Jetstress created a total of 101GB (109154926592 bytes) of data for testing, which is 80% of the available space. This is normal behaviour: by default, in performance mode Jetstress will use 80% of the disk capacity to allow room for growth during the test process. A “test disk subsystem throughput” test report will always show 100% for Capacity Percentage and Throughput Percentage.

Note: To validate that the test has met the design requirements, compare the Achieved Transactional I/O per Second from your Jetstress report to the Total Database Required IOPS / Server value recorded in section 9.1 Target design values, from the Mailbox Role Calculator.

9.2.3 Jetstress System Parameters

This section displays some system values that Jetstress used for this test. The important values for analysis here are the thread count and number of copies per database.

9.2.4 Database Configuration

This section lists the paths for each database and log combination. Check that all of the test databases are listed here and that the path names are correct. In this example, 4 x 25GB databases were configured on a single LUN.

9.2.5 Transactional I/O Performance

This section of the report displays the Transactional I/O values that were achieved for each database.

Information: If you sum the values highlighted in the red box the result should add up to the Achieved Transactional I/O per Second reported in the Database Sizing and Throughput table. In this example, 33.859 + 24.043 + 23.87 + 23.069 + 33.491 + 33.978 + 24.186 + 34.807 = ~231 IOPS.

9.2.6 Background Database Maintenance I/O Performance

This section displays the I/O that was used to perform Background Database Maintenance only. The sum of values in the red box shows the total amount of IO used for BDM operations. These are sequential operations and we do not usually need to account for them in our design. However, some storage platforms do not handle sequential IO as well as others and may require some additional design work to help them deal with BDM more gracefully; take the advice of your storage vendor on this aspect.

9.2.7 Log Replication I/O Performance

This section displays the I/O overhead for LOG file replication. In this example there were two replica copies (replicas=2); this is shown by a non-zero count for I/O Log Reads/sec. If this value is greater than zero it confirms that database replication is being simulated.

Note: For those that noticed, I finally provided a report that shows log IO. I know, the little things count :)
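The check described in sections 9.2.2 and 9.2.5 is simple arithmetic, so it can be sketched in a few lines. The per-database values below are the ones quoted from the sample report; the required IOPS figure is a made-up placeholder for the Total Database Required IOPS / Server value from the Mailbox Role Calculator, and the function names are illustrative only.

```python
# Per-database transactional I/O values quoted from the sample report (section 9.2.5).
per_database_io = [33.859, 24.043, 23.87, 23.069, 33.491, 33.978, 24.186, 34.807]


def achieved_transactional_iops(values):
    """Sum the per-database transactional I/O figures from the report."""
    return sum(values)


def meets_design_target(achieved, required):
    """True when the achieved IOPS is higher than the required IOPS."""
    return achieved > required


achieved = achieved_transactional_iops(per_database_io)
print(f"Achieved Transactional I/O per Second: ~{achieved:.0f}")

# Hypothetical design target; replace with the value noted in section 9.1.
required_iops = 200
print("Design target met:", meets_design_target(achieved, required_iops))
```

Only the transactional values are summed here; BDM and log I/O are deliberately left out, as the guide describes.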
9.2.8 Total I/O Performance

This table shows all I/O that was recorded during the test (transactional I/O plus BDM I/O plus LOG I/O). The summation of I/O values from areas highlighted in red in this table should agree (roughly) with those observed at the storage subsystem. In this case, the summation suggests that the storage subsystem had to deal with 349 IOPS. However, since sequential I/O is very easy on most disk subsystems, roughly 1/3rd of those (349 - 231 = 118) IOPS were sequential and so were not accounted for during the design process.

The following chart shows the observed IOPS from the Windows host during the Jetstress test. This counter includes all system IOPS as well as the test IOPS; however, there should be a strong correlation between the IOPS observed on the Windows host and at the storage subsystem. In the event of contradiction between observed IOPS at the Windows host and those at the storage controller, the Windows host values take precedence from a Jetstress validation perspective.

Figure 6 - Host observed IOPS

Often storage teams are confused by the results of a Jetstress test since the achieved transactional I/O per second value is much lower than the observations they make at the storage system. It is important to differentiate between sequential IOPS and transactional (random) IOPS when validating your storage. We are only interested in transactional IOPS when we are Jetstress testing; BDM and LOG IO are sequential in nature and so we ignore them from a performance planning perspective for Exchange Server.

Note: It is an invalid approach to sum the values displayed in the Total I/O Performance table and compare them to the Total Database Required IOPS / Server predicted by the Mailbox Role calculator. It is important to differentiate between the workloads. The only value from the Jetstress report that is required for validation is Achieved Transactional I/O per Second. All other values are for support and curiosity only!

9.2.9 Host System Performance

Figure 7: Host System Performance Table

This section of the report shows the observed system performance during the test. The most important thing to note from this section is that the CPU load from Jetstress is usually minimal. Jetstress has been optimized to evaluate the storage subsystem and not the host performance itself. This section is most often used for troubleshooting.

9.2.10 Error Counts Per Volume

If the Jetstress test detects IO errors during the test, it will try to continue to run the test and report the errors in both the Test Log and the Error Counts Per Volume table. The table lists each volume along with the number and type of IO errors that were recorded.

Figure 8: Error Counts Per Volume Table

Error Type | JET/ESE Error Type | Error Code
IO Failures | JET_errDiskIO | -1022
IO Failures | JET_errReadVerifyFailure | -1018
IO Failures | JET_errPageNotInitialized | -1019
IO Failures | JET_errReadPgnoVerifyFailure | -1118
IO Failures | JET_errDiskReadVerificationFailure | -1021
Filesystem Corruptions | JET_errCheckpointCorrupt | -533
Filesystem Corruptions | JET_errMissingLogFile | -528
Filesystem Corruptions | JET_errLogFileCorrupt | -501
Filesystem Corruptions | JET_errInvalidPath | -1023
Filesystem Corruptions | JET_errInvalidSystemPath | -1024
Filesystem Corruptions | JET_errInvalidLogDirectory | -1025
Filesystem Corruptions | JET_errFileAccessDenied | -1032
Filesystem Corruptions | JET_errFileInvalidType | -1812
Filesystem Corruptions | JET_errLogCorrupted | -1852
Filesystem Corruptions | JET_errObjectNotFound | -1305
Lost Flush | JET_errReadLostFlushVerifyFailure | -1119
Lost Flush | JET_errDbTimeTooOld | -566
Lost Flush | JET_errDbTimeTooNew | -567

Table 11: JET Error Code Groupings

Information: Some failure events are more important than others. Lost Flush events signal that significant data corruption has occurred and something is very wrong with your storage (under no circumstances should you entertain putting a system into production that is experiencing ANY lost flush events during a test). However, some other IO Failures are relatively normal. For example, in a JBOD environment we may see -1021 (JET_errDiskReadVerificationFailure) which, although it signifies that the data we read was not the same as the data we originally wrote (checksum failed), is not of critical importance because Exchange will try to deal with this scenario via Page Patching in normal operation. For a full list of JET/ESE event types see the article Extensible Storage Engine Error Codes.

What is a Lost Flush?

A lost flush occurs if we issued a write operation to the disk and the OS reported the operation as having successfully completed, but it actually didn’t get physically committed to the non-volatile storage. The two main reasons for this to happen are:

1. Power loss on storage with write-cache enabled: in this case, the operation is committed to the volatile cache of the disk or controller, but if the hardware loses power, it means it never actually made it to the non-volatile storage, even though it was reported to the application that it did. This is the reason why we only run with write-cache enabled on the storage if there’s a battery backing up the cache, so if it loses power, the controller makes sure to flush the uncommitted cache to the disk.
2. A bug somewhere in the storage stack.

A lost flush is a very insidious type of storage failure for a database engine because the consequences can range from none (if we are very lucky) to nasty and potentially undetectable logical database corruption (more likely).

ESE has implemented lost flush detection, based on a flush map. Basically, every time we issue a write on a page, we flip a bit on the actual page and also store that bit in a flush map in memory. If we read the page again off the disk, we check the bit against the in-memory flush map and if they don’t match, it means the flush was lost.

Undetected lost flushes on the active copy may show up as a JET_errDbTimeTooNew (-567) replication error on the passive copy. Undetected lost flushes on the passive copy may show up as a JET_errDbTimeTooOld (-566) replication error on the passive copy.

Important: The bottom line for lost flushes is that you should NEVER put a system into production that has recorded lost flushes during the Jetstress test. You must be 100% certain that you have resolved the underlying problem and have at least one good 24 hour test that has no lost flushes recorded before accepting the solution into production.

9.2.11 Test Log

This section of the report is a log of the Jetstress test. It is most often used for troubleshooting failures.

9.3 Interpreting Jetstress test results

Jetstress evaluates latency values for Database Reads and LOG writes since these affect the end user experience.

Performance Test “Strict” mode (<= 6 hour test)
• Average Database Read Latency: 20ms
• Average Log File Write Latency: 10ms
• Max Database Read Latency: 100ms
• Max Log File Write Latency: 100ms

Stress Test “Lenient” mode (> 6 hour test)
• Average Database Read Latency: 20ms
• Average Log File Write Latency: 10ms
• Max Database Read Latency: 200ms
• Max Log File Write Latency: 200ms

9.4 Test evaluation

Evaluate the following criteria for each test run. The first test is validated against the design target and must be performed manually; Jetstress does not validate this value. The second and third are against pre-defined latency targets for Exchange; if these values are not within tolerance, Jetstress will report the test as failed.

1. DB IOPS Target: Is the Achieved Transactional I/O per Second in the test report higher than the Total Database Required IOPS / Server predicted in the Mailbox Role Calculator?
2. Is the I/O Database Reads Average Latency in the test report <20ms?
3. Is the I/O Log Writes Average Latency in the test report <10ms?

DB IOPS Target | DB Read Latency | LOG Write Latency | Action
PASS | PASS | PASS | Test successful.
FAIL | PASS | PASS | The test is failing to meet the IOPS target, but the latency values are good. Increase the thread count by 1 and re-test. Use sluggishsessions to fine-tune if necessary.
PASS | FAIL | FAIL | At least one database has recorded latency over threshold. If the latency values are very close to limits increase sluggishsessions by 1; if both target IOPS and latency values are much higher decrease the thread count.
PASS | PASS | FAIL | At least one database has recorded latency over threshold. If the latency values are very close to limits increase sluggishsessions by 1; if both target IOPS and latency values are much higher decrease the thread count.
PASS | FAIL | PASS | At least one database has recorded latency over threshold. If the latency values are very close to limits increase sluggishsessions by 1; if both target IOPS and latency values are much higher decrease the thread count.
FAIL | FAIL | FAIL | If the test shows that Achieved IOPS is below the design target AND the test latency values are above limits the storage solution is unable to meet the requirements. At this stage, it is necessary to re-evaluate the storage design and begin troubleshooting the physical deployment to determine the correct remediation.
FAIL | FAIL | PASS | If the test shows that Achieved IOPS is below the design target AND the test latency values are above limits the storage solution is unable to meet the requirements. At this stage, it is necessary to re-evaluate the storage design and begin troubleshooting the physical deployment to determine the correct remediation.
FAIL | PASS | FAIL | If the test shows that Achieved IOPS is below the design target AND the test latency values are above limits the storage solution is unable to meet the requirements. At this stage, it is necessary to re-evaluate the storage design and begin troubleshooting the physical deployment to determine the correct remediation.

Table 12 - Quick results analysis table
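When triaging the Error Counts Per Volume table (section 9.2.10), it can help to have the Table 11 groupings available as a lookup. The sketch below copies the codes and names from Table 11; the function names and the "blocks production" rule are illustrative, the latter simply restating the guide's position that any lost flush event is unacceptable.

```python
# Error codes and groupings copied from Table 11 (JET Error Code Groupings).
JET_ERROR_GROUPS = {
    -1022: ("IO Failures", "JET_errDiskIO"),
    -1018: ("IO Failures", "JET_errReadVerifyFailure"),
    -1019: ("IO Failures", "JET_errPageNotInitialized"),
    -1118: ("IO Failures", "JET_errReadPgnoVerifyFailure"),
    -1021: ("IO Failures", "JET_errDiskReadVerificationFailure"),
    -533: ("Filesystem Corruptions", "JET_errCheckpointCorrupt"),
    -528: ("Filesystem Corruptions", "JET_errMissingLogFile"),
    -501: ("Filesystem Corruptions", "JET_errLogFileCorrupt"),
    -1023: ("Filesystem Corruptions", "JET_errInvalidPath"),
    -1024: ("Filesystem Corruptions", "JET_errInvalidSystemPath"),
    -1025: ("Filesystem Corruptions", "JET_errInvalidLogDirectory"),
    -1032: ("Filesystem Corruptions", "JET_errFileAccessDenied"),
    -1812: ("Filesystem Corruptions", "JET_errFileInvalidType"),
    -1852: ("Filesystem Corruptions", "JET_errLogCorrupted"),
    -1305: ("Filesystem Corruptions", "JET_errObjectNotFound"),
    -1119: ("Lost Flush", "JET_errReadLostFlushVerifyFailure"),
    -566: ("Lost Flush", "JET_errDbTimeTooOld"),
    -567: ("Lost Flush", "JET_errDbTimeTooNew"),
}


def classify_jet_error(code):
    """Return (group, name) for a code listed in Table 11, or None if it is not listed."""
    return JET_ERROR_GROUPS.get(code)


def blocks_production(code):
    """True for Lost Flush codes, which the guide says must never be accepted."""
    entry = classify_jet_error(code)
    return entry is not None and entry[0] == "Lost Flush"


for code in (-1021, -1119):
    group, name = classify_jet_error(code)
    print(code, name, group, "blocks production:", blocks_production(code))
```

Codes that are not in Table 11 return None here; for those, the Extensible Storage Engine Error Codes article remains the reference.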
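The flush-map idea in "What is a Lost Flush?" can be illustrated with a toy model. This is not ESE's implementation; it is a minimal sketch that assumes a single flush bit per page and a fake disk (ToyDisk, a hypothetical class) that can report a write as successful while silently dropping it.

```python
class ToyDisk:
    """Fake storage that can drop a write while still reporting success."""

    def __init__(self):
        self.pages = {}  # page number -> (flush_bit, data)
        self.drop_next_write = False

    def write(self, page_no, flush_bit, data):
        if self.drop_next_write:
            self.drop_next_write = False
            return True  # reported as successful, but nothing was persisted
        self.pages[page_no] = (flush_bit, data)
        return True

    def read(self, page_no):
        return self.pages.get(page_no, (0, b""))


class ToyEngine:
    """Keeps an in-memory flush map and checks it on every read."""

    def __init__(self, disk):
        self.disk = disk
        self.flush_map = {}  # page number -> bit expected on disk

    def write_page(self, page_no, data):
        bit = self.flush_map.get(page_no, 0) ^ 1  # flip the bit on every write
        self.disk.write(page_no, bit, data)
        self.flush_map[page_no] = bit

    def read_page(self, page_no):
        bit, data = self.disk.read(page_no)
        if bit != self.flush_map.get(page_no, 0):
            raise IOError(f"lost flush detected on page {page_no}")
        return data


disk = ToyDisk()
engine = ToyEngine(disk)
engine.write_page(1, b"v1")
print(engine.read_page(1))      # the write reached the disk, so the bits match

disk.drop_next_write = True     # simulate a write lost in a volatile cache
engine.write_page(1, b"v2")
try:
    engine.read_page(1)
except IOError as exc:
    print(exc)
```

The point of the sketch is only that the mismatch is caught on the next read of the page, which is why a lost flush can otherwise go unnoticed until much later.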
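The three criteria in section 9.4 and the actions in Table 12 can be expressed as a small decision function. This is a sketch of one reading of the table, not part of Jetstress; the thresholds are the 20ms and 10ms average latency limits quoted above, and the function name and shortened action strings are illustrative.

```python
def evaluate_run(achieved_iops, required_iops, db_read_avg_ms, log_write_avg_ms):
    """Apply the three section 9.4 checks and return them with a suggested action."""
    iops_ok = achieved_iops > required_iops   # criterion 1, checked manually in practice
    read_ok = db_read_avg_ms < 20             # criterion 2
    log_ok = log_write_avg_ms < 10            # criterion 3
    latency_ok = read_ok and log_ok

    if iops_ok and latency_ok:
        action = "Test successful."
    elif latency_ok:
        action = "Increase the thread count by 1 and re-test."
    elif iops_ok:
        action = "Latency over threshold: raise sluggishsessions or lower the thread count."
    else:
        action = "Storage cannot meet the requirements: re-evaluate the design."
    return iops_ok, read_ok, log_ok, action


# Hypothetical numbers for illustration only.
print(evaluate_run(231, 200, 14.2, 1.1))
print(evaluate_run(180, 200, 14.2, 1.1))
print(evaluate_run(231, 200, 24.0, 1.1))
print(evaluate_run(180, 200, 24.0, 12.0))
```

Jetstress itself only enforces the latency checks; the IOPS comparison against the Mailbox Role Calculator value still has to be made by hand.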
10 Appendix A – Configuring thread count

Jetstress 2013 has been updated so that the auto-tuning feature will work in far more scenarios than previously. Due to this, it is recommended to begin Jetstress testing in auto-tuning mode and only revert to manual thread configuration if auto-tuning fails to set a thread value.

Thread count controls how many IOPS Jetstress attempts to drive through the storage subsystem. Each thread will generate a workload on the system. Setting this value correctly requires some trial and error. For the process described within this document the goal is to increase the thread count to a value that fails and then reduce the value until the test passes; this should then represent the peak working IOPS value that the storage subsystem can support.

Jetstress was designed to produce approximately 60 IOPS per thread at 20ms disk latency. So for example, if the storage design team recommended that the storage for a given server was able to support 1000 IOPS:

• Target IOPS = 1000

Starting thread count = ( )

Given this example…

Starting thread count = ( ) = 15.38 (round up to 16)

Notes:
• Try auto-tuning with Jetstress 2013.
• If in doubt start with thread=1 and work up until the test fails.
• The exact quantity of IOPS generated per thread will change as the storage system workload changes. As the storage system gets closer to its performance limit the IOPS per thread value will reduce.
• If the thread count predicted is less than 1 it may be necessary to modify the sluggishsessions value afterwards.

11 Appendix B – Configuring sluggishsessions

If it is not possible to achieve the right IOPS value by modifying the thread count it becomes necessary to modify the sluggishsessions value within the JetstressConfig.xml file. The sluggishsessions value adds a pause between each task. As sluggishsessions is increased, the achieved IOPS value decreases. This allows a level of fine-tuning over the workload dispatched by Jetstress.

To change the value, open the JetstressConfig.xml file and look for the default configuration option

<SluggishSessions>1</SluggishSessions>

Modify the value, save the configuration file and then re-start Jetstress.

12 Appendix C - Running a Jetstress Test with JetstressCmd.exe

Both JetstressWin.exe and JetstressCmd.exe use the common Jetstress core library files, which means you will have comparable test results with the same XML configuration file. We recommend that you use JetstressWin.exe to create new test scenarios, and JetstressCmd.exe to open and run the test scenarios by using the /config command-line option. You can also see all the other available options by using the /? (help) command-line option.

Action Argument | Example of Use | Description
help | /? | The help for the command-line program
Config | /c JetstressConfig.xml | Open a configuration file
Generate | /g | Generate a sample XML configuration file
TimeOut | /TimeOut 2H0M0S | Test duration. Default is 2 hours.
Output | /output c:\output | Path for test output. Default is the current directory.
DBPath | /dbpath m:\sg1\mdb /dbpath n:\sg2\mdb | Database paths for each storage group
LogPath | /log x:\sg1\log y:\sg2\log | Log path for each storage group
PctCapacity | /pctcapacity 100 | Specify capacity percentage
Throughput | /throughput 100 | Specify throughput percentage
Threads | /threads | Suppress auto tuning and specify thread count
DoNotRunDBMPerformance | | Do not run background database maintenance during performance/stress test
RunDBMPerformance | | Run background database maintenance during soft recovery test
New | /new | Create new databases
Open | /open | Open existing databases
Bak | /bak | Restore backup database
Recovery | /recovery | Run soft recovery test
Streaming | | Run streaming backup test
Transaction | | Run transaction performance test
VerifyCheckSum | | Run database checksums

13 Appendix E – Running Jetstress on a production server

Although the formal support position on this is that you shouldn’t do it - ever - at all - under no circumstances - in fact you shouldn’t even be reading this section of the field guide :) … however, we all accept there are cases where it can be necessary, such as when attaching new storage to an existing server or troubleshooting performance bottlenecks on existing servers. That still doesn’t mean it’s ok to do it!! If you really MUST do it, here are some things to know before beginning…

• Record the start-up state of all Exchange Services.
• Stop and Disable all Exchange Services on the server.
• Copy the ESE files from the currently installed version of Exchange Server. Jetstress will detect that the performance counters are already installed for this version of ESE and will use them; this will prevent performance counter problems afterwards!
• Do not unload/reload performance counters after the test (if you have used the same ESE files as are currently installed this is unnecessary and could break things!).
• Remember to clean up the Jetstress test databases after testing.
• Uninstall Jetstress.
• Set Exchange Services back to the state they were in before you began testing.
• Reboot your Exchange Server.
• Inspect Windows System and Application Event logs for errors.
• Check that the Exchange performance counters are working.

Remember: This is not supported or recommended. Only follow this as a matter of last resort or under the instruction of Microsoft Support/Microsoft Consulting Services.

14 Common Issues

14.1 Troubleshooting Jetstress

While using Jetstress, you may encounter some known issues with Jetstress. This section provides possible causes and the recommended solutions.

14.1.1 Jetstress cannot attach to or create a database

Event log error that may display: Error -1023
• Possible cause: The path of the database or log files is incorrect.
• Solution: Ensure that the paths and file names are correct.

Event log error that may display: Error -1032
• Possible cause: Permissions are insufficient to access the .edb file or the log files. Jetstress requires read/write permission to the directories it is using.
• Solution: Verify that permissions are sufficient for the account under which Jetstress is running.

Event log error that may display: Error -1022
• Possible cause: The failure is caused by circular logging by Jetstress.
• Solution: Check the log drive for the log file name that is identified in the event log. Delete that log file and all the log files that have a higher number in the file name. Then, run Eseutil.exe with the /r switch to resynchronize the logs and database. When the database is in a good state, delete all the log files in the log directory, and rerun Jetstress.

Event log error that may display: Error -550 (0)
• Possible cause: The last time Jetstress was run, it was ended uncleanly. This caused the log files to become unsynchronized with the database.
• Solution: Delete the Jetstress database (*.edb), log files (*.log), and check file (*.chk), and re-create the Jetstress database. You can also use Eseutil.exe /r to recover Jetstress.

14.1.2 Error loading Performance Monitor counters

JetstressWin.exe relies on performance counters to monitor the system. JetstressWin.exe requires the ESE database counters to be installed.
• Cause: When the counters are not loaded correctly, you may see exception errors related to performance counters.
• Solution: To reload the counters, exit from JetstressWin.exe. In a command shell window, type the command unlodctr ESE and then press Enter. This will unregister the ESE Database performance counters. Locate the directory where JetstressWin.exe was installed and verify that the eseperf.dll, eseperf.hxx, and eseperf.ini files exist in the directory. Start JetstressWin.exe and allow it to reload the performance counters.

14.1.3 Unable to tune for the parameters

This error indicates that Jetstress could not find appropriate parameters that could be used to run a performance or stress test at the desired level of I/O load.
• Cause: This can be caused by several factors. The most common reason is that the storage subsystem has multiple hosts attached to it, and those hosts are competing for common resources during the tuning process.
• Solution: When you are running in a scenario such as this, you can run Jetstress on a single host with tuning enabled to generate the appropriate load parameters, and then rerun the test on the other hosts with the Suppress Tuning option enabled and the tuning parameters entered manually from the results of the first test.

14.1.4 Unable to mount databases due to invalid mount point configuration

When using mount points and running the Prepare phase of Jetstress, the operation fails with error “There is insufficient disk space on volume <system drive>:\”, where <system drive> is the drive letter where you keep your root mount folder. Database creation fails saying that volume C: (or in general, the system volume) does not have enough space. The issue here is that some of the mount-points mapped to directories in the system volume are not properly configured and so Jetstress is looking at the directory (thus checking against the system drive itself), rather than the actual disk.
• Cause: This error means that one or more of the mount points is invalid or the mount point folder path is not connected to its LUN.
• Troubleshooting: Execute a DIR command in the mount point root folder. ALL mount point folder paths are indicated by a <JUNCTION> notation. Any folder that is listed as a <DIR> is not attached to its mount point and is likely causing the problem.
• Solution: The mount path folder could be listed as <DIR> for a number of reasons:
1. Verify the LUN is present and in good health.
2. Use the storage system array management software to verify that the LUN has an assigned logical drive.
3. Using the Disk Management MMC, re-assign the LUN to the correct mount-point.

14.1.5 Jetstress testing failed. Error: System.ApplicationException: Faulty performance counter paths: \MSExchange Database(*)\*

Jetstress version 658.004 has an incompatibility with ESE version 620 (CU1) and above, if you try to run a test with more than 38 databases configured. If you experience this issue either use the RTM version of ESE (516.26) or use a version of ESE later than 726, which will be released with CU2. Additionally, a fixed version of Jetstress will be released (726) that will work with all versions of ESE after 516.26 (Exchange 2013 ESE).
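The sluggishsessions change described in Appendix B can also be scripted. The sketch below is an assumption-laden illustration: only the <SluggishSessions> element comes from the guide, while the surrounding <configuration> wrapper and the function name are hypothetical stand-ins for the real JetstressConfig.xml layout, which may differ (for example, it may use XML namespaces).

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal config; only <SluggishSessions> is taken from the guide.
SAMPLE_CONFIG = "<configuration><SluggishSessions>1</SluggishSessions></configuration>"


def set_sluggish_sessions(xml_text, value):
    """Return the config text with the first SluggishSessions element set to value."""
    root = ET.fromstring(xml_text)
    node = next(root.iter("SluggishSessions"), None)
    if node is None:
        raise ValueError("SluggishSessions element not found")
    node.text = str(value)
    return ET.tostring(root, encoding="unicode")


print(set_sluggish_sessions(SAMPLE_CONFIG, 2))
```

As with the manual edit, the modified file has to be saved and Jetstress re-started before the new value takes effect.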
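The DIR check in section 14.1.4 can be automated by scanning the command output for folders shown as <DIR> instead of <JUNCTION>. The listing below is invented sample text (folder names, dates and the volume GUID are placeholders), and the function name is illustrative; the parsing assumes the usual English DIR layout where the folder name follows the <DIR> marker.

```python
# Invented sample of DIR output from a mount point root folder.
SAMPLE_DIR_OUTPUT = r"""
01/02/2014  10:00 AM    <DIR>          .
01/02/2014  10:00 AM    <DIR>          ..
01/02/2014  10:05 AM    <JUNCTION>     DB1 [\??\Volume{00000000-0000-0000-0000-000000000000}\]
01/02/2014  10:06 AM    <DIR>          DB2
"""


def unattached_mount_folders(dir_output):
    """Return folder names listed as <DIR>, i.e. not attached to a mount point."""
    suspects = []
    for line in dir_output.splitlines():
        if "<DIR>" not in line:
            continue
        name = line.split("<DIR>", 1)[1].strip()
        if name not in (".", ".."):  # skip the current and parent directory entries
            suspects.append(name)
    return suspects


print(unattached_mount_folders(SAMPLE_DIR_OUTPUT))
```

Any folder this reports is a candidate for the three remediation steps listed in the Solution above.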