Temenos T24 and Microsoft SQL Server HADR White Paper

The Microsoft High Availability and Disaster Recovery Solution for TEMENOS T24

A deployment reference architecture and guidance for implementing a high-availability and disaster-recovery solution for TEMENOS T24 running on the Microsoft Application Platform

Technical White Paper
Published: May 2012
Applies to: Microsoft SQL Server 2012
Authors: Igor Pagliai (Microsoft), Dammika Wickramasinghe (Temenos)

Abstract
Temenos and Microsoft worked together to define a deployment architecture/topology that provides high availability and disaster recovery for the TEMENOS T24 core banking solution using the Microsoft Application Platform and Microsoft technologies. This white paper describes the results of this joint effort.

©2012 Microsoft Corporation. All rights reserved. This document is provided “as-is.” Information and views expressed in this document, including URL and other Internet Web site references, may change without notice. You bear the risk of using it. This document does not provide you with any legal rights to any intellectual property in any Microsoft product. You may copy and use this document for your internal, reference purposes.

Table of Contents
Introduction
Technical Overview of TEMENOS T24
SQL Server AlwaysOn
    Recovery Objectives
    Fault Tolerance and Disaster Recovery Architecture
High Availability and Disaster Recovery Solution
Setup and Configuration
    SQL Server 2012 HADR Configuration
    Windows Server Firewall Configurations
    T24 File Share Configuration
    Active Directory Domain Services DNS Configuration
    Application-Tier NLB Configuration
    T24 Application Server Configuration
    Web-Tier NLB Configuration
    T24Browser Configuration
Disaster Recovery Procedures
    DNS Switching
    SQL Server 2012 HADR Failover
Findings and Carryovers
Recommended Hotfixes and Service Packs
Additional Resources
    SQL Server 2012
    Windows Server Failover Cluster
    Network Load Balancing
    About Temenos
    About Microsoft

Introduction

TEMENOS T24 (T24) is a fully integrated, modular core banking solution that covers a broad spectrum of functional requirements for the retail, corporate, private, universal, and Islamic banking and microfinance sectors. T24 provides a single, real-time view of client computers across the entire enterprise, making it possible for banks to maximize returns while also streamlining costs. Banks can also benefit from open, state-of-the-art technologies to accelerate innovation, which helps to greatly increase the speed and effectiveness with which new products and services are created.

Microsoft SQL Server 2012 data management software provides an ideal data management framework for T24. With this foundation, T24 customers can experience faster funds transfers, higher security-trades volumes, and quicker close-of-business processes.

As part of their strategic alliance, Microsoft and Temenos worked together to define a recommended deployment architecture that provides high availability and disaster recovery (HADR) for T24 running on the Microsoft Application Platform and using Microsoft technologies. This joint effort was conducted in the Temenos Hemel Hempstead lab.

One of the main drivers for developing the architecture/topology was to reduce the cost of Microsoft software licenses and the use of specialized hardware (such as load balancers) to minimize the total cost of ownership (TCO). The following considerations apply to the recommended architecture:

• The SQL Server 2012 Availability Group feature, part of the AlwaysOn technology set, was selected instead of storage area network (SAN)-level synchronous storage replication to avoid the cost of an additional SAN device and the licensing cost for SAN replication software.
• A SQL Server 2012 Failover Cluster Instance (FCI) was adopted for the primary site instead of two standalone instances to reduce licensing cost, minimize management and performance overhead, and augment the possibility of using an existing deployment based on a typical Windows Failover Clustering (WSFC) configuration.
• Two cluster nodes in the primary site with shared SAN storage were used to provide high availability for the T24 application file share.
• The Network Load Balancing (NLB) feature of Windows Server 2008 R2 was chosen to eliminate the need for an expensive hardware load balancer device in front of the JBoss servers.
• The NLB feature of Windows Server 2008 R2 was chosen to provide better load balancing performance than the native T24 capabilities in front of T24 servers.

The implementation/requirements of HADR solutions can vary based on a variety of factors, including service level agreements (SLAs), cost, number of sites, and network infrastructure. Therefore, the requirements of individual HADR solutions need to be determined on a case-by-case basis for each deployment, and the recommended software topologies can be customized to meet the customer's needs.

Alternatives to the Recommended Architecture

The architecture proposed in this white paper is not the only one possible using SQL Server 2012 AlwaysOn features, but this architecture has been thoroughly tested. Possible alternatives to the recommended schema can include the following:

• Use two standalone SQL Server 2012 instances (in an AlwaysOn Availability Group) instead of a single SQL Server 2012 Failover Cluster Instance.
    o With this alternative, automatic failover can be provided by the AlwaysOn Availability Group feature, but extra care must be taken to avoid unwanted failover to the remote disaster recovery site.
    o To ensure that there is no local data loss if there is local failover between instances in the primary site, the two standalone SQL Server 2012 instances, along with the one (or more) in the disaster recovery site, must be configured for synchronous replication.
    o The cost savings with this alternative come from eliminating the need for shared storage.
    o If you are using an availability group, all nodes must still be part of a cluster, and a standalone SQL Server 2012 instance must still be installed on each node.
    o NOTE Distributed File System Replication (DFS-R) can be used to replicate files from the primary site to the disaster recovery site with a less frequent schedule. Use of DFS-R as a solution to avoid a clustered file share by having continuous replication with local folders, however, is not recommended because of the possible performance impact.
• Use an existing highly available network storage for the cluster file share witness.
    o Used in combination with the previous option, a highly available network storage for the cluster file share witness can render the installation of a Windows Server Failover Cluster unnecessary.
    o This lets you avoid using shared SAN for the cluster nodes in the primary site.
• Use an additional node in the disaster recovery site with shared SAN storage between the nodes.
    o In this configuration, a second SQL Server 2012 FCI can be used, providing high availability at the level of the disaster recovery site as well.
    o This second instance must be installed only on the nodes in the disaster recovery site.
    o This instance is distinct from the one used in the primary site.
    o This instance should be configured for synchronous replication in the availability group replication.
    o The shared SAN storage between the nodes in the disaster recovery site is not linked/replicated to the shared storage between the nodes in the primary site.

IMPORTANT In the proposed scenario, the minimum number of servers has been used in the disaster recovery site to reduce costs. This means that in the case of a complete primary site disaster, the disaster recovery site will operate in an exposed configuration that is not highly available. For this reason, it is highly recommended that you recover the primary site as soon as possible or use an additional node in the disaster recovery site with shared SAN storage between the nodes, as mentioned previously.

Additional SQL Server 2012 HADR Capabilities for Future Consideration

Note that the following SQL Server 2012 HADR capabilities have not been tested prior to publication of this white paper because of time, resource, and configuration constraints. They should be considered to be future enhancements to the recommended architecture, and should be tested for custom deployments and/or lab testing sessions:

• Readable secondary for Availability Group replicated databases
  This feature presents no theoretical risks and could be used to better utilize hardware resources in the disaster recovery site (including read-only queries, reporting, backups, and integrity checks), but T24 should be modified to take advantage of this capability (for read-only queries only). The following links provide more information:
    o Active Secondaries: Readable Secondary Replicas (http://msdn.microsoft.com/en-us/library/ff878253.aspx)
    o Configure Read-Only Access on an Availability Replica (http://msdn.microsoft.com/en-us/library/hh213002.aspx)
  NOTE In the recommended configuration, the secondary replicas for the availability group replicated the databases. Read-only access is not enabled, but can be easily activated with no downtime.
• Availability Group Read-Only Routing and Application Intent
  These features cannot be used because they require the SQL Server 2012 Native Open Database Connectivity (ODBC) client to be installed on the T24 servers. As a future enhancement, this version of the client should be tested for T24 use. The following links provide more information:
    o Configure Read-Only Routing for an Availability Group (SQL Server) (http://msdn.microsoft.com/en-us/library/hh710054.aspx)
    o Client Connection Access to Availability Replicas (SQL Server) (http://msdn.microsoft.com/en-us/library/hh510184.aspx)
• Multi-subnet failover clustering
  Windows Server 2008 R2 and SQL Server 2012 support this type of configuration, but its use in reducing downtime caused by Domain Name System (DNS) replication latency has not been tested. The following links provide more information:
    o SQL Server Multi-Subnet Clustering (http://msdn.microsoft.com/en-us/library/ff878716.aspx)
    o SQL Server 2012 AlwaysOn: Multisite Failover Cluster Instance (http://sqlcat.com/sqlcat/b/whitepapers/archive/2011/12/22/sql-server-2012alwayson_3a00_-multisite-failover-cluster-instance.aspx)
• Flexible failover policy
  SQL Server 2012 introduces a new health detection mechanism for clustered installations that can be modified so that Windows Failover Clustering is more alert to possible SQL Server 2012 health problem conditions. The following links provide more information:
    o Failover Policy for Failover Cluster Instances (http://msdn.microsoft.com/en-us/library/ff878664.aspx)
    o Configure FailureConditionLevel Property Settings (http://msdn.microsoft.com/en-us/library/ff878667.aspx)

Document Scope

The following are considered in the scope of this white paper:

• This document applies to T24 R11 and R12 (Temenos Application Framework C) with T24Browser as a channel.
• This document focuses only on HADR functionality.
• The document applies to the following software:
    o Windows Server 2008 R2 with Service Pack 1 (SP1)
    o Windows Server 2008 R2 Network Load Balancing (NLB)
    o Windows Server 2008 R2 clustering
    o Windows Server 2008 R2 clustered file share
    o Windows Server 2008 R2 Distributed File System (DFS) Replication
    o SQL Server 2012 AlwaysOn Availability Group
    o Windows Server 2008 R2 domain controller
    o JBoss 5.1.0 GA

The following are considered out of the scope of this white paper:

• Performance tuning recommendations.
• T24 channels other than T24Browser, such as TWS.NET, TOCF.NET, and BizTalk Adapter.
• Hardware configurations, such as RAID and network adapter teaming.
• Local area network (LAN)/wide area network (WAN) configurations and recommendations.
• Administration and monitoring of the software.
• Security.

Technical Overview of TEMENOS T24

The various components of a T24-based solution are shown in Figure 1. Note that the HADR solution recommended in this white paper focuses on T24 with T24Browser as a channel.

Figure 1. T24 logical component view

Table 1 provides a description of the components.

Table 1. Component descriptions

| Component | Description |
|---|---|
| T24 Agent | T24 Agent is a server-side jBASE component that is responsible for accepting and processing incoming client requests. Communication is established via TCP socket connections and by means of a well-defined protocol. T24 Agent is a socket server listening on a user-defined TCP port, and has the capability to serve a wide range of client applications as long as they speak the same protocol. |
| T24 | T24 is the banking business logic written by using jBC, which is used to generate C / C++ code. |
| TAFC | The Temenos Application Framework C (TAFC) version provides additional runtime services that are currently not available in jBC. |
| Database Driver | Direct Connect Driver (DCD) is the T24 data abstraction layer that decouples T24 business logic from the underlying data storage/structure. |
| T24 Monitor | T24 Monitor is a Java Management Extensions (JMX) and web-based online monitoring tool for T24, offering real-time statistics, as well as historical views of a particular T24 system. |
| Message Queue | Message Queue is an optional middleware infrastructure that lets T24 use message-driven communication with the channel layer. |
| Database | The jBASE or vendor-provided relational database management system (RDBMS); currently supported platforms are Oracle, Microsoft SQL Server, and IBM DB2. |

SQL Server AlwaysOn

SQL Server AlwaysOn is a new integrated, flexible, and cost-efficient HADR solution. AlwaysOn can provide data and hardware redundancy within and across data centers, and it can improve application failover time to increase the availability of mission-critical applications. AlwaysOn is flexible and lets you reuse existing hardware investments. A solution using AlwaysOn can take advantage of two major SQL Server 2012 features for configuring availability at both the database and the instance level:

• AlwaysOn Availability Groups
  AlwaysOn Availability Groups are new in SQL Server 2012. They greatly enhance the capabilities of database mirroring, help ensure availability of application databases, and enable zero data loss through log-based data movement for data protection without shared disks. Availability groups provide an integrated set of options, including automatic and manual failover of a logical group of databases, support for up to four secondary replicas, fast application failover, and automatic page repair.
• AlwaysOn Failover Cluster Instances (FCIs)
  FCIs enhance the SQL Server failover clustering feature and support multi-site clustering across subnets, which enables cross-data-center failover of SQL Server instances. Faster and more predictable instance failover is another key benefit that enables faster application recovery.

Recovery Objectives

Data redundancy is a key component of a high-availability database solution. Transactional activity on your primary SQL Server instance is synchronously or asynchronously applied to one or more secondary instances. When an outage occurs, transactions that were in-flight might be rolled back, or they might be lost on the secondary instances because of delays in data propagation. You can measure the impact and set recovery goals in terms of how long it takes to get back in business and how much time latency there is in the last transaction recovered:

• Recovery Time Objective (RTO)
  The RTO is the duration of the outage. The initial goal is to get the system back online in at least a read-only capacity to facilitate investigation of the failure. However, the primary goal is to restore full service to the point that new transactions can take place.
• Recovery Point Objective (RPO)
  The RPO is often referred to as a measure of acceptable data loss. It is the time gap or latency between the last committed data transaction before the failure and the most recent data recovered after the failure. The actual data loss can vary depending on the workload on the system at the time of the failure, the type of failure, and the type of high availability solution used.

You should use RTO and RPO values as goals that indicate business tolerance for downtime and acceptable data loss, and as metrics for monitoring availability health.

The business goals for RTO and RPO should be key drivers in selecting a SQL Server technology for your high-availability and disaster-recovery solution. Table 2 offers a rough comparison of the type of results that those different solutions may achieve.

Table 2. Comparison of SQL Server HADR solutions

| SQL Server HADR Solution | Potential Data Loss (RPO) | Potential Recovery Time (RTO) | Automatic Failover | Readable Secondaries (1) |
|---|---|---|---|---|
| AlwaysOn Availability Group, synchronous-commit | Zero | Seconds | Yes (2) | 0-2 |
| AlwaysOn Availability Group, asynchronous-commit | Seconds | Minutes | No | 0-4 |
| AlwaysOn Failover Cluster Instance | NA (3) | Seconds-to-minutes | Yes | NA |
| Database Mirroring (4), High-safety (sync + witness) | Zero | Seconds | Yes | NA |
| Database Mirroring (4), High-performance (async) | Seconds (5) | Minutes (5) | No | NA |
| Log Shipping | Minutes (5) | Minutes-to-hours (5) | No | Not during a restore |
| Backup, Copy, Restore (6) | Hours (5) | Hours-to-days (5) | No | Not during a restore |

(1) An AlwaysOn Availability Group can have no more than a total of four secondary replicas, regardless of type.
(2) Automatic failover of an Availability Group is not supported to or from a failover cluster instance.
(3) The FCI itself does not provide data protection; data loss is dependent upon the storage system implementation.
(4) This feature will be removed in future versions of Microsoft SQL Server. Use AlwaysOn Availability Groups instead.
(5) This is highly dependent upon the workload, data volume, and failover procedures.
(6) Backup, Copy, Restore is appropriate for disaster recovery, but not for high availability.

Fault Tolerance and Disaster Recovery Architecture

SQL Server AlwaysOn solutions help provide fault tolerance and disaster recovery across several logical and physical layers of infrastructure and application components. Historically, it has been common practice to separate duties and responsibilities for the various audiences and roles involved, so that each was predominately concerned with only a portion of those solution layers. This section describes each of those layers and offers guidance for your design discussions and implementation decisions.

A successful SQL Server AlwaysOn solution requires understanding and collaboration across these solution layers:

• Infrastructure level
  Server-level fault-tolerance and intra-node network communication use Windows Server Failover Clustering (WSFC) features for health monitoring and failover coordination.
• SQL Server instance level
  A SQL Server AlwaysOn Failover Cluster Instance (FCI) is a SQL Server instance that is installed across and can fail over to server nodes in a WSFC cluster. The nodes that host the FCI are attached to robust symmetric shared storage (SAN or SMB).
• Database level
  An availability group is a set of user databases that fail over together. An availability group consists of a primary replica and one to four secondary replicas. Each replica is hosted by an instance of SQL Server (FCI or non-FCI) on a different node of the WSFC cluster.
• Client connectivity
  Database client applications can connect directly to a SQL Server instance network name, or they may connect to a virtual network name (VNN) that is bound to an availability group listener. The VNN abstracts the WSFC cluster and Availability Group topology, logically redirecting connection requests to the appropriate SQL Server instance and database replica.

Figure 2 shows a logical topology of a representative AlwaysOn solution.

Figure 2. Logical representation of an AlwaysOn solution

High Availability and Disaster Recovery Solution

The recommended HADR solution for T24 deployments was designed based on the following:

• Incurring zero data loss when failing over to the disaster recovery site, assuming that there is a compatible network connection between the sites that is capable of synchronous data replication.
• Maximizing use of any Windows Server 2008 R2 features and capabilities that complement T24.
• Reducing the cost of Microsoft software licenses and specialized hardware (such as load balancers) to minimize the total cost of ownership.

The following decisions were made in the solution design. Refer to Figure 3 for further information.

• The disaster recovery site used for testing had only one server for each tier. If the disaster recovery site also requires high availability, the configuration used in the primary site should be used for the disaster recovery site.
• A DNS host record was created for the web-tier NLB IP to make the failover to the disaster recovery site transparent to the users (for example, T24Browser.CoE.Temenos.com). Using NLB and a DNS host record and avoiding the use of sticky sessions lets you add or remove web-tier servers transparently, without affecting users.
• T24Browser is a stateful application that normally deploys with a sticky-session configuration. Although this configuration provides the required functionality, it reduces the scalability of the T24 web tier. The user might lose the session if an application server goes down, reducing the availability. The solution presented in this white paper eliminates these limitations by removing sticky sessions. This is achieved by persisting the JBoss session state in the SQL Server database and configuring NLB to “Affinity: None”.
• The Windows Server 2008 R2 NLB feature is used to load balance the traffic into the JBoss application servers in the primary site. The same feature can be used for the disaster recovery site if there will be two or more disaster recovery nodes.
• T24Browser is capable of performing simple load balancing among the available T24 application servers when a load balancing solution is not available in the application tier. This feature is disabled in the recommended solution, and NLB is used instead with the “Affinity: None” configuration to achieve the best possible load balancing.
• A DNS host record was created for the application-tier NLB IP so that you have the option of failing over only the application tier to the disaster recovery site if necessary (for example, T24Server.CoE.Temenos.com). This is an optional configuration that is only required if a facility needs to simplify server maintenance and keep the T24Browser configurations identical in both sites. However, this option does create an additional step in the disaster recovery procedures.
• Using the NLB “Affinity: None” configuration makes it possible to add or remove application-tier servers transparently, without affecting online transactions.
• T24 typically accesses shared files and folders via a mapped drive letter in each T24 server. Since accidentally removing or changing the mapped drive letter can cause failures, file and folder symbolic links were created by using the “mklink” utility of Windows and used instead of the mapped drive letters to avoid unintended mistakes. Symbolic links make the shared files and folders imitate local entities, and therefore T24 can access them directly.
• The SQL Server 2012 HADR AlwaysOn (HADRON) configuration with a SQL Server 2012 Failover Cluster instance for the primary site is used to reduce the number of required SQL Server 2012 licenses. The primary site can have two standalone instances of SQL Server 2012 instead of the failover cluster instance if you need to remove the shared storage. However, this will require licenses for each SQL Server 2012 instance, while the failover cluster instance requires only one license regardless of the number of nodes in the cluster.
• The disaster recovery instance of SQL Server 2012 is configured as a SQL Server 2012 HADRON synchronous AlwaysOn replica for zero data loss. Synchronous replication requires a fast and stable network connection in order to work as expected. This needs to be taken into account when setting up the network. If you do not have a fast and stable network connection, implement asynchronous replication instead, but understand that asynchronous replication does have a possibility of data loss.
• A JBoss session persistence database was created in the same SQL Server 2012 HADRON configuration as the T24 database, therefore having the same high availability and disaster recovery capabilities. This makes management easier and reduces the steps in disaster-recovery procedures. You can, however, implement the JBoss session persistence database as a different instance, if required.
• The same Windows Server Failover Cluster that hosts the SQL Server 2012 clustered instance is used to host a clustered file share to keep T24 shared files and folders. The clustered file share increases the availability of the T24 shared files and folders.
• Windows Server 2008 R2 Distributed File System Replication (DFS-R) is implemented with an Active Directory Domain Services (AD DS)-published namespace to make the file share failover to the disaster recovery site transparent and to replicate T24 shared files/folders. The disaster recovery site has a local folder for T24 shared files/folders. Making the T24 shared files available in the disaster recovery site is not mandatory because T24 can recover without them; however, having the T24 shared files available has a positive impact. Therefore, DFS-R is scheduled to occur several times per day to reduce the overhead of the replication.

Figure 3. HADR solution

Setup and Configuration

This section describes how to configure the HADR solution.

SQL Server 2012 HADR Configuration

SQL Server 2012 HADR is configured with a clustered instance for the primary site and a standalone instance in the disaster recovery site. The configuration uses the AlwaysOn Availability Group to replicate database content and to provide transparent failover. Figure 4 shows a schematic of the solution.

Figure 4. SQL Server 2012 HADR solution

The Windows Server Failover Cluster consists of three nodes: two nodes in the primary site and one node in the disaster recovery site, with a SAN shared only between the two nodes in the primary site. The cost of the solution is reduced because there is no shared storage between nodes in the primary site and the node in the disaster recovery site, because there is no SAN in the secondary site, and because you do not need an expensive storage-level synchronization mechanism to replicate disk data content. The disaster recovery instance has only local storage where the database content is replicated by using the availability group.

A clustered SQL Server 2012 instance is primarily used to reduce the number of SQL Server 2012 licenses that are required. The primary site could have two standalone instances of SQL Server 2012 instead of the failover cluster instance if this is required to remove the shared storage; however, this option requires licenses for each SQL Server 2012 instance, while the failover cluster instance requires only one license regardless of the number of nodes in the cluster. The disaster recovery instance is configured as a synchronous replica for zero data loss. If the disaster recovery site also requires high availability, the same configuration used in the primary site needs to be available in the disaster recovery site.

When the recommended solution was tested, all of the SQL Server instances were created as named instances to make them easy to identify during maintenance and monitoring. Table 3 lists the names that were used in the test environment during setup; these names can be used as a reference guideline.

Table 3.
Names of SQL Server instances

Name: SQL11HA
Description: SQL Server 2012 instance name of the primary site. Because a named instance uses a dynamic TCP port by default, static TCP port 1533 was configured via the SQL Server Configuration Manager.

Name: SQL11DR
Description: SQL Server 2012 instance name of the disaster recovery site. Because a named instance uses a dynamic TCP port by default, static TCP port 1533 was configured via the SQL Server Configuration Manager.

Name: T24AG
Description: SQL Server 2012 AlwaysOn Availability Group name. This name is not used by T24; it is used in SQL Server Management Studio when it is required to fail over to the disaster recovery instance. The JBoss session persistence database was added to the same availability group in the test environment. This makes management easier, and disaster recovery failover becomes a single process for both databases.

Name: T24AgListener
Description: SQL Server 2012 AlwaysOn Availability Group listener name. This is the name T24 uses to connect to the SQL Server 2012 HADRON instance. When creating the listener, 1433 (the SQL Server default port) was used as the TCP port number to avoid having to change the T24 connection parameters to use a different port number.

Windows Server Firewall Configurations
The Windows Server Firewall is on by default; therefore, you need to create relevant inbound firewall exceptions in the servers for the configuration to work as expected. Table 4 shows the inbound firewall rules that need to be created in all the database servers.

Table 4. Firewall rules

Name: SQL11 (1533)
Description: Inbound firewall exception rule for TCP port 1533, which is the static port configured for the SQL Server instance.

Name: SQL11 Browser (1434)
Description: Inbound firewall exception rule for UDP port 1434, which is required for the SQL Server Browser when named instances exist.

Name: SQL11 AG (5022)
Description: Inbound firewall exception rule for TCP port 5022, which is required for the SQL Server 2012 HADRON Availability Group.

Name: SQL11 AG Listener (1433)
Description: Inbound firewall exception rule for TCP port 1433, which is configured for the SQL Server 2012 HADRON Availability Group Listener.
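The rules in Table 4 can also be created from a command prompt instead of the firewall console. The sketch below is illustrative only: it assumes the netsh advfirewall firewall add rule command that ships with Windows Server 2008 R2, and it merely prints one command per rule using the rule names and ports from Table 4. The function names are hypothetical, and the printed commands would still need to be reviewed and run from an elevated prompt on each database server.

```python
# Rule name, protocol, and port exactly as listed in Table 4.
FIREWALL_RULES = [
    ("SQL11 (1533)", "TCP", 1533),
    ("SQL11 Browser (1434)", "UDP", 1434),
    ("SQL11 AG (5022)", "TCP", 5022),
    ("SQL11 AG Listener (1433)", "TCP", 1433),
]


def netsh_command(name, protocol, port):
    """Build one inbound allow rule in netsh advfirewall syntax."""
    return (
        "netsh advfirewall firewall add rule "
        f'name="{name}" dir=in action=allow protocol={protocol} localport={port}'
    )


def build_commands(rules=FIREWALL_RULES):
    """Return the command text for every rule, in table order."""
    return [netsh_command(*rule) for rule in rules]


if __name__ == "__main__":
    for command in build_commands():
        print(command)
```

Generating the commands from one list keeps the rule names identical on every database server, which makes later auditing easier.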
T24 File Share Configuration
In the multi-server configuration, T24 is required to have a shared location for its working files and folders. Any single file is created or written by only one T24 instance and is read by all instances. There is no concern about file write locks; however, the share needs to be resilient for the multi-server configuration to function properly.

If T24 fails over to the disaster recovery site, making the T24 shared files available in the disaster recovery site is not mandatory because T24 can recover without them. However, having the shared files available does have a positive impact. A resilient file share solution with less frequent (once or twice a day) file replication to the disaster recovery site is therefore a good solution. Windows Server Clustered File Server, in conjunction with DFS-R, provides an optimal solution and does not require any additional licenses.

For simplicity, an Active Directory Domain Services (AD DS)-published DFS Namespace is used to refer to the shared file folder. Therefore, T24 can refer to the same path (namespace) for shared files, whether it is in the primary site or in the disaster recovery site. Figure 5 shows the T24 file share and file replication configuration.

Figure 5. File share and file replication

Windows Server Clustered File Share Configuration
The recommended SQL Server 2012 HADR configuration uses a Windows Server Cluster. Using the same cluster to host the file server reduces the complexity of the solution and simplifies management and monitoring. Since only the primary site servers in the cluster have access to the shared storage, the only possible owners of the file server are the servers in the primary site.
The file server, therefore, does not fail over to the disaster recovery site, and the disaster recovery instance of T24 will only have access to its local folders. A shared folder called “T24FileShare” was created in the file server and used as the resilient file share location of the primary site.

The disaster recovery site of the test environment had a single instance of T24; therefore, the folder for the shared files was created locally in the same server. If the disaster recovery site also uses a T24 multi-server configuration, the same type of file share needs to be created in the disaster recovery site. However, because the test environment had only a single T24 instance, and because having the shared files available in the disaster recovery site is not mandatory to T24, a local folder was created with the same shared folder name.

Distributed File System Replication Configuration
DFS-R was used to periodically replicate T24 shared files between the primary site and the disaster recovery site. The DFS replication was set up to replicate the files between the clustered file share in the primary site and the local folder in the T24 disaster recovery instance. The replication frequency was set to the lowest possible (once or twice per day) to avoid any performance implications.

Active Directory Domain Services DNS Configuration
To make the web-tier failover transparent to the users, you must have a DNS host record that users can refer to in order to reach T24Browser instead of the load balancer IP. In the test environment, the DNS host record “T24Browser.CoE.Temenos.com” was created for the web-tier Network Load Balancing IP. Failover to disaster recovery will therefore only require changing the IP address of the DNS host record, and users do not need to use a different URL.

You can also create a DNS host record for the application-tier servers if it is a requirement to be able to transparently fail over the application tier independently of the web tier. The DNS host record “T24Server.CoE.Temenos.com” was created for the application-tier NLB IP in the test environment. Note that this is an optional configuration that is helpful if you need to ease server maintenance and keep the T24Browser configurations identical in both sites. However, this configuration does add a step to the disaster recovery procedures.

One drawback of using DNS host records is that the client application using the name caches the IP address. Therefore, even if the IP address of the DNS host record is changed at the server side in a disaster recovery failover, the client application might still use the old IP address, and this old IP address might no longer be available. To minimize the chance of this happening, the “time to live” (TTL) value of the DNS host record needs to be adjusted. In the test environment, the TTL value was set to one minute, which means that the client application verified the DNS host record IP address with the server every minute.

While shorter TTL values can increase the load on the DNS server, they can be useful with critical services like web servers, application servers, and load balancers. TTL values are often reduced by the DNS administrator before service is moved to minimize disruptions. Table 5 shows the DNS host records that were created in the test environment.

Table 5. DNS host records

DNS Host Record: T24Browser.CoE.Temenos.com
Description: The Domain Name System (DNS) host record of the T24 web-tier load balancer that was used in the web browser URL to connect to T24Browser. The TTL value was set to one minute for testing.

DNS Host Record: T24Server.CoE.Temenos.com
Description: An optional DNS host record created for the T24 application-tier load balancer to test transparent failover of the application tier independently of the web tier. This was used by the T24Browser (configured in t24-ds.xml) to connect to the load balancer in the test environment. The TTL value was set to one minute for testing.

Application-Tier NLB Configuration
T24Browser is capable of performing simple load balancing among the available T24 application servers when a load balancing solution is not available in the application tier. However, specialized load balancing solutions can provide better load balancing capabilities. The NLB feature in Windows Server is a software load balancing solution that does not require additional licenses and complements T24 by providing a specialized load balancing solution. In the recommended solution, the NLB feature in Windows Server is enabled and configured in the T24 application servers in the primary site to create an NLB cluster consisting of the two servers. Figure 6 shows the application-tier NLB cluster.

Figure 6. Application-tier NLB cluster

If the disaster recovery site has multiple T24 application servers, an NLB cluster needs to be configured in those servers as well. The simple load balancing feature of T24Browser is disabled, and the NLB cluster name (T24Server.CoE.Temenos.com) is used as the T24 instance. This lets the network load balancing route the connections to the T24 instances in the cluster. Table 6 shows the NLB configurations used.

Table 6. NLB configurations

Configuration: Cluster operation mode
Description: “Multicast” operation mode was used to keep the network adapter’s built-in media access control (MAC) address. This was because the test servers had only one network adapter, and this network adapter had to be used for server management as well. If the server has multiple network adapters, the cluster operation mode can be set to “Unicast.”

Configuration: Protocol
Description: The protocol used for communication with T24 was TCP/IP.

Configuration: Port range
Description: The port range was limited to 20002, which is the T24 agent port configuration.

Configuration: Filtering mode
Description: “Affinity: None” was selected to achieve best possible load balancing.

T24 Application Server Configuration
The T24 application tier is configured with two T24 instances (nodes: App Node 1 and App Node 2) in the primary site and a single instance (node: App Node 3) in the disaster recovery site. Note that it is possible to have multiple T24 instances (application server nodes) in the disaster recovery site if high availability is a requirement for the disaster recovery site. The Windows Server 2008 R2 NLB feature was used to balance the T24 application servers. Using NLB with the “Affinity: None” configuration lets you add or remove application-tier servers transparently, without affecting online transactions. The HADR solution for the T24 file share is implemented by using a Windows Server 2008 R2 clustered file share and DFS-R. Figure 7 shows the T24 application tier configuration.

Figure 7. T24 application tier

The Temenos Application Framework ‘C’ (TAFC) is the execution environment for the T24 application. Install TAFC and the T24 application on all application servers (for installation guidance, contact Temenos).

Following is a description of how the T24 application servers were configured:
• All the T24 instances in the test environment used multiple server configurations with the required licenses. To use one instance of T24 on multiple servers, install the multiple application server module.
• When using multiple application servers, define port ranges for each T24 application server to avoid conflicts or deadlock situations during close of business. Ports can be assigned by using the following variable in each application server: JBCPORTNO= {port range}
• The same jbase_agent port must be used on all T24 application servers. The same port must be used because requests to the T24 servers are controlled by the load balancer, and therefore T24Browser sees only a single instance of T24 (load balancing cluster name), regardless of the number of T24 application servers available. The default jbase_agent port 20002 was used in the test environment.
• An inbound Windows firewall exception rule for TCP port 20002 was created to make the jbase_agent port accessible from T24Browser.
• The T24 database driver (Direct Connect Driver [DCD]) requires the SQL Server client to be installed on the server. At the time of testing, the DCD for the SQL Server 2012 Native Client was still in development. Therefore, the SQL Server 2008 R2 Native Client was used.
• Because the SQL Server 2012 HADR configuration is used for the database tier, the T24 database must be accessed via the SQL Server 2012 AlwaysOn Availability Group. For this reason, the availability group listener name was used in the T24 configuration instead of the database server IP address:
  File jedi_config, Record 'XMLMSSQL_FRMWRK'
  Command->
  0001 R12.100203    (Direct connect driver version)
  0002 T24AgListener]T24R12    (DB Server name]DB name)
  0003 T24User]uHdE9oJj8B5Y0cUF0hGh0A==]    (DB User/Password encrypted)
• Default database locking (SQL Server application lock) was used for the testing.
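To make the layout of the XMLMSSQL_FRMWRK record easier to follow, the sketch below splits the three attributes from the jedi_config listing into named fields. It is an illustration only: the "]" separator and the meaning of each field are read off the annotations in that listing, the function and field names are hypothetical, and this is not a TAFC utility.

```python
from typing import Dict, List

# Attribute lines as they appear in the jedi_config listing
# (attribute number, a space, then the value).
RECORD_LINES = [
    "0001 R12.100203",
    "0002 T24AgListener]T24R12",
    "0003 T24User]uHdE9oJj8B5Y0cUF0hGh0A==]",
]


def parse_record(lines: List[str]) -> Dict[str, str]:
    """Split each attribute on ']' following the annotations in the listing."""
    values = {}
    for line in lines:
        number, _, value = line.partition(" ")
        values[number] = value.strip()
    server, database = values["0002"].split("]")[:2]
    user, encrypted_password = values["0003"].split("]")[:2]
    return {
        "driver_version": values["0001"],
        "db_server": server,  # the availability group listener, not an IP address
        "db_name": database,
        "db_user": user,
        "db_password_encrypted": encrypted_password,
    }


if __name__ == "__main__":
    for key, value in parse_record(RECORD_LINES).items():
        print(f"{key}: {value}")
```

The field of interest is db_server: because it holds the listener name T24AgListener rather than a server address, the T24 configuration does not need to change when the availability group fails over.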
Limitations of Using the SQL 2008 R2 Native Client
During the testing, the SQL Server 2008 R2 Native Client was used with the T24 database driver because the DCD did not support SQL Server 2012 client libraries. The following limitations therefore apply to the SQL Server 2012 HADR AlwaysOn functionalities:
• Read-only routing for the availability group is not available.
• Application intent is not available.
• Optimizations for fast multi-subnet failover clustering are not available.

When the SQL Server 2012 Native Client is certified for use with T24, the considerations for client availability features shown in Table 7 will apply. For more information about connection string keywords, see: Using Connection String Keywords with SQL Server Native Client (http://msdn.microsoft.com/en-us/library/ms130822(v=sql.110).aspx).

Table 7. Client type considerations

Table 7 rates each driver as Yes, No, or Future date for each of the following client availability features:
• Multi-subnet failover
• Application intent
• Read-only routing
• Multi-subnet failover: faster single subnet endpoint failover
• Multi-subnet failover: named instance resolution for SQL Server clustered instances

The drivers covered are:
• SQL Server Native Client 11.0 ODBC
• SQL Server Native Client 11.0 OLE DB
• ADO.NET with Microsoft .NET Framework 4.0 update 4.0.2*
• ADO.NET with .NET Framework 3.5
• Microsoft Java Database Connectivity (JDBC) driver 4.0 for SQL Server

*ADO.NET with .NET Framework 4.0.2 patch download for connectivity improvement (http://support.microsoft.com/kb/2544514).

T24 Shared Files
For T24 multiple server installation, it is necessary to share certain files and folders among T24 application servers. T24 typically accesses shared files and folders via a mapped drive letter in each T24 server. However, accidentally removing or changing the mapped drive letter can cause failures. Therefore, file and folder symbolic links were created by using the Windows “mklink” utility instead of mapped drive letters to avoid unintended mistakes. Symbolic links make the shared files and folders act as local entities, so T24 can directly access them. If there are additional folders or files that need to be shared, appropriate symbolic links should be created.

Web-Tier NLB Configuration
When the web tier has multiple servers (nodes), there needs to be a mechanism to route the requests to the servers and to provide a single address to the requester (web browser), regardless of the number of servers in the tier. This functionality is typically provided by using a proxy server and/or a load balancer with redundancy to increase the availability of the service. The Network Load Balancing (NLB) feature of Windows Server does not have a single point of failure because the service works on the network layer of all the servers. Because it is a readily available feature in Windows Server, the NLB feature does not require additional licenses.

The NLB feature is enabled and configured in the web servers in the primary site to create an NLB cluster consisting of the two servers. Figure 8 shows the web-tier NLB cluster.

Figure 8. Web-tier NLB cluster

If the disaster recovery site has multiple web-tier servers, an NLB cluster needs to be configured in those servers as well. To make it possible to fail over the web tier to the disaster recovery site transparently, the DNS host record (T24Browser.CoE.Temenos.com) is used for the NLB cluster IP address. Therefore, even if there is a failover to the disaster recovery site, the web browser URL remains unchanged; in addition, using NLB with the DNS host record allows for adding or removing web-tier servers transparently and without affecting the users.

Typically, the T24Browser requires “Affinity: Single” (sticky-session) configuration because it is a stateful application. However, in the recommended solution, JBoss is configured to persist session states in the SQL Server database; therefore, it is possible to use the “Affinity: None” configuration in the load balancer. Not using sticky sessions increases the availability of the site. Table 8 shows the NLB configurations used.

Table 8. NLB configurations

Configuration: Cluster operation mode
Description: “Multicast” operation mode was used to keep the network adapter’s built-in media access control (MAC) address. This was because the test servers had only one network adapter, and this network adapter had to be used for server management as well. If the server has multiple network adapters, the cluster operation mode can be set to “Unicast.”

Configuration: Protocol
Description: TCP was used as the HTTP traffic transport over TCP/IP.

Configuration: Port range
Description: The port range was limited to 8080, which was the JBoss web site port range configured in the test environment.

Configuration: Filtering mode
Description: “Affinity: None” was selected to achieve best possible load balancing.

T24Browser Configuration
The T24 web tier is configured with two JBoss instances with T24Browser (nodes: Web Node 1 and Web Node 2) in the primary site and a single instance (node: Web Node 3) in the disaster recovery site. It is possible to have multiple JBoss/T24Browser instances (web server nodes) in the disaster recovery site if high availability is a requirement for the disaster recovery site. The Windows Server 2008 R2 NLB feature was used to balance the loads on the JBoss server nodes. Figure 9 shows the T24 web tier.

Figure 9. T24 web tier

JBoss Configuration
The JBoss application server 5.1.0 GA was used in the test environment to host the T24Browser Java Servlet application. No clustered instance of JBoss was installed in the web-tier servers. Following is the list of configurations that were made after successfully installing JBoss:
• Because of the limitations of JBoss cluster session replication and to avoid using sticky sessions, JBoss session persistence functionality was implemented using a SQL Server database.
• A JBoss session persistence database was created in the same SQL Server 2012 HADR configuration as the T24 database. Therefore, the JBoss session persistence database has the same high availability and disaster recovery capabilities as the T24 database. This makes management easier and reduces the number of steps in the disaster recovery procedures. (Note that the JBoss session persistence database can be implemented as a different instance if required.)
• An inbound Windows firewall exception rule for TCP port 8080 was created to make JBoss accessible to users.

T24Browser with AGENT Connection Method
After successful installation of the JBoss application server, T24Browser can be deployed and configured to use one of the two types of supported configurations, AGENT or JMS. For the online transactions used in this testing, the AGENT configuration is recommended. Detailed step-by-step setup and configuration can be requested from Temenos. Tables 9 and 10 show the settings that were configured in the T24Browser.

Table 9. Settings in browserParameters.xml

Parameter name: Server Connection Method
Description: Configuration of the connection to the T24 server. AGENT connection method was used for the testing.

Parameter name: ConnectionTimeout
Description: The connection expiration time if T24Browser does not get a response from the T24 application server.

Parameter name: RetryCount
Description: The number of retry attempts the T24Browser should make if it can’t reach T24 to successfully execute a transaction. This was set to 20 times.

Parameter name: RetryWait
Description: When retrying, the number of seconds to wait before attempting to retry the transaction.

Of the two time-based settings, ConnectionTimeout and RetryWait, one was set to 20 seconds and the other to 5 seconds.

Table 10. Settings in t24-ds.xml

Property name: Host
Description: A comma-separated list of available T24 servers. Because the NLB feature in Windows Server 2008 R2 is configured at the application tier, the name of the load balancing cluster needs to be used instead of the names of the T24 servers. The load balancing cluster “T24Server.CoE.Temenos.com” was used in the test environment.

Property name: Ports
Description: The jbase_agent TCP port number. All T24 instances in the test environment are configured to use TCP port 20002; therefore, 20002 is used as the jbase_agent port number.

Property name: loadBalancing
Description: To enable or disable the simple load balancing feature in T24Browser. This is set to “false” because the NLB feature in Windows Server 2008 R2 performs the load balancing in the recommended solution.

Property name: actionTimeout
Description: The number of seconds that the jbase_agent waits for a response from the T24 application server. This was set to 60 seconds in the test environment.

Disaster Recovery Procedures
The high availability solution described in this document implements “automatic failover” between the primary site servers (nodes). Human intervention is therefore not required. However, the disaster recovery failover might require additional procedures to be followed, because this is typically part of the business continuity plan. Therefore, the disaster recovery failover is intentionally designed to be manual. This section describes the disaster recovery procedures that were successfully tested for the recommended solution.

Figure 10 shows the three failover activities that are required. Note that the second failover activity is optional, and can be used if application-tier failover is implemented to ease maintenance activities. In addition, if the optional DFS-R is implemented, the “DFS Namespace” fails over automatically and manual failover is not required.

Figure 10. Failover to disaster recovery site

The steps required for the failover activities are described in detail in the sections that follow.
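Because the disaster recovery failover is manual, it can help to keep the activities as a short checklist. The sketch below is a hypothetical illustration, not part of the tested procedure: the activity names are assumptions inferred from the DNS switching and SQL Server failover sections of this paper (Figure 10 itself is not reproduced in the text), and the order simply follows the order of those sections.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Activity:
    name: str
    optional: bool = False


# Assumed activity names; the second one is optional, as stated in the text.
FAILOVER_ACTIVITIES = [
    Activity("Switch the web-tier DNS host record (T24Browser)"),
    Activity("Switch the application-tier DNS host record (T24Server)", optional=True),
    Activity("Fail over the SQL Server 2012 availability group (T24AG)"),
]


def build_runbook(app_tier_dns_record_exists: bool) -> List[str]:
    """Return the manual activities to perform, skipping the optional one if unused."""
    steps = []
    for activity in FAILOVER_ACTIVITIES:
        if activity.optional and not app_tier_dns_record_exists:
            continue
        steps.append(activity.name)
    return steps


if __name__ == "__main__":
    runbook = build_runbook(app_tier_dns_record_exists=True)
    for number, step in enumerate(runbook, start=1):
        print(f"{number}. {step}")
```

No DFS namespace step appears in the list because, as noted above, the namespace fails over automatically when DFS-R is implemented.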
Note that the steps in all sections need to be completed to successfully fail over to the disaster recovery site.

DNS Switching
Web-tier and application-tier DNS switching require changing the IP address of the DNS host records to the IP address of the relevant server (node) in the disaster recovery site. Following are the steps that need to be followed to change the IP addresses of the DNS host records:
1. Log on to the domain controller as the administrator.
2. Navigate to Server Manager.
3. Expand Roles, expand DNS Server, expand DNS, expand Server Name, and then expand Forward Lookup Zones.
4. Select the domain name (for example, CoE.Temenos.com). Note that T24Browser and optional T24Server are the DNS host records that require the IP changes (Figure 11).

Figure 11. Select the DNS host record

5. Right-click the DNS host record T24Browser, and then select Properties (Figure 12).

Figure 12. T24Browser DNS host record properties

6. Change the address in the IP address field to the IP address of the web-tier server in the disaster recovery site, and then click OK. If the disaster recovery site has more than one web-tier server, the IP address should be the IP address of the web-tier load balancer (NLB cluster).
7. If the “T24Server” DNS host record is also available, right-click the DNS host record, and then select Properties. Change the address in the IP address field to the IP address of the application-tier server in the disaster recovery site, and then click OK (Figure 13). If the disaster recovery site has more than one application-tier server, the IP address should be the IP address of the application-tier load balancer (NLB cluster).

Figure 13. T24Server DNS host record properties

SQL Server 2012 HADR Failover
The SQL Server 2012 HADR failover to the disaster recovery site might be required for the following two scenarios:
• Planned manual failover: Primary site database servers are available, but it is required to fail over to the disaster recovery site.
• Unplanned forced failover: Complete primary site or primary site database server failure, and the database servers in the primary site are not accessible.

Planned Manual Failover
When the failover is planned, there is no server downtime in the primary site, the Windows Server Failover Cluster (WSFC) is active, and databases are in “Synchronized” state in both primary and disaster recovery instances of SQL Server. However, before starting the failover procedure, make sure that the databases are in “Synchronized” state in both primary and disaster recovery instances of SQL Server (Figure 14 and Figure 15).

Figure 14. SQL Server primary instance database status

Figure 15. SQL Server disaster recovery instance database status

For more information about planned manual failover, see: Perform a Planned Manual Failover of an Availability Group (SQL Server) (http://msdn.microsoft.com/en-us/library/hh231018.aspx).

Limitations and Restrictions
• A failover command returns as soon as the target secondary replica has accepted the command. However, database recovery occurs asynchronously after the availability group has finished failing over.
• Cross-database consistency across databases within the availability group is not maintained during failover. Cross-database transactions and distributed transactions are not supported by AlwaysOn Availability Groups. For more information, see: Cross-Database Transactions Not Supported for Database Mirroring or AlwaysOn Availability Groups (SQL Server) (http://msdn.microsoft.com/en-us/library/ms366279.aspx).

Prerequisites and Restrictions
• The target secondary replica and the primary replica must both be running in synchronous-commit availability mode.
• The target secondary replica must currently be synchronized with the primary replica. This requires that all the secondary databases on this secondary replica must have been joined to the availability group and must be synchronized with their corresponding primary databases (that is, the local secondary databases must be synchronized). To determine the failover readiness of a secondary replica, query the is_failover_ready column in the sys.dm_hadr_database_replica_cluster_states dynamic management view (see: http://msdn.microsoft.com/en-us/library/hh213319.aspx) or look at the Failover Readiness column of the AlwaysOn Group Dashboard (see: http://msdn.microsoft.com/en-us/library/hh213474.aspx).
• This task is supported only on the target secondary replica. You must be connected to the server instance that hosts the target secondary replica.

Failover Procedure
Following are the steps that need to be followed to fail over the SQL Server 2012 HADR to the disaster recovery site.
1. Connect to the primary or secondary (disaster recovery) instance of SQL Server by using the SQL Server 2012 Management Studio (Figure 16).

Figure 16. SQL Server 2012 primary instance

2. Right-click the Availability Group (for example, T24AG), and then select Failover (Figure 17).

Figure 17. Select "Failover"

3. In the Fail Over Availability Group Wizard, click Next (Figure 18).

Figure 18. Fail Over Availability Group wizard – Introduction page

4. In the Select New Primary Replica page, select the secondary SQL Server instance if it is not already selected, and then click Next (Figure 19).

Figure 19. Fail Over Availability Group wizard – Select New Primary Replica page

5. In the Connect to Replica page, connect to the secondary instance by providing the credentials, and then click Next (Figure 20).

Figure 20. Fail Over Availability Group wizard – Connect to Replica page

6. Click Finish at the Summary page to start the failover (Figure 21).

Figure 21. Fail Over Availability Group wizard – Summary page

7. After the successful failover, the wizard will show a Results page similar to the following (Figure 22).

Figure 22. Fail Over Availability Group wizard – Results page

The “Validating WSFC quorum vote configuration” warning appears because of the special quorum configuration used in this solution and is safe to ignore (Figure 23).

Figure 23. Fail Over Availability Group wizard – WSFC quorum configuration warning

8. Check the database status and Availability Group status in SQL Server 2012 Management Studio to verify the failover (Figure 24).

Figure 24. Management Studio after Fail Over Availability Group wizard
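The wizard steps for the planned failover have a Transact-SQL equivalent that can be useful for scripting or rehearsing the procedure. The sketch below only builds the statements as text and does not connect to SQL Server. It assumes the is_failover_ready column of sys.dm_hadr_database_replica_cluster_states, filtered to the local replica through sys.dm_hadr_availability_replica_states, and the ALTER AVAILABILITY GROUP ... FAILOVER statement, which has to be run while connected to the target secondary replica. The helper names and the sample rows are hypothetical; T24AG is the group name from Table 3.

```python
def quote_name(identifier: str) -> str:
    """Bracket-quote a SQL Server identifier, escaping closing brackets."""
    return "[" + identifier.replace("]", "]]") + "]"


def readiness_query() -> str:
    """Query to run on the target secondary replica before a planned failover."""
    return (
        "SELECT dcs.database_name, dcs.is_failover_ready "
        "FROM sys.dm_hadr_database_replica_cluster_states AS dcs "
        "JOIN sys.dm_hadr_availability_replica_states AS ars "
        "ON dcs.replica_id = ars.replica_id "
        "WHERE ars.is_local = 1;"
    )


def planned_failover_statement(group_name: str) -> str:
    """Planned manual failover; run on the target secondary replica."""
    return f"ALTER AVAILABILITY GROUP {quote_name(group_name)} FAILOVER;"


def all_ready(rows) -> bool:
    """rows: (database_name, is_failover_ready) tuples returned by readiness_query()."""
    rows = list(rows)
    return bool(rows) and all(ready == 1 for _, ready in rows)


if __name__ == "__main__":
    print(readiness_query())
    # Sample rows for illustration; the second database name is a placeholder.
    sample_rows = [("T24R12", 1), ("JBossSessionDB", 1)]
    if all_ready(sample_rows):
        print(planned_failover_statement("T24AG"))
```

Checking every database in the group before issuing the failover mirrors the prerequisite that all local secondary databases must be synchronized.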
Unplanned Forced Failover
When the primary site or the database servers (nodes) in the primary site are not available, the Windows Server Failover Cluster (WSFC) will not have quorum to bring the cluster online. Therefore, WSFC needs to be deliberately started (forced) before the database failover. After bringing the WSFC online with a forced quorum, the SQL Server 2012 AlwaysOn Availability Group needs to force failover to the disaster recovery instance. For more information about unplanned forced failover, see: Perform a Forced Manual Failover of an Availability Group (SQL Server) (http://msdn.microsoft.com/en-us/library/ff877957(SQL.110).aspx).

Limitations and Restrictions
• Data loss is possible during the forced failover of an availability group. In addition, if the primary replica is running when you initiate a forced failover, client computers might still be connected to former primary databases. Therefore, it is strongly recommended that you force failover only if the primary replica is no longer running and if you are willing to risk losing data to restore access to databases in the availability group.
• When a database on a secondary replica is in the REVERTING or INITIALIZING state, forcing failover causes the database to fail to start as a primary database. If the database was in the INITIALIZING state, you will need to apply the missing log records from a database backup or fully restore the database from scratch. If the database was in the REVERTING state, you will need to fully restore the database from backups.
• A failover command returns as soon as the target secondary replica has accepted the command. However, database recovery occurs asynchronously after the availability group has finished failing over.
• Cross-database consistency across databases within the availability group is not maintained upon failover. Cross-database transactions and distributed transactions are not supported by AlwaysOn Availability Groups. For more information, see: Cross-Database Transactions Not Supported for Database Mirroring or AlwaysOn Availability Groups (SQL Server) (http://msdn.microsoft.com/en-us/library/ms366279.aspx).

Prerequisites and Restrictions
• Windows Server Failover Cluster (WSFC) needs to be brought online with a forced quorum. For more information about the forced quorum procedure, see: WSFC Disaster Recovery through Forced Quorum (SQL Server) (http://msdn.microsoft.com/en-us/library/hh270277.aspx).
• You must be able to connect to the server instance that hosts the target secondary replica.

Failover Procedure
When the primary site or the primary site database servers are not available, the only accessible database server will be the disaster recovery instance. The following shows how the Windows Server Failover Cluster and the SQL Server instance can be seen in the disaster recovery database server (Figure 25 and Figure 26).

Figure 25. Windows Server Failover Cluster without quorum

Figure 26. SQL Server 2012 – primary site failure

To bring the database online in the disaster recovery site, you first need to start Windows Server Failover Cluster with forced quorum, followed by SQL Server 2012 availability group forced failover. The following sub-sections provide the steps required to bring the database online. The steps in all sections need to be completed to successfully fail over to the disaster recovery site.

Force Cluster Start with Force Quorum
Following are the steps that need to be followed to force the cluster to start in the disaster recovery site with force quorum:
1. Log on to the disaster recovery database server with a domain account that has administrator privileges to the local computer.
2. Open Server Manager, expand Features, and then expand Failover Cluster Manager. Select the cluster (Figure 27).

Figure 27. Failed cluster due to quorum vote

3. Click Force Cluster Start in the Actions pane (Figure 28).

Figure 28. Cluster Manager - force cluster start option

4.
Confirm force cluster start

Figure 30. Cluster force start in progress

5.
Figure 29. Confirm the action by selecting Yes – Force my cluster to start option (Figure 29). Cluster start will take some time—wait till the cluster starts successfully (Figure 30). Open SQL Server 2012 Management Studio and connect to the SQL Server disaster recovery instance (Figure 32).6. the cluster will look like the following figure in the Failover Cluster Manager (Figure 31). Cluster started with force quorum Force Failover SQL Server 2012 Availability Group Once the Windows Server Failover Cluster is online with force quorum. Figure 31. Figure 32. the following steps need to be followed to force failover in the SQL Server 2012 availability group: 1. After the cluster starts.SQL Server instance before forced failover The Microsoft High Availability and Disaster Recovery Solution for TEMENOS T24 44 . click Next (Figure 34). Fail Over Availability Group wizard – Introduction page The Microsoft High Availability and Disaster Recovery Solution for TEMENOS T24 45 . Right-click on the Availability Group (for example. Figure 33. and then select Failover (Figure 33). Start force failover 3. T24AG).2. In the Fail Over Availability Group Wizard. Figure 34. In the Select New Primary Replica page. Because the cluster quorum is forced. Also note the warning. Click Next (Figure 35 and Figure 36).4. Fail Over Availability Group wizard – Select New Primary Replica page warning The Microsoft High Availability and Disaster Recovery Solution for TEMENOS T24 46 . Fail Over Availability Group wizard – Select New Primary Replica page Figure 36. Figure 35. select the secondary SQL Server instance if it is not already selected. the quorum status is showing as “Forced Quorum”. Because the database status is not synchronized. Select and confirm failover with potential data loss. Fail Over Availability Group wizard – Potential Data Loss confirmation The Microsoft High Availability and Disaster Recovery Solution for TEMENOS T24 47 . and then click Next (Figure 37).5. 
SQL Server warns about potential data loss. there is no data loss if the databases were in “Synchronized” state at the time of the site failure Figure 37. However. Figure 38. After the successful force failover. Click Finish on the Summary page to start the failover (Figure 38). Fail Over Availability Group wizard – Results page The Microsoft High Availability and Disaster Recovery Solution for TEMENOS T24 48 .6. Fail Over Availability Group wizard – Force Failover Summary page 7. Figure 39. wizard will show the “Results” page (Figure 39). Figure 41. Management Studio after Fail Over Availability Group wizard Additional Considerations It is highly recommended that you change the cluster quorum configuration if planned (scheduled maintenance) or unplanned (primary site disaster) shutdown of all cluster nodes in the primary site occurs. the database status and availability group status in SQL Server 2012 Management Studio will look like the following figure (Figure 41). Fail Over Availability Group wizard – WSFC Quorum Configuration warning 8. The Microsoft High Availability and Disaster Recovery Solution for TEMENOS T24 49 .The “Validating WSFC quorum vote configuration” warning appears because of the special quorum configuration that is used in the recommended solution and is safe to ignore (Figure 40). Figure 40. the entire cluster might shut down because of insufficient quorum vote availability. If you do not change the cluster quorum configuration. After successful force failover. and if the disaster recovery SQL Server 2012 instance becomes active as the primary instance for an extended period of time. and faster failover with no additional cost. see the Microsoft Support article at http://support. better scalability. and change the value for the cluster nodes in the primary site to 0. For more information.  Using the NLB feature in Windows Server provides better stability.microsoft. 
 Removing the sticky-session requirement in T24Browser makes the solution more reliable and scalable. The Microsoft High Availability and Disaster Recovery Solution for TEMENOS T24 50 .  JBoss session persistence increases the reliability and provided better scalability for the solution.  T24 works well with a configuration that uses the NLB feature in Windows Server and provides faster application-tier failover. Otherwise. Findings and Carryovers The following findings and carryovers were noted during the testing of the proposed solution in this document. change the FSW location to be in the disaster recovery site. it is highly recommended that you add a second node in the disaster recovery site and modify the cluster quorum configuration accordingly.  SQL Server 2012 HADR and AlwaysOn provides simplified disaster recovery failover while maintaining database replica in the disaster recovery site.com/kb/2494036/en-us. Therefore. Shutting down only one node in the primary site will not affect cluster availability as long as the second node in the primary site will be still up and running along with the File Share Witness (FSW).  If the FSW in the primary site will not be available and cannot be contacted by the cluster node in the disaster recovery site. this should only be done for a limited amount of time. NLB also lets you transparently add or remove nodes in the web and application tiers. Running the entire system with only one node in the disaster recovery site will not guarantee high availability.  A JBoss session persistence database in the same SQL Server 2012 AlwaysOn Availability Group reduces the administrative work and reduces the steps in the disaster recovery procedures. Change the value for the disaster recovery cluster node property “NodeWeight” to 1. 51 The Microsoft High Availability and Disaster Recovery Solution for TEMENOS T24 . o NOTE Currently there is no released service pack for SQL Server 2012. o NOTE Currently. 
Service Pack 1 (SP1) for Windows Server 2008 R2 is available and certified by both Microsoft and Temenos. Recommended Hotfixes and Service Packs The following best practices apply to the recommended configuration:   Regularly check and apply all the security hotfixes for Windows Server 2008 R2.  DNS host records used for the load balancer IP addresses make disaster recovery failover transparent at the web and application tiers. Windows Server DFS-R with DFS Namespace published in Active Directory Domain Services provides a unique URL that can be used to refer the file share. SQL Server 2012 does not have any security hotfixes released. Regularly check and apply the latest available service pack for Windows Server 2008 R2 after checking with Temenos about the supportability.microsoft. o NOTE Currently.  File and folder symbolic links make the shared file/folder access more resilient. regardless of the system that is operating in the primary or the disaster recovery environment.  A clustered instance of SQL Server 2012 for high availability reduces licensing requirements.com/kb/2687741/en-us  Regularly check and apply all the security hotfixes for SQL Server 2012. A hotfix that improves the performance of the "AlwaysOn Availability Group" feature in SQL Server 2012 is available for Windows Server 2008 R2 http://support.  Regularly check and apply the latest available service pack for SQL Server 2012 after checking with Temenos about the supportability.  A SQL Server 2012 AlwaysOn Availability Group eliminates SAN replications.com/kb/980054/en-us  As a special “out-of-band” recommended hotfix for Windows Server 2008 R2.microsoft.  Regularly check and apply the pertinent hotfixes mentioned in the following knowledge base (KB) article to enhance stability and fix known critical bugs (not security related). please install the following hotfix on all the cluster nodes in the primary and disaster recovery sites. 
Recommended hotfixes and updates for Windows Server 2008 R2–based server clusters http://support. aspx Active Secondaries: Readable Secondary Replicas (AlwaysOn Availability Groups) http://msdn. it is not necessary to install the previous hotfix. For a list of released CUs for SQL Server 2012. The SQL Server 2012 builds that were released after SQL Server 2012 was released http://support. Cumulative update package 1 for SQL Server 2012 http://support. Additional Resources Following are links for further information.com/en-us/library/ff929171.microsoft.microsoft.com/kb/2692828/en-us Finally.  Regularly check for latest “cumulative update” (CU) release for SQL Server 2012.com/en-us/library/ff878253.redmond.microsoft. see the following KB article.microsoft.microsoft. SQL Server 2012  Books Online for SQL Server 2012 http://msdn.aspx Database Availability Key Capabilities and Concepts: o o Failover Clustering and AlwaysOn Availability Groups (SQL Server) http://msdn.aspx o o  Instance Availability Key Capabilities and Concepts: The Microsoft High Availability and Disaster Recovery Solution for TEMENOS T24 52 .com/fwlink/?LinkId=201271 Perform a Forced Manual Failover of an Availability Group (SQL Server) http://msdn. it is highly recommended that you check periodically with the Microsoft Support Service for any recommended non-security related hotfixes for Windows Server 2008 R2 and SQL Server 2012. As a special “out-of-band” recommended hotfix for SQL Server 2012.microsoft.microsoft.corp. install the following update package on all the SQL Server 2012 instances in the primary and disaster recovery sites. 
review the fixed bugs and install only if you are affected and after checking with Temenos about supportability.aspx   Database Availability Step-by-Step Guide: o Deploying a new Availability Group http://msdnstage.com/kb/2679368/en-us NOTE If a more recent update is available.aspx#RelatedTasks Create or Configure an Availability Group Listener (SQL Server) http://go.com/en-us/library/ff877957.com/en-us/library/ms130214.com/enus/library/ff877884.microsoft. com/en-us/sqlserver/gg508768(l=en-us) Hardware and Software Requirements for Installing SQL Server 2012 http://msdn.aspx#SystemReqsForAOAG Before Installing Failover Clustering http://msdn.aspx Introducing SQL Server AlwaysOn http://msdn.com/en-us/library/ms143506.com/en-us/library/ms189910.microsoft.aspx Add or Remove Nodes in a SQL Server Failover Cluster (Setup) http://msdn.com/en-us/library/ms179530.microsoft.com/en-us/library/ff877931.com/en-us/library/ff878487.microsoft.com/download/D/2/0/D20E1C5F-72EA-4505-9F26FEF9550EFD44/Microsoft%20SQL%20Server%20AlwaysOn%20Solutions%20Guide%20for% 20High%20Availability%20and%20Disaster%20Recovery.microsoft.aspx         The Microsoft High Availability and Disaster Recovery Solution for TEMENOS T24 53 .microsoft.com/en-us/library/ff878716.microsoft.microsoft.docx Availability Modes http://msdn.aspx Configure FailureConditionLevel Property Settings http://msdn.aspx Microsoft SQL Server AlwaysOn Solutions Guide for High Availability and Disaster Recovery http://download.com/en-us/sqlserver/gg490638 Overview of AlwaysOn Availability Groups http://msdn.aspx  Instance Availability Step-by-Step Guide: o o o SQL Server Multi-Subnet Clustering http://msdn.com/en-us/library/ff878667.o Failover Policy for Failover Cluster Instances http://msdn.microsoft.aspx View and Read Failover Cluster Instance Diagnostics Log http://msdn.microsoft.com/en-us/library/ff877884.aspx Prerequisites.microsoft.aspx   AlwaysOn FAQ for SQL Server 2012 
http://msdn.com/en-us/library/ms191545.microsoft. and Recommendations for AlwaysOn Availability Groups http://msdn.microsoft.microsoft.aspx Create a New SQL Server Failover Cluster (Setup) http://msdn.microsoft. Restrictions.com/en-us/library/ff878700.com/en-us/library/ff878664. com/en-us/library/ff878176.microsoft.com/en-us/library/cc646023.aspx SQL Server 2012 AlwaysOn: Multisite Failover Cluster Instance http://sqlcat.aspx Monitor Availability Groups http://msdn.aspx Configure Read-Only Routing on an Availability Group (SQL Server) http://msdn. and Application Failover (SQL Server) http://msdn.aspx Availability Group Listeners.aspx Configure Read-Only Access on an Availability Replica (SQL Server) http://msdn.microsoft.microsoft.microsoft. Client Connectivity. AlwaysOn Failover Cluster Instances http://msdn.microsoft.com/sqlcat/b/whitepapers/archive/2011/12/22/sql-server-2012alwayson_3a00_-multisite-failover-cluster-instance.microsoft.microsoft.aspx Client Connection Access to Availability Replicas (SQL Server) http://msdn.com/en-us/library/hh710054.com/en-us/library/hh213417.aspx Configure the Windows Firewall to Allow SQL Server Access http://msdn.aspx Enable and Disable AlwaysOn Availability Groups (SQL Server) http://msdn.aspx Configure Read-Only Access on an Availability Replica http://msdn.com/en-us/library/ff877943.microsoft.aspx AlwaysOn Availability Groups Dynamic Management Views and Functions http://msdn.com/en-us/library/hh213002.aspx Create or Configure an Availability Group Listener (SQL Server) http://msdn.com/en-us/library/ff878259.aspx Perform a Forced Manual Failover of an Availability Group http://msdn.microsoft.com/en-us/library/ff878349.com/en-us/library/ff877957.com/en-us/library/hh213080.com/en-us/library/ms189134.com/en-us/library/ff878305.microsoft.microsoft.microsoft.aspx Creating an Availability Group (SQL Server) http://msdn.com/en-us/library/hh213002.com/en-us/library/hh510184.microsoft.aspx Manually Prepare a Secondary Database for 
an Availability Group (SQL Server) http://msdn.microsoft.aspx               The Microsoft High Availability and Disaster Recovery Solution for TEMENOS T24 54 . com/en-us/library/ff878664(SQL.microsoft.110).10).com/kb/918992/en-us SQL Server Web site http://www.110).microsoft.aspx         The Microsoft High Availability and Disaster Recovery Solution for TEMENOS T24 55 .com/kb/319723/en-us How to transfer the logins and the passwords between instances of SQL Server 2005 and SQL Server 2008 http://support.com/en-us/library/cc732035(WS.aspx Failover Cluster Step-by-Step Guide: Validating Hardware for a Failover Cluster http://technet.microsoft.microsoft.com/windowsserver2008/en/us/failover-clustering-main.aspx Failover Cluster Step-by-Step Guide: Configuring Accounts in Active Directory http://technet.microsoft.com/sqlserver SQL Server Tech Center http://technet.microsoft.microsoft.microsoft.microsoft.110).microsoft.10).com/en-us/library/cc755009.com/en-us/library/hh270275(v=SQL.com/en-us/library/cc770620(v=ws.com/en-us/library/cc753969.microsoft.microsoft.microsoft.aspx Checklist: Create a Failover Cluster http://technet.aspx Failover Cluster Step-by-Step Guide: Configuring the Quorum in a Failover Cluster http://technet.aspx Configure Cluster Quorum NodeWeight Settings http://msdn.com/en-us/sqlserver SQL Server Dev Center http://msdn. How to use Kerberos authentication in SQL Server http://support.com/en-us/library/cc731002(WS.aspx Checklist: Create a Clustered File Server http://technet.microsoft.aspx Force a WSFC Cluster to Start Without a Quorum http://msdn.com/en-us/library/hh270281(SQL.com/en-us/sqlserver     Windows Server Failover Cluster  Windows Server | Failover Clustering and Node Balancing http://www.aspx Failover Policy for Failover Cluster Instances http://msdn.10). 
microsoft.aspx Network Load Balancing parameters http://technet.com/en-us/library/cc770558(v=ws.microsoft.microsoft.aspx?FamilyID=d24c373e-bafc-4e31-b1b2d86584a12ca4&displaylang=en      The Microsoft High Availability and Disaster Recovery Solution for TEMENOS T24 56 . Recommended hotfixes and updates for Windows Server 2008 R2-based server clusters http://support.com/kb/2687741/en-us  Network Load Balancing  Network Load Balancing http://technet.microsoft.microsoft.microsoft.com/en-us/library/cc755161.aspx NLB 101: How NLB balances network traffic http://blogs.com/b/networking/archive/2008/10/01/nlb-101-how-nlb-balancesnetwork-traffic.technet.aspx Network Load Balancing: Configuration Best Practices for Windows 2000 and Windows Server 2003 http://www.com/en-us/library/cc778263.aspx Specifying the Affinity and Load-Balancing Behavior of the Custom Port Rule http://technet.com/kb/980054/en-us A hotfix that improves the performance of the "AlwaysOn Availability Group" feature in SQL Server 2012 is available for Windows Server 2008 R2 http://support.com/en-us/library/cc759039.com/downloadS/details.10).microsoft.aspx Upgrading the Network Load Balancing Cluster (to 2008) http://technet. and solutions that help people and businesses realize their full potential. Temenos serves more than 1. private. corporate. and microfinance and community banks. incorporating best-practice processes that take advantage of Temenos’ experience in 700 implementations around the globe.About Temenos Founded in 1993 and listed on the Swiss Stock Exchange (SIX: TEMN). services. For more information.microsoft. For more information. visit: www. Islamic. visit: www.temenos.com The Microsoft High Availability and Disaster Recovery Solution for TEMENOS T24 57 .500 customers in 125 countries. universal. Microsoft (Nasdaq "MSFT") is the worldwide leader in software. Temenos Group AG is the market-leading provider of banking software systems to retail. 
Temenos’ software products provide advanced technology and rich functionality.com About Microsoft Founded in 1975. Headquartered in Geneva with more than 60 offices worldwide.
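Supplementary illustration: quorum vote arithmetic

The quorum guidance under Additional Considerations can be illustrated with a little vote arithmetic. The following Python sketch is only an illustration and is not part of the tested solution. It assumes a simple majority rule (the cluster has quorum when strictly more than half of all configured votes are reachable) and a hypothetical layout of two primary-site nodes, one disaster recovery node, and one File Share Witness that starts in the primary site. The node names and the `has_quorum` helper are invented for this example.

```python
def has_quorum(votes, reachable):
    """Return True when strictly more than half of the configured votes are reachable.

    votes     -- mapping of voter name to its vote weight (0 or 1)
    reachable -- set of voter names that can currently be contacted
    """
    total = sum(votes.values())
    online = sum(weight for name, weight in votes.items() if name in reachable)
    return online * 2 > total


# Assumed starting layout: primary nodes and the FSW vote, the DR node does not.
normal = {"PRI-NODE1": 1, "PRI-NODE2": 1, "DR-NODE1": 0, "FSW": 1}

# One primary node is shut down: 2 of 3 votes remain, so the cluster stays online.
one_node_down = has_quorum(normal, {"PRI-NODE2", "DR-NODE1", "FSW"})

# The whole primary site (including the FSW) is lost: 0 of 3 votes remain,
# which is why the cluster has to be started with forced quorum.
site_lost = has_quorum(normal, {"DR-NODE1"})

# After reconfiguration: DR node weight 1, primary nodes 0, FSW moved to the DR site.
reconfigured = {"PRI-NODE1": 0, "PRI-NODE2": 0, "DR-NODE1": 1, "FSW": 1}
dr_only = has_quorum(reconfigured, {"DR-NODE1", "FSW"})

print(one_node_down, site_lost, dr_only)  # True False True
```

Under these assumptions, the reconfigured cluster keeps quorum with the primary site down, but it still depends on a single node, which is consistent with the recommendation to add a second node in the disaster recovery site if this state lasts for an extended period.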
