High Availability Test Plan of a BizTalk Solution – A sample of a real approach

The following is a real example of an HA test plan. For reference, see the previous posts on this topic:

High Availability Test Plan of a BizTalk Solution

High Availability Test Plan of a BizTalk Solution – Methodology – requirements and expectation  

An example of a highly available setup involves 2 or more ACT clients generating HTTP stress against 2 or more ‘Web Servers’ set up to use NLB. The ASP pages on the ‘Web Servers’ create objects that reside on the middle-tier application servers running CLB [only proxies of these objects exist on the ‘Web Servers’]. These ‘Web Servers’ and ‘Application Servers’ then use features provided by backend systems that are set up using MSCS, for example clustered SQL Server, clustered Commerce Server or clustered BizTalk Server.

We can catalogue some basic scenarios, and for each of them there is a correct strategy:

2 Box Configuration [Separate SQL]:

1 Box BizTalk [Receive/Transmit/Scheduler/Map/Transformation]
This machine will also be used to Receive FILE/MSMQt messages.
MSMQt messages will always be received locally by BizTalk.
BizTalk will simulate the behavior of an MSMQ server.

1 Box SQL Server

This machine will host the BizTalkMsgBoxDb, BizTalkMgmtDb & the Trackingdb.

image

3 Box Configuration [Separate Receive/Transmit-Process + Separate SQL]:

The Receive functionality will be separated from the Process/Transmit functionality to spread the load.

image

4 Box Configuration [Multiple Receive + 1 SQL + 1 Process/Transmit]

This is similar to the 3 machine configuration with multiple receive locations.

image

4 Box Configuration [Multiple Transmit + 1 SQL]

This is similar to the 3 machine configuration with multiple Process/Transmit locations.

image

4 Box Configuration [Separate Receive/Transmit + Multiple MsgBox]

This is similar to the 3 machine configuration with multiple SQL message boxes.
Note: This does not provide failover, because each MsgBox database is unique.

image


HA deployment [Clustering/NLB]

This is a Highly Available deployment configuration where failover is provided by adding multiple machines. The Receive & Transmit machines are configured to run as NLB clusters.

image

Large Scale Deployment [Clustering/NLB/ISA/DMZ]

This is a Highly Available deployment configuration where failover is provided by adding multiple machines. The Receive & Transmit machines are configured to run as NLB clusters. The SQL Servers are also configured to run on MSCS clusters on RAID 5 disks providing total failover protection.

image

All the above configurations will run on a separate isolated domain on a private VLAN.

Data Centre Certification

Identify areas that are potential candidates for HA [E.g. NT Services] failover testing.
Testing required throughout the product cycle to ensure that the product is certifiable.
Test scripts to be written to simulate failover scenarios.
Methods of deploying multiple machines [including DC setup].
Select scenario that will be run during failover tests.
Ensure the product passes all tests performed by the tools [Job Objects/Failover/Recoverability/Data Persistence/Data Integrity].

 

Example:

The following diagram depicts a large-scale deployment with an NLB-clustered front end and multiple MSCS-clustered backend systems.
The following is a sample plan for testing a BizTalk solution in a distributed environment.

Protocols:

For example, the following 5 protocols [2-way Receive/Send] will be tested on the various deployment configurations.

  • FILE
  • HTTP
  • MSMQt
  • SOAP: This will be tested using Web Services.
  • SMTP: This will only be used to Transmit Messages one-way.

Testing Permutations:

With 4 Receive protocols and 5 Transmit protocols, testing every combination yields the 20 permutations listed below.

Rx      Tx      XLANG/S   XLANG/S
FILE    FILE    Y         N
FILE    HTTP    Y         N
FILE    MSMQ    Y         N
FILE    SOAP    Y         N
FILE    SMTP    Y         N
HTTP    FILE    Y         N
HTTP    HTTP    Y         N
HTTP    MSMQ    Y         N
HTTP    SOAP    Y         N
HTTP    SMTP    Y         N
MSMQ    FILE    Y         N
MSMQ    HTTP    Y         N
MSMQ    MSMQ    Y         N
MSMQ    SOAP    Y         N
MSMQ    SMTP    Y         N
SOAP    FILE    Y         N
SOAP    HTTP    Y         N
SOAP    MSMQ    Y         N
SOAP    SOAP    Y         N
SOAP    SMTP    Y         N

Only 4 of the protocols can be used to Receive (SMTP is transmit-only), while all 5 can be used to Transmit, which gives 4 × 5 = 20 permutations. Each permutation can also be run with and without the XLANG/S scheduler.

This leads to 20 × 2 = 40 possible combinations.
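
As a quick check of the arithmetic, the full test matrix can be enumerated with a few lines of Python. This is only a sketch: the protocol names come from the list above, while the names RECEIVE, TRANSMIT, SCHEDULER and test_matrix are made up for illustration.

```python
from itertools import product

# Protocols as listed in the plan; SMTP is transmit-only, so it is not in RECEIVE.
RECEIVE = ["FILE", "HTTP", "MSMQt", "SOAP"]
TRANSMIT = ["FILE", "HTTP", "MSMQt", "SOAP", "SMTP"]
SCHEDULER = [True, False]  # run with and without the XLANG/S scheduler


def test_matrix():
    """Return every (rx, tx, use_scheduler) combination."""
    return list(product(RECEIVE, TRANSMIT, SCHEDULER))


matrix = test_matrix()
print(len(RECEIVE) * len(TRANSMIT))  # Rx/Tx permutations
print(len(matrix))                   # permutations x scheduler on/off
```

Each tuple in the matrix can then be fed to whatever harness drives a single test run.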

To avoid the complexity in managing multiple scenarios, a simple scenario has been built that exercises all of the above protocols both ways. This scenario will be configurable, whereby Receive/Send locations can be Enabled/Disabled with a simple script.
This sample scenario will also permit Enabling/Disabling the code path that uses or skips the Scheduler. Although this scenario will be comprehensive, it will be easier to manage from a testing perspective. The objective of the testing in HA is not to exercise Stress/Perf, but to ensure a successful failover.
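
One possible shape for such a configurable scenario is sketched below. It only models the on/off switches in memory; it does not talk to BizTalk, and ScenarioConfig, set_receive, set_send and active_paths are hypothetical names invented for this example. A real script would have to apply the same switches to the actual Receive/Send locations.

```python
from dataclasses import dataclass, field

ALL_RECEIVE = {"FILE", "HTTP", "MSMQt", "SOAP"}
ALL_SEND = {"FILE", "HTTP", "MSMQt", "SOAP", "SMTP"}


@dataclass
class ScenarioConfig:
    """Hypothetical switchboard for the single comprehensive scenario."""
    enabled_receive: set = field(default_factory=lambda: set(ALL_RECEIVE))
    enabled_send: set = field(default_factory=lambda: set(ALL_SEND))
    use_scheduler: bool = True  # False skips the Scheduler code path

    def set_receive(self, protocol, enabled):
        self._toggle(self.enabled_receive, ALL_RECEIVE, protocol, enabled)

    def set_send(self, protocol, enabled):
        self._toggle(self.enabled_send, ALL_SEND, protocol, enabled)

    @staticmethod
    def _toggle(target, allowed, protocol, enabled):
        if protocol not in allowed:
            raise ValueError(f"unsupported protocol: {protocol}")
        if enabled:
            target.add(protocol)
        else:
            target.discard(protocol)

    def active_paths(self):
        """Rx/Tx pairs that the current switches leave active."""
        return sorted((rx, tx) for rx in self.enabled_receive
                      for tx in self.enabled_send)


# Narrow the scenario down to FILE in, HTTP out, without the Scheduler.
config = ScenarioConfig(use_scheduler=False)
for protocol in ALL_RECEIVE - {"FILE"}:
    config.set_receive(protocol, False)
for protocol in ALL_SEND - {"HTTP"}:
    config.set_send(protocol, False)
print(config.active_paths())
```

Keeping all switches in one object means a failover run can record exactly which paths were live when the failure was injected.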

Principal Testing Defined:

Test scripts/code will be developed that simulate real world contingency situations.
E.g. Loss of network connectivity, System Crash, Service Stopped/Crashed.
Data Centre Certification requires that the product fail over gracefully.
The tool will also be used to test the behavior of the product under extreme conditions.
Results of these failures will have to be measured in an automated manner.
Did the service fail over in a reasonable amount of time? Measure the response time between failover and recovery.
Did the system continue processing the last failed transaction? Did it abort it and/or lose it?
Did the system preserve data integrity? Was the transaction atomically committed or rolled back?
Did the system retry a failed/suspended transaction to completion?
Automation [scripts] will have to be developed to measure the above failure situations.
E.g. If 1000 valid files were dropped, did 1000 messages appear at the destination [FILE/HTTP/MSMQt]?
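
A minimal sketch of such a check follows. It assumes every test message carries a unique identifier that can be read back at the destination; that identifier scheme and the function name reconcile are assumptions of this example, not part of the plan above. Comparing identifiers rather than raw counts matters because one lost message and one duplicate still add up to 1000.

```python
from collections import Counter


def reconcile(sent_ids, received_ids):
    """Compare message IDs sent into the system with IDs seen at the destination."""
    sent = Counter(sent_ids)
    received = Counter(received_ids)
    lost = sorted(set(sent) - set(received))
    unexpected = sorted(set(received) - set(sent))
    duplicated = sorted(m for m, n in received.items()
                        if m in sent and n > sent[m])
    return {
        "sent": sum(sent.values()),
        "received": sum(received.values()),
        "lost": lost,
        "duplicated": duplicated,
        "unexpected": unexpected,
        "passed": not (lost or duplicated or unexpected),
    }


# 1000 messages dropped; after a failover one is lost and one is delivered twice.
sent = [f"msg-{i:04d}" for i in range(1000)]
received = sent[:-1] + ["msg-0000"]
report = reconcile(sent, received)
print(report["received"], report["lost"], report["duplicated"], report["passed"])
```

In this run the destination count is still 1000, yet the report fails because msg-0999 never arrived and msg-0000 arrived twice.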

Deployment:

Since testing in this area involves multiple machines with varied configurations, deployment will have to be automated so that minimal effort is required to set up any of the configurations described above.
Time spent on deployment can be minimized by having dedicated machines specifically assigned for HA testing.
All machines will have the base OS + other dependencies installed and imaged.
Not all machines in the test bed will have to be re-imaged every time a build changes.
The only machines that will require re-imaging will be the BizTalk boxes.
Scripts will have to be developed to restore the SQL Server databases to a pristine state before each new run.
All scenarios used for testing will have automated scripts that deploy the scenarios on multiple-machines. Test scripts required to generate load on the deployed scenarios will also be automated. Test run results will be rolled up into the http://ebizweb site.
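
The database reset step could look roughly like the sketch below, which only generates the T-SQL text and leaves execution to whatever tool the test bed uses. The database names are the ones mentioned earlier in this post; the backup folder D:\Baseline, the .bak naming and the helper names are hypothetical.

```python
import ntpath

# Database names from the configuration section; the backup folder is a
# hypothetical location for baseline backups taken from a pristine install.
DATABASES = ["BizTalkMsgBoxDb", "BizTalkMgmtDb"]
BACKUP_DIR = r"D:\Baseline"


def quote_name(name):
    """Bracket-quote a SQL Server identifier."""
    return "[" + name.replace("]", "]]") + "]"


def quote_string(value):
    """Quote a Unicode string literal for T-SQL."""
    return "N'" + value.replace("'", "''") + "'"


def restore_statements(database, backup_dir=BACKUP_DIR):
    """T-SQL to drop open connections and restore one database from its baseline."""
    db = quote_name(database)
    backup = quote_string(ntpath.join(backup_dir, database + ".bak"))
    return [
        f"ALTER DATABASE {db} SET SINGLE_USER WITH ROLLBACK IMMEDIATE;",
        f"RESTORE DATABASE {db} FROM DISK = {backup} WITH REPLACE;",
        f"ALTER DATABASE {db} SET MULTI_USER;",
    ]


def build_restore_script(databases=DATABASES):
    lines = ["USE [master];"]
    for database in databases:
        lines.extend(restore_statements(database))
    return "\n".join(lines)


script = build_restore_script()
print(script)
```

The tracking database would be added to the list in the same way. The sketch assumes the BizTalk services are stopped before the script runs, so nothing writes to the databases during the restore.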

Performance:

The flip side of testing in an HA environment is that performance numbers for scalability can be produced.
These controlled runs become possible once a build has been confirmed as a candidate for a performance run. If a build passes a test similar to a BVT, is deployable, and passes cursory testing such as an end-to-end loop, performance numbers can be extracted from that run.

Real sample Performance test tracking

Performance Parity with BizTalkGroup1

image

Targeted Performance Improvements Over BizTalkGroup1

image

New features

image

Capacity Planning

image

Stress

image

Adapter Matrix sample

image

Release Criteria Scratch

image

Test Suites samples

image

 

image

 

Definitions

 

image
