Successful scalability implementations

Achieving scalable web servers is not a trivial task. There are various solutions from which to pick, setup and configuration tasks to understand and perform, and many delicate dependencies between related but heterogeneous technologies. This section describes some of the major issues affecting successful scalability implementations:

Designing and coding scalable applications

Application architects must create designs that are inherently flexible by relying on open standards, so that the application's construction and implementation are not restricted to vendor-specific interfaces and tools. Similarly, the web developers who build the designed application must be aware that the way they write code, build SQL queries, manage threads, access databases, and partition the application can significantly affect its scalability.

This section discusses the following topics to consider when designing and building a web application:

Application session and state management

As you create web applications, you will probably create specific variables that you intend to carry across multiple interactions between a user's browser and a site's web server(s). Using client variables that are stored in a shared state repository, or session variables that are stored in memory of a specific server, are popular approaches for accomplishing this task. The latter approach, however, introduces a significant challenge for a website that is supported by multiple servers. When a user has begun a session and variables are stored on a specific server, the user must return to that server for the life of the session to maintain correct state information.

An example that illustrates this concept is an e-commerce application that uses shopping carts. With this type of application, as a customer accumulates items in a cart, there must be a mechanism to ensure that the user can see the items as they are added. One approach is to store these items in session variables on a specific web server. However, if you use this approach, there must also be a way to ensure that the user always returns to the same server for the life of the session. ClusterCATS automatically handles this challenge for you.

Another approach to solving this problem is to store client variables in a back-end common state repository. This approach enables all web servers in a cluster to access variables in a common, shared back-end data store, such as a database. However, this approach can potentially affect your site's performance.
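The difference between the two approaches can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how ClusterCATS or any particular application server manages state: the `SharedStateStore` and `WebServer` classes, their method names, and the use of an in-memory SQLite database as the common repository are all hypothetical.

```python
import json
import sqlite3


class SharedStateStore:
    """Hypothetical common state repository backed by a database."""

    def __init__(self, connection):
        self.db = connection
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS client_state ("
            "session_id TEXT PRIMARY KEY, data TEXT NOT NULL)"
        )

    def load(self, session_id):
        row = self.db.execute(
            "SELECT data FROM client_state WHERE session_id = ?", (session_id,)
        ).fetchone()
        return json.loads(row[0]) if row else []

    def save(self, session_id, items):
        self.db.execute(
            "INSERT OR REPLACE INTO client_state (session_id, data) VALUES (?, ?)",
            (session_id, json.dumps(items)),
        )
        self.db.commit()


class WebServer:
    """Hypothetical web server that can keep a cart locally or in the shared store."""

    def __init__(self, name, shared_store):
        self.name = name
        self.local_sessions = {}  # session variables: memory of this server only
        self.shared_store = shared_store

    def add_item_local(self, session_id, item):
        self.local_sessions.setdefault(session_id, []).append(item)

    def view_cart_local(self, session_id):
        return self.local_sessions.get(session_id, [])

    def add_item_shared(self, session_id, item):
        # Every access is a round trip to the repository, which is the
        # performance cost the text mentions.
        items = self.shared_store.load(session_id)
        items.append(item)
        self.shared_store.save(session_id, items)

    def view_cart_shared(self, session_id):
        return self.shared_store.load(session_id)


store = SharedStateStore(sqlite3.connect(":memory:"))
server_a = WebServer("www1", store)
server_b = WebServer("www2", store)

# The customer adds an item while being served by www1 ...
server_a.add_item_local("session-42", "book")
server_a.add_item_shared("session-42", "book")

# ... and the next request happens to land on www2.
print(server_b.view_cart_local("session-42"))   # [] : the cart is missing
print(server_b.view_cart_shared("session-42"))  # ['book'] : the cart is visible
```

With session variables, the second server sees an empty cart unless the user is routed back to the first server. With the shared repository, any server can serve the request, at the price of a database access on each interaction.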

Web developers must think through the user scenarios in which application session and state are affected, and engineer appropriate mechanisms to handle them. The most common ways to handle session data are:

Whatever mechanism your architects and engineers use, they must anticipate the scenarios in which maintaining an application's state is vital to a good user experience. See "Session-aware load balancing".

Database locking and concurrency issues

Dynamic web applications that allow users to modify a database must ensure appropriate database concurrency handling. This term refers to how an application manages concurrent user requests when accessing the same database records. If an application does not impose a database-locking mechanism on multiple requests to update a record, data integrity can be compromised in the database: two users could modify a record at the same time, and the second change would silently overwrite the first.

For example, consider a Human Resources web application on a company intranet. The HR Generalist adds two new employee records to the HR database by filling out a web form, because two new employees have been hired. The Generalist enters most of the vital information into the records, but doesn't yet have the new employees' phone extensions or HMO selections, so leaves those fields blank. Later in the day, the HR Generalist's manager, the HR Director, obtains this information from both new hires and decides to enter it in the database. However, one of the new employees, after speaking with her husband, decides to change her HMO selection from the basic selection to the PPO choice. The employee calls the HR Generalist to tell him of the change, and the Generalist says he will take care of it immediately. Without talking to the HR Director, the HR Generalist adds the information into the employee records at the same time that the HR Director is attempting to update the information.

In this scenario, if the application uses an appropriate database concurrency validation mechanism, the HR Director receives a message indicating that she could not access the employee record because it was in use, thereby alerting her that someone in her department was trying to change the record. However, if the application did not use such a validation mechanism, the HR Director would overwrite the new data that the Generalist had just entered, resulting in data integrity problems. This example illustrates the importance of your dynamic web applications handling database concurrency issues well.
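One way to detect this kind of conflict is sketched below in Python. Note that it uses an optimistic version check rather than the "record in use" lock described above: each update succeeds only if the record still carries the version number the user originally read. Both techniques prevent a silent overwrite. The table layout, the function names, and the `StaleRecordError` exception are assumptions made for this example.

```python
import sqlite3


class StaleRecordError(Exception):
    """Raised when the record changed after the caller read it."""


def create_hr_database():
    """Create a hypothetical HR table holding one new employee."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE employee ("
        "id INTEGER PRIMARY KEY, name TEXT, health_plan TEXT, "
        "version INTEGER NOT NULL DEFAULT 1)"
    )
    db.execute("INSERT INTO employee (id, name) VALUES (1, 'New Hire')")
    db.commit()
    return db


def read_employee(db, employee_id):
    row = db.execute(
        "SELECT health_plan, version FROM employee WHERE id = ?", (employee_id,)
    ).fetchone()
    return {"health_plan": row[0], "version": row[1]}


def update_health_plan(db, employee_id, new_plan, expected_version):
    """Apply the update only if nobody else changed the record in the meantime."""
    cursor = db.execute(
        "UPDATE employee SET health_plan = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_plan, employee_id, expected_version),
    )
    db.commit()
    if cursor.rowcount == 0:
        raise StaleRecordError("record was modified by another user")


db = create_hr_database()

# Both users open the same record before either one saves.
generalist_view = read_employee(db, 1)
director_view = read_employee(db, 1)

# The Generalist saves the employee's corrected choice first.
update_health_plan(db, 1, "PPO", generalist_view["version"])

# The Director's save is based on stale data and is rejected.
try:
    update_health_plan(db, 1, "basic", director_view["version"])
except StaleRecordError as error:
    print("Director's update rejected:", error)

print(read_employee(db, 1)["health_plan"])  # PPO
```

The rejected update plays the same role as the message the HR Director receives in the scenario: she learns that someone else changed the record and can reload it before deciding what to do.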

Application partitioning

The way an application is partitioned and deployed dramatically affects its ability to scale. A key development objective must be to ensure that each partition scales independently of the others, thereby eliminating application bottlenecks.

Application partitioning refers to the logical and physical deployment of an application's three core types of logic, or services - presentation, business, and data access. If you are familiar with the concept of tiered client/server application development, you already understand the rationale for developing applications in this way. The following short review highlights this methodology's benefits.

An application, regardless of whether it is a web application or a more traditional client/server application, has three main categories of logic, or services:

The way that architects and web developers decide to partition and deploy these core application services significantly affects the application's ability to scale. Although your development efforts may no longer be burdened with developing, distributing, customizing, and updating proprietary client software for your applications, the ubiquitous graphical user interface (GUI) - the web browser - presents new interface issues and challenges. For example, you must ensure that your application's presentation remains performance-friendly. It should minimize the number and size of graphic elements that must be downloaded to the client. Also, because some browsers cannot cleanly display all technologies, such as cascading style sheets (CSS), Java applets, and frames, you must carefully evaluate their use in your applications.

Keep these presentation guidelines in mind to improve your applications' performance and user experience, and be sure to plan and test for the lowest common denominator that all browsers can accommodate.

If necessary, moving business services from the primary application server to a separate business logic application server can often yield better application organization and easier maintenance. You can get the most from your application's data services by constructing them carefully and by using a separate database server (in this case, a separate computer) to increase the processor capacity available for database transactions.
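The separation of the three kinds of services can be illustrated with a small Python sketch. It is only a logical illustration under stated assumptions: the class names, the price data, and the HTML fragment are hypothetical, and in a real deployment each partition could run on its own server so that it can scale independently of the others.

```python
class DataServices:
    """Data access partition: the only code that touches stored data."""

    def __init__(self):
        # Stand-in for a database on a separate database server.
        self._prices = {"book": 12.50, "pen": 1.25}

    def price_of(self, item):
        return self._prices[item]


class BusinessServices:
    """Business partition: rules and calculations, no storage or markup."""

    def __init__(self, data_services):
        self.data = data_services

    def order_total(self, quantities):
        return sum(
            self.data.price_of(item) * quantity
            for item, quantity in quantities.items()
        )


class PresentationServices:
    """Presentation partition: formats results for the browser."""

    def __init__(self, business_services):
        self.business = business_services

    def render_total(self, quantities):
        total = self.business.order_total(quantities)
        return f"<p>Order total: ${total:.2f}</p>"


page = PresentationServices(BusinessServices(DataServices()))
print(page.render_total({"book": 2, "pen": 4}))  # <p>Order total: $30.00</p>
```

Because each partition talks to the next only through a narrow interface, one of them can be moved to another computer or replicated without changing the other two.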

These are several of the most important topics you and the developers creating your web applications should consider early on. In doing so, you ensure that your web applications are designed and coded with scalability in mind.

Avoiding common bottlenecks

In addition to application design and construction considerations, you must plan to avoid common bottlenecks that can negatively affect a web application's performance.

Following are typical bottlenecks that can affect an application's ability to perform and scale well:

DNS effects on website performance and availability

Improper Domain Name System (DNS) setup and configuration on web servers is one of the most common problems administrators encounter. This section addresses the following topics:

What is DNS?

DNS is a set of protocols and services on a TCP/IP network that allows network users to use hierarchical natural language names, rather than computer IP addresses, when searching for computer hosts (servers) on a network. DNS is used extensively on the Internet and on private enterprise networks, including LANs and WANs.

The primary capability of DNS is its ability to map host names to IP addresses, and vice versa. For example, suppose the web server at Macromedia has an IP address of 157.55.100.1. Most people would connect to this server by entering the domain name (www.macromedia.com), not the less-friendly IP address. Besides being easier to remember, the name is more reliable, because the numeric address could change for a variety of reasons, but the name can always be reserved.

DNS effects on site performance and availability

Internet DNS is a powerful and successful mechanism that has enabled huge numbers of individuals and organizations to create easily locatable websites on the Internet. However, DNS by itself may not allow your website to perform and scale as it should, which can leave the site unavailable or unreliable. Whether you can rely on DNS alone to load balance inbound traffic depends largely on the site's purpose and the amount of concurrent activity you expect on it. For instance, a low-volume, static site that provides only textual HTML information can probably be accommodated by round-robin DNS. However, a dynamic e-commerce site on which you anticipate heavy traffic won't perform or scale well if it is supported only by round-robin DNS.

To understand why, let's look at the e-commerce example. Even if you have planned ahead and set up multiple servers to support this high-volume site, if you rely only on DNS, it can only perform two tasks:

However, if a spike in user activity causes servers to overload or fail, round-robin DNS keeps distributing requests among all servers, even if some are not operating.

In short, Internet DNS is limited in its capabilities, and its round-robin distribution mechanism does not include intelligence for monitoring, managing, and reacting to overloaded or failed servers. Consequently, DNS by itself is not a sound load-balancing or failover solution for your business-critical sites. ClusterCATS compensates for DNS limitations and lets you create highly available, reliable, scalable web applications.
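The limitation can be made concrete with a short Python sketch. The first function imitates plain round-robin distribution, which rotates through every address regardless of server state. The second is a generic health-aware variant included only for contrast; it is an assumption made for this example and does not represent the algorithm that ClusterCATS or any other product uses. The function names and the `is_healthy` callback are hypothetical.

```python
from itertools import cycle


def round_robin(servers, request_count):
    """Hand out servers in strict rotation, as round-robin DNS does."""
    rotation = cycle(servers)
    return [next(rotation) for _ in range(request_count)]


def health_aware_round_robin(servers, is_healthy, request_count):
    """Rotate only through servers that currently pass a health check."""
    healthy = [server for server in servers if is_healthy(server)]
    if not healthy:
        raise RuntimeError("no healthy servers available")
    rotation = cycle(healthy)
    return [next(rotation) for _ in range(request_count)]


servers = ["192.168.0.1", "192.168.0.2"]
failed = {"192.168.0.2"}  # assume the second server has stopped responding

plain = round_robin(servers, 4)
aware = health_aware_round_robin(servers, lambda s: s not in failed, 4)

print(plain)  # half of the requests still go to the failed server
print(aware)  # all requests go to the server that is still running
```

With two servers and one failure, plain rotation sends every second request to a computer that cannot answer it, which is exactly the behavior described above.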

DNS core elements

The following are core DNS elements that you must be able to configure if your web applications are to work well with DNS:

Zones and domains

The Domain Name System is composed of a distributed database of names. The names in the DNS database establish a logical tree structure called the domain name space. On the Internet, the root of the DNS database is managed by the Internet Network Information Center (InterNIC). The top-level domains were originally assigned organizationally and by country. Two-letter abbreviations are used for countries, and three-letter abbreviations are reserved for types of organizations - for example, .com, .gov, and .edu for business, government, and educational organizations, respectively.

A domain is a node on a network and all the nodes below it (subdomains) that are contained within the DNS database tree structure. Domains and subdomains can be grouped into zones to allow distributed administration of the name space. More specifically, a zone is a portion of the DNS name space whose database records exist and are managed in one physical file. One DNS server may be configured to manage one or multiple zone files. Each zone is anchored at a specific domain node. You use zones for breaking up domains across multiple segments to distribute the management of the domain to multiple groups, and to replicate data more efficiently.

The following figure shows these concepts:

Domains and subdomains

DNS servers store information about the domain name space and are referred to as name servers. Name servers typically have one or more zones for which they are responsible. The name server has authority for those zones and is aware of all the other DNS name servers that are in the same domain.

DNS record types, server aliases, and round-robin distribution

There are three DNS record types that you must define and configure for each web server in order for ClusterCATS load-balancing and failover technology to work correctly. These records must be defined and configured on your local and primary DNS servers.

To see how all of these records work together, let's look at a simple example. There are two web servers, named www1.yourcompany.com and www2.yourcompany.com. You don't want users to see the primary host names (A records) for these servers in their browser; you want them to see only their assigned aliases (CNAME records), when being redirected.

The DNS entries would look like the following:

; Entries for forward-resolution: A-records
www1.yourcompany.com    IN A    192.168.0.1
www2.yourcompany.com    IN A    192.168.0.2

; Entries for reverse-resolution: PTR-records
192.168.0.1             PTR     www1.yourcompany.com
192.168.0.2             PTR     www2.yourcompany.com

; Round Robin entries
www.yourcompany.com     IN A    192.168.0.1
www.yourcompany.com     IN A    192.168.0.2

To ensure that your site lookups and translations occur as intended, you must provide correct entries in your DNS records, as shown. Also, to enable round-robin DNS functionality, you must create round-robin entries as shown.
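A quick consistency check on such entries can be sketched in Python. The example below does not query a DNS server; it works on dictionaries that mirror the example records, and it reports any host whose forward (A) and reverse (PTR) entries disagree, as well as any round-robin address that does not belong to a known server. The function name and the data layout are assumptions made for this illustration.

```python
def check_dns_entries(a_records, ptr_records, round_robin):
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []

    # Every forward entry needs a reverse entry pointing back to the same name.
    for host, address in a_records.items():
        if ptr_records.get(address) != host:
            problems.append(f"{host} -> {address} has no matching PTR record")

    # Every reverse entry needs a forward entry pointing to the same address.
    for address, host in ptr_records.items():
        if a_records.get(host) != address:
            problems.append(f"{address} -> {host} has no matching A record")

    # Every round-robin address must belong to one of the known servers.
    known_addresses = set(a_records.values())
    for alias, addresses in round_robin.items():
        for address in addresses:
            if address not in known_addresses:
                problems.append(f"{alias} lists unknown address {address}")

    return problems


a_records = {
    "www1.yourcompany.com": "192.168.0.1",
    "www2.yourcompany.com": "192.168.0.2",
}
ptr_records = {
    "192.168.0.1": "www1.yourcompany.com",
    "192.168.0.2": "www2.yourcompany.com",
}
round_robin = {"www.yourcompany.com": ["192.168.0.1", "192.168.0.2"]}

print(check_dns_entries(a_records, ptr_records, round_robin))  # []
```

An empty list means the forward, reverse, and round-robin entries agree with one another. Removing one of the PTR entries from the dictionary makes the function report the affected host.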

On the Windows platform, you make DNS entries using the Domain Name Service Manager utility.

On UNIX platforms, you make DNS entries in the name.db file, which is read by the DNS server software, Berkeley Internet Name Domain (BIND).

Load testing your web applications

Load testing is the process of defining acceptable benchmarks for your web application's performance, and then simulating load and measuring resulting response times and throughput against the benchmarks. You perform load testing to measure the application's ability to scale.

This section discusses the following topics:

Reasons to perform load testing

Load testing is important to your website's success because it lets you test its capacities before you deploy it, so you can find and fix problems before they are exposed to your users. Determining your site's purpose, and the amount of traffic you anticipate, may affect how you load test it.

Managers of small sites, who don't expect heavy concurrent loads, might be able to organize actual users to simultaneously access the site to perform load testing. However, this is difficult to accomplish well, because it introduces many human variables. In fact, for larger business-critical systems that expect heavy concurrent load, this type of testing is not feasible and does not provide satisfactory or realistic results.

A better approach to load testing is to use load simulation software. There are some excellent software load-testing tools on the market that let you simulate heavy loads hitting your web server. By using the software in conjunction with your defined benchmarks and formal test plans, you can confidently determine whether your web application is ready for deployment.

Another reason to load test is to verify your failover capabilities. Failover ensures that if a primary server within a cluster of servers stops functioning, subsequent user requests are directed to another server within the cluster. Failover is addressed in more depth in "What is website availability?". Using load-testing software, you can essentially force a server redirection by designating a computer as "unavailable" or by shutting it down.

Note: ClusterCATS uses HTTP to redirect requests from a failed server to an available server. Therefore, it is important to verify that your load-testing tool can handle HTTP redirections properly before you initiate load testing.
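What "handling HTTP redirections properly" means for a client can be shown with a small, self-contained Python sketch. It starts a throwaway local server that answers one path with a 302 redirect, then checks that the client follows the redirect and receives the final page. The server, the two paths, and the function name are hypothetical stand-ins created for this example; they do not reproduce the redirects that ClusterCATS itself issues.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectingHandler(BaseHTTPRequestHandler):
    """Stand-in server: one path redirects, the other returns content."""

    def do_GET(self):
        if self.path == "/failed-server":
            self.send_response(302)
            self.send_header("Location", "/available-server")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"served by the available server"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


def client_follows_redirect():
    """Request the redirecting path and report where the client ended up."""
    server = HTTPServer(("127.0.0.1", 0), RedirectingHandler)  # port 0: any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # An empty ProxyHandler makes sure the request goes straight to localhost.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        url = f"http://127.0.0.1:{port}/failed-server"
        with opener.open(url, timeout=5) as response:
            return response.status, response.geturl(), response.read()
    finally:
        server.shutdown()
        server.server_close()


status, final_url, body = client_follows_redirect()
print(status, final_url, body)
```

A client that handles redirection correctly ends at the second path with a 200 status. A load-testing tool that stops at the 302 response would record the redirect itself as the result and never exercise the available server.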

How to load test your web applications

Before you can load test, you must purchase a load-testing software tool and learn how to use it.

There is a variety of good load-testing software tools on the market, including Segue's SilkPerformer, Mercury Interactive's LoadRunner, and RSW's e-LOAD. Each of these packages provides substantial Web-enabled software-testing solutions that help you effectively simulate and test load.

After you purchase, install, and learn to use load-testing software, you determine the benchmarks that you want to achieve, or must achieve, for your website to ensure a good user experience. Following that, you formalize your testing strategy by designing and developing written test plans against which you execute your tests.

When the test plans are written and approved, you run the tests. After you do so, you capture and analyze the load-testing results and report the statistics to the development team. From there, you'll need to reach consensus about the most serious problems you discovered, the necessary changes to make, and the best way to implement the fixes. After the changes are made and a new build of the application is available, you rerun the tests to look for performance improvements. Again, you analyze the testing results, and continue this cycle until the site is operating within the established parameters that you've set. When your team agrees that the site scales well and is operating at peak performance under heavy stress, you're ready to deploy the application into a production environment.
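The core of this cycle (simulate load, measure response times and throughput, compare against benchmarks) can be sketched in Python. This is a minimal sketch and not a substitute for a load-testing product: the function names, the thread-based simulation of concurrent users, the stand-in `simulated_request` function, and the benchmark values are all assumptions made for the example. In a real test, the request function would call the web application.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(request_fn):
    """Run one request and return how long it took, in seconds."""
    start = time.perf_counter()
    request_fn()
    return time.perf_counter() - start


def run_load_test(request_fn, concurrent_users, requests_per_user):
    """Issue requests from a pool of simulated users and collect statistics."""
    total = concurrent_users * requests_per_user
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        timings = list(pool.map(lambda _: timed_call(request_fn), range(total)))
    elapsed = time.perf_counter() - started
    return {
        "requests": total,
        "mean_seconds": statistics.mean(timings),
        "max_seconds": max(timings),
        "throughput_per_second": total / elapsed,
    }


def meets_benchmark(results, max_mean_seconds, min_throughput):
    """Compare measured results with the benchmarks from the test plan."""
    return (
        results["mean_seconds"] <= max_mean_seconds
        and results["throughput_per_second"] >= min_throughput
    )


def simulated_request():
    """Stand-in for a real request to the application under test."""
    time.sleep(0.01)


results = run_load_test(simulated_request, concurrent_users=5, requests_per_user=4)
print(results)
print(meets_benchmark(results, max_mean_seconds=0.5, min_throughput=10))
```

After each new build, rerunning the same function with the same parameters gives figures that can be compared directly with the previous run.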

Load-testing considerations

Before starting your load testing, consider the following:

You should now have a good overview of what scalability implies, the core elements that compose it, some of the issues that affect successful implementations, and the tasks that must be performed to verify that your web applications are able to achieve satisfactory scalability.

The next section describes website availability and reliability concepts and considerations.
