NOTE: All posts in this blog have been migrated to Web Performance Matters.
All updated and new content since 1/1/2007 is there. Please update your bookmarks.

Saturday, March 25, 2006

Managing Rich Internet Applications [6]

This is the sixth post in a series devoted to the challenges of Service Level Management (SLM) for Rich Internet Applications (RIAs). In these applications, some processing is transferred to the Web client while some remains on the application server. Previous posts introduced the subject and the SLM topics I plan to address, reviewed RIA technologies, introduced The RIA Behavior Model, and introduced the application measurement topic.

Measuring RIA Responsiveness: Complications

To explain the challenges of measuring Rich Internet Applications, I will begin by reviewing three significant complications introduced by RIAs. From this discussion, I will draw conclusions about how measuring RIAs will differ from measuring traditional Web applications.

First Complication: Variety of Possible Behaviors
To make meaningful statements about an application's performance, you must first define what you need to measure. Typically you must measure common application usage patterns (or 'scenarios'), or patterns that are important by reason of their business value. When an application uses only a standard browser, its behavior is simple, known, and predictable, limiting the number of interesting usage scenarios to be measured.

Adding a client-side engine separates user interface actions from server requests and responses, giving application designers many more options. It allows developers to build an application that uses creative, clever ways to transfer display elements and portions of the processing to the client. The custom application behaviors encoded in the client-side engine make the result more complex and its usage less predictable than a traditional browser-based application. This increases the number of possible usage patterns, complicating the task of determining the important scenarios and measuring their performance.

Having many more design and implementation options also creates new opportunities for developers to make performance-related mistakes. They can (accidentally or deliberately) implement "chatty" client/server communication styles, which under some workload conditions may perform poorly. Even with thorough testing, some of these problems may remain undiscovered until after the application is deployed. A systematic SLM process must include measurement capabilities that provide a way to identify, investigate, and fix these types of problems.

Second Complication: Concurrent Activity
There are two reasons why it takes a while to download a Web page composed of 50 separate content elements, no matter how fast your connection. First, HTTP limits the rate at which clients can request objects from servers; then TCP limits the rate at which data packets can deliver those objects from server to client. In particular, although the HTTP 1.1 standard allows clients to establish persistent connections with servers, RFC2616 defining the standard restricts the number of parallel connections the client (usually a browser) can use:
[Section 8.1.4] Clients that use persistent connections SHOULD limit the number of simultaneous connections that they maintain to a given server. A single-user client SHOULD NOT maintain more than 2 connections with any server or proxy. ... These guidelines are intended to improve HTTP response times and avoid congestion.
Avoiding bottlenecks is a desirable goal in all computer systems, and so the Internet protocols are designed to protect shared network and server resources, no matter how selfishly their users might behave. But flow controls may not always be needed. Consider the metering lights that permit only two cars onto the highway every 30 seconds. During the rush hour they can smooth the flow of traffic and reduce freeway congestion, but if left on at other times, they would delay drivers for no reason. Similarly, when there is plenty of available bandwidth and no server congestion, the limit of two connections per domain is just a governor that restricts the potential rate of data transfer from server to client.

Nonetheless, modern browsers (including IE) do adhere to this guideline. And it is safest to do so, because denial-of-service protection logic in some proxy servers might reject connections from clients that do not obey the RFC2616 standard. For a lively discussion of IE behavior and the meaning of 'SHOULD' in the RFC, see this msdn blog, which also points out a relatively simple technique any Web site designer can use to circumvent this limitation, if they feel it is important enough:
Savvy web developers can take this connection limit into account and deliver their data from multiple domains, since the browser will open up to two connections per domain.
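To make that technique concrete, here is a minimal JavaScript sketch of the multi-domain approach (often called 'domain sharding'). The hostnames and paths are invented for illustration, and a real site would of course need every hostname to serve the same content.

    // Hypothetical sketch: spread image requests across two hostnames so
    // the browser can open up to two connections to each of them.
    // Hostnames and paths are made up for illustration.
    var shards = ["http://img1.example.com", "http://img2.example.com"];

    function shardedUrl(path, index) {
      // Map the same path to the same hostname every time,
      // so browser caching is not defeated.
      return shards[index % shards.length] + path;
    }

    // Usage: assign a page full of thumbnails to alternating hostnames.
    for (var i = 0; i < 20; i++) {
      var img = document.createElement("img");
      img.src = shardedUrl("/thumbs/item" + i + ".jpg", i);
      document.body.appendChild(img);
    }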
Although this solution may be fast enough for many applications, most Ajax developers are looking for clever ways to make the client even more responsive. And because Ajax offers almost unlimited application design possibilities, today there is little agreement on the best ways to achieve that goal. This debate was well summarized by Jep Castelein in AJAX Latency problems: myth or reality?, a discussion that includes the important reminder that 'IE will have a maximum of 2 simultaneous connections (per domain, actually -- C.L.), whether you use XMLHttpRequest or not'. In other words, Ajax implementations are not above the law of HTTP.

Even so, a primary objective of many RIA designs is to work around the two-connection obstacle to improve responsiveness for the user. The resulting implementations will use many different techniques to communicate with one or more servers. As one data point, consider the opinion of Drew McLellan, who surely qualifies as an expert in developing Web applications. In his tutorial about JavaScript and the XMLHttpRequest object -- Very Dynamic Web Interfaces -- Drew concludes that the real challenge here is not figuring out how to make the code work but thinking of interesting ways in which it can be utilized.

Such freedom to improvise inevitably complicates measurement -- especially when requests originating from what appear to be separate units of work on a client are really all part of a single logical application. It is easy to sit on a server and measure the traffic, or the demands placed on various server-side resources. And this kind of measurement has its uses, especially when load testing or investigating bottlenecks. But the biggest challenge when measuring applications is correlating those seemingly separate pieces of information with a particular application activity, task, or phase. The more complex the client/server relationship, especially when it involves concurrent interactions, the trickier it becomes for measurement and analysis tools to perform that correlation properly.

Third Complication: Asynchronous and Synchronous Communications
An earlier post discussed RIA technologies. It described how Flash and Ajax use client-side engines that can be programmed to communicate asynchronously with server(s), independent of a user's actions. This notion is captured in the name 'Ajax', in which the initial 'A' stands for 'Asynchronous', and Figure 2 in that post was taken from Jesse James Garrett's seminal article on Ajax. Although Garrett's figure does show how an engine enables asynchronous application behaviors, and how this differs from a traditional Web app, it does not illustrate the full range of possibilities, which I discussed further in RIA post #4. In general, a user action within a Rich Internet Application can trigger zero, one, or many server requests.

Also, most discussions of RIA architecture or technology focus on how asynchronous application behaviors can improve usability. But they don't question the fact that communication between browser and server is still synchronous. That is, communication is always initiated by the browser (or the client-side engine operating as an extension of the browser), and follows the standard synchronous HTTP request and response protocol. But in a recent blog post and presentation, Alex Russell of dojo proposed the name Comet for a collection of clever techniques that exploit HTTP persistent connections to implement a 'push' model of communication.

He used an adaptation of Garrett's original Ajax figure to show how Comet extends the Ajax model; this smaller version comes from brainblog:
The Comet Communication Model
Using the Comet techniques, a server uses long-lived persistent connections to send updates to many clients, without even receiving a request (a 'poll') from the client. According to Russell:
As is illustrated above, Comet applications can deliver data to the client at any time, not only in response to user input. The data is delivered over a single, previously-opened connection. This approach reduces the latency for data delivery significantly.

The architecture relies on a view of data which is event driven on both sides of the HTTP connection. Engineers familiar with SOA or message oriented middleware will find this diagram to be amazingly familiar. The only substantive change is that the endpoint is the browser.

While Comet is similar to Ajax in that it's asynchronous, applications that implement the Comet style can communicate state changes with almost negligible latency. This makes it suitable for many types of monitoring and multi-user collaboration applications which would otherwise be difficult or impossible to handle in a browser without plugins.
Like Ajax, Comet is not a new technology, but a new name for some clever ways to implement an old communication style using standard Web protocols. But as happened with Ajax in 2005, the new name has triggered a lot of interest among developers -- for evidence, just search on 'Ajax', 'Comet', and 'Web' (the last term should eliminate most links to Greek legends, soccer legends, and legendary cleaning products). Especially useful are Russell's slides from his talk (Comet: Low Latency Data For Browsers) at the recent O'Reilly ETech Conference, and Phil Windley's excellent write-up of the talk.
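Before turning to the measurement implications, a minimal sketch may help make the idea concrete. The fragment below shows long polling, one of the simpler Comet techniques: the client keeps a request open, the server answers only when it has something to push, and the client immediately re-opens the connection. The /updates endpoint and the handling code are invented for illustration, and older browsers would need an ActiveXObject fallback in place of the native XMLHttpRequest constructor.

    // Hypothetical long-polling loop. The server is assumed to hold the
    // /updates request open until it has new data, then respond at once.
    function handleUpdate(data) {
      document.getElementById("status").innerHTML = data;  // apply the push
    }

    function poll() {
      var xhr = new XMLHttpRequest();
      xhr.open("GET", "/updates", true);
      xhr.onreadystatechange = function () {
        if (xhr.readyState === 4) {
          if (xhr.status === 200) {
            handleUpdate(xhr.responseText);
          }
          poll();  // re-open the connection and wait for the next push
        }
      };
      xhr.send(null);
    }

    poll();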

The complexity of a push architecture is justified only for applications that manage highly volatile shared data. But when it is implemented, it creates an entirely new set of measurement challenges. If information pushed to the client does help the user to work faster, its benefits will be reflected in normal measurements of application responsiveness. But you can't use normal responsiveness metrics to evaluate a server-initiated activity that simply updates information a user is already working with.

New metrics must be devised. Depending on the application, we may need to measure the currency or staleness of information available to the user, or maybe the percentage of times a user's action is invalidated by a newly received context update from the server. This kind of "hiccup" metric is conceptually similar to the frequency of rebuffering incidents, one measure of streaming quality. Server capacity will also be a major issue requiring careful design and load testing. These are new SLM challenges facing anyone who decides to implement the Comet approach.
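As a rough illustration only, a client-side fragment like the one below could collect both kinds of number. Every name and field in it is hypothetical, and it assumes the server stamps each pushed update with the time it was generated and flags updates that invalidate work the user has in progress.

    // Hypothetical metric collection for a push-style RIA.
    var stalenessSamples = [];  // age of each update when it reached the client
    var userActions = 0;
    var hiccups = 0;            // actions invalidated by a late-arriving update

    function onUserAction() {
      userActions++;
    }

    function onServerUpdate(update) {
      // update.serverTime and update.invalidatesPendingAction are assumed
      // to be supplied by the (hypothetical) application protocol.
      stalenessSamples.push(new Date().getTime() - update.serverTime);
      if (update.invalidatesPendingAction) {
        hiccups++;
      }
    }

    function hiccupRate() {
      return userActions > 0 ? hiccups / userActions : 0;
    }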

Measuring RIA Responsiveness: Conclusions

While discussing the three major complications introduced by RIAs, I have already noted some consequences: measurements may become harder to specify, more difficult to correlate, and may even require the invention of new metrics. But I have been saving the punch-line until the end. I will now discuss my three most important and far-reaching conclusions about measuring RIAs, each of which involves a significant change from the way Web applications are measured today. These changes deal with the issues of what, where, and how to measure.

RIAs: What to Measure?
The short answer? Not individual Web pages. First and foremost, because the asynchronous aspects of the RIA model undermine two fundamental assumptions about Web applications and what to measure:
  • We can no longer think of a Web application as comprising a series of Web pages.
  • We can no longer assume that the time it takes to complete a Web page download corresponds to something a user perceives as important.
In fact, when a client-side engine can be programmed to continually download new content, or a server-side engine can keep pushing new content to the browser over a connection that never closes, some page downloads may never complete.

Therefore, to report useful measurements of the user's experience of response time, we must stop relying on physical Web pages to subdivide application response times, and instead break the application into what we might call 'logical pages'. To do this, a measurement tool must recognize meaningful application milestones or markers that signal logical boundaries of interest, so that response times can be identified and reported by logical page.

Because (as I noted earlier) it is usually hard to correlate seemingly separate pieces of measurement data after the fact, I conclude that these milestones will have to be identified before measurement takes place. They could be key events or downloads that always occur naturally within the application. Or they could require proactive instrumentation, for example using little downloadable content elements that developers deliberately embed at logical boundaries in the application's flow, to enable its subsequent measurement.

The former method places the burden on those setting up measurements to first identify application-specific milestones. The latter frees the measurement tool from the need to know anything about the application, but places the burden on application developers to instrument their code by inserting markers at key points.
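A passive marker can be as simple as the sketch below: at each logical boundary the application requests a tiny, uniquely named object, which a measurement tool watching the traffic can then use to delimit logical pages. The marker path and boundary name are invented for illustration.

    // Hypothetical instrumentation marker embedded at a logical boundary.
    function logicalPageBoundary(name) {
      var beacon = new Image();
      // The query string defeats caching so every boundary crossing is visible.
      beacon.src = "/markers/" + name + ".gif?t=" + new Date().getTime();
    }

    // e.g. call this when the 'move document' operation has finished rendering
    logicalPageBoundary("document-moved");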

Comparing Apples and Oranges?
Second, when setting out to measure a RIA, you must think carefully about the purpose of the measurement, especially if the RIA is replacing a traditional Web application, or being compared with one. You must not let the traditional Web approach to a business task determine what measurements are taken. Aleks Šušnjar (see post [2] in this series) provided the following insights:
We can't compare apples and oranges by measuring either an apple or an orange. We need to approach measurement as if we were comparing two applications built by different developers -- one a traditional Webapp and the other a client/server app with a completely different UI.

In this situation, we cannot measure only at the level of single Web pages or server interactions. Finding that it takes so many milliseconds to get a server response for request-type X is meaningless if that client/server interaction does not occur in both versions of the application.

In my experience, a typical example concerned how long it took to upload or download a document. But those metrics were sometimes irrelevant, depending on the application context. So to make really useful performance comparisons, we had to approach the problem at a higher level -- for example, 'how long does it take to move a document from folder A to folder B?' In a traditional Web app that task would likely require multiple clicks on separate pages, whereas with an RIA/Ajax implementation, we could do it with a single drag and drop.

So to make valid comparisons, we had to measure and compare the net effect of two rather different aspects of performance -- one concerning only the computer (how many operations of type X can a machine perform per hour), the other involving human performance (how many documents can an employee move per hour). But both affected the overall conclusion. Generalizing, I would say that:
  • The server that has to generate additional HTML for multiple responses in a traditional Web app will likely use many more processor cycles than the one supporting an RIA/Ajax implementation, where all the user interactions are handled on the client and the server just has to provide a service at the end.
  • If the designer takes care to avoid 'chatty' client/server communications, network utilization will probably also be significantly lower in the second case, further improving server performance.
  • Finally, if the client-side interface is well designed, a RIA should allow users to think and work faster.
In the final analysis, all these components of application performance must be weighed.
Aleks' insights come from his own experience developing a Rich Internet Application -- for more details see his Wikipedia page about RIA and AJAX.

RIAs: Where to Measure?
You might think that to measure a user's experience of responsiveness, you would have to use a tool that collects measurements from the user's workstation, or at least from a measurement computer that is programmed to generate synthetic actions that imitate the behavior of a typical user. Surprisingly, this is not actually the case for traditional Web applications.

While synthetic measurements require computers to mimic both a user's actions and their geographical location, software that collects passive measurements of real user actions can in fact reside either on the client machine or on a machine that is close to the server, usually just behind the firewall -- just so long as that machine can observe the flow of network traffic at the TCP and HTTP levels. Because of the synchronous and predictable nature of these protocols, a measurement tool that can read and interpret the flow of packets can actually deduce the user's experience of response time by tracking HTTP message traffic and the lower-level timings of TCP data packets and (crucially) TCP acknowledgements.

Such a tool is called a packet sniffer, or protocol analyzer. Packet sniffing has a bad name in some quarters, being associated with malicious snooping by hackers. But in the right hands, it is a legitimate analysis technique used by some Web measurement tools to deduce client-side performance without actually installing any components, hardware or software, anywhere near the users.

Unfortunately for these tools, the growth of RIAs will make it progressively more difficult to exploit this clever approach. The RIA Behavior Reference Model (this figure) I introduced previously makes it clear that RIAs -- even without the further complications introduced by a Comet push architecture -- severely limit the power of the packet sniffing approach. That's because we can no longer characterize response time as the time to complete the synchronous round trip of:

Click(C) => Browser(B) => Request(Q) => Server(S) => Response(R) => Display(D)

Instead the client-side engine in the RIA architecture breaks apart this cycle into two separate cycles operating asynchronously:

The user-client cycle: Click(C) => Engine(E) => Display(D) -- [CED, for short]
The client-server cycle: Request(Q) => Server(S) => Response(R) -- [QSR, for short]

Another way of describing these cycles might be as 'foreground' (CED) and 'background' (QSR). Both cycles are important, because neither stands alone; it is their relationship that defines application behavior. But that relationship depends only on the application design, which cannot (in general) be inferred by a measurement tool, especially one that can observe only one of the two cycles.

I conclude therefore that to measure RIAs, tools will have to reside on the client machine, where they can see both the level of responsiveness experienced by the browser (the QSR cycle) and the user's experience of responsiveness (the CED cycle). Granted, QSR cycles can still be monitored by the traditional packet sniffing methods, but tracking them will permit only limited inferences to be made about the CED cycle, which is separately managed by the client-side engine.

One might imagine that tracking the times of certain previously identified 'marker' objects within the QSR stream could solve this problem, especially since I already concluded (above) that marker objects will be needed to delimit the logical pages of RIAs. But in order to be able to draw conclusions about CED times by seeing those markers in the QSR stream, a measurement tool must impose a lot of constraints on the design of the client-side engine. An engine that implemented truly asynchronous behaviors (such as anticipatory buffering) would make it difficult or impossible to assess the user's actual experience without a measurement presence on the client side to observe the CED cycle.

Either that, or the marker objects would themselves need to be active scripts that triggered timing events that were somehow transmitted to the server (in a manner similar to ARM), rather than simply being passive milestones. But once again, this approach is tantamount to placing a measurement capability on the client. (Indeed, in the context of RIAs, dynamically distributing little measurement scripts that function as extensions to the client-side engine would be a natural approach). I therefore conclude that an approach comprising only passive listening on the server side will be insufficient to measure RIA responsiveness.
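As a rough sketch of that active-marker idea (and not a description of any particular product's method), the fragment below times a CED cycle on the client and reports it with a beacon request, in the spirit of ARM-style instrumentation. The /timing endpoint and field names are hypothetical.

    // Hypothetical client-side timing beacon for one CED cycle.
    var cedStart = 0;

    function onUserClick() {                 // Click
      cedStart = new Date().getTime();
    }

    function onDisplayComplete(pageLabel) {  // Display finished for a logical page
      var elapsed = new Date().getTime() - cedStart;
      var report = new Image();
      report.src = "/timing?page=" + encodeURIComponent(pageLabel) +
                   "&ced=" + elapsed;        // ship the measurement to the server
    }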

RIAs: How to Measure?
We have seen that RIAs will affect where a passive measurement tool can be used. Active measurement tools, because their modus operandi is to simulate the user's behavior, are not affected by this issue -- since they mimic the client, they naturally reside there. For these measurement tools, the issue raised by RIAs is how closely a user's behavior needs to be emulated.

User Actions: First, note that RIAs can generate back-end traffic in response to any user action, and not only when the user clicks. For example, the Google Maps application can trigger preloading of adjacent map segments based on the direction of a user's cursor movement within the map display. Therefore to measure RIA responsiveness, an active measurement tool must simulate the user's actions, not the engine's actions. This further explains why active tools must reside at the client.

In other words, using the terminology introduced earlier, active measurement tools must drive the CED cycle, not the QSR cycle. Driving the CED cycle means driving the client-side engine, which then generates its own backend behavior; driving the QSR cycle directly would require the tool user to supply a complete script of the engine's behavior to the measurement tool. That approach involves more work, is inherently more difficult and mistake-prone, and is therefore much less useful.
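As an illustrative sketch only, driving the CED cycle might look like the fragment below: the measurement script dispatches the same DOM event a real click would generate and leaves the engine to decide what QSR traffic (if any) to issue. The element id is invented, and older IE versions would need fireEvent rather than dispatchEvent.

    // Hypothetical sketch: simulate the user's click, not the engine's requests.
    function simulateClick(elementId) {
      var target = document.getElementById(elementId);
      var evt = document.createEvent("MouseEvents");
      evt.initMouseEvent("click", true, true, window, 0,
                         0, 0, 0, 0, false, false, false, false, 0, null);
      target.dispatchEvent(evt);  // the engine reacts exactly as for a real user
    }

    simulateClick("move-document-button");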

Think Times: Second, an active measurement tool must properly reflect the fact that a client-side engine may be doing useful work in the background, while a user is reading the previous response or deciding what to do next. It may, for example, be prefetching content in anticipation of the user's next action. Therefore the time a user devotes to these activities -- jointly represented as 'think time' in the RIA Behavior Model -- may affect their perception of the application's responsiveness.

For passive measurements of real user traffic, this is not a problem, because their measurement data always includes the effects of the users' actual think times. But for traditional Web apps, synthetic measurement tools have not needed to simulate think times by pausing between simulated actions, because introducing such delays would not have altered the result. When measuring an RIA however, setting think time to zero (as is typically done today) could have the effect of eliminating or minimizing potential background preloading activity, thus maximizing the perceived response times of later user actions.

And because the engine's behavior during think time varies by application, a measurement tool cannot simply measure content download times and then introduce simulated think times during the reporting phase. Combining individual component measurements after the fact to produce a realistic estimate of user experience would be like trying to construct a PERT chart for a volatile project when you are not sure you know all the tasks and cannot be sure about all their interdependencies -- in other words, impossible.

While people familiar with the project could probably construct the chart and draw conclusions about the project's duration, a general-purpose tool cannot. In software performance analysis, the most difficult and error-prone step is combining low-level measurements to draw conclusions about user-level metrics like transaction response time. That is why most users of application measurement tools want to be told the bottom line -- the user experience -- not just the underlying component times.

Therefore I conclude that to reflect a user's experience, an active measurement tool will have to simulate user think times during the measurement process. Using realistic think times (as the best load testing tools already do today) will produce the most realistic measure of the response times a user perceives during a particular application use case.
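A synthetic script that honors think times might look roughly like the fragment below (reusing the hypothetical simulateClick helper sketched earlier). The action names and delays are invented for illustration; a real tool would draw think times from a realistic distribution rather than using fixed values.

    // Hypothetical measurement script with think-time pauses between actions,
    // so any background prefetching the engine does can actually happen.
    var steps = [
      { id: "open-folder-a",    thinkTime: 4000 },
      { id: "select-document",  thinkTime: 6000 },
      { id: "move-to-folder-b", thinkTime: 0 }
    ];

    function runStep(i) {
      if (i >= steps.length) { return; }
      simulateClick(steps[i].id);                  // measured user action
      setTimeout(function () { runStep(i + 1); },  // simulated think time
                 steps[i].thinkTime);
    }

    runStep(0);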

Summary
Since this has been a long post I will now summarize my conclusions, with links back to the discussion behind each:
  • The variety of possible RIA behaviors creates new opportunities for developers to make performance-related mistakes, requiring more systematic approaches to measurement.
  • Concurrent client/server interactions make it difficult for measurement and analysis tools to correlate seemingly separate pieces of data with a particular application activity.
  • RIA push models (like 'Comet') will require the invention of new metrics to measure effectiveness.
  • Milestones must be specified in advance to allow measurement tools to group RIA measurements into 'logical pages' or tasks.
  • To compare the performance of a RIA with a traditional Webapp, you must measure equally important activities within each.
  • Passive monitoring of server requests will be insufficient to determine a user's perception of RIA responsiveness.
  • Active measurement tools must simulate user actions, not just engine actions, to measure RIA responsiveness.
  • Active measurement tools must simulate user think times during the measurement process, to reflect a user's experience accurately.
Next ...
This completes (for now) my analysis of the challenges of measuring RIAs; any further ideas will have to wait until future posts. Next I will consider how the introduction of RIAs may affect SLM processes that were designed and fine-tuned to manage traditional Web applications.


Friday, March 17, 2006

Managing Rich Internet Applications [5]

This is the fifth post in a series devoted to the challenges of Service Level Management (SLM) for Rich Internet Applications (RIAs). In these applications, some processing is transferred to the Web client while some remains on the application server. Previous posts introduced the subject and the SLM topics I plan to address, reviewed RIA technologies, and introduced The RIA Behavior Model.

Measuring RIA Responsiveness: Introduction

Having been sidetracked by my interesting but futile search for the origin of the saying 'you can't manage what you can't measure', I am now almost ready to examine some of the challenges of measuring Rich Internet Applications. But first I must spell out the motivations for measuring an application, and explain why the choice of an active or passive measurement methodology is of only secondary importance to this discussion.

Why Measure Applications?
Reasons for measuring application performance fall into two classes, one client-centric, the other server-centric. They are:
  1. MEASUREMENT: Determining how quickly the users can achieve their goals:

    • (a) Verifying that the application meets its service level objectives
    • (b) Detecting abnormal application behavior (typically, slow response)
    • (c) Identifying bottlenecks in application components
    • (d) Comparing builds, versions, or releases of an application
    • (e) Comparing two applications (e.g. our app vs. competitor's version)

  2. LOAD TESTING: Determining how a system behaves under increasing load:

    • (a) Verifying that the system will be able to handle the projected traffic
    • (b) Determining how many users a given server environment can handle
    • (c) Predicting bottleneck components as workload levels grow
    • (d) Comparing builds, versions, or releases of an application
    • (e) Identifying components that fail after extended use
As I have indicated, activities in the first class are conventionally called 'measurement', while those in the second are referred to as 'load testing', or simply 'testing'. So although both classes require us to address many common measurement problems, I will simplify my description of these problems by using the conventional terminology to distinguish the two classes of activities.

Active vs. Passive Measurement
A few years ago I edited a version of the Network Measurement FAQ originally produced by a CAIDA working group, for publication by CMG. The online FAQ seems to have gone into hiding, but you can still download the CMG paper, Fundamentals of Internet Measurement: A Tutorial. Although it is about network measurement, a couple of paragraphs introduce the concepts of active and passive measurements:
Passive measurements are carried out by observing normal network traffic, so they do not perturb the network. They are commonly used to measure traffic flows, i.e. counting the number of packets and bytes travelling through routers or links between specified sources and destinations.

Active measurements, on the other hand, are performed by sending test traffic into the network. For example, one might measure a network's maximum carrying capacity by sending packets through it and increasing the sending rate until the network is saturated. Clearly one needs to be aware that active measurements impose extra traffic onto a network and can distort its behavior in the process, thereby affecting measurement results.
When measuring Web applications, similar definitions and concerns apply, except at the level of application traffic and the server infrastructures that deliver applications. While load testing requires an active measurement approach, application responsiveness can be measured using either active or passive methods.

However, no matter which is used, the passive and active measurement approaches differ only in the way application traffic is generated -- both still require mechanisms to measure that traffic. Passive measurements must capture the behavior and experience of real application users, while active measurements must do the same for synthetic users, that is, computers mimicking the behavior of real users.

This is fortunate, because to discuss the pros and cons of real and synthetic monitoring would require a major blog post. In fact I gave a short presentation on that topic at Interop last year, which would provide the structure required. But for now I will simply note that both real and synthetic application users have to be measured, and RIAs present many common measurement challenges, which I will be describing in my next post.


Tuesday, March 14, 2006

Deep Thoughts on Management

Since I am writing a series of posts about managing Rich Internet Applications, and working on a post about the difficulties of measuring them, I thought I should begin with the popular management aphorism that you can't manage what you can't measure. Well, was that ever a diversion! Everyone is familiar with this saying, but interestingly, despite a ton of digging on the Web, the precise origin of this saying remains obscure (at least, to me).

Perhaps because of my training in statistics, I had always assumed it was just a simplification of a famous statement by Lord Kelvin, often cited by statisticians:
When you can measure what you are speaking about and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind. It may be the beginning of knowledge, but you have scarcely advanced to the stage of science.
Unfortunately, there does not seem to be a lot of support for my assumption. Instead, my Web searches have revealed that the statement 'you can't manage what you can't measure' is most often attributed either to the pioneer of quality control, W. Edwards Deming, or to 'the father of modern management', Peter Drucker. Moreover, neither of these popular attributions seems to be correct.

In Deming's case, one of his best-known statements is almost the exact opposite, namely that management must take account of many things that cannot be measured, and that running a company on visible figures alone was one of the seven deadly diseases of management. All the same, I'm sure Deming, given his interest in quality control, would have also agreed that things which can be measured should be measured. And all SLM processes are based on measurements.

Attribution to Peter Drucker also seems to be a misconception, since it does not appear either in the wikiquote list of quotes from Drucker, or in other extensive lists of his well-known sayings. But doing this research reminded me of the many insightful statements Drucker actually did make, including:
  • Management is doing things right; leadership is doing the right things.
  • There is nothing so useless as doing efficiently that which should not be done at all.
  • A manager's task is to make the strengths of people effective and their weakness irrelevant--and that applies fully as much to the manager's boss as it applies to the manager's subordinates.
  • Management is so much more than exercising rank and privilege, ... it is much more than "making deals." Management affects people and their lives.
  • Rank does not confer privilege or give power. It imposes responsibility.
  • Executives owe it to the organization and to their fellow workers not to tolerate nonperforming individuals in important jobs.
  • Most of what we call management consists of making it difficult for people to get their work done.
  • The most efficient way to produce anything is to bring together under one management as many as possible of the activities needed to turn out the product.
  • Increasingly, politics is not about "who gets what, when, how" but about values, each of them considered to be absolute. Politics is about "the right to life"...It is about the environment. It is about gaining equality for groups alleged to be oppressed...None of these issues is economic. All are fundamentally moral.
  • What's absolutely unforgivable is the financial benefit top management people get for laying off people. There is no excuse for it. No justification. This is morally and socially unforgivable, and we will pay a heavy price for it.
  • Unless commitment is made, there are only promises and hopes... but no plans.
  • The best way to predict the future is to create it.
This post is an example of what happens when I do research on the Web -- I find a lot of interesting information I was not even looking for. So this post became a diversion about the management insights of Edward Deming and Peter Drucker, and in my next one I will get back to the subject of measuring RIAs.


Friday, March 10, 2006

Managing Rich Internet Applications [4]

This is the fourth post in a series devoted to the challenges of Service Level Management (SLM) for Rich Internet Applications (RIAs). In these applications, some processing is transferred to the Web client while some remains on the application server. Previous posts introduced the subject, listed topics I plan to address, and reviewed RIA technologies.

The RIA Behavior Model

In an earlier post I discussed the idea of using a reference model to establish a shared frame of reference, or conceptual framework, to structure subsequent discussions of a subject. Today I introduce a new reference model -- The RIA Behavior Model -- illustrated by the diagram below.

I intend this model to represent the principal elements of the behavior of RIAs, elements that must be considered in any discussion of RIA performance and management. Note however that I am not attempting to address the complex human forces that determine perceptual, cognitive, and motor behaviors. I am merely seeking to represent a few generalized behavioral outcomes that are relevant in the context of an interaction between a human user and a Rich Internet Application.

RIA Reference Model
As you can see, this model embraces more concepts than the two figures created by Jesse Garrett to introduce Ajax, which I included in my previous post. Those figures are ideal for explaining the differences between traditional Web applications and RIAs. But to discuss how the combination of a Rich Internet Application and a user will actually behave, I have included several more elements that interact to determine behavior, which I will now describe.

At the highest level, the model comprises three major aspects (indicated by the color coding in the figure), each of which influences application performance:
  1. The application's design and usage environment, or context (upper row, grey)
  2. The user's expectations and behavior (lower left, blue)
  3. The application's behavior when used (lower right, yellow)
Browser/Server Interaction
If we consider a Web browser to be the simplest form of client engine, then the solid black arrows trace the flow of a traditional Web page download. The user clicks on a link in their browser, and the browser sends requests to one or more servers. Servers respond to client requests, and when enough of the requested content has arrived on the client (in the browser cache), the browser renders it ('paints' the screen), and the user can view it. The user's experience of response time is the elapsed time of the entire process, 'from click to paint'.

Even a single Web page download will normally involve many round trips between client (browser) and server, because most Web pages are an assemblage of content elements such as CSS files, script files, and embedded images, each of which is separately downloaded by the browser.

In the traditional synchronous Web application, illustrated in the upper half of Garrett's Figure 2, this process is repeated several times. Because applications usually require an exchange of information, at least one of the requests the browser sends to the server will normally be an HTTP POST (as opposed to the much more common HTTP GET request), to upload some data a user has entered into a form. Consider, for example, shopping at amazon.com as a return visitor. At minimum, even if the application recognizes you from a cookie, you must enter your password again to confirm your identity. But after that, the site already has all your personal information, and you can complete your transaction just by clicking on the right buttons on each page as it is presented to you.

Server-Side Elements
We are all familiar with this kind of Web application and its behavior. But unless you are responsible for site management or performance, you may be less aware of some of the other server-side elements of the model. Servers must field requests concurrently from many users. No matter how powerful the server, every concurrent user consumes their small share of the server's resources: memory, processor, and database.

Web servers can respond rapidly to stateless requests for information from many concurrent users, making catalog browsing a relatively fast and efficient activity. But when a user's action (such as clicking the 'add item to shopping cart' button) requires the server to update something, more of those server resources are consumed. So the number of concurrent transactions -- server interactions that update a customer's stored information -- plays a vital role in determining server performance.

In the model, the grey arrows and the boxes labeled Users and Transactions represent the fact that server performance is strongly influenced by these two concurrency factors. Servers typically perform uniformly well up to a certain concurrency level, but above that level ('the knee of the curve') transaction performance quickly degrades, as one of the underlying resources becomes a bottleneck. Because of this characteristic, seemingly small design changes in an application or in the infrastructure serving the application may, if they extend the duration of transactions, have a significant effect on the user's experience of response time.
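A back-of-the-envelope illustration, using invented numbers and the familiar relationship concurrency = arrival rate x response time (Little's Law), shows why a seemingly small increase in transaction duration can push a server past the knee:

    // Invented numbers, for illustration only.
    var arrivalsPerSecond = 50;   // transaction requests arriving per second

    var fastResponse = 0.4;       // seconds per transaction, healthy server
    var slowResponse = 2.0;       // seconds per transaction after a slowdown

    var concurrentFast = arrivalsPerSecond * fastResponse;  // 20 in flight
    var concurrentSlow = arrivalsPerSecond * slowResponse;  // 100 in flight

    // If this server's 'knee' were at, say, 60 concurrent transactions,
    // the slowdown alone would carry it well past the knee.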

Adding a Client-Side Engine
Adding a client-side engine does not prevent an application from implementing the traditional synchronous design described above. But RIAs aim to improve the user's experience, and a client-side engine allows the application designer to consider many additional possibilities, such as:
  • Prefetching of content to client
  • Lazy loading of content
  • Just In Time fetching of content
  • Client-side validation of user input
  • Client-side only responses to user input
  • Batching of server inputs on the client
  • Offloading of server function to client machines
If any of these techniques are employed, the resulting application will inevitably be more complex than the traditional synchronous application. The challenge of SLM is to ensure that a more complex design actually does produce a more responsive user experience, because -- despite the optimistic claims being made for Ajax and Flash -- there are no guarantees.

Improving the User Experience
For example, a common method of accelerating client-side response is to anticipate the user's next action(s) and program the engine to preload (or prefetch) some content 'in the background', while the user is thinking. Depending on their think time, when the user clicks, part or all of the desired response can be already available on the client. This technique has long been used in client/server systems to improve client responsiveness; a version (called Link Prefetching) is implemented by Mozilla browsers.
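A minimal prefetching sketch might look like the fragment below; the URL, the cache structure, and the guess about the user's next step are all invented for illustration.

    // Hypothetical prefetch during think time: quietly request the content
    // the engine guesses will be needed next, and keep it on the client.
    var prefetchCache = {};

    function prefetch(url) {
      var xhr = new XMLHttpRequest();
      xhr.open("GET", url, true);
      xhr.onreadystatechange = function () {
        if (xhr.readyState === 4 && xhr.status === 200) {
          prefetchCache[url] = xhr.responseText;  // ready if the user asks for it
        }
      };
      xhr.send(null);
    }

    // Guess: the user will page forward next.
    prefetch("/catalog/page2.html");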

Preloading will certainly create a more responsive experience -- if the user actually follows the anticipated path through the application. But what if they don't? Then the client engine has made extra server requests that turned out to be unnecessary, and it still has to react to the user's input with yet more server requests. So the design has placed extra load on the servers, for no benefit.

The extra load serving these 'background' requests, on behalf of what may be hundreds or even thousands of clients running the preloading application, can slow down a server's responses to requests that users are actually waiting for. Slower responses lengthen transaction times, which drives up the number of concurrent users, clogging up the servers even more, and further slowing down responses. It's a vicious circle.

Even if the application serves some users quickly, those whose usage patterns do not match the profile the application designer had in mind will probably not receive such good service, and (in the worst case) may abandon transactions before completing them. Apart from the lost opportunity to serve a customer, abandonment usually also wastes scarce server resources, because the allocations earmarked for now-abandoned transactions languish unused until finally freed by some type of timeout mechanism.

People who design and test back-end systems already know that behavioral variables like user think-time distributions and abandonment rates per page have a significant influence on the capacity and responsiveness of servers under load. Today, RIAs (as indicated by the dotted lines in the diagram) give application designers the flexibility to create application designs that attempt to take account of such behavioral variables. But a consequence is that RIAs also magnify the risk of failure should the designer miscalculate: a simple design applied well beats a clever design misapplied.

The Application Environment
This brings us to the crucial importance of the design and usage environment, represented by the grey boxes in the model. A user's satisfaction with any application depends on their usage context and environment; in other words, how well the application design matches their needs at the time, their way of thinking, and their behavior when they are using it.

Their experience of response time depends on the combined behaviors of the client and server components of the application, which in turn depend on the application design, the underlying server infrastructure design, and of course the user's Internet connection speed. The most effective RIA will be one whose creators took into account these factors at each stage of its development life cycle, and who created the necessary management processes to ensure its success when in production.

Future Posts
Using The RIA Behavior Model as one starting point, future posts will discuss how to build robust and responsive RIAs, and explore some of the technical and management challenges they pose, especially in the area of response time measurement. If I can organize my thoughts sufficiently to keep the writer's block at bay, I expect my next post to focus on that topic.


Tuesday, March 07, 2006

Managing Rich Internet Applications [3]

This is the third post in a series devoted to the challenges of Service Level Management (SLM) for Rich Internet Applications (RIAs). In these applications, some processing is transferred to the Web client while some remains on the application server. Previous posts introduced the subject, and listed the topics I plan to address.

RIA Technologies
Before diving into the management issues posed by Rich Internet Applications, I will introduce the two principal technologies used to implement RIAs -- Ajax and Flash -- and provide a few links I have found useful.

I am going to begin with Ajax, because in the seminal article defining the term "Ajax," Jesse James Garrett provides such a clear introduction to the subject. After explaining that ...
Ajax isn’t a technology. It’s really several technologies, each flourishing in its own right, coming together in powerful new ways. Ajax incorporates:
  • standards-based presentation using XHTML and CSS;
  • dynamic display and interaction using the Document Object Model;
  • data interchange and manipulation using XML and XSLT;
  • asynchronous data retrieval using XMLHttpRequest;
  • and JavaScript binding everything together.
... he uses two figures that reveal the essential differences between a classic Web application and one implemented using Ajax:
The classic web application model works like this: Most user actions in the interface trigger an HTTP request back to a web server. The server does some processing — retrieving data, crunching numbers, talking to various legacy systems — and then returns an HTML page to the client... This approach makes a lot of technical sense, but it doesn’t make for a great user experience. While the server is doing its thing, what’s the user doing? That’s right, waiting. And at every step in a task, the user waits some more.

Obviously, if we were designing the Web from scratch for applications, we wouldn’t make users wait around. Once an interface is loaded, why should the user interaction come to a halt every time the application needs something from the server? In fact, why should the user see the application go to the server at all?
Figure 1 illustrates how Ajax is different:
Traditional and Ajax Application Models
Figure 1: The traditional model for web applications (left) compared to the Ajax model (right).
While this diagram refers to Ajax technology, its structure describes RIAs in general. Garrett explains that the RIA model:
... eliminates the start-stop-start-stop nature of interaction on the Web by introducing an intermediary — an Ajax engine — between the user and the server. ... At the start of the session, the browser loads an Ajax engine — written in JavaScript and usually tucked away in a hidden frame. This engine is responsible for both rendering the interface the user sees and communicating with the server on the user’s behalf".
Figure 2 (below) illustrates how the addition of a client-side engine ... "allows the user’s interaction with the application to happen asynchronously — independent of communication with the server".
Synchronous and Asynchronous Client/Server Communications
Figure 2: The synchronous interaction pattern of a traditional web application (top) compared with the asynchronous pattern of an Ajax application (bottom)
In the best case, a client-side engine can mean that users spend less time waiting for the server to respond. Like all writers making the case for Ajax and RIAs, Garrett assumes that this architecture guarantees a more responsive user experience -- but the reality is more complicated. In practice, an RIA's responsiveness will depend on several factors that I will be exploring in this series of posts.
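To make Garrett's description concrete, here is a deliberately minimal sketch of the pattern: the engine intercepts a user action, answers from client-side state when it can, and talks to the server asynchronously only when it needs data it does not already have. Everything here (element ids, the /price endpoint) is invented for illustration, and older IE versions would need an ActiveXObject fallback instead of the native XMLHttpRequest constructor.

    // Deliberately minimal 'Ajax engine' sketch; every name is hypothetical.
    var priceCache = {};

    function render(text) {
      document.getElementById("price").innerHTML = text;
    }

    document.getElementById("show-price").onclick = function () {
      var item = document.getElementById("item-id").value;
      if (priceCache[item]) {
        render(priceCache[item]);      // handled entirely on the client
        return;
      }
      var xhr = new XMLHttpRequest();
      xhr.open("GET", "/price?item=" + encodeURIComponent(item), true);
      xhr.onreadystatechange = function () {
        if (xhr.readyState === 4 && xhr.status === 200) {
          priceCache[item] = xhr.responseText;
          render(xhr.responseText);    // the user was never blocked waiting
        }
      };
      xhr.send(null);
    };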

For more detailed discussion of the history of RIAs and Ajax, I recommend Aleksandar Šušnjar's Wikipedia page about RIA and AJAX. For more technical details, see these Sitepoint articles.

What about Flash?

You may feel I should have started with Flash, because it came first, but opinion is divided on that point. It is true that Macromedia announced their Flash MX product line in 2002 (see Developing Rich Internet Applications with Macromedia MX for a good summary), whereas the term Ajax was coined only in February 2005. However, the underlying RIA techniques have been in use for much longer. For example, on his Wikipedia page Aleks cites the pioneering work of the now defunct company, Desktop.com. In 1999, Dave Winer wrote in his blog:
Want a vision of where the Web is headed? Check out Desktop.Com. Launched in beta a couple of weeks ago, this stunning site changed my point of view on what can be accomplished with JavaScript, ActiveX and whatever other kinds of mysterious code-doo-dads they're using... Desktop.Com says the Web is a desktop, just like the desktops on the Mac and Windows. Icons down the left edge of the browser window, menus at the top of the window, double-click to open a directory, double-click to edit a file. -- Dave Winer
If you need to compare Ajax and Flash, here are four links you may find interesting. Any search engine will turn up plenty more material on this subject, and I will return to it when discussing aspects of RIA measurement and management.

You can download many reports and papers about Flash and RIAs from the Adobe (formerly Macromedia) Web site -- for example, these. Not included in that list is one of the most readable introductions to Rich Internet Applications, a 2003 paper sponsored by Macromedia and Intel and written by Joshua Duhl of IDC, an excellent writer who I once worked with (in the early 90's) at ONTOS, a Boston-based Object DB company. Another former colleague from Boston, Alan Sarasohn, used to claim that the computer industry is run by just 300 people, but they keep moving around. I'm starting to believe him!


Thursday, March 02, 2006

Managing Rich Internet Applications [2]

This is the second post in a series devoted to the challenges of Service Level Management (SLM) for Rich Internet Applications (RIAs). In these applications, some processing is transferred to the Web client while some remains on the application server.

In the first post in this series, I introduced the concept of a Rich Internet Application, and asserted that anyone planning to deploy RIA technology must be prepared to invest more in design, development, testing and management to make it successful. By that I meant that deploying an RIA successfully will demand more resources -- skills, tools, process, time, and (of course) money -- than would be required to deploy a more traditional version of a Web-based application in which all application logic is executed on the server.

I have reached this conclusion after studying this subject for the last three months, in the light of my prior experience in the field of software performance management. Factors leading to this conclusion include:
  • the nature of much of the technology being promoted for building RIAs today (primitive)
  • the stage of evolution of this new application architecture (early)
  • the toolsets available to support RIA developers (incomplete), and
  • the experiences of others developing RIAs (hard work).
This comment by Todd Spangler in Baseline Magazine neatly sums up the situation today:
Creating an Ajax application from scratch is like having to build a brick wall but first having to figure out how to create the bricks.
This reminds me of the days when I used to program in IBM Assembler -- funny how life in the world of computing seems to repeat itself every few years! (And I'll have more to say about that later). But while this is all fascinating stuff, it is not my primary focus; it is just the starting point for the SLM topics I want to discuss. So although I will devote some space to reviewing the state of RIA technology, I will try to keep those posts concise and provide useful links. If you want more, a few searches will turn up tons of reference material.

The questions I plan to explore in more depth in this series are:
  • Why are RIAs so much more difficult to design, develop, and manage?
  • How do you measure an RIA user's experience of response time?
  • How do you test whether an RIA will perform acceptably in the production environment?
  • How do you break apart the time it takes to complete a business transaction using an RIA into meaningful components?
  • How do you monitor the performance of an RIA in the production environment, and alert when its behavior is abnormal?
  • What kinds of development and systems management processes will maximize the chances of implementing an RIA successfully?
Before I get into any of these details, I'd like to acknowledge the contributions of two people who have helped shed light on these issues. The first is Victor Pavlov, a principal engineer and colleague at Keynote, who knows Web technology inside out, and can figure out how to measure anything. I am lucky that when I am noodling on a tricky question, I can sometimes run into Victor at the coffee machine.

The second is Aleksandar Šušnjar, who I met online after reading his contributions to the discussion of RIAs in Wikipedia, where he shared insights gained from his experience building a product first released in 2002 by his company, Hummingbird. This product, DM Webtop (later renamed DM Web Server), is a component of their Enterprise Document Management suite. In delivering desktop capabilities via the Web, it clearly predated much of today's thinking about the purpose of RIAs and how to build them.

Naturally, Aleks also discovered some potential pitfalls an RIA designer might run into, which I will be mentioning in later posts. Regrettably, many of his insightful contributions to Wikipedia (which can still be found in its history pages) were subsequently removed by other contributors who lacked either his intimate knowledge of the technology or his historical perspective. Rather than waging Wikipedia edit wars, which for a hot topic can continue interminably, he has simply posted his own page about RIA and AJAX.

I am not surprised by these events, because I have found that the computing world tends to attract people who have little sense of history. This explains why it is forever reinventing the wheel, repeating yesterday's mistakes, and tripping over previously solved problems. Rich Internet Applications are a case in point. Mainframe (thin client) computing was replaced by the client/server (fat client) model, which in turn was displaced by the Web-based (thin client) applications. Now the emergence of RIAs signals a return to the fat client model again. The only difference between the client/server and RIA models is that instead of Sneakernets and static installation protocols, Internet and Web technologies now provide an almost universally available platform for distributing function to client desktops dynamically. But many of the over-optimistic claims being made for RIA technology today mirror those popular 15-20 years ago, when everyone was jumping on the client/server bandwagon and predicting the imminent demise of the mainframe -- which, by the way, never happened.

So I'll end by reminding you of the famous saying of philosopher George Santayana: Those who cannot remember the past are condemned to repeat it. One of my goals in this series of posts about RIAs is to keep you from that fate, so I'm glad to be collaborating with people like Victor and Aleks whose common sense is grounded in a strong sense of history!


Wednesday, March 01, 2006

Managing Rich Internet Applications [1]

After a few months to collect my thoughts, I have at last conquered my writer's block. Today I begin a series of posts on the topic of Managing Rich Internet Applications. The term Rich Internet Application, which I will often abbreviate to RIA for convenience, refers to a Web-based application that Wikipedia defines as "a cross between web applications and traditional desktop applications, transferring some of the processing to a web client and keeping (some of) the processing on the application server". RIAs are most often implemented using either Flash or JavaScript technologies, although the client-side processing could also be coded as a Java application or applet.

Wikipedia provides a useful, though brief, introduction to this subject; I will provide more references in future posts. Greatly complicating any coherent analysis of RIA technology is the massive amount of hype surrounding both Web 2.0 (a superset of RIA) and Ajax (a subset of RIA). I will also discuss these relationships in more detail in a future post.

Regular readers (if any should chance to return after my long absence!) will know that I will be focusing on service level management, and that I will be situating those service level issues within a wider context of application usability.

This is a broad topic, yet (as far as I can tell) a little-explored one. I have been researching this subject area for almost 3 months now, and although you can find literally thousands of opinions and analyses of Ajax, RIAs, and Web 2.0 on the Web, very little of it deals with questions of performance management or SLM. So I'm not sure how many posts it will take before I run out of interesting things to write about. I suppose this may be considered an advantage of specializing in a field that most people take for granted. However, if you are planning to implement a Rich Internet Application, it would be a big mistake to take its performance for granted.

That was the theme of an introductory article I wrote recently for eCommerce Times, entitled Web Applications: Richer or Poorer? That article is a good place to start; it will give you a brief overview of the subject matter I plan to cover here in later posts, and introduce my world view. A key conclusion was that companies that decide to employ the technology must be prepared to invest more in design, development, testing and management to make it successful. In future posts I will justify and elaborate on that statement.