Quality Driven Web Services Composition

Liangzhao Zeng Boualem Benatallah Marlon Dumas
small University of New South Wales University of New South Wales Queensland University of Technology
Sydney, Australia Sydney, Australia Brisbane Australia
zlzhao@cse.unsw.edu.au boualem@cse.unsw.edu.au m.dumas@qut.edu.au
Jayant Kalagnanam Quan Z. Sheng  
IBM T.J. Watson Research Center University of New South Wales  
New York, USA Sydney, Australia  
jayant@us.ibm.com qsheng@cse.unsw.edu.au  

Copyright is held by the author/owner(s).
WWW2003, May 20-24, 2003, Budapest, Hungary.
ACM 1-58113-680-3/03/0005.

Abstract:

The process-driven composition of Web services is emerging as a promising approach to integrate business applications within and across organizational boundaries. In this approach, individual Web services are federated into composite Web services whose business logic is expressed as a process model. The tasks of this process model are essentially invocations to functionalities offered by the underlying component services. Usually, several component services are able to execute a given task, although with different levels of pricing and quality. In this paper, we advocate that the selection of component services should be carried out during the execution of a composite service, rather than at design-time. In addition, this selection should consider multiple criteria (e.g., price, duration, reliability), and it should take into account global constraints and preferences set by the user (e.g., budget constraints). Accordingly, the paper proposes a global planning approach to optimally select component services during the execution of a composite service. Service selection is formulated as an optimization problem which can be solved using efficient linear programming methods. Experimental results show that this global planning approach outperforms approaches in which the component services are selected individually for each task in a composite service.

H.3.5Information SystemsWeb-based services Management, Performance


Introduction


Web services technologies are emerging as a powerful vehicle for organizations that need to integrate their applications within and across organizational boundaries. In particular, the process-based composition of Web services is gaining a considerable momentum as an approach for the effective integration of distributed, heterogeneous, and autonomous applications [1]. In this approach, applications are encapsulated as Web services and the logic of their interactions is expressed as a process model. This approach provides an attractive alternative to hand-coding the interactions between applications using general-purpose programming languages. A Web service is a self-described application that uses standard Internet technologies to interact with other Web services. An example of a Web service is a SOAP-based interface to place bids in an auction house. Once deployed, services can be aggregated into composite services. An example of a composite service would be a ``Travel Planner'' system that aggregates multiple component services for flight booking, travel insurance, accommodation booking, car ren-tal, and itinerary planning, which are executed sequentially or concurrently. The process model underlying a composite service identifies the functionalities required by the services to be composed (i.e., the tasks of the composite service) and their interactions (e.g., control-flow, data-flow, and transactional dependencies). Component services that are able to provide the required functionalities are then associated to the individual tasks of the composite services and invoked during each execution of the composite service. The number of services providing a given functionality may be large and constantly changing. Consequently, approaches where the development of composite services requires the identification at design-time of the exact services to be composed are inappropriate. The runtime selection of component services during the execution of a composite service has been put forward as an approach to address this issue [2,6,11]. The idea is that component services are selected by the composite service execution engine based on a set of criteria. However, previous approaches in this area have not identified a set of criteria (other than price and application-specific criteria) for selecting Web services. In addition, existing service selection approaches adopt a local selection strategy, meaning that they assign a component service to an individual tasks, one at a time. As a result, these approaches are not able to handle global user constraints and preferences. For example, the overall duration of the composite service execution should be minimized, or a given budget constraint should be satisfied. In this paper, we present quality-driven approach to select component services during the execution of a composite service. The salient features of our approach are: The rest of the paper is organized as follows. Section 2 presents the service composition model and defines some key concepts used throughout the paper. Section 3 defines the service quality criteria used for service selection and explains how the values of these quality criteria can be computed for a given service. Section 4 formulates the global service selection problem and describes a linear programming method to efficiently solve it. Section 5 presents a prototype implementation of the proposed approach, as well as a set of experiments comparing the global planning approach with the local selection approach. Finally, Section 6 discusses related work, and Section 7 draws some conclusions.


Web Service Composition Model


In this section, we will present some basic concepts in Web service composition first, then give some definitions on composite service execution planning.

Composite services and communities


A composite Web service is an umbrella structure aggregating multiple other elementary and composite Web services, which interact with each other according to a process model. Following our previous work [2], we choose to specify the process model of a composite service as a statechart [12]. The choice of statecharts for specifying composite Web services is motivated by two main reasons: (i) statecharts have a well-defined semantics; and (ii) they offer the basic flow constructs found in contemporary process modeling languages (i.e., sequence, conditional branching, structured loops, concurrent threads, and inter-thread synchronization). The first characteristic facilitates the application of formal manipulation techniques to statechart models, while the second characteristic ensures that the service composition mechanisms developed in the context of statecharts, can be adapted to other process modeling languages like, for example, those that are being designed by Web services standardization efforts (e.g., BPEL4WS, WSCI, BPML)[*]. A statechart is made up of states and transitions. In the proposed composition framework, the transitions of a statechart are labeled with events, conditions, and assignment operations over process variables. States can be basic or compound. Basic states are labelled with invocations to Web services operations. Compound states contain one or several statecharts within them. Specifically, compound states come in two flavors: OR and AND states. An OR-state contains a single statechart within it whereas an AND-state contains several statecharts (separated by dashed lines) which are intended to be executed concurrently. Accordingly, OR-states are used as a decomposition mechanism for modularity purposes, while AND-states are used to express concurrency: they encode a fork/join pair. The initial state of a statechart is denoted by a filled circle, while the final state is denoted by two concentric circles: one filled and the other unfilled. A simplified statechart $ W$ specifying a ``Travel Planner'' composite Web Service is depicted in Figure 1. In this composite service, a search for attractions is performed in parallel with a flight and an accommodation booking. After these searching and booking operations are completed, the distance from the hotel to the accommodation is computed, and either a car or a bike rental service is invoked. Note that when two transitions stem from the same state (in this case the state $ t_4$), they denote a conditional branching, and the transitions should therefore be labelled with disjoint conditions.

Figure 1: Statechart of a composite service ``Travel Planner''
\includegraphics[width=3in]{statechart.eps}

A basic state[*] of a statechart describing a composite service can be labelled with an invocation to either of the following: The concept of Web service community addresses the issue of composing a large and changing collection of Web services. Service communities provide descriptions of a desired functionality (e.g., flight booking) without referring to any actual service (e.g., Qantas flight booking Web service). The set of members of a community can be fixed when the community is created, or it can be determined through a registration mechanism, thereby allowing service providers to join, quit, and reinstate the community at any time. When a community receives a request to execute an operation, this request is delegated to one of its current members. The choice of the delegatee is done at execution time based on the parameters of the request, the characteristics of the members, the history of past executions, and the status of ongoing executions. Sections 3 and 4 deal with the selection of delegatees during the execution of a composite service whose states are labelled with invocations to communities.

Execution paths and plans


In this section, we define two concepts used in the rest of the paper: execution path and execution plan.

Definition 1 (Execution path). An execution path of a statechart is a sequence of states [$ t_1$, $ t_2$, .. $ t_n$], such that $ t_1$ is the initial state, $ t_n$ is the final state, and for every state $ t_i$ ($ 1 < i
< n$), the following holds:

$ \square$

This definition relies on the concept of direct successor of a state. A basic state $ t_b$ is a direct successor of another basic state $ t_a$ if there is a sequence of adjacent transitions[*] going from $ t_a$ to $ t_b$ without traversing any other basic state. In other words, the first transition in the sequence stems from $ t_a$, the last transition leads to $ t_b$, and all intermediate transitions stem from and lead to either compound, initial, or final states, but are not incident to a basic state. It is straightforward to see that an acyclic statechart has a finite number of execution paths. To simplify the presentation, we initially assume that all the statecharts that we deal with are acyclic. If a statechart contains cycles, a technique for ``unfolding'' it into an acyclic statechart needs to be applied beforehand. The method used to unfold the cycles of a statechart is to examine the logs of past executions in order to determine the average number of times that each cycle is taken. The states appearing between the beginning and the end of the cycle are then cloned as many times as the cycle is taken in average. Details about this unfolding process are omitted for space reasons. Under the assumption that the underlying statechart is acyclic, it is possible to represent an execution path of this statechart as a Directed Acyclic Graph (DAG) as follows.

Definition 2 (DAG representation of an execution path). Given an execution path [$ t_1$, $ t_2$, .. $ t_n$] of a statechart ST, the DAG representation of this execution path is a graph obtained as follows:

$ \square$

If a statechart diagram contains conditional branchings, it has multiple execution paths. Each execution path represents a sequence of tasks to complete a composite service execution. Figure 2 gives an example of a statechart's execution paths. In this example, since there is one conditional branching after task $ t_4$, there are two paths, called $ W_{e1}$ and $ W_{e2}$ respectively. In execution path $ W_{e1}$, task $ t_6$ is executed after task $ t_5$, while in execution path $ W_{e2}$, task $ t_7$ is executed after task $ t_5$.

Figure 2: DAG representation of the execution paths of the statechart of Figure 1.
\includegraphics[width=3.5in]{executionpathTravel.eps}

As stated before, the basic states of a statechart describing a composite service can be labelled with invocations to communities. If this is the case, actual Web services (i.e., members of communities) need to be selected during the execution of the composite service. Hence, it is possible to execute a path in very different ways by allocating different Web services to the states in the path. The concept of execution plan defined below captures the various ways of executing a given execution path.

Definition 3 (Execution plan). A set of pairs $ p = \{<t_1,s_{i1}>,
<t_2,s_{i2}>, \ldots, <t_N,s_{iN}>\}$ is an execution plan of an execution path $ W_e$ iff:

$ \square$



Web Service Quality Model


In a Web environment, multiple Web services may provide similar functionalities with different non-functional property values (e.g., different prices). In the composition model presented in the previous section, such Web services will typically be grouped together in a single community. To differentiate the members of a community during service selection, their non-functional properties need to be considered. For this purpose, we adopt a Web services quality model based on a set of quality criteria (i.e. non-functional properties) that are transversal to all Web services, for example, their pricing and reliability. Although the adopted quality model has a limited number of criteria (for the sake of illustration), it is extensible: new criteria can be added without fundamentally altering the service selection techniques built on top of the model. In particular, it is possible to extend the quality model to integrate non-functional service characteristics such as those proposed in [22], or to integrate service QoS metrics such as those proposed by [25]. In this section, we first present the quality criteria in the context of elementary services, before turning our attention to composite services. For each criterion, we provide a definition, indicate its granularity (i.e., whether it is defined for an entire service or for individual service operations), and provide rules to compute its value for a given service.

Quality Criteria for Elementary Services


We consider five generic quality criteria for elementary services: (1) execution price, (2)execution duration, (3) reputation, (4) reliability, and (5) availability. Given the above quality criteria, the quality vector of a service $ s$ is defined as follows:
$\displaystyle q(s) =
(q_{price}(s),q_{du}(s),q_{av}(s),q_{re}(s),q_{rep}(s))$     (1)

Note that the method for computing the value of the quality criteria is not unique. The global planning model presented Section 4 is independent of these methods.

Quality Criteria for Composite Services


The above quality criteria are also applied to evaluate the QoS of composite services. Table 1 provides aggregation functions for the computation of the QoS of a composite service CS when executed using plan $ p = \{<t_1,s_{i1}>,
<t_2,s_{i2}>, \ldots, <t_N,s_{iN}>\}$. A brief explanation of each criterion's aggregation function follows:
Table 1: Aggregation functions for computing the QoS of execution plans
Criteria Aggregation function
Price $ Q_{price}(p) = \sum_{i=1}^Nq_{price}(s_i, op_i)$
Duration $ Q_{du}(p) = CPA(q_{du}(s_1, op_1),..., q_{du}(s_N, op_N))$
Reputation $ Q_{rep}(p) = \frac{1}{N} \sum_{i=1}^N q_{rep}(s_i)$
Reliability $ Q_{rel}(p) = \Pi_{i=1}^N( e^{q_{rel}(s_i) \ast z_i})$
Availability $ Q_{av}(p) = \Pi_{i=1}^N( e^{q_{av}(s_i) \ast z_i})$

Using above aggregation functions, the quality vector of a composite service's execution plan is defined as:
$\displaystyle Q(p) =
(Q_{price}(p),Q_{du}(p),Q_{av}(p),Q_{re}(p),Q_{rep}(p))$     (2)



Global Service Selection


As mentioned before, in existing approaches, the selection of component service to execute a task is determined independently to other tasks of composite services [2,11,6]. More precisely, in our previous work [2], service selection is done at each service community locally. The selection of a service is based on a selection policy involving parameters of the request, the characteristics of the members, the history of past executions, and the status of ongoing executions. Although service selection can be locally optimized, the global quality constraints may not be satisfied. For example, a global constraint such as composite services' execution price is less than 500 dollars can not be enforced. In this section, we present a global planning based approach for Web services selection. We first present an approach of selecting an optimal execution plan for a composite service, then present a novel linear programming based method for optimal execution plan selection.


Selecting an Optimal Execution Plan


The basic idea of global planning is the same as query optimization in database management systems. Several plans are identified and the optimal plan is selected. The foregoing discussion makes it clear that a statechart has multiple execution paths and each execution path has its own set of execution plans if the statechart contains conditional branchings. In this subsection, we assume that the statechart does not contain any conditional branchings and has only one execution path. We will discuss the case where a statechart has multiple execution paths in Section 4.2. We also assume that for each task $ t_j$, there is a set of candidate Web services $ S_j$ that are available to which task $ t_j$ can be assigned. Associated with each Web service $ s_{ij}$ is a quality vector (see equation 1). Based on the available Web services, by selecting a Web service for each task in an execution path, the global planner will generate a set of execution plans $ P$ :
$\displaystyle P = \{p_1, p_2, ..., p_n\}$     (3)

$ n$ is the number of execution plans. After a set of execution plans is generated, the system needs to select an optimal execution plan. When selecting the execution plan, instead of computing the quality vector of a particular Web service, each execution plan's global service quality vector needs to be computed. The selection of execution plan uses Multiple Attribute Decision Making (MADM)[16] approach. Once the quality vector for each execution plan is derived, by accumulating all the execution plans' quality vectors, we obtain matrix $ \mathbb{Q}$, where each row represents an execution plan's quality vector.
$\displaystyle \mathbb{Q}= \left ( \begin{array}{cccc}
Q_{1,1} & Q_{1,2} & \ldot...
... & \vdots & \vdots \\
Q_{n,1} & Q_{n,2} & \ldots & Q_{n,5}
\end{array} \right)$     (4)

A Simple Additive Weighting (SAW) [4] technique is used to select an optimal execution plan. Basically, there are two phases in applying SAW:

Scaling Phase

Some of the criteria used could be negative, i.e., the higher the value is, the lower the quality is. This includes criteria such as execution time and execution price. Other criteria are positive criteria, i.e., the higher the value is, the higher the quality is. For negative criteria, values are scaled according to Equation 5. For positive criteria, values are scaled according to Equation 6.
$\displaystyle V_{i,j} = \left\{ \begin{array}{ll}
\frac{Q_{j}^{max}-Q_{i,j} }{ ...
...
1 & \textrm{if } Q_{j}^{max} - Q_{j}^{min} = 0
\end{array}\right. j = 1, 2~~~$     (5)
$\displaystyle V_{i,j} = \left\{ \begin{array}{ll}
\frac{Q_{i,j}-Q_{j}^{min} }{ ...
...
1 & \textrm{if } Q_{j}^{max} - Q_{j}^{min} = 0
\end{array}\right. j = 3, 4, 5$     (6)

In the above equations, $ Q_{j}^{max}$ is maximal value of a quality criterion in matrix $ \mathbb{Q}$, i.e., $ Q_{j}^{max} =
Max(Q_{i,j}), 1 \le i \le n$. $ Q_{j}^{max}$ is minimal value of a quality criterion in matrix $ \mathbb{Q}$, i.e., $ Q_{j}^{min}=
Min(Q_{i,j}), 1 \le i \le n$. In fact, we can compute $ Q_j^{max}$ and $ Q_j^{min}$ without generating all possible execution plans. For example, in order to compute the maximum execution price (i.e., $ Q_{price}^{max}$) of all the execution plans, we select the most expensive Web service for each task and sum up all these execution prices to compute $ Q_{price}^{max}$. In order to compute the minimum execution duration (i.e., $ Q_{du}^{min}$) of all the execution plans, we select the Web service that has shortest execution duration for each task and use CPA to compute $ Q_{du}^{min}$. The computation cost of $ Q_j^{max}$ and $ Q_j^{min}$ is polynomial. After the scaling phase, we obtain the following matrix $ \mathbb{Q^{'}}$:

$\displaystyle \mathbb{Q^{'}}= \left ( \begin{array}{cccc}
V_{1,1} & V_{1,2} & ...
...dots & \vdots \\
V_{n,1} & V_{n,2} & \ldots & V_{n,5}
\end{array} \right)
$

Weighting Phase

The following formula is used to compute the overall quality score for each execution plan:
$\displaystyle Score(p_i) = \sum_{j = 1}^{5} (V_{i,j} \ast W_j)$     (7)

where $ W_j \in [0,1]$ and $ \sum_{j = 1}^{5}W_j =1$. $ W_j$ represents the weight of each criterion. In (7), end users can give their preference on QoS (i.e., balance the impact of the different criteria) to select a desired execution plan by adjusting the value of $ W_j$. The global planner will choose the execution path which has the maximal value of $ Score(p_i)$ (i.e., $ max(Score(p_i))$). If there are more than one execution plans which have the same maximal value of $ Score(p_i)$, then an execution plan will be selected from them randomly.


Handling Multiple Execution Paths


In Section 4.1, we assume that the statechart only has one execution path. In this subsection, we discuss the case where statecharts have multiple execution paths. Assume that a statechart has $ n$ execution paths. For each execution path, an optimal execution plan can be selected. So, the global planner has $ n$ selected execution plans. Since each selected optimal execution plan only covers a subset of the entire statechart, then the global planner needs to aggregate these $ n$ execution plans into an overall execution plan to covers all the tasks in the statechart. This overall execution plan will be used to execute the statechart. For example, for travel planner statechart $ W$ (see Figure 1), there are two execution paths $ W_{e1}$ and $ W_{e2}$. The optimal execution plans $ p_1$ and $ p_2$ of these two execution paths are selected. From the Figure 2, it can be seen that both execution paths $ W_{e1}$ and $ W_{e2}$ are subsets of $ W$. Thus neither $ p_1$ nor $ p_2$ covers all tasks in $ W$. Since the global planner conducts planning before the execution time, it does not know which execution path will eventually be used for the composite service. Therefore it needs to aggregate $ p_1$ and $ p_2$ into an overall execution plan which covers all the tasks in $ W$. Assume that statechart $ W$ has $ k$ tasks (i.e., $ t_{1}$, $ t_{2}$, ..., $ t_{k}$) and $ n$ execution paths (i.e., $ W_{e1}$, $ W_{e2}$,..., $ W_{en}$). Thus, for each execution path, the global planner selects an optimal execution plan. Consequently, we obtain $ n$ optimal execution plans (i.e., $ p_1, p_2, ..., p_n$) for these execution paths. The global planner adopts the following approach to aggregate multiple execution plans into an overall execution plan.
  1. Given a task $ t_{i}$, if $ t_{i}$ only belongs to one execution path (e.g., $ W_{ej}$), then the global planner selects $ W_{ej}$'s execution plan $ p_j$ to execute the task $ t_{i}$. We denote this as $ t_{i}(p_j)$. For example, in trip planning statechart, task $ t_{7}$ (i.e.,CarRental ) only belongs to execution path $ W_{e2}$. In this case, $ W_{e2}$'s execution plan $ p_2$ is used to execute $ t_{7}$, i.e., $ t_{7}(p_2)$.
  2. Given a task $ t_{i}$, if $ t_{i}$ belongs to more than one execution paths (e.g., $ W_{ej}$, $ W_{ej+1}$, ..., $ W_{em}$), then there is a set of execution plans (i.e., $ p_j$, $ p_{j+1}$, ..., $ p_m$) that can be used to execute $ W_{si}$. In this case, the global planner needs to select one of the execution plans from $ p_j$, $ p_{j+1}$, ... , $ p_m$. The selection can be done by identifying the hot path for task $ t_{i}$. Here, the hot path of a task $ t_{i}$ is defined as the execution path that has been most frequently used to execute task $ t_{i}$ in the past. For example, in travel planner statechart, task $ t_{3}$ (FlightTicketBooking) belongs to both execution path $ W_{e1}$ and $ W_{e2}$. Assume that the statechart $ W$ has been used to execute the composite service for 25 times. Also assume that, in 20 times the execution of the composite service follows the execution path $ W_{e1}$; while in 5 times, the execution of the composite service follows the execution path $ W_{e2}$. This indicates that execution path $ W_{e1}$ has been more frequently used to execute task $ t_{3}$ (i.e., $ W_{e1}$ is the hot path for $ t_{3}$). Thus, $ W_{e1}$'s execution plan $ p_1$ is used to execute $ t_{3}$, i.e., $ t_{3}(p_1)$.
The system keeps composite service execution traces in an execution history [10]. This allows the global planner to identify hot path for each task.

Linear Programming Solution


The approach of selecting an optimal execution plan given in the previous section requires the generation of all possible execution plans. Assume that there are $ N$ tasks in a statechart and there are $ M$ potential Web services for each task. The total number of execution plans is $ M^N$. The computation cost of selecting an optimal execution plan is $ O(M^N)$. Such an approach is impractical for large scale composite services, where both the number of tasks in the composite services and the number of candidate Web services in communities are large. For example, assume that a composite service has one execution path and $ 10$ tasks, and for each task, there are $ 10$ candidate Web services. Then the total number of execution plans is $ 10^{10}$. It is very costly to generate all these $ 10^{10}$ plans and select an optimal one. In this subsection, we present a method based on linear programming (LP) [15], which can be used to select an optimal execution plan without generating all the possible execution plans. There are three inputs in LP: variables, an objective function and constraints on the variables, where both the objective function and constraints must be linear. LP attempts to maximize or minimize the value of the objective function by adjusting the values of variables based on the constraints. The output of LP is the maximum (or minimum) value of the objective function as well as the values of variables at this maximum or minimum point. In order to use LP to select an optimal execution plan, we model the selection of an optimal execution plan as an LP problem. The variables of the LP problem are $ y_{ij}$ representing the participation of service $ s_{ij}$ in the selected execution plan. The value of each variable $ y_{ij}$ is 1 if service $ s_{ij}$ is in the selected plan, 0 otherwise. The objective function of the LP problem, which is based on equations 5,6, and 7, is:
$\displaystyle Max \left( \sum_{l = 1}^{2} \left(
\frac{Q_{l}^{max}-Q_{i,l}}{Q_{...
...ft(\frac{Q_{i,l}-Q_{l}^{min}}{
Q_{l}^{max}-Q_{l}^{min}} \ast W_l \right)\right)$     (8)

where $ W_l \in [0,1] $ and $ \sum_{j = 1}^{5}W_j =1$. $ W_l$ is the weight assigned to quality criteria $ l$. In the following subsections, we discuss the constraints on the variables of the LP problem.

Constraints on Execution Duration and
Execution Price

In this subsection, we consider constraints on the execution duration and the execution price of an execution plan. Assume that $ A$ is the set of all tasks (i.e., basic states) of the statechart. For each task $ t_j$, there is a set of Web services $ S_j$ that can be assigned to this task, but on the end, for each task $ t_j$, only one Web service should be selected. Given that $ y_{ij}$ ($ y_{ij}=0$ or $ 1$) denotes the participation of Web service $ s_{ij}$ in the selected plan, this latter fact is captured by the following constraints:
$\displaystyle \sum_{i \in S_j} y_{ij} = 1, \; \forall j \in A$     (9)

For example, there are 100 potential Web services that can execute task $ j$, since only one of them will be selected to execute the task $ j$, then we have $ \sum_{i = 1}^{100} y_{ij} = 1$. Assume that variable $ x_j$ represents the earliest start time of task $ t_j$, variable $ p_j$ represents the execution duration for task $ j$, and variable $ p_{ij}$ represents the execution duration for task $ t_j$ by service $ s_{ij}$. We use the notation $ t_j
\rightarrow t_k$ to denote that task $ t_k$ is task $ t_j$'s direct successor task. We have the following constraints:
$\displaystyle \sum_{i \in S_j} p_{ij} \; y_{ij} = p_j, \forall j \in A$     (10)
$\displaystyle x_k - (p_j + x_j) \ge 0, \forall t_j \rightarrow t_k, j,k \in A$     (11)
$\displaystyle Q_{du} - (x_j + p_j) \ge 0, \forall j \in A$     (12)

Constraint 10 indicates that the execution duration of a given task $ t_j$ is equal to the execution duration of one of the Web services in $ A$. Constraint 11 captures the fact that if task $ t_k$ is a direct successor of task $ t_j$, the execution of task $ t_k$ must start after task $ t_j$ has been completed. Constraint 12 indicates that the execution of a composite service is completed only when all its tasks are completed. Assume that $ z_{ij}$ is an integer variable that has value $ 1$ or 0: $ 1$ indicates that Web service $ s_{ij}$ is a critical service and 0 indicates otherwise. We have the following constraint on execution plan's execution duration $ Q_{du}$:
$\displaystyle Q_{du}= \sum_{j \in A} \sum_{i \in S_j} p_{ij}z_{ij}$     (13)

For execution price, assume that variable $ c_{ij}$ represents the execution price of Web service $ s_{ij}$, then we have the following constraint on total execution price of composition service:
$\displaystyle Q_{price} = \sum_{j \in A} \sum_{j \in S_j}
c_{ij} \; y_{ij}$     (14)

An alternative of constraint 14 is as follows:
$\displaystyle \sum_{j \in A} \sum_{j \in S_j} c_{ij} y_{ij} \le B, B > 0$     (15)

where $ B$ is the budget constraint given by the user. This constraint indicates that the entire composite service's execution price can not be greater than $ B$. By introducing a budget constraint the above problem needs to be explicitly solved as an integer programming problem. This problem is a special case of the knapsack problem and hence it is NP-hard [19]. Notice that constraints on other criteria can be easily incorporated into LP if the aggregation function is a linear function. For example, assume that variable $ r_{ij}$ represents the reputation of Web service $ s_{ij}$, we can have the following constraint on the execution plan's reputation:
$\displaystyle Q_{rep}= \sum_{j \in A} \sum_{j \in S_j} r_{ij} y_{ij}$     (16)

Constraints on Reliability and Availability

In this subsection, we consider constraints on criteria where the aggregation function is not a linear function. Among the criteria that are used to select Web services, both the availability's and the reliability's aggregation functions are non-linear (see Table 1). We can linearize them using a logarithm function as shown below. Assume that variable $ a_{ij}$ represents the reliability of Web service $ s_{ij}$. Since $ z_{ij}$ indicates whether Web service $ s_{ij}$ is a critical service or not, the reliability of execution plan is:
$\displaystyle Q_{rel}= \Pi_{j \in A} \left( \sum_{j \in S_j} e^{a_{ij} z_{ij}} \right)$      

By applying the logarithm function $ ln$, we obtain:
$\displaystyle ln(Q_{rel})= \sum_{j \in A} ln \left( \sum_{j \in S_j} e^{a_{ij} z_{ij}} \right)$      

Since $ \sum_{j \in A} z_{ij} = 1$ and $ z_{ij}= 0$ or $ 1$, we obtain:
$\displaystyle ln(Q_{rel})= \sum_{j \in A} \left( \sum_{j \in S_j} a_{ij} z_{ij} \right)$      

Let $ Q_{rel}^{'} = ln(Q_{rel})$, we have the following constraint on the execution plan's reliability:
$\displaystyle Q_{rel}^{'}= \sum_{j \in A} \sum_{j \in S_j} a_{ij} z_{ij}$     (17)

Similarly, assuming that $ b_{ij}$ represents the availability of the Web service $ s_{ij}$, the following constraint is introduced:
$\displaystyle Q_{av}^{'}= \sum_{j \in A} \sum_{i \in S_j} b_{ij} z_{ij}$     (18)

where $ Q_{av}^{'}= ln( Q_{av})$. Criteria that can be introduced into the LP problem are not limited to what we defined in Section 3. Other criteria can be added once the aggregation functions are given.

Constraints on Uncertainty of Execution
Duration

In the previous sections, we assume that the execution durations $ p_{ij}$ of Web services are deterministic. In reality, the execution duration of a Web service $ s_{ij}$ may be uncertain. For example, Service $ s$ may advertise that the execution duration is 5 seconds, but the actual execution duration may be 4.5, 4.6, or 5.2 seconds. To address this issue, we model each $ p_{ij}$ using a normal distribution. Variable $ p_{ij}$ has therefore the following probability function:
$\displaystyle f (x) = \frac{1}{\sqrt{2\pi} \sigma}exp\biggr[ - \frac{1}{2}\left(
\frac{x-\mu } {\sigma} \right)^{2} \biggr], -\infty < x < \infty$      

where the mean $ \mu$ and the std. deviation $ \sigma$ are given by:
$\displaystyle \mu = \frac{1}{n} \sum_{i=1}^{n}x_i$     (19)
$\displaystyle \sigma^2 = \frac{1}{n-1} \sum_{i=1}^{n} ( x_i -
\mu)^2$     (20)

By applying formulas  19 and 20 to the execution logs of a Web service $ s_{ij}$, we can obtain $ \mu_{ij}$ and $ \sigma_{ij}$ for this Web service. Since $ \sum_{i \in A} \sum_{i \in
S_j} p_{ij}z_{ij} = Q_{du}$, the total execution duration $ Q_{du}$ must follow a normal distribution[*] whose deviation $ \sigma_{du}$ is:
$\displaystyle \sigma^{2}_{du} = \sum_{i \in A} \sum_{i \in S_j}
\sigma^{2}_{ij}z_{ij}$     (21)

So in order to consider the deviation of the total execution duration in the LP problem, we should adopt the following objective function:
$\displaystyle Max \left( \sum_{l = 0}^{2} \left(
\frac{Q_{l}^{max}-Q_{i,l} }{ Q...
...frac{Q_{i,l}-Q_{l}^{min} }{
Q_{l}^{max} - Q_{l}^{min}} \ast W_l \right) \right)$     (22)

where $ Q_{0}= \sigma^{2}_{du}$ and $ W_{0} \in [0,1]$ is the weight assigned to the deviation of the total execution duration.

Given the above inputs for the LP problem, the output of the LP solver will be a set of values for variables $ y_{ij}$, which indicate the selection or exclusion of Web services $ s_{ij}$. The selected Web services compose an optimal execution plan.


Validation


In this section, we present the implementation of the QoS-driven selection of services and some experimental results to evaluate the proposed approach.

Implementation


The proposed QoS-driven selection technique is implemented in the SELF-SERV prototype. Detailed description of SELF-SERV can be found in  [24]. In this section, we briefly overview the prototype architecture and discuss its extension to support the service selection approach. The prototype architecture (Figure 4) features an interface, a service manager and a pool of services. Services communicate via Simple Object Access Protocol (SOAP) messages. The service manager consists of four modules, namely the service discovery engine, the service editor, the composite service orchestrater and the global planner. The service discovery engine facilitates the advertisement and location of services. It is implemented using the Universal Description, Discovery and Integration (UDDI), the Web Service Description Language (WSDL), and SOAP. In the implementation, we make extensive use of the IBM Web Services Toolkit 2.4 (WSTK2.4) [14], which is a showcase package for Web services emerging technologies.

Figure 4: Architecture of the prototype.
Image prototype.jpg

The service editor provides facilities for defining new services and editing existing ones. A service is edited through a visual interface, and translated into an XML document for subsequent analysis and processing by the service orchestrater. The orchestrater is responsible for scheduling, initiating, and monitoring the invocations to the tasks of a composite service during its execution, and for routing events and data items between these components. Finally, global planner is the module that plans the execution of a composite service using the global planning based approach. The global planner is implemented as a linear programming solver based on IBM's Optimization Solutions and Library (OSL) [13]. It should be noted that QoS information is retrieved via pre-defined operations of services (e.g., getExecutionTime()).


Experimentation


We conducted experiments using the implemented prototype system to evaluate our approach. In the experiments, a cluster of PCs were used to run the prototype system. All PCs have the same configuration of Pentium III 933MHz with 512M RAM. Each PC runs Windows 2000, Java 2 Edition V1.3.0, and Oracle XML Developer Kit (Oracle XDK, for XML parsing). They are connected to a LAN through 100Mbits/sec Ethernet cards. This section presents two experimental results. The first experiment compares the QoS metrics of the execution of composite services using the global planning and local selection approaches. The second experiment shows the system costs (i.e., computation cost and bandwidth cost) of these two approaches.

QoS Metrics

The purpose of this experiment is to compare the QoS values of the global planning approach with that of the local selection. The comparison was done by measuring price and execution time of the composite services using both approaches. In the experiment, we created several composite services with different number of basic states. The services were created by randomly adding states to the composite service shown in Figure 1. The number of states ranges over the values 10, 20, 30, 40, 50, 60, 70, and 80. For each composite service, we executed the service 12 times and recorded the price and execution time. Since we obtained similar experimental results for all created composite services, only the result of one composite service (with 20 states) is shown (see Table 2) for clarity reasons. From the table, we can see that for every instance of the composite service, the global planning approach gives better QoS than local selection approach. For example, for the instance 7 of Table 2, the time required to execute the composite service is: (i) 6451 seconds in the global planning approach, (ii) 9480 seconds in the local selection approach. Similarly, the execution price spends for executing this composite service is: (i) 1231 dollars in the global planning approach, (ii) 1789 dollars in the local selection approach. Overall, the average execution time (resp., execution price) is: (i) 6627.2 seconds (resp., 1191 dollars) in the global planning approach, (ii) 9305.9 seconds (resp., 1753 dollars) in the local selection approach.

Table 2: The QoS of a composite service with 20 states
Instance $ Q_{time}(W)$ (second) $ Q_{price}(W)$ (dollar)
No Global Local Global Local
1 6523.2 8322.4 1023 1642
2 6634.4 9123.9 1117 1728
3 6843.2 9234.5 1123 1825
4 6432.5 9292.2 1132 1824
5 6347.3 8943.3 1121 1723
6 6512.3 9902.8 1185 1888
7 6451.2 9480.4 1231 1789
8 6440.5 9470.5 1275 1787
9 6970.4 9920.4 1324 1625
10 6890.3 9628.3 1235 1759
11 6590.3 9520.3 1267 1852
12 6890.3 8920.5 1250 1599
Average: 6627.2 9305.9 1191 1753


System Costs

The aim of this experiment is to investigate the system costs of executing composite services using the LP-based global planning and local selection approaches. The experiment was done by measuring: (i) computation cost (i.e., time used for selecting a Web service for each task of the composite service); (ii) bandwidth cost (i.e., network bandwidth consumed between global planner and Web services). In the experiment, we created several composite services with different number of tasks. The services were created by randomly adding states to the composite service shown in Figure 1. The number of tasks ranges over the values 10, 20, 30, 40, 50, 60, 70, and 80. In addition, we created a set of Web services which were assigned to tasks of composite services as the candidate Web services. For each task, the number of candidate Web services we used varies as follows: 5, 10, 20, and 40 services. We executed composite services with different number of states and candidate Web services. The computation and bandwidth costs for selecting Web services were recorded. The results for composite services with 40 candidate Web services of each state are shown in figures 5 and 6 respectively. Similar results were obtained for other cases.

Figure 5: The computation cost of selecting services for composite services (each state has 40 candidate Web services)
Image staticCost_40.jpg

Figure 6: The bandwidth cost of selecting service for composite services (each state has 40 candidate Web services)
Image staticCostb_40.jpg

From Figure 5, we can see that for both global and local selection approaches, the computation cost increases when the number of tasks and the number of candidate Web services increases. As expected, the computation cost of global planning approach is a little bit higher than that of local selection approach. For example, a composite service with 40 tasks spends: (i) 1.6 seconds for selecting Web services in global planning approach, (ii) 0.7 seconds in local selection approach. Similar observations are found regarding bandwidth cost. More specifically, for both approaches, the linear increase of the number of tasks and the number of candidate Web services leads almost a linear increase of bandwidth cost (see Figure 6). The bandwidth cost in global planning is slightly higher than that of local selection approach. For example, a composite service with 40 tasks consumes about 2080 KBytes of network bandwidth for selecting Web services in global planning approach, while it consumes 1910 KBytes in local selection approach.


Related Work


In this section, we briefly discuss the relationships between our work and existing Web service standards, Web service composition approaches, and QoS-driven workflow management. Several standardisation proposals aiming at providing infrastructure to support Web services composition have recently emerged including SOAP, WSDL, UDDI, and BPEL-4WS. SOAP defines an XML messaging protocol for communication among services. WSDL is an XML-based language for describing web service interfaces. UDDI provides the directory and a SOAP-based API to publish and discover services. Finally, BPEL4WS provides a language for process-based service composition. Other proposed notations for service description and composition include ebXML and DAML-S. The above proposals however are complementary to ours. Indeed, our proposal aims at leveraging the above standards (e.g., SOAP, UDDI) to provide a quality-driven and dynamic service composition model. Web service composition is a very active area of research and development [1,7,8]. Previous efforts in this area such as CMI [11] and eFlow [6] have investigated dynamic service selection based on user requirements. In particular, CMI's service definition model features the concept of a placeholder activity to cater for dynamic composition of services. A placeholder is an abstract activity replaced at runtime with a concrete activity type. A selection policy is specified to indicate the activity that should be executed in lieu of the placeholder. In eFlow, the definition of a service node contains a search recipe represented in a query language. When a service node is invoked, a search recipe is executed in order to select a reference to a specific service. Both CMI and eFlow focus on optimizing service selection at single task level (i.e. local selection). In addition, no QoS model is explicitly supported. In contrast, our approach focuses on optimizing service selection at a composite service level, based on a generic QoS model, and using established linear programming techniques. Related work on QoS has been conducted in the area of workflow. In particular, there are a number of research proposals addressing the specification and verification of temporal constraints in workflows [9,3]. Other proposals such as METEOR [5] and CrossFlow [18,17] have considered QoS models with other parameters than time. Specifically, [5] considers four quality dimensions, namely time, cost, reliability and fidelity. However, it does not consider the dynamic composition of services. Instead, it focuses on analyzing, predicting, and monitoring QoS of workflow processes. Similarly, [18] proposes the use of continuous-time Markov chain to estimate execution time and cost of a workflow. Closer to our proposal is the one reported in [17], which explores the issue of dynamically selecting several alternative tasks within a workflow, based on quality parameters, in a similar way as eFlow does using search recipes. As stated before, this local selection strategy contrasts with the global planning approaches that we advocate. Other related research proposals include [20,21], which focus on data quality management in cooperative information systems. They investigate techniques to select best available data from different service providers based on a set of data quality dimensions such as accuracy, completeness, and consistency.


Conclusion


Dynamic selection of component services is an important issue in Web services composition. In this paper, we have presented a general and extensible model to evaluate QoS of both elementary and composite services. Based on the QoS model, a global service selection approach that uses linear programming techniques to compute optimal execution plans for composite services has been described. We have conducted experiments to compare the proposed technique with the local selection approach. The results show that the proposed approach effectively selects high quality execution plans (i.e., plans which have higher overall QoS). Our ongoing research includes the support for exception handling during composite service executions. For example, after an execution plan has been built and while it is being executed, an exception may occur (e.g., unavailability of a component service). We will explore the possibility of performing dynamic plan revision during composite service execution, as a means to respond to runtime exceptions.

Acknowledgments

The work of Boualem Benatallah is partially supported by the Australian Research Council's Discovery GRANT DP02-1120.

Bibliography

1
B. Benatallah and F. Casati, editors.
Distributed and Parallel Database, Special issue on Web Services.
Springer-Verlag, 2002.

2
B. Benatallah, M. Dumas, Q. Z. Sheng, and A. Ngu.
Declarative Composition and Peer-to-Peer Provisioning of Dynamic Web Services.
In Proc. of ICDE'02, IEEE Computer Society, pages 297-308, San Jose, 2002.

3
C. Bettini, X. Wang, and S. Jajodia.
Temporal reasoning in workflow systems.
Distributed and Parallel Databases, 11(3):269-306, 2002.

4
H. C.-L and K. Yoon.
Multiple Criteria Decision Making.
Lecture Notes in Economics and Mathematical Systems, Springer-Verlag, 1981.

5
J. Cardoso.
Quality of service and semantic composition of workflows.
Ph.D. Thesis, University of Georgia, 2002.

6
F. Casati, S. Ilnicki, L.-J. Jin, V. Krishnamoorthy, and M.-C. Shan.
eFlow: a Platform for Developing and Managing Composite e-Services.
Technical Report HPL-2000-36, HP Laboratoris, Palo Alto, 2000.

7
F. Casati, M.-C. Shan, and D. Georgakopoulos, editors.
VLDB Journal, Special issue on E-Services.
Springer-Verlag, 2001.

8
A. Dogac, editor.
ACM SIGMOD Record 31(1), Special Section on Data Management Issues in E-Commerce.
ACM, March 2002.

9
J. Eder, E. Panagos, and M. Rabinovich.
Time constraints in workflow systems.
In Proc. of the 11th International Conference on Advanced Information Systems Engineering (CAiSE), pages 286-300, Heidelberg, Germany, June, 1999.

10
M.-C. Fauvet, M. Dumas, and B. Benatallah.
Collecting and querying distributed traces of composite service executions.
In Proc. of the 10th International Conference on Cooperative Information Systems (CoopIS), Irvine, CA, USA, 2002.

11
D. Georgakopoulos, H. Schuster, A. Cichocki, and D. Baker.
Managing process and service fusion in virtual enterprises.
Information System, Special Issue on Information System Support for Electronic Commerce, 24(6):429-456, 1999.

12
D. Harel and A. Naamad.
The STATEMATE semantics of statecharts.
ACM Transactions on Software Engineering and Methodology, 5(4):293-333, 1996.

13
IBM Optimization Solutions and Library, 2002.
http://www-3.ibm.com/software/data/bi/osl/index.html.

14
IBM WSTK Toolkit.
http://alphaworks.ibm.com/tech/webservicestoolkit.

15
H. Karloff.
Linear Programming.
Birkhauser, 1991.

16
M. Köksalan and S. Zionts, editors.
Multiple Criteria Decision Making in the New Millennium.
Springer-Verlag, 2001.

17
J. Klingemann.
Controlled flexibility in workflow management.
In Proc. of the 12th International Conference on Advanced Information Systems (CAiSE), pages 126-141, Stockholm, Sweden, June 2000. Springer.

18
J. Klingemann, J. Wasch, and K. Aberer.
Deriving service models in cross-organizational workflows.
In Ninth International Workshop on Research Issues in Data Engineering: Virtual Enterprise, RIDE-VE'99, Sydney, Australia, March 1999.

19
S. Martello and P. Toth.
Knapsack Problems: Algorithms and Computer Implementations.
John Wiley and Sons, 2001.

20
M. Mecella, M. Scannapieco, A. Virgillito, R. Baldoni, T. Catarci, and C. Batini.
Managing data quality in cooperative information systems.
In Proc. of the 10th International Conference on Cooperative Information Systems (CoopIS), Irvine, CA, USA, 2002.

21
F. Naumann, U. Leser, and J. C. Freytag.
Quality-driven integration of heterogenous information systems.
In Proceedings of the International Conference on Very Large Databases (VLDB), pages 447-458, Edinburgh, UK, 1999.

22
J. O'Sullivan, D. Edmond, and A. ter Hofstede.
What's in a Service.
Distributed and Parallel Databases, 12(2-3):117-133, September 2002.

23
M. Pinedof.
Scheduling: Theory, Algorithms, and Systems (2nd Edition).
Prentice Hall, 2001.

24
Q. Z. Sheng, B. Benatallah, M. Dumas, and E. Mak.
SELF-SERV: A Platform for Rapid Composition of Web Services in a Peer-to-Peer Environment.
In Proc. of the 28th VLDB Conference, Hong Kong, China, August 2002.

25
A. van Moorsel.
Metrics for the internet age: Quality of experience and quality of business.
Technical Report HPL-2001-179, HP Labs, August 2001.
Also published in 5th Performability Workshop, September 2001, Erlangen, Germany.

26
D. D. Wackerly, W. Mendenhall, and R. L. Scheaffer.
Mathematical Statistics with Application.
Duxbury Press, 1996.

About this document ...

Quality Driven Web Services Composition

This document was generated using the LaTeX2HTML translator Version 2K.1beta (1.48)

Copyright © 1993, 1994, 1995, 1996, Nikos Drakos, Computer Based Learning Unit, University of Leeds.
Copyright © 1997, 1998, 1999, Ross Moore, Mathematics Department, Macquarie University, Sydney.

The command line arguments were:
latex2html -dir htm -split 1 p358-zeng.tex

The translation was initiated by Quan Zheng Sheng on 2003-03-31


next_inactive up previous
Quan Zheng Sheng 2003-03-31