Quality Driven Web Services Composition
Liangzhao Zeng |
Boualem Benatallah |
Marlon Dumas |
small University of New South Wales |
University of New South Wales |
Queensland University of
Technology |
Sydney, Australia |
Sydney, Australia |
Brisbane Australia |
zlzhao@cse.unsw.edu.au |
boualem@cse.unsw.edu.au |
m.dumas@qut.edu.au |
Jayant Kalagnanam |
Quan Z. Sheng |
|
IBM T.J. Watson Research Center |
University of New South Wales |
|
New York, USA |
Sydney, Australia |
|
jayant@us.ibm.com |
qsheng@cse.unsw.edu.au |
|
Copyright is
held by the author/owner(s).
WWW2003, May 20-24, 2003, Budapest,
Hungary.
ACM 1-58113-680-3/03/0005.
Abstract:
The process-driven composition of Web services is emerging as a promising
approach to integrate business applications within and across organizational
boundaries. In this approach, individual Web services are federated into
composite Web services whose business logic is expressed as a process model. The
tasks of this process model are essentially invocations to functionalities
offered by the underlying component services. Usually, several component
services are able to execute a given task, although with different levels of
pricing and quality. In this paper, we advocate that the selection of component
services should be carried out during the execution of a composite service,
rather than at design-time. In addition, this selection should consider multiple
criteria (e.g., price, duration, reliability), and it should take into account
global constraints and preferences set by the user (e.g., budget constraints).
Accordingly, the paper proposes a global planning approach to optimally select
component services during the execution of a composite service. Service
selection is formulated as an optimization problem which can be solved using
efficient linear programming methods. Experimental results show that this global
planning approach outperforms approaches in which the component services are
selected individually for each task in a composite service.
H.3.5Information SystemsWeb-based services Management, Performance
Introduction
Web services technologies
are emerging as a powerful vehicle for organizations that need to integrate
their applications within and across organizational boundaries. In particular,
the process-based composition of Web services is gaining a considerable momentum
as an approach for the effective integration of distributed, heterogeneous, and
autonomous applications [1].
In this approach, applications are encapsulated as Web services and the logic of
their interactions is expressed as a process model. This approach provides an
attractive alternative to hand-coding the interactions between applications
using general-purpose programming languages. A Web service is a self-described
application that uses standard Internet technologies to interact with other Web
services. An example of a Web service is a SOAP-based interface to place bids in
an auction house. Once deployed, services can be aggregated into composite
services. An example of a composite service would be a ``Travel Planner''
system that aggregates multiple component services for flight booking,
travel insurance, accommodation booking, car ren-tal, and itinerary planning,
which are executed sequentially or concurrently. The process model underlying a
composite service identifies the functionalities required by the services to be
composed (i.e., the tasks of the composite service) and their
interactions (e.g., control-flow, data-flow, and transactional dependencies).
Component services that are able to provide the required functionalities are
then associated to the individual tasks of the composite services and invoked
during each execution of the composite service. The number of services providing
a given functionality may be large and constantly changing. Consequently,
approaches where the development of composite services requires the
identification at design-time of the exact services to be composed are
inappropriate. The runtime selection of component services during the execution
of a composite service has been put forward as an approach to address this
issue [2,6,11].
The idea is that component services are selected by the composite service
execution engine based on a set of criteria. However, previous approaches in
this area have not identified a set of criteria (other than price and
application-specific criteria) for selecting Web services. In addition, existing
service selection approaches adopt a local selection strategy, meaning that they
assign a component service to an individual tasks, one at a time. As a result,
these approaches are not able to handle global user constraints and preferences.
For example, the overall duration of the composite service execution should be
minimized, or a given budget constraint should be satisfied. In this paper, we
present quality-driven approach to select component services during the
execution of a composite service. The salient features of our approach are:
- A Web services quality model. We propose an extensible
multi-dimensional Web services quality model. The dimensions of this model
characterize non-functi-onal properties that are inherent to Web services in
general: execution price, execution duration,
reputation, reliability, and availability.
- Quality-driven service selection. In order to overcome the
limitations of local service selection outlined above, we propose a global
planning approach. In this approach, quality constraints and preferences are
assigned to composite services rather than to individual tasks within a
composite service. Service selection is then formulated as an optimization
problem and a linear programming method is used to compute optimal service
execution plans for composite services. Experimental results show that the
proposed service selection strategy significantly outperforms local selection
strategies.
The rest of the paper is organized as follows.
Section 2
presents the service composition model and defines some key concepts used
throughout the paper. Section 3
defines the service quality criteria used for service selection and explains how
the values of these quality criteria can be computed for a given service.
Section 4
formulates the global service selection problem and describes a linear
programming method to efficiently solve it. Section 5
presents a prototype implementation of the proposed approach, as well as a set
of experiments comparing the global planning approach with the local selection
approach. Finally, Section 6
discusses related work, and Section 7
draws some conclusions.
Web Service Composition Model
In this
section, we will present some basic concepts in Web service composition first,
then give some definitions on composite service execution planning.
A composite Web service is an umbrella structure aggregating multiple
other elementary and composite Web services, which interact with each other
according to a process model. Following our previous work [2],
we choose to specify the process model of a composite service as a
statechart [12].
The choice of statecharts for specifying composite Web services is motivated by
two main reasons: (i) statecharts have a well-defined semantics; and (ii) they
offer the basic flow constructs found in contemporary process modeling languages
(i.e., sequence, conditional branching, structured loops, concurrent threads,
and inter-thread synchronization). The first characteristic facilitates the
application of formal manipulation techniques to statechart models, while the
second characteristic ensures that the service composition mechanisms developed
in the context of statecharts, can be adapted to other process modeling
languages like, for example, those that are being designed by Web services
standardization efforts (e.g., BPEL4WS, WSCI, BPML). A
statechart is made up of states and transitions. In the proposed composition
framework, the transitions of a statechart are labeled with events, conditions,
and assignment operations over process variables. States can be basic or
compound. Basic states are labelled with invocations to Web services
operations. Compound states contain one or several statecharts within them.
Specifically, compound states come in two flavors: OR and AND states. An
OR-state contains a single statechart within it whereas an AND-state contains
several statecharts (separated by dashed lines) which are intended to be
executed concurrently. Accordingly, OR-states are used as a decomposition
mechanism for modularity purposes, while AND-states are used to express
concurrency: they encode a fork/join pair. The initial state of a statechart is
denoted by a filled circle, while the final state is denoted by two concentric
circles: one filled and the other unfilled. A simplified statechart specifying a ``Travel Planner'' composite Web Service is
depicted in Figure 1.
In this composite service, a search for attractions is performed in parallel
with a flight and an accommodation booking. After these searching and booking
operations are completed, the distance from the hotel to the accommodation is
computed, and either a car or a bike rental service is invoked. Note that when
two transitions stem from the same state (in this case the state ), they denote a conditional branching, and the transitions should
therefore be labelled with disjoint conditions.
Figure 1: Statechart of a composite
service ``Travel Planner''
|
A basic state of a
statechart describing a composite service can be labelled with an invocation to
either of the following:
- An elementary Web service, i.e., a service which does not transparently
rely on other Web services.
- A composite Web service aggregating several other services.
- A Web service community, i.e., a collection of Web services with a
common functionality although different non-functional properties (e.g., with
different providers, different QoS parameters, reputation, etc.)
The
concept of Web service community addresses the issue of composing a large and
changing collection of Web services. Service communities provide descriptions of
a desired functionality (e.g., flight booking) without referring to any actual
service (e.g., Qantas flight booking Web service). The set of members of a
community can be fixed when the community is created, or it can be determined
through a registration mechanism, thereby allowing service providers to join,
quit, and reinstate the community at any time. When a community receives a
request to execute an operation, this request is delegated to one of its current
members. The choice of the delegatee is done at execution time based on the
parameters of the request, the characteristics of the members, the history of
past executions, and the status of ongoing executions. Sections 3
and 4
deal with the selection of delegatees during the execution of a composite
service whose states are labelled with invocations to communities.
In this section, we define two concepts used in the rest of the paper:
execution path and execution plan.
Definition 1 (Execution path). An
execution path of a statechart is a sequence of states [, ,
.. ], such that is
the initial state, is
the final state, and for every state
(),
the following holds:
- is a direct successor of one of the states in
[,...,]
- is not a direct successor of any of the states
in [,...,]
- There is no state
in [, ..., ]
such that
and belong to two alternative branches of the
statechart.
- If t
is the initial state of one of the concurrent regions of an AND-state
AST, then for every other concurrent region C in AST, one
of the initial states of C appears in [,
..., ,
, ..., ]. In other words, when an AND-state is entered, all its concurrent
branches are executed.
This definition relies on the concept of direct successor of a state.
A basic state
is a direct successor of another basic state
if there is a sequence of adjacent transitions going
from to
without traversing any other basic state. In other words, the first transition
in the sequence stems from ,
the last transition leads to ,
and all intermediate transitions stem from and lead to either compound, initial,
or final states, but are not incident to a basic state. It is straightforward to
see that an acyclic statechart has a finite number of execution paths. To
simplify the presentation, we initially assume that all the statecharts that we
deal with are acyclic. If a statechart contains cycles, a technique for
``unfolding'' it into an acyclic statechart needs to be applied beforehand. The
method used to unfold the cycles of a statechart is to examine the logs of past
executions in order to determine the average number of times that each cycle is
taken. The states appearing between the beginning and the end of the cycle are
then cloned as many times as the cycle is taken in average. Details about this
unfolding process are omitted for space reasons. Under the assumption that the
underlying statechart is acyclic, it is possible to represent an execution path
of this statechart as a Directed Acyclic Graph (DAG) as follows.
Definition 2 (DAG representation of an execution path). Given
an execution path [,
, .. ]
of a statechart ST, the DAG representation of this execution path is a graph
obtained as follows:
- The DAG has one node for each task ,
, .. .
- The DAG contains an edge from task
to task
iff is a direct successor of in the statechart ST.
If a statechart diagram contains conditional branchings, it has multiple
execution paths. Each execution path represents a sequence of tasks to complete
a composite service execution. Figure 2
gives an example of a statechart's execution paths. In this example, since there
is one conditional branching after task ,
there are two paths, called
and
respectively. In execution path ,
task is executed after task , while in execution path ,
task is executed after task .
Figure 2: DAG representation of the
execution paths of the statechart of Figure 1.
|
As stated before, the basic states of a statechart describing a composite
service can be labelled with invocations to communities. If this is the case,
actual Web services (i.e., members of communities) need to be selected during
the execution of the composite service. Hence, it is possible to execute a path
in very different ways by allocating different Web services to the states in the
path. The concept of execution plan defined below captures the various ways of
executing a given execution path.
Definition 3 (Execution plan). A set
of pairs
is an execution plan of an execution path
iff:
- {, ,
... } is the set of tasks in .
- For each 2-tuple
in , service
is assigned the execution of task .
Web Service Quality Model
In a Web
environment, multiple Web services may provide similar functionalities with
different non-functional property values (e.g., different prices). In the
composition model presented in the previous section, such Web services will
typically be grouped together in a single community. To differentiate the
members of a community during service selection, their non-functional properties
need to be considered. For this purpose, we adopt a Web services quality model
based on a set of quality criteria (i.e. non-functional properties) that are
transversal to all Web services, for example, their pricing and reliability.
Although the adopted quality model has a limited number of criteria (for the
sake of illustration), it is extensible: new criteria can be added without
fundamentally altering the service selection techniques built on top of the
model. In particular, it is possible to extend the quality model to integrate
non-functional service characteristics such as those proposed in [22],
or to integrate service QoS metrics such as those proposed by [25].
In this section, we first present the quality criteria in the context of
elementary services, before turning our attention to composite services. For
each criterion, we provide a definition, indicate its granularity (i.e., whether
it is defined for an entire service or for individual service operations), and
provide rules to compute its value for a given service.
We consider five generic quality criteria for elementary
services: (1) execution price, (2)execution duration, (3)
reputation, (4) reliability, and (5) availability.
- Execution price. Given an operation
of a service , the execution price
is the amount of money that a service requester has to pay for executing the
operation . Web service providers either directly advertise the execution price
of their operations, or they provide means to enquire about it.
- Execution duration. Given an operation
of a service , the execution duration
measures the expected delay in seconds between the moment when a request is
sent and the moment when the results are received. The execution duration is
computed using the expression , meaning that the execution duration is the sum of the processing
time
and the transmission time . Services advertise their processing time or provide methods to
enquire about it. The transmission time is estimated based on past executions
of the service operations, i.e., , where
is a past observation of the transmission time, and
is the number of execution times observed in the past.
- Reliability. The reliability
of a service
is the probability that a request is correctly responded within a the maximum
expected time frame (which is published in the Web service description).
Reliability is a technical measure related to hardware and/or software
configuration of Web services and the network connections between the service
requesters and providers. The value of the reliability is computed from
historical data about past invocations using the expression , where
is the number of times that the service
has been successfully delivered within the maximum expected time frame, and
and is the total number of invocations.
- Availability. The availability
of a service
is the probability that the service is accessible. The value of the
availability of a service
is computed using the following expression , where
is the total amount of time (in seconds) in which service is available during the last seconds (
is a constant set by an administrator of the service community). The value of
may vary depending on a particular
application. For example, in applications where services are more frequently
accessed (e.g., stock exchange), a small value of gives a more accurate approximation for the
availability of services. If the service is less frequently accessed (e.g.,
online bookstore), using a larger
value is more appropriate. Here, we assume that Web services send
notifications to the system about their running states (i.e., available,
unavailable).
- Reputation. The reputation
of a service
is a measure of its trustworthiness. It mainly depends on end user's
experiences of using the service . Different end users may have different opinions on the same
service. The value of the reputation is defined as the average ranking given
by to the service by end users, i.e., , where
is the end user's ranking on a service's reputation,
is the number of times the service has been graded. Usually, the end users are
given a range to rank Web services, for example, in Amazon.com, the range is
.
Given the above quality criteria, the
quality vector of a service
is defined as follows:
|
|
|
(1) |
Note that the method for computing the value of the quality criteria is
not unique. The global planning model presented Section 4
is independent of these methods.
The above quality criteria are also applied to evaluate
the QoS of composite services. Table 1
provides aggregation functions for the computation of the QoS of a composite
service CS when executed using plan .
A brief explanation of each criterion's aggregation function follows:
Table 1: Aggregation functions for computing the QoS
of execution plans
Criteria |
Aggregation function |
|
Price |
|
|
Duration |
|
|
Reputation |
|
|
Reliability |
|
|
Availability |
|
|
- Execution price: The execution price
of an execution plan
is a sum of every service 's execution price .
- Execution duration: The execution duration of an execution plan
is computed using the Critical Path Algorithm (CPA) [23].
Specifically, the CPA is applied to the the execution path of execution plan
, seen as a project digraph. The critical path
of a project digraph is a path from the initial state to the final state which
has the longest total sum of weights labelling its nodes. In the case at hand,
the weights labelling the nodes correspond to the maximum expected execution
durations. A task that belongs to the critical path is a critical
task, while a service that belongs to the critical path is a critical
service. Figure 3
provides an example of critical path. In this example, the project digraph
represents execution path
and its execution plan , where ={ , , , ,
}. Each task's execution duration is given in the project digraph. There are
two project paths in this project digraph, where project path 1 is
and project path 2 is . The total execution time of project path 1 (project path 2) is 37
seconds (62 seconds). Since project path 2's total execution duration is
longer than that of project path 1, the critical path for the project digraph
is project path 2. Thus, the execution plan's total execution duration is 62
seconds. Task , ,
and
are critical tasks. Services , ,
and
are critical services.
- Reputation: The reputation
of an execution plan
is the average of each service 's reputation
in the execution plan .
- Reliability: Reliability
of an execution plan
is a product of . In the aggregation function,
is equal to 1 if service
is a critical service in the execution plan , or 0 otherwise. If , i.e., service
is not a critical service, then , and hence, the reliability of service
will not affect the value of execution plan's reliability.
- Availability: The availability
of an execution plan
is a product of , where
is service 's availability.
Using above aggregation functions, the
quality vector of a composite service's execution plan is defined as:
|
|
|
(2) |
Global
Service Selection
As mentioned before, in existing approaches, the
selection of component service to execute a task is determined independently to
other tasks of composite services [2,11,6].
More precisely, in our previous work [2],
service selection is done at each service community locally. The selection of a
service is based on a selection policy involving parameters of the request, the
characteristics of the members, the history of past executions, and the status
of ongoing executions. Although service selection can be locally optimized, the
global quality constraints may not be satisfied. For example, a global
constraint such as composite services' execution price is less than 500 dollars
can not be enforced. In this section, we present a global planning based
approach for Web services selection. We first present an approach of selecting
an optimal execution plan for a composite service, then present a novel linear
programming based method for optimal execution plan selection.
Selecting an Optimal Execution Plan
The basic
idea of global planning is the same as query optimization in database management
systems. Several plans are identified and the optimal plan is selected. The
foregoing discussion makes it clear that a statechart has multiple execution
paths and each execution path has its own set of execution plans if the
statechart contains conditional branchings. In this subsection, we assume that
the statechart does not contain any conditional branchings and has only one
execution path. We will discuss the case where a statechart has multiple
execution paths in Section 4.2.
We also assume that for each task ,
there is a set of candidate Web services
that are available to which task
can be assigned. Associated with each Web service
is a quality vector (see equation 1).
Based on the available Web services, by selecting a Web service for each task in
an execution path, the global planner will generate a set of execution plans
:
|
|
|
(3) |
is the number of execution plans. After a set of
execution plans is generated, the system needs to select an optimal execution
plan. When selecting the execution plan, instead of computing the quality vector
of a particular Web service, each execution plan's global service quality vector
needs to be computed. The selection of execution plan uses Multiple Attribute
Decision Making (MADM)[16]
approach. Once the quality vector for each execution plan is derived, by
accumulating all the execution plans' quality vectors, we obtain matrix ,
where each row represents an execution plan's quality vector.
|
|
|
(4) |
A Simple Additive Weighting (SAW) [4]
technique is used to select an optimal execution plan. Basically, there are two
phases in applying SAW:
Some of the
criteria used could be negative, i.e., the higher the value is, the lower the
quality is. This includes criteria such as execution time and execution price.
Other criteria are positive criteria, i.e., the higher the value is, the higher
the quality is. For negative criteria, values are scaled according to Equation
5.
For positive criteria, values are scaled according to Equation 6.
|
|
|
(5) |
|
|
|
(6) |
In the above equations,
is maximal value of a quality criterion in matrix ,
i.e., .
is minimal value of a quality criterion in matrix ,
i.e., .
In fact, we can compute
and
without generating all possible execution plans. For example, in order to
compute the maximum execution price (i.e., )
of all the execution plans, we select the most expensive Web service for each
task and sum up all these execution prices to compute .
In order to compute the minimum execution duration (i.e., )
of all the execution plans, we select the Web service that has shortest
execution duration for each task and use CPA to compute .
The computation cost of
and
is polynomial. After the scaling phase, we obtain the following matrix :
The following
formula is used to compute the overall quality score for each execution plan:
|
|
|
(7) |
where
and .
represents the weight of each criterion. In (7),
end users can give their preference on QoS (i.e., balance the impact of the
different criteria) to select a desired execution plan by adjusting the value of
. The global planner will choose the execution
path which has the maximal value of
(i.e., ). If there are more than one execution plans which have the same
maximal value of ,
then an execution plan will be selected from them randomly.
Handling
Multiple Execution Paths
In Section 4.1,
we assume that the statechart only has one execution path. In this subsection,
we discuss the case where statecharts have multiple execution paths. Assume that
a statechart has
execution paths. For each execution path, an optimal execution plan can be
selected. So, the global planner has
selected execution plans. Since each selected optimal execution plan only covers
a subset of the entire statechart, then the global planner needs to aggregate
these execution plans into an overall execution plan
to covers all the tasks in the statechart. This overall execution plan will be
used to execute the statechart. For example, for travel planner
statechart
(see Figure 1),
there are two execution paths
and .
The optimal execution plans
and of these two execution paths are selected. From
the Figure 2,
it can be seen that both execution paths
and
are subsets of .
Thus neither
nor covers all tasks in .
Since the global planner conducts planning before the execution time, it does
not know which execution path will eventually be used for the composite service.
Therefore it needs to aggregate
and into an overall execution plan which covers all
the tasks in .
Assume that statechart
has tasks (i.e., ,
, ..., )
and execution paths (i.e., , ,..., ). Thus, for each execution path, the global planner selects an optimal
execution plan. Consequently, we obtain
optimal execution plans (i.e., )
for these execution paths. The global planner adopts the following approach to
aggregate multiple execution plans into an overall execution plan.
- Given a task , if only belongs to one execution path (e.g., ), then the global planner selects 's execution plan to execute the task . We denote this as . For example, in trip planning statechart, task (i.e.,CarRental ) only belongs to execution
path . In this case, 's execution plan is used to execute , i.e., .
- Given a task , if belongs to more than one execution paths (e.g., , , ..., ), then there is a set of execution plans (i.e., , , ..., ) that can be used to execute . In this case, the global planner needs to select one of the
execution plans from , , ... , . The selection can be done by identifying the hot path for
task . Here, the hot path of a task is defined as the execution path that has been most frequently used
to execute task in the past. For example, in travel planner statechart,
task (FlightTicketBooking) belongs to both execution path and . Assume that the statechart
has been used to execute the composite service for 25 times. Also assume that,
in 20 times the execution of the composite service follows the execution path
; while in 5 times, the execution of the
composite service follows the execution path . This indicates that execution path
has been more frequently used to execute task (i.e.,
is the hot path for ). Thus, 's execution plan is used to execute , i.e., .
The system keeps composite service execution traces in an
execution history [10].
This allows the global planner to identify hot path for each task.
The approach of selecting an optimal execution plan given in the
previous section requires the generation of all possible execution plans. Assume
that there are
tasks in a statechart and there are
potential Web services for each task. The total number of execution plans is
. The computation cost of selecting an optimal
execution plan is .
Such an approach is impractical for large scale composite services, where both
the number of tasks in the composite services and the number of candidate Web
services in communities are large. For example, assume that a composite service
has one execution path and
tasks, and for each task, there are
candidate Web services. Then the total number of execution plans is . It is very costly to generate all these plans and select an optimal one. In this
subsection, we present a method based on linear programming (LP) [15],
which can be used to select an optimal execution plan without generating all the
possible execution plans. There are three inputs in LP: variables, an
objective function and constraints on the variables, where
both the objective function and constraints must be linear. LP attempts to
maximize or minimize the value of the objective function by adjusting the values
of variables based on the constraints. The output of LP is the maximum (or
minimum) value of the objective function as well as the values of variables at
this maximum or minimum point. In order to use LP to select an optimal execution
plan, we model the selection of an optimal execution plan as an LP problem. The
variables of the LP problem are
representing the participation of service
in the selected execution plan. The value of each variable is 1 if service
is in the selected plan, 0 otherwise. The objective function of the LP problem,
which is based on equations 5,6,
and 7,
is:
|
|
|
(8) |
where
and .
is the weight assigned to quality criteria . In the following subsections, we discuss the constraints
on the variables of the LP problem.
In this subsection, we consider constraints on the
execution duration and the execution price of an execution plan. Assume that
is the set of all tasks (i.e., basic states) of
the statechart. For each task ,
there is a set of Web services
that can be assigned to this task, but on the end, for each task , only one Web service should be selected. Given that (
or ) denotes the participation of Web service in the selected plan, this latter fact is
captured by the following constraints:
|
|
|
(9) |
For example, there are 100 potential Web services that can execute task
, since only one of them will be selected to
execute the task ,
then we have .
Assume that variable
represents the earliest start time of task ,
variable
represents the execution duration for task ,
and variable
represents the execution duration for task
by service .
We use the notation
to denote that task
is task 's
direct successor task. We have the following constraints:
Constraint 10
indicates that the execution duration of a given task is equal to the execution duration of one of the Web
services in .
Constraint 11
captures the fact that if task
is a direct successor of task ,
the execution of task
must start after task
has been completed. Constraint 12
indicates that the execution of a composite service is completed only when all
its tasks are completed. Assume that
is an integer variable that has value
or 0: indicates that Web service is a critical service and 0 indicates otherwise. We have
the following constraint on execution plan's execution duration :
|
|
|
(13) |
For execution price, assume that variable
represents the execution price of Web service ,
then we have the following constraint on total execution price of composition
service:
|
|
|
(14) |
An alternative of constraint 14
is as follows:
|
|
|
(15) |
where
is the budget constraint given by the user. This constraint indicates that the
entire composite service's execution price can not be greater than . By introducing a budget constraint the above problem
needs to be explicitly solved as an integer programming problem. This problem is
a special case of the knapsack problem and hence it is NP-hard [19].
Notice that constraints on other criteria can be easily incorporated into LP if
the aggregation function is a linear function. For example, assume that variable
represents the reputation of Web service , we can have the following constraint on the
execution plan's reputation:
|
|
|
(16) |
In this subsection, we consider constraints on criteria
where the aggregation function is not a linear function. Among the criteria that
are used to select Web services, both the availability's and the reliability's
aggregation functions are non-linear (see Table 1).
We can linearize them using a logarithm function as shown below. Assume that
variable
represents the reliability of Web service .
Since
indicates whether Web service
is a critical service or not, the reliability of execution plan is:
By applying the logarithm function ,
we obtain:
Since
and
or , we obtain:
Let , we have the following constraint on the execution plan's reliability:
|
|
|
(17) |
Similarly, assuming that
represents the availability of the Web service ,
the following constraint is introduced:
|
|
|
(18) |
where .
Criteria that can be introduced into the LP problem are not limited to what we
defined in Section 3.
Other criteria can be added once the aggregation functions are given.
In the previous sections, we assume that the execution
durations
of Web services are deterministic. In reality, the execution duration of a Web
service
may be uncertain. For example, Service
may advertise that the execution duration is 5 seconds, but the actual execution
duration may be 4.5, 4.6, or 5.2 seconds. To address this issue, we model each
using a normal distribution. Variable has therefore the following probability
function:
where the mean
and the std. deviation
are given by:
|
|
|
(19) |
|
|
|
(20) |
By applying formulas 19
and 20
to the execution logs of a Web service ,
we can obtain
and
for this Web service. Since , the total execution duration
must follow a normal distribution whose
deviation
is:
|
|
|
(21) |
So in order to consider the deviation of the total execution duration in
the LP problem, we should adopt the following objective function:
|
|
|
(22) |
where
and
is the weight assigned to the deviation of the total execution duration.
Given the above inputs for the LP problem, the output of the LP solver
will be a set of values for variables ,
which indicate the selection or exclusion of Web services . The selected Web services compose an optimal execution
plan.
Validation
In this section, we present the implementation of the QoS-driven
selection of services and some experimental results to evaluate the proposed
approach.
The proposed
QoS-driven selection technique is implemented in the SELF-SERV prototype.
Detailed description of SELF-SERV can be found in [24].
In this section, we briefly overview the prototype architecture and discuss its
extension to support the service selection approach. The prototype architecture
(Figure 4)
features an interface, a service manager and a pool of
services. Services communicate via Simple Object Access Protocol (SOAP)
messages. The service manager consists of four modules, namely the
service discovery engine, the service editor, the
composite service orchestrater and the global planner. The
service discovery engine facilitates the advertisement and location of
services. It is implemented using the Universal Description, Discovery and
Integration (UDDI), the Web Service Description Language (WSDL), and SOAP. In
the implementation, we make extensive use of the IBM Web Services Toolkit 2.4
(WSTK2.4) [14],
which is a showcase package for Web services emerging technologies.
Figure 4: Architecture of the
prototype.
|
The service editor provides facilities for defining new services
and editing existing ones. A service is edited through a visual interface, and
translated into an XML document for subsequent analysis and processing by the
service orchestrater. The orchestrater is responsible for scheduling,
initiating, and monitoring the invocations to the tasks of a composite service
during its execution, and for routing events and data items between these
components. Finally, global planner is the module that plans the
execution of a composite service using the global planning based approach. The
global planner is implemented as a linear programming solver based on IBM's
Optimization Solutions and Library (OSL) [13].
It should be noted that QoS information is retrieved via pre-defined operations
of services (e.g., getExecutionTime()).
Experimentation
We conducted experiments using the
implemented prototype system to evaluate our approach. In the experiments, a
cluster of PCs were used to run the prototype system. All PCs have the same
configuration of Pentium III 933MHz with 512M RAM. Each PC runs Windows 2000,
Java 2 Edition V1.3.0, and Oracle XML Developer Kit (Oracle XDK, for XML
parsing). They are connected to a LAN through 100Mbits/sec Ethernet cards. This
section presents two experimental results. The first experiment compares the QoS
metrics of the execution of composite services using the global planning and
local selection approaches. The second experiment shows the system costs (i.e.,
computation cost and bandwidth cost) of these two approaches.
The purpose of this
experiment is to compare the QoS values of the global planning approach with
that of the local selection. The comparison was done by measuring price and
execution time of the composite services using both approaches. In the
experiment, we created several composite services with different number of basic
states. The services were created by randomly adding states to the composite
service shown in Figure 1.
The number of states ranges over the values 10, 20, 30, 40, 50, 60, 70, and 80.
For each composite service, we executed the service 12 times and recorded the
price and execution time. Since we obtained similar experimental results for all
created composite services, only the result of one composite service (with 20
states) is shown (see Table 2)
for clarity reasons. From the table, we can see that for every instance of the
composite service, the global planning approach gives better QoS than local
selection approach. For example, for the instance 7 of Table 2,
the time required to execute the composite service is: (i) 6451 seconds in the
global planning approach, (ii) 9480 seconds in the local selection approach.
Similarly, the execution price spends for executing this composite service is:
(i) 1231 dollars in the global planning approach, (ii) 1789 dollars in the local
selection approach. Overall, the average execution time (resp., execution price)
is: (i) 6627.2 seconds (resp., 1191 dollars) in the global planning approach,
(ii) 9305.9 seconds (resp., 1753 dollars) in the local selection approach.
Table 2: The QoS of a composite service with 20
states
Instance |
(second) |
(dollar) |
No |
Global |
Local |
Global |
Local |
1 |
6523.2 |
8322.4 |
1023 |
1642 |
2 |
6634.4 |
9123.9 |
1117 |
1728 |
3 |
6843.2 |
9234.5 |
1123 |
1825 |
4 |
6432.5 |
9292.2 |
1132 |
1824 |
5 |
6347.3 |
8943.3 |
1121 |
1723 |
6 |
6512.3 |
9902.8 |
1185 |
1888 |
7 |
6451.2 |
9480.4 |
1231 |
1789 |
8 |
6440.5 |
9470.5 |
1275 |
1787 |
9 |
6970.4 |
9920.4 |
1324 |
1625 |
10 |
6890.3 |
9628.3 |
1235 |
1759 |
11 |
6590.3 |
9520.3 |
1267 |
1852 |
12 |
6890.3 |
8920.5 |
1250 |
1599 |
Average: |
6627.2 |
9305.9 |
1191 |
1753
| |
The aim of this
experiment is to investigate the system costs of executing composite services
using the LP-based global planning and local selection approaches. The
experiment was done by measuring: (i) computation cost (i.e., time used for
selecting a Web service for each task of the composite service); (ii) bandwidth
cost (i.e., network bandwidth consumed between global planner and Web services).
In the experiment, we created several composite services with different number
of tasks. The services were created by randomly adding states to the composite
service shown in Figure 1.
The number of tasks ranges over the values 10, 20, 30, 40, 50, 60, 70, and 80.
In addition, we created a set of Web services which were assigned to tasks of
composite services as the candidate Web services. For each task, the number of
candidate Web services we used varies as follows: 5, 10, 20, and 40 services. We
executed composite services with different number of states and candidate Web
services. The computation and bandwidth costs for selecting Web services were
recorded. The results for composite services with 40 candidate Web services of
each state are shown in figures 5
and 6
respectively. Similar results were obtained for other cases.
Figure 5: The computation cost of
selecting services for composite services (each state has 40 candidate Web
services)
|
Figure 6: The bandwidth cost of
selecting service for composite services (each state has 40 candidate Web
services)
|
From Figure 5,
we can see that for both global and local selection approaches, the computation
cost increases when the number of tasks and the number of candidate Web services
increases. As expected, the computation cost of global planning approach is a
little bit higher than that of local selection approach. For example, a
composite service with 40 tasks spends: (i) 1.6 seconds for selecting Web
services in global planning approach, (ii) 0.7 seconds in local selection
approach. Similar observations are found regarding bandwidth cost. More
specifically, for both approaches, the linear increase of the number of tasks
and the number of candidate Web services leads almost a linear increase of
bandwidth cost (see Figure 6).
The bandwidth cost in global planning is slightly higher than that of local
selection approach. For example, a composite service with 40 tasks consumes
about 2080 KBytes of network bandwidth for selecting Web services in global
planning approach, while it consumes 1910 KBytes in local selection approach.
Related
Work
In this section, we briefly discuss the relationships between our
work and existing Web service standards, Web service composition approaches, and
QoS-driven workflow management. Several standardisation proposals aiming at
providing infrastructure to support Web services composition have recently
emerged including SOAP, WSDL, UDDI, and BPEL-4WS. SOAP defines an XML messaging
protocol for communication among services. WSDL is an XML-based language for
describing web service interfaces. UDDI provides the directory and a SOAP-based
API to publish and discover services. Finally, BPEL4WS provides a language for
process-based service composition. Other proposed notations for service
description and composition include ebXML and DAML-S. The above proposals
however are complementary to ours. Indeed, our proposal aims at leveraging the
above standards (e.g., SOAP, UDDI) to provide a quality-driven and dynamic
service composition model. Web service composition is a very active area of
research and development [1,7,8].
Previous efforts in this area such as CMI [11]
and eFlow [6]
have investigated dynamic service selection based on user requirements. In
particular, CMI's service definition model features the concept of a
placeholder activity to cater for dynamic composition of services. A
placeholder is an abstract activity replaced at runtime with a concrete activity
type. A selection policy is specified to indicate the activity that should be
executed in lieu of the placeholder. In eFlow, the definition of a service node
contains a search recipe represented in a query language. When a
service node is invoked, a search recipe is executed in order to select a
reference to a specific service. Both CMI and eFlow focus on optimizing service
selection at single task level (i.e. local selection). In addition, no QoS model
is explicitly supported. In contrast, our approach focuses on optimizing service
selection at a composite service level, based on a generic QoS model, and using
established linear programming techniques. Related work on QoS has been
conducted in the area of workflow. In particular, there are a number of research
proposals addressing the specification and verification of temporal constraints
in workflows [9,3].
Other proposals such as METEOR [5]
and CrossFlow [18,17]
have considered QoS models with other parameters than time. Specifically, [5]
considers four quality dimensions, namely time, cost, reliability and fidelity.
However, it does not consider the dynamic composition of services. Instead, it
focuses on analyzing, predicting, and monitoring QoS of workflow processes.
Similarly, [18]
proposes the use of continuous-time Markov chain to estimate execution time and
cost of a workflow. Closer to our proposal is the one reported in [17],
which explores the issue of dynamically selecting several alternative tasks
within a workflow, based on quality parameters, in a similar way as eFlow does
using search recipes. As stated before, this local selection strategy contrasts
with the global planning approaches that we advocate. Other related research
proposals include [20,21],
which focus on data quality management in cooperative information systems. They
investigate techniques to select best available data from different service
providers based on a set of data quality dimensions such as accuracy,
completeness, and consistency.
Conclusion
Dynamic selection of component
services is an important issue in Web services composition. In this paper, we
have presented a general and extensible model to evaluate QoS of both elementary
and composite services. Based on the QoS model, a global service selection
approach that uses linear programming techniques to compute optimal execution
plans for composite services has been described. We have conducted experiments
to compare the proposed technique with the local selection approach. The results
show that the proposed approach effectively selects high quality execution plans
(i.e., plans which have higher overall QoS). Our ongoing research includes the
support for exception handling during composite service executions. For example,
after an execution plan has been built and while it is being executed, an
exception may occur (e.g., unavailability of a component service). We will
explore the possibility of performing dynamic plan revision during composite
service execution, as a means to respond to runtime exceptions.
The work of
Boualem Benatallah is partially supported by the Australian Research Council's
Discovery GRANT DP02-1120.
-
- 1
- B. Benatallah and F. Casati, editors.
Distributed and
Parallel Database, Special issue on Web Services.
Springer-Verlag,
2002.
- 2
- B. Benatallah, M. Dumas, Q. Z. Sheng, and A. Ngu.
Declarative Composition and Peer-to-Peer Provisioning of Dynamic Web
Services.
In Proc. of ICDE'02, IEEE Computer Society, pages
297-308, San Jose, 2002.
- 3
- C. Bettini, X. Wang, and S. Jajodia.
Temporal reasoning
in workflow systems.
Distributed and Parallel Databases,
11(3):269-306, 2002.
- 4
- H. C.-L and K. Yoon.
Multiple Criteria Decision
Making.
Lecture Notes in Economics and Mathematical Systems,
Springer-Verlag, 1981.
- 5
- J. Cardoso.
Quality of service and semantic composition of
workflows.
Ph.D. Thesis, University of Georgia, 2002.
- 6
- F. Casati, S. Ilnicki, L.-J. Jin, V. Krishnamoorthy, and
M.-C. Shan.
eFlow: a Platform for Developing and Managing Composite
e-Services.
Technical Report HPL-2000-36, HP Laboratoris, Palo Alto, 2000.
- 7
- F. Casati, M.-C. Shan, and D. Georgakopoulos, editors.
VLDB Journal, Special issue on E-Services.
Springer-Verlag,
2001.
- 8
- A. Dogac, editor.
ACM SIGMOD Record 31(1), Special Section on
Data Management Issues in E-Commerce.
ACM, March 2002.
- 9
- J. Eder, E. Panagos, and M. Rabinovich.
Time
constraints in workflow systems.
In Proc. of the 11th
International Conference on Advanced Information Systems Engineering
(CAiSE), pages 286-300, Heidelberg, Germany, June, 1999.
- 10
- M.-C. Fauvet, M. Dumas, and B. Benatallah.
Collecting and
querying distributed traces of composite service executions.
In Proc.
of the 10th International Conference on Cooperative Information Systems
(CoopIS), Irvine, CA, USA, 2002.
- 11
- D. Georgakopoulos, H. Schuster, A. Cichocki, and
D. Baker.
Managing process and service fusion in virtual enterprises.
Information System, Special Issue on Information System Support for
Electronic Commerce, 24(6):429-456, 1999.
- 12
- D. Harel and A. Naamad.
The STATEMATE semantics of
statecharts.
ACM Transactions on Software Engineering and
Methodology, 5(4):293-333, 1996.
- 13
- IBM Optimization Solutions and Library, 2002.
http://www-3.ibm.com/software/data/bi/osl/index.html.
- 14
- IBM WSTK Toolkit.
http://alphaworks.ibm.com/tech/webservicestoolkit.
- 15
- H. Karloff.
Linear Programming.
Birkhauser, 1991.
- 16
- M. Köksalan and S. Zionts, editors.
Multiple Criteria
Decision Making in the New Millennium.
Springer-Verlag, 2001.
- 17
- J. Klingemann.
Controlled flexibility in workflow management.
In Proc. of the 12th International Conference on Advanced Information
Systems (CAiSE), pages 126-141, Stockholm, Sweden, June 2000. Springer.
- 18
- J. Klingemann, J. Wasch, and K. Aberer.
Deriving
service models in cross-organizational workflows.
In Ninth
International Workshop on Research Issues in Data Engineering: Virtual
Enterprise, RIDE-VE'99, Sydney, Australia, March 1999.
- 19
- S. Martello and P. Toth.
Knapsack Problems: Algorithms
and Computer Implementations.
John Wiley and Sons, 2001.
- 20
- M. Mecella, M. Scannapieco, A. Virgillito, R. Baldoni,
T. Catarci, and C. Batini.
Managing data quality in cooperative
information systems.
In Proc. of the 10th International Conference on
Cooperative Information Systems (CoopIS), Irvine, CA, USA, 2002.
- 21
- F. Naumann, U. Leser, and J. C. Freytag.
Quality-driven
integration of heterogenous information systems.
In Proceedings of the
International Conference on Very Large Databases (VLDB), pages 447-458,
Edinburgh, UK, 1999.
- 22
- J. O'Sullivan, D. Edmond, and A. ter Hofstede.
What's
in a Service.
Distributed and Parallel Databases,
12(2-3):117-133, September 2002.
- 23
- M. Pinedof.
Scheduling: Theory, Algorithms, and Systems (2nd
Edition).
Prentice Hall, 2001.
- 24
- Q. Z. Sheng, B. Benatallah, M. Dumas, and E. Mak.
SELF-SERV: A Platform for Rapid Composition of Web Services in a
Peer-to-Peer Environment.
In Proc. of the 28th VLDB Conference,
Hong Kong, China, August 2002.
- 25
- A. van Moorsel.
Metrics for the internet age: Quality of
experience and quality of business.
Technical Report HPL-2001-179, HP
Labs, August 2001.
Also published in 5th Performability Workshop,
September 2001, Erlangen, Germany.
- 26
- D. D. Wackerly, W. Mendenhall, and R. L. Scheaffer.
Mathematical Statistics with Application.
Duxbury Press,
1996.
Quality Driven Web Services Composition
This document was generated using the LaTeX2HTML
translator Version 2K.1beta (1.48)
Copyright © 1993, 1994, 1995, 1996, Nikos Drakos, Computer
Based Learning Unit, University of Leeds.
Copyright © 1997, 1998, 1999, Ross Moore, Mathematics Department,
Macquarie University, Sydney.
The command line arguments were:
latex2html -dir htm
-split 1 p358-zeng.tex
The translation was initiated by Quan Zheng Sheng on 2003-03-31
Quan Zheng Sheng 2003-03-31