Optimal Elimination of Inconsistency in Expert Knowledge:
Formulation of the Problem, Fast Algorithms
Timothy J. Ross
Berlin Wu
Vladik Kreinovich
Civil Engineering Dept., University of New Mexico, Albuquerque, NM 87131, USA
Mathematics Dept., National Chengchi University, Taipei, Taiwan
Dept. of Computer Science, University of Texas at El Paso, El Paso, TX 79968, USA
ross@unm.edu berlin@math.nccu.edu.tw vladik@cs.utep.edu
Abstract
Expert knowledge is sometimes inconsistent. In this paper, we describe the problem of eliminating this inconsistency as an optimization
problem, and present fast algorithms for solving this problem.
Keywords:
expert knowledge, inconsistency, optimization

1 Introduction
1.1 Expert Knowledge Is Important
In many real-life situations, we have only a few high-level experts who can make high-quality decisions. It is desirable to incorporate their knowledge into an automated system which would help make similar good quality decisions when these top experts are not available. To design such systems, we must elicit the knowledge from the experts.
1.2 Expert Knowledge Is Sometimes Inconsistent
This elicitation problem is difficult because a large part of this expert knowledge is imprecise. When the experts articulate this informal knowledge in computer-understandable formal terms, they inevitably somewhat distort their knowledge. As a result, the expert knowledge which was perfectly consistent may result in somewhat inconsistent formal statements:
- It could be that there is a contradiction between the statements presented by different experts.
- It could also be that several statements given by the same expert are inconsistent.
1.3 Experts Sometimes Cannot Easily Handle This Inconsistency, So We Need an Automated Inconsistency-Elimination Tool
Ideally, if we encounter such an inconsistency, we would like to point out this contradiction to the experts and ask them to resolve it before we input the inconsistent data into the computer-based system. However, these inconsistencies reflect not internal inconsistencies of the expert's reasoning, but rather (inevitable) differences between the expert's internal reasoning and its formalized representation. As a result, experts themselves often cannot provide us with any meaningful guidance about the best way to resolve these inconsistencies. Our experience shows that, if faced with such an inconsistency, experts often make arbitrary changes in their original estimates, changes which are kind of "random" in the sense that in a similar situation the same expert can make a completely different adjustment.
Since we cannot rely on experts to provide us with a meaningful and "consistent" way of eliminating inconsistency, we must therefore design automatic inconsistency-elimination tools and use them as a part of knowledge elicitation.
1.4 What We Are Planning to Do
In this paper, we formulate the problem of optimal inconsistency elimination, and provide fast algorithms for such elimination for the practically important situation when the expert knowledge is about the numerical value of a physical characteristic (e.g., geometric and mechanical parameters of a complex structure). We will consider two cases:
- when the inconsistency is between statements made by different experts, and
- when the inconsistency is between different statements made by the same expert.
2 Eliminating Inconsistency Between Statements Made By Different Experts
2.1 Expert Estimates: General Formulation

Let n denote the total number of experts who estimate the value of a given physical characteristic (such as width, density, etc.), and let $\mathbf{x}_i = [x_i^-, x_i^+]$ be an interval estimate provided by the i-th expert ($i = 1, 2, \ldots, n$). In other words, expert i claims that the actual (unknown) value x of the estimated physical characteristic is in the interval $[x_i^-, x_i^+]$.

2.2 If Experts Are Consistent With Each Other, the Situation Is Easy to Handle
If all the experts are correct, then the actual value of x belongs to all n intervals. The resulting set of possible values of x is thus equal to the intersection $\mathbf{x} = \mathbf{x}_1 \cap \mathbf{x}_2 \cap \ldots \cap \mathbf{x}_n$ of these n intervals. This intersection $\mathbf{x} = [x^-, x^+]$ is easy to compute: $x^- = \max(x_1^-, \ldots, x_n^-)$ and $x^+ = \min(x_1^+, \ldots, x_n^+)$.
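For illustration, here is a minimal Python sketch (ours, not part of the original paper) of this intersection; it returns None exactly when the expert estimates are inconsistent in the sense of the next subsection.

```python
def intersect_intervals(intervals):
    """Intersect a list of expert intervals [(lo, hi), ...].

    Returns (x_minus, x_plus) for the intersection, or None if the
    intersection is empty, i.e., the expert estimates are inconsistent.
    """
    x_minus = max(lo for lo, _ in intervals)   # max of the lower endpoints
    x_plus = min(hi for _, hi in intervals)    # min of the upper endpoints
    return (x_minus, x_plus) if x_minus <= x_plus else None

# The four intervals used in the examples below have an empty intersection:
print(intersect_intervals([(0.0, 1.0), (0.5, 1.5), (0.7, 1.2), (1.3, 1.6)]))  # None
```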
2.3 The Problem: Experts' Statements Can Be Inconsistent With Each Other

The problem emerges when the intersection of the n intervals is empty, i.e., when $\max(x_1^-, \ldots, x_n^-) > \min(x_1^+, \ldots, x_n^+)$. In this case, no real number x can satisfy all the experts' requirements $x \in [x_i^-, x_i^+]$, and therefore, some of these requirements are false. Thus, to design a consistent expert system, we must dismiss some of the expert estimates before computing the intersection.
2.4 We Want To Find the Optimal (Most Probable) Way of Eliminating Inconsistency
There are many ways of eliminating inconsistency. For example, we can always get a consistent statement by simply dismissing all but one interval. From the purely mathematical viewpoint, it makes sense: we get a single interval, and this single interval is, of course, consistent. However, from the viewpoint of our problem, what we are doing is, in effect, proclaiming that n − 1 experts are wrong. It can be that one of the experts is wrong, it can even happen that two experts are wrong, but it is hardly probable that all but one of them are wrong – if this is true, what kind of experts are they? So, if we simply dismiss all expert estimates but one, we get a consistent set, but this consistent set is very improbable.
Thus, we do not simply want to eliminate inconsistency, we want to select the optimal ("most probable") way of eliminating inconsistency.
So far, we have used the word “most probable” in the informal sense. To formalize this problem, we must describe the corresponding probabilities.
2.5 Possible Additional Knowledge About Reliability of Different Experts
Depending on our knowledge about the experts, we will consider two different formulations of this problem:

- We may already have some experience with this group of experts, and so we know, from experience, for each expert i, the fraction $p_i$ of cases in which this expert turned out to be correct. In other words, we know the probability $p_i$ that the i-th expert's statement is correct.
- It may also happen that we do not have any preliminary experience with this particular group of experts, so we have no reason to assume that one of them is more reliable than the others. In this case, it makes sense to assume that all experts have the same (unknown) probability p of making a correct statement. It is safe to assume that since we consider all these experts to be really experts, the probability p that an expert is correct should be larger than the probability 1 − p that the expert is wrong, i.e., p > 1 − p (which is equivalent to p > 0.5).

From the mathematical viewpoint, the second situation can be considered as a particular case of the first one, when $p_1 = \ldots = p_n = p$; the only difference is that in the first case, we know the probabilities $p_i$, while in the second case, we do not know them (we only know that they are equal to each other).
2.6 Experts Are Independent
In this paper, we will assume that the experts are independent. This assumption makes sense: e.g., if the opinions of two experts are highly correlated, then there is no sense in interviewing both of them: the opinion of the first expert describes, with high accuracy, the opinion of the second expert as well. Thus, e.g.:
- the probability that the experts i and j are both correct is equal to $p_i \cdot p_j$;
- the probability that the expert i is correct and the expert j is not correct is equal to $p_i \cdot (1 - p_j)$.

2.7 Towards the Mathematical Formulation of the Problem
Our goal is to select a subset $S \subseteq \{1, 2, \ldots, n\}$ for which the corresponding statements are consistent, i.e., for which

$$\bigcap_{i \in S} \mathbf{x}_i \neq \emptyset. \qquad (1)$$

For each such selection S, its probability p(S) is equal to:

$$p(S) = \left(\prod_{i \in S} p_i\right) \cdot \left(\prod_{i \notin S} (1 - p_i)\right). \qquad (2)$$
We must, therefore, find a set S for which $p(S) \to \max$ among all sets S which satisfy the condition (1).

2.8 In Principle, We Can Use Exhaustive Search, But This Is Intractable
There are only finitely many sets $S \subseteq \{1, \ldots, n\}$, so in principle, we could simply test them all and find the one for which the probability p(S) is the largest possible. Unfortunately, this "exhaustive search" strategy is very time-consuming: indeed, a set of n elements has $2^n$ subsets, so for a realistic case of n = 30 experts, we would need to test about a billion different subsets. For an even larger number of experts, this search becomes completely intractable. We therefore need a more sophisticated algorithm for selecting the most probable combination of intervals. This algorithm will be presented now.
2.9 Fast Algorithm for Solving This Problem (Case When We Know $p_i$)
The proposed algorithm consists of the following stages:

1. First, we sort all 2n endpoints $x_1^-, \ldots, x_n^-, x_1^+, \ldots, x_n^+$ into an ordered sequence of real numbers
$$x_{(1)} \leq \ldots \leq x_{(k)} \leq \ldots \leq x_{(2n)}.$$
This sorting requires $O(n \cdot \log(n))$ elementary computation steps (see, e.g., [3]).

2. Then, for each $k = 1, \ldots, 2n - 1$, we consider the interval $[x_{(k)}, x_{(k+1)}]$. For this interval, we check, for every $i = 1, \ldots, n$, whether this interval belongs to $\mathbf{x}_i$ or not. Thus, we form the set $S^{(k)}$ of all indices i for which $[x_{(k)}, x_{(k+1)}] \subseteq \mathbf{x}_i$, and then compute the corresponding value $p(S^{(k)})$ by using the formula (2).
Each checking requires 2n comparisons, and computing p(S) requires n multiplications. So, for each k, we need 3n steps, and the total number of computational steps required to perform this procedure for all 2n − 1 values of k is equal to $O(n) \cdot O(n) = O(n^2)$.

3. As a result of the second stage, we get 2n − 1 different values of $p(S^{(k)})$. To find the largest value (or values), we compare them with each other: we start with the value corresponding to the leftmost interval $[x_{(1)}, x_{(2)}]$, mark it as the largest-so-far, and then go over the other values, replacing the largest-so-far with the next one if the next value of $p(S^{(k)})$ is indeed larger. For each of the 2n − 1 values, we perform a single comparison, so this stage requires $O(n)$ elementary computational steps.

The total number of steps is therefore equal to $O(n \cdot \log(n)) + O(n^2) + O(n) = O(n^2)$, i.e., we get a quadratic-time algorithm for finding the optimal (most probable) consistent combination of expert statements.
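As an illustration, here is a minimal Python sketch of the three stages above (our code, not part of the original paper; the function name and data layout are our choices). The intervals and the probabilities $p_i$ are passed in as lists.

```python
from math import prod

def most_probable_consistent_subset(intervals, probs):
    """Select the most probable consistent subset of expert intervals.

    intervals: list of (lo, hi) pairs, one per expert.
    probs:     list of probabilities p_i that expert i is correct.
    Maximizes p(S) = prod_{i in S} p_i * prod_{i not in S} (1 - p_i)
    over all S whose intervals share a common elementary interval.
    """
    n = len(intervals)
    # Stage 1: sort all 2n endpoints.
    points = sorted(x for pair in intervals for x in pair)
    best_set, best_p = set(), -1.0
    # Stage 2: for each elementary interval [x_(k), x_(k+1)], collect the experts
    # whose intervals contain it and compute p(S) by formula (2).
    for k in range(2 * n - 1):
        lo, hi = points[k], points[k + 1]
        S = {i for i, (a, b) in enumerate(intervals) if a <= lo and hi <= b}
        p = prod(probs[i] if i in S else 1 - probs[i] for i in range(n))
        # Stage 3 (folded in): keep the largest value seen so far.
        if p > best_p:
            best_set, best_p = S, p
    return best_set, best_p

# Pedagogical example from the next subsection: experts 1-3 are kept, expert 4 is dismissed.
S, p = most_probable_consistent_subset(
    [(0.0, 1.0), (0.5, 1.5), (0.7, 1.2), (1.3, 1.6)], [0.7, 0.7, 0.7, 0.8])
print(sorted(i + 1 for i in S), round(p, 4))   # [1, 2, 3] 0.0686
```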
2.10 Pedagogical Example
Let us illustrate the above algorithm on the following simple example. Let
$$\mathbf{x}_1 = [0.0, 1.0],\quad \mathbf{x}_2 = [0.5, 1.5],\quad \mathbf{x}_3 = [0.7, 1.2],\quad \mathbf{x}_4 = [1.3, 1.6].$$
In this case, $\max(x_1^-, \ldots, x_4^-) = 1.3 > \min(x_1^+, \ldots, x_4^+) = 1.0$, and thus, the intersection of all four intervals is empty. Let us assume that $p_1 = p_2 = p_3 = 0.7$ and $p_4 = 0.8$. Then, the algorithm results in the following selection:

1. First, we sort all 8 endpoint values into a sequence
$$x_{(1)} = 0.0,\ x_{(2)} = 0.5,\ x_{(3)} = 0.7,\ x_{(4)} = 1.0,\ x_{(5)} = 1.2,\ x_{(6)} = 1.3,\ x_{(7)} = 1.5,\ x_{(8)} = 1.6.$$
2. Second, for each of the 7 intervals $[x_{(k)}, x_{(k+1)}]$, we compute the corresponding value $p(S^{(k)})$:
- For k = 1, we have $[x_{(1)}, x_{(2)}] = [0.0, 0.5]$, so $S^{(1)} = \{1\}$, and $p(S^{(1)}) = 0.7 \cdot 0.3 \cdot 0.3 \cdot 0.2 = 0.0126$.
- For k = 2, we have $[x_{(2)}, x_{(3)}] = [0.5, 0.7]$, so $S^{(2)} = \{1, 2\}$, and $p(S^{(2)}) = 0.7 \cdot 0.7 \cdot 0.3 \cdot 0.2 = 0.0294$.
- For k = 3, we have $[x_{(3)}, x_{(4)}] = [0.7, 1.0]$, so $S^{(3)} = \{1, 2, 3\}$, and $p(S^{(3)}) = 0.7 \cdot 0.7 \cdot 0.7 \cdot 0.2 = 0.0686$.
- For k = 4, we have $[x_{(4)}, x_{(5)}] = [1.0, 1.2]$, so $S^{(4)} = \{2, 3\}$, and $p(S^{(4)}) = 0.3 \cdot 0.7 \cdot 0.7 \cdot 0.2 = 0.0294$.
- For k = 5, we have $[x_{(5)}, x_{(6)}] = [1.2, 1.3]$, so $S^{(5)} = \{2\}$, and $p(S^{(5)}) = 0.3 \cdot 0.7 \cdot 0.3 \cdot 0.2 = 0.0126$.
- For k = 6, we have $[x_{(6)}, x_{(7)}] = [1.3, 1.5]$, so $S^{(6)} = \{2, 4\}$, and $p(S^{(6)}) = 0.3 \cdot 0.7 \cdot 0.3 \cdot 0.8 = 0.0504$.
- For k = 7, we have $[x_{(7)}, x_{(8)}] = [1.5, 1.6]$, so $S^{(7)} = \{4\}$, and $p(S^{(7)}) = 0.3 \cdot 0.3 \cdot 0.3 \cdot 0.8 = 0.0216$.
3. Among these 7 values, the largest corresponds to $S^{(3)} = \{1, 2, 3\}$, so we dismiss the estimate of the 4th expert.
2.11 How To Modify the Above Algorithm If We Do Not Know the Probabilities $p_i$
If we do not know the probabilities $p_i$ (i.e., if we only know that $p_i = p$ for some (unknown) value p > 0.5), then we cannot implement the second stage of the above algorithm – because we cannot compute the value p(S).

However, there is a way around it: the only place where we use the values p(S) is the third stage of this algorithm, and on this stage, we do not need the actual values of p(S), we only need to know which values are larger and which are smaller. For the case when $p_i = p$ for all i, the probability (2) turns into
$$p(S) = p^{|S|} \cdot (1 - p)^{n - |S|},$$
where |S| denotes the number of elements in the set S. This expression can be rewritten as
$$p(S) = \left(\frac{p}{1 - p}\right)^{|S|} \cdot (1 - p)^n.$$
Since p > 1 − p, we can conclude that the more elements S has, the larger the corresponding value of p(S). Thus, instead of computing p(S), we can simply compute |S| and then select a consistent set S which contains the largest possible number of intervals. In other words, we dismiss the smallest possible number of expert estimates: this makes perfect sense.
Thus, we arrive at the following algorithm:
2.12 Fast Algorithm for Solving This Problem (Case When We Do Not Know the Probabilities $p_i$)
For this case, the proposed algorithm consists of the following stages:

1. First, we sort all 2n endpoints $x_1^-, \ldots, x_n^-, x_1^+, \ldots, x_n^+$ into an ordered sequence of real numbers
$$x_{(1)} \leq \ldots \leq x_{(k)} \leq \ldots \leq x_{(2n)}.$$

2. Then, for each $k = 1, \ldots, 2n - 1$, we consider the interval $[x_{(k)}, x_{(k+1)}]$. For this interval, we check, for every $i = 1, \ldots, n$, whether this interval belongs to $\mathbf{x}_i$ or not. Thus, we form the set $S^{(k)}$ of all indices i for which $[x_{(k)}, x_{(k+1)}] \subseteq \mathbf{x}_i$, and then compute the total number $|S^{(k)}|$ of elements in this set.

3. As a result of the second stage, we get 2n − 1 different values of $|S^{(k)}|$. To find the largest value (or values), we compare them with each other: we start with the value corresponding to the leftmost interval $[x_{(1)}, x_{(2)}]$, mark it as the largest-so-far, and then go over the other values, replacing the largest-so-far with the next one if the next value of $|S^{(k)}|$ is indeed larger.

This version also requires $O(n^2)$ elementary computation steps.
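In code, the only change against the earlier sketch is the quantity being maximized; a minimal sketch under the same assumptions (our illustration, not the authors' code):

```python
def largest_consistent_subset(intervals):
    """Select a consistent subset with as many expert intervals as possible
    (the case when the probabilities p_i are unknown but equal)."""
    n = len(intervals)
    points = sorted(x for pair in intervals for x in pair)  # all 2n endpoints
    best = set()
    for k in range(2 * n - 1):
        lo, hi = points[k], points[k + 1]
        S = {i for i, (a, b) in enumerate(intervals) if a <= lo and hi <= b}
        if len(S) > len(best):      # maximize |S| instead of p(S)
            best = S
    return best

# Same intervals as before; again the 4th interval is dismissed.
print(largest_consistent_subset([(0.0, 1.0), (0.5, 1.5), (0.7, 1.2), (1.3, 1.6)]))  # {0, 1, 2}
```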
2.13 Pedagogical Example
If, in the above example
$$\mathbf{x}_1 = [0.0, 1.0],\quad \mathbf{x}_2 = [0.5, 1.5],\quad \mathbf{x}_3 = [0.7, 1.2],\quad \mathbf{x}_4 = [1.3, 1.6],$$
we do not know the probabilities, then, on the second stage, we compute the values
$$|S^{(1)}| = 1,\ |S^{(2)}| = 2,\ |S^{(3)}| = 3,\ |S^{(4)}| = 2,\ |S^{(5)}| = 1,\ |S^{(6)}| = 2,\ |S^{(7)}| = 1,$$
and select the set $S^{(3)}$ for which this number is the largest.
3 Eliminating Inconsistency Between Different Statements Made By the Same Expert
3.1 Nested Intervals (Fuzzy Set) Instead of a Single Interval
In the above description, we assumed that for each physical characteristic x, each expert provides a single interval $[x^-, x^+]$ of possible values of x. In reality, in addition to the interval about which the expert is absolutely sure that x is there, the expert may provide narrower intervals corresponding to smaller degrees of confidence. For example, an expert may be absolutely sure that $x \in [0.0, 1.0]$, almost absolutely sure that $x \in [0.2, 0.8]$, somewhat sure that $x \in [0.4, 0.7]$, etc.

As a result, we get a "nested" ("consonant") collection of intervals $\mathbf{x}_1 \supseteq \mathbf{x}_2 \supseteq \ldots \supseteq \mathbf{x}_n$ which correspond to different degrees of uncertainty. Such a nested collection of intervals forms a fuzzy set [1, 5, 6, 7], in which the different intervals can be viewed as $\alpha$-cuts.

3.2 Some Intervals May Be Erroneous, So We Must Choose a Consonant Sub-Collection
In the ideal situation, the intervals are nested. In real life, some intervals provided by an expert may not be ordered by inclusion. In this case, we must select a consonant subset.
Let us formulate this problem in more precise terms.
Definition 1. We say that two intervals $\mathbf{x}$ and $\mathbf{y}$ are compatible if either $\mathbf{x} \subseteq \mathbf{y}$, or $\mathbf{y} \subseteq \mathbf{x}$.

Definition 2. We say that a collection of intervals is consonant if every two intervals from this collection are compatible.

In these terms, our problem can be formulated as follows:
- we have a collection of intervals $\mathbf{x}_1, \ldots, \mathbf{x}_n$ which is not consonant;
- we must select a consonant sub-collection, i.e., a set $S \subseteq \{1, 2, \ldots, n\}$ for which the intervals $\mathbf{x}_i$, $i \in S$, are consonant.

We would like to choose the "optimal" ("most probable") sub-collection.
3.3 We Want To Find the Optimal (Most Probable) Way of Eliminating Inconsistency
We know that some intervals in the original collection can be erroneous. Since we have no information about the corresponding probabilities, we can assume that each interval has the same (unknown) probability of being erroneous – equivalently, the same probability p > 0.5 of being correct. The probability corresponding to each sub-collection S can therefore be described by the formula $p(S) = p^{|S|} \cdot (1 - p)^{n - |S|}$. We know, from the first problem, that this value is the largest if and only if S contains the largest number of elements. So, the problem of selecting the optimal consonant sub-collection can be reformulated as follows:
- we have a collection of intervals $\mathbf{x}_1, \ldots, \mathbf{x}_n$ which is not consonant;
- we must select a consonant sub-collection S which contains the largest possible number of intervals.

(In other words, we want to dismiss the smallest possible number of intervals – which makes perfect sense.)
3.4 In Principle, We Can Use Exhaustive Search, But This Is Intractable
Similarly to the first problem, we can find the solution to this one by exhaustive search, but for large n, this is intractable, so we need a faster algorithm. This algorithm is described below.

3.5 Fast Algorithm for Solving This Problem
We start with a collection of n intervals $\mathbf{x}_1, \ldots, \mathbf{x}_n$.

- First, for every two intervals $\mathbf{x}_i$ and $\mathbf{x}_j$ from the original collection, we check whether $\mathbf{x}_i \subseteq \mathbf{x}_j$, and put the result ("true" or "false") into the corresponding cell $r_{ij}$ of a special $n \times n$ table. For n intervals, this stage requires $2n^2 = O(n^2)$ comparisons.
- For each interval $\mathbf{x}_i$, we check whether this interval is minimal, i.e., whether $r_{ji} =$ "false" for all j. Then, we mark all the intervals which satisfy this property as intervals of Level 1.
- Then, for every interval which has not yet been marked, we check whether this interval is minimal among the un-marked ones, i.e., whether $r_{ji} =$ "false" for all un-marked intervals $\mathbf{x}_j$. For each interval which satisfies this property, we find all its marked sub-intervals (i.e., all marked intervals $\mathbf{x}_j$ for which $r_{ji} =$ "true"); among these sub-intervals, we find the ones with the largest level $\ell$. Then, we mark $\mathbf{x}_i$ as an interval of level $\ell + 1$, and list nearby all its sub-intervals of level $\ell$. We then repeat this procedure until all intervals are marked.
- At the end, we select the desired consonant sub-collection $\mathbf{y}_1, \ldots, \mathbf{y}_k$:
  - as $\mathbf{y}_1$, we select the interval with the highest level L;
  - after $\mathbf{y}_i$ has been selected (on level L − i + 1), and i < L, we select, as $\mathbf{y}_{i+1}$, any of $\mathbf{y}_i$'s sub-intervals of level L − i.

On each stage, at least one interval is marked, so after at most n stages, the procedure will stop. On each stage, for each of the n intervals, we check n elements of the table; thus, each stage requires $O(n^2)$ steps, and n stages require $O(n \cdot n^2) = O(n^3)$ steps. The algorithm as a whole requires $O(n^2) + O(n^3) = O(n^3)$ steps. Thus, this algorithm requires cubic time.
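The level-marking procedure can be sketched in Python as follows (our illustration, assuming intervals are given as (lo, hi) pairs); in effect, the levels identify a longest chain of nested intervals.

```python
def longest_consonant_subcollection(intervals):
    """Select a consonant (nested) sub-collection with as many intervals as possible.

    level[i] is the length of the longest chain of nested intervals ending at
    interval i; parent[i] remembers one sub-interval realizing that chain.
    """
    n = len(intervals)
    # contains[i][j] plays the role of the table entry r_ji: is x_j inside x_i?
    contains = [[i != j and intervals[i][0] <= intervals[j][0]
                 and intervals[j][1] <= intervals[i][1] for j in range(n)]
                for i in range(n)]
    order = sorted(range(n), key=lambda i: intervals[i][1] - intervals[i][0])
    level, parent = [1] * n, [None] * n
    processed = []
    for i in order:                     # narrowest intervals get marked first
        for j in processed:
            if contains[i][j] and level[j] + 1 > level[i]:
                level[i], parent[i] = level[j] + 1, j
        processed.append(i)
    i = max(range(n), key=lambda i: level[i])   # start from the highest level
    chain = []
    while i is not None:
        chain.append(intervals[i])
        i = parent[i]
    return chain

# Pedagogical example from the next subsection: x_4 = [0.9, 1.0] is dismissed.
print(longest_consonant_subcollection([(0.0, 1.0), (0.4, 0.7), (0.2, 0.8), (0.9, 1.0)]))
# [(0.0, 1.0), (0.2, 0.8), (0.4, 0.7)]
```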
3.6 Pedagogical Example

Let
$$\mathbf{x}_1 = [0.0, 1.0],\quad \mathbf{x}_2 = [0.4, 0.7],\quad \mathbf{x}_3 = [0.2, 0.8],\quad \mathbf{x}_4 = [0.9, 1.0].$$
This collection is not consonant, because, e.g., neither $\mathbf{x}_3 \subseteq \mathbf{x}_4$, nor $\mathbf{x}_4 \subseteq \mathbf{x}_3$.

- According to the above algorithm, first, we pick the intervals which do not contain any other intervals. In our original collection, there are two such intervals: $\mathbf{x}_2 = [0.4, 0.7]$ and $\mathbf{x}_4 = [0.9, 1.0]$; they are marked as intervals of Level 1.
- On the next stage, we compare the two remaining (un-marked) intervals $\mathbf{x}_1$ and $\mathbf{x}_3$. The only minimal one among them is $\mathbf{x}_3$. Its only marked sub-interval is $\mathbf{x}_2$ of Level 1; thus, we assign Level 2 to $\mathbf{x}_3$, and list $\mathbf{x}_2$ as its sub-interval of Level 1.
- There is only one remaining un-marked interval $\mathbf{x}_1$, so this un-marked interval is therefore minimal. It has three marked sub-intervals: $\mathbf{x}_3$ of Level 2, and $\mathbf{x}_2$ and $\mathbf{x}_4$ of Level 1. Thus, the largest level of its sub-intervals is 2, so we mark the interval $\mathbf{x}_1$ as Level 3, and list $\mathbf{x}_3$ as its sub-interval of Level 2.

All intervals are marked now, so we can start forming the desired consonant sub-collection:
- First, we pick the interval $\mathbf{x}_1 = [0.0, 1.0]$ with the highest level as the first element $\mathbf{y}_1$ of the desired collection. Here, L = 3.
- As $\mathbf{y}_2$, we pick $\mathbf{y}_1$'s sub-interval $\mathbf{x}_3 = [0.2, 0.8]$ of level L − 1 = 2.
- As $\mathbf{y}_3$, we pick $\mathbf{y}_2$'s sub-interval $\mathbf{x}_2 = [0.4, 0.7]$ of level L − 2 = 1.

This means that we dismiss the interval $\mathbf{x}_4$.
4 An Application of the Above Algorithm to Error Estimation
4.1 Formulation of the Problem
In this section, we show that although the above method was originally designed to deal with inconsistency of expert knowledge, it can also be used in error estimation, when expert knowledge is only indirectly present. Specifically, we have a simulation model of a complex system. The results of the simulation depend on the values of certain parameters of the system, parameters whose values are not exactly known. For example, for a mechanical structure, these parameters include geometric parameters such as widths, heights, etc., and mechanical parameters such as density, Young's modulus, etc. For each of these parameters $p_1, \ldots, p_n$, we know the "average" (nominal) value $p_i^{(0)}$, and we know the interval $[p_i^-, p_i^+]$ of possible values of this parameter.

We are interested in a certain characteristic x of the system. For example, for a mechanical structure, we may be interested in this structure's stability – measured, e.g., by the maximal load that this structure can withstand. We want to make sure that this characteristic indeed exceeds the desired threshold $x_0$ (e.g., that the structure can withstand the given load $x_0$).

For each combination of values $p_1 \in [p_1^-, p_1^+]$, . . . , $p_n \in [p_n^-, p_n^+]$, the simulation model enables us to compute the value $x = f(p_1, \ldots, p_n)$ of the desired characteristic. In particular, for the nominal values $p_i = p_i^{(0)}$ of the parameters, we get the value $x^{(0)} = f(p_1^{(0)}, \ldots, p_n^{(0)})$. Since we do not know what exactly the actual values $p_i$ are, it is not enough to check that this nominal value is OK (i.e., that $x^{(0)} \geq x_0$). We must also test this condition under different combinations of the parameters $p_i$. The number of possible combinations grows exponentially with n. Thus, for a realistically large number of parameters, exhaustive-search-type checking of all possible combinations becomes an intractable problem. We need a faster algorithm.
4.2 Conservative Estimate (Linearization): Idea
A traditional engineering approach to this problem is to linearize it. The main idea of this approach is as follows. Usually, in most engineering problems, all the dependencies are smooth. In particular, it is safe to assume that the dependence $f(p_1, \ldots, p_n)$ of the desired characteristic x on the parameters $p_i$ is also smooth. To utilize this smoothness, we can take into consideration the fact that usually, the intervals $[p_i^-, p_i^+]$ are reasonably narrow, and thus, we can describe $p_i$ as $p_i = p_i^{(0)} + \Delta p_i$, where $\Delta p_i \in [p_i^- - p_i^{(0)}, p_i^+ - p_i^{(0)}]$ is reasonably small. Thus, we can expand the dependence
$$f(p_1, \ldots, p_n) = f(p_1^{(0)} + \Delta p_1, \ldots, p_n^{(0)} + \Delta p_n)$$
in a Taylor series in $\Delta p_i$ and ignore quadratic and higher order terms in this expansion. As a result, we get the following approximate expression for $f(p_1, \ldots, p_n)$:
$$f(p_1, \ldots, p_n) = x^{(0)} + c_1 \cdot \Delta p_1 + \ldots + c_n \cdot \Delta p_n, \qquad (3)$$
where we denoted
$$c_i = \left.\frac{\partial f}{\partial p_i}\right|_{p_i = p_i^{(0)}}. \qquad (4)$$
If we know the derivatives $c_i$, then we can find the smallest and the largest values of the expression (3): namely, the sum attains the largest possible value if each term attains the largest possible value, and depending on the sign of $c_i$, this largest possible value is attained when $\Delta p_i$ reaches either the lower endpoint or the upper endpoint of its interval of possible values.

Often, possible bounds on $p_i$ are described in terms of a largest possible distance $\Delta_i$ between the actual value $p_i$ and the nominal value $p_i^{(0)}$. In this case, $p_i^- = p_i^{(0)} - \Delta_i$ and $p_i^+ = p_i^{(0)} + \Delta_i$. Then, the largest possible value of $c_i \cdot \Delta p_i$ is equal to $|c_i| \cdot \Delta_i$, the smallest possible value of $c_i \cdot \Delta p_i$ is equal to $-|c_i| \cdot \Delta_i$, and therefore, the linear combination (3) can take arbitrary values from the interval $[x^-, x^+] = [x^{(0)} - \Delta, x^{(0)} + \Delta]$, where we denoted
$$\Delta \stackrel{\rm def}{=} |c_1| \cdot \Delta_1 + \ldots + |c_n| \cdot \Delta_n.$$
In general, if the function f were given by an analytical expression, then we would be able to explicitly differentiate it and get the desired coefficients $c_i$. In reality, the function f is not known analytically; it is only given as a complex program, so we need to use numerical differentiation to find these values. For example, if we take $p_2 = p_2^{(0)}$, . . . , $p_n = p_n^{(0)}$ (i.e., $\Delta p_2 = \ldots = \Delta p_n = 0$), then (3) turns into
$$f(p_1, p_2^{(0)}, \ldots, p_n^{(0)}) = x^{(0)} + c_1 \cdot \Delta p_1. \qquad (5)$$
This expression is a linear function of $\Delta p_1$. It is known that a linear function attains its largest and smallest values on each interval at the endpoints of this interval. Thus, the largest and the smallest values of the expression (5) are attained when $p_1$ is equal to one of the endpoints of the corresponding interval (and thus, when $p_1$ is equal either to $p_1^-$ or to $p_1^+$). Hence, to find the smallest and the largest possible values of (5), we can compute $f(p_1^-, p_2^{(0)}, \ldots, p_n^{(0)})$ and $f(p_1^+, p_2^{(0)}, \ldots, p_n^{(0)})$. We will denote the smallest and the largest of these values by $x_1^-$ and $x_1^+$. In these notations, the combination $x^{(0)} + c_1 \cdot \Delta p_1$ takes arbitrary values from the interval $[x_1^-, x_1^+]$. Hence, the product $c_1 \cdot \Delta p_1$ can take arbitrary values from the interval $[x_1^- - x^{(0)}, x_1^+ - x^{(0)}]$.

Similarly, for every i, the product $c_i \cdot \Delta p_i$ can take arbitrary values from the interval $[x_i^- - x^{(0)}, x_i^+ - x^{(0)}]$. Therefore, the linear combination (3) can take arbitrary values from the interval $[x^-, x^+]$, where
$$x^- = x^{(0)} + (x_1^- - x^{(0)}) + \ldots + (x_n^- - x^{(0)}), \qquad (6)$$
$$x^+ = x^{(0)} + (x_1^+ - x^{(0)}) + \ldots + (x_n^+ - x^{(0)}). \qquad (7)$$
4.3 Conservative Estimate (Linearization): Algorithm
We know the algorithm $f(p_1, \ldots, p_n)$ which transforms the values of the parameters $p_i$ into an estimate for the desired characteristic x. For every parameter $p_i$, we know its nominal value $p_i^{(0)}$, and we know the interval $[p_i^-, p_i^+]$ of possible values. We want to estimate the range of x when $p_i \in [p_i^-, p_i^+]$.

- First, we compute the nominal value $x^{(0)}$ of x by applying f to the nominal values of the parameters: $x^{(0)} \stackrel{\rm def}{=} f(p_1^{(0)}, \ldots, p_n^{(0)})$.
- Then, for every i, we change $p_i$ to $p_i^-$ and to $p_i^+$ while keeping all other parameters at their nominal values, and apply the simulation function f. As a result, we get two values of x:
$$x_i^{(-)} \stackrel{\rm def}{=} f(p_1^{(0)}, \ldots, p_{i-1}^{(0)}, p_i^-, p_{i+1}^{(0)}, \ldots, p_n^{(0)}), \qquad (8)$$
$$x_i^{(+)} \stackrel{\rm def}{=} f(p_1^{(0)}, \ldots, p_{i-1}^{(0)}, p_i^+, p_{i+1}^{(0)}, \ldots, p_n^{(0)}). \qquad (9)$$
We denote the smaller of these two values by $x_i^-$, and the larger by $x_i^+$.
- Finally, we compute the endpoints $x^-$ and $x^+$ of the interval of possible values of x by using the formulas (6) and (7).
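A minimal Python sketch of this procedure (ours, not from the paper; `f` stands for the black-box simulation model and takes the parameter vector as a list):

```python
def conservative_range(f, nominal, lower, upper):
    """Conservative (linearized) range of x = f(p_1, ..., p_n).

    nominal, lower, upper: lists with p_i^(0), p_i^-, p_i^+ for each parameter.
    Implements formulas (6)-(7): each parameter is varied one at a time,
    and the one-at-a-time deviations from x^(0) are added up.
    """
    x0 = f(nominal)                                   # x^(0)
    x_lo, x_hi = x0, x0
    for i in range(len(nominal)):
        p_minus = nominal[:i] + [lower[i]] + nominal[i + 1:]
        p_plus = nominal[:i] + [upper[i]] + nominal[i + 1:]
        xi = sorted((f(p_minus), f(p_plus)))          # x_i^- <= x_i^+
        x_lo += xi[0] - x0                            # formula (6)
        x_hi += xi[1] - x0                            # formula (7)
    return x_lo, x_hi

# Toy check with a linear "model": for f = 2*p1 - p2 the estimate is the exact range.
print(conservative_range(lambda p: 2 * p[0] - p[1], [1.0, 1.0], [0.9, 0.8], [1.1, 1.2]))
# (0.6, 1.4)
```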
4.4 Conservative Estimate (Linearization): Problem
The main problem with the above conservative estimate is that it is too conservative: the worst-case upper bound (that we computed) corresponds to the case when all the parameters attain their extreme values. Theoretically, such an unfortunate worst-case combination of parameters is possible, but in reality, experts often consider such a worst-case scenario to be unrealistic.
Since the worst case is not realistic, we can probably get a narrower interval of possible values of x. In the following text, we describe two ideas for such an estimate. The second of these ideas will use the above consonant-selection algorithm.

4.5 Randomized Approach: Idea
Experts claim that the extreme combination of values $p_i$ is "improbable". To formalize this claim, let us describe a reasonable probability distribution on the set of all possible combinations $p = (p_1, \ldots, p_n)$. The only information that we have about each value $p_i$ is that $p_i$ can take any value from the interval $[p_i^-, p_i^+]$. Thus, the possible values of the vector $p = (p_1, \ldots, p_n)$ form an n-dimensional box
$$[p_1^-, p_1^+] \times \ldots \times [p_n^-, p_n^+].$$
Thus, the desired probability distribution must be located in this box. There are many possible probability distributions located in this box. According to mathematical statistics, it is reasonable to select a distribution for which the entropy is the largest possible (see, e.g., [4]). Such a distribution corresponds to the case when the variables $p_i$ are independent, and each of them is uniformly distributed on the corresponding interval $[p_i^-, p_i^+]$. What is the corresponding distribution for the value x (as described by the formula (3))?

4.6 Randomized Approach: Derivation of the Formulas
For large n, we can apply the central limit theorem (see, e.g., [10]) and conclude that the distribution of x is close to Gaussian. It is known that a Gaussian distribution is uniquely determined by its mean value $\bar{x}$ and standard deviation $\sigma[x]$, so let us determine these two values.

Since the formula (3) describes x as a linear combination of the random variables $p_i$, the mean value $\bar{x}$ is equal to the linear combination of the corresponding means $\bar{p}_i$:
$$\bar{x} = x^{(0)} + c_1 \cdot (\bar{p}_1 - p_1^{(0)}) + \ldots + c_n \cdot (\bar{p}_n - p_n^{(0)}). \qquad (10)$$
Since each variable $p_i$ is uniformly distributed on the interval $[p_i^-, p_i^+]$, its mean value is equal to the midpoint of this interval: $\bar{p}_i = (p_i^- + p_i^+)/2$. Since we do not know the values $c_i$, we must compute the resulting expression indirectly. Namely, from the formulas (8) and (9), we conclude that $x_i^{(-)} = x^{(0)} + c_i \cdot (p_i^- - p_i^{(0)})$ and $x_i^{(+)} = x^{(0)} + c_i \cdot (p_i^+ - p_i^{(0)})$. Taking the arithmetic average of the left-hand sides and right-hand sides of these two equalities, we conclude that $\bar{x}_i = x^{(0)} + c_i \cdot (\bar{p}_i - p_i^{(0)})$, where we denoted
$$\bar{x}_i \stackrel{\rm def}{=} \frac{x_i^{(-)} + x_i^{(+)}}{2}. \qquad (11)$$
Therefore, $c_i \cdot (\bar{p}_i - p_i^{(0)}) = \bar{x}_i - x^{(0)}$, and from (10), we conclude that
$$\bar{x} = x^{(0)} + (\bar{x}_1 - x^{(0)}) + \ldots + (\bar{x}_n - x^{(0)}). \qquad (12)$$
To compute the standard deviation $\sigma[x]$, we can also use the fact that the random variables $p_i$ are independent and uniformly distributed. For a random variable $p_i$ which is uniformly distributed on an interval $[p_i^-, p_i^+]$, the standard deviation is determined by the known formula $\sigma^2[p_i] = \frac{1}{12} \cdot |p_i^+ - p_i^-|^2$. Therefore,
$$\sigma^2[x] = \sum_{i=1}^{n} c_i^2 \cdot \sigma^2[p_i] = \frac{1}{12} \cdot \sum_{i=1}^{n} \left(c_i \cdot (p_i^+ - p_i^-)\right)^2. \qquad (13)$$
From the formulas (8) and (9), we can conclude that $c_i \cdot (p_i^+ - p_i^-) = x_i^{(+)} - x_i^{(-)}$; therefore,
$$\sigma[x] = \frac{1}{\sqrt{12}} \cdot \sqrt{\sum_{i=1}^{n} (x_i^+ - x_i^-)^2}. \qquad (14)$$
It is known that, with a high probability, the actual value of a Gaussian random variable x cannot differ from its average by more than k standard deviations, where k = 2 corresponds to the probability 0.95, k = 3 corresponds to the probability 0.999, etc. Thus, we can conclude that, with a high probability, $x \in [x^-, x^+]$, where
$$x^- = \bar{x} - k \cdot \sigma[x], \qquad x^+ = \bar{x} + k \cdot \sigma[x]. \qquad (15)$$
Thus, we arrive at the following algorithm:
4.7 Randomized Approach: Algorithm
We know the algorithm $f(p_1, \ldots, p_n)$ which transforms the values of the parameters $p_i$ into an estimate for the desired characteristic x. For every parameter $p_i$, we know its nominal value $p_i^{(0)}$, and we know the interval $[p_i^-, p_i^+]$ of possible values. We want to estimate the range of x when $p_i \in [p_i^-, p_i^+]$.

- First, we compute the nominal value $x^{(0)}$ of x by applying f to the nominal values of the parameters: $x^{(0)} \stackrel{\rm def}{=} f(p_1^{(0)}, \ldots, p_n^{(0)})$.
- Then, for every i, we change $p_i$ to $p_i^-$ and to $p_i^+$ while keeping all other parameters at their nominal values, and apply the simulation function f. In other words, we compute the values $x_i^{(-)}$ and $x_i^{(+)}$ by using the formulas (8) and (9); we denote the smaller of these two values by $x_i^-$, and the larger by $x_i^+$.
- For each i, we compute the mean value $\bar{x}_i$ by using the formula (11), and then compute $\bar{x}$ and $\sigma[x]$ by using the formulas (12) and (14).
- Finally, we compute the endpoints $x^-$ and $x^+$ of the interval of possible values of x by using the formulas (15).
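A minimal sketch of this randomized estimate (our illustration, under the same black-box assumption for `f`); it reuses the one-at-a-time simulation runs of the conservative sketch above:

```python
from math import sqrt

def randomized_range(f, nominal, lower, upper, k=2):
    """Randomized (maximum-entropy) estimate of the range of x = f(p_1, ..., p_n).

    Implements formulas (11), (12), (14), (15): the mean and standard deviation
    of x are estimated from one-at-a-time runs, and the range is mean +/- k*sigma.
    """
    x0 = f(nominal)
    mean, var = x0, 0.0
    for i in range(len(nominal)):
        x_minus = f(nominal[:i] + [lower[i]] + nominal[i + 1:])   # formula (8)
        x_plus = f(nominal[:i] + [upper[i]] + nominal[i + 1:])    # formula (9)
        mean += (x_minus + x_plus) / 2 - x0                       # formulas (11), (12)
        var += (x_plus - x_minus) ** 2 / 12                       # formula (14), squared
    sigma = sqrt(var)
    return mean - k * sigma, mean + k * sigma                     # formula (15)

# Same toy model as above; the randomized bounds are narrower than (0.6, 1.4).
print(randomized_range(lambda p: 2 * p[0] - p[1], [1.0, 1.0], [0.9, 0.8], [1.1, 1.2]))
```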
4.8 Single Fault: Idea
Another reasonable idea is as follows. In reliability analysis and in fault detection, it is often assumed that in a well-functioning system:
- it is possible that one component fails, but
- it is highly improbable that two or more components fail at the same time.

Similarly, it is reasonable to assume that:
- while a single parameter $p_i$ can drastically differ from its nominal value $p_i^{(0)}$,
- it is highly improbable that two or more parameters will take values which are drastically different from their nominal values.

Thus, in estimating the possible values of x, we can safely assume that at most one parameter $p_i$ differs from its nominal value, while the values of all other parameters are practically equal to their nominal values.

Since we have no a priori information on which of the n parameters is more probable to differ from its nominal value, it is natural to assume that all n parameters have the same probability of deviation, which is thus equal to 1/n.

What can we then conclude about the possible values of x?

4.9 Single Fault: Formulas
When the i-th parameter is the only one which differs from its nominal value, then, due to formula (3), the possible values of x range from $x_i^{(-)}$ (which corresponds to $p_i = p_i^-$) to $x_i^{(+)}$ (which corresponds to $p_i = p_i^+$). In other words, in this case, the set of possible values of x is the interval $[x_i^-, x_i^+]$.

Thus, we can conclude that with probability 1/n, the set of possible values coincides with $[x_1^-, x_1^+]$; with the same probability 1/n, this set coincides with $[x_2^-, x_2^+]$, etc. This description is in line with the Dempster-Shafer formalism of possibility distributions (see, e.g., [11]), which can be used to describe, for each interval I, the belief Bel(I) and the plausibility Pl(I) that the actual value x is in I:
$$\mathrm{Bel}(I) = \sum_{i:\, [x_i^-, x_i^+] \subseteq I} \frac{1}{n}, \qquad \mathrm{Pl}(I) = \sum_{i:\, [x_i^-, x_i^+] \cap I \neq \emptyset} \frac{1}{n}.$$
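For illustration, a small sketch (ours) that turns the single-fault intervals $[x_i^-, x_i^+]$ into the belief and plausibility of a query interval I:

```python
def belief_and_plausibility(single_fault_intervals, query):
    """Bel(I) and Pl(I) for the single-fault description.

    single_fault_intervals: list of (x_i^-, x_i^+), each carrying mass 1/n.
    query: the interval I = (a, b).
    """
    n = len(single_fault_intervals)
    a, b = query
    bel = sum(1 for lo, hi in single_fault_intervals if a <= lo and hi <= b) / n
    pl = sum(1 for lo, hi in single_fault_intervals if lo <= b and a <= hi) / n
    return bel, pl

print(belief_and_plausibility([(0.8, 1.2), (0.9, 1.1), (0.5, 1.5)], (0.7, 1.3)))
# (0.666..., 1.0): two of the three intervals lie inside I, all three intersect it
```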
4.10 Single Fault: Possible Inconsistency

If all the bounds on $p_i$ are described in terms of a largest possible distance $\Delta_i$ between the actual value $p_i$ and the nominal value $p_i^{(0)}$, then $p_i^- = p_i^{(0)} - \Delta_i$, $p_i^+ = p_i^{(0)} + \Delta_i$, and hence,
$$[x_i^-, x_i^+] = [x^{(0)} - |c_i| \cdot \Delta_i,\; x^{(0)} + |c_i| \cdot \Delta_i].$$
The corresponding intervals all have the value $x^{(0)}$ as their center and therefore, they should be nested (consonant).

In real life, due to the non-ideal character of the simulation model $f(p_1, \ldots, p_n)$, we may get inconsistency. For example, the program $f(p_1, \ldots, p_n)$ may use Monte-Carlo simulation which produces correct results in a vast majority of the cases, but which sometimes may lead to completely wrong results. In this case, a natural solution is to delete the erroneous intervals, and consider only a consonant sub-collection. To determine which intervals to delete, we can use the above algorithm.

The resulting consonant collection has m < n intervals in it. Similarly to the above case, we can assume that all these consonant intervals are equally probable, with probability 1/m > 1/n. (In effect, what we are doing here is re-distributing the probability of the deleted intervals to the remaining ones.)

4.11 Practical Applications
Practical applications of these approaches to different structural design problems are described in [2, 8, 9].
Acknowledgments
This work was supported in part by NASA under cooperative agreement NCC5-209 and NASA/MURED ACE Center, by NSF grants No. DUE-9750858 and CDA-9522207, by the United Space Alliance, grant No. NAS 9-20000 (PWO C0C67713A6), by the Future Aerospace Science and Technology Program (FAST) Center for Structural Integrity of Aerospace Systems, effort sponsored by the Air Force Office of Scientific Research, Air Force Materiel Command, USAF, under grant number F49620-95-1-0518, and by the National Security Agency under Grant No. MDA904-98-1-0561.
References
[1] Bojadziev, G., and Bojadziev, M., Fuzzy sets, fuzzy logic, applications, World Scientific, Singapore, 1995.
[2] Cooper, A., and Ross, T., An investigation of new mathematical structures for safety analysis, Sandia National Lab, Technical Report SAND97-2695, 1997.
[3] Cormen, Th. H., Leiserson, C. E., and Rivest, R. L., Introduction to algorithms, MIT Press, Cambridge, MA, and McGraw-Hill Co., N.Y., 1994.
[4] Erickson, G. J., Rychert, J. T., and Smith, C. R. (eds.), Maximum Entropy and Bayesian Methods, Kluwer, Dordrecht, 1998.
[5] Klir, G., and Yuan, B., Fuzzy Sets and Fuzzy Logic: Theory and Applications, Prentice Hall, Upper Saddle River, NJ, 1995.
[6] Nguyen, H. T., and Kreinovich, V., "Nested Intervals and Sets: Concepts, Relations to Fuzzy Sets, and Applications", In: R. B. Kearfott et al. (eds.), Applications of Interval Computations, Kluwer, Dordrecht, 1996, 245–290.
[7] Nguyen, H. T., and Walker, E. A., First Course in Fuzzy Logic, CRC Press, Boca Raton, FL, 1999.
[8] Ross, T. J., "Possibility intervals in the characterization of dynamic model uncertainty", Proceedings of the 19th International Conference of the North American Fuzzy Information Society NAFIPS'2000, Atlanta, Georgia, July 13–15, 2000, pp. 353–357.
[9] Ross, T. J., and Donald, S., "A new method for the development of empirical probability distributions", Proceedings of the 2000 World Automation Congress, Maui, Hawaii, June 11–16, 2000 (to appear).
[10] Wadsworth, H. M. Jr., ed., Handbook of Statistical Methods for Engineers and Scientists, McGraw-Hill Publishing Co., New York, 1990.
[11] Yager, R. R., et al. (eds.), Advances in the Dempster-Shafer Theory of Evidence, Wiley,