Optimal Elimination of Inconsistency in Expert Knowledge:
Formulation of the Problem, Fast Algorithms
Timothy J. Ross
Berlin Wu
Vladik Kreinovich
Civil Engineering Dept., University of New Mexico, Albuquerque, NM 87131, USA
Mathematics Dept., National Chengchi University, Taipei, Taiwan
Dept. of Computer Science, University of Texas at El Paso, El Paso, TX 79968, USA
ross@unm.edu berlin@math.nccu.edu.tw vladik@cs.utep.edu
Abstract
Expert knowledge is sometimes inconsistent. In this paper, we describe the problem of eliminating this inconsistency as an optimization
problem, and present fast algorithms for solving this problem.
Keywords:
expert knowledge, inconsistency, optimization

1 Introduction
1.1 Expert Knowledge Is Important
In many real-life situations, we have only a few high-level experts who can make high-quality decisions. It is desirable to incorporate their knowledge into an automated system which would help make similar good quality decisions when these top experts are not available. To design such systems, we must elicit the knowledge from the experts.
1.2 Expert Knowledge Is Sometimes Inconsistent
This elicitation problem is difficult because a large part of this expert knowledge is imprecise. When the experts articulate this informal knowledge in computer-understandable formal terms, they inevitably somewhat distort their knowledge. As a result, the expert knowledge which was perfectly consistent may result in somewhat inconsistent formal statements:
- It could be that there is a contradiction between the statements presented by different experts.
- It could also be that several statements given by the same expert are inconsistent.
1.3 Experts Sometimes Cannot Easily Handle This Inconsistency, So We Need an Automated Inconsistency-Elimination Tool
Ideally, if we encounter such an inconsistency, we would like to point out this contradiction to the experts and ask them to resolve it before we input the inconsistent data into the computer-based system. However, these inconsistencies reflect not internal inconsistencies of the expert's reasoning, but rather (inevitable) differences between the expert's internal reasoning and its formalized representation. As a result, experts themselves often cannot provide us with any meaningful guidance about the best way to resolve these inconsistencies. Our experience shows that, if faced with such an inconsistency, experts often make arbitrary changes in their original estimates, changes which are kind of "random" in the sense that in a similar situation the same expert can make a completely different adjustment.
Since we cannot rely on experts to provide us with a meaningful and "consistent" way of eliminating inconsistency, we must therefore design automatic inconsistency-elimination tools and use them as a part of knowledge elicitation.
1.4 What We Are Planning to Do
In this paper, we formulate the problem of optimal inconsistency elimination, and provide fast algorithms for such elimination for the practically important situation when the expert knowledge is about the numerical value of a physical characteristic (e.g., geometric and mechanical parameters of a complex structure). We will consider two cases:
- when the inconsistency is between statements made by different experts, and
- when the inconsistency is between different statements made by the same expert.
2 Eliminating Inconsistency Between Statements Made By Different Experts
2.1 Expert Estimates: General Formulation

Let n denote the total number of experts who estimate the value of a given physical characteristic (such as width, density, etc.), and let $\mathbf{x}_i = [x_i^-, x_i^+]$ be an interval estimate provided by the i-th expert ($i = 1, 2, \ldots, n$). In other words, expert i claims that the actual (unknown) value x of the estimated physical characteristic is in the interval $[x_i^-, x_i^+]$.

2.2 If Experts Are Consistent With Each Other, the Situation Is Easy to Handle
If all the experts are correct, then the actual value of x belongs to all n intervals. The resulting set of possible values of x is thus equal to the intersection $\mathbf{x} = \mathbf{x}_1 \cap \mathbf{x}_2 \cap \ldots \cap \mathbf{x}_n$ of these n intervals. This intersection $\mathbf{x} = [x^-, x^+]$ is easy to compute: $x^- = \max(x_1^-, \ldots, x_n^-)$ and $x^+ = \min(x_1^+, \ldots, x_n^+)$.
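For illustration, here is a minimal Python sketch (ours, not part of the original paper) of this intersection; it returns None exactly when the expert estimates are inconsistent in the sense of the next subsection.

```python
def intersect_intervals(intervals):
    """Intersect a list of expert intervals [(lo, hi), ...].

    Returns (x_minus, x_plus) for the intersection, or None if the
    intersection is empty, i.e., the expert estimates are inconsistent.
    """
    x_minus = max(lo for lo, _ in intervals)   # max of the lower endpoints
    x_plus = min(hi for _, hi in intervals)    # min of the upper endpoints
    return (x_minus, x_plus) if x_minus <= x_plus else None

# The four intervals used in the examples below have an empty intersection:
print(intersect_intervals([(0.0, 1.0), (0.5, 1.5), (0.7, 1.2), (1.3, 1.6)]))  # None
```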
2.3 The Problem: Experts' Statements Can Be Inconsistent With Each Other

The problem emerges when the intersection of the n intervals is empty, i.e., when $\max(x_1^-, \ldots, x_n^-) > \min(x_1^+, \ldots, x_n^+)$. In this case, no real number x can satisfy all the experts' requirements $x \in [x_i^-, x_i^+]$, and therefore, some of these requirements are false. Thus, to design a consistent expert system, we must dismiss some of the expert estimates before computing the intersection.
2.4 We Want To Find the Optimal (Most Probable) Way of Eliminating Inconsistency
There are many ways of eliminating inconsistency. For example, we can always get a consistent statement by simply dismissing all but one interval. From the purely mathematical viewpoint, it makes sense: we get a single interval, and this single interval is, of course, consistent. However, from the viewpoint of our problem, what we are doing is, in effect, proclaiming that n − 1 experts are wrong. It can be that one of the experts is wrong, it can even happen that two experts are wrong, but it is hardly probable that all but one of them are wrong – if this is true, what kind of experts are they? So, if we simply dismiss all expert estimates but one, we get a consistent set, but this consistent set is very improbable.
Thus, we do not simply want to eliminate inconsistency, we want to select the optimal ("most probable") way of eliminating inconsistency.
So far, we have used the word “most probable” in the informal sense. To formalize this problem, we must describe the corresponding probabilities.
2.5 Possible Additional Knowledge About Reliability of Different Experts
Depending on our knowledge about the experts, we will consider two different formulations of this problem:

- We may already have some experience with this group of experts, and so we know, from experience, for each expert i, the fraction $p_i$ of cases in which this expert turned out to be correct. In other words, we know the probability $p_i$ that the i-th expert's statement is correct.
- It may also happen that we do not have any preliminary experience with this particular group of experts, so we have no reason to assume that one of them is more reliable than the others. In this case, it makes sense to assume that all experts have the same (unknown) probability p of making a correct statement. It is safe to assume that since we consider all these experts to be really experts, the probability p that an expert is correct should be larger than the probability 1 − p that the expert is wrong, i.e., p > 1 − p (which is equivalent to p > 0.5).

From the mathematical viewpoint, the second situation can be considered as a particular case of the first one, when $p_1 = \ldots = p_n = p$; the only difference is that in the first case, we know the probabilities $p_i$, while in the second case, we do not know them (we only know that they are equal to each other).
2.6 Experts Are Independent
In this paper, we will assume that the experts are independent. This assumption makes sense: e.g., if the opinions of two experts are highly correlated, then there is no sense in interviewing both of them: the opinion of the first expert describes, with high accuracy, the opinion of the second expert as well. Thus, e.g.:
- the probability that the experts i and j are both correct is equal to $p_i \cdot p_j$;
- the probability that the expert i is correct and the expert j is not correct is equal to $p_i \cdot (1 - p_j)$.

2.7 Towards the Mathematical Formulation of the Problem
Our goal is to select a subset $S \subseteq \{1, 2, \ldots, n\}$ for which the corresponding statements are consistent, i.e., for which

$$\bigcap_{i \in S} \mathbf{x}_i \neq \emptyset. \qquad (1)$$

For each such selection S, its probability p(S) is equal to:

$$p(S) = \left(\prod_{i \in S} p_i\right) \cdot \left(\prod_{i \notin S} (1 - p_i)\right). \qquad (2)$$
We must, therefore, find a set S for which $p(S) \to \max$ among all sets S which satisfy the condition (1).

2.8 In Principle, We Can Use Exhaustive Search, But This Is Intractable
There are only finitely many sets $S \subseteq \{1, \ldots, n\}$, so in principle, we could simply test them all and find the one for which the probability p(S) is the largest possible. Unfortunately, this "exhaustive search" strategy is very time-consuming: indeed, a set of n elements has $2^n$ subsets, so for a realistic case of n = 30 experts, we would need to test about a billion different subsets. For an even larger number of experts, this search becomes completely intractable. We therefore need a more sophisticated algorithm for selecting the most probable combination of intervals. This algorithm will be presented now.
2.9 Fast Algorithm for Solving This Problem (Case When We Know $p_i$)
The proposed algorithm consists of the following stages:

1. First, we sort all 2n endpoints $x_1^-, \ldots, x_n^-, x_1^+, \ldots, x_n^+$ into an ordered sequence of real numbers
$$x_{(1)} \leq \ldots \leq x_{(k)} \leq \ldots \leq x_{(2n)}.$$
This sorting requires $O(n \cdot \log(n))$ elementary computation steps (see, e.g., [3]).

2. Then, for each $k = 1, \ldots, 2n - 1$, we consider the interval $[x_{(k)}, x_{(k+1)}]$. For this interval, we check, for every $i = 1, \ldots, n$, whether this interval belongs to $\mathbf{x}_i$ or not. Thus, we form the set $S^{(k)}$ of all indices i for which $[x_{(k)}, x_{(k+1)}] \subseteq \mathbf{x}_i$, and then compute the corresponding value $p(S^{(k)})$ by using the formula (2).
Each checking requires 2n comparisons, and computing p(S) requires n multiplications. So, for each k, we need 3n steps, and the total number of computational steps required to perform this procedure for all 2n − 1 values of k is equal to $O(n) \cdot O(n) = O(n^2)$.

3. As a result of the second stage, we get 2n − 1 different values of $p(S^{(k)})$. To find the largest value (or values), we compare them with each other: we start with the value corresponding to the leftmost interval $[x_{(1)}, x_{(2)}]$, mark it as the largest-so-far, and then go over the other values, replacing the largest-so-far with the next one if the next value of $p(S^{(k)})$ is indeed larger. For each of the 2n − 1 values, we perform a single comparison, so this stage requires $O(n)$ elementary computational steps.

The total number of steps is therefore equal to $O(n \cdot \log(n)) + O(n^2) + O(n) = O(n^2)$, i.e., we get a quadratic-time algorithm for finding the optimal (most probable) consistent combination of expert statements.
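As an illustration, here is a minimal Python sketch of the three stages above (our code, not part of the original paper; the function name and data layout are our choices). The intervals and the probabilities $p_i$ are passed in as lists.

```python
from math import prod

def most_probable_consistent_subset(intervals, probs):
    """Select the most probable consistent subset of expert intervals.

    intervals: list of (lo, hi) pairs, one per expert.
    probs:     list of probabilities p_i that expert i is correct.
    Maximizes p(S) = prod_{i in S} p_i * prod_{i not in S} (1 - p_i)
    over all S whose intervals share a common elementary interval.
    """
    n = len(intervals)
    # Stage 1: sort all 2n endpoints.
    points = sorted(x for pair in intervals for x in pair)
    best_set, best_p = set(), -1.0
    # Stage 2: for each elementary interval [x_(k), x_(k+1)], collect the experts
    # whose intervals contain it and compute p(S) by formula (2).
    for k in range(2 * n - 1):
        lo, hi = points[k], points[k + 1]
        S = {i for i, (a, b) in enumerate(intervals) if a <= lo and hi <= b}
        p = prod(probs[i] if i in S else 1 - probs[i] for i in range(n))
        # Stage 3 (folded in): keep the largest value seen so far.
        if p > best_p:
            best_set, best_p = S, p
    return best_set, best_p

# Pedagogical example from the next subsection: experts 1-3 are kept, expert 4 is dismissed.
S, p = most_probable_consistent_subset(
    [(0.0, 1.0), (0.5, 1.5), (0.7, 1.2), (1.3, 1.6)], [0.7, 0.7, 0.7, 0.8])
print(sorted(i + 1 for i in S), round(p, 4))   # [1, 2, 3] 0.0686
```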
2.10 Pedagogical Example
Let us illustrate the above algorithm on the following simple example. Let
$$\mathbf{x}_1 = [0.0, 1.0],\quad \mathbf{x}_2 = [0.5, 1.5],\quad \mathbf{x}_3 = [0.7, 1.2],\quad \mathbf{x}_4 = [1.3, 1.6].$$
In this case, $\max(x_1^-, \ldots, x_4^-) = 1.3 > \min(x_1^+, \ldots, x_4^+) = 1.0$, and thus, the intersection of all four intervals is empty. Let us assume that $p_1 = p_2 = p_3 = 0.7$ and $p_4 = 0.8$. Then, the algorithm results in the following selection:

1. First, we sort all 8 endpoint values into a sequence
$$x_{(1)} = 0.0,\ x_{(2)} = 0.5,\ x_{(3)} = 0.7,\ x_{(4)} = 1.0,\ x_{(5)} = 1.2,\ x_{(6)} = 1.3,\ x_{(7)} = 1.5,\ x_{(8)} = 1.6.$$
2. Second, for each of the 7 intervals $[x_{(k)}, x_{(k+1)}]$, we compute the corresponding value $p(S^{(k)})$:
- For k = 1, we have $[x_{(1)}, x_{(2)}] = [0.0, 0.5]$, so $S^{(1)} = \{1\}$, and $p(S^{(1)}) = 0.7 \cdot 0.3 \cdot 0.3 \cdot 0.2 = 0.0126$.
- For k = 2, we have $[x_{(2)}, x_{(3)}] = [0.5, 0.7]$, so $S^{(2)} = \{1, 2\}$, and $p(S^{(2)}) = 0.7 \cdot 0.7 \cdot 0.3 \cdot 0.2 = 0.0294$.
- For k = 3, we have $[x_{(3)}, x_{(4)}] = [0.7, 1.0]$, so $S^{(3)} = \{1, 2, 3\}$, and $p(S^{(3)}) = 0.7 \cdot 0.7 \cdot 0.7 \cdot 0.2 = 0.0686$.
- For k = 4, we have $[x_{(4)}, x_{(5)}] = [1.0, 1.2]$, so $S^{(4)} = \{2, 3\}$, and $p(S^{(4)}) = 0.3 \cdot 0.7 \cdot 0.7 \cdot 0.2 = 0.0294$.
- For k = 5, we have $[x_{(5)}, x_{(6)}] = [1.2, 1.3]$, so $S^{(5)} = \{2\}$, and $p(S^{(5)}) = 0.3 \cdot 0.7 \cdot 0.3 \cdot 0.2 = 0.0126$.
- For k = 6, we have $[x_{(6)}, x_{(7)}] = [1.3, 1.5]$, so $S^{(6)} = \{2, 4\}$, and $p(S^{(6)}) = 0.3 \cdot 0.7 \cdot 0.3 \cdot 0.8 = 0.0504$.
- For k = 7, we have $[x_{(7)}, x_{(8)}] = [1.5, 1.6]$, so $S^{(7)} = \{4\}$, and $p(S^{(7)}) = 0.3 \cdot 0.3 \cdot 0.3 \cdot 0.8 = 0.0216$.
3. Among these 7 values, the largest corresponds to $S^{(3)} = \{1, 2, 3\}$, so we dismiss the estimate of the 4th expert.
2.11 How To Modify the Above Algorithm If We Do Not Know the Probabilities $p_i$
If we do not know the probabilities $p_i$ (i.e., if we only know that $p_i = p$ for some (unknown) value p > 0.5), then we cannot implement the second stage of the above algorithm – because we cannot compute the value p(S).

However, there is a way around it: the only place where we use the values p(S) is the third stage of this algorithm, and on this stage, we do not need the actual values of p(S), we only need to know which values are larger and which are smaller. For the case when $p_i = p$ for all i, the probability (2) turns into
$$p(S) = p^{|S|} \cdot (1 - p)^{n - |S|},$$
where |S| denotes the number of elements in the set S. This expression can be rewritten as
$$p(S) = \left(\frac{p}{1 - p}\right)^{|S|} \cdot (1 - p)^n.$$
Since p > 1 − p, we can conclude that the more elements S has, the larger the corresponding value of p(S). Thus, instead of computing p(S), we can simply compute |S| and then select a consistent set S which contains the largest possible number of intervals. In other words, we dismiss the smallest possible number of expert estimates: this makes perfect sense.
Thus, we arrive at the following algorithm:
2.12 Fast Algorithm for Solving This Problem (Case When We Do Not Know the Probabilities $p_i$)
For this case, the proposed algorithm consists of the following stages:

1. First, we sort all 2n endpoints $x_1^-, \ldots, x_n^-, x_1^+, \ldots, x_n^+$ into an ordered sequence of real numbers
$$x_{(1)} \leq \ldots \leq x_{(k)} \leq \ldots \leq x_{(2n)}.$$

2. Then, for each $k = 1, \ldots, 2n - 1$, we consider the interval $[x_{(k)}, x_{(k+1)}]$. For this interval, we check, for every $i = 1, \ldots, n$, whether this interval belongs to $\mathbf{x}_i$ or not. Thus, we form the set $S^{(k)}$ of all indices i for which $[x_{(k)}, x_{(k+1)}] \subseteq \mathbf{x}_i$, and then compute the total number $|S^{(k)}|$ of elements in this set.

3. As a result of the second stage, we get 2n − 1 different values of $|S^{(k)}|$. To find the largest value (or values), we compare them with each other: we start with the value corresponding to the leftmost interval $[x_{(1)}, x_{(2)}]$, mark it as the largest-so-far, and then go over the other values, replacing the largest-so-far with the next one if the next value of $|S^{(k)}|$ is indeed larger.

This version also requires $O(n^2)$ elementary computation steps.
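In code, the only change against the earlier sketch is the quantity being maximized; a minimal sketch under the same assumptions (our illustration, not the authors' code):

```python
def largest_consistent_subset(intervals):
    """Select a consistent subset with as many expert intervals as possible
    (the case when the probabilities p_i are unknown but equal)."""
    n = len(intervals)
    points = sorted(x for pair in intervals for x in pair)  # all 2n endpoints
    best = set()
    for k in range(2 * n - 1):
        lo, hi = points[k], points[k + 1]
        S = {i for i, (a, b) in enumerate(intervals) if a <= lo and hi <= b}
        if len(S) > len(best):      # maximize |S| instead of p(S)
            best = S
    return best

# Same intervals as before; again the 4th interval is dismissed.
print(largest_consistent_subset([(0.0, 1.0), (0.5, 1.5), (0.7, 1.2), (1.3, 1.6)]))  # {0, 1, 2}
```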
2.13 Pedagogical Example
If, in the above example
$$\mathbf{x}_1 = [0.0, 1.0],\quad \mathbf{x}_2 = [0.5, 1.5],\quad \mathbf{x}_3 = [0.7, 1.2],\quad \mathbf{x}_4 = [1.3, 1.6],$$
we do not know the probabilities, then, on the second stage, we compute the values
$$|S^{(1)}| = 1,\ |S^{(2)}| = 2,\ |S^{(3)}| = 3,\ |S^{(4)}| = 2,\ |S^{(5)}| = 1,\ |S^{(6)}| = 2,\ |S^{(7)}| = 1,$$
and select the set $S^{(3)}$ for which this number is the largest.
3 Eliminating Inconsistency Between Different Statements Made By the Same Expert
3.1 Nested Intervals (Fuzzy Set) Instead of a Single Interval
In the above description, we assumed that for each physical characteristic x, each expert provides a single interval $[x^-, x^+]$ of possible values of x. In reality, in addition to the interval about which the expert is absolutely sure that x is there, the expert may provide narrower intervals corresponding to smaller degrees of confidence. For example, an expert may be absolutely sure that $x \in [0.0, 1.0]$, almost absolutely sure that $x \in [0.2, 0.8]$, somewhat sure that $x \in [0.4, 0.7]$, etc.

As a result, we get a "nested" ("consonant") collection of intervals $\mathbf{x}_1 \supseteq \mathbf{x}_2 \supseteq \ldots \supseteq \mathbf{x}_n$ which correspond to different degrees of uncertainty. Such a nested collection of intervals forms a fuzzy set [1, 5, 6, 7], in which the different intervals can be viewed as $\alpha$-cuts.

3.2 Some Intervals May Be Erroneous, So We Must Choose a Consonant Sub-Collection
In the ideal situation, the intervals are nested. In real life, some intervals provided by an expert may not be ordered by inclusion. In this case, we must select a consonant subset.
Let us formulate this problem in more precise terms.
Definition 1. We say that two intervals $\mathbf{x}$ and $\mathbf{y}$ are compatible if either $\mathbf{x} \subseteq \mathbf{y}$, or $\mathbf{y} \subseteq \mathbf{x}$.

Definition 2. We say that a collection of intervals is consonant if every two intervals from this collection are compatible.

In these terms, our problem can be formulated as follows:
- we have a collection of intervals $\mathbf{x}_1, \ldots, \mathbf{x}_n$ which is not consonant;
- we must select a consonant sub-collection, i.e., a set $S \subseteq \{1, 2, \ldots, n\}$ for which the intervals $\mathbf{x}_i$, $i \in S$, are consonant.

We would like to choose the "optimal" ("most probable") sub-collection.
3.3 We Want To Find the Optimal (Most Probable) Way of Eliminating Inconsistency
We know that some intervals in the original collection can be erroneous. Since we have no information about the corresponding probabilities, we can assume that each interval has the same (unknown) probability of being erroneous – equivalently, the same probability p > 0.5 of being correct. The probability corresponding to each sub-collection S can therefore be described by the formula $p(S) = p^{|S|} \cdot (1 - p)^{n - |S|}$. We know, from the first problem, that this value is the largest if and only if S contains the largest number of elements. So, the problem of selecting the optimal consonant sub-collection can be reformulated as follows:
- we have a collection of intervals $\mathbf{x}_1, \ldots, \mathbf{x}_n$ which is not consonant;
- we must select a consonant sub-collection S which contains the largest possible number of intervals.

(In other words, we want to dismiss the smallest possible number of intervals – which makes perfect sense.)
3.4 In Principle, We Can Use Exhaustive Search, But This Is Intractable
Similarly to the first problem, we can find the solution to this one by exhaustive search, but for large n, this is intractable, so we need a faster algorithm. This algorithm is described below.

3.5 Fast Algorithm for Solving This Problem
We start with a collection of n intervals $\mathbf{x}_1, \ldots, \mathbf{x}_n$.

- First, for every two intervals $\mathbf{x}_i$ and $\mathbf{x}_j$ from the original collection, we check whether $\mathbf{x}_i \subseteq \mathbf{x}_j$, and put the result ("true" or "false") into the corresponding cell $r_{ij}$ of a special $n \times n$ table. For n intervals, this stage requires $2n^2 = O(n^2)$ comparisons.
- For each interval $\mathbf{x}_i$, we check whether this interval is minimal, i.e., whether $r_{ji} =$ "false" for all j. Then, we mark all the intervals which satisfy this property as intervals of Level 1.
- Then, for every interval which has not yet been marked, we check whether this interval is minimal among the un-marked ones, i.e., whether $r_{ji} =$ "false" for all un-marked intervals $\mathbf{x}_j$. For each interval which satisfies this property, we find all its marked sub-intervals (i.e., all marked intervals $\mathbf{x}_j$ for which $r_{ji} =$ "true"); among these sub-intervals, we find the ones with the largest level $\ell$. Then, we mark $\mathbf{x}_i$ as an interval of level $\ell + 1$, and list nearby all its sub-intervals of level $\ell$. We then repeat this procedure until all intervals are marked.
- At the end, we select the desired consonant sub-collection $\mathbf{y}_1, \ldots, \mathbf{y}_k$:
  - as $\mathbf{y}_1$, we select the interval with the highest level L;
  - after $\mathbf{y}_i$ has been selected (on level L − i + 1), and i < L, we select, as $\mathbf{y}_{i+1}$, any of $\mathbf{y}_i$'s sub-intervals of level L − i.

On each stage, at least one interval is marked, so after at most n stages, the procedure will stop. On each stage, for each of the n intervals, we check n elements of the table; thus, each stage requires $O(n^2)$ steps, and n stages require $O(n \cdot n^2) = O(n^3)$ steps. The algorithm as a whole requires $O(n^2) + O(n^3) = O(n^3)$ steps. Thus, this algorithm requires cubic time.
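The level-marking procedure can be sketched in Python as follows (our illustration, assuming intervals are given as (lo, hi) pairs); in effect, the levels identify a longest chain of nested intervals.

```python
def longest_consonant_subcollection(intervals):
    """Select a consonant (nested) sub-collection with as many intervals as possible.

    level[i] is the length of the longest chain of nested intervals ending at
    interval i; parent[i] remembers one sub-interval realizing that chain.
    """
    n = len(intervals)
    # contains[i][j] plays the role of the table entry r_ji: is x_j inside x_i?
    contains = [[i != j and intervals[i][0] <= intervals[j][0]
                 and intervals[j][1] <= intervals[i][1] for j in range(n)]
                for i in range(n)]
    order = sorted(range(n), key=lambda i: intervals[i][1] - intervals[i][0])
    level, parent = [1] * n, [None] * n
    processed = []
    for i in order:                     # narrowest intervals get marked first
        for j in processed:
            if contains[i][j] and level[j] + 1 > level[i]:
                level[i], parent[i] = level[j] + 1, j
        processed.append(i)
    i = max(range(n), key=lambda i: level[i])   # start from the highest level
    chain = []
    while i is not None:
        chain.append(intervals[i])
        i = parent[i]
    return chain

# Pedagogical example from the next subsection: x_4 = [0.9, 1.0] is dismissed.
print(longest_consonant_subcollection([(0.0, 1.0), (0.4, 0.7), (0.2, 0.8), (0.9, 1.0)]))
# [(0.0, 1.0), (0.2, 0.8), (0.4, 0.7)]
```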
3.6 Pedagogical Example

Let
$$\mathbf{x}_1 = [0.0, 1.0],\quad \mathbf{x}_2 = [0.4, 0.7],\quad \mathbf{x}_3 = [0.2, 0.8],\quad \mathbf{x}_4 = [0.9, 1.0].$$
This collection is not consonant, because, e.g., neither $\mathbf{x}_3 \subseteq \mathbf{x}_4$, nor $\mathbf{x}_4 \subseteq \mathbf{x}_3$.

- According to the above algorithm, first, we pick the intervals which do not contain any other intervals. In our original collection, there are two such intervals: $\mathbf{x}_2 = [0.4, 0.7]$ and $\mathbf{x}_4 = [0.9, 1.0]$; they are marked as intervals of Level 1.
- On the next stage, we compare the two remaining (un-marked) intervals $\mathbf{x}_1$ and $\mathbf{x}_3$. The only minimal one among them is $\mathbf{x}_3$. Its only marked sub-interval is $\mathbf{x}_2$ of Level 1; thus, we assign Level 2 to $\mathbf{x}_3$, and list $\mathbf{x}_2$ as its sub-interval of Level 1.
- There is only one remaining un-marked interval $\mathbf{x}_1$, so this un-marked interval is therefore minimal. It has three marked sub-intervals: $\mathbf{x}_3$ of Level 2, and $\mathbf{x}_2$ and $\mathbf{x}_4$ of Level 1. Thus, the largest level of its sub-intervals is 2, so we mark the interval $\mathbf{x}_1$ as Level 3, and list $\mathbf{x}_3$ as its sub-interval of Level 2.

All intervals are marked now, so we can start forming the desired consonant sub-collection:
- First, we pick the interval $\mathbf{x}_1 = [0.0, 1.0]$ with the highest level as the first element $\mathbf{y}_1$ of the desired collection. Here, L = 3.
- As $\mathbf{y}_2$, we pick $\mathbf{y}_1$'s sub-interval $\mathbf{x}_3 = [0.2, 0.8]$ of level L − 1 = 2.
- As $\mathbf{y}_3$, we pick $\mathbf{y}_2$'s sub-interval $\mathbf{x}_2 = [0.4, 0.7]$ of level L − 2 = 1.

This means that we dismiss the interval $\mathbf{x}_4$.
4 An Application of the Above Algorithm to Error Estimation
4.1 Formulation of the Problem
In this section, we show that although the above method was originally designed to deal with inconsistency of expert knowledge, it can also be used in error estimation, when expert knowledge is only indirectly present. Specifically, we have a simulation model of a complex system. The results of the simulation depend on the values of certain parameters of the system, parameters whose values are not exactly known. For example, for a mechanical structure, these parameters include geometric parameters such as widths, heights, etc., and mechanical parameters such as density, Young's modulus, etc. For each of these parameters $p_1, \ldots, p_n$, we know the "average" (nominal) value $p_i^{(0)}$, and we know the interval $[p_i^-, p_i^+]$ of possible values of this parameter.

We are interested in a certain characteristic x of the system. For example, for a mechanical structure, we may be interested in this structure's stability – measured, e.g., by the maximal load that this structure can withstand. We want to make sure that this characteristic indeed exceeds the desired threshold $x_0$ (e.g., that the structure can withstand the given load $x_0$).

For each combination of values $p_1 \in [p_1^-, p_1^+]$, . . . , $p_n \in [p_n^-, p_n^+]$, the simulation model enables us to compute the value $x = f(p_1, \ldots, p_n)$ of the desired characteristic. In particular, for the nominal values $p_i = p_i^{(0)}$ of the parameters, we get the value $x^{(0)} = f(p_1^{(0)}, \ldots, p_n^{(0)})$. Since we do not know what exactly the actual values $p_i$ are, it is not enough to check that this nominal value is OK (i.e., that $x^{(0)} \geq x_0$). We must also test this condition under different combinations of the parameters $p_i$. The number of possible combinations grows exponentially with n. Thus, for a realistically large number of parameters, exhaustive-search-type checking of all possible combinations becomes an intractable problem. We need a faster algorithm.
4.2 Conservative Estimate (Linearization): Idea
A traditional engineering approach to this problem is to linearize it. The main idea of this approach is as follows. Usually, in most engineering problems, all the dependencies are smooth. In particular, it is safe to assume that the dependence $f(p_1, \ldots, p_n)$ of the desired characteristic x on the parameters $p_i$ is also smooth. To utilize this smoothness, we can take into consideration the fact that usually, the intervals $[p_i^-, p_i^+]$ are reasonably narrow, and thus, we can describe $p_i$ as $p_i = p_i^{(0)} + \Delta p_i$, where $\Delta p_i \in [p_i^- - p_i^{(0)}, p_i^+ - p_i^{(0)}]$ is reasonably small. Thus, we can expand the dependence
$$f(p_1, \ldots, p_n) = f(p_1^{(0)} + \Delta p_1, \ldots, p_n^{(0)} + \Delta p_n)$$
in a Taylor series in $\Delta p_i$ and ignore quadratic and higher order terms in this expansion. As a result, we get the following approximate expression for $f(p_1, \ldots, p_n)$:
$$f(p_1, \ldots, p_n) = x^{(0)} + c_1 \cdot \Delta p_1 + \ldots + c_n \cdot \Delta p_n, \qquad (3)$$
where we denoted
$$c_i = \left.\frac{\partial f}{\partial p_i}\right|_{p_i = p_i^{(0)}}. \qquad (4)$$
If we know the derivatives $c_i$, then we can find the smallest and the largest values of the expression (3): namely, the sum attains the largest possible value if each term attains the largest possible value, and depending on the sign of $c_i$, this largest possible value is attained when $\Delta p_i$ reaches either the lower endpoint or the upper endpoint of its interval of possible values.

Often, possible bounds on $p_i$ are described in terms of a largest possible distance $\Delta_i$ between the actual value $p_i$ and the nominal value $p_i^{(0)}$. In this case, $p_i^- = p_i^{(0)} - \Delta_i$ and $p_i^+ = p_i^{(0)} + \Delta_i$. Then, the largest possible value of $c_i \cdot \Delta p_i$ is equal to $|c_i| \cdot \Delta_i$, the smallest possible value of $c_i \cdot \Delta p_i$ is equal to $-|c_i| \cdot \Delta_i$, and therefore, the linear combination (3) can take arbitrary values from the interval $[x^-, x^+] = [x^{(0)} - \Delta, x^{(0)} + \Delta]$, where we denoted
$$\Delta \stackrel{\rm def}{=} |c_1| \cdot \Delta_1 + \ldots + |c_n| \cdot \Delta_n.$$
In general, if the function f were given by an analytical expression, then we would be able to explicitly differentiate it and get the desired coefficients $c_i$. In reality, the function f is not known analytically; it is only given as a complex program, so we need to use numerical differentiation to find these values. For example, if we take $p_2 = p_2^{(0)}$, . . . , $p_n = p_n^{(0)}$ (i.e., $\Delta p_2 = \ldots = \Delta p_n = 0$), then (3) turns into
$$f(p_1, p_2^{(0)}, \ldots, p_n^{(0)}) = x^{(0)} + c_1 \cdot \Delta p_1. \qquad (5)$$
This expression is a linear function of $\Delta p_1$. It is known that a linear function attains its largest and smallest values on each interval at the endpoints of this interval. Thus, the largest and the smallest values of the expression (5) are attained when $p_1$ is equal to one of the endpoints of the corresponding interval (and thus, when $p_1$ is equal either to $p_1^-$ or to $p_1^+$). Hence, to find the smallest and the largest possible values of (5), we can compute $f(p_1^-, p_2^{(0)}, \ldots, p_n^{(0)})$ and $f(p_1^+, p_2^{(0)}, \ldots, p_n^{(0)})$. We will denote the smallest and the largest of these values by $x_1^-$ and $x_1^+$. In these notations, the combination $x^{(0)} + c_1 \cdot \Delta p_1$ takes arbitrary values from the interval $[x_1^-, x_1^+]$. Hence, the product $c_1 \cdot \Delta p_1$ can take arbitrary values from the interval $[x_1^- - x^{(0)}, x_1^+ - x^{(0)}]$.

Similarly, for every i, the product $c_i \cdot \Delta p_i$ can take arbitrary values from the interval $[x_i^- - x^{(0)}, x_i^+ - x^{(0)}]$. Therefore, the linear combination (3) can take arbitrary values from the interval $[x^-, x^+]$, where
$$x^- = x^{(0)} + (x_1^- - x^{(0)}) + \ldots + (x_n^- - x^{(0)}), \qquad (6)$$
$$x^+ = x^{(0)} + (x_1^+ - x^{(0)}) + \ldots + (x_n^+ - x^{(0)}). \qquad (7)$$
4.3 Conservative Estimate (Linearization): Algorithm
We know the algorithm $f(p_1, \ldots, p_n)$ which transforms the values of the parameters $p_i$ into an estimate for the desired characteristic x. For every parameter $p_i$, we know its nominal value $p_i^{(0)}$, and we know the interval $[p_i^-, p_i^+]$ of possible values. We want to estimate the range of x when $p_i \in [p_i^-, p_i^+]$.

- First, we compute the nominal value $x^{(0)}$ of x by applying f to the nominal values of the parameters: $x^{(0)} \stackrel{\rm def}{=} f(p_1^{(0)}, \ldots, p_n^{(0)})$.
- Then, for every i, we change $p_i$ to $p_i^-$ and to $p_i^+$ while keeping all other parameters at their nominal values, and apply the simulation function f. As a result, we get two values of x:
$$x_i^{(-)} \stackrel{\rm def}{=} f(p_1^{(0)}, \ldots, p_{i-1}^{(0)}, p_i^-, p_{i+1}^{(0)}, \ldots, p_n^{(0)}), \qquad (8)$$
$$x_i^{(+)} \stackrel{\rm def}{=} f(p_1^{(0)}, \ldots, p_{i-1}^{(0)}, p_i^+, p_{i+1}^{(0)}, \ldots, p_n^{(0)}). \qquad (9)$$
We denote the smaller of these two values by $x_i^-$, and the larger by $x_i^+$.
- Finally, we compute the endpoints $x^-$ and $x^+$ of the interval of possible values of x by using the formulas (6) and (7).
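A minimal Python sketch of this procedure (ours, not from the paper; `f` stands for the black-box simulation model and takes the parameter vector as a list):

```python
def conservative_range(f, nominal, lower, upper):
    """Conservative (linearized) range of x = f(p_1, ..., p_n).

    nominal, lower, upper: lists with p_i^(0), p_i^-, p_i^+ for each parameter.
    Implements formulas (6)-(7): each parameter is varied one at a time,
    and the one-at-a-time deviations from x^(0) are added up.
    """
    x0 = f(nominal)                                   # x^(0)
    x_lo, x_hi = x0, x0
    for i in range(len(nominal)):
        p_minus = nominal[:i] + [lower[i]] + nominal[i + 1:]
        p_plus = nominal[:i] + [upper[i]] + nominal[i + 1:]
        xi = sorted((f(p_minus), f(p_plus)))          # x_i^- <= x_i^+
        x_lo += xi[0] - x0                            # formula (6)
        x_hi += xi[1] - x0                            # formula (7)
    return x_lo, x_hi

# Toy check with a linear "model": for f = 2*p1 - p2 the estimate is the exact range.
print(conservative_range(lambda p: 2 * p[0] - p[1], [1.0, 1.0], [0.9, 0.8], [1.1, 1.2]))
# (0.6, 1.4)
```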
4.4 Conservative Estimate (Linearization): Problem
The main problem with the above conservative estimate is that it is too conservative: the worst-case upper bound (that we computed) corresponds to the case when all the parameters attain their extreme values. Theoretically, such an unfortunate worst-case combination of parameters is possible, but in reality, experts often consider such a worst-case scenario to be unrealistic.
Since the worst case is not realistic, we can probably get a narrower interval of possible values of x. In the following text, we describe two ideas for such an estimate. The second of these ideas will use the above consonant-selection algorithm.

4.5 Randomized Approach: Idea
Experts claim that the extreme combination of values $p_i$ is "improbable". To formalize this claim, let us describe a reasonable probability distribution on the set of all possible combinations $p = (p_1, \ldots, p_n)$. The only information that we have about each value $p_i$ is that $p_i$ can take any value from the interval $[p_i^-, p_i^+]$. Thus, the possible values of the vector $p = (p_1, \ldots, p_n)$ form an n-dimensional box
$$[p_1^-, p_1^+] \times \ldots \times [p_n^-, p_n^+].$$
Thus, the desired probability distribution must be located in this box. There are many possible probability distributions located in this box. According to mathematical statistics, it is reasonable to select a distribution for which the entropy is the largest possible (see, e.g., [4]). Such a distribution corresponds to the case when the variables $p_i$ are independent, and each of them is uniformly distributed on the corresponding interval $[p_i^-, p_i^+]$. What is the corresponding distribution for the value x (as described by the formula (3))?

4.6 Randomized Approach: Derivation of the Formulas
For large n, we can apply the central limit theorem (see, e.g., [10]) and conclude that the distribution of x is close to Gaussian. It is known that a Gaussian distribution is uniquely determined by its mean value $\bar{x}$ and standard deviation $\sigma[x]$, so let us determine these two values.

Since the formula (3) describes x as a linear combination of the random variables $p_i$, the mean value $\bar{x}$ is equal to the linear combination of the corresponding means $\bar{p}_i$:
$$\bar{x} = x^{(0)} + c_1 \cdot (\bar{p}_1 - p_1^{(0)}) + \ldots + c_n \cdot (\bar{p}_n - p_n^{(0)}). \qquad (10)$$
Since each variable $p_i$ is uniformly distributed on the interval $[p_i^-, p_i^+]$, its mean value is equal to the midpoint of this interval: $\bar{p}_i = (p_i^- + p_i^+)/2$. Since we do not know the values $c_i$, we must compute the resulting expression indirectly. Namely, from the formulas (8) and (9), we conclude that $x_i^{(-)} = x^{(0)} + c_i \cdot (p_i^- - p_i^{(0)})$ and $x_i^{(+)} = x^{(0)} + c_i \cdot (p_i^+ - p_i^{(0)})$. Taking the arithmetic average of the left-hand sides and right-hand sides of these two equalities, we conclude that $\bar{x}_i = x^{(0)} + c_i \cdot (\bar{p}_i - p_i^{(0)})$, where we denoted
$$\bar{x}_i \stackrel{\rm def}{=} \frac{x_i^{(-)} + x_i^{(+)}}{2}. \qquad (11)$$
Therefore, $c_i \cdot (\bar{p}_i - p_i^{(0)}) = \bar{x}_i - x^{(0)}$, and from (10), we conclude that
$$\bar{x} = x^{(0)} + (\bar{x}_1 - x^{(0)}) + \ldots + (\bar{x}_n - x^{(0)}). \qquad (12)$$
To compute the standard deviation $\sigma[x]$, we can also use the fact that the random variables $p_i$ are independent and uniformly distributed. For a random variable $p_i$ which is uniformly distributed on an interval $[p_i^-, p_i^+]$, the standard deviation is determined by the known formula $\sigma^2[p_i] = \frac{1}{12} \cdot |p_i^+ - p_i^-|^2$. Therefore,
$$\sigma^2[x] = \sum_{i=1}^{n} c_i^2 \cdot \sigma^2[p_i] = \frac{1}{12} \cdot \sum_{i=1}^{n} \left(c_i \cdot (p_i^+ - p_i^-)\right)^2. \qquad (13)$$
From the formulas (8) and (9), we can conclude that $c_i \cdot (p_i^+ - p_i^-) = x_i^{(+)} - x_i^{(-)}$; therefore,
$$\sigma[x] = \frac{1}{\sqrt{12}} \cdot \sqrt{\sum_{i=1}^{n} (x_i^+ - x_i^-)^2}. \qquad (14)$$
It is known that, with a high probability, the actual value of a Gaussian random variable x cannot differ from its average by more than k standard deviations, where k = 2 corresponds to the probability 0.95, k = 3 corresponds to the probability 0.999, etc. Thus, we can conclude that, with a high probability, $x \in [x^-, x^+]$, where
$$x^- = \bar{x} - k \cdot \sigma[x], \qquad x^+ = \bar{x} + k \cdot \sigma[x]. \qquad (15)$$
Thus, we arrive at the following algorithm:
4.7 Randomized Approach: Algorithm
We know the algorithm $f(p_1, \ldots, p_n)$ which transforms the values of the parameters $p_i$ into an estimate for the desired characteristic x. For every parameter $p_i$, we know its nominal value $p_i^{(0)}$, and we know the interval $[p_i^-, p_i^+]$ of possible values. We want to estimate the range of x when $p_i \in [p_i^-, p_i^+]$.

- First, we compute the nominal value $x^{(0)}$ of x by applying f to the nominal values of the parameters: $x^{(0)} \stackrel{\rm def}{=} f(p_1^{(0)}, \ldots, p_n^{(0)})$.
- Then, for every i, we change $p_i$ to $p_i^-$ and to $p_i^+$ while keeping all other parameters at their nominal values, and apply the simulation function f. In other words, we compute the values $x_i^{(-)}$ and $x_i^{(+)}$ by using the formulas (8) and (9); we denote the smaller of these two values by $x_i^-$, and the larger by $x_i^+$.
- For each i, we compute the mean value $\bar{x}_i$ by using the formula (11), and then compute $\bar{x}$ and $\sigma[x]$ by using the formulas (12) and (14).
- Finally, we compute the endpoints $x^-$ and $x^+$ of the interval of possible values of x by using the formulas (15).
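A minimal sketch of this randomized estimate (our illustration, under the same black-box assumption for `f`); it reuses the one-at-a-time simulation runs of the conservative sketch above:

```python
from math import sqrt

def randomized_range(f, nominal, lower, upper, k=2):
    """Randomized (maximum-entropy) estimate of the range of x = f(p_1, ..., p_n).

    Implements formulas (11), (12), (14), (15): the mean and standard deviation
    of x are estimated from one-at-a-time runs, and the range is mean +/- k*sigma.
    """
    x0 = f(nominal)
    mean, var = x0, 0.0
    for i in range(len(nominal)):
        x_minus = f(nominal[:i] + [lower[i]] + nominal[i + 1:])   # formula (8)
        x_plus = f(nominal[:i] + [upper[i]] + nominal[i + 1:])    # formula (9)
        mean += (x_minus + x_plus) / 2 - x0                       # formulas (11), (12)
        var += (x_plus - x_minus) ** 2 / 12                       # formula (14), squared
    sigma = sqrt(var)
    return mean - k * sigma, mean + k * sigma                     # formula (15)

# Same toy model as above; the randomized bounds are narrower than (0.6, 1.4).
print(randomized_range(lambda p: 2 * p[0] - p[1], [1.0, 1.0], [0.9, 0.8], [1.1, 1.2]))
```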
4.8 Single Fault: Idea
Another reasonable idea is as follows. In reliability analysis and in fault detection, it is often assumed that in a well-functioning system:
- it is possible that one component fails, but
- it is highly improbable that two or more components fail at the same time.

Similarly, it is reasonable to assume that:
- while a single parameter $p_i$ can drastically differ from its nominal value $p_i^{(0)}$,
- it is highly improbable that two or more parameters will take values which are drastically different from their nominal values.

Thus, in estimating the possible values of x, we can safely assume that at most one parameter $p_i$ differs from its nominal value, while the values of all other parameters are practically equal to their nominal values.

Since we have no a priori information on which of the n parameters is more probable to differ from its nominal value, it is natural to assume that all n parameters have the same probability of deviation, which is thus equal to 1/n.

What can we then conclude about the possible values of x?

4.9 Single Fault: Formulas
When the i-th parameter is the only one which differs from its nominal value, then, due to formula (3), the possible values of x range from $x_i^{(-)}$ (which corresponds to $p_i = p_i^-$) to $x_i^{(+)}$ (which corresponds to $p_i = p_i^+$). In other words, in this case, the set of possible values of x is the interval $[x_i^-, x_i^+]$.

Thus, we can conclude that with probability 1/n, the set of possible values coincides with $[x_1^-, x_1^+]$; with the same probability 1/n, this set coincides with $[x_2^-, x_2^+]$, etc. This description is in line with the Dempster-Shafer formalism of possibility distributions (see, e.g., [11]), which can be used to describe, for each interval I, the belief Bel(I) and the plausibility Pl(I) that the actual value x is in I:
$$\mathrm{Bel}(I) = \sum_{i:\, [x_i^-, x_i^+] \subseteq I} \frac{1}{n}, \qquad \mathrm{Pl}(I) = \sum_{i:\, [x_i^-, x_i^+] \cap I \neq \emptyset} \frac{1}{n}.$$
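For illustration, a small sketch (ours) that turns the single-fault intervals $[x_i^-, x_i^+]$ into the belief and plausibility of a query interval I:

```python
def belief_and_plausibility(single_fault_intervals, query):
    """Bel(I) and Pl(I) for the single-fault description.

    single_fault_intervals: list of (x_i^-, x_i^+), each carrying mass 1/n.
    query: the interval I = (a, b).
    """
    n = len(single_fault_intervals)
    a, b = query
    bel = sum(1 for lo, hi in single_fault_intervals if a <= lo and hi <= b) / n
    pl = sum(1 for lo, hi in single_fault_intervals if lo <= b and a <= hi) / n
    return bel, pl

print(belief_and_plausibility([(0.8, 1.2), (0.9, 1.1), (0.5, 1.5)], (0.7, 1.3)))
# (0.666..., 1.0): two of the three intervals lie inside I, all three intersect it
```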
4.10 Single Fault: Possible Inconsistency

If all the bounds on $p_i$ are described in terms of a largest possible distance $\Delta_i$ between the actual value $p_i$ and the nominal value $p_i^{(0)}$, then $p_i^- = p_i^{(0)} - \Delta_i$, $p_i^+ = p_i^{(0)} + \Delta_i$, and hence,
$$[x_i^-, x_i^+] = [x^{(0)} - |c_i| \cdot \Delta_i,\; x^{(0)} + |c_i| \cdot \Delta_i].$$
The corresponding intervals all have the value $x^{(0)}$ as their center and therefore, they should be nested (consonant).

In real life, due to the non-ideal character of the simulation model $f(p_1, \ldots, p_n)$, we may get inconsistency. For example, the program $f(p_1, \ldots, p_n)$ may use Monte-Carlo simulation which produces correct results in a vast majority of the cases, but which sometimes may lead to completely wrong results. In this case, a natural solution is to delete the erroneous intervals, and consider only a consonant sub-collection. To determine which intervals to delete, we can use the above algorithm.

The resulting consonant collection has m < n intervals in it. Similarly to the above case, we can assume that all these consonant intervals are equally probable, with probability 1/m > 1/n. (In effect, what we are doing here is re-distributing the probability of the deleted intervals to the remaining ones.)

4.11 Practical Applications
Practical applications of these approaches to different structural design problems are described in [2, 8, 9].
Acknowledgments
This work was supported in part by NASA under cooperative agreement NCC5-209 and NASA/MURED ACE Center, by NSF grants No. DUE-9750858 and CDA-9522207, by the United Space Alliance, grant No. NAS 9-20000 (PWO C0C67713A6), by the Future Aerospace Science and Technology Program (FAST) Center for Structural Integrity of Aerospace Systems, effort sponsored by the Air Force Office of Scientific Research, Air Force Materiel Command, USAF, under grant number F49620-95-1-0518, and by the National Security Agency under Grant No. MDA904-98-1-0561.
References
[1] Bojadziev, G., and Bojadziev, M., Fuzzy sets, fuzzy logic, applications, World Scientific, Singapore, 1995.
[2] Cooper, A., and Ross, T., An investigation of new mathematical structures for safety analysis, Sandia National Lab, Technical Report SAND97-2695, 1997.
[3] Cormen, Th. H., Leiserson, C. E., and Rivest, R. L., Introduction to algorithms, MIT Press, Cambridge, MA, and McGraw-Hill Co., N.Y., 1994.
[4] Erickson, G. J., Rychert, J. T., and Smith, C. R. (eds.), Maximum Entropy and Bayesian Methods, Kluwer, Dordrecht, 1998.
[5] Klir, G., and Yuan, B., Fuzzy Sets and Fuzzy Logic: Theory and Applications, Prentice Hall, Upper Saddle River, NJ, 1995.
[6] Nguyen, H. T., and Kreinovich, V., "Nested Intervals and Sets: Concepts, Relations to Fuzzy Sets, and Applications", In: R. B. Kearfott et al. (eds.), Applications of Interval Computations, Kluwer, Dordrecht, 1996, 245–290.
[7] Nguyen, H. T., and Walker, E. A., First Course in Fuzzy Logic, CRC Press, Boca Raton, FL, 1999.
[8] Ross, T. J., "Possibility intervals in the characterization of dynamic model uncertainty", Proceedings of the 19th International Conference of the North American Fuzzy Information Society NAFIPS'2000, Atlanta, Georgia, July 13–15, 2000, pp. 353–357.
[9] Ross, T. J., and Donald, S., "A new method for the development of empirical probability distributions", Proceedings of the 2000 World Automation Congress, Maui, Hawaii, June 11–16, 2000 (to appear).
[10] Wadsworth, H. M. Jr., ed., Handbook of Statistical Methods for Engineers and Scientists, McGraw-Hill Publishing Co., New York, 1990.
[11] Yager, R. R., et al. (eds.), Advances in the Dempster-Shafer Theory of Evidence, Wiley,