A decentralized trustworthiness estimation model for open, multiagent systems (DTMAS)


Often in open multiagent systems, agents interact with other agents to meet their own goals. Trust is, therefore, considered essential to make such interactions effective. However, trust is a complex, multifaceted concept and includes more than just evaluating others’ honesty. Many trust evaluation models have been proposed and implemented in different areas; most of them focus on algorithms for trusters to model the trustworthiness of trustees in order to make effective decisions about which trustees to select. For this purpose, many trust evaluation models use third-party information sources such as witnesses, but little consideration is given to locating such third-party information sources. Unlike most trust models, the proposed model defines a scalable way to locate a set of witnesses, and combines a suspension technique with reinforcement learning to improve the model’s responses to dynamic changes in the system. Simulation results indicate that the proposed model benefits trusters while demanding less message overhead.


In many systems that are common in virtual contexts, such as peer-to-peer systems, e-commerce, and the grid, elements act in an autonomous and flexible way in order to meet their goals. Such systems can be modeled as open, dynamic multi-agent systems (MASs) [1]. In open, dynamic MASs, agents can represent software entities or human beings. Agents can come from any setting with heterogeneous abilities, organizational relationships, and credentials. Furthermore, the decision-making processes of individual agents are independent of each other and agents can join or leave the system. As each agent has only bounded abilities, it may need to rely on the services or resources of other agents in order to meet its objectives [2]. Agents cannot take for granted that other agents share the same core beliefs about the system or that other agents make accurate statements regarding their competencies and abilities. In addition, agents must accept the possibility that others may intentionally spread false information, or otherwise behave in a harmful way, to meet their own aims [3]. Therefore, trust evaluating agents, also referred to as trusters (TRs), should use a trust estimation model that allows them to recognize reliable partners in their systems. The estimated assessment should be sufficiently accurate to allow TRs to distinguish honest trustees (TEs) in the system. In open MASs, the trust evaluation model should not rely on centralized entities but should dynamically update the agents’ knowledge sets to take into account new characteristics of the environment. The failure or takeover of any agent must not lead to the failure of the whole system.

TRs use trustworthiness estimation to resolve some of the uncertainty in their interactions and form expectations about the behaviors of others [3]. Trust has been defined in many ways in different domains [1]. For this work, the definition of trust in MASs used in [4] will be adapted. A TE’s trustworthiness is considered a measure of how likely the TE is to do what it is supposed to do.

Unlike most trust models, DTMAS defines a scalable way to locate a set of witnesses to consult for indirect trust information, provided that a network structure exists. The model uses a semi-hierarchical structure for MASs, coupled with the notion of small world networks [5] and the concept of contacts [6]. Furthermore, a suspension technique is combined with reinforcement learning (RL) to improve the model’s responses to dynamic changes in the system. The suspension addresses the short-term relationship between a TR and the TE under consideration. It helps the TR to deal with a recently malfunctioning TE that used to be honest for a relatively large number of transactions. The idea is that a TR stops interacting with a misbehaving trustee immediately, and waits until it is clear whether this misbehavior is accidental or a behavioral change. Because the suspension is temporary, and because the TR uses information from witnesses, the effect of accidental misbehavior will phase out, but the effect of a behavior change will be magnified.

Background and related work

Reinforcement learning (RL)

Reinforcement learning attempts to solve the problem of learning from interaction to achieve an objective. An agent starts by observing the current state s of the environment, then performs an action on the environment, and later receives feedback r from the environment. The received feedback is also called a reinforcement or reward. Agents aim to maximize the cumulative reward they receive [7].

There are three well-known, fundamental classes of algorithms for solving the reinforcement learning problem, namely dynamic programming, Monte Carlo, and temporal-difference (TD) learning methods [7]. TD learning algorithms can learn directly from experience: unlike dynamic programming, they do not require an accurate model of the environment. Moreover, unlike Monte Carlo algorithms, which must wait until the end of an episode to update the value function (only then is the return r known), TD algorithms only need to wait until the next time step. TD algorithms are thus incremental in a systematic sense [7].

One of the most widely used TD algorithms is known as the Q-learning algorithm. Q-learning works by learning an action-value function based on the interactions of an agent with the environment and the instantaneous reward it receives. For a state s, the Q-learning algorithm chooses an action a to perform such that the state-action value Q(s, a) is maximized. If performing action a in state s produces a reward r and a transition to state s', then the corresponding state-action value Q(s, a) is updated accordingly. State s is then replaced by s' and the process is repeated until the terminal state is reached [7]. The detailed mathematical foundation and formulation, as well as the core algorithm of Q-learning, can be found in [8] and are therefore not repeated here.
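For readers who want a concrete picture of the update step described above, the following is a minimal sketch of the commonly stated tabular Q-learning rule, Q(s, a) ← Q(s, a) + α[r + γ max Q(s', a') − Q(s, a)]. The function name, the state and action labels, and the learning rate and discount values are illustrative assumptions, not taken from [7] or [8].

```python
from collections import defaultdict


def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step and return the new Q(s, a).

    alpha (learning rate) and gamma (discount) are illustrative defaults.
    """
    # Value of the best action available in the successor state s'.
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    # Move Q(s, a) toward the one-step target r + gamma * best_next.
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]


# Hypothetical two-action example; unseen state-action pairs default to 0.
Q = defaultdict(float)
actions = ["left", "right"]
new_value = q_update(Q, "s0", "right", 1.0, "s1", actions)
```

With all values initialised to zero, the first update simply moves Q(s0, right) a fraction alpha of the way toward the observed reward.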

Q-learning is an attractive method of learning because of the simplicity of the computational demands per step and also because of proof of convergence to a global optimum, avoiding all local optima, as long as the Markov Decision Process (MDP) requirement is met; that is the next state depends only on the current state and the taken action (it is worth noting that the MDP requirement applies to all RL methods) [9].

Clearly, a MAS can be an uncertain environment, and the environment may change at any time. Reinforcement learning explicitly considers the problem of an agent that learns from interaction with an uncertain environment in order to achieve a goal. The learning agent must discover which actions yield the most reward via a trial-and-error search rather than being told which actions to take as in most forms of machine learning. It is this special characteristic of reinforcement learning that makes it a naturally suitable learning method for trust evaluating agents in MASs. Furthermore, the suitability of RL can also be seen if we note that a TR observes the TEs, selects a TE, and receives the service from that TE. In other words, these agents get some input, take an action, and receive some reward. This is the same framework used in reinforcement learning, which is why we chose reinforcement learning as the learning method for our proposed agent model [10].

Related work

A wide variety of trust and reputation models have been developed in the literature. Different dimensions to classify and characterize computational trust and reputation models were presented in different surveys [11]. One of the most cited and used general purpose surveys is the one developed by Sabater and Sierra [12]. Their dimensions of analysis were prepared to enhance the properties of the Regret model [13] and show basic characteristics of trust models [11]. The dimensions proposed in [12] are:

  • Paradigm type, where models are classified as cognitive or numerical. The numerical paradigm includes models that do not have any explicit representation of cognitive attitudes to describe trust. On the other hand, the cognitive paradigm includes models in which the notion of trust or reputation is built on beliefs and their degrees [11].

  • Information sources, where some models use direct experiences while other models use third-party testimonies from other agents in the same environment, referred to as witnesses. Yet others depend on the analysis of social relations among the agents [11].

  • Visibility, where the trust information of an agent is considered either a private property that each agent builds or a global property that all other agents can observe.

  • Granularity, which refers to the context-dependence of trust and reputation models.

  • Cheating behavior, which refers to the models’ assumptions regarding information from witnesses. According to [12], a model may assume that witnesses are honest, or that witnesses may hide information but never lie or, alternatively, that witnesses can be cheaters.

  • Type of exchanged information, where information is assumed to be either boolean or continuous estimations.

In addition to those dimensions, [11] added the procedural dimension to reflect whether or not a bootstrap mechanism is embedded within the model. Furthermore, they introduced the generality dimension to distinguish general-purpose models from those that focus on very particular scenarios.

From an architectural point of view, [11,14-16], among others, differentiated between centralized and distributed models. Typically, a central entity manages the reputation of all agents in the first approach, whereas each agent performs its trust estimations without a central entity in the second approach [16]. Decentralized systems may use flat or hierarchical architectures. Generally speaking, decentralized models are more complex than centralized models. However, a single point of failure and performance bottlenecks are major concerns for centralized models [17], which also need powerful and reliable central entities and communication bandwidth [16].

The witness-locating dimension was described in [14] to reflect whether or not a mechanism is embedded within the model for locating third-party information sources. Few general-purpose trust models, such as FIRE [15], use a distributed approach for this purpose.

This section reviews a selection of decentralized models and is not meant to provide a comprehensive survey of the trust modeling literature in MASs. Recent surveys such as [2,11,14] provide a more detailed overview of the literature.

A decentralized, subjective, RL-based trustworthiness estimation model for buying and selling agents in an open, dynamic, uncertain and untrusted e-marketplace is described in [18] and further elaborated in [7]. This model is based on information collected from past direct experiences, where buyers classify sellers as trustworthy, untrustworthy or neutral. A buying agent chooses to purchase from a trustworthy seller. If no trustworthy seller is available, then a seller from the list of non-untrustworthy sellers is chosen. The seller’s trustworthiness estimation is updated based on whether the seller meets the expected value for the demanded product with proper quality. The update is made after comparing the information obtained about the reliability of a seller against the product quality obtained from the same seller. The model described in [7,18] uses thresholds to categorize TEs as trustworthy or untrustworthy agents. TRs do not interact with untrustworthy TEs, and among the trustworthy ones, the TEs with the highest values are selected as interaction partners. The information is based on the TR’s personal interaction experience, so a newly entering TR lacks knowledge about the different TEs. This model has limited applicability if repeated transactions between traders are rare [19].

A decentralized, subjective extension to the model used in [18] is described in [20,21] to enable indirect trustworthiness estimation based on third-party witnesses in an e-marketplace. In this model, witnesses are partitioned into trustworthy, untrustworthy and neutral sets to address buyers’ subjectivity in opinions. However, the authors did not present any experimental results to justify their theoretical approach [22].

A combined model based on direct and indirect trustworthiness estimation is described in [23] as an extension to [20,21], where advising agents are partitioned into trustworthy, untrustworthy and neutral sets. To address the subjectivity of witnesses, the mean of the differences between the witness’s trustworthiness estimation and the buyer’s trustworthiness estimation of a seller is used to adjust the advisor’s trustworthiness estimation for that seller. Nevertheless, how witnesses are located was not specified.

All those RL-based models [18,21] and [22] classify TEs into three non-overlapping sets, namely trusted, distrusted, and neutral, the last also referred to as neither trusted nor untrusted [18]. Similarly, the computational model of [24] classifies TEs as trusted, distrusted or untrusted, where the last one means neither trusted nor distrusted. The authors extended the model of [25] and enriched it with the use of regret and forgiveness to aid TRs in classifying TEs. Their trust estimation at time instance t+1 depends on the current estimation (at time t) and a forgiveness function. The forgiveness function, in turn, depends on both the regret the TR has because it trusted a TE that did not fulfill its needs, and the regret that the corresponding TE expresses, if any, for not satisfying the demand of the TR. However, this latter factor, the regret of the TE, can be misleading if the TE communicates a false regret value. The model does not use third-party witnesses.

TRAVOS [26] is a well-known decentralized, general-purpose trustworthiness estimation model for MASs. The model depends on beta distributions to predict the likelihood of honesty of a TE [19]. If the TR’s confidence in its evaluation is below a predefined threshold, the TR seeks advice from witnesses. In TRAVOS, trust is a private and subjective property that each TR builds. The model computes the reliability of witnesses via the direct experience of interaction between the TR and witnesses, and discards inaccurate advice [4]. Unfortunately, it takes some time for the TR to recognize the inaccuracy of reports provided by previously trusted witness agents [4]. The model does not describe how witnesses are located.

Regret [13] is a decentralized, general-purpose trustworthiness estimation model for open MASs that takes into account direct experiences, witness information and social structures to calculate trust, reputation and levels of credibility [11]. The model assumes that witnesses are willing to cooperate, and depends on social networks among agents to find witnesses; it then uses a set of fuzzy rules to estimate the credibility of witnesses and therefore of their testimonies [2]. Even though the model heavily depends on agents’ social networks, it does not show how TRs may build them [15].

Yu and Singh [27] presented a decentralized, subjective trustworthiness estimation model where a TR estimates the trustworthiness of a TE using both its own experience and advice from witnesses. The model uses social network concepts in MASs, incorporating a TrustNet structure for each agent. Each agent in the system maintains a set of acquaintances and their expertise. The set of neighbors is a subset of the acquaintances set. The model locates witnesses based on individual agents’ knowledge and help through each agent’s neighbors, without relying on a centralized service [15]. Thus, when looking for a certain piece of information, an agent can send the query to a number of its neighbors, who will try to answer the query if possible or will refer the requester to a subset of their own neighbors [15]. The requester considers the information only if the referred witnesses are within a limit in the social tree [11]. To address the subjectivity of witnesses, agents model acquaintances’ expertise [27]. An agent’s expertise is then used to determine how likely it is to have interacted with, or to know witnesses of, the target agent [15]. The model uses Dempster-Shafer evidence theory to aggregate the information from different witnesses.

FIRE [15] is a well-known, general-purpose, decentralized trustworthiness estimation model for open MASs. The model takes into account multiple sources of information. FIRE categorizes trust components into four categories: Interaction Trust (direct experience), Witness Reputation, Role-based Trust and Certified Reputation. The model assumes that witnesses are honest and willing to cooperate, and uses weighted summation to aggregate trust components [15]. In FIRE, trust is a private and subjective property that each TR builds [11].

To locate witnesses, FIRE uses a variant of the referral system used by [27], but does not model witnesses’ expertise the same way as in [27]. Instead, FIRE assumes that addressing the subjectivity of witnesses is application dependent. To address resource limitations, the branching factor and the referral length threshold parameters are used. The first limits the breadth of the search and the second limits its depth [15]. An important aspect of FIRE is that a TR does not create a trust graph, as in [27], and therefore may quickly evaluate the TE’s trust value using a relatively small number of transactions [4]. Unfortunately, if a TE proposes some colluding referee for certified reputation, this source of information can be misleading to the TR [4].

To address the subjectivity of witnesses, most models allow a TR to evaluate the subjectivity of witnesses simply based on their deviations from its own opinions [28]. Most models of trustworthiness estimation allow the communication of trustworthiness information regardless of its context, even though trustworthiness estimations are context-dependent [28]. A functional ontology of context for evaluating trust (FOCET), a context-aware trustworthiness estimation model for multi-agent systems, is described in [28] to address the case where a TE may offer different kinds of services in different contexts. These contexts might be totally different or have some features in common.

To measure the effect of a particular context, two individual metrics were defined: 1) a weight matrix (WM) that includes the importance level of each feature of context; and 2) a relevancy matrix (RM) that indicates the degree of similarity of each feature in the first context with the corresponding one in the second context. The WM is a 1 × n matrix, where n is the number of FOCET context features and matrix entry w_i is in [0, 1]. The RM is an n × 1 matrix where matrix entry v_i is in [0, 1] and refers to the degree of importance of the corresponding feature. For example, an agent, called B1, may consider the “fast-enough” delivery of a specific transaction very important and use 0.9 as the corresponding value in its WM. Similarly, another agent, called B2, may also consider the “fast-enough” delivery of another transaction very important and use 0.9 as the corresponding value in its WM. However, for B1, “fast enough” means within one week, whereas for B2, “fast enough” means within one day. Therefore, B1 will use a lower value for the “delivery time” feature in its RM (e.g. 0.2), whereas B2 will use a higher value for the “delivery time” feature in its RM (e.g. 0.7).

Given the WM and RM matrices, the influence of the context of the direct interaction between the witness agent and the TE in which the trustworthiness of the TE is estimated, known as the context effect factor (CEF), is computed in [28] by

$$ CEF=\frac{\sum_{i=1}^{n}\left[(1-w_{i})+w_{i}\,v_{i}\right]}{n} \tag{1} $$

TRs subjectively specify a degree of decay p (0 ≤ p ≤ 1), based on their policies, in order to adaptively reduce the influence of old trustworthiness estimation information. Time decay is used to emphasize that the utility gain (UG) from recent transactions weighs more than the UG from old transactions if they have the same absolute value [28].

$$ CEF'=e^{-p\,\Delta t}\,CEF \tag{2} $$

where Δt indicates the time elapsed since previous interactions took place and can be determined according to the temporal factor concept in FOCET.
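The two formulas above can be illustrated with a short sketch. It assumes the summation in the CEF formula covers the whole term (1 − w_i) + w_i·v_i for each feature, which is the reading that keeps CEF in [0, 1]; the function names and the example weights are hypothetical and not taken from [28].

```python
import math


def context_effect_factor(weights, relevancies):
    """Average of (1 - w_i) + w_i * v_i over the n context features."""
    if not weights or len(weights) != len(relevancies):
        raise ValueError("weights and relevancies must be non-empty and equal in length")
    n = len(weights)
    return sum((1 - w) + w * v for w, v in zip(weights, relevancies)) / n


def decayed_cef(cef, p, delta_t):
    """Apply exponential time decay with degree of decay p over elapsed time delta_t."""
    return math.exp(-p * delta_t) * cef


# Hypothetical two-feature context: an important but dissimilar feature
# (w=0.9, v=0.2) and a moderately important, identical feature (w=0.5, v=1.0).
cef = context_effect_factor([0.9, 0.5], [0.2, 1.0])
old_cef = decayed_cef(cef, p=0.5, delta_t=2.0)
```

A feature with low weight contributes close to 1 regardless of its relevancy, so only features the agent considers important can pull the factor down.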

A common issue with RL-based trust models is that TRs do not quickly recognize environment changes and adapt to new settings. To address this shortcoming, DTMAS uses a technique similar to the regret described in [24] in order to improve the model’s responses to dynamic changes in the system. We propose suspending the use of a TE as a response to an unsatisfactory transaction with that TE. The suspension is temporary, and its period increases as the transaction importance increases. Similar to regret [24], it represents an immediate reaction of a TR when it is not satisfied with an interaction with a TE. However, unlike the use of regret in [24], TRs do not depend on any regret expressed by TEs. Furthermore, suspension is more aggressive: a TR does not interact with a TE for as long as the TE is suspended. On the other hand, while forgiveness is used to reduce the effect of regret in [24], our suspension decays with time only. This avoids being misled by false regret expressed by the TE, allows the effect of accidental misbehavior to phase out, and magnifies the effect of a behavior change as testimonies from witnesses are aggregated. Furthermore, DTMAS integrates the context-dependency of third-party testimonies [28] with reinforcement learning for MASs. Similar to [13,15,27], DTMAS defines a way to locate a set of witnesses to consult for indirect trust information. Like [15], a TR using DTMAS does not create a trust graph, as in [27], and therefore may quickly evaluate the TE’s trust value using a relatively small number of transactions. Unlike existing trust models, DTMAS uses a semi-hierarchical structure for MASs, coupled with the notion of the “small world” [5], to help reduce the communication overhead associated with locating witnesses.


In this section, we introduce some general notation and outline the necessary components and assumptions we make about the underlying trust estimation model, which we will use in the remainder of this work. For the complete list of abbreviations used in this study, please see Table 1.

Table 1 Abbreviations

Agent architecture

Based on the agent’s architecture described in [29], we assume that each agent has an embedded trust management module. This module stores models of other agents and interfaces both with the communication module and the decision selection mechanism. The subcomponents of the trust management module, in compliance with [29], are listed in the following:

  • Evaluate: This component is responsible for evaluating the trustworthiness of other agents using different information sources such as direct experience and witness testimonies. The trust models described in [13,15,18,26,27] are well-known models that belong mainly to the evaluation component. The proposed DTMAS is a trust evaluation model.

  • Establish: This component is responsible for determining the proper actions to establish the agent to be trustworthy to others. The work of Tran, et al. [30] is an example of a model designed mainly to address this component.

  • Engage: This component is responsible for allowing rational agents to decide to interact and engage others with the aim of estimating their trustworthiness. In the literature, this component is usually referred to as trust bootstrapping and cold start problem. Bootstrapping Trust Evaluations Through Stereotypes [31] is an example model that belongs mainly to this component.

  • Use: This component is responsible for determining how to select prospective sequences of actions based on the trust models of other agents that have been learned. The model described in [32] is an example model that belongs mainly to this component.

Agents and tasks

We assume a society of agents, A = {a_1, a_2, ...}, which is referred to as the global society. We assume a set of possible tasks T = {s_1, ..., s_n}. The nature of the tasks in T is application dependent. A TR that desires to see some task accomplished considers depending on a trustee to perform the task on its behalf [32].

Agents can communicate with each other in a distributed manner. No central entity exists to facilitate trust-related communications. No service level agreement or contract exists between TRs and TEs. We assume that witnesses are willing to cooperate.


Architectural overview of DTMAS

Finding the set of witnesses that previously interacted with a specific TE is a prerequisite before a TR can use indirect estimation of the TE’s trustworthiness. While many decentralized trust models offer a variety of approaches for trustworthiness estimation, few of them define how to locate the set of witnesses to contact for indirect trust information; most simply assume the availability of this set. Even though a simple broadcast may be used for this purpose, the associated overhead may limit the scalability of the model.

We propose the use of a semi-hierarchical model for MASs influenced by the Zone Routing Protocol (ZRP) [33], coupled with the notion of small world networks [5] and the concept of contacts [6]. In the context of Mobile Ad-hoc NETworks (MANETs), the ZRP defines a zone for each node as the set of nodes reachable within a radius of R edges (links or hops in MANET terminology). Nodes obtain routes to all nodes within their zone in a proactive approach. A reactive routing approach is employed to discover routes to nodes outside a zone [33]. It was suggested in [5] that introducing a small number of long-range edges is enough to make the world “small”, while having short paths between each pair of nodes. In the context of MANETs, [6] suggested the use of a few nodes away from the querying node, which act as shortcuts to convert a MANET into a “small world”. The author referred to those nodes as contacts of the querier.

In the architecture of DTMAS, a zone is defined for each TR as the set of neighboring agents within a radius of R edges. Each TR broadcasts to its neighbors within its zone that it has interacted with a TE whenever such an interaction takes place for the first time. This information can be refreshed periodically, or when a change takes place, e.g., when the agent moves into a different neighborhood. Therefore, each TR knows which of its neighboring agents have interacted with a particular TE. If a TR does not find proper information locally, it starts searching outside its zone through its contacts. For DTMAS, contacts are a few agents away from the querying agent, which act as shortcuts to convert a MAS into a small world in order to help the querying agent locate witnesses, if any. This is useful for highly dynamic systems where agents may change their locations frequently and/or agents frequently enter and leave the system. To alleviate potential overhead and enhance the scalability of the model, a TR can query a number of “contacts”, typically far away from itself, instead of querying every node in its neighborhood, and the query may be forwarded up to a maximum number of agents called the search depth. This has a similar effect to the use of the branching factor in the Yu and Singh model [27] and in the FIRE model [15]. After the presentation of the trustworthiness estimation in subsection “Trustworthiness estimation”, the algorithm for locating witnesses is presented in subsection “Locating witnesses”, followed by the algorithm for selecting contacts in subsection “Selecting contacts”. Both the number of contacts and the search depth, explained in subsection “Locating witnesses”, can be used to control overhead in the MAS.
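The zone notion described above can be sketched as a bounded breadth-first search over the agent network. This is only an illustration of the zone definition and of the local witness lookup, not the witness-locating algorithm of DTMAS itself; the adjacency-dictionary representation, the function names and the `interacted_with` table are assumptions made for the example.

```python
from collections import deque


def zone(graph, start, radius):
    """Return the agents reachable from `start` within `radius` edges (start excluded)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue  # do not expand beyond the zone boundary
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return {agent for agent in dist if agent != start}


def local_witnesses(graph, truster, radius, interacted_with, trustee):
    """Agents in the truster's zone that have announced an interaction with `trustee`."""
    return {a for a in zone(graph, truster, radius)
            if trustee in interacted_with.get(a, set())}


# Hypothetical chain of agents a - b - c - d.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
interacted_with = {"c": {"te1"}, "d": {"te1"}}
witnesses = local_witnesses(graph, "a", 2, interacted_with, "te1")
```

In this example agent d has also interacted with te1 but lies outside the zone of radius 2, so reaching it would require the contact-based search described in the text.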

Trustworthiness estimation

Q-learning based trustworthiness estimation is deemed suitable for uncertain environments in an electronic marketplace where agents discover which actions yield the most satisfying results via a trial-and-error search [34]; therefore, we make use of Q-learning for trustworthiness estimation in DTMAS. For simplicity, we will use RL to refer to Q-learning in this work.

In addition to the use of RL for trustworthiness estimation and the integration of direct and indirect trust estimation without assuming the honesty of witnesses, we propose the integration of a suspension technique to enhance the model responses to dynamic changes in the system.

Employing DTMAS, a TR implements an immediate, temporary suspension of the use of a TE if a transaction results in an unsatisfactory result for the TR. The suspension period increases as the transaction importance of the unsatisfactory transaction increases. The objective is to reduce the side effects associated with a TE that built a good reputation and then, for some reason, began to misbehave. However, suspension should be temporary to avoid excluding a good TE that accidentally misbehaved.

If an honest witness a1 considers a TE as suspended, according to direct transactions between a1 and the TE, then, whenever a TR consults a1 about the TE during the suspension period, a1 will report the trustworthiness estimation of the TE as -1 (the lowest limit of possible credibility). Furthermore, an honest witness a1 will reduce the reported trustworthiness estimation of a TE after the end of the suspension period. This reduction is inversely proportional to the time elapsed since the end of the suspension. In other words, the reported trustworthiness estimation of a TE will equal the calculated trustworthiness estimation of the TE, based on the history of interaction between a1 and the TE, minus a penalty (i.e. punishment) amount related to the time elapsed since the end of the suspension period.
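The reporting behavior of an honest witness can be illustrated as follows. The exact penalty function is not given in this section, so the sketch simply assumes a penalty of the form penalty_scale / elapsed, which is one literal reading of “inversely proportional”, and clamps the result to the lower bound of -1. The function name, the `penalty_scale` parameter and the time units are hypothetical.

```python
def reported_trust(calculated, now, suspension_end, penalty_scale=0.5):
    """Trust value an honest witness reports for a TE it has suspended.

    Assumption: the post-suspension penalty is penalty_scale / elapsed;
    the source only states that it is inversely proportional to elapsed time.
    """
    if now <= suspension_end:
        # During the suspension the witness reports the lowest possible value.
        return -1.0
    elapsed = now - suspension_end
    penalty = penalty_scale / elapsed
    # Never report below the lower bound of the trust range.
    return max(-1.0, calculated - penalty)


during = reported_trust(0.8, now=5.0, suspension_end=10.0)
shortly_after = reported_trust(0.8, now=20.0, suspension_end=10.0)
long_after = reported_trust(0.8, now=110.0, suspension_end=10.0)
```

Under this assumed form the reported value climbs back toward the calculated estimate as more time passes since the suspension ended.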

When considering the credibility of witnesses, a suspension policy similar to the one implemented in the direct trust component of the model is used. That is, a TR will suspend the use of any witness whose advice is the opposite of the actual result of the interaction with the TE. The suspension period increases as the transaction importance increases. Initially, all witnesses are considered neutral. In case no witness is found, the TR depends on the direct experience component alone. The default direct trust value of a TE is the neutral value. Zero is used as the neutral value in DTMAS.

When a TR wants to interact with a TE at time t, the TR avoids any TE that is untrustworthy or suspended and estimates the trustworthiness of TEs using integrated direct and indirect trust components, but avoids the advice of all untrustworthy or suspended witnesses. Then, the TR selects the TE that maximizes its UG of the interaction subject to the constraint that the integrated trustworthiness estimation of the TE is not less than a satisfactory threshold (ST).
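The selection rule just described can be sketched as a filter followed by a maximization. The `Candidate` record, its field names and the example numbers are hypothetical; in particular, how the expected UG of each TE is obtained is outside the scope of this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Candidate:
    name: str
    trust: float            # integrated trustworthiness estimation TT(TR, TE)
    utility_gain: float     # expected UG of interacting with this TE
    untrustworthy: bool = False
    suspended: bool = False


def select_trustee(candidates: Sequence[Candidate],
                   satisfactory_threshold: float) -> Optional[Candidate]:
    """Pick the eligible TE with the highest UG, or None if no TE qualifies."""
    eligible = [c for c in candidates
                if not c.untrustworthy and not c.suspended
                and c.trust >= satisfactory_threshold]
    if not eligible:
        return None
    return max(eligible, key=lambda c: c.utility_gain)


candidates = [
    Candidate("A", trust=0.8, utility_gain=5.0),
    Candidate("B", trust=0.3, utility_gain=9.0),                  # below ST
    Candidate("C", trust=0.9, utility_gain=7.0, suspended=True),  # suspended
    Candidate("D", trust=0.6, utility_gain=6.0),
]
chosen = select_trustee(candidates, satisfactory_threshold=0.5)
```

Here B offers the largest UG but fails the threshold and C is suspended, so the choice falls on D rather than on the most trusted remaining TE.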

Integrated trustworthiness estimation TT(TR,TE)

The trust equation we are interested in should take into consideration TRs’ direct trust of TE(s), testimonies from witnesses, subjectivity in witnesses’ opinions and credibility of witnesses. Therefore, the total trust estimate can be calculated using Eq (3).

$$ TT(TR,\, TE)=x\cdot DT(TR,\, TE)+(1-x)\cdot IT(TR,\, TE) \tag{3} $$

where:

  • DT(TR,TE) is the direct trust estimation component of the TR for the TE.

  • IT(TR,TE) is the indirect trust estimation component of the TR for the TE.

  • x is a positive factor, chosen by the TR, which determines the weight of each component in the model.
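As a minimal sketch of Eq. (3), the weighted combination can be written as below. The function name is hypothetical, and the check that x lies in [0, 1] is an assumption made here so that the result stays within the range of the two components; the text only states that x is positive.

```python
def total_trust(dt: float, it: float, x: float) -> float:
    """Eq. (3): TT(TR, TE) = x * DT(TR, TE) + (1 - x) * IT(TR, TE)."""
    if not 0.0 <= x <= 1.0:
        # Assumption of this sketch, not a constraint stated in the text
        raise ValueError("x is assumed to lie in [0, 1]")
    return x * dt + (1.0 - x) * it
```

With x = 1 the estimate reduces to the direct trust component, which corresponds to the case in which no witness is available.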

Direct trustworthiness estimation DT(TR, TE)

TRs use RL to estimate the direct trust of TEs in a way similar to the process in [7]. If the TR is satisfied by the interaction with the TE, Eq. (4) is used to update the credibility of the TE as viewed by the TR.

$$ {DT}_{t}(TR,\, TE) = {DT}_{t-1}(TR,\, TE) + \alpha \,(1-|{DT}_{t-1}(TR,\, TE)|) $$
  • DT_t(TR, TE) is the direct trust estimation of the TE by the TR at instant Time_t.

  • The cooperation factor α is positive (0 < α < 1), and the initial value of the direct trustworthiness estimation is set to zero.

The value of DT(TR, TE) varies from -1 to 1. A TE is considered trustworthy if the trustworthiness estimation is above an honesty threshold (HT), which is similar to the cooperation threshold in [24]. The TE is considered untrustworthy if the trustworthiness estimation value falls below a fraudulent threshold (FT), which is similar to the forgiveness limit in [24]. TEs with trustworthiness estimation values between the two thresholds are considered neutral.

If the TR is not satisfied by the interaction with the TE, Eq. (5) is used to update the credibility of the TE as viewed by TR.

$$ {DT}_{t}(TR,\, TE) = {DT}_{t-1}(TR,\, TE) + \beta\,(1-|{DT}_{t-1}(TR,\, TE)|) $$
  • β is a negative factor called the non-cooperation factor (-1 < β < 0).

Furthermore, the TR suspends the use of the TE for a period of time determined by equation (6).

$$ {SUS}_{t}(TE)={SUS}_{t-1}(TE)+BSI*IV $$
  • SUS_t(TE) is the suspension penalty associated with the TE at instant Time_t.

  • The basic suspension interval (BSI) is application dependent, and could be days in an e-marketplace or seconds in a robotics system that has a short life time.

  • The transaction importance (IV) is how much the TR values the transaction, not the actual utility gain of the interaction.
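Eqs. (4) to (6) can be illustrated with the short Python sketch below. The function names and the default values of α and β are assumptions chosen for illustration; the text leaves these factors to each agent.

```python
def update_direct_trust(dt_prev: float, satisfied: bool,
                        alpha: float = 0.1, beta: float = -0.2) -> float:
    """Apply Eq. (4) after a satisfactory interaction, Eq. (5) otherwise."""
    factor = alpha if satisfied else beta
    # The step shrinks as |DT| approaches 1, so DT stays within [-1, 1]
    return dt_prev + factor * (1.0 - abs(dt_prev))


def extend_suspension(sus_prev: float, bsi: float, iv: float) -> float:
    """Eq. (6): SUS_t(TE) = SUS_{t-1}(TE) + BSI * IV."""
    return sus_prev + bsi * iv
```

For example, starting from the neutral value 0 with α = 0.5, one satisfactory interaction gives 0.5; a following unsatisfactory one with β = -0.5 gives 0.5 - 0.5 * (1 - 0.5) = 0.25. Note that, as the equations are written, the update term vanishes when |DT| = 1.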

We believe that the cooperation and non-cooperation factors are application dependent and should be set by each agent independently. In general, we agree with [7] that the factors should be related to the value gain of the transaction.

When a TR wants to interact with a TE at instant Time_t, the TR avoids any TE that is untrustworthy (i.e., DT_t(TR, TE) < FT) or suspended (i.e., SUS_t(TE) > Time_t).
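The threshold-based classification and the avoidance test can be sketched as follows. The function names are hypothetical; the comparisons follow the conditions given in the text (above HT, below FT, and SUS greater than the current time).

```python
def classify(dt: float, ht: float, ft: float) -> str:
    """Label a TE from its direct trust value and the two thresholds."""
    if dt > ht:
        return "trustworthy"
    if dt < ft:
        return "untrustworthy"
    return "neutral"


def is_avoided(dt: float, ft: float, sus: float, now: float) -> bool:
    """A TE is avoided if it is untrustworthy or its suspension has not yet expired."""
    return dt < ft or sus > now
```

A neutral TE is therefore not avoided by this test alone; it can still be selected as long as it is not suspended.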

Indirect trustworthiness estimation

To estimate indirect trust, a TR consults witnesses who previously interacted with the TE. FOCET [28] is used so that TRs can adopt different context elements, with different importance levels, according to their subjective requirements and environmental conditions. An overview of FOCET is presented in the Related work subsection.

To reduce the effect of fraudulent and outlying witnesses, a TR excludes reports from any witness where the mean of the differences between the witness’s trustworthiness estimation and the TR’s trustworthiness estimation of TEs other than the one under consideration is above the witnesses differences threshold (WDT).
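A possible reading of this exclusion rule is sketched below. It assumes that the "difference" is the absolute difference between the two estimations and that a witness with no TE in common with the TR is not excluded; both are assumptions of this sketch, as are the function and parameter names.

```python
def witness_passes_wdt(witness_est: dict[str, float],
                       own_est: dict[str, float],
                       target_te: str, wdt: float) -> bool:
    """Keep a witness unless its mean disagreement with the TR exceeds WDT."""
    # Compare only TEs that both agents have rated, leaving out the TE under consideration
    shared = [te for te in witness_est if te in own_est and te != target_te]
    if not shared:
        return True  # assumption: nothing to compare, so do not exclude
    mean_diff = sum(abs(witness_est[te] - own_est[te]) for te in shared) / len(shared)
    return mean_diff <= wdt
```

Leaving out the TE under consideration means that a witness is judged only on cases where the TR already has its own estimation to compare against.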

To protect the model from attacks in which a TE obtains some positive ratings and then participates in a bad interaction that causes large damage, the importance of transactions is considered when estimating trust. An honest witness AD reports its testimony (RT) about a TE as

$$ RT(AD,\, TE)=\frac{\sum_{tr=1}^{NI}\left(CEF'_{tr}*{IV}_{tr}*{UG}_{tr}\right)}{MaxUG*\sum_{tr=1}^{NI}\left(CEF'_{tr}*{IV}_{tr}\right)}*PNT $$
  • PNT=(1-DD(AD,TE)).

  • DD(AD, TE) is the fraction by which the reported trust of the TE is reduced because of previous suspension(s).

  • DD = 0 if the TE has never been suspended; otherwise DD = SUS(TE)/Age(AD).

  • NI is the number of transactions between the AD and the TE.

  • CEF′ is the decay factor applied to the CEF, as calculated by Eq. (2).

  • UG_tr is the utility gain of transaction tr with the TE.

  • MaxUG is the maximum possible UG of a transaction; MaxUG is application dependent.
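The testimony formula can be sketched in Python as below. The `Transaction` record and the function name are hypothetical, the decayed CEF′ values are taken as given inputs (Eq. (2) is not reproduced here), and returning the neutral value for an empty history is an assumption. The sketch covers only the reduction after a suspension, not the report of -1 during a suspension.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    cef_decayed: float  # CEF' for this transaction (taken as given here)
    iv: float           # transaction importance
    ug: float           # utility gain of the transaction


def reported_testimony(history: list[Transaction], max_ug: float,
                       sus_te: float = 0.0, age_ad: float = 1.0,
                       ever_suspended: bool = False) -> float:
    """Importance-weighted, normalised UG, scaled by PNT = 1 - DD."""
    weight = sum(t.cef_decayed * t.iv for t in history)
    if weight == 0:
        return 0.0  # assumption: fall back to the neutral value
    weighted_ug = sum(t.cef_decayed * t.iv * t.ug for t in history)
    # DD = SUS(TE) / Age(AD) once the TE has been suspended, 0 otherwise
    dd = sus_te / age_ad if ever_suspended else 0.0
    return weighted_ug / (max_ug * weight) * (1.0 - dd)
```

Because each UG is weighted by its transaction importance, a single damaging high-importance transaction outweighs several satisfactory low-importance ones, which is the protection the text describes.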

A TR will calculate the indirect trust (IT) component as

$$ IT(TR,\, TE)=\frac{\sum_{i=1}^{N}RT\left({AD}_{i},\, TE\right)}{N} $$
  • N is the number of trustworthy witnesses.

  • RT(AD_i, TE) is the testimony of witness i about the TE.
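A minimal sketch of the IT computation follows. Returning `None` when there are no trustworthy witnesses is a design choice of this sketch, so that the caller can fall back to the direct experience component alone; the function name is hypothetical.

```python
from typing import Optional


def indirect_trust(testimonies: list[float]) -> Optional[float]:
    """Plain mean of RT(AD_i, TE) over the N trustworthy witnesses."""
    if not testimonies:
        return None  # no usable witness: the caller relies on direct trust only
    return sum(testimonies) / len(testimonies)
```

The list passed in is expected to contain only testimonies from witnesses that are neither untrustworthy nor suspended.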

Each TR updates its rating for the witnesses after each interaction as follows:

  • If the transaction was satisfactory for the TR and the witness AD had recommended the TE, or if the transaction was not satisfactory and AD’s opinion was “not recommend”, then the trustworthiness estimation of witness AD is incremented by

$$ DT(TR,\, AD)=DT(TR,\, AD)+ \gamma \left(1-|DT(TR,\, AD)|\right) $$
  • Otherwise, the trustworthiness estimation of AD is decremented by

$$ DT(TR,\, AD)=DT(TR,\, AD)+ \zeta\,(1-|DT(TR,\, AD)|) $$

Furthermore, the TR suspends the use of the witness for a period of time determined by

$$ {SUS}_{t}(AD)={SUS}_{t-1}(AD)+WBSI*IV $$
  • γ and ζ are positive and negative factors, respectively, chosen by the TR as the cooperation and non-cooperation factors.

  • SUS_t(AD) is the suspension penalty associated with AD at time instant t.

  • As with the BSI, the witnesses basic suspension interval (WBSI) is application dependent.

  • Transaction importance (IV) is how much the TR values the transaction, not the actual UG of the interaction.

  • SUS_t(AD) is decremented by one each time step; however, it cannot be less than zero.
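The witness rating rules can be sketched as follows. The `WitnessRecord` class, the function names, and the default values of γ, ζ, and WBSI are illustrative assumptions; the text leaves these values to the TR and the application.

```python
from dataclasses import dataclass


@dataclass
class WitnessRecord:
    trust: float = 0.0       # DT(TR, AD), starting at the neutral value
    suspension: float = 0.0  # SUS(AD), the remaining suspension penalty


def rate_witness(rec: WitnessRecord, recommended: bool, satisfied: bool,
                 iv: float, gamma: float = 0.1, zeta: float = -0.2,
                 wbsi: float = 1.0) -> None:
    """Reward advice that matched the outcome; otherwise penalise and suspend."""
    advice_was_right = recommended == satisfied
    factor = gamma if advice_was_right else zeta
    rec.trust += factor * (1.0 - abs(rec.trust))
    if not advice_was_right:
        rec.suspension += wbsi * iv  # SUS_t(AD) = SUS_{t-1}(AD) + WBSI * IV


def tick(rec: WitnessRecord) -> None:
    """Advance one time step: the penalty drops by one but never below zero."""
    rec.suspension = max(0.0, rec.suspension - 1.0)
```

A witness whose advice contradicted the outcome of an important transaction is thus suspended for longer than one that misjudged a minor transaction.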

The value of DT (TR, AD) varies from -1 to 1. A witness is considered trustworthy if the trustworthiness estimation is above the witnesses honesty threshold (WHT). A witness is considered untrustworthy if the trustworthiness estimation falls below the witnesses fraudulence threshold (WFT). Witnesses with trustworthiness estimation values in between the two thresholds are considered neutral.

When a TR wants to interact with a TE at instant Time_t, the TR avoids the advice of any AD that is untrustworthy or suspended.

Locating witnesses

A TR can use Algorithm 1 to find witnesses who previously interacted with a TE. The algorithm is inspired by the routing protocol for MANETs in [35]. The TR a first sends a witness-locating request with search depth D_1 = initial_value to its contacts. If it does not receive satisfactory feedback within a specified time, it creates a new request with D_i = 2*D_{i-1} and sends it again to its contacts. Each contact that observes D_i ≠ 1 reduces the value of D_i in the request by 1 and forwards the request to its own contacts, which serve as second-level contacts for a. In this way, the request travels through multiple levels of contacts until D reduces to 1. Depending on the quality of the information provided (if any), a may choose to continue searching for other alternatives, possibly with a larger D, up to a predefined upper limit for D. The value of D is thus used to query multiple levels of contacts in a manner similar to an expanding ring search. However, this is more efficient than a system-wide broadcast search because the request is directed to individual agents (the contacts).
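The depth-doubling search described above can be approximated by the centralized Python sketch below. It is not Algorithm 1 itself: the contact graph is given as a dictionary, the timeout is replaced by "no witness found at this depth", and a global `visited` set stands in for whatever duplicate suppression the distributed protocol uses. All names are hypothetical.

```python
def locate_witnesses(contacts: dict[str, list[str]],
                     knows_te: set[str],
                     truster: str,
                     initial_depth: int = 1,
                     max_depth: int = 8) -> tuple[set[str], int]:
    """Return (witnesses found, last depth tried), doubling the depth after each miss."""
    if initial_depth < 1:
        raise ValueError("initial_depth must be at least 1")
    depth = initial_depth
    tried = 0
    while depth <= max_depth:
        tried = depth
        found: set[str] = set()
        visited = {truster}
        frontier = [truster]
        for _ in range(depth):  # a request with depth D travels D levels of contacts
            next_frontier = []
            for agent in frontier:
                for contact in contacts.get(agent, []):
                    if contact in visited:
                        continue
                    visited.add(contact)
                    if contact in knows_te:
                        found.add(contact)
                    next_frontier.append(contact)
            frontier = next_frontier
        if found:
            return found, depth
        depth *= 2  # D_i = 2 * D_{i-1}
    return set(), tried
```

In a chain such as tr → a → c → w, where only w has interacted with the TE, requests with depths 1 and 2 find nothing and the request with depth 4 reaches w at the third level.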

In addition to the described mechanism for locating witnesses, a TR can request the TE to provide a list of referee agents, where a referee RF of a TE is a TR that previously interacted with the TE and is willing to share its experience with other TRs.

Selecting contacts

Each TR can decide on the number of contacts K to use depending on how cautious the agent is, how important the interaction is, and how many resources the agent has. To reduce the maintenance overhead, contacts can be selected dynamically, when a TR requests advice on a TE, using the Contacts Selection Algorithm (Algorithm 2).

Performance analysis

It is often difficult to find suitable real-world data sets for a comprehensive evaluation of trust models, since the effectiveness of various trust models needs to be assessed under different environmental conditions and misbehaviors [2]. Therefore, in research on trust modeling for MASs, most existing trust models are assessed using simulation or synthetic data [2]. One of the most popular simulation test-beds for trust models is the agent reputation and trust (ART) test-bed proposed in [36]. However, even this test-bed does not claim to be able to simulate all experimental conditions of interest [2].

Simulation environment

We use simulation to evaluate the performance of the proposed model in a distributed, multi-agent environment, using the discrete-event multi-agent simulation toolkit MASON [37], with TEs as agents that provide services and TRs as agents that consume services. As in [15], we assume that the performance of a TE in a particular service is independent of its performance in another service. Therefore, without loss of generality, and in order to reduce the complexity of the simulation environment, it is assumed that there is only one type of service in the simulated system and that all trustees offer the same service with, possibly, different performance. In order to study the performance of the proposed trust model for TE selection, we compare it with the FIRE trust model [15], one of the well-known trust models for MASs and among the few models that define a mechanism to locate witnesses.

All agents are placed randomly in a rectangular working area. Each TR has a radius of direct communication to simulate the agent’s capability to interact with others, and all other agents within that range are direct neighbors of the TR. The simulation step is used as the time value for transactions. Interactions that take place in the same simulation step are considered simultaneous. TRs evaluate the trustworthiness of the TE(s), and then select the one that promises the maximum transaction importance. Locating TEs is not part of the trust model; therefore, TRs locate TE(s) through the system. Table 2 gives the number of agents, the dimensions of the working area, and the other parameters used for DTMAS and for the environment. FIRE-specific parameters are similar to those used in [15].

Table 2 Values of used parameters

Having selected a provider, the TR then uses the service and gains some benefits from the interaction. This benefit is referred to as UG. A TE can serve many users in a single step, and all TRs attempt to use the service in every step. For DTMAS, after each interaction, the TR updates the credibility of the provider and the credibility of witnesses. We did not consider the case where a TE can record all or part of the history of interactions to be able to provide referee lists to other TRs upon request.

We consider a mixture of well behaving and poorly behaving TEs in addition to those who alter their behavior randomly. Witnesses are categorized as agents who are honest, agents who strictly report negative feedback, agents who strictly report positive feedback, agents who report honestly with probability 0.5, or agents who strictly lie in that they always report the opposite of their beliefs. TRs are associated with nine different context categories randomly.

Since agents can freely join and leave, and they may be moving, the agent population can be very dynamic and agents can break old relationships and make new ones during their lifetimes. To address this, in our simulation, agents change their locations in the working area. When a TR changes its location, it will have a new set of neighbors. Therefore, changing an agent’s location changes its relationships with others, as well as its individual situation.

In each step, TRs are assumed to move a random distance between 0 and MaxMove in a random direction between 0 and 2π. When they reach an edge of the working area, they simply re-enter the working area from the opposite edge.
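The movement rule can be illustrated with the Python sketch below. It is not the MASON code used in the experiments; the function name, the use of uniform distributions, and the explicit random generator argument are assumptions made for illustration.

```python
import math
import random


def move(x: float, y: float, width: float, height: float,
         max_move: float, rng: random.Random) -> tuple[float, float]:
    """Take one random step and wrap around the edges of the working area."""
    distance = rng.uniform(0.0, max_move)
    angle = rng.uniform(0.0, 2.0 * math.pi)
    # The modulo makes an agent leaving one edge re-enter from the opposite edge
    new_x = (x + distance * math.cos(angle)) % width
    new_y = (y + distance * math.sin(angle)) % height
    return new_x, new_y
```

For instance, an agent at x = 99 in an area of width 100 that moves 3 units to the right ends up at x = 2.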

When bidding, an honest TE bids its UG category. This value is considered the transaction importance, whereas the UG of the interaction for the TR is the transaction importance divided by the context category of the TR to address its subjectivity.

Experimental results

We analyze the performance of DTMAS in terms of UG and the overhead of locating witnesses, and we compare the performance of DTMAS with that of FIRE. However, because FIRE assumes that witnesses are honest, we present the performance comparison of DTMAS with the use of honest witnesses as well as with the use of witnesses who are not necessarily honest. We refer to these two variants as “DTMAS - 2” and “DTMAS - 1”, respectively.

Figure 1 shows that selecting TEs using DTMAS performs consistently better than FIRE in terms of UG per agent, which indicates that DTMAS helps TRs select honest TEs from the population and gain better utility than that gained using the FIRE model. This is because DTMAS prefers TEs who have not been suspended for a longer time over those who promise higher benefits but have been suspended within a shorter period, in order to reduce the effect of a TE whose performance starts to reduce. DTMAS integrates FOCET [28] to adopt different context elements with different importance levels relating to their subjective requirements and environmental conditions. Additionally, using DTMAS, a TR excludes reports from any witness where the mean of the differences between the witness’s trustworthiness estimation and the TR’s trustworthiness estimation of TEs other than the one under consideration is above the witnesses differences threshold (WDT).

Figure 1 Average utility gain. Selecting providers using DTMAS performs consistently better than FIRE

Figure 2 shows the average communication overhead per transaction per agent when the witness-locating strategy is employed, calculated as the total number of messages passing over all edges divided by the product of the number of transactions and the number of TRs. The figure shows that DTMAS, with its contact-based architecture, has lower overhead than FIRE for locating witnesses. This is due to the contact selection strategy, which attempts to reduce the overlap of contacts’ zones. As shown in the figure, suspending the use of unreliable witnesses slightly reduces the overhead associated with consulting a larger number of witnesses. It is worth noting that the two variants of DTMAS used in this study, the one with honest witnesses and the one with witnesses who are not necessarily honest, achieve comparable results in terms of average UG and communication overhead. This indicates the ability of DTMAS to reduce the effect of dishonest witnesses and to work in an environment where a subset of witnesses may provide misleading information.

Figure 2 Average overhead. DTMAS has lower overhead than FIRE for locating witnesses

Conclusions and future work

This paper presented DTMAS, a scalable, decentralized model for trust evaluation. We presented a generic architecture that reduces the overhead of locating witnesses, which enhances the scalability of the model. DTMAS allows direct and indirect sources of trust information to be integrated, thus providing a collective trust estimation. Additionally, we introduced a temporary suspension mechanism to reduce the harm of misbehaving TEs and misbehaving witnesses. In short, we believe DTMAS can provide a trust measure that is sufficiently useful to be used in an open and dynamic MAS.

Dynamically determining parameter values such as the weight of each component in the model (x), HT, FT, etc., enabling TEs to actively promote their honesty to allow new and honest TEs to enter the system, and enhancing the scalability of the proposed architecture are considered as future work.


References

  1. Ramchurn SD, Huynh TD, Jennings NR (2004) Trust in multi-agent systems. Knowledge Eng Rev 19(1): 1–25.

  2. Yu H, Shen Z, Leung C, Miao C, Lesser VR (2013) A survey of multi-agent trust management systems. IEEE Access 1: 35–50.

  3. Burnett C (2011) Trust assessment and decision-making in dynamic multi-agent systems. PhD thesis, Department of Computing Science, University of Aberdeen.

  4. Khosravifar B, Bentahar J, Gomrokchi M, Alam R (2012) Crm: An efficient trust and reputation model for agent computing. Knowl-Based Syst 30: 1–16.

  5. Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393(6684): 409–10.

  6. Helmy A (2002) Architectural framework for large-scale multicast in mobile ad hoc networks. In: IEEE International Conference on Communications, ICC 2002, April 28 - May 2, 2002, New York City, NY, USA, 2036–2042. doi:10.1109/ICC.2002.997206.

  7. Tran TT (2010) Protecting buying agents in e-marketplaces by direct experience trust modelling. Knowl Inf Syst 22(1): 65–100.

  8. Sutton RS, Barto AG (1998) Introduction to Reinforcement Learning, 1st edition. MIT Press, Cambridge, MA, USA. ISBN 0262193981.

  9. Georgoulas S, Moessner K, Mansour A, Pissarides M, Spapis P (2012) A fuzzy reinforcement learning approach for pre-congestion notification based admission control. In: Proceedings of the 6th IFIP WG 6.6 International Autonomous Infrastructure, Management, and Security Conference on Dependable Networks and Services, AIMS’12, 26–37. Springer, Berlin, Heidelberg.

  10. Tran TT (2004) Reputation-oriented reinforcement learning strategies for economically-motivated agents in electronic market environments. PhD thesis, David R. Cheriton School of Computer Science, University of Waterloo.

  11. Pinyol I, Sabater-Mir J (2013) Computational trust and reputation models for open multi-agent systems: a review. Artif Intelligence Rev 40(1): 1–25. doi:10.1007/s10462-011-9277-z.

  12. Sabater J, Sierra C (2005) Review on computational trust and reputation models. Artif Intell Rev 24(1): 33–60.

  13. Sabater J, Sierra C (2001) Regret: A reputation model for gregarious societies. In: Fourth Workshop on Deception Fraud and Trust in Agent Societies.

  14. Noorian Z, Ulieru M (2010) The state of the art in trust and reputation systems: A framework for comparison. J Theor Appl Electron Commer Res 5(2): 97–117.

  15. Huynh TD, Jennings NR, Shadbolt NR (2006) An integrated trust and reputation model for open multi-agent systems. Autonomous Agents Multi-Agent Syst 13(2): 119–154.

  16. Wang Y, Vassileva J (2007) Toward Trust and Reputation Based Web Service Selection: A Survey. Int Trans Syst Sci Appl 3(2): 118–132.

  17. Wang Y, Zhang J, Vassileva J (2014) A super-agent-based framework for reputation management and community formation in decentralized systems. Comput Intell 30(4): 722–751.

  18. Tran T, Cohen R (2004) Improving user satisfaction in agent-based electronic marketplaces by reputation modelling and adjustable product quality. In: 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2004), 19-23 August 2004, New York, NY, USA, 828–835. IEEE Computer Society, Los Alamitos, CA, USA. ISBN 0-7695-2092-8.

  19. Kerr RC (2007) Toward secure trust and reputation systems for electronic marketplaces. Master’s thesis, Computer Science, University of Waterloo.

  20. Regan K, Cohen R, Tran T (2005) Sharing models of sellers amongst buying agents in electronic marketplaces. In: Proceedings of Decentralized, Agent Based, and Social Approaches to User Modelling Workshop, 75–79, Edinburgh, UK.

  21. Regan K, Cohen R (2005) Indirect reputation assessment for adaptive buying agents in electronic markets. In: Proceedings of business agents and the semantic web workshop (BASeWEB 05) Decentralized Agent Based Social Approaches User Modell Workshop, 121–130, Victoria, British Columbia, Canada.

  22. Beldona S (2008) Reputation based buyer strategies for seller selection in electronic markets. PhD thesis, Electrical Engineering & Computer Science, University of Kansas.

  23. Beldona S, Tsatsoulis C (2007) Reputation based buyer strategy for seller selection for both frequent and infrequent purchases. In: ICINCO 2007, Proceedings of the Fourth International Conference on Informatics in Control, Automation and Robotics, Robotics and Automation 2, Angers, France, May 9-12, 2007, 84–91. INSTICC Press, Portugal. ISBN 978-972-8865-83-2.

  24. Marsh S, Briggs P (2009) Examining trust, forgiveness and regret as computational concepts. In: Golbeck J (ed) Computing with Social Trust. Human Computer Interaction Series, 9–43. Springer.

  25. Marsh SP (1994) Formalising trust as a computational concept. PhD thesis, Department of Mathematics and Computer Science, University of Stirling, Scotland, UK.

  26. Teacy WL, Patel J, Jennings NR, Luck M (2006) Travos: Trust and reputation in the context of inaccurate information sources. Autonomous Agents Multi-Agent Syst 12(2): 183–198.

  27. Yu B, Singh MP (2002) An evidential model of distributed reputation management. In: Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems: Part 1, AAMAS ’02, 294–301. ACM, New York, NY, USA. doi:10.1145/544741.544809.

  28. Mokhtari E, Noorian Z, Ladani BT, Nematbakhsh MA (2011) A context-aware reputation-based model of trust for open multi-agent environments. In: Cory B, Pawan L (eds) Advances in Artificial Intelligence - 24th Canadian Conference on Artificial Intelligence, Canadian AI 2011, St. John’s, Canada, May 25-27, 2011. Proceedings, 301–312. Springer, Heidelberg, Berlin. ISBN 978-3-642-21042-6.

  29. Sen S (2013) A Comprehensive Approach to Trust Management. In: International conference on Autonomous Agents and Multi-Agent Systems, AAMAS 13, Saint Paul, MN, USA, May 6-10, 797–800. International Foundation for Autonomous Agents and Multiagent Systems, St. Paul, MN, USA. ISBN 978-1-4503-1993-5.

  30. Tran T, Cohen R, Langlois E (2014) Establishing trust in multiagent environments: realizing the comprehensive trust management dream. In: Paper presented at the 17th International Workshop on Trust in Agent Societies, Paris, France, on the 6th May 2014.

  31. Burnett C, Norman TJ, Sycara K (2010) Bootstrapping Trust Evaluations Through Stereotypes. In: Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: Volume 1, 8. International Foundation for Autonomous Agents and Multiagent Systems, Series AAMAS ’10, Richland, SC. ISBN 978-0-9826571-1-9.

  32. Burnett C, Norman TJ, Sycara KP (2011) Trust decision-making in multi-agent systems. In: Walsh T (ed) IJCAI 2011, Proceedings of the 22nd International Joint Conference on Artificial Intelligence, Barcelona, Catalonia, Spain, July 16-22, 2011, 115–120. AAAI Press/International Joint Conferences on Artificial Intelligence, Menlo Park, California.

  33. Haas ZJ (1997) A new routing protocol for the reconfigurable wireless networks. In: Proceedings of IEEE 6th International Conference on Universal Personal Communications, 12-16 October San Diego, CA, USA, 562–566. doi:10.1109/ICUPC.1997.627227, ISSN:1091-8442.

  34. Tran TT, Cohen R (2002) A reputation-oriented reinforcement learning strategy for agents in electronic marketplaces. Comput Intelligence 18(4): 550–565.

  35. Helmy A (2005) Contact-extended zone-based transactions routing for energy-constrained wireless ad hoc networks. IEEE Trans Vehicular Technol 54: 307–319.

  36. Fullam KK, Klos TB, Muller G, Sabater J, Schlosser A, Topol Z, Barber KS, Rosenschein JS, Vercouter L, Voss M (2005) A specification of the agent reputation and trust (art) testbed: Experimentation and competition for trust in agent societies. In: Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS ’05, 512–518. ACM, New York, NY, USA. doi:10.1145/1082473.1082551.

  37. Luke S, Cioffi-Revilla C, Panait L, Sullivan K, Balan G (2005) Mason: A multiagent simulation environment. Simulation 81(7): 517–527.

Author information

Corresponding author

Correspondence to Abdullah M Aref.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

AA participated in the creation of the model, carried out the simulation studies, and drafted the script. TT participated in the creation of the model and supervised the research. All authors read and approved the final manuscript.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Aref, A.M., Tran, T.T. A decentralized trustworthiness estimation model for open, multiagent systems (DTMAS). J Trust Manag 2, 3 (2015).
