Performance Analysis of the RepuScore Algorithm

We demonstrate the effectiveness of RepuScore through experiments using a) simulated logs that demonstrate specific properties of RepuScore and b) real logs from a non-profit organization. The real logs cover 20 days and were collected by five domains that the organization maintained. They contain information about 45K domains to which about 450K emails were sent, 55% of which were marked as spam by RBLs or rejected because the sender domain was determined not to exist through DNS reverse lookup.

Experiments:

  1. Effect of a on Reputation of a Trusted RepuCollector With Sudden Increase in the Amount of Spam it Transmits
  2. Effect of a on using Modified RepuScore Algorithm
  3. RepuScore using MailLogs from Non-Profit Organization
  4. Resilience to Sybil Attacks
  5. Participation Threshold and Initial Values for RepuCollector

Effect of a on Reputation of a Trusted RepuCollector With Sudden Increase in the Amount of Spam it Transmits

Spammers might attempt to thwart RepuScore by building reputation and then suddenly transmitting huge amounts of spam. In such cases, the reputation of the sender is expected to decrease, and the spammer should be removed from the trusted group within a minimal number of reputation aggregation intervals. To demonstrate the effectiveness of RepuScore, we created logs with 100 RepuCollectors spanning 45 reputation aggregation intervals. We selected a random number of RepuServers which reported to their local RepuCollectors. The number of emails and spam messages transmitted to and from an organization was perturbed using a random number; for example, since RepuScore creates a trusted group of reputable senders, the spam rate among them was set at under 20%, whereas a spamming domain’s spam rate was set at greater than 95%. (We see this trend in the logs from the non-profit organization.)

Figure 2 shows, as a function of a, the reputation of a RepuCollector whose spam output suddenly increased. For the first 30 reputation intervals, the RepuCollector built its reputation and attempted to become part of the trusted group. After reputation interval 30, the spam rate from the RepuCollector increased to 95%. The RepuCollector’s reputation is based on the reputation of all its RepuServers. The jump in the reputation value at the first interval is due to the value of a and the initial reputation value of the RepuDomain, which was set at 0.5. Therefore, the reputation of the RepuCollector for a = 0.9 decreased from 0.7 after the first reputation aggregation interval. When the sender does not propagate spam, the reputation should increase slowly, so that a high reputation indicates a long past history. Hence a high value of a associates a high reputation with a long history of good actions. If the sender propagates spam, the reputation should decrease immediately, reflecting the current actions of the sender. A low value of a guarantees an immediate reduction when the sender propagates spam.
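The experiment described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' simulator: the update rule (a weighted average of the previous reputation and the current interval's score, taken as 1 minus the spam rate) is an assumption inferred from the description of a, and the names `update_reputation` and `simulate` are hypothetical. Spam rates are held fixed at 20% and 95% rather than perturbed randomly.

```python
def update_reputation(previous, spam_rate, a):
    """One aggregation interval of the ASSUMED update rule:
    new = a * previous + (1 - a) * (1 - spam_rate)."""
    current = 1.0 - spam_rate
    return a * previous + (1.0 - a) * current


def simulate(a, initial=0.5, good_rate=0.2, bad_rate=0.95,
             switch=30, intervals=45):
    """Reputation per interval for a sender that behaves well for
    `switch` intervals and then starts sending mostly spam."""
    reputation = initial
    history = []
    for t in range(1, intervals + 1):
        rate = good_rate if t <= switch else bad_rate
        reputation = update_reputation(reputation, rate, a)
        history.append(reputation)
    return history


if __name__ == "__main__":
    for a in (0.1, 0.5, 0.9):
        h = simulate(a)
        print(f"a={a}: interval 30 -> {h[29]:.3f}, interval 45 -> {h[44]:.3f}")
```

Under this assumed rule, a low a tracks the current spam rate almost immediately, while a high a both builds and loses reputation slowly.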

Figure 2: Change in the reputation score of a trusted domain that transmits spam after reputation interval 30, as a function of a. The reputation eventually converges to (1 - average spam rate) over multiple reputation intervals. A high a puts more weight on the previous reputation score, whereas a low a puts more weight on the current score. Thus, for high values of a, it takes a long time for the reputation to build up, whereas for low values of a the decrease (or increase) in reputation is faster. The sudden drop from the initial score to the first interval is due to the effect of a. The RepuCollector’s initial reputation was set at 0.7. In subsequent intervals, the RepuCollector’s reputation is based on the reputation of all its RepuServers, which start at 0.5. Therefore, for a = 0.9, the reputation of the RepuDomain is around 0.55.
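The convergence to (1 - average spam rate) noted in the caption can be checked with a short derivation. It assumes the update has the weighted-average form below, with R_t the reputation after interval t and s a constant spam rate; this form and these symbols are assumptions used for illustration, not the paper's numbered equations.

```latex
% Assumed update rule, with 0 \le a < 1 and constant spam rate s:
R_t = a\,R_{t-1} + (1 - a)(1 - s)

% Subtract the candidate limit (1 - s) from both sides:
R_t - (1 - s) = a\,\bigl(R_{t-1} - (1 - s)\bigr)

% Unrolling over t intervals from the initial value R_0:
R_t = (1 - s) + a^{t}\,\bigl(R_0 - (1 - s)\bigr)
\;\xrightarrow[t \to \infty]{}\; 1 - s
```

The gap to the limit shrinks by a factor of a per interval, which is why a high a converges slowly and a low a converges quickly.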


Effect of a on using Modified RepuScore Algorithm

Equation 4 gives our change to the reputation algorithm to accommodate this behavior, in which reputation should build slowly but fall quickly once a sender starts spamming. Figure 3 shows the change in reputation under the modified algorithm. For a high a, the reputation increases gradually but decreases more rapidly.

Figure 3: In the modified RepuScore algorithm, a high value of a (other than 1.0) implies a gradual increase but a fast decrease in reputation when the domain starts spamming.
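One way to obtain the asymmetric behavior described for the modified algorithm is to swap the two weights whenever the current score falls below the previous reputation. The sketch below is an assumption made for illustration only: Equation 4 is not reproduced in this section and may differ, and `update_modified` is a hypothetical name.

```python
def update_modified(previous, spam_rate, a):
    """ASSUMED asymmetric update: slow to rise, fast to fall.

    When the current score is below the previous reputation, the
    weights are swapped so the current score dominates for high a.
    """
    current = 1.0 - spam_rate
    if current < previous:
        # Behavior worsened: current score gets weight a.
        return (1.0 - a) * previous + a * current
    # Behavior is as good or better: previous reputation gets weight a.
    return a * previous + (1.0 - a) * current


if __name__ == "__main__":
    a = 0.9
    print(update_modified(0.5, 0.20, a))  # good sender, slow rise
    print(update_modified(0.8, 0.95, a))  # starts spamming, sharp fall
```

With a = 0.9, this variant moves a good sender only a tenth of the way toward its current score per interval, but moves a sender that starts spamming nine tenths of the way down.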


RepuScore using MailLogs from Non-Profit Organization

Figure 4 shows the modified RepuScore algorithm with collaboration among multiple domains, using the 20-day logs from the non-profit organization. The reputation of the spamming domain decreased, whereas the reputation of a good domain increased.

Figure 4: Using the modified RepuScore algorithm with the 20-day logs from a non-profit organization. A number of new RepuCollectors were introduced at different reputation aggregation intervals.


Resilience to Sybil Attacks

We increased the percentage of malicious RepuCollectors from 10% to 30% to demonstrate RepuScore’s resilience to Sybil attacks. Each RepuCollector transmits a high amount of spam (> 95%) for the first 30 reputation aggregation intervals. After 30 reputation intervals, we had the Sybil attacker start increasing the reputation of its own Sybil domains and decreasing the reputation of other domains. Figure 5 shows our results. The reputation of the Sybil domains steadily decreased, whereas the reputation of the non-Sybil domains increased.



Participation Threshold and Initial Values for RepuCollector

Having an appropriate initial value for a RepuCollector’s reputation is extremely important for maintaining a trusted group of reputable senders. For instance, if the initial reputation scores for the RepuCollector and RepuServers are set too high, it would take a long time for the reputation to decrease. On the other hand, if the initial reputation is set too low, it would take a long time for the reputation of a non-spamming RepuCollector to increase.

Our experiments show that an ideal initial reputation value for the RepuServer and the RepuCollector is between 0.5 and 0.7. With different initial values, we noted that the average reputation of all the domains in the logs from the non-profit organization converged to about 0.6 for a = 0.1, 0.47 for a = 0.5, and 0.36 for a = 0.9. Hence, an ideal initial reputation should be equal to the average reputation of all domains in the system after a long period of time. For new reputation domains to participate in the reputation aggregation intervals, the participation threshold should be 0.1 to 0.3 below the initial reputation.
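The relationship between the initial reputation and the participation threshold can be expressed as a small helper. This is a sketch under stated assumptions: the function names are hypothetical, the default margin of 0.2 is simply the midpoint of the 0.1 to 0.3 range given above, and the rule that a domain participates when its reputation is at or above the threshold is assumed rather than taken from the text.

```python
def participation_threshold(initial_reputation, margin=0.2):
    """Threshold placed `margin` below the initial reputation.

    The margin is restricted to the 0.1 to 0.3 range suggested in
    the text; the result is clamped so it never drops below 0.
    """
    if not 0.1 <= margin <= 0.3:
        raise ValueError("margin should lie between 0.1 and 0.3")
    return max(0.0, initial_reputation - margin)


def may_participate(reputation, initial_reputation, margin=0.2):
    """ASSUMED rule: a domain participates in aggregation while its
    reputation is at or above the threshold."""
    return reputation >= participation_threshold(initial_reputation, margin)


if __name__ == "__main__":
    initial = 0.6  # within the 0.5 to 0.7 range reported above
    print(participation_threshold(initial))
    print(may_participate(0.45, initial))
    print(may_participate(0.35, initial))
```

Placing the threshold below the initial value gives a newly joined domain some room for early fluctuations before it is excluded from aggregation.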