
Aggregate Performance Metric, Pt. I - Overview

Posted by: source (Friday, November 6, 2020)

Introduction

In anticipation of Wolf PRO being released sometime in the near future, I developed a new framework for ranking players called Aggregate Performance Metric (APM). It works by calculating two primary factors, player impact and team success, which are then combined and adjusted for quality of competition. Each player contributes a certain percentage of their rank points to two pools, impact points and success points. The ratio of those contributions is determined by the number of players in the match. In essence, each player makes two bets: 1) on themselves to meet or exceed expectations, and 2) on their team to meet or exceed expectations. Expectations for impact and success are determined by relative rank within the match. Impact points are distributed based on each player's performance in a number of weighted statistical categories, and success points are distributed based on the outcome of the match. A player's rank after a match is a percentage of the difference between the points they had prior to the match and the points attributed to them after the match.
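
As a rough illustration of that flow (the real formulas come in Pt. II; the function name, the fixed split, and the stake percentage below are placeholders, and the quality-of-competition adjustment is left out):

```python
# Hypothetical sketch of the point flow described above; every name and
# number here is illustrative only, and the quality-of-competition
# adjustment is omitted for brevity.

def apm_round(rank_points, impact_scores, winners, success_share=0.5, stake_pct=0.2):
    """rank_points: {player: current rank points}
    impact_scores: {player: weighted statistical score for the match}
    winners: set of players on the winning team
    success_share: fraction of the pool bet on team success (in APM this
                   ratio depends on the number of players in the match)
    stake_pct: fraction of each player's points put at risk this match"""
    # 1) Each player stakes a slice of their rank points.
    stakes = {p: pts * stake_pct for p, pts in rank_points.items()}
    pool = sum(stakes.values())

    # 2) The pool is split between an impact pool and a success pool.
    impact_pool = pool * (1 - success_share)
    success_pool = pool * success_share

    # 3) Impact points are paid out in proportion to each player's weighted stats.
    total_impact = sum(impact_scores.values()) or 1
    payout = {p: impact_pool * impact_scores[p] / total_impact for p in rank_points}

    # 4) Success points are paid out to the winning team, split by stake.
    winning_stake = sum(stakes[p] for p in winners) or 1
    for p in winners:
        payout[p] += success_pool * stakes[p] / winning_stake

    # 5) The redistribution is zero-sum: sum(payout.values()) == pool.
    return {p: rank_points[p] - stakes[p] + payout[p] for p in rank_points}
```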


Problem

The current system (Elo) is unsuitable for use with multiplayer games because:

1) It doesn’t account for the disparity between players within a match;
2) It doesn’t account for the individual impact that a player has on the game, particularly in the event of a loss.
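To make point 2 concrete with classic Elo applied at the team level (the ratings and K-factor below are illustrative only): two teammates who lose the same match receive the same adjustment no matter how differently they performed.

```python
# Illustrative only: a plain team-Elo update treats every player on a team
# identically, regardless of individual performance (point 2 above).
K = 32

def team_elo_delta(team_avg, opp_avg, won):
    expected = 1 / (1 + 10 ** ((opp_avg - team_avg) / 400))
    return K * ((1.0 if won else 0.0) - expected)

# A 30-5 player and a 2-25 player on the same losing team get the same change:
print(team_elo_delta(team_avg=1500, opp_avg=1500, won=False))  # -16.0 for both
```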


Requirements

I set the following requirements for a replacement system to be well-rounded:

> It must correctly rank a player in each of 27 (3³) possible scenarios (enumerated after this list), where the factors are player impact, team success, and quality of competition, given that each of these may take on a positive, neutral, or negative value.
> It must be scalable to work correctly for player counts between 2 and 64.
> Each match must be a closed system where the number of points is fixed and the sum of the redistribution of points is 0.
> Player ranks are adjusted based on expectations. Those that exceed expectations increase their rank, those that meet expectations stay in the same position, and those that fail to meet expectations decrease their rank.
> It must work at minimum for objective format games (Objective, Stopwatch).
> It must account for players who played incomplete rounds.
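
For reference, the scenario count in the first requirement comes from three factors each taking one of three values; a quick enumeration, purely for illustration:

```python
# The 27 (3^3) scenarios referenced in the first requirement above.
from itertools import product

LEVELS = ("positive", "neutral", "negative")
scenarios = list(product(LEVELS, repeat=3))  # (impact, success, competition)
assert len(scenarios) == 27
```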


Assumptions

I made the following assumptions when constructing APM:

> The fewer people in a match, the more important it is to win; the more people in a match, the more important it is to have an individual impact.
> The fewer people in a match, the more important it is for teams to be balanced. A team's expectation to win is not linear, in the sense that a significant gap between team ranks becomes increasingly difficult to overcome.
> The more people in a match, the more difficult it is to win in spite of any particular rank gap between teams: as variance decreases, confidence increases.
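
One possible way to translate these assumptions into a split between impact and success contributions is sketched below; the curve and its endpoints are placeholders, not the split APM actually uses (that is defined in Pt. II):

```python
# Illustrative only: a success/impact split that shifts with player count,
# in the direction the assumptions above describe.

def success_weight(player_count, min_players=2, max_players=64):
    """Fraction of a player's stake bet on team success; the rest is bet on
    individual impact. Smaller matches weight the win more heavily."""
    t = (player_count - min_players) / (max_players - min_players)
    return 0.8 - 0.6 * t  # 80% success at 1v1, down to 20% at 32v32

for n in (2, 6, 12, 64):
    print(n, round(success_weight(n), 3))
```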




For a mathematical breakdown of how it works, see Aggregate Performance Metric, Pt. II - Breakdown.
For examples of it working in 6v6 matches, see Aggregate Performance Metric, Pt. III - Examples.


Posted on Friday, November 6, 2020 at 10:35:12 PM by -doNka-
For reference:
The current Elo system is here: https://github.com/donkz/RTCW-stats-py-sci/blob/master/tests/elo.py

Lines specific to RTCW are 82-97.

In short, players are ranked by the number of kills they have in a given match.
Kills with special weapons are worth half a point.
Players are ranked 1 through x.
Ranks are awarded or docked 1.5 points for a win or a loss, and then players are matched against all of their opponents based on their existing Elo scores.

Players with a high Elo are expected to finish near the top of the rankings, and players with a low Elo near the bottom.
Any difference results in Elo being added to or subtracted from a player.
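
A rough sketch of that procedure as described (this is not the actual elo.py code; the special-weapon handling, the K-factor, and where the 1.5-point win/loss adjustment enters are assumptions):

```python
# Sketch of the kill-rank Elo scheme described above, not the real elo.py.

def rank_players(kills, special_kills, winners):
    """Normal kills count 1, special-weapon kills 0.5; winners get +1.5 and
    losers -1.5 (one reading of the adjustment above); rank 1..x by score."""
    scores = {}
    for p in kills:
        score = kills[p] + 0.5 * special_kills.get(p, 0)
        score += 1.5 if p in winners else -1.5
        scores[p] = score
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {p: i + 1 for i, p in enumerate(ordered)}

def expected_better(elo_a, elo_b):
    """Standard Elo expectation that A finishes above B."""
    return 1 / (1 + 10 ** ((elo_b - elo_a) / 400))

def update_elo(elos, ranks, teams, k=32):
    """Compare each player against every opponent: whoever out-ranked the
    other 'won' that pairing, and Elo shifts by expectation vs. outcome."""
    new = dict(elos)
    for a in elos:
        for b in elos:
            if a == b or teams[a] == teams[b]:
                continue
            actual = 1.0 if ranks[a] < ranks[b] else 0.0
            new[a] += k * (actual - expected_better(elos[a], elos[b]))
    return new
```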

The rankings can only be calculated from metrics that can be derived from RTCW logs, and therefore should be completely objective.

Any player can have a good or a bad game. That's okay because Elo is calculated over hundreds of games (right now, about a thousand for each player). You can see the Elo rankings on the front page in the season-closing posts.

You can see for yourself if it works for you.

The Elo system has been used across a decade of Quake Live and Xonotic games, and is derived from the mathematical theory for ranking chess players.

Posted on Saturday, November 7, 2020 at 05:52:38 AM by source
I know that you are using a slight modification right now, but since I don't know how many people reading this are going to care about the math, I wanted to stick to classic Elo for comparisons. The gather bot just goes off standard Elo at the moment, but integrating this into a new bot is a whole other thing. There are a number of different systems out there that are either modified versions of Elo or different altogether, like Microsoft's TrueSkill system; this is more or less another flavour, tailored for use with team- and class-based objective games.

Elo assumes that 1) wins alone are indicative of greater skill between opponents, and 2) relative skill is spread across a standard normal distribution. That's great for 2-player games like chess (or Quake duels) but is pretty bad for larger teams. The system I created has a lot of the same elements as Elo, including expectation vs. outcome and scaling the difference, but takes a more holistic approach. Practically speaking, you could think of this as a way of combining the objectives of both the Elo and rankPts systems that we use right now. I'm also not sure that a standard normal distribution is even relevant with such a small and inconsistent community.
