Statistics – Part 2

In this post, we’ll cover Mean, Median and Mode. We’ll keep things to the point and won’t go deep into any of them in this series.

Mean
There are three different types of mean:

  • Arithmetic mean
  • Geometric mean
  • Harmonic mean

In this series we’ll cover the arithmetic mean only. From here on, I’ll refer to the arithmetic mean simply as the mean.

In simple words, the mean is the average. But the average of what?
We’ll get to that, but first take a look at the following table of players’ scores:

Players Scores
15 35 85 55 95 65
45 20 25 60 80 20

These are the scores of 12 players for a game they played.

Now, if we want to know ‘what score do the players achieve on average?’, we can simply take the average of all the observed values. (Remember, a data set is a set of values.)

Average = sum of all observations / total number of observations 

Now, we have
total observations = 12, and
sum of all observations = 15 + 35 + 85 + … + 80 + 20 = 600

Therefore, Mean = 600 / 12 = 50

So the average score that players achieved in the game is 50.
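The same calculation can be sketched in a few lines of Python. This is only an illustration: the function name `mean` and the variable `scores` are my own choices, and the list simply repeats the 12 scores from the table.

```python
# The 12 player scores from the table above.
scores = [15, 35, 85, 55, 95, 65, 45, 20, 25, 60, 80, 20]


def mean(values):
    """Arithmetic mean: sum of all observations / number of observations."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


total = sum(scores)   # sum of all observations
count = len(scores)   # total number of observations
print(total, count, mean(scores))
```

Python's standard library also ships `statistics.mean`, which does the same job.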

But that only tells us about the data as a whole. What if we want to know more detail? What if we want to know whether the game was difficult or easy? How can we deduce that from our data set?

This brings us to our next topic.

Median
The median is the midpoint of our data.

To find the midpoint, we need to sort our data from the lowest score to the highest. Take a look at the following table:

Ordered Players Scores
Player Score Player Score
1 15 7 55
2 20 8 60
3 20 9 65
4 25 10 80
5 35 11 85
6 45 12 95

How we find the median depends on the total number of observations. The median is the value that lies in the middle of all the ordered values. We have 12 observations. What is the midpoint of these values? It is difficult to say, since no single value fulfills this criterion. If we had 13 observations, we could say that the 7th value is the median, because it has 6 values on either side. In that case, this would be our expression:

Median = value at position (n + 1) / 2
where n = total number of observations

But we have 12 values, and that is a problem. If we use the formula above, we get (12 + 1) / 2 = 6.5, and there is no value at position 6.5.
How can we get the median in this case?

If we look at the table again, we can see that the 6th and 7th values lie in the middle, but neither of them is the true median. To solve this dilemma, just take the average of these two values.
6th value = 45
7th value = 55
Their average = (45+55)/2 = 100/2 = 50

So, for an even number of observations, our formula is:

Median = (value at position a + value at position b) / 2
where n = total number of observations,
a = n/2,
b = n/2 + 1

For n = 12 this gives a = 6 and b = 7, the two middle positions.

We got a median of 50, which means half of the players scored below 50 and half scored above it. This suggests that the game is neither easy nor difficult.
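Here is a minimal Python sketch of both cases, odd and even. The function name `median` is my own; the code sorts the data first, then either picks the single middle value or averages the two middle values (the 6th and 7th for our 12 scores). Note that Python lists are indexed from 0, so the 6th and 7th values sit at indexes 5 and 6.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)      # organize from low score to high score
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: a single middle value exists (0-based index mid).
        return ordered[mid]
    # Even n: average the two values on either side of the middle.
    return (ordered[mid - 1] + ordered[mid]) / 2


scores = [15, 35, 85, 55, 95, 65, 45, 20, 25, 60, 80, 20]
print(median(scores))
```

For real work, `statistics.median` from the standard library handles both cases as well.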

Most of the time when you look at a data set, you will notice that certain values repeat. This raises a question: which value is the most common?
This brings us to our last topic, the mode.

Mode
The mode is the data point that occurs with the highest frequency in the data set. To put it simply, the value that appears most often in the data set is the mode.

Frequency of scores
Score Frequency Score Frequency
15 1 60 1
20 2 65 1
25 1 80 1
35 1 85 1
45 1 95 1
55 1

We can see that 20 is the only score that occurs twice in our data set; every other score occurs once. So 20 is our mode.
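A frequency table like this is easy to build in Python with `collections.Counter`. The sketch below is just one way to do it; the variable names are my own, and the list is the same 12 scores used throughout the post.

```python
from collections import Counter

scores = [15, 35, 85, 55, 95, 65, 45, 20, 25, 60, 80, 20]

# Count how many times each score appears.
frequency = Counter(scores)

# Print the frequency table, ordered by score.
for score in sorted(frequency):
    print(score, frequency[score])

# most_common(1) returns a list with the single (value, count) pair
# that has the highest count.
most_common_score, count = frequency.most_common(1)[0]
print("mode:", most_common_score, "occurs", count, "times")
```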

Easy. Right?

Yes, but keep the following points in mind when finding the mode.

  • The mode does not have a minimum frequency.
    • Consider the following data set:
      1,4,5,5,3,6,8,2,7,8,9,0
      And another data set:
      1,4,5,5,5,5,3,6,8,2,7,8
      In the first data set, 5 occurs twice (and so does 8, so the two are tied), and in the second data set 5 occurs four times. In both cases, 5 is a mode.
      However, in the second case 5 is clearly the most common value, whereas in the first case it could be a ‘by-chance’ situation. Always check how often the mode actually occurs before drawing conclusions from it.
  • A single data set can have multiple modes.
    • Consider the following data set:
      1,4,4,4,5,3,6,6,6,8,2
      We can see that 4 and 6 each occur three times. Therefore, both 4 and 6 are our modes.
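Both points can be checked with a short Python sketch. It uses `statistics.multimode` (available since Python 3.8), which returns every value tied for the highest frequency. The helper `modes_with_frequency` is a name I made up for this example; it also reports how often the modes occur, so you can judge whether a mode is meaningful or just a ‘by-chance’ repeat.

```python
from collections import Counter
from statistics import multimode


def modes_with_frequency(values):
    """Return (list of all modes, number of times each mode occurs)."""
    modes = multimode(values)          # all values tied for the top count
    if not modes:                      # empty data set: no mode at all
        return [], 0
    return modes, Counter(values)[modes[0]]


first = [1, 4, 5, 5, 3, 6, 8, 2, 7, 8, 9, 0]
second = [1, 4, 5, 5, 5, 5, 3, 6, 8, 2, 7, 8]
third = [1, 4, 4, 4, 5, 3, 6, 6, 6, 8, 2]

for data in (first, second, third):
    print(modes_with_frequency(data))
```

Notice that the first data set reports every tied value, each occurring only twice, while the second reports a single mode that occurs four times.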

So we have seen how the mean, median and mode can help us evaluate our data in different contexts and answer questions that would otherwise be difficult to answer. Add a good visual representation, such as a histogram, and we get a good idea of what is going on with our data.

We’ll study some other concepts that are useful in statistics in future posts.


Statistics – Part 1

Statistics is the study of methods for data collection, analysis, and interpretation, and of the principles of experimental design.

What is Data?
Data is any set of values. It can be qualitative or quantitative. In simple words, data is a collection of values, numbers, or anything else that carries some information.

Is data good or bad?
Well, to put it simply, data can be both. If we can get some useful information from the data, or deduce something by going through it, then the data is generally considered good.
But is it really so?
Even good data can be further classified as good or bad. Confusing, isn’t it?

Think about these questions:
How much of that data is useful?
Is it valuable?
What is the quality of data?
Is the data biased? 

You got the gist, right? When we ask these questions we slowly begin to understand whether data is good or not.

How is data collected?
Data is collected through observations and measurements.

What types of data are there?
Data can be broadly divided into two types:

  • Primary data (collected by us)
  • Secondary data (collected by others)

What is the importance of organised data?
Organised data is easier to understand. It saves us time and money, allows us to make good decisions, and helps us work with precision.

How do we represent data?
There are many ways to represent data. The most common are:

  • Tables
  • Charts
  • Graphs

There are several other ways to plot our data, such as dot plots, histograms, pie charts, etc. We’ll go through these in upcoming posts.

Made a Game for GameJam

So last month I participated in a 48-hour game jam at my university. I was in a team of 5 people. The theme of the game jam was “You are villain”.

We discussed the theme and decided that we should make something simple yet fun.

We wanted to create something that everyone can relate to. And what better game than the classic Pacman? We thought, let us give our players the ability to control the ghosts. It sounded nice, but how to go about it? Which ghost should we let the player control? Instead of picking one of the ghosts, we picked all of them.

Yes, you read that right: all of the ghosts could be controlled by the player.

But that raises the question of how to control all of them. Let the user switch between them with buttons? That was one solution, but we went nuts and let the player control all the ghosts simultaneously. Yep, we were going to let the player manipulate all the ghosts from a single controller.

We used an Xbox controller, with the left thumbstick, right thumbstick, directional pad and 4 buttons (A, X, B, Y) controlling the ghosts, and a second controller for the pacman. So essentially we made it a 2-player game, with one person controlling the pacman and the second person controlling all 4 ghosts.

We were sure that people were going to go crazy figuring out the controls of the game.

Since Halloween was approaching, we opted to give our game a thematic look. I created pixel art characters, as you can see below.

We made our pacman a Van Helsing lookalike, hence PacVan (our title for the game). Our 4 ghosts were Werewolf, Zombie, Dracula and Ghost (yeah.. a simple one).

Each of us designed one level, so in total we got 5 playable levels. The current game is simple, and you can get it from here and play.

There are many features, like Portal travel, Selective passing, Hide behind Grass, Time trial, etc. We have loads of ideas that could be incorporated into the game, and we may implement them some other time.

We got many compliments from players who played the game. It was pretty fun to watch them figure out the controls and miss PacVan by moving the wrong ghost.

In the end, people playing as the ghosts won more often than people playing as PacVan, despite the obvious fumbling with the controls.

It was a great experience for me, and I’d love to be part of such events in the future as well.