
Compute the Average, Min, Max and Mode of a List in Python

October 24, 2013

Lists are by far the most common data type you will use in Python. I wanted to take a few minutes to show how easy Python makes working with lists. I also want you to notice how much ground we can cover just by thinking about a few properties of a list. In this article we will talk about lists, functions, searching, return values, data types, dictionaries, try-except blocks and other interesting Python techniques.

So we will be finding the average, min, max and mode of a list. I won’t go into the different theories on how to find these efficiently, but rather take a straightforward approach. Let us start with the assumption that all the values in our list are integers and that the data is already stored in a variable named list. As before, we will be using Python 2.7; the Python 2.7 documentation is a good reference if you need it. With that in place, our main function looks like this:


def main():

	list = [3,4,1,20,102,3,5,67,39,28,10,1,4,34,1,6,107,99]

	avg(list)
	min(list)
	max(list)
	mode(list)

This simply calls 4 different functions, each of which finds and prints what we are looking for. First, let’s go over some basic list properties. In all of these functions we will use for loops to iterate over the elements of the list. Luckily, Python has a simple syntax that lets the loop variable take each value of the list in turn.

What does this mean? You will recall that lists are zero-indexed, which lets us access their elements directly. In our example, list[0] has the value 3 and list[1] has the value 4. We can access each element individually and directly, and we can use the same syntax to change values in the list. If we write list[0] = 55 in our code, the value of list[0] changes to 55. Here is some example code:


list = [3,4,1,20,102,3,5,67,39,28,10,1,4,34,1,6,107,99]

print list

print list[0]
print list[1]

list[0] = 55

print list[0]
print list

This code will output:

[3, 4, 1, 20, 102, 3, 5, 67, 39, 28, 10, 1, 4, 34, 1, 6, 107, 99]
3
4
55
[55, 4, 1, 20, 102, 3, 5, 67, 39, 28, 10, 1, 4, 34, 1, 6, 107, 99]

Notice that the contents of the list changed after the assignment. Now let us recall the for loop syntax in Python. The syntax is for element in sequence. That means we need to provide a variable name, in this case element, and a sequence such as a string, list or tuple. So if we wanted to print the numbers 1 to 10, we could put those numbers in a list and print element on each pass through the loop. Like this:


for element in [1,2,3,4,5,6,7,8,9,10]:
	print element

And the output:

1
2
3
4
5
6
7
8
9
10

Cool. However, what if we need the numbers 1 to 1,000? Well, you can either write them all out in a list, or you can use the Python range function. This function can be called in 3 ways. You can call it with 1 integer, and it will return the numbers from 0 up to, but not including, that number in increments of 1. You can provide a start and an end number (2 integers), and you will get back the numbers from the start up to, but not including, the end in increments of 1. Finally, you can give range 3 integers, which will serve as start, stop and increment. Here is an example to demonstrate:


# range with 1 argument, returns a list with the numbers 0 to 9 in increments of 1
print range(10)

# range with 2 arguments, returns a list with the numbers 5 to 9 in increments of 1
print range(5,10)

# range with 3 arguments, returns a list with the numbers 5 to 45 in increments of 5
print range(5,50,5)

Output:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[5, 6, 7, 8, 9]
[5, 10, 15, 20, 25, 30, 35, 40, 45]

Remember that the range function does not include the last number, so range(10) gave us the numbers 0 to 9 without 10. Let’s combine the range function with list indexing to access and print the elements of a list.


list = [3,4,1,20,102,3,5]

for i in range(len(list)):
	print list[i]

This code will output:

3
4
1
20
102
3
5

This is one way to do it. Note that we are using the Python len function to get the length of the list. len is a built-in function in Python that returns the length of any sequence.
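To illustrate the point about any sequence, here is a minimal sketch. The helper name indices and the sample values are made up for this example, and the code is written so that it runs on Python 2.7 as well as Python 3.

```python
def indices(sequence):
    """Return the valid index positions of any sequence as a list."""
    # len() works on strings, tuples and lists alike.
    return list(range(len(sequence)))


print(len("python"))            # a string has a length too
print(len((3, 4)))              # so does a tuple
print(indices([3, 4, 1, 20]))   # the positions we could loop over
```

An empty sequence has length 0, so a loop over range(len(...)) simply never runs its body.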

Another way to accomplish the same task is to let the for loop hand us each element directly, with no indexing. Like this:


for element in list:
	print element

Both versions produce the same output. There are some cases (as we are going to see) where we will need the first form and some where we will use the second. It all depends on your program. Now let’s see how we can calculate the average, min, max and mode of a list. First, what is the average of a list? It is just the sum of the elements divided by the number of elements. We can get the number of elements using the len function, and we can iterate through the elements and add them all up. Like this:


def avg(list):

	sum = 0
	for elm in list:
		sum += elm

	print "The average element of the list is: " + str(sum/(len(list)*1.0))

That was very simple. Let me just clarify some things. You will notice that in the calculation we multiply the length of the list by 1.0. We do this so that the length, an integer, is turned into a float. Otherwise Python 2.7 would perform integer division and drop the fractional part of the average. We also need to convert the average from a float to a string in order to concatenate it with another string.
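As a minimal sketch of the same idea, the version below returns the average instead of printing it, which makes it easier to reuse. The name average and the decision to raise ValueError for an empty list are assumptions made for illustration; they are not part of the original program. Here float() plays the same role as multiplying by 1.0.

```python
def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if len(values) == 0:
        # Assumption: an empty list has no meaningful average, so fail loudly
        # instead of dividing by zero.
        raise ValueError("cannot average an empty list")

    total = 0
    for elm in values:
        total += elm

    # float() forces true division, so the fractional part survives
    # on Python 2.7 as well as Python 3.
    return float(total) / len(values)


print("The average element of the list is: " + str(average([3, 4, 1, 20])))
```

Returning the value leaves the caller free to print it, store it or compare it with something else.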

Now let us move on to the min/max problem. Essentially these are two sides of the same coin. Here is our strategy to solve it:

  1. Assume the first element is our Minimum/Maximum value.
  2. Go through the rest of the elements in the list.
  3. If you find an element smaller/larger than the current Minimum/Maximum, it becomes the new Minimum/Maximum.
  4. Return the Minimum/Maximum.

Hopefully that algorithm was easy enough to follow. Let us take a look at how it looks in Python:


def min(list):

	min = list[0]

	for elm in list[1:]:
		if elm < min:
			min = elm

	print "The minimum value in the list is: " + str(min)

def max(list):

	max = list[0]

	for elm in list[1:]:
		if elm > max:
			max = elm

	print "The maximum value in the list is: " + str(max)

And the output: (assume that: list = [3,4,1,20,102,3,5,67,39,28,10,1,4,34,1,6,107,99])

The minimum value in the list is: 1
The maximum value in the list is: 107

Now let us get to the hard part, computing the mode, which will turn out to be not so hard. What is the mode of a list? It is the number (or numbers) that occurs most often. We are going to use 2 powerful tools Python has in its toolbox: dictionaries and try-except statements.

Dictionaries in Python are essentially lists where you can define the key. Instead of an index that starts at 0 and goes up, you can use almost anything you like as the key. I would love to spend some time on them, but that might have to be another post by itself. Try-except blocks deserve a post of their own too, but think of them as a way to handle the errors that would otherwise end in a traceback. In other words, if Python executes a statement that raises an error, you can specify what to run instead, and the program will continue rather than terminate. I would love to talk more about these 2, but for the time being you will have to read about them in the Python documentation or Google them.
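Here is a minimal sketch of both tools working together. The dictionary contents and the helper name lookup are invented for illustration; only the dictionary syntax and the KeyError exception come from Python itself.

```python
# A dictionary maps keys we choose to values (hypothetical sample data).
stock = {"apples": 3, "pears": 0}
stock["plums"] = 7            # assigning to a new key adds it


def lookup(d, key):
    """Return d[key], or None when the key is missing."""
    try:
        return d[key]
    except KeyError:
        # Without this block a missing key would stop the program
        # with a traceback.
        return None


print(lookup(stock, "plums"))
print(lookup(stock, "kiwis"))
```

The second call asks for a key that does not exist, yet the program keeps running because the KeyError is caught.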

So what is our strategy to find the mode of a list? Well, first we can count the number of occurrences of each element in the list. We can use a dictionary to store the counts and use the elements themselves as the keys. Let us see what that gives us:


list = [3,4,1,20,102,3,5,67,39,10,1,4,34,1,6,107,99]

d = {}
for i in list:
	try:
		# Seen before: add one to its count
		d[i] += 1
	except KeyError:
		# First time we see this value: start its count at 1
		d[i] = 1

print d

This will output the following:

(Screenshot: the printed dictionary, in which 1 maps to 3, both 3 and 4 map to 2, and every other value maps to 1.)

When we print a dictionary we get all the keys and their values, each pair separated by a colon (:). We can also get all the keys in the dictionary by calling d.keys(), which gives us a list of keys. The rest of the code involves finding the largest value in the dictionary and then printing the keys that have that value. Here is how that part looks:


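keys = d.keys()

# Start with the count of the first key, then look for a larger one
max = d[keys[0]]
for key in keys[1:]:
	if d[key] > max:
		max = d[key]

print "The mode of the list is: ",
for key in keys:
	if d[key] == max:
		print key,
print " with the mode of: " + str(max)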

(Screenshot: the program reports 1 as the mode of the list, with a count of 3.)
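Putting the counting step and the search step together, a self-contained sketch might look like the following. The function name find_modes, the choice to return a sorted list of modes together with their count, and the use of print() so the code also runs on Python 3 are all assumptions for this example rather than the code linked from this post.

```python
def find_modes(values):
    """Return (modes, count): the most frequent values and how often they occur."""
    # Step 1: count the occurrences of each element.
    counts = {}
    for elm in values:
        try:
            counts[elm] += 1
        except KeyError:
            counts[elm] = 1

    # Step 2: find the highest count.
    highest = 0
    for key in counts:
        if counts[key] > highest:
            highest = counts[key]

    # Step 3: collect every key that reaches the highest count.
    modes = sorted(key for key in counts if counts[key] == highest)
    return modes, highest


data = [3, 4, 1, 20, 102, 3, 5, 67, 39, 10, 1, 4, 34, 1, 6, 107, 99]
modes, count = find_modes(data)
print("The mode of the list is: " + str(modes) + " with a count of " + str(count))
```

Returning a list of modes covers the case where several values tie for the highest count.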

Here is a link to the full code to Compute a Mode of a list in Python.

That’s it. We figured out everything we set out to do in the beginning. But wait, there is more!

BONUS Time: Find the range of the numbers in the list. The range of a list of numbers is simply the maximum of the list minus the minimum of the list. Since we already have functions for those, this is super easy. Let’s take a look (note that a slight modification of the min/max functions is required so they return values instead of printing them):


print "The range of the list is: " + str(max(list) - min(list))
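One possible shape for that modification is sketched below. The names find_min and find_max are hypothetical, chosen so that the built-in min and max are left alone, and print() is used so the sketch runs on Python 2.7 and Python 3 alike.

```python
def find_min(values):
    """Return the smallest element of a non-empty list."""
    smallest = values[0]
    for elm in values[1:]:
        if elm < smallest:
            smallest = elm
    return smallest


def find_max(values):
    """Return the largest element of a non-empty list."""
    largest = values[0]
    for elm in values[1:]:
        if elm > largest:
            largest = elm
    return largest


data = [3, 4, 1, 20, 102, 3, 5, 67, 39, 10, 1, 4, 34, 1, 6, 107, 99]
print("The range of the list is: " + str(find_max(data) - find_min(data)))
```

Because both functions return their result, the subtraction works directly on the two return values.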

By now you should have a better understanding of how lists work in Python. The tools we looked at are extremely useful, and you will find them in almost every Python application. I am including here a full copy of the .py file I used to Compute the Mean, Min, Max, Mode and Range of a list in Python. I hope you have had some fun and learned something along the way. Here is the output of the complete program for:


list = [3,4,1,20,102,3,5,67,39,10,1,4,34,1,6,107,99]

(Screenshot: output of the complete program for this list.)

Any questions?


  • joaquin

    I'm not sure of the purpose of the article. In Python, overwriting built-in functions or objects is considered bad practice (you should not name a list list, because then you will be unable, for example, to create a list from a tuple). Further, you should indicate that sum(), min() and max() are already Python built-in functions, like range() or len(), and thus it is not necessary to program them (except for teaching purposes). They work out of the box and are already optimized. Sorry if I misunderstood something.

    • CptDeadBones

      The purpose of this article is to demonstrate the use of a list in Python. It is purely an educational exercise. You are 100% right that it is bad practice to rewrite built in functions. However, what if you do not have them? It is always a good skill to know how to code the same thing multiple ways.

      • joaquin

        You could call them my_list, my_sum, or similar. The reader, who is learning how to code, must also learn not to overwrite default names. Stack Overflow is full of questions about 'strange' problems that turn out to be due to these, often involuntary, bad practices. On the other hand, I think the post would be more complete if you let the reader know about the existence of these functions as integral parts of the language.

        Along the same lines, note that in Python 3.4 the new statistics module provides basic statistical functions like mean, median and mode.

        My two cents. btw I like your blog. Thanks for the work.

        • Nico

          Combining all suggestions we get the fragment below, which is reasonably Pythonic in style, but doesn't tell us a lot about lists:

          values = [3,4,1,20,102,3,5,67,39,28,10,1,4,34,1,6,107,99]

          def average(v):
              return sum(v)/len(v)

          def mode(v):
              # max() picks the (value, count) pair with the highest count
              value, count = max(dict((x, v.count(x)) for x in v).items(),
                                 key=lambda i: i[1])
              return value

          print("Average is: ", average(values))
          print("Minimum is: ", min(values))
          print("Maximum is: ", max(values))
          print("Mode is: ", mode(values))
          print("Range is: ", max(values)-min(values))

          • Nico

            Sorry: indentation was removed mysteriously in reply above.

          • CptDeadBones

            Thank you Nico.

        • CptDeadBones

          I will try to incorporate your notes in future posts. Every programming problem has multiple solutions. I like to show the way I would solve it if I were a beginner programmer with limited tools at my disposal (including Internet connection). I meet a lot of new programmers who do not know basic things because they rely on it being already done for them. Using built-in functions is an optimization specific to Python, and it should be done. The solution I provided can be ported to other languages with the ideas remaining intact.
          Thanks for commenting!

  • Jacob

    I concur with joaquin, you should definitely use the built-in functions max(), min(), sum() and len() to calculate the statistical values.

    In addition, you should change the variable name "list" to something else, as this is also the name of a built-in function. See the list of all built-in functions: http://docs.python.org/3/library/functions.html

    As a bonus, here is a one-liner for calculating the mode using a dict comprehension (assuming the list variable is named data):

    number, mode = max(dict((k, data.count(k)) for k in data).items(), key=lambda i: i[1])

    (It may not be the fastest way to calculate it, but one-liners are fun!)

    • Andrew Jhonson

      I agree with both of you that this might not be the most efficient code. I also agree with Joaquin that you should use built-in functions if you can. However, this post helped me understand for loops a little better and encouraged me to go find out what try-catch blocks are. I think it was helpful. Maybe not 100% correct, but helpful.

      BTW. Love your 1 liner mode, but can you explain it?

      • CptDeadBones

        Thanks! Sorry if I misled anyone, I will fix it at once!

    • CptDeadBones

      I concur with him and you as well. I hope you will forgive me about using list as a variable name, Python was perfectly fine with it.

      I like your 1 liner mode computation, good job!

    • wuzelwazel

      Here's another one-liner:

      mode = sorted(data, key=data.count)[-1]

      Doesn't return the number of times the mode appeared in the list but you could do that with another line:

      num = data.count(mode)

      The two lines run together about 20% faster than the previous one-liner as written. However, replacing the dict() call on the tuple with a dictionary comprehension seems to run faster:

      number, mode = max({k: data.count(k) for k in data}.items(), key=lambda i: i[1])

      The two lines above are only 5% faster than this last one-liner on my machine.

  • Ouya

    I think the purpose of this article is to show how you can find the maximum and minimum values within a list without using the built-in functions. It is probably also how the built-in functions max() and min() are written. I found this article very useful, thank you!

    • CptDeadBones

      You are welcome! Thank you for your comment!

  • Truth be told

    Can anyone please tell me how to calculate average of a dictionary efficiently..

    • Thomas.S.A

      You can access dictionaries as lists using .keys() and .values().

      e.g avg(mydic.values()) is probably what you want