# Ignore Outliers with Excel TRIMMEAN

From what I've seen in workbooks over the years, SUM is the most frequently used Excel function, and AVERAGE is the runner-up. Would you agree, or do you see other functions used more often than those two?

Last week, a client asked about excluding some of the highest and lowest numbers from a data set, to give a better average. It looked like some of those outliers had been incorrectly entered (an extra zero, or a mistyped numbers), and they were throwing off the results.

So, we did some tests with the TRIMMEAN function, to compare the results with a simple AVERAGE formula.

Do you ever use TRIMMEAN? I'm not a statistician, but it could be useful for giving a slightly different perspective on your data.

## Exclude a Percentage of Data

With the TRIMMEAN function, you can exclude a specific percentage of the data points from the top and bottom of the data set. TRIMMEAN returns the average (mean) of the remaining interior data points.

In the sample data for this demo, there is a list of quantities sold, for stationery orders. There are 20 records, with some outliers at the top and bottom. The data will be trimmed by 25%, to get the trimmed mean.

**NOTE**: In this example, the quantities are sorted in ascending order, so it is easier to see the top and bottom numbers. Numbers do NOT need to be sorted, for the TRIMMEAN function to calculate correctly.

Item |
Qty |

Binder | 3 |

Pen | 4 |

Pen | 14 |

Pencil | 20 |

Binder | 22 |

Pencil | 30 |

Pen | 36 |

Binder | 41 |

Pen | 44 |

Binder | 52 |

Pencil | 59 |

Pen | 65 |

Pen | 66 |

Pen | 72 |

Binder | 78 |

Pencil | 81 |

Pen | 84 |

Binder | 85 |

Pencil | 86 |

Binder | 97 |

### TRIMMEAN Function Arguments

The TRIMMEAN function requires 2 arguments: **TRIMMEAN(array, percent)**

In this example,

- the values for the
**array**are in cells B2:B21 - the trim
**percent**is entered in cell E3, as 25% (or 0.25)

To calculate the trimmed mean, enter this formula in cell E4:

=TRIMMEAN(B2:B21,E)

The TRIMMEAN result (53.06) is different from the AVERAGE (51.95), which is shown in cell E5.

### How TRIMMEAN Works

In this example, there are 20 values, and the trim percent is 25%.

To calculate how many number to trim,

- the values are counted, then multiplied by the trim percentage (e.g. 20 * .25 = 5
- That number is divided by 2, to get the number to trim at each end ( e.g. 5 / 2 = 2.5)
- To remove an equal number of data points at each end, the number is rounded down to the nearest integer ( e.g. INT( 2.5) = 2)

So, in this example, the top 2 (86,97) and bottom 2 (3,4) data points will not be included in the average that TRIMMEAN calculates.

To verify the result, use the AVERAGE function with cells B4:B19 – ignoring the top 2 and bottom 2 numbers. This returns 53.06 – the same result as the TRIMMEAN function in cell E4.

### Watch the Video

Watch this video to see how to set up a TRIMMEAN formula, and see how it works.

### Download the Sample File

To follow along with the video, and to see the TRIMMEAN example, you can download the sample file from the Excel Average Functions page on my website. The file is in xlsx format, and does not contain macros.

____________________

Hi Debra,

Since I have little experience with these more statistical functions – can you tell me is there is a trimmean function that can exclude data one sidedly from dat (i.e. trim away the highest or lowest x percent, but not symmetrical)

Hi ikkeman,

You could use AVERAGEIF to calculate the mean for everything above the number in the x% position. Using the same data as in the sample for this article:

I too would like such a function. The outliers I count are often e.g. in an average of work order days outstanding, where some work orders have been allowed to remain open long after they should have been closed. Rather than allow them to skew the mean, I exclude them (manually) and list the outliers with a recommendation that the offending orders be closed and the database cleansed. I certainly don’t want to trim an equal amount off the bottom! Who wants that?

Note that TRIMMEAN(1-(1E-16)) gives the median, naturally, but that TRIMMEAN(1) gives an error.

@derek, you could use AVERAGEIF or AVERAGEIFS, to calculate the mean based on criteria, such as an empty cell in another column.

In the example for this article, you could calculate an average for quantities where the Item column is not blank:

I tried to Count Unique Items in Pivot Table using method 2 with COUNTIF formula and found if the “person” is not region specific then in some cases the person number will be ZERO for some regions. The problems repeatedly found whenever the “goods” are counted instead of “person”. The reson I figured out is that the COUNTIF was applying to the whole database without boundies across the regions. Am I right? Please comment.

I just tweeted you Debra as you retweeted this page a few hours ago.

In terms of the TRIMMEAN() function, all is well. What I think Ikkeman and Derek were asking was best explained by using the InterQuartile Range IQR and Tukey’s Fences.

The IQR = Quartile 3 – Quartile 1 = 50.75 = 78.75 – 28.00 … using QUARTILE.INC() or PERENTILE.INC()

Tukey’s Lower Inner Fence = Q1 – (1.5 * IQR) = 28.00 – (1.5 * 50.75) = -48.13

Tukey’s Upper Inner Fence = Q3 + (1.5 * IQR)= 50.75 + (1.5 * 50.75) = 154.88

These are your Outlier boundaries which you can also see if you construct a Box & Whisker Plot (very easy in Office365)

These calculations confirm there are no outliers in this data set!

Incidentally, why multiply by 1.5 in the fence calculations? Because Tukey said so! Really! If you want to find the Outer fences, you use 3 instead of 1.5 and can conclude if anything is outside those fences, something is REALLY wrong!

Thanks Duncan! That’s a very helpful explanation. I did warn that I’m not a statistician. ;-)