Pivot Tables or SUMIFS?

The scenario

Today I had a coaching call with a student who is taking the SE1-B course and SE2 course. Let's call her Jeni.

Jeni is not new to the Excel skills covered within SE1-B course and SE2 course. She understood already some portions of the course materials. Through these two courses, she got much deeper understandings to the portion she was familiar with and learned many new skills.

Jeni started a new job in an accounting firm three weeks ago. This job requires more advanced Excel skills than what she had from previous job.

Before they create the general ledger, they need to process the large volume raw transaction data, and summarize them based on different grouping.

The step of summarizing those raw data has been done with pivot tables. However recently, an incident has caused a distrust from management about using the pivot tables approach.

One colleague in this department somehow did not include the entire input data set and went ahead with pivot tables, and caused the results to be wrong, and consequently general ledger was also wrong.

The management asked Jeni and the person who is guiding Jeni to this position to scrap the old pivot table approach, instead, they want her to use formula approach. They showed her the spreadsheet from another department. According to Jeni, the formulas used in that spreadsheet is humongous. If you print a formula from one cell on a page, it could be half page long, with lots of layers of nesting of SUMIFS, INDEX, MATCH, IF functions etc...

The SE2 Super Excel Data Processing & Analysis course does teach how to create the long nested multiple layers of functions. Jeni could potentially create those long formulas too. At this point, she is new to this position, and still in the stage of understanding all those processes. Trying to create those insanely long formulas seems to be more risky than any potential benefit.

So, should she insist using the old pivot table approach, or use the formula approach? What are the pros and cons of either approach.

If she choose to use formula approach, should she create those half page long formulas?

Pivot Tables

First of all, let's understand what pivot table really does.

Pivot table is the Excel tool to help you summarize large volume of data easily and quickly, without any knowledge of functions and formulas. You can slice it and dice it in many different ways, and get the insights and summaries instantly.

Traditionally, pivot table is done against one single table to get summary. As Microsoft keeps improving and upgrading Excel application by introducing data model and Power Pivot, you can now bring in multiple tables and set relationships among the tables, and create pivot tables off them.  This is essentially equivalent to using Microsoft Access application but in an Excel interface. This is much much more advanced topic, and it is not covered within the SE1-B Super Excel Pivot Tables & Charts course.

In a grand scheme, most Excel users only create pivot table against one single table to get the summary. This is what we will focus on in this article when talking about pivot tables.

SUMIFS function

Now, let's understand what the SUMIFS function does.

SUMIFS function is a formula to sum a range based on multiple criteria. There are definitely commonality among pivot tables and SUMIFS function. Both can summarize results based on various grouping.

The difference is obvious. SUMIFS is a function, in order to use it properly, you need to understand the structure of the function, and the different arguments within the function. You also need to have the foundational knowledge of formula references before using any function.

When you try to summarize the results based on different grouping with SUMIFS function, you have to list all the groups first, before you create the SUMIFS function.  If you do not know what all those possible groups are, you may want to quickly create a pivot table first to summarize the results by groupings, so you see all those groups.

 

SUMIFS vs Pivot Tables

Now you see the dichotomy here. If I already got the summary results by pivot table for the different groups in few seconds, why do I bother to write SUMIFS function to summarize the results again.

Pivot table only summarize data, which means it mostly performs sum, count, average, max, min calculation for different group. If we use formulas to accomplish the same thing, we would use SUMIFS, COUNTIFS, AVERAGEIFS, MAXIFS, MINIFS functions.

Pivot table is very easy and quick to do, and it has its own structure and rules. You could even set it up so the pivot table refreshes automatically, and new data could be automatically included in the source data of the pivot table.

But, pivot table is not really a presentation tool. It is more of an analytical tool. This means most final reports are not pivot tables.

If your final reports are in pivot tables, you may find the structure and format may not be consistent from one period to another.

If your reports need to show metrics from multiple sources and various comparisons, you soon realize your pivot tables will be very messy if not impossible.

By creating reports with formulas, the structure and format will stay the same regardless which period you are reporting. Your formulas can calculate against different source data tables and integrate them in one nice report.

If you have strong formula skills, your reports can be completely automated. Every period you just dump the raw data into the input worksheets, then change the report period variable, then the whole report is updated automatically by the existing formulas.

Three approaches to generate periodic reports

Pivot tables approach

Apparently, Jeni's workplace reports are not in pivot tables. They use pivot tables to generate the summaries for different groups. Then they would manually copy the required results into the various spots of the report.

This is a common approach in creating reports. It happens in many organisations and businesses. For people who do not have advanced function/formula skills, this is the best approach. It does require a little bit manual work every reporting period. If the report is not too complex, the manual work is manageable.

Formulas with intermediate steps approach

However, for people who have advanced formula skills and generate more complex reports, the pivot table approach would be way too manual and prone to manual errors. They would set up multiple stages of formulas to get the intermediate results, and the final report formulas will extract data and results from those intermediate stages.

Formulas without (or with minimum) intermediate steps approach

And for people who have even more advanced formulas skills, they may not make use of many intermediate steps. They can create complex nested formulas and/or array formulas in final report to directly summarize and extract from the source data tables.

This appears to be the case for the report being shown to Jeni, where formula in one single cell could be half page long.

Which approach do you use?

So now, back to Jeni's case, what should she do? Which approach should she take?

First, changing approach from pivot table to formulas does not address the cause of the original problem at all. The cause of the original problem is that the person who created pivot table did not include the entire source data set into the pivot table.

This could be due to two reasons. One, this person does not have deep enough understanding of Pivot Table. Two, this person has enough understanding of pivot table, but is not careful enough to recognize the issue in the first place, or not having a routine step of check and balance.

In the first scenario, this person may seem to be able to create pivot tables, as it has been done many times in the same way. But he or she does not have deep enough understanding of those steps, and may not realize this time, there might be a blank row in the data set. When you insert pivot table, the system will only take the consecutive range of your cursor position, and not including all the rows under the blank row into the pivot table.

(If you are taking the SE1-B Super Excel Pivot Tables & Charts course, you will be taught to check the source data range and pay attention to blank rows or columns in the source data).

Formulas are harder to learn than pivot tables. If one does not have deep enough understanding about pivot tables, it will be even harder for this person to learn and master deep enough knowledge about formulas.

And if one is not care enough in creating pivot table, this person could be not care enough in creating formulas too.

If one does not have strong formula skills, this process should remain in pivot table approach. If your Excel skills is at SE1-B course level, and not SE2 course level yet, this would be the approach.

However for somebody who does have strong formulas skills at SE2 course level, formula approach will automate the periodic reporting process more, and minimize manual touches and errors. You will create this reporting model in multiple stages of formulas, so you will not have those insanely long nested formulas.

If your Excel skills is at SE3 course level, you may incline to automate the reporting model to its extreme, and minimize the intermediate steps as much as you can.

You will likely nest those many layers of formulas in array formula, and enjoy doing that. However, from the management point's view, this may not be the best for the team.

Most Excel users are far from reaching this level of Excel skills. A reporting model at this complexity becomes very difficult to maintain.

If anything goes wrong, you are the only person in the team who can possibly fix it. If you are sick that day, on one else can solve the problem. You become the bottleneck in the chain.

Of course, if multiple team members have this level of Excel skills, it would be nice to automate this reporting model to this degree.

So, depending on your Excel skills, your team members' Excel skills, management's risk tolerance level, you will take the approach that fits the situation.