group by


How to “group by” and sum or count in LibreOffice Calc (Excel)

In the world of spreadsheet applications, LibreOffice Calc stands out as a versatile and powerful tool for managing data. It offers a wide array of features that can help you organize, analyze, and make sense of your data. One of these features, often underutilized, is the Subtotals functionality. In this blog post, we’ll explore how to use the Subtotals functionality in the Data menu of LibreOffice Calc to count how many times each item is repeated in a dataset. This is particularly useful when working with large datasets or lists, as it allows you to create summary reports without the need for complex formulas or manual counting.

Preparing Your Data

Start by opening LibreOffice Calc and loading the dataset you want to analyze. Ensure that your data is organized in columns and that each item you want to count is in a separate column. For example, if you have a list of products, each product name should be in its own column.

Sorting Your Data

To use the Subtotals functionality effectively, your data needs to be sorted by the column containing the items you want to count. To sort your data:

  • Select the entire dataset by clicking and dragging your mouse.
  • Go to the “Data” menu, and then click on “Sort.”

Sort Data

  • In the “Sort Criteria” dialog box, select the column containing the items you want to count.
  • Choose the sorting order (ascending or descending), and click “OK.”

Your data is now sorted and ready for subtotal analysis.

Using the Subtotals Functionality

With your data sorted, you can now use the Subtotals functionality:

  • Select the entire dataset again.
  • Go to the “Data” menu and click on “Subtotals.”

Subtotals

In the “Subtotals” dialog box, you’ll see options for grouping and summarizing your data. By default, it may suggest using the first column for grouping, which is what you want in most cases.

Subtotals Dialog

  • In the “Function” dropdown, choose the type of summary you want, which is “Count” in this case.
  • Make sure that the “Replace current subtotals” option is selected.
  • Click “OK.”

LibreOffice Calc will now calculate the subtotal counts for each item in your dataset and insert them into your spreadsheet. It will also group items together and provide an outline to help you navigate the summary.

Subtotals Result

The Subtotals functionality creates a summary of your data by grouping items and counting them. You can expand and collapse these groups using the outlined symbols to the left of the spreadsheet. This allows you to view the summary data in a more organized manner.

The Subtotals functionality in LibreOffice Calc is a powerful tool for analyzing data and generating summary reports. Whether you’re working with product lists, customer data, or any other dataset, Subtotals can help you count how many times each item is repeated without the need for complex formulas or manual counting. By following the steps outlined in this blog post, you can harness the full potential of LibreOffice Calc and make your data analysis tasks more efficient and accurate. Give it a try, and you’ll be amazed at how Subtotals can streamline your data analysis workflow.


Count how many submissions per score

Recently, we had access to a database that contained the scores of a programming competitions system.
The database contained several contests, each contest contained several challenges and any competitor could make multiple submissions.
We wanted to extract a couple of charts showing

  • how many submissions we had per score and
  • how many submissions we had per score while filtering out the best submission (max score) per contestant per challenge per contest

The following code will return the number of submissions per score per challenge per contest.


SELECT contest_id, challenge_id, TRUNCATE(score, 1), COUNT(*)
FROM submissions
GROUP BY contest_id, challenge_id, TRUNCATE(score, 1)
ORDER BY contest_id, challenge_id, TRUNCATE(score, 1);

The next one will return the number of submissions per score per challenge per contest while filtering out the best submission (max score) per contestant per challenge per contest:


SELECT contest_id, challenge_id, TRUNCATE(max_score, 1), COUNT(*)
FROM
(
  SELECT contest_id, challenge_id, competitor_id, MAX(score) AS max_score
  FROM submissions
  GROUP BY contest_id, challenge_id, competitor_id
) AS max_scores
GROUP BY contest_id, challenge_id, TRUNCATE(max_score, 1)
ORDER BY contest_id, challenge_id, TRUNCATE(max_score, 1);


How to “group by” and sum in Excel

In the following example we used the Subtotal feature to create a spreadsheet that shows the partial sum based on another column.

Methodology:

  1. Click the Data tab in Excel’s ribbon toolbar
  2. Click the Sort button to sort our data by the user column
  3. Click the Subtotal button and fill in the dialog as appropriate, then click OK

In our example, we had only two columns (User and Lot), we wanted to produce the total sum of lots per user and so filled the dialog as follows:

Subtotal

What the above options do is the following:

  • Use the User column to group on by checking when its value changes
  • Use the Sum function on the columns that will be selected later on
  • Apply the function on the Lot column