group by


Count how many submissions per score

Recently, we had access to a database containing the scores from a programming competition system.
The database held several contests, each contest had several challenges, and any competitor could make multiple submissions.
We wanted to extract the data for a couple of charts showing:

  • how many submissions we had per score and
  • how many submissions we had per score when counting only the best submission (max score) per competitor per challenge per contest

The following code will return the number of submissions per score per challenge per contest.

SELECT contest_id, challenge_id, TRUNCATE(score, 1), COUNT(*)
FROM submissions
GROUP BY contest_id, challenge_id, TRUNCATE(score, 1)
ORDER BY contest_id, challenge_id, TRUNCATE(score, 1);
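
To make the bucketing concrete, here is a minimal Python sketch of what grouping by TRUNCATE(score, 1) amounts to: each score is cut to one decimal place (not rounded) and the buckets are counted. The helper name truncate_1 and the sample scores are made up for illustration; they are not from the database described above.

```python
from collections import Counter
from decimal import Decimal, ROUND_DOWN


def truncate_1(score):
    """Cut a score to one decimal place without rounding (hypothetical helper)."""
    # Going through str() and Decimal avoids binary floating-point surprises.
    return Decimal(str(score)).quantize(Decimal("0.1"), rounding=ROUND_DOWN)


# Hypothetical scores for a single challenge.
scores = [0.99, 0.91, 0.45, 0.40, 1.0]

# Count how many scores fall into each truncated bucket.
histogram = Counter(truncate_1(s) for s in scores)

for bucket, count in sorted(histogram.items()):
    print(bucket, count)
```

Note that 0.99 lands in the 0.9 bucket rather than 1.0, which is the difference between truncating and rounding.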

The next one will return the number of submissions per score per challenge per contest, counting only the best submission (max score) per competitor per challenge per contest:

SELECT contest_id, challenge_id, TRUNCATE(max_score, 1), COUNT(*)
FROM
(
  SELECT contest_id, challenge_id, competitor_id, MAX(score) AS max_score
  FROM submissions
  GROUP BY contest_id, challenge_id, competitor_id
) AS max_scores
GROUP BY contest_id, challenge_id, TRUNCATE(max_score, 1)
ORDER BY contest_id, challenge_id, TRUNCATE(max_score, 1);
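
To see how the two result sets differ, the sketch below runs both queries against a tiny in-memory SQLite table from Python. The sample rows are invented, and because SQLite has no two-argument TRUNCATE, the expression CAST(x * 10 AS INTEGER) / 10.0 is used here as a stand-in; treat it as an approximation of the original queries rather than a drop-in replacement.

```python
import sqlite3

# Hypothetical sample data: (contest_id, challenge_id, competitor_id, score).
rows = [
    (1, 1, 101, 0.25),
    (1, 1, 101, 0.75),
    (1, 1, 102, 0.75),
    (1, 1, 102, 0.50),
    (1, 1, 103, 1.00),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE submissions ("
    "contest_id INTEGER, challenge_id INTEGER, "
    "competitor_id INTEGER, score REAL)"
)
conn.executemany("INSERT INTO submissions VALUES (?, ?, ?, ?)", rows)


def bucket(column):
    """Stand-in for TRUNCATE(column, 1); the CAST truncates toward zero."""
    return f"CAST({column} * 10 AS INTEGER) / 10.0"


# Every submission counts.
all_submissions = conn.execute(f"""
    SELECT contest_id, challenge_id, {bucket('score')}, COUNT(*)
    FROM submissions
    GROUP BY contest_id, challenge_id, {bucket('score')}
    ORDER BY contest_id, challenge_id, {bucket('score')}
""").fetchall()

# Only each competitor's best submission per challenge counts.
best_submissions = conn.execute(f"""
    SELECT contest_id, challenge_id, {bucket('max_score')}, COUNT(*)
    FROM (
        SELECT contest_id, challenge_id, competitor_id,
               MAX(score) AS max_score
        FROM submissions
        GROUP BY contest_id, challenge_id, competitor_id
    ) AS max_scores
    GROUP BY contest_id, challenge_id, {bucket('max_score')}
    ORDER BY contest_id, challenge_id, {bucket('max_score')}
""").fetchall()

print(all_submissions)
print(best_submissions)
```

With this sample, the first query counts all five submissions, while the second counts three rows, one per competitor.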


How to “group by” and sum in Excel

In the following example, we used the Subtotal feature to create a spreadsheet that shows partial sums of one column, grouped by the values of another column.

Methodology:

  1. Click the Data tab in Excel’s ribbon toolbar
  2. Click the Sort button and sort the data by the User column, so that rows for the same user are adjacent
  3. Click the Subtotal button and fill in the dialog as appropriate, then click OK

In our example, we had only two columns (User and Lot). We wanted to produce the total of lots per user, so we filled in the dialog as follows:

[Screenshot: the Subtotal dialog]

The options above do the following:

  • “At each change in” is set to User, so Excel starts a new group whenever the value in the User column changes
  • “Use function” is set to Sum, the function applied to the columns selected in the next option
  • “Add subtotal to” has Lot checked, so the sum is computed over the Lot column
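
The same sort-then-subtotal idea can be sketched in Python. itertools.groupby, like Subtotal, starts a new group at each change in the key, so the data must be sorted first. The rows below are hypothetical (User, Lot) pairs, not the data from our spreadsheet.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical (User, Lot) rows, deliberately unsorted.
rows = [("bob", 5), ("alice", 10), ("bob", 7), ("alice", 3), ("carol", 4)]

# Step 2 of the methodology: sort by the User column first.
rows.sort(key=itemgetter(0))

# groupby starts a new group at each change in the key, so the sort
# above is what keeps each user's rows together.
subtotals = {
    user: sum(lot for _, lot in group)
    for user, group in groupby(rows, key=itemgetter(0))
}

print(subtotals)  # {'alice': 13, 'bob': 12, 'carol': 4}
```

Skipping the sort would give bob and alice two separate groups each, which is what happens in Excel when Subtotal is applied to unsorted data.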