Online Qualification Round Problem for Google Hash Code 2017

Streaming videos

Problem statement for Online Qualification Round, Hash Code 2017


Have you ever wondered what happens behind the scenes when you watch a YouTube video? As more and more people watch online videos (and as the size of these videos increases), it is critical that video-serving infrastructure is optimized to handle requests reliably and quickly.

This typically involves putting in place cache servers, which store copies of popular videos. When a user request for a particular video arrives, it can be handled by a cache server close to the user, rather than by a remote data center thousands of kilometers away.

But how should you decide which videos to put in which cache servers?


Given a description of cache servers, network endpoints and videos, along with predicted requests for individual videos, decide which videos to put in which cache server in order to minimize the average waiting time for all requests.

Problem description

The picture below represents the video serving network.


Each video has a size given in megabytes (MB). The data center stores all videos. Additionally, each video can be put in 0, 1, or more cache servers. Each cache server has a maximum capacity given in megabytes.


Each endpoint represents a group of users connecting to the Internet in the same geographical area (for example, a neighborhood in a city). Every endpoint is connected to the data center. Additionally, each endpoint may (but doesn’t have to) be connected to 1 or more cache servers.

Each endpoint is characterized by the latency of its connection to the data center (how long it takes to serve a video from the data center to a user in this endpoint), and by the latencies to each cache server that the endpoint is connected to (how long it takes to serve a video stored in the given cache server to a user in this endpoint).


The predicted requests provide data on how many times a particular video is requested from a particular endpoint.

Input data set

The input data is provided as a data set file – a plain text file containing exclusively ASCII characters with a single \n character at the end of each line (UNIX-style line endings).

Videos, endpoints and cache servers are referenced by integer IDs. There are V videos numbered from 0 to V − 1, E endpoints numbered from 0 to E − 1 and C cache servers numbered from 0 to C − 1.

File format

All numbers mentioned in the specification are natural numbers that fit within the indicated ranges. When multiple numbers appear in a single line, they are separated by a single space.

The first line of the input contains the following numbers:

  • V ( 1 ≤ V ≤ 10000) – the number of videos
  • E ( 1 ≤ E ≤ 1000) – the number of endpoints
  • R ( 1 ≤ R ≤ 1000000) – the number of request descriptions
  • C ( 1 ≤ C ≤ 1000) – the number of cache servers
  • X ( 1 ≤ X ≤ 500000) – the capacity of each cache server in megabytes

The next line contains V numbers describing the sizes of individual videos in megabytes: S0, S1, …, SV−1. Si is the size of video i in megabytes ( 1 ≤ Si ≤ 1000).

The next section describes each of the endpoints one after another, from endpoint 0 to endpoint E − 1. The description of each endpoint consists of the following lines:

  • a line containing two numbers:
    • LD ( 2 ≤ LD ≤ 4000) – the latency of serving a video request from the data center to this endpoint, in milliseconds
    • K ( 0 ≤ K ≤ C ) – the number of cache servers that this endpoint is connected to
  • K lines describing the connections from the endpoint to each of the K connected cache servers.
    Each line contains the following numbers:

    • c ( 0 ≤ c < C ) – the ID of the cache server
    • Lc ( 1 ≤ Lc ≤ 500) – the latency of serving a video request from this cache server to this endpoint, in milliseconds. You can assume that latency from the cache is strictly lower than latency from the data center ( 1 ≤ Lc < LD ).

Finally, the last section contains R request descriptions in separate lines. Each line contains the following numbers:

  • Rv ( 0 ≤ Rv < V ) – the ID of the requested video
  • Re ( 0 ≤ Re < E ) – the ID of the endpoint from which the requests are coming
  • Rn ( 0 < Rn ≤ 10000) – the number of requests


5 2 4 3 100
50 50 80 30 110
1000 3
0 100
2 200
1 300
500 0
3 0 1500
0 1 1000
4 0 500
1 0 1000

Example input file explanation.

5 videos, 2 endpoints, 4 request descriptions, 3 caches 100MB each.
Videos 0, 1, 2, 3, 4 have sizes 50MB, 50MB, 80MB, 30MB, 110MB.
Endpoint 0 has 1000ms datacenter latency and is connected to 3 caches:
The latency (of endpoint 0) to cache 0 is 100ms.
The latency (of endpoint 0) to cache 2 is 200ms.
The latency (of endpoint 0) to cache 1 is 300ms.
Endpoint 1 has 500ms datacenter latency and is not connected to a cache.
1500 requests for video 3 coming from endpoint 0.
1000 requests for video 0 coming from endpoint 1.
500 requests for video 4 coming from endpoint 0.
1000 requests for video 1 coming from endpoint 0.
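
The file format above can be read with a few lines of Python. This is only a minimal sketch: the names `Endpoint`, `Problem` and `parse_input` are made up for illustration and are not part of the problem statement. It relies on the fact that all values are whitespace-separated integers, so the file can be consumed as one flat token stream.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    dc_latency: int                                    # LD, in milliseconds
    cache_latency: dict = field(default_factory=dict)  # cache ID -> Lc, in ms


@dataclass
class Problem:
    num_caches: int    # C
    capacity: int      # X, in MB
    video_sizes: list  # S0 .. SV-1, in MB
    endpoints: list    # one Endpoint per endpoint ID
    requests: list     # (Rv, Re, Rn) tuples


def parse_input(text: str) -> Problem:
    """Parse a data set laid out as in the file format section (hypothetical helper)."""
    tokens = iter(text.split())

    def nxt() -> int:
        return int(next(tokens))

    V, E, R, C, X = (nxt() for _ in range(5))
    video_sizes = [nxt() for _ in range(V)]

    endpoints = []
    for _ in range(E):
        ld, k = nxt(), nxt()
        ep = Endpoint(dc_latency=ld)
        for _ in range(k):
            c, lc = nxt(), nxt()
            ep.cache_latency[c] = lc
        endpoints.append(ep)

    requests = [(nxt(), nxt(), nxt()) for _ in range(R)]
    return Problem(C, X, video_sizes, endpoints, requests)


# The example input file from above.
EXAMPLE = """5 2 4 3 100
50 50 80 30 110
1000 3
0 100
2 200
1 300
500 0
3 0 1500
0 1 1000
4 0 500
1 0 1000
"""

problem = parse_input(EXAMPLE)
print(problem.endpoints[0].cache_latency)  # {0: 100, 2: 200, 1: 300}
```

Reading tokens instead of lines keeps the parser short, at the cost of not checking that the numbers are laid out on the expected lines.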

Connections and latencies between the endpoints and caches of example input.


File format

Your submission should start with a line containing a single number N ( 0 ≤ N ≤ C ) – the number of cache server descriptions to follow.

Each of the subsequent N lines should describe the videos cached in a single cache server. It should contain the following numbers:

  • c ( 0 ≤ c < C ) – the ID of the cache server being described,
  • the IDs of the videos stored in this cache server: v0, …, vn ( 0 ≤ vi < V) (at least 0 and at most V numbers), given in any order without repetitions

Each cache server should be described in at most one line. It is not necessary to describe all cache servers: if a cache does not occur in the submission, this cache server will be considered as empty. Cache servers can be described in any order.


3
0 2
1 3 1
2 0 1

Example submission file explanation.

We are using all 3 cache servers.
Cache server 0 contains only video 2.
Cache server 1 contains videos 3 and 1.
Cache server 2 contains videos 0 and 1.
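
A submission in this format could be produced along these lines. This is a sketch only: `format_submission` and the shape of `placement` (a dict from cache ID to video IDs) are assumptions made for illustration.

```python
def format_submission(placement: dict) -> str:
    """Serialize {cache ID: iterable of video IDs} into the submission format."""
    lines = [str(len(placement))]  # N, the number of cache descriptions
    for cache_id, videos in placement.items():
        ids = list(dict.fromkeys(videos))  # drop repeated IDs, keep their order
        lines.append(" ".join(str(n) for n in [cache_id, *ids]))
    return "\n".join(lines) + "\n"


print(format_submission({0: [2], 1: [3, 1], 2: [0, 1]}), end="")
# 3
# 0 2
# 1 3 1
# 2 0 1
```

Because the dict has one entry per cache, each cache server is described in at most one line, as the format requires.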


The output file is valid if it meets the following criteria:

  • the format matches the description above
  • the total size of videos stored in each cache server does not exceed the maximum cache server capacity
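
The capacity criterion is easy to check locally before uploading. A minimal sketch follows, with `overfull_caches` as a hypothetical helper name and `placement` assumed to be a dict from cache ID to video IDs.

```python
def overfull_caches(video_sizes, capacity, placement):
    """Return the IDs of caches whose stored videos exceed `capacity` MB."""
    overfull = []
    for cache_id, videos in placement.items():
        used = sum(video_sizes[v] for v in set(videos))  # count each video once
        if used > capacity:
            overfull.append(cache_id)
    return overfull


sizes = [50, 50, 80, 30, 110]  # video sizes from the example input
print(overfull_caches(sizes, 100, {0: [2], 1: [3, 1], 2: [0, 1]}))  # []
print(overfull_caches(sizes, 100, {0: [4]}))                        # [0]
```

In the example submission the caches hold 80MB, 80MB and 100MB respectively, so none exceeds the 100MB limit. Video 4 (110MB) does not fit into any cache on its own.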


The score is the average time saved per request, in microseconds. (Note that the latencies in the input file are given in milliseconds. The score is given in microseconds to provide a better resolution of results.)
For each request description ( Rv, Re, Rn ) in the input file, we choose the best way to stream the video Rv to the endpoint Re. We pick the lowest possible latency L = min(LD, L0, …, Lk−1), where LD is the latency of serving a video to the endpoint Re from the data center, and L0, …, Lk−1 are latencies of serving a video to the endpoint Re from each cache server that:

  • is connected to the endpoint Re, and
  • contains the video Rv

The time that was saved for each request is LD − L.

As each request description describes Rn requests, the time saved for the entire request description is Rn × ( LD − L ) .

To compute the total score for the data set, we sum the time saved for individual request descriptions in milliseconds, multiply by 1000 and divide it by the total number of requests in all request descriptions, rounding down.

A schematic representation of the example submission file above.

In the example above, there are three request descriptions for the endpoint 0:

  • 1500 requests for video 3, streamed from cache 1 with 300ms of latency, saving 1000ms − 300ms = 700ms per request
  • 500 requests for video 4, streamed from the data center, saving 0ms per request
  • 1000 requests for video 1, streamed from cache 2 with 200ms of latency, saving 1000ms − 200ms = 800ms per request

There is also one request description for the endpoint 1:

  • 1000 requests for video 0, streamed from the data center, saving 0ms per request

The average time saved is:

( 1500 × 700 + 500 × 0 + 1000 × 800 + 1000 × 0 ) / ( 1500 + 500 + 1000 + 1000 )

which equals 462.5ms. Multiplied by 1000, this gives the score of 462 500.
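
The scoring rule can be reproduced in a few lines, which is handy for comparing solutions offline. This is a sketch, not the Judge System's code: the function name `score` and the argument shapes are assumptions made for illustration.

```python
def score(endpoints, requests, placement):
    """Average time saved per request, in microseconds, rounded down.

    endpoints: list of (LD, {cache ID: Lc}) pairs, latencies in milliseconds
    requests:  list of (Rv, Re, Rn) tuples
    placement: {cache ID: set of video IDs stored in that cache}
    """
    saved_ms = 0
    total_requests = 0
    for video, endpoint, count in requests:
        ld, cache_latency = endpoints[endpoint]
        best = ld  # the data center is always available
        for cache_id, lc in cache_latency.items():
            if video in placement.get(cache_id, ()):
                best = min(best, lc)
        saved_ms += count * (ld - best)
        total_requests += count
    # Multiply by 1000 before dividing so the rounding happens in microseconds.
    return saved_ms * 1000 // total_requests


endpoints = [(1000, {0: 100, 2: 200, 1: 300}), (500, {})]
requests = [(3, 0, 1500), (0, 1, 1000), (4, 0, 500), (1, 0, 1000)]
placement = {0: {2}, 1: {3, 1}, 2: {0, 1}}
print(score(endpoints, requests, placement))  # 462500
```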

Note that there are multiple data sets representing separate instances of the problem. The final score for your team will be the sum of your best scores on the individual data sets.

Practice Problem for Google Hash Code 2017

Happy new year people!!

Google released a practice problem for Google Hash Code 2017!

Please do not forget to register!

Submission deadline:     Thursday, Feb 23, 19:30 Cyprus time (18:30 CET)


Practice Problem for Hash Code 2017


Did you know that at any given time, someone is cutting pizza somewhere around the world? The decision about how to cut the pizza sometimes is easy, but sometimes it’s really hard: you want just the right amount of tomatoes and mushrooms on each slice. If only there was a way to solve this problem using technology…

Problem description


The pizza is represented as a rectangular, 2-dimensional grid of R rows and C columns. The cells within the grid are referenced using a pair of 0-based coordinates [r, c], denoting respectively the row and the column of the cell.

Each cell of the pizza contains either:

  • mushroom, represented in the input file as M; or
  • tomato, represented in the input file as T


A slice of pizza is a rectangular section of the pizza delimited by two rows and two columns, without holes.
The slices we want to cut out must contain at least L cells of each ingredient (that is, at least L cells of mushroom and at least L cells of tomato) and at most H cells of any kind in total – surprising as it is, there is such a thing as too much pizza in one slice.

The slices being cut out cannot overlap. The slices being cut do not need to cover the entire pizza.


The goal is to cut correct slices out of the pizza maximizing the total number of cells in all slices.

Input data set

The input data is provided as a data set file – a plain text file containing exclusively ASCII characters, with each line terminated by a single \n character (UNIX-style line endings).

File format

The file consists of:

  • one line containing the following natural numbers separated by single spaces:
    • R (1 ≤ R ≤ 1000) is the number of rows,
    • C (1 ≤ C ≤ 1000) is the number of columns,
    • L (1 ≤ L ≤ 1000) is the minimum number of cells of each ingredient in a slice,
    • H (1 ≤ H ≤ 1000) is the maximum total number of cells of a slice
  • R lines describing the rows of the pizza (one after another). Each of these lines contains C
    characters describing the ingredients in the cells of the row (one cell after another). Each character is either M (for mushroom) or T (for tomato).

Example Input File

3 5 1 6

3 rows, 5 columns, min 1 ingredient per slice, max 6 cells per slice
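
Reading such a file might look like the sketch below. The helper name `parse_pizza` is made up, and the three grid rows used here are an assumed pizza that matches the `3 5 1 6` header, chosen only so the snippet has something to parse.

```python
def parse_pizza(text):
    """Return (R, C, L, H, grid), where grid is a list of R strings of length C."""
    lines = text.splitlines()
    R, C, L, H = map(int, lines[0].split())
    grid = lines[1:1 + R]
    # Reject files whose grid does not match the declared dimensions or alphabet.
    if len(grid) != R or any(len(row) != C or set(row) - {"M", "T"} for row in grid):
        raise ValueError("grid does not match the header")
    return R, C, L, H, grid


# Assumed grid rows, for illustration only.
sample = "3 5 1 6\nTTTTT\nTMMMT\nTTTTT\n"
R, C, L, H, grid = parse_pizza(sample)
print(R, C, L, H)  # 3 5 1 6
print(grid[1][2])  # M  (the cell at [1, 2])
```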


File format

The file must consist of:

  • one line containing a single natural number S (0 ≤ S ≤ R × C) , representing the total number of slices to be cut,
  • S lines describing the slices. Each of these lines must contain the following natural numbers
    separated by single spaces:

    • r1, c1, r2, c2 (0 ≤ r1, r2 < R, 0 ≤ c1, c2 < C) describe a slice of pizza delimited by the rows r1 and r2 and the columns c1 and c2, including the cells of the delimiting rows and columns. The rows (r1 and r2) can be given in any order. The columns (c1 and c2) can be given in any order too.


3
0 0 2 1
0 2 2 2
0 3 2 4

Example description

3 slices.
First slice between rows (0,2) and columns (0,1).
Second slice between rows (0,2) and columns (2,2).
Third slice between rows (0,2) and columns (3,4).

Slices described in the example submission file marked in green, orange and purple.


For the solution to be accepted:

  • the format of the file must match the description above,
  • each cell of the pizza must be included in at most one slice,
  • each slice must contain at least L cells of mushroom,
  • each slice must contain at least L cells of tomato,
  • total area of each slice must be at most H
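
These acceptance rules can be checked locally with something like the following sketch. `check_slices` is a hypothetical helper, and the grid in the usage lines is an assumed 3 × 5 pizza, not data taken from the problem statement.

```python
def check_slices(grid, L, H, slices):
    """Return a list of problems found; an empty list means the slices are acceptable."""
    rows, cols = len(grid), len(grid[0])
    taken = [[False] * cols for _ in range(rows)]
    problems = []
    for i, (r1, c1, r2, c2) in enumerate(slices):
        # Rows and columns may be given in any order, so normalize them first.
        r_lo, r_hi = sorted((r1, r2))
        c_lo, c_hi = sorted((c1, c2))
        if r_lo < 0 or r_hi >= rows or c_lo < 0 or c_hi >= cols:
            problems.append(f"slice {i}: outside the pizza")
            continue
        cells = [(r, c) for r in range(r_lo, r_hi + 1) for c in range(c_lo, c_hi + 1)]
        mushrooms = sum(grid[r][c] == "M" for r, c in cells)
        tomatoes = len(cells) - mushrooms
        if mushrooms < L:
            problems.append(f"slice {i}: fewer than {L} mushroom cells")
        if tomatoes < L:
            problems.append(f"slice {i}: fewer than {L} tomato cells")
        if len(cells) > H:
            problems.append(f"slice {i}: more than {H} cells")
        if any(taken[r][c] for r, c in cells):
            problems.append(f"slice {i}: overlaps an earlier slice")
        for r, c in cells:
            taken[r][c] = True
    return problems


grid = ["TTTTT", "TMMMT", "TTTTT"]  # assumed pizza, for illustration only
print(check_slices(grid, 1, 6, [(0, 0, 2, 1), (0, 2, 2, 2), (0, 3, 2, 4)]))  # []
print(check_slices(grid, 1, 6, [(0, 0, 2, 2)]))  # ['slice 0: more than 6 cells']
```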


The submission gets a score equal to the total number of cells in all slices.

Note that there are multiple data sets representing separate instances of the problem. The final
score for your team is the sum of your best scores on the individual data sets.

Scoring example

The example submission file given above cuts the slices of 6, 3 and 6 cells, earning 6 + 3 + 6 = 15 points.
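
The same arithmetic as a short sketch, with `slice_area` and `total_score` as made-up helper names. Taking absolute differences handles rows and columns given in either order.

```python
def slice_area(r1, c1, r2, c2):
    """Number of cells in a slice, delimiting rows and columns included."""
    return (abs(r1 - r2) + 1) * (abs(c1 - c2) + 1)


def total_score(slices):
    """Score of a submission that has already been checked for validity."""
    return sum(slice_area(*s) for s in slices)


print(total_score([(0, 0, 2, 1), (0, 2, 2, 2), (0, 3, 2, 4)]))  # 15
```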

Past editions

— From https://hashcode.withgoogle.com/past_editions.html

Hash Code started in 2014 as a one-day programming competition for students and professionals from across France. We introduced the Online Qualification Round in 2015 where more than 1,500 students and professionals competed. The top teams were then invited to the Google Paris office to face off in the Final Round of the competition. In 2016 we scaled the competition to the rest of Europe, the Middle East and Africa where more than 17,000 people signed up to compete. You can take a look at the problems and winning teams from past editions of Hash Code below.

Past problem statements

Schedule Satellite Operations

Hash Code 2016, Final Round
A satellite equipped with a high-resolution camera can be an excellent source of geo imagery. While harder to deploy than a plane or a Street View car, a satellite — once launched — provides a continuous stream of fresh data. Terra Bella is a division within Google that deploys and manages high-resolution imaging satellites in order to capture rapidly-updated imagery and analyze them for commercial customers. With a growing constellation of satellites and a constant need for fresh imagery, distributing the work between the satellites is a major challenge. Given a set of imaging satellites and a list of image collections ordered by customers, schedule satellite operations so that the total value of delivered image collections is as high as possible.

Optimize Drone Deliveries

Hash Code 2016, Online Qualification Round
The Internet has profoundly changed the way we buy things, but the online shopping of today is likely not the end of that change; after each purchase we still need to wait multiple days for physical goods to be carried to our doorstep. Given a fleet of drones, a list of customer orders and availability of the individual products in warehouses, schedule the drone operations so that the orders are completed as soon as possible.

Route Loon Balloons

Hash Code 2015, Final Round
Project Loon aims to bring universal Internet access using a fleet of high altitude balloons equipped with LTE transmitters. Circulating around the world, Loon balloons deliver Internet access in areas that lack conventional means of Internet connectivity. Given the wind data at different altitudes, plan altitude adjustments for a fleet of balloons to provide Internet coverage to select locations.

Optimize a Data Center

Hash Code 2015, Online Qualification Round
For over ten years, Google has been building data centers of its own design, deploying thousands of machines in locations around the globe. In each of these locations, batteries of servers are at work around the clock, running services we use every day, from Google Search and YouTube to the Judge System of Hash Code. Given a schema of a data center and a list of available servers, your task is to optimize the layout of the data center to maximize its availability.

Street View Routing

Hash Code 2014, Final Round
The Street View imagery available in Google Maps is captured using specialized vehicles called Street View cars. These cars carry multiple cameras capturing pictures as the car moves around a city. Capturing the imagery of a city poses an optimization problem: the fleet of cars is available for a limited amount of time and we want to cover as much of the city streets as possible.

Google Hash Code 2017 – Online Qualification Round Schedule

19:00 EET:

  • The hub will open to the public
  • People can view the live stream (Nat & Lo videos) on the video projector
  • Teams can set themselves up with the help of the volunteers

19:30 EET:

  • Live stream starts

19:45 EET:

  • Task will be made available, competition starts
  • Scoreboard will be displayed on the video projector
  • Participating teams will be confirmed in the Judge System

23:30 EET:

  • End of the competition
  • Announcement of the score for the local teams

00:00 EET:

  • The hub will close


Hashtag for the competition #hashcode2017

Google Hash Code 2017 Limassol Cyprus – Call for participation

We’ll be hosting a hub at the Cyprus University of Technology for the Online Qualification Round of Hash Code, a team-based programming competition created by Google for university students and industry professionals. The Online Qualification Round takes place on the 23rd of February at 19:30 EET and registered teams from Cyprus are invited to participate from our hub, which will be hosted in the Labs of the University. Top scoring teams from the Online Qualification Round will then be invited to Google’s Paris office to compete in the Final Round of the competition in April.

If you’re interested in joining our hub, find a team (two to four people) and register at g.co/hashcode. Make sure to select Cyprus University of Technology from the list of hubs in the Judge System.

For more information about this and other hubs in Cyprus (including the twin event in Nicosia) visit https://goo.gl/XSfUPv

Hash Code 2017 Limassol Cyprus – Facebook Event



Cyprus University of Technology
Room: ΚΧΕ 1 - Computer Lab
Polyxeni Loizia and Eleni Autonomou Building (Old Cadastre)
Athinon Street

Τεχνολογικό Πανεπιστήμιο Κύπρου
Δωμάτιο: ΚΧΕ 1 -  Εργαστήριο Ηλεκτρονικών Υπολογιστών
Κτήριο Πολυξένη Λοϊζία και Ελένη Αυτονόμου (Παλιό Κτηματολόγιο)
Οδός Αθηνών

Date and Time:

23 February 2017
From: 19:30 EET
To: 23:30 EET

Free Amenities Offered

High speed Internet access
Wi-Fi access to the Internet for your mobile devices (personal computers and smart phones)
Lab computers will be available for use by the participants
Food in the form of snacks and beverages will be available outside the labs