Algorithm – The YouTube Video Recommendation System

Diagram:

Billions of videos exist on YouTube, and people keep uploading new ones. Recommending the most appropriate videos to each viewer is therefore a challenge. So how does YouTube recommend videos to you? To cope with the large scale of data, improve efficiency, and eliminate noise, the recommendation system uses deep learning. Starting from the pool of videos, the algorithm has three parts, each consisting of several steps, through which the appropriate videos finally reach users: data collection, candidate generation, and ranking. To follow these steps and get a complete picture of how the algorithm functions, we set up four scenarios based on two questions: whether the device is new or old, and whether the user has logged in to their account.

Logged in & using an old device (history available)

Logged in & using a new device (history available)

Not logged in & using an old device (history available)

Not logged in & using a new device (history not available; a new profile is created)

In the first three scenarios, history is available, but the data that YouTube collects and uses differs. The first and second scenarios differ in whether a new device is being used to watch videos: on an old device, the data comes not only from the account history but also from information previously stored on the device. These differences in data collection are shown by the branches in our flowchart. The distinction matters primarily at the beginning of the algorithm, where YouTube loads data accordingly. YouTube can use historical information to generate video recommendations if such information is available on the user’s account or device. This information, which includes watch history, age, region, and so on, is represented by the watch history and age in our flowchart. For completely new users, YouTube creates a temporary profile and uses it to capture their activity within the current session.
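The branching described above can be sketched in a few lines of Python. This is only an illustration of the four scenarios, not YouTube's actual code: the `Profile` class, the `load_profile` function, and the rule for merging account and device data are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Profile:
    """Hypothetical container for the information shown in the flowchart."""
    watch_history: List[str] = field(default_factory=list)
    age: Optional[int] = None
    region: Optional[str] = None
    temporary: bool = False


def load_profile(logged_in: bool,
                 account_profile: Optional[Profile],
                 device_profile: Optional[Profile]) -> Profile:
    """Pick the data sources that apply to the current scenario and merge them.

    device_profile is None on a new device; account data is only used
    when the user is logged in.
    """
    sources = []
    if logged_in and account_profile is not None:
        sources.append(account_profile)
    if device_profile is not None:
        sources.append(device_profile)

    # Scenario 4: not logged in and a new device, so no history exists.
    if not sources:
        return Profile(temporary=True)

    merged = Profile()
    for src in sources:
        # Combine watch histories without duplicating videos.
        for video in src.watch_history:
            if video not in merged.watch_history:
                merged.watch_history.append(video)
        # Keep the first known value for age and region.
        if merged.age is None:
            merged.age = src.age
        if merged.region is None:
            merged.region = src.region
    return merged


account = Profile(watch_history=["a", "b"], age=30)
device = Profile(watch_history=["b", "c"], region="US")

scenario_1 = load_profile(True, account, device)   # logged in, old device
scenario_2 = load_profile(True, account, None)     # logged in, new device
scenario_3 = load_profile(False, account, device)  # not logged in, old device
scenario_4 = load_profile(False, None, None)       # not logged in, new device
```

In this sketch only the fourth scenario yields a temporary profile; the other three differ only in which sources feed the merged history.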

After the information is loaded, the YouTube recommendation system recommends videos by comparing user information against its models. It works through two subsystems: candidate generation and ranking. In candidate generation, millions of videos are narrowed down to the hundreds most likely to be watched, those most closely related to the user’s history. These videos then serve as inputs to the ranking system, where they are sorted according to their features. They pass through a ReLU neural network and, after being weighted according to a watch-time logistic model, are scored, ranked, and delivered to the user. Whether the user clicks on the recommended videos in turn becomes history information that is collected and stored for the next round of recommendations. During this process, watch time is crucial: if users watch a recommended video for more than 30 seconds, it ranks higher in the recommendation list; otherwise, it is less likely to show up again. Since the recommendation algorithm is complex and secret, our diagram presents a reverse-engineered approximation based on the real algorithm’s documented behavior.
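The two-stage funnel can be illustrated with a toy example. Everything here is an assumption made for the sketch: the topic-overlap rule used for candidate generation, the tiny one-hidden-layer ReLU network, the hand-picked weights, and the sample videos. A real system would learn its weights from data (including the watch-time weighting, which is omitted here) and would operate on millions of videos.

```python
import math


def candidate_generation(videos, history_topics, k):
    """Narrow the corpus to the k videos whose topics overlap most with the user's history."""
    def overlap(video):
        return len(set(video["topics"]) & history_topics)
    return sorted(videos, key=overlap, reverse=True)[:k]


def relu(x):
    return max(0.0, x)


def rank(candidates, hidden_weights, output_weights):
    """Score each candidate with a one-hidden-layer ReLU network and sort by score."""
    scored = []
    for video in candidates:
        # Hidden layer: weighted sum of the features followed by ReLU.
        hidden = [
            relu(sum(w * f for w, f in zip(row, video["features"])))
            for row in hidden_weights
        ]
        # Output layer: a logistic (sigmoid) score between 0 and 1.
        logit = sum(w * h for w, h in zip(output_weights, hidden))
        score = 1.0 / (1.0 + math.exp(-logit))
        scored.append((score, video["id"]))
    scored.sort(reverse=True)
    return [video_id for _, video_id in scored]


# Hypothetical corpus: each video has topics and two made-up numeric features.
videos = [
    {"id": "v1", "topics": ["cooking"], "features": [0.9, 0.1]},
    {"id": "v2", "topics": ["music", "live"], "features": [0.2, 0.8]},
    {"id": "v3", "topics": ["music"], "features": [0.7, 0.6]},
    {"id": "v4", "topics": ["news"], "features": [0.5, 0.5]},
]
history_topics = {"music", "live"}

candidates = candidate_generation(videos, history_topics, k=2)
ranking = rank(candidates,
               hidden_weights=[[1.0, 0.0], [0.0, 1.0]],
               output_weights=[1.0, 0.5])
print(ranking)  # ['v3', 'v2']
```

Candidate generation keeps only `v2` and `v3` because they share topics with the history; the ranking stage then reorders them by score, which is why `v3` ends up ahead of `v2` even though `v2` overlapped more.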