With over 139 million paid subscribers (a total viewer pool of about 300 million) across 190
countries, 15,400 titles across its regional libraries, and 112 Emmy Award
nominations in 2018, Netflix is the world’s leading Internet television network and
the largest and most highly valued streaming service in the world.
Netflix first started using analytics tools in 2000 to recommend DVDs to users. Two
decades later, the Netflix recommendation system is one of the most sophisticated in
the business. Its personalised recommendation algorithms drive customer retention,
helping Netflix pocket profits to the tune of $1 billion annually.
Netflix uses a combination of machine learning, data from its users, and algorithms to rank
order its content and decide what to show us on our account.
Whenever we access the Netflix service, its recommendation system strives to help us
find a show or movie to enjoy with minimal effort. The system estimates the
likelihood that we will watch a particular title in the catalog based on a number of factors,
including:
• our interactions with the service (such as our viewing history and how we rated other titles),
• other members with similar tastes and preferences on the service, and
• information about the titles, such as their genre, categories, actors, release year, etc.
In addition to knowing what we have watched on Netflix, to best personalize the recommendations
the system also looks at things like:
• the time of day we watch,
• the devices we are watching Netflix on, and
• how long we watch.
When we create our Netflix account, or add a new profile to our account, Netflix asks
us to choose a few titles that we like. The system uses these titles to “jump start” our
recommendations. Choosing a few titles we like is optional. If we choose to forgo
this step, the Netflix system will start us off with a diverse and popular set of titles to
get us going.
Once we start watching titles on the service, this will “supersede” any initial
preferences we provided to the system, and as we continue to watch over time, the titles
we watched more recently will outweigh titles we watched in the past in terms of
driving the recommendation system.
Our own Movie Recommendation System:
Approach: We will build movie recommendation systems using several approaches:
1. Content-Based Recommendation System: In this approach, the system recommends movies
and shows that are similar to the ones a user has liked in the past. If a user likes
a movie such as ‘The Prestige’, we can recommend other movies starring ‘Christian
Bale’, movies in the ‘Thriller’ genre, or even movies directed by
‘Christopher Nolan’. Here the recommendation system checks the
past preferences of the user and finds the film “The Prestige”, then uses
the information available in the database, such as the lead actors,
the director, the genre of the film, the production house, etc., to
find movies similar to “The Prestige”.
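The content-based idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the catalog, its metadata tags, and the helper names `jaccard` and `similar_titles` are made up for the example, and tag overlap (Jaccard similarity) is just one simple way to compare titles.

```python
# Toy catalog: the tags are illustrative and not taken from a real database.
CATALOG = {
    "The Prestige": {"actor:Christian Bale", "actor:Hugh Jackman",
                     "director:Christopher Nolan", "genre:Thriller"},
    "The Dark Knight": {"actor:Christian Bale", "actor:Heath Ledger",
                        "director:Christopher Nolan", "genre:Thriller"},
    "Inception": {"actor:Leonardo DiCaprio", "director:Christopher Nolan",
                  "genre:Thriller"},
    "Toy Story": {"actor:Tom Hanks", "director:John Lasseter",
                  "genre:Animation"},
}


def jaccard(a, b):
    """Share of tags two titles have in common (0 = none, 1 = identical)."""
    return len(a & b) / len(a | b)


def similar_titles(liked, catalog, top_n=2):
    """Rank every other title by tag overlap with the liked title."""
    liked_tags = catalog[liked]
    scores = {title: jaccard(liked_tags, tags)
              for title, tags in catalog.items() if title != liked}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(similar_titles("The Prestige", CATALOG))
# ['The Dark Knight', 'Inception']
```

Titles that share the lead actor, director and genre with the liked film rank first, while a title with no tags in common scores zero.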
2. Collaborative Filtering Recommendation System:
This approach is based on the ratings that users give. The system recommends
movies that the user has not watched but that other users similar to our test
user have watched and liked. This type of collaborative filtering is called
User-to-User Collaborative Filtering, as we find users similar to our user.
For example, if user ‘A’ likes ‘Batman Begins’, ‘Justice League’ and ‘The
Avengers’ while user ‘B’ likes ‘Batman Begins’, ‘Justice League’ and ‘Thor’, then
they have similar interests, because all of these movies belong to the superhero
genre. So, there is a high probability that user ‘A’ would like ‘Thor’ and
user ‘B’ would like ‘The Avengers’.
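The example with users ‘A’ and ‘B’ can be written out directly. This is a minimal sketch, assuming each user's likes are stored as a plain set; the helper name `suggest` is hypothetical.

```python
# Liked titles per user, taken from the example above.
likes = {
    "A": {"Batman Begins", "Justice League", "The Avengers"},
    "B": {"Batman Begins", "Justice League", "Thor"},
}


def suggest(target, other, likes):
    """Titles the other user liked that the target user has not liked yet."""
    return likes[other] - likes[target]


shared = likes["A"] & likes["B"]   # overlap that makes A and B "similar"
print(sorted(shared))              # ['Batman Begins', 'Justice League']
print(suggest("A", "B", likes))    # {'Thor'}
print(suggest("B", "A", likes))    # {'The Avengers'}
```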
1. COSINE SIMILARITY:
To find similar content for our item, we use the cosine similarity algorithm. The dot
product of two vectors equals the length of the projection of one of them onto the other,
multiplied by the length of the other. Therefore the dot product of two identical vectors
is equal to their squared magnitude. On the other hand, if the two vectors are orthogonal
(they share no direction), the product will be zero. The general formula for calculating
the dot product of two n-dimensional vectors u and v, with θ the angle between them, is:
\[ u \cdot v = \sum_{i=1}^{n} u_i v_i = \|u\| \, \|v\| \cos\theta \]
This dot product is important when defining the similarity, as it is directly connected to it.
The similarity between two vectors u and v is the ratio between their dot
product and the product of their magnitudes:
\[ \mathrm{sim}(u, v) = \cos\theta = \frac{u \cdot v}{\|u\| \, \|v\|} \]
Thus, it will be equal to 1 if the two vectors point in the same direction (for example,
if they are identical), and it will be 0 if the two are orthogonal.
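The definition above translates into a short function. This is a sketch in plain Python; returning 0.0 for a zero-length vector is a convention assumed here, since the ratio is undefined in that case.

```python
import math


def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their magnitudes."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumed convention: no direction, so no similarity
    return dot / (norm_u * norm_v)


print(cosine_similarity([1, 0], [0, 1]))  # orthogonal vectors -> 0.0
print(cosine_similarity([1, 2, 3], [2, 4, 6]))  # same direction, close to 1
```

Note that the second pair is not identical, only parallel: cosine similarity ignores vector length and compares direction alone.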
2. K Nearest Neighbors:
The standard method of Collaborative Filtering is known as the Nearest Neighbors
algorithm. We have an n × m matrix of ratings, with users u_i, i = 1, …, n and items p_j, j = 1, …, m.
Now we want to predict the rating r_ij if target user i did not watch/rate item j. The process
is to calculate the similarities between target user i and all other users, select the top X
most similar users, and take the weighted average of the ratings from these X users with the
similarities as weights.
However, not all users have the same baseline when rating movies. Some users may
tend to give high scores in general, while others are pretty strict with their ratings even when
they are satisfied with the items. To avoid such bias, we subtract each user's average
rating over all their rated items when computing the weighted average, and add the target
user's average back at the end.
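The prediction step with the baseline correction can be sketched as follows. The users, items and ratings are hypothetical, the function names are chosen for the example, and using the cosine of mean-centered ratings as the similarity measure is one common choice rather than the only one.

```python
import math

RATINGS = {  # hypothetical users and ratings on a 1-5 scale
    "ann":  {"m1": 5, "m2": 4, "m3": 1},
    "bob":  {"m1": 4, "m2": 5, "m3": 2, "m4": 5},
    "cara": {"m1": 1, "m2": 2, "m3": 5, "m4": 1},
}


def mean(user):
    """Average of all ratings given by this user (the user's baseline)."""
    r = RATINGS[user]
    return sum(r.values()) / len(r)


def similarity(a, b):
    """Cosine similarity of mean-centered ratings over co-rated items."""
    common = set(RATINGS[a]) & set(RATINGS[b])
    da = [RATINGS[a][i] - mean(a) for i in common]
    db = [RATINGS[b][i] - mean(b) for i in common]
    dot = sum(x * y for x, y in zip(da, db))
    norm = math.sqrt(sum(x * x for x in da)) * math.sqrt(sum(y * y for y in db))
    return dot / norm if norm else 0.0


def predict(user, item, k=2):
    """Target user's mean plus the similarity-weighted average deviation
    of the k most similar users who rated the item."""
    neighbors = [(similarity(user, other), other)
                 for other in RATINGS
                 if other != user and item in RATINGS[other]]
    top = sorted(neighbors, reverse=True)[:k]
    denom = sum(abs(s) for s, _ in top)
    if denom == 0:
        return mean(user)  # no usable neighbors: fall back to the baseline
    offset = sum(s * (RATINGS[o][item] - mean(o)) for s, o in top) / denom
    return mean(user) + offset


print(round(predict("ann", "m4"), 2))
```

In this toy data "ann" rates like "bob" and opposite to "cara", so the prediction for "m4" lands above her own average: bob rated it above his baseline, and cara (negative similarity) rated it below hers.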
3. Matrix Factorization:
Sparsity is a big issue that needs to be addressed when building collaborative filtering
recommendation systems. Our approach creates a matrix whose rows are the unique users in our
environment, whose columns are the different movies, and whose values are the
ratings that users give to movies. However, not every movie
will be rated by every user, so this matrix is sparse, and that problem needs
to be solved. For this purpose, we use Matrix Factorization. In this method, we decompose
the original sparse matrix into low-dimensional matrices of latent features. Matrix
factorization therefore tells us how strongly a user is aligned with a set of latent features, and how
well a movie fits into this set of latent features.
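A minimal sketch of this decomposition, assuming stochastic gradient descent on the observed ratings only (one common way to fit the factors). The ratings, the hyperparameters and the names `factorize`, `predict` and `rmse` are illustrative assumptions.

```python
import math
import random

# Hypothetical observed ratings as (user, item, rating); other pairs are unrated.
OBSERVED = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 4), (1, 1, 5), (1, 3, 1),
    (2, 0, 1), (2, 2, 5), (2, 3, 4),
    (3, 1, 1), (3, 2, 4), (3, 3, 5),
]


def factorize(ratings, n_users, n_items, k=2, epochs=2000,
              lr=0.01, reg=0.02, seed=0):
    """Learn user factors P (n_users x k) and item factors Q (n_items x k)."""
    rng = random.Random(seed)
    P = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularisation.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Estimated rating: dot product of user and item latent vectors."""
    return sum(p * q for p, q in zip(P[u], Q[i]))


def rmse(ratings, P, Q):
    """Root-mean-square error on the given observed ratings."""
    sq = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(sq / len(ratings))


P, Q = factorize(OBSERVED, n_users=4, n_items=4)
print("training RMSE:", round(rmse(OBSERVED, P, Q), 3))
print("estimate for unrated pair (0, 3):", round(predict(P, Q, 0, 3), 2))
```

Only the observed entries drive the updates, which is how the method sidesteps the empty cells of the sparse matrix; the learned factors then yield an estimate for any user and movie pair.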
4. Neural Network Approach:
We will combine our collaborative filtering results with neural network techniques to
improve the quality of our recommendations. We will use neural network embeddings
to get the desired results.
Such embeddings are a way to represent discrete categorical variables as continuous vectors.
They are low-dimensional and learned, which means that they place similar entities closer to
one another in the embedding space. There are many ways to compute embeddings, but the
end goal is to map things to a latent space with complex and meaningful dimensions.
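To make the idea of an embedding space concrete, the sketch below shows only the lookup and the "similar entities sit close together" property. The vectors are made-up stand-ins for weights that a network would learn during training; no training is performed here, and the names are hypothetical.

```python
import math

# Each discrete movie ID maps to one row of the embedding table.
MOVIE_INDEX = {"Batman Begins": 0, "The Avengers": 1, "Thor": 2, "Toy Story": 3}

# Made-up 3-dimensional vectors standing in for learned embedding weights.
EMBEDDINGS = [
    [0.9, 0.1, 0.2],
    [0.8, 0.2, 0.3],
    [0.7, 0.3, 0.3],
    [0.1, 0.9, 0.1],
]


def embed(title):
    """Look up the continuous vector for a discrete title."""
    return EMBEDDINGS[MOVIE_INDEX[title]]


def distance(a, b):
    """Euclidean distance between two titles in the embedding space."""
    return math.dist(embed(a), embed(b))


def nearest(title):
    """The other title whose embedding lies closest to the given one."""
    others = [t for t in MOVIE_INDEX if t != title]
    return min(others, key=lambda t: distance(title, t))


print(nearest("Thor"))  # The Avengers
```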
In this Movie Recommendation System we will recommend movies to users based on
their past activity. If a user is new and creates an account on our recommendation system,
the system will ask the user to choose a few categories that they like. The system uses these
categories to recommend movies and shows. This step is optional, and the user can skip
it. When a new user skips it, recommendations will be based
on the most viewed, newly released, and highly awaited movies and shows. For an existing user, the
system will recommend movies and shows based on their viewing history and on collaborative
users who have a similar viewing history.
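The decision flow described above can be sketched as a single function. The catalog, the popular list and the function name `recommend` are placeholders, and the branch for existing users is simplified to a category match on viewing history; in the full system that branch would also draw on the collaborative filtering methods described earlier.

```python
CATALOG = {  # hypothetical titles and their categories
    "Title A": "Thriller",
    "Title B": "Comedy",
    "Title C": "Thriller",
    "Title D": "Drama",
}
POPULAR = ["Title D", "Title B"]  # stand-in for most viewed / newly released


def recommend(history=None, chosen_categories=None):
    """Pick a recommendation strategy based on what we know about the user."""
    if history:
        # Existing user: favour categories already watched, skip seen titles.
        watched = {CATALOG[t] for t in history}
        return [t for t, c in CATALOG.items()
                if c in watched and t not in history]
    if chosen_categories:
        # New user who picked categories at sign-up.
        return [t for t, c in CATALOG.items() if c in chosen_categories]
    # New user who skipped the optional step: fall back to popular titles.
    return list(POPULAR)


print(recommend(history=["Title A"]))            # ['Title C']
print(recommend(chosen_categories={"Comedy"}))   # ['Title B']
print(recommend())                               # ['Title D', 'Title B']
```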
Recommendation systems have become an important part of everyone’s lives. With
the enormous number of movies released worldwide every year, people often miss
out on some amazing works of art for lack of the right suggestion. Putting
machine-learning-based recommendation systems to work is thus very important for
getting the right recommendations. We conclude that studying the various approaches
to recommendation engines is vital for coming up with a hybrid engine that
overcomes the shortcomings of the individual approaches, so that users get
accurate movie recommendations.