How to Create a Successful Movie Based on Analyses of Movies on IMDb and Box Office Mojo Websites

Jmstipanowich
8 min read · Dec 6, 2021


Three Python Plots Utilizing Data From IMDb and Box Office Mojo Websites Provide Information on Potential Strategies to Produce Successful Movies

What characteristics help a movie succeed? Is a movie successful because of where it is filmed, the studio that releases it, or its cast? What makes a movie GREAT? More specifically, which traits of a spectacular movie can be adjusted to improve its chances of success? In this blog post, I investigate some traits of movies and draw out actionable insights to help movie producers and creators make successful films.

This post is an extension of my GitHub project, available at https://github.com/jmstipanowich/Movie-Data-Repo. I use the same IMDb and Box Office Mojo website data and the same basic premise as that project. The difference is that here I perform new examinations of the data, with new analyses, insights, and suggestions on how to make a successful movie. I use box plots to analyze the ranges of domestic gross for movies in each year of the Box Office Mojo data, a histogram to display the counts of movies that fall into different run time categories in the IMDb data, and a color-coded scatterplot to show how the top 5 movie genres compare across average rating and number of votes in the IMDb data.

Note: I created my graphic displays with Python code that used three Python libraries: Pandas, Matplotlib, and Seaborn. I will display my code for the plots in case anyone wants to reproduce properties of the visualizations.

What Do Movie Domestic Gross Values Per Year From 2010–2018 Reveal About Successful Movies?

The amount of money a movie generates is one clear measure of its success: more successful movies usually have higher domestic gross. I obtained data on 3,387 movies from the Box Office Mojo website and constructed box plots of domestic gross for each year from 2010–2018 to examine the mean, median, and general range of domestic gross values. My code to initialize the box plots and a diagram of the box plots are shown below with my results:
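As a rough sketch (not the exact code behind the plot), per-year box plots like these could be drawn with Pandas and Matplotlib as follows. The column names `year` and `domestic_gross` and the sample rows are assumptions made up for illustration, and Matplotlib is used directly instead of Seaborn to keep the example small.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen so the script also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample rows standing in for the Box Office Mojo data.
# The column names "year" and "domestic_gross" are assumptions.
movies = pd.DataFrame({
    "year": [2010, 2010, 2010, 2011, 2011, 2011, 2012, 2012, 2012],
    "domestic_gross": [3.1e6, 4.5e7, 4.15e8, 1.2e6, 2.0e7, 3.8e8,
                       9.0e5, 6.0e7, 6.2e8],
})

# Build one array of gross values per year, in chronological order.
years = sorted(movies["year"].unique().tolist())
gross_by_year = [
    movies.loc[movies["year"] == year, "domestic_gross"].to_numpy()
    for year in years
]

fig, ax = plt.subplots(figsize=(10, 6))
# showmeans=True adds a marker for the mean next to the median line.
box_parts = ax.boxplot(gross_by_year, showmeans=True)
ax.set_xticks(list(range(1, len(years) + 1)))
ax.set_xticklabels([str(year) for year in years])
ax.set_xlabel("Year")
ax.set_ylabel("Domestic gross (USD)")
ax.set_title("Ranges of Domestic Gross for Movies Per Year")
```

Calling `fig.savefig("some_file.png")` afterwards would write the figure to disk; the file name is up to you.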

The Box Office Mojo data on domestic gross by year show that the domestic gross of a movie is hard to predict. Every year in the data has many outlier domestic gross values, meaning values that lie more than 1.5 times the interquartile range (IQR) beyond the edges of the box. The box spans the middle 50 percent of the values (first to third quartile) and the whiskers extend to the most extreme values inside those 1.5 × IQR limits, so the many points plotted beyond the whiskers in the box plots exhibited above indicate a strongly right-skewed distribution: most movies earn comparatively little and a few earn far more. If there is one year whose monetary procedures and money allocations are worth studying in order to produce a successful film, it is 2010. 2010 had one of the highest mean and median domestic gross values ($31,445,593 and $3,100,000 respectively, computed with Python code) of all the years on the graph, and its box plot had the highest upper whisker cap of any year. Also, the outliers were less extreme for 2010. Overall, 2010 had many high domestic gross values that were still consistent with the rest of that year's data.
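To make the outlier rule concrete, here is a small sketch of the 1.5 × IQR calculation in Pandas. The gross values are made up and deliberately skewed; they are not taken from the Box Office Mojo data. The same skew is why a mean can sit far above the median, as with the 2010 figures quoted above.

```python
import pandas as pd

# Made-up domestic gross values (USD) for a single year, skewed on purpose.
gross = pd.Series([0.5e6, 1e6, 2e6, 3e6, 4e6, 8e6, 20e6, 300e6])

q1 = gross.quantile(0.25)  # first quartile (bottom edge of the box)
q3 = gross.quantile(0.75)  # third quartile (top edge of the box)
iqr = q3 - q1              # interquartile range (height of the box)

# Points beyond these fences are drawn as individual outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = gross[(gross < lower_fence) | (gross > upper_fence)]

print(f"median = {gross.median():,.0f}, mean = {gross.mean():,.0f}")
print(f"upper fence = {upper_fence:,.0f}, outliers = {outliers.tolist()}")
```

In this toy sample only the 300 million value lies beyond the upper fence, and that single value pulls the mean to roughly twelve times the median.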

What Range of Movie Run Times Is Most Likely to Lead to a Successful Movie?

The length of a movie is another factor that can influence its success. Some people like shorter movies and some people like longer movies, but one run time category is the most popular. Below is a histogram of 113,867 movies from the IMDb website and the run time categories those movies fall into. The run time categories are half-hour increments from 0 minutes to 180 minutes (3 hours). The Python code to create the graph and the graphic display are shown below with results:
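A histogram with half-hour bins could be sketched like this. It is an illustration rather than the original plotting code: the series name `runtime_minutes` is an assumed column name and the nine run times are made up.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen so the script also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up run times in minutes; "runtime_minutes" is an assumed column name.
runtimes = pd.Series([45, 72, 85, 88, 95, 101, 118, 130, 165],
                     name="runtime_minutes")

# Bin edges every 30 minutes from 0 to 180: 0, 30, 60, 90, 120, 150, 180.
bin_edges = list(range(0, 181, 30))

fig, ax = plt.subplots(figsize=(8, 5))
# ax.hist returns the count per bin, the bin edges, and the bar patches.
counts, edges, _ = ax.hist(runtimes, bins=bin_edges, edgecolor="black")
ax.set_xticks(bin_edges)
ax.set_xlabel("Run time (minutes)")
ax.set_ylabel("Number of movies")
ax.set_title("Movies per Run Time Category")
```

Each bin includes its left edge and excludes its right edge, except the last bin, which includes 180. A 90-minute movie is therefore counted in the 90 to 120 category.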

The most popular run time category was 60 to 90 minutes (1 to 1.5 hours), with 43,445 movies from IMDb, or about 38 percent of the IMDb movie dataset. The 90 to 120 minute category (1.5 to 2 hours) was popular as well, with 41,436 movies, or about 36 percent of the dataset. For a movie to be successful, it should most likely run between 60 and 90 minutes, because this was the most popular run time category in the IMDb movie data I used.
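The percentages follow directly from the counts reported in this post, as this short check shows (the variable names are illustrative):

```python
# Counts reported in this post.
total_movies = 113_867
movies_60_to_90 = 43_445
movies_90_to_120 = 41_436

# Share of the dataset in each category, in percent.
share_60_to_90 = movies_60_to_90 / total_movies * 100
share_90_to_120 = movies_90_to_120 / total_movies * 100
share_combined = share_60_to_90 + share_90_to_120

print(f"60 to 90 minutes:  {share_60_to_90:.1f}%")
print(f"90 to 120 minutes: {share_90_to_120:.1f}%")
print(f"60 to 120 minutes: {share_combined:.1f}%")
```

Together the two categories cover roughly three quarters of the movies, and the gap between them is only about two percentage points.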

What Genre of Movie Could Be Employed to Obtain a Successful Movie Based on Popularity Expressed Through Average Rating and Number of Votes Received on the IMDb Website?

Movie genres impact the success of a movie as well: the genre of a film can influence its degree of success. From the IMDb movie data, I constructed a color-coded scatterplot of 45,128 movies from the IMDb website whose primary genre (Genre1) was one of five popular movie genres: Action, Drama, Comedy, Horror, or Fantasy. The scatterplot compares these movies across average rating and number of votes on the IMDb website. Its purpose is to identify the most popular genre so that, because of its popularity, the genre can be considered when constructing a successful movie. The Python code to create the top 5 movie genre scatterplot and the scatterplot with written analyses are shown below:
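One way to build a scatterplot colored by genre is sketched below, again as an illustration rather than the original code. `Genre1` is the primary-genre column named above; `averagerating` and `numvotes` are assumed column names, the rows are made up, and plain Matplotlib is used instead of Seaborn.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen so the script also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows; "averagerating" and "numvotes" are assumed column names.
ratings = pd.DataFrame({
    "Genre1": ["Action", "Action", "Drama", "Drama", "Comedy",
               "Horror", "Fantasy", "Documentary"],
    "averagerating": [8.1, 6.4, 7.9, 6.8, 6.1, 5.0, 6.9, 7.3],
    "numvotes": [900_000, 120_000, 300_000, 12_000, 90_000,
                 20_000, 200_000, 1_500],
})

# Keep only movies whose primary genre is one of the five compared genres.
top_genres = ["Action", "Drama", "Comedy", "Horror", "Fantasy"]
subset = ratings[ratings["Genre1"].isin(top_genres)]

fig, ax = plt.subplots(figsize=(9, 6))
# One scatter call per genre, so each genre gets its own color and legend entry.
for genre, group in subset.groupby("Genre1"):
    ax.scatter(group["averagerating"], group["numvotes"], label=genre, alpha=0.6)

ax.set_xlabel("Average rating")
ax.set_ylabel("Number of votes")
ax.set_title("Top 5 Movie Genres: Average Rating vs. Number of Votes")
ax.legend(title="Primary genre")
```

Putting the rating on the x-axis and the votes on the y-axis is a choice made for this sketch; swapping the two works just as well.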

Action movies accounted for most of the data points that combined the highest average ratings with the greatest numbers of votes. Drama was fairly popular as well, with some points showing both high average ratings and high numbers of votes. Action is the genre to use for the best chance of having a successful movie, because the Action genre had the most points with high average ratings and high numbers of votes on IMDb.
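A visual impression like this can be backed with a count: pick thresholds for "high rating" and "high votes" and tally the movies per genre that clear both. The sketch below uses made-up rows, assumed column names, and hypothetical thresholds (7.0 and 100,000), none of which come from the analysis above.

```python
import pandas as pd

# Made-up rows; "averagerating" and "numvotes" are assumed column names.
ratings = pd.DataFrame({
    "Genre1": ["Action", "Action", "Action", "Drama", "Drama", "Drama",
               "Comedy", "Comedy", "Horror", "Fantasy"],
    "averagerating": [8.1, 7.4, 5.2, 7.9, 6.8, 7.5, 6.1, 7.2, 5.0, 6.9],
    "numvotes": [900_000, 450_000, 30_000, 300_000, 12_000, 800,
                 90_000, 150_000, 20_000, 200_000],
})

MIN_RATING = 7.0     # hypothetical cutoff for a "high" average rating
MIN_VOTES = 100_000  # hypothetical cutoff for a "high" number of votes

# Movies that clear both cutoffs.
strong = ratings[(ratings["averagerating"] >= MIN_RATING)
                 & (ratings["numvotes"] >= MIN_VOTES)]

totals = ratings["Genre1"].value_counts()
# Genres with no strong movies get a count of 0 instead of going missing.
strong_counts = strong["Genre1"].value_counts().reindex(totals.index, fill_value=0)
strong_share = strong_counts / totals

print(pd.DataFrame({"strong": strong_counts, "total": totals, "share": strong_share}))
```

Reporting the share alongside the raw count matters because a genre with many movies will collect more strong points even if its typical movie is no better rated.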

How to Create a Successful Movie Based on Analyses of Movies on IMDb and Box Office Mojo Websites

The methods by which a movie is created can have a large effect on its success. There are all kinds of choices to be made by directors and producers of a movie to help a film succeed. To better understand the inner workings of the movie industry myself and support the creation of great films, I inspected movie datasets from IMDb and Box Office Mojo websites to discern traits for successful movies. From my exploration and manipulation of the movie data, I came up with three actionable suggestions for movie-makers to inform them about how to make a successful movie. My recommendations are as follows:

  1. Evaluate monetary procedures and allocations for movies made in 2010 to better understand how movies achieve high domestic gross and become successful. The 2010 box plot on the “Ranges of Domestic Gross for Movies Per Year” graph of Box Office Mojo movie data had one of the highest mean and median domestic gross values of all the years from 2010–2018. Its upper whisker cap was also the highest of any yearly box plot on the graph, and its domestic gross outliers were among the least extreme of any year. 2010 was a year in which movies made great amounts of money and the data showed more generalizable trends than in other years.
  2. Design movies with run times between 60 minutes (1 hour) and 90 minutes (1.5 hours), because this was the most popular run time category, with the most movies (43,445) in the IMDb data. Because this run time category was the most popular, it can be considered a reasonable target when forming a successful movie.
  3. Produce movies that fall under the Action genre, because Action movies in the IMDb movie data generally had the highest average ratings along with the greatest numbers of votes on IMDb among the top movie genres of Action, Drama, Comedy, Horror, and Fantasy. The Action genre statistics on IMDb demonstrate that this genre is very popular and would stand the best chance of any genre of producing a successful movie.

These are some insights into how to create a successful movie. For more information and statistical analyses relating to the production of great films, visit my GitHub repository at https://github.com/jmstipanowich/Movie-Data-Repo. There are many pieces that go into movie-making, but follow my advice and hopefully future movies will have great success!

Resources:

https://www.boxofficemojo.com/
