程序代写案例-CSCI 4380-01

欢迎使用51辅导,51作业君孵化低价透明的学长辅导平台,服务保持优质,平均费用压低50%以上! 51fudao.top
Database Systems, CSCI 4380-01
Homework # 4
Due Monday March 15, 2021 at 11:59:59 PM EDT
Homework Statement. This homework is worth 5% of your total grade. This homework will
concentrate on SQL skills. You will be able to test your queries against real data and expected
output.
If you want to create the database on your local database server, you can find the database dump
file at:
https://www.cs.rpi.edu/academics/courses/spring21/csci4380/_downloads/ministreaming.dmp
You can create a database for this data, let’s call it ministreaming on psql shell:
psql> create database ministreaming;
and then load the data using the following Unix command (on Linux and Macs):
cat ministreaming.dmp | psql ministreaming
If you get a message similar to this:
CREATE EXTENSION
psql:ministreaming.dmp:28: ERROR: must be owner of extension plpgsql
it is safe to ignore it.
The data model for this homework is given in the end of this homework. We have data regarding
Movies and TV Shows and which streaming platforms they are available in. This data is sampled
down from a much larger database for simplicity (less than 2% of the full data) and has potentially
some noise. We will use the database “as is” and not fix issues unless they are major.
1
Database Server Use Rules
If you want to install and create the database on your own computer, you can use the data scripts
I used. You do not have to have a database server installed to do this homework. This database
will be created as ministreaming on the shared database server at:
http://rpidbclass.info
Feel free to use it for testing your queries, but please be considerate of others when using the server.
Here are a few ground rules:
ˆ Server response to load can be unpredictable. So, be patient if you see some slow down.
This is a medium sized database for a 100+ student class, so you can expect a serious slow
down near the homework deadline. Please do not wait until the last minute to submit your
homework.
ˆ Test your queries one at a time. Best setup is using a browser and a text editor. Write queries
elsewhere and test in the server with cut and paste.
ˆ Make every effort to read your queries before submitting. A forgotten join condition may
cause a disaster. Check join conditions first, then run your queries.
ˆ Remember, if you have an unresponsive query, it will continue to use system resources even if
you quit your browser. So, opening a new browser window will solve your problem but may
slow things down for everyone. Queries should terminate after 2 minutes, so if you increased
the load with a bad query, then wait for your query to end before submitting another.
If you are experiencing problems with a query, read it carefully before running it again
in a separate window. Remember: missing join conditions can be the difference between:
2,094,266,610,000 tuples and 3335 tuples.
ˆ If the server is not responsive, let us know on Submitty/Webex Teams. I will see if some jobs
need to be killed or whether server needs to be made more powerful. Please be patient.
ˆ Please do not include a query that does not run in your homework submission. I will run all
your queries in a batch job and an incomplete query will cause me a great deal of problems.
1 Problem Description
Write the following queries in SQL. In all your queries, use the simplest possible expression possible.
Do not forget your JOIN conditions and pay attention to returned attributes and ordering of tuples.
Returned attributes should be listed in exactly the same order in which they are listed in the
description. Attribute names should be the same as in the corresponding tables unless a different
name is requested in parentheses.
Query 1 Return the title, platform and number of seasons of series with at least 5 but no more
than 10 seasons while having IMDB rating of 4 or lower. Order results by title ascending and
platform descending.
Query 2 Find the average Rotten Tomatoes rating (avgrottentomatoes) of series that have no
date when the series was added and have either no age group the series is intended for or
the age group is ’18+’. When computing the average, only consider non-negative Rotten
Tomatoes ratings. The value of avgrottentomatoes should be output rounded to two decimal
places (you may use ROUND() function).
2
Query 3 Return the title, year, first director (director1), second director (director2), and lan-
guage of all movies made after 2000, available in at least one of the following languages:
English, Spanish, or French, and having at least two directors who have the same last name
’Wachowski’ but who are different people (i.e., the values of director attribute needs to be
different but contain ’Wachowski’). director1 should always be lexicographically smaller than
director2. Both director1 and director2 have to be Wachowski, even if there are other direc-
tors who are not Wachowski. Also, there should be no duplicate rows in the output. Order
results by language ascending, year descending, and finally title ascending.
Query 4 Return the id (id) and title of all movies and series that are animation but not horror.
Return an additional column called streamingtype that has the value 'movie' or 'series'
depending on the source. Order results by the title in descending order and streamingtype in
ascending order.
Query 5 Return the total number of tuples in the series relation (numtuples), total number of
tuples with an imdbrating value (numimdbrating), earliest year and last year for a series
(minyear, maxyear), average minimum age series are intended for (avgcontentrating), and
average number of seasons in series (avgseasons). Ignore any tuples with contentrating values
that are not in the d+ format where d is a whole number.
Format your averages with only two decimal places! Remember to carry out all arithmetic
operations with full precision and only show two decimal places when producing the final
output. For example, do not round values before summing or averaging them.
Note: This query requires you to do type casting. Often there are minor differ-
ences in floating point operations on different Postgresql servers. So, for simplic-
ity I have decided to ask everyone to use the same casting: cast contentrating as
an integer and averages as numeric(5,2). Format for doing this is as follows:
select '18'::int, '4.1234123412'::numeric(5,2) ;
Query 6 Find directors who have directed both a movie and a series available in the same country
of either United States or India. Return the name of directors (director) and the total
number of movies (nummovies) and series (numseries) they have directed that are available
in one of these countries. Order by the nummovies, numseries, and director ascending.
Query 7 Find platforms that have more than 40 movies. Return the name of these platforms
(platform), and the total number of movies (nummovies) and languages (numlanguages) that
are available on each of these platforms. Order results by nummovies descending, platform
ascending, and numlanguages descending.
Query 8 For each language that does not have movies with the Rotten Tomatoes rating of at least
60, return the language, the total number of genres (numgenres) and movies (nummovies) in
that language, and the spread of Rotten Tomatoes ratings (spreadrotten) for these movies.
Spread is defined as the absolute value of the difference between the maximum and the min-
imum values. Order results by spreadrotten descending, nummovies descending, numgenres
descending, and language ascending.
Query 9 For all cast members who appeared in the same movie with Adam Sandler in more than
one movie, return their name and the number of times they appeared together with Adam
Sandler (nummovies). Order results by nummovies descending and castname ascending.
Query 10 Return the name of all cast members who were cast in a series released after 2000 and
have also directed a series but were not cast in any movies. Order the results by castname
ascending.
3
SUBMISSION INSTRUCTIONS. You will use Submitty for this homework.
Please submit each query to the appropriate box in Submitty.
You must add a semi-colon at the end of each query to make sure it runs properly.
If you want to provide any comments, all SQL comments must be preceded by a double hyphen:
-- Example comment.
4
Database Schema
This database is merged from multiple datasets, containing data for movies and TV Series, both
appearing in various streaming platforms.
Note that this is real data, scraped from websites and then parsed, so it may be noisy. Make a
habit of using simple queries to first explore the data to understand various issues.
create table movies(
movieid int primary key
, title varchar(1000)
, year int -- year the movie was made
, contentrating varchar(10) -- age group the movie is intended for
, imdbrating float -- imdb rating
, rottentomatoes float -- rotten tomatoes rating
, runtime int
, date_added date -- date movie is added to Netflix
, duration varchar(40)
, description text
) ;
-- Which countries the movie is available in
create table moviescountry (
movieid int
, country varchar(100)
, primary key (movieid, country)
, foreign key (movieid) references movies(movieid)
) ;
-- Directors of the movie
create table moviesdirectors (
movieid int
, director varchar(100)
, primary key (movieid, director)
, foreign key (movieid) references movies(movieid)
) ;
create table moviesgenres (
movieid int
, genre varchar(100)
, primary key (movieid, genre)
, foreign key (movieid) references movies(movieid)
) ;
-- Languages the movie is available in
create table movieslanguages (
movieid int
, language varchar(100)
, primary key (movieid, language)
, foreign key (movieid) references movies(movieid)
) ;
-- People who starred in the movie
create table moviescast (
movieid int
, castname varchar(100)
, primary key (movieid, castname)
, foreign key (movieid) references movies(movieid)
) ;
5
-- Which platform the movie is in (if ispresent is True),
-- or not in (if ispresent is False)
create table moviesonplatform (
movieid int
, platform varchar(100)
, ispresent boolean
, primary key (movieid, platform)
, foreign key (movieid) references movies(movieid)
) ;
-- TV series on various platforms
create table series (
seriesid int primary key
, title varchar(400)
, yearreleased int
, contentrating varchar(40) -- age group the series is intended for
, imdbrating float -- imdb rating
, rottentomatoes int -- rotten tomatoes rating
, description text
, seasons int -- how many seasons are available
, date_added date -- date series is added to Netflix
) ;
-- Which platform the series is available in
create table seriesonplatform (
seriesid int
, platform varchar(100)
, primary key (seriesid, platform)
, foreign key (seriesid) references series(seriesid)
) ;
-- People who starred in the series
create table seriescast (
seriesid int
, castname varchar(100)
, primary key (seriesid, castname)
, foreign key (seriesid) references series(seriesid)
) ;
-- Categories for series
create table seriescategory (
seriesid int
, category varchar(100)
, primary key (seriesid, category)
, foreign key (seriesid) references series(seriesid)
) ;
-- Countries the series is available in
create table seriescountry (
seriesid int
, country varchar(100)
, primary key (seriesid, country)
, foreign key (seriesid) references series(seriesid)
) ;
-- Directors for the series
create table seriesdirectors (
seriesid int
6
, director varchar(100)
, primary key (seriesid, director)
, foreign key (seriesid) references series(seriesid)
) ;
-- Genre from series
create table seriesgenres (
seriesid int
, genre varchar(100)
, primary key (seriesid, genre)
, foreign key (seriesid) references series(seriesid)
) ;
7

欢迎咨询51作业君
51作业君

Email:51zuoyejun

@gmail.com

添加客服微信: abby12468