NYUTandonSchoolofEngineering
CS-GY6083,PrinciplesofDatabaseSystems,Spring2024
ProfPhyllisFrankl
HOMEWORK#5
Considerthisrelationaldatabaseschema:
Widget(wID,wName,wDesc,department,size,color,price)
Customer(cID,pwd,fName,lName,stNum,street,city,aptNum,
state,zip,bonusPoints)
Sale(cID,wID,sDateTime,status,notes,discount,tax)
Assumerecordsandattributeshavethefollowingsizes:
Widgetrecords:1024bytes
Customerrecords:512bytes
Salerecords:200bytes
cID,wID,sDateTimeeach8bytes
Statusis20bytes
RIDs(pointerstotreenodesandtorecordsondisk):8bytes
Assumethefollowingstatisticsaboutthedata:
1,000,000customers(10
6
)
10,000Widgets(10
4
)
100,000,000Sales(10
8
)
10%oftheSaleshavestatus“shipping”
Only10Saleshavestatus“extra-special”
1%ofcustomersareinzipcode11201
Assumethefollowingcharacteristicsofthediskandmainmemory:
10msforseek+rotationallatency
Transferrate20MB/sec
Blocksize=4K
Alittlemorethan100MBofmainmemoryavailable(enoughthatinblock-nestedjoinswecan
use100MBfortheouterrelationandstillhavesufficientmemoryforoneblockoftheinner
relation,oneblockofoutputbuffer,andmaybealsoanothersmallrelation)
Problems:
Forthefollowing,youmayroundoffanswerstogetnumbersthatareeasiertoworkwith.
1.Considerthequery:
SELECTzipFROMCustomerNATURALJOINSaleWHEREstatus=
“shipping”
a.Showrelationalalgebraexpressiontree,translateddirectlyfromtheSQL
b.Showtransformedrelationalalgebratreeinwhichtheselectoperationispushed
andanypossibleprojectoperationsarepushed.(Here“pushed”meanstheyare
lowerintheexpressiontree,i.e.appliedearlierintheevaluation).
c.Assumetherearenoindexes.Describethestepsinexecutingthisqueryand
analyzehowlongittakes.Youranswershouldusethetreefrompart(b)and
shouldclearlyindicate:
●Operationsperformed
●Algorithmsusedtoperformtheoperations(forproblem1,uselinear
searchforselectoperationsandblock-nestedjoinforjoinoperations)
●Forjoins,specifywhichistheouterrelation,whetheritfitsinmemoryand,
ifnot,howmanypassesareneeded
●Numberofseeks,totalseektime,amountofdatatransferred,total
transfertimeforeachpartoftheplan
●Totaltime
2.NowsupposethatthereisasecondaryB+indexontheSale.statusattribute
a.DeterminethelargestvalueofnsuchthatnrecordIDsandn-1keyswillfitintoa4KB
node.UseassumptionsonRIDandattributesizesinthepreamble.
b.HowmanylevelsofthetreeareneededtoindextheentireSalerelation?Youmay
assumethenodesarenotcompletelyfullandpickaconvenientnumberasthenumber
ofchildrenpernode(aslongasit’swithintherangespecifiedbytheB+treedefinition)
c.HowmuchtimeisneededtoretrieveonerecordfromSaleusingthisindex?Youmay
roundyouranswertothenearestmillisecond.
3.SupposethatthereisasparseprimaryB+indexontheCustomer.cidattribute
a.DeterminethelargestvalueofnsuchthatnrecordIDsandn-1keyswillfitintoa4KB
node.UseassumptionsonRIDandattributesizesinthepreamble.
b.Howmanyindexentriesareneeded,assumingoneindexentryperblock?Usethe
recordsizeandblocksizeinthepreamble
c.HowmanylevelsofthetreeareneededtoindextheentireCustomerrelation?Youmay
roundntoamoreconvenientnumber,butifyoudoso,specify.
d.HowmuchtimeisneededtoretrieveonerecordfromCustomerusingthisindex?You
mayroundyouranswertothenearestmillisecond.
e.Comparethetimetoretrieveacustomerrecordusingtheindextotheworstcasetime
usingalinearsearch.
4.SupposethatthereisasecondaryB+indexontheSale.statusattribute.Considerthisquery:
SELECTzipFROMCustomerNATURALJOINSaleWHEREstatus=
“extra-special”
Describeandanalyzethequeryplanusingtheindexesfromparts2and3.Youranswershould
includestepsanalogoustothoseinpart1.
5.Doyouthinkusingeitherorbothindexeswouldbeusefulfortheoriginalquery,
SELECTzipFROMCustomerNATURALJOINSaleWHEREstatus=
“shipping”
Brieflyjustifyyouranswer
6.[OPTIONAL:]Deviseandanalyzeaqueryplanforthefollowingqueryandnoteassumptions
youaremakingaboutthedistributionofpricesandhowtheyimpactyouranswer:
SELECTcID,fName,lNameFROMCustomerNATURALJOINSale
NATURALJOINWidgetWHEREstatus=“shipping”ANDprice<5