COMP1521

  COMP1521 19T2 Computer Systems Fundamentals Marks Group? Submit Deadline Late Penalty Assessment Assignment 2 My Very Own Malloc/Free! Objectives • to give you experience writing C code to manipulate memory • to give you further experience with data structures in C Admin 11 (towards total course mark) This assignment is completed individually give cs1521 assign2 myHeap.c or via Webcms3 Tue 13 Aug 2019, 08:59:59 0.08 marks per hour late (approx 1.9 marks per day) off the ceiling (e.g. if you are 36 hours late, your maximum possible mark is 6.1/9) • 9 marks for correct performance, as determined by automated testing on a range of test cases, some of which will be provided for you to test your code as you write it; • 1 mark for commenting the code: you don't need a comment on every line, but roughly one comment on each block of C statements that does a meaningful task; • 1 mark for readable code: sensible names, consistent use of indentation to highlight control structures For a guide to style, use the code in the lectures and tute solutions. If your submitted code won't compile, your maximum possible performance mark is 3/9. The solution: make sure your code compiles and runs before submitting. If your submitted code fails all the performance tests, your maximum possible performance mark is 5/9. In both cases, the actual mark will be determined by a tutor's assessment of how close your code is to working. Background The C standard library provides a mechanism to dynamically manage the usage of memory in a program: via functions like malloc(3) and free(3), programmers can create and destroy data objects as needed. This means that you can allocate just as much memory as you need to hold your data, without the potential wasted space which can occur when using fixed size data objects (e.g. arrays of a pre-defined size). In this assignment, you will write functions to manage chunks of memory within a large region of memory, analogous to how malloc(3) and free(3) work. There are two types of chunks: free chunks and allocated chunks. We'll describe the precise algorithms needed to determine which chunks to allocate and free (and merge) below, but first a few diagrams showing what we need to do. An example of how some simple myMalloc()'s change the memory: An example of how some more malloc/free operations change the memory: And yet another example of how some malloc/free operations change the memory: We acquire the region of memory by some other means: our implementation is of a sub-allocator, where we get one large region from the system, and allocate out of that; or we could also make our implementation act as a top-level allocation library by taking advantage of the memory management system calls. Sub-allocators are often used in large applications that need certain guarantees (usually of performance) about the way memory allocation works, or which allocate lots of complexly-linked objects that can all be released at the same time: compilers, graphics engines, and web browsers all use sub-allocators to improve performance. Makefile myHeap.c myHeap.h test1.c test2.c test3.c data Setting Up Create a new directory for working on the assignment, change into that directory, and then run the command: $ unzip /web/cs1521/19T2/assignments/assign2/assign.zip If you're working at home, download assign.zip from the link above, and extract it using unzip. It's fine to work on your own machine, but remember to always test your code on the CSE machines before submitting. If you've done the above correctly, you should now find the following files in the directory: a set of dependencies used to control compilation a skeleton for the functions you are required to write a complete interface for the memory management functions a simple test program: initialises the heap, then does a heap dump, to check the initHeap function. a simple test program: implements a sorted linked list, generating random numbers and storing them in order in an initially-empty list, then calling a recursive free list function to clean up the memory allocated to the list. a general-purpose test program that reads an input describing a sequence of malloc and free requests, which trigger calls to myMalloc() or myFree(). A trace of a simple sequence of allocations and deallocations, formatted as input to the test3 program; you should make up more data files to test your heap functions thoroughly. Exercise The aim of this exercise is to complete the program skeleton in myHeap.c, implementing the initHeap(), myMalloc(), and myFree() functions. There are two other functions, freeHeap() and dumpHeap(), that are already completed. You can add as many private functions as you like; define them as static to make them private. The Heap The heap, for our purposes, is a large region of memory which is composed of free and allocated chunks. Each chunk has a header, containing a status (FREE or ALLOC) and the size of the entire chunk (including the header). The rest of the chunk after the header is a data block. The following diagram shows the two different types of chunks: Alongside the heap region is an array of pointers to the free chunks in the heap. Its usage is described in more detail below. By necessity, the heap is a global variable. The Heap variable is a struct, which groups together all the heap's state. int initHeap (int N) This function allocates a region of memory of size N bytes, in which all our allocations will happen. If N is less than the minimum heap size (4096), then N is set to the minimum heap size. The value of N is also rounded up to the nearest multiple of 4 (e.g., 5001 would be rounded up to 5004). The function sets Heap.heapMem to point to the first byte of the allocated region, and zeroes out the entire region. It then initialises the region to be a single large free-space chunk. Finally, the function allocates a freeList array, of size , containing pointers to the free chunks in heapMem, and sets the first item in this array to the single free-space chunk. If it is successfully able to do all of the above, initHeap() returns 0; otherwise, it returns -1. The initial state of the heap is as shown below: N/MIN_CHUNK void *myMalloc (int N) The myMalloc() function finds the smallest free chunk larger than N + HeaderSize. If the free chunk is smaller than N + HeaderSize + MIN_CHUNK, allocate the whole chunk. If the free chunk is larger than this, then split it into two chunks, with the lower chunk allocated for the request, and the upper chunk being a new free chunk. The function returns a pointer to the first usable byte of data in the chunk as in the diagram below: If a new free chunk is created, it is inserted into the appropriate location in the free list — the pointers in the free list should be maintained in ascending address order — as shown below: If myMalloc() is called with a value less than 1, it should return NULL. If N is not a multiple of 4, it should be increased to the next multiple of 4 (e.g., myMalloc(14) behaves as if the call had been myMalloc(16)). void myFree (void *p) The myFree() function releases an allocated chunk and places it into the free list. The following diagram shows an example of this: However: just returning free chunks will lead to fragmentation, and future calls to myMalloc may find it harder (or even impossible) to find an appropriately-sized block. Instead, we avoid having adjacent free chunks: the free chunk(s) adjacent to the new free chunk are merged into a single larger free chunk. This could involve just two chunks being merged, or could involve three chunks as shown in the diagram below: Note: if the argument to myFree() does not represent an allocated chunk in the heap, the program should print the message “Attempt to free unallocated chunk\n” to stderr, and then invoke exit(1). This behaviour also applies if the argument is an address somewhere in the middle of an allocated block; i.e., the only valid argument to myFree() is a pointer to the start of the data block in an allocated chunk. Testing Output from test1 The test1.c program simply initialises the heap to whatever size the user requests (rounded to 4096 if less than this), and then dumps the contents of the heap. It should contain a single free block whose size is whatever the heap was initialised with. For example: $ ./test1 5000 +00000 (F, 5000) $ ./test1 10000 +00000 (F,10000) $ ./test1 100 +00000 (F, 4096) $ ./test1 5001 +00000 (F, 5004) The output from heapDump() is a sequence of triples, where each triple represents one chunk and is of the form +NNNNN (X, SSSSS), where NNNNN is the offset of the chunk from the start of the heap, X is either F for “free” or A for “allocated”, and the SSSSS is the size of the chunk. Output from test2 The test2.c implements a sorted linked list, which it fills from a sequence of 100 random numbers (since it uses the same seed each time, it will always generate the same sequence). It prints the list and dumps the heap after each insert, and prints the final heap after the list is freed. Since it generates a lot of output (potentially useful for debugging), we only show the first few inserts and the final state of the list and heap. It uses a 10000-byte heap, so after the list is freed, there should be a single 10000-byte free chunk. $ ./test2 L = 83 +00000 (A, 24) +00024 (F, 9976) L = 83->86 +00000 (A, 24) +00024 (A, 24) +00048 (F, 9952) L = 77->83->86 +00000 (A, 24) +00024 (A, 24) +00048 (A, 24) +00072 (F, 9928) L = 15->77->83->86 +00000 (A, 24) +00024 (A, 24) +00048 (A, 24) +00072 (A, 24) +00096 (F, 9904) ... lots of output elided ... L = 2->3->5->5->8->11->11->12->13->13->14->15->15->19->21->21->22->23->24->24->25->26 ⤷->26->26->26->27->27->29->29->29->29->30->32->34->35->35->36->36->37->39->39->40->42 ⤷->43->45->46->49->50->51->54->56->56->57->58->59->60->62->62->62->63->64->67->67->67 ⤷->67->68->68->69->70->70->72->73->73->76->76->77->78->80->81->82->82->83->84->84->84 ⤷->86->86->86->87->88->90->91->92->93->93->94->95->96->98->99 +00000 (A, 24) +00024 (A, 24) +00048 (A, 24) +00072 (A, 24) +00096 (A, 24) +00120 (A, 24) +00144 (A, 24) +00168 (A, 24) +00192 (A, 24) +00216 (A, 24) +00240 (A, 24) +00264 (A, 24) +00288 (A, 24) +00312 (A, 24) +00336 (A, 24) +00360 (A, 24) +00384 (A, 24) +00408 (A, 24) +00432 (A, 24) +00456 (A, 24) +00480 (A, 24) +00504 (A, 24) +00528 (A, 24) +00552 (A, 24) +00576 (A, 24) +00600 (A, 24) +00624 (A, 24) +00648 (A, 24) +00672 (A, 24) +00696 (A, 24) +00720 (A, 24) +00744 (A, 24) +00768 (A, 24) +00792 (A, 24) +00816 (A, 24) +00840 (A, 24) +00864 (A, 24) +00888 (A, 24) +00912 (A, 24) +00936 (A, 24) +00960 (A, 24) +00984 (A, 24) +01008 (A, 24) +01032 (A, 24) +01056 (A, 24) +01080 (A, 24) +01104 (A, 24) +01128 (A, 24) +01152 (A, 24) +01176 (A, 24) +01200 (A, 24) +01224 (A, 24) +01248 (A, 24) +01272 (A, 24) +01296 (A, 24) +01320 (A, 24) +01344 (A, 24) +01368 (A, 24) +01392 (A, 24) +01416 (A, 24) +01440 (A, 24) +01464 (A, 24) +01488 (A, 24) +01512 (A, 24) +01536 (A, 24) +01560 (A, 24) +01584 (A, 24) +01608 (A, 24) +01632 (A, 24) +01656 (A, 24) +01680 (A, 24) +01704 (A, 24) +01728 (A, 24) +01752 (A, 24) +01776 (A, 24) +01800 (A, 24) +01824 (A, 24) +01848 (A, 24) +01872 (A, 24) +01896 (A, 24) +01920 (A, 24) +01944 (A, 24) +01968 (A, 24) +01992 (A, 24) +02016 (A, 24) +02040 (A, 24) +02064 (A, 24) +02088 (A, 24) +02112 (A, 24) +02136 (A, 24) +02160 (A, 24) +02184 (A, 24) +02208 (A, 24) +02232 (A, 24) +02256 (A, 24) +02280 (A, 24) +02304 (A, 24) +02328 (A, 24) +02352 (A, 24) +02376 (A, 24) +02400 (F, 7600) After freeList ... +00000 (F,10000) Output from test3 The output from the test3.c depends on the input. It takes lines of the form: • a = malloc size where a can be any character between a and z inclusive, and attempts to allocate size bytes; and • free a where a can be any character between a and z inclusive, and attempts to free a. test3 also dumps the locations of the allocated variables (as heap offsets), and dumps the contents of the heap, to help you debug. Here are the first few lines of output, using the supplied data file: $ ./test3 10000 < data +00000 (F,10000) [a] +00008 +00000 (A, 108) +00108 (F, 9892) [a] +00008 [b] +00116 +00000 (A, 108) +00108 (A, 208) +00316 (F, 9684) [a] +00008 [b] +00116 [c] +00324 +00000 (A, 108) +00108 (A, 208) +00316 (A, 308) +00624 (F, 9376) [a] +00008 [c] +00324 +00000 (A, 108) +00108 (F, 208) +00316 (A, 308) +00624 (F, 9376) [a] +00008 [c] +00324 [d] +00116 ... Reminder: the sizes of the chunks are always eight bytes more than the requested amount of memory, because they include the 8-byte header. You should design some inputs that test various malloc and free and merge scenarios. Draw them on a diagram first, then write the input file, and then run it with test3 to see whether it produces the expected results. Challenge exercises Achieve great distinction and honour by completing large amounts of meaningless busywork for little gain! Worth kudos, but no marks. Rather than using a best-fit approach to choosing the free block, choose the largest free block. Find some interesting inputs for test3 that show how the performance of the two strategies differs. Does best-fit always use less space than biggest-fit? Does bestfit work faster than biggest-fit? Investigate ways of requesting an amount of memory from the operating system (sbrk(2) and mmap(2) are a good place to begin) and add a #define to your implementation that allows it to use one of those mechanisms to request its initial memory areas, instead of using malloc(3) and free(3) yourself directly. It's unusual that applications know how much memory they'll need a priori, except in very particular circumstances (although it is possible, and quite common in embedded systems). How would you change your implementation to support growing the heap region when you run out of space? Benchmark your implementation, using (e.g.,) gprof(1) or gperftools; construct some meaningful tests, and try to be scientific. Based on your findings, what changes would you need to make to be able to allocate a chunk in constant time? Read some real-world memory allocators. For example, phkmalloc is described in Poul-Henning Kamp's excellent 1998 white-paper, “malloc(3) revisited”. Or, read about jemalloc, described at BSDCan 2006 in “A Scalable Concurrent malloc(3) Implementation for FreeBSD” (though it has since significantly evolved from that description) — and there's a good chance you're directly or indirectly using it. There are others described in the literature; see if you can find some more implementations, and compare their behaviour and performance. COMP1521 19T2: Computer Systems Fundamentals is brought to you by the School of Computer Science and Engineering at the University of New South Wales, Sydney. For all enquiries, please email the class account at cs1521@cse.unsw.edu.au CRICOS Provider 00098G  

51作业君 51作业君

扫码添加客服微信

添加客服微信: IT_51zuoyejun