PKU-ICS
Malloc Lab: Writing a Dynamic Storage Allocator
1 Introduction
In this lab you will be writing a general purpose dynamic storage allocator for C programs; that is, your own version of the malloc, free, realloc, and calloc functions. You are encouraged to explore the design space creatively and implement an allocator that is correct, efficient, and fast.
2 ANote
This lab features a wide design space; feel free to be creative during this lab. In order to get the most out of this lab, we strongly encourage you to start early. The total time you spend designing and debugging can easily eclipse the time you spend coding.
Bugs can be especially pernicious and difficult to track down in an allocator, and you will probably spend a significant amount of time debugging your code.
3 Logistics
This is an individual project. You should do this lab on one of the ics machines.
4 Hand Out Instructions
Start by downloading malloclab-handout.tar from Autolab to a protected directory in which you plan to do your work. Then give the command tar xvf malloclab-handout.tar. This will cause a number of files to be unpacked into the directory.
The only file you will be modifying and turning in is mm.c, which contains your solution. The mdriver.c program is a driver program that allows you to locally evaluate the performance of your solution in the same way that Autolab will evaluate your final submission. Use the command make to generate the driver code, and run it with the command ./mdriver.
1
5 How to Work on the Lab
Your dynamic storage allocator will consist of the following functions, which are declared in mm.h and defined in mm.c:
int mm_init(void);
void *malloc(size_t size);
void free(void *ptr);
void *realloc(void *ptr, size_t size); void *calloc (size_t nmemb, size_t size); void mm_checkheap(int);
The mm-naive.c file we have given you implements everything correctly but naively. In addition, the mm-textbook.c file in the handout directory implements the example implicit list allocator described in your textbook.
Because we are running on 64-bit machines, your allocator must be coded accordingly, with one exception: the size of the heap will never be greater than or equal to 232 bytes. This does not imply anything about the location of the heap, but there is a neat optimization that can be done using this information. However, be very, very careful if you decide to take advantage of this fact. There are certain invalid optimizations that will pass all the driver checks because of the limited range of functionality we can check in a reasonable amount of time, so we will be manually looking over your code for these violations. If you do not understand this paragraph, you should re-read the x86-64 section of the text.
You may use mm.c, mm-naive.c, or the book’s example code (available from csapp.cs.cmu.edu) as starting points for your own mm.c file. Implement the functions (and possibly define other private static helper functions), so that they obey the following semantics:
• mm init: Performs any necessary initializations, such as allocating the initial heap area. The return value should be -1 if there was a problem in performing the initialization, 0 otherwise. Warnings:
– You must reinitialize all of your global pointers in this function. – Donotcallmeminitfromthisfunction!
Every time the driver executes a new trace, it resets your heap to the empty heap by calling your mm init function.
• malloc: The malloc routine returns a pointer to an allocated block payload of at least size bytes. The entire allocated block should lie within the heap region and should not overlap with any other allocated chunk.
Your malloc implementation must always return 8-byte aligned pointers.
• free: The free routine frees the block pointed to by ptr. It returns nothing. This routine is only guaranteed to work when the passed pointer (ptr) was returned by an earlier call to malloc, calloc, or realloc and has not yet been freed. free(NULL) has no effect.
2
• realloc: The realloc routine returns a pointer to an allocated region of at least size bytes with the following constraints:
– ifptrisNULL,thecallisequivalenttomalloc(size);
– if size is equal to zero, the call is equivalent to free(ptr) and should return NULL;
– ifptrisnotNULL,itmusthavebeenreturnedbyanearliercalltomallocorreallocand not yet have been freed. The call to realloc takes an existing block of memory, pointed to by ptr — the old block. It then allocates a region of memory large enough to hold size bytes and returns the address of this new block. Note that the address of the new block might be the same as the old block (perhaps there was free space after the old block and it could just be extended, or the new size was smaller than the old size); or it might be different, depending on your implementation, the amount of internal fragmentation in the old block, and the size of the realloc request. If the call to realloc does not fail and the returned address is different than the address passed in, the old block has been freed and should not be used, freed, or passed to realloc again.
The contents of the new block are the same as those of the old ptr block, up to the minimum of the old and new sizes. Everything else is uninitialized. For example, if the old block is 8 bytes and the new block is 12 bytes, then the first 8 bytes of the new block are identical to the first 8 bytes of the old block and the last 4 bytes are uninitialized. Similarly, if the old block is 8 bytes and the new block is 4 bytes, then the contents of the new block are identical to the first 4 bytes of the old block.
• calloc: Allocates memory for an array of nmemb elements of size bytes each and returns a pointer to the allocated memory. The memory is set to zero before returning.
Note: Your calloc will not be graded on throughput or performance. A correct, simple im- plementation will suffice.
• mm checkheap: The mm checkheap function (the heap consistency checker, or simply heap checker) scans the heap and checks it for correctness (e.g., are the headers and footers identical). Your heap checker should run silently until it detects some error in the heap. Then, and only then, should it print a message and terminate the program by calling exit. It is very important that your heap checker run silently; otherwise, it will produce too much output to be useful on the large traces.
A quality heap checker is essential for debugging your malloc implementation. Many malloc bugs are too subtle to debug using conventional gdb techniques. The only effective technique for some of these bugs is to use a heap consistency checker. When you encounter a bug, you can isolate it with repeated calls to the consistency checker until you find the instruction that corrupted your heap. Because of the importance of the consistency checker, it will be graded. If you ask a member of the course staff for help, the first thing we will do is ask to see your checkheap function, so please write this function before coming to see us!
The mm checkheap function takes a single integer argument that you can use any way you want. One very useful technique is to use this argument to pass in the line number of the call site:
mm_checkheap(__LINE__);
3
If mm checkheap detects a problem with the heap, it can print the line number where mm checkheap was called, which allows you to call mm checkheap at numerous places in your code while you are debugging.
These semantics match the semantics of the corresponding libc routines (note that mm checkheap does not have a corresponding function in libc). Type man malloc to the shell for complete documentation.
6 Support Routines
The memlib.c package simulates the memory system for your dynamic memory allocator. You can invoke the following functions in memlib.c:
7
• void *mem sbrk(int incr): Expands the heap by incr bytes, where incr is a positive non-zero integer, and returns a generic pointer to the first byte of the newly allocated heap area. The semantics are identical to the Unix sbrk function, except that mem sbrk accepts only a positive non-zero integer argument.
• void *mem heap lo(void): Returns a generic pointer to the first byte in the heap.
• void *mem heap hi(void): Returns a generic pointer to the last byte in the heap.
• size t mem heapsize(void): Returns the current size of the heap in bytes.
• size t mem pagesize(void): Returns the system’s page size in bytes (4K on Linux systems).
The Trace-driven Driver Program
The driver program mdriver.c in the malloclab-handout.tar distribution tests your mm.c pack- age for correctness, space utilization, and throughput. The driver program is controlled by a set of trace files that are included in the malloclab-handout.tar distribution. Each trace file contains a sequence of allocate and free directions that instruct the driver to call your malloc and free routines in some se- quence. The driver and the trace files are the same ones we will use when we grade your handin mm.c file.
When the driver program is run, it will run each trace file 12 times: once to make sure your implementation is correct, once to determine the space utilization, and 10 times to determine the performance.
The driver mdriver.c accepts the following command line arguments. The normal operation is to run it with no arguments, but you may find it useful to use the arguments during development.
• -p: Runs each trace file 12 times: once to make sure your implementation is correct, once to deter- mine the space utilization, and 10 times to determine the performance.
• -t <tracedir>: Look for the default trace files in directory tracedir instead of the default directory defined in config.h.
4
8
• -f <tracefile>: Use one particular tracefile instead of the default set of tracefiles for test- ing correctness and performance.
• -c <tracefile>: Run a particular tracefile exactly once, testing only for correctness. This option is extremely useful if you want to print out debugging messages.
• -h: Print a summary of the command line arguments.
• -l: Run and measure libc malloc in addition to the student’s malloc package. This is interest-
ing if you want to see how fast a real malloc package runs.
• -V: Verbose output. Print additional diagnostic information as each trace file is processed. Useful
during debugging for determining which trace file is causing your malloc package to fail.
• -v <verbose level>: This optional feature lets you manually set your verbose level to a par- ticular integer.
• -d <i>: At debug level 0, very little validity checking is done. This is useful if you are mostly done but just tweaking performance.
At debug level 1, every array the driver allocates is filled with random bits. When the array is freed or reallocated, we check to make sure the bits have not been changed. This is the default.
At debug level 2, every time any operation is done, all arrays are checked. This is very slow but useful to discover problems very quickly.
• -D: Equivalent to -d2.
• -s <s>: Time out after s seconds. The default is to never time out.
Programming Rules
• You are writing a general purpose allocator. You may not solve specifically for any of the traces—we will be checking for this. Any allocator that attemps to explicitly determine which trace is running (e.g., a sequence of if statements at the beginning of the trace) and change its behavior based on that trace’s pattern of allocations will receive a penalty of 20 points.
• You should not change any of the interfaces in mm.h. However, we strongly encourage you to use static helper functions in mm.c to break up your code into small, easy-to-understand segments.
• You should not invoke any external memory-management related library calls or system calls. The use of the libc malloc, calloc, free, realloc, sbrk, brk, or any other memory management packages is strictly prohibited.
• You are not allowed to define any global data structures such as arrays, trees, or lists in your mm.c program. However, you are allowed to declare global structs and scalar variables such as integers, floats, and pointers in mm.c.
5
The reason for this restriction is that the driver cannot account for such global variables in its memory utilization measure. If you need space for large data structures, you can put them at the beginning of the heap.
• You are not allowed to simply hand in the code for the allocators from the CS:APP or K&R books. If you do so, you will receive no credit.
However, we encourage you to study these examples and to use them as starting points. For example, you might modify the CS:APP code to use an explicit list with constant-time coalescing. Or you might modify the K&R code to use constant-time coalescing. Or you might use either one as the basis for a segregated list allocator. Please remember, however, that your allocator must run on 64-bit machines.
• It is okay to look at any high-level descriptions of algorithms found in the textbook or elsewhere, but it is not acceptable to copy or look at any code of malloc implementations found online or in other sources, except for the implicit list allocator described in your book or K&R.
• Your allocator must always return pointers that are aligned to 8-byte boundaries. The driver will check this requirement.
• Your code must compile without warnings. Warnings often point to subtle errors in your code; when- ever you get a warning, you should double-check the corresponding line to see if the code is really doing what you intended. If it is, then you should eliminate the warning by tweaking the code (for instance, one common type of warning can be eliminated by adding a type-cast where a value is being converted from one type of pointer to another). We have added flags in the Makefile to force your code to be error-free. You may remove those flags during development if you wish, but please realize that we will be grading you with those flags activated.
9 Evaluation
The grading of the final hand-in will be based on the performance of your allocator on the given traces, the quality of your heap checker, and your coding style. There are a total of 120 points for the final hand-in:
Performance (100 points). Two metrics will be used to evaluate your solution:
• Space utilization: The peak ratio between the aggregate amount of memory used by the driver (i.e., allocated via malloc but not yet freed via free) and the size of the heap used by your allocator. The optimal ratio equals 1. You should find good policies to minimize fragmentation in order to make this ratio as close as possible to the optimal.
• Throughput: The average number of operations completed per second.
The driver program summarizes the performance of your allocator by computing a performance index,
0 ≤ P ≤ 100, which is a weighted sum of the space utilization and throughput P(U,T)=100wmin1, U−Umin +(1−w)min1, T−Tmin
Umax − Umin Tmax − Tmin 6
where U is your space utilization, T is your throughput, Umax and Tmax are the estimated space utilization and throughput of an optimized malloc package, and Umin are Tmin are minimum space utilization and throughput values, below which you will receive 0 points.1 The performance index favors space utilization over throughput: w = 0.60.
Observing that both memory and CPU cycles are expensive system resources, we adopt this formula to en- courage balanced optimization of both memory utilization and throughput. Since each metric will contribute at most w and 1−w to the performance index, respectively, you should not go to extremes to optimize either the memory utilization or the throughput only. To receive a good score, you must achieve a balance between utilization and throughput.
The 100 performance points ($perfpoints) will be allocated as a function of the performance index ($perfindex):
if ($perfindex < 50) { $perfpoints = 0;
} else {
$perfpoints = $perfindex; }
Submissions with an index below 50 will get an autograded score of 0 points. After the lab is complete, we will manually update the autograded score as follows:
• If you hand in the K&R allocator or CS:APP implicit list allocator: 0 pts • If you make a reasonable attempt at a non-implicit list allocator:
– Doesn’t compile: 10 pts
– Compiles, but segfaults, times out, or fails some traces: 35 pts – Passes all of the traces, but too slow or space inefficient: 50 pts
We give you a handful of traces. Some of them are smaller traces that don’t count towards your memory utilization or throughput. These will be useful for debugging. In the mdriver’s output, you will see these marked without a ’*’ next to them. The traces that count towards both your memory utilization and throughput are marked with a ’*’ in mdriver’s output. There are two traces which only count towards your memory utilization - these are marked with a ’u’. Likewise there is one trace which only counts towards your throughput, and is marked with a ’p’.
Note: The performance score that counts is the one computed on Autolab after you submit. The throughput component of this score may be different from that computed when you run the driver on the class machines. Make sure that you check your score after you submit to avoid any surprises.
1The values for Umin, Umax, Tmin, and Tmax are constants in the driver (0.70, 0.90, 4,000 Kops/s, and 14,000 Kops/s). This means that once you beat 90% utilization and 14000 Kops/s, your performance index is perfect.
7
Heap Consistency Checker (10 points). Ten points will be awarded based on the quality of your implemen- tation of mm checkheap. It is up to your discretion how thorough you want your heap checker to be. The more the checker tests, the more valuable it will be as a debugging tool. However, to receive full credit for this part, we require that you check all of the invariants of your data structures. Some examples of what your heap checker should check are provided below.
• Checking the heap (implicit list, explicit list, segregated list):
– Check epilogue and prologue blocks.
– Check each block’s address alignment.
– Check heap boundaries.
– Check each block’s header and footer: size (minimum size, alignment), previous/next allo- cate/free bit consistency, header and footer matching each other.
– Check coalescing: no two consecutive free blocks in the heap. • Checking the free list (explicit list, segregated list):
– All next/previous pointers are consistent (if A’s next pointer points to B, B’s previous pointer should point to A).
– Allfreelistpointerspointsbetweenmemheaplo()andmemheaphigh().
– Count free blocks by iterating through every block and traversing free list by pointers and see if
they match.
– All blocks in each list bucket fall within bucket size range (segregated list).
Style (10 points). Your code should follow the Style Guidelines posted on the course Web site. In particular:
• •
• •
10
Make
Your code should be decomposed into functions and use as few global variables as possible. You should use macros, inline functions, or packed structs to isolate pointer arithmetic to a few places.
Your mm.c file must begin with a header comment that gives an overview of the structure of your free and allocated blocks, the organization of the free list, and how your allocator manipulates the free list.
In addition to this overview header comment, each function must be preceded by a header comment that describes what the function does.
You will want to use inline comments to explain code flow or code that is tricky.
Handin Instructions
sure you have included your name and Student ID in the header comment of mm.c.
8
Hand in your mm.c file by uploading it to Autolab. You may submit your solution as many times as you wish until the due date.
Only the last version you submit will be graded.
For this lab, you must upload your code for the results to appear on the class status page.
11 Hints
• The measured throughput on your class machines and on Autolab might be different, even though these machines are identical. Your job is to ensure that your allocator is fast enough that these differ- ences don’t matter.
• Use the mdriver -c option or -f option. During initial development, using tiny trace files will simplify debugging and testing.
• Use the mdriver -V options. The -V option will also indicate when each trace file is processed, which will help you isolate errors.
• Use the mdriver -D option. This does a lot of checking to quickly find errors.
• Use a debugger. A debugger will help you isolate and identify out-of-bounds memory references. Modify the Makefile to pass the -g option to gcc and not to pass the -O3 option to gcc when you are using a debugger. But do not forget to restore the Makefile to the original when doing performance testing.
• Usegdb’swatchcommandtofindoutwhatchangedsomevalueyoudidnotexpecttohavechanged.
• Encapsulate your pointer arithmetic in C preprocessor macros or inline functions. Pointer arithmetic in memory managers is confusing and error-prone because of all the casting that is necessary. You can significantly reduce the complexity by writing macros for your pointer operations. See the text for examples.
• Remember we are working with 64-bit linux machines. Pointers take up 8 bytes of space, so you should understand the macros in the book and port them to 64-bit machines. Notably, on the 64-bit linux machines, sizeof(size t) == 8.
• Use your heap consistency checker. We are assigning ten points to your mm checkheap function for a reason. A good heap consistency checker will save you hours and hours when debugging your malloc package. You can use your heap checker to find out where exactly things are going wrong in your implementation (hopefully not in too many places!). Make sure that your heap checker is detailed. To be useful, your heap checker should only produce output when it detects an error. Every time you change your implementation, one of the first things you should do is think about how your mm checkheap will change, what sort of tests need to be performed, and so on.
• Use a profiler. You may find the gprof tool helpful for optimizing performance.
9
12
•
•
•
Keep backups. Whenever you have a working allocator and are considering making changes to it, keep a backup copy of the last working version. It is very common to make changes that inadvertently break the code and then have trouble undoing them.
Versioning your implementation. You may find it useful to manage a couple of different versions of implementation (e.g., explicit list, segregated list) during the assignment. Since mdriver looks for mm.c, creating a symbolic link between files is useful in this case. For example, you can create a symbolic link between mm.c and your implementation such as mm-explicit.c with command line ln -s mm-explicit mm.c. Now would also be an great time to learn an industrial-strength version control system like Git (http://git-scm.com).
Start early! It is possible to write an efficient malloc package with a few pages of code. However, we can guarantee that it will be some of the most difficult and sophisticated code you have written so far in your career. So start early, and good luck!
More Hints
Basically, you want to design an algorithm and data structure for managing free blocks that achieves the right balance of space utilization and speed. Note that this involves a trade-off. For space, you want to keep your internal data structures small. Also, while allocating a free block, you want to do a thorough (and hence slow) scan of the free blocks, to extract a block that best fits our needs. For speed, you want fast (and hence complicated) data structures that consume more space. Here are some of the design options available to you:
• Data structures to organize free blocks:
– Implicit free list
– Explicit free list
– Segregated free lists
• Algorithms to scan free blocks:
– First fit/next fit
– Blocks sorted by address with first fit – Best fit
You can pick (almost) any combination from the two. For example, you can implement an explicit free list with next fit, a segregated list with best fit, and so on. Also, you can build on a working implementation of a simple data structure to a more complicated one.
In general, we suggest that you start with an implicit free list (mm-textbook.c in your handout directory), then change this to an explicit list, and then use the explicit list as the basis for a final version based on segregated lists.
Good luck!
请加QQ:99515681 或邮箱:99515681@qq.com WX:codehelp