21 September 2010

Learning by Doing

For 6.172 (that's Performance Engineering of Software Systems for non-MITers), we were assigned our first project. The task was to optimize (or rewrite) the supplied code to solve three tasks (OCW has two of the three). We were given access to sixteen dual-hexacore machines with 48GB RAM each; that is twelve physical cores per machine, or twenty-four virtual cores with hyperthreading. The code was distributed via the class's own git repository, which was also used for submitting our beta and final code. Because I was a lowly froshie and was eager to be acknowledged, I spent most of my free time staring at the computer, coding.

For simplicity, I coded on the CSAIL machines so I would not have to worry about my code fragmenting across multiple machines (my CSAIL AFS locker, my laptop, and my desktop). I used a combination of vim and screen to view and edit multiple files at once, make to compile the project, gdb to debug, and git for version control. You'll notice that these are all terminal-based, open-source tools, which means that I did not have to rely on a GUI or purchase any programs. I do not want to resort to torrenting tools, and working in the terminal elicits weird looks from people. :D

In any case, I was having trouble tracing the cause of a bug in a program. Having read a book about basic debugging and working with gdb during the summer, I pulled out gdb and said book and began delving into code. Setting breakpoints was easy:

(gdb) break everybit.c:254

to set a breakpoint in file everybit.c at line 254.

I finally figured out that the way to get a signed char (in the range [-128, 127]) to map to [0, 255] while keeping the same bit representation was simply to cast it to an unsigned char. In other words, the map I wanted was [0,127] -> [0,127] and [-128,-1] -> [128,255]. Casting works because for values in [0,127] the most significant bit is 0 and contributes nothing either way, while for negative values the most significant bit is set and is worth +128 in an unsigned char rather than -128, so the value shifts up by 256. This bug took me approximately two days and multiple trials to squash.

Later, when I was well into the project, I encountered an annoying git error that complained about a non-fast-forward change, which means the remote branch had commits that my local branch did not. I'm still not sure of the best way to fix this because I did not log what I did to resolve the problem, but the logical thing is to copy the edited file, check out the file (so it is reverted to the one in the git repo), and then merge by hand. Actually, thinking about it now, creating a new branch, pushing changes to that, and then merging that branch with the master branch will let git combine the two files so you can just delete the old parts (the conflicting parts are sectioned off with <<<<<<<, =======, and >>>>>>>).

Merging brings me to my second note about git: branching. DO IT! Pushing everything to the master branch, while it may seem convenient, will cause major trouble (somewhat similar to the problem above). Branching is used in any decently sized software project and by any decently reasonable software developer.

Last but not least, recording every trick you try and noting what works and what does not is one of the most valuable lab skills. Your log becomes an invaluable resource (aside from Google) because it not only archives what you have tried before, but also holds new observations and untested tricks and experimentation methods. The more you consult your blog to add new material, the more you will see your old notes, and thus remember them! After all, this is what blogs are for!
