Thursday, May 14, 2009

BoostCon'09 Trip Report

Below I report on BoostCon'09 (http://www.boostcon.com/home) which took place in Aspen, Colorado, last week (May 3-8, 2009).

Disclaimer: This report is not exhaustive. I apologize upfront to any BoostCon'09 presenters whose talks are not listed below. BoostCon runs Track I and Track II sessions concurrently, so it is only possible to attend half of the sessions. I also had some exciting "hallway discussions" which caused me to miss some additional talks. Lastly, I'd like to send out my personal apologies to Jeff Garland for missing his "Boost Library in a Week" talks. I heard that Jeff's sessions were fantastic, but unfortunately, I cannot report on them personally.

C++0x Overview and Compiler Support (Michael Wong)

Michael Wong, a C++0x, OpenMP, and OpenCL standards committee member, started his talk by explaining that he had over 100 slides covering C++0x. He explained that his intent was not to get through all the slides, but instead to make them available to the BoostCon attendees so any topics he could not cover were available online.

The two motivating ideas behind C++0x as explained by Wong (and also by Stroustrup) are: (1) to make C++ a better language and (2) to make C++ easier to teach and learn. After hearing Wong's talk, I believe C++0x does a fantastic job of achieving these goals. Simplifying a language like C++ is a difficult thing to do because backwards compatibility is not optional. However, through new language extensions like concepts, rvalue references and auto, I believe the C++ standards committee has succeeded. Some of you reading this may be thinking, "How do concepts and rvalue references simplify C++? It seems like they make things harder."

In truth, concepts and rvalue references do add complexity in one area in order to reduce it in another. First, concepts will probably only be coded by expert C++ programmers, as they are used to constrain template parameters. As such, novice programmers will likely never write a concept constraint. Instead, the novice programmer will simply inherit the benefit of concepts within their code. He or she will directly see the benefit of concepts because template error messages will change from 100+ lines of complex compiler output to more elegant error messages like,

"There is no operator< defined for class A."

Regarding how rvalue references simplify C++, they achieve this simplification at a more abstract level. See my comments on Dave Abrahams's rvalue references talk below for more details.
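Back to concepts for a moment: to give a rough feel for the short error message quoted above, here is a minimal sketch that imitates a constrained template with a static_assert and a hand-written trait. This is not C++0x concepts syntax, and has_less_than, smaller and A are hypothetical names I made up for the illustration.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical trait: true when `a < b` compiles for two const T values.
template <typename T, typename = void>
struct has_less_than : std::false_type {};

// Chosen only when the expression `const T& < const T&` is well-formed.
template <typename T>
struct has_less_than<
    T, decltype(void(std::declval<const T&>() < std::declval<const T&>()))>
    : std::true_type {};

// A "constrained" function template: the requirement is checked up front,
// so a violation is reported with one readable sentence.
template <typename T>
const T& smaller(const T& a, const T& b) {
    static_assert(has_less_than<T>::value,
                  "There is no operator< defined for this type.");
    return (b < a) ? b : a;
}

struct A {};  // deliberately has no operator<
```

With this in place, smaller(3, 7) compiles and returns 3, while smaller(A(), A()) is rejected at compile time with the static_assert text among the diagnostics, which is the flavor of message a real constraint mechanism is meant to produce.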

Wong also gave a fantastic survey of C++0x compiler status (IBM, GNU, Intel, Comeau, Borland, Visual Studio and Sun Studio -- wow!). Wong's C++0x talk was fantastic; his survey of C++0x alone was worth the trip's expense.

Parallel Patterns Library in Visual Studio 2010 (Stephan T. Lavavej)

Stephan T. Lavavej's talk on the Parallel Patterns Library (PPL) in Visual Studio 2010 was quite interesting. As with most concurrent libraries, Microsoft's PPL aims to shift as much of the parallel programming complexity as possible into its libraries so the end programmer only has to reason about a few minor issues related to concurrency. Through my own research efforts, I can attest that this simplification is more challenging than it sounds. For that, I applaud the early PPL efforts.

The PPL libraries in Visual Studio do a good job simplifying the concurrency and communication required by the end programmer. Stephan Lavavej did a wonderful job demonstrating the (1) ease of interfacing with PPL and (2) the performance results from using it. For those of you who use Visual Studio, Lavavej's talk shed light on Microsoft's direction with Visual Studio 2010 toward simplifying the programming complexity that is inherent in the multi- and many-core revolution.

Boost++0x: Hands on rvalue References (Dave Abrahams)

Dave Abrahams is the co-founder of Boost with Beman Dawes. Abrahams is responsible for many things related to C++. He is perhaps best known for his "C++ Template Metaprogramming" book and his work on a theoretical exception framework (nothrow, strong and basic) known as Abrahams Exception Safety Guarantees. Abrahams's Metaprogramming book is excellent; however, it is clearly written for the "expert-only" crowd. Less experienced C++ programmers should start with something easier, like Meyers's Effective C++ series.

Once you hear Dave Abrahams talk, it becomes clear why he has been so influential in the C++ community; he seems to know everything there is to know about C++ and he moves at a lightning pace. Dave's talk on rvalue references was succinct, clear and informative. The basic purpose of rvalue references is to enable object moving instead of copying. Moving objects can improve the performance of a program as well as make it more exception safe. In particular, move operations should always be nothrow. Since moving an object should be implemented with nothrow operations, the safety of the calling client code is improved, simplifying the try/catch behavior it needs.
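To make the copy-versus-move distinction concrete, here is a minimal sketch of my own (not code from Dave's talk). Buffer is a hypothetical class: its copy constructor allocates and can therefore throw, while its move constructor only steals a pointer and is declared noexcept.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical heap-backed buffer, used only for illustration.
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new int[n]()) {}
    ~Buffer() { delete[] data_; }

    // Copy: allocates new storage, so it may throw std::bad_alloc.
    Buffer(const Buffer& other)
        : size_(other.size_), data_(new int[other.size_]) {
        for (std::size_t i = 0; i < size_; ++i) data_[i] = other.data_[i];
    }

    // Move: takes over the source's storage; nothing here can throw.
    Buffer(Buffer&& other) noexcept
        : size_(other.size_), data_(other.data_) {
        other.size_ = 0;
        other.data_ = nullptr;  // leave the source empty but destructible
    }

    // By-value assignment: the argument is copied or moved by the caller,
    // then swapped in without throwing.
    Buffer& operator=(Buffer other) noexcept {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
        return *this;
    }

    std::size_t size() const { return size_; }
    const int* data() const { return data_; }

private:
    std::size_t size_;
    int* data_;
};

static_assert(std::is_nothrow_move_constructible<Buffer>::value,
              "moving a Buffer must not throw");
```

Writing Buffer b(std::move(a)) hands a's storage to b without allocating, so client code has no failure path to guard at that point; Buffer c(b) still copies and can fail.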

Several topics outside of rvalue references were touched on in Abrahams's talk, such as copy-on-write (COW), exception safety, rvalue and lvalue reference operator precedence order, copy elision (return value optimization [RVO]) and much more. By the time Dave's high-level explanation of rvalue reference was complete, I felt like Neo from the Matrix, "I know rvalue references." Dave's knowledge of C++ is amazing.

Truthfully, I had some prior experience using move semantics (rvalue references) before attending Dave's talk. Jeremy Siek and I found that rvalue references are particularly useful in building a correct implementation of STM within C++. We published a paper on rvalue references last year at ICOOOLPS which you can find on my or Jeremy's website. However, even with my prior experience in rvalue references, Dave's talk made me realize how much of the space I hadn't considered.

Boost Exception Authors Corner (Emil Dotchevski)

In my opinion, Emil Dotchevski deserves to jointly win the "Best Presentation Award" for BoostCon'09 with Thomas Becker. Here's why. First, Emil covered a significant amount of exception-related material in only 45 minutes. Entire books have been dedicated to this topic; see Sutter's Exceptional C++ series. Second, his talk was about exceptions. People generally despise dealing with errors, so a talk on exceptions starts at a disadvantage. Third, and most importantly, Emil gave his entire talk without the benefit of slides (projector failure!). During his talk, Emil conveyed the complexities of exception propagation, exception neutrality, throw and catch sites, and finally, exception augmentation, which allows exceptions to contain more relevant (and practical) information than non-augmented exceptions.

Emil's talk was very practically motivated, as is his Boost.Exception implementation. His exception framework makes exceptions additive and more informative while maintaining neutrality. I hope Emil gives this same talk next year, but with a 90-minute slot so he can give more examples on how to properly use his exception framework. 45 minutes isn't enough time for him to demonstrate some of the more complex exception behavior his library can handle.
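To illustrate what "additive" means here, below is a minimal sketch of exception augmentation using only the standard library. To be clear, this is not Boost.Exception's API; annotated_error, read_record and load_file are hypothetical names I invented to show the idea of adding context at an intermediate layer and then rethrowing.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical exception type that can accumulate key/value context.
class annotated_error : public std::runtime_error {
public:
    explicit annotated_error(const std::string& what)
        : std::runtime_error(what) {}

    void add(std::string key, std::string value) {
        notes_.emplace_back(std::move(key), std::move(value));
    }

    const std::vector<std::pair<std::string, std::string>>& notes() const {
        return notes_;
    }

private:
    std::vector<std::pair<std::string, std::string>> notes_;
};

// Throw site: knows only that the index is bad, not which file is involved.
void read_record(int index) {
    if (index < 0) throw annotated_error("bad record index");
}

// Intermediate layer: cannot handle the error, but knows the file name.
// It augments the in-flight exception and rethrows the same object, so it
// stays neutral about who eventually handles the error.
void load_file(const std::string& path, int index) {
    try {
        read_record(index);
    } catch (annotated_error& e) {
        e.add("file", path);
        throw;  // rethrow the augmented exception, not a copy
    }
}
```

A caller that catches annotated_error from load_file("data.bin", -1) sees the original message plus a ("file", "data.bin") note that the throw site could not have supplied.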

For those of you who don't know Emil, he's the author of the Boost.Exception Library. An interesting feature of Emil's library is that it supports transporting exceptions between threads. This is a key aspect in ensuring correct concurrent error-handling. He's also a self-proclaimed ex-video game programmer. Being a video game programmer myself, I don't hold this against him. =)

Boost + Software Transactional Memory (Justin Gottschlich, Jeremy Siek)

This was my talk and thankfully I was there (and on time!). This was my first time giving a talk at BoostCon and I was pleasantly surprised by the interest and intelligence of the BoostCon attendees. I have given a number of academic and industry talks over the last few years and I was amazed at how deeply interested, attentive, and curious the BoostCon attendees were.

During and after my talk, the BoostCon folks, most of whom are much more familiar with C++ than I am, gave me gentle and useful suggestions on how to improve my implementation. It was greatly reassuring to speak with so many intelligent people with notably positive suggestions on how to improve my library implementation.

Multithreaded C++0x: The Dawn of a New Standard (Michael Wong)

Michael Wong's talk on C++0x multithreading kept a packed room of BoostCon attendees in silence for 90 minutes. The reason is obvious: Michael's talk was on memory consistency models, which are among the most complex ideas in hardware concurrency.

I've taken an entire graduate level computer engineering course on memory consistency models, so it was no shock to me when the room fell completely silent for almost the entire talk. At one point Michael asked the audience "is this stuff interesting?" There was a resounding "YES!" I think Michael thought that due to the lack of questions people were uninterested or bored. What I think was happening is that the audience was deeply probing the new complexities of hardware concurrency and mentally iterating through a new era of upcoming heisenbugs. At least, that's what I was doing. =)
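For readers who have not run into memory consistency before, here is a minimal sketch of the kind of ordering question involved. It is my own example, not one from Michael's slides, and the names payload, ready, producer and consumer are hypothetical. It uses the atomic operations and memory orderings proposed for C++0x to publish ordinary data through a flag.

```cpp
#include <atomic>
#include <cassert>

int payload = 0;                 // ordinary, non-atomic data
std::atomic<bool> ready(false);  // flag that publishes the payload

// Writer side: store the data first, then set the flag with release
// ordering so the payload write cannot be reordered after the flag write.
void producer() {
    payload = 42;
    ready.store(true, std::memory_order_release);
}

// Reader side: read the flag with acquire ordering. If it is true, the
// earlier payload write is guaranteed to be visible to this thread.
bool consumer(int& out) {
    if (!ready.load(std::memory_order_acquire)) return false;
    out = payload;
    return true;
}
```

If producer runs on one thread and consumer is polled on another, any call to consumer that returns true also yields 42. Weaken both orderings to relaxed and that guarantee disappears, which is exactly where the heisenbugs come from.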

Iterators Must Go (Keynote: Andrei Alexandrescu)

Andrei Alexandrescu is a world-famous C++ expert. He has written two books: "C++ Coding Standards" and "Modern C++ Design." Andrei's "Modern C++ Design" book is considered by many (including Scott Meyers) as one of the most influential C++ books ever written. As such, when Andrei talks, people tend to listen.

In my opinion, Andrei's keynote address was meant to be radical, pervasive and thought-provoking. If these were his goals, he achieved them in waves. Andrei's talk suggested that while the STL is great, iterators aren't. Andrei spent 90 minutes demonstrating why iterators are problematic and why his new solution, called ranges, is less problematic.
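For a rough idea of what a range looks like, here is a minimal sketch in C++. The empty/front/popFront interface follows my recollection of the primitives Andrei presented, so treat those names as an assumption; ArrayRange and contains are hypothetical names of my own.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal range over a contiguous sequence. Unlike a pair of
// iterators, the range carries its own end, so it cannot be mismatched.
template <typename T>
class ArrayRange {
public:
    ArrayRange(const T* first, std::size_t count)
        : cur_(first), end_(first + count) {}

    bool empty() const { return cur_ == end_; }

    const T& front() const {
        assert(!empty());
        return *cur_;
    }

    void popFront() {
        assert(!empty());
        ++cur_;
    }

private:
    const T* cur_;
    const T* end_;
};

// An algorithm written against the range interface: one argument instead
// of a begin/end pair. The range is taken by value and consumed locally.
template <typename Range, typename T>
bool contains(Range r, const T& value) {
    for (; !r.empty(); r.popFront()) {
        if (r.front() == value) return true;
    }
    return false;
}
```

Given a std::vector<int> v holding 1, 2, 3, contains(ArrayRange<int>(v.data(), v.size()), 2) is true and the same call with 9 is false.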

The BoostCon folks are intelligent and critical and so Andrei received some opposition when he first presented the idea that "iterators must go." After the initial shock passed, people started to more openly explore the idea of ranges in hallway conversation and later that night at the picnic.

My opinion is that ranges alone are insufficient. Even if ranges are better than iterators, which I'm not going to dispute -- you can read Emil Dotchevski's BoostCon'09 trip report for his thoughts -- if we rewrite all of STL using ranges, we should also ensure the underlying reimplementation is highly concurrent. I must admit that Andrei's talk on ranges did convince me of one thing: ranges are more concurrent-friendly than iterators. I'm not disrespecting Stepanov's brilliant STL. Stepanov is orders of magnitude more intelligent than I am, so for you die-hard C++ers, please don't take my view here as disrespecting him. I am only pointing out that when Stepanov created STL, he may have been focused on single-threaded execution. Now that we are in the multi- and many-core era, we need underlying libraries that are highly concurrent.

Given my research interest in transactional memory, I favor an STL rewrite using transactional memory at its core. I do concede that non-blocking atomic primitives are also viable alternatives that are highly scalable. The problem is that non-blocking atomic primitives (e.g., CAS, LL/SC) are unbelievably difficult to use for implementing even the most basic algorithms. A queue implementation using non-blocking atomic primitives is so challenging that when it was first accomplished, it was published in a top tier theoretical computer science conference (Michael and Scott, PODC'96). That's not a good thing. It demonstrates that non-blocking primitives are too difficult for non-experts to use. If building a queue using non-blocking atomic primitives is publication worthy, rewriting STL using only non-blocking primitives will be Turing Award worthy.
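To give a taste of CAS-based code, here is a minimal sketch of a lock-free stack. Note that it is a stack, not the Michael and Scott queue, and it deliberately avoids the hard part: instead of popping single nodes (which raises ABA and memory reclamation problems), it only lets a consumer detach the whole list at once. LockFreeStack, Node, take_all and free_list are hypothetical names.

```cpp
#include <atomic>
#include <cassert>

struct Node {
    int value;
    Node* next;
};

class LockFreeStack {
public:
    LockFreeStack() : head_(nullptr) {}
    ~LockFreeStack() { free_list(take_all()); }

    // Push with a CAS retry loop. If another thread changed head_ in the
    // meantime, compare_exchange_weak fails, reloads the current head into
    // node->next, and the loop tries again.
    void push(int value) {
        Node* node = new Node{value, head_.load(std::memory_order_relaxed)};
        while (!head_.compare_exchange_weak(node->next, node,
                                            std::memory_order_release,
                                            std::memory_order_relaxed)) {
        }
    }

    // Detach the entire list in one atomic step; the caller then owns it.
    Node* take_all() {
        return head_.exchange(nullptr, std::memory_order_acquire);
    }

    // Free a detached list.
    static void free_list(Node* n) {
        while (n != nullptr) {
            Node* next = n->next;
            delete n;
            n = next;
        }
    }

private:
    std::atomic<Node*> head_;
};
```

Even this restricted version needs a retry loop and explicit memory orderings. Supporting a correct single-element pop, let alone a FIFO queue, is considerably harder, which is the point I am making above.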

Kamasu: Parallel Computing on the GPU with boost::proto (Troy Straszheim)

Troy Straszheim gave an interesting talk on boost::proto and general purpose graphics processing units (GPGPUs). Troy set out to learn boost::proto and CUDA, the programming interface for nVidia's GPGPUs, and then combine them to see what would happen. In Troy's journey to combine boost::proto with CUDA, he explored holder classes using shared_ptr implementations and dozens of aspects of operator overloading. While Troy achieved a simplified interface using boost::proto, he hinted that the amount of effort to combine CUDA and boost::proto may be unnecessarily heavy.

While Troy's talk ended anticlimactically, I found his exploration to be insightful for two reasons. (1) Troy's goal of making CUDA accessible using native C++ semantics is important for wide-scale GPGPU adoption in the general C++ community. (2) Troy's early results are useful for future research. Now that we know what has been explored and how it has been explored, we can try new approaches that extend, without replicating, Troy's initial research efforts.

Later that night at the BoostCon picnic I tracked Troy down and introduced myself. I shared my thoughts about his talk and mentioned that boost::proto + CUDA may not be the solution, but maybe another combination of various interfaces might work. We tossed a few ideas around, but it was getting late and I had my dog Max with me so I had to cut the conversation short. Whatever the case, Troy's CUDA + Boost research is going to have growing importance as GPGPUs are used to exploit embarrassing parallelism and C++ programmers want to take advantage of that using native C++ semantics. Whatever you do Troy, please continue exploring this space.

The Boost Smart Pointer Library (Thomas Becker)

Dr. Thomas Becker's talk on smart pointers covered so much more than smart pointers. After Thomas's talk, I had learned more than smart pointers; I learned new ways to think. By deeply analyzing something's past, as Becker explained, we can better understand its present and future. Becker continued by demonstrating how contending views between purists and pragmatists affect software interfaces, such as those found in Boost's smart pointers. Listening to Dr. Becker talk about software is an amazing experience. He is so knowledgeable about software and such an elegant speaker that he is able to cover a range of topics without the audience experiencing even the slightest bumps in topic transition. If I were handing out BoostCon awards, Thomas Becker would jointly share the BoostCon'09 "Best Presentation Award" with Emil Dotchevski.

While Dr. Becker's talk was centered on smart pointers, if you stepped into the room halfway through the talk, you might think his talk was about the differences between purist and pragmatist programming views. Thomas's ability to break down ideas to their core elements is truly uncanny; he did this with smart pointers effortlessly. When the C++ User's Journal was still active, I used to read Thomas's articles on advanced C++. Most people who write well tend to fall short in other areas, like giving public talks. Thomas certainly does not fall into this category. Based on what I saw at BoostCon'09, I will actively go out of my way to attend every talk Thomas Becker gives in the future. This man possesses brilliant insight.

Building a simple language compiler using Spirit V2.x (Hartmut Kaiser, Joel de Guzman)

Because I went to half of Thomas Becker's smart pointer talk, I missed the first half of Hartmut Kaiser and Joel de Guzman's Spirit talk. This turned out to be a really bad idea. When I showed up, Joel was working through an expression deconstruction example using Spirit V2, which I believe was performing lexical analysis. I have to admit, this was probably not the talk to come in on in the middle. Joel is a very sharp, methodical speaker and his work in Spirit V2 is outstanding. The general purpose of Joel and Hartmut's Spirit talks seemed to be about general compiler construction all from within the capabilities of Spirit V2 (a truly amazing accomplishment).

After 45 minutes, Hartmut Kaiser, the program chair for BoostCon'09, presented a fascinating demonstration of Karma, a library for flexible generation of arbitrary character sequences. In short, Karma is an "unparser." Perhaps the most motivating example of Karma for me is its use with the Kleene star (or Kleene closure) which allows entire containers (or containers with markup) to be output in a single expression. Finally, I/O in C++ that's as easy as Java. Karma is a slam-dunk; once you see what you can do with Karma, you will be convinced of its importance and practicality almost immediately.

Note: I've tried to show a Karma I/O example but blogger.com keeps rejecting the syntax and destroying the post. So after an hour of trying, I'm giving up. If you want to see concrete examples, check out Joel and Hartmut's slides on the BoostCon'09 website. They really are amazing.

Regarding the Practical Importance of Compiler Construction (or "Why Joel and Hartmut's Research is Critical")

I know many programmers who think the compiler construction courses that are offered in undergraduate and graduate academic curriculum are less than practical. Some of my colleagues have argued that such courses are entirely worthless. It is my opinion that just the opposite is true; compiler courses are invaluable.

Compiler construction is important because most programmers will build several "mini-languages" in their lifetime. These "mini-languages" are often given little thought when they are first designed and only basic parsers are written to handle their (generally dynamic) code. In my 10 years of industry experience, I have written at least four complete mini-languages and the full parsers to go with them. One of these mini-languages now has over a million lines of code associated with it.

The problem with these mini-languages is that when we create them we generally don't build compilers for them, or even begin to think about creating BNF or EBNF compliance. This is because industry is on a time critical path. We need these mini-languages implemented yesterday. And so we rush to get them functional. Since we can't possibly build an entire compiler, we foolishly plow through the language design and crank out a rudimentary parser (i.e., we build just what is needed). The results of our fast-paced language design are numerous dynamically created bugs, language awkwardness and general mayhem as the mini-language starts to be used more and more.
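As a small illustration of what "starting from a grammar" buys you, here is a minimal hand-written sketch: a tiny EBNF grammar in the comments and a recursive-descent parser with one function per rule. The grammar and the Parser class are hypothetical examples of mine; this is plain C++, not Spirit, and it has nothing to do with any real mini-language I have shipped.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical grammar (EBNF):
//   expr   = term   { ("+" | "-") term } ;
//   term   = factor { "*" factor } ;
//   factor = number | "(" expr ")" ;
class Parser {
public:
    explicit Parser(const std::string& src) : src_(src), pos_(0) {}

    // Parse and evaluate the whole input; leftover text is an error.
    int parse() {
        int v = expr();
        skip();
        if (pos_ != src_.size()) fail("unexpected trailing input");
        return v;
    }

private:
    int expr() {
        int v = term();
        for (;;) {
            if (eat('+')) v += term();
            else if (eat('-')) v -= term();
            else return v;
        }
    }

    int term() {
        int v = factor();
        while (eat('*')) v *= factor();
        return v;
    }

    int factor() {
        if (eat('(')) {
            int v = expr();
            if (!eat(')')) fail("expected ')'");
            return v;
        }
        skip();
        if (!digit()) fail("expected number");
        int v = 0;
        while (digit()) v = v * 10 + (src_[pos_++] - '0');
        return v;
    }

    bool digit() const {
        return pos_ < src_.size() &&
               std::isdigit(static_cast<unsigned char>(src_[pos_]));
    }

    // Consume `c` if it is the next non-space character.
    bool eat(char c) {
        skip();
        if (pos_ < src_.size() && src_[pos_] == c) { ++pos_; return true; }
        return false;
    }

    void skip() {
        while (pos_ < src_.size() &&
               std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
    }

    // Every syntax error reports where it happened.
    [[noreturn]] void fail(const std::string& msg) const {
        throw std::runtime_error(msg + " at offset " + std::to_string(pos_));
    }

    std::string src_;
    std::size_t pos_;
};
```

Parser("1 + 2 * 3").parse() evaluates to 7 and Parser("(1 + 2) * 3").parse() to 9, while malformed input such as "1 + " throws with an offset instead of being silently accepted. That last property is exactly what a rushed, rudimentary parser tends to lack.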

An example of this is Nodeka's artificial intelligence (AI) language. The Nodeka AI language is a simple language I created for my game which handles non-player character (NPC) intelligence. I created the Nodeka AI language because I needed a grammar that supported simple AI decision tree construction. These decision trees are not easily created in regular C++ and I wanted dynamic NPC AI support, so I could extend the NPC AI while the game is running, thereby minimizing reboots. The problem is, I only created a basic parser, not a full blown compiler. While my Nodeka AI language works, to a certain extent, it is certainly far from ideal. As we go beyond a million lines of AI code, subtle errors are beginning to arise that my parser misses and that are nearly impossible to track down and fix. We actually had to stop enhancing a certain NPC's AI because the NPC continually crashed Nodeka and we couldn't find the root cause in its AI.

If I had Joel and Hartmut's Spirit library when I created my AI language, a number of errors could have been avoided and we wouldn't be at the stand-still we are with one of our NPCs that has highly complex AI. As I write this, I am seriously debating building a compiler front-end using Spirit for Nodeka's AI language. Very seriously. This is not a plug for Joel or Hartmut's talk (neither of them needs my plugs); instead it is a sincere acknowledgement of the practical importance of their work.

Closing Comments

BoostCon'09 was a tremendous event. If you missed it, hopefully my trip report sheds some light into the interesting things that happened. But if you did miss BoostCon'09, you missed a lot.

2 comments:

Emil Dotchevski said...

Justin, thanks for the kind words. Reading your post I realize how lame my BoostCon report is, damn!

I liked your talk too and I'd really like to learn more about TBoost STM. I'm a complete n00b in transactional memory theory but I've done a bit of traditional lock-based multi-threaded programming, just enough to appreciate that it's too difficult to do and next to impossible to test. Using non-blocking atomic primitives is also tricky; as you point out even simple algorithms are surprisingly tricky to get right with atomics. To add to your example, the double refcounting in shared_ptr/weak_ptr is lock-free only since Boost version 1.33!

MW said...

Thanks for the endorsements. I too enjoyed being there and encourage everyone to go. I thought your talk on STM library was the most dynamic talk in all of BoostCon09 (even more so then mine:). This is a credit to your skill as a speaker.I will blog about that in my own IBM blogs:
Parallel & Multi-Core Computing http://www.ibm.com/software/rational/cafe/blogs/ccpp-parallel-multicore
C++ Language & Standard http://www.ibm.com/software/rational/cafe/blogs/cpp-standard

I wanted to point out that my talk about concurrency affect the entire stack from software all the way to hardware as it must. And yes, we hope to reduce the number of heisenbugs. Only time will tell if we succeeded.
