Why C++ is not my favorite programming language
For years I've been complaining about C++. I complain a lot. I learned how to program large projects using C, Java, and perl, and I was shocked when I wrote my first nontrivial C++ project at how stupid the language was. I feel bad using that word -- "stupid" -- because language design is so hard, and C++ is almost certainly better than whatever I would come up with if you asked me to construct a modern object-oriented programming environment. Still, it seems so obviously wrong in so many areas that I felt it was my right -- nay, my duty -- to inform my coworkers at frequent intervals how bad it was and what a rough time I was having with it. Ask them; they'll tell you about it.
Anyway, my feelings of negativity sort of crystallized recently, when I was talking to a colleague about the STL, and how no one uses it because it's such an obvious piece of garbage. Most people that I've talked to in the C++ world believe this. Every large C++ project I've worked on has its own home-grown implementation of basic things like strings, dynamic arrays with bounds checking, and sometimes even garbage collection. I realized that this was the perfect demonstration of C++'s near-unsuitability for high-level programming:
How broken does a language have to be before strings, dynamic arrays, and hash tables turn into things that are challenging to implement? Most other high-level languages have these features built in. Even the ones that have them more or less tacked on as just another class -- like Java -- seem to have done a decent enough job of it that people at least think about using the provided abstractions. Not so in C++ -- there, the construction of a string class has been a problem that's occupied the programming community off and on for decades.
This has been my general experience with large subsystems in C++ projects -- they aren't very general, because C++ doesn't allow it. They wind up as sort of C solutions in object-oriented drag, because as an object-oriented environment C++ simply doesn't allow you enough flexibility. You have to do a lot of weird nonsense to get them to work, because that's the way things are. When I say things like this to C++ people, they look at me funny, because they've lived their whole programming lives amid this chaos, and they don't know what I'm talking about.
A Simple Example
Here's how you fetch a web page using libwww-perl:
    use LWP::UserAgent;
    my $ua = LWP::UserAgent->new;
    my $req = HTTP::Request->new(GET => 'http://slashdot.org/');
    my $res = $ua->request($req);
    print $res->content if $res->is_success;
Straightforward logic: import the module, create a web-client abstraction, create a request, execute the request, and print the content. One line for each of those logical steps. That's the way it should be. Now here's the same thing, using libcommoncpp (a GNU C++ utility library; I picked it because it was the only C++ HTTP library listed on http://curl.oc1.mirrors.redwire.net/libcurl/competitors.html that didn't include pointed criticism along with the link). This is taken from the urlfetch.cpp example that comes with the source:
    URLStream url;
    URLStream::Error status;
    #ifdef CCXX_EXCEPTIONS
    try {
    #endif
        status = url.get("http://slashdot.org/");
        if (!status) {
            while(!url.eof()) {
                char buffer[1024];
                url.read(buffer, sizeof(buffer));
                int len = url.gcount();
                if (len > 0)
                    cout.write(buffer, len);
            }
        }
    #ifdef CCXX_EXCEPTIONS
    } catch(...) {
        // No response, we just want to print nothing
    }
    #endif
Did you notice how much more complicated that is, and how many things the interface is missing? Actually, the first thing to notice is how much less time I spend telling the library what I want done, and how much more time I spend running around barking orders about precisely how it needs to be organized in order to get done successfully. I count three lines of code devoted to "what I want":
    status = url.get("http://slashdot.org/");
    if (!status) {
    cout.write(buffer, len);
The other 17 (!) lines are just bookkeeping, just me having to keep the compiler verbosely informed of the particulars of how to go about the task instead of just having the damn thing get it done for me. C++ programmers don't mind those lines, because that's just what you have to do (TM). To someone who's accustomed to writing things like those five lines of perl, though, writing them feels like telling a child how to put on his pants. Every day, every function, it's "left leg goes in left hole, now create sock object..."
Anyway, let's get back to the lack of functionality. How would I go about maintaining two user agents, each one with its own set of cookies, user-agent string, and so on? In the perl program, it's immediately obvious -- I have two LWP::UserAgent objects. It turns out, after examining the source for libcommoncpp, that there's also a way there -- each URLStream object can be assigned a user-agent string, cookies, and so on. Very well, what if I want persistent settings even when the design of my program means that I will want to maintain several active URLStreams that are going simultaneously? Can I build one "dummy" URLStream, set it up, and then assign others to it to duplicate its settings? Or do I just have to re-set-up each URLStream as I create it?
I know it seems like I'm picking on something minor -- it's not that hard to keep track of a user-agent string and a cookie string, and re-set-up each URLStream as I create it -- but it's a good example of how C++ tends to defeat object-oriented designs. It's very difficult to have C++ objects create each other, or "own" each other temporarily, because it creates resource-lifetime problems, so C++ programs tend to wedge everything into one class (or one mutually-cooperative class hierarchy with a single external interface), instead of having several public classes that cooperate according to a specified interface, each representing one aspect of a problem. Try to implement the libwww-perl program's class structure in a C++ program, and before long you'll find yourself making things like new_response_from_request() functions that have a note saying that you must treat this function as a new call, and destroy the resulting objects when you're done with it. I know what I'm talking about; I've been there.
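To make that concrete, here's a minimal sketch of the pattern I mean. Every name in it (Request, Response, fetch_once, and the live counter) is made up for illustration, nothing here comes from libcommoncpp or any real library, and no actual network fetch happens. The point is only the ownership note on the factory function and the delete that every caller has to remember.

```cpp
#include <cassert>
#include <string>

// Hypothetical classes, for illustration only.
struct Request {
    std::string url;
};

struct Response {
    static int live;              // how many Response objects exist right now
    std::string content;
    explicit Response(const std::string &c) : content(c) { ++live; }
    ~Response() { --live; }
};
int Response::live = 0;

// NOTE: treat this function as a call to new. The caller owns the returned
// object and must delete it when done.
Response *new_response_from_request(const Request &req) {
    // Stand-in for fetched content; nothing is actually downloaded.
    return new Response("fetched " + req.url);
}

// A typical call site. Every path out of this function has to remember
// the delete, or the Response leaks.
std::string fetch_once(const std::string &url) {
    Request req{url};
    Response *res = new_response_from_request(req);
    std::string body = res->content;
    delete res;
    return body;
}
```

The live counter exists only so the bookkeeping can be checked: it should be back at zero after fetch_once() returns, and it stays at one for as long as somebody forgets the delete.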
Another place where the lack of interaction between classes causes trouble is with strings. libcommoncpp defines -- surprise! -- yet another internal string abstraction, ost::String, but for sanity's sake all the public-facing interactions are done via char *s. Have strings in a non-ASCII encoding? Want to store each string only once across all of the C++ libraries you're using? Need to store data that includes a '\0'? Sorry, you're screwed. Welcome to the 1970s. Have fun solving the same problem over and over again in a bunch of different places (notice how dealing with '\0's in the retrieved data adds a couple lines to that 17/20 ratio of uselessness above, since there has to be a separate way to get a count of the data bytes retrieved).
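Here's a small sketch of the '\0' problem, in case it isn't obvious why the byte count has to travel separately. The function fake_fetch() is a hypothetical stand-in for any interface that hands data back as a char *; it is not a libcommoncpp call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a library call that returns raw bytes through a
// char *. The real length has to come back through a second channel.
const char *fake_fetch(std::size_t *len_out) {
    static const char data[] = {'a', 'b', '\0', 'c', 'd'};
    *len_out = sizeof(data);      // 5 bytes, including the embedded '\0'
    return data;
}

// What a caller sees if it trusts the char * alone.
std::size_t length_seen_by_strlen() {
    std::size_t real_len = 0;
    const char *p = fake_fetch(&real_len);
    return std::strlen(p);        // stops at the first '\0'
}
```

Five bytes come back, but anything that treats the pointer as a C string sees only the two bytes before the '\0'. That's why the fetch loop needs its separate gcount() call.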
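I didn't just pick libcommoncpp because I thought it was crappy -- I actually wrote all of the above up to "Now here's the same thing" before I downloaded the first C++ web-fetching library I ran across, because I know from experience that every nontrivial C++ library looks like this. They all have, for each problem, one big class that does everything (and leaves some things undone, because you can't get everything done in one class). If that's all you know, then you're not surprised by that, but if you're accustomed to sane OO designs, it's constantly slapping you in the face when you work with C++.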
I have one last thing to say -- for perl, I included the use LWP::UserAgent; line, because it seemed only fair to include it as part of the program. For C++, it seemed it would have been a bit petty to do likewise and include:
    #include <cc++/common.h>
    #include <iostream>
    using namespace std;
    using namespace ost;
    int main(int argc, char **argv) {
... in my line count. Likewise the several minutes of failed compiling, one google search, and lots of errors like this:
    /tmp/ccTlx2HI.o: In function `ost::URLStream::~URLStream()':
    url.cpp:(.text._ZN3ost9URLStreamD1Ev[ost::URLStream::~URLStream()]+0x30): undefined reference to `vtable for ost::URLStream'
... before I figured out that the right way to compile my program was this:
    g++ -D_GNU_SOURCE -L/usr/lib -lccext2 -lccgnu2 -lgnutls -lgcrypt -lz -ldl -lrt -pthread -o url url.cpp
Great stuff, huh? I bet in Dev Studio it would have been even easier.
"But that's just one crappy library."
No it isn't. It's every C++ project. I don't think there's anything wrong with libcommoncpp -- it's just another C++ library working in the language it's been given. They all suffer, to a greater or lesser degree, from the same disease, which is this: It's so much of a PITA to pass objects around meaningfully that classes don't really cooperate with each other. Each class (or, in the case of large projects, class hierarchy) sits in its own little level of abstraction, and the idea of using low-level classes to get things done in the higher-level classes never gets much beyond a sort of stilted one-interface-per-problem-domain approach. Frequently, the lowest-level functionality winds up getting put into a huge common superclass that gets inherited by half the project. From a genuinely object-oriented point of view, it's ridiculous.
"What do they call a common hash table class in C++?"
"The impossible dream."
Why else would every project invent its own string class and dynamic array? Why else would a lot of projects implement hash tables by hand everywhere they're needed? How can a C++ programmer claim with a straight face that C++ can build powerful abstractions when, 20+ years after the invention of the language, many C++ libraries use char * as the string class in their public interfaces?
"But it's FAAAAAASTER!"
No it isn't. C is faster than everything, and it's easy to write C as part of your C++ program. But that doesn't mean that C++ is faster than other high-level languages. If you write a C++ program, and then convert the slow parts of it to fast C, then you've just done what you could have done (with a lot more difficulty, it's true) to any perl or python program. You probably would have gotten done faster if you'd done it in perl or python -- even with the extra pain-in-the-ass of C-izing pieces of your interpreted program -- since developing in a proper language will get you done much, much faster than developing in C++ will. It wouldn't be a proper rant if I had any research to back that statement up, but you can trust me.
Anyway, look here. "The Practice of Programming" tackles a simple problem: generating garbage text via a Markov chain algorithm. They give sample implementations in C, Java, C++, awk, and perl. I wrote one of my own in python. I also rewrote the C++ version somewhat to make it work only on prefixes of length 2, since the original C++ program could easily handle any prefix length (which slowed it down), but the perl and python programs couldn't, and that wasn't fair. I called my modified C++ Markov program markov++2. Here's the redefinition of the structure for Prefix that I used (unfortunately I can't give you the whole program, since the original markov++ is copyright Lucent Technologies):
    struct Prefix {
        Prefix(const string &w1, const string &w2) : w1(w1), w2(w2)
        { }
        string w1;
        string w2;
        void advance(const string &newword)
        {
            w1 = w2;
            w2 = newword;
        }
    };
Here are the average running times of a test script that exercises the markov functionality of the various programs on my home PC (Debian sid), the C++ being compiled with "g++ -O". Smaller is better (fewer seconds to complete).
    Program      Seconds
    markov-c     0.088
    markov++     0.588
    markov++2    0.288
    markov.pl    0.245
    markov.py    0.384
The C program is blazing fast, as you'd expect. But the other high-level programs are about equally fast. The perl one is faster than either of the C++ versions. Nothing need be said about the Java version. For fairness, I tried to compile and run the C++ code in Dev Studio, but Dev Studio choked on one of its own header files for some reason. Oh well.
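If you haven't seen the Markov exercise, here's a rough sketch of the two-word-prefix idea. To be clear, this is not the book's code and not markov++2, and it was not one of the programs timed above. It's a from-scratch illustration, and the names Prefix2, StateTab, build, and generate are my own. It uses std::pair as the prefix instead of the Prefix struct, and it assumes "\n" is a safe sentinel because whitespace-splitting can never produce it as a word.

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, std::string> Prefix2;            // two-word prefix
typedef std::map<Prefix2, std::vector<std::string> > StateTab;  // prefix -> suffixes

// Record, for every pair of consecutive words, each word that followed it.
StateTab build(const std::string &text) {
    StateTab tab;
    Prefix2 prefix("\n", "\n");       // sentinel prefix marks the start
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        tab[prefix].push_back(word);
        prefix.first = prefix.second; // slide the two-word window forward
        prefix.second = word;
    }
    tab[prefix].push_back("\n");      // sentinel suffix marks the end
    return tab;
}

// Walk the table from the start prefix, picking a random recorded suffix
// at each step, until the end marker or max_words is reached.
std::string generate(const StateTab &tab, int max_words) {
    Prefix2 prefix("\n", "\n");
    std::string out;
    for (int i = 0; i < max_words; ++i) {
        StateTab::const_iterator it = tab.find(prefix);
        if (it == tab.end())
            break;                    // not expected for tables made by build()
        const std::vector<std::string> &suffixes = it->second;
        const std::string &w = suffixes[std::rand() % suffixes.size()];
        if (w == "\n")
            break;
        if (!out.empty())
            out += ' ';
        out += w;
        prefix.first = prefix.second;
        prefix.second = w;
    }
    return out;
}
```

When every prefix in the input has exactly one recorded suffix, generate() can only reproduce the input. The garbage starts once some prefix has been followed by more than one word.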
"Do you have any constructive suggestions, or are you just bitching?"
Yes. Don't use C++. Use something else.