
Sunday, February 10, 2008

Portrait of a N00b

 The older I grow, the less important the comma becomes. Let the reader catch his own breath.
— Elizabeth Clarkson Zwart

This is how I used to comment my code, twenty years ago (Note: dramatization):
/**
* By the time we get to this point in the function,
* our structure is set up properly and we've created
* a buffer large enough to handle the input plus some
* overflow space. I'm not sure if the overflow space
* is strictly necessary, but it can't hurt. Next we
* have to update the counter to account for the fact
* that the caller has read a value without consuming
* it. I considered putting the counter-increment on
* the shoulders of the caller, but since it meant every
* caller had to do it, I figured it made more sense to
* just move it here. We can revisit the decision down
* the road if we find some callers that need the option
* of incrementing it themselves.
*/

counter++; // increment the consumed-value counter


/**
* Now we've got to start traversing the buffer, but we
* need an extra index to do it; otherwise we'll wind up
* at the end of the function without having any idea
* what the initial value was. I considered calling this
* variable 'ref', since in some sense we're going to be
* treating it as a reference, but eventually I decided
* it makes more sense to use 'pos'; I'm definitely open
* to discussion on it, though.
*/

char* pos = buffer; // start our traversal


/**
* NEXT, we...
*/
Does this style look at all familiar? It should! This is, to put it as impolitely as possible, n00b-style. (Incidentally, if u dont no wat a n00b iz, u r 1.)

This is how junior programmers write code. If you've read Malcolm Gladwell's remarkable and eye-opening book The Tipping Point, you'll notice a striking similarity to the real-life 2-year-old Emily he describes in Chapter Three, who tells herself stories after her parents leave her room. Here's a short excerpt from one of her stories:
Tomorrow when we wake up from bed, first me and Daddy and Mommy, you, eat breakfast eat breakfast like we usually do, and then we're going to play and then soon as Daddy comes, Carl's going to come over, and then we're going to play a little while. And then Carl and Emily are both going down to the car with somebody, and we're going to ride to nursery school [whispered], and then when we get there, we're all going to get out of the car...
Gladwell's account of Emily is fascinating, as she's allegedly a completely normal 2-year-old; they all do this when Mommy and Daddy aren't around.

Gladwell explains:
Sometimes these stories were what linguists call temporal narratives. She would create a story to try to integrate events, actions, and feelings into one structure — a process that is a critical part of a child's mental development.
If you look back at the comments in my hypothetical code from 20 years ago, you'll see that I was doing exactly what Emily does: making up a temporal narrative in an attempt to carve out a mental picture of the computation for myself. These stories I told myself were a critical part of my mental development as a programmer. I was a child trying to make sense of a big, scary new world.

Most programmers go through this phase. It's perfectly normal.

In contrast, here's what my code tends to look like today:
(defun js2-parse-variables (in-for decl-type)
  "Parse a 'var', 'const' or 'let' statement or for-loop initializer.
IN-FOR is true if we are currently in the midst of the init clause of a for.
DECL-TYPE is a token value: either VAR, CONST, or LET depending on context.
Returns the parsed statement node."

  (let ((result (make-js2-var-decl-node))
        destructuring-init
        destructuring
        s start tt init name node
        (continue t))
    ;; Examples:
    ;; var foo = {a: 1, b: 2}, bar = [3, 4];
    ;; var {b: s2, a: s1} = foo, x = 6, y, [s3, s4] = bar;
    (while continue
      (setq destructuring nil
            s nil
            tt (js2-peek-token)
            start js2-token-start
            init nil)
      (if (or (= tt js2-LB) (= tt js2-LC))
          ;; Destructuring assignment, e.g., var [a, b] = ...
          (setq destructuring (js2-parse-primary-expr))
        ;; Simple variable name
        (js2-must-match js2-NAME "msg.bad.var")
        (setq name (make-js2-name-node))
        (js2-define-symbol decl-type js2-ts-string))
      (when (js2-match-token js2-ASSIGN)
        (setq init (js2-parse-assign-expr in-for)))
      (if destructuring
          (progn
            (if (null init)
                ;; for (var [k, v] in foo) is initialized differently
                (unless in-for
                  (js2-report-error "msg.destruct.assign.no.init")))
            (setq node (make-js2-destructuring-init-node :start start
                                                         :end js2-ts-cursor
                                                         :lhs destructuring
                                                         :initializer init))
            (js2-node-add-children node destructuring init))
        ;; simple variable, possibly with initializer
        (setq node (make-js2-var-init-node :start start
                                           :end js2-ts-cursor
                                           :name name
                                           :initializer init))
        (js2-node-add-children node name init))
      (js2-block-node-push result node)
      (js2-node-add-children result node)
      (unless (js2-match-token js2-COMMA)
        (setq continue nil)))
    result))
If I'd seen this code 20 years ago I'd have been appalled. The lines of code are all crammed together! Some of them aren't even commented! If I'd been given the task of maintaining this code, I'd have been screaming "rewrite!"

I probably write more Java and JavaScript these days, but I picked an Emacs-Lisp function I wrote recently to highlight how alien my code today would have looked to me twenty years ago.

To be fair, this function is actually a port of some Java code from Mozilla Rhino's JavaScript parser, which in turn is a port of some C code from SpiderMonkey's parser, which in turn was probably borrowed and modified from some other compiler. Compiler code tends to have some of the purest lineage around, tracing back to the assembly-language code they wrote for the first compilers 40 or 50 years ago. Which means it's going to be a bit on the ugly side compared to "ordinary" code.

But when I write code in other languages these days, even in Java, it looks a lot more like this Emacs Lisp fragment than like the n00b code I was writing 20 years ago. It's denser: there's less whitespace and far less commenting. Most of the commenting is in the form of doc-comments for automated API-doc extraction. On the whole, my code today is much more compressed.

In the old days, seeing too much code at once quite frankly exceeded my complexity threshold, and when I had to work with it I'd typically try to rewrite it or at least comment it heavily. Today, however, I just slog through it without complaining (much). When I have a specific goal in mind and a complicated piece of code to write, I spend my time making it happen rather than telling myself stories about it.

A decade of experience makes you a teenager

After going through their 2-year-old phase, programmers eventually have to go through a stupid-teenager phase. All this month I've been hearing sad but unsurprising news stories about teenagers getting stuck on big rocks, being killed falling off cliffs, or dying of exposure. I'm actually lucky the same didn't happen to me when I was a teenager. It's just a bad time for us. Even though teenagers are old enough to understand the warnings, they have this feeling of invincibility that gets them into trouble and often mortal peril.

The programming equivalent happens around us all the time too. Junior programmers with five to ten years of experience under their belts (still n00bs in their own way) attempt to build giant systems and eventually find themselves stuck on the cliff waiting for a helicopter bailout, telling themselves "my next system rewrite will be better!" Or they fall off the cliff – i.e., the project gets canceled, people get laid off, maybe the company goes under.

Yes, I've gone through that phase too. And let's face it: even seasoned programmers need a little optimism and a little bravery in order to tackle real challenges. Even as an experienced programmer, you should expect to fail at projects occasionally or you're probably not trying hard enough. Once again, this is all perfectly normal.

That being said, as a hiring manager or company owner you should keep in mind that "5 to 10 years of experience" on a resume does not translate to "experienced"; it means "crazy invincible-feeling teenager with a 50/50 shot at writing a pile of crap that he or she and his or her team can't handle, and they'll eventually, possibly repeatedly, try to rewrite it all." It's just how things are: programmers can't escape being teenagers at some point.

Building compression tolerance

Hopefully the scene I've painted so far helps you understand why sometimes you look at code and you just hate it immediately. If you're a n00b, you'll look at experienced code and say it's impenetrable, undisciplined crap written by someone who never learned the essentials of modern software engineering. If you're a veteran, you'll look at n00b code and say it's over-commented, ornamental fluff that an intern could have written in a single night of heavy drinking.

The sticking point is compression-tolerance. As you write code through your career, especially if it's code spanning very different languages and problem domains, your tolerance for code compression increases. It's no different from the progression from reading children's books with giant text to increasingly complex novels with smaller text and bigger words. (This progression eventually leads to Finnegans Wake, if you're curious.)

The question is, what do you do when the two groups (vets and n00bs) need to share code?

I've heard (and even made) the argument that you should write for the lowest common denominator of programmers. If you write code that newer programmers can't understand, then you're hurting everyone's productivity and chances for success, or so the argument goes.

However, I can now finally also see things from the veteran point of view. A programmer with a high tolerance for compression is actually hindered by a screenful of storytelling. Why? Because in order to understand a code base you need to be able to pack as much of it as possible into your head. If it's a complicated algorithm, a veteran programmer wants to see the whole thing on the screen, which means reducing the number of blank lines and inline comments – especially comments that simply reiterate what the code is doing. This is exactly the opposite of what a n00b programmer wants. n00bs want to focus on one statement or expression at a time, moving all the code around it out of view so they can concentrate, fer cryin' out loud.

So it's a problem.

Should a team write for the least common denominator? And if so, exactly how compressed should they make the code? I think the question may be unanswerable. It's like asking for a single format for all books, from children's books to epic novels. Each team is going to have its own average preference. I suspect it's a good idea to encourage people to move their stories into design documents and leave them out of the code, since a junior programmer forced to work in a compressed code base may well grow up faster.

As for me, at this point in my career I would rather puzzle through a small, dense, complex piece of code than a massive system with thousands of files containing mostly comments and whitespace. To some people this trait undoubtedly flags me as a cranky old dinosaur. Since this is likely the majority of programmers out there, maybe I am a cranky old dinosaur. Rawr.

Metadata Madness

Everyone knows that comments are metadata: information about the data (in this case, the data being your source code). But people often forget that comments aren't just a kind of metadata. Comments and metadata are the same thing!

Metadata is any kind of description or model of something else. The comments in your code are just a natural-language description of the computation. What makes metadata meta-data is that it's not strictly necessary. If I have a dog with some pedigree paperwork, and I lose the paperwork, I still have a perfectly valid dog.

You already know the comments you write have no bearing on the runtime operation of your code. The compiler just throws them away. And we've established that one hallmark of a n00b programmer is commenting to excess: in a sense, modeling every single step of the computation in painstaking detail, just like Emily modeled her ideal Friday by walking through every step and reassuring her 2-year-old self that she really did understand how it was going to work.

Well, we also know that static types are just metadata. They're a specialized kind of comment targeted at two kinds of readers: programmers and compilers. Static types tell a story about the computation, presumably to help both reader groups understand the intent of the program. But the static types can be thrown away at runtime, because in the end they're just stylized comments. They're like pedigree paperwork: it might make a certain insecure personality type happier about their dog, but the dog certainly doesn't care.

If static types are comments, then I think we can conclude that people who rely too much on static types, people who really love the static modeling process, are n00bs.

Hee hee.

Seriously, though: I'm not actually bashing on static-typing here; I'm bashing on the over-application of it. Junior programmers overuse static typing in the exact same way, and for the same reasons, as they overuse comments.

I'll elaborate by first drawing a parallel to data modeling, which is another kind of "static typing". If you've been working in a field that uses relational databases heavily, you'll probably have noticed that there's a certain personality type that's drawn to relational data modeling as a career unto itself. They're usually the logical modelers, not the physical modelers. They may have begun their careers as programmers, but they find they really love data modeling; it's like a calling for them.

If you know the kind of person I'm talking about, you'll doubtless also have noticed they're always getting in your way. They band together and form Database Cabals and Schema Councils and other obstructive bureaucracies in the name of safety. And they spend a lot of time fighting with the engineers trying to get stuff done, especially at the fringes: teams that are not working directly with the schema associated with the main revenue stream for the company, but are out trying to solve tangential problems and just happen, by misfortune, to be homed in the same databases.

I've been in surprisingly many situations at different companies where I had a fringe team that was being held up by data modelers who were overly concerned about data integrity when the real business need was flexibility, which is sort of the opposite of strong data modeling. When you need flexible storage, name/value pairs can get you a long, long, LONG way. (I have a whole blog planned on this topic, in fact. It's one of my favorite vapor-blogs at the moment.)
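To make the name/value idea a little more concrete, here's a rough sketch in Java. It's an in-memory stand-in rather than a database design, and the PropertyBag class and the attribute names are made up purely for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical record type: every attribute is just a name/value pair,
// so adding a new attribute needs no schema or class change.
final class PropertyBag {
    private final Map<String, Object> props = new LinkedHashMap<>();

    PropertyBag set(String name, Object value) {
        props.put(name, value);
        return this; // returning this allows chained calls
    }

    Object get(String name) {
        return props.get(name); // null when the name was never set
    }

    boolean has(String name) {
        return props.containsKey(name);
    }

    public static void main(String[] args) {
        PropertyBag item = new PropertyBag()
            .set("sku", "A-100")
            .set("weightKg", 2.5);
        // A later requirement adds a field without touching any model.
        item.set("giftWrap", true);
        System.out.println(item.get("sku") + " " + item.has("color"));
    }
}
```

The trade-off is the one described above: nothing stops a caller from misspelling a name or storing the wrong kind of value, which is exactly the integrity the modelers worry about.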

It's obviously important to do some amount of data modeling. What's not so obvious is when to stop. It's like commenting your code: newer programmers just don't know when to quit. When you're a little insecure, adding comments and metadata is a great security blanket that makes you feel busy when you've in fact stopped making forward progress and are just reiterating (or perhaps teaching yourself) what's already been accomplished.

Hardcore logical data modelers often suffer from an affliction called metadata addiction. Metadata modeling is seductive. It lets you take things at a leisurely pace. You don't have to be faced with too much complexity at once, because everything has to go in a new box before you'll look at it. To be sure, having some metadata (be it a data model, or static types, or comments) is important for human communication and to some extent for performance tuning. But a surprising percentage of people in our industry take it too far, and make describing an activity more important than the activity itself.

The metadata-addiction phenomenon applies equally to coders. Code is data, and data is code. The two are inextricably linked. The data in your genes is code. The floor plans for your house are code. The two concepts are actually indistinguishable, linked at a fundamental level by the idea of an Interpreter, which sits at the very heart of Computer Science. Metadata, on the other hand, is more like the kidney of Computer Science. In practice you can lose half of it and hardly notice.

Creeping bureaucracy

I think that by far the biggest reason that C++ and Java are the predominant industry languages today, as opposed to dynamic languages like Perl/Python/Ruby or academic languages like Modula-3/SML/Haskell, is that C++ and Java cater to both secure and insecure programmers.

You can write C++ like straight C code if you like, using buffers and pointers and nary a user-defined type to be found. Or you can spend weeks agonizing over template metaprogramming with your peers, trying to force the type system to do something it's just not powerful enough to express. Guess which group gets more actual work done? My bet would be the C coders. C++ helps them iron things out in sticky situations (e.g. data structures) where you need a little more structure around the public API, but for the most part they're just moving data around and running algorithms, rather than trying to coerce their error-handling system to catch programmatic errors. It's fun to try to make a bulletproof model, but their peers are making them look bad by actually deploying systems. In practice, trying to make an error-proof system is way more work than it's worth.

Similarly, you can write Java code more or less like straight C, and a lot of seasoned programmers do. It's a little nicer than C because it has object-orientation built in, but that's fairly orthogonal to the static type system. You don't need static types for OOP: in fact OOP was born and proven in dynamic languages like Smalltalk and Lisp long before it was picked up by the static-type camps. The important elements of OOP are syntax (and even that's optional) and an object model implemented in the runtime.

So you can write Java code that's object-oriented but C-like using arrays, vectors, linked lists, hashtables, and a minimal sprinkling of classes. Or you can spend years creating mountains of class hierarchies and volumes of UML in a heroic effort to tell people stories about all the great code you're going to write someday.

Perl, Python and Ruby fail to attract many Java and C++ programmers because, well, they force you to get stuff done. It's not very easy to drag your heels and dicker with class modeling in dynamic languages, although I suppose some people still manage. By and large these languages (like C) force you to face the computation head-on. That makes them really unpopular with metadata-addicted n00bs. It's funny, but I used to get really pissed off at Larry Wall for calling Java programmers "babies". It turns out the situation is a little more complicated than that... but only a little.

And Haskell, OCaml and their ilk are part of a 45-year-old static-typing movement within academia to try to force people to model everything. Programmers hate that. These languages will never, ever enjoy any substantial commercial success, for the exact same reason the Semantic Web is a failure. You can't force people to provide metadata for everything they do. They'll hate you.

One very real technical problem with the forced-modeling approaches is that static type systems are often "wrong". It may be hard to imagine, because by a certain definition they can't be "wrong": the code (or data) is programmatically checked to conform to whatever constraints are imposed by the type system. So the code or data always matches the type model. But the type system is "wrong" whenever it cannot match the intended computational model. Every time you want to use multiple inheritance or mixins in Java's type system, Java is "wrong", because it can't do what you want. You have to take the most natural design and corrupt it to fit Java's view of the world.
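Here's a small sketch of what that bending can look like. All the names (Walker, Swimmer, Legs, Fins, Duck) are hypothetical, and it assumes the plain single-inheritance Java of this era, with no default methods on interfaces.

```java
// Hypothetical behaviors we would like to mix into one class.
interface Walker { String walk(); }
interface Swimmer { String swim(); }

class Legs implements Walker {
    public String walk() { return "walking"; }
}

class Fins implements Swimmer {
    public String swim() { return "swimming"; }
}

// Java permits only one superclass, so Duck cannot extend both Legs and Fins.
// A common workaround: implement both interfaces and forward each call by hand.
final class Duck implements Walker, Swimmer {
    private final Walker legs = new Legs();
    private final Swimmer fins = new Fins();

    public String walk() { return legs.walk(); }
    public String swim() { return fins.swim(); }

    public static void main(String[] args) {
        Duck d = new Duck();
        System.out.println(d.walk() + ", " + d.swim());
    }
}
```

It works, but every mixed-in method costs a hand-written forwarding method, and that boilerplate exists only to satisfy the type model.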

An important theoretical idea behind type systems is "soundness". Researchers love to go on about whether a type system is "sound" or not, and "unsound" type systems are considered bad. C++ and Java have "unsound" type systems. What researchers fail to realize is that until they can come up with a type system that is never "wrong" in the sense I described earlier, they will continue to frustrate their users, and their languages will be abandoned for more flexible ones. (And, Scala folks, it can't just be possible to express things like property lists – it has to be trivial.)

To date, the more "sound" a type system is, the more often it's wrong when you try to use it. This is half the reason that C++ and Java are so successful: they let you stop using the type system whenever it gets in your way.

The other half of their success stems from the ability to create user-defined static types. Not, mind you, because they're helpful in creating solidly-engineered systems. They are, sure. But the reason C++ and Java (particularly Java) have been so successful is that their type systems form a "let's not get any work done" playground for n00bs to spend time modeling things and telling themselves stories.

Java has been overrun by metadata-addicted n00bs. You can't go to a bookstore or visit a forum or (at some companies) even go to the bathroom without hearing from them. You can't actually model everything; it's formally impossible and pragmatically a dead-end. But they try. And they tell their peers (just like our metadata-addicted logical data modelers) that you have to model everything or you're a Bad Citizen.

This gets them stuck on cliffs again and again, and because they're teenagers they don't understand what they did wrong. Static type models have weight and inertia. They take time to create, time to maintain, time to change, and time to work around when they're wrong. They're just comments, nothing more. All metadata is equivalent in the sense of being tangential documentation. And static type models get directly in the way of flexibility, rapid development, and system-extensibility.

I've deleted several thousand words about the evolution of Apache Struts and WebWork, an example framework I chose to illustrate my point. Rather than waste a bunch of time with it, I'll just give you a quote from one of the Struts developers in "The Evolution of Struts 2":
"...the Struts 1 code base didn’t lend itself to drastic improvements, and its feature set was rather limited, particularly lacking in features such as Ajax, rapid development, and extensibility."
Struts 1 was thrown away for WebWork, which was itself in the process of throwing away version 1 (for similar reasons) in favor of version 2 (which has all the same problems).

Some of those several thousand words were devoted to JUnit 4, which has comically (almost tragically) locked on, n00b-style, to the idea that Java 5 annotations, being another form of metadata, are the answer to mankind's centuries of struggle. They've moved all their code out of the method bodies and into the annotations sections. It's truly the most absurd overuse of metadata I've ever seen. But there isn't space to cover it here; I encourage you to go goggle at it.

There are die-hard Java folks out there who are practically gasping to inject the opinion, right here, that "rapid development" is a byproduct of static typing, via IDEs that can traverse the model.

Why, then, was Struts considered by its own developers to be a failure of rapid development? The answer, my dear die-hard Java fans, is that a sufficiently large model can outweigh its own benefits. Even an IDE can't make things go faster when you have ten thousand classes in your system. Development slows because you're being buried in metadata! Sure, the IDE can help you navigate around it, but once you've created an ocean, even the best boats in the world take a long time to move around it.

There are hundreds of open-source and proprietary Java frameworks out there that were designed by code-teenagers and are in perpetual trouble. I've often complained that the problem is Java, and while I think the Java language (which I've come to realize is disturbingly Pascal-like) is partly to blame, I think the bigger problem is cultural: it's hard to restrain metadata addiction once it begins creeping into a project, a team, or an organization.

Java programmers, and logical data modelers, and other metadata-addicted developers, are burying us with their "comments" in the form of models within their static type system. Just like I did when I was a n00b. But they're doing it with the best of intentions, and they're young and eager and energetic, and they stand on street corners and hand you leaflets about how great it is to model everything.

Seasoned programmers ignore them and just get it done.

Solutions and takeaways

Software engineering is hard to get right. One person's pretty data model looks like metadata-addiction to another person.

I think we can learn some lessons from code-commenting: don't try to model everything! You need to step back and let the code speak for itself.

For instance, as just one random illustrative example, you might need to return 2 values from a function in Java (a language with no direct support for multiple return values). Should you model it as a MyFunctionCallResult class with named ValueOne and ValueTwo fields (presumably with actual names appropriate to the problem at hand)? Or should you just return a 2-element array (possibly of mixed types) and have the caller unpack it?

I think the general answer to this is: when in doubt, don't model it. Just get the code written, make forward progress. Don't let yourself get bogged down with the details of modeling a helper class that you're creating for documentation purposes.
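For the sake of argument, here are the two options side by side. The quotient-and-remainder function and every name in it are invented for illustration; they're not from any real code base.

```java
// Hypothetical example: integer division that hands back quotient and remainder.
final class TwoValues {

    // Option 1: a small modeled result type with named fields.
    static final class DivResult {
        final int quotient;
        final int remainder;

        DivResult(int quotient, int remainder) {
            this.quotient = quotient;
            this.remainder = remainder;
        }
    }

    static DivResult divModeled(int a, int b) {
        return new DivResult(a / b, a % b);
    }

    // Option 2: no extra type; the caller unpacks by position.
    static int[] divArray(int a, int b) {
        return new int[] { a / b, a % b };
    }

    public static void main(String[] args) {
        DivResult r = divModeled(17, 5);
        System.out.println(r.quotient + " " + r.remainder);

        int[] qr = divArray(17, 5);
        System.out.println(qr[0] + " " + qr[1]);
    }
}
```

Both calls produce the same two numbers. The array version is one line of code and zero new types; the modeled version buys you field names at the cost of a class you now have to maintain. For mixed types the array would have to be an Object[] with casts at the call site, which is where the trade-off starts to tilt.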

If it's a public-facing API, take a lesson from doc-comments (which should be present even in seasoned code), and do model it. Just don't go overboard with it. Your users don't want to see page after page of diagrams just to make a call to your service.

Lastly, if you're revisiting your code down the road and you find a spot that's always confusing you, or isn't performing well, consider adding some extra static types to clarify it (for you and for your compiler). Just keep in mind that it's a trade-off: you're introducing clarifying metadata at the cost of maintenance, upkeep, flexibility, testability and extensibility. Don't go too wild with it.

That way the cliff you build will stay small enough for you to climb down without a helicopter rescue.

Postscript

I'm leaving comments on, at least until "click-my-link" spam starts to surface. I'm curious to know how this entry goes over. This was an especially difficult entry to write. I did a lot of editing on it, and left out a lot as a result. I feel like I may not have made my points as clearly as I'd like. And I'm sure I haven't convinced the metadata-addicted that they have a problem, although at least now they know someone out there thinks they have a problem, which is a start.

Let me know what you think!

116 comments:

Ben said...

Well, the typing issue is a thorny one. I think as a general rule, the reddit types are too eager to denounce it immediately and tell us all that Ruby is the answer, or whatever it is this week, and the Java purists and academics make the opposite argument.

Coming from someone from a dynamic background, I didn't use a language with any strong typing until I had been programming for 2 years. I'm not a heavy commenter of code (I'm not into redundancy), but I love static typing....until I don't.

A few years of not catching an error until a specific code-path is executed (and let's be honest, I don't test as much as I should) gets really old. I love it when the compiler says "you're a moron" before I even have to think about writing a test.

I think expecting tests for every situation just isn't realistic. I enjoy this about C as well, in as much as the compiler can do this, so maybe it's a compiler thing more than a static thing.

I'm pulling a Yegge. Point is, use both. Use the dynamic side enough to find the static side ridiculous, and use the static side enough to find the dynamic side occasionally too eager to not be helpful.

I think this is the point where you'll program with an appropriate level of safety. There are things you can do in a dynamic language to give you that safety, and there are things you can do in a static language to not be stupid about. Keeping perspective on these moments is what professional, accumulated knowledge gives you.

Tappen said...

Good post. I've had some doozy arguments about excessive commenting and class/interface defining myself. You covered some aspects of the issue I hadn't thought of.

There's a parallel with processes: managers develop corporate processes to try to allow not-so-smart/experienced people to accomplish the same tasks that the talented people can do. It rarely if ever works because doing so takes away the understanding and flexibility that made the initial accomplishment so worthwhile.

In both cases there's the mistaken belief that the important thing is to allow repetition, reuse, as if we were still on a Henry Ford assembly line instead of the modern world where doing things well, and quickly, is usually 10x as valuable as doing them a 2nd time.

Dave said...

I'm not really smart enough to debate with the bigwigs, but I can categorically agree with the idea that over modelling costs a lot for sometimes negligible gain.

Since shifting focus from enterprise systems in Java to much more rapid projects in Ruby over the last 18 months I'm noticing daily just how much more we get done, even though our team sizes are way smaller.

And also, let's not forget that it's very possible to over model in a dynamic language as well.

I do worry though that the tradeoff is yet to be felt - there are a lot of people "just doing it", but is maintenance of their rapidly developed Ruby/Python program going to come back and bite them some time down the line?

Brian Slesinsky said...

Ugh. There are some good points in there, but that code is not nearly as good as you think it is. Of course you don't have to write paragraph-long comments, but have you ever heard of "extract method"?

Since it seems that forced code review hasn't cured you of writing only for yourself, I think the only hope of getting beyond the adolescent phase is either pair-programming or teaching.

Max Kanat-Alexander said...

I've thought about some of this a lot, particularly the experienced programmer vs. the novice programmer bit. I work on an open-source project, where I'm an experienced and trained programmer but the majority of contributions come from people who are not.

My experience is that it's up to trained programmers to devise the "way it's going to be" and review and correct the novice programmers on that way. The novice programmers eventually pick it up and understand it, even if in a limited sense, and start to write their code that way.

My experience says that it's entirely possible to train people to be better programmers without years and years of experience, and so it's never necessary to reduce yourself to writing for the lowest common denominator. Of course, if you're approaching the bound of complexity where even an experienced programmer would have difficulty reading the code, then that's a completely different issue to consider.

-Max

Mark said...

Hi, my name is Mark, and I'm a meta-data addict.

Hi Mark

It all started at my first job supporting a database driven app written by a cobol programmer.

I spent more time putting out fires than I did adding new features. The database model wasn't locked down. Once I changed the database so that the data model was more explicitly defined through foreign keys and the like, the bugs in the application code that caused inconsistencies were easier to find. I was able to work on features on a regular basis and deal with bugs as they occurred rather than when they were discovered.

Fast forward 3 years and I'm working in a Python shop. All my functions start with:
assert isinstance(x, y)
Some of my co-workers complain that things fail when they pass an int instead of a float. It's easy to ignore their complaints as they also state that unit testing isn't important.

I have a problem.
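To make the int-versus-float complaint above concrete, here is a minimal sketch. The function names `scale_strict` and `scale_lenient` are hypothetical and not from Mark's codebase; the second version is only one possible compromise, assuming the intent is "any real number" rather than "exactly a float".

```python
import numbers


def scale_strict(x, factor):
    # The style described above: the argument must be exactly a float.
    assert isinstance(x, float), f"expected float, got {type(x).__name__}"
    return x * factor


def scale_lenient(x, factor):
    # Hypothetical alternative: accept any real number (int, float,
    # Fraction, ...) while still rejecting strings, None, and so on.
    assert isinstance(x, numbers.Real), f"expected a number, got {type(x).__name__}"
    return x * factor


try:
    scale_strict(2, 3)          # an int is not a float, so this fails
except AssertionError as err:
    print("strict check failed:", err)

print(scale_lenient(2, 3))      # the int is accepted here
```

Note that Python skips `assert` statements entirely when run with `-O`, so checks written this way are a development aid rather than a guarantee.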

hairyape said...

i recognize myself in your description of the software teenager, i definitely went through that phase.

i've grown since then and the biggest change i can point to is my move away from statically-typed programming languages. it feels like my arteries have been unclogged.

your statement about tackling technical challenges head-on is exactly the feeling i get when i code these days. i spend most of my time solving the problem i need to solve instead of building up scaffolding to solve it.

if i read your post a few years ago i would have left an emotional comment listing reasons i thought you were so wrong. being a teenager is tough. i wonder what i'll think of my current self ten years from now.

Samuel A. Falvo II said...

"To date, the more "sound" a type system is, the more often it's wrong when you try to use it."

Boy are you ever wrong on this one. Both Oberon and Haskell have type systems which are very sound, and in my experience, I can count the number of times it's been "wrong" on one hand. ONE hand. Out of all the programming projects I've done with them.

In point of fact, Oberon's type system is nearly identical to C's in terms of expressivity, but is much stricter than C's to ensure proper coupling of mutually untrusted modules. Having many years' experience with both C and Oberon, I find that C offers *ZERO* productivity benefit over Oberon, but instead a 100% more error-prone environment.

This is why every solid C coder will tell you, "Use -Wall -Werror." The first flag enables most of the compiler's warnings and the second forces it to treat all warnings as errors (GCC in this case; Visual C/C++ has similar features). They'll also tell you to minimize the use of type-casting. These suggestions come from C coders with >20 years experience. These rules of thumb in C are mandated in Oberon.

No, what makes C/C++ more popular than other languages is their relative brevity -- as Paul Graham points out, brevity is what makes a language "popular." This is why Oberon, for all its bad-a$$ness, failed to capture the market. Being a Modula-2-derived language, it was "too wordy."

Type systems really, truly, honestly have nothing to do with it.

Samuel A. Falvo II said...

Oops, I should point out that excessive brevity is also a killer, too. Were it not, APL, and perhaps its successor J, would have conquered the world. Clearly, this also has not happened.

Tracy R Reed said...

Isn't it true in general that the average programmer doesn't comment enough? It seems rare to me that I see such a ridiculous narrative in the comments; more often I see just pages and pages of code with no comments, no factoring, etc. It seems that is the more common n00b problem than over-commenting once they are out in industry and not turning in CS101 assignments anymore where they know the professor might spank them for not commenting.

JS said...

Funny you should pick on database data modeling. The idea behind it is to eliminate redundancy in data and to represent it in a form usable by multiple applications for a wide assortment of purposes. This sort of modeling is supposed to improve flexibility in how data may be used. Compared to overcommenting or static typing, database metadata seems a very different animal.

Justin Rudd said...

Funny. I was just working on some code today and started pulling out pieces and modeled it as an interface so that I could mock it :) And yes the double meaning is intentional.

I think I've finally hit my rebellious teenage years. I've stopped listening to the man (aka authority figures aka "a"-list bloggers) about TDD, TFD, BDD, ADD :), etc. And also mock vs. stub and state vs. behavior testing, etc. Who friggin' cares? I just want to solve the problem at hand and write some tests that turn the bar green and give me a "good enough" feeling that the code is fine.

Justin George said...

The cool thing with Haskell's stuff is that you can completely embed a dynamic type system in it, so you only get the strictness where you want it.

To be fair, you can do that the other way around in dynamic languages (above commenter and his assert isinstances) but few people build a type inference engine into their dynamic apps.

It's funny, I've gone through both of those stages, and I think I'm moving back towards static typing. It even makes testing easier.

Ravi said...

It is interesting that you praise code compression and then mock Haskell and its type system. It really looks like you're contradicting yourself. Besides being one of the tersest languages known to man (consider point-free code and the amount of plumbing that can be buried in a stack of monads), you could argue that Haskell types are the ultimate in code compression.

A function's type is typically a one-line compression of all of its code - enough detail to give you a mental model of what the function does, but abstracting away all of the details of how a function does its job. The correspondence between functions and types can be very tight as we see from tools like hoogle and djinn, where, in a magical reversal of type inference, functions can often be inferred (or found) based on their types.

And Haskell has an inference-based system (except for some of the more experimental corners), so you can get that compressed mental model for free - the compiler will compute it for you! Or if you want to check that your mental model corresponds to the code that you are writing as you go, you write down a type signature and have the compiler check it. This is a great way to create useful documentation and get feedback while you're working - without taking you out of coding space and into testing space.

I'll agree that learning to use an advanced static type system is a difficult skill and that it can take even good programmers years to understand how a type system can be a sword that helps you destroy complexity rather than create it. But just because a skill is hard to master doesn't mean it isn't worth mastering. I'm more productive in Haskell than I am in any other language not just because I've been a full-time Haskell programmer (at least as much as I can be a full-time anything at a startup) for over 4 years, but because I've learned how to turn its type system into one of my most powerful development allies - helping me check my mental models, make sure I don't make silly mistakes when refactoring or extending code and, most importantly, giving me a shorthand language that lets me take the vague intuitions I want to implement and start giving them (minimal) concrete form. I can use this concrete form as a starting point for code, as an efficient way to communicate with colleagues and as a way to keep more parts of a fantastically complex system in my head at the same time.

It really sounds like you've just missed the point of modern type-inference systems. I'll agree that there are plenty of mediocre researchers trying to fill in ugly corners with arcane theories, but the core of a Hindley-Milner based type system is a beautiful thing - and the battle-tested set of extensions you can see in languages like Haskell and OCaml are worthy allies for any programmer.

Karl Rosaen said...

The best part of the article:

"To date, the more "sound" a type system is, the more often it's wrong when you try to use it. This is half the reason that C++ and Java are so successful: they let you stop using the type system whenever it gets in your way."

Isn't there truth to the other side then too? The more dynamic a language's type system is the more comments you need to explain what function arguments are, and the more unit tests you need? So maybe Java's success is in striking a balance between the two.

dibblego said...

I've called you on your under-qualified comment before, but it seems your state of delusion is getting worse.

Please clue up a bit before making claims about type systems. Seriously, public ridicule is the only adequate response for such tripe.

Curiously, have you read this?
http://cdsmith.twu.net/types.html

Alan said...

I get the point about static types being just metadata, but I really feel like you're overstating your case. Types are much more functional as metadata than code comments are; they help ensure the correctness of your program. That doesn't mean they're always the right thing, just that the comparison is a bit stretched. They're closer to unit tests than to comments. People can get lost in writing the perfect set of tests, too, and unit tests also exert a serious maintenance drag by making changes to the system difficult, but that doesn't mean they're not useful. The larger the system is and the more people that work on it, the more useful the type information is. And yeah, I know ruby/python/lisp/etc. help keep projects smaller so you don't get in those situations as easily, but some systems just end up having a large surface area no matter what language they're in.

As to datamodelling, I think you're completely off. Name value tables? Really? Maybe for a prototype or your personal website. But for anything that needs to perform or where you actually care whether or not your data is trashed, you kind of need some kind of schema with (gasp!) typed, named columns and (double gasp!) maybe even some foreign key or nullability constraints. Code is much easier to change release to release than the database schema is, which means you do need to spend more time making sure you can live with whatever you ship/put into production. A bunch of name-value tables with self-pointers might seem more "flexible" at development time, but it makes interpreting the data basically impossible, and god help you if your application changes in such a way that it starts misinterpreting things. If your code is buggy, that's one thing. If your code loses or corrupts data, that's generally game over.

And maybe the majority of database applications out there never require any performance tuning, but as I'm sure you know if your application is going to expect any significant database load the name/value architecture probably isn't going to fly, and given how hard schemas are to change it's not something you want to find out after your server's fallen over under load.
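The contrast Alan draws between a constrained schema and a name-value table can be sketched with Python's built-in `sqlite3` module. The table and column names (`customer`, `purchase`, `attrs`) are made up for illustration, and SQLite stands in for whatever database is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless it is requested.
conn.execute("PRAGMA foreign_keys = ON")

# Typed, named columns with NOT NULL and foreign key constraints.
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        amount_cents INTEGER NOT NULL
    )
""")
# The "flexible" alternative: one generic name-value table.
conn.execute("CREATE TABLE attrs (entity TEXT, name TEXT, value TEXT)")

conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Ada')")


def try_insert(sql):
    """Run one INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


# A purchase that points at a customer who does not exist.
schema_result = try_insert(
    "INSERT INTO purchase (customer_id, amount_cents) VALUES (999, 500)"
)
nv_result = try_insert(
    "INSERT INTO attrs VALUES ('purchase:2', 'customer_id', '999')"
)
print("schema table:", schema_result)
print("name-value table:", nv_result)
```

The constrained table refuses the dangling reference, while the name-value table stores it without complaint, leaving the application code to notice (or not) later.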

Frans Bouma said...

In general you seem to be able to make a point properly, but not in this post. It sweeps from semi-point to semi-point without getting really TO the point, IMHO.

The main gripe I have with your post is that it tries to sell the idea that over-using comments and static types is bad, however it fails to illustrate what is NOT over-using and what is. This surprises me because it's so simple to describe:

- with static types: if introducing a static type DECREASES complexity: introduce it, otherwise DON'T
- with comments: if adding the comment makes it easier for a mortal to understand wtf is going on, add the comment, otherwise, don't.

Humans suck really bad at interpreting code. We need all the help we can get and even then we suck at it. This implies two things:
1) a programmer who inherits a project has in general a hard time
2) a programmer who just wrote a piece of code can't in ALL cases find the errors s/he made in that code right away by re-reading it.

I.o.w.: if you as a programmer rely on 'the code speaks for itself', you're mistaken: a human has to PARSE and INTERPRET the code to understand what it does and to know what the value of _foo is at line 243 after that wacky loop has completed. Comments can help in that area, and should be added to AID to understand the code when a human has to read it.

And we're not all equal. A programmer who thinks s/he's very very smart can perhaps decide not to add any comments because it's so straight forward, however a person who takes over the project might just because of that have a hard time understanding it, and might misinterpret what the code does, introduce a mistake because of that etc.

Is that progress? Did the team as a whole become better because of the lack of comments because the veteran was too snobby to realize that not everyone can program a C++ compiler in assembler?

I surely think not. That doesn't mean we all should write books inside comments. As I said: comments should describe what the code does for the human reader. Not as in: // increases i
but as in: /* we have to check ... here because if we do it later we have a performance problem. */

You get the idea.
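As a rough illustration of the distinction drawn above between a comment that restates the code and one that explains why it is written that way, here is a small sketch. The function `dedupe_keep_order` is a made-up example, not code from the post.

```python
def dedupe_keep_order(items):
    """Return items with duplicates removed, keeping first occurrences."""
    seen = set()
    result = []
    for item in items:
        # Unhelpful version of this comment: "check if item is in seen".
        # Helpful version: we test membership against a set instead of
        # `result`, because a list lookup here would make the whole loop
        # quadratic on large inputs.
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


print(dedupe_keep_order([3, 1, 3, 2, 1]))
```

The second comment records a decision the reader cannot recover from the code alone, which is the kind of aid the comment above is asking for.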

John "Z-Bo" Zabroski said...

Steve, pay attention to the comments section. In particular, Mark, Alan, Samuel, and dibblego. Your critics tend to be right.

It just seems like you see a problem but don't understand it. Imparting this vision on others is one thing. Imparting an incorrect understanding is another. The former is an experience report, while the latter infects reader's minds.

Probably the most balanced explanation of the differences between C and Lisp is Richard Gabriel's Counterpoint: Do programmers need seat belts?.

Please do not tell impressionable readers that Database Administrators know nothing about programming. I could've put up with most of your nonsense, but that comment is intolerable. Mark the Metadata Addict in the comments section explained why. Telling people to devalue the Schema is the biggest mistake you could possibly make.

Moreover, telling people models aren't important shows that you see a problem but don't understand it. Models are everywhere in languages like Lisp. Just pick up a book published by Springer-Verlag about AI and planning algorithms, and you'll see tons of code written in Lisp along with discussion about how the author's model stacks up against other models.

I read your blog posts mostly for the interesting metaphors, not for the technical advice.

AlBlue said...

I've got to disagree with your view that types are purely documentation in the same way that comments are.

Comments aren't understood by the compiler. Types are. There's a key difference between 'stuff that a compiler can do things with' and 'stuff that a compiler can't do things with'.

All programming is telling a computer what to do. Documentation is explaining to humans what it does. Some things, like well-chosen function names, function signatures (whether that's an informal "this takes two arguments" or a highly-specified combination of return types and typed arguments) are useful to both sides.

There's also a lot of misinformation. A well-typed program isn't any more correct than an untyped program; both can still have bugs. The former has a class of problems that a compiler (or IDE) can find out for you in advance, but it's not a guarantee of 'correctness' that some people seem to claim, which is an easy point to pick on.

As for the type information at runtime; there *is* information at runtime in some languages (Java's use of 'instanceof' or 'getClass', for example). In fact, some of this type information is available in dynamically typed languages as well; you can find a Python object's class at runtime and choose to do different things.
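A minimal sketch of that last point in Python follows. The function `describe` and its categories are invented for illustration; `type()` and `isinstance()` are the standard built-ins.

```python
def describe(value):
    # type() returns the object's class at runtime.
    kind = type(value).__name__
    # bool is a subclass of int in Python, so it has to be tested first.
    if isinstance(value, bool):
        return f"{kind}: flag"
    if isinstance(value, (int, float)):
        return f"{kind}: number"
    if isinstance(value, str):
        return f"{kind}: text"
    return f"{kind}: something else"


for sample in (3, 2.5, "hi", True, [1, 2]):
    print(describe(sample))
```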

One advantage of Python/Ruby vs Java at the moment is that the former allows for functions to be passed around, which Java doesn't. However, that's not a failure with statically typed systems; Scala supports that, for example.

To conclude; there are merits in types, but the whole argument about types being 'good' or 'bad' is pretty polarising. I don't see why good systems can't take advantage of both in the right situations; and part of that is understanding what the limitations are in each place. I wouldn't use a J2EE EJB system for mailing me when someone's birthday is coming up; but then again, I wouldn't use Python to write a distributed transactional on-line banking system either. The big problem is people who are only exposed to one type of problem, then think that everything can be done with the same set of tools.

Steve Cooper said...

Beginners are much more comfortable with their native language, and have to code in both natural language and the target language;
// Make the number one bigger;
i++;

The type of compression I think you're talking about is mainly one of fluency; the ability to 'think in Russian' (http://www.imdb.com/title/tt0083943/). Once you've internalised 'i++', you don't need to explain it back to yourself.

Hopefully, instead of creating big monstrous blocks, you can create better abstractions (functions, libraries, macros) to keep your code relatively clear; not

(if destructuring
(progn

but

(when destructuring

The key is the ability to deal with more and better forms of abstraction.

Adam said...

John, I don't think Steve is "telling people models aren't important". He's saying that overuse is wrong.

On the other hand, Steve, you are overusing the metaphor that static types are metadata, as some before me have pointed out.

But, I would like to test a theory of mine. Me, myself and I, am solidly anchored in the safety of strongly typed languages. But I can sort of "feel" the advantages of a more loosely typed system, though my experience of such languages is limited to a short session on JavaScript, in a project where we were writing in the old style anyway. So my question is this:

Could it be that the productivity advantage sometimes seen in dynamic languages is greater for smaller projects? Could it be that in fact development [of a fully debugged system] becomes a lot harder the dynamic way for a project that is larger than say 10 man-years? Opinions? Anyone?

Adam said...

This article seems to be an unhealthy mix of overgeneralization and childish name calling. (Do you really need to call people "metadata addicts" to convey the notion that there's a point of diminishing returns for modeling activities?)

In my own experience, what you're going through seems to be common among "senior" programmers. They tend to confuse "writing code" with "getting stuff done." I can only speculate, but I think it may be a natural consequence of the narrow perspective afforded by writing code most of the time.

The comments on database schemas are particularly telling. You seem to be assuming that all the potential use cases for the database occur within the confines of your current code base. Might be true initially, but just plain silly as a long-term assumption.

Adam said...

Incidentally, the two "adam" commenters above are, in fact, different people. (Hi, other person I don't know with the same handle as me! [waves] )

Thomas David Baker said...

Thanks for yet another thought-provoking post Steve. Just quickly I have to say please keep posting and keep the posts just as long as they are now (or longer)!

Almost every line of code I write for my day job is then available to any of the hundreds of thousands of network creators on Ning. Those who actually delve into it (you don't have to) vary from experienced programmers to people that just want to add a new page to their website and it is their first experience with programming.

I wonder what your advice would be on commenting (and static typing, and everything else) in that scenario?

We tend to go moderately big (javadoc-style) on the comments at the class and function level. And we even have a little static typing in there, even though this is PHP.

A different question from the one you are addressing, for sure, but one that is particularly interesting to me!

Charles said...

I disagree with you lumping comments, static types, and annotations into one "meta-data" category. An interesting analogy you've made, but it seems stretched too far.

Static types are clearly more powerful than comments. Generics and function overloading immediately spring up as (arguably) positive uses of static types.

Code annotations are even more potent, giving rise to new refactoring possibilities and even denser code. To be honest though, I found the static type/Java annotations link tenuous.

Paddy said...

All living souls welcome whatever they are ready to cope with; all else they ignore, or pronounce to be monstrous and wrong, or deny to be possible.
- George Santayana

Somehow I feel that statement applies to a lot of commenters upstream.

I welcome your musings and I can see the resonance in your ideas :)

Peter Svensson said...

Personally, I feel that the points Steve gets across very well are these:

1. The whole point of programming is to deliver a working system.
2. A lot of time is being spent on too much meta-data, instead of problem-solving.

Also, he makes a case that the features of certain languages make problem-solving quicker, without impairing the delivery of the system.

Michael Duffy said...

Comments? We don't need no stinkin' comments, especially those awful ones to demark the end of a block or class:

public class FooBar
{
} // end class FooBar

I will admit that I'm getting uncomfortable about my addition to curly brace languages after reading Steve for a while.

I'd love to see how long his rant about paper architects and UML would be. That's the ultimate meta-data, in my opinion.

I'll bet that Google would laugh at the idea of someone with the title "architect" not writing code, but that's exactly the direction that many companies are taking. It dovetails well with their mental model of "software development as manufacturing", where UML takes the place of engineering drawings and overseas outsourced coders are the assembly line workers stamping out the widgets according to plan.

Michael Duffy said...

s/addition/addiction/

TH said...

I may be categorized as a n00b or as a teenager.

I have programmed quite a bit in Java/C++, but I have also programmed a fair amount in Python. To me, Python code is smaller, and in some situations a solution is quicker to write.

But I have found one thing: the type system does not matter at all. I have failed and succeeded irrespective of the type system I used.

**The thing that really matters is working code!**

I agree Python has sometimes made it possible for me to reach the desired working code quicker. But the reverse has also been true.

You gave the Struts example; I can give you Ruby/Python examples too.

"Type wars" are for those who want to talk about code, not write code. God damnit, everyone should write some working code! :P

TH
http://bootstrapping.wordpress.com/

Lorenzo said...

teenager ... seems roughly analogous to effects seen in Brooks, such as "second system syndrome".

My commenting style tends more toward somewhat richer "header block" comments, about the purpose and general approach to the function and less inline comments unless there's some nasty gotcha in there. ... avoid the nasty gotchas.

Nice article!

Tom said...

I program in English. After reviewing the English, I comment in Java after each sentence to let the computer know how to do it. In essence, I program in dual languages. IMHO the approach is what's important.

fr. chorazon 652 said...

Nice post, and oh-so-true. One nitpick - being unnecessary is *not* what makes metadata 'meta'. Try deleting your filesystem metadata and see what happens... :o)

Gwenhwyfaer said...

"in fact OOP was born and proven in dynamic languages like Smalltalk and Lisp long before it was picked up by the static-type camps"

(nitpick alert)

Proven, yes, but not born - Simula-67 was statically typed, and inheritance is arguably as static a concept as data modelling.

"As for me, at this point in my career I would rather puzzle through a small, dense, complex piece of code than a massive system with thousands of files containing mostly comments and whitespace. To some people this trait undoubtedly flags me as a cranky old dinosaur"

Yeah, me too, except I've always felt that way. If I can see something all at once, there's a much higher chance I can work it out all at once. (But then I also have the screenfulosaurus bit set.)

Duncan said...

It's always a trade-off, but I think that the tenet of "self documenting code" is really important when you're working in a sizable team. In the older sections of our codebase it can take a long time to work out how some clever succinct piece of code works before it can be edited for some simple maintenance. You recently advocated reading Fowler's Refactoring, which pushes this point throughout, and yet this piece seems to me to say almost the complete opposite. I really like your essays and think that they lead to much healthy debate, but this latest one seems contradictory.

Benjamin Baril said...

Hey Steve,

Your ATOM feed for today is garbled with CSS style info.

Just a heads up

emmanuel said...

I'm not sure I should feel that enthusiastic about this article. Enthusiasm makes you look like a noob.
I've been mulling over an article like this for months. Everything seemed to fall into place after reading it.
I'm really pleased you wrote it. I couldn't have produced this kind of masterpiece, not having your style.
This is the very true story of how a programmer grows. This is my story; I've gone through every stage you describe, from comments to metadata and bureaucracy. I wouldn't have suspected the parallel with 2-year-old Emily's temporal narrative.
This will relax some tension when I feel code is monstrously verbose, with for-loop indexes being iProdigiouslyLongLoopCount, comments and so on. Thank you.

anuma said...

Java definitely has a bad case of verboseitis and overengineering fever, but I agree with some of the comments that you've overstated your case a bit.

Marc said...

Strong typing can, and should, be more than metadata.

babo said...

Well done! Just a small note about static typing, actually Haskell's type system is quite close to an ideal state, where you are not forced to explicitly state type in your code, but types are still there automagically.

grant rettke said...

Supposedly teenagers' brains work differently from adults':

http://www.pbs.org/wgbh/pages/frontline/shows/teenbrain/work/

I've met plenty of 30 somethings who think that they are invincible :)

grant rettke said...

If I had a purebred dog but lost "the paperwork", I would still have a "valid dog" but no way of proving it was a purebred, at least without a non-trivial effort on my part.

grant rettke said...

RE: Well, we also know that static types are just metadata.

They're like pedigree paperwork: it might make a certain insecure personality type happier about their dog, but the dog certainly doesn't care.
==================
True but the folks who invested the money in that dog probably care!

grant rettke said...

Haskell and OCaml provide type inference; but I guess you would categorize that too as "being forced"?

grant rettke said...

Steve you cover so many different points that alone could probably take up as much space as this single post. Move them out into different posts and give them each the time that they deserve :)

Dan Lewis said...

Maybe this is a n00b point of view, but bare code has no meaning. Why a piece of code exists or why it's written the way it is are left up to the maintenance programmer's imagination. In the words of Bruce Lee, "It is like a finger pointing away to the moon. Don't concentrate on the finger, or you will miss all the heavenly glory"; but the code is all finger and no moon.

Mandating comments will also save you from the worst excesses of geniuses, golfers, and optimizers.

Where I work, there is legacy code without legacy documentation. Now no one just knows how it works, and we're basically stuck at a French cafe, trying to induce the language that we are on contract to extend.

Joshua said...

My term for metadata addicts is IBM'ers. Since I've started looking at Ruby, I've put more focus on making my code readable instead of lots of comments.

Gabriel C. said...

Very interesting.
(The nitpicking comment that Simula was the first OO language and that it was static has already been made... darn!)
I would say I'm the typical lazy teenager who wants to enjoy the free set of tests that comes with a nice compiler, instead of having to write them myself: i.e. I want somebody else to check that nobody has slipped a couple of ducks into the collection of pedigree dogs I need to walk (without my having to go to each one and see whether it walks like a duck and quacks like a duck...). That tends to be particularly nasty, especially in "production".
I have some appreciation for the contract of types in a method declaration (I like to know if the method expects a duck or a pedigree dog... they tend to be different), especially if I'm not the only one writing the code.
So far (with lots of generalization), statically typed languages tend to be faster than dynamic ones, and type inference removes much of the verbosity. On the other hand, the ability to modify allows a lot of flexibility.

Gabriel C. said...

>>On the other hand, the ability to modify the code at run time allows a lot of flexibility, and you can do awesome things very quickly (or crash and burn faster too)

Maxim Khailo said...

A hardcore experienced programmer does not cloud his mind with such dogma (static typing bad, dynamic typing better, etc). The greatest asset a senior developer has is his ability to recognize a problem, recognize a good solution, or if he has no experience in a good solution, think one out. Thinking is your best asset.

Just as there are times when commenting makes sense, there are times when static typing makes sense.

Static typing is like saying there is this box and you can only put this kind of thing in it. The box is labeled.

Dynamic typing is like saying we have these generic transparent boxes that you have to look into to find out what is in there, or you just remember where you put everything.

Yeah, labelling boxes sucks but sometimes it's very useful. Haskell has an auto labeler which is even more useful.

Different problems need different solutions and it takes a professional to see when and where to use different tools.

The dogma you talk about is from someone who has only solved a specific set of problems, or from someone who was not smart enough to use the right tools.

But to know the right tools takes time. The best benefit to programmers is not ruby, or dynamic languages, but just writing lots of code. By good teaching as well (which we lack in USA). By writing code in static and dynamic languages.

Your apparent frustration is the result of you not understanding this basic premise. That going from noob to rock star takes time.

All anger and frustration come from ignorance. I just hope this post does not delude people who write in dynamic languages into thinking they are rock stars if they are not.

HTML and JavaScript are some of the most permissive languages for writing code, but I would hardly argue someone who only writes in those is a rock star.

Brendan said...

I'm kind of a n00b programmer, but it seems to me, when considering looking at other people's code, or when revisiting something I wrote long ago, that it's a lot easier to filter out unwanted comments than it is to cause them to appear. In fact, I suspect 90% of the readers here have written a quick script to do just that.

I grant a lot of comments are unhelpful, and some are even misleading. But there's always the chance that the comments can provide an additional insight.

It has not been my happiness to work with lots of code written by really good programmers (my own included), so I say, err on the side of excessive commenting, if such a choice actually has to be made.

Harold said...

re: the Haskell/OCaml type system idea:

Two problems:

1) If you don't start declaring complex types, the compiler will start inferring data types that you can't begin to understand because of the complexity of the type. The type of a relatively simple function could easily be longer than the code for that function. So you have to play the meta-data game if you're in it for the type system.

2) The type system still gets in your way. E.g. many Lisp functions accept some value or nil (aka null) and just return nil on nil input. This is really helpful when you want to propagate a non-error situation where some data should be ignored. But in Haskell/OCaml you need to declare a Nullable type. Polymorphic data types help here, but then you end up with so many "case" statements to propagate the null value that you write a meta-function to take a function :: String -> String and make it become Nullable String -> Nullable String, but then you end up having to use that everywhere explicitly - ugh.
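
To make the "meta-function" in point 2 concrete, here is a rough sketch in Python with type hints rather than Haskell or OCaml. The names `lift_optional`, `shout`, and `maybe_shout` are made up for illustration, and `Optional[str]` stands in for the comment's "Nullable String".

```python
from typing import Callable, Optional, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def lift_optional(f: Callable[[A], B]) -> Callable[[Optional[A]], Optional[B]]:
    """Wrap f so that None in gives None out (Lisp-style nil propagation)."""
    def lifted(x: Optional[A]) -> Optional[B]:
        # The one "case" on null, written once instead of at every call site.
        return None if x is None else f(x)
    return lifted


def shout(s: str) -> str:
    # Hypothetical String -> String function that knows nothing about None.
    return s.upper() + "!"


# The lifted version: Optional[str] -> Optional[str].
maybe_shout = lift_optional(shout)
```

Here `maybe_shout("hi")` gives `"HI!"` and `maybe_shout(None)` gives `None`. The wrapper still has to be applied by hand to every function that should tolerate a missing value, which is the explicit-everywhere cost the point complains about. In Haskell the same lifting is what `fmap` does over `Maybe`.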

Steel Bank Common Lisp shows the right way to do this: include static type checking that tells you when you make a mistake but assumes that you know what you're doing when the code is ambiguous.

Rick said...

Interesting and provoking comments, Steve.

Of course, no one will completely agree with you, but I think your post at the very least leads to a productive discussion.

In terms of years-programming, I would be a teenager on your scale, but I did happen to start programming a little later in life than most, having a little background in math and linguistics. Thus, my abilities lean more towards the logical/database thing. But, I have been developing working code for 9 years as well as designing databases (I hate the term "modelling databases").

When it comes to application code, I completely sympathize with your frustration at the metadata addicts. In fact, I never had the patience to even become one of those temporarily (perhaps to my detriment).

But I think your analysis doesn't quite apply to relational database design. The situation is a little more complex and confused by other issues. Most people who call themselves "data modellers" have absorbed just barely enough of Database Design for Mere Mortals to be dangerous. DDMM is a decent book for beginners, but hardly begins to describe the flexibility of the relational model. Secondly, there is the whole "keeper of the tower" syndrome that happens with those who become experts at a specific product like Oracle or Sybase.

I found that after some more serious reading, such as the writings of CJ Date and Hugh Darwen, I had a completely different perspective on what is possible with databases. (Besides the obvious weighty tomes, "The Askew Wall" and other short texts by Darwen are *priceless*). In the end, I was able to produce systems with a fraction of the effort I would have spent previously. Good database design actually sped up the coding process. Expressiveness is what it's all about.

I must stress that by "good design" I don't mean the endless committees and power plays that occur in many corporate settings. I mean that
a) the relational model allows us to express some things much more concisely and clearly than can be managed with any sort of programming approach. But, programmers tend to ignore those capacities because they don't like logic to be out of their hands. And I sympathize; I think programmers and database designers should be one and the same.
b) with a little foresight there is no need to follow the classic dual-model approach of handling the same logic in both code and database. ORM is probably the biggest culprit there.

mpeters said...

This is where Perl6's planned optional static typing would be really useful. Don't type if you don't want to, but when you need it (for clarity or performance) then type away.

Aaron Davies said...

k4/q: the Finnegans Wake of programming languages? docs; examples; how a k4 programmer writes C

jsnx said...

From one point of view, code does things, and since we don't
need static types to say what the code does, we might as well do
without them. However, I want to do something besides say what
my code does -- I want to say what the code shall not do.
The first two hours of writing Python are fun, but
then you start writing asserts or writing comments like "this
function accepts a function returning foos" -- much more verbose
than static types.
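
A rough sketch of that contrast, using Python's own optional type hints. `Foo`, `collect_checked`, and `collect_typed` are hypothetical names chosen to mirror the "function returning foos" wording, not anything from a real codebase.

```python
from typing import Callable, List


class Foo:
    """Hypothetical placeholder type standing in for the comment's 'foos'."""


# Dynamic style: the contract lives in a comment plus runtime asserts.
def collect_checked(make_foo, n):
    # make_foo: a function returning foos
    assert callable(make_foo), "make_foo must be callable"
    results = [make_foo() for _ in range(n)]
    assert all(isinstance(r, Foo) for r in results), "make_foo must return Foo"
    return results


# Annotated style: the same contract stated once, in the signature.
# A static checker such as mypy can reject bad callers before the code runs;
# Python itself does not enforce these hints at runtime.
def collect_typed(make_foo: Callable[[], Foo], n: int) -> List[Foo]:
    return [make_foo() for _ in range(n)]
```

Both functions behave the same for well-behaved callers. The difference is where a bad caller gets caught: inside the assert at runtime in the first version, or by the checker ahead of time in the second, assuming one is actually run.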

Dynamic languages are suited for little more than big shell
scripts. Their prominence is only the result of the failure of
statically typed languages to innovate (or to die a natural
death).

Object orientation and duck-typing are astrology -- they make
one stupid and afraid of knowledge. Type annotations are
replaced with naming conventions, mathematical concepts are
supplanted by made-up programmer talk, logic is rejected in
favor of rules of thumb. To suppose that there are valid ways of
thinking outside of mathematics is heterodoxy -- it is the cause
of all our problems and the root of all our sins.

stefan.ciobaca said...

Th