
The Long Piece about Defining "Planet"

Sep. 11th, 2009 | 07:32 pm

"Why's Pluto not a planet any more?" ... it means 20 other things would be too, and 100 later on ... "But what's wrong with 100 planets?"


100 planets isn't really a) teachable, especially to young children, b) memorizable, or, most importantly, c) reflective of the classes of objects we see now.

Those classes of solar system objects, with their rough distances from the Sun in "astronomical units" (AU - multiples of the Earth's average distance from the Sun), are pretty widely accepted among astronomers to be:
  • the rocky planets, Mercury through Mars - 0.4 to 1.5 AU
  • the asteroid belt, including Ceres - 2 to 3.3 AU
  • the gas giants, Jupiter through Neptune - 5 to 30 AU
  • the Trans-Neptunian Objects ("TNOs"), including Pluto - 30 to around 100 AU
  • Sedna - 76 to nearly 1,000 AU
  • the Oort Cloud of comets - a few thousand to maybe 100,000 AU
Note that the class of "Trans-Neptunian Objects" is brand new. We discovered the first TNO, Pluto, in 1930, but didn't discover another until 1992. Sedna is also new. Note also that we strongly suspect there's something else between the TNOs and the Oort Cloud besides Sedna. We just can't see it yet. It's too dim.


Well, OK. Knowing that, what do you think the definition of "planet" should be? You could pick out all the round-ish objects from there, but Alan Taylor showed that's a truckload. (And that was a few years ago now. And anyway, what counts as round, exactly? How do you prove it? Do you have to see it at 20x20 pixels? 50x50?)

You could say "anything bigger than X" but what's X? and why? "Anything bigger than Pluto" leads to the 100+ planet problem in 100 years when we can see the whole Oort cloud. "As big as Mercury", maybe - but why Mercury? Just because it was known a long time? Seems a bit arbitrary. Plus, we don't really KNOW exactly how big the TNOs are right now, not unless they have moons. Long story.

Even the IAU definition - "round balls that have 'cleared their neighborhoods'" - is a bit iffy. Place Earth where Pluto is, and it's not a planet any more, by that definition. Earth could not have kicked out all those TNOs from that neighborhood by now. They're too far apart.

Other definitions - "those planets known to the ancients", "whatever a particular generation thinks of them" - don't really help us think about this modern version of the solar system.


And none of those really captures the interesting part about Pluto: it's just one of many objects a lot like it - cold, icy, never very big, and in chaotic or resonant screwed-up orbits that all intersect and play off each other and Neptune.

Pluto's not even clearly the most interesting TNO; maybe not even the second most interesting one. Eris is bigger, and in an odder orbit. And Haumea's got a really wacky shape & spin. Pluto's interesting because it was first, it's relatively close & visible, it's a binary (Charon is fully half its diameter), and we have a spacecraft going there.


In the end, I think I agree with Alan Taylor, and with astronomers Neil deGrasse Tyson and Mike Brown. What's interesting about the Solar System is its variety, not our definitions. There's just a big pile of stuff out there that we can break up into interesting chunks, is all. Drawing a thick line between Mercury and Eris, or between Pluto and Makemake, because of our own history, seems awfully arbitrary.

Worse, it cuts off fascinating areas of the Solar System from our children's - and therefore most of the population's - education. Most people don't know how fascinating the giants' moons, Ceres, the big TNOs, the Oort Cloud, and the little whizzing asteroid and TNO bits everywhere are. I'd even argue that in some ways, things like Titan, Haumea, Sedna, and Triton are more fascinating than some of the planets half the world has memorized by name.


Open Source Software's "Sweet Spot"

Sep. 9th, 2009 | 12:51 pm

I think that, to be successful, an open source project must be small in code size, highly modular, or a copy of something else.

The reason why is project management and requirements.

To pull off something the size and scope of Photoshop for the first time, you need a massive staff with a clear, consistent vision of what they're going after, and a lot of dense back-and-forth discussion.

Open source projects just can't achieve that, because
  1. they're spread out,
  2. developers tend to drift away when times get rough, and
  3. they usually don't have the top-down hierarchy necessary to resolve disputes in favor of a consistent vision.
I'd argue the only truly innovative and successful open source projects have been Apache HTTPD, Firefox, and Linux, and I think the only reason those worked is the modular architecture designed into them, and their "modular" requirements.  Each module is a little fiefdom.

Other open source projects are highly successful, but imitative. The best way to resolve issues in distributed software development is to say, "Well, this is the way iTunes, or Word, or Outlook does it, so we're gonna do it that way too." That becomes the requirements.


I also think commercial software has unique advantages:
  • It can be truly dictatorial. Typically, product vision is set by a cadre of about two executives, two marketers, and two engineers. In my experience, all that counts is what those six people think.
  • They can force developers to toe the line with their contracts. This tends to cut down on the fights and forking.
  • Most importantly, everyone is in one building! Requirements analysis on difficult problems HAS to have everyone in the same room. In my experience, a pure requirements phase takes 6 weeks (+/- 3) if you have everyone in the same room most of that time. When you distribute people, you either end up, after a much longer period, still not agreeing on what you're building, or you abdicate requirements control to a single person.
I've seen more projects fall apart because everyone thought they could do it themselves, than the converse.

(Also see this.)


The FP @reply implementation

Sep. 8th, 2009 | 04:50 pm

My side project, which I'll call FP here, has an @username notification system à la Twitter's. The users use it all the time. There are N users total.

It's slightly more flexible than Twitter's. I call it the @reply system. This is a software design overview of that system. It's basically a "how-to". I'll compare it to Twitter's implementation.


You can think about detecting @replies on FP like this: every time there's a comment or post, is any one of N strings, @[username], inside the text?

This is horribly slow if you do it the "naive" way - that is, if you search the string N separate times. But if you build a little tree-like structure that lets you scan the string with a state machine, it's nice and fast: it searches for all N at once.

Luckily I didn't have to write that; someone already did. It's called the Aho-Corasick algorithm. And luckily, I found someone who had ported it from C to Ruby. So I just run Aho-Corasick on every incoming piece of text, it returns a list of the @replies it found, and I stick the text coordinates of the hits into a table for quick lookup.
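For the curious, here's the shape of such a matcher, written from scratch. This is a sketch for this post, not the ported gem I actually use (I won't reproduce its API from memory); the class and method names are my own:

    class AhoCorasick
      # One trie node: children keyed by character, a "failure" link to
      # the longest proper suffix that is also in the trie, and the
      # patterns that end at this node.
      class Node
        attr_accessor :fail
        attr_reader :children, :outputs

        def initialize
          @children = {} # char => Node
          @outputs  = [] # patterns ending here
        end
      end

      def initialize(patterns)
        @root = Node.new
        patterns.each { |pat| insert(pat) }
        build_failure_links
      end

      # One pass over the text; returns [start_index, pattern] pairs.
      def scan(text)
        hits = []
        node = @root
        text.each_char.with_index do |ch, i|
          node = node.fail while node != @root && !node.children[ch]
          node = node.children[ch] || @root
          node.outputs.each { |pat| hits << [i - pat.length + 1, pat] }
        end
        hits
      end

      private

      def insert(pattern)
        node = @root
        pattern.each_char { |ch| node = (node.children[ch] ||= Node.new) }
        node.outputs << pattern
      end

      # Breadth-first walk wiring up the failure links.
      def build_failure_links
        queue = []
        @root.children.each_value { |child| child.fail = @root; queue << child }
        until queue.empty?
          node = queue.shift
          node.children.each do |ch, child|
            f = node.fail
            f = f.fail while f != @root && !f.children[ch]
            child.fail = f.children[ch] || @root
            child.outputs.concat(child.fail.outputs) # inherit shorter matches
            queue << child
          end
        end
      end
    end

    matcher = AhoCorasick.new(["@alice", "@bob"])
    matcher.scan("hey @alice, ping @bob") # => [[4, "@alice"], [17, "@bob"]]

The failure links are what make it one pass: on a mismatch you fall back to the longest suffix still in the tree instead of restarting the scan.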


There have been four issues:
  1. I'm not 100% sure how well this scales to, say, 1,000,000 users. I think it'll scale all right in computation time. But the memory and tree-regeneration might be an issue. I was thinking I'd create a simple multithread-aware messaging server that could respond to "find all @reply" requests and "here's a new user" notifications if it ever got there. But that's probably never going to be a real problem for FP.
  2. You have to regenerate the state machine when a new user comes around.  This isn't too bad, since it takes people a while to realize there's someone new to @reply. So you can do that hourly or nightly.
  3. There's a slight substring problem baked into the syntax. Since it's a robust substring search, user@gmail.com will send an @reply to a user named "gm". Not much I can do about that; I suppose you could filter new usernames against all common @[blank].com and .org addresses. There's also the issue of one username being a proper prefix of another - e.g., a user "user" and me, "usernameguy". A simple solution is to punt: whichever is the longest match at a position wins (see the sketch after this list).
  4. One nasty little detail is still outstanding in the FP implementation: it's not case-insensitive. It really should be - users don't remember the proper case of usernames. Case insensitivity really needs support inside the Aho-Corasick implementation, but the Ruby port doesn't have it. I hacked around it by inserting the lowercase version of everyone's name into the tree.
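For issues 3 and 4, the glue might look like this - again hypothetical, building on the sketch above (its matcher and scan), not the real gem:

    # Lowercase the text on the way in (the tree was built from
    # lowercased names), then let the longest hit at each position win.
    def find_replies(matcher, text)
      matcher.scan(text.downcase)
             .group_by { |start, _| start }
             .map { |_, overlapping| overlapping.max_by { |_, pat| pat.length } }
    end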

Finally, I can see why Twitter restricted @replies to the beginning of the string: they don't want to run the algorithm on every incoming string. That would be a lot of needless computation and false positives at their scale. Aho-Corasick would still be the fastest strategy for them, though. I think. I'd simply run Aho-Corasick only on strings that start with @, and accept only hits at the beginning.


"Humans could've done so much more"

Aug. 27th, 2009 | 07:29 pm

"Did any of you ever imagine that in the 21st century there'd be less than 10 people in space? ... Humans could've done so much more"

I've been reading a lot about astronomy lately. This wistful, romantic sentiment keeps coming up, particularly from members of my generation. It's been sticking in my craw.

"Humans could not have done so much more"? Not really. It's nice to think so. But - not really. The Shuttle has been a big dead-end, and a massive waste of cash, but really, not much more could've been done on its budget. NASA has pulled off impressive robotic missions again and again.

A trip to the Moon costs tens of billions of 2009 dollars. This is an amount even the US government has trouble spending. It takes an enormous amount of willpower and political clout to commit that much money to something. Because that much money represents a substantial chunk of your society's output.

Spaceflight is very, very expensive in general. Horribly expensive. It's very hard to get off the Earth into low Earth orbit. It's not because the government screwed it up. It's not because of gold-bricking. It's because it's very hard to do.

It annoys me when people look at something that fills me with astonishment and say, "is that all?" Give the Apollo program some respect. Understand the monumentally hard thing they did. Give NASA some respect. Appreciate how they've given us Cassini and the MERs on a shoestring budget, while maintaining a farce of a human spaceflight program that only gives us a warm fuzzy feeling and drains their coffers.


To put it all another way: you were lied to. Those images of moon bases and trips to Jupiter and jet packs and whatever that led you to feel this way - they were fantasy, based on overly optimistic predictions. Or sometimes on nothing at all. They were never truly achievable, even at the hysterical scaling-up of the Cold War.

Let it go.


The Long Piece about Spaceflight Basics

Aug. 25th, 2009 | 12:38 pm

So I've been looking at the costs of spaceflight lately. Here's my basic mental model of how it works.


Spaceflight Mass and Fuel

In the end, spaceflight is all about weight. And the weight is mostly fuel. The best example is the Space Shuttle. Here is a summary of its weight at liftoff:
  • 509,480 lbs (231.1 metric tons, or "mT")* - weight empty aka "dry"                       
  • 4,442,000 lbs (2014.9 mT) - with fuel
  • 4,510,000 lbs (2045.7 mT) - with fuel, payload, and crew
  • 240,000 lbs (108.9 mT) - a loaded Shuttle, alone
  • 55,250 lbs (25.1 mT) - maximum payload
In other words, that's 87.2% fuel, 11.3% vehicle, 1.2% payload, and 0.3% crew, crew life support, and other amenities. And only the 5.3% that is the loaded Shuttle itself makes it to orbit.
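(A quick Ruby re-derivation of those percentages from the rough figures above, for anyone checking my math:)

    total = 4_510_000.0
    fuel  = 4_442_000 - 509_480            # => 3,932,520 lbs of fuel
    crew  = 4_510_000 - 4_442_000 - 55_250 # => 12,750 lbs of crew & amenities
    [fuel, 509_480, 55_250, crew].map { |lbs| (100 * lbs / total).round(1) }
    # => [87.2, 11.3, 1.2, 0.3]  (fuel, vehicle, payload, crew)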

This is not a case of including more fuel than needed as a failsafe. Launch vehicles are typically designed to eke every last pound of performance out of the fuel they carry.**

You can typically use fuel as a proxy for launch cost when comparing designs, and be "right" to within 50% or so. Fuel is the primary design driver in a launch vehicle: it dictates the launch weight, the structural demands on the vehicle, and the maximum payload. This dominance of fuel holds for all current launch vehicles.

These are not costs to be taken lightly. The Apollo program as a whole cost $25.4B in 1969 dollars (around $135B in 2005 terms). That's roughly the size of our current Air Force budget.


* (Note all these numbers are rough. I'm piecing them together from bad sources without a lot of detailed knowledge of how to recompute things.)
** (But note that this particular vehicle's 3.9M lbs of fuel is enormously expensive to launch. The fuel itself, in bulk, is "cheap" - on the order of $1M, if I've researched right - but the vehicle design, the reuse requirement, and reassembly, combined with the choice of fuel, make the costs skyrocket:
   "It has been proven that the launch of heavy payloads to LEO is much more economical using expendable rockets than by the Shuttle system. For example, the almost 50-year-old Russian PROTON rocket with a total launch mass 3 times lower can place in LEO almost the same cargo as the Shuttle, at a cost almost 10 times lower. The cost of other expendable boosters like ARIANE, ATLAS, DELTA is only a little higher." )



I've been unfair in picking the Shuttle as my first example, because it is famously ... um ... shall we say ... weirdly designed. But every launch vehicle shows this dichotomy. Take, for example, the workhorse of the American satellite industry, the Delta IV. Specifically, the Delta IV Medium+ (4,2), which goes up at around $140M a shot:
  • 73,755 lbs (33.5 mT) - weight empty
  • 626,891 lbs (284.4 mT) - with fuel
  • 652,591 lbs (296.0 mT) - with fuel and payload (to LEO)
  • 25,700 lbs (11.7 mT) - maximum payload
So that's 84.8% fuel, 11.3% vehicle, and 3.9% payload. Note how this is an almost identical breakdown to the Space Shuttle's, if you consider the whole Shuttle as the payload.


Delta-V

There are essentially three phases to any flight: the launch itself, the transfer, and the mission orbit. Manned missions and sample-return missions also add a landing phase, maybe another liftoff to go with that landing, and a return phase.

Each phase requires what's known as a "delta-V" - a change (delta, or Δ) in velocity (v) and direction. These delta-Vs can vary dramatically. Sometimes one involves a 200-ft-high rocket. Sometimes it's a light push. Sometimes the "burn" that accomplishes the delta-V takes months on end. There's a lot of complicated math involved in figuring out the exact trajectory, but planners will talk abstractly about just the delta-V between the beginning and end of each phase.

In 95% of missions, these delta-Vs are accomplished by burning fuel in chemical rockets.


How Spaceflights are Designed, Roughly


The first thing I think of about designing a spaceflight: things toward the end of the mission are much more costly than things toward the beginning. The reason is that you have to cart them - and all the fuel necessary for them - around the whole flight. If you need something during the final landing, you have to get it off the Earth to begin with, you have to bring it to the target orbit/object, and finally you have to bring it back from the target orbit/object. To take a heavy heat shield to a landing, working backwards, you'll need:
  •   "fuel #3" - push that heat shield into the atmosphere at the right angle from the target orbit/object when you're landing
  •   "fuel #2" - push that heat shield PLUS FUEL #3 to the target orbit/object off the orbital burn
  •   "fuel #1" - push that heat shield PLUS FUEL #3 PLUS FUEL #2 up off the Earth to begin with.
So you pay for the heat shield three times. But you also pay for that first bit of fuel, "fuel #3", twice extra, and for "fuel #2" once extra. (For the technically minded: the costs go superlinear - exponential in delta-V, in fact - because of the fuel.)

And most rockets have 3 stages. Sometimes 4. Each stage means yet another push to make. So you could need a pyramid of fuel 7 steps deep or more just to get that heat shield to protect you on reentry. Fuel to push fuel to push fuel to push fuel...*

* (If you think about it carefully, it doesn't actually matter how many stages there are. You could split the main tank into two and get more stages arbitrarily. Or into three ... or four ... or five ... If you merge all the stages together, split the whole into infinitely small stages, and use calculus to solve it, you get, in essence, the ideal rocket equation. You don't need to understand the details; the important part is the relationship between delta-V and rocket mass. Note how it compounds as delta-V goes up.)
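Here's that relationship as a quick Ruby sketch. The numbers are illustrative guesses of mine - roughly 9,400 m/s to LEO, and an Isp (specific impulse, a measure of engine efficiency) of 300 seconds:

    # Ideal rocket equation: delta_v = isp * g0 * ln(m_full / m_dry),
    # solved here for the propellant mass.
    def propellant_mass(dry_mass_kg, delta_v_ms, isp_s)
      g0 = 9.81 # m/s^2
      mass_ratio = Math.exp(delta_v_ms / (isp_s * g0))
      dry_mass_kg * (mass_ratio - 1)
    end

    propellant_mass(1_000, 9_400, 300) # => ~23,400 kg of fuel per 1,000 kg dry

Over twenty tons of fuel per dry ton, for the first leg alone.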


The second thing I think of: the speeds are obscene. The delta-V to get into LEO alone is 9.3-10km/sec. Think about that speed - it almost crosses the length of San Francisco in one second.

Note also that once you get into orbit, if you're headed for some other object, it's moving at a different speed than you are - and everything is in motion. The Moon orbits the Earth at an average of around 1 km/sec, but objects in LEO are going around 8 km/sec; you have to bleed off that speed. Delta-V from LEO to lunar orbit is listed at 4.1 km/sec; down to the lunar surface is another 1.6 km/sec. The Earth revolves around the Sun at around 29.8 km/sec; Mars at 24 km/sec. From LEO to low Martian orbit, the prescribed delta-V is 6.1 km/sec; down to the Martian surface is another 4.1 km/sec. Every one of these requires yet more fuel.


The third thing: it takes a lot of fuel to make these speed changes, and you're moving HUGE things. It takes the Delta IV 553,136 lbs of fuel to get its 25,700 lbs of payload to LEO (at 9.5 km/sec or so). Each of the later burns requires a similar avalanche of rocket fuel relative to what it pushes.
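And the burns compound: each leg's mass ratio multiplies the others. A sketch with a single assumed Isp of 320 s for every burn (a big simplification - real vehicles stage and swap engines):

    legs = [9_400, 4_100, 1_600] # m/s: to LEO, LEO to lunar orbit, to the surface
    legs.map { |dv| Math.exp(dv / (320 * 9.81)) }.reduce(:*)
    # => ~120x: very roughly, launch mass per unit of mass landed on the Moon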


The fourth thing I think of: it's really hard to overcome Earth's gravity. Much harder than you might expect. Just look at that huge delta-V. That gravity well is deep. This is easy to remember just by looking at the rockets - we're sending stuff the size of a small bus into low Earth orbit, and it takes rockets the size of skyscrapers to do it.

In general, gravity is a lot harder to deal with than we commonly imagine. This is almost as true on the Moon and on Mars as it is on Earth.

And also - don't forget the atmosphere(s).


Conclusions


So a few rules of thumb from these things:

1) Vehicle size gets progressively smaller during the mission. They're like Matryoshka dolls, except each inner doll is 1/10 the size of the outer one. The rest is fuel.

2) Missions that don't come back are much cheaper. Say you're sending a mission to Mars that needs 1,000 kg to do its business there. The mass of the rocket you need to launch a version that doesn't come back is X. If the mission must also return 1,000 kg, that lander suddenly gets much larger, because it needs fuel, an engine, and avionics to get back onto an Earthward transfer orbit. So the part that goes to Mars in the first place is larger, which means a larger launch vehicle. I'm not sure of the exact math, but it's much more than 2 times X - I want to say more like 10 times X (a rough sketch follows these rules of thumb). That's a ten-times-bigger rocket. Much bigger cost.

3) Missions that make "stops" are usually expensive. Most "nearby" places you'd want to stop - the Moon, Mars, Ceres - are hard to stop at because of their speeds and their gravity. Even destinations without as much gravity to worry about (near-Earth asteroids, the Martian moons) have a lot of speed to make up. The one exception is the Earth-Moon "Lagrangian points", points in empty space where the various gravitational forces cancel out. They make a lot of the delta-V costs very cheap. (Lagrangian points are very interesting.)

4) Robotic missions are much cheaper. They mostly don't come back. When they do, they leave most of themselves in space, and what comes back is tiny. And they don't have to cart the weight of the crew's life support all the way out and back, either.

5) Anything that usefully avoids mass during the space cruise is a good thing. A few examples that are sexy, relatively "new", or on the drawing board: ion thrusters, including the exciting new VASIMR; solar sails (slow, but great because they need no fuel); and inflatable heat shields.
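And here's the rough sketch promised back in rule 2. The Mars-ascent and come-home delta-Vs are my ballpark assumptions, not mission figures, and the Isp of 320 s is the same simplification as before:

    ascent = Math.exp(5_700 / (320 * 9.81)) # => ~6.1x for the Mars ascent
    home   = Math.exp(2_900 / (320 * 9.81)) # => ~2.5x for the burn home
    ascent * home                           # => ~15x per kg that comes back

Every returning kilogram costs on the order of fifteen landed kilograms, before you even count the heavier lander structure - which is how "come back too" ends up closer to 10 times X than 2 times X.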


Summary

This is why human spaceflight is very difficult. Plan out your mission to send a man to Jupiter, and it turns into bankrupting your country. The costs rise faster than our imaginations do.

This is the reason NASA has been stuck with men in low Earth orbit, sending out all these small one-way robotic missions, all these years.


The Coming VOIP War

Aug. 7th, 2009 | 02:33 pm

I think Google is about to launch a massive campaign to change mobile phones. They'll change how mobile phone calls are routed, and in the process undercut every existing mobile phone company and provide superior service at lower cost.

(Forgive me if this is totally obvious to you - it just struck me a few days ago.)

The goal is an iPhone/Android-level smartphone whose service is a flat $30/mo, with novel call-handling features: blocking telemarketers, filtering calls, rerouting calls, built-in conferencing, and recording calls. They'll accomplish this by making an Android phone that places calls only over VOIP, through their Google Voice service. These phones would have only a monthly unlimited-data fee to pay, with all the Google Voice advantages baked right into the phone.

It's the logical endpoint of the original GrandCentral purchase, the ongoing Android development, and perhaps even their dark-fiber purchases and their bidding on wireless spectrum.


The value to the consumer is obvious. But it will undermine the existing phone companies' billion-dollar businesses. How will Google get around this?

First, they'll get around it by using the Internet directly - 802.11 networking. In time, this will also include WiMAX or other wide-area broadband suitable for handheld mobiles. Perhaps something will take advantage of the "White Space Broadband".

But in the near-term, they'll need access to the mobile towers for a data stream. They'll get this one of three ways:
  1. by playing the companies off each other, and reminding them and the US and European governments what a cartel is;
  2. by legally bypassing any terms of service that restrict VOIP use (somehow); or
  3. by setting up their own towers.

This is a potentially massive market. Everyone will want a piece. In my mind, it helps explain some of the antagonism between Google and Apple lately. Kicking Eric Schmidt off the Apple board and clamping down on Google Voice apps makes sense if you consider Apple a potential player here too.

Apple's VOIP play would be distinct from Google's. Google's play will be commodity hardware running open software, tightly integrated with their services and agnostic about the wireless data carrier; Google makes money on the software and the services. Apple's play will be selling proprietary hardware in a packaged software-and-services platform, tied to a particular wireless data provider; Apple makes money on the hardware and the wireless data.

Apple wants a VOIP world tied to one data provider. Google wants the data provider to be a commodity, so they can make more money selling software, end-user services, and advertising. A lot comes down to who can get a pure wireless-data service contract set up first.

(Note that Palm may also try to play, but I'm not sure how - maybe they agree with Google's model of the world but will try to compete with Google product-for-product.)


Voronin's Universality Theorem

Jun. 6th, 2009 | 12:39 pm

The Riemann Zeta function is a relatively famous mathematical function that has a number of remarkable properties.

One of the most remarkable was discovered by the Russian mathematician Sergei Voronin in 1975: its "universality".

In short, the Riemann Zeta function has within it all other functions. That is, along a particular strip of the function, you can find, in miniature, an arbitrarily good copy of any other function you can name (so long as it's continuous, analytic, and never zero there). This includes simple parabolas, more complex and chaotic shapes, a relief of the Earth's surface, and a relief of Mickey Mouse.
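Stated a bit more precisely (my paraphrase of the usual formulation, in LaTeX):

    % Voronin universality, roughly stated:
    \text{For any compact } K \subset \{\, s : \tfrac{1}{2} < \mathrm{Re}(s) < 1 \,\}
    \text{ with connected complement, any } f \text{ continuous and}
    \text{ non-vanishing on } K \text{ and analytic in its interior, and any }
    \varepsilon > 0, \text{ there is a } t \ge 0 \text{ with}
    \max_{s \in K} \bigl| \zeta(s + it) - f(s) \bigr| < \varepsilon.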

This may not sound like a lot. But consider that you can encode words as a shape, and translate and dilate that shape into a 2-D area that fits on the special strip of the Riemann Zeta function. So you can map areas to words and words to areas.

Thus, anything you can name appears in the Riemann Zeta. Anything at all:
  • the Declaration of Independence
  • that picture your parents took of you when you were 4 (encoded as a JPEG then uuencoded)
  • this post
  • the current state of the Universe
To me, this says more about the nature of infinity than anything else. The Riemann Zeta function is infinitely complex. It has infinite entropy. We tend to think about infinite functions as if they're merely very big. But on the special strip of the Riemann Zeta function, you can scan and find anything at all, if you're just patient enough*.

Something truly infinite can do things we have trouble intuitively understanding. It's the combination of the Riemann Zeta's intricacy and its true infinity that lets it do this.

For a more technical discussion of this, see this and this.



* (That said, note there are things that are literally impossible to find on it. If you took the largest, most efficient computer you could construct, and ran it for the full lifetime of the Universe, there are still things outside that scan area. An infinite number of things, in fact. Infinity is bigger than that.)


Markdown's rigidness

Apr. 24th, 2009 | 02:06 pm

I'm messing around with RDispatch, an alternative Ruby Markdown library, and I'm reminded how skeptical I am of the Markdown spec. Specifically, of its rigidity.

The best example, I think, is that Markdown doesn't translate newlines into br tags. So this:

    some text and a (no?)
    newline

becomes this:

    some text and a (no?) newline

And newlines aren't even necessarily preserved in the output, despite the fact that newlines are part of the incoming syntax.

One of the main Markdown design goals is to make both the source and the output look right. It's classic Mac-style engineering: it makes everything look right, but only if you write your text in the same style Gruber does. If you really grok that when you read the spec, it's great, it works great, and everything is fine. Otherwise, it's brittle. And I doubt many people actually grok that.

I'd claim Markdown is very dogmatic. Almost snobby. It's definitely surprising. There's this assumption that everyone must agree with this stuff - not only an assumption, an enforcement. It's the same basic thing I dislike about Rails' "convention over configuration".

Compare and contrast with, say, an HTML3-era browser, which bends over backwards to forgive the writer's lack of HTML knowledge - misspelled tags? missing close tags? binary instead of unary? Fine. No problem. Sure, it creates this horrible pile of messy HTML, but it also doesn't challenge the writer to learn it.


Markdown breaks a basic design principle of every good piece of systems engineering I've ever seen - forgiveness.


Lazyweb Idea: Our Chemical Evolution poster

Mar. 20th, 2009 | 01:04 pm

I had an idea for a science poster someone could do, similar to Kokogiak's great All Known Bodies in the Solar System Larger than 200 Miles in Diameter poster:
http://www.kokogiak.com/gedankengang/2007/03/all-known-bodies-in-solar-system.html

Basically, it'd be a poster of the history of our chemicals/particles from the beginning of the universe:
  • start on the left with the Big Bang,
  • the early quark-gluon soup,
  • the "freeze" stages that give us each of the basic particles,
  • the "let there be light" moment,
  • Hydrogen everywhere
  • the first quick mega-stars
  • the first supernovas and the first heavier elements (black holes in here somewhere too)
  • all the various populations of stars, the variety of novas and supernovas, and star remnants
  • focus on the Sun, the early planet formation, the formation of "prebiotic" chemicals
  • the formation of the Moon and injection of Fe into the Earth
  • the Earth's early chemistry (like Titan's, apparently)
  • the Earth's radiological history (and possible tie to our molten core)
  • the slow long production of more and more prebiotic chemicals until we get amino acids, DNA, etc
  • how the Earth's biosphere makeup changed with the first cellular life, and all the various stages thereafter
  • plankton, plants, trees; O2 and CO2
  • animals; methane as a biological marker; other biological markers
  • humanity's appearance, and global warming, on the far right
And there's no doubt there's stuff I just don't know here. You could do a bunch of branching at various points, too.

I see little icons indicating the Universe's chemical/particle breakdown and the Earth's along the way.

It occurred to me how much ongoing work there is on this very sequence - it's one of the prime questions in science right now. It'd be really interesting to pull together, just to know what we don't know. Kind of a monument to a lot of the late-20th/early-21st-century science going on.


Why Software Estimates Are Hard

Mar. 18th, 2009 | 04:45 pm

(After someone pushed back on a generous time estimate for debugging.)

Sometimes, setting up the right test cases takes an hour.

Sometimes, some 3rd-party code you have no control over assumes its input is ordered A-to-Z, and you have to spend half a day tricking it into thinking it still is, when it's not. (This kind of thing happens far more often than one might think.)

Sometimes, you've screwed yourself, by accident, and your own code assumes things are ordered A-to-Z. You suddenly have to find every spot in 2,000 lines of code where it assumes that.

Sometimes, you have a weird subsystem interaction you can't predict (like, MySQL up and decides that it counts starting from 0, but your PHP platform counts from 1, and when you reverse stuff, the last item gets chopped off).

Sometimes, it turns out you didn't actually know how to do a subproject.  Suddenly, something that looked like 1 week is now 6.

Sometimes, parts you use fail.  You spend weeks getting around those.

Sometimes, you just screw up and overdesign it or underdesign it or write something that ends up getting thrown away.

Sometimes, you simply misunderstand what a piece of software does. It turns out it's useless*. And it took weeks of your time.


Really, you get yourself into a lot of trouble doing off-the-cuff estimates of the difficulty of work units. Even senior engineers do. After a while, you learn to build 2x into your estimates, 'cause stuff certainly comes up.**

Software estimates are very, very hazard-prone.

The trick with software is: if you had a truly good idea of how long something would take, you'd be almost done. You would have had to make all the decisions necessary to build it just to get that estimate. Software unknowns are really unknown, and they stay unknown until you run over them.

This is kind of a tough concept, so I'll repeat it: if you knew exactly how long a software project was going to take, you'd be nearly done.  Writing the code out isn't the hard part.***

It's the nature of the unknowns that makes software very expensive.



* A great line from House MD that is as true in software as it is in medicine:  "I’m sure this goes against everything you’ve been taught, but right and wrong do exist.  Just because you don’t know what the right answer is – maybe there’s even no way you could know what the right answer is – doesn’t make your answer right or even okay.  It’s much simpler than that.  It’s just plain wrong."

** True story: a friend of mine was once forced by a CEO to give an estimate of a project that only had a goal statement, no requirements or spec. He said 12 months, and shrugged. The CEO said, what's the plus/minus on that? He said 12 months, and shrugged. The CEO was not pleased.

*** A corollary: anyone who uses lines of code as a metric doesn't understand this. Software is not the lines of code; it's the decisions behind them. Often, 30%+ of the time is taken up by less than 1% of the line count. Sometimes you need to hire a PhD to come in and write 500 lines of your 250,000-line app.
