On Wax and Feather Assumptions

The problem is not that Icarus flew too close to the sun, but that he used wax and feathers for his wings.

LX

2024.08.18

Wax and Feather Assumptions

I listened to a brief Stanley Kubrick speech this week. A quote stood out: 

I’ve never been certain whether the moral of the Icarus story should only be, as is generally accepted, don’t try to fly too high, or whether it might also be thought of as forget the wax and the feathers and do a better job on the wings.

Stanley Kubrick

Jack & I started building BirdDog on July 23rd. Since then, I can’t tell you how many wax and feather assumptions I’ve made. The number that I’m aware of is nauseatingly high. The number that I’m unaware of must be a multiple of that.

Regardless, we now have a product that takes only three minutes of human time to add a new user… I suppose that’s the beauty of software. This is a huge milestone in the sense that we can start to open the floodgates of feedback.

It is going to be painful; the platform is so far from what we want it to be. However, the best way to see if your wings hold up is to put them under the heat of the sun. For a startup, there is no better sun than users. Not only will they melt bad assumptions, but they’ll also help us focus in on the handful of features that actually matter out in the “real world.” 

All that being said, we didn’t even need any users to interact with the platform for quite a high number of my bad engineering assumptions to be revealed as just that—bad engineering assumptions. 

Bad Assumptions

Assumptions make our model of the world. More scientifically, you might call your assumptions hypotheses. Whatever they are, we use them to guide action.

Assumptions without action are armchair philosophy. Action without assumptions is random (this statement, of course, assumes that everything you “know” is an “assumption”).

As I wrote about a couple of weeks ago, I started building the BirdDog backend from the top down, rather than bottom up. The first iteration of what I thought the product architecture would look like was this:

I wanted something that would last “forever.” I got lots of top-level complexity unhardened by knowledge of low-level implementation details.

If I were to draw a diagram of what our current actual backend looks like, it would be nothing like this. The diagram was full of bad assumptions, the biggest being that we needed three databases and that the architecture should be tied to the technicalities of MongoDB. Still, planning it out was incredibly useful.

Plans are worthless, but plans are everything.

Eisenhower

By last Monday, the product “worked” & followed a simplified version of the above schematic. When I say worked, there is a huge asterisk. Because of limitations of both the document database I naively chose and the way I had structured our data hierarchy, when we went to scrape a website, we could only fit a fraction of the content we extracted in the database. This is without mentioning a slew of other logical complexities that went into reconciling the information in the DBs.

When I designed the fun diagram above, there were many very simple variables I did not focus on, such as, you know, the storage capacity of different parts of a database. Which, you know, is kind of like the point of a database. Rather, I was focusing on a bad assumption around read-write permissions that I had wrongfully generalized after heavy exposure to developers using git repos as DBs. 

As a result, my original plan for BirdDog inherited my bad assumptions and naivety.

However, once it became painfully apparent that this was far too complex, I went back on the assumption as quickly as possible and started learning SQL. Three days later, we had a 3NF SQL DB. 3NF is something I only know about because somebody smarter than me brought it up (thank you, anon).
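For readers who, like me a week ago, have not met 3NF: the rough idea is that every fact lives in exactly one table and every non-key column depends only on that table's key. Here is a minimal sketch using Python's built-in `sqlite3`. The table and column names (`prospects`, `pages`, `quotes`) and the demo rows are made up for illustration and are not BirdDog's real schema.

```python
import sqlite3

# Hypothetical tables: NOT the real BirdDog schema, just a 3NF-shaped example.
# Each non-key column depends only on its own table's primary key.
SCHEMA = """
CREATE TABLE prospects (
    prospect_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);
CREATE TABLE pages (
    page_id     INTEGER PRIMARY KEY,
    prospect_id INTEGER NOT NULL REFERENCES prospects(prospect_id),
    url         TEXT NOT NULL UNIQUE
);
CREATE TABLE quotes (
    quote_id INTEGER PRIMARY KEY,
    page_id  INTEGER NOT NULL REFERENCES pages(page_id),
    body     TEXT NOT NULL
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema and made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO prospects (prospect_id, name) VALUES (1, 'Acme')")
    conn.execute(
        "INSERT INTO pages (page_id, prospect_id, url) VALUES (1, 1, ?)",
        ("https://example.com/about",),
    )
    conn.executemany(
        "INSERT INTO quotes (page_id, body) VALUES (?, ?)",
        [(1, "We ship weekly."), (1, "Our team doubled.")],
    )
    conn.commit()
    return conn


def quotes_for(conn: sqlite3.Connection, prospect_name: str) -> list[str]:
    """Join back up the hierarchy to list every quote for one prospect."""
    rows = conn.execute(
        """
        SELECT q.body
        FROM quotes AS q
        JOIN pages AS p ON p.page_id = q.page_id
        JOIN prospects AS pr ON pr.prospect_id = p.prospect_id
        WHERE pr.name = ?
        ORDER BY q.quote_id
        """,
        (prospect_name,),
    ).fetchall()
    return [row[0] for row in rows]
```

Because a quote only stores its `page_id`, renaming a prospect or fixing a URL touches one row in one table, which is most of what the normalization buys.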

Implementation melted the wax of bad assumptions. 

Other Bad Assumptions

A few other bad assumptions I made, as well as how they evolved:

Quote Extraction

Assumed Implementation: After a model identified that a paragraph from our database contained a valuable quote, we were going to have another model extract those quotes from the passage.

Actual Implementation: We turned the database into a collection of quotes; now, selecting a quote implies extraction.

Separate Pipelines

Assumed Implementation: We started with different pipelines for 1) extracting & embedding info from websites and 2) sorting through and extracting quotes from the embeddings.

Actual Implementation: Now, we have one 165-line generalized pipeline, the core of our data processing, complete with multiprocessing and increasingly reliable error handling. The more granular functions are passed in as arguments.

Rust Web Crawler

Assumed Implementation: To preempt performance limitations of Python, I spent the first few days building a 500-line Rust web crawler.

Actual Implementation: We switched to Python; by leveraging libraries built by people smarter than us & 92 lines of our own code, not only can we crawl most websites, but we can also robustly extract the relevant text from the HTML.
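To give a feel for how little code text extraction can take in Python, here is a small sketch that uses only the standard library's `html.parser`. It is an assumption-laden stand-in: the post does not name the libraries BirdDog relies on, and `TextExtractor` and `extract_text` are hypothetical names. It also skips the crawling half entirely.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of script and style tags."""

    SKIP_TAGS = {"script", "style", "noscript"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            text = " ".join(data.split())  # collapse runs of whitespace
            if text:
                self.chunks.append(text)


def extract_text(html: str) -> str:
    """Return the visible text of an HTML document, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return "\n".join(parser.chunks)
```

A real page needs more than this (boilerplate removal, encodings, broken markup), which is exactly where leaning on other people's libraries pays off.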

Cost of Iteration

When you’re early, the costs of iteration are time and ego.  

BirdDog has spent $.01 on OAI in August. I expect that if we had 100 users with 200 prospects each, the cost might be $10 a week, likely much lower.

Pop off, only on occasion, Brother. We are using my personal OAI account for BirdDog. The $.13 came from my website’s usage. The other one cent was from BirdDog. 
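The back-of-envelope estimate above (100 users, 200 prospects each, about $10 a week) can be turned into a per-prospect figure. The three inputs are the rough guesses from the text, not measured numbers, and `cost_per_prospect` is a hypothetical helper.

```python
# Rough guesses from the estimate above, not measured usage.
USERS = 100
PROSPECTS_PER_USER = 200
WEEKLY_BUDGET_USD = 10.00


def cost_per_prospect(users: int, prospects_per_user: int, weekly_budget_usd: float) -> float:
    """Weekly spend divided evenly across every prospect being tracked."""
    return weekly_budget_usd / (users * prospects_per_user)


total_prospects = USERS * PROSPECTS_PER_USER
per_prospect = cost_per_prospect(USERS, PROSPECTS_PER_USER, WEEKLY_BUDGET_USD)
print(f"{total_prospects:,} prospects, about ${per_prospect:.4f} per prospect per week")
```

That works out to 20,000 prospects at roughly $0.0005 each per week, which is why, at this stage, the API bill is not the cost worth worrying about.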

The cost benefit analysis will get more complex as time goes on and we have paying users and expenses and all of those things, but, right now, the costs are time and ego. I’ve committed all of the former to this and am always trying to conquer the latter.

As of now, our entire code base is 1,177 lines, including the SQL schema*. I need to set up GitHub Stats to be sure, but I’d guess that last week, before the big database switch, we peaked out above 2,000 lines. Trading maybe three or four days of time for a code base that’s half the size it otherwise would be, and even more powerful and useful, is a trade I would take every time.

The benefit of melting wax and feather assumptions is a more accurate world view. In terms of code, that generally means doing the same thing better with fewer, clearer lines. In terms of life, it can mean a lot of things, most of which are good.

*31% of the code base is the SQL wrapper, which I’m sure is a product of me being a total SQL noob. Learning more about SQL will help me slash that. If there are any books or blogs or videos you like, send them my way. 

Lampson’s Law

An assumption is part of your model for the world. Maybe you believe that with these wax and feather wings, you can fly. The trick to figuring out if you’re right is to ask the world for feedback.

For BirdDog, asking the world is about to be literally asking users. The last few weeks, though, asking the world was seeing if the code would do the things we wanted it to do under different foreseeable conditions.

Many of my assumptions in that regard ended up being quite silly. An experienced engineer would have known that, when asked what it thought about those assumptions, the world’s answer was going to be more negative than positive. As a matter of fact, some experienced engineers did tell me that some of these were not great assumptions. If that was you, thank you; I’m sure that those conversations helped me to know which direction to go when it became apparent that my direction was wrong.

Having tried these things was not a waste of time, though. As a matter of fact, I’ll keep trying things, and I’ll keep being wrong. But, each time, I’ll be closer to right.

Get it right. Neither abstraction nor simplicity is a substitute for getting it right.

Lampson’s Law

In the context of software, I think getting it right means building a tool that provides value to the users in a sustainable way. Sustainable means the value created and captured is ultimately greater than the cost of maintaining it. 

Getting there will require many mistakes and perpetual humility. Sometimes, when I realize that I’ve overcome three or four bad assumptions in a row, I counterintuitively have this momentum and attitude of “Oh, I understand, now. I was an idiot then, but I get it now.” Confidence is good, but never arrogance.

I am learning that no matter how many times I melt the wax wings, there will always be a more powerful sun and a better way to fly. 

I am a naive engineer.

Being naive at something is probably the most exciting part—you get to learn at a mile a minute.

Live Deeply,