Open Spam Net

Spam net

I've been looking into migrating away from Gmail for a while, largely out of concern that Google can shut down my email on a whim. My mail already runs through my own mail server, but I use Gmail for its nice UI, searching, and other neat features. I still haven't found a mail client that I actually like, but I'm hopeful that something will come along.

Aside from UI, the other big advantage of Gmail over your own server is the quality of its spam filtering. This isn't even just an algorithmic thing, although I'm sure their algorithms are top-notch. Rather, it's systemic: Google has access to everyone's email, so they can do spam filtering across everyone's email at once. They can apply detection techniques in the large that are impossible for an individual mail server operator to match.

Well, maybe. Most spam filters use some kind of Bayes classifier. Essentially, when you mark some emails as spam or not spam, the filter learns the probability of different features (usually keywords) appearing in each kind of email. What you really want to know is the reverse: the probability of spam given the features. That's exactly what Bayes' theorem is for.
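To make that concrete, here's a minimal sketch of that kind of filter in Python. The class, the feature handling and the smoothing are my own illustration rather than how any particular filter actually works: count word frequencies in mail you've labelled, then turn them around with Bayes' theorem to score new mail.

```python
import math
from collections import Counter

class BayesSpamFilter:
    """Toy naive Bayes filter: learn word frequencies from labelled mail,
    then invert them with Bayes' theorem to score new mail."""

    def __init__(self):
        self.spam_words = Counter()  # word -> number of spam messages containing it
        self.ham_words = Counter()   # word -> number of non-spam messages containing it
        self.spam_count = 0          # spam messages seen
        self.ham_count = 0           # non-spam messages seen

    def train(self, words, is_spam):
        (self.spam_words if is_spam else self.ham_words).update(set(words))
        if is_spam:
            self.spam_count += 1
        else:
            self.ham_count += 1

    def spam_probability(self, words):
        # Priors P(spam) and P(not spam) from the training counts.
        total = self.spam_count + self.ham_count
        log_spam = math.log(self.spam_count / total)
        log_ham = math.log(self.ham_count / total)
        for word in set(words):
            # Likelihoods P(word | spam) and P(word | not spam), with +1
            # smoothing so unseen words don't zero everything out.
            log_spam += math.log((self.spam_words[word] + 1) / (self.spam_count + 2))
            log_ham += math.log((self.ham_words[word] + 1) / (self.ham_count + 2))
        # Bayes' theorem, assuming the words are independent given the label.
        return 1 / (1 + math.exp(log_ham - log_spam))

f = BayesSpamFilter()
f.train(["cheap", "pills", "now"], is_spam=True)
f.train(["meeting", "tomorrow", "agenda"], is_spam=False)
print(f.spam_probability(["cheap", "pills", "tomorrow"]))  # ~0.67, leans spam
```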

And there's no reason that model has to be restricted to a single user, or a single server. You can chain many layers of predictions together in a Bayes network, and you could use that exact structure to federate your predictions. So: I think a certain set of features is likely to indicate spam. You can subscribe to my ideas of spamminess, and any time I'm wrong, not only does your estimate of those features being spammy go down, but your estimate of me being right goes down too!
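Here's a rough sketch of how that subscription could work. The names and the exact update rule are my own illustration (a reliability-weighted average standing in for a real Bayes network), but it captures the core mechanic: every peer starts at 50/50 and earns weight by being right.

```python
class Peer:
    """A remote spam predictor you subscribe to. Starts at 50/50 reliability
    (one imaginary hit, one imaginary miss) and earns trust by being right."""

    def __init__(self, name):
        self.name = name
        self.correct = 1
        self.total = 2

    @property
    def reliability(self):
        return self.correct / self.total

    def record_outcome(self, predicted_spam, actually_spam):
        # When a peer gets one wrong, its reliability (and so its weight) drops.
        self.total += 1
        if predicted_spam == actually_spam:
            self.correct += 1


def combined_spam_score(predictions):
    """Combine peer verdicts, weighting each by that peer's track record.
    `predictions` is a list of (Peer, spam_probability) pairs."""
    weight = sum(peer.reliability for peer, _ in predictions)
    if weight == 0:
        return 0.5
    return sum(peer.reliability * prob for peer, prob in predictions) / weight
```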

The nice thing about this is that there's very little benefit in spammers getting hold of it. Anyone (including a spammer) entering the network would start with a 50/50 predictiveness, which is to say there's no assumption that they have any value. They would earn that value by making successful predictions, and if spammers want to help fight spam then great.

The end result would be a massive predictive network where each user's spam predictions are combined. A place where everyone from big corporate networks to little private VPS mail servers can collaborate to fight spam together.

The Elo paradox

Penrose stairs

I was showing yesterday's post to a friend and he made the point that the feeling of constant improvement is very important, motivationally speaking, and that works against keeping your goals modest and reasonable. In fact, I think there's a very fundamental conflict between motivation and improvement, which I call the Elo paradox.

Chess, along with many online competitive games, uses an Elo-based rating system. Essentially these systems try to create a predictive measurement for you as a player, such that two people with equal ratings are equally likely to win a match against each other. There have been various refinements since Elo's original system, but the core concept is the same: your skill can be represented as a prediction of how likely you are to win. It's a powerful idea and yields very accurate skill measurements.
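For the curious, the core of Elo fits in a couple of lines (the K-factor of 32 below is just one common choice):

```python
def expected_score(rating_a, rating_b):
    """Probability that player A beats player B under the Elo model:
    a 400-point rating gap corresponds to roughly 10-to-1 odds."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

def update_rating(rating_a, rating_b, score_a, k=32):
    """New rating for A after a game: score_a is 1 for a win, 0.5 for a
    draw, 0 for a loss. Ratings move by how surprising the result was."""
    return rating_a + k * (score_a - expected_score(rating_a, rating_b))

# A 1500 player beating a 1700 player gains about 24 points;
# the 1700 player beating the 1500 player gains only about 8.
```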

But as good a rating system as it is, Elo is terrible game design. Almost everyone's journey in an Elo-ranked system looks the same: they come in as a beginner with an abysmal score. They have an initial burst of improvement that pushes their score into the low end of average. They work hard on their average score and eventually turn it into a slightly above average score. Their score stops going up. They have a bad week and lose a bunch of games. Their score drops. They stop playing for a little while. They come back rusty. Their score drops even more. They stop playing for good.

The problem is that we want to feel a sense of progress. It's nice to be better than you were yesterday. But the tragic reality is that this won't always be true. Mostly you're the same, and sometimes you're worse. That's the Elo paradox, in a nutshell: you can't have an accurate measurement of your ability that always increases.

For that reason, I think getting your motivation from an intrinsic measurement is fundamentally silly. Maybe it works for some people, but it seems obvious to me that you'll always run up against that skill ceiling at some point or another and see your results taper off. Instead I prefer to focus on cumulative output. That has the nice quality of being a measurement you can always control, and it always goes up.

I might not move faster than yesterday, but I've moved further. I might not be smarter, but I've thought more. I might not be better, but I've done more.

High water mark

Achievement, overachievement and underachievement

It strikes me that we often measure ourselves by what we've done previously, but in some cases that can result in very perverse incentives. The classic example here is the employee who pulls the heroic all-nighter to meet a deadline. The project is saved, the people shout and cheer, and that employee is branded as the all-nighter hero. But guess what happens next time the project is running behind schedule? Oh hero, where are you? Soon the heroism becomes expected and you've created yet another crunch-cycle zombie.

One thing that has surprised me is that this effect seems to persist even without the necessity of the bad-project-management-stress bogeyman. On my own projects, if I achieve beyond what I expected for any length of time, my expectation rises to meet that achievement, even when I explicitly set more modest goals to anchor that expectation. I like to visualise this as a high water mark being set by the steady ebb and flow of productivity, and it's very dangerous if left unchecked. As inflated expectation stacks upon inflated expectation, a small project with modest expectations can quickly swell into a Sistine Chapel behemoth.

I find that the only way to reset that expectation is to sometimes deliberately do the bare minimum. Watch the clock and get out at 5pm exactly. At least 1000 words? Sounds like exactly 1000 words to me! Minimum fifteen pieces of flair means I'm wearing fifteen pieces, and if you want more you need to set the minimum higher. I'm not saying do this all the time, that sounds like a recipe for mediocrity, but I think it's healthy to wipe out the high water mark on occasion.

Never pushing down means the only influence on expectations is upwards, and there's no way that can keep up forever.

Predictor-surprise pattern

Two computers playing peekaboo

An interesting thought I had a little while back is that a lot of aesthetically pleasing things, like music, art or comedy, seem to rely to some degree on making and breaking expectations. Comedy is maybe the most direct example, because a good joke is usually surprising or confronting in some way. But in music too there is a constant back-and-forth between expectation and reality. You hear a refrain twice, you're all primed to hear the same thing again, but then it changes subtly the third time. For whatever reason, this seems to be very enjoyable.

It'd be interesting to make a foray into generative music or art by explicitly building an audience model that works as a kind of predictor. The predictor would be constantly searching for patterns in the sequence of notes it's seen, trying to accurately predict what comes next. However, since you're running the predictor and also generating the notes, you can do something a bit interesting: change what comes next if the predictor would have predicted it too easily.

The system would then have some tunable 'surprise factor'. The higher it is, the less willing you are to let the predictor win and the more you subvert its expectations. My prediction is that after a while experimenting you would find some particular value that seems to be the sweet spot for making enjoyable music. But I'm prepared to be surprised.
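A rough sketch of what that loop might look like, with the bigram predictor and all the names being my own illustration rather than a worked-out design:

```python
import random
from collections import Counter, defaultdict

class SurpriseComposer:
    """Generate notes by predicting the most likely next note from the
    sequence so far, then deliberately picking something else whenever
    the prediction would have been too easy."""

    def __init__(self, notes, surprise_factor=0.5):
        self.notes = notes                       # available notes, e.g. ["C", "D", "E", "G"]
        self.surprise_factor = surprise_factor   # higher -> expectations get subverted more often
        self.transitions = defaultdict(Counter)  # previous note -> counts of what followed

    def predict(self, prev):
        """Return (most likely next note, the predictor's confidence in it)."""
        counts = self.transitions[prev]
        if not counts:
            return random.choice(self.notes), 1.0 / len(self.notes)
        note, count = counts.most_common(1)[0]
        return note, count / sum(counts.values())

    def next_note(self, prev):
        predicted, confidence = self.predict(prev)
        # Higher surprise_factor -> less confidence needed before we subvert.
        if confidence > 1 - self.surprise_factor:
            choice = random.choice([n for n in self.notes if n != predicted])
        else:
            choice = predicted
        self.transitions[prev][choice] += 1  # the predictor learns from what was actually played
        return choice

composer = SurpriseComposer(["C", "D", "E", "G"], surprise_factor=0.5)
melody = ["C"]
for _ in range(16):
    melody.append(composer.next_note(melody[-1]))
print(" ".join(melody))
```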

A strange game

War map

One thing I've noticed is that a lot of people, myself included, tend to have fairly particular ideas about what their game is. A programmer, say, is usually explicitly not playing the business game. A researcher is not playing the marketing game. An office worker is not playing the politics game. Indeed, in many cases that game is seen as somehow not relevant, or unworthy, or even just a mug's game - one where the only winning move is not to play.

Which would all be fine, except that you don't actually get to choose which games to play. If there are office politics and you deliberately remain ignorant about them, you're not achieving some noble victory, you're just hobbling yourself. Similarly, a researcher who avoids marketing isn't making some strategic non-move move by letting their ideas languish in obscurity. The status quo for these kinds of games isn't neutral, it's failure; you don't start at 50% and get worse as you play, you start at 0% and work your way up.

Reality doesn't care about how you've divided things up in your head. Your options for influencing it are a big bunch of levers; some shaped like engineering, some shaped like business, others shaped like people skills and politics. If you pull the right set of levers you can get the things you want, and from that perspective it's pretty hard to justify ignoring most of them because, while they might work perfectly, they're the wrong shape damnit!

Not playing may not always be a losing move, but it's putting your success in the hands of more experienced players and hoping that their victory aligns with yours. If not, the only winning move is to play.