25 August 2010

PayPhrase reverted. Reason: satisfactory levels of positive entropy.

Consummate Creativity is going away for a while. The experiment’s purpose was to produce more positive entropy in my life at a time when such was sorely lacking. It’s been a success, and though I completely failed at posting updates on the random swag you guys made me buy myself (the lightsaber, the vibrator, the Twilight posters, etc.), it was all amusing.

However, $10/month is no longer worth the increase in positive entropy, so I’m bringing the experiment to a close. Two reasons:

  1. I’m currently in a part of the world where this (nsfw) sort of thing just happens naturally, and
  2. I’m still waiting on my first paycheck (ouch).

I may bring it back at some point, or I may not. Meanwhile, I’d strongly recommend starting one of your own; it was a wholly enjoyable experience, and it gives Amazon an actual use case for an otherwise fairly pointless piece of technology.

14 May 2010

How do you handle self-motivation?

I’ve just made it out of undergrad—I’m meeting Don Knuth later today, and walking Sunday. This means that for the first time in several months, the soul-crushing burden of undergraduate education at Case Western isn’t weighing me down. I’m free, in theory, to do whatever I want (at least until August). What, then, am I going to do?

I’ve got lots of ideas. I want to start exercising again, learn to sight-read, write a book, get my coding chops back, and lay the groundwork for discursive programming analysis. I’m worried, though, that if I try to take all of that on, I’m going to wind up burning out before I accomplish much on any of those fronts. So, how should I approach this?

I was feeling really great this morning—I woke up early, well-rested, and made an amazing breakfast. I was feeling so great, in fact, that I wanted to go out for a run. If I do this, though, I’m setting the precedent that I can only go running on days when I’m feeling great; experience tells me that as soon as I feel less-than-great some day, my exercise routine is going to fall apart. This is especially true if I try to start running, writing, sight-reading, coding, and analyzing all at once—there’s not enough time to make all of those things habitual.

So, how do you handle drive? How do you take a task list and turn it into your life? Given a bunch of newfound time and energy, how do you capitalize on it so you don’t wind up hunched over at a desk reading Reddit all day?

03 May 2010

Proposal: Social-graph–localized Trending Topics

Geography is a pretty godawful way of linking people with others who have similar interests. The local Trending Topics for San Francisco (as close, culturally, to “my city” as I’m likely to get) right now include Lynn Redgrave (who?), #2010yearof (uh…), #DearJonasBrothers, iPads, and the gulf spill. Of these, only the last is something that I’m even tangentially curious about, and I can find information about that on basically every news site on the internet.

I don’t take from this the conclusion that San Francisco is filled with idiots. Rather, since interests tend to clump into geographically dispersed networks these days, geographic sampling is likely to yield the lowest common denominator: the stuff everyone is sort of viscerally interested in sans thought and cultivation. So the real problem is, again, that geography is a terrible way of grouping people for social purposes.

Now, what I’d be way more interested in seeing is a list of trending topics localized to my social graph—my friends/followers, and maybe their friends too. The topics that come up here would be the things that people I tend to find interesting find interesting—in my case, probably things like software development, 8bit music, hackerspace happenings, and memetics. This would help guide me towards interesting conversations people are having, and ones that I’d be far more interested in following than, say, #2010yearof.

So why doesn’t Twitter already do this? The simple answer, if you stop and think about it for a bit, is that it’s computationally expensive. Geographically localized trending topics are easy—there are all sorts of algorithmic tricks you can use to speed up and simplify the problem until it’s something you can manage to do on a dataset the size of Twitter’s. Social graph–localized topics, on the other hand, are hard. The layout of the social landscape is way less static than that of the physical one—people are constantly shifting around, making and pruning connections, and changing their interests, making the whole thing a pretty big mess. Moreover, the type of data I’m talking about gathering—trends, for each user, of that user’s friends and friend-of-friends—is pretty gnarly to compute and store.

Clearly, I think it’s possible anyway, or I wouldn’t be writing this post. Here’s how: move the computation from the cloud to the client machines. Every modern browser includes a JavaScript interpreter. The data required—friend and friend-of-friend connections—is all (or mostly) public. So write code that performs the computation client-side and then sends the data back to Twitter. For bonus points, perform the computation in such a way that data is reused—instead of running the calculation over and over for each friend, first try to query that friend’s data and apply some transform to that.
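To make the client-side idea a bit more concrete, here is a minimal sketch of the counting and merging steps. Everything in it is an assumption made for illustration: the function names (countTopics, mergeCounts, topTopics), the use of hashtags as "topics", and the hard-coded sample tweets. Fetching the tweets and sending results back to Twitter are left out entirely.

```javascript
// Count hashtag occurrences in a list of tweet texts (hashtags stand in for "topics").
function countTopics(tweets) {
  var counts = {};
  tweets.forEach(function (text) {
    var tags = text.toLowerCase().match(/#\w+/g) || [];
    tags.forEach(function (tag) {
      counts[tag] = (counts[tag] || 0) + 1;
    });
  });
  return counts;
}

// Merge several per-friend count tables into one. This is the reuse step:
// a friend's table could be fetched ready-made instead of recomputed.
function mergeCounts(tables) {
  var total = {};
  tables.forEach(function (table) {
    Object.keys(table).forEach(function (tag) {
      total[tag] = (total[tag] || 0) + table[tag];
    });
  });
  return total;
}

// Return the n most frequent topics, breaking ties alphabetically.
function topTopics(counts, n) {
  return Object.keys(counts).sort(function (a, b) {
    return (counts[b] - counts[a]) || (a < b ? -1 : 1);
  }).slice(0, n);
}

// Hypothetical sample data standing in for friends' public tweets.
var friendTweets = {
  alice: ['Chiptune night at the #hackerspace', 'New #8bit album out'],
  bob: ['#hackerspace soldering workshop', 'Reading about #memetics']
};
var perFriend = Object.keys(friendTweets).map(function (name) {
  return countTopics(friendTweets[name]);
});
var trending = topTopics(mergeCounts(perFriend), 2);
// trending is ['#hackerspace', '#8bit']
```

The per-friend tables are the unit of reuse here: merging two small tables is much cheaper than re-reading every tweet behind them.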

There’s an obvious problem with this plan: how do you prevent people from cheating? If they’re doing the computation, not you, then they can send back whatever bogus data they feel like. Maybe they want to say that everyone around them loves their social media website, or their sexy pictures, or what-have-you. But this is actually pretty easy to combat: spot-check the results. (Strictly speaking, this is random auditing rather than a zero-knowledge proof, but the effect is what we want.) Twitter has all the data needed to run the computations server-side. So take some random subset of users (ideally a subset small enough to reduce the asymptotic complexity of the overall task) and evaluate the algorithm over that to see if it matches what people are claiming. If so, then people probably aren’t cheating. If not, someone is.
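The random-subset check can be sketched as follows. This is a rough illustration, not anything Twitter is known to run: the names sample and auditClaims, the shape of the claims object, and the recompute callback (standing in for the trusted server-side computation) are all assumptions.

```javascript
// Pick up to k distinct items at random using a partial Fisher-Yates shuffle.
function sample(items, k, random) {
  var pool = items.slice();
  var n = Math.min(k, pool.length);
  for (var i = 0; i < n; i++) {
    var j = i + Math.floor(random() * (pool.length - i));
    var tmp = pool[i];
    pool[i] = pool[j];
    pool[j] = tmp;
  }
  return pool.slice(0, n);
}

// Recompute results for a random sample of users and return the ids
// whose client-reported claims disagree with the trusted answer.
function auditClaims(claims, recompute, sampleSize, random) {
  var userIds = Object.keys(claims);
  return sample(userIds, sampleSize, random || Math.random).filter(function (id) {
    return JSON.stringify(claims[id]) !== JSON.stringify(recompute(id));
  });
}

// Hypothetical data: mallory reports a topic the server would not compute.
var truth = { alice: ['#hackerspace'], bob: ['#8bit'], mallory: ['#memetics'] };
var claims = { alice: ['#hackerspace'], bob: ['#8bit'], mallory: ['#mysite'] };
var suspects = auditClaims(claims, function (id) { return truth[id]; }, 3);
// With a sample size covering all three users, suspects is ['mallory'].
```

In practice the sample would be far smaller than the user base, so a cheater is only caught with some probability per round; repeated rounds raise the odds.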

So that’s how Twitter can solve a really thorny problem by outsourcing its computation to users’ machines. The point I’m getting at here is that this strategy generalizes: for a lot of (especially social) computations, it makes sense to farm the work out to users. This also means that you don’t necessarily need a massive compute cluster to handle a large user base.

05 April 2010

Life-changing

$ cat <<EOV >>~/.vimrc
se backupdir=/var/tmp
se dir=/tmp
EOV

Vim swapfiles and backup files will no longer litter your project directories; instead, they'll all be consolidated in one place. Brilliant.

31 March 2010

In Case You Need It

In my last post, I used the following pattern to run some code once something it depends on had become available (with typeof thing != 'undefined' replaced here with a fair coin flip):

(function doThingEventually() {
  if (Math.random() > 0.5) {
    alert('hi');
  } else {
    window.setTimeout(doThingEventually, 100);
  }
})();

This does what we want, but as a side effect, it sticks doThingEventually into the global scope. doThingEventually can now be accessed by subsequent bookmarklet calls. That's kinda gross. We might want to know if there's a way to do what we want without any side effects. It turns out that the answer is yes, if we take a page out of the Y combinator’s book:

(function(f, thing, predicate) {
  if (predicate()) {
    thing();
  } else {
    window.setTimeout(function() { f(f, thing, predicate); }, 100);
  }
})(function(f, thing, predicate) {
  if (predicate()) {
    thing();
  } else {
    window.setTimeout(function() { f(f, thing, predicate); }, 100);
  }
}, function() { /* thing(), function to run if predicate() is true */
  alert('hi');
}, function() { /* predicate() that must eventually evaluate to true */
  return Math.random() > 0.5;
});

Running this through the Closure compiler, we get the following:

(function(a,b,c){c()?b():window.setTimeout(function(){a(a,b,c)},100)})(function(a,b,c){c()?b():window.setTimeout(function(){a(a,b,c)},100)},function(){alert("hi")},function(){return Math.random()>0.5});

That code clocks in at 202 bytes. For comparison, here's the compiled version of the original, at 73 bytes:

(function a(){Math.random()>0.5?alert("hi"):window.setTimeout(a,100)})();

We can verify that both methods work: try them yourself! Method one and two. Both of these should, with very high probability, pop up a box saying "hi" to you after a short delay.

Is it worth the 129 extra bytes to not pollute the global namespace? Depends. If you're compiling code that uses that pattern, you should be careful about sticking short identifiers like a into the global namespace, especially in bookmarklet code that needs to be mindful of whatever bizarre code the page author has running. On the other hand, if you have control over all the code that gets run, it's probably saner to just let the compiler keep track of everything and use the simpler variant.
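As an aside, the self-passing trick doesn't seem to require writing the function body twice. In the sketch below, the outer function only kicks things off by calling f with itself. The greetings array and the always-true predicate are stand-ins I've made up so the snippet can run outside a browser; I haven't run this variant through the Closure compiler, so I can't say how many bytes it saves.

```javascript
var greetings = [];

(function (f, thing, predicate) {
  // Kick off the loop by handing f a reference to itself.
  f(f, thing, predicate);
})(function (f, thing, predicate) {
  if (predicate()) {
    thing();
  } else {
    // Not ready yet: try again in 100ms, still passing f along.
    setTimeout(function () { f(f, thing, predicate); }, 100);
  }
}, function () { /* thing() */
  greetings.push('hi');
}, function () { /* predicate(), a stand-in for a real readiness check */
  return true;
});
```

No name is bound anywhere except the greetings array used for the demonstration, which a real bookmarklet would replace with its own side effect.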

Update: Via a comment by _johnny on Reddit, it turns out JavaScript actually has a way of referring to the function being called: arguments.callee. His method doesn't pollute the global namespace, yet maintains conciseness. Unfortunately, according to another comment made in the same thread, arguments.callee is now deprecated. A third way to pull this hat trick off is as follows, courtesy of snorlaxx:

(function() {
  (function doThingEventually() {
    if (Math.random() > 0.5) {
      alert('hi');
    } else {
      window.setTimeout(doThingEventually, 100);
    }
  })()
})()

Phew. Isn't JavaScript fun?

Update 1 April 2010: As it turns out, you don't even need the enclosing function to take the variable out of global scope. Apparently just saying (function doThingEventually() { ...etc... })() is fine.
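A quick way to check this is to ask typeof about the name both inside and outside the function. The ran and leaked variables are just scaffolding for the check. This reflects how the language specifies named function expressions; I haven't verified it against every browser of the time.

```javascript
var ran = false;

(function doThingEventually() {
  // Inside the body, the name refers to the function itself.
  ran = (typeof doThingEventually === 'function');
})();

// Outside, the name was never bound, so typeof reports 'undefined'.
var leaked = (typeof doThingEventually !== 'undefined');
```

After this runs, ran is true and leaked is false: the name is usable for recursion but invisible to the rest of the page.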

Writing Maintainable Bookmarklets (Part 2)

Last time, we learned a simple way to write maintainable bookmarklets. The method we used was to dynamically add a script tag to our document, executing any code contained in it. Every time we click on our bookmarklet, the script gets removed and re-added to the page, re-running its contents.

Trouble is, that's kind of wasteful. Every time your users use your bookmarklet, they're going to have to first make a round trip to your server and run some code. If your bookmarklet is simple and unlikely to be used many times per page-reload, or if you don't mind the traffic, then this might be fine. In the general case, though, we'd probably like our bookmarklet to set up some data structures the first time it's run and then make a function call on subsequent runs. We can accomplish this by defining window.someUniqueFunctionName within our script and then calling it in our bookmarklet after the script is loaded. First, though, we need to wait until the script has loaded. Here's how the code looks for that case:

var n = 'some-unique-identifier';
var f = 'someUniqueFunctionName';
var s = document.getElementById(n);
if (!s) {
  s = document.createElement('script');
  s.type = 'text/javascript';
  s.src = 'http://example.com/js/bar-script.js';
  s.id = n;
  document.body.appendChild(s);
}
(function doBookmarklet() {
  if (typeof window[f] != 'undefined') {
    window[f]();
  } else {
    window.setTimeout(doBookmarklet, 100);
  }
})();

So if the script hasn't yet been loaded, we load it. Now we need to run the function we defined in it. First, we see if the function exists yet. If so, we're done: we call it and exit. Otherwise, we use window.setTimeout to try again in some $short_interval---in this case, 100ms.
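For completeness, here is a rough sketch of what the other half, bar-script.js, might look like. The contents are entirely hypothetical (the state object and its runs counter are placeholders); the only part dictated by the bookmarklet above is that the script eventually defines someUniqueFunctionName on the window. The globalThis fallback is there only so the sketch also runs outside a browser.

```javascript
// Hypothetical contents of bar-script.js.
(function (global) {
  // One-time setup: this runs only when the script tag is first loaded.
  var state = { runs: 0 };

  // Entry point that the bookmarklet polls for and calls on every click.
  global.someUniqueFunctionName = function () {
    state.runs += 1;
    return state.runs;
  };
})(typeof window !== 'undefined' ? window : globalThis);
```

Because state lives in the closure, it survives across clicks without adding anything to the page's globals beyond the one entry point.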

30 March 2010

Writing Maintainable Bookmarklets (Part 1)

Bookmarklets are great. Clicking a button in your bookmarks bar can do everything from prettifying typography to making pages readable to launching a full-fledged bookmark(let) manager. As a programmer, you probably want to learn how to write them. Hopefully this post will help.

There are a few decent introductions to writing bookmarklets. There's a particularly well-written one here (though if anyone can find better resources, please let me know!). If you don't know how bookmarklets work at a high level, you'll probably want to go through that---at least get to the point where you understand what this link is doing.

So now that you've got a grasp of writing bookmarklets, let's talk about the maintainability aspect. This is a little trickier. Things like pushing out upgrades get tricky when your code is all sitting in a bunch of people's bookmark bars. You can maintain a mailing list and demand that people sign up and check for updates, but realistically, that's not going to work; it's just not worth your users' time.

A better approach is to make your bookmarklet as small as possible, and have it load and run the bulk of its code from an external site that you control. You can do this by employing a clever hack someone came up with a while back: append a new script tag pointing at your code to the end of the document, and let the browser load and run it. You can test new versions on your own workstation by keeping a "test" bookmarklet pointing at a different script. Every time you fix a bug or add a new feature, just push the change to the script your public bookmarklet points at, and people will transparently start using the new code.

Here's how the code for this looks:

var n = 'some-unique-identifier';
var s = document.getElementById(n);
if (s) {
  document.body.removeChild(s);
}
void(s = document.createElement('script'));
s.src = 'http://example.com/js/foo-script.js';
s.type = 'text/javascript';
s.id = n;
document.body.appendChild(s);

That's it. Any function calls contained in foo-script.js will get executed in the context of the current window. The amount of code you have to worry about working all the time no matter what---the stuff you can't realistically maintain---is minuscule. Any updates you make will be instantly visible as soon as users click on your bookmarklet.
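To turn a loader like this into something you can actually drag to a bookmarks bar, it has to be packed into a javascript: URL. Here is one way that might be done; toBookmarkletUrl is a helper name I've made up, and wrapping the source in a function is my own choice (it keeps n and s out of the page's globals and makes the URL evaluate to undefined, so the browser doesn't replace the page with a return value).

```javascript
// Pack loader source into a javascript: URL (hypothetical helper).
function toBookmarkletUrl(source) {
  // Wrapping in a function keeps the loader's variables local to the bookmarklet.
  return 'javascript:' + encodeURIComponent('(function(){' + source + '})();');
}

// The loader from this post, joined into a single string.
var loader = [
  "var n = 'some-unique-identifier';",
  "var s = document.getElementById(n);",
  "if (s) { document.body.removeChild(s); }",
  "s = document.createElement('script');",
  "s.src = 'http://example.com/js/foo-script.js';",
  "s.type = 'text/javascript';",
  "s.id = n;",
  "document.body.appendChild(s);"
].join('');

var url = toBookmarkletUrl(loader);
```

The resulting url string is what goes in the href of the link you ask people to drag to their bookmarks bar.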

This isn't a one-size-fits-all solution, though. Unless you code carefully, this will obliterate all your script's data structures every time it's run. Worse (from a performance perspective), it potentially makes an HTTP round trip every time it's run. Caching lessens the effects of this, but that's not a bulletproof solution either. Given these caveats, though, this is a pretty decent way to write maintainable bookmarklets.