Just for kicks: a captcha using <canvas>

Hopefully the first part in a series of useless experiments, this post will cover how to draw a Google-like captcha using the <canvas> element.

First off, I’d like to declare that I am aware of how useless having the ability to draw a captcha on the client side is. Sending the information required to generate the image would defeat the whole purpose of using a captcha in the first place. Sort of like this.

However, assuming we do have a legitimate reason for doing this, how would it be done?

To customize how the text is drawn, the wonderful typeface.js can be used. This library was initially developed to support custom fonts on the client side but has arguably been made obsolete by webfont support in all major browsers.

Typeface.js works by reading the raw font and generating JavaScript code that describes the glyphs for each character in the font. These can then be rendered using <canvas> for most browsers and VML in IE. The actual drawing code is quite simple and, together with the fonts, is the only part we need for drawing the captcha. Of course, applied directly, this only results in “correctly” rendered words that are not much of a challenge for spammers:


This needs more squiggles!

Which means we just need some random aberrations. By randomly applying the transform, translate and rotate methods to the canvas context, we should get something distorted enough that it stops looking like regular text.

First, we’ll add a random transform before drawing each letter. Each transform builds on the previous ones, since the context is never reset (rnd just returns an appropriately small random number; see source for details):

for(var i = 0; i < word.length; i++) {
  ctx.transform(1-rnd(), rnd(1), rnd(), 1 + rnd(2), rnd(1), rnd());
  ...
}
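
Because ctx.transform multiplies the new matrix into the current one, the distortion piles up from letter to letter. Here is a minimal sketch of that accumulation done by hand, without a canvas. The rnd below is only a guess at what such a helper might look like (the real one is in the linked source), and multiply and accumulate are hypothetical names used for illustration:

```javascript
// Assumed helper: a random number in [-0.05 * scale, 0.05 * scale].
// The real rnd lives in the post's source; this range is a guess.
function rnd(scale) {
  var s = (scale === undefined) ? 1 : scale;
  return (Math.random() - 0.5) * 0.1 * s;
}

// Multiply two affine matrices given in canvas order [a, b, c, d, e, f].
// This mirrors what ctx.transform does to the current matrix.
function multiply(m, n) {
  return [
    m[0] * n[0] + m[2] * n[1],
    m[1] * n[0] + m[3] * n[1],
    m[0] * n[2] + m[2] * n[3],
    m[1] * n[2] + m[3] * n[3],
    m[0] * n[4] + m[2] * n[5] + m[4],
    m[1] * n[4] + m[3] * n[5] + m[5]
  ];
}

// Return the matrix in effect for each letter when the context is never reset.
function accumulate(letterCount) {
  var current = [1, 0, 0, 1, 0, 0]; // identity
  var perLetter = [];
  for (var i = 0; i < letterCount; i++) {
    var step = [1 - rnd(), rnd(1), rnd(), 1 + rnd(2), rnd(1), rnd()];
    current = multiply(current, step);
    perLetter.push(current);
  }
  return perLetter;
}
```

Each entry returned by accumulate(word.length) drifts a little further from the identity matrix than the one before it, which is why the last letters of a word end up more distorted than the first ones.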

The results after doing this are promising, but we’re not quite there yet:


The obvious improvement is to decrease the spacing between letters, which can be achieved by lowering the translation to the right after each letter:

ctx.translate(glyph.ha * 0.8 - 45, 0);

Now the resulting image really looks like a captcha:


Of course, by adding more transforms at various stages of the drawing process the images get even more distorted. The key, though, is reaching a point where it’s sufficiently easy for humans to decipher the word but hard for OCR software to do it. Here are two more examples with images that are progressively harder to read:


Now, even though doing this on the client side is pointless, the exact same operations could be done in any language with a decent graphics library. typeface.js can in turn be used to create interesting effects such as wavy text, which, as kitschy as it sounds, may work for an artistic site. 

Code here

A Better CPU Graph for Gnome System Monitor

If you use Gnome on a machine with at least four CPU cores then you’ve probably seen something like this before:

Because Linux constantly moves threads between cores, the whole chart ends up looking like spaghetti, so I decided to make it render an area chart instead.

Here are the results:

The code is hosted on GitHub.

A Reddit widget using jQuery and JSONP

A very important part of the experience of reading blogs is the ability to see other people’s opinions on the subject, which in many cases turn out to be more insightful than the post itself. Also, social news sites like Reddit or Hacker News tend to generate discussions that engage many more users than traditional blog comments do, and it’s important to make it easier for readers to access them.

When creating this blog, I checked whether Reddit provides a way of embedding information about the submissions for a URL inside a page, such as the number of comments, a link to the actual submission and, of lesser importance, the number of upvotes. Unfortunately, while they do have both a button widget and a submission list widget, neither of them is very customizable, nor do they display exactly what I wanted, so I decided to make something myself. The end result looks something like this:

Reddit API to the rescue!

Reddit has been exposing a pretty good RESTful API for quite some time now, and fortunately, about two months ago, they also added support for JSONP. What that means is that their API can now be called directly from the browser, from any page, without running into the same-origin restrictions that apply to XMLHttpRequest. The docs can be found here: http://code.reddit.com/wiki/API .

Luckily, the functionality I want requires a single API call: http://reddit.com/api/info.json?url=URL . Here’s what it returns when asked about microsoft.com:

{ "data" : { 
  "after" : null,
  "before" : null,
  "children" : [ { 
    "data" : { 
      "id" : "a1ql9",
      "num_comments" : 2,
      "permalink" : "/r/programming/comments/a1ql9/developer/",
      "subreddit" : "programming",
      "subreddit_id" : "t5_2fwo",
      "title" : "developer",
      "ups" : 2,
      "downs" : 9,
      "url" : "http://www.microsoft.com",
      ..............
    },
    "kind" : "t3"
  },
  ............
] } }

The children field’s value is a list containing an entry for each submission of the specified URL on Reddit. I’ll assume that my posts will not be submitted to Reddit multiple times and just use the first element.
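
Picking out that first element can be sketched as a small helper. The name firstSubmission is hypothetical, and the sample object is just a trimmed-down copy of the response shown above:

```javascript
// Return the first submission's data, or null when the URL has no submissions.
// Assumes the response shape shown above (data.children[n].data).
function firstSubmission(json) {
  var children = json && json.data && json.data.children;
  if (!children || children.length === 0) {
    return null;
  }
  return children[0].data;
}

// Trimmed-down copy of the sample response for microsoft.com.
var sample = {
  data: {
    children: [
      { kind: "t3", data: { id: "a1ql9", num_comments: 2, ups: 2 } }
    ]
  }
};

var entry = firstSubmission(sample);
console.log(entry.num_comments + " comments, " + entry.ups + " upvotes");
```

Returning null for an empty list keeps the "not submitted yet" case explicit, so the caller can simply skip rendering.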

Creating the widget

Ideally, a widget such as this would not depend on any JavaScript libraries, since it should be easy to include in any page, but since this is more of a demo I decided to use jQuery anyway. It also doesn’t hurt that the Tumblr theme I’m using already includes it.

The nice thing about using jQuery is that it already has built-in support for JSONP. The $.ajax function can handle the creation of the <script> element, as well as managing the callback function. Here’s what a call to the above API URL looks like:

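$.ajax({
  // encodeURIComponent (rather than encodeURI) also escapes ?, & and =,
  // so the page URL survives as a single query parameter value
  url: "http://www.reddit.com/api/info.json?url=" +
       encodeURIComponent(location.href),
  jsonp: "jsonp",
  dataType: "jsonp",
  success: function(json) {
    console.log(json);
  }
});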

The jsonp: "jsonp" option tells jQuery that the callback parameter (which contains the name of the callback function) is called “jsonp”.
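
For the curious, here is roughly what a JSONP request looks like without jQuery. This is only a sketch of the general technique, not jQuery's actual implementation; buildJsonpUrl, fetchRedditInfo and the generated callback name are made up for illustration, and the second function only works in a browser:

```javascript
// Build the request URL, naming the callback through the "jsonp" parameter.
// encodeURIComponent escapes ?, & and = so the page URL stays one value.
function buildJsonpUrl(pageUrl, callbackName) {
  return "http://www.reddit.com/api/info.json" +
         "?url=" + encodeURIComponent(pageUrl) +
         "&jsonp=" + encodeURIComponent(callbackName);
}

// Browser-only part: register a global callback, then inject a <script>
// element whose response calls that callback with the JSON data.
function fetchRedditInfo(pageUrl, onData) {
  var callbackName = "redditInfoCallback" + new Date().getTime();
  window[callbackName] = function(json) {
    window[callbackName] = undefined; // the callback is single-use
    onData(json);
  };
  var script = document.createElement("script");
  script.src = buildJsonpUrl(pageUrl, callbackName);
  document.getElementsByTagName("head")[0].appendChild(script);
}
```

In a page, this would be called as fetchRedditInfo(location.href, function(json) { console.log(json); }).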

To render the JSON received from Reddit, I wrote a small function that takes a string of HTML and an object, and replaces fields of the form $propertyName with the matching properties from that object:

function replaceFields(html, data) {
  for(var prop in data) {
    html = html.replace("$" + prop, data[prop]);
  }
  return html;
};

You could call this function “the poor man’s template engine”. This will be called with the first submission found in the JSON returned by Reddit (i.e. json.data.children[0].data).
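
The simple version has a few limits worth knowing about: replace with a string pattern only swaps the first occurrence of each field, sequences like $& inside a value are treated as special replacement patterns, and a property whose name is a prefix of another field's name can clobber it. If that ever becomes a problem, a slightly sturdier variant might look like this (replaceAllFields is a hypothetical name, and this is just one way to do it):

```javascript
// Replace every $name placeholder with data[name].
// Placeholders with no matching property are left untouched.
function replaceAllFields(html, data) {
  return html.replace(/\$([A-Za-z_][A-Za-z0-9_]*)/g, function(match, name) {
    // Using a function as the replacement keeps values literal,
    // so a "$&" inside a title is not interpreted by replace().
    return Object.prototype.hasOwnProperty.call(data, name)
      ? String(data[name])
      : match;
  });
}

var html = replaceAllFields(
  "$ups upvotes, $num_comments comments, again $ups, unknown $missing",
  { ups: 2, num_comments: 5 }
);
console.log(html);
```

For a widget with three fields that each appear once, the original function is perfectly adequate.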

Putting it all together

The final widget combines the two above bits of JavaScript with a bit of glue code:

function loadRedditInfo(target) {
  function replaceFields(html, data) {
    for(var prop in data) {
      html = html.replace("$" + prop, data[prop]);
    }
    return html;
  };
  function processResponse(json) {
    if(json.kind != "Listing" ||
       json.data.children.length == 0) {
      // don't show anything if this url hasn't been 
      // submitted to reddit
      return;
    }
    // we only care about the first entry
    var entry = json.data.children[0].data;

    target = $(target);
    target.html(replaceFields(target.html(), entry));
    target.show();
  };
  $.ajax({ 
    url: "http://www.reddit.com/api/info.json?url=" + 
         encodeURIComponent(location.href),
    jsonp: "jsonp",
    dataType: "jsonp",
    // the try/catch is ugly, but somewhat necessary in case 
    // Reddit is down or they change their API
    success: function(json) {
      try { processResponse(json); }
      catch(e) { };
    }
  });
};
$(function() { loadRedditInfo("#reddit-info-widget"); });

This can be added to a <script> element anywhere on the page (of course, after jQuery).

Here’s an example of an HTML template which can be used together with the above function. It displays the number of upvotes and comments, and provides a link to the Reddit submission page:

<div id="reddit-info-widget" style="display: none">
  This post has gathered $ups upvotes and 
  <a class='comments' href='http://reddit.com$permalink'>
    $num_comments comments
  </a>
  on Reddit!
</div>

Note how the HTML above contains the $ups, $permalink and $num_comments fields. These will get replaced with the number of upvotes, link to the submission and number of comments respectively.

To see how this script and HTML template render together, visit this page.

The code is also available on GitHub.

Snapping windows in Linux - osnap.py

Some time ago I bought a 24” display for my desktop. Because the 1920px width is way too large for any single window (with the obvious exception of the movie player), I changed the way I arrange windows without even realizing it. On a 19” display I keep a single large window visible at a time, maximized to cover the whole screen; on the 24” one I tend to keep two or even three windows visible: for example, the browser on the left side of the screen, emacs in the upper right corner and a terminal in the lower right.

This, however, made me realize that resizing and moving windows by hand is a real nuisance. 

The snapping in Windows 7 is of great help with this, but has two drawbacks: 

  • it doesn’t allow you to snap to just a quarter of the screen
  • it requires Windows 7; I prefer to use Linux with Gnome

The only similar thing I could find for Linux was the Compiz Grid plugin, which looks great. Unfortunately, it also has two drawbacks:

  • it requires 8 keyboard shortcuts: 4 for snapping to the up/down/left/right half of the screen and another 4 for the corners, which means that you can’t simply use the arrow keys (I believe it’s designed to work with the numpad, which I kinda hate)
  • it requires Compiz, which I don’t use for some (very arguable) performance considerations

So I decided to make something that works just as I want it to. It actually ended up taking only a few hours and exactly 100 lines of Python, and here it is: osnap.py.

What it does, in a nutshell, is allow you to snap windows to a 2x2 grid using only keyboard shortcuts, to either a half or a corner of the screen, while also being able to restore windows to their old size and position from before the snap.

And a little more detail:

After being run, osnap stays resident and catches Win+Arrow key combinations (or Mod4+Arrow, for purists). When the key handler is triggered, osnap will resize the currently focused window to cover the half of the screen corresponding to the arrow key pressed.

If the same shortcut is pressed again the window’s original size and position will be restored. 

If two key combinations are pressed in sequence quickly (under 250ms), the focused window will be resized to cover the corresponding *quarter* of the screen.
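
The geometry behind this is simple enough to sketch. Note that osnap.py itself is written in Python; the snippet below is an illustration in JavaScript, not a port, and the names snapRect, keysForPress and COMBO_WINDOW_MS are made up. It ignores panels, window decorations and the restore-on-repeat behaviour:

```javascript
// Compute the target rectangle for one or two arrow keys.
// keys is e.g. ["left"] for a half or ["up", "right"] for a quarter.
function snapRect(screenWidth, screenHeight, keys) {
  var rect = { x: 0, y: 0, width: screenWidth, height: screenHeight };
  for (var i = 0; i < keys.length; i++) {
    var key = keys[i];
    if (key === "left" || key === "right") {
      rect.width = screenWidth / 2;
      rect.x = (key === "right") ? screenWidth / 2 : 0;
    } else if (key === "up" || key === "down") {
      rect.height = screenHeight / 2;
      rect.y = (key === "down") ? screenHeight / 2 : 0;
    }
  }
  return rect;
}

// Decide whether a key press combines with the previous one.
// The 250 ms window comes from the description above.
var COMBO_WINDOW_MS = 250;
function keysForPress(previous, key, now) {
  if (previous && previous.key !== key &&
      now - previous.time < COMBO_WINDOW_MS) {
    return [previous.key, key];
  }
  return [key];
}
```

Assuming a 1920x1200 screen, snapRect(1920, 1200, ["left"]) gives the left half (960 pixels wide, full height), while snapRect(1920, 1200, ["up", "right"]) gives the upper right quarter starting at x = 960.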

And that’s all there is to it. It works exactly how I want it to and it’s small. The code’s available on GitHub.
