This Week on Github: In Good Company

As open source burgeons in popularity, contributors are flocking to websites to share and borrow code. Github is one of the most popular. This inaugural installment of "This Week on Github" introduces the service and highlights some of the best projects available.

A free lunch may still be a myth, but the benefits of free and open source software are already legendary. Businesses now readily embrace Linux, Apache, and scores of other packages up and down the so-called stack; companies such as MySQL have proven it’s possible to turn an open source project into a (very) profitable enterprise; and services like Dreamhost and WordPress are leading a new wave of businesses powered entirely by community-driven software rather than an impenetrable black tower of proprietary code.

Moreover, some companies are going even further than using open source. Many are contributing code to open source, too, releasing the source of software funded by internal development efforts. In recent years, Google, Yahoo, and even the United States government have become open source contributors. With so many producers and consumers of open source software, websites such as Github have cropped up to serve as, well, hubs.

If you’ve never tried Github, think of it as a “MySpace for hackers” (an analogy originally drawn by Ryan Tomayko). Upload your code, share it with the world, and simultaneously stand on the shoulders of giants. Further, you can comment on others’ code, and can split off, or fork, a body of code to create your own private version. Once you fork, you can modify the code and submit your enhancements for integration into its progenitor. Better yet, Github is easy to use—no operation requires more than a few clicks.

The Usual Suspects

I said many companies are getting into Github, but Github isn’t solely for open source companies like Mozilla and Red Hat. Many other corporations and institutions use Github to host private and client project code. (You can see a rather long list of patrons on the Github homepage.) Interestingly and increasingly, those same companies are also using Github to share their codebases and to let others contribute, too.

For example, 37signals (which created and maintains the Rails Web framework) shares its Chef cookbooks with the world. Chef is a “systems integration framework” that manages systems configurations. It is much easier than manual maintenance and far less complex than, say, Puppet. 37signals’ cookbooks are especially helpful because each one provides bits like Passenger and Ruby gems that you need to set up a typical Ruby on Rails Web stack (although you can obviously use other software with Chef for other kinds of systems). If nothing else, these cookbooks serve as excellent examples for those new to Chef.

In addition to great Chef recipes, 37signals also shares a great library called Action Profiler, which, unfortunately, is only useful to Rails developers, but is extremely useful if you happen to be one. The Action Profiler Rails plugin provides a simple way to generate call tree files suitable for kcachegrind. While this probably isn’t useful to every Rails developer, if you’ve got a nasty memory leak or CPU spike you can’t seem to figure out, Action Profiler is a fantastic way to help root out the fault.

Simply put this in the controller you wish to profile…

around_filter :action_profiler unless Rails.env.production?

… add any scoping you want (that is, make it run only on certain actions), and away you go. Send a request to the action with the suffix ?profile=process_time to get the call tree.

If you’ve installed the Ruby Enterprise Edition (a patched Ruby distribution that includes a number of excellent additions like a smart garbage collector), you can append ?profile=memory or ?profile=allocations to examine application objects—a killer tool for outing memory leaks.
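As a rough illustration of the request suffixes described above, here is a small plain-Ruby sketch that appends the profile parameter to an action's URL. The helper name profile_url and the localhost address are assumptions made for this example and are not part of Action Profiler. Only the three mode names come from the text above.

```ruby
require "uri"

# The three modes named in the article. Per the article, memory and
# allocations assume Ruby Enterprise Edition is installed.
PROFILE_MODES = %w[process_time memory allocations].freeze

# Hypothetical helper (not part of Action Profiler): returns the given
# URL with a profile=<mode> parameter added to its query string.
def profile_url(url, mode)
  unless PROFILE_MODES.include?(mode)
    raise ArgumentError, "unknown profile mode: #{mode}"
  end

  uri = URI.parse(url)
  # Keep any parameters the URL already carries.
  params = URI.decode_www_form(uri.query || "")
  params << ["profile", mode]
  uri.query = URI.encode_www_form(params)
  uri.to_s
end

# Example with a made-up local address.
puts profile_url("http://localhost:3000/posts", "process_time")
```

Building the query with URI rather than string concatenation keeps any parameters already on the URL intact.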

The 37signals Github repository is chock full of trinkets for Rails developers. Enjoy an early Christmas at http://github.com/37signals. I recommend project_search (a Rails plugin to help you search your Rails projects) and the company’s Capistrano extensions for speeding up deployment.

Speaking of Ruby and deployment, one of the more popular, recent developments in deploying Ruby applications is JRuby, a Ruby that runs on the Java Virtual Machine. The project, funded and sponsored by Sun Microsystems, is one of the few projects Sun has on Github. JRuby has made itself a cozy home on Github at http://github.com/jruby. If you’re interested in using Ruby with Java, you should check out what else is offered in the repository there. It contains JRuby and a couple of other tools for using Ruby in a Java ecosystem, such as an implementation of the OpenSSL C extension in Java.

And Some Surprising Twists!

Beyond the aforementioned usual suspects, you can find some unexpected companies lurking about Github, too.

For example, Microsoft recently jumped on the Git bus by publishing IronRuby to Github. Check it out at http://github.com/ironruby. Unfortunately, Microsoft’s contribution model is draconian, to say the least. Rather than the simple “fork-fix-request” cycle of other projects, if you want to contribute, you have to electronically sign a contribution agreement and then can only contribute to certain parts of the source. Even so, it’s a great first step for Microsoft.

Yahoo! has published its YUI JavaScript library and related tools (including the library’s website!) to GitHub at http://github.com/yui. The repository includes the 2.x and 3.x source trees, along with a few tools like builder and yuicompressor for customizing and compressing YUI builds, respectively. Using these tools (or their online counterparts) shortens your load times when using YUI.

Yahoo’s GitHub account also houses a PHP loader for the YUI files and a documentation tool for use with YUI. The documentation tool is written in Python and generates a fresh set of API documentation for your local consumption. It’s an interesting bit of code to look at for ideas about parsing and extracting information from source code and its comments (especially bin/yuidoc_parse.py).

Even Internet portal Digg has gotten in on the action at http://github.com/digg, releasing an array of PHP tools. One of the more interesting is its pattern project, which is a collection of design patterns implemented in PHP as a PEAR package. One of the more useful patterns included is the Observer pattern. This pattern makes it dead simple to attach smart event handling to a class, and Digg’s implementation is one of the best around.
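For readers who haven’t met the Observer pattern, here is a minimal, generic sketch of the idea in Ruby. It is not Digg’s PHP implementation and does not mirror its API. Every name in it (Observable, attach, notify, Story, :dugg) is hypothetical and chosen only for illustration.

```ruby
# Generic Observer sketch. This is NOT Digg's code; all names are
# hypothetical and exist only to illustrate the pattern.
module Observable
  # Register a block to run whenever the named event fires.
  def attach(event, &handler)
    observers[event] << handler
    self
  end

  # Call every handler registered for the event, in the order attached.
  def notify(event, *payload)
    observers[event].each { |handler| handler.call(self, *payload) }
  end

  private

  # Lazily built map of event name => list of handlers.
  def observers
    @observers ||= Hash.new { |hash, key| hash[key] = [] }
  end
end

# A made-up subject class that announces each vote it receives.
class Story
  include Observable

  attr_reader :title, :diggs

  def initialize(title)
    @title = title
    @diggs = 0
  end

  def digg!
    @diggs += 1
    notify(:dugg, @diggs)
  end
end
```

A caller would attach a block with story.attach(:dugg) { |story, count| ... } and then call story.digg!; the Story class never needs to know who is listening, which is the point of the pattern.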

Some of Digg’s other projects include Butler, a tool for fetching PHP information and managing Alternative PHP Cache (APC), and Digg UI, a JavaScript library that adds classes to jQuery.

A few lesser-known companies have also tossed their hats into the Github ring:

  • Strangely enough, EMI Music has a Github repository (http://github.com/emi). It only has one project in it, but it’s a pretty interesting one named bixo, a Java library/tool for creating customized web crawlers.
  • Another surprising company repository, albeit a little less known, is Neuros. Neuros produces a neat little set-top box for your TV that brings a lot of extra, interesting features to your set through third-party apps. The company has open sourced much of its development tool chain and application software at http://github.com/neuros.
  • Six Apart, purveyors of Movable Type and TypePad, have published some of their Movable Type-related code to http://github.com/sixapart. It’s a small collection of plugins such as a feed-to-entry parser, a tidylib binding, and a few others related to testing and maintenance. I’ve added them to my watch list to see if one day they see the light and publish all their code on GitHub, but until then you can enjoy this small collection.

Come In and Browse!

If you’re at all interested in contributing to open source, shop around on Github. A lot of larger projects (including Perl, GHC Haskell, JRuby, Ruby on Rails, jQuery, and Apache Commons) are already there, ready and waiting.

Indeed, there is a lot of code on Github. If you want a map, stay tuned here. This column will highlight projects, enhancements, and other Github tidbits each and every Friday.

Does your company publish code to Github? Is it really mind blowing (or at least a little interesting)? Then send me a Github message at jeremymcanally, and I’ll check it out. Who knows? It might end up here.

Comments on "This Week on Github: In Good Company"


Hi Jeremy,

Thanks for the ref to the Bixo project at GitHub. One comment – in addition to EMI Music, Bixo is also sponsored by Share This. They’ll be using it to create a searchable index of shared pages, and for data mining the web.

– Ken

