Hype is a good thing
While I was listening to the Ruby on Rails podcast, one of the guests mentioned that one reason they chose Rails for their project was its popularity. Now, social news sites are weird. Browsing reddit and digg, you get the impression that popularity and hype, in the context of programming languages, concepts, and libraries, are bad things. I never really understood why popularity is seen as a negative, especially in an area where community is important.
One of the enormous benefits of popularity, especially in the programming world, is a large, active community of early adopters. Given an unpopular language and a popular language, the popular one will have more libraries, more support, and more experimentation. As long as those early adopters are hammering on the language, it will improve quickly and give its programmers the resources they need to get work done.
Take two imaginary languages, Blub and Frob. They are fundamentally pretty equivalent, except Blub uses tabs instead of spaces and Frob has an awesome emacs mode but sucks with vi. Blub is way more popular (annoyingly so) than Frob in the blog circuit, but Frob has a better implementation and runtime than Blub. These differences, although minor, make for heated discussions among a team that is interested in the productivity benefits either language will give them.
I'll pick Blub every time, because when I run into an esoteric error message, I'll find ten blog posts on Google explaining the problem and its workarounds, instead of half a blog post saying, "I ran into this problem, does anyone have a solution?" (with no comments, of course). When I need a library to interface with our new caching service, I'll have tons to choose from (and modify), instead of one that I may or may not like using. With so many eyes on Blub and on how it compares to Frob, there will be people working to mitigate and solve the issues that make it harder to work in Blub than in Frob.
Even though there will be dedicated, hardworking, genius programmers working in Frob, they probably won't be able to produce as much help and resources as the early adopters that are jumping on the Blub bandwagon, documenting their experiences, writing code to experiment with features, and making Blub better and easier to use.
Of course, popularity does not necessarily make a language better -- look at Java for one great example of a popular language that I never want to touch again. Popularity as a language "feature" makes the most sense with up and coming languages. Hyped languages also attract people who will just as easily flock to the "next big thing," reducing the pool of contributing users that give it this advantage in the first place, and attracting a large community will often leave the average quality of work lacking. But as annoying as an overhyped language can be, it's often a good reason to pick a language over other, mostly equivalent languages when it comes to building on the work of others.
Bootstrapping content with Hpricot
On my latest project, I discovered I had to pre-populate the project's database with existing content. Jon Udell just posted about how much of a waste of time this can be in some circumstances, but in this case, Hpricot and database migrations made it easy. This wouldn't be a solution I'd use if I needed the data as anything beyond a one-off bootstrap, but in this case it worked really well.
Hpricot, for those who don't know, is an HTML parser for Ruby that's fun to use. When I was first learning Ruby, most of the simplest yet useful projects I could come up with used Hpricot to grab content off of websites and format or combine it in different ways. Its syntax looks like this:
require 'hpricot'
require 'open-uri'
uri = URI.parse(link)
doc = Hpricot(open(uri))
name = (doc/"li.active a").inner_html
page_title = (doc/"title").inner_html
body = (doc/"#content_body").html
In this example, Hpricot uses CSS selectors to grab different pieces of content out of the page at link. The nice thing about CSS selectors here is that the code tends to be less fragile than screen scrapers that depend on the exact structure of the page.
Page scraping can be a frustrating art, especially if the page layout changes or if pages are inconsistent, or have unique properties. Luckily, in this case, I only had to get it right once, and even then, I didn't have to get it completely right. I used this four-stage process:
- Use Hpricot to get as much data off the page and into our data structures as possible.
- Persist this data to the database, and make appropriate changes that Hpricot missed, or couldn't catch.
- Dump the database to a file, and use it to bootstrap our production database.
- Repeat until finished.
Rails database migrations made this relatively easy. I ended up with three migrations. The first migration created the structure of the database. The second loaded the current page data dump from the dump file. The third grabbed a few pages I still needed to parse, and I was left with data that I could tweak and dump, overwriting it with a dump containing all the page data (including the stuff I just tweaked). I could then blow away the database and repeat until I didn't have any more pages to parse.
This worked perfectly, since I didn't have to spend time getting my Hpricot parsing perfect (since I could modify the resulting data using our CMS and re-dump), and I was left with a dump of all the data that I needed in order to dynamically generate these formerly mostly static pages.
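The scrape, tweak, and re-dump loop described above can be sketched without Rails at all. This is only a rough illustration, assuming the dump is a YAML file keyed by page name; the helper names (load_dump, merge_scraped, save_dump, bootstrap_round) are hypothetical and are not taken from the original migrations.

```ruby
require 'yaml'

# Hypothetical helpers: none of these names come from the original project.

# Load the current dump, or start with an empty one on the first round.
def load_dump(path)
  File.exist?(path) ? YAML.load_file(path) : {}
end

# Merge freshly scraped pages into the dump without clobbering hand edits:
# a page that is already in the dump wins over a re-scraped copy.
def merge_scraped(dump, scraped)
  scraped.merge(dump)
end

# Write the combined data back out so the next round can start from it.
def save_dump(path, pages)
  File.write(path, pages.to_yaml)
end

# One round of the cycle: load, merge in newly scraped pages, dump again.
def bootstrap_round(path, scraped)
  pages = merge_scraped(load_dump(path), scraped)
  save_dump(path, pages)
  pages
end
```

The design choice worth noting is that existing entries take priority, so anything corrected by hand between rounds survives the next scrape.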
Sidebars are better than components
This article made it across my RSS reader today. I ran into my own problem with this while writing a custom CMS for work. We wanted to have reusable components that could be added to CMS pages, which could take various parameters, could be cached, and could be viewed in different ways given a size. I investigated Rails components at work, but noticed that using those is discouraged by the Rails community.
My investigation brought me to Typo’s sidebar model, which I used as the basis for the model we ended up using for the prototype of the project. The ultra-simplified version of the model works like this:
We have a Sidebar base class, which inherits from ActiveRecord::Base. Sidebars inherit from this Sidebar class.
Which gives us something like this:
class Sidebar < ActiveRecord::Base
  # Per-instance settings live in a serialized config column.
  serialize :config
  class << self
    # Parameters declared on this sidebar class.
    def params
      @params ||= []
    end
    # Declare a parameter and define a reader and a writer backed by config.
    def param(name, type, options = {})
      params << options.merge({:name => name, :type => type})
      self.send(:define_method, name) do
        self.config[name] || options[:default]
      end
      self.send(:define_method, "#{name}=") do |value|
        self.config[name] = value
      end
    end
  end
end
class StaticTextSidebar < Sidebar
  param :content, :text, :default => "Hello, World!"
end
So now we have a way of defining sidebars and their parameters. The metaprogramming in the Sidebar base class allows us to programmatically query the parameters declared in a Sidebar. This will be important later. For now, we still need to declare the view of a sidebar, so we do it in _static_text_sidebar.rhtml:
<%= sidebar.content %>
Now, we add a helper to application_helper.rb to render the sidebar:
def render_sidebar(sidebar)
  render :partial => sidebar.class.name.underscore, :locals => { :sidebar => sidebar }
end
Then we can call render_sidebar in any of our views on an instance of a sidebar to render it. It’s not perfect, but it’s good enough for a prototype!
From here, we have a very basic reusable model-view framework that we can include in any of our pages. Sidebar instances can be associated with content on a page to be displayed, and their configuration can be serialized to the database along with the items they display.
Creating and configuring sidebars can be done programmatically, by generating a form based on the parameters a sidebar takes and placing that form data into the sidebar, the same way one would with a standard ActiveRecord object. Their parameters can be validated using standard Rails validations and the result of the render_sidebar call can be cached.
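To make the form-generation idea concrete, here is a minimal plain-Ruby sketch. It assumes no ActiveRecord: config is an ordinary Hash, and the class names (SketchSidebar, SketchTextSidebar), the form_fields and update_from methods, and the markup are all hypothetical stand-ins rather than the code from the actual CMS.

```ruby
# Plain-Ruby stand-in for the Sidebar model. There is no ActiveRecord here,
# and config is an ordinary Hash instead of a serialized column.
class SketchSidebar
  class << self
    # Parameters declared on this particular sidebar class.
    def params
      @params ||= []
    end

    # Same idea as the param macro above: record the declaration, then
    # define a reader and a writer backed by the config hash.
    def param(name, type, options = {})
      params << options.merge(name: name, type: type)
      define_method(name) { config[name] || options[:default] }
      define_method("#{name}=") { |value| config[name] = value }
    end
  end

  attr_reader :config

  def initialize(config = {})
    @config = config
  end

  # Build one form field per declared parameter. The markup is hypothetical,
  # and values are not HTML-escaped in this sketch.
  def form_fields
    self.class.params.map do |p|
      value = public_send(p[:name])
      if p[:type] == :text
        %(<textarea name="#{p[:name]}">#{value}</textarea>)
      else
        %(<input type="text" name="#{p[:name]}" value="#{value}">)
      end
    end
  end

  # Copy submitted form data back into the sidebar, ignoring unknown keys.
  def update_from(form_data)
    self.class.params.each do |p|
      key = p[:name]
      public_send("#{key}=", form_data[key]) if form_data.key?(key)
    end
    self
  end
end

class SketchTextSidebar < SketchSidebar
  param :content, :text, default: 'Hello, World!'
  param :title, :string
end
```

Because the declared parameters are queryable, the same list drives both the generated form and the whitelist of keys accepted back from it.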
This basic idea, with a little bit of work, can easily form the basis for a simple reusable component architecture, and we’ve been having a ton of success with it so far.
Leopard Prompt
One thing that bugged me after upgrading to Leopard was the prompt in the terminal. It only showed the last part of the path, and I wanted to see the whole thing. I found the solution on a tumblog, so here's the prompt I'm using now:
PS1='\e[32m\u@\h\e[0m:\e[34m\w\e[0m$ '
I put this line in ~/.profile, and that brought things back to normal!
Well, partly... unfortunately, when I try this, after getting to about column 80 on the first line of text, the text starts wrapping over the beginning of the line. I'll probably have to read up on prompt settings and figure out what's causing all this to happen.
No-Impact Testing
For my day job, I work on Legacy Code. As legacy code, it needs unit tests. Unfortunately, this particular legacy code is used by millions of people on a regular basis, precluding refactoring to make unit testing easier. I needed a way of testing our code without changing any of the code that was put into production. Luckily, I discovered a way of exploiting the C/C++ linker to allow me to stub out our dependencies, which I'll talk about here.
For me, the hardest part of writing C/C++ code is managing dependencies. It's very easy to get locked into a design due to dependencies that will make life harder later. The code that I've been working on is over ten years old in places, and wasn't written with dependency management in mind. This means that most of the functionality of the code calls APIs that I have no control over and I don't want to run in my unit tests, because they do things that change the state of the underlying system. Luckily for me, the C/C++ linker that I've cursed on many an occasion has become a useful tool for softening these dependencies. I'm sorry for doubting you, linker.
The current architecture of the system looks like this:

I wanted it to look something like this:

In order to control both ends of the system from my tests, however, I needed to override the methods that my code calls that are exposed by the underlying layer that I have no control over. Of course, in the Software Engineering world, the only problem that can't be solved by another layer of abstraction is performance, so that's what I did -- created a proxy layer as a library that re-implemented each of the methods exposed by the underlying layer. This library would forward these calls to a global instance of a proxy class with stub functionality that I could override in my test code. This looked something like this:

Ruby came to the rescue here, as it tends to do, allowing me to write a quick and dirty (ok... filthy) script to parse the header file of the dependency and generate these classes for me. This saved me the trouble of doing it by hand, which would make me cry. I took these generated files and generated a .dll and a .lib, which I statically linked into my test code, making sure to link it before the real .lib I depended on.
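The original script is not shown, so what follows is only a guess at the general shape of such a generator. It assumes the header contains simple one-line C prototypes, and it emits forwarding functions that call through a global proxy pointer. The names g_proxy, parse_prototypes, and forwarding_stub are hypothetical, and a real header would need far more careful parsing than this regular expression provides.

```ruby
# Matches simple one-line prototypes such as
#   int OpenDevice(const char* name, int flags);
# This is an assumption about the header's style, not a real C parser.
PROTOTYPE = /^\s*([\w\s*]+?)\s*\b(\w+)\s*\(([^)]*)\)\s*;/

# Pull the return type, function name, and argument list out of each prototype.
def parse_prototypes(header_text)
  header_text.scan(PROTOTYPE).map do |ret, name, args|
    args = args.strip
    arg_list = (args.empty? || args == 'void') ? [] : args.split(',').map(&:strip)
    { ret: ret.strip, name: name, args: arg_list }
  end
end

# Emit a C++ function with the same signature that forwards the call to a
# global proxy object (g_proxy is a hypothetical name).
def forwarding_stub(proto, proxy: 'g_proxy')
  arg_names = proto[:args].map { |arg| arg[/\w+\z/] }
  call = "#{proxy}->#{proto[:name]}(#{arg_names.join(', ')})"
  body = proto[:ret] == 'void' ? "#{call};" : "return #{call};"
  "#{proto[:ret]} #{proto[:name]}(#{proto[:args].join(', ')}) {\n    #{body}\n}\n"
end
```

Feeding each parsed prototype through forwarding_stub and writing the results to a source file would give the body of the proxy library; the matching proxy class with overridable stubs could be generated from the same parsed list.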
As you can see, I could now pinch my production code between my test code, enabling me to control what goes in and look at what comes out. The best part of all of this is that it didn't require a single change to the code that goes into production, allowing me to cover methods with tests before refactoring. For easily broken code that is used by an enormous population, this is a huge deal. But it's not the end.
I've heard many experienced unit testers say that the most difficult places to test are the "Ends of the World..." that is, upstream and downstream dependencies. The most common way I've seen people avoid this is by writing proxy objects that isolate these dependencies from the code. This leads to an architecture that looks like this:

This leads to a few benefits. First, your interface to the dependent code is written by you, so you can expose only the functionality that you actually use in a way that you like. Second, when your dependency breaks you, you have a simple test case that you can toss at them, which avoids the communication problems that plague large, distributed teams. Finally, you can subclass and override this proxy, allowing the injection of test code on both ends of the code. That's a much easier way of doing what I described above, and doesn't require any build hackery. It's the route I'm planning on taking once I get good enough test coverage that I feel comfortable making wild, invasive surgery on the code.
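The subclass-and-override idea is not specific to C++, so here it is sketched in Ruby for brevity. Every name in it (StorageProxy, FakeStorage, Exporter) is hypothetical; the point is only that production code talks to a proxy you own, and tests hand it a subclass that records calls instead of touching the real dependency.

```ruby
# Hypothetical proxy: the only place that would talk to the real dependency.
class StorageProxy
  def write(key, value)
    raise NotImplementedError, 'would call the real downstream API here'
  end
end

# Test double: overrides the proxy method and records what was written.
class FakeStorage < StorageProxy
  attr_reader :writes

  def initialize
    @writes = []
  end

  def write(key, value)
    @writes << [key, value]
    true
  end
end

# Code under test: depends only on the proxy interface.
class Exporter
  def initialize(storage)
    @storage = storage
  end

  # Upcases each value and writes it out; returns the number of records.
  def export(records)
    records.each { |key, value| @storage.write(key, value.to_s.upcase) }
    records.size
  end
end
```

In a test, Exporter.new(FakeStorage.new) exercises the export logic while the fake's writes array shows exactly what would have gone downstream.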
Much love for Michael Feathers' book "Working Effectively With Legacy Code", which gave me the idea and the confidence that this could in fact be done.
Language Geek
I love programming languages. Learning a new one is one of my favorite computer-related activities to do. Right now my favorite is Ruby for reasons I'll get into at another time. There are a few others, though, that have caught my attention lately:
- Erlang: There's something really powerful about a language where remote procedure calls are built into the language. I also realized (again) how cool tail recursion can be. I don't have any ideas trying to get out of my head that would find Erlang useful now, but I really want to play with it more.
- Scala: Scala is an interesting language. It's strongly typed with type inference, which I find very cool and wish more languages would try. It also supports the actor model (like Erlang) and some really crazy generic programming constructs. To me it seems like magic. It also runs on the JVM, which means I get every library people can think of right off the bat. Score!
I also found the article by Stroustrup here to be really interesting. It's about 50 pages long, but well worth reading if you're interested in (or are forced to use) C++ at all.
Help from the compiler vs. Less code
Jeff Atwood's Coding Horror has recently become one of my favorite blogs (although he seems to be more of a Code Complete guy while I'm more of a Pragmatic Programmer guy).
One of the recent posts on the blog made the point that "the best code is no code at all." This is something I completely agree with, but I get into arguments on this topic pretty regularly. This particularly comes up when I discuss static vs. dynamic languages. When I talk about Ruby being my favorite language (for now), many people who come from a C++/C#/Java mindset wonder how correct the code I write can be if the compiler isn't doing any compile-time checks. After all, errors are much cheaper to catch early, and compile-time checking is really early, right?
In theory, they have a point, but in practice, it hasn't been a problem. I was thinking about this recently, and I think there are two main reasons I haven't had noticeably more bugs in my dynamic programs than I do in my static programs:
The first is what Jeff refers to in his blog: Dynamic languages, as a whole, require less code to be written. I don't care how "correct" the compiler thinks my code is, I am smart enough (in a way) to write broken code that the compiler can't catch. When the broken code is a single line hidden in a ton of boilerplate C++/C#/Java style class/method/etc. definitions, it takes a little longer for me to wrap my head around what I need to do to fix the problems. In Ruby, I find that my mistakes are much easier to catch and fix, because my code _is_ my intent. Part of it is because less code = less room to make mistakes, and the other part is less code = easier to find the mistakes I do make.
The other reason, which branches off from the first, is that dynamic programs are (again, in my experience) easier to test. When you can torture every object in the system at runtime, it becomes far easier to mock, isolate, and test the exact functionality that exists in a part of the program. Not only does this catch most of the dumb mistakes that compile-time checks would flag, it also catches the logic errors that the compiler could never tell you about.
This leads me to an interesting fact that I just recently discovered: Not only does the code I write in C++/C# not have fewer (discovered) errors than my Ruby code, my Ruby code is, on the whole, easier to fix when things do go wrong. Less code provides a lot of benefits when it comes to agility, avoiding needless multitasking, and verifying code correctness, and that's one of the many reasons I've been moving much more toward dynamic languages lately (and have been a much happier developer for it).
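As a small illustration of what reshaping an object at runtime looks like in a test, here is a sketch that replaces one method on a single instance. The Mailer and Signup classes and the recording_mailer helper are made up for the example; only define_singleton_method is standard Ruby.

```ruby
# Hypothetical collaborator that we do not want to run for real in a test.
class Mailer
  def deliver(to, body)
    raise 'would talk to a real mail server'
  end
end

# Hypothetical code under test.
class Signup
  def initialize(mailer)
    @mailer = mailer
  end

  # Rejects addresses without an @, otherwise sends a welcome message.
  def register(email)
    return false unless email.include?('@')
    @mailer.deliver(email, 'Welcome!')
    true
  end
end

# Build a real Mailer, then swap out deliver on that one instance so it
# records its arguments instead of sending anything.
def recording_mailer
  sent = []
  mailer = Mailer.new
  mailer.define_singleton_method(:deliver) { |to, body| sent << [to, body] }
  [mailer, sent]
end
```

No interface, subclass, or build change was needed to isolate Signup; the stub exists only on the one object the test created.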