Episode 14 of the “Under the Radar” podcast covered the specifics of how best to architect a back-end service for your mobile app, web service, web application, and so on. It’s a follow-up to a previous episode (https://www.relay.fm/radar/13) about the Parse shutdown and the potentially high cost of external dependencies. The part of the conversation that really caught my ear was around 09:15, and it contained the following interesting approach:
“What you want most of all when choosing server software – if you don’t want to be administering and tweaking your server constantly – what you want is old, boring, and popular. Those 3 things – old, boring, and popular. New and trendy does not always mean better.”
A word of thanks
I’ve approached this by looking for numbers and meaning at github.com and libraries.io. Obviously non-github.com projects (like the Apache web server) cannot be looked at in this way because the direct stats aren’t there.
Special thanks goes out to:
- Marco and David for the content of their podcast and the BOP idea/approach
- Rachel Berry from GitHub for answering my questions about the best way to interpret GitHub statistics
- Andrew Nesbitt from Libraries.io for answering my incessant questions about Libraries.io’s statistics
Note that I discovered libraries.io through the amazing Changelog podcast (episode 188). If you’re looking for a tool that will help you figure out your open source compliance (as well as many other things) – check out Libraries.io’s services (I would suggest that you listen to the Changelog podcast to get a clear understanding of Libraries.io’s value).
Let’s break this down
If you’re new to this, the first question is: where do you begin?
I think the place to start is to find some categories that are related to back-end technologies. After all, there’s no point in comparing Linux (an operating system) to Ruby on Rails (a web framework).
Two sources that seem interesting in terms of such categories are:
GitHub’s showcases page
In terms of back-end technologies (i.e. server side software) that are shown on the showcases pages the following areas seem more relevant:
- Web application frameworks
- Programming languages
- Open Source Operating Systems
- Projects that power GitHub (i.e. seeing the components that run a huge enterprise like GitHub – some of these components will likely fit the BOP model; some of course will not, since GitHub can afford to hire devs for very niche and young projects)
Note: The image below is an aggregation of the 3 pages of this showcase, and the “Search showcases” field is great for finding a category for a specific project.
Libraries.io main page
Libraries.io has lots of different ways to look for projects. The keyword section at the bottom seems quite interesting.
Boring, Old, Popular: What does ‘Old’ mean?
While I initially wanted to start with ‘Boring’ because BOP starts with it (and BOP is memorable), I realized that the better way was to start with the property that is easiest to figure out, or at least something that seemed easier.
What does ‘old’ mean in terms of software? Is 2-year-old software ‘old’, or does it take 10 years before software counts as ‘old’? (In this post, ‘software’ means ‘open source project’.)
The definitive answer is “it depends” but that doesn’t help much. I think the better question is “is this piece of software ‘old’ within its category?” In the following examples, we’ll look at the web applications framework showcase on GitHub.
Rails is 12 years old…that’s definitely old – isn’t it?
Express is 6 years old
Laravel is 5 years old…so what gives?
Meteor is 5 years old….but is that old?
What about the age of the Internet?
Good lord – that depends on your definition. Is it starting from the 1950s when computers were more widely used by governments and universities?
If I’m going to pick a number – I’m going to use HTTP as my criterion, so: 2016 – 1989 = 27 years.
Damn it – what is ‘old’?
I was tempted to use log2 to help figure out the numbers (because logarithms are COOL), but then I thought about what it means to be ‘old’ as an adult and used that to figure out the ages of adolescence, young adulthood, middle age, and old age. Here’s an imperfect attempt at figuring this out (I use a percentage of LEB, life expectancy at birth, to indicate the range of each age stage).
Note that I’m using Soulver for these calculations (the best-est ‘human’ usable spreadsheet program out there).
So if I use the age of the Internet as 27
Umm…this is a bit of a chicken and egg thing in terms of current technology and the origin of technology.
Let’s make InternetLEB 16
I definitely feel that Rails is ‘old’. What if I take 16 as the InternetLEB, i.e. count from the year 2000? 2000 seems like the ‘right’ year for Web 1.5/2.0 – doesn’t it?
This makes more sense to me, but you can pick whatever InternetLEB works for you. So here’s a criterion for judging the age of a project. Based on the Marco/David criteria, you would want a project that is in the middle-age to old-age range. That is the definition that I’m picking for the ‘Old’ part of the BOP criteria.
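The age arithmetic above can be sketched in a few lines of Python. This is only an illustration of the idea, not the original Soulver sheet: the function name `age_factor` is made up for this sketch, and the stage boundaries (adolescence, middle age, and so on) are left out because they depend on the percentages you choose.

```python
def age_factor(age_years: float, baseline_leb: float) -> float:
    """Return a project's age as a percentage of the chosen InternetLEB."""
    if baseline_leb <= 0:
        raise ValueError("baseline_leb must be positive")
    return 100.0 * age_years / baseline_leb


# Rails is 12 years old at the time of this post (2016).
# Compare the two baselines discussed above.
baselines = (
    ("HTTP baseline (2016 - 1989)", 27),
    ("Web 1.5/2.0 baseline (2016 - 2000)", 16),
)
for label, leb in baselines:
    print(f"Rails, {label}: {age_factor(12, leb):.0f}% of InternetLEB")
```

With a baseline of 27, Rails only reaches about 44% of the InternetLEB, which does not match the gut feeling that Rails is ‘old’. With a baseline of 16 it sits at 75%.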
Boring, Old, Popular: What does ‘Boring’ mean?
Stepping back for a second to the Under the Radar episode about this whole BOP criteria, the discussion centers around backend software. Software that resides on the server, software that is supposed to be rock steady so you don’t have to worry about your web site or web service falling down on its face on a frequent basis. So we’re talking ‘boring’ in this context, not ‘boring’ as in “uninteresting and tiresome; dull.”
Still, what’s a better definition in this context?
My definition for this is “software that has clarity in terms of usage and is used in many projects because of this clarity”. To me ‘clarity’ refers to a couple of things:
- how it is used in the context of application/service (i.e. well defined use)
- used by many others, which in turn leads to clarity in terms of direct documentation or indirect documentation (i.e. stack overflow answers that add up to common and clear usage practices)
Now in terms of hard numbers – I’m not sure how to define and discover ‘boring’ in terms of GitHub or libraries.io. The closest thing that I can think of is the “Dependent Repositories” number from Libraries.io’s SourceRank number (example shown for Rails). I was unclear about the difference between “Dependent Projects” and “Dependent Repositories” and I got the following clarification from Andrew Nesbitt:
Dependent repos and dependent projects are two separate things, for dependent projects of a rubygem, it’s the number of other projects that list that as a dependencies, for rails there are ~7940 other rubygems that depend on it: https://libraries.io/rubygems/rails/dependents
For dependent repos, it’s every Github repository that has rails listed as a dependency in it’s Gemfile or Gemfile.lock, which there are around 60,000: https://libraries.io/rubygems/rails/dependent-repositories
I asked Rachel Berry if there was anything equivalent on GitHub, and there didn’t seem to be anything directly equivalent. She suggested using code search to provide a rough statistic. So something like https://github.com/search?utf8=%E2%9C%93&q=gem+rails+path%3A%2F&type=Code&ref=searchresults or https://github.com/search?utf8=%E2%9C%93&q=%22gem+rails+5%22+path%3A%2F&type=Code&ref=searchresults could provide a possible alternative. The problem with this approach is that you need to know how a dependency is included and then deal with the many variations in inclusion strings (besides other issues, like different package managers for different software).
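If you want to try several inclusion strings, it helps to generate the search links rather than type them by hand. Here is a small Python sketch that rebuilds the two URLs above from their plain-text queries. The helper name `github_code_search_url` is made up for this example, and the parameter set simply mirrors the links in this post; GitHub’s search URLs may look different today.

```python
from urllib.parse import urlencode


def github_code_search_url(query: str) -> str:
    """Build a GitHub code search URL in the same shape as the links in this post."""
    # Parameter names and order are copied from the example links above.
    params = {
        "utf8": "✓",
        "q": query,
        "type": "Code",
        "ref": "searchresults",
    }
    # urlencode percent-encodes ':' and '/' and turns spaces into '+'.
    return "https://github.com/search?" + urlencode(params)


print(github_code_search_url("gem rails path:/"))
print(github_code_search_url('"gem rails 5" path:/'))
```

The same helper could be called with whatever inclusion string another package manager uses, but you still have to know that string up front, which is exactly the weakness of this approach.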
Overall, I don’t think there is any “hard” number that can easily capture the ‘boring’ criteria. I think that in this case ‘boring’ is really the result of looking at ‘old’ and ‘popular’. So instead of the BOP criteria it should perhaps be (B)OP or B/OP. Moving forward from this point – I’m going to go with (B)OP.
Boring, Old, Popular: What does ‘Popular’ mean?
I left the “best” for last – POPULARITY. What the heck is ‘popular’ when it comes to the BOP criteria?
Is popularity based on GitHub stars?
How useful are GitHub stars in evaluating popularity? They seem somewhat transient and unreliable for this criterion.
What about popularity based on GitHub forks?
Forks by their very nature are other people’s experimentation with a project. Of course there could be upstream contribution but how much of forks are actual contributions back to the project?
Forks seem like a way of learning and modifying a project’s code but I don’t think that they have anything to do with popularity.
What about project members?
So the “Members” graph is a visual representation of the Forks number (i.e. “members” of the fork network). It’s another view of forks, and therefore its ‘popularity’ usefulness is questionable.
What about a project’s contributors as a reflection of popularity?
I think that this is similar to forks – specific people being interested in a project for their own reasons.
Something that ‘trends’ is popular – isn’t it?
Something that is trending may reflect momentary popularity. But it is certainly in conflict with the ‘old’ and ‘boring’ criteria, so this is definitely not a good measure.
OK – I FOUND IT – I KNOW THE DEFINITION OF POPULAR!
Actually I don’t but I’ll take a run at it anyway.
I don’t know what’s popular or how best to evaluate popularity in terms of the BOP criteria. Maybe it’s one of those “I’ll know it when I see it” things. Still, that doesn’t help anyone who is new to backend software infrastructure. The best thing that I can come up with at this point is Libraries.io’s SourceRank number as a decent data point for popularity. Is it the best? Probably not. But I don’t see anything that’s better at this point.
Note: We need to keep in mind that log values are used in the creation of SourceRank, so a difference of 2 between the SourceRank numbers of two projects could be quite significant.
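To get a feel for why a small gap on a log scale matters, here is a rough Python sketch. It is not the SourceRank formula: SourceRank adds up several components, and the logarithm base used below is an assumption for illustration only. The function name `count_ratio` is made up for this example.

```python
import math


def count_ratio(score_difference: float, log_base: float = 10.0) -> float:
    """Ratio between two raw counts whose log-scaled scores differ by score_difference."""
    # If score = log_base(count), then count_a / count_b = log_base ** (score_a - score_b).
    return log_base ** score_difference


# The base is an assumption; try a few to see the range of possible meanings.
for base in (2.0, math.e, 10.0):
    ratio = count_ratio(2, base)
    print(f"log base {base:.2f}: a 2-point gap means about {ratio:.1f}x the underlying count")
```

Whatever the base, a 2-point gap on a log-scaled component means the underlying count differs by a multiple rather than by 2 units, which is why it should not be dismissed as a rounding error.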
(B)OP Comparison Example
So essentially – the (B)OP criteria boil down to the O and the P, since B falls under O or P – your choice.
- Old = age based on the previously mentioned age/stage criteria using the year 2000 as a baseline
- Popular = SourceRank at this point or using a GitHub source search if the project is unavailable on libraries.io
With the above in mind – let’s compare Rails and Express.
The (B)OP criteria for Rails
So for Rails we’re looking at:
- Old = 12 years with an age factor of 75%; so it’s at middle age, about to hit old age
- Popular = SourceRank of 28
The (B)OP criteria for Express
So for Express we’re looking at:
- Old = 6 years with an age factor of 44%; so it’s at middle age
- Popular = SourceRank of 26
Which to choose?
So in summary – make your back-end server and services the best they can be by choosing the most (B)OPish (boring, old, and popular) technology for the server-side level of your technology stack. This advice would seem to contradict the “I want to develop on the latest and greatest technology” mindset, but it is the best path to system administration sanity, and it takes nothing away from the fun part of your product, where you can still use the latest and greatest.
Some other resources that I came across
While researching and reflecting on this post I came across some resources that might be useful for those that are looking for ways to distinguish different projects (this is not limited to server side type of projects):
- GitHut (http://githut.info/): A really great visual display of languages that are used on GitHub