Student Question – How Does JavaScript Compare Strings?

Overview

In this article I answer a student question that I got in my Beginning Programming with JavaScript course. If you have feedback – please let me know via Twitter (@eli4d).

Regarding tools used – I am using JS Bin. If you’re unfamiliar with this tool – check out http://jsbin.com/help/what-is-jsbin. The purpose of this article is both to answer the question and to show an approach to finding the answer.

A Quick Reminder about JavaScript Logic Club

“The first rule of JavaScript Logic Club is that you don’t talk about JavaScript Logic Club. The second rule of JavaScript Logic Club is that you don’t talk about JavaScript Logic Club.” 🙂

I know…what does the above quote even mean? I constantly get students with an exasperated expression, frown, and comment that goes something like “but that doesn’t make sense” when it comes to the rules of JavaScript and how it works. In my slightly snarky moods I answer with “when you design your own language that is adopted by all web browsers over a 20 year period, then feel free to make it any way you want” or “well – if you have a time machine, you can go back to 1995 and knock out Brendan Eich for those crucial 10 days, take over, and design your own language – the one that makes sense…I’ll wait for a minute while you do that.”

All snarkiness aside, programming languages are just like board games. Maybe you don’t like the rules of Pandemic, and that’s completely fine. However, your rules of the game won’t be the “standard” of this particular game unless there is lots of adoption and acceptance by other people. There is a vast graveyard of board games that never became popular in the same way that there is a graveyard of programming languages that never took off.

So asking for JavaScript to fit your rules and what makes sense to you is nonsensical. If JavaScript rubs you the wrong way, you can always go to something else that converts to JavaScript (Elm and TypeScript come to mind).

The Question

So the question that I got from a student was the following:

Small question on comparing strings. Ordering of strings is based on Unicode, got it. If the first letter is equal, are the strings equal? Or does it go letter by letter until it finds inequality? For example, is “made” i and therefore “made” > “maid” ?

The Answer, or Better Said, an Answer

One’s initial instinct may be to go and duckduckgo around for an answer. This is a fine approach, but in the case of JavaScript there is much more misinformation than clear information because of the age and popularity of the language. For this article I will refer to Nicholas Zakas’s “Professional JavaScript for Web Developers” (3rd Edition). It’s a thorough reference book that is more approachable than some other heavy-duty books (in my opinion).

Note: This information is related to ES5 though the comparison operator will work the same in ES6.

Zakas on String Comparisons

The following excerpt comes from Chapter 3.

So the gist here is:

  • Don’t use your human logic to associate capital letters as being “bigger” than lowercase letters
  • All strings boil down to character codes
  • Comparison starts with the first letter of each string and goes from there (so there’s no cumulative addition of strings on each side)
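
As a quick illustration of the first two points, here is a minimal sketch. The sample strings and variable names are my own (they are not from Zakas's book); the idea is simply that every uppercase ASCII letter has a lower character code than any lowercase letter.

```javascript
// Uppercase ASCII letters (codes 65-90) come before lowercase ones (codes 97-122).
const upperZ = "Z".charCodeAt(0); // 90
const lowerA = "a".charCodeAt(0); // 97

// So a capitalized word counts as "less than" a lowercase one,
// no matter where the letters sit in the alphabet.
const zebraFirst = "Zebra" < "apple"; // true, because 90 < 97

// One common workaround: compare both sides in the same case.
const sameCase = "Zebra".toLowerCase() < "apple".toLowerCase(); // false

console.log(upperZ, lowerA, zebraFirst, sameCase); // 90 97 true false
```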

Let’s look at "made" < "maid" in terms of what JavaScript gives us, and then we can figure out the character codes to see if they correspond to Zakas's explanation.

How do we figure out a string's character code?

Looking for 'character code' in Zakas's book leads to the string method charCodeAt().

What exactly does charCodeAt do?

Off to Mozilla Developer Network (MDN): https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/charCodeAt

Interesting – charCodeAt deals with the part of Unicode that fits in a single UTF-16 code unit (before things get a bit more complicated). Recall that the student had assumed Unicode in his question.
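
To make the UTF-16 point concrete, here is a small sketch. The emoji example is my own choice and is written with escape sequences so that it does not depend on how the file is encoded. It hints at where things get more complicated: a character outside the 16-bit range takes two code units, and charCodeAt only ever returns one of them.

```javascript
// A character inside the 16-bit range is a single UTF-16 code unit.
const letter = "a";
const letterUnits = letter.length;       // 1
const letterCode = letter.charCodeAt(0); // 97

// U+1F600 (a grinning face emoji) does not fit in 16 bits,
// so the string holds two code units (a surrogate pair).
const emoji = "\uD83D\uDE00";
const emojiUnits = emoji.length;         // 2
const firstUnit = emoji.charCodeAt(0);   // 55357 (0xD83D)
const secondUnit = emoji.charCodeAt(1);  // 56832 (0xDE00)

console.log(letterUnits, letterCode, emojiUnits, firstUnit, secondUnit);
```

For the plain lowercase letters in the student's question, every character is a single code unit, so none of this complication comes into play.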

Let's go to JS Bin

We're going to use JS Bin to look at the student's original question about "made" < "maid" and at Zakas's explanation. We'll just use the 'JavaScript' and 'Console' tabs.

JS Bin – 1

So according to JavaScript "made" is less than "maid". This is the truth based on JavaScript's rules.

If we evaluate according to Zakas, then the comparison is decided at 'd' and 'i', since 'm' and 'a' are the same on both the left and right sides of the expression.
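
A minimal version of what could go into the 'JavaScript' tab might look like this (the variable names are just for illustration):

```javascript
// Ask JavaScript directly: is "made" less than "maid"?
const madeLessThanMaid = "made" < "maid";
console.log(madeLessThanMaid); // true

// And the reverse question, for completeness.
const maidLessThanMade = "maid" < "made";
console.log(maidLessThanMade); // false
```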

JS Bin – 2

charCodeAt provides the UTF-16 code unit at a specific index of a string, and the index defaults to 0. So giving it a one-character string with or without an index results in the same thing.

Now in the next steps we could put in the full string (i.e. "made") and then pick a specific index, but I'd rather keep it simple and have a laser focus on what we're trying to answer.
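
Here is a short sketch of both options, with variable names of my own choosing. It shows that the index is optional for a one-character string, and what the full-string alternative would look like:

```javascript
// With a one-character string the index makes no difference,
// because charCodeAt uses index 0 when none is given.
const withoutIndex = "d".charCodeAt();
const withIndex = "d".charCodeAt(0);

// The alternative: use the full word and pick the position (counting from 0).
const fromFullWord = "made".charCodeAt(2); // the third character of "made" is "d"

console.log(withoutIndex, withIndex, fromFullWord); // 100 100 100
```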

JS Bin – 3

Comparing the differing letters for each word – we can see that in the first set ("d" and "i") 100 is less than 105, so up to this point the answer would be true when asking the question 'is "made" less than "maid"?'

For the last letters ("e" and "d") the answer is still the same even though "e" has a higher value than "d". What's going on here? Well, the comparison stops at the previous set of letters, the first pair that differs, so this last comparison has no effect.
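
The reasoning above can be checked with a few lines. This is only a sketch: the variable names and the extra "madz" and "maia" pair are made up for illustration.

```javascript
// Character codes for the letters where the two words differ.
const d = "d".charCodeAt(0); // 100
const i = "i".charCodeAt(0); // 105
const e = "e".charCodeAt(0); // 101

// Third letters: "d" (from "made") versus "i" (from "maid").
const thirdLetters = d < i;  // true, and the result is decided here

// Fourth letters: "e" (from "made") versus "d" (from "maid").
const fourthLetters = e < d; // false, but this pair is never consulted

// An exaggerated pair to show that the later letters do not matter:
// "z" is far greater than "a", yet the result is still true.
const exaggerated = "madz" < "maia";

console.log(d, i, e, thirdLetters, fourthLetters, exaggerated);
```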

JS Bin – 4

What about "mad" versus "made"? Is "mad" less than "made"?

JavaScript indicates that it considers this to be true. Clearly the first 3 letters on each side of the comparison operator are exactly equal to each other. The only difference is the number of characters. So because "mad" runs out of characters first (it has fewer characters than "made"), it is 'less' in terms of the comparison.
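
A small sketch of this case follows. The variable names are mine, and the empty-string line is an extra example of the same rule:

```javascript
// "mad" matches the first three characters of "made" and then runs out.
const madLessThanMade = "mad" < "made"; // true
const madeLessThanMad = "made" < "mad"; // false

// There is no fourth character in "mad" to compare against "e":
// asking for it returns NaN rather than a character code.
const missingCode = "mad".charCodeAt(3);

// The same rule at its extreme: the empty string is less than any non-empty string.
const emptyIsLess = "" < "a"; // true

console.log(madLessThanMade, madeLessThanMad, missingCode, emptyIsLess);
```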

Conclusion

You can find the JS Bin at http://jsbin.com/gopacuv/edit?js,console

We could do much more at this point. For example, we could create a function that gets two strings, then iterates through each one comparing the character codes for each letter and returning the less-than comparison based on this evaluation. We could see if the String's localeCompare would do a better job in terms of character by character comparison. We could do lots of things, but the goal of this article was to explain string comparison, and this has been done.
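
For the curious, the function idea mentioned above could be sketched roughly like this. The name isLessThan is hypothetical (it is not a built-in), and this is only my approximation of the behavior described in this article, not the engine's actual implementation:

```javascript
// Hypothetical helper: returns true when `left` sorts before `right`,
// comparing UTF-16 code units one position at a time.
function isLessThan(left, right) {
  const shared = Math.min(left.length, right.length);

  for (let index = 0; index < shared; index++) {
    const leftCode = left.charCodeAt(index);
    const rightCode = right.charCodeAt(index);

    // The first differing pair decides the result; later letters are ignored.
    if (leftCode !== rightCode) {
      return leftCode < rightCode;
    }
  }

  // Every shared position matched, so the shorter string (if any) is "less".
  return left.length < right.length;
}

console.log(isLessThan("made", "maid")); // true
console.log(isLessThan("mad", "made"));  // true
console.log(isLessThan("made", "made")); // false
```

For the examples in this article it agrees with the built-in < operator, which is a reasonable sanity check on the explanation.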

Keep your chin up and enjoy JavaScript for what it is rather than for what you might like it to be.