BookBytes

A book club for developers.

BookBytes is a fortnightly (or biweekly) book club for developers. Each episode, the hosts discuss part of a book they've been reading, and they also chat with authors about their books. The books are about development, design, ethics, history, and soft skills. Sometimes there are tangents (also known as footnotes).

Hosts

Adam Garrett-Harris

Jason Staten

Jen Luker

Safia Abdalla

14: The Imposter's Handbook: Data Structures and Simple Algorithms

9/3/2018

The hosts cover one algorithm each, after wrapping up the data structures chapter.

Transcript

(Intro music: Electro Swing)

0:00:13.8
Adam Garrett-Harris

Hello, and welcome to BookBytes, a book club podcast for developers, and this week we’re talking about 2 more chapters of “The Imposter’s Handbook: A CS Primer for Self-Taught Programmers” by Rob Conery; and this week we’re going to go over, we’re going to finish up the chapter on data structures and then talk about simple algorithms. I’m Adam Garrett-Harris.

0:00:35.5
Jen Luker

I’m Jen Luker.

0:00:37.2
Safia Abdalla

I’m Safia Abd-

0:00:37.9
Jason Staten

I’m Jason State- Oh!

0:00:39.3
Jason Staten, Safia Abdalla, & Jen Luker

(laughing)

0:00:40.7
Jason Staten

Go, Safia.

0:00:42.2
Safia Abdalla

I’m Safia Abdalla.

0:00:43.5
Jason Staten

I’m Jason Staten.

0:00:45.0
Adam Garrett-Harris

All right, do we want to do another “Book News”? ‘Cause I’ve got one.

0:00:48.5
Jason Staten

Let’s hear it!

0:00:48.9
Adam Garrett-Harris

All right. “Eloquent JavaScript”, the third edition is coming out. I don’t know if you all are familiar with it, but it’s the yellow book on JavaScript with the bird on it, like a peacock?

0:00:58.9
Safia Abdalla

Yeah.

0:00:59.7
Adam Garrett-Harris

So, it’s going to be updated with all of the new JavaScript stuff, and it’s always been free and you can already read it online for free, and then a paper edition is going to come out in October. So, that’s pretty exciting.

0:01:13.4
Jason Staten

Nice. I’ll have to check it out.

0:01:15.0
Adam Garrett-Harris

All right, so we need to finish data structures real quick.

(Typewriter dings)

0:01:19.6
Adam Garrett-Harris

What is a hash table?

0:01:21.6
Jen Luker

I’m like, flipping pages, where is the hash table page?

0:01:24.7
Adam Garrett-Harris
0:01:26.3
Jen Luker

Oh! You mean the one that I have marked as “Start”?

0:01:28.7
Adam Garrett-Harris

(laughs)

0:01:30.3
Jason Staten

So, hash tables are a data structure that goes and indexes the values within it by going and hashing those values into a computed value that is consistent for all values put into it, and because everything is based upon its hash, going and adding to, and looking up items from a hash table are usually a Big-O(1), they’re pretty consistent so long as your hashing algorithm can produce unique results for the values that are passed into it.

0:02:06.5
Adam Garrett-Harris

Yeah. So, I think it’s pretty interesting what happens when there’s a collision. One option is you just like, move to the next available spot, but that could get really bad, or you could just use a linked list if there’s more than one value there.

0:02:21.2
Jason Staten

Yeah, that’s definitely a challenging problem to deal with and hopefully you find a decent hashing algorithm so you don’t have to bump into that very much, ‘cause I’m sure perf goes way down as you wind up having lots and lots of collisions.
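
The linked-list ("chaining") collision strategy Adam mentions can be sketched roughly like this. This is a minimal illustration, not code from the book, and the class and method names are invented; the bucket count is kept tiny here just to force collisions:

```javascript
// A minimal hash table using separate chaining: colliding keys share a
// bucket (a small list) instead of probing for the next open slot.
class ChainedHashTable {
  constructor(size = 16) {
    this.buckets = Array.from({ length: size }, () => []);
  }

  // A simple (collision-prone) string hash; real implementations use
  // stronger functions so lookups stay close to O(1).
  hash(key) {
    let h = 0;
    for (const ch of key) h = (h * 31 + ch.charCodeAt(0)) % this.buckets.length;
    return h;
  }

  set(key, value) {
    const bucket = this.buckets[this.hash(key)];
    const entry = bucket.find(([k]) => k === key);
    if (entry) entry[1] = value; // update an existing key in the chain
    else bucket.push([key, value]); // otherwise append to the chain
  }

  get(key) {
    const bucket = this.buckets[this.hash(key)];
    const entry = bucket.find(([k]) => k === key);
    return entry ? entry[1] : undefined;
  }
}
```

As Jason notes, the per-bucket scan is what degrades performance: with many collisions, `get` tends toward a linear search of the chain.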

0:02:35.1
Adam Garrett-Harris

Yeah, I remember in college we had to try to write a hashing algorithm that didn’t have very many collisions and it was extremely hard, but I guess generally you’re not going to write your own.

0:02:43.7
Jason Staten

Yeah. A lot of times libraries that you use will have, or sorry, some, whatever language you’re using will have a hash table built into it, especially if you’re using something like Ruby where hash tables are like, the core data structure that you use everywhere; and so, they’ve actually had times in the past where they’ve made pretty dramatic improvements to it and it just speeds up Ruby across the board because they’ve made hashing faster.

0:03:07.3
Adam Garrett-Harris

Cool. Well, unless anyone has something else about hash tables I think we’re going to skip over dictionaries and talk about trees.

0:03:15.0
Jen Luker

Oh, trees.

0:03:16.3
Jason Staten

What’s a tree, Jen?

0:03:17.8
Jen Luker

Trees are a type of graph, essentially… It’s pretty much what you’re used to seeing regarding data. They have like, the bubble at the top and then they have the bubbles that are divided from down below. A tree is kind of like a family history tree where you have you and then it breaks out and you have your parents, and then each of them has their parents, you’ve got great-grandparents, all the way down. So, that’s technically a binary tree, but trees don’t necessarily have to have just two values on each side, they could actually have a really large set of limbs as you follow that tree.

0:03:52.8
Adam Garrett-Harris

Yeah, a family tree is kind of interesting, right? Because you’ve got, I guess it’s kind of upside down. Or you would have to start with yourself at the top of the tree-

0:04:03.0
Jen Luker

Yes.

0:04:03.1
Adam Garrett-Harris

And then your parents below you, and then their parents below each of them. Okay. I was thinking of it the other way around, like if your parents are above you that’s… That’s definitely a graph, but that would be a, I guess it’s also a tree.

0:04:16.1
Jen Luker

It’s technically a heap (laughs), I guess.

0:04:19.1
Adam Garrett-Harris

Yeah, based on age?

0:04:21.8
Jen Luker

Yeah, birthdate.

0:04:23.7
Adam Garrett-Harris

So, I guess, we should… You said it’s a type of graph?

0:04:27.1
Jason Staten

Yeah, that’s actually a note that’s mentioned later on in this chapter towards the end when they actually do cover graphs, and they say that all of these other data structures that we’ve covered throughout the chapter are all, also, forms of graphs. It’s kind of mind-blowing.

0:04:40.5
Adam Garrett-Harris

Yeah, so I think the key difference between a tree and a graph is that a tree is a graph that is acyclic, so it has no cycles, so there’s never a loop.

0:04:52.2
Jen Luker

Yeah, it’s a hierarchical structure, which is linear.

0:04:56.6
Adam Garrett-Harris

Yeah.

0:04:57.7
Jen Luker

I always found them funny that they called them trees, primarily because when you’re used to looking at trees you’re starting from the trunk and it goes up into the sky, but almost all definitions of digital trees are the trunk is in the air and then everything goes back down, kind of like roots instead of branches and leaves. Alas-

0:05:14.6
Adam Garrett-Harris

Yeah, I think of it as a, it’s kind of like a Christmas tree where you start at the star.

0:05:19.4
Jason Staten

Same, same here.

0:05:19.8
Jen Luker

That’s a good way to put it, but I’d never made that connection.

0:05:23.2
Adam Garrett-Harris

And then you’re going from ornament to ornament on the way down.

0:05:25.7
Jason Staten

Well that’s deep.

0:05:26.9
Jen Luker

Yeah.

0:05:27.3
Jason Staten

I hadn’t thought about it as ornaments, but I did definitely have the Christmas tree vision in my mind. You just draw a little straight trunk underneath of it and put some present there and you’re good.

0:05:35.9
Jen Luker

(laughs) Hmm.

0:05:38.3
Jason Staten

So, a common place for trees that I can think of that I, myself, have actually interacted with, actually, pretty recently this week I had to write a codemod for some JavaScript at work where we needed to go and upgrade a jQuery version from 2 to 3, and there are some common cases where if you find a certain pattern you need to go and swap it out with something else, that’s kind of what a codemod allows you to do.

And when you are creating a codemod you do it based on what’s known as the abstract syntax tree, which is a tree data structure that is created from reading and parsing JavaScript code so you wind up having a tree-like structure that perhaps is a function expression, and a function expression has a couple of children underneath of it, one of them being a parameters type node, and also a body type node. And then the body type node has member expressions where it’s maybe calling console log or something, and it goes real deep; but there’s a really cool tool called AST explorer that you can go and put a snippet of JavaScript code into and it will give you back the tree structure that represents that piece of code, and it is invaluable for creating codemods with.
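
The tree shape Jason describes can be illustrated with a hand-built, simplified AST for something like `function (x) { console.log(x); }`. This is a sketch only; real parsers (the ones behind AST Explorer, Babel, and so on) emit much richer nodes, and the walker here is an invented helper, not any codemod tool's actual API:

```javascript
// A simplified AST: a function expression with params and a body,
// whose body contains a member-expression call (console.log).
const ast = {
  type: "FunctionExpression",
  params: [{ type: "Identifier", name: "x" }],
  body: {
    type: "BlockStatement",
    body: [
      {
        type: "ExpressionStatement",
        expression: {
          type: "CallExpression",
          callee: {
            type: "MemberExpression",
            object: { type: "Identifier", name: "console" },
            property: { type: "Identifier", name: "log" },
          },
          arguments: [{ type: "Identifier", name: "x" }],
        },
      },
    ],
  },
};

// A codemod-style visitor: recursively walk the tree, calling `visit`
// on every node that has a `type`. Finding a pattern node is the first
// half of "find a certain pattern and swap it out".
function walk(node, visit) {
  if (node === null || typeof node !== "object") return;
  if (node.type) visit(node);
  for (const value of Object.values(node)) {
    Array.isArray(value) ? value.forEach((v) => walk(v, visit)) : walk(value, visit);
  }
}

const types = [];
walk(ast, (n) => types.push(n.type));
```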

0:07:06.3
Jen Luker

Or anything that you’re trying to transpile.

0:07:10.6
Jason Staten

Definitely.

0:07:11.6
Jen Luker

Like Babel.

0:07:12.9
Adam Garrett-Harris

Yeah, I guess also with TypeScript, or linters.

0:07:18.2
Jen Luker

Mm-hmm (affirmative). Speaking of which, I just released a linter last week.

0:07:22.9
Adam Garrett-Harris

You released a linter?

0:07:24.6
Jen Luker

I did.

0:07:25.2
Adam Garrett-Harris

What?

0:07:26.4
Jen Luker

Mm-hmm (affirmative)- So, you know how we have eslint-plugin-jsx-a11y?

0:07:33.7
Jason Staten

Yes.

0:07:33.7
Adam Garrett-Harris

No. (laughs)

0:07:35.1
Jen Luker

You didn’t know? Ah!

0:07:36.1
Adam Garrett-Harris

No.

0:07:37.4
Jen Luker

So, there- You know, eslint, though, right?

0:07:39.6
Adam Garrett-Harris

Yeah.

0:07:40.6
Jen Luker

Well, eslint-plugin-jsx-a11y actually tests for accessibility while you’re coding. So, you can lint for accessibility requirements.

0:07:50.5
Adam Garrett-Harris

Awesome.

0:07:51.4
Jen Luker

Yes, well it turns out that when we’re talking about React Native it kind of lands right in that space in between the web, and DOM, and React; and the ability to test apps in the native world. So, there’s accessibility tests for the web through like, axe-core or something to that effect, and there’s accessibility testing through iOS and Android, but there’s really nothing for React Native. So, Formidable and a great guy named Alex Saunders wrote eslint-plugin-react-native-a11y, and I released it as part of my Chain React talk last week.

0:08:34.1
Adam Garrett-Harris

Oh, that’s awesome. Congrats!

0:08:36.1
Jen Luker

Thank you.

0:08:36.3
Jason Staten

Nice. All-

0:08:36.9
Adam Garrett-Harris

That’s good, I’m sure that’s going to help a lot of people, and it’s great.

0:08:40.6
Jason Staten

Nice work, Jen.

0:08:41.5
Jen Luker

Thanks.

0:08:42.5
Adam Garrett-Harris

So, another type of tree is a trie. I think it’s pretty funny because it comes from the word “retrieval” so it technically should be pronounced “tree” as well, but then it would just be confusing. You know, it’s spelled T-R-I-E.

0:08:56.2
Jen Luker

You think it’s not confusing already?

0:08:59.1
Adam Garrett-Harris

Yeah, it’s pronounced “try”, it’s not spelled that way. Anyway, and it’s a tree, but it’s a tree that can help you... spell out words?

0:09:10.3
Jen Luker

Isn’t that cool?

0:09:11.5
Adam Garrett-Harris

Yeah.

0:09:12.4
Jen Luker

It’s an, it’s like a word completion.

0:09:14.8
Adam Garrett-Harris

Yes, so it can be used for when you’re typing into a search and you need to suggest what could come next. So, as you go down the tree it kind of narrows down what possible words you’re typing.

0:09:27.6
Jason Staten

Also, it’s way cool that it shares parts of entries that are within that trie as well. So-

0:09:35.2
Adam Garrett-Harris

Yeah.

0:09:35.9
Jason Staten

For example, the word “impart” and “imposters” both share I-M-P- and so, you don’t actually wind up duplicating those nodes, but instead they are shared for each of those entries, so tries are also space efficient as well.

0:09:51.8
Adam Garrett-Harris

Oh yeah. I think another thing this picture in the book doesn’t show, is where a word ends. So, for example, “imposters” could actually end at “imposter”, so I think one thing you can do is at the end of the word also add a node that’s just like some sort of indication, like a null node or something. Some sort of indication that that can be the end of a word.

0:10:13.9
Jason Staten

Hmm, yeah. I can see the need for that if you had a word that is completely within another word.

0:10:19.8
Adam Garrett-Harris

Yeah, like you can’t assume that “impost” is a word, but “imposter” is.

0:10:24.0
Jen Luker

Yeah, but I’m not entirely sure that you need to define the end of the word because the word itself ends. So, “imposters” may fall completely within that word, but you’re going to end up with “imposter” and “imposters” under the “imposter” parent. So-

0:10:38.5
Adam Garrett-Harris

Yeah, but it’s just hard to tell, is “Imposte”? Is “impost”? Is “impos”? “Impo”?

0:10:44.5
Jen Luker

But as you go down that list, it’s going to limit based on what’s available.

0:10:48.7
Adam Garrett-Harris

Yeah, but if someone typed in i-m-p-, you wouldn’t want to suggest “impo” as a possible word, unless it’s really a word!

0:10:57.0
Jen Luker

But no, there wouldn’t be anything under “impo”.

0:10:59.7
Adam Garrett-Harris

But how do you know you can suggest “imposter” and not “imposters”?

0:11:02.8
Jen Luker

Because “imposter” exists and “impo” doesn’t.

0:11:05.1
Adam Garrett-Harris

But how do you know it exists?

0:11:07.2
Jen Luker

Because it’s part, one of the children.

0:11:09.6
Adam Garrett-Harris

It, I mean, “impo” exists, i-m-p-o-...

0:11:13.9
Safia Abdalla

I think you probably could do some sort of like, suffix check. So, like “imposter” and “imposters” are two words and they both have the same like, “Imposte” root, or I guess “impost” depending how you look at it, but like, what is the root of the word? “Impo”, like there’s no valid root for it, so it might be that or maybe like, each path on the tree is ranked, so, you know, it’s more likely that from “o” you want to go out to another node than terminate at “o”.

0:11:52.7
Jen Luker

I mean, if you’re looking at that picture, what you’re looking at is that you have “imp” and then you have it break out into “impart” and it separates into the “o” but it completes it with “osters”. So, you’d have “osters” or it would go all the way down to “oster” and then you’d have a side node that would have an “s”. So, it would become a sibling to the “r”. You end up dividing it, but because of the fact that “impo” isn’t a child under the “o” letter there, it’s not one of the options.

0:12:25.3
Adam Garrett-Harris

Okay, so I’m looking at, actually a different book I used when reading this book. It was another book I used as a reference, and it’s called “Cracking the Coding Interview”, and so, when it goes over tries-

0:12:38.6
Jen Luker

Mm-hmm (affirmative).

0:12:39.5
Adam Garrett-Harris

It says-

0:12:40.8
Jen Luker

Add a null node?

0:12:41.9
Adam Garrett-Harris

It has little nodes, yeah. It says there are null nodes to indicate complete words and it marks them on the graph with an asterisk. So, for instance it’s got “man” and “many” and so after “n” you have a “y” and an asterisk, and then after “y” you have an asterisk. So, yeah, I feel like that’s just something this book kind of glossed over, because it’s just not going into that much detail.

0:13:10.8
Jason Staten

This discussion definitely makes me want to go and implement it, just to see what approach I would have to take to make sure I’m showing only entries that are there, and not every partial entry. So, I could take that on for you, Jen, to see what I can find from that side of things, as a little challenge to myself.

0:13:28.6
Jen Luker

Sure. It’d be interesting. I mean, I’ve used auto-completions, I’ve written auto-completions, but…

0:13:34.4
Adam Garrett-Harris

Yeah, and I may not have entirely understood what you were trying to explain with just audio only.

0:13:38.9
Jen Luker

Yeah. It’s just anything that came off of “o” would have its own, it would be like all the children that came off of “o”, but because there’s none that just end at “o”, I guess, but that would be part of your definition is that there’s no asterisk all by itself, as a child, saying “impo” is a possible word.
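
The end-of-word marker the hosts are debating can be sketched as a small trie with an explicit flag, the "asterisk node" idea from "Cracking the Coding Interview": it is what lets "imposter" be suggested while "impo" is not, even though the i-m-p-o path exists. A minimal illustration (names invented, not from either book):

```javascript
// A trie where each node is a plain object: character keys point to
// child nodes, and an `end` flag plays the role of the asterisk node.
class Trie {
  constructor() {
    this.root = {};
  }

  insert(word) {
    let node = this.root;
    for (const ch of word) node = node[ch] ??= {};
    node.end = true; // the "asterisk": a complete word stops here
  }

  // Collect every complete word that starts with the given prefix.
  suggest(prefix) {
    let node = this.root;
    for (const ch of prefix) {
      node = node[ch];
      if (!node) return []; // prefix path doesn't exist at all
    }
    const words = [];
    const dfs = (n, acc) => {
      if (n.end) words.push(acc); // only flagged nodes count as words
      for (const [ch, child] of Object.entries(n)) {
        if (ch !== "end") dfs(child, acc + ch);
      }
    };
    dfs(node, prefix);
    return words;
  }
}
```

Shared prefixes like "imp" are stored once, which is the space efficiency Jason points out; the `end` flag is what distinguishes "a path exists" from "a word ends here."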

0:13:56.9
Adam Garrett-Harris

Yeah. All right! Should we move onto algorithms?

0:14:01.9
Safia Abdalla

Yeah.

0:14:02.7
Adam Garrett-Harris

So, actually, we’ve got a sponsor first.

0:14:05.9
Jen Luker

Do we?

0:14:06.1
Safia Abdalla

What? (laughs)

0:14:06.7
Adam Garrett-Harris

Yeah. (laughs) So, this week’s sponsor is V School.

V School was established in 2013 and is Utah’s highest ranked coding boot camp, and it’s the first of its kind in Utah. You can choose to learn full-time or part-time in the evenings, and you’ll immerse yourself learning new skills. So, even before classes start you’ll get a career counselor to help you find a job when you graduate, and they take care of everything so you can focus on learning, including free housing if you need it, and the housing’s just a few blocks from school. It’s located in beautiful Salt Lake City, Utah. Or you can learn from the comfort of your own home with V School’s 100% virtual classroom. You’ll get a super transcript, which is kind of like a portfolio, transcript, and letter of recommendation all in one, and it will help a potential employer quickly understand your strengths and abilities, and what you’ve accomplished.

0:15:02.0
Jen Luker

That sounds pretty cool.

0:15:03.1
Adam Garrett-Harris

Yeah.

0:15:03.4
Jason Staten

Yeah.

0:15:04.3
Adam Garrett-Harris

If you’re interested in V School, head on over to https://vschool.io/ and it will help support the show, and definitely tell them that you heard about it from BookBytes. Thanks to V School for supporting the show.

(Typewriter dings)

0:15:16.4
Adam Garrett-Harris

So, simple algorithms. So, Jason and I were talking about this and our algorithm for going through these chapters was O(n), well, actually O(mn) where m is the number of hosts and n is the length of the chapter, and so, we wanted to speed that up by each of us just picking our favorite algorithm, and so that should make it O(1).

0:15:39.6
Jen Luker

You nerds.

0:15:41.3
All

(laughing)

0:15:43.3
Jason Staten

I, uh, definitely agree with that. So-

0:15:46.9
Jen Luker

(laughs) The nerd part or the O(1) part? ‘Cause-

0:15:50.1
Jason Staten

Oh, I’m not-

0:15:50.8
Jen Luker

That was way cooler.

0:15:51.3
Jason Staten

I’m not gonna back track from either.

0:15:54.2
Jen Luker & Adam Garrett-Harris

(laughing)

0:15:56.4
Jason Staten

I’ve definitely been geeking out with this book, I’ve really enjoyed it so far.

0:16:00.8
Jen Luker

Yeah, it’s been great.

0:16:02.0
Jason Staten

So, speaking of geeking out on things then, I actually will jump in line and I will talk about my sort that I wanted to do. So, first, kind of going back to the last chapter related to a data structure that we didn’t spend a lot of time on, but it is a tree, and it’s called a heap; and a heap is a special kind of tree where a parent either has a higher or lower, they call it priority, but you can think of it as a value, than its child nodes, but those sibling nodes themselves don’t have to have any specific order to them. So, for example, if you were to have a heap of 3 nodes, with the highest priority at the top, say you had 3, 5, and 7 as your values within it, then 7 could be at the top, and then 5 could be on the left side of that, and 3 could be on the other side. The order of 3 and 5 doesn’t particularly matter.

And the value in that is you can insert and move things around within a tree pretty quickly, because as we talked before, a tree winds up becoming a divide-and-conquer type strategy when you’re interacting with it where you only need to figure out which level something belongs in rather than having to figure out the exact position of something within a tree when you’re inserting, or removing, and rearranging in a tree.

So, first, a heap has the value or use case of being useful for a prioritized queue, and a prioritized queue would be something you would want to use in the case of, say you have a RabbitMQ type worker process where you have a queue of things that you’re putting a whole bunch of tasks into and sometimes tasks will have higher priorities than others. By using a heap you can insert into them very quickly with a, it is a log(n) type insertion time to put something into that, and then actually pulling off of the top of one winds up being a Big-O(1) operation to go and remove the item because it’s exactly where it’s at, and then a log(n) type operation to go and rebalance the tree again.

But heaps also have the benefit of being useful for sorting as well, and the way that you can do that is by converting your list structure, oftentimes it’s an array, and reorganizing the items into a complete heap structure, and a complete heap is the one that happens to have the greatest or lowest priority at the root node, or the first node on it, and then your second level of the tree and third level of the tree are just denoted purely by the indexes within that array.

So, kind of a cool thing about it is a heap can be represented by an array-type structure and you don’t actually have to have individual node-type structures that point to other structures. So, it makes it really memory efficient because it’s just a contiguous set of memory, and because of that, going and doing a heapsort winds up being an (n log n) type Big-O, and I just thought it was pretty awesome. I went and actually implemented it today in Rust and it is, by far, the fastest algorithm that I went ahead and implemented. I did bubble sort, insertion, merge, selection, and heap; and heap, I mean because of its (n log n)-type speed, it outdid everything that I wrote, but the built-in sorts are still definitely faster in terms of actual runtime. Their Big-Os are still the same as heap’s, being (n log n), but heap is definitely on top, or it’s top of the heap of the sorts that I wrote.
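
The in-place, array-backed heapsort Jason describes can be sketched like this (a minimal JavaScript illustration, not Jason's Rust code): build a max-heap in the array itself, then repeatedly swap the root to the end and sift the new root back down.

```javascript
// In-place heapsort: the array IS the heap, no node objects needed.
function heapsort(arr) {
  const n = arr.length;

  // Restore the max-heap property for the subtree rooted at i,
  // considering only arr[0..size).
  function siftDown(i, size) {
    while (true) {
      const left = 2 * i + 1;
      const right = 2 * i + 2;
      let largest = i;
      if (left < size && arr[left] > arr[largest]) largest = left;
      if (right < size && arr[right] > arr[largest]) largest = right;
      if (largest === i) return;
      [arr[i], arr[largest]] = [arr[largest], arr[i]];
      i = largest;
    }
  }

  // Heapify: sift down every non-leaf node, bottom up.
  for (let i = Math.floor(n / 2) - 1; i >= 0; i--) siftDown(i, n);

  // Repeatedly move the max (root) to the end of the unsorted region,
  // then rebalance the shrunken heap: the O(1) removal plus log(n)
  // rebalance from the discussion above, done n times.
  for (let end = n - 1; end > 0; end--) {
    [arr[0], arr[end]] = [arr[end], arr[0]];
    siftDown(0, end);
  }
  return arr;
}
```

Because everything happens inside the one array, no extra memory is allocated, which is the "contiguous set of memory" benefit Jason points to.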

0:20:07.0
Adam Garrett-Harris

(laughs) Nice.

0:20:08.8
Jason Staten

Yeah.

0:20:09.6
Adam Garrett-Harris

So, question.

0:20:11.1
Jason Staten

Yeah?

0:20:11.7
Adam Garrett-Harris

So, normally a heap, like you said, a min heap or a max heap would be represented with a tree, and it looks like a tree, like, when I see a picture of it in the book here.

0:20:23.4
Jason Staten

Mm-hmm (affirmative).

0:20:24.2
Adam Garrett-Harris

And you said you can actually do it as an array-

0:20:29.6
Jason Staten

Mm-hmm (affirmative).

0:20:30.3
Adam Garrett-Harris

Where you just specify what level of the tree each item would be at, if it were in a tree, and that’s great because it’s a contiguous piece of memory. Do you do that for a sort because you already know how many items are going to be in the heap? And like, normally if you were just building the tree, ad hoc, you wouldn’t know how many items are in there?

0:20:53.2
Jason Staten

So, even if you are building it with an unknown set of entries in it, that is a case where it could be backed by one of the more dynamic-type list structures, like the dynamically allocating arrays that when they get beyond their capacity they go and reallocate another set of memory, kind of like what JavaScript arrays do automatically for you, and some of the other languages, like Java, has a list that will do that under the hood.

So, it’s not so much about knowing the size but the benefit could come from… First, if you are able to back it with an array, a lot of times arrays are the type of thing that you want to be sorting. I mean, I generally don’t find myself sorting other types of data structures too often, and so by using a heapsort on it and just backing it by the original data source, then you actually don’t have to go and allocate other memory in order to do the sort, whereas some other sorts actually require you to go and copy the original data source in order to rearrange it.

So, it’s in place and doesn’t require additional memory allocation, which is good for like, low memory type devices and stuff. And the way that you can go and specify what level something is at is by looking at its index within the backing array. So, for example, index 0 is the root node, and there’s only one node at the root node level, and then indexes 1 and 2 are nodes that are at the second level, and then indexes, I guess 3 through… Oh, it got me multiplying it. (laughs) But uh…

0:22:34.1
Adam Garrett-Harris

(laughs)

0:22:34.8
Jason Staten

Yeah, so like, everything under there, you have 4 nodes under there ‘cause every layer is 2 to the whatever level that it’s at, and it’s just known by the index that it’s at. So you’re not even keeping track of “this is the level that thing’s at” other than relying on the existing index that it is in the array.
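
The index arithmetic Jason is describing is small enough to write out directly. A sketch (helper names are illustrative): levels are implied by position alone, so no level field or pointers are ever stored.

```javascript
// Array-backed heap navigation: each relationship is pure index math.
const parent = (i) => Math.floor((i - 1) / 2);
const leftChild = (i) => 2 * i + 1;
const rightChild = (i) => 2 * i + 2;

// Level k (0-based) holds 2^k nodes and spans indexes 2^k - 1
// through 2^(k+1) - 2: 1 node, then 2, then 4, and so on.
const level = (i) => Math.floor(Math.log2(i + 1));
```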

0:22:53.7
Adam Garrett-Harris

Gotcha.

0:22:54.2
Jason Staten

And there is a really cool blog post about it, about “Heapify All The Things” and it is written by Vaidehi Joshi on Medium, and I will go and share a link that you can check out there, because it is an illustrated guide to doing a heapsort and I found it as an excellent resource to go along with the book.

0:23:19.9
Adam Garrett-Harris

Awesome, yeah. It seems like an illustrated guide would go great with the book. All right, Safia, you want to tell us about merge sort?

0:23:27.5
Safia Abdalla

Sure. So, I decided to choose merge sort because it was related to a sorting algorithm that I brought up in one of my tangents during the last podcast episode which was Timsort.

0:23:39.6
Adam Garrett-Harris

It’s not a tangent, it’s a side note.

0:23:42.2
Safia Abdalla

It’s a side note, of- Oh! That’s right, we’re being punny here, perfect.

0:23:47.1
Jason Staten & Adam Garrett-Harris

(laughs)

0:23:47.0
Safia Abdalla

One of my “side notes” during the last episode. So, merge sort, although it’s used in Timsort which we mentioned last episode, is actually a really old sorting algorithm. It was invented in the 1940s by somebody who was mentioned in the book a little earlier, Von Neumann, who was tinkering a lot with computers in the ‘40s, as many of us do now I guess, and what was really interesting about merge sort was that it applied the principle of divide-and-conquer, and this was also something that the book touched on in the last chapter. I think, you know, in general, divide-and-conquer for solving any kind of problem tends to be like, a pretty common tool across computing, and merge sort was like, one of the first applications of that. It’s been around for 80 years and it’s pretty reliable.

So, with all of that background aside, what does merge sort actually do? Merge sort, essentially, works by dividing the lists that you want to sort into smaller and smaller sublists and sorting on those. So, the idea is that instead of taking the time to sort a list of 20 items, you sort the 10 lists of 2 items contained within that list and then you build back up. And this actually ends up being a really efficient technique. We mentioned in the Big O chapter, we learned about how, you know, generally when you have these sort of like, divide-and-conquer type techniques you get really good runtimes and the runtimes usually tend to be a log(n) because you’re always taking half of what you were looking through before, generally, and that’s the case with this.

So, the efficiency, or the time complexity of the algorithm is O(log n)-, O(n log n), and Jason mentioned a little bit about how much space heapsort takes up; merge sort takes up, I guess, a space complexity of O(n), which basically means that if you have a list of 20 items and with merge sort you’re breaking it up into smaller lists that you want to sort through, then you would need basically the same amount of space as the number of items in your list for all of those sublists that you’re working through.

0:26:10.6
Adam Garrett-Harris

Is that in addition to the list that you already have saved?

0:26:14.4
Safia Abdalla

Yes.

0:26:15.1
Adam Garrett-Harris

Okay.

0:26:16.2
Safia Abdalla

So, O(n) extra space.

0:26:19.0
Adam Garrett-Harris

Okay.

0:26:20.4
Safia Abdalla

Yeah, one of the things that I think is cool to note in one of the visualizations that is on page 145, if you’re looking at the PDF, is you can kind of see how much merge sort looks like a binary tree, ‘cause you’ve got those successive splits, and it was touched on in the last chapter, you know, a binary tree is basically a tree where every node is either a terminating node or has 2 children.

So, I thought that was a really cool way to showcase those analogous principles. Like, you know, a binary tree is a divide-and-conquer representation technique, I guess is the way you might say it, or data structure; and so is merge sort. That’s all I have to say about it. My biggest takeaway was that it’s like, really old, as most things in computers are. It’s been around for a while and it’s pretty time tested and I think that’s what’s cool about algorithms. Once you’ve got something that’s space efficient and time efficient you can take it anywhere.

0:27:27.8
Adam Garrett-Harris

Nice.

0:27:28.6
Jason Staten

One thing that confused me initially about merge sorts was, okay, we’re breaking everything up and now we have all of these lists of a single value and then I’m merging them back together. Like, how does that make anything faster? Because then when I have a list of 2, and then another list of 2, how do I- My initial thought was like, okay well, I have to compare all of these things, and it turns out the efficiency comes from having those two lists of 2 that you’re assembling back together: those 2 lists are already sorted, and so the only things that you need to look at are the things on, I guess, the left side of the list, or the head of those lists, and that is the order that you assemble them back. You don’t have to check each item in one list against all of the items in the other ones. You only have to look at the heads of the lists and that’s where the efficiency can come from.
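
Jason's point, that the merge step only ever compares the heads of the two already-sorted halves, can be sketched like this (a minimal illustration, not the book's page-148 code):

```javascript
// Merge sort: split, sort each half recursively, then merge by
// repeatedly taking the smaller HEAD of the two sorted halves.
function mergeSort(list) {
  if (list.length <= 1) return list; // a single element is already sorted
  const mid = Math.floor(list.length / 2);
  const left = mergeSort(list.slice(0, mid));
  const right = mergeSort(list.slice(mid));

  // The merge: each comparison looks only at left[i] and right[j],
  // never at every pair of elements across the two halves.
  const merged = [];
  let i = 0;
  let j = 0;
  while (i < left.length && j < right.length) {
    // <= keeps the sort stable: on a tie, the left half's item goes first.
    merged.push(left[i] <= right[j] ? left[i++] : right[j++]);
  }
  // One side is exhausted; the rest of the other side is already in order.
  return merged.concat(left.slice(i), right.slice(j));
}
```

The `slice` and `merged` arrays are also where the O(n) extra space Safia mentions comes from: the sublists live alongside the original list.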

0:28:19.1
Safia Abdalla

Yeah, and the code, the JavaScript code on page 148 kind of shows that, that you’re always just checking the first item, and-

0:28:26.8
Jason Staten

Mm-hmm (affirmative).

0:28:27.6
Safia Abdalla

The trend is kind of that although you’re still sorting you’re always sorting elements that are more sorted than they were before so your list is getting less chaotic as the algorithm runs and you only have to compare those first things.

0:28:44.0
Adam Garrett-Harris

Yeah, Jason. I had the same thought as you. It seems very, and it doesn’t seem very efficient because that’s not how I would do it. I think he mentions in the book that the sort that’s most like how a human would do it is selection sort ‘cause you just kind of look for the smallest one, and then look for the next smallest one.

0:29:03.8
Jason Staten

Mm-hmm (affirmative).

0:29:04.5
Safia Abdalla

Yeah, and I think that’s- Oh, I’m going to go on a side note again, I will pause it. But yeah, it kind of goes back on… It was, ‘cause it was invented in the ‘40s, it was one of the first sorting algorithms that was designed to be written on a computer and specifically like, a Von Neumann machine, because as is noted, you need the additional memory to keep track of all of the sublists.

0:29:28.0
Jason Staten

Mm-hmm (affirmative).

0:29:27.8
Safia Abdalla

And then you also need like, your CPU to actually do the magic. So, it was like, definitely like, very designed to be run on a computer and not, definitely not, by a person, ‘cause I would not sort a deck of cards or something, like this.

0:29:41.1
Adam Garrett-Harris

(laughs) Yeah. Okay, so you said you need the additional memory. Jason, is the heapsort in place? Like, does it need additional memory?

0:29:51.7
Jason Staten

So, heapsort does not need additional memory so that is one benefit to it. It also did come a bit later. I think it came in the mid or later ‘60s, so definitely after merge sort; and one difference between the 2 algorithms as well, even though they have the same runtime, is that merge sort winds up being a stable sorting algorithm whereas heapsort is not stable, meaning that if, say, you have 2 duplicate items in terms of their priority or ordering, those may get swapped in a heapsort, whereas a merge sort will always make sure that if something came first before the list was sorted, like one duplicate came first before the list was sorted, it will continue to remain before it after the list is sorted.

0:30:43.1
Adam Garrett-Harris

Ah, which doesn’t sound important if all you’re sorting is numbers, but if you’re sorting numbers that have other data associated with it, it could be a big deal.
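
The stability property being discussed can be seen with a tiny example along these lines (illustrative data; it uses JavaScript's built-in `Array.prototype.sort`, which the language spec guarantees to be stable, rather than either algorithm from the discussion):

```javascript
// Two tasks tie on priority; a stable sort must keep their original order.
const tasks = [
  { priority: 2, name: "first 2" },
  { priority: 1, name: "only 1" },
  { priority: 2, name: "second 2" },
];

// Sort only by priority: the associated `name` data is never compared,
// so only stability preserves the relative order of the two ties.
const sorted = [...tasks].sort((a, b) => a.priority - b.priority);
```

An unstable sort (like heapsort) would be free to emit "second 2" before "first 2", which is exactly the big deal when the numbers carry other data with them.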

0:30:52.0
Jason Staten

Yes.

0:30:52.9
Adam Garrett-Harris

Cool, Jen what was your algorithm you wanted to talk about?

0:30:56.1
Jen Luker

Mine is quicksort.

0:30:57.9
Jason Staten

Sounds fast.

0:30:59.2