A book club for developers.
BookBytes is a fortnightly (or biweekly) book club for developers. Each episode the hosts discuss part of a book they've been reading. And they also chat with authors about their books. The books are about development, design, ethics, history, and soft skills. Sometimes there are tangents (also known as footnotes).
(Intro music: Electro Swing)
Hello and welcome to BookBytes, a book club podcast for developers. We are talking about 2 more chapters of “The Imposter’s Handbook” by Rob Conery, at least as long as we don’t get distracted by turbo switches or something.
It’s always a possibility.
Yeah, so we’re going to try to go over chapters 5 and 6, “Big-O” and “Data Structures”. I’m Adam Garrett-Harris.
I’m Jen Luker.
I’m Safia Abdalla.
I’m Jason Staten.
All right, so, I was thinking about adding a section of book news, but I’m not sure. So, tell me if it sounds like a good idea or not.
I think we try it, and if want to do a one off this time that’s fine, and if it works we do it again.
Yeah. I’m pretty excited about it, and it should be on the shelves in the early fall of 2018. And that’s all I have about that! I read the first one and it was really hard for me to keep up with the Java that I didn’t know. (laughs)
Hopefully that won’t be a problem this time.
And space. Okay.
All right, let’s get into the book.
What did you all think of Big-O?
Big-O was one of those things that I didn’t realize I knew.
Oh! Is… Did you find it out while reading this?
I did. I had no idea what Big-O notation was, and then I learned, and I’m like, “Oh yeah! I totally know what this is.”
So, that made it nice and a little bit easier to get.
That is the reason why Rob Conery wrote this book, is a lot of times we know something but we don’t necessarily have the name for it, and so by going through it you now have a name to that thing that you were being wary of.
Speaking of the name, it’s kind of a weird name.
It is. Um...
And I guess the abbreviation “Big-O” is representative of the symbol that’s used in it, which is Omega, and I think, I’m pretty sure that’s why it’s called Big-O, just because it’s…
Oh, is an Omega?
Yeah, so there’s like different kinds of classifications you can do for like, measuring time complexity. I think… So one of them’s Omega, one of them’s Theta, there’s like, different ways of expressing the bounds of how long it takes an application to run, or a program to run. This is all very foggy in my memory.
Yeah. I think one is best case, one is a worst case.
Yeah, but if you look at this, just on that first page, it says that it’s Order N [ O(n) ], or the like, so I wonder if the O is actually for “order”.
So, yeah. Sorry, I’m like trying to look this up just so I don’t misinform people on this podcast. So, there’s like, Big-O is one of them, there’s also Big Theta, so I like, learned this like, a while ago, and it’s one of those things where like, Big-O is one of the most important ones you need to know ‘cause it’s kind of like the one that’s commonly used in industry; and then most of the other ones are definitions I feel mostly come up in an academic context, but that’s what it is. I did not come prepared for this, did I? (laughs)
Yeah. Yeah, like when you talk about best case that’s not very useful, ‘cause you could say best case is you find it right away and it’s a constant or something.
But best case is you don’t have to go through the whole thing, but that’s never going to happen.
So, I found a really great stack overflow question where somebody asked, you know, “What’s the difference between Big-O and Big Theta?” and it… you know? I will just read it because it verbalized what I was trying to say about bounds and stuff much better. So, Big-O defines the upper bound for how long it’ll take something to run, and Big Theta is a definition of both the upper and the lower bound. So, in general, what this answer is saying is that in industry you use Big-O because it defines the upper bound which is like, the possible worst case scenario, and the reason that you would want to use Big Theta is it defines both the upper bound and the lower bound so it kind of gives you a range. So, if you have a small value for Big Theta, that’s basically saying that the variability between the worst possible time it would take your program to run, and the best possible time is like, very low. So, it’s like, basically if the range of those two values is really small then the difference between the worst case situation the best case situation is small so it’s like, trivial. And so like, yeah. Am I making any sense? (laughs)
Yeah, so it would be more accurate.
If you’re trying to find something in an array and you only have one item in the array then it’s going to be a 1:1 ratio, so there’s really not going to be any difference between finding something in the array, the low bound is 1 and the high bound of that is 1; whereas, if you ‘ve got an array of a million and you’re trying to find one item, then the different operations that you can perform on it changes how long it’s going to take. So, you could totally luck out and pick the first one the first time and that would be like one operation. Or, you could have to go through the whole entire thing and it’s the very last one and you get it to like, a million operations. So, then it’s a much larger range of how long it can take, right?
Wait, sorry. Can you repeat that? I’m like, trying to read and process and go back into my memories for things. So, I’m like, all scatterbrained about this.
Yeah, it’s weird because we’re talking about stuff that’s not even in the book. (laughs)
Yeah, so yeah. I should bring it back. Basically, if you’re interested in time complexity and you’re like, curious about learning more about the ways that people might more rigorously define how long it will take programs to run and the bounds that they define for a program’s time complexity, there’s more. So much more! And some additional learning, if you’re interested, is to go look up Big Theta and Big-Omega and learn more about that. I do know that Khan Academy, in their computer science course has videos that explain all of the just different kinds of notations.
Yeah. So, right off the bat it says that, “All the examples in this chapter use arrays, even though you wouldn’t normally use arrays, because it makes it simpler to just explain time complexity.” And then we’ll get into data structures in the next chapter.
Talking about things other than arrays, including arrays.
Yeah. So, I like how it starts out super simple and it’s like, “Hey. Say you have an array of 5 numbers and you want to get the first one. How many operations does that take? And it’s just one. And what if the array is 100 long? It’s still one. It doesn’t matter how long the array is, right? ‘Cause it’s just one operation.” That’s like, super ideal. If every algorithm could just be on operation that would be awesome.
And I do like that it points out, as well, that even though it says one operation, it may not technically be like, that the CPU is doing exactly one thing, but we do go and kind of ignore some of the overhead involved with it. Like, oftentimes you know, you’ll go and you’ll make a read into memory and then load that thing onto the stack or something like that. And so, like, there are other things going on, potentially more than one, because I mean, there’s always lots of transistors involved in every little thing that you do, but boiling it down to that single operation, that “Yes, it is 1” or “It is constant” and how much time is required.
Yeah. Like, it doesn’t matter if it’s actually you doing 5 things. If it’s 5 things every time it’s constant and you can just say it’s O(1).
Yeah, and you can like, break out of that early and end up not looking at all of them, hopefully. Hopefully you would do that. There’s no need to keep looking at all of it if you’ve already found what you’re looking for, but you know, worst case is you look at all of them.
Yeah, that is true. Like, even if you are going to break out that is a good point. It’s that you may break out and most of the time you might breakout. But in the worst case you are going to take that long, therefore you still sit at Big-O(N) even if, in your particular use case you happen to bump into the right thing on the first or second iteration. That is like, very data specific rather that algorithm specific whereas the algorithm could potentially could take the whole N number of iterations before it finds what it’s looking for.
The next set actually increases the complexity a fair amount, and one of the examples that they gave here was that maybe you duplicate an item in the array and now you have to find the duplications and for that it’s O of N squared [ O(N squared) ] because you have to go through the array twice to find all of the items that you have, and then compare it against itself to see if there is any duplication.
Yeah. So, if you loop through it once, and then loop through it again that would be 2 to the Nth. .
Which is, essentially N.
But, in this case it’s like a nested for loop.
So, it ends up being like, NxN instead of N+N.
Because you’re having to go through N for every single item.
That’s not very good. (laughs)
And like, I like seeing these things, it doesn’t include one in the book, but I like seeing a chart of these things and you can see like how bad some of these get when there’s a lot of elements.
Oh yeah. It gets fun real fast. The exponential ones are always fun.
I was just going to talk a little bit about the discussion on optimizations that he covers, and it starts on page 92 if you’re following in the book that I’m in, and it covers how you might optimize the summing function that he was talking about when he was showcasing linear time, and he uses this really cool math trick to convert the amount of time it takes for summing a list from O(N), the number of items in the list, to just a single, like, equation that you can use. So, it runs is O(1) and that kind of bit where he does that analysis, highlights one of the central themes that he goes on to cover in the rest of the chapter, which is that you can always make things better. Once you know the time complexity of something you can start to think a little bit about how you optimize it and improve it. And in future sections of this chapter he talks about how you can improve implementations that are slow. A little bit of foreshadowing there for you.
Yeah. One thing I like about that is that it takes a little bit of additional information to create that equation and that is that the array is contiguous and starting at 1. So, it doesn’t skip any numbers, it doesn’t have any duplicates. And so, a lot of times in interview questions they give you some sort of extra information and you really need to use all of that information, there’s like, some reason why they’re telling you that, there’s some sort of optimization you can do with it.
I really like that point of like, understand the assumptions and the context of the problem you’re solving and use that information to help you improve your algorithm. It reminds me… see, I don’t want to reference things again because I might get the details wrong, but I’ll leave it as a hint for folks and people can look it up if they’re interested. This kind of notion of thinking and recognizing the context of the problems you’re solving and using that to improve the run time of your program reminds me of something that I read about a long time ago about Timsort. T-I-M sort-
Which is the, who’s exclaimed?
That was me. You keep going.
Somebody got excited so-
Everyone but me.
(laughs) So, Timsort is the default sorting method used in the Python programming language. If you’ve used Python and you like, call sort on a list of numbers or something, that algorithm is what is called on your list. And it’s really reformant I think. Like, the sorting worst case, or maybe average case, is O(n log n), again foreshadowing to something that’s covered later on in the book. And when I was reading about this I was really fascinated about it because one of the ways that algorithms is so optimal is because it leverages the fact that when you’re writing a computer program in the world and you know you’re not solving some like, problem set or math problem, it’s highly likely that the data that you’re going to get into your sorting function, or wherever you want to sort it, is probably already sorted in some way. You’ve got little bits of ordered sequences in a generally unordered list, and so the implementation of Timsort uses that fact that most data in the real world is partially sorted already to create an improved implementation of a sorting algorithm that uses 2 other sorting algorithms in it. The big point that I wanted to share there, and like, the thing that like, blew my mind, especially when I first learned about it, was like, “Oh yeah. You can kind of look at the program that you’re writing and say, you know, there’s no situation in the real world where data is completely random. There’s always going to be some sort of order.” And you can use that assumption to guide you to making a more efficient algorithm like Timsort in Python. I also think that now it’s used in maybe Java and a couple of other things, maybe. But yeah, if you’re interested in that, Timsort, a pretty cool thing to check out and read about.
So, I’m glad you brought up Timsort, that is one of the resources that I actually have a link to that I want to share because it surfaced again recently and so there’s a good Hacker Noon about it, and it’s kind of awesome how it goes and, before going and doing the search, or early on in its process of sorting, it goes and actually gathers heuristics about your data that you’re sorting against, too. So, for example, if you’re doing a sort on a list that has fewer than 64 elements in it, then Timsort just does insertion sort so it doesn’t do anything real fancy. LIke, insertion sort is just kind of the slow, I guess it’s considered slow in like, the great scheme of things because, I mean, it is Big-O(N Squared). However, because it’s working on such a small list of things, like 64, that is a really small number for a computer to have to work with when it comes to working with an array size, that insertion sort is the fasted approach that it can take; because there’s no overhead of having to split the list or start creating indexes or pointers at other parts of the list to do more advanced sorts such as merge sort and quicksort which, I think, yeah, that does come a little later in the book, but yeah, it is a very practical approach because it’s assessing the data and making assumptions about it as it goes along. And so, yeah, I’m glad you’re also familiar with it, Safia, and I will share a link with you, Adam and Jen.
And I want to let our listeners know that Jason and I did not coordinate that at all, we’ve just-
Got some like, vulcan mind meld thing going on right now. (laughs) So, yeah. Thanks for sharing that and like, spelling out some of the heuristics that it uses a little bit more clearly for everyone listening. That’s super cool.
Yeah, that was really cool. I’ve never heard of that. Okay, so we were talking about Nsquared and then there’s log, (log N). I love how it takes a break to explain logarithms because if you haven’t been in a math class for you while you might not remember logarithms, and I’d completely forgotten them like a year ago when I started reading about some interview stuff, preparing for interviews, and it’s basically the opposite operation from exponents.
So, if you, for example, in the case that you want to solve 2 to the X equals 512, if you want to get that that is where you could go and use log in order to solve it. So, you could say “log 2” giving it the right side of that equals thing, 512, equals X will get you back your result.
Yeah, and logarithms have a base, as well, but you know, one thing about Big-O is the math, you want to simplify it as much as possible, like all the extra things don’t matter so it doesn’t even matter what base it is. You just say “log” “log N” It doesn’t matter if it’s log base 2n, but I guess like, with logarithms the thing to keep in mind is just if you’re splitting things in half, like if you’re doing a binary search, then it’s going to have a log in the complexity.
Adam, you said binary search, what’s that?
Binary search, isn’t that what you mentioned? Did you mention that one earlier? No, you didn’t. So, binary search is a divide and conquer algorithm and we may be talking about it later in a couple of chapters, but you split the array in half and then you split it again. Or, actually you split it up completely and then you’re like, combining it. You’re repeating the same thing over and over. That wasn’t a very good explanation. I was thinking of a binary sort.
A binary search is where you look halfway and then you decide which half to look at, you know, look half way and you get closer and closer until you find it.
So, one of the easiest things is to say okay, so you have a list of 1 to 100 and you start with 50 and you say, okay is the number you’re looking for higher or lower than 50? And you’re assuming this list is sorted, and at that point you say okay, well my number’s 27 so we’re gonna cut off everything above 50 because it’s below, and then we look at 1-50 and we divide it in half again and say 25, okay so is 27 higher or lower than 25? So, we know it’s higher than 25 so we cut off everything from 1-25. So then we look at you know, 25-50 and we find the middle point at like, 37, and is it higher or lower? It’s lower, you cut it in half and now we’re looking at 25-37 and you keep dividing it down until you land on our number, 27.
That was a better…
Yeah, thank you.
You know, because of that you didn’t have to look through all of those numbers you otherwise would have and instead you have like, 8 steps instead of 100 steps or 27 steps to go from 1-27 to find your number.
Yeah, and the key is making sure it’s sorted first.
Yeah, that is a requirement on binary search otherwise you will not find what you’re looking for.
Yeah, you can’t determine which half it’s in if it’s not sorted.
Exactly. So, one real world metric. So, I went ahead and did some of these algorithms inside of Rust, because that’s what I’ve been sticking to, and I went and put a brute force search which is just looping directly over the list up against a binary search, and for the case of looking up index 900k in a list of a million items. For a brute force search every time it needed to go and find that, or in order to find that one time, it took 429 nanoseconds. I mean, fast in human time, but slow in computer. And to compare that to binary search, it actually took 19 nanoseconds in order to find it. So, dramatically different amount of time and so, if you do have a sorted list, like using an algorithm that has a smaller Big-O when you are working on larger datasets can really be beneficial there.
Yeah, and then if you double the list it would be even worse.
Exactly. It just gets better and better of an option as your list size grows and grows.
So, what else stuck out in this chapter?
Well, I was glad to see that the book covered something I mentioned briefly, I think, in the complexity chapter which was a couple of episodes ago, which was space complexity versus time complexity, and recognizing that sometimes things might run out of memory resources before they actually finish executing. So, yeah I thought it was really cool that he, the author, had a chance to tack that in on the end of the chapter here.
Yeah. I thought of you when I read that.
It is page 106 and 105 for those who are following along.
I also thought the quick break for math geeks on 104 was pretty cool. I’ve seen this video before in "Numberphile" where they explain adding up all the natural numbers from 1-infinity equals -1/12th, and I’ll put a link for that in the show notes, it’s pretty cool.
You’re not going to explain it right here, Adam?
I cannot explain it, it’s weird. It’s weird.
Yeah. There is an addendum here in the book that says, “Fair warning. You’ll get lost down this rabbit hole” in reference to it.
Be prepared, reader!
I did. I did click on a 2nd video after watching that one, but...
(laughs) That’s how they get you!
I also liked the section on thinking in Big-O, where Rob actually goes through and summarizes each of the Big-Os, or at least, not all of them, but at least the ones that you’ll likely care about as a developer. So, in particular with arrays, it says, “If you’re randomly accessing an array, it’s always Big-O(1).” So, you’re just grabbing an index off an array it’s constant time, like Big-O(1), one operation required. If you’re iterating over the list, then you have a Big-O(N). If you’re doing a nested loop then you have a Big-O(N Squared). If you are dividing that list up and working through those pieces then you have a Big-O(Log N). And if you’re iterating and then dividing, then you have a Big-O(N Log N). So, those are a nice, I don’t know, handy guide to keep in mind. So, when you look at a piece of code, it can be at least somewhat obvious to you of what the time complexity of it is. If you see a number of for loops just within it, like if you’re seeing nesting or something going on you can say, “Oh yeah, that is Big-O N Squared. Maybe we can improve this.” Or something.
Yeah. That’s a good little cheat sheet. All right, are we ready for the next chapter? Data Structures?
Let’s talk about them.
I thought we used objects for everything.
Oh, yeah. Yeah, and arrays are objects which is weird.
Yep. Objects with numeric keys.
Well, that joke fell flat.
What was the joke? (laughs)
(laughs) Yeah. Yeah. Sorry, I didn’t… I don’t see the joke.
I hope somebody out there laughs.
Wait, what, what did you say?
Oh. Oh! I see. I thought you were saying that Rob wrote joke and I missed it, like in the book.
That like, confused me for the longest time. I feel like different programming languages use array and list differently and I always get the terminology confused because sometimes a list is used to refer to something that is technically an array and vice versa. So I’m always like… that fact always surprised me, too.
Oh yeah, so, what is the difference between a list and an array in some language?
So, the distinction is that with an array you can do direct index based access, and a list is something where you have to do sequential access. So, to get to the next item you need to like, go through the previous one and generally that’s because a load local in memory, an array takes up a contiguous chunk of memory. So, all of the numbers that are next to each other in the list- … See! I did it again! (laughs)