BookBytes

A book club for developers.

BookBytes is a fortnightly (or biweekly) book club for developers. Each episode the hosts discuss part of a book they've been reading, and they also chat with authors about their books. The books cover development, design, ethics, history, and soft skills. Sometimes there are tangents (also known as footnotes).

Hosts

Adam Garrett-Harris

Jason Staten

Jen Luker

Safia Abdalla

13: The Imposter's Handbook: Big O and Data Structures

8/20/2018

They go over big O and data structures, Jason and Safia have a Vulcan mind meld over Timsort, and Adam almost goes down a rabbit trail of YouTube videos.

Transcript

(Intro music: Electro Swing)

0:00:11.8
Adam Garrett-Harris

Hello and welcome to BookBytes, a book club podcast for developers. We are talking about 2 more chapters of “The Imposter’s Handbook” by Rob Conery, at least as long as we don’t get distracted by turbo switches or something.

0:00:24.0
Jen Luker

It’s always a possibility.

0:00:24.7
Safia Abdalla

(laughs)

0:00:25.4
Adam Garrett-Harris

Yeah, so we’re going to try to go over chapters 5 and 6, “Big-O” and “Data Structures”. I’m Adam Garrett-Harris.

0:00:33.3
Jen Luker

I’m Jen Luker.

0:00:34.6
Safia Abdalla

I’m Safia Abdalla.

0:00:36.1
Jason Staten

I’m Jason Staten.

0:00:37.4
Adam Garrett-Harris

All right, so, I was thinking about adding a section of book news, but I’m not sure. So, tell me if it sounds like a good idea or not.

0:00:44.5
Jen Luker

I think we try it, and if we want to do a one-off this time that’s fine, and if it works we do it again.

0:00:50.1
Adam Garrett-Harris

Okay. Well I’ve got, I’ll just give one item of book news and that is that the 2nd edition of “Refactoring” by Martin Fowler is coming out and it will be written in JavaScript instead of Java.

0:01:02.5
Jen Luker

Oooh.

0:01:03.8
Adam Garrett-Harris

Yeah. I’m pretty excited about it, and it should be on the shelves in the early fall of 2018. And that’s all I have about that! I read the first one and it was really hard for me to keep up with the Java that I didn’t know. (laughs)

0:01:15.5
Jen Luker

Hopefully that won’t be a problem this time.

0:01:18.0
Adam Garrett-Harris

Yeah.

0:01:18.5
Jason Staten

That’s just showing that JavaScript really is dominating.

0:01:22.3
Adam Garrett-Harris

Yeah, ‘cause he wrote a blog post about it and he’s not super happy about doing it in JavaScript, but there’s certain reasons why he chose it as the best language for this book.

0:01:31.4
Safia Abdalla

Hmm.

0:01:32.0
Jen Luker

Low entry, and JavaScript is taking over the world.

0:01:37.2
Adam Garrett-Harris

Yeah. (laughs)

0:01:38.9
Safia Abdalla

And space!

0:01:39.6
Jen Luker

And space!

0:01:40.8
Adam Garrett-Harris

And space. Okay.

0:01:41.9
Jen Luker

Mm-hmm (affirmative).

0:01:42.7
Adam Garrett-Harris

All right, let’s get into the book.

(Typewriter Dings)

0:01:47.2
Adam Garrett-Harris

What did you all think of Big-O?

0:01:49.1
Jen Luker

Big-O was one of those things that I didn’t realize I knew.

0:01:52.1
Adam Garrett-Harris

Oh! Is… Did you find it out while reading this?

0:01:56.3
Jen Luker

I did. I had no idea what Big-O notation was, and then I learned, and I’m like, “Oh yeah! I totally know what this is.”

0:02:02.2
Adam Garrett-Harris

Nice!

0:02:03.0
Jen Luker

So, that made it nice and a little bit easier to get.

0:02:05.3
Jason Staten

That is the reason why Rob Conery wrote this book: a lot of times we know something but we don’t necessarily have the name for it, and so by going through it you now have a name for that thing that you were wary of.

0:02:19.5
Adam Garrett-Harris

Speaking of the name, it’s kind of a weird name.

0:02:22.0
Safia Abdalla

It is. Um...

0:02:22.9
All

(laughing)

0:02:26.3
Safia Abdalla

And I guess the abbreviation “Big-O” is representative of the symbol that’s used in it, which is Omega, and I think, I’m pretty sure that’s why it’s called Big-O, just because it’s…

0:02:40.0
Adam Garrett-Harris

Oh, is an Omega?

0:02:41.4
Safia Abdalla

Yeah, so there’s like different kinds of classifications you can do for like, measuring time complexity. I think… So one of them’s Omega, one of them’s Theta, there’s like, different ways of expressing the bounds of how long it takes an application to run, or a program to run. This is all very foggy in my memory.

0:03:01.3
Adam Garrett-Harris

Yeah. I think one is best case, one is a worst case.

0:03:04.6
Jen Luker

Yeah, but if you look at this, just on that first page, it says that it’s Order N [ O(n) ], or the like, so I wonder if the O is actually for “order”.

0:03:14.2
Adam Garrett-Harris

Hmm.

0:03:14.9
Safia Abdalla

So, yeah. Sorry, I’m like trying to look this up just so I don’t misinform people on this podcast. So, there’s like, Big-O is one of them, there’s also Big Theta, so I like, learned this like, a while ago, and it’s one of those things where like, Big-O is one of the most important ones you need to know ‘cause it’s kind of like the one that’s commonly used in industry; and then most of the other ones are definitions I feel mostly come up in an academic context, but that’s what it is. I did not come prepared for this, did I? (laughs)

0:03:46.1
Adam Garrett-Harris

Yeah. Yeah, like when you talk about best case that’s not very useful, ‘cause you could say best case is you find it right away and it’s a constant or something.

0:03:58.8
Safia Abdalla

Yeah.

0:03:59.3
Adam Garrett-Harris

But best case is you don’t have to go through the whole thing, but that’s never going to happen.

0:04:02.7
Safia Abdalla

So, I found a really great Stack Overflow question where somebody asked, you know, “What’s the difference between Big-O and Big Theta?” and it… you know? I will just read it because it verbalized what I was trying to say about bounds and stuff much better. So, Big-O defines the upper bound for how long it’ll take something to run, and Big Theta is a definition of both the upper and the lower bound. So, in general, what this answer is saying is that in industry you use Big-O because it defines the upper bound, which is like, the possible worst case scenario, and the reason that you would want to use Big Theta is it defines both the upper bound and the lower bound so it kind of gives you a range. So, if you have a small value for Big Theta, that’s basically saying that the variability between the worst possible time it would take your program to run and the best possible time is like, very low. So, it’s like, basically if the range of those two values is really small then the difference between the worst case situation and the best case situation is small so it’s like, trivial. And so like, yeah. Am I making any sense? (laughs)

0:05:18.2
Adam Garrett-Harris

Yeah, so it would be more accurate.

0:05:19.9
Jen Luker

If you’re trying to find something in an array and you only have one item in the array then it’s going to be a 1:1 ratio, so there’s really not going to be any difference between finding something in the array, the low bound is 1 and the high bound of that is 1; whereas, if you’ve got an array of a million and you’re trying to find one item, then the different operations that you can perform on it change how long it’s going to take. So, you could totally luck out and pick the first one the first time and that would be like one operation. Or, you could have to go through the whole entire thing and it’s the very last one and it gets to like, a million operations. So, then it’s a much larger range of how long it can take, right?

0:06:05.4
Safia Abdalla

Wait, sorry. Can you repeat that? I’m like, trying to read and process and go back into my memories for things. So, I’m like, all scatterbrained about this.

0:06:13.7
Jen Luker

(laughs) Okay.

0:06:16.6
Adam Garrett-Harris

Yeah, it’s weird because we’re talking about stuff that’s not even in the book. (laughs)

0:06:19.3
Safia Abdalla

Yeah, so yeah. I should bring it back. Basically, if you’re interested in time complexity and you’re like, curious about learning more about the ways that people might more rigorously define how long it will take programs to run and the bounds that they define for a program’s time complexity, there’s more. So much more! And some additional learning, if you’re interested, is to go look up Big Theta and Big-Omega and learn more about that. I do know that Khan Academy, in their computer science course has videos that explain all of the just different kinds of notations.

0:06:58.0
Adam Garrett-Harris

Yeah. So, right off the bat it says that, “All the examples in this chapter use arrays, even though you wouldn’t normally use arrays, because it makes it simpler to just explain time complexity.” And then we’ll get into data structures in the next chapter.

0:07:13.0
Jen Luker

Talking about things other than arrays, including arrays.

0:07:16.2
Adam Garrett-Harris

Yeah.

0:07:17.0
Jason Staten

Yeah. Working with the array is, I guess first, nice as a construct that we’re all pretty familiar with. Like, if you do programming and you have a list of something, generally, an array is the first thing that you’ll go with because you’ll want to be pushing something onto a list, right? Like, you’ll have a bunch of things, an array just being an indexed piece of memory that usually is like, one continuous piece of memory, is a good data structure to work with from a starting point, and it also, just by us working through this Big-O chapter and talking about some of the costs that are involved with interacting with an array, it does kind of explain or start to show the value of working with a different data structure if you happen to be doing other operations that an array is not well suited for. And it also says that in, it specifically says in .NET, but in whatever language that you’re working in, often there are other types of data structures that come with that language that you can bank on. For example, JavaScript happens to have sets in it as well, so if you want set-like behavior and need its benefits, that is something that’s available to you. So, being aware of what’s in there can be beneficial for your performance.

0:08:35.0
Adam Garrett-Harris

Yeah. So, I like how it starts out super simple and it’s like, “Hey. Say you have an array of 5 numbers and you want to get the first one. How many operations does that take? And it’s just one. And what if the array is 100 long? It’s still one. It doesn’t matter how long the array is, right? ‘Cause it’s just one operation.” That’s like, super ideal. If every algorithm could just be one operation that would be awesome.

0:08:57.8
Jason Staten

And I do like that it points out, as well, that even though it says one operation, it may not technically be like, that the CPU is doing exactly one thing, but we do go and kind of ignore some of the overhead involved with it. Like, oftentimes you know, you’ll go and you’ll make a read into memory and then load that thing onto the stack or something like that. And so, like, there are other things going on, potentially more than one, because I mean, there’s always lots of transistors involved in every little thing that you do, but boiling it down to that single operation, that “Yes, it is 1” or “It is constant” and how much time is required.

0:09:38.4
Adam Garrett-Harris

Yeah. Like, it doesn’t matter if it’s actually you doing 5 things. If it’s 5 things every time it’s constant and you can just say it’s O(1).
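A minimal sketch of the constant-time access being discussed (illustrative code, not from the book): grabbing an element by index costs the same no matter how long the array is.

```javascript
// Index access is a single O(1) operation regardless of array length.
const five = [3, 1, 4, 2, 5];
const hundred = Array.from({ length: 100 }, (_, i) => i + 1);

const firstOfFive = five[0];       // one operation
const firstOfHundred = hundred[0]; // still one operation on the bigger array
```

Whether the array holds 5 elements or 100, the lookup is the same single step.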

0:09:47.5
Jason Staten

Yeah.

0:09:48.3
Adam Garrett-Harris

What else?

0:09:49.7
Jason Staten

So, it moves on from just accessing the index of an array onto iterating over an array. And so, if you are looping over an entire array you are doing something that has the Big-O(N), N being the number of elements that happen to be in that array, and that is totally unavoidable. Like, there is no way that you can loop over everything that’s in an array without actually going over each of the elements. Like, I mean there is no shortcut here, but do know that when you are doing a for loop or a for each on some JavaScript thing, you’re doing something that requires Big-O(N), I guess also referred to as linear time scaling.

0:10:32.6
Adam Garrett-Harris

Yeah, and you can like, break out of that early and end up not looking at all of them, hopefully. Hopefully you would do that. There’s no need to keep looking at all of it if you’ve already found what you’re looking for, but you know, worst case is you look at all of them.

0:10:46.6
Jason Staten

Yeah, that is true. Like, even if you are going to break out, that is a good point. It’s that you may break out, and most of the time you might break out, but in the worst case you are going to take that long, therefore you still sit at Big-O(N) even if, in your particular use case, you happen to bump into the right thing on the first or second iteration. That is like, very data specific rather than algorithm specific, whereas the algorithm could potentially take the whole N number of iterations before it finds what it’s looking for.
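A sketch of the linear scan under discussion (illustrative, not the book's code): even with an early break, the worst case still visits all N elements, so it stays O(N).

```javascript
// Linear search: O(N) worst case, even though a lucky match exits early.
function linearSearch(items, target) {
  for (let i = 0; i < items.length; i++) {
    if (items[i] === target) return i; // early break on a hit
  }
  return -1; // worst case: every element was examined
}
```

Finding the first element takes one iteration; finding the last (or nothing at all) takes N, and Big-O describes that worst case.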

0:11:16.6
Jen Luker

The next set actually increases the complexity a fair amount, and one of the examples that they gave here was that maybe you duplicate an item in the array and now you have to find the duplications and for that it’s O of N squared [ O(N squared) ] because you have to go through the array twice to find all of the items that you have, and then compare it against itself to see if there is any duplication.

0:11:41.2
Adam Garrett-Harris

Yeah. So, if you loop through it once, and then loop through it again that would be 2N.

0:11:46.7
Jen Luker

Oh, right.

0:11:47.4
Adam Garrett-Harris

Which is, essentially N.

0:11:47.4
Jason Staten

Hmm.

0:11:49.7
Adam Garrett-Harris

But, in this case it’s like a nested for loop.

0:11:52.8
Jen Luker

Right.

0:11:53.5
Adam Garrett-Harris

So, it ends up being like, NxN instead of N+N.

0:11:56.8
Jen Luker

Because you’re having to go through N for every single item.
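A hypothetical version of the nested-loop duplicate check being described: for each element you compare against the rest of the array, which is N×N comparisons in the worst case, i.e. O(N squared).

```javascript
// Nested loops: for every element, scan the remaining elements for a match.
function findDuplicates(items) {
  const dupes = [];
  for (let i = 0; i < items.length; i++) {       // outer pass: N elements
    for (let j = i + 1; j < items.length; j++) { // inner pass: up to N more
      if (items[i] === items[j] && !dupes.includes(items[i])) {
        dupes.push(items[i]);
      }
    }
  }
  return dupes;
}
```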

0:12:00.4
Adam Garrett-Harris

Yeah.

0:12:01.3
Jen Luker

Yeah.

0:12:02.4
Adam Garrett-Harris

That’s not very good. (laughs)

0:12:03.5
Jen Luker

Hmm, no.

0:12:04.9
Adam Garrett-Harris

And like, I like seeing these things, it doesn’t include one in the book, but I like seeing a chart of these things and you can see like how bad some of these get when there’s a lot of elements.

0:12:14.4
Jen Luker

Oh yeah. It gets fun real fast. The exponential ones are always fun.

0:12:19.3
Safia Abdalla

I was just going to talk a little bit about the discussion on optimizations that he covers, and it starts on page 92 if you’re following in the book that I’m in, and it covers how you might optimize the summing function that he was talking about when he was showcasing linear time, and he uses this really cool math trick to convert the amount of time it takes for summing a list from O(N), the number of items in the list, to just a single, like, equation that you can use. So, it runs in O(1), and that kind of bit where he does that analysis highlights one of the central themes that he goes on to cover in the rest of the chapter, which is that you can always make things better. Once you know the time complexity of something you can start to think a little bit about how you optimize it and improve it. And in future sections of this chapter he talks about how you can improve implementations that are slow. A little bit of foreshadowing there for you.

0:13:23.1
Adam Garrett-Harris

Yeah. One thing I like about that is that it takes a little bit of additional information to create that equation and that is that the array is contiguous and starting at 1. So, it doesn’t skip any numbers, it doesn’t have any duplicates. And so, a lot of times in interview questions they give you some sort of extra information and you really need to use all of that information, there’s like, some reason why they’re telling you that, there’s some sort of optimization you can do with it.
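Assuming the trick in question is the classic closed-form sum (a sketch, not the book's exact code), it replaces an O(N) loop with a constant-time formula, and it only works because of the extra information Adam mentions: the numbers run contiguously from 1 with no gaps or duplicates.

```javascript
// O(N): add every number from 1 to n, one at a time.
function sumLoop(n) {
  let total = 0;
  for (let i = 1; i <= n; i++) total += i;
  return total;
}

// O(1): the closed form n * (n + 1) / 2 gives the same answer in one step,
// but only under the assumption of a gapless 1..n sequence.
function sumFormula(n) {
  return (n * (n + 1)) / 2;
}
```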

0:13:55.4
Safia Abdalla

I really like that point of like, understand the assumptions and the context of the problem you’re solving and use that information to help you improve your algorithm. It reminds me… see, I don’t want to reference things again because I might get the details wrong, but I’ll leave it as a hint for folks and people can look it up if they’re interested. This kind of notion of thinking and recognizing the context of the problems you’re solving and using that to improve the run time of your program reminds me of something that I read about a long time ago about Timsort. T-I-M sort-

0:14:32.6
Jason Staten

Oh!

0:14:33.5
Safia Abdalla

Which is the… who exclaimed?

0:14:37.1
Jason Staten

That was me. You keep going.

0:14:40.1
Safia Abdalla

Okay.

0:14:40.7
Adam Garrett-Harris

(laughs)

0:14:41.6
Safia Abdalla

Somebody got excited so-

0:14:43.2
Jason Staten

Yes.

0:14:43.5
Jen Luker

Everyone but me.

0:14:45.1
Safia Abdalla

(laughs) So, Timsort is the default sorting method used in the Python programming language. If you’ve used Python and you like, call sort on a list of numbers or something, that algorithm is what is called on your list. And it’s really performant, I think. Like, the sorting worst case, or maybe average case, is O(n log n), again foreshadowing to something that’s covered later on in the book. And when I was reading about this I was really fascinated about it because one of the ways that algorithm is so optimal is because it leverages the fact that when you’re writing a computer program in the world and you know you’re not solving some like, problem set or math problem, it’s highly likely that the data that you’re going to get into your sorting function, or wherever you want to sort it, is probably already sorted in some way. You’ve got little bits of ordered sequences in a generally unordered list, and so the implementation of Timsort uses that fact that most data in the real world is partially sorted already to create an improved implementation of a sorting algorithm that uses 2 other sorting algorithms in it. The big point that I wanted to share there, and like, the thing that like, blew my mind, especially when I first learned about it, was like, “Oh yeah. You can kind of look at the program that you’re writing and say, you know, there’s no situation in the real world where data is completely random. There’s always going to be some sort of order.” And you can use that assumption to guide you to making a more efficient algorithm like Timsort in Python. I also think that now it’s used in maybe Java and a couple of other things, maybe. But yeah, if you’re interested in that, Timsort, a pretty cool thing to check out and read about.

0:16:36.9
Jason Staten

So, I’m glad you brought up Timsort, that is one of the resources that I actually have a link to that I want to share because it surfaced again recently and so there’s a good Hacker Noon about it, and it’s kind of awesome how it goes and, before going and doing the sort, or early on in its process of sorting, it goes and actually gathers heuristics about your data that you’re sorting against, too. So, for example, if you’re doing a sort on a list that has fewer than 64 elements in it, then Timsort just does insertion sort so it doesn’t do anything real fancy. Like, insertion sort is just kind of the slow, I guess it’s considered slow in like, the great scheme of things because, I mean, it is Big-O(N Squared). However, because it’s working on such a small list of things, like 64, that is a really small number for a computer to have to work with when it comes to working with an array size, that insertion sort is the fastest approach that it can take; because there’s no overhead of having to split the list or start creating indexes or pointers at other parts of the list to do more advanced sorts such as merge sort and quicksort which, I think, yeah, that does come a little later in the book, but yeah, it is a very practical approach because it’s assessing the data and making assumptions about it as it goes along. And so, yeah, I’m glad you’re also familiar with it, Safia, and I will share a link with you, Adam and Jen.

0:18:17.1
Safia Abdalla

And I want to let our listeners know that Jason and I did not coordinate that at all, we’ve just-

0:18:22.0
Jason Staten

Mm-hmm (affirmative).

0:18:22.7
Safia Abdalla

Got some like, Vulcan mind meld thing going on right now. (laughs) So, yeah. Thanks for sharing that and like, spelling out some of the heuristics that it uses a little bit more clearly for everyone listening. That’s super cool.

0:18:35.8
Adam Garrett-Harris

Yeah, that was really cool. I’ve never heard of that. Okay, so we were talking about N squared and then there’s log (log N). I love how it takes a break to explain logarithms because if you haven’t been in a math class for a while you might not remember logarithms, and I’d completely forgotten them like a year ago when I started reading about some interview stuff, preparing for interviews, and it’s basically the opposite operation from exponents.

0:19:06.2
Jason Staten

So, if you, for example, in the case that you want to solve 2 to the X equals 512, that is where you could go and use log in order to solve it. So, you could say log base 2 of 512 equals X, and that will get you back your result.
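That example can be checked directly. A quick sketch (`Math.log2` is JavaScript's built-in base-2 logarithm):

```javascript
// log base 2 answers "2 to what power equals 512?"
const x = Math.log2(512); // 2 ** 9 === 512, so x is 9 (up to floating point)
```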

0:19:29.5
Adam Garrett-Harris

Yeah, and logarithms have a base as well, but you know, one thing about Big-O is the math, you want to simplify it as much as possible, like all the extra things don’t matter so it doesn’t even matter what base it is. You just say “log,” “log N.” It doesn’t matter if it’s log base 2, but I guess like, with logarithms the thing to keep in mind is just if you’re splitting things in half, like if you’re doing a binary search, then it’s going to have a log in the complexity.

0:19:55.7
Jason Staten

Adam, you said binary search, what’s that?

0:19:57.8
Adam Garrett-Harris

Binary search, isn’t that what you mentioned? Did you mention that one earlier? No, you didn’t. So, binary search is a divide and conquer algorithm and we may be talking about it later in a couple of chapters, but you split the array in half and then you split it again. Or, actually you split it up completely and then you’re like, combining it. You’re repeating the same thing over and over. That wasn’t a very good explanation. I was thinking of a binary sort.

0:20:24.4
Jen Luker

Mm-hmm (affirmative).

0:20:24.9
Adam Garrett-Harris

A binary search is where you look halfway and then you decide which half to look at, you know, look half way and you get closer and closer until you find it.

0:20:31.1
Jason Staten

Mm-hmm (affirmative).

0:20:31.2
Jen Luker

So, one of the easiest things is to say okay, so you have a list of 1 to 100 and you start with 50 and you say, okay is the number you’re looking for higher or lower than 50? And you’re assuming this list is sorted, and at that point you say okay, well my number’s 27 so we’re gonna cut off everything above 50 because it’s below, and then we look at 1-50 and we divide it in half again and say 25, okay so is 27 higher or lower than 25? So, we know it’s higher than 25 so we cut off everything from 1-25. So then we look at you know, 25-50 and we find the middle point at like, 37, and is it higher or lower? It’s lower, you cut it in half and now we’re looking at 25-37 and you keep dividing it down until you land on our number, 27.
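Jen's walkthrough translates almost directly into code. A minimal sketch (illustrative, and it requires the array to be sorted):

```javascript
// Binary search: repeatedly halve the range that could hold the target.
function binarySearch(sorted, target) {
  let lo = 0;
  let hi = sorted.length - 1;
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (sorted[mid] === target) return mid; // found it
    if (sorted[mid] < target) lo = mid + 1; // discard the lower half
    else hi = mid - 1;                      // discard the upper half
  }
  return -1; // not present (or the input wasn't actually sorted)
}

// 1..100, looking for 27, as in the example above.
const oneToHundred = Array.from({ length: 100 }, (_, i) => i + 1);
const where = binarySearch(oneToHundred, 27); // index 26 holds the value 27
```

Each pass throws away half the remaining range, which is where the log comes from: about 7 steps for 100 elements instead of up to 100.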

0:21:15.4
Adam Garrett-Harris

That was a better…

0:21:16.7
Jen Luker

(laughs)

0:21:17.1
Adam Garrett-Harris

Yeah, thank you.

0:21:18.0
Jason Staten

(laughs)

0:21:18.5
Jen Luker

You know, because of that you didn’t have to look through all of those numbers you otherwise would have and instead you have like, 8 steps instead of 100 steps or 27 steps to go from 1-27 to find your number.

0:21:29.1
Adam Garrett-Harris

Right.

0:21:29.8
Jason Staten

So-

0:21:30.6
Adam Garrett-Harris

Yeah, and the key is making sure it’s sorted first.

0:21:33.2
Jason Staten

Yeah, that is a requirement on binary search otherwise you will not find what you’re looking for.

0:21:38.3
All

(laughing)

0:21:39.1
Jen Luker

Yeah, you can’t determine which half it’s in if it’s not sorted.

0:21:42.2
Jason Staten

Exactly. So, one real world metric. So, I went ahead and did some of these algorithms inside of Rust, because that’s what I’ve been sticking to, and I went and put a brute force search which is just looping directly over the list up against a binary search, and for the case of looking up index 900k in a list of a million items. For a brute force search every time it needed to go and find that, or in order to find that one time, it took 429 nanoseconds. I mean, fast in human time, but slow in computer. And to compare that to binary search, it actually took 19 nanoseconds in order to find it. So, dramatically different amount of time and so, if you do have a sorted list, like using an algorithm that has a smaller Big-O when you are working on larger datasets can really be beneficial there.

0:22:44.3
Adam Garrett-Harris

Yeah, and then if you double the list it would be even worse.

0:22:47.0
Jason Staten

Exactly. It just gets better and better of an option as your list size grows and grows.

0:22:51.7
Adam Garrett-Harris

So, what else stuck out in this chapter?

0:22:54.1
Safia Abdalla

Well, I was glad to see that the book covered something I mentioned briefly, I think, in the complexity chapter which was a couple of episodes ago, which was space complexity versus time complexity, and recognizing that sometimes things might run out of memory resources before they actually finish executing. So, yeah I thought it was really cool that he, the author, had a chance to tack that in on the end of the chapter here.

0:23:22.2
Adam Garrett-Harris

Yeah. I thought of you when I read that.

0:23:23.8
Safia Abdalla

Aww.

0:23:24.5
Adam Garrett-Harris

(laughs)

0:23:25.2
Safia Abdalla

It is page 106 and 105 for those who are following along.

0:23:30.0
Adam Garrett-Harris

I also thought the quick break for math geeks on 104 was pretty cool. I’ve seen this video before from “Numberphile” where they explain adding up all the natural numbers from 1 to infinity equals -1/12, and I’ll put a link for that in the show notes, it’s pretty cool.

0:23:46.6
Jason Staten

You’re not going to explain it right here, Adam?

0:23:48.9
Adam Garrett-Harris

I cannot explain it, it’s weird. It’s weird.

0:23:52.3
Safia Abdalla

Yeah. There is an addendum here in the book that says, “Fair warning. You’ll get lost down this rabbit hole” in reference to it.

0:23:59.4
Jason Staten

(laughs)

0:24:01.1
Safia Abdalla

So-

0:24:01.4
Adam Garrett-Harris

Yeah.

0:24:01.4
Safia Abdalla

Be prepared, reader!

0:24:02.6
Adam Garrett-Harris

I did. I did click on a 2nd video after watching that one, but...

0:24:06.1
Safia Abdalla

(laughs) That’s how they get you!

0:24:08.4
Jason Staten

I also liked the section on thinking in Big-O, where Rob actually goes through and summarizes each of the Big-Os, or at least, not all of them, but at least the ones that you’ll likely care about as a developer. So, in particular with arrays, it says, “If you’re randomly accessing an array, it’s always Big-O(1).” So, you’re just grabbing an index off an array it’s constant time, like Big-O(1), one operation required. If you’re iterating over the list, then you have a Big-O(N). If you’re doing a nested loop then you have a Big-O(N Squared). If you are dividing that list up and working through those pieces then you have a Big-O(Log N). And if you’re iterating and then dividing, then you have a Big-O(N Log N). So, those are a nice, I don’t know, handy guide to keep in mind. So, when you look at a piece of code, it can be at least somewhat obvious to you of what the time complexity of it is. If you see a number of for loops just within it, like if you’re seeing nesting or something going on you can say, “Oh yeah, that is Big-O N Squared. Maybe we can improve this.” Or something.

0:25:20.0
Adam Garrett-Harris

Yeah. That’s a good little cheat sheet. All right, are we ready for the next chapter? Data Structures?

0:25:24.8
Jen Luker

Yeah.

0:25:25.4
Jason Staten

Let’s talk about them.

(Typewriter Dings)

0:25:28.2
Adam Garrett-Harris

There's a funny little joke. He said something here at the beginning about JavaScript developers, that you just basically use arrays for everything.

0:25:34.8
Jen Luker

I thought we used objects for everything.

0:25:36.5
Adam Garrett-Harris

Oh, yeah. Yeah, and arrays are objects which is weird.

0:25:39.9
Jason Staten

Yep. Objects with numeric keys.

0:25:41.4
Jen Luker

Well, that joke fell flat.

0:25:42.9
Adam Garrett-Harris

What was the joke? (laughs)

0:25:43.9
Jen Luker

That everything is an object in JavaScript.

0:25:47.6
Adam Garrett-Harris

(laughs) Yeah. Yeah. Sorry, I didn’t… I don’t see the joke.

0:25:51.7
Jen Luker

I hope somebody out there laughs.

0:25:54.2
Safia Abdalla

Wait, what, what did you say?

0:25:56.3
Jen Luker

(laughs)

0:25:57.1
Jason Staten

(laughs)

0:25:57.8
Jen Luker

He said he was referencing the part at the beginning of Data Structures saying that JavaScripters use arrays for everything. I said, “I thought we used objects for everything.” The joke being that everything in JavaScript is an object.

0:26:09.1
Safia Abdalla

Oh. Oh! I see. I thought you were saying that Rob wrote a joke and I missed it, like in the book.

0:26:15.4
Adam Garrett-Harris

Yeah, I think I said that but it’s not. Not really. Anyway. Yeah. So, it starts off talking about arrays. I kind of forget this because I write in JavaScript so often, but arrays are typically a specific length and you can’t resize them because they take up a specific amount of memory and they have to be contiguous in memory. So, if you wanted to resize it you’d have to make it a completely new array and then copy everything, and then add on.

0:26:41.0
Jason Staten

Mm-hmm (affirmative)

0:26:41.3
Safia Abdalla

That like, confused me for the longest time. I feel like different programming languages use array and list differently and I always get the terminology confused because sometimes a list is used to refer to something that is technically an array and vice versa. So I’m always like… that fact always surprised me, too.

0:27:01.3
Adam Garrett-Harris

Oh yeah, so, what is the difference between a list and an array in some language?

0:27:06.7
Safia Abdalla

So, the distinction is that with an array you can do direct index-based access, and a list is something where you have to do sequential access. So, to get to the next item you need to like, go through the previous one, and generally that’s because of how it’s laid out in memory: an array takes up a contiguous chunk of memory. So, all of the numbers that are next to each other in the list- … See! I did it again! (laughs)

0:27:35.6
Adam Garrett-Harris

(laughs)

0:27:36.2
Safia Abdalla

All of the numbers that are next to each other in the array are also physically contiguous in memory so you can kind of just like, bounce directly from one to the next, but in a list they are not in the same chunks of memory so you might have the first three elements of your array in one chunk of memory and the next three in another so you can’t access them directly because you can’t compute their position as directly as you would if they were all lumped together in memory.

0:28:07.8
Adam Garrett-Harris

Yeah, I think you did it again. You said array when you meant list. (laughs)

0:28:10.5
Safia Abdalla

Oh my goodness! Ahh! (laughs)

0:28:12.6
Jason Staten

(laughs)

0:28:15.3
Adam Garrett-Harris

I didn’t even realize that when you said “list” it meant the same thing as “linked list.”

0:28:19.6
Jason Staten

Because it doesn’t, exactly mean the same thing as linked list.

0:28:22.7
Safia Abdalla

Yeah.

0:28:23.9
Jason Staten

So, in, I guess quick Rust note, in Rust you have arrays which are a fixed size type of data structure where you’d say, “This is an array and it holds 10 unsigned 8-bit integers” or something. Like, 10 numbers in it and that is all it can hold and it can do no more, and then there’s also a, they call it a “vec”, short for “vector”, and then that is the self-growing type of data structure where if you keep pushing onto it beyond its capacity it will grow its own capacity; and whether or not it is a linked list and you have to follow something is dependent on the structure. I believe that a vec works in the way of once you surpass the capacity it will go and recreate the underlying structure with maybe doubling the size of what it was before. And so, like, say I have a vector that happens to have a capacity of 10 and I put the 11th item in it, or I go to do that, it will go and actually copy that original set of memory into an array, like an underlying array that has room for 20 elements, or something, in it. Whatever algorithm they choose to expand its capacity with. So, in that case it’s not actually linked and it still is contiguous memory, but there is a cost every time that you surpass that underlying array’s capacity because it has to go and rebuild the array again.
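The grow-by-copying behavior Jason describes can be sketched in JavaScript as a toy (hypothetical code; real JS arrays resize themselves, and the doubling factor here is an assumption, since growth strategies vary by implementation):

```javascript
// Toy growable array: a fixed-capacity buffer that is rebuilt at double
// the size whenever a push would exceed its current capacity.
class GrowableArray {
  constructor(capacity = 10) {
    this.buffer = new Array(capacity);
    this.length = 0;
  }

  push(value) {
    if (this.length === this.buffer.length) {
      // Capacity exceeded: copy everything into a buffer twice as large.
      const bigger = new Array(this.buffer.length * 2);
      for (let i = 0; i < this.length; i++) bigger[i] = this.buffer[i];
      this.buffer = bigger;
    }
    this.buffer[this.length] = value;
    this.length += 1;
  }
}
```

Pushing an 11th item onto a capacity-10 buffer triggers one copy and leaves room for 20; that occasional copy is the cost Jason mentions.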

0:29:49.0
Adam Garrett-Harris

Cool.

0:29:49.7
Safia Abdalla

Yeah. Anyway. Words are hard to use sometimes. (laughs) But yeah, array, list, two different things despite the fact that I mix them up, and it’s weird ‘cause like, conventionally an array is a list of things so you’re tempted to call it a list but like, technically-

0:30:04.5
Adam Garrett-Harris

Yeah.

0:30:05.4
Safia Abdalla

It’s not. Yeah.

0:30:06.9
Jen Luker

You know, this is something that I’m afraid to bring up because I don’t want to talk about it, or explain it really, but this is how I learned to understand monoids, was the-

0:30:16.0
Jason Staten

Here we go.

0:30:16.6
Adam Garrett-Harris

(laughing)

0:30:17.3