An Introduction to Javascript Generators
written by Gorilla Sun
A tale of Iterators and Iterables
Generators are essentially iterators in disguise. Let me explain what that means! An iterator can be best explained as a "protocol that defines how to produce a sequence of values from an object". That's a bit of a mouthful, and it's probably best to directly have a look at an example. You're probably familiar with the for...of loop statement that was introduced in ES6 (not to be confused with the for...in statement):
let nums = [1, 2, 3, 4, 5]
for(let n of nums){
  console.log(n)
}
// 1, 2, 3, 4, 5
It's essentially an alternative to the regular old-school imperative for loop, where we have to access the elements of the array manually and keep track of the index ourselves. The for...of loop takes care of all of that for us: we just pass it the array that we want to loop over and it serves us the elements directly, without us having to access them by index (as opposed to the for...in loop statement, which hands us the keys). Fun fact, the for...of loop can also be used for other objects such as sets and strings (and more)! Neat, right?
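As a quick illustration of that last point, here is a small sketch that runs for...of over a string, a Set and a Map. The variable names are made up for this example.

```javascript
// for...of works on any built-in iterable, not just arrays
const letters = []
for (const ch of 'abc') {
  letters.push(ch) // strings are iterated character by character
}

const unique = []
for (const n of new Set([1, 1, 2, 3, 3])) {
  unique.push(n) // a Set yields each distinct value once, in insertion order
}

const pairs = []
for (const [key, value] of new Map([['a', 1], ['b', 2]])) {
  pairs.push(key + '=' + value) // a Map yields [key, value] entries
}

console.log(letters) // ['a', 'b', 'c']
console.log(unique)  // [1, 2, 3]
console.log(pairs)   // ['a=1', 'b=2']
```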
The object that is passed to a for...of loop needs to be an iterable. The important thing to understand here is that the for...of loop doesn't actually loop over the object by itself, but makes use of an iterator function that is attached to the iterable. The for...of loop simply invokes that iterator function and uses the iterator it returns to produce the correct sequence of items. This means that arrays, sets, strings, etc. already have such an iterator function built in, and it dictates the sequence in which the for...of loop goes over the elements. In his book Exploring ES6, Dr. Axel Rauschmayer gives a good summary of what iterables and iterators are:
An iterable is a data structure that wants to make its elements accessible to the public. It does so by implementing a method whose key is Symbol.iterator. That method is a factory for iterators.
In essence, iterables are sequences of data that are intended to be traversed and consumed, and that hold their own iterator function in a property keyed by Symbol.iterator. That function describes what the sequence of items should look like when it's being looped over. What Javascript Symbols are is well outside the scope of this article (you can learn more about Symbols here) and isn't of much concern anyway: we simply need to implement this function to make an object iterable. More on that later.
An iterator is a pointer for traversing the elements of a data structure (think cursors in databases).
Basically, iterators are procedures that describe how the iterable sequence is being traversed. A more formal definition for an iterator can be found on MDN, it essentially is an object that satisfies two conditions:
In JavaScript an iterator is an object which defines a sequence and potentially a return value upon its termination. Specifically, an iterator is any object which implements the Iterator protocol by having a next() method […] — MDN
We're gonna see what all of this means in a hands-on manner throughout the next two sections! Additionally, all of Javascript's built-in iterators (the ones you get from arrays, sets, strings and, as we'll see, generators) are also iterables, because they carry a Symbol.iterator method that simply returns the iterator itself. The reverse doesn't hold: not all iterables are iterators. An array, for instance, is iterable but has no next() method of its own. This is probably very confusing at this point, but hang on, it'll get a little clearer in the coming sections. I hope I have piqued your interest at this point, and you're curious to know how generators fit into the overall picture!
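To make the relationship between the two a bit more tangible, here is a minimal sketch that does by hand what for...of does for us, using the iterator of a plain array. The variable names are only for illustration.

```javascript
const nums = [1, 2, 3]

// Ask the array for a fresh iterator via its Symbol.iterator method
const it = nums[Symbol.iterator]()

const first = it.next()  // { value: 1, done: false }
const second = it.next() // { value: 2, done: false }

// The built-in array iterator is iterable itself: its Symbol.iterator
// method simply returns the iterator, so for...of resumes where we stopped
const sameObject = it[Symbol.iterator]() === it // true
const rest = []
for (const n of it) rest.push(n) // picks up the remaining value: 3

const finished = it.next() // { value: undefined, done: true }

// The array is an iterable but not an iterator: it has no next() method
const arrayHasNext = typeof nums.next === 'function' // false

console.log(first, second, rest, finished, sameObject, arrayHasNext)
```

The array only knows how to hand out iterators, while the iterator is the thing that actually keeps track of the position in the sequence.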
Generators: Iterators in Disguise!
A Javascript generator is a special type of synchronous function whose execution can be manually paused and resumed, all the while remembering where it was paused. Regular functions don’t have this ability: once invoked, they run until they are fully executed or a return statement is encountered.
To best illustrate this, let’s start by having a look at the syntax! To tell Javascript that a function should be a generator, we simply follow the function keyword with an asterisk symbol (*). And that's that. How this syntax came to be, from a historical point of view, is a bit hazy. Here's an initial minimal example:
function* generator(){
  yield;
}
Our next point of interest is the yield keyword that you can see in the body of the generator function. Think of it as the return keyword’s distant cousin: it is also used to get a value out of the generator function, however, unlike the return keyword, it doesn’t terminate the function but rather just pauses its execution. The next time we ask the generator for a value, execution resumes right after the last yield that was reached. This essentially means that the yield keyword can occur more than once throughout the generator’s body:
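function* generator(){
  yield 1;
  yield 2;
  yield 3;
}
// calling the generator function returns a generator object; we keep a
// reference to it so that every next() call advances the same generator
const gen = generator()
console.log(gen.next().value) // 1
console.log(gen.next().value) // 2
console.log(gen.next().value) // 3
console.log(gen.next().value) // undefined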
You probably also noticed that we can’t simply invoke the generator like a regular function. Calling the generator function doesn't run any of the code in its body, it only returns a generator object. To actually make the generator run until it reaches the next yield statement (and to get a value out of it) we need to call the next() method on that object. Each call returns a little object with two properties, value and done, where the former is simply the value that was handed to the yield statement, and the latter is a boolean indicating whether the generator has finished. Once there are no more yield statements to consume, you will get {value: undefined, done: true}.
console.log(generator().next()) // {done: false|true, value: val}
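To see the whole life cycle in one place, here is a small sketch that collects every result object of a two-step generator, and also shows what happens when next() is called on a freshly created generator each time.

```javascript
function* generator(){
  yield 1;
  yield 2;
}

const gen = generator() // nothing in the body has run yet

const results = [
  gen.next(), // { value: 1, done: false }
  gen.next(), // { value: 2, done: false }
  gen.next(), // { value: undefined, done: true }
  gen.next(), // { value: undefined, done: true } (it stays finished)
]
console.log(results)

// Each call to generator() creates a new, independent generator object,
// so calling next() on a fresh one always starts from the first yield
const fromFreshA = generator().next().value // 1
const fromFreshB = generator().next().value // 1
console.log(fromFreshA, fromFreshB)
```

This is why it matters to store the generator object in a variable before stepping through it.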
Input parameters are passed to the generator function when it is called, just like with a regular function. In addition, a value passed to next() is sent into the running generator, where it becomes the result of the yield expression that is currently paused. In this manner, generators are iterators, as they satisfy the two conditions that we've mentioned earlier:
- Generators are objects that implement a next() function. Checking typeof generator().next gives you 'function'; the method is inherited from the generator's prototype, so that's where you'll find it when inspecting a generator object in the console.
- A generator has a return value when it terminates. This happens when all yield statements are exhausted or a return statement is reached, and even holds true when there are no yield statements in the body of the generator function.
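Here is a minimal sketch of those points, assuming a made-up generator called greeter: it takes a regular argument when it is created, receives a value through next(), and finishes with a return value.

```javascript
// Arguments are passed when the generator function is called; a value
// handed to next() becomes the result of the paused yield expression
function* greeter(greeting){
  const name = yield 'What is your name?'
  return greeting + ', ' + name + '!'
}

const gen = greeter('Hello')

const question = gen.next()    // runs up to the first yield
const answer = gen.next('Ada') // resumes: the yield expression evaluates to 'Ada'
const after = gen.next()       // the generator has already finished

console.log(question) // { value: 'What is your name?', done: false }
console.log(answer)   // { value: 'Hello, Ada!', done: true }
console.log(after)    // { value: undefined, done: true }

// next() is inherited from the prototype, so it is a function on the
// generator object even though it is not one of its own properties
console.log(typeof gen.next) // 'function'
```

Note that the returned value shows up together with done: true, which is why a for...of loop or the spread operator would not include it.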
In the coming sections we'll have a crack at creating our own custom iterator functions for our objects! Generators will come in very handy for that!
Generators as Iterables!
At the same time, generators also act as iterables! For example, we can simply pass our generator to a for...of loop to extract values from it:
function* generator(){
  yield 1;
  yield 2;
  yield 3;
}
for(let val of generator()){
  console.log(val)
} // 1, 2, 3
We can also do all the other stuff that we would normally be able to do with iterables, like extracting the entire sequence with a spread operator:
let seq = [...generator()]
This implies that we can pass one generator (as an iterable) to another generator (as an iterator):
function* iterable(){
  yield 1;
  yield 2;
  yield 3;
}
function* iterator(iterable){
  for(let val of iterable){
    yield val
  }
}
let a = [...iterator(iterable())]
console.log(a) // [1, 2, 3]
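As a side note, the for...of plus yield combination inside a generator can also be written with the yield* expression, which delegates to another iterable. The sketch below reuses the names from the example above; the combined generator is a made-up addition to show that yield* accepts any iterable.

```javascript
function* iterable(){
  yield 1;
  yield 2;
  yield 3;
}

// yield* delegates to another iterable and yields each of its values in turn
function* iterator(source){
  yield* source
}

// yield* accepts any iterable, so generators and arrays can be mixed
function* combined(){
  yield 0
  yield* iterator(iterable())
  yield* [4, 5]
}

const a = [...iterator(iterable())]
const b = [...combined()]
console.log(a) // [1, 2, 3]
console.log(b) // [0, 1, 2, 3, 4, 5]
```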
This can get a bit confusing and it's probably a bit difficult to see how this is useful, but later we'll get into an interesting use case, where we can plug these generators together in a sort of modular manner and achieve lazy evaluation of values. However, let's first have a look at how to create custom iterator functions for our own objects!
Creating custom Iterables with Generators
Some objects in Javascript already have iterators built-in, simply allowing us to pass them to for...of loops. But what if we create a custom object, how can we turn it into an iterable that we could potentially pass to a for...of loop? In this case we would have to define our own iterator and attach it to the object under the Symbol.iterator property that we've already discussed earlier. To demonstrate this, I'll borrow an example from an amazing talk by Anjana Vakil: The Power of JS Generators. Let's assume we have the following object:
let cardDeck = {
suits: ['♥️', '♦️', '♠️', '♣️'],
court: ['J', 'Q', 'K', 'A']
}
The expected behaviour here, when passing it to a for...of loop, would be to loop over all of the court + suit combinations in addition to the pip cards (the cards with numbers on them). That makes 52 distinct cards in total, and we want to do this without manually having to type all of them out. Building our iterator object from scratch, we need to follow the two conditions:
- We need to implement a next() function
- We need to return an object of the form {done: true|false, value: val}
Typically this is implemented in the following manner:
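cardDeck[Symbol.iterator] = function(){
  return {
    // the parentheses make the arrow function return the object literal,
    // otherwise the braces would be read as a function body
    next: () => ({
      done: true,
      value: 'hello'
    })
  }
}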
A little bit difficult to digest at first glance. The function returns an object that holds the next() function as a property. In turn, the next() function is what returns the result object that holds the done and value properties. This would already work if we were to pass it to a for...of loop, but is still a bit primitive in its current state: since done is true right away, the loop body never runs. We defaulted the done value to true because having it set to false would be problematic if we fed it to a for...of loop (it would loop indefinitely). We've got to add some more stuff to make this work as intended:
let cardDeck = {
  suits: ['♥️', '♦️', '♠️', '♣️'],
  court: ['J', 'Q', 'K', 'A'],
  [Symbol.iterator]: function(){
    // indices that keep track of our position in the two arrays
    let suitsIdx = -1;
    let courtIdx = 0;
    return {
      next: () => {
        // simulating a nested loop here: courtIdx is only incremented
        // whenever suitsIdx has made a complete pass over the suits array
        if (suitsIdx >= this.suits.length - 1) {
          suitsIdx = 0;
          courtIdx++
        } else {
          suitsIdx++;
        }
        // when courtIdx has made a complete pass, we are done
        if (courtIdx > this.court.length - 1) {
          return {
            value: undefined,
            done: true,
          };
        }
        // if the previous condition didn't trigger, we return the result
        // object with the current suit/court combination
        return {
          done: false,
          value: this.suits[suitsIdx] + this.court[courtIdx]
        }
      }
    }
  }
}
Passing this to a for...of loop, we successfully generate all of the possible suit/court combinations, but we would still need to work on it to generate all of the 52 cards. Try changing the code to make it also produce the numerical cards, it's a bit of a hassle... Imagine if you had more complicated objects for which you would want to generate certain sets of combinations! The problem of adding your own iterator function to your custom objects becomes much more tractable when using generators! Let's do the same thing as we just did, with the simple difference that this time around we'll use a generator as a middle man:
let cardDeck = {
  suits: ['♥️', '♦️', '♠️', '♣️'],
  court: ['J', 'Q', 'K', 'A'],
  [Symbol.iterator]: function* (){
    for(let suit of this.suits){
      for(let i = 2; i <= 10; i++) yield suit + i;
      for(let c of this.court) yield suit + c;
    }
  }
}
See how much more compact this has become?! We didn't have to deal with keeping track of any index, the syntax is a lot cleaner and much more readable, and the generator implicitly took care of returning the result objects. Essentially, we mediated between iterability and our custom object by means of a generator! By adding the Symbol.iterator function we've provided a way for consumers like for...of to iterate over our object.
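A quick sanity check of the generator-backed deck could look like the sketch below. It repeats the object so that it runs on its own; the variable names cards, firstCard and secondCard are only for illustration.

```javascript
const cardDeck = {
  suits: ['♥️', '♦️', '♠️', '♣️'],
  court: ['J', 'Q', 'K', 'A'],
  [Symbol.iterator]: function* (){
    for(const suit of this.suits){
      for(let i = 2; i <= 10; i++) yield suit + i;
      for(const c of this.court) yield suit + c;
    }
  }
}

// spreading runs the generator to completion and collects every card
const cards = [...cardDeck]
// destructuring also uses the iterator, but only pulls the cards it needs
const [firstCard, secondCard] = cardDeck

console.log(cards.length)            // 52
console.log(new Set(cards).size)     // 52, so there are no duplicates
console.log(firstCard, secondCard)   // the 2 and the 3 of the first suit
console.log(cards[cards.length - 1]) // the ace of the last suit
```

Every place where Javascript expects an iterable now accepts the deck, which is the whole point of implementing Symbol.iterator.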
Moral of the story is: the next time that you want to turn one of your objects into a custom iterable, try to use a generator to make things easier!
Implementing Lazy Evaluation with Generators
All of what precedes already makes generators incredibly versatile tools, however Javascript generators really shine when it comes to lazy evaluation (also sometimes referred to as call-by-need). Here it might be useful to spend a moment and talk about what that actually means. In essence, it’s an evaluation strategy where the computation of values and/or the evaluation of statements is delayed until it is absolutely necessary.
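A tiny sketch of the difference, assuming a made-up expensiveSquare function that counts how often it gets called: the eager version does all of the work up front, while the lazy version only computes a value once somebody asks for it.

```javascript
let computed = 0 // counts how many times the (pretend) expensive work runs

function expensiveSquare(n){
  computed++
  return n * n
}

// Eager: every value is computed up front, whether we need it or not
const eager = [1, 2, 3, 4, 5].map(expensiveSquare)
const eagerCount = computed // 5

// Lazy: a value is only computed when next() asks for it
computed = 0
function* lazySquares(values){
  for (const v of values) yield expensiveSquare(v)
}
const lazy = lazySquares([1, 2, 3, 4, 5])
const countBefore = computed         // 0, nothing has run yet
const firstValue = lazy.next().value // 1
const countAfter = computed          // 1, only the first square was computed

console.log(eagerCount, countBefore, firstValue, countAfter) // 5 0 1 1
```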
During my master's studies I incidentally used generators on a daily basis, while working on machine learning models and training neural networks. Machine learning datasets are often quite large (in the range of hundreds of gigabytes), especially when you're dealing with vision models that need to be trained on image files, for example. From a memory perspective, training these models can be quite challenging, even when you're working with a super high-end computer: such datasets usually don't fit into memory all at once. It is much more convenient to fetch just a couple of training samples at a time, feed them to the neural network, do a training pass, then discard those and fetch a new batch of samples and repeat the previous steps. And well, I think you know what I'm trying to imply.
Generators allow us to do this with ease, and are much less clunky to work with than for loops in this scenario. It is often the case that we want to fetch a small batch of data samples, apply some pre-processing steps to it, pass it to the neural network, do a training pass, and then every so many iterations evaluate how well our model is doing, before starting over again and repeating these steps. Generators make this interleaved and sequenced way of writing code super easy, as they will always wait for us when we suspend them and remember where we've left off as well.
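As a rough sketch of that workflow (not an actual training loop), the hypothetical batches generator below hands out a dataset a few samples at a time. The dataset, the batch size and the evaluation interval are all made-up stand-ins.

```javascript
// Hypothetical helper: yields consecutive batches of at most batchSize samples
function* batches(samples, batchSize){
  for (let start = 0; start < samples.length; start += batchSize){
    yield samples.slice(start, start + batchSize)
  }
}

// Stand-in dataset; in practice each entry might be a file path that is
// only loaded from disk once its batch is requested
const dataset = ['img0', 'img1', 'img2', 'img3', 'img4', 'img5', 'img6']

const seen = []
let step = 0
for (const batch of batches(dataset, 3)){
  step++
  seen.push(batch) // stand-in for "pre-process and do a training pass"
  if (step % 2 === 0){
    console.log('evaluating after step', step) // stand-in for periodic evaluation
  }
}
console.log(seen) // [['img0', 'img1', 'img2'], ['img3', 'img4', 'img5'], ['img6']]
```

The generator only prepares the next batch when the loop comes back for it, so the consumer sets the pace.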
This is however just one example of this lazy, on-demand type of programming. A great resource on lazy iteration is James Sinclair's article Why would anyone need javascript generator functions?, where you will incidentally also learn a lot about Australian culture and what a Tim Tam Slam is. Let's have a look at a concrete example. Say we have an array of numbers:
let nums = [0, 1, 2, 3, 4]
And say we'd like to apply two operations to each element of this array: square it, and then add the previous squared value (when it exists) to the result. We can attempt this with the inbuilt Javascript array methods, like for example map():
let offsetAndSquared = nums.map((curr, n, arr) => curr + (arr[n-1] ?? 0)).map(curr => curr**2)
console.log(offsetAndSquared) // Result: [0, 1, 9, 25, 49] / Expected: [0, 1, 5, 13, 25]
However this is not the intended effect: this chain adds the previous value first and squares afterwards, and each map() call also processes the entire array and builds a new intermediate array before the next call even starts. We would actually like both actions to happen in sequence on each element before they are applied to the next element. We could do this with an old-school imperative for loop, but... generators to the rescue! For this we'll create a little number generator that gives us an endless sequence of numbers as an iterable:
function* number(){
  let i = 0
  while(true){
    yield i++
  }
}
Next we'll create our own version of the map array method, but with the twist that it's a generator:
function* mapGen(iterable, mapFn){
  for(let item of iterable){
    yield mapFn(item)
  }
}
And we'll also create a generator which remembers and adds the previous value if there is one:
function* addPrev(iterable){
  let prevItem = undefined;
  for(let item of iterable){
    yield (prevItem) ? item + prevItem : item;
    prevItem = item
  }
}
Last but not least, we also need a function that can extract N values from our starting iterable (the number generator). We'll call that function take, as in taking from the iterable, and it will act as the main iterator that starts our iterable domino chain:
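function* take(n, iterable){
  if(n <= 0) return;
  for(let item of iterable){
    yield item;
    // stop right after the n-th item, so that we never pull
    // an extra value out of the chain that nobody asked for
    if(--n <= 0) return;
  }
}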
Now we would put together our little iterable daisy chain in this manner:
let offsetAndSquaredGen = [...take(5, addPrev(mapGen(number(), n => n**2)))] // [0, 1, 5, 13, 25]
This is really cool, because technically there is no limit to how many generators we can chain together, and essentially obtain new and interesting sequences in this modular approach! Could be cool for generative art purposes?
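To check that every element really travels through the whole chain before the next one is touched, here is a sketch that records the order in which the stages run. The log array and the push calls are additions for this demonstration, and take is written as a variant that stops as soon as it has yielded n items.

```javascript
const log = [] // records the order in which the stages run

function* number(){
  let i = 0
  while(true) yield i++
}

function* mapGen(iterable, mapFn){
  for(const item of iterable){
    log.push('square ' + item)
    yield mapFn(item)
  }
}

function* addPrev(iterable){
  let prevItem = undefined
  for(const item of iterable){
    log.push('addPrev ' + item)
    yield prevItem ? item + prevItem : item
    prevItem = item
  }
}

// Variant of take that stops as soon as n items were yielded,
// so it never pulls an extra item from the chain
function* take(n, iterable){
  if(n <= 0) return
  for(const item of iterable){
    yield item
    if(--n <= 0) return
  }
}

const result = [...take(3, addPrev(mapGen(number(), n => n**2)))]
console.log(result) // [0, 1, 5]
console.log(log)
// ['square 0', 'addPrev 0', 'square 1', 'addPrev 1', 'square 2', 'addPrev 4']
```

The log alternates between the two stages, whereas chained map() calls would finish all of the squaring before the first addition happens.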
Infinite Sequence Generators
As you may have noticed, we have a while(true) in our number generator, which in any other scenario would be quite a scary statement, but thanks to the yield keyword we can rest assured that it won't run infinitely (unless we pass it directly to a for...of loop or the spread operator without any exit condition, but at that point it really is our fault). In essence, we've created an iterable that defines an infinite range, which could be useful in certain situations: for instance, when we don't know in advance how many items/values we need, we can just write a generator that defines the sequence and generate these values on demand. An example of an infinite sequence would be the Fibonacci numbers:
// a function declaration needs a name, so we call this one fibonacci
function* fibonacci() {
  let prev = 0;
  let curr = 1;
  while (true) {
    yield curr;
    const tmp = prev;
    prev = curr;
    curr = curr + tmp;
  }
}
Now we could also recreate the usual array methods as generator equivalents, like we've already done earlier and act upon this infinite sequence in a lazy manner! This might come in handy one day.
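As a sketch of what that could look like, the example below pairs the Fibonacci generator with a generator version of filter and a take generator in the spirit of the one from the previous section. The names filterGen, firstTen and firstFiveEven are made up for this example.

```javascript
function* fibonacci(){
  let prev = 0
  let curr = 1
  while(true){
    yield curr
    const tmp = prev
    prev = curr
    curr = curr + tmp
  }
}

// Generator counterpart of the filter array method (hypothetical helper)
function* filterGen(iterable, predicate){
  for(const item of iterable){
    if(predicate(item)) yield item
  }
}

// Stops after n items, which is what keeps the infinite sequence in check
function* take(n, iterable){
  if(n <= 0) return
  for(const item of iterable){
    yield item
    if(--n <= 0) return
  }
}

const firstTen = [...take(10, fibonacci())]
const firstFiveEven = [...take(5, filterGen(fibonacci(), n => n % 2 === 0))]

console.log(firstTen)      // [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
console.log(firstFiveEven) // [2, 8, 34, 144, 610]
```

Only as many Fibonacci numbers are computed as are needed to satisfy take, even though the sequence itself never ends.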
Closing Notes
I think we've covered all of the basics of generators in this article! They are a very powerful feature of Javascript which allows for a completely different approach for certain types of problems and otherwise might just be fun to play around with. One thing that needs to be pointed out here is that generators are not supported on some older browsers, so make sure you account for that if you use them in your projects.
If there's anything else that you think should be included, let me know, and if you have any questions or need any clarifications shoot me a message. Otherwise, thanks for reading! If this was useful to you, share it with a friend! Or check out some of my other articles! It helps out a lot! Cheers, happy holidays, merry Christmas, a happy new year and most of all, happy coding!
Here's some further reading material:
- The Power of JS Generators by Anjana Vakil
- Why would anyone need javascript generator functions by James Sinclair
- Infinite Data Structures in Javascript by Francis Stokes
- The Ultimate Guide to JavaScript Symbols