Implementing the python zip and range functions in JavaScript
Zip
Takes complementary elements(elements with same index) from two arrays and combine them into a single element(tuple)
Example:
const arr1 = [1, 2, 3]
const arr2 = [4, 5, 6]
// : <- means returns
zip(arr1, arr2) : [[1,4], [2,5],[3,6]]
since JavaScript does not have tuples in the sense of python tuples, we will use an Array of Arrays, the inner array's being tuples, actually depending on how you use an array in JS it can be called a tuple
example:
function tuple(){
return [1, 2]
} // this function is said to return a tuple
let [one, two] = tuple()
// this is actually the concept react hooks use, example useState()
let [value, setValue] = useState() // react code
zip is very useful in combining tow arrays into a single array, while keeping the order of elements of the array(this will become clear in the unzip part), zip is especially useful in the data science world, for example charting: creating tuples of coordinates, combining training data(X) and labels(y) into a single structure, you can de-structure later
Zip Function
function zip(array1, array2){
let zipped = []
for(let i = 0; i < array1.length; i++){
zipped.push([array1[i], array2[i]])
}
return zipped
}
explanation:
for(let i = 0; i < array1.length; i++)
we use array1.length because zip stops as soon as the first array ends, that is one simple rule zip follows, meaning if your first array length is bigger than the second you will run into problems, you can handle that by throwing an error
zipped.push([array1[i], array2[i]])
we push a new array(tuple) in zipped with complementary elements from each array
console.log(zip([1, 2, 3], [4, 5, 6])) // [ [ 1, 4 ], [ 2, 5 ], [ 3, 6 ] ]
to destructure the array into the original arrays we can actually use the same function by making the second array optional, if there's no second array, that means a zipped array is being passed in
refactoring:
function zip(array1, array2){
if(array2 === undefined){
// unzip
}
else{
// zip
let zipped = []
for(let i = 0; i < list1.length; i++){
zipped.push([list1[i], list2[i]])
}
return zipped
}
}
unzip :
if(array2 === undefined){
// unzip
let list1_ = [] // will hold the original elements
let list2_ = []
for(let i =0; i < array1.length; i++){
list1_[i] = array1[i][0]
list2_[i] = array1[i][1]
}
return [list1_, list2_]
}
explanation:
list1_[i] = array1[i][0]
list2_[i] = array1[i][1]
the magic happens here, we are getting the i-th tuple, and assigning the elements in the tuple, according to their index, 0 being the first, 1 the second
as simple as that we have a working zip function that can also unzip
const zipped = zip([1, 2, 3], [4, 5, 6])
console.log(zipped) // [ [ 1, 4 ], [ 2, 5 ], [ 3, 6 ] ]
let [arr1, arr2] = zip(zipped)
console.log(arr1, arr2) // [ 1, 2, 3 ] [ 4, 5, 6 ]
we can create another version that zips to objects as tuples(I use this a lot to create coordinates for charts)
function zipToObjCoord(arr1, arr2){
let zipped = []
for(let i = 0; i < arr1.length; i++){
let key = arr1[i]
zipped.push({ x:key, y:arr2[i]})
}
return zipped
}
same concept, but creating coordinates
console.log(zipToObjCoord([1, 2, 3], [4, 5, 6])) // [ { x: 1, y: 4 }, { x: 2, y: 5 }, { x: 3, y: 6 } ]
Range Function
Range takes a number(n) and returns a "loopable structure" from 0 to n, a more complicated range fn takes a start , end and step number
Naive implementation
we can naively implement this using an array, range returns an array with numbers from 0 to n, which we can for..loop on.
function range(n){
let r = []
for(let i = 0; i < n; i++){
r[i] = i
}
return r
}
for(let i of range(10)){
// works but very problematic
}
what if we want to create a range of 4, 000, 000, that means range has to first loop 4 million times and create an array with values 0 to 4 million, then for..of can start looping, 4 million times again, if you know Big O(n) you know this is very inefficient, we are doing twice the work for each range function
n*2, besides that now we have a useless array with 4 million elements
Robust Implementation
Solution is creating @@iterator element,
@@iterator
before we even go to @@iterator, let me explain the concepts behind iterables and collections,
a collection is an array of elements(consumable elements), iterables are collections that define the iterator protocol
Iterator protocol
how does the for..of loop work?, for example looping over an array. for..of loop does not know what an array is, all for...of knows is the iterator protocol, so when for..of loops encounters anything, for..of looks for the implementation of the iterator protocol in that thing.
let's look at it from the array perspective, an array implements an iterator protocol that tells for...of loop how to to iterate the array itself, basically the array is saying through the protocol if you are trying to iterate me, this is how you do it. its a form of contract between the two, for...of expects array to implement the iter protocol, and array expects for...of to understand the iter protocol, Ok enough babbling, what is the the iter protocol
simply an object that has a next function which also returns an object
{ // object
next(){ // w/ a next function
return {} // which returns an object
}
}
zooming in to the object returned by next
// this object has a value and "state" called done a boolean indicate whether we are at the end of an array
{value: "element in the array", done: false}
which simply means this object can take two forms
- we are not at the end of the array
{value: "element in the array", done: false}
- we are at the end of the array
{done: true}
let's go back now to the array and for..of loop example, when the for...of loops over an array it's looks for this object and calls the next function, based on what next returns for...of loop continues or stops
for(let i of [1, 2, 3]){
console.log(i)
}
// 1st iter -> [1, 2, 3].next() returns {value: 1, done: false}
// 2nd iter -> [1, 2, 3].next() returns {value: 2, done: false}
// 3rd iter -> [1, 2, 3].next() returns {value: 3, done: false}
// 4rd iter -> [1, 2, 3].next() returns {done: true} // end of the array
in each iteration the value is returned or assigned to i, when done becomes true, for...of stops looping cause we are at the end of the array.
I have omitted few details but this is the gist of it, the iterative algorithm
Implementation
the only thing we will implement is the next function, JS has a symbol.iterator(@@iterator) object all we need to do is customize how the next works,
and Note: you can use the iterative algorithm anyway besides collections, collections were an example,
for example in this case we are not looping a over a collection but generating a number in each iteration
function range(n){
let i = 0 // start
return { // iterator protocol
[Symbol.iterator]:() =>{ // @@iterator
return { // object with the next function
next(){
while(i !== n){
let temp = i
i++
return {
value: temp,
done: false
}
}
return {done: true}
}
}
}
}
}
the only addition here to the iterator protocol is wrapping the object that returns next with
[Symbol.iterator]:() =>{ // @@iterator function
but everything is as defined in the iter protocol
explanation
[Symbol.iterator]:()// simply : allows array like behaviour(what for..of) looks for
next(){ // the next we defined above
while(i !== n){ // will loop as long as i is not equal n(passed in val)
let temp = i
i++
return {
value: temp, // returns the value
done: false
}
}
return {done: true} // when done looping the range
}
and that's it, a robust implementation of range, as a challenge you can add start, stop and step as additional functionality, I personally never need them.
for(let i of range(10)){
console.log(i)
}
Robust vs Naive
rename the naive range function to Nrange
let start, finish
start = Date.now()
for(let i of Nrange(10)){
}
end = Date.now()
console.log("naive", end- start, " ms")
start = Date.now()
for(let i of range(10)){
// console.log(i)
}
end = Date.now()
console.log("robust", end- start, " ms")
1st test: 10
range(10) vs Nrange(10)
naive 0 ms
robust 1 ms
naive performs way better than robust, did we just implement rubbish?(not really), it will become apparent after few tests
2nd test : 10 thousand
range(10000) vs Nrange(10000)
naive 7 ms
robust 11 ms
this the must be conclusive right?, no not really, that is the point with naive implementations they always seem to perform better when values are lower, but as you crank up the sample space they crumble
3rd test : 40 thousand
range(40000) vs Nrange(40000)
naive 29 ms
robust 18 ms
now the tables are turning, Nrange is starting to crack under pressure which we like so much, our work was not in vain.
4th test: 4 hundred thousand
range(400000) vs Nrange(400000)
naive 106 ms
robust 32 ms
final test: 4 million
range(4_000_000) vs Nrange(4_000_000)
naive 650 ms
robust 97 ms
of course these tests are not conclusive, and depend on your machine, for example mine is not that powerful and I have many software's, cmd's etc opened like a normal dev , this depends on how free your memory is. keep cranking up the sample space.
conclusion
with that we conclude this rather short tutorial, my suggestion is to study or take a look at the iterative algorithm, which is actually the backbone of many collections in languages knowing it, is very valuable and opens new possibilities