Notes of Maks Nemisj

Experiments with JavaScript

This is a series of articles about the Git (https://git-scm.com) version control system (VCS). I aim to show you Git from a different perspective, starting with the central part of Git – the commit – and moving on to branches and remotes. You will see what is really underneath “origin”, why the “Create branch” button in Jira/GitHub/GitLab makes no sense, how to merge unmergeable branches, and much, much more.

Most of the articles will include tasks to execute in a terminal. I strongly advise you to do them. This way, you will get a better understanding of the explanations. To simplify the bootstrapping of the tasks, I will include a bash/PowerShell script which will prepare a basic repository structure.

I hope that, once you have worked through all the articles, you will have leveled up your Git skills.

Every week I will post one article. Chapters so far:

What is a “branch”?

If you have read the previous two chapters, you already know that the commit is the most fundamental part of Git and that diverged history points make it possible to create “branches”. So what about the “branches” you knew before? What are they, and why do you need them? It turns out that the branches you see in your Git client are only labels. A branch is a human-readable reference to a commit, much like the Domain Name System (DNS): a domain name resolves to an IP address, and a branch resolves to a Git commit.

This brings us to the first valuable property of a branch: a commit can carry as many labels as you want, and the labels themselves carry no creation time. Remember the task from the previous chapter? Have a look at the picture below. I’ve just added 5 “branches” to the dangling commit I had: while the commit was created on the 28th of March, the branches/labels were placed just now.
Branches are labels
The second and most valuable point is that removing a branch doesn’t remove the commit itself. It solely removes the label on that commit. I repeat – it SOLELY removes the label on that commit. Very important! The commit stays in history, and even if you no longer see it, you can find it as described in the previous chapter.
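To see this in action, here is a minimal sketch (assuming git is installed; it builds a throwaway repository in a temp folder, so nothing of yours is touched). The repo and branch names are illustrative:

```shell
# Throwaway repo with one commit and a "cat" label on it.
repo=$(mktemp -d)
cd "$repo"
git -c init.defaultBranch=main init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Added 1.1.2"
git branch cat                # place the label on the current commit
hash=$(git rev-parse cat)     # remember what the label points to

git branch -D cat             # remove ONLY the label

git cat-file -t "$hash"       # the commit object is still in the repository
```

`git cat-file -t` asks the object database for the type of the object behind a hash; after the branch is gone it still answers `commit`, which is exactly the point: only the label disappeared.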

Enough theory; it’s time for another exercise.

Undeletable

Task 1: Prepare the environment

The repository will have almost the same structure as in the previous post, except that history will have labels now – branches.

For curl’ers:

curl -o- https://raw.githubusercontent.com/nemisj/git-learn/master/branch-with-label.sh | bash

For wget’ers:

wget -qO- https://raw.githubusercontent.com/nemisj/git-learn/master/branch-with-label.sh | bash

For Powershell’ers:

Invoke-WebRequest https://raw.githubusercontent.com/nemisj/git-learn/master/branch-without-branch.ps1 -UseBasicParsing | Invoke-Expression

When you run the script, it will create a branch-with-label folder, and in that folder, you will find two branches: octopus and cat.
Branches with labels
I’ve used such branch names on purpose and removed the ‘master’ branch from the repository to decouple your brain from the idea of “branches you know.”

Task 2: Removing a “branch”

Now, let’s remove the cat branch, whose last commit message was “Added 1.1.2”. Before doing that, though, remember the hash of the commit on which the “cat” label is placed. In my case it is 6f3810e5. To remove the branch I use the CLI, but you can do it through your Git client.

git branch -D cat

After removing the cat branch, my log looks like this:
Cat branch is removed

Task 3: Find the dangling commit

Now let’s verify that commit 6f3810e5 is still there by using the fsck command. Execute in a terminal:

git fsck --lost-found

This is what I’ve got in my log afterward:
Log of dangling “cat” branch
To make sure we are still able to access the commit and the whole tree of the “cat” branch, let’s check it out. Execute in your terminal:

git checkout {dangling_commit}
# for me it will be 
# git checkout 6f3810e583258eb144cadd1e4628dfbac4341057

The commit on which the cat branch was set is still there, with all its history. Let’s have a look at the log.
Dangling cat branch after checkout
What does this information give us? First of all, confidence that we can reclaim a branch if we have removed it by accident. Second, confirmation that the name of a branch is only a label, and we should start thinking of it like that.
A label is a label

A branch is just a …

Task 4: Creating a label/branch

Before we make a new label for our dangling commit, I will tell you one exciting secret. Let’s list the files inside the hidden folder .git/refs/heads at the root of the repository. In that folder you will see one file, named octopus. Execute in a terminal:

ls -l .git/refs/heads

Furthermore, let’s have a look at the content of that file. You may be surprised: the only line in this file is the hash to which the octopus label is pointing. Execute in your terminal:

cat .git/refs/heads/octopus

This is my output:
Content of the refs folder and octopus file
What does this mean? In its simplest form, to create a label/branch on a commit you create a file inside .git/refs/heads whose content is the commit hash. Time to try it out; type in a terminal:

echo {dangling_commit} > .git/refs/heads/new-cat-branch

As soon as you create the file, git log will show you your new branch. The same goes for removing a branch: just delete the file in .git/refs/heads, and your label is gone. Not the commit itself, of course, but that you already know.
Log with new-cat-branch
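The file trick can be sketched end to end in a throwaway repository (assuming git is installed; names are illustrative):

```shell
# A branch is a file under .git/refs/heads whose content is a commit hash.
repo=$(mktemp -d)
cd "$repo"
git -c init.defaultBranch=main init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Added 1.1.2"
hash=$(git rev-parse HEAD)

# "Create a branch" by writing the file by hand...
echo "$hash" > .git/refs/heads/new-cat-branch
git branch --list new-cat-branch    # git now lists it as a normal branch

# ...and "remove the branch" by deleting the file again.
rm .git/refs/heads/new-cat-branch
```

One caveat: on recent Git versions refs can also live in the packed-refs file (or in the newer reftable backend), so a loose file is not guaranteed to exist for every branch; `git update-ref` is the porcelain-safe way to do the same thing.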
Once you understand that a branch as you knew it is just a label on a commit, the “Create a branch” button in Bitbucket or GitHub takes on a different meaning, right? This button does not change the history of your repository. It does not make any new “shunt” or sideways path in your history. It only places another label on the commit from which you want to start.

Task 5: Extend your imagination

To practice a bit more, let’s have a look at the picture below. Would you consider this history containing two branches or only one?
2 branch or not 2 branch
Technically, these are two branches. A branch is a label, and this history has two labels. As soon as someone starts committing from the octopus commit, you will see a split, but until then it’s just a flat line, still with two labels/branches.

Nothing to lose

Task 6: Last but not least

I’m not sure if you know about the --amend flag of git commit, but it allows you to “update” a commit if you forgot to add or remove something.

Let’s take as an example the commit with the message Added 1.2.2. You can change the content of this commit by adding something to the zork file and then amending it into the last commit. Change the zork file and then run in a terminal:

git commit --amend -a --no-edit

By doing this you will not see any new commit in the log; instead, if you click on the last commit of the “octopus” branch, you will notice that your changes are in that commit. This commit has a new hash, but it looks identical to the previous one, except for the changes. You might think that this commit is indeed the previous one, but that’s not true. What Git does underneath is create a new commit and place the branch name on that new commit. The previous commit is still available in the history and, of course, you can find it with fsck and check it out, but I will leave that for you to experiment with. Below is the history after finding the old commit and checking it out:
The green line is the earlier commit, and the new one carries the ‘octopus’ label.
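The amend behaviour can be sketched in a throwaway repository (assuming git is installed; file and message names are illustrative):

```shell
# --amend writes a NEW commit and moves the branch label onto it;
# the old commit keeps existing in the object store.
repo=$(mktemp -d)
cd "$repo"
git -c init.defaultBranch=main init -q
echo one > zork
git add zork
git -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "Added 1.2.2"
old=$(git rev-parse HEAD)

echo two >> zork
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --amend -a --no-edit
new=$(git rev-parse HEAD)

echo "old=$old"
echo "new=$new"               # a different hash after the amend
git cat-file -t "$old"        # the old commit is still a valid object
```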

Summary

Let’s recap what we have learned:
  • A branch is a label on a commit; it has no date, no owner, no attributes except the commit hash
  • A branch is a simple text file on a File System (FS)
  • Removing a branch doesn’t remove your commit at all
  • Everything that happens in Git, stays in Git
  • A flat-line history might still contain thousands of branches.

What’s next…

Now you know the fundamental part of Git. In the next chapter I will start explaining what a “merge” is. Do you still believe that it’s impossible to merge two branches multiple times without getting the message “Already up to date.”? After the next chapter, you will know that it is possible.


This is the second article of the series about Git. If you haven’t read the introduction, I strongly advise you to do so, since this chapter builds on the knowledge from the intro. This time you will do exercises in the terminal of your choice. Most of the exercises come with a bootstrap script which helps you set up the initial folder structure with dummy repositories and commits.

Branches without branch names

The previous chapter was a long read, I know, but it was needed to decouple your mind from the idea that the “branch” is the holy grail of Git. On the contrary, the branch is the last thing you should be thinking about when working with Git. Putting ‘git commit’ first makes working with Git a different adventure. To make this second nature, we are going to practice in the form of small tasks, as I mentioned before.

Task 1: Prepare the environment

To start the task, create a folder anywhere, cd into it from the console and execute the script below (REMINDER: before executing any unknown script, read its content to make sure it doesn’t install any malware on your system 😉):

curl -o- https://raw.githubusercontent.com/nemisj/git-learn/master/branch-without-branch.sh | bash

or using wget if you don’t have curl

wget -qO- https://raw.githubusercontent.com/nemisj/git-learn/master/branch-without-branch.sh | bash

and one for PowerShell users on Windows:

Invoke-WebRequest https://raw.githubusercontent.com/nemisj/git-learn/master/branch-without-branch.ps1 -UseBasicParsing | Invoke-Expression

This script will create the branch-without-branch folder as a Git repository with two branches in it. Those branches will not have any names, and if you use a Git client, you won’t see these branches yet.

If you run the bootstrap script a second time, you will notice that the hashes are different. As I’ve already explained, hashes are unique and change every time you do things in Git. This means the hashes in my pictures will differ from yours.

Let’s have a look at the history of a newly created repository.
Git log of “branch-without-branch” folder
What you can notice is that there is no sign of branches in this log at all. Still, the branches are there, and that is what you’re going to find out. One thing to note: the last commit I’m currently on is ee23fc10.

If a commit doesn’t have any branch name on it, it is called dangling and is visible neither in git log nor in Git clients. It is still available, though, and you can return to it by doing git checkout {hash-of-the-commit}
Dangling commit
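Returning to a commit by hash, without any branch name involved, can be sketched in a throwaway repository (assuming git is installed; your hashes will differ on every run):

```shell
# Build two commits, then check out the first one purely by hash:
# no branch name is needed to move around history.
repo=$(mktemp -d)
cd "$repo"
git -c init.defaultBranch=main init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first"
first=$(git rev-parse HEAD)
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "second"

git checkout -q "$first"      # detached HEAD: we sit on a bare commit
git rev-parse HEAD            # we are back on the first commit
```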

Task 2: Find a dangling commit

To find out which dangling commits are there, you can use the git fsck command. Execute in a terminal:

git fsck --lost-found

--lost-found prints all the dangling commits except the one you’re currently on. My dangling commit is: 906194feded708a451b92bb8b317f71e9d81d43e
Dangling commit with fsck command
The dangling commit our terminal shows is the second unnamed branch. This branch is not visible in a Git client yet. To find out the branch history and the point where it branched off, we have to check out this commit and look at the logs.

Task 3: Checkout the dangling commit

Using the git checkout command, we are going to examine this second branch (execute in a terminal, putting your hash instead of mine):

git checkout 906194feded708a451b92bb8b317f71e9d81d43e

If you look at the Git history now, you can see that it’s different. It no longer ends with the Added 1.2.2 commit message, but with Added 1.1.2.
Git log of “branch-without-branch” folder after checkout
Now that we have the log of the second branch, we can create a full picture of the git history so far:
Both branches in the histogram

Task 4: Find another dangling commit

Rerun git fsck to see that you now have a different dangling commit, which points to the previous branch. Mine is ee23fc1047bcce7944a7e146157d8fb93fa5554c. This is exactly the commit I was on before checking out the first dangling commit. Execute in a terminal:

git fsck --lost-found

Log of the first dangling commit

Task 5: Keep committing

Now, while you’re on an unnamed branch, create a change and commit it. I will create a new file and commit it with the message “Keep committing”. This is what I did; you can use your editor instead:

touch second-file
git add second-file
git commit -m "Keep committing"

My history after I’ve made the commit:
As you can see, you can continue working on the same branch and make new commits.

Task 6: Creating a new branch

But what about making a new branch? Well, that’s easy too. Remember I told you about the time machine previously? You check out the point where you want to start branching with git checkout, and you make new commits on top of that. Doing this creates a new branch line.

To start this exercise, take the first commit in your repo. For me, this is 8d348af0. Check it out and create a new commit. Do it in your editor or use the following bash commands:

git checkout 8d348af0
touch this-is-new-branch-file
git add this-is-new-branch-file
git commit -m "This is a new branch"

History shows that we have two commits in there – the initial commit and the new one.
New branch log
New branch log
By running git fsck we can see that we now have two dangling commits, because we have two other branches which are not visible to us. Test it in your terminal:

git fsck --lost-found

Two dangling commits, two branches
This means that in total we have 3 branches in the current git repository. The new histogram will look like this:

Final thoughts

As you can see, branching without branch names is not a problem in Git – maybe a bit more cumbersome, but still possible. This is exactly what I want to make clear. Here is an analogy that might help you.

Task 7: Bucket with ping-pong balls

Imagine a bucket full of ping-pong balls enumerated in random order.
Bucket with enumerated ping-pong balls
Imagine these ping-pong balls are connected using strings, cables or anything you come up with.
Connected ping-pong balls
This bucket is a Git repository, and these ping-pong balls are commits. Numbers on these balls are commit hashes. The only difference between this analogy and Git is that while the connections between ping-pong balls have no explicit direction, the relation between Git commits has a defined direction.

One of the commits is always the parent, and the other one is the child. The parent is usually below the child in the history you see in the git log of your Git client.
Take this git history from above – commit 8d348af0 is the parent of child commit 4acd0469, and in its turn, the commit 4acd0469 is the parent of children a080d192 and 3229b166.

If a commit has two children instead of one, these are, literally, two branches. They might not have a name, and they are not visible in the UI, but they are real branches. A commit without any children and without a branch name on it is called a dangling commit.
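The parent/child direction is visible directly in Git’s plumbing. A small sketch in a throwaway repository (assuming git is installed):

```shell
# git rev-list --parents prints each commit followed by its parent(s),
# which makes the direction of the relation explicit.
repo=$(mktemp -d)
cd "$repo"
git -c init.defaultBranch=main init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "parent"
parent=$(git rev-parse HEAD)
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "child"
child=$(git rev-parse HEAD)

git rev-list --parents -n 1 HEAD    # prints "<child-hash> <parent-hash>"
```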

Hi there,

This is an introduction to my series of articles about Git. It contains essential information for understanding the other chapters. Of course, if you know all this already, you can skip to the next chapter.

There is no “spoon”

You’ve probably been working with Git for a long time already, but for now I want to ask you to forget about all the stuff you do with Git (like creating branches, committing to a branch, merging branches, etc.). Forget about “origin” and the fact that Git has branches.

Task 1:

Do you remember what your git history looks like? Probably something like this (see below), where you have branch names and that “origin” word:
Git history you normally see
Now, imagine the same history, but only with those strange numbers every commit has, and without any human-made branches, like this:
Git history you should try to see

There is only a commit

You know that Git has commits, and that is what this chapter is all about. It’s all about the commit. There is only the commit. There is much more in a commit itself than you might know, and that is what I want to tell you about.

Let’s have a look at what a commit is. You could compare a commit to a point in time associated with a group of actions that happened before it. You can imagine that a commit is like your daily activity – wake up, wash your face, get dressed, etc.

The number of these “dots” in time is completely up to you, and you decide whether one dot contains one action or multiple.
Human history is a chain of “dots” (events)
This is also what we do with our source code. Instead of washing a face, we wash some code out of a file, rename a file, change code… but you know that already.
Source code history is a chain of “dots” (actions)
We create commits every day, but maybe not in the way I described above – without thoughtfully asking, “How many of these actions should I put into one commit, and how many into another?”

As soon as you move on with the chapters, you will notice that thinking this way is essential to better understand and work with Git.

This is a time machine

One of the most significant powers that Version Control Systems (VCS) give us is the ability to go back in time.
Git is no different, and for that reason I will call Git a “time machine.”

You can go back in time and replay the whole history of your actions, from start to the end.

By going back in time to an individual commit, you can analyze the state of the code at that particular time; you can feel the code, and you can smell it. This is precisely what the checkout command does in Git.
Return back to dot 2 – “Rename file a to b”
To return to “dot” number 2 (Rename file a to b), we would execute:

git checkout 2

One important thing to note. Do you remember I told you to forget about branches in Git? Good: with the command above we don’t check out a branch named 2; instead we check out the commit itself. The number 2 in this case is the id of that commit, also called “a commit hash.”

What is even more critical to understand is that Git also allows us to create a new “dot” from that point in time, meaning we can create a new alternate history, like in the movie “Back to the Future Part II”.
Doc Brown explains future alternation
Now, imagine we will change the file z again and commit it.

git commit -m "Add file z"

Alternate git history with a new commit
(I will keep repeating. We have no branches yet. There is only a new commit.)

Once you’ve done that (currently only in your brain 🙂), you might think that all the other stuff you had in the git history before is gone … actually, it’s not. All the previous commits are still available, just not that easily visible in Git.
Previous history is still available
Doesn’t it remind you of a picture with Doc Brown from above? 😉 I told you, it’s a time machine.

This all means that commit 3a, together with the previous commits 2 and 1, will start forming a fork, and the whole git history will look like a tree.

It’s a pity we can’t do that in real life – look how cool it could be 😀:
My daily time machine
So if you want to see what would happen if you first ‘get dressed’ and then ‘wash your face’, do it and don’t be afraid: Git remembers it all (even without branches).

This is precisely the power that Git gives us. We can experiment without worrying about the consequences. We can keep changing history, committing, going back, and committing again.

I repeat – don’t be afraid to “travel in time” with Git. If you start looking at those dotted lines in your Git client and see commits instead of the branches you’re used to, you’ll see the power of the time machine Git provides.

What is in the hash

Since the history might become huge with all its different paths, Git doesn’t use sequential numbers to identify commits. Instead, Git uses hashes.
Git hashes
What is very important to remember is that the hash is a UNIQUE identifier of a commit. No matter what you do, you can always rely on it being a unique value.

That’s why every change to Git history CHANGES the hash of the affected commit. Even if you redo the same action again, Git will still see it as a new “dot” in time, meaning a new commit and a new hash. Such a situation is also shown in the picture above – “File z” has been added again as the new commit “3a”.
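This is easy to verify: redoing the exact same action on the same parent still produces a different hash, because the hash also covers the author and committer timestamps. A sketch in a throwaway repository, assuming git is installed:

```shell
# Two commits with an identical message, tree and parent, made at
# different times, still get different hashes.
repo=$(mktemp -d)
cd "$repo"
git -c init.defaultBranch=main init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

export GIT_AUTHOR_DATE="2020-01-01T00:00:00" GIT_COMMITTER_DATE="2020-01-01T00:00:00"
git commit -q --allow-empty -m "initial"
base=$(git rev-parse HEAD)

git commit -q --allow-empty -m "Add file z"
h1=$(git rev-parse HEAD)

git checkout -q "$base"       # travel back and redo the same action...
export GIT_AUTHOR_DATE="2020-01-01T00:00:05" GIT_COMMITTER_DATE="2020-01-01T00:00:05"
git commit -q --allow-empty -m "Add file z"
h2=$(git rev-parse HEAD)

echo "$h1"
echo "$h2"                    # same content, different hashes
```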

With all this information you can now easily go to any commit in your git history by running checkout with the commit hash at the end. Remember this – it is essential for the future articles.

git checkout 69e13aa8

Let’s recap what have we learned so far.
  • It’s possible to travel in time by specifying a commit hash to the checkout command
  • There is no need for “branches” to create a “branch”-like history
  • A commit has a unique hash
  • Two commits with the same content will still have two unique, different hashes
In the next chapter you will learn how to create branches without a name and how to see what is not visible in the Git history. It will also be a more practical chapter than this one.



Have you ever been in a situation where you wanted to do something with your state inside a useEffect hook but didn’t want to put it into the dependency array? react-hooks/exhaustive-deps is an excellent guard when working with primitive values, but as soon as you have an object in the state, it might get in your way. Let me show you some code which requires state handling in the way I’ve described above.

type FetchInfo =
    | { state: 'LOADING' }
    | { state: 'LOADED' }
    | { state: 'ERROR'; error: Error; statusCode: number | undefined }
    | { state: 'INITIAL' }

// `Value` is assumed to be a string alias here; the original snippet leaves it undeclared.
type Value = string | undefined

interface StateType {
    fetchInfo: FetchInfo
    input: string | undefined
    value: Value
}

export const VerySpecialComponent: React.FC<{ input: string }> = ({ input }) => {
    const [state, setState] = React.useState<StateType>({
        fetchInfo: { state: 'INITIAL' },
        input: undefined,
        value: undefined,
    })

    React.useEffect(() => {
        if (state.fetchInfo.state === 'LOADING' || state.input === input) {
            return
        }

        setState((state) => ({
            ...state,
            fetchInfo: { state: 'LOADING' },
            input,
        }))

        fetchStuff(input).then((result) => {
            setState((state) => ({
                ...state,
                value: result,
                fetchInfo: { state: 'LOADED' },
            }))
        })
    }, [input, state, setState])

    return <div>{state.value}</div>
}

The code above looks quite straightforward, but let me elaborate on it.
First, “VerySpecialComponent” component is introduced, which should react only to the property change. As soon as “input” changes from one value to another one, this component should fetch some information from the back-end. Though, I don’t want React to re-trigger useEffect of this component, at the moment when state updates inside of the component useEffect . For this to happen, I’v implemented that dirty if statement inside useEffect, which checks for an already running effect. In the past, before React Hooks existed, I could prevent this by using shouldComponentUpdate lifecycle hook. Unfortunately this doesn’t exist in a land of the functional components. Another moment I don’t like when using objects as dependencies for useEffect, is that I do need to have an immutable state implementation ( using immutable.js or immer.js ), if I want to skip equal state changes. What I mean by that, is that calling setState even with the same values, will produce new object and re-trigger useEffect. Look at the code above, to better understand my concern:

// Calling setState with the same fetchInfo will still create a new reference of the state.

setState((state) => ({
    ...state,
    fetchInfo: { state: 'LOADING' },
}))

// This will re-trigger useEffect even though the state has not changed.
setState((state) => ({
    ...state,
    fetchInfo: { state: 'LOADING' },
}))

So all of this brought me to the idea of making an “effectless” state, which wouldn’t re-trigger the effect whenever it changes inside useEffect. To achieve that, I will create a useEffectlessState hook. Let me show how it will be used:

type FetchInfo =
    | { state: 'LOADING' }
    | { state: 'LOADED' }
    | { state: 'ERROR'; error: Error; statusCode: number | undefined }
    | { state: 'INITIAL' }

// `Value` is assumed to be a string alias here; the original snippet leaves it undeclared.
type Value = string | undefined

interface StateType {
    fetchInfo: FetchInfo
    input: string | undefined
    value: Value
}

export const VerySpecialComponent: React.FC<{ input: string }> = ({ input }) => {
    const stateTuple = useEffectlessState<StateType>({
        fetchInfo: { state: 'INITIAL' },
        input: undefined,
        value: undefined,
    })
    const [state] = stateTuple

    React.useEffect(() => {
        const [, setState] = stateTuple
        setState((state) => ({
            ...state,
            fetchInfo: { state: 'LOADING' },
            input,
        }))

        fetchStuff(input).then((result) => {
            setState((state) => ({
                ...state,
                value: result,
                fetchInfo: { state: 'LOADED' },
            }))
        })
    }, [input, stateTuple])

    return <div>{state.value}</div>
}

In the code above, by using useEffectlessState I ensure that useEffect will only run when input changes, not when the state itself does. There are also other benefits. First of all, I can now implement AbortController functionality and abort the previous request inside the cleanup function of useEffect, as follows:

type FetchInfo =
    | { state: 'LOADING' }
    | { state: 'LOADED' }
    | { state: 'ERROR'; error: Error; statusCode: number | undefined }
    | { state: 'INITIAL' }

// `Value` is assumed to be a string alias here; the original snippet leaves it undeclared.
type Value = string | undefined

interface StateType {
    fetchInfo: FetchInfo
    input: string | undefined
    value: Value
}

export const VerySpecialComponent: React.FC<{ input: string }> = ({ input }) => {
    const stateTuple = useEffectlessState<StateType>({
        fetchInfo: { state: 'INITIAL' },
        input: undefined,
        value: undefined,
    })
    const [state] = stateTuple

    React.useEffect(() => {
        // Create a fresh AbortController per effect run, so that an abort
        // from a previous cleanup cannot poison the next request.
        const abortController = new AbortController()
        const [, setState] = stateTuple
        setState((state) => ({
            ...state,
            fetchInfo: { state: 'LOADING' },
            input,
        }))

        fetchStuff(input, abortController.signal).then(
            (result) => {
                setState((state) => ({
                    ...state,
                    value: result,
                    fetchInfo: { state: 'LOADED' },
                }))
            },
            (error) => {
                if (error.name !== 'AbortError') {
                    setState((state) => ({
                        ...state,
                        fetchInfo: { state: 'ERROR', error, statusCode: undefined },
                    }))
                }
            },
        )

        return () => {
            abortController.abort()
        }
    }, [input, stateTuple])

    return <div>{state.value}</div>
}

Another thing I like about this approach is that it’s now easier to extract the functionality from useEffect and put it into a separate function, which can be unit-tested (and this function no longer carries all those side-effect ‘if’ statements):

export const run = function<T>(
    stateTuple: StateTuple<T>,
    signal: AbortSignal,
    input: string,
) {
    const [, setState] = stateTuple
    setState((state) => ({
        ...state,
        fetchInfo: { state: 'LOADING' },
        input,
    }))

    fetchStuff(input, signal).then(
        (result) => {
            setState((state) => ({
                ...state,
                value: result,
                fetchInfo: { state: 'LOADED' },
            }))
        },
        (error) => {
            if (error.name !== 'AbortError') {
                setState((state) => ({
                    ...state,
                    fetchInfo: { state: 'ERROR', error, statusCode: undefined },
                }))
            }
        },
    )
}

export const VerySpecialComponent: React.FC<{ input: string }> = ({ input }) => {
    const stateTuple = useEffectlessState<StateType>({
        fetchInfo: { state: 'INITIAL' },
        input: undefined,
        value: undefined,
    })
    const [state] = stateTuple

    React.useEffect(() => {
        // A fresh AbortController per effect run.
        const abortController = new AbortController()

        run(stateTuple, abortController.signal, input)

        return () => {
            abortController.abort()
        }
    }, [input, stateTuple])

    return <div>{state.value}</div>
}

Clean and testable. Time to have a look at how the useEffectlessState hook is implemented:

import React from 'react'

export type StateTuple<T> = [T, React.Dispatch<React.SetStateAction<T>>]

export const useEffectlessState = <T>(initialState: T): StateTuple<T> => {
    const [state, setState] = React.useState<T>(initialState)

    const ref = React.useRef<StateTuple<T>>([state, setState])

    ref.current[0] = state
    ref.current[1] = setState

    return ref.current
}

To have a state that doesn’t re-trigger on changes, I’ve created an array with a stable identity and kept its reference with the ref hook. This allows me to change the array’s contents without re-triggering an update of the hook. One caveat of this approach is the reference to the state itself. When updating the state inside the “run” function, I always have to use the callback style of setState, so that the latest version of the state object is used. It should be setState(state => ...) and not setState(newState). If I need the state inside the run function, it’s better to always read stateTuple[0], which is guaranteed to point at the latest state object.


If you’re already using async/await syntax, you might have noticed that forEach doesn’t work with async functions. In that case you might fall back to old-style for loops or for-of loops. If you have the bluebird package installed, you can use bluebird.each() instead. ECMAScript 2018 will bring asynchronous iteration, but for those stuck on older versions I have a one-liner which I use when I don’t want to install bluebird or another promise library. Definition of each:

const each = (arr, cb, i = 0) => i < arr.length ? Promise.resolve(cb(arr[i++])).then(() => each(arr, cb, i)) : Promise.resolve();

Usage:

await each(['a', 'b', 'c'], async(item) => {
  console.log('this is ', item);
  await new Promise(resolve => setTimeout(resolve, 1000));
});
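A quick sanity check of the one-liner in plain Node (no external packages): each callback result is awaited before the next item starts, so items come out strictly in order.

```javascript
// The same one-liner, plus a tiny demonstration that iteration is sequential.
const each = (arr, cb, i = 0) =>
  i < arr.length
    ? Promise.resolve(cb(arr[i++])).then(() => each(arr, cb, i))
    : Promise.resolve();

const seen = [];
const done = each(['a', 'b', 'c'], (item) =>
  new Promise((resolve) =>
    setTimeout(() => {
      seen.push(item);   // runs only after the previous item finished
      resolve();
    }, 10),
  ),
).then(() => {
  console.log(seen.join(',')); // → a,b,c
});
```

Note that each callback receives only the item, not the index, and a rejection from any callback short-circuits the whole chain, just like a sequential await loop would.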


Recently I had the pleasure of debugging a bug inside a node.js app. The runtime was breaking with the following error:

copy.forEach( function(attrValue) {
     ^
TypeError: copy.forEach is not a function

Quite a usual error, meaning that the object has no forEach function. Let’s see the code itself:

function rebuild(data) {
  // for everything in the data we just fetched
  data.forEach( function(group) {
    // force array for attr
    if ( !Array.isArray(group.attr) ) {
      group.attr = [group.attr];
    }
    // for all values in attr
    const copy = group.attr.slice(0);
    copy.forEach( function(attrValue) {
    });
  });
}

I bet you are as surprised as I was. At first look, this is just not possible, right? If you read the code above, you can see that one of the developers has forced the attr member to be an array, and an Array MUST have a forEach method. Which would mean the JS engine is insane and doesn’t know what it is talking about. At least, that is what I thought at first.

After scratching my head, drinking a cup of coffee and setting a debugger; statement right before the break, I understood that the JS engine is not insane and that I just can’t trust the things I see. If you want to find out for yourself what caused the code to break, don’t read the next paragraph, but open your editor and try to reproduce the error. One small tip: you can reproduce it not only in node.js, but in any recent browser.

For those who are back, let’s see what was causing all this mess. It turned out that the attr member of the group object was actually defined using a getter and a setter. It was not just a plain object, as I assumed at the beginning, but a real instance of one of the classes in this system. The setter on that instance was doing some magic to the passed value, causing the getter to return a simple value and not an Array:

{
    set attr(val) {
      this.attrValue = Array.isArray(val) ? val[0] : val;
    }
}
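Putting it together, the whole mess can be reproduced in a few lines (a sketch; Group here is a stand-in I made up for the real class in that system):

```javascript
// A setter that unwraps arrays: the getter then returns a plain value,
// so forcing an array before forEach silently has no effect.
class Group {
  set attr(val) {
    this.attrValue = Array.isArray(val) ? val[0] : val;
  }
  get attr() {
    return this.attrValue;
  }
}

const group = new Group();
group.attr = 'x';          // plain value in
group.attr = [group.attr]; // "force array" - but the setter unwraps it again
console.log(Array.isArray(group.attr)); // false

try {
  group.attr.forEach(v => console.log(v));
} catch (e) {
  console.log(e.message); // ...forEach is not a function
}
```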

As I’ve already written in one of my articles regarding getters and setters, I still think they are a bad idea. They might work in languages with static type checking, where every attribute has a known type, but they are horribly broken in JavaScript, because JS developers are not used to the fact that the assignment operator is doing some crazy stuff to the object.

, , ,

Recently I’ve come across a couple of node.js projects which use NODE_ENV to define the environment in a Development, Testing, Acceptance and Production (DTAP) pipeline. At first sight this looks like a good idea, but I would advise against it.

What we often see is that npm modules consume NODE_ENV as either ‘production’ or something else. When NODE_ENV is set to ‘production’, less logging is shown, code is optimized for performance and some other stuff is disabled, which makes it ‘real production code’. React.js is one example of doing this throughout its whole codebase. Based on that, I see developers define NODE_ENV as ‘testing’, ‘acceptance’ and ‘production’ to get more logging in the test environments and less logging plus more performant code in production. In my opinion, this is one of the things you should not do.

When code moves through the DTAP pipeline, you want it to be as similar as possible at all stages. It’s not without reason that there are ‘testing’ and ‘acceptance’ stages besides ‘production’. By differentiating inside the code between ‘development|testing|acceptance’ code and ‘production’ code, you can’t guarantee that the code which runs in the DTA environments will run the same way in ‘production’. Due to subtle differences, bugs can pop up in places where you don’t expect them.

The second reason is the extra logging you get in non-‘production’ mode. You might say – that’s exactly what I want in my DTA environments – but I would argue otherwise. By making an explicit differentiation between DTA and P, you branch your release/debug process into two different threads: debugging production code and debugging loggable code. If debugging production code is not part of your daily workflow, you will probably be stuck for much longer when things go wrong there. People usually only learn what they do regularly. But how can we learn to trace and debug ‘production’ code if that never happens in the daily workflow? Also, don’t forget that a production bug must be solved MUCH faster than any other bug, which makes it even more important to learn this early in the software development phase.

The last reason for not using NODE_ENV to define environments applies only to isomorphic apps, which doesn’t make it less important to me. If you stick to NODE_ENV, it means you will also have to use it in the client code. Honestly, if (NODE_ENV === 'acceptance') looks weird in the client, doesn’t it? There is no node in the browser, so it makes no sense.

Here is my rule of thumb. First of all, keep your code as similar as possible across all environments and keep NODE_ENV always at ‘production’. Second, if you have to differentiate, make a new variable for your environment, like APP_ENV or CODE_ENV, you name it. For example, we used APP_ENV to define our environments, because we used a shared log DB and needed a way to know where a log entry came from.
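As a sketch of that rule (APP_ENV is read from the environment; the log() helper and the entry shape are hypothetical, not a library API):

```javascript
// NODE_ENV stays 'production' on every stage; a separate APP_ENV
// variable names the DTAP stage without changing any code path.
const APP_ENV = process.env.APP_ENV || 'development';

function log(message) {
  // Tag every entry with the stage, so a shared log DB can tell
  // which environment a record came from.
  return { env: APP_ENV, message, time: Date.now() };
}

console.log(log('server started'));
```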

, , ,

Today JSON is widely used in different corners of software development. It’s used as a data format, as configuration or even as an in-memory database. At my current company we also use it as a configuration format. The more I use it, the more I have the feeling that it’s “a bit” inconsistent and “raw”. You would expect that it would be enough to do JSON.parse(str) and all problems would be solved, but that’s not true. Read further to find out the real truth.

json-safe-parse

The first weird thing is the fact that JSON allows you to override inherited properties of the native Object object. I will not go deeper into this topic, but thanks to the library json-safe-parse, we can sleep well and not think of this problem anymore. If you want to read more about the problem, read the library’s documentation.
Next.

Objects

You know what JSON stands for, right? It is “JavaScript Object Notation”, which, in my opinion, means that JSON should represent an object, like {}, [] or null. At least that is what I thought until a couple of days ago, when I found out the truth. It appears that not only arrays, objects and null values can be parsed as JSON, but also something else. Let’s have a look at the following code snippet and think about what it would give us:

const zero = JSON.parse('0');
const truth = JSON.parse('true');

When executing this code, I found out that the "zero" variable will be a real 0 number and the truth variable will be a real boolean true. This leads to the following statement: number and boolean values are also part of the JSON spec. Which makes sense, since in JavaScript everything is an object, right? But it’s not quite like that for JSON.parse. At the same time, an empty string – "" – is NOT valid JSON.

JSON.parse('') will happily throw an Error, and that is something which brings BIG confusion into my head!!! Why? Because now, to parse a JSON configuration, I need to think in “special cases”. Let’s have a look.

We use JSON as a configuration object and store it as a string in the DB. Imagine now that someone puts 'true' into the field where the JSON is stored. The code which parses that JSON will not throw any error, since JSON.parse("true") is valid. Though later on, somewhere else in the code, it could throw an error, since it’s a different type. Imagine now that a 'null' value ends up in the DB. In that case JSON.parse("null") could lead to errors in even more places, e.g. when using Object.keys(json). It’s not possible to enumerate a null value, so it will just break. But that’s not all. Don’t forget about empty strings which are not empty – " " – which will pass a logical check like if (str !== '').
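The special cases above can be verified in a few lines:

```javascript
// Numbers and booleans are valid JSON documents on their own.
console.log(typeof JSON.parse('0'));    // 'number'
console.log(typeof JSON.parse('true')); // 'boolean'
console.log(JSON.parse('null'));        // null

// An empty string, however, is NOT valid JSON.
try {
  JSON.parse('');
} catch (e) {
  console.log('empty string throws:', e instanceof SyntaxError);
}

// 'null' parses fine, but breaks later: null cannot be enumerated.
try {
  Object.keys(JSON.parse('null'));
} catch (e) {
  console.log('Object.keys(null) throws:', e instanceof TypeError);
}
```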

All this brought me to the next snippet, which I now use when I want to parse JSON into a configuration object:
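A minimal sketch of what such a parser can look like (parseConfig is a name I made up; its behavior follows the description below):

```javascript
// Parse a JSON string into a configuration object.
function parseConfig(str) {
  // Empty or whitespace-only strings mean "no configuration", not an error.
  if (typeof str !== 'string' || str.trim() === '') {
    return null;
  }
  const parsed = JSON.parse(str.trim());
  // 'null' parses fine but carries no configuration either.
  if (parsed === null) {
    return null;
  }
  // Reject numbers, booleans, strings and arrays:
  // the configuration must be a plain object.
  if (typeof parsed !== 'object' || Array.isArray(parsed)) {
    throw new TypeError('Configuration must be a JSON object');
  }
  return parsed;
}
```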

First of all, it always checks that the parsed JSON is an object object. Secondly, it dismisses empty strings and null values, since that is not misconfiguration, but merely an emptiness of configuration. Lastly, it trims strings, since it’s possible to have a value like " ", which should just be ignored.

, , ,

It’s going to be a short one, but powerful.

Do you remember I previously wrote about why getters/setters are a bad idea in JavaScript? I haven’t changed my mind, I still think so, but now I’ve found one valid place where I can and DO want to use them. You will never guess. (just kidding)

Unit tests. Nowadays I write unit tests and use getters for testing my code. It appears we have a lot of these else if statements where boolean values are checked, something like this:

function doSomething(options) {
  if (!options.hasZork) {
    return;
  } else if (options.hasBork) {
    return;
  }
}

And this is exactly the place where I can now use getters to test whether hasZork has been checked or not. It helps me to protect my API and ensure that all these logical branches are evaluated and tested:

const sinon = require('sinon');

// hasZork must return true here, otherwise the early return on
// !options.hasZork means the hasBork getter is never reached
const hasZork = sinon.spy(() => true);
const hasBork = sinon.spy(() => true);

const options = {
  get hasZork() { return hasZork(); },
  get hasBork() { return hasBork(); }
};

doSomething(options);

// assert that both hasZork and hasBork have been called
sinon.assert.calledOnce(hasZork);
sinon.assert.calledOnce(hasBork);

I promised you it would be a short one. The End!
