Overview
Whenever you want to do any kind of automation, in the cloud or in general, Git is almost always involved. And you know what they say?
Without a strategy, there is no difference between automatic and manual.
So when it comes to automating CI/CD, you obviously need some kind of source control strategy as a base. With that said, we will cover some techniques and design patterns along with their pros and cons, and then we'll see how those concepts apply in real-life situations.
For instance, when building and maintaining a production app, you might want to protect the main branch so that no one can push code directly to it. Or let's say you have a bunch of private keys that you use in production, and those keys are needed for the CI/CD pipeline.
How do you protect those?
Well, understanding how these things work is going to help you tremendously in your daily DevOps life.
Now let's talk about the components of a version control system, which in our case is Git.
So in Git, as you know, we have repositories where you store all of your code. And within the repository, we have branches. Now, the next thing is the commit. A commit is essentially a single recorded change on a given branch. You can have many, many commits on a single branch, and every commit gets a unique hash when you create it. That hash is how you identify each commit, even after it has been merged and shows up in the history of other branches.
And finally, we have a pull request. A pull request is a request to merge the commits from one branch into another, usually after someone has reviewed them.
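To make the commit and branch ideas a bit more concrete, here is a minimal Python sketch. It is a toy model and not how Git is actually implemented: the `make_commit` function, the messages, and the branch names are made up for illustration, and real Git hashes a richer commit object (tree, author, timestamp, and so on).

```python
import hashlib


def make_commit(parent_hash, message):
    """Toy commit: its ID is a hash of the parent ID plus the message."""
    payload = f"{parent_hash}\n{message}".encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


# Two commits in a row; each one gets its own hash.
first = make_commit("", "initial commit")
second = make_commit(first, "add login page")

# In this toy model a branch is just a name pointing at its latest commit.
branches = {"main": second}

# Creating a branch only adds a new name pointing at an existing commit.
branches["feature-x"] = branches["main"]

# Committing on feature-x moves that pointer; main stays where it was.
branches["feature-x"] = make_commit(branches["feature-x"], "start feature x")
```

The point of the sketch is only that the hash identifies the commit, and a branch is a movable label on top of it.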
Let's understand why we need different branches and the relationship between commits and PRs.
This is our first source control strategy - Trunk Based Development.
So this line going across with the arrow at the end is a branch, and let's say these circles are individual commits, meaning the changes made to the branch. If you were to work on this code base, you would clone the repository and then start making commits directly to the Main branch. Main is the only branch in this case. This is considered Trunk-Based Development.
The pros here are that every single commit goes straight to the Main branch, which lets you make very small changes instead of all-at-once kinds of changes. You also get continuous code merges: you're continuously merging your code with everybody else's changes on Main. With this approach, you can also increase delivery throughput because you don't have to wait to merge branches or anything like that. And spoiler alert: it's not an approach I'd recommend.
The problem is that to be successful with Trunk-Based Development, you have to implement a lot of testing, because you want to make sure that everything that goes into the Main branch is deployable code that can go all the way through to production. You need to cover your bases when it comes to testing in Trunk-Based Development. So, like I said, all the work is done on Main. Then when the code reaches an agreeable state, maybe when you think, okay, this is well-written code, you put a version tag or a release tag on it.
Now you can consider it a version of your code. When you see applications labeled 5.3 or 6.0, those are versions of that code. You can tag your code like that, and you'll know that at this point, at this commit, it is version 5.3. Then you can gradually increase that number as you go.
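Increasing the version number as you go can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `bump_version` helper is hypothetical, and it assumes a simple two-part `major.minor` scheme like the 5.3 and 6.0 examples above.

```python
def bump_version(version, part="minor"):
    """Return the next version string for a two-part 'major.minor' version."""
    major, minor = (int(piece) for piece in version.split("."))
    if part == "major":
        # A major bump resets the minor number to zero.
        return f"{major + 1}.0"
    if part == "minor":
        return f"{major}.{minor + 1}"
    raise ValueError(f"unknown version part: {part}")


next_minor = bump_version("5.3")           # small release
next_major = bump_version("5.3", "major")  # big release
```

Whatever scheme you pick, the tag itself is just a readable name attached to one specific commit.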
GitHub Flow
In this one, you have the Main branch. You have the first commit on the far left, and then you go down to this other line. This line is also a branch, and this branch is Feature X; maybe you're working on the next cool feature to add to the application. So you take a branch off of the Main branch and write your code. When you're done, you merge it back into the Main branch and delete the Feature X branch. Now say you see there's a bug in that release. You take another branch from here called Bug A, fix the bug, merge it back into the Main branch, and then delete that branch.
In this case, none of your work is done on the Main branch directly, because you work on individual branches and merge them back into Main. In this strategy, you don't ever want to have a long list of branches, so your branches are merged frequently. You only branch for small things. You break up your features into small, bite-sized pieces, and then you just branch, merge, branch, and merge. So the pro here is that the Main branch is always releasable. You're not doing any direct edits to Main, so you don't have to worry about whether Main can be released to production, because there's no in-progress development work on Main.
The other pro here is the short-lived branches. You don't want to accumulate a ton of different branches, because then you start losing track of those features as they pile up over time, and they never make it into the final product.
And the con here is that Main is not always the most up to date. While you're working on features, Main is going to lag behind them.
So if you are working on a feature and someone else starts working on a feature, they don't have access to your changes; they have to work off of the last change made to Main. And then when you both merge into Main, you have to worry about merging your changes with theirs instead of just merging straight into Main.
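If it helps to see that last point in code, here is a minimal Python sketch. It is a toy model, not how Git works internally: each branch is just a set of change labels, a merge is a set union, and the branch and change names are made up for illustration.

```python
# Main starts with one change on it.
main = {"initial"}

# Two people branch off the same state of Main.
feature_x = set(main) | {"feature-x"}
feature_y = set(main) | {"feature-y"}

# The second branch cannot see the first one's work yet.
y_saw_x_early = "feature-x" in feature_y

# Feature X is merged into Main first.
main |= feature_x

# Feature Y now has to pick up what landed on Main before it merges.
feature_y |= main
main |= feature_y
```

In real life that "pick up what landed on Main" step is where merge conflicts show up, which a set union conveniently hides.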
The next one here is called Git Flow. This one is one of the most extensive ones as well as one of the most popular. Here you have a Main branch and a Develop branch. These branches will live for the entire life of the repository. From Main, you branch off and start the Develop branch. Now, when you want to do a feature, you don't branch off of Main; you branch off of the Develop branch. You build your new feature and merge it into the Develop branch, and the same goes for Feature XYZ and so on.
Now let's say something goes wrong in the Main branch in the middle of some development. Then we can create a hotfix branch from that point, do the fix, and merge it back into Main. But we also have to merge that hotfix into the Develop branch, because whatever we changed in Main should also be in Develop. Once that's done and we're ready to release our code into Main, we cut a release branch off of Develop. So a new branch is created; we might call this release 69. On that release branch we do our testing and everything else we need to do to make sure the release is going to work in production, and we commit any fixes there. Then we merge that release branch into Main and delete the release branch. But any changes made on the release branch also have to get merged back into Develop. And this cycle repeats: you branch for features, and then you merge back into Develop.
Now in this model, the Main branch is always releasable, because code is always tested before we merge it into Main. This also allows for much stricter controls, because we only really touch the Develop branch unless we're doing a hotfix. Usually, when something critical happens in Main, you get granted special permission to change it. Now, the cons. First, Main is not always up to date. Say we're working on Feature X and Feature Y, and right in the middle of that we have to do a hotfix. We can't bring those new features into Main yet; we have to use the old Main and just add the hotfix until we're ready to release. Second, your features all get released at once, which causes large changes to Main. You're not adding just one feature to Main; multiple features land at the same time. That can lead to complications: if something breaks after the release branch is merged into Main, you have to go back and find out which feature broke it.
And the last con is that there are a lot of moving parts in this one. As you see, branches are being created for various things, and we now have two long-lived branches that we have to manage.
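The Git Flow cycle can be walked through with a small Python sketch. Treat it as a rough model under my own assumptions: branches are plain sets of change labels, a merge is a set union, and all of the labels are invented for the example.

```python
# Two long-lived branches: Develop starts out as a copy of Main.
main = {"v1"}
develop = set(main)

# Features branch off Develop and merge back into Develop.
feature_x = set(develop) | {"feature-x"}
develop |= feature_x

# A hotfix branches off Main and is merged into BOTH Main and Develop.
hotfix = set(main) | {"hotfix-1"}
main |= hotfix
develop |= hotfix
main_after_hotfix = set(main)  # snapshot: the feature has not shipped yet

# A release branch is cut from Develop, fixed up, then merged into Main
# and also back into Develop so the release fixes are not lost.
release = set(develop) | {"release-fix"}
main |= release
develop |= release
```

After the release, Main and Develop hold the same changes again, which is exactly why the hotfix and release merges go in both directions.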
Environmental Branching
This one is pretty much what it sounds like. What you'll end up having is one long-lived branch for every environment your code gets deployed to. So if you have a Dev account, a test account, maybe a UAT account, an integration or staging account, and a prod account, all of those accounts get their own branch. What usually happens here is that your Main branch corresponds to the lowest environment.
For example, development. So you'll be committing your code into Main, which acts as the development branch. From there you merge into the next higher branch, which might be testing, and the one after that would be integration. Each time you merge into one of these branches, you run the tests in that environment to make sure that when you go to prod, everything will work. In this model, there's a gradual release to production. We make all of our changes on one branch, the Main branch. We don't have to worry about release branches or hotfix branches or any of that. We just commit straight to the Main branch and then graduate that release all the way through.
One advantage here is that each environment has its own branch, so the code should become stable in each of those environments.
Another advantage is that prod is always going to be releasable, because we are protecting it with the tests on those other branches. Now again, the con here is that production is not always up to date. Features are constantly being developed, and prod never catches up to what is currently in development, because you keep on developing more. While you're graduating one set of features, you're already working on the next. So prod will always lag behind where your development cycle is.
The next con is that multiple merges are needed for every release of code, so this can be time-consuming. It can also be a little more difficult, because you have to manage all of these merges and test at each step. Let's say you got all the way down into staging, four branches in, and then you find a bug. Now you have to go back to the development branch, make the change for that bug, and graduate it down through test, UAT, and integration until you make it to staging again and find that it works. Only then can you go to prod.
See how many merges are needed just to fix a bug when you're in the middle of a release?
Those were the strategies I wanted to cover in this video. This list is not exhaustive; there's a lot more to it. There are also a lot of variations and spinoffs of each one of these, and it is always a little bit different from environment to environment. It's not as important to memorize what each strategy is and what it consists of. What I want you to get is the big picture: what we do in Git, how we can separate environments, how we can separate development from Main, and how we can graduate a feature release into the Main branch.
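To put a number on that, here is a small Python sketch that lists the merges a change has to go through. The environment order is the one from the example above, and the `promotion_merges` helper is hypothetical; your own pipeline may have more or fewer stages.

```python
# Assumed environment order, lowest to highest.
ENVIRONMENTS = ["dev", "test", "uat", "integration", "staging", "prod"]


def promotion_merges(start, target, environments=ENVIRONMENTS):
    """List the (from, to) merges needed to graduate a change upward."""
    first = environments.index(start)
    last = environments.index(target)
    if last < first:
        raise ValueError("changes only graduate toward higher environments")
    # One merge per hop between neighbouring environment branches.
    return [(environments[i], environments[i + 1]) for i in range(first, last)]


# A bug found in staging has to be fixed in dev and graduated again.
fix_path = promotion_merges("dev", "staging")
full_release = promotion_merges("dev", "prod")
```

Each pair in the result is one merge plus one round of tests, so a single late bug fix repeats most of the release work.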
I hope you've learned something. byeeeeeee.