{"/blog/20101118-learning-selenium":{"content":"My basic need is to find a platform where I can test FF, IE, and Safari on Windows, Linux, and OS X. I use OS X as my platform, and Safari or Webkit as my environment. I don’t like Windows or IE. Linux is OK, but I like OS X because it just works the way I want. And I find FF to be slow, and Firebug which is needed to debug we pages causes rendering changes and timing issues (most notably causing FF to crash).\nIdeally I want the testing environment:\nTo be developed in Safari on OS X To be able to test both internal libraries and rendered UI To use unit tests to test internal libraries To use interactive tests to test rendered UI To use the unit/interactive tests as regression system moving forward To write the tests once on my browser of choice To run the tests on all combinations of browser and platform. To not be bogged down by the testing framework To be free, or very cheap From my research it looked like Selenium did basically exactly what I needed. And unfortunately was the only real option. There were plenty of options for taking screen shots of public sites (which mine isn’t, yet) and comparing those between browsers. And there are several options for unit testing javascript, but only Selenium did both and could be run on my own hardware.\nImplementation From my reading it looked like I wanted to use the IDE to create the tests, and remote controls to run the browsers. 
Eventually I need to scale to Selenium Grid, but that is for later discussion.\nTest code This is the sample file that I created to test.\n<html> <body> <script> Namespace = { test: function(item){ return item == true; } }; function String(item){ if(!item.push && typeof(item) === 'object'){ return item.name || ''; } return ''+item; //yup this is faster than toString } function Int(item){ return parseInt(item,10); } function Array(item){ var type = typeof(item); if(type === 'string'){ return item.split(\"\\n\"); } else if(type === 'object'){ if(item.push && item.length && item.concat){ return item; } } return [item]; } </script> </body> </html> Using the IDE It took a long time to understand this tool, since I had no background in Selenium. The basic premise is that each file is a single test case, which is a set of tests. Each test is a grouping of three items: the action, the target, and the expected result. It natively creates a 3-column HTML table, which it can also run, but my personal preference is to use the IDE to export the basic test into a different language.\nI am a Rails developer, and am familiar with RSpec, so I use the IDE to run the tests to make sure they work, but then I transfer them to RSpec since it is a more expressive test framework. The downside is that you have to use a Remote Control to run the test, which adds an extra layer of complication. 
We will get to the Remote Control later.\nA basic HTML test looks like this:\n<?xml version=\"1.0\" encoding=\"UTF-8\"?> <!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\"> <html xmlns=\"http://www.w3.org/1999/xhtml\" xml:lang=\"en\" lang=\"en\"> <head profile=\"http://selenium-ide.openqa.org/profiles/test-case\"> <meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\" /> <link rel=\"selenium.base\" href=\"file:///Users/jkamenik/Desktop/sel_test.html\" /> <title>Unit Testing</title> </head> <body> <table cellpadding=\"1\" cellspacing=\"1\" border=\"1\"> <thead> <tr><td rowspan=\"1\" colspan=\"3\">Unit Testing</td></tr> </thead><tbody> <tr> <td>open</td> <td>file:///Users/jkamenik/Desktop/selenium/sel_test.html</td> <td>test</td> </tr> <tr> <td>verifyEval</td> <td>this.browserbot.getUserWindow().Int(1)</td> <td>1</td> </tr> <tr> <td>verifyEval</td> <td>typeof(this.browserbot.getUserWindow().Int(1))</td> <td>number</td> </tr> </tbody></table> </body> </html> There really is nothing more to it than that. The one thing to note is that the test uses verifyEval, which takes a JavaScript string.\nNote\nIn all tests the this object is the base Selenium object, so if you want to get to the window object you have to traverse up the stack via this.browserbot.getUserWindow().\nWarning\nUnfortunately, everything tested in Selenium is converted to a string before testing. So if I need to ensure that an integer parsing function actually produces a number I need to use typeof.\nUsing RSpec The IDE is a great way to test scripts live, but for any programmer it is going to be easier to use a testing framework and do it programmatically. As a Rails guy I prefer RSpec, so that is what I use.\nInstalling This package requires ruby-gems, rspec, and selenium-client. I am going to assume you have ruby-gems installed already. 
The others are installed like this:\nsudo gem install rspec sudo gem install selenium-client Converting When using RSpec the only real thing to remember is that there are no assert* or verify* methods. The reason is that RSpec itself is a testing framework, so it has its own version of assert and verify (in this case should).\nThe IDE has a great feature in that it converts the HTML test into an RSpec test for you. It isn’t a great format, but it is better than nothing and is a good place to start.\nrequire \"rubygems\" gem \"rspec\" gem \"selenium-client\" require \"selenium/client\" require \"selenium/rspec/spec_helper\" require \"spec/test/unit\" describe \"array_test\" do attr_reader :selenium_driver alias :page :selenium_driver before(:all) do @verification_errors = [] @selenium_driver = Selenium::Client::Driver.new \\ :host => \"localhost\", :port => 4444, :browser => \"*chrome\", :url => \"file:///Users/user/Desktop/selenium\", :timeout_in_second => 60 end before(:each) do @selenium_driver.start_new_browser_session end append_after(:each) do @selenium_driver.close_current_browser_session @verification_errors.should == [] end it \"test_array_test\" do page.open \"file:///Users/jkamenik/Desktop/selenium/index.html\" begin (\"1\").should == page.get_eval(\"this.browserbot.getUserWindow().Array([1])\") rescue ExpectationNotMetError @verification_errors << $! end begin (\"object\").should == page.get_eval(\"typeof(this.browserbot.getUserWindow().Array(1))\") rescue ExpectationNotMetError @verification_errors << $! end end end Warning\nThe browser specified is chrome. This actually means Firefox, not Google Chrome. 
For Google Chrome use googlechrome.\nFor Safari use safari, but remember that you will need to disable the popup blocker manually and close Safari (for this user) otherwise it will just sit there forever.\nRemote Controls A remote control is what Selenium uses to execute the test. The IDE comes with it built-in, but it is tied to Firefox. To use IE, Safari, or Chrome you need to download the remote control software: http://seleniumhq.org/projects/remote-control. This software is just a Java server that opens your machine on port 4444 (by default) to allow Selenium clients to run tests. Each client gets its own browser instance to run the tests in.\nNote\nThe server must be run by a user that has access to the browser and has a screen to render to.\nWarning\nFirefox will only run a single profile per user. If you need to run Firefox concurrently on the same machine you need to fake a second profile. Don’t do it, just create a VM; you will be happier.\nCaution\nGoogle Chrome does not work on OS X. This is because OS X doesn’t add application executables to the path, and the server code isn’t smart enough to use the executable directly. The fix is supposedly here, but I was not able to get it to work. If I do I will probably write another blog entry and link it here.\nPutting it all together By default RSpec provides no runner code, and the code the IDE produces is not standalone. This is not a problem since installing RSpec into a Rails app installs script/spec. I have copied the runner code here to make it easier.\n#!/usr/bin/env ruby if ARGV.any? {|arg| %w[--drb -X --generate-options -G --help -h --version -v].include?(arg)} require 'rubygems' unless ENV['NO_RUBYGEMS'] end require 'spec/autorun' exit ::Spec::Runner::CommandLine.run I am going to assume the RSpec runner code is called spec and the test file is called test.rb. 
To run this test from the command line do the following:\nruby spec test.rb Assuming you followed all the steps, the test should have opened Firefox, executed a page, run the tests, closed Firefox, and returned the results. Now you can add more tests and have Selenium execute them.\nRelated Research Using Selenium Grid Using Chrome, or IE Using a Grid to run the same test in all browsers on all OSs ","lastmodified":"2010-11-18T00:00:00-05:00","tags":null,"title":"Learning Selenium"},"/blog/20101204-git-with-svn":{"content":"Normally, I would just use GIT without a bridge to another control system, but many companies use SVN. There are just so many benefits to using GIT that, for me, I am going to use it if there is a bridge to whatever repository type the company uses. I certainly don’t hate SVN or CVS or Perforce, but GIT allows me to work the way that I know I am most productive, which is commit early, commit often.\nWhat I mean by commit early, commit often is that I commit even if I only have part of the solution. As I find the other parts of the solution I commit those as well. That way when I am 2 or 3 days into a fix I already have the commit messages saved in GIT, so I don’t have to remember what I did for those 2 or 3 days. Only when the solution is shippable do I push it up to the company’s repository.\nBenefits of GIT include: Being able to share non production ready code (Peer to Peer) Being able to have many local branches Being able to logically group commits (via local branches) and push all at once Fantastic branch switching/merging Rarely will you ever have to fix a merge conflict yourself Rebase as well as merge Rebase is SVN’s style of linear commits Merge is non-linear and tries to keep commits sequentially ordered by date. So if two branches are merged and both were actively worked on then the commits are intermixed. 
(makes a lot more sense in practice than in writing) With GIT’s power comes a little bit more complexity, and here I will detail the method that I have developed over months of fits and starts. That way you can experience the benefits of using GIT for day to day work, but still use SVN when dealing with corporate.\nSetting up the SVN Bridge You can have GIT manage an entire SVN repository, branches and all. However, for GIT to do this it must checkout every revision of the SVN repo. This can be very painful when there are a lot of commits. Instead I recommend only managing a single branch starting at a specific revision near HEAD. You will lose history older than that revision, but it does save a lot of time for large SVN repositories.\nFor the below example I am going to assume we have a standard SVN repository at http://example.com/svn, the latest revision is 400, and the SVN username is test.\nFind the latest revision of the repository:\nsvn log http://example.com/svn/trunk | head The latest revision will start with an “r” and be within the first 5 lines.\nSetting up a git repo:\ngit svn init --username=test http://example.com/svn/trunk dev git svn fetch -r 400 Now you have a local GIT repo in dev that is synced to SVN trunk at revision 400. And when you git svn dcommit the user test will be used.\nAlways work in a branch In my company, before anything is allowed to be checked into SVN it must go through a peer review. In any given day I might work on 2 or 3 bugs/tasks. For each, I create a GIT branch with the bug/task ID and work there. When I am done I use git log -p to list the diffs that I submit for review. Then I move on to the next item, while the fix is being reviewed. When the first bug/task is reviewed and accepted I jump back to master, rebase it, jump to the branch, rebase master, jump back to master and merge the branch, and finally svn dcommit everything. After I mark the bug/task complete I also delete the branch. 
If the code is not accepted then I still have a branch where I can make the required corrections and repeat the process.\nIt might sound complicated, but it really isn’t. The only reason for all this rebasing is so that GIT’s native merge tools deal with SVN merge conflicts. I am not sure why, but they are far better than what the SVN bridge can do, and will ultimately lead to fewer headaches for you.\nIn code it looks like:\ngit> git svn rebase git> git checkout -b task1 ... work on task 1 git> git log -p -n X > task1.diff ... create diff of all (X) changes needed for task1 ... email the diff for review git> git checkout master git> git svn rebase git> git checkout -b task2 ... work on task 2 ... when task 1 is approved commit what you have for task 2 and dcommit task1 ... return to task 2 when task 1 is committed This style also works well if your boss comes over at the 11th hour and assigns you a new emergency assignment. When you are done with the emergency, switching gears is as easy as switching branches.\nDCommitting your changes As stated above, I use the checkout, rebase, checkout, rebase, checkout, merge, dcommit style. This seems cumbersome until you understand the purpose.\nFrom any branch always checkout into master. This allows master to stay pure of your changes and makes it less likely that git svn will fail.\nOnce master is up-to-date, checkout the branch again and rebase the master changes into the branch, fixing any rebase conflicts there might be. By fixing them on the branch we keep master clean, so we are using git’s rebase/merge capabilities, not the SVN bridge’s. There are times this step can be skipped, but once you have to deal with your first rebase conflict from the bridge you will wish you had branched.\nNow that the branch holds the latest code to be dcommitted to SVN: checkout master, merge the branch, and dcommit master. 
You could rebase the changes from the branch if you prefer; it really makes no difference because your changes are on top of SVN now either way. Once done, all of master’s log messages will be rewritten to what is in SVN.\nAll together it looks like this:\ngit> git checkout master git> git svn rebase git> git checkout task1 git> git rebase master ... SVN needs your changes to be rebased so rebase master ... by rebasing onto a branch it is easier to deal with rebase failures ... fix any rebase issues git> git checkout master git> git merge task1 ... could merge or rebase, doesn't matter here git> git svn dcommit ... mark the task closed git> git branch -D task1 ... svn doesn't always mark the branch merged properly so use -D instead of -d Pro Tips Different user names Use a different username for SVN and GIT. That way it is easy to see in the log what is committed to SVN, since the bridge will rewrite the commit log from what SVN says.\nFor SVN I use jkamenik. For GIT I use jkamenik at gmail dot com. Commits that are in SVN also have the SVN revision number in the log message, but I find it easier to use usernames since it is at the top of the log message.\nSetup Aliases Git allows more commands to be added via aliases. An alias can be a shortening of a git command: st = status. 
Or it can be a shell command that git will execute: test = !sh -c 'echo \"it works!\"' (notice the leading !).\nHere is what the alias part of my ~/.gitconfig file looks like:\n[alias] # Old SVN aliases ci = commit co = checkout st = status stat = status # stuff I find useful br = branch df = diff rm-all = !git st | grep deleted | awk '{print $3}' | xargs git rm add-all = !git st | grep modified | awk '{print $3}' | xargs git add st-br = \"!f(){ git co master && git svn rebase && git co -b $1; }; f\" up-br = \"!f(){ git co master && git svn rebase && git co $1; }; f\" co-br = \"!f(){ git up-br $1 && git co master && git merge $1 && git svn dcommit; }; f\" Notice that st-br, up-br, and co-br are basically the set of commands I noted above in single command style.\nStashing changes The stash is a hidden place where git can keep changes that are not yet ready to commit. This is very useful if you get switched to another task and really don’t have time to fully vet a change on the current task. You can stash the outstanding changes and later replay them.\ngit stash keeps a queue of changes so you can stash more than one thing, but you can only replay them top down.\ngit> git stash ... hides all modified files git> git stash list ... lists all stashes git> git stash apply ... applies the top stash, but does not remove it git> git stash pop ... applies the top stash, and removes it Resources GIT <-> SVN workflow Effectively using GIT with SVN ","lastmodified":"2010-12-04T00:00:00-05:00","tags":null,"title":"Git With SVN"},"/blog/20101205-how-to-be-a-bad-boss":{"content":"Being a boss is a complicated thing. It is your job to get people to do things, sometimes things they do not want to do. 
And it is especially complicated in the software industry, where it is like herding cats.\nI see a lot of posts on how to be a good boss, but the problem there is that they often forget to mention the things that can and will immediately erode any success you might have had. I am going to assume that as a reader you strive to be a person who others are willing to follow.\nNobody is perfect, so you will probably have done at least one of these things in the past. Or you do them without even knowing it. Now is your chance to stop, and be a constructive boss who people want to work for.\nBe smarmy Yep, this is number one. Smarmy people are falsely friendly (think used car dealers). Notice I did not say “do not be mean.” As a boss sometimes you will need to be mean, or at least not very nice. And even that should be kept to a minimum, but it is never OK to be smarmy. All it does is make people uncomfortable confiding in you, which means they will either find better employment (if they can) or they will sabotage you.\nHere are some examples of what I mean by smarmy:\nSuggest that a problem can be solved by the employee working more or harder Using scrum (or any general meeting) to point out an employee’s failure Joke about firing an employee Make any sarcastic comment about an employee in the presence of others Only talking to an employee when you need them to do something (i.e., I am only talking to you because I want something) Some ways to tell that you have already failed at this:\nAll laughter stops when you enter the room Conversations between you and an employee die quickly Employees no longer tell you about issues they are having Be a hypocrite A hypocrite is someone who says one thing and thinks or does another. It can be as overt as being prejudiced (racist, sexist, etc…) or as covert as being passive aggressive. 
Being a hypocrite is risky because when other people find out, they will see you as deceitful, and no one wants to deal with that.\nOne example is saying that code needs to be high quality but not leaving time for testing. Overtly this often resembles explicitly adding a testing phase at the end of coding and then cutting it if coding goes long. Covertly this is often assuming that testing is done as part of coding, but scoffing at longer estimates. Speed and quality have a tenuous relationship. Something done slowly may or may not be of quality, but something done quickly is almost never of quality.\nThe best way not to be a hypocrite is not to bring your values and prejudices into the equation. To do this, though, you have to understand what your values and prejudices are. In the no-testing case, quality is explicitly stated as the important part, but speed is implicitly treated as the important factor. More specifically, speed at the cost of quality is treated as the important factor. To avoid this, it is important to explicitly state the importance of all three tenets of the project triangle: that way you avoid the embarrassing case where one tenet goes to 0 unintentionally. See: Project Triangle.\nBe arrogant I am not talking about being hard headed or stubborn. I am talking about being truly arrogant. Stubborn people will listen to reason, assuming the opposing argument is good enough; arrogant people cannot be directly convinced. There are two forms of arrogance to watch out for: arrogance of idea, and arrogance of presenter. Arrogance of idea is simply dismissing an idea because it is in direct conflict with your internal ideas or values. Arrogance of presenter is rejecting an idea because of who presented it.\nTwo engineers approach you with different solutions to the problem of releasing on time: one says to keep the feature set and stop doing unit testing, and the other says reduce the feature set but keep the unit testing. Which do you choose? 
If you choose “keep the feature set” then you are wrong. If you choose “keep the unit testing” you are also wrong. If you chose either (based on only the information provided) then the choice you made was based on your internal values only and not based on the problem being presented.\nIn the context of work, the truth is that new hires probably know more about a given framework than you, and certainly will have a different perspective on how things should be done. They are probably offering you this knowledge because they have already encountered the problem you are now seeing and have found an appropriate solution. It is tempting to dismiss this simply because a new hire presented it and they don’t know the full business impact of their suggestion. But ask yourself: if the system architect presented you with this idea would you accept it? If yes, then you are suffering from arrogance of presenter and should tread lightly.\nI suggest instead that you judge an idea (never a presenter) on its merits in the context of only the problem (never on facts you assume to be true). And make sure you are consciously and actively judging the idea, and that you are not adding your own problems or values.\nOffer post hoc rewards A post hoc reward is a reward given after work is complete. In physical work the effects of post hoc rewards are well known and lead to better results, but in knowledge work offering post hoc rewards has a distracting effect. Work will take longer and be of lower quality. It will make employees very good at performing work that maximizes their incentives, and not at developing a quality product. Be aware that offering a reward for work done defines a tangible value for that work, which may demoralize an employee who feels under compensated as a result.\nIntrinsic rewards can and should be used. These are things that are given universally and without reservation. Basic examples would be vacation, and health benefits. 
Other examples would be a free day to work on any project the employee wanted, or free soda/tea/coffee. These intrinsic benefits help endear you and the company to the employee.\nAccept estimates less than 4 hours This may only be true in software, but tasks never take less than 4 hours. A task is really an atomic, complete chunk of work, so to be considered complete it must be coded, tested, reviewed, and committed. However, some employees will claim they finish tasks sooner. Often these people are cutting corners (not testing, or not getting a review) or they are doing a bunch of tasks that should have logically been considered one task.\nEither way they are playing a game with the numbers to get their counts higher, which means that you are probably rewarding overly aggressive estimates, and have a huge defect backlog. Well done!\nForcing the Time-box I have seen this bite so many bosses, where they say something like “tasks cannot possibly take longer than 4 days.” The problem is that now you have employees trying to split tasks - often nonsensically - that have little or no long-term vision. So when it comes time to implement you have a bunch of small tasks that are too small to warrant research themselves, but no real clear direction because no research seems worth doing. Then it gets implemented and becomes instant legacy.\nEstimates of several weeks or months are fine early in the process. But enough time needs to be given to investigate feasibility prior to high level scoping. Prior to implementation, time should be given to research an implementation and split the tasks out. All of this time should have been blocked in as a single unit during high level scoping.\nRequire meetings Meetings cost time and money. And in addition to the meeting itself there is time before the meeting where people ramp down, and time after the meeting when people need to ramp back up again. 
Both ramping phases are about 10 minutes each.\nTo monetize this for you, let’s say you have 12 employees going to a meeting, on average they make $24 an hour, and the meeting is 15 minutes (a scrum). The total chunk of wasted time is 35 minutes, or about $14 per employee. For the company that means you wasted $168. And that does not include any prep time employees had to take. In general, meetings need to be kept to a minimum and kept on track, because they will end up being the single most costly event both in terms of money and productivity.\nWatch your employees work Nobody is comfortable being scrutinized. Do not set up your office so that you are looking into your employees’ cubes, don’t wander around aimlessly, and never under any circumstances hover. By doing this you are showing your employees that the most important thing is to look busy (a function of the keyboard and fingers), not to solve problems (a function of the brain). At the point where your employees feel scrutinized they will work just hard enough to not get fired, and will partake in CYA games.\nThe best example is the movie Office Space. Little more needs to be said.","lastmodified":"2010-12-05T00:00:00-05:00","tags":null,"title":"How to Be a Bad Boss"},"/blog/20101212-deal-with-email-overload":{"content":"This isn’t strictly software related, but a lot of us have to deal with the horror that is email. Email is not a good solution to any problem, but it is ubiquitous, so it is used for all things: personal correspondence, commit tracking, defect notification, task notification, etc… Email is all too often used as a mechanism to pass-the-buck.\nThe only way to deal with this email overload is to set boundaries on email usage. A lot of people will find these boundaries annoying, if not unworkable. Just stick with it and lead by example. Eventually, in a time of high stress, you will be able to get to important messages fast when others would be left floundering. 
And at that point, folks will stop minding the limits you put on email.\nMy results I am putting the summary first in hopes that you might actually try some of these suggestions. On average I get 50 to 200 work related emails a day. In any given day (Saturday and Sunday included) I will need to respond to 5 to 10 of them. That means that I get between 40 and 195 un-actionable emails.\nMy full process is detailed below, but basically it reduces my email load from hundreds of emails to the handful that I have to deal with. I also limit the amount of time I spend in email, which further limits it as a point of distraction. On busy days I may never even check my inbox, instead relying on others to contact me via a different medium if they really need me. And yet, no one feels like I am unresponsive.\nUse the right messaging system to avoid email Use email for messages that do not need an immediate response. Since there is no message size limit, make the most of it. Write emails that are well detailed and specific. If at any point a message can be answered with “Ok” or some other monosyllabic word then email was not the correct choice.\nUse Instant Messages (IM, IRC, etc…) for conversations that need semi-immediate responses, and a possible record. Many IM packages can log conversations for later viewing, which is useful when you forget things. There are message length and formatting restrictions, so this forces the messages to be brief and specific.\nUse Phone, Skype, or face to face contact when a response is needed immediately.\nUse Twitter, or some other global message posting service, to track commits, continuous integration fails, defects, etc…\nUse a wiki, or blog, to track generic instructions or documentation.\nBut check email when asked Once a set of other messaging systems is up and running then folks will be able to tell you that they sent you an email. 
Instead of saying “I don’t check email”, which can be off-putting, say “oh, I missed it, let me check”.\nThen check your email, read their message, and ignore all other messages.\nCheck email twice a day (only) The easiest way to train people not to use email as a crutch is to not use it as one yourself. Only check email twice a day, and make a concerted effort to reduce that to once a day within a month, and once a week within 6 months. Set up the other messaging systems so that email is used only for what it needs to be. And when people step out of line, correct them.\nDon’t check email first or last Email should not be the first thing that you check in the morning, as you should not be working out of email. It should also not be the last thing you check, as it will disrupt your already stressful commute/home life. Instead check email 2 hours after the start of the day, and 2 hours before the end of the day. If you work a 9-to-5 that means once at 11am and once at 3pm.\nThe day you start doing this, tell the people that send you the most email and the ones that will be most affected by the change, and no one else. You are not trying to be sneaky, but if you blast an email to everyone then you are going to make it a big deal; which it is not. The others will learn over time.\nDon’t use Inbox as an archive Once you are finished reading an email, either deal with it and delete it or archive it. Create a separate archive folder and move emails there if you need to save them. Your goal is always to reduce the inbox to zero.\nCreate an “other inbox” By default all messages should go to your “other inbox”. Then you should have filters that filter-in messages that are important. A recommended filter might be: messages where you are the direct recipient and the sender is an important sender.\nI recommend this as the first filter, but eventually your “other inbox” will become overloaded. 
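To make the first-pass rule concrete, here is a sketch of its logic in plain Ruby. Everything in it (the addresses, the constant, the field names) is made up for illustration; the real rule lives in whatever filter syntax your mail server speaks, this just shows the decision being made:\n```ruby\n# Hypothetical first-pass filter: keep a message in the real inbox only\n# when I am a direct recipient AND the sender is on my important list;\n# everything else lands in the other inbox.\nIMPORTANT_SENDERS = ['boss@example.com', 'qa@example.com']\n\ndef inbox_for(message, me = 'jkamenik@example.com')\n  direct  = message[:to].include?(me)\n  trusted = IMPORTANT_SENDERS.include?(message[:from])\n  (direct && trusted) ? :inbox : :other_inbox\nend\n```\nSo a message sent directly to me from an important sender stays in the inbox, while a mailing-list copy of the same message, or a direct message from an unknown sender, is filed away.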
So a standard fine-grained set of filters will likely be needed to redirect junk messages away from the “other inbox”. These can just be filters that run before the “other inbox” filter.\nUse the server’s filtering mechanism When possible have the server do all the filtering; that way you can use multiple email clients and will not be dependent on leaving an email client running. All Exchange servers and many online email services can do server filtering. When using Outlook be careful: the ease of creating filters is sometimes offset by the fact that it lies about which filters can be saved to the server. Though harder, since MS didn’t spend much time on the server-side features of Exchange, it is better to build the filters on the server directly.\nReduce the inbox to zero (in one sitting) The goal is always to reduce your inbox to zero. Once all the filters are in place, the only thing left is to actually deal with all the important emails (at least that is the hope). When dealing with email decide if the email should be dealt with, deferred, or archived.\nIf time is really an issue then find the messages to archive and defer first, and move them. That way what is left is just the stuff to deal with now.","lastmodified":"2010-12-12T00:00:00-05:00","tags":null,"title":"Deal With Email Overload"},"/blog/20101215-js-unit-testing-using-yui":{"content":"Ok, Selenium was a partial success that you can read about here, but it really was harder than I wanted it to be and it required a lot of setup for only a little bit of testing. I noticed that a project that I use a lot, ExtJS, uses a modified version of YUI Test. So I converted my Selenium test example to YUI Test.\nMy initial impression was confusion, because you have to download the entire YUI 2 suite just to get the test tools. Also, when you download the entire suite there are many different copies of the same file in various stages of minification. But following the documentation I boiled down exactly what I needed and threw away the rest. 
I put all the code for my test here so you can follow along.\nYou will need the following files from the YUI archive:\nbuild/logger/assets/logger.css build/logger/logger.js build/yuiloader-dom-event/yuiloader-dom-event.js build/yuitest/assets/testlogger.css build/yuitest/yuitest.js Create an HTML file that includes the css and js files: The HTML will also need to instantiate the test logger and run the global test runner. I bound the test run to a button so I could control when it ran: Now that we have a YAHOO.tool.TestRunner we need to add a test that can be run. Instantiate a new YAHOO.tool.TestCase and add it to the TestRunner. All a TestCase needs is a name and a bunch of functions that start with “test”. Everything else is automatic. Below is a simplified version of the full file. Conclusion Though Selenium is more automated, that automation comes at the cost of being more complicated than testing should be. If I were part of a team of programmers then sure, setting up and maintaining Selenium Remote Controls would be a small part of the overall effort, but since I am not a team of programmers I think it is overkill.\nWhat I really need is an easy-to-run, easy-to-write, repeatable unit testing framework in JS. I do the legwork of pointing my various browsers at the HTML and reviewing the results. When things get big enough that I need to set up a continuous integration server, or I have a QA department, then I will give Selenium another go. For now YUI Test is the way to go.","lastmodified":"2010-12-15T00:00:00-05:00","tags":null,"title":"JS Unit Testing Using YUI"},"/blog/20101231-minimum-developement-enviornment":{"content":"Every developer has their own opinions on what tools are needed. This is a very malleable list, in general, but in my ten years of experience I have found some key things that are needed. Sure, you can live without a lot of these, but they will make your life easier.\nText Editor A text editor is where a programmer is going to spend most of their time. 
I am sure it is nice to have all the IDE features right there in the sidebar or right-click menu, but a lot of IDEs (Eclipse, I am looking at you) add this fluff at the expense of a truly usable text editor. I don’t need my editor to take 15 minutes to open if all I am going to do is edit a single file. I don’t need a plugin architecture if all I want to do is edit a script and run it from the editor.\nI use TextMate right now because it has all the features I require. If another editor comes along (one that is updated regularly) that has them then I will switch.\nGutter It isn’t enough to have a note somewhere (for example at the bottom) about the current line. I want a gutter that takes up constant room and shows the line number as well as other things like: cursor location, folding marks, and breakpoint/quick-jump points.\nSyntax Highlighting This is a must! Sorry vi. It is a very quick way of determining if the syntax is correct. It also makes the code easier to read.\nWhite space highlighting Very often tabs are mixed with spaces. This is a no-no, and if you are working with me I will not accept mixed code. Use one or the other, not both interchangeably. And have an editor that can show you the difference. If it is a feature that vi has then you had better be damn sure your editor has it as well.\nVertical Selection There are times where vertical selection comes in handy. Basically it places the cursor as a box and anything you type is repeated once for every line. Great if you need to add indentation to the front of a lot of lines at once.\nFolding When programmers write 1400-line functions (don’t scoff, you know you have done it) it is nice to hide those. Basically any block that is not immediately useful to me is hidden.\nHotkeys (for everything!) The Windows method of pressing alt+X, where X is the underlined letter of the menu, is not what I mean. Everything should have a single hotkey combination that does exactly what I want. 
For example, to run the current file as a script cmd+R seems reasonable to me. It also has to be a single press. Sorry Emacs, ctrl+x ctrl+r just to run a file isn’t going to cut it.\nMouse Support Yep, selection, especially vertical selection, is just faster with a mouse.\nQuick Search cmd+f should present a search box. cmd+g should quick search for what was already typed in the search box. And the search box should be erased when opening a new file. shift+cmd+g should search up instead of down. All searches should automatically wrap without asking or prompting. I can see from the scroll bar that a wrap happened.\nSearch and Replace Search has to support regex, with matching. Replace needs to support matched items (usually $N, where N is the parenthesized group). Replace should allow replacing in the entire file, or just in the highlighted section.\nQuick Open When opening a directory I do not want all the files opened in the editor. I just want a file browser and a place where the text files will open. Quick open (cmd+t in TextMate) should show a dialog listing all the files that match my typing so that I can quickly open the one I want. I do not want to have to type a full path, or search for the file. The editor should index the file names so that I can quickly open them when I want. Updating of the list of files also has to happen live.\nIndent Correction Contributors will mix tabs and spaces because they will not enable white space highlighting. If the code is already checked in and I have to deal with it then I am going to have my editor re-indent the way I want. I will commit this first, then start making changes. That way my changes can actually be seen and are not hidden in indentation hell.\nScripting Support Any editor that can run scripts is an IDE. In many cases they are better because you get a great editor first, and an ability to extend it. Most IDEs spend a lot of time outthinking you as the user. 
More specifically, they spend a lot of CPU and memory on things in hopes that they might be useful to you one day. This is CPU and memory that could be used by your kernel to do other things that are actually useful to you.\nCommand line tool I spend a lot of time at the command line. So I really need to be able to open a file in the current directory (or the directory itself) from there.\nControl Comments These are comments that can be placed at the top of a file to control how the editor shows the document. They are normally hidden within comment blocks so they do not affect the flow of the document, but are very useful for controlling file-specific settings.\nTerminal The reverse is similarly true. Sometimes you need to run a command without leaving the editor. It is best if the editor can do that via a real terminal.\nNative/Quick start/Low Memory Java apps might work on a lot of different architectures, but they are dog slow and memory hogs. The editor I use is going to be the editor I use for almost all text work. I might open a lot of files at once, or I might open only a single file at a time. It needs to be fast.\nVersion Control I use git. You will eventually use it as well, so you might as well learn it.\nNon-locking Commits Distributed Network Fast Personal Branches Tags Merging SVN’s merging was a joke. It was so bad that it made you do all the work, so you basically never did it. I have yet to have a serious problem with Git merging and I use it daily. In fact I might have 2 to 3 active branches going at one time, and I will randomly merge between them. It just works. I don’t know why, nor do I care. And when it fails, it does so clearly. It doesn’t just puke random output at you.\nMisc The following are more important when you start working in a group setting. 
At work they will often be given to you or forced upon you, but for personal projects I recommend having a set of free or OSS options to pull from.\nCloud Repository Storage This can be anything that gets the code off of your machine. It is the means of sharing code between other people and machines. The requirements are minimal, but basically it has to support the SVC tool you use (usually git), allow for both human and machine accounts, and have a permissions model to allow basic limiting of access (for example, machines get read-only access).\nWiki Some location will be needed so that you can get your ideas out of your head and into someone else’s; be it the user of your software or another maintainer (like the future you). It doesn’t have to be anything complex, and the concept of wiki-style documentation is about as lightweight as you can get. Have one.\nRequirements and Issue Management System / Service Once your software is used, especially if used by others, things will come up. Since no one can spend every moment of every day just implementing, you will need a place to park requests so that you can focus on other things. This is where a “ticketing” system comes in.\nThe basic requirement is a system that can house multiple tickets. Each ticket needs a text box where you can describe what the issue is or what needs to be done, the ability to assign it to someone, the ability to order the tickets, the ability to comment or discuss, and finally the ability to track status (open, done, etc…).\nCaution: Many of the systems readily available were built in a time of code bloat, where the belief was that the code needed to handle all things. This has led many of the systems out there to be a basic copy of someone else’s complex workflow, codified. This can lead to bloated and inflexible process, so unless you really need the bells and whistles I’d avoid them in favor of something really lightweight. 
Sometimes a spreadsheet is good enough.\nTesting Framework(s) Regardless of what you are building you will need to test it to make sure your change didn’t break something that was already working. I could write entire blog entries on proper testing techniques, but the basics are the need for three types of testing system.\nUnit tests - These tests - always written by programmers - test the very low level objects. Usually they focus on edge cases in the logic. Often times a programming language will have this built in. Behavioral tests - These tests - also always written by a programmer - prove that the work is complete. In other words these prove the “behavior” of the feature being coded and are often extracted from the requirements of the feature itself. The frameworks you choose will determine the behavioral test frameworks that you have available to you and the form those tests take. None is really better than another as long as in 2 to 5 years you can re-read the tests to determine if the described intent matches the logic. If it is difficult to marry those two things then choose a simpler test framework. Just remember DRY is not a thing in testing. System tests - These tests - written by QA - prove the software works outside of itself. That is, they test the final artifact that is produced. They are always outside the repo itself and always run on the built artifact. These tests are often paired or versioned with the artifact, so that you can go back and retest old software to determine if an issue exists there or not. For a library this might be using it in a downstream project, so the downstream’s unit tests act as system tests for the upstream. For a binary this might be copying the binary onto a host machine and executing it. For an appliance this might be booting it, and using its API. Notice how integration test is missing from the above? That is because it is an overly broad and overused term. 
As such it often comes to mean “test what is easy to test”, and thus becomes an ultimate waste of time. The above testing styles have a clear intent, and intent is everything. So put on different hats and test correctly instead of testing what is easy.\nContinuous Integration Service / System Once the tests are written, and the compile and packaging details ironed out, there should be a system other than the developer that does all that. Ideally this system can be built from a minimal config using a set of directives that are defined in configuration files stored with the code itself. This ensures that the config is versioned along with the source code (i.e., Infrastructure as Code). Also, it ensures that a single system can be used for multiple projects, thus dividing the maintenance costs across projects.\nWhat the CI server / service should do is:\nMonitor for changes to the code If changed, it downloads the new code finds and blindly executes the IaC file reports pass / failure status Caution: It sounds simple enough, but finding the right balance of expressive IaC that runs fast enough and isn’t hard to rebuild when it is corrupted is extremely difficult. As the build system gets more complex and more scripts are written to work around its problems, the Ikea effect can kick in, making the build machine seem more valuable than it really is. Don’t invest highly in a “CI ecosystem” (i.e., Jenkins); instead treat it as a set of fungible runners executing build and test scripts that are versioned with the source they build. Be able to delete and rebuild the CI platform quickly, so that when a better CI platform comes along you are in a position to switch quickly.\nArtifact Store Whatever you are building, you should not use the source repo itself as the final state. 
Instead you should run it through CI (test, build, and package) and then capture the package somewhere else.\nWhat and how you do that will depend on what language you use, and how downstream users need to consume the artifact. In some cases your code might need to be built into multiple artifacts stored separately.\nThe only real advice is that the Artifact Store has to handle the distribution of the artifacts, which means it needs to be able to hold multiple versions and have a permissions model that allows write access by the build process and read-only access to downstream consumers. Also, it needs to be able to handle the amount of parallel downloads of the artifacts.\nContinuous Deployment Service / System If you are building a library or artifact then you won’t need one of these. Testing, building, and copying the artifact to a final storage location is going to be enough. However, if you are hosting a service then you will need an automated means to make the new software available to customers automatically.\nThere are many styles of deployment and no one tool that is good at them all. Also, the CD platform you can use will be determined by how you run your service.\nBasically, it needs to be able to:\nCheck for the need to deploy Usually either checking for new versions of artifacts, or via a manual push button. Anything else can be rolled up into one of these two options. Delta the current to expected state and determine the correct commands to run Gone are the days when you write a script to deploy. Modern CD is declarative. Recover from issues in the deployment Again, declarative. You say how something can recover on error and CD figures out if it is needed and what the actual steps would be. Could be a retry, a rollback, a halt, or something else. Notify a human Caution: CD can be “implemented” using a CI platform but that is fraught with problems. It is tantamount to using a hammer on screws. Yes you can, but should you? No you shouldn’t. 
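The “delta the current to expected state” step described above can be sketched in a few lines of Ruby. This is a minimal illustration, not any real CD tool; the service names, versions, and the `plan` function are all made up for the example.

```ruby
# Declarative CD in miniature: you declare the expected state, and the
# tool computes the commands needed to reach it. Only services whose
# running version differs from the declared version produce a command.
def plan(current, expected)
  expected.each_with_object([]) do |(service, version), commands|
    commands << "deploy #{service} #{version}" if current[service] != version
  end
end

current  = { "web" => "v1", "worker" => "v2" }
expected = { "web" => "v2", "worker" => "v2" }

puts plan(current, expected).inspect  # => ["deploy web v2"]
```

The point of the sketch is that no one writes the `deploy web v2` step by hand; it falls out of comparing two declared states, which is also what makes retries and rollbacks computable.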
Pick a tool that is aimed at the end goal of the deployment and makes that part easy.","lastmodified":"2010-12-31T00:00:00-05:00","tags":null,"title":"Minimum Developement Enviornment"},"/blog/20110104-what-the-restaurant-industry-can-teach-the-software-industry":{"content":"I really like “Gordon Ramsay’s Kitchen Nightmares.” Not the American version, because it is mostly drama. The British version is where the substance is. The basic idea is that Gordon has 7 days to turn a failing restaurant into a successful one. Also, four to six weeks later he returns to see how many of his ideas stuck and how the business is going.\nUniversally, if the restaurant follows his advice they do well. If they fight him they fail. His tactics in every episode are almost identical and I have taken them and applied them to the software industry. Nothing new here, just interesting how good ideas transcend industries.\nKnow your customers A lot of restaurants fail because they try to be what they want and not what the customers want. Your location dictates what types of food you can serve, and even what decorations should be used in the restaurant. The chef/owner has to bring the concept, but that concept has to fit the area.\nThe same is true in software. Google Wave is a good example of what not to do. There was no real problem to solve, and there was really no audience. It was a novel idea, but it lacked an understanding of what it was trying to do.\nMeet the Quality Expectation There is a minimum quality that a sit-down restaurant must meet. It is entirely based on user expectation of value. 
One does not go into a sit-down restaurant and expect to order off of a “dollar menu.” And one does not expect to get a preprocessed, pre-made, microwaved hamburger.\nWhat the customer never sees, and that Gordon really harps on, is a clean kitchen with quality fresh ingredients.\nElegance is more important than flash If the restaurant looks like “Tijuana puked up” then there is a good bet where all the money went, and it isn’t the food. A similar thing can be said about software. If the UI looks like an icon library puked on it then all the money went into adding features no one uses.\nThere is an understated elegance that indicates a care for details. And that makes things more enjoyable.\nSimple is Key In Gordon’s words, “simple is key.” This is just part of the reason that Twitter and FourSquare are popular given that Facebook has both of their features. Theirs is the simpler solution, so when that is all you want, that is where you go.\nPassion and Direction are key Any kitchen that does not have a passionate chef is doomed. The same has to be true of your software engineers. When the passion is lost then everything falls apart.\nThe other important part is direction. Passion without direction is a bull in a china shop; great energy, but likely to cause more harm than good.","lastmodified":"2011-01-04T00:00:00-05:00","tags":null,"title":"What the Restaurant Industry Can Teach the Software Industry"},"/blog/20110110-problems-with-agile-implementation":{"content":"I really like agile programming. It keeps me close to the action, and makes me have to think about my next moves. It also keeps me informed as to what is going on around me. But in my many years of using agile I realize that, though the process itself is very nice, its implementations can tend not to be.\nProblems don’t arise from agile itself, but from who implemented it and how. 
If the implementer’s goals do not match the Agile Manifesto there is little chance of success.\nI have used scrum many times and the most common problems I see are:\nAgile as Micromanagement There are many reasons that folks micromanage, but usually it comes from a place of fear. This fear then leads to over-involvement, which in turn leads to the Agile methods becoming a weapon for them to alleviate that fear by remaining in control.\nIt looks like:\nHaving to break tasks down into hourly segments of work Having to break up tasks that logically have to be done by a single person Having to account for ALL time taken, even time not related to code, like attending meetings Having to be overly detailed in comments on work items Using deadlines instead of priority Agile tenets misused:\nIndividuals and interactions over process and tools Working software over comprehensive documentation This happens when a manager (chicken role) is the Scrum Master or when the Product Owner has say over implementation specifics. It is a confusion of roles, which in turn leads to a confusion of goals, which in turn leads to lost productivity.\nAgile as a Whip Where micromanagement affects entire teams, whips are an individualized weapon. Agile tools can be used to single out individuals just as well, if not better, than entire teams. The following are signs that it is being used to whip an individual.\nIt looks like:\nFiltering a burn-down on an individual basis Placing “realistic efforts” on tickets because the individual “cannot estimate correctly” Associating points with people (publicly) Associating number of tasks done with effort Associating points with hours, in an attempt to estimate a person’s working day Basically anything where measured output is more important than people Agile tenet misused:\nIndividuals and interactions over process and tools Anytime you associate numbers with people you have created a crab mentality. 
Their focus will stop being on software and start being on making their numbers better. Those that are better at number games will succeed; those that are better at software will fail.\nAnytime you put your people under undue pressure, simple mistakes are made. This is going to later erode confidence in the team. It is going to happen like this: “you missed a comma in a Javascript file which causes it not to work in IE. That was such a simple mistake to have tested for that I am not sure you are testing any of your code.” The problem was caused by 4 hours of sleep in 72 hours of coding at the end of an over-extended sprint. The programmer was nearly delirious. It is shocking it was the only mistake, not that it was a simple mistake!\nUnfortunately I have seen this situation start innocently enough, with comments like “we don’t want to over work the staff” or “we want to make sure they always have something to do” or “we want them working on the correct things”. If the “we” in question is management then there is probably already a misalignment, and Agile is going to be used as a whip to solve the problems created by the bad implementation.\nAgile as Demotivation And now we come full circle to what to do when Agile has been used poorly and management is stuck. Clearly it isn’t Agile’s fault, because Agile is perfect, right?\nWhat happens here is that management is trying to convince folks that it is their fault. The following are some actual quotes I have been told over my career. I paraphrased to remove details about the people and companies involved.\nIt looks like:\n“You said it would take 3 days. It took 10 days. You need to make up the difference out of your own time” It actually only took 3 effort days; it just took 10 calendar days due to distractions. “We cannot slip these dates, and you have already pared back the release 2 sprints ago. You need to put in extra effort” 2 sprints ago the direction changed and 2 sprints of effort were exchanged with 2 others without discussion. 
It took 1 sprint to clean up / hide the features that were in progress, but that wasn’t accounted for. There were still 2 sprints of effort left at this point. “Agile is about being agile. Even though we are mid sprint we are radically changing direction, we are not canceling the sprint or doing sprint planning. We are just swapping out some tasks for others. We already did the estimates for you so you can keep working.” Yes, this was what was actually said to us those 2 sprints ago. Agile tenets misused:\nIndividuals and interactions over process and tools Responding to change over following a plan These are all excuses I have heard. Each time given by a person in a manager role because they are ignoring changes in the field (military term). Every choice has a set of outcomes: some good, some bad. The attempt with agile is not to mitigate bad outcomes, but to allow those outcomes to contribute to the overall direction.\nSometimes the bad outcome will be that something took too long, or that one language/tool was not the correct choice given the problem set. If for every problem that happens the developer has to take their own time, or face embarrassment, to solve the problem then they will stop making choices. Not just choices that might have bad outcomes, but choices altogether. At which point someone in management will start making detrimental choices.","lastmodified":"2011-01-10T00:00:00-05:00","tags":null,"title":"Problems With Agile Implementation"},"/blog/20110711-understand-capistrano-without-rails":{"content":"I am working on a rails project and that rails project is distributed over several nodes in a cluster. However, each of those nodes is a standalone unit and the rails app is a small administrative frontend for that box only. For this reason the standard Capistrano deployment tasks will not work. 
So some hacking is in order.\nLooking around I noticed a number of projects that were just deployment steps for other frameworks and stacks, but still required Capistrano. This got me thinking that maybe Capistrano is more than just a set of deployment steps. The wiki does not make this clear, but there are three parts to Capistrano:\nan engine to replicate (transactional) commands on any number of servers a rake-like DSL for defining tasks to perform on remote servers, and a set of deployment recipes specifically tied to rails. The first two items are what I need, and by not including the default recipes I can do anything I want. Below is my understanding of how Capistrano works.\nSample config/deploy.rb file The below code is an example that will be referenced throughout.\nrole :servers, 'host1'\nrole :client, 'host2', 'host3'\nrole :client, 'host4', :special => true\n\ntask :install, :roles => :servers do\n  run \"uname -a\"\nend\n\ntask :test, :roles => :client do\n  sudo 'echo TEST'\nend\n\ntask :special, :roles => :client, :only => {:special => true} do\n  sudo \"echo SPECIAL\"\nend\nUnderstanding Roles Capistrano uses “roles” to list servers with similar behaviors. The default recipe uses roles like “web” and “db” and makes it seem like they are in some way special. They really are not. The only way they are used is by the “roles” option when defining a task.\nWhen a task defines “roles” then it will only be called on the servers that have that role (see sample). Additionally, options can be added to the end of roles that act like flags that can later be used by tasks to filter. Armed with this understanding we can easily create arbitrary roles for use in our deployment steps.\nUnderstanding Tasks Capistrano uses a simplified rake-like syntax. If you are familiar with rake there are a few gotchas:\ntasks cannot take input, and tasks and namespaces cannot overlap. Both of these can be worked around however. 
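The “tasks cannot take input” limitation can be worked around with Capistrano variables fed from the environment instead of prompts. Below is a hedged sketch in the style of the sample deploy.rb above; `set` and `fetch` are real Capistrano 2 built-ins, but the `:tag` variable and the `:deploy_tag` task (and its path) are illustrative inventions, not part of any recipe.

```ruby
# Instead of prompting the user, read the value from a variable that can
# be supplied repeatably on the command line, e.g.:
#   TAG=v1.2 cap deploy_tag
set :tag, ENV['TAG'] || 'HEAD'

task :deploy_tag, :roles => :servers do
  # fetch(:tag) reads the variable set above; no interactive input needed.
  run "cd /var/app && git fetch && git checkout #{fetch(:tag)}"
end
```

Because the value comes from the environment rather than an interactive prompt, the exact same invocation can be replayed by a script or a CI job, which is the property user input destroys.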
Deployment tasks should never (NEVER) take user input. User input is a variable that cannot be repeated automatically. So, armed with the knowledge that if you “need” user input in deployment then you are doing something wrong, rework the system to not require user input and move on.\nThere is only one special task name that I have found: “:default”. If you make a task “:default” then it will simply become the task that is run when the namespace is called. It also takes itself out of the help list, so there is no task <namespace>:default.\nUnderstanding Flags Tasks provide the options “except” and “only” that work on role flags. In the sample there is a task :special that will only run for hosts with the “client” role that also have the :special flag set to true. The included deployment recipe adds :primary to the :db role to indicate that it is the only server on which migrations should be run. I currently have no use for flags, but it is nice to understand their behavior.\nCalling other tasks In rake it is possible to say that a task depends on other tasks. In Capistrano it is not. This is not a problem since task is a wrapper around a method definition. Basically you can call a task in the current namespace by using its name like a method. And you can call tasks in higher namespaces using “top”.\ntask :upload do\n  puts \"real upload\"\nend\n\nnamespace :child do\n  task :upload do\n    puts \"child upload\"\n  end\n\n  task :upload1 do\n    top.upload\n  end\nend\n$ cap upload\nreal upload\n$ cap child:upload\nchild upload\n$ cap child:upload1\nreal upload\nUnderstanding Connections In addition to being a DSL to define deployment tasks, Capistrano is also an SSH connection manager. Using the “run” or “sudo” function Capistrano will execute that string at a shell on each remote machine in turn. The gotcha is that each call to “run” or “sudo” opens an entirely new connection. 
Therefore if several things need to be done at once, use ; or && to chain commands in a single shell.\ntask :test do\n  run \"export TEST=foo && echo $TEST\"\n  run \"echo $TEST\"\nend\n$ with servers !test\n** [out :: host1] foo\n** [out :: host1]\n$ with client !test\n** [out :: host2] foo\n** [out :: host3] foo\n** [out :: host4] foo\n** [out :: host2]\n** [out :: host3]\n** [out :: host4]\nIn the above example the ENV variable “TEST” is set to foo, but since the second “run” gets a new shell the ENV is reset. Also, in the second example it is clear that each “run” is executed on every server before it moves on to the next “run”.","lastmodified":"2011-07-11T00:00:00-04:00","tags":null,"title":"Understand Capistrano Without Rails"},"/blog/20110908-software-hazard-pay":{"content":"The military has a concept of “hazard pay”, which is extra money because you are placing your life on the line. Software has something similar, though certainly not as permanent, where you are putting your socioeconomic well-being on the line.\nThe following is a list of things that I think are software hazards that should require more pay.\nDress-code Policy A policy is an explicit rule created to cover an anomaly. And when the anomaly is repaired the policy remains. Dress-code policies (“dress attire shall be no less than business casual”), like social-conduct policies (“you shall comport yourself in a professional manner”), are purposely vague, so as to leave management leeway and make enforcement discretionary.\nI am against these types of policies because they set up a confrontational attitude between management and workers. 
I personally do not know of any professional (sorry, but waiters and waitresses don’t count) that has been fired because of the enforcement of a dress-code policy, so I have to question the merits of explicitly stating one.\nLook at truly innovative technology companies and you will notice one thing lacking: a dress-code policy.\nSoftware IT Software engineers, the good ones anyway, are tinkerers. We have broken more than a few computers in our day, and almost always figure out why and fix it. The job of an IT department is to prevent employees from tinkering with computers, because it costs the company valuable effort.\nThese two departments can only co-exist as peers. In general, if IT provides APIs to standard services like email then software developers will never have to bother IT with questions. After all, software developers are usually overqualified IT.\nIf the company codes for the Linux platform, but software developers are required to use Windows machines because that is what IT can support, this is a sign of an imbalance between Software and IT.\nNo SVC Every programmer that has been out of college for more than a semester knows about Source Versioning Control systems. There isn’t even a debate anymore; they are essential for programming. A company without at least some type of SVC is scary.\nIf you know and understand Git then you can solve this problem for the company, but if the other engineers are not familiar with such systems there is going to be strong, and often emotional, opposition. Not something a professional should have to deal with.\nCommercial SVC There is no commercial SVC software out there that has the usefulness that Git does. There are a lot that fill in for the lack of usefulness by adding bloat, like IP protection. Commercial SVCs are always sold to detached CTOs by sales guys that hide their destructive nature.\nOnly a detached CTO would buy software without first letting the developers review it for usefulness. 
I do honestly think commercial SVC systems are destructive because they are always designed for an audience that is not software developers. At best this means that the developers adapt, but usually it means adding wildly complicated procedures to deal with the lack of a good flow.\nVSS is a great example of this failure. It only supports file locking, because that is easiest to implement. It also could not deal with merge conflicts, so external tools had to be purchased to deal with that. Also, if files were locked the nightly internal backup could not run, or it would create a corrupt backup. So you couldn’t keep files locked overnight. It didn’t support branches (as we know them today) so everyone worked on the same set of files, and locked each other out all the time. It was hell!\nNo CI This is the first thing I do before any code is written. I create an empty repo, put in a rake/make/cake file that simply has one step, “ci”, and attach that to a CI server. As soon as we know the details of the language, testing framework, coverage metrics, etc… I adjust “make ci” to do those things.\nIt is so simple to set up that if it isn’t there then it erodes my confidence in the entire project. Tests are the only reliable communicator of assumptions, and not having a system that actively enforces those assumptions will cause greater complications later.\nRequired IDE Detached management loves IDEs, because they speak in a language that managers understand: GUI. Sometimes they are useful and sometimes they are not. And most of the time they are bloated, slow, and expensive.\nAny editor that can shell out is an IDE. So if using a specific IDE is required then someone is forcing their bad ideas on you. The emacs vs. 
vi debate still rages because they are as powerful as (in some cases more so than) what we think of as IDEs (Xcode, Eclipse, etc.), but they both are IDEs, and neither requires a specific “project” file to manage a project.\nCowboys and “Heroes” Cowboy coders prefer to work alone, often thinking themselves and their code superior to other developers. If there are no code reviews, it is usually because of adamant opposition from these guys. Even if they lose the code review fight they will fight to be exempt from code reviews themselves.\nAll they do is hack some stuff together without thought of quality or maintainability and call it done. They are dangerous, and the quality of the system will suffer as the project gets too complicated to fit entirely within their own heads.\nA “hero” is a very loud cowboy. They are the software equivalent of “yes” men. They tell management what they want to hear, and often gain positions greater than their experience and expertise. Unfortunately they almost always spell doom for a project in one way or another. Sometimes it is an outright failure to produce a workable product, and sometimes it is a mass exodus of skilled workers.\nBoth cowboys and “heroes” have a habit of getting entrenched and are very hard to unseat, but the project will not survive them.\nCoding Silos For reasons beyond my understanding managers usually frown upon agile approaches like pair programming. It is seen as wasting company time and tantamount to theft. Instead they would prefer programmers to “just do their job”, which involves writing code and not much else.\nA vertical silo is where you are expected to do everything from the GUI to the database layer. On small projects this is expected and even wanted. But rarely (unless you are doing it on your own time) will a professional project be that small. When a person holds the entire stack in their head then the apparent usefulness of MVC (or similar patterns) breaks down.
This later leads to scaling problems.\nA horizontal silo is when there are many projects for the company and you maintain the same role on ALL of them. A major problem is in solving the wrong problems. For example: you are the DBA for all apps in the company. You have a very expensive Oracle DB and know it well, and have a bunch of homegrown management scripts. Therefore all web apps run on the same Oracle DB server because it is quicker and easier for you, and now there are two single points of failure (you and the DB). SQLite would have been good enough, but that isn’t how specialized people think.\nWhen all you have is a hammer, all problems look like nails - Barry Gruenberg.\nAnother problem with horizontal silos is stagnation. For example: a junior engineer wants to use a new distributed key/value store like Cassandra or CouchDB. You argue with management that a simple key/value table in the already sharded Oracle DB would be good enough and would save the company money in both hardware and effort. You might be right, or you might be wrong; but you are guaranteed not to be exposed to a different way of thinking. In the long run that simple act adds complacency and stagnation to large companies.\nThe only sustainable solution is a cross-functional team, where everyone has areas of strength and weakness but they work on the system as a whole.\nValue Based Estimates At a market-driven company it is not uncommon to see estimates done by those not qualified to do them (management). A marketing department will usually attach a proposed value to a given feature, or product. This automatically skews the estimate by placing an upper bound on it (the value minus required profit).\nIf the company communicates well then realistic estimates are given by developers, and if those estimates are higher than the projected value then the feature or product is not invested in.
In a marketing-driven company the realistic estimates are not done, or are ignored, and the project moves forward with the value estimate. This then adds instant legacy, as the correct implementation was not worth the effort, so a half-assed job was done instead.\nHopefully, the team at least invested in a means to fix past mistakes. But sometimes that isn’t a feature worth the effort.\nDetached Management Much of what has already been stated comes from the same source: detached management. A detached manager will often blame the developers for failures in software and procedures he/she is forcing on them. “A carpenter doesn’t blame his tools” is a common excuse. If a developer is lazy and has no interest in keeping up with the industry it is a valid excuse, but more often the manager just doesn’t want to hear that they did not fully think through the problem before forcing a solution.\nEarly in the game this looks like a manager that doesn’t show up to scrum calls because they are too busy.\nOverstaffing One of the easiest ways to make a budget look needed is to hire a bunch of staff. If there is an influx of new hires, especially if they appear unqualified or under-qualified, then it is possibly for budgetary reasons.\nThe unfortunate side effect of overstaffing (even if the staff is qualified) is laziness. More staff equals more cooks in the same amount of space, which in turn equals more confusion and less efficiency. This lack of efficiency leads to an imbalance of responsibilities, and therefore some will be overworked and others underworked. Underworked employees become complacent, and overworked ones become annoyed and will work less. This drives overall productivity down.\nAdding staff at any stage of a project makes the project take longer, and that will continue to be true until we learn to download and upload expertise.\nMarket Driven or Marketing Driven company Companies driven by a Marketing Department (however good) are reactive companies.
There is no vision of how things should be, or why they should be that way. It takes a visionary to reduce a company’s offering, and instead focus on quality. Marketing departments do not do this; they always add more.\nIn fact, there is a certain critical mass at which a marketing department exists to exist. This is the point at which the marketing budget becomes as big as (or bigger than) the R&D and Engineering departments, so they can convince users of the usefulness of the product (which it probably isn’t).\n“Putting out fires” Anytime I hear “putting out fires” I know the project is doomed. Things probably got this way gradually over time, but at this point there is too much technical debt to warrant moving forward. Projects that cannot move forward die.\nThe only way to prevent this situation is to head it off early by dealing with heroes before they cause trouble and keeping marketing in check.\n“Continuous Work” Anytime I hear someone from a company (usually marketing) mention continuous or unending work I get nervous. I understand that it is an attempt to make it seem like the job is ongoing and therefore safe, but I am not a factory worker. The stuff I work on should have a natural conclusion, and the knowledge I gained should then be used on another project.\n“Competitive Salary” Anytime I hear “Competitive Salary” I know three things:\nit is a lie, and by extension the company lies; they are geared toward recruiters, not the people being hired; they undervalue employees and will undervalue you. Huge red flags are raised.\nAgile Lip-service A lot of companies are flocking to Agile as an end-all be-all. But that is exactly what Agile is not.
I have detailed in Problems with Agile Implementation a number of bad implementations, and companies that think Agile is a silver bullet are likely to make some or all of them.","lastmodified":"2011-09-08T00:00:00-04:00","tags":null,"title":"Software Hazard Pay"},"/blog/20120309-factory-girl-automatic-tests":{"content":"Early in a project I started to use Factory Girl without fully understanding it. After many months of creating steps like Given /^(\d+) blog exists$/ and Given /^the following blogs exist:$/ I started to come up with generic functions that would build those steps.\nStupid me for not checking that Factory Girl already does something like that. All you have to do is include Factory Girl’s step_definitions file:\nrequire 'factory_girl/step_definitions' Once you start using FactoryGirl correctly there is a world of new features that can make your steps both cleaner and more concise. Here are some tips I have found via trial and error.\nTip 1: Do not reinvent the wheel Factory Girl will create steps for all factories that you have registered.\nclass Blog < ActiveRecord::Base end FactoryGirl.define do factory :blog do name 'Foo' end end Scenario: Showing example steps Given the following blogs exist: | name | | first blog | Given a blog exists with a name of \"first blog\" Given a blog exists Given 41 blogs exist Given 14 blogs exist with a description of \"Test\" Tip 2: Use associations, do not add more steps than are needed Factory Girl associations are automatically created before the factory is created, and they are automatically linked. It only supports the belongs_to behavior, so keep that in mind.\nUsing the “Given the following XXX exist” step you can define attributes, on the association, in the table. If we leave the association out then a default is created. If we define an attribute then it will be found or created using that attribute.\nLet’s say you have a product and it can belong to a category.
You do not need to create a category.\nclass Category < ActiveRecord::Base end class Product < ActiveRecord::Base belongs_to :category end FactoryGirl.define do factory :category do name 'Foo' end factory :product do association :category name 'Bar' end end Scenario: Bad Given a category exists with a name of \"Foo\" Given a product exists with a category of \"Foo\" Scenario: Good Given the following products exist: | name | category | | Foo | name: Bar | Tip 3: Attaching files via CarrierWave Since cucumber features are text files it doesn’t make much sense to define full files in steps. It also doesn’t really make sense to embed full file paths in the tests. Instead, you can use a transient attribute and some code so that in cucumber you define a file name and in the factory it converts to an actual file.\nclass Product < ActiveRecord::Base mount_uploader :file, FileUploader end FactoryGirl.define do factory :product do ignore do file_name 'small_image.png' end file { File.open(Rails.root.join(\"path/to/files\",file_name)) unless file_name.blank? } end end Scenario: Products Given a product exists # 1 product with an included file whose file_name was \"small_image.png\" Given the following products exist: | name | file name | | No image | | | Big image | big_image.png | Tip 4: Fixing a circular dependency between two models Let’s say you have a Store model and a User model, and a User can both work at and own a Store. If you put associations in both the User and the Store factories then each will try to create the other, infinitely. We can reuse the transient method as before to break the circle.\nThe trick is to avoid defining an association in both factories, but instead use a transient attribute in one factory to simulate the behavior of an association.
Also, since transient attributes are not likely to have the same level of sophistication as the associations, you should use the association to define the more complex of the two models.\nclass User < ActiveRecord::Base belongs_to :store has_many :stores, :foreign_key => 'owner_id' end class Store < ActiveRecord::Base belongs_to :owner, :class_name => 'User' end FactoryGirl.define do factory :user do name 'John Doe' ignore do store_name nil end store {Store.find_or_create_by_name(store_name || 'Test store')} end factory :store do name 'Test store' association :owner, :factory => :user end end Scenario: 2 users named 'John Doe', and 2 stores named \"Test store\" Given a user exists Given a store exists Scenario: 2 users both working at the same store Given the following users exist: | name | store name | | John Doe | Test store | | Jane Doe | Test store | Scenario: 2 stores both owned by the same person Given the following stores exist: | name | owner | | MD Store | name: John Doe | | CA Store | name: John Doe | Tip 5: Fixing a circular dependency within the same model Let’s say you have a Category model, and that model can belong to another Category (a tree); then you cannot use an association or you get the same infinite recursion issue as before. Here we can use a transient attribute along with an after_create hook to simulate the behavior we want.\nclass Category < ActiveRecord::Base acts_as_nested_set end FactoryGirl.define do factory :category do name 'Foo' ignore do parent nil end after_create do |category,attributes| unless attributes.parent.blank?
parent = Category.find_by_name(attributes.parent) category.move_to_child_of(parent) end end end end Scenario: Has no parents Given the following categories exist: | name | | Foo | | Bar | | Baz | Scenario: Nested tree Given the following categories exist: | name | parent | | Foo | | | Bar | Foo | | Baz | Bar | # + Foo # |+ Bar # |- Baz ","lastmodified":"2012-03-09T00:00:00-05:00","tags":null,"title":"Factory Girl Automatic Tests"},"/blog/20120519-the-everyone-elses-job-is-easy-paradox":{"content":"The “Everyone else’s job is easy” mindset is an easy trap to fall into and an almost impossible one to get out of. My point is best illustrated using the stereotypical Employee/IT relationship.\nThe stereotype from the Employee’s perspective is this: “I hate dealing with IT. They are a bunch of asses, and never do things right, and my computer is always worse after they leave than it was with the problem that I called them about. What is their problem? It is their f’n job to keep these computers working so that I can do my job. I have no interest in working this weekend because IT was not able to get my computer fixed in a reasonable amount of time. Whenever I walk by their desk they are always goofing off; if they only did their job I could do mine. How hard could it be?”\nThe stereotype from the IT’s perspective is this: “I have to fix everyone’s stupid computer. If only they would read that wiki article I emailed about the proper way to do this thing then I wouldn’t have to keep fixing things. I tried to prevent these kinds of issues by not giving admin rights to users, but they always need to install some piece of crap software, and they won’t take the time to learn to take responsibility. After all, with great power comes great responsibility, and I see none of it. In addition to keeping everyone’s computer working all the damn time I also have to keep all the internal and external services running.
Besides, most of the time these people are on the phone chit-chatting, or standing around the water cooler; clearly their job is easy enough for a monkey, so why should I rush?”\nThe problem It was a pretty extreme example, but I bet that everyone has acted this way toward another individual at least once. How could you not? After all, when tensions are high and emotions are frayed, devaluation of others is a natural defense. And though it is a natural defense it is certainly not an acceptable one.\nI am certainly no psychologist, but here are some ways I have found to prevent the escalation.\n1. Understand how hard everyone’s job actually is In terms of difficulty, both mental and physical, here are the jobs in order from most difficult to least.\nBering Strait fishermen Any other fishermen Logger Political leader/Dictator Fugitive Managers (above you in the company) Everyone else’s job Your job Yep, once you understand that more than likely the person that you are dealing with has a more difficult job than you do, it is easier to compromise.\n2. Humanize This is hands down the hardest thing to do when nerves are frayed, but remember the other person in the argument is also a human.\n3. Eliminate Cognitive load Cognitive load is simply the amount of mental work that must be processed before learning can be accomplished. For example, it is not uncommon for requests between departments to have to be done in a formal way, like through a task-tracking system. This is completely unacceptable.\nIf communication between departments has to be done in a formal way then the department HAS to consider every other department as a customer. Would you, as a customer of Comcast, think that it is acceptable to have to log in to a Comcast support portal in order to request them to fix your TV?
No, you would call them, or email them, or do something else that is convenient to you, and you would let Comcast deal with formalizing the request in a way useful to them.\nThe same has to be true between departments. If you are in IT and you have to formally use Autotask, or Zendesk, or anything else, then YOU MUST also find a way to get the requests into that system. Other departments have to be able to call, email, or walk over to YOU and make requests.\nBut at the same time, other departments are as busy as you, so you have to remember that once asked, turnaround can take a while.\n4. Detail the difficulty If the other person doesn’t know a lot about all your responsibilities then they are more likely to think of things as easy. And most of the time everything that is being asked of you is easy, because it is your job and you are good at your job; right?\nThe problem is usually more likely the amount of stuff you have to do before you can do this one other “easy” thing. At one company I worked for we liked to call this type of situation a marketing fire drill. Marketing would have a client that promised to buy 200 widgets every year if only the widget had some feature. But we had 20 other features that had to be implemented before the new feature could be started.\nSo in this case the difficulty is prioritizing all the “easy” features. Especially since each marketing person was responsible for their own projects and did not have a company-wide view of things.\n5. Be concise, but be human Oftentimes people see conciseness as talking about work and nothing else. But in my experience more time is wasted talking about work topics than for any other reason. I don’t mean that work topics are a waste of time, but how many times have there been several hours of talk when 15 minutes of actual investigation would have eliminated the need for talk?\nWhat I mean about being concise is to only add information if it is relevant.
“Because it is always done this way” is not relevant, no matter what you might think. So unless you have specific experience, keep it to yourself. Also, raise any relevant questions, and take any relevant advice.\nBy eliminating the work-conversation waste you can and should spend time talking about personal things. I have always been a huge fan of eating lunch away from the office with co-workers. It is the best way to respect people and be respected. And by humanizing, barriers can be eliminated.","lastmodified":"2012-05-19T00:00:00-04:00","tags":null,"title":"The \"Everyone Else's Job Is Easy\" Paradox"},"/blog/20120520-the-problem-with-best-practices":{"content":"I hate the term “Best Practices” for two important reasons. First, in an attempt to be concise they eliminate the most important information: the reasoning. They are often just the call-to-action statement, and they are often passed down as policy. Some best practices are good practices (for example, the pre-flight checklist) when applied to the correct situation. But without the reasoning statement, it is hard to tell if the practice can be applied to other situations. If I had to apply a pre-flight checklist before starting to code then I would waste a huge amount of time.\nSecond, best practices eliminate critical thinking. With a misnomer like “Best Practices” it is not surprising that they are not questioned. And in most situations questioning a “Best Practice” is taken as an affront to the person/department that introduced the Best Practice. I am not sure why this is, because very rarely is a Best Practice the best practice for every situation. But since they defy questioning, once instituted they become technical debt when they outlive their usefulness.\nInstead of “Best Practices” I prefer “Pro Tips.” But the name itself is not as important as how it is presented.
In order for me to consider something a “Pro Tip” it must have three parts: a list of benefits, a list of drawbacks, and a description of the behavior.\nA best practice: Always back up your work using git I do not think that any software engineer is going to argue that backing up work is a bad thing. But by phrasing it this way the problem becomes that “git” is an afterthought and therefore an annoyance.\nA Pro Tip: Use git as a journal Using git as a backup mechanism is a good thing. But it is better to use git to journal what you are doing. So instead of working for an entire day, committing everything at a single time, and then pushing everything at once, you should do a single complete thing, and commit that. Usually commits happen every 5 to 10 minutes, but they are generally self-contained.\nBenefits:\nYour work is backed up You and others can see your train of thought You can return to any point and attempt other options You eliminate the fear of trying new things Focused work Drawbacks:\nYou have to know git to make the most of it You have to take the time to make smaller commits You have to push more often You might have to start branching in order to keep organized Difficult for messy thinkers (tinkerers) See: a pro tip is a bit longer, and more detailed, but at the same time it is more convincing. The reader can then choose when and how to try the tip, and is also free to adapt the tip to their given situation. For example: a developer is going to use git differently than a designer; but git could be useful for both.","lastmodified":"2012-05-20T00:00:00-04:00","tags":null,"title":"The Problem With Best Practices"},"/blog/20120911-testing-rails-in-ie-through-pow":{"content":"My Problem I can’t test in IE (but the client wants it to work in IE 7 and IE 8). So I have to fool things.\nNote\nPow is now defunct. Services like nip.io or sslip.io can be used instead.
Just replace *.myhost.dev with *.nip.io and it will work as expected.\nMy Setup A Mac - because I need it to work the first time, every time Rails RVM - because you should be using it PostgreSQL - it is better than MySQL and actually easier to set up after you do it once. Pow - because it is small, spins up the Rails server when you need it, and the URL looks like http://myhost.dev, not http://127.0.0.1:3000 (some browsers will attempt to look up “localhost” via Google search, before asking DNS) TDD Cucumber - because it is easier to read integration-level testing Rspec - because I like it (it both backs cucumber, and is also the unit testing for models and libraries) Capybara Capybara-webkit - so the browser is opened headless and doesn’t interrupt other work My Solution First do everything as normal, since you will be developing faster, and only at the END worry about IE. This is counter to a lot of thinking, but I have found that if you stick to good TDD and are testing using an actual opened browser then the only issues you will end up with are CSS-related issues.\nBefore you start Since it is likely that you will only have CSS issues, it is a requirement that you make the main CSS work on HTML5-compliant browsers. And add exceptions for the others.\n<%= stylesheet_link_tag 'real' %> <!--[if IE]> <%= stylesheet_link_tag 'ie' %> <![endif]--> <!--[if IE 8]> <%= stylesheet_link_tag 'ie8' %> <![endif]--> <!--[if IE 7]> <%= stylesheet_link_tag 'ie7' %> <![endif]--> Use a Windows VM Go to https://github.com/xdissent/ievms and install the various IE VMs. I have not had luck with IE6 and IE7, but IE8 works fine.\ncurl -s https://raw.github.com/xdissent/ievms/master/ievms.sh | IEVMS_VERSIONS=\"8\" bash If it fails, just keep rerunning it, or try a different version.\nBoot and wait for driver detection On first boot I usually go in as Admin and let Windows detect everything and install all the needed components.
This isn’t a genuine version of Windows, so after 30 days you will be locked out, at which point you just need to revert to the “clean” snapshot.\nInstall the OS extensions For some reason the CD doesn’t autoplay for me. Just enter d: in the Start Menu -> Search and then select the installer.\nAdd Fake DNS Pow munges DNS on the Mac so that you don’t have to use “localhost”. I actually really like this about Pow and so will keep using it. The only thing that you need to do is edit the Windows hosts file and add the fake DNS entries there as well.\nLog in as the IEUser Start -> Search -> “Notepad” Right-click and select “Open as administrator” File -> Open Change the URL to “C:\WINDOWS\system32\drivers\etc\hosts” Add the following (change the host names as you need) “10.0.2.2 myhost.dev” “10.0.2.2 myhost1.dev” Now open http://myhost.dev in IE and it will work.","lastmodified":"2025-08-09T00:00:00-04:00","tags":null,"title":"Testing Rails in IE Through Pow"},"/blog/20140410-code-does-rust":{"content":"Fourteen years ago Joel Spolsky wrote an article entitled “Netscape Goes Bonkers”. In that article he states that “old software doesn’t rust”. The rest of the article is good, but that statement is “off”.\nTo be clear, as a direct comparison software contains no metal to oxidize and therefore cannot actually “rust”. But as an analogy, over time untouched software will slowly degrade and eventually stop working. So the corrected statement should have been “[untouched] old software rust[s].”\nRecently, some login code I wrote a few years ago magically stopped working on Firefox. This functionality continues to work on all other major browsers. However, due to the fact that Firefox decided to change how it handled cookies, my software is now slightly less capable of performing as expected.\nAs a result, instead of making further progress on a new feature I am forced to take a moment to clean and fix the rust spot.
Not a challenging fix and not indicative of a fundamental design problem, but rather an annoying little issue which needs to be addressed. Of course, the big-picture view is that, over the years, I’ve had to deal with hundreds of problems just like this one. Meaning that when I don’t closely maintain a project’s codebase and adapt it to dependency updates, its performance and functionality are diminished. Very similarly to how an unmaintained metal surface rusts.\nThis leads me to believe that software does in fact rust.\nThe solution? I’ve found that there is no substitute for taking the proactive approach and resolving these problems early on, because eventually the rust will become too significant and lead to the software being scrapped altogether.\nImportant\nIdeas age like fine wine, but software rusts.\nHow to slow the problem There are different ways to solve the problem depending on the situation.\nUse someone else’s Stable Interface Any time there is an interface between two systems there is corrosion. This is as true in software as it is in the real world. Tires exist because road surfaces chew up (corrode) anything that slides across them. The tire is the car’s interface to the road. It is designed to be easily replaced when it wears out. The same method works for software.\nIf the interface between two systems is highly corrosive (constantly changing) then the best interface is someone else’s. For example, Heroku is going to be a better interface to “cloud” hosting than Amazon. Amazon is infrastructure in the cloud; basically the road. Heroku is web hosting in the cloud and uses Amazon as a base; basically the tire. So if all you want to spend your time on is building the car, then use Heroku as the tires.\nUse a standards-based API If the interface between two systems is only slightly corrosive then add a standardized “socket” to protect yourself.
Cars don’t produce enough point-heat to light a cigarette, and instead of piping the 800+ degree exhaust into the cabin or giving unprotected access to the car’s battery, the car manufacturers introduced a socket that could power a heating coil. The socket protects the car, and provides a standard interface. And by being standardized, anyone (not just the manufacturer) can create an adaptor to fit the socket.\nAn Application Programming Interface (API) is the software equivalent of a standardized socket. Any place your system needs to be accessed, simply create an API, even if you control both ends. Now your tests can focus on ensuring an unchanging API, to catch any wear that needs to be addressed.\nIgnore it until it fails If the interface is non-corrosive then test for wear. Many systems “guarantee backwards compatibility” (at least until it breaks the first time). This is the software equivalent of a well-lubricated non-corrosive interface. It is still not immune to corrosion, but you don’t (and shouldn’t) actively protect yourself. Instead, add some once-in-a-while checks. Cars usually get a 50K-mile service to check for these low-wear areas. Do something similar with your software.\nOf the millions of cars that get a 50K-mile service a small percentage will have a catastrophic failure, where one of those non-corrosive interfaces corroded. The same can happen in your software, but the cost of constantly checking those parts is far greater than any savings gained by not letting it fail. It is better to follow good practices (like modular design, and not cutting corners) than it is to search for failures everywhere all the time.\nFail Fast… If the interface is solid and shouldn’t fail, then simply fail to launch. For example: Cars need engines; that is a solid interface and a hard requirement, and without one you go nowhere. And if an engine dies while it is running, there is the expectation that the car will not be able to propel itself anymore.
We might be surprised that an engine breaks, but we are not surprised when a broken engine stops a car. Web servers often need databases and network connections. So code that binds to a port or connects to the database should allow the app to fully fail.\n…With a retry Software and hardware fail so often that real software has to assume the previous exit was unclean, so it has to start by checking for and performing cleanup. This is the major reason that a full retry is often the best solution.\nAlso, because retries are so common they are usually someone else’s Stable Interface that can be used.\nAside: Degrading Software And now I hear you saying “Woah, apps should degrade gracefully.” Honestly, they shouldn’t, at least not self-degrade. You would never drive a car, get a flat, and expect the car to change its own tire. No, you pull over, install the spare tire, and at a degraded level drive slowly to a tire shop to have it fixed (at least that is what you should do). But the car did not degrade itself. You, as the driver, are expected to make that choice for the car.\nThe same is true in software. The software should not degrade itself. There should be a watchdog for your software which periodically checks to see if it is alive. If during one of the checks the software is found dead then it should be resurrected. If it suffers SIDS then the watchdog notifies someone; otherwise it is business as usual. To be a good citizen, your software should play nicely with the watchdog.\nThe Half-dead problem Permanently-half-dead is the common state in degrading software where your software has one or more forms of degradation and, as a result, it cannot self-correct. It will never die because it is degraded, but it can never perform fully either.\nThis is the worst possible state of software because the issue is, by definition, unclear. The software is running, or appears to be, but isn’t doing what it is supposed to.
This is not a testable state, so there is no amount of QA that can be performed to prove the issue is resolved. This is not a place you want your software to be able to get into, so don’t.\nAside: Defensive Software As a further aside, people often confuse degrading software with defensive software. But to clarify, defensive software is only concerned with preventing bugs due to unforeseen usage. Things like swapping direct memory manipulation with a memory manager, and code reviews, and testing are defensive. Defensive software can and does terminate before it can degrade into doing something foolish. Take for example:\nclass DegradedUser def name=(name) # If the user provides too much data, # ignore their wishes, do what we want, # and don't tell them. # # This is degraded, and bad @name = name.to_s[0...16] end end class DefensiveUser def name=(name) # If the user provides too much data, # tell them immediately and fail to continue # # This is defensive, and good (though annoying to the user) raise 'name is too long' if name.length > 16 @name = name end end ","lastmodified":"2014-04-10T00:00:00-04:00","tags":null,"title":"Code Does Rust"},"/blog/20140622-google-sheets-query-language":{"content":"A while back, my wife and I started keeping a budget. We needed something very easy that shows us where we are at every moment. Also, to ensure that it is not something we “forget” it must be something that we manually enter.\nI created a Google Form in order to allow us to capture the receipts. The form dumps into a Google Spreadsheet. I then use a Pivot Table and the Google Query Language to create a Chart. In this post I will cover the entire process.\nStep 1: Create a form Create a google form Add the following fields “Company” - text - required “Date” - date - required (don’t include time) “Amount” - text - required (add a “number” validation) “Category” - list - required. 
We added the following Other / Unknown Baby Supplies Car Entertainment Gas Groceries Home Medical Pet Supplies Restaurants / Fast Food “Comment” - paragraph Choose response destination Choose a “New Spreadsheet” Send the form to yourself and anyone else that needs to enter receipts At this point you have a Form which submits to a Spreadsheet. I recommend bookmarking the link in your smart-phone so that it is easy and quick to add receipts right after your purchases.\nStep 2: Pivot Form usually records into a sheet called “Form Responses” which I assume here.\nSelect “Form Responses” Select Data -> Pivot table report… Rows - Add “Category” Values - Add “Amount” Now you should have a two-column table. On the left are the categories. On the right is the sum of all the values of that category.\nStep 3: Google Query In order to chart the budget vs the actual spending we need to create another table.\nInsert a sheet named Budget. Label the columns: Category, Budget, Actual, Query Copy all the categories to column A Add the budgeted amount to column B Add the following to column C =if(isna(D2), 0, D2) Add the following query to column D =QUERY('Pivot Table'!A:B, \"select B where A='\"&A2&\"'\",\"\") #N/A means there are no receipts for the category and can safely be ignored. Copy and paste cells C2 and D2 to the rest of the cells in their columns Google will change the internal references (A2 and D2) to the correct cell name, so you don’t have to. The Google Query Language is defined here. It is a good read to see all the power of this language, but I am only going to explain the parts that we need.\nQuery QUERY takes three arguments: range of values, query string, and optional headers. I am going to explain them in reverse.\nThe headers are guessed if nothing is provided. This would cause the query to take two cells, which is not the behavior I wanted. By adding \"\" it removes the header.\nThe query string tells Google what data we are selecting into the cell. 
In our simple example it is a direct value select using a conditional. This is because the column order may not be the same in both sheets. “select B” means to choose column “B” from whatever rows match the query. \"where A='\"&A2&\"'\" means to limit the rows returned to those where the value of cell A matches the value of A2 in this sheet. The “&” is the string concat operator.\nThe range of values tells Google what it is allowed to look at. We use the ‘Sheet’!Col:Col form in order to select data from another sheet. We only provide the columns A and B because we want to look at all rows.\nISNA Charts cannot deal with non-number columns. Since the query can produce a non-number output (#N/A) we need to add an additional level of processing.\nisna takes a cell and returns whether that cell is #N/A.\nif takes a boolean, a true value, and a false value. If the first argument is true then the true value is returned. If the first argument is false then the false value is returned.\nStep 4: Chart Charts can only take numbers and they can only accept contiguous cells. A, B, and C are the columns that we want to chart.\nInsert -> Chart Data range: Budget!A1:C14 Use row 1 as headers Chart type: Bar chart Add a chart title Step 5: Publish The point of this document is to know quickly where your money is going. In order to make that easy, publish the document. This will make Google convert the document into an HTML version which is easily viewed on your smart phone. The chart will even be converted to an image.\nFile -> Publish to the web… Check “Automatically republish when changes are made” Copy the link Send the link to anyone that needs to be kept informed about the budget 
In a previous post I detailed how I added this feature back to Octopress. Here I will show you a little rake task to easily publish an unpublished post.\nRequirements For a post to be published I wanted a few things to happen:\npublished: true was set in the YAML front-matter date: <today's date> was set in the YAML front-matter The file was moved to today. Here is what I came up with for my Rakefile.\n1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 desc \"Publishes an unpublished entry by changing its name, updating its internal timestamp, and setting published: true\" task :publish, :pattern do |t, args| require 'tempfile' require 'fileutils' files = Dir[File.join(source_dir,posts_dir,args[:pattern])] raise \"No files found matching pattern\" if files.size == 0 raise \"Too many files match pattern\" if files.size > 1 name_time = Time.now.strftime('%Y-%m-%d') file_time = Time.now.strftime('%Y-%m-%d %H:%M') file = files.first name = File.basename file new_name = name.gsub(/\\\\\\\\d+-\\\\\\\\d+-\\\\\\\\d+/,name_time) w = Tempfile.new('publish') File.open file do |f| scan = false f.readlines.each do |line| # limit scanning to the YAML front matter scan = !scan if line == \"---\\\\\\\\n\" unless scan w.write line next end line.gsub!(/published.*/,\"published: true\") line.gsub!(/date.*/, \"date: #{file_time}\") w.write line end end w.close FileUtils.rm file FileUtils.mv w.path, File.join(source_dir,posts_dir,new_name) end Line 6: I take an argument and get a list of files matching the pattern Lines 7 & 8: I found it was too easy to screw up a pattern and publish too few or too many posts. For that reason I am explicit about the failure. Line 15: Calculates its name for today. Lines 16 & 37: Generate a temp file to write to. I found that bad things happened if you published a post that you wrote the same day. The solution was to write to a temp file and then move the temp file later. 
Lines 19 & 22: An inelegant solution to isolating scanning to only the YAML front-matter. Lines 28 & 29: Update the YAML front-matter. ","lastmodified":"2014-06-23T00:00:00-04:00","tags":null,"title":"Rake Publish"},"/blog/20140802-bats":{"content":"Shell scripting is a great tool, but rarely is it tested. Enter BATS! In this post I will give a quick tutorial on how to use it to test scripts.\n#!/usr/bin/env bats @test \"running a command\" { run foogrep \"bar\" foo_file [ \"$status\" -eq 1 ] [ \"$output\" = \"1: bar baz\" ] } Note\nShameless plug: Before we start, I recommend downloading the language grammar package - language-bats - for the Atom editor.\nIf you have never used RSpec or another testing framework the idea is simple: your code is run against expectations and if those are met then the tests pass. The framework deals with the heavy lifting of executing the tests, printing the results, and providing the correct interface to Continuous Integration servers.\nBATS is a test runner for Bash scripts. Before each run BATS takes the file and splits each test into its own file. BATS then runs each test file to see if it passes or fails. Anything you can do in Bash you can do in BATS, and if any command fails then the entire test fails.\nA Basic test The BATS syntax for a test is @test \"desc\" {}. But if you want to run the file individually you should add the shebang line. The simplest test looks something like this:\n#!/usr/bin/env bats @test \"something\" { false } This isn’t very useful, but it will generate a failing test.\nSkipping tests Sometimes it is useful to skip a test. Just add skip at the point you want the test to be skipped. 
You can add a description or not.\n@test \"just skip\" { skip } @test \"skip for a reason\" { if [ \"$x\" == \"foo\" ]; then skip \"Because of foo\" fi # more tests } Running a command Bash doesn’t let you return strings from functions, so if you are trying to capture output and status then you have to roll your own, or use run. run returns the command’s output in $output, and its exit code in $status. This makes testing on output and status easier.\n@test \"check output and status\" { run echo_foo [ \"$status\" == \"0\" ] [ \"$output\" == \"foo\" ] } Hooks Sometimes multiple tests need to share the same state. In testing every test should stand on its own and leave no artifacts. To accomplish this we can use the setup and teardown hooks.","lastmodified":"2014-08-02T00:00:00-04:00","tags":null,"title":"BATS"},"/blog/20140917-ruby-sucks-kind-of":{"content":"Ruby sucks! Kind of!\nOk, well not really. Not even a little. But there seems to be a misconception about what ruby is. I hope to clarify some things by first comparing it to other languages, then by ripping it apart in a constructive way.\nEvery computer language serves to let humans control computers, and nothing more. Every language creator chooses an abstraction level that they feel fits with their needs. And in the end every computer language creates strings of 0s and 1s.\nHere is a high-level summary of some languages that I use:\nAssembly - Turn computer instructions into easier-to-remember words C - A set of preprocessors on top of assembly C++ - An Object abstraction on top of C Java - An Everything is an Object attempt Javascript - Scheme for the browser Python - It’s academic Go - Easy concurrency Ruby - Reads like prose Assembly For those few of you that were alive in the time before assembly, the solution was direct bit manipulation. This was done by manually flipping switches that represented bits or feeding punch cards. 
Assembly was a huge improvement.\nInstead of memorizing machine instructions and then having to translate to and from binary one could use a simplified language. The downside is that every CPU manufacturer had their own language set.\nC Given that every CPU had its own Assembly language, code could not be ported to multiple computers. Enter C, the third attempt to create a portable language. A and B came before C, but no one uses them or even references them anymore.\nThe syntax and two-pass compiler design has become the standard for most “compiled” languages.\nC++ After many years of success, someone had the bright idea that complexity can be reduced by adding objects and encapsulating details. C++ was born using the same syntax as C, but with OOP. Early versions of C++ were just a preprocessor which created C code.\nAs the techniques of OOP became solidified the C++ compiler became a compiler in its own right. However, unlike C, most C++ compilers are not bootstrapped (meaning they are not written in the same language as they compile).\nJava Java was born out of Sun Microsystems. It was originally intended to be an embedded language. The idea being that you would install the JVM as the device’s OS and your program would remain the same no matter what hardware was running. In the end this concept was years ahead of its time, but ultimately a failure of its original goal due to bloat.\nIt was also one of the first languages that forced classical OOP with its “everything is an object” approach. Unfortunately, for performance reasons the language still contained primitives and other basic non-OOP constructs, making its choice to force the “main” method to be a callback of an arbitrary object kind of ridiculous.\nAfter the embedded approach did not pan out Sun went after the web. During this time they also split the language from the JVM, published the bytecode spec, and picked up XML as its preferred configuration format. 
The verbosity of XML seemed to fit the verbosity of Java as a whole.\nThe Java ecosystem has had years of turmoil due to Oracle’s purchase of Sun Microsystems and its dislike of OSS. After years of litigation things have stabilized.\nJavascript Javascript was born in Netscape to allow HTML to be programmed. It uses prototypical inheritance instead of classical inheritance. Its language is based on C++ with concepts from several other languages. Javascript has a lot of advanced concepts which many of its developers did not know how to properly utilize, giving it an undeserved bad name.\nA few well-constructed libraries and the advent of JSLint made it clear what good code was and eliminated much of the early pain.\nℹ️ For many years I thought “classical inheritance” was called that because it was the classic / old form (like “classical music”), which gave it an air of correctness. That is actually not true at all. Classical - in this case - comes from “of or relating to class hierarchy”. Python Python was born in academia. It has arbitrary-precision arithmetic and a clean syntax. About the only ding is that the cleanliness of the parser is more important than the expressiveness of the language. Other than its non-expressive syntax it is a first-rate language.\nGo Go is a newcomer to the language scene. It aims to make concurrent programming easy. It is OOP but has a C-like syntax, and is also opinionated about the syntax. So opinionated that they have a go fmt program which will rewrite files using the expected indentation and alignment.\nIt is clearly a “new” language, and lacks some fairly common features of other languages like versioned dependencies. But it has an active community of caring people who are solving these issues.\nRuby Ruby was created to be an expressive language that was a joy to work with. Unfortunately this has meant that much of the interpreter is nearly impossible to understand. 
The upside is that unlike any language before it, it can act as its own DSL.\nRake, for example, is ruby’s version of Make. While a Makefile is a special text format that the make program parses, a Rakefile is just ruby code. Adding require 'rake' to a ruby file simply causes it to load rake.rb, which adds functions that make defining tasks and dependencies like writing English.\nAs promised this is the point where I list ruby’s weaknesses.\nRuby is too magic The ruby interpreter is not a point-in-time interpreter, like python or Java. It is a dynamic re-interpreter like LISP. There is no first parsing pass for calculating and caching the structural elements of the program and then a second runtime pass.\nInstead, it is a single pass, building the structure as it goes. This has the added benefit of allowing you to dynamically add, redefine, or remove structure like classes or methods on classes. However, this can be and is very confusing for those unsure of how that structure was built.\nThis “feature” is what causes ruby to appear like magic. The solution requires a full understanding of all dependencies that are loaded.\nAny class can be changed at any time, which is scary This is just the “ruby is too magic” argument under a different name.\nYes, ActiveSupport from Rails adds a lot of niceties like 1.day.ago to ruby. Generally this is more useful than not.\nIt’s nothing but bloat It is true that ruby programs are larger than C programs, but they are no larger than most other interpreted languages.\nIt’s slow With the speed increases in the latest version of Ruby it is faster than both PHP and Python now. And JRuby runs at nearly C speeds. So it is fair to say that in an apples-to-apples comparison ruby is no slower than any other interpreted language.\nThe GIL is slow and prevents concurrent programming Not really true. The Global Interpreter Lock (GIL) gets a lot of flack from those that haven’t actually looked at the code, but it is pretty smart. 
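Before digging into why, here is a quick sketch you can run yourself. This is my own illustration (not code from the interpreter), using sleep to stand in for IO-bound work, since sleeping threads give up the lock:

```ruby
# Five MRI threads each "wait on IO" (simulated with sleep).
# Because a sleeping thread releases the lock, the total wall
# time is roughly one second, not five.
start = Time.now

threads = 5.times.map do
  Thread.new { sleep 1 }  # the lock is released while a thread sleeps
end
threads.each(&:join)

elapsed = Time.now - start
puts elapsed.round(1)  # roughly 1.0 on MRI, not 5.0
```

If the GIL really serialized everything, this would take five seconds.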
The GIL gets locked anytime there is IO and unlocked anytime a thread is about to sleep. For example: let’s say you have a loop and in the loop you spawn a thread and do a web request to http://google.com. The first thread would build an IO request to a socket, locking the GIL, but once the request hit the wire the thread would sleep, releasing the GIL, and the loop would continue.\nJust like any semaphore locking, if you do too much while the GIL is locked then your app is slow. If you are smart about how you chunk your code Ruby’s GIL is actually faster and simpler than Python’s threading system.\nThey change the syntax every release There have been only a relative few releases which actively broke an existing syntax, and in those cases it was because the original syntax was flawed.\nIt doesn’t work on Windows Use Cygwin, or a Linux VM.\nThere is no self-installer, or GUI builder Rubyists are command-line junkies. There are several ways to install ruby for every platform.\nRuby doesn’t really make much sense as a GUI application. However, there are several packages that aim to make Ruby the glue language. I leave it as an exercise for the reader to find them, but while you are looking you may also want to check out a language called lua.\nIt’s good for Rails, but not much else Ruby on Rails is the keystone framework, and the reason why most use it. However, both puppet and chef (deployment tools) are written in it.\nRails is huge and slow Yeah, and getting bigger and slower every release. That is common as frameworks grow, but guess what, you don’t need all the bells and whistles. While getting bigger Rails is also getting more component-ized, allowing you to exclude the stuff you don’t need.\nConclusion I hope by this point you get the sense that no language is bad per se. Each language starts with a set of assumptions which inform a design. In turn that design defines the language, which in turn defines what is easy and what is hard. 
You have to pick the right tool for the job. And each language represents a single tool.","lastmodified":"2014-09-17T00:00:00-04:00","tags":null,"title":"Ruby Sucks... Kind Of"},"/blog/20140921-using-amperstand-instead-of-if":{"content":"You can use && to perform a logical if. And there are a few reasons it may be better to use &&.\nIf I were a Computer Scientist I might pull out logic maps or Turing completeness or do a mathematical proof. If I were a Computer Architect I might argue that I do not need to prove anything and you should trust my experience. Luckily I am a Software Engineer, so I will prove my point with tests.\nNote\nThe following is written in ruby, but it will work in any Turing-complete language with operator precedence. Just remember in Ruby that the last statement is returned.\nLet’s take if a then b. The entire purpose of the if then is to only execute b if a is true. If I were to write it using tests I might do it this way:\nrequire 'spec_helper' describe \"if then\" do class Test def a; end def b; end def if_then if a b end end end let(:t) {Test.new} it 'executes a' do expect(t).to receive(:a).and_return false t.if_then end it 'executes b if a is true' do expect(t).to receive(:a).and_return true expect(t).to receive(:b) t.if_then end it 'does not execute b if a is false' do expect(t).to receive(:a).and_return false expect(t).not_to receive(:b) t.if_then end end These tests pass. So, I will make no change to the tests. But, I will refactor the Test class to use &&:\nclass Test def if_then a && b end end These tests also pass! And you are probably thinking that I duped you somehow. Let me explain why this works.\nLogical and (&&) and logical or (||) can both be short-circuited; meaning that if a certain condition is met they can immediately return a value without needing to execute more statements. 
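You can watch the short circuit happen with a tiny sketch of my own (hypothetical lambdas that record which side actually ran):

```ruby
# Record every side that executes; the right-hand side of &&
# only runs when the left-hand side returns true.
calls = []

a = ->(value) { calls << :a; value }
b = ->        { calls << :b; true  }

a.call(false) && b.call  # b never runs; false short-circuits &&
a.call(true)  && b.call  # b runs this time

puts calls.inspect  # [:a, :a, :b]
```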
For and, if any value is false then the entire statement is false. So the first time the program sees a false value it can return. For or, the first true causes true to be returned.\nThe actual execution for a && b is as follows:\nExecute a if a is false return false if a is true Execute b If you look carefully that is identical to the execution path of if a then b, which is why all the previous tests pass without modification.\nif not then Just like && maps to if then, || maps to if not then or, in some languages like ruby, unless then. I leave it as an exercise for the reader to write the tests, but the code is as follows:\nclass Test def if_not_then a || b end end if then else else is just the if not case. Since || is equivalent to if not we can chain it after &&.\nclass Test def if_then_else # if a then b else c a && b || c end end Why is this useful? For some reason many languages can execute && and || a lot faster than if then else, but I very rarely consider performance a good excuse for crappy-looking code. I have some simple reasons to use && instead of if:\n1. You are chaining actions If the things that you are chaining are actions being performed and those actions return whether they were successful then it often reads better as &&.\n# bad paint_it_black if find_a_door # good find_a_door && paint_it_black 2. You are likely to chain additional items Once you nest logic then readability goes out the window. Using && helps.\n# bad if column_a_is_a_string if column_b_is_a_number if not column_c_is_a_boolean raise 'Bad' end end end # good column_a_is_a_string && column_b_is_a_number && column_c_is_a_boolean || raise('Bad') 3. You don’t know how many items you need to chain Sometimes you need to parse a file of conditionals, or will be given a list of conditionals. 
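As a sketch (with hypothetical check names), an arbitrary list of conditionals can be folded together with &&, exactly as if you had written c1 && c2 && c3 by hand:

```ruby
# Each lambda stands in for one conditional loaded from a file or list.
checks = [
  -> { true  },  # e.g. column_a_is_a_string
  -> { true  },  # e.g. column_b_is_a_number
  -> { false },  # e.g. column_c_is_a_boolean
]

# reduce chains them with &&; once one check is false the later
# check.call is short-circuited away, just like a hand-written chain.
result = checks.reduce(true) { |ok, check| ok && check.call }
puts result  # false, because the last check failed
```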
Most of the time you cannot ensure that there are only two items on the list, so the if then contract is not useful, but variants of && and || will work.","lastmodified":"2014-09-21T00:00:00-04:00","tags":null,"title":"Using `&&` Instead of `if`"},"/blog/20140929-golang-stream-file":{"content":"Go (golang) is a highly concurrent language. But more than that, it is a simple language built from modular components strung together in useful ways. This modularity has led me to play around a bit, and one of the things that I found was an easy way to stream a file.\nAt first I was very unhappy that Go did not support the default compression provided by the xz tool. At work, we use xz to stream-compress very large amounts of data. But since Go had no native support I thought that it would be a lost cause.\nTurns out there is no native support because there doesn’t have to be. Below is the basic code.\nfunc xzReader(r io.Reader) io.ReadCloser { rpipe, wpipe := io.Pipe() cmd := exec.Command(\"xz\", \"--decompress\", \"--stdout\") cmd.Stdin = r cmd.Stdout = wpipe go func() { err := cmd.Run() wpipe.CloseWithError(err) }() return rpipe } Declare the function to take an io.Reader as STDIN and return an io.ReadCloser as STDOUT Create a pipe so I can capture the program’s STDOUT and broadcast it as a readable stream Register xz as the command to execute Set the command’s STDIN Capture the command’s STDOUT Run a command in the background Run the command until there is an error Close the pipe when the command exits Return the read end of the command’s STDOUT At this point you have an io.ReadCloser (which is an io.Reader). You can use it anywhere you would use an io.Reader. For example let’s say you have a compressed CSV file. 
The following code would open the file, decompress it, parse it as CSV, and print that to the screen:\n1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 func main() { fp, err := os.Open(os.Args[1]) if err != nil { log.Fatal(err) } defer fp.Close() xz := xzReader(fp) defer xz.Close() csv := csv.NewReader(xz) for { line, err := csv.Read() if len(line) > 0 { println(line) } if err != nil { break } } } Define the main function Open a file Test for errors and bail if there are any Close the file pointer after the program exits Pass the file as the stdin to the xz command and capture its output Close the command’s STDOUT after the program exits Pass the decompressed file to the CSV parser and get back a csv.Reader Loop until the stream is closed Read a line Print the line if it exists Note: in Go, it is perfectly reasonable for a command to work and fail in the same call. If I check for an error before printing the line then I may miss the last line of the file. Check for a stream error and break the loop It is a very common paradigm in Go to take an io.Reader and pass it to a function just to get back a new io.Reader, and take that and pass it around and get back what you want (like a string or CSV row). Once you get your head around the fact that Go’s power is that everything is a small well defined lego block, and that they all fit together, then it stops being strange that you would take one kind of interface and exchange it for exactly the same kind of interface.","lastmodified":"2014-09-29T00:00:00-04:00","tags":null,"title":"Golang Stream File"},"/blog/20141017-go-concurrency-patterns":{"content":"One of Golang’s strengths is its composability. This strength is only useful if you know how to make those composable parts. That is where patterns are useful.\nGolang is concurrent, which is not necessarily parallel. However, to make things concurrent you have to break thing into atomic steps. 
If you are careful in how two steps share information then you can easily turn concurrent design into parallel design. Go channels make this communication stupid simple, and thus make concurrent design very easy.\nIn this post I am going to share what I think is the basis of most other concurrency patterns: The Generator, The Worker, and The Consumer.\nA note about parallel vs concurrent Before we start, it might be useful to separate parallel and concurrent. Parallel is doing multiple things (usually the same or similar things) at the same time. Concurrent is doing multiple disparate things to converge on the same point. In other words, the purpose of concurrency is to arrive at a solution, while parallel allows us to repeatedly arrive at a solution in the same span of time.\nAnother way to think of it is commuting to work vs working. We all drive to work in parallel. However, at work our work is concurrently producing company value.\nSo why is this useful? OSes since the early days of Unix were parallel; they were routing phone calls after all, which is just a lot of the same thing. This in turn meant that the OS-level abstractions (threads and processes) were parallel and no thought was given to concurrent abstractions.\nThis in turn was exposed directly by most programming languages. However, just as driving to work doesn’t necessarily make you useful, these constructs on their own aren’t useful. Additionally, there are a lot of edge cases and paradoxes when parallel code tries to become concurrent.\nHowever, properly concurrent code can become parallel code quite easily. Go luckily allows us to structure our code concurrently and then handles the parallelism without you having to use a thread.\nBelow are the basic types of concurrent actions. If you keep your code bite-sized and focused then parallelism is automatic.\nThe Generator A generator simply does work and places that work on a channel. 
What that means is really up to what is needed but the pattern is:\nCreate a channel Create a closure which does work Execute the closure as a goroutine, closing the channel when done Return the channel func generateInt() chan int32 { // 1. Create a channel out := make(chan int32) // 3. Spawn a closure go func(){ defer close(out) // 2. Do work for i := int32(0); i < 100; i++ { out <- i } }() // 4. Return the channel return out } The Worker A worker takes stuff off an input channel, works on it, and places the result on an output channel.\nfunc enlargeInt(in chan int32) chan int32 { out := make(chan int32) go func() { defer close(out) for x := range(in) { out <- x*2 } }() return out } The Consumer A consumer takes stuff off an input channel and consumes it. There are two primary ways to do this: Blocker and Signaller.\nBlocker The blocker form simply runs the code. This version is useful if there is a main loop which shouldn’t exit until all work is complete.\nfunc printInt(in chan int32) { for x := range(in) { fmt.Println(x) } } Here is a full working version.\nSignaller A signaller is actually a variant of the Worker pattern, where the output channel is used to signal the completion of work.\nfunc printInt(in chan int32) chan bool { out := make(chan bool) go func() { defer close(out) // Work until the channel is closed for x := range(in) { fmt.Println(x) } // Signal that I am done out <- true }() return out } Here is a full working version.","lastmodified":"2014-10-17T00:00:00-04:00","tags":null,"title":"Go Concurrency Patterns"},"/blog/20141215-emacs-key-binding":{"content":"I recently switched to Emacs as my editor of choice. It has taken a bit of work to get it to where I like it. 
My full settings are on github here.\nIn this post I will share how I added a key binding to only a single mode.\nA key binding just maps a key sequence to a lisp function. A global key binding can be added in the following way:\n(define-key global-map (kbd \"C-/\") 'comment-or-uncomment-region) This causes a key sequence to be added to a map. The global-map contains the global key bindings regardless of mode. This is only useful if the command makes sense in all modes. In a lot of cases keys should only be bound in a mode.\nEach mode has its own map which emacs only uses when you are in that mode. This is how you add mode-specific key bindings. The map for any mode is the mode name with “-map” added. You will need to have the mode loaded for the map to exist.\nThe following will add Agenda mode to OrgMode, which is off by default.","lastmodified":"2014-12-15T00:00:00-05:00","tags":null,"title":"Emacs Key Binding"},"/blog/20150210-emacs-full-screen":{"content":"On a Mac the shortcut to put a window into full-screen mode is ctrl + cmd + f. Unfortunately this does not work directly for emacs. Here I will explain how I made it work.\nℹ️ This is only applicable to windowed versions of Emacs! First, I needed to figure out if it was possible to make emacs full screen. To do this I turned to the help system for a function: M+x fullscreen<tab> did not turn up anything useful. Then I checked the apropos help command. C+h a fullscreen produced the following:\nType RET on an entry to view its full documentation. toggle-frame-fullscreen <f11> Toggle fullscreen mode of the selected frame. That looks promising so I run it as an interactive command: M+x toggle-frame-fullscreen and it does what I expect. Now to bind it to a key.\nI added the following to my config:\n(global-set-key (kbd \"C-s f\") 'toggle-frame-fullscreen) ℹ️ “s” is “super” which is what the “cmd” on a Mac maps to. 
I run C+x e to evaluate the lisp code and then I use the keyboard combination ctrl + cmd + f, but nothing happens. What gives?\nI use the help system (describe-key) to find out what it is bound to: C+h k ctrl + cmd + f prints \" is undefined.\"\nAs it turns out “ctrl + cmd” will almost always present as a numerical key value. Using the help system causes emacs to print what it sees when those buttons are pressed. It is a simple matter of using the angle-bracket key form (as printed). (global-set-key (kbd \"<C-s-268632070>\") 'toggle-frame-fullscreen) works like a champ!","lastmodified":"2015-02-10T00:00:00-05:00","tags":null,"title":"Emacs Full Screen"},"/blog/20150210-emacs-tabs-and-tab-groups":{"content":"From other editors I am used to having Tabbars. Switching to Emacs I missed that behavior. Emacs does have a tabbar plugin, but it isn’t quite what I want.\nBy default it groups the tabs in a seemingly random way. I am sure it makes sense if you wrote it, but I want things grouped by my projects. For me a project is a directory which has a .git directory at its root.\nFirst I created a simple function to get the project root, or return nil.\n(defun my-project-root () \"Return the root of the project.\" (locate-dominating-file default-directory \".git\")) Basically this searches all parent directories until it finds one containing the .git directory. The built-in function details are: locate-dominating-file BASE FILE searches the parent directory tree from BASE until it finds FILE. default-directory is a buffer-local variable which is the directory of the current file. Next I set the tabbar-buffer-groups-function to return the group name of the current file. 
The only requirement is that the function return a list, but it is recommended that the list only contain a single item.\n(setq tabbar-buffer-groups-function (lambda () (let ((dir (expand-file-name default-directory))) (cond ((member (buffer-name) '(\"*Completions*\" \"*scratch*\" \"*Messages*\" \"*Ediff Registry*\")) (list \"#misc\")) ;; All Magit status goes the same place ((string-match \"^*magit\" (buffer-name)) (list \"#magic\")) ((string-match \"^COMMIT_EDITMSG\" (buffer-name)) (list \"#magic\")) ;; All Cider windows ((string-match \"^*nrepl-server\" (buffer-name)) (list \"#cider\")) ((string-match \"^*cider\" (buffer-name)) (list \"#cider\")) ;; Group tabs based on project root ((my-project-root) (list (my-project-root))) ;; Use the current dir (t (list dir)))))) Here we capture the absolute path of the current file. Then we check a bunch of things to determine which best represents the group name. The first thing we do is group various special buffers together. Then we use the project root to group files. And if nothing else we use the file’s directory as the group name.\nlambda ARG BLOCK creates an anonymous function. let VAR BLOCK sets a variable and then calls a block. The variable is then local to that block. cond TUPLE TUPLE... evaluates tuples until the first item in a tuple is true. When it finds a true tuple it executes the second item and returns it. member ITEM LIST returns true if the first item is contained within the second list. buffer-name returns the name of the current buffer. list ITEM converts an item into a single-item list. string-match REGEXP STRING returns the index of the regex in the string, or nil. my-project-root See above. ","lastmodified":"2015-02-10T00:00:00-05:00","tags":null,"title":"Emacs Tabs & Tab Groups"},"/blog/20151230-hugo-blog-development":{"content":"Hugo does a great job of separating out configuration, content, themes, and local overrides. 
Each gets its own file or directory. But it provides no deployment scripts.\nFor comparison, Octopress/Jekyll leaves it as an exercise for the developer to separate configuration, content, themes, and local overrides, but it provides a deployment script.\nUsing GitHub Pages and a little bit of git wizardry, the deployment process is pretty easy.\nConfiguration Before you start, run the hugo new site -f yaml command to create all the files and directories Hugo uses. I prefer YAML to TOML or JSON, but Hugo supports all three.\nI used the following as my basic configuration.\n# config.yaml --- # Configure the basic behavior of URLs baseurl: \"http://jkamenik.github.io/\" canonifyurls: true # Name the site title: \"Random Software Inklings\" # Whenever new files are written, use this format metaDataFormat: \"yaml\" # The format shown here is the same one Jekyll/Octopress uses by default. permalinks: post: \"/blog/:year/:month/:day/:title/\" Themes Hugo has a lot of themes to choose from in the Showcase. But it is easy enough to build your own theme if you want; I went with the Icarus theme.\nIf you want to try a couple different themes, just download them to the ./themes directory and use hugo -t theme-name to generate the site with that theme.\nWhen it came time to pick my final theme, I decided to use git submodule so that I could upgrade the theme in the future without having to deal with merge conflicts. git submodule works by marking a sub-directory as a separate git repository. There are some gotchas so read the entire explanation on the git-scm site.\n# Make sure your repo is clean first!!!! $ git submodule add git@github.com:digitalcraftsman/hugo-icarus-theme.git themes/hugo-icarus-theme $ git add . 
$ git commit -m \"Adding theme\" If you need to download a new version of the blog just be sure to init the submodule.\n$ git submodule update --init --recursive Theme Overrides If there is anything about a theme that you do not like it can be overridden. For example, I wanted Gravatar to be used as my profile picture. I did this by copying ./themes/hugo-icarus-theme/layouts/partials/profile.html to ./layouts/partials/profile.html. I then changed the image link to use gravatar.\n<aside id=\"profile\"> <div class=\"inner profile-inner\"> <div class=\"base-info profile-block\"> {{ with .Site.Params.gravatarHash }}<img id=\"avatar\" src=\"https://www.gravatar.com/avatar/{{.}}?s=200\">{{ end }} <!-- <img id=\"avatar\" src=\"{{ .Site.BaseURL }}css/images/avatar.png\"> --> ... Rest is unchanged ... You will notice that I pull gravatarHash from Site.Params. This needs to be added to the config.yaml file.\nparams: gravatarHash: \"md5 checksum of your email address\" Deployment The Hugo site has a very good tutorial on using GitHub Pages here. I used a modified version with my blog.\nThe blog will reside at http://jkamenik.github.io. This means the repo MUST be named jkamenik.github.io, and the static site MUST be on the master branch. If you are using Hugo for project documentation then the setup is a bit different and you should read the tutorial linked above.\nKnowing that Hugo generates the static site in ./public the easiest thing to do is use git subtree to track a directory as a ref. git subtree is similar to git submodule in that they both aim to make a directory of one repository a ref to another repository. The difference is that git subtree does this as a merge strategy, while git submodule does it by maintaining two separate full repos.\nIn my case I will be developing the blog on the drafts branch, and publishing it on the master branch. 
For this reason git submodule is not a good fit, but git subtree is ideal.\nHere is the setup.\n$ git checkout --orphan master $ git ls-files | xargs git rm --cached -f $ ls | xargs rm -rf # remove all but the hidden .git directory $ git checkout drafts README.md $ git add README.md $ git commit -m \"Initial Commit\" $ git push origin master $ git checkout drafts $ rm -rf public $ git subtree add --prefix=public origin master --squash $ git push origin drafts $ git subtree push --prefix=public origin master It seems more complicated than it is. Basically, I just create an orphan branch (one not associated with the commit history of the current branch), and then I load that branch in the ./public directory of the branch I will be maintaining. That way a single git commit will take care of both repos at the same time, and GitHub will be happy because master will have the static site.\nDeployment becomes:\nAdd and commit changes Regen site Push drafts branch Push master subtree (./public) The following script does that, taking a commit message or supplying a default.\n#!/bin/bash echo -e \"\\\\\\\\033[0;32mDeploying updates to GitHub...\\\\\\\\033[0m\" # Build the project. hugo # Add changes to git. git add -A # Commit changes. msg=\"rebuilding site `date`\" if [ $# -eq 1 ] then msg=\"$1\" fi git commit -m \"$msg\" # Push source and build repos. git push origin drafts git subtree push --prefix=public origin master","lastmodified":"2015-12-30T00:00:00-05:00","tags":null,"title":"Hugo Blog Development"},"/blog/20160101-forcing-factors":{"content":"A forcing factor (a.k.a. Forcing Function, for us nerds) is any factor that forces you to make a choice. They are often thought of as bad because when you are forced to make a choice that choice is not likely going to be a good one.\n“Check” in the game of chess is one such negative forcing factor. Global climate change is another. However, they can be used for good if you take control of them.\nIf you think about forcing factors as the start of a feedback loop then you can use that to your advantage. The following are some ways I used them to make my life better.\nBroccoli Once upon a time - like most kids - I hated broccoli. So I forced myself to eat it first instead of last. By forcing myself to eat it first I created a forcing factor that prevented me from eating the food I liked until I finished the broccoli.\nAs an adult I actually like broccoli, so this particular forcing factor isn’t necessary, but I still use the idea when trying new foods. Instead of resisting, I simply try new foods first, and if I don’t like it then I finish it anyway before moving to the foods I like. That way my doggy bag contains the food I do like and I get a second meal.\nCleaning Up After the Dogs I dislike cleaning up dog poop. But I really dislike when people don’t clean up after their dogs, so I always clean up after my dogs when I walk them. And I use that dichotomy as a forcing factor.\nI used to let the dogs out into our fenced-in backyard to do their business; I would simply let them out and back in. 
This meant that if we wanted to let the kids play back there (they’re toddlers) I would have to take 30 minutes and comb the backyard for the poop bombs, or clean it off the kids when they started playing in it. It was not pleasant.\nI would have to do the same before I mowed the backyard as well. Or I risked stepping in poop, or worse having it explode out of the lawn mower. Yuck!!!!\nWhen I walk the dogs I always carry poop bags and clean up after my dogs. So now, I simply leash up the dogs and walk them around the front yard with poop bags two to three times a day. When the dogs go I clean it up and trash it. Now I never have to worry about there being poop bombs in my backyard.\nAt first it was annoying but now it is habit, and sometimes if the mood strikes me I simply take the dogs on a full walk around the neighborhood. It’s better for me and the dogs.\nYard Work I hate mowing the lawn, but I like a clean lawn. Specifically I like my front yard to look good for my neighbors. So, I use this to force myself to do the entire yard.\nMy wife would prefer I do the “important” stuff first, so that I could quit in the middle. But half-completed work bothers me.\nI do the edging first (because I won’t do it if I wait). Then I mow the backyard, then the front, and finally I blow off the sidewalk, driveway, and patio so that everything looks nice. Doing it in that order ensures that I do it all, so when I head in for the day I am not worried about how much work I have left.\nConclusion Forcing factors can be bad, but you can still use them as a force for good in your own life. Take a step back, figure out what makes you avoid something and do that thing first (eating yucky food). Or find a thing you really don’t like and find a way to make it a habit (picking up dog poop). Or find a way to organize a daunting task which ensures you complete it (yard work).\nOne benefit - not listed above - is that as you find and use forcing factors on yourself you become more aware of them. 
It becomes easier to see when others are using them for good or ill, which means your response will naturally become more calculated and less forced.","lastmodified":"2016-01-01T00:00:00-05:00","tags":null,"title":"Forcing Factors"},"/blog/20160118-story-points-done-wrong":{"content":"Ever said or heard something like:\n“How many hours per Point?” “How many days is a 3 point story?” “Why can’t we just use hours?” If so…\nTo be honest Story Points are probably the trickiest part of Scrum, largely because everyone “wants” them to be something that they are not. But once you understand them they are far more useful and accurate than any other form of estimation (that I have seen).\nStory Points are Complexity not Time Story Points do not aim to be an accurate calendar representation of work. And because of this, when used correctly, they are. Story Points are a measure of risk and complexity combined into a single number, nothing more and nothing less.\nAt most you should use 5 story point values, but ideally you have fewer. The scale you choose doesn’t matter just so long as everyone knows that there is no direct relationship. For example, 2 smalls don’t make a medium.\nPoints are chosen by comparing the unestimated story to a known golden story whose complexity is well understood. Finding the golden story is the hardest part of using Story Points, but also the most important. Usually it is best to pick a single generic medium story. That way the comparison is simply “is this story more effort, less effort, or similar effort to the golden story”.\nGrooming: Never do Points at Planning I won’t go over what or how to do Grooming, but by the time planning starts it is already too late for Story Points to have value. Planning is the time where a story is tasked out and (if the team so chooses) hours assigned to tasks. 
If you wait, then Points will be equated to hours and the value of Points is lost.\nVoting Each team member should vote in silence and at the same time so they are not influenced by a dominant person (usually a PO or senior dev who undervalues complexity, effort, or risk). The idea is to get a general consensus, not a unanimous verdict. If there is an even split or one extreme outlier, then a discussion should be had so that everyone understands. A new vote should be taken after the discussion. Repeat until everyone is close to the same number, then pick the point value of the majority and move on.\nI Need a Time! - Velocity By abstracting Points from Time you are no longer conflating the two. However, projects are done based on calendar time, so points need to be translated. Enter Velocity.\nVelocity is simply the average number of Story Points the current team has done in the last several sprints. It doesn’t pretend to have perfect knowledge. It is just a calculated number which accurately represents how the team produces.\nOn the small (at the sprint level) velocity has little to no accuracy, but on the large (at the release level) it is very accurate. Think of it like a radioactive half-life. At the atom level you have a 0% chance of guessing when it will decay into the stable form. However, at a larger quantity you know exactly how many atoms will decay in the time period (just not which ones).\nIf you are used to moving warm bodies around, or individually assigning stories, then velocity may seem like a step backwards. And maybe you (or a manager) have even seen short term gains doing your own thing, but eventually things will turn sour (usually at the worst possible time). Go read The Mythical Man-Month, and return here when finished.\nA stable velocity is a fragile thing, and it doesn’t pretend like it isn’t, but if you are willing to protect the team (and by extension the velocity) then it is amazingly accurate. 
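The velocity arithmetic can be sketched in a few lines. This is a minimal illustration, not anything prescribed by Scrum; the sprint totals and backlog size are made-up numbers:

```go
package main

import "fmt"

// velocity is the average number of points completed over the last few sprints.
func velocity(history []int) int {
	if len(history) == 0 {
		return 0
	}
	sum := 0
	for _, points := range history {
		sum += points
	}
	return sum / len(history)
}

// sprintsToRelease estimates how many sprints the remaining backlog needs,
// rounding up because a partially used sprint is still a sprint.
func sprintsToRelease(remaining, velocity int) int {
	return (remaining + velocity - 1) / velocity
}

func main() {
	history := []int{9, 11, 10} // points completed in the last 3 sprints
	v := velocity(history)
	fmt.Println(v)                       // 10
	fmt.Println(sprintsToRelease(45, v)) // 5 sprints for a 45-point backlog
}
```

Note that the forecast only claims accuracy "in the large": it says roughly when the release lands, not what any single sprint will deliver.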
For example, if over the last 3 sprints the team has averaged 10 points a sprint and you double the team size and put 20 points of work into the sprint then there is no hope of success because the velocity number is still 10 points / sprint.\nVelocity changes slowly, so don’t fall into the trap of playing with team makeup in an attempt to gain velocity. It is always better to increase throughput (the amount of new features) while leaving velocity alone.\nThroughput Throughput is simply the number of new features added. It is related to velocity, but not the same. In fact, whenever someone says “we need to increase our velocity” this is actually what they mean. The easiest way to increase throughput is to limit risk in one of the following ways.\nCut scope Does “user login” imply a “forgot password” process? If so, split it into two explicit stories. Cut stories horizontally Stories are often cut vertically, which usually leads to interesting half-done features. Consider cutting stories horizontally first, then vertically, to ensure that stories aren’t half done and don’t need to be revisited. Release Earlier - with an easy upgrade path A Minimum Viable Product likely has 2 to 3x more features than are actually needed. In many cases it is better to halve the feature count, double the quality, and release early. Remember, once you release you must support it, which is going to chew up time, so get used to it. Don’t focus only on the “customer” A “user” is any user of the system including Devs (developers and QA). Making their jobs easier will increase throughput overall. So make sure stories reduce tech debt, and increase automation and testing. We overcommit A very common problem, if you use Story Points at sprint planning, is to overcommit the team. As previously said, points have no accuracy in the small. 
So using them as a measure of the sprint will result in no accuracy.\nPoints may (and should) be used by the PO to help order the backlog so an even mix of large and small tasks makes it into a sprint. If the sprint is mostly large stories (which are large because of risk or complexity) then the sprint is likely to fail.\nThe correct way to do planning is to take each story from the backlog and task it out. For each task add an estimated time to complete, and don’t forget about QA or outside-team tasks, but refrain from assigning tasks. Keep tasking stories until you get to the story that can only be partially completed during the sprint. The PO may decide to accept a half-done story, or decide to rearrange the backlog to find a smaller story.\nOnce the stories are pulled into the sprint it is entirely up to the team to decide when and in what order to do the stories. Ideally, the entire team would bring a single story to completion before starting another, but there are cases where it doesn’t work completely. For those cases the devs should move on to testing.\nWhen deciding how many tasks will fit in a sprint a good rule of thumb is 5 productive hours per day per person. This is because email, team discussions, and scrum meetings all take away from development time. The larger the team, the more coordination is needed; therefore 5 hours is only true for teams of 3 to 5 devs. For larger teams take away 45 min of productive time per additional dev.","lastmodified":"2016-01-18T00:00:00-05:00","tags":null,"title":"Story Points Done Wrong"},"/blog/20160202-gitflow-simple":{"content":"Gitflow is a great workflow to ensure you maintain ever-increasing version numbers with enough room to fix mistakes. The downside is the slowness of deploying new features. 
GitFlowSimple is a simplified version which can be expanded to standard GitFlow when needed, but is less effort when deploying new features.\nAssumptions This method assumes the following are true:\nYou understand the following features of git How to branch from a non-HEAD commit How to deal with a merge conflict That linear history is an illusion (so you use git log --all --graph). You understand what is meant by a “SNAPSHOT” version. Basically, ever-changing code which devs can use, but never goes into production. You have read and understand GitFlow You need to release changes quickly. For example, you are maintaining several projects and need to add features to a library package; GitFlowSimple is ideal for the library. You don’t care about contiguous version numbers. The numbers are ever increasing, but may not be in sequence (a user might get 1.2.5 as the update to 1.1.0). This branching style is ideal for library code that is not normally seen by an end user, or end-user software that is automatically updated since the end-user doesn’t usually keep track of version numbers. Your mileage may vary, and you can always migrate to the standard GitFlow workflow when needed.\nBranching Where GitFlow has “develop”, “master”, feature, release, and hotfix branches, GitFlowSimple only has “master” and feature branches (or you can call them release branches if you like merging several features into a release). Everything that is done on the extra GitFlow branches is done directly on a feature branch or the master branch in GitFlowSimple.\nThe main branch “master” is the only main branch and (just like GitFlow) it must remain production ready. And just like GitFlow this is the branch where version tags are created.\nUnlike GitFlow this is also the branch where version numbers are bumped. 
The flow looks a little bit like this:\nHEAD is version 1.0 (which is also tagged) A feature branch is created from the HEAD commit of master The feature branch is bumped to a snapshot version as follows: The version file (inside the source code) is updated to 1.1.0-SNAPSHOT The change is committed, immediately The change and the feature branch are pushed to the central repo The feature is developed The feature is tested and merged (git merge --no-ff) On master, the version is bumped as follows: The version file is updated to 1.1.0 (no “-SNAPSHOT”) The change is committed The change is tagged All changes are pushed to the central repo Merging using --no-ff prevents git from linearizing the commits and makes it easy (using git log --all --graph) to see all the branches/features.\nSupporting Branches The hotfix and feature branches are both used in GitFlowSimple. The two branches serve the same purpose as they do in GitFlow. The release branch serves no function in GitFlowSimple.\nAn example GitFlowSimple is best shown using an example. Let’s assume the project is currently at 1.0, a defect that popped up needs to be fixed, and at the same time two new features need to be added. I am going to deal with each in turn, but in actuality they will be done in parallel.\nHotfix A hotfix branch is created using the 1.0 tag (unlike GitFlow which would have used the “master” branch for this). The branch is called “hotfix-1.0.1”. The version file is updated to “1.0.1-SNAPSHOT” and committed. The branch is pushed The defect is fixed (which takes 10 commits) The defect is tested Now we need to merge to master and claim the rightful version number.\nThe code is merged to master On master the version is updated to “1.0.1” The change is committed The commit is tagged “1.0.1” The commit and tag are pushed to origin Feature 1 Started at the same time as the Hotfix.\nA feature branch named “feature-1” is created from master’s HEAD (currently 1.0.0). 
The version file is updated to “1.1.0-SNAPSHOT” and committed. The branch is pushed The feature is added (which takes 3 commits) Now we need to merge to “master” and create a release, but the Hotfix was already merged.\nThe code is merged to master, but fails because of a version file conflict The merge is reverted. The feature branch is checked out and “master” is merged back into the feature branch. The conflict is present again. It is fixed by keeping our “1.1.0-SNAPSHOT” version. Note: we did this because our version was greater than what was in master; see “Feature 2” for what to do when this is not the case. We retest before merging to master. We attempt the merge to master again. There are no conflicts so we bump the version number as we did with the Hotfix; becoming 1.1.0. Feature 2 Feature 2 logically started on the 0.9 version so has the 1.0.0-SNAPSHOT version. This feature is not finished until after feature 1 has been merged.\nMaster has been pulled Master is merged into the feature branch, with a conflict on the version file. The conflict is fixed by keeping master’s version: 1.1.0. The version number is bumped to 1.2.0-SNAPSHOT (as if this was a new branch based on the current master) The code is retested The branch is merged into master (as with feature 1 and the hotfix), becoming 1.2.0. Multiple Same Versions There is a chance when merging a feature it will not create a merge conflict because it is trying to claim the same version as what is already in master.\nThis is why it is important to ALWAYS do the following when merging features:\nMerge master back into the feature branch before merging the feature into master. Manually update the version number if no merge conflict happened. ","lastmodified":"2016-02-02T00:00:00-05:00","tags":null,"title":"Gitflow Simple"},"/blog/20160423-packer-ova":{"content":"Recently I have started using Packer to build AMI images. 
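The GitFlowSimple release flow above can be sketched end to end in a throwaway repo. The branch names, the VERSION file, and the commit messages are illustrative, not prescribed by the workflow:

```shell
#!/bin/sh
# Sketch of the GitFlowSimple flow: release 1.0.0, develop a feature on a
# SNAPSHOT version, merge with --no-ff, then claim and tag the real version.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -q -b master
git config user.email you@example.com
git config user.name "You"

echo "1.0.0" > VERSION
git add VERSION
git commit -qm "Release 1.0.0"
git tag 1.0.0

# The feature branch claims its SNAPSHOT version immediately
git checkout -qb feature-1
echo "1.1.0-SNAPSHOT" > VERSION
git commit -qam "Bump to 1.1.0-SNAPSHOT"
echo "new feature" > feature.txt
git add feature.txt
git commit -qm "Add the feature"

# Merge with --no-ff so the feature stays visible in the graph,
# then claim the real version number and tag it on master
git checkout -q master
git merge -q --no-ff -m "Merge feature-1" feature-1
echo "1.1.0" > VERSION
git commit -qam "Release 1.1.0"
git tag 1.1.0
git tag    # lists 1.0.0 and 1.1.0
```

In a real setup each branch would also be pushed to the central repo after its version bump, as the flow describes.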
It works like a champ, but then I tried to make VMWare images and it produced machine images, not machine exports. This makes the images nearly useless. However, with a little post-processing magic this can be fixed.\nThere is a GoLang-based ovftool post processor that already does this, but it needs to be compiled and put into the packer plugins dir. Nothing that the plugin does requires GoLang, so I whipped up a quick bash version that I could include with the packer script.\nVMWare comes bundled with a tool called ovftool which will convert a machine image into an OVA file. It has a lot of useful options so I recommend reading the manual page. Basically it takes a vmx file and its associated files and makes an OVA: ovftool source.vmx dest.ova.\nI turned this into a script that can be called as a post processor from packer. There are a number of protections, but the important items are:\nline 5: Ensures ovftool is on the path line 19: This avoids an interesting behavior (read: bug) of Packer where the inline script is called multiple times; once per file in the output directory. line 33: Removes the floppy which is used to install some VMWare drivers. line 37: Removes the CD-ROM that is also used to install some VMWare drivers. line 42: Uses ovftool to create an OVA. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 #!/bin/bash # Make sure ovftool is in the path. # On the mac: PATH=/Applications/VMware\\\\\\\\ Fusion.app/Contents/Library/VMware\\\\\\\\ OVF\\\\\\\\ Tool/:$PATH # I need to be given the output path and vmname. if [ -z \"$1\" -o -z \"$2\" ]; then echo \"output path and vm-name are required\" exit 1 fi DIR=$1 NAME=$2 cd $DIR # I may be called multiple times. If so bail early if [ -f \"${NAME}.ova\" ]; then echo \"OVA already created, skipping\" exit 0 fi if [ ! 
-f \"${NAME}.vmx\" ]; then echo \"no ${NAME}.vmx file found\" exit 1 fi # bail on error set -e # remove floppy sed '/floppy0\\\\\\\\./d' ${NAME}.vmx > 1.tmp echo 'floppy0.present = \"FALSE\"' >> 1.tmp # remove CD sed '/ide1:0\\\\\\\\.file/d' 1.tmp | sed '/ide1:0\\\\\\\\.present/d' > 2.tmp echo 'ide1:0.present = \"FALSE\"' >> 2.tmp mv 2.tmp ${NAME}.vmx ovftool -dm=thin --compress=1 ${NAME}.vmx ${NAME}.ova Add this script to the packer JSON files.\n{ \"variables\": { \"output_path\": \"output\", \"vm_name\": \"data-collector\", ... }, \"builders\": [{ \"output_directory\": \"{{ user `output_path` }}\", \"vm_name\": \"{{ user `vm_name` }}\", ... }], \"_comment\": \"Create an OVA because packer cannot.\", \"post-processors\": [{ \"type\": \"shell-local\", \"inline\": [\"post-ova.sh {{user `output_path`}} {{user `vm_name`}}\"] }] } ","lastmodified":"2016-04-23T00:00:00-04:00","tags":null,"title":"Packer OVA"},"/blog/20161203-managing-base-docker-images":{"content":"Docker is a great way to package your code such that you can be sure it will run on any machine that has docker installed. However, maintaining your docker containers and publishing them to docker hub can be a bit of a challenge. The following are two ways I do it.\nOption 1: Let docker hub do it If your needs are simple then you can have Docker Hub monitor a repo and rebuild the Dockerfile when you push a change to the directory.\nDocker Hub maps branches to tags with “master” being the “latest” tag. If you want more docker tags (maybe multiple versions) then you will need to create multiple branches and reconfigure Docker Hub.\nOption 2: One repo to rule them all The option that I generally go for is a single repo to manage all my docker containers. Where the directories map to the docker repos and to the tags. 
I do this because you “should” be separating your packaging from your source, and I like to keep similar things organized.\nWith this method you do not need to create the Docker Hub repos. They will be created by the docker push from build.sh, but you will need to manually edit the repo information on Docker Hub to make the repo useful for others.\nIf you wanted to automate it then you will need a CI service which allows docker images and Docker-next-to-docker. Setting up automation is beyond the scope of this entry.\nThe docker-images repo directory would look a bit like this:\n- / - jkamenik/ - hugo/ - 0.15/ - Dockerfile - entrypoint.sh - latest -> ./0.15 - README.md - my-private-repo.com/ - emacs - 25.1 - Dockerfile - build.sh The highest level directory (“jkamenik”) is the Docker Hub account name. Or it could be a private docker repo. build.sh has not been tested against private repos, but it should work.\nThe 2nd level directory (“hugo”) is the docker image name: “jkamenik/hugo”. Within this directory is also where I put the README.md. This is because a Docker Hub repo’s information is independent of a tag, so it makes sense for me to manage it in the same way. After the first creation, or an update to the README, I copy and paste this information into the Docker Hub repo’s information.\nThe 3rd level directory (“0.15”) is the docker tag: “jkamenik/hugo:0.15”. Also, the 3rd level directory can be a symlink, which is useful when tagging “latest” to a specific version.\nThe 4th level is where the Dockerfile and any supporting files go. A common pattern for me is to provide a script which will be the entry point. The behavior of this script is generally to check the argument list and, if the arguments look like program arguments, blindly pass them to the default program. 
If they don’t look like program arguments then execute them as the entry point.\n# execute hugo without args $ docker run jkamenik/hugo # execute hugo with \"-D\" $ docker run jkamenik/hugo -D # execute bash $ docker run jkamenik/hugo /bin/bash The build.sh file takes 0 or 1 argument. If there are 0 arguments then it will build all directories. If there is 1 argument then it will traverse that path and build only the images there. For example:\n# build all $ ./build.sh # build all private repos $ ./build.sh my-private-repo.com # build only the 0.15 version of hugo $ ./build.sh jkamenik/hugo/0.15 To see this all in action see https://github.com/jkamenik/docker-images","lastmodified":"2016-12-03T00:00:00-05:00","tags":null,"title":"Managing Base Docker Images"},"/blog/20171021-docker-details-dumb-init":{"content":"If you don’t control the “init” process of docker then you are doing it wrong. But don’t worry, there is an easy fix. Before I explain the solution, I should explain the issue. Almost every process you run in Linux will likely run at least 1 child process. And Linux expects that every parent will properly care for its children by propagating kernel signals like SIGTERM, and by cleaning up child zombie processes. If all else fails the Linux init process will do that on behalf of the parent and all is happy.\nHowever, programmers generally don’t know the requirements of dealing with child processes, and Linux cleans up after them, so unless you already know what to look for testing won’t show issues. The issue comes because Docker doesn’t provide an init process for the container, so your child processes will not get signals, zombies will be created, and eventually things will terminate uncleanly or hang indefinitely.\nSolution: Use dumb-init dumb-init provides a very small init runtime that deals with signals and zombie processes. And nothing else!
It is a tiny 45KB statically compiled binary and will work inside any docker container as the entrypoint.\ndumb-init has the ability to do signal-rewriting, which is very important if you are using apache, which uses SIGWINCH for a graceful shutdown, or nginx, which uses SIGQUIT. Many other programs also use signals other than TERM to mean graceful shutdown.\nAnother nice benefit of dumb-init is that it will terminate the container immediately on any of the termination signals, even if the child processes ignore the signals. This prevents stall-out of container termination in Kubernetes, Docker Swarm, and Docker Compose.\nTest your containers To test your containers, send the SIGHUP, SIGINT, SIGTERM, SIGUSR1, and SIGUSR2 signals to your running container. All of them should immediately start the shutdown process.\nLet’s assume you have a container like the following\nFROM alpine:3.5 COPY entrypoint.sh /entrypoint.sh RUN chmod +x /entrypoint.sh RUN apk add --no-cache bash ENTRYPOINT [\"/entrypoint.sh\"] The entrypoint.sh looks like this. It will print the signal it gets but only exit cleanly if it gets USR1. I also added a 2s wait on USR1 to simulate a graceful shutdown. All the signal handlers should be called in turn until one exits the program.\n#!/bin/bash trap \"echo TERM\" TERM trap \"echo HUP\" HUP trap \"echo INT\" INT trap \"echo QUIT\" QUIT trap \"echo USR1; sleep 2; exit 0\" USR1 trap \"echo USR2\" USR2 ps aux tail -f /dev/null You can build the container with a command like docker build -t my_container . (note the trailing build context). If you run it with docker run --rm -ti --name my_container my_container you will get the following output:\nPID USER TIME COMMAND 1 root 0:00 {entrypoint.sh} /bin/bash /entrypoint.sh 7 root 0:00 ps aux You can send any signal you want to the container via docker kill -s HUP my_container (replace HUP with various signals). None of the signals are printed!
To double check run docker stop, which will send TERM, then wait 10 seconds before sending KILL.\n$ time docker stop my_container my_container real\t0m10.761s user\t0m0.011s sys\t0m0.015s So here we can tell that stop waited the full 10s and still needed to send KILL.\nFixing the issue Changing nothing about /entrypoint.sh we can fix this by updating the Dockerfile as follows.\nFROM alpine:3.5 COPY entrypoint.sh /entrypoint.sh RUN chmod +x /entrypoint.sh RUN apk add --no-cache bash # Change 1: Download dumb-init ADD https://github.com/Yelp/dumb-init/releases/download/v1.2.0/dumb-init_1.2.0_amd64 /usr/local/bin/dumb-init RUN chmod +x /usr/local/bin/dumb-init # Change 2: Make it the entrypoint. The arguments are optional ENTRYPOINT [\"/usr/local/bin/dumb-init\",\"--rewrite\",\"15:10\",\"--\"] CMD [\"/entrypoint.sh\"] Again start the container and run docker stop\n$ time docker stop my_container my_container real\t0m2.778s user\t0m0.011s sys\t0m0.013s Here you can see it exits immediately after the simulated graceful shutdown, meaning it got and processed the USR1 signal, even though it was sent the TERM signal. The output of the container is\nPID USER TIME COMMAND 1 root 0:00 /usr/local/bin/dumb-init --rewrite 15:10 -- /entrypoint.s 7 root 0:00 {entrypoint.sh} /bin/bash /entrypoint.sh 8 root 0:00 ps aux User defined signal 1 USR1 0 In case you want to double check that TERM was actually used we can use docker kill -s TERM to explicitly send the TERM signal:\n$ docker kill -s TERM my_container my_container PID USER TIME COMMAND 1 root 0:00 /usr/local/bin/dumb-init --rewrite 15:10 -- /entrypoint.s 7 root 0:00 {entrypoint.sh} /bin/bash /entrypoint.sh 8 root 0:00 ps aux User defined signal 1 USR1 0 We still get USR1 as the signal.\nHonorable Mentions The following are worth mentioning, though I don’t use them.\ntini tini is an alternative to dumb-init.
It is a few months older than dumb-init but doesn’t provide the signal rewriting that you are going to need for many of the programs you will want to run.\nIt is also 850KB for the statically compiled version (vs 45KB for dumb-init). Not a huge number, but given it has fewer features it isn’t worth the bloat.\nAlso, if you do use tini remember to use the -g option so that all child processes are signaled, as would be done from init during a shutdown. This is the default for dumb-init but needs to be enabled for tini.\ndocker run --init The --init flag was added to docker 1.13 to run tini as the init process before the ENTRYPOINT is executed. It’s a cute addition, but isn’t used by Kubernetes, Docker Swarm, or Docker Compose. This might be fixed in the future, but for now it is best to ignore this option.","lastmodified":"2017-10-21T00:00:00-04:00","tags":null,"title":"Docker Details - Dumb Init"},"/blog/20250406-wikilinks":{"content":"Wikilinks are a standard feature of most wiki software. However, Hugo does not have support for them. They have a simple form of [[Page Title]] or [[Page Title|Display Text]], which makes them very useful for quick linking. This is how I implemented something like them.\nShortcode Proof of Concept First, let’s start with a shortcode that gets us most of the logic.
The idea being that the shortcode {{% wl \"Hugo Wikilinks\" %}} would link to this page.\nSome examples\nwl \"Hugo Wikilinks\": Hugo Wikilinks wl \"Hugo Wikilinks\" \"Another name\": Another name wl \"hugo wikilinks\": Hugo Wikilinks wl \"invalid\": invalid wl \"FOO\" \"bar\": bar Note\nFor ease we’ll use the markdown form of calling shortcodes: % instead of <.\nThe contents of layouts/shortcodes/wl.html:\n1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 {{- $title := .Get 0 }} {{- $display := .Get 1 | default \"\" }} {{- $page := \"\" }} {{- range .Site.RegularPages }} {{- if eq (lower .Title) (lower $title) }} {{- $page = .}} {{- if eq $display \"\" }} {{- $display = .LinkTitle }} {{- end }} {{- end }} {{- end }} {{- if ne $page \"\" }} {{- printf \"[%s](%s)\" $display $page.RelPermalink }} {{- else }} {{- $display | default $title }} {{- end }} Line 1: Gets the Title of the page for comparison 2: Gets the display value override 5 - 12: Searches all regular pages (ones with valid titles) for the one that matches 8 - 10: Uses the Page’s Link Title if an override is not provided 14 - 18: Prints either the wiki link or the display string or the title This is a 90% solution. However, we can make it the 100% solution with a lot more logic. So let’s leave it here for now.\nIf you are interested in more of a solution then milafrerichs/hugo-wikilinks has a good start.","lastmodified":"2025-04-06T00:00:00-04:00","tags":null,"title":"Hugo Wikilinks"},"/blog/20250418-kustomize-rollout":{"content":"Kustomize is a tool built into kubectl which helps in the management of YAML. It does a lot of things, but one of the major ones is having overlays per deployment. It is not uncommon to have a single base and a rollout per deployment. However, this can cause issues when you need to fix your base, as it will happily update all your overlaid environments en masse, which is less than ideal. Here is how I have fixed that for my deployments.\nSetup Let’s say we have the following kustomize layout.
It is a simple deployment with a service and ingress. There are 4 deployments; the 3 standard stages (dev, staging, prod) plus a local testing overlay meant to be deployed manually to a local instance.\napp base deployment.yaml ingress.yaml kustomization.yaml service.yaml overlays dev kustomization.yaml local kustomization.yaml prod kustomization.yaml staging kustomization.yaml Option 1: Patches In this first option everything that makes a deployment unique is only tracked in the overlay via patches. This is very good for simple patches like the following:\n# overlays/dev/kustomization.yaml resources: - ../base patches: - patch: |- - op: replace path: /spec/rules/0/host value: dev.example.com target: kind: Ingress name: app However, things are a little more annoying if you have to exclude or remove items. There is no way in kustomize to delete entire objects so the only way to handle this in each overlay is via full object copies. If, however, you have a large discrepancy between envs or a lot of objects then this can be error prone.\nOption 2: A/B Bases Note\nThis style is more complicated because it prevents accidental rollouts.
It should be reserved for when the deployment maturity requires it.\nIn this style the base has a, b, provider-a and provider-b bases as well as all the standard overlays plus a local overlay that cannot have provider-specific objects.\napp base a deployment.yaml ingress.yaml kustomization.yaml service.yaml provider-a backend-config.yaml cert.yaml external-secret.yaml frontend-config.yaml kustomization.yaml b deployment.yaml ingress.yaml kustomization.yaml service.yaml provider-b backend-config.yaml cert.yaml external-secret.yaml frontend-config.yaml kustomization.yaml overlays dev kustomization.yaml local kustomization.yaml prod kustomization.yaml staging kustomization.yaml A standard rollout is as follows:\nEvery overlay is pointing at a: local and provider-a point directly at a; dev, staging, and prod point at provider-a Create / update b to be what is in a Point local to b Update b until it works for local Create / update provider-b based on provider-a Make sure that provider-b points to b Point dev to provider-b Update provider-b until it works for dev Roll out the changes to staging and prod (optional) Delete a and provider-a Blast Radius Diff\nBecause there are going to be so many file changes with every update it is useful to create a script that can diff the output of kustomize between what is currently deployed (usually the main branch), and what is currently on the branch. This is the only way to get a real idea of the blast radius of the change.\nOption 3: Versioned Helm Charts The final option - which is complicated as it introduces a new tool - is to produce a versioned [helm chart](/radar/languages/helm-chart/). Then use that reference and a values file in each overlay.
In this case, because the helm chart is the base, no separate base is needed.\napp overlays dev kustomization.yaml values.yaml local kustomization.yaml values.yaml prod kustomization.yaml values.yaml staging kustomization.yaml values.yaml Each kustomization.yaml would look like this:\nhelmCharts: - name: app includeCRDs: false valuesFile: values.yaml version: 3.1.3 repo: https://oci.example.com/repos/app Separately you will need a repo for the helm chart with a CI pipeline that pushes a versioned OCI image to a repository. To roll out a change you update the version field and make any corrections needed to the values file.\nNote\nIt is not uncommon to do local testing directly from the Helm chart and have a special dev cluster for manually testing the helm chart before it gets rolled out. This means that the local vs provider-specific objects have to be handled by logic exposed in the values.yaml.\n# values.yaml provider: \"local\" # only include the CRDs valid for local deployments # or provide explicit disable flags disableExternalSecrets: true disableManagedCertificates: true # hardcoded secrets are now needed secrets: someSecret: key: some-secret-value Don’t use chartHome\nKustomize does support a chartHome option which will use a local file path to find the helm chart. Don’t use it, as it is the worst of all worlds. You have to manage it in A/B style or you have to add the stage logic to the chart directly which means you cannot truly test it before roll out.
It will bite you.","lastmodified":"2025-04-18T00:00:00-04:00","tags":null,"title":"Kustomize Rollout"},"/blog/20250514-rag-pipeline":{"content":"A Retrieval-Augmented Generation (RAG) pipeline is a technique for interfacing with LLMs that helps to:\nAdd Context Improve Accuracy Check / Filter hallucinations Preserve Privacy Add Value In this age where LLMs are becoming ubiquitous you will very likely need to create one of these sooner or later.\nBasic Flow flowchart LR User((User)) Vector[(Vector DB)] LLM{{LLM}} Embedding[Embedding Processor] Embedding -->|A| RAG -->|B| Vector Vector -->|2 & 4| RAG -->|5| User User -->|1| RAG -->|3| LLMThere are two distinct flows that need to be performed. The first is the embedding flow, which reads and stores data not directly accessible to the LLM. It vectorizes the data and hands it to the RAG for storage in the Vector DB. Following Single Responsibility Principle, it is important that the RAG be the only system with direct access to the Vector DB.\nSeparately the Query flow starts with the user asking a question. The RAG system may (or may not) pull data from the Vector DB to augment the query with context the LLM doesn’t have. Care has to be taken to ensure that no IP is leaked to the LLM because queries are considered public and may be used in future LLM training. After the LLM response is returned the RAG can additionally augment the response with embedded context from the Vector DB before handing it back to the user.\nRAG Service The RAG service should be a simple HTTP service that has 2 types of endpoints. The first type will be the CRUD API for the embedding flow. The second is the streaming API for the query flow.\nEmbedding API The Embedding API should be your standard CRUD operations. You would be well served using an OpenAPI server. If at all possible we recommend taking the raw data in the API and then vectorizing it in the service for storage in Vector DB.\nQuery API The Query API should “stream” the response. 
There are a few ways to do this which will affect which deployment options you have. One of the easier ways - from the server perspective - is websockets. However, not all load balancers or WAFs support that. Another option is a Cursor Keep Alive, though that comes at the disadvantage of requiring a separate DB to keep track of the cursors.\nDeployment Options Your simplest option is a serverless function or container. That service would take care of the API endpoint and then you largely just have to make sure that your code completes before the service times out.\nA Kubernetes Deployment is another viable option. Expose the endpoint via an Ingress through a high-performance IngressController and you should be all set.","lastmodified":"2025-05-14T00:00:00-04:00","tags":null,"title":"RAG Pipeline"},"/blog/20250615-blast-radius":{"content":"Context is everything, and understanding Blast Radius is crucial for providing the necessary context when assessing risk as a DevSecOps professional.\nBlast Radius - simply put - is how much of the infrastructure is touched when a change is made. The higher the blast radius the higher the risk. And in the world of IaC that usually means it is caused by a change in IaC code. So it would be natural to assume that a large change in blast radius would be caused by a large change in IaC, but that is often not the case. There is only an indirect relationship between IaC lines changed and infrastructure changes.\nBut how do you protect yourself?\nYou measure blast radius, then track it using Key Performance Indicators (KPIs), and finally design to minimize it.\nMeasuring Blast Radius The best way to measure Blast Radius is by using the IaC tool itself. A naive approach is to simply count the number of changes. However, that doesn’t include any kind of risk assessment. Different types of change represent different risks.
Therefore, my recommendation is to calculate it as follows:\n+1 for any item added or deleted, but not replaced If possible, consider the underlying provider’s objects and not what is in IaC. For example, in terraform an aws_s3_bucket might have a logging block. Those are 2 separate objects in AWS and thus should be counted separately. As a side note, this is why the logging block is deprecated and the aws_s3_bucket_logging object should be used instead. +3 for any item changed in place They are higher risk targets as they are presumably already in place and working so a change is a higher risk +5 for any item that is replaced. They are already working, and any replace is likely to result in data loss. Add up all the changes and that represents the risk of the change. This can then be tracked over time to indicate the relative risk of changes.\nOptional items The following items are harder to know because sometimes the IaC won’t know. But if you can track them, you should.\n+7 for any case where the IaC change would not be respected. That is, the IaC is changed, but it won’t affect the underlying infrastructure. If this doesn’t result in an error from the IaC tool then it is a really high risk because you’ll assume everything is ok, but it very much isn’t. S3 bucket encryption is an example of this. If the flag is enabled only new files are encrypted. Older files remain unencrypted without any way to know. +9 if the change would cause loss of data This usually is only discovered during that oh-shit moment when everything fails and you have to trigger Disaster Recovery. Once this type of change is known it should be documented as the kind of change that shouldn’t be made. Policy-as-Code tools are good for preventing this. At the time of this writing, AWS EFS encryption is an example. It would happily remove all existing files when this flag is changed.
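To make the weighting concrete, here is a minimal sketch of automating the score, assuming the JSON shape produced by terraform show -json tfplan (the three-resource plan and the score function are illustrative, not from the original post):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// plan models only the fields of `terraform show -json` output
// that the scoring needs.
type plan struct {
	ResourceChanges []struct {
		Address string `json:"address"`
		Change  struct {
			Actions []string `json:"actions"`
		} `json:"change"`
	} `json:"resource_changes"`
}

// score applies the weights described above: +1 for an add or delete,
// +3 for an in-place change, +5 for a replace (reported by Terraform
// as a ["delete","create"] or ["create","delete"] action pair).
func score(actions []string) int {
	if len(actions) == 2 {
		return 5 // replace
	}
	if len(actions) == 1 {
		switch actions[0] {
		case "create", "delete":
			return 1
		case "update":
			return 3
		}
	}
	return 0 // no-op and read carry no risk
}

func main() {
	// A hypothetical three-resource plan.
	raw := `{"resource_changes":[
		{"address":"aws_s3_bucket.logs","change":{"actions":["create"]}},
		{"address":"aws_instance.web","change":{"actions":["update"]}},
		{"address":"aws_db_instance.main","change":{"actions":["delete","create"]}}]}`

	var p plan
	if err := json.Unmarshal([]byte(raw), &p); err != nil {
		panic(err)
	}
	total := 0
	for _, rc := range p.ResourceChanges {
		total += score(rc.Change.Actions)
	}
	fmt.Println("blast radius:", total) // 1 + 3 + 5 = 9
}
```

The same weights could be fed from a real exported plan file and tracked per apply; per the advice above you would ideally expand provider-side objects (like the deprecated logging block) before scoring.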
Blast Radius KPI The following are the KPIs you need to track and publish:\nBlast Radius (See above) - This tells you the risk of each apply. IaC lines changed - This tells you the programmer’s intended number of changes for each apply. It is important to track the lines changed between two applies of the IaC, not just the lines changed per commit. Each line changed, added, or deleted counts as +1. Resources Under Management (RUM) - This is a count of the number of objects described by the IaC at the time it is applied. Since IaC often has means to reduce code duplication (like modules) this should be based on fully expanded resources, not just what is in the IaC files. IaC Code Efficiency ($\\frac{IaC\\ resources}{1K\\ lines\\ of\\ code}$) - The number of resources should be at the plan or apply stage. The lines of code should be measured based on the workspace. It is measuring how effective you are at using coding techniques to manage your IaC resources. Change Risk (min, max, mean, median, & mode of $\\frac{Blast\\ Radius}{IaC\\ Changes}$) - This tells you how much of your infrastructure is changing per change in IaC. High numbers indicate that relatively small changes in actual code cause large changes in infrastructure. You need to balance IaC Code Efficiency and Change Risk to maximize your operational speed.\nDesigning for Blast Damage Each situation and business model is different. So there is no one-size-fits-all solution here. The design goal should be to know what the maximum potential effect of a change could be and to ensure it is within an acceptable tolerance given the business. Then design an IaC layout and processes that maximize the IaC Code Efficiency without increasing the potential blast radius outside acceptable tolerances.\nBalancing these two competing ideas is difficult and constantly needs to be revisited and adjusted. Below are some ideas that might help you.\nSimple Base Let’s say you’re using a code gen tool like Kustomize.
This uses a base + overlays. Let’s say there are overlays for dev, staging, and prod. Any change to base affects all envs, thus has a large blast radius. Changes to overlays only affect those overlays. Therefore they have a smaller blast radius.\nA good technique to reduce blast radius is to apply the changes to each overlay individually. This maximally reduces Blast Radius by maximally increasing code duplication. To reduce code duplication, once each overlay has all the changes, move them up into base so that any new envs also benefit. The idea is to keep the base as simple and minimal as possible, and copy and paste the required changes between overlays.\nYou might even consider splitting your monolith across multiple bases so that each can release independently.\nA/B rollout A more extreme version of the above - but ultimately easier to maintain - is an A/B style rollout. Using the Kustomize example again, you’d create a second copy of base called base-b. You’d modify it as needed and point each overlay to it in turn. Once all overlays are pointed at base-b you can erase base.\nYou’d have as many bases as you’d have in-process rollouts up to a maximum of the number of envs you have. Therefore it is important to track the rollouts carefully.\nCare also has to be taken with PRs since the actual line change will look huge but the relative change will be small.\nCheck-in Fully Rendered Code Similar to the A/B style, but more broadly, is to use code gen tools like CUE to maintain an easier-to-manage base template, then fully render it out and check in that code. In this case the CI/CD pipeline should only ever run off the fully rendered code, and the developer controls what changes get committed and thus the blast radius.\nFor early teams this is likely a good balance. But as the team scales the amount of generated IaC per small change will be untenable. So just keep that in mind.
This model is ultimately a path with limited long-term scalability (an “evolutionary dead-end”), but it could serve you for years before any issues arise.\nVersioned Artifacts The most effective approach to handling Blast Radius is through the use of Versioned Artifacts and rolling out new versions. All roads will eventually lead to this method; however, there is quite a bit of effort involved in setting up a repository for the artifact, CI pipelines that include adequate testing, and artifact storage.\nWhile this gives you the benefit of being able to independently test the artifact before release, the entire process does involve some amount of operational overhead. Therefore it is only recommended for mature organizations for the items that MUST remain stable.\nBranches and/or Tags Sort of a lazy man’s versioned artifacts is using branches or tags. Code can be merged between branches as needed to roll out changes, but you do not have to do things like setting up new repos, CI pipelines, or Artifact Repos. Folks also generally know how to merge between branches better than they understand how to release artifacts. Therefore it is a smaller conceptual load and easier to pick up.\nHowever, there are some major downsides:\nMerging is manual so forgetting to merge changes across branches will be common. There is no relationship between branches, thus it won’t be clear what or how changes can and should be made, or where things are deployed. It would be entirely possible to have changes not flow through a standardized process. Environments will tend to drift, making future merging difficult.
","lastmodified":"2025-06-15T00:00:00-04:00","tags":null,"title":"Blast Radius - Critical Context"},"/contentindex/index.json":{"content":"","lastmodified":"0001-01-01T00:00:00Z","tags":null,"title":"Content Index"},"/linkindex/index.json":{"content":"","lastmodified":"0001-01-01T00:00:00Z","tags":null,"title":"Link Index"},"/radar/languages/container-structure-test":{"content":"Container Structure Test provides a powerful framework to validate the structure of a container image. These tests can be used to check the output of commands in an image, as well as verify metadata and contents of the filesystem.\nIf you are building containers then this is a critical tool.\nTests are defined in a YAML file and then run via\n1 container-structure-test test --image <image> --config <testfile> There are three types of tests that can be performed:\nCommand Tests - Execute a command and check its output File existence tests - Check a file is or is not present File content tests - Check a file has or does not have certain content Metadata test - Check that the metadata of the container is correct ","lastmodified":"2023-12-17T00:00:00-05:00","tags":null,"title":"Container Structure Test"},"/radar/languages/dev-container":{"content":"Dev Container was a system born in VS Code that allowed the IDE to boot a container and use it as its development runtime. The benefits of this are huge: no longer does each developer need to set up a development machine, no longer does one developer’s effort have to be manually replicated by all others, no longer is there a skew between dev, CI, and production. There simply is shared Infrastructure as Code (IaC) which is used to build all of those. And because of these benefits, Dev Container is very much in the adopt ring.","lastmodified":"2025-04-23T00:00:00-04:00","tags":null,"title":"Dev Container"},"/radar/languages/go":{"content":"Go is a language that google created to handle its unique scaling needs. 
It turns out that designing for those needs eliminates the performance and scale cliff that almost everyone experiences. Since the language itself is fairly simple this has made it a boon for anywhere large-scale and/or highly concurrent programming is needed. For this reason it has become the de facto language for many DevOps tools, so if you are in that space then you should know it.\nGoLang\nBecause “Go” is very generic “GoLang” can be used in search.\nOne of its core strengths is an active community of maintainers and a Special Interest Group (SIG) that forces a workable reference implementation of all things before they can be added to the language. Unlike many other languages where a theoretical feature is added only to be poorly thought out, the SIG forces a working example to be implemented in a fork of the language before it is brought in.\nThis can make it frustrating for those familiar with other language constructs like package management, inheritance, or generics. Go seems to move slowly and when picked the solutions don’t always look like they do in other languages. However, this is a strength of Go. The solutions that Go comes up with scale extremely well because they are only added to the language if they do.\nTake for example error handling in Go. It is implemented as an interface, where any object that has a function Error() string is considered an error. This means:\nErrors are treated as variables, and don’t need special handling. Functions often return both an object and an error, leveraging Go’s multiple-return feature. This encourages handling errors immediately after they are returned. To check if you have read to the end of a file, you can use a simple comparison like if err == io.EOF, which checks if the error is exactly io.EOF. io.EOF is a singleton defined in the io package. So you are asking if the memory address of the error is that of io.EOF.
In many cases, when you know the context of the error, this is enough.\nHowever, there are cases where the exact context isn’t fully known, leading to much debate on how to handle these cases. Go came up with wrapped errors. Functionally it acts kind of like exception inheritance in other languages. Let’s say you want to print the file name in the error; you would wrap it: return fmt.Errorf(\"file: %v, %w\", file_name, io.EOF). This breaks err == io.EOF since the new error is not io.EOF. So two functions to check wrapped errors were added: Is and As.\nfunc usingIs() { if _, err := os.Open(\"non-existing\"); err != nil { if errors.Is(err, fs.ErrNotExist) { fmt.Println(\"File does not exist\") } } } func usingAs() { if _, err := os.Open(\"non-existing\"); err != nil { var pathError *fs.PathError if errors.As(err, &pathError) { fmt.Println(\"Failed at Path:\", pathError.Path) } } } which will check if the current error is, or wraps, the target error. In this way they are a drop-in replacement for ==, plus work with error wrapping.","lastmodified":"2025-05-08T00:00:00-04:00","tags":null,"title":"Go"},"/radar/languages/go-template":{"content":"If you are a Go programmer then Go Templates makes a lot of sense. But Helm and Hugo are some of the more popular tools that use it, so its usage is relatively specialized. For this reason you really have to assess if there is any need.\nThings to know The language is fully described in the programming documentation. Basically, it reads text for embeds. Anytime you want to drop to code you use {{ ... }}. The output of that block is then inserted in place. Data can be accessed with $ (main object), or . (current scope). There is a relatively small set of built-ins, like range or block.\nThe main object ($) and all functions have to be registered directly in the Go code, meaning that different usages have very different behaviors.
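That registration requirement can be sketched as follows; the shout helper is a hypothetical custom function, not a built-in:

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render parses a template with one custom function registered.
// "shout" is a made-up helper, not part of the standard library.
func render(name string) string {
	funcs := template.FuncMap{"shout": strings.ToUpper}
	// Funcs must be attached before Parse, otherwise the parser
	// rejects the unknown "shout" identifier.
	tpl := template.Must(
		template.New("demo").Funcs(funcs).Parse("Hello, {{ shout . }}!"),
	)
	var b strings.Builder
	if err := tpl.Execute(&b, name); err != nil {
		panic(err)
	}
	return b.String()
}

func main() {
	fmt.Println(render("world")) // Hello, WORLD!
}
```

Larger function libraries are registered the same way, just with a much bigger FuncMap.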
Sprig is a very common library of template functions that should be loaded to make the templates really useful.\nExample import ( \"github.com/Masterminds/sprig/v3\" \"html/template\" ) // This example illustrates that the FuncMap *must* be set before the // templates themselves are loaded. tpl := template.Must( template.New(\"base\").Funcs(sprig.FuncMap()).ParseGlob(\"*.html\") ) template.FuncMap is a map[string]any and can be used to add any functions to the list, including internally developed ones.","lastmodified":"2025-04-24T00:00:00-04:00","tags":null,"title":"Go Template"},"/radar/languages/helm-chart":{"content":"Helm Charts are the artifact that Helm reads and installs in a Kubernetes cluster. They are written in Go Template format, and are compiled into OCI compatible images, which means they can be stored in any OCI compliant Docker registry, of which there are plenty.\nWhile Helm itself is a MUST if you are in the Kubernetes ecosystem, that doesn’t extend to maintaining your own Helm Charts. They are a necessity to understand, but there are limited use cases where you should maintain them yourselves. The only real reason to do that is if you are providing customers the ability to install on their own clusters. For all other cases Kustomize is a better choice.","lastmodified":"2025-07-13T00:00:00-04:00","tags":null,"title":"Helm Chart"},"/radar/languages/yaml":{"content":"YAML is a more human-readable format which is fully API compatible with JSON. It gained popularity as being the Declarative IaC language for things like Docker Compose and Kubernetes.\nThere is some syntactic sugar, like anchors, aliases, and deep merging of maps, which makes it easier for humans to read. Additionally, many systems (like Helm) add a templating language on top to make generating large amounts of YAML easier. And since YAML is a superset of JSON, JSON Schema can be used to validate complex YAML.
Because of this ubiquity you should adopt it.\nThe creators of YAML also created YAMLScript as a templating language, but it is not widespread yet.","lastmodified":"2025-05-10T00:00:00-04:00","tags":null,"title":"YAML"},"/radar/languages/yamlscript":{"content":"YAMLScript (YS) is valid YAML; however, it is not API compatible with YAML. Therefore, even if example.ys can be read by a YAML interpreter, the output would not contain the same objects as the parsed YS. Currently no mainstream programs use it, but we are optimistic that it might eventually replace Go Templates for things like Helm. But until that happens you have to assess if it suits your needs.\nAll YS files must start with !yamlscript/v0/data, or a shebang (#!). This will cause non-compliant parsers to fail, and instruct compliant ones on which language to use for the rest of the file.\nExample Data\n!yamlscript/v0/data\nbase =: 42\nexample: inc(base)\nwill result in\nexample: 43\nExample Program\n#!/usr/bin/env ys-0\n\n# Print the verses to \"99 Bottles of Beer\"\n#\n# usage:\n#   ys 99-bottles.ys [<count>]\n\ndefn main(number=99):\n  each [n (number .. 1)]:\n    say: paragraph(n)\n\ndefn paragraph(num): |\n  $bottles(num) of beer on the wall,\n  $bottles(num) of beer.\n  Take one down, pass it around.\n  $bottles(num - 1) of beer on the wall.\n\ndefn bottles(n):\n  cond:\n    n == 0 : 'No more bottles'\n    n == 1 : '1 bottle'\n    =>     : \"$n bottles\"\nmain is the entry point and all command line arguments are passed as function arguments. Builtins like each, say, and cond take zero or more arguments and perform work given by their definition. User defined functions are called using ().\nThis is a functional language and thus the program and object definitions are unordered. For example in cond the “=>” string indicates the default case; it is only placed last as a convention. This also means that all conditions have to be fully evaluated before the default case can be chosen. 
Not usually an issue, but something to be aware of.\nIn paragraph the | indicates that the next block is text. The $ in the block is interpolation.","lastmodified":"2025-05-08T00:00:00-04:00","tags":null,"title":"YAMLScript"},"/radar/platforms/cloudbees":{"content":"All marketing aside, CloudBees both provides a managed Jenkins solution and actively maintains Jenkins. If you are in the Jenkins ecosystem already then this is basically your only serious option. But Jenkins has not kept up with the changes in the industry, so it should be avoided if at all possible.","lastmodified":"2025-05-10T00:00:00-04:00","tags":null,"title":"CloudBees"},"/radar/platforms/docker-desktop":{"content":"Docker Desktop is a developer platform which can be critical if you are in the container ecosystem. This makes it especially useful for Dev Container based development. With the adoption of built-in native Kubernetes, it becomes a critical solution by enabling seamless container orchestration, simplifying development workflows, and providing an integrated environment for building, sharing, and running containerized applications.\nFor larger orgs it comes with a hefty price tag, but the investment can be justified by the productivity gains, streamlined workflows, and robust features such as built-in Kubernetes support and containerized development setups.","lastmodified":"2025-04-23T00:00:00-04:00","tags":null,"title":"Docker Desktop"},"/radar/platforms/docker-swarm":{"content":"As of 2025 Swarm maintenance has been transferred to a separate company from the one that maintains Docker. We therefore recommend you avoid this platform.\nOriginally Docker Swarm was billed as an alternative to Kubernetes. However, when that didn’t get traction it changed course and retooled as a way to bridge Docker Compose to a multi-machine setup. 
However, there are some issues with stateful data that - due to the lack of ongoing support - will likely not be solved.","lastmodified":"2025-05-10T00:00:00-04:00","tags":null,"title":"Docker Swarm"},"/radar/platforms/jenkins":{"content":" Security Blackhole!\nThere is no way to properly secure Jenkins and make it Cattle Not Pets. It should not be allowed to touch production workloads, and where possible should be removed anywhere security is a concern.\nJenkins is the OSS version of Hudson, and Hudson was originally designed to build Java projects. It predates DevOps and DevSecOps by a wide margin. While over the years there have been many attempts to make it compatible with the change in mindset, there are too many existing installations of it to make this possible. It is software that is best avoided, and other proper CI-CD tools used.\nIf you must use Jenkins then you MUST treat it as a Pet / Snowflake and manually manage it. I cannot tell you the number of Jenkins instances that either die on reboot, die on upgrade, or fail in a disaster recovery (DR) situation. By the time the issue is identified it is often easier to rebuild the entire pipeline using an appropriate tool than it is to try to make Jenkins work again.\nAdditionally, absolutely DO NOT use the internal sensitive information store. First, it isn’t actually secure, and is easily reversed. Second, the information there cannot be properly managed, tokens cannot be rotated, etc… Finally, the ID of the sensitive info is tied to the machine ID. So if the machine fails and you need to restore to a different machine then none of the IDs will match and all the pipelines will have to be rebuilt by hand even if the pipeline was defined by IaC.\nInstead, consider using dedicated secret management tools such as HashiCorp Vault, AWS Secrets Manager, or Azure Key Vault. 
These tools provide robust security features, including encryption, access control, and automated secret rotation, which are essential for managing sensitive information in CI/CD pipelines.","lastmodified":"2025-05-10T00:00:00-04:00","tags":null,"title":"Jenkins"},"/radar/platforms/jenkins-x":{"content":"Yet another option for using Jenkins in places it doesn’t belong. Save yourself the trouble and use ArgoCD, or Argo Workflows if you plan on running your pipelines on Kubernetes.","lastmodified":"2025-05-10T00:00:00-04:00","tags":null,"title":"Jenkins X"},"/radar/platforms/kubernetes":{"content":"Kubernetes, also known as K8s, is an open-source system for automating deployment, scaling, and management of containerized applications. Unlike many systems before it, Kubernetes is fully multi-cloud and truly planet-scale. At the time of this writing, there is a theoretical limit of a single cluster having 25 million nodes, with each node having thousands of CPUs and millions of GPUs. In practice, clusters tend to be much smaller than that, but they also tend to be very dynamic and scale as needed.\nIt uses a YAML based Declarative IaC syntax to orchestrate Containers. While there are alternatives like Nomad, Docker Swarm, and DCOS, realistically, they each have limitations that Kubernetes does not, making it likely that you will decide to (or be forced to) migrate.\nIf you want to avoid cloud vendor lock-in or operate in a Hybrid Cloud environment, Kubernetes is essential.","lastmodified":"2025-05-10T00:00:00-04:00","tags":null,"title":"Kubernetes"},"/radar/platforms/nomad":{"content":"HashiCorp Nomad bills itself as an orchestration system that is easier than Kubernetes. While that is true, there is less of a community, so your only option is HashiCorp commercial support. 
It is a good platform to know, but your mileage may vary for production workloads.","lastmodified":"2025-05-11T00:00:00-04:00","tags":null,"title":"Nomad"},"/radar/platforms/replicated":{"content":"Replicated is an air-gapped Kubernetes installer. If you live in that space then this is basically your only commercial option. However, you aren’t doing yourself any favors choosing Kubernetes. Therefore, for the majority of users, this option should be avoided.","lastmodified":"2025-05-19T00:00:00-04:00","tags":null,"title":"Replicated"},"/radar/platforms/system-initiative":{"content":"System Initiative is a visual DevOps tool. Think WYSIWYG Terraform.\nWhen it started in 2019, we were optimistic about its novel approach of using a hypergraph to perform all the work on a digital twin before applying it to the underlying infrastructure. We appreciate that it is an open source project, but the source reveals that the hypergraph is essentially JSON objects, and the digital twin is just JSON transformers that are applied to the in-memory version of those objects.\nWe are 5 years on, and the novelty of that approach has waned as it struggles to implement useful features like support for anything but AWS, templates of reusable code, basic linting, or 3rd party integrations. Unless you are already invested in this solution we recommend avoiding this platform.","lastmodified":"2025-04-23T00:00:00-04:00","tags":null,"title":"System Initiative"},"/radar/platforms/tailscale":{"content":"Tailscale is a software defined VPN platform which uses WireGuard under the hood. If you have a need for a traditional VPN then Tailscale might be an easier solution. 
However, in this day and age you are likely better served with a ZTNA solution, so we remain cautious.","lastmodified":"2025-04-23T00:00:00-04:00","tags":null,"title":"Tailscale"},"/radar/techniques/12-factor-app":{"content":"The 12-factor methodology is designed for building SaaS services that are declarative, clean, maximally portable, and minimally divergent. By following these principles, the code becomes more portable, testable, and manageable in production, without being restricted by language, platform, or framework choice. All services should adhere to these principles.","lastmodified":"2023-12-17T00:00:00-05:00","tags":null,"title":"12 Factor App"},"/radar/techniques/3-point-estimate":{"content":"The three-point estimation technique is used in management and information systems applications for the construction of an approximate probability distribution representing the outcome of future events, based on very limited information. It is very useful in estimating the scope of engineering projects as well.\nIf your team is struggling with estimating, this might be a viable alternative to simple effort points, which are common in Scrum. By taking a range of estimates, you can avoid unnecessary debates and over-analysis that often arise during estimation. Many teams will find it easier to estimate dozens of tasks in the same time it would have taken to estimate just a few with planning poker.\nEstimates Before we get to the math, the 3 points in 3-point estimation are the following estimates. For each task these are the values you need to track. As developers we recommend estimating in whole-number days. However, some teams choose 1/2 days or hours, but we strongly recommend against that.\na - The best case. This is probably what an expert would say it is. Or what someone outside of eng would think it should take. m - The likely case. This is what you’d arrive at with planning poker or some other consensus estimating technique. b - The worst case. 
This is the “it could possibly take more than…” number. The Math For each task that has been estimated calculate the weighted average and standard deviation:\n$E = \frac{a + 4m + b}{6}$ $SD = \frac{b-a}{6}$ Then calculate the confidence of the project by combining the estimates for each task:\n$E(project) = \sum E(task)$ $SD(project) = \sqrt{\sum SD(task)^2}$ You can then convert this into a confidence interval for the project:\n68% = $E(project) \pm SD(project)$ 90% = $E(project) \pm 1.645 \times SD(project)$ 95% = $E(project) \pm 2 \times SD(project)$ 99.7% = $E(project) \pm 3 \times SD(project)$ 95% is usually the target.\nWith a spreadsheet this is pretty easy to do.","lastmodified":"2025-05-09T00:00:00-04:00","tags":null,"title":"3 Point Estimate"},"/radar/techniques/access-on-demand":{"content":"Access on Demand (AoD, a.k.a. On-demand Access) is a technique in which no standing access is granted to sensitive resources like production systems. Instead there is a lightweight audited process to elevate access for a limited time period.\nTo meet even the most strict compliance frameworks, all that is needed is the ability for the user to declare a reason for the increased access and the duration, which is then logged. Therefore we recommend trying this technique with your existing SSO provider to see if it is viable.\nThere are many ways to implement this type of system, but a common one that works for SAML and OIDC workflows is as follows:\nThe user is granted the groups for which they get standing access, and aod_<group> (for example aod_aws_prod_admin) for ones where they can increase access. A system is provided that allows the user to request aod groups for a set time. The reason given is logged. When the request is made a new session token is provided with the new group added, and an expiry timestamp no longer than the requested duration. The user uses the new token as they would any other token. 
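The 3-point math described in the estimation entry above can be sketched in a few lines of Go. The Task type and the sample numbers are our own illustration; the formulas are the ones from the entry:

```go
package main

import (
	"fmt"
	"math"
)

// Task holds the three estimates, in days.
type Task struct{ A, M, B float64 }

// E is the weighted average: (a + 4m + b) / 6.
func E(t Task) float64 { return (t.A + 4*t.M + t.B) / 6 }

// SD is the standard deviation: (b - a) / 6.
func SD(t Task) float64 { return (t.B - t.A) / 6 }

// Project combines per-task estimates: sum the Es, and take the
// square root of the sum of squared SDs.
func Project(tasks []Task) (e, sd float64) {
	for _, t := range tasks {
		e += E(t)
		sd += SD(t) * SD(t)
	}
	return e, math.Sqrt(sd)
}

func main() {
	tasks := []Task{{1, 2, 6}, {2, 3, 10}}
	e, sd := Project(tasks)
	// 95% confidence interval: E ± 2*SD
	fmt.Printf("%.2f ± %.2f days\n", e, 2*sd)
}
```

Note how the worst-case estimates dominate the spread: a wide (b - a) on even one task visibly widens the whole project's interval.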
Each system looks for authorization using the non-aod groups (for example aws_prod_admin), and thus access is granted if appropriate. ","lastmodified":"2025-05-10T00:00:00-04:00","tags":null,"title":"Access on Demand"},"/radar/techniques/agile-software-development":{"content":"In order to ensure credit is given we won’t restate the manifesto here and instead redirect you to the Manifesto for Agile Software Development. The software industry as we know it owes much to this manifesto.\nThe single most important thing is that we are people trying to deliver value and get paid. Keep practices that make this easier and possible. Dispose of those that get in the way and learn from your mistakes. This should be adopted by all teams as the foundation for their processes.","lastmodified":"2025-05-09T00:00:00-04:00","tags":null,"title":"Agile Software Development"},"/radar/techniques/cattle-not-pets":{"content":"Once upon a time we treated servers like pets. We gave them names, allowed them to become unique, and spent a lot of time caring for them individually. This didn’t scale, so we increasingly made the pets do more and more, and it was catastrophic when they died.\nCattle are numbered, identical, fungible (interchangeable and replaceable without loss of value), follow instructions, and are replaced when they fail. This facilitates almost all modern software practices. Things like containers, Kubernetes, Software as a Service, or Cloud wouldn’t exist without this technique. Therefore it is an absolute must.\nReferences https://cloudscaling.com/blog/cloud-computing/the-history-of-pets-vs-cattle/","lastmodified":"2025-05-10T00:00:00-04:00","tags":null,"title":"Cattle Not Pets"},"/radar/techniques/chatops":{"content":"ChatOps means two distinct things: 1) sending alerts to a chat application, and 2) using a bot to receive commands from a chat app. Chat apps like Slack strongly recommend this model. 
They even include hundreds (maybe thousands) of integrations specifically targeted at these use cases. However, for development work, this approach is often a mistake. It is either ignored or, more likely, slows the team down due to constant distractions.\nIn almost every case ChatOps is a bad idea for development work, and thus we put it in the hold ring. While it may be entertaining for generating memes, it is not suitable for critical tasks.\nChatOps for Alerting In DevOps / DevSecOps alerting is very important. It is the way that you know something is wrong and that your attention is actually needed. You need to leave that space clear for only the most important alerts, and thus mixing that type of alerting with co-worker chats is a bad idea.\nChatOps for Actions Triggering automation from a chat program seems like a boon. Who wouldn’t want to force-encrypt all their drives right before they walk onto an airplane for a 5-hour trip… Anyone in DevSecOps.\nAlso, it is really inadvisable to give your chat software access to your critical infrastructure. Therefore any investment made in this area will quickly fail a standard audit, leaving you with no viable path. It is better not to fall into this trap to begin with.","lastmodified":"2023-07-23T00:00:00-04:00","tags":null,"title":"ChatOps"},"/radar/techniques/cloud":{"content":"At its most basic, “Cloud” just means someone else’s computer. More importantly, it refers to shared infrastructure that is managed and configured programmatically. This infrastructure can be used as a base for customers to build on. It is also known as Infrastructure as a Service (IaaS) or Platform as a Service (PaaS).\nHistorically, AWS was the first true cloud provider. 
Amazon sought to offset the operational costs of infrastructure that was heavily utilized during Black Friday and the lead-up to Christmas but remained largely idle at other times. Their revolutionary approach eliminated the need to consider physical hardware, replacing it with configurable options.\nMany years on, all the major cloud providers have the same set of base offerings, like object storage, network-attached storage, and VMs. On top of that they also have ubiquitous managed offerings like Postgres, Kubernetes, and Serverless. Specifics are different, but the concepts are all the same. And for this reason, if you are a modern DevOps engineer then you must know the cloud concepts.","lastmodified":"2025-05-11T00:00:00-04:00","tags":null,"title":"Cloud"},"/radar/techniques/cloud-lift-and-shift":{"content":"Cloud Lift and Shift was a term coined in the early days of the Cloud. Companies wanted to migrate their datacenters to the cloud to lower cost, but they didn’t want to change anything about their workloads. This approach was largely unsuccessful at the time.\nCurrently, it has come to mean switching between similar services of the different cloud providers as a way to save cost. Very commonly it is migrating workloads (stateful and stateless), databases, and files between the different clouds chasing cloud credits. It is a losing game, and it is far better to lean into a Hybrid Cloud mentality where workloads remain in the most effective cloud provider, and you bridge your cloud accounts securely.","lastmodified":"2025-07-10T00:00:00-04:00","tags":null,"title":"Cloud Lift and Shift"},"/radar/techniques/code-review":{"content":"A code review should involve both human (if possible) and machine review. Any opposition, such as unresolved comments, failed automated checks, or policy violations, should block the review from being merged. 
Luckily most source control systems like GitHub, GitLab, and Gerrit have a code review process that is built into their merge process. It should be the entry point to the Continuous Integration process, and should be adopted by all teams.\nEven if you are a party of 1, having machine review centralized in a standard code review process will save you a lot of headaches.\nNomenclature GitHub calls it a Pull Request, sometimes abbreviated PR.\nGitLab calls it a Merge Request, sometimes abbreviated MR.","lastmodified":"2025-05-11T00:00:00-04:00","tags":null,"title":"Code Review"},"/radar/techniques/declarative-iac":{"content":"Infrastructure as Code (IaC) is the idea that all infrastructure can (and should) be described as executable code. This allows machines to continually reconcile the code against reality and eliminate drift before it becomes costly. “Declarative” means that the IaC file has minimal logic and declares the end-state.\nDeclarative IaC should be adopted, and CDKs and other forms of Imperative IaC should be avoided.\nOne of the biggest complaints about Declarative IaC is that it often leads to repetitive code. Additionally, the logic can sometimes be unclear or difficult to follow. However, most IaC tools offer code encapsulation features that address these concerns effectively.","lastmodified":"2025-01-05T00:00:00-05:00","tags":null,"title":"Declarative IaC"},"/radar/techniques/dry":{"content":"Don’t Repeat Yourself (DRY) is a technique aimed at reducing repetition of code.\nWhile the assertion that repeated code is harder to maintain and therefore error-prone is true, the way that DRY is implemented is often wrong and leads to increasingly complex code with more abstractions than necessary. It also leads to premature optimization and various other anti-patterns which increase the blast radius of changes.\nIn the IaC world, minimizing blast radius is paramount. 
Therefore a little repeated code here and there is a small price to pay for a robust infrastructure.\nResources https://en.wikipedia.org/wiki/Don%27t_repeat_yourself https://dev.to/ralphcone/please-do-repeat-yourself-dry-is-dead-1jbg https://dev.to/jeroendedauw/the-fallacy-of-dry ","lastmodified":"2025-01-05T00:00:00-05:00","tags":null,"title":"DRY"},"/radar/techniques/enterprise-ready":{"content":"Enterprise Ready is a series of common features that are needed by enterprise companies, like single sign-on, RBAC, and SLAs. If you are trying to attract enterprise customers then the features listed here are table-stakes. But more broadly they are a good set of features to have in general.\nYou can think of it as the 12 Factor App for businesses targeting Enterprise customers.","lastmodified":"2025-05-12T00:00:00-04:00","tags":null,"title":"Enterprise Ready"},"/radar/techniques/gitops":{"content":"GitOps is a key part of ShiftLeft. The idea is that the Pull Request (aka Merge Request) process that is common for developers is an ideal way to handle *-as-code items like IaC (Infrastructure as Code), PaC (Policy as Code), or SaC (Security as Code). Just as a developer PR would kick off a build pipeline, a GitOps PR would kick off a deployment pipeline. For this reason it is a strong adopt.\nBenefits of GitOps GitOps:\nIs able to describe complex multi-part pipelines in a single place Is able to track changes to config over time Is able to fully attribute change source (i.e., committer) and approvals Is able to design complex branching and merging behaviors easily Has readily available tooling Challenges with GitOps Even the best tool makes certain assumptions that have to be solved before the tool can be used to its full potential.\nGitOps:\nAssumes things can be described via code. This is getting better with tools like Terraform, Ansible, and Kubernetes but there is a lot of legacy out there. Separates identity from actions. 
Again, mostly a legacy concern, where legacy systems attribute a human user to each action. CD pipelines are always triggered by a machine, so there is an extra step needed for attribution. Must have permissions checking and review at the file/directory level to ensure triggering by only the correct folks. This is related to the separation of actions and identity. Since the CI will run using its own identity and permissions, GitOps can be a means to escalate privileges. The solution might be as simple as a CODEOWNERS file, or using a branch permission model. Must have a good means of handling sensitive info outside of Git. Should have a policy engine to check compliance. Something like Terrascan, or OpenPolicyFramework, which can lint the change before it is applied. Makes it difficult (but not impossible) for machines to effect change. Humans are good at handling conflicts, but machines are not. Also, adding a git repo mid-stream might be unnecessary as there are usually better ways to handle state between CD stages. ","lastmodified":"2023-03-30T00:00:00-04:00","tags":null,"title":"GitOps"},"/radar/techniques/hybrid-cloud":{"content":"We are big fans of Hybrid Cloud, if and only if you do it correctly. The worst thing you can do is flop back and forth between similar services across clouds (also known as Cloud Lift and Shift). However, if you utilize the best of each cloud’s offering and allow your developers to choose the ideal environment for their needs, then Hybrid Cloud can be a multiplier.\nIt is important that you bridge your Cloud accounts securely either via SDWAN, VPN Gateway, or Beyond Corp.","lastmodified":"2025-07-10T00:00:00-04:00","tags":null,"title":"Hybrid Cloud"},"/radar/techniques/imperative-iac":{"content":"Imperative IaC is using a programming language to generate the IaC code. This is often seen as a boon since the number one complaint about Declarative IaC is that it is not DRY enough. 
And being able to use the suite of programming refactoring tools seems great.\nHowever, the two main goals of IaC are decreased blast radius and security over time, which Imperative IaC makes impossible. Almost all clients we have worked with have regretted the choice of Imperative IaC tools like Pulumi and AWS CDK. While imperative IaC code might be smaller, the side effects of simple changes are vastly more complicated to understand, which makes responding to change risky and error-prone.\nMore often than not our first order is to use a tool like terraformer to generate raw IaC from the state of the cloud account and then painstakingly refactor the IaC correctly as it should have been originally. Through use of proper IaC refactoring techniques we arrive at code that is both clean and maintainable with minimal repetition, rendering the promises of Imperative IaC moot, and thus we move this to hold.","lastmodified":"2025-01-05T00:00:00-05:00","tags":null,"title":"Imperative IaC"},"/radar/techniques/inbox-pattern":{"content":"The inbox pattern is an effective means to guarantee delivery of a work item. You might be familiar with it from email. It is a highly effective way to make sure that work is completed, or retried until it completes. It is lightweight and can be used effectively in much more complex systems like work queues, or workflows. It should be adopted before more complex designs.\nsequenceDiagram\n  actor User\n  box Service\n    participant API\n    participant TaskExecutor\n  end\n  participant DB as Database\n  User ->> API: request\n  API ->> DB: insert job\n  API ->> User: accepted\n  loop\n    DB -->> TaskExecutor: get job\n    TaskExecutor ->> TaskExecutor: do work\n    TaskExecutor ->> DB: record results\n  end\nEffectively, the service serializes enough information into the database to perform the work, along with job details such as the date and completion status. If the TaskExecutor might fail before completing a long-running job, use a reworkAfter timestamp instead of a simple boolean status. 
When the TaskExecutor picks up a job, set the reworkAfter timestamp to the current time plus twice the maximum allowable time. Any job with a null or expired reworkAfter timestamp can be reprocessed. Continue processing all jobs until completion, repeating as necessary.\nAKA Outbox Pattern\nThe outbox pattern is essentially the inverse of the inbox pattern. In this approach, the service writes outgoing messages or events to an outbox table in the database during normal operations. The TaskExecutor then reads from the outbox and sends these messages to external systems or services.\nFor example, consider an e-commerce application where order confirmations need to be sent to customers. When an order is placed, the service writes the order details to the outbox. A separate TaskExecutor processes the outbox, sending confirmation emails to customers. This ensures reliable delivery even if the email service is temporarily unavailable, as the TaskExecutor can retry sending messages until successful.","lastmodified":"2025-05-19T00:00:00-04:00","tags":null,"title":"Inbox Pattern"},"/radar/techniques/planning-poker":{"content":"Scrum Story Points are probably the single biggest stumbling block for team members. According to Scrum they shouldn’t be related to hours, but should be related to how easy or hard something is. Unfortunately, this concept can be challenging for many team members to grasp, so games like Planning Poker have to be adopted.\nAt its most basic, everyone on the team gets a vote. They vote with cards they pull from their hand. Their hand is made up of Story Point cards for each valid set of points in the system being used. Everyone shows their cards. If the cards shown are all similar then the story is updated and folks move on. If the cards are different then debate is had and another vote is taken. Continue until everything is scoped.\nIt is better than having no structure, but it doesn’t scale well. 
In many cases, much of the debate will center around what a “point” actually means. Older tickets often have to be rescoped as the session moves forward. We are cautious about using this technique.\nWhile not perfect, we find that 3 Point Estimation is usually a better solution for estimation.","lastmodified":"2025-05-09T00:00:00-04:00","tags":null,"title":"Planning Poker"},"/radar/techniques/scrum":{"content":"Scrum is one of the more popular Agile Software Development techniques. Interestingly enough, it is just Waterfall Software Development rebranded. It is one of the few that you can purchase training and certification for, which should be your clue that Scrum really isn’t agile in the strictest sense. However, in comparison to other techniques like Six Sigma it is much lighter weight. There are some good points which you should try in whatever development process you use.\nThe good:\nSprint Planning Scrum Team Definition of Done / Acceptance Criteria Retrospective The bad:\nDaily Scrum / Standup Backlog Grooming Points Burn-down Charts The ugly:\nInfinite Backlog Sprints Moving Goal Posts / Crunch Culture “Full stack” Ceremonies Most of the good parts of Scrum align the team’s focus with delivering value. Most of the bad and ugly parts are ceremonies, which folks often confuse with value. This can lead to a death spiral of more formal process being used as a “fix” for a delivery problem.\nCare must be taken, as following Scrum does not guarantee results. Instead Scrum should be treated like a collection of lightweight techniques that can be applied or ignored as needed.","lastmodified":"2025-05-09T00:00:00-04:00","tags":null,"title":"Scrum"},"/radar/techniques/sre":{"content":"Site Reliability Engineer (SRE) is a very specific implementation of DevOps that Google uses. However, outside of Google, DevOps usually has a broader mandate. 
You have to decide if a narrowly focused SRE role is right for you.\nReference https://sre.google/ ","lastmodified":"2025-05-11T00:00:00-04:00","tags":null,"title":"SRE"},"/radar/techniques/test-pyramid":{"content":"Test Pyramid is a testing technique that places a higher emphasis on smaller, faster, and more isolated testing (i.e., unit testing). It is the testing equivalent of shift-left and results in better overall quality and faster shipment. It should be a key component of your testing strategy, if not the entire strategy.","lastmodified":"2025-03-29T00:00:00-04:00","tags":null,"title":"Test Pyramid"},"/radar/techniques/zanzibar":{"content":"Google Zanzibar is a white paper on how Google handles fine-grained permissions authorization at scale. The first 2 sections of the document (Introduction & Model, Language, and API), and half of the third (Architecture and Implementation), are broadly applicable to anyone concerned about permissions. The rest of the document is really about Google’s scale, which is less universal. If you need to implement a permissions model then this is a good one.\nThe Permissions Schema Basically, the data schema is a simple tuple of <object>#<relation>@<user> that represents a relationship graph. The permissions are derived by asking if a certain relationship is explicit or implied by the graph. If yes, permission is granted; if not, it is denied.\nThese tuples are stored in a database and evaluated dynamically at runtime. The creation of a tuple is effectively adding a permission, while removing one is like removing a permission.\n<object> is <namespace>:<id> but is basically anything that uniquely identifies something. This might be a virtual item like a report (report:5), or an abstract thing like a group (group:reporters). Basically anything that isn’t a user.\n<user> is a <user id> (e.g., 10) or an additional relation (<object>#<relation>), thus creating a graph. 
A user can be a human id (email) + context (usually device) or a machine (service account).\n<relation> is a string identifier for the type of relationship between object and user.\nSome ways relationships are mapped:\nUser 10 is an owner of doc:readme - doc:readme#owner@10 User 11 is a member of group:eng - group:eng#member@11 Members of group:eng are viewers of doc:readme - doc:readme#viewer@group:eng#member doc:readme is in a folder:A - doc:readme#parent@folder:A#… #… is a self-referential relationship. Basically it is a means to draw a graph between two objects instead of between an object and a user. Eval First, we - as the implementors - would define a finite set of relations. Then we would add check(user, object, <relation>) to our code. The <relation> would be the hard or soft-coded part in the code. The user would come from the user session, and the object would be from the request.\nChecking of rules is then a matter of graph traversal: if you are able to map between a user and an object then the check passes; if not, it fails. For example, if we want to check(11, “doc:readme”, “viewer”) then the eval looks like this:\nDoes doc:readme#viewer@11 exist? No. Does doc:readme have any other relations that satisfy viewer? Yes, group:eng#member Does group:eng#member@11 exist? Yes. doc:readme#viewer@11 is true and can be cached. Defining a Namespace Schema A namespace is basically a rule set that dynamically implies relationships. Basically, it defines how to map relations.\nLet’s say you want to add a relationship “editor”. You previously had “owner” and “viewer”. Owner could edit but also had other abilities. To correct the overly broad use of owner, first update the code and fix any use of check(..., \"owner\") to check(..., \"editor\") for any case where you care if they can edit.
Then you update the namespace schema to rewrite / remap relations:\nrelation { name: “editor”, userset_rewrite { union { child { _this {} } # all things explicitly given the editor relation # Anything that was explicitly given the owner relation is also an editor child { computed_userset { relation: “owner” } } } } } The above uses _this, which is the collection of all explicit tuples. Then it adds computed_userset, which is a search pattern on the relationship owner. Effectively this means when someone checks for editor, anyone that has the owner relation will return true as well. By doing this “editor” can be added having the same meaning as “owner”, and then the two can diverge as needed.\nDefining permissions Zanzibar is based on permissions, so it is best to define fine-grained permissions vs coarse permissions like “admin”, “editor”, and “viewer”. For each object-type think about the permissions that make the most sense. For example, for a bucket you’d likely want a separate permission for listing the content vs downloading a file.\nOnce the fine-grained permissions are defined it is time to update the code to use them. For example, the service that serves GET /bucket/<name> would check for the <name>#list relation, while GET /bucket/<name>/<path> would check the <path>#download permission.\nThen with the namespace schema we can remap behaviors; the example above makes owners also editors.\nRoles vs permissions Zanzibar is concerned about permissions, and evaluating those permissions. That is it. However, Roles and other higher-level constructs are needed by humans. They can be modeled via relations, since it is a graph.\nThe recommended implementation is to define the fine-grained permission set in the namespace schema and check that in the code.
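The check() traversal described in the Eval section can be sketched as a direct tuple lookup followed by recursive expansion through userset references. This is a toy in-memory version under assumed names (the store, key format, and helper are mine, not the paper's API):

```go
package main

import (
	"fmt"
	"strings"
)

// tuples is a toy in-memory store of "<object>#<relation>@<user>" strings.
var tuples = map[string]bool{
	"group:eng#member@11":                true,
	"doc:readme#viewer@group:eng#member": true,
}

// check walks the relationship graph: either the direct tuple exists, or
// some userset ("<object>#<relation>") granted the relation on the object
// and the user satisfies that userset (checked recursively).
func check(user, object, relation string) bool {
	if tuples[object+"#"+relation+"@"+user] {
		return true
	}
	prefix := object + "#" + relation + "@"
	for t := range tuples {
		if strings.HasPrefix(t, prefix) {
			userset := strings.TrimPrefix(t, prefix) // e.g. "group:eng#member"
			if obj, rel, ok := strings.Cut(userset, "#"); ok {
				if check(user, obj, rel) {
					return true
				}
			}
		}
	}
	return false
}

func main() {
	fmt.Println(check("11", "doc:readme", "viewer")) // true, via group:eng#member
	fmt.Println(check("12", "doc:readme", "viewer")) // false
}
```

A real implementation also needs cycle detection, the namespace-schema rewrites shown above (the owner → editor union), and caching; this only shows the shape of the traversal.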
Then in the app / service create a role or group style object.\nUse a tuple to bind a group to a relation on an object (e.g., doc:readme#viewer@group:eng#member).\nThen use another tuple to bind a user to a group (e.g., group:eng#member@11).\nGoogle Scale Internally Google uses a system that is implied by the whitepaper. But at their scale it would be impossible to check permissions using a vector check as explained in the paper. Instead they reduce all permissions to their simplest form. Checks - for access - are executed against the flattened dataset. This reduces the complexity of the check and also increases the speed and accuracy.\nThe process that does this reduction works in a similar way to what is explained in the Prodspec and Annealing paper.\nNew Enemy problem and Zookies Note\nThis is a problem because Google has to cache relations in order to meet SLAs.\nA New Enemy problem happens when a person who previously had access continues to be allowed even after the permission is removed (usually due to caching).
A zookie (Zanzibar Cookie) is a low-cost means to mark the oldest cache entry that can be used.\nSetup:\nLex has reader to plan A Kara has admin to plan A Kara needs to write info that Lex cannot see to plan A If Lex sees the secret info he will do bad things New Enemy Scenario:\nLex reads plan A (seeding the cache) Kara removes Lex’s access to plan A Kara adds secret information to plan A Lex reads plan A (cache returns true) so has access Lex does bad things With Zookies:\nLex reads plan A (seeds cache with Zookie T1) Kara removes Lex’s access to plan A (plan A now has Zookie T2) Kara adds secret information to plan A Lex reads plan A Check sees cache has Zookie T1, but document has Zookie T2 Cache is expunged Full path is recalculated Lex is denied access Reference https://authzed.com/blog/what-is-zanzibar/","lastmodified":"2025-05-07T00:00:00-04:00","tags":null,"title":"Zanzibar"},"/radar/tools/capistrano":{"content":"Capistrano is a remote server automation and deployment tool written in Ruby.\nWhile this tool helped usher in the practice of DevOps using Ruby as a DSL, it was born before the time of Declarative IaC. And as with many Imperative IaC tools, the IaC is error-prone and becomes unmaintainable long term. It has long since been replaced by better tools and should be abandoned if at all possible.\nThings to use instead:\nAnsible Terraform Kubernetes ","lastmodified":"2025-01-05T00:00:00-05:00","tags":null,"title":"Capistrano"},"/radar/tools/container-structure-test":{"content":"Container Structure Test provides a powerful framework to validate the structure of a container image. These tests can be used to check the output of commands in an image, as well as verify metadata and contents of the filesystem.\nYour code has unit tests, right? So should your container. At a minimum you should have existence tests for the important files, like libraries that are needed, and even the main binary itself.
You should have tests that verify important labels like the commit hash are present, that any ports opened by default are declared, and that default ENVs exist.\nYou can even do things like run your binary to make sure it produces expected behaviors. Since containers are a must in this day and age, so too is this tool.\nSmoke Tests\nThe goal with this tool is the basic smoke test and some very light usability testing. This is not an integration test suite. You really are just trying to make sure the container will at least run.","lastmodified":"2025-01-05T00:00:00-05:00","tags":null,"title":"Container Structure Test"},"/radar/tools/cue":{"content":"CUE is a schema validating tool that also does code expansion / generation. It is a different take on Policy-as-code and is made to provide generic guardrails in its own language format, which can be translated into other ecosystem-compliant formats like JSON Schema, Protobuf, or OpenAPI.\nWhile it is an interesting take on the problem, at this point in time, we don’t recommend it as there aren’t very compelling use cases. However, it might be just what you are looking for.\nWith all that being said, it is a good public demonstration of Google’s General Config Language (GCL) that is used internally to configure much of what Google does. You can see that influence deeply inside the Design of CUE, so even if you don’t use CUE directly it is worth a read.\nGoogle needed a way to write configurations that could be easily expanded and lightly validated. These configurations would be entirely private to Google, so there was no need to have a formal specification. In fact, separating the formal spec from validation and expansion would have been cumbersome. Their language defines a single format that can be used to generate and loosely validate configurations of any type that other systems needed, such as BorgCfn for Borg, JSON, XML, Protobuf, etc.
The validations were not meant to be exhaustive, but rather to give programmers confidence that they were not making obvious mistakes. CUE is a public implementation of that idea.","lastmodified":"2025-07-10T00:00:00-04:00","tags":null,"title":"CUE"},"/radar/tools/docker-compose":{"content":"Docker Compose is a development tool for defining and running multi-container applications. It is included in the Docker ecosystem, but 99% of all container deployments are going to be Kubernetes. Therefore, you want a tool that is Kubernetes aware, like Skaffold, instead.","lastmodified":"2025-01-05T00:00:00-05:00","tags":null,"title":"Docker Compose"},"/radar/tools/helm":{"content":"Helm is the package manager for Kubernetes. It is a critical tool in the Kubernetes ecosystem—it is essential. If you run a Kubernetes cluster then you are very likely to eventually need to install a Helm chart on your cluster.\nArtifactHub lists most public Helm charts you may need.","lastmodified":"2025-07-13T00:00:00-04:00","tags":null,"title":"Helm"},"/radar/tools/k9s":{"content":"k9s is a terminal-based console for Kubernetes. It provides basic functionality, such as viewing the state of Kubernetes objects, and offers advanced features like resource filtering, log streaming, and namespace management. These capabilities make it an essential tool for efficiently managing Kubernetes clusters from the terminal.","lastmodified":"2025-05-19T00:00:00-04:00","tags":null,"title":"K9s"},"/radar/tools/pulumi":{"content":"Pulumi generates IaC based on a programming language of your choice.\nAs with much Imperative IaC, it suffers from serious drawbacks. However, for quick-hack lab environments - where the maintainer has little operational experience - it proves a compelling approach.\nWhile many similar tools are “hold”, we leave this at “assess” because Pulumi uses Terraform providers under the hood, meaning that the actual actions taken and state maintained are those of Terraform.
So this isn’t the worst stepping stone in the world of IaC.\nIf you are in a company that cannot or will not use Terraform, and you have a mature enough team to actively avoid the pitfalls of Imperative IaC, then this is a very competent solution. Otherwise steer clear, because you are going to end up in a worse place than just building by hand.","lastmodified":"2025-01-05T00:00:00-05:00","tags":null,"title":"Pulumi"},"/radar/tools/skaffold":{"content":"Skaffold reports itself as “Fast. Repeatable. Simple.” And it is. Skaffold is to Kubernetes what Docker Compose is to Docker. However, it provides a build, load, watch, rebuild loop directly on K8s clusters. This makes Skaffold an indispensable developer tool for any Shift Left org using Kubernetes.\nAs a developer tool for a certain type of org it is clearly categorized as “adopt”. However, despite Google’s insistence to the contrary, it is not a good Continuous Deployment tool. It will always need to be replaced eventually, but we find that the same mindset that chooses Skaffold for CD has often taken other shortcuts that make migration harder than one would expect. Thus we are cautious.","lastmodified":"2025-01-05T00:00:00-05:00","tags":null,"title":"Skaffold"},"/radar/tools/terraform":{"content":"Terraform has been the de facto standard for IaC. However, it was knocked out of its “Adopt” classification in the Tech Radar due to the commercial license change. This means that it has become too cost-prohibitive for some companies, which have invested in other solutions.\nAt the time of this writing OpenTofu – an open-source fork of Terraform created before the license change to maintain a free and community-driven alternative – is a compelling option. However, it will diverge eventually, making the choice confounding. If you can use Terraform, do so. Otherwise, there are plenty of good alternatives.\nA later announcement by IBM on the purchase of Hashicorp made the license change understandable.
IBM does have a long history of supporting the open source community, but they are a commercial entity. Therefore, we remain cautious about Terraform’s future as the de facto standard.","lastmodified":"2025-01-05T00:00:00-05:00","tags":null,"title":"Terraform"},"/radar/tools/wireguard":{"content":"Wireguard bills itself as a fast, modern VPN solution. It does this by limiting interoperability to only itself, and tracking improvements via software updates. Each client and server can only be a certain version skew from each other or communication will fail. If you are in control of both ends of the communication and are in a place where you can keep the VPN software current then Wireguard is a viable solution.\nWe are glad it exists. However, its use case is not a wide one.","lastmodified":"2025-04-23T00:00:00-04:00","tags":null,"title":"Wireguard"},"/resources/bash-tips-and-tricks":{"content":"The following are common useful tricks for Bash programming.\nDirectory Containing the Script Many times it can be useful to get the directory where the script lives even if PWD is set to something else.\nAPP_DIR=\"$( cd \"$( dirname \"${BASH_SOURCE[0]}\" )/.\" && pwd )\" source \"$APP_DIR/../lib/mylib.sh\" Iterate over lines in a file Warning\nThe file will need an empty line at the end or you will miss the last line.\nwhile IFS= read -r line; do echo \"$line\" done < file Iterate over command output Very similar to files, command output can be treated like a file using the <() operator. This, for example, can be used to grep certain strings out, or to manipulate the line ahead of iterating.\nwhile IFS= read -r line; do echo \"$line\" done < <(ls .)
Logical OR and AND With the test ([) command\nif [ \"$a\" == 1 -a \"$b\" == 1 ]; then echo \"a AND b\" fi if [ \"$a\" == 1 -o \"$b\" == 1 ]; then echo \"a OR b\" fi With the built-in [[\nif [[ \"$a\" == 1 && \"$b\" == 1 ]]; then echo \"a AND b\" fi if [[ \"$a\" == 1 || \"$b\" == 1 ]]; then echo \"a OR b\" fi Logging Very often logging is useful, especially in bash. However, many times logging clobbers proper output. The best solution is to use standard error for logging. Also, it is usually useful to have levels of logging, so you can put logging lines in your code that are unused by default.\nfunction now() { TZ=\"UTC\" date +\"%Y%m%d%H%M%S\" } function log() { echo \"$@\" >&2 } function debug() { if [[ -n \"$DEBUG\" ]]; then log \"$(now) DEBUG:\" \"$@\" fi } function info() { log \"$(now) INFO:\" \"$@\" } function warn() { log \"$(now) WARN:\" \"$@\" } function error() { log \"$(now) ERR:\" \"$@\" } function fatal() { log \"$(now) FATAL:\" \"$@\" exit 1 } ## Usage debug \"This is a debug\" info \"this is an info message\" References Bash-hackers wiki (bash-hackers.org) Shell vars (bash-hackers.org) Bash Guide (mywiki.wooledge.org) ","lastmodified":"2025-04-18T00:00:00-04:00","tags":null,"title":"Bash Tips and Tricks"},"/resources/container-tips-and-tricks":{"content":"The following are useful tricks for containers.\nTools Containers It is often useful to put tools like gcloud, aws, and others in containers and use the containers in place of installing programs directly on the computer.
This makes the commands more portable, but also makes it easier to upgrade and rebuild should something go wrong.\nFirst, install docker-desktop for your given OS, or some other container runtime.\nSecond, create a script like the following and put it somewhere it can be loaded.\n#!/bin/bash APP_DIR=\"$(cd \"$(dirname \"${BASH_SOURCE[0]}\")/..\" && pwd)\" source \"$APP_DIR/lib/log.sh\" # docker_run does some basic stuff to set up a # `docker run <standard args>` so that the rest of the args # can be blindly passed in. function docker_run() { local args=(\"--rm\") # Make a local array of args # If standard out/in is a terminal if [[ -t 0 ]]; then args[\"${#args[@]}\"]=\"--interactive\" args[\"${#args[@]}\"]=\"--tty\" else info \"No input\" fi debug \"Args:\" \"${args[@]}\" # Note: if you use a different docker runtime (like podman) # then simply switch the args and the command here. docker run \"${args[@]}\" \"$@\" } Third, create a wrapper for the function you want.
For example, the AWS CLI.\n#!/bin/bash # Trick from the bash tips and tricks page APP_DIR=\"$(cd \"$(dirname \"${BASH_SOURCE[0]}\")/..\" && pwd)\" # Load the docker script from above source \"$APP_DIR/lib/docker.sh\" # this is the container name you will be building IMAGE_TAG=\"aws-cli\" # Setup the local directories that get mounted # That way the auth state gets persisted between # runs of the container SRC_PATH=/workdir mkdir -p \"$HOME/.aws\" mkdir -p \"$HOME/.kube\" MOUNT_PATHS=( -v \"$APP_DIR:$SRC_PATH\" -v \"$HOME/.kube:/root/.kube\" -v \"$HOME/.aws:/root/.aws\" ) # Now pass it all to docker_run docker_run \"${MOUNT_PATHS[@]}\" --workdir \"$SRC_PATH\" --entrypoint=\"/usr/local/bin/aws\" \"$IMAGE_TAG\" \"$@\" Finally, build / cache the needed CLI tool as a local container.\n# Often there exists a public container with the tool FROM amazon/aws-cli:latest # Often a good idea to label so you know you built it, and when LABEL Maintainer=\"Myself\" LABEL Revision=\"20211008150705\" # Now install any other tools that might help when using # this container. # # kubectl allows for the connection to AWS EKS clusters # JQ is a JSON CLI parser to help with AWS API output. # YQ is the same as JQ but for YAML, so works nicer with # K8s objects.
ENV KUBECTL_VERSION=1.19.6 ENV JQ_VERSION=jq-1.6 ENV YQ_VERSION=v4.7.1 # hadolint ignore=DL3020 ADD \"https://amazon-eks.s3.us-west-2.amazonaws.com/${KUBECTL_VERSION}/2021-01-05/bin/linux/amd64/kubectl\" /usr/local/bin/kubectl # hadolint ignore=DL3020 ADD \"https://github.com/stedolan/jq/releases/download/${JQ_VERSION}/jq-linux64\" /usr/local/bin/jq # hadolint ignore=DL3020 ADD \"https://github.com/mikefarah/yq/releases/download/${YQ_VERSION}/yq_linux_amd64\" /usr/bin/yq RUN chmod +x /usr/local/bin/* RUN yum install -y tar-1.* && yum clean all ENTRYPOINT [\"/bin/bash\"] For bonus points you can set up a nightly CI pipeline to build containers like this and push to a public or private image repo. But I leave that as an exercise for the audience.","lastmodified":"2025-04-18T00:00:00-04:00","tags":null,"title":"Container Tips and Tricks"},"/resources/git-tips-and-tricks":{"content":"The following are useful git tips and tricks.\nCleanup to a given branch Oftentimes a git repo needs to be brought up-to-date with the remote quickly. Using a feature branch style also means that branches need to be cleaned up as well. The following does all that.\nInfo\nIf you copy the below into git-cleanup and put it in your path, then Git will automatically consider it a subcommand.
So within a git repo, running git cleanup main (note the space) will move you to the main branch, sync it with the remote, and delete any synced branches.\n#!/bin/bash branch=\"$1\" if [[ -z \"$1\" ]]; then echo \"No branch supplied, too dangerous to assume\" exit 1 fi set -u git fetch -p git checkout \"$branch\" git pull # shellcheck disable=SC2063 # we are looking for \"*\" while IFS= read -r line; do git branch -d \"$line\" || echo \"Unable to delete '$line', ignoring\" done < <(git branch | grep -v \"^*\" | awk '{print $1}' ) # Note: the line starting \"*\" is the current branch, ignore it # try to safely delete the rest Find the hash of the latest commit # If you know you are in the HEAD commit git rev-parse HEAD # or, one that works regardless git log --pretty=\"oneline\" | head -n1 | awk '{print $1}' Find the files that changed 💡 This works if you have a stable ancestor branch. ANCESTOR=\"master\" COMMIT=\"HEAD\" # replace with commit hash if you could be in detached head # sync the ancestor branch to ensure it has full history git fetch origin \"$ANCESTOR:remotes/origin/$ANCESTOR\" # diff to the ancestor git diff --name-only \"origin/$ANCESTOR\"...\"$COMMIT\" # Unfortunately this doesn't catch if you are on $ANCESTOR.
# So you need to check the last commit if the above produces no output git diff --name-only \"$COMMIT^..$COMMIT\" ","lastmodified":"2025-04-19T00:00:00-04:00","tags":null,"title":"Git Tips and Tricks"},"/resources/golang-tips-and-tricks":{"content":"The following are some basics on GoLang.\nHow to justforfunc: Programming in Go - YouTube GitHub - campoy/justforfunc: The repository for the YouTube series JustForFunc GitHub - campoy/go-tooling-workshop: A workshop covering all the tools gophers use in their day to day life Documentation - The Go Programming Language pkg.go.dev - Documentation on most go packages Go Docker image golang-standards · GitHub - Shows an example project layout and project templates Quick start Install Go: brew install go\nYou may want to use docker which is what build systems will use\ndocker pull golang:1.15.4 docker run --rm -v \"$PWD\":/code -w \"/code\" golang:1.15.4 go build -v Install the Go plugin for your favorite Editor (e.g., go-plus for Atom)\nSetup your repo\nIf a library then it should be in $GOPATH/src/<url path>\nGeneral project layout GitHub - golang-standards/project-layout: Standard Go Project Layout\nUniversal files:\nREADME.md - Every project should have one LICENSE.md - Every project should have one run - A script to easily build or test the code go.mod - initialize as a module via go mod init <name> go.sum - commit this if it exists. /cmd/<name> This is the place for the main package. It should build into an executable. <name> should be the name of the command you want to build. You will build it via go build cmd/<name>.\nNot much code should be here. Just a means to load stuff from /internal, or /pkg.\n/internal Private code should be here.
This is enforced by the compiler.\n/pkg Library code that is ok for others to use.\n/vendor (managed) This directory is (or should be) managed by your dependency manager and not by you directly.\n/build Not universally recommended, but if your build system takes scripts then it should be /build/ci and then linked to whatever location the build system wants it to actually be. Where possible it is just a thin wrapper around run.\nTools go - The general tool for Go delve - Go debugger GitHub - tsliwowicz/go-wrk: go-wrk - a HTTP benchmarking tool based in spirit on the excellent wrk tool goimports - An improved fmt tool that also handles imports Libraries https://awesome-go.com/ - A curated list of awesome Go libraries https://threedots.tech/post/list-of-recommended-libraries/ - 22 libraries known to work in production Command Tools GitHub - spf13/cobra: A Commander for modern Go CLI interactions\nConfiguration Loading GitHub - spf13/viper: Go configuration with fangs Database Examples:\nconcourse/atc/db - Does not use an ORM or Migration library, example of how to do it manually DB Drivers https://golang.org/pkg/database/sql/ - Golang’s built in SQL Library, most have a compatible interface https://github.com/lib/pq - database/sql Postgres Driver https://github.com/jackc/pgx - Postgres specific driver as well as database/sql compatible driver. The specific driver interface is more performant, but the standard interface is available for use with other libraries. https://github.com/Masterminds/squirrel - Fluent SQL generation for golang makes queries easier. DB ORMs https://gorm.io - A Rails-type ORM similar to ActiveRecord. https://github.com/go-pg/pg - Golang ORM with focus on PostgreSQL features and performance https://godoc.org/github.com/Masterminds/structable - Simple Struct to DB mapper / recorder DB Migrations https://github.com/golang-migrate/migrate - Database migrations. CLI and Golang library. This is library agnostic.
It has a BinData source driver that allows SQL migration files to be loaded directly into the binary. https://gorm.io - A Rails-type ORM similar to ActiveRecord, with a similar migration style WARNING: in practice these style migrations are fragile as there is a tight coupling to the record. Git https://pkg.go.dev/github.com/go-git/go-git/v5 - A pure Golang implementation of Git\nGerrit Note\nAs a Google maintained project all Golang code is maintained in a public version of their GitOnBorg system, which uses Gerrit as the review system. To contribute to the main golang code base you have to understand Gerrit.\nGerrit is the code review system used by Google’s public GoLang team.\nChanges (CLs) are created by pushing to a special branch called refs/for/<target branch>. This opens the equivalent of a GitHub PR. With the difference being that in Gerrit a commit is the unit of work, while in GitHub the branch is.\nTopics are a way of associating changes across multiple repos. It is set by using the config push.pushOption = topic=<name>, or doing it via an option during the push: git push ... -o topic=<name>, or git push origin HEAD:refs/for/main%topic=<name>.\nhttps://github.com/andygrunwald/go-gerrit - Gerrit API client\nGraphs https://github.com/dominikbraun/graph - Generic graph data-structure HTTP http - The Go Programming Language\nGitHub - justinas/alice: Painless middleware chaining for Go - Allows Middlewares\nGitHub - throttled/throttled: Package throttled implements rate limiting access to resources such as HTTP endpoints.\nGitHub - justinas/nosurf: CSRF protection middleware for Go.\nHTTP Router / MUX GitHub - vmihailenco/treemux: Fast and flexible HTTP router Logging log - The logging package is enough if all you want to do is add a date Structured Logging In most cases structured logging is what you want.\nslog - slog augments log by adding levels and fields. It is usually enough.\nGitHub - sirupsen/logrus: Structured, pluggable logging for Go.
- Dead project as of 2020-11\nGitHub - rs/zerolog: Zero Allocation JSON Logger\nGitHub - uber-go/zap: Blazing fast, structured, leveled logging in Go.\nGitHub - apex/log: Structured logging package for Go.\nMerging Objects GitHub - imdario/mergo: Mergo: merging Go structs and maps since 2013.\nMetrics (prometheus) prometheus · pkg.go.dev\nParsing YAML https://gopkg.in/yaml.v2 - The commonly used yaml.v2 sigs.k8s.io/yaml - YAML parser from K8s / Helm Use this one when YAML and JSON interop is needed. // Reading unstructured yaml data := map[string]interface{}{} err := yaml.Unmarshal(fileBytes, &data) if err != nil { return nil, err } Static Files GitHub - markbates/pkger: Embed static files in Go binaries (replacement for gobuffalo/packr)\nTesting BDD https://github.com/onsi/ginkgo - BDD library https://github.com/onsi/gomega - Ginkgo’s Preferred Matcher Library TDD https://github.com/smartystreets/goconvey - A nicer testing framework, more TDD like","lastmodified":"2025-04-18T00:00:00-04:00","tags":null,"title":"Golang Tips and Tricks"},"/resources/hyrums-law":{"content":"An observation on Software Engineering\nPut succinctly, the observation is this:\nWith a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody.\nhttps://www.hyrumslaw.com/","lastmodified":"2025-04-18T00:00:00-04:00","tags":null,"title":"Hyrums Law"}}