code

Asynchronous Convergent Coding: A proposal

Robin Michael

04 May 2024 • 5 min read

What are the goals of code review?

This is to me a subjective question. In my opinion they are the following.

Finding missing conceptual issues
Shared understanding of the system
Identification for future enhancements that will be documented or exist in the reviewer/reviewee common framework
A desirable goal is to allow for extension in testing

What code reviews are not for in my opinion

Finding bugs and defects That is a test. A test that is deterministic and repeatable. If a test doesn't exist write, modify, or request the test that proves it.
Performance request changes These are notorious for being thrown around without proof. Example of this is a Quick Sort vs Bucket Sort. A quick sort is a good sort for large data sets, but for small diverse sets a bucket sort will outperform quicksort. Awareness of the underlying goals translated to data workflows determines applicability. This should also be a test if desired, but understanding of the expectations and boundaries would eliminate the majority of these requests. True performance request changes would reveal themselves against the production code ( If you are testing in production why are you reviewing? Read the logs and overall statistics to guide you) and should be changed in a new code request change that is explicit about performance requirements i.e. Needs to handle X requests in Y milliseconds for this Z time period.
Style issues That is for a linter or the language itself to determine. Styles come and go. JavaScript is notorious for this, but it is with all languages programmatic, written, spoken, natural, etc. A linter automatically formats code to whatever the standard of the time is when you open or work on it and is the arbiter of taste on this subject. Conventions are added to the linter as needed.

Preface

This is something that I have been mulling over for awhile Async Code Reviews Are Choking Your Company’s Throughput

TLDR; The above proposal is about collaborating in real time pair programming to get to the stated goals of code review, however I think it is not the correct approach for asynchronous teams.

For asynchronous teams a better analog would be of loosely coupled independent APIs working together as in Company A publishes a public API and then Company B creates an interface to work with the Company A published API, they do not control the API, but do control their handling of it. Wait, wait, wait this is microservices repackaged. I will admit that would be a natural further development down this line.

With that intention the question is: How do we achieve the same application aspects between two individuals writing and reviewing code to achieve a goal within the same department/division/company?

Goals

Cover the original request, bug, feature for intent
Share meaningful understanding of the code and how it relates to the overall codebase
Identify changes as complete
Help the current and future codebase by building valuable tests
Loosely coupling of the code to create strong boundary points, but flexible internal implementations.

Proposal: Convergent Coding

Two or more people are assigned to the original request
Both read the requirements and expectations
First person begins writing the changes
Second person begins writing or modifying the tests
They will reach agreement on the code integration points of their efforts while they build broad understanding of the changes
Keeping the process asynchronous is necessary and delivers many emergent benefits.
- Avoids peer programming influence, each person is doing their own work to reach the goal
- This prevents the situation of the large code change that is looked over as too much to understand and so often is judged by if it runs then it must be good. Now you will have full confidence because you understand the details and are looking to prove it independently as your task
- Avoids missing why decisions are made as asynchronous forces documentation of the details
- Independent accountability is maintained. This is not a new concept think of the following as the reason why this is desirable. Suppose you have a stack of cash bills and want to make sure nothing is lost or stolen. Put two people in one room and have each count the bills and write down the amount and when they leave give the number to the auditor at the door. If the amounts do not match send them back to count again. The example is a bit contrived but is a practice at most cash only systems.

Example Flow

o---o---o---o---o---o  main (Merge Feature Changes and Feature Tests)
   |\         /
   | o---o---o  Feature Changes (Author A 👨🏿‍💻)
   |         \
   |          o---o---o  Feature Tests (Author B 👩🏼‍, approved by Author A 👨🏿‍)
   |\         /
   | o---o---o  Feature Tests (Author B 👩🏼‍💻)
   |         \
   |          o---o---o  Feature Changes (Author A 👨🏿‍, approved by Author B 👩🏼‍)

The programmers work independently on their respective branches and then swap and approve or reject changes by validating against the mirror branch. All changes are merged back to the the main branch in a single merge.

What I would expect to see from this is

An agreed upon interface for the solution that is testable as in this change will make payments do something, the test would be able to send payments of a format to the new code
Both participants will have thorough knowledge of what the original goal is and what the end goal is
Robust test code that will cover, validate, and prove acceptance
The flexibility that when complete the roles are reversed and the assigned person can grasp and run through the others code with the same original specification goal. This means that there are separate branches and each programmer does not make commits in the others branch because then they wouldn't be able to objectively approve or reject the branch changes.
Supports asynchronous working as the common programmatic interface will change little once established or be iteratively versionable like Graphql for example that means each implementor has a wide degree of flexibility to work under. This is why loosely coupling the codepoints is something that is required to adhere to.

Caveats

At the beginning of this process the amount of work is going to be roughly the same effort for both as the interface touchpoints in the codebase are not well defined, but as they harden to expectations this will reduce long term.
Asymptotically the test code writer will do less work because the tests capabilities will increase faster than the feature code.
In a language like .NET the unit of distribution is the dynamically linked library which means I would recommend the touchpoints as the shared dlls. As in there will be a Model dll shared between test code and feature code and an Interface dll shared between test and feature code. These can grow or shrink as necessary, but the constraint is that the Model dll should have 0 project/solution dependencies and the Interface dll can only have the Model dll as a project reference dll. I have done this system before successfully and it enforces separation of concerns and clarity.

Final Code `"Review"` to Code `"Audit"`

There will still be a final code review step, but it would be to give the above finished product to another person than the original two and they would run the tests and perform a code audit as in while testing it does it meet the original goal.

They would run the tests, do interactive user like actions and observe the results. They may have different data and after reading the original request may point out that the code does not do something. They possibly could run the code through linters if missed etc., but their goal is to understand the changes and thus another person is gaining shared knowledge of a new change to the codebase.