The following improvements were made:
- Argument parsing was ported to argparse, which is the more idiomatic choice for Python scripts nowadays
- A strictness command-line option was added so that build bots can fail when the results differ from the reference results
- The test should no longer fail when the build command file contains an empty line
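
For reviewers, the three changes above can be pictured roughly as follows. This is only a minimal sketch, not the actual script: the flag name `--strict`, the positional `commands_file`, and the helpers `build_parser`, `read_commands` and `exit_code` are hypothetical names chosen for illustration.

```python
import argparse


def build_parser():
    """Build an argparse parser (all option names here are hypothetical)."""
    parser = argparse.ArgumentParser(
        description="Run build commands and compare against reference results."
    )
    # Hypothetical positional: file with one build command per line.
    parser.add_argument("commands_file", help="file listing the build commands")
    # Hypothetical strictness switch: off by default, so local runs only report.
    parser.add_argument(
        "--strict",
        action="store_true",
        help="exit with a non-zero status if results differ from the reference",
    )
    return parser


def read_commands(lines):
    """Return the stripped, non-empty lines; blank lines are skipped."""
    return [line.strip() for line in lines if line.strip()]


def exit_code(differences, strict):
    """Return 1 only when strict mode is on and at least one difference exists."""
    return 1 if strict and differences else 0
```

With a layout like this, a build bot would pass the strictness flag and rely on the exit status, while a developer running the script without it would still see the differences reported but get a zero exit status.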