The following improvements were made:
- Argument parsing was ported to argparse, which is more idiomatic for Python scripts nowadays
- Added a strictness command-line option so build bots can fail when the results differ from the reference results
- The test should no longer fail when the build command file contains an empty line
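
As a rough illustration of the changes listed above, here is a minimal sketch of how they might fit together. It is not the actual script: the names `build_parser`, `read_commands`, `exit_code`, the positional `command_file` argument, and the boolean `--strict` flag are all hypothetical, and the real strictness option may take a different form.

```python
import argparse


def build_parser():
    # Hypothetical option names; the real script's flags may differ.
    parser = argparse.ArgumentParser(
        description="Compare build results against reference results.")
    parser.add_argument("command_file",
                        help="file with one build command per line")
    parser.add_argument("--strict", action="store_true",
                        help="exit non-zero when results differ from the reference")
    return parser


def read_commands(lines):
    # Drop empty and whitespace-only lines so they are never treated as commands.
    return [line.strip() for line in lines if line.strip()]


def exit_code(differences, strict):
    # Only fail the run when strict mode is on and at least one difference was found.
    return 1 if strict and differences else 0
```

A build bot would pass `--strict` and rely on the process exit status, while a local run without the flag would only report differences.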