You found that the main computation takes over 90% of the processing time, or you found the framework to be slow in general. We got in contact and discussed speeding up your software, after which we shared this assessment with you. This assessment should tell you whether the software project is ready to be shared with Stream HPC.
The first step is to prepare the code for porting or optimization. As it’s not always easy to know what to do, we’ve defined 3 levels of code quality. The higher the quality level, the fewer the obstacles for the project, the lower the costs, the less communication is required and the fewer the frustrations.
Preparing a project for porting / optimizing
The sections below discuss the 3 levels a project can be at. The goal is that you do a self-assessment and write down a detailed answer to each question, not only the final answer.
The code needs to pass everything marked in red: level 1 in full and the high level of level 2. It does not matter if the existing code is written in Matlab, Python, C, C++, Assembly, OpenCL, CUDA or any other language.
The action points of all levels need to be done. When a project is not ready yet, we can assist in improving the code quality, but assume that a lot has to be done by your team. This will be a separate (pre-)project, and the full estimate for the porting/optimization can only be made after that.
If it is not possible to level up the software, or no source files are available, it will be handled as a black-box project or R&D project. Do know that such projects can never be done at a fixed price and are always unique. The generic part of the process is described in this blog. We are experienced in doing such focused R&D projects.
Level 1: Understandability
Goal: Can the software be understood without help from the main developers?
- Are the algorithms explained in, e.g., scientific papers?
- Do alternate implementations exist? For example, in Python or Matlab.
- Optional: is there a presentation/overview on the algorithm?
- Is there a software design document?
- Is test-data provided that can be used to run the code?
- Are all functions documented?
- Is it clear what each (part of the) function is doing?
- Are all variables documented?
- Is it clear what each variable means?
- A good communication-plan to get questions answered quickly. This includes a direct contact and regular calls.
- Walk through the code and improve/update the existing in-code documentation. Often it has not been looked at since it was written.
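As an illustration of the documentation questions above, here is a minimal sketch in Python of a function whose purpose, parameters, units and steps are all written down. The function `damped_step` and its physics are hypothetical and only serve as an example; the same idea applies in any language.

```python
def damped_step(position, velocity, damping, dt):
    """Advance a damped particle by one explicit Euler time step.

    Hypothetical example, not taken from any real project.

    Args:
        position: Particle position in metres.
        velocity: Particle velocity in metres per second.
        damping: Damping coefficient in 1/second; must be non-negative.
        dt: Time step in seconds; must be positive.

    Returns:
        Tuple (new_position, new_velocity) after one step.
    """
    if damping < 0 or dt <= 0:
        raise ValueError("damping must be >= 0 and dt must be > 0")
    # Explicit Euler: update the position using the old velocity.
    new_position = position + velocity * dt
    # The damping term is proportional to the velocity and opposes it.
    new_velocity = velocity - damping * velocity * dt
    return new_position, new_velocity
```

Someone who has never spoken to the original developer can read off what each variable means and in which unit it is expressed, which is the point of level 1.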
Level 2: Testability
Goal: Are there few bugs and can new bugs be easily detected?
- Is there a golden standard, such as a proof-of-concept or the existing code? Is it available to us?
- Are the outputs deterministic? Or can they be made deterministic quickly?
- Is there a good understanding of where the algorithm becomes less stable or unstable, and can this be explained?
- Is there clarity on the required maximum quantitative errors that would define the correctness of output? Are there high level test-cases for all of these?
- Is there clarity on the required maximum qualitative errors that would define the correctness of output? Is the collection of examples and counter-examples large enough to define the error in full?
- Can the whole library be tested using the sample input and output?
- Is there clarity on the required precision? When is an answer correct?
- Does the CI contain static code analysis?
- Are there test-cases or automated tools for finding qualitative errors?
- Are the compute-intensive functions covered by functional tests?
- Are the most important functions covered by functional tests?
- Are the other functions covered by functional tests?
- Make the code deterministic by temporarily removing the random number generator.
- Define what is correct output in detail.
- Complete the test cases for the different types of errors.
- Create several sets of input data with their correct output data. These will be used for acceptance of the ported code, together with the maximum errors allowed.
- Decide which sub-results need to be compared.
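The action points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `noisy_sum` is a hypothetical stand-in for the real computation, the tolerances are made-up numbers, and the random number generator is given a fixed seed instead of being removed, which is one alternative way to get reproducible output.

```python
import math
import random


def noisy_sum(values, rng):
    """Hypothetical stand-in for the computation under test.

    Adds a small random perturbation to each value and returns the
    running sums, which play the role of sub-results to compare.
    """
    total = 0.0
    partial_sums = []
    for v in values:
        total += v + rng.uniform(-1e-3, 1e-3)
        partial_sums.append(total)
    return partial_sums


def max_errors(result, reference):
    """Return (max absolute error, max relative error) over all elements."""
    if len(result) != len(reference):
        raise ValueError("result and reference differ in length")
    max_abs = 0.0
    max_rel = 0.0
    for r, ref in zip(result, reference):
        abs_err = abs(r - ref)
        max_abs = max(max_abs, abs_err)
        if ref != 0.0:
            max_rel = max(max_rel, abs_err / abs(ref))
    return max_abs, max_rel


def accept(result, reference, abs_tol, rel_tol):
    """Accept when every element passes math.isclose with the given tolerances."""
    if len(result) != len(reference):
        return False
    return all(
        math.isclose(r, ref, rel_tol=rel_tol, abs_tol=abs_tol)
        for r, ref in zip(result, reference)
    )


if __name__ == "__main__":
    data_in = [1.0, 2.0, 3.0]
    # The same seed gives the same output, so this run can serve as data-out.
    golden = noisy_sum(data_in, random.Random(42))
    rerun = noisy_sum(data_in, random.Random(42))
    print("deterministic:", rerun == golden)
    # Pretend the ported code deviates slightly from the golden output.
    ported = [g + 1e-7 for g in golden]
    print("max errors:", max_errors(ported, golden))
    print("accepted:", accept(ported, golden, abs_tol=1e-6, rel_tol=0.0))
```

Writing the tolerances down as explicit arguments forces the question of "when is an answer correct?" to be answered before the porting starts, rather than during acceptance.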
Level 3: Quality
Goal: Is the software easy to maintain and extend?
- Is there build automation, like CMake or Meson?
- Is the code multi-OS?
- Are tests run with every commit, or at least daily/nightly?
- Are functions defined to only do one thing and do it well?
- Nobody on the team has labelled the code as “spaghetti code”?
- Are function names self-explanatory?
- Are variable-names self-explanatory?
- There is no duplicated code?
- There is no code that could be made much less complex?
- There are no functions or methods longer than 100 lines excluding documentation?
- There are no large classes?
- There are no functions or methods with far more than 7 parameters?
- There is only a little commented-out code?
- There is only a little dead code? This is code that is not called from anywhere.
- There is no code that should be rewritten?
- There are no variables that are being reused?
- There is no significant dependence on global state?
- Prepare the (new) GPU-code for continuous integration.
- Document how the ported code maps to the existing code.
Code at level 0 can take 10 – 20 times more time than code at level 1. How much exactly is difficult to say, and that’s exactly the reason we require a minimal level. The bitter pill is that the costs of not cleaning up the code are often even higher, due to increased hardware costs, increased maintenance costs and lack of innovation.
When level 1 is mostly done, each missing item can add 20% to 40% in extra costs. Good examples are not having good test-cases, not having CPU-code available, or not even having an executable. This adds costs in both the implementation phase and the acceptance phase. Making a new CPU-implementation first is often cheaper.
From the described minimal level (level 1 in full, level 2 the high level) onward, costs are lower and more predictable. When you request a quote, you can ask for the missing items to be listed, and you can choose to do them yourself or let us do them.
Just discuss your goals with us after doing the assessment. We can guide you in prioritizing the work of making your code ready for porting.
If you find this self-assessment useful, know it took us a long time to improve this document and a lot of experience is hidden in it. But as we find quality software very important, we’re releasing this list under CC-BY-NC-ND: it cannot be altered or used commercially, and it must have a clear reference to Stream HPC as the authors. In other words: it must be clear that you did not do the research or writing of this assessment yourself.
If you have feedback or suggestions, we’d really like to hear from you!