The tcMultiplyPart routine has a flag that says whether to add to the accumulator or overwrite it. By using the overwrite mode on the first iteration we don't need to initialize the accumulator to zero. Note, the initialization in tcFullMultiply was only initializing the first rhsParts of dst. tcMultiplyPart always overwrites the rhsParts+1 part that just contains the last carry. The first write to each part of dst past rhsParts is a carry write so that's how the upper part of dst is initialized.
97 KiB
97 KiB