If intrinsic `get_fpenv` or `set_fpenv` is lowered to the form where FP
environment is represented as a region in memory, extra moves can
appear. For example the code:
define void @func_01(ptr %ptr) {
%env = call i256 @llvm.get.fpenv.i256()
store i256 %env, ptr %ptr
ret void
}
produces DAG:
ch = get_fpenv_mem ch, memory_region
val: i256, ch = load ch, memory_region
ch = store ch, ptr, val
In this case the extra moves can be avoided if `get_fpenv_mem` got
pointer to the memory where the FP environment should be finally placed.
This change implement such optimization for this use case.
Differential Revision: https://reviews.llvm.org/D150437