This is an alternative to #128506 that doesn't attempt to change the codegen for fmin and fmax on their way to the CLC library.

The amdgcn and r600 custom definitions of fmin/fmax are now converted to custom definitions of __clc_fmin and __clc_fmax. For simplicity, the CLC library doesn't provide vector/scalar versions of these builtins. Instead, the OpenCL layer forwards vector/scalar calls to the vector/vector versions.

The only codegen change is that non-standard vector/scalar overloads of fmin/fmax have been removed. We were previously (presumably accidentally) providing overloads with mixed element types, such as fmin(double2, float), fmax(half4, double), etc. The only vector/scalar overloads in the OpenCL spec are those whose scalar has the same element type as the vector in the first argument.
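The layering described above can be illustrated with a minimal sketch in plain C. This is not the libclc source: `float2_t`, `scalar_fmin`, `clc_fmin_f2`, and `ocl_fmin_f2_f` are hypothetical stand-ins, assuming the OpenCL-layer vector/scalar overload simply splats the scalar and calls the vector/vector builtin.

```c
/* Hypothetical stand-in for an OpenCL float2; not the real libclc type. */
typedef struct { float x, y; } float2_t;

/* Scalar helper with fmin-like NaN handling: if one operand is NaN,
 * the other operand is returned. */
static float scalar_fmin(float a, float b) {
    return (b != b || a < b) ? a : b;
}

/* Stand-in for the CLC-level builtin: vector/vector only. */
static float2_t clc_fmin_f2(float2_t a, float2_t b) {
    float2_t r = { scalar_fmin(a.x, b.x), scalar_fmin(a.y, b.y) };
    return r;
}

/* Stand-in for the OpenCL-layer vector/scalar overload: splat the scalar
 * into a vector of the same element type, then forward to the CLC layer. */
static float2_t ocl_fmin_f2_f(float2_t a, float b) {
    float2_t splat = { b, b };
    return clc_fmin_f2(a, splat);
}
```

In this sketch the scalar always has the same element type as the vector, which mirrors the only vector/scalar form the OpenCL spec defines. A mixed-type overload such as fmin(double2, float) has no counterpart here.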
cl_khr_int64_extended_atomics/minmax_helpers.ll
mem_fence/fence.cl
synchronization/barrier.cl
workitem/get_global_offset.cl
workitem/get_group_id.cl
workitem/get_global_size.cl
workitem/get_local_id.cl
workitem/get_local_size.cl
workitem/get_num_groups.cl
workitem/get_work_dim.cl