pub fn _kadd_mask16(a: u16, b: u16) -> u16
Add 16-bit masks a and b, and return the result.
Intel’s Documentation
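
The addition described above can be modelled in portable Rust. This is a minimal sketch, not the intrinsic itself: `kadd_mask16_model` is a hypothetical helper name, and it assumes that the sum wraps modulo 2^16 when it overflows, as a fixed-width 16-bit add would. It needs no special CPU features, so it can be used to check expectations on any target.

```rust
/// Hypothetical software model of a 16-bit mask addition.
/// Assumption: overflow wraps modulo 2^16 instead of panicking.
pub fn kadd_mask16_model(a: u16, b: u16) -> u16 {
    // wrapping_add discards the carry out of bit 15.
    a.wrapping_add(b)
}
```

Under that assumption, `kadd_mask16_model(0b0101, 0b0011)` gives `0b1000`, and `kadd_mask16_model(0xFFFF, 1)` wraps around to `0`. Note that this is an arithmetic add with carries between bit positions, not a bitwise OR of the two masks.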