MobileNetV3

Hi,

For MobileNetV3, it seems that you have made several modifications to the original implementation: disabling h-swish, removing the non-linear function in the DW convolution, adding an extra non-linear function after the SE layer, changing the last layer to GDC, etc.
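To be clear about what I mean by h-swish: as far as I understand it, the original MobileNetV3 defines it as x * ReLU6(x + 3) / 6. Below is a minimal plain-Python sketch of that formula next to ReLU, just to pin down the activation I am asking about. The function names are my own and are not taken from your code.

```python
def relu(x: float) -> float:
    """Standard ReLU: max(x, 0)."""
    return max(x, 0.0)


def relu6(x: float) -> float:
    """ReLU capped at 6: min(max(x, 0), 6)."""
    return min(max(x, 0.0), 6.0)


def h_swish(x: float) -> float:
    """Hard swish as I understand it from the MobileNetV3 paper:
    x * ReLU6(x + 3) / 6."""
    return x * relu6(x + 3.0) / 6.0


if __name__ == "__main__":
    # Compare the two activations on a few sample inputs.
    for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(f"x={x:+.1f}  relu={relu(x):.4f}  h_swish={h_swish(x):.4f}")
```

The two agree for x <= -3 (both zero) and for x >= 3 (both equal to x); they differ only in between, which is why I am curious whether dropping h-swish matters here.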

Do these changes help significantly on the face recognition task? What is the performance of the vanilla MobileNetV3? And what is your training strategy?

Thanks a lot.