Efficient Emotional Adaptation for Audio-driven Talking-Head Generation