The use of metaphoric gestures by speakers has long been known to influence thought in the viewer. What is less clear is the extent to which the expression of multiple metaphors in a single gesture reliably affects viewer interpretation. Additionally, gestures that express only one metaphor are not sufficient to explain the broad array of metaphoric gestures and metaphoric scenes that human speakers naturally produce. In this paper, we address three issues related to the implementation of metaphoric gestures in virtual humans. First, we analyze naturally occurring examples of multiple-metaphor gestures, as well as metaphoric scenes created by gesture sequences. Then, we demonstrate the importance of capturing multiple metaphoric aspects of gesture through a behavioral experiment using crowdsourced judgements of videos in which the naturally occurring gestures were systematically altered. Finally, we discuss the challenges that our findings raise for computationally modeling metaphoric gestures.