Among traditional art forms, Chinese Wuhu Iron Painting is distinguished by its craftsmanship, aesthetic expressiveness, and choice of materials, all of which make style transfer a formidable challenge. This paper introduces a Hierarchical Visual Transformer (HVT) framework for effective and precise style transfer of Wuhu Iron Paintings. The study begins with an in-depth analysis of the artistic style of Wuhu Iron Paintings and extracts the key stylistic elements that meet the technical requirements of style transfer. In response to these artistic characteristics, we construct a multi-layered network structure that captures and parses style and content features. Building on this, we design an Efficient Local Attention Decoder (ELA-Decoder) that adaptively decodes the style and content features through their correlation, significantly strengthening the long-range dependencies between local and global information. We also propose a Content Correction Module (CCM) that eliminates redundant features generated during style transfer, further improving the transfer results. Because existing datasets of Wuhu Iron Paintings are scarce, we also collect and construct a dedicated dataset for Wuhu Iron Painting style transfer. Our method achieves the best performance on loss metrics among the compared methods, with reductions of at least 4% in style loss and 5% in content loss relative to other advanced methods. Expert evaluations were also conducted to validate the approach; our method received the highest number of votes, further supporting its effectiveness.
Keywords: Chinese Wuhu iron paintings; content correction module; feature symmetry; style transfer.