-
Emotion and Intent Joint Understanding in Multimodal Conversation: A Benchmarking Dataset
Authors:
Rui Liu,
Haolin Zuo,
Zheng Lian,
Xiaofen Xing,
Björn W. Schuller,
Haizhou Li
Abstract:
Emotion and Intent Joint Understanding in Multimodal Conversation (MC-EIU) aims to decode the semantic information manifested in a multimodal conversational history, while inferring the emotions and intents simultaneously for the current utterance. MC-EIU is enabling technology for many human-computer interfaces. However, there is a lack of available datasets in terms of annotation, modality, lang…
▽ More
Emotion and Intent Joint Understanding in Multimodal Conversation (MC-EIU) aims to decode the semantic information manifested in a multimodal conversational history, while inferring the emotions and intents simultaneously for the current utterance. MC-EIU is enabling technology for many human-computer interfaces. However, there is a lack of available datasets in terms of annotation, modality, language diversity, and accessibility. In this work, we propose an MC-EIU dataset, which features 7 emotion categories, 9 intent categories, 3 modalities, i.e., textual, acoustic, and visual content, and two languages, i.e., English and Mandarin. Furthermore, it is completely open-source for free access. To our knowledge, MC-EIU is the first comprehensive and rich emotion and intent joint understanding dataset for multimodal conversation. Together with the release of the dataset, we also develop an Emotion and Intent Interaction (EI$^2$) network as a reference system by modeling the deep correlation between emotion and intent in the multimodal conversation. With comparative experiments and ablation studies, we demonstrate the effectiveness of the proposed EI$^2$ method on the MC-EIU dataset. The dataset and codes will be made available at: https://github.com/MC-EIU/MC-EIU.
△ Less
Submitted 4 July, 2024; v1 submitted 2 July, 2024;
originally announced July 2024.
-
Multi-Modal UAV Detection, Classification and Tracking Algorithm -- Technical Report for CVPR 2024 UG2 Challenge
Authors:
Tianchen Deng,
Yi Zhou,
Wenhua Wu,
Mingrui Li,
Jingwei Huang,
Shuhong Liu,
Yanzeng Song,
Hao Zuo,
Yanbo Wang,
Yutao Yue,
Hesheng Wang,
Weidong Chen
Abstract:
This technical report presents the 1st winning model for UG2+, a task in CVPR 2024 UAV Tracking and Pose-Estimation Challenge. This challenge faces difficulties in drone detection,
UAV-type classification and 2D/3D trajectory estimation in extreme weather conditions with multi-modal sensor information, including stereo vision, various Lidars, Radars, and audio arrays. Leveraging this information…
▽ More
This technical report presents the 1st winning model for UG2+, a task in CVPR 2024 UAV Tracking and Pose-Estimation Challenge. This challenge faces difficulties in drone detection,
UAV-type classification and 2D/3D trajectory estimation in extreme weather conditions with multi-modal sensor information, including stereo vision, various Lidars, Radars, and audio arrays. Leveraging this information, we propose a multi-modal UAV detection, classification, and 3D tracking method for accurate UAV classification and tracking. A novel classification pipeline which incorporates sequence fusion, region of interest (ROI) cropping, and keyframe selection is proposed. Our system integrates cutting-edge classification techniques and sophisticated post-processing steps to boost accuracy and robustness. The designed pose estimation pipeline incorporates three modules: dynamic points analysis, a multi-object tracker, and trajectory completion techniques. Extensive experiments have validated the effectiveness and precision of our approach. In addition, we also propose a novel dataset pre-processing method and conduct a comprehensive ablation study for our design. We finally achieved the best performance in the classification and tracking of the MMUAD dataset. The code and configuration of our method are available at https://github.com/dtc111111/Multi-Modal-UAV.
△ Less
Submitted 26 May, 2024;
originally announced May 2024.
-
Model Free Prediction with Uncertainty Assessment
Authors:
Yuling Jiao,
Lican Kang,
Jin Liu,
Heng Peng,
Heng Zuo
Abstract:
Deep nonparametric regression, characterized by the utilization of deep neural networks to learn target functions, has emerged as a focus of research attention in recent years. Despite considerable progress in understanding convergence rates, the absence of asymptotic properties hinders rigorous statistical inference. To address this gap, we propose a novel framework that transforms the deep estim…
▽ More
Deep nonparametric regression, characterized by the utilization of deep neural networks to learn target functions, has emerged as a focus of research attention in recent years. Despite considerable progress in understanding convergence rates, the absence of asymptotic properties hinders rigorous statistical inference. To address this gap, we propose a novel framework that transforms the deep estimation paradigm into a platform conducive to conditional mean estimation, leveraging the conditional diffusion model. Theoretically, we develop an end-to-end convergence rate for the conditional diffusion model and establish the asymptotic normality of the generated samples. Consequently, we are equipped to construct confidence regions, facilitating robust statistical inference. Furthermore, through numerical experiments, we empirically validate the efficacy of our proposed methodology.
△ Less
Submitted 16 June, 2024; v1 submitted 21 May, 2024;
originally announced May 2024.
-
Latent Schr{ö}dinger Bridge Diffusion Model for Generative Learning
Authors:
Yuling Jiao,
Lican Kang,
Huazhen Lin,
Jin Liu,
Heng Zuo
Abstract:
This paper aims to conduct a comprehensive theoretical analysis of current diffusion models. We introduce a novel generative learning methodology utilizing the Schr{ö}dinger bridge diffusion model in latent space as the framework for theoretical exploration in this domain. Our approach commences with the pre-training of an encoder-decoder architecture using data originating from a distribution tha…
▽ More
This paper aims to conduct a comprehensive theoretical analysis of current diffusion models. We introduce a novel generative learning methodology utilizing the Schr{ö}dinger bridge diffusion model in latent space as the framework for theoretical exploration in this domain. Our approach commences with the pre-training of an encoder-decoder architecture using data originating from a distribution that may diverge from the target distribution, thus facilitating the accommodation of a large sample size through the utilization of pre-existing large-scale models. Subsequently, we develop a diffusion model within the latent space utilizing the Schr{ö}dinger bridge framework. Our theoretical analysis encompasses the establishment of end-to-end error analysis for learning distributions via the latent Schr{ö}dinger bridge diffusion model. Specifically, we control the second-order Wasserstein distance between the generated distribution and the target distribution. Furthermore, our obtained convergence rates effectively mitigate the curse of dimensionality, offering robust theoretical support for prevailing diffusion models.
△ Less
Submitted 20 April, 2024;
originally announced April 2024.
-
NetTrack: Tracking Highly Dynamic Objects with a Net
Authors:
Guangze Zheng,
Shijie Lin,
Haobo Zuo,
Changhong Fu,
Jia Pan
Abstract:
The complex dynamicity of open-world objects presents non-negligible challenges for multi-object tracking (MOT), often manifested as severe deformations, fast motion, and occlusions. Most methods that solely depend on coarse-grained object cues, such as boxes and the overall appearance of the object, are susceptible to degradation due to distorted internal relationships of dynamic objects. To addr…
▽ More
The complex dynamicity of open-world objects presents non-negligible challenges for multi-object tracking (MOT), often manifested as severe deformations, fast motion, and occlusions. Most methods that solely depend on coarse-grained object cues, such as boxes and the overall appearance of the object, are susceptible to degradation due to distorted internal relationships of dynamic objects. To address this problem, this work proposes NetTrack, an efficient, generic, and affordable tracking framework to introduce fine-grained learning that is robust to dynamicity. Specifically, NetTrack constructs a dynamicity-aware association with a fine-grained Net, leveraging point-level visual cues. Correspondingly, a fine-grained sampler and matching method have been incorporated. Furthermore, NetTrack learns object-text correspondence for fine-grained localization. To evaluate MOT in extremely dynamic open-world scenarios, a bird flock tracking (BFT) dataset is constructed, which exhibits high dynamicity with diverse species and open-world scenarios. Comprehensive evaluation on BFT validates the effectiveness of fine-grained learning on object dynamicity, and thorough transfer experiments on challenging open-world benchmarks, i.e., TAO, TAO-OW, AnimalTrack, and GMOT-40, validate the strong generalization ability of NetTrack even without finetuning. Project page: https://george-zhuang.github.io/nettrack/.
△ Less
Submitted 17 March, 2024;
originally announced March 2024.
-
OCEAN: An Openspace Collision-free Trajectory Planner for Autonomous Parking Based on ADMM
Authors:
Dongxu Wang,
Yanbin Lu,
Weilong Liu,
Hao Zuo,
Jiade Xin,
Xiang Long,
Yuncheng Jiang
Abstract:
In this paper, we propose an Openspace Collision-freE trAjectory plaNner (OCEAN) for autonomous parking. OCEAN is an optimization-based trajectory planner accelerated by Alternating Direction Method of Multiplier (ADMM) with enhanced computational efficiency and robustness, and is suitable for all scenes with few dynamic obstacles. Starting from a hierarchical optimization-based collision avoidanc…
▽ More
In this paper, we propose an Openspace Collision-freE trAjectory plaNner (OCEAN) for autonomous parking. OCEAN is an optimization-based trajectory planner accelerated by Alternating Direction Method of Multiplier (ADMM) with enhanced computational efficiency and robustness, and is suitable for all scenes with few dynamic obstacles. Starting from a hierarchical optimization-based collision avoidance framework, the trajectory planning problem is first warm-started by a collision-free Hybrid A* trajectory, then the collision avoidance trajectory planning problem is reformulated as a smooth and convex dual form, and solved by ADMM in parallel. The optimization variables are carefully split into several groups so that ADMM sub-problems are formulated as Quadratic Programming (QP), Sequential Quadratic Programming (SQP),and Second Order Cone Programming (SOCP) problems that can be efficiently and robustly solved. We validate our method both in hundreds of simulation scenarios and hundreds of hours of public parking areas. The results show that the proposed method has better system performance compared with other benchmarks.
△ Less
Submitted 8 March, 2024;
originally announced March 2024.
-
Inpainting Normal Maps for Lightstage data
Authors:
Hancheng Zuo,
Bernard Tiddeman
Abstract:
This study introduces a novel method for inpainting normal maps using a generative adversarial network (GAN). Normal maps, often derived from a lightstage, are crucial in performance capture but can have obscured areas due to movement (e.g., by arms, hair, or props). Inpainting fills these missing areas with plausible data. Our approach extends previous general image inpainting techniques, employi…
▽ More
This study introduces a novel method for inpainting normal maps using a generative adversarial network (GAN). Normal maps, often derived from a lightstage, are crucial in performance capture but can have obscured areas due to movement (e.g., by arms, hair, or props). Inpainting fills these missing areas with plausible data. Our approach extends previous general image inpainting techniques, employing a bow tie-like generator network and a discriminator network, with alternating training phases. The generator aims to synthesize images aligning with the ground truth and deceive the discriminator, which differentiates between real and processed images. Periodically, the discriminator undergoes retraining to enhance its ability to identify processed images. Importantly, our method adapts to the unique characteristics of normal map data, necessitating modifications to the loss function. We utilize a cosine loss instead of mean squared error loss for generator training. Limited training data availability, even with synthetic datasets, demands significant augmentation, considering the specific nature of the input data. This includes appropriate image flipping and in-plane rotations to accurately alter normal vectors. Throughout training, we monitored key metrics such as average loss, Structural Similarity Index Measure (SSIM), and Peak Signal-to-Noise Ratio (PSNR) for the generator, along with average loss and accuracy for the discriminator. Our findings suggest that the proposed model effectively generates high-quality, realistic inpainted normal maps, suitable for performance capture applications. These results establish a foundation for future research, potentially involving more advanced networks and comparisons with inpainting of source images used to create the normal maps.
△ Less
Submitted 15 January, 2024;
originally announced January 2024.
-
Learning High-Order Relationships of Brain Regions
Authors:
Weikang Qiu,
Huangrui Chu,
Selena Wang,
Haolan Zuo,
Xiaoxiao Li,
Yize Zhao,
Rex Ying
Abstract:
Discovering reliable and informative relationships among brain regions from functional magnetic resonance imaging (fMRI) signals is essential in phenotypic predictions. Most of the current methods fail to accurately characterize those interactions because they only focus on pairwise connections and overlook the high-order relationships of brain regions. We propose that these high-order relationshi…
▽ More
Discovering reliable and informative relationships among brain regions from functional magnetic resonance imaging (fMRI) signals is essential in phenotypic predictions. Most of the current methods fail to accurately characterize those interactions because they only focus on pairwise connections and overlook the high-order relationships of brain regions. We propose that these high-order relationships should be maximally informative and minimally redundant (MIMR). However, identifying such high-order relationships is challenging and under-explored due to the exponential search space and the absence of a tractable objective. In response to this gap, we propose a novel method named HYBRID which aims to extract MIMR high-order relationships from fMRI data. HYBRID employs a CONSTRUCTOR to identify hyperedge structures, and a WEIGHTER to compute a weight for each hyperedge, which avoids searching in exponential space. HYBRID achieves the MIMR objective through an innovative information bottleneck framework named multi-head drop-bottleneck with theoretical guarantees. Our comprehensive experiments demonstrate the effectiveness of our model. Our model outperforms the state-of-the-art predictive model by an average of 11.2%, regarding the quality of hyperedges measured by CPM, a standard protocol for studying brain connections.
△ Less
Submitted 8 June, 2024; v1 submitted 2 December, 2023;
originally announced December 2023.
-
Learning Noise-Robust Joint Representation for Multimodal Emotion Recognition under Incomplete Data Scenarios
Authors:
Qi Fan,
Haolin Zuo,
Rui Liu,
Zheng Lian,
Guanglai Gao
Abstract:
Multimodal emotion recognition (MER) in practical scenarios is significantly challenged by the presence of missing or incomplete data across different modalities. To overcome these challenges, researchers have aimed to simulate incomplete conditions during the training phase to enhance the system's overall robustness. Traditional methods have often involved discarding data or substituting data seg…
▽ More
Multimodal emotion recognition (MER) in practical scenarios is significantly challenged by the presence of missing or incomplete data across different modalities. To overcome these challenges, researchers have aimed to simulate incomplete conditions during the training phase to enhance the system's overall robustness. Traditional methods have often involved discarding data or substituting data segments with zero vectors to approximate these incompletenesses. However, such approaches neither accurately represent real-world conditions nor adequately address the issue of noisy data availability. For instance, a blurry image cannot be simply replaced with zero vectors, and still retain information. To tackle this issue and develop a more precise MER system, we introduce a novel noise-robust MER model that effectively learns robust multimodal joint representations from noisy data. This approach includes two pivotal components: firstly, a noise scheduler that adjusts the type and level of noise in the data to emulate various realistic incomplete situations. Secondly, a Variational AutoEncoder (VAE)-based module is employed to reconstruct these robust multimodal joint representations from the noisy inputs. Notably, the introduction of the noise scheduler enables the exploration of an entirely new type of incomplete data condition, which is impossible with existing methods. Extensive experimental evaluations on the benchmark datasets IEMOCAP and CMU-MOSEI demonstrate the effectiveness of the noise scheduler and the excellent performance of our proposed model.
△ Less
Submitted 7 May, 2024; v1 submitted 21 September, 2023;
originally announced November 2023.
-
SAM-DA: UAV Tracks Anything at Night with SAM-Powered Domain Adaptation
Authors:
Changhong Fu,
Liangliang Yao,
Haobo Zuo,
Guangze Zheng,
Jia Pan
Abstract:
Domain adaptation (DA) has demonstrated significant promise for real-time nighttime unmanned aerial vehicle (UAV) tracking. However, the state-of-the-art (SOTA) DA still lacks the potential object with accurate pixel-level location and boundary to generate the high-quality target domain training sample. This key issue constrains the transfer learning of the real-time daytime SOTA trackers for chal…
▽ More
Domain adaptation (DA) has demonstrated significant promise for real-time nighttime unmanned aerial vehicle (UAV) tracking. However, the state-of-the-art (SOTA) DA still lacks the potential object with accurate pixel-level location and boundary to generate the high-quality target domain training sample. This key issue constrains the transfer learning of the real-time daytime SOTA trackers for challenging nighttime UAV tracking. Recently, the notable Segment Anything Model (SAM) has achieved a remarkable zero-shot generalization ability to discover abundant potential objects due to its huge data-driven training approach. To solve the aforementioned issue, this work proposes a novel SAM-powered DA framework for real-time nighttime UAV tracking, i.e., SAM-DA. Specifically, an innovative SAM-powered target domain training sample swelling is designed to determine enormous high-quality target domain training samples from every single raw nighttime image. This novel one-to-many generation significantly expands the high-quality target domain training sample for DA. Comprehensive experiments on extensive nighttime UAV videos prove the robustness and domain adaptability of SAM-DA for nighttime UAV tracking. Especially, compared to the SOTA DA, SAM-DA can achieve better performance with fewer raw nighttime images, i.e., the fewer-better training. This economized training approach facilitates the quick validation and deployment of algorithms for UAVs. The code is available at https://github.com/vision4robotics/SAM-DA.
△ Less
Submitted 24 March, 2024; v1 submitted 3 July, 2023;
originally announced July 2023.
-
Continuity-Aware Latent Interframe Information Mining for Reliable UAV Tracking
Authors:
Changhong Fu,
Mutian Cai,
Sihang Li,
Kunhan Lu,
Haobo Zuo,
Chongjun Liu
Abstract:
Unmanned aerial vehicle (UAV) tracking is crucial for autonomous navigation and has broad applications in robotic automation fields. However, reliable UAV tracking remains a challenging task due to various difficulties like frequent occlusion and aspect ratio change. Additionally, most of the existing work mainly focuses on explicit information to improve tracking performance, ignoring potential i…
▽ More
Unmanned aerial vehicle (UAV) tracking is crucial for autonomous navigation and has broad applications in robotic automation fields. However, reliable UAV tracking remains a challenging task due to various difficulties like frequent occlusion and aspect ratio change. Additionally, most of the existing work mainly focuses on explicit information to improve tracking performance, ignoring potential interframe connections. To address the above issues, this work proposes a novel framework with continuity-aware latent interframe information mining for reliable UAV tracking, i.e., ClimRT. Specifically, a new efficient continuity-aware latent interframe information mining network (ClimNet) is proposed for UAV tracking, which can generate highly-effective latent frame between two adjacent frames. Besides, a novel location-continuity Transformer (LCT) is designed to fully explore continuity-aware spatial-temporal information, thereby markedly enhancing UAV tracking. Extensive qualitative and quantitative experiments on three authoritative aerial benchmarks strongly validate the robustness and reliability of ClimRT in UAV tracking performance. Furthermore, real-world tests on the aerial platform validate its practicability and effectiveness. The code and demo materials are released at https://github.com/vision4robotics/ClimRT.
△ Less
Submitted 8 March, 2023;
originally announced March 2023.
-
Prediction of superconducting properties of materials based on machine learning models
Authors:
Jie Hu,
Yongquan Jiang,
Yang Yan,
Houchen Zuo
Abstract:
The application of superconducting materials is becoming more and more widespread. Traditionally, the discovery of new superconducting materials relies on the experience of experts and a large number of "trial and error" experiments, which not only increases the cost of experiments but also prolongs the period of discovering new superconducting materials. In recent years, machine learning has been…
▽ More
The application of superconducting materials is becoming more and more widespread. Traditionally, the discovery of new superconducting materials relies on the experience of experts and a large number of "trial and error" experiments, which not only increases the cost of experiments but also prolongs the period of discovering new superconducting materials. In recent years, machine learning has been increasingly applied to materials science. Based on this, this manuscript proposes the use of XGBoost model to identify superconductors; the first application of deep forest model to predict the critical temperature of superconductors; the first application of deep forest to predict the band gap of materials; and application of a new sub-network model to predict the Fermi energy level of materials. Compared with our known similar literature, all the above algorithms reach state-of-the-art. Finally, this manuscript uses the above models to search the COD public dataset and identify 50 candidate superconducting materials with possible critical temperature greater than 90 K.
△ Less
Submitted 6 November, 2022;
originally announced November 2022.
-
Explicit Intensity Control for Accented Text-to-speech
Authors:
Rui Liu,
Haolin Zuo,
De Hu,
Guanglai Gao,
Haizhou Li
Abstract:
Accented text-to-speech (TTS) synthesis seeks to generate speech with an accent (L2) as a variant of the standard version (L1). How to control the intensity of accent in the process of TTS is a very interesting research direction, and has attracted more and more attention. Recent work design a speaker-adversarial loss to disentangle the speaker and accent information, and then adjust the loss weig…
▽ More
Accented text-to-speech (TTS) synthesis seeks to generate speech with an accent (L2) as a variant of the standard version (L1). How to control the intensity of accent in the process of TTS is a very interesting research direction, and has attracted more and more attention. Recent work design a speaker-adversarial loss to disentangle the speaker and accent information, and then adjust the loss weight to control the accent intensity. However, such a control method lacks interpretability, and there is no direct correlation between the controlling factor and natural accent intensity. To this end, this paper propose a new intuitive and explicit accent intensity control scheme for accented TTS. Specifically, we first extract the posterior probability, called as ``goodness of pronunciation (GoP)'' from the L1 speech recognition model to quantify the phoneme accent intensity for accented speech, then design a FastSpeech2 based TTS model, named Ai-TTS, to take the accent intensity expression into account during speech generation. Experiments show that the our method outperforms the baseline model in terms of accent rendering and intensity control.
△ Less
Submitted 27 October, 2022;
originally announced October 2022.
-
Exploiting modality-invariant feature for robust multimodal emotion recognition with missing modalities
Authors:
Haolin Zuo,
Rui Liu,
Jinming Zhao,
Guanglai Gao,
Haizhou Li
Abstract:
Multimodal emotion recognition leverages complementary information across modalities to gain performance. However, we cannot guarantee that the data of all modalities are always present in practice. In the studies to predict the missing data across modalities, the inherent difference between heterogeneous modalities, namely the modality gap, presents a challenge. To address this, we propose to use…
▽ More
Multimodal emotion recognition leverages complementary information across modalities to gain performance. However, we cannot guarantee that the data of all modalities are always present in practice. In the studies to predict the missing data across modalities, the inherent difference between heterogeneous modalities, namely the modality gap, presents a challenge. To address this, we propose to use invariant features for a missing modality imagination network (IF-MMIN) which includes two novel mechanisms: 1) an invariant feature learning strategy that is based on the central moment discrepancy (CMD) distance under the full-modality scenario; 2) an invariant feature based imagination module (IF-IM) to alleviate the modality gap during the missing modalities prediction, thus improving the robustness of multimodal joint representation. Comprehensive experiments on the benchmark dataset IEMOCAP demonstrate that the proposed model outperforms all baselines and invariantly improves the overall emotion recognition performance under uncertain missing-modality conditions. We release the code at: https://github.com/ZhuoYulang/IF-MMIN.
△ Less
Submitted 27 October, 2022;
originally announced October 2022.
-
WikiLink: an encyclopedia-based semantic network for design innovation
Authors:
Haoyu Zuo,
Qianzhi Jing,
Tianqi Song,
Huiting Liu,
Lingyun Sun,
Peter Childs,
Liuqing Chen
Abstract:
Data-driven design and innovation is a process to reuse and provide valuable and useful information. However, existing semantic networks for design innovation is built on data source restricted to technological and scientific information. Besides, existing studies build the edges of a semantic network only on either statistical or semantic relationships, which is less likely to make full use of th…
▽ More
Data-driven design and innovation is a process to reuse and provide valuable and useful information. However, existing semantic networks for design innovation is built on data source restricted to technological and scientific information. Besides, existing studies build the edges of a semantic network only on either statistical or semantic relationships, which is less likely to make full use of the benefits from both types of relationships and discover implicit knowledge for design innovation. Therefore, we constructed WikiLink, a semantic network based on Wikipedia. Combined weight which fuses both the statistic and semantic weights between concepts is introduced in WikiLink, and four algorithms are developed for inspiring new ideas. Evaluation experiments are undertaken and results show that the network is characterised by high coverage of terms, relationships and disciplines, which proves the network's effectiveness and usefulness. Then a demonstration and case study results indicate that WikiLink can serve as an idea generation tool for innovation in conceptual design. The source code of WikiLink and the backend data are provided open-source for more users to explore and build on.
△ Less
Submitted 30 August, 2022;
originally announced August 2022.
-
Hardness prediction of age-hardening aluminum alloy based on ensemble learning
Authors:
Houchen Zuo,
Yongquan Jiang,
Yan Yang,
Baoying Liu,
Jie Hu
Abstract:
With the rapid development of artificial intelligence, the combination of material database and machine learning has driven the progress of material informatics. Because aluminum alloy is widely used in many fields, so it is significant to predict the properties of aluminum alloy. In this thesis, the data of Al-Cu-Mg-X (X: Zn, Zr, etc.) alloy are used to input the composition, aging conditions (ti…
▽ More
With the rapid development of artificial intelligence, the combination of material database and machine learning has driven the progress of material informatics. Because aluminum alloy is widely used in many fields, so it is significant to predict the properties of aluminum alloy. In this thesis, the data of Al-Cu-Mg-X (X: Zn, Zr, etc.) alloy are used to input the composition, aging conditions (time and temperature) and predict its hardness. An ensemble learning solution based on automatic machine learning and an attention mechanism introduced into the secondary learner of deep neural network are proposed respectively. The experimental results show that selecting the correct secondary learner can further improve the prediction accuracy of the model. This manuscript introduces the attention mechanism to improve the secondary learner based on deep neural network, and obtains a fusion model with better performance. The R-Square of the best model is 0.9697 and the MAE is 3.4518HV.
△ Less
Submitted 16 June, 2022;
originally announced June 2022.
-
Prediction of properties of metal alloy materials based on machine learning
Authors:
Houchen Zuo,
Yongquan Jiang,
Yan Yang,
Jie Hu
Abstract:
Density functional theory and its optimization algorithm are the main methods to calculate the properties in the field of materials. Although the calculation results are accurate, it costs a lot of time and money. In order to alleviate this problem, we intend to use machine learning to predict material properties. In this paper, we conduct experiments on atomic volume, atomic energy and atomic for…
▽ More
Density functional theory and its optimization algorithm are the main methods to calculate the properties in the field of materials. Although the calculation results are accurate, it costs a lot of time and money. In order to alleviate this problem, we intend to use machine learning to predict material properties. In this paper, we conduct experiments on atomic volume, atomic energy and atomic formation energy of metal alloys, using the open quantum material database. Through the traditional machine learning models, deep learning network and automated machine learning, we verify the feasibility of machine learning in material property prediction. The experimental results show that the machine learning can predict the material properties accurately.
△ Less
Submitted 20 September, 2021;
originally announced September 2021.
-
Patent-KG: Patent Knowledge Graph Use for Engineering Design
Authors:
Haoyu Zuo,
Yuan Yin,
Peter Childs
Abstract:
To facilitate knowledge reuse in engineering design, several dataset approaches have been proposed and applied by designers. This paper builds a patent-based knowledge graph, patent-KG, to represent the knowledge facts in patents for engineering design. The arising patent-KG approach proposes a new unsupervised mechanism to extract knowledge facts in a patent, by searching the attention graph in l…
▽ More
To facilitate knowledge reuse in engineering design, several dataset approaches have been proposed and applied by designers. This paper builds a patent-based knowledge graph, patent-KG, to represent the knowledge facts in patents for engineering design. The arising patent-KG approach proposes a new unsupervised mechanism to extract knowledge facts in a patent, by searching the attention graph in language models. This method avoids using expensive labelled data in supervised learning or listing complex syntactic rules in rule-based extraction. The extracted entities are compared with other benchmarks in the criteria of recall rate. The result reaches the highest 0.9 recall rate in the standard list of mechanical engineering related technical terms, which means the highest coverage of engineering words. The extracted relationships are also compared with other benchmarks. The result shows that our method provides more contextual information in relationships, and extracts more relationship types including positional and negation relationships.
△ Less
Submitted 16 September, 2021; v1 submitted 26 August, 2021;
originally announced August 2021.
-
On Classification of Toric Surface Codes of Low Dimension
Authors:
Xue Luo,
Stephen S. -T. Yau,
Mingyi Zhang,
Huaiqing Zuo
Abstract:
This work is a natural continuation of our previous work \cite{yz}. In this paper, we give a complete classification of toric surface codes of dimension less than or equal to 6, except a special pair, $C_{P_6^{(4)}}$ and $C_{P_6^{(5)}}$ over $\mathbb{F}_8$. Also, we give an example, $C_{P_6^{(5)}}$ and $C_{P_6^{(6)}}$ over $\mathbb{F}_7$, to illustrate that two monomially equivalent toric codes ca…
▽ More
This work is a natural continuation of our previous work \cite{yz}. In this paper, we give a complete classification of toric surface codes of dimension less than or equal to 6, except a special pair, $C_{P_6^{(4)}}$ and $C_{P_6^{(5)}}$ over $\mathbb{F}_8$. Also, we give an example, $C_{P_6^{(5)}}$ and $C_{P_6^{(6)}}$ over $\mathbb{F}_7$, to illustrate that two monomially equivalent toric codes can be constructed from two lattice non-equivalent polygons.
△ Less
Submitted 13 September, 2014; v1 submitted 1 February, 2014;
originally announced February 2014.