-
XMainframe: A Large Language Model for Mainframe Modernization
Authors:
Anh T. V. Dau,
Hieu Trung Dao,
Anh Tuan Nguyen,
Hieu Trung Tran,
Phong X. Nguyen,
Nghi D. Q. Bui
Abstract:
Mainframe operating systems, despite their inception in the 1940s, continue to support critical sectors like finance and government. However, these systems are often viewed as outdated, requiring extensive maintenance and modernization. Addressing this challenge necessitates innovative tools that can understand and interact with legacy codebases. To this end, we introduce XMainframe, a state-of-the-art large language model (LLM) specifically designed with knowledge of mainframe legacy systems and COBOL codebases. Our solution involves the creation of an extensive data collection pipeline to produce high-quality training datasets, enhancing XMainframe's performance in this specialized domain. Additionally, we present MainframeBench, a comprehensive benchmark for assessing mainframe knowledge, including multiple-choice questions, question answering, and COBOL code summarization. Our empirical evaluations demonstrate that XMainframe consistently outperforms existing state-of-the-art LLMs across these tasks. Specifically, XMainframe achieves 30% higher accuracy than DeepSeek-Coder on multiple-choice questions, doubles the BLEU score of Mixtral-Instruct 8x7B on question answering, and scores six times higher than GPT-3.5 on COBOL summarization. Our work highlights the potential of XMainframe to drive significant advancements in managing and modernizing legacy systems, thereby enhancing productivity and saving time for software developers.
Submitted 26 August, 2024; v1 submitted 5 August, 2024;
originally announced August 2024.
-
AgileCoder: Dynamic Collaborative Agents for Software Development based on Agile Methodology
Authors:
Minh Huynh Nguyen,
Thang Phan Chau,
Phong X. Nguyen,
Nghi D. Q. Bui
Abstract:
Software agents have emerged as promising tools for addressing complex software engineering tasks. Existing works, however, frequently oversimplify software development workflows, even though such workflows are typically more complex in the real world. Thus, we propose AgileCoder, a multi-agent system that integrates Agile Methodology (AM) into the framework. This system assigns specific AM roles, such as Product Manager, Developer, and Tester, to different agents, who then collaboratively develop software based on user inputs. AgileCoder enhances development efficiency by organizing work into sprints, so that software is developed incrementally. Additionally, we introduce Dynamic Code Graph Generator, a module that creates a Code Dependency Graph dynamically as updates are made to the codebase. This allows agents to better comprehend the codebase, leading to more precise code generation and modifications throughout the software development process. AgileCoder surpasses existing baselines, such as ChatDev and MetaGPT, establishing a new standard and showcasing the capabilities of multi-agent systems in advanced software engineering environments.
Submitted 14 July, 2024; v1 submitted 16 June, 2024;
originally announced June 2024.
-
A degenerate trion liquid in atomic double layers
Authors:
Phuong X. Nguyen,
Raghav Chaturvedi,
Liguo Ma,
Patrick Knuppel,
Kenji Watanabe,
Takashi Taniguchi,
Kin Fai Mak,
Jie Shan
Abstract:
Trions are a three-particle bound state of electrons and holes. Experimental realization of a trion liquid in the degenerate quantum limit would open a wide range of phenomena in quantum many-body physics. However, trions have been observed only as optically excited states in doped semiconductors to date. Here we report the emergence of a degenerate trion liquid in a Bose-Fermi mixture of holes and excitons in Coulomb-coupled MoSe2/WSe2 monolayers. By electrically tuning the hole density in WSe2 to be two times the electron density in MoSe2, we generate equilibrium interlayer trions with binding energy about 1 meV at temperatures two orders of magnitude below the Fermi temperature. We further demonstrate a density-tuned phase transition to an electron-hole plasma, spin-singlet correlations for the constituent holes and Zeeman-field-induced dissociation of trions. The results pave the way for exploration of the correlated phases of composite particles in solids.
Submitted 19 December, 2023;
originally announced December 2023.
-
Perfect Coulomb drag in a dipolar excitonic insulator
Authors:
Phuong X. Nguyen,
Liguo Ma,
Raghav Chaturvedi,
Kenji Watanabe,
Takashi Taniguchi,
Jie Shan,
Kin Fai Mak
Abstract:
Excitonic insulators (EIs), arising in semiconductors when the electron-hole binding energy exceeds the band gap, are a solid-state prototype for bosonic phases of matter. Unlike the charged excitations that are frozen and unable to transport current, the neutral electron-hole pairs (excitons) are free to move in EIs. However, it is intrinsically difficult to demonstrate exciton transport in bulk EI candidates. The recently emerged dipolar EIs based on Coulomb-coupled atomic double layers open the possibility to realize exciton transport across the insulator because separate electrical contacts can be made to the electron and hole layers. Here we show that the strong interlayer excitonic correlation at equal electron and hole densities in the MoSe2/WSe2 double layers separated by a 2-nm barrier gives rise to perfect Coulomb drag. A charge current in one layer induces an equal but opposite drag current in the other. The drag current ratio remains above 0.9 up to about 20 K for low exciton densities. As exciton density increases above the Mott density, the excitons dissociate into the electron-hole plasma abruptly, and only weak Fermi liquid frictional drag is observed. Our experiment moves a step closer to realizing exciton circuitry and superfluidity.
Submitted 26 September, 2023;
originally announced September 2023.
-
Contextual Explainable Video Representation: Human Perception-based Understanding
Authors:
Khoa Vo,
Kashu Yamazaki,
Phong X. Nguyen,
Phat Nguyen,
Khoa Luu,
Ngan Le
Abstract:
Video understanding is a growing field and a subject of intense research that includes many tasks requiring an understanding of both spatial and temporal information, e.g., action detection, action recognition, video captioning, and video retrieval. One of the most challenging problems in video understanding is feature extraction, i.e., extracting a contextual visual representation from a given untrimmed video, because of the long and complicated temporal structure of unconstrained videos. Different from existing approaches, which apply a pre-trained backbone network as a black box to extract visual representation, our approach aims to extract the most contextual information with an explainable mechanism. As we observed, humans typically perceive a video through the interactions between three main factors, i.e., the actors, the relevant objects, and the surrounding environment. Therefore, it is crucial to design a contextual explainable video representation extraction that can capture each of these factors and model the relationships between them. In this paper, we discuss approaches that incorporate the human perception process into modeling actors, objects, and the environment. We choose video paragraph captioning and temporal action detection to illustrate the effectiveness of human perception-based contextual representation in video understanding. Source code is publicly available at https://github.com/UARK-AICV/Video_Representation.
Submitted 17 December, 2022; v1 submitted 12 December, 2022;
originally announced December 2022.
-
Robust, General, and Low Complexity Acoustic Scene Classification Systems and An Effective Visualization for Presenting a Sound Scene Context
Authors:
Lam Pham,
Dusan Salovic,
Anahid Jalali,
Alexander Schindler,
Khoa Tran,
Canh Vu,
Phu X. Nguyen
Abstract:
In this paper, we present a comprehensive analysis of Acoustic Scene Classification (ASC), the task of identifying the scene of an audio recording from its acoustic signature. In particular, we first propose an inception-based, low-footprint ASC model, referred to as the ASC baseline. The proposed ASC baseline is then compared with benchmark and high-complexity network architectures of MobileNetV1, MobileNetV2, VGG16, VGG19, ResNet50V2, ResNet152V2, DenseNet121, DenseNet201, and Xception. Next, we improve the ASC baseline by proposing a novel deep neural network architecture which leverages residual-inception architectures and multiple kernels. Given the novel residual-inception (NRI) model, we further evaluate the trade-off between model complexity and model accuracy. Finally, we evaluate whether sound events occurring in a sound scene recording can help to improve ASC accuracy, then indicate how a sound scene context is well presented by combining both sound scene and sound event information. We conduct extensive experiments on various ASC datasets, including Crowded Scenes and the IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE) 2018 Task 1A and 1B, 2019 Task 1A and 1B, 2020 Task 1A, 2021 Task 1A, and 2022 Task 1. The experimental results on several different ASC challenges highlight two main achievements: the first is to propose robust, general, and low complexity ASC systems which are suitable for real-life applications on a wide range of edge and mobile devices; the second is to propose an effective visualization method for comprehensively presenting a sound scene context.
Submitted 16 October, 2022;
originally announced October 2022.
-
Sound-Dr: Reliable Sound Dataset and Baseline Artificial Intelligence System for Respiratory Illnesses
Authors:
Truong V. Hoang,
Quang H. Nguyen,
Cuong Q. Nguyen,
Phong X. Nguyen,
Hoang D. Nguyen
Abstract:
As the burden of respiratory diseases continues to fall on society worldwide, this paper proposes a high-quality and reliable dataset of human sounds for studying respiratory illnesses, including pneumonia and COVID-19. It consists of coughing, mouth breathing, and nose breathing sounds together with metadata on related clinical characteristics. We also develop a proof-of-concept system for establishing baselines and benchmarking against multiple datasets, such as Coswara and COUGHVID. Our comprehensive experiments show that the Sound-Dr dataset has richer features, better performance, and is more robust to dataset shifts in various machine learning tasks. It is promising for a wide range of real-time applications on mobile devices. The proposed dataset and system will serve as practical tools to support healthcare professionals in diagnosing respiratory disorders. The dataset and code are publicly available here: https://github.com/ReML-AI/Sound-Dr/.
Submitted 4 August, 2023; v1 submitted 12 January, 2022;
originally announced January 2022.
-
An Audio-Visual Dataset and Deep Learning Frameworks for Crowded Scene Classification
Authors:
Lam Pham,
Dat Ngo,
Phu X. Nguyen,
Truong Hoang,
Alexander Schindler
Abstract:
This paper presents a task of audio-visual scene classification (SC) where input videos are classified into one of five real-life crowded scenes: 'Riot', 'Noise-Street', 'Firework-Event', 'Music-Event', and 'Sport-Atmosphere'. To this end, we first collect an audio-visual dataset (videos) of these five crowded contexts from YouTube (in-the-wild scenes). Then, a wide range of deep learning frameworks is proposed, each using either audio or visual input data independently. Finally, results obtained from high-performing deep learning frameworks are fused to achieve the best accuracy score. Our experimental results indicate that audio and visual input factors independently contribute to the SC task's performance. Significantly, an ensemble of deep learning frameworks exploring either audio or visual input data can achieve the best accuracy of 95.7%.
Submitted 16 December, 2021;
originally announced December 2021.
-
Strongly correlated excitonic insulator in atomic double layers
Authors:
Liguo Ma,
Phuong X. Nguyen,
Zefang Wang,
Yongxin Zeng,
Kenji Watanabe,
Takashi Taniguchi,
Allan H. MacDonald,
Kin Fai Mak,
Jie Shan
Abstract:
Excitonic insulators (EI) arise from the formation of bound electron-hole pairs (excitons) in semiconductors and provide a solid-state platform for quantum many-boson physics. Strong exciton-exciton repulsion is expected to stabilize condensed superfluid and crystalline phases by suppressing both density and phase fluctuations. Although spectroscopic signatures of EIs have been reported, conclusive evidence for strongly correlated EI states has remained elusive. Here, we demonstrate a strongly correlated spatially indirect two-dimensional (2D) EI ground state formed in transition metal dichalcogenide (TMD) semiconductor double layers. An equilibrium interlayer exciton fluid is formed when the bias voltage applied between the two electrically isolated TMD layers is tuned to a range that populates bound electron-hole pairs, but not free electrons or holes. Capacitance measurements show that the fluid is exciton-compressible but charge-incompressible, which is direct thermodynamic evidence of the EI. The fluid is also strongly correlated, with a dimensionless exciton coupling constant exceeding 10. We further construct an exciton phase diagram that reveals both the Mott transition and interaction-stabilized quasi-condensation. Our experiment paves the path for realizing the exotic quantum phases of excitons, as well as multi-terminal exciton circuitry for applications.
Submitted 13 April, 2021; v1 submitted 11 April, 2021;
originally announced April 2021.
-
Photo-Induced Anomalous Hall Effect in Two-Dimensional Transition-Metal Dichalcogenides
Authors:
Phuong X. Nguyen,
Wang-Kong Tse
Abstract:
A circularly polarized a.c. pump field illuminated near resonance on two-dimensional transition metal dichalcogenides (TMDs) produces an anomalous Hall effect in response to a d.c. bias field. In this work, we develop a theory for this photo-induced anomalous Hall effect in undoped TMDs irradiated by a strong coherent laser field. The strong field renormalizes the equilibrium bands and opens up a dynamical energy gap where single-photon resonance occurs. The resulting photon dressed states, or Floquet states, are treated within the rotating wave approximation. A quantum kinetic equation approach is developed to study the non-equilibrium density matrix and time-averaged transport currents under the simultaneous influence of the strong a.c. pump field and the weak d.c. probe field. Dissipative effects are taken into account in the kinetic equation that captures relaxation and dephasing. The photo-induced longitudinal and Hall conductivities display notable resonant signatures when the pump field frequency reaches the spin-split interband transition energies. Rather than valley polarization, we find that the anomalous Hall current is mainly driven by the intraband response of photon-dressed electron populations near the dynamical gap at both valleys, accompanied by a smaller contribution due to interband coherences. These findings highlight the importance of photon-dressed bands and non-equilibrium distribution functions in achieving a proper understanding of photo-induced anomalous Hall effect in a strong pump field.
Submitted 29 May, 2020;
originally announced June 2020.
-
UAV-Assisted Secure Communications in Terrestrial Cognitive Radio Networks: Joint Power Control and 3D Trajectory Optimization
Authors:
Phu X. Nguyen,
Van-Dinh Nguyen,
Hieu V. Nguyen,
Oh-Soon Shin
Abstract:
This paper considers secure communications for an underlay cognitive radio network (CRN) in the presence of an external eavesdropper (Eve). The secrecy performance of CRNs is usually limited by the primary receiver's interference power constraint. To overcome this issue, we propose to use an unmanned aerial vehicle (UAV) as a friendly jammer to interfere with Eve in decoding the confidential message from the secondary transmitter (ST). Our goal is to jointly optimize the transmit power and UAV's trajectory in the three-dimensional (3D) space to maximize the average achievable secrecy rate of the secondary system. The formulated optimization problem is nonconvex due to the nonconvexity of the objective and nonconvexity of constraints, which is very challenging to solve. To obtain a suboptimal but efficient solution to the problem, we first transform the original problem into a more tractable form and develop an iterative algorithm for its solution by leveraging the inner approximation framework. We further extend the proposed algorithm to the case of imperfect location information of Eve, where the average worst-case secrecy rate is considered as the objective function. Extensive numerical results are provided to demonstrate the merits of the proposed algorithms over existing approaches.
Submitted 25 March, 2020; v1 submitted 21 March, 2020;
originally announced March 2020.
-
Deep Learning versus Traditional Classifiers on Vietnamese Students' Feedback Corpus
Authors:
Phu X. V. Nguyen,
Tham T. T. Hong,
Kiet Van Nguyen,
Ngan Luu-Thuy Nguyen
Abstract:
Student feedback is an important source of students' opinions for improving the quality of training activities. By applying sentiment analysis to student feedback data, we can determine sentiment polarities that reveal problems in the institution, so that the necessary changes can be made to improve the quality of teaching and learning. This study focused on machine learning and natural language processing techniques (Naive Bayes, Maximum Entropy, Long Short-Term Memory, Bi-Directional Long Short-Term Memory) on the Vietnamese Students' Feedback Corpus collected from a university. The final results were compared and evaluated to find the most effective model based on different evaluation criteria. The experimental results show that the Bi-Directional Long Short-Term Memory algorithm outperformed the three other algorithms in terms of F1-score, with 92.0% on the sentiment classification task and 89.6% on the topic classification task. In addition, we developed a sentiment analysis application for analyzing student feedback. The application will help the institution to recognize students' opinions about a problem and identify shortcomings that still exist. With the use of this application, the institution can propose an appropriate method to improve the quality of training activities in the future.
Submitted 17 November, 2019;
originally announced November 2019.
-
Weakly-supervised Action Localization with Background Modeling
Authors:
Phuc Xuan Nguyen,
Deva Ramanan,
Charless C. Fowlkes
Abstract:
We describe a latent approach that learns to detect actions in long sequences given training videos with only whole-video class labels. Our approach makes use of two innovations to attention-modeling in weakly-supervised learning. First, and most notably, our framework uses an attention model to extract both foreground and background frames whose appearance is explicitly modeled. Most prior works ignore the background, but we show that modeling it allows our system to learn a richer notion of actions and their temporal extents. Second, we combine bottom-up, class-agnostic attention modules with top-down, class-specific activation maps, using the latter as a form of self-supervision for the former. Doing so allows our model to learn a more accurate model of attention without explicit temporal supervision. These modifications lead to 10% AP@IoU=0.5 improvement over existing systems on THUMOS14. Our proposed weakly-supervised system outperforms recent state-of-the-art methods by at least 4.3% AP@IoU=0.5. Finally, we demonstrate that weakly-supervised learning can be used to aggressively scale up learning to in-the-wild, uncurated Instagram videos. The addition of these videos significantly improves localization performance of our weakly-supervised model.
Submitted 18 August, 2019;
originally announced August 2019.
-
Optimal User Pairing for Achieving Rate Fairness in Downlink NOMA Networks
Authors:
Van-Phuc Bui,
Phu X. Nguyen,
Hieu V. Nguyen,
Van-Dinh Nguyen,
Oh-Soon Shin
Abstract:
In this paper, a downlink non-orthogonal multiple access (NOMA) network is studied. We investigate the problem of jointly optimizing user pairing and beamforming design to maximize the minimum rate among all users. The considered problem belongs to a difficult class of mixed-integer nonconvex optimization problems. We first relax the binary constraints and adopt a sequential convex approximation method to solve the relaxed problem, which is guaranteed to converge at least to a locally optimal solution. Numerical results show that the proposed method attains higher rate fairness among users, compared with traditional beamforming solutions, i.e., random pairing NOMA and beamforming systems.
Submitted 30 December, 2018;
originally announced December 2018.
-
Phrase-Based Attentions
Authors:
Phi Xuan Nguyen,
Shafiq Joty
Abstract:
Most state-of-the-art neural machine translation systems, despite being different in architectural skeletons (e.g., recurrent, convolutional), share an indispensable feature: attention. However, most existing attention methods are token-based and ignore the importance of phrasal alignments, the key ingredient for the success of phrase-based statistical machine translation. In this paper, we propose novel phrase-based attention methods to model n-grams of tokens as attention entities. We incorporate our phrase-based attentions into the recently proposed Transformer network, and demonstrate that our approach yields improvements of 1.3 BLEU for English-to-German and 0.5 BLEU for German-to-English translation tasks on WMT newstest2014 using WMT'16 training data.
Submitted 30 September, 2018;
originally announced October 2018.
-
An Efficient Spectral Leakage Filtering for IEEE 802.11af in TV White Space
Authors:
Phu Xuan Nguyen,
Thinh Hung Pham,
Trang Hoang,
Oh-Soon Shin
Abstract:
Orthogonal frequency division multiplexing (OFDM) has been widely adopted for modern wireless standards and has become a key enabling technology for cognitive radios. However, one of its main drawbacks is significant spectral leakage due to the accumulation of multiple sinc-shaped subcarriers. In this paper, we present a novel pulse shaping scheme for efficient spectral leakage suppression in the OFDM-based physical layer of the IEEE 802.11af standard. With conventional pulse shaping filters such as a raised-cosine filter, vestigial symmetry can be used to reduce spectral leakage very effectively. However, these pulse shaping filters require a long guard interval, i.e., cyclic prefix in an OFDM system, to avoid inter-symbol interference (ISI), resulting in a loss of spectral efficiency. The proposed pulse shaping method based on asymmetric pulse shaping achieves better spectral leakage suppression and decreases ISI caused by filtering as compared to conventional pulse shaping filters.
Submitted 22 December, 2017;
originally announced December 2017.
-
The Open World of Micro-Videos
Authors:
Phuc Xuan Nguyen,
Gregory Rogez,
Charless Fowlkes,
Deva Ramanan
Abstract:
Micro-videos are six-second videos popular on social media networks with several unique properties. Firstly, because of the authoring process, they contain significantly more diversity and narrative structure than existing collections of video "snippets". Secondly, because they are often captured by hand-held mobile cameras, they contain specialized viewpoints including third-person, egocentric, and self-facing views seldom seen in traditionally produced video. Thirdly, due to their continuous production and publication on social networks, aggregate micro-video content contains interesting open-world dynamics that reflect the temporal evolution of tag topics. These aspects make micro-videos an appealing well of visual data for developing large-scale models for video understanding. We analyze a novel dataset of micro-videos labeled with 58 thousand tags. To analyze this data, we introduce viewpoint-specific and temporally-evolving models for video understanding, defined over state-of-the-art motion and deep visual features. We conclude that our dataset opens up new research opportunities for large-scale video analysis, novel viewpoints, and open-world dynamics.
Submitted 31 March, 2016; v1 submitted 30 March, 2016;
originally announced March 2016.