-
ESQA: Event Sequences Question Answering
Authors:
Irina Abdullaeva,
Andrei Filatov,
Mikhail Orlov,
Ivan Karpukhin,
Viacheslav Vasilev,
Denis Dimitrov,
Andrey Kuznetsov,
Ivan Kireev,
Andrey Savchenko
Abstract:
Event sequences (ESs) arise in many practical domains including finance, retail, social networks, and healthcare. In the context of machine learning, event sequences can be seen as a special type of tabular data with annotated timestamps. Despite the importance of ES modeling and analysis, little effort has been made to adapt large language models (LLMs) to the ES domain. In this paper, we highlight the common difficulties of ES processing and propose a novel solution capable of solving multiple downstream tasks with little or no finetuning. In particular, we solve the problem of working with long sequences and improve the processing of time and numeric features. The resulting method, called ESQA, effectively utilizes the power of LLMs and, according to extensive experiments, achieves state-of-the-art results in the ES domain.
Submitted 19 July, 2024; v1 submitted 3 July, 2024;
originally announced July 2024.
-
Kandinsky 3.0 Technical Report
Authors:
Vladimir Arkhipkin,
Andrei Filatov,
Viacheslav Vasilev,
Anastasia Maltseva,
Said Azizov,
Igor Pavlov,
Julia Agafonova,
Andrey Kuznetsov,
Denis Dimitrov
Abstract:
We present Kandinsky 3.0, a large-scale text-to-image generation model based on latent diffusion, continuing the series of text-to-image Kandinsky models and reflecting our progress toward higher quality and realism of image generation. In this report we describe the architecture of the model, the data collection procedure, the training technique, and the production system for user interaction. We focus on the key components that, as a large number of experiments showed, had the most significant impact on improving the quality of our model compared to the others. We also describe extensions and applications of our model, including super resolution, inpainting, image editing, image-to-video generation, and Kandinsky 3.1, a distilled version of Kandinsky 3.0 that performs inference in 4 steps of the reverse process and runs 20 times faster without a decrease in visual quality. According to side-by-side human preference comparisons, Kandinsky has improved in text understanding and works better on specific domains. The code is available at https://github.com/ai-forever/Kandinsky-3
Submitted 28 June, 2024; v1 submitted 6 December, 2023;
originally announced December 2023.
-
FusionFrames: Efficient Architectural Aspects for Text-to-Video Generation Pipeline
Authors:
Vladimir Arkhipkin,
Zein Shaheen,
Viacheslav Vasilev,
Elizaveta Dakhova,
Andrey Kuznetsov,
Denis Dimitrov
Abstract:
Multimedia generation approaches occupy a prominent place in artificial intelligence research. Text-to-image models have achieved high-quality results over the last few years. However, video synthesis methods have only recently started to develop. This paper presents a new two-stage latent diffusion text-to-video generation architecture based on the text-to-image diffusion model. The first stage concerns keyframe synthesis to outline the storyline of a video, while the second is devoted to generating interpolation frames that make the movements of the scene and objects smooth. We compare several temporal conditioning approaches for keyframe generation. The results show the advantage of using separate temporal blocks over temporal layers in terms of metrics reflecting video generation quality aspects and human preference. The design of our interpolation model significantly reduces computational costs compared to other masked frame interpolation approaches. Furthermore, we evaluate different configurations of the MoVQ-based video decoding scheme to improve consistency and achieve higher PSNR, SSIM, MSE, and LPIPS scores. Finally, we compare our pipeline with existing solutions and achieve top-2 scores overall and top-1 among open-source solutions: CLIPSIM = 0.2976 and FVD = 433.054. Project page: https://ai-forever.github.io/kandinsky-video/
Submitted 20 December, 2023; v1 submitted 21 November, 2023;
originally announced November 2023.
-
On the properties of some low-parameter models for color reproduction in terms of spectrum transformations and coverage of a color triangle
Authors:
Alexey Kroshnin,
Viacheslav Vasilev,
Egor Ershov,
Denis Shepelev,
Dmitry Nikolaev,
Mikhail Tchobanou
Abstract:
One of the classical approaches to solving color reproduction problems, such as color adaptation or color space transform, is the use of low-parameter spectral models. The strength of this approach is the ability to choose a set of properties that the model should have, such as a large coverage area of the color triangle or an accurate description of the addition or multiplication of spectra given only the tristimulus values corresponding to them. The disadvantage is that some of the properties of the mentioned spectral models are confirmed only experimentally. This work is devoted to the theoretical substantiation of various properties of spectral models. In particular, we prove that the banded model is the only model that simultaneously possesses the properties of closure under addition and multiplication. We also show that the Gaussian model is the limiting case of the von Mises model and prove that the set of protomers of the von Mises model unambiguously covers the color triangle for both convex and non-convex spectral loci.
Submitted 21 October, 2021;
originally announced October 2021.
-
Recovering the characteristic functions of the Sturm-Liouville differential operators with singular potentials on star-type graph with cycle
Authors:
Sergey V. Vasilev
Abstract:
We consider Sturm-Liouville operators with singular potentials from the class on a star-type graph with a cycle, which consists of edges with commensurable lengths. An asymptotic representation for the eigenvalues of such operators is obtained. Recovery of the characteristic function of the Sturm-Liouville operators with singular potentials is considered.
Submitted 30 January, 2019;
originally announced January 2019.
-
Potential and Limitations of the Archaeo-Geophysical Techniques
Authors:
Yavor Shopov,
Diana Stoykova,
Antoniya Petrova,
Valentin Vasilev,
Ludmil Tsankov
Abstract:
This work demonstrates the potential and the limitations of the archaeo-geophysical techniques available at the Archaeological Geophysics Laboratory of the Department of Physics at the University of Sofia, using various case studies in natural and artificial environments. Special attention is given to GPR, which is the most powerful archaeo-geophysical technique. This laboratory is the only one in Bulgaria that develops new geophysical techniques and equipment for the survey and dating of archaeological sites.
Submitted 8 September, 2009;
originally announced September 2009.
-
An adjacency criterion for the prime graph of a finite simple group
Authors:
Anrei V. Vasilév,
Evgeny P. Vdovin
Abstract:
In this paper we give an exhaustive arithmetic criterion of adjacency in the prime graph $GK(G)$ for every finite nonabelian simple group $G$. Using this criterion, for every finite simple group we give an independence set with the maximal number of vertices, an independence set containing 2 with the maximal number of vertices, and the orders of these independence sets. We assemble this information in the tables at the end of the paper. Several applications of the obtained results to various problems of finite group theory are considered.
Submitted 12 July, 2010; v1 submitted 15 June, 2005;
originally announced June 2005.