Vladan Radosavljevic

New York, New York, United States
5K followers 500+ connections

About

Interests: Applied Machine Learning and Data Science, Search Relevance and Ranking…

Experience

  • Spotify

    Greater New York City Area

  • -

    Greater Buenos Aires, Argentina

  • -

    Greater Pittsburgh Area

  • -

    Greater Pittsburgh Area

  • -

    Sunnyvale, CA

  • -

    Philadelphia, PA

  • -

    Philadelphia, PA

  • -

    Philadelphia, PA

  • -

    Princeton, NJ

  • -

    Princeton, NJ

  • -

    Belgrade, Serbia

Education

Publications

  • Context- and Content-aware Embeddings for Query Rewriting in Sponsored Search

    Special Interest Group on Information Retrieval (SIGIR)

    Search engines represent one of the most popular web services, visited by more than 85% of internet users on a daily basis. Advertisers are interested in making use of this vast business potential, as the very clear intent signal communicated through the issued query allows effective targeting of users. This idea is embodied in the sponsored search model, where each advertiser maintains a list of keywords they deem indicative of increased user response rate with regards to their business. According to this targeting model, when a query is issued all advertisers with a matching keyword are entered into an auction according to the amount they bid for the query, and the winner gets to show their ad. One of the main challenges is the fact that a query may not match many keywords, resulting in lower auction value, lower ad quality, and lost revenue for advertisers and publishers. A possible solution is to expand a query into a set of related queries and use them to increase the number of matched ads, a process called query rewriting. To this end, we propose a rewriting method based on a novel query embedding algorithm, which jointly models query content as well as its context within an online session. As a result, queries with similar content and context are mapped into vectors close in the embedding space, which allows expansion of a query via a simple K-nearest-neighbor search in the projected space. The method was trained on more than 12 billion sessions, one of the largest corpora reported thus far, and evaluated on both a public TREC data set and an in-house sponsored search data set. The results show the proposed approach significantly outperformed the existing state-of-the-art, strongly indicating its benefits and monetization potential.
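The expansion step described above, mapping queries to vectors and retrieving related queries by nearest-neighbor search, can be sketched as follows. The queries and embedding values below are invented toy data, not the vectors learned in the paper:

```python
import numpy as np

# Toy query embeddings; in the paper these are learned jointly from
# query content and session context (all vectors here are made up).
embeddings = {
    "cheap flights": np.array([0.9, 0.1, 0.0]),
    "airline tickets": np.array([0.85, 0.2, 0.05]),
    "hotel deals": np.array([0.2, 0.9, 0.1]),
    "car rental": np.array([0.1, 0.8, 0.3]),
}

def expand_query(query, k=2):
    """Return the k nearest-neighbor queries by cosine similarity."""
    q = embeddings[query]
    q = q / np.linalg.norm(q)
    scores = {
        other: float(v @ q) / np.linalg.norm(v)
        for other, v in embeddings.items() if other != query
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(expand_query("cheap flights", k=2))
```

At production scale the exhaustive scan would be replaced by an approximate nearest-neighbor index, but the retrieval logic is the same.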

    Other authors
  • E-commerce in Your Inbox: Product Recommendations at Scale

    ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)

    In recent years online advertising has become increasingly ubiquitous and effective. Advertisements shown to visitors fund sites and apps that publish digital content, manage social networks, and operate e-mail services. Given such a large variety of internet resources, determining an appropriate type of advertising for a given platform has become critical to financial success. Native advertisements, namely ads that are similar in look and feel to content, have had great success in news and social feeds. However, to date there has not been a winning formula for ads in e-mail clients. In this paper we describe a system that leverages user purchase history determined from e-mail receipts to deliver highly personalized product ads to Yahoo Mail users. We propose a novel neural language-based algorithm specifically tailored for delivering effective product recommendations, which was evaluated against baselines that included showing popular products and products predicted based on co-occurrence. We conducted rigorous offline testing using a large-scale product purchase data set, covering purchases of more than 29 million users from 172 e-commerce websites. Ads in the form of product recommendations were successfully tested on online traffic, where we observed a steady 9% lift in click-through rates over other ad formats in mail, as well as a comparable lift in conversion rates. Following successful tests, the system was launched into production during the holiday season of 2014.

    Other authors
  • Gender and Interest Targeting for Sponsored Post Advertising at Tumblr

    ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)

    As one of the leading platforms for creative content, Tumblr offers advertisers a unique way of creating brand identity. Advertisers can tell their story through images, animation, text, music, video, and more, and can promote that content by sponsoring it to appear as an advertisement in the users' live feeds. In this paper, we present a framework that enabled two of the key targeted advertising components for Tumblr, gender and interest targeting. We describe the main challenges encountered during the development of the framework, which include the creation of a ground truth for training gender prediction models, as well as mapping Tumblr content to a predefined interest taxonomy. For purposes of inferring user interests, we propose a novel semi-supervised neural language model for categorization of Tumblr content (i.e., post tags and post keywords). The model was trained on a large-scale data set consisting of 6.8 billion user posts, with a very limited amount of categorized keywords, and was shown to have superior performance over the baseline approaches. We successfully deployed gender and interest targeting capability in Yahoo production systems, delivering inference for users that covers more than 90% of daily activities on Tumblr. Online performance results indicate advantages of the proposed approach, where we observed 20% increase in user engagement with sponsored posts in comparison to untargeted campaigns.

    Other authors
  • Hate Speech Detection with Comment Embeddings

    International World Wide Web Conference (WWW)

    We address the problem of hate speech detection in online user comments. Hate speech, defined as "abusive speech targeting specific group characteristics, such as ethnicity, religion, or gender", is an important problem plaguing websites that allow users to leave feedback, having a negative impact on their online business and overall user experience. We propose to learn distributed low-dimensional representations of comments using recently proposed neural language models, which can then be fed as inputs to a classification algorithm. Our approach addresses the issues of high dimensionality and sparsity that impact the current state-of-the-art, resulting in highly efficient and effective hate speech detectors.
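The pipeline the abstract describes, fixed low-dimensional comment vectors fed into an off-the-shelf classifier, can be illustrated with a minimal sketch. The synthetic "embeddings" below merely stand in for the learned comment representations:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for learned low-dimensional comment embeddings; here two
# well-separated synthetic clusters (1 = hateful, 0 = clean).
X = np.vstack([rng.normal(-1.0, 0.3, (50, 5)), rng.normal(1.0, 0.3, (50, 5))])
y = np.array([0] * 50 + [1] * 50)

# Plain logistic regression on the embedding vectors via gradient descent.
w, b = np.zeros(5), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
    grad = p - y                              # gradient of log loss
    w -= 0.1 * (X.T @ grad) / len(y)
    b -= 0.1 * grad.mean()

preds = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(int)
accuracy = (preds == y).mean()
print("train accuracy:", accuracy)
```

Any linear classifier works here; the point of the paper is that dense comment embeddings avoid the dimensionality and sparsity problems of bag-of-words inputs.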

    Other authors
  • Hierarchical Neural Language Models for Joint Representation of Streaming Documents and their Content

    International World Wide Web Conference (WWW)

    We consider the problem of learning distributed representations for documents in data streams. The documents are represented as low-dimensional vectors, jointly learned with distributed vector representations of word tokens in a hierarchical framework with two embedded neural language models. We exploit the context of documents in streams and model their sequential relationships as well as the word content within them. The models learn continuous vector representations for both word tokens and documents such that semantically similar documents and words are close in a common vector space. We discuss extensions to our models, which learn user-specific vectors to represent individual preferences and can be applied to personalized recommendation and social relationship mining. We validated the learned representations on a public movie rating data set from MovieLens, as well as on a large-scale Yahoo News data set comprising three months of user activity collected on Yahoo servers. The results indicate that the proposed model can learn high-quality vector representations of both documents and word tokens, outperforming the current state-of-the-art by a large margin.

    Other authors
  • queryCategorizr: A Large-Scale Semi-Supervised System for Categorization of Web Search Queries

    International World Wide Web Conference (WWW)

    Understanding the interests expressed through a user's search queries is a task of critical importance for many internet applications. To help identify user interests, web engines commonly classify queries into one or more pre-defined interest categories. However, the majority of queries are noisy short texts, making accurate classification a challenging task. In this demonstration, we present queryCategorizr, a novel semi-supervised learning system that embeds queries into a low-dimensional vector space using a neural language model applied to search log sessions, and classifies them into general interest categories while relying on only a small set of labeled queries. Empirical results on large-scale data show that queryCategorizr outperforms the current state-of-the-art approaches. In addition, we describe a Graphical User Interface (GUI) that allows users to query the system and explore classification results in an interactive manner.
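A simplified view of the semi-supervised idea, categorizing an embedded query by its proximity to a handful of labeled queries, might look like this. The vectors and categories are toy values, not the system's actual model:

```python
import numpy as np

# A handful of labeled query embeddings (toy vectors standing in for the
# neural-language-model embeddings learned from search sessions).
labeled = {
    "buy laptop": ("shopping", np.array([0.9, 0.1])),
    "flu symptoms": ("health", np.array([0.1, 0.9])),
}

def categorize(vec):
    """Assign the category of the most similar labeled query (cosine)."""
    v = vec / np.linalg.norm(vec)
    best = max(labeled.values(),
               key=lambda cv: float(cv[1] @ v) / np.linalg.norm(cv[1]))
    return best[0]

print(categorize(np.array([0.8, 0.2])))   # embedding near "buy laptop"
```

Because labels propagate through the embedding space, only a small seed set of categorized queries is needed to cover a large vocabulary of unlabeled ones.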

    Other authors
  • Search Retargeting using Directed Query Embeddings

    International World Wide Web Conference (WWW)

    Determining the user audience for online ad campaigns is a critical problem for companies competing in the online advertising space. One of the most popular strategies is search retargeting, which involves targeting users that issued search queries related to the advertiser's core business, commonly specified by advertisers themselves. However, advertisers often fail to include many relevant queries, which results in suboptimal campaigns and negatively impacts revenue for both advertisers and publishers. To address this issue, we use recently proposed neural language models to learn low-dimensional, distributed query embeddings, which can be used to expand query lists with related queries through simple nearest neighbor searches in the embedding space. Experiments on a real-world data set strongly suggest the benefits of the approach.

    Other authors
  • Gaussian Conditional Random Fields for Aggregation of Operational Aerosol Retrievals

    IEEE Geoscience and Remote Sensing Letters

    We present a Gaussian Conditional Random Field model for aggregation of Aerosol Optical Depth (AOD) retrievals from multiple satellite instruments into a joint retrieval. The model provides aggregated retrievals with higher accuracy and coverage than any of the individual instruments, while also providing an estimation of retrieval uncertainty. The proposed model finds an optimal, temporally-smoothed combination of individual retrievals that minimizes Root Mean Squared Error of AOD retrieval. We evaluated the model on five years (2006 - 2010) of satellite data over North America from 5 instruments (Aqua and Terra MODIS, MISR, SeaWiFS, and OMI), collocated with ground-based AERONET ground-truth AOD readings, clearly showing that aggregation of different sources leads to improvements in accuracy and coverage of AOD retrievals.

    Other authors
  • Hidden Conditional Random Fields with Deep User Embeddings for Ad Targeting

    IEEE International Conference on Data Mining (ICDM)

    Estimating a user's propensity to click on a display ad or purchase a particular item is a critical task in targeted advertising, a burgeoning online industry worth billions of dollars. Better and more accurate estimation methods result in an improved online experience for users, as only relevant and interesting ads are shown, and may also lead to large benefits for advertisers, as targeted users are more likely to click or make a purchase. In this paper we address this important problem and propose an approach for improved estimation of ad click or conversion probability based on a sequence of a user's online actions, modeled using a state-of-the-art Hidden Conditional Random Field (HCRF) model. In addition, in order to address the sparsity issue at the input side of the HCRF model, we propose to learn a distributed, low-dimensional representation of user actions through a directed skip-gram model, a novel deep architecture suitable for sequential data. The experimental results on a real-world data set comprising thousands of online user sessions collected at the servers of a large internet company clearly indicate the benefits and the potential of the proposed approach, which outperformed the competing state-of-the-art algorithms and obtained significant improvements in terms of retrieval measures.

    Other authors
  • Neural Gaussian Conditional Random Fields

    Machine Learning and Knowledge Discovery in Databases (ECML/PKDD)

    We propose a Conditional Random Field (CRF) model for structured regression. By constraining the feature functions as quadratic functions of outputs, the model can be conveniently represented in a Gaussian canonical form. We improved the representational power of the resulting Gaussian CRF (GCRF) model by (1) introducing an adaptive feature function that can learn nonlinear relationships between inputs and outputs and (2) allowing the weights of feature functions to be dependent on inputs. Since both the adaptive feature functions and weights can be constructed using feedforward neural networks, we call the resulting model Neural GCRF. The appeal of Neural GCRF is in the conceptual simplicity and computational efficiency of learning and inference through the use of sparse matrix computations. Experimental evaluation on the remote sensing problem of aerosol estimation from satellite measurements and on the problem of document retrieval showed that Neural GCRF is more accurate than the benchmark predictors.
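The Gaussian canonical form is what makes inference cheap: the GCRF mean is the solution of a (sparse) linear system Q·mu = b, where Q collects association terms tying each output to its unstructured predictor and interaction terms on graph edges. A small illustrative sketch, with made-up alpha, beta, and predictions rather than the paper's learned parameters:

```python
import numpy as np

# GCRF in canonical form: P(y|x) ∝ exp(-y'Qy + 2b'y). Q is built from an
# association weight alpha (output i vs. unstructured prediction R[i]) and
# an interaction weight beta on graph edges (illustrative values only).
alpha, beta = 1.0, 0.5
R = np.array([2.0, 4.0, 6.0])    # unstructured per-node predictions
edges = [(0, 1), (1, 2)]         # a small chain graph

n = len(R)
Q = alpha * np.eye(n)
for i, j in edges:               # add graph-Laplacian-style interaction terms
    Q[i, i] += beta; Q[j, j] += beta
    Q[i, j] -= beta; Q[j, i] -= beta
b = alpha * R

# Inference = mean of the Gaussian: solve Q mu = b
# (a sparse solve in real applications).
mu = np.linalg.solve(Q, b)
print(mu)   # predictions smoothed toward neighbors, relative to R
```

Note how the interaction terms pull neighboring outputs toward each other: the middle node keeps its prediction of 4.0, while the endpoints move inward from 2.0 and 6.0.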

    Other authors
  • Non-linear Label Ranking for Large-scale Prediction of Long-Term User Interests

    AAAI Conference on Artificial Intelligence (AAAI)

    We consider the problem of personalization of online services from the viewpoint of display ad targeting, where we seek to find the best ad categories to be shown to each user, resulting in improved user experience and increased advertiser revenue. We propose to address this problem as a task of ranking the ad categories by each user's preferences, and introduce a novel label ranking approach capable of efficiently learning non-linear, highly accurate models in large-scale settings. Experiments on a real-world advertising data set with more than 3.2 million users show that the proposed algorithm outperforms the existing solutions in terms of both rank loss and top-K retrieval performance, strongly suggesting the benefit of using the proposed model on large-scale ranking problems.

    Other authors
  • Utilizing temporal patterns for estimating uncertainty in interpretable early decision making

    ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)

    Early classification of time series is prevalent in many time-sensitive applications such as, but not limited to, early warning of disease outcome and early warning of crisis in the stock market. For example, early diagnosis allows physicians to design appropriate therapeutic strategies at early stages of diseases. However, practical adoption of early classification of time series requires an easy-to-understand explanation (interpretability) and a measure of confidence in the prediction results (uncertainty estimates). These two aspects were not jointly addressed in previous time series early classification studies, such that a difficult choice of selecting one of these aspects is required. In this study, we propose a simple and yet effective method to provide uncertainty estimates for an interpretable early classification method. The question we address here is "how to provide estimates of uncertainty in regard to interpretable early prediction." In our extensive evaluation on twenty time series datasets we showed that the proposed method has several advantages over the state-of-the-art method that provides reliability estimates in early classification. Namely, the proposed method is more effective than the state-of-the-art method, is simple to implement, and provides interpretable results.

    Other authors
  • Continuous Conditional Random Fields for Efficient Regression in Large Fully Connected Graphs

    AAAI Conference on Artificial Intelligence (AAAI)

    When used for structured regression, powerful Conditional Random Fields (CRFs) are typically restricted to modeling effects of interactions among examples in local neighborhoods. Using more expressive representation would result in dense graphs, making these methods impractical for large-scale applications. To address this issue, we propose an effective CRF model with linear scale-up properties regarding approximate learning and inference for structured regression on large, fully connected graphs. The proposed method is validated on real-world large-scale problems of image denoising and remote sensing. In conducted experiments, we demonstrated that dense connectivity provides an improvement in prediction accuracy. Inference time of less than ten seconds on graphs with millions of nodes and trillions of edges makes the proposed model an attractive tool for large-scale, structured regression problems.

    Other authors
    • Kosta Ristovski
    • Zoran Obradovic
  • Extraction of interpretable multivariate patterns for early diagnostics

    IEEE International Conference on Data Mining (ICDM)

    Leveraging temporal observations to predict a patient's health state at a future period is a very challenging task. Providing such a prediction early and accurately allows for designing a more successful treatment that starts before a disease completely develops. Information for this kind of early diagnosis could be extracted by use of temporal data mining methods for handling complex multivariate time series. However, physicians usually prefer to use interpretable models that can be easily explained, rather than relying on more complex black-box approaches. In this study, a temporal data mining method is proposed for extracting interpretable patterns from multivariate time series data, which can be used to assist in providing interpretable early diagnosis. The problem is formulated as an optimization-based binary classification task addressed in three steps. First, the time series data is transformed into a binary matrix representation suitable for application of classification methods. Second, a novel convex-concave optimization problem is defined to extract multivariate patterns from the constructed binary matrix. Then, a mixed integer discrete optimization formulation is provided to reduce the dimensionality and extract interpretable multivariate patterns. Finally, those interpretable multivariate patterns are used for early classification in challenging clinical applications. In the conducted experiments on two human viral infection datasets and a larger myocardial infarction dataset, the proposed method was more accurate and provided classifications earlier than three alternative state-of-the-art methods.

    Other authors
  • Multi-dimensional Coherence Deblending of Simultaneous Sources

    IEEE Intl Geoscience and Remote Sensing Symposium (IGARSS)

    Recent seismic exploration research proposes blended acquisition from simultaneous sources with temporal overlap between records to reduce survey time and improve the quality of seismic imaging. However, separating source records for traditional pre-stack processing poses significant challenges. Traditional procedures are incapable of removing interference from other sources, which is equivalent to blended noise. We propose a new approach that enforces simultaneous source coherence across multiple domains in order to estimate the maximum likely distribution of energy amongst the sources.

    Other authors
    • Heiko Claussen (first author)
    • Justinian Rosca
  • A Data Mining Approach for Optimization of Acute Inflammation Therapy

    IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

    Acute inflammation is a medical condition which occurs over seconds, minutes, or hours and is characterized as a systemic inflammatory response to an infection. Delaying treatment by only one hour decreases a patient's chance of survival by about 7%. Therefore, there is a critical need for tools that can aid therapy optimization for this potentially fatal condition. Towards this objective we developed a data-driven approach for therapy optimization where a predictive model for patients' behavior is learned directly from historical data. The predictive model is then incorporated into a model predictive control optimization algorithm to find an optimal therapy which will lead the patient to a healthy state. To save on the cost of clinical trials and potential failure, we evaluated our model on a population of virtual patients capable of emulating the inflammatory response. Patients are treated with two drugs for which dosage and timing are critical for the outcome of the treatment. Our results show a significant improvement in the percentage of healthy outcomes compared to previously proposed methods for acute inflammation treatment found in the literature and in clinical practice. In particular, application of our method rescued 88% of patients that would otherwise die within 168 hours due to a septic or aseptic state. In contrast, the best method from the literature rescued only 73% of patients.

    Other authors
    • Zoran Obradovic
    • Kosta Ristovski
  • Kernel-based Characterization of Dynamics in a Heterogeneous Population of Septic Patients Under Therapy

    International Conference on Machine Learning Workshop on Machine Learning for Clinical Data Analysis (ICML Workshop)

    Sepsis is a medical condition characterized as a systemic inflammatory response to an infection. The high level of heterogeneity among sepsis patients is one of the main reasons for unsuccessful clinical trials. A more careful targeting of specific therapeutic strategies to more biologically homogeneous groups of patients is essential to developing effective sepsis treatment. We propose a kernel-based approach to characterize the dynamics of inflammatory response in a heterogeneous population of septic patients. Our method utilizes Linear State Space Control (LSSC) models to take into account the dynamics of inflammatory response over time as well as the effect of therapy applied to the patient. We use a similarity measure defined on kernels of LSSC models to find homogeneous groups of patients. An application of the proposed method to analysis of the dynamics of inflammatory response to sepsis therapy in 64 virtual patients identified four biologically relevant homogeneous groups, providing initial evidence that patient-specific sepsis treatment based on several treatment protocols is feasible.

    Other authors
    • Zoran Obradovic
    • Kosta Ristovski
  • Travel Speed Forecasting by Means of Continuous Conditional Random Fields

    Transportation Research Record: Journal of the Transportation Research Board

    This paper explores the application of the recently proposed continuous conditional random fields (CCRF) to travel forecasting. CCRF is a flexible, probabilistic framework that can seamlessly incorporate multiple traffic predictors and exploit spatial and temporal correlations inherently present in traffic data. In addition to improving prediction accuracy, the probabilistic approach provides information about prediction uncertainty. Moreover, information about the relative importance of particular predictors and of spatial–temporal correlations can be easily extracted from the model. CCRF is fault-tolerant and can provide predictions even when some observations are missing. Several CCRF models were applied to the problem of travel speed prediction in a range from 10 to 60 min ahead and evaluated on loop detector data from a 5.71-mi section of I-35W in Minneapolis, Minnesota. Several CCRF models, with increasing levels of complexity, are proposed to better assess the performance of the method. When these CCRF models were compared with the linear regression model, they reduced the mean absolute error by around 4%. The results imply that modeling spatial and temporal neighborhoods in traffic data and combining various baseline predictors under the CCRF framework can be beneficial.

    Other authors
    • Nemanja Djuric (first author)
    • Vladimir Coric
    • Slobodan Vucetic
    See publication
  • Continuous Conditional Random Fields for Regression in Remote Sensing

    19th European Conf. on Artificial Intelligence (ECAI)

    Conditional random fields (CRF) are widely used for predicting output variables that have some internal structure. Most of the CRF research has been done on structured classification where the outputs are discrete. In this study we propose a CRF probabilistic model for structured regression that uses multiple non-structured predictors as its features. We construct features as squared prediction errors and show that this results in a Gaussian predictor. Learning becomes a convex optimization problem leading to a global solution for a set of parameters. Inference can be conveniently conducted through matrix computation. Experimental results on the remote sensing problem of estimating Aerosol Optical Depth (AOD) provide strong evidence that the proposed CRF model successfully exploits the inherent spatio-temporal properties of AOD data. The experiments revealed that CRF are more accurate than the baseline neural network and domain-based predictors.
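    The matrix-computation inference described above can be sketched numerically (a minimal illustration under assumptions, not the paper's exact formulation): with squared-error features the CCRF density is Gaussian, so the predicted mean is the solution of one linear system whose precision matrix combines the per-predictor feature weights with a graph Laplacian over neighboring outputs.

```python
import numpy as np

def ccrf_mean(R, alpha, beta, adj):
    """Posterior mean of a Gaussian CRF for structured regression.

    R:     (K, n) predictions from K unstructured predictors
    alpha: (K,) nonnegative weights on the squared-error features
    beta:  weight on the pairwise smoothness feature
    adj:   (n, n) symmetric 0/1 adjacency of the output graph
    """
    K, n = R.shape
    deg = adj.sum(axis=1)
    L = np.diag(deg) - adj                             # graph Laplacian
    Q = 2.0 * (alpha.sum() * np.eye(n) + beta * L)     # precision matrix
    b = 2.0 * (alpha @ R)                              # linear term
    return np.linalg.solve(Q, b)                       # mean = Q^{-1} b
```

    With beta = 0 the model reduces to a weighted average of the unstructured predictors; increasing beta smooths the predictions across neighboring outputs.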

    Other authors
    • Zoran Obradovic
    • Slobodan Vucetic
    See publication
  • A Data Mining Technique for Aerosol Retrieval Across Multiple Accuracy Measures

    IEEE Geoscience and Remote Sensing Letters

    A typical approach in supervised learning is to select an accuracy measure and train a predictor that maximizes it. This can be insufficient in remote-sensing applications where predictor performance is often evaluated over multiple domain-specific accuracy measures. Here, we test the hypothesis that predictors can be trained to maximize performance over multiple accuracy measures. To do this, we evaluate several metalearning algorithms on the problem of aerosol optical depth (AOD) retrieval. The multiple accuracy measures included mean squared error, correlation, relative squared error, and fraction of satisfactory predictions. The proposed metalearning algorithms have a two-layer architecture, where the first layer consists of multiple neural networks, each trained using a different accuracy measure, and the second layer aggregates decisions of the first layer predictors. To evaluate AOD predictors, we used nearly 70 000 collocated data points whose attributes were radiances, solar and view angles, and terrain elevation from MODerate resolution Imaging Spectrometer (MODIS) instrument satellite observations and whose target AOD variable was obtained from the ground-based AEROsol robotic NETwork (AERONET) instruments. The data were collected at 221 AERONET locations over the globe in the period between 2005 and 2007. AOD prediction accuracies of neural networks were compared to the recently developed operational MODIS C005 retrieval algorithm and to several other data-mining methods. Results showed that neural networks are better at reproducing the test data than the operational retrieval algorithm and that predictors obtained by metalearning are robust over multiple accuracy measures.
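    The four accuracy measures named above can be computed as follows (a sketch; the satisfactory-prediction envelope `0.05 + 0.15 * AOD` mirrors the commonly cited MODIS expected-error criterion and is an assumption here, not taken from the paper):

```python
import numpy as np

def aod_accuracy_measures(y_true, y_pred):
    """Evaluate an AOD predictor over multiple domain-specific measures."""
    err = y_pred - y_true
    mse = np.mean(err ** 2)                                   # mean squared error
    corr = np.corrcoef(y_true, y_pred)[0, 1]                  # correlation
    rse = np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)  # relative squared error
    # Fraction of "satisfactory" predictions (assumed MODIS-style error envelope)
    frac_ok = np.mean(np.abs(err) <= 0.05 + 0.15 * y_true)
    return {"mse": mse, "corr": corr, "rse": rse, "frac_ok": frac_ok}
```

    A metalearner as described above would train each first-layer network against one of these measures and let the second layer aggregate their outputs.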

    Other authors
    • Zoran Obradovic
    • Slobodan Vucetic
    See publication
  • Reduction of Ground-Based Sensor Sites for Spatio-Temporal Analysis of Aerosols

    Proc. 3rd International Workshop on Knowledge Discovery from Sensor Data at the 15th ACM SIGKDD Conf. Knowledge Discovery and Data Mining (ACM SIGKDD Workshop)

    In many remote sensing applications it is important to use multiple sensors to be able to understand the major spatio-temporal distribution patterns of an observed phenomenon. A particular remote sensing application addressed in this study is estimation of an important property of atmosphere, called Aerosol Optical Depth (AOD). Remote sensing data for AOD estimation are collected from ground and satellite-based sensors. Satellite-based measurements can be used as attributes for estimation of AOD and in this way could lead to better understanding of spatio-temporal aerosol patterns on a global scale. Ground-based AOD estimation is more accurate and is traditionally used as ground-truth information in validation of satellite-based AOD estimations. In contrast to this traditional role of ground-based sensors, a data mining approach allows more active use of ground-based measurements as labels in supervised learning of a regression model for AOD estimation from satellite measurements. Considering the high operational costs of ground-based sensors, we are studying a budget-cut scenario that requires a reduction in a number of ground-based sensors. To minimize loss of information, the objective is to retain sensors that are the most useful as a source of labeled data. The proposed goodness criterion for the selection is how close the accuracy of a regression model built on data from a reduced sensor set is to the accuracy of a model built of the entire set of sensors. We developed an iterative method that removes sensors one by one from locations where AOD can be predicted most accurately using training data from the remaining sites. Extensive experiments on two years of globally distributed AERONET ground-based sensor data provide strong evidence that sensors selected using the proposed algorithm are more informative than the competing approaches that select sensors at random or that select sensors based on spatial diversity.
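    The iterative removal procedure above can be sketched greedily (a simplified stand-in: here a removed sensor's series is predicted as the mean of its `k` nearest remaining neighbors, whereas the paper learns regression models; `k` and the MSE criterion are assumptions):

```python
import numpy as np

def greedy_sensor_reduction(values, coords, n_keep, k=2):
    """Drop sensors one by one, always removing the most predictable site.

    values: (S, T) measurement series per sensor
    coords: (S, 2) sensor locations
    """
    keep = list(range(len(values)))
    while len(keep) > n_keep:
        errs = []
        for s in keep:
            others = [o for o in keep if o != s]
            d = np.linalg.norm(coords[others] - coords[s], axis=1)
            nn = np.array(others)[np.argsort(d)[:k]]       # k nearest remaining sites
            pred = values[nn].mean(axis=0)                 # predict s from neighbors
            errs.append(np.mean((pred - values[s]) ** 2))
        keep.remove(keep[int(np.argmin(errs))])            # remove most predictable
    return keep
```

    Retained sensors are those whose measurements the remaining network reconstructs worst, i.e., the most informative sources of labels.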

    Other authors
    • Zoran Obradovic
    • Slobodan Vucetic
    See publication
  • Spatio-Temporal Partitioning for Improving Aerosol Prediction Accuracy

    Eighth SIAM Int'l Conf. on Data Mining (SDM)

    In supervised learning on data collected over space and time different relationships can be found over different spatio-temporal regions. In such situations an appropriate spatio-temporal data partitioning followed by building specialized predictors could often achieve higher overall prediction accuracy than when learning a single predictor on all the data. In practice, partitions are typically decided based on prior knowledge. As an alternative to the domain-based partitioning, we propose a method that automatically discovers a spatio-temporal partitioning through the competition of regression models. The method is evaluated on a challenging problem of using satellite observations to predict Aerosol Optical Depth (AOD) which represents the amount of depletion that a beam of radiation undergoes as it passes through the atmosphere. Our experiments used more than 20,000 labeled data points collected during 3 years over more than 100 sites worldwide. Our partitioning-based approach was compared to the recently developed operational AOD prediction algorithm, called C5, which uses domain knowledge for spatio-temporal partitioning of the Earth and implements a region-specific deterministic predictor that utilizes forward simulations from the postulated physical models. Data partitioning used in C5 divides the world into three spatio-temporal regions that differ based on the location and the time of the year as decided by domain experts. The results showed that a neural network predictor trained on all the data has accuracy comparable to C5. When specialized neural network predictors were learned on C5-based partitions, the overall prediction accuracy was not improved. On the other hand, our competition-based spatio-temporal data partitioning approach resulted in large accuracy improvements.
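    The competition between region-specific models can be sketched as an iterative reassign-and-refit loop (an illustrative analogue using plain linear models; the initialization, model class, and iteration count are assumptions, not the paper's setup):

```python
import numpy as np

def competing_linear_models(X, y, init_assign, n_iter=20):
    """Partition data by letting linear models compete for points."""
    n, d = X.shape
    Xb = np.c_[X, np.ones(n)]                      # add intercept column
    assign = init_assign.copy()
    n_models = assign.max() + 1
    preds = np.full((n_models, n), np.inf)
    for _ in range(n_iter):
        for m in range(n_models):
            mask = assign == m
            if mask.sum() >= Xb.shape[1]:          # enough points to fit
                w, *_ = np.linalg.lstsq(Xb[mask], y[mask], rcond=None)
                preds[m] = Xb @ w
        assign = np.argmin((preds - y) ** 2, axis=0)   # each point joins its best model
    return assign, preds[assign, np.arange(n)]
```

    Each point migrates to whichever model currently predicts it best, so partition boundaries emerge from the data rather than from a fixed domain-based split.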

    Other authors
    • Zoran Obradovic
    • Slobodan Vucetic
    See publication

Honors & Awards

  • Outstanding Graduate Teaching Assistant Award

    Computer and Information Sciences Department, Temple University, Philadelphia

    Awarded annually for outstanding teaching as a graduate teaching assistant.

  • 1st Prize, Graduate Student Project Competition

    Computer and Information Sciences Department, Temple University, Philadelphia

  • 1st Prize, Graduate Student Project Competition

    Computer and Information Sciences Department, Temple University, Philadelphia

  • Temple University Graduate Fellowship

    Temple University, Philadelphia, PA
