Chu-Cheng Hsieh

San Francisco Bay Area
11K followers 500+ connections

About

I am an accomplished technology executive with deep expertise in engineering…

Experience

  • SHEIN

    United States

  • -

    San Francisco Bay Area

  • -

    San Francisco Bay Area

  • -

    San Francisco Bay Area

  • -

    San Francisco Bay Area

  • -

    San Francisco Bay Area

  • -

    San Francisco Bay Area

  • -

    Greater Los Angeles Area

  • -

    Taipei City, Taiwan

  • -

    Taipei City, Taiwan

Education

Publications

  • Neural-based agent assistance interface for providing answers based on a query vector

    US Patent Number 11,429,834

    Certain aspects of the present disclosure provide techniques for providing automated intelligence in a support session. In one example, a method includes generating a set of tokens based on a text-based query posted by a support agent to a live chat thread; generating a set of vectors based on the set of tokens; extracting a set of features based on the set of tokens; generating a query vector based on the set of vectors and the set of features; determining a predicted intent of the text-based query based on the query vector, wherein the predicted intent is one of a plurality of predefined intents; determining a predicted answer to the text-based query based on: the query vector; and the predicted intent; and providing the predicted answer to the text-based query in the live chat thread.

  • Speech processing using embedding data

    United States Patent Office

    This patent relates to removing user-identifying audio characteristics from audio before it is sent to the cloud, while still enabling the cloud to distinguish between different users' voice inputs to provide a personalized, cloud-enabled voice user interface. A user device (e.g., Echo, phone, etc.) processes audio locally to extract representative audio characteristics of the user's voice to create a new representation of the utterance (an “audio embedding”) and further processes these vocal characteristics to transform them into characteristics of a synthesized voice (e.g., using a “hash” function). The device sends those characteristics to a remote system that can then use them to identify who spoke the utterance. In this way the “voice” of the user is transformed into something that is different but repeatable and distinguishable from other voices. The user device and/or remote system may further use these transformed characteristics to actually create the synthesized voice to verify that the user ID is correct. Among other things, this approach balances privacy with personalization.

  • Toward Pareto Efficient Fairness-Utility Trade-off in Recommendation through Reinforcement Learning

    WSDM'22 – 15th ACM International WSDM Conference

    The issue of fairness in recommendation is becoming increasingly essential as Recommender Systems (RS) touch and influence more and more people in their daily lives. In fairness-aware recommendation, most of the existing algorithmic approaches mainly aim at solving a constrained optimization problem by imposing a constraint on the level of fairness while optimizing the main recommendation objective, e.g., click through rate (CTR). While this alleviates the impact of unfair recommendations, the expected return of an approach may significantly compromise the recommendation accuracy due to the inherent trade-off between fairness and utility. This motivates us to deal with these conflicting objectives and explore the optimal trade-off between them in RS.

    One conspicuous approach is to seek a Pareto efficient/optimal solution to guarantee optimal compromises between utility and fairness. Moreover, considering the needs of real-world e-commerce platforms, it would be more desirable if we could generalize the whole Pareto Frontier, so that decision-makers can specify any preference of one objective over another based on their current business needs. Therefore, in this work, we propose a fairness-aware recommendation framework using multi-objective reinforcement learning (MORL), called MoFIR, which is able to learn a single parametric representation for optimal recommendation policies over the space of all possible preferences. Specifically, we modify the traditional Deep Deterministic Policy Gradient (DDPG) by introducing a conditioned network (CN) into it, which conditions the networks directly on these preferences and outputs Q-value vectors.

    Experiments on several real-world recommendation datasets verify the superiority of our framework on both fairness metrics and recommendation measures when compared with all other baselines. We also extract the approximate Pareto Frontier on real-world datasets generated by MoFIR and compare it to state-of-the-art fairness methods.

  • Interpretable Attribute-based Action-aware Bandits for Within-Session Personalization in E-commerce

    Knowledge Management in eCommerce Workshop, in conjunction with WWW 2021

    When shopping online, buyers often express and refine their purchase preferences by exploring different items in the product catalog based on varying attributes, such as color, size, shape, and material. As such, it is increasingly important for e-commerce ranking systems to quickly learn a buyer’s fine-grained preferences and re-rank items based on their most recent activity within the session. In this paper, we propose an Online Personalized Attribute-based Re-ranker (OPAR), a lightweight, within-session personalization approach using multi-armed bandits (MAB). As the buyer continues on their shopping mission and interacts with different products in an online shop, OPAR learns which attributes the buyer likes and dislikes, forming an interpretable user preference profile and improving re-ranking performance over time, within the same session. By representing each arm in the MAB as an attribute, we reduce the complexity space (compared with modeling preferences at the item level) while offering more fine-grained personalization (compared with modeling preferences at the product category level). We naturally extend this formulation to weight attributes differently in the reward function, depending on how the buyer interacts with the item (e.g., click, add-to-cart, purchase). We train and evaluate OPAR on a real-world e-commerce search ranking system and benchmark it against 4 state-of-the-art baselines on 8 datasets and show an improvement in ranking performance across all tasks.

  • Automatic Speaker Recognition with Limited Data

    The 13th ACM International WSDM Conference (WSDM’20)

    Automatic speaker recognition (ASR) is a stepping-stone technology towards semantic multimedia understanding and benefits versatile downstream applications. In recent years, neural network-based ASR methods have demonstrated remarkable power to achieve excellent recognition performance with sufficient training data. However, it is impractical to collect sufficient training data for every user, especially for new users. Therefore, a large portion of users usually has a very limited number of training instances. As a consequence, the lack of training data prevents ASR systems from accurately learning users' acoustic biometrics, jeopardizes the downstream applications, and eventually impairs user experience.

    In this work, we propose an adversarial few-shot learning-based speaker identification framework (AFEASI) to develop robust speaker identification models with only a limited number of training instances. We first employ metric learning-based few-shot learning to learn speaker acoustic representations, where the limited instances are comprehensively utilized to improve the identification performance. In addition, adversarial learning is applied to further enhance the generalization and robustness for speaker identification with adversarial examples. Experiments conducted on a publicly available large-scale dataset demonstrate that AFEASI significantly outperforms eleven baseline methods. An in-depth analysis further indicates both effectiveness and robustness of the proposed method.

  • Bridging Mixture Density Networks with Meta-learning for Automatic Speaker Identification

    The 45th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP’20)

    Speaker identification answers the fundamental question “Who is speaking?” The identification technology enables various downstream applications to provide a personalized experience. Both the prevalent i-vector based solutions and the state-of-the-art deep learning solutions usually treat all users equally, with no distinctions between new users and existing users, during the training process. We notice that a good many new users start with limited labeled training data, which often results in inferior performance in recognizing users’ voices. To alleviate the disadvantage caused by training data deficiency, we propose a Mixture Density Network-based Meta-Learning method (MDNML) for speaker identification. MDNML emphasizes the expeditious process of learning to recognize new users where each has only a few seconds of labeled data.

    We conduct experiments on the LibriSpeech dataset and compare MDNML with four state-of-the-art baseline methods. The results show that MDNML achieves higher accuracy in recognizing new users with limited labeled utterances than all baseline methods. Our proposed solution significantly expedites the learning by transferring the knowledge learned from the existing user base through gradient-based meta-learning. We consider our work to be a stepping-stone for more sophisticated meta-learning frameworks for accelerating voice recognition. Furthermore, we discuss a strategy for enhancing the accuracy by incorporating the notion of household-based acoustic profiles with MDNML.

  • Isa: Intuit Smart Agent, A Neural-Based Agent-Assist Chatbot (to appear)

    IEEE International Conference on Data Mining (ICDM'18)

    Hiring seasonal workers in call centers to provide customer service is a common practice in B2C companies. The quality of service delivered by both contracting and employee customer service agents depends heavily on the domain knowledge available to them. When observing the internal group messaging channels used by agents, we found that similar questions are often asked repeatedly by different agents, especially less experienced ones. The goal of our work is to leverage the promising advances in conversational AI to provide a chatbot-like mechanism for assisting agents in promptly resolving a customer’s issue. In this paper, we develop a neural-based conversational solution that employs a BiLSTM with an attention mechanism and demonstrate how our system boosts the effectiveness of customer support agents. In addition, we discuss the design principles and the necessary considerations for our system. We then demonstrate how our system, named Isa (Intuit Smart Agent), can help customer service agents provide a high-quality customer experience by reducing customer wait time and by applying the knowledge accumulated from customer interactions in future applications.

  • Monetary Discount Strategies for Real-Time Promotion Campaign

    The 26th World Wide Web Conference (WWW'17)

    The effect of monetary promotions on real-life shopping decisions for utilitarian products has been well reported in the literature [3]. Nowadays, e-commerce retailers face fiercer competition on price promotion because online consumers can easily hunt for the best products with the highest value at a reasonable price. We study e-commerce data, namely shopping receipts collected from email accounts, and conclude that for utilitarian products like books or electronics, buyers are price sensitive and are willing to delay a purchase for better deals. We then present a real-time promotion framework, called the RTP framework, in which a one-time promoted discount price is offered to entice a potential buyer to make a decision promptly. To make real-time promotion more effective in pursuit of better profits, we propose two discount-giving strategies: one based on kernel density estimation and the other based on Thompson sampling. We show that, given a pre-determined discount budget, our algorithms achieve significantly better revenue than classical strategies that simply apply a fixed discount to the label price, demonstrating that the framework is promising for deployment in e-commerce services for real-time promotion.

  • Efficient Approximate Thompson Sampling for Search Query Recommendation

    The 30th ACM/SIGAPP Symposium On Applied Computing (SAC'15)

    Query suggestions have been a valuable feature for e-commerce sites in helping shoppers refine their search intent. In this paper, we develop an algorithm that helps e-commerce sites like eBay blend the output of different recommendation algorithms. Our algorithm is based on “Thompson Sampling”, a technique designed for solving multi-armed bandit problems where the best results are not known in advance but instead are tried out to gather feedback. Our approach is to treat query suggestions as a competition among data resources: we have many query suggestion candidates competing for limited space on the search results page. An “arm” is played when a query suggestion candidate is chosen for display, and our goal is to maximize the expected reward (user clicks on a suggestion). Our experiments have shown promising results in using the click-based user feedback to drive success by enhancing the quality of query suggestions.

  • A short-term bookmarking system for collecting user-interest data

    PAKDD Workshop on Big Data Science and Engineering on E-Commerce

    During the shopping process, users typically narrow down their search to a small collection of products before making a final purchase. These data, consisting of products that users are considering purchasing, correlate strongly with user search intent and product desirability. By allowing users to bookmark products between browsing and purchasing, we collect user-interest information. We then propose a product recommendation algorithm based on these data. By considering both popular and long-tail queries, we shed light on the potential usage of the data.

  • Incorporating Popularity in Topic Models for Social Network Analysis

    The 36th Annual ACM Special Interest Group on Information Retrieval (SIGIR)

    In this paper, we propose topic models to deal with social network data. Our topic models are specialized in dealing with the "popularity bias" caused by the dominance of a limited number of popular users (or nodes) in a dataset. In conventional topic models, such popular items are simply removed because they carry little meaning (e.g., "the" and "is"). However, in a social network dataset, most people are interested in popular users (e.g., Barack Obama and Britney Spears), so they should be handled carefully.

    To solve this problem, we introduce the notion of a "popularity component" and explore various ways to effectively incorporate it. Through extensive experiments, we show that our proposed models achieve significant improvements over the existing models in terms of lowering "perplexity". We also show that the outgoing edge degree (how many people a user follows) does not help much in achieving lower perplexity. Our models can be useful in providing more accurate recommendations and clusterings for various services including social network services.

  • Finding similar items by leveraging social tag clouds

    ACM Symposium on Applied Computing (SAC)

    Recently social collaboration projects such as Wikipedia and Flickr have been gaining popularity, and more and more social tag information is being accumulated. In this study, we demonstrate how to effectively use social tags created by humans to find similar items. We create a query-by-example interface for finding similar items through offering examples as a query. Our work aims to measure the similarity between a query, expressed as a group of items, and another item through utilizing the tag information. We show that using human-generated tags to find similar items has at least two major challenges: popularity bias and the missing tag effect. We propose several approaches to overcome the challenges. We build a prototype website allowing users to search over all entries in Wikipedia based on tag information, and then collect 600 valid questionnaires from 69 students to create a benchmark for evaluating our algorithms based on user satisfaction. Our results show that the presented techniques are promising and surpass the leading commercial product, Google Sets, in terms of user satisfaction.

    Other authors
    • Junghoo Cho
  • Detecting Unknown Malicious Executables Using Portable Executable Headers

    Fifth International Joint Conference on INC, IMS and IDC

    Even though numerous anti-virus software packages have been in use for many years, previously unseen malware is still a serious threat to computer and information systems. In this study, we analyze the portable executable header entries of executables and build a malware detection model consisting of four stages: attribute extraction, attribute binarization, attribute elimination, and feature selection with classifier training. First, we collected header entries from all executables in our dataset and viewed each entry as a potential attribute. Second, information gain and gain ratio were used to binarize numerical and nominal attributes. Next, useless and redundant attributes were eliminated in the third stage. Finally, using a support vector machine, a classification algorithm with strong generalization ability, feature selection was performed simultaneously with classifier training to reduce the number of attributes while retaining classifier performance in a cost-effective way. We evaluated our model on 1,908 benign programs and 7,863 malicious files (viruses, email worms, trojans, and backdoors) and estimated its generalization ability by cross-validation. The experimental results showed that our model had promising performance in detecting viruses and email worms.

  • A Virus Prevention Model Based on Static Analysis and Data Mining Methods

    The IEEE 8th International Conference on Computer and Information Technology

    Owing to the limited prevention ability of traditional anti-virus methods, a behavior-based virus prevention model for detecting unknown viruses is proposed in this study. We first defined the behaviors of an executable by observing its usage of dynamically linked libraries and Application Programming Interfaces. Then, information gain and support vector machines were applied to filter out redundant behavior attributes and select informative features for training a virus classifier. The performance of our model was evaluated on a dataset containing 1,758 benign executables and 846 viruses. The experimental results are promising: the overall accuracies are 99% and 96.66% for detecting known viruses and previously unseen viruses, respectively.

  • Experts vs The Crowd: Examining Popular News Prediction Performance on Twitter

    -

    In the finance domain, the famous Efficient Market Hypothesis (EMH) concludes that crowd wisdom is superior to any expert wisdom in picking financial stocks. In this study, we test a similar hypothesis in the domain of news recommendation by conducting experiments on Twitter. We first identify a group of experts on Twitter who have consistently identified "interesting" (or popular) news early on and have recommended it in their tweets. We then collect two sets of news: a set of incoming news recommended by these experts and a similar set recommended by the "crowd". We then observe, for a few months, how widely the news in the two sets is circulated on Twitter, and evaluate which set contains more widely circulated news (and is therefore more likely to be interesting). After conducting repeated experiments, we draw a conclusion similar to the EMH: the crowd wisdom is always the winner in our experiments; we could not identify an expert group whose news recommendation performance was consistently better than that of the crowd. We then proceed to investigate whether the expert wisdom can be used to improve crowd wisdom in any way.

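As an illustration of the bandit idea described in the entry "Efficient Approximate Thompson Sampling for Search Query Recommendation" above, the following is a minimal sketch of textbook Beta-Bernoulli Thompson Sampling. It is not the approximate variant from the paper, whose details are not given here. The class name `ThompsonSampler`, the helper `simulate`, the arm labels, and the click rates are all hypothetical and chosen only for this example.

```python
import random


class ThompsonSampler:
    """Textbook Beta-Bernoulli Thompson Sampling (illustrative, not the paper's method)."""

    def __init__(self, arms, seed=None):
        self.rng = random.Random(seed)
        # Start every arm with a uniform Beta(1, 1) prior over its click rate.
        self.successes = {arm: 1 for arm in arms}
        self.failures = {arm: 1 for arm in arms}

    def choose(self, k=1):
        """Draw one sample per arm from its posterior and return the top-k arms."""
        draws = {
            arm: self.rng.betavariate(self.successes[arm], self.failures[arm])
            for arm in self.successes
        }
        return sorted(draws, key=draws.get, reverse=True)[:k]

    def update(self, arm, clicked):
        """Record click feedback for an arm that was displayed."""
        if clicked:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1


def simulate(click_rates, rounds=5000, seed=0):
    """Simulate users clicking with the given (hypothetical) per-arm click rates."""
    sampler = ThompsonSampler(click_rates, seed=seed)
    env = random.Random(seed + 1)  # separate random stream for simulated users
    shown = {arm: 0 for arm in click_rates}
    for _ in range(rounds):
        arm = sampler.choose(k=1)[0]
        shown[arm] += 1
        sampler.update(arm, env.random() < click_rates[arm])
    return shown


if __name__ == "__main__":
    # Hypothetical query suggestions with made-up click rates.
    print(simulate({"suggestion_a": 0.02, "suggestion_b": 0.05, "suggestion_c": 0.30}))
```

In this sketch each query suggestion candidate is an arm, displaying it is a play, and a click is the reward. Over many rounds the sampler should show the suggestion with the highest click rate most often, while still occasionally trying the others.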

Courses

  • Artificial Intelligence

    CS161

  • Automatically Reasoning Theory and Application

    CS264A

  • Connectionist Language Processing

    CS263B

  • Data and Knowledge Base System

    CS240A

  • Database

    CS143

  • Distributed Database System

    CS244A

  • Intelligent Information System

    CS245A

  • Web Applications

    CS144

  • Web Information System

    CS246

Projects

Honors & Awards

  • Best Paper Runner Up

    SIGIR 2013

    Paper (with Youngchul Cha, Bin Bi, and Junghoo “John” Cho), “Incorporating popularity in topic models for social network analysis”, in Proceedings of the 36th Annual ACM Special Interest Group on Information Retrieval (SIGIR), May, 2013.

  • Conference Travel Award

    SAC'12

    Paper (with Junghoo “John” Cho), “Finding similar items by leveraging social tag clouds”, in Proceedings of the Annual ACM Symposium on Applied Computing (SAC), March 26-30, 2012. Student Travel Award.

Languages

  • English

    -

  • Chinese

    -

Recommendations received

27 people have recommended Chu-Cheng
