Showing 1–45 of 45 results for author: Ng, R

Searching in archive cs.
  1. arXiv:2408.16916 [pdf, other]

    cs.NE q-bio.NC

    A Computational Framework for Modeling Emergence of Color Vision in the Human Brain

    Authors: Atsunobu Kotani, Ren Ng

    Abstract: It is a mystery how the brain decodes color vision purely from the optic nerve signals it receives, with a core inferential challenge being how it disentangles internal perception with the correct color dimensionality from the unknown encoding properties of the eye. In this paper, we introduce a computational framework for modeling this emergence of human color vision by simulating both the eye an…

    Submitted 29 August, 2024; originally announced August 2024.

    Comments: 23 pages, 10 figures, Webpage: https://color-vision.github.io

  2. arXiv:2405.01842 [pdf, ps, other]

    cs.CL

    SGHateCheck: Functional Tests for Detecting Hate Speech in Low-Resource Languages of Singapore

    Authors: Ri Chi Ng, Nirmalendu Prakash, Ming Shan Hee, Kenny Tsu Wei Choo, Roy Ka-Wei Lee

    Abstract: To address the limitations of current hate speech detection models, we introduce SGHateCheck, a novel framework designed for the linguistic and cultural context of Singapore and Southeast Asia. It extends the functional testing approach of HateCheck and MHC, employing large language models for translation and paraphrasing into Singapore's main languages, and refining these with native ann…

    Submitted 3 May, 2024; originally announced May 2024.

  3. arXiv:2402.08855 [pdf, other]

    cs.HC cs.AI

    GhostWriter: Augmenting Collaborative Human-AI Writing Experiences Through Personalization and Agency

    Authors: Catherine Yeh, Gonzalo Ramos, Rachel Ng, Andy Huntington, Richard Banks

    Abstract: Large language models (LLMs) are becoming more prevalent and have found a ubiquitous use in providing different forms of writing assistance. However, LLM-powered writing systems can frustrate users due to their limited personalization and control, which can be exacerbated when users lack experience with prompt engineering. We see design as one way to address these challenges and introduce GhostWri…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: 29 pages, 12 figures

  4. arXiv:2402.03659 [pdf, other]

    cs.LG cs.CL q-fin.ST

    Learning to Generate Explainable Stock Predictions using Self-Reflective Large Language Models

    Authors: Kelvin J. L. Koa, Yunshan Ma, Ritchie Ng, Tat-Seng Chua

    Abstract: Explaining stock predictions is generally a difficult task for traditional non-generative deep learning models, where explanations are limited to visualizing the attention weights on important texts. Today, Large Language Models (LLMs) present a solution to this problem, given their known capabilities to generate human-readable explanations for their decision-making process. However, the task of s…

    Submitted 29 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: WWW 2024

  5. arXiv:2309.00073 [pdf, other]

    q-fin.ST cs.LG q-fin.CP

    Diffusion Variational Autoencoder for Tackling Stochasticity in Multi-Step Regression Stock Price Prediction

    Authors: Kelvin J. L. Koa, Yunshan Ma, Ritchie Ng, Tat-Seng Chua

    Abstract: Multi-step stock price prediction over a long-term horizon is crucial for forecasting its volatility, allowing financial institutions to price and hedge derivatives, and banks to quantify the risk in their trading books. Additionally, most financial regulators also require a liquidity horizon of several days for institutional investors to exit their risky assets, in order to not materially affect…

    Submitted 29 October, 2023; v1 submitted 18 August, 2023; originally announced September 2023.

    Comments: CIKM 2023

  6. arXiv:2305.14617 [pdf, other]

    cs.CL cs.AI

    COMET-M: Reasoning about Multiple Events in Complex Sentences

    Authors: Sahithya Ravi, Raymond Ng, Vered Shwartz

    Abstract: Understanding the speaker's intended meaning often involves drawing commonsense inferences to reason about what is not stated explicitly. In multi-event sentences, it requires understanding the relationships between events based on contextual knowledge. We propose COMET-M (Multi-Event), an event-centric commonsense model capable of generating commonsense inferences for a target event within a comp…

    Submitted 23 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

  7. arXiv:2305.07930 [pdf, other]

    cs.HC

    FoundWright: A System to Help People Re-find Pages from Their Web-history

    Authors: Haekyu Park, Gonzalo Ramos, Jina Suh, Christopher Meek, Rachel Ng, Mary Czerwinski

    Abstract: Re-finding information is an essential activity; however, it can be difficult when people struggle to express what they are looking for. Through a need-finding survey, we first seek opportunities for improving re-finding experiences, and explore one of these opportunities by implementing the FoundWright system. The system leverages recent advances in language transformer models to expand people's…

    Submitted 13 May, 2023; originally announced May 2023.

    Comments: 26 pages

  8. arXiv:2302.09715 [pdf, other]

    cs.CL

    What happens before and after: Multi-Event Commonsense in Event Coreference Resolution

    Authors: Sahithya Ravi, Chris Tanner, Raymond Ng, Vered Shwartz

    Abstract: Event coreference models cluster event mentions pertaining to the same real-world event. Recent models rely on contextualized representations to recognize coreference among lexically or contextually similar mentions. However, models typically fail to leverage commonsense inferences, which is particularly limiting for resolving lexically-divergent mentions. We propose a model that extends event men…

    Submitted 21 February, 2023; v1 submitted 19 February, 2023; originally announced February 2023.

    Comments: Accepted to EACL 2023

  9. arXiv:2207.11754 [pdf, other]

    cs.HC

    Virtual Reality Therapy for the Psychological Well-being of Palliative Care Patients in Hong Kong

    Authors: Daniel Eckhoff, Royce Ng, Alvaro Cassinelli

    Abstract: In this paper we introduce novel Virtual Reality (VR) and Augmented Reality (AR) treatments to improve the psychological well being of patients in palliative care, based on interviews with a clinical psychologist who has successfully implemented VR assisted interventions on palliative care patients in the Hong Kong hospital system. Our VR and AR assisted interventions are adaptations of traditiona…

    Submitted 24 July, 2022; originally announced July 2022.

  10. A cross-corpus study on speech emotion recognition

    Authors: Rosanna Milner, Md Asif Jalal, Raymond W. M. Ng, Thomas Hain

    Abstract: For speech emotion datasets, it has been difficult to acquire large quantities of reliable data and acted emotions may be over the top compared to less expressive emotions displayed in everyday life. Lately, larger datasets with natural emotions have been created. Instead of ignoring smaller, acted datasets, this study investigates whether information learnt from acted emotions is useful for detec…

    Submitted 5 July, 2022; originally announced July 2022.

    Comments: ASRU 2019

    Journal ref: IEEE Workshop on Automatic Speech Recognition and Understanding 2019

  11. arXiv:2206.06448 [pdf]

    eess.IV cs.CR cs.CV cs.LG

    Assessing Privacy Leakage in Synthetic 3-D PET Imaging using Transversal GAN

    Authors: Robert V. Bergen, Jean-Francois Rajotte, Fereshteh Yousefirizi, Arman Rahmim, Raymond T. Ng

    Abstract: Training computer-vision related algorithms on medical images for disease diagnosis or image segmentation is difficult in large part due to privacy concerns. For this reason, generative image models are highly sought after to facilitate data sharing. However, 3-D generative models are understudied, and investigation of their privacy leakage is needed. We introduce our 3-D generative model, Transve…

    Submitted 31 October, 2023; v1 submitted 13 June, 2022; originally announced June 2022.

    Comments: arXiv admin note: text overlap with arXiv:2111.01866

  12. arXiv:2205.13741 [pdf, other]

    cs.LG

    Generating multivariate time series with COmmon Source CoordInated GAN (COSCI-GAN)

    Authors: Ali Seyfi, Jean-Francois Rajotte, Raymond T. Ng

    Abstract: Generating multivariate time series is a promising approach for sharing sensitive data in many medical, financial, and IoT applications. A common type of multivariate time series originates from a single source such as the biometric measurements from a medical patient. This leads to complex dynamical patterns between individual time series that are hard to learn by typical generation models such a…

    Submitted 14 December, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: 19 pages, 16 figures

  13. arXiv:2204.03715 [pdf, other]

    cs.CV astro-ph.IM

    Gravitationally Lensed Black Hole Emission Tomography

    Authors: Aviad Levis, Pratul P. Srinivasan, Andrew A. Chael, Ren Ng, Katherine L. Bouman

    Abstract: Measurements from the Event Horizon Telescope enabled the visualization of light emission around a black hole for the first time. So far, these measurements have been used to recover a 2D image under the assumption that the emission field is static over the period of acquisition. In this work, we propose BH-NeRF, a novel tomography approach that leverages gravitational lensing to recover the conti…

    Submitted 7 April, 2022; originally announced April 2022.

    Comments: To appear in the IEEE Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), 2022. Supplemental material including accompanying pdf, code, and video highlight can be found in the project page: http://imaging.cms.caltech.edu/bhnerf/

  14. arXiv:2111.01866 [pdf]

    eess.IV cs.CV cs.LG physics.med-ph

    3-D PET Image Generation with tumour masks using TGAN

    Authors: Robert V Bergen, Jean-Francois Rajotte, Fereshteh Yousefirizi, Ivan S Klyuzhin, Arman Rahmim, Raymond T. Ng

    Abstract: Training computer-vision related algorithms on medical images for disease diagnosis or image segmentation is difficult due to the lack of training data, labeled samples, and privacy concerns. For this reason, a robust generative method to create synthetic data is highly sought after. However, most three-dimensional image generators require additional image input or are extremely memory intensive.…

    Submitted 2 November, 2021; originally announced November 2021.

  15. arXiv:2103.14024 [pdf, other]

    cs.CV cs.GR

    PlenOctrees for Real-time Rendering of Neural Radiance Fields

    Authors: Alex Yu, Ruilong Li, Matthew Tancik, Hao Li, Ren Ng, Angjoo Kanazawa

    Abstract: We introduce a method to render Neural Radiance Fields (NeRFs) in real time using PlenOctrees, an octree-based 3D representation which supports view-dependent effects. Our method can render 800x800 images at more than 150 FPS, which is over 3000 times faster than conventional NeRFs. We do so without sacrificing quality while preserving the ability of NeRFs to perform free-viewpoint rendering of sc…

    Submitted 17 August, 2021; v1 submitted 25 March, 2021; originally announced March 2021.

    Comments: ICCV 2021 (Oral)

  16. arXiv:2101.07235 [pdf, other]

    stat.ML cs.AI cs.CV cs.DC cs.LG

    Reducing bias and increasing utility by federated generative modeling of medical images using a centralized adversary

    Authors: Jean-Francois Rajotte, Sumit Mukherjee, Caleb Robinson, Anthony Ortiz, Christopher West, Juan Lavista Ferres, Raymond T Ng

    Abstract: We introduce FELICIA (FEderated LearnIng with a CentralIzed Adversary) a generative mechanism enabling collaborative learning. In particular, we show how a data owner with limited and biased data could benefit from other data owners while keeping data from all the sources private. This is a common scenario in medical image analysis where privacy legislation prevents data from being shared outside…

    Submitted 28 August, 2021; v1 submitted 18 January, 2021; originally announced January 2021.

    Comments: 10 pages, 10 figures

    MSC Class: 68W15; ACM Class: I.2.11

  17. arXiv:2012.02189 [pdf, other]

    cs.CV

    Learned Initializations for Optimizing Coordinate-Based Neural Representations

    Authors: Matthew Tancik, Ben Mildenhall, Terrance Wang, Divi Schmidt, Pratul P. Srinivasan, Jonathan T. Barron, Ren Ng

    Abstract: Coordinate-based neural representations have shown significant promise as an alternative to discrete, array-based representations for complex low dimensional signals. However, optimizing a coordinate-based network from randomly initialized weights for each new signal is inefficient. We propose applying standard meta-learning algorithms to learn the initial weight parameters for these fully-connect…

    Submitted 23 March, 2021; v1 submitted 3 December, 2020; originally announced December 2020.

    Comments: Project page: https://www.matthewtancik.com/learnit

  18. arXiv:2010.07017 [pdf]

    cs.CY cs.CL stat.OT

    Computational Skills by Stealth in Secondary School Data Science

    Authors: Wesley Burr, Fanny Chevalier, Christopher Collins, Alison L Gibbs, Raymond Ng, Chris Wild

    Abstract: The unprecedented growth in the availability of data of all types and qualities and the emergence of the field of data science has provided an impetus to finally realizing the implementation of the full breadth of the Nolan and Temple Lang proposed integration of computing concepts into statistics curricula at all levels in statistics and new data science programs and courses. Moreover, data scien…

    Submitted 8 October, 2020; originally announced October 2020.

    Comments: 38 pages, 8 figures

  19. arXiv:2010.05382 [pdf]

    eess.IV cs.CV physics.optics

    Miniscope3D: optimized single-shot miniature 3D fluorescence microscopy

    Authors: Kyrollos Yanny, Nick Antipa, William Liberti, Sam Dehaeck, Kristina Monakhova, Fanglin Linda Liu, Konlin Shen, Ren Ng, Laura Waller

    Abstract: Miniature fluorescence microscopes are a standard tool in systems biology. However, widefield miniature microscopes capture only 2D information, and modifications that enable 3D capabilities increase the size and weight and have poor resolution outside a narrow depth range. Here, we achieve the 3D capability by replacing the tube lens of a conventional 2D Miniscope with an optimized multifocal pha…

    Submitted 11 October, 2020; originally announced October 2020.

    Comments: Published with Nature Springer in Light: Science and Applications

    Journal ref: Light: Science & Applications 9.1 (2020): 1-13

  20. arXiv:2009.11362 [pdf, other]

    cs.CV cs.LG

    Dense Forecasting of Wildfire Smoke Particulate Matter Using Sparsity Invariant Convolutional Neural Networks

    Authors: Renhao Wang, Ashutosh Bhudia, Brandon Dos Remedios, Minnie Teng, Raymond Ng

    Abstract: Accurate forecasts of fine particulate matter (PM 2.5) from wildfire smoke are crucial to safeguarding cardiopulmonary public health. Existing forecasting systems are trained on sparse and inaccurate ground truths, and do not take sufficient advantage of important spatial inductive biases. In this work, we present a convolutional neural network which preserves sparsity invariance throughout, and l…

    Submitted 23 September, 2020; originally announced September 2020.

    Comments: Submitted to the 2020 NeurIPS Workshop on Machine learning in Public Health

  21. arXiv:2009.06764 [pdf, other]

    stat.ML cs.CR cs.LG

    Private data sharing between decentralized users through the privGAN architecture

    Authors: Jean-Francois Rajotte, Raymond T Ng

    Abstract: More data is almost always beneficial for analysis and machine learning tasks. In many realistic situations however, an enterprise cannot share its data, either to keep a competitive advantage or to protect the privacy of the data sources, the enterprise's clients for example. We propose a method for data owners to share synthetic or fake versions of their data without sharing the actual data, nor…

    Submitted 14 September, 2020; originally announced September 2020.

    Comments: 6 pages, 9 figures, to be in the proceedings of International Workshop on Privacy and Security in Enterprise Modeling (PriSEM'20)

  22. arXiv:2006.10739 [pdf, other]

    cs.CV cs.LG

    Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains

    Authors: Matthew Tancik, Pratul P. Srinivasan, Ben Mildenhall, Sara Fridovich-Keil, Nithin Raghavan, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan T. Barron, Ren Ng

    Abstract: We show that passing input points through a simple Fourier feature mapping enables a multilayer perceptron (MLP) to learn high-frequency functions in low-dimensional problem domains. These results shed light on recent advances in computer vision and graphics that achieve state-of-the-art results by using MLPs to represent complex 3D objects and scenes. Using tools from the neural tangent kernel (N…

    Submitted 18 June, 2020; originally announced June 2020.

    Comments: Project page: https://people.eecs.berkeley.edu/~bmild/fourfeat/

  23. arXiv:2005.08925 [pdf, other]

    cs.CV cs.GR

    Portrait Shadow Manipulation

    Authors: Xuaner Cecilia Zhang, Jonathan T. Barron, Yun-Ta Tsai, Rohit Pandey, Xiuming Zhang, Ren Ng, David E. Jacobs

    Abstract: Casually-taken portrait photographs often suffer from unflattering lighting and shadowing because of suboptimal conditions in the environment. Aesthetic qualities such as the position and softness of shadows and the lighting ratio between the bright and dark parts of the face are frequently determined by the constraints of the environment rather than by the photographer. Professionals address this…

    Submitted 20 May, 2020; v1 submitted 18 May, 2020; originally announced May 2020.

    Comments: (updated version); SIGGRAPH 2020; Project webpage: https://people.eecs.berkeley.edu/~cecilia77/project-pages/portrait Video: https://youtu.be/M_qYTXhzyac

  24. arXiv:2005.01322 [pdf, other]

    cs.HC

    Building Proactive Voice Assistants: When and How (not) to Interact

    Authors: O. Miksik, I. Munasinghe, J. Asensio-Cubero, S. Reddy Bethi, S-T. Huang, S. Zylfo, X. Liu, T. Nica, A. Mitrocsak, S. Mezza, R. Beard, R. Shi, R. Ng, P. Mediano, Z. Fountas, S-H. Lee, J. Medvesek, H. Zhuang, Y. Rogers, P. Swietojanski

    Abstract: Voice assistants have recently achieved remarkable commercial success. However, the current generation of these devices is typically capable of only reactive interactions. In other words, interactions have to be initiated by the user, which somewhat limits their usability and user experience. We propose that the next generation of such devices should be able to proactively provide the right infor…

    Submitted 4 May, 2020; originally announced May 2020.

    Comments: 17 pages, technical report

  25. arXiv:2003.08934 [pdf, other]

    cs.CV cs.GR

    NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

    Authors: Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng

    Abstract: We present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views. Our algorithm represents a scene using a fully-connected (non-convolutional) deep network, whose input is a single continuous 5D coordinate (spatial location $(x,y,z)$ and viewing direction…

    Submitted 3 August, 2020; v1 submitted 19 March, 2020; originally announced March 2020.

    Comments: ECCV 2020 (oral). Project page with videos and code: http://tancik.com/nerf

  26. arXiv:1905.13221 [pdf, other]

    eess.IV cs.CV eess.SP

    Video from Stills: Lensless Imaging with Rolling Shutter

    Authors: Nick Antipa, Patrick Oare, Emrah Bostan, Ren Ng, Laura Waller

    Abstract: Because image sensor chips have a finite bandwidth with which to read out pixels, recording video typically requires a trade-off between frame rate and pixel count. Compressed sensing techniques can circumvent this trade-off by assuming that the image is compressible. Here, we propose using multiplexing optics to spatially compress the scene, enabling information about the whole scene to be sample…

    Submitted 30 May, 2019; originally announced May 2019.

    Comments: 8 pages, 7 figures, IEEE International Conference on Computational Photography 2019, Tokyo

  27. arXiv:1905.06326 [pdf, other]

    cs.CV cs.GR

    Synthetic Defocus and Look-Ahead Autofocus for Casual Videography

    Authors: Xuaner Zhang, Kevin Matzen, Vivien Nguyen, Dillon Yao, You Zhang, Ren Ng

    Abstract: In cinema, large camera lenses create beautiful shallow depth of field (DOF), but make focusing difficult and expensive. Accurate cinema focus usually relies on a script and a person to control focus in realtime. Casual videographers often crave cinematic focus, but fail to achieve it. We either sacrifice shallow DOF, as in smartphone videos; or we struggle to deliver accurate focus, as in videos…

    Submitted 21 May, 2019; v1 submitted 15 May, 2019; originally announced May 2019.

    Comments: (V2 author name corrected) SIGGRAPH 2019; project website: https://ceciliavision.github.io/vid-auto-focus/

  28. arXiv:1905.05169 [pdf, other]

    cs.CV eess.IV

    Zoom To Learn, Learn To Zoom

    Authors: Xuaner Cecilia Zhang, Qifeng Chen, Ren Ng, Vladlen Koltun

    Abstract: This paper shows that when applying machine learning to digital zoom for photography, it is beneficial to use real, RAW sensor data for training. Existing learning-based super-resolution methods do not use real sensor data, instead operating on RGB images. In practice, these approaches result in loss of detail and accuracy in their digitally zoomed output when zooming in on distant image regions.…

    Submitted 13 May, 2019; originally announced May 2019.

    Comments: CVPR 2019, https://ceciliavision.github.io/project-pages/project-zoom.html (paper, video, supp, code, dataset)

  29. arXiv:1905.00889 [pdf, other]

    cs.CV cs.GR

    Local Light Field Fusion: Practical View Synthesis with Prescriptive Sampling Guidelines

    Authors: Ben Mildenhall, Pratul P. Srinivasan, Rodrigo Ortiz-Cayon, Nima Khademi Kalantari, Ravi Ramamoorthi, Ren Ng, Abhishek Kar

    Abstract: We present a practical and robust deep learning solution for capturing and rendering novel views of complex real world scenes for virtual exploration. Previous approaches either require intractably dense view sampling or provide little to no guidance for how users should sample views of a scene to reliably render high-quality novel views. Instead, we propose an algorithm for view synthesis from an…

    Submitted 2 May, 2019; originally announced May 2019.

    Comments: SIGGRAPH 2019. Project page with video and code: http://people.eecs.berkeley.edu/~bmild/llff/

  30. arXiv:1905.00413 [pdf, other]

    cs.CV

    Pushing the Boundaries of View Extrapolation with Multiplane Images

    Authors: Pratul P. Srinivasan, Richard Tucker, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng, Noah Snavely

    Abstract: We explore the problem of view synthesis from a narrow baseline pair of images, and focus on generating high-quality view extrapolations with plausible disocclusions. Our method builds upon prior work in predicting a multiplane image (MPI), which represents scene content as a set of RGB$α$ planes within a reference view frustum and renders novel views by projecting this content into the target vie…

    Submitted 1 May, 2019; originally announced May 2019.

    Comments: Oral presentation at CVPR 2019

  31. arXiv:1904.05343 [pdf, other]

    cs.CV

    StegaStamp: Invisible Hyperlinks in Physical Photographs

    Authors: Matthew Tancik, Ben Mildenhall, Ren Ng

    Abstract: Printed and digitally displayed photos have the ability to hide imperceptible digital data that can be accessed through internet-connected imaging systems. Another way to think about this is physical photographs that have unique QR codes invisibly embedded within them. This paper presents an architecture, algorithms, and a prototype implementation addressing this vision. Our key technical contribu…

    Submitted 25 March, 2020; v1 submitted 10 April, 2019; originally announced April 2019.

    Comments: CVPR 2020, Project page: http://www.matthewtancik.com/stegastamp

  32. arXiv:1806.05376 [pdf, other]

    cs.CV

    Single Image Reflection Separation with Perceptual Losses

    Authors: Xuaner Zhang, Ren Ng, Qifeng Chen

    Abstract: We present an approach to separating reflection from a single image. The approach uses a fully convolutional network trained end-to-end with losses that exploit low-level and high-level image information. Our loss function includes two perceptual losses: a feature loss from a visual perception network, and an adversarial loss that encodes characteristics of images in the transmission layers. We al…

    Submitted 14 June, 2018; originally announced June 2018.

    Comments: 9 pages, 8 figures, CVPR 2018

  33. arXiv:1712.02327 [pdf, other]

    cs.CV

    Burst Denoising with Kernel Prediction Networks

    Authors: Ben Mildenhall, Jonathan T. Barron, Jiawen Chen, Dillon Sharlet, Ren Ng, Robert Carroll

    Abstract: We present a technique for jointly denoising bursts of images taken from a handheld camera. In particular, we propose a convolutional neural network architecture for predicting spatially varying kernels that can both align and denoise frames, a synthetic data generation approach based on a realistic noise formation model, and an optimization guided by an annealed loss function to avoid undesirable…

    Submitted 29 March, 2018; v1 submitted 6 December, 2017; originally announced December 2017.

    Comments: To appear in CVPR 2018 (spotlight). Project page: http://people.eecs.berkeley.edu/~bmild/kpn/

  34. arXiv:1711.07933 [pdf, other]

    cs.CV

    Aperture Supervision for Monocular Depth Estimation

    Authors: Pratul P. Srinivasan, Rahul Garg, Neal Wadhwa, Ren Ng, Jonathan T. Barron

    Abstract: We present a novel method to train machine learning algorithms to estimate scene depths from a single image, by using the information provided by a camera's aperture as supervision. Prior works use a depth sensor's outputs or images of the same scene from alternate viewpoints as supervision, while our method instead uses images from the same viewpoint taken with a varying camera aperture. To enabl…

    Submitted 29 March, 2018; v1 submitted 21 November, 2017; originally announced November 2017.

    Comments: To appear at CVPR 2018 (updated to camera ready version)

  35. arXiv:1710.02134 [pdf, other]

    cs.CV

    DiffuserCam: Lensless Single-exposure 3D Imaging

    Authors: Nick Antipa, Grace Kuo, Reinhard Heckel, Ben Mildenhall, Emrah Bostan, Ren Ng, Laura Waller

    Abstract: We demonstrate a compact and easy-to-build computational camera for single-shot 3D imaging. Our lensless system consists solely of a diffuser placed in front of a standard image sensor. Every point within the volumetric field-of-view projects a unique pseudorandom pattern of caustics on the sensor. By using a physical approximation and simple calibration scheme, we solve the large-scale inverse pr…

    Submitted 5 October, 2017; originally announced October 2017.

    Comments: The first two authors contributed equally

  36. arXiv:1708.03292 [pdf, other]

    cs.CV cs.GR

    Learning to Synthesize a 4D RGBD Light Field from a Single Image

    Authors: Pratul P. Srinivasan, Tongzhou Wang, Ashwin Sreelal, Ravi Ramamoorthi, Ren Ng

    Abstract: We present a machine learning algorithm that takes as input a 2D RGB image and synthesizes a 4D RGBD light field (color and depth of the scene in each ray direction). For training, we introduce the largest public light field dataset, consisting of over 3300 plenoptic camera light fields of scenes containing flowers and plants. Our synthesis pipeline consists of a convolutional neural network (CNN)…

    Submitted 10 August, 2017; originally announced August 2017.

    Comments: International Conference on Computer Vision (ICCV) 2017

  37. arXiv:1704.05416 [pdf, other]

    cs.CV

    Light Field Blind Motion Deblurring

    Authors: Pratul P. Srinivasan, Ren Ng, Ravi Ramamoorthi

    Abstract: We study the problem of deblurring light fields of general 3D scenes captured under 3D camera motion and present both theoretical and practical contributions. By analyzing the motion-blurred light field in the primal and Fourier domains, we develop intuition into the effects of camera motion on the light field, show the advantages of capturing a 4D light field instead of a conventional 2D image fo…

    Submitted 18 April, 2017; originally announced April 2017.

    Comments: To be presented at CVPR 2017

  38. arXiv:1606.03333 [pdf, other]

    cs.MM cs.CL cs.IR

    Automatic Genre and Show Identification of Broadcast Media

    Authors: Mortaza Doulaty, Oscar Saz, Raymond W. M. Ng, Thomas Hain

    Abstract: Huge amounts of digital videos are being produced and broadcast every day, leading to giant media archives. Effective techniques are needed to make such data accessible further. Automatic meta-data labelling of broadcast media is an essential task for multimedia indexing, where it is standard to use multi-modal input for such purposes. This paper describes a novel method for automatic detection of…

    Submitted 10 June, 2016; originally announced June 2016.

    Comments: Proc. of 17th Interspeech (2016), San Francisco, California, USA

  39. The 2015 Sheffield System for Transcription of Multi-Genre Broadcast Media

    Authors: Oscar Saz, Mortaza Doulaty, Salil Deena, Rosanna Milner, Raymond W. M. Ng, Madina Hasan, Yulan Liu, Thomas Hain

    Abstract: We describe the University of Sheffield system for participation in the 2015 Multi-Genre Broadcast (MGB) challenge task of transcribing multi-genre broadcast shows. Transcription was one of four tasks proposed in the MGB challenge, with the aim of advancing the state of the art of automatic speech recognition, speaker diarisation and automatic alignment of subtitles for broadcast media. Four topic…

    Submitted 21 December, 2015; originally announced December 2015.

    Comments: IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2015), 13-17 Dec 2015, Scottsdale, Arizona, USA

  40. Latent Dirichlet Allocation Based Organisation of Broadcast Media Archives for Deep Neural Network Adaptation

    Authors: Mortaza Doulaty, Oscar Saz, Raymond W. M. Ng, Thomas Hain

    Abstract: This paper presents a new method for the discovery of latent domains in diverse speech data, for the use of adaptation of Deep Neural Networks (DNNs) for Automatic Speech Recognition. Our work focuses on transcription of multi-genre broadcast media, which is often only categorised broadly in terms of high level genres such as sports, news, documentary, etc. However, in terms of acoustic modelling…

    Submitted 16 November, 2015; originally announced November 2015.

    Comments: IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2015), 13-17 Dec 2015, Scottsdale, Arizona, USA

  41. arXiv:1509.03870  [pdf, other]

    cs.CL

    The USFD Spoken Language Translation System for IWSLT 2014

    Authors: Raymond W. M. Ng, Mortaza Doulaty, Rama Doddipatla, Wilker Aziz, Kashif Shah, Oscar Saz, Madina Hasan, Ghada AlHarbi, Lucia Specia, Thomas Hain

    Abstract: The University of Sheffield (USFD) participated in the International Workshop for Spoken Language Translation (IWSLT) in 2014. In this paper, we will introduce the USFD SLT system for IWSLT. Automatic speech recognition (ASR) is achieved by two multi-pass deep neural network systems with adaptation and rescoring techniques. Machine translation (MT) is achieved by a phrase-based system. The USFD pr…

    Submitted 13 September, 2015; originally announced September 2015.

    Journal ref: Proc. of 11th International Workshop on Spoken Language Translation (SLT 2014) 86-91, Lake Tahoe, USA, December 4th and 5th, 2014

  42. Topic Segmentation and Labeling in Asynchronous Conversations

    Authors: Shafiq Rayhan Joty, Giuseppe Carenini, Raymond T Ng

    Abstract: Topic segmentation and labeling is often considered a prerequisite for higher-level conversation analysis and has been shown to be useful in many Natural Language Processing (NLP) applications. We present two new corpora of email and blog conversations annotated with topics, and evaluate annotator reliability for the segmentation and labeling tasks in these asynchronous conversations. We propose a…

    Submitted 3 February, 2014; originally announced February 2014.

    Journal ref: Journal of Artificial Intelligence Research, Volume 47, pages 521-573, 2013

  43. arXiv:1303.5735  [pdf]

    cs.AI

    Non-monotonic Negation in Probabilistic Deductive Databases

    Authors: Raymond T. Ng, V. S. Subrahmanian

    Abstract: In this paper we study the uses and the semantics of non-monotonic negation in probabilistic deductive databases. Based on the stable semantics for classical logic programming, we introduce the notion of stable formula functions. We show that stable formula functions are minimal fixpoints of operators associated with probabilistic deductive databases with negation. Furthermore, since a prob…

    Submitted 20 March, 2013; originally announced March 2013.

    Comments: Appears in Proceedings of the Seventh Conference on Uncertainty in Artificial Intelligence (UAI1991)

    Report number: UAI-P-1991-PG-249-256

  44. arXiv:1303.5420  [pdf]

    cs.AI cs.DB

    Empirical Probabilities in Monadic Deductive Databases

    Authors: Raymond T. Ng, V. S. Subrahmanian

    Abstract: We address the problem of supporting empirical probabilities in monadic logic databases. Though the semantics of multivalued logic programs has been studied extensively, the treatment of probabilities as results of statistical findings has not been studied in logic programming/deductive databases. We develop a model-theoretic characterization of logic databases that facilitates such a treatment.…

    Submitted 13 March, 2013; originally announced March 2013.

    Comments: Appears in Proceedings of the Eighth Conference on Uncertainty in Artificial Intelligence (UAI1992)

    Report number: UAI-P-1992-PG-215-222

  45. arXiv:1104.3212  [pdf]

    cs.DB cs.DS

    Similarity Join Size Estimation using Locality Sensitive Hashing

    Authors: Hongrae Lee, Raymond T. Ng, Kyuseok Shim

    Abstract: Similarity joins are important operations with a broad range of applications. In this paper, we study the problem of vector similarity join size estimation (VSJ). It is a generalization of the previously studied set similarity join size estimation (SSJ) problem and can handle more interesting cases such as TF-IDF vectors. One of the key challenges in similarity join size estimation is that the joi…

    Submitted 16 April, 2011; originally announced April 2011.

    Comments: VLDB2011

    Journal ref: Proceedings of the VLDB Endowment (PVLDB), Vol. 4, No. 6, pp. 338-349 (2011)