-
Location-based Radiology Report-Guided Semi-supervised Learning for Prostate Cancer Detection
Authors:
Alex Chen,
Nathan Lay,
Stephanie Harmon,
Kutsev Ozyoruk,
Enis Yilmaz,
Brad J. Wood,
Peter A. Pinto,
Peter L. Choyke,
Baris Turkbey
Abstract:
Prostate cancer is one of the most prevalent malignancies in the world. While deep learning has potential to further improve computer-aided prostate cancer detection on MRI, its efficacy hinges on the exhaustive curation of manually annotated images. We propose a novel methodology of semisupervised learning (SSL) guided by automatically extracted clinical information, specifically the lesion locat…
▽ More
Prostate cancer is one of the most prevalent malignancies in the world. While deep learning has potential to further improve computer-aided prostate cancer detection on MRI, its efficacy hinges on the exhaustive curation of manually annotated images. We propose a novel methodology of semisupervised learning (SSL) guided by automatically extracted clinical information, specifically the lesion locations in radiology reports, allowing for use of unannotated images to reduce the annotation burden. By leveraging lesion locations, we refined pseudo labels, which were then used to train our location-based SSL model. We show that our SSL method can improve prostate lesion detection by utilizing unannotated images, with more substantial impacts being observed when larger proportions of unannotated images are used.
△ Less
Submitted 17 June, 2024;
originally announced June 2024.
-
Quantum computing with Qiskit
Authors:
Ali Javadi-Abhari,
Matthew Treinish,
Kevin Krsulich,
Christopher J. Wood,
Jake Lishman,
Julien Gacon,
Simon Martiel,
Paul D. Nation,
Lev S. Bishop,
Andrew W. Cross,
Blake R. Johnson,
Jay M. Gambetta
Abstract:
We describe Qiskit, a software development kit for quantum information science. We discuss the key design decisions that have shaped its development, and examine the software architecture and its core components. We demonstrate an end-to-end workflow for solving a problem in condensed matter physics on a quantum computer that serves to highlight some of Qiskit's capabilities, for example the repre…
▽ More
We describe Qiskit, a software development kit for quantum information science. We discuss the key design decisions that have shaped its development, and examine the software architecture and its core components. We demonstrate an end-to-end workflow for solving a problem in condensed matter physics on a quantum computer that serves to highlight some of Qiskit's capabilities, for example the representation and optimization of circuits at various abstraction levels, its scalability and retargetability to new gates, and the use of quantum-classical computations via dynamic circuits. Lastly, we discuss some of the ecosystem of tools and plugins that extend Qiskit for various tasks, and the future ahead.
△ Less
Submitted 18 June, 2024; v1 submitted 14 May, 2024;
originally announced May 2024.
-
Semi-Supervised Weed Detection for Rapid Deployment and Enhanced Efficiency
Authors:
Alzayat Saleh,
Alex Olsen,
Jake Wood,
Bronson Philippa,
Mostafa Rahimi Azghadi
Abstract:
Weeds present a significant challenge in agriculture, causing yield loss and requiring expensive control measures. Automatic weed detection using computer vision and deep learning offers a promising solution. However, conventional deep learning methods often require large amounts of labelled training data, which can be costly and time-consuming to acquire. This paper introduces a novel method for…
▽ More
Weeds present a significant challenge in agriculture, causing yield loss and requiring expensive control measures. Automatic weed detection using computer vision and deep learning offers a promising solution. However, conventional deep learning methods often require large amounts of labelled training data, which can be costly and time-consuming to acquire. This paper introduces a novel method for semi-supervised weed detection, comprising two main components. Firstly, a multi-scale feature representation technique is employed to capture distinctive weed features across different scales. Secondly, we propose an adaptive pseudo-label assignment strategy, leveraging a small set of labelled images during training. This strategy dynamically assigns confidence scores to pseudo-labels generated from unlabeled data. Additionally, our approach integrates epoch-corresponding and mixed pseudo-labels to further enhance the learning process. Experimental results on the COCO dataset and five prominent weed datasets -- CottonWeedDet12, CropAndWeed, Palmer amaranth, RadishWheat, and RoboWeedMap -- illustrate that our method achieves state-of-the-art performance in weed detection, even with significantly less labelled data compared to existing techniques. This approach holds the potential to alleviate the labelling burden and enhance the feasibility and deployment speed of deep learning for weed detection in real-world agricultural scenarios.
△ Less
Submitted 12 May, 2024;
originally announced May 2024.
-
AllTheDocks road safety dataset: A cyclist's perspective and experience
Authors:
Chia-Yen Chiang,
Ruikang Zhong,
Jennifer Ding,
Joseph Wood,
Stephen Bee,
Mona Jaber
Abstract:
Active travel is an essential component in intelligent transportation systems. Cycling, as a form of active travel, shares the road space with motorised traffic which often affects the cyclists' safety and comfort and therefore peoples' propensity to uptake cycling instead of driving. This paper presents a unique dataset, collected by cyclists across London, that includes video footage, accelerome…
▽ More
Active travel is an essential component in intelligent transportation systems. Cycling, as a form of active travel, shares the road space with motorised traffic which often affects the cyclists' safety and comfort and therefore peoples' propensity to uptake cycling instead of driving. This paper presents a unique dataset, collected by cyclists across London, that includes video footage, accelerometer, GPS, and gyroscope data. The dataset is then labelled by an independent group of London cyclists to rank the safety level of each frame and to identify objects in the cyclist's field of vision that might affect their experience. Furthermore, in this dataset, the quality of the road is measured by the international roughness index of the surface, which indicates the comfort of cycling on the road. The dataset will be made available for open access in the hope of motivating more research in this area to underpin the requirements for cyclists' safety and comfort and encourage more people to replace vehicle travel with cycling.
△ Less
Submitted 16 April, 2024;
originally announced April 2024.
-
Weights with Maximal Symmetry and Failures of the MacWilliams Identities
Authors:
Jay A. Wood
Abstract:
This paper examines the $w$-weight enumerators of weights $w$ with maximal symmetry over finite chain rings and matrix rings over finite fields. In many cases, including the homogeneous weight, the MacWilliams identities for $w$-weight enumerators fail because there exist two linear codes with the same $w$-weight enumerator whose dual codes have different $w$-weight enumerators.
This paper examines the $w$-weight enumerators of weights $w$ with maximal symmetry over finite chain rings and matrix rings over finite fields. In many cases, including the homogeneous weight, the MacWilliams identities for $w$-weight enumerators fail because there exist two linear codes with the same $w$-weight enumerator whose dual codes have different $w$-weight enumerators.
△ Less
Submitted 10 April, 2024;
originally announced April 2024.
-
ShadowRemovalNet: Efficient Real-Time Shadow Removal
Authors:
Alzayat Saleh,
Alex Olsen,
Jake Wood,
Bronson Philippa,
Mostafa Rahimi Azghadi
Abstract:
Shadows significantly impact computer vision tasks, particularly in outdoor environments. State-of-the-art shadow removal methods are typically too computationally intensive for real-time image processing on edge hardware. We propose ShadowRemovalNet, a novel method designed for real-time image processing on resource-constrained hardware. ShadowRemovalNet achieves significantly higher frame rates…
▽ More
Shadows significantly impact computer vision tasks, particularly in outdoor environments. State-of-the-art shadow removal methods are typically too computationally intensive for real-time image processing on edge hardware. We propose ShadowRemovalNet, a novel method designed for real-time image processing on resource-constrained hardware. ShadowRemovalNet achieves significantly higher frame rates compared to existing methods, making it suitable for real-time computer vision pipelines like those used in field robotics. Beyond speed, ShadowRemovalNet offers advantages in efficiency and simplicity, as it does not require a separate shadow mask during inference. ShadowRemovalNet also addresses challenges associated with Generative Adversarial Networks (GANs) for shadow removal, including artefacts, inaccurate mask estimations, and inconsistent supervision between shadow and boundary pixels. To address these limitations, we introduce a novel loss function that substantially reduces shadow removal errors. ShadowRemovalNet's efficiency and straightforwardness make it a robust and effective solution for real-time shadow removal in outdoor robotics and edge computing applications.
△ Less
Submitted 12 March, 2024;
originally announced March 2024.
-
Feature-Action Design Patterns for Storytelling Visualizations with Time Series Data
Authors:
Saiful Khan,
Scott Jones,
Benjamin Bach,
Jaehoon Cha,
Min Chen,
Julie Meikle,
Jonathan C Roberts,
Jeyan Thiyagalingam,
Jo Wood,
Panagiotis D. Ritsos
Abstract:
We present a method to create storytelling visualization with time series data. Many personal decisions nowadays rely on access to dynamic data regularly, as we have seen during the COVID-19 pandemic. It is thus desirable to construct storytelling visualization for dynamic data that is selected by an individual for a specific context. Because of the need to tell data-dependent stories, predefined…
▽ More
We present a method to create storytelling visualization with time series data. Many personal decisions nowadays rely on access to dynamic data regularly, as we have seen during the COVID-19 pandemic. It is thus desirable to construct storytelling visualization for dynamic data that is selected by an individual for a specific context. Because of the need to tell data-dependent stories, predefined storyboards based on known data cannot accommodate dynamic data easily nor scale up to many different individuals and contexts. Motivated initially by the need to communicate time series data during the COVID-19 pandemic, we developed a novel computer-assisted method for meta-authoring of stories, which enables the design of storyboards that include feature-action patterns in anticipation of potential features that may appear in dynamically arrived or selected data. In addition to meta-storyboards involving COVID-19 data, we also present storyboards for telling stories about progress in a machine learning workflow. Our approach is complementary to traditional methods for authoring storytelling visualization, and provides an efficient means to construct data-dependent storyboards for different data-streams of similar contexts.
△ Less
Submitted 5 February, 2024;
originally announced February 2024.
-
Precise Robotic Weed Spot-Spraying for Reduced Herbicide Usage and Improved Environmental Outcomes -- A Real-World Case Study
Authors:
Mostafa Rahimi Azghadi,
Alex Olsen,
Jake Wood,
Alzayat Saleh,
Brendan Calvert,
Terry Granshaw,
Emilie Fillols,
Bronson Philippa
Abstract:
Precise robotic weed control plays an essential role in precision agriculture. It can help significantly reduce the environmental impact of herbicides while reducing weed management costs for farmers. In this paper, we demonstrate that a custom-designed robotic spot spraying tool based on computer vision and deep learning can significantly reduce herbicide usage on sugarcane farms. We present resu…
▽ More
Precise robotic weed control plays an essential role in precision agriculture. It can help significantly reduce the environmental impact of herbicides while reducing weed management costs for farmers. In this paper, we demonstrate that a custom-designed robotic spot spraying tool based on computer vision and deep learning can significantly reduce herbicide usage on sugarcane farms. We present results from field trials that compare robotic spot spraying against industry-standard broadcast spraying, by measuring the weed control efficacy, the reduction in herbicide usage, and the water quality improvements in irrigation runoff. The average results across 25 hectares of field trials show that spot spraying on sugarcane farms is 97% as effective as broadcast spraying and reduces herbicide usage by 35%, proportionally to the weed density. For specific trial strips with lower weed pressure, spot spraying reduced herbicide usage by up to 65%. Water quality measurements of irrigation-induced runoff, three to six days after spraying, showed reductions in the mean concentration and mean load of herbicides of 39% and 54%, respectively, compared to broadcast spraying. These promising results reveal the capability of spot spraying technology to reduce herbicide usage on sugarcane farms without impacting weed control and potentially providing sustained water quality benefits.
△ Less
Submitted 24 January, 2024;
originally announced January 2024.
-
Are Vision Transformers More Data Hungry Than Newborn Visual Systems?
Authors:
Lalit Pandey,
Samantha M. W. Wood,
Justin N. Wood
Abstract:
Vision transformers (ViTs) are top performing models on many computer vision benchmarks and can accurately predict human behavior on object recognition tasks. However, researchers question the value of using ViTs as models of biological learning because ViTs are thought to be more data hungry than brains, with ViTs requiring more training data to reach similar levels of performance. To test this a…
▽ More
Vision transformers (ViTs) are top performing models on many computer vision benchmarks and can accurately predict human behavior on object recognition tasks. However, researchers question the value of using ViTs as models of biological learning because ViTs are thought to be more data hungry than brains, with ViTs requiring more training data to reach similar levels of performance. To test this assumption, we directly compared the learning abilities of ViTs and animals, by performing parallel controlled rearing experiments on ViTs and newborn chicks. We first raised chicks in impoverished visual environments containing a single object, then simulated the training data available in those environments by building virtual animal chambers in a video game engine. We recorded the first-person images acquired by agents moving through the virtual chambers and used those images to train self supervised ViTs that leverage time as a teaching signal, akin to biological visual systems. When ViTs were trained through the eyes of newborn chicks, the ViTs solved the same view invariant object recognition tasks as the chicks. Thus, ViTs were not more data hungry than newborn visual systems: both learned view invariant object representations in impoverished visual environments. The flexible and generic attention based learning mechanism in ViTs combined with the embodied data streams available to newborn animals appears sufficient to drive the development of animal-like object recognition.
△ Less
Submitted 5 December, 2023;
originally announced December 2023.
-
WeedCLR: Weed Contrastive Learning through Visual Representations with Class-Optimized Loss in Long-Tailed Datasets
Authors:
Alzayat Saleh,
Alex Olsen,
Jake Wood,
Bronson Philippa,
Mostafa Rahimi Azghadi
Abstract:
Image classification is a crucial task in modern weed management and crop intervention technologies. However, the limited size, diversity, and balance of existing weed datasets hinder the development of deep learning models for generalizable weed identification. In addition, the expensive labelling requirements of mainstream fully-supervised weed classifiers make them cost- and time-prohibitive to…
▽ More
Image classification is a crucial task in modern weed management and crop intervention technologies. However, the limited size, diversity, and balance of existing weed datasets hinder the development of deep learning models for generalizable weed identification. In addition, the expensive labelling requirements of mainstream fully-supervised weed classifiers make them cost- and time-prohibitive to deploy widely, for new weed species, and in site-specific weed management. This paper proposes a novel method for Weed Contrastive Learning through visual Representations (WeedCLR), that uses class-optimized loss with Von Neumann Entropy of deep representation for weed classification in long-tailed datasets. WeedCLR leverages self-supervised learning to learn rich and robust visual features without any labels and applies a class-optimized loss function to address the class imbalance problem in long-tailed datasets. WeedCLR is evaluated on two public weed datasets: CottonWeedID15, containing 15 weed species, and DeepWeeds, containing 8 weed species. WeedCLR achieves an average accuracy improvement of 4.3\% on CottonWeedID15 and 5.6\% on DeepWeeds over previous methods. It also demonstrates better generalization ability and robustness to different environmental conditions than existing methods without the need for expensive and time-consuming human annotations. These significant improvements make WeedCLR an effective tool for weed classification in long-tailed datasets and allows for more rapid and widespread deployment of site-specific weed management and crop intervention technologies.
△ Less
Submitted 19 October, 2023;
originally announced October 2023.
-
Large Language Models in Analyzing Crash Narratives -- A Comparative Study of ChatGPT, BARD and GPT-4
Authors:
Maroa Mumtarin,
Md Samiullah Chowdhury,
Jonathan Wood
Abstract:
In traffic safety research, extracting information from crash narratives using text analysis is a common practice. With recent advancements of large language models (LLM), it would be useful to know how the popular LLM interfaces perform in classifying or extracting information from crash narratives. To explore this, our study has used the three most popular publicly available LLM interfaces- Chat…
▽ More
In traffic safety research, extracting information from crash narratives using text analysis is a common practice. With recent advancements of large language models (LLM), it would be useful to know how the popular LLM interfaces perform in classifying or extracting information from crash narratives. To explore this, our study has used the three most popular publicly available LLM interfaces- ChatGPT, BARD and GPT4. This study investigated their usefulness and boundaries in extracting information and answering queries related to accidents from 100 crash narratives from Iowa and Kansas. During the investigation, their capabilities and limitations were assessed and their responses to the queries were compared. Five questions were asked related to the narratives: 1) Who is at-fault? 2) What is the manner of collision? 3) Has the crash occurred in a work-zone? 4) Did the crash involve pedestrians? and 5) What are the sequence of harmful events in the crash? For questions 1 through 4, the overall similarity among the LLMs were 70%, 35%, 96% and 89%, respectively. The similarities were higher while answering direct questions requiring binary responses and significantly lower for complex questions. To compare the responses to question 5, network diagram and centrality measures were analyzed. The network diagram from the three LLMs were not always similar although they sometimes have the same influencing events with high in-degree, out-degree and betweenness centrality. This study suggests using multiple models to extract viable information from narratives. Also, caution must be practiced while using these interfaces to obtain crucial safety related information.
△ Less
Submitted 24 August, 2023;
originally announced August 2023.
-
Follow Anything: Open-set detection, tracking, and following in real-time
Authors:
Alaa Maalouf,
Ninad Jadhav,
Krishna Murthy Jatavallabhula,
Makram Chahine,
Daniel M. Vogt,
Robert J. Wood,
Antonio Torralba,
Daniela Rus
Abstract:
Tracking and following objects of interest is critical to several robotics use cases, ranging from industrial automation to logistics and warehousing, to healthcare and security. In this paper, we present a robotic system to detect, track, and follow any object in real-time. Our approach, dubbed ``follow anything'' (FAn), is an open-vocabulary and multimodal model -- it is not restricted to concep…
▽ More
Tracking and following objects of interest is critical to several robotics use cases, ranging from industrial automation to logistics and warehousing, to healthcare and security. In this paper, we present a robotic system to detect, track, and follow any object in real-time. Our approach, dubbed ``follow anything'' (FAn), is an open-vocabulary and multimodal model -- it is not restricted to concepts seen at training time and can be applied to novel classes at inference time using text, images, or click queries. Leveraging rich visual descriptors from large-scale pre-trained models (foundation models), FAn can detect and segment objects by matching multimodal queries (text, images, clicks) against an input image sequence. These detected and segmented objects are tracked across image frames, all while accounting for occlusion and object re-emergence. We demonstrate FAn on a real-world robotic system (a micro aerial vehicle) and report its ability to seamlessly follow the objects of interest in a real-time control loop. FAn can be deployed on a laptop with a lightweight (6-8 GB) graphics card, achieving a throughput of 6-20 frames per second. To enable rapid adoption, deployment, and extensibility, we open-source all our code on our project webpage at https://github.com/alaamaalouf/FollowAnything . We also encourage the reader to watch our 5-minutes explainer video in this https://www.youtube.com/watch?v=6Mgt3EPytrw .
△ Less
Submitted 9 February, 2024; v1 submitted 10 August, 2023;
originally announced August 2023.
-
C-DARL: Contrastive diffusion adversarial representation learning for label-free blood vessel segmentation
Authors:
Boah Kim,
Yujin Oh,
Bradford J. Wood,
Ronald M. Summers,
Jong Chul Ye
Abstract:
Blood vessel segmentation in medical imaging is one of the essential steps for vascular disease diagnosis and interventional planning in a broad spectrum of clinical scenarios in image-based medicine and interventional medicine. Unfortunately, manual annotation of the vessel masks is challenging and resource-intensive due to subtle branches and complex structures. To overcome this issue, this pape…
▽ More
Blood vessel segmentation in medical imaging is one of the essential steps for vascular disease diagnosis and interventional planning in a broad spectrum of clinical scenarios in image-based medicine and interventional medicine. Unfortunately, manual annotation of the vessel masks is challenging and resource-intensive due to subtle branches and complex structures. To overcome this issue, this paper presents a self-supervised vessel segmentation method, dubbed the contrastive diffusion adversarial representation learning (C-DARL) model. Our model is composed of a diffusion module and a generation module that learns the distribution of multi-domain blood vessel data by generating synthetic vessel images from diffusion latent. Moreover, we employ contrastive learning through a mask-based contrastive loss so that the model can learn more realistic vessel representations. To validate the efficacy, C-DARL is trained using various vessel datasets, including coronary angiograms, abdominal digital subtraction angiograms, and retinal imaging. Experimental results confirm that our model achieves performance improvement over baseline methods with noise robustness, suggesting the effectiveness of C-DARL for vessel segmentation.
△ Less
Submitted 31 July, 2023;
originally announced August 2023.
-
StyleGAN2-based Out-of-Distribution Detection for Medical Imaging
Authors:
McKell Woodland,
John Wood,
Caleb O'Connor,
Ankit B. Patel,
Kristy K. Brock
Abstract:
One barrier to the clinical deployment of deep learning-based models is the presence of images at runtime that lie far outside the training distribution of a given model. We aim to detect these out-of-distribution (OOD) images with a generative adversarial network (GAN). Our training dataset was comprised of 3,234 liver-containing computed tomography (CT) scans from 456 patients. Our OOD test data…
▽ More
One barrier to the clinical deployment of deep learning-based models is the presence of images at runtime that lie far outside the training distribution of a given model. We aim to detect these out-of-distribution (OOD) images with a generative adversarial network (GAN). Our training dataset was comprised of 3,234 liver-containing computed tomography (CT) scans from 456 patients. Our OOD test data consisted of CT images of the brain, head and neck, lung, cervix, and abnormal livers. A StyleGAN2-ADA architecture was employed to model the training distribution. Images were reconstructed using backpropagation. Reconstructions were evaluated using the Wasserstein distance, mean squared error, and the structural similarity index measure. OOD detection was evaluated with the area under the receiver operating characteristic curve (AUROC). Our paradigm distinguished between liver and non-liver CT with greater than 90% AUROC. It was also completely unable to reconstruct liver artifacts, such as needles and ascites.
△ Less
Submitted 10 July, 2023;
originally announced July 2023.
-
Ethical Considerations of AR Applications in Smartphones; A Systematic Literature Review of Consumer Perspectives
Authors:
Nicola J Wood
Abstract:
This study focuses on the ethical considerations that a consumer perceives with augmented reality (AR) in the context of smartphone applications. Through a systematic review, this research can provide an understanding and ability for developers, product managers, digital marketers and associated business professionals to effectively implement and deploy mobile AR related applications and campaigns…
▽ More
This study focuses on the ethical considerations that a consumer perceives with augmented reality (AR) in the context of smartphone applications. Through a systematic review, this research can provide an understanding and ability for developers, product managers, digital marketers and associated business professionals to effectively implement and deploy mobile AR related applications and campaigns, with consideration to the perceptions of the ethical considerations that consumers have of this growing technology. The rise in digital transformation and new technologies paved this research agenda. Trends in the data revealed two overarching factors of 'Benefits' and 'Ethical Considerations'. Within these two factors, several consumer perceived themes were identified with regards to AR applications and their association categorised either positive, negative or neutral. 'Benefits' revealed 3 consistent themes of personalisation, interactivity and information acquisition. 'Ethical Considerations' revealed consistent patterns of educational awareness, privacy, transparency and security. From identifying the consumer perceptions, business professionals can strategically address and or challenge the inherent limitations and their associations during AR application development, product adoption strategies or marketing purposes.
△ Less
Submitted 6 June, 2023;
originally announced June 2023.
-
A newborn embodied Turing test for view-invariant object recognition
Authors:
Denizhan Pak,
Donsuk Lee,
Samantha M. W. Wood,
Justin N. Wood
Abstract:
Recent progress in artificial intelligence has renewed interest in building machines that learn like animals. Almost all of the work comparing learning across biological and artificial systems comes from studies where animals and machines received different training data, obscuring whether differences between animals and machines emerged from differences in learning mechanisms versus training data…
▽ More
Recent progress in artificial intelligence has renewed interest in building machines that learn like animals. Almost all of the work comparing learning across biological and artificial systems comes from studies where animals and machines received different training data, obscuring whether differences between animals and machines emerged from differences in learning mechanisms versus training data. We present an experimental approach-a "newborn embodied Turing Test"-that allows newborn animals and machines to be raised in the same environments and tested with the same tasks, permitting direct comparison of their learning abilities. To make this platform, we first collected controlled-rearing data from newborn chicks, then performed "digital twin" experiments in which machines were raised in virtual environments that mimicked the rearing conditions of the chicks. We found that (1) machines (deep reinforcement learning agents with intrinsic motivation) can spontaneously develop visually guided preference behavior, akin to imprinting in newborn chicks, and (2) machines are still far from newborn-level performance on object recognition tasks. Almost all of the chicks developed view-invariant object recognition, whereas the machines tended to develop view-dependent recognition. The learning outcomes were also far more constrained in the chicks versus machines. Ultimately, we anticipate that this approach will help researchers develop embodied AI systems that learn like newborn animals.
△ Less
Submitted 8 June, 2023;
originally announced June 2023.
-
Parallel development of social preferences in fish and machines
Authors:
Joshua McGraw,
Donsuk Lee,
Justin Wood
Abstract:
What are the computational foundations of social grouping? Traditional approaches to this question have focused on verbal reasoning or simple (low-dimensional) quantitative models. In the real world, however, social preferences emerge when high-dimensional learning systems (brains and bodies) interact with high-dimensional sensory inputs during an animal's embodied interactions with the world. A d…
▽ More
What are the computational foundations of social grouping? Traditional approaches to this question have focused on verbal reasoning or simple (low-dimensional) quantitative models. In the real world, however, social preferences emerge when high-dimensional learning systems (brains and bodies) interact with high-dimensional sensory inputs during an animal's embodied interactions with the world. A deep understanding of social grouping will therefore require embodied models that learn directly from sensory inputs using high-dimensional learning mechanisms. To this end, we built artificial neural networks (ANNs), embodied those ANNs in virtual fish bodies, and raised the artificial fish in virtual fish tanks that mimicked the rearing conditions of real fish. We then compared the social preferences that emerged in real fish versus artificial fish. We found that when artificial fish had two core learning mechanisms (reinforcement learning and curiosity-driven learning), artificial fish developed fish-like social preferences. Like real fish, the artificial fish spontaneously learned to prefer members of their own group over members of other groups. The artificial fish also spontaneously learned to self-segregate with their in-group, akin to self-segregation behavior seen in nature. Our results suggest that social grouping can emerge from three ingredients: (1) reinforcement learning, (2) intrinsic motivation, and (3) early social experiences with in-group members. This approach lays a foundation for reverse engineering animal-like social behavior with image-computable models, bridging the divide between high-dimensional sensory inputs and social preferences.
△ Less
Submitted 18 May, 2023;
originally announced May 2023.
-
Machine learning framework for end-to-end implementation of Incident duration prediction
Authors:
Smrithi Ajit,
Varsha R Mouli,
Skylar Knickerbocker,
Jonathan S. Wood
Abstract:
Traffic congestion caused by non-recurring incidents such as vehicle crashes and debris is a key issue for Traffic Management Centers (TMCs). Clearing incidents in a timely manner is essential for improving safety and reducing delays and emissions for the traveling public. However, TMCs and other responders face a challenge in predicting the duration of incidents (until the roadway is clear), maki…
▽ More
Traffic congestion caused by non-recurring incidents such as vehicle crashes and debris is a key issue for Traffic Management Centers (TMCs). Clearing incidents in a timely manner is essential for improving safety and reducing delays and emissions for the traveling public. However, TMCs and other responders face a challenge in predicting the duration of incidents (until the roadway is clear), making decisions of what resources to deploy difficult. To address this problem, this research developed an analytical framework and end-to-end machine-learning solution for predicting incident duration based on information available as soon as an incident report is received. Quality predictions of incident duration can help TMCs and other responders take a proactive approach in deploying responder services such as tow trucks, maintenance crews or activating alternative routes. The predictions use a combination of classification and regression machine learning modules. The performance of the developed solution has been evaluated based on the Mean Absolute Error (MAE), or deviation from the actual incident duration as well as Area Under the Curve (AUC) and Mean Absolute Percentage Error (MAPE). The results showed that the framework significantly improved incident duration prediction compared to methods from previous research.
△ Less
Submitted 22 April, 2023;
originally announced April 2023.
-
Distance Map Supervised Landmark Localization for MR-TRUS Registration
Authors:
Xinrui Song,
Xuanang Xu,
Sheng Xu,
Baris Turkbey,
Bradford J. Wood,
Thomas Sanford,
Pingkun Yan
Abstract:
In this work, we propose to explicitly use the landmarks of prostate to guide the MR-TRUS image registration. We first train a deep neural network to automatically localize a set of meaningful landmarks, and then directly generate the affine registration matrix from the location of these landmarks. For landmark localization, instead of directly training a network to predict the landmark coordinate…
▽ More
In this work, we propose to explicitly use the landmarks of prostate to guide the MR-TRUS image registration. We first train a deep neural network to automatically localize a set of meaningful landmarks, and then directly generate the affine registration matrix from the location of these landmarks. For landmark localization, instead of directly training a network to predict the landmark coordinates, we propose to regress a full-resolution distance map of the landmark, which is demonstrated effective in avoiding statistical bias to unsatisfactory performance and thus improving performance. We then use the predicted landmarks to generate the affine transformation matrix, which outperforms the clinicians' manual rigid registration by a significant margin in terms of TRE.
△ Less
Submitted 11 October, 2022;
originally announced October 2022.
-
Evaluating the Performance of StyleGAN2-ADA on Medical Images
Authors:
McKell Woodland,
John Wood,
Brian M. Anderson,
Suprateek Kundu,
Ethan Lin,
Eugene Koay,
Bruno Odisio,
Caroline Chung,
Hyunseon Christine Kang,
Aradhana M. Venkatesan,
Sireesha Yedururi,
Brian De,
Yuan-Mao Lin,
Ankit B. Patel,
Kristy K. Brock
Abstract:
Although generative adversarial networks (GANs) have shown promise in medical imaging, they have four main limitations that impeded their utility: computational cost, data requirements, reliable evaluation measures, and training complexity. Our work investigates each of these obstacles in a novel application of StyleGAN2-ADA to high-resolution medical imaging datasets. Our dataset is comprised of…
▽ More
Although generative adversarial networks (GANs) have shown promise in medical imaging, they have four main limitations that impeded their utility: computational cost, data requirements, reliable evaluation measures, and training complexity. Our work investigates each of these obstacles in a novel application of StyleGAN2-ADA to high-resolution medical imaging datasets. Our dataset is comprised of liver-containing axial slices from non-contrast and contrast-enhanced computed tomography (CT) scans. Additionally, we utilized four public datasets composed of various imaging modalities. We trained a StyleGAN2 network with transfer learning (from the Flickr-Faces-HQ dataset) and data augmentation (horizontal flipping and adaptive discriminator augmentation). The network's generative quality was measured quantitatively with the Fréchet Inception Distance (FID) and qualitatively with a visual Turing test given to seven radiologists and radiation oncologists.
The StyleGAN2-ADA network achieved a FID of 5.22 ($\pm$ 0.17) on our liver CT dataset. It also set new record FIDs of 10.78, 3.52, 21.17, and 5.39 on the publicly available SLIVER07, ChestX-ray14, ACDC, and Medical Segmentation Decathlon (brain tumors) datasets. In the visual Turing test, the clinicians rated generated images as real 42% of the time, approaching random guessing. Our computational ablation study revealed that transfer learning and data augmentation stabilize training and improve the perceptual quality of the generated images. We observed the FID to be consistent with human perceptual evaluation of medical images. Finally, our work found that StyleGAN2-ADA consistently produces high-quality results without hyperparameter searches or retraining.
△ Less
Submitted 7 October, 2022;
originally announced October 2022.
-
Integrative Imaging Informatics for Cancer Research: Workflow Automation for Neuro-oncology (I3CR-WANO)
Authors:
Satrajit Chakrabarty,
Syed Amaan Abidi,
Mina Mousa,
Mahati Mokkarala,
Isabelle Hren,
Divya Yadav,
Matthew Kelsey,
Pamela LaMontagne,
John Wood,
Michael Adams,
Yuzhuo Su,
Sherry Thorpe,
Caroline Chung,
Aristeidis Sotiras,
Daniel S. Marcus
Abstract:
Efforts to utilize growing volumes of clinical imaging data to generate tumor evaluations continue to require significant manual data wrangling owing to the data heterogeneity. Here, we propose an artificial intelligence-based solution for the aggregation and processing of multisequence neuro-oncology MRI data to extract quantitative tumor measurements. Our end-to-end framework i) classifies MRI s…
▽ More
Efforts to utilize growing volumes of clinical imaging data to generate tumor evaluations continue to require significant manual data wrangling owing to the data heterogeneity. Here, we propose an artificial intelligence-based solution for the aggregation and processing of multisequence neuro-oncology MRI data to extract quantitative tumor measurements. Our end-to-end framework i) classifies MRI sequences using an ensemble classifier, ii) preprocesses the data in a reproducible manner, iii) delineates tumor tissue subtypes using convolutional neural networks, and iv) extracts diverse radiomic features. Moreover, it is robust to missing sequences and adopts an expert-in-the-loop approach, where the segmentation results may be manually refined by radiologists. Following the implementation of the framework in Docker containers, it was applied to two retrospective glioma datasets collected from the Washington University School of Medicine (WUSM; n = 384) and the M.D. Anderson Cancer Center (MDA; n = 30) comprising preoperative MRI scans from patients with pathologically confirmed gliomas. The scan-type classifier yielded an accuracy of over 99%, correctly identifying sequences from 380/384 and 30/30 sessions from the WUSM and MDA datasets, respectively. Segmentation performance was quantified using the Dice Similarity Coefficient between the predicted and expert-refined tumor masks. Mean Dice scores were 0.882 ($\pm$0.244) and 0.977 ($\pm$0.04) for whole tumor segmentation for WUSM and MDA, respectively. This streamlined framework automatically curated, processed, and segmented raw MRI data of patients with varying grades of gliomas, enabling the curation of large-scale neuro-oncology datasets and demonstrating a high potential for integration as an assistive tool in clinical practice.
△ Less
Submitted 6 October, 2022;
originally announced October 2022.
-
Visualization for Epidemiological Modelling: Challenges, Solutions, Reflections & Recommendations
Authors:
Jason Dykes,
Alfie Abdul-Rahman,
Daniel Archambault,
Benjamin Bach,
Rita Borgo,
Min Chen,
Jessica Enright,
Hui Fang,
Elif E. Firat,
Euan Freeman,
Tuna Gonen,
Claire Harris,
Radu Jianu,
Nigel W. John,
Saiful Khan,
Andrew Lahiff,
Robert S. Laramee,
Louise Matthews,
Sibylle Mohr,
Phong H. Nguyen,
Alma A. M. Rahat,
Richard Reeve,
Panagiotis D. Ritsos,
Jonathan C. Roberts,
Aidan Slingsby
, et al. (8 additional authors not shown)
Abstract:
We report on an ongoing collaboration between epidemiological modellers and visualization researchers by documenting and reflecting upon knowledge constructs -- a series of ideas, approaches and methods taken from existing visualization research and practice -- deployed and developed to support modelling of the COVID-19 pandemic. Structured independent commentary on these efforts is synthesized th…
▽ More
We report on an ongoing collaboration between epidemiological modellers and visualization researchers by documenting and reflecting upon knowledge constructs -- a series of ideas, approaches and methods taken from existing visualization research and practice -- deployed and developed to support modelling of the COVID-19 pandemic. Structured independent commentary on these efforts is synthesized through iterative reflection to develop: evidence of the effectiveness and value of visualization in this context; open problems upon which the research communities may focus; guidance for future activity of this type; and recommendations to safeguard the achievements and promote, advance, secure and prepare for future collaborations of this kind. In describing and comparing a series of related projects that were undertaken in unprecedented conditions, our hope is that this unique report, and its rich interactive supplementary materials, will guide the scientific community in embracing visualization in its observation, analysis and modelling of data as well as in disseminating findings. Equally we hope to encourage the visualization community to engage with impactful science in addressing its emerging data challenges. If we are successful, this showcase of activity may stimulate mutually beneficial engagement between communities with complementary expertise to address problems of significance in epidemiology and beyond. https://ramp-vis.github.io/RAMPVIS-PhilTransA-Supplement/
△ Less
Submitted 20 June, 2022; v1 submitted 14 April, 2022;
originally announced April 2022.
-
Controlled-rearing studies of newborn chicks and deep neural networks
Authors:
Donsuk Lee,
Pranav Gujarathi,
Justin N. Wood
Abstract:
Convolutional neural networks (CNNs) can now achieve human-level performance on challenging object recognition tasks. CNNs are also the leading quantitative models in terms of predicting neural and behavioral responses in visual recognition tasks. However, there is a widely accepted critique of CNN models: unlike newborn animals, which learn rapidly and efficiently, CNNs are thought to be "data hu…
▽ More
Convolutional neural networks (CNNs) can now achieve human-level performance on challenging object recognition tasks. CNNs are also the leading quantitative models in terms of predicting neural and behavioral responses in visual recognition tasks. However, there is a widely accepted critique of CNN models: unlike newborn animals, which learn rapidly and efficiently, CNNs are thought to be "data hungry," requiring massive amounts of training data to develop accurate models for object recognition. This critique challenges the promise of using CNNs as models of visual development. Here, we directly examined whether CNNs are more data hungry than newborn animals by performing parallel controlled-rearing experiments on newborn chicks and CNNs. We raised newborn chicks in strictly controlled visual environments, then simulated the training data available in that environment by constructing a virtual animal chamber in a video game engine. We recorded the visual images acquired by an agent moving through the virtual chamber and used those images to train CNNs. When CNNs received similar visual training data as chicks, the CNNs successfully solved the same challenging view-invariant object recognition tasks as the chicks. Thus, the CNNs were not more data hungry than animals: both CNNs and chicks successfully developed robust object models from training data of a single object.
△ Less
Submitted 11 December, 2021;
originally announced December 2021.
-
Development of collective behavior in newborn artificial agents
Authors:
Donsuk Lee,
Samantha M. W. Wood,
Justin N. Wood
Abstract:
Collective behavior is widespread across the animal kingdom. To date, however, the developmental and mechanistic foundations of collective behavior have not been formally established. What learning mechanisms drive the development of collective behavior in newborn animals? Here, we used deep reinforcement learning and curiosity-driven learning -- two learning mechanisms deeply rooted in psychologi…
▽ More
Collective behavior is widespread across the animal kingdom. To date, however, the developmental and mechanistic foundations of collective behavior have not been formally established. What learning mechanisms drive the development of collective behavior in newborn animals? Here, we used deep reinforcement learning and curiosity-driven learning -- two learning mechanisms deeply rooted in psychological and neuroscientific research -- to build newborn artificial agents that develop collective behavior. Like newborn animals, our agents learn collective behavior from raw sensory inputs in naturalistic environments. Our agents also learn collective behavior without external rewards, using only intrinsic motivation (curiosity) to drive learning. Specifically, when we raise our artificial agents in natural visual environments with groupmates, the agents spontaneously develop ego-motion, object recognition, and a preference for groupmates, rapidly learning all of the core skills required for collective behavior. This work bridges the divide between high-dimensional sensory inputs and collective action, resulting in a pixels-to-actions model of collective animal behavior. More generally, we show that two generic learning mechanisms -- deep reinforcement learning and curiosity-driven learning -- are sufficient to learn collective behavior from unsupervised natural experience.
△ Less
Submitted 5 November, 2021;
originally announced November 2021.
-
End-to-end Ultrasound Frame to Volume Registration
Authors:
Hengtao Guo,
Xuanang Xu,
Sheng Xu,
Bradford J. Wood,
Pingkun Yan
Abstract:
Fusing intra-operative 2D transrectal ultrasound (TRUS) image with pre-operative 3D magnetic resonance (MR) volume to guide prostate biopsy can significantly increase the yield. However, such a multimodal 2D/3D registration problem is a very challenging task. In this paper, we propose an end-to-end frame-to-volume registration network (FVR-Net), which can efficiently bridge the previous research g…
▽ More
Fusing intra-operative 2D transrectal ultrasound (TRUS) image with pre-operative 3D magnetic resonance (MR) volume to guide prostate biopsy can significantly increase the yield. However, such a multimodal 2D/3D registration problem is a very challenging task. In this paper, we propose an end-to-end frame-to-volume registration network (FVR-Net), which can efficiently bridge the previous research gaps by aligning a 2D TRUS frame with a 3D TRUS volume without requiring hardware tracking. The proposed FVR-Net utilizes a dual-branch feature extraction module to extract the information from TRUS frame and volume to estimate transformation parameters. We also introduce a differentiable 2D slice sampling module which allows gradients backpropagating from an unsupervised image similarity loss for content correspondence learning. Our model shows superior efficiency for real-time interventional guidance with highly competitive registration accuracy.
△ Less
Submitted 13 July, 2021;
originally announced July 2021.
-
Cross-modal Attention for MRI and Ultrasound Volume Registration
Authors:
Xinrui Song,
Hengtao Guo,
Xuanang Xu,
Hanqing Chao,
Sheng Xu,
Baris Turkbey,
Bradford J. Wood,
Ge Wang,
Pingkun Yan
Abstract:
Prostate cancer biopsy benefits from accurate fusion of transrectal ultrasound (TRUS) and magnetic resonance (MR) images. In the past few years, convolutional neural networks (CNNs) have been proved powerful in extracting image features crucial for image registration. However, challenging applications and recent advances in computer vision suggest that CNNs are quite limited in its ability to unde…
▽ More
Prostate cancer biopsy benefits from accurate fusion of transrectal ultrasound (TRUS) and magnetic resonance (MR) images. In the past few years, convolutional neural networks (CNNs) have been proved powerful in extracting image features crucial for image registration. However, challenging applications and recent advances in computer vision suggest that CNNs are quite limited in its ability to understand spatial correspondence between features, a task in which the self-attention mechanism excels. This paper aims to develop a self-attention mechanism specifically for cross-modal image registration. Our proposed cross-modal attention block effectively maps each of the features in one volume to all features in the corresponding volume. Our experimental results demonstrate that a CNN network designed with the cross-modal attention block embedded outperforms an advanced CNN network 10 times of its size. We also incorporated visualization techniques to improve the interpretability of our network. The source code of our work is available at https://github.com/DIAL-RPI/Attention-Reg .
△ Less
Submitted 11 July, 2021; v1 submitted 9 July, 2021;
originally announced July 2021.
-
Modeling Object Recognition in Newborn Chicks using Deep Neural Networks
Authors:
Donsuk Lee,
Denizhan Pak,
Justin N. Wood
Abstract:
In recent years, the brain and cognitive sciences have made great strides developing a mechanistic understanding of object recognition in mature brains. Despite this progress, fundamental questions remain about the origins and computational foundations of object recognition. What learning algorithms underlie object recognition in newborn brains? Since newborn animals learn largely through unsuperv…
▽ More
In recent years, the brain and cognitive sciences have made great strides developing a mechanistic understanding of object recognition in mature brains. Despite this progress, fundamental questions remain about the origins and computational foundations of object recognition. What learning algorithms underlie object recognition in newborn brains? Since newborn animals learn largely through unsupervised learning, we explored whether unsupervised learning algorithms can be used to predict the view-invariant object recognition behavior of newborn chicks. Specifically, we used feature representations derived from unsupervised deep neural networks (DNNs) as inputs to cognitive models of categorization. We show that features derived from unsupervised DNNs make competitive predictions about chick behavior compared to supervised features. More generally, we argue that linking controlled-rearing studies to image-computable DNN models opens new experimental avenues for studying the origins and computational basis of object recognition in newborn animals.
△ Less
Submitted 14 June, 2021;
originally announced June 2021.
-
Cetacean Translation Initiative: a roadmap to deciphering the communication of sperm whales
Authors:
Jacob Andreas,
Gašper Beguš,
Michael M. Bronstein,
Roee Diamant,
Denley Delaney,
Shane Gero,
Shafi Goldwasser,
David F. Gruber,
Sarah de Haas,
Peter Malkin,
Roger Payne,
Giovanni Petri,
Daniela Rus,
Pratyusha Sharma,
Dan Tchernov,
Pernille Tønnesen,
Antonio Torralba,
Daniel Vogt,
Robert J. Wood
Abstract:
The past decade has witnessed a groundbreaking rise of machine learning for human language analysis, with current methods capable of automatically accurately recovering various aspects of syntax and semantics - including sentence structure and grounded word meaning - from large data collections. Recent research showed the promise of such tools for analyzing acoustic communication in nonhuman speci…
▽ More
The past decade has witnessed a groundbreaking rise of machine learning for human language analysis, with current methods capable of automatically accurately recovering various aspects of syntax and semantics - including sentence structure and grounded word meaning - from large data collections. Recent research showed the promise of such tools for analyzing acoustic communication in nonhuman species. We posit that machine learning will be the cornerstone of future collection, processing, and analysis of multimodal streams of data in animal communication studies, including bioacoustic, behavioral, biological, and environmental data. Cetaceans are unique non-human model species as they possess sophisticated acoustic communications, but utilize a very different encoding system that evolved in an aquatic rather than terrestrial medium. Sperm whales, in particular, with their highly-developed neuroanatomical features, cognitive abilities, social structures, and discrete click-based encoding make for an excellent starting point for advanced machine learning tools that can be applied to other animals in the future. This paper details a roadmap toward this goal based on currently existing technology and multidisciplinary scientific community effort. We outline the key elements required for the collection and processing of massive bioacoustic data of sperm whales, detecting their basic communication units and language-like higher-level structures, and validating these models through interactive playback experiments. The technological capabilities developed by such an undertaking are likely to yield cross-applications and advancements in broader communities investigating non-human communication and animal behavioral research.
△ Less
Submitted 17 April, 2021;
originally announced April 2021.
-
A data-driven personalized smart lighting recommender system
Authors:
Atousa Zarindast,
Jonathan Wood,
Anuj Sharma
Abstract:
Recommender systems attempts to identify and recommend the most preferable item (product-service) to an individual user. These systems predict user interest in items based on related items, users, and the interactions between items and users. We aim to build an auto-routine and color scheme recommender system that leverages a wealth of historical data and machine learning methods. We introduce an…
▽ More
Recommender systems attempts to identify and recommend the most preferable item (product-service) to an individual user. These systems predict user interest in items based on related items, users, and the interactions between items and users. We aim to build an auto-routine and color scheme recommender system that leverages a wealth of historical data and machine learning methods. We introduce an unsupervised method to recommend a routine for lighting. Moreover, by analyzing users' daily logs, geographical location, temporal and usage information we understand user preference and predict their preferred color for lights. To do so, we cluster users based on their geographical information and usage distribution. We then build and train a predictive model within each cluster and aggregate the results. Results indicate that models based on similar users increases the prediction accuracy, with and without prior knowledge about user preferences.
△ Less
Submitted 5 April, 2021;
originally announced April 2021.
-
RAMPVIS: Towards a New Methodology for Developing Visualisation Capabilities for Large-scale Emergency Responses
Authors:
M. Chen,
A. Abdul-Rahman,
D. Archambault,
J. Dykes,
A. Slingsby,
P. D. Ritsos,
T. Torsney-Weir,
C. Turkay,
B. Bach,
A. Brett,
H. Fang,
R. Jianu,
S. Khan,
R. S. Laramee,
P. H. Nguyen,
R. Reeve,
J. C. Roberts,
F. Vidal,
Q. Wang,
J. Wood,
K. Xu
Abstract:
The effort for combating the COVID-19 pandemic around the world has resulted in a huge amount of data, e.g., from testing, contact tracing, modelling, treatment, vaccine trials, and more. In addition to numerous challenges in epidemiology, healthcare, biosciences, and social sciences, there has been an urgent need to develop and provide visualisation and visual analytics (VIS) capacities to suppor…
▽ More
The effort for combating the COVID-19 pandemic around the world has resulted in a huge amount of data, e.g., from testing, contact tracing, modelling, treatment, vaccine trials, and more. In addition to numerous challenges in epidemiology, healthcare, biosciences, and social sciences, there has been an urgent need to develop and provide visualisation and visual analytics (VIS) capacities to support emergency responses under difficult operational conditions. In this paper, we report the experience of a group of VIS volunteers who have been working in a large research and development consortium and providing VIS support to various observational, analytical, model-developmental and disseminative tasks. In particular, we describe our approaches to the challenges that we have encountered in requirements analysis, data acquisition, visual design, software design, system development, team organisation, and resource planning. By reflecting on our experience, we propose a set of recommendations as the first step towards a methodology for developing and providing rapid VIS capacities to support emergency responses.
△ Less
Submitted 8 December, 2020;
originally announced December 2020.
-
Federated Semi-Supervised Learning for COVID Region Segmentation in Chest CT using Multi-National Data from China, Italy, Japan
Authors:
Dong Yang,
Ziyue Xu,
Wenqi Li,
Andriy Myronenko,
Holger R. Roth,
Stephanie Harmon,
Sheng Xu,
Baris Turkbey,
Evrim Turkbey,
Xiaosong Wang,
Wentao Zhu,
Gianpaolo Carrafiello,
Francesca Patella,
Maurizio Cariati,
Hirofumi Obinata,
Hitoshi Mori,
Kaku Tamura,
Peng An,
Bradford J. Wood,
Daguang Xu
Abstract:
The recent outbreak of COVID-19 has led to urgent needs for reliable diagnosis and management of SARS-CoV-2 infection. As a complimentary tool, chest CT has been shown to be able to reveal visual patterns characteristic for COVID-19, which has definite value at several stages during the disease course. To facilitate CT analysis, recent efforts have focused on computer-aided characterization and di…
▽ More
The recent outbreak of COVID-19 has led to urgent needs for reliable diagnosis and management of SARS-CoV-2 infection. As a complimentary tool, chest CT has been shown to be able to reveal visual patterns characteristic for COVID-19, which has definite value at several stages during the disease course. To facilitate CT analysis, recent efforts have focused on computer-aided characterization and diagnosis, which has shown promising results. However, domain shift of data across clinical data centers poses a serious challenge when deploying learning-based models. In this work, we attempt to find a solution for this challenge via federated and semi-supervised learning. A multi-national database consisting of 1704 scans from three countries is adopted to study the performance gap, when training a model with one dataset and applying it to another. Expert radiologists manually delineated 945 scans for COVID-19 findings. In handling the variability in both the data and annotations, a novel federated semi-supervised learning technique is proposed to fully utilize all available data (with or without annotations). Federated learning avoids the need for sensitive data-sharing, which makes it favorable for institutions and nations with strict regulatory policy on data privacy. Moreover, semi-supervision potentially reduces the annotation burden under a distributed setting. The proposed framework is shown to be effective compared to fully supervised scenarios with conventional data sharing instead of model weight sharing.
△ Less
Submitted 23 November, 2020;
originally announced November 2020.
-
Transducer Adaptive Ultrasound Volume Reconstruction
Authors:
Hengtao Guo,
Sheng Xu,
Bradford J. Wood,
Pingkun Yan
Abstract:
Reconstructed 3D ultrasound volume provides more context information compared to a sequence of 2D scanning frames, which is desirable for various clinical applications such as ultrasound-guided prostate biopsy. Nevertheless, 3D volume reconstruction from freehand 2D scans is a very challenging problem, especially without the use of external tracking devices. Recent deep learning based methods demo…
▽ More
Reconstructed 3D ultrasound volume provides more context information compared to a sequence of 2D scanning frames, which is desirable for various clinical applications such as ultrasound-guided prostate biopsy. Nevertheless, 3D volume reconstruction from freehand 2D scans is a very challenging problem, especially without the use of external tracking devices. Recent deep learning based methods demonstrate the potential of directly estimating inter-frame motion between consecutive ultrasound frames. However, such algorithms are specific to particular transducers and scanning trajectories associated with the training data, which may not be generalized to other image acquisition settings. In this paper, we tackle the data acquisition difference as a domain shift problem and propose a novel domain adaptation strategy to adapt deep learning algorithms to data acquired with different transducers. Specifically, feature extractors that generate transducer-invariant features from different datasets are trained by minimizing the discrepancy between deep features of paired samples in a latent space. Our results show that the proposed domain adaptation method can successfully align different feature distributions while preserving the transducer-specific information for universal freehand ultrasound volume reconstruction.
△ Less
Submitted 16 November, 2020;
originally announced November 2020.
-
A Linear Algebra Approach to Linear Metatheory
Authors:
James Wood,
Robert Atkey
Abstract:
Linear typed $λ$-calculi are more delicate than their simply typed siblings when it comes to metatheoretic results like preservation of typing under renaming and substitution. Tracking the usage of variables in contexts places more constraints on how variables may be renamed or substituted. We present a methodology based on linear algebra over semirings, extending McBride's kits and traversals app…
▽ More
Linear typed $λ$-calculi are more delicate than their simply typed siblings when it comes to metatheoretic results like preservation of typing under renaming and substitution. Tracking the usage of variables in contexts places more constraints on how variables may be renamed or substituted. We present a methodology based on linear algebra over semirings, extending McBride's kits and traversals approach for the metatheory of syntax with binding to linear usage-annotated terms. Our approach is readily formalisable, and we have done so in Agda.
△ Less
Submitted 30 December, 2021; v1 submitted 5 May, 2020;
originally announced May 2020.
-
Scaling down an insect-size microrobot, HAMR-VI into HAMR-Jr
Authors:
Kaushik Jayaram,
Jennifer Shum,
Samantha Castellanos,
E. Farrell Helbling,
Robert J. Wood
Abstract:
Here we present HAMR-Jr, a \SI{22.5}{\milli\meter}, \SI{320}{\milli\gram} quadrupedal microrobot. With eight independently actuated degrees of freedom, HAMR-Jr is, to our knowledge, the most mechanically dexterous legged robot at its scale and is capable of high-speed locomotion (\SI{13.91}{bodylengths~\second^{-1}}) at a variety of stride frequencies (\SI{1}{}-\SI{200}{\hertz}) using multiple gai…
▽ More
Here we present HAMR-Jr, a \SI{22.5}{\milli\meter}, \SI{320}{\milli\gram} quadrupedal microrobot. With eight independently actuated degrees of freedom, HAMR-Jr is, to our knowledge, the most mechanically dexterous legged robot at its scale and is capable of high-speed locomotion (\SI{13.91}{bodylengths~\second^{-1}}) at a variety of stride frequencies (\SI{1}{}-\SI{200}{\hertz}) using multiple gaits. We achieved this using a design and fabrication process that is flexible, allowing scaling with minimum changes to our workflow. We further characterized HAMR-Jr's open-loop locomotion and compared it with the larger scale HAMR-VI microrobot to demonstrate the effectiveness of scaling laws in predicting running performance.
△ Less
Submitted 6 March, 2020;
originally announced March 2020.
-
Unified Multi-scale Feature Abstraction for Medical Image Segmentation
Authors:
Xi Fang,
Bo Du,
Sheng Xu,
Bradford J. Wood,
Pingkun Yan
Abstract:
Automatic medical image segmentation, an essential component of medical image analysis, plays an importantrole in computer-aided diagnosis. For example, locating and segmenting the liver can be very helpful in livercancer diagnosis and treatment. The state-of-the-art models in medical image segmentation are variants ofthe encoder-decoder architecture such as fully convolutional network (FCN) and U…
▽ More
Automatic medical image segmentation, an essential component of medical image analysis, plays an importantrole in computer-aided diagnosis. For example, locating and segmenting the liver can be very helpful in livercancer diagnosis and treatment. The state-of-the-art models in medical image segmentation are variants ofthe encoder-decoder architecture such as fully convolutional network (FCN) and U-Net.1A major focus ofthe FCN based segmentation methods has been on network structure engineering by incorporating the latestCNN structures such as ResNet2and DenseNet.3In addition to exploring new network structures for efficientlyabstracting high level features, incorporating structures for multi-scale image feature extraction in FCN hashelped to improve performance in segmentation tasks. In this paper, we design a new multi-scale networkarchitecture, which takes multi-scale inputs with dedicated convolutional paths to efficiently combine featuresfrom different scales to better utilize the hierarchical information.
△ Less
Submitted 24 October, 2019;
originally announced October 2019.
-
Design by Immersion: A Transdisciplinary Approach to Problem-Driven Visualizations
Authors:
Kyle Wm. Hall,
Adam J. Bradley,
Uta Hinrichs,
Samuel Huron,
Jo Wood,
Christopher Collins,
Sheelagh Carpendale
Abstract:
While previous work exists on how to conduct and disseminate insights from problem-driven visualization projects and design studies, the literature does not address how to accomplish these goals in transdisciplinary teams in ways that advance all disciplines involved. In this paper we introduce and define a new methodological paradigm we call design by immersion, which provides an alternative pers…
▽ More
While previous work exists on how to conduct and disseminate insights from problem-driven visualization projects and design studies, the literature does not address how to accomplish these goals in transdisciplinary teams in ways that advance all disciplines involved. In this paper we introduce and define a new methodological paradigm we call design by immersion, which provides an alternative perspective on problem-driven visualization work. Design by immersion embeds transdisciplinary experiences at the center of the visualization process by having visualization researchers participate in the work of the target domain (or domain experts participate in visualization research). Based on our own combined experiences of working on cross-disciplinary, problem-driven visualization projects, we present six case studies that expose the opportunities that design by immersion enables, including (1) exploring new domain-inspired visualization design spaces, (2) enriching domain understanding through personal experiences, and (3) building strong transdisciplinary relationships. Furthermore, we illustrate how the process of design by immersion opens up a diverse set of design activities that can be combined in different ways depending on the type of collaboration, project, and goals. Finally, we discuss the challenges and potential pitfalls of design by immersion.
△ Less
Submitted 17 October, 2019; v1 submitted 1 August, 2019;
originally announced August 2019.
-
OpBerg: Discovering causal sentences using optimal alignments
Authors:
Justin Wood,
Nicholas J. Matiasz,
Alcino J. Silva,
William Hsu,
Alexej Abyzov,
Wei Wang
Abstract:
The biological literature is rich with sentences that describe causal relations. Methods that automatically extract such sentences can help biologists to synthesize the literature and even discover latent relations that had not been articulated explicitly. Current methods for extracting causal sentences are based on either machine learning or a predefined database of causal terms. Machine learning…
▽ More
The biological literature is rich with sentences that describe causal relations. Methods that automatically extract such sentences can help biologists to synthesize the literature and even discover latent relations that had not been articulated explicitly. Current methods for extracting causal sentences are based on either machine learning or a predefined database of causal terms. Machine learning approaches require a large set of labeled training data and can be susceptible to noise. Methods based on predefined databases are limited by the quality of their curation and are unable to capture new concepts or mistakes in the input. We address these challenges by adapting and improving a method designed for a seemingly unrelated problem: finding alignments between genomic sequences. This paper presents a novel and outperforming method for extracting causal relations from text by aligning the part-of-speech representations of an input set with that of known causal sentences. Our experiments show that when applied to the task of finding causal sentences in biological literature, our method improves on the accuracy of other methods in a computationally efficient manner.
△ Less
Submitted 3 April, 2019;
originally announced April 2019.
-
Contact-Implicit Optimization of Locomotion Trajectories for a Quadrupedal Microrobot
Authors:
Neel Doshi,
Kaushik Jayaram,
Benjamin Goldberg,
Zachary Manchester,
Robert J. Wood,
Scott Kuindersma
Abstract:
Planning locomotion trajectories for legged microrobots is challenging because of their complex morphology, high frequency passive dynamics, and discontinuous contact interactions with their environment. Consequently, such research is often driven by time-consuming experimental methods. As an alternative, we present a framework for systematically modeling, planning, and controlling legged microrob…
▽ More
Planning locomotion trajectories for legged microrobots is challenging because of their complex morphology, high frequency passive dynamics, and discontinuous contact interactions with their environment. Consequently, such research is often driven by time-consuming experimental methods. As an alternative, we present a framework for systematically modeling, planning, and controlling legged microrobots. We develop a three-dimensional dynamic model of a 1.5 gram quadrupedal microrobot with complexity (e.g., number of degrees of freedom) similar to larger-scale legged robots. We then adapt a recently developed variational contact-implicit trajectory optimization method to generate feasible whole-body locomotion plans for this microrobot, and we demonstrate that these plans can be tracked with simple joint-space controllers. We plan and execute periodic gaits at multiple stride frequencies and on various surfaces. These gaits achieve high per-cycle velocities, including a maximum of 10.87 mm/cycle, which is 15% faster than previously measured velocities for this microrobot. Furthermore, we plan and execute a vertical jump of 9.96 mm, which is 78% of the microrobot's center-of-mass height. To the best of our knowledge, this is the first end-to-end demonstration of planning and tracking whole-body dynamic locomotion on a millimeter-scale legged microrobot.
△ Less
Submitted 25 January, 2019;
originally announced January 2019.
-
Effective Locomotion at Multiple Stride Frequencies Using Proprioceptive Feedback on a Legged Microrobot
Authors:
Neel Doshi,
Kaushik Jayaram,
Samantha Castellanos,
Scott Kuindersma,
Robert J Wood
Abstract:
Limitations in actuation, sensing, and computation have forced small legged robots to rely on carefully tuned, mechanically mediated leg trajectories for effective locomotion. Recent advances in manufacturing, however, have enabled the development of small legged robots capable of operation at multiple stride frequencies using multi-degree-of-freedom leg trajectories. Proprioceptive sensing and co…
▽ More
Limitations in actuation, sensing, and computation have forced small legged robots to rely on carefully tuned, mechanically mediated leg trajectories for effective locomotion. Recent advances in manufacturing, however, have enabled the development of small legged robots capable of operation at multiple stride frequencies using multi-degree-of-freedom leg trajectories. Proprioceptive sensing and control is key to extending the capabilities of these robots to a broad range of operating conditions. In this work, we use concomitant sensing for piezoelectric actuation with a computationally efficient framework for estimation and control of leg trajectories on a quadrupedal microrobot. We demonstrate accurate position estimation (< 16% root-mean-square error) and control (16% root-mean-square tracking error) during locomotion across a wide range of stride frequencies (10-50 Hz). This capability enables the exploration of two bioinspired parametric leg trajectories designed to reduce leg slip and increase locomotion performance (e.g., speed, cost-of-transport, etc.). Using this approach, we demonstrate high performance locomotion at stride frequencies of (10-30 Hz) where the robot's natural dynamics result in poor open-loop locomotion. Furthermore, we validate the biological hypotheses that inspired the our trajectories and identify regions of highly dynamic locomotion, low cost-of-transport (3.33), and minimal leg slippage (< 10%).
△ Less
Submitted 3 June, 2019; v1 submitted 24 January, 2019;
originally announced January 2019.
-
DeepWeeds: A Multiclass Weed Species Image Dataset for Deep Learning
Authors:
Alex Olsen,
Dmitry A. Konovalov,
Bronson Philippa,
Peter Ridd,
Jake C. Wood,
Jamie Johns,
Wesley Banks,
Benjamin Girgenti,
Owen Kenny,
James Whinney,
Brendan Calvert,
Mostafa Rahimi Azghadi,
Ronald D. White
Abstract:
Robotic weed control has seen increased research of late with its potential for boosting productivity in agriculture. Majority of works focus on developing robotics for croplands, ignoring the weed management problems facing rangeland stock farmers. Perhaps the greatest obstacle to widespread uptake of robotic weed control is the robust classification of weed species in their natural environment.…
▽ More
Robotic weed control has seen increased research of late with its potential for boosting productivity in agriculture. Majority of works focus on developing robotics for croplands, ignoring the weed management problems facing rangeland stock farmers. Perhaps the greatest obstacle to widespread uptake of robotic weed control is the robust classification of weed species in their natural environment. The unparalleled successes of deep learning make it an ideal candidate for recognising various weed species in the complex rangeland environment. This work contributes the first large, public, multiclass image dataset of weed species from the Australian rangelands; allowing for the development of robust classification methods to make robotic weed control viable. The DeepWeeds dataset consists of 17,509 labelled images of eight nationally significant weed species native to eight locations across northern Australia. This paper presents a baseline for classification performance on the dataset using the benchmark deep learning models, Inception-v3 and ResNet-50. These models achieved an average classification accuracy of 95.1% and 95.7%, respectively. We also demonstrate real time performance of the ResNet-50 architecture, with an average inference time of 53.4 ms per image. These strong results bode well for future field implementation of robotic weed control methods in the Australian rangelands.
△ Less
Submitted 14 February, 2019; v1 submitted 9 October, 2018;
originally announced October 2018.
-
Qiskit Backend Specifications for OpenQASM and OpenPulse Experiments
Authors:
David C. McKay,
Thomas Alexander,
Luciano Bello,
Michael J. Biercuk,
Lev Bishop,
Jiayin Chen,
Jerry M. Chow,
Antonio D. Córcoles,
Daniel Egger,
Stefan Filipp,
Juan Gomez,
Michael Hush,
Ali Javadi-Abhari,
Diego Moreda,
Paul Nation,
Brent Paulovicks,
Erick Winston,
Christopher J. Wood,
James Wootton,
Jay M. Gambetta
Abstract:
As interest in quantum computing grows, there is a pressing need for standardized API's so that algorithm designers, circuit designers, and physicists can be provided a common reference frame for designing, executing, and optimizing experiments. There is also a need for a language specification that goes beyond gates and allows users to specify the time dynamics of a quantum experiment and recover…
▽ More
As interest in quantum computing grows, there is a pressing need for standardized API's so that algorithm designers, circuit designers, and physicists can be provided a common reference frame for designing, executing, and optimizing experiments. There is also a need for a language specification that goes beyond gates and allows users to specify the time dynamics of a quantum experiment and recover the time dynamics of the output. In this document we provide a specification for a common interface to backends (simulators and experiments) and a standarized data structure (Qobj --- quantum object) for sending experiments to those backends via Qiskit. We also introduce OpenPulse, a language for specifying pulse level control (i.e. control of the continuous time dynamics) of a general quantum device independent of the specific hardware implementation.
△ Less
Submitted 10 September, 2018;
originally announced September 2018.
-
Learning Deep Similarity Metric for 3D MR-TRUS Registration
Authors:
Grant Haskins,
Jochen Kruecker,
Uwe Kruger,
Sheng Xu,
Peter A. Pinto,
Brad J. Wood,
Pingkun Yan
Abstract:
Purpose: The fusion of transrectal ultrasound (TRUS) and magnetic resonance (MR) images for guiding targeted prostate biopsy has significantly improved the biopsy yield of aggressive cancers. A key component of MR-TRUS fusion is image registration. However, it is very challenging to obtain a robust automatic MR-TRUS registration due to the large appearance difference between the two imaging modali…
▽ More
Purpose: The fusion of transrectal ultrasound (TRUS) and magnetic resonance (MR) images for guiding targeted prostate biopsy has significantly improved the biopsy yield of aggressive cancers. A key component of MR-TRUS fusion is image registration. However, it is very challenging to obtain a robust automatic MR-TRUS registration due to the large appearance difference between the two imaging modalities. The work presented in this paper aims to tackle this problem by addressing two challenges: (i) the definition of a suitable similarity metric and (ii) the determination of a suitable optimization strategy.
Methods: This work proposes the use of a deep convolutional neural network to learn a similarity metric for MR-TRUS registration. We also use a composite optimization strategy that explores the solution space in order to search for a suitable initialization for the second-order optimization of the learned metric. Further, a multi-pass approach is used in order to smooth the metric for optimization.
Results: The learned similarity metric outperforms the classical mutual information and also the state-of-the-art MIND feature based methods. The results indicate that the overall registration framework has a large capture range. The proposed deep similarity metric based approach obtained a mean TRE of 3.86mm (with an initial TRE of 16mm) for this challenging problem.
Conclusion: A similarity metric that is learned using a deep neural network can be used to assess the quality of any given image registration and can be used in conjunction with the aforementioned optimization framework to perform automatic registration that is robust to poor initialization.
△ Less
Submitted 15 October, 2018; v1 submitted 12 June, 2018;
originally announced June 2018.
-
Adversarial Image Registration with Application for MR and TRUS Image Fusion
Authors:
Pingkun Yan,
Sheng Xu,
Ardeshir R. Rastinehad,
Brad J. Wood
Abstract:
Robust and accurate alignment of multimodal medical images is a very challenging task, which however is very useful for many clinical applications. For example, magnetic resonance (MR) and transrectal ultrasound (TRUS) image registration is a critical component in MR-TRUS fusion guided prostate interventions. However, due to the huge difference between the image appearances and the large variation…
▽ More
Robust and accurate alignment of multimodal medical images is a very challenging task, which however is very useful for many clinical applications. For example, magnetic resonance (MR) and transrectal ultrasound (TRUS) image registration is a critical component in MR-TRUS fusion guided prostate interventions. However, due to the huge difference between the image appearances and the large variation in image correspondence, MR-TRUS image registration is a very challenging problem. In this paper, an adversarial image registration (AIR) framework is proposed. By training two deep neural networks simultaneously, one being a generator and the other being a discriminator, we can obtain not only a network for image registration, but also a metric network which can help evaluate the quality of image registration. The developed AIR-net is then evaluated using clinical datasets acquired through image-fusion guided prostate biopsy procedures and promising results are demonstrated.
△ Less
Submitted 1 October, 2018; v1 submitted 29 April, 2018;
originally announced April 2018.
-
Abstract: Probabilistic Prognostic Estimates of Survival in Metastatic Cancer Patients
Authors:
Imon Banerjee,
Michael Francis Gensheimer,
Douglas J. Wood,
Solomon Henry,
Daniel Chang,
Daniel L. Rubin
Abstract:
We propose a deep learning model - Probabilistic Prognostic Estimates of Survival in Metastatic Cancer Patients (PPES-Met) for estimating short-term life expectancy (3 months) of the patients by analyzing free-text clinical notes in the electronic medical record, while maintaining the temporal visit sequence. In a single framework, we integrated semantic data mapping and neural embedding technique…
▽ More
We propose a deep learning model - Probabilistic Prognostic Estimates of Survival in Metastatic Cancer Patients (PPES-Met) for estimating short-term life expectancy (3 months) of the patients by analyzing free-text clinical notes in the electronic medical record, while maintaining the temporal visit sequence. In a single framework, we integrated semantic data mapping and neural embedding technique to produce a text processing method that extracts relevant information from heterogeneous types of clinical notes in an unsupervised manner, and we designed a recurrent neural network to model the temporal dependency of the patient visits. The model was trained on a large dataset (10,293 patients) and validated on a separated dataset (1818 patients). Our method achieved an area under the ROC curve (AUC) of 0.89. To provide explain-ability, we developed an interactive graphical tool that may improve physician understanding of the basis for the model's predictions. The high accuracy and explain-ability of the PPES-Met model may enable our model to be used as a decision support tool to personalize metastatic cancer treatment and provide valuable assistance to the physicians.
△ Less
Submitted 13 July, 2018; v1 submitted 9 January, 2018;
originally announced January 2018.
-
The Extension Theorem for Bi-invariant Weights over Frobenius Rings and Frobenius Bimodules
Authors:
Oliver W. Gnilke,
Marcus Greferath,
Thomas Honold,
Jay A. Wood,
Jens Zumbrägel
Abstract:
We give a sufficient condition for a bi-invariant weight on a Frobenius bimodule to satisfy the extension property. This condition applies to bi-invariant weights on a finite Frobenius ring as a special case. The complex-valued functions on a Frobenius bimodule are viewed as a module over the semigroup ring of the multiplicative semigroup of the coefficient ring.
We give a sufficient condition for a bi-invariant weight on a Frobenius bimodule to satisfy the extension property. This condition applies to bi-invariant weights on a finite Frobenius ring as a special case. The complex-valued functions on a Frobenius bimodule are viewed as a module over the semigroup ring of the multiplicative semigroup of the coefficient ring.
△ Less
Submitted 27 November, 2017;
originally announced November 2017.
-
Aztec: A Platform to Render Biomedical Software Findable, Accessible, Interoperable, and Reusable
Authors:
Wei Wang,
Brian Bleakley,
Chelsea Ju,
Vincent Kyi,
Patrick Tan,
Howard Choi,
Xinxin Huang,
Yichao Zhou,
Justin Wood,
Ding Wang,
Alex Bui,
Peipei Ping
Abstract:
Precision medicine and health requires the characterization and phenotyping of biological systems and patient datasets using a variety of data formats. This scenario mandates the centralization of various tools and resources in a unified platform to render them Findable, Accessible, Interoperable, and Reusable (FAIR Principles). Leveraging these principles, Aztec provides the scientific community…
▽ More
Precision medicine and health requires the characterization and phenotyping of biological systems and patient datasets using a variety of data formats. This scenario mandates the centralization of various tools and resources in a unified platform to render them Findable, Accessible, Interoperable, and Reusable (FAIR Principles). Leveraging these principles, Aztec provides the scientific community with a new platform that promotes a long-term, sustainable ecosystem of biomedical research software. Aztec is available at https://aztec.bio and its source code is hosted at https://github.com/BD2K-Aztec.
△ Less
Submitted 19 June, 2017;
originally announced June 2017.
-
Gaze2Segment: A Pilot Study for Integrating Eye-Tracking Technology into Medical Image Segmentation
Authors:
Naji Khosravan,
Haydar Celik,
Baris Turkbey,
Ruida Cheng,
Evan McCreedy,
Matthew McAuliffe,
Sandra Bednarova,
Elizabeth Jones,
Xinjian Chen,
Peter L. Choyke,
Bradford J. Wood,
Ulas Bagci
Abstract:
This study introduced a novel system, called Gaze2Segment, integrating biological and computer vision techniques to support radiologists' reading experience with an automatic image segmentation task. During diagnostic assessment of lung CT scans, the radiologists' gaze information were used to create a visual attention map. This map was then combined with a computer-derived saliency map, extracted…
▽ More
This study introduced a novel system, called Gaze2Segment, integrating biological and computer vision techniques to support radiologists' reading experience with an automatic image segmentation task. During diagnostic assessment of lung CT scans, the radiologists' gaze information were used to create a visual attention map. This map was then combined with a computer-derived saliency map, extracted from the gray-scale CT images. The visual attention map was used as an input for indicating roughly the location of a object of interest. With computer-derived saliency information, on the other hand, we aimed at finding foreground and background cues for the object of interest. At the final step, these cues were used to initiate a seed-based delineation process. Segmentation accuracy of the proposed Gaze2Segment was found to be 86% with dice similarity coefficient and 1.45 mm with Hausdorff distance. To the best of our knowledge, Gaze2Segment is the first true integration of eye-tracking technology into a medical image segmentation task without the need for any further user-interaction.
△ Less
Submitted 10 August, 2016;
originally announced August 2016.
-
Source-LDA: Enhancing probabilistic topic models using prior knowledge sources
Authors:
Justin Wood,
Patrick Tan,
Wei Wang,
Corey Arnold
Abstract:
A popular approach to topic modeling involves extracting co-occurring n-grams of a corpus into semantic themes. The set of n-grams in a theme represents an underlying topic, but most topic modeling approaches are not able to label these sets of words with a single n-gram. Such labels are useful for topic identification in summarization systems. This paper introduces a novel approach to labeling a…
▽ More
A popular approach to topic modeling involves extracting co-occurring n-grams of a corpus into semantic themes. The set of n-grams in a theme represents an underlying topic, but most topic modeling approaches are not able to label these sets of words with a single n-gram. Such labels are useful for topic identification in summarization systems. This paper introduces a novel approach to labeling a group of n-grams comprising an individual topic. The approach taken is to complement the existing topic distributions over words with a known distribution based on a predefined set of topics. This is done by integrating existing labeled knowledge sources representing known potential topics into the probabilistic topic model. These knowledge sources are translated into a distribution and used to set the hyperparameters of the Dirichlet generated distribution over words. In the inference these modified distributions guide the convergence of the latent topics to conform with the complementary distributions. This approach ensures that the topic inference process is consistent with existing knowledge. The label assignment from the complementary knowledge sources are then transferred to the latent topics of the corpus. The results show both accurate label assignment to topics as well as improved topic generation than those obtained using various labeling approaches based off Latent Dirichlet allocation (LDA).
△ Less
Submitted 17 May, 2017; v1 submitted 2 June, 2016;
originally announced June 2016.
-
MacWilliams' Extension Theorem for Bi-Invariant Weights over Finite Principal Ideal Rings
Authors:
Marcus Greferath,
Thomas Honold,
Cathy Mc Fadden,
Jay A. Wood,
Jens Zumbrägel
Abstract:
A finite ring R and a weight w on R satisfy the Extension Property if every R-linear w-isometry between two R-linear codes in R^n extends to a monomial transformation of R^n that preserves w. MacWilliams proved that finite fields with the Hamming weight satisfy the Extension Property. It is known that finite Frobenius rings with either the Hamming weight or the homogeneous weight satisfy the Exten…
▽ More
A finite ring R and a weight w on R satisfy the Extension Property if every R-linear w-isometry between two R-linear codes in R^n extends to a monomial transformation of R^n that preserves w. MacWilliams proved that finite fields with the Hamming weight satisfy the Extension Property. It is known that finite Frobenius rings with either the Hamming weight or the homogeneous weight satisfy the Extension Property. Conversely, if a finite ring with the Hamming or homogeneous weight satisfies the Extension Property, then the ring is Frobenius.
This paper addresses the question of a characterization of all bi-invariant weights on a finite ring that satisfy the Extension Property. Having solved this question in previous papers for all direct products of finite chain rings and for matrix rings, we have now arrived at a characterization of these weights for finite principal ideal rings, which form a large subclass of the finite Frobenius rings. We do not assume commutativity of the rings in question.
△ Less
Submitted 12 September, 2013;
originally announced September 2013.
-
Framing Human-Robot Task Communication as a POMDP
Authors:
Mark P. Woodward,
Robert J. Wood
Abstract:
As general purpose robots become more capable, pre-programming of all tasks at the factory will become less practical. We would like for non-technical human owners to be able to communicate, through interaction with their robot, the details of a new task; we call this interaction "task communication". During task communication the robot must infer the details of the task from unstructured human si…
▽ More
As general purpose robots become more capable, pre-programming of all tasks at the factory will become less practical. We would like for non-technical human owners to be able to communicate, through interaction with their robot, the details of a new task; we call this interaction "task communication". During task communication the robot must infer the details of the task from unstructured human signals and it must choose actions that facilitate this inference. In this paper we propose the use of a partially observable Markov decision process (POMDP) for representing the task communication problem; with the unobservable task details and unobservable intentions of the human teacher captured in the state, with all signals from the human represented as observations, and with the cost function chosen to penalize uncertainty. We work through an example representation of task communication as a POMDP, and present results from a user experiment on an interactive virtual robot, compared with a human controlled virtual robot, for a task involving a single object movement and binary approval input from the teacher. The results suggest that the proposed POMDP representation produces robots that are robust to teacher error, that can accurately infer task details, and that are perceived to be intelligent.
△ Less
Submitted 1 April, 2012;
originally announced April 2012.