Author: Nora Brooks

scholar

Accelerating Goal-Conditioned RL Algorithms and Research

By Nora Brooks, August 26, 2024

Authors: Michał Bortkiewicz, Władek Pałucki, Vivek Myers, Tadeusz Dziarmaga, Tomasz Arczewski, Łukasz Kuciński, Benjamin Eysenbach

Paper: https://arxiv.org/abs/2408.11052

Introduction

Self-supervised learning has revolutionized various domains within machine learning, such as natural language processing and computer vision. However, its application in reinforcement learning (RL) has not seen similar success. This paper addresses the challenges faced by self-supervised goal-conditioned reinforcement learning (GCRL) methods, particularly the lack of data from slow environments and unstable algorithms. The authors introduce JaxGCRL, a high-performance codebase and benchmark for self-supervised GCRL, which enables researchers to train agents for millions of environment steps in minutes on a single GPU. This paper aims to provide a…


Decoding Human Emotions: Analyzing Multi-Channel EEG Data using LSTM Networks

By Nora Brooks, August 26, 2024

Authors: Shyam K Sateesh, Sparsh BK, Uma D

Paper: https://arxiv.org/abs/2408.10328

Introduction

Emotion recognition from electroencephalogram (EEG) signals is a burgeoning field, particularly in neuroscience and Human-Computer Interaction (HCI). EEG signals provide a descriptive temporal view of brain activity, making them indispensable for understanding complex human emotional states. This study aims to enhance the predictive accuracy of emotional state classification by applying Long Short-Term Memory (LSTM) networks to analyze EEG signals. Using the DEAP dataset, which contains multi-channel EEG recordings, the study leverages LSTM networks’ ability to handle temporal dependencies within EEG data. The results demonstrate significant improvements in emotion recognition, achieving accuracies…


OpenCity: Open Spatio-Temporal Foundation Models for Traffic Prediction

By Nora Brooks, August 26, 2024

Authors: Zhonghang Li, Long Xia, Lei Shi, Yong Xu, Dawei Yin, Chao Huang

Paper: https://arxiv.org/abs/2408.10269

Introduction

Urban transportation systems are the backbone of modern cities, facilitating the movement of people and goods. Accurate traffic forecasting is essential for effective urban planning and transportation management, enabling efficient resource allocation and enhanced travel experiences. However, existing traffic prediction models often struggle with generalization, particularly in zero-shot prediction scenarios for unseen regions and cities, and long-term forecasting. This is due to the inherent challenges in handling the spatial and temporal heterogeneity of traffic data and significant distribution shifts across time and space. In this study, we introduce OpenCity,…


Fine-Tuning and Deploying Large Language Models Over Edges: Issues and Approaches

By Nora Brooks, August 26, 2024

Authors: Yanjie Dong, Xiaoyi Fan, Fangxin Wang, Chengming Li, Victor C. M. Leung, Xiping Hu

Paper: https://arxiv.org/abs/2408.10691

Introduction

Since the introduction of GPT-2 in 2019, large language models (LLMs) have evolved from specialized tools to versatile foundation models. These models exhibit impressive zero-shot capabilities, enabling them to perform tasks such as text generation, machine translation, and question answering without specific training for those tasks. However, fine-tuning these models on local datasets and deploying them efficiently remain significant challenges due to their substantial computational and storage requirements. Traditional fine-tuning techniques that use first-order optimizers demand substantial GPU memory, often exceeding the capacity of mainstream hardware.…


SpaRP: Fast 3D Object Reconstruction and Pose Estimation from Sparse Views

By Nora Brooks, August 23, 2024

Authors: Chao Xu, Ang Li, Linghao Chen, Yulin Liu, Ruoxi Shi, Hao Su, Minghua Liu

Paper: https://arxiv.org/abs/2408.10195

Introduction

3D object reconstruction is a critical task with applications in various fields such as augmented reality, virtual reality, and robotics. Traditional methods often require dense view inputs, which are not always feasible in practical scenarios. Recent advancements in single-image-to-3D methods have shown promise but often lack controllability and produce hallucinated regions that may not align with user expectations. This paper introduces SpaRP, a novel method designed to reconstruct 3D textured meshes and estimate camera poses from sparse, unposed 2D images. SpaRP leverages 2D diffusion models to infer 3D…


3D-Aware Instance Segmentation and Tracking in Egocentric Videos

By Nora Brooks, August 23, 2024

Authors: Yash Bhalgat, Vadim Tschernezki, Iro Laina, João F. Henriques, Andrea Vedaldi, Andrew Zisserman

Paper: https://arxiv.org/abs/2408.09860

Introduction

Egocentric videos, which capture the world from a first-person perspective, are gaining significant attention in computer vision due to their applications in augmented reality, robotics, and more. However, these videos present unique challenges for 3D scene understanding, including rapid camera motion, frequent object occlusions, and limited object visibility. Traditional 2D video object segmentation (VOS) methods struggle with these challenges, often resulting in fragmented and incomplete object tracks. This paper introduces a novel approach to instance segmentation and tracking in egocentric videos that leverages 3D awareness to overcome these…


Paired Completion: Flexible Quantification of Issue-framing at Scale with LLMs

By Nora Brooks, August 23, 2024

Authors: Simon D Angus、


ExpoMamba: Exploiting Frequency SSM Blocks for Efficient and Effective Image Enhancement

By Nora Brooks, August 23, 2024

Authors: Eashan Adhikarla, Kai Zhang, John Nicholson, Brian D. Davison

Paper: https://arxiv.org/abs/2408.09650

Introduction

Low-light image enhancement is a critical task in computer vision, with applications ranging from consumer gadgets like phone cameras to sophisticated surveillance systems. Traditional techniques often struggle to balance processing speed and high-quality results, especially with high-resolution images. This leads to issues like noise and color distortion in scenarios requiring quick processing, such as mobile photography and real-time video streaming. Recent advancements in foundation models, such as transformers and diffusion models, have shown promise in various domains, including low-light image enhancement. However, these models are often limited by their computational…


MAT-SED: A Masked Audio Transformer with Masked-Reconstruction Based Pre-training for Sound Event Detection

By Nora Brooks, August 22, 2024

Authors: Pengfei Cai, Yan Song, Kang Li, Haoyu Song, Ian McLoughlin

Paper: https://arxiv.org/abs/2408.08673

Introduction

Sound event detection (SED) aims to identify not only the types of events occurring in an audio signal but also their temporal locations. This technology has garnered significant interest due to its applications in smart homes, smart cities, and surveillance systems. Traditional SED systems often rely on a combination of convolutional neural networks (CNNs) for feature extraction and recurrent neural networks (RNNs) for modeling temporal dependencies. However, the scarcity of labeled data poses a significant challenge for these systems. Recent advancements have seen the rise of Transformer-based SED models, inspired…


CodeMirage: Hallucinations in Code Generated by Large Language Models

By Nora Brooks, August 21, 2024

Authors: Vibhor Agarwal, Yulong Pei, Salwa Alamir, Xiaomo Liu

Paper: https://arxiv.org/abs/2408.08333

Introduction

Large Language Models (LLMs) have demonstrated significant capabilities in natural language generation and program generation. However, these models are prone to generating hallucinations: text that sounds plausible but is incorrect. This phenomenon is not limited to natural language but extends to code generation as well. The generated code can contain syntactical or logical errors, security vulnerabilities, memory leaks, and other issues. Given the increasing adoption of LLMs in code generation, it is crucial to investigate these hallucinations. This paper introduces the concept of code hallucinations, provides a comprehensive taxonomy of hallucination types,…

