Author: Hazel King

scholar

RP1M: A Large-Scale Motion Dataset for Piano Playing with Bi-Manual Dexterous Robot Hands

By Hazel King, August 26, 2024

Authors: Yi Zhao, Le Chen, Jan Schneider, Quankai Gao, Juho Kannala, Bernhard Schölkopf, Joni Pajarinen, Dieter Büchler

Paper: https://arxiv.org/abs/2408.11048

Introduction

Empowering robots with human-level dexterity has been a long-standing challenge in robotics. This challenge becomes even more complex when considering tasks that require both dynamic and manipulation skills, such as piano playing. Robot piano playing involves coordinating multiple fingers to press keys accurately and dynamically, making it a high-dimensional control task. While reinforcement learning (RL) has shown promise in single-task performance, it often struggles in multi-task settings. This paper introduces the Robot Piano 1 Million (RP1M) dataset, which aims to bridge this gap by enabling imitation learning…


Exploiting Large Language Models Capabilities for Question Answer-Driven Knowledge Graph Completion Across Static and Temporal Domains

By Hazel King, August 26, 2024

Authors: Rui Yang, Jiahao Zhu, Jianping Man, Li Fang, Yi Zhou

Paper: https://arxiv.org/abs/2408.10819

Introduction

Knowledge Graphs (KGs) are structured semantic knowledge bases that organize entities, concepts, attributes, and their relationships in a graph format. They are pivotal in various applications such as semantic search, recommendation systems, and natural language processing (NLP). However, due to limited annotation resources and technical constraints, existing KGs often have missing key entities or relationships, limiting their functionality. Knowledge Graph Completion (KGC) aims to infer and fill in these missing entities and relationships, thereby enhancing the value and effectiveness of KGs. Traditional KGC methods, such as link prediction and instance…


Just a Hint: Point-Supervised Camouflaged Object Detection

By Hazel King, August 26, 2024

Authors: Huafeng Chen, Dian Shao, Guangqian Guo, Shan Gao

Paper: https://arxiv.org/abs/2408.10777

Introduction

Camouflaged Object Detection (COD) is a challenging computer vision task: identifying objects that blend seamlessly into their surroundings. The task is difficult not only for models but also for human annotators, because it requires precise pixel-wise annotations, which are labor-intensive and time-consuming. Traditional methods demand extensive annotation effort, often taking up to 60 minutes per image. To address this issue, the authors propose a novel approach that leverages point-based supervision, significantly reducing the annotation burden while maintaining high detection accuracy.

Related Work

Weakly Supervised Camouflaged Detection

Recent research…


Probing the Safety Response Boundary of Large Language Models via Unsafe Decoding Path Generation

By Hazel King, August 26, 2024

Authors: Haoyu Wang, Bingzhe Wu, Yatao Bian, Yongzhe Chang, Xueqian Wang, Peilin Zhao

Paper: https://arxiv.org/abs/2408.10668

Introduction

The rapid advancement of Large Language Models (LLMs) such as GPT-4 has significantly impacted various aspects of daily life, providing intelligent assistance in numerous domains. However, alongside these benefits, there are growing concerns about the potential misuse of these models, particularly their ability to generate harmful or dangerous content. Ensuring the safety and reliability of LLMs is crucial for fostering public trust and promoting the responsible use of AI technology. This study addresses the hidden vulnerabilities in LLMs that may persist even after safety alignment. The authors propose a…


XCB: an effective contextual biasing approach to bias cross-lingual phrases in speech recognition

By Hazel King, August 26, 2024

Authors: Xucheng Wan, Naijun Zheng, Kai Liu, Huan Zhou

Paper: https://arxiv.org/abs/2408.10524

Introduction

In recent years, End-to-End (E2E) Automatic Speech Recognition (ASR) models have made significant strides in improving speech recognition accuracy. Models such as Transformer, Transducer, and Conformer have set new benchmarks in various speech recognition tasks. However, these models often struggle with recognizing rare words, such as jargon or unique named entities, especially in real-world applications. One popular solution to this problem is contextualized ASR, which integrates contextual information from a predefined list of rare words to enhance recognition…


Weakly Supervised Pretraining and Multi-Annotator Supervised Finetuning for Facial Wrinkle Detection

By Hazel King, August 23, 2024

Authors: Ik Jun Moon, Junho Moon, Ikbeom Jang

Paper: https://arxiv.org/abs/2408.09952

Introduction

With the increasing interest in skin diseases and aesthetics, the ability to predict and analyze facial wrinkles has become crucial. Facial wrinkles are significant indicators of aging and can be useful in assessing skin conditions, skin care, and early diagnosis of skin diseases. However, manually analyzing extensive collections of images for facial wrinkles is resource-intensive and subjective, leading to variability in research findings. To address these challenges, this study proposes a deep learning-based approach to automatically segment facial wrinkles. By combining wrinkle data labeled by multiple annotators and leveraging transfer learning,…


SLCA++: Unleash the Power of Sequential Fine-tuning for Continual Learning with Pre-training

By Hazel King, August 21, 2024

Authors: Gengwei Zhang, Liyuan Wang, Guoliang Kang, Ling Chen, Yunchao Wei

Paper: https://arxiv.org/abs/2408.08295

Unleashing the Power of Sequential Fine-tuning for Continual Learning with Pre-training: An In-depth Look at SLCA++

Continual learning (CL) has long been a challenging problem in machine learning, primarily due to the issue of catastrophic forgetting. The advent of pre-trained models (PTMs) has revolutionized this field, offering new avenues for knowledge transfer and robustness. However, the progressive overfitting of pre-trained knowledge into specific downstream tasks remains a significant hurdle. This blog delves into the paper “SLCA++: Unleash the Power of Sequential Fine-tuning for Continual Learning with Pre-training,” which introduces the Slow…

