Author: Tom Johnson
Authors: Boa Jang, Youngbin Ahn, Eun Kyung Choe, Chang Ki Yoon, Hyuk Jin Choi, Young-Gon Kim
Paper: https://arxiv.org/abs/2408.08790
Introduction
Vision is a critical aspect of quality of life, especially as people age. Common eye diseases such as age-related macular degeneration (AMD), glaucoma, diabetic retinopathy (DR), retinal vein occlusion (RVO), pathologic myopia (PM), and epiretinal membrane (ERM) can lead to blindness if not diagnosed and treated early. With the increasing prevalence of these conditions and a projected shortage of ophthalmologists by 2035, there is a pressing need for efficient, accessible screening and diagnostic systems. Advancements in fundus imaging and artificial intelligence (AI) have paved the way…
Authors: Yuqi Ye, Wei Gao
Paper: https://arxiv.org/abs/2408.08682
Introduction
Point cloud data is essential for applications like autonomous driving and virtual reality. The challenge lies in compressing this data efficiently while preserving its intricate 3D structure. Traditional methods, both voxel-based and tree-based, have limitations in context modeling due to constraints in data volume and model size. This paper introduces a novel approach, leveraging the capabilities of large language models (LLMs) for point cloud geometry compression (PCGC).
Framework Overview
Encoding Pipeline
The encoding process begins with clustering the input 3D point clouds. Each cluster undergoes several steps:
Normalization: Coordinates are normalized by subtracting…
Authors: Mohamed Osman, Daniel Z. Kaplan, Tamer Nadeem
Paper: https://arxiv.org/abs/2408.07851
Introduction
Speech Emotion Recognition (SER) has become a pivotal area of research due to its potential to enhance human-computer interaction by making it more natural and empathetic. Recent advancements in self-supervised learning (SSL) have led to the development of powerful speech representation models such as wav2vec2, HuBERT, and WavLM. Despite their impressive performance on various speech processing tasks, these models face significant challenges in generalizing across diverse languages and emotional expressions. Existing SER benchmarks often focus on a limited set of well-studied datasets, which may not accurately reflect real-world scenarios. Moreover, the…
Authors: Majid Ghasemi, Amir Hossein Moosavi, Ibrahim Sorkhoh, Anjali Agrawal, Fadi Alzhouri, Dariush Ebrahimi
Paper: https://arxiv.org/abs/2408.07712
Introduction
Reinforcement Learning (RL) is a branch of Artificial Intelligence (AI) that focuses on training agents to make decisions by interacting with their environment to maximize cumulative rewards. Unlike supervised learning, where an agent learns from labeled examples, or unsupervised learning, which is based on detecting patterns in the data, RL involves an autonomous agent that must make intuitive decisions and learn from its actions. The key idea is to learn how the world works (e.g., which actions earn a reward and which do not) to maximize cumulative rewards…
Authors: Xin Sun, Xiao Tang, Abdallah El Ali, Zhuying Li, Xiaoyu Shen, Pengjie Ren, Jan de Wit, Jiahuan Pei, Jos A. Bosch
Paper: https://arxiv.org/abs/2408.06527
Introduction
Motivational Interviewing (MI) is a client-centered counseling technique designed to encourage individuals to change behaviors. It enhances intrinsic motivation and collaboration between therapists and clients by addressing ambivalence and boosting self-efficacy. Traditional MI chatbots rely on expert-written scripts, which can be rigid and lack diversity. This paper explores the use of Large Language Models (LLMs) to generate MI dialogues that align with therapeutic strategies, aiming for controllable and explainable generation in psychotherapy.
Related Work
NLG in Motivational Interviewing
Natural Language Generation (NLG) in MI…
Authors: Qiong Liu, Ye Guo, Tong Xu
Paper: https://arxiv.org/abs/2408.06776
Introduction
As the world moves towards a carbon-neutral society, active distribution networks (ADNs) are increasingly incorporating renewable distributed generators (DGs) such as photovoltaic (PV) systems and wind turbines. These DGs, while beneficial, introduce volatility and uncertainty in power generation, leading to challenges like voltage violations and increased power loss. Inverter-based devices, capable of fast reactive control, present an opportunity for real-time volt-var control (VVC) to optimize voltage profiles and minimize power loss in ADNs. This paper addresses the challenges of limited measurement deployment in ADNs using a robust deep reinforcement learning (DRL) approach.…