Author: Tom Johnson

A Disease-Specific Foundation Model Using Over 100K Fundus Images: Release and Validation for Abnormality and Multi-Disease Classification on Downstream Tasks

By Tom Johnson, August 22, 2024

Authors: Boa Jang, Youngbin Ahn, Eun Kyung Choe, Chang Ki Yoon, Hyuk Jin Choi, Young-Gon Kim

Paper: https://arxiv.org/abs/2408.08790

Introduction

Vision is a critical aspect of quality of life, especially as people age. Common eye diseases such as age-related macular degeneration (AMD), glaucoma, diabetic retinopathy (DR), retinal vein occlusion (RVO), pathologic myopia (PM), and epiretinal membrane (ERM) can lead to blindness if not diagnosed and treated early. With the increasing prevalence of these conditions and a projected shortage of ophthalmologists by 2035, there is a pressing need for efficient, accessible screening and diagnostic systems. Advancements in fundus imaging and artificial intelligence (AI) have paved the way…

LLM-PCGC: Large Language Model-based Point Cloud Geometry Compression

By Tom Johnson, August 21, 2024

Authors: Yuqi Ye, Wei Gao

Paper: https://arxiv.org/abs/2408.08682

Introduction

Point cloud data is essential for applications like autonomous driving and virtual reality. The challenge lies in compressing this data efficiently while preserving its intricate 3D structure. Traditional methods, both voxel-based and tree-based, have limitations in context modeling due to constraints in data volume and model size. This paper introduces a novel approach, leveraging the capabilities of large language models (LLMs) for point cloud geometry compression (PCGC).

Framework Overview

Encoding Pipeline

The encoding process begins with clustering the input 3D point clouds. Each cluster undergoes several steps:

Normalization: Coordinates are normalized by subtracting…

SER Evals: In-domain and Out-of-domain Benchmarking for Speech Emotion Recognition

By Tom Johnson, August 21, 2024

Authors: Mohamed Osman, Daniel Z. Kaplan, Tamer Nadeem

Paper: https://arxiv.org/abs/2408.07851

Introduction

Speech Emotion Recognition (SER) has become a pivotal area of research due to its potential to enhance human-computer interaction by making it more natural and empathetic. Recent advancements in self-supervised learning (SSL) have led to the development of powerful speech representation models such as wav2vec2, HuBERT, and WavLM. Despite their impressive performance on various speech processing tasks, these models face significant challenges in generalizing across diverse languages and emotional expressions. Existing SER benchmarks often focus on a limited set of well-studied datasets, which may not accurately reflect real-world scenarios. Moreover, the…

An Introduction to Reinforcement Learning: Fundamental Concepts and Practical Applications

By Tom Johnson, August 21, 2024

Authors: Majid Ghasemi, Amir Hossein Moosavi, Ibrahim Sorkhoh, Anjali Agrawal, Fadi Alzhouri, Dariush Ebrahimi

Paper: https://arxiv.org/abs/2408.07712

Introduction

Reinforcement Learning (RL) is a branch of Artificial Intelligence (AI) that focuses on training agents to make decisions by interacting with their environment to maximize cumulative rewards. Unlike supervised learning, where an agent learns from labeled examples, or unsupervised learning, which is based on detecting patterns in the data, RL involves an autonomous agent that must make intuitive decisions and learn from its actions. The key idea is to learn how the world works (e.g., what action gets a reward and which does not) to maximize cumulative rewards…

Chain-of-Strategy Planning with LLMs: Aligning the Generation of Psychotherapy Dialogue with Strategy in Motivational Interviewing

By Tom Johnson, August 16, 2024

Authors: Xin Sun, Xiao Tang, Abdallah El Ali, Zhuying Li, Xiaoyu Shen, Pengjie Ren, Jan de Wit, Jiahuan Pei, Jos A. Bosch

Paper: https://arxiv.org/abs/2408.06527

Introduction

Motivational Interviewing (MI) is a client-centered counseling technique designed to encourage individuals to change behaviors. It enhances intrinsic motivation and collaboration between therapists and clients by addressing ambivalence and boosting self-efficacy. Traditional MI chatbots rely on expert-written scripts, which can be rigid and lack diversity. This paper explores the use of Large Language Models (LLMs) to generate MI dialogues that align with therapeutic strategies, aiming for controllable and explainable generation in psychotherapy.

Related Work

NLG in Motivational Interviewing

Natural Language Generation (NLG) in MI…

Robust Deep Reinforcement Learning for Inverter-based Volt-Var Control in Partially Observable Distribution Networks

By Tom Johnson, August 16, 2024

Authors: Qiong Liu, Ye Guo, Tong Xu

Paper: https://arxiv.org/abs/2408.06776

Introduction

As the world moves towards a carbon-neutral society, active distribution networks (ADNs) are increasingly incorporating renewable distributed generators (DGs) such as photovoltaic (PV) systems and wind turbines. These DGs, while beneficial, introduce volatility and uncertainty in power generation, leading to challenges like voltage violations and increased power loss. Inverter-based devices, capable of fast reactive control, present an opportunity for real-time volt-var control (VVC) to optimize voltage profiles and minimize power loss in ADNs. This paper addresses the challenges of limited measurement deployment in ADNs using a robust deep reinforcement learning (DRL) approach.…
