Author: Sophia Mitchell

scholar

KAN 2.0: Kolmogorov-Arnold Networks Meet Science

By Sophia Mitchell, August 23, 2024

Authors: Ziming Liu, Pingchuan Ma, Yixuan Wang, Wojciech Matusik, Max Tegmark

Paper: https://arxiv.org/abs/2408.10205

Introduction

In recent years, the intersection of artificial intelligence (AI) and science has led to significant advancements in various fields, such as protein folding prediction, automated theorem proving, and weather forecasting. However, a major challenge remains: the inherent incompatibility between AI’s connectionism and science’s symbolism. To address this, the paper “KAN 2.0: Kolmogorov-Arnold Networks Meet Science” proposes a framework that synergizes Kolmogorov-Arnold Networks (KANs) with scientific discovery. This framework aims to bridge the gap between AI and science by incorporating scientific knowledge into KANs and extracting scientific insights from them.

Related…

Rhyme-aware Chinese lyric generator based on GPT

By Sophia Mitchell, August 23, 2024

Authors: Yixiao Yuan, Yangchen Huang, Yu Ma, Xinjin Li, Zhenglin Li, Yiming Shi, Huapeng Zhou

Paper: https://arxiv.org/abs/2408.10130

Introduction

Writing lyrics is a challenging task, even for experienced lyricists. The creative process often requires inspiration, which can sometimes be elusive. This study aims to design an AI-based lyric generator to assist lyricists, particularly in generating Chinese lyrics. Existing methods for Chinese lyric generation have been found inadequate, prompting the need for modifications to improve results. This research leverages pre-trained models, specifically GPT-2, to enhance the generation of Chinese lyrics by incorporating rhyme information, which is crucial for lyrical quality.

Related…

MergeRepair: An Exploratory Study on Merging Task-Specific Adapters in Code LLMs for Automated Program Repair

By Sophia Mitchell, August 23, 2024

Authors: Meghdad Dehghan, Jie JW Wu, Fatemeh H. Fard, Ali Ouni

Paper: https://arxiv.org/abs/2408.09568

Introduction

In Software Engineering (SE), automating code-related tasks such as bug prediction, bug fixing, and code generation has been a focal point of research. Large Language Models (LLMs) have demonstrated significant potential in these areas. However, training and fine-tuning these models for each specific task can be resource-intensive and time-consuming. This study introduces MergeRepair, a framework designed to merge multiple task-specific adapters in Code LLMs, specifically for the Automated Program Repair (APR) task. The primary objective is to explore whether merging these adapters can enhance the…

Deformation-aware GAN for Medical Image Synthesis with Substantially Misaligned Pairs

By Sophia Mitchell, August 23, 2024

Authors: Bowen Xin, Tony Young, Claire E Wainwright, Tamara Blake, Leo Lebrat, Thomas Gaass, Thomas Benkert, Alto Stemmer, David Coman, Jason Dowling

Paper: https://arxiv.org/abs/2408.09432

Introduction

Medical image synthesis is a crucial technique in modern healthcare, providing additional imaging modalities that are often costly, invasive, or harmful to acquire. This technology is particularly beneficial in scenarios such as MRI-only radiotherapy dose planning or children’s airway assessment, where additional scans can be avoided. Generative Adversarial Networks (GANs) have been widely adopted for this purpose, leveraging either well-aligned imaging pairs (supervised methods) or randomly unpaired data (unsupervised methods). However, substantial misalignment between image pairs, such as lung MRI-CT pairs affected by respiratory…

Multi-Modal Dialogue State Tracking for Playing GuessWhich Game

By Sophia Mitchell, August 21, 2024

Authors: Wei Pang, Ruixue Duan, Jinfu Yang, Ning Li

Paper: https://arxiv.org/abs/2408.08431

Introduction

In the realm of vision-language tasks, the GuessWhich game stands out as a unique challenge. This game involves two bots: a Questioner Bot (QBot) and an Answer Bot (ABot). The QBot’s objective is to identify a hidden image by asking a series of questions to the ABot. While ABot has been extensively studied, research on QBot, particularly in the context of visual reasoning, remains limited. This paper addresses this gap by proposing a novel approach that enables QBot to perform visually related reasoning through a mental model of the undisclosed image.…

The Dawn of KAN in Image-to-Image (I2I) Translation: Integrating Kolmogorov-Arnold Networks with GANs for Unpaired I2I Translation

By Sophia Mitchell, August 21, 2024

Authors: Arpan Mahara, Naphtali D. Rishe, Liangdong Deng

Paper: https://arxiv.org/abs/2408.08216

Introduction

Generative AI has become a cornerstone in various research fields, including healthcare, remote sensing, physics, chemistry, and photography. Among the many methodologies, Generative Adversarial Networks (GANs) have shown remarkable success, particularly in image-to-image (I2I) translation. This study introduces the Kolmogorov-Arnold Network (KAN) as a potential replacement for the Multi-layer Perceptron (MLP) in generative AI, specifically in the subdomain of I2I translation. The proposed KAN-CUT model replaces the two-layer MLP in the existing Contrastive Unpaired Image-to-Image Translation (CUT) model with a two-layer KAN, aiming to generate more informative features for high-quality image…

WorldScribe: Towards Context-Aware Live Visual Descriptions

By Sophia Mitchell, August 16, 2024

Authors: Ruei-Che Chang, Yuxuan Liu, Anhong Guo

Paper: https://arxiv.org/abs/2408.06627

Introduction

In the realm of assistive technology, providing rich, contextual, and timely visual descriptions for blind or visually impaired (BVI) individuals has been a persistent challenge. The paper titled “WorldScribe: Towards Context-Aware Live Visual Descriptions” introduces WorldScribe, a system designed to generate automated live visual descriptions that are customizable and adaptive to users’ contexts. This blog post delves into the various chapters of the paper, explaining the system’s design, functionality, and evaluation.

Abstract

WorldScribe aims to enhance the autonomy and independence of BVI individuals by providing live visual descriptions that are:

1. Tailored…

RW-NSGCN: A Robust Approach to Structural Attacks via Negative Sampling

By Sophia Mitchell, August 16, 2024

Authors: Shuqi He, Jun Zhuang, Ding Wang, Jun Song

Paper: https://arxiv.org/abs/2408.06665

Introduction

Graph Neural Networks (GNNs) have become essential tools for node classification tasks in graph-structured networks, finding applications in various domains such as financial transactions, e-commerce, e-health systems, and real estate markets. Despite their powerful capabilities, GNNs face significant challenges due to topological vulnerabilities and weight instability in graph-structured networks. These issues can lead to decreased classification performance and model instability.

Topological Vulnerability

Topological vulnerability refers to the significant impact on model output caused by minor changes in the node connections (i.e., the topology) of graph-structured data. GNNs update node representations by…
