Author: Aiden Green

Authors: Qiyao Liang, Ziming Liu, Mitchell Ostrow, Ila Fiete
ArXiv: https://arxiv.org/abs/2408.13256

Introduction

Diffusion models have demonstrated a remarkable ability to generate realistic images by combining elements in novel ways, a phenomenon known as compositional generalization. Despite their success, the underlying mechanisms that enable these models to achieve compositionality remain poorly understood. Inspired by cognitive neuroscience, this study investigates whether and when diffusion models learn semantically meaningful and factorized representations of composable features. By training conditional denoising diffusion probabilistic models (DDPMs) on 2D Gaussian data for controlled experiments, we seek to reveal how these models encode and generalize the complexity of component features.

Related…

Authors: Sreyoshi Bhaduri, Satya Kapoor, Alex Gil, Anshul Mittal, Rutu Mulkar
Paper: https://arxiv.org/abs/2408.11043

Introduction

Background and Problem Statement

Talent management research is pivotal in understanding employee sentiment, behavior, and overall organizational dynamics. This field often relies on qualitative data collection methods such as interviews and focus groups to gather rich, context-dependent insights. However, the manual analysis of qualitative data is labor-intensive and time-consuming, posing significant challenges for researchers. Traditional statistical techniques often fall short in capturing the nuanced complexities of qualitative data, leading to potential misinterpretations. Large Language Models (LLMs) like BERT, GPT-3, and PaLM have shown promise in text summarization, classification, and information…

Authors: Yin-Jyun Luo, Kin Wai Cheuk, Woosung Choi, Toshimitsu Uesaka, Keisuke Toyama, Koichi Saito, Chieh-Hsin Lai, Yuhta Takida, Wei-Hsiang Liao, Simon Dixon, Yuki Mitsufuji
Paper: https://arxiv.org/abs/2408.10807

Introduction

Disentangled representation learning (DRL) has been a significant area of research in machine learning, focusing on capturing semantically meaningful latent features of observed data in a low-dimensional latent space. This approach has been applied to various domains, including music audio, to extract representations of timbre and pitch. However, existing work has primarily focused on single-instrument audio, leaving a gap in handling multi-instrument mixtures. To address this, the authors propose DisMix, a generative framework designed to disentangle pitch and timbre representations from mixtures of…

Authors: Huafeng Chen, Pengxu Wei, Guangqian Guo, Shan Gao
Paper: https://arxiv.org/abs/2408.10760

Introduction

Camouflaged Object Detection (COD) is a challenging task that involves identifying objects that blend seamlessly into their surroundings. Traditional COD methods rely heavily on mask annotations, which are labor-intensive and time-consuming to produce. This paper introduces SAM-COD, a novel framework designed to address the limitations of existing weakly-supervised COD methods. SAM-COD leverages the Segment Anything Model (SAM) and introduces several innovative components to improve performance under weakly-supervised settings.

Related Work

Camouflaged Object Detection

COD aims to detect objects that are visually indistinguishable from their backgrounds. Previous works like SINet and ZoomNet…

Authors: Junhao Chen, Bowen Wang, Zhouqiang Jiang, Yuta Nakashima
Paper: https://arxiv.org/abs/2408.10573

Introduction

Large Language Models (LLMs) have revolutionized the field of question answering (QA) by leveraging extensive world knowledge from vast publicly available corpora. However, the effectiveness of LLMs in QA is often compromised by the vagueness of user questions. This paper addresses this issue by introducing a single-round instance-level prompt optimization technique known as the question rewriter. By enhancing the clarity of human questions for black-box LLMs, the question rewriter significantly improves the quality of generated answers. The rewriter is optimized using direct preference optimization based on feedback from automatic criteria for…

Authors: Lun Ai, Stephen H. Muggleton
Paper: https://arxiv.org/abs/2408.10369

Boolean Matrix Logic Programming: A Detailed Exploration

Introduction

Boolean Matrix Logic Programming (BMLP) is an innovative approach to datalog query evaluation that leverages the power of boolean matrix operations. Traditional datalog query evaluations have primarily focused on symbolic computations. However, recent studies have demonstrated that matrix operations can significantly enhance the performance of datalog query evaluations. This paper introduces BMLP as a general query answering problem using boolean matrices and presents two novel BMLP modules designed for efficient bottom-up inferences on linear and non-linear recursive datalog programs.

Related Work

Bottom-Up Datalog Evaluation

Historically,…

Authors: Lulu Yu, Keping Bi, Shiyu Ni, Jiafeng Guo
Paper: https://arxiv.org/abs/2408.09817

Introduction

Background

Learning to Rank (LTR) is a critical component in many real-world systems, such as search engines and recommendation systems. Traditionally, LTR relies on human annotations to train models, where experts label the relevance of documents. However, obtaining these annotations is costly and may not always align with user preferences. Consequently, researchers have turned to implicit user feedback, such as clicks, to optimize ranking models.

Problem Statement

While implicit feedback is valuable, it is inherently biased due to factors like position bias, trust bias, and contextual bias. Unbiased Learning to Rank…

Authors: Yuan Tian, Tianyi Zhang
Paper: https://arxiv.org/abs/2408.09121

Selective Prompt Anchoring for Code Generation: A Detailed Interpretive Blog

Introduction

Background

Recent advancements in large language models (LLMs) like Copilot and ChatGPT have significantly transformed software development by automating coding tasks. These models leverage vast datasets and sophisticated algorithms to interpret natural language descriptions and generate corresponding code. Despite their impressive capabilities, LLMs still face challenges in reducing error rates and fully meeting user expectations. This study aims to address these challenges by proposing a novel approach called Selective Prompt Anchoring (SPA).

Problem Statement

The primary issue identified in this study is the…

Authors: Björn Schembera, Frank Wübbeling, Hendrik Kleikamp, Burkhard Schmidt, Aurela Shehu, Marco Reidelbach, Christine Biedinger, Jochen Fiedler, Thomas Koprucki, Dorothea Iglezakis, Dominik Göddeke
Paper: https://arxiv.org/abs/2408.10003

Introduction

In the realm of scientific research, data and knowledge-driven approaches have emerged as the fourth pillar of science. The proliferation of computer simulations, big measurement data in physics, and statistical data in social sciences underscores the importance of processing and generating data for scientific reasoning. Sharing and citing research data is increasingly recognized as a crucial aspect of the scientific process, necessitating adherence to the FAIR principles (Findable, Accessible, Interoperable, Reusable) to avoid dark data and ensure reproducibility. In fields utilizing mathematical methods, research…
