Author: Scarlett Fox

Authors: Zhongliang Guo, Lei Fang, Jingyu Lin, Yifei Qian, Shuai Zhao, Zeyu Wang, Junhao Dong, Cunjian Chen, Ognjen Arandjelović, Chun Pong Lau
Paper: https://arxiv.org/abs/2408.10901
Introduction
The rapid advancements in generative artificial intelligence, particularly in image synthesis and manipulation, have brought about significant transformations in various industries. State-of-the-art methodologies, such as Latent Diffusion Models (LDMs), have demonstrated exceptional capabilities in producing photorealistic imagery. However, these advancements also pose ethical and security challenges, including data misappropriation and intellectual property infringement. The ease with which diffusion-based models can manipulate existing visual content presents a significant threat to the integrity of digital assets. Adversarial attacks on machine learning models have emerged as a…

Authors: Julius Pesonen, Teemu Hakala, Väinö Karjalainen, Niko Koivumäki, Lauri Markelin, Anna-Maria Raita-Hakola, Juha Suomalainen, Ilkka Pölönen, Eija Honkavaara
Paper: https://arxiv.org/abs/2408.10843
Introduction
Background
The increasing frequency and intensity of wildfires pose significant risks to the environment, ecosystems, and societies. Early detection is crucial to prevent large-scale disasters. Uncrewed aerial vehicles (UAVs) offer a promising solution for rapid deployment and surveying large areas with minimal infrastructure. However, in remote areas, UAVs are limited to on-board computing due to the lack of high-bandwidth mobile networks. This necessitates lightweight, specialized models for real-time detection and localization of wildfires.
Problem Statement
Accurate camera-based localization of wildfires requires segmentation of detected smoke. However,…

Authors: Udo Schlegel, Daniel A. Keim, Tobias Sutter
Paper: https://arxiv.org/abs/2408.10628
Introduction
Understanding how models process and interpret time series data remains a significant challenge in deep learning, especially for applications in safety-critical areas such as healthcare. In this paper, the authors introduce Sequence Dreaming, a technique that adapts Activation Maximization to analyze sequential information, aiming to enhance the interpretability of neural networks operating on univariate time series. By leveraging this method, they visualize the temporal dynamics and patterns most influential in model decision-making processes. This approach is tested on a time series classification dataset encompassing applications in predictive maintenance.
Related Work
In…

Authors: Samuel Chevalier, Duncan Starkenburg, Krishnamurthy Dvijotham
Paper: https://arxiv.org/abs/2408.10491
Introduction
Background
Formal verification is a critical process in ensuring the reliability and robustness of systems, particularly those involving Neural Networks (NNs). This process involves reformulating NNs into mathematical programs optimized for verification. However, the non-convex nature of these reformulations poses significant challenges, necessitating the use of convex relaxations for nonlinear activation functions.
Problem Statement
Common relaxations, such as static linear cuts for “S-shaped” activation functions like the sigmoid, often result in overly loose bounds, which can slow down the verification process. This paper introduces a novel approach to derive tunable hyperplanes that tightly…

Authors: Xuechu Yu
Paper: https://arxiv.org/abs/2408.10292
Introduction
Background
Contrastive representation learning has emerged as a powerful technique in self-supervised learning, particularly for tasks such as image classification, object detection, and instance segmentation. The core idea is to learn representations by maximizing the mutual information between different views of unlabeled data. However, recent studies have shown that simply increasing the estimated mutual information does not necessarily lead to better performance in downstream tasks. This observation suggests that the learned representations may contain not only task-relevant information but also task-irrelevant (superfluous) information, which can degrade performance.
Problem Statement
The presence of superfluous information…

Authors: Rafael M. Mamede, Pedro C. Neto, Ana F. Sequeira
Paper: https://arxiv.org/abs/2408.10175
Introduction
The rapid advancement of machine learning-based biometric applications, particularly facial recognition systems, has brought to light significant concerns regarding their fairness, safety, and trustworthiness. Initial research primarily focused on enhancing recognition performance, but recent studies have shifted towards understanding and mitigating biases within these systems. This shift is driven by incidents of misclassification in critical scenarios, such as criminal trials, and the need for transparency as mandated by regulations like the EU General Data Protection Regulation (GDPR). This study investigates the impact of occlusions on the fairness of facial…

Authors: Xiao Wang, Shiao Wang, Pengpeng Shao, Bo Jiang, Lin Zhu, Yonghong Tian
Paper: https://arxiv.org/abs/2408.09764
Introduction
Human Action Recognition (HAR) is a crucial research area in computer vision and artificial intelligence, primarily driven by the advancements in deep learning techniques. Traditionally, HAR has relied heavily on RGB cameras to capture and analyze human activities. However, RGB cameras face significant challenges in real-world applications, such as varying light conditions, fast motion, and privacy concerns. These limitations necessitate the exploration of alternative technologies. Event cameras, also known as Dynamic Vision Sensors (DVS), have emerged as a promising alternative due to their unique advantages, including low energy consumption,…

Authors: Hongyin Zhu
Paper: https://arxiv.org/abs/2408.09205
Abstract
The development of a large language model (LLM) infrastructure is a pivotal undertaking in artificial intelligence. This paper explores the intricate landscape of LLM infrastructure, software, and data management. By analyzing these core components, we emphasize the pivotal considerations and safeguards crucial for successful LLM development. This work presents a concise synthesis of the challenges and strategies inherent in constructing a robust and effective LLM infrastructure, offering valuable insights for researchers and practitioners alike.
Infrastructure Configuration
In the realm of infrastructure configuration for LLM training endeavors, server clusters equipped with H100/H800 GPUs have emerged…

Authors: Maris F. L. Galesloot, Marnix Suilen, Thiago D. Simão, Steven Carr, Matthijs T. J. Spaan, Ufuk Topcu, Nils Jansen
Paper: https://arxiv.org/abs/2408.08770
Introduction
Partially observable Markov decision processes (POMDPs) are a fundamental model for decision-making under uncertainty. They require policies that select actions based on limited state information to achieve specific objectives, such as minimizing expected costs. Traditional POMDPs assume precise knowledge of transition and observation probabilities, which is often unrealistic due to data limitations or sensor inaccuracies. Robust POMDPs (RPOMDPs) address this by incorporating uncertainty sets for these probabilities, optimizing policies against the worst-case scenarios within these sets.
Challenges in RPOMDPs
Finding robust policies for…
