Author: Aurora Ward
Authors: Cristian Sestito, Shady Agwa, Themis Prodromakis
Paper: https://arxiv.org/abs/2408.10243
Introduction
Convolutional Neural Networks (CNNs) have become the cornerstone of various AI applications, from computer vision to speech recognition. However, the computational and memory demands of CNNs, especially for high-resolution images, pose significant challenges. The VGG-16 CNN, for instance, requires substantial memory and computational resources to process its multi-dimensional feature maps (fmaps). This paper introduces TrIM, a novel dataflow architecture designed to address these challenges by reducing memory accesses and enhancing energy efficiency.
Related Work
Systolic Arrays and Dataflows
Systolic Arrays (SAs) are a promising solution to mitigate the Von Neumann bottleneck by…
Authors: Rujia Shen, Boran Wang, Chao Zhao, Yi Guan, Jingchi Jiang
Paper: https://arxiv.org/abs/2408.08023
Introduction
Causal discovery from time-series data is essential for understanding the relationships among variables over time. This understanding is crucial for various scientific disciplines, such as medicine and economics. Traditional methods for causal discovery from non-time-series data are not directly applicable to time-series data due to the need for serialized samples and a larger number of observed time steps. To address these challenges, the authors propose a novel gradient-based causal discovery approach called STIC (Short-Term Invariance-based Convolutional causal discovery). STIC leverages convolutional neural networks (CNNs) to uncover causal relationships by focusing…
Authors: Jiri Hron, Laura Culp, Gamaleldin Elsayed, Rosanne Liu, Ben Adlam, Maxwell Bileschi, Bernd Bohnet, JD Co-Reyes, Noah Fiedel, C. Daniel Freeman, Izzeddin Gur, Kathleen Kenealy, Jaehoon Lee, Peter J. Liu, Gaurav Mishra, Igor Mordatch, Azade Nova, Roman Novak, Aaron Parisi, Jeffrey Pennington, Alex Rizkowsky, Isabelle Simpson, Hanie Sedghi, Jascha Sohl-Dickstein, Kevin Swersky, Sharad Vikram, Tris Warkentin, Lechao Xiao, Kelvin Xu, Jasper Snoek, Simon Kornblith
Paper: https://arxiv.org/abs/2408.07852
Introduction
Despite significant advancements in the capabilities of large language models (LMs), hallucinations (instances where models generate incorrect or nonsensical information) remain a persistent challenge. This paper investigates how the scale of LMs influences hallucinations, focusing on cases where the correct answer appears verbatim in the training set. By training LMs on a knowledge graph (KG)-based dataset, the study aims to understand the extent of…
Authors: Jordan F. Masakuna, DJeff Kanda Nkashama, Arian Soltani, Marc Frappier, Pierre-Martin Tardif, Froduald Kabanza
Paper: https://arxiv.org/abs/2408.07718
Introduction and Related Work
Unsupervised anomaly detection is a critical area in machine learning, relying on the assumption that training datasets are free from anomalies. However, this assumption often does not hold true in practice, as datasets frequently contain anomalous instances, referred to as contamination. The presence of contamination can significantly undermine the effectiveness and reliability of anomaly detection models. To address this challenge, robust anomaly detection models have been developed, such as Isolation Forest (IF), Local Outlier Factor (LOF), One-Class Support Vector Machine (OCSVM), Neural Transformation Learning…
Authors: Kaushik Rangadurai, Siyang Yuan, Minhui Huang, Yiqun Liu, Golnaz Ghasemiesfeh, Yunchen Pu, Xinfeng Xie, Xingfeng He, Fangzhou Xu, Andrew Cui, Vidhoon Viswanathan, Yan Dong, Liang Xiong, Lin Yang, Liang Wang, Jiyan Yang, Chonglin Sun
Paper: https://arxiv.org/abs/2408.06653
Introduction
In the realm of machine learning and recommender systems, the retrieval stage is crucial for narrowing down millions of candidate ads to a few thousand relevant ones. This paper introduces the Hierarchical Structured Neural Network (HSNN), a novel approach designed to address the limitations of traditional Embedding Based Retrieval (EBR) systems. HSNN leverages sophisticated interactions and model architectures to enhance the retrieval process while maintaining sub-linear inference costs.
Related Work
Clustering Methods
Clustering algorithms are broadly categorized into…