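Keep creating, keep growing! This is day 23 of my participation in the June writing challenge of the 日新计划 program.
The ShowMeAI Daily series has been fully upgraded! It covers AI tools & frameworks | projects & code | blog posts & sharing | data & resources | research & papers, and more. Browse the article archive, or subscribe to the topic #ShowMeAI资讯日报 in the WeChat official account to receive each day's update. The topic collections & monthly e-magazine offer a quick way to read the complete set for each topic.
1. Tools & Frameworks
Tool: 妙言 – a lightweight Markdown notebook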
GitHub: github.com/tw93/MiaoYa…
Tool: ktop – a top-like resource viewer for Kubernetes clusters
‘ktop – A top-like tool for your Kubernetes clusters’ by Vladimir Vivien
GitHub: github.com/vladimirviv…
Tool: PyScript CLI – a command-line interface for PyScript
PyScript is a JS library that lets you embed and run Python code in HTML pages.
‘PyScript CLI – A CLI for PyScript’
GitHub: github.com/pyscript/py…
Tool: NoiseTorch – real-time microphone noise suppression on Linux
‘NoiseTorch – Real-time microphone noise suppression on Linux.’ by lawl
GitHub: github.com/noisetorch/…
Library: morfeus – a Python package for calculating molecular features
‘morfeus – A Python package for calculating molecular features’ by Kjell Jorner
GitHub: github.com/kjelljorner…
2. Blog Posts & Sharing
Video: Mu Li's project, the video series 《深度学习论文精读》 (close readings of deep learning papers)
GitHub: github.com/mli/paper-r…
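The videos are also posted on Bilibili and Zhihu. Papers are chosen from influential (must-read) deep learning work of the past 10 years, or from recent papers that are particularly interesting.
3. Data & Resources
Resource: AndroidReverseStudy – learning Android reverse engineering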
GitHub: github.com/heyhu/Andro…
Resource list: literature on few-shot class-incremental learning
‘Awesome Few-Shot Class-Incremental Learning’ by Da-Wei Zhou
GitHub: github.com/zhoudw-zdw/…
Resource list: software engineering interview resources
‘Awesome Software Engineering Interview’ by imkgarg
GitHub: github.com/imkgarg/Awe…
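Resource list: AI-research-tools – a list of research tools
GitHub: github.com/bighuang624…
Covers tools for searching, reading, and writing papers. Some are general-purpose; the field-specific ones focus mainly on computer science.
4. Research & Papers
Reply with the keyword 日报 in the WeChat official account to get the curated collection of June papers for free.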
Paper: Prioritized Training on Points that are Learnable, Worth Learning, and Not Yet Learnt
Title: Prioritized Training on Points that are Learnable, Worth Learning, and Not Yet Learnt
Date: 14 Jun 2022
Field: Deep Learning
Tasks: Loss function optimization
Paper URL: arxiv.org/abs/2206.07…
Code: github.com/oatml/rho-l…
Authors: Sören Mindermann, Jan Brauner, Muhammed Razzak, Mrinank Sharma, Andreas Kirsch, Winnie Xu, Benedikt Höltgen, Aidan N. Gomez, Adrien Morisot, Sebastian Farquhar, Yarin Gal
Summary: But most computation and time is wasted on redundant and noisy points that are already learnt or not learnable.
Abstract: Training on web-scale data can take months. But most computation and time is wasted on redundant and noisy points that are already learnt or not learnable. To accelerate training, we introduce Reducible Holdout Loss Selection (RHO-LOSS), a simple but principled technique which selects approximately those points for training that most reduce the model’s generalization loss. As a result, RHO-LOSS mitigates the weaknesses of existing data selection methods: techniques from the optimization literature typically select ‘hard’ (e.g. high loss) points, but such points are often noisy (not learnable) or less task-relevant. Conversely, curriculum learning prioritizes ‘easy’ points, but such points need not be trained on once learned. In contrast, RHO-LOSS selects points that are learnable, worth learning, and not yet learnt. RHO-LOSS trains in far fewer steps than prior art, improves accuracy, and speeds up training on a wide range of datasets, hyperparameters, and architectures (MLPs, CNNs, and BERT). On the large web-scraped image dataset Clothing-1M, RHO-LOSS trains in 18x fewer steps and reaches 2% higher final accuracy than uniform data shuffling.
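As a rough illustration of the selection rule the abstract describes, the sketch below scores every candidate in a batch by a "reducible holdout loss" (the current model's loss minus the loss of a model trained on holdout data) and keeps the highest-scoring points. The function name `select_rho_loss`, the array-based interface, and the toy numbers are assumptions made for illustration; this is not the authors' code.

```python
import numpy as np

def select_rho_loss(train_loss, holdout_loss, k):
    """Return indices of the k points with the highest reducible holdout loss.

    train_loss:   per-example loss under the model currently being trained
    holdout_loss: per-example loss under a model trained on holdout data,
                  used here as a proxy for loss that cannot be reduced
    """
    train_loss = np.asarray(train_loss, dtype=float)
    holdout_loss = np.asarray(holdout_loss, dtype=float)
    reducible = train_loss - holdout_loss
    # Stable sort on the negated scores yields descending order
    order = np.argsort(-reducible, kind="stable")
    return order[:k]

# Toy batch with hypothetical numbers:
#   0: already learnt        -> low training loss
#   1: noisy, not learnable  -> high training loss AND high holdout loss
#   2: learnable, not learnt -> high training loss, low holdout loss
#   3: moderately useful
train_loss = [0.1, 3.0, 2.5, 1.0]
holdout_loss = [0.1, 2.9, 0.3, 0.4]
selected = select_rho_loss(train_loss, holdout_loss, k=2)
print(selected)
```

In this toy batch the noisy point has the highest raw loss, so a plain "pick the hardest examples" rule would choose it; subtracting the holdout loss pushes it down the ranking.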
Paper: Online Segmentation of LiDAR Sequences: Dataset and Algorithm
Title: Online Segmentation of LiDAR Sequences: Dataset and Algorithm
Date: 16 Jun 2022
Field: Computer Vision
Tasks: Autonomous Vehicles, LIDAR Semantic Segmentation, Semantic Segmentation
Paper URL: arxiv.org/abs/2206.08…
Code: github.com/romainloise…
Authors: Romain Loiseau, Mathieu Aubry, Loïc Landrieu
Summary: Helix4D operates on acquisition slices that correspond to a fraction of a full rotation of the sensor, significantly reducing the total latency.
Abstract: Roof-mounted spinning LiDAR sensors are widely used by autonomous vehicles, driving the need for real-time processing of 3D point sequences. However, most LiDAR semantic segmentation datasets and algorithms split these acquisitions into 360° frames, leading to acquisition latency that is incompatible with realistic real-time applications and evaluations. We address this issue with two key contributions. First, we introduce HelixNet, a 10 billion point dataset with fine-grained labels, timestamps, and sensor rotation information that allows an accurate assessment of real-time readiness of segmentation algorithms. Second, we propose Helix4D, a compact and efficient spatio-temporal transformer architecture specifically designed for rotating LiDAR point sequences. Helix4D operates on acquisition slices that correspond to a fraction of a full rotation of the sensor, significantly reducing the total latency. We present an extensive benchmark of the performance and real-time readiness of several state-of-the-art models on HelixNet and SemanticKITTI. Helix4D reaches accuracy on par with the best segmentation algorithms with a reduction of more than 5× in terms of latency and 50× in model size. Code and data are available at romainloiseau.fr/helixnet
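To make the idea of acquisition slices more concrete, here is a small sketch that assigns points to slices by sensor rotation angle and compares the time needed to acquire a full 360° frame with the time needed for one slice. The 10 Hz rotation rate, the choice of 4 and 5 slices per rotation, and the function names are illustrative assumptions. The sketch covers acquisition time only and says nothing about the Helix4D architecture itself.

```python
import numpy as np

def slice_ids(azimuth_rad, slices_per_rotation):
    """Map each point's sensor azimuth (radians) to an acquisition slice id."""
    two_pi = 2.0 * np.pi
    fraction = (np.asarray(azimuth_rad, dtype=float) % two_pi) / two_pi
    ids = np.floor(fraction * slices_per_rotation).astype(int)
    # Guard against floating-point rounding at the upper edge
    return np.minimum(ids, slices_per_rotation - 1)

def acquisition_time_ms(rotation_hz, slices_per_rotation):
    """Time the sensor needs to sweep one slice, in milliseconds."""
    return 1000.0 / rotation_hz / slices_per_rotation

# Four points at hypothetical azimuths, split into quarter-turn slices
azimuths = np.array([0.1, 2.0, 3.5, 6.0])
ids = slice_ids(azimuths, slices_per_rotation=4)

# Assumed 10 Hz sensor: full frame vs. one fifth of a rotation
full_frame_ms = acquisition_time_ms(rotation_hz=10.0, slices_per_rotation=1)
slice_ms = acquisition_time_ms(rotation_hz=10.0, slices_per_rotation=5)
print(ids, full_frame_ms, slice_ms)
```

Under these assumed numbers, a model that waits for a full frame cannot start until 100 ms of data has arrived, while a slice-based model can start after 20 ms.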
Paper: Trajectory-guided Control Prediction for End-to-end Autonomous Driving: A Simple yet Strong Baseline
Title: Trajectory-guided Control Prediction for End-to-end Autonomous Driving: A Simple yet Strong Baseline
Date: 16 Jun 2022
Field: Computer Vision
Tasks: Autonomous Driving, Trajectory Planning
Paper URL: arxiv.org/abs/2206.08…
Code: github.com/OpenPercept…
Authors: Penghao Wu, Xiaosong Jia, Li Chen, Junchi Yan, Hongyang Li, Yu Qiao
Summary: The two branches are connected so that the control branch receives corresponding guidance from the trajectory branch at each time step.
Abstract: Current end-to-end autonomous driving methods either run a controller based on a planned trajectory or perform control prediction directly, which have spanned two separately studied lines of research. Seeing their potential mutual benefits to each other, this paper takes the initiative to explore the combination of these two well-developed worlds. Specifically, our integrated approach has two branches for trajectory planning and direct control, respectively. The trajectory branch predicts the future trajectory, while the control branch involves a novel multi-step prediction scheme such that the relationship between current actions and future states can be reasoned. The two branches are connected so that the control branch receives corresponding guidance from the trajectory branch at each time step. The outputs from two branches are then fused to achieve complementary advantages. Our results are evaluated in the closed-loop urban driving setting with challenging scenarios using the CARLA simulator. Even with a monocular camera input, the proposed approach ranks first on the official CARLA Leaderboard, outperforming other complex candidates with multiple sensors or fusion mechanisms by a large margin. The source code and data will be made publicly available at github.com/OpenPercept…
Paper: General-purpose, long-context autoregressive modeling with Perceiver AR
Title: General-purpose, long-context autoregressive modeling with Perceiver AR
Date: 15 Feb 2022
Field: Deep Learning
Tasks: Density Estimation, Autoregressive Models
Paper URL: arxiv.org/abs/2202.07…
Code: github.com/google-rese…
Authors: Curtis Hawthorne, Andrew Jaegle, Cătălina Cangea, Sebastian Borgeaud, Charlie Nash, Mateusz Malinowski, Sander Dieleman, Oriol Vinyals, Matthew Botvinick, Ian Simon, Hannah Sheahan, Neil Zeghidour, Jean-Baptiste Alayrac, João Carreira, Jesse Engel
Summary: Real-world data is high-dimensional: a book, image, or musical performance can easily contain hundreds of thousands of elements even after compression.
Abstract: Real-world data is high-dimensional: a book, image, or musical performance can easily contain hundreds of thousands of elements even after compression. However, the most commonly used autoregressive models, Transformers, are prohibitively expensive to scale to the number of inputs and layers needed to capture this long-range structure. We develop Perceiver AR, an autoregressive, modality-agnostic architecture which uses cross-attention to map long-range inputs to a small number of latents while also maintaining end-to-end causal masking. Perceiver AR can directly attend to over a hundred thousand tokens, enabling practical long-context density estimation without the need for hand-crafted sparsity patterns or memory mechanisms. When trained on images or music, Perceiver AR generates outputs with clear long-term coherence and structure. Our architecture also obtains state-of-the-art likelihood on long-sequence benchmarks, including 64 x 64 ImageNet images and PG-19 books.
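The central step in the abstract, cross-attention from a long input to a small number of latents under a causal mask, can be sketched in a few lines of NumPy. This is a simplified single-head version written for illustration: the function name, the random weights, and the toy sizes are assumptions, and the stack of latent self-attention layers that would follow in a full model is left out.

```python
import numpy as np

def causal_cross_attention(x, num_latents, wq, wk, wv):
    """Single-head cross-attention from the last `num_latents` positions to all positions.

    x: (seq_len, dim) input embeddings.
    Returns (latents, mask); latents has shape (num_latents, wv.shape[1]).
    """
    seq_len = x.shape[0]
    q = x[-num_latents:] @ wq   # queries come from the final positions only
    k = x @ wk                  # keys and values come from the full sequence
    v = x @ wv
    scores = q @ k.T / np.sqrt(k.shape[1])
    # Query i sits at absolute position seq_len - num_latents + i and may
    # only attend to keys at the same or an earlier position.
    q_pos = np.arange(seq_len - num_latents, seq_len)[:, None]
    k_pos = np.arange(seq_len)[None, :]
    mask = k_pos <= q_pos
    scores = np.where(mask, scores, -np.inf)
    # Softmax over keys; every row keeps at least its own position unmasked
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v, mask

rng = np.random.default_rng(0)
seq_len, dim, num_latents = 16, 8, 4
x = rng.normal(size=(seq_len, dim))
wq, wk, wv = (rng.normal(size=(dim, dim)) for _ in range(3))
latents, mask = causal_cross_attention(x, num_latents, wq, wk, wv)

# Changing the last input must not affect the first latent (causality check)
x_changed = x.copy()
x_changed[-1] += 1.0
latents_changed, _ = causal_cross_attention(x_changed, num_latents, wq, wk, wv)
print(latents.shape, np.allclose(latents[0], latents_changed[0]))
```

The score matrix here has shape (num_latents, seq_len) rather than (seq_len, seq_len), which is why this step stays affordable as the input grows.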
Paper: Spatially-Adaptive Multilayer Selection for GAN Inversion and Editing
Title: Spatially-Adaptive Multilayer Selection for GAN Inversion and Editing
Date: CVPR 2022
Field: Computer Vision
Paper URL: arxiv.org/abs/2206.08…
Code: github.com/adobe-resea…
Authors: Gaurav Parmar, Yijun Li, Jingwan Lu, Richard Zhang, Jun-Yan Zhu, Krishna Kumar Singh
Summary: We propose a new method to invert and edit such complex images in the latent space of GANs, such as StyleGAN2.
Abstract: Existing GAN inversion and editing methods work well for aligned objects with a clean background, such as portraits and animal faces, but often struggle for more difficult categories with complex scene layouts and object occlusions, such as cars, animals, and outdoor images. We propose a new method to invert and edit such complex images in the latent space of GANs, such as StyleGAN2. Our key idea is to explore inversion with a collection of layers, spatially adapting the inversion process to the difficulty of the image. We learn to predict the “invertibility” of different image segments and project each segment into a latent layer. Easier regions can be inverted into an earlier layer in the generator’s latent space, while more challenging regions can be inverted into a later feature space. Experiments show that our method obtains better inversion results compared to the recent approaches on complex categories, while maintaining downstream editability. Please refer to our project page at www.cs.cmu.edu/~SAMInversi…
Paper: Heterogeneous Information Network based Default Analysis on Banking Micro and Small Enterprise Users
Title: Heterogeneous Information Network based Default Analysis on Banking Micro and Small Enterprise Users
Date: 24 Apr 2022
Field: Deep Learning
Tasks: Feature Engineering
Paper URL: arxiv.org/abs/2204.11…
Code: github.com/adlington/h…
Authors: Zheng Zhang, Yingsheng Ji, Jiachen Shen, Xi Zhang, Guangwen Yang
Summary: Risk assessment is a substantial problem for financial institutions that has been extensively studied both for its methodological richness and its various practical applications.
Abstract: Risk assessment is a substantial problem for financial institutions that has been extensively studied both for its methodological richness and its various practical applications. With the expansion of inclusive finance, recent attentions are paid to micro and small-sized enterprises (MSEs). Compared with large companies, MSEs present a higher exposure rate to default owing to their insecure financial stability. Conventional efforts learn classifiers from historical data with elaborate feature engineering. However, the main obstacle for MSEs involves severe deficiency in credit-related information, which may degrade the performance of prediction. Besides, financial activities have diverse explicit and implicit relations, which have not been fully exploited for risk judgement in commercial banks. In particular, the observations on real data show that various relationships between company users have additional power in financial risk analysis. In this paper, we consider a graph of banking data, and propose a novel HIDAM model for the purpose. Specifically, we attempt to incorporate heterogeneous information network with rich attributes on multi-typed nodes and links for modeling the scenario of business banking service. To enhance feature representation of MSEs, we extract interactive information through meta-paths and fully exploit path information. Furthermore, we devise a hierarchical attention mechanism respectively to learn the importance of contents inside each meta-path and the importance of different meta-paths. Experimental results verify that HIDAM outperforms state-of-the-art competitors on real-world banking data.
Paper: HaGRID - HAnd Gesture Recognition Image Dataset
Title: HaGRID - HAnd Gesture Recognition Image Dataset
Date: 16 Jun 2022
Field: Computer Vision
Tasks: Gesture Recognition, Hand Detection, Hand Gesture Recognition, Hand-Gesture Recognition
Paper URL: arxiv.org/abs/2206.08…
Code: github.com/hukenovs/ha…
Authors: Alexander Kapitanov, Andrew Makhlyarchuk, Karina Kvanchiani
Summary: In this paper, we introduce an enormous dataset HaGRID (HAnd Gesture Recognition Image Dataset) for hand gesture recognition (HGR) systems.
Abstract: In this paper, we introduce an enormous dataset HaGRID (HAnd Gesture Recognition Image Dataset) for hand gesture recognition (HGR) systems. This dataset contains 552,992 samples divided into 18 classes of gestures. The annotations consist of bounding boxes of hands with gesture labels and markups of leading hands. The proposed dataset allows for building HGR systems, which can be used in video conferencing services, home automation systems, the automotive sector, services for people with speech and hearing impairments, etc. We are especially focused on interaction with devices to manage them. That is why all 18 chosen gestures are functional, familiar to the majority of people, and may be an incentive to take some action. In addition, we used crowdsourcing platforms to collect the dataset and took into account various parameters to ensure data diversity. We describe the challenges of using existing HGR datasets for our task and provide a detailed overview of them. Furthermore, the baselines for the hand detection and gesture classification tasks are proposed.
Paper: Exploring Smoothness and Class-Separation for Semi-supervised Medical Image Segmentation
Title: Exploring Smoothness and Class-Separation for Semi-supervised Medical Image Segmentation
Date: 2 Mar 2022
Field: Medical Technology
Tasks: Medical Image Segmentation, Semantic Segmentation, Semi-supervised Medical Image Segmentation
Paper URL: arxiv.org/abs/2203.01…
Code: github.com/HiLab-git/S… , github.com/ycwu1997/ss…
Authors: Yicheng Wu, Zhonghua Wu, Qianyi Wu, ZongYuan Ge, Jianfei Cai
Summary: The pixel-level smoothness forces the model to generate invariant results under adversarial perturbations.
Abstract: Semi-supervised segmentation remains challenging in medical imaging since the amount of annotated medical data is often limited and there are many blurred pixels near the adhesive edges or low-contrast regions. To address the issues, we advocate to firstly constrain the consistency of samples with and without strong perturbations to apply sufficient smoothness regularization and further encourage the class-level separation to exploit the unlabeled ambiguous pixels for the model training. Particularly, in this paper, we propose the SS-Net for semi-supervised medical image segmentation tasks, via exploring the pixel-level Smoothness and inter-class Separation at the same time. The pixel-level smoothness forces the model to generate invariant results under adversarial perturbations. Meanwhile, the inter-class separation constrains individual class features should approach their corresponding high-quality prototypes, in order to make each class distribution compact and separate different classes. We evaluated our SS-Net against five recent methods on the public LA and ACDC datasets. The experimental results under two semi-supervised settings demonstrate the superiority of our proposed SS-Net, achieving new state-of-the-art (SOTA) performance on both datasets. The code is available at github.com/ycwu1997/SS…
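The idea of pulling class features toward prototypes can be illustrated with a deliberately simplified sketch: each prototype is the plain mean of the features carrying that label, and the loss is the mean squared distance from each feature to its own prototype. The abstract speaks of "high-quality" prototypes without saying how they are built, so the plain mean, the function names, and the toy data are all assumptions here, and the sketch assumes every class appears at least once.

```python
import numpy as np

def class_prototypes(features, labels, num_classes):
    """Mean feature vector per class. features: (n, d), labels: (n,)."""
    return np.stack(
        [features[labels == c].mean(axis=0) for c in range(num_classes)]
    )

def compactness_loss(features, labels, prototypes):
    """Mean squared distance between each feature and its class prototype."""
    diffs = features - prototypes[labels]
    return float(np.mean(np.sum(diffs ** 2, axis=1)))

# Toy pixel features (hypothetical): two classes in a 2-D feature space
features = np.array([[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]])
labels = np.array([0, 0, 1, 1])
prototypes = class_prototypes(features, labels, num_classes=2)
loss = compactness_loss(features, labels, prototypes)
print(prototypes, loss)
```

Minimizing a term like this makes each class cluster tighter around its prototype, which is one simple way to read the "compact and separate" goal stated in the abstract.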
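We are ShowMeAI, dedicated to spreading high-quality AI content, sharing industry solutions, and using knowledge to speed up every step of technical growth!
- Author: 韩信子@ShowMeAI
- Article archive
- Topic collections & monthly e-magazine
- Notice: All rights reserved. For reprints, please contact the platform and the author and credit the source.
- Replies are welcome and likes are appreciated. Leave a comment recommending valuable articles, tools, or suggestions, and we will reply as soon as we can.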