I Built a ROS2 Package for DepthAnything
Overview: I built a ROS2 package for DepthAnything; this post rambles through my impressions of it and the surrounding tools I made along the way. Links to what I built: https://github.com/scepter914/DepthAnything-ROS https://github.com/scepter914/ros-useful-tools/tree/main/nuscenes_rosbag . What I built: DepthAnything-ROS. Visualization data: nuScenes. Fine tu…
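As a reference for what such a node boils down to, here is a minimal rclpy sketch (illustration only; the actual DepthAnything-ROS package has its own implementation and topic names, and `run_depth_anything` is a hypothetical stand-in for the real inference call):

```python
# Minimal sketch of a DepthAnything-style ROS2 node; not the real package.
import numpy as np
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from cv_bridge import CvBridge


def run_depth_anything(rgb: np.ndarray) -> np.ndarray:
    # Hypothetical stand-in for the actual model (e.g. a TensorRT engine call).
    return np.zeros(rgb.shape[:2], dtype=np.float32)


class DepthAnythingNode(Node):
    def __init__(self):
        super().__init__("depth_anything_node")
        self.bridge = CvBridge()
        self.sub = self.create_subscription(
            Image, "~/input/image", self.on_image, 10)
        self.pub = self.create_publisher(Image, "~/output/depth", 10)

    def on_image(self, msg: Image):
        rgb = self.bridge.imgmsg_to_cv2(msg, desired_encoding="rgb8")
        depth = run_depth_anything(rgb)              # HxW float32 depth map
        out = self.bridge.cv2_to_imgmsg(depth, encoding="32FC1")
        out.header = msg.header                      # keep source stamp/frame
        self.pub.publish(out)


def main():
    rclpy.init()
    rclpy.spin(DepthAnythingNode())


if __name__ == "__main__":
    main()
```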
Dealing with Garbled Japanese Conversion in VSCode Vim
Overview: how I dealt with Japanese conversion getting garbled in VSCode Vim. Environment: Ubuntu 22.04, VSCode v1.82.0. With the VSCode Vim extension installed, typing Japanese causes the conversion (henkan) to go wrong…
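The post's actual fix is cut off above; for reference, a commonly used VSCodeVim setting for this class of IME problem is `vim.autoSwitchInputMethod`, which restores a default input method whenever you leave insert mode so half-converted text isn't mangled by normal-mode keystrokes. A sketch for settings.json, assuming the fcitx framework (the commands and IM name depend on your setup; Ubuntu 22.04 may use ibus or fcitx5 instead):

```jsonc
{
  // Switch back to the default IM on leaving insert mode.
  "vim.autoSwitchInputMethod.enable": true,
  // Assumed IM names/commands for fcitx; adjust for your framework.
  "vim.autoSwitchInputMethod.defaultIM": "fcitx-keyboard-us",
  "vim.autoSwitchInputMethod.obtainIMCmd": "/usr/bin/fcitx-remote -n",
  "vim.autoSwitchInputMethod.switchIMCmd": "/usr/bin/fcitx-remote -s {im}"
}
```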
MatrixVT: Efficient Multi-Camera to BEV Transformation for 3D Perception (arXiv 2022/11)
Summary: https://github.com/Megvii-BaseDetection/BEVDepth . The successor to BEVDepth, in a lightweight version: a lightweight BEV-based camera 3D detector, light enough to run even on a CPU, where it takes only a few tens of…
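The trick, as I understand it, is that the lift-splat view transform can be rewritten as a plain matrix product against a static geometry matrix. A toy sketch of that reformulation (shapes and the mapping matrix are made up; the paper builds the matrix once from calibration and sparsifies it via Prime Extraction and Ring & Ray decomposition):

```python
# Toy sketch of MatrixVT's idea; not the authors' code.
import torch

C, D, H, W = 64, 32, 16, 44          # channels, depth bins, feat height/width
X, Y = 128, 128                      # BEV grid size

feat = torch.randn(C, H, W)                          # backbone image features
depth = torch.softmax(torch.randn(D, H, W), dim=0)   # per-pixel depth dist.

# Prime extraction: collapse the height axis so the tensors stay small.
feat_p = feat.mean(dim=1)            # (C, W)
depth_p = depth.mean(dim=1)          # (D, W)

# Outer product over depth bins and columns -> (C, D*W) "polar" features.
polar = torch.einsum("cw,dw->cdw", feat_p, depth_p).reshape(C, D * W)

# Static geometry: each (depth bin, column) ray cell maps to one BEV cell.
# Random here; in the paper it comes from camera calibration and is sparse.
vt = torch.zeros(D * W, X * Y)
vt[torch.arange(D * W), torch.randint(0, X * Y, (D * W,))] = 1.0

bev = (polar @ vt).reshape(C, X, Y)  # the whole view transform is one matmul
```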
DeepFusion: A Robust and Modular 3D Object Detector for Lidars, Cameras and Radars (IROS 2022)
Summary: BEV-based camera-LiDAR-radar fusion 3D detection whose sensor branches can be treated like modules. Detailed analysis through qualitative evaluation; with LiDAR only, the longitudinal…
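A minimal sketch of the "modular" fusion idea described above (my illustration, not the authors' code): each available sensor branch produces a BEV feature map of identical shape, so branches can be added or dropped and the fusion step is just a concat + conv over whatever is present.

```python
import torch
import torch.nn as nn


class ModularBEVFusion(nn.Module):
    def __init__(self, channels: int = 64, max_branches: int = 3):
        super().__init__()
        # One fusion conv per possible number of active branches.
        self.fuse = nn.ModuleDict({
            str(n): nn.Conv2d(n * channels, channels, 3, padding=1)
            for n in range(1, max_branches + 1)
        })

    def forward(self, bev_feats: list[torch.Tensor]) -> torch.Tensor:
        # bev_feats: BEV maps (B, C, X, Y) from camera/LiDAR/radar branches;
        # any subset may be present at inference time.
        fused = torch.cat(bev_feats, dim=1)
        return self.fuse[str(len(bev_feats))](fused)


cam, lidar = torch.randn(1, 64, 128, 128), torch.randn(1, 64, 128, 128)
out = ModularBEVFusion()([cam, lidar])   # radar branch simply omitted
```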
aiMotive Dataset: A Multimodal Dataset for Robust Autonomous Driving with Long-Range Perception (arXiv 2022/11)
Summary: https://arxiv.org/pdf/2211.09445.pdf . Dataset: https://github.com/aimotive/aimotive_dataset/tree/f71828446692587318ebccbd3cdad5b4335eb9f3 ; loader API: https://github.com/aimotive/aimotive-dataset-loader ; training code: https://github.com/aimotive/mm_training . Proposes a camera-LiDAR-radar dataset; since annotations go out to 200 m, long-range detection…
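Since the selling point is annotation out to 200 m, long-range evaluation is naturally bucketed by distance. A generic sketch of such bucketing (field names are hypothetical, not the aimotive-dataset-loader API):

```python
import math
from collections import defaultdict


def range_buckets(boxes, edges=(0, 50, 100, 150, 200)):
    """Group 3D boxes by ego distance, e.g. 0-50 m, 50-100 m, ..."""
    buckets = defaultdict(list)
    for box in boxes:                  # box: dict with 'x', 'y' in ego frame
        r = math.hypot(box["x"], box["y"])
        for lo, hi in zip(edges, edges[1:]):
            if lo <= r < hi:
                buckets[(lo, hi)].append(box)
                break
    return buckets
```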
MTP: Multi-hypothesis Tracking and Prediction for Reduced Error Propagation (IV 2022)
Summary: from Carnegie Mellon and NVIDIA. https://www.youtube.com/watch?v=ydQ9IPbX_-A . Proposes a multi-hypothesis tracking and prediction framework: keeping multiple tracking results improves prediction performance. Also analyzes how tracking errors affect prediction performance…
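The core idea, as I read it: instead of committing to a single data association, keep the top-k tracking hypotheses and let the predictor see all of them, so one bad association doesn't poison the forecast. A toy sketch (hypothetical names, not the paper's code):

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    track_history: list   # past states under this association
    score: float          # association likelihood


def predict_multi_hypothesis(hypotheses: list[Hypothesis], predictor, k: int = 3):
    # Keep the k most likely association hypotheses instead of just the best.
    top_k = sorted(hypotheses, key=lambda h: h.score, reverse=True)[:k]
    futures = [(h.score, predictor(h.track_history)) for h in top_k]
    total = sum(s for s, _ in futures)
    # Return a score-weighted mixture of the per-hypothesis forecasts.
    return [(s / total, traj) for s, traj in futures]
```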
Simple-BEV: What Really Matters for Multi-Sensor BEV Perception? (arXiv 2022/09)
Summary: https://simple-bev.github.io/ https://github.com/aharley/simple_bev . Training code for nuScenes and Lyft, with pretrained models for nuScenes. Camera-radar fusion BEV detection. Input: 6 cameras (360°) + radar point cloud. Compared with depth-based and homography-based…
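As I recall, the paper's answer is that a parameter-free lift works: define a 3D grid over the BEV plane, project every voxel center into each camera, and bilinearly sample image features there, with no depth estimation or homography. A condensed sketch of that lifting step, assuming pinhole intrinsics Ks, camera-from-ego extrinsics Ts, and feature maps at the resolution the intrinsics describe:

```python
import torch
import torch.nn.functional as F


def lift_to_bev(feats, Ks, Ts, grid_xyz, img_hw):
    """feats: (N, C, Hf, Wf) per-camera features; grid_xyz: (P, 3) voxel
    centers in the ego frame; Ks: (N, 3, 3); Ts: (N, 4, 4). Returns (C, P)."""
    H, W = img_hw
    N, C = feats.shape[:2]
    homo = torch.cat([grid_xyz, torch.ones(len(grid_xyz), 1)], dim=1)  # (P, 4)
    acc, cnt = torch.zeros(C, len(grid_xyz)), torch.zeros(len(grid_xyz))
    for i in range(N):
        cam = (Ts[i] @ homo.T)[:3]                 # (3, P) in camera frame
        z = cam[2].clamp(min=1e-5)
        uv = (Ks[i] @ cam)[:2] / z                 # pixel coordinates
        # Normalize to [-1, 1] for grid_sample; mask points outside the view.
        gx, gy = uv[0] / (W - 1) * 2 - 1, uv[1] / (H - 1) * 2 - 1
        valid = (cam[2] > 0) & (gx.abs() <= 1) & (gy.abs() <= 1)
        grid = torch.stack([gx, gy], dim=-1).view(1, 1, -1, 2)
        sampled = F.grid_sample(feats[i:i + 1], grid, align_corners=True)
        acc += sampled.view(C, -1) * valid
        cnt += valid.float()
    return acc / cnt.clamp(min=1)                  # average over cameras
```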
BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving (arXiv 2022/05)
Summary: https://arxiv.org/pdf/2205.09743.pdf https://github.com/zhangyp15/BEVerse . Multi-camera BEV perception: multi-task learning of 3D detection, motion prediction, and semantic maps. The semantic map contains the map only, no objects; motion prediction is segmentation…
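The multi-task structure described above, as a skeleton (my sketch, not the BEVerse code): one shared multi-camera BEV encoder feeding three heads, with hypothetical channel counts.

```python
import torch.nn as nn


class BEVerseLikeModel(nn.Module):
    def __init__(self, bev_encoder: nn.Module, c: int = 64):
        super().__init__()
        self.bev_encoder = bev_encoder               # images -> (B, c, X, Y)
        self.det_head = nn.Conv2d(c, 10, 1)          # 3D detection outputs
        self.map_head = nn.Conv2d(c, 4, 1)           # semantic map (map only)
        self.motion_head = nn.Conv2d(c, 8, 1)        # motion segmentation

    def forward(self, images):
        bev = self.bev_encoder(images)               # shared BEV features
        return {
            "detection": self.det_head(bev),
            "semantic_map": self.map_head(bev),
            "motion": self.motion_head(bev),
        }
```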
BEVFormer: Learning Bird’s-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers (ECCV 2022)
Summary: https://github.com/fundamentalvision/BEVFormer . Obtains a bird’s-eye-view representation from multi-camera images with an attention-based approach…
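A stripped-down sketch of attention-based BEV lifting (the paper uses deformable spatial cross-attention plus temporal self-attention; plain cross-attention is used here only to show the query structure):

```python
import torch
import torch.nn as nn


class BEVCrossAttention(nn.Module):
    def __init__(self, c: int = 256, bev_hw: tuple = (50, 50)):
        super().__init__()
        self.bev_hw = bev_hw
        # One learnable query per BEV grid cell.
        self.bev_queries = nn.Parameter(torch.randn(bev_hw[0] * bev_hw[1], c))
        self.attn = nn.MultiheadAttention(c, num_heads=8, batch_first=True)

    def forward(self, cam_feats: torch.Tensor) -> torch.Tensor:
        # cam_feats: (B, N*Hf*Wf, C) flattened multi-camera features.
        B = cam_feats.shape[0]
        q = self.bev_queries.unsqueeze(0).expand(B, -1, -1)
        bev, _ = self.attn(q, cam_feats, cam_feats)
        H, W = self.bev_hw
        return bev.transpose(1, 2).reshape(B, -1, H, W)


bev = BEVCrossAttention()(torch.randn(2, 6 * 15 * 25, 256))  # (2, 256, 50, 50)
```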
BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird’s-Eye View Representation (arXiv 2022/05)
Summary: https://bevfusion.mit.edu/ . Official code: https://github.com/mit-han-lab/bevfusion (mmdet-based, evaluated on Waymo and nuScenes, pretrained models available). https://www.youtube.com/watch?v=uCAka90si9E . Camera-LiDAR fusion performed in the BEV feature space…
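Fusing in a unified BEV feature space reduces to aligning both modalities onto the same grid and merging them with a small conv block; a sketch of the idea (my illustration with hypothetical channel counts, not the mit-han-lab implementation):

```python
import torch
import torch.nn as nn


class BEVSpaceFusion(nn.Module):
    def __init__(self, c_cam: int = 80, c_lidar: int = 256, c_out: int = 256):
        super().__init__()
        self.fuser = nn.Sequential(
            nn.Conv2d(c_cam + c_lidar, c_out, 3, padding=1),
            nn.BatchNorm2d(c_out),
            nn.ReLU(inplace=True),
        )

    def forward(self, cam_bev: torch.Tensor, lidar_bev: torch.Tensor):
        # Both inputs are already on the shared BEV grid (B, C, X, Y), so the
        # task heads (detection, segmentation, ...) never see raw modalities.
        return self.fuser(torch.cat([cam_bev, lidar_bev], dim=1))


fused = BEVSpaceFusion()(torch.randn(1, 80, 180, 180),
                         torch.randn(1, 256, 180, 180))
```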