IEEE Transactions on Pattern Analysis and Machine Intelligence

https://read.qxmd.com/read/38625775/fast-building-instance-proxy-reconstruction-for-large-urban-scenes
#1
JOURNAL ARTICLE
Jianwei Guo, Haobo Qin, Yinchang Zhou, Xin Chen, Liangliang Nan, Hui Huang
Digitalization of large-scale urban scenes (in particular buildings) has been a long-standing open problem, which is attributable to challenges in data acquisition, such as incomplete scene coverage, lack of semantics, low efficiency, and low reliability in path planning. In this paper, we address these challenges in urban building reconstruction from aerial images, and we propose an effective workflow and a few novel algorithms for efficient 3D building instance proxy reconstruction for large urban scenes. Specifically, we propose a novel learning-based approach to instance segmentation of urban buildings from aerial images followed by a voting-based algorithm to fuse the multi-view instance information to a sparse point cloud (reconstructed using a standard Structure from Motion pipeline)...
April 16, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38625774/bridging-visual-and-textual-semantics-towards-consistency-for-unbiased-scene-graph-generation
#2
JOURNAL ARTICLE
Ruonan Zhang, Gaoyun An, Yiqing Hao, Dapeng Oliver Wu
Scene Graph Generation (SGG) aims to detect visual relationships in an image. However, due to long-tailed bias, SGG is far from practical. Most methods depend heavily on the assistance of statistical co-occurrence to generate a balanced dataset, so they are dataset-specific and easily affected by noise. The fundamental cause is that SGG is simplified as a classification task instead of a reasoning task, so the ability to capture fine-grained details is limited and the difficulty in handling ambiguity is increased...
April 16, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38625773/analysis-of-video-quality-datasets-via-design-of-minimalistic-video-quality-models
#3
JOURNAL ARTICLE
Wei Sun, Wen Wen, Xiongkuo Min, Long Lan, Guangtao Zhai, Kede Ma
Blind video quality assessment (BVQA) plays an indispensable role in monitoring and improving the end-users' viewing experience in various real-world video-enabled media applications. As an experimental field, the improvements of BVQA models have been measured primarily on a few human-rated VQA datasets. Thus, it is crucial to gain a better understanding of existing VQA datasets in order to properly evaluate the current progress in BVQA. Towards this goal, we conduct a first-of-its-kind computational analysis of VQA datasets via designing minimalistic BVQA models...
April 16, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38625772/on-the-benefit-of-optimal-transport-for-curriculum-reinforcement-learning
#4
JOURNAL ARTICLE
Pascal Klink, Carlo D'Eramo, Jan Peters, Joni Pajarinen
Curriculum reinforcement learning (CRL) allows solving complex tasks by generating a tailored sequence of learning tasks, starting from easy ones and subsequently increasing their difficulty. Although the potential of curricula in RL has been clearly shown in various works, it is less clear how to generate them for a given learning environment, resulting in various methods aiming to automate this task. In this work, we focus on framing curricula as interpolations between task distributions, which has previously been shown to be a viable approach to CRL...
April 16, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38619951/learning-to-sketch-a-neural-approach-to-item-frequency-estimation-in-streaming-data
#5
JOURNAL ARTICLE
Yukun Cao, Yuan Feng, Hairu Wang, Xike Xie, S Kevin Zhou
Recently, there has been a trend of designing neural data structures to go beyond handcrafted data structures by leveraging patterns of data distributions for better accuracy and adaptivity. Sketches are widely used data structures in real-time web analysis, network monitoring, and self-driving to estimate item frequencies of data streams within limited space. However, existing sketches have not fully exploited the patterns of the data stream distributions, making it challenging to tightly couple them with neural networks that excel at memorizing pattern information...
April 15, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38607718/disentangled-explanations-of-neural-network-predictions-by-finding-relevant-subspaces
#6
JOURNAL ARTICLE
Pattarawat Chormai, Jan Herrmann, Klaus-Robert Muller, Gregoire Montavon
Explainable AI aims to overcome the black-box nature of complex ML models like neural networks by generating explanations for their predictions. Explanations often take the form of a heatmap identifying input features (e.g. pixels) that are relevant to the model's decision. These explanations, however, entangle the potentially multiple factors that enter into the overall complex decision strategy. We propose to disentangle explanations by extracting at some intermediate layer of a neural network, subspaces that capture the multiple and distinct activation patterns (e...
April 12, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38607717/deep-learning-methods-for-calibrated-photometric-stereo-and-beyond
#7
JOURNAL ARTICLE
Yakun Ju, Kin-Man Lam, Wuyuan Xie, Huiyu Zhou, Junyu Dong, Boxin Shi
Photometric stereo recovers the surface normals of an object from multiple images with varying shading cues, i.e., modeling the relationship between surface orientation and intensity at each pixel. Photometric stereo prevails in superior per-pixel resolution and fine reconstruction details. However, it is a complicated problem because of the non-linear relationship caused by non-Lambertian surface reflectance. Recently, various deep learning methods have shown a powerful ability in the context of photometric stereo against non-Lambertian surfaces...
April 12, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38607716/rdfc-gan-rgb-depth-fusion-cyclegan-for-indoor-depth-completion
#8
JOURNAL ARTICLE
Haowen Wang, Zhengping Che, Yufan Yang, Mingyuan Wang, Zhiyuan Xu, Xiuquan Qiao, Mengshi Qi, Feifei Feng, Jian Tang
Raw depth images captured in indoor scenarios frequently exhibit extensive missing values due to the inherent limitations of the sensors and environments. For example, transparent materials frequently elude detection by depth sensors; surfaces may introduce measurement inaccuracies due to their polished textures, extended distances, and oblique incidence angles from the sensor. The presence of incomplete depth maps imposes significant challenges for subsequent vision applications, prompting the development of numerous depth completion techniques to mitigate this problem...
April 12, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38607715/ensemble-predictors-possibilistic-combination-of-conformal-predictors-for-multivariate-time-series-classification
#9
JOURNAL ARTICLE
Andrea Campagner, Marilia Barandas, Duarte Folgado, Hugo Gamboa, Federico Cabitza
In this article we propose a conceptual framework to study ensembles of conformal predictors (CP), which we call Ensemble Predictors (EP). Our approach is inspired by the application of imprecise probabilities in information fusion. Based on the proposed framework, we study, for the first time in the literature, the theoretical properties of CP ensembles in a general setting, by focusing on simple and commonly used possibilistic combination rules. We also illustrate the applicability of the proposed methods in the setting of multivariate time-series classification, showing that these methods provide better performance (in terms of robustness, conservativeness, accuracy and running time) than both standard classification algorithms and other combination rules proposed in the literature, on a large set of benchmarks from the UCR time series archive...
April 12, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38607714/sensitivity-aware-density-estimation-in-multiple-dimensions
#10
JOURNAL ARTICLE
Aleix Boquet-Pujadas, Pol Del Aguila Pla, Michael Unser
We formulate an optimization problem to estimate probability densities in the context of multidimensional problems that are sampled with uneven probability. It considers detector sensitivity as a heterogeneous density and takes advantage of the computational speed and flexible boundary conditions offered by splines on a grid. We choose to regularize the Hessian of the spline via the nuclear norm to promote sparsity. As a result, the method is spatially adaptive and stable against the choice of the regularization parameter, which plays the role of the bandwidth...
April 12, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38607713/transferring-annotator-and-instance-dependent-transition-matrix-for-learning-from-crowds
#11
JOURNAL ARTICLE
Shikun Li, Xiaobo Xia, Jiankang Deng, Shiming Ge, Tongliang Liu
Learning from crowds refers to the setting in which the annotations of training data are obtained through crowd-sourcing services. Multiple annotators each complete their own small part of the annotations, where labeling mistakes that depend on annotators occur frequently. Modeling the label-noise generation process by the noise transition matrix is a powerful tool to tackle the label noise. In real-world crowd-sourcing scenarios, noise transition matrices are both annotator- and instance-dependent. However, due to the high complexity of annotator- and instance-dependent transition matrices (AIDTM), annotation sparsity, which means each annotator only labels a tiny fraction of the instances, makes modeling AIDTM very challenging...
April 12, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38607712/bridging-actions-generate-3d-poses-and-shapes-in-between-photos
#12
JOURNAL ARTICLE
Wen-Li Wei, Jen-Chun Lin
Generating realistic 3D human motion has been a fundamental goal of the game/animation industry. This work presents a novel transition generation technique that can bridge the actions of people in the foreground by generating 3D poses and shapes in-between photos, allowing 3D animators/novice users to easily create/edit 3D motions. To achieve this, we propose an adaptive motion network (ADAM-Net) that effectively learns human motion from masked action sequences to generate kinematically compliant 3D poses and shapes in-between given temporally-sparse photos...
April 12, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38607711/vote2cap-detr-decoupling-localization-and-describing-for-end-to-end-3d-dense-captioning
#13
JOURNAL ARTICLE
Sijin Chen, Hongyuan Zhu, Mingsheng Li, Xin Chen, Peng Guo, Yinjie Lei, Gang Yu, Taihao Li, Tao Chen
3D dense captioning requires a model to translate its understanding of an input 3D scene into several captions associated with different object regions. Existing methods adopt a sophisticated "detect-then-describe" pipeline, which builds explicit relation modules upon a 3D detector with numerous hand-crafted components. While these methods have achieved initial success, the cascade pipeline tends to accumulate errors because of duplicated and inaccurate box estimations and messy 3D scenes. In this paper, we first propose Vote2Cap-DETR, a simple-yet-effective transformer framework that decouples the decoding process of caption generation and object localization through parallel decoding...
April 12, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38607710/hierarchical-augmentation-and-distillation-for-class-incremental-audio-visual-video-recognition
#14
JOURNAL ARTICLE
Yukun Zuo, Hantao Yao, Liansheng Zhuang, Changsheng Xu
Audio-visual video recognition (AVVR) aims to integrate audio and visual clues to categorize videos accurately. While existing methods train AVVR models using provided datasets and achieve satisfactory results, they struggle to retain historical class knowledge when confronted with new classes in real-world situations. Currently, there are no dedicated methods for addressing this problem, so this paper concentrates on exploring Class Incremental Audio-Visual Video Recognition (CIAVVR). For CIAVVR, since both stored data and learned model of past classes contain historical knowledge, the core challenge is how to capture past data knowledge and past model knowledge to prevent catastrophic forgetting...
April 12, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38602857/xgrad-boosting-gradient-based-optimizers-with-weight-prediction
#15
JOURNAL ARTICLE
Lei Guan, Dongsheng Li, Yanqi Shi, Jian Meng
In this paper, we propose a general deep learning training framework XGrad which introduces weight prediction into the popular gradient-based optimizers to boost their convergence and generalization when training the deep neural network (DNN) models. In particular, ahead of each mini-batch training, the future weights are predicted according to the update rule of the used optimizer and are then applied to both the forward pass and backward propagation. In this way, during the whole training period, the optimizer always utilizes the gradients w...
April 11, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38602856/efficient-and-robust-point-cloud-registration-via-heuristics-guided-parameter-search
#16
JOURNAL ARTICLE
Tianyu Huang, Haoang Li, Liangzu Peng, Yinlong Liu, Yun-Hui Liu
Estimating the rigid transformation with 6 degrees of freedom based on a putative 3D correspondence set is a crucial procedure in point cloud registration. Existing correspondence identification methods usually lead to large outlier ratios (> 95% is common), underscoring the significance of robust registration methods. Many researchers turn to parameter search-based strategies (e.g., Branch-and-Bound) for robust registration. Although related methods show high robustness, their efficiency is limited by the high-dimensional search space...
April 11, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38602855/on-the-consistency-and-large-scale-extension-of-multiple-kernel-clustering
#17
JOURNAL ARTICLE
Weixuan Liang, Chang Tang, Xinwang Liu, Yong Liu, Jiyuan Liu, En Zhu, Kunlun He
Existing multiple kernel clustering (MKC) algorithms have two ubiquitous problems. From the theoretical perspective, most MKC algorithms lack sufficient theoretical analysis, especially the consistency of learned parameters, such as the kernel weights. From the practical perspective, the high complexity makes MKC unable to handle large-scale datasets. This paper tries to address the above two issues. We first make a consistency analysis of an influential MKC method named Simple Multiple Kernel k-Means (SimpleMKKM)...
April 11, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38598389/perf-panoramic-neural-radiance-field-from-a-single-panorama
#18
JOURNAL ARTICLE
Guangcong Wang, Peng Wang, Zhaoxi Chen, Wenping Wang, Chen Change Loy, Ziwei Liu
Neural Radiance Field (NeRF) has achieved substantial progress in novel view synthesis given multi-view images. Recently, some works have attempted to train a NeRF from a single image with 3D priors. They mainly focus on a limited field of view with a few occlusions, which greatly limits their scalability to real-world 360-degree panoramic scenarios with large-size occlusions. In this paper, we present PERF, a 360-degree novel view synthesis framework that trains a panoramic neural radiance field from a single panorama...
April 10, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38598388/hybrid-open-set-segmentation-with-synthetic-negative-data
#19
JOURNAL ARTICLE
Matej Grcic, Sinisa Segvic
Open-set segmentation can be conceived by complementing closed-set classification with anomaly detection. Many of the existing dense anomaly detectors operate through generative modelling of regular data or by discriminating with respect to negative data. These two approaches optimize different objectives and therefore exhibit different failure modes. Consequently, we propose a novel anomaly score that fuses generative and discriminative cues. Our score can be implemented by upgrading any closed-set segmentation model with dense estimates of dataset posterior and unnormalized data likelihood...
April 10, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38598387/bayesian-optimization-for-sparse-neural-networks-with-trainable-activation-functions
#20
JOURNAL ARTICLE
Mohamed Fakhfakh, Lotfi Chaari
In the literature on deep neural networks, there is considerable interest in developing activation functions that can enhance neural network performance. In recent years, there has been renewed scientific interest in proposing activation functions that can be trained throughout the learning process, as they appear to improve network performance, especially by reducing overfitting. In this paper, we propose a trainable activation function whose parameters need to be estimated. A fully Bayesian model is developed to automatically estimate from the learning data both the model weights and activation function parameters...
April 10, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence