IEEE Transactions on Pattern Analysis and Machine Intelligence

Wenhan Luo, Peng Sun, Fangwei Zhong, Wei Liu, Tong Zhang, Yizhou Wang
We study active object tracking, where a tracker takes visual observations (i.e., frame sequences) as inputs and produces the corresponding camera control signals as outputs (e.g., move forward, turn left, etc.). Conventional methods tackle tracking and camera control tasks separately, and the resulting system is difficult to tune jointly. Such an approach also requires significant human effort for image labeling and expensive trial-and-error system tuning in the real world. To address these issues, we propose, in this paper, an end-to-end solution via deep reinforcement learning...
February 14, 2019: IEEE Transactions on Pattern Analysis and Machine Intelligence
Jun Liu, Amir Shahroudy, Gang Wang, Ling-Yu Duan, Alex Kot Chichung
Action prediction aims to recognize the class label of an ongoing activity when only a part of it is observed. In this paper, we focus on online action prediction in streaming 3D skeleton sequences. A dilated convolutional network is introduced to model the motion dynamics in the temporal dimension via a sliding window over the temporal axis. Since there are significant temporal scale variations in the observed part of the ongoing action at different time steps, a novel window scale selection method is proposed to make our network focus on the performed part of the ongoing action and try to suppress the possible incoming interference from the previous actions at each step...
February 12, 2019: IEEE Transactions on Pattern Analysis and Machine Intelligence
Shubham Tulsiani, Tinghui Zhou, Alyosha Efros, Jitendra Malik
We study the notion of consistency between a 3D shape and a 2D observation and propose a differentiable formulation which allows computing gradients of the 3D shape given an observation from an arbitrary view. We do so by reformulating view consistency using a differentiable ray consistency (DRC) term. We show that this formulation can be incorporated in a learning framework to leverage different types of multi-view observations e.g. foreground masks, depth, color images, semantics etc. as supervision for learning single-view 3D prediction...
February 12, 2019: IEEE Transactions on Pattern Analysis and Machine Intelligence
Fang Wan, Pengxu Wei, Zhenjun Han, Jianbin Jiao, Qixiang Ye
Weakly supervised object detection is a challenging task when provided with image category supervision but required to learn, at the same time, object locations and object detectors. The inconsistency between the weak supervision and learning objectives introduces significant randomness to object locations and ambiguity to detectors. In this paper, a min-entropy latent model (MELM) is proposed for weakly supervised object detection. Min-entropy serves as a model to learn object locations and a metric to measure the randomness of object localization during learning...
February 12, 2019: IEEE Transactions on Pattern Analysis and Machine Intelligence
Jovisa Zunic, Paul Rosin
In this paper we have developed a family of shape measures. All the measures from the family evaluate the degree to which a shape looks like a predefined convex polygon. A fairly new approach to designing object shape-based measures has been applied. In most cases, such measures have been defined by exploiting some shape properties. An illustrative example might be the shape circularity measure derived by exploiting the well-known result that the circle has the largest area among all the shapes with the same perimeter...
February 12, 2019: IEEE Transactions on Pattern Analysis and Machine Intelligence
Jie Shi, Yalin Wang
Shape space is an active research topic in computer vision and medical imaging fields. The distance defined in a shape space may provide a simple and refined index to represent a unique shape. This work studies the Wasserstein space and proposes a novel framework to compute the Wasserstein distance between general topological surfaces by integrating hyperbolic Ricci flow, hyperbolic harmonic map, and hyperbolic power Voronoi diagram algorithms. The resulting hyperbolic Wasserstein distance can intrinsically measure the similarity between general topological surfaces...
February 8, 2019: IEEE Transactions on Pattern Analysis and Machine Intelligence
Markus Braun, Sebastian Krebs, Fabian Flohr, Dariu Gavrila
Big data has had a great share in the success of deep learning in computer vision. Recent works suggest that there is significant further potential to increase object detection performance by utilizing even bigger datasets. In this paper, we introduce the EuroCity Persons dataset, which provides a large number of highly diverse, accurate and detailed annotations of pedestrians, cyclists and other riders in urban traffic scenes. The images for this dataset were collected on-board a moving vehicle in 31 cities of 12 European countries...
February 5, 2019: IEEE Transactions on Pattern Analysis and Machine Intelligence
Pengfei Zhang, Cuiling Lan, Junliang Xing, Wenjun Zeng, Jianru Xue, Nanning Zheng
Skeleton-based human action recognition has recently attracted increasing attention thanks to the accessibility and the popularity of 3D skeleton data. One of the key challenges in skeleton-based action recognition lies in the large view variations when capturing data. In order to alleviate the effects of view variations, this paper introduces a novel view adaptation scheme, which automatically determines the virtual observation viewpoints in a learning based data driven manner. We design two view adaptive neural networks, i...
January 31, 2019: IEEE Transactions on Pattern Analysis and Machine Intelligence
Christian Hane, Shubham Tulsiani, Jitendra Malik
Recently, Convolutional Neural Networks have shown promising results for 3D geometry prediction. They can make predictions from very little input data such as a single color image. A major limitation of such approaches is that they only predict a coarse resolution voxel grid, which does not capture the surface of the objects well. We propose a general framework, called hierarchical surface prediction (HSP), which facilitates prediction of high resolution voxel grids. The main insight is that it is sufficient to predict high resolution voxels around the predicted surfaces...
January 30, 2019: IEEE Transactions on Pattern Analysis and Machine Intelligence
Yongqiao Wang, Lishuai Li, Chuangyin Dang
In many real-world classification problems, accurate prediction of membership probabilities is critical for further decision making. The probability calibration problem studies how to map scores obtained from one classification algorithm to membership probabilities. The requirement of non-decreasingness for this mapping involves an infinite number of inequality constraints, which makes its estimation computationally intractable. Because of this difficulty, existing methods have failed to achieve four desiderata of probability calibration: universal flexibility, non-decreasingness, continuousness and computational tractability...
January 28, 2019: IEEE Transactions on Pattern Analysis and Machine Intelligence
Wenhan Yang, Robby T Tan, Jiashi Feng, Jiaying Liu, Shuicheng Yan, Zongming Guo
Rain streaks, particularly in heavy rain, not only degrade visibility but also make many computer vision algorithms fail to function properly. In this paper, we address this visibility problem by focusing on single-image rain removal, even in the presence of dense rain streaks and rain-streak accumulation, which is visually similar to mist or fog. To achieve this, we introduce a new rain model and a deep learning architecture. Our rain model incorporates a binary rain map indicating rain-streak regions, and accommodates various shapes, directions, and sizes of overlapping rain streaks, as well as rain accumulation, to model heavy rain...
January 28, 2019: IEEE Transactions on Pattern Analysis and Machine Intelligence
Mahdi M Kalayeh, Mubarak Shah
Batch Normalization (BN) is essential to effectively train state-of-the-art deep Convolutional Neural Networks (CNN). It normalizes the layer outputs during training using the statistics of each mini-batch. BN accelerates the training procedure by allowing large learning rates to be used safely and alleviates the need for careful initialization of the parameters. In this work, we study BN from the viewpoint of Fisher kernels that arise from generative probability models. We show that, assuming samples within a mini-batch are from the same probability density function, BN is identical to the Fisher vector of a Gaussian distribution...
January 28, 2019: IEEE Transactions on Pattern Analysis and Machine Intelligence
Xinwang Liu, Lei Wang, Xinzhong Zhu, Miaomiao Li, En Zhu, Tongliang Liu, Li Liu, Yong Dou, Jianping Yin
Multiple kernel learning (MKL) has been intensively studied during the past decade. It optimally combines the multiple channels of each sample to improve classification performance. However, existing MKL algorithms cannot effectively handle the situation where some channels of the samples are missing, which is not uncommon in practical applications. This paper proposes three absent MKL (AMKL) algorithms to address this issue. Different from existing approaches where missing channels are firstly imputed and then a standard MKL algorithm is deployed on the imputed data, our algorithms directly classify each sample based on its observed channels, without performing imputation...
January 28, 2019: IEEE Transactions on Pattern Analysis and Machine Intelligence
Dongqing Zou, Xiaowu Chen, Guangying Cao, Xiaogang Wang
A novel method, unsupervised video matting via sparse and low-rank representation, is proposed which can achieve high quality in a variety of challenging examples featuring illumination changes, feature ambiguity, topology changes, transparency variation, dis-occlusion, fast motion and motion blur. Previous matting methods introduced a nonlocal prior to search samples for estimating the alpha matte, and have achieved impressive results on some data. However, on one hand, searching inadequate or excessive samples may miss good samples or introduce noise; on the other hand, it is difficult to construct consistent nonlocal structures for pixels with similar features, yielding inconsistent video matte...
January 25, 2019: IEEE Transactions on Pattern Analysis and Machine Intelligence
Jun Liu, Henghui Ding, Amir Shahroudy, Ling-Yu Duan, Xudong Jiang, Gang Wang, Alex Kot Chichung
In this paper, a feature boosting network is proposed for estimating 3D hand pose and 3D body pose from a single RGB image. In this method, the features learned by the convolutional layers are boosted with a new long short-term dependence-aware (LSTD) module, which enables the intermediate convolutional feature maps to perceive the graphical long short-term dependency among different hand (or body) parts using the designed Graphical ConvLSTM. Learning a set of features that are reliable and discriminatively representative of the pose of a hand (or body) part is difficult due to the ambiguities, texture and illumination variation, and self-occlusion in the real application of 3D pose estimation...
January 22, 2019: IEEE Transactions on Pattern Analysis and Machine Intelligence
Cenek Albl, Zuzana Kukelova, Viktor Larsson, Tomas Pajdla
We present minimal, non-iterative solutions to the absolute pose problem for images from rolling shutter cameras. The absolute pose problem is a key problem in computer vision and rolling shutter is present in a vast majority of today's digital cameras. We discuss several camera motion models and propose two feasible rolling shutter camera models for a polynomial solver. In previous work a linearized camera model was used that required an initial estimate of the camera orientation. We show how to simplify the system of equations and make this solver faster...
January 22, 2019: IEEE Transactions on Pattern Analysis and Machine Intelligence
Deqing Sun, Xiaodong Yang, Ming-Yu Liu, Jan Kautz
We investigate two crucial and closely related aspects of CNNs for optical flow estimation: models and training. First, we design a compact but effective CNN model, called PWC-Net, according to simple and well-established principles: pyramidal processing, warping, and cost volume processing. PWC-Net is 17 times smaller in size, 2 times faster in inference, and 11% more accurate on Sintel final than the recent FlowNet2 model. It is the winning entry in the optical flow competition of the robust vision challenge...
January 22, 2019: IEEE Transactions on Pattern Analysis and Machine Intelligence
Lianli Gao, Xiangpeng Li, Jingkuan Song, Heng Tao Shen
Recent progress has been made in using attention-based encoder-decoder frameworks for image and video captioning. Most existing decoders apply the attention mechanism to every generated word including both visual words (e.g., "gun" and "shooting") and non-visual words (e.g., "the", "a"). However, these non-visual words can be easily predicted using a natural language model without considering visual signals or attention. Imposing the attention mechanism on non-visual words could mislead and decrease the overall performance of visual captioning...
January 21, 2019: IEEE Transactions on Pattern Analysis and Machine Intelligence
Shuhui Jiang, Zhengming Ding, Yun Fu
Real-world recommenders usually make use of heterogeneous types of user feedback, for example, binary ratings such as likes and dislikes and numerical ratings such as 5-star grades. In this work, we focus on transferring knowledge from binary ratings to numerical ratings, which face a more serious data sparsity problem. Conventional Collective Factorization methods usually assume that multiple domains share some common latent information across users and items. However, related domains may also share some knowledge of rating patterns...
January 21, 2019: IEEE Transactions on Pattern Analysis and Machine Intelligence
Reuben Farrugia, Christine Guillemot
Light field imaging has recently seen renewed interest due to the availability of practical light field capturing systems that offer a wide range of applications in the field of computer vision. However, capturing high-resolution light fields remains technologically challenging since the increase in angular resolution is often accompanied by a significant reduction in spatial resolution. This paper describes a learning-based spatial light field super-resolution method that allows the restoration of the entire light field with consistency across all angular views...
January 21, 2019: IEEE Transactions on Pattern Analysis and Machine Intelligence