Attentive Linear Transformation for Image Captioning.

Senmao Ye, Nian Liu, Junwei Han

IEEE Transactions on Image Processing: a Publication of the IEEE Signal Processing Society, 2018 July 13

We propose a novel attention framework called attentive linear transformation (ALT). Instead of learning spatial or channel-wise attention as existing models do, ALT learns to attend to the high-dimensional transformation matrix that maps the image feature space to the context vector space. ALT can therefore learn various relevant feature abstractions, including spatial attention, channel-wise attention, and visual dependence. In addition, we propose a soft threshold regression to predict the attention probabilities for local regions. Soft threshold regression preserves more useful visual information than the popular softmax regression. Extensive experiments on the MS COCO and Flickr30k datasets demonstrate the superiority of our model over other state-of-the-art methods.
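The abstract does not give the formula for soft threshold regression, so the following is only a minimal sketch of the general contrast it draws, not the authors' method. It compares softmax attention, where region weights compete and must sum to 1, with an independent per-region gate. The sigmoid gate, the `threshold` parameter, and all function and variable names are assumptions introduced for illustration.

```python
import math


def softmax_weights(scores):
    """Normalized attention: weights compete with each other and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_weights(scores, threshold=0.0):
    """Hypothetical stand-in for an unnormalized scheme (NOT the paper's formula):
    each region gets its own weight in (0, 1), independent of the others."""
    return [1.0 / (1.0 + math.exp(-(s - threshold))) for s in scores]


def attend(features, weights):
    """Weighted sum of region feature vectors, giving one context vector."""
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


# Three hypothetical image regions with 2-D features.
# The first two regions are equally relevant; the third is not.
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
scores = [4.0, 4.0, -4.0]

soft = softmax_weights(scores)   # relevant regions must share the total mass
gate = gated_weights(scores)     # each relevant region can keep a weight near 1

context_soft = attend(features, soft)
context_gate = attend(features, gate)
```

With these toy scores, the two relevant regions each receive a softmax weight just under one half, while the gate assigns each a weight close to 1. This is one way to read the claim that an unnormalized scheme can preserve more visual information when several regions matter at once.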
