https://read.qxmd.com/read/38683713/consistency-aware-anchor-pyramid-network-for-crowd-localization
#21
JOURNAL ARTICLE
Xinyan Liu, Guorong Li, Yuankai Qi, Zhenjun Han, Anton van den Hengel, Nicu Sebe, Ming-Hsuan Yang, Qingming Huang
Crowd localization aims to predict the positions of humans in images of crowded scenes. While existing methods have made significant progress, two primary challenges remain: (i) a fixed number of evenly distributed anchors can cause excessive or insufficient predictions across regions in an image with varying crowd densities, and (ii) ranking inconsistency of predictions between the testing and training phases leads to the model being sub-optimal in inference. To address these issues, we propose a Consistency-Aware Anchor Pyramid Network (CAAPN) comprising two key components: an Adaptive Anchor Generator (AAG) and a Localizer with Augmented Matching (LAM)...
April 29, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38683711/a-versatile-framework-for-multi-scene-person-re-identification
#22
JOURNAL ARTICLE
Wei-Shi Zheng, Junkai Yan, Yi-Xing Peng
Person Re-identification (ReID) has been extensively developed for a decade in order to learn the association of images of the same person across non-overlapping camera views. To overcome significant variations between images across camera views, mountains of variants of ReID models were developed for solving a number of challenges, such as resolution change, clothing change, occlusion, modality change, and so on. Despite the impressive performance of many ReID variants, these variants typically function distinctly and cannot be applied to other challenges...
April 29, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38676184/adopting-graph-neural-networks-to-analyze-human-object-interactions-for-inferring-activities-of-daily-living
#23
JOURNAL ARTICLE
Peng Su, Dejiu Chen
Human Activity Recognition (HAR) refers to a field that aims to identify human activities by adopting multiple techniques. In this field, different applications, such as smart homes and assistive robots, are introduced to support individuals in their Activities of Daily Living (ADL) by analyzing data collected from various sensors. Apart from wearable sensors, the adoption of camera frames to analyze and classify ADL has emerged as a promising trend for achieving the identification and classification of ADL...
April 17, 2024: Sensors
https://read.qxmd.com/read/38669165/structure-guided-image-completion-with-image-level-and-object-level-semantic-discriminators
#24
JOURNAL ARTICLE
Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Eli Shechtman, Connelly Barnes, Jianming Zhang, Qing Liu, Sohrab Amirghodsi, Yuqian Zhou, Jiebo Luo
Structure-guided image completion aims to inpaint a local region of an image according to an input guidance map from users. While such a task enables many practical applications for interactive editing, existing methods often struggle to hallucinate realistic object instances in complex natural scenes. Such a limitation is partially due to the lack of semantic-level constraints inside the hole region as well as the lack of a mechanism to enforce realistic object generation. In this work, we propose a learning paradigm that consists of semantic discriminators and object-level discriminators for improving the generation of complex semantics and objects...
April 26, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38664866/probing-the-content-of-affective-semantic-memory-following-caregiving-related-early-adversity
#25
JOURNAL ARTICLE
Anna Vannucci, Andrea Fields, Paul A Bloom, Nicolas L Camacho, Tricia Choy, Amaesha Durazi, Syntia Hadis, Chelsea Harmon, Charlotte Heleniak, Michelle VanTieghem, Mary Dozier, Michael P Milham, Simona Ghetti, Nim Tottenham
Cognitive science has demonstrated that we construct knowledge about the world by abstracting patterns from routinely encountered experiences and storing them as semantic memories. This preregistered study tested the hypothesis that caregiving-related early adversities (crEAs) shape affective semantic memories to reflect the content of those adverse interpersonal-affective experiences. We also tested the hypothesis that because affective semantic memories may continue to evolve in response to later-occurring positive experiences, child-perceived attachment security will inform their content...
April 25, 2024: Developmental Science
https://read.qxmd.com/read/38662662/accuracy-and-efficiency-stereo-matching-network-with-adaptive-feature-modulation
#26
JOURNAL ARTICLE
Sen Lin, Xinxin Zhuo, Baozhen Qi
Feature enhancement plays a crucial role in improving the quality and discriminative power of features used in matching tasks. By enhancing the informative and invariant aspects of features, the matching process becomes more robust and reliable, enabling accurate predictions even in challenging scenarios, such as occlusion and reflection in stereo matching. In this paper, we propose an end-to-end dual-dimension feature modulation network called DFMNet to address the issue of mismatches in interference areas...
2024: PloS One
https://read.qxmd.com/read/38662568/enhancing-video-language-representations-with-structural-spatio-temporal-alignment
#27
JOURNAL ARTICLE
Hao Fei, Shengqiong Wu, Meishan Zhang, Min Zhang, Tat-Seng Chua, Shuicheng Yan
While pre-training large-scale video-language models (VLMs) has shown remarkable potential for various downstream video-language tasks, existing VLMs can still suffer from certain commonly seen limitations, e.g., coarse-grained cross-modal aligning, under-modeling of temporal dynamics, detached video-language view. In this work, we target enhancing VLMs with a fine-grained structural spatio-temporal alignment learning method (namely Finsta). First of all, we represent the input texts and videos with fine-grained scene graph (SG) structures, both of which are further unified into a holistic SG (HSG) for bridging two modalities...
April 25, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38661636/the-time-course-of-encoding-specific-and-gist-episodic-memory-representations-among-young-and-older-adults
#28
JOURNAL ARTICLE
Nathaniel R Greene, Moshe Naveh-Benjamin
How rapidly can we encode the specifics versus the gist of episodic memories? Competing theories have opposing answers, but empirical tests are based primarily on tasks of item memory. Few studies have addressed this question with tasks measuring the binding of event components (e.g., a person and a location), which forms the core of episodic memory. None of these prior studies included older adults, whose episodic memories are less specific in nature. We addressed this critical gap by presenting face-scene pairs (e...
April 25, 2024: Journal of Experimental Psychology. General
https://read.qxmd.com/read/38656855/neuralrecon-real-time-coherent-3d-scene-reconstruction-from-monocular-video
#29
JOURNAL ARTICLE
Xi Chen, Jiaming Sun, Yiming Xie, Hujun Bao, Xiaowei Zhou
We present a novel framework named NeuralRecon for real-time 3D scene reconstruction from a monocular video. Unlike previous methods that estimate single-view depth maps separately on each key-frame and fuse them later, we propose to directly reconstruct local surfaces represented as sparse TSDF volumes for each video fragment sequentially by a neural network. A learning-based TSDF fusion module based on gated recurrent units is used to guide the network to fuse features from previous fragments. This design allows the network to capture local smoothness prior and global shape prior of 3D surfaces when sequentially reconstructing the surfaces, resulting in accurate, coherent, and real-time surface reconstruction...
April 24, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38652604/exploring-the-semantic-inconsistency-effect-in-scenes-using-a-continuous-measure-of-linguistic-semantic-similarity
#30
JOURNAL ARTICLE
Claudia Damiano, Maarten Leemans, Johan Wagemans
Viewers use contextual information to visually explore complex scenes. Object recognition is facilitated by exploiting object-scene relations (which objects are expected in a given scene) and object-object relations (which objects are expected because of the occurrence of other objects). Semantically inconsistent objects deviate from these expectations, so they tend to capture viewers' attention (the semantic-inconsistency effect). Some objects fit the identity of a scene more or less than others, yet semantic inconsistencies have hitherto been operationalized as binary (consistent vs...
April 23, 2024: Psychological Science
https://read.qxmd.com/read/38648137/deepmesh-differentiable-iso-surface-extraction
#31
JOURNAL ARTICLE
Benoit Guillard, Edoardo Remelli, Artem Lukoianov, Pierre Yvernay, Stephan R Richter, Timur Bagautdinov, Pierre Baque, Pascal Fua
Geometric Deep Learning has recently made striking progress with the advent of continuous deep implicit fields. They allow for detailed modeling of watertight surfaces of arbitrary topology while not relying on a 3D Euclidean grid, resulting in a learnable parameterization that is unlimited in resolution. Unfortunately, these methods are often unsuitable for applications that require an explicit mesh-based surface representation because converting an implicit field to such a representation relies on the Marching Cubes algorithm, which cannot be differentiated with respect to the underlying implicit field...
April 22, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38641188/a-multi-featured-expression-recognition-model-incorporating-attention-mechanism-and-object-detection-structure-for-psychological-problem-diagnosis
#32
JOURNAL ARTICLE
Xiufeng Zhang, Bingyi Li, Guobin Qi
Expression is the main method for judging the emotional state and psychological condition of the human body, and the prediction of changes in facial expressions can effectively determine the mental health of a person, thus avoiding serious psychological or psychiatric disorders due to early negligence. From a computer vision perspective, most researchers have focused on studying facial expression analysis, and in some cases, body posture is also considered. However, their performance is more limited under unconstrained natural conditions, which requires more information to be used in human emotion analysis...
April 18, 2024: Physiology & Behavior
https://read.qxmd.com/read/38637993/macaque-claustrum-pulvinar-and-putative-dorsolateral-amygdala-support-the-cross-modal-association-of-social-audio-visual-stimuli-based-on-meaning
#33
JOURNAL ARTICLE
Mathilda Froesel, Maëva Gacoin, Simon Clavagnier, Marc Hauser, Quentin Goudard, Suliann Ben Hamed
Social communication draws on several cognitive functions such as perception, emotion recognition and attention. The association of audio-visual information is essential to the processing of species-specific communication signals. In this study, we use functional magnetic resonance imaging in order to identify the subcortical areas involved in the cross-modal association of visual and auditory information based on their common social meaning. We identified three subcortical regions involved in audio-visual processing of species-specific communicative signals: the dorsolateral amygdala, the claustrum and the pulvinar...
April 18, 2024: European Journal of Neuroscience
https://read.qxmd.com/read/38637870/beyond-visual-integration-sensitivity-of-the-temporal-parietal-junction-for-objects-places-and-faces
#34
JOURNAL ARTICLE
Johannes Rennig, Christina Langenberger, Hans-Otto Karnath
One important role of the TPJ is the contribution to perception of the global gist in hierarchically organized stimuli where individual elements create a global visual percept. However, the link between clinical findings in simultanagnosia and neuroimaging in healthy subjects is missing for real-world global stimuli, like visual scenes. It is well-known that hierarchical, global stimuli activate TPJ regions and that simultanagnosia patients show deficits during the recognition of hierarchical stimuli and real-world visual scenes...
April 18, 2024: Behavioral and Brain Functions: BBF
https://read.qxmd.com/read/38636427/mediating-sequential-turn-on-and-turn-off-fluorescence-signals-for-discriminative-detection-of-ag-and-hg-2-via-readily-available-cdse-quantum-dots
#35
JOURNAL ARTICLE
Rong Wang, Zi Yi Xu, Ting Li, Nian Bing Li, Hong Qun Luo
Realizing the accurate recognition and quantification of heavy metal ions is pivotal but challenging in the environmental, biological, and physiological science fields. In this work, orange fluorescence emitting quantum dots (OQDs) have been facilely synthesized by a one-step method. The participation of silver ion (Ag+) can evoke the unique aggregation-induced emission (AIE) of OQDs, resulting in prominent fluorescence enhancement, which has scarcely been reported previously. Moreover, the Ag+-triggered turn-on fluorescence can be continuously shut down by mercury ion (Hg2+)...
April 13, 2024: Spectrochimica Acta. Part A, Molecular and Biomolecular Spectroscopy
https://read.qxmd.com/read/38633658/multiscale-apple-recognition-method-based-on-improved-centernet
#36
JOURNAL ARTICLE
Han Zhou
Traditional apple-picking robots are unable to detect apples in real-time in complex environments. In order to improve detection efficiency, a fast CenterNet apple recognition method for multiple apple targets in dense scenes is proposed. This method can quickly and accurately identify multiple apple targets in dense scenes. The backbone network mainly consists of resnet-44 fully convolutional network, region of interest network (RPN), and region of interest (ROI). The experimental results show that the improved YoloV5 network model has a higher recognition accuracy of 94...
April 15, 2024: Heliyon
https://read.qxmd.com/read/38626617/weakly-supervised-temporal-action-localization-with-actionness-guided-false-positive-suppression
#37
JOURNAL ARTICLE
Zhilin Li, Zilei Wang, Qinying Liu
Weakly supervised temporal action localization aims to locate the temporal boundaries of action instances in untrimmed videos using video-level labels and assign them the corresponding action category. Generally, it is solved by a pipeline called "localization-by-classification", which finds the action instances by classifying video snippets. However, since this approach optimizes the video-level classification objective, the generated activation sequences often suffer interference from class-related scenes, resulting in a large number of false positives in the prediction results...
April 15, 2024: Neural Networks: the Official Journal of the International Neural Network Society
https://read.qxmd.com/read/38625775/fast-building-instance-proxy-reconstruction-for-large-urban-scenes
#38
JOURNAL ARTICLE
Jianwei Guo, Haobo Qin, Yinchang Zhou, Xin Chen, Liangliang Nan, Hui Huang
Digitalization of large-scale urban scenes (in particular buildings) has been a long-standing open problem, which is attributable to the challenges in data acquisition, such as incomplete scene coverage, lack of semantics, low efficiency, and low reliability in path planning. In this paper, we address these challenges in urban building reconstruction from aerial images, and we propose an effective workflow and a few novel algorithms for efficient 3D building instance proxy reconstruction for large urban scenes. Specifically, we propose a novel learning-based approach to instance segmentation of urban buildings from aerial images followed by a voting-based algorithm to fuse the multi-view instance information to a sparse point cloud (reconstructed using a standard Structure from Motion pipeline)...
April 16, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38625774/bridging-visual-and-textual-semantics-towards-consistency-for-unbiased-scene-graph-generation
#39
JOURNAL ARTICLE
Ruonan Zhang, Gaoyun An, Yiqing Hao, Dapeng Oliver Wu
Scene Graph Generation (SGG) aims to detect visual relationships in an image. However, due to long-tailed bias, SGG is far from practical. Most methods depend heavily on the assistance of statistical co-occurrence to generate a balanced dataset, so they are dataset-specific and easily affected by noise. The fundamental cause is that SGG is simplified as a classification task instead of a reasoning task, so the ability to capture fine-grained details is limited and the difficulty in handling ambiguity is increased...
April 16, 2024: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://read.qxmd.com/read/38616235/figure-ground-segmentation-based-on-motion-in-the-archerfish
#40
JOURNAL ARTICLE
Svetlana Volotsky, Ronen Segev
Figure-ground segmentation is a fundamental process in visual perception that involves separating visual stimuli into distinct meaningful objects and their surrounding context, thus allowing the brain to interpret and understand complex visual scenes. Mammals exhibit varying figure-ground segmentation capabilities, ranging from primates that can perform well on figure-ground segmentation tasks to rodents that perform poorly. To explore figure-ground segmentation capabilities in teleost fish, we studied how the archerfish, an expert visual hunter, performs figure-ground segmentation...
April 15, 2024: Animal Cognition
