Action-Aware Embedding Enhancement for Image-Text Retrieval | Proceedings of the AAAI Conference on Artificial Intelligence | | 6 | 2022 |
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations | International Journal of Computer Vision |
- Science: Mathematics: Instruments and machines: Electronic computers. Computer science
- Technology: Mechanical engineering and machinery
- Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics
- Technology: Engineering (General). Civil engineering (General)
| 998 | 2017 |
Retrieve Fast, Rerank Smart: Cooperative and Joint Approaches for Improved Cross-Modal Retrieval | Transactions of the Association for Computational Linguistics |
- Language and Literature: Philology. Linguistics: Computational linguistics. Natural language processing
- Science: Mathematics: Instruments and machines: Electronic computers. Computer science
- Language and Literature: Philology. Linguistics
| 8 | 2022 |
From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions | Transactions of the Association for Computational Linguistics |
- Language and Literature: Philology. Linguistics: Computational linguistics. Natural language processing
- Science: Mathematics: Instruments and machines: Electronic computers. Computer science
- Language and Literature: Philology. Linguistics
| 480 | 2014 |
P2T: pyramid pooling transformer for scene understanding | | | | 2022 |