
EarthMarker: A Visual Prompting Multi-modal Large Language Model for Remote Sensing

Official repository for EarthMarker.

Authors: Wei Zhang*, Miaoxin Cai*, Tong Zhang, Yin Zhuang, and Xuerui Mao

  • The authors contributed equally to this work.

📣 News

  • [2025.01.06]: We have released the dataset RSVP! 🔥🔥🔥
  • [2024.12.22]: EarthMarker has been accepted to IEEE TGRS. 🎉
  • [2024.07.19]: The paper for EarthMarker is released on arXiv. 🚀

✨ Introduction

EarthMarker is the first visual prompting MLLM proposed for the remote sensing (RS) domain. It comprehends RS imagery under joint visual and text prompts and flexibly switches among interpretation levels: image, region, and point. More importantly, EarthMarker fills the gap in visual prompting MLLMs for RS, catering to the fine-grained interpretation needs of RS imagery in real-world applications. EarthMarker supports a variety of RS visual tasks, including scene classification, referring object classification, captioning, and relationship analysis, which help inform decisions in real-world applications.
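The idea of switching interpretation levels via a joint visual-text prompt can be sketched as follows. This is only an illustrative sketch: the placeholder tags, class names, and function below are assumptions, not EarthMarker's actual API or prompt format.

```python
from dataclasses import dataclass
from typing import Literal, Optional, Sequence

@dataclass
class VisualPrompt:
    """Hypothetical visual prompt at one of three interpretation levels."""
    level: Literal["image", "region", "point"]
    # Region prompts carry a bounding box (x1, y1, x2, y2);
    # point prompts carry a single (x, y) coordinate.
    coords: Optional[Sequence[float]] = None

def build_query(instruction: str, prompt: VisualPrompt) -> str:
    """Render the visual prompt as a placeholder tag inside the text query
    (the tag syntax here is invented for illustration)."""
    if prompt.level == "image":
        tag = "<image>"
    elif prompt.level == "region":
        tag = f"<region {','.join(map(str, prompt.coords))}>"
    else:
        tag = f"<point {','.join(map(str, prompt.coords))}>"
    return f"{tag} {instruction}"

# Region-level query about a marked object:
q = build_query("Classify the marked object.", VisualPrompt("region", (10, 20, 80, 90)))
print(q)  # <region 10,20,80,90> Classify the marked object.
```

In a real pipeline, the tag would be replaced by (or aligned with) visual features extracted from the marked image area before being fed to the language model.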

✨ RSVP: The First RS Visual Prompting Instruction Dataset

The full RSVP dataset has been released! 🚀 RSVP contains roughly 3.65 M image-point-text and image-region-text pairings.

link1: https://pan.baidu.com/s/1_kMO5bBje7JXTNpxDiCvqg?pwd=gqdb pwd: gqdb

link2: the OneDrive version is being uploaded.
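To make the pairing structure concrete, an image-region-text record might be stored as one JSON object per line. The field names and values below are assumptions for illustration, not the released RSVP schema.

```python
import json

# Illustrative image-region-text pairing (hypothetical schema):
record = {
    "image": "images/airport_001.png",
    "region": [34, 58, 120, 160],  # bounding box in pixel coordinates
    "question": "What is the object in the marked region?",
    "answer": "A parked airplane on the apron.",
}

# Serialize/deserialize as one JSONL line.
line = json.dumps(record)
parsed = json.loads(line)
print(parsed["region"])  # [34, 58, 120, 160]
```

A point-level pairing would look the same, with a single (x, y) coordinate in place of the bounding box.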

🔖 Citation

@article{zhang2024earthmarker,
  title={EarthMarker: A Visual Prompting Multi-modal Large Language Model for Remote Sensing},
  author={Zhang, Wei and Cai, Miaoxin and Zhang, Tong and Zhuang, Yin and Li, Jun and Mao, Xuerui},
  journal={IEEE Transactions on Geoscience and Remote Sensing},
  year={2024},
  publisher={IEEE}
}

📝 Acknowledgment

This project benefits from LLaMA. Thanks for their wonderful work.