Bringing 3D models together: Mining video liaisons in crowdsourced reconstructions

Ke Wang, Enrique Dunn Rivera, Mikel Rodriguez, Jan-Michael Frahm

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Recent advances in large-scale scene modeling have enabled the automatic 3D reconstruction of landmark sites from crowdsourced photo collections. Here, we address the challenge of leveraging crowdsourced video collections to identify connecting visual observations that enable the alignment and subsequent aggregation of disjoint 3D models. We denote these connecting image sequences as video liaisons and develop a data-driven framework for their fully unsupervised extraction and exploitation. Towards this end, we represent video contents as a histogram over the iconic imagery contained within existing 3D models obtained from a photo collection. We then use this representation to efficiently identify and prioritize the analysis of individual videos within a large-scale video collection, in order to determine camera motion trajectories connecting different landmarks. Results on crowdsourced data illustrate the efficiency and effectiveness of our proposed approach.
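To make the approach described in the abstract concrete, the sketch below illustrates the histogram-based video scoring and prioritization idea under stated assumptions; it is not the authors' implementation. It assumes that a set of iconic images and their landmark (3D model) assignments are already available, and all names (video_histogram, liaison_score, prioritize, match_iconic_images, landmark_of, num_iconic) are hypothetical.

# Minimal sketch (assumed names, not the authors' code): each video is
# summarized as a normalized histogram over the iconic images of existing
# 3D models; videos whose histogram mass spans two or more models are
# ranked first as candidate "video liaisons".
import numpy as np
from collections import Counter

def video_histogram(video_frames, num_iconic, match_iconic_images):
    # Count, per iconic image, how many sampled frames match it.
    counts = Counter()
    for frame in video_frames:
        for iconic_id in match_iconic_images(frame):
            counts[iconic_id] += 1
    hist = np.zeros(num_iconic)
    for iconic_id, c in counts.items():
        hist[iconic_id] = c
    return hist / max(hist.sum(), 1.0)  # normalize to a distribution

def liaison_score(hist, landmark_of):
    # Aggregate histogram mass per landmark, i.e. per disjoint 3D model.
    per_landmark = Counter()
    for iconic_id, mass in enumerate(hist):
        per_landmark[landmark_of[iconic_id]] += mass
    masses = sorted(per_landmark.values(), reverse=True)
    # A video observing a single landmark scores 0; a higher score means
    # substantial coverage of at least two models, i.e. a likely liaison.
    return masses[1] if len(masses) > 1 else 0.0

def prioritize(videos, num_iconic, landmark_of, match_iconic_images):
    # Rank videos so that likely liaisons are analyzed first.
    scored = [(liaison_score(video_histogram(v, num_iconic, match_iconic_images),
                             landmark_of), i) for i, v in enumerate(videos)]
    scored.sort(reverse=True)
    return [i for _, i in scored]

In the setting the abstract describes, match_iconic_images would presumably be backed by a fast image-retrieval step against the iconic images, and the top-ranked videos would then be processed further to recover the camera motion trajectories that connect the corresponding landmark models.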

Language: English (US)
Title of host publication: Computer Vision - 13th Asian Conference on Computer Vision, ACCV 2016, Revised Selected Papers
Editors: Ko Nishino, Shang-Hong Lai, Vincent Lepetit, Yoichi Sato
Publisher: Springer Verlag
Pages: 408-423
Number of pages: 16
ISBN (Print): 9783319541891
DOI: 10.1007/978-3-319-54190-7_25
State: Published - Jan 1 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10114 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Wang, K., Dunn Rivera, E., Rodriguez, M., & Frahm, J-M. (2017). Bringing 3D models together: Mining video liaisons in crowdsourced reconstructions. In K. Nishino, S-H. Lai, V. Lepetit, & Y. Sato (Eds.), Computer Vision - 13th Asian Conference on Computer Vision, ACCV 2016, Revised Selected Papers (pp. 408-423). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10114 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-319-54190-7_25

@inbook{ae462906a0274867af54e4e747ae0faf,
title = "Bringing 3D models together: Mining video liaisons in crowdsourced reconstructions",
abstract = "The recent advances in large-scale scene modeling have enabled the automatic 3D reconstruction of landmark sites from crowdsourced photo collections. Here, we address the challenge of leveraging crowdsourced video collections to identify connecting visual observations that enable the alignment and subsequent aggregation, of disjoint 3D models. We denote these connecting image sequences as video liaisons and develop a data-driven framework for fully unsupervised extraction and exploitation. Towards this end, we represent video contents in terms of a histogram representation of iconic imagery contained within existing 3D models attained from a photo collection. We then use this representation to efficiently identify and prioritize the analysis of individual videos within a large-scale video collection, in an effort to determine camera motion trajectories connecting different landmarks. Results on crowdsourced data illustrate the efficiency and effectiveness of our proposed approach.",
author = "Ke Wang and {Dunn Rivera}, Enrique and Mikel Rodriguez and Jan-Michael Frahm",
year = "2017",
month = "1",
day = "1",
doi = "10.1007/978-3-319-54190-7_25",
language = "English (US)",
isbn = "9783319541891",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "408--423",
editor = "Ko Nishino and Shang-Hong Lai and Vincent Lepetit and Yoichi Sato",
booktitle = "Computer Vision - 13th Asian Conference on Computer Vision, ACCV 2016, Revised Selected Papers",
address = "Germany",

}
