Extraction and Interpretation of Information from Large-Scale
Hyperspectral Data for Mapping and Monitoring Wetland Ecosystems
Project Summary
New space-based hyperspectral sensors are now acquiring enormous quantities
of high-dimensional data at very high spectral resolution. Each
resulting dataset requires several GBytes of storage and provides far superior
characterization of the chemistry-based spectral response of remotely sensed
areas. Unfortunately, current analysis tools are woefully inadequate for fully
utilizing such rich information to map and monitor changes in ground cover
over extended regions. The overall objective of this project is to develop a
comprehensive system that can efficiently and intelligently extract, analyze
and manage very large hyperspectral datasets used for classifying a large
variety of land covers in environmentally sensitive ecosystems. Scalability
to large areas will be primarily achieved by adapting models developed for
one area to new regions (with somewhat different characteristics) using
knowledge transfer and reuse mechanisms, and only small amounts of additional
labelled data. In parallel, we will build a knowledge repository of
features/classes that can quickly narrow down both the most relevant sensor
data and the class types and hierarchies likely to be pertinent to a given
area. This will substantially reduce data storage
requirements and processing time. Both knowledge transfer and domain
knowledge construction will be facilitated by our recently developed
binary hierarchical classifier (BHC) for multiclass problems. The BHC
recursively partitions classes based on their natural affinities while
simultaneously discovering the most suitable feature space in which to
distinguish between the resulting groups. A key feature of this technique is
the creation of valuable domain knowledge about class hierarchies and relevant
features, which this proposal will exploit to scale to extremely
large datasets from multiple areas.
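The recursive partitioning idea behind the BHC can be illustrated with a minimal sketch. It is not the project's implementation: the splitting rule (2-means on class mean spectra), the per-node Fisher discriminant, and all function names (`split_classes`, `fisher_direction`, `build_bhc`, `predict_one`) are assumptions chosen for illustration, and the data are synthetic.

```python
import numpy as np

def split_classes(means):
    """Partition class labels into two groups via 2-means on the class means."""
    labels = sorted(means)
    M = np.array([means[c] for c in labels])
    dist = np.linalg.norm(M[:, None] - M[None, :], axis=2)
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    if i == j:                       # degenerate case: identical class means
        j = (i + 1) % len(labels)
    centers = M[[i, j]].copy()       # seed with the two most distant classes
    for _ in range(20):
        d = np.linalg.norm(M[:, None] - centers[None, :], axis=2)
        assign = np.argmin(d, axis=1)
        assign[i], assign[j] = 0, 1  # pin the seeds so neither group is empty
        centers = np.array([M[assign == k].mean(axis=0) for k in (0, 1)])
    left = [c for c, a in zip(labels, assign) if a == 0]
    right = [c for c, a in zip(labels, assign) if a == 1]
    return left, right

def fisher_direction(X_left, X_right, ridge=1e-3):
    """Fisher discriminant for two groups; the right group projects higher."""
    mu_l, mu_r = X_left.mean(axis=0), X_right.mean(axis=0)
    Sw = (X_left - mu_l).T @ (X_left - mu_l) + (X_right - mu_r).T @ (X_right - mu_r)
    Sw += ridge * np.eye(Sw.shape[0])            # regularize the within-class scatter
    w = np.linalg.solve(Sw, mu_r - mu_l)
    threshold = 0.5 * (w @ mu_l + w @ mu_r)
    return w, threshold

def build_bhc(X, y):
    """Recursively split the class set; each node gets its own 1-D feature space."""
    classes = sorted(set(y.tolist()))
    if len(classes) == 1:
        return {"leaf": classes[0]}
    means = {c: X[y == c].mean(axis=0) for c in classes}
    left, right = split_classes(means)
    in_left = np.isin(y, left)
    w, t = fisher_direction(X[in_left], X[~in_left])
    return {"w": w, "t": t, "classes": (left, right),
            "left": build_bhc(X[in_left], y[in_left]),
            "right": build_bhc(X[~in_left], y[~in_left])}

def predict_one(node, x):
    """Walk from the root to a leaf, applying each node's discriminant."""
    while "leaf" not in node:
        node = node["right"] if x @ node["w"] > node["t"] else node["left"]
    return node["leaf"]

# Synthetic stand-in for spectra: four classes forming two natural pairs.
rng = np.random.default_rng(0)
n_bands, n_per_class = 10, 50
class_means = np.zeros((4, n_bands))
class_means[1, 0] = 6.0
class_means[2] = 20.0
class_means[3] = 20.0
class_means[3, 0] = 26.0
y = np.repeat(np.arange(4), n_per_class)
X = class_means[y] + rng.normal(size=(len(y), n_bands))

tree = build_bhc(X, y)
pred = np.array([predict_one(tree, x) for x in X])
accuracy = float(np.mean(pred == y))
print(tree["classes"], accuracy)
```

The tree records which classes were grouped together at each node and which projection separates them. That stored structure is the kind of domain knowledge about class hierarchies and relevant features that the summary proposes to reuse across areas.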
Concurrently, single area analysis capabilities will be substantially
enhanced by two approaches to simultaneously deal with the problem of
having little labelled data and a large number of potential classes.
The first adaptively combines highly correlated, adjacent spectral
bands to reduce the feature space by an amount commensurate with the
availability of labelled examples, and incorporates this method within
multicategory classification systems such as the BHC and error-correcting
output codes. It forms the foundation for developing simple, adaptive
soft sensors that can perform band combinations on the fly onboard,
reducing the amount of data that must be downlinked for storage and
processing.
The second adapts semi-supervised learning and active learning concepts
to hyperspectral data analysis within a distributed data mining
framework.
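As an illustration of the first approach, the sketch below greedily averages the most correlated pair of adjacent band groups until the number of features fits a budget tied to the number of labelled samples. The greedy rule, the function name `merge_adjacent_bands`, and the `samples_per_feature` rule of thumb are assumptions made for this example, not the project's published method, and the data are synthetic.

```python
import numpy as np

def merge_adjacent_bands(X, max_features):
    """Average the most correlated adjacent band groups until at most
    max_features remain. Returns the reduced data and the band groups."""
    groups = [[b] for b in range(X.shape[1])]
    feats = [X[:, b].astype(float) for b in range(X.shape[1])]
    while len(feats) > max_features:
        # Absolute correlation between each pair of neighbouring features.
        corr = [abs(np.corrcoef(feats[k], feats[k + 1])[0, 1])
                for k in range(len(feats) - 1)]
        k = int(np.argmax(np.nan_to_num(corr)))  # constant bands yield NaN
        groups[k:k + 2] = [groups[k] + groups[k + 1]]
        # The merged feature is the mean of all original bands in the group.
        feats[k:k + 2] = [X[:, groups[k]].mean(axis=1)]
    return np.column_stack(feats), groups

# Synthetic data: 12 bands in three blocks of 4 strongly correlated bands.
rng = np.random.default_rng(0)
n_labelled = 60
latent = rng.normal(size=(n_labelled, 3))
X = np.repeat(latent, 4, axis=1) + 0.1 * rng.normal(size=(n_labelled, 12))

# Hypothetical budget: allow one feature per 20 labelled samples.
samples_per_feature = 20
max_features = max(1, n_labelled // samples_per_feature)

X_reduced, groups = merge_adjacent_bands(X, max_features)
print(groups, X_reduced.shape)
```

Because only adjacent bands are merged, each output feature corresponds to a contiguous spectral window, which is what would make such a combination simple enough to apply on the fly before downlink.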
Broader Impacts
The proposed research is an interdisciplinary effort that applies data mining
approaches to an emerging large-scale problem domain that demands tight
interaction between data acquisition and processing/analysis.
This domain poses several notable challenges: high-dimensional, spectrally
correlated data with tens of classes,
natural class hierarchies, class characteristics and mixing proportions that change
substantially both in space and time, limited training data but
extremely large amounts of unlabelled areas, and multi-resolution properties.
Although the project focuses on analysis of hyperspectral data for mapping
wetland ecosystems, the research results will also be useful for a wider range
of data mining applications that exhibit some of these characteristics.
Further, knowledge transfer across domains is considered a key technology
for rapidly adapting existing solutions to somewhat different but related
problems, thus substantially increasing the utility of existing point solutions.
The overall framework and analysis capability will also provide a prototype
for management and analysis of global datasets acquired by future
constellations of spaceborne multispectral and hyperspectral sensors.
This project will foster interdisciplinary training of students, as it demands
close interaction with
an applications-oriented user community and with domain experts.
The visual nature of results from analysis of remotely sensed data
makes them a powerful means of introducing the general population to issues of
broad concern, such as the impact of global warming and disaster management.
Several modes of community outreach are identified in this proposal.
People
Schlumberger Centennial Chair Professor, Dept. of Electrical & Computer Engineering, The University of Texas at Austin
Director, IDEAL (Intelligent Data Exploration and Analysis Lab), formerly called Lab for Artificial Neural Systems (LANS)
Professor, Dept. of Civil Engineering, Purdue University
Assistant Dean, Engineering for Interdisciplinary Research
Director, Laboratory for Applications of Remote Sensing
PhD Students:
Suju Rajan, Dept. of Electrical & Computer Engineering, The University of Texas at Austin
Yangchi Chen, Dept. of Operations Research & Industrial Engineering, The University of Texas at Austin
Jisoo Ham, Dept. of Operations Research & Industrial Engineering, The University of Texas at Austin
Publications
Journal Papers
J. T. Morgan, A. Henneguelle, J. Ham, J. Ghosh and M.M. Crawford,
“Adaptive Feature Spaces for Land Cover Classification with Limited Ground
Truth Data”, in Intl. Jl. of Pattern Recognition and Artificial Intelligence,
Spl. Issue on Fusion of Multiple Classifiers, J. Kittler and F. Roli (Eds.),
18(5), Aug 2004, pp. 777-800.
J. Ham, Y. Chen, M. Crawford, and J. Ghosh, "Investigation of the Random Forest
Framework for Classification of Hyperspectral Data," IEEE Trans. on
Geoscience and Remote Sensing, vol. 43, no. 3, pp. 492-501, March 2005.
S. Rajan, J. Ghosh, and M. M. Crawford, "Exploiting Class Hierarchies for
Knowledge Transfer in Hyperspectral Data," IEEE Trans. on Geoscience
and Remote Sensing, vol. 44, no. 11, pp. 3408-3417, 2006.
Conference Papers
M.M. Crawford, J. Ham, Y. Chen, and J. Ghosh, "Random Forests of Binary
Hierarchical Classifiers for Analysis of Hyperspectral Data," Proc.
IEEE Workshop on Advances in Techniques for Analysis of Remotely Sensed
Data, Goddard Space Flight Center, Greenbelt, MD, Oct 27-28,
(publication via CD), 2003.
A. Henneguelle, J. Ghosh, and M.M. Crawford, "Polyline Feature Extraction for
Land Cover Classification Using Hyperspectral Data," Proc. IICAI-2003,
Hyderabad, India, Dec. 18-20, (publication via CD), 2003.
Y. Chen, M.M. Crawford, and J. Ghosh, "Integrating Support Vector Machines in
a Hierarchical Output Decomposition Framework," Proc. 2004 International
Geoscience and Remote Sensing Symposium, Anchorage, Alaska, Sept 20-24,
949-953, 2004.
S. Rajan and J. Ghosh, "An Empirical Comparison of Hierarchical vs.
Two-Level Approaches to Multiclass Problems", in Multiple Classifier Systems,
F. Roli, J. Kittler and T. Windeatt (Eds), LNCS 3077, Springer, pp. 283-292.
Y. Chen, M.M. Crawford, and J. Ghosh, "Applying Nonlinear Manifold
Learning to Hyperspectral Data for Land Cover Classification",
International Geoscience and Remote Sensing Symposium, July 2005.
S. Rajan and J. Ghosh, "Exploiting Class Hierarchies for Knowledge
Transfer in Hyperspectral Data", in Multiple Classifier Systems,
N.C. Oza, R. Polikar, J. Kittler and F. Roli (Eds), LNCS 3541,
Springer, 2005, pp. 417-428.
S. Rajan, J. Ghosh, and M. M. Crawford, "An Active Learning Approach to
Knowledge Transfer for Hyperspectral Data Analysis", Accepted, IEEE
International Geoscience and Remote Sensing Symposium, Colorado, August 2006.
Y. Chen, M. M. Crawford and J. Ghosh, "Improved Nonlinear Manifold Learning
for Land Cover Classification via Intelligent Landmark Selection", Proc. 2006
International Geoscience and Remote Sensing Symposium, Denver, Colorado, USA,
July 31-August 4, 2006.
Data and Software
Data and classification codes can be downloaded from this website.
Dates for the start / end of project
8/1/03 to 7/31/06
This page was last updated on 11 Jan 2007