Abstract
Prostate cancer accounts for one-third of noncutaneous cancers diagnosed in US men and is a leading cause of cancer-related death. Advances in Fourier transform infrared spectroscopic imaging now provide very large data sets describing both the structural and local chemical properties of cells within prostate tissue. Uniting spectroscopic imaging data and computer-aided diagnosis (CADx), our long-term goal is to provide a new approach to pathology by automating the recognition of cancer in complex tissue. The first step toward the creation of such CADx tools requires mechanisms for automatically learning to classify tissue types, a key step in the diagnostic process. Here we demonstrate that genetics-based machine learning (GBML) can be used to approach such a problem. However, analyzing this problem requires efficient and scalable GBML implementations that are able to process very large data sets. In this paper, we propose and validate an efficient GBML technique, \({\tt NAX}\), based on an incremental genetics-based rule learner. \({\tt NAX}\) exploits massive parallelism via the message passing interface (MPI) and efficient rule matching using hardware-implemented operations. Results demonstrate that \({\tt NAX}\) is capable of performing prostate tissue classification efficiently, making a compelling case for using GBML implementations as efficient and powerful tools for biomedical image processing.
Acknowledgments
We would like to thank David E. Goldberg for his continual support and encouragement, allowing us to have access to the IlliGAL resources. Thanks also to Kumara Sastry for hallway discussions and to the Automated Learning Group and the Data-Intensive Technologies and Applications at the National Center for Supercomputing Applications for hosting this joint collaboration.
This work was sponsored by the Air Force Office of Scientific Research, Air Force Materiel Command, USAF, under grant FA9550-06-1-0370, the National Science Foundation under grant IIS-02-09199, and the National Institutes of Health. The US Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation thereon.
The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Air Force Office of Scientific Research, the National Science Foundation, or the US Government.
Rohit Bhargava would like to acknowledge collaborators over the years, especially Dr. Stephen M. Hewitt and Dr. Ira W. Levin of the National Institutes of Health, for numerous useful discussions and guidance. Funding for this work was provided in part by the University of Illinois Research Board and by the Department of Defense Prostate Cancer Research Program. This work was also funded in part by the National Center for Supercomputing Applications and the University of Illinois, under the auspices of the NCSA/UIUC faculty fellows program.
Cite this article
Llorà, X., Priya, A. & Bhargava, R. Observer-invariant histopathology using genetics-based machine learning. Nat Comput 8, 101–120 (2009). https://doi.org/10.1007/s11047-007-9056-6
DOI: https://doi.org/10.1007/s11047-007-9056-6