Abstract
Digital imaging plays a critical role in image-guided diagnosis and clinical trials, and the volume of image data is growing fast. Image data management has two major requirements: scalability to massive data volumes and support for comprehensive queries. Traditional Picture Archiving and Communication Systems (PACS) are based on relational data management systems and suffer from limited scalability and query support. New systems that support fast, scalable and comprehensive queries on image data are therefore in high demand. In this paper, we introduce two alternative approaches: DCMRL/XMLStore (RL/XML for short), a parallel, hybrid relational and XML data management approach, and DCMDocStore (DOC for short), a NoSQL document store approach. DCMRL/XMLStore manages DICOM images as binary large objects and metadata as relational tables and XML documents in IBM DB2, parallelized through data partitioning. DCMDocStore manages DICOM metadata as JSON objects and DICOM images as encoded attachments in MongoDB running on multiple nodes. We have released both systems as open source. Both support scalable data management and comprehensive queries. We evaluated them with nearly one million DICOM images from the National Biomedical Imaging Archive. The results show that DCMDocStore delivers high data loading speed, high scalability and fault tolerance, while DCMRL/XMLStore provides efficient queries but loads data more slowly. Traditional PACS have inherent limitations in flexible querying and in scaling to massive numbers of images.
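To make the document store idea concrete, the following is a minimal sketch of how DICOM metadata and image bytes might be combined into a single JSON document. It is not the DCMDocStore implementation. The function names, the `metadata` and `attachment` field names, and the sample header values are hypothetical, and base64 is assumed only as one example of a text-safe encoding, since the abstract does not name the encoding used. No MongoDB calls are made.

```python
import base64
import json


def build_document(metadata: dict, pixel_bytes: bytes) -> dict:
    """Combine DICOM header fields and image bytes into one JSON-ready dict."""
    return {
        # Header attributes stay as plain key/value pairs so they remain queryable.
        "metadata": dict(metadata),
        # Assumption: base64 stands in for whatever encoding the real system uses.
        "attachment": base64.b64encode(pixel_bytes).decode("ascii"),
    }


def decode_attachment(document: dict) -> bytes:
    """Recover the original image bytes from the encoded attachment."""
    return base64.b64decode(document["attachment"])


# Hypothetical header values and a tiny stand-in for image data.
doc = build_document(
    {"Modality": "CT", "Rows": 512, "Columns": 512},
    b"\x00\x01\x02\x03",
)
serialized = json.dumps(doc)       # text form a document store could ingest
restored = json.loads(serialized)  # round trip back to a dict
```

Keeping the header fields as ordinary JSON keys is what would let a document store query them directly, while the encoded attachment travels with the same record.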
Acknowledgements
Funding was provided by the National Science Foundation (Grant nos. ACI 1443054 and IIS 1350885) and the National Institutes of Health (Grant no. K25CA181503).
Cite this article
Teng, D., Kong, J. & Wang, F. Scalable and flexible management of medical image big data. Distrib Parallel Databases 37, 235–250 (2019). https://doi.org/10.1007/s10619-018-7230-8