Abstract
The research activities of the Database Group (DBGroup, www.dbgroup.unimore.it) and the Information System Group (ISGroup, www.isgroup.unimore.it) have been mainly devoted to the Data Integration Research Area. The DBGroup designed and developed the MOMIS data integration system, giving rise to a successful, innovative enterprise, DataRiver (www.datariver.it), which distributes MOMIS as open source. MOMIS provides integrated access to structured and semistructured data sources and allows a user to pose a single query and receive a single unified answer. Description logics, automatic annotation of schemata, and clustering techniques constitute its theoretical framework. In the context of data integration, the ISGroup addressed problems related to the management and querying of heterogeneous data sources in large-scale and dynamic scenarios. The reference architectures are Peer Data Management Systems and their evolution toward dataspaces. In these contexts, the ISGroup proposed and evaluated effective and efficient mechanisms for network creation with limited information loss, as well as solutions for mapping management, query reformulation and processing, and query routing. The main issues of data integration have been addressed within European and national projects: automatic annotation, mapping discovery, global query processing, provenance, multidimensional information integration, and keyword search. With the emerging requirements of integrating linked open data, textual data, and multimedia data in a big data scenario, the research has since been devoted to the Big Data Integration Research Area. In particular, the most relevant research results achieved are a scalable entity resolution method, a scalable join operator, and a tool, LODEX, for automatically extracting metadata from Linked Open Data (LOD) resources and for visual query formulation on LOD resources. Moreover, in collaboration with DataRiver, data integration was successfully applied to smart e-health.
Notes
1. NORMS will be included in the next release of the MOMIS Open Source version, available at http://www.datariver.it/data-integration/momis/.
3. For a complete description see http://dbgroup.ing.unimore.it/MomisDashboard.
Copyright information
© 2018 Springer International Publishing AG
About this chapter
Cite this chapter
Bergamaschi, S. et al. (2018). From Data Integration to Big Data Integration. In: Flesca, S., Greco, S., Masciari, E., Saccà, D. (eds) A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years. Studies in Big Data, vol 31. Springer, Cham. https://doi.org/10.1007/978-3-319-61893-7_3
DOI: https://doi.org/10.1007/978-3-319-61893-7_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-61892-0
Online ISBN: 978-3-319-61893-7
eBook Packages: Engineering, Engineering (R0)