Abstract
Data, rather than functionality, is the source of competitive advantage for Web2.0 applications such as wikis, blogs and social networking websites. This valuable information might need to be capitalized on by third-party applications, or be subject to migration or data analysis. Model-Driven Engineering (MDE) can be used for these purposes, but it first requires obtaining models from the wiki/blog/website database (a.k.a. model harvesting). This can be achieved through SQL scripts embedded in a program. However, this approach leads to laborious code that exposes the iterations and table joins used to build the model. By contrast, a Domain-Specific Language (DSL) can hide these “how” concerns, leaving the designer to focus on the “what”, i.e. the mapping of database schemas to model classes. This paper introduces Schemol, a DSL tailored for extracting models out of databases that takes Web2.0 specifics into account. First, Web2.0 applications are often built on top of general frameworks (a.k.a. engines) that set the database schema (e.g., MediaWiki, Blojsom). Hence, table names offer little help in automating the extraction process. Second, Web2.0 data tend to be annotated. User-provided data (e.g., wiki articles, blog entries) might contain semantic markup that provides helpful hints for model extraction. Unfortunately, these data end up being stored as opaque strings. As a result, there is a considerable conceptual gap between the source database and the target metamodel. Schemol offers extractive functions and view-like mechanisms to bridge this gap. Examples using Blojsom as the blog engine are available for download.
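To make the "SQL scripts embedded in a program" approach concrete, the following is a minimal sketch in Python of hand-written model harvesting. It is not Schemol syntax and does not reproduce the Blojsom schema: the tables blog and entry, the classes Blog and Entry, and the span class="fn" markup convention are all assumptions chosen for illustration. The sketch shows how the join and the row-by-row iteration surface in the code, and how an annotation buried in an opaque string has to be dug out by hand.

```python
import re
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Entry:
    title: str
    product_names: list  # names pulled out of the opaque entry text


@dataclass
class Blog:
    name: str
    entries: list = field(default_factory=list)


# Hypothetical markup convention: <span class="fn">Name</span> inside entry text.
FN_SPAN = re.compile(r'<span class="fn">(.*?)</span>')


def harvest(conn):
    """Build Blog/Entry objects from rows; the join and the loop are explicit."""
    rows = conn.execute(
        "SELECT b.id, b.name, e.title, e.body "
        "FROM blog b LEFT JOIN entry e ON e.blog_id = b.id "
        "ORDER BY b.id, e.id"
    )
    blogs = {}
    for blog_id, name, title, body in rows:
        # One Blog object per distinct blog id, created on first sight.
        blog = blogs.setdefault(blog_id, Blog(name))
        if title is not None:  # LEFT JOIN yields NULLs for blogs without entries
            blog.entries.append(Entry(title, FN_SPAN.findall(body or "")))
    return list(blogs.values())


def demo():
    """Populate an in-memory database with made-up rows and harvest it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE blog (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE entry (
            id INTEGER PRIMARY KEY, blog_id INTEGER, title TEXT, body TEXT
        );
        INSERT INTO blog VALUES (1, 'Gadgets'), (2, 'Empty');
        INSERT INTO entry VALUES
            (1, 1, 'Review', 'I tried the <span class="fn">Acme Phone</span> today.'),
            (2, 1, 'News', 'Nothing annotated here.');
        """
    )
    return harvest(conn)
```

Even in this small case, the mapping itself (a blog row becomes a Blog, an entry row becomes an Entry) is interleaved with join conditions, NULL handling and string parsing. These are the "how" concerns that the abstract argues a DSL can hide.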
Additional information
Communicated by Gustavo Rossi, Nora Koch, Geert-Jan Houben, and Antonio Vallecillo.
About this article
Cite this article
Díaz, O., Puente, G., Cánovas Izquierdo, J.L. et al. Harvesting models from web 2.0 databases. Softw Syst Model 12, 15–34 (2013). https://doi.org/10.1007/s10270-011-0194-z
DOI: https://doi.org/10.1007/s10270-011-0194-z