Abstract
Relationship extraction is concerned with the detection and classification of semantic relationships between entities mentioned in a collection of textual documents. This paper proposes a simple, on-line approach to the automated extraction of semantic relations, based on nearest neighbor classification and leveraging a minwise hashing method for measuring similarity between relationship instances. Experiments with three different datasets that are commonly used for benchmarking relationship extraction methods show promising results, both in terms of classification performance and scalability.
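The general idea of combining minwise hashing with nearest neighbor classification can be illustrated with a minimal sketch. This is not the authors' implementation: the feature representation (plain token sets), the function names, the number of hash functions, and the use of salted BLAKE2b as the hash family are all assumptions made for illustration. The sketch builds a min-hash signature for each feature set, estimates Jaccard similarity as the fraction of matching signature positions, and labels a query with the majority class among its k most similar training instances.

```python
import hashlib
from collections import Counter


def minhash_signature(features, num_hashes=64):
    """Return a min-hash signature for a non-empty set of string features.

    Each of the num_hashes positions holds the minimum value of one salted
    hash function over all features (salted BLAKE2b is an assumed choice).
    """
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "big")  # distinct salt -> distinct hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for f in features
        ))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching positions."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def classify(query_features, labeled_examples, k=1, num_hashes=64):
    """Label a query by majority vote among its k most similar examples.

    labeled_examples is a list of (feature_set, label) pairs; both the
    feature sets and the labels here are hypothetical.
    """
    query_sig = minhash_signature(query_features, num_hashes)
    scored = sorted(
        (
            (estimated_jaccard(query_sig, minhash_signature(feats, num_hashes)), label)
            for feats, label in labeled_examples
        ),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical training instances: token sets with relation labels.
    train = [
        ({"was", "born", "in"}, "place-of-birth"),
        ({"works", "for"}, "employer"),
    ]
    print(classify({"was", "born", "in", "the"}, train, k=1))
```

In this sketch the signatures of the training instances are recomputed for every query; a practical system would compute them once and index them, for instance by grouping signature positions into bands so that only instances sharing a band are compared.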
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Batista, D.S., Silva, R., Martins, B., Silva, M.J. (2013). A Minwise Hashing Method for Addressing Relationship Extraction from Text. In: Lin, X., Manolopoulos, Y., Srivastava, D., Huang, G. (eds) Web Information Systems Engineering – WISE 2013. WISE 2013. Lecture Notes in Computer Science, vol 8181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41154-0_16
DOI: https://doi.org/10.1007/978-3-642-41154-0_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41153-3
Online ISBN: 978-3-642-41154-0
eBook Packages: Computer Science, Computer Science (R0)