Nothing Special   »   [go: up one dir, main page]

Skip to main content

Advertisement

Log in

A difference-of-convex programming approach with parallel branch-and-bound for sentence compression via a hybrid extractive model

  • Original Paper
  • Published:
Optimization Letters Aims and scope Submit manuscript

Abstract

Sentence compression is an important problem in natural language processing with wide applications in text summarization, search engine and human–AI interaction system etc. In this paper, we design a hybrid extractive sentence compression model combining a probability language model and a parse tree language model for compressing sentences by guaranteeing the syntax correctness of the compression results. Our compression model is formulated as an integer linear programming problem, which can be rewritten as a difference-of-convex (DC) programming problem based on the exact penalty technique. We use a well-known efficient DC algorithm—DCA to handle the penalized problem for local optimal solutions. Then a hybrid global optimization algorithm combining DCA with a parallel branch-and-bound framework, namely PDCABB, is used for finding global optimal solutions. Numerical results demonstrate that our sentence compression model can provide excellent compression results evaluated by F-score, and indicate that PDCABB is a promising algorithm for solving our sentence compression model.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5

Similar content being viewed by others

Notes

  1. \(\left[ \!\left[ m,n\right] \!\right] \) with \(m\le n\) stands for the set of integers between m and n.

References

  1. Bertsekas, D.P.: Nonlinear Programming, 2nd edn. MIT Press, Cambridge (1999)

    MATH  Google Scholar 

  2. Bird, S., Klein, E., Loper, E.: Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit. O’Reilly Media Inc, Newton (2009)

    MATH  Google Scholar 

  3. Chopra, S., Auli, M., Rush, A.M.: Abstractive sentence summarization with attentive recurrent neural networks. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 93–98 (2016)

  4. Clarke, J., Lapata, M.: Global inference for sentence compression: an integer linear programming approach. J. Artif. Intell. Res. 31, 399–429 (2008)

    Article  Google Scholar 

  5. Cohn, T., Lapata, M.: Sentence compression beyond word deletion. In: Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), pp. 137–144 (2008)

  6. Filippova, K., Alfonseca, E., Colmenares, C.A., Kaiser, Ł., Vinyals, O.: Sentence compression by deletion with LSTMS. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 360–368 (2015)

  7. Filippova, K., Altun, Y.: Overcoming the lack of parallel data in sentence compression. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 1481–1491 (2013)

  8. Gurobi optimizer reference manual. https://www.gurobi.com

  9. Hori, C., Furui, S.: Speech summarization: an approach through word extraction and a method for evaluation. IEICE Trans. Inf. Syst. 87(1), 15–25 (2004)

    Google Scholar 

  10. Jing, H.Y.: Sentence reduction for automatic text summarization. In: Sixth Applied Natural Language Processing Conference, pp. 310–315 (2000)

  11. Knight, K., Marcu, D.: Summarization beyond sentence extraction: a probabilistic approach to sentence compression. Artif. Intell. 139(1), 91–107 (2002)

    Article  Google Scholar 

  12. Le Thi, H.A., Le, H.M., Pham, D.T., Bouvry, P.: Solving the perceptron problem by deterministic optimization approach based on DC programming and DCA. In: 2009 7th IEEE International Conference on Industrial Informatics, pp. 222–226. IEEE (2009)

  13. Le Thi, H.A., Moeini, M., Pham, D.T.: Portfolio selection under downside risk measures and cardinality constraints based on DC programming and DCA. Comput. Manag. Sci. 6(4), 459–475 (2009)

    Article  MathSciNet  Google Scholar 

  14. Le Thi, H.A., Nguyen, Q.T., Nguyen, H.T., Pham, D.T.: Solving the earliness tardiness scheduling problem by DC programming and DCA. Math. Balk. 23(3–4), 271–288 (2009)

    MathSciNet  MATH  Google Scholar 

  15. Le Thi, H.A., Pham, D.T.: A continuous approach for globally solving linearly constrained quadratic zero-one programming problems. Optimization 50(1–2), 93–120 (2001)

    Article  MathSciNet  Google Scholar 

  16. Le Thi, H.A., Pham, D.T.: The DC (difference of convex functions) programming and DCA revisited with DC models of real world nonconvex optimization problems. Ann. Oper. Res. 133(3), 23–46 (2005)

    MathSciNet  MATH  Google Scholar 

  17. Le Thi, H.A., Pham, D.T.: DC programming and DCA: thirty years of developments. Math. Program. 169(1), 5–68 (2018)

    Article  MathSciNet  Google Scholar 

  18. Le Thi, H.A., Pham, D.T., Huynh, V.N.: Exact penalty and error bounds in DC programming. J. Glob. Optim. 52(3), 509–535 (2012)

    Article  MathSciNet  Google Scholar 

  19. Le Thi, H.A., Pham, D.T., Muu, L.D.: Exact penalty in DC programming. Vietnam J. Math. 22(2), 169–178 (1999)

    MATH  Google Scholar 

  20. Manning, C.D., Surdeanu, M., Bauer, J., Finkel, J.R., Bethard, S., McClosky, D.: The stanford corenlp natural language processing toolkit. In: Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55–60 (2014)

  21. McDonald, R.: Discriminative sentence compression with soft syntactic evidence. In: 11th Conference of the European Chapter of the Association for Computational Linguistics (2006)

  22. Michele, B., Vibhu, O.M., Michael, J.W.: Headline generation based on statistical translation. In: Proceedings of the 38th Annual Meeting on Association for Computational Linguistics, pp. 318–325. Association for Computational Linguistics (2000)

  23. Niu, Y.S.: Programmation dc et dca en optimisation combinatoire et optimisation polynomiale via les techniques de sdp. Ph.D. thesis, INSA de Rouen, France (2010)

  24. Niu, Y.S.: PDCABB: a parallel mixed-integer nonlinear optimization solver (parallel DCA-BB global optimization algorithm). https://github.com/niuyishuai/PDCABB (2017)

  25. Niu, Y.S.: On difference-of-SOS and difference-of-convex-SOS decompositions for polynomials. arXiv preprint arXiv:1803.09900 (2018)

  26. Niu, Y.S.: A parallel branch and bound with DC algorithm for mixed integer optimization. In: The 23rd International Symposium in Mathematical Programming (ISMP2018), Bordeaux, France (2018)

  27. Niu, Y.S., Pham, D.T.: A dc programming approach for mixed-integer linear programs. In: International Conference on Modelling, Computation and Optimization in Information Systems and Management Sciences, pp. 244–253. Springer (2008)

  28. Niu, Y.S., Pham, D.T.: Efficient dc programming approaches for mixed-integer quadratic convex programs. In: Proceedings of the International Conference on Industrial Engineering and Systems Management (IESM2011), Metz, France, pp. 222–231 (2011)

  29. Niu, Y.S., You, Y., Liu, W.Z.: Parallel DC cutting plane algorithms for mixed binary linear program. In: Proceedings of the World Congress on Global Optimization (WCGO2019), pp. 330–340 (2019)

  30. NLTK 3.2.5: The natural language toolkit. http://www.nltk.org

  31. Over, P., Dang, H., Harman, D.: DUC in context. Inf. Process. Manag. 43(6), 1506–1520 (2007)

    Article  Google Scholar 

  32. Pham, D.T., Le Thi, H.A.: Convex analysis approach to D.C. programming: theory, algorithm and applications. Acta Math. Vietnam. 22(1), 288–355 (1997)

    MathSciNet  Google Scholar 

  33. Pham, D.T., Le Thi, H.A.: A D.C. optimization algorithm for solving the trust-region subproblem. SIAM J. Optim. 8(2), 476–505 (1998)

    Article  MathSciNet  Google Scholar 

  34. Pham, D.T., Le Thi, H.A., Pham, V.N., Niu, Y.S.: DC programming approaches for discrete portfolio optimization under concave transaction costs. Optim. Lett. 10(2), 261–282 (2016)

    Article  MathSciNet  Google Scholar 

  35. Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)

    Book  Google Scholar 

  36. Rush, A.M., Chopra, S., Weston, J.: A neural attention model for abstractive sentence summarization. arXiv preprint arXiv:1509.00685 (2015)

  37. Schleich, J., Le Thi, H.A., Bouvry, P.: Solving the minimum m-dominating set problem by a continuous optimization approach based on DC programming and DCA. J. Comb. Optim. 24(4), 397–412 (2012)

    Article  MathSciNet  Google Scholar 

  38. Stanford CoreNLP, natural language software. https://stanfordnlp.github.io/CoreNLP/

  39. Zajic, D., Dorr, B.J., Schwartz, R.: Bbn/umd at duc-2004: Topiary. In: Proceedings of the HLT-NAACL 2004 Document Understanding Workshop, Boston pp. 112–119 (2004)

Download references

Acknowledgements

The authors are supported by the National Natural Science Foundation of China (Grant 11601327) and the National “985” Key Program of China (Grant WF220426001).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Yi-Shuai Niu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix

A List of labels for POS tagging and parsing

Some labels used for sentence parsing and POS tagging are listed in Table 4. Some notations refer to POS tags of the Penn Treebank Tagset [30], others are introduced by ourselves. For example, notation “ADJ” refers to adjective tag cases “JJ”, “JJR” and “JJS” in the Penn Treebank Tagset, but “ADJP” is a notation that we defined to denote ‘adjective phrase’.

Table 4 Labels for POS tagging and parsing

B List of CFG grammar and compression rules for statements

See Table 5.

Table 5 CFG grammar and compression rules for statements

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Niu, YS., You, Y., Xu, W. et al. A difference-of-convex programming approach with parallel branch-and-bound for sentence compression via a hybrid extractive model. Optim Lett 15, 2407–2432 (2021). https://doi.org/10.1007/s11590-020-01695-9

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11590-020-01695-9

Keywords

Navigation