Nothing Special   »   [go: up one dir, main page]

skip to main content
article

Close the Gaps: A Learning-While-Doing Algorithm for Single-Product Revenue Management Problems

Published: 01 April 2014 Publication History

Abstract

We consider a retailer selling a single product with limited on-hand inventory over a finite selling season. Customer demand arrives according to a Poisson process, the rate of which is influenced by a single action taken by the retailer such as price adjustment, sales commission, advertisement intensity, etc. The relationship between the action and the demand rate is not known in advance. However, the retailer is able to learn the optimal action on the fly as she maximizes her total expected revenue based on the observed demand reactions.
Using the pricing problem as an example, we propose a dynamic learning-while-doing algorithm that only involves function value estimation to achieve a near-optimal performance. Our algorithm employs a series of shrinking price intervals and iteratively tests prices within that interval using a set of carefully chosen parameters. We prove that the performance of our algorithm is among the best of all possible algorithms in terms of the asymptotic regret the relative loss compared to the full information optimal solution. Our result closes the performance gaps between parametric and nonparametric learning and between the post-price mechanism and the customer-bidding mechanism. Important managerial insight from this research is that the values of information on both the parametric form of the demand function as well as each customer's exact reservation price are less important than prior literature suggests. Our results also suggest that firms would be better off to perform dynamic learning and action concurrently rather than sequentially.

References

[1]
Agrawal S, Wang Z, Ye Y (2010) A dynamic near-optimal algorithm for online linear programming. Working paper, Stanford University, Stanford, CA.
[2]
Araman VF, Caldentey RA (2009) Dynamic pricing for nonperishable products with demand learning. Oper. Res. 57(5):1169-1188.
[3]
Auer P, Cesa-Bianchi N, Freund Y, Schapire RE (2002) The nonstochastic multi-armed bandit problem. SIAM J. Comput. 32(1):48-77.
[4]
Aviv Y, Pazgal A (2002) Pricing of short life-cycle products through active learning. Technical report, Washington University, St. Louis.
[5]
Ball M, Queyranne M (2009) Towards robust revenue management competitive analysis of online booking. Oper. Res. 57(4):950-963.
[6]
Bertsimas D, Perakis G (2006) Dynamics pricing: A learning approach. Lawphongpanich S, Hearn DW, Smith MJ, eds. Mathematical and Computational Models for Congestion Charging, Applied Optimization, Vol. 101 (Springer, New York), 45-79.
[7]
Besbes O, Zeevi A (2009) Dynamic pricing without knowing the demand function: Risk bounds and near-optimal algorithms. Oper. Res. 57(6):1407-1420.
[8]
Besbes O, Zeevi A (2012a) Blind network revenue management. Oper. Res. 60(6):1537-1550.
[9]
Besbes O, Zeevi A (2012b) On the surprising sufficiency of linear models for dynamic pricing with demand learning. Working paper, Columbia University, New York.
[10]
Bitran G, Caldentey R (2003) An overview of pricing models for revenue management. Manufacturing Service Oper. Management 5(3): 203-229.
[11]
Bremaud P (1980) Point Process and Queues: Martingale Dynamics (Springer-Verlag, Berlin).
[12]
Broder J, Rusmevichientong P (2012) Dynamic pricing under a general parametric choice model. Oper. Res. 60(4):965-980.
[13]
Carvalho A, Puterman M (2005) Learning and pricing in an Internet environment with binomial demands. J. Revenue and Pricing Management 3(4):320-336.
[14]
den Boer A, Zwart B (2011) Dynamic pricing and learning with finite inventories. Working paper, Eindhoven University of Technology, Amsterdam.
[15]
den Boer A, Zwart B (2014) Simultaneously learning and optimizing using controlled variance pricing. Management Sci. 60(3):770-783.
[16]
Elmaghraby W, Keskinocak P (2003) Dynamic pricing in the presence of inventory considerations: Research overview, current practices, and future directions. Management Sci. 49(10):1287-1309.
[17]
Farias VF, Van Roy B (2010) Dynamic pricing with a prior on market response. Oper. Res. 58(1):16-29.
[18]
Gallego G, van Ryzin G (1994) Optimal dynamic pricing of inventories with stochastic demand over finite horizons. Management Sci. 40(8):999-1029.
[19]
Harrison JM, Keskin NB, Zeevi A (2012) Bayesian dynamic pricing policies: Learning and earning under a binary prior distribution. Management Sci. 58(3):570-586.
[20]
Kleinberg R, Leighton T (2003) The value of knowing a demand curve: Bounds on regret for online posted-price auctions. Proc. 44th Annual IEEE Sympos. Foundations Comput. Sci., FOCS '03 (IEEE, Piscataway, NJ), 594-605.
[21]
Lai TL, Robbins H (1985) Asymptotically efficient adaptive allocation rules. Adv. Appl. Math. 6(1):4-22.
[22]
Lim A, Shanthikumar J (2007) Relative entropy, exponential utility, and robust dynamic pricing. Oper. Res. 55(2):198-214.
[23]
Lobo MS, Boyd S (2003) Pricing and learning with uncertain demand. Working paper, Duke University, Durham, NC.
[24]
Perakis G, Roels G (2006) The "price of information": Inventory management with limited information about demand. Manufacturing Service Oper. Management 8(1):98-117.
[25]
Talluri K, van Ryzin G (2005) Theory and Practice of Revenue Management (Springer, New York).
[26]
Wang Z (2012) Dynamic learning mechanism in revenue management problems. Ph.D. thesis, Stanford University, Stanford, CA.

Cited By

View all
  • (2023)Pricing experimental design: causal effect, expected revenue and tail riskProceedings of the 40th International Conference on Machine Learning10.5555/3618408.3619727(31788-31799)Online publication date: 23-Jul-2023
  • (2022)Context-based dynamic pricing with partially linear demand modelProceedings of the 36th International Conference on Neural Information Processing Systems10.5555/3600270.3601997(23780-23791)Online publication date: 28-Nov-2022
  • (2022)Adversarial Bandits with KnapsacksJournal of the ACM10.1145/355704569:6(1-47)Online publication date: 17-Nov-2022
  • Show More Cited By
  1. Close the Gaps: A Learning-While-Doing Algorithm for Single-Product Revenue Management Problems

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image Operations Research
    Operations Research  Volume 62, Issue 2
    April 2014
    267 pages

    Publisher

    INFORMS

    Linthicum, MD, United States

    Publication History

    Published: 01 April 2014
    Accepted: 01 October 2013
    Received: 01 January 2011

    Author Tags

    1. asymptotic optimality
    2. dynamic decision making
    3. learning
    4. nonparametric
    5. pricing
    6. revenue management

    Qualifiers

    • Article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)0
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 10 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2023)Pricing experimental design: causal effect, expected revenue and tail riskProceedings of the 40th International Conference on Machine Learning10.5555/3618408.3619727(31788-31799)Online publication date: 23-Jul-2023
    • (2022)Context-based dynamic pricing with partially linear demand modelProceedings of the 36th International Conference on Neural Information Processing Systems10.5555/3600270.3601997(23780-23791)Online publication date: 28-Nov-2022
    • (2022)Adversarial Bandits with KnapsacksJournal of the ACM10.1145/355704569:6(1-47)Online publication date: 17-Nov-2022
    • (2021)Bandits with knapsacks beyond the worst caseProceedings of the 35th International Conference on Neural Information Processing Systems10.5555/3540261.3542037(23191-23204)Online publication date: 6-Dec-2021
    • (2021)Multi-armed bandit requiring monotone arm sequencesProceedings of the 35th International Conference on Neural Information Processing Systems10.5555/3540261.3541492(16093-16103)Online publication date: 6-Dec-2021
    • (2021)Technical Note—Joint Learning and Optimization of Multi-Product Pricing with Finite Resource Capacity and Unknown Demand ParametersOperations Research10.1287/opre.2020.207869:2(560-573)Online publication date: 1-Mar-2021
    • (2021)Novel pricing strategies for revenue maximization and demand learning using an exploration–exploitation frameworkSoft Computing - A Fusion of Foundations, Methodologies and Applications10.1007/s00500-021-06047-y25:17(11711-11733)Online publication date: 1-Sep-2021
    • (2020)Constrained episodic reinforcement learning in concave-convex and knapsack settingsProceedings of the 34th International Conference on Neural Information Processing Systems10.5555/3495724.3497093(16315-16326)Online publication date: 6-Dec-2020
    • (2020)Dynamic Inventory and Price Controls Involving Unknown Demand on Discrete Nonperishable ItemsOperations Research10.1287/opre.2019.197468:5(1335-1355)Online publication date: 1-Sep-2020
    • (2020)Multidimensional Dynamic Pricing for Welfare MaximizationACM Transactions on Economics and Computation10.1145/33815278:1(1-35)Online publication date: 17-Apr-2020
    • Show More Cited By

    View Options

    View options

    Get Access

    Login options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media