DOI: 10.1145/3570748.3570755
research-article

Jujuby: Design and Deployment of a Crawler for Twitch CDN Mapping

Published: 19 December 2022

Abstract

Content Delivery Networks (CDNs) deliver much of the world’s content on the Internet today. As demand for content rises, CDNs are as crucial as ever, which makes it imperative to understand how such systems work internally. Focusing on Twitch, one of the most popular live video services, we present in this work (1) the design of a Twitch CDN crawler, (2) two deployment methodologies, (3) a study of the data collected, and (4) a comparison of the deployment methodologies.
To build the crawler, we sniffed the traffic of a Twitch session. A closer look at the traffic revealed that a load balancer, named Usher, is in charge of redirecting clients to their appropriate video servers. This enabled us to design and implement a programmable crawler that can be deployed to discover Twitch’s CDN at a large scale. Previous works deployed their crawlers on platforms such as PlanetLab and global Web proxies for broad network coverage. As these platforms become less available, we experimented with two fee-based platforms, namely cloud computing and virtual private network (VPN) services, which we expect to remain available in the future. By running our crawler on the two platforms, we collected data showing that Twitch serves its users from clusters that are geographically widespread but regionally concentrated. These clusters are located primarily in North America, Europe, Asia, and Oceania. We also compare how suitable the two crawling platforms are for different crawling purposes.
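
The crawling idea in the abstract can be illustrated with a minimal sketch. It is not the Jujuby implementation. It assumes only that the response a client receives after being redirected is an HLS playlist, where every line that does not start with "#" is a URI, and it tallies the server hostnames found in a batch of such playlists. The sample playlist, the hostnames, and the function names extract_server_hosts and count_servers are hypothetical and introduced here for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical playlist text. The hostname is made up for illustration
# and does not come from the paper or from Twitch.
SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=6000000,RESOLUTION=1920x1080
https://video-edge-aaa111.tpe01.example.net/v1/playlist/high.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=3000000,RESOLUTION=1280x720
https://video-edge-aaa111.tpe01.example.net/v1/playlist/medium.m3u8
"""


def extract_server_hosts(playlist_text):
    """Return the hostname of every absolute URI line in an HLS playlist."""
    hosts = []
    for line in playlist_text.splitlines():
        line = line.strip()
        # Tags and comments start with '#'; blank lines carry no URI.
        if not line or line.startswith("#"):
            continue
        host = urlparse(line).hostname
        # Relative URIs have no hostname and are skipped.
        if host:
            hosts.append(host)
    return hosts


def count_servers(playlists):
    """Count in how many playlists each distinct server hostname appears."""
    counts = Counter()
    for text in playlists:
        counts.update(set(extract_server_hosts(text)))
    return counts


if __name__ == "__main__":
    for host, seen in count_servers([SAMPLE_PLAYLIST, SAMPLE_PLAYLIST]).items():
        print(host, seen)
```

A real crawler would repeat such a tally from many vantage points, for example cloud instances or VPN exits, and then geolocate the discovered hostnames. Both steps are outside the scope of this sketch.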



    Published In

    AINTEC '22: Proceedings of the 17th Asian Internet Engineering Conference
    December 2022
    104 pages
    ISBN:9781450399814
    DOI:10.1145/3570748

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. CDN Mapping
    2. Measurement Tool
    3. Twitch

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    AINTEC'22: The 17th Asian Internet Engineering Conference
    December 19 - 21, 2022
    Hiroshima, Japan

    Acceptance Rates

    Overall Acceptance Rate 15 of 38 submissions, 39%
