Perceptual grouping explains constellations across cultures

2022, Psychological Science, Vol. 33(3), pp. 354-363

https://doi.org/10.1177/09567976211044157

Abstract

Cultures around the world organise stars into constellations, or asterisms, and these groupings are often considered to be arbitrary and culture-specific. Yet there are striking similarities in asterisms across cultures and groupings such as Orion, the Big Dipper, the Pleiades and the Southern Cross are widely recognized across many different cultures. It has been informally suggested that these shared patterns are explained by common perceptual principles, such as the Gestalt laws of grouping, but there have been no systematic attempts to catalog asterisms that recur across cultures or to explain the perceptual basis of these groupings. Here we compile data from 27 cultures around the world to show that a simple computational model of perceptual grouping accounts for many of the recurring cross-cultural asterisms. As expected, asterisms such as Orion and the Big Dipper are common in our data, but we also find that lesser-known asterisms such as Delphinus and the head of Aries are both repeated across cultures and captured by our model. Our results suggest that basic perceptual principles account for more of the structure of asterisms across cultures than previously acknowledged and highlight ways in which specific cultures depart from this shared baseline.

arXiv:2010.06108v1 [physics.hist-ph] 13 Oct 2020

Charles Kemp (1), Duane W. Hamacher (2), Daniel R. Little (1) & Simon J. Cropper (1)
(1) Melbourne School of Psychological Sciences & (2) School of Physics, University of Melbourne, Australia

Anyone who has tried to learn the full set of 88 Western constellations will sympathize with Herschel [1] (p 156), who wrote that “the constellations seem to have been almost purposely named and delineated to cause as much confusion and inconvenience as possible,” and that “innumerable snakes twine through long and contorted areas of the heavens, where no memory can follow them.” Yet Herschel [2] (p 4) and others also point out that there are “well-defined natural groups of conspicuous stars” that have been picked out and named by multiple cultures around the world [3, 4, 5]. For example, the Southern Cross is recognized as a cross by multiple cultures [15, 7], and is identified as a stingray by the Yolngu of northern Australia [8], an anchor by the Tainui of Aotearoa/New Zealand [9], and as a curassow bird by the Lokono of the Guianas [10]. Asterisms (e.g. the Southern Cross) are sometimes distinguished from formal constellations (e.g. the region of the sky within which the Southern Cross lies), but in cross-cultural work these two terms are often used interchangeably. It is widely acknowledged that asterisms reflect both universal perceptual principles and culture-specific traditions. For example, Urton [11] (p 5) notes that “almost every culture seems to have recognized a few of the same celestial groupings (e.g., the tight cluster of the Pleiades, the V of the Hyades, the straight line of the belt of Orion), but the large constellation shapes of European astronomy and astrology simply are not universally recognized; the shapes were projected onto the stars because the shapes were important objects or characters in the Western religious, mythological, and calendrical tradition.” Even groupings as apparently salient as the Southern Cross are not inevitable—some Australian cultures have many names for individual stars but tend not to “connect the dots” to form figured constellations [12, 13, 14]. 
Although cultural factors are undeniably important, we will argue that perceptual factors explain more of the inventory of asterisms across cultures than has previously been recognized. Krupp [3] (p 58) suggests that a “narrow company” of asterisms is common across cultures and lists just four: Orion’s Belt, the Pleiades, the Big Dipper, and the Southern Cross. Here we draw on existing resources to compile a detailed catalog of asterisms across cultures, and find that the list of recurring asterisms goes deeper than the handful of examples typically given by Krupp and others [3, 15, 16]. To demonstrate that these asterisms are mostly consistent with universal perceptual principles, we present a computational model of perceptual grouping and show that it accounts for many of the asterisms that recur across cultures.

Our data set includes 22 systems drawn from the Stellarium software package [17] and 5 from the ethnographic literature. The data span six major regions (Asia, Australia, Europe, North America, Oceania, and South America), and include systems from both oral (e.g. Inuit) and literate cultures (e.g. Chinese). Stellarium currently includes a total of 42 systems, and we excluded 20 because they were closely related to a system already included or because their documentation was not sufficiently grounded in the scholarly literature. Most of our sources specify asterism figures in addition to the stars included in each asterism, but we chose not to use these figures as they can vary significantly within a culture and because they were not available for all cultures. Some of our analyses do not require asterism figures, and for those that do we used minimum spanning trees computed over the stars within each asterism. Figure 1A shows a consensus system generated by overlaying minimum spanning trees for asterisms from all 27 cultures. The thick edges in the plot join stars that are grouped by many cultures.
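The consensus construction described above (one minimum spanning tree per asterism, overlaid and counted across cultures) can be illustrated with a minimal sketch. It is not the authors' code: the function names, the use of Prim's algorithm, the haversine formula for angular separation, and the toy star positions are all assumptions made for illustration.

```python
import math
from collections import Counter


def angular_separation(p, q):
    """Angular distance in degrees between two (ra_deg, dec_deg) positions."""
    ra1, dec1, ra2, dec2 = map(math.radians, (*p, *q))
    # Haversine form, numerically stable for small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def minimum_spanning_tree(stars, positions):
    """Prim's algorithm over the complete graph on `stars`.

    Returns a set of edges, each a frozenset of two star names.
    """
    stars = sorted(stars)
    if len(stars) < 2:
        return set()
    in_tree = {stars[0]}
    edges = set()
    while len(in_tree) < len(stars):
        # Cheapest edge joining a star inside the tree to one outside it.
        u, v = min(
            ((a, b) for a in in_tree for b in stars if b not in in_tree),
            key=lambda e: angular_separation(positions[e[0]], positions[e[1]]),
        )
        in_tree.add(v)
        edges.add(frozenset((u, v)))
    return edges


def consensus_edges(systems, positions, min_count=1):
    """Count how often each tree edge appears across all asterisms."""
    counts = Counter()
    for asterisms in systems.values():
        for asterism in asterisms:
            counts.update(minimum_spanning_tree(asterism, positions))
    return {edge: c for edge, c in counts.items() if c >= min_count}


# Hypothetical toy data: (right ascension, declination) in degrees.
positions = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (2.0, 0.0), "D": (10.0, 10.0)}
systems = {
    "culture_1": [{"A", "B", "C"}],
    "culture_2": [{"A", "B", "C", "D"}],
    "culture_3": [{"B", "C"}],
}
consensus = consensus_edges(systems, positions)
for edge, count in sorted(consensus.items(), key=lambda kv: -kv[1]):
    print(sorted(edge), count)
```

In this sketch the edge count plays the role of the edge width in the consensus plot, and `min_count` is a hypothetical parameter that would hide rarely attested edges.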
The most common asterisms include familiar groups such as Orion’s Belt, the Pleiades, the Hyades, the Big Dipper, the Southern Cross, and Cassiopeia. The plot also highlights asterisms such as Corona Borealis, Delphinus and the head of Aries that are less well-known but nevertheless picked out by multiple cultures. All of these asterisms and more are listed in Table 1, which ranks 35 asterisms based on how frequently they recur across cultures (an extended version of the table appears as Table S1). Some cross-cultural similarities in asterisms reflect historical relationships between cultures, and Table 1 also summarizes a mixed-effects analysis that captures some historical relationships by including a random effect for geographic region. This mixed-effects approach prioritizes asterisms that are attested across geographic regions even if they are relatively rare within each region, and the results suggest that asterisms including the Southern Pointers, Lyra, and Corona Australis deserve to be listed alongside the ten singled out at the top of Figure 1A.

To explain these shared patterns in the night sky, scholars from multiple disciplines have suggested that asterisms are shaped in part by universal perceptual principles, including the principle that bright objects are especially salient, and that nearby objects are especially likely to be grouped [18, 19, 20]. Claims that these principles account for star grouping across cultures are mostly anecdotal, but the relevant principles have been carefully studied by psychologists [21, 22] and have inspired the development of formal models of perceptual grouping [6, 24, 25, 4, 27, 28]. We build on this tradition by using a computational model (the graph clustering model, or GC model for short) to explore the extent to which the factors of brightness and proximity account for asterisms across cultures.
The GC model constructs a graph with the stars as nodes, assigns strengths to the edges based on proximity and brightness, and thresholds the graph so that only the n strongest edges remain. Figure 1B shows the model graph when the threshold n is set to 320. The connected components of this thresholded graph represent model predictions about stars that are likely to be grouped across cultures. There is a strong resemblance between these model predictions and the consensus system in Figure 1A. The model picks out groups that correspond closely to the ten frequently-occurring asterisms highlighted in the inset panels of Figure 1A. Beyond these ten asterisms the model also picks out the Southern Pointers, the teapot in Sagittarius, the head of Draco, the head and stinger of Scorpius, Lyra, the sickle in Leo, the shaft of Aquila, and more. Table S2 lists all groups found by the model and indicates which of them are similar to human asterisms attested in Table S1.

Figure 1: Common asterisms across cultures compared with model predictions. (A) Consensus system created by overlaying minimum spanning trees for all asterisms in our data set of 27 cultures. Edge widths indicate the number of times an edge appears across the entire dataset, and edges that appear three or fewer times are not shown. Node sizes indicate apparent star magnitudes, and only stars with magnitudes brighter than 4.5 have been included. Insets show 10 of the most common asterisms across cultures, and numbers greater than 10 identify additional asterisms mentioned in the text or Table 1: Southern Pointers (11), shaft of Aquila (12), Little Dipper (13), head of Scorpius (14), stinger of Scorpius (15), sickle in Leo (16), Corvus (17), Northern Cross (18), Lyra (19), Square of Pegasus (20), Corona Australis (21), head of Draco (22) and the teapot in Sagittarius (23). (B) Asterisms according to the GC model with n = 320. The model assigns a strength to each edge in a graph defined over the stars, and here the strongest 320 edges are shown. Edge widths are proportional to the strengths assigned by the model.

| Rank | Human (raw) | Human (adj) | Model score | Stars | Description |
| 1 | 0.63 | 0.55 | 1.0 | 34DelOri, 46EpsOri, 50ZetOri | Orion’s Belt |
| 2 | 0.62 | 0.57 | 1.0 | 25EtaTau, 17Tau, 19Tau, 20Tau, 23Tau, 27Tau | Pleiades |
| 3 | 0.59 | 0.51 | 1.0 | 87AlpTau, 54GamTau, 61Del1Tau, 74EpsTau, 78The2Tau | Hyades |
| 4 | 0.57 | 0.46 | 0.88 | 50AlpUMa, 48BetUMa, 64GamUMa, 69DelUMa, 77EpsUMa, 79ZetUMa, 85EtaUMa | Big Dipper |
| 5 | 0.43 | 0.42 | 1.0 | Alp1Cru, BetCru, GamCru, DelCru | Southern Cross |
| 6 | 0.37 | 0.36 | 0.5 | 5AlpCrB, 3BetCrB, 8GamCrB, 10DelCrB, 13EpsCrB, 4TheCrB, 14IotCrB | Corona Borealis |
| 7 | 0.35 | 0.35 | 1.0 | 66AlpGem, 78BetGem | Castor and Pollux |
| 8 | 0.3 | 0.34 | 0.6 | 34DelOri, 46EpsOri, 50ZetOri, 44IotOri, 42Ori | |
| 9 | 0.29 | 0.33 | 1.0 | 9AlpDel, 6BetDel, 12Gam2Del, 11DelDel | Delphinus |
| 10 | 0.28 | 0.31 | 0.71 | 18AlpCas, 11BetCas, 27GamCas, 37DelCas, 45EpsCas | Cassiopeia |
| 11 | 0.26 | 0.3 | 0.45 | 58AlpOri, 19BetOri, 24GamOri, 34DelOri, 46EpsOri, 50ZetOri, 53KapOri | Orion |
| 12 | 0.25 | 0.29 | 0.75 | 46EpsOri, 50ZetOri, 48SigOri | |
| 13 | 0.24 | 0.31 | 1.0 | 13AlpAri, 6BetAri, 5Gam2Ari | Head of Aries |
| 14 | 0.24 | 0.38 | 1.0 | Alp1Cen, BetCen | Southern Pointers |
| 15 | 0.24 | 0.33 | 0.44 | 50AlpUMa, 48BetUMa, 64GamUMa, 69DelUMa, 77EpsUMa, 79ZetUMa, 85EtaUMa, 1OmiUMa, 29UpsUMa, 30PhiUMa, 63ChiUMa, 23UMa | |
| 16 | 0.23 | 0.27 | 1.0 | 53AlpAql, 60BetAql, 50GamAql | Shaft of Aquila |
| 17 | 0.22 | 0.25 | 1.0 | 50AlpUMa, 48BetUMa | |
| 18 | 0.22 | 0.3 | 0.62 | 1AlpUMi, 7BetUMi, 13GamUMi, 23DelUMi, 22EpsUMi, 16ZetUMi, 21EtaUMi | Little Dipper |
| 19 | 0.21 | 0.34 | 0.04 | 54AlpPeg, 53BetPeg | |
| 20 | 0.21 | 0.31 | 1.0 | 8Bet1Sco, 7DelSco, 6PiSco | Head of Scorpius |
| 21 | 0.21 | 0.29 | 1.0 | 35LamSco, 34UpsSco | Stinger of Scorpius |
| 22 | 0.2 | 0.29 | 1.0 | Iot1Sco, KapSco, 35LamSco, 34UpsSco | |
| 23 | 0.19 | 0.31 | 0.75 | 32AlpLeo, 41Gam1Leo, 17EpsLeo, 36ZetLeo, 30EtaLeo, 24MuLeo | Sickle |
| 24 | 0.19 | 0.35 | 0.0 | 21AlpAnd, 88GamPeg | |
| 25 | 0.19 | 0.28 | 0.83 | 1AlpCrv, 9BetCrv, 4GamCrv, 7DelCrv, 2EpsCrv | Corvus |
| 26 | 0.18 | 0.25 | 0.56 | 21AlpSco, 8Bet1Sco, 7DelSco, 6PiSco, 20SigSco | |
| 27 | 0.18 | 0.3 | 0.33 | 50AlpCyg, 6Bet1Cyg, 37GamCyg, 18DelCyg, 53EpsCyg, 21EtaCyg | Northern Cross |
| 28 | 0.17 | 0.22 | 0.58 | 21AlpSco, 8Bet1Sco, 7DelSco, 26EpsSco, Zet2Sco, Mu1Sco, 6PiSco, 20SigSco, 23TauSco | |
| 29 | 0.17 | 0.27 | 1.0 | 21AlpSco, 20SigSco, 23TauSco | |
| 30 | 0.16 | 0.33 | 0.83 | 3AlpLyr, 10BetLyr, 14GamLyr, 12Del2Lyr, 6Zet1Lyr | Lyra |
| 31 | 0.16 | 0.27 | 0.56 | 26EpsSco, Zet2Sco, EtaSco, TheSco, Iot1Sco, KapSco, 35LamSco, Mu1Sco, 34UpsSco | |
| 32 | 0.16 | 0.26 | 0.01 | 21AlpAnd, 54AlpPeg, 53BetPeg, 88GamPeg | Square of Pegasus |
| 33 | 0.15 | 0.22 | 0.2 | 34DelOri, 46EpsOri, 50ZetOri, 87AlpTau, 54GamTau, 61Del1Tau, 74EpsTau, 78The2Tau, 17Tau | |
| 34 | 0.14 | 0.3 | 0.0 | 43GamCnc, 47DelCnc | |
| 35 | 0.14 | 0.35 | 0.67 | AlpCrA, BetCrA, GamCrA, DelCrA | Corona Australis |

Table 1: Common asterisms across cultures. Raw human scores roughly indicate how often an asterism is found in our data set, and adjusted scores are based on a mixed model that allows for historical relationships between cultures. The model scores roughly indicate how well these asterisms are captured by the GC model (1.0 indicates a perfect match).

Figure 2: Steps carried out by the graph clustering (GC) model. Panels: (1) Construct graph over stars; (2) Compute brightness and proximity for each edge; (3) Weight brightness and proximity; (4) Scale brightness and proximity within local neighborhood; (5) Combine brightness and proximity; (6) Remove all but n strongest edges to form clusters. Each step is illustrated using a region of the sky that includes the Southern Cross and the Southern Pointers. b_xy and p_xy denote brightness weights (blue) and proximity weights (red) associated with the edge between x and y. m(x) and m(y) are the apparent magnitudes of stars x and y, and d_xy is the angular separation between these stars. b_G denotes the median brightness weight across the entire graph, b_L denotes the median brightness weight within 60° of a given edge, and p_G and p_L are defined similarly. In steps 2 through 6 edge widths are proportional to edge weights.

The steps carried out by the model are summarized by Figure 2. The first step is to construct a graph over stars. Existing graph-based clustering models typically operate over a graph corresponding to a minimal spanning tree [2] or Delaunay Triangulation [3, 4], and the GC model uses the union of three Delaunay triangulations defined over stars with apparent magnitudes brighter than 3.5, 4.0 and 4.5. Delaunay-like representations are hypothesized to play a role in early stages of human visual processing [25], and combining Delaunay triangulations at multiple scales ensures that the resulting graph includes both edges between bright stars that are relatively distant and edges between fainter stars that are relatively close. The second step assigns a brightness and proximity to each edge. For an edge joining two stars, proximity is inversely related to the angular distance between the stars, and brightness is based on the apparent magnitude of the fainter of the two stars. The third step weights brightness and proximity based on a parameter ρ. For all analyses we set ρ = 3, which means that brightness is weighted more heavily than proximity. The fourth step scales brightness and proximity so that the distribution of these variables within a local neighborhood of 60° is comparable with the distribution across the entire celestial sphere. Scaling in this way allows the impact of brightness and proximity to depend on the local context. For example, the Southern Cross lies in a region that contains many stars in close proximity, and we propose that stars need to be especially close to stand out in this context.
Previous psychological models of perceptual grouping incorporate analogous local scaling steps [6, 4], and the neighborhood size of 60° was chosen to match the extent of mid-peripheral vision. The fifth step multiplies brightness and proximity to assign an overall strength to each edge, and the final step thresholds the graph so that only the strongest n edges remain. We compared the GC model to several alternatives, including variants that remove one of its components, k-means clustering, and the CODE model of perceptual grouping [6]. The results reveal that the GC model performs better than all of these alternatives, and full details are provided in the supplementary information. Each human asterism can be assigned a score between 0 and 1 that measures how well it is captured by the GC model. Scores for each culture in our data set are plotted in Figure 3. The model accounts for some cultures well — for example, 13 of 20 Arabic asterisms, 19 of 38 Marshall Islands asterisms and 55 of 161 Chinese asterisms are captured perfectly by the model for some value of the threshold n. The systems captured well by the model are drawn from a diverse set of geographical regions, suggesting that genealogical relationships between cultures are not enough to explain the recurring patterns predicted by the model. Yet there are also many asterisms that are not captured by the model, and the Chinese and Western systems in particular both include many asterisms with a model score of 0. Both systems partition virtually all of the visible sky into asterisms, and achieving this kind of comprehensive coverage may require introducing asterisms (including Herschel’s “innumerable snakes”) that do not correspond to natural perceptual units. Although some attested asterisms missed by the GC model will probably resist explanation by any model of perceptual grouping, others can perhaps be captured by extensions of the model. 
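The last two steps of the GC model (multiplying brightness and proximity, then keeping only the n strongest edges and reading off connected components) can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: the function name `gc_groups`, the union-find bookkeeping, and the toy edge weights are assumptions, and the weights are taken as already weighted and locally scaled (steps 1 to 4 are not reproduced here).

```python
def gc_groups(edges, n):
    """Group stars from weighted edges.

    edges: dict mapping (star_x, star_y) -> (brightness, proximity),
           assumed to be already weighted and locally scaled.
    n:     number of strongest edges to keep.
    Returns a list of sets, one per connected component.
    """
    # Step 5: overall edge strength is the product of the two weights.
    strength = {edge: b * p for edge, (b, p) in edges.items()}
    # Step 6: keep only the n strongest edges.
    kept = sorted(strength, key=strength.get, reverse=True)[:n]

    # Connected components of the thresholded graph via union-find.
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for x, y in kept:
        parent[find(x)] = find(y)

    groups = {}
    for star in list(parent):
        groups.setdefault(find(star), set()).add(star)
    return list(groups.values())


# Hypothetical toy weights, chosen only to illustrate thresholding.
toy_edges = {
    ("a", "b"): (0.9, 0.9),
    ("b", "c"): (0.8, 0.9),
    ("c", "d"): (0.2, 0.3),  # weak edge, dropped when n = 3
    ("d", "e"): (0.7, 0.9),
}
groups = gc_groups(toy_edges, n=3)
print(sorted(sorted(g) for g in groups))
```

Raising n in this sketch merges groups, which mirrors why scores in the paper are reported across a range of threshold values rather than for a single n.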
One limitation is that the model tends not to group stars separated by a relatively large distance. As a result it misses the lower arm of the Northern Cross (Cygnus) and misses the Great Square of Pegasus entirely. These errors could perhaps be addressed by developing a multi-scale approach that forms groups at different levels of spatial resolution [27]. Another possible extension is to incorporate additional grouping cues such as the Gestalt principle of good continuity, which is consistent with some of the most basic processes of visual contour detection [31, 32, 33]. The current model combines Corona Borealis with an extraneous star and does not connect the tail of Scorpius into a single arc, and incorporating a preference for groups that form smooth curves [4] may resolve both shortcomings.

In addition to scoring each system in our data relative to the GC model, we also examined how closely each system resembles other systems in our data set (see Figure S13). The system most different from all others is the Chinese system, which includes more than 300 asterisms, many of which are small and have no counterparts in records for other cultures. In future work, the model may prove useful for evaluating hypotheses about historical relationships between systems from different cultures [34, 35]. For example, the model could potentially be used to ask whether Oceanic constellations are more similar to Eurasian constellations than would be expected based on perceptual grouping alone.

[Figure 3: one histogram panel per culture; x-axis: model score (0 to 1), y-axis: count.]
Figure 3: Model results for individual cultures in our data set. Scores of 1 indicate asterisms that are perfectly captured by the GC model for some value of the threshold n, and each distribution includes scores for all asterisms that remain after filtering at a stellar magnitude of 4.5. The cultures are ordered based on the means of the distributions.

We have focused throughout on similarities in star groups across cultures, but there are also striking similarities in the names and stories associated with these groups [34, 36, 37]. For example, in Greek traditions Orion is known as a hunter pursuing the seven sisters of the Pleiades, and versions of the same narrative are shared by multiple Aboriginal cultures of Australia [38, 39]. Perceptual grouping helps to explain which patterns of stars are singled out for attention, and it is both surprising and satisfying that a simple model based on brightness and proximity alone can account for so many of the asterisms commonly found across cultures. Understanding the meanings invested in these asterisms, however, requires a deeper knowledge of history, cognition and culture.

Data

Twenty-two of the systems were drawn from Stellarium [17] and the sources of the remaining 5 appear in the captions of Figures S15-S41. Stellar data were drawn from version 5.0 of the Yale Bright Star Catalog [40]. For the mixed-effects analysis, the 27 systems were organized into 6 regions: Asia (Arabic, Chinese, Indian, Indo-Malay), Australia (Boorong), North America (Dakota, Inuit, Navajo, Ojibwe), Oceania (Anutan, Lenakel, Maori, Marshall Islands, Tongan), South America (Lokono, Pacariqtambo, Tukano, Tupi), and Western (Babylonian, Belarusian, Egyptian, Macedonian, Norse, Romanian, Sami, Siberian, Western).

Human scores

The match between an asterism a and a reference asterism r is defined as

$$\mathrm{match}(a, r) = \max\left( \frac{|a \cap r| - |a \setminus r|}{|r|},\ 0 \right), \tag{1}$$

where |a ∩ r| is the number of stars shared by a and r, |a \ r| is the number of stars in a that are not shared by r, and |r| is the number of stars in r. The function attains its maximum value of 1 when a and r are identical. The match between asterism a and an entire system of asterisms S is defined as

$$\mathrm{match}(a, S) = \max_{r \in S} \mathrm{match}(a, r). \tag{2}$$

Equation 2 captures the idea that a matches S well if there is at least one asterism r in S such that the match between a and r is high. In Table 1, the variable labeled Human (raw) is defined as

$$\mathrm{human\_raw}(a, \mathcal{S}_{\mathrm{human}}) = \operatorname{mean}_{S \in \mathcal{S}_{\mathrm{human}}} \mathrm{match}(a, S), \tag{3}$$

where $\mathcal{S}_{\mathrm{human}}$ is the set of all 27 systems in our data set. We computed scores for all asterisms in the entire data set, but to avoid listing variants of the same basic asterism, an asterism a is included in Table 1 only if match(a, r) < 0.5 for all asterisms r previously listed in the table.

The adjusted scores are based on a mixed ordinal regression carried out using the brms package in R [41]. For each asterism a, match scores (Equation 2) for all 27 systems S were mapped to 11 ordered intervals, one for zero scores and the remaining 10 for the intervals (0, 0.1], ..., (0.9, 1]. We then fit an ordinal regression model that aimed to predict these interval assignments given a constant fixed effect and a random effect for geographic region (the model formula was interval ~ 1 + (1|region)). We used the fitted model to compute the posterior predictive distribution over intervals for a system from a novel geographic region, and the mean of this distribution is the adjusted score in Table 1 (computing the mean requires identifying each interval with its midpoint).
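Equations 1 to 3 translate directly into set operations. The sketch below is illustrative only: the function names are assumptions, asterisms are represented as Python sets of star identifiers, and the three toy systems are hypothetical (the star identifiers follow the style used in Table 1).

```python
def match(a, r):
    """Equation 1: shared stars minus stars of a missing from r, over |r|, floored at 0."""
    a, r = set(a), set(r)
    return max((len(a & r) - len(a - r)) / len(r), 0.0)


def match_system(a, system):
    """Equation 2: best match between a and any asterism in a non-empty system."""
    return max(match(a, r) for r in system)


def human_raw(a, systems):
    """Equation 3: mean of the system-level match across all systems."""
    return sum(match_system(a, s) for s in systems) / len(systems)


belt = {"34DelOri", "46EpsOri", "50ZetOri"}
belt_and_sword = belt | {"44IotOri", "42Ori"}
twins = {"66AlpGem", "78BetGem"}

# Three hypothetical systems, each a list of asterisms.
toy_systems = [[belt, twins], [belt_and_sword], [twins]]

print(match(belt, belt_and_sword))   # belt is a subset of a 5-star reference
print(match(belt_and_sword, belt))   # two extra stars are penalized
print(human_raw(belt, toy_systems))
```

Note that the match is asymmetric: stars of a that fall outside r are penalized, whereas stars of r missing from a only reduce the score through the denominator |r|.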
Model scores

The model scores in Table 1 are defined as

$$\mathrm{modelscore}(a, \mathcal{S}_{\mathrm{model}}) = \max_{S \in \mathcal{S}_{\mathrm{model}}} \mathrm{match}(a, S), \tag{4}$$

where $\mathcal{S}_{\mathrm{model}}$ includes model systems for all values of n between 1 and 2000. Scoring asterisms in this way avoids having to choose a single value of the threshold parameter n.

Acknowledgements

We acknowledge the Indigenous custodians of the traditional astronomical knowledge used in this paper, and thank Joshua Abbott, Celia Kemp, Bradley Schaefer and Yuting Zhang for comments on the manuscript. This work was supported in part by ARC FT190100200, ARC DE140101600, the McCoy Seed Fund, the Laby Foundation, the Pierce Bequest, and by a seed grant from the Royal Society of Victoria.

References

[1] Herschel, J. F. W. A Treatise on Astronomy (Lea & Blanchard, 1842).
[2] Herschel, J. F. W. On the Advantages to be Attained by a Revision and Re-arrangement of the Constellations, with Especial Reference to Those of the Southern Hemisphere, and on the Principles Upon Which such rearrangement ought to be conducted (Moyes and Barclay, 1841).
[3] Krupp, E. C. Night gallery: The function, origin, and evolution of constellations. Archaeoastronomy 15, 43 (2000).
[4] Aveni, A. People and the sky: Our ancestors and the cosmos (2008).
[5] Kelley, D. H. & Milone, E. F. Exploring ancient skies: A survey of ancient and cultural astronomy (Springer Science & Business Media, 2011).
[6] Urton, G. Constructions of the ritual-agricultural calendar in Pacariqtambo, Peru. In Del Chamberlain, V., Carlson, J. B. & Young, J. M. (eds.) Songs from the Sky: Indigenous Astronomical and Cosmological Traditions of the World (Ocarina Books, 2005).
[7] Roe, P. G. Mythic substitution and the stars: Aspects of Shipibo and Quechua ethnoastronomy compared. In Del Chamberlain, V., Carlson, J. B. & Young, J. M. (eds.) Songs from the Sky: Indigenous Astronomical and Cosmological Traditions of the World (Ocarina Books, 2005).
[8] Mountford, C. P. Art, Myth and Symbolism. Vol 1.
(Melbourne University Press, 1956).
[9] Best, E. The astronomical knowledge of the Maori, genuine and empirical (Dominion Museum, 1922).
[10] Magaña, E. & Jara, F. The Carib sky. Journal de la Société des Américanistes 105–132 (1982).
[11] Urton, G. At the crossroads of the earth and the sky: an Andean cosmology (University of Texas Press, 1981).
[12] Johnson, D. Night skies of Aboriginal Australia: a noctuary (Sydney University Press, 2014).
[13] Maegraith, B. G. The astronomy of the Aranda and Luritja tribes. Transactions of the Royal Society of South Australia 56, 19–26 (1932).
[14] Cairns, H. & Harney, B. Y. Dark sparklers: Yidumduma’s Aboriginal astronomy (2004).
[15] Krupp, E. C. Sky tales and why we tell them. In Selin, H. (ed.) Astronomy Across Cultures, 1–30 (Springer, 2000).
[16] Aveni, A. F. Skywatchers of ancient Mexico. (1980).
[17] Chéreau, F. & the Stellarium team. Stellarium (2020). URL stellarium.org. Version 0.20.1.
[18] Metzger, W. Laws of seeing. (MIT Press, 1936/2006).
[19] Yantis, S. Multielement visual tracking: Attention and perceptual organization. Cognitive Psychology 24, 295–340 (1992).
[20] Hutchins, E. The role of cultural practices in the emergence of modern human intelligence. Philosophical Transactions of the Royal Society B: Biological Sciences 363, 2011–2019 (2008).
[21] Wagemans, J. et al. A century of Gestalt psychology in visual perception: I. perceptual grouping and figure–ground organization. Psychological Bulletin 138, 1172–1217 (2012).
[22] Wagemans, J. et al. A century of Gestalt psychology in visual perception: II. conceptual and theoretical foundations. Psychological Bulletin 138, 1218–1252 (2012).
[23] Compton, B. J. & Logan, G. D. Evaluating a computational model of perceptual grouping by proximity. Perception & Psychophysics 53, 403–421 (1993).
[24] Kubovy, M., Holcombe, A. O. & Wagemans, J. On the lawfulness of grouping by proximity. Cognitive Psychology 35, 71–98 (1998).
[25] Dry, M. J., Navarro, D.
J., Preiss, K. & Lee, M. D. The perceptual organization of point constellations. Proceedings of the 31st Annual Meeting of the Cognitive Science Society 1151–1156 (2009).
[26] van den Berg, M. C. J. Grouping by proximity and grouping by good continuation in the perceptual organization of random dot patterns. Ph.D. thesis, University of Virginia (1998).
[27] Froyen, V., Feldman, J. & Singh, M. Bayesian hierarchical grouping: Perceptual grouping as mixture estimation. Psychological Review 122, 575 (2015).
[28] Im, H. Y., Zhong, S.-h. & Halberda, J. Grouping by proximity and the visual impression of approximate number in random dot arrays. Vision Research 126, 291–307 (2016).
[29] Zahn, C. T. Graph-theoretical methods for detecting and describing Gestalt clusters. IEEE Transactions on Computers 20, 68–86 (1971).
[30] Ahuja, N. Dot pattern processing using Voronoi neighborhoods. IEEE Transactions on Pattern Analysis and Machine Intelligence 336–343 (1982).
[31] Field, D. J., Hayes, A. & Hess, R. F. Contour integration by the human visual system: evidence for a local “association field”. Vision Research 33, 173–193 (1993).
[32] Das, A. & Gilbert, C. D. Topography of contextual modulations mediated by short-range interactions in primary visual cortex. Nature 399, 655–661 (1999).
[33] Geisler, W. S., Perry, J. S., Super, B. J. & Gallogly, D. P. Edge co-occurrence in natural images predicts contour grouping performance. Vision Research 41, 711–724 (2001).
[34] Gibbon, W. B. Asiatic parallels in North American star lore: Ursa Major. The Journal of American Folklore 77, 236–250 (1964).
[35] Berezkin, Y. The cosmic hunt: Variants of a Siberian-North American myth. Folklore: Electronic Journal of Folklore 79–100 (2005).
[36] Baity, E. C. et al. Archaeoastronomy and ethnoastronomy so far [and comments and reply]. Current Anthropology 14, 389–449 (1973).
[37] Culver, R. Astronomy. In Selin, H. (ed.)
Encyclopaedia of the history of science, technology, and medicine in non-western cultures, 292–299 (Springer, 2008).
[38] Johnson, D. D. Interpretations of the Pleiades in Australian Aboriginal astronomies. Proceedings of the International Astronomical Union 7, 291–297 (2011).
[39] Leaman, T. M. & Hamacher, D. W. Baiami and the emu chase: an astronomical interpretation of a Wiradjuri Dreaming associated with the Burbung. Journal of Astronomical History and Heritage 22, 225–237 (2019).
[40] Hoffleit, D. & Jaschek, C. The bright star catalogue (1982).
[41] Bürkner, P.-C. brms: An R package for Bayesian multilevel models using Stan. Journal of Statistical Software 80, 1–28 (2017).
[42] Hamacher, D. W. On the astronomical knowledge and traditions of Aboriginal Australians. Ph.D. thesis, Macquarie University (2012).
[43] Van Oeffelen, M. P. & Vos, P. G. Enumeration of dots: An eye movement analysis. Memory & Cognition 12, 607–612 (1984).
[44] Compton, B. J. & Logan, G. D. Judgments of perceptual groups: Reliability and sensitivity to stimulus transformation. Perception & Psychophysics 61, 1320–1335 (1999).
[45] Schreiner, J. Redefining constellations and asterisms. Available at http://www.jschreiner.com/english/stars/home.html.
[46] Xu, S., Chen, K. & Zhou, Y. Re-clustering of constellations through machine learning. Tech. Rep., Stanford University (2014).
[47] Avilin, T. Astronyms in Belarussian folk beliefs. Archaeologia Baltica 10, 1 (2009).
[48] Stanbridge, W. E. On the astronomy and mythology of the Aborigines of Victoria. Proceedings of the Philosophical Institute of Victoria 2, 137–140 (1857).
[49] Kaye, G. R. Hindu astronomy: Ancient science of the Hindus (New Delhi, 1981).
[50] Ammarell, G. Astronomy in the Indo-Malay archipelago. In Selin, H. (ed.) Encyclopaedia of the History of Science, Technology, and Medicine in Non-Western Cultures, 324–333 (Springer, 2008).
[51] Erdland, P. A.
Die Marshall-Insulaner: Leben und Sitte, Sinn und Religion eines Südsee-Volkes (Aschendorffsche, 1914).

Supplementary Information

Contents

1 Cross-cultural data
2 Stellar data
3 Measuring the match between asterisms
4 Common asterisms
5 The GC Model
   5.1 Fitting ρ
6 GC model results
7 Model comparisons
   7.1 Scoring functions
   7.2 GC model
   7.3 CODE model
   7.4 k-means clustering
   7.5 Additional baselines
   7.6 Model scores
8 Comparisons across cultures
A Asterism systems for 27 cultures
B Common asterisms
C Asterisms for the GC model with n = 320

1 Cross-cultural data

Appendix A shows asterisms for all 27 cultures in our data set. The majority of the systems were drawn from the Stellarium software package, and we compiled the remainder using sources given in the figure captions in Appendix A. Stellarium includes multiple systems for some cultures: for example, there are early and later versions of the Babylonian sky culture, and three versions of the Chinese sky culture. In cases like these we removed all but a single representative of each culture. We also removed a number of additional Stellarium systems for reasons documented in Table S1. Our final data set includes 22 of the 42 Stellarium systems available as of May 25, 2020.
Before carrying out our analyses we pre-processed each system by removing stars fainter than 4.5 in magnitude, then removing all asterisms that included no stars or just one star after filtering. For example, the constellation Mensa is removed from the Western system because the brightest star in this constellation has a magnitude of 5.08. Figure S4 shows the distribution of magnitudes for each system in our data set.

| Sky culture | Reason for exclusion |
| --- | --- |
| Almagest | Lists 48 constellations of the Greeks, which are the source of the Western system |
| Arabic | Based on the 48 constellations of the Greeks, which are the source of the Western system |
| Armintxe | Identifications not sufficiently grounded in the published literature |
| Aztec | Identifications not sufficiently grounded in the published literature |
| Boorong | Already included in the data set |
| Chinese contemporary | Only one Chinese system is included |
| Chinese medieval | Only one Chinese system is included |
| Hawaiian starlines | Identifications not sufficiently grounded in the published literature |
| Indian | Already included in the data set |
| Japanese moon stations | Identifications not sufficiently grounded in the published literature |
| Kamilaroi | Includes names of single stars only |
| Korean | Closely related to the Chinese system |
| Maya | Identifications not sufficiently grounded in the published literature |
| Mongolian | Identifications not sufficiently grounded in the published literature |
| Northern Andes | Identifications not sufficiently grounded in the published literature |
| Sardinian | Identifications not sufficiently grounded in the published literature |
| Seleucid | Only one Babylonian system is included |
| Western (Sky & Telescope) | Only one Western system is included |
| Western (Hlad) | Only one Western system is included |
| Western (Rey) | Only one Western system is included |

Table S1: Stellarium systems excluded from our analysis.
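The magnitude-based pre-processing described in this section can be sketched in a few lines of Python. This is only an illustration: the dictionary layout, the function name `preprocess_system` and the toy star identifiers are assumptions, and only the 4.5 threshold and the 5.08 magnitude of Mensa's brightest star come from the text.

```python
def preprocess_system(asterisms, magnitudes, threshold=4.5):
    """Filter one asterism system by apparent magnitude.

    asterisms:  dict mapping asterism name -> iterable of star ids (assumed layout)
    magnitudes: dict mapping star id -> apparent magnitude
    """
    filtered = {}
    for name, stars in asterisms.items():
        # Larger magnitude means fainter. Keeping stars at exactly 4.5 is an assumption.
        kept = {s for s in stars if magnitudes[s] <= threshold}
        # Asterisms left with no stars or a single star are removed.
        if len(kept) >= 2:
            filtered[name] = kept
    return filtered


# Toy data: only the value 5.08 for Mensa's brightest star is taken from the text.
magnitudes = {"men_brightest": 5.08, "a": 1.0, "b": 2.0, "c": 4.9}
asterisms = {"Mensa": ["men_brightest"], "Toy": ["a", "b", "c"]}
result = preprocess_system(asterisms, magnitudes)
```

In this toy run Mensa disappears entirely, and the hypothetical "Toy" asterism keeps only its two stars brighter than the threshold.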
Figure S4: Distributions of star magnitudes for all systems in our data set. The vertical line in each plot shows the threshold value of 4.5, and the order of the systems matches Figure 3. (One histogram per system; x-axis: magnitude, y-axis: count.)

Figure S5: Counts of asterisms that are connected and disconnected with respect to the GC model graph. Each point corresponds to a system in our data, and systems with more than 3 disconnected asterisms have been labelled. (x-axis: # connected, y-axis: # disconnected; labelled systems: Egyptian, Babylonian, Western and Chinese.)

Filtering at 4.5 removes around 25% of the stars across the entire set of systems, but the proportion of faint stars varies across systems. Nearly 50% of the stars in the Tukano system have magnitudes greater than 4.5, but for around half of the systems 10% or fewer of the stars have magnitudes greater than 4.5. Many of the cultures in our data have systems of asterisms that have not been documented in full, and the ethnoastronomical accounts that do exist naturally tend to focus on brighter stars. The distributions in Figure S4 therefore may not reflect the full set of asterisms that would be identified by expert astronomers from the cultures in question.

The plots in Figures S18 through S44 show all asterisms remaining after the initial filtering step. In each plot the constellation figures are minimum spanning trees computed over the model graph using angular distance as the edge weight.
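As a rough illustration of how such constellation figures could be computed, the sketch below pairs a great-circle distance function with Kruskal's algorithm restricted to the stars of one asterism. It is not the authors' code: the function names, the data layout and the toy coordinates are assumptions, and Kruskal's algorithm is simply one standard way to obtain a minimum spanning tree.

```python
import math


def angular_distance(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine form, which stays accurate for small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def spanning_forest(stars, edges, position):
    """Kruskal's algorithm over the model-graph edges that join `stars`.

    edges:    iterable of (u, v) pairs from the model graph (assumed layout)
    position: dict star -> (ra_deg, dec_deg)
    Returns the chosen edges: a tree if the induced subgraph is connected,
    otherwise a forest with one tree per connected component.
    """
    stars = set(stars)
    parent = {s: s for s in stars}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path halving
            s = parent[s]
        return s

    weighted = sorted(
        (angular_distance(*position[u], *position[v]), u, v)
        for u, v in edges if u in stars and v in stars
    )
    chosen = []
    for _, u, v in weighted:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # this edge does not close a cycle
            parent[root_u] = root_v
            chosen.append((u, v))
    return chosen


# Toy example with made-up coordinates (degrees).
position = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (3.0, 0.0), "d": (90.0, 0.0)}
edges = [("a", "b"), ("b", "c"), ("a", "c")]
tree = spanning_forest(["a", "b", "c"], edges, position)
forest = spanning_forest(["a", "b", "d"], edges, position)
```

Because Kruskal's algorithm never needs the subgraph to be connected, the same routine returns a spanning forest when the stars of an asterism are not all linked by model-graph edges.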
In some cases an asterism does not correspond to a connected subset of the model graph, and in these cases minimal spanning forests are shown instead. For example, the Dakota system in Figure S23 includes a large asterism called “Ki Inyanka Ocanku” (The Race Track) that groups stars from Gemini, Canis Minor, Canis Major, Orion, Taurus and Auriga into a large circle. The scale of this asterism is larger than the scale of the model graph, and as a result Figure S23 shows the asterism as a collection of 5 disconnected components. Figure S5 shows the number of disconnected asterisms for each culture in our data. Around 90% of asterisms in the filtered data are connected with respect to the model graph, and the Egyptian, Babylonian and Western systems stand out as having relatively high proportions of disconnected asterisms.

2 Stellar data

We used stellar data from version 5.0 of the Yale Bright Star catalog, which includes information about magnitude and position (right ascension and declination) for 9110 stars. Star positions in our data use J2000 coordinates, and are therefore correct for Jan 1, 2000. Star positions change over time due to precession, nutation, and proper motion. Precession and nutation do not affect our analyses because they do not affect the relative positions of stars with respect to each other. Proper motion does affect the shapes of asterisms over long periods of time; for example, Hamacher [1] describes how the shape of the Southern Cross has changed over the past 10,000 years. J2000 coordinates are suitable for our purposes because the systems analyzed in this paper are based on records from the last few thousand years, and because the stars with the greatest proper motion are at or below the threshold of visibility.

We filtered the data to retain only stars brighter than 6.5 in magnitude, which roughly corresponds to the faintest magnitude still visible to the naked eye. Some stars (e.g.
double stars) are very close to each other, and if two stars had positions that matched up to 5 decimal places we replaced them with a single star with magnitude equal to the combined magnitude of the pair. These initial pre-processing steps yielded a set of 8258 stars. For all analyses we filtered the set further and considered only the 918 stars brighter than 4.5 in magnitude.

3 Measuring the match between asterisms

The function for computing the match(a, r) between asterism a and reference asterism r appears as Equation 1 in Materials and Methods. There are two ways in which a can differ from r: it can include extraneous stars, and it can fail to include some of the stars in r. The match function penalizes the first of these failings more heavily than the second. This property is especially useful when comparing an asterism against a reference that includes a relatively large number of stars. For example, the teapot asterism includes 8 of the brightest stars in Sagittarius, and the version of Sagittarius in our Western system includes 17 stars after thresholding at magnitude 4.5. Intuitively, the teapot matches Sagittarius fairly well, and the function in Equation 1 assigns a match of 0.47 between the teapot (a) and Sagittarius (r). If we used an alternative match function where the numerator included penalties for both $|a \setminus r|$ and $|r \setminus a|$, then the match between the teapot and Sagittarius would be 0. The match between an asterism a and an entire system of asterisms S is defined in Equation 2 of Materials and Methods.

4 Common asterisms

The most common asterisms across our data set are listed in Table S2 in Appendix B, which extends Table 1 by including additional asterisms.

5 The GC Model

The Graph Clustering (GC) model begins by building a graph over the 918 stars that remained after preprocessing.
We construct three Delaunay triangulations over stars with magnitudes less than 3.5, 4.0 and 4.5 respectively, and the final graph G (shown in Figure S6) is the union of all three. An edge in G that joins stars x and y is labelled with two attributes: $m_{xy}$, the apparent magnitude of the fainter of the two stars, and $d_{xy}$, the angular distance between the stars. The two attributes are on different scales: m lies between −1.46 and 4.5, and d lies between 0.1 and 41.6 degrees. In both cases higher values are “worse”: distant stars are relatively unlikely to be grouped, and faint stars are relatively unlikely to be included in groupings. We convert each magnitude m to a brightness b, and each distance d to a proximity p:

$$b_{xy} = \exp(-m_{xy}), \qquad p_{xy} = \exp(-d_{xy}) \tag{S1}$$

Figure S6: Graph over stars used by the GC model.

The negative exponential transformation means that higher values are now “better.”¹ We then weight brightness $b_{xy}$ and proximity $p_{xy}$ based on a parameter ρ:

$$b \leftarrow b^{\frac{\rho}{\rho+1}}, \qquad p \leftarrow p^{\frac{1}{\rho+1}} \tag{S2}$$

where we have dropped the subscripts of both $b_{xy}$ and $p_{xy}$. When ρ = 1 proximity and brightness are weighted equally, and when ρ > 1 brightness is weighted more than proximity. When ρ = 0 brightness is effectively discarded, and when ρ = ∞ proximity is effectively discarded.

The next step is to scale the brightness and proximity values within a local neighborhood of 60°. For each edge $(s_1, s_2)$ joining stars $s_1$ and $s_2$, the local neighborhood L is the subgraph of the full model graph G that includes all stars that lie within 60° of either $s_1$ or $s_2$. The p value of the edge $(s_1, s_2)$ is then scaled by the factor

$$\frac{p_G}{p_L} = \frac{\operatorname{median}_{e_g \in G}\{e_g(p)\}}{\operatorname{median}_{e_l \in L}\{e_l(p)\}} \tag{S3}$$

where $e_g$ is an edge in the full model graph G, $e_l$ is an edge that lies within the local neighborhood L, and $e_i(p)$ is the p value of edge $e_i$. Scaling p in this way means that the distribution of p values within any local neighborhood becomes comparable to the distribution over the entire graph.
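Equations S1 to S3 can be traced numerically with a short Python sketch. The function names and the toy attribute values are assumptions made for illustration, and the sketch handles finite values of ρ only; finding the edges that fall inside a 60° neighborhood is left to the caller.

```python
import math
from statistics import median


def weighted_attributes(m_xy, d_xy, rho):
    """Equations S1 and S2 for one edge: returns (b, p) after weighting by rho."""
    b = math.exp(-m_xy)  # brightness from the fainter star's magnitude (S1)
    p = math.exp(-d_xy)  # proximity from the angular distance in degrees (S1)
    # S2: exponents rho/(rho+1) and 1/(rho+1); finite rho only.
    return b ** (rho / (rho + 1)), p ** (1 / (rho + 1))


def locally_scaled(value, global_values, local_values):
    """Equation S3: multiply an edge attribute by median(global) / median(local)."""
    return value * median(global_values) / median(local_values)


# rho = 1 weights both attributes equally (square roots of b and p).
b_equal, p_equal = weighted_attributes(2.0, 5.0, rho=1.0)
# rho = 0 discards brightness: b becomes 1 for every edge.
b_none, p_full = weighted_attributes(2.0, 5.0, rho=0.0)

# Toy neighborhood whose median proximity is twice the global median, so the
# edge's proximity is halved by the scaling step.
scaled = locally_scaled(0.4, global_values=[0.1, 0.2, 0.3],
                        local_values=[0.4, 0.4, 0.8])
```

The last call shows the intended effect of local scaling: in a neighborhood where most edges already have high proximity, an individual edge has to be unusually close to keep a high value.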
For example, consider a neighborhood that includes many close stars. Before scaling, most edges in the neighborhood will have high values of p. After scaling, only pairs of stars that are especially close relative to the neighborhood will have high values of p. The same approach in Equation S3 is used to scale the brightness values b. When scaling both attributes a neighborhood size of 60° was chosen so that the neighborhood corresponds roughly to the extent of mid-peripheral vision.

After scaling, the proximity and brightness values for each edge are combined multiplicatively to produce a single strength s = bp for each edge. We then threshold the graph by removing all but the top n edges in the graph, and the clusters returned by the model correspond to connected components of the thresholded graph.

To assess the contribution made by different components of the GC model we will compare the model to three variants. First is a model that omits the local scaling step. This GC (no scaling) model can also be viewed as a variant in which neighborhood L in Equation S3 expands to encompass the entire graph G. The second and third variants set ρ = 0 and ρ = ∞ respectively, and we refer to them as the GC (no brightness) and GC (no proximity) models. These labels indicate whether or not brightness and proximity contribute to the final edge strengths s, but in both cases brightness and proximity are still used when constructing the original graph G: each Delaunay triangulation uses proximity, and combining the three triangulations means that only stars brighter than 4.5 in magnitude are included.

5.1 Fitting ρ

The GC model has two parameters: ρ, which determines the relative contributions of proximity and brightness, and n, which determines the number of edges in the thresholded graph. We fit ρ based on the idea that stars connected by strong edges in the model graph should be frequently grouped across cultures.
The first step is to assemble a set of human edges by computing minimum spanning trees (MSTs) for each asterism in our data set. All MSTs were computed over the model graph G using raw angular distance as the edge weight. The human edges included all edges in these MSTs, and the human strength of each edge was defined as the number of times it appeared in the set. For example, the edge joining the Southern Pointers appears in 9 of the MSTs and therefore has a human strength of 9. We then assembled a set of model edges that included all of the strongest edges according to the model. The model edges include all edges in the MST of G, where the MST is computed using model strengths s = bp rather than angular distance. Parameter ρ can then be set to the value that maximizes the correlation between the model edge strengths and the human edge strengths. The best values of ρ for the GC and GC (no scaling) models were 3.5 and 3.2, which yield correlations of 0.72 and 0.66 respectively. Both models achieve almost identical correlations for ρ = 3, and for simplicity we set ρ = 3 for all subsequent analyses. Because ρ > 1, this setting means that the edge strengths in the model are influenced more by brightness than by proximity.

¹ Brightness could be defined as flux (i.e. $b_{xy} = 10^{-m_{xy}/2.5}$) but the natural exponential formulation is simpler and equally good for our purposes. Changing from base e to base 10 leaves the model unchanged if the parameter ρ is adjusted accordingly.

Figure S7a compares human strengths with strengths according to the GC model. The two edges with greatest human strengths join the three stars in Orion’s belt (δ, ε and ζ Ori), and these edges have human strengths of 33 because some of the 27 systems in our data include Orion’s belt in more than one asterism. The same two edges are the strongest and third-strongest edges according to the model, and the second strongest model edge joins the Southern Pointers (α and β Cen).
This second edge appears as an outlier in Figure S7a, and one possible reason is that these stars lie relatively far south and our data set is tilted towards cultures from the Northern Hemisphere. Corresponding plots for the three model variants are shown in Figure S7. All three perform worse than the full GC model, suggesting that local scaling helps to account for human groupings and confirming that both brightness and proximity are important.

6 GC model results

We began by testing the prediction that the most common asterisms in Table S2 should be relatively well captured by the model. To avoid having to choose a single value of the threshold parameter n, we created a set $S_{\text{model}}$ that includes model systems for all values of n between 1 and 2000. The video at www.charleskemp.com/papers/constellations.mp4 includes a frame for each system and shows how model asterisms emerge as n is increased. We then computed model scores for each human asterism a using the function modelscore(a, $S_{\text{model}}$) defined in Equation 4 of Materials and Methods. The model scores in Table S2 indicate that most of the common asterisms are captured fairly well by the model. The most notable exception is the Great Square of Pegasus. Model scores for each culture in our data set are plotted in Figure 3. Although Table S2 suggests that common asterisms are often captured by the model, Figure 3 shows that there are many less common asterisms that the model does not explain.

The model evaluations thus far do not depend on a specific setting of the threshold parameter n. The model system in Figure 1B, however, is based on setting n = 320, and Table S3 in Appendix C lists all 124 asterisms in this system. The column labeled Score indicates how well these asterisms match our data set, and this column is defined using

$$\text{human score}(a, S_{\text{human}}) = \max_{S \in S_{\text{human}}} \operatorname{match}(a, S) \tag{S4}$$

where $S_{\text{human}}$ is the set of all 27 systems in our data set.
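Equation S4 can be illustrated with a small Python sketch. Equations 1 and 2 of Materials and Methods are not reproduced in this supplement, so the `match` function below is an assumption: it is one form that agrees with the worked teapot example in Section 3 (a match of 0.47 against a 17-star Sagittarius, and 0 once missing stars are also penalized in the numerator). The function names and the integer star ids are likewise hypothetical.

```python
def match(a, r):
    """ASSUMED form of the asterism match (not taken from the source):
    shared stars minus extraneous stars, floored at zero, over the size of r."""
    a, r = set(a), set(r)
    return max(len(a & r) - len(a - r), 0) / len(r)


def match_system(a, system):
    """ASSUMED form of Equation 2: best match between a and any asterism in a system."""
    return max(match(a, r) for r in system)


def human_score(a, human_systems):
    """Equation S4: best match between a and any system in the human data."""
    return max(match_system(a, system) for system in human_systems)


# Toy star ids: a 17-star "Sagittarius" and an 8-star "teapot" contained in it.
sagittarius = set(range(17))
teapot = set(range(8))
teapot_match = match(teapot, sagittarius)  # 8 / 17

# Alternative numerator that also penalizes stars of r missing from a.
symmetric = max(len(teapot & sagittarius)
                - len(teapot - sagittarius)
                - len(sagittarius - teapot), 0) / len(sagittarius)

# Two toy systems; the second contains an asterism identical to the query.
systems = [[sagittarius], [teapot, {100, 101}]]
score = human_score(teapot, systems)
```

Under this assumed form a model asterism receives a human score of 1 only when some system contains exactly the same set of stars, which is why the score is described as strict.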
Of the model asterisms, 27 are identical to asterisms from one or more cultures, and 105 have scores of 0.2 or greater, indicating that they correspond at least partially to asterisms from at least one culture. Note that the scoring function is relatively strict, especially for larger asterisms with many variants that can be created by including or excluding fainter stars. For example, Figure 1B suggests that the model captures Orion relatively well, but the model version of Orion achieves a score of only 0.36.

Figure S7: Strengths of star pairs according to the cross-cultural data and four models: (A) the GC model, (B) the GC model without local scaling, (C) the GC model with edge strengths based on brightness only, and (D) the GC model with edge strengths based on proximity only. In each panel each point corresponds to a pair of stars joined by an edge in the model graph, and selected edges are labelled in gray. The y-axis of each panel shows counts across the entire data set, and the x-axis shows strengths according to the model. (Panel correlations: (A) GC model, r = 0.72; (B) GC (no scaling), r = 0.66; (C) GC (no proximity), r = 0.35; (D) GC (no brightness), r = 0.23.)

7 Model comparisons

Our model belongs to a family of graph-based clustering algorithms that rely on a graph defined over the items to be clustered [2, 3, 4]. The main alternative in the literature on perceptual grouping is the CODE model [5, 6, 7] which uses a continuous spatial representation of the items to be clustered.
A third possible approach is k-means clustering, which has been previously applied to the problem of grouping stars into asterisms [8, 9]. We compared all three approaches using a set of three different scoring functions. Consistent with our previous analyses, the input to each model includes all stars brighter than 4.5 in magnitude.

7.1 Scoring functions

Suppose that H (for human) is a set of human clusters and M is a set of model clusters. A good set M should have high precision: each cluster in M should be similar to a cluster in H. A good set M should also have high recall: for each cluster h in H there should be some cluster in M that is similar to h. We formalize precision and recall as follows:

$$\operatorname{precision}(M, H) = \frac{1}{|M|} \sum_{m \in M} \operatorname{match}(m, H) \tag{S5}$$

$$\operatorname{recall}(M, H) = \frac{1}{|H|} \sum_{h \in H} \operatorname{match}(h, M) \tag{S6}$$

where the match(·, ·) function is defined in Equation 2 of Materials and Methods. Precision and recall are typically combined using an F measure:

$$F_\beta = (1 + \beta^2) \, \frac{\operatorname{precision} \cdot \operatorname{recall}}{(\beta^2 \cdot \operatorname{precision}) + \operatorname{recall}} \tag{S7}$$

The standard F measure sets β = 1, but we also consider an $F_{10}$ measure that sets β = 10 and weights recall more heavily than precision. The $F_{10}$ measure captures the idea that a model system M that includes just one or two clusters (i.e. recall is low) should not score highly regardless of how well the model clusters match attested clusters.

Our measures of precision and recall and their combination using an $F_\beta$ score are directly inspired by the literature on information retrieval. If match(a, S) returned 1 if a belonged to S and 0 otherwise, then our formulations of precision and recall in Equations S5 and S6 would be equivalent to the standard definitions. Our match(·, ·) function, however, is graded, which means that our formulations of precision and recall are extensions of the standard definitions.

Our third scoring function is the adjusted Rand index, which is a standard measure of the similarity between two partitions.
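The precision, recall and $F_\beta$ definitions in Equations S5 to S7 can be sketched as follows. To keep the example self-contained it uses the binary match mentioned above (1 if the cluster belongs to the system, 0 otherwise) rather than the paper's graded match function; the function names and toy clusters are assumptions.

```python
def binary_match(a, system):
    """Stand-in for match(a, S): 1 if a belongs to S and 0 otherwise."""
    return 1.0 if frozenset(a) in system else 0.0


def precision(M, H, match=binary_match):
    """Equation S5: mean match of each model cluster against the human system."""
    return sum(match(m, H) for m in M) / len(M)


def recall(M, H, match=binary_match):
    """Equation S6: mean match of each human cluster against the model system."""
    return sum(match(h, M) for h in H) / len(H)


def f_beta(p, r, beta=1.0):
    """Equation S7; returns 0 when precision and recall are both 0."""
    denominator = beta ** 2 * p + r
    return 0.0 if denominator == 0 else (1 + beta ** 2) * p * r / denominator


# Toy systems: four human clusters, and a model that finds only one of them.
H = [frozenset({1, 2}), frozenset({3, 4}), frozenset({5, 6}), frozenset({7, 8})]
M = [frozenset({1, 2})]

p = precision(M, H)      # the single model cluster is attested
r = recall(M, H)         # only one of four human clusters is recovered
f1 = f_beta(p, r, beta=1.0)
f10 = f_beta(p, r, beta=10.0)
```

In this toy case the one-cluster model has perfect precision but low recall, and its $F_{10}$ value stays close to the recall while $F_1$ is noticeably higher, which mirrors the motivation given for preferring $F_{10}$.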
Many of the cluster systems that we consider pick out a relatively small number of clusters against a background of unclustered stars. In order to apply the adjusted Rand index we assign all unclustered stars to an “everything else” category.

Of the three scoring functions, the $F_{10}$ measure deserves the most attention. The standard F measure (i.e. $F_1$) has the shortcoming of assigning high scores to model solutions with a very small number of clusters. The adjusted Rand index is undesirable because of the need to include an “everything else” category. We report results for both measures because they are standard in the literature, but will focus primarily on the $F_{10}$ measure.

Each of the measures so far scores a model system M relative to a single human system H. We evaluate a model solution relative to the full set $\mathcal{H}$ of systems for 27 cultures by computing the average score across this set.

Figure S8: Asterisms returned by the GC model (n = 165).

Figure S9: Asterisms returned by the GC model (no scaling, n = 122).

Figure S10: Asterisms returned by the GC model (no brightness, n = 321).

Figure S11: Asterisms returned by the GC model (no proximity, n = 123).

Figure S12: Asterisms returned by the CODE model with rescaled Gaussian kernels, local distances and the sum combination function (t = 0.86, β = 0.86, h = 0.5).

7.2 GC model

The model includes two parameters: ρ, which controls the relative weights of brightness and proximity, and the threshold parameter n. Previously ρ was set to 3 based on the correlation analysis summarized by Figure S7a, and we retain that value here. The threshold is set to the value (n = 165) that maximizes model performance according to the $F_{10}$ measure. In addition to the GC model we consider the three variants of the model previously evaluated in Figure S7, and the n parameter is optimized separately for each one using the $F_{10}$ measure. Asterisms returned by all four models are shown in Figures S8 through S11.
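The role of the threshold n can be made concrete with a minimal sketch of the final GC step described in Section 5: keep the n strongest edges and read off connected components. The union-find implementation, the function name `gc_clusters` and the toy edge strengths are assumptions made for illustration; computing the strengths themselves is not shown.

```python
def gc_clusters(edge_strengths, n):
    """Keep the n strongest edges and return the connected components.

    edge_strengths: dict mapping (u, v) -> strength s (assumed layout)
    Stars that touch no retained edge are left unclustered.
    """
    top_edges = sorted(edge_strengths, key=edge_strengths.get, reverse=True)[:n]
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in top_edges:
        parent[find(u)] = find(v)  # merge the two components

    groups = {}
    for star in list(parent):
        groups.setdefault(find(star), set()).add(star)
    return sorted(sorted(group) for group in groups.values())


# Toy strengths for a five-star graph.
strengths = {("a", "b"): 0.9, ("b", "c"): 0.8, ("d", "e"): 0.7, ("c", "d"): 0.1}
clusters_small = gc_clusters(strengths, n=1)
clusters_mid = gc_clusters(strengths, n=3)
clusters_all = gc_clusters(strengths, n=4)
```

Raising n can only merge or extend clusters, so sweeping n from small to large values produces a nested sequence of systems from which a single setting such as n = 165 can be selected by score.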
7.3 CODE model

The CODE model can be implemented by dropping a kernel function (e.g. a Gaussian) on each item, combining all of these kernel functions to form an “activation surface,” then cutting the activation surface at some threshold t to produce clusters. Previous applications of the CODE model consider the problem of clustering a field of perceptually identical items, but for us the items are stars with different magnitudes. To capture the idea that brighter stars are more likely to be included in asterisms, we adapted the CODE model to allow taller kernel functions for brighter stars. If two stars are extremely close to each other, the sum of the kernels on the two should be identical to a single kernel for a star with apparent magnitude equivalent to the two stars combined. To satisfy this condition we set kernel heights based on the flux (i.e. apparent brightness) of a star. For a star with apparent magnitude m, the flux F of the star in the visual band is

$$F = F_0 \times 10^{-m/2.5} \tag{S8}$$

where $F_0$ is a normalizing constant. For our purposes we can drop the constant because scaling all kernels by a constant is equivalent to adjusting the threshold used by the CODE model. We also introduce an additional parameter β so that the height of the kernel on a star of magnitude m is

$$\left(10^{-m/2.5}\right)^{\beta} \tag{S9}$$

When β = 1 kernel height is proportional to flux, and when β = 0 all stars have the same kernel height regardless of flux. Our formulation of the CODE model has two additional numeric parameters: the threshold t, and the nearest-neighbour coefficient h. The standard deviation of the kernel for star i is

$$\sigma_i = h d_i \tag{S10}$$

where $d_i$ is the distance between the star and its nearest neighbour.
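The kernel parameters in Equations S8 to S10 can be checked with a few lines of Python. The function names are assumptions, the normalizing constant $F_0$ is dropped as in the text, and `combined_magnitude` uses the standard flux-additivity relation for two unresolved stars.

```python
import math


def flux(m):
    """Equation S8 with the normalizing constant F0 dropped."""
    return 10 ** (-m / 2.5)


def kernel_height(m, beta):
    """Equation S9: kernel height for a star of apparent magnitude m."""
    return flux(m) ** beta


def kernel_sigma(h, d_nearest):
    """Equation S10: kernel standard deviation from the nearest-neighbour distance."""
    return h * d_nearest


def combined_magnitude(m1, m2):
    """Magnitude of a single star whose flux equals the summed flux of two stars."""
    return -2.5 * math.log10(flux(m1) + flux(m2))


# With beta = 1, the kernel for a merged pair equals the sum of the two kernels.
merged_height = kernel_height(combined_magnitude(2.0, 3.0), beta=1.0)
summed_height = kernel_height(2.0, beta=1.0) + kernel_height(3.0, beta=1.0)

# With beta = 0, every star gets the same height regardless of magnitude.
flat_height = kernel_height(4.5, beta=0.0)

# Using the fitted coefficient h = 0.5 and a toy nearest-neighbour distance of 2 degrees.
sigma = kernel_sigma(0.5, 2.0)
```

The equality of `merged_height` and `summed_height` is the additivity condition that motivates flux-based heights; it holds exactly only when β = 1.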
In addition to these numeric parameters Compton and Logan (1993) consider several qualitative parameters of the model:

• Gaussian vs Laplacian: kernel functions may be Gaussian or Laplacian
• sum vs max: kernels may be combined using a sum or a max function
• local vs global: if global, all distances $d_i$ in Equation S10 are replaced by the global mean of $d_i$
• standard vs rescaled: if rescaled, all kernel functions are rescaled to have the same height

In our implementation the rescaling step specified by the fourth factor is carried out before adjusting the kernels for brightness as specified by Equation S9. As a result, the kernels are equal in height at the end of the process only when rescaling is applied and β = 0.

We evaluated all 16 combinations of the four factors, and optimized the three numeric parameters (β, t and h) separately for each combination using the $F_{10}$ measure. The optimization began with a grid search then used Powell’s conjugate direction method initialized using the best values found in the grid search. The best performing version used rescaled Gaussian kernels, local distances and the sum combination function, and the best parameters for this model were t = 0.86, β = 0.86, and h = 0.5. The clusters returned by this model are shown in Figure S12.

7.4 k-means clustering

k-means clustering begins by randomly choosing a set of k cluster centers. The algorithm then repeatedly assigns items to the nearest cluster and recomputes the cluster centers based on these assignments until convergence. We used the spherecluster package in Python to implement k-means clustering with distances computed over the celestial sphere. We ran k-means clustering for all k between 1 and 300. For comparison, the largest system of asterisms in our data (Chinese) includes 318 asterisms, and 161 remain when we threshold the system at a stellar magnitude of 4.5.
For each value of k we ran the algorithm using 100 different initial cluster assignments chosen using the package default (the k-means++ algorithm). Random initialization means that there is some noise in the results, but model performance according to the $F_{10}$ measure tended to increase monotonically with k. The best-scoring system in our simulations, however, had k = 249 and is shown in Figure S13.

As Figure S13 shows, k-means tends to partition the stars into clusters that are roughly equal in size. In contrast, the GC and CODE models both pick out a relatively small number of clusters against a background of “unclustered” stars. To parallel this behavior we consider a variant of k-means with a magnitude threshold m. This k-means threshold model runs regular k-means on all stars brighter than m, and all remaining stars are treated as unclustered. Based on the $F_{10}$ measure the thresholded model achieves best performance at m = 2.9 and k = 100, and a model result for these parameter values is shown in Figure S14.

Other than the magnitude threshold, the k-means threshold model does not take brightness into account, and the same applies to the basic k-means model. The distance measure used by k-means could potentially be adjusted to take brightness into account, but our focus here is on k-means clustering as it is typically applied.

Figure S13: Asterisms returned by k-means clustering (k = 249).

Figure S14: Asterisms returned by k-means clustering (k = 100) with a magnitude threshold of 2.9.
Figure S15: Scores of nine models according to three measures: the $F_{10}$ measure, the F measure, and the adjusted Rand index. (Models shown in each panel: GC, GC (no scaling), GC (no brightness), GC (no proximity), k-means, CODE, k-means (thresh), one cluster and singleton; y-axis: score.)

7.5 Additional baselines

Two additional baselines were included in the model comparison, both of which rely on a magnitude parameter m. The “one cluster” model assigns all stars brighter than m to a single cluster, and the “singleton model” assigns each of these stars to its own cluster. When m = 4.5 the singleton model is the limit of k-means when k approaches the total number of stars. For each baseline and each scoring metric we identified the best-performing value of m using an exhaustive search over m ∈ {3, 3.1, ..., 4.5}.

7.6 Model scores

Scores for all models according to the three scoring measures are shown in Figure S15. The GC model performs best regardless of which scoring measure is used, but we focus here on results for the $F_{10}$ measure. The second best model is k-means with a magnitude threshold. Figure S14 shows that this model picks out asterisms including the Southern Cross and Orion’s belt but the best magnitude threshold for the model (m = 2.9) means that it misses the Big Dipper, which includes a star of magnitude 3.3. Figure S13 shows that k-means without the magnitude threshold produces a large number of compact groupings that cover the sky in a way that is qualitatively unlike any of the systems in our data. The CODE model performs worse than the GC model, and the asterisms in Figure S12 reveal at least two qualitative limitations of the model.
First, the model misses asterisms (e.g. the Big Dipper) that include stars separated by relatively large distances. Second, in relatively dense regions (e.g. the area of the Milky Way surrounding the Southern Cross) the model tends to form groups containing relatively large numbers of fainter stars. If the CODE activation surface lies above the threshold in a given region, then all stars in the region are included, regardless of how faint they are. In contrast, human asterisms sometimes pick out a handful of bright stars without including fainter stars that lie nearby. For example, Betelgeuse and Bellatrix (the shoulders of Orion) are often grouped in our data without including a fainter star (32 Orionis) that lies between them.

Figure S16: Distributions of “other culture” scores for the asterisms in each culture. Scores of 1 indicate asterisms that are identical to asterisms in some other culture. The cultures are ordered based on the means of the distributions. (One histogram per culture; x-axis: asterism score, y-axis: count.)

8 Comparisons across cultures

In addition to comparing each system to the predictions of the GC model (Figure 3), we compared each system to other systems in the data set. For each system S let $S_{-S}$ be the set that includes all systems except for S. For each asterism a in S we used Equation 4 (Materials and Methods) to compute the extent to which a resembled an asterism in some system belonging to $S_{-S}$.
An “other culture” score of zero indicates that asterism a is dissimilar from all asterisms in all other cultures, and a score of 1 indicates that a is identical to an asterism from at least one other culture. Distributions of scores for each culture are shown in Figure S16. As mentioned in the main text, the system that differs most from all others is the Chinese system, but this result should be interpreted in light of genealogical relationships between cultures. There are strong historical relationships between some systems in our data set; for example, Western constellations are based in part on Babylonian tradition. One reason why the Chinese system stands out as different from the others is that the genealogical relationships between Chinese culture and most other cultures in our data are rather distant.

Figure S17 explores whether systems that tend to resemble systems from other cultures also tend to match the predictions of the GC model. The x and y coordinates of each point in the figure correspond to means of distributions plotted in Figures 3 and S16. The results are again influenced by genealogical relationships between cultures. In particular, the Western tradition is overrepresented in our data set, meaning that systems from this tradition have higher “other culture” scores than would otherwise be expected. A related bias arises because characterizations of other systems are often influenced by the Western system.

Figure S17: “Other culture” scores compared to model scores. Systems above the line match other cultures better than they match the predictions of the model. (One labelled point per culture; x-axis: model score, y-axis: other culture score.)
For example, the Belarusian system achieves a very high “other culture” score partly because some of the asterisms in this system are assumed to be identical to Western constellations such as Draco and Gemini [10]. Despite these limitations, Figure S17 suggests that model scores and “other culture” scores are highly correlated, which is expected given that the asterisms identified by the model tend to be shared across cultures. Most of the points fall above the line, indicating that systems tend to match systems from other cultures better than they match predictions of the model. This result is undoubtedly influenced by genealogical relationships between cultures, but may also indicate that there is scope to improve the model to better capture common patterns that recur across cultures.

A Asterism systems for 27 cultures

Figure S18: Anutan (Stellarium).
Figure S19: Arabic moon stations (Stellarium).
Figure S20: Belarusian (Stellarium).
Figure S21: Boorong [11, 1].
Figure S22: Chinese (Stellarium).
Figure S23: Dakota (Stellarium).
Figure S24: Egyptian (Stellarium).
Figure S25: Indian [12].
Figure S26: Indo-Malay [13].
Figure S27: Inuit (Stellarium).
Figure S28: Lokono (Stellarium).
Figure S29: Macedonian (Stellarium).
Figure S30: Maori (Stellarium).
Figure S31: Babylonian (MUL.APIN sky culture in Stellarium).
Figure S32: Marshall Islands [14].
Figure S33: Navajo (Stellarium).
Figure S34: Norse (Stellarium).
Figure S35: Ojibwe (Stellarium).
Figure S36: Pacariqtambo [15].
Figure S37: Romanian (Stellarium).
Figure S38: Sami (Stellarium).
Figure S39: Siberian (Stellarium).
Figure S40: Tongan (Stellarium).
Figure S41: Tukano (Stellarium).
Figure S42: Tupi (Stellarium).
Figure S43: Lenakel (Vanuatu) (Netwar sky culture in Stellarium).
Figure S44: Western (Stellarium).
B Common asterisms

| # | Human (raw) | Human (adj) | Model Score | Stars | Description |
|---|---|---|---|---|---|
| 1 | 0.63 | 0.55 | 1.0 | 34DelOri, 46EpsOri, 50ZetOri | Orion’s Belt |
| 2 | 0.62 | 0.57 | 1.0 | 25EtaTau, 17Tau, 19Tau, 20Tau, 23Tau, 27Tau | Pleiades |
| 3 | 0.59 | 0.51 | 1.0 | 87AlpTau, 54GamTau, 61Del1Tau, 74EpsTau, 78The2Tau | Hyades |
| 4 | 0.57 | 0.46 | 0.88 | 50AlpUMa, 48BetUMa, 64GamUMa, 69DelUMa, 77EpsUMa, 79ZetUMa, 85EtaUMa | Big Dipper |
| 5 | 0.43 | 0.42 | 1.0 | Alp1Cru, BetCru, GamCru, DelCru | Southern Cross |
| 6 | 0.37 | 0.36 | 0.5 | 5AlpCrB, 3BetCrB, 8GamCrB, 10DelCrB, 13EpsCrB, 4TheCrB, 14IotCrB | Corona Borealis |
| 7 | 0.35 | 0.35 | 1.0 | 66AlpGem, 78BetGem | Castor and Pollux |
| 8 | 0.3 | 0.34 | 0.6 | 34DelOri, 46EpsOri, 50ZetOri, 44IotOri, 42Ori | |
| 9 | 0.29 | 0.33 | 1.0 | 9AlpDel, 6BetDel, 12Gam2Del, 11DelDel | Delphinus |
| 10 | 0.28 | 0.31 | 0.71 | 18AlpCas, 11BetCas, 27GamCas, 37DelCas, 45EpsCas | Cassiopeia |
| 11 | 0.26 | 0.3 | 0.45 | 58AlpOri, 19BetOri, 24GamOri, 34DelOri, 46EpsOri, 50ZetOri, 53KapOri | Orion |
| 12 | 0.25 | 0.29 | 0.75 | 46EpsOri, 50ZetOri, 48SigOri | |
| 13 | 0.24 | 0.31 | 1.0 | 13AlpAri, 6BetAri, 5Gam2Ari | Head of Aries |
| 14 | 0.24 | 0.38 | 1.0 | Alp1Cen, BetCen | Southern Pointers |
| 15 | 0.24 | 0.33 | 0.44 | 50AlpUMa, 48BetUMa, 64GamUMa, 69DelUMa, 77EpsUMa, 79ZetUMa, 85EtaUMa, 1OmiUMa, 29UpsUMa, 30PhiUMa, 63ChiUMa, 23UMa | |
| 16 | 0.23 | 0.27 | 1.0 | 53AlpAql, 60BetAql, 50GamAql | Shaft of Aquila |
| 17 | 0.22 | 0.25 | 1.0 | 50AlpUMa, 48BetUMa | |
| 18 | 0.22 | 0.3 | 0.62 | 1AlpUMi, 7BetUMi, 13GamUMi, 23DelUMi, 22EpsUMi, 16ZetUMi, 21EtaUMi | Little Dipper |
| 19 | 0.21 | 0.34 | 0.04 | 54AlpPeg, 53BetPeg | |
| 20 | 0.21 | 0.31 | 1.0 | 8Bet1Sco, 7DelSco, 6PiSco | Head of Scorpius |
| 21 | 0.21 | 0.29 | 1.0 | 35LamSco, 34UpsSco | Stinger of Scorpius |
| 22 | 0.2 | 0.29 | 1.0 | Iot1Sco, KapSco, 35LamSco, 34UpsSco | |
| 23 | 0.19 | 0.31 | 0.75 | 32AlpLeo, 41Gam1Leo, 17EpsLeo, 36ZetLeo, 30EtaLeo, 24MuLeo | Sickle |
| 24 | 0.19 | 0.35 | 0.0 | 21AlpAnd, 88GamPeg | |
| 25 | 0.19 | 0.28 | 0.83 | 1AlpCrv, 9BetCrv, 4GamCrv, 7DelCrv, 2EpsCrv | Corvus |
| 26 | 0.18 | 0.25 | 0.56 | 21AlpSco, 8Bet1Sco, 7DelSco, 6PiSco, 20SigSco | |
| 27 | 0.18 | 0.3 | 0.33 | 50AlpCyg, 6Bet1Cyg, 37GamCyg, 18DelCyg, 53EpsCyg, 21EtaCyg | Northern Cross |
| 28 | 0.17 | 0.22 | 0.58 | 21AlpSco, 8Bet1Sco, 7DelSco, 26EpsSco, Zet2Sco, Mu1Sco, 6PiSco, 20SigSco, 23TauSco | |
| 29 | 0.17 | 0.27 | 1.0 | 21AlpSco, 20SigSco, 23TauSco | |
| 30 | 0.16 | 0.33 | 0.83 | 3AlpLyr, 10BetLyr, 14GamLyr, 12Del2Lyr, 6Zet1Lyr | Lyra |
| 31 | 0.16 | 0.27 | 0.56 | 26EpsSco, Zet2Sco, EtaSco, TheSco, Iot1Sco, KapSco, 35LamSco, Mu1Sco, 34UpsSco | |
| 32 | 0.16 | 0.26 | 0.01 | 21AlpAnd, 54AlpPeg, 53BetPeg, 88GamPeg | Square of Pegasus |
| 33 | 0.15 | 0.22 | 0.2 | 34DelOri, 46EpsOri, 50ZetOri, 87AlpTau, 54GamTau, 61Del1Tau, 74EpsTau, 78The2Tau, 17Tau | |
| 34 | 0.14 | 0.3 | 0.0 | 43GamCnc, 47DelCnc | |
| 35 | 0.14 | 0.35 | 0.67 | AlpCrA, BetCrA, GamCrA, DelCrA | Corona Australis |
| 36 | 0.14 | 0.34 | 1.0 | 39LamOri, 37Phi1Ori, 40Phi2Ori | |
| 37 | 0.14 | 0.35 | 0.56 | 6Alp2Cap, 9BetCap, 40GamCap, 49DelCap, 34ZetCap, 23TheCap, 32IotCap, 16PsiCap, 18OmeCap | Capricornus |
| 38 | 0.13 | 0.21 | 0.27 | 58AlpOri, 24GamOri, 34DelOri, 46EpsOri, 50ZetOri, 47OmeOri, 51Ori | |
| 39 | 0.13 | 0.21 | 1.0 | 58AlpOri, 24GamOri | |
| 40 | 0.13 | 0.21 | 0.32 | 21AlpSco, 8Bet1Sco, 7DelSco, 26EpsSco, Zet1Sco, EtaSco, TheSco, Iot1Sco, KapSco, 35LamSco, Mu1Sco, 6PiSco, 23TauSco | Scorpius |
| 41 | 0.12 | 0.23 | 0.33 | AlpCrA, BetCrA, GamCrA, EpsCrA, ZetCrA | |
| 42 | 0.12 | 0.35 | 1.0 | 68DelLeo, 70TheLeo | |
| 43 | 0.12 | 0.24 | 0.5 | 42AlpCom, 43BetCom, 15GamCom | |
| 44 | 0.11 | 0.33 | 1.0 | 3AlpLyr, 4Eps1Lyr, 6Zet1Lyr | |
| 45 | 0.11 | 0.29 | 0.23 | 13AlpAur, 34BetAur, 37TheAur, 3IotAur, 112BetTau | Auriga |
| 46 | 0.11 | 0.26 | 0.57 | EpsCar, IotCar, DelVel, KapVel | False Cross |
| 47 | 0.11 | 0.32 | 0.09 | 11AlpDra, 23BetDra, 33GamDra, 57DelDra, 63EpsDra, 22ZetDra, 14EtaDra, 13TheDra, 12IotDra, 5KapDra, 1LamDra, 25Nu2Dra, 32XiDra, 60TauDra, 44ChiDra | Draco |
| 48 | 0.11 | 0.29 | 0.67 | 48GamAqr, 55Zet2Aqr, 62EtaAqr, 52PiAqr | Water Jar |
| 49 | 0.11 | 0.33 | 0.53 | 16AlpBoo, 42BetBoo, 27GamBoo, 49DelBoo, EpsBoo, 30ZetBoo, 8EtaBoo, 25RhoBoo, 5UpsBoo | Boötes |
| 50 | 0.1 | 0.21 | 0.0 | AlpCrA, BetCrA, GamCrA, DelCrA, EpsCrA, ZetCrA, Eta2CrA, TheCrA, Kap2CrA, LamCrA, 7122, 7129 | |
| 51 | 0.1 | 0.31 | 1.0 | 9Alp2Lib, 27BetLib | |
| 52 | 0.1 | 0.23 | 1.0 | 7BetUMi, 13GamUMi, 5UMi | |
| 53 | 0.1 | 0.27 | 1.0 | 10AlpCMi, 3BetCMi | |
| 54 | 0.1 | 0.18 | 0.5 | 5RhoOph, 21AlpSco, 8Bet1Sco, 7DelSco, 6PiSco, 5RhoSco, 20SigSco, 23TauSco, 6070 | |
| 55 | 0.1 | 0.31 | 0.33 | 67AlpVir, 29GamVir, 43DelVir, 47EpsVir, 3NuVir | |
| 56 | 0.1 | 0.3 | 0.75 | 23BetDra, 33GamDra, 57DelDra, 25Nu2Dra, 32XiDra | |
| 57 | 0.1 | 0.25 | 0.33 | 58AlpOri, 24GamOri, 39LamOri, 37Phi1Ori, 40Phi2Ori | |

Table S2: An extended version of Table 1 that includes 57 asterisms in total.

C Asterisms for the GC model with n = 320

| # | Score | Stars | Description |
|---|---|---|---|
| 1 | 1.0 | 25EtaTau, 17Tau, 19Tau, 20Tau, 23Tau, 27Tau | Pleiades |
| 2 | 1.0 | Alp1Cen, BetCen | Southern Pointers |
| 3 | 1.0 | 10Gam2Sgr, 19DelSgr, 20EpsSgr, 38ZetSgr, EtaSgr, 22LamSgr, 34SigSgr, 40TauSgr, 27PhiSgr | Teapot |
| 4 | 1.0 | 53AlpAql, 60BetAql, 50GamAql | Shaft of Aquila |
| 5 | 1.0 | 34AlpAqr, 48GamAqr, 55Zet2Aqr, 62EtaAqr | Water Jar (part) |
| 6 | 1.0 | 9AlpDel, 6BetDel, 12Gam2Del, 11DelDel | Delphinus |
| 7 | 1.0 | AlpCrA, BetCrA, GamCrA | Corona Australis |
| 8 | 1.0 | 5AlpSge, 6BetSge, 12GamSge, 7DelSge | Sagitta |
| 9 | 1.0 | 13AlpAri, 6BetAri, 5Gam2Ari | Head of Aries |
| 10 | 1.0 | 5GamEqu, 7DelEqu | Judge of right and wrong (Chinese) |
| 11 | 1.0 | 95Her, 102Her | Textile ruler (Chinese) |
| 12 | 1.0 | 60BetOph, 62GamOph | Official for the royal clan (Chinese) |
| 13 | 1.0 | 67PiHer, 75RhoHer | Woman’s bed (Chinese) |
| 14 | 1.0 | 25IotOph, 27KapOph | Dipper for solids (Chinese) |
| 15 | 1.0 | 13EpsAql, 17ZetAql | ar in Mejleb (Marshall Islands) |
| 16 | 1.0 | 40TauLib, 39UpsLib | Celestial spokes (Chinese) |
| 17 | 1.0 | 7BetUMi, 13GamUMi, 5UMi | Jemenuwe (Marshall Islands) |
| 18 | 1.0 | EpsBoo, 25RhoBoo, 28SigBoo | Celestial lance (Chinese) |
| 19 | 1.0 | MuCen, NuCen, PhiCen | Ujela (Marshall Islands) |
| 20 | 1.0 | 42ZetPeg, 46XiPeg | Thunder and lightning (Chinese) |
| 21 | 1.0 | 33LamUMa, 34MuUMa | Kam Anij (Marshall Islands) |
| 22 | 1.0 | 39LamOri, 37Phi1Ori, 40Phi2Ori | Al-Hekaah (Arabic) |
| 23 | 1.0 | 13TheCyg, 10Iot2Cyg, 1KapCyg | Xi Zhong (Chinese) |
| 24 | 1.0 | 20EtaLyr, 21TheLyr | Nin-SAR and Erragal (Babylonian) |
| 25 | 1.0 | 90PhiAqr, 91Psi1Aqr, 93Psi2Aqr | Mhua (Tukano) |
| 26 | 1.0 | 57DelDra, 63EpsDra | Celestial kitchen (Chinese) |
| 27 | 1.0 | 1Pi3Ori, 3Pi4Ori | Lulal and Latarak (Babylonian) |
| 28 | 0.88 | 87AlpTau, 54GamTau, 61Del1Tau, 68Del3Tau, 74EpsTau, 78The2Tau, 71Tau | Hyades |
| 29 | 0.86 | 50AlpUMa, 48BetUMa, 64GamUMa, 69DelUMa, 77EpsUMa, 79ZetUMa, 85EtaUMa, 80UMa | Big Dipper |
| 30 | 0.83 | 18AlpCas, 11BetCas, 27GamCas, 37DelCas, 45EpsCas, 17ZetCas, 24EtaCas | Cassiopeia |
| 31 | 0.8 | 23BetDra, 33GamDra, 25Nu2Dra, 32XiDra | Head of Draco |
| 32 | 0.8 | 5AlpCrB, 3BetCrB, 8GamCrB, 13EpsCrB, 4TheCrB, 49DelBoo | Corona Borealis |
| 33 | 0.8 | 53BetPeg, 44EtaPeg, 47LamPeg, 48MuPeg | Resting palace (Chinese) |
| 34 | 0.75 | Alp1Cru, BetCru, GamCru, DelCru, EpsCru | Southern Cross |
| 35 | 0.75 | 32AlpLeo, 41Gam1Leo, 36ZetLeo, 30EtaLeo, 31Leo | Sickle (part) |
| 36 | 0.75 | 37Xi2Sgr, 39OmiSgr, 41PiSgr | Establishment (Chinese) |
| 37 | 0.67 | TheSco, Iot1Sco, KapSco, 35LamSco, 34UpsSco, 6630 | Tail of Scorpius |
| 38 | 0.67 | 4BetTri, 9GamTri | Triangulum (part) |
| 39 | 0.67 | 3AlpLyr, 12Del2Lyr, 4Eps1Lyr, 6Zet1Lyr | Lyra (part) |
| 40 | 0.67 | 66AlpGem, 78BetGem, 62RhoGem, 75SigGem | Castor & Pollux |
| 41 | 0.67 | 9IotUMa, 12KapUMa | |
| 42 | 0.67 | 16AlpBoo, 8EtaBoo, 4TauBoo, 5UpsBoo | |
| 43 | 0.67 | 42TheOph, 44Oph | |
| 44 | 0.67 | 2XiTau, 1OmiTau | |
| 45 | 0.67 | Bet1Sgr, Bet2Sgr | |
| 46 | 0.67 | 22ZetDra, 14EtaDra | |
| 47 | 0.67 | 67Oph, 70Oph | |
| 48 | 0.67 | 10BetLyr, 14GamLyr | |
| 49 | 0.67 | 16PsiCap, 18OmeCap | |
| 50 | 0.6 | TheCar, 4050, 4140 | |
| 51 | 0.6 | 5AlpCep, 3EtaCep, 2TheCep | |
| 52 | 0.6 | BetAra, GamAra, ZetAra | |
| 53 | 0.5 | 68DelLeo, 70TheLeo, 60Leo | |
| 54 | 0.5 | 10AlpCMi, 3BetCMi, 4GamCMi | |
| 55 | 0.5 | 4GamCrv, 7DelCrv, 8EtaCrv | Corvus (part) |
| 56 | 0.5 | AlpMus, BetMus | Musca (part) |
| 57 | 0.5 | 16LamAql, 12Aql | |
| 58 | 0.5 | 7AlpLac, 3BetLac | |
| 59 | 0.5 | 10ThePsc, 17IotPsc | |
| 60 | 0.5 | 24GamGem, 31XiGem, 30Gem | |
| 61 | 0.5 | 9AlpCMa, 2BetCMa | |
| 62 | 0.5 | 13GamLep, 15DelLep | |
| 63 | 0.5 | 11AlpLep, 9BetLep | |
| 64 | 0.5 | 13AlpAur, 34BetAur, 7EpsAur, 8ZetAur, 10EtaAur, 35PiAur | |
| 65 | 0.5 | 22TauHer, 11PhiHer | |
| 66 | 0.5 | 5Alp1Cap, 6Alp2Cap, 9BetCap | |
| 67 | 0.5 | AlpGru, BetGru, EpsGru, ZetGru | Grus (part) |
| 68 | 0.5 | MuCep, 10NuCep | |
| 69 | 0.5 | AlpPhe, EpsPhe, KapPhe | |
| 70 | 0.44 | 50AlpCyg, 37GamCyg, 18DelCyg, 53EpsCyg, 58NuCyg, 62XiCyg, 31Cyg, 32Cyg | Northern Cross |
| 71 | 0.43 | 24AlpPsA, 22GamPsA, 23DelPsA | |
| 72 | 0.43 | 21AlpSco, 8Bet1Sco, 7DelSco, 14NuSco, 6PiSco, 20SigSco, 23TauSco, 10Ome2Sco, 9Ome1Sco | Head of Scorpius |
| 73 | 0.43 | 25DelCMa, 21EpsCMa, 31EtaCMa, 24Omi2CMa, 22SigCMa, 28OmeCMa | Rear of Canis Major |
| 74 | 0.43 | 11EpsHya, 16ZetHya, 13RhoHya | |
| 75 | 0.43 | 17IotAnd, 19KapAnd, 16LamAnd | |
| 76 | 0.4 | 78IotLeo, 77SigLeo | |
| 77 | 0.4 | 86Aqr, 88Aqr, 98Aqr, 99Aqr | |
| 78 | 0.4 | 44ZetPer, 38OmiPer | |
| 79 | 0.4 | 31EtaCet, 45TheCet | |
| 80 | 0.4 | 43BetAnd, 37MuAnd | |
| 81 | 0.4 | 26EpsSco, Mu1Sco | |
| 82 | 0.36 | 67BetEri, 69LamEri, 58AlpOri, 19BetOri, 24GamOri, 34DelOri, 46EpsOri, 50ZetOri, 28EtaOri, 43The2Ori, 44IotOri, 53KapOri, 48SigOri, 20TauOri, 29Ori, 32Ori, 42Ori, 1887 | Orion |
| 83 | 0.33 | BetPhe, GamPhe | |
| 84 | 0.33 | 51And, PhiPer | |
| 85 | 0.33 | 40GamCap, 49DelCap | |
| 86 | 0.33 | 26BetPer, 25RhoPer | |
| 87 | 0.33 | 40AlpLyn, 38Lyn | |
| 88 | 0.3 | 24AlpSer, 13DelSer, 37EpsSer | |
| 89 | 0.29 | AlpCol, BetCol | |
| 90 | 0.29 | 51MuPer, 48Per | |
| 91 | 0.29 | 31DelAnd, 30EpsAnd | |
| 92 | 0.29 | 33AlpPer, 23GamPer, 39DelPer, 15EtaPer, IotPer, 35SigPer, 18TauPer, 37PsiPer | Perseus (part) |
| 93 | 0.25 | 28BetSer, 41GamSer, 35KapSer | |
| 94 | 0.25 | 7EtaGem, 13MuGem | |
| 95 | 0.25 | TheGru, IotGru | |
| 96 | 0.25 | 27BetHer, 20GamHer | |
| 97 | 0.25 | 27DelCep, 23EpsCep, 21ZetCep | |
| 98 | 0.25 | EpsCar, IotCar, DelVel, KapVel, 3447, 3659, 3803 | False Cross |
| 99 | 0.22 | 41Ups4Eri, 43Eri | |
| 100 | 0.21 | 17EpsLeo, 4LamLeo, 24MuLeo | |
| 101 | 0.2 | 1DelOph, 2EpsOph | |
| 102 | 0.2 | 76DelAqr, 71Tau2Aqr | |
| 103 | 0.2 | GamLup, DelLup | |
| 104 | 0.2 | EtaCen, KapCen, AlpLup, BetLup | |
| 105 | 0.2 | PhiEri, ChiEri | |
| 106 | 0.18 | 14ZetLep, 16EtaLep | |
| 107 | 0.18 | 23DelEri, 18EpsEri | |
| 108 | 0.15 | 1Lac, 8485 | |
| 109 | 0.15 | 39Cyg, 41Cyg | |
| 110 | 0.13 | 86MuHer, 94NuHer, 92XiHer, 103OmiHer | |
| 111 | 0.12 | GamCen, DelCen, TauCen | |
| 112 | 0.11 | 67SigCyg, 65TauCyg | |
| 113 | 0.06 | 46LMi, 54NuUMa, 53XiUMa | |
| 114 | 0.0 | 37TheAur, 32NuAur | |
| 115 | 0.0 | AlpCar, TauPup | |
| 116 | 0.0 | 21EtaCyg, ChiCyg | |
| 117 | 0.0 | PiPup, 2787 | |
| 118 | 0.0 | ZetPup, 3080 | |
| 119 | 0.0 | AlpEri, AlpHyi | |
| 120 | 0.0 | 25TheUMa, 26UMa | |
| 121 | 0.0 | Del1Gru, Del2Gru | |
| 122 | 0.0 | 43PhiDra, 44ChiDra | |
| 123 | 0.0 | 3445, 3487 | |
| 124 | 0.0 | 65Kap1Tau, 69UpsTau | |

Table S3: Asterisms picked out by the GC model with n = 320. The scores roughly indicate how similar each asterism is to the closest asterism in the human data (1.0 indicates a perfect match). In some cases the descriptions are approximate only; for example, the asterism labeled “Corona Borealis” includes an extra star (49DelBoo).

References

[1] D. W. Hamacher.
On the astronomical knowledge and traditions of Aboriginal Australians. PhD thesis, Macquarie University, 2012.
[2] C. T. Zahn. Graph-theoretical methods for detecting and describing Gestalt clusters. IEEE Transactions on Computers, 20(1):68–86, 1971.
[3] N. Ahuja. Dot pattern processing using Voronoi neighborhoods. IEEE Transactions on Pattern Analysis and Machine Intelligence, (3):336–343, 1982.
[4] M. C. J. van den Berg. Grouping by proximity and grouping by good continuation in the perceptual organization of random dot patterns. PhD thesis, University of Virginia, 1998.
[5] M. P. van Oeffelen and P. G. Vos. Enumeration of dots: An eye movement analysis. Memory & Cognition, 12(6):607–612, 1984.
[6] B. J. Compton and G. D. Logan. Evaluating a computational model of perceptual grouping by proximity. Perception & Psychophysics, 53(4):403–421, 1993.
[7] B. J. Compton and G. D. Logan. Judgments of perceptual groups: Reliability and sensitivity to stimulus transformation. Perception & Psychophysics, 61(7):1320–1335, 1999.
[8] J. Schreiner. Redefining constellations and asterisms. Available at http://www.jschreiner.com/english/stars/home.html.
[9] S. Xu, K. Chen, and Y. Zhou. Re-clustering of constellations through machine learning. Technical report, Stanford University, 2014.
[10] T. Avilin. Astronyms in Belarussian folk beliefs. Archaeologia Baltica, 10:1, 2009.
[11] W. E. Stanbridge. On the astronomy and mythology of the Aborigines of Victoria. Proceedings of the Philosophical Institute of Victoria, 2:137–140, 1857.
[12] G. R. Kaye. Hindu astronomy: Ancient science of the Hindus. New Delhi, 1981.
[13] G. Ammarell. Astronomy in the Indo-Malay archipelago. In H. Selin, editor, Encyclopaedia of the History of Science, Technology, and Medicine in Non-Western Cultures, pages 324–333. Springer, 2008.
[14] P. A. Erdland. Die Marshall-Insulaner: Leben und Sitte, Sinn und Religion eines Südsee-Volkes. Aschendorffsche, 1914.
[15] G. Urton. Constructions of the ritual-agricultural calendar in Pacariqtambo, Peru. In V. Del Chamberlain, J. B. Carlson, and J. M. Young, editors, Songs from the Sky: Indigenous Astronomical and Cosmological Traditions of the World. Ocarina Books, 2005.