Applied Filters
- Author: P. Missier
People
Colleagues
- Carole Anne Goble (22)
- Suzanne M Embury (12)
- Khalid Belhajjame (10)
- Sean K Bechhofer (8)
- Alun David Preece (7)
- Binling Jin (6)
- Bertram Ludäscher (5)
- Mark A Greenwood (5)
- Víctor Cuevas-Vicenttín (5)
- Jacek Sroka (4)
- Jan Hidders (4)
- Katy Wolstencroft (4)
- Danius T Michaelides (3)
- David de Roure (3)
- David Charles de Roure (3)
- Paul Watson (3)
- Shoaib Sufi (3)
- Valter Crescenzi (3)
- Vasa Curcin (3)
Publication
Journal/Magazine Names
- Future Generation Computer Systems (7)
- Concurrency and Computation: Practice & Experience (4)
- ACM SIGMOD Record (3)
- Journal of Data and Information Quality (3)
- IEEE Internet Computing (2)
- Proceedings of the VLDB Endowment (2)
- ACM Transactions on Database Systems (1)
- ACM Transactions on Knowledge Discovery from Data (1)
- Data & Knowledge Engineering (1)
- Decision Support Systems (1)
- Distributed and Parallel Databases (1)
- Fundamenta Informaticae (1)
- Journal of Computer and System Sciences (1)
- Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (1)
- Web Semantics: Science, Services and Agents on the World Wide Web (1)
Proceedings/Book Names
- IPAW 2014: Revised Selected Papers of the 5th International Provenance and Annotation Workshop on Provenance and Annotation of Data and Processes - Volume 8628 (4)
- IPAW'12: Proceedings of the 4th international conference on Provenance and Annotation of Data and Processes (3)
- ACIIDS'13: Proceedings of the 5th Asian conference on Intelligent Information and Database Systems - Volume Part I (2)
- ESCIENCE '10: Proceedings of the 2010 IEEE Sixth International Conference on e-Science (2)
- IPAW 2016: Proceedings of the 6th International Workshop on Provenance and Annotation of Data and Processes - Volume 9672 (2)
- Provenance and Annotation of Data and Processes (2)
- SWPM'09: Proceedings of the First International Conference on Semantic Web in Provenance Management - Volume 526 (2)
- CloudDB '11: Proceedings of the third international workshop on Cloud data management (1)
- EDBT '09: Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology (1)
- EDBT '10: Proceedings of the 13th International Conference on Extending Database Technology (1)
- EDBT '13: Proceedings of the 16th International Conference on Extending Database Technology (1)
- EDBT '13: Proceedings of the Joint EDBT/ICDT 2013 Workshops (1)
- HIP '11: Proceedings of the 2011 Workshop on Historical Document Imaging and Processing (1)
- IQIS '05: Proceedings of the 2nd international workshop on Information quality in information systems (1)
- Modeling Decisions for Artificial Intelligence (1)
- SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data (1)
- SSDBM '14: Proceedings of the 26th International Conference on Scientific and Statistical Database Management (1)
- Wands '10: Proceedings of the 1st International Workshop on Workflow Approaches to New Data-centric Science (1)
- Web Engineering (1)
- WORKS '11: Proceedings of the 6th workshop on Workflows in support of large-scale science (1)
Publisher
- Association for Computing Machinery (24)
- Springer-Verlag (24)
- Elsevier Science Publishers B. V. (10)
- IEEE Computer Society (10)
- John Wiley and Sons Ltd. (4)
- VLDB Endowment (4)
- USENIX Association (3)
- CEUR-WS.org (2)
- IEEE Educational Activities Department (2)
- Kluwer Academic Publishers (2)
- Academic Press, Inc. (1)
- IEEE Press (1)
- IOS Press (1)
Publications
- research-article • Open Access • Published By ACM
Experience: A Comparative Analysis of Multivariate Time-Series Generative Models: A Case Study on Human Activity Data
- Naif Alzahrani
School of Computing, Newcastle University, Newcastle upon Tyne, United Kingdom of Great Britain and Northern Ireland
- Jacek Cała
National Innovation Centre for Data at Newcastle University, Newcastle upon Tyne, United Kingdom of Great Britain and Northern Ireland
- Paolo Missier
School of Computing, Newcastle University, Newcastle upon Tyne, United Kingdom of Great Britain and Northern Ireland
Journal of Data and Information Quality, Volume 16, Issue 3 • September 2024, Article No.: 18, pp 1-18 • https://doi.org/10.1145/3688393
Human activity recognition (HAR) is an active research field that has seen great success in recent years due to advances in sensory data collection methods and activity recognition systems. Deep artificial intelligence (AI) models have contributed to the ...
Metrics: Total Citations 0 • Total Downloads 437 • Last 12 Months 437 • Last 6 weeks 210
- research-article
Validity constraints for data analysis workflows
- Florian Schintke
Zuse Institute Berlin, Takustr. 7, 14195, Berlin, Germany
- Khalid Belhajjame
PSL, Université Paris Dauphine, LAMSADE, Paris, France
- Ninon De Mecquenem
Humboldt-Universität zu Berlin, Berlin, Germany
- David Frantz
Trier University, Trier, Germany
- Vanessa Emanuela Guarino
Humboldt-Universität zu Berlin, Berlin, Germany
Max-Delbrück Center for Molecular Medicine, Berlin, Germany
- Marcus Hilbrich
Humboldt-Universität zu Berlin, Berlin, Germany
- Fabian Lehmann
Humboldt-Universität zu Berlin, Berlin, Germany
- Paolo Missier
Newcastle University, Newcastle upon Tyne, United Kingdom
- Rebecca Sattler
Humboldt-Universität zu Berlin, Berlin, Germany
- Jan Arne Sparka
Humboldt-Universität zu Berlin, Berlin, Germany
- Daniel T. Speckhard
Humboldt-Universität zu Berlin, Berlin, Germany
Fritz-Haber-Institut der Max–Planck-Gesellschaft, Berlin, Germany
- Hermann Stolte
Humboldt-Universität zu Berlin, Berlin, Germany
- Anh Duc Vu
Humboldt-Universität zu Berlin, Berlin, Germany
- Ulf Leser
Humboldt-Universität zu Berlin, Berlin, Germany
Future Generation Computer Systems, Volume 157, Issue C • Aug 2024, pp 82-97 • https://doi.org/10.1016/j.future.2024.03.037
Abstract: Porting a scientific data analysis workflow (DAW) to a cluster infrastructure, a new software stack, or even only a new dataset with some notably different properties is often challenging. Despite the structured definition of the steps (tasks) ...
Highlights:
- Portability and adaptability of scientific workflows suffer from hidden assumptions.
- Validity constraints in workflow languages make hidden assumption explicit.
- Validity constraints ensure integrity and enable conformance checking ...
Metrics: Total Citations 1
- research-article • Published By ACM
Supporting Better Insights of Data Science Pipelines with Fine-grained Provenance
- Adriane Chapman
University of Southampton, Southampton, UK
- Luca Lauro
Università Roma Tre, Roma, Italy
- Paolo Missier
Newcastle University, Newcastle upon Tyne, UK
- Riccardo Torlone
Università Roma Tre, Roma, Italy
ACM Transactions on Database Systems, Volume 49, Issue 2 • June 2024, Article No.: 6, pp 1-42 • https://doi.org/10.1145/3644385
Successful data-driven science requires complex data engineering pipelines to clean, transform, and alter data in preparation for machine learning, and robust results can only be achieved when each step in the pipeline can be justified, and its effect on ...
Metrics: Total Citations 2 • Total Downloads 414 • Last 12 Months 414 • Last 6 weeks 44
- research-article • Open Access • Published By ACM
Fair and Private Data Preprocessing through Microaggregation
- Vladimiro González-Zelaya
Universidad Panamericana, Mexico
- Julián Salas
Universitat Oberta de Catalunya (UOC), Spain
- David Megías
Universitat Oberta de Catalunya (UOC), Spain
- Paolo Missier
Newcastle University, UK
ACM Transactions on Knowledge Discovery from Data, Volume 18, Issue 3 • April 2024, Article No.: 49, pp 1-24 • https://doi.org/10.1145/3617377
Privacy protection for personal data and fairness in automated decisions are fundamental requirements for responsible Machine Learning. Both may be enforced through data preprocessing and share a common target: data should remain useful for a task, while ...
Metrics: Total Citations 0 • Total Downloads 1,339 • Last 12 Months 1,177 • Last 6 weeks 116
Supplementary Material (1): tkdd-2022-07-0274-file002.zip
- Article
Preprocessing Matters: Automated Pipeline Selection for Fair Classification
- Vladimiro González-Zelaya
Universidad Panamericana, Facultad de Ciencias Económicas y Empresariales, Mexico City, Mexico
- Julián Salas
Internet Interdisciplinary Institute, Universitat Oberta de Catalunya, Barcelona, Spain
- Dennis Prangle
University of Bristol, Institute for Statistical Science, Bristol, UK
- Paolo Missier
Newcastle University, School of Computing, Newcastle upon Tyne, UK
Modeling Decisions for Artificial Intelligence • June 2023, pp 202-213 • https://doi.org/10.1007/978-3-031-33498-6_14
Abstract: Improving fairness by manipulating the preprocessing stages of classification pipelines is an active area of research, closely related to AutoML. We propose a genetic optimisation algorithm, FairPipes, which optimises for user-defined combinations ...
Metrics: Total Citations 0
- research-article • Open Access • Published By ACM
ConvBoost: Boosting ConvNets for Sensor-based Activity Recognition
- Shuai Shao
Department of Computer Science, University of Warwick, Coventry, UK
- Yu Guan
Department of Computer Science, University of Warwick, Coventry, UK
- Bing Zhai
Computer and Information Sciences, Northumbria University, Newcastle upon Tyne, UK
- Paolo Missier
School of Computing, Newcastle University, Newcastle upon Tyne, UK
- Thomas Plötz
School of Interactive Computing, Georgia Institute of Technology, Atlanta, USA
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 7, Issue 2 • June 2023, Article No.: 75, pp 1-21 • https://doi.org/10.1145/3596234
Human activity recognition (HAR) is one of the core research themes in ubiquitous and wearable computing. With the shift to deep learning (DL) based analysis approaches, it has become possible to extract high-level features and perform classification in ...
Metrics: Total Citations 5 • Total Downloads 941 • Last 12 Months 453 • Last 6 weeks 48
- research-article
DPDS: assisting data science with data provenance
- Adriane Chapman
University of Southampton, UK
- Luca Lauro
Università Roma Tre, Italy
- Paolo Missier
Newcastle University, UK
- Riccardo Torlone
Università Roma Tre, Italy
Proceedings of the VLDB Endowment, Volume 15, Issue 12 • August 2022, pp 3614-3617 • https://doi.org/10.14778/3554821.3554857
Successful data-driven science requires a complex combination of data engineering pipelines and data modelling techniques. Robust and defensible results can only be achieved when each step in the pipeline that is designed to clean, transform and alter ...
Metrics: Total Citations 2 • Total Downloads 118 • Last 12 Months 26 • Last 6 weeks 2
- research-article • Published By ACM
Knowledge-Driven Data Ecosystems Toward Data Transparency
- Sandra Geisler
Fraunhofer FIT and RWTH Aachen University, Aachen, Germany
- Maria-Esther Vidal
TIB-Leibniz Information Centre for Science and Technology, Hannover, Germany
- Cinzia Cappiello
Politecnico di Milano, Milano, Italy
- Bernadette Farias Lóscio
Federal University of Pernambuco, Brazil
- Avigdor Gal
Technion Israel Institute of Technology, Haifa, Israel
- Matthias Jarke
RWTH Aachen University and Fraunhofer FIT, Aachen, Germany
- Maurizio Lenzerini
Sapienza Università di Roma, Roma, Italy
- Paolo Missier
Newcastle University, United Kingdom
- Boris Otto
TU Dortmund University and Fraunhofer ISST, Dortmund, Germany
- Elda Paja
IT University of Copenhagen, Copenhagen S, Denmark
- Barbara Pernici
Politecnico di Milano, Milano, Italy
- Jakob Rehof
TU Dortmund University and Fraunhofer ISST, Dortmund, Germany
Journal of Data and Information Quality, Volume 14, Issue 1 • March 2022, Article No.: 3, pp 1-12 • https://doi.org/10.1145/3467022
A data ecosystem (DE) offers a keystone-player or alliance-driven infrastructure that enables the interaction of different stakeholders and the resolution of interoperability issues among shared data. However, despite years of research in data governance ...
Metrics: Total Citations 24 • Total Downloads 1,566 • Last 12 Months 531 • Last 6 weeks 59
- research-article
A customisable pipeline for the semi-automated discovery of online activists and social campaigns on Twitter
- Flavio Primo
School of Computing, Newcastle University, Science Central, Newcastle upon Tyne, UK
- Alexander Romanovsky
School of Computing, Newcastle University, Science Central, Newcastle upon Tyne, UK
- Rafael de Mello
PUC-Rio/CEFET-RJ, Rio de Janeiro, Brasil
- Alessandro Garcia
PUC-Rio, Rio de Janeiro, Brasil
- Paolo Missier
School of Computing, Newcastle University, Science Central, Newcastle upon Tyne, UK
World Wide Web, Volume 24, Issue 4 • Jul 2021, pp 1235-1271 • https://doi.org/10.1007/s11280-021-00887-2
Abstract: Substantial research is available on detecting influencers on social media platforms. In contrast, comparatively few studies exists on the role of online activists, defined informally as users who actively participate in socially-minded online ...
Metrics: Total Citations 1
- research-article
Capturing and querying fine-grained provenance of preprocessing pipelines in data science
- Adriane Chapman
University of Southampton
- Paolo Missier
Newcastle University
- Giulia Simonelli
Università Roma Tre
- Riccardo Torlone
Università Roma Tre
Proceedings of the VLDB Endowment, Volume 14, Issue 4 • December 2020, pp 507-520 • https://doi.org/10.14778/3436905.3436911
Data processing pipelines that are designed to clean, transform and alter data in preparation for learning predictive models, have an impact on those models' accuracy and performance, as well on other properties, such as model fairness. It is therefore ...
Metrics: Total Citations 8 • Total Downloads 526 • Last 12 Months 64 • Last 6 weeks 11
- Article
A Customisable Pipeline for Continuously Harvesting Socially-Minded Twitter Users
- Flavio Primo
School of Computing, Newcastle University, Science Central, Newcastle upon Tyne, UK
- Paolo Missier
School of Computing, Newcastle University, Science Central, Newcastle upon Tyne, UK
- Alexander Romanovsky
School of Computing, Newcastle University, Science Central, Newcastle upon Tyne, UK
- Mickael Figueredo
Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
- Nelio Cacho
Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
Abstract: On social media platforms and Twitter in particular, specific classes of users such as influencers have been given satisfactory operational definitions in terms of network and content metrics. Others, for instance online activists, are not less ...
Metrics: Total Citations 0
- research-article • Published By ACM
Report on the First International Workshop on Incremental Re-computation: Provenance and Beyond
- Paolo Missier
Newcastle University, United Kingdom
- Tanu Malik
DePaul University, Chicago, IL, USA
- Jacek Cala
Newcastle University, United Kingdom
ACM SIGMOD Record, Volume 47, Issue 4 • December 2018, pp 35-38 • https://doi.org/10.1145/3335409.3335418
In the last decade, advances in computing have deeply transformed data processing. Increasingly systems aim to process massive amounts of data efficiently, often with fast response times that are typically characterised by the 4V's, i.e., Volume, ...
Metrics: Total Citations 0 • Total Downloads 71 • Last 12 Months 2 • Last 6 weeks 1
- research-article • Published By ACM
Mind my value: a decentralized infrastructure for fair and trusted IoT data trading
- Paolo Missier
Newcastle University, Newcastle, UK
- Shaimaa Bajoudah
Newcastle University, Newcastle, UK
- Angelo Capossele
Digital Catapult, London, UK
- Andrea Gaglione
Digital Catapult, London, UK
- Michele Nati
Digital Catapult, London, UK
IoT '17: Proceedings of the Seventh International Conference on the Internet of Things • October 2017, Article No.: 15, pp 1-8 • https://doi.org/10.1145/3131542.3131564
Internet of Things (IoT) data are increasingly viewed as a new form of massively distributed and large scale digital assets, which are continuously generated by millions of connected devices. The real value of such assets can only be realized by ...
Metrics: Total Citations 29 • Total Downloads 621 • Last 12 Months 16 • Last 6 weeks 2
- article
TAPER: query-aware, partition-enhancement for large, heterogenous graphs
- Hugo Firth
School of Computing Science, Newcastle University, Newcastle upon Tyne, UK
- Paolo Missier
School of Computing Science, Newcastle University, Newcastle upon Tyne, UK
Distributed and Parallel Databases, Volume 35, Issue 2 • June 2017, pp 85-115 • https://doi.org/10.1007/s10619-017-7196-y
Graph partitioning has long been seen as a viable approach to addressing Graph DBMS scalability. A partitioning, however, may introduce extra query processing latency unless it is sensitive to a specific query workload, and optimised to minimise inter-...
Metrics: Total Citations 2
- research-article
Scalable and efficient whole-exome data processing using workflows on the cloud
- J. Cała
School of Computing Science, Newcastle University, Newcastle upon Tyne, UK
- E. Marei
School of Computing Science, Newcastle University, Newcastle upon Tyne, UK
- Y. Xu
Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, UK
- K. Takeda
Microsoft Research, Cambridge, UK
- P. Missier
School of Computing Science, Newcastle University, Newcastle upon Tyne, UK
Future Generation Computer Systems, Volume 65, Issue C • December 2016, pp 153-168 • https://doi.org/10.1016/j.future.2016.01.001
Dataflow-style workflows offer a simple, high-level programming model for flexible prototyping of scientific applications as an attractive alternative to low-level scripting. At the same time, workflow management systems (WFMS) may support data ...
Metrics: Total Citations 3
- research-article • Published By ACM
Clustering provenance facilitating provenance exploration through data abstraction
- Linus Karsai
University of Sydney, Australia
- Alan Fekete
University of Sydney, Australia
- Judy Kay
University of Sydney, Australia
- Paolo Missier
Newcastle University, United Kingdom
HILDA '16: Proceedings of the Workshop on Human-In-the-Loop Data Analytics • June 2016, Article No.: 6, pp 1-5 • https://doi.org/10.1145/2939502.2939508
As digital objects become increasingly important in people's lives, people may need to understand the provenance, or lineage and history, of an important digital object, to understand how it was produced. This is particularly important for objects ...
Metrics: Total Citations 8 • Total Downloads 255 • Last 12 Months 9 • Last 6 weeks 1
- Article
The data, they are a-changin
- Paolo Missier
School of Computing Science, Newcastle University
- Jacek Cala
School of Computing Science, Newcastle University
- Eldarina Wijaya
School of Computing Science, Newcastle University
TaPP'16: Proceedings of the 8th USENIX Conference on Theory and Practice of Provenance • June 2016, pp 54-58
The cost of deriving actionable knowledge from large datasets has been decreasing thanks to a convergence of positive factors: low cost data generation, inexpensively scalable storage and processing infrastructure (cloud), software frameworks and tools ...
Metrics: Total Citations 0
- Article
DataONE: A Data Federation with Provenance Support
- Yang Cao
University of Illinois, Urbana-Champaign, Illinois, USA
- Christopher Jones
National Center for Ecological Analysis and Synthesis, UCSB, Santa Barbara, USA
- Víctor Cuevas-Vicenttín
Universidad Popular Autónoma del Estado de Puebla, Puebla, Mexico
- Matthew B. Jones
National Center for Ecological Analysis and Synthesis, UCSB, Santa Barbara, USA
- Bertram Ludäscher
University of Illinois, Urbana-Champaign, Illinois, USA
- Timothy Mcphillips
University of Illinois, Urbana-Champaign, Illinois, USA
- Paolo Missier
School of Computing Science, Newcastle University, Newcastle upon Tyne, UK
- Christopher Schwalm
Woods Hole Research Center, Falmouth, USA
- Peter Slaughter
National Center for Ecological Analysis and Synthesis, UCSB, Santa Barbara, USA
- Dave Vieglais
University of Kansas, Lawrence, USA
- Lauren Walker
National Center for Ecological Analysis and Synthesis, UCSB, Santa Barbara, USA
- Yaxing Wei
Environmental Sciences Division, ORNL, Oak Ridge, USA
IPAW 2016: Proceedings of the 6th International Workshop on Provenance and Annotation of Data and Processes - Volume 9672 • June 2016, pp 230-234
DataONE is a federated data network focusing on earth and environmental science data. We present the provenance and search features of DataONE by means of an example involving three earth scientists who interact through a DataONE Member Node. DataONE ...
Metrics: Total Citations 0
- Article
Analyzing Provenance Across Heterogeneous Provenance Graphs
- Wellington Oliveira
Instituto de Computação, Universidade Federal Fluminense UFF, Niterói, Brazil and DACC, Instituto Federal do Sudeste de Minas Gerais, Rio Pomba, Brazil
- Paolo Missier
School of Computing Science, Newcastle University, Newcastle upon Tyne, UK
- Kary Ocaña
Laboratório Nacional de Computação Científica LNCC, Petrópolis, Brazil
- Daniel Oliveira
Instituto de Computação, Universidade Federal Fluminense UFF, Niterói, Brazil
- Vanessa Braganholo
Instituto de Computação, Universidade Federal Fluminense UFF, Niterói, Brazil
IPAW 2016: Proceedings of the 6th International Workshop on Provenance and Annotation of Data and Processes - Volume 9672 • June 2016, pp 57-70
Provenance generated by different workflow systems is generally expressed using different formats. This is not an issue when scientists analyze provenance graphs in isolation, or when they use the same workflow system. However, when analyzing ...
Metrics: Total Citations 0
- Article
Tracking Dengue Epidemics Using Twitter Content Classification and Topic Modelling
- Paolo Missier
School of Computing Science, Newcastle University, Newcastle upon Tyne, UK
- Alexander Romanovsky
School of Computing Science, Newcastle University, Newcastle upon Tyne, UK
- Tudor Miu
School of Computing Science, Newcastle University, Newcastle upon Tyne, UK
- Atinder Pal
School of Computing Science, Newcastle University, Newcastle upon Tyne, UK
- Michael Daniilakis
School of Computing Science, Newcastle University, Newcastle upon Tyne, UK
- Alessandro Garcia
PUC-Rio, Rio de Janeiro, Brazil
- Diego Cedrim
PUC-Rio, Rio de Janeiro, Brazil
- Leonardo Silva Sousa
PUC-Rio, Rio de Janeiro, Brazil
ICWE 2016 International Workshops on Current Trends in Web Engineering - Volume 9881 • June 2016, pp 80-92 • https://doi.org/10.1007/978-3-319-46963-8_7
Detecting and preventing outbreaks of mosquito-borne diseases such as Dengue and Zika in Brasil and other tropical regions has long been a priority for governments in affected areas. Streaming social media content, such as Twitter, is increasingly being ...
Metrics: Total Citations 2
Author Profile Pages
- Description: The Author Profile Page initially collects all the professional information known about authors from the publications record as known by the ACM bibliographic database, the Guide. Coverage of ACM publications is comprehensive from the 1950's. Coverage of other publishers generally starts in the mid 1980's. The Author Profile Page supplies a quick snapshot of an author's contribution to the field and some rudimentary measures of influence upon it. Over time, the contents of the Author Profile page may expand at the direction of the community.
Please see the following 2007 Turing Award winners' profiles as examples:
- History: Disambiguation of author names is of course required for precise identification of all the works, and only those works, by a unique individual. Of equal importance to ACM, author name normalization is also one critical prerequisite to building accurate citation and download statistics. For the past several years, ACM has worked to normalize author names, expand reference capture, and gather detailed usage statistics, all intended to provide the community with a robust set of publication metrics. The Author Profile Pages reveal the first result of these efforts.
- Normalization: ACM uses normalization algorithms to weigh several types of evidence for merging and splitting names.
These include:
- co-authors: if two names cannot be disambiguated based on name alone, we check whether they have a co-author in common. If so, this weighs toward the two names being the same person.
- affiliations: names in common with the same affiliation weigh toward the two names being the same person.
- publication title: names in common whose works are published in the same journal weigh toward the two names being the same person.
- keywords: names in common whose works address the same subject matter, as determined from title and keywords, weigh toward being the same person.
The more conservative the merging algorithms, the more bits of evidence are required before a merge is made, resulting in greater precision but lower recall of works for a given Author Profile. Many bibliographic records have only author initials. Many names lack affiliations. With very common family names, typical in Asia, more liberal algorithms result in mistaken merges.
Automatic normalization of author names is not exact. Hence it is clear that manual intervention based on human knowledge is required to perfect algorithmic results. ACM is meeting this challenge, continuing to work to improve the automated merges by tweaking the weighting of the evidence in light of experience.
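The evidence weighing described under Normalization can be illustrated with a minimal sketch. Everything below is hypothetical: the `NameRecord` type, the weights, and the threshold are assumptions chosen for illustration, not ACM's actual algorithm or values. The sketch only shows the stated trade-off, where a higher threshold demands more evidence before a merge.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NameRecord:
    """One name variant plus the evidence attached to it (hypothetical type)."""
    name: str
    coauthors: frozenset = frozenset()
    affiliations: frozenset = frozenset()
    venues: frozenset = frozenset()
    keywords: frozenset = frozenset()

# Hypothetical integer weights (summing to 100) for each type of evidence.
WEIGHTS = {"coauthors": 40, "affiliations": 30, "venues": 15, "keywords": 15}

def merge_score(a: NameRecord, b: NameRecord) -> int:
    """Add the weight of every evidence type in which the two records overlap."""
    return sum(
        weight
        for attr, weight in WEIGHTS.items()
        if getattr(a, attr) & getattr(b, attr)
    )

def should_merge(a: NameRecord, b: NameRecord, threshold: int = 50) -> bool:
    """A higher threshold is more conservative: fewer mistaken merges, more missed ones."""
    return merge_score(a, b) >= threshold

a = NameRecord(
    "P. Missier",
    coauthors=frozenset({"J. Cała"}),
    affiliations=frozenset({"Newcastle University"}),
)
b = NameRecord(
    "Paolo Missier",
    coauthors=frozenset({"J. Cała", "Tanu Malik"}),
    affiliations=frozenset({"Newcastle University"}),
)
print(merge_score(a, b), should_merge(a, b), should_merge(a, b, threshold=80))
```

With these illustrative weights, a shared co-author plus a shared affiliation is enough to merge at the default threshold but not at the stricter one, which mirrors the precision-versus-recall trade-off described above.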
- Bibliometrics: In 1926, Alfred Lotka formulated his power law (known as Lotka's Law) describing the frequency of publication by authors in a given field. According to this bibliometric law of scientific productivity, only a very small percentage (~6%) of authors in a field will produce more than 10 articles while the majority (perhaps 60%) will have but a single article published. With ACM's first cut at author name normalization in place, the distribution of our authors with 1, 2, 3..n publications does not match Lotka's Law precisely, but neither is the distribution curve far off. For a definition of ACM's first set of publication statistics, see Bibliometrics
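As a rough check on the percentages quoted above, the sketch below assumes the classic inverse-square form of Lotka's Law, in which the fraction of authors with exactly n publications is C / n², with C = 6/π² so that the fractions sum to 1. The function names are illustrative only, and the exponent of 2 is an assumption; real fields often fit a somewhat different exponent.

```python
import math

# Normalizing constant for the inverse-square form: 1 / (1 + 1/4 + 1/9 + ...) = 6 / pi^2.
LOTKA_C = 6 / math.pi ** 2

def fraction_with_exactly(n: int) -> float:
    """Fraction of authors with exactly n publications under the inverse-square law."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return LOTKA_C / n ** 2

def fraction_with_more_than(n: int) -> float:
    """Fraction of authors with more than n publications."""
    return 1.0 - sum(fraction_with_exactly(k) for k in range(1, n + 1))

# Roughly 0.61 for a single article and roughly 0.06 for more than 10 articles.
print(round(fraction_with_exactly(1), 2), round(fraction_with_more_than(10), 2))
```

Under this assumption the two values line up with the "perhaps 60%" and "~6%" figures given in the text.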
- Future Direction:
The initial release of the Author Edit Screen is open to anyone in the community with an ACM account, but it is limited to personal information. An author's photograph, a Home Page URL, and an email may be added, deleted or edited. Changes are reviewed before they are made available on the live site.
ACM will expand this edit facility to accommodate more types of data and facilitate ease of community participation with appropriate safeguards. In particular, authors or members of the community will be able to indicate works in their profile that do not belong there and merge others that do belong but are currently missing.
A direct search interface for Author Profiles will be built.
An institutional view of works emerging from their faculty and researchers will be provided along with a relevant set of metrics.
It is possible, too, that the Author Profile page may evolve to allow interested authors to upload unpublished professional materials to an area available for search and free educational use, but distinct from the ACM Digital Library proper. It is hard to predict what shape such an area for user-generated content may take, but it carries interesting potential for input from the community.
Bibliometrics
The ACM DL is a comprehensive repository of publications from the entire field of computing.
It is ACM's intention to make the derivation of any publication statistics it generates clear to the user.
- Average citations per article = The total Citation Count divided by the total Publication Count.
- Citation Count = cumulative total number of times all authored works by this author were cited by other works within ACM's bibliographic database. Almost all reference lists in articles published by ACM have been captured. Reference lists from other publishers are less well-represented in the database. Unresolved references are not included in the Citation Count. The Citation Count is citations TO any type of work, but the references counted are only FROM journal and proceedings articles. Reference lists from books, dissertations, and technical reports have not generally been captured in the database. (Citation Counts for individual works are displayed with the individual record listed on the Author Page.)
- Publication Count = all works of any genre within the universe of ACM's bibliographic database of computing literature of which this person was an author. Works where the person has a role as editor, advisor, chair, etc. are listed on the page but are not part of the Publication Count.
- Publication Years = the span from the earliest year of publication on a work by this author to the most recent year of publication of a work by this author captured within the ACM bibliographic database of computing literature (The ACM Guide to Computing Literature, also known as "the Guide").
- Available for download = the total number of works by this author whose full texts may be downloaded from an ACM full-text article server. Downloads from external full-text sources linked to from within the ACM bibliographic space are not counted as 'available for download'.
- Average downloads per article = The total number of cumulative downloads divided by the number of articles (including multimedia objects) available for download from ACM's servers.
- Downloads (cumulative) = The cumulative number of times all works by this author have been downloaded from an ACM full-text article server since the downloads were first counted in May 2003. The counts displayed are updated monthly and are therefore 0-31 days behind the current date. Robotic activity is scrubbed from the download statistics.
- Downloads (12 months) = The cumulative number of times all works by this author have been downloaded from an ACM full-text article server over the last 12-month period for which statistics are available. The counts displayed are usually 1-2 weeks behind the current date. (12-month download counts for individual works are displayed with the individual record.)
- Downloads (6 weeks) = The cumulative number of times all works by this author have been downloaded from an ACM full-text article server over the last 6-week period for which statistics are available. The counts displayed are usually 1-2 weeks behind the current date. (6-week download counts for individual works are displayed with the individual record.)
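The two averages defined above can be sketched as follows. The `Work` type and the sample numbers are hypothetical and serve only to show the arithmetic: citations are averaged over every authored work, while downloads are averaged only over works available for download from ACM's servers.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Work:
    """One authored work (hypothetical record type)."""
    citations: int
    downloads: Optional[int] = None  # None: full text not available from ACM's servers

def author_metrics(works: list) -> dict:
    """Compute the averages from the definitions above for a list of authored works."""
    publication_count = len(works)
    citation_count = sum(w.citations for w in works)
    downloadable = [w for w in works if w.downloads is not None]
    total_downloads = sum(w.downloads for w in downloadable)
    return {
        "publication_count": publication_count,
        "citation_count": citation_count,
        "avg_citations_per_article": (
            citation_count / publication_count if publication_count else 0.0
        ),
        "available_for_download": len(downloadable),
        "downloads_cumulative": total_downloads,
        "avg_downloads_per_article": (
            total_downloads / len(downloadable) if downloadable else 0.0
        ),
    }

# Made-up sample: three works, one of them not hosted by ACM.
sample = [Work(citations=4, downloads=100), Work(citations=2, downloads=300), Work(citations=0)]
metrics = author_metrics(sample)
print(metrics["avg_citations_per_article"], metrics["avg_downloads_per_article"])
```

Note that the denominators differ: the third work counts toward the Publication Count but not toward the number of articles available for download.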
ACM Author-Izer Service
Summary Description
ACM Author-Izer is a unique service that enables ACM authors to generate and post links on both their homepage and institutional repository for visitors to download the definitive version of their articles from the ACM Digital Library at no charge.
Downloads from these sites are captured in official ACM statistics, improving the accuracy of usage and impact measurements. Consistently linking to the definitive version of ACM articles should reduce user confusion over article versioning.
ACM Author-Izer also extends ACM’s reputation as an innovative “Green Path” publisher, making ACM one of the first publishers of scholarly works to offer this model to its authors.
To access ACM Author-Izer, authors need to establish a free ACM web account. Should authors change institutions or sites, they can utilize the new ACM service to disable old links and re-authorize new links for free downloads from a different site.
How ACM Author-Izer Works
Authors may post ACM Author-Izer links in their own bibliographies maintained on their website and in their own institution’s repository. The links take visitors from your page directly to the definitive version of individual articles inside the ACM Digital Library, where they can download these articles for free.
The Service can be applied to all the articles you have ever published with ACM.
Depending on your previous activities within the ACM DL, you may need to take up to three steps to use ACM Author-Izer.
For authors who do not have a free ACM Web Account:
- Go to the ACM DL http://dl.acm.org/ and click SIGN UP. Once your account is established, proceed to the next step.
For authors who have an ACM web account, but have not edited their ACM Author Profile page:
- Sign in to your ACM web account and go to your Author Profile page. Click "Add personal information" and add a photograph, homepage address, etc. Click ADD AUTHOR INFORMATION to submit the change. Once you receive email notification that your changes were accepted, you may utilize ACM Author-Izer.
For authors who have an account and have already edited their Profile Page:
- Sign in to your ACM web account, go to your Author Profile page in the Digital Library, look for the ACM Author-Izer link below each ACM published article, and begin the authorization process. If you have published many ACM articles, you may find a batch authorization process useful. It is labeled: "Export as: ACM Author-Izer Service"
ACM Author-Izer also provides code snippets for authors to display download and citation statistics for each “authorized” article on their personal pages. Downloads from these pages are captured in official ACM statistics, improving the accuracy of usage and impact measurements. Consistently linking to the definitive version of ACM articles should reduce user confusion over article versioning.
Note: You still retain the right to post your author-prepared preprint versions on your home pages and in your institutional repositories with DOI pointers to the definitive version permanently maintained in the ACM Digital Library. However, downloads of your preprint versions will not be counted in ACM usage statistics. If you use these Author-Izer links instead, usage by visitors to your page will be recorded in the ACM Digital Library and displayed on your page.
FAQ
- Q. What is ACM Author-Izer?
- A. ACM Author-Izer is a unique, link-based, self-archiving service that enables ACM authors to generate and post links on either their home page or institutional repository for visitors to download the definitive version of their articles for free.
- Q. What articles are eligible for ACM Author-Izer?
- A. ACM Author-Izer can be applied to all the articles authors have ever published with ACM. It is also available to authors who will have articles published in ACM publications in the future.
- Q. Are there any restrictions on authors to use this service?
- A. No. An author does not need to subscribe to the ACM Digital Library nor even be a member of ACM.
- Q. What are the requirements to use this service?
- A. To access ACM Author-Izer, authors need to have a free ACM web account, must have an ACM Author Profile page in the Digital Library, and must take ownership of their Author Profile page.
- Q. What is an ACM Author Profile Page?
- A. The Author Profile Page initially collects all the professional information known about authors from the publications record as known by the ACM Digital Library. The Author Profile Page supplies a quick snapshot of an author's contribution to the field and some rudimentary measures of influence upon it. Over time, the contents of the Author Profile page may expand at the direction of the community. Please visit the ACM Author Profile documentation page for more background information on these pages.
- Q. How do I find my Author Profile page and take ownership?
- A. You will need to take the following steps:
- Create a free ACM Web Account
- Sign in to the ACM Digital Library
- Find your Author Profile Page by searching the ACM Digital Library for your name
- Find the result you authored (where your author name is a clickable link)
- Click on your name to go to the Author Profile Page
- Click the "Add Personal Information" link on the Author Profile Page
- Wait for ACM review and approval; generally less than 24 hours
- Q. Why does my photo not appear?
- A. Make sure that the image you submit is in .jpg or .gif format and that the file name does not contain special characters.
- Q. What if I cannot find the Add Personal Information function on my author page?
- A. The ACM account linked to your profile page is different from the one you are logged into. Please log out and log in to the account associated with your Author Profile Page.
- Q. What happens if an author changes the location of his bibliography or moves to a new institution?
- A. Should authors change institutions or sites, they can utilize ACM Author-Izer to disable old links and re-authorize new links for free downloads from a new location.
- Q. What happens if an author provides a URL that redirects to the author’s personal bibliography page?
- A. The service will not provide a free download from the ACM Digital Library. Instead the person who uses that link will simply go to the Citation Page for that article in the ACM Digital Library where the article may be accessed under the usual subscription rules.
However, if the author provides the target page URL, any link that redirects to that target page will enable a free download from the Service.
- Q. What happens if the author’s bibliography lives on a page with several aliases?
- A. Only one alias will work, whichever one is registered as the page containing the author’s bibliography. ACM has no technical solution to this problem at this time.
- Q. Why should authors use ACM Author-Izer?
- A. ACM Author-Izer lets visitors to authors’ personal home pages download articles for no charge from the ACM Digital Library. It allows authors to dynamically display real-time download and citation statistics for each “authorized” article on their personal site.
- Q. Does ACM Author-Izer provide benefits for authors?
- A. Downloads of definitive articles via Author-Izer links on the authors’ personal web page are captured in official ACM statistics to more accurately reflect usage and impact measurements.
Authors who do not use ACM Author-Izer links will not have downloads from their local, personal bibliographies counted. They do, however, retain the existing right to post author-prepared preprint versions on their home pages or institutional repositories with DOI pointers to the definitive version permanently maintained in the ACM Digital Library.
- Q. How does ACM Author-Izer benefit the computing community?
- A. ACM Author-Izer expands the visibility and dissemination of the definitive version of ACM articles. It is based on ACM’s strong belief that the computing community should have the widest possible access to the definitive versions of scholarly literature. By linking authors’ personal bibliography with the ACM Digital Library, user confusion over article versioning should be reduced over time.
In making ACM Author-Izer a free service to both authors and visitors to their websites, ACM is emphasizing its continuing commitment to the interests of its authors and to the computing community in ways that are consistent with its existing subscription-based access model.
- Q. Why can’t I find my most recent publication in my ACM Author Profile Page?
- A. There is a time delay between publication and the process which associates that publication with an Author Profile Page. Right now, that process usually takes 4-8 weeks.
- Q. How does ACM Author-Izer expand ACM’s “Green Path” Access Policies?
- A. ACM Author-Izer extends the rights and permissions that authors retain even after copyright transfer to ACM, which has been among the “greenest” publishers. ACM enables its author community to retain a wide range of rights related to copyright and reuse of materials. They include:
- Posting rights that ensure free access to their work outside the ACM Digital Library and print publications
- Rights to reuse any portion of their work in new works that they may create
- Copyright to artistic images in ACM’s graphics-oriented publications that authors may want to exploit in commercial contexts
- All patent rights, which remain with the original owner