Pseudointelligence: A Unifying Lens on Language Model Evaluation.

Pseudointelligence: A Unifying Lens on Language Model Evaluation

aclanthology.org › 2023.findings-emnlp....

We propose pseudointelligence, which captures the maxim that “(perceived) intelligence lies in the eye of the beholder.”

Scholarly articles for Pseudo Intelligence: A Unifying Lens on Language Model Evaluation.

scholar.google.com › citations

Large language model alignment: A survey
Shen · Cited by 111

… and Opportunities with Large Language Models
Dai · Cited by 7

… assessment of recent large vision-language models
Jiang · Cited by 13

[PDF] Pseudointelligence: A Unifying Lens on Language Model Evaluation

openreview.net › pdf

Figure 1: Evaluation of a pseudointelligent model. For each capability µ, (1) iid samples are drawn and (2) fed to the learners, who (3) output a model and ...

A Unifying Framework for Language Model Evaluation - arXiv

arxiv.org › cs

Oct 18, 2023 · We propose a complexity-theoretic framework of model evaluation cast as a dynamic interaction between a model and a learned evaluator.

Pseudointelligence: A Unifying Lens on Language Model Evaluation

www.researchgate.net › publication › 37...

In this paper, a new word-based language model evaluation measure is proposed to account for the effect of word segmentation and the goal of predicting CER.

A Unifying Framework for Language Model Evaluation - ar5iv - arXiv

ar5iv.labs.arxiv.org › html

Inspired by pseudorandomness, we propose pseudointelligence, which captures the maxim that “(perceived) intelligence lies in the eye of the beholder.” That is, ...

Pseudointelligence: A Unifying Framework for Language Model ...

synthical.com › article

Oct 18, 2023 · With large language models surpassing human performance on an increasing number of benchmarks, we must take a principled approach for ...

Factual consistency evaluation of summarization in the Era of large ...

www.sciencedirect.com › article › pii

Recent advances in Large language models (LLMs) have demonstrated remarkable potential in text evaluation but their effectiveness in assessing FC in ...

(PDF) Evaluating Intelligence and Knowledge in Large Language ...

www.researchgate.net › publication › 38...

Aug 4, 2024 · This article aims to explore the dual challenge of assessing the effects of Large Language Models and associated semantic technologies on text dissemination ...

Accepted Main Conference Papers - ACL 2024

2024.aclweb.org › program › main_conf...

EmoBench: Evaluating the Emotional Intelligence of Large Language Models ... Revealing the Parametric Knowledge of Language Models: A Unified Framework for ...

Effectiveness assessment of recent large vision-language models

link.springer.com › Visual Intelligence

Jun 28, 2024 · This paper endeavors to evaluate the competency of popular LVLMs in specialized and general tasks, respectively, aiming to offer a comprehensive understanding ...