Yingjia Wan
Publications
FaStfact: Faster, Stronger Long-Form Factuality Evaluations in LLMs
FaStfact is a multi-agent pipeline for evaluating the factuality of long-form generation. Among existing baselines, it achieves the highest alignment with human evaluation and the highest time/token efficiency. An annotated FaStfact-bench is also open-sourced.
Yingjia Wan, Haochen Tan, Xiao Zhu, Xinyu Zhou, Zhiwei Li, Qingsong Lv, Changxuan Sun, Jiaqi Zeng, Yi Xu, Jianqiao Lu, Yinhong Liu, Zhijiang Guo
Code
arxiv
SATBench: Benchmarking LLMs’ Logical Reasoning via Automated Puzzle Generation from SAT Formulas
SATBench is a benchmark for evaluating LLMs’ logical reasoning through logical puzzles derived from Boolean satisfiability (SAT) problems.
Anjiang Wei, Yuheng Wu, Yingjia Wan, Tarun Suresh, Huanmi Tan, Zhanke Zhou, Sanmi Koyejo, Ke Wang, Alex Aiken
Code
arxiv
MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs
MR-Ben is a comprehensive process-based benchmark for evaluating advanced ‘meta-reasoning’ skills, in which models are asked to locate and analyse errors in provided CoT solutions. It comprises 5,975 multi-domain samples with annotated ground truths.
Zhongshen Zeng, Yinhong Liu, Yingjia Wan, Jingyao Li, Pengguang Chen, Jianbo Dai, Yuxuan Yao, Rongwu Xu, Zehan Qi, Wanru Zhao, Linling Shen, Jianqiao Lu, Haochen Tan, Yukang Chen, Hao Zhang, Zhan Shi, Bailin Wang, Zhijiang Guo, Jiaya Jia
Code
Dataset
Leaderboard
arXiv
AutoPSV: Automated Process-Supervised Verifier
AutoPSV proposes a simple, effective, and efficient method to automatically annotate reasoning steps (even without requiring ground-truth answers).
Jianqiao Lu, Zhiyang Dou, Hongru Wang, Zeyu Cao, Jianbo Dai, Yingjia Wan, Yinya Huang, Zhijiang Guo
Code
arxiv
Reading-While-Listening vs. Reading-Only in A Second Language at Different Language Proficiencies: an Eye-Tracking Study
Reading-while-listening (R/L) has a facilitation effect on second language (L2) reading comprehension after longitudinal R/L training …
Yingjia Wan, Matthew Wallace
Last updated on July 7, 12127
Pedagogy in a Pandemic: College Instructor Perspectives on Online Instruction during COVID-19 at Universities in USA and China
Higher education institutions globally saw a collective mandate to move classes online, where afforded, at the onset of the COVID-19 …
Sarah Stilwell, Anjli Narwani, Jessica Pelton, Xi Zhang, Qi Zeng, Qi Zhao, Yingjia Wan, Kevin Miller
Last updated on July 7, 12127