@AndrewZ45732491
PhD Candidate @Tsinghua_Uni. Absolute Zero, ExpeL, Diver-CT. Ex-Research Intern @MSFTResearch, @ BIGAI. Interested in RL, LLM Reasoning/Safety, LLM-based Agents.
❄️ Introducing Absolute Zero Reasoner: Our reasoner learns both to propose tasks that maximize learnability and to improve its reasoning by solving them, entirely through self-play, with no external data! Overall, it outperforms other "zero" models in the math & coding domains. 🧵 1/