Deep Thinking
What is Deep Thinking?
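When depth is not enough, make up for it with length: a model with a fixed number of layers can compensate by generating a longer chain of reasoning tokens.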
Question -----> LLM -----> <think> ....... </think> -> Verification, Exploration, Planning
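As a small illustration of the flow above, the reasoning inside the `<think>` tags can be separated from whatever follows it. This is only a sketch: it assumes the model emits at most one `<think>...</think>` block followed by the final answer, and the names `split_reasoning` and `sample` are hypothetical, not part of any library.

```python
import re

# Assumption: the raw output has an optional <think>...</think> block,
# followed by the final answer text.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(output: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a raw model output string."""
    match = THINK_PATTERN.search(output)
    if match is None:
        # No thinking block: treat the whole output as the answer.
        return "", output.strip()
    reasoning = match.group(1).strip()
    answer = output[match.end():].strip()
    return reasoning, answer

# Hypothetical model output, written by hand for illustration.
sample = "<think>2 + 3 = 5. Check: 5 - 3 = 2, correct.</think>The answer is 5."
reasoning, answer = split_reasoning(sample)
print("reasoning:", reasoning)
print("answer:", answer)
```

The hand-written sample contains a verification step ("Check: ..."), one of the behaviors listed above.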
Example: AlphaGo
- AlphaGo's thinking process uses MCTS (Monte Carlo Tree Search)
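To make the four MCTS steps (selection, expansion, simulation, backpropagation) concrete, here is a minimal sketch on a toy game. It is not AlphaGo's algorithm: AlphaGo combines MCTS with policy and value networks, while this sketch uses plain UCT with random rollouts. The game (a Nim variant: take 1 to 3 stones, whoever takes the last stone wins) and all names (`Node`, `rollout`, `mcts`) are assumptions chosen for illustration.

```python
import math
import random

def legal_moves(stones):
    """A player may take 1, 2, or 3 stones, but no more than remain."""
    return [m for m in (1, 2, 3) if m <= stones]

class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones              # stones left for the player to move
        self.parent = parent
        self.move = move                  # move that led from parent to this node
        self.children = []
        self.untried = legal_moves(stones)
        self.visits = 0
        self.wins = 0.0                   # wins for the player who moved INTO this node

def rollout(stones, rng):
    """Play random moves to the end; True if the player to move at the start wins."""
    turn = 0                              # 0 = player to move at the start
    while stones > 0:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return turn == 0              # whoever takes the last stone wins
        turn ^= 1
    return False                          # no stones at the start: already lost

def mcts(root_stones, iterations=2000, c=1.4, seed=0):
    if root_stones <= 0:
        raise ValueError("root_stones must be positive")
    rng = random.Random(seed)
    root = Node(root_stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: follow the UCB1 rule while the node is fully expanded.
        while not node.untried and node.children:
            node = max(
                node.children,
                key=lambda ch: ch.wins / ch.visits
                + c * math.sqrt(math.log(node.visits) / ch.visits),
            )
        # 2. Expansion: add one untried child, if any.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new node.
        mover_wins = rollout(node.stones, rng)
        # 4. Backpropagation: flip the result at each level, because
        #    the players alternate.
        result = 0.0 if mover_wins else 1.0
        while node is not None:
            node.visits += 1
            node.wins += result
            result = 1.0 - result
            node = node.parent
    # Recommend the most-visited move at the root.
    return max(root.children, key=lambda ch: ch.visits).move

best = mcts(5)
print("suggested move from 5 stones:", best)
```

In this game, leaving the opponent a multiple of 4 stones is the winning strategy, so with enough iterations the search should prefer taking 1 stone from 5. Raising `iterations` is the "think longer" knob.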
Test Time Scaling
- The more the model thinks at test time, the better its results tend to be.
- Paper: Scaling Scaling Laws with Board Games
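One way to see why extra test-time compute can help is a toy calculation. It is not taken from the paper above. It assumes a question with two possible answers, independent samples that are each correct with probability `p`, and a majority vote over `n` samples. The function name `majority_vote_accuracy` is hypothetical.

```python
from math import comb

def majority_vote_accuracy(p: float, n: int) -> float:
    """Probability that a strict majority of n independent samples is correct,
    when each sample is correct with probability p (two possible answers)."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )

# Odd n avoids ties. More samples means more test-time compute.
for n in (1, 3, 5, 11, 51):
    print(n, round(majority_vote_accuracy(0.6, n), 3))
```

Under these assumptions, with `p = 0.6` the accuracy rises from 0.6 with one sample to 0.648 with three and about 0.683 with five. Real models do not produce independent samples, so this is only an intuition.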
Methods for Building a Reasoning LLM
These methods can be combined.
Chain of Thought (CoT)
- No need to change the model
- Few-shot CoT
- Give the LLM worked examples, then ask it to answer the question
- Paper: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
- Zero-shot CoT
- Simply tell the LLM "Let's think step by step" when asking the question
- Paper: Large Language Models are Zero-Shot Reasoners
- Long CoT
- Paper: Towards Reasoning Era
- Supervised CoT
- Use a better prompt to guide how the LLM answers the question
- Paper: Supervised Chain of Thought
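The few-shot and zero-shot variants above differ only in how the prompt is built, which a short sketch can show. The worked example, the `Q:`/`A:` layout, and the function names are assumptions made for illustration. No model is called: the resulting string would be sent to whatever LLM interface is in use.

```python
# Hypothetical worked example; a real few-shot prompt would usually have several.
FEW_SHOT_EXAMPLES = [
    (
        "Tom has 3 apples and buys 2 more. How many apples does he have?",
        "Tom starts with 3 apples. He buys 2 more, so 3 + 2 = 5. The answer is 5.",
    ),
]

def few_shot_cot_prompt(question, examples=FEW_SHOT_EXAMPLES):
    """Few-shot CoT: show worked examples with reasoning, then the new question."""
    parts = [f"Q: {q}\nA: {a}" for q, a in examples]
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

def zero_shot_cot_prompt(question):
    """Zero-shot CoT: no examples, only a trigger phrase that invites reasoning."""
    return f"Q: {question}\nA: Let's think step by step."

question = "A train travels 60 km in 1.5 hours. What is its average speed?"
print(few_shot_cot_prompt(question))
print()
print(zero_shot_cot_prompt(question))
```

Both prompts leave the model unchanged. Only the input text differs, which is why these methods need no training.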
Give the Model a Reasoning Workflow
- No need to change the model