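Effect of LM Studio's GPU Offload on Token Generation Speed: the 7840HS's 780M iGPU as an Example

Background

A year ago I verified that an integrated GPU can run a 32b model:

https://lizhiyong.blog.csdn.net/article/details/147237895

Smarter small models have reportedly come out recently, so this is a good chance to see how to raise the token generation speed of this aging machine. One key parameter is GPU offload.

Default setting

By default, GPU offload in LM Studio is set to 4.

chat

Prompt: 虎鲸是鱼嘛 (Are orcas fish?)

With this setting the iGPU load is low while the CPU load is high. Most of the work clearly lands on the CPU, so token generation speed is far from ideal.

Think

Here's a thinking process:

Analyze User Input:
Question: 虎鲸是鱼嘛 (Are orcas/fish?)
Language: Chinese
Core question: Classification of 虎鲸 (orca/killer whale) - is it a fish?

Identify Key Facts:
虎鲸 (Orca/Killer Whale) is a marine mammal.
It belongs to the order Cetacea, family Delphinidae.
Characteristics: breathes air with lungs, gives birth to live young, nurses them with milk, maintains constant body temperature, has hair (at least at birth).
Fish are cold-blooded, breathe through gills, lay eggs (mostly), have scales.
Therefore, 虎鲸 is NOT a fish.

Formulate Response (Internal Refinement - Chinese):
Direct answer: 不是。
Explanation: 虎鲸是哺乳动物，属于鲸目海豚科。它用肺呼吸、胎生、哺乳、恒温，不具备鱼类特征，如鳃呼吸、卵生、变温等。虽然生活在海洋里且外形似鱼，但分类学上明确属于哺乳类。
Keep it concise as per system prompt.

Check against Constraints:
Plain, complete, no moralizing/disclaimers/hedging.
Follow exact format if requested (none here).
Concise by default.
Language matches user (Chinese).

Draft:
不是。虎鲸是哺乳动物，属于鲸目海豚科。它用肺呼吸、胎生、哺乳、恒温，不具备鱼类的鳃呼吸和卵生等特征。虽然生活在海洋且外形似鱼，但分类学上明确属于哺乳类。

Final Output Generation: (matches draft)
✅ Direct, accurate, concise, matches all constraints.
✅ No extra fluff.
✅ Ready.
Output matches the refined draft. Proceeds. [Done]

Self-Correction/Verification during thought:
Is it a fish? No.
What is it? Marine mammal (cetacean).
Key differentiators: lungs, live birth, milk, warm-blooded vs gills, eggs, cold-blooded.
All correct. Output matches. ✅
Output generation.
“不是。虎鲸是哺乳动物，属于鲸目海豚科。它用肺呼吸、胎生、哺乳、恒温，不具备鱼类用鳃呼吸和卵生等特征。虽生活在海洋且外形似鱼，但分类学上明确属于哺乳类。” (Slightly polished for flow) - matches exactly. ✅
Done. Output matches response. [Proceeds]

The thinking phase took 3 min 58 s.

Token generation speed

不是。虎鲸是哺乳动物，属于鲸目海豚科。它用肺呼吸、胎生、哺乳、恒温，不具备鱼类用鳃呼吸和卵生等特征。虽生活在海洋且外形似鱼，但分类学上明确属于哺乳类。

(In English: No. The orca is a mammal in the order Cetacea, family Delphinidae. It breathes with lungs, bears live young, nurses them and is warm-blooded, and it lacks fish traits such as gill breathing and egg laying. Although it lives in the ocean and looks fish-like, taxonomically it is clearly a mammal.)

Token generation speed was 2.64 t/s.

Raising the GPU offload

Of the machine's 96 GB of RAM, 48 GB is allocated to the iGPU, which is enough to run a small Q4 model in the 30b class. So I pushed GPU offload to the maximum, setting it to 64. A restart is needed before the change takes effect.

chat

CPU load now drops noticeably, the GPU load finally climbs, and memory usage goes up as well.

Think

Here's a thinking process:

Analyze User Input:
Question: 虎鲸是鱼嘛 (Is an orca a fish?)
Language: Chinese
Core concept: Biological classification of orcas (killer whales)

Identify Key Facts:
Orca: 虎鲸 (hǔ jīng) in Chinese
Scientific name: Orcinus orca
Taxonomy: Animalia, Chordata, Mammalia, Cetacea, Odontoceti, Delphinidae
Key characteristic: It's a marine mammal, not a fish.
Distinguishing features from fish: Breathes air with lungs (not gills), gives birth to live young (viviparous), nurses them with milk, has hair (though minimal), is warm-blooded, etc.

Formulate Response (Internal Refinement - Chinese):
Direct answer: 不是。
Explanation: 虎鲸是哺乳动物，属于鲸目齿鲸亚目海豚科。
Key differences from fish: 用肺呼吸、胎生、哺乳、恒温等。
Keep it concise as requested.

Draft Response (Chinese):
不是。虎鲸是哺乳动物，属于鲸类海豚科。它们用肺呼吸、胎生、哺乳、恒温，与鱼类在分类和生理结构上完全不同。

Check against constraints:
Plain, complete, no moralizing/disclaimers
Follow exact format if requested (none here)
Concise by default
Matches language (Chinese)
Accurate scientifically
All good. Ready to output. ✅

For the same question, the thinking phase shrank to 1 min 49 s.

Token generation speed

Token generation speed rose noticeably, reaching 3.5 t/s.

Comparison

With the same model and the same question, the answers differ little, while the performance gain is clear:

| Mode | Default GPU offload (4) | GPU offload raised to 64 | Improvement |
| --- | --- | --- | --- |
| Think (elapsed time) | 3 min 58 s | 1 min 49 s | 54.2% less time |
| Answer (generation speed) | 2.64 t/s | 3.5 t/s | 32.58% faster |

That is a sizable improvement.

Please credit the source when reposting: https://lizhiyong.blog.csdn.net/article/details/161386515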
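
The two improvement figures in the comparison measure different things: the Think row is a reduction in elapsed time, while the Answer row is an increase in tokens per second. As a minimal sketch, the arithmetic can be reproduced from the measured values reported in this post. The helper names `parse_duration`, `time_reduction` and `rate_increase` are hypothetical and exist only for this illustration; they are not part of LM Studio.

```python
def parse_duration(text: str) -> int:
    """Convert a duration written like '3min58s' into whole seconds."""
    minutes, seconds = text.rstrip("s").split("min")
    return int(minutes) * 60 + int(seconds)


def time_reduction(before_s: float, after_s: float) -> float:
    """Percentage by which elapsed time shrank (lower is better)."""
    return (before_s - after_s) / before_s * 100


def rate_increase(before: float, after: float) -> float:
    """Percentage by which a throughput figure grew (higher is better)."""
    return (after - before) / before * 100


# Measured values taken from the post.
think_before = parse_duration("3min58s")  # default GPU offload (4)
think_after = parse_duration("1min49s")   # GPU offload set to 64
answer_before = 2.64                      # tokens/s, default
answer_after = 3.5                        # tokens/s, GPU offload 64

print(f"Think time reduction: {time_reduction(think_before, think_after):.1f}%")
print(f"Think speedup factor: {think_before / think_after:.2f}x")
print(f"Answer rate increase: {rate_increase(answer_before, answer_after):.2f}%")
```

Expressed as a speedup factor, the thinking phase ran a little over twice as fast. Part of that difference may come from the second thinking trace being shorter, so the tokens-per-second figure is the more direct measure of generation speed.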