Java Performance: Performance Tuning in Java



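Java performance tuning is not a new topic. This article collects practical tips and tricks from worked examples covering method calls, object creation, multithreading and synchronization, list efficiency, String handling, and strength reduction. Combined with the new JDK 1.4 compiler, they let developers build applications almost as fast as C...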

[i]The following mantra was first stated about two decades ago in Jon Bentley's "Programming Pearls" column: defer optimization and get your code working first. This wisdom has been amplified by numerous writers on object-oriented design, coding, thinking, and more. The reigning philosophy has been stated as, "get it working first, then determine which areas are the critical ones and optimize only those."[/i]
Since 20% of the code is run 80% of the time, this seems like a reasonable idea. Bentley was an early advocate of the use of profiling tools that show which parts of the code are run the most, and targeting the most critical areas first. On the whole, this is good advice, but the lesson has been learned too well. Today it's common practice to ignore efficiency, scattering layers of unnecessary inefficiency everywhere without thought. This article shows that it's just as easy to write faster code without taking extra development time to do it, and teaches something about the way Java optimizes your code.


Method Calls
Object-oriented design is an organizational technique; at the individual method level, code has been written the same way for the past 40 years. The focus on organization merely splits code down into small manageable units: classes that contain a number of (typically quite small) methods. In a language such as C++ this is not a problem at all, as the language itself has extensive control over the cost of the method calls. In fact, with the inline directive, the cost of method execution in C++ can drop to zero. In Java, however, the default method definition checks to see what object it is and calls the appropriate method, which is equivalent to a virtual function in C++. This involves overhead: the program must first examine the object to determine its type, select the appropriate method, and then call it. Calling a method is quite slow compared to executing instructions within a method. In fact, as my benchmarks show, a loop that executes n times doing nothing but counting is 50 times faster than one that calls a method that does nothing. (See the benchmark on my Web site, www.righttrak.com/javaperformance/benchmarks.)

What is the cost of a method call and how can you reduce it? A static method costs about three units of time on my Pentium 4 PC, where a unit is defined by an empty counting loop. A final method costs roughly the same, and an ordinary nonfinal method is about three times as expensive. The conclusion is obvious: whenever possible, use the final or static qualifiers on methods (in other words, if you don't intend to override the method, say so).
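As a minimal sketch of what "saying so" looks like in code, the hypothetical class below (the class and method names are illustrative and do not come from the benchmark) declares one helper static and one accessor final, so neither call needs to be resolved against possible overrides:

```java
// Hypothetical example class; names are illustrative only.
class Counter {
    private int count;

    // static: there is no receiver object, so no dynamic dispatch
    static int twice(int x) {
        return x + x;
    }

    // final: subclasses cannot override, so the call target is fixed
    final int next() {
        return ++count;
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        int sum = 0;
        for (int i = 0; i < 5; i++) {
            sum += twice(c.next());
        }
        System.out.println(sum);
    }
}
```

How much this saves on a given JVM is something to measure rather than assume; the point of the sketch is only where the qualifiers go.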



Let's start by saying that if you want a program to run fast, get JDK 1.4 and run it with optimization turned on:

java -server MyClass

The -server option scans the entire loaded program as it's being run, eliminating methods by inlining them, turning methods into native assembler, removing constant evaluations from loops, and other optimizations. It improves performance, often by a factor of 10 in CPU-intensive bits of code. It might surprise you to think about optimizing programs at runtime, but considering that Java runs on different machines, the only way to optimize for your particular processor is at runtime.

This feature is new in 1.4. There's a "bad feature" in 1.3 that tends to invoke the JIT compiler lazily, often too late. If you compile the following program and run it under 1.3, any code in main is optimized, but code in f is optimized only after it's called once:



public static void f() {...}
public static void main(String[] args) {
    f();
}
In this case, since f is called only once, it's obvious that the compiler had better optimize f before executing it, or not bother.

The compiler makes certain assumptions about what is worth inlining based on the fact that inlining code can take more memory if the code is big. If you have a big routine, it won't be inlined. This can be very unfortunate in some specialized cases, which is where human intelligence comes in.

Suppose you're writing f, which calls g, which calls h. You're doing this just to split up the code and make it easier to read. However, even though f is the only function that calls g, if g is big enough it won't be inlined. It doesn't matter if there's a big loop involved:



void f() { g(); }

void g() {
    for (int i = 0; i < 100000000; i++)
        ...;
}

because the time it takes to perform the loop dwarfs the time it takes to get in and out, and so the percentage cost of the procedure call is tiny.

There is, however, a case to be made for applying human intelligence to inlining code. Frequently a method is large, but the first line of the method is a test that determines whether or not to execute the rest. Consider a logging routine. If debug mode is on, it should write out. But for all those cases where debug is false, why call the routine at all? Here's an example of this in action:


private boolean debug;

In the first case, f7 calls the logging routine regardless of the state of the debugging flag:



public int f7(int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        log(i);
    }
    return sum;
}
In the second case, f8 calls the logging routine only if debug mode is on. At the cost of an extra if statement every time you call the log routine, this code runs 25% faster.





public int f8(int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        if (debug)
            log2(i);
    }
    return sum;
}
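To make the effect of the call-site test visible, here is a small self-contained sketch. The class, the call counter, and the method names are assumptions made for illustration; they are not the article's benchmark code. With debug off, the unguarded loop still enters log once per iteration, while the guarded loop never enters it:

```java
// Illustrative sketch; names and the call counter are assumptions.
class GuardedLogging {
    static boolean debug = false; // assumed to be off in production
    static int logCalls = 0;      // counts how often log() is entered

    static void log(int i) {
        logCalls++;
        if (debug) {
            System.out.println("i = " + i);
        }
    }

    // Always pays for the method call, even when debug is off.
    static int unguarded(int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            log(i);
            sum += i;
        }
        return sum;
    }

    // Tests the flag at the call site and skips the call entirely.
    static int guarded(int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            if (debug) {
                log(i);
            }
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        unguarded(1000);
        System.out.println("unguarded calls: " + logCalls);
        logCalls = 0;
        guarded(1000);
        System.out.println("guarded calls: " + logCalls);
    }
}
```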

Multithreading and Synchronization
Since Java is a multithreaded language, the synchronized primitive is provided to make sure that multiple threads of execution do not destroy objects. When entering a synchronized method, a thread acquires a lock that's associated with the object, which prevents any other thread from entering a synchronized method on that object. Acquiring such a lock is a slow machine language instruction. The result is that calling a synchronized method is three times as slow as an ordinary method, which in turn is three times as slow as a static or final method. The computer must first check whether someone already has the lock, and if not, acquire the lock, all in one atomic operation.


public synchronized void f() {...}

Multithreading is a complex topic, and the reader is well advised to read one of the many books on the topic for a full understanding. However, a very quick overview of optimization should begin with the observation that since acquiring locks and managing them is a costly business, we should avoid it unless there is some real reason for their use. Furthermore, multithreading will only result in a gain of efficiency if we can overcome the immediate loss of efficiency that results from calling synchronized methods. In a great many situations, multithreading is not called for, and the simplest and best way to handle the situation is to say that the object is not thread-safe, and that programmers should never make two simultaneous calls to the same object.

On the other hand, if multithreading has a significant advantage, the best way to achieve it, where possible, is to have more than one object and give each thread its own object. As a perfect example of this, consider IO streams in a Web environment. While there may be many threads simultaneously writing Web pages, each one is writing to its own Web page. In such cases, the IO stream for servlets could be implemented as a fast, unsynchronized version of PrintStream.

Sometimes, however, there are applications (like a log) where it's vital that multiple threads be able to write to the same object. In such cases, synchronization is vital for correctness. While we can add a class to the library to support un-thread-safe IO, we must always continue to support thread-safe IO for those few cases where it's important.

If you're going to acquire a lock, do so only once. Planning how locks are acquired and released is not only good optimization practice, it's worth really thinking over, as this is one of those tricky areas where badly thought-out designs are not only slow, but often fail in very subtle, nonrepeatable ways. These are the hardest possible situations to debug. Because acquiring the lock means that no one else can enter, synchronized critical sections should:



Be as short as possible
Not call other synchronized routines (i.e., do whatever needs to be done in a single synchronized section if possible)
Never allow unsynchronized access to critical data
Never deadlock
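The rules above can be sketched with a shared log. The class below is hypothetical (it is not taken from the JDK or from the article's benchmarks): it builds each line before taking the lock, takes the lock once per entry, keeps the critical section to two statements, and uses an unsynchronized StringBuilder inside so that no second lock is acquired while the first is held.

```java
// Hypothetical shared log; a sketch of the rules, not a library class.
class SharedLog {
    private final StringBuilder text = new StringBuilder(); // unsynchronized
    private int entries;

    void add(String source, int code) {
        // Build the line outside the lock to keep the critical section short.
        String line = source + ": " + code + '\n';
        synchronized (this) {          // acquired exactly once per entry
            text.append(line);
            entries++;
        }
    }

    // All access to the shared fields goes through the same lock.
    synchronized int entries() {
        return entries;
    }

    // Starts several writer threads and returns the final entry count.
    static int demo(int threads, int perThread) {
        final SharedLog log = new SharedLog();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final String name = "worker-" + t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    log.add(name, i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return log.entries();
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 1000));
    }
}
```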
[i]Case Study[/i]
Simply removing all the synchronization from java.io.PrintWriter and writing a class that is functionally equivalent but not thread safe resulted in a 50% improvement in speed. Class PrintWriter contains synchronized methods that call other synchronized methods, in some cases three deep. The long chain of method invocations before getting to any actual code is a large part of what slows down IO.


Calling Native Methods
You might assume that if you really need speed, you can resort to linking in some C code and call that for the ultimate in performance. The answer may surprise you; it certainly surprised me. Even ignoring the obvious disadvantages of using C (the lack of portability, requiring a shared library to deploy an application, etc.), the simple fact is that calling a native method is twice as slow as an ordinary method call.

Having looked a bit at the implementation of the JDK, I can tell you that while it may be tweaked a bit, the reason is essentially sound: to call a C routine, you must first make a native mode call (that's one) and then set up a call to the underlying C routine; twice as much work, twice as much time, right? And to communicate with anything in the Java environment takes further calls as well, so the only way you'll see a significant speed advantage is by staying in the C world for a while. In short, native methods seem to be totally outclassed at this point by a combination of increasingly good optimization in the Java world and the somewhat inefficient code involved in the communication between the two.


Creating Objects
As a C programmer originally, I assumed that the biggest cost I was likely to find was the synchronized method call. I was surprised: the slowest operation by far was the creation of an object. In hindsight it makes perfect sense. Creating an object requires the allocation of memory, including all the overhead for identifying the class of the object, its lock, and the amount of memory being used. After using the object for a time and invoking methods, the garbage collector must eventually free the memory that has been allocated. The act of allocating the memory alone, even when optimized in JDK 1.4, is far more expensive than a synchronized method call. The overriding rule in Java code optimization is simple: don't create unnecessary temporary objects.

In the following example, the first version, which creates only a single object and repeatedly queries it, takes 800 ms. The second version, which re-creates the object on every iteration, takes 26,300 ms, so the first is more than 30 times faster. This is an extreme example, of course, because what is being done is very simple compared to the object creation, but it gives an idea of just how costly object creation is.





public int f10(int n) {
    int sum = 0;
    TempThing t = new TempThing(0);
    for (int i = 0; i < n; i++)
        sum += t.getV();
    return sum;
}

public int f11(int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        TempThing t = new TempThing(0);
        sum += t.getV();
    }
    return sum;
}

[i]Case Study[/i]
While removing synchronization and streamlining the code path of PrintWriter resulted in a factor of two improvement in performance, eliminating the temporary string created in printing an int resulted in a sixfold performance improvement.


String Manipulation
Many programmers have seen the sequence:

String s = "a" + "b" + "c";

and know that StringBuffer is better:

StringBuffer b = new StringBuffer();
b.append("a").append("b").append("c");
This knowledge seems to break down after this point. If you're processing large strings in StringBuffers, don't then turn them back into strings to pass them to another routine unless you're worried about multithreading problems. As long as you're processing single threaded, you're better off continuing to append onto the StringBuffer until you're done. The following routine:



public String getAsXML() {
    StringBuffer b = new StringBuffer();
    b.append(...);
    return b.toString();
}
must make an unnecessary copy in order to turn the StringBuffer into a string. Then, if the caller is going to append more text, this must be appended onto yet another StringBuffer. This is a big waste. Instead, try:



public void getAsXML(StringBuffer buf) {
    buf.append(...);
}
where the caller allocates the StringBuffer and passes it to the routine, which fills it. The caller can then continue processing. This approach has another advantage, namely that the caller usually has a much better idea of the total size of the StringBuffer at the end of processing. It is vastly more efficient, if you know how many characters are involved, to preallocate them rather than allow the StringBuffer to start at the default 16 and grow, which requires a lot of copying. For example, if you know the eventual size of the string will be as high as 2K, then:



StringBuffer buf = new StringBuffer(2048);
obj.getAsXML(buf);
will typically result in approximately 100% performance improvement over the original code. It's far better to overallocate than to underallocate and require a grow operation. Remember, this works only if the string in question is not being assaulted by multiple threads.
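Putting the two ideas together (a caller-owned buffer and a size estimate made up front), a minimal sketch might look like the following. The XmlItem class, its element name, and the size estimate are all assumptions made for this illustration:

```java
// Hypothetical class; the <item> element and the estimate are illustrative.
class XmlItem {
    private final String name;

    XmlItem(String name) {
        this.name = name;
    }

    // Appends into the caller's buffer instead of returning a new String.
    void appendAsXML(StringBuffer buf) {
        buf.append("<item>").append(name).append("</item>").append('\n');
    }

    static String render(XmlItem[] items) {
        // The caller knows roughly how much text is coming:
        // 14 characters of tags and newline per item, plus the name.
        int estimate = 0;
        for (XmlItem item : items) {
            estimate += item.name.length() + 14;
        }
        StringBuffer buf = new StringBuffer(estimate); // preallocated once
        for (XmlItem item : items) {
            item.appendAsXML(buf);
        }
        return buf.toString(); // a single conversion at the very end
    }

    public static void main(String[] args) {
        XmlItem[] items = { new XmlItem("a"), new XmlItem("b") };
        System.out.print(render(items));
    }
}
```

As the text notes, this pattern assumes the buffer is confined to one thread.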

Manipulating strings, even optimized ones, takes a fair amount of work and code, even if the length is one. If you're processing a single character, using a char is much faster, so:

buf.append('\n');

is significantly faster than:

buf.append("\n");

Efficient Use of Lists
Java provides a fairly rich set of data structures. They're not all the same, and while they may work interchangeably, that doesn't mean they're all equally good in all circumstances. To build up a list in order, ArrayList is faster than LinkedList by a factor of two. LinkedList is substantially slower because each node requires the creation of an object. Vector is a close second in speed; it's slower because it's a synchronized data structure. However, in situations where values are to be inserted in the middle of the list (or worse still, the beginning), LinkedList is the best by orders of magnitude since it does not have to constantly copy elements to move them aside.





ArrayList v = new ArrayList(n);
for (int i = 0; i < n; i++)
    v.add(new Integer(i));
return v.size();
While both Vector and ArrayList use a growth algorithm that adaptively grabs larger and larger chunks every time the size is exceeded, each time they grow an enormous expense is incurred. As with StringBuffer, it's about twice as fast to preallocate as much space as you'll need than to grow later, even if you overallocate.

Last, remembering that object allocation is the slowest activity of all, you can easily see that this list, which must create an object wrapper for each int, is vastly inefficient. The following code, using a list class written just for int elements (see my Web site for the code), runs a full four-and-a-half times faster than ArrayList.



IntArrayList a = new IntArrayList(n);
for (int i = 0; i < n; i++)
    a.add(i);
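The author's IntArrayList lives on his Web site and is not reproduced here. As a rough idea of what such a class involves, the following is a minimal sketch under a different, hypothetical name; it stores values in a plain int[] so that no wrapper object is created per element:

```java
// Minimal sketch of a primitive int list; NOT the author's IntArrayList.
class SimpleIntList {
    private int[] data;
    private int size;

    SimpleIntList(int capacity) {
        data = new int[Math.max(capacity, 1)];
    }

    void add(int v) {
        if (size == data.length) {
            // Growing copies every element, which is why preallocating
            // the right capacity is cheaper.
            int[] bigger = new int[data.length * 2];
            System.arraycopy(data, 0, bigger, 0, size);
            data = bigger;
        }
        data[size++] = v;
    }

    int get(int i) {
        if (i < 0 || i >= size) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        return data[i];
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        int n = 1000;
        SimpleIntList a = new SimpleIntList(n);
        for (int i = 0; i < n; i++) {
            a.add(i);
        }
        System.out.println(a.size());
    }
}
```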
For scanning through an existing list, ArrayList is the fastest of the JDK list classes; getting an element from an array is a trivial operation, so for the synchronized Vector, synchronization dominates the time. Here, LinkedList can be monstrously slow if you use it incorrectly. Since LinkedList is not a random-access data structure, calling get(i) means it must start from element 0 and scan forward until it reaches position i. A loop that scans through the entire list is therefore not an O(n) operation, but O(n^2). For a list of 100,000 elements, my computer performed the ArrayList traversal in 3.25 milliseconds. LinkedList traversal took an astounding 113,657 milliseconds, or 34,971 times slower.



LinkedList l = l1;
for (int i = 0; i < n; i++)
    l.get(i);
The correct way to code traversal through a LinkedList is to use the iterator design pattern:



LinkedList l = l1;
for (Iterator i = l.listIterator(); i.hasNext(); )
    i.next();
The lesson to be learned here is that it pays to understand your data structures well. Just choosing the right data structure for your situation can pay enormous dividends. And using one incorrectly, as in the case of LinkedList traversal, can be very costly.
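The two traversal styles can be put side by side in a small sketch (the class and method names are illustrative). Both methods compute the same sum; on a LinkedList only the iterator version is linear, because each get(i) has to walk the chain of nodes to reach position i:

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Illustrative comparison of the two traversal styles.
class Traversal {
    // On a LinkedList every get(i) walks node by node: O(n^2) overall.
    static long sumByIndex(List<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        return sum;
    }

    // The iterator remembers its position, so the whole scan is O(n).
    static long sumByIterator(List<Integer> list) {
        long sum = 0;
        for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
            sum += it.next();
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> list = new LinkedList<>();
        for (int i = 0; i < 1000; i++) {
            list.add(i);
        }
        System.out.println(sumByIndex(list) == sumByIterator(list));
    }
}
```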
Last, if you want to store a list of primitives, the best way would be to have classes designed for the purpose, like IntArrayList. No one wants to go to the expense of writing and maintaining all permutations of lists for all the primitive data types; this is one reason Java needs a high-quality template facility like C++. That's a topic for another day, but one that I hope to revisit in a future article. For now, a friend and I are proposing some primitive list classes to add to the Java library, because when you do want a list of primitives, there's no substitute for a decent data structure.


Maps
HashMap is quite a bit faster than the older Hashtable, mostly by virtue of not being synchronized. However, the algorithm used is still less than optimal. To analyze it further, you have to look into HashMap's source code, and know a bit about hashing algorithms. In general, a hash algorithm is fast because it "hashes" the key and turns it directly into the location of the bin where the value is stored, making it an O(1) operation. The problem comes when two different keys happen to hash to the same bin. Statistically, this happens fairly often, and it's the job of the writer of the HashMap to reduce it as much as possible.



Collisions cannot be totally eliminated in the general case, so the design of hash algorithms must allow for them. Therefore, each bin in the HashMap is essentially a linked list for all the keys and values that could hypothetically end up there. This means that every time you add an element to a HashMap, you're once again creating an object that holds the key, the value, and a reference to the next node in case any more values happen to land in the same bin.

Object creation is the most expensive operation possible, so I've tried a different approach and have on my site a couple of experimental classes that perform twice as fast as HashMap (FastHashMap) or four times as fast if your key is an int (FastIntHashMap). They do, however, achieve part of their spectacular speed by not checking the size each time a new element is added, so you must allocate the right size table in advance.

As with all other Java data structures, if you add too many elements to a Hashtable or HashMap, they grow. This is the worst thing you can do, since growing requires painfully reinserting every element. Hashing requires about 25 to 30% more bins than there are elements for efficient operation. Always preallocate what you think is the right size for your Hashtable, be generous, and check at the end to be sure you were right and that the table did not have to grow.
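One way to act on this with the standard HashMap is to derive the initial capacity from the expected number of entries. The sketch below assumes HashMap's documented default load factor of 0.75; the helper name capacityFor and the sample keys are made up for the illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of presizing a HashMap; helper and key names are hypothetical.
class PresizedMap {
    // Smallest capacity that holds `expected` entries without exceeding
    // the load factor, so the table should not need to grow.
    static int capacityFor(int expected, float loadFactor) {
        return (int) Math.ceil(expected / (double) loadFactor);
    }

    static Map<String, Integer> build(int n) {
        Map<String, Integer> m = new HashMap<>(capacityFor(n, 0.75f));
        for (int i = 0; i < n; i++) {
            m.put("key" + i, i);
        }
        return m;
    }

    public static void main(String[] args) {
        System.out.println(capacityFor(1000, 0.75f));
        System.out.println(build(1000).size());
    }
}
```

Being generous with the expected count, as the text advises, costs only some memory.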

Last, because the hash algorithm for strings looks at every character in the string, avoid hashing large strings if at all possible. The smaller the string, the faster the hash.


Strength Reduction
Turning slow machine language instructions into equivalent but faster ones is traditionally the job of a peephole optimizer in a compiler; the optimizer looks at a window of instructions coming out of the code generator and makes judicious substitutions. In the Java environment there are two stages at which peephole optimizations can be done. One is during compile time when the source code is turned into JVM code; the other is when the code is run and the JIT turns JVM code into native assembler. The latter is the approach chosen by Sun, because that way they can optimize code for the particular processor running the code.

Having admitted that most strength reductions are things compilers should do, if your compiler doesn't do them (and Java didn't used to), then it's up to you to do them yourself. In doing so, there are a number of issues: Will the resulting code be as simple as or simpler than the original? Gaining a little speed while losing understandability is not a great bargain. Will the resulting code be faster? Programmers often assume they're optimizing, when in fact they're doing the reverse. This kind of clock cycle counting is certainly better done by a compiler, with knowledge of the target CPU and environment if at all possible. The good news is JDK 1.4 now does some strength reduction. It's up to you to decide how much speed you need now.



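First, what not to do. Multiplications by a constant power of 2 are automatically converted to shifts by the compiler:

x * 2    =>   x << 1
x * 16   =>   x << 4
More complex, but not worth it, are multiplications by constants:

x * 10   =>   (x << 3) + (x << 1)
Divisions are not supported at the moment, but will be soon. If you need the speed right now, replacing the division with a shift makes the division itself five or six times faster. Note that the two forms agree only when x is nonnegative; for negative odd values the shift rounds down while the division rounds toward zero.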



x / 2    =>   x >> 1
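These identities are easy to check directly. The short sketch below (class and method names are illustrative) verifies the multiplication rewrite and shows where the division rewrite stops being equivalent: for negative odd values, >> rounds toward negative infinity while / rounds toward zero.

```java
// Illustrative check of the shift identities.
class Shifts {
    // x * 10 written as x * 8 + x * 2
    static int times10(int x) {
        return (x << 3) + (x << 1);
    }

    // Matches x / 2 only for nonnegative x
    static int half(int x) {
        return x >> 1;
    }

    public static void main(String[] args) {
        System.out.println(times10(7) == 7 * 10);  // same result
        System.out.println(half(7) == 7 / 2);      // same result
        System.out.println(half(-7) == -7 / 2);    // differs for negative odd x
    }
}
```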
A much more important strength reduction, and one that the JIT is not likely to detect in the near future, also involves division. Often, programmers want to go around a loop, but do something different every n times. One standard trick is to count and take the counter modulo n, as in the following example:



for (int i = 0; i < 100000; i++)
    if (i % 10 == 9) {
        // do something every tenth time
    }
This is slow; the following is four to five times faster:



for (int i = 0, j = 10; i < 100000; i++, j--) {
    if (j == 0) {
        // do something every tenth time
        j = 10; // restart the count
    }
}
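A runnable sketch of the countdown idea follows; the class and method names are made up for the illustration. The countdown here starts at 9 so that it fires on exactly the same iterations as the modulo test (i = 9, 19, 29, ...), which makes the two loops directly comparable by summing the indices at which they fire:

```java
// Illustrative comparison: modulo test versus countdown counter.
class EveryTenth {
    // Fires whenever i % 10 == 9; returns the sum of those indices.
    static long withModulo(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            if (i % 10 == 9) {
                sum += i;
            }
        }
        return sum;
    }

    // Same firing pattern with no division: j counts down to zero and
    // is reset to 10, then decremented by the loop update.
    static long withCountdown(int n) {
        long sum = 0;
        for (int i = 0, j = 9; i < n; i++, j--) {
            if (j == 0) {
                sum += i;
                j = 10; // restart the count
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(withModulo(100000) == withCountdown(100000));
    }
}
```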
Similarly, if you have code in a loop like:

j = (j + 1) % n;
// j should always end up between 0 and n-1
it's much faster to write:



if (++j == n)
    j = 0;
In general, for any nonnegative number x, x % n is equivalent to x & (n - 1) if n is a power of two. So x % 8 equals x & 7 as long as x is not negative. Using the & operator is a lot faster.
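Both rewrites can be checked with a few lines. In this sketch (hypothetical class and method names), modMask assumes n is a power of two and x is nonnegative, and nextIndex is the wraparound increment written without the % operator:

```java
// Illustrative check of the two modulo replacements.
class PowerOfTwoMod {
    // Equivalent to x % n when n is a power of two and x >= 0.
    static int modMask(int x, int n) {
        return x & (n - 1);
    }

    // Equivalent to (j + 1) % n for 0 <= j < n, without a division.
    static int nextIndex(int j, int n) {
        if (++j == n) {
            j = 0;
        }
        return j;
    }

    public static void main(String[] args) {
        boolean same = true;
        for (int x = 0; x < 1000; x++) {
            same &= (x % 8 == modMask(x, 8));
        }
        System.out.println(same);
        // For negative x the two differ: % keeps the sign, & does not.
        System.out.println((-3 % 8) + " vs " + modMask(-3, 8));
    }
}
```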


Summary
All the performance enhancements in this article have involved the application of simple techniques to make individual sections of code faster. If you learn these tricks and apply them everywhere as a matter of course, your code can get significantly faster without a lot of effort. These techniques, combined with JDK 1.4 and the next generation of Java compilers, are going to take us within a hair's breadth of being as fast as a well-written C application, and most applications in C are not well written. The world will enjoy the resulting crisp handling of the programs to come. Get out there and write something great.


Author Bio
Dov Kruger is president of Right TRAK, Inc., a consulting and training company focusing on Java, object-oriented, and Web-based technologies. He's currently working on improving the performance of dynamic Web pages with graphics, internationalization, and some of the Java libraries.


