Bor-YuhEvanChang,AdamChlipala,andGeorgeC.Necula
UniversityofCalifornia,Berkeley,California,USA
{bec,adamc,necula}@cs.berkeley.edu
Abstract.Acertifiedprogramanalysisisananalysiswhoseimplemen-tationisaccompaniedbyacheckableproofofsoundness.Wepresentaframeworkwhosepurposeistosimplifythedevelopmentofcertifiedpro-gramanalyseswithoutcompromisingtherun-timeefficiencyoftheanaly-ses.AtthecoreoftheframeworkisanoveltechniqueforautomaticallyextractingCoqproof-assistantspecificationsfromMLimplementationsofprogramanalyses,whilepreservingtoalargeextentthestructureoftheimplementation.Weshowthatthisframeworkallowsdevelopersofmobilecodetoprovidetothecodereceiversuntrustedcodeverifiersintheformofcertifiedprogramanalyses.Wedemonstrateefficientimple-mentationsinthisframeworkofbytecodeverification,typedassemblylanguage,andproof-carryingcode.
1Introduction
Whenstaticanalysisorverificationtoolsareusedforvalidatingsafety-criticalcode[6],itbecomesimportanttoconsiderthequestionofwhethertheresultsoftheanalysesaretrustworthy[22,3].Thisquestionisbecomingmoreandmoredifficulttoanswerasboththeanalysisalgorithmsandtheirimplementationsarebecomingincreasinglycomplexinordertoimproveprecision,performance,andscalability.Wedescribeaframeworkwhosegoalistoassistthedevelopersofprogramanalysesinproducingformalproofsthattheimplementationsandalgorithmsusedaresoundwithrespecttoaconcretesemanticsofthecode.Wecallsuchanalysescertifiedsincetheycomewithmachine-checkableproofsoftheirsoundness.Wealsoseeksoundnessassurancesthatarefoundational,thatis,thatavoidassumptionsortrustrelationshipsthatdon’tseemfundamentaltotheobjectivesofusers.Ourcontributionsdealwithmakingthedevelopmentofsuchanalysesmorepractical,withparticularemphasisonnotsacrificingtheefficiencyoftheanalysisintheprocess.
Thestrongsoundnessguaranteesgivenbycertifiedprogramanalyzersandverifiersareimportantwhenthepotentialcostofwrongresultsissignificant.
ThisresearchwassupportedinpartbyNSFGrantsCCR-0326577,CCF-0524784,andCCR-00225610;anNSFGraduateFellowship;andanNDSEGFellowship.TheinformationpresentedheredoesnotnecessarilyreflectthepositionorthepolicyoftheGovernmentandnoofficialendorsementshouldbeinferred.
E.A.EmersonandK.S.Namjoshi(Eds.):VMCAI2006,LNCS3855,pp.174-189,2006.
cSpringer-Verlag2005.Thisistheauthor’sversionofthework.Itispostedherebypermissionof
Springer-Verlagforyourpersonaluse.Notforredistribution.
2Bor-YuhEvanChang,AdamChlipala,andGeorgeC.Necula
Moreover,theabilitytocheckindependentlythattheimplementationoftheanalysisissoundallowsustoconstructamobile-codereceiverthatallowsun-trustedpartiestoprovidethecodeverifier.Thecodeverifierispresentedasacertifiedprogramanalysiswhoseproofofsoundnessentailssoundnessofcodeverification.
Themaincontributionsoftheframeworkweproposearethefollowing:–Wedescribeamethodologyfortranslatingautomaticallyimplementationsofanalyseswritteninageneral-purposelanguage(currently,ML)intomod-elsandspecificationsforaproofassistant(currently,Coq).Specifically,weshowhowtohandlethoseaspectsofageneral-purposelanguagethatdonottranslatedirectlytothewell-foundedlogicusedbytheproofassistant,suchasside-effectsandnon-primitiverecursivefunctions.Weusetheframeworkofabstractinterpretation[12]toderivethesoundnesstheoremsthatmustbeprovedforeachcertifiedanalysis.
–Weshowadesignforaflexibleandefficientmobile-codeverificationprotocol,inwhichtheuntrustedcodeproducerhascompletefreedominthesafetymechanismsandcompilationstrategiesusedformobilecode,aslongasitcanprovideacodeverifierintheformofacertifiedanalysis,whoseproofofsoundnesswitnessesthattheanalysisenforcesthedesiredcode-receiversafetypolicy.Inthenextsection,wedescribeourprogramanalysisframeworkandintroduceanexampleanalyzer.Then,inSect.3,wepresentourtechniqueforspecificationextractionfromcodewritteninageneral-purposelanguage.WethendiscusstheprogramanalyzercertificationprocessinSect.4.InSect.5,wepresentanapplicationofcertifiedprogramanalysistomobilecodesafetyandhighlightitsadvantagesandthendescribehowtoimplementinthisarchitecture(founda-tional)typedassemblylanguage,Javabytecodeverification,andproof-carryingcodeinSect.6.Finally,wesurveyrelatedwork(Sect.7)andconclude(Sect.8).
2TheCertifiedProgramAnalysisFramework
Inordertocertifyaprogramanalysis,onemightconsiderprovingdirectlythesoundnessoftheimplementationoftheanalysis.Thisispossibleinourframe-work,butweexpectthatanalternativestrategyisoftensimpler.Foreachanaly-sistobecertified,wewriteacertifierthatrunsaftertheanalysisandchecksitsresults.Then,weprovethesoundnessofthecertifier.Thisapproachhasseveralimportantadvantages.Oftenthecertifierissimplerthantheanalysisitself.Forexample,itdoesnotneedtoiteratemorethanonceovereachinstruction,anditdoesnotneedallthecomplicatedheuristicsthattheanalysisitselfmightusetospeeduptheconvergencetoafixpoint.Thus,weexpectthecertifieriseasiertoprovesoundthantheanalysisitself.Thebiggestbenefit,however,isthatwecanuseanexistingimplementationofaprogramanalysisasablackbox,evenifitiswritteninalanguagethatwearenotreadytoanalyzeformally,andeveniftheanalysisalgorithmdoesnotfitperfectlywiththeformalismdesiredforthe
AFrameworkforCertifiedProgramAnalysisandItsApplications
Certifiersource code3
Model/SpecificationExtractionCompilerModelProofCheckerCertifierCertifierProof ofsoundnessAnalysisInput programInstallation timeVerification timeFig.1.Ourcertifiedverifierarchitecturewiththetrustedcodebaseshadedcertificationanditssoundnessproofs.Asanextremeexample,theanalysisitselfmightcontainamodelchecker,whilewemightwanttodothesoundnessproofusingtheformalismofabstractinterpretation[28].InFig.1,wediagramthisbasicarchitectureforthepurposeofmobile-codesafety.Wedistinguishbetween“installationtime”activity,whichoccursonceperanalyzer,and“verificationtime”activity,whichoccursonceperprogramtoanalyze.Wechoosethetheoryofabstractinterpretation[12]asthefoundationforthesoundnessproofsofcertifiersbecauseofitsgeneralityandbecauseitssoundnessconditionsaresimpleandwellunderstood.Wepresentfirsttherequirementsforthedevelopersofcertifiers,andtheninSect.4,wedescribethesoundnessverification.
Thecoreofthecertifieristypeabsval
anuntrustedcustommodulecon-typeabs={pc:nat;a:absval}taininganimplementationofthevalainv:abslist
abstracttransitionrelation(pro-valastep:abs->result
videdbythecertifierdeveloper).datatyperesult=Fail|SuccofabslistThecustommoduleofacertifier
mustimplementthesignaturegivenadjacently.Thetypeabsencodesabstractstates,whichincludeaprogramcounterandanabstractvalueofatypethatcanbechosenbythecertifierdeveloper.Thevalueainvconsistsoftheabstractinvariants.Theymustataminimumincludeinvariantsfortheentrypointstothecodeandforeachdestinationofajump.Thefunctionastepimplementstheabstracttransitionrelation:givenanabstractstateataparticularinstruc-tion,computethesetofsuccessorstates,minusthestatesalreadypartofainv.Thetransitionrelationmayalsofail,forexamplewhentheabstractstatedoesnotensurethesafeexecutionoftheinstruction.Wewilltakeadvantageofthispossibilitytowritesafetycheckersformobile-codeusingthisframework.Inourimplementationandintheexamplesinthispaper,weusetheMLlanguageforimplementingcustomcertifiers.
4Bor-YuhEvanChang,AdamChlipala,andGeorgeC.Necula
Inordertoexe-funapplyWithTimeout(f:’a->’b,x:’a):’b=...cutesuchcertifiers,funtop(DonePC:natlist,ToDo:abslist):bool=theframeworkpro-caseToDoof
nil=>truevidesatrusteden-gineshowninFig.2.|a::rest=>
ifList.member(a.pc,DonePC)thenfalseelse
Themainentrypoint
(caseapplyWithTimeout(astep,a)of
isthefunctiontop,
Fail=>false
invokedwithalist|Succas=>top(a.pc::DonePC,as@ToDo))ofprogramcountersinthathavebeenpro-top(nil,ainv)
cessedandalistof
abstractstatesstillFig.2.Thetrustedtop-levelanalysisengine.Theinfixtoprocess.Termina-operators::and@arelistconsandappend,respectivelytionisensuredusing
twomechanisms:eachinvocationoftheuntrustedastepisguardedbyatime-out,andeachprogramcounterisprocessedatmostonce.Weuseatimeoutasasimplealternativetoprovingterminationofastep.AsuccessfulrunofthecodeshowninFig.2isintendedtocertifythatalloftheabstractstatesgivenbyainv(i.e.,thepropertiesthatweareverifyingforaprogram)areinvariant,andthattheastepfunctionsucceedsonallreachableinstructions.Wetakeadvantageofthislatterpropertytowriteuntrustedcodeverifiersinthisframework(Sect.5).WediscusstheseguaranteesmorepreciselyinSect.4.
Example:JavaBytecodeVerifier.Nowweintroduceanexampleprogramana-lyzerthatrequirestheexpressivityofageneral-purposeprogramminglanguageandhighlightsthechallengesinspecificationextraction.Inparticular,wecon-sideracertifierinthestyleoftheJavabytecodeverifier,butoperatingonasimpleassemblylanguageinsteadofbytecodes.Fig.3presentsafragmentofthiscustomverifier.Theabstractvalueisapartialmapfromregisterstoclassnames,withamissingentrydenotinganuninitializedregister.1
Intheastepfunction,weshowonlythecaseofthememorywriteinstruc-tion.Theframeworkprovidestheselaccessorfunctionforpartialmaps,theinstructiondecoderinstrAt,thepartialfunctionfieldOfthatreturnsthetypeofafieldatacertainoffset,andthepartialfunctionsuperthatreturnsthesuperclassofaclass.Thiscasesucceedsonlyifthedestinationaddressisoftheformrdest+n,withregisterrdestpointingtoanobjectofclasscdestthathasatoffsetnafieldoftypec,whichmustbeasuperclassofthetypeofregisterrsrc.
WeomitthecodeforcalculatePreconditions,afunctionthatobtainssomepreconditionsfromthemeta-datapackagedwiththe.classfiles,andthenusesaniterativefixed-pointalgorithmtofindagoodtypingpreconditionforeachprogramlabel.Eachsuchpreconditionshouldbesatisfiedanytime
1
Intheactualimplementation,registersthatholdcodepointers(e.g.,returnad-dresses,ordynamicdispatchaddresses)areassignedtypesthatspecifytheabstractstateexpectedbythedestinationcodeblock.
AFrameworkforCertifiedProgramAnalysisandItsApplications
typeabsval=(reg,class)partialmaptypeabs={pc:nat;a:absval}
funsubClass(c1:class,c2:class)=...funcalculatePreconditions():abslist=...valainv:abslist=calculatePreconditions()
funastep(a:abs):result=caseinstrAt(a.pc)of
Write(rdest+n,rsrc)=>
(case(sel(a.a,rdest),sel(a.a,rsrc))of
(SOMEcdest,SOMEcsrc)=>(casefieldOf(cdest,n)of
SOMEcdest’=>ifsubClass(csrc,cdest’)then
Succ[{a=a.a,pc=a.pc+1}]elseFail
|_=>Fail)|_=>Fail)|...
5
Fig.3.SkeletonofaverifierinthestyleoftheJavabytecodeverifiercontrolreachesitslabel.Thiskindofalgorithmisstandardandwellstudied,inthecontextoftheJavaBytecodeVerifierandelsewhere,soweomitthedetailshere.Mostimportantly,wewillnotneedtoreasonformallyaboutthecorrectnessofthisalgorithm.
3SpecificationExtraction
Toobtaincertifiedprogramanalyses,weneedamethodologyforbridgingthegapbetweenanimplementationoftheanalysisandaspecificationthatissuitableforuseinaproofassistant.Anattractivetechniqueistostartwiththespecificationanditsproof,andthenuseprogramextractionsupportedbyproofassistantssuchasCoqorIsabelle[29]toobtaintheimplementation.Thisstrategyisveryproof-centricandwhileitdoesyieldasoundimplementation,itmakesithardtocontrolnon-soundnessrelatedaspectsofthecode,suchasefficiency,instrumentationfordebugging,orinteractionwithexternallibraries.
Yetanotheralternativeisbasedonverificationconditions[15,17],whereeachfunctionisfirstannotatedwithapre-andpostcondition,andtheentireprogramiscompiledintoasingleformulawhosevalidityimpliesthattheprogramsatisfiesitsspecification.Suchformulascanmakegoodinputstoautomateddeductiontools,buttheyareusuallyquiteconfusingtoahumanprover.Theylosemuchofthestructureoftheoriginalprogram.Plus,inourexperience,mostauxiliaryfunctionsinaprogramanalyzerdogoodjobsofservingastheirownspecifications(e.g.,thesubClassfunction).
Sinceitisinevitablethatprovingsoundnesswillbesufficientlycomplicatedtorequirehumanguidance,weseekanapproachthatmaintainsascloseofa
6Bor-YuhEvanChang,AdamChlipala,andGeorgeC.Necula
funsubClass(depth:nat,c1:class,c2:class):booloption=c1=c2orelse(casesuperc1ofNONE=>SOMEfalse
|SOMEsup=>ifdepth=OthenNONEelsesubClass’(depth-1,sup,c2))Fig.4.TranslationofthesubClassfunction.Theboxedelementsareaddedbyourtranslation
correspondencebetweentheimplementationanditsmodelaspossible.Fornon-recursivepurelyfunctionalprograms,wecaneasilyachievetheideal,astheimplementationcanreasonablyfunctionasitsownmodelinasuitablelogic,suchasthatoftheCoqproofassistant.Thissuggeststhatweneedawaytohandleimperativefeatures,andamethodfordealingwithnon-primitivere-cursivefunctions.Intheremainderofthissection,wegiveanoverviewofourapproach.Moredetailcanbefoundinthecompaniontechnicalreport[9].HandlingRecursion.Weexpectthatallinvocationsoftherecursivefunctionsusedduringcertificationterminate,althoughitmaybeinconvenienttowriteallfunctionsinprimitiverecursiveform,asrequiredbyCoq.Inourframework,weforceterminationofallfunctioninvocationsusingtimeouts.Thismeansthatforeachsuccessfulrun(i.e.,onethatdoesnottimeout)thereisaboundonthecall-stackdepth.Weusethisobservationtomakeallfunctionsprimitiverecur-siveonthecall-stackdepth.Whenwetranslateafunctiondefinition,weaddanexplicitargumentdepththatischeckedanddecrementedateachfunctioncall.Fig.4showstheresultoftranslatingatypicalimplementationofthesubClassfunctionforourrunningexample.Theboxedelementsareaddedbythetrans-lation.Notethatinordertobeabletosignalatimeout,thereturntypeofthefunctionisanoptiontype.Coqwillacceptthisfunctionbecauseitcanchecksyntacticallythatitisprimitiverecursiveinthedepthargument.
Thistranslationpreservesanypartialcorrectnesspropertyofthecode.Forexample,ifwecanproveaboutthespecificationthatanyinvocationofsubClassthatyieldsSOMEtrueimpliesthattwoclassesareinasubclassrelationship,thenthesamepropertyholdsfortheoriginalcodewheneveritterminateswiththevaluetrue.
HandlingImperativeFeatures.ThefunctioncalculatePreconditionsfromFig.3usesI/Ooperationstoreadanddecodethebasicblockinvariantsfromthe.classfile(asintheKVM[30]versionofJava),ormustuseanintrapro-ceduralfixed-pointcomputationtodeducethebasicblockpreconditionsfromthemethodstartprecondition(asforstandard.classfiles).Inanycase,thisfunctionmostlikelyusesasignificantnumberofimperativeconstructsorevenexternallibraries.Thisexampledemonstratesasituationwhentheresultofcomplexcomputationsisusedonlyasahint,whoseexactvalueisnotimportantforsoundnessbutonlyforcompleteness.Webelievethatthisisoftenthecasewhenwritingcertifiers,whichsuggeststhatamonadic[31]styleoftranslationwouldunnecessarilycomplicatetheresultingspecification.
AFrameworkforCertifiedProgramAnalysisandItsApplications
funreadu16(s:callstate,buff:intarray,idx:int):int=256*(freshread1s)+(freshread2s)7
funreadu32(s:callstate,buff:intarray,idx:int):int=
65536*readu16(freshstate3s,buff,i)+readu16(freshstate4s,buff,i+2)Fig.5.Translationofafunctionforreadinga16-bitand32-bitbig-endiannumbersfromaclassfile.Originalbodyofreadu16beforetranslationis256∗buff[i]+buff[i+1]
Forsuchsituationsweproposeacheapertranslationschemethatabstractssoundlytheresultofside-effectingoperations.Wedescribethisschemeinfor-mally,bymeansofanexampleoffunctionsthatreadfromaJava.classfile16-bitand32-bitnumbers,respectively,writteninbig-endiannotation,showninFig.5.Eachupdatetomutablestateisignored.Eachsyntacticoccurrenceofamutable-stateaccessisreplacedwithafreshabstractfunction(e.g.,freshread1)whoseargumentisanabstractionofthecall-stackstate.Thecall-stackargumentisneededtoensurethatnorelationshipcanbededucedbetweenrecursivein-vocationsofthesamesyntacticstateaccess.Eachfunctionwhosebodyreadsmutablestate,orcallsfunctionsthatreadmutablestate,getsanewparametersthatistheabstractionofthecall-stackstate.Wheneversuchafunctioncallsanotherfunctionthatneedsacall-stackargument,itusesafreshtransformer(e.g.,freshstate3)toproducethenewactualstateargument.
Thisabstractionissoundinthesensethatitensuresthatnothingcanbeprovedaboutresultsofmutablestateaccesses,andthusanypropertythatwecanproveaboutthisabstractionalsoholdsfortheactualimplementation.Ifwedidnothavethecall-stackargument,onecouldprovethateachinvocationofthereadu16functionproducesthesameresult,andthusallresultsofthereadu32aremultipleof65,537.Thislatterexamplealsoshowswhywecannotusethedepthargumentasanabstractionofthecall-stackstate.
Notethatouruseof“state”differsfromthewell-known“explicitstate-passingstyle”infunctionalprogramming,wherestateisusedliterallytotrackallmutableaspectsoftheexecutionenvironment.Thattranslationstylerequiresthateachfunctionthatupdatesthestatenotonlytakeaninputstatebutalsoproduceanoutputstatethatmustbepassedtothenextstatement.Inourtranslationschemestatesareonlypasseddowntocallers,andtheresulttypeofafunctiondoesnotchange.
Thecostforthesimplicityofthistranslationisalossofcompleteness.Wearenotinterestedinpreservingallthesemanticsofinputprograms.Basedonourconjecturethatwecanrefactorprogramssothattheirsoundnessargumentsdonotdependonimperativeparts,wecangetawaywithaloosertranslation.Inparticular,wewanttobeabletoprovepropertiesoftheinputbyprovingpropertiesofthetranslation.Wedonotneedtheoppositeinclusiontohold.SoundnessoftheSpecificationExtraction.Wearguehereinformallythesound-nessofthespecificationextractionformutablestate.Inourimplementation,the
8Bor-YuhEvanChang,AdamChlipala,andGeorgeC.Necula
soundnessofthecodethatimplementstheextractionprocedureisassumed.Weleaveforfutureworktheinvestigationofwaystorelaxthisassumption.First,weobservethateachsyntacticoccurrenceofafunctioncallhasitsownuniquefreshstatetransformer.Thismeansthat,inanexecutiontraceofthespeci-fication,eachfunctioncallhasanactualstateargumentthatisobtainedbyauniquesequenceofapplicationsoffreshstatetransformerstotheinitialstate.Furthermore,inanysuchfunctioninvocationallthesyntacticoccurrencesofamutablestatereaduseuniquefreshreadaccessfunctions,appliedtouniquevaluesofthestateparameter.Thismeansthatinanyexecutiontraceofthespecification,eachstatereadvalueisabstractedasauniquecombinationoffreshreadandfreshstatefunctions.This,inturn,meansthatforanyactualexecutiontraceoftheoriginalprogram,thereisadefinitionofthefreshreadandfreshstateparametersthatyieldsthesameresultsastheactualreads.Sinceallthefreshreadandfreshstatetransformersareleftabstractinthespecification,anyproofaboutthespecificationworkswithanymodelforthetransformers,andthusappliestoanyexecutiontraceoftheoriginalprogram.Acompleteformalproofisfoundinthecompaniontechnicalreport[9].
4SoundnessCertification
WeusethetechniquesdescribedintheprevioussectiontoconverttheMLdatatypeabstoadescriptionoftheabstractdomainAinthelogicoftheproof-assistant.Similarly,weconverttheainvvalueintoasetAI⊆A.Finally,wemodelthetransitionfunctionastepasanabstracttransitionrelation⊆A×2AsuchthataAwheneverastep(a)=SuccA.Wewillabusenotationslightlyandidentifysetsandlistswhereconvenient.
Weprovesoundnessoftheabstracttransitionrelationwithrespecttoacon-cretetransitionrelation.Let(C,C0,→)beatransitionsystemfortheconcretemachine.Inparticular,Cisadomainofstates;C0isthesetofallowableinitialstates;and→isaone-steptransitionrelation.Theseelementsareprovidedintheproof-assistantlogicandaretrusted.Webuildwhateversafetypolicyinter-estsusinto→intheusualway;wedisallowtransitionsthatwouldviolatethepolicy,sothaterrorsaremodeledas“beingstuck.”Thisistheprecisewayinwhichonecanspecifythetrustedsafetypolicyforthecertifiedprogramverifiers(Sect.5).
Tocertifythesoundnessoftheprogramanalyzer,thecertifierdeveloperneedstoprovideadditionally(intheformofaCoqdefinition)asoundnessrelation⊆C×A(writtenasσin[13]),suchthatcaholdsiftheabstractstateaisasoundabstractionoftheconcretestatec.Todemonstrateisindeedsound,theauthoralsoprovidesproofs(inCoq)forthefollowingstandard,localsoundnesspropertiesofabstractinterpretationsandbi-simulations.
Property1(Initialization).Foreveryc∈C0,thereexistsa∈AIsuchthatca.
Theinitializationpropertyassuresusthattheabstractinterpretationincludesanappropriateinvariantforeverypossibleconcreteinitialstate.
AFrameworkforCertifiedProgramAnalysisandItsApplications9
Property2(Progress).Foreveryc∈Canda∈Asuchthatca,ifthereexistsA⊆AsuchthataA,thenthereexistsc∈Csuchthatc→c.Progressguaranteesthat,wheneveranabstractstateisnotstuck,anycorre-spondingconcretestatesarealsonotstuck.
Property3(Preservation).Foreveryc∈Canda∈Asuchthatca,ifthereexistsA⊆AsuchthataA,thenforeveryc∈Csuchthatc→cthereexistsa∈(A∪AI)suchthatca.
Preservationguaranteesthat,foreverystepmadebytheconcretemachine,theresultingconcretestatematchesoneofthesuccessorstatesoftheabstractmachine.Preservationisonlyrequiredwhentheabstractmachinedoesnotrejecttheprogram.Thisallowstheabstractmachinetorejectsomesafeprograms,ifitsodesires.Itisimportanttonoticethat,inordertoensuretermination,theastepfunction(andthustherelation)onlyreturnsthosesuccessorabstractstatesthatarenotalreadypartoftheinitialabstractstatesainv.Toaccountforthisaspect,weuseAIinthepreservationtheorem.
Together,thesepropertiesimplytheglobalsoundnessofthecertifierthatimplementsthisabstractinterpretation[12],statedasfollowing:
Theorem1(Certificationsoundness).Foranyconcretestatec∈Creach-ablefromaninitialstateinC0,theconcretemachinecanmakefurtherprogress.Also,ifchasthesameprogramcounterasastatea∈AI,thenca.Inthetechnicalreport[9],wegiveanideahowtheseobligationsaremetinpracticebysketchinghowtheproofgoesfortheexampleoftheJavabytecodeverifiershowninFig.3.
5ApplicationstoMobile-CodeSafety
Language-basedsecuritymechanismshavegainedacceptanceforenforcingbasicbutessentialsafetyproperties,suchasmemoryandtypesafety,foruntrustedmobilecode.Themostwidelydeployedsolutionformobilecodesafetyisbyte-codeverification,asintheJavaVirtualMachine(JVM)[25]ortheMicrosoftCommonIntermediateLanguage(MS-CIL)[18].Abytecodeverifierusesaformofabstractinterpretationtotrackthetypesofmachineregisters,andtoenforcememoryandtypesafety.Themainlimitationofthisapproachisthatwemusttrustthesoundnessofthebytecodeverifier.Inturn,thismeansthatwecannoteasilychangetheverifieranditsenforcementmechanism.Thiseffectivelyforcestheclientsofacodereceivertouseafixedtypesystemandoftenevenafixedsourcelanguageformobilecode.Programswritteninothersourcelanguagescanbecompiledintothetrustedintermediatelanguagebutofteninunnaturalwayswithalossofexpressivenessandperformance[4,19,7].
AgoodexampleistheMS-CILlanguage,whichisexpressiveenoughtobethetargetofcompilersforC#,CandC++.CompilersforC#produceintermediatecodethatcanbeverified,whilecompilersforCandC++useintermediate
10Bor-YuhEvanChang,AdamChlipala,andGeorgeC.Necula
languageinstructionsthatarealwaysrejectedbythebuilt-inbytecodeverifier.Inthislattercase,thecodemaybeacceptediftheproducerofthecodecanprovideanexplicitproofthatthecodeobeystherequiredsafetypolicyandthecodereceiverusesproof-carryingcode[1,20,27].
Existingworkonproof-carryingcode(PCC)atteststoitsversatility,butoftenfailstoaddresstheessentialissueofhowtheproofobjectsareobtained.IntheTouchstonesystem[11],proofsaregeneratedbyaspecialtheoremproverwithdetailedknowledgeaboutJavaobjectlayoutandcompilationstrategies.TheFoundationalPCCwork[1,20]eliminatestheneedtohard-codeandtrustallsuchknowledge,butdoessoatthecostofincreasingmanytimestheproofgenerationburden.Boththesesystemsalsoincurthecostoftransmittingproofs.TheOpenVerifierproject[10]proposestosendwiththecodenotper-programproofsbutproofgeneratorstoberunatthecodereceiverendforeachincomingprogram.Thegeneratedproofsarethencheckedbyatrustedproofchecker,asinastandardPCCsetup.
Usingcertifiedprogramanalyseswecanfurtherimprovethisprocess.Theproducerofthemobilecodewritesasafety-policyverifiercustomizedfortheexactcompilationstrategyandsafetyreasoningusedinthegenerationofthemobilecode.Thisverifiercanbewrittenintheformofacertifiedprogramanalysis,whoseabstracttransitionfailswheneveritcannotverifythesafetyofaninstruction.Forexample,wediscussinSect.6caseswhentheprogramanalysisisatypedassemblylanguagechecker,abytecodeverifier,oranactualPCCverificationenginerelyingonannotationsaccompanyingthemobilecode.Thekeyelementisthesoundnessproofthataccompaniesananalysis,whichcanbecheckedautomatically.Atverificationtime,thenow-trustedprogramanalyzerisusedtovalidatethecode,withnoneedtomanipulateexplicitproofobjects.Thissimplifiesthewritingofthevalidator(ascomparedwiththeproof-generatingtheoremproverofTouchstone,ortheOpenVerifier).WealsoshowinSect.6thatthisreducesthevalidationtimebymorethananorderofmagnitude.Wepointoutherethatthesoundnessproofiswithrespecttothetrustedconcretesemantics.Byaddingadditionalsafetychecksintheconcretesemantics(forinstance,thelogicalequivalentsofdynamicchecksthatwouldenforceadesiredsafetypolicy),thecodereceivercanconstructcustomizedsafetypolicies.
6CaseStudies
Inthissection,wepresentcasestudiesofapplyingcertifiedprogramanalyzerstomobilecodesecurity.Wedescribeexperiencewithverifiersfortypedassemblylanguage,Javabytecode,andproof-carryingcode.
Wehavedevelopedaprototypeimplementationofthecertifiedprogramanalysisinfrastructure.TheconcretelanguagetobeanalyzedistheIntelx86assemblylanguage.Thespecificationextractorisbuiltontopofthefront-endoftheOCamlcompiler,anditsupportsalargefragmentoftheMLlanguage.Themostnotablefeaturesnotsupportedaretheobject-orientedfeatures.Inad-ditiontothe3000-lineextractor,thetrustedcomputingbaseincludesthewhole
AFrameworkforCertifiedProgramAnalysisandItsApplications11
OCamlcompilerandtheCoqproofchecker,neitherofwhichisdesignedtobefoundationallysmall.However,ourfocusherehasbeenonexploringtheeaseofuseandrun-timeefficiencyofourapproach.Weleaveminimizingthetrustedbaseforfuturework.
TypedAssemblyLanguage.OurfirstrealisticuseofthisframeworkinvolvedTypedAssemblyLanguage.Inparticular,wedevelopedandprovedcorrectaverifierforTALx86,asprovidedinthefirstreleaseoftheTALCtoolsfromCornell[26].ThisTALincludesseveralinterestingfeatures,includingcontinua-tion,universal,existential,recursive,product,sum,stack,andarraytypes.OurimplementationhandlesallofthefeaturesusedbythetestcasesdistributedwithTALC,withtheexceptionofthemodularityfeatures,whichwehandleby“hand-linking”multiple-filetestsintosinglefiles.TALCincludescompilerstoanx86TALfromPopcorn(asafeCdialect)andmini-Scheme.Weusedthesecompilersunchangedinourcasestudy.
WeimplementedaTALx86verifierin1500linesofMLcode.ThiscomparesfavorablywiththecodesizeoftheTALCtypechecker,whichisabout6000linesofOCaml.Oneofusdevelopedourverifieroverthecourseoftwomonths,whilesimultaneouslyimplementingthecertificationinfrastructure.Weexpectthatitshouldbepossibletoconstructnewverifiersofcomparablecomplexityinaweek’stimenowthattheinfrastructureisstable.
Wealsoprovedthelocalsoundnesspropertiesofthisimplementationin15,000linesofCoqdefinitionsandproofscripts.Thistookaboutamonth,againinterleavedwithdevelopingthetrustedpartsoftheinfrastructure.Were-usedsomedefinitionsfromapreviousTALformalization[10],butwedidn’tre-useanyproofs.It’slikelythatwecansignificantlyreducetheeffortrequiredforsuchproofsbyconstructingsomecustomprooftacticsbasedonourexperiences.Wedon’tbelieveourformalizationtobenovelinanyfundamentalway.ItusesideasfrompreviousworkonfoundationalTAL[2,20,14].Themaindifferenceisthatweprovethesamebasictheoremsaboutthebehaviorofanimplementationofthetypechecker,insteadofaboutthepropertiesofinferencerules.Thismakestheproofsslightlymorecumbersome,but,aswewillsee,itbringssignificantperformanceimprovement.Asmightbeexpected,wefoundandfixedmanybugsintheverifierinthecourseofprovingitssoundness.Thissuggeststhatourin-frastructuremightbeusefulevenifthedeveloperisonlyinterestedindebugginghisanalysis.
Table1presentssomeverification-time
ConvCPVPCCperformanceresultsforourimplementation,
asaveragerunningtimesforinputswithpar-Upto200(13)00.010.07
201-999(7)0.010.020.24ticularcountsofassemblyinstructions.We
rananumberofverifiersonthetestcasespro-1000andup(6)0.040.081.73videdwithTALC,whichuseduptoabout
9000assemblyinstructions.First,thetypeTable1.Averageverifierrun-checkerincludedwithTALCfinisheswithinningtimes(inseconds)theresolutionofourtimingtechniqueforall
cases,sowedon’tincluderesultsforit.Whilethistypecheckeroperatesona
12Bor-YuhEvanChang,AdamChlipala,andGeorgeC.Necula
specialtypedassemblylanguage,theresultswegiveareallforverifyingnativeassemblyprograms,withtypesandmacro-instructionsusedasmeta-data.Asaresult,wecanexpectthatthereshouldbesomeinherentslow-down,sincesomeTALinstructionsmustbecompiledtomultiplerealinstructions.Theexperi-mentswereperformedonanAthlonXP3000+with1GBofRAM,andtimesaregiveninseconds.Wegivetimesfor“Conventional(Conv),”athinwrap-peraroundtheTALCtypecheckertomakeitworkonnativeassemblycode;“CPV,”ourcertifiedprogramverifierimplementation;and“PCC,”ourTALx86verifierimplementationfrompreviouswork[10],inwhichexplicitproofobjectsarecheckedduringverification.
TheresultsshowthatourCPVverifierperformscomparablywiththecon-ventionalverifier,forwhichnoformalcorrectnessproofexists.ItappearsourCPVverifieriswithinasmallconstantfactoroftheconventionalverifier.Thisconstantislikelybecauseweuseaninefficient,Lisp-likeserializationformatforincludingmeta-datainthecurrentimplementation.Weexpectthiswouldbereplacedbyamuchfasterbinary-encodedsysteminamoreelaborateversion.WecanalsoseethatthecertifiedverifierperformsmuchbetterthanthePCCversion.Thedifferenceinperformanceisduetothecostrequiredtomanipulateandcheckexplicitproofobjectsduringverification.Toprovideevidencethatwearen’tcomparingagainstapoorly-constructedstrawman,wecanlooktootherFPCCprojects.Wu,Appel,andStump[32]givesomeperformancere-sultsfortheirProlog-basedimplementationoftrustworthyverifiers.Theyonlypresentresultsoninputprogramsofupto2000instructions,witharunningtimeof.206secondsona2.2GHzPentiumIV.ThisseemsonparwithourownPCCimplementation.Whiletheirtrustedcodebaseismuchsmallerthanours,sincewerequiretrustinourspecificationextractor,thereishopethatwecanachieveasimilarlysmallcheckingkernelbyusingtechniquesrelatedtocertifyingcompilation.
JavaBytecodeVerification.WehavealsousedourframeworktoimplementapartialJavaBytecodeVerifier(JBV)inabout600linesofML.ItchecksmostofthepropertiesthatfullJBVscheck,mainlyexcludingexceptions,objectinitialization,andsubroutines.Ourimplementation’sstructurefollowscloselythatofourrunningexamplefromSect.2.ItsainvbeginsbycallinganOCamlfunctionthatcalculatesafixedpointusingstandardtechniques.Likeinourexample,theprecisecodeheredoesn’tmatter,asthepurposeofthefunctionistopopulateahashtableoffunctionpreconditionsandcontrol-flowjoinpointinvariants.Withthisinformation,ourastepfunctionimplementsthestandardtypingrulesforJBVs.
Whilewehaveextractedcompleteproofobligationsfortheimplementation,wehaveonlybeguntheprocessofprovingthem.However,tomakesureweareontracktoanacceptablefinalproduct,wehaveperformedsomesimplebench-marksagainstthebytecodeverifierincludedwithBlackdownJavaforLinux.WedownloadedafewJava-onlyprojectsfromSourceForgeandraneachverifieroneveryclassineachproject.
AFrameworkforCertifiedProgramAnalysisandItsApplications13
Onthelargestthatourprototypeimplementationcouldhandle,MegaMek,ourverifierfinishesin5.5secondsforchecking668,000bytecodeinstructions,comparedto1secondforthetraditionalverifier.First,wenotethatbothtimesarerelativelysmallinanabsolutesense.Itprobablytakesauserconsiderablylongertodownloadasoftwarepackagethantoverifyitwitheithermethod.Wealsoseethatourverifierisonlyasmallfactorawayfrommatchingthetraditionalapproach,whoseperformanceweknowempiricallythatusersseemwillingtoaccept.Nodoubtfurtherengineeringeffortcouldclosethisgaporcomeclosetodoingso.
Proof-CarryingCode.WecanevenimplementaversionofFoundationalPCCinourframework:foreachbasicblockthemobilecodecontainsaninvariantforthestartoftheblock,andaproofthatthestrongestpostconditionofthestartinvariantalongtheblockimpliestheinvariantforthesuccessorblock.Theabstractstateabsofthecertifierconsistsofapredicatewritteninasuitablelogic,intendedtobethestrongestpostconditionatthegivenprogrampoint.Theainvisobtainedbyreadinginvariantsfromadatasegmentaccompanyingthemobilecode.
Fig.6showsafuncheckProof(prf:proof)(p:pred):bool=...fragmentofthecodefunastep(a:abs):result=forastep,whichcal-caseinstrAta.pcof
culatesthestrongestRegReg(r1,r2)=>Succ[{postconditionforeverypc=a.pc+1;
a=And(Eq(r1,r2),Exists(x,[x/r1]a.a))}]instruction.Atajump
|Jumpl=>wefetchtheinvariant
letdest=getInvarlinforthedestination,a
letprf=fetchProoflinproof,andthencheck
ifcheckProof(prf,Imply(a.a,dest))thentheproof.Toprove
Succ[]
soundness,weonly
elseFail
needtoensurethatgetInvarreturnsoneFig.6.AfragmentofacertifierforPCCoftheinvariantsthat
arepartofainv,andthatthecheckProoffunctionissound.Moreprecisely,wheneverthecalltocheckProofreturnstrue,thenanyconcretestatethatsatis-fiesa.aalsosatisfiesdest.Inparticular,wedonotcareatallhowfetchProofworks,whereitgetstheprooffrom,whetheritdecryptsordecompressesitfirst,orwhetheritactuallyproducestheproofitself.ThissoundnessproofforcheckProofispossibleandevenreasonablystraightforward,sincewearewritingourmeta-proofsinCoq’smoreexpressivelogic.
7RelatedWork
TowardCertifiedProgramAnalyses.TheRhodiumsystemdevelopedbyLerneretal.[24]isthemostsimilarwithrespecttotheoverallgoalofourwork—thatofprovidingarealisticframeworkforcertifiedprogramanalyses.However,they
14Bor-YuhEvanChang,AdamChlipala,andGeorgeC.Necula
focusonsimplercompileranalysisproblemswhosesoundnesscanbeprovedbytoday’sautomatedmethods.WeexpectthatourproofscansimilarlybeautomatedwhenourframeworkisusedforthekindsofanalysesexpressibleinRhodium-styledomainspecificlanguages.
Severalsystemshavebeendevelopedforspecifyingprogramanalysesindomain-specificlanguagesandgeneratingcodefromthesespecifications[23].Again,theexpressivenessofthesesystemsisverylimitedcomparedtowhatisneededforstandardmobilecodesafetyproblems.
Intheotherdirection,wehavethewell-establishedbodyofworkdealingwithextractingformalverificationconditionsfromprogramsannotatedwithspecifications.EspeciallyrelevantaretheWhy[16]andCaduceus[17]tools,whichproduceCoqproofobligationsasoutput.
Therehasbeenagoodamountofworkonconstructingtrustworthyveri-fiersbyextractingtheircodefromconstructiveproofsofsoundness.Cacheraetal.[8]extractedadata-flowanalysisfromaproofbasedonageneralconstraintframework.KleinandNipkow[21]andBertot[5]havebuiltcertifiedJavabyte-codeverifiersthroughprogramextraction/codegenerationfromprogramsandproofsinIsabelleandCoq,respectively.Noneofthesepublicationspresentanyperformancefigurestosuggestthattheirextractedverifiersscaletorealinputsizes
EnforcingMobile-CodeSafety.Asalludedtoearlier,mostpriorworkinFounda-tionalProof-CarryingCodehasfocusedonthegeneralityandexpressivityofvar-iousformalisms,includingtheoriginalFPCCproject[2],SyntacticFPCC[20],andFoundationalTALT[14].Theseprojectshavegivenconvincingargumentsfortheirexpressiveness,buttheyhavenotyetdemonstratedascalableimplemen-tation.SomerecentresearchhaslookedintoefficiencyconsiderationsinFPCCimplementations,includingworkbyWu,Appel,andStump[32]andourownworkontheOpenVerifier[10].
ThearchitectureproposedbyWu,Appel,andStumpisfairlysimilartothearchitecturewepropose,withtherestrictionthatverifiersmustbeimplementedinProlog.Inessence,whilewebuildinanabstractinterpretationengine,Wuetal.buildinaProloginterpreter.Wefeelthatitisimportanttosupportverifiersdevelopedinmoretraditionalprogramminglanguages.Also,theperformancefiguresprovidedbyWuetal.havenotyetdemonstratedscalability.
OurpastworkontheOpenVerifierhasheavilyinfluencedthedesignofthecertifiedprogramanalysisarchitecture.Bothapproachesbuildanabstractin-terpretationengineintothetrustedbaseandallowtheuploadingofcustomizedverifiers.However,theOpenVerifieressentiallyadherestoastandardPCCar-chitectureinthatitstillinvolvesproofgenerationandcheckingforeachmobileprogramtobeverified,anditpaystheusualperformancepricefordoingthis.
8Conclusion
Wehavepresentedastrategyforsimplifyingthetaskofprovingsoundnessnotjustofprogramanalysisalgorithms,butalsooftheirimplementations.Webe-
AFrameworkforCertifiedProgramAnalysisandItsApplications15
lievethatstartingwiththeimplementationandextractingnaturalproofobliga-tionswillallowdeveloperstofinetunenon-functionalaspectsofthecode,suchasperformanceordebugginginstrumentation.
Certifiedprogramanalyseshaveimmediateapplicationsfordevelopingcerti-fiedprogramverifiers,suchthatevenuntrustedpartiescancustomizetheverifi-cationprocessforuntrustedcode.Wehavecreatedaprototypeimplementationandusedittodemonstratethatthesameinfrastructurecansupportinaverynaturalwayproof-carryingcode,typechecking,ordata-flowbasedverificationinthestyleofbytecodeverifiers.Amongthese,wehavecompletedthesoundnessproofofaverifierforx86TypedAssemblyLanguage.Theperformanceofourcertifiedverifierisquiteonparwiththatofatraditional,uncertifiedTALx86typechecker.Webelieveourresultshereprovidethefirstpublishedevidencethatafoundationalcodecertificationsystemcanscale.
References
1.A.W.Appel.Foundationalproof-carryingcode.InProc.ofthe16thSymposiumonLogicinComputerScience,pages247–258,June2001.
2.A.W.AppelandA.P.Felty.Asemanticmodeloftypesandmachineinstruc-tionsforproof-carryingcode.InProc.ofthe27thSymposiumonPrinciplesofProgrammingLanguages,pages243–253,Jan.2000.
3.G.Barthe,P.Courtieu,G.Dufay,andS.deSousa.Tool-assistedspecificationandverificationoftheJavaCardplatform.InProc.ofthe9thInternationalConferenceonAlgebraicMethodologyandSoftwareTechnology,Sept.2002.
4.N.Benton,A.Kennedy,andG.Russell.CompilingStandardMLtoJavabytecodes.InProc.oftheInternationalConferenceonFunctionalProgramming,pages129–140,June1999.
5.Y.Bertot.FormalizingaJVMLverifierforinitializationinatheoremprover.InProc.ofthe13thInternationalConferenceonComputerAidedVerification,volume2102ofLNCS,pages14–24,July2001.
6.B.Blanchet,P.Cousot,R.Cousot,J.Feret,L.Mauborgne,A.Min´e,D.Monniaux,andX.Rival.Astaticanalyzerforlargesafety-criticalsoftware.InProc.oftheConferenceonProgrammingLanguageDesignandImplementation,pages196–207,2003.
7.P.Bothner.Kawa—compilingdynamiclanguagestotheJavaVM.InProc.oftheFreeNIXTrack:USENIX1998annualtechnicalconference,1998.
8.D.Cachera,T.P.Jensen,D.Pichardie,andV.Rusu.Extractingadataflowanalyserinconstructivelogic.InD.A.Schmidt,editor,Proc.ofthe13thEuropeanSymposiumonProgramming,volume2986ofLNCS,pages385–400,Mar.2004.9.B.-Y.E.Chang,A.Chlipala,andG.C.Necula.Aframeworkforcertifiedprogramanalysisanditsapplicationstomobile-codesafety.TechnicalReportUCBERLM05/32,UniversityofCalifornia,Berkeley,2005.
10.B.-Y.E.Chang,A.Chlipala,G.C.Necula,andR.R.Schneck.TheOpenVerifier
frameworkforfoundationalverifiers.InProc.ofthe2ndWorkshoponTypesinLanguageDesignandImplementation,Jan.2005.
11.C.Colby,P.Lee,G.C.Necula,F.Blau,M.Plesko,andK.Cline.Acertifying
compilerforJava.InProc.oftheConferenceonProgrammingLanguageDesignandImplementation,pages95–107,May2000.
16Bor-YuhEvanChang,AdamChlipala,andGeorgeC.Necula
12.P.CousotandR.Cousot.Abstractinterpretation:Aunifiedlatticemodelfor
staticanalysisofprogramsbyconstructionorapproximationoffixpoints.InProc.ofthe4thSymposiumonPrinciplesofProgrammingLanguages,pages234–252,Jan.1977.
13.P.CousotandR.Cousot.Abstractinterpretationframeworks.J.Log.Comput.,
2(4):511–547,1992.
14.K.Crary.Towardafoundationaltypedassemblylanguage.InProc.ofthe30th
SymposiumonPrinciplesofProgrammingLanguages,pages198–212,Jan.2003.15.E.W.Dijkstra.Guardedcommands,nondeterminancyandformalderivationof
programs.CommunicationsoftheACM,18:453–457,1975.16.J.-C.Filliˆatre.Why:amulti-languagemulti-proververificationtool.Research
Report1366,LRI,Universit´eParisSud,March2003.17.J.-C.FilliˆatreandC.March´e.Multi-ProverVerificationofCPrograms.InProc.
ofthe6thInternationalConferenceonFormalEngineeringMethods,volume3308ofLNCS,pages15–29,Nov.2004.
18.A.D.GordonandD.Syme.Typingamulti-languageintermediatecode.InProc.
ofthe28thSymposiumonPrinciplesofProgrammingLanguages,pages248–260,Jan.2001.
19.K.J.GoughandD.Corney.EvaluatingtheJavavirtualmachineasatargetfor
languagesotherthanJava.InJointModulaLanguagesConference,Sept.2000.20.N.A.Hamid,Z.Shao,V.Trifonov,S.Monnier,andZ.Ni.Asyntacticapproach
tofoundationalproof-carryingcode.InProc.ofthe17thSymposiumonLogicinComputerScience,pages89–100,July2002.
21.G.KleinandT.Nipkow.Verifiedlightweightbytecodeverification.Concurrency
–practiceandexperience,13(1),2001.
22.G.KleinandT.Nipkow.Verifiedbytecodeverifiers.Theor.Comput.Sci.,
298(3):583–626,2003.
23.J.H.E.F.Lasseter.Toolkitsfortheautomaticconstructionofdataflowanalyzers.
TechnicalReportCIS-TR-04-03,UniversityofOregon,2003.
24.S.Lerner,T.Millstein,E.Rice,andC.Chambers.Automatedsoundnessproofs
fordataflowanalysesandtransformationsvialocalrules.InProc.ofthe32ndSymposiumonPrinciplesofProgrammingLanguages,pages364–377,2005.
25.T.LindholmandF.Yellin.TheJavaVirtualMachineSpecification.TheJava
Series.Addison-Wesley,Reading,MA,USA,Jan.1997.
26.G.Morrisett,K.Crary,N.Glew,D.Grossman,R.Samuels,F.Smith,
D.Walker,S.Weirich,andS.Zdancewic.Talcreleases,2003.URL:http://www.cs.cornell.edu/talc/releases.html.
27.G.C.Necula.Proof-carryingcode.InProc.ofthe24thSymposiumonPrinciples
ofProgrammingLanguages,pages106–119,Jan.1997.
28.G.C.Necula,R.Jhala,R.Majumdar,T.A.Henzinger,andW.Weimer.Temporal-safetyproofsforsystemscode.InProc.oftheConferenceonComputerAidedVerification,Nov.2002.
29.L.C.Paulson.Isabelle:Agenerictheoremprover.LectureNotesinComputer
Science,828,1994.
30.E.Rose.Lightweightbytecodeverification.J.Autom.Reason.,31(3-4):303–334,
2003.
31.P.Wadler.Monadsforfunctionalprogramming.InAdvancedFunctionalProgram-ming,volume925ofLNCS,pages24–52.Springer,1995.
32.D.Wu,A.W.Appel,andA.Stump.Foundationalproofcheckerswithsmall
witnesses.InProc.ofthe5thInternationalConferenceonPrinciplesandPracticeofDeclarativeProgramming,pages264–274,Aug.2003.
因篇幅问题不能全部显示,请点此查看更多更全内容