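Basic information
File name: Introduction to Spark: Project Goals, Components, and Extension Methods.pptx
File size: 488.25 KB
Total pages: 39
Last updated: 2026-04-01
Total word count: approx. 7.69 thousand
Document summary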
Introduction to Spark Internals
Matei Zaharia, UC Berkeley
Outline
Project goals
Components
Life of a job
Extending Spark
How to contribute
Project Goals
Generality: diverse workloads, operators, job sizes
Low latency: sub-second
Fault tolerance: faults shouldn’t be special case
Simplicity: often comes from generality
Codebase Size
Spark: 20,000 LOC
Hadoop 1.0: 90,000 LOC
Hadoop 2.0: 220,000 LOC
(non-test, non-example sources)
Codebase Details
Hadoop I/O: 400 LOC
Mesos backend: 700 LOC
Standalone backend: 1,700 LOC
Interpreter: 3,300 LOC
Spark core: 16,000 LOC
Operators: 2,000
Block manager: 2,700
Scheduler: