Basic information
File name: corpuslinguistics教学讲解课件.pptx
File size: 14.79 MB
Total pages: 10
Updated: 2025-06-12
Total word count: approx. 10,900
Document summary

Corpus Linguistics


What is a corpus?

Corpus (pl. corpora)

A collection of texts assumed to be representative of a given language, dialect, or other subset of a language, to be used for linguistic analysis.

(Francis 1982)


What is a corpus?

Three basic facts about a corpus:

A large, principled collection of natural texts.

Natural texts: the language has been collected from naturally occurring sources.

A basic resource for storing language knowledge.

Only after being processed can it become a usable resource.
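The point that a corpus becomes usable only after processing can be illustrated with a minimal sketch. It assumes the simplest possible processing step, lowercasing and splitting into word tokens, followed by a frequency count. The sample sentences and the `tokenize` helper are invented for illustration and are not taken from any real corpus or from the slides.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (hypothetical helper)."""
    # Matches runs of letters, optionally followed by an apostrophe part ("don't").
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


# Two invented sentences standing in for raw, unprocessed texts.
raw_texts = [
    "The corpus is a collection of natural texts.",
    "A corpus is only useful after the texts are processed.",
]

# Processing step: turn raw strings into one flat list of tokens.
tokens = [tok for text in raw_texts for tok in tokenize(text)]

# Once tokenized, the material can be queried, for example for word frequencies.
frequencies = Counter(tokens)
print(frequencies.most_common(5))
```

Real corpora are usually processed further (for example with part-of-speech tagging or lemmatization), which this sketch does not attempt.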


Part one - What is corpus linguistics?

As a means: to explore actual patterns of language use

As a tool: to develop materials for classroom language instruction

It also uses large collections of both spoken and written natural texts that are stored on computers.


What is corpus linguistics?

Corpus linguistics is an approach to investigating language that is characterized by the use of large collections of texts (spoken, written, or both) and computer-assisted analysis methods.
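One common form of computer-assisted analysis is a keyword-in-context (KWIC) concordance, which lists every occurrence of a search word together with its neighbouring words. The following is a minimal sketch under stated assumptions: the `kwic` function, its `width` parameter, and the sample text are hypothetical and chosen only for illustration; dedicated concordancing software offers far more.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left context, node word, right context) for each match.

    `width` is the number of words kept on each side (an assumed default).
    """
    tokens = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Invented sample text, not drawn from any real corpus.
sample = (
    "A corpus is a collection of texts. "
    "Researchers search the corpus for patterns of language use."
)

# Print the matches with the node word aligned in a column.
for left, node, right in kwic(sample, "corpus"):
    print(f"{left:>30}  [{node}]  {right}")
```

Aligning the node word in a single column lets recurring patterns to its left and right be read off at a glance.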


Contributions of corpus linguistics

·One of the major contributions of corpus linguistics is in the area of exploring patterns of language use.

It provides an extremely powerful tool for the analysis of natural language.

It provides tremendous insights into how language use varies in different situations.


The development of corpora

In 1964, the first computerized corpus was built by Brown University in the United States, storing one million words and collecting language sources of American English in a range of styles.

In 1978, Britain built LOB (Lancaster-Oslo-Bergen), storing one million words and collecting language sources of British English.

In the 1980s, Birmingham University built BCET (Birmingham Collection of English Texts), storing 7.3 million words, for use in dictionary compilation.

In 1996, the collection was extended to 32 million words and renamed BE (Bank of English).

In 1980, COBUILD (Collins Birmingham University International Language Database) was built, inclu