Zhejiang University, Xiao Zhonghua (Richard Xiao), Corpus Linguistics Session 5 (Foreign Language Learning).ppt

Practice
Use both the UCREL/Xu's LL calculator and SPSS to determine whether the difference in the frequencies of passives in the CLEC and LOCNESS corpora is statistically significant.
CLEC: 7,911 instances in 1,070,602 words
LOCNESS: 5,465 instances in 324,304 words

Collocation statistics
Collocation: the habitual or characteristic co-occurrence patterns of words.
Can be identified using a statistical approach in CL, e.g. Mutual Information (MI), the t test, the z score.
Can be computed using tools like SPSS, Wordsmith, AntConc, Xaira.
Only a brief introduction here; further discussion of collocation statistics is to follow.

Mutual information
Computed by dividing the observed frequency of the co-occurring word within the defined span around the search string (the so-called node word), e.g. a 4:4 window, by the expected frequency of the co-occurring word in that span, and then taking the logarithm to the base 2 of the result.
A measure of collocational strength.
The higher the MI score, the stronger the link between two items.
An MI score of 3.0 or higher is taken as evidence that two items are collocates.
The closer the MI score gets to 0, the more likely it is that the two items co-occur by chance.
A negative MI score indicates that the two items tend to shun each other.

The t test
Computed by subtracting the expected frequency from the observed frequency and then dividing the result by the standard deviation.
A t score of 2 or higher is normally considered to be statistically significant.
The specific probability level can be looked up in a table of the t distribution.

The z score
The z score is the number of standard deviations from the mean frequency.
The z test compares the observed frequency with the frequency expected if only chance is affecting the distribution.
A higher z score indicates a greater degree of collocability of an item with the node word.

LG3204 Corpus Linguistics 07-08
Making statistic claims
Corpus Linguistics
Richard Xiao
lancsxiaoz@

Update on assignments
Deadline
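
The Practice exercise and the collocation measures described in these slides can be sketched in Python. This is a minimal illustration, not the slides' own method or the exact output of any of the named tools. The slides give no formulas, so the following are assumptions: the log-likelihood uses the two-corpus form commonly associated with the UCREL calculator; the expected co-occurrence frequency is taken as node frequency × collocate frequency × span size / corpus size; the standard deviation in the t score is approximated by the square root of the observed frequency; and the z score divides by sqrt(E × (1 − p)), where p is the collocate's relative frequency. All function names are hypothetical.

```python
import math


def log_likelihood(freq1, size1, freq2, size2):
    """Log-likelihood (G2) for one item's frequency in two corpora.

    Assumed form: expected counts are proportional to corpus size, and
    LL = 2 * sum(observed * ln(observed / expected)).
    """
    total = freq1 + freq2
    expected1 = size1 * total / (size1 + size2)
    expected2 = size2 * total / (size1 + size2)
    ll = 0.0
    for observed, expected in ((freq1, expected1), (freq2, expected2)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll


def expected_cooccurrence(node_freq, collocate_freq, corpus_size, span):
    """Assumed chance co-occurrence count within a window of `span` words."""
    return node_freq * collocate_freq * span / corpus_size


def mutual_information(observed, expected):
    """MI as described in the slides: log base 2 of observed / expected."""
    return math.log2(observed / expected)


def t_score(observed, expected):
    """t score, approximating the standard deviation by sqrt(observed)."""
    return (observed - expected) / math.sqrt(observed)


def z_score(observed, expected, p):
    """z score, taking the standard deviation as sqrt(expected * (1 - p))."""
    return (observed - expected) / math.sqrt(expected * (1 - p))


# Passives in CLEC vs LOCNESS, using the counts given in the Practice slide.
ll = log_likelihood(7911, 1070602, 5465, 324304)
print(f"LL = {ll:.2f}")

# Hypothetical collocation counts, for illustration only (not from the slides).
# A 4:4 window covers 8 word positions around the node.
expected = expected_cooccurrence(1000, 500, 1_000_000, span=8)
print(f"MI = {mutual_information(32, expected):.2f}")
print(f"t  = {t_score(32, expected):.2f}")
print(f"z  = {z_score(32, expected, 500 / 1_000_000):.2f}")
```

With one degree of freedom, an LL value above 3.84 corresponds to p < 0.05. The value computed for the CLEC and LOCNESS counts is far above that threshold, so the difference in passive frequency would count as statistically significant under this formulation.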
