com.hankcs.hanlp.seg
Class WordBasedGenerativeModelSegment

java.lang.Object
  extended by com.hankcs.hanlp.seg.Segment
      extended by com.hankcs.hanlp.seg.WordBasedGenerativeModelSegment
Direct Known Subclasses:
DijkstraSegment, NShortSegment, ViterbiSegment

public abstract class WordBasedGenerativeModelSegment
extends Segment

Base class for segmenters built on a word-level n-gram model.

Author:
hankcs

Field Summary
 
Fields inherited from class com.hankcs.hanlp.seg.Segment
config
 
Constructor Summary
WordBasedGenerativeModelSegment()
           
 
Method Summary
protected static List<Term> convert(List<Vertex> vertexList)
          Converts a path into the final result.
protected static List<Term> convert(List<Vertex> vertexList, boolean offsetEnabled)
          Converts a path into the final result.
protected static List<Term> decorateResultForIndexMode(List<Vertex> vertexList, WordNet wordNetAll)
          Decorates the result for index mode.
protected static void fixResultByRule(List<Vertex> linkedArray)
          Corrects some results by rule.
protected static Graph GenerateBiGraph(WordNet wordNet)
          Generates the bigram word graph.
protected static void GenerateWord(List<Vertex> linkedArray, WordNet wordNetOptimum)
          Applies rule-based merging, splitting, and similar adjustments to the coarse segmentation result, and builds a new word net at the same time.
protected void GenerateWordNet(WordNet wordNetStorage)
          Generates the unigram word net.
protected static void speechTagging(List<Vertex> vertexList)
          Performs part-of-speech tagging.
 
Methods inherited from class com.hankcs.hanlp.seg.Segment
atomSegment, combineByCustomDictionary, enableAllNamedEntityRecognize, enableCustomDictionary, enableIndexMode, enableJapaneseNameRecognize, enableMultithreading, enableMultithreading, enableNameRecognize, enableNumberQuantifierRecognize, enableOffset, enableOrganizationRecognize, enablePartOfSpeechTagging, enablePlaceRecognize, enableTranslatedNameRecognize, mergeNumberQuantifier, quickAtomSegment, seg, seg, seg2sentence, segSentence, simpleAtomSegment
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

WordBasedGenerativeModelSegment

public WordBasedGenerativeModelSegment()
Method Detail

GenerateWord

protected static void GenerateWord(List<Vertex> linkedArray,
                                   WordNet wordNetOptimum)
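Applies rule-based merging, splitting, and similar adjustments to the coarse segmentation result, and builds a new word net at the same time.

Parameters:
linkedArray - the coarse segmentation result
wordNetOptimum - the word net into which all coarse segmentation results are merged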

fixResultByRule

protected static void fixResultByRule(List<Vertex> linkedArray)
Corrects some results by rule.

Parameters:
linkedArray -

convert

protected static List<Term> convert(List<Vertex> vertexList,
                                    boolean offsetEnabled)
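Converts a path into the final result.

Parameters:
vertexList - the path to convert
offsetEnabled - whether to compute offsets
Returns:
the final result as a list of Term objects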
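A minimal sketch of what converting a path with offsets can look like, assuming the path is simply a list of words in sentence order. ConvertSketch, TermSketch, and the use of -1 for "offset not computed" are hypothetical and are not taken from HanLP's Term or Vertex classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; TermSketch only stands in for a real term type. */
final class ConvertSketch {
    static final class TermSketch {
        final String word;
        final int offset; // assumption: -1 means offsets were not requested

        TermSketch(String word, int offset) {
            this.word = word;
            this.offset = offset;
        }
    }

    /** Turns the words of one path into terms, optionally recording each word's start offset. */
    static List<TermSketch> convert(List<String> path, boolean offsetEnabled) {
        List<TermSketch> result = new ArrayList<>(path.size());
        int offset = 0; // running character offset into the original sentence
        for (String word : path) {
            result.add(new TermSketch(word, offsetEnabled ? offset : -1));
            offset += word.length();
        }
        return result;
    }
}
```

Computing the offset as a running sum only works because the words of a path cover the sentence without gaps or overlaps.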

convert

protected static List<Term> convert(List<Vertex> vertexList)
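Converts a path into the final result.

Parameters:
vertexList - the path to convert
Returns:
the final result as a list of Term objects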

GenerateBiGraph

protected static Graph GenerateBiGraph(WordNet wordNet)
Generates the bigram word graph.

Parameters:
wordNet - the word net to build the graph from
Returns:
the bigram word graph
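A minimal sketch of the idea behind a bigram word graph, assuming each candidate word is a node, each pair of adjacent words is an edge weighted by a bigram cost, and the best segmentation is the cheapest path from the start of the sentence to its end. BiGraphSketch, the "previous@current" key format, and unseenCost are hypothetical; the sketch decodes with dynamic programming over offsets and is not HanLP's Graph implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of bigram-graph decoding. */
final class BiGraphSketch {
    private static final String BEGIN = "<s>";

    /** One graph node: a candidate word plus the cheapest path found so far that ends in it. */
    private static final class Node {
        final String word;
        double cost = Double.POSITIVE_INFINITY;
        Node previous;

        Node(String word) {
            this.word = word;
        }
    }

    /**
     * rows.get(i) lists the candidate words that start at character offset i.
     * bigramCost maps "previous@current" to an edge cost; pairs not in the map cost unseenCost.
     * Returns the cheapest word sequence, or an empty list if no path covers the sentence.
     */
    static List<String> bestPath(List<List<String>> rows, Map<String, Double> bigramCost, double unseenCost) {
        int length = rows.size();
        // endingAt.get(i) holds the reachable nodes whose word ends at offset i.
        List<List<Node>> endingAt = new ArrayList<>();
        for (int i = 0; i <= length; i++) {
            endingAt.add(new ArrayList<>());
        }
        Node begin = new Node(BEGIN);
        begin.cost = 0;
        endingAt.get(0).add(begin);

        for (int start = 0; start < length; start++) {
            for (String word : rows.get(start)) {
                int end = start + word.length();
                if (word.isEmpty() || end > length) {
                    continue; // skip candidates that cannot lie inside the sentence
                }
                Node node = new Node(word);
                // Each edge joins a word ending at 'start' to this word; keep the cheapest one.
                for (Node prev : endingAt.get(start)) {
                    double cost = prev.cost + bigramCost.getOrDefault(prev.word + "@" + word, unseenCost);
                    if (cost < node.cost) {
                        node.cost = cost;
                        node.previous = prev;
                    }
                }
                if (node.previous != null) {
                    endingAt.get(end).add(node);
                }
            }
        }

        Node best = null;
        for (Node node : endingAt.get(length)) {
            if (best == null || node.cost < best.cost) {
                best = node;
            }
        }
        List<String> path = new ArrayList<>();
        for (Node node = best; node != null && node != begin; node = node.previous) {
            path.add(node.word);
        }
        Collections.reverse(path);
        return path;
    }
}
```

Processing offsets in increasing order is enough here because every word has at least one character, so all nodes ending at an offset are final before any word starting there is scored.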

GenerateWordNet

protected void GenerateWordNet(WordNet wordNetStorage)
Generates the unigram word net.

Parameters:
wordNetStorage -
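A unigram word net can be pictured as one row per character offset, with each row holding the candidate words that start there. The sketch below builds such a structure from a plain set of dictionary words. WordNetSketch, build, and maxWordLength are hypothetical names, and the fallback to a single character is an assumption rather than HanLP's actual behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of a unigram word net; not HanLP's WordNet class. */
final class WordNetSketch {
    /**
     * Returns one row per character offset; row i lists every dictionary word
     * of at most maxWordLength characters that starts at offset i.
     */
    static List<List<String>> build(String sentence, Set<String> dictionary, int maxWordLength) {
        List<List<String>> rows = new ArrayList<>();
        for (int start = 0; start < sentence.length(); start++) {
            List<String> row = new ArrayList<>();
            int limit = Math.min(sentence.length(), start + maxWordLength);
            for (int end = start + 1; end <= limit; end++) {
                String candidate = sentence.substring(start, end);
                if (dictionary.contains(candidate)) {
                    row.add(candidate);
                }
            }
            if (row.isEmpty()) {
                // Assumption: fall back to the single character so no row is left empty.
                row.add(sentence.substring(start, start + 1));
            }
            rows.add(row);
        }
        return rows;
    }
}
```

Scanning every substring against a set is the simplest possible lookup; a trie-based dictionary would avoid rechecking shared prefixes.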

decorateResultForIndexMode

protected static List<Term> decorateResultForIndexMode(List<Vertex> vertexList,
                                                       WordNet wordNetAll)
Decorates the result for index mode.

Parameters:
vertexList -
wordNetAll -
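A minimal sketch of one way to decorate a result for indexing, assuming the goal is to emit, after each long word, the shorter candidate words that lie inside it so a search index can match them too. IndexModeSketch, the length thresholds, and the row-per-offset word net layout are hypothetical and are not taken from HanLP's implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of index-mode decoration. */
final class IndexModeSketch {
    /**
     * path is the chosen segmentation in sentence order.
     * rows.get(i) lists every candidate word that starts at character offset i.
     */
    static List<String> decorate(List<String> path, List<List<String>> rows) {
        List<String> result = new ArrayList<>();
        int start = 0;
        for (String word : path) {
            result.add(word);
            int end = start + word.length();
            if (word.length() > 2) { // assumption: only words longer than two characters are expanded
                for (int i = start; i < end; i++) {
                    for (String candidate : rows.get(i)) {
                        boolean inside = i + candidate.length() <= end;
                        // Keep multi-character candidates that are strictly shorter than the word.
                        if (inside && candidate.length() > 1 && candidate.length() < word.length()) {
                            result.add(candidate);
                        }
                    }
                }
            }
            start = end;
        }
        return result;
    }
}
```

The sub-words are appended directly after the word that contains them, so the original segmentation can still be read off by skipping the extra entries.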

speechTagging

protected static void speechTagging(List<Vertex> vertexList)
Performs part-of-speech tagging.

Parameters:
vertexList -


Copyright © 2014–2015 码农场. All rights reserved.